Plugin rot and how to do regex replaces in mysql

I’ve spent a few hours over the past week restoring my old blog, which can now be reached at brbcoffee.com/archive/.  This will be the final chapter in that saga.

My old blog is sort of old. It started out on MySpace, moved to Google’s BlogSpot platform, then moved to Tumblr for a while because I got some new authors on board. Finally we moved to a WordPress solution.  From there on BRBcoffee has been on WordPress, but it’s moved around hosts like an alcoholic goes from bar to bar trying to get his fix. I’ve employed a bunch of plugins to do various things for me, from importing and exporting posts to providing language switching capabilities to advanced caching to image scaling to– you get the picture. This is all fine and dandy, but it does pose a few problems when you move around like I do.

  1. The image scaling edits your wp_posts database table, naturally. When you move, the scaled images don’t necessarily come with, resulting in a lot of broken links.
  2. The translation plugin just has it’s own tags in that same database table, and keeps two posts in one field. When the plugin isn’t there anymore, or isn’t compatible with your WordPress version, the tags become visible on the site, and you get two languages at once.
  3. Plugins are often developed by lone wolf devs, and they are never up to date when you check up on them years later.

So, we need to clean up the mess they’ve created, and we don’t install Yet Another Plugin™ to try to fix the problem. We need to dig into our database and fix things ourselves. There’s only one problem: Mysql is made for storing, finding, updating, removing, etc. It isn’t the tool for the job when you need to do complex things like searching for a pattern and replacing only the part of the string that matches that pattern. It can be done, but I’m sure, but why would you do it when there are better tools for the job? Here’s how I do it:

  1. Export the table you need to edit with
    mysqldump -u USER -p DATABASE TABLE > dumpfile.sql
  2. Do work on the dumpfile with specialized tools like sed, awk, or even vim when in a pinch.
  3. Empty the table before reimporting the modified dump with
    mysql -u USER -p DATABASE -e “delete from TABLE;”
  4. Finally reimport the table with
    mysql -u USER -p DATABASE < dumpfile.sql

Mysql also comes with a tool called mysqlimport, but I had a problem with it, so I just used the standard mysql command in the end.  So what do those switches mean?

  • -p means this user has a password, ask me for it before doing anything
  • -u USER means use this username. Pretty obvious.
  • -e means execute. Execute the following quoted code.

The only weird thing is that -u takes an argument, but -p doesn’t. It looks like DATABASE should be the argument to -p, but it isn’t. DATABASE is simply the database you want to do work on. In my case, it was called “archive,” while yours might be “blog,” “sales_records,” or even “naughty_movies.” That’s commandline switches for you. In my opinion, a single dash should be reserved for switches, double for arguments, like so:

  • -p [ask me for a password]
  • --user <name> [I'm logging in as name, how quaint]

Oh well, you get used to it. That’s how you should do many complex updates on a migrated database. This isn’t for when you want do do one small change on a live system. The 3rd step removes all records from the specified table. Your system will practically be offline until you’ve reimported the changed data.

If you wish this were simpler, you’re not alone. Behavior like this has been a feature request in mysql since 2007. Go there and click affects me, and maybe in another 7 years we’ll have it built in. Till then, use the method described above, and for heaven’s sake, don’t let the plugin rot get your blog too. If you can do it without a plugin, do it without a plugin.

I am a wizard
Signing off

And the archive is restored!

I did it! I partially restored the dead disk, and I did the necessary doctoring on the old WordPress database dump. Sadly, my OffTopic blog wasn’t in the backup tarball, so that’s gone forever. It only had a handful of posts anyway, so no big loss.

Anyway, brbcoffee.com/archive/ has everything dating back to february 2009. I was 18 when I wrote that stuff. Don’t judge. Some of the images are gone, as is the nature of the internet. Sorry about that.

Bye

I lied

I found a tool called TestDisk, which can scan every individual cylinder of my hard drive for partition information. It takes forever, but it seems to work. Here’s hoping I remember my partition layout well enough to turn this into something workable.

A screenshot of the recovery tool running
That’s a lot of headers–

I’m a terrible computer engineer

Here follows an account of how almost a decade of blogging got lost in the æther:

  1. The power supply for my old netbook broke.
  2. I moved.
  3. I moved from Amazon AWS to DigitalOcean.
  4. I moved from Ubuntu to Fedora (DigitalOcean used to support Arch, but doesn’t anymore. This annoys me).
  5. I didn’t update the blog for a long time.
  6. I retrieved all the data from my netbook hard drive’s ext4 partition that looked interesting.
  7. I turned the old hard drive into a backup disk.
  8. I wrote a new partition table to my backup disk.
  9. I added an NTFS partition to my backup disk’s partition table.
  10. I discovered that my website was down.
  11. I discovered that I had failed to retrieve the backup of the blog from the old hard drive.
  12. I discovered that I only had that one backup.
  13. I wept.

Now, I’ve tried to run some basic data recovery. It’s easy enough to retrieve data from a deleted ext4 partition, but getting data from a partition that has been deleted, whose partition table no longer exists, and that has been overwritten by a non-compatible partition type, that’s a completely different story. There are very few vectors left to me at this point. They are:

    • Find a backup disk that has my blog on it.
      • Number of possible disks: 1.
      • Probability that any of them have my blog: 0.2.
    • Download everything web.archive.org has on me. Write a program to retrieve all the blog posts, in order, with metadata.

Things I’ve learned from this:

    1. Your offsite backup is only as reliable as your offsite hardware.
    2. Your offsite hardware is only as reliable as your offsite hardware maintainer.
    3. Don’t be your own offsite hardware maintainer.
    4. I don’t really care what I blogged about in high school.

So, that’s where I’m at. As soon as I have a few minutes to spare I’ll decide whether or not it’s worth it to mine through the internet wayback machine for my teenage musings. I’m guessing I’ll end up doing it for the challenge, and hey, I could even write a blog series about how I did it! But even if I do this thing, and succeed, I won’t import the old blog into this one. BRBcoffee is reborn today, free of the past. Brace for impact.

Bjørn out.