Removing Old WordPress Multisite Tables

August 29th, 2018

One issue I have come across while managing large WordPress multisites is dealing with databases for subsites after they've been deleted.

WordPress plugins typically create tables to store information in. In a multisite environment, these tables are created for each site. That means if you have plugin X that creates 5 tables for itself and you have three sites, you'll have fifteen tables total. The problem is that if you delete one of the sites, you'll still have a fifteen plugins.

What do developers do when things get tedious? Automate it.

Site IDs

Each site in a WordPress Multisite has a blogid associated with it. Whenever tables are created in the database for that specific sub-site, it normally takes the form of ```wp#table_namewhere # is replaced by the blog's ID. h It also has an entry in thewp_blogstable. To fully remove a site, we'll need to remove all of the tables that matchwp#_*(where the asterick is a wildcard & the # is the blog's ID). We'll also need to remove the entry from thewp_blogs``` table.

Deleting a specifc site

Let's say we have three sites on a WordPress install with the IDs of 1, 2 and 3. The site with the ID of 1 is the primary site.

If we wanted to delete the site with an ID of 2, we would need to remove all of the tables that start with "wp2" and remove the site's entry from wp_blogs

Let's remove the site entry first.

mysql $DB -e "DELETE FROM wp_blogs WHERE blog_id = 2"

Next, we need a way to delete all tables based on a prefix. Sadly there is no built-in way to do this in SQL. It is doable through the clever use of commands however.

First we need a way to get a list of all the tables in the database. We can do that using the mysql command. If you run the command mysql $DB -e "SHOW TABLES" -s it will return a list of all the tables in the database.

The -e flag is used to pass a SQL command to run after logging in and the -s flag silences the output (removing some decoration from the output).

We can pipe the output of the above command to grep to find only the tables that match our pattern (wp2).

mysql $DB -e "SHOW TABLES" -s | egrep "^wp_2_"

Finally we can redirect that output into the final command which will delete all of the tables it finds. We can do this with xargs. Using the -I flag, we can use the output of the previous command to substute a certain pattern in a new command.

mysql $DB -e "SHOW TABLES" -s | egrep "^wp_2_" |
 xargs -I "@@" mysql $DB -u $DB_USER -e "DROP TABLE @@"

In the above command, we're telling xargs that the pattern "@@" should be replaced with the incoming data. In our case, that incoming data is a table name. We then have xargs run a mysql command that drops the table.

Running the above command with delete all of the tables with that the prefix of "wp2", cleaning up our database.

Deleting everything BUT certain sites

The above command is nice if you know which site you want to delete, but what about if you only know which sites you want to keep? For example, let's say we have a WordPress multisite install with 100 sites. Out of the 100 sites, only six are still active, so we want to delete every install but those six.

The IDs of the six sites are 1, 2, 3, 4, 5 and 6.

First, let's remove all of the entries from wp_blogs that DON'T match the above IDs.

Just like before, we can get a list of all of the database tables using mysql $DB -e "SHOW TABLES" -s.

Next, we need to pipe the output of the previous command into grep and find all the tables that start with a prefix of "wp#" where # is one of the six IDs. We can do that with a regular expresion.

mysql $DB -e "SHOW TABLES" -s | egrep "^wp_(1|2|3|4|5|6)_"

The (1|2|3|4|5|6) part of the regular expression is used to match for any of those six numbers. We could also do [1-6] but that only works if the ID's we want to delete are all within the range of 1 to 6.

So far, the above command is finding a match for every table that starts the prefix "wp#" but what we really want is all the tables that don't. We can use -v flag for grep to invert the matching so it will do just that.

Now that want to get just the ID field from each table. For example if we have the table name wp_2_users, we just want the 2. This is so we find out which site ID's we need to delete and remove them from wp_blogs.

Below is the grep command for just outputing the site ID for each table name:

grep -oP 'wp_\K\d+(?=_)'

The -o flag is used to output just the part that matches the pattern and the -P tells grep to use Perl style regular expressions.

We can then pipe the output into the uniq command to strip out any duplicates.

The final step is to send it to xargs to delete each site entry from wp_blogs

The full command is:

mysql $DB -e "SHOW TABLES" -s | egrep "^wp_(1|2|3|4|5|6)_" | grep -oP 'wp_\K\d+(?=_)' | uniq |
  xargs -I "@@" mysql $DB -e "DELETE FROM wp_blogs WHERE blog_id = @@"

Now that all of the the site entries are deleted, we need to delete the tables for each site.

The command is very similar for what we used for removing a specific site except modified to match for any tables that don't have the prefix we're looking for.

mysql $DB -e "SHOW TABLES" -s | egep "^wp_(1|2|3|4|5|6)_" -v | egrep "^wp_[0-9]"

The last grep command is used to return just the tables that are for a specific site. If we didn't include this, we would end up deleting tables used by the primary site such as wp_users. We don't want that.

The final step is to pipe the contents to xargs and drop the tables:

mysql $DB -e "SHOW TABLES" -s | egep "^wp_(1|2|3|4|5|6)_" -v | egrep "^wp_[0-9]" |
  xargs -I "@@" mysql $DB -e "DROP TABLE @@"

Putting it all together

Next, we'll be pulling all of the above into a shell script and making it a little easier.

If we just takes the command from above and put them into a script, they look something like this:

DB=wordpress
mysql $DB -e "SHOW TABLES" -s | egrep "^wp_(1|2|3|4|5|6)_" | grep -oP 'wp_\K\d+(?=_)' | uniq |
  xargs -I "@@" mysql $DB -e "DELETE FROM wp_blogs WHERE blog_id = @@"

mysql $DB -e "SHOW TABLES" -s | egep "^wp_(1|2|3|4|5|6)_" -v | egrep "^wp_[0-9]" |
  xargs -I "@@" mysql $DB -e "DROP TABLE @@"

The issue with that is that the IDs are hardcoded into the commands. Rather than having them embedded in the commands, we should put them in a separate file.

I'll be using ids.txt where each line is a new ID. For example: For the above example, it would be like this:

1
2
3
4
5
6

Using cat, awk, and sed, we can read out of the above file and join each site ID together with a pipe. This way we can create the "(1|2|3|4|5)" portion of the regular expression.

First we need to cat the file, which simply prints the contents of the file to the screen.

cat ids.txt

Next we'll be using awk to join the lines together with a pipe at the end.

we can do that with the -vORS flag.

cat ids.txt | awk -vORS="|" '{ print $1 }'

The -vORS flag tells awk to add the pipe ("|") at the end of each line.

The '{ print $1 }' portition just prints the line out.

If we run the above command, it should return

1|2|3|4|5|6|

The issue is there is a trailing pipe at the end. Let's remove that. We can do that using sed:

cat ids.txt | awk -vORS="|" '{ print $1 }' | sed 's/|$//'

The argument to sed tells it to do a search and replace on each line. It looks for a trailing pipe |$ and replaces it with nothing //

Now that final step is to adapt our shell script to now dynamically create the regular expression instead of hardcoding the IDs.

DB=wordpress
mysql $DB -e "SHOW TABLES" -s |
  egrep "^wp_(`cat ids.txt | awk -vORS="|" '{ print $1 }' | sed 's/|$//'`)_" |
  grep -oP 'wp_\K\d+(?=_)' | uniq |
  xargs -I "@@" mysql $DB -e "DELETE FROM wp_blogs WHERE blog_id = @@"

mysql $DB -e "SHOW TABLES" -s |
  egep "^wp_(`cat ids.txt | awk -vORS="|" '{ print $1 }' | sed 's/|$//'`)_" -v |
  egrep "^wp_[0-9]" |
  xargs -I "@@" mysql $DB -e "DROP TABLE @@"

There's the final script!

Here is a raw gist link for easy download

Hopefully this helps someone out there! I had a lot of fun putting it together.