WordPress 4.8.2 Data Query Woes

Millions of websites across the Internet were automatically updated to WordPress 4.8.2 yesterday.    Thousands of those sites have key components that are no longer working.   Many of those sites will be looking for new plugins and themes unless new patches are made to make those themes and plugins 4.8.2 compatible.   WordPress 4.8.2 Data Query WoesWordPress 4.8.2 Data Query WoesWordPress 4.8.2 Data Query Woesn the meantime there are going to be a lot more partially broken websites online for the next month.

Unfortunately for my Store Locator Plus add ons, we build and test against the latest “nightly” builds which has been on WordPress 4.9 beta for a while.  Somehow the changes to the prepare method did not make it to that build on our test systems until 36 hours ago; that is about 3 hours AFTER we finished testing the latest 4.8.4 updates.  The very same locator updates that we published just over 30 hours ago.

What Changed?

WordPress changed a key component of the system architecture that is used by thousands of plugins, the database management interface.    The prepare() function of the WordPress Database class (WPDB) is used to manage data queries and replace placeholders with variables when performing searches.

As part of the WordPress 4.8.2 security update the ability to use a variety of different standardized printf formats, where a %d represents a number or a %s represents a string as a placeholder, was severely restricted.    As of 4.8.2 any plugin or theme that uses “undocumented’ or “unsupported” formats will no longer work.    To be more specific, any plugins or themes using the placeholders WordPress deems undocumented are not supported.

The officially supported placeholders are limited to:

Many plugins and themes use positional arguments like %1s or , as is the case in my Store Locator Plus Power add on, use the more accurate %u placeholder instead of %d.   These plugins and themes no longer work as of 4.8.2.

The Fix

Any plugin and theme that uses the wpdb->prepare() call, and they all should if they are processing data queries with variables being passed in, will need to change the string formats to use the 4 approved placeholders noted above.

Why %d and %u Are Important

In my case I elected to use the standard printf %u format which is an unsigned integer format.   The reason I chose this option is because the WordPress standard for data storage of keys in WordPress tables is the MySQL unsigned integer.

What’s he difference?   Both PHP and MySQL store 2 billion values in a signed integer and 4 billion in an unsigned integer.    That means you can have TWICE as many records in your database with unsigned integers but more importantly you avoid possible confusion.   Can you imagine telling someone “No, no, no, I meant record number NEGATIVE 1357 not POSITITVE 1357”?

WordPress decided that they would limit the standard prepare to the %d , signed integer format which is limited to 2 billion unique entries.  Granted that will cover nearly any real world WordPress install, but for the one site that hits record 2-billion-and-one there is likely to be trouble.

Today, I brought this up on the WP Core ticket system.   Maybe they’ll change it.  Maybe not.  In the meantime I’m replacing all %u references with %d because my client’s don’t care about technical merit but rather just want their sites to stay running.

 

Update:  on my PHP7 based system the “internal workings” of PHP don’t change how %d and %u are working.  That’s good news.

You can run this same sprintf() on your box to verify %d and %u are interchangeable, but for WPDB ALWAYS use %d.   Now to change a lot of code.

Performance Benefits Of COUNT(*) In SQL

This is something I often come across in code.  Just came across it in our MySLP project today , in fact.

While the performance impact of that statement in itself is negligible, it does add up.  For most database engines that statement incurs additional overhead as it processes logic related to a specific column containing a specific data type.   Some database engines check if the column is null for EVERY RECORD, for example.   Not a big deal if you have an index on the column but even with an index there is overhead.

The better option is to use count(*)

COUNT(*) is a semantic placeholder, NOT an operation directive in-and-of-itself.   Most database engines, MySQL included, recognize this construct and employ “shortcuts”.

Performing this operation on a moderate-sized table, for example will show a slight difference in the speed of getting the count of records in the table with count(*) almost-always pulling the desired count of records in ZERO seconds and other variants taking a second or more.

Not much, granted, but it IS a big deal with you are working in large volume instances where you are processing billions of records not thousands.  Same thing if you have thousands of tables not dozens.   A site with hundreds-of-thousands of visitors not hundreds.

The performance differential is even more pronounced when using joins in your count.

Don’t be fooled into thinking count(id) only reads one column from your database while count(*) reads ALL of them and therefore count(id) MUST be faster.  That is not how it works.  Not for most database engines.

 

Here is an article by Percona on the same subject you may be interested in:

https://www.percona.com/blog/2007/04/10/count-vs-countcol/

%d bloggers like this: