Follow Jeff Sayre on Twitter

BuddyPress Privacy Offers You the Power of the Force

By

Okay, that title may not be accurate. But for those advanced BuddyPress administrators and site owners who are performance focused, the BuddyPress Privacy Component will offer the option of creating the ACL (access control list) tables with the InnoDB storage engine instead of the MyISAM storage engine which WordPress and BuddyPress use as the default. This offers a number of advantages such as referential integrity with cascading deletes and updates and row-level locking instead of table-wide locking—which increases performance by facilitating multi-user concurrency, a crucial point for under-powered servers or highly-trafficked sites.

InnoDB offers many advantages including one of my favorites as a developer—cascading deletes and updates.The conventional wisdom that the MyISAM storage engine is always the preferred type is outdated. I will not go into the whys in this article as it is discussed in many places. Here is an article that discusses the benefits of the InnoDB storage engine type in great detail.

Refer to the MySQL documentation for more details on the InnoDB storage engine.

Suffice it to say, that two years ago, the Drupal project decided to make the InnoDB storage engine the default type for Drupal 7 and have not looked back since. You can read more about that decision here.

Use the Force, Luke

To use the InnoDB storage engine type for the BP Privacy tables, you must manually configure before installing. Details will be provided in the readme.txt file.

InnoDB offers many advantages including one of my favorites as a developer—cascading deletes and updates. To take advantage of this, I’ve coded dual delete methods within each class model. The method that is fired depends upon which storage engine a given install has chosen. If the InnoDB storage engine is in use, then record deletion is a very simple process as deletes are automatically cascaded down through the child tables. If the default MyISAM storage engine is in use, then multi-table deletes require a more complex looping routine to ensure that associated records are deleted.

Do WordPress and BuddyPress really need to use the BIGINT numeric field type for UserID? Is it even realistic to make allowances for the possibility of 18.446 quintillion unique user records?Although MySQL allows for the mixing of storage engine types across tables, there are solid reasons for standardizing on one storage engine type across your entire database. It is up to you to decide if you want to mix and match storage engine types and to determine which tables may benefit from one storage engine type versus another.

If you decide to install BP Privacy using the InnoDB option, then you should consider whether it makes sense to convert all of your existing WordPress and BuddyPress tables to use the InnoDB storage engine. If you are doing a clean install of WordPress and/or BuddyPress, you can set the default storage engine type to the storage engine of your choice for each table before running the install script. Future core upgrades, unfortunately, will then require extra vigilance. So, proceed with extreme caution.

Here are some additional resources to help you better understand the risks, benefits, and issues involved (read the comments in the below articles as well):

How to scale WordPress to half a million blogs and 8,000,000 page views a month

Convert your MySQL database from MyISAM to InnoDB

5 Essential Steps For Hosting A Scalable WordPress Blog Or Website

Not All Lightsabers are Equal

To understand the next section, you need to heed the words of this ancient Jedi saying. It means, do not compare apples to oranges. Note: I’m not sure if that phrase is a Jedi saying as I made it up.

The MyISAM and InnoDB storage engines are different storage technologies. Therefore, whereas overall performance results can be compared between these two different storage engines, what I discuss in the next section pertains to performance tuning a table within a given storage engine type.

These Aren’t the Droids You’re Looking For

To do my part in reducing data bloat, and helping those users that have lower-powered servers, or are on shared hosting, I’ve structured BP Privacy’s tables to be as compact as possible—whether using the MyISAM or InnoDB storage engine type. How? Well for starters, by using the INT, instead of BIGINT, numeric field type for the Main ACL table’s primary key field and by using partial indexing where practical for both the Main and Lists ACL tables.

The benefit of using the INT numeric field type is that it’s a 50% reduction in storage space compared to BIGINT (not a fifty percent reduction in the maximum number of unique possible records within a table, but in the number of bits required to store the data in the ID field of each record in the table). Using the INT numeric field type for the table ID still allows for 4.29 billion unique records within each table. This should be more than adequate for 99.5% (likely even more) of all BuddyPress installs.

The other performance gain comes from the fact that in the latest version of BP Privacy (starting with v1.0), I’ve done away with storing user lists in a serialized array within the userlist field of the single table. Instead, I’ve normalized that bit of data, storing any group or user lists in a separate ACL Lists child table. The benefits of this approach are many.

Whereas serialized data can be useful when storing disparate data in a settings field, for instance, it is not a great idea when the type of data will always be the same in a given field. Why? Because when you serialize data, you are storing a table within a table. This can lead to all sorts of performance issues and makes it very difficult to perform meaningful searches with the data “locked” within the serialized array. In fact, once you start storing serialized objects in a relational database on a regular basis, then you have moved beyond the utility of a table-based model. Instead, a schemaless system would prove more beneficial.

There is another performance hit when choosing to serialize the same type of data within a table’s field. Whether a given record stores any data in the serialized field or not, the overall record overhead size is much larger as a result of a set minimum number of stored bytes for the LONGTEXT data type. This means that space is used in the table even if data is not stored in that field. As most users will more than likely not be setting any group or user list data, it makes more sense to only take up space when actual data is required to be saved. Therefore, the child ACL Lists table.

If you are designing tables for WordPress or BuddyPress, it’s crucial that you learn how to properly store data and when it is okay to serialize (most of the time, it is not). Are you serializing data because you think that’s the way it must be done, because that’s the way most developers do it in the WP ecosystem? Think again. If you’ve heard that sharding requires data to be serialized. The answer to that is, wrong. Sharding and serialization are two separate issues. There are other, more desirable ways to make your table shardable. With the exception of serialized data within the WP settings and options metadata fields, serialization is misunderstood and misused within WordPress and BuddyPress tables.

Given the fact that these improvements are for a few, small privacy tables, it may not seem like that little of a space savings makes a difference in the grander picture. But imagine if WordPress and BuddyPress core made similar changes and all plugin developers took the time to optimize their plugin’s data schema (if they have one). Every bit saved on better field type and index sizes can add up to big performance gains, especially as a WordPress-powered site begins to get noticed and needs to scale.

Ask yourself this. Do WordPress and BuddyPress each really need to use the BIGINT numeric field type for UserID? Is it even realistic to make allowances for the possibility of 18.446 quintillion unique user records? Or, would larger benefits accrue to WordPress and BuddyPress users if the data size for that field was reduced by 50% (by using INT) or even better 62.5% (by using MEDIUMINT). Imagine where else data schema optimization could occur! By thinking about data storage issues, you can help to reduce table size and thereby improve performance.

Additional Resources:

To get to the heart of the discussion above–the infamous serialized fields–start the below-linked-to video at timecode 32:48, which is point nine in his list of ten. The section is entitled EAV. The most pertinent discussion will start at timecode 35:22, with what the author calls EAV 2.0: E-BLOB, but it is a good idea to start a few minutes earlier at the beginning of point nine.

This hour-long video became the source of a latter OSCON presentation. The audio is pretty bad, but it gives some great pointers on database best-management practices, 10 Ways to Wreck Your Database.

A Jedi Uses the Force for Knowledge and Defense, Never for Attack

As this famous quote from Yoda implies, with great power comes great responsibility (how about that for mixing SciFi and comic book metaphors). So, here it goes.

Warning: If attempting anything mentioned in this article or in the linked-to articles, the typical caveats apply: backup your existing WordPress and BuddyPress database(s) before you do anything, know how to restore your database(s) from your backups, know what you are doing and what you’re getting yourself into, don’t cry if you break something, it is entirely your responsibility to fix anything that breaks, and there is no one you can blame except yourself if something goes wrong.

I will not answer any questions about how to do anything discussed above or in the linked-to articles, nor provide any assistance in helping you do this, nor provide any help if you do this and your site breaks. It is up to you to know what you are doing. Proceed at your own risk. You have been warned.

Article Comments

  1. Jeff, Great Article– thoughtful and informative! Thanks for all your work and contributions moving WordPress & BuddyPress forward. Looking forward to the BuddyPress Privacy plugin launch.

    • Jeff Sayre says:

      Thanks, John –

      I’ve spent a lot of time looking at WP, BP, and 3rd-party plugin data schemas and realized that significant performance gains could be achieved if the schemas were optimized. There is very little reason to choose the BIGINT numeric filed types for the unique key (primary key) of a table. Also, many indices could be improved upon via partial indexing instead of simply indexing the entire field size.

      Finally, I think it is time to take a long, hard look at the virtues of InnoDB versus MyISAM. The latest versions of MySQL have addressed most of the earlier concerns that some users had with using this storage engine type.

      As WordPress and BuddyPress developers, we need to not only think about providing tight, well-thought out code, but also well-designed, optimized data schemas. We must always be looking for ways in which we can increase performance without our clients (those who use WP and BP) having to do much on their end. This article suggests a few ways in which we can improve the data schemas to enable noticeable performance gains—especially on larger installs.

  2. Rebeca says:

    As I thought about this more and more over the past few days, I released that it is not a good practice to release a major new BuddyPress component targeted to a version of BuddyPress that will be obsolete a few days after launch.

    • Jeff Sayre says:

      Since I released the BuddyPress Privacy Component Version 1.0-RC1 on February 18, 2011 and the next major version of BuddyPress was released on September 21, 2011, I would say that your realization is way off the mark. Subdecimal releases usually have minor operational affects on plugins. My component worked fine under 1.2.8 and 1.2.9—even though it is tagged as working up to BP 1.2.7.

      FYI, it is with major releases of BuddyPress where big issues arise. The big issue with my component and many others that have yet to be updated to the latest version, did not occure until the BuddyPress 1.5.x series—yes they skipped versioning for 1.3 and 1.4 and jumped straight to the 1.5.x series.

Share on Twitter
Share on Facebook
Share on FriendFeed
Share on LinkedIn
Share on StumbleUpon
Share on Digg
Share on Delicious
Share on Technorati
Add to Google Bookmarks

Archives