THE PENDING DRAFT

Optimizing a Custom Query in WordPress

April 14, 2015

With the WP_Query class WordPress provides a powerful tool to query things from the database. But as with most powerful things, you can also mess things up quite a bit if you’re not careful enough.

With great power comes great responsibility.

(Voltair, Yoda, Uncle Ben in Spiderman or whoever said it first)

Let me give you an example of a query we needed in a project and how we implemented it.

The Goal

We want to get the IDs of all Attachments (images in this case), that are attached to any post of a particular Custom Post Type.

The Wrong Way

The good thing about WordPress is that you can find a snippet for pretty much anything with a simple Google Search. The bad thing about WordPress is that you can find a snippet for pretty much anything with a simple Google Search. Often it’s as easy as throwing a snippet into your plugin or functions.php in your theme and it just magically works.

Here’s the first “solution” i found somewhere on StackOverflow:

// Query for all posts of post_type 'post-type-name'
$cpt_query = new WP_Query(
  array(
    'post_type' => 'post-type-name',
    'posts_per_page' => -1 // get everything
  )
);

// Create an empty array for our attachment ids
$attachment_ids = array();

// Run a Loop through our CPT-Query
if( $cpt_query->have_posts() ){
  while( $cpt_query->have_posts() ){
    $cpt_query->the_post();

    // Inside the Loop, get all attachments which are children of this post
    $attachments = get_children(
      array(
        'post_parent' => get_the_ID(),
        'post_type' => 'attachment'
      )
    );

    // Add the attachment id to our array
    foreach ( $attachments as $attachment ) {
      $attachment_id = strval( $attachment->ID );
      array_push( $attachment_ids, $attachment_id );
    }
  }
  wp_reset_postdata();
}

 

While this worked perfectly as it should, it had a pretty expensive overhead. Right now we query for all posts of that Post Type, then load all information about that post, loop through every single one of them and inside that Loop we query again for all attachments, load all data and again loop through every single one of them to get the ids.

We add a lot of queries post per post while looping through them, not good! This might not be a problem when you only have some posts with a handful of attachments. But because of the way this query was built the queries would increment for every post which would easily lead to thousands of queries when more posts and attachments are added.

So, what can we do to fix this? Let’s see.

First of all, it’s important to realize that you have a problem, preferably before anyone else does. I use the Query Monitor Plugin during development to display the amount of queries (among other useful information) in the admin bar. On top of that, it’s a good idea to test your plugin/theme not only on a blank installation but also to create, delete, add, move as much as you can to check if there are issues you won’t notice otherwise.

‘fields’ => ‘ids’ to the rescue

Inside WP_Query we can define which fields we want to retrieve by adding 'fields' => '...' to the query. That way we won’t load every single piece of information about the posts we find, but only what we need: the IDs.

$cpt_query = new WP_Query(
  array(
    'post_type' => 'post-type-name',
    'posts_per_page' => -1,
    'fields' => 'ids'
  )
);

 

This will get us an array of IDs. We don’t have to loop through them anymore and can directly make a second query to get our attachments.

$attachment_query = new WP_Query(
  array(
    'post_type' => 'attachment',
    'posts_per_page' => -1,
    'post_status' => 'any', // Not sure if this is needed, but it makes sure we really get every attachment
    'post_parent__in' => $cpt_query->posts, // Look for children of the ids from our first query
    'fields' => 'ids' // Again, we only want to get the ids
  )
);

 

This second query gets all attachments that are children of (attached to) one of the posts we got in the first query. The result of $attachment_query->posts will again be an array of IDs, this time for the attachments.

This means instead of potentially thousands we will only ever have two queries, regardless of the amount of posts and attachments.

The Solution

To sum up here’s the complete code we use today in our plugin (we added some more parameters and checks, but it’s technically identical).

function pdr_get_attachments_from_cpt( $query ) {
  $cpt_query = new WP_Query(
    array(
      'post_type' => 'post-type-name',
      'posts_per_page' => -1,
      'fields' => 'ids'
    )
  );

  $attachment_query = new WP_Query(
    array(
      'post_type' => 'attachment',
      'posts_per_page' => -1,
      'post_status' => 'any', // Not sure if this is needed, but it makes sure we really get every attachment
      'post_parent__in' => $cpt_query->posts, // Get children of the ids from our first query
      'fields' => 'ids' // Again, we only want to get the ids
    )
  );
}

 

In our case we hooked that function to the pre_get_posts filter to exclude those attachments from all queries, I left that part out for the sake of simplicity. Let me know if you want to see the complete code.

This is exactly the kind of thing I meant when I wrote about taking things apart as a regular learning habit. It didn’t took a whole lot of time to figure out what was going wrong with the original function, I fixed a piece of code which would have gotten us into trouble later on and I learned a few things about how queries in WordPress work.

What did you learn today?

4 responses on “Optimizing a Custom Query in WordPress” so far

  1. Ozh

    April 14, 2015

    Did you benchmark both solutions? I doubt that on most setup doing 2 optimized queries is more efficient than one single query.

    I for one (but definitely not an expert in that field and maybe wrong on the subject) try to minimize the overall number of queries rather than splitting them into optimized sub queries.

    • Claudio Rimann

      April 15, 2015

      I haven’t benchmarked them, but i’m pretty sure it would be noticeable very soon when you start to add more posts.

      In the first example, you see that it used get_children() inside the loop of the first query, which calls get_posts which in turn calls WP_Query.
      As this is in the loop it added at least a query for every post it finds.

      So, the difference isn’t really two optimized queries against one single query, but rather two optimized against possibly thousands of queries depending on the number of posts there are.

  2. Ovidiu Bica

    May 29, 2015

    Hi there, I think Google landed me to the right place. I was looking for a solution to filter OUT posts without image attachments and every solution I found seemed to generate a lot of heat when dealing with many posts.

    I need to run this through pre_get_posts as I already have other filters in place (exclude expired posts, sort by number of comments, price, etc.)

    I would like to have a look at your pre_get_posts code if possible.

    Regards,
    Ovidiu.

Leave your comment