Faster Retrieval and Array Searching with Laravel

April 15, 2014

The other day I ran into a performance issue when working with a large database table in a Laravel application. The application has an artisan command that parses a large chunk of text data and looks for specific word matches in this large database table before deciding what to do with each particular word match.

Querying this table for every word match that the parser finds would cause me to run into something like the dreaded n+1 problem, where I would be executing a database select on every word that the command encountered. So I decided to load this table into memory and let the command search it without hitting the database.

The table is approximately 300,000 rows, and I'm most interested in one column that is a VARCHAR(255). In most cases the column values are 8-20 bytes only, which would eat up about 6MB of memory. If the table grew significantly, I may have to figure out another solution.

Here is a summary of the various steps I took to accomplish this task, and to improve performance along the way.

All timings were taken on my dual core development machine.

1.

$column_vals = array_map(function($row) {
  return $row['colname'];
}, MyModel::all()->toArray());

MyModel is my Eloquent model. Here I use the array_map function to collect all values for the column in question. This took 25 seconds. ~24 of those seconds were spent with the MyModel::all()->toArray() call.

2.

$rows = MyModel::all();
$column_vals = [];
foreach ($rows as $row) {
  $column_vals[]= $row->colname;
}

Removing the ->toArray() call improved things a lot. This took 9 seconds. About 8 of those were spent on the MyModel::all() call.

3.

$rows = DB::table('mymodel')->select('colname')->get();
$column_vals = [];
foreach($rows as $row) {
  $column_vals[]=$row->colname;
}

Just selecting the column that I needed brought the time down to <1 second. I'm a lot happier with that. (I don't think it matters on a select but note that my table is not indexed.) Now on to the search.

1.

for ($i = 0; $i < 1000; $i++) {
  $has_val = in_array('NotInArray', $column_vals)
}

I looped 1000 times looking for a value that was not in my 300,000 element array. 26 seconds. Crap.

2.

replace the $column_vals[]= $row->colname in the retrieval (#3 above) with

$column_vals[$row->colname] = 1;

Adding elements to an associative array alone took the <1 second in #3 above down by a couple hundred milliseconds. Then the search....

for ($i = 0; $i < 1000; $i++) {
  $has_val = array_key_exists('NotInArray', $column_vals)
}

Switching from in_array() to an associative array and using array_key_exists() brought the search part down to 9E-5 seconds. Very happy. Can I do better?

for ($i = 0; $i < 1000; $i++) {
  $has_val = isset($column_vals['NotInArray']);
}

Switching to isset() even brought things down a bit more. 6E-5 seconds for searching 1000 times. Woo!

The bottom line is that when possible only select the columns that you need from an Eloquent model, and favor isset() and associative arrays over in_array() when searching over large arrays.

 

comments powered by Disqus

About This Site

This site was designed by We Are How.

This site is powered by Sculpin static site generator and the source is available here.

Yotta = 10^24, or 1 000 000 000 000 000 000 000 000, the largest metric prefix.


Contact

Get in touch to find out how we can help you refine your vision and implement a dynamite product that will help your business grow. Our agile product development process is thoughtfully designed to give clients ongoing feedback and visibility from project inception to completion.