Understanding PHP Generators

PHP generators let us loop through large sets of data without building the whole set in memory first.

When it comes to driving, speed is not everything. But on the web, speed makes all the difference: the faster your application, the better the user experience. So why does an article on PHP generators start by talking about speed? As you are about to find out, generators can make a big difference to both speed and memory usage.

What are PHP Generators?

Added to PHP in version 5.5, generators are functions that provide a simple way to loop through data without the need to build an array in memory. Still a bit confused? An example is a good way to show generators in action.

First, let's quickly create a generator.php file that we will use throughout this tutorial. After creating the file, we add this little code snippet.

<?php

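function getRange ($max = 10) {
    // Collect every number in an array before returning anything.
    $array = [];

    for ($i = 1; $i < $max; $i++) {
        $array[] = $i;
    }

    // The caller receives the complete array in one go.
    return $array;
}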

foreach (getRange(15) as $range) {
    echo "Dataset {$range} <br>";
}

We can quickly spin up an inbuilt PHP server in the directory where we created the generator.php file:

php -S localhost:8000

If we go to http://localhost:8000/generator.php, we should see the lines "Dataset 1" through "Dataset 14" printed on the page.

The code is pretty much self-explanatory, and the output doesn't look like much. But let's go back into our code and make a little change:

<?php

foreach (getRange(PHP_INT_MAX) as $range) {
    echo "Dataset {$range} <br>";
}

Now the upper bound ($max) of the generated numbers is PHP_INT_MAX, which is the largest integer that your build of PHP can represent. After doing this, head over to the browser and refresh. This time you'll notice something different: the script stops with a fatal error because PHP has exhausted its allowed memory size.

Well, that's a shame: PHP ran out of memory. One solution that comes to mind is going into php.ini and increasing memory_limit. But let's ask ourselves two questions. Is this really effective? Do we want a single script to hog all our server's memory? The answers are no and no.

Using Generators

Let's define the same function above, call it with the same value PHP_INT_MAX and run it again. But, this time, we will be creating a generator function.

<?php

function getRange ($max = 10) {
    for ($i = 1; $i < $max; $i++) {
        // Hand one value to the caller, then pause until the next one is requested.
        yield $i;
    }
}

foreach (getRange(PHP_INT_MAX) as $range) {
    echo "Dataset {$range} <br>";
}

Dissecting the getRange function: this time we only loop through the values and yield each one. yield is similar to return in that it hands a value back to the caller, but instead of ending the function it pauses it. The next value is produced only when the loop asks for it, so the entire dataset is never held in memory.

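If you head over to your browser, you should see data being displayed on the page, this time without the memory error. Counting all the way to PHP_INT_MAX takes a very long time, though, so depending on your max_execution_time setting PHP may stop the script before it finishes.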
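To make the difference concrete, here is a minimal sketch that measures how much memory each approach needs for the same range. It assumes PHP 7 or later (for the type declarations); rangeAsArray, rangeAsGenerator and memoryGrowth are hypothetical names chosen for this illustration, and the exact byte counts will vary between PHP versions.

```php
<?php

// Eager version: builds the whole array before returning it.
function rangeAsArray(int $max): array {
    $values = [];
    for ($i = 1; $i < $max; $i++) {
        $values[] = $i;
    }
    return $values;
}

// Lazy version: yields one value at a time.
function rangeAsGenerator(int $max): Generator {
    for ($i = 1; $i < $max; $i++) {
        yield $i;
    }
}

// Hypothetical helper: reports how many bytes memory usage grew by
// while looping over whatever $producer returns.
function memoryGrowth(callable $producer): int {
    $before = memory_get_usage();
    $peak = $before;

    foreach ($producer() as $value) {
        $peak = max($peak, memory_get_usage());
    }

    return $peak - $before;
}

$arrayBytes = memoryGrowth(function () {
    return rangeAsArray(100000);
});
$generatorBytes = memoryGrowth(function () {
    return rangeAsGenerator(100000);
});

echo "Array: {$arrayBytes} bytes, generator: {$generatorBytes} bytes\n";
```

The array version should report a figure that grows with the size of the range, while the generator version should stay roughly constant no matter how large $max gets.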

Note: yield can only be used inside a function (methods and closures included). Any function that contains yield is a generator function.
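It may help to see what calling a generator function actually does. The sketch below, which assumes PHP 7 or later for the type declarations, shows that the call returns a Generator object straight away and that the function body does not start running until we begin looping.

```php
<?php

function getRange(int $max = 10): Generator {
    echo "generator body started\n";

    for ($i = 1; $i < $max; $i++) {
        yield $i;
    }
}

// Calling the function runs none of its body; it only creates the object.
$generator = getRange(3);
$isGenerator = $generator instanceof Generator;

echo "before loop\n";

// The body starts running here, when the first value is requested.
foreach ($generator as $number) {
    echo "got {$number}\n";
}
```

Running this prints "before loop" first and "generator body started" second, which confirms that the work happens on demand.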

Why Do This?

There are times when we might want to parse a large dataset (log files, for example) or perform computation on a large database result. We don't want actions like this hogging all the memory, so we should try to conserve memory as much as possible. The data doesn't necessarily need to be large; generators are effective no matter how small a dataset is. Don't forget, our aim is speed while using less memory.
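As an illustration of the log file case, here is a minimal sketch of a generator that reads a file one line at a time. readLines is a hypothetical helper name, the temporary file stands in for a real log, and the type declarations assume PHP 7 or later.

```php
<?php

// Yields one line at a time, so only the current line is held in memory.
function readLines(string $path): Generator {
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path}");
    }

    try {
        while (($line = fgets($handle)) !== false) {
            yield rtrim($line, "\r\n");
        }
    } finally {
        // Close the file when we run out of lines or the loop ends early.
        fclose($handle);
    }
}

// Demo data: a small temporary file standing in for a large log.
$path = tempnam(sys_get_temp_dir(), 'log');
file_put_contents($path, "INFO start\nERROR disk full\nINFO done\n");

$errors = 0;
foreach (readLines($path) as $line) {
    if (strpos($line, 'ERROR') === 0) {
        $errors++;
    }
}

echo "Found {$errors} error line(s)\n";
unlink($path);
```

The same loop works unchanged whether the file has three lines or three million, because the generator never loads more than one line at a time.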

Returning Keys

There are times when our data only make sense when they are key-value based. When using generators, we can yield key-value pairs like this.

<?php

function getRange ($max = 10) {
    for ($i = 1; $i < $max; $i++) {
        $value = $i * mt_rand();

        yield $i => $value;
    }
}

We can then go ahead and use the pair as we would do with any array like this.

<?php

foreach (getRange(PHP_INT_MAX) as $range => $value) {
    echo "Dataset {$range} has {$value} value<br>";
}
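For a case where the keys carry real meaning, consider this small sketch that turns "key = value" lines into pairs. parseSettings and the sample lines are hypothetical, and the type declarations assume PHP 7 or later.

```php
<?php

// Yields each "key = value" line as a key => value pair.
function parseSettings(array $lines): Generator {
    foreach ($lines as $line) {
        list($key, $value) = explode('=', $line, 2);

        yield trim($key) => trim($value);
    }
}

$lines = ['host = localhost', 'port = 8000', 'debug = on'];

foreach (parseSettings($lines) as $name => $setting) {
    echo "{$name}: {$setting}\n";
}

// The keys survive if we collect the pairs into an array.
$settings = iterator_to_array(parseSettings($lines));
```

One thing to keep in mind is that a generator can yield the same key more than once. If that happens, collecting the pairs with iterator_to_array keeps only the last value for that key.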

Sending Values to Generators

Generators can also take in values. This means we can inject values into them, for instance as a command. For example, we can send a value to our generator telling it to stop execution or change the output. Using the getRange function above, we can do this.

<?php

function getRange ($max = 10) {
    for ($i = 1; $i < $max; $i++) {
        $injected = yield $i;

        if ($injected === 'stop') return;
    }
}

To inject this value, we can do this.

<?php

$generator = getRange(PHP_INT_MAX);

foreach ($generator as $range) {
    if ($range === 10000) {
        $generator->send('stop');
    }

    echo "Dataset {$range} <br>";
}

Note: Using return in a generator ends the generator function, so the foreach loop finishes once the current iteration is done.
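If we want to see exactly when each value goes in and comes out, we can drive the generator by hand instead of with foreach. This is a minimal sketch using the Generator methods valid(), current(), next() and send(); the $seen array is only there for illustration, and the type declarations assume PHP 7 or later.

```php
<?php

function getRange(int $max = 10): Generator {
    for ($i = 1; $i < $max; $i++) {
        // yield hands $i to the caller; whatever is sent back becomes $injected.
        $injected = yield $i;

        if ($injected === 'stop') {
            return; // ends the generator
        }
    }
}

$generator = getRange(PHP_INT_MAX);
$seen = [];

while ($generator->valid()) {
    $value = $generator->current();
    $seen[] = $value;

    if ($value === 5) {
        $generator->send('stop'); // resumes the generator, which then returns
    } else {
        $generator->next();       // resumes with null as the result of yield
    }
}

echo implode(', ', $seen), "\n"; // 1, 2, 3, 4, 5
```

When the loop calls next(), the yield expression inside the generator evaluates to null, so the generator keeps counting. Only send('stop') makes it return.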

Don't Misuse Generators

Using PHP_INT_MAX is a tad overboard. For me, PHP_INT_MAX is 2147483647, the 32-bit value (64-bit builds of PHP 7 and later report 9223372036854775807). That is:

two billion one hundred forty-seven million four hundred eighty-three thousand six hundred forty-seven

Generators are supposed to be memory efficient. This doesn't mean that they won't cause the same problem they are trying to solve if misused.
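One easy way to misuse a generator is to collect everything it yields. The sketch below compares calling iterator_to_array on a generator with simply looping over it. It assumes PHP 7 or later for the type declarations, and the byte counts will differ between PHP versions.

```php
<?php

function getRange(int $max = 10): Generator {
    for ($i = 1; $i < $max; $i++) {
        yield $i;
    }
}

// Misuse: iterator_to_array() pulls every value into an array,
// so memory grows with the size of the range again.
$before = memory_get_usage();
$all = iterator_to_array(getRange(100000));
$materialized = memory_get_usage() - $before;
unset($all);

// Streaming: only a running total is kept between iterations.
$before = memory_get_usage();
$sum = 0;
foreach (getRange(100000) as $value) {
    $sum += $value;
}
$streamed = memory_get_usage() - $before;

echo "Collected: {$materialized} bytes, streamed: {$streamed} bytes\n";
```

The same applies to appending every yielded value to an array inside the loop. If the whole dataset ends up in memory anyway, the generator has saved us nothing.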

Conclusion

Generators can cut memory usage dramatically when we work through large datasets. Most times we don't need more powerful servers to handle our code; we just need to do a little refactoring. Generators are useful, and we ought to reach for them whenever we loop over data that doesn't need to be in memory all at once.

Samuel Oloruntoba

Self-proclaimed full-stack web developer and a quasi-academic. I work mostly on the backend (PHP and Node) with a recent enthusiasm for frontend development (React, SVG, HTML5 Canvas).