Well, it’s the final hour of the final day, so it’s time for my weekly mandatory blog post. I was going to counter-rant a hack-job post and call the author a chump as much as my fingers could muster. But I was told that “Chump!” three hundred times is not a blog post. As I was thinking of other topics to write about I realized two things. First, no one cares that much about Lindsay Lohan anymore. Second, I have a hard time deciding what to do.

But it is this distinct lack of commitment that, while being so destructive to my personal relationships, has brought me into a rarely discussed nether-region of PHP functionality: function references and lambda functions. The sad thing is there isn’t really either in the language – at least not what you’re thinking of if you just read “function references” and “lambda functions” and thought, “Oh yeah! Those are awesome!”. No, I’m going to talk about the dark and shameful ways in which we make do without these features in PHP. That feeling you just got? It’s called excitement. Actually, it’s probably indigestion, but get it checked out anyway. All good? Follow us after the jump for a look into the depths of PHP 5.

Choosing Functions at Runtime

In our object-relational mappings we often want to assign arbitrary functions to modify data as it goes into the database and to modify data that is being returned by our objects. It’s a common thing to do, and it’s often called the “filter” pattern. Some people have reported severe degradation of performance due to PHP’s methods for calling a function identified at runtime, so let’s first visit benchmark-ville to hear the real story.

In PHP, you can call a function by its name, by a variable that holds its name, or through two built-in functions: call_user_func and call_user_func_array. You may also use the Reflection classes to achieve the same result. Consider the following example:

function f($a, $b) {
    return $a + $b;
}

// Call the function directly
f(1, 2); // returns 3

// Call the function indirectly, through a variable holding its name
$name = 'f';
$name(1, 2); // returns 3

// Use a function to call the target function
call_user_func('f', 1, 2); // returns 3

// Use a Reflection object to call the target function
$F = new ReflectionFunction('f');
$F->invoke(1, 2); // returns 3

I took measurements of microtime()*10000 before and after the calls, taking the average difference over 10,000 calls to find these results. The four methods require approximately the same amount of time to run on functions in the global scope, but as we target object methods and static methods, call_user_func falls further behind while Reflection stays nearly flat.

Function Invocation Times * 10000 (sec)

Call Target               Direct Call  Indirect Call  call_user_func  Reflection::invoke
Global Function           0.07018      0.07741        0.09791         0.08810
Object’s Instance Method  0.08173      -              0.11092         0.09029
Static Method             0.07867      -              0.11721         0.08909
Object’s __call()         0.11761      -              -               -

You can read this table for or against using these methods. Using call_user_func is only slightly slower than calling an object’s instance method and faster than overloading with __call(), so it’s acceptable to use this and the Reflection class for runtime function execution. Further, since Reflection is faster, if the initialization time of the Reflection objects is negligible (for example, when we run the same function many times), then it would seem to be perfectly acceptable.
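
The object and static rows in the table rely on callable forms that the earlier example does not show. Here is a minimal sketch of what those invocations might look like; the Calculator class and its methods are hypothetical names made up for illustration, not the code that was benchmarked.

```php
<?php
// Hypothetical class, used only to illustrate the callable forms.
class Calculator {
    public function add($a, $b) {
        return $a + $b;
    }

    public static function multiply($a, $b) {
        return $a * $b;
    }

    // Catches calls to instance methods that are not defined.
    public function __call($name, $args) {
        return $name . ':' . array_sum($args);
    }
}

$calc = new Calculator();

// Instance method: direct, call_user_func, and Reflection
$direct  = $calc->add(1, 2);
$viaFunc = call_user_func(array($calc, 'add'), 1, 2);
$method  = new ReflectionMethod('Calculator', 'add');
$viaRefl = $method->invoke($calc, 1, 2);

// Static method: the callable names the class instead of an object
$sDirect  = Calculator::multiply(3, 4);
$sViaFunc = call_user_func(array('Calculator', 'multiply'), 3, 4);
$sMethod  = new ReflectionMethod('Calculator', 'multiply');
$sViaRefl = $sMethod->invoke(null, 3, 4); // no object needed for a static method

// Overloaded call: total() does not exist, so __call() handles it
$overloaded = $calc->total(1, 2, 3);
```

Note that the array form of the callable is what lets call_user_func reach a method at all; a plain string only names a global function.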

Alternatively, if we were to call these functions many times in a series we could very well start to see a performance impact. Since this impact is likely realized slowly as more complexity is introduced to the system, it is unlikely that we would understand the full effect of our decision until it’s too late.
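
If you would rather measure than guess, a timing loop along the lines described above is easy to put together. This is only a sketch of how such a measurement could be made, not the script that produced the tables; in particular, the use of microtime(true) to get float seconds and the layout of the $timings array are my assumptions.

```php
<?php
// Target function, same shape as the one in the article's example.
function f($a, $b) {
    return $a + $b;
}

$iterations = 10000;
$name       = 'f';
$reflected  = new ReflectionFunction('f'); // built once, outside the timed loops
$timings    = array();

// Direct call
$start = microtime(true); // assumption: float seconds, not the string form
for ($i = 0; $i < $iterations; $i++) {
    f(1, 2);
}
$timings['direct'] = (microtime(true) - $start) / $iterations;

// Indirect call through a variable holding the name
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $name(1, 2);
}
$timings['indirect'] = (microtime(true) - $start) / $iterations;

// call_user_func
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    call_user_func('f', 1, 2);
}
$timings['call_user_func'] = (microtime(true) - $start) / $iterations;

// Reflection invoke on the prebuilt object
$start = microtime(true);
for ($i = 0; $i < $iterations; $i++) {
    $reflected->invoke(1, 2);
}
$timings['reflection'] = (microtime(true) - $start) / $iterations;

// Report the average seconds per call, scaled by 10000 like the tables.
foreach ($timings as $label => $seconds) {
    printf("%-16s %.5f\n", $label, $seconds * 10000);
}
```

The absolute numbers will differ from the tables on any other machine or PHP version, so only the relative ordering is worth comparing.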

We’ve only discussed ways of invoking a function with a fixed number of arguments. In reality this is sufficient but can be inelegant. PHP has two ways of mapping an array of values to spots in an argument list during invocation: call_user_func_array and Reflection::invokeArgs.

Function Invocation Times * 10000 (sec)

Call Target               call_user_func_array  Reflection::invokeArgs
Global Function           0.10263               0.09455
Object’s Instance Method  0.11663               0.09778
Static Method             0.12209               0.09627

Using call_user_func_array is roughly one and a half times as slow as a direct call, and that’s not something to shake a stick at (although, I am admittedly against stick shaking in general). But if you’re only running your filter once as things are output or if your “Plan B” is more than two function calls worth of indirection, use call_user_func_array. Reflection again impresses me with performance. As long as we cache the objects and don’t have to invest a lot of time in the overhead of creation, they seem like a very appropriate solution.
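
To make the array-based calls and the caching idea concrete, here is a small sketch. The clamp_value filter and the reflected() cache helper are hypothetical names invented for this example; the point is only that the ReflectionFunction object is constructed once and reused.

```php
<?php
// Hypothetical filter, used only for illustration.
function clamp_value($value, $min, $max) {
    return max($min, min($max, $value));
}

// Hypothetical cache helper: builds each ReflectionFunction once per name.
function reflected($name) {
    static $cache = array();
    if (!isset($cache[$name])) {
        $cache[$name] = new ReflectionFunction($name);
    }
    return $cache[$name];
}

// The arguments arrive as an array and are mapped onto the parameter list.
$args = array(15, 0, 10);

$viaFunc = call_user_func_array('clamp_value', $args);
$viaRefl = reflected('clamp_value')->invokeArgs($args);

// A second lookup returns the very same object, so creation cost is paid once.
$sameObject = reflected('clamp_value') === reflected('clamp_value');
```

Whether the cache pays off depends on how often the same function is looked up; for a filter that runs once per request it is probably not worth the extra code.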

Creating Functions at Runtime

Choosing a function that you’ve written at runtime is one thing, but what if you’re so indecisive that you can’t even do that? You want to have your function written at runtime! You have two options: First, you can write a .php file with what you want and then include it. Then you can have the joy of extra files floating around your file system and dealing with file locks and so on and so forth until you snap on someone who was just asking for directions and now thinks people in your town are jerks. The option to go with is create_function.

The function create_function does not create an anonymous function with a closure and magic. Rather, it creates a function in the global namespace with a crazy name that you’re unlikely to type. It is, for all intents and purposes, a glorified eval. It returns a crazy name associated with your function as a string, so when you invoke it you’re actually just doing the indirect invocation as above. But since it’s a valid function name, you can use it in the other methods mentioned above as well and you’ll get the same performance.
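
The “glorified eval” description can be sketched directly. The helper below is hypothetical, not a PHP built-in: it defines a global function with a generated name through eval and hands the name back as a string, which is the behaviour described above. I use eval rather than create_function itself because create_function is no longer available in PHP 8 and later; the generated naming scheme is my own invention.

```php
<?php
// Hypothetical helper that mimics the behaviour described for create_function:
// define a global function under a generated name and return that name.
function define_runtime_function($args, $body) {
    static $counter = 0;
    $counter++;
    $name = 'runtime_fn_' . $counter;

    // The declaration lands in the global function table, not in a closure.
    eval('function ' . $name . '(' . $args . ') { ' . $body . ' }');

    return $name;
}

$add = define_runtime_function('$a, $b', 'return $a + $b;');

// The result is just a string, so every invocation style from earlier applies.
$sum1     = $add(1, 2);
$sum2     = call_user_func($add, 3, 4);
$isGlobal = function_exists($add);
```

Since the returned value is an ordinary function name, nothing ever frees the function once it is defined, which is where the memory trouble comes from.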

The trauma most associated with feckless use of create_function is memory. Each function takes up memory, and the sort of things you may be doing that would require creating these magic beasties usually mean you are creating a lot of them. It’s extremely easy to hit your memory limit, no matter what that actual limit is. Trust me.

But there are cases where it makes sense to use these things. For example, in a workflow graph we have a utility class which simply evaluates its arguments to determine along which branch the workflow should proceed. Since we also want the administrative users to be able to enter these branch conditions, we are left with two options: generate whole classes or use create_function. It was a sort of accepted knowledge that using create_function would be too slow, so I decided to prove it.

Lambda Function Definition Over 1000 Calls

Code Length (chars)  Average Execution Time / 10000 (msec)  Average Memory Used (bytes)
88                   0.37948                                1940
176                  0.54889                                3244
264                  0.70673                                4539
352                  0.87082                                5835
440                  1.04604                                7128
528                  1.24263                                8434
616                  1.40099                                9729
704                  1.56975                                11026
792                  1.73913                                12323
880                  1.90314                                13618

As you can see, for small bits of code we take up about 5 function calls worth of time, a non-trivial chunk of memory, and as the size of our code increases, the execution time and memory required seem to get far too unwieldy far too fast.
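
For anyone who wants to repeat the memory side of this experiment, here is a rough sketch of one way to do it. It is not the script behind the table: it defines functions with eval (as a stand-in for create_function, which modern PHP no longer ships), the generated_N names and the repeated statement are made up, and memory_get_usage() deltas are only an approximation of what each definition costs.

```php
<?php
// One statement repeated to grow the function body in steps.
$statement = '$x = $x + 1; ';
$results   = array();

for ($n = 1; $n <= 3; $n++) {
    $body = str_repeat($statement, $n * 8) . 'return $x;';
    $name = 'generated_' . $n; // hypothetical naming scheme

    // Measure memory immediately before and after the definition.
    $before = memory_get_usage();
    eval('function ' . $name . '($x) { ' . $body . ' }');
    $used = memory_get_usage() - $before;

    $results[$n] = array(
        'length' => strlen($body), // body size in characters
        'bytes'  => $used,         // approximate memory taken by the definition
        'value'  => $name(0),      // sanity check: the function really runs
    );
}

foreach ($results as $row) {
    printf("%5d chars  %8d bytes\n", $row['length'], $row['bytes']);
}
```

Expect the byte counts to differ from the table, since they depend on the PHP version and build.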

If my description above of create_function as being a crazy eval piqued your interest, then note that create_function and eval are pretty much identical in runtime and memory footprint:

Function Definition Over 1000 Calls

Method  Time * 10000 (msec)  Memory (bytes)
Lambda  1.03871              7117
Eval    0.97363              7012

Conclusion

So there you have it. I don’t think it’s fair to say that these methods are too slow for all applications, especially for a language such as PHP. I mean, come on, we’re not finding spectral norms here! Right? Am I right? I mean we have R and Matlab for that.
