Wednesday, July 06, 2011

PHP: Transparent Self-Caching of Objects

When programming in PHP I work with objects. Some objects exist only once in memory, either because they are instantiated just once or because they use the Singleton design pattern. Others exist in many instances. Because PHP keeps no state between requests, all of these objects have to be created again on every page load. Loading too many objects every time will slow down your website. The natural solution is caching.
Caching is often thought of as something done to objects from the outside. That makes it look scary: you have to change the way you instantiate objects (in many different places in your code) so that it goes through the cache. Here I will present a way to implement caching internally, so that objects cache themselves almost seamlessly.
Look at the code below:



class SlowObject {
    protected $id;
    public static $instances = array(); // public for debugging

    function __construct($id) {
        $this->id = $id;
        sleep(1); // slow instantiation
    }

    // Returns the one instance for this ID, creating it on first use.
    static function getInstance($id) {
        if (!isset(self::$instances[$id])) {
            self::$instances[$id] = new self($id);
        }
        return self::$instances[$id];
    }
}


// testing
$a = SlowObject::getInstance(1);
$b = SlowObject::getInstance(2);
$c = SlowObject::getInstance(1);


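// $_SERVER['REQUEST_TIME'] has whole-second resolution, so the elapsed time can read up to a second too high
echo 'Created '.sizeof(SlowObject::$instances).' objects in '.(microtime(true)-$_SERVER['REQUEST_TIME'])." sec.\n";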


This extends the Singleton pattern (the variant is sometimes called a Multiton): instead of one instance for the whole class, we keep one instance per supplied ID. Output:
Created 2 objects in 2.5048339366913 sec.
Note that it created two unique objects (not three) and took roughly two seconds (not three). This alone improves performance, because each ID is instantiated only once per request.
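It is worth checking that repeated calls with the same ID return the very same object rather than a copy. The sketch below is self-contained and makes two assumptions that are not in the code above: `CountedObject` is a hypothetical stand-in for `SlowObject` that counts constructor calls instead of sleeping, and its constructor is made protected so that callers have to go through `getInstance()`.

```php
<?php
// Hypothetical stand-in for SlowObject: counts constructor calls instead of sleeping.
class CountedObject {
    public static $constructed = 0;         // how many times the constructor ran
    protected static $instances = array();  // per-request registry, keyed by ID
    protected $id;

    // Protected (an assumption of this sketch): forces callers through getInstance().
    protected function __construct($id) {
        $this->id = $id;
        self::$constructed++;
    }

    public static function getInstance($id) {
        if (!isset(self::$instances[$id])) {
            self::$instances[$id] = new self($id);
        }
        return self::$instances[$id];
    }
}

$a = CountedObject::getInstance(1);
$b = CountedObject::getInstance(2);
$c = CountedObject::getInstance(1);

var_dump($a === $c);                     // bool(true): same ID, same object
var_dump($a === $b);                     // bool(false): different IDs
echo CountedObject::$constructed, "\n";  // 2
```

Because `$a` and `$c` are the same object, a change made through one is visible through the other for the rest of the request.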
Now we will extend this approach to cache the objects on disk, so that they survive between requests. (You can extend it to use memcache or APC to be even more effective, but disk is enough for demonstration purposes.)




class SlowObject {
    protected $id;
    public static $instances = array(); // public for debugging

    function __construct($id) {
        $this->id = $id;
        sleep(1); // slow instantiation
    }

    // Disk layer: return the cached copy if one exists, otherwise build and store it.
    static function getInstance($id) {
        $filename = '../cache/'.__CLASS__.'.'.$id.'.cache';
        // @ silences the warning for a missing file; the result is then false
        $cached = @unserialize(file_get_contents($filename));
        if (!is_object($cached)) {
            $cached = self::getInstanceRaw($id);
            file_put_contents($filename, serialize($cached));
        }
        return $cached;
    }

    // Memory layer: one instance per ID within the current request.
    static function getInstanceRaw($id) {
        if (!isset(self::$instances[$id])) {
            self::$instances[$id] = new self($id);
        }
        return self::$instances[$id];
    }
}


// testing
$a = SlowObject::getInstance(1);
$b = SlowObject::getInstance(2);
$c = SlowObject::getInstance(1);


echo 'Created '.sizeof(SlowObject::$instances).' objects in '.(microtime(true)-$_SERVER['REQUEST_TIME'])." sec.\n";



The first time you run this it outputs:
Created 2 objects in 2.9201810359955 sec.
A bit over two seconds is roughly the same as before, but the created objects are now serialized into files. The next time you run this:
Created 0 objects in 0.27261805534363 sec.
Notice the difference? No objects were constructed at all; they were all read from the cache files. Run it several times in a row and it is fast every time. And you hardly need to change your calling code: just replace $a = new SlowObject($id); with $a = SlowObject::getInstance($id);. The in-memory and on-disk caching are built into the object itself.
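One thing to watch in the disk version: every getInstance() call reads the file, so two calls with the same ID in one request return two separate copies. A possible refinement, sketched here under assumptions, is to look in memory first, then on disk, and only then construct. `LayeredObject`, `cacheFile()`, `forget()` and the `$built` counter are hypothetical names, and the cache directory is the system temp directory rather than '../cache/' so the sketch runs anywhere.

```php
<?php
// Hypothetical variant: memory first, then disk, then construction.
class LayeredObject {
    public static $built = 0;               // constructor calls, for the demo
    protected static $instances = array();  // per-request registry, keyed by ID
    protected $id;

    protected function __construct($id) {
        $this->id = $id;
        self::$built++;                      // stands in for the slow work
    }

    public static function cacheFile($id) {
        return sys_get_temp_dir() . '/' . __CLASS__ . '.' . $id . '.cache';
    }

    public static function getInstance($id) {
        // 1. Already seen in this request: return the same object.
        if (isset(self::$instances[$id])) {
            return self::$instances[$id];
        }
        // 2. Try the cache file written by an earlier request.
        $file = self::cacheFile($id);
        $inst = is_file($file) ? @unserialize(file_get_contents($file)) : false;
        // 3. Nothing usable on disk: build the object and store it.
        if (!($inst instanceof self)) {
            $inst = new self($id);
            file_put_contents($file, serialize($inst), LOCK_EX);
        }
        return self::$instances[$id] = $inst;
    }

    // Empties the in-memory registry; used below to imitate a new request.
    public static function forget() {
        self::$instances = array();
    }
}

@unlink(LayeredObject::cacheFile(1));  // start the demo with a cold cache
$a = LayeredObject::getInstance(1);    // constructed, then written to disk
$c = LayeredObject::getInstance(1);    // served from memory
var_dump($a === $c);                   // bool(true)

LayeredObject::forget();               // imitate the next page load
$d = LayeredObject::getInstance(1);    // served from disk, constructor not called
var_dump(LayeredObject::$built);       // int(1)
```

Note that with this variant the count of in-memory instances is no longer zero on a warm run, because objects read from disk are registered in memory as well.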
Now make sure that as soon as your data changes (in the DB, for example) you stop using the cached object and create a new one with the updated data (saving it to the cache again). It's as simple as deleting the cache file when the object is updated. Just check that nothing else in your project changes the DB directly, bypassing the objects' update function.

function update(array $update) {
    // Drop the stale cache file; the next getInstance() call rebuilds it.
    @unlink('../cache/'.__CLASS__.'.'.$this->id.'.cache');
    parent::update($update);
}
You can put this functionality into a base class and extend it whenever you want an object to be cacheable.
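When the logic moves into a base class, `new self` and `__CLASS__` would refer to the base class rather than the subclass, so a shared implementation needs late static binding (PHP 5.3+). The following is only a sketch of how such a base class might look: `SelfCaching`, `Article`, `load()`, `cacheFile()` and `invalidate()` are hypothetical names, and the temp directory replaces '../cache/' so the example runs anywhere. Here `invalidate()` plays the role of the unlink call in update().

```php
<?php
// Hypothetical base class: subclasses inherit the disk caching.
abstract class SelfCaching {
    protected $id;

    final protected function __construct($id) {
        $this->id = $id;
        $this->load();
    }

    // Subclasses put their slow loading (DB queries etc.) here.
    abstract protected function load();

    protected static function cacheFile($id) {
        // get_called_class() names the subclass, unlike __CLASS__.
        return sys_get_temp_dir() . '/' . get_called_class() . '.' . $id . '.cache';
    }

    public static function getInstance($id) {
        $file = static::cacheFile($id);
        $inst = is_file($file) ? @unserialize(file_get_contents($file)) : false;
        if (!($inst instanceof static)) {
            $inst = new static($id);  // late static binding: builds the subclass
            file_put_contents($file, serialize($inst), LOCK_EX);
        }
        return $inst;
    }

    // Call this whenever the underlying data changes.
    public function invalidate() {
        @unlink(static::cacheFile($this->id));
    }
}

// Hypothetical subclass used only for the demo.
class Article extends SelfCaching {
    public static $loads = 0;  // how many times load() ran
    public $title;

    protected function load() {
        self::$loads++;
        $this->title = 'Article #' . $this->id;
    }
}

Article::getInstance(7)->invalidate();  // make sure the demo starts with no cache file
$before = Article::$loads;

$a = Article::getInstance(7);           // slow path: load() runs, file is written
$b = Article::getInstance(7);           // fast path: read from the cache file
echo Article::$loads - $before, "\n";   // 1

$a->invalidate();                       // data changed: drop the cache file
$c = Article::getInstance(7);           // slow path again
echo Article::$loads - $before, "\n";   // 2
```

The sketch assumes classes without a namespace; a namespaced class name contains backslashes and would need to be sanitized before being used in a file name.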
If you find this interesting, let me know and I will show you the class that saves objects to files automatically (using the __destruct function) and how I solved a few gotchas.
