Thursday, July 19, 2012

PHP scripts executed sequentially

tl;dr: session_start() is a blocking call because it locks the session file.

I had 10 images on a page that all used the same php file, but with different parameters. Getting an image took about one second. For whatever reason, the images were loaded one after the other instead of at the same time, which was problematic.

One thing that made me lose quite a few hours was that the HTTP server wasn't Apache, it was Boost.Asio inside a daemon that I wrote. The web page was communicating with this daemon through that HTTP server. I spent a lot of time debugging the daemon, trying to figure out whether a mutex was locked somewhere that would prevent multiple connections from being made at the same time.

I then started wondering whether there were situations when browsers cannot load images concurrently. There doesn't seem to be any, although there is a limit to the number of connections to the same server: 15 on Firefox (see network.http.max-connections-per-server in about:config), 6 on Internet Explorer and possibly 6 for Chrome (although I can't find much reliable information on that, and I don't feel like going through the code). Subdomains could be used to allow more concurrent connections to the same server, but I digress.

I then went back to the PHP scripts and started commenting things out. When something happens that you cannot explain, remove everything and see if that fixes it. If it does, uncomment the code bit by bit until it fails again. After I commented out session_start() at the top of a PHP file, lo and behold, the images loaded concurrently.

I had never paid much attention to session_start(). I had always called it at the very top of some sort of common.php, along with database connections and defines and stuff. I knew sessions wrote their data to a file on disk (in session.save_path), and I had looked at those files while hunting bugs in other scripts. Had I thought about it for a few seconds, I would have realized that this file has to be locked in some way to serialize writes.

And, unsurprisingly, it is. The session file is locked when session_start() is called, which means that every other script executing for the same session (which usually means in the same browser) blocks in its own session_start() until the first script closes the session, either by calling session_write_close() or by finishing.
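One way to check whether a script is actually waiting on that lock is to time the call itself. This is only a minimal sketch: timed_session_start() is a hypothetical helper name, not something PHP provides, and it relies on nothing beyond session_start(), session_write_close(), microtime() and error_log().

```php
<?php
// Hypothetical diagnostic helper: returns how long session_start() took, in seconds.
// A wait close to the run time of another request from the same session
// suggests this script was blocked on the session lock.
function timed_session_start()
{
    $before = microtime(true);
    session_start();            // waits here while another script holds the lock
    return microtime(true) - $before;
}

$waited = timed_session_start();
session_write_close();          // release the lock right away
error_log(sprintf('session_start() waited %.3f s', $waited));
```

Had I logged this in the image script, the wait would presumably have grown by about a second per queued request.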

My solution was to treat the session as a mutex: lock it late and release it quickly.

There are two circumstances where a session needs to be opened: reading all the values in $_SESSION and writing a new value to it. To load all the values, opening the session and immediately closing it with session_write_close() is enough:
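// common.php
load_session();

// Fill $_SESSION, then release the session lock immediately.
function load_session()
{
    session_start();        // locks the session file and reads it into $_SESSION
    session_write_close();  // unlocks it; $_SESSION stays readable
}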
After load_session() is called, $_SESSION is filled and the session itself can be closed.
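On PHP 7.0 and later, session_start() accepts an options array, and its read_and_close option does the same thing in a single call: the session data is read and the session is closed right away. A sketch of the same helper written that way, under the assumption that the script only needs to read at this point (load_session_readonly() is a hypothetical name):

```php
<?php
// Hypothetical PHP 7.0+ variant of load_session().
function load_session_readonly()
{
    // Fills $_SESSION, then closes the session without keeping the lock.
    session_start(['read_and_close' => true]);
}

load_session_readonly();
```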

Instead of writing values directly to $_SESSION, going through a function allows for opening the session, writing the value, and then closing it again:
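// Write one value, holding the session lock only for the duration of the write.
function set_session($n, $v)
{
    session_start();        // lock and reload the session
    $_SESSION[$n] = $v;
    session_write_close();  // save and unlock
}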
There is a synchronization cost: if two scripts execute concurrently and one writes values to the session, the other will not see them, because its copy of $_SESSION was loaded once at the beginning and is not refreshed afterwards. If the scripts must always see each other's writes, the session has to stay locked and the scripts will execute sequentially.
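When only an occasional fresh read is needed, a middle ground is to reopen the session briefly, since session_start() reloads $_SESSION from the session file every time it is called. A minimal sketch, assuming a hypothetical helper get_session_fresh() (both the name and the null default are my own choices):

```php
<?php
// Hypothetical helper: re-read one value from the session store.
function get_session_fresh($n)
{
    session_start();            // takes the lock and reloads $_SESSION
    $v = isset($_SESSION[$n]) ? $_SESSION[$n] : null;
    session_write_close();      // release the lock right away
    return $v;
}

// Usage: returns whatever is stored under 'progress' right now, or null.
$progress = get_session_fresh('progress');
```

The value can still change right after the call returns, so this narrows the window rather than closing it.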

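Note that a session cannot be started once something has been output, because session_start() may need to send headers (the session cookie). This means that set_session() should be treated like a header() call and only used before any output.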
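To make that mistake obvious, the write helper can check headers_sent() first. This is only a sketch: set_session_checked() is a hypothetical name, and throwing a RuntimeException is one possible reaction among others (logging, returning false).

```php
<?php
// Hypothetical variant of set_session() that fails loudly when called too late.
function set_session_checked($n, $v)
{
    // headers_sent() fills $file and $line with the place where output started.
    if (headers_sent($file, $line)) {
        throw new RuntimeException(
            "Output already started at $file:$line, cannot write '$n' to the session"
        );
    }
    session_start();
    $_SESSION[$n] = $v;
    session_write_close();
}

set_session_checked('last_image', 7);   // fine: nothing has been output yet
```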

Lots of internets (if I had known what to search for):