Monday, 15 August 2011

A cURL drop-in replacement for simplexml_load_file()

When using shared hosting (such as that provided by Dreamhost, Servage, etc.), the very nature of multiple users hosting from a single server (without their own virtual machine) means that a number of concessions have to be made in order to keep the service secure. At the "obvious" end of this spectrum are OS-level measures such as blocking root access, disabling useradd etc., but there are also myriad additional concerns that arise when configuring PHP (and other server-side scripting environments) for use in a shared hosting environment.

In the case of PHP, common measures include disabling functions that allow for file execution (exec, system, proc_open, etc.), turning register_globals and enable_dl off and, the focus of this post, preventing remote files from being loaded from a URL. This is achieved by setting allow_url_fopen (or, less stringently, allow_url_include) to Off in the server's php.ini file. The main problem this presents for users of shared hosting is that allow_url_fopen can only be set at the system level (in php.ini or the web server configuration) and can't, therefore, be modified at runtime using ini_set(). The implications of turning allow_url_fopen off are fairly wide-ranging, as it disables all access to remote files through "URL-aware fopen wrappers". This affects functions such as fopen() and file_get_contents() when they are given a remote URL, as well as include(), include_once(), require() and require_once().
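To see how a particular host is configured, a small diagnostic along the following lines may help. It is only a sketch: it prints the two settings as PHP reports them, then tries to flip allow_url_fopen with ini_set(), which is expected to be refused because the setting is system-level. The variable names are purely illustrative.

```php
<?php
// Report the two URL-related settings as PHP sees them at runtime.
foreach (array('allow_url_fopen', 'allow_url_include') as $setting) {
    printf("%s = %s\n", $setting, ini_get($setting) ? 'On' : 'Off');
}

// Try to flip allow_url_fopen at runtime. Because it is a system-level
// setting, ini_set() is expected to return false and leave it unchanged.
$before  = ini_get('allow_url_fopen');
$changed = ini_set('allow_url_fopen', $before ? '0' : '1');

if ($changed === false) {
    echo "allow_url_fopen cannot be changed at runtime\n";
}
```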

In a recent project we were working on, we were hit by this problem during testing. The problem actually stemmed from our use of simplexml_load_file(), which makes use of an fopen wrapper internally. Happily, the client URL (cURL) library was installed and functional on the shared hosting, so we were able to write a simple replacement using cURL and simplexml_load_string():

    $url = "http://example.com/remote.xml";

    if (ini_get('allow_url_fopen')) {
        $xml = simplexml_load_file($url);
    } else {
        // Set up a cURL request
        $curl_request = curl_init($url);
        curl_setopt($curl_request, CURLOPT_HEADER, false);
        curl_setopt($curl_request, CURLOPT_RETURNTRANSFER, true);

        // Execute the cURL request
        $raw_xml = curl_exec($curl_request);

        // ...Check for errors from cURL...

        $xml = simplexml_load_string($raw_xml);
    }

The code is (hopefully!) self-explanatory, with the possible exception of the (fairly minimal) cURL options. Setting CURLOPT_HEADER to false omits the HTTP headers from the returned output, whilst setting CURLOPT_RETURNTRANSFER to true causes curl_exec() to return the response as a string rather than outputting it directly. Finally, the handling of cURL errors is a topic unto itself, but curl_exec() returns false on failure, and the error number from the last cURL operation can then be obtained using the curl_errno() function (or a readable message using curl_error()), passing in the cURL handle returned from curl_init().
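As a rough illustration of how the snippet and the error check might be packaged together, here is a sketch of a reusable function. The names load_xml_from_url() and fetch_url_with_curl() are hypothetical (they are not from the project described above), and the CURLOPT_FAILONERROR and CURLOPT_TIMEOUT options, along with the choice to throw a RuntimeException, are assumptions rather than requirements.

```php
<?php
/**
 * Hypothetical helper: fetch a URL with cURL and return the body as a string.
 * Throws a RuntimeException if the transfer fails.
 */
function fetch_url_with_curl($url)
{
    $handle = curl_init($url);
    curl_setopt($handle, CURLOPT_HEADER, false);         // body only, no headers
    curl_setopt($handle, CURLOPT_RETURNTRANSFER, true);  // return, don't print
    curl_setopt($handle, CURLOPT_FAILONERROR, true);     // assumption: HTTP >= 400 is a failure
    curl_setopt($handle, CURLOPT_TIMEOUT, 10);           // assumption: give up after 10 seconds

    $body = curl_exec($handle);
    if ($body === false) {
        throw new RuntimeException(
            sprintf('cURL error %d: %s', curl_errno($handle), curl_error($handle))
        );
    }

    return $body;
}

/**
 * Hypothetical drop-in for simplexml_load_file(): uses the native function
 * when allow_url_fopen is on, and falls back to cURL otherwise.
 */
function load_xml_from_url($url)
{
    if (ini_get('allow_url_fopen')) {
        return simplexml_load_file($url);
    }

    return simplexml_load_string(fetch_url_with_curl($url));
}
```

It could then be called as $xml = load_xml_from_url("http://example.com/remote.xml");, wrapped in a try/catch if cURL failures need to be handled separately. Like simplexml_load_file(), it still returns false when the response is not well-formed XML, so that case needs its own check.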
