PHP Web Services (2013)

Chapter 6. XML

XML is another very common data format used with APIs, and should feel familiar to us as developers. Anyone who has spent much time with the Web will understand the “pointy brackets” style of XML and will be able to read it. XML is a rather verbose format; the additional punctuation and scope for attributes, character data, and nested tags can make for a slightly bigger data size than other formats.

XML has many more features than JSON, and can represent a great many more things. You’ll see more of this in Chapter 7, where complex data types and namespaces will come into play. XML doesn’t have to be complicated; simple data can also be easily represented, just as it is with JSON. Consider our shopping list again:

•  eggs
•  bread
•  milk
•  bananas
•  bacon
•  cheese

The XML representation of this list would be:

<?xml version="1.0"?>
<list>
  <item>eggs</item>
  <item>bread</item>
  <item>milk</item>
  <item>bananas</item>
  <item>bacon</item>
  <item>cheese</item>
</list>

Working with XML in PHP isn’t as easy as working with JSON. To produce the previous example, the code in Example 6-1 was used.

Example 6-1. Example of working with XML

<?php

$list = array(
    "eggs",
    "bread",
    "milk",
    "bananas",
    "bacon",
    "cheese"
);

$xml = new SimpleXMLElement("<list />");
foreach($list as $item) {
    $xml->addChild("item", $item);
}

// for nice output
$dom = dom_import_simplexml($xml)->ownerDocument;
$dom->formatOutput = true;
echo $dom->saveXML();

The starting point is the array that will be our list, then a SimpleXMLElement object is instantiated with a root tag that forms the basis for the document. In XML, everything has to be in a tag so an <item> tag has been introduced in order to contain each list item.

The final block only makes the output prettier, which isn’t usually important because XML is for machines, not for humans. To convert a SimpleXMLElement object to XML, call the asXML() method on that object, which returns a string. The string, however, is all on one line!

The previous example instead converted from SimpleXMLElement to DOMElement, and then grabbed the DOMDocument from that. With the formatOutput property set to true, a call to DOMDocument::saveXML() (which returns the XML as a string) produces nicely formatted output.
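To make the difference between the two routes concrete, here is a minimal sketch that builds a shortened list (two items only, chosen for brevity) and produces both the compact string from asXML() and the formatted string from DOM. The variable names $compact and $pretty are illustrative and not part of Example 6-1.

```php
<?php
// Build a shortened version of the shopping list from Example 6-1.
$xml = new SimpleXMLElement("<list />");
foreach (array("eggs", "bread") as $item) {
    $xml->addChild("item", $item);
}

// asXML() returns the document as a string, with no indentation added.
$compact = $xml->asXML();

// Going through DOM with formatOutput enabled adds line breaks and indentation.
$dom = dom_import_simplexml($xml)->ownerDocument;
$dom->formatOutput = true;
$pretty = $dom->saveXML();

echo $compact;
echo $pretty;
```

Both strings describe the same document; only the whitespace between the tags differs, which is why the extra DOM step is optional when the consumer is another program.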

When to Choose XML

XML’s abilities to represent attributes, children, and character data all provide a more powerful and descriptive way to represent data than, for example, JSON. These same features make XML a great way to represent very detailed information, including data-type information, so it’s a great choice when those details really do matter. It can include information about the types of data and custom data types, and each element can have attributes that cover even more information.
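As a small illustration of attributes carrying extra detail, the sketch below extends the shopping list so that each item records how much to buy. The "quantity" and "unit" attribute names are assumptions invented for this example; they do not come from any particular API.

```php
<?php
// Hypothetical "quantity" and "unit" attributes, used only for illustration.
$xml = new SimpleXMLElement("<list />");

$milk = $xml->addChild("item", "milk");
$milk->addAttribute("quantity", "2");
$milk->addAttribute("unit", "litres");

$eggs = $xml->addChild("item", "eggs");
$eggs->addAttribute("quantity", "12");

echo $xml->asXML();

// Reading the data back: the element text and its attributes are separate.
foreach ($xml->item as $item) {
    echo $item . ": " . $item["quantity"] . "\n";
}
```

The element text still holds the item name, while the attributes describe it, which is the kind of layering that a plain list of strings cannot express.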

The larger data format is less of a concern when working with powerful machines and fast network connections, so XML is a popular choice when exchanging data between computers or servers, rather than sending things to phones or web browsers. Do be aware, however, that bandwidth costs may well still apply and may be a significant cost factor when large amounts of data are being transferred.

APIs are all about integration between systems and sometimes the choice of data format will be dictated by whatever is on the other end of the relationship. XML is particularly popular among many enterprise technology platforms such as Java, Oracle, and .NET, so users of these technologies will often request XML as a preferred format. If you are working with products or people that would prefer XML or are more confident handling this format, then offer XML, even if only as one of multiple data format options in your API.

XML in PHP

There are many ways we can work with XML in PHP, and they’re all useful in different situations. There are three main approaches to choose from and they all have their advantages and disadvantages:

1.    SimpleXML is the most approachable, and my personal favorite. It is easy to use and understand, is well documented, and provides a simple interface (as the name suggests) for getting the job done. SimpleXML does have some limitations, but it is recommended for most applications.

2.    DOM is handy when a project encounters some of the limitations in SimpleXML. It’s more powerful and therefore more complicated to use, but there are a small number of operations that can’t be done with SimpleXML. There are built-in functions to allow conversion between these two formats, so it’s very common to use a combination of both in applications, as we saw earlier in Example 6-1.

3.    XMLReader, XMLWriter, and their sister, the XML Parser extension, are lower-level ways of dealing with XML. In general, these tools are complicated and unintuitive but they have a major advantage: they don’t load the entire XML document into memory at once. If very large data sets are involved, then this approach will be your friend.
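The streaming approach in the third option can be sketched with XMLReader. This is a minimal example rather than a recommended pattern: a short in-memory string stands in for what would normally be a very large file, and the $source and $items names are illustrative.

```php
<?php
// A small in-memory document stands in for a very large file here.
$source = "<list><item>eggs</item><item>bread</item><item>milk</item></list>";

$reader = new XMLReader();
$reader->XML($source); // for a file on disk, $reader->open($path) would be used instead

$items = array();

// read() moves through the document one node at a time,
// so no tree of the whole document is built.
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === "item") {
        $items[] = $reader->readString();
    }
}
$reader->close();

print_r($items);
```

The loop is more work than the equivalent SimpleXML code, which is the trade-off described above: more code in exchange for not holding the whole document in memory.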

XML in Existing APIs

A wide variety of APIs use XML. This next example looks at the photo-sharing site Flickr. The Flickr API provides a wide variety of functionality for working with photos, and most languages will have some classes available that you can use with it, but there’s no reason not to interact with the API directly. Example 6-2 shows how to find a list of kitten pictures.

Example 6-2. Fetching data from Flickr’s XMLRPC service

<?php

require("api-key.php");

$animal = "kitten";
$data = file_get_contents('http://api.flickr.com/services/rest/?'
    . http_build_query(array(
        "method" => "flickr.photos.search",
        "api_key" => $api_key,
        "tags" => $animal,
        "format" => "xmlrpc",
        "per_page" => 6
    ))
);
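It can help to look at the query string on its own before sending anything. The sketch below runs the same http_build_query() call without contacting Flickr; "YOUR_API_KEY" is a placeholder assumed for illustration, not a working key.

```php
<?php
// "YOUR_API_KEY" is a placeholder, not a real key.
$query = http_build_query(array(
    "method" => "flickr.photos.search",
    "api_key" => "YOUR_API_KEY",
    "tags" => "kitten",
    "format" => "xmlrpc",
    "per_page" => 6
));

// prints: method=flickr.photos.search&api_key=YOUR_API_KEY&tags=kitten&format=xmlrpc&per_page=6
echo $query . "\n";
```

Appending this string to the endpoint after the "?" gives the full URL that file_get_contents() fetches.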

Example 6-2 requests the newest photos tagged “kitten” from Flickr, six per page. Flickr uses an API key passed as a URL parameter, which is a different approach from the Authorization header examples that have been demonstrated so far; each API will implement this in a different way. Although the header is a better practice, the developers of Flickr were trailblazers with implementing APIs for users, so there was no best practice when it was built. Since it’s simply a GET request, this example uses file_get_contents() to fetch the carefully crafted URL. The resulting response looks something like this:

<?xml version="1.0" encoding="utf-8" ?>
<methodResponse>
    <params>
        <param>
            <value>
                <string>
<photos page="1" pages="131292" perpage="6" total="787750">
    <photo id="8294579422" owner="9482106@N04" secret="9a3bac5af4" server="8220" farm="9" title="Smokey 2012-12-18" ispublic="1" isfriend="0" isfamily="0" />
    <photo id="8294535628" owner="39066615@N08" secret="90e31d5254" server="8074" farm="9" title="Curious Tommy" ispublic="1" isfriend="0" isfamily="0" />
    <photo id="8293485771" owner="28797694@N04" secret="6650f1db57" server="8213" farm="9" title="Tiny tooth" ispublic="1" isfriend="0" isfamily="0" />
    <photo id="8294535494" owner="39066615@N08" secret="cc6fd4db0c" server="8351" farm="9" title="Tommy" ispublic="1" isfriend="0" isfamily="0" />
    <photo id="8294424628" owner="26742588@N04" secret="b6cd3f3556" server="8224" farm="9" title="White Is The New" ispublic="1" isfriend="0" isfamily="0" />
    <photo id="8294402524" owner="33892219@N06" secret="572968b650" server="8356" farm="9" title="Cat Angel" ispublic="1" isfriend="0" isfamily="0" />
</photos>
                </string>
            </value>
        </param>
    </params>
</methodResponse>

Because the actual data is sent as an escaped XML string, the XML is parsed in PHP, then the string is extracted and parsed as a separate step in order to obtain the real data. Flickr doesn’t supply the actual URL of the image, but gives enough information in the response that the instructions can be followed to assemble the actual URL. SimpleXML is used in this example, first to parse the response, then to parse the data inside it. This library represents child elements as object properties (and each child is a SimpleXMLElement), while attributes are accessed using array notation.
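The two-step parse can be tried without calling Flickr at all. The sketch below uses a cut-down, hand-written response in the same shape as the real one, with the inner document escaped inside the <string> element; the variable names are illustrative only.

```php
<?php
// A hand-written, shortened response; the inner <photos> document
// is escaped inside the <string> element, as in the real response.
$response = '<?xml version="1.0" encoding="utf-8" ?>
<methodResponse><params><param><value><string>'
    . '&lt;photos page="1" perpage="1"&gt;'
    . '&lt;photo id="8294579422" title="Smokey 2012-12-18" /&gt;'
    . '&lt;/photos&gt;'
    . '</string></value></param></params></methodResponse>';

// Step 1: parse the outer document; child elements are object properties.
$outer = new SimpleXMLElement($response);
$inner = (string) $outer->params->param->value->string;

// Step 2: the unescaped string is itself XML, so parse it again.
$photos = new SimpleXMLElement($inner);

// Attributes use array notation; casting gives a plain PHP string.
echo (string) $photos->photo[0]["title"] . "\n";
echo (string) $photos["perpage"] . "\n";
```

Casting to string at each step is a deliberate choice here: SimpleXML hands back objects, and an explicit cast makes it clear where a plain value is expected.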

Here’s Example 6-2 again, processing the data and outputting it with titles and <img> tags:

<?php

require("api-key.php");

$animal = "kitten";
$data = file_get_contents('http://api.flickr.com/services/rest/?'
    . http_build_query(array(
        "method" => "flickr.photos.search",
        "api_key" => $api_key,
        "tags" => $animal,
        "format" => "xmlrpc",
        "per_page" => 6
    ))
);

// first parse the XML-RPC envelope, then the XML string it carries
$simplexml = new SimpleXMLElement($data);
$data_array = $simplexml->params->param->value->children();
$photos = new SimpleXMLElement((string) $data_array->string);

if($photos) {
    foreach($photos->photo as $photo) {
        // escape the title, since it is user-supplied text going into HTML
        echo htmlspecialchars((string) $photo['title']) . "\n";
        echo '<img src="http://farm' . $photo['farm'] . '.staticflickr.com/'
            . $photo['server'] . '/' . $photo['id'] . '_' . $photo['secret']
            . '.jpg" /><br />' . "\n";
    }
}

The main body of the data contains a <photos> tag with multiple <photo> tags inside it—one for each photo. Each <photo> tag has some attributes inside it, so array notation is used to access these, retrieve the title, and build the image tag.

When working with APIs, different data formats are seen in use in a variety of settings. This chapter has shown how to create, work with, and parse XML. XML is more common on older and larger applications, but the data format will depend on the target market of the API, and many providers will offer multiple formats. Flickr, for example, offers the data in both JSON and XML formats, but also offers a serialized PHP format. PHP’s serialized format is very easy to work with and is a great choice for two PHP applications exchanging data; if you were to integrate Flickr into your own PHP application, this would be a good format to choose. When integrating with applications on other technology platforms, XML is a better-supported choice.
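For comparison, this is roughly what PHP's serialized format looks like for a shortened shopping list. It is a minimal sketch of serialize() and unserialize() only; it does not call Flickr, and the $wire and $restored names are illustrative.

```php
<?php
$list = array("eggs", "bread", "milk");

// serialize() turns a PHP value into a string that can be sent or stored.
$wire = serialize($list);

// prints: a:3:{i:0;s:4:"eggs";i:1;s:5:"bread";i:2;s:4:"milk";}
echo $wire . "\n";

// unserialize() restores the value; allowed_classes => false (PHP 7.0+)
// refuses to create objects, a sensible precaution for data from another system.
$restored = unserialize($wire, array("allowed_classes" => false));

var_dump($restored === $list);
```

The receiving side gets a native PHP array back with no parsing code at all, which is what makes the format convenient between two PHP applications.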