December 15 2010

Simple XML Parsing

Now, I may not be a pro at this XML parsing thing, nor at making the XML files themselves, but I do know that XML can be a really, really cool thing if used correctly. I’ve been working the past few months on a website that uses XML to communicate between client and server to grab information off of eBay. It’s a pretty nice little API system that lets you get all sorts of information about auctions and the like.

I’m not going to go into much detail here; I just want to give you a basic introduction and get the ball rolling. As always, this may not be the best way to parse data like this, but it makes it a lot easier for me to use the information from an XML document.

First off, I’ll give you some sample XML:
<result>
<username>testing</username>
<password>foobarbaz</password>
<name>enygma</name>
</result>

Now, I know that’s not a complete XML document (there is no XML declaration at the top, for one thing), but you get the idea. There is a hierarchical relationship in the tags: every opening tag needs a closing tag, and between the two there can be information, either text or more tags. Usually, people pick good tag names that describe the information inside them. My example might be what you get back if you request something from a server (user information, for example). The “result” would contain the username, password and their name. Now, we get to the fun part: the parsing.

First off, you need to understand that the XML functions in PHP need to have four parts – the function to parse the opening tag, the function to parse the closing tag, the function to parse the “guts” of the tags (the information inside them), and the initial function that ties them all together. This last one is not completely necessary, but sure makes it nice to just pass it some XML and have it return the results.

In this example, I’m going to have the XML parser return the values in an array formatted like:

"username" => "testing"
"password" => "foobarbaz"
"name" => "enygma"

This, as I’ve found out, is one of the more useful ways to work with the information once you have it. The function can pass back the array of information, and you can then use it as you see fit. There are going to be times when this just isn’t enough (repeated tags or nested data, for example), but those times are for another tutorial.
Anyway, on with the code!

——————————————————
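<?php
$info = "<result><username>testing</username><password>foobarbaz</password><name>enygma</name></result>";

class parse {
    // values collected from the XML, keyed by tag name
    public $info_array = array();

    // called for every opening tag: remember which tag we are inside
    function startElement($parser, $name, $attrs) {
        global $currentTag;
        $currentTag = $name;
    }

    // called for every closing tag: we are no longer inside it
    function endElement($parser, $tag) {
        global $currentTag;
        $currentTag = "";
    }

    // called for the text inside a tag
    function getInfo($parser, $data) {
        global $currentTag;
        if ($currentTag != "") {
            $this->info_array[$currentTag] = $data;
        }
    }

    function parseXML($data) {
        // make the parser to get the XML
        $xml_parser = xml_parser_create();
        // keep tag names as written (by default they are upper-cased)
        xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, 0);
        // objects are already passed by handle, so no & is needed here
        xml_set_object($xml_parser, $this);
        xml_set_element_handler($xml_parser, "startElement", "endElement");
        xml_set_character_data_handler($xml_parser, "getInfo");
        xml_parse($xml_parser, $data);
        xml_parser_free($xml_parser);

        print_r($this->info_array);
    }
}

$parse = new parse;
$parse->parseXML($info);
?>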
——————————————————

I used a class here just so that it’s easier to keep the XML parsing to itself. This way, you can just include the file, call $parse->parseXML($info) like I do at the bottom, and get the results back. In this example, I take the XML that I gave you up at the top and pass it through the needed functions to give me a nice array of the information as the output.
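If you would rather get the array back than have it printed, a variation along these lines might work. It is only a sketch: the class name SimpleXmlToArray and its property and method names are made up for this example, it keeps the current tag in a property instead of a global, and it hands the parser array($this, 'method') callables instead of calling xml_set_object.

```php
<?php
// Hypothetical variation on the article's class; all names are illustrative only.
class SimpleXmlToArray
{
    private $currentTag = '';
    private $values = array();

    public function startElement($parser, $name, $attrs)
    {
        $this->currentTag = $name;
    }

    public function endElement($parser, $name)
    {
        $this->currentTag = '';
    }

    public function characterData($parser, $data)
    {
        if ($this->currentTag === '') {
            return; // text that sits between elements
        }
        if (!isset($this->values[$this->currentTag])) {
            $this->values[$this->currentTag] = '';
        }
        $this->values[$this->currentTag] .= $data;
    }

    public function parse($xml)
    {
        // start from a clean slate on every call
        $this->currentTag = '';
        $this->values = array();

        $parser = xml_parser_create();
        // keep tag names as written instead of upper-casing them (the default)
        xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
        xml_set_element_handler($parser, array($this, 'startElement'), array($this, 'endElement'));
        xml_set_character_data_handler($parser, array($this, 'characterData'));
        xml_parse($parser, $xml, true);

        // the parser is released when $parser goes out of scope
        return $this->values;
    }
}

$reader = new SimpleXmlToArray();
$user = $reader->parse('<result><username>testing</username><password>foobarbaz</password><name>enygma</name></result>');
echo $user['username'], "\n"; // testing
```

Returning the array leaves it up to the caller whether to print it, store it, or pass it along.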

Let’s step through this so that we can see what these things do. First off, we have the “base function” called parseXML. This is where all the fun gets going. The XML stream (in this case, $info) is passed in as $data to the function. We then create the XML parser with the xml_parser_create() function and keep it in $xml_parser so that we can use it. Now, the next function, xml_set_object, tells the parser that the handler names we give it are methods of our “parse” object. Otherwise, it would go looking for plain functions with those names outside the class, not find them, and complain.

The xml_set_element_handler function is one of the cooler functions in the parser. It lets us define which functions to use for the start element handler (opening tag) and the end element handler (closing tag). This allows us to make custom functions that do certain things based on which end of the XML tag we are on. You can name your two functions anything that you want, just so long as you change the second and third values in this function to match the function names. Those two functions do have very specific variables that they need to be passed, so you always have to define them like this:

function startElement($parser, $name, $attrs)
function endElement($parser, $tag)

If you don’t, not only will PHP complain about it, but it just flat out won’t work.
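To see what actually arrives in those parameters, a small experiment like the one below may help. The <user> element and its id attribute are made up for the illustration, and the handlers are anonymous functions rather than class methods. Note that unless case folding is switched off, the parser hands over tag and attribute names in upper case.

```php
<?php
// Illustrative only: the <user> element and its "id" attribute are invented for this demo.
$seen = array();

$parser = xml_parser_create();
xml_set_element_handler(
    $parser,
    // start handler: receives the parser, the tag name, and an array of attributes
    function ($parser, $name, $attrs) use (&$seen) {
        $seen[] = array('open', $name, $attrs);
    },
    // end handler: receives the parser and the tag name
    function ($parser, $name) use (&$seen) {
        $seen[] = array('close', $name);
    }
);
xml_parse($parser, '<user id="42"><name>enygma</name></user>', true);

print_r($seen);
```

The first entry recorded should be the opening USER tag together with its attributes, array('ID' => '42'); the attribute value is left exactly as written.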

The xml_set_character_data_handler function is where the real meat of it all is, though. This is where you tell the XML parser what function to use for the data inside the XML tags. You can do all sorts of things with this one, but we chose to just add the current tag and its value to an array (available throughout the class through the $this-> in front of it).
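One thing worth knowing about the character data handler: the parser is allowed to deliver the text of a single element in several pieces (an entity such as &amp; is a common trigger), so assigning with = can end up keeping only the last piece. A sketch that appends instead is below; the function name collectText and the sample XML are made up for the example.

```php
<?php
// Hypothetical helper: collects the text of each element, appending every chunk
// the parser delivers rather than overwriting the previous one.
function collectText($xml)
{
    $current = '';
    $values = array();

    $parser = xml_parser_create();
    // keep tag names as written instead of upper-casing them
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$current) { $current = $name; },
        function ($p, $name) use (&$current) { $current = ''; }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$current, &$values) {
            if ($current === '') {
                return; // text outside any element we are tracking
            }
            if (!isset($values[$current])) {
                $values[$current] = '';
            }
            $values[$current] .= $data; // append, never overwrite
        }
    );
    xml_parse($parser, $xml, true);

    return $values;
}

$values = collectText('<result><name>Tom &amp; Jerry</name></result>');
echo $values['name'], "\n"; // Tom & Jerry
```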

Almost done, stick with me! The xml_parse function takes in the $data that we passed to the function and ships it off to the parser ($xml_parser) to get taken care of. It then goes through the tags, calling startElement, getInfo, and endElement (in that order) for each tag. In our example, it adds the $currentTag (a global value) to the array with whatever value is inside that tag. The final function (xml_parser_free) is mainly just a good idea. It frees up the memory that the parser was using and “cleans up” the things we’ve done.
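The class in this article never checks whether the parsing worked. xml_parse returns 0 when the document is not well-formed, and the parser can then be asked what went wrong. A minimal sketch, with a hypothetical function name describeXmlError and deliberately broken sample input:

```php
<?php
// Hypothetical helper: returns null when $xml parses cleanly, or a short
// description of the first error otherwise.
function describeXmlError($xml)
{
    $parser = xml_parser_create();
    $message = null;

    // the third argument tells the parser this is the last (here: only) chunk of data
    if (!xml_parse($parser, $xml, true)) {
        $message = sprintf(
            '%s at line %d',
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)
        );
    }

    return $message;
}

$good = describeXmlError('<result><name>enygma</name></result>');
$bad = describeXmlError('<result><name>enygma</result>'); // <name> is never closed

var_dump($good); // NULL
echo $bad, "\n"; // wording of the message depends on the underlying XML library
```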

Well, I do hope that this has been a help to you in your XML-parsing needs. If you have any further questions, there are lots of other tutorials about that probably get more into specifics than this one did. And, as always, you’re more than welcome to email me at enygma@phpdeveloper.org.
