Cheap Web Hosting for Developers

PHP, MySQL, Java, Unix Cheap Web Hosting

CHAPTER 8 SIMPLE API FOR XML (SAX)

Filed under: PHP and XML — webmaster @ 14:11

Setting the Parser Options

After you have created the parser, you can set the parser options. These options differ from those discussed in Chapter 5, which are used by the DOM and SimpleXML extensions. The xml extension defines only four options that can be used while parsing an XML document. Table 8-1 describes the available options, as well as their default values when not specified for the parser.

Table 8-1. Parser Options

Option
    Description

XML_OPTION_TARGET_ENCODING
    Sets the encoding to use when the parser passes the XML information to the function handlers. The available encodings are US-ASCII, ISO-8859-1, and UTF-8, with the default being either the encoding set when the parser was created or UTF-8 when not specified.

XML_OPTION_SKIP_WHITE
    Skips values that consist entirely of ignorable whitespace. These values will not be passed to your function handlers. The default value is 0, which means pass whitespace to the functions.

XML_OPTION_SKIP_TAGSTART
    Skips a certain number of characters from the beginning of a start tag. The default value is 0 to not skip any characters.

XML_OPTION_CASE_FOLDING
    Determines whether element tag names are passed as all uppercase or left as is. The default value is 1 to use uppercase for all tag names.

The default setting tends to be a bit controversial. XML is case-sensitive, and the default setting is to case fold characters. For
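As a rough sketch of how the options in Table 8-1 are applied, the following reads the default case-folding value, switches folding off, and sets the target encoding before parsing. The handler names (optStartElement, optEndElement), the $seenNames array, and the sample document are made up for this illustration.

```php
<?php
/* Sketch: setting and reading back parser options from Table 8-1.
   Handler names and the sample document are hypothetical. */

/* Element names collected by the start handler. */
$seenNames = array();

/* Start element handler: record the tag name exactly as the parser passes it. */
function optStartElement($parser, $name, $attribs) {
    global $seenNames;
    $seenNames[] = $name;
}

/* End element handler: nothing to do in this example. */
function optEndElement($parser, $name) {
}

$parser = xml_parser_create();

/* Case folding is on by default, so names would arrive in uppercase. */
$defaultFolding = xml_parser_get_option($parser, XML_OPTION_CASE_FOLDING);

/* Turn case folding off so names keep the case used in the document. */
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

/* Pass data to the handlers as ISO-8859-1 instead of the default. */
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, 'ISO-8859-1');
$targetEncoding = xml_parser_get_option($parser, XML_OPTION_TARGET_ENCODING);

xml_set_element_handler($parser, 'optStartElement', 'optEndElement');
$ok = xml_parse($parser, '<Root><childNode/></Root>', true);
xml_parser_free($parser);

print implode(',', $seenNames) . "\n";
```

With case folding left at its default, the same handler would receive the names in uppercase, which is why code that compares tag names against mixed-case strings usually turns the option off first.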

Note: If you are looking for good and high quality web space to host and run your application check Lunarwebhost Cheap Web Hosting services


272 CHAPTER 8 SIMPLE API FOR XML

Filed under: PHP and XML — webmaster @ 04:10

    /* Set parser options */
    xml_parser_set_option($xml_parser, XML_OPTION_CASE_FOLDING, 0);

    /* Register handlers */
    xml_set_element_handler($xml_parser, "startElement", "endElement");
    xml_set_character_data_handler($xml_parser, "chandler");

    /* Parse XML */
    if (!xml_parse($xml_parser, $xml, 1)) {
        /* Gather Error information */
        die(sprintf("XML error: %s at line %d",
            xml_error_string(xml_get_error_code($xml_parser)),
            xml_get_current_line_number($xml_parser)));
    }

    /* Free parser */
    xml_parser_free($xml_parser);
    ?>

To begin examining this extension, you will skip the first step. It is quite difficult to attempt to write event-handling functions without even knowing what the events are and what parameters the functions need. Once the parser has been created and any parse options set, you will return to writing the handler functions. Listing 8-1 may also offer some insight into these functions prior to reaching the Event Handlers section.

The Parser

The parser is the focal point of this extension. Every built-in function for xml, other than the ones creating it and two encoding/decoding functions, requires the parser to be passed as a parameter. The parser, when created, takes the form of a resource within PHP 5, just as in PHP 4. Unlike the domxml extension, the API was left unchanged, leaving the parser as a resource rather than adding an OOP interface. Not only does this mean no coding changes are required when moving from PHP 4 to PHP 5, but the extension also already implements a way to use objects with the parser, which is discussed later in this chapter in the Using Objects and Methods section.

Creating the Parser

You create the parser using the function xml_parser_create(), which takes an optional parameter specifying the output encoding to use. Input encoding is automatically detected using either the encoding specified by the document or a BOM. When neither is detected, UTF-8 encoded input is assumed.
Upon successful creation of the parser, it is returned to the application as a resource; otherwise, this function returns FALSE. For example:

    if ($xml_parser = xml_parser_create()) {
        /* Insert code here */
    }

Upon successfully executing this code, the variable $xml_parser contains the resource that will be used in the rest of the function calls within this extension.
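A minimal sketch of parser creation with and without the optional output encoding follows. The variable names are illustrative only, and the default encoding noted in the comment assumes PHP 5.0.2 or later.

```php
<?php
/* Sketch: creating parsers with and without an explicit output encoding.
   Variable names are hypothetical. */

/* No argument: the parser uses its default output encoding
   (UTF-8 on PHP 5.0.2 and later). */
$defaultParser = xml_parser_create();
if (!$defaultParser) {
    die("Unable to create parser");
}

/* Explicit output encoding for code that expects ISO-8859-1 data. */
$latin1Parser = xml_parser_create('ISO-8859-1');
if (!$latin1Parser) {
    die("Unable to create ISO-8859-1 parser");
}

/* The output encoding can be read back through the target encoding option. */
$defaultEncoding = xml_parser_get_option($defaultParser, XML_OPTION_TARGET_ENCODING);
$latin1Encoding  = xml_parser_get_option($latin1Parser, XML_OPTION_TARGET_ENCODING);

print "default:  $defaultEncoding\n";
print "explicit: $latin1Encoding\n";

xml_parser_free($defaultParser);
xml_parser_free($latin1Parser);
```

Passing the encoding at creation time is one way to keep older code that depends on ISO-8859-1 output working after an upgrade.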


CHAPTER 8 SIMPLE API FOR XML (SAX)

Filed under: PHP and XML — webmaster @ 17:31

1. Define functions to handle events.
2. Create the parser.
3. Set any parser options.
4. Register the handlers (the functions you defined to handle events) with the parser.
5. Begin parsing.
6. Perform error checking.
7. Free the parser.

Listing 8-1 contains a small example of using this extension, following the previous steps. I have used comments in the application to indicate the different steps.

Listing 8-1. Sample Application Using the xml Extension

    <?php
    /* XML data to be parsed. The markup of the sample document did not
       survive in this copy of the listing; this minimal document is a stand-in. */
    $xml = '<greeting lang="en">Hello World</greeting>';

    /* start element handler function */
    function startElement($parser, $name, $attribs) {
        print "<$name";
        foreach ($attribs AS $attName=>$attValue) {
            print " $attName=" . '"' . $attValue . '"';
        }
        print ">";
    }

    /* end element handler function */
    function endElement($parser, $name) {
        print "</$name>";
    }

    /* cdata handler function */
    function chandler($parser, $data) {
        print $data;
    }

    /* Create parser */
    $xml_parser = xml_parser_create();


270 CHAPTER 8 SIMPLE API FOR XML

Filed under: PHP and XML — webmaster @ 06:05

Event-Based/Push Parser

So, what is an event-based, or push, parser? Well, I'm glad you asked that question. An event-based parser interacts with an application when specific events occur during the parsing of the XML document. Such an event may be the start or the end of an element or may be an encounter with a PI within the document. When an event occurs, the parser notifies the application and provides any pertinent information. In other words, the parser pushes the information to the application. The application does not request the data when it needs it; instead, it registers functions with the parser up front for the events it wants to be notified about, and the parser executes those functions as the events occur.

Think of it in terms of a mailing list to which you can subscribe. All you need to do is register with the mailing list, and from then on, every time a new message is received from the list, the message is automatically sent to you. You do not need to keep checking the mailing list to see whether it contains any new messages.

SAX in PHP

The xml extension, which is the SAX handler in PHP, has been the primary XML handler since PHP 3. It has been the most stable extension and thus is widely used when dealing with XML. The expat library, http://expat.sourceforge.net/, initially served as the underlying parser for this extension. With the advent of PHP 5 and its use of the libxml2 library, a compatibility layer was written and made the default option. This means that by default, libxml2 now serves as the XML parsing library for the xml extension in PHP 5 and later, though the extension can also be built with the deprecated expat library. Enabled by default, it can be disabled in the PHP build through the --disable-xml configuration switch. (But then again, if you wanted to do this, you probably would not be reading this chapter!)
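To make the push model described above concrete, here is a small sketch in which the application only registers handlers and the parser calls them as events occur. The handler names (pushStart, pushEnd, pushPI), the $events array, and the sample document are assumptions made for this example.

```php
<?php
/* Sketch: the parser pushes events to handlers registered up front.
   Handler names and the sample document are hypothetical. */

/* Log of events in the order the parser reports them. */
$events = array();

/* Called by the parser at the start of each element. */
function pushStart($parser, $name, $attribs) {
    global $events;
    $events[] = "start:$name";
}

/* Called by the parser at the end of each element. */
function pushEnd($parser, $name) {
    global $events;
    $events[] = "end:$name";
}

/* Called by the parser for each processing instruction (PI). */
function pushPI($parser, $target, $data) {
    global $events;
    $events[] = "pi:$target";
}

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

/* Subscribe to the events of interest; nothing is requested later on. */
xml_set_element_handler($parser, 'pushStart', 'pushEnd');
xml_set_processing_instruction_handler($parser, 'pushPI');

$xml = '<?xml version="1.0"?><doc><?render fast?><item/></doc>';
if (!xml_parse($parser, $xml, true)) {
    die(sprintf("XML error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)));
}
xml_parser_free($parser);

print implode("\n", $events) . "\n";
```

After the call to xml_parse() the application never asks the parser for anything; every entry in $events was delivered by the parser, much like messages arriving from the mailing list.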
You may have reasons for building this with the expat library, such as compatibility problems with your code or application. I will address some of these issues in the section Migrating from PHP 4 to PHP 5. If this is the case, you can use the configure switch --with-libexpat-dir=DIR to build with expat rather than libxml2. This is deprecated and should be used only in cases where things are broken and cannot be resolved using the libxml2 library.

One other change for this extension from PHP 4 to PHP 5 is the default encoding. Originally, the default encoding used for output from this extension was ISO-8859-1. With the change to libxml2, the default encoding has changed in PHP 5.0.2 and later to UTF-8. This is true no matter which library you use to build the extension. If any existing code being upgraded to PHP 5 happens to require ISO-8859-1 as the default encoding, this is quickly and easily resolved, as you will see in the next section. Other than the potential migration issues, this chapter exclusively deals with the xml extension built using libxml2.

Using the xml Extension

Working with the xml extension is easy and straightforward. Once you have set up the parser and parsing begins, all your code is automatically executed. You do not need to do anything until the parsing has finished. The steps to use this extension are as follows:


Simple API for XML (SAX) CHAPTER 8

Filed under: PHP and XML — webmaster @ 16:00

CHAPTER 8
Simple API for XML (SAX)

The extensions covered up until now have dealt with XML in a hierarchical structure residing in memory. They are tree-based parsers that allow you to move throughout the tree as well as modify the XML document. This chapter will introduce you to stream-based parsers and, in particular, the Simple API for XML (SAX). Through examples and a look at the changes in this extension from PHP 4 to PHP 5, you will be well equipped to write or possibly fix code using SAX.

Introducing SAX

In general terms, SAX is a stream-based parser. Chunks of data are streamed through the parser and processed. As the parser needs more data, it releases the current chunk of data and grabs more chunks, which are then also processed. This continues until either there is no more data to process or the process itself is stopped before reaching the end of the data. Unlike tree parsers, stream-based parsers interact with an application during parsing and do not persist the information in the XML document. Once the parsing is done, the XML processing is done. This differs greatly from the SimpleXML and DOM extensions; in those cases, the parsing builds an in-memory tree; then, once done, interaction with the tree begins, and the application can manipulate the XML.

Background

SAX is just one of the stream-based parsers in PHP 5. What sets it apart from the other stream-based parsers is that it is an event-based, or push, parser. Originally developed in 1998 for use under Java, SAX is not based on any formal specification like the DOM extension is, although many DOM parsers are built using SAX. The goal of SAX was to provide a simple way to process XML utilizing the least amount of system resources. Its simplicity of use and its lightweight nature made this parser extremely popular early on and were among the driving factors behind its implementation, in one form or another, in other programming languages.
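The chunk-by-chunk processing described above can be sketched with the xml extension by calling xml_parse() once per chunk and flagging only the last call as final. The chunk contents, handler names, and the $items array are invented for this illustration; in practice the chunks would more likely come from fread() on a file or socket.

```php
<?php
/* Sketch: streaming a document through the parser in chunks.
   Chunks, handler names, and $items are hypothetical. */

$items = array();      /* text of each <item>, assembled piece by piece */
$insideItem = false;   /* true while the parser is inside an <item> */

function chunkStart($parser, $name, $attribs) {
    global $items, $insideItem;
    if ($name === 'item') {
        $insideItem = true;
        $items[] = '';
    }
}

function chunkEnd($parser, $name) {
    global $insideItem;
    if ($name === 'item') {
        $insideItem = false;
    }
}

/* Character data may arrive in several calls, so append rather than assign. */
function chunkText($parser, $data) {
    global $items, $insideItem;
    if ($insideItem) {
        $items[count($items) - 1] .= $data;
    }
}

$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($parser, 'chunkStart', 'chunkEnd');
xml_set_character_data_handler($parser, 'chunkText');

/* The document is deliberately split in the middle of text content. */
$chunks = array('<list><item>al', 'pha</item><item>be', 'ta</item></list>');
$last = count($chunks) - 1;
foreach ($chunks as $i => $chunk) {
    /* The third argument tells the parser whether this is the final chunk. */
    if (!xml_parse($parser, $chunk, $i === $last)) {
        die(sprintf("XML error: %s at line %d",
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser)));
    }
}
xml_parser_free($parser);

print implode(',', $items) . "\n";
```

Only the current chunk and whatever the handlers choose to keep are held by the application, which is the resource advantage over building a full tree.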


268 CHAPTER 7 SIMPLEXML application, so it

Filed under: PHP and XML — webmaster @ 05:29

application, so it assumes all incoming arrays contain locations and values for the PAD template. The setValue() function is recursive. As long as the value of the array is another array, the function calls itself with the $sxe variable pointing to the field name passed into the function, the new field name, and the new field value. Once the incoming value is no longer an array, it is set as the value of that field on the $sxe object passed into the function. The value is also encoded using htmlentities() to ensure the data will be properly escaped. For instance, a value containing the & character needs it converted to its entity format, &amp;.

The last use of SimpleXML worth mentioning in this application is within the validatePAD() function. PAD contains a RegEx field within each Field node of the specification. This field defines the regular expression the data needs to conform to in order to be considered valid. The same technique is used to loop through the specification file to find the RegEx node and the Path node, as you have seen in other functions in this application. The correct element is also navigated to within the template using similar techniques. Once you've gathered all the information, you can test the regular expression against the value of the $sxe element from the working template.

This example illustrated how you can use XML and SimpleXML to generate an application including its UI, data storage, and validation rules using a real-world case. If you are a current shareware author, you may already be familiar with the PAD format. Using techniques within this application, you should have no problems writing your own application to generate your PAD files. In any case, this example has shown that even though SimpleXML has a simple API and certain limitations, you can use it for some complex applications, even when you don't know the document structure.
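The validation approach described above can be tried in isolation with a sketch like the following. The two documents are cut-down, hypothetical stand-ins rather than real PAD data, and findInvalidFields() is a name invented for this example; it follows the same pattern as validatePAD() but returns only the titles of the failing fields.

```php
<?php
/* Sketch: checking template values against RegEx fields from a specification.
   Both documents and the function name are hypothetical stand-ins. */

$spec = new SimpleXMLElement(
    '<Spec><Fields>'
    . '<Field><Title>Company Name</Title>'
    . '<Path>XML_DIZ_INFO/Company_Info/Company_Name</Path>'
    . '<RegEx>^.{2,40}$</RegEx></Field>'
    . '<Field><Title>Zip Code</Title>'
    . '<Path>XML_DIZ_INFO/Company_Info/Zip</Path>'
    . '<RegEx>^[0-9]{5}$</RegEx></Field>'
    . '</Fields></Spec>'
);

$template = new SimpleXMLElement(
    '<XML_DIZ_INFO><Company_Info>'
    . '<Company_Name>Example Software</Company_Name>'
    . '<Zip>12x45</Zip>'
    . '</Company_Info></XML_DIZ_INFO>'
);

/* Return the titles of all fields whose stored value fails its RegEx. */
function findInvalidFields($spec, $template) {
    $failed = array();
    foreach ($spec->Fields->Field as $field) {
        /* Split the path and drop the document element. */
        $path = explode('/', trim((string)$field->Path));
        array_shift($path);

        /* Walk down the template one child element at a time. */
        $node = $template;
        foreach ($path as $name) {
            $node = $node->$name;
            if (! $node) {
                break;
            }
        }

        $regex = '/' . trim((string)$field->RegEx) . '/';
        if ($node && ! preg_match($regex, (string)$node)) {
            $failed[] = (string)$field->Title;
        }
    }
    return $failed;
}

$failed = findInvalidFields($spec, $template);
print implode(', ', $failed) . "\n";

/* Escaping as done in setValue(): & becomes its entity form. */
$escaped = htmlentities('Research & Development');
print $escaped . "\n";
```

Because the paths come from the specification, the function never hard-codes an element name from the template, which is the point the text makes about not needing to know the document structure.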
Conclusion

The SimpleXML extension provides easy access to XML documents using a tree-based structure. The ease of use also results in certain limitations. As you have seen, elements cannot be created; only elements, attributes, and their content are accessible, and only limited information about a node is available. This chapter covered the SimpleXML extension by demonstrating its ease of use as well as its limitations. The chapter also discussed methods of dealing with these limitations, such as using the interoperability with the DOM extension and in certain cases with built-in PHP object functions. The material presented here provides an in-depth explanation of SimpleXML and its functionality; the examples should provide you with enough information to begin using SimpleXML in your everyday coding.

The next chapter will introduce how to parse streamed XML data using the XMLReader extension. Processing XML data using streams is different from what you have dealt with to this point because unlike the tree parsers, DOM and SimpleXML, only portions of the document live in memory at a time.


CHAPTER 7 SIMPLEXML 267 called on again

Filed under: PHP and XML — webmaster @ 19:29

called on again to load the empty template created by the DOM extension. This is performed only once when the application begins because the template is then passed in $_POST['ptemplate']. Being XML data, it is Base64-encoded within the form and Base64-decoded before being used.

The function printDisplay() takes three parameters. The first is the SimpleXMLElement containing the specification file. The second is the SimpleXMLElement containing the working template. The last parameter is a Boolean used for state. When in a preview state, the system generates display data only; otherwise, it displays editable fields.

Because PAD is a standardized format, the application loops through the ->Fields->Field elements assuming they always exist. The Field element contains all the information for each node in the template document, including its location in the tree, which is stored in the Path child element. The Path, taking the form of a string such as XML_DIZ_INFO/Company_Info/Company_Name, is split into an array based on the / character, and the first element is removed. You do not need this element because it is the document element, which is already represented by the SimpleXMLElement holding the working template. The first remaining element breaks the display output into sections on the screen, skipping all fields that contain the node MASTER_PAD_VERSION_INFO. The information for this node and its children is already provided within the template file.

The application then generates the appropriate input tags or displays content based on the state of the application. When input fields are generated, the name of the field corresponds to the location of the element within the document. For example, if you used XML_DIZ_INFO/Company_Info/Company_Name as the Path, the name within the form would be Company_Info[Company_Name]. Values for the fields are pulled from the getStoredValue() function. This is where it gets interesting with SimpleXML usage.
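The mapping from a Path value to a form field name, and the Base64 round trip used for the hidden field, can be sketched as follows. The helper pathToFieldName() and the sample XML string are hypothetical and are not taken from the application's listing.

```php
<?php
/* Sketch: turning a specification Path into a form field name, plus the
   Base64 round trip for the hidden template field.
   pathToFieldName() and the sample XML are hypothetical. */

function pathToFieldName($path) {
    /* Split on "/" and drop the document element. */
    $parts = explode('/', trim($path));
    array_shift($parts);

    /* The first remaining part is the base name; the rest become array keys. */
    $name = array_shift($parts);
    foreach ($parts as $part) {
        $name .= '[' . $part . ']';
    }
    return $name;
}

$fieldName = pathToFieldName('XML_DIZ_INFO/Company_Info/Company_Name');
print $fieldName . "\n";

/* XML contains characters that are awkward inside an HTML attribute, so the
   working template travels through the form Base64-encoded. */
$templateXml = '<XML_DIZ_INFO><Company_Info><Company_Name/></Company_Info></XML_DIZ_INFO>';
$hiddenValue = base64_encode($templateXml);
$restoredXml = base64_decode($hiddenValue);
```

When a form with such names is submitted, PHP rebuilds nested arrays in $_POST, for example $_POST['Company_Info']['Company_Name'], which is what lets the application tell PAD fields apart from the other POST variables.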
The array containing the elements of the path is iterated. Each time, the variable $sxe, which originally contained the working template, is changed to be the child element of its current element using the $value variable, which is the name of the subnode. Examining a path from the specification file, such as XML_DIZ_INFO/Company_Info/Company_Name, the corresponding array, after removing the first element, would be array('Company_Info', 'Company_Name'). This corresponds to the following XML fragment:

    <Company_Info>
        <Company_Name />
    </Company_Info>

Iterating through the array and setting $sxe each time is the equivalent of manually coding this:

    $sxe = $sxe->Company_Info;
    $sxe = $sxe->Company_Name;

You can navigate to the correct node using the information from the specification file without needing to know the document structure of the template file. Once iteration of the foreach is finished, the variable $sxe is cast to a string, which is the text content of the node the application is looking for, and is then returned to the application.

When the data is submitted from the UI to the application, the function setValue() is called. As you probably recall, the names of the input fields indicate arrays, such as Company_Info[Company_Name]. No other named fields that are arrays are used in the


266 CHAPTER 7 SIMPLEXML /* Initial entry

Filed under: PHP and XML — webmaster @ 07:03

            /* Initial entry point so load the PAD template created from DOM */
            $sxetemplate = simplexml_load_file($padtemplate);
        }

        /* If in working state display the working template for editing or preview */
        if (! $bSave) {
            print '';
            /* Base64-encoded working template to allow XML to be passed in hidden field */
            print '';
            if (!$bError && isset($_POST['Preview'])) {
                /* Working template is valid and in preview mode.
                   Allow additional editing or final Save */
                print '     ' . '';
                print '     ' . '';
            }
            print '';
        } else {
            /* Final PAD file has been saved - Just print message */
            print "PAD File Saved as $savefile";
        }
    } else {
        /* Application unable to retrieve the specification file - Error */
        print "Unable to load PAD Specification File";
    }
    ?>

The important areas to look at within this application are the user variables and the defined functions. The remainder of the application just pieces it all together. You must set three user variables. The default values will work just as well, but you can change them with respect to your current setup. These are the three user variables:

$padspec: Location of PAD specification file. By default it pulls from http://www.padspec.org, but you can have it reside locally; in that case, modify the value to point to your local copy.
$padtemplate: Location of the PAD template generated by the DOM extension in Chapter 6.
$savefile: Location to save the final generated PAD file to when done.

The specification file is used in every step of the process, so the first thing the application does is have SimpleXML load it. Initially, none of the POST variables is set, and SimpleXML is


CHAPTER 7 SIMPLEXML 265 } } }

Filed under: PHP and XML — webmaster @ 16:14

                    }
                }
            }
        }
        /* Return array containing any captured errors */
        return $arRet;
    }

    /* Initial states for application variables */
    $sxetemplate = NULL;
    $bPreview = FALSE;
    $bError = FALSE;
    $bSave = FALSE;

    /* BEGIN ACTUAL PROCESSING */
    if ($sxe = simplexml_load_file($padspec)) {
        if (isset($_POST['Save']) || isset($_POST['Preview']) || isset($_POST['Edit'])) {
            /* Working template in hidden field is Base64 encoded and must be decoded */
            $sxetemplate = new SimpleXMLElement(base64_decode($_POST['ptemplate']));

            /* Loop through $_POST vars. vars that are arrays are PAD fields to be set */
            foreach ($_POST AS $name=>$value) {
                if (is_array($value)) {
                    setValue($sxetemplate, $name, $value);
                }
            }

            if (isset($_POST['Save'])) {
                /* Save finalized working template to file */
                $sxetemplate->asXML($savefile);
                $bSave = TRUE;
            } elseif (isset($_POST['Preview'])) {
                /* Validate the working template */
                $arRet = validatePAD($sxe, $sxetemplate);
                if (count($arRet) > 0) {
                    $bError = TRUE;
                    print "ERRORS FOUND<br />";
                    /* Print out errors returned from validatePAD() */
                    foreach ($arRet AS $key=>$value) {
                        print $value[0] . ": " . $value[1] . "<br />";
                    }
                } else {
                    /* Working template was validated so allow data to be previewed */
                    $bPreview = TRUE;
                }
            }
        } else {


264 CHAPTER 7 SIMPLEXML /* Retrieve text

Filed under: PHP and XML — webmaster @ 05:19

    /* Retrieve text content for node from working template */
    function getStoredValue($sxe, $arPath) {
        if ($sxe) {
            /* Loop through node path to find SimpleXML element from working template */
            foreach ($arPath AS $key=>$value) {
                $sxe = $sxe->$value;
            }
            return (string)$sxe;
        }
        return "";
    }

    /* Set the text content for a node from working template */
    function setValue($sxe, $field, $value) {
        if (is_array($value)) {
            /* Loop through node path to find SimpleXML element from working template */
            foreach ($value AS $fieldname=>$fieldvalue) {
                setValue($sxe->$field, $fieldname, $fieldvalue);
            }
        } else {
            /* Encode the value to ensure content will be valid XML */
            $sxe->$field = htmlentities($value);
        }
    }

    /* Validate fields in working template using the RegEx defined in specification */
    function validatePAD($spec, $template) {
        $arRet = array();
        foreach ($spec->Fields->Field as $field) {
            $arPath = explode("/", trim($field->Path));
            array_shift($arPath);
            if ($arPath[0] != "MASTER_PAD_VERSION_INFO") {
                $sxe = $template;
                $regex = "/" . trim($field->RegEx) . "/";
                foreach ($arPath AS $key=>$value) {
                    $sxe = $sxe->$value;
                    if (! $sxe) {
                        break;
                    }
                }
                if ($sxe) {
                    $value = (string)$sxe;
                    if (! preg_match($regex, $value)) {
                        /* Capture fields failing validation for later display */
                        $arRet[] = array($field->Title, $field->RegExDocumentation);
