Professional PHP Programming, part 5

<element attr1="value1" attr2="value2" ></element>

However, as we noted above, there is also an alternative syntax, whereby we place the closing slash at
the end of the opening element:

<element attr1="value1" attr2="value2" />

The following line defines an empty element image, with an attribute src with the value
logo.gif:

<image src="logo.gif" />
Processing Instructions
XML processing instructions contain information for the application using the XML document.
Processing instructions do not form part of the character data of the document; the XML
parser should pass these instructions unchanged to the application.

The syntax of the processing instruction might be strangely familiar to you:

<?TargetApp instructions?>

In the following example, php is the target application and print "This XML document was
created on Jan-07, 1999"; is the instruction:

<?php print "This XML document was created on Jan-07, 1999"; ?>
Entity References
Entities are used in the document as a way of avoiding typing long pieces of text many times in a
document. Entities are declared in the document's DTD (we will see later how to declare entities,
when we look at DTDs in more detail). The declared entities can be referenced throughout the
document. When the document is parsed by an XML parser, it replaces the entity reference with the
text defined in the entity declaration.


There are two types of entities – internal and external. The replacement text for an internal entity is
specified in an entity declaration, whereas the replacement text for an external entity resides in a
separate file, the location of which is specified in the entity declaration.

After the entity has been declared, it can be referenced within the document using the following
syntax:

&nameofentity;

Note that there should be no space between the ampersand (&), the entity name and the semicolon.

For example, let's assume that an entity myname with the value "Harish Rawat" has been
declared in the DTD of the document. The entity myname can be referred to in the document as:

<author>&myname;</author>

While parsing the document, the parser will replace &myname; with Harish Rawat, so the
application using the XML document will see the content of the element author as Harish Rawat.
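
As a rough illustration of the replacement described above, the sketch below declares myname in an internal DTD subset and reads the author element back. It uses PHP's DOM extension with the LIBXML_NOENT flag rather than the expat-based functions covered later in this chapter; that choice, and the variable names, are assumptions made for the example.

```php
<?php
// Illustrative document: the entity myname is declared in the internal DTD subset.
$xml = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE author [
  <!ENTITY myname "Harish Rawat">
]>
<author>&myname;</author>
XML;

$doc = new DOMDocument();
// LIBXML_NOENT asks the underlying parser to substitute entity references.
$doc->loadXML($xml, LIBXML_NOENT);

// The application sees the replacement text, not the reference itself.
$authorText = $doc->documentElement->textContent;
echo $authorText, "\n";
```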
Comments
Comments can be added in XML documents; the syntax is identical to that for HTML comments:

<!-- This is a comment -->
The Document Type Definition
The document type definition of an XML document is defined within a declaration known as the
document type declaration. The DTD can be contained within this declaration, or the declaration can
point to an external document containing the DTD. The DTD consists of element type declarations,
attribute list declarations, entity declarations, and notation declarations. We will cover all of these in
this section.
Be sure to distinguish between the document type definition, or DTD, and the
document type declaration.
The syntax for a document type declaration containing an internal DTD is:

<!DOCTYPE rootelementname [
...
]>

The rootelementname is the name of the root element of the document. The declarations for the
various elements, attributes, etc., are placed within the square brackets.

An XML document can also have an external DTD, which can be referenced with the following
syntax:

<!DOCTYPE rootelementname SYSTEM "

The rootelementname is the name of the root element of the document. The location of the file
containing the DTD is
Element Type Declarations
The element type declaration indicates whether the element contains other elements, text, or is empty.
It also specifies whether the elements are mandatory or optional, and how many times the elements
can appear.

An element type declaration, specifying that an element can contain character data, looks as follows:

<!ELEMENT elementname (#PCDATA)>

Here ELEMENT is a keyword, elementname is the name of the element, and #PCDATA is also a
keyword. #PCDATA stands for "parsed character data", that is, the data that can be handled by the
XML parser.

For example, the following element declaration specifies that the element title contains character
data:

<!ELEMENT title (#PCDATA)>

The syntax of an element type declaration for an empty element is:

<!ELEMENT elementname EMPTY>


Here elementname is the name of the element, and EMPTY is a keyword.

For example, the following element type declaration specifies that element image is empty:

<!ELEMENT image EMPTY>

The syntax of an element type declaration for an element that can contain anything (other elements
or parsed character data) is as follows:

<!ELEMENT elementname ANY>

Here elementname is the name of the element and ANY is a keyword.

An element type declaration for an element that contains only other elements looks like this:

<!ELEMENT parentelement (childelement1, childelement2, ...)>

Here the element parentelement contains the child elements childelement1,
childelement2, etc.

For example, the following element type declaration specifies that the element book contains the
elements title, authors, isbn, price:

<!ELEMENT book (title, authors, isbn, price)>

The syntax of an element type declaration specifying that parentelement contains either
childelement1 or childelement2, and so on, is:

<!ELEMENT parentelement (childelement1 | childelement2 | ...)>


For example, the following element type declaration specifies that element url can contain either
httpurl or ftpurl:

<!ELEMENT url (httpurl | ftpurl)>


The following operators can be used in the element type declaration, to specify the number of allowed
instances of elements within the parent element:

Operator  Description
*         Zero or more instances of the element are allowed.
+         One or more instances of the element are allowed.
?         The element is optional (zero or one instance).

The following element type declaration specifies that the element authors contains zero or more
instances of the element author:

<!ELEMENT authors (author*)>

The following element type declaration specifies that element authors contains one or more
instances of element author:

<!ELEMENT authors (author+)>

The following element type declaration specifies that the element toc contains the element
chapters and optionally can contain element appendixes:

<!ELEMENT toc (chapters, appendixes?)>
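
To see the occurrence operators at work, the following minimal sketch checks two small documents against the (author+) declaration shown above. It relies on PHP's DOM extension and its validate() method, which is an assumption of this example and not part of the expat-based API discussed later; the helper name isValidAgainstDtd is hypothetical.

```php
<?php
// Hypothetical helper: true when $xml is valid against its own internal DTD.
function isValidAgainstDtd(string $xml): bool {
    libxml_use_internal_errors(true);   // collect errors instead of printing warnings
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        return false;                   // not even well-formed
    }
    return $doc->validate();
}

$dtd = <<<'DTD'
<!DOCTYPE authors [
  <!ELEMENT authors (author+)>
  <!ELEMENT author (#PCDATA)>
]>
DTD;

$oneAuthor = $dtd . "<authors><author>Harish Rawat</author></authors>";
$noAuthor  = $dtd . "<authors></authors>";

var_dump(isValidAgainstDtd($oneAuthor)); // author+ is satisfied
var_dump(isValidAgainstDtd($noAuthor));  // author+ needs at least one author
```

Changing the declaration to (author*) would make the second document acceptable as well, since zero instances are then allowed.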
Attribute List Declarations
We saw earlier that an element can have attributes associated with it. The attribute list declaration
specifies the attributes which specific elements can take. It also indicates whether the attributes are
mandatory or not, the possible values for the attributes, default values etc.


The syntax of the attribute list declaration is:

<!ATTLIST elementname
attrname1 datatype1 flag1
attrname2 datatype2 flag2
...
>

Here elementname is the name of the element, attrname1 is the name of an attribute,
datatype1 specifies the type of information to be passed with the attribute and flag1 indicates
how the default values for the attribute are to be handled.

The possible values for the datatype field depend on the type of the attribute.

Possible values for the flags field are:

Flag       Description
#REQUIRED  The attribute must be present in every instance of the element. If it
           is missing from any instance, the document is not a valid document.
#IMPLIED   The attribute is optional. If it is not specified in an element, the
           application can assume a default value for it.
#FIXED     The attribute can have only one value for all instances of the
           element in the document.
CDATA Attributes
CDATA attributes can have any character data as their value.

The following attribute list declaration specifies that instances of the element price must have an
attribute currency whose value can be any character data:


<!ATTLIST price currency CDATA #REQUIRED>
Enumerated Attributes
Enumerated attributes can take one of the list of values provided in the declaration.

The following attribute list declaration specifies that instances of the element author can have an
attribute gender, with a value of either "male" or "female":

<!ATTLIST author gender (male|female) #IMPLIED>
ID and IDREF Attributes
Attributes of type ID must have a unique value in an XML document. These attributes are used to
uniquely identify instances of elements in the document.

The following attribute list declaration specifies that instances of element employee, must have an
attribute employeeid, and the value of it should be unique in the XML document:

<!ATTLIST employee employeeid ID #REQUIRED>

The value of attributes of type IDREF must match the value of an ID attribute on some element in the
XML document. Similarly, the values of attributes of type IDREFS must contain whitespace-
delimited ID values in the document. Attributes of type IDREF and IDREFS are used to establish
links between elements in the document.

The following attribute list declaration is used to establish a link between an employee and his or her
manager and subordinates.

<!ATTLIST employee
employeeid ID #REQUIRED
managerid IDREF #IMPLIED
subordinatesid IDREFS #IMPLIED>
Entity Attributes
Entity attributes provide a mechanism for referring to non-XML (binary) data from an XML
document. The value of an entity attribute must match the name of an external entity declaration
referring to non-XML data.


The following attribute list declaration specifies that the element book, can have an entity attribute
logo.

<!ATTLIST book logo ENTITY #IMPLIED>
Notation Declarations
Sometimes elements in XML documents might refer to an external file containing data in a format that
an XML parser cannot read. Suppose we have an XML document containing the details of a book. We
may want to put a reference to a GIF image of the cover along with the details of the book. The XML
parser would not be able to process this data, so we need a mechanism to identify a helper application
which will process this non-XML data. Notation declarations allow the XML parser to identify helper
applications, which can be used to process non-XML data.

A notation declaration provides a name and an external identifier for a type of non-XML (unparsed)
data. The external identifier for the notation allows the XML application to locate a helper application
capable of processing data in the given notation.

For example, the following notation declaration specifies "file:///usr/bin/netscape" as the
helper application for non-XML data of type "gif":

<!NOTATION gif SYSTEM "file:///usr/bin/netscape">
Entity Declarations
Entity declarations define entities which are used within the XML document. Whenever the XML
parser encounters an entity reference in the XML document, it replaces it with the contents of the
entity as defined in the entity declaration.

Internal entity declarations are in the following format:

<!ENTITY myname "Harish Rawat">

This entity declaration defines an entity myname, with the value "Harish Rawat".


The following is an example of an external entity declaration, referring to a file containing XML data:

<!ENTITY description1 SYSTEM "description1.xml">

This entity declaration defines an entity named description1, with "description1.xml" as
the system identifier. A "system identifier" is the location of the file containing the data.

When declaring an external entity, a public identifier for the entity can also be specified. On
encountering the external entity reference, the XML parser first tries to resolve the reference using
the public identifier, and only if that fails does it try the system identifier.

In this example, the entity description1 is declared with the public identifier
" and the system identifier
"description1.xml":

<!ENTITY description1 SYSTEM "description1.xml"
PUBLIC "

If the file contains non-XML data, the syntax will be:

<!ENTITY booklogo SYSTEM "booklogo.gif" NDATA gif>

This entity declaration defines an entity booklogo, which refers to an external non-XML file
booklogo.gif, of notation gif. The notation declaration for gif should appear earlier in the DTD.
XML Support in PHP
PHP supports a set of functions that can be used for writing PHP-based XML applications. These
functions can be used for parsing well-formed XML documents. The XML parser in PHP is a streams-
based parser. Before parsing the document, different handlers (or callback functions) are registered
with the parser. The XML document is fed to the parser in sections, and as the parser parses the
document and recognizes different nodes, it calls the appropriate registered handler. Note that the
XML parser does not check for the validity of the XML document. It won't generate any errors or
warnings if the document is well-formed but not valid.
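
The difference between well-formed and valid can be sketched as follows. The helper name parsesCleanly and the two sample documents are made up for this illustration; the first document breaks its own DTD but is well-formed, and the second is not well-formed at all.

```php
<?php
// Hypothetical helper: feeds a whole document to the parser in one call and
// reports whether parsing succeeded.
function parsesCleanly(string $xml): bool {
    $parser = xml_parser_create();
    $ok = xml_parse($parser, $xml, true) === 1;
    unset($parser);
    return $ok;
}

// Well-formed, but not valid: the DTD requires at least one author element.
$invalid = '<!DOCTYPE authors [<!ELEMENT authors (author+)>]><authors></authors>';
// Not well-formed: the closing tag does not match the opening tag.
$broken = '<authors><author>Harish Rawat</authors>';

var_dump(parsesCleanly($invalid)); // parsed without complaint
var_dump(parsesCleanly($broken));  // rejected by the parser
```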


The PHP XML extension supports the Unicode character set through different character encodings. There
are two types of character encoding: source encoding and target encoding. Source encoding is
performed when the XML document is parsed. The default source encoding used by PHP is ISO-8859-1.
Target encoding is carried out when PHP passes data to registered handler functions. Target
encoding affects character data as well as tag names and processing instruction targets.

If the XML parser encounters characters outside the range that its source encoding is capable of
representing, it will return an error. If PHP encounters characters in the parsed XML document that
cannot be represented in the chosen target encoding, such characters will be replaced by a question
mark.
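
The interplay of the two encodings might look like the sketch below, which parses a UTF-8 document and asks for ISO-8859-1 data in the handler. The handler name collectData and the sample word are assumptions for the example; the option constant XML_OPTION_TARGET_ENCODING is set through xml_parser_set_option().

```php
<?php
$collected = "";

// Character data handler that appends whatever the parser passes in.
function collectData($parser, $data) {
    global $collected;
    $collected .= $data;
}

// Source encoding: the document bytes are UTF-8.
$parser = xml_parser_create("UTF-8");
// Target encoding: handlers receive ISO-8859-1 strings.
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, "ISO-8859-1");
xml_set_character_data_handler($parser, "collectData");

// "\xC3\xA9" is the two-byte UTF-8 form of the letter e with an acute accent.
xml_parse($parser, "<word>caf\xC3\xA9</word>", true);

// In ISO-8859-1 the same letter is the single byte 0xE9.
echo bin2hex($collected), "\n";
```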

XML support for PHP is implemented using the expat library. Expat is a library written in C, for
parsing XML documents. More information about expat can be found at
page.

Note that XML support is not available in PHP by default. We discuss installing PHP with XML
support in Chapter 2.

The PHP XML API
A PHP script which parses an XML document must perform the following operations:

1. Create an XML parser.
2. Register handler functions (callback functions) with the parser. The parser will call these
registered handlers as and when it recognizes different nodes in the XML document. Most of
the application logic is implemented in these handler functions.
3. Read the data from the XML file, and pass the data to the parser. This is where the actual
parsing of the data occurs.
4. Free the parser, after the complete file has been parsed.

We will have a quick look at what this means in practice by showing a very simple XML parser (in
fact, just about the simplest possible!), before going on to look at the individual functions in turn.

<?php
// First we define the handler functions to inform the parser what action to
// take on encountering a specific type of node.
// We'll just print out element opening and closing tags and character data.

// The handler for element opening tags
function startElementHandler($parser, $name, $attribs) {
    echo("&lt;$name&gt;<BR>");
}

// The handler for element closing tags
function endElementHandler($parser, $name) {
    echo("&lt;/$name&gt;<BR>");
}

// The handler for character data
function cdataHandler($parser, $data) {
    echo("$data<BR>");
}

// Now we create the parser
$parser = xml_parser_create();

// Register the start and end element handlers
xml_set_element_handler($parser, "startElementHandler", "endElementHandler");

// Register the character data handler
xml_set_character_data_handler($parser, "cdataHandler");

// Open the XML file
$file = "test.xml";
if (!($fp = fopen($file, "r"))) {
    die("could not open $file for reading");
}

// Read chunks of 4K from the file, and pass them to the parser
while ($data = fread($fp, 4096)) {
    if (!xml_parse($parser, $data, feof($fp))) {
        // Report what went wrong and where
        die(sprintf("XML error: %s at line %d, column %d",
            xml_error_string(xml_get_error_code($parser)),
            xml_get_current_line_number($parser),
            xml_get_current_column_number($parser)));
    }
}

// Close the file and free the parser once the complete file has been parsed
fclose($fp);
xml_parser_free($parser);
?>

If we run this script against the following XML file:

<?xml version="1.0"?>
<books>
<book>Pro PHP</book>
<book>XML in IE5</book>
<book>Pro XML</book>
</books>

This will produce output in the browser listing each opening tag, each piece of character data and
each closing tag on a line of its own.

[Figure: browser screenshot of the script's output]

Now we'll go on to discuss the functions in detail. In the following sections, all the XML-related
functions will be described, along with examples of their use.
Creating an XML Parser
The function xml_parser_create() creates an XML parser context.

int xml_parser_create(string [encoding_parameter]);

Parameter           Optional  Description                              Default
encoding_parameter  Yes       The character source encoding that will  "ISO-8859-1"
                              be used by the parser. Once set, the
                              source encoding cannot be changed later.
                              The possible values are "ISO-8859-1",
                              "US-ASCII" and "UTF-8".

The function returns a handle (a positive integer value) on success, and false on error. The handle
returned by xml_parser_create() will be passed as an argument to all the function calls which
register handler functions with the parser, or change the options of the parser. We will see these
function calls shortly.

We can define multiple parsers in a single PHP script. You might want to do this if you are parsing
more than one XML document in the script.
Registering Handler Functions
Before we can parse an XML document, we need to write functions which will handle the various
nodes of the XML document. For example, we need to write a function which will handle the opening
tag of XML elements, and another which will handle the closing tags. We also need to assign handlers
for character data, processing instructions, etc. These handlers must be registered with the XML
parser before the document can be parsed.
Registering Element Handlers
The function xml_set_element_handler() registers "start" and "end" element handler
functions with the XML parser. Its syntax is:

int xml_set_element_handler(int parser, string startElementHandler,
string endElementHandler);

Parameter            Optional  Description
parser               No        The handle of the XML parser with which the start and
                               end element handlers are registered.
startElementHandler  No        The name of the start element handler function. If null
                               is specified then no start element handler is registered.
endElementHandler    No        The name of the end element handler function. If null
                               is specified then no end element handler is registered.

The function returns true on success, or false if the call fails. The function call will return false
if parser is not a valid parser handle.

The registered handler functions startElementHandler and endElementHandler should exist
when an XML document is parsed; if they do not, a warning will be generated.
Start Element Handler
The user-defined start element handler function, registered with the parser through an
xml_set_element_handler() function call, will be called when the parser encounters the
opening tag of an element in the document. The function must be defined with the following syntax:

startElementHandler(int parser, string name, string attribs[]);

Parameter  Optional  Description
parser     No        Reference to the XML parser which is calling this function
name       No        The name of the element
attribs[]  No        An associative array containing the attributes of the element

For example, suppose we are parsing the following line of an XML document:

<author gender="male" age="24">Harish Rawat</author>

The XML parser will call our registered start element handler function with the following parameters:

startElementHandler($parser, "author", array("gender"=>"male", "age"=>"24"));
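
The call above can be observed with a small sketch like the following. The handler names recordStart and ignoreEnd and the $seen array are hypothetical. One assumption worth flagging: the parser folds names to upper case by default, so the sketch switches the XML_OPTION_CASE_FOLDING option off to receive the names as they are written in the document.

```php
<?php
$seen = array();

// Start element handler that records the name and attributes it is given.
function recordStart($parser, $name, $attribs) {
    global $seen;
    $seen[] = array("name" => $name, "attribs" => $attribs);
}

// This sketch ignores closing tags.
function ignoreEnd($parser, $name) {
}

$parser = xml_parser_create();
// Case folding is on by default and upper-cases names; switch it off so the
// handler sees the names exactly as written in the document.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($parser, "recordStart", "ignoreEnd");
xml_parse($parser, '<author gender="male" age="24">Harish Rawat</author>', true);

print_r($seen[0]);
```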
End Element Handler
The user-defined end element handler function, registered with the parser through an
xml_set_element_handler() function call, will be called when the parser encounters the end tag
of an element in the document. This function should have the following syntax:

endElementHandler(int parser, string name);

Parameter  Optional  Description
parser     No        Reference to the XML parser which is calling this function
name       No        Tag name of the element

For example, if we parse the following line of an XML document:

<author sex="male" age="24">Harish Rawat</author>

The registered end element handler function will be called with the following parameters:


endElementHandler($parser, "author");

Notice that the value of the name parameter is "author" and not "/author".
The Character Data Handler
The function xml_set_character_data_handler() registers the character data handler with
the XML parser. The character data handler is called by the parser, for all non-markup contents of the
XML document:

int xml_set_character_data_handler (int parser, string characterDataHandler);

Parameter             Optional  Description
parser                No        The handle for an XML parser, with which the character
                                data handler is registered
characterDataHandler  No        The name of the character data handler function. If null
                                is specified then no character data handler is registered.

The function returns true on success; otherwise, false is returned. The function will return false if
parser is not a valid parser handle.

The registered handler function should exist when an XML document is parsed; otherwise an error is
generated.
Prototype for the Character Data Handler
The user-defined character data handler function, registered with the parser through a call to the
xml_set_character_data_handler() function, will be called when the parser encounters
non-markup content in the XML document and should have the following syntax:

characterDataHandler(int parser, string data);

Parameter  Optional  Description
parser     No        Reference to the XML parser which is calling this function.
data       No        The character data as present in the XML document. The parser
                     returns the character data as it is, and does not remove any
                     white space.

While parsing the contents of an element, the character data handler can be called any number of
times. This should be kept in mind while defining the character data handler function.

For example, while parsing the following line of an XML document:

<author sex="male" age="24">Harish Rawat</author>

The character data handler can be called once with the following parameters:

characterDataHandler($parser, "Harish Rawat");

Or it can be called twice; firstly as:


characterDataHandler($parser, "Harish ");

And again as:

characterDataHandler($parser, "Rawat");
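
Because the text of one element may arrive in several pieces, a common approach is to collect the pieces in a buffer and use them only when the end tag is reached. The sketch below is one way of doing this; the function and variable names are hypothetical, and the document is deliberately fed to the parser in two chunks that split the text.

```php
<?php
$buffer = "";
$authors = array();

function startAuthor($parser, $name, $attribs) {
    global $buffer;
    $buffer = "";              // start a fresh buffer for each element
}

function appendText($parser, $data) {
    global $buffer;
    $buffer .= $data;          // may run once or several times per element
}

function endAuthor($parser, $name) {
    global $buffer, $authors;
    $authors[] = $buffer;      // the text is complete only at the end tag
}

$parser = xml_parser_create();
xml_set_element_handler($parser, "startAuthor", "endAuthor");
xml_set_character_data_handler($parser, "appendText");

// Feed the document in two chunks, split in the middle of the text.
xml_parse($parser, '<author sex="male" age="24">Harish ', false);
xml_parse($parser, 'Rawat</author>', true);

echo $authors[0], "\n";
```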
The Processing Instruction Handler
The function xml_set_processing_instruction_handler() registers with the XML parser
the function that will be used to handle processing instructions. The processing instruction handler is
called by the parser when it encounters a processing instruction in the XML document:

int xml_set_processing_instruction_handler(int parser, string
processingInstructionHandler);

Parameter                     Optional  Description
parser                        No        The handle for an XML parser with which the
                                        processing instruction handler is registered
processingInstructionHandler  No        The name of the processing instruction handler
                                        function. If null is specified then no processing
                                        instruction handler is registered.

The function returns true on success or false on failure. The function will return false if
parser is not a valid parser handle.

The registered handler function should exist when an XML document is parsed, or an error is generated.

Processing instructions, as we saw in the section on the XML Language, are application-specific
instructions embedded in the XML document. This is similar to the way we embed PHP instructions in
an HTML file.
Prototype for the Processing Instruction Handler
The user defined processing instruction handler function, registered with the parser through the
xml_set_processing_instruction_handler() function, will be called when the parser
encounters processing instructions in the XML document and should have the following syntax:

processingInstructionHandler(int parser, string target, string data);

Parameter  Optional  Description
parser     No        Reference to the XML parser which is calling this function
target     No        The target of the processing instruction
data       No        The data of the processing instruction, to be passed to the
                     target application

For example, if we are parsing the following processing instruction in an XML document:

<?php print "This document was created on Jan 01, 1999";?>


The processing instruction handler will be called with the following parameters:

processingInstructionHandler($parser, "php",
"print \"This document was created on Jan 01, 1999\";");

A sample processing instruction handler might look like this:

function piHandler($parser, $target, $data) {
    if (strcmp(strtolower($target), "php") == 0) {
        eval($data);
    }
}

If you are defining such a processing instruction handler in your application, then you should do some
security checks before executing the code. The code embedded through processing instructions can be
malicious; for example, it could delete all the files on the server.

One security check could be to execute the code in the processing instructions only if the owner of the
XML file and the owner of the script running the XML parser are the same.
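
The ownership check could be sketched roughly as follows. The names isTrustedFile, guardedPiHandler and $xmlFile are hypothetical, and comparing fileowner() with getmyuid() is only one possible reading of "the same owner"; treat this as a starting point under those assumptions, not a complete safeguard.

```php
<?php
// Hypothetical path of the document being parsed.
$xmlFile = "test.xml";

// True when $xmlFile exists and is owned by the trusted user ID. When no ID
// is given, the owner of the running script (getmyuid()) is assumed.
function isTrustedFile(string $xmlFile, ?int $trustedUid = null): bool {
    if (!file_exists($xmlFile)) {
        return false;
    }
    if ($trustedUid === null) {
        $trustedUid = getmyuid();
    }
    $owner = fileowner($xmlFile);
    return $owner !== false && $owner === $trustedUid;
}

// Processing instruction handler that runs embedded PHP only for trusted files.
function guardedPiHandler($parser, $target, $data) {
    global $xmlFile;
    if (strtolower($target) === "php" && isTrustedFile($xmlFile)) {
        eval($data);
    }
}
```

The handler would be registered in the usual way with xml_set_processing_instruction_handler($parser, "guardedPiHandler").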
The Notation Declaration Handler
The function xml_set_notation_decl_handler() registers the notation declaration handler
with the parser. The notation declaration handler is called by the parser whenever it encounters a
notation declaration in the XML document.

int xml_set_notation_decl_handler(int parser,
string notationDeclarationHandler);

Parameter                   Optional  Description
parser                      No        The handle for the XML parser with which the
                                      notation declaration handler is registered
notationDeclarationHandler  No        The name of the notation declaration handler
                                      function. If null is specified then no notation
                                      declaration handler is registered.

The function returns true on success, otherwise false is returned. The function will return false
if parser is not a valid parser handle.

An error will be generated when the XML document is parsed if the notation declaration handler does
not exist.
Prototype for the Notation Declaration Handler
The user defined notation declaration handler function, registered with the parser through a call to the
xml_set_notation_decl_handler() function, will be called when the parser encounters
notation declarations in the XML document and should have the syntax:

notationDeclarationHandler(int parser, string notationName, string base,
string systemId, string publicId);

Parameter     Optional  Description
parser        No        Reference to the XML parser which is calling this function
notationName  No        Name of the notation
base          No        This is the base for resolving the systemId. Currently the
                        value of this parameter will always be an empty string.
systemId      No        The system identifier of the notation declaration
publicId      No        The public identifier of the notation declaration

For example, parsing the following notation declaration of an XML document:

<!NOTATION gif SYSTEM "file:///usr/bin/netscape">

Will cause the notation declaration handler to be called with the following parameters:

notationDeclarationHandler($parser, "gif", "", "file:///usr/bin/netscape", "");

Let's implement a sample notation declaration handler. This handler populates the associative array
$helperApps with a mapping between the notation name and the name of the application that will
handle the unparsed data of type $notationName. The $helperApps array can be used by the
unparsed entity declaration handler to identify the application that should be used to process non-
XML data. We will look at the unparsed entity declaration handler shortly.

function notationHandler($parser, $notationName, $base, $systemId, $publicId) {
    global $helperApps;
    if ($systemId) {
        $helperApps[$notationName] = $systemId;
    } else {
        $helperApps[$notationName] = $publicId;
    }
}
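
A short usage sketch for this handler follows. It repeats the handler so that the example is complete on its own, registers it, and parses a small made-up document whose DTD contains the gif notation from earlier in the chapter; the book element is only a placeholder.

```php
<?php
$helperApps = array();

// The notation declaration handler from the text above.
function notationHandler($parser, $notationName, $base, $systemId, $publicId) {
    global $helperApps;
    if ($systemId) {
        $helperApps[$notationName] = $systemId;
    } else {
        $helperApps[$notationName] = $publicId;
    }
}

// Illustrative document with one notation declaration in its internal DTD.
$xml = <<<'XML'
<?xml version="1.0"?>
<!DOCTYPE book [
  <!NOTATION gif SYSTEM "file:///usr/bin/netscape">
]>
<book>Pro PHP</book>
XML;

$parser = xml_parser_create();
xml_set_notation_decl_handler($parser, "notationHandler");
xml_parse($parser, $xml, true);

// Show the notation-to-helper mapping gathered while parsing.
print_r($helperApps);
```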

The External Entity Reference Handler
The function xml_set_external_entity_ref_handler() registers the external entity
reference handler with the XML parser. This function is called by the parser when it encounters an
external entity reference in an XML document. Note that the registered handler is called for external
entity references and not external entity declarations.

Unlike other parsers (such as Microsoft Internet Explorer 5), the XML parser of PHP does not handle
external entities. It simply calls the registered external entity reference handler to handle it.

int xml_set_external_entity_ref_handler(int parser,
string externalEntityRefHandler);

Parameter                 Optional  Description
parser                    No        The handle for an XML parser with which the external
                                    entity reference handler is registered
externalEntityRefHandler  No        The name of the external entity reference handler
                                    function. If null is specified then no external
                                    entity reference handler is registered.

The function returns true on success; otherwise, false is returned. The function will return false
if parser is not a valid parser handle.


The registered handler function should exist when parsing an XML document, or an error will be
generated.
Prototype for the External Entity Reference Handler
The user-defined external entity reference handler function, registered with the parser through an
xml_set_external_entity_ref_handler function call, will be called when the parser
encounters external entity references in the XML document. This should have the following syntax:

int externalEntityRefHandler(int parser, string entityName, string base,
string systemId, string publicId);

Parameter   Optional  Description
parser      No        Reference to the XML parser which is calling this function
entityName  No        Name of the entity
base        No        This is the base for resolving systemId. Currently the value
                      of this parameter will always be an empty string.
systemId    No        The system identifier of the external entity
publicId    No        The public identifier of the external entity



The user-defined external entity reference handler should handle the external references in the XML
document. If a true value is returned by the handler, the parser assumes that the external reference
was successfully handled and the parsing continues. If the handler returns false, the parser will stop
parsing.

As an example, suppose that an entity &book_1861002777; is defined in the DTD of an XML
document:

<!ENTITY book_1861002777 SYSTEM "1861002777.xml">

And the parser comes across the following line in an XML document:

&book_1861002777;

The external entity reference handler will be called with the parameters:

externalEntityRefHandler($parser, "book_1861002777", "", "1861002777.xml", "");
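
As a rough sketch of how these pieces fit together, the following fragment registers a handler for the book_1861002777 entity shown above. The $seen array and the use of an anonymous function in place of a handler name are choices made for this illustration only, and the anonymous function assumes a newer PHP version than the one this chapter describes. Whether the handler fires can depend on the XML library that PHP was built against, so the sketch simply records whatever it is given rather than loading the external file.

```php
<?php
// A document whose DTD declares one external entity and references it once.
$xml = <<<XML
<?xml version="1.0"?>
<!DOCTYPE books [
  <!ENTITY book_1861002777 SYSTEM "1861002777.xml">
]>
<books>&book_1861002777;</books>
XML;

$parser = xml_parser_create();

// $seen is a hypothetical log of the references passed to the handler.
$seen = [];
$registered = xml_set_external_entity_ref_handler(
    $parser,
    function ($parser, $entityName, $base, $systemId, $publicId) use (&$seen) {
        // A real handler would open and parse the file named by $systemId here.
        $seen[] = ['entityName' => $entityName, 'systemId' => $systemId];
        return true; // tell the parser that the reference was handled
    }
);

xml_parse($parser, $xml, true);
xml_parser_free($parser);

// Show what the handler received, if it was called.
print_r($seen);
```

Returning true from the handler lets parsing continue; returning false instead would make the parser stop at the entity reference.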
The Unparsed Entity Declaration Handler
The function xml_set_unparsed_entity_decl_handler() registers the unparsed entity
declaration handler with the XML parser. The unparsed entity declaration handler is called by the
parser when it encounters an unparsed entity declaration in an XML document.

int xml_set_unparsed_entity_decl_handler(int parser,
string unparsedEntityDeclHandler);

Parameter                  Optional  Description
parser                     No        The handle of an XML parser with which the unparsed
                                     entity declaration handler is registered.
unparsedEntityDeclHandler  No        The name of the unparsed entity declaration handler
                                     function. If null is specified then no unparsed entity
                                     declaration handler is registered.

The function returns true on success or false on failure. The function returns false if parser is
not a valid parser handle.

The registered handler function should exist when an XML document is parsed, or an error will be generated.
Prototype for the Unparsed Entity Declaration Handler
The user-defined unparsed entity declaration handler function, registered with the parser through an
xml_set_unparsed_entity_decl_handler() function call, will be called when the parser
encounters an unparsed entity declaration in the XML document. Its syntax is:

unparsedEntityDeclHandler(int parser, string entityName, string base,
string systemId, string publicId, string notationName);

Parameter     Optional  Description
parser        No        Reference to the XML parser which is calling this function.
entityName    No        Name of the entity.
base          No        This is the base for resolving systemId. Currently the value
                        of this parameter will always be a null string.
systemId      No        The system identifier of the unparsed entity.
publicId      No        The public identifier of the unparsed entity.
notationName  No        The name of the notation (defined in an earlier notation
                        declaration), identifying the type of unparsed data.

For example, if the parser encounters the following line (in the DTD) of an XML document:

<!ENTITY book_gif_1861002777 SYSTEM "1861002777.gif" NDATA gif>

The unparsed entity declaration handler will be called with the following parameters:

unparsedEntityDeclHandler($parser, "book_gif_1861002777", "", "1861002777.gif",
"", "gif");
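
To make the call sequence more concrete, here is a minimal sketch that registers an unparsed entity declaration handler and parses a small document containing the declaration above. The NOTATION declaration, the empty book element, and the $declared array are assumptions added so that the example is self-contained; the anonymous function also assumes a newer PHP version than the one covered in this chapter.

```php
<?php
// The DTD declares a notation and then an unparsed entity that uses it.
$xml = <<<XML
<?xml version="1.0"?>
<!DOCTYPE book [
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY book_gif_1861002777 SYSTEM "1861002777.gif" NDATA gif>
]>
<book/>
XML;

$parser = xml_parser_create();

// $declared is a hypothetical list of the declarations seen by the handler.
$declared = [];
$registered = xml_set_unparsed_entity_decl_handler(
    $parser,
    function ($parser, $entityName, $base, $systemId, $publicId, $notationName) use (&$declared) {
        // Remember which file the entity points to and how it is typed.
        $declared[] = [
            'entityName'   => $entityName,
            'systemId'     => $systemId,
            'notationName' => $notationName,
        ];
    }
);

xml_parse($parser, $xml, true);
xml_parser_free($parser);

// Show the declarations that were reported to the handler.
print_r($declared);
```

An application would typically use the notation name to decide how to treat the file named by the system identifier, for example as an image.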
The Default Handler
The function xml_set_default_handler() registers the default handler with the XML parser.
The default handler is called by the parser for all the nodes of the XML document for which handlers
cannot be registered (such as the XML version declaration, DTD declaration and comments). The
default handler is also called for any other nodes for which the handlers are not registered with the
parser. For example, if the start and end element handlers are not registered with the parser, the parser
will call the default handler (if registered) whenever it encounters element opening and closing tags in
the XML document.


int xml_set_default_handler(int parser, string defaultHandler);

Parameter       Optional  Description
parser          No        The handle of an XML parser with which the default handler
                          is registered.
defaultHandler  No        The name of the default handler function. If null is
                          specified then no default handler is registered.

The function returns true on success, or false on error (e.g. if parser is not a valid parser
handle).

The registered handler function should exist when parsing an XML document, or an error is generated.
Prototype for the Default Handler
The user-defined default handler gets called by the parser for all the nodes in the XML document for
which handler functions are not registered. It should have the following syntax:

DefaultHandler(int parser, string data);

Parameter  Optional  Description
parser     No        Reference to the XML parser which is calling this function.
data       No        The part of the XML document for which there is no registered
                     handler.


For example, if the start and end element handlers are not registered with the parser and the parser
encounters this line in an XML document:

<author sex="male" age="24">Harish Rawat</author>

The default handler will be called for the opening tag with the following parameter values:

DefaultHandler($parser, "<author sex=\"male\" age=\"24\">");

Notice that the entire opening and closing tags of the element are passed as they are.
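
The behavior described above can be observed with a short sketch such as the following, which registers only a default handler and collects every fragment passed to it. The $pieces array is a name chosen for this illustration, and the anonymous function assumes a newer PHP version than the one this chapter covers. The exact way the document is split into fragments may vary with the underlying XML library, so no particular output is claimed here.

```php
<?php
$parser = xml_parser_create();

// No element or character data handlers are registered, so the parts of
// the document fall through to the default handler.
$pieces = [];
xml_set_default_handler(
    $parser,
    function ($parser, $data) use (&$pieces) {
        $pieces[] = $data; // keep each fragment exactly as it was passed in
    }
);

xml_parse($parser, '<author sex="male" age="24">Harish Rawat</author>', true);
xml_parser_free($parser);

// Print one fragment per line to see how the document was split up.
foreach ($pieces as $piece) {
    echo $piece, "\n";
}
```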
Parsing the XML Document
The xml_parse() function passes the contents of the XML document to the parser. This function
accomplishes the actual parsing of the document – it calls the appropriate registered handlers as and
when it encounters nodes in the document.

This function is called after all the handler functions for the various node types in the XML document
have been registered with the parser.

int xml_parse(int parser, string data, int [isFinal]);

Parameter  Optional  Description                                         Default
parser     No        The handle for an XML parser, which will parse
                     the supplied data.
data       No        The contents of the XML document. The complete
                     contents of the XML file need not be passed in
                     one call.
isFinal    Yes       Specifies the end of input data.                    false

The function returns true if it was able to parse the data passed to it; otherwise, false is returned.
The error information in case of failure can be found with the xml_get_error_code() and
xml_error_string() functions. We shall look at these functions presently.

The following code fragment illustrates the use of the xml_parse() function:

// Open the XML file
if (!($fp = fopen($file, "r"))) {
    die("could not open $file for reading");
}

// Read the file in 4K chunks and pass each chunk to the xml_parse() function
while ($data = fread($fp, 4096)) {
    if (!xml_parse($xml_parser, $data, feof($fp))) {
        // Report the error message and the position at which it occurred
        die(sprintf("XML error: %s at line %d, column %d",
            xml_error_string(xml_get_error_code($xml_parser)),
            xml_get_current_line_number($xml_parser),
            xml_get_current_column_number($xml_parser)));
    }
}
fclose($fp);
Freeing the Parser
The function xml_parser_free() frees the XML parser which was created with the
xml_parser_create() function. All the resources associated with the parser are freed.

The XML parser should be freed after a complete XML document has been parsed, or if an error
occurs while parsing a document.


int xml_parser_free(int parser);

Parameter  Optional  Description
parser     No        The handle of an XML parser, which is to be freed.

The function returns true if the parser was freed, otherwise false.
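
The life cycle described so far (create the parser, register handlers, feed it data, then free it) might be sketched as follows. The document string, the 16-byte chunk size, and the $text variable are arbitrary choices made for this illustration, and the anonymous function assumes a newer PHP version than the one this chapter covers. The point of the sketch is that isFinal is true only for the last chunk and that the parser is freed whether or not parsing succeeded.

```php
<?php
$xml = '<books><book>Professional PHP Programming</book></books>';

$parser = xml_parser_create();

// Accumulate character data; it may arrive in several separate calls.
$text = '';
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$text) {
        $text .= $data;
    }
);

// Feed the document to the parser in small chunks.
$chunks = str_split($xml, 16);
$last = count($chunks) - 1;
$ok = true;
foreach ($chunks as $i => $chunk) {
    // isFinal is true only when the last chunk is passed.
    if (!xml_parse($parser, $chunk, $i === $last)) {
        $ok = false;
        break;
    }
}

// Free the parser after the whole document is parsed, or after an error.
xml_parser_free($parser);

echo $ok ? $text : "parse failed", "\n";
```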
Parser Options
There are two options for the parser. We can set values for these options using the
xml_parser_set_option() function, and retrieve the current value with the
xml_parser_get_option() function.

These options are:

Option                      Data Type  Description                                   Default
XML_OPTION_CASE_FOLDING     Integer    If the value of the option is true, then the  true
                                       element names (start and end tags) will be
                                       upper cased when the registered handlers
                                       are called.
XML_OPTION_TARGET_ENCODING  String     The value of this option specifies the        Same as the source
                                       target encoding used by the parser when it    encoding value
                                       invokes registered handlers.                  specified when the
                                                                                     parser was created.
xml_parser_set_option
The xml_parser_set_option() function sets the option specified in the option argument to
the value in the value argument for the parser associated with the parser handle specified by the
parser argument.

int xml_parser_set_option(int parser, int option, mixed value);

The function returns true if the new option was set; if the call failed, false is returned.

The function xml_parser_set_option() can be called at any point in the PHP program. The
new option will take effect for any data that is parsed after the option has been set.
xml_parser_get_option
The xml_parser_get_option() function retrieves the value for the option specified by the
option argument for the parser specified by the parser argument.

mixed xml_parser_get_option(int parser, int option);

This function returns the value of the option (the data type of the return value therefore depends on
the option). If either the parser or the option argument is invalid then false is returned.
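
As an illustration of these two functions, the following sketch reads the default value of XML_OPTION_CASE_FOLDING, switches the option off, and then parses a small document to show that element names reach the handlers unchanged. The document string and the variables $before, $after and $names are assumptions made for this example, and the anonymous functions assume a newer PHP version than the one this chapter covers.

```php
<?php
$parser = xml_parser_create();

// Case folding is on by default.
$before = xml_parser_get_option($parser, XML_OPTION_CASE_FOLDING);

// Collect element names exactly as the start element handler receives them.
$names = [];
xml_set_element_handler(
    $parser,
    function ($parser, $name, $attributes) use (&$names) {
        $names[] = $name;
    },
    function ($parser, $name) {
        // Nothing to do for closing tags in this sketch.
    }
);

// Turn case folding off before any data is parsed.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
$after = xml_parser_get_option($parser, XML_OPTION_CASE_FOLDING);

xml_parse($parser, '<book><title>PHP</title></book>', true);
xml_parser_free($parser);

// With folding off, the names keep the case used in the document.
print_r($names);
```

Had the option been left at its default, the handler would have received BOOK and TITLE instead.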
Utility Functions
The remaining functions provide useful information or services that we may need when parsing an
XML document. These functions provide information about any errors which occurred and the current
position in the XML document. There are also functions for encoding and decoding text.
xml_get_error_code
The function xml_get_error_code() returns the error code from the XML parser.

int xml_get_error_code(int parser);

Parameter  Optional  Description
parser     No        The handle of an XML parser.

This function can be called after xml_parse() has returned false to find out the exact reason
why the parsing of the passed data failed. The function returns false if the parser is not a valid
XML parser.
xml_error_string
The xml_error_string() function returns the error message corresponding to an error code.

string xml_error_string(int errorCode);

Parameter  Optional  Description
errorCode  No        An error code returned by the xml_get_error_code() function.

This function returns a string with a textual description of the error code passed in the errorCode
argument, or false if no description was found.
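
The two error functions are normally used together. The sketch below parses a deliberately malformed document (the title element is never closed) and builds a message from the error code and its description. The document string and the $message variable are assumptions made for this illustration; the numeric code and the wording of the description depend on the XML library in use, so no specific output is shown.

```php
<?php
$parser = xml_parser_create();

// Deliberately malformed input: <title> is closed by </book>.
$ok = xml_parse($parser, "<book>\n<title>PHP</book>", true);

$message = '';
if (!$ok) {
    // Ask the parser why it failed, then translate the code into text.
    $code = xml_get_error_code($parser);
    $message = sprintf(
        "XML error %d (%s) at line %d, column %d",
        $code,
        xml_error_string($code),
        xml_get_current_line_number($parser),
        xml_get_current_column_number($parser)
    );
}

xml_parser_free($parser);

echo $message, "\n";
```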
xml_get_current_line_number
The xml_get_current_line_number() function returns the current line number from the
parser.

int xml_get_current_line_number(int parser);

Parameter  Optional  Description
parser     No        The handle of an XML parser.

This function returns the line number of the XML document that the parser is currently parsing. If
parser is not a valid parser, false is returned. This function can be used to print the line number
(for debugging purposes), when a call to the xml_parse() function returns false.
xml_get_current_column_number
The xml_get_current_column_number() function is similar to
xml_get_current_line_number(); the only difference is that it returns the number of the
current column in the line that the parser is parsing.

int xml_get_current_column_number(int parser);

The functions xml_get_current_line_number() and
xml_get_current_column_number() can be used together when reporting parse errors in the
XML document to give the user the exact location where the error occurred:

if (!xml_parse($parser, $data)) {
    die(sprintf("XML error at line %d, column %d",
        xml_get_current_line_number($parser),
        xml_get_current_column_number($parser)));
}