XML in Java - English


Learn the structure and syntax of XML in Java

Tutorial: XML programming in Java

Doug Tidwell
Cyber Evangelist, developerWorks XML Team
September 1999

About this tutorial

Our first tutorial, “Introduction to XML,” discussed the basics of XML and demonstrated its potential to revolutionize the Web. This tutorial shows you how to use an XML parser and other tools to create, process, and manipulate XML documents. Best of all, every tool discussed here is freely available at IBM’s alphaWorks site (www.alphaworks.ibm.com) and other places on the Web.

About the author

Doug Tidwell is a Senior Programmer at IBM. He has well over a seventh of a century of programming experience and has been working with XML-like applications for several years. His job as a Cyber Evangelist is basically to look busy, and to help customers evaluate and implement XML technology. Using a specially designed pair of zircon-encrusted tweezers, he holds a Masters Degree in Computer Science from Vanderbilt University and a Bachelors Degree in English from the University of Georgia.

Section 1 – Introduction

About this tutorial

Our previous tutorial discussed the basics of XML and demonstrated its potential to revolutionize the Web. In this tutorial, we’ll discuss how to use an XML parser to:
• Process an XML document
• Create an XML document
• Manipulate an XML document

We’ll also talk about some useful, lesser-known features of XML parsers. Best of all, every tool discussed here is freely available at IBM’s alphaWorks site (www.alphaworks.ibm.com) and other places on the Web.

What’s not here

There are several important programming topics not discussed here:
• Using visual tools to build XML applications
• Transforming an XML document from one vocabulary to another
• Creating interfaces for end users or other processes, and creating interfaces to back-end data stores

All of these topics are important when you’re building an XML application.
We’re working on new tutorials that will give these subjects their due, so watch this space!

XML application architecture

An XML application is typically built around an XML parser. It has an interface to its users, and an interface to some sort of back-end data store.

This tutorial focuses on writing Java code that uses an XML parser to manipulate XML documents. In the beautiful picture on the left, this tutorial is focused on the middle box.

[Figure: boxes labeled User Interface, XML Application (containing XML Parser), and Data Store. Original artwork drawn by Doug Tidwell. All rights reserved.]

Section 2 – Parser basics

The basics

An XML parser is a piece of code that reads a document and analyzes its structure. In this section, we’ll discuss how to use an XML parser to read an XML document. We’ll also discuss the different types of parsers and when you might want to use them.

Later sections of the tutorial will discuss what you’ll get back from the parser and how to use those results.

How to use a parser

We’ll talk about this in more detail in the following sections, but in general, here’s how you use a parser:
1. Create a parser object
2. Pass your XML document to the parser
3. Process the results

Building an XML application is obviously more involved than this, but this is the typical flow of an XML application.

Kinds of parsers

There are several different ways to categorize parsers:
• Validating versus non-validating parsers
• Parsers that support the Document Object Model (DOM)
• Parsers that support the Simple API for XML (SAX)
• Parsers written in a particular language (Java, C++, Perl, etc.)
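The three-step flow described in this section (create a parser object, pass it a document, process the results) can be sketched in a few lines of Java. This is only an illustration under stated assumptions: it uses the JAXP classes bundled with current JDKs (`DocumentBuilderFactory`, `DocumentBuilder`) rather than the XML4J classes the tutorial works with, and the class name `ThreeStepParse` and method `rootElementName` are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class; JAXP is an assumption, not the tutorial's XML4J API.
public class ThreeStepParse {

    // Parses an XML string and returns the tag name of its root element.
    static String rootElementName(String xml) {
        try {
            // 1. Create a parser object.
            DocumentBuilder parser =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // 2. Pass the XML document to the parser.
            Document doc = parser.parse(new InputSource(new StringReader(xml)));
            // 3. Process the results (here: just read the root element's name).
            return doc.getDocumentElement().getTagName();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // A document that is not well-formed ends up here as a SAXException.
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<sonnet type=\"Shakespearean\"><title>Sonnet 130</title></sonnet>";
        System.out.println(rootElementName(xml));
    }
}
```

The default factory produces a non-validating parser, so this sketch only checks that the input is well-formed.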
Validating versus non-validating parsers

As we mentioned in our first tutorial, XML documents that use a DTD and follow the rules defined in that DTD are called valid documents. XML documents that follow the basic tagging rules are called well-formed documents.

The XML specification requires all parsers to report errors when they find that a document is not well-formed. Validation, however, is a different issue. Validating parsers validate XML documents as they parse them. Non-validating parsers ignore any validation errors. In other words, if an XML document is well-formed, a non-validating parser doesn’t care if the document follows the rules specified in its DTD (if any).

Why use a non-validating parser?

Speed and efficiency. It takes a significant amount of effort for an XML parser to process a DTD and make sure that every element in an XML document follows the rules of the DTD. If you’re sure that an XML document is valid (maybe it was generated by a trusted source), there’s no point in validating it again.

Also, there may be times when all you care about is finding the XML tags in a document. Once you have the tags, you can extract the data from them and process it in some way. If that’s all you need to do, a non-validating parser is the right choice.

The Document Object Model (DOM)

The Document Object Model is an official recommendation of the World Wide Web Consortium (W3C). It defines an interface that enables programs to access and update the style, structure, and contents of XML documents. XML parsers that support the DOM implement that interface.

The first version of the specification, DOM Level 1, is available at http://www.w3.org/TR/REC-DOM-Level-1, if you enjoy reading that kind of thing.

What you get from a DOM parser

When you parse an XML document with a DOM parser, you get back a tree structure that contains all of the elements of your document.
The DOM provides a variety of functions you can use to examine the contents and structure of the document.

A word about standards

Now that we’re getting into developing XML applications, we might as well mention the XML specification. Officially, XML is a trademark of MIT and a product of the World Wide Web Consortium (W3C).

The XML Specification, an official recommendation of the W3C, is available at www.w3.org/TR/REC-xml for your reading pleasure. The W3C site contains specifications for XML, DOM, and literally dozens of other XML-related standards. The XML zone at developerWorks has an overview of these standards, complete with links to the actual specifications.

The Simple API for XML (SAX)

The SAX API is an alternate way of working with the contents of XML documents. A de facto standard, it was developed by David Megginson and other members of the XML-Dev mailing list.

To see the complete SAX standard, check out www.megginson.com/SAX/. To subscribe to the XML-Dev mailing list, send a message to majordomo@ic.ac.uk containing the following: subscribe xml-dev.

What you get from a SAX parser

When you parse an XML document with a SAX parser, the parser generates events at various points in your document. It’s up to you to decide what to do with each of those events.

A SAX parser generates events at the start and end of a document, at the start and end of an element, when it finds characters inside an element, and at several other points. You write the Java code that handles each event, and you decide what to do with the information you get from the parser.

Why use SAX? Why use DOM?

We’ll talk about this in more detail later, but in general, you should use a DOM parser when:
• You need to know a lot about the structure of a document
• You need to move parts of the document around (you might want to sort certain elements, for example)
• You need to use the information in the document more than once

Use a SAX parser if you only need to extract a few elements from an XML document. SAX parsers are also appropriate if you don’t have much memory to work with, or if you’re only going to use the information in the document once (as opposed to parsing the information once, then using it many times later).

XML parsers in different languages

XML parsers and libraries exist for most languages used on the Web, including Java, C++, Perl, and Python. The next panel has links to XML parsers from IBM and other vendors.

Most of the examples in this tutorial deal with IBM’s XML4J parser. All of the code we’ll discuss in this tutorial uses standard interfaces. In the final section of this tutorial, though, we’ll show you how easy it is to write code that uses another parser.

Resources – XML parsers

Java
• IBM’s parser, XML4J, is available at www.alphaWorks.ibm.com/tech/xml4j.
• James Clark’s parser, XP, is available at www.jclark.com/xml/xp.
• Sun’s XML parser can be downloaded from developer.java.sun.com/developer/products/xml/ (you must be a member of the Java Developer Connection to download).
• DataChannel’s XJParser is available at xdev.datachannel.com/downloads/xjparser/.

C++
• IBM’s XML4C parser is available at www.alphaWorks.ibm.com/tech/xml4c.
• James Clark’s C++ parser, expat, is available at www.jclark.com/xml/expat.html.

Perl
• There are several XML parsers for Perl. For more information, see www.perlxml.com/faq/perl-xml-faq.html.

Python
• For information on parsing XML documents in Python, see www.python.org/topics/xml/.
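As a rough illustration of the SAX event model described earlier in this section, the sketch below pulls out only the text of `title` elements and ignores everything else, which is the "extract a few elements" case where SAX is a good fit. It rests on assumptions not found in the tutorial: it uses the JDK's bundled JAXP factory and the SAX2 `DefaultHandler` class rather than the parsers listed above, and `TitleExtractor` and `extractTitles` are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text content of every <title> element.
public class TitleExtractor extends DefaultHandler {
    private final List<String> titles = new ArrayList<>();
    private StringBuilder current; // non-null only while inside a <title>

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Event: the parser found the start of an element.
        if ("title".equals(qName)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Event: the parser found characters; keep them only inside <title>.
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Event: the parser found the end of an element.
        if ("title".equals(qName)) {
            titles.add(current.toString());
            current = null;
        }
    }

    // Runs a SAX parse over the given XML string and returns the collected titles.
    static List<String> extractTitles(String xml) {
        try {
            TitleExtractor handler = new TitleExtractor();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.titles;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<sonnet><author>Shakespeare</author><title>Sonnet 130</title></sonnet>";
        System.out.println(extractTitles(xml));
    }
}
```

No tree is built here: the handler keeps only the strings it cares about, which is why this style suits documents that are large or are read only once.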
One more thing

While we’re talking about resources, there’s one more thing: the best book on XML and Java (in our humble opinion, anyway).

We highly recommend XML and Java: Developing Web Applications, written by Hiroshi Maruyama, Kent Tamura, and Naohiko Uramoto, the three original authors of IBM’s XML4J parser. Published by Addison-Wesley, it’s available at bookpool.com or your local bookseller.

Summary

The heart of any XML application is an XML parser. To process an XML document, your application will create a parser object, pass it an XML document, then process the results that come back from the parser object.

We’ve discussed the different kinds of XML parsers, and why you might want to use each one. We categorized parsers in several ways:
• Validating versus non-validating parsers
• Parsers that support the Document Object Model (DOM)
• Parsers that support the Simple API for XML (SAX)
• Parsers written in a particular language (Java, C++, Perl, etc.)

In our next section, we’ll talk about DOM parsers and how to use them.

Section 3 – The Document Object Model (DOM)

Dom, dom, dom, dom, dom,
Doobie-doobie,
Dom, dom, dom, dom, dom…

The DOM is a common interface for manipulating document structures. One of its design goals is that Java code written for one DOM-compliant parser should run on any other DOM-compliant parser without changes. (We’ll demonstrate this later.)

As we mentioned earlier, a DOM parser returns a tree structure that represents your entire document.

Sample code

Before we go any further, make sure you’ve downloaded our sample XML applications onto your machine. Unzip the file xmljava.zip, and you’re ready to go! (Be sure to remember where you put the file.)

DOM interfaces

The DOM defines several Java interfaces.
Here are the most common:
• Node: The base datatype of the DOM.
• Element: The vast majority of the objects you’ll deal with are Elements.
• Attr: Represents an attribute of an element.
• Text: The actual content of an Element or Attr.
• Document: Represents the entire XML document. A Document object is often referred to as a DOM tree.

Common DOM methods

When you’re working with the DOM, there are several methods you’ll use often:
• Document.getDocumentElement(): Returns the root element of the document.
• Node.getFirstChild() and Node.getLastChild(): Return the first or last child of a given Node.
• Node.getNextSibling() and Node.getPreviousSibling(): Delete everything in the DOM tree, reformat your hard disk, and send an obscene e-mail greeting to everyone in your address book. (Not really. These methods return the next or previous sibling of a given Node.)
• Element.getAttribute(attrName): For a given Element, returns the value of the attribute with the requested name as a string. For example, if you want the value of the attribute named id, use getAttribute("id"). To get the Attr object itself, use Element.getAttributeNode("id").

Our first DOM application!

We’ve been at this a while, so let’s go ahead and actually do something. Our first application simply reads an XML document and writes the document’s contents to standard output.

At a command prompt, run this command:

    java domOne sonnet.xml

This loads our application and tells it to parse the file sonnet.xml. If everything goes well, you’ll see the contents of the XML document written out to standard output.

Sample XML document (truncated in this copy):

    <?xml version="1.0"?>
    <sonnet type="Shakespearean">
      <author>
        <last-name>Shakespeare</last-name>
        <first-name>William</first-name>
        <nationality>British</nationality>
        <year-of-birth>1564</year-of-birth>
        <year-of-death>1616</year-of-death>
      </author>
      <title>Sonnet 130</title>
      <lines>
        <line>My mistress’ eyes are ...

The domOne.java source code is on page 33.
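The traversal methods listed under "Common DOM methods" can be exercised with a short sketch. This is not the tutorial's domOne.java; it is a minimal illustration that assumes the JDK's bundled JAXP parser instead of XML4J, and `DomWalk`, `parse`, and `countNodes` are hypothetical names introduced for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical example class; not the tutorial's domOne.java.
public class DomWalk {

    // Parses an XML string into a DOM tree using the JDK's default parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Counts the given node plus all of its descendants.
    // Attributes are not children in the DOM, so they are not counted.
    static int countNodes(Node node) {
        int count = 1;
        for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            count += countNodes(child);
        }
        return count;
    }

    public static void main(String[] args) {
        Document doc = parse(
                "<sonnet type=\"Shakespearean\"><title>Sonnet 130</title></sonnet>");
        Element root = doc.getDocumentElement();
        // getAttribute returns the attribute's value as a String.
        System.out.println(root.getTagName() + " type=" + root.getAttribute("type"));
        System.out.println("Nodes: " + countNodes(doc));
    }
}
```

Because the Document itself and each run of text are nodes too, the count is generally larger than the number of element tags in the source.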
Excerpt from parseString.java (truncated in this copy):

    import java.io.OutputStreamWriter;
    import java.io.PrintWriter;
    import java.io.UnsupportedEncodingException;
    import java.io.Reader;
    import java.io.StringReader;
    import org.w3c.dom.Attr;
    import org.w3c.dom.Document;
    import org.w3c.dom.NamedNodeMap;
    import org.w3c.dom.Node;
    import org.w3c.dom.NodeList;
    import org.xml.sax.InputSource;
    import com.ibm.xml.parsers.*;

    /**
     * parseString.java
     * This sample...

[...] about the basic architecture of XML applications, and we’ve shown you how to work with XML documents. Future tutorials will cover more details of building XML applications, including:
• Using visual tools to build XML applications
• Transforming an XML document from one vocabulary to another
• Creating front-end interfaces to end users or other processes, and creating back-end interfaces to data stores
[...] Instructions: 0 [...] Total: 69 Nodes

Nodes a-plenty

If you look at sonnet.xml, there are twenty-four tags. You might think that would translate to twenty-four [...]

Posted: 17/08/2012, 09:33
