XML Hacks
By Michael Fitzgerald
Publisher: O'Reilly
Pub Date: July 2004
ISBN: 0-596-00711-6
Pages: 478
This practical, roll-up-your-sleeves guide distills years of ingenious XML hacking into a complete set of tips, tricks, and tools for web developers, system administrators, and programmers who want to leverage the untapped power of XML. If you want more than the average XML user--to explore and experiment, discover clever shortcuts, and show off just a little (and have fun in the process)--this invaluable book is a must-have.
Copyright
Credits
Author
Contributors
Preface
Why XML Hacks?
How This Book Is Organized
Conventions Used in This Book
How to Contact Us
Using Code Examples
Got a hack?
Acknowledgments
Chapter 1. Looking at XML Documents
Hacks #1-10
Hack 1. Read an XML Document
Hack 2. Display an XML Document in a Web Browser
Hack 3. Apply Style to an XML Document with CSS
Hack 4. Use Character and Entity References
Hack 5. Examine XML Documents in Text Editors
Hack 6. Explore XML Documents in Graphical Editors
Hack 7. Choose Tools for Creating an XML Vocabulary
Hack 8. Test XML Documents Online
Hack 9. Test XML Documents from the Command Line
Hack 10. Run Java Programs that Process XML
Chapter 2. Creating XML Documents
Hacks #11-30
Hack 11. Edit XML Documents with <oXygen/>
Hack 12. Edit XML Documents with Emacs and nXML
Hack 13. Edit XML with Vim
Hack 14. Edit XML Documents with Microsoft Word 2003
Hack 15. Work with XML in Microsoft Excel 2003
Hack 16. Work with XML in Microsoft Access 2003
Hack 17. Convert Microsoft Office Files, Old or New, to XML
Hack 18. Create an XML Document from a Text File with xmlspy
Hack 19. Convert Text to XML with Uphill
Hack 20. Create Well-Formed XML with Minimal Manual Tagging Using an SGML Parser
Hack 21. Create an XML Document from a CSV File
Hack 22. Convert an HTML Document to XHTML with HTML Tidy
Hack 23. Transform Documents with XQuery
Hack 24. Execute an XQuery with Saxon
Hack 25. Include Text and Documents with Entities
Hack 26. Include External Documents with XInclude
Hack 27. Encode XML Documents
Hack 28. Explore XLink and XML
Hack 29. What's the Diff? Diff XML Documents
Hack 30. Look at XML Documents Through the Lens of the XML Information Set
Chapter 3. Transforming XML Documents
Hacks #31-58
Hack 31. Understand the Anatomy of an XSLT Stylesheet
Hack 32. Transform an XML Document with a Command-Line Processor
Hack 33. Transform an XML Document Within a Graphical Editor
Hack 34. Analyze Nodes with TreeViewer
Hack 35. Explore a Document Tree with the xmllint Shell
Hack 36. View Documents as Tables Using Generic CSS or XSLT
Hack 37. Generate an XSLT Identity Stylesheet with Relaxer
Hack 38. Pretty-Print XML Using a Generic Identity Stylesheet and Xalan
Hack 39. Create a Text File from an XML Document
Hack 40. Convert Attributes to Elements and Elements to Attributes
Hack 41. Convert XML to CSV
Hack 42. Create and Process SpreadsheetML
Hack 43. Choose Your Output Format in XSLT
Hack 44. Transform Your iTunes Library File
Hack 45. Generate Multiple Output Documents with XSLT 2.0
Hack 46. Generate XML from MySQL
Hack 47. Generate PDF Documents from XML and CSS
Hack 48. Process XML Documents with XSL-FO and FOP
Hack 49. Process HTML with XSLT Using TagSoup
Hack 50. Build Results with Literal Result and Instruction Elements
Hack 51. Write Push and Pull Stylesheets
Hack 52. Perform Math with XSLT
Hack 53. Transform XML Documents with grep and sed
Hack 54. Generate SVG with XSLT
Hack 55. Dither Scatterplots with XSLT and SVG
Hack 56. Use Lookup Tables with XSLT to Translate FIPS Codes
Hack 57. Grouping in XSLT 1.0 and 2.0
Hack 58. Use EXSLT Extensions
Chapter 4. XML Vocabularies
Hacks #59-67
Hack 59. Use XML Namespaces in an XML Vocabulary
Hack 60. Create an RDDL Document
Hack 61. Create and Validate an XHTML 1.0 Document
Hack 62. Create Books, Technical Manuals, and Papers in XML with DocBook
Hack 63. Create a SOAP 1.2 Document
Hack 64. Identify Yourself with FOAF
Hack 65. Unravel the OpenOffice File Format
Hack 66. Render Graphics with SVG
Hack 67. Use XForms in Your XML Documents
Chapter 5. Defining XML Vocabularies with Schema Languages
Hacks #68-79
Hack 68. Validate an XML Document with a DTD
Hack 69. Validate an XML Document with XML Schema
Hack 70. Validate Multiple Documents Against an XML Schema at Once
Hack 71. Check the Integrity of a W3C Schema
Hack 72. Validate an XML Document with RELAX NG
Hack 73. Create a DTD from an Instance
Hack 74. Create an XML Schema Document from an Instance or DTD
Hack 75. Create a RELAX NG Schema from an Instance
Hack 76. Convert a RELAX NG Schema to XML Schema
Hack 77. Use RELAX NG and Schematron Together to Validate Business Rules
Hack 78. Use RELAX NG to Generate DTD Customizations
Hack 79. Generate Instances Based on Schemas
Chapter 6. RSS and Atom
Hacks #80-90
Hack 80. Subscribe to RSS Feeds
Hack 81. Create an RSS 0.91 Document
Hack 82. Create an RSS 1.0 Document
Hack 83. Create an RSS 2.0 Document
Hack 84. Create an Atom Document
Hack 85. Validate RSS and Atom Documents
Hack 86. Create RSS with XML::RSS
Hack 87. Syndicate Content with Movable Type
Hack 88. Post RSS Headlines on Your Site
Hack 89. Create RSS 0.91 Feeds from Google
Hack 90. Syndicate a List of Books from Amazon with RSS and ASP
Chapter 7. Advanced XML Hacks
Hacks #91-100
Hack 91. Pipeline XML with Ant
Hack 92. Use Elements Instead of Entities to Avoid the "amp Explosion Problem"
Hack 93. Use Cocoon to Create a Well-Formed View of a Web Page, Then Scrape It for Data
Hack 94. From Wiki to XML, Through SGML
Hack 95. Create Well-Formed XML with JavaScript
Hack 96. Inspect and Edit XML Documents with the Document Object Model
Hack 97. Processing XML with SAX
Hack 98. Process XML with C#
Hack 99. Generate Code from XML
Hack 100. Create Well-Formed XML with Genx
Colophon
Index
Copyright © 2004 O'Reilly Media, Inc. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.
O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (). For more information, contact our corporate/institutional sales department: (800) 998-9938 or ().
Nutshell Handbook, the Nutshell Handbook logo, and the O'Reilly logo are registered trademarks of O'Reilly Media, Inc. The Hacks series designations, XML Hacks, the image of a socket wrench, "Hacks 100 Industrial-Strength Tips and Tools," and related trade dress are trademarks of O'Reilly Media, Inc.
Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O'Reilly Media, Inc. was aware of a trademark claim, the designations have been printed in caps or initial caps.
While every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.
Small print: The technologies discussed in this publication, the limitations on these technologies that technology and content owners seek to impose, and the laws actually limiting the use of these technologies are constantly changing. Thus, some of the hacks described in this publication may not work, may cause unintended harm to systems on which they are used, or may not be consistent with applicable user agreements. Your use of these hacks is at your own risk, and O'Reilly Media, Inc. disclaims responsibility for any damage or expense resulting from their use. In any event, you should take care that your use of these hacks does not violate any applicable laws, including copyright laws.
Credits
Author
Contributors
Author
Michael Fitzgerald is principal of Wy'east Communications (), a writing, training, and programming consultancy specializing in XML. In addition to this book, he is the author of Learning XSLT (O'Reilly), XSL Essentials (Wiley & Sons), and Building B2B Applications with XML: A Resource Guide (Wiley & Sons). Mike is the creator of Ox, an open source Java tool for generating brief, syntax-related documentation at the command line (). He was also a member of the original RELAX NG technical committee at OASIS (2001-2003). A native of Oregon, Mike now lives with his family in Mapleton, Utah. You can find his technical blog at ().
Contributors
Timothy Appnel has 13 years of corporate IT and Internet systems development experience and is the principal of Appnel Internet Solutions, a technology consultancy specializing in Movable Type and TypePad systems. In addition to being a technologist, Tim has a background in publications, which includes cofounding and managing Oculus Magazine, a free indie music and arts 'zine, for over seven years. He is an occasional contributor to the O'Reilly Network and maintains a personal weblog of his thoughts at ().
Tara Calishain is the editor of the online newsletter ResearchBuzz () and the author or coauthor of several books, including the bestselling Google Hacks (O'Reilly) and Spidering Hacks (O'Reilly). A search engine enthusiast for many years, she began her foray into the world of Perl when Google released its API in April 2002.
John Cowan is the senior Internet systems developer for Reuters Health, a very small subsidiary of Reuters, a wire service and financial news company. He was responsible for Reuters Health's current news publication system, which distributes about 100 articles per day to about 200 wholesale news customers, mostly in XML. (Yes, so most of them want HTML and get XHTML. Deal.) John is a member of the W3C XML Core WG (and the editor of the XML Infoset and XML 1.1 specifications) and the closed Unicore mailing list of the Unicode Technical Committee. He also hangs out on far too many other technical mailing lists, masquerading as the expert on A for the B mailing list and the expert on B for the A mailing list. His friends say that he knows at least something about almost everything, while his enemies say that he knows far too much about far too much. In his copious spare time, John constructed and maintains TagSoup, a SAX-compatible Java parser for ugly, nasty HTML, and the Itsy Bitsy Teeny Weeny Simple Hypertext DTD, a small subset of XHTML Basic suitable for adding rich text to otherwise bald and unconvincing document types (now available in RELAX NG, too). He is interested in languages natural, constructed, and computer and is the author of The Complete Lojban Language. He is also the current maintainer of FIGlet, the world's only Unicode rendering engine that uses ASCII characters instead of pixels.
Leigh Dodds is an application developer and designer specializing in Java and XML. He currently manages a small engineering team at Ingenta () and is responsible for developing bibliographic content management and document delivery systems and services. Leigh is also a freelance author and has contributed numerous articles and tutorials to xmlhack.com, XML.com, and IBM developerWorks. Leigh is based in Bath, United Kingdom.
Micah Dubinko is a software engineer who lives in Phoenix, Arizona, with his wife and child, and works for Verity, Inc. (). He is the author of XForms Essentials (O'Reilly), available online at (). He also served as an editor and author of the W3C XForms specification () and participated in the XForms effort beginning in September 1999, nine months before the official Working Group was chartered. He was awarded CompTIA CDIA (Certified Document Imaging Architect) certification in January 2001.
Bob DuCharme () is the author of XSLT Quickly (Manning Publications), XML: The Annotated Specification (Prentice Hall), SGML CD (Prentice Hall), and Operating Systems Handbook (McGraw Hill). He writes the monthly "Transforming XML" column for XML.com and has contributed to XML Magazine, XML Journal, IBM developerWorks, XML Developer, and XML Handbook (Prentice Hall). A consulting software engineer at LexisNexis, Bob received his BA in religion from Columbia University and his master's in computer science from New York University.
Hans Fugal is the author of desert.vim, the number-one rated color scheme at vim.org (). He also wrote rnc.vim, a Vim syntax highlighting specification for the RELAX NG compact syntax (). He is the author of the gdmxml schema, an XML implementation of the GENTECH Genealogical Data Model (). He has a bachelor's degree in computer science from Brigham Young University and is preparing to pursue a PhD in computer science. He is very interested in computer music and maintains a few computer music-related packages in Debian. He plays the piano and the organ. He is presently employed as a system administrator at Wencor West, Inc. ().
Jason Hunter is the author of Java Servlet Programming (O'Reilly) and coauthor of Java Enterprise Best Practices (O'Reilly). He's an Apache Member, and as Apache's representative to the Java Community Process Executive Committee, he established a landmark agreement for open source Java. He is publisher of Servlets.com and XQuery.com, an original contributor to Apache Tomcat, creator of the com.oreilly.servlet library, and a member of the Expert Groups responsible for servlet, JSP, JAXP, and XQJ API development, and he sits on the W3C XQuery Working Group. He co-created the open source JDOM library to enable optimized Java and XML integration. He works at Mark Logic (), where he has been working on their XQuery implementation since June 2002.
Rick Jelliffe is CTO of Topologi Pty. Ltd. (), a company making XML-related desktop tools, and spends most of his time working on editors, validators, and publishing-related markup. His current main standards project is editing an upcoming ISO standard for the Schematron schema language (), which he originally developed. As well as his work with ISO SC34 and the original XML group at W3C, Rick was a sporadic member of the W3C Schema Working Group and the W3C Internationalization Interest Group. He is the author of XML & SGML Cookbook: Recipes on Structured Information (Prentice Hall PTR). He led the Chinese XML Now project at Academia Sinica Computing Centre (). He lives in Sydney, Australia, and has an economics degree from Sydney University.
Sean McGrath is CTO of Propylon, an XML solutions company. He is an internationally acknowledged authority on XML and related standards. He served as an invited expert to the W3C's Expert Group that defined XML in 1998. He is the author of three books on markup languages published by Prentice Hall and writes the weekly "e-business in the enterprise" newsletter for ITWorld (). His blog is at ().
Sean Nolan founded Software Poetry () and was the CTO for drugstore.com, where he was the fifth employee and led the design and implementation of their award-winning e-commerce systems. While at drugstore.com, Sean was honored as one of the nation's Premier 100 IT Leaders for 2001 by Computerworld magazine.
Thomas Passin is a systems engineer with Mitretek Systems (), a nonprofit systems and information engineering company. He graduated with a BS in physics from the Massachusetts Institute of Technology, and studied graduate-level physics at the University of Chicago. He has been active in XML-related work since 1998. He helped to develop XML versions of message standards for Advanced Traveler Information Systems (), including translations of the message schemas from ASN.1 to XML. He developed an XML/XSLT-based questionnaire generation system. He is also active in the area of Topic Maps, and developed the open source TM4JScript JavaScript topic map engine. Mr. Passin is the author of Explorer's Guide to the Semantic Web, forthcoming from Manning in 2004 ().
Dave Pawson is from Peterborough in the United Kingdom. He has an aerospace background, and is currently working for () on web standards accessibility. In his spare time, he maintains the XSLT FAQ () and a DocBook FAQ (). His interest in DSSSL and XSL-FO led to the publication of the O'Reilly book XSL-FO.
Dean Peters is a graying code-monkey who by day is a mild-mannered IIS/.NET programmer, but by night becomes a not-so-evil Linux genius as he develops software and articles for his blogs, () and ().
Eddie Robertsson finished his master's degree in computer science at the Lund Institute of Technology in Sweden in 1999. Shortly thereafter he moved to Sydney, Australia, for employment at Allette Systems, where he worked as an XML developer and trainer specializing in XML schema languages. During his last few years in Sydney, Eddie worked very closely with Rick Jelliffe and Topologi with the design and implementation of Topologi's suite of XML tools. In mid-2003, Eddie moved back to Sweden, where he continues to work with software engineering and XML-related technologies.
Richard Rose began life at an early age and rapidly started absorbing information, finding that he liked the taste of information relating to computers the best. He has since feasted upon information from the University of Bristol in the United Kingdom, where he earned a BSc with Honors. He lives in Bristol but currently does not work, and he will be returned for store credit as soon as somebody can find the receipt. Richard writes programs for the intellectual challenge. He also turns his hand to system administration and has done the obligatory time in tech support. For fun, he juggles, does close-up magic, and plays the guitar badly. He can also be found on IRC, where he currently is a network operator known as rik on the Open and Free Technology Community (irc.oftc.net).
Michael Smith is a software non-executive living and working in Tokyo, with a particular fondness for RELAX NG and DocBook. He's a member of the DocBook Technical Committee (), is involved with the DocBook Open Repository development team, and is the owner/moderator of the xml-doc mailing list (). In the good old days, he wrote for the xmlhack.com web site in some pretty select company, among whom were Uche Ogbuji, Edd Dumbill, Micah Dubinko, Simon St.Laurent, and Eric van der Vlist.
Simon St.Laurent is an editor with O'Reilly Media, Inc. Prior to that, he'd been a web developer, network administrator, computer book author, and XML troublemaker. He lives in Dryden, New York. His books include XML: A Primer, XML Elements of Style, Building XML Applications, Cookies, and Sharing Bandwidth. He is an occasional contributor to XML.com.
Preface
Extensible Markup Language or XML () appeared as a recommendation of the World Wide Web Consortium () in early 1998. XML is a restricted subset of Standard Generalized Markup Language or SGML (ISO 8879). By some grace, XML has enjoyed considerable popularity and has been almost universally received as an interoperability solution for heterogeneous computer systems. Although not without shortcomings, XML is probably the best thing we have going for us to deal with software interoperability issues, mainly because of its wide acceptance and presence.
Today, you can find XML just about anywhere you find software. To name a few examples:
• OpenOffice's file format [Hack #65] consists of a set of ZIP-archived XML files.
• Ant's build file format [Hack #91] is written in XML, as are Microsoft Visual Studio .NET project files ().
• Mac plist configuration files [Hack #44] are also written in XML.
• Web pages now increasingly use Extensible Hypertext Markup Language (XHTML), an XML version of HTML.
• XML User Interface Language (XUL) is a Mozilla project that allows you to define applications with XML (). Likewise, Extensible Application Markup Language (XAML) is an XML-based language for defining user interfaces for the Avalon framework, part of Microsoft's upcoming release of Windows code-named "Longhorn" ().
XML is by no means a panacea for all the ills of interchange, but it's becoming an increasingly practical option for packaging and moving data in and out of systems or for representing data in a consistent, readable way. And it can be fun to use, too, as many of the hacks in this book demonstrate.
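To make the "ZIP-archived XML files" idea from the OpenOffice example concrete, here is a minimal Python sketch. It is not taken from the book: it builds a tiny archive in memory and then lists the root element of each XML member. The member names (content.xml, meta.xml) and their contents are placeholders chosen for illustration, not a description of the real OpenOffice layout.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_archive():
    """Build a small in-memory ZIP holding two XML files (made-up content)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Hypothetical member names and markup, for illustration only.
        zf.writestr("content.xml", "<doc><p>Hello</p></doc>")
        zf.writestr("meta.xml", "<meta><title>Demo</title></meta>")
    return buf.getvalue()


def root_tags(archive_bytes):
    """Map each .xml member of a ZIP archive to the name of its root element."""
    tags = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        for name in zf.namelist():
            if name.endswith(".xml"):
                # zf.read returns bytes, which ET.fromstring accepts directly.
                tags[name] = ET.fromstring(zf.read(name)).tag
    return tags


tags = root_tags(build_archive())
print(tags)
```

The same root_tags function should work on the bytes of any ZIP file whose members are well-formed XML, since it only relies on the standard zipfile and xml.etree modules.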
The XML specification defines a syntax for creating markup. Markup consists of elements, attributes, and other structures that allow you to label documents and data in a way that can give them meaning that other human beings or software can understand and interpret. Because reliable XML parsers are readily and often freely available in a variety of programming languages, it is relatively easy to integrate XML processing into just about any application.
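As a rough illustration of how little code a freely available parser requires, the following Python sketch reads a small document with the standard library's ElementTree module and pulls out its elements and attributes. The sample document, including the time, hour, and minute element names and the timezone attribute, is invented for this example and is not taken from the book's example files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: one element with an attribute and two children.
DOC = """<time timezone="PST">
  <hour>11</hour>
  <minute>59</minute>
</time>"""


def summarize(xml_text):
    """Parse XML text and report the root name, its attributes, and child text."""
    root = ET.fromstring(xml_text)
    return {
        "root": root.tag,
        "attributes": dict(root.attrib),
        "children": {child.tag: child.text for child in root},
    }


summary = summarize(DOC)
print(summary)
```

If the input is not well-formed, ET.fromstring raises xml.etree.ElementTree.ParseError, which gives a quick well-formedness check as a side effect.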
This book's mission is to give you a running start at doing many of the things that are commonly (and sometimes uncommonly) done with XML. While you'll find beginning, intermediate, and advanced hacks between the covers, this book is not an exhaustive treatment of everything you can do with XML. Instead, it focuses on the mainstream, core tasks found in XML territory. These tasks can be accomplished quickly and usually use downloadable, open source software or software that is available for free trial.
Why XML Hacks?
The term hacking has a bad reputation in the press. They use it to refer to someone who breaks into systems or wreaks havoc with computers as their weapon. Among people who write code, though, the term hack refers to a "quick-and-dirty" solution to a problem, or a clever way to get something done. And the term hacker is taken very much as a compliment, referring to someone as being creative and having the technical chops to get things done. The Hacks series is an attempt to reclaim the word, document the good ways people are hacking, and pass the hacker ethic of creative participation on to the uninitiated. Seeing how others approach systems and problems is often the quickest way to learn about a new technology.
XML Hacks is for folks who like to cobble together a variety of free or low-cost tools and techniques, with XML as the touchstone, to get something practical done. This book is designed to meet the needs of a broad audience: from those who are just cutting their teeth on XML to those who are already familiar with it. Even experts will find new approaches to solving interesting challenges among these hacks; for example, Rick Jelliffe's hack on converting Wiki to XML via SGML [Hack #94]. Because it covers a lot of ground, this book will probably meet some need, no matter at what level you are hacking with XML.
How This Book Is Organized
This book is divided into seven chapters, each of which is briefly described here:
Chapter 1, Looking at XML Documents
Contains a series of introductory hacks, including an overview of what an XML document should look like, how to display an XML document in a browser, how to style an XML document with CSS, and how to use command-line Java applications to process XML.
Chapter 2, Creating XML Documents
Teaches you how to edit XML with a variety of editors, including Vim, Emacs, <oXygen/>, and Microsoft Office 2003 applications. Among other things, shows you how to convert a plain text file to XML with xmlspy, translate CSV to XML, and convert HTML to XHTML with HTML Tidy.
Chapter 3, Transforming XML Documents
Explores many ways that you can use XSLT and other tools to transform XML into CSV, transform an iTunes library (plist) file into HTML, transform XML documents with grep and sed, and generate SVG with XSLT.
Chapter 4, XML Vocabularies
Helps you get acquainted with namespaces and RDDL, and describes how to use common XML vocabularies and frameworks such as XHTML, DocBook, RDDL, and RDF in the form of FOAF.
Chapter 5, Defining XML Vocabularies with Schema Languages
Covers the creation of valid XML using DTDs, XML Schema, RELAX NG, and Schematron. It also explains how to generate schemas from instances, how to generate instances from schemas, and how to convert a schema from one schema language to another.
Chapter 6, RSS and Atom
Teaches you how to subscribe to RSS feeds with news readers; create RSS 0.91, RSS 1.0, RSS 2.0, and Atom documents; and generate RSS from Google queries and with Movable Type templates.
Chapter 7, Advanced XML Hacks
Shows you how to perform XML tasks in an Ant pipeline, how to use Cocoon, and how to process XML documents using DOM, SAX, Genx, and the facilities of C#'s System.Xml namespace, among others.
Conventions Used in This Book
The following is a list of typographical conventions used in this book:
Italic
Used to indicate new terms, URLs, filenames, file extensions, directories, commands and options, and program names, and to highlight comments in examples. For example, a path in the filesystem may appear as C:\Hacks\examples or /usr/mike/hacks/examples.
Constant width
Used to show code examples, XML markup, Java package or C# namespace names, or output from commands.
Constant width bold
Used in examples to show emphasis.
Constant width italic
Used in examples to show text that should be replaced with user-supplied values.
[RETURN]
A carriage return ([RETURN]) at the end of a line of code is used to denote an unnatural line break; that is, you should not enter these as two lines of code, but as one continuous line. Multiple lines are used in these cases due to page-width constraints.
You should pay special attention to notes set apart from the text with the following icons:
This is a tip, suggestion, or general note. It contains useful supplementary information about the topic at hand.
This is a warning or a note of caution.
The thermometer icons, found next to each hack, indicate the relative complexity of the hack:
beginner
moderate
expert
Using Code Examples
This book is here to help you get your job done with XML. In general, you may use the markup, stylesheets, and code in this book in your programs and documentation (all available for download in a ZIP archive from (); ... of the hacks assume that these example files are in place in a working directory). You do not need to contact us for permission unless you're reproducing a significant portion of the code. For example, writing a program that uses several chunks of code from this book does not require permission. However, selling or distributing a CD-ROM of examples from an O'Reilly book does require permission. Answering a question by citing this book and quoting an example does not require permission, but incorporating a significant amount of examples from this book into your product's documentation does require permission.
We appreciate, but do not require, attribution when using code. An attribution usually includes the title, author, publisher, and ISBN. For example: "XML Hacks by Michael Fitzgerald. Copyright 2004 O'Reilly Media, Inc., 0-596-00711-6."
If you feel your use of code examples falls outside fair use or the permission given above, feel free to contact us at ().