XML Hacks
By Michael Fitzgerald

Publisher: O'Reilly
Pub Date: July 2004
ISBN: 0-596-00711-6
Pages: 478


This practical, roll-up-your-sleeves guide distills years of
ingenious XML hacking into a complete set of tips, tricks, and
tools for web developers, system administrators, and programmers
who want to leverage the untapped power of XML. If you want more
than the average XML user--to explore and experiment, discover
clever shortcuts, and show off just a little (and have fun in the
process)--this invaluable book is a must-have.














Copyright

Credits
Author
Contributors

Preface
Why XML Hacks?
How This Book Is Organized
Conventions Used in This Book
Using Code Examples
How to Contact Us
Got a hack?
Acknowledgments


Chapter 1. Looking at XML Documents
Hacks #1-10
Hack 1. Read an XML Document
Hack 2. Display an XML Document in a Web Browser
Hack 3. Apply Style to an XML Document with CSS
Hack 4. Use Character and Entity References
Hack 5. Examine XML Documents in Text Editors
Hack 6. Explore XML Documents in Graphical Editors
Hack 7. Choose Tools for Creating an XML Vocabulary
Hack 8. Test XML Documents Online
Hack 9. Test XML Documents from the Command Line
Hack 10. Run Java Programs that Process XML


Chapter 2. Creating XML Documents
Hacks #11-30
Hack 11. Edit XML Documents with <oXygen/>
Hack 12. Edit XML Documents with Emacs and nXML
Hack 13. Edit XML with Vim
Hack 14. Edit XML Documents with Microsoft Word 2003
Hack 15. Work with XML in Microsoft Excel 2003
Hack 16. Work with XML in Microsoft Access 2003
Hack 17. Convert Microsoft Office Files, Old or New, to XML
Hack 18. Create an XML Document from a Text File with xmlspy
Hack 19. Convert Text to XML with Uphill
Hack 20. Create Well-Formed XML with Minimal Manual Tagging Using an SGML Parser
Hack 21. Create an XML Document from a CSV File
Hack 22. Convert an HTML Document to XHTML with HTML Tidy
Hack 23. Transform Documents with XQuery
Hack 24. Execute an XQuery with Saxon
Hack 25. Include Text and Documents with Entities
Hack 26. Include External Documents with XInclude
Hack 27. Encode XML Documents
Hack 28. Explore XLink and XML
Hack 29. What's the Diff? Diff XML Documents
Hack 30. Look at XML Documents Through the Lens of the XML Information Set


Chapter 3. Transforming XML Documents
Hacks #31-58
Hack 31. Understand the Anatomy of an XSLT Stylesheet
Hack 32. Transform an XML Document with a Command-Line Processor
Hack 33. Transform an XML Document Within a Graphical Editor
Hack 34. Analyze Nodes with TreeViewer
Hack 35. Explore a Document Tree with the xmllint Shell
Hack 36. View Documents as Tables Using Generic CSS or XSLT
Hack 37. Generate an XSLT Identity Stylesheet with Relaxer
Hack 38. Pretty-Print XML Using a Generic Identity Stylesheet and Xalan
Hack 39. Create a Text File from an XML Document
Hack 40. Convert Attributes to Elements and Elements to Attributes
Hack 41. Convert XML to CSV
Hack 42. Create and Process SpreadsheetML
Hack 43. Choose Your Output Format in XSLT
Hack 44. Transform Your iTunes Library File
Hack 45. Generate Multiple Output Documents with XSLT 2.0
Hack 46. Generate XML from MySQL
Hack 47. Generate PDF Documents from XML and CSS
Hack 48. Process XML Documents with XSL-FO and FOP
Hack 49. Process HTML with XSLT Using TagSoup
Hack 50. Build Results with Literal Result and Instruction Elements
Hack 51. Write Push and Pull Stylesheets
Hack 52. Perform Math with XSLT
Hack 53. Transform XML Documents with grep and sed
Hack 54. Generate SVG with XSLT
Hack 55. Dither Scatterplots with XSLT and SVG
Hack 56. Use Lookup Tables with XSLT to Translate FIPS Codes
Hack 57. Grouping in XSLT 1.0 and 2.0
Hack 58. Use EXSLT Extensions


Chapter 4. XML Vocabularies
Hacks #59-67
Hack 59. Use XML Namespaces in an XML Vocabulary
Hack 60. Create an RDDL Document
Hack 61. Create and Validate an XHTML 1.0 Document
Hack 62. Create Books, Technical Manuals, and Papers in XML with DocBook
Hack 63. Create a SOAP 1.2 Document
Hack 64. Identify Yourself with FOAF
Hack 65. Unravel the OpenOffice File Format
Hack 66. Render Graphics with SVG
Hack 67. Use XForms in Your XML Documents


Chapter 5. Defining XML Vocabularies with Schema Languages
Hacks #68-79
Hack 68. Validate an XML Document with a DTD
Hack 69. Validate an XML Document with XML Schema
Hack 70. Validate Multiple Documents Against an XML Schema at Once
Hack 71. Check the Integrity of a W3C Schema
Hack 72. Validate an XML Document with RELAX NG
Hack 73. Create a DTD from an Instance
Hack 74. Create an XML Schema Document from an Instance or DTD
Hack 75. Create a RELAX NG Schema from an Instance
Hack 76. Convert a RELAX NG Schema to XML Schema
Hack 77. Use RELAX NG and Schematron Together to Validate Business Rules
Hack 78. Use RELAX NG to Generate DTD Customizations
Hack 79. Generate Instances Based on Schemas


Chapter 6. RSS and Atom
Hacks #80-90
Hack 80. Subscribe to RSS Feeds
Hack 81. Create an RSS 0.91 Document
Hack 82. Create an RSS 1.0 Document
Hack 83. Create an RSS 2.0 Document
Hack 84. Create an Atom Document
Hack 85. Validate RSS and Atom Documents
Hack 86. Create RSS with XML::RSS
Hack 87. Syndicate Content with Movable Type
Hack 88. Post RSS Headlines on Your Site
Hack 89. Create RSS 0.91 Feeds from Google
Hack 90. Syndicate a List of Books from Amazon with RSS and ASP


Chapter 7. Advanced XML Hacks
Hacks #91-100
Hack 91. Pipeline XML with Ant
Hack 92. Use Elements Instead of Entities to Avoid the "amp Explosion Problem"
Hack 93. Use Cocoon to Create a Well-Formed View of a Web Page, Then Scrape It for Data
Hack 94. From Wiki to XML, Through SGML
Hack 95. Create Well-Formed XML with JavaScript
Hack 96. Inspect and Edit XML Documents with the Document Object Model
Hack 97. Processing XML with SAX
Hack 98. Process XML with C#
Hack 99. Generate Code from XML
Hack 100. Create Well-Formed XML with Genx


Colophon

Index


Copyright © 2004 O'Reilly Media, Inc. All rights reserved.
Printed in the United States of America.
Published by O'Reilly Media, Inc., 1005 Gravenstein Highway
North, Sebastopol, CA 95472.
O'Reilly books may be purchased for educational, business, or
sales promotional use. Online editions are also available for
most titles. For more information, contact our
corporate/institutional sales department: (800) 998-9938.
Nutshell Handbook, the Nutshell Handbook logo, and the O'Reilly
logo are registered trademarks of O'Reilly Media, Inc. The Hacks
series designations, XML Hacks, the image of a socket wrench,
"Hacks 100 Industrial-Strength Tips and Tools," and related trade
dress are trademarks of O'Reilly Media, Inc.
Many of the designations used by manufacturers and sellers to
distinguish their products are claimed as trademarks. Where those
designations appear in this book, and O'Reilly Media, Inc. was
aware of a trademark claim, the designations have been printed in
caps or initial caps.
While every precaution has been taken in the preparation of this
book, the publisher and author assume no responsibility for
errors or omissions, or for damages resulting from the use of the
information contained herein.
Small print: The technologies discussed in this publication, the
limitations on these technologies that technology and content
owners seek to impose, and the laws actually limiting the use of
these technologies are constantly changing. Thus, some of the
hacks described in this publication may not work, may cause
unintended harm to systems on which they are used, or may not be
consistent with applicable user agreements. Your use of these
hacks is at your own risk, and O'Reilly Media, Inc. disclaims
responsibility for any damage or expense resulting from their
use. In any event, you should take care that your use of these
hacks does not violate any applicable laws, including copyright
laws.



Credits
Author
Contributors


Author
Michael Fitzgerald is principal of Wy'east Communications, a
writing, training, and programming consultancy specializing in
XML. In addition to this book, he is the author of Learning XSLT
(O'Reilly), XSL Essentials (Wiley & Sons), and Building B2B
Applications with XML: A Resource Guide (Wiley & Sons). Mike is
the creator of Ox, an open source Java tool for generating brief,
syntax-related documentation at the command line. He was also a
member of the original RELAX NG technical committee at OASIS
(2001-2003). A native of Oregon, Mike now lives with his family
in Mapleton, Utah. He also keeps a technical blog.

Contributors
Timothy Appnel has 13 years of corporate IT and Internet systems
development experience and is the principal of Appnel Internet
Solutions, a technology consultancy specializing in Movable Type
and TypePad systems. In addition to being a technologist, Tim has
a background in publications, which includes cofounding and
managing Oculus Magazine, a free indie music and arts 'zine, for
over seven years. He is an occasional contributor to the O'Reilly
Network and maintains a personal weblog of his thoughts.

Tara Calishain is the editor of the online newsletter
ResearchBuzz and the author or coauthor of several books,
including the bestselling Google Hacks (O'Reilly) and Spidering
Hacks (O'Reilly). A search engine enthusiast for many years, she
began her foray into the world of Perl when Google released its
API in April 2002.
John Cowan is the senior Internet systems developer for Reuters
Health, a very small subsidiary of Reuters, a wire service and
financial news company. He was responsible for Reuters Health's
current news publication system, which distributes about 100
articles per day to about 200 wholesale news customers, mostly in
XML. (Yes, so most of them want HTML and get XHTML. Deal.) John
is a member of the W3C XML Core WG (and the editor of the XML
Infoset and XML 1.1 specifications) and the closed Unicore
mailing list of the Unicode Technical Committee. He also hangs
out on far too many other technical mailing lists, masquerading
as the expert on A for the B mailing list and the expert on B for
the A mailing list. His friends say that he knows at least
something about almost everything, while his enemies say that he
knows far too much about far too much. In his copious spare time,
John constructed and maintains TagSoup, a SAX-compatible Java
parser for ugly, nasty HTML, and the Itsy Bitsy Teeny Weeny
Simple Hypertext DTD, a small subset of XHTML Basic suitable for
adding rich text to otherwise bald and unconvincing document
types (now available in RELAX NG, too). He is interested in
languages (natural, constructed, and computer) and is the author
of The Complete Lojban Language. He is also the current
maintainer of FIGlet, the world's only Unicode rendering engine
that uses ASCII characters instead of pixels.
Leigh Dodds is an application developer and designer specializing
in Java and XML. He currently manages a small engineering team at
Ingenta and is responsible for developing bibliographic content
management and document delivery systems and services. Leigh is
also a freelance author and has contributed numerous articles and
tutorials to xmlhack.com, XML.com, and IBM developerWorks. Leigh
is based in Bath, United Kingdom.
Micah Dubinko is a software engineer who lives in Phoenix,
Arizona, with his wife and child, and works for Verity, Inc. He
is the author of XForms Essentials (O'Reilly), which is also
available online. He also served as an editor and author of the
W3C XForms specification and participated in the XForms effort
beginning in September 1999, nine months before the official
Working Group was chartered. He was awarded CompTIA CDIA
(Certified Document Imaging Architect) certification in January
2001.


Bob DuCharme is the author of XSLT Quickly (Manning
Publications), XML: The Annotated Specification (Prentice Hall),
SGML CD (Prentice Hall), and Operating Systems Handbook (McGraw
Hill). He writes the monthly "Transforming XML" column for
XML.com and has contributed to XML Magazine, XML Journal, IBM
developerWorks, XML Developer, and XML Handbook (Prentice Hall).
A consulting software engineer at LexisNexis, Bob received his BA
in religion from Columbia University and his master's in computer
science from New York University.
Hans Fugal is the author of desert.vim, the number-one rated
color scheme at vim.org. He also wrote rnc.vim, a Vim syntax
highlighting specification for the RELAX NG compact syntax. He is
the author of the gdmxml schema, an XML implementation of the
GENTECH Genealogical Data Model. He has a bachelor's degree in
computer science from Brigham Young University and is preparing
to pursue a PhD in computer science. He is very interested in
computer music and maintains a few computer music-related
packages in Debian. He plays the piano and the organ. He is
presently employed as a system administrator at Wencor West, Inc.
Jason Hunter is the author of Java Servlet Programming (O'Reilly)
and coauthor of Java Enterprise Best Practices (O'Reilly). He's
an Apache Member, and as Apache's representative to the Java
Community Process Executive Committee, he established a landmark
agreement for open source Java. He is publisher of Servlets.com
and XQuery.com, an original contributor to Apache Tomcat, creator
of the com.oreilly.servlet library, and a member of the Expert
Groups responsible for servlet, JSP, JAXP, and XQJ API
development, and he sits on the W3C XQuery Working Group. He
co-created the open source JDOM library to enable optimized Java
and XML integration. He works at Mark Logic, where he has been
working on their XQuery implementation since June 2002.
Rick Jelliffe is CTO of Topologi Pty. Ltd., a company making
XML-related desktop tools, and spends most of his time working on
editors, validators, and publishing-related markup. His current
main standards project is editing an upcoming ISO standard for
the Schematron schema language, which he originally developed. As
well as his work with ISO SC34 and the original XML group at W3C,
Rick was a sporadic member of the W3C Schema Working Group and
the W3C Internationalization Interest Group. He is the author of
XML & SGML Cookbook: Recipes on Structured Information (Prentice
Hall PTR). He led the Chinese XML Now project at Academia Sinica
Computing Centre. He lives in Sydney, Australia, and has an
economics degree from Sydney University.
Sean McGrath is CTO of Propylon, an XML solutions company. He is
an internationally acknowledged authority on XML and related
standards. He served as an invited expert to the W3C's Expert
Group that defined XML in 1998. He is the author of three books
on markup languages published by Prentice Hall and writes the
weekly "e-business in the enterprise" newsletter for ITWorld. He
also writes a blog.


Sean Nolan founded Software Poetry and was the CTO for
drugstore.com, where he was the fifth employee and led the design
and implementation of their award-winning e-commerce systems.
While at drugstore.com, Sean was honored as one of the nation's
Premier 100 IT Leaders for 2001 by Computerworld magazine.
Thomas Passin is a systems engineer with Mitretek Systems, a
nonprofit systems and information engineering company. He
graduated with a BS in physics from the Massachusetts Institute
of Technology, and studied graduate-level physics at the
University of Chicago. He has been active in XML-related work
since 1998. He helped to develop XML versions of message
standards for Advanced Traveler Information Systems, including
translations of the message schemas from ASN.1 to XML. He
developed an XML/XSLT-based questionnaire generation system. He
is also active in the area of Topic Maps, and developed the open
source TM4JScript Javascript topic map engine. Mr. Passin is the
author of Explorer's Guide to the Semantic Web, forthcoming from
Manning in 2004.
Dave Pawson is from Peterborough in the United Kingdom. He has an
aerospace background, and is currently working on web standards
accessibility. In his spare time, he maintains the XSLT FAQ and a
DocBook FAQ. His interest in DSSSL and XSL-FO led to the
publication of the O'Reilly book XSL-FO.
Dean Peters is a graying code-monkey who by day is a
mild-mannered IIS/.NET programmer, but by night becomes a
not-so-evil Linux genius as he develops software and articles for
his blogs.
Eddie Robertsson finished his master's degree in computer science
at the Lund Institute of Technology in Sweden in 1999. Shortly
thereafter he moved to Sydney, Australia for employment at
Allette Systems, where he worked as an XML developer and trainer
specializing in XML schema languages. During his last few years
in Sydney, Eddie worked very closely with Rick Jelliffe and
Topologi with the design and implementation of Topologi's suite
of XML tools. In mid-2003, Eddie moved back to Sweden, where he
continues to work with software engineering and XML-related
technologies.
Richard Rose began life at an early age and rapidly started
absorbing information, finding that he liked the taste of
information relating to computers the best. He has since feasted
upon information from the University of Bristol in the United
Kingdom, where he earned a BSc with Honors. He lives in Bristol
but currently does not work, and he will be returned for store
credit as soon as somebody can find the receipt. Richard writes
programs for the intellectual challenge. He also turns his hand
to system administration and has done the obligatory time in tech
support. For fun, he juggles, does close-up magic, and plays the
guitar badly. He can also be found on IRC, where he currently is
a network operator known as rik on the Open and Free Technology
Community (irc.oftc.net).
Michael Smith is a software non-executive living and working in
Tokyo, with a particular fondness for RELAX NG and DocBook. He's
a member of the DocBook Technical Committee, is involved with the
DocBook Open Repository development team, and is the
owner/moderator of the xml-doc mailing list. In the good old
days, he wrote for the xmlhack.com website in some pretty select
company, among whom were Uche Ogbuji, Edd Dumbill, Micah Dubinko,
Simon St. Laurent, and Eric van der Vlist.
Simon St. Laurent is an editor with O'Reilly Media, Inc. Prior to
that, he'd been a web developer, network administrator, computer
book author, and XML troublemaker. He lives in Dryden, New York.
His books include XML: A Primer, XML Elements of Style, Building
XML Applications, Cookies, and Sharing Bandwidth. He is an
occasional contributor to XML.com.


Preface
Extensible Markup Language, or XML, appeared as a recommendation
of the World Wide Web Consortium in early 1998. XML is a
restricted subset of Standard Generalized Markup Language, or
SGML (ISO/IEC 8879). By some grace, XML has enjoyed considerable
popularity and has been almost universally received as an
interoperability solution for heterogeneous computer systems.
Although not without shortcomings, XML is probably the best thing
we have going for us to deal with software interoperability
issues, mainly because of its wide acceptance and presence.
Today, you can find XML just about anywhere you find software. To
name a few examples:
- OpenOffice's file format [Hack #65] consists of a set of
  ZIP-archived XML files.
- Ant's build file format [Hack #91] is written in XML, as are
  Microsoft Visual Studio .NET project files.
- Mac plist configuration files [Hack #44] are also written in
  XML.
- Web pages now increasingly use Extensible Hypertext Markup
  Language (XHTML), an XML version of HTML.
- XML User Interface Language (XUL) is a Mozilla project that
  allows you to define applications with XML. Likewise,
  Extensible Application Markup Language (XAML) is an XML-based
  language for defining user interfaces for the Avalon framework,
  part of Microsoft's upcoming release of Windows code-named
  "Longhorn."
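As a rough illustration of the first example in the list above (a document format made of ZIP-archived XML files), the following Python sketch builds a tiny stand-in archive in memory and reports the root element of each XML member. The archive contents and the helper names (build_sample_archive, root_elements) are assumptions made up for this sketch; it is not a real OpenOffice file, and it is not how Hack #65 proceeds.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_sample_archive() -> bytes:
    """Build a tiny in-memory ZIP with one XML member (a stand-in, not a real office file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # Hypothetical member content, chosen only for illustration.
        archive.writestr("content.xml", "<document><title>Sample</title></document>")
    return buffer.getvalue()


def root_elements(archive_bytes: bytes) -> dict:
    """Map each .xml member of a ZIP archive to the tag name of its root element."""
    roots = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith(".xml"):
                roots[name] = ET.fromstring(archive.read(name)).tag
    return roots


print(root_elements(build_sample_archive()))
```

To try the same idea on a file from disk, you could pass its bytes to root_elements; members that are not well-formed XML would raise a parse error, which this sketch does not handle.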
XML is by no means a panacea for all the ills of interchange, but
it's becoming an increasingly practical option for packaging and
moving data in and out of systems or for representing data in a
consistent, readable way. And it can be fun to use, too, as many
of the hacks in this book demonstrate.
The XML specification defines a syntax for creating markup.
Markup consists of elements, attributes, and other structures
that allow you to label documents and data in a way that can give
them meaning that other human beings or software can understand
and interpret. Because reliable XML parsers are readily and often
freely available in a variety of programming languages, it is
relatively easy to integrate XML processing into just about any
application.
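To make the idea of elements and attributes concrete, here is a minimal sketch using the XML parser in Python's standard library. The sample document and the describe helper are invented for this illustration and are not taken from the book's example files.

```python
import xml.etree.ElementTree as ET

# A made-up document: one element with an attribute and two child elements.
SAMPLE = """<time timezone="PST">
  <hour>11</hour>
  <minute>59</minute>
</time>"""


def describe(xml_text: str):
    """Return the root tag, its attributes, and (tag, text) pairs for its children."""
    root = ET.fromstring(xml_text)
    children = [(child.tag, child.text) for child in root]
    return root.tag, dict(root.attrib), children


tag, attributes, children = describe(SAMPLE)
print(tag, attributes)
for child_tag, child_text in children:
    print(child_tag, child_text)
```

The parser does the work of checking well-formedness and exposing the labels; the application only has to walk the resulting tree.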
This book's mission is to give you a running start at doing many
of the things that are commonly (and sometimes uncommonly) done
with XML. While you'll find beginning, intermediate, and advanced
hacks between the covers, this book is not an exhaustive
treatment of everything you can do with XML. Instead, it focuses
on the mainstream, core tasks found in XML territory. These tasks
can be accomplished quickly and usually use downloadable, open
source software or software that is available for free trial.


Why XML Hacks?
The term hacking has a bad reputation in the press. They use it
to refer to someone who breaks into systems or wreaks havoc with
computers as their weapon. Among people who write code, though,
the term hack refers to a "quick-and-dirty" solution to a
problem, or a clever way to get something done. And the term
hacker is taken very much as a compliment, referring to someone
as being creative and having the technical chops to get things
done. The Hacks series is an attempt to reclaim the word,
document the good ways people are hacking, and pass the hacker
ethic of creative participation on to the uninitiated. Seeing how
others approach systems and problems is often the quickest way to
learn about a new technology.
XML Hacks is for folks who like to cobble together a variety of
free or low-cost tools and techniques, with XML as the
touchstone, to get something practical done. This book is
designed to meet the needs of a broad audience: from those who
are just cutting their teeth on XML to those who are already
familiar with it. Even experts will find new approaches to
solving interesting challenges among these hacks; for example,
Rick Jelliffe's hack on converting Wiki to XML via SGML
[Hack #94]. Because it covers a lot of ground, this book will
probably meet some need, no matter at what level you are hacking
with XML.


How This Book Is Organized
This book is divided into seven chapters, each of which is
briefly described here:

Chapter 1, Looking at XML Documents
Contains a series of introductory hacks, including an overview of
what an XML document should look like, how to display an XML
document in a browser, how to style an XML document with CSS, and
how to use command-line Java applications to process XML.

Chapter 2, Creating XML Documents
Teaches you how to edit XML with a variety of editors, including
Vim, Emacs, <oXygen/>, and Microsoft Office 2003 applications.
Among other things, shows you how to convert a plain text file to
XML with xmlspy, translate CSV to XML, and convert HTML to XHTML
with HTML Tidy.

Chapter 3, Transforming XML Documents
Explores many ways that you can use XSLT and other tools to
transform XML into CSV, transform an iTunes library (plist) file
into HTML, transform XML documents with grep and sed, and
generate SVG with XSLT.

Chapter 4, XML Vocabularies
Helps you get acquainted with namespaces and RDDL, and describes
how to use common XML vocabularies and frameworks such as XHTML,
DocBook, RDDL, and RDF in the form of FOAF.

Chapter 5, Defining XML Vocabularies with Schema Languages
Covers the creation of valid XML using DTDs, XML Schema, RELAX
NG, and Schematron. It also explains how to generate schemas from
instances, how to generate instances from schemas, and how to
convert a schema from one schema language to another.

Chapter 6, RSS and Atom
Teaches you how to subscribe to RSS feeds with news readers;
create RSS 0.91, RSS 1.0, RSS 2.0, and Atom documents; and
generate RSS from Google queries and with Movable Type templates.

Chapter 7, Advanced XML Hacks
Shows you how to perform XML tasks in an Ant pipeline, how to use
Cocoon, and how to process XML documents using DOM, SAX, Genx,
and the facilities of C#'s System.Xml namespace, among others.


Conventions Used in This Book
The following is a list of typographical conventions used in this
book:

Italic
Used to indicate new terms, URLs, filenames, file extensions,
directories, commands and options, and program names, and to
highlight comments in examples. For example, a path in the
filesystem may appear as C:\Hacks\examples or
/usr/mike/hacks/examples.

Constant width
Used to show code examples, XML markup, Java package or C#
namespace names, or output from commands.

Constant width bold
Used in examples to show emphasis.

Constant width italic
Used in examples to show text that should be replaced with
user-supplied values.

[RETURN]
A carriage return ([RETURN]) at the end of a line of code is used
to denote an unnatural line break; that is, you should not enter
these as two lines of code, but as one continuous line. Multiple
lines are used in these cases due to page-width constraints.

You should pay special attention to notes set apart from the text
with the following icons:

This is a tip, suggestion, or general note. It contains useful
supplementary information about the topic at hand.

This is a warning or a note of caution.

The thermometer icons, found next to each hack, indicate the
relative complexity of the hack:

beginner
moderate
expert

Using Code Examples
This book is here to help you get your job done with XML. In
general, you may use the markup, stylesheets, and code in this
book in your programs and documentation (all available for
download in a ZIP archive; the hacks generally assume that these
example files are in place in a working directory). You do not
need to contact us for permission unless you're reproducing a
significant portion of the code. For example, writing a program
that uses several chunks of code from this book does not require
permission. However, selling or distributing a CD-ROM of examples
from an O'Reilly book does require permission. Answering a
question by citing this book and quoting an example does not
require permission, but incorporating a significant amount of
examples from this book into your product's documentation does
require permission.
We appreciate, but do not require, attribution when using code.
An attribution usually includes the title, author, publisher, and
ISBN. For example: "XML Hacks by Michael Fitzgerald. Copyright
2004 O'Reilly Media, Inc., 0-596-00711-6."
If you feel your use of code examples falls outside fair use or
the permission given above, feel free to contact us.

