Building Tag Clouds in Perl and PHP
By Jim Bumgardner
Publisher: O'Reilly
Pub Date: May 2006
Print ISBN-10: 0-596-52794-2
Print ISBN-13: 978-0-59-652794-5
Pages: 48

Tag clouds are everywhere on the web these days. First popularized by
the web sites Flickr, Technorati, and del.icio.us, these amorphous
clumps of words now appear on a slew of web sites as visual evidence of
their membership in the elite corps of "Web 2.0." This PDF analyzes
what is and isn't a tag cloud, offers design tips for using them
effectively, and then goes on to show how to collect tags and display
them in the tag cloud format. Scripts are provided in Perl and PHP.

Yes, some have said tag clouds are a fad. But as you will see, tag
clouds, when used properly, have real merits. More importantly, the
skills you learn in making your own tag clouds enable you to make other
interesting kinds of interfaces that will outlast the mercurial fads of
this year or the next.

Table of Contents
Copyright
Building Tag Clouds in Perl and PHP
Tag Clouds: Ephemeral or Enduring?
Weighted Lists
Section 1.1. Creating Weighted Lists
Section 1.2. Tag Cloud Properties
Section 1.3. The Utility of Tag Clouds
Some History
Design Tips for Building Tag Clouds
Section 4.1. Choose the Right Language
Section 4.2. Make Your Tag Clouds Visible to Search Engines
Section 4.3. Frequency Sorting
Section 4.4. Avoid Random Mappings
Section 4.5. Make Tag Clouds Relevant to Your Users
Section 4.6. Try Different Mappings
Making Tag Clouds in Perl
Section 5.1. Collecting Tags
Section 5.2. Collecting Genesis Words in Perl
Section 5.3. Collecting del.icio.us Tags in Perl
Section 5.4. Displaying Tags In Perl Using HTML::TagCloud
Section 5.5. Displaying Tags In Perl Using Your Own Code
Section 5.6. Magnifying the Long Tail (Inverse Power Mapping in Perl)
Making Tag Clouds in PHP
Section 6.1. Collecting Tags
Section 6.2. Collecting Genesis Words in PHP
Section 6.3. Collecting del.icio.us Tags in PHP
Section 6.4. Display Tags in PHP
Section 6.5. Magnifying the Long Tail (Inverse Power Mapping in PHP)
Conclusion

Copyright
Building Tag Clouds with Perl and PHP, by Jim Bumgardner
Copyright © 2006 O'Reilly Media, Inc. All rights reserved.
Not for redistribution without permission from O'Reilly Media, Inc.
ISBN: 0596527942

Tag Clouds: Ephemeral or Enduring?

If you're reading this, you've probably seen a tag cloud (Figure 1) as
you've browsed the Web. In this article, I'm going to provide a little
analysis and history of tag clouds, and then get on to more important
matters: I'll demonstrate how to create your own tag clouds in Perl and
PHP.

Tag clouds are a current fashion. But in April of 2005, web design guru
Jeffrey Zeldman decried their faddishness in his headline, "Tag Clouds
Are the New Mullets," comparing them to the once-popular haircut that
has become a fashion joke. And this was before they really started to
catch on.

But jaded criticism is a common side effect of sudden ubiquity, and
Zeldman also praised the brilliance of the idea. And as I have said, I
will show how tag clouds, when used properly, have real and lasting
merits.

Note: All of the scripts in this article can be downloaded from
O'Reilly's web site.

Figure 1. A tag cloud from Flickr

Weighted Lists

So, what is a tag cloud? A tag cloud is a specific kind of weighted
list. For lack of a standard working definition of weighted list, I'm
going to make one up.

Weighted list
    n. A list of words or phrases, in which one or more visual features
    in the list (such as font size) are correlated to some underlying
    data.

While tag clouds are a specific type of weighted list, not all weighted
lists are tag clouds. For example, the list of cities at the popular
craigslist web site (Figure 2) is a weighted list because font size is
correlated with popularity, but it lacks the random appearance of a tag
cloud, due to the arrangement of the cities in a matrix.

Figure 2. Weighted cities list from craigslist

Another kind of weighted list, one that's even more distant from tag
clouds, is that of the statistically improbable phrases (SIPs) and
capitalized phrases (CAPs) lists provided by Amazon.com (Figure 3). In
the SIP list, word order correlates to the improbability of the phrase,
and in the CAP list, to the frequency with which the phrase appears in
the book.

Figure 3. Weighted phrase lists from Amazon.com

1.1. Creating Weighted Lists

There are lots of ways to make weighted lists. Given any list of words
or phrases, there are a handful of visual features that you can choose
to correlate with underlying data:

1.1.1. A: Visual Features

    Font size
    Word order
    Word color
    Word shape (typeface and style)

The list of underlying data you might map these features to is much
larger, but here are a few possibilities:

1.1.2. B: Underlying Data

    Quantity
    Lexical order
    Subject
    Location
    Time

To make a weighted list, take one of the items from column A and
correlate it to one of the items in column B (and repeat, if you like,
with different items).
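
The recipe above can be sketched in a few lines of Perl. This is only
an illustration, not one of the article's downloadable scripts: the tag
names, the counts, the font_size helper, and the 12 to 36 pixel range
are all assumptions made up for the example. It pairs font size with
quantity (using a simple linear scale) and word order with lexical
order.

```perl
use strict;
use warnings;

# Hypothetical tag counts; names and numbers are invented for illustration.
my %count = (wedding => 120, london => 80, japan => 60, cat => 12);

# Column A "font size" mapped to column B "quantity": scale a count
# linearly onto a pixel range and round to the nearest whole pixel.
sub font_size {
    my ($n, $min, $max, $min_px, $max_px) = @_;
    return $min_px if $max == $min;    # all counts equal: avoid dividing by zero
    return int($min_px + ($n - $min) * ($max_px - $min_px) / ($max - $min) + 0.5);
}

# Smallest and largest counts in the data set.
my ($min, $max) = (sort { $a <=> $b } values %count)[0, -1];

# Column A "word order" mapped to column B "lexical order".
for my $tag (sort keys %count) {
    printf "%-8s %3d -> %dpx\n",
        $tag, $count{$tag}, font_size($count{$tag}, $min, $max, 12, 36);
}
```

With these made-up counts, the least-used tag comes out at 12px and the
most-used at 36px. A linear scale is just the simplest choice of
mapping; other mappings are possible.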

Tag clouds are just one kind of weighted list. There are many different
implementations of tag clouds, and they do not all share the same
mappings, but almost all of them tend to associate font size with
quantity. For example, the weighted lists at Flickr have the following
mappings:

    A. Visual Features    B. Underlying Data
    Font size             Quantity
    Word order            Lexical order
    Word color            Blue
    Word shape            Sans serif

Weighted lists on other web sites differ in varying degrees from
Flickr's basic design, but the more closely they follow it, the more
likely they will be described as "tag clouds" rather than as "weighted
lists" or "lists." The tag cloud on the web site 43 Things (Figure 4)
has the following mappings:

    A. Visual Features    B. Underlying Data
    Font size             Quantity
    Word order            Random
    Word color            Black with beige background
    Typeface              Sans serif

Figure 4. Tag cloud for 43 Things

1.2. Tag Cloud Properties

Tag clouds generally have the following additional properties:

    The words are arranged in a continuous list, rather than a table.
    The order of the words is uncorrelated to tag frequency; for
    example, they might be listed alphabetically or randomly.

    The words represent tags, or community-created metadata. This
    metadata often follows power laws: there are few popular items, and
    many more unpopular items.

    The tags are links that navigate to the tagged content.

The first property gives tag clouds their cloudy or amorphous
appearance. They have a simple beauty that is more attractive than a
grid.

The second two properties give tag clouds a dual function. They
function not only as a graph of interesting data, but also as a
navigation interface to user-generated content (or what Derek Powazek
calls "authentic media"). In other words, tag clouds are both something
to look at and something to click on.
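
To make these properties concrete, here is a minimal sketch that emits
a tag cloud as HTML: one continuous run of words, in alphabetical
rather than frequency order, with each tag linking to its content. The
cloud_html helper, the /tags/ URL pattern, and the precomputed pixel
sizes are hypothetical, chosen only for illustration; a real script
would also need to escape tag text before placing it in HTML or a URL.

```perl
use strict;
use warnings;

# Hypothetical tags with precomputed font sizes in pixels.
my %size = (japan => 23, london => 27, wedding => 36);

# Build one continuous run of links (no table), ordered alphabetically
# so that word order is uncorrelated to tag frequency.
# The /tags/NAME URL pattern is an assumption for this example.
sub cloud_html {
    my ($sizes) = @_;
    my @links;
    for my $tag (sort keys %$sizes) {
        push @links,
            sprintf '<a href="/tags/%s" style="font-size:%dpx">%s</a>',
            $tag, $sizes->{$tag}, $tag;
    }
    return join ' ', @links;    # separated only by spaces, so the browser wraps them
}

print cloud_html(\%size), "\n";
```

Because the links are separated only by spaces, the browser flows them
like ordinary text, which is what produces the cloudy appearance.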

1.3. The Utility of Tag Clouds

While you can click on tag clouds, you can also just look at them to
get a quick reading of a web site's zeitgeist. Looking at the Flickr
tag cloud in Figure 1, you can see that wedding photos are to be found
in large quantities, and that the site has a lot of photos taken in
London and Japan (perhaps at weddings?). Looking at 43 Things (Figure
4), you can see that a lot of people want to get a tattoo. The list at
43 Things is a randomized selection from a much larger list, so if you
refresh the page you'll get different winners such as "buy a house,"
"write a book," and "be happy."

The dual nature of tag clouds comes at the expense of a design
trade-off. There are more effective ways to navigate. In general,
"browsing" interfaces are not as efficient for finding stuff as
searching (and tag clouds are usually accompanied by a standard-issue
search box, which sees more use). But browsing and searching are two
different activities that serve different needs. The dynamic way that
tag clouds show popular lists is a remarkably effective way to browse.

There are also more accurate ways to graph tag popularity. Consider the
following lists, which show the most common words in the book of
Genesis. You could provide tags in a table with actual numbers (Figure
5), or in a bar graph (Figure 6).

Figure 5. Word frequency list
Figure 6. Word frequency bar graph

These methods both provide an unnecessary increase in accuracy at the
expense of a great loss in visual real estate (especially the bar
graph). Unless you're into biblical numerology, you don't really need
to know that the name "Esau" is mentioned exactly 58 times. You just
want to get a general sense of what is popular or frequent. Because tag
clouds use the words themselves to describe the data (Figure 7), they
can provide the essential information for a larger number of words in a
much smaller space.

Figure 7. Word frequency tag cloud
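
For a sense of where numbers like those in a word frequency list come
from, here is a minimal word-counting sketch. It is not the article's
own Genesis script; the word_counts helper is hypothetical, and the
input is a single sample verse rather than the whole book. It prints
the kind of exact table that a tag cloud deliberately trades away.

```perl
use strict;
use warnings;

# Count how often each word occurs in a string, ignoring case.
# Returns a reference to a hash of word => count.
sub word_counts {
    my ($text) = @_;
    my %count;
    $count{ lc $1 }++ while $text =~ /([A-Za-z']+)/g;
    return \%count;
}

# A single sample verse stands in for the full text.
my $sample = "And God said, Let there be light: and there was light.";
my $counts = word_counts($sample);

# Print a frequency table: most frequent first, ties in alphabetical order.
for my $word (sort { $counts->{$b} <=> $counts->{$a} || $a cmp $b } keys %$counts) {
    printf "%-6s %d\n", $word, $counts->{$word};
}
```

The same hash of counts could feed a font-size mapping instead of a
table, which is the substitution the tag cloud makes.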

If my own experience is typical, tag clouds are looked at more
frequently than they are clicked on. Generally, I only click on tag
clouds when they correspond very closely to my actual interests at the
time. However, their function as a measurement of zeitgeist is quite
useful by itself.

Tag clouds have another, less obvious function, along with being
something to look at and something to click on: they effectively
describe the nature of a web site to search engines like Google. In
static web sites, people use the <meta description> and
<meta keywords> tags to describe the content of the web site to search
engines. But in sites like Flickr, which consist primarily of
user-generated content, you can't predict what the principal themes
will be tomorrow or next month. Tag clouds solve this problem by
providing a running meter of the important items on a site. Thus, they
can dynamically boost search-engine rankings for those tags. And if the
search engine pays attention to font size (and some of them do), so
much the better!

Some History

Flickr, a photography-sharing web site that caters to bloggers, was the
first web site to use something called a tag cloud. However, tag clouds
really have their roots in the blogging community. Bloggers have a need
to organize the large amounts of material they constantly churn out,
and an excellent communications medium to propagate new and interesting
methods.

Flickr's tag cloud idea was likely inspired (directly or indirectly) by
an older blog plugin called Zeitgeist (Figure 8), written by Jim
Flanagan.

Jim provided this story when I asked him about it:

    In 1997, when I was working at Brookhaven National Lab in Long
    Island NY, the Web was becoming popular enough so that everybody
    had to have a web page, and I wanted somehow to rebel against the
    canonical, hierarchical bulleted list of links. So I wrote a Perl
    CGI that would take a small database of links and present them on
    the page in varying colors and sizes. The color and size were
    selected randomly so different things would cycle into your
    attention each time you loaded the page.

    Much later, when I got into blogging, I fell into the narcissistic
    practice of checking my blog referral logs to see what was linking
    to me. I developed several personal "narcissurfing" tools, and
    noticed that the Google and Yahoo searches that led to my site were
    often very amusing. In an attempt to build a page to share the
    search information with my readers, I fell back to the
    random-colored links approach, except that this time, the number of
    hits from a certain search term controlled the size.

    After a while, several bloggers asked for the code, and I cleaned
    it up a bit and sent it along. It's still available.

Many bloggers use the word "zeitgeist" to mean a weighted word list in
the style of Jim's plugin, as in Figure 8.

Figure 8. Jim Flanagan's Zeitgeist plugin in action

If you look at Jim's code (or Figure 8), you'll see that it has the
following mappings:

    A. Visual Features    B. Underlying Data
    Font size             Quantity
    Word order            Random
    Word color            Random
    Typeface              Default

The team at Flickr (Stewart Butterfield, Cal Henderson, and George
Oates) implemented the first tag clouds at Flickr using Zeitgeist-like
weighted lists as inspiration. The Flickr tag clouds have a few
fundamental differences from the word lists produced by Zeitgeist:

    They represent tags rather than search engine phrases, so the data
    being shown is actively generated by the site's community within
    the site, rather than gathered from the site's server logs.

    They do not use random word order (although many other tag clouds
    do). The alphabetical word order provides an additional way to
    browse the list, while still giving the list a random appearance.

Flickr also gave their tag clouds a more polished design that many
other sites have emulated. They chose an attractive font, a single
color (rather than a random assortment of colors, which adds visual
complexity but no additional information), and they kept the lists of
words relatively short, rather than allowing them to go on for pages
and pages, as many Zeitgeist-based pages do.