
Building Tag Clouds in Perl and PHP
By Jim Bumgardner
Publisher: O'Reilly
Pub Date: May 2006
Print ISBN-10: 0-596-52794-2
Print ISBN-13: 978-0-59-652794-5
Pages: 48


Tag clouds are everywhere on the web these days. First popularized by
the web sites Flickr, Technorati, and del.icio.us, these amorphous
clumps of words now appear on a slew of web sites as visual evidence of
their membership in the elite corps of "Web 2.0." This PDF analyzes what
is and isn't a tag cloud, offers design tips for using them effectively,
and then goes on to show how to collect tags and display them in the tag
cloud format. Scripts are provided in Perl and PHP.

Yes, some have said tag clouds are a fad. But as you will see, tag
clouds, when used properly, have real merits. More importantly, the
skills you learn in making your own tag clouds enable you to make other
interesting kinds of interfaces that will outlast the mercurial fads of
this year or the next.





Table of Contents

Copyright
Building Tag Clouds in Perl and PHP
Tag Clouds: Ephemeral or Enduring?
Weighted Lists
Section 1.1. Creating Weighted Lists
Section 1.2. Tag Cloud Properties
Section 1.3. The Utility of Tag Clouds
Some History
Design Tips for Building Tag Clouds
Section 4.1. Choose the Right Language
Section 4.2. Make Your Tag Clouds Visible to Search Engines
Section 4.3. Frequency Sorting
Section 4.4. Avoid Random Mappings
Section 4.5. Make Tag Clouds Relevant to Your Users
Section 4.6. Try Different Mappings
Making Tag Clouds in Perl
Section 5.1. Collecting Tags
Section 5.2. Collecting Genesis Words in Perl
Section 5.3. Collecting del.icio.us Tags in Perl
Section 5.4. Displaying Tags in Perl Using HTML::TagCloud
Section 5.5. Displaying Tags in Perl Using Your Own Code
Section 5.6. Magnifying the Long Tail (Inverse Power Mapping in Perl)
Making Tag Clouds in PHP
Section 6.1. Collecting Tags
Section 6.2. Collecting Genesis Words in PHP
Section 6.3. Collecting del.icio.us Tags in PHP
Section 6.4. Display Tags in PHP
Section 6.5. Magnifying the Long Tail (Inverse Power Mapping in PHP)
Conclusion



Copyright
Building Tag Clouds with Perl and PHP, by Jim Bumgardner
Copyright © 2006 O'Reilly Media, Inc. All rights reserved.
Not for redistribution without permission from O'Reilly Media, Inc.
ISBN: 0596527942








Tag Clouds: Ephemeral or Enduring?
If you're reading this, you've probably seen a tag cloud (Figure 1) as
you've browsed the Web. In this article, I'm going to provide a little
analysis and history of tag clouds, and then get on to more important
matters: I'll demonstrate how to create your own tag clouds in Perl and
PHP.

Tag clouds are a current fashion. But in April of 2005, web design guru
Jeffrey Zeldman decried their faddishness in his headline, "Tag Clouds
Are the New Mullets," comparing them to the once-popular haircut that
has become a fashion joke. And this was before they really started to
catch on.

But jaded criticism is a common side effect of sudden ubiquity, and
Zeldman also praised the brilliance of the idea. And as I have said, I
will show how tag clouds, when used properly, have real and lasting
merits.

Note: All of the scripts in this article can be downloaded from
O'Reilly's web site.

Figure 1. A tag cloud from Flickr



Weighted Lists
So, what is a tag cloud? A tag cloud is a specific kind of weighted
list. For lack of a standard working definition of weighted list, I'm
going to make one up.

Weighted list
n. A list of words or phrases, in which one or more visual features in
the list (such as font size) are correlated to some underlying data.

While tag clouds are a specific type of weighted list, not all weighted
lists are tag clouds. For example, the list of cities at the popular
craigslist web site (Figure 2) is a weighted list because font size is
correlated with popularity, but it lacks the random appearance of a tag
cloud, due to the arrangement of the cities in a matrix.

Figure 2. Weighted cities list from craigslist


Another kind of weighted list, one that's even more distant from tag
clouds, is that of the statistically improbable phrases (SIPs) and
capitalized phrases (CAPs) lists provided by Amazon.com (Figure 3). In
the SIP list, word order correlates to the improbability of the phrase,
and in the CAP list, to the frequency with which the phrase appears in
the book.

Figure 3. Weighted phrase lists from Amazon.com


1.1. Creating Weighted Lists
There are lots of ways to make weighted lists. Given any list of words
or phrases, there are a handful of visual features that you can choose
to correlate with underlying data:

1.1.1. A: Visual Features
Font size
Word order
Word color
Word shape (typeface and style)

The kinds of underlying data you might correlate or map these features
to make up a much larger list, but here are a few possible things you
might want to map:

1.1.2. B: Underlying Data
Quantity
Lexical order
Subject
Location
Time

To make a weighted list, take one of the items from column A and
correlate it to one of the items in column B (and repeat, if you like,
with different items).
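
As a minimal sketch of one such pairing, the Perl below correlates font
size (column A) with quantity (column B) by scaling each count linearly
into a pixel range. The function name font_sizes, the default 12 to 36
pixel range, and the sample counts are assumptions made up for
illustration; linear scaling is only one of several possible mappings.

```perl
use strict;
use warnings;

# Map each tag's count (column B: quantity) to a font size in pixels
# (column A: font size) by linear interpolation between $min_px and
# $max_px. The name and the default range are illustrative assumptions.
sub font_sizes {
    my ($counts, $min_px, $max_px) = @_;
    $min_px = 12 unless defined $min_px;
    $max_px = 36 unless defined $max_px;

    my @values = sort { $a <=> $b } values %$counts;
    return {} unless @values;
    my ($low, $high) = @values[0, -1];
    my $spread = ($high - $low) || 1;   # avoid dividing by zero when all counts match

    my %size;
    for my $tag (keys %$counts) {
        my $ratio = ($counts->{$tag} - $low) / $spread;
        $size{$tag} = int($min_px + $ratio * ($max_px - $min_px) + 0.5);
    }
    return \%size;
}

# Made-up sample counts, not real Flickr data.
my $sizes = font_sizes({ wedding => 100, london => 60, japan => 20 });
printf "%-8s %dpx\n", $_, $sizes->{$_} for sort keys %$sizes;
```

With these sample counts the least-used tag (japan) gets 12px, the
most-used (wedding) gets 36px, and london falls halfway at 24px.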
Tag clouds are just one kind of weighted list. There are many different
implementations of tag clouds, and they do not all share the same
mappings, but almost all of them tend to associate font size with
quantity. For example, the weighted lists at Flickr have the following
mappings:

A. Visual Features    B. Underlying Data
Font size             Quantity
Word order            Lexical order
Word color            Blue
Word shape            Sans serif

Weighted lists on other web sites differ in varying degrees from
Flickr's basic design, but the more closely they follow it, the more
likely they will be described as "tag clouds" rather than as "weighted
lists" or "lists." The tag cloud on the web site 43 Things (Figure 4)
has the following mappings:

A. Visual Features    B. Underlying Data
Font size             Quantity
Word order            Random
Word color            Black with beige background
Typeface              Sans serif

Figure 4. Tag cloud for 43 Things





1.2. Tag Cloud Properties

Tag clouds generally have the following additional properties:

• The words are arranged in a continuous list, rather than a table. The
  order of the words is uncorrelated to tag frequency; for example, they
  might be listed alphabetically or randomly.
• The words represent tags, or community-created metadata. This metadata
  often follows power laws: there are few popular items, and many more
  unpopular items.
• The tags are links that navigate to the tagged content.

The first property gives tag clouds their cloudy or amorphous
appearance. They have a simple beauty that is more attractive than a
grid.

The second and third properties give tag clouds a dual function. They
function not only as a graph of interesting data, but also as a
navigation interface to user-generated content (or what Derek Powazek
calls "authentic media"). In other words, tag clouds are both something
to look at and something to click on.
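
A rough Perl sketch of these properties follows: the tags are emitted as
one continuous run of links rather than a table, they are ordered
alphabetically so that order is unrelated to frequency, and each tag
links to its content. The /tags/ URL prefix, the tagcloud class name,
the function names, and the sample counts are all hypothetical, and the
linear size scaling is just one possible choice.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML text and attributes.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Build a tag cloud as one continuous run of links (no table).
# Tags are listed alphabetically, so word order is unrelated to
# frequency; font size alone carries the count.
# The /tags/ URL prefix is a hypothetical example.
sub tag_cloud_html {
    my ($counts, $min_px, $max_px) = @_;
    my @sorted = sort { $a <=> $b } values %$counts;
    return '' unless @sorted;
    my ($low, $high) = @sorted[0, -1];
    my $spread = ($high - $low) || 1;   # all counts equal: use the smallest size

    my @links;
    for my $tag (sort { lc $a cmp lc $b } keys %$counts) {
        my $ratio = ($counts->{$tag} - $low) / $spread;
        my $px    = int($min_px + $ratio * ($max_px - $min_px) + 0.5);
        # Percent-encode anything outside the unreserved URL characters.
        (my $path = $tag) =~ s/([^A-Za-z0-9_.~-])/sprintf '%%%02X', ord $1/ge;
        push @links, sprintf '<a href="/tags/%s" style="font-size:%dpx">%s</a>',
            $path, $px, escape_html($tag);
    }
    return '<div class="tagcloud">' . join(' ', @links) . '</div>';
}

# Made-up sample counts, not real Flickr data.
print tag_cloud_html({ wedding => 100, london => 60, japan => 20 }, 12, 36), "\n";
```

Because the links are joined with plain spaces inside a single block
element, the browser wraps them like ordinary text, which is what gives
the list its amorphous shape.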




1.3. The Utility of Tag Clouds
While you can click on tag clouds, you can also just look at them to get
a quick reading of a web site's zeitgeist. Looking at the Flickr tag
cloud in Figure 1, you can see that wedding photos are to be found in
large quantities, and that Flickr has a lot of photos taken in London
and Japan (perhaps at weddings?). Looking at 43 Things (Figure 4), you
can see that a lot of people want to get a tattoo. The list at 43 Things
is a randomized selection from a much larger list, so if you refresh the
page you'll get different winners such as "buy a house," "write a book,"
and "be happy."

The dual nature of tag clouds comes with a design trade-off. There are
more effective ways to navigate. In general, "browsing" interfaces are
not as efficient for finding stuff as searching (and tag clouds are
usually accompanied by a standard-issue search box, which sees more
use). But browsing and searching are two different activities that serve
different needs. The dynamic way that tag clouds show popular lists is a
remarkably effective way to browse.

There are also more accurate ways to graph tag popularity. Consider the
following lists, which show the most common words in the book of
Genesis. You could provide tags in a table with actual numbers (Figure
5), or in a bar graph (Figure 6).

Figure 5. Word frequency list



Figure 6. Word frequency bar graph


These methods both provide an unnecessary increase in accuracy at the
expense of a great loss in visual real estate (especially the bar
graph). Unless you're into biblical numerology, you don't really need to
know that the name "Esau" is mentioned exactly 58 times. You just want
to get a general sense of what is popular or frequent. Because tag
clouds use the words themselves to describe the data (Figure 7), they
can provide the essential information for a larger number of words in a
much smaller space.

Figure 7. Word frequency tag cloud
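
The counts behind a word frequency list like these can be gathered with
a few lines of Perl. The following is only a sketch under stated
assumptions: the function names, the four-word stop list, and the
one-verse sample are illustrative, and it is not the collection script
presented later in this article.

```perl
use strict;
use warnings;

# Count how often each word occurs in a block of text, ignoring case.
# Words in @stop_words are skipped. The short stop list used below is
# an illustrative assumption; a real one would be much longer.
sub word_counts {
    my ($text, @stop_words) = @_;
    my %skip = map { (lc($_) => 1) } @stop_words;
    my %count;
    while ($text =~ /([A-Za-z']+)/g) {
        my $word = lc $1;
        next if $skip{$word};
        $count{$word}++;
    }
    return \%count;
}

# Return the $n most frequent words, most frequent first.
# Ties are broken alphabetically so the result is repeatable.
sub top_words {
    my ($count, $n) = @_;
    my @ranked = sort { $count->{$b} <=> $count->{$a} or $a cmp $b } keys %$count;
    $n = @ranked if $n > @ranked;
    return @ranked[0 .. $n - 1];
}

# One verse of Genesis as sample input.
my $sample = "And God said, Let there be light: and there was light.";
my $count  = word_counts($sample, qw(and the be was));
printf "%-6s %d\n", $_, $count->{$_} for top_words($count, 3);
```

For the sample verse this lists light and there with two occurrences
each, followed by god with one. The resulting hash of counts is the kind
of data that a font-size mapping turns into a tag cloud.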

If my own experience is typical, tag clouds are looked at more
frequently than they are clicked on. Generally, I only click on tag
clouds when they correspond very closely to my actual interests at the
time. However, their function as a measurement of zeitgeist is quite
useful by itself.

Tag clouds have another, less obvious function, along with being
something to look at and something to click on: they effectively
describe the nature of a web site to search engines like Google. In
static web sites, people use the <meta name="description"> and
<meta name="keywords"> tags to describe the content of the web site to
search engines. But in sites like Flickr, which consist primarily of
user-generated content, you can't predict what the principal themes will
be tomorrow or next month. Tag clouds solve this problem by providing a
running meter of the important items on a site. Thus, they can
dynamically boost search-engine rankings for those tags. And if the
search engine pays attention to font size (and some of them do), so much
the better!


Some History
Flickr, a photography-sharing web site that caters to bloggers, was the
first web site to use something called a tag cloud. However, tag clouds
really have their roots in the blogging community. Bloggers have a need
to organize the large amounts of material they constantly churn out, and
an excellent communications medium to propagate new and interesting
methods.

Flickr's tag cloud idea was likely inspired (directly or indirectly) by
an older blog plugin called Zeitgeist (Figure 8), by Jim Flanagan.
Jim provided this story when I asked him about it:

"In 1997, when I was working at Brookhaven National Lab in Long Island
NY, the Web was becoming popular enough so that everybody had to have a
web page, and I wanted somehow to rebel against the canonical,
hierarchical bulleted list of links. So I wrote a Perl CGI that would
take a small database of links and present them on the page in varying
colors and sizes. The color and size were selected randomly so different
things would cycle into your attention each time you loaded the page.

"Much later, when I got into blogging, I fell into the narcissistic
practice of checking my blog referral logs to see what was linking to
me. I developed several personal 'narcissurfing' tools, and noticed that
the Google and Yahoo searches that led to my site were often very
amusing. In an attempt to build a page to share the search information
with my readers, I fell back to the random-colored links approach,
except that this time, the number of hits from a certain search term
controlled the size.

"After a while, several bloggers asked for the code, and I cleaned it up
a bit and sent it along. It's still available online."

Many bloggers use the word "zeitgeist" to mean a weighted word list in
the style of Jim's plugin, as in Figure 8.

Figure 8. Jim Flanagan's Zeitgeist plugin in action

If you look at Jim's code (or Figure 8), you'll see that it has the
following mappings:

A. Visual Features    B. Underlying Data
Font size             Quantity
Word order            Random
Word color            Random
Typeface              Default
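
To make these mappings concrete, here is a small Perl sketch in which
size follows the hit count while order and color are random. It is not
Jim Flanagan's code: the function name, the color palette, the pixel
range, and the sample search phrases with their hit counts are all
invented for illustration.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Zeitgeist-style weighted list: font size follows the hit count,
# while word order and word color are chosen at random.
# The palette and pixel range are passed in by the caller; the values
# used below are made up for illustration.
sub zeitgeist_items {
    my ($hits, $min_px, $max_px, @palette) = @_;
    my @sorted = sort { $a <=> $b } values %$hits;
    return () unless @sorted;
    my ($low, $high) = @sorted[0, -1];
    my $spread = ($high - $low) || 1;   # avoid dividing by zero

    my @items;
    for my $phrase (shuffle keys %$hits) {             # random word order
        my $ratio = ($hits->{$phrase} - $low) / $spread;
        push @items, {
            phrase => $phrase,
            px     => int($min_px + $ratio * ($max_px - $min_px) + 0.5),
            color  => $palette[ int rand @palette ],   # random word color
        };
    }
    return @items;
}

# Invented search phrases and hit counts.
my %hits = ('perl cgi' => 40, 'tag cloud' => 25, 'mullet' => 5);
for my $item (zeitgeist_items(\%hits, 10, 30, '#336699', '#993333', '#339933')) {
    printf qq{<span style="font-size:%dpx;color:%s">%s</span>\n},
        @{$item}{qw(px color phrase)};
}
```

Each run prints the same three phrases at the same sizes, but in a
different order and with different colors, which matches the way the
random features carry no information of their own.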

The team at Flickr (Stewart Butterfield, Cal Henderson, and George
Oates) implemented the first tag clouds at Flickr using Zeitgeist-like
weighted lists as inspiration. The Flickr tag clouds have a few
fundamental differences from the word lists produced by Zeitgeist:

• They represent tags rather than search engine phrases, so the data
  being shown is actively generated by the site's community within the
  site, rather than gathered from the site's server logs.
• They do not use random word order (although many other tag clouds do).
  The alphabetical word order provides an additional way to browse the
  list, while still giving the list a random appearance.

Flickr also gave their tag clouds a more polished design that many other
sites have emulated. They chose an attractive font, a single color
(rather than a random assortment of colors, which adds visual complexity
but no additional information), and they kept the lists of words
relatively short, rather than allowing them to go on for pages and
pages, as many Zeitgeist-based pages do.

