O'Reilly, Statistics Hacks: Tips and Tools for Measuring the World and Beating the Odds, May 2006, ISBN 0596101643

Statistics Hacks
By Bruce Frey
...............................................
Publisher: O'Reilly
Pub Date: May 2006
Print ISBN-10: 0-596-10164-3
Print ISBN-13: 978-0-59-610164-0
Pages: 356

Want to calculate the probability that an event will happen? Be able to
spot fake data? Prove beyond doubt whether one thing causes another? Or
learn to be a better gambler? You can do that and much more with 75
practical and fun hacks packed into Statistics Hacks. These cool tips,
tricks, and mind-boggling solutions from the world of statistics,
measurement, and research methods will not only amaze and entertain you,
but will give you an advantage in several real-world situations,
including business.

This book is ideal for anyone who likes puzzles, brainteasers, games,
gambling, magic tricks, and those who want to apply math and science to
everyday circumstances. Several hacks in the first chapter alone, such
as the "central limit theorem," which allows you to know everything by
knowing just a little, serve as sound approaches for marketing and other
business objectives. Using the tools of inferential statistics, you can
understand the way probability works, discover relationships, predict
events with uncanny accuracy, and even make a little money with a
well-placed wager here and there.

Statistics Hacks presents useful techniques from statistics, educational
and psychological measurement, and experimental research to help you
solve a variety of problems in business, games, and life. You'll learn
how to:

Play smart when you play Texas Hold 'Em, blackjack, roulette, dice
games, or even the lottery

Design your own winnable bar bets to make money and amaze your friends

Predict the outcomes of baseball games, know when to "go for two" in
football, and anticipate the winners of other sporting events with
surprising accuracy

Demystify amazing coincidences and distinguish the truly random from the
only seemingly random, and even keep your iPod's "random" shuffle honest

Spot fraudulent data, detect plagiarism, and break codes

Isolate the effects of observation on the thing observed

Whether you're a statistics enthusiast who does calculations in your
sleep or a civilian who is entertained by clever solutions to
interesting problems, Statistics Hacks has tools to give you an edge
over the world's slim odds.


Credits
Preface
Chapter 1. The Basics
Hack 1. Know the Big Secret
Hack 2. Describe the World Using Just Two Numbers
Hack 3. Figure the Odds
Hack 4. Reject the Null
Hack 5. Go Big to Get Small
Hack 6. Measure Precisely
Hack 7. Measure Up
Hack 8. Power Up
Hack 9. Show Cause and Effect
Hack 10. Know Big When You See It
Chapter 2. Discovering Relationships
Hack 11. Discover Relationships
Hack 12. Graph Relationships
Hack 13. Use One Variable to Predict Another
Hack 14. Use More Than One Variable to Predict Another
Hack 15. Identify Unexpected Outcomes
Hack 16. Identify Unexpected Relationships
Hack 17. Compare Two Groups
Hack 18. Find Out Just How Wrong You Really Are
Hack 19. Sample Fairly
Hack 20. Sample with a Touch of Scotch
Hack 21. Choose the Honest Average
Hack 22. Avoid the Axis of Evil
Chapter 3. Measuring the World
Hack 23. See the Shape of Everything
Hack 24. Produce Percentiles
Hack 25. Predict the Future with the Normal Curve
Hack 26. Give Raw Scores a Makeover
Hack 27. Standardize Scores
Hack 28. Ask the Right Questions
Hack 29. Test Fairly
Hack 30. Improve Your Test Score While Watching Paint Dry
Hack 31. Establish Reliability
Hack 32. Establish Validity
Hack 33. Predict the Length of a Lifetime
Hack 34. Make Wise Medical Decisions
Chapter 4. Beating the Odds
Hack 35. Gamble Smart
Hack 36. Know When to Hold 'Em
Hack 37. Know When to Fold 'Em
Hack 38. Know When to Walk Away
Hack 39. Lose Slowly at Roulette
Hack 40. Play in the Black in Blackjack
Hack 41. Play Smart When You Play the Lottery
Hack 42. Play with Cards and Get Lucky
Hack 43. Play with Dice and Get Lucky
Hack 44. Sharpen Your Card-Sharping
Hack 45. Amaze Your 23 Closest Friends
Hack 46. Design Your Own Bar Bet
Hack 47. Go Crazy with Wild Cards
Hack 48. Never Trust an Honest Coin
Hack 49. Know Your Limit
Chapter 5. Playing Games
Hack 50. Avoid the Zonk
Hack 51. Pass Go, Collect $200, Win the Game
Hack 52. Use Random Selection as Artificial Intelligence
Hack 53. Do Card Tricks Through the Mail
Hack 54. Check Your iPod's Honesty
Hack 55. Predict the Game Winners
Hack 56. Predict the Outcome of a Baseball Game
Hack 57. Plot Histograms in Excel
Hack 58. Go for Two
Hack 59. Rank with the Best of Them
Hack 60. Estimate Pi by Chance
Chapter 6. Thinking Smart
Hack 61. Outsmart Superman
Hack 62. Demystify Amazing Coincidences
Hack 63. Sense the Real Randomness of Life
Hack 64. Spot Faked Data
Hack 65. Give Credit Where Credit Is Due
Hack 66. Play a Tune on Pascal's Triangle
Hack 67. Control Random Thoughts
Hack 68. Search for ESP
Hack 69. Cure Conjunctionitus
Hack 70. Break Codes with Etaoin Shrdlu
Hack 71. Discover a New Species
Hack 72. Feel Connected
Hack 73. Learn to Ride a Votercycle
Hack 74. Live Life in the Fast Lane (You're Already In)
Hack 75. Seek Out New Life and New Civilizations
Colophon
Index


Credits
About the Author
Bruce Frey, Ph.D., is a comic book collector and film buff. In his
spare time, he teaches statistics to graduate students and conducts
research in his secret identity as an assistant professor in
Educational Psychology and Research at the University of Kansas. He is
an award-winning teacher, and his scholarly research interests are in
the areas of teacher-made tests and classroom assessment, the
measurement of spirituality, and program evaluation methods. Bruce's
honors include taking third place in the Kansas Monopoly Championship
as a teenager, second place in the Kansas Film Festival as a college
student, and a respectable third-place finish in the Lawrence, Kansas,
Texas Hold 'Em Poker Tournament as a middle-aged man. He is proudest of
two accomplishments: his marriage to his sweet wife, and his purchase
of a low-grade copy of Showcase #4, a comic book wherein the "Silver
Age Flash first appears," whatever that means.

Contributors
The following people contributed their hacks, writing, and inspiration
to this book:

Joseph Adler is the author of Baseball Hacks (O'Reilly), and a
researcher in the Advanced Product Development Group at VeriSign,
focusing on problems in user authentication, managed security services,
and RFID security. Joe has years of experience analyzing data, building
statistical models, and formulating business strategies as an employee
and consultant for companies including DoubleClick, American Express,
and Dun & Bradstreet. He is a graduate of the Massachusetts Institute
of Technology with an Sc.B. and an M.Eng. in computer science and
computer engineering. Joe is an unapologetic Yankees fan, but he
appreciates any good baseball game. Joe lives in Silicon Valley with
his wife, two cats, and a DirecTV satellite dish.

Ron Hale-Evans is a writer, thinker, and game designer who earns his
daily sandwich with frequent gigs as a technical writer. He has a
Bachelor's degree in Psychology from Yale, with a minor in Philosophy.
Thinking a lot about thinking led him to create the Mentat Wiki, which
led to his recent book, Mind Performance Hacks (O'Reilly). You can find
his multinefarious [sic] other projects at his home page, including his
award-winning board games, a list of his Short-Duration Personal
Saviors, and his blog. Ron's next book will probably be about game
systems, especially since his series of articles on that topic for the
dearly departed The Games Journal has been relatively popular among
both gamers and academics. If you want to email Ron the names of some
gullible publishers, or if you just want to bug him, you can reach him
at his email address (which rhymes with nudism and has nothing to do
with Luddism).

Brian E. Hansen, 27, grew up in the Dallas, Texas area. After serving a
two-year religious mission in Spain, he attended Texas A&M University
and graduated in 2004 with a B.S. degree in Petroleum Engineering. He
currently works as a Reservoir Engineer for a large independent oil and
gas exploration and production company headquartered in Irving, Texas.

Jill H. Lohmeier received her Ph.D. in Cognitive Psychology from The
University of Massachusetts, Amherst. She is currently the Evaluation
Director for the School Program Evaluation and Research group at the
University of Kansas. Jill likes outdoor sports, especially running,
hiking, and playing soccer with her kids.

Ernest E. Rothman is a Professor and Chair of the Mathematical Sciences
Department at Salve Regina University (SRU) in Newport, Rhode Island.
Ernie holds a Ph.D. in Applied Mathematics from Brown University and
held positions at the Cornell Theory Center in Ithaca, New York before
coming to SRU. His interests are primarily in scientific computing,
mathematics and statistics education, and the Unix underpinnings of
Mac OS X. You can keep abreast of his latest activities online.

Neil J. Salkind is a sometimes faculty member at the University of
Kansas with an office opposite that of Bruce Frey, of Statistics Hacks
fame. In addition to being the author of Statistics for People Who
(Think They) Hate Statistics (SAGE), Neil is a developmental
psychologist who collects books, cooks, works on old houses and a P1800
Volvo, and is active in Masters swimming. He has also written over 100
trade books and textbooks, and works with Studio B Literary Agency in
New York.

William Skorupski is currently an assistant professor in the School of
Education at the University of Kansas, where he teaches courses in
psychometrics and statistics. He earned his Bachelor's degree in
educational research and psychology from Bucknell University in 2000,
and his Doctorate in psychometric methods from the University of
Massachusetts, Amherst in 2004. His primary research interest is in the
application of mathematical models to psychometric data, including the
use of Bayesian statistics for solving practical measurement problems.
He also enjoys applying his knowledge of statistics and probability to
everyday situations, such as playing poker against the author of this
book!

Acknowledgments
I'd like to thank all the contributors to this book, both those who are
listed in the "Contributors" section and those who helped with ideas,
reviewed the manuscript, and provided suggestions of sources and
resources. Thanks in this capacity especially go to Tim Langdon, neon
bender, whose gift of Harry Blackstone, Jr.'s paperback book There's
One Born Every Minute (Jove Publications) provided great inspiration
for many of the hacks herein.

I'd like to thank my editor, Brian Sawyer, who shepherded this project
with a strong hand and a strong vision of what is and is not a hack. He
was right most of the time. (Though not all the time, Brian. That hack
about using a monkey to pick the winner of the Kentucky Derby should
have made it in. Maybe next time....) Brian was instrumental in
bringing this project to completion, especially during a string of
unlucky rolls where the odds of success looked slim.

I'd like to thank Neil Salkind, statistics writer supreme, for his help
with many facets of my professional life and this book.

Most importantly, thanks to Bonnie Johnson, my sweet wife, whom I
vaguely recall, but who I think will be waiting for me at home when I
finally turn in the last revision of this book.




Preface
Chance plays a huge part in your life, whether you know it or not. Your
particular genetic makeup mutated slightly when you were created, and
it did so based on specific laws of probability. Performance in school
involves human errors, yours and others', which tend to keep your
actual ability level from being reflected precisely in your report card
or on those high-stakes tests. Research on careers even suggests that
what you do for a living was probably not a result of careful planning
and preparation, but more likely due to happenstance. And, of course,
chance determines your fate in games of chance and plays a large role
in the outcome of sporting events.

Fortunately, an entire set of scientific tools, the various
applications of statistics, can be used to solve the problems caused by
our fate-influenced system. Inferential statistics, a field of science
based entirely on the nature of probability, allows us to understand
the way things work, discover relationships among variables, describe a
huge population by seeing just a small bit of it, make uncannily
accurate predictions, and, yes, even make a little money with a
well-placed wager here and there.

This book is a collection of statistical tricks and tools. Statistics
Hacks presents useful tools from statistics, of course, but also from
the realms of educational and psychological measurement and
experimental research design. It provides solutions to a variety of
problems in the world of social science, but also in the worlds of
business, games, and gambling.

If you are already a top scientist and do statistical calculations in
your sleep, you'll enjoy this book and the creative applications it
finds for those rusty old tools you know so well. If you just like the
scientific approach to life and are entertained by cool ideas and
clever solutions to interesting problems, don't worry. Statistics Hacks
was written with the nonscientist in mind, too, so if that is you,
you've come to the right place. It's written for the nonstatistician as
well, so if this still describes you, you'll feel safe here.

If, on the other hand, you are taking a statistics course or have some
interest in the academic nature of the topic, you might find this book
a pleasant companion to the textbooks typically required for those
sorts of courses. There won't be any contradictions between your
textbook and this book, so hearing about real-world applications of
statistical tools that seem only theoretical won't hurt your
development. It's just that there are some pretty cool things that you
can do with statistics that seem more like fun than like work.

Why Statistics Hacks?
The term hacking has a bad reputation in the press. They use it to
refer to people who break into systems or wreak havoc, using computers
as their weapon. Among people who write code, though, the term hack
refers to a "quick-and-dirty" solution to a problem or a clever way to
get something done. And the term hacker is taken very much as a
compliment, referring to someone as being creative, having the
technical chops to get things done. The Hacks series is an attempt to
reclaim the word, document the good ways people are hacking, and pass
the hacker ethic of creative participation on to the uninitiated.
Seeing how others approach systems and problems is often the quickest
way to learn about a new technology.

The technologies at the heart of this book are statistics, measurement,
and research design. Computer technology has developed hand-in-hand
with these technologies, so the use of the term hacks to describe what
is done in this book is consistent with almost every perspective on
that word. Though there is just a little computer hacking covered in
these pages, there is a plethora of clever ways to get things done.


How This Book Is Organized
You can read this book from cover to cover if you like, but each hack
stands on its own, so feel free to browse and jump to the different
sections that interest you most. If there's a prerequisite you need to
know about, a cross-reference will guide you to the right hack.

The earlier hacks are more foundational and probably provide
generalized solutions or strategic approaches across a variety of
problems to a greater extent than later hacks. On the other hand, later
hacks provide much more specific tricks for winning games or just
information to help you understand what's going on around you.

The book is divided into several chapters, organized by subject:

Chapter 1, The Basics
Use these hacks as a strong set of foundational tools, the ones you
will use most often when you are stat-hacking your way into and out of
trouble. Think of these as your basic toolkit: your hammer, saw, and
various screwdrivers.

Chapter 2, Discovering Relationships
This chapter covers statistical ways to find, describe, and test
relationships among variables. You will be able to make the invisible
visible with these hacks.

Chapter 3, Measuring the World
A variety of tips and tricks for measuring the world around you are
presented here. You'll learn to ask the right questions, assess
accurately, and even increase your own performance on high-stakes
tests.

Chapter 4, Beating the Odds
This chapter is for the gambler. Use the odds to your advantage, and
make the right decisions in Texas Hold 'Em poker and just about every
other game in which probability determines the outcome.

Chapter 5, Playing Games
From TV game show strategy to winning Monopoly to enjoying sports to
just having fun, this chapter presents different hacks for getting the
most out of your game playing.

Chapter 6, Thinking Smart
This chapter is perhaps the most cerebral of them all. Get your mind
right, play mind games, make discoveries, and unlock the mysteries of
the world around us using the statistics hacks you'll find here.

Conventions Used in This Book
The following is a list of the typographical conventions used in this
book:

Italics
Used to indicate key terms and concepts, URLs, and filenames.

Constant width
Used for Excel functions and code examples.

Constant width italic
Used for code text that should be replaced by user-supplied values.

Gray type
Used to indicate a cross-reference within the text.

You should pay special attention to notes set apart from the text with
this icon:

This is a tip, suggestion, or general note. It contains useful
supplementary information about the topic at hand.

The thermometer icons, found next to each hack, indicate the relative
complexity of the hack:

Safari Enabled
When you see a Safari® Enabled icon on the cover of your favorite
technology book, that means the book is available online through the
O'Reilly Network Safari Bookshelf.

Safari offers a solution that's better than e-books. It's a virtual
library that lets you easily search thousands of top tech books, cut
and paste code samples, download chapters, and find quick answers when
you need the most accurate, current information. Try it for free
online.

How to Contact Us
We have tested and verified the information in this book to the best of
our ability, but you may find that the rules or characteristics of a
given situation are different than described here. As a reader of this
book, you can help us to improve future editions by sending us your
feedback. Please let us know about any errors, inaccuracies, misleading
or confusing statements, and typos that you find anywhere in this book.

Please also let us know what we can do to make this book more useful to
you. We take your comments seriously and will try to incorporate
reasonable suggestions into future editions. You can write to us at:

O'Reilly Media, Inc.
1005 Gravenstein Hwy N.
Sebastopol, CA 95472
800-998-9938 (in the U.S. or Canada)
707-829-0515 (international/local)
707-829-0104 (fax)

To ask technical questions or to comment on the book, send email to:

The web site for Statistics Hacks lists examples, errata, and plans for
future editions. You can find this page at:

For more information about this book and others, see the O'Reilly web
site:

Got a Hack?
To explore Hacks books online or to contribute a hack for future
titles, visit:


Chapter 1. The Basics
There's only a small group of tools that statisticians use to explore
the world, answer questions, and solve problems. It is the way that
statisticians use probability or knowledge of the normal distribution
to help them out in different situations that varies. This chapter
presents these basic hacks.

Taking known information about a distribution and expressing it as a
probability [Hack #1] is an essential trick frequently used by
stat-hackers, as is using a tiny bit of sample data to accurately
describe all the scores in a larger population [Hack #2]. Knowledge of
basic rules for calculating probabilities [Hack #3] is crucial, and you
gotta know the logic of significance testing if you want to make
statistically based decisions [Hacks #4 and #8].

Minimizing errors in your guesses [Hack #5] and scores [Hack #6] and
interpreting your data [Hack #7] correctly are key strategies that will
help you get the most bang for your buck in a variety of situations.
And successful stat-hackers have no trouble recognizing what the
results of any organized set of observations or experimental
manipulation really mean [Hacks #9 and #10].

Learn to use these core tools, and the later hacks will be a breeze to
learn and master.


Hack 1. Know the Big Secret

Statisticians know one secret thing that makes them seem smarter than
everybody else.

The primary purpose of statistics as a scientific methodology is to
make probability statements about samples of scores. Before we jump
into that, we need some quick definitions to get us rolling, both to
understand this hack and to lay a foundation for other statistics
hacks.

Samples are numeric values that you have gathered together and can see
in front of you that represent some larger population of scores that
you have not gathered together and cannot see in front of you. Because
these values are almost always numbers that indicate the presence or
level of some characteristic, measurement folks call these values
scores. A probability statement is a statement about the likelihood of
some event occurring.

Probability is the heart and soul of statistics. A common perception of
statisticians, in fact, is that they mainly calculate the exact
likelihood that certain events of interest will occur, such as winning
the lottery or being struck by lightning. Historically, the person who
had the tools to calculate the likely outcome of a dice game was the
same person who had the tools to describe a large group of people using
only a few summary statistics.

So, traditionally, the teaching of statistics includes at least some
time spent on the basic rules of probability: the methods for
calculating the chances of various combinations or permutations of
possible outcomes. More common applications of statistics, however, are
the use of descriptive statistics to describe a group of scores, or the
use of inferential statistics to make guesses about a population of
scores using only the information contained in a sample of scores. In
social science, the scores usually describe either people or something
that is happening to them.

It turns out, then, that researchers and measurers (the people who are
most likely to use statistics in the real world) are called upon to do
more than calculate the probability of certain combinations and
permutations of interest. They are able to apply a wide variety of
statistical procedures to answer questions of varying levels of
complexity without once needing to compute the odds of throwing a pair
of six-sided dice and getting three 7s in a row.

Those odds are about .005, or 1/2 of 1 percent, if you start from
scratch. If you have already rolled two 7s, you have about a 16.7
percent (1 in 6) chance of rolling that third 7.
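
For readers who want to check the dice odds in the note above, here is a minimal sketch that enumerates every outcome of two fair six-sided dice. The variable names are illustrative and not from the book; the only assumption is that the dice are fair and the rolls independent.

```python
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair six-sided dice.
rolls = list(product(range(1, 7), repeat=2))

# Probability that a single throw of the pair totals 7.
p_seven = Fraction(sum(1 for a, b in rolls if a + b == 7), len(rolls))

# Independent throws multiply: three 7s in a row, starting from scratch.
p_three_sevens = p_seven ** 3

print(p_seven)         # 1/6, the chance of the third 7 once two are rolled
print(p_three_sevens)  # 1/216, roughly half of 1 percent
```

Using exact fractions rather than floating-point numbers keeps the "1 in 6" and "1 in 216" figures visible instead of hiding them behind rounded decimals.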

The Big Secret
The key reason that probability is so crucial to what statisticians do
is because they like to make probability statements about the scores in
real or theoretical distributions.

A distribution of scores is a list of all the different values and,
sometimes, how many of each value there are.

For example, if you know that a quiz just administered in a class you
are taking resulted in a distribution of scores in which 25 percent of
the class got 10 points, then I might say, without knowing you or
anything about you, that there is a 25 percent chance that you got 10
points. I could also say that there is a 75 percent chance that you did
not get 10 points. All I have done is taken known information about the
distribution of some values and expressed that information as a
statement of probability. This is a trick. It is the secret trick that
all statisticians know. In fact, this is mostly all that statisticians
ever do!
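
The quiz example can be sketched in a few lines of code. The list of 20 scores below is hypothetical, chosen only so that 25 percent of the class scored 10 points, and `probability_table` is an illustrative helper name rather than anything from the book.

```python
from collections import Counter
from fractions import Fraction

def probability_table(scores):
    """Turn a list of scores into {value: probability of that value}."""
    counts = Counter(scores)
    n = len(scores)
    return {value: Fraction(count, n) for value, count in counts.items()}

# Hypothetical quiz results for a class of 20 students.
quiz_scores = [10, 10, 10, 10, 10, 9, 9, 9, 9, 8,
               8, 8, 8, 8, 7, 7, 7, 6, 6, 5]

table = probability_table(quiz_scores)
p_ten = table[10]        # chance a randomly chosen student scored 10
p_not_ten = 1 - p_ten    # chance that student did not score 10

print(p_ten, p_not_ten)  # 1/4 3/4
```

The statement "there is a 25 percent chance that you got 10 points" only holds if "you" are treated as a student picked at random from the class, which is the assumption the code makes explicit.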

Statisticians take known information about the distribution of some
values and express that information as a statement of probability. This
is worth repeating (or, technically, threepeating, as I first said it
five sentences ago). Statisticians take known information about the
distribution of some values and express that information as a statement
of probability.

Heavens to Betsy, we can all do that. How hard could it be? Imagine
that there are three marbles in an otherwise empty coffee can. Further
imagine that you know that only one of the marbles is blue. There are
three values in the distribution: one blue marble and two marbles of
some other color, for a total sample size of three. There is one blue
marble out of three marbles. Oh, statistician, what are the chances
that, without looking, I will draw the blue marble out first? One out
of three. 1/3. 33 percent.

To be fair, the values and their distributions most commonly used by
statisticians are a bit more abstract or complex than those of the
marbles in a coffee can scenario, and so much of what statisticians do
is not quite that transparent. Applied social science researchers
usually produce values that represent the difference between the
average scores of several groups of people, for example, or an index of
the size of the relationship between two or more sets of scores. The
underlying process is the same as that used with the coffee can
example, though: reference the known distribution of the value of
interest and make a statement of probability about that value.

The key, of course, is how one knows the distribution of all these
exotic types of values that might interest a statistician. How can one
know the distribution of average differences or the distribution of the
size of a relationship between two sets of variables? Conveniently,
past researchers and mathematicians have developed or discovered
formulas and theorems and rules of thumb and philosophies and
assumptions that provide us with the knowledge of the distributions of
these complex values most often sought by researchers. The work has
been done for us.

A Smaller, Dirtier Secret
Most of the procedures that statisticians use to take known information
about a distribution of scores and express that information as a
statement of probability have certain requirements that must be met for
the probability statement to be accurate. One of these requirements
that almost always must be met is that the values in a sample have been
randomly drawn from the distribution.

Notice that in the coffee can example I slipped in that "without
looking" business. If some force other than random chance is guiding
the sampling process, then the associated probabilities reported are
simply wrong, and (here's the worst part) we can't possibly know how
wrong they are. Much, and maybe most, of the applied psychological and
educational research that occurs today uses samples of people that were
not randomly drawn from some population of interest.

College students taking an introductory psychology course make up the
samples of much psychological research, for example, and students at
elementary schools conveniently located near where an educational
researcher lives are often chosen for study. This is a problem that
social science researchers live with or ignore or worry about, but,
nevertheless, it is a limitation of much social science research.
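
A small simulation can illustrate why the random-sampling requirement matters. Everything in this sketch is an assumption made for illustration: the simulated population of 10,000 scores, the cutoff of 110 that defines who is "convenient" to sample, the sample size of 100, and the fixed random seed.

```python
import random
import statistics

random.seed(7)  # fixed seed so the illustration is repeatable

# Hypothetical population of 10,000 scores (illustrative only).
population = [random.gauss(100, 15) for _ in range(10_000)]
pop_mean = statistics.fmean(population)

# Random sampling: every score has the same chance of being drawn.
random_sample = random.sample(population, 100)

# Convenience sampling: only scores above 110 are "easy to reach".
reachable = [x for x in population if x > 110]
convenience_sample = random.sample(reachable, 100)

random_error = abs(statistics.fmean(random_sample) - pop_mean)
convenience_error = abs(statistics.fmean(convenience_sample) - pop_mean)

print(f"population mean:          {pop_mean:.1f}")
print(f"random sample error:      {random_error:.1f}")
print(f"convenience sample error: {convenience_error:.1f}")
```

In the simulation the size of the error is visible only because the whole population was generated in advance. A real researcher sees just the sample, which is why the bias of a convenience sample cannot be measured from the sample alone.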


Hack 2. Describe the World Using Just Two Numbers

Most of the statistical solutions and tools presented in this book work
only because you can look at a sample and make accurate inferences
about a larger population. The Central Limit Theorem is the meta-tool,
the prime directive, the king of all secrets that allows us to pull off
these inferential tricks.

Statistics provide solutions to problems whenever your goal is to
describe a group of scores. Sometimes the whole group of scores you
want to describe is in front of you. The tools for this task are called
descriptive statistics. More often, you can see only part of the group
of the scores you want to describe, but you still want to describe the
whole group. This summary approach is called inferential statistics. In
inferential statistics, the part of the group of scores you can see is
called a sample, and the whole group of scores you wish to make
inferences about is the population.

It is quite a trick, though, when you think about it, to be able to
describe with any confidence a population of values when, by
definition, you are not directly observing those values. By using three
pieces of information (two sample values and an assumption about the
shape of the distribution of scores in the population) you can
confidently and accurately describe those invisible populations. The
set of procedures for deriving that eerily accurate description is
collectively known as the Central Limit Theorem.
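
The claim that a small sample can describe an unseen population can be explored with a short simulation. This is a sketch under assumptions that are not in the text: the skewed, exponentially distributed population, the sample size of 40, the 2,000 repeated samples, and all variable names are hypothetical. It relies on the standard result that sample means cluster around the population mean with a spread close to the population standard deviation divided by the square root of the sample size.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is repeatable

# Hypothetical, deliberately skewed population (not normally distributed).
population = [random.expovariate(1 / 50) for _ in range(20_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

# Draw many small random samples and record the mean of each one.
n = 40
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(2_000)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
predicted_sd = pop_sd / n ** 0.5  # spread predicted for the sample means

print(f"population mean {pop_mean:.1f}, mean of sample means {mean_of_means:.1f}")
print(f"predicted spread {predicted_sd:.2f}, observed spread {sd_of_means:.2f}")
```

Even though individual scores in this population are heavily skewed, the means of repeated samples land close to the population mean, which is what makes describing a population from one modest sample workable.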

Some Quick Statistics Basics
Inferential statistics tend to use two values to describe populations,
the mean and the standard deviation.

Mean
Rather than describe a sample of values by listing every single score,
it is more efficient to report some fair summary of the group. This
single number is meant to fairly represent all the scores and what they
have in common. Consequently, this single number is referred to as the
central tendency of a group of scores.

Typically, the best measure of central tendency, for a variety of
reasons, is the mean [Hack #21]. The mean is the arithmetic average of
all the scores and is calculated by adding together all the values in a
group, and then dividing that total by the number of values. The mean
provides more information about all the scores in a group than other
central tendency options (such as reporting the middle score, the most
common score, and so on).

In fact, mathematically, the mean has an interesting property. A side
effect of how it is created (adding up all scores and dividing by the
number of scores) produces a number that is as close as possible to all
the other scores. The mean will be close to some scores and far away
from some others, but if you square those distances and add them up,
you get a total that is as small as possible. No other number, real or
imagined, will produce a smaller total squared distance from all the
scores in a group than the mean.
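
One precise way to state this property is in terms of squared distances: the mean is the single number that minimizes the sum of squared distances to the scores, while the median is the one that minimizes the sum of plain (absolute) distances. The sketch below checks both statements on a small set of hypothetical scores; the function names are illustrative.

```python
import statistics

def total_squared_distance(scores, center):
    """Sum of squared distances from each score to a candidate center."""
    return sum((x - center) ** 2 for x in scores)

def total_absolute_distance(scores, center):
    """Sum of unsquared distances from each score to a candidate center."""
    return sum(abs(x - center) for x in scores)

scores = [2, 4, 4, 5, 10]            # hypothetical scores
mean = statistics.fmean(scores)      # arithmetic average
median = statistics.median(scores)   # middle score

# Compare the mean against a few nearby candidates and against the median.
for center in (mean - 1, mean, mean + 1, median):
    print(center,
          total_squared_distance(scores, center),
          total_absolute_distance(scores, center))
```

Moving the candidate center away from the mean in either direction increases the squared total, while the smallest absolute total for these scores occurs at the median rather than the mean.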

Standard deviation
Just knowing the mean of a distribution doesn't quite tell us enough.
We also need to know something about the variability of the scores. Are
they mostly close to the mean or mostly far from the mean? Two wildly
different distributions could have the same mean but differ in their
variability. The most commonly reported measure of variability
summarizes the distances between each score and the mean.

As with the mean, the more informative measure of variability would be
one that uses all the values in a distribution. A measure of
variability that does this is the standard deviation. The standard
deviation is, roughly, the average distance of each score from the
mean. A standard deviation calculates all the distances in a
distribution and averages them. The "distances" referred to are the
distances between each score and the mean.

Another commonly reported value that summarizes the variability in a
distribution is the variance. The variance is simply the standard
deviation squared and is not particularly useful in picturing a
distribution, but it is helpful when comparing different distributions
and is frequently used as a value in statistical calculations, such as
with the independent t test [Hack #17].
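
As a quick check of how the two measures relate, the sketch below computes the variance and standard deviation step by step for a small set of hypothetical scores. It assumes the sample forms that divide by n - 1, which is what Python's `statistics.variance` and `statistics.stdev` use; dividing by n instead would give slightly smaller values.

```python
import math
import statistics

scores = [2, 4, 4, 5, 10]  # hypothetical scores
n = len(scores)
mean = sum(scores) / n

# Square each distance from the mean so negatives cannot cancel positives.
squared_distances = [(x - mean) ** 2 for x in scores]

# Sample variance: total squared distance divided by n - 1 (an assumption).
variance = sum(squared_distances) / (n - 1)

# The standard deviation is the square root of the variance.
sd = math.sqrt(variance)

print(mean, variance, sd)  # 5.0 9.0 3.0
```

The standard deviation of 3.0 is in the same units as the scores, which is why it is easier to picture than the variance of 9.0.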


The formula for the standard deviation appears to be more complicated
than it needs to be, but there are some mathematical complications with
summing distances (negative distances always cancel out the positive
distances when the mean is used as the dividing point). Consequently,
here is the equation:
Σ means to sum up. The x means each score, and the n means the number
of scores.
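
A standard form of the equation that is consistent with these symbol definitions is the sample standard deviation, with Σ as the summation sign. It is offered here as an assumption rather than as the book's exact notation, and some texts divide by n instead of n - 1:

```latex
\text{Standard deviation} = \sqrt{\frac{\sum \left(x - \text{Mean}\right)^{2}}{n - 1}}
```

Squaring each distance before summing is what prevents the negative distances from canceling the positive ones, and the square root returns the result to the original units of the scores.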

Central Limit Theorem
The Central Limit Theorem is fairly brief, but very powerful.