



Big Data, Little Data, No Data



Big Data, Little Data, No Data
Scholarship in the Networked World

Christine L. Borgman

The MIT Press
Cambridge, Massachusetts
London, England


© 2015 Christine L. Borgman



All rights reserved. No part of this book may be reproduced in any form by
any electronic or mechanical means (including photocopying, recording, or
information storage and retrieval) without permission in writing from the
publisher.



MIT Press books may be purchased at special quantity discounts for business or
sales promotional use. For information, please email
edu.




This book was set in Stone Sans and Stone Serif by the MIT Press. Printed and
bound in the United States of America.



Library of Congress Cataloging-in-Publication Data



Borgman, Christine L., 1951–
Big data, little data, no data : scholarship in the networked world / Christine L.
Borgman.
pages cm
Includes bibliographical references and index.
ISBN 978-0-262-02856-1 (hardcover : alk. paper)
1. Communication in learning and scholarship—Technological innovations.
2. Research—Methodology. 3. Research—Data processing. 4. Information
technology. 5. Information storage and retrieval systems. 6. Cyberinfrastructure.
I. Title.
AZ195.B66 2015
004—dc23
2014017233.



ISBN: 978-0-262-02856-1




10 9 8 7 6 5 4 3 2 1


For Betty Champoux Borgman, 1926–2012,
and Ann O’Brien, 1951–2014



Contents

Detailed Contents  ix
Preface xvii
Acknowledgments xxi
Part I: Data and Scholarship  1
1 Provocations  3
2 What Are Data?  17
3 Data Scholarship  31
4 Data Diversity  55

Part II: Case Studies in Data Scholarship  81
5 Data Scholarship in the Sciences  83
6 Data Scholarship in the Social Sciences  125
7 Data Scholarship in the Humanities  161

Part III: Data Policy and Practice  203
8 Releasing, Sharing, and Reusing Data  205
9 Credit, Attribution, and Discovery  241
10 What to Keep and Why  271
References 289
Index 361



Detailed Table of Contents

Preface xvii
Acknowledgments xxi
Part I: Data and Scholarship  1
1 Provocations 3
Introduction 3
Big Data, Little Data  4
Bigness 5
Openness 7
The Long Tail  8
No Data  10
Data Are Not Available  11
Data Are Not Released  11
Data Are Not Usable  13
Provocations 13
Conclusion 15


2 What Are Data?  17

Introduction 17
Definitions and Terminology  18
Definitions by Example  19
Operational Definitions  20
Categorical Definitions  21
Degrees of Processing  21
Origin and Preservation Value  23
Collections 25
Conceptual Distinctions  26



Sciences and Social Sciences  26
Humanities 27
Conclusion 28

3 Data Scholarship  31

Introduction 31
Knowledge Infrastructures  32
The Social and the Technical  35

Communities and Collaboration  36
Knowledge and Representation  37
Theory, Practice, and Policy  38
Open Scholarship  39
Open Access to Research Findings  39
Open Access to Data  42
Open Technologies  45
Converging Communication  47
Data Metaphors  47
Units of Data  50
Documents of Record  51
Conclusion 52

4 Data Diversity  55

Introduction 55
Disciplines and Data  56
Size Matters  58
Project Goals  58
Data Collection  60
Data Analysis  61
When Are Data?  62
Distance Matters  64
Sources and Resources  64
Metadata 65
Definitions and Discovery  66
Communities and Standards  68
Provenance 70

External Influences  71
Economics and Values  71
Property Rights  75
Ethics 77
Conclusion 79



Part II: Case Studies in Data Scholarship  81
5 Data Scholarship in the Sciences  83

Introduction 83
Research Methods and Data Practices  83
Science Cases  84
Astronomy 85
Size Matters  86
Big Science, Little Science  86
Big Data, Long Tail  87
When Are Data?  90
Sources and Resources  91
Telescopes 91
Electromagnetic Spectrum  92
Celestial Objects  93
Astronomy Data Products  93
Knowledge Infrastructures  94
Metadata 94
Coordinate Systems  95

Celestial Objects  96
Data Archiving  97
Publications 98
Provenance 99
External Influences  100
Economics and Value  100
Property Rights  100
Ethics 101
Conducting Research in Astronomy  102
The COMPLETE Survey  102
Research Questions  103
Collecting Data  103
Analyzing Data  104
Publishing Findings  104
Curating, Sharing, and Reusing Data  105
Sensor-Networked Science and Technology  106
Size Matters  106
When Are Data?  108
Sources and Resources  109
Embedded Sensor Networks  109
Physical Samples  111
Software, Code, Scripts, and Models  111
Background Data  111



Knowledge Infrastructures  112
Metadata 112
Provenance 113
External Influences  113
Economics and Value  113
Property Rights  114
Ethics 115
Conducting Research with Embedded Sensor Networks  116
Research Questions  117
Collecting Data  117
Analyzing Data  119
Publishing Findings  119
Curating, Sharing, and Reusing Data  120
Conclusion 121

6 Data Scholarship in the Social Sciences  125

Introduction 125
Research Methods and Data Practices  126
Social Sciences Cases  127
Internet Surveys and Social Media Studies  128
Size Matters  128
When Are Data?  129
Sources and Resources  129
Knowledge Infrastructures  131
Metadata 132
Provenance 133

External Influences  135
Economics and Value  135
Property Rights  136
Ethics 136
Conducting Internet Surveys and Social Media Research  137
Research Questions  138
Collecting Data  139
Analyzing Data  140
Publishing Findings  141
Curating, Sharing, and Reusing Data  142
Sociotechnical Studies  143
Size Matters  144
When Are Data?  144
Sources and Resources  145
Field Observations and Ethnography  145
Interviews  146



Records and Documents  146
Building and Evaluating Technologies  147
Knowledge Infrastructures  147
Metadata 148
Provenance 148
External Influences  149
Economics and Value  149
Property Rights  149
Ethics 150
Conducting Sociotechnical Research in CENS  150

Research Questions  151
Collecting Data  152
Analyzing Data  154
Publishing Findings  155
Curating, Sharing, and Reusing Data  156
Conclusion 157

7 Data Scholarship in the Humanities  161

Introduction 161
Research Methods and Data Practices  162
Humanities Cases  164
Classical Art and Archaeology  164
Size Matters  165
When Are Data?  166
Sources and Resources  166
Physical versus Digital Objects  167
Digital versus Digitized  167
Surrogates versus Full Content  167
Static Images versus Searchable Representations  168
Searchable Strings versus Enhanced Content  169
Knowledge Infrastructures  170
Metadata 171
Provenance 172
Collections 173
External Factors  176
Economics and Value  176
Property Rights  178

Ethics 178
Conducting Research in Classical Art and Archaeology  179
Research Questions  180
Collecting Data  181
Analyzing Data  182


Publishing Findings  184
Curating, Sharing, and Reusing Data  184
Buddhist Studies  186
Size Matters  187
When Are Data?  187
Sources and Resources  188
Primary versus Secondary Sources  188
Static Images versus Enhanced Content  189
Knowledge Infrastructures  189
Metadata 190
Provenance 191
Collections 191
External Factors  192
Economics and Value  192
Property Rights  193
Ethics 193
Conducting Research in Buddhist Studies  194

Research Questions  195
Collecting Data  196
Analyzing Data  196
Publishing Findings  197
Curating, Sharing, and Reusing Data  199
Conclusion 200

Part III: Data Policy and Practice  203
8 Sharing, Releasing, and Reusing Data  205

Introduction 205
Supply and Demand for Research Data  207
The Supply of Research Data  208
To Reproduce Research  209
Defining Reproducibility  209
Determining What to Reproduce  209
Detecting Fraud  210
Resolving Disputes  211
To Make Public Assets Available to the Public  211
To Leverage Investments in Research  212
To Advance Research and Innovation  212
The Demand for Research Data  213
Scholarly Motivations  214
Publications and Data  215




Communicating Research  215
Publishing Research  216
Data as Assets and Liabilities  217
Releasing Data  218
Representation and Mobility  219
Provenance 220
Acquiring Data to Reuse  222
Background and Foreground Uses  222
Interpretation and Trust  223
Knowledge Infrastructures  224
Repositories, Collections, and Archives   225
Private Practice  227
Human Infrastructure  228
Intractable Problems  229
Disciplinary Knowledge Infrastructures  229
Sciences 230
Astronomy 230
Sensor Networked Science and Technology  231
Genomics 233
Social Sciences  235
Internet Research  235
Sociotechnical Research  235
Humanities 236
Classical Art and Archaeology  236
Buddhist Studies  237
Conclusion 237

9 Credit, Attribution, and Discovery of Data  241


Introduction 241
Principles and Problems  243
Theory and Practice  245
Substance and Style: How to Cite  245
Theories of Citation Behavior: What, When, and Why to Cite Objects  248
Meaning of Links  248
Selecting References  249
Theorizing and Modeling Citation Behavior  250
Citing Data  251
Clear or Contested: Who Is Credited and Attributed?  252
Naming the Cited Author  252
Negotiating Authorship Credit  253
Responsibility 255
Credit for Data  256


Name or Number: Questions of Identity  258
Identifying People and Organizations  258
Identity and Discovery  260
Identifying Objects  261
Theory Meets Technology: Citations as Actions  264
Risks and Rewards: Citations as Currency  266
Conclusion 268


10 What to Keep and Why  271
Introduction 271
Provocations Revisited  273
Rights, Responsibilities, Roles, and Risks  273
Data Sharing  275
Publications and Data  278
Data Access  281
Stakeholders and Skills  283
Knowledge Infrastructures Past, Present, and Future  285
Conclusion 287

References 289
Index 361


Preface

Big data begets big attention these days, but little data are equally essential
to scholarly inquiry. As the absolute volume of data increases, the ability to
inspect individual observations decreases. The observer must step ever further away from the phenomena of interest. New tools and new perspectives
are required. However, big data is not necessarily better data. The farther
the observer is from the point of origin, the more difficult it can be to determine what those observations mean—how they were collected; how they
were handled, reduced, and transformed; and with what assumptions and
what purposes in mind. Scholars often prefer smaller amounts of data that
they can inspect closely. When data are undiscovered or undiscoverable,
scholars may have no data.
Research data are much more—and less—than commodities to be
exploited. Data management plans, data release requirements, and other
well-intentioned policies of funding agencies, journals, and research institutions rarely accommodate the diversity of data or practices across domains.
Few policies attempt to define data other than by listing examples of what
they might be. Even fewer policies reflect the competing incentives and
motivations of the many stakeholders involved in scholarship. Data can
be many things to many people, all at the same time. They can be assets
to be controlled, accumulated, bartered, combined, mined, and perhaps to
be released. They can be liabilities to be managed, protected, or destroyed.
They can be sensitive or confidential, carrying high risks if released. Their
value may be immediately apparent or not realized until a time much later.
Some are worth the investment to curate indefinitely, but many have only
transient value. Within hours or months, advances in technology and
research fronts have erased the value in some kinds of observations.
A starting point to understand the roles of data in scholarship is
to acknowledge that data rarely are things at all. They are not natural
objects with an essence of their own. Rather, data are representations of
observations, objects, or other entities used as evidence of phenomena
for the purposes of research or scholarship. Those representations vary by
scholar, circumstance, and over time. Across the sciences, social sciences,
and the humanities, scholars create, use, analyze, and interpret data, often
without agreeing on what those data are. Conceptualizing something as
data is itself a scholarly act. Scholarship is about evidence, interpretation,
and argument. Data are a means to an end, which is usually the journal
article, book, conference paper, or other product worthy of scholarly recognition. Rarely is research done with data reuse in mind.
Galileo sketched in his notebook. Nineteenth-century astronomers took
images on glass plates. Today’s astronomers use digital devices to capture
photons. Images of the night sky taken with consumer-grade cameras can
be reconciled to those taken by space missions because astronomers have
agreed on representations for data description and mapping. Astronomy
has invested heavily in standards, tools, and archives so that observations
collected over the course of several centuries can be aggregated. However,
the knowledge infrastructure of astronomy is far from complete and far
from fully automated. Information professionals play key roles in organizing and coordinating access to data, astronomical and otherwise.
Relationships between publications and data are manifold, which is
why research data is fruitfully examined within the framework of scholarly communication. The making of data may be deliberate and long term,
accumulating a trove of resources whose value increases over time. It may
be ad hoc and serendipitous, grabbing whatever indicators of phenomena
are available at the time of occurrence. No matter how well defined the
research protocol, whether for astronomy, sociology, or ethnography, the
collection of data may be stochastic, with findings in each stage influencing choices of data for the next. Part of becoming a scholar in any field is
learning how to evaluate data, make decisions about reliability and validity,
and adapt to conditions of the laboratory, field site, or archive. Publications that report findings set them in the context of the domain, grounding
them in the expertise of the audience. Information necessary to understand
the argument, methods, and conclusions is presented. Details necessary
to replicate the study are often omitted because the audience is assumed
to be familiar with the methods of the field. Replication and reproducibility, although a common argument for releasing data, are relevant only
in selected fields and difficult to accomplish even in those. Determining
which scholarly products are worth preserving is the harder problem.
Policies for data management, release, and sharing obscure the complex
roles of data in scholarship and largely ignore the diversity of practices
within and between domains. Concepts of data vary widely across the sciences, social sciences, and humanities, and within each area. In most fields,
data management is learned rather than taught, leading to ad hoc solutions. Researchers often have great difficulty reusing their own data. Making those data useful to unknown others, for unanticipated purposes, is
even harder. Data sharing is the norm in only a few fields because it is very
hard to do, incentives are minimal, and extensive investments in knowledge infrastructures are required.

This book is intended for the broad audience of stakeholders in research
data, including scholars, researchers, university leaders, funding agencies, publishers, libraries, data archives, and policy makers. The first section frames data and scholarship in four chapters, provoking a discussion
about concepts of data, scholarship, knowledge infrastructures, and the
diversity of research practices. The second section consists of three chapters
exploring data scholarship in the sciences, social sciences, and humanities.
These case studies are parallel in structure, providing comparisons across
domains. The concluding section spans data policy and practice in three
chapters, exploring why data scholarship presents so many difficult problems. These include releasing, sharing, and reusing data; credit, attribution,
and discovery; and what to keep and why.
Scholarship and data have long and deeply intertwined histories. Neither is a new concept. What is new are efforts to extract data from scholarly processes and to exploit them for other purposes. Costs, benefits, risks,
and rewards associated with the use of research data are being redistributed
among competing stakeholders. The goal of this book is to provoke a much
fuller, and more fully informed, discussion among those parties. At stake is
the future of scholarship.

Christine L. Borgman
Los Angeles, California
May 2014



Acknowledgments

It takes a village to write a sole-authored book, especially one that spans
as many topics and disciplines as does this one. My writing draws upon
the work of a large and widely distributed village of colleagues—an “invisible college” in the language of scholarly communication. Scholars care
passionately about their data and have given generously of their time in
countless discussions, participation in seminars and workshops, and reading many drafts of chapters.
The genesis of this book project goes back too many years to list all who
have influenced my thinking, thus these acknowledgments can thank, at
best, those who have touched the words in this volume in some way. Many
more are identified in the extensive bibliography. No doubt I have failed to
mention more than a few of you with whom I have had memorable conversations about the topics therein.
My research on scholarly data practices dates to the latter 1990s, building
on prior work on digital libraries, information-seeking behavior, human-computer interaction, information retrieval, bibliometrics, and scholarly
communication. The data practices research has been conducted with a
fabulous array of partners whose generative contributions to my thinking incorporate too much tacit knowledge to be made explicit here. Our
joint work is cited throughout. Many of the faculty collaborators, students,
and postdoctoral fellows participated in multiple projects; thus, they are
combined into one alphabetical list. Research projects on scholarly data
practices include the Alexandria Digital Earth Prototype Project (ADEPT);
Center for Embedded Networked Sensing (CENS); Cyberlearning Task
Force; Monitoring, Modeling, and Memory; Data Conservancy; Knowledge
Infrastructures; and Long-Tail Research.
Faculty collaborators on these projects include Daniel Atkins, Geoffrey
Bowker, Sayeed Choudhury, Paul Davis, Tim DiLauro, George Djorgovski,
Paul Edwards, Noel Enyedy, Deborah Estrin, Thomas Finholt, Ian Foster,
James Frew, Jonathan Furner, Anne Gilliland, Michael Goodchild, Alyssa
Goodman, Mark Hansen, Thomas Harmon, Bryan Heidorn, William
Howe, Steven Jackson, Carl Kesselman, Carl Lagoze, Gregory Leazer, Mary
Marlino, Richard Mayer, Carole Palmer, Roy Pea, Gregory Pottie, Allen
Renear, David Ribes, William Sandoval, Terence Smith, Susan Leigh Star,
Alex Szalay, Charles Taylor, and Sharon Traweek. Students, postdoctoral fellows, and research staff collaborators on these projects include Rebekah
Cummings, Peter Darch, David Fearon, Rich Gazan, Milena Golshan, Eric
Graham, David Gwynn, Greg Janee, Elaine Levia, Rachel Mandell, Matthew
Mayernik, Stasa Milojevic, Alberto Pepe, Elizabeth Rolando, Ashley Sands,
Katie Shilton, Jillian Wallis, and Laura Wynholds.
Most of this book was developed and written during my 2012–2013 sabbatical year at the University of Oxford. My Oxford colleagues were fountains of knowledge and new ideas, gamely responding to my queries of
“what are your data?” Balliol College generously hosted me as the Oliver
Smithies Visiting Fellow and Lecturer, and I concurrently held visiting
scholar posts at the Oxford Internet Institute and the Oxford eResearch
Centre. Conversations at high table and low led to insights that pervade
my thinking about all things data—Buddhism, cosmology, Dante, genomics, chirality, nanotechnology, education, economics, classics, philosophy,
mathematics, medicine, languages and literature, computation, and much
more. The Oxford college system gathers people together around a table
who otherwise might never meet, much less engage in boundary-spanning
inquiry. I am forever grateful to my hosts, Sir Drummond Bone, Master
of Balliol, and Nicola Trott, Senior Tutor; William Dutton of the Oxford
Internet Institute; David de Roure, Oxford eResearch Centre; and Sarah
Thomas, Bodley’s Librarian. My inspiring constant companions at Oxford
included Kofi Agawu, Martin Burton, George and Carmella Edwards,
Panagis Filippakopoulos, Marina Jirotka, Will Jones, Elena Lombardi, Eric
Meyer, Concepcion Naval, Peter and Shirley Northover, Ralph Schroeder,
Anne Trefethen, and Stefano Zacchetti.
Others at Oxford who enlightened my thinking, perhaps more than
they know, include William Barford, Grant Blank, Dame Lynne Brindley,
Roger Cashmore, Sir Iain Chalmers, Carol Clark, Douglas Dupree, Timothy
Endicott, David Erdos, Bertrand Faucheux, James Forder, Brian Foster, John-Paul Ghobrial, Sir Anthony Graham, Leslie Green, Daniel Grimley, Keith
Hannabus, Christopher Hinchcliffe, Wolfram Horstmann, Sunghee Kim,
Donna Kurtz, Will Lanier, Chris Lintott, Paul Luff, Bryan Magee, Helen
Margetts, Philip Marshall, Ashley Nord, Dominic O’Brien, Dermot O’Hare,
Richard Ovenden, Denis Noble, Seamus Perry, Andrew Pontzen, Rachel
Quarrell, David Robey, Anna Sander, Brooke Simmons, Rob Simpson, Jin-Chong Tan, Linnet Taylor, Rosalind Thomas, Nick Trefethen, David Vines,
Lisa Walker, David Wallace, Jamie Warner, Frederick Wilmot-Smith, and
Timothy Wilson.
Very special acknowledgments are due to my colleagues who contributed substantially to the case studies in chapters 5, 6, and 7. The astronomy
case in chapter 5 relies heavily on the contributions of Alyssa Goodman
of the Harvard-Smithsonian Center for Astrophysics and her collaborators, including Alberto Accomazzi, Merce Crosas, Chris Erdmann, Michael
Kurtz, Gus Muench, and Alberto Pepe. It also draws on the research of the
Knowledge Infrastructures research team at UCLA. The case benefited from
multiple readings of drafts by Professor Goodman and reviews by other
astronomers or historians of astronomy, including Alberto Accomazzi,
Chris Lintott, Michael Kurtz, Patrick McCray, and Brooke Simmons. Astronomers George Djorgovski, Phil Marshall, Andrew Pontzen, and Alex Szalay also helped clarify scientific issues. The sensor-networked science and
technology case in chapter 5 draws on prior published work about CENS.
Drafts were reviewed by collaborators and by CENS science and technology researchers, including David Caron, Eric Graham, Thomas Harmon,
Matthew Mayernik, and Jillian Wallis. The first social sciences case in chapter 6, on Internet research, is based on interviews with Oxford Internet
Institute researchers Grant Blank, Corinna di Gennaro, William Dutton,
Eric Meyer, and Ralph Schroeder, all of whom kindly reviewed drafts of the
chapter. The second case, on sociotechnical studies, is based on prior published work with collaborators, as cited, and was reviewed by collaborators
Matthew Mayernik and Jillian Wallis. The humanities case studies in chapter 7 were developed for this book. The CLAROS case is based on interviews
and materials from Donna Kurtz of the University of Oxford, with further
contributions from David Robey and David Shotton. The analysis of the
Pisa Griffin draws on interviews and materials from Peter Northover, also of
Oxford, and additional sources from Anna Contadini of SOAS, London. The
closing case, on Buddhist scholarship, owes everything to the patient tutorial of Stefano Zacchetti, Yehan Numata Professor of Buddhist Studies at
Oxford, who brought me into his sanctum of enlightenment. Humanities
scholars were generous in reviewing chapter 7, including Anna Contadini,
Johanna Drucker, Donna Kurtz, Peter Northover, Todd Presner, Joyce Ray,
and David Robey.
Many others shared their deep expertise on specialized topics. On biomedical matters, these included Jonathan Bard, Martin Burton, Iain
Chalmers, Panagis Filippakopoulos, and Arthur Thomas. Dr. Filippakopoulos
read drafts of several chapters. On Internet technologies and citation mechanisms, these included Geoffrey Bilder, Blaise Cronin, David de Roure, Peter
Fox, Carole Goble, Peter Ingwersen, John Klensin, Carl Lagoze, Salvatore
Mele, Ed Pentz, Herbert van de Sompel, and Yorick Wilks. Chapter 9 was
improved by the comments of Blaise Cronin, Kathleen Fitzpatrick, and John
Klensin. Paul Edwards and Marilyn Raphael were my consultants on climate
modeling. Sections on intellectual property and open access benefited from
discussions with David Erdos, Leslie Green, Peter Hirtle, Peter Murray-Rust,
Pamela Samuelson, Victoria Stodden, and John Wilbanks. Christopher Kelty
helped to clarify my understanding of common-pool resources, building on
other discussions of economics with Paul David, James Forder, and David
Vines. Ideas about knowledge infrastructures were shaped by long-running
discussions with my collaborators Geoffrey Bowker, Paul Edwards, Thomas
Finholt, Steven Jackson, Cory Knobel, and David Ribes. Similarly, ideas about
data policy were shaped by membership on the Board on Research Data and
Information, on CODATA, on the Electronic Privacy Information Center,
and by the insights of Francine Berman, Clifford Lynch, Paul Uhlir, and Marc
Rotenberg. On issues of libraries and archives, I consulted Lynne Brindley,
Johanna Drucker, Anne Gilliland, Margaret Hedstrom, Ann O’Brien, Susan
Parker, Gary Strong, and Sarah Thomas. Jonathan Furner clarified philosophical concepts, building upon what I learned from many Oxford conversations. Will Jones introduced me to the ethical complexities of research
on refugees. Abdelmonem Afifi, Mark Hansen, and Xiao-li Meng improved
my understanding of the statistical risks in data analysis. Clifford Lynch,
Lynne Markus, Matthew Mayernik, Ann O’Brien, Katie Shilton, and Jillian
Wallis read and commented upon large portions of the manuscript, as did
several helpful anonymous reviewers commissioned by Margy Avery of the
MIT Press.
I would be remiss not to acknowledge the invisible work of those who
rarely receive credit in the form of authorship. These include the funding
agencies and program officers who made this work possible. At the National
Science Foundation, Daniel Atkins, Stephen Griffin, and Mimi McClure
have especially nurtured research on data, scholarship, and infrastructure.
Tony Hey and his team at Microsoft Research collaborated, consulted,
and gave monetary gifts at critical junctures. Thanks to Lee Dirks, Susan
Dumais, Catherine Marshall, Catherine van Ingen, Alex Wade, and Curtis
Wong of MSR. Josh Greenberg at the Sloan Foundation has given us funds,
freedom, and guidance in studying knowledge infrastructures. Also invisible are the many people who invited me to give talks from the book-in-progress and those who attended. I am grateful for those rich opportunities

