MEDICAL
INFORMATICS
Knowledge Management
and Data Mining in
Biomedicine
INTEGRATED SERIES IN INFORMATION SYSTEMS
Series Editors
Professor Ramesh Sharda, Oklahoma State University
Prof. Dr. Stefan Voß, Universität Hamburg
Other published titles in the series:
E-BUSINESS MANAGEMENT: Integration of Web Technologies with Business Models / Michael J. Shaw
VIRTUAL CORPORATE UNIVERSITIES: A Matrix of Knowledge and Learning for the New Digital Dawn / Walter R.J. Baets & Gert Van der Linden
SCALABLE ENTERPRISE SYSTEMS: An Introduction to Recent Advances / edited by Vittal Prabhu, Soundar Kumara, Manjunath Kamath
LEGAL PROGRAMMING: Legal Compliance for RFID and Software Agent Ecosystems in Retail Processes and Beyond / Brian Subirana and Malcolm Bain
LOGICAL DATA MODELING: What It Is and How To Do It / Alan Chmura and J. Mark Heumann
DESIGNING AND EVALUATING E-MANAGEMENT DECISION TOOLS: The Integration of Decision and Negotiation Models into Internet-Multimedia Technologies / Giampiero E.G. Beroggi
INFORMATION AND MANAGEMENT SYSTEMS FOR PRODUCT CUSTOMIZATION / Blecker, Friedrich, Kaluza, Abdelkafi & Kreutler
MEDICAL
INFORMATICS
Knowledge Management
and Data Mining in
Biomedicine
edited by
Hsinchun Chen
Sherrilynne S. Fuller
Carol Friedman
William Hersh
Springer
Hsinchun Chen, The University of Arizona, USA
Sherrilynne S. Fuller, University of Washington, USA
Carol Friedman, Columbia University, USA
William Hersh, Oregon Health & Science Univ., USA
Library of Congress Cataloging-in-Publication Data
A C.I.P. Catalogue record for this book is available
from the Library of Congress.
ISBN-10: 0-387-24381-X (HB)
ISBN-10: 0-387-25739-X (e-book)
ISBN-13: 978-0387-24381-8 (HB)
ISBN-13: 978-0387-25739-6 (e-book)
© 2005 by Springer Science+Business Media, Inc.
All rights reserved. This work may not be translated or copied in whole or in
part without the written permission of the publisher (Springer Science+Business
Media, Inc., 233 Spring Street, New York, NY 10013, USA), except
for brief excerpts in connection with reviews or scholarly analysis. Use in
connection with any form of information storage and retrieval, electronic
adaptation, computer software, or by similar or dissimilar methodology now
known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks and
similar terms, even if they are not identified as such, is not to be taken as an
expression of opinion as to whether or not they are subject to proprietary rights.
Printed in the United States of America.
9 8 7 6 5 4 3 2 1
SPIN 11055556
TABLE OF CONTENTS
Editors' Biographies xix
Authors' Biographies xxiii
Preface xxxix
UNIT I: Foundational Topics in Medical Informatics
Chapter 1: Knowledge Management, Data Mining, and Text Mining in Medical Informatics 3
1. Introduction 5
2. Knowledge Management, Data Mining, and Text Mining: An Overview 6
2.1 Machine Learning and Data Analysis Paradigms 7
2.2 Evaluation Methodologies 11
3. Knowledge Management, Data Mining, and Text Mining Applications in Biomedicine 12
3.1 Ontologies 13
3.2 Knowledge Management 14
3.3 Data Mining and Text Mining 18
3.4 Ethical and Legal Issues for Data Mining 22
4. Summary 22
References 23
Suggested Readings 31
Online Resources 31
Questions for Discussion 33
Chapter 2: Mapping Medical Informatics Research 35
1. Introduction 37
2. Knowledge Mapping: Literature Review 37
3. Research Design 39
3.1 Basic Analysis 39
3.2 Content Map Analysis 40
3.3 Citation Analysis 41
4. Data Description 42
5. Results 44
5.1 Basic Analysis 44
5.2 Content Map Analysis 47
5.3 Citation Network Analysis 55
6. Conclusion and Discussion 57
7. Acknowledgement 58
References 58
Suggested Readings 60
Online Resources 61
Questions for Discussion 61
Chapter 3: Bioinformatics Challenges and Opportunities 63
1. Introduction 65
2. Overview of the Field 69
2.1 Definition of Bioinformatics 69
2.2 Opportunities and Challenges - Informatics Perspective 70
2.3 Opportunities and Challenges - Biological Perspective 79
3. Case Study 83
3.1 Informatics Perspective - The BIOINFOMED Study and Genomic Medicine 83
3.2 Biological Perspective - The BioResearch Liaison Program at the University of Washington 85
4. Conclusions and Discussion 89
5. Acknowledgements 91
References 91
Suggested Readings 92
Online Resources 93
Questions for Discussion 93
Chapter 4: Managing Information Security and Privacy in Health Care Data Mining: State of the Art 95
1. Introduction 97
2. Overview of Health Information Privacy and Security 98
2.1 Privacy and Healthcare Information 99
2.2 Security and Healthcare Information 99
3. Review of the Literature: Data Mining and Privacy and Security 109
3.1 General Approaches to Assuring Appropriate Use 110
3.2 Specific Approaches to Achieving Data Anonymity 112
3.3 Other Issues in Emerging "Privacy Technology" 116
3.4 "Value Sensitive Design": A Synthetic Approach to Technological Development 117
3.5 Responsibility of Medical Investigators 119
4. Case Study: The Terrorist Information Awareness Program (TIA) 121
4.1 The Relevance of TIA to Data Mining in Medical Research 121
4.2 Understanding TIA 122
4.3 Controversy 124
4.4 Lessons Learned from TIA's Experience for Medical Investigators Using "Datamining" Technologies 128
5. Conclusions and Discussion 129
6. Acknowledgements 131
References 131
Suggested Readings 134
Online Resources 135
Questions for Discussion 137
Chapter 5: Ethical and Social Challenges of Electronic Health Information 139
1. Introduction 141
2. Overview of the Field 142
2.1 Electronic Health Records 142
2.2 Clinical Alerts and Decision Support 146
2.3 Internet-based Consumer Health Information 150
2.4 Evidence-based Medicine, Outcome Measures, and Practice Guidelines 152
2.5 Data Mining 153
References 156
Suggested Readings 157
Online Resources 157
Questions for Discussion 158
UNIT II: Information and Knowledge Management
Chapter 6: Medical Concept Representation 163
1. Introduction 165
1.1 Use-cases 165
2. Context 168
2.1 Concept Characteristics 169
2.2 Domains 170
2.3 Structure 171
3. Biomedical Concept Collections 172
3.1 Ontologies 172
3.2 Vocabularies and Terminologies 174
3.3 Aggregation and Classification 175
3.4 Thesauri and Mappings 176
4. Standards and Semantic Interoperability 177
5. Acknowledgements 178
References 178
Suggested Readings 180
Online Resources 181
Questions for Discussion 181
Chapter 7: Characterizing Biomedical Concept Relationships: Concept Relationships as a Pathway for Knowledge Creation and Discovery 183
1. Introduction 185
2. Background and Overview: The Use of Concept Relationships for Knowledge Creation 188
2.1 Indexing Strategies and Vocabulary Systems 190
2.2 Integrating Document Structure in Systems 192
2.3 Text Mining Approaches 194
2.4 Literature-based Discovery IR Systems 195
2.5 Summary 198
3. Case Examples 198
3.1 Genescene 199
3.2 Telemakus 200
3.3 How Can a Concept Relationship System Help with the Researcher's Problem and Questions? 202
3.4 Summary 206
4. Conclusions and Discussion 206
5. Acknowledgements 207
References 207
Suggested Readings 209
Online Resources 210
Questions for Discussion 210
Chapter 8: Biomedical Ontologies 211
1. Introduction 213
2. Representation of the Biomedical Domain in General Ontologies 215
2.1 OpenCyc 215
2.2 WordNet 215
3. Examples of Medical Ontologies 217
3.1 GALEN 217
3.2 Unified Medical Language System 219
3.3 The Systematized Nomenclature of Medicine 220
3.4 Foundational Model of Anatomy 222
3.5 MENELAS ontology 223
4. Representations of the Concept Blood 224
4.1 Blood in Biomedical Ontologies 225
4.2 Differing Representations 227
4.3 Additional Knowledge 229
5. Issues in Aligning and Creating Biomedical Ontologies 230
6. Conclusion 231
7. Acknowledgments 232
References 232
Suggested Readings 234
Online Resources 234
Questions for Discussion 235
Appendix: Table showing characteristics of selected ontologies 235
Chapter 9: Information Retrieval and Digital Libraries 237
1. Overview of Fields 239
2. Information Retrieval 241
2.1 Content 242
2.2 Indexing 247
2.3 Retrieval 254
2.4 Evaluation 257
2.5 Research Directions 261
3. Digital Libraries 262
3.1 Access 262
3.2 Interoperability 263
3.3 Preservation 263
4. Case Studies 264
4.1 PubMed 264
4.2 User-oriented Evaluation 265
4.3 Changes in Publishing 267
5. Acknowledgements 269
References 269
Suggested Readings 273
Online Resources 274
Questions for Discussion 275
Chapter 10: Modeling Text Retrieval in Biomedicine 277
1. Introduction 279
2. Literature Review 280
3. An Ideal Model 282
4. General Text Retrieval 284
4.1 Vector Models 284
4.2 Language Models 286
5. Example Text Retrieval Systems Specialized to a Biological Domain 288
5.1 Telemakus 289
5.2 XplorMed 290
5.3 AI3View:HivResist 291
5.4 The Future 292
References 294
Suggested Readings 295
Online Resources 296
Questions for Discussion 296
Chapter 11: Public Access to Anatomic Images 299
1. Introduction 301
2. Background 303
2.1 Previous Work 303
2.2 Prologue: Database Design 305
3. The AnatQuest System 308
3.1 Need for Public Access 308
3.2 AnatQuest: Design Considerations 309
3.3 AnatQuest for Onsite Visitors 315
4. Next Steps 316
4.1 Increasing Content 316
4.2 Linking Text Resources to Image Database 318
4.3 Implemented Prototype: MedlinePlus Proxy Server 328
5. Summary 330
6. Acknowledgements 330
References 330
Suggested Readings 331
Online Resources 332
Questions for Discussion 332
Chapter 12: 3D Medical Informatics: Information Science in Multiple Dimensions 333
1. Introduction 335
2. Overview: 3D Medical Informatics 337
2.1 From Data to Knowledge 339
2.2 History 340
2.3 Why Study 3D Medical Informatics? 342
3. Example: 3D Models and Measurement of Neuroanatomy across Subjects 344
3.1 Indexing Images with 3D Medical Informatics 345
3.2 Generalizing Elastic Deformable Models to 3D 346
4. Surgical Templates: A Case Study in 3D Informatics 348
4.1 Background and Related Work 348
4.2 Design and Software Tools for Template Planning Workstation 349
4.3 Results and Discussion 350
5. Grand Challenges in 3D Medical Informatics 353
6. Conclusion 354
7. Acknowledgements 355
References 355
Suggested Readings 356
Online Resources 357
Questions for Discussion 357
Chapter 13: Infectious Disease Informatics and Outbreak Detection 359
1. Introduction 361
2. Infectious Disease Informatics: Background and Overview 362
2.1 Practical Challenges and Research Issues 362
2.2 Infectious Disease Informatics Research Framework 365
2.3 Infectious Disease Information Sharing Infrastructure 367
2.4 Infectious Disease Data Analysis and Outbreak Detection 372
3. Infectious Disease Information Infrastructure and Outbreak Detection: Case Studies 378
3.1 New York State's Health Information Network System 378
3.2 The BioPortal System 379
3.3 West Nile Virus Outbreak Analysis 386
4. Conclusions and Discussion 388
5. Acknowledgements 391
References 391
Suggested Readings 394
Online Resources 394
Questions for Discussion 394
UNIT III: Text Mining and Data Mining
Chapter 14: Semantic Interpretation for the Biomedical Research Literature 399
1. Introduction 401
2. Natural Language Processing 401
2.1 Overview 401
2.2 Levels of Linguistic Structure 402
3. Domain Knowledge: The UMLS 403
3.1 SPECIALIST Lexicon 404
3.2 Metathesaurus 404
3.3 Semantic Network 405
4. Semantic Interpretation for the Biomedical Literature 406
4.1 Overview 406
4.2 AQUA 407
4.3 PROTEUS-BIO 408
4.4 SemRep 409
4.5 Comparison of AQUA, PROTEUS-BIO, and SemRep 414
5. Application of SemRep 414
5.1 Automatic Summarization 414
5.2 Information Extraction in Molecular Genetics 417
6. Conclusion 419
References 420
Suggested Readings 421
Online Resources 422
Questions for Discussion 422
Chapter 15: Semantic Text Parsing for Patient Records 423
1. Introduction 425
2. Overview 427
2.1 Challenges of Processing Clinical Reports 427
2.2 Components of an NLP System 431
2.3 Clinical Applications 437
3. Case Scenario 439
4. Conclusions and Discussion 443
5. Acknowledgements 443
References 444
Suggested Readings 446
Online Resources 447
Questions for Discussion 447
Chapter 16: Identification of Biological Relationships from Text Documents 449
1. Introduction 451
2. Overview of the Field 453
2.1 Background 453
2.2 Biological Information Extraction 453
2.3 Bioinformatics Tools 456
3. Case Studies 457
3.1 Identification of Flat Relationships from Text Documents 457
3.2 TransMiner: Formulating Novel, Implicit Associations through Transitive Closure 461
3.3 Identification of Directional and Hierarchical Relationships 466
4. BioMap: A Knowledge Base of Biological Literature 477
4.1 BioMap Knowledgebase 480
4.2 Results and Discussions 482
5. Conclusions 484
6. Acknowledgements 484
References 485
Suggested Readings 487
Online Resources 488
Questions for Discussion 488
Chapter 17: Creating, Modeling and Visualizing Metabolic Networks: FCModeler and PathBinder for Network Modeling and Creation 491
1. Introduction 493
2. Overview 494
2.1 Metabolic Pathway Databases 494
2.2 Network Modeling and Reconstruction 494
2.3 Extracting Biological Interactions from Text 495
3. Metnet 498
3.1 Metabolic Networking Data Base (MetNetDB) 498
3.2 FCModeler: Network Visualization and Modeling 499
3.3 Network Validation Using Fuzzy Metrics 502
3.4 PathBinderA: Finding Sentences with Biomolecular Interactions 504
4. Building on Metabolic Networks: Using MetNet 508
4.1 Construct the Genetic Network Using Time Correlation 508
4.2 Cluster and Network Validation 509
5. Discussion 514
6. Acknowledgements 514
References 515
Suggested Readings 517
Online Resources 517
Questions for Discussion 518
Chapter 18: Gene Pathway Text Mining and Visualization 519
1. Introduction 521
2. Literature Review/Overview 521
2.1 Text Mining 521
2.2 Visualization 525
3. Case Studies/Examples 526
3.1 Arizona Relation Parser 527
3.2 Genescene Parser 534
3.3 Genescene Visualizer 538
4. Conclusions and Discussion 541
5. Acknowledgements 542
References 542
Suggested Readings 544
Online Resources 545
Questions for Discussion 545
Chapter 19: The Genomic Data Mine 547
1. Introduction 549
2. Overview 550
2.1 Genomic Text Data 551
2.2 Genomic Map Data 556
2.3 Genomic Sequence Data 557
2.4 Genomic Expression Data 559
3. Case Study: The Gene Expression Omnibus 561
4. Conclusions and Discussion 562
References 564
Suggested Readings 569
Online Resources 569
Questions for Discussion 571
Chapter 20: Exploratory Genomic Data Analysis 573
1. Introduction 575
2. Overview 576
2.1 Gene Expression Data 576
2.2 Mixed Populations 577
2.3 Methods for Mixed Populations 579
2.4 Distance 582
2.5 Hypothesis Selection 584
3. Case Studies 586
4. Conclusions 589
References 590
Suggested Readings 590
Online Resources 590
Questions for Discussion 591
Chapter 21: Joint Learning Using Multiple Types of Data and Knowledge 593
1. Introduction 595
2. Overview of the Field 597
2.1 Large-scale Biological Data and Knowledge Resources 597
2.2 Joint Learning Using Multiple Types of Data 599
2.3 Joint Learning Using Data and Knowledge 602
3. Kernel-based Data Fusion of Multiple Types of Data 604
3.1 Protein Function Prediction 604
3.2 Kernel-based Protein Function Prediction 604
4. Learning Regulatory Networks Using Microarray and Existing Knowledge 608
4.1 Learning Regulatory Networks Using Microarray 608
4.2 Joint Learning Using Known Genetic Interactions 611
5. Conclusions and Discussion 617
6. Acknowledgements 618
References 618
Suggested Readings 621
Online Resources 622
Questions for Discussion 624
Author Index 625
Subject Index 627
EDITORS' BIOGRAPHIES
Hsinchun Chen is the McClelland Professor of
Management Information Systems (MIS) at the Eller
College of the University of Arizona. He received his
Ph.D. degree in Information Systems from New York
University. He is the author of more than nine books
and 200 articles covering medical informatics,
knowledge management, homeland security, semantic
retrieval, and Web computing in leading information
technology publications. He serves on the editorial
boards of Journal of the American Society for Information Science and
Technology, ACM Transactions on Information Systems, IEEE Transactions
on Systems, Man, and Cybernetics, IEEE Transactions on Intelligent
Transportation Systems, and Decision Support Systems. He is a scientific
counselor/advisor of the Lister Hill Center of the National Library of
Medicine (NLM/USA) and the National Library of China. Dr. Chen is the
director of the University of Arizona's Artificial Intelligence Lab (40+
researchers). Since 1990, Dr. Chen has received more than $17M in research
funding from various government agencies and major corporations. He has
been a PI of the NSF Digital Library Initiative Program and the NIH/NLM's
Biomedical Informatics Program. His group has developed advanced
medical digital library, data mining, and text mining techniques for gene
pathway and disease informatics analysis and visualization. Dr. Chen's
work also has been recognized by major US corporations and been awarded
numerous industry awards including: AT&T Foundation Award in Science
and Engineering, SAP Award in Research/Applications, and Andersen
Consulting Professor of the Year Award. Dr. Chen has been heavily
involved in fostering digital library, medical informatics, knowledge
management, and intelligence informatics research and education in the US
and internationally. Dr. Chen was conference co-chair of ACM/IEEE Joint
Conference on Digital Libraries (JCDL) 2004 and has served as the
conference/program co-chair for the past seven International Conferences of
Asian Digital Libraries (ICADL). Dr. Chen is also conference co-chair of the
IEEE International Conference on Intelligence and Security Informatics (ISI)
2003, 2004, and 2005. He has been a frequent advisor for major US and
international research programs.
Sherrilynne Fuller currently serves as Director, Health Sciences
Libraries and Information Center, University of Washington. Her other
responsibilities at the University of Washington include: Director,
National Network of Libraries of Medicine, Pacific Northwest Region, and
Assistant Director of Libraries. She is Professor, Division of Biomedical
and Health Informatics, Department of Medical Education and Biomedical
Informatics, School of Medicine; Professor, Information School and
Adjunct Professor, Department of Health Services, School of Public Health
and Community Medicine. Dr. Fuller has a BA degree in Biology, a Master's
in Library Science from Indiana University, and a Ph.D. in Library and
Information Science from the University of Southern California. Dr. Fuller's
areas of research include: developing new approaches to represent and map
the results of scientific research; design and evaluation of information
systems to support decision making at the place and time of need; and
integrated health sciences information systems design.
Dr. Fuller serves as Principal Investigator of the Health Sciences
Libraries and Information Center contract from the NLM to serve as the
Regional Medical Library for the Pacific Northwest (Alaska, Idaho,
Montana, Washington and Oregon);
Principal Investigator, Telemakus:
Mining and Mapping Research Findings to Promote Knowledge Discovery
in Aging funded by the Ellison Medical Foundation; Co-Investigator of
Biomedical Applications of the Next Generation Internet (NGI): Patient-
centric Tools for Regional Collaborative Cancer Care Using the NGI funded
by the National Library of Medicine; Co-Investigator of an International
Health and Biomedical Research and Training grant from the Fogarty
International Center; and advisor to a Health Resources and Services
Administration (HRSA) grant to explore models of Faculty Leadership in
Interprofessional Education to Promote Patient Safety.
Dr. Fuller has served as a member of the President's (White House)
Information Technology Advisory Committee and the Board of Regents of
the National Library of Medicine and on the Boards of the American
Medical Informatics Association and the Medical Library Association. She
is an elected fellow of the American College of Medical Informatics.
(URL: http://faculty.washington.edu/sfuller/)
Dr. Carol Friedman is a Professor of Biomedical
Informatics at Columbia University. Dr. Friedman has a
B.S. degree in mathematics from the City University of
New York, an M.A. degree and a Ph.D. in Computer
Science from New York University. She has been
involved in NLP research for several decades, starting
with the pioneering Linguistic String Project. Dr.
Friedman's other areas of research include knowledge
representation, database design, object-oriented design,
and information visualization. Initially her research focused on the clinical
domain, and use of NLP for clinical applications. In the last few years she
has been involved in research in the biological domain as well. She is known
for the development of the MedLEE NLP system, which extracts and
encodes information occurring in clinical reports. It is being used
operationally at Columbia University Medical Center, where it has been
shown to improve patient care. She is involved in development of two other
NLP systems based on adaptations of MedLEE, GENIES and BioMedLEE,
which process scientific text in the biological domain. Dr. Friedman has
received more than $10M in research funding from various corporations and
government agencies including the National Library of Medicine, The
National Science Foundation, the New York State Office of Science and
Technology, and the Research Foundation of the City University of New
York. Dr. Friedman is involved in advancing biomedical informatics, text
mining, and knowledge management research and education. Dr. Friedman
was a conference co-chair for the Natural Language Track of the 2002
Pacific Symposium in Bioinformatics, the Workshop in Biomedicine in the
2002 and 2003 Association for Computational Linguistics Conferences, and
the 2004 BioLink Workshop in the Human Language Technology Conference
of the North American Chapter of the Association for Computational
Linguistics. Dr. Friedman is a member of the Board of Scientific Counselors
of the National Library of Medicine, is a member-at-large of the Executive
Board of the American College of Medical Informatics, is on the Editorial
Boards of the Journal of Biomedical Informatics and the Journal of the
Association of Medical Informatics, and is a reviewer for numerous journals
associated with bioinformatics. Dr. Friedman has been a guest editor of
special issues of the Journal of Biomedical Informatics, has published over
100 articles on NLP, has co-authored a book on natural language processing
(NLP), and is the author of various chapters on NLP.
William Hersh, M.D. is Professor and Chair of the Department of Medical
Informatics & Clinical Epidemiology in the School of Medicine at
Oregon Health & Science University
(OHSU) in Portland, Oregon. He also has academic
appointments in the Division of General Internal
Medicine of the Department of Medicine and in the
Department of Public Health and Preventive
Medicine. Dr. Hersh obtained his B.S. in Biology
from the University of Illinois at Champaign-Urbana in 1980 and his M.D.
from the University of Illinois at Chicago in 1984. After finishing his
residency in Internal Medicine at University of Illinois Hospital in Chicago
in 1987, he completed a Fellowship in Medical Informatics at Harvard
University in 1990. Dr. Hersh has been at OHSU since 1990, where he has
developed research and educational programs in medical informatics. He is
internationally recognized for his contributions to the field. He is a Fellow
of the American College of Medical Informatics and of the American
College of Physicians. Dr. Hersh also recently served as Secretary of the
American Medical Informatics Association. He is currently co-chair of the
Working Group on Education of the International Medical Informatics
Association. Dr. Hersh's research focuses on the development and
evaluation of information retrieval systems for biomedical practitioners and
researchers.
The majority of his research funding comes from the National
Library of Medicine, the Agency for Healthcare Research and Quality, and
the National Science Foundation. He has published over 100 scientific
papers and is author of the book, Information Retrieval: A Health and
Biomedical Perspective (Second Edition, Springer-Verlag, 2003), which has
an associated Web site, www.irbook.info. Dr. Hersh has also served on the
Editorial Board of five scientific journals. He is also a member of the
program committee of the Text Retrieval Conference (TREC) and currently
chairs TREC's Genomics Track. Dr. Hersh also serves as Associate Director
of the OHSU Evidence-Based Practice Center funded by the Agency for
Healthcare Research and Quality. Dr. Hersh's work in medical informatics
education is equally well-known. He serves as Director of OHSU's
educational programs in biomedical informatics. He also teaches medical
informatics to medical students, nursing students, and internal medicine
residents.
AUTHORS' BIOGRAPHIES
Daniel Berleant, PhD, received the B.S. degree in
1982. After practicing in the software engineering field,
he received the MS (1990) and PhD (1991) degrees from
the University of Texas at Austin. He then developed a
research program in text mining and interaction and in
inference under severe uncertainty. In 1999 he accepted a
position at Iowa State University where he continues to
pursue research on text mining and text interaction, as
well as uncertainty quantification and software
engineering. He has advised or co-advised 24 master's theses and six PhD
students who have either graduated or are in progress. He has authored over
50 refereed papers and book chapters.
Dr. Olivier Bodenreider, MD, PhD, is a Staff
Scientist in the Cognitive Science Branch of the Lister
Hill National Center for Biomedical Communications at
the National Library of Medicine. He obtained the MD
degree from the University of Strasbourg, France in 1990
and a PhD in Medical Informatics from the University of
Nancy, France in 1993. Before joining NLM, he was an
assistant professor for Biostatistics and Medical
Informatics at the University of Nancy, France, Medical
School. His research interests include terminology, knowledge
representation, and ontology in the biomedical domain, both from a
theoretical perspective and in their application to natural language
understanding, reasoning, information visualization, and interoperability.
Dr. Anita Burgun, MD, PhD, is an associate
professor at the University of Rennes I, School of
Medicine (France), where she conducts research on
knowledge representation and ontology in the
biomedical domain. She is involved in several projects
addressing semantic heterogeneity issues for
information integration.
(Email: anita.burgun-parenthoine@univ-rennes1.fr)
Michael Chau, PhD, is currently Research Assistant
Professor in the School of Business at the University of
Hong Kong. He received his PhD degree in
Management Information Systems from the University
of Arizona and a Bachelor's degree in Computer Science
(Information Systems) from the University of Hong
Kong. He was an active researcher in the Artificial
Intelligence Lab at the University of Arizona, where he
participated in several research projects funded by NSF,
NIH, NIJ, and DARPA.
Christopher G. Chute, MD, DrPH, received his
undergraduate and medical training at Brown University,
internal medicine residency at Dartmouth, and doctoral
training in Epidemiology at Harvard. He is Board
Certified in Internal Medicine, and a Fellow of the
American College of Physicians, the American College
of Epidemiology, and the American College of Medical
Informatics. He became Head of the Section of Medical
Information Resources at Mayo Foundation in 1988 and
is now Professor and Chair of Biomedical Informatics. Dr. Chute is a career
scientist at Mayo, and his NIH and AHCPR/AHRQ funded research in medical
concept representation, clinical information retrieval, and patient data
repositories has been widely published. He is Vice-chair of the ANSI
Health Information Standards Board, Convener of Healthcare Concept
Representation WG3 within the ISO Health Informatics Technical
Committee, chair-elect of the US delegation to ISO TC215 for Health
Informatics, co-chair of the HL7 Terminology Committee and a past
member of the NIH Medical Informatics Study Section. He has chaired
International Medical Informatics Association WG6 on Medical Concept
Representation since 1994.
(URL: http://mayoresearch.mayo.edu/mayo/research/staff/chute_cg.cfm)
Jeff Collmann, PhD, Associate Professor,
Department of Radiology, Georgetown University,
obtained his PhD in Social Anthropology from the
University of Adelaide, Adelaide, South Australia. He
completed a Postdoctoral Fellowship in Clinical Medical
Ethics, Department of Philosophy, University of
Tennessee and worked as a health care administrator. He
joined the Department of Radiology, Georgetown University in January 1992
where he now conducts research, writes, consults and lectures widely
on organizational dimensions of health information assurance. He serves as
a medical ethicist for the Telemedicine and Advanced Technology Research
Center, US Army Medical Research and Materiel Command, Ft. Detrick,
Maryland.
He functions as an advisor to the HIPAA compliance effort of
the Department of Defense and the US Air Force Surgeon General. He also
teaches courses at Georgetown University in the anthropology of medicine,
science and technology and Australian culture.
Ted Cooper, MD, Clinical Associate Professor,
Department of Ophthalmology, Stanford University,
received his MD and completed a residency in
ophthalmology at the George Washington University.
He is a fellow of the American College of Medical
Informatics and the American Academy of
Ophthalmology. As National Director for
Confidentiality and Security he helped guide Kaiser
Permanente's response to HIPAA. He has lectured
widely on information assurance. He has participated in a number of health
informatics activities including director and chairperson of the Computer-
based Patient Record Institute. He is currently the chairperson of the Health
Information and Systems Society Privacy and Security Steering Committee,
a member of the Health Information and Systems Society Electronic Health
Record Steering Committee, and chairperson of The CPRI Toolkit:
Managing Information Security in Health Care Work Group.
Julie Dickerson, PhD, received her B.S. degree
from the University of California, San Diego and her
MS and PhD degrees from the University of Southern
California. She is currently an Associate Professor of
Electrical and Computer Engineering at Iowa State
University. Dr. Dickerson designed radar systems for
Hughes Aircraft Company and Martin Marietta while
getting her PhD. Her current research activities include
intelligent systems, bioinformatics, pattern recognition,
and data visualization. She is a Carver Fellow in the Virtual Reality
Applications Center and a member of the Baker Center for Bioinformatics in
the Plant Sciences Institute at Iowa State University.
Jing Ding is currently a PhD candidate in the
Department of Electrical and Computer Engineering and
the interdepartmental program of Bioinformatics and
Computational Biology at Iowa State University. He
received an MS (Computer Engineering) in 2003 and an
MS (Toxicology) in 2000, both from Iowa State
University. His research interests are in text mining and knowledge
representation.
Pan Du received the BS and MS degrees in Electrical
Engineering from National University of Defense
Technology, Changsha, China, in 1995 and 1998,
respectively. He is currently a PhD student co-majoring in
Electrical Engineering and in Bioinformatics and
Computational Biology in the Department of
Electrical and Computer Engineering, Iowa State
University. His research interests include systems biology, genetic network
modeling and inference, microarray data analysis, signal processing and
pattern recognition.
Shauna Eggers is a computer programmer at the
University of Arizona Artificial Intelligence Lab. She
earned a B.S. in Computer Science and a B.A. in
Linguistics and German Studies from the University of
Arizona in May 2004. Her research interests include
natural language processing for biomedical applications
and knowledge visualization.
Millicent Eidson, MA, DVM, DACVPM
(Epidemiology) is State Public Health Veterinarian and
Director of the Zoonoses Program, New York State
Department of Health. She is also an Associate
Professor in the Department of Epidemiology,
University at Albany School of Public Health. Dr.
Eidson previously served as an Epidemic Intelligence
Service (EIS) Officer with the Centers for Disease
Control and Prevention based at the National Cancer