Bioinformatics: Definitions,
Challenges and Impact on Health
Care Systems
Joyce Mitchell, PhD
Professor and Chair
Department of Biomedical Informatics
University of Utah School of Medicine
Topics
1. What is Bioinformatics?
2. Scope of Bioinformatics
a) Genomics
b) Proteomics
c) Functional genomics
3. Genomics data and patient care
4. Impact of Bioinformatics on Health
Information Systems
5. What is coming?
Central Dogma of Molecular Biology
DNA → RNA → Protein → Phenotype
Replication: DNA → DNA
Transcription: DNA → RNA
Translation: RNA → Protein
Post-Translational Modification: Protein → modified Protein
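The information flow of the central dogma can be sketched in a few lines of Python. This is a minimal illustration, not a production tool: the function names (`replicate`, `transcribe`, `translate`) and the example sequence are made up for this sketch, the codon table is the standard genetic code, and post-translational modification is not modeled.

```python
from itertools import product

# Standard genetic code, built from the conventional U/C/A/G codon ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def replicate(strand):
    """Replication: return the complementary DNA strand, read 5' to 3'."""
    complement = str.maketrans("ACGT", "TGCA")
    return strand.upper().translate(complement)[::-1]


def transcribe(coding_strand):
    """Transcription: the mRNA matches the coding strand, with U in place of T."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna):
    """Translation: read codons from the first AUG until a stop codon ('*')."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCTTTTAA"  # hypothetical sequence, not a real gene
mrna = transcribe(dna)
print(mrna)             # AUGGCCUUUUAA
print(translate(mrna))  # MAF
```

The sketch assumes clean input containing only A, C, G, and T; ambiguous bases such as N would need extra handling.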
What is Bioinformatics?
Definitions…
NIH Definition
Bioinformatics applies principles of
information sciences and technologies
to make the vast, diverse, and complex
life sciences data more understandable
and useful.
NIH Definition cont…
Bioinformatics:
Research, development,
or application of computational tools and
approaches for expanding the use of
biological, medical, behavioral or health
data, including those to acquire, store,
organize, archive, analyze, or visualize
such data.
Another…
NCBI (National Center for Biotechnology Information)
Bioinformatics is the field of science in which
biology, computer science, and information
technology merge into a single discipline. The
ultimate goal of the field is to enable the
discovery of new biological insights and to create
a global perspective from which unifying
principles in biology can be discerned.
Bioinformatics & Health Informatics
Bioinformatics is the study of the flow of
information in biological sciences.
Health Informatics is the study of the flow of
information in patient care.
These two fields are on a collision course as
genomics data becomes used in patient care.
Russ Altman, MD, PhD, Stanford Univ.
Scope of Bioinformatics
OMES and OMICS
Genomics
Primarily sequences (DNA and RNA)
Databanks and search algorithms
Supports studies of molecular evolution (“Tree wars”)
Proteomics
Sequences (Protein) and structures
Mass spectrometry, X-ray crystallography
Databanks, knowledge bases, visualization
Functional Genomics (transcriptomics)
Microarray data (and SNP Chips)
Databanks, analysis tools, controlled terminologies
Genetic Epidemiology – finding gene-disease associations
Linkage studies
GWAS (genome-wide association studies)
Systems Biology (metabolomics)
Metabolites and interacting systems (interactomics)
Graphs, visualization, modeling, networks of entities
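As an illustration of the gene-disease association testing mentioned under Genetic Epidemiology, the sketch below runs a basic allelic chi-square test on a 2x2 table of allele counts in cases and controls. The function name and the counts are hypothetical, and real GWAS pipelines add quality control and multiple-testing correction across many SNPs; this shows only the core statistic for a single SNP.

```python
import math


def allelic_chi_square(case_risk, case_other, control_risk, control_other):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value). No continuity correction is applied.
    """
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical allele counts for one SNP (not real study data).
chi2, p = allelic_chi_square(case_risk=60, case_other=40,
                             control_risk=40, control_other=60)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```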
Central Dogma of Molecular Biology
DNA → RNA → Protein → Phenotype
DNA: Structural Genomics
RNA: Functional Genomics (Transcriptomics)
Protein: Proteomics
Phenotype: Phenomics
Human Genome Project
Human Genome Project - International
research effort
Determine sequence of human genome and
other model organisms
Began 1990, completed 2003
We are now in the “Post-Genomic” era
Next steps for ~20,000 genes
Function and regulation of all genes
Significance of variations between people
Cures, therapies, “genomic healthcare”
Genome and Genomics
Genome – entire complement of DNA in a species
Both nuclear and mitochondrial/chloroplast
Variants among individuals
Genomics – study of the sequence, structure and
function of the genome. Study relationships among
sets of genes rather than single genes.
Comparative genomics – study of the differences
among species. Usually covers evolutionary studies
of differences & conservation over time.
Genome Databases (e.g.,
GenBank)
Consists of
long strings of DNA bases – ATCG…
Annotations of this database to attach
meaning to the sequence data.
Example entry from GenBank:
…er.fcgi?val=NM_000410&dopt=gb
Hemochromatosis gene HFE
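The idea of sequence data plus annotations can be pictured as a simple record type. The sketch below is an assumption-laden illustration, not the GenBank file format: the class names, fields, and the placeholder sequence are invented here, and only the accession and gene name come from the example entry above.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """An annotation attached to a region of the sequence."""
    kind: str        # e.g. "gene" or "CDS"
    start: int       # 1-based, inclusive
    end: int         # 1-based, inclusive
    qualifiers: dict = field(default_factory=dict)


@dataclass
class SequenceRecord:
    """A sequence together with the annotations that give it meaning."""
    accession: str
    description: str
    sequence: str
    features: list = field(default_factory=list)

    def extract(self, feature):
        """Return the bases covered by a feature."""
        return self.sequence[feature.start - 1:feature.end]


record = SequenceRecord(
    accession="NM_000410",
    description="Hemochromatosis gene HFE",
    sequence="ACGTACGTACGTACGTACGT",  # placeholder bases, NOT the real HFE sequence
)
cds = Feature(kind="CDS", start=5, end=10, qualifiers={"gene": "HFE"})
record.features.append(cds)
print(record.extract(cds))  # ACGTAC
```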
The Genome Sequence
is at hand…so?
“The good news is that we have the human genome.
The bad news is it’s just a parts list”
“The Human Genome Project has
catalyzed striking paradigm changes
in biology - biology is now an
information science.”
Leroy Hood, MD, PhD
Institute for Systems Biology
Seattle, Washington
Genomes In Public Databases
                                         12/4/01   10/3/02   8/28/03      5/07
Published complete genomes:                   72       104       156      ~500
Ongoing prokaryotic genome projects:         255       316       386     ~1500
Ongoing eukaryotic genomes:                  158       218       246      ~700
Total:                                                                    2700
Genomics activities
Sequence the genes and chromosomes – done by breaking the DNA into parts
Map the location of various gene entities to establish their order
Compare the sequences with other known sequences to determine similarity
Across species, conserved sequence “motifs”
Predict secondary structure of proteins
Create large databases – GenBank, EMBL, DDBJ
Develop algorithms and similarity measures
BLAST and its many forms
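To make "similarity measures" concrete, here is a minimal sketch of a Smith-Waterman local alignment score. It is not BLAST, which is a faster heuristic built for searching large databases; this exact dynamic-programming version only shows what a local similarity score is. The scoring values (match 2, mismatch -1, gap -2) and the example sequences are arbitrary choices for the sketch.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score between sequences a and b.

    Uses a linear gap penalty and keeps only one previous row of the matrix.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best


# Two made-up sequences that share the conserved stretch "ACGT".
print(local_alignment_score("TTACGTTT", "GGACGTGG"))  # 8
```

A higher score means a longer or cleaner shared region; unrelated sequences score at or near zero.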
Central Dogma of Molecular Biology
DNA → RNA → Protein → Phenotype
DNA: Genomics
RNA: Transcriptomics (Functional Genomics)
Protein: Proteomics
Proteome vs Transcriptome
Functional genomics (transcriptomics) looks
at the timing and regulation of gene products
(mRNA, primarily)
The proteome is the final end product (the set of many or
all proteins).
Relationship between transcriptome and
proteome is complex, due to longevity of
mRNA signal, subsequent control of
translation to protein, and post translational
modifications.
Functional Genomics
Technologies:
Gene Chips, Microarrays, etc
Functional Genomics –
Microarrays
Transcriptome and transcriptomics
High throughput technique designed to
measure the relative abundance of mRNA
in a cell or tissue in response to an
experiment.
Also called gene expression analysis
(and multiple kinds of microarrays)
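Relative mRNA abundance is commonly summarized as a log2 ratio between an experimental sample and a control. The sketch below assumes already-normalized intensity values per gene; the gene names, intensities, pseudocount, and the 2-fold threshold are all hypothetical choices for illustration.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """log2(treated / control) for each gene present in both samples.

    The pseudocount avoids division by zero for genes with no signal.
    """
    return {
        gene: math.log2((treated[gene] + pseudocount) / (control[gene] + pseudocount))
        for gene in treated
        if gene in control
    }


def classify(log_ratio, threshold=1.0):
    """Label a gene as up- or down-regulated at a 2-fold (log2 = 1) threshold."""
    if log_ratio >= threshold:
        return "up"
    if log_ratio <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical intensities (not real measurements).
treated = {"geneA": 799.0, "geneB": 99.0, "geneC": 49.0}
control = {"geneA": 99.0, "geneB": 399.0, "geneC": 49.0}

for gene, ratio in log2_fold_change(treated, control).items():
    print(gene, round(ratio, 2), classify(ratio))
```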
Gene Chips (SNP-Chips)
High throughput technique designed to
measure whether or not a sample of
tissue has various SNPs in its DNA.
The Gene Chip has small segments of
DNA on the chip with known variants
(SNPs). ~ 1 million SNPs per chip
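One simplified way to picture how a SNP chip reports a genotype: each SNP has a probe for the reference allele and a probe for the variant allele, and the call depends on which probe gives the stronger signal. The sketch below is a toy model under that assumption; the function name, cutoffs, and signal values are hypothetical and do not describe any vendor's calling algorithm.

```python
def call_genotype(ref_signal, var_signal, ref_allele, var_allele,
                  min_total=100.0, cutoff=0.8):
    """Call a genotype at one SNP from two probe intensities.

    Returns e.g. "AA", "AG", "GG", or "NC" (no call) when the signal is too weak.
    """
    total = ref_signal + var_signal
    if total < min_total:
        return "NC"
    ref_fraction = ref_signal / total
    if ref_fraction >= cutoff:
        return ref_allele * 2            # homozygous reference
    if ref_fraction <= 1 - cutoff:
        return var_allele * 2            # homozygous variant
    return ref_allele + var_allele       # heterozygous


# Hypothetical probe intensities for three made-up SNPs: (ref, variant).
sample = {
    "snp_1": (900.0, 100.0),
    "snp_2": (500.0, 500.0),
    "snp_3": (100.0, 900.0),
}
for snp_id, (ref, var) in sample.items():
    print(snp_id, call_genotype(ref, var, ref_allele="A", var_allele="G"))
```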
GeneChip synthesis