Introduction to Proteomics
William P. Bennett
City of Hope National Medical Center
Surgical pathology is constantly challenged to render accurate diagnoses at earlier stages of
disease, and to work with ever smaller tissue samples. To vault these hurdles, pathologists have adopted
new technologies into their diagnostic criteria and algorithms. Notable examples include electron
microscopy, immunohistochemistry, and the polymerase chain reaction (PCR). In recent years, the Human
Genome Project has given rise to the new field of genomics, which provides unprecedented volumes of new
information to be mined, and sophisticated tools to be applied to diagnostic problems. But assimilating
genomics into clinical practice is no simple task, because the genome is vast and still mysterious.
Here's some of what we know: the human genome has a little more than 3 billion nucleotides, probably
25,000 to 35,000 genes that encode proteins, tens of millions of repeat sequences (microsatellites,
minisatellites, etc.), and several million single nucleotide polymorphisms, so far. But the sum of the
parts we understand is less than half of the whole. Many questions remain: What are the functions of
the unclassified genomic sequences? Are they "junk DNA"? How does damage to the genetic sequence promote
neoplastic transformation? How do methylation, acetylation and other epigenetic features fit into the
picture? Could quantitative measurements of methylation and/or allelic deletion help to diagnose
borderline malignancy or predict clinical course?
Although PCR-based assays already play important roles in surgical pathology, the complexity of the
genome has led to the suggestion that the proteome might be more understandable. At least we can determine
the structures of proteins, develop antibodies to peptides, and understand some of their functions.
Furthermore, most tumor markers are proteins, and immunohistochemistry is a traditional method in
diagnostic pathology. New technologies have brought new tools, and the genome sequence has opened new
lines of research. For example, by determining the amino acid sequence of even a short peptide, one can
back-translate the peptide into candidate nucleotide sequences and identify the gene by searching sequence databases.
When combined with high throughput instruments, this approach has produced complex protein "fingerprints"
of serum samples and raised the possibility of a "blood test" for cancer. As usual, there are major
challenges and questions. For example, a single gene may produce many proteins through alternative
splicing and post-translational modifications such as glycosylation, acetylation, and phosphorylation.
How much of this information will be needed for a specific diagnostic application? Then there is the
challenge of dynamic range: protein concentrations within a single sample may span a million-fold
range or more. What technology can measure all proteins within a sample, if the concentrations
vary so much? None of these questions have been answered, but there is hope that proteomics will provide
new tools for diagnosing disease at early stages and for improving the classification and prognostication
of advanced disease.
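
The peptide-to-gene idea mentioned above can be illustrated with a minimal sketch. It assumes the standard genetic code and a toy "database" of two invented DNA sequences (geneA and geneB are hypothetical, as are the function names). Because the genetic code is degenerate, a short peptide corresponds to many possible nucleotide sequences, so the sketch counts those candidates and then finds the gene by translating each database entry in three forward reading frames and looking for the peptide. Real searches use dedicated alignment tools and far larger databases; this only shows the principle.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
CODONS_PER_AA = Counter(CODON_TABLE.values())


def translate(dna, frame=0):
    """Translate DNA to one-letter amino acids, starting at offset `frame`."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # "X" marks an unrecognized codon
        for i in range(frame, len(dna) - 2, 3)
    )


def count_encodings(peptide):
    """Number of distinct nucleotide sequences that could encode the peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide.upper())


def find_genes(peptide, database):
    """Names of entries whose translation (any forward frame) contains the peptide."""
    return [
        name
        for name, dna in database.items()
        if any(peptide.upper() in translate(dna, f) for f in range(3))
    ]


# Toy database: invented sequences, for illustration only.
TOY_DATABASE = {
    "geneA": "ATGTGGAAATAA",
    "geneB": "CCATGCTTTCTCGTGA",
}

if __name__ == "__main__":
    peptide = "LSR"
    print(count_encodings(peptide), "possible coding sequences")
    print("matches:", find_genes(peptide, TOY_DATABASE))
```

Even a three-residue peptide such as Leu-Ser-Arg has hundreds of possible coding sequences, which is why the search is easier to run on translated sequences than on back-translated ones.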