—  SYMPOSIUM #40  —

Hematopathology: New Technologies
Moderators: Dr. John Wing Chan and Dr. Thomas Grogan

Section 6 - Introduction to Mass Spectrometry-Based Proteomics

Kojo S. J. Elenitoba-Johnson


Introduction
Proteomics is the study of the proteome. The proteome, a term analogous to the genome, is the total protein complement of an organelle, cell, tissue or entire organism [1, 2]. Proteomic studies require the simplification of a complex mixture of proteins into less complex components that are more amenable to analysis. A complex mixture intended for analysis may contain intact proteins that differ in biophysical characteristics such as molecular weight, hydrophobicity, and posttranslational modifications. In top-down proteomics, intact proteins are analyzed. In bottom-up proteomics, the proteins are first proteolytically cleaved into peptides using enzymes with or without cleavage specificity.

The development of sensitive instruments capable of analyzing larger biologic molecules such as proteins has greatly facilitated the analysis of the total complement of proteins in cells and tissues. Mass spectrometers measure the mass of the smallest of molecules with very high accuracy, and hence a mass spectrometer can be regarded as the smallest of weighing scales. In parallel with the technological advancements in mass spectrometers, improvements in ionization methods such as matrix-assisted laser desorption/ionization (MALDI) and electrospray ionization (ESI) have also enhanced the ability to analyze complex biologic molecules by mass spectrometry [3]. The final component that has greatly impacted the ability to conduct proteomic studies is the development of translated genomic databases and specialized software algorithms that rapidly search mass spectrometric data against known or predicted proteins within the databases [3].

In general, mass spectrometry measures peptide masses more accurately than the masses of intact proteins. Thus in bottom-up proteomics, the typical workflow involves initial simplification of a complex protein mixture followed by digestion into peptides, which are subjected to mass spectrometric analysis. The mass spectrometric data are then analyzed using specialized software algorithms that identify the proteins from which the peptide sequences are derived. The ability to accurately determine the mass of a unique peptide that originates from a particular protein greatly facilitates the identification of that protein. In a nutshell, protein identification rests on the fact that a peptide sequence of 6 or more amino acid residues is often sufficient to identify a protein. This is because, assuming the 20 amino acids are equally likely, the probability that any one amino acid occupies a particular position within a peptide sequence is 1/20. For a sequence of 6 amino acid residues, the hypothetical probability is 1 in 20^6 = 1 in 64,000,000. However, in some cases even these odds may be insufficient to unequivocally identify a protein from a single peptide. Identification of longer peptide sequences provides an even greater degree of certainty in the identification of a protein. Thus in many cases, it is possible to utilize database searches to identify a protein from only a few peptides.
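
The arithmetic behind this estimate can be checked with a short sketch. It assumes, as the text does, that each of the 20 amino acids is equally likely at every position, which is a simplification because real proteomes have uneven amino acid frequencies. The function name sequence_odds is illustrative only.

```python
def sequence_odds(length, alphabet_size=20):
    """Number of distinct sequences of a given length.

    Assumes every residue is equally likely at every position
    (a simplification; real amino acid frequencies are uneven).
    """
    return alphabet_size ** length


# Odds of a specific sequence arising by chance, for several lengths.
for n in (4, 6, 8, 10):
    print(f"{n} residues: 1 in {sequence_odds(n):,}")
```

Each additional residue multiplies the number of possible sequences by 20, which is why longer peptides give greater certainty.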

Protein Isolation
Cellular proteins have to be isolated from samples containing other biological molecules including carbohydrates, lipids and nucleic acids. Protein extraction protocols therefore entail the homogenization of cells and tissues followed by the application of several classes of reagents: detergents such as 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate (CHAPS), Tween and sodium dodecyl sulfate (SDS), which help to dissolve the proteins and separate them from the lipid components; reducing agents such as dithiothreitol (DTT), which break disulfide bonds; denaturing agents such as urea, which disrupt the bonds responsible for secondary and tertiary conformational structure; and enzymes such as DNases and RNases, which degrade nucleic acids.

Separation of Complex Proteins into Simpler Components
Several techniques are utilized for the analytical separation of proteins. These include one-dimensional (1D) gel electrophoresis, two-dimensional gel electrophoresis (2D-GE), high performance liquid chromatography (HPLC), ion exchange chromatography and different types of affinity chromatography [4, 5]. Proteins isolated from gels or individual chromatography fractions can be subjected to proteolytic cleavage by enzymes with defined cleavage specificity, such as trypsin (which cleaves at the carboxy-terminal side of lysine or arginine residues), or by enzymes with non-specific cleavage, such as elastase or subtilisin [6, 7, 8].
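
The trypsin cleavage rule described above can be sketched as a simple in silico digestion. This is a minimal illustration that applies only the rule stated in the text (cleavage after every lysine, K, or arginine, R); it ignores missed cleavages and other refinements used by real search software. The function name and the example sequence are hypothetical.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after every K or R (one-letter codes).

    Simplified rule taken from the text; real digestion is incomplete
    and search tools usually allow for missed cleavages.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # C-terminal peptide that does not end in K or R
        peptides.append(sequence[start:])
    return peptides


# Hypothetical sequence, used only to show the cleavage pattern.
print(tryptic_digest("AAKGGRCC"))
```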

Mass Spectrometry for Proteomics
A mass spectrometer is typically composed of three components: an ionization source, a mass analyzer and a detector. The ionization source creates ions from the sample to be analyzed. The mass analyzer resolves the ions by their mass-to-charge ratio (m/z), and the detector registers the resolved ions, yielding a spectrum of signal intensity against m/z from which ion masses are determined. There are two main types of mass spectrometers that are used for proteomics studies. In the first group are the instruments which integrate matrix-assisted laser desorption/ionization (MALDI) with a time-of-flight (TOF) mass analyzer. In the second category, ESI serves as the mechanism for generation of ions which are subsequently analyzed by tandem mass spectrometry (MS/MS) [3].

Ionization
The recent development of so-called "soft" (low energy) ionization techniques such as matrix-assisted laser desorption/ionization (MALDI) [9, 10] and electrospray ionization (ESI) [11] has dramatically enhanced the ability to analyze larger biomolecules in general, and proteins in particular, by mass spectrometry. In MALDI, the sample intended for analysis is incorporated into a chemical matrix containing compounds such as 3,5-dimethoxy-4-hydroxycinnamic acid (sinapinic acid). Irradiation of the target by a laser within the ion source releases peptide/protein ions from the target into the gas phase. More recently, a variation on the MALDI concept has been introduced, namely surface-enhanced laser desorption/ionization (SELDI) [12, 13]. This latter format is embodied in the Ciphergen Chip system and is composed of several chip surfaces which exploit the differing biophysical and chromatographic characteristics of the different proteins for their preferential selection. The different surfaces include, among others, a hydrophobic surface, a strong anion exchange surface, and an immobilized metal ion surface with a strong affinity for phosphorylated proteins.

In contrast to MALDI, wherein ions are generated from a solid matrix, electrospray ionization involves the generation of peptide ions from aqueous solution [11]. The solution containing the sample is passed through a needle subjected to a high voltage. The solution stream is ejected from the needle orifice as a spray of droplets. The solvent is eliminated from the droplets by a heated capillary or an inert gas. Solutions with acidic pH favor protonation of the N-terminal amines and histidine nitrogens, and peptide fragmentation is facilitated when the peptide ions are positively charged. Thus ESI protocols commonly include acidification steps prior to peptide ion analysis in the mass analyzer.
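
Because protonation adds both mass and charge, the same peptide appears at different m/z values depending on how many protons it carries. The following sketch shows the standard conversion between neutral mass and m/z for a positively charged ion; the function names and the 1000 Da example mass are illustrative assumptions, not values from the text.

```python
PROTON_MASS = 1.007276  # mass of a proton in daltons


def mass_to_mz(neutral_mass, charge):
    """m/z of a peptide carrying `charge` extra protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def mz_to_mass(mz, charge):
    """Neutral mass recovered from an observed m/z and charge state."""
    return charge * (mz - PROTON_MASS)


# A hypothetical 1000 Da peptide observed at charge states 1+ to 3+.
for z in (1, 2, 3):
    print(f"{z}+ : m/z = {mass_to_mz(1000.0, z):.4f}")
```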

Mass Analyzers
Mass spectrometers determine molecular mass by analysis of the mass-to-charge (m/z) ratio of an ion. Time-of-flight mass analyzers measure the transit time of the peptide/protein ions from the source end of the analyzer to the detector end. In this scheme, the larger ions traverse the analyzer tube in longer times than the smaller ions. For ions accelerated through the same potential, the flight time increases with the square root of m/z, so the m/z ratio can be calculated from the measured time of flight. MALDI-TOF instruments are capable of good resolution and are popular because they are easy to use and readily adaptable for high-throughput proteomics.
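
The relationship between flight time and m/z can be illustrated for an idealized linear TOF analyzer, in which an ion of charge z accelerated through a potential V gains kinetic energy zeV and then drifts a fixed distance. The tube length (1 m) and accelerating voltage (20 kV) below are assumed round numbers for illustration, not instrument specifications from the text.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton


def flight_time(mz, tube_length_m=1.0, accel_voltage=20000.0):
    """Drift time in seconds for an ion of a given m/z (Da per charge).

    Idealized model: z*e*V = (1/2)*m*v^2, so v = sqrt(2*e*V / (m/z)).
    Tube length and voltage defaults are illustrative assumptions.
    """
    mass_per_charge_kg = mz * DALTON
    velocity = math.sqrt(2 * ELEMENTARY_CHARGE * accel_voltage / mass_per_charge_kg)
    return tube_length_m / velocity


# Quadrupling m/z doubles the flight time (square-root dependence).
for mz in (1000.0, 4000.0):
    print(f"m/z {mz:.0f}: {flight_time(mz) * 1e6:.2f} microseconds")
```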

Electrospray ionization tandem mass spectrometry (ESI-MS/MS) is the other main instrument format for mass spectrometric proteomic analysis. ESI is versatile and can be interfaced with different types of tandem mass analyzers including triple quadrupoles, ion traps and quadrupole time-of-flight (Q-TOF) instruments. Tandem mass analyzers generally incorporate a collision cell wherein an ion species of interest can be selected for subsequent fragmentation by collision-induced dissociation (CID). Mass analysis of the fragmented ions allows for the determination of the peptide sequence.
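
How fragment masses reveal sequence can be sketched with the two fragment series most commonly considered in CID spectra, the N-terminal b ions and the C-terminal y ions (an assumption here, since the text does not name the ion types). Consecutive ions in a series differ by the mass of one residue, so the gaps spell out the sequence. The residue masses are standard monoisotopic values; the peptide GASPVK is hypothetical.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_ions(peptide):
    """Singly charged b- and y-ion m/z values for a peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_ions.append(sum(masses[:i]) + PROTON)           # N-terminal piece
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)  # C-terminal piece
    return b_ions, y_ions


def read_ladder(ions, tol=0.01):
    """Translate gaps between consecutive ions into residue letters."""
    letters = []
    for lower, upper in zip(ions, ions[1:]):
        gap = upper - lower
        match = [aa for aa, m in RESIDUE_MASS.items() if abs(m - gap) <= tol]
        letters.append("/".join(match) if match else "?")
    return "".join(letters)


b, y = fragment_ions("GASPVK")
print(read_ladder(b))  # residues 2 to 5, read from the N-terminus
print(read_ladder(y))  # the same residues, read from the C-terminus
```

Leucine and isoleucine have identical masses, so a ladder read in this way cannot distinguish them and reports both.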

Peptide Mass Fingerprinting and MS/MS
Peptide mass fingerprinting [14, 15] and tandem mass spectrometry are the main techniques for protein identification. Peptide mass fingerprinting is typically coupled with MALDI-TOF mass spectrometry and entails the measurement of the masses of proteolytically cleaved peptides. The measured peptide masses are then matched to the peptide masses predicted from genomic or protein sequence databases. By comparison, MS/MS permits extraction of peptide sequence information from the fragmentation spectra: the pattern of fragmentation reveals the sequence of the peptide, which can then be searched against the databases to find the protein of origin [3].
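
The matching step in peptide mass fingerprinting can be sketched as follows. This toy example digests each database entry in silico, computes the theoretical peptide masses, and counts how many measured masses fall within a tolerance. The two-entry database, the sequences, the tolerance and the simple hit-count score are all hypothetical; the scoring algorithms in real search tools are considerably more elaborate [14].

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565


def tryptic_peptides(protein):
    """Cleave after every K or R (simplified trypsin rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        if residue in "KR":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def rank_proteins(measured, database, tol=0.05):
    """Rank database entries by the number of measured masses they explain."""
    scores = []
    for name, sequence in database.items():
        theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
        hits = sum(any(abs(m - t) <= tol for t in theoretical) for m in measured)
        scores.append((name, hits))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical two-protein database and a simulated measurement.
TOY_DATABASE = {
    "protein_A": "MKWVTFISLLRGAEDK",
    "protein_B": "ASTPLKQNHYRCCFK",
}
measured = [peptide_mass(p) for p in tryptic_peptides(TOY_DATABASE["protein_A"])]
print(rank_proteins(measured, TOY_DATABASE))
```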

Quantitative Proteomics
Most quantitative proteomic studies are relative in that they are designed to determine the proteomic differences between one cellular state and another. In this regard, 2D-GE has been extensively utilized with great success [16]. The isotope-coded affinity tag (ICAT) method has recently been developed and is advantageous in that it permits the evaluation of low-abundance proteins and proteins at both extremes of molecular weight and isoelectric point [17]. In this system, one sample is labeled with a tag containing a light isotope, and the other sample to which it is being compared is labeled with a tag containing a heavy isotope. The two samples are combined, proteolytically digested and analyzed by mass spectrometry. A comparison of the paired light and heavy MS peaks permits relative quantification of a peptide, and MS/MS identifies its protein source. Other stable isotope-based quantitative proteomic approaches are available, such as iTRAQ, which takes advantage of isobaric tags. These approaches generally provide relative protein quantification and are very useful in the identification of differential expression between two samples or, in the case of iTRAQ, more than two.
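
The peak-pair comparison can be sketched as below. The code looks for two peaks separated by the mass difference between heavy and light tags and reports the ratio of their intensities. The 8 Da shift per tag is an assumption chosen for illustration (the text does not give the value), as are the peak list, tolerance and function name; the sketch also assumes one tag per peptide.

```python
def pair_ratios(peaks, mass_shift=8.0, charge=1, tol=0.02):
    """Find light/heavy peak pairs and return heavy-to-light ratios.

    peaks: list of (m/z, intensity) tuples.
    mass_shift: assumed heavy-minus-light tag mass difference in Da.
    Returns a list of (light m/z, heavy m/z, ratio) tuples.
    """
    spacing = mass_shift / charge  # pairs sit closer together at higher charge
    results = []
    for light_mz, light_int in peaks:
        for heavy_mz, heavy_int in peaks:
            if abs(heavy_mz - light_mz - spacing) <= tol and light_int > 0:
                results.append((light_mz, heavy_mz, heavy_int / light_int))
    return results


# Hypothetical peak list: one labeled pair and one unpaired peak.
peaks = [(1000.50, 2000.0), (1008.50, 4000.0), (1200.60, 500.0)]
for light, heavy, ratio in pair_ratios(peaks):
    print(f"light {light:.2f} / heavy {heavy:.2f}: ratio {ratio:.2f}")
```

A ratio above 1 indicates that the peptide is more abundant in the sample carrying the heavy tag.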

Conclusion
The recent advances in protein separation techniques and mass spectrometry, together with the completion of the genome sequences of several organisms, are all critical developments that jointly facilitate proteomic studies. In the near future, proteomic studies are expected to contribute to the discovery of novel biomarkers and targets for the practical diagnosis and treatment of various diseases including malignancies.

References
  1. Wasinger VC, Cordwell SJ, Cerpa-Poljak A, Yan JX, Gooley AA, Wilkins MR, Duncan MW, Harris R, Williams KL, Humphery-Smith I: Progress with gene-product mapping of the Mollicutes: Mycoplasma genitalium. Electrophoresis 1995, 16:1090-1094

  2. Blackstock WP, Weir MP: Proteomics: quantitative and physical mapping of cellular proteins. Trends Biotechnol 1999, 17:121-127

  3. Yates JR, 3rd: Mass spectrometry and the age of the proteome. J Mass Spectrom 1998, 33:1-19

  4. Peng J, Elias JE, Thoreen CC, Licklider LJ, Gygi SP: Evaluation of multidimensional chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS) for large-scale protein analysis: the yeast proteome. J Proteome Res 2003, 2:43-50

  5. Gygi SP, Corthals GL, Zhang Y, Rochon Y, Aebersold R: Evaluation of two-dimensional gel electrophoresis-based proteome analysis technology. Proc Natl Acad Sci U S A 2000, 97:9390-9395

  6. Jensen ON, Wilm M, Shevchenko A, Mann M: Sample preparation methods for mass spectrometric peptide mapping directly from 2-DE gels. Methods Mol Biol 1999, 112:513-530

  7. MacCoss MJ, McDonald WH, Saraf A, Sadygov R, Clark JM, Tasto JJ, Gould KL, Wolters D, Washburn M, Weiss A, Clark JI, Yates JR, 3rd: Shotgun identification of protein modifications from protein complexes and lens tissue. Proc Natl Acad Sci U S A 2002, 99:7900-7905

  8. Jonscher K, Currie G, McCormack AL, Yates JR, 3rd: Matrix-assisted laser desorption of peptides and proteins on a quadrupole ion trap mass spectrometer. Rapid Commun Mass Spectrom 1993, 7:20-26

  9. Tanaka K, Waki H, Ido Y, Akita S, Yoshida Y, Yoshida T: Protein and polymer analysis up to m/z 100,000 by laser ionization time of flight mass spectrometry. Rapid Commun Mass Spectrom 1988, 2:151

  10. Karas M, Hillenkamp F: Laser desorption ionization of proteins with molecular masses exceeding 10,000 daltons. Anal Chem 1988, 60:2299-2301

  11. Fenn JB, Mann M, Meng CK, Wong SF, Whitehouse CM: Electrospray ionization for mass spectrometry of large biomolecules. Science 1989, 246:64-71

  12. Kuwata H, Yip TT, Yip CL, Tomita M, Hutchens TW: Bactericidal domain of lactoferrin: detection, quantitation, and characterization of lactoferricin in serum by SELDI affinity mass spectrometry. Biochem Biophys Res Commun 1998, 245:764-773

  13. Merchant M, Weinberger SR: Recent advancements in surface-enhanced laser desorption/ionization-time of flight-mass spectrometry. Electrophoresis 2000, 21:1164-1177

  14. Gras R, Muller M, Gasteiger E, Gay S, Binz PA, Bienvenut W, Hoogland C, Sanchez JC, Bairoch A, Hochstrasser DF, Appel RD: Improving protein identification from peptide mass fingerprinting through a parameterized multi-level scoring algorithm and an optimized peak detection. Electrophoresis 1999, 20:3535-3550

  15. Jensen ON, Podtelejnikov AV, Mann M: Identification of the components of simple protein mixtures by high-accuracy peptide mass mapping and database searching. Anal Chem 1997, 69:4741-4750

  16. Rabilloud T: Detecting proteins separated by 2-D gel electrophoresis. Anal Chem 2000, 72:48A-55A

  17. Gygi SP, Rist B, Gerber SA, Turecek F, Gelb MH, Aebersold R: Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat Biotechnol 1999, 17:994-999