In bottom-up proteomics, the analytes introduced into the mass spectrometer are peptides generated by enzymatic cleavage of one or many proteins. The proteins can first be separated by gel electrophoresis or chromatography, in which case the sample will contain only one or a few proteins. Alternatively, a complex protein mixture can first be digested to the peptide level and then separated by on-line chromatography coupled to electrospray ionization mass spectrometry (ESI–MS). In the latter case, the digest can contain thousands to hundreds of thousands of peptides and can require separation in two or more chromatographic dimensions before MS analysis. The identity of the original protein is determined by comparing the peptide mass spectra with theoretical peptide masses calculated from a proteomic or genomic database. There are two approaches to bottom-up protein identification: peptide mass fingerprinting and tandem MS (MS–MS).
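The comparison of observed peptide masses with theoretical masses can be illustrated with a minimal Python sketch of peptide mass fingerprinting. It assumes trypsin as the enzyme (cleavage after K or R, but not before P), no missed cleavages, and no modifications. The function names, the 10 ppm tolerance, and the example sequence are hypothetical choices made for illustration; they are not taken from any particular search engine.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # added once per peptide for the free N- and C-termini


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def match_masses(observed, peptides, ppm=10.0):
    """Pair each observed neutral mass with peptides within a ppm tolerance."""
    matches = []
    for obs in observed:
        for pep in peptides:
            theo = peptide_mass(pep)
            if abs(obs - theo) / theo * 1e6 <= ppm:
                matches.append((obs, pep))
    return matches


if __name__ == "__main__":
    digest = tryptic_digest("AKRPGKC")  # hypothetical toy sequence
    print(digest)
    print(match_masses([217.1427], digest))
```

In a real search, every protein in the database would be digested in this way, and candidates would be ranked by how many observed masses they explain.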
In top-down proteomics, intact protein molecular ions generated by ESI or MALDI are introduced into the mass analyzer and subjected to gas-phase fragmentation. An obstacle to this approach is the determination of product ion masses from multiply charged product ions (1). These can vary in charge state up to that of the multiply charged protein precursor ion, which can introduce ambiguity in the interpretation of top-down MS–MS spectra. Two approaches have been used to circumvent this limitation. The first is charge state manipulation through gas-phase ion–ion interactions, and the second is the use of instruments with high mass measurement accuracy (MMA). The latter provides an approach for large-scale characterization of proteins; both types of Fourier transform MS (FTMS) instruments, ICR and Orbitrap, have been used for this methodology. The molecular mass of the intact precursor protein, together with fragment ions from MS–MS experiments, enables high-confidence mapping to database protein entries as well as PTM detection.
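One way to see why high resolution and mass accuracy help with multiply charged ions: when the isotopic peaks of an ion are resolved, adjacent peaks are spaced by roughly 1.00335/z in m/z, so the charge z can be read from the spacing and the neutral mass follows from m/z and z. The Python sketch below illustrates that arithmetic only. It assumes positive-mode protonated ions and a clean, resolved isotope cluster; the function names and the example peak list are hypothetical.

```python
PROTON = 1.007276    # mass of a proton (Da)
C13_C12 = 1.003355   # mass difference between 13C and 12C (Da)


def charge_from_isotope_spacing(mz_peaks):
    """Estimate charge from the mean spacing of resolved isotope peaks."""
    peaks = sorted(mz_peaks)
    spacings = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    return round(C13_C12 / mean_spacing)


def neutral_mass(mz, charge):
    """Neutral mass of an [M + zH]z+ ion observed at the given m/z."""
    return charge * (mz - PROTON)


if __name__ == "__main__":
    cluster = [1000.0000, 1000.1003, 1000.2007]  # hypothetical isotope cluster
    z = charge_from_isotope_spacing(cluster)
    print(z, neutral_mass(cluster[0], z))
```

At low resolution the isotope peaks merge and the spacing cannot be measured, which is the ambiguity described above.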
Advantages and limitations of bottom-up strategies: Bottom-up proteomics is the most mature and most widely used approach for protein identification and characterization. Reversed-phase HPLC provides high-resolution separations of peptide digests with solvents that are compatible with ESI. On-line nano-scale reversed-phase LC–ESI–MS–MS can be fully automated and is almost universally used for bottom-up proteomics. Commercial instruments with control software and bioinformatics tools optimized for bottom-up applications are available from several vendors. The bottom-up strategy using on-line multidimensional capillary HPLC–MS–MS has been most successful in the identification of proteins in digests derived from very complex mixtures such as cell lysates (6). Moreover, quantitative techniques have been developed using affinity tags and stable isotope labels for determination of up- and down-regulated proteins in expression proteomics (7).
There are several fundamental and practical limitations to the bottom-up strategy.
Most importantly, only a fraction of the total peptide population of a given protein is identified. Therefore, information on only a portion of the protein sequence is obtained. It is clear from genomic studies that each open reading frame can give rise to many protein isoforms, which can originate from alternative splicing products and varying types and locations of posttranslational modifications (PTMs). PTMs such as phosphorylation and glycosylation are known to be important in the regulation of protein function and cell metabolism. A consequence of the limited sequence coverage in bottom-up proteomics is loss of much information about PTMs. Moreover, PTMs are often labile in the CID process and require techniques such as neutral loss scanning to detect them.
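As an illustration of the neutral-loss idea, a phosphopeptide that loses phosphoric acid (H3PO4, 97.9769 Da) during CID gives a product ion shifted below the precursor by 97.9769/z in m/z. The Python sketch below checks a peak list for that shift. It assumes the loss leaves the charge unchanged and that the spectrum is a simple list of (m/z, intensity) pairs; the function names, the 0.02 m/z tolerance, and the example spectrum are hypothetical.

```python
H3PO4 = 97.976896  # monoisotopic mass of phosphoric acid (Da)


def expected_neutral_loss_mz(precursor_mz, charge, loss_mass=H3PO4):
    """m/z at which the precursor appears after losing a neutral fragment."""
    return precursor_mz - loss_mass / charge


def has_neutral_loss(peaks, precursor_mz, charge, tol=0.02, loss_mass=H3PO4):
    """True if any (m/z, intensity) peak lies within tol of the expected m/z."""
    target = expected_neutral_loss_mz(precursor_mz, charge, loss_mass)
    return any(abs(mz - target) <= tol for mz, _intensity in peaks)


if __name__ == "__main__":
    spectrum = [(300.20, 20.0), (651.01, 100.0)]  # hypothetical MS-MS peaks
    print(expected_neutral_loss_mz(700.0, 2))
    print(has_neutral_loss(spectrum, precursor_mz=700.0, charge=2))
```

A match of this kind only flags a candidate phosphopeptide; localizing the modification site still requires sequence-informative fragments.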
Practical limitations are encountered when bottom-up methods are used for protein identification from very complex peptide mixtures. On-line multidimensional LC–MS–MS analyses using ion-exchange columns coupled to reversed-phase columns require extended run times of 15 h or more. Although such analyses can be automated, the throughput of multidimensional LC–MS–MS is quite limited. Other problems include the loss of information about low-abundance peptides in mass spectra dominated by high-abundance species. Finally, narrow chromatographic peaks leave little time to acquire adequate MS–MS information while a peptide is eluting.
Advantages and limitations of top-down strategies: The two major advantages of the top-down strategy are the potential access to the complete protein sequence and the ability to locate and characterize PTMs. In addition, the time-consuming protein digestion required for bottom-up methods is eliminated.
Top-down proteomics is a relatively young field compared to bottom-up proteomics, and currently suffers from several limitations. First, the very complex spectra generated by multiply charged proteins limit the approach to isolated proteins, or simple protein mixtures at best. Second, the favored instruments (FT-ICR, hybrid ion trap–FT-ICR, or hybrid ion trap–Orbitrap) are expensive to purchase and operate. Third, the top-down approach does not work well with intact proteins larger than about 50 kDa. Fourth, the favored dissociation techniques (ECD, ETD) are low-efficiency processes requiring long ion accumulation, activation, and detection times. This limits the ability to couple top-down MS techniques with on-line separations. Fifth, the mechanisms of protein dissociation are less well understood than those of peptide dissociation. If top-down approaches are to be adopted widely, a greater understanding of the fragmentation of multiply charged ions is needed (1), including the influence of precursor ion charge state, the role of protein primary, secondary, and tertiary structure, and the contribution of PTMs. Finally, bioinformatics tools for top-down proteomics are primitive compared to those for bottom-up proteomics.
Thermo Scientific ProSightPC, the first stand-alone software for analyzing top-down proteomics data, has been enhanced to add support for middle-down and bottom-up experiments, making it an all-around tool for identification and characterization of both intact proteins and peptides.
ProSightPC 2.0 software enables high-throughput processing of all accurate-mass MS/MS data, whether from top-down, middle-down, or bottom-up experiments, including the characterization of proteins with known PTMs. ProSightPC 2.0 software uses multiple search modes to determine the exact protein sequence, including modifications and alternative splicing. It is the only proteomics software that allows users to search their tandem MS data against proteome warehouses containing the known biological complexity present in UniProt.