Monday, March 28, 2011
Network-Based Pipeline for Analyzing MS Data: An Application toward Liver Cancer
Current limitations in proteome analysis by high-throughput mass spectrometry (MS) approaches have sometimes led to incomplete (or inconclusive) data sets being published or unpublished. In this work, we used an iTRAQ reference data on hepatocellular carcinoma (HCC) to design a two-stage functional analysis pipeline to widen and improve the proteome coverage and, subsequently, to unveil the molecular changes that occur during HCC progression in human tumorous tissue. The first involved functional cluster analysis by incorporating an expansion step on a cleaned integrated network. The second used an in-house developed pathway database where recovery of shared neighbors was followed by pathway enrichment analysis. In the original MS data set, over 500 proteins were detected from the tumors of 12 male patients, but in this paper we reported an additional 1000 proteins after application of our bioinformatics pipeline. Through an integrative effort of network cleaning, community finding methods, and network analysis, we also uncovered several biologically interesting clusters implicated in HCC. We established that HCC transition from a moderate to poor stage involved densely connected clusters that comprised of PCNA, XRCC5, XRCC6, PARP1, PRKDC, and WRN. From our pathway enrichment analyses, it appeared that the HCC moderate stage, unlike the poor stage, is enriched in proteins involved in immune responses, thus suggesting the acquisition of immuno-evasion. Our strategy illustrates how an original oncoproteome could be expanded to one of a larger dynamic range where current technology limitations prevent/limit comprehensive proteome characterization.