Synopsis

Subject Categories: Bioinformatics | Proteomics

Molecular Systems Biology 3 Article number: 89  doi:10.1038/msb4100134
Published online: 13 March 2007
Citation: Molecular Systems Biology 3:89

Large-scale mapping of human protein–protein interactions by mass spectrometry

Rob M Ewing1,2, Peter Chu1,a, Fred Elisma3, Hongyan Li1,a, Paul Taylor1,a, Shane Climie1,a, Linda McBroom-Cerajewski1,a, Mark D Robinson1,a, Liam O'Connor1,a, Michael Li1,a, Rod Taylor1, Moyez Dharsee1,2, Yuen Ho1,a, Adrian Heilbut1,a, Lynda Moore1,a, Shudong Zhang1, Olga Ornatsky1,a, Yury V Bukhman1,a, Martin Ethier3, Yinglun Sheng3, Julian Vasilescu3, Mohamed Abu-Farha3, Jean-Philippe Lambert3, Henry S Duewel1,a, Ian I Stewart1,2, Bonnie Kuehl1,a, Kelly Hogue1,a, Karen Colwill1,a, Katharine Gladwish1, Brenda Muskat1,a, Robert Kinach1,a, Sally-Lin Adams1,a, Michael F Moran1,a, Gregg B Morin1,a, Thodoros Topaloglou1,4 & Daniel Figeys1,3

  1. Protana (now Transition Therapeutics), Toronto, Ontario, Canada
  2. Infochromics, MaRS Discovery District, Toronto, Ontario, Canada
  3. Faculty of Medicine, The Ottawa Institute of Systems Biology, University of Ottawa, BMI, Ottawa, Ontario, Canada
  4. Information Engineering Center, Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, Ontario, Canada

Correspondence to: Daniel Figeys1,3, The Ottawa Institute of Systems Biology, University of Ottawa, BMI, 451 Smyth Road, Ottawa, Ontario, Canada K1H 8M5. Tel.: +1 613 562 5800 ext 8674; Fax: +1 613 562 5655; E-mail: dfigeys@uottawa.ca

Received 22 September 2006; Accepted 26 January 2007; Published online 13 March 2007

aPresent address: Faculty of Health Sciences, McMaster University, Hamilton, Ontario, Canada

aPresent address: Department of Biology, York University, Toronto, Ontario, Canada

aPresent address: Hospital for Sick Children and McLaughlin Centre for Molecular Medicine, and Department of Medical Genetics and Microbiology, University of Toronto, Toronto, Ontario, Canada

aPresent address: Popper and Company LLC, Sarasota, FL, USA

aPresent address: Structural Genomics Consortium, University of Toronto, Toronto, Ontario, Canada

aPresent address: Genetics and Bioinformatics, Walter and Eliza Hall Institute of Medical Research (WEHI), Parkville, Victoria, Australia

aPresent address: Novartis Institutes for Biomedical Research, Cambridge, MA, USA

aPresent address: Platform Computing, Markham, Ontario, Canada

aPresent address: Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario, Canada

aPresent address: CombinatoRx Inc, Cambridge, MA, USA

aPresent address: Michael Smith Genome Sciences Centre, BC Cancer Agency Genome Sciences Centre, Vancouver, British Columbia, Canada

aPresent address: Institute of Biomaterials and Biomedical Engineering, University of Toronto, Toronto, Ontario, Canada

aPresent address: Campbell Family Institute for Breast Cancer Research, University Health Network, Toronto, Ontario, Canada

aPresent address: Sigma-Aldrich Corporation, St Louis, MO, USA

aPresent address: Scientific Insights Consulting Group Inc., Mississauga, Ontario, Canada

aPresent address: Advanced Protein Technology Centre, Hospital for Sick Children, Toronto, Ontario, Canada

aPresent address: Samuel Lunenfeld Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada

aPresent address: MDS Pharma Services, Mississauga, Ontario, Canada

aPresent address: Division of Haematology/Oncology, Hospital for Sick Children, Toronto, Ontario, Canada

Article highlights

  • We present a dataset of 6486 interactions between 2371 distinct proteins from a large-scale application of immunoprecipitation and high-throughput mass spectrometry (IP-HTMS) to 338 human bait proteins expressed in human cells.
  • The dataset is cross-validated using previously published and predicted human protein interactions. In-depth mining shows that the dataset is a valuable source of novel protein–protein interactions with relevance to human diseases, and our analysis reveals many novel pathway associations.
  • Each protein interaction in the dataset is accompanied by a confidence score that is derived by combining several experimental and protein identification analysis metrics.

Synopsis

Understanding the roles and consequences of protein–protein interactions is a fundamental goal in cellular biology and a prerequisite for the development of molecular systems biology. The endeavor of cataloging protein interactions is primarily hindered by the limited throughput and reproducibility of existing technologies. Several techniques for mapping protein interactions are available, such as the two-hybrid approach (Chien et al, 1991) and the LUMIER approach (Barrios-Rodiles et al, 2005), which assay whether two proteins interact in a pair-wise fashion. We have developed a high-throughput platform combining immunoprecipitation and high-throughput mass spectrometry (IP-HTMS) to rapidly identify potentially novel protein interactions for a bait protein of interest. We (Ho et al, 2002) and others (Gavin et al, 2002) previously used this approach to map protein–protein interactions in yeast, creating invaluable data sets for yeast biology and extrapolation into mammalian biology.

Mapping protein interactions in human cells has its own set of challenges owing to the number of potentially expressed genes, the number of different cell types, and the number of internal and external factors that impact the cellular system. Although a complete mapping of the human interactome is still beyond current capabilities, more focused studies are possible. Here we report the first large-scale application of IP-HTMS to the mapping of protein–protein interactions in human cells using 338 human bait proteins of significant biomedical interest. The complete data set is available from the IntAct database (http://www.ebi.ac.uk/intact/site/) (accession EBI-1059370) or as a table of bait–prey pairs with associated confidence values (Supplementary Table II).

There has been much focus and discussion over the last few years on the quality and reproducibility of interactions in high-throughput protein–protein interaction datasets (e.g. von Mering et al, 2002). A guiding principle in our study has therefore been to implement stringent quality controls. The final data set includes protein interactions for 338 human bait proteins (Supplementary Table I). For over half of these baits, two or more replicate immunoprecipitation experiments were performed, requiring a total of 1034 individual immunoprecipitation experiments with associated SDS–PAGE. These experiments yielded over 16 000 gel bands for which over 400 000 MS/MS spectra were assigned peptide sequences. Approximately one-fifth of our immunoprecipitation experiments were control (no-bait) experiments, allowing us to build a comprehensive list of spurious and ubiquitously binding proteins that could then be filtered out of the interaction network. Another one-fifth of the experiments were directed towards a study of the reproducibility of prey protein identification using our platform. These 202 immunoprecipitation experiments, derived from 18 baits, were used to train a statistical model that associates interaction reproducibility with various observed experimental parameters, such as the number of peptides identified for the given prey protein. This model was used to assign a confidence value (between 0 and 1) to each of the 6486 interactions in the data set.
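The two quality-control steps described above (filtering out preys seen in no-bait controls, then scoring the remaining interactions from experimental parameters) can be sketched as follows. This is a minimal illustration only: the text does not state the form of the statistical model, so the logistic function, its coefficients, the control-frequency cutoff, and all function and protein names used here are assumptions, not the authors' method.

```python
import math
from collections import Counter


def ubiquitous_preys(control_runs, min_fraction=0.1):
    """Return preys seen in at least min_fraction of control (no-bait) runs.

    The cutoff is a hypothetical choice for illustration.
    """
    counts = Counter(prey for run in control_runs for prey in set(run))
    cutoff = min_fraction * len(control_runs)
    return {prey for prey, n in counts.items() if n >= cutoff}


def confidence(peptide_count, intercept=-2.0, slope=0.5):
    """Map a peptide count to a score strictly between 0 and 1.

    Assumed logistic form with made-up coefficients; a real model would be
    fitted to replicate experiments and could use several parameters.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * peptide_count)))


def score_interactions(observations, control_runs):
    """Drop background preys, then score each remaining bait-prey pair.

    observations: iterable of (bait, prey, peptide_count) tuples.
    """
    background = ubiquitous_preys(control_runs)
    return {
        (bait, prey): confidence(n_peptides)
        for bait, prey, n_peptides in observations
        if prey not in background
    }


if __name__ == "__main__":
    controls = [["HSP70", "ACTB"], ["HSP70"], ["TUBB"]]
    observed = [("PSMA1", "PSMB2", 8), ("PSMA1", "HSP70", 20)]
    for pair, score in score_interactions(observed, controls).items():
        print(pair, round(score, 3))
```

In this sketch a prey with many peptides can still be discarded if it appears in the controls, which mirrors the order of operations described in the text: background filtering first, confidence scoring second.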

As the interaction confidence score is calculated solely from IP-HTMS experimental parameters, an initial focus was to confirm that the confidence score was an accurate means of ranking the interactions for further study. We observed that known interactions in the data set have, on average, significantly higher interaction confidence scores. For example, the set of baits corresponding to core and regulatory components of the proteasome enabled reconstruction of a proteasome interaction network (Figure 6C), comprising many known proteasome components and enriched for high-scoring interactions.

Figure 6C

(C–F) Complete interaction networks (representing both baits and preys) for selected groups of baits. Nodes are colored according to cellular component or biological process as indicated on each figure. Baits are shown as large, labeled oval shapes, preys as small, labeled oval shapes. Arrow direction indicates a bait–prey relationship and line thickness indicates the interaction confidence score (see legend in panel C). Preys are grouped according to the baits with which they were identified (except panel E where they are grouped according to interaction confidence score). (C) Proteasome baits (corresponds to bait–bait cluster B (panel iv)). (D) Sumoylation pathway (corresponds to bait–bait cluster B (panel vi)). (E) Nek6. (F) Translation initiation and elongation (corresponds to bait–bait cluster B (panel iii)).

We also integrated the IP-HTMS data set with several other genomic-scale data sets, including other protein–protein interaction data sets, gene co-expression data, and annotations from the Gene Ontology project. In the latter case, we analyzed the frequency of co-occurrence of both bait and prey protein in the same biological process or cellular component category (Figure 3). We find that there is significant enrichment of bait–prey pairs sharing the same annotation category, indicating a strong tendency for bait proteins to bind prey proteins with related functions. Integration with gene co-expression data showed that interaction data sets, this one included, are enriched for gene pairs that are co-expressed. This enabled identification of tightly clustered sets of protein interactors that are also co-expressed at the mRNA level. For example, the LYAR bait protein (Ly1 antibody reactive clone) is a nucleolar protein of unknown function (Su et al, 1993). This bait identified a set of nucleolar-localized prey proteins that are also very tightly co-expressed (Figure 5). These results, along with the other protein–protein interaction data sources, provided a powerful means of cross-validating the human IP-HTMS data set and associated methodology.

Figure 3

GO coincidence maps. Coincidence maps showing enrichment of bait–prey GO category combinations. Each bait–prey category combination is represented by a square in the matrix and colored according to the P-value from a pairwise statistical test (Fisher exact test) of association. (A) Bait–prey biological processes. (B) Randomly permuted bait–prey biological processes. (C) Cellular component categories. (D) Randomly permuted bait–prey cellular component categories.
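The Fisher exact test named in this legend can be illustrated for a single bait–prey category combination. The sketch below is an assumption-laden example rather than the authors' procedure: the legend does not say whether the test was one- or two-sided, so a one-sided (enrichment) version is shown, and the function name and the 2x2 table layout are hypothetical.

```python
from math import comb


def fisher_enrichment_p(both, bait_only, prey_only, neither):
    """One-sided Fisher exact test P-value for over-representation.

    Assumed 2x2 table of bait-prey pairs for one category combination:

                          prey in category   prey not in category
      bait in category         both              bait_only
      bait not in category     prey_only         neither

    Returns the hypergeometric probability of seeing at least `both`
    pairs in the top-left cell with the table margins held fixed.
    """
    row1 = both + bait_only          # pairs whose bait is in the category
    col1 = both + prey_only          # pairs whose prey is in the category
    total = row1 + prey_only + neither
    upper = min(row1, col1)          # largest possible top-left count
    favourable = sum(
        comb(row1, k) * comb(total - row1, col1 - k)
        for k in range(both, upper + 1)
    )
    return favourable / comb(total, col1)


if __name__ == "__main__":
    # Toy counts: 3 pairs share the category, 3 pairs share nothing.
    print(fisher_enrichment_p(3, 0, 0, 3))
```

A small P-value for a cell corresponds to a bait category and prey category that co-occur more often than expected by chance; repeating the test over the randomly permuted pairs gives the control maps described in panels B and D.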

Figure 5

LYAR interactors also show strong gene co-expression with LYAR. Box plot showing distribution of P-values for all genes coexpressed (in three or more studies) with LYAR. Red points indicate co-expression P-values for 12 LYAR IP-HTMS interactors. Interactor descriptions include known subcellular localizations in square brackets where available.

Our focus in this paper has been to prepare a quality-controlled, large-scale human protein interaction data set that will add significantly to our knowledge of the human protein interactome. Given the focus on baits of significant biomedical interest (through functional or disease associations), we anticipate that this data set, alongside other sources of human protein–protein interactions, will be an important starting point for functional characterization of disease-related interactions and complexes. The IP-HTMS platform used here shows great promise as an effective means of protein interaction discovery, and we anticipate that future applications will include extension to a larger set of disease-associated proteins and to other cell lines, and coupling with drug treatments.

Acknowledgements

DF acknowledges funding from the Canada Research Chair program, the Natural Sciences and Engineering Research Council of Canada, the Canadian Institutes of Health Research, la Fondation Jean-Louis Lévesque and MDS Inc. MM acknowledges funding from the Canada Research Chair Program. We also acknowledge past and present colleagues at MDS Proteomics/Protana who have contributed to this project.

References

  1. Barrios-Rodiles M, Brown KR, Ozdamar B, Bose R, Liu Z, Donovan RS, Shinjo F, Liu Y, Dembowy J, Taylor IW, Luga V, Przulj N, Robinson M, Suzuki H, Hayashizaki Y, Jurisica I, Wrana JL (2005) High-throughput mapping of a dynamic signaling network in mammalian cells. Science 307: 1621–1625
  2. Chien CT, Bartel PL, Sternglanz R, Fields S (1991) The two-hybrid system: a method to identify and clone genes for proteins that interact with a protein of interest. Proc Natl Acad Sci USA 88: 9578–9582
  3. Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, Bauer A, Schultz J, Rick JM, Michon AM, Cruciat CM, Remor M, Hofert C, Schelder M, Brajenovic M, Ruffner H, Merino A, Klein K, Hudak M, Dickson D, Rudi T, Gnau V, Bauch A, Bastuck S, Huhse B, Leutwein C, Heurtier MA, Copley RR, Edelmann A, Querfurth E, Rybin V, Drewes G, Raida M, Bouwmeester T, Bork P, Seraphin B, Kuster B, Neubauer G, Superti-Furga G (2002) Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 415: 141–147
  4. Ho Y, Gruhler A, Heilbut A, Bader GD, Moore L, Adams SL, Millar A, Taylor P, Bennett K, Boutilier K, Yang L, Wolting C, Donaldson I, Schandorff S, Shewnarane J, Vo M, Taggart J, Goudreault M, Muskat B, Alfarano C, Dewar D, Lin Z, Michalickova K, Willems AR, Sassi H, Nielsen PA, Rasmussen KJ, Andersen JR, Johansen LE, Hansen LH, Jespersen H, Podtelejnikov A, Nielsen E, Crawford J, Poulsen V, Sorensen BD, Matthiesen J, Hendrickson RC, Gleeson F, Pawson T, Moran MF, Durocher D, Mann M, Hogue CW, Figeys D, Tyers M (2002) Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415: 180–183
  5. Su L, Hershberger RJ, Weissman IL (1993) LYAR, a novel nucleolar protein with zinc finger DNA-binding motifs, is involved in cell growth regulation. Genes Dev 7: 735–748
  6. von Mering C, Krause R, Snel B, Cornell M, Oliver SG, Fields S, Bork P (2002) Comparative assessment of large-scale data sets of protein–protein interactions. Nature 417: 399–403
