Experimental Methods: "Pulldown" Protocol Flowchart

High-throughput PCR amplification and cloning of selected targets into Gateway Entry vectors

We use a Gateway™ (Invitrogen Corporation) cloning strategy based on site-specific recombination that is applicable to a range of microorganisms and adaptable to the various analytical methodologies employed in the CMCS. This cloning strategy involves three steps: design of ORF-specific PCR primers, carrying the appropriate att sites for recombination, from published genomic DNA sequences; PCR amplification of the target ORF from host genomic DNA; and site-specific (BP) recombination of the amplified ORF into the pDONR221 vector to generate an entry clone. (See diagram)
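The primer design step can be illustrated with a minimal sketch. It assumes the commonly published attB1 and attB2 adapter sequences (the text above does not list the att sequences actually used), a fixed annealing length, and hypothetical function names. Reading-frame spacers and stop-codon handling for N- or C-terminal tags are deliberately left out.

```python
from typing import Tuple

# Commonly published Gateway attB adapter sequences (an assumption here;
# confirm against the vendor manual before ordering primers).
ATTB1 = "GGGGACAAGTTTGTACAAAAAAGCAGGCT"
ATTB2 = "GGGGACCACTTTGTACAAGAAAGCTGGGT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_attb_primers(orf: str, anneal_len: int = 20) -> Tuple[str, str]:
    """Build attB-tailed forward and reverse primers for one ORF.

    Hypothetical helper: the forward primer is attB1 plus the first
    `anneal_len` bases of the ORF; the reverse primer is attB2 plus the
    reverse complement of the last `anneal_len` bases.
    """
    orf = orf.upper()
    if len(orf) < anneal_len:
        raise ValueError("ORF is shorter than the requested annealing length")
    forward = ATTB1 + orf[:anneal_len]
    reverse = ATTB2 + reverse_complement(orf[-anneal_len:])
    return forward, reverse


if __name__ == "__main__":
    # Made-up ORF used only to exercise the function.
    fwd, rev = design_attb_primers("ATGAAACGTCTGGCAGCTTTAGGCCTGACCGATTAA", anneal_len=18)
    print(fwd)
    print(rev)
```

A real design would also check melting temperature and keep the ORF in frame with the chosen tag; those checks are outside this sketch.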

Recombination of entry clones into low copy Gateway compatible broad host range expression plasmid

We have developed a series of expression vectors that can be used with multiple microorganisms, including E. coli, S. oneidensis and R. palustris, among others. These vectors are based on the broad host range plasmid pBBR1MCS5, modified to contain the Gateway™ pDEST multiple cloning region, which allows site-specific recombination cloning of targets from Gateway™ entry plasmids. These modified destination vectors allow for expression of fusion proteins tagged either N- or C-terminally with a hexahistidine-V5 tandem tag, hexahistidine (6xHis) alone, or glutathione S-transferase (GST). The destination plasmid encoding each affinity-labeled probe protein is transformed by electroporation or mating into R. palustris, S. oneidensis or E. coli for expression and affinity purification of target proteins.

Expression of "bait" V5/6xHis fusion protein in R. palustris or S. oneidensis and endogenous protein complex formation

Rpal Pipeline

IMAC purification followed by V5 antibody purification of bait protein with interactors

Elution of specific interactors

Trypsin digestion and LC MS/MS identification of interacting proteins

In the Center for Molecular and Cellular Systems (CMCS), mass spectrometry has the key role of identifying the proteins that make up the protein complexes, or "machines of life," that are the focus of the Center. Mass spectrometry is a diverse family of instrumental techniques that have in common the measurement of molecular mass and the study of reactions of gas-phase ions.

At the CMCS, proteins are identified via tandem mass spectrometric analysis of the peptides resulting from tryptic digestion of those proteins. To accommodate the very complex mixtures of peptides resulting from proteolytic digestion of a protein complex, each of the mass spectrometers is interfaced with high-performance liquid chromatography equipment to perform on-line separations.
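To make the digestion step concrete, the following minimal sketch performs an in silico tryptic digest. It assumes the widely used convention that trypsin cleaves C-terminal to lysine (K) or arginine (R) except when the next residue is proline (P). The function name and the `missed_cleavages` and `min_length` parameters are hypothetical and are not taken from the Center's software.

```python
import re
from typing import List


def tryptic_digest(protein: str, missed_cleavages: int = 0,
                   min_length: int = 1) -> List[str]:
    """Return peptides from an in silico tryptic digest of `protein`.

    Cleavage rule (assumed): after K or R, but not before P.
    """
    protein = protein.upper()
    # Split at zero-width positions that follow K/R and do not precede P.
    pieces = [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]
    peptides = []
    for i in range(len(pieces)):
        # Join up to `missed_cleavages` additional neighbouring pieces.
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptide = "".join(pieces[i:j + 1])
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


if __name__ == "__main__":
    # Made-up sequence used only for illustration.
    print(tryptic_digest("MKRPAKGGR"))
    print(tryptic_digest("MKRPAKGGR", missed_cleavages=1))
```

Search tools compare observed MS/MS spectra against peptides predicted in this way from each candidate protein sequence.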

Target proteins that interact with the tagged probe or "bait" protein are eluted, dried, reconstituted in aqueous buffer, and subjected to overnight trypsin digestion. The resulting peptides are analyzed by automated LC-ESI-MS/MS. An LCPackings (Sunnyvale, CA) chromatographic system (Famos autosampler, Switchos flow switching module, and Ultimate HPLC) is interfaced with a ThermoFinnigan (San Jose, CA) DecaXP Plus quadrupole ion trap mass spectrometer via a nanospray source. After injection via the autosampler, peptides from samples are concentrated onto a reverse-phase preconcentration column and washed to remove excess salt. The peptides are then eluted onto an analytical column, which is fabricated in-house using a pressure cell (New Objective, Woburn, MA). The columns have a 75 micron inside diameter, are ~15 cm long, and are packed with a C18 reverse-phase chromatographic material. Peptides are eluted from the analytical column using a gradient from 95% water/5% acetonitrile to 30% water/70% acetonitrile (with 0.1% formic acid in all mobile phases). The mass spectrometer is operated in a data-dependent MS/MS mode, with up to four peptides automatically selected per MS scan for MS/MS analysis.
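The data-dependent mode described above can be sketched as a simple top-N selection. The sketch assumes that precursors are ranked by intensity within each MS scan, which is a common convention but is not stated in the text. The function name and the `min_intensity` parameter are hypothetical, and instrument features such as dynamic exclusion are not modeled.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def select_precursors(ms1_peaks: List[Peak], max_precursors: int = 4,
                      min_intensity: float = 0.0) -> List[float]:
    """Pick up to `max_precursors` m/z values from one MS scan.

    Assumption: the most intense peaks above `min_intensity` are chosen.
    """
    candidates = [peak for peak in ms1_peaks if peak[1] >= min_intensity]
    ranked = sorted(candidates, key=lambda peak: peak[1], reverse=True)
    return [mz for mz, _intensity in ranked[:max_precursors]]


if __name__ == "__main__":
    # Made-up peak list used only for illustration.
    scan = [(400.2, 1e4), (512.3, 5e5), (622.8, 2e5),
            (745.4, 8e4), (803.9, 3e5), (910.1, 1e3)]
    print(select_precursors(scan))
```

The default of four precursors matches the "up to four peptides per MS scan" setting; fewer are returned when a scan contains fewer qualifying peaks.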

Bioinformatic analysis followed by interaction data visualization


The Automated Mass Spectrometry Data Analysis Pipeline automatically and efficiently performs protein searches for the Genomic Science Center's mass spectrometry data, using an analysis toolkit that integrates a suite of mass spectrometry analysis tools. The Analysis Pipeline is also closely integrated with the Center's laboratory information management system (LIMS). The Pipeline queries the LIMS for new sample sets that may be ready for analysis, transfers the raw data to a central storage system, performs the protein searches, and updates the LIMS on completion of the analysis. The LIMS then imports the results of the protein searches into its database, which is then used for subsequent data mining operations.
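The query, transfer, search, and update cycle described above can be sketched as a single pass of a polling loop. The LIMS interface is not described in the text, so the `InMemoryLims` class, its method names, and the `search` callback below are hypothetical stand-ins; copying files stands in for the transfer to central storage.

```python
import shutil
from pathlib import Path
from typing import Callable, Dict, List


class InMemoryLims:
    """Hypothetical stand-in for the LIMS; the real interface is not specified."""

    def __init__(self, pending: Dict[str, List[Path]]):
        self._pending = dict(pending)
        self.completed: Dict[str, List[Path]] = {}

    def new_sample_sets(self) -> Dict[str, List[Path]]:
        """Return sample sets that are ready for analysis."""
        return dict(self._pending)

    def mark_complete(self, sample_set: str, results: List[Path]) -> None:
        """Record search results and remove the set from the pending queue."""
        self._pending.pop(sample_set)
        self.completed[sample_set] = results


def run_pipeline_once(lims: InMemoryLims, storage_dir: Path,
                      search: Callable[[Path], Path]) -> int:
    """Process every pending sample set once; return how many were handled."""
    processed = 0
    for sample_set, raw_files in lims.new_sample_sets().items():
        target = storage_dir / sample_set
        target.mkdir(parents=True, exist_ok=True)
        results = []
        for raw in raw_files:
            stored = Path(shutil.copy2(raw, target))  # transfer to central storage
            results.append(search(stored))            # run the protein search
        lims.mark_complete(sample_set, results)       # update the LIMS
        processed += 1
    return processed
```

In practice `search` would wrap a search tool invocation and the loop would run on a schedule; both are left out to keep the sketch self-contained.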

The Analysis Pipeline currently uses SEQUEST as the primary protein search tool. A new search tool, DBDigger, was developed in-house; it allows faster searches and more efficient searches for post-translationally modified peptides. Parallel implementation of these two search algorithms should allow for much greater sensitivity and specificity in our protein identifications. It will also enable significantly higher-throughput, distributed processing of mass spectrometry data.

A RAW file extractor based on the Xcalibur XDK Development Kit has been developed in-house. It extracts spectra from mass spectrometry RAW files and stores them in a single MS2 file (a format proposed by the Yates lab), instead of an individual .dta file for each spectrum.
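The benefit of a single MS2 file can be seen in a minimal writer sketch. It assumes the line layout generally described for the MS2 format: `H` header lines, one `S` line per spectrum (first scan, last scan, precursor m/z), a `Z` line (charge and singly protonated mass), then one "m/z intensity" pair per line. The `Spectrum` class and `write_ms2` function are hypothetical and do not represent the in-house extractor, which reads RAW files through the Xcalibur kit.

```python
import io
from dataclasses import dataclass
from typing import Iterable, List, TextIO, Tuple

PROTON_MASS = 1.007276  # mass of a proton in Da


@dataclass
class Spectrum:
    scan: int
    precursor_mz: float
    charge: int
    peaks: List[Tuple[float, float]]  # (m/z, intensity) pairs


def write_ms2(spectra: Iterable[Spectrum], handle: TextIO) -> int:
    """Write all spectra to one MS2-style text stream; return the count."""
    handle.write("H\tExtractor\tillustrative sketch\n")
    count = 0
    for spectrum in spectra:
        # Convert precursor m/z at the given charge to the (M+H)+ mass.
        mh = (spectrum.precursor_mz * spectrum.charge
              - (spectrum.charge - 1) * PROTON_MASS)
        handle.write(f"S\t{spectrum.scan}\t{spectrum.scan}\t{spectrum.precursor_mz:.4f}\n")
        handle.write(f"Z\t{spectrum.charge}\t{mh:.4f}\n")
        for mz, intensity in spectrum.peaks:
            handle.write(f"{mz:.4f} {intensity:.1f}\n")
        count += 1
    return count


if __name__ == "__main__":
    buffer = io.StringIO()
    write_ms2([Spectrum(1, 500.0, 2, [(100.0, 10.0), (250.5, 42.0)])], buffer)
    print(buffer.getvalue())
```

One file per run, rather than one .dta file per spectrum, keeps directory sizes manageable and simplifies transfer to central storage.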

We have adopted the mzXML format and the associated tools developed by the Institute for Systems Biology. mzXML is an open, extensible reference file format standard for representing mass spectral data.

Proteins interacting with the bait proteins are identified from MS/MS spectra of their tryptic peptides. Automated protocols for implementing this identification via SEQUEST [Eng, 1994] are in place. The resulting peptide identifications are filtered and collated into protein identifications using DTASelect [Tabb, 2002]. Results from each analysis are stored in a database, from which they are presented via an internal web interface.
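The filter-and-collate step can be illustrated with a minimal sketch. It is not DTASelect itself: the charge-dependent cross-correlation thresholds, the two-peptide minimum, and all class and function names are illustrative assumptions rather than the Center's settings.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class PeptideHit:
    protein: str
    peptide: str
    charge: int
    xcorr: float  # cross-correlation score reported by the search tool


# Illustrative minimum XCorr per charge state (assumed values).
MIN_XCORR = {1: 1.8, 2: 2.5, 3: 3.5}


def collate_proteins(hits: Iterable[PeptideHit],
                     min_peptides: int = 2) -> Dict[str, List[str]]:
    """Filter peptide hits by score, then group distinct peptides by protein."""
    by_protein: Dict[str, Set[str]] = defaultdict(set)
    for hit in hits:
        threshold = MIN_XCORR.get(hit.charge)
        if threshold is not None and hit.xcorr >= threshold:
            by_protein[hit.protein].add(hit.peptide)
    # Keep only proteins supported by enough distinct peptides.
    return {protein: sorted(peptides)
            for protein, peptides in by_protein.items()
            if len(peptides) >= min_peptides}


if __name__ == "__main__":
    # Made-up hits used only for illustration.
    example = [
        PeptideHit("protA", "AAK", 2, 3.1),
        PeptideHit("protA", "GGR", 2, 2.6),
        PeptideHit("protB", "LLK", 2, 2.9),
        PeptideHit("protB", "VVR", 2, 1.0),
    ]
    print(collate_proteins(example))
```

Requiring more than one distinct peptide per protein is a common way to reduce false identifications, at the cost of missing proteins observed through a single peptide.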

We have implemented a LIMS as the central data repository for information related to processing and analysis of CMCS samples. It maintains a detailed history for each sample by capturing processing parameters, protocols, stocks, tests, and analytical results for the complete life cycle of the sample. This includes detailed information on sample cloning, expression, processing, and analysis. Project data are also maintained to define each sample in the context of the collaborative research project it supports. The rigorous and standardized LIMS configuration enforces project standards, integrates project data, supports project publications, and provides data security, while providing flexibility for a dynamic research environment.