Nature Methods 5, 11-14 (2008)
doi:10.1038/nmeth1154
The year of sequencing
Kelly Rae Chi
Kelly Rae Chi is a freelance science journalist based in Chapel Hill, North Carolina, USA (kellyraechi@gmail.com).
In 2007, next-generation sequencing technologies came into their own with an impressive array of successful applications. Kelly Rae Chi reports.
In the toxicology building of North Carolina State University in Raleigh, Nigel Deighton, head of a small genome research facility, and a few others unpack the facility's first next-generation sequencing machine, a 454 GS FLX, on loan from Roche Diagnostics for three months. They train for a few days, nebulize a colleague's bacterial DNA and PCR-amplify "the living daylights out of it," Deighton recalls. They load the bead-bound PCR products onto a plate with holes that are not visible to the naked eye, pop the plate into the machine and close the drawbridge-like door. Then they go out for a beer.
The next day, Deighton scrolls through graphs from the newly sequenced bacterial genomes. On its first test run, the GS FLX gave sixfold coverage of each genome, with read lengths of around 250 bases. "That's not bad, for the first go," Deighton says.
Over the last few years, next-generation sequencing platforms have been developed and have sprung up in large sequencing facilities. But in 2007, sequencing started to seem more affordable even for smaller facilities such as Deighton's, says George Church, who has been developing a related technique at Harvard Medical School.
The 454 Life Sciences instrument, released in 2005, was the first platform on the market (see 'Rothberg's recipe for success'). The company was purchased by Roche Diagnostics in 2007. Solexa, Inc., later acquired by Illumina, released its platform, which combines sequencing-by-synthesis chemistry and cluster technology, in June 2006 (see 'Creating the Genome Analyzer'). In October 2007, Applied Biosystems announced the availability of the ABI SOLiD system, the only commercial system that uses a ligation-based chemistry (see 'Applied Biosystems' approach to sequencing'). These platforms are being applied in diverse ways, as evidenced at last year's sequencing conferences, Church says. Chad Nusbaum, co-director of genome sequencing and analysis at the Broad Institute, says researchers are increasingly using sequencing as a general-purpose tool to tackle what he calls low-lying fruit.
In 2007, researchers performed whole-genome human sequencing using old and new platforms. Researchers at Baylor College of Medicine and 454 Life Sciences sequenced James Watson's genome in two months, for about $1 million. Two other personal genomes were sequenced: Craig Venter's, at the institute he founded, and that of a Chinese individual, at the Beijing Genomics Institute. The J. Craig Venter Institute used Sanger technology to sequence Venter's DNA, which cost an estimated $70 million and took several years. The institute is now focusing its efforts on sequencing more human genomes, a task that will rely on the next-generation platforms and on reference sequences captured using Sanger technology, says Yu-Hui Rogers, scientific director of the J. Craig Venter Institute's Joint Technology Center.
Researchers at Yale University, in collaboration with 454 Life Sciences, combined the 454 sequencing technology with paired-end mapping to detect structural variation in the human genome, in a study published in Science in October 2007. "The technology allowed us to get some things we could never get before," says the study's lead author Michael Snyder, referring to inversions and breakpoints that would be missed using comparative genomic hybridization and fosmid paired-end sequencing. John West, general manager of Illumina's DNA sequencing business unit, and Kevin McKernan, an inventor of Applied Biosystems' SOLiD System technology, both say their companies are experimenting with paired-end sequencing to detect structural variations of different sizes.
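The general idea behind paired-end mapping can be illustrated with a short sketch. It is not the Yale group's pipeline: the `ReadPair` type, the `classify_pair` function, the 3,000-base expected span and the 500-base tolerance are all hypothetical values chosen for illustration. The sketch assumes that both ends of a fragment of roughly known length have already been mapped to a reference genome, and it flags pairs whose mapped span or relative orientation is unexpected.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Both ends of one sequenced fragment, mapped to a reference (hypothetical type)."""
    start: int                   # leftmost reference coordinate covered by the pair
    end: int                     # rightmost reference coordinate covered by the pair
    orientation_ok: bool = True  # False if the ends map in an unexpected relative orientation


def classify_pair(pair: ReadPair, expected_span: int = 3000, tolerance: int = 500) -> str:
    """Label a mapped pair by comparing its span on the reference with the expected span.

    expected_span and tolerance are illustrative assumptions, not published values.
    """
    if not pair.orientation_ok:
        # Ends that map in the wrong relative orientation hint at an inversion breakpoint.
        return "inversion candidate"
    span = pair.end - pair.start
    if span > expected_span + tolerance:
        # Ends lie farther apart on the reference than in the sample:
        # the sample may lack sequence that the reference contains.
        return "deletion candidate"
    if span < expected_span - tolerance:
        # Ends lie closer together on the reference than in the sample:
        # the sample may carry extra sequence.
        return "insertion candidate"
    return "concordant"


if __name__ == "__main__":
    pairs = [
        ReadPair(1000, 4000),
        ReadPair(1000, 9000),
        ReadPair(1000, 2000),
        ReadPair(1000, 4000, orientation_ok=False),
    ]
    for p in pairs:
        print(p, "->", classify_pair(p))
```

In practice a single discordant pair proves little; a real analysis would presumably require several independent pairs pointing to the same region before calling a variant.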
According to Nusbaum, the platforms are also clearly a boon for sequencing shorter pieces of material, such as DNA fragments that bind to histones and transcription factors. Taking advantage of Solexa's shorter read lengths, a team led by Keji Zhao at the US National Heart, Lung and Blood Institute analyzed histone modifications by chromatin immunoprecipitation (ChIP); their 2007 report in Cell was one of the first published studies using Solexa technology. Zhao says that the traditional approach, ChIP coupled with serial analysis of gene expression (SAGE), which relies on Sanger sequencing of concatenated DNA tags, would have cost close to $100,000 per histone modification, far more than the $2,000 it cost them with the Solexa machine. With the savings, they have been able to sequence more. "Without this machine, we can't do anything at the scale we're working on," Zhao says. Elaine Mardis, co-director of the Genome Sequencing Center at Washington University in St. Louis, says the new platforms are ideal for ChIP studies like Zhao's because researchers can sequence and characterize the bound fragments instead of inferring the sequences using hybridization arrays.
Metagenomics will also benefit from the new platforms, Mardis says, with the US National Institutes of Health's Human Microbiome Project as one example. In September 2007, researchers at Columbia University, led by W. Ian Lipkin, and 454 Life Sciences published a study in Science that used metagenomics to identify a virus implicated in the death of billions of honeybees. "Metagenomics was one of those things that was heinously expensive to do in the past and troublesome as well because it wasn't easy," Mardis says. Now, with the next-generation platforms, it is very straightforward to do these experiments and extract meaning from them, she adds.
Church predicts that, in 2008, people will use the platforms to see a clearer connection between sequence and function. Already, association studies linking genetic sequence and function are getting significant results at a much higher rate—from about one a year in previous years to 60 in 2007. "That's a pretty significant increase, but very few of those have actually been hunted down to the causative allele," Church says. Seeing the connections between sequence and function will require more human sequences and phenotypic data. Len Pennacchio, a geneticist from the US Department of Energy's Joint Genome Institute, says: "At this point, Watson's DNA is a parts list, but it hasn't told us what makes Jim Jim."
Though the platforms are not producing $1,000 human genomes yet—a research initiative the National Institutes of Health tried to spark in a 2004 request for proposals—they are on their way, says Richard Myers, director of the Stanford Human Genome Center. For now, the next limiting step might be generating more sophisticated computer algorithms to assemble and compare genomes. Those thousands of samples require delving deep to get a lot of sequence redundancy, so that the reads can be placed accurately. "It is a hard problem, but people are starting to apply [the technology] that way," he says.
In the meantime, Deighton will try to apply the 454 Life Sciences technology to the existing research at North Carolina State University. They have assembled the bacterial genomes and are going back to get better coverage, about 20-fold, so that they can fill in the gaps. About 15 researchers have signed up to use the machine to sequence everything from trees to butterflies. By the end of the three-month loan period, Deighton will do a cost analysis to show his colleagues that the machine's benefit outweighs its $500,000 price tag. "I'm convinced that it'll be worth keeping," he says.
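The link between fold coverage and gaps can be made concrete with a back-of-the-envelope calculation. The sketch below uses the standard idealized model in which reads land uniformly at random, so average coverage is (number of reads × read length) / genome size and the expected fraction of bases left uncovered is e^(-coverage). The 250-base read length and the 6-fold and 20-fold figures come from Deighton's runs; the 5-million-base genome size is a hypothetical value, since the article does not give the size of the bacterial genomes.

```python
import math


def reads_needed(genome_size: int, read_length: int, coverage: float) -> int:
    """Reads required to reach a target average fold coverage."""
    return math.ceil(coverage * genome_size / read_length)


def expected_uncovered_fraction(coverage: float) -> float:
    """Expected fraction of bases with no read, assuming uniformly random read placement."""
    return math.exp(-coverage)


if __name__ == "__main__":
    GENOME_SIZE = 5_000_000  # hypothetical bacterial genome size (not given in the article)
    READ_LENGTH = 250        # approximate read length reported for the first run

    for fold in (6, 20):
        n = reads_needed(GENOME_SIZE, READ_LENGTH, fold)
        gap = expected_uncovered_fraction(fold)
        print(f"{fold}-fold: {n:,} reads, expected uncovered fraction {gap:.2e}")
```

Under these assumptions, sixfold coverage takes 120,000 reads and leaves roughly 0.25% of bases unread, while 20-fold coverage takes 400,000 reads and leaves essentially none. Real runs deviate from this model because of sequence-dependent biases, so the figures are only a rough guide.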