Genome Analysis Shows Humans Survive on Low Number of Genes
New York Times, 2/11/01
By NICHOLAS WADE
Opening a new era in human biology and
medicine, two rival teams of scientists
this week present their first interpretations of
the human genome, the set of DNA-encoded
instructions that specify a person.
The two teams report in articles to be
published on Thursday and Friday that there
are far fewer human genes than thought —
probably a mere 30,000 or so — only a third
more than those found in the roundworm.
One team, Celera Genomics, has compiled a parts list of the proteins
needed to make a person. The other team, a publicly funded consortium,
has traced the history of how the "junk" regions of the genome
accumulated and has found that small elements of the junk may play a
useful role. It also found that some human genes have been acquired
directly from bacteria.
The two teams announced last June that they had assembled the human
genome, but it has taken them until now to analyze their findings.
The interpretation of the genome — identifying the genes, their functions
and controls, and how they relate to human physiology and disease — is
expected in time to revolutionize medicine by clarifying the mechanism of
many diseases and generating new tests and treatments.
Physically, the genome is minuscule — two copies of it are packed into
the nucleus of every ordinary human cell, each one of which is about a
fifth the size of the smallest speck of dust the eye can see. But the
genome is vast in terms of its informational content. Composed of
chemical symbols designated by a four-letter alphabet of A's, T's, C's,
and G's, the human genome is some 3.2 billion letters in length. If printed
in standard type, it would cover 75,490 pages of this newspaper.
The enormous task of decoding the genomic message began in 1990 and
is now substantially complete, although both teams' versions of the
genome are riddled with gaps.
With so much effort and scientific glory at stake, members of each team
remain highly critical of the other's approach, believing that their own
strategy for decoding the genome is likely to produce the better and more
accurate version. Since last June, however, both have been muting
criticism and observing a limited truce. The pact called for a joint
announcement, made at the White House on June 26 last year, that each
side had finished assembling its version of the genome, and for joint
publication of their findings, which is occurring later this week.
The joint publication, however, is about as separate as a union could be,
with each side's articles appearing in rival scientific journals issued on
different sides of the Atlantic. The findings were to be announced
tomorrow, but the embargo was lifted by the two journals after The
Observer of London broke it.
One team is a consortium of academic centers, mostly in the United
States and Britain but with members in France, Germany, China and
Japan. The consortium is financed largely by the National Institutes of
Health and the Wellcome Trust of London. Its version of the human
genome is described in a 62-page article in Nature, based in London.
The principal author is Dr. Eric Lander of the Whitehead Institute.
The other team is led by Dr. J. Craig Venter, president of Celera
Genomics in Rockville, Md. Its report appears in a 48-page article in
Science, based in Washington.
Despite the two teams' many differences, they largely agree on their
findings about the human genome. Theirs is the first overall look at a
genetic document of extraordinary strangeness and complexity. No one
expected it to be comprehensible at first glance and the two teams have
so far mapped only the principal features of its terrain.
Their principal discovery is how few human
genes there seem to be. Textbooks have long
pegged the number of human genes at around
100,000, but with the sequence of human
DNA units in hand the two teams have found
far fewer than expected. Dr. Venter says he
has identified 26,588 protein-coding genes for
sure and another 12,000 possible genes. The
consortium says there are 30,000 to 40,000
human genes. Both sides prefer the lower end
of their range, since their methods of gene discovery tend to predict more
genes than they believe exist.
The low number of human genes — say 30,000 — can be seen as good
for medicine because it means there are fewer genes to understand.
The impact on human pride is another matter. Of the only two other
animal genomes sequenced so far, the roundworm has 19,000 genes and
the fruit fly, also a standard laboratory organism, 13,000. Both teams
devote part of their huge articles to discussing how it is that humans are
more complicated than simple invertebrate animals even though humans
possess not many more genes.
Despite these face-saving efforts, human self-esteem may be in for further
blows as genome analysis progresses. Dr. Venter said he could find only
300 human genes that had no recognizable counterpart in the mouse. The
mouse, though a fellow mammal, last shared a common ancestor with
people 100 million years ago, a span of time in which many more genetic
differences might have been expected to develop.
Given the minor difference between man and mouse, Dr. Venter said he
expected the chimpanzee, which parted company from the human line
only five million years ago, to have almost the same set of genes as
people but to possess variant forms of these genes.
The consortium, taking its own jab at anthropocentric pomp, identified
113 human genes, and possibly scores more, that have been acquired
directly from bacteria.
In the journal articles, the two sides also sketch out major features of the
genome's architecture, of which genes are only a small part. More than
half the genome consists of repetitive DNA that has no genetic meaning.
Much of the repetitive DNA is formed by a couple of rogue genes that
millions of years ago learned to copy and insert themselves into new sites
in the genome. Because mutations clock up in these repeated segments at
a fairly regular rate, their origins can be dated.
The consortium has found that the main families of repetitive DNA fell
extinct long ago and no longer add clutter to the genome. But one family
is still active, and since its members are often found near active genes
they may benefit the genome in some way.
Both teams' versions of the genome now seem to be in a good enough
state to be of great use to biologists. The consortium's genome is
available for free and Celera's through subscription. But Celera provides
extra services, such as the ability to compare the human genome
sequence with that of the mouse. Mouse DNA has retained a very similar
sequence to human DNA both in its genes and in the DNA regions that
control the activity of genes, but has diverged through mutation in all the
nonessential parts of the genome. Laying mouse DNA on top of human
DNA shows at a glance which regions evolution has thought worth
conserving.
The consortium, however, is also working on the mouse genome and
intends to put that and other important tools for interpreting the human
genome in the public domain.
Experts are likely to debate which team's method for sequencing the
human genome is better. Dr. Venter's article includes a comparison chart
that shows that the consortium's version of the genome has many more
gaps than Celera's and that the gaps are larger. But in an interview Dr.
Venter complimented the consortium's efforts. "We are really impressed
at how good the public paper is, given their input data," he said.
But Dr. Lander said Celera's strategy was a
grand experiment that failed because it
produced more than 100,000 assembled
pieces that could not be anchored to the
genome sequence. Dr. Mark Adams of Celera
said that the statement was inaccurate and that
the company had assembled more than 95
percent of the genome into 2,845 large pieces
that were well anchored to the genome.
Despite their different strategies, both sides borrowed heavily from the
other. Dr. Venter used not only the snippets of DNA decoded by the
consortium but also important information about their position generated
by Dr. Robert H. Waterston of Washington University in St. Louis. The
consortium belatedly copied two of Dr. Venter's innovations, a clever
method of linking DNA sequence data by "paired-end reads," and
reliance on heavy-duty computing to assemble data. The consortium had
not prepared an assembly program, even though much of the analysis in
the report depends on it, until a graduate student at the University of
California at Santa Cruz, James Kent, stepped in and wrote one for
them at the last minute.
The rivalry between the two sides takes many petty forms — speaking
time for each side at a news conference to be held tomorrow was
negotiated to the minute, and academic scientists including Dr. Lander
tried strenuously to prevent Science from publishing Celera's article
except under terms unacceptable to Dr. Venter. But the competition has
proved enormously beneficial overall. The consortium was on a leisurely
track to finish the genome by 2005 until Dr. Venter jumped into the race
in May 1998, saying he would complete the genome by 2000.
"I think the publicly funded group has brought off something
extraordinary," said Dr. Donald Kennedy, editor of Science and former
president of Stanford University. "Imagine trying to do this job in a
number of places with academic scientists — it's like herding cats. They
deserve all kinds of credit, but so does Venter and Celera. There is no
doubt the world is getting this well before it otherwise would have if
Venter had not entered the race."
The closeness of the finish has now become apparent. Dr. Venter said in
his article that he completed his first assembly of the human genome on
June 25, just the day before the White House announcement. Mr. Kent
completed his first assembly of the consortium's data on June 22, just
three days before Celera's.
Both sides have in substantial measure achieved their goals. Celera went
from a concept to building a new plant from scratch to completed
genome sequence in just 25 months, despite the predictions of the
consortium's experts that its DNA sequencing strategy was bound to fail.
"This is something I felt I had been driving for for a decade," Dr. Venter
said last week, in commenting on his decision to place his name first on
the Celera report's list of authors. "No small amount of this was the
politics and psychology of being able to stay with this and stick with it. If there
was any way to stop this, it was tried, down to the end of trying to block
our paper being published in Science. If we weren't resistant and
somewhat defiant this never would have gotten done."
Dr. Venter's principal partners include the scientific manager of Celera's
team, Dr. Adams, his computer program designers, Dr. Eugene W.
Myers and Dr. Granger G. Sutton, and Dr. Hamilton O. Smith, who
prepared the genome for analysis.
The consortium's goal was to place the human genome in the public
domain for unfettered use by the world's biologists, and it has now done
so four years ahead of its original schedule. The architects both of this
policy and the DNA sequencing strategy were Dr. John Sulston of the
Sanger Centre near Cambridge, England, and Dr. Waterston. Their
centers completed roughly a quarter each of the genome sequence, and
Dr. Lander's center at the Whitehead Institute did another quarter. Dr.
Lander was also chairman of the group that analyzed the completed
genome sequence. The consortium was led by Dr. Francis S. Collins of
the National Institutes of Health.
Both teams believe that the sequencing and interpretation of the human
genome is a historic event and expressed pride in their accomplishments.
But both groups expressed humility at the minute steps they have so far
taken in exploring the human genome's vast repository of knowledge.
"In principle," the consortium's biologists concluded in their report, "the
string of genetic bits holds long-sought secrets of human development,
physiology and medicine. In practice, our ability to transform such
knowledge into understanding remains woefully inadequate."
Dr. Venter said simply that the effort to sequence and interpret the human
genome had been "mentally exhausting, in part because we are not
mentally equipped to absorb all this."
"We feel like midgets describing the universe and we can't comprehend it
all," he added.