Vidyya Medical News Service
Volume 3 Issue 66 Published - 14:00 UTC 08:00 EST 12-Jun-2001 Next Update - 14:00 UTC 08:00 EST 13-Jun-2001


Reported by Bob Kuska

When Robert Strausberg, Ph.D., became director of the NCI's Cancer Genome Anatomy Project (CGAP) [ http://cgap.nci.nih.gov/ ] in 1997, he admittedly faced a huge challenge. He had been asked to lead a brand-new program whose initial project was to create the first index of genes expressed in human cancers -- a feat that, many said, was more ambitious than feasible.

Yet, four years later, the mission has been accomplished. Strausberg said he and his collaborators are close to wrapping up their tumor gene indexes, having identified over a million gene transcripts in over 40 tissues. Meanwhile, Strausberg said, related projects such as the Mammalian Gene Collection [ http://mgc.nci.nih.gov/ ] and the Genetic Annotation Initiative [ http://lpg.nci.nih.gov/ ] have emerged as important tools to explore the molecular causes of cancer.

Strausberg said CGAP's success means scientists can now click on the CGAP web site and, within seconds, access free of charge a vast database of genes, chromosomal changes, and other biological information relevant to the study of cancer. "When you consider that just a decade ago, entire laboratories spent 10 years searching for a single gene that might be involved in cancer, you can see just how far the field has come in pursuing the molecular underpinnings of cancer," he said.

In a recent interview with Behind The News, Strausberg offered his perspective on the success of CGAP, the challenges it faces, and the future of molecular-based cancer care.

Q: OVER THE PAST FOUR YEARS, CGAP HAS IDENTIFIED OVER A MILLION TRANSCRIPTS. WHEN WILL THE INDEXES BE COMPLETE?
A: I think that the human gene indexes, while not complete, are in a very mature state right now. What we are doing now is filling in gaps in the database, since some tumor types have greater coverage than others. We are carefully evaluating the gaps that still remain and how to reach closure using various technological approaches.

Q: IN ADDITION TO GENE TRANSCRIPTS, DOES CGAP HAVE ANY PLANS DOWN THE ROAD TO EXPLORE PROTEOMICS?
A: The full CGAP vision that was put forward several years ago was not just one of finding transcripts, but of uncovering all of the molecular information in a cancer cell and its component parts, including proteins. So, the vision is to have molecular databases where you have information about all of the changes during cancer development. From that complete catalogue, one could find the most informative features for various aspects of cancer research.

Q: HOW DIFFICULT HAS DATA MANAGEMENT BEEN FOR THE CGAP DATABASE, AND WHAT HAVE BEEN SOME OF THE LESSONS LEARNED IN CREATING SUCH A VAST BIOLOGICAL DATABASE?
A: To my mind, it's not the computing power that is limiting. It is really our ability to carefully capture the biology of cancer and to link different types of information so that there is a seamless interface.

One issue, at a very basic level, is having common terminology for genes and proteins, such that one can link different kinds of databases. With that in place, there is the opportunity to link data sets in a manner that was not possible just a few years ago. For example, the emphasis on gene expression technology as a basic feature of cancer research means that we have the opportunity to link the basic information from CGAP with information about intervention strategies coming from the NCI Developmental Therapeutics Program, the Director's Challenge, and the Early Detection Research Network, all gaining various perspectives of molecular changes associated with cancer development and progression. Moreover, the ability to link human gene data with that from model organisms provides an opportunity to experimentally study functions of genes related to cancer development.

What's needed is terminology that will provide a foundation for linking all of these databases. At a very basic level, for example, human genes are named differently than mouse genes. New nomenclature, based on specific DNA sequence information, will provide the necessary foundation for these efforts.

Q: SO, THE MAJOR ISSUES ARE ANNOTATION AND VOLUME OF INFORMATION?
A: Yes. Everybody now is confronted with an enormous volume of data. The key is to build effective tools for mining the data sets such that the CGAP investment is used most effectively. Toward that end, CGAP has built, and will continue to build, a variety of bioinformatics tools that allow data mining from various perspectives. It's really a matter of building a panel of tools that allow one to move seamlessly . . . to ask the question that you would like to ask scientifically and then be taken through a series of databases without necessarily having a priori knowledge of all the data sets that might provide key information.

For example, if you find a transcript that appears to be uniquely expressed in the prostate, you'd like to know right away: Do we have information about the corresponding protein? What is the function of this protein? What else do we know about that gene from the biomedical literature? And most importantly, do we have information that suggests this might be a good target for intervention?

Q: THE TRICK IS ALWAYS TO STAY TWO STEPS AHEAD?
A: That's what we're trying to do. And the key is that Dr. Richard Klausner facilitated organization of CGAP in such a way that we can rapidly respond to new opportunities and CGAP can meet the needs of the cancer research community.

Q: FOUR YEARS LATER, HOW DO YOU FEEL ABOUT THE SUCCESS OF CGAP?
A: I still believe firmly in the vision that we put forward four years ago. But, we're not going to be satisfied until we reach our ultimate goal: improved patient outcome. That is what this is all about. So, CGAP is not just a success because we built catalogues. It's really being able to build those links that improve the lives of patients.

I'm quite encouraged that not only have the databases been created, but the databases have been, in fact, useful. I point to the early results from the cDNA microarray data. Again, there was this vision that we could segment cancers based on their unique molecular profiles, that we would learn that some people respond better to a particular therapy because they actually have a different cancer than another segment of the population. I think that this is clearly turning out to be the case. I think that vision has held up remarkably well and has been well demonstrated within a four-year time span. Already, we see the community moving toward expanding these data sets, of really moving this into the clinical arena, of developing diagnostic tests that would be based on the molecular form of cancer, not just on microscopic analysis. Most importantly, we are now starting to benefit in the clinic from many years of research toward identifying molecular targets whose perturbation can result in exciting new intervention strategies.

While you can never be fully satisfied in science, I think that the vision that was put forward for CGAP holds up very well today. I still think that it provides a very good framework for moving forward over the next few years.

Q: SO, THE REAL REWARDS ARE REALLY YET TO COME?
A: I think that many rewards will come over the next decade, and we will see the translation of CGAP data into practical components of cancer care. That is really the key for me. The end goal is not to do the transcriptome, or to know what's expressed in the prostate. It is really turning it into practical applications, and that prospect is what I find most encouraging.

Q: WHAT ABOUT THE FUTURE OF CGAP?
A: I think that we will begin to see practical products coming from CGAP, the process of discovery, for many years to come. While our gene index project is now in a mature stage, we can't just be satisfied by cataloguing genes. We have to continually think of new approaches that will give us the most useful molecular information about cells that are likely to turn cancerous, and how to best intervene for successful patient outcome.

At a certain level, I think that as our cataloguing of genes becomes complete, it leads to more of a quest for knowledge. Cancers are comprised of a very heterogeneous group of cells; therefore we'd like to understand not only the overall molecular features of a tumor, but also at the cellular level, which key cells in cancer development can be best targeted.

In addition, we want to be able to look in vivo at gene expression. The current CGAP datasets come from tumors that have been removed from patients. New technological advances will eventually allow us to have catalogues of genes based on in vivo monitoring, which will give us the best picture of what is happening directly in the patient.

We will continue to build on the efforts of the CGAP Genetic Annotation Initiative. The GAI is assembling information on the diversity of the genome in the human population, the molecular changes of the genome as cancer progresses, and how those differences are manifested in gene expression. The same is true with the CGAP Cancer Chromosome Anatomy Project. This project presents an exciting opportunity to link gene information with changes at the chromosomal level. This brings me back to the bioinformatics. I think that the seamless interface of all of these data sets is certainly a realizable goal for CGAP.

There will be many creative strategies for preventing and intervening in cancer, and we'd like to ensure that the platform for these efforts--databases of all of the molecular changes associated with cancer--is available to the entire research community.

So, I think that pushing the envelope is going to be a continuing CGAP theme. There is an ongoing interface of the clinical and basic research communities that will help to define the CGAP mission. I don't think that we could ever look at completion as such, but rather completion of a particular approach and saying, "What new opportunities arise from the current advances by CGAP and from biomedical research in general?"

 
 
