The Cancer Journal - Volume 11, Number 3 (May-June 1998)
Cancer research after the genome project
The sequencing of the human genome seems likely to be finished earlier than expected. This will have consequences for physiology and pathophysiology. When this enormous dictionary of genes is complete, or even before, it will be necessary to reconsider the concept of function, which has become blurred as a result of being used at several different levels. At the molecular level, the functions of genes, or more precisely those of their corresponding proteins, seem to be confused with their biochemical properties. Thus we state that the function of a particular protein, a ligand for example, is to bind specifically to one or more other molecules, its receptors, and provoke a biochemical event: internalization, activation of a metabolic pathway, etc. This, however, is a long way from physiological functions, which are the results of integrated processes at the level of the whole organism. Between the whole organism and the molecular level there are, of course, a number of intermediate levels of tissue and cellular organization. It is possible that our heuristic view of these intermediate levels is misleading, and that our way of extrapolating from biochemical to physiological function is based on a long-standing illusion about functional continuity.
This semantic confusion hides a dangerous theoretical gap. Have we not all often heard it said that "cancer is a genetic disease"? This affirmation, put forward so convincingly, is irrefutable and therefore not a scientific statement in Popper's sense. Is there any disease for which we can be completely certain that there is no influence of genetic factors? Genetic factors act together with environmental factors, as well as with other factors that are more difficult to define. There is no doubt that most lung cancers are linked to tobacco use, but only genetically susceptible subjects develop the disease. In the same way, not all women carrying mutations in the BRCA1 gene will suffer from a cancer of the breast and/or ovary, since the rest of their genetic make-up and environmental factors modify the risk and the age of onset of the disease, in a way which is not yet clear. These are well-known facts, but they are often forgotten.
When diseases are described in terms of the dysfunction they cause, it is clear that integrated processes are disturbed. This implies that each disease has a structure which will be lost when it is analyzed piecemeal.
Let us return to life after the genome project. About 80,000 proteins, most of them as yet unknown, will be identified once the human genome has been sequenced. They are beginning to be known collectively as the "proteome". For the moment the proteome is incomplete, but it is already very complicated. Recently, study of the differential expression of many genes has become feasible. Very powerful methods are now available: Serial Analysis of Gene Expression (SAGE), differential display of messenger RNA, and multiplex hybridization of cDNA clones or specific oligonucleotides (complex microarrays). These allow the level of gene activity in a piece of tissue or a cell population to be analyzed. Thanks to extensive automation, these methods can now detect the activation state of several thousand genes at the same time, which will be reflected in the proteome and its associated network (1, 2, 3).
The Cancer Genome Anatomy Project (CGAP) set up by the NCI, which will soon be linked to a similar project in Europe, aims to correlate the mass of information collected in this way from precancerous lesions and from tumors at various stages of development with clinical and histological observations. This is an ambitious and incontestable target. It is, however, a risky business to obtain a vast number of experimental results which will be stored in a database in the hope that biomathematicians and computer experts will be able to develop appropriate algorithms to analyze, classify and compare them. This will certainly happen, but it is to be feared that the effort invested will not bear fruit in terms of therapeutic breakthroughs and disease control. Even if we have access to a large number of three-dimensional maps of gene expression and high-resolution expression profiles, how can this invaluable information be used in a clinical setting if we do not know how it fits into integrated structures? The prevailing explicit and implicit hypotheses about malignant transformation, cell-cycle control and apoptosis, the evolution of tumor cells, host-tumor relationships, etc. also threaten to limit our ability to formulate new hypotheses, if we do not achieve a theoretical revolution to match the technological one. This revolution should change scientists' way of thinking, and transform them from "miners", involved in the technical business of collecting a large number of unconnected pieces of knowledge, to "refiners", distilling and re-working this knowledge to provide answers to medical problems which are at present poorly or not at all resolved. It is not enough simply to put molecular biologists, computer scientists and physicians together.
If this combination does not lead to a joint effort to revive physiology and pathophysiology, to use the results of molecular biology but to go beyond them, all the talk about multidisciplinarity and interfaces will have no practical results. Unless, that is, one considers that the development and marketing of new drugs is an end in itself, rather than a simple necessity.
1. Strachan T et al. Nature Genetics, 16, 126-132, 1997.
2. Fields S. Nature Genetics, 15, 325-327, 1997.
3. Nelson N. J. Nat Cancer Inst, 88, 1803-1805, 1996.