Science Tribune - Article - August 1996
Mastering the complexity of information in patient health care
How to choose a decision aid when establishing a diagnosis or selecting suitable therapy is an important issue that all clinicians have to face. The choice among different theoretical models cannot depend on the opinion of specialists alone or on knowledge gained from operational research tools used in a purely mechanical way. This article deals with decision aids that use mapping methods based on multidimensional data reduction.
Decision aids in medicine : Shared versus collective intelligence
The mastery of the intricate physical phenomena involved in launching a space rocket is child's play compared to mastering the intrinsic complexity of living organisms. Moreover, health care is rendered even more complex by extraneous factors such as :
- medical ethics,
- emotional responses in the face of disease and death,
- the need to take a speedy, but sensible, decision to alleviate suffering,
- the economic, social, and political impact of all medical decisions.
At present, there are two schools of thought on how to tackle this biological, psychological and social complexity :
- the one gives maximum credit to the intelligence of 'the' expert or a group of experts (shared intelligence),
- the other transfers the problem to 'automats', the products of the collective intelligence of civilisation. It opts for a 'cybermedicine' that privileges high-tech and artificial intelligence.
Which school of thought should the clinician follow or are there other options ?
The responsibility of the practitioner
The practitioner is in an uneasy position. It is as if he must choose between :
1) falling back on the art of diagnosis and prescription. This art - an alchemy of intelligence and experience - is a subtle, partly unconscious, integration of information,
2) delegating his decision powers to more-or-less advanced technical procedures based on artificial cognition models that mimic the nervous system (expert systems, connectionist models, decisions under constraint, multicriteria selection...).
Yet transferring all know-how to an automat can quickly lead to the abdication of personal responsibility. However agreeable it may be to place the risks run by the patient into the reassuring hands of an approved system, all systems thus empowered are essentially irresponsible. This behaviour is a modern version of sheltering behind the authority of norms, regulations, etc... Besides, most of these model systems are closed 'black boxes' concealing procedures that remain incomprehensible to the practitioner. Blind belief in subtle expert systems is no less suspect than believing the Delphi oracles; in both instances, trust in 'magical' procedures, without any form of censorship, can be treacherous.
Building models or trojan horses ?
In France as elsewhere, the near-bankruptcy of the national health system is leading toward a dangerous bend on the road of health economics. A participant of a purely institutional character - the state - may soon intervene even more obtrusively in the decisions taken by the practitioner or, even worse, intrude in the 'politics' of diagnosis. In many countries, insurance companies have already lost no time in 'modelling' health risks. One can therefore easily imagine the dire consequences if, more or less unbeknown to the public, diagnostic aid systems were hybridized with econometric models for rationalising public health budgets.
The coming years may witness a situation whereby medical decisions are brought into question because necessity has become the rule of law. For those who govern us, 'modelling' may prove to be the ideal Trojan Horse because it can incorporate any form of constraint (physiological, genetic, lifestyle, financial, or the so-called 'public good'....). I personally firmly believe that, if a diagnostic-aid has to take into account all the parameters that intervene in the expression of a pathology or in the success of a treatment, these parameters must concern the patient alone and not society. Even if society is an organism suffering from chronic deficits !
Enlightened admittance of 'soft' models into the realm of medicine
The only valid entry ticket is prefixed CA (computer-assisted) thus making clear right from the start that the relationship between man and machine is one between master and slave, in that very order. Any other view - man as god or machine as goddess - will inevitably lead down a blind alley because evolution - whether Darwinian or technological - is a question of an adaptive compromise. The way forward is one that favours a hybridization and/or symbiosis of strategies.
This, of course, means that clinging to rigid traditions or chasing mirages are unpragmatic, doctrinaire (unidimensional) and disastrous attitudes. An example of the former is the rejection of a theoretical model (e.g. the automatic analysis of electro-cardiograms) because it is judged to be unfair competition. The search for models as substitutes for responsible action is a mirage cut off from any roots or reality.
Medicine belongs to the 'sciences of the imprecise' (A. Moles) (1) but cannot dispense with quantification. As Paul Valéry (2) provocatively stated : "only that which is measurable is worthy of being called science". But isn't it time to forego formal logic and differential equations, so dear to hard-and-fast Laplacian determinism, and prefer 'soft' mathematical methods ? A wide variety are available (fuzzy logic, catastrophe theory, systems analysis, data analysis, artificial intelligence, connectionism, simulated annealing, genetic algorithms...) and the choice need not be 'frozen' at an early stage on the premise that it is the optimal method. To avoid creating a new 'mainstream' culture, it is as important to keep a multiplicity of approaches as to preserve the biodiversity of life.
Building models based on probabilism and not determinism
Because medical data is multiparametric, any descriptive or decision model in this field should meet certain requirements and, in particular, account for as much of the available data as possible.
What models in medicine should NOT do:
- seek a binary 'yes/no' answer in deference to our dualistic culture fascinated with norms. They should not, for example, strictly oppose the ill and the healthy. Normality, as given by a greater or lesser distance from a mean (in a supposedly Gaussian distribution), does not guarantee the absence of a pathology. Any approach that considers each parameter within an array as independent ignores the notion of a physiological system in which small individual deviations can be the signatures of great disturbances
- adopt an unduly simple hierarchical ranking system which does not account for the full data. Many ranking systems are based on the calculation of means but consider, for instance, a student ranked in an average position in his class : is he average in all subjects, or top in mathematics and last in literature ?
- employ numerical coefficients that, allegedly, embrace all the data but that are often meaningless and sometimes even vectors of false information. It is as if one transposed the IQ (intelligence quotient) into the world of medicine ! Unlike quotients, profiles of intellectual aptitudes proscribe undue elitism, bring to light differentials in ability and reveal the polymorphism of a population group.
These models (yes/no answers, simple ranking systems, and numerical coefficients) are all the products of an obsession with scalar values and with a desire to oversimplify complex phenomena. This attitude is prevalent even among respectable research-workers who are, unconsciously, still fascinated with the 'magic number' and the 'philosopher's stone'. Are they not seeking the ultra-specific miracle substance, the perfectly tolerated drug, 'the' longevity factor, etc...? And are they not suffering from amnesia ? Didn't Albert Einstein say something like : 'If all that is complicated is of no use, all that is simple is false'.
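The ranking pitfall is easy to make concrete. The toy calculation below (a minimal sketch in Python ; the two students and their marks are invented for illustration) shows how a mean erases exactly the information a profile preserves :

```python
import numpy as np

# Invented marks out of 20 in [mathematics, literature, biology, history].
marks = {
    "student_1": np.array([12, 12, 12, 12]),   # average everywhere
    "student_2": np.array([20, 4, 19, 5]),     # brilliant and weak by turns
}

for name, profile in marks.items():
    print(f"{name} : mean = {profile.mean():.1f}, profile = {profile.tolist()}")

# Both students rank identically by the mean (12.0), yet their
# profiles carry entirely different information about them.
```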
What models in medicine should do:
Complexity should be mastered not by isolating items (patients or tests) from their context (segmentation) but by reducing dimensionality.
Medical information comprises both intra- and inter-variable information, the latter intervening in the definition of a structure or organization. To analyse information entropy (as in Shannon's theory of communication), a mathematical tool for building structural models is needed. This tool will first derive factors from a series of profiles describing the system and then proceed to recombine these factors into a model. Extracting factors is a banal procedure familiar to all those who have learnt the rudiments of algebra but one that has not been generalized.
Dismantling the complex, according to Descartes' precepts (3), breaks information down into disjoint subunits but how a system functions depends more on its architecture than on its building blocks. The cogs and wheels of a watch tell us little about how it works except if one believes in the great watchmaker or in the magic of self-assembly. Faith in a preestablished order smacks of finalism ! Even if anatomy is the framework for physiology, anatomical information only partly explains physiological processes. As Pascal (4) observed, and the Gestalt theorists acquiesced, 'the whole is greater than the sum of its parts'. Each level of superstructure yields extra information above and beyond the information contained in the lower strata.
Question : Which set of tools can both dismantle for the sake of analysis and rebuild with a view to understanding ? Answer : The tools of multidimensional data reduction and, in particular, of factor analysis. Factor analysis (a) dismantles the system into factors (which are orthogonal - therefore independent - supercharacteristics), arranges them in order, and then gradually rebuilds the initial edifice or its most significant subunits.
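This dismantle-and-rebuild cycle can be sketched in a few lines. The fragment below uses principal component analysis, the simplest member of the factor-analysis family, on invented patient data ; every name and number is an assumption for illustration, not a clinical result :

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented data : 50 patients x 6 biological parameters, secretly
# driven by only two latent 'supercharacteristics'.
latent = rng.normal(size=(50, 2))
loadings = rng.normal(size=(2, 6))
X = latent @ loadings + 0.05 * rng.normal(size=(50, 6))

# Dismantle : centre the table and diagonalise its covariance matrix.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
order = np.argsort(eigvals)[::-1]            # strongest factor first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Rebuild : project onto the first two factorial axes and reconstruct.
k = 2
scores = Xc @ eigvecs[:, :k]                 # coordinates on the axes
X_rebuilt = scores @ eigvecs[:, :k].T + X.mean(axis=0)

explained = eigvals[:k].sum() / eigvals.sum()
print(f"information carried by two factors : {explained:.3f}")
print(f"largest reconstruction error : {np.abs(X - X_rebuilt).max():.3f}")
```

Because the invented table is governed by two hidden factors, two orthogonal axes suffice to rebuild it almost exactly.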
The visual and esthetic appeal of multiparametric models
Multidimensional reduction dates back to the early 20th century. It belongs to the analogic and not the digital world. Measurements or counts are transposed into geometric terms, into distances that reflect correlations among variables. The outcome is a picture and, as everyone knows, a picture that tells a tale is worth a thousand words. Complex tables of results, whose portent cannot be directly perceived by the human eye, are thus converted into series of maps with high visual appeal and communicative powers.
What types of relationships can multidimensional reduction derive and represent ?
- relationships between individuals (typology), e.g., the relative distances (relationships) between patients within a population calculated on the basis of their nosological diagnostic profiles
- relationships between individuals and their characteristics (unsupervised correlations with no external constraints), e.g., the distribution of clinical symptoms among patients.
- supervised relationships that may be linear (discriminant analysis) or non-linear (non-linear mapping, neural networks...), e.g., the specific relationship between a pathology and various biological parameters in epidemiology studies.
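The first type of relationship, typology, can be sketched in a few lines. The fragment below (Python ; the four patients, five clinical signs, and all scores are invented for illustration) computes the relative distances between patients from their whole profiles rather than from any single summary score :

```python
import numpy as np

# Invented example : 4 patients scored 0-3 on five clinical signs.
patients = ["A", "B", "C", "D"]
profiles = np.array([
    [3, 2, 0, 0, 1],    # A
    [3, 3, 0, 1, 1],    # B, close to A
    [0, 0, 3, 2, 0],    # C, a different syndrome
    [0, 1, 3, 3, 0],    # D, close to C
])

# Typology : pairwise distances between the patients' profiles.
diff = profiles[:, None, :] - profiles[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

for i, name in enumerate(patients):
    others = [j for j in range(len(patients)) if j != i]
    nearest = min(others, key=lambda j: dist[i, j])
    print(f"patient {name} is closest to patient {patients[nearest]}")
```

The distance table is precisely the raw material that a factorial map then transposes into a picture.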
Factorial mapping can be likened to the long-standing art of maps and cartography that synthesizes a mass of information and gives an overall view of the lie of the land whilst at the same time evoking many hidden features (dimensions). There are maps with few details, just a broad outline, but also maps that represent contours, communication networks, etc. A factorial map has the same potential for communicating information as a land map. Structural subunits of the system can be drawn onto the map : contours represent clusterings of items (e.g. of patients), networks become dendrograms, etc... These additions reconstruct the edifice step-wise to provide a close virtual image. Even time can be represented by, for instance, the displacement of patients within the map whilst under therapy. Our description of the multidimensional world has thus progressed from tabular quantitative or qualitative material to abstract art (mapping) and the movies !
Factorial mapping versus mainstream science : David against Goliath ?
Even though factorial mapping is not new, it has never found a place among traditional statistical methods, probably because its rationale runs against mainstream currents. Teaching manuals, after all, aim to foster specific schools of thought.
a) Factorial mapping shows how items are organised in a multidimensional space. In this, it is opposed to all ranking methods that give hit-parades of the worst to the best, the smallest to the biggest, the first to the last.... But it does not fall into the trap of the segmentation techniques, used in marketing for instance, which create archetypal caricatures where each individual has its place and there is but one place for each individual, as in a stage opera ! By representing the additional information brought by each further factorial axis, factorial mapping illustrates the progression toward increasing complexity that is so characteristic of living organisms.
b) Factorial mapping deals with profiles of attributes, i.e., a selectivity profile. Traditional statistics deal only with the absolute values of attributes.
c) Factor analysis mocks standard deviations, p values, and the religion of statistical significance. It perceives interconnected phenomena as a continuum (logistic function) and rejects the discontinuity of any arbitrary cut-off level (sign function).
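The contrast between the continuum and the cut-off can be shown directly (a minimal sketch in Python ; the four values are invented and sit just either side of the threshold) :

```python
import numpy as np

def logistic(x, k=1.0):
    # Continuum : a graded degree of abnormality.
    return 1.0 / (1.0 + np.exp(-k * x))

def sign_cutoff(x, threshold=0.0):
    # Discontinuity : an arbitrary healthy/ill dichotomy.
    return np.where(x > threshold, 1.0, 0.0)

# Four individuals sitting just either side of the threshold.
values = np.array([-0.1, -0.01, 0.01, 0.1])
print(logistic(values))       # four nearly identical, smoothly graded scores
print(sign_cutoff(values))    # a cliff : two declared ill, two healthy
```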
d) Factorial mapping places all variables on the same footing and lets them organise themselves freely into relationships. All so-called traditional multidimensional analyses arbitrarily choose a dominant (independent) variable and consider all other variables as dominated (dependent). Such a hypothesis is made in both multiple regression analysis and step-wise discriminant analysis which delight the clinician because they provide numerical coefficients. But, oh, how biased these can be !! The intellectual perversion of an obligatory a priori hypothesis has already pervaded all biological and clinical science to the extent that only research (and scientific publications !) that conform to this scheme are tolerated.
To my mind, this point is sufficiently important to justify a small digression on the status of the scientist, clinician, or analyst. For investigations to yield the most creative - and not necessarily productive - results, the researcher should step into the shoes of the humble explorer and set aside his most profound convictions, all dogma, and any manifestations of paranoia. The field of exploration is to stay wide open, unrestricted by a scholastic a priori hypothesis dictated by the supremacy of human reason and logic, as if this were the only way of tackling a problem. Overemphasis on the individual "I" or collective "WE" may just be a fashion but one with serious consequences. When the field of exploration is reduced, the rule of 'exhaustivity' is transgressed whilst the principles of 'pertinence' and 'homogeneity' are preserved. The end result could be the 'non fingo' demonstration of Newton where the presupposed result is introduced into the data of the problem under study, in essence a classic example of tautology.
Conclusion : Let's return to the children's playground and play with factorial mapping
To begin to truly understand the complexity of the living and thus develop useful and appropriate decision-aids for the clinician, I suggest that all those involved in this field reappraise the bases on which the knowledge they have acquired is founded and, at the same time, question their deepest convictions. True understanding was never achieved by neglect of part of the information but by dismantling the complex into its constituent units (multidimensionality reduction) and testing the operational worth of the reassembled structure. Any child playing with LEGO building blocks already knows this.
Factorial analysis may prove to be, along with other imaginative methods, an extremely useful mathematical tool in the children's playground. Not only is it a construction game but also an ideal communication tool. Factor analysis does not translate complex situations into boring numbers but into inspiring pictures. These pictures describe the organisation of the system, highlight the links among individuals and properties, and often crystallise the intuitive information that the more rigid methods of our university education have a wanton tendency to ignore.
(a) A short explanatory note on factor analysis
Let's take a multidimensional structure or system composed of patient characteristics determined at different time-points (a table of n characteristics at t time-points). A factor analysis first performs an analytical step by calculating factorial axes of which the first is the principal component of the structure and the remainder are components of decreasing importance. All these components are independent and unrelated to each other. Each factorial axis, from the most to the least important, is composed of a special mix of characteristics and time-points. By combining the factorial axes, it is possible to create simple coherent substructures which can be partly reassembled into the initial structure.
If a single causal factor, such as time, governs the data, the outcome of the factor analysis will be simple (unidimensional). In other words, the information contained in the table - excluding background noise and the idiosyncrasies associated with certain patients - can be represented by a single dimension (the time-factor). The multidimensional matrix will have, in effect, been reduced to a single projection (factorial) axis without any loss in information, regardless of the size of the table.
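This unidimensional case is easy to verify numerically. The sketch below (Python ; the eight characteristics and their sensitivities to time are invented) builds a table governed by a single time factor and checks that one factorial axis carries virtually all the information :

```python
import numpy as np

# Invented table : 8 characteristics followed at 10 time-points,
# all governed by a single causal factor (time).
t = np.linspace(0.0, 1.0, 10)
sensitivity = np.array([1.0, -0.5, 2.0, 0.8, -1.2, 0.3, 1.5, -0.7])
table = np.outer(t, sensitivity)      # 10 time-points x 8 characteristics

centred = table - table.mean(axis=0)
eigvals = np.linalg.eigvalsh(np.cov(centred, rowvar=False))[::-1]

share = eigvals[0] / eigvals.sum()
print(f"share of the information on the first factorial axis : {share:.4f}")
```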
To understand the basis of the calculations requires some knowledge of vector and matrix algebra. The core of the procedure involves the diagonalisation of a symmetric square matrix and the calculation of the roots of its characteristic determinant (eigenvalues and eigenvectors). What this really means is that the double-entry table is converted into a symmetric square matrix, the rows and columns of which are cleverly permutated so that the most marked relationships between row and column lie along the diagonal. Bertin et al. illustrated the analysis by using a set of cubes. The greater the relationship between row and column, the darker the cube. The result was an empirical scalogram.
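The diagonalisation step itself can be illustrated on a toy symmetric matrix (a sketch in Python ; the correlation values among three tests are invented) :

```python
import numpy as np

# Toy symmetric matrix of relationships (e.g. correlations among 3 tests).
R = np.array([
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
])

# Diagonalisation : R = V diag(lambda) V^T, with orthonormal eigenvectors.
eigvals, V = np.linalg.eigh(R)
R_rebuilt = V @ np.diag(eigvals) @ V.T

print(np.allclose(R, R_rebuilt))          # True : nothing is lost
print(np.allclose(V.T @ V, np.eye(3)))    # True : the axes are orthogonal
```

The eigenvectors are the factorial axes ; the eigenvalues measure how much of the structure each axis carries.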
However, the great majority of systems under study do not possess a unidimensional structure. It therefore becomes necessary to extract a series of successive factorial axes and Bertin's manual method has to be replaced by appropriate diagonalisation algorithms (e.g. those of Jacobi, Kaiser, Householder). How can one best describe these algorithms ? Everyone knows the least squares method for finding the regression line that best describes the relationship between the weight and height of a group of individuals. This line can be considered to be a factor resulting from the confrontation of individuals projected into a two-dimensional space (described by height and weight). An optimal virtual line can also be imagined in the context of an n-dimensional space. But, to avoid information loss, it is necessary to draw not one projection axis but a whole series. The individuals can then be projected onto each of these axes that shape the hyperspace.
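The height-and-weight analogy translates directly into code. The sketch below (Python ; the 100 individuals and their measurements are invented) finds the first factorial axis of the two-dimensional cloud and projects each individual onto it :

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented heights (cm) and correlated weights (kg) for 100 individuals.
height = rng.normal(170.0, 8.0, size=100)
weight = 70.0 + 0.9 * (height - 170.0) + rng.normal(0.0, 4.0, size=100)

data = np.column_stack([height, weight])
centred = data - data.mean(axis=0)

# The first factorial axis is the 'optimal virtual line' through
# the cloud : the direction of greatest spread of the individuals.
eigvals, eigvecs = np.linalg.eigh(np.cov(centred, rowvar=False))
axis1 = eigvecs[:, -1]                # eigh returns ascending order

# Every individual receives a single coordinate on this axis.
coords = centred @ axis1
print(f"direction of the first axis : {axis1}")
print(f"share of the variance on the first axis : {eigvals[-1] / eigvals.sum():.3f}")
```

In an n-dimensional space the same calculation yields the whole series of orthogonal axes that shape the hyperspace.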
This method of data manipulation results in the factorisation not of an algebraic polynomial (numerical values) but of individuals and their characteristics (descriptive values).
1. Moles A. Les sciences de l'imprécis. Le Seuil, Paris, 1995.
2. Valéry P. (1871-1945) : French poet, essayist and philosopher, well-known for his personal published reflections 'Les cahiers'.
3. Descartes R. (1596-1650) : French philosopher, physicist, and mathematician who created analytic geometry, particularly well known for his discourse on method ('Discours de la méthode').
4. Pascal (1623-1662) : French philosopher, writer, physicist and mathematician, whose principal work describes his personal thoughts ('Les pensées').