Systems medicine is a promising new paradigm, but it poses some tough challenges for those concerned with causal inference. I’ll mention a few epistemological challenges here, and in a future post I’ll give some thoughts about how they might be solved.
What is systems medicine?
Systems medicine (SM) applies systems biology to medicine. Systems biology is the holistic study of biological systems – typically, systems of molecules and their causal interactions within the cell. Systems biology uses data-intensive functional genomics techniques, such as transcriptomics, metabolomics, and proteomics. These techniques often appeal heavily to mechanistic knowledge, and are used largely for mechanism discovery.
While systems medicine retains a theoretical goal (discovering pathophysiological mechanisms), it differs from systems biology in that it places far more importance on a practical goal, namely improving diagnosis, prognosis and treatment. It also tends to include higher-level clinical and environmental data, in addition to the sub-cellular level data commonly studied by systems biology.
The promise of systems medicine
By mining this rich seam of data, systems medicine promises to yield more robust conclusions, and to offer increased personalisation in medicine: with enough data pertaining to the subpopulation of patients similar to you, systems medicine promises to establish causal relationships that are relevant to you. This will hopefully lead to better targeted treatments. With more data and with increasing automation, systems medicine may also enable the discovery of more complex pathophysiological mechanisms. While there’s a lot of hype around systems medicine, as there is in any young field trying to gain a share of the funding pie, there is a sense that there are real opportunities too.
But these opportunities don’t come for free. Amalgamating a wider range of evidence means that there is more observational evidence to process, which increases the danger of confounding. Search strategies in systems medicine are guided by mechanistic background knowledge, which can be rather limited and error-prone in some areas. The ‘big data’ approach leads to substantial computational complexity, and a large number of simplifying assumptions are needed to overcome this complexity. Finally, but perhaps most importantly, there is the challenge of how best to integrate a wide variety of sources of evidence.
There is a sense, then, in which SM is Sado-Masochistic medicine: sadistic in that it demands so much of someone who wants to fully understand an SM research project; masochistic in that there are so many obstinate challenges that face SM researchers.
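To make the confounding worry concrete, here is a toy simulation in Python. It is only a sketch under invented assumptions, not drawn from any real study: a hidden factor `z` drives two measured variables `x` and `y`, which have no causal connection to each other. The variable names, noise levels and sample size are all hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Hypothetical common cause (e.g. some unmeasured environmental factor).
z = [rng.gauss(0, 1) for _ in range(n)]
# x and y each depend on z plus independent noise; neither causes the other.
x = [zi + rng.gauss(0, 0.5) for zi in z]
y = [zi + rng.gauss(0, 0.5) for zi in z]

# Strong correlation despite no causal link (theoretical value is 0.8).
raw = pearson(x, y)

# Remove z's contribution. This works only because the toy model is known
# and z's coefficient is 1; real data would need z measured and modelled.
adjusted = pearson(
    [xi - zi for xi, zi in zip(x, z)],
    [yi - zi for yi, zi in zip(y, z)],
)

print(f"raw correlation: {raw:.2f}")
print(f"after adjusting for z: {adjusted:.2f}")
```

The raw correlation is strong, and it all but vanishes once the common cause is accounted for. The more observational variables one amalgamates, the more such unmeasured common causes there may be.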
Let’s dwell on the problem of how to integrate evidence. The present-day perspective is that there is a need to find a model that fits multiple datasets. Each kind of data yields a ‘fingerprint’, i.e., a model that gives a partial indication as to what is causing what. For instance, metabolomic data, proteomic data, transcriptomic data, clinical data and patient-reported outcomes all yield fingerprints, and these fingerprints will differ according to whether the data is obtained from human or animal studies. A model is needed that fits all these fingerprints: this is called the ‘handprint’. Unfortunately, there is no consensus as to how to obtain this handprint model.
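To illustrate the fingerprint and handprint idea, here is a deliberately naive Python sketch. Since there is no agreed method for obtaining a handprint, this should not be read as one: it simply treats each fingerprint as a set of directed causal claims with a support score, and keeps the claims whose average support across all fingerprints clears a threshold. The function `naive_handprint`, the edge names and all the numbers are invented for illustration.

```python
from collections import defaultdict

# Hypothetical fingerprints: each maps a directed edge (cause, effect)
# to a support score in [0, 1]. Names and scores are made up.
fingerprints = {
    "transcriptomic": {
        ("geneA", "proteinB"): 0.9,
        ("proteinB", "symptomC"): 0.4,
    },
    "proteomic": {
        ("geneA", "proteinB"): 0.8,
        ("proteinB", "symptomC"): 0.7,
    },
    "clinical": {
        ("proteinB", "symptomC"): 0.6,
        ("symptomC", "proteinB"): 0.3,
    },
}


def naive_handprint(fingerprints, threshold=0.5):
    """Keep edges whose mean support over ALL fingerprints meets the threshold.

    A fingerprint that is silent about an edge counts as zero support,
    which is itself a strong (and questionable) simplifying assumption.
    """
    totals = defaultdict(float)
    for edges in fingerprints.values():
        for edge, support in edges.items():
            totals[edge] += support
    n = len(fingerprints)
    return {
        edge: total / n
        for edge, total in totals.items()
        if total / n >= threshold
    }


handprint = naive_handprint(fingerprints)
for (cause, effect), support in sorted(handprint.items()):
    print(f"{cause} -> {effect}: mean support {support:.2f}")
```

Even this toy version has to decide how to weigh a silent dataset against a dissenting one, how to treat claims about opposite causal directions, and where to set the threshold. None of these choices is dictated by the data, which is one reason the integration problem is hard.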
Worse than that, this perspective is rather simplistic, in that it neglects the fact that the overall model also needs to fit available mechanistic evidence. There is a tendency in SM to think that if it’s not a dataset then it’s not evidence, but of course mechanistic evidence guides SM at all stages, and high-quality mechanistic evidence can be obtained by a variety of means other than dataset-yielding statistical trials – e.g., literature searches, biomedical imaging, simulations. Presently, mechanistic evidence is used in a rather intuitive way, often without explicit consideration. But as SM progresses, eyeballing the mechanistic evidence won’t be a realistic prospect, because there is simply too much of it in the SM process. We need to make its contribution explicit.
First, we need to be explicit about the role that mechanistic evidence plays. To mention but a few examples: it is used to devise and interpret experimental studies, it is used to rule out the hypothesis that a correlation is non-causal, and it is used to help determine the direction of causation. Second, having understood these roles, we need to be explicit about the constraints that mechanistic evidence of different kinds and quality imposes on the handprint model. In SM we need to make quantitative predictions across levels of mechanisms, so the handprint models in question had better be quantitative models and had better be able to represent different hierarchical levels of a mechanism.
Quite a substantial challenge, in my view.