'Digital Twins and Virtual Patients: Ethics of Representation for Precision Medicine' Presentation by Mildred Cho

What if, in the near future, clinical trials were conducted not on living patients, but on digitally simulated ones?

Alexander Borsa
December 20, 2022

Dr. Mildred Cho explores this type of possibility, as well as present-day uses of simulated data in precision medicine. Dr. Cho is the Associate Director of the Center for Biomedical Ethics in the Pediatrics Department at the Stanford University School of Medicine, with an additional appointment in Primary Care and Population Health in the Department of Medicine. Last semester, Dr. Cho gave a presentation to Columbia’s Precision Medicine: Ethics, Politics, and Culture working group on the ethics and implications of virtual representations in biomedical research.

Unlikely Origins

To start tracing the origins of simulation, Dr. Cho began with a provocative, and perhaps unexpected, example. During the 1970 Apollo 13 mission, which was meant to land astronauts on the Moon, an oxygen tank exploded, forcing the crew to cancel the remainder of their mission and fight for survival to land safely back on Earth. Down at Mission Control, an earthbound team of scientists and engineers began working furiously to assist their colleagues up in space. To better understand what would be necessary – or even possible – on the spacecraft, those on Earth arranged materials in the command center to model characteristics of the broken equipment. Communicating with the astronauts in real time, they continued to update their impromptu “simulation” to reflect physical changes made on the ship, and to advise the astronauts in turn on what to do next. Eventually, Mission Control was able to guide the crew home alive, as millions watched from below.

While the above example may seem an unlikely starting place from which to theorize simulated data in precision medicine, many of the same underlying information processes and interpretive choices are at play. Through a constant stream of input from the “real” situation up in space, a working model was produced and consistently modified in order to best serve the needs of those involved. As replicating every aspect of the emergency would be impossible – alas, we cannot easily turn off gravity – those involved had to decide what elements of the situation to replicate, and how to extrapolate from their representations to make strategic choices. Today, similar but more advanced techniques are still in use: engineers routinely pair digital simulations with physical twins, as is the case with Mars rovers.

Now, in the context of precision medicine, the data being collected and the material situations being simulated are certainly different. Rather than oxygen tanks and spacecraft, investigators are interested in simulating components of human bodies and their reactions to conditions such as having a certain disease or being treated with an experimental drug. As in the Apollo 13 scenario, this poses a number of central methodological questions: What data needs to be collected? How do we turn that data into a meaningful simulation? And how do we apply that simulation in a useful, appropriate, and ethical fashion?

Simulating Ourselves

Today, simulation is used in a variety of ways across the biomedical, health, and social sciences, from imputing missing values in datasets, to reproducing molecular dynamics, to making projections about the uptake of an emergent health intervention. In many instances, simulations such as these have proven incredibly useful, and have helped advance science and medicine or alleviate suffering. But the jury is still out on what the near-to-distant future of simulation holds in store. While some remain skeptical about the idea of being able to produce a full “digital twin” for a living person, others predict that a large portion of future FDA-approved clinical trials will rely exclusively on simulated data before a drug ever makes its way into a human body. The advantages of in silico clinical trials such as these could be manifold: they may reduce trial cost and time; they may increase the biological variability of a sample population; they may reduce risks to, or reliance on, human and animal participants; and they may reduce bias introduced by prior knowledge or working hypotheses.

But the end goal of a simulation impacts the data that needs to be collected in order to produce it. What types of data are necessary, and how much of each? For example, a recent perspective in Nature by Laubenbacher et al. outlines what would be needed to produce a digital twin of the human immune system, including molecular-, cellular-, tissue-, organ-, and body-level data. These data would then be integrated to produce a wide array of models and simulations that, ideally, could inform diagnosis, prognosis, and therapy optimization. But, as with any emerging scientific field, there is significant uncertainty and debate about what data is relevant to a given problem area. Just as monitoring and recording every exposure, behavior, and biomarker for someone over their life would raise ethical and epistemological questions, so does deciding what phenomena to exclude from incorporation into the production of a digital twin.

Homing in on these tensions, Dr. Cho provided a moral and bioethical roadmap to the current state of digital twin studies. In each context, investigators and impacted communities must decide how well a digital twin – or, as Dr. Cho calls it, a digital simulacrum – must represent an individual in order to be considered a viable representation. This requires deciding which characteristics should be incorporated, and which would merely be noise. Additionally, representations and inferences tend to “take on a life of their own.” We might envision scenarios where the conclusions drawn from simulations about an individual or population come to eclipse those people’s own lived experiences or understandings of their personal realities. If a healthcare provider or insurance company imposes decisions, treatment programs, or pricing on patients that privilege simulacra over their own testimonies, this might be a form of epistemic injustice, and one likely to impact already marginalized communities. This introduces additional questions about whether individuals should have the right to refuse the use of simulation-based predictions in precision medicine contexts, or whether such escape would even be possible if simulation becomes integrated into all parts of medical science.

The need for copious data also introduces a variety of moral issues pertaining to data collection, storage, and safety. As it stands, many scholars argue that current data protection policies are not sufficient. Today, protected characteristics – such as one’s pregnancy status, ethnic background, or sexuality – are constantly inferred through consumer and social media data, and used to advertise products and services in a targeted fashion. (See: Target identifying shoppers as pregnant before family members, or perhaps even the shoppers themselves, know.) Thus, even domains of one’s life that are not specifically incorporated into simulacra could come to be impacted, or influence how one is treated by different systems. One of the legal responses to this sort of problem is the “right to reasonable inference,” or the idea that there should be limits on what types of conclusions can, or ought to, be drawn from digital simulacra.

Because of the stakes involved, it’s generally agreed that some sort of community involvement is ethically necessary when compiling data about a certain population. But even within a community, people may feel very differently about how they want their data used, as with recent sociogenomics research on the genetic associations of same-sex sexual behavior. This means that, while engaging community members and incorporating their feedback is of the utmost importance, it does not necessarily delineate which uses are acceptable and which are not.

Dr. Cho advises that this domain of research requires a level of epistemic humility with regard to what we know, what we don’t know, and what we don’t know that we don’t know. For example, whereas much of biological and clinical research emphasizes knowing the causal mechanism(s) through which a certain intervention or disease operates, machine learning algorithms used by data scientists make predictions based on probabilistic associations. Thus, we may end up with clinical recommendations or forewarnings that are strongly supported by the data, without ultimately knowing why that is the case.

While some data engineers may not view ethical dimensions as relevant until a product is in beta testing with real people, or may view them as outside their scope, ultimately, ethics needs to be incorporated into every step of digital simulacra research. With inherent limits on digital representation, on our knowledge of complex biological systems, and on the translatability of lived experience, each step of the research and development process – deciding what to track, how to track it, how to turn that data into a simulation, and how to extrapolate from it – has ethical dimensions. And so, while digital simulacra hold immense promise in the field of precision medicine, we must also attend to the shortcomings, uncertainties, and tensions inherent to this field.