My colleagues and I have been trying to get a handle on how generative AI and large language models — the technology underlying systems like ChatGPT — will meaningfully change health care. How soon, if ever, will we see doctors feeding our symptoms into a chatbot that spits out a diagnosis? How will we know if the technology they’re using has been trained on biased data?
So we asked the experts: machine learning researchers, ethicists, and health professionals who are closely observing the development of generative AI and attempts to incorporate it into health care.
Here are some of the key things they think you ought to know:
Generative AI models are getting better, and getting better fast, because they're being trained on more and more data. When they spit out an answer in response to a prompt, these models are essentially doing an advanced form of auto-complete, predicting the probability of the next words or phrases.
These models might seem smart, but they're less intelligent than they appear. In particular, they still can't reason like a physician would.
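To make that "advanced auto-complete" concrete, here is a minimal sketch, assuming Python with the open-source Hugging Face transformers library and the small GPT-2 model purely as illustrative stand-ins (not the systems any of the experts described): given a snippet of text, the model assigns a probability to every possible next word, and a chatbot's answer is built by repeatedly sampling from those predictions.

```python
# Minimal sketch: ask a small open language model (GPT-2, used here only as an
# illustration) which words it predicts should come next after a prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Hypothetical clinical-sounding prompt, chosen only for illustration.
prompt = "The patient reports chest pain and shortness of"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # scores for every vocabulary token at each position

# Convert the scores for the *next* token into probabilities and show the top guesses.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id)):>12}  {prob.item():.2%}")
```

The point of the sketch is that nothing in this loop "understands" the patient; the model is simply ranking likely continuations, which is why fluent-sounding output can still be wrong.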
Read more from those experts, including perspectives from University of Michigan computer scientists Jenna Wiens and Trenton Chang, athenahealth data science senior architect Heather Lane, and Carnegie Mellon University professor Zachary Lipton.
We also asked you what questions you had about generative AI in health care, and took your concerns to the experts. They offered some reassurance, saying health care organizations testing out generative AI may feel pressured to disclose that they're doing so, but they also advised patients and providers to be vigilant about errors and bias.
The bad news: The technology may be impossible to avoid entirely, experts agreed. "Without being too alarmist, the window where everyone has the ability to completely avoid this technology is likely closing," John Kirchenbauer, a PhD student researching machine learning and natural language processing at the University of Maryland, told STAT. Also on your list of questions: whether medical records riddled with errors will cause problems for the AI tools trained on them, and whether those tools run the risk of bias.
What’s generative AI’s role in diagnosis?
While health leaders agree generative AI could help understaffed health systems with burned-out workforces handle simple communications with patients, they're locked in a heated debate about whether the technology should creep into actual diagnoses, the founders of the General Catalyst- and Andreessen Horowitz-backed startup Hippocratic AI (which I wrote about last week) told me.
Co-founders Munjal Shah, a repeat entrepreneur and computer scientist, and Meenesh Bhimani, a doctor and hospital executive, agree that it’s not currently safe to unleash the technology directly on health problems because of its tendency to hallucinate. Their venture aims to build a large language model specifically for use in medicine, pressure-tested by a team of health care professionals. And while it might one day tackle direct patient care, the team will start by focusing on applications that might ease documentation or communication burdens.
Health systems focused purely on generative AI's diagnostic potential are "struggling with imagination," Shah said. "They can't think of all the other roles in health care," he added. "I don't think we're really going to know for a while what all the capabilities are."