Google DeepMind: The Podcast · 2025-05-08

AI and the Future of Health with Joelle Barral

Hosts: Hannah Fry

Guests: Joelle Barral

AI in healthcare · Medical imaging · Diabetic retinopathy · Tuberculosis detection · Cancer research · Digital twins · Clinical trials · Medical LLMs · AMIE · Med-Gemini · Med-PaLM · Data privacy · Human-AI collaboration · Global health access

Why it matters

Diabetic retinopathy AI has been FDA-cleared and deployed in Thailand, screening 700,000 people, with plans to expand.

Key claims

Diabetic retinopathy AI has been FDA-cleared and deployed in Thailand, screening 700,000 people, with plans to expand.
DeepMind established ground truth for retinal imaging by aggregating diagnoses from 50 ophthalmologists, since individual experts often disagreed.
AI extends beyond imaging to sound-based diagnostics (e.g., tuberculosis detection from cough recordings) and virtual tissue staining that preserves samples.
A multi-modal cancer collaboration with Institut Curie combines genomics, transcriptomics, and imaging to tackle hard cases like triple-negative breast cancer.

Episode summary

Summary

In this episode of Google DeepMind: The Podcast, Professor Hannah Fry speaks with Joelle Barral, Senior Director of Research at Google DeepMind, about how AI is reshaping healthcare from diagnosis to treatment. Barral argues that while the patient-doctor relationship will persist, each touchpoint will be augmented by AI agents, and that healthcare is "paying the debt" left by decades of burdensome data entry by clinicians.

The conversation covers concrete deployments already in the clinic, including DeepMind's diabetic retinopathy screening tool (FDA-cleared and used to screen 700,000 people in Thailand), cough-based tuberculosis detection, virtual tissue staining, and multi-modal cancer research with Institut Curie targeting triple-negative breast and uterine cancers. Barral stresses the importance of rigorous ground-truth labeling—citing a case where 50 ophthalmologists were needed to reach consensus—and notes that AI can occasionally exceed individual expert performance while still leaving final decisions to physicians.

Looking forward, Barral discusses digital twins for clinical trials, privacy and data sovereignty concerns (particularly in Europe), and DeepMind's medical LLMs. She describes AMIE (Articulate Medical Intelligence Explorer) as a research effort to build a conversational diagnostic agent, now in a Harvard/Beth Israel clinical study under IRB approval, and highlights Med-PaLM, the team's first model fine-tuned on a medical corpus, and Med-Gemini, which inherits Gemini's long-context reasoning and multimodality while being fine-tuned on medical corpora. Barral emphasizes caution, defends a "family physician in your pocket" vision that requires long-term patient knowledge, and notes that today's LLMs already score well on empathy benchmarks.

Digital twins are being explored as virtual cohorts to reduce the size and cost of clinical trials.
AMIE (Articulate Medical Intelligence Explorer) is DeepMind's research LLM for diagnostic dialogue, now in an IRB-approved clinical study with Harvard/Beth Israel.
Med-PaLM was the team's first model fine-tuned on a medical corpus; Med-Gemini is Gemini fine-tuned on medical corpora, designed to inherit long-context reasoning and native multimodality.
Barral frames AI as augmenting physicians to address global clinician shortages and reclaim the 'joy of practicing medicine,' rather than replacing the patient-doctor relationship.

Source material

Transcript

Joelle Barral I'm hoping AI contributes to bringing back the joy of practicing medicine.

Finally, with all of the data that physicians and all the other healthcare professionals have taken so much time to enter, we are going to be able to derive insights and help them.

Welcome back to Google DeepMind The Podcast.

I'm Professor Hannah Fry.

In this episode, I'm talking to Joelle Barral, Senior Director of Research at Google DeepMind, about AI for health.

For years, we have joked about how the internet is the hypochondriac's best friend, capable of turning a small headache into a terminal illness, and vice versa, with a quick search and a click of a button.

But quietly, behind the scenes, the role of algorithms in medical care has been shifting.

We've already talked a lot on this podcast about the impact of AI on drug discovery and research into proteins.

But now, AI promises to change diagnosis and treatment too.

Welcome to the podcast, Joelle.

I mean, there have been some pretty big changes already with AI, but should we expect that healthcare will look very different in 10 or 15 years' time to how it looks now?

Absolutely.

I believe so.

I think healthcare is really poised to be drastically changed with AI, but it may not look too different.

I picture healthcare as a fairly sticky ecosystem.

So 10, 15 years from now, we will likely still go to our primary care doctor and then follow through with specialist visits, et cetera.

But each of them is going to be augmented, if you wish, by an AI agent.

So underneath, the system will be very different, but it may not appear as different from what we know today.

How long have you been working in AI in healthcare?

Oh, wow.

My whole career, I would say, I was still a student when, I think it was CS229, right?

The class on machine learning was taking off.

And I remember working with, I was further along in my PhD, but working with students on how to segment the larynx, right?

And AI was this new tool that could really help us do that better.

And I haven't stopped ever since.

But why this area for you?

What's the appeal of it for you?

In healthcare, you're always challenged, right?

There are lots of things that we don't do that well or you don't have infinite data.

And so you're never perfect.

So you're always trying to leverage the best tools to do what you're trying to do.

And then I joined Google a decade ago to work on surgical robotics.

And I remember vividly, it was the beginning of us really realizing that with images, anything that a human could do in terms of interpreting those images, machines were going to be able to do.

And so I sat down with my surgeon colleague and we took a hundred cholecystectomies, which is when you remove the gallbladder, and we manually segmented livers and gallbladders.

And I fed them to a pretty simple neural net to see whether the algorithm would be able to distinguish between those two things.

And I remember being incredibly surprised actually when it did that perfectly, because most of the other tools we had at our disposal were never perfect, right?

Like it did the job, but not always.

Now livers and gallbladders really don't look the same.

It's a very easy task.

Any student will be able to tell you which one is which.

But still, that really for me meant, okay, now we have something that will be infinitely more powerful than any of the algorithms I had been working on previously.

And people forget, right?

But when I was still a student, it was hard with computer vision to decipher a QR code.

That was where we were back then.

And so again, AI has really opened up a world of capabilities far beyond anything we could envision before.

Okay.

Well, let's start off with some of the examples of places where AI has already made a lot of headway.

I'm thinking about diagnosis here and maybe medical imaging in particular.

Talk to me a bit about what's been going on.

Yeah, we've seen a lot of that over the last 10 years.

That's what we typically call narrow AI.

Narrow because it's solving one task, right?

You mentioned medical imaging.

So you can take a chest x-ray or an MRI, or we've also done diabetic retinopathy work.

Those are already tools that have been FDA cleared, for example, and deployed in the clinic.

And they are augmenting radiologists in the sense that for those specific tasks, the machine does a pretty damn good job at annotating the image for the physician.

Let me make sure I understand this then.

Okay.

So if we take diabetic retinopathy as an example, so this is people who have diabetes and it's a cause of blindness, is that right?

Absolutely.

It's the leading cause of preventable blindness worldwide.

And it's pretty simple.

So you take an image of the back of the eye with a fundus camera, and that gives you an image of the retina that ophthalmologists will then grade on a scale from one to five to indicate whether patients have no disease, moderate disease, or severe disease.

And what was striking when we started working on that particular problem was the fact that it was actually really hard to get to what we call ground truth, meaning as we're training our machine learning models to learn a task, we first need to provide them with a set of ground truth images.

That means the pairs of the image and the label.

And for that particular task, we asked maybe one, two, three ophthalmologists, and often we got different answers.

And I think one thing we did particularly well was to be very rigorous in trying to establish ground truth.

And so we ended up working with 50, 50 ophthalmologists.

And we realized that actually on some images, they really didn't agree with each other.

Some would have given one, two, three, like each one of the scores could have potentially been the diagnosis a patient would have received because patients only see one, right?

They don't go and have their images graded by 50 people.

Yeah.

Boy, so some ophthalmologists would say this is a mild or no disease, and others would say all the way up to severe on the same image.

Precisely, which sometimes is not too surprising, meaning that, you know, most ophthalmologists, and that's true for most physicians, see what's in their community.

And the rare cases, by definition, they don't see them very often.

So it can get very hard to properly annotate an image if you don't see that type very often.

Well, how do you do that then?

If even the experts don't agree, how do you decide what the correct answer is?

That's exactly why we had to go to 50, right?

Then you have this consensus among all of the ophthalmologists, and sometimes you go to even deeper experts, right, to really get to what is the correct label for that image, and that way you properly train your algorithms.

And then, you know, we also did a lot of rigorous testing to make sure that we had algorithms that we could trust.
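
The consensus idea described above can be sketched in a few lines of Python. This is only an illustration, not the procedure the team used: the majority-vote rule, the tie-break toward the more severe grade, the `min_agreement` cutoff, and the function names are all assumptions made for the example.

```python
from collections import Counter


def consensus_grade(grades):
    """Return (label, agreement) for one image graded by several readers.

    `grades` is a list of integer grades on the 1-5 scale.
    The label is the most common grade. Ties go to the more severe grade
    (an assumption, so that ambiguous images err toward referral).
    `agreement` is the fraction of readers who gave the winning grade.
    """
    if not grades:
        raise ValueError("at least one grade is required")
    counts = Counter(grades)
    top = max(counts.values())
    label = max(grade for grade, count in counts.items() if count == top)
    return label, top / len(grades)


def needs_adjudication(grades, min_agreement=0.6):
    """Flag images where too few readers agree (hypothetical cutoff)."""
    _, agreement = consensus_grade(grades)
    return agreement < min_agreement
```

Images flagged by `needs_adjudication` are the ones that, in the conversation's terms, would go to "even deeper experts" rather than being labeled by vote alone.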

But then I suppose on the flip side of that, even though that makes the labeling quite difficult, if you know that experts are disagreeing, then actually you can create something that is a new gold standard, really, like better than experts.

Absolutely.

And I think that's where the AI has really its role to play, right?

It's where for tasks that it is really good at, that are really not very easy to be done by a single human.

I seem to remember reading this paper where some researchers had all of these images at the back of the eye, and were like, you know what, let's just see for fun if you can tell the sex of the patient based on the blood vessels at the back of the eye.

And no ophthalmologist in the world, I think, could do this reliably.

But the AI can.

Yeah, absolutely.

It was very interesting to then interrogate those images for other things than just diabetic retinopathy, and see that the model performed better than flipping a coin, right?

And for some things, I'm not sure you need a model to, you know, decipher the sex of your patient, but I do think for indications of potential cardiac disease, for example, that can be a useful tool, because if you're going to do that screening for diabetic retinopathy anyway, but now you have a test that can also provide you with additional information, you could potentially screen for more diseases.

So indeed, it was, you know, both interesting scientifically to realize that those images contained a lot more information than we initially thought, and potentially also a source of applications that could derive from that.

What does it imply maybe that even though you've created an algorithm to do this narrow task, actually, there is potential for the AI to help improve the overall understanding of the human body?

I would answer maybe in two ways.

The first is, you know, in radiology, there's this thing called incidental findings, because it's very often that if you screen me head to toe with MRI right now, you'll find things, you know, you'll find nodules, things that are abnormal, they will get me worried.

But actually, we never did clinical trials to see whether having that thing somewhere is good or bad, right?

And many people live with those things forever without any issues.

So you don't really want to do that type of screening that will, you know, get everyone worried.

The other thing I would say is when you do a blood test, you do a blood test for a very particular thing.

The lab who did the blood test for finding a particular virus is not responsible for checking everything, right?

And if they haven't noticed that you had something else that wasn't what, you know, you got blood drawn for, they're not responsible for that.

And so for imaging, it's always a little ambiguous what to do with those incidental findings, right?

And for those aspects of narrow AI, I think the jury is still out whether or not we want the AI to always look for everything else when it's looking at one particular image, even if it could provide pretty helpful alarms.

Those are images that we've spoken about.

But are there other inputs that you can use for this very narrow AI type of diagnosis?

Absolutely.

I would say all types of modalities, right?

We've done a lot of very interesting work with sound, for example, leveraging sound to be able to detect tuberculosis from cough.

And people have also explored that quite a bit during COVID, obviously.

And so we see good biomarkers actually in sound that can help also provide some relatively cheap ways to diagnose those types of diseases.

So wait, it actually works then.

Yes, it does.

You record a cough and it can tell you if you've got TB.

In short, yes, that's exactly what it is.

Okay.

But there are limitations to this stuff though, right?

I mean, are they getting it right every single time?

No, I think that's a very good point.

Like with any algorithms, right?

There is sensitivity and specificity, so it can have either false positives, false negatives.

So thinking that something exists when it doesn't or missing something that it should have picked.

In general though, what we do and what we did with the diabetic retinopathy paper is really checking how it does with respect to the best physicians or the best panels of physicians and then letting the human who is always working with the algorithm decide where on that specificity, sensitivity trade-off they want to operate.

And so they can decide that really they cannot afford false alarms or the other way around, that they cannot afford missing anything.

But in both cases, I think for an algorithm to be cleared by regulatory authorities, it needs to have shown very strong performance.
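
To make the trade-off concrete, here is a minimal Python sketch that computes sensitivity and specificity for a binary screening model at different decision thresholds. The scores and labels are made-up toy values, and the function name is hypothetical; nothing here reflects the actual diabetic retinopathy or TB models.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Compute (sensitivity, specificity) at a given decision threshold.

    scores: model outputs in [0, 1], higher means "more likely diseased".
    labels: ground-truth values, 1 for disease and 0 for no disease.
    A case is called positive when its score is >= threshold.
    """
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if label and predicted:
            tp += 1  # disease present and flagged
        elif label:
            fn += 1  # disease present but missed (false negative)
        elif predicted:
            fp += 1  # no disease but flagged (false alarm)
        else:
            tn += 1  # no disease and not flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data for illustration only.
scores = [0.1, 0.4, 0.35, 0.8, 0.65, 0.9]
labels = [0, 0, 1, 1, 0, 1]

for threshold in (0.3, 0.5, 0.7):
    sens, spec = sensitivity_specificity(scores, labels, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Lowering the threshold catches more true cases at the cost of more false alarms, and raising it does the reverse, which is the operating-point choice left to the clinician in the passage above.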

Okay.

Let me break that down then.

So it could be that an algorithm could say that it's TB when it isn't, but it also could be that an algorithm could miss a genuine case of TB and say that it was fine.

And obviously you don't want either of those things to happen.

But what percentage accuracy almost are you willing to accept before you say, okay, this model is adding value?

So it really depends, right?

If you're going to deploy your, let's say it's a screening test in places that had nothing before, it's a very different story than if you're replacing an existing alternate way of screening that same population.

If you're replacing something, you'd better be at least better.

If you're coming in where there was a void before, then public health authorities are going to decide what they deem acceptable.

It also depends what will happen to the patients that are provided with that diagnosis, right?

If they go home and you never see them again, it's a different story than if your tool is a pre-screening tool and then you're going to reroute your patients to an additional screening tool for confirmation.

When you launch these things, for example, the TB model or diabetic retinopathy, does that mean that you aim initially to go to places where they don't have these existing screening tests in place?

It really depends, right?

For diabetic retinopathy, for example, it is something that we've indeed deployed in Thailand and then we often work with partners to really bring it where it makes a difference.

We've already screened 700,000 people in Thailand and we're expanding that over the next few years.

Why Thailand?

Thailand is one of those countries in which the patient population that each ophthalmologist has to care for is fairly large, right?

There are many places in which there aren't really enough doctors and so it's a good example of a place in which AI screening can really help improve outcomes.

Let's say that you have a situation in which some sort of medical imaging is being used in conjunction with a doctor.

What happens if the two disagree?

I think at the end of the day, the doctor is really the one making the call, right?

So it's very important that they retain control and they are the one making the decision and they are the ones signing their name on the report.

And so, you know, they would have to explain, right?

If they're saying the machine is wrong, it could be for a variety of reasons.

In other cases, you know, you could think of the algorithm making the physician rethink, right?

So it's checking your work like a spell checker, if you wish, right?

Sometimes I disagree with my spell checker, with, you know, accents on words in French or things like that. It's not perfect, but I will double check.

Okay, well, let's talk about some of the more advanced stuff here because up until now, I mean, the examples that we've been using here is like an MRI or a photo of the back of an eye or, you know, the sound of a cough.

What about the more holistic view?

So, I mean, really good medicine doesn't see a human as a collection of interesting or otherwise medical problems, right?

Sort of sees a human as a human.

Can you use AI to think more holistically?

AI is indeed very good at looking together at things that up until now, we were mostly capable of looking at, you know, individually and maybe bring insights that up until now we didn't have.

Like what?

I mean, if you do manage to connect up everything we understand about the cell to everything we understand about an organ to eventually everything we understand about a human, what do you find in those connections?

Yeah, back in the days, we did quite a bit of work in virtual staining, right?

So, when you have an H&E slide, so that's what happens when any piece of tissue is removed from your body in surgery: it will typically be sliced and looked at under the microscope.

And the way we do that is we make hypotheses as to, you know, what might be happening and we stain accordingly so that it makes those things visible.

And you can think of some of those techniques as making that visible, but without having to stain, right?

So, really revealing some of that information if it's already in the tissue, but without having to bring kind of additional things that are consuming that piece of tissue.

Which would destroy the tissue for future tests.

Exactly, exactly.

So, to me, you know, there's also something quite profound in the way we're now interrogating tissue with more and more instruments with things like transcriptomics, genomics, single cell, etc.

And so, that's really a wonderful application for AI because the AI can, you know, look at each modality and if you wish provide us with eyes for that particular scientific instrument.

And then it goes beyond because it's also capable of bridging between those different types of scientific instruments.

So then, you presumably can start to combine some of these quite complex AI tools, you know, on imaging, on genomics.

Does that actually allow you to advance in your understanding of diseases?

I mean, I'm thinking of cancer care, for instance.

Yeah, it's a great example and that's precisely what we are working on with l'Institut Curie, the Curie Institute in Paris, which is very advanced in exploiting all of those most recent modalities to better understand cancer.

In that case, women's cancers, so, uterine cancer or breast cancer, where, you know, despite our best efforts for many, many decades, we're still short of answers for many women.

And we're really hoping that by combining those modalities, by getting to a deeper understanding at the cellular level of what's going on, we'll finally be able to crack, you know, for example, triple negative breast cancers or some of the, again, more advanced cancers that today we don't have good solutions for.

As well as cells and genomics and imaging, there are other data sources that the AI can bring into the equation here as well, right?

Absolutely.

And I think that's also maybe one of the secrets AI will be able to decipher, which is health is in everything we do, right?

It's how many steps you walked today, what you ate for breakfast, lunch and dinner, whether you've interacted with friends or been lonely all day long.

And so it's typically really, really hard to translate that into either healthy habits or understand how much of that is factoring into something physical that is going on.

And we're doing a lot of work with sensors.

I'm wearing this Fitbit today.

You know, it's just one of the data sources that can really be leveraged in combination with everything else to try to both better understand health, sometimes just as an individual.

And also when we're trying to change behavior, right, accompany us on that journey.

I mean, you're talking about quite different quality of data here, right?

You've got sort of slides from biopsies and genomic data and then step count.

It sort of feels like a quite fuzzy thing around the edges.

Would it actually make a difference?

It does.

I mean, it's been shown, you know, over and over again, for example, how much sleep matters, right, for both cardiovascular health and in oncology, right, in terms of preventing cancer.

So it's one of my hopes, right, that AI will bring those two together in a way that is harder to do in our traditional healthcare systems and also very hard to do for us as individuals because you see that type of impact at a population level.

As an individual, it's really hard to convince yourself that you should go to bed one hour earlier because that's actually the best thing you can do for your health.

There's quite a lot of buzz at the moment about digital twins.

Does this also add to the whole holistic view of health?

Is there an idea of making a kind of digital twin of yourself for healthcare purposes?

Yeah, we see that a lot and it can mean different things for different people, right?

There is the idea that the digital twin is kind of a simulation of someone with a similar persona, for example, and then that can be a good proxy to interrogate the potential impact of different types of interventions.

And then digital twin can mean something quite different for the pharma industry, for example, where there's a lot of work trying to see how far we can go with clinical trials if we manage to assemble virtual cohorts that are allowing us to gain as much knowledge about the safety and efficacy of a drug, but without needing as many people for that particular clinical trial.

Oh, that's so interesting.

So is this about, I don't know, making a kind of cohort of like simulated humans or maybe not the whole thing, maybe just an organ or whatever it is you're particularly focusing on, but so that you don't necessarily have to use as many people in your clinical trial.

Exactly.

For clinical trial, that would be exactly that.

But then to be able to do that effectively, you have to really, really understand what real humans look like, which means having a wealth of data from individual patients somewhere along the way.

How do people feel about contributing their own data for this kind of end, for the research of medical purposes?

I think, you know, when we talk about privacy, we see different concerns in different parts of the world.

In Europe, there are a lot of sovereignty concerns.

People want to know that the data is, I mean, not in all countries, but is staying on their soil.

And we have solutions for that, right?

All of the trusted public cloud solutions, for example, are complying with the highest levels of regulation on that topic.

In other parts of the world, we see a lot of appetite for people to not think that they are sick for nothing.

By that, I mean, it's actually quite compelling to know that if you're sick, but you're contributing data to research, you're helping make the next person who is going to have the same fate a little less sick, if you wish.

So I think really enabling that virtuous cycle is really where we have to be.

Because it does feel like healthcare data is a particularly sensitive case, because on the one hand, exactly as you described, there is huge potential to advance our understanding and improve conditions for future generations.

But on the other hand, if healthcare data gets into the wrong hands, or is misused, or used irresponsibly, I think people have real concerns about the potential ramifications of that.

So I mean, how do you strike that balance?

Once you have the technology to actually protect the data, there's also accountability, right?

Meaning that if we're saying that this research will be helpful to the people, then indeed, at the end of the day, it is research that is helpful to the people where the data originated, right?

And I think that's a very important principle that in the same way that you shouldn't do clinical trials in places of the world, in some places of the world, and then the drugs benefit in other places of the world.

So you're not just extracting from one to give to another?

Precisely.

Okay, I also want to ask you a bit more about the patients' experience in all of this.

Are large language models changing the game here already?

I mean, are you concerned that people are trying to diagnose themselves with generative AI?

Yeah, I mean, I would say even before, you really shouldn't do that, right?

Like you shouldn't play your own doctor.

People do, though, don't they?

I think people want to know, and especially in places where waiting time to see a physician is becoming a real issue, then of course, you know, patients are trying to learn as much as possible and to decipher what's happening.

And sometimes quite effectively, because they are their best advocate, right?

Or a parent trying to figure something out for their child.

And sometimes, you know, we have a couple of examples where for rare diseases, actually, we've seen patients do a remarkable job.

Now, I would say, you know, at Google, we have always tried to provide the most helpful answers to our users.

And so with symptoms, for example, we have the knowledge cards that you've probably noticed that are telling you about a disease and with authoritative content, right?

Like trying to be clear in the explanation of common symptoms and potential options for treatment, but never crossing that line and telling you what your own diagnosis might be.

Now with large language models, our model Gemini will also not tell you what your diagnosis will be, right?

It will tell you, "Sorry, I'm not a medical doctor.

You should go and see a doctor."

But you can get more helpful answers if you're asking, you know, what potential conditions could explain this, right?

And it will give you more of the textbook answer.

I like to think of it as, you know, those family guides for healthcare, those, you know, I don't know, the big thick books that you might have when you have small children and you're trying to know what to do in those particular cases.

Well, it's that same level.

And if we believe that large language models are going to be able to provide diagnosis, we need to continue the research.

And that's what we're doing with our work with AMIE, the Articulate Medical Intelligence Explorer.

It's a research project in which we are trying to see which conversational abilities a large language model can have to establish a diagnosis.

So can it ask the right question, like the way a physician would do, right?

Like dialoguing with a patient.

How close are we to seeing a system like AMIE actually being used in medical settings?

So we've been working on this project for quite a while already.

And you know, our first research papers were all based on research done with patient actors and simulated scenarios.

We're now doing a clinical study under IRB approval with Harvard at the Beth Israel Medical Center to see what happens, you know, again, in a very controlled environment with physician supervision with the system.

So that's our next step.

Hard for me to predict, you know, how long it is before such a system would really help physicians in the clinic.

But we are already seeing, you know, in parts of the world, similar systems that are typically being used to answer low-risk questions, if you wish, and that are always supervised.

So there's a physician maybe, you know, double-checking the answer within a short amount of time, say 10 or 15 minutes from a model answering a patient.

But you see how it's really a step-by-step approach.

And I think that's, you know, very important again, because if you have a model that says the right thing, nine out of 10 times, but the 10th, it's doing something really bad.

Well, maybe it wasn't so good at all for, you know, any of those cases.

The benefits don't outweigh the cost.

Precisely.

That is interesting, though, because you're right that when you're talking to a real physician, I mean, they will tell you that patients don't come in with a sort of very clear, refined list of relevant information for you.

I mean, they'll come in and say, "I sort of, well, don't feel so great."

And it's like, it's your job then to interrogate that and actually find out precisely what's at the heart of it, which bits of information are relevant, which bits aren't.

That must be a very difficult thing to mimic within an AI.

Indeed.

And also, you know, I like to often think of the AI of tomorrow as something that will be like your, you know, the physician in your family, you know, there's often this physician in your family that gets asked all questions about anything that has to do with healthcare of all family members, you know, all the cousins and everyone, even if they are dermatologists, they will be asked, you know, cardiology questions, then they will help people navigate the system, etc.

But not all families have physicians.

And so I like to think that, you know, the AI we'll build tomorrow will provide everyone with the equivalent of a physician in your family.

But the key element is that that member of your family actually knows you and they know you on the long term.

So they know if you're very anxious.

And you know, when you're saying that you've had a bad headache for the last three weeks, you've said that for the last 20 years and they can safely discard it.

Or, you know, if you're calling them but you've never actually called them before, they know that it's something they should pay attention to. They will recognize the tone of your voice, etc.

Which I think makes this, you know, I give the example of the dermatologist that ends up, you know, having to say something about everything, but pretty safe, right?

In the end, it's actually not going to provide bad advice to someone that has a cardiac condition because they know the person and they will tell them to actually seek cardiologist advice at the right time.

So we would want our AI systems to do the same, right?

But for that, it means they need to know you enough.

And it's not just, you know, very quickly saying your latest symptoms and getting an answer.

It's, you know, those systems have to be built.

I guess one of the differences of a physician in your family is that they're not really prone to hallucinations.

Some of them might, but you're right.

But how do you create something like that when we are still in a world where generative AI does have these hallucinations?

Always forgetful about timelines or, you know, mistaking information, all of the common mistakes that you see in generative AI.

Yeah, that's why you cannot just take a large model off the shelf and apply it in healthcare and expect that to be acceptable.

Right.

And I think we've done a lot of work, first with Med-PaLM.

That was our first model fine-tuned on a medical corpus.

We demonstrated that it was doing better at answering medical licensing exams than, initially, an average student and then a panel of experts.

And then we did Med-Gemini.

What's Med-Gemini?

Oh, Med-Gemini is our large language model Gemini, which I know you've played with endlessly, and we fine-tuned it on a medical corpus.

And so it's really inheriting the long context of Gemini, its reasoning capabilities, its native multimodality.

But on top of that, it has seen medical data in a way that Gemini hadn't.

And it's also being evaluated for specific medical purposes.

There is this other aspect of the human-AI collaboration, though. I mean, some of these tools are designed for doctors.

Is there a risk that doctors, I guess, lose some of their skills or start to become overly reliant on these kinds of models?

You know, there's always that risk, right?

In healthcare, I would say though, that for me, we don't really have a choice in the sense that we have a big shortage of healthcare professionals in many parts of the world.

So the question is, how do we address that?

Right?

How do we make sure that people globally can be as healthy as possible and don't suffer from things for which we have answers, right?

There are a lot of things for which we still don't have answers, but for a lot of others where medicine can actually provide an answer, I think we have a responsibility to leverage the technology we have to bring that to those people.

Rather than over-reliance, for me, the question is how are we going to train the next generation, right?

And that is actually true of doctors within our conversation.

It can be true of other professions within the context of the new era we're in with Gen AI.

But for doctors, there might be things where actually the AI is really doing a really good job.

And when it's not, it's also telling you that it doesn't know, etc.

So you can rely on the AI and it's okay to not be as good at those tasks that the AI is really good at because there are so many other things that you need to learn and you need to do well.

So there are some things then that you think the AI just isn't going to touch.

I mean, things like empathy, for instance.

Well, actually, that one is a good one because we always check that our models have good bedside manners.

And they actually do.

They're not bad at all.

They can adapt to their audience in a way that few human beings can.

They can leverage the right language for a five-year-old or a 70-year-old or a 40-year-old.

And if you're not a native speaker, they can speak in your own language.

So they're actually, when they're judged on empathy, they actually do pretty well.

How on earth do you train it to be more empathic?

Well, in the same way you train them for everything else, right?

Like they are based on a lot of conversations that they've seen in the training data.

And if those conversations are empathic, then they learn to also leverage the same words, the same tone, et cetera.

And you do need to fine tune them or provide them with more specific examples because not everything on the web is empathic.

Do you think, though, that there are some aspects of medical care that really AI can't touch?

I've built surgical robots.

So even the haptics and all of that, I tend to think that at some point we will have ways.

But today, whenever you need a physical exam, right?

Like if you're checking someone's stomach, right?

A lot of things you absolutely need physicians.

I paused because I think every single time we say the technology will never do this, we get it wrong.

And at some point things change in a way that actually they can also be helpful in those circumstances.

It's coming as something that is augmenting physicians, right?

It's not replacing them in all of the tasks they do daily.

I see AI much more as the one paying the debt for what the digital industry has done to physicians, which is asking them to enter lots and lots of data in computers for years, changing their job fairly drastically, making them miserable in some cases, right?

I do think that we see a lot of burnout in the profession.

And I think I'm hoping AI contributes to bringing back the joy of practice in medicine.

But to me, it's paying that debt, right?

It's like, finally, with all of the data that physicians and all other healthcare professionals have taken so much time to enter, we are going to be able to derive insights or knowledge and help them.

Here's why it was all worth it, I guess.

Yes.

But that is interesting listening to you talk about it though, because I mean, so many of the conversations that I get to have on this podcast are about, I don't know, like AGI and sort of long term sort of these holistic systems and kind of advances towards completely doing things in a different way than we always have.

But the way that you're describing your vision of this, it's almost like the AI here is a tool that will change every aspect of medicine.

But fundamentally, the idea of a patient-doctor relationship is unchanged.

You're right.

I think I might be a little more cautious than, you know, what we're envisioning in other aspects of society.

And it also depends, again, where in the world we are envisioning that impact of technology, because I think that technology will bring healthcare to parts of the world that probably didn't have access to healthcare at all, or only in a very limited way, in ways that we're not yet envisioning.

And that might be, you know, beyond anything we've described today.

But I'm also always a bit skeptical of healthcare being entirely revolutionized overnight.

And, you know, that it will be completely different tomorrow from how it looked yesterday.

That being said, I'm also absolutely blown away with models like AMIE, right?

Like if you think about it, you can now have a conversation with a large language model that is very similar to the conversation you would have with an incredibly well-educated specialist pooling knowledge from all around the world, meaning, you know, this very small clinical trial that happened in the '50s in some remote place, echoing some new things that have, you know, that are starting to be seen somewhere else.

Up until now, serendipity sometimes played a very big role if you had a rare condition as to whether, you know, you would be treated or not.

And now with our large language models, I think we're completely changing that serendipity game.

I mean, it's always nice to finish a podcast on a very positive and optimistic note, but I think there's a lot to look forward to.

Absolutely.

Thank you.

I think medicine is a special case because between the false positives and the false negatives and crucially the potential harms of either, this is something that has a very high bar to get right.

And I think there is something reassuring about Joelle's no-nonsense determination to take the scientific approach here, to carefully and cautiously evaluate the benefit of using AI, to make sure that we end up with the best healthcare outcomes for as many people as possible.

You've been listening to Google DeepMind: The Podcast with me, Professor Hannah Fry.

If you enjoyed that episode, then do subscribe to our YouTube channel.

You can also find us on your favorite podcast platform, and we have plenty more episodes on a whole range of topics to come.

So do check those out too.

See you next time.

[MUSIC PLAYING]