[News] Google’s medical AI was super accurate in a lab. Real life was a different story.
The Impact of AI on Medical Imaging: A Double-Edged Sword
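Hi there! Today we're looking at a news story from MIT Technology Review: Google's medical AI was super accurate in a lab, but real life was a different story. The story is that Google built an AI to detect diabetic retinopathy. If you're diabetic and your glucose or insulin isn't properly managed, your blood vessels get damaged, and small blood vessels, like the ones in the eye, are the first to go. That can lead to a disease called retinopathy, which affects the retina at the back of the eye, and it can make you go blind if it isn't discovered soon enough. An eye doctor can look at a photograph of the retina and determine whether you have it or not. Google built an AI to spot the same thing from such a photograph, they tried to deploy it, and the story is about how this, basically, failed.

They had the opportunity to deploy the system in Thailand. Thailand's Ministry of Health had set an annual goal to screen 60% of people with diabetes for diabetic retinopathy, which can cause blindness if not caught early. Here is where AI comes in, because for 4.5 million patients with diabetes there are only about 200 specialists who can determine from a photograph whether you have the disease, so clinics are struggling to meet the target. The article says the AI developed by Google can identify signs of diabetic retinopathy from an eye scan with more than 90% accuracy, which the team calls "human specialist level", and gives a result in less than 10 minutes.

This is pretty cool: you send the system an eye scan, and it tells you whether or not you have the disease. But then the problems mount. Over several months, researchers observed nurses conducting eye scans and interviewed them about their experience with the new system. The nurses who conduct the scans aren't specialists themselves; they would otherwise send the scans off to a specialist, but now they were supposed to handle this step. When it worked well, the AI did speed things up, but it sometimes failed to give a result at all. The AI had been trained on high-quality scans, and of course, if you train an AI system you want the highest-quality data you can get; in practice, though, you're not going to get high-quality data. The system was designed to reject images that fell below a certain quality threshold, and with photos often taken in poor lighting conditions in the real world, more than a fifth of the images were rejected.

Here is my take on it: if you build something for the real world, you need to take into account what the real world holds in store for you, which means you're probably going to face poor lighting conditions if you build an image recognition system. I'm not saying, as some people do, that whenever you work on AI you must consider every downstream impact; it's perfectly fine to work on a dataset of high-quality images if you're inventing a new architecture or a new optimization algorithm. But if you're thinking of deploying something in the real world, you have to take this into account.

I also think the system was particularly poorly designed for the task, and here is why: Google is mainly worried about legal culpability. You have a classifier that outputs how sure it is of the positive class and how sure it is of the negative class. If there is a big difference between the two, the system goes with the more likely class. But if the two are close together, Google doesn't trust its own AI: if the system decided anyway, say it still went with the negative class, and that decision went back to the patient and turned out to be a mistake, the system would automatically be responsible for that mistake. And since the AI is not a human, these could be rather trivial mistakes that a human would have spotted. So basically, since it's deep learning, we don't really trust it, and because Google doesn't want the legal culpability of being responsible, they simply reject these cases. They just say: we don't deal with it; we only deal with cases with a large discrepancy.

If I were to build something like this, optimally you would just output the distribution. In this case the system could say: look, I'm at 60/40, I'm not sure, I lean towards negative. And then the nurse, who also has some expertise and may have learned when the system fails or tends to be unsure, could integrate that information. But this only works under one condition, and maybe that's a recommendation for lawmakers.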
It only works if you don't make the AI system completely culpable for its mistakes. The system can output its estimate, and along with that it can also output an estimate of its own uncertainty. It could even give you confidence bounds; these are not going to be statistically valid confidence bounds, because it's deep learning, but still: please give the humans all the available information the system has, and let them work with the system, rather than trying to fully replace them by simply outputting yes/no or rejecting the scan.
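To make this concrete, here is a minimal sketch of what "output the distribution plus an uncertainty estimate" could look like, using Monte Carlo dropout in PyTorch. This is my own illustration, not Google's actual system; the model, the two-class output, and the number of samples are all assumptions.

```python
import torch
import torch.nn.functional as F

def enable_mc_dropout(model: torch.nn.Module) -> None:
    # Keep the model in eval mode (batch norm etc. frozen) but switch
    # the dropout layers back on, so forward passes are stochastic.
    model.eval()
    for m in model.modules():
        if isinstance(m, torch.nn.Dropout):
            m.train()

def predict_with_uncertainty(model, image, n_samples=30):
    """Return mean class probabilities plus their spread over several
    stochastic forward passes. The spread is a heuristic uncertainty
    signal, not a statistically valid confidence bound."""
    enable_mc_dropout(model)
    with torch.no_grad():
        probs = torch.stack([
            F.softmax(model(image.unsqueeze(0)), dim=-1).squeeze(0)
            for _ in range(n_samples)
        ])
    return probs.mean(dim=0), probs.std(dim=0)

# Instead of rejecting a 60/40 case, show the nurse everything:
#   mean, std = predict_with_uncertainty(model, scan)
#   -> "negative: 60% +/- 8%, positive: 40% +/- 8%"
```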
The article describes, for example, how patients whose images were kicked out of the system were told they would have to visit a specialist at another clinic, on another day. This was obviously inconvenient, especially for those who found it hard to take time off work or did not have a car, and the nurses felt frustrated, especially when they believed the rejected scans showed no sign of disease and the follow-up appointments were unnecessary. This is exactly what I'm saying: the nurses often have very good experience of their own and could combine a system's output with their experience of when something is wrong and when it isn't.
Maybe you could even build in some explainability that focuses on the relevant part of the image, and that would alleviate a lot of these problems: the nurses sometimes wasted time retaking or editing an image that the AI had rejected. As built, this is AI working against humans rather than with humans.
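For illustration, here is a crude sketch of the kind of explainability I mean: plain input-gradient saliency in PyTorch, which highlights the pixels that most influence the model's score. In practice something like Grad-CAM would likely work better, but the idea is the same; none of this is part of the deployed system.

```python
import torch

def saliency_map(model, image, target_class):
    """Which pixels most influence the score for target_class?
    Overlaid on the scan, this shows the nurse where the model is
    looking, or which region caused a rejection."""
    model.eval()
    x = image.unsqueeze(0).clone().requires_grad_(True)
    model(x)[0, target_class].backward()
    # Take the strongest gradient across color channels -> one heatmap.
    return x.grad.abs().squeeze(0).max(dim=0).values
```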
Further, the article says that because the system had to upload images to the cloud for processing, poor internet connections in several clinics also caused delays. Patients like instant results, but the internet is slow, and the patients then complain: they've been waiting since 6:00 a.m., and for the first two hours the clinic could only screen ten patients. Yes, this is the type of thing you have to take into account. So maybe actually put the GPU server into the clinic; it's better anyway for data privacy reasons. But of course the large companies want everything to be uploaded to their machines, because it's more convenient for them. The article says Google is now working with medical staff to design new workflows.
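As a sketch of the "GPU server in the clinic" idea: a minimal local inference endpoint, so scans never leave the building and a flaky connection doesn't block screening. The model file name, the preprocessing, and the two-class output are assumptions made up for illustration.

```python
# Minimal in-clinic inference server (sketch). Run locally with:
#   uvicorn server:app --host 0.0.0.0 --port 8000
import io

import torch
import torchvision.transforms as T
from fastapi import FastAPI, File, UploadFile
from PIL import Image

app = FastAPI()
model = torch.jit.load("retinopathy_model.pt").eval()  # hypothetical local model file
preprocess = T.Compose([T.Resize((512, 512)), T.ToTensor()])

@app.post("/screen")
async def screen(scan: UploadFile = File(...)):
    img = Image.open(io.BytesIO(await scan.read())).convert("RGB")
    with torch.no_grad():
        probs = torch.softmax(model(preprocess(img).unsqueeze(0)), dim=-1)[0]
    # Report the full distribution and let the nurse decide what to do.
    return {"negative": round(float(probs[0]), 3),
            "positive": round(float(probs[1]), 3)}
```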
I mean, sometimes you do rely on an internet connection, so I don't want to be too harsh here. There are also some critics. Michael Abramoff, an eye doctor and computer scientist at the University of Iowa Hospitals and Clinics, has been developing an AI for diagnosing retinal disease for several years and is the CEO of a spin-off, and he basically says that there is much more to healthcare than algorithms.
And of course we can all see that. He also questions the usefulness of comparing AI tools with human specialists when it comes to accuracy: of course we don't want an AI to make a bad call, but human doctors disagree all the time, and he says that's fine. An AI system needs to fit into a process where sources of uncertainty are discussed rather than simply rejected. This feeds exactly into what I've been saying: if the AI were to output its sources of uncertainty and everything it thinks about a particular case, then the humans could discuss it.
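To illustrate "discussed rather than simply rejected", here is a hypothetical triage policy that routes every case by the model's output instead of dropping the uncertain ones. The thresholds are invented for illustration, not taken from any real deployment.

```python
def triage(p_positive: float, spread: float) -> str:
    """Route a case by model confidence instead of a hard accept/reject.
    All thresholds here are made up for illustration."""
    if spread > 0.15:
        return "uncertain: nurse reviews scan plus heatmap, maybe rescans"
    if p_positive > 0.90:
        return "likely disease: refer to specialist"
    if p_positive < 0.10:
        return "likely healthy: routine follow-up"
    return "borderline: nurse and specialist discuss the case"
```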
We could then get to a better outcome, but this only works if the legal framework allows it. I get the point of regulation too: you want to be able to assign blame when something goes wrong. But you have to know that this is often what holds these systems back. Finally, the article says the benefits could be huge: there was one nurse who screened 1,000 patients on her own (over what time frame, I'm not sure; I guess over the course of the study), and with this tool she's unstoppable. The patients didn't really care that it was an AI rather than a human reading their images; they cared more about what their experience was going to be.
And that's a general experience I get from a lot of people working on human-machine interaction: people aren't particularly attached to having a human in the loop, as long as the machine appears competent. I think we've gotten used to AI being quite good at particular tasks, and we're actually happy to outsource some of them. But again, if you build something for the real world, you have to take the real-world conditions into account.
This feeds into papers like ImageNet v2, where you suddenly face a harder test set, and into topics like domain shift, transfer learning, and domain adaptation, which are all active research topics. Problems like this can give rise to entirely new directions of research, so if you're looking for a PhD topic, maybe this is something for you.
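As a pointer in that direction, here is a minimal transfer-learning sketch: take a model pretrained on clean data and fine-tune it on a small set of locally collected, poorly lit scans. `clinic_loader` is a hypothetical DataLoader, and the whole snippet is an illustration of the idea, not a validated medical pipeline.

```python
import torch
import torchvision

# Start from a pretrained backbone and adapt it to the clinic domain.
model = torchvision.models.resnet50(weights="IMAGENET1K_V2")
model.fc = torch.nn.Linear(model.fc.in_features, 2)  # healthy vs. retinopathy

optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)  # small LR: adapt, don't overwrite
loss_fn = torch.nn.CrossEntropyLoss()

model.train()
for images, labels in clinic_loader:  # hypothetical loader of real-world scans
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```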
All right, thanks for watching. This was my blab about the story; these news segments are a new thing I'm doing, and I hope you enjoyed it. If you liked it, subscribe; if you didn't, leave a comment. Bye bye!
"WEBVTTKind: captionsLanguage: enhi there today we're looking at this news story from MIT technology review Google's medical AI was super-accurate in a lab real life was a different story so the story here is that Google had this AI to detect diabetic retinopathy so if you're a diabetic and you your glucose isn't or your insulin isn't properly handled that means you get damage to your blood vessels and the small blood vessels like the ones in the eyes here they're the first ones to get damaged and that can lead you to get this disease called retinopathy which are is in the retina in the back of the eye and that can lead you to go blind if it's not discovered soon enough so a eye doctor can look at an photograph like this and can determine whether you have it or not I guess they would look at like a larger resolution of it but in any case they could determine from this so Google built an AI that could maybe spot things here that may be spot if you had this or not and they tried to deploy this and the story is about how this failed basically so they said they had the this in Thailand they had the opportunity to deploy this so Thailand's Ministry of Health had set an annual goal to screen 60% of the people with diabetes for this diabetic retinopathy it can cause blindness if not caught early so here is where I comes in because to four point five million patients that have diabetes there are only 200 experts that can determine from a photograph whether or not you do have that disease so they say clinics are struggling to meet the target and Google has built a AI says the am i developed by Google have can identify signs of diabetic retinopathy from an eye scan with more than 90% accuracy which the team calls human specialist level and gives results in less than 10 minutes all right so this is pretty cool right they've developed an AI you can send an eye scan and it'll say what what you whether or not you have this disease but then the problems mount they so they followed over several months they observe nurses conducting eye scans and interviewed them about their expertise using the new system so the nurses who conduct the eye scans they would try to use the AI and they they the nurses themselves aren't specialists they would otherwise send the scans to a specialist but now they are supposed to handle this up when it worked well the AI did speed things up so but sometimes failed to give a result at all so these AI had been trained on high quality scans right if of course if you want to train an AI system you want the highest quality data you can get but also in practice you're not gonna get high quality data it was designed to reject images that fell below a certain threshold of quality and they say often often taking the photos and poor lighting conditions in the real world more than a fifth of the images were rejected so this is my take on it if you build something for the real world you need to take into account what the real world holds in store for you which means that you probably are going to have poor lighting conditions if you build an image recognition system right now I'm not saying that like some people are saying whenever you work with AI you should consider how it impacts later on and so on no it's perfectly fine to work on a data set of high quality images if you do something like invent a new architecture or whatnot work on optimization algorithms well like not nothing of that but it is if you are thinking of deploying something in the real world you need to take this into 
account now I also think this was particularly poorly designed for the task and here is why Google here is mainly worried about legal culpability because the thing says it was designed to reject images that fell below a certain threshold of quality right the reason for this is that here you have a classifier right and either it says it says okay here is positive and negative class I am about this much sure of the positives clause in this much of the negative class and there's quite a big of a difference here right so I'm gonna go with the negative class but if those two things are somewhat closer together the Google doesn't trust its own AI it's like yeah and if it did some decision here if it's as well I still go go with the negative class right this goes back to the patient and they made a mistake then this thing here is automatically responsible for that mistake and since the AI is not a human these mistakes here could be rather trivial mistakes that a human would have spotted so basically since it's deep learning we don't really trust it and then because Google doesn't want the legal culpability of being responsible they simply reject these cases they just say we don't deal with it we just deal with things with a large discrepancy if you actually want to design something for the real world you need to take into account ok there's poor lighting conditions and I would think in if I were to build something like this optimally you would just output this thing you would output this distribution you would in this case you could say look I am 60 40 % I'm not sure I lean towards negative but I don't think so and then the nurse who ought maybe also has some expertise could be experienced in when the system fails or when it tends to be not sure and could kind of integrate that information but this only works so if you are a that's maybe a recommendations for lawgivers this only works if you don't make the AI system completely culpable for it's mistakes it can output its estimation and it can along of that it can actually also output an estimation of its own uncertainty you can like give you some confidence bounds here that these are not going to be statistical true confidence bounds because it's deep learning but still I would say please give all the available information that the system has and then let the humans work with the system rather than trying to fully replace the humans by simply saying yes/no or reject all right so they say patients whose images were kicked out of the system were told they could have a visit they would have to visit a specialist at another clinic on another day if they found it hard to take time off work or did not have a car this was obviously inconvenient which I can understand nurses felt frustrated especially when they believe the rejected scans showed no sign of disease and the follow-up appointments were unnecessary this is exactly what I'm saying right the nurses often also have very good experience and can combined could combine something like this with their own experience of when something is wrong and when something isn't wrong and maybe you even building some explain ability to focus on part of the image and then you could alleviate a lot of these problems they sometimes waste the time time to retake or edit an image that the AI had rejected right this this is just now you're just build AI working against humans rather than with humans so further it says because the system had to upload images to the cloud for processing poor internet connection in 
several clinics also caused delays so patients like the instant results but the internet is slow and the patient's then complain they've been waiting here since 6:00 a.m. and for the first two hours could only we could only screen 10 patients yes this is the type of stuff you have to take into account so maybe actually put the GPU server into the clinic it's better anyway for for data privacy reasons but of course the large companies they want to everything to be uploaded to their machines it's more convenient for them so they say there is now working with medical staff to design a new workflows I mean sometimes you do rely on an internet connection so I don't want to be too too harsh here so the the other there are some critics here so Michael Abramov an eye doctor and computer scientist at the University of Iowa Hospitals and Clinics has been developing an AI for diagnosing retinal disease for several years and is a CEO of a spin-off here and he basically says there is much more to health care than algorithms and I mean of course we can we can all we can all see that yeah it basically says that the questions the usefulness of comparing AI tools with human specialists when it comes to accuracy of course we don't want Nia to make a bad call but human doctors disagree all the time he says that's fine my assistant needs to fit into a process where sources of uncertainty are discussed rather than simply rejected and this exact this exactly feeds into what I've been saying if the error just to output the source of uncertainty and all it thinks about a particular situation then the humans could discuss it right and then we could get to a better outcome but this only works if the legal framework is given if you regulate and I get I get that point too you want to assign kind of blame when something goes wrong but you just have to know that this is what keeps these systems back often finally they say the benefits could be huge there was one nurse that screened 1,000 patients on her own what time that is I guess that's over the course of the study or so and with this tool she's unstoppable the patients didn't really care that it was an AI rather than a human reading their images they cared more about what their experience was going to be and that's a general extent general experience that I get from a lot of people working with human machine interactions is that the people don't they're not so super excited that it's a human if they if the machine appears competent I think we've gotten used to AI being quite good at particular tasks and we're actually happy to outsource some of these to them but again if you build something for the real world you have to take into account the real-world conditions and this feeds into papers like image net v2 where you all of a sudden have a harder test set it feeds into topics like domain shift transfer learning domain adaptation and these are all research topics so I think problems like this can give rise to entirely new directions of research if you're looking for the PhD topic maybe this is something for you all right thanks for watching this this was my blabs about the story I hope you enjoyed this in these kind of new sections it's a new thing I'm doing if you liked it subscribe if you didn't like it leave a comment and bye byehi there today we're looking at this news story from MIT technology review Google's medical AI was super-accurate in a lab real life was a different story so the story here is that Google had this AI to detect diabetic retinopathy so if 
you're a diabetic and you your glucose isn't or your insulin isn't properly handled that means you get damage to your blood vessels and the small blood vessels like the ones in the eyes here they're the first ones to get damaged and that can lead you to get this disease called retinopathy which are is in the retina in the back of the eye and that can lead you to go blind if it's not discovered soon enough so a eye doctor can look at an photograph like this and can determine whether you have it or not I guess they would look at like a larger resolution of it but in any case they could determine from this so Google built an AI that could maybe spot things here that may be spot if you had this or not and they tried to deploy this and the story is about how this failed basically so they said they had the this in Thailand they had the opportunity to deploy this so Thailand's Ministry of Health had set an annual goal to screen 60% of the people with diabetes for this diabetic retinopathy it can cause blindness if not caught early so here is where I comes in because to four point five million patients that have diabetes there are only 200 experts that can determine from a photograph whether or not you do have that disease so they say clinics are struggling to meet the target and Google has built a AI says the am i developed by Google have can identify signs of diabetic retinopathy from an eye scan with more than 90% accuracy which the team calls human specialist level and gives results in less than 10 minutes all right so this is pretty cool right they've developed an AI you can send an eye scan and it'll say what what you whether or not you have this disease but then the problems mount they so they followed over several months they observe nurses conducting eye scans and interviewed them about their expertise using the new system so the nurses who conduct the eye scans they would try to use the AI and they they the nurses themselves aren't specialists they would otherwise send the scans to a specialist but now they are supposed to handle this up when it worked well the AI did speed things up so but sometimes failed to give a result at all so these AI had been trained on high quality scans right if of course if you want to train an AI system you want the highest quality data you can get but also in practice you're not gonna get high quality data it was designed to reject images that fell below a certain threshold of quality and they say often often taking the photos and poor lighting conditions in the real world more than a fifth of the images were rejected so this is my take on it if you build something for the real world you need to take into account what the real world holds in store for you which means that you probably are going to have poor lighting conditions if you build an image recognition system right now I'm not saying that like some people are saying whenever you work with AI you should consider how it impacts later on and so on no it's perfectly fine to work on a data set of high quality images if you do something like invent a new architecture or whatnot work on optimization algorithms well like not nothing of that but it is if you are thinking of deploying something in the real world you need to take this into account now I also think this was particularly poorly designed for the task and here is why Google here is mainly worried about legal culpability because the thing says it was designed to reject images that fell below a certain threshold of quality right the reason for this 
is that here you have a classifier right and either it says it says okay here is positive and negative class I am about this much sure of the positives clause in this much of the negative class and there's quite a big of a difference here right so I'm gonna go with the negative class but if those two things are somewhat closer together the Google doesn't trust its own AI it's like yeah and if it did some decision here if it's as well I still go go with the negative class right this goes back to the patient and they made a mistake then this thing here is automatically responsible for that mistake and since the AI is not a human these mistakes here could be rather trivial mistakes that a human would have spotted so basically since it's deep learning we don't really trust it and then because Google doesn't want the legal culpability of being responsible they simply reject these cases they just say we don't deal with it we just deal with things with a large discrepancy if you actually want to design something for the real world you need to take into account ok there's poor lighting conditions and I would think in if I were to build something like this optimally you would just output this thing you would output this distribution you would in this case you could say look I am 60 40 % I'm not sure I lean towards negative but I don't think so and then the nurse who ought maybe also has some expertise could be experienced in when the system fails or when it tends to be not sure and could kind of integrate that information but this only works so if you are a that's maybe a recommendations for lawgivers this only works if you don't make the AI system completely culpable for it's mistakes it can output its estimation and it can along of that it can actually also output an estimation of its own uncertainty you can like give you some confidence bounds here that these are not going to be statistical true confidence bounds because it's deep learning but still I would say please give all the available information that the system has and then let the humans work with the system rather than trying to fully replace the humans by simply saying yes/no or reject all right so they say patients whose images were kicked out of the system were told they could have a visit they would have to visit a specialist at another clinic on another day if they found it hard to take time off work or did not have a car this was obviously inconvenient which I can understand nurses felt frustrated especially when they believe the rejected scans showed no sign of disease and the follow-up appointments were unnecessary this is exactly what I'm saying right the nurses often also have very good experience and can combined could combine something like this with their own experience of when something is wrong and when something isn't wrong and maybe you even building some explain ability to focus on part of the image and then you could alleviate a lot of these problems they sometimes waste the time time to retake or edit an image that the AI had rejected right this this is just now you're just build AI working against humans rather than with humans so further it says because the system had to upload images to the cloud for processing poor internet connection in several clinics also caused delays so patients like the instant results but the internet is slow and the patient's then complain they've been waiting here since 6:00 a.m. 
and for the first two hours could only we could only screen 10 patients yes this is the type of stuff you have to take into account so maybe actually put the GPU server into the clinic it's better anyway for for data privacy reasons but of course the large companies they want to everything to be uploaded to their machines it's more convenient for them so they say there is now working with medical staff to design a new workflows I mean sometimes you do rely on an internet connection so I don't want to be too too harsh here so the the other there are some critics here so Michael Abramov an eye doctor and computer scientist at the University of Iowa Hospitals and Clinics has been developing an AI for diagnosing retinal disease for several years and is a CEO of a spin-off here and he basically says there is much more to health care than algorithms and I mean of course we can we can all we can all see that yeah it basically says that the questions the usefulness of comparing AI tools with human specialists when it comes to accuracy of course we don't want Nia to make a bad call but human doctors disagree all the time he says that's fine my assistant needs to fit into a process where sources of uncertainty are discussed rather than simply rejected and this exact this exactly feeds into what I've been saying if the error just to output the source of uncertainty and all it thinks about a particular situation then the humans could discuss it right and then we could get to a better outcome but this only works if the legal framework is given if you regulate and I get I get that point too you want to assign kind of blame when something goes wrong but you just have to know that this is what keeps these systems back often finally they say the benefits could be huge there was one nurse that screened 1,000 patients on her own what time that is I guess that's over the course of the study or so and with this tool she's unstoppable the patients didn't really care that it was an AI rather than a human reading their images they cared more about what their experience was going to be and that's a general extent general experience that I get from a lot of people working with human machine interactions is that the people don't they're not so super excited that it's a human if they if the machine appears competent I think we've gotten used to AI being quite good at particular tasks and we're actually happy to outsource some of these to them but again if you build something for the real world you have to take into account the real-world conditions and this feeds into papers like image net v2 where you all of a sudden have a harder test set it feeds into topics like domain shift transfer learning domain adaptation and these are all research topics so I think problems like this can give rise to entirely new directions of research if you're looking for the PhD topic maybe this is something for you all right thanks for watching this this was my blabs about the story I hope you enjoyed this in these kind of new sections it's a new thing I'm doing if you liked it subscribe if you didn't like it leave a comment and bye bye\n"