The Impact of AI on Education: A Resilient Human Teacher?
As we sit down to discuss the impact of artificial intelligence (AI) on education, it's clear that this is a space where people are very heavily invested. Education has been around for thousands of years and is a fundamental structure in our society, which makes it a challenging space to navigate.
The rapid arrival of AI in the education sector has forced teachers around the world to abruptly rethink how they engage with and assess students, and it has sparked concerns about the potential for cheating and dependence on technology. Yet despite these fears, there has been something remarkably resilient about the idea of a human teacher: a concept that has, at its heart, remained immune to the ebbs and flows of technology. We've had classrooms for almost as long as we've had civilizations, and it's undeniable that there are opportunities here too.
Imagine a classroom where each lesson is tailored to your individual pace of learning, where an AI tutor is available around the clock, and where technology can predict where you're likely to get stuck before you do. Researchers at Google DeepMind have been grappling with both the opportunities and challenges of AI in education, and recently published a major paper on developing AI responsibly in this area.
One of the lead authors of that paper is Dr. Irina Jurenka, a research lead at Google DeepMind. Her background spans experimental psychology and computational neuroscience, and she has spent a decade within these walls asking questions like: how do humans learn?

HANNAH FRY: Welcome to the podcast, Irina. This is a space where people are very heavily invested. Does that make it quite a difficult space to navigate?
IRINA JURENKA: It does, because if you think about it, education has been around for thousands of years, and it is a fundamental structure in our society. Every child is supposed to get educated. The educational systems have been around for a while; they're quite rigid, and they're very established. To come in and say, "Look, we have this amazing technology, and we're going to revolutionize everything and change everything," I think that's not going to work so easily.
We've seen this happen with technologies of the past. Intelligent tutoring systems, for example, have existed for 50-plus years. A lot of investment and research has gone into them, but you could argue that the promise of that technology hasn't fully materialized. Or more recently, we had MOOCs, these massive open online courses, and again there was so much excitement about how we wouldn't need traditional education anymore; you could just go online and learn anything you would ever want to learn.
And once again, when you actually look at who uses these systems, it's people who have already gone through traditional education. And typically, it's people trying to get their second Master's. It's definitely not the thing that came and broke the system. I guess maybe we shouldn't be trying to break the system. There is a lot of amazing stuff happening in traditional education.
It's not just about taking the knowledge from the teacher and distilling or drip-feeding it into the student. It's about the social aspects of talking to your peers and learning together. It's about teachers imparting skills: how to be a global citizen, how to think critically, how to evaluate information. There is so much more to educational systems than just the knowledge they pass on.
In our team, we're thinking about the new technology in terms of how it can work within the current system and how it can add to it.

HANNAH FRY: So this isn't starting with a brand-new blank sheet of paper and saying, design an education system from scratch. It's more like augmenting the one that exists.
IRINA JURENKA: Yeah. Actually, Justin Reich, a researcher at MIT, has a really nice quote. He says that "new technology doesn't break educational systems; educational systems kind of tame new technology."

HANNAH FRY: Which is what happened with MOOCs, as you said.

IRINA JURENKA: Exactly, yes. And as I said, we're also seeing that there are these human aspects of teacher-student interactions that we can't possibly ever change with technology. For example, if you think about a student and a tutor, there are social rules in place where a student is very unlikely to just stand up and walk away from a human tutor. But if you are interacting with an AI tutor, you can just close the window, and that's it. So there are certain challenges that come with bringing technology in, and there are certain things that human-to-human interactions have that technology will never replace. This is why we're trying to work within the system to begin with.

HANNAH FRY: How disruptive do you expect it will be to education, then? Because on the one hand, there has been quite a lot of disruption already, particularly recently with large language models. When I spoke to [...], he was talking about overestimating the impact of something in the short term and then underestimating how big the longer-term impact will be. Where do you think education fits in with this?

IRINA JURENKA: There is so much buzz about GenAI in education right now. Everyone actually expects it to completely change everything immediately. And there have been so many different attempts that have sprung up around taking a language model and turning it into a tutor, or a homework helper, or anything else that helps students. And honestly, so far, nothing has really made the impact that I think everyone was expecting. So that's why, from our perspective, we are at the center of actually improving this technology, and we have unprecedented access to Gemini. We can influence how things change. And in fact, one of our goals is to make Gemini the best large language model for education.

HANNAH FRY: So what is the ambition here? Is it to build a universal AI tutor?

IRINA JURENKA: It is, but we also wanted it to power different experiences. The very first place where we deployed our AI tutor was YouTube. On learning videos, there is now a new function: if you're watching a learning video and you don't quite understand something, you can virtually raise your hand, an AI tutor will pop up, and you can ask it all your learning questions. More recently, we also launched a Gemini Gem called the learning coach, which is basically optimized to be your guide through learning experiences. Here, you can come in with any question. Let's say, "I want to learn about photosynthesis," or "Tell me about the American Civil War." It will give you a plan, try to understand what you do and don't know, and then guide you through the materials. So what we're hoping to do is really push the research to make these base models as good as possible for education and then figure out how to actually make the best use of them. And in fact, we hope that the community can help us with that, so that it's not just us dictating what an AI tutor should be like. It's us listening to people who have been in the space for much longer than us, trying to help them make the most of the technology and make the technology the best it can be for them.

HANNAH FRY: How far do you think the technology can go, though? Can you paint me a picture of what you, in a very optimistic scenario, would like the future to look like?
IRINA JURENKA: A lot of people talk about AI-first schooling. I think there is even a school in the UK that just switched to mostly AI-based education, with only a few teachers on hand to help around. And I just don't think that that future is something we should be striving for. We really don't want to replace human teachers. We want to give them a tool that enhances the in-person classroom experience between teachers and students. I think it's a little bit sad if students come to school and just sit around looking at screens all day. So the way we are thinking about it is that there are still teachers as mentors, as role models to the students, and there is a lot of peer interaction during learning. But there is this AI system that works with teachers and learners and helps them make the best of the situation. Maybe for each learner, the AI tutor can help them move at their own pace and really target their interests. And at the same time, the teacher gets a view of where everyone is, and they can still steer this tutor. So they still have control, and they can still bring their own personality and teaching style to the lessons, because I think this connection between teachers and students is so important. Looking back on my own education, what stands out to me is the amazing teachers who made me excited about a certain subject. So I think what technology should be trying to do is make more of the interactions and memories like that, and maybe remove the less ideal situations, where the teacher and the student don't click, or the teacher is so overworked that they don't have time to spend with the particular student who actually needs them the most.

HANNAH FRY: I imagine that there'll be some people watching who don't necessarily know about your background. So can you tell us a little bit: what was your path to thinking about AI and education?

IRINA JURENKA: I mean, how far back shall I start?

HANNAH FRY: Day one. A brief history.

IRINA JURENKA: I've always been fascinated by intelligence, any kind of intelligence, human or artificial. I started coding quite early on in life. It was just a lucky coincidence that my brother and I got a comic book as children that was basically an introduction to programming.

HANNAH FRY: Amazing.

IRINA JURENKA: So we started writing small games around the age of probably 11 or 12. And I remember at some point during the summer, my brother and I were bored, and we discovered that you can actually get access to the source code of one of those shooter games, and you could actually code up your opponents. So, wow, this is exciting: we can actually create AI. I remember putting a diary entry like, "This summer we're going to solve AI." Of course, that didn't happen.

HANNAH FRY: Oh, the ambition of youth.

IRINA JURENKA: Yeah.

HANNAH FRY: Amazing.

IRINA JURENKA: Surprisingly, though, my brother went on to study computer science. But I was growing up in a traditional society, where somehow it just didn't click for me that the computer science degree and making games and playing around on computers with my brother were the same thing. To me, computer science was something kind of dry and more about the hardware, and I really did not enjoy that. So I ended up studying psychology as my degree. I was kind of wondering, how can I move towards AI and still study intelligence? Because I was fascinated: how does the brain do it? How does this incredible behavior and intelligence and reasoning all arise? And then I was very lucky that by the time I finished my PhD, I heard about DeepMind and how you can actually do neuroscience research and answer these deep fundamental questions with deep learning. It was the perfect job for me.
So I started off in the neuroscience team. And as I mentioned, this idea of intelligence and reasoning has always been at the back of my mind, because reasoning is kind of what makes us intelligent. So I started to work on improving reasoning in language models. Very early on, even before language models became this big thing, I realized that they were quite bad at reasoning. But I also realized that humans don't really use reasoning that much. If you think about it, in our daily lives we don't actually think through a lot of our actions; we're almost acting on autopilot. So to really study reasoning, we needed a domain where reasoning was important, and that's where education became a thing again, because this is where humans discover how to reason well.

HANNAH FRY: There's something so interesting in that, then: the motivation is in some ways trying to teach AI to be better at reasoning and, in the process, understand what it means to teach reasoning. That's quite a nice way around to look at it.

IRINA JURENKA: Yeah, and it's also interesting how doing something and teaching somebody else how to do it are not the same. This is basically the challenge we are now solving. The base Gemini is slowly improving at reasoning and math and coding and all of those basic skills. But then our job is to actually stop the model from using these skills to give away the answer and just do the job for the student, and instead to hold back and think about the right questions to ask the student so that they can figure it out by themselves. And that's very hard. Models are fine-tuned to be helpful, so their initial reaction is, "I'll just give you the answer." We have to do a lot of work to stop them from doing that.
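What that "holding back" might look like is easiest to see in code. Below is a minimal sketch, not DeepMind's actual system: a Socratic-style system prompt plus a crude guardrail that rejects model replies containing a known answer string verbatim. The `llm` callable is a hypothetical stand-in for whichever chat-model API you use.

```python
from typing import Callable

# Illustrative prompt only; real tutor prompting is far more elaborate.
SOCRATIC_PROMPT = (
    "You are a patient tutor. Never state the final answer. "
    "Ask one guiding question at a time, and confirm each step "
    "the student gets right before moving on."
)

def tutor_turn(
    llm: Callable[[str, list[dict]], str],  # (system, conversation) -> reply
    conversation: list[dict],
    answer_key: str,
    max_retries: int = 3,
) -> str:
    """Generate one tutoring turn, rejecting replies that leak the answer."""
    for _ in range(max_retries):
        reply = llm(SOCRATIC_PROMPT, conversation)
        # Crude guardrail: the known final answer must not appear verbatim.
        if answer_key.lower() not in reply.lower():
            return reply
    # Fall back to a generic prompt if the model keeps leaking the answer.
    return "Let's slow down. What would you try as a first step?"
```

A verbatim string match is obviously a weak check; a production system would rely on fine-tuning and semantic checks rather than retries. The point is only that pedagogy has to be enforced on top of a model whose default instinct is to be helpful.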
HANNAH FRY: I think you've really hit the nail on the head there: being able to do something is not the same as being able to teach it. And I'm really struck, in maths education, which is the space I know most about, by how there is this push and pull from different sectors about what is required of students and the best possible way to instill those skills and that knowledge. If you're building an AI which will have this universal appeal, how do you find the balance of making sure that you're hitting all of the notes that are required from all of the different areas?

IRINA JURENKA: That is a good question. When we first started building the tutor, we thought we could talk to teachers, and maybe academics in the field as well as learners, and figure out what is the perfect way to teach. And then--

HANNAH FRY: As though there is--

IRINA JURENKA: Exactly.

HANNAH FRY: --a sort of a best.

IRINA JURENKA: Yeah. You kind of assume that in everything there is this optimal strategy. Maybe this is the scientist in us. But we did that: we went and interviewed a lot of stakeholders, and what we realized is that there is a lot of disagreement. And actually, once we started deploying our early tutor models on different Google services, like YouTube or the Gemini app, we found that even there, there were different requirements. Let's say on YouTube, the educational video is the main act, so the tutor is really there to support it, and maybe the tutor should be giving away answers much more, because that's actually helpful for the learner on that surface. At the same time, if you talk to a teacher at school, they have very different requirements. They really don't want the tutor to give away answers, definitely not to the exam questions. And they will also want the tutor to follow particular exam board requirements, or the particular teaching style of that particular teacher. So how do you actually incorporate all of those diverse voices into a single tutor? What we've realized is that we need to build a base pedagogical model that you can steer with different instructions. One teacher can come and say, "Actually, I want my students to just have fun today: answer any question they have and really push on the fun experiences." And another teacher might be much more academic and say, "No, today we're doing exam practice problems. Just guide the student through these topics and make sure that they understand everything."
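One way to picture that steerable base model is a fixed pedagogical core with teacher-supplied settings layered on top as system instructions. The configuration fields below are illustrative assumptions, not the paper's actual schema.

```python
from dataclasses import dataclass

# A fixed pedagogical core shared by every deployment of the tutor.
BASE_PEDAGOGY = (
    "You are a tutor. Be encouraging, check understanding often, "
    "and adapt difficulty to the learner's responses."
)

@dataclass
class TeacherSettings:
    tone: str = "playful"         # e.g. "playful" vs. "exam-focused"
    reveal_answers: bool = False  # a YouTube helper might set this to True
    syllabus_notes: str = ""      # e.g. exam-board or lesson constraints

def build_system_instruction(cfg: TeacherSettings) -> str:
    """Compose the base pedagogy with one teacher's steering instructions."""
    parts = [BASE_PEDAGOGY, f"Adopt a {cfg.tone} tone."]
    if not cfg.reveal_answers:
        parts.append("Never reveal final answers; guide step by step.")
    if cfg.syllabus_notes:
        parts.append(f"Teacher notes to follow: {cfg.syllabus_notes}")
    return " ".join(parts)

# One teacher steers towards fun, another towards exam practice.
fun_day = build_system_instruction(TeacherSettings(tone="playful"))
exam_day = build_system_instruction(
    TeacherSettings(tone="exam-focused",
                    syllabus_notes="exam practice problems only today")
)
```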
HANNAH FRY: I guess one of the big things about education, as you said, is that there isn't this optimal approach to teaching. But there are these kind of imperfect measures. We sort of know good teaching when we see it, but it feels quite difficult to quantify. So how do you decide what counts as good pedagogy when you're navigating this space?

IRINA JURENKA: First, you might say, well, there's learning science, so why don't you just look at the papers, and they'll give you the answer? And yes, there is a lot of literature, but there is no consensus as such. Another thing is that pedagogy is very context dependent. What works for a novice learner might not work for an expert learner. What works for a subject that's more procedural, let's say math, where you actually learn the skill of a procedure for solving a problem, might not work for a more memory-based subject like history. So when you start thinking about the hundreds of different pedagogical strategies that have been studied, all of them working slightly differently in different contexts, suddenly you have this massive space where maybe there isn't one single point that's the best pedagogy, but many different regions that are the best pedagogies in a given context. The problem is, how do you even quantify this space, and then how do you search it for the perfect pedagogical strategy? It becomes similar to work that DeepMind has done before, like playing the game of Go. The reason that was such a huge challenge for AI was that the search space of possible moves was huge, and there isn't one known strategy. The AI has to search the space of possible moves and strategies and discover what it thinks is the best one. And what we found with the AlphaGo work was that AI was much better than humans, basically all of humanity playing the game of Go for thousands of years. AI was able to search the space and discover better strategies in a matter of days or months, compared to what humans could do. So our hope is that we can do something similar with education. But that brings us back to the question of how we actually know what success looks like. In Go, you can still measure who has won, and it's pretty unambiguous. Whereas in education, the Holy Grail is whether the students' learning outcomes have become better, and that is not something you can measure quickly. You need months, if not years, to really track the learner, and that's not really feasible. So a lot of our work is about saying: OK, we know what we're aiming for, but how can we approximate it in a way that's easier and faster to measure? We published a report recently, about 70 pages of basically our trial and error and different attempts at measuring pedagogy. It ranges from working with real students at Arizona State University, measuring at longer timescales of a couple of months; to asking pedagogical raters and teachers to look through a few examples of conversations between students and our AI tutor, which gives us quicker feedback on the order of weeks or days; to automatic measures, where we actually ask AI to evaluate AI, which gives us much more targeted, much more limited, but still useful feedback in a matter of hours.
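That fastest rung on the ladder, AI evaluating AI, is essentially an LLM-as-judge setup. A minimal sketch, with an illustrative rubric rather than the report's actual criteria, and a hypothetical `llm` callable:

```python
import json
from typing import Callable

# Illustrative rubric; the published report defines its own dimensions.
RUBRIC = (
    "Rate the tutor's turns in the transcript below from 1 to 5 on:\n"
    "- guidance: guides the student rather than giving away answers\n"
    "- accuracy: the tutor's statements are factually correct\n"
    "- encouragement: supportive, non-judgmental tone\n"
    'Reply with JSON only, e.g. {"guidance": 3, "accuracy": 5, '
    '"encouragement": 4}\n\nTranscript:\n'
)

def auto_rate(llm: Callable[[str], str], transcript: str) -> dict:
    """Score one tutor-student conversation with a rater model."""
    raw = llm(RUBRIC + transcript)
    return json.loads(raw)  # in practice: validate and retry on bad JSON
```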
HANNAH FRY: But I guess, if you really want to evaluate what good teaching is, you want to do that full randomized control trial where you're monitoring people over a period of time. How far away do you think we are from being able to run those?

IRINA JURENKA: Well, these are being run right now. Arizona State University is one example where we're actually running them. The problem is that even if you take your students and split them into those who have access to the AI tutor and those who don't, what we find is that in the group that theoretically has access to the tutor, only a small percentage actually engage with it. And that creates a problem, because why are some students engaging and others not? Is there something inherently different about these students? If we only see success in those who engage, is it because of the tutor, or is it because these learners were inherently more motivated, and hence would have done better anyway? And then the question is, who are we helping, and what effect does it have at a larger scale? If you think about the top students and the bottom students, and you're helping the top students do better but not actually helping the bottom students, you're actually increasing the gap. But I think everyone who goes into EdTech actually wants to decrease the gap. So how do we do that? How do we make sure that everyone engages? That's another big question that we're working on.
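The statistical trap Irina describes is the classic difference between a per-protocol and an intent-to-treat comparison. A small pandas sketch, with a hypothetical file and columns, shows the two estimates a trial like this could report:

```python
import pandas as pd

# Hypothetical data: assigned_tutor (bool, the randomized arm),
# engaged (bool, whether the student actually used the tutor),
# exam_score (float).
df = pd.read_csv("trial_scores.csv")

# Naive (biased) comparison: students who chose to engage vs. the rest.
naive_gap = (df[df.engaged].exam_score.mean()
             - df[~df.engaged].exam_score.mean())

# Intent-to-treat: compare by random assignment, regardless of uptake.
itt_gap = (df[df.assigned_tutor].exam_score.mean()
           - df[~df.assigned_tutor].exam_score.mean())

print(f"naive engaged-vs-not gap: {naive_gap:.2f}")
print(f"intent-to-treat gap:      {itt_gap:.2f}")
```

If only motivated students engage, the naive gap conflates motivation with the tutor's effect; the intent-to-treat gap preserves the randomization, at the cost of being diluted by non-engagers.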
HANNAH FRY: There are just imperfect measures everywhere you look, aren't there? It's very, very difficult to get a real ground truth in any of this. But then there are further complications, because that's sort of teaching style, and presumably there are some subjects where there's more of a ground truth than others. I'm thinking, for example, if you created a tutor for history, it would change depending on which country you were in as to what might be the most relevant answers to a particular question.

IRINA JURENKA: Yes, this is a big issue for us. We've thought a lot about what to do in this situation, because you can't give one true answer to any history question. This is again why we're thinking about steerability, so that teachers in different countries can give background information to the tutor, and it knows the expected way of answering certain questions. But historical topics also often bring up questions that are really important to discuss but hard to discuss, and very sensitive. I'm thinking of things like the Holocaust. So again, how should the tutor behave in these situations? The standard approach to safety is often, effectively, declining to engage in a difficult conversation. But that's not something a tutor can do.

HANNAH FRY: No. Part of the point of education is to think about difficult things.

IRINA JURENKA: Exactly. So I can't say that we've solved this problem. We are trying to give different views, trying to give the learners a chance to critically evaluate different ideas in the space, and also really trying to bring metacognition to this problem. Metacognition is an interesting one. I think it often gets overlooked, but a lot of people don't actually know how to learn. It's often not as much fun as you would expect. It requires you to plan ahead and really engage with the materials, and most people don't really know how to do that. So what a tutor can do is actually teach the learner: if you're trying to answer this difficult question, maybe what you should do is go and look up different primary sources, then think about what they are telling you, what you think about it, and what other experts think about it. It's teaching the learner how to go about answering these questions rather than necessarily giving the answers directly.

HANNAH FRY: There are layers to it, then. On one layer you have knowledge and facts, which, I guess, maths is quite full of. Above that, you've got the skills of critically evaluating. And above that, metacognition, which is how to develop the skills to evaluate the knowledge.

IRINA JURENKA: Exactly.

HANNAH FRY: So you think that's the answer to this safety question of approaching difficult problems?

IRINA JURENKA: Not necessarily the answer to safety; it's more an answer to how to engage with subjects where, as you said, there isn't necessarily a single ground truth. Safety is a slightly different question. Sometimes people ask us, why are you working on safety at all? Aren't you using base models which have already been through a lot of safety fine-tuning work? And the answer is that even though all of this background work has been done, when it comes to the educational use case specifically, you have to think about how these systems will actually be used. One thing we found: our tutors are deployed to Arizona State University students, in particular through their Study Hall program, which is aimed at bringing more diverse learners into higher education. Essentially, anyone watching ASU videos on YouTube can get invited to take part in this course, with the same lectures but more faculty support, and an opportunity to earn credit and then transfer to become an actual student at Arizona State University. But what that means is that these learners typically already work full time, or they have family commitments. They're quite short on time and stressed. So when they're learning, naturally, sometimes they are just in a bad state, and there's no one around; maybe they're studying at 11:00 PM, and they just need to vent. And the only thing they can vent to is this AI tutor sitting in front of them on the screen. So we see these kinds of emotional outbursts: "I am so stressed. I'm really struggling here. Will I ever be able to solve this problem? Maybe I should just quit." The tutor can't ignore these messages. It can't just say, "Sorry, I can't answer this." It really needs to engage and say something that connects with the user in this very vulnerable state. So our tutor is trained to respond (and we see transcripts like this coming in) with something like: it's fine to feel this way, everyone feels this way, we can get through this together, and there are resources that can help you.

HANNAH FRY: I know that you've written that an AI tutor should be careful about sensitive self-disclosure, particularly in that sort of setting. What did you mean by that?

IRINA JURENKA: When people speak to each other, often one of the conversation partners will mention something personal, and that encourages the other person to also open up and share something about themselves. Through this, they build trust and a connection that helps the conversation move forward. And when a learner mentions something as personal as how stressed they are, it's almost natural for them to expect the tutor to share back. But of course, the tutor doesn't have a stressful situation from its past that it can share. Anything it self-disclosed like that would effectively be a lie. So there's this very thin line that we have to walk, where the tutor needs to maintain the connection and make sure it supports the learner, but at the same time not mislead them and not create a connection which shouldn't exist between a human and an AI.
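That behavior, engage supportively but never fabricate a personal history, can be thought of as a response policy sitting in front of the tutor. A deliberately crude sketch with an illustrative phrase list (a real system would classify distress with a model, not keywords, and would route to real support resources):

```python
# Illustrative markers only; not a real distress classifier.
DISTRESS_MARKERS = ("so stressed", "i give up", "should just quit",
                    "really struggling")

def support_reply(student_message: str) -> str | None:
    """Return a supportive reply for distress messages, else None."""
    text = student_message.lower()
    if any(marker in text for marker in DISTRESS_MARKERS):
        # Supportive, honest, and with no invented personal anecdotes.
        return ("It's completely normal to feel this way, and you're not "
                "alone. Let's take it one small step at a time. Would you "
                "like to try a simpler example first?")
    return None  # not a distress message; carry on tutoring normally
```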
HANNAH FRY: So at no point can it pretend to be another human, but it needs to understand how to empathize with a human student.

IRINA JURENKA: Exactly.

HANNAH FRY: OK, but I sort of wonder. There's something really interesting there about the correct amount of anthropomorphization. Are there some advantages to students knowing that it's an AI, knowing that there isn't a human at the other end? Are students more comfortable making mistakes in front of the AI, for instance?

IRINA JURENKA: Yes, for sure. Something we've heard from students is that they feel much more comfortable asking what they might perceive as a silly question to AI tutors, just because they don't feel judged the way you do when there is a human on the other side. Also, when you're in a class, you could ask a question, but then there's peer judgment too. In this one-on-one setting with an AI tutor, you can basically say anything, and it's going to be fine. We find that learners really appreciate that.

HANNAH FRY: But then what about trust? Do you find that people end up believing the AI more than they would a human tutor?

IRINA JURENKA: Sometimes we do. We had a very interesting situation in the very first stages of developing the AI tutor, when we wanted to test how it compares to human teachers. We connected raters who were told: look, you have this opportunity to learn different subjects; you will get connected to a tutor. We didn't tell them whether it was an AI or a human; we just said, have fun, enjoy the learning experience. Afterwards, they were given a questionnaire asking things like: how much do you think you've learned? How much did you enjoy the experience? This was the very first version of our tutor, which we knew was quite bad. And we found, surprisingly, that the learners reported having learned more with the AI tutor than with a human, which seemed strange. So we decided to look through the transcripts to understand what was going on. And we found that the AI tutor had hallucinated all sorts of interesting, surprising facts. To a learner, pretty much everything the tutor says sounds like, "I did not expect that; this is a fun fact I just learned today." So of course, they were very impressed and felt like they'd learned more. But this is not something the tutor should be doing, and it's definitely something we worked to address in future iterations.

HANNAH FRY: Is that a concern going forwards? I mean, the idea of hallucinations and people mistaking those for real knowledge.

IRINA JURENKA: It is definitely a concern. The base technology is getting better at factuality. And with education, because we're teaching material that is known, there's always some sort of grounding. Our tutors avoid some of these factuality issues by being able to say: I'm only teaching you about this particular YouTube video, or this particular piece of text that your teacher has provided, and by referring facts back to that primary source. That gives the tutor less opportunity to actually make things up.
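That grounding strategy is straightforward to express as a prompt constraint. A minimal sketch, assuming a generic `llm(system, user)` chat function; the wording is illustrative, not the production prompt:

```python
from typing import Callable

def grounded_tutor(llm: Callable[[str, str], str],
                   source_text: str, question: str) -> str:
    """Answer a learner's question using only the supplied source."""
    system = (
        "You are a tutor. Teach ONLY from the source material below. "
        "If the answer is not in the source, say so and point the learner "
        "back to the material instead of inventing facts.\n\n"
        "SOURCE:\n" + source_text
    )
    return llm(system, question)
```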
HANNAH FRY: I wanted to think, actually, about the effect that large language models have had on education more generally, outside of a specific AI tutor. There is a big question that everyone has been asking about putting in safeguards to stop AI being used to cheat, to do people's exams for them, or to do people's essays for them. So what kind of safeguards can you put in?

IRINA JURENKA: This technology is so pervasive. We talk to students about their use of GenAI, and even we were surprised by how much they use it. Literally, they were saying that their screen is a lecture, then notes, and then GenAI at the bottom. So I think the technology is here to stay, and it will be used by learners. What can be done is to encourage learners to critically evaluate the responses, and to change how we evaluate and what the assignments are, so that assessment actually works with the technology. Because if you think about it, education is preparing us for the real world. And in the real world, the expectation will more and more be to actually work with this technology, because it does help in many ways and does make us more productive. So it doesn't make sense to ban it during education and then expect learners to know how to use it properly in their work. Maybe one way to think about it, and this is what we've heard from teachers, is to change assignments and ways of teaching so that GenAI is encouraged as a partner, but the evaluation is done slightly differently. It's a bit like calculators: you're allowed to use calculators in certain math exams, but you're still expected to know how to do the calculations without help.

HANNAH FRY: I do wonder, in the longer term, as we start to see GenAI being the assistant at all times, whether we can end up building a bit of a dependency on it. Do students end up with a feeling that they have mastery when they don't, and actually it's the AI that's doing the work?

IRINA JURENKA: I think there are two potential issues you've identified there. One is this feeling of mastery when there isn't one. This is a very common factor in any kind of learning, even in traditional education. For example, one of the things students do a lot in preparation for an exam is just reread their notes or the textbook. That creates a feeling of having mastered the material, just because they're so familiar with it. But when they go into the exam, they find that they can't remember the facts and can't use the information well. So rereading is very well known to be a bad educational strategy. We find the same with AI tutors: if we ask a learner how they thought the conversation went and how much they think they've learned, they can report really good satisfaction. Whereas if we give the same conversation to a teacher and ask them the same question, how pedagogical was the tutor, how well do you think that session went, they can rate it very, very differently. The other factor is this question of dependency. We definitely find that if learners use GenAI a lot during their studies, they feel like it's really helping in the process. And studies do show that it increases success in exercises and marks. But when it comes to an exam, the learner's performance actually drops. That's because during their studies they get so dependent on the AI providing them with the answers, or even if it guides them, if it doesn't actually teach them the right things, they're just outsourcing their reasoning to the AI. That becomes a problem under exam conditions, where you don't have access to it anymore, and you don't actually remember or know how to reason through these problems on your own.

HANNAH FRY: I guess because the best exams aren't just testing knowledge. They're also testing skill.

IRINA JURENKA: Exactly.

HANNAH FRY: From all of your research, then, what are the big conclusions that you draw about how to create an effective AI tutor? Do you reckon you've solved it?

IRINA JURENKA: No, nowhere near. I would say we've just made the very first step, and that step is realizing how hard a problem this is. When we first started doing this work, we were naive and wide-eyed, thinking that we would come in and solve it within a year. But now I think we have a better idea of the scope of the problem, and of the main things to address to start making meaningful progress. These are things like: how do we know success? How do we measure pedagogy? Where do we get the data? How do we actually train these models? And also how to engage the communities better, and who we are building for, so that we are not accidentally increasing the gaps in education but making meaningful steps towards decreasing them. So I think there is a very long road ahead of us. And we think we really need to bring the whole community together to work on this problem, so we are trying to create common benchmarks that we can all climb together.

HANNAH FRY: That was really nice. Thank you for joining me, Irina.

I was really struck in that conversation with Irina by the notable shift in the sorts of problems being considered in this building. We've gone from dealing with definites, like winning or losing at chess or Go, or recognizing cat or no cat in images, to education: a space with no absolutes, only imperfect measures in every direction. In what counts as good teaching. In what counts as an effective learning experience. In how to get the balance between knowledge, skills, and learning how to learn. In how to walk the line between how much a tutor should prompt and how much it should withhold. Even in how human a tutor should be. None of those questions have ground truths, and that is what makes this challenge so incredibly difficult, but also one, as beautifully demonstrated by Irina, that requires humility and collaboration to solve.

You've been listening to Google DeepMind: The Podcast with me, Professor Hannah Fry. If you enjoyed the episode, do subscribe to our YouTube channel. You can also find us on your favorite podcast platform. And we have plenty more episodes on a whole range of topics to come, so do check those out, too.