deeplearning.ai's Heroes of Deep Learning - Pieter Abbeel

The Role of Mentorship and Learning in Personal Growth

One of the most effective ways to help people become stronger at whatever they want to do is mentorship: a dedicated person who takes responsibility for guiding someone toward their goals. In academia this role is built in; a PhD advisor's job is, in large part, to shape students and help them become more capable. Many companies offer something similar through mentorship programs that pair individuals with experienced professionals.

Formal education is not the only way to get this kind of support. Company mentors can take on much the same role as professors, but the key difference is one of guarantees: in a PhD program, mentorship is the core of the arrangement, while in industry it depends on the company and the people involved. Either way, a dedicated, experienced guide can significantly accelerate progress compared with learning everything alone.

Learning through Self-Direction and Mentorship

There are two main approaches to learning: self-directed learning and mentorship-led learning. Self-directed learning means taking charge of one's own education, drawing on books, online courses, and tutorials; in the interview, Abbeel notes that much of AI can be self-studied this way, and stresses not just reading and watching videos but actually trying things out with frameworks such as TensorFlow, Chainer, Theano, or PyTorch. This approach lets individuals learn at their own pace and explore the topics that interest them.

Mentorship-led learning, by contrast, relies on a dedicated person to guide and support the learner. This is the model of academic PhD programs, where advising is the crux of the arrangement, and of company mentorship programs, which pair employees with experienced professionals but offer fewer guarantees.

Deep Reinforcement Learning: Applications and Challenges

Deep reinforcement learning is an exciting field that has already produced impressive results. One of the most notable examples is AI systems that learn to play Atari games from raw pixels: the agent receives nothing but the screen image, numbers it must somehow process into joystick actions, and it improves purely by trial and error against the game's score.
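
The Atari systems use deep Q-networks over pixels; the same trial-and-error principle can be shown at miniature scale with tabular Q-learning on a toy environment. Everything below (the corridor environment, the hyperparameters) is invented for illustration and is not taken from the original Atari work:

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal
ACTIONS = [-1, +1]    # move left or move right

def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != N_STATES - 1:
            # epsilon-greedy: mostly exploit current estimates, sometimes explore
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: q[(s, act)])
            s2 = min(max(s + a, 0), N_STATES - 1)
            r = 1.0 if s2 == N_STATES - 1 else 0.0   # reward only at the goal
            # standard Q-learning update toward the bootstrapped target
            target = r + gamma * max(q[(s2, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (target - q[(s, a)])
            s = s2
    return q

q = train()
policy = [max(ACTIONS, key=lambda act: q[(s, act)]) for s in range(N_STATES - 1)]
print(policy)  # expected: +1 (move right) in every non-goal state
```

The agent is never told that "right" is good; it stumbles onto the goal through exploration, and the reward propagates backward through the Q-values until the greedy policy walks straight there, the same mechanism the Atari agents apply at vastly larger scale.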

Another example of deep reinforcement learning in action is simulated robots that invent locomotion. In work at Berkeley, a simulated robot was given a reward as simple as "the farther you travel forward, the better, and the less hard you impact the ground, the better," and from that alone it discovered walking and running, with no one ever demonstrating a gait. Related projects have taken robots all the way from raw sensory inputs to raw motor torques, for instance teaching a robot to fit a block into a matching opening, much like a child's shape-sorting toy.
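
The reward just described can be written down almost verbatim. The trade-off coefficient below is a hypothetical choice for illustration, not a value from the original experiments:

```python
def locomotion_reward(forward_progress, impact_force, impact_weight=0.1):
    """Reward for one timestep of simulated locomotion: the farther the
    robot travels forward the better, and the softer its ground impacts
    the better. impact_weight is an illustrative trade-off coefficient,
    not taken from the original Berkeley experiments."""
    return forward_progress - impact_weight * impact_force

# A stride covering 1.0 unit of distance with an impact of magnitude 2.0
# scores 0.8; gaits that slam into the ground score lower, so the learner
# is pushed toward smooth walking without ever being shown what a gait is.
print(locomotion_reward(1.0, 2.0))
```

The striking part is how little task knowledge the reward encodes: nothing about legs, balance, or stepping, yet optimizing it produces recognizable locomotion.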

The Generalizability of Deep Reinforcement Learning

One of the most impressive aspects of deep reinforcement learning is how well a single algorithm generalizes across tasks. The same policy-optimization algorithm (trust region policy optimization, for example) can teach a robot to run or to stand up, and still works unchanged when a two-legged robot is swapped for a four-legged one; likewise, one DQN implementation learned every Atari game without modification. This adaptability makes deep reinforcement learning a powerful general-purpose tool, though today's systems still learn each new task from scratch rather than reusing what they learned before, and that reuse remains a frontier problem.
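
The "one algorithm, many tasks" point can be miniaturized. Below, a single, deliberately simple direct policy search (random hill-climbing, standing in for heavyweight methods like TRPO or DQN) is applied completely unchanged to two unrelated toy objectives; every name and objective here is invented for illustration:

```python
import random

def random_search(score, steps=300, sigma=0.5, seed=0):
    """Maximize score(theta) by keeping random perturbations that improve it."""
    rng = random.Random(seed)
    theta, best = 0.0, score(0.0)
    for _ in range(steps):
        candidate = theta + rng.gauss(0.0, sigma)
        if score(candidate) > best:
            theta, best = candidate, score(candidate)
    return theta

# Task 1: a "stride length" whose payoff peaks at 1.0
stride = random_search(lambda t: -(t - 1.0) ** 2)
# Task 2: an unrelated "controller gain" whose payoff peaks at 3.0
gain = random_search(lambda t: -abs(t - 3.0))

print(round(stride, 1), round(gain, 1))
```

Neither task required touching the learner, only supplying a different score function; that separation between algorithm and task is exactly what the generality claim is about.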

However, there are also hard open problems, several of which do not arise in supervised learning at all. With reinforcement learning there is the question of where the data even comes from (exploration), of which early actions deserve credit for a reward that arrives much later (credit assignment), and of how to collect experience safely. Safety is especially stark for self-driving cars: a car running pure reinforcement learning from scratch would likely have many accidents before doing anything useful, and once a system is about as good as a human driver, the informative failure cases become so rare that gathering the data needed to improve further is extremely slow. Most current successes also operate over horizons of a few seconds, far short of the day-long or lifelong reasoning a robot or software agent would ultimately need.

Real-World Deployment of Deep Reinforcement Learning

While deep reinforcement learning has shown impressive results in simulation and controlled environments, real-world deployment is still a work in progress. Abbeel expects realistic deployments to begin with behavioral cloning: humans do the work, and a supervised learner is trained to mimic their decisions. Many businesses may well be built this way at first, with a human behind the scenes handling requests while the machine learning system learns to match the human's actions and starts offering suggestions the human can accept with a click.
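
Behavioral cloning reduces "mimic the human" to ordinary supervised learning: predict the demonstrator's action in each situation. A minimal sketch, with a per-state majority vote standing in for the neural network a real system would fit to high-dimensional inputs (states, actions, and demonstrations below are all invented):

```python
from collections import Counter, defaultdict

def clone(demonstrations):
    """demonstrations: iterable of (state, action) pairs from a human."""
    votes = defaultdict(Counter)
    for state, action in demonstrations:
        votes[state][action] += 1
    # The cloned policy replays the most common human action in each state.
    return {s: c.most_common(1)[0][0] for s, c in votes.items()}

demos = [("red_light", "brake"), ("red_light", "brake"),
         ("green_light", "go"), ("red_light", "coast"), ("green_light", "go")]
policy = clone(demos)
print(policy["red_light"], policy["green_light"])  # brake go
```

Note that the cloned policy can never exceed the demonstrator; it only reproduces observed behavior, which is why the reinforcement layer described next matters.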

However, the endgame is more than imitation. As the cloned system improves, reinforcement learning can be infused on top of it: rather than merely matching the human, the system is given actual objectives, such as how quickly two people managed to schedule a meeting or book a flight and how satisfied they were with the outcome. This hybrid of supervised bootstrapping and reinforcement fine-tuning combines the strengths of human expertise and machine optimization while avoiding the slow, and not always safe, process of learning everything from scratch.
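
A compressed sketch of that two-phase pipeline: preferences are seeded by imitating human demonstrations, then adjusted by a reward signal. The states, actions, update rule, and reward function are all hypothetical, chosen only to make the mechanism visible:

```python
import random
from collections import defaultdict

def hybrid_policy(demos, reward_fn, rl_steps=200, lr=0.5, seed=0):
    rng = random.Random(seed)
    prefs = defaultdict(lambda: defaultdict(float))
    # Phase 1 (behavioral cloning): count human choices as initial preferences.
    for state, action in demos:
        prefs[state][action] += 1.0
    # Phase 2 (reinforcement): nudge preferences toward higher-reward actions.
    for _ in range(rl_steps):
        state = rng.choice(list(prefs))
        action = rng.choice(list(prefs[state]))   # explore among known actions
        prefs[state][action] += lr * reward_fn(state, action)
    # Final policy: the highest-preference action in each state.
    return {s: max(acts, key=acts.get) for s, acts in prefs.items()}

# The human demonstrator mostly books by phone, but the (hypothetical)
# objective rewards the faster web flow more highly.
demos = [("book_flight", "phone"), ("book_flight", "phone"), ("book_flight", "web")]
reward = lambda s, a: 1.0 if a == "web" else -0.2
policy = hybrid_policy(demos, reward)
print(policy["book_flight"])  # reinforcement overrides the cloned habit: web
```

The cloning phase keeps early behavior sensible, and the reinforcement phase lets the system surpass the demonstrator once a real objective is available.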

Reinforcement Learning in a Hypothetical Messenger Assistant

Abbeel illustrates this deployment path with a hypothetical assistant along the lines of Facebook Messenger. Such a system could start with a human behind the curtain doing most of the work while a supervised model learns to match the human's replies, eventually surfacing a small set of suggested responses the human can select with a click. Once the suggestions are good enough, reinforcement learning could be layered in, with objectives such as how quickly users book a flight or schedule a meeting and how happy they are with the result.

The appeal of this approach is that the assistant would keep improving from real user feedback rather than requiring ever more human demonstration, letting it adapt as user needs and preferences change while the human gradually steps back from the loop.
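
To make the feedback loop concrete, here is a minimal epsilon-greedy bandit that learns which of two candidate replies users rate more highly. This is purely an illustrative sketch of the general mechanism and makes no claim about how any real messaging system is implemented:

```python
import random

class ReplyBandit:
    """Treat candidate replies as bandit arms and learn their value from
    thumbs-up/thumbs-down style feedback (1.0 = satisfied, lower = not)."""

    def __init__(self, replies, epsilon=0.1, seed=0):
        self.values = {r: 0.0 for r in replies}   # estimated mean reward per reply
        self.counts = {r: 0 for r in replies}
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def choose(self):
        # Occasionally explore a random reply; otherwise pick the best so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.values))
        return max(self.values, key=self.values.get)

    def feedback(self, reply, reward):
        # Incremental mean of the rewards observed for this reply.
        self.counts[reply] += 1
        self.values[reply] += (reward - self.values[reply]) / self.counts[reply]

bandit = ReplyBandit(["canned_reply", "personalized_reply"])
for _ in range(200):
    reply = bandit.choose()
    # Simulated users consistently rate the personalized reply higher.
    bandit.feedback(reply, 1.0 if reply == "personalized_reply" else 0.3)

print(max(bandit.values, key=bandit.values.get))
```

No human labels the replies; the value estimates shift purely from users' reactions, which is exactly the property that would let such an assistant keep improving after deployment.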

In conclusion, mentorship plays a central role in helping people grow, whether through the built-in advising of a PhD program or the less guaranteed but often excellent guidance available at companies. Deep reinforcement learning has already delivered impressive results, from playing Atari games directly from pixels to simulated robots that invent walking, but its real-world deployment remains a work in progress, and realistic systems will likely bootstrap from supervised imitation of humans before reinforcement learning is layered on to pursue real objectives.

"WEBVTTKind: captionsLanguage: enso thanks a lot Peter for joining me today um I think a lot of people know you as a well known machine learning and deep learning and robotics researcher like to have people here but about your story how did you end up doing the work that you do yeah it's a it's a good question and actually if you would have asked me as a as a 14 year old what I was aspiring to do it probably would not have been this in fact at the time I thought being a professional basketball player would be the right way to go I don't think I was able to achieve it I feel that machine let me left out that DeVos ball didn't work out yeah that didn't work out it was a lot of fun playing basketball but it didn't work out to try to make it into a career so what I really liked in school was physics and math and so from there it seemed pretty natural to study engineering which is applying physics and math in the real world and I should them after my undergrad in electrical engineering a she wasn't so sure what the Duke is which we anything engineering seemed interesting to me like understanding how anything works seems interesting trying to build anything is interesting and in some sense artificial intelligence went out because it seemed like it could somehow help all disciplines in some way and also it seems somehow a little more at the core of everything like you think about how a machine can think then maybe that's more the core of everything else than picking any specific discipline and saying you know a is the new electricity sounds like the fourteen-year-old here's know if you had an earlier version of that event you know in the past few years you've done a lot of work in deep reinforcement learning what's happening why is deeper enforcement learning suddenly taking off before I worked in deep reinforcement learning and I work a lot in reinforcement learning actually with you and ER at Stanford of course and so we worked on autonomous helicopter flight then later 
at Berkeley with some of my students who worked on getting a robot to learn to fall laundry and kind of what characterized the work was a combination of learning that enabled things that would not be possible without learning but also a lot of domain expertise in with the learning to get this to work and it it was very interesting because you needed the main expertise which is fun to acquire but at the same time was very time-consuming for every new application you wanted to succeed you need domain expertise plus machine learning expertise and for me was in 2012 with the imagenet breakthrough results from Jeff Hinton's group in Toronto Alex Ned showing that supervised learning all of a sudden could be done with far less engineering for the domain at hand there was very low engineering by a vision in Alex net made me think we really should revisit reinforcement learning under the same kind of viewpoint and see if we can get the deep version of reinforcement learning to work and do equally interesting things as had just happened in the supervised learning and so you know sounds like you saw earlier than most people the potential of deeper enforcement of learning so now looking into the future what do you see next what your prediction so that makes several ways to come in deeper also learning so I think what's interesting about deep reinforcement learning is down in some soon as there is many more questions none in supervised learning in supervised learning is about learning an input-output mapping with reinforcement learning there is the notion of where does the data even come from so that's the exploration problem when you have data how do you do credit assignment how do you understand what actions you took early on got you the reward later and then there's issues of safety when you have a system of times like collecting data essentially rather dangerous in most situations imagine the self-driving car company that says we're just gonna run deep reinforcement 
learning it's pretty likely that car will get into a lot of accidents before it does anything useful you need two negative examples of those oh right you do need some negative examples somehow yeah send positive ones hopefully so I think there's still a lot of challenges in deep reinforcement learning in terms of working out some of the specifics of how to get these things to work so the deep part is the representation within the reinforcement learning itself still has a lot of questions and what I feel is that with the advanced advances in deep learning somehow one part of the puzzle in reinforcement learning has been largely addressed which is the representation part so if if there is a pattern we can probably represent it with a deep network and capture that pattern and then how that tease apart the pattern is still a big challenge in reinforcement learning so I think big challenges are how to get systems to reason over long time horizons so right now a lot of the successes in deep reinforcement learning are very short horizon there are problems where if you act well over a five second horizon you act well over the entire problem and so a five second skill was something very different from a day-long skill or the ability to live a life as a robot or some software agent so I think there's a lot of challenges there I think safety has a lot of challenges in terms of how do you learn learn safely and also how do you keep learning once you're already pretty good so give an example again that a lot of people would be familiar with self-driving cars for a self-driving car to be better than a human driver I should that human drivers may begin to accidents bad accidents every three million miles or something and so that takes a long time to see the negative data once you're as good as a human driver but you want your cell down in car to be better than a human driver and so at that point the data collection becomes really really difficult to get that interesting data that 
makes your system improve there's a lot of challenges related to exploration that tie into that but one of the things I'm actually most excited about right now is seeing if we can actually take a step back and also learn the reinforcement learning algorithm so reinforcement is very complex credits and very complex explorations very complex and so maybe just like how deep learning for supervised learning was able to replace a lot of domain expertise maybe we can have programs that are learned that are reinforcement learning programs and the do all this instead of us designing the details the river-water functional during the whole program so this would be learning the entire reinforcement learning program so it would be imagine you have a reinforcement learning program whatever it is and you you throw it out some problem and then you see how long it takes to learn and then you say well that took a while now let another program modify this ring for learning program after the modification see how fast it learns if it learns more quickly that was a good modification and maybe you keep it and improve from there Wow I see right yeah this is direction yeah it's I think it's a lot to do with maybe the amount of compute that's becoming available so the more this would be running reinforcement learning in the inner loop whereas right now we were unreinforced millenials the final thing and so the more compute we get the more it becomes possible to maybe run something like reinforcement learning in the inner loop of a bigger algorithm so you know starting from the 14 euro you you've worked in the iPhone maybe what some 20-plus years now so so tell me a bit about how your understanding of AI has evolved over this at this time yeah so when I started looking at AI sorry interesting cuz it really coincided with coming to Stanford to do my master's degree there and there were some icons there like John McCarthy who I got to talk with but who had a very different approach to and in 
the year 2004 what most people are doing at the time but also talking with Daphne Koller and I think a lot of my initial thinking of AI was shaped by Daphne's thinking her AI class her brow was the graphical models class and kind of really being intrigued by how simply a distribution over many random variables and then being able to condition on some subsets of variables and throwing conclusions about others could actually give you so much if you can somehow make it computationally intractable which was definitely the challenge to make it computable and then from there when I start my PhD and her you you arrived at Stanford and I think you gave me a really good reality check that's that's not the right metric to evaluate your your work by and to really try to see the the connection from what you're working on to what impact it can can really have what change it can make rather than what's the math that happened to to be in your work right this doesn't mean I I that I did not realize that we got through that yeah it's actually one of the things that most often to people people asking them once if you're gonna cite only one thing that has stuck with you from Andrews advice it's it's making sure you can see the connection to where it's actually gonna do something um you know you've had and you're continuing to have an amazing career in AI so for some of the people you know listening to you on video now if they want to also enter or pursue a career in AI what what advice do you have for them I think it's a really good time to get into artificial intelligence it's if you look at the demand for for people it's so high there is so many so many job opportunities so many things you can do research wise build new companies and so forth so I would say yes it's definitely a smart decision in terms of actually getting going a lot of it you can self study whether you're in school or not there is a lot of online courses in your machine learning course there is also for example 
Andrea Kerr posses deep learning course which has videos online which is a great way to get started at Berklee do is a deep reinforced learning course which is all the lectures online so those are all good places to get started I think a big part of what what's important is to to make sure you try things yourself so not just read things or and watch videos but try things out with frameworks like tensorflow chainer Theano pi torch and so forth and then whatever is your favorite just it's very easy to get going and get something up and running very quickly to get the practice yourself very good implementing and see what works and see what doesn't work so this past week there was an article in Mashable about a 16 year old and United Kingdom who is one of the leaders on Carroll competitions and he just said he just went out and learn things found things online learned everything himself and never actually took any formal course per se and there is a 16 year old just being very competitive in Carroll competitions so it's definitely possible yeah we live in good times very people that one has learned absolutely one question I bet you get all sometimes is if someone wants to you know enter AI machine learning deep learning should they apply for a ph.d program or should they get the job as big company I think a lot of it has to do with maybe how much mentoring you can get so in a ph.d program you're essentially guaranteed the job of the professor is who is your advisor is to look out for you try to do everything they can to kind of shape you help you become stronger at whatever you want to do for example AI and so there's a very clear dedicated person sometimes you have to advise it and that's that's literally a job and that's why they are professors that's most of what they like about being professors often is helping shape students to become more capable at things now it doesn't mean it's not possible at companies and many companies have really good mentors and have 
people who love to help educate people who come in and strengthen them in so forth it's just it might not be as much of a guarantee and a given compared to actually enrolling in a ph.d program where that's the crux of the program is that you're gonna learn and somebody is there to help you learn yes so it really depends on the company and depends on the ph.d program absolutely yeah but I think it is key that then you can learn a lot on your own but I think you can learn a lot faster if you have somebody who's more experienced to is actually taking it up as their responsibility to spend time with you and help accelerate your progress so you know you've been one of the most visible leaders in deep reinforcement learning so what are the things that deep reinforcement learning is already working really well at I think if you look at some deep reinforce learning successes it's it's very very intriguing for example learning to play Atari games from pixels processing this pixels which is just numbers that are being processed somehow and turning to joystick actions then for example some of the work we did at Berkeley where we have a simulated robot inventing walking and the reward that it's given is as simple as the further you go north the better and the less hard you impact with the ground the better and somehow it decides that walking / running is the thing to invent whereas nobody showed it while walking is or running is or robot playing with children poison learned to kind of put them together put a block into a matching opening and so forth and so I think I think it's really just alert from raw sensory inputs all the way to raw controls for example torques at the motors but at the same time so it's very interesting that you can have a single algorithm for example no trustees and policy items is you can learn can have a robot learn to run can ever robot learn to stand up can have instead of a two legged robot now you're swapping a four-legged robot you run the same 
reinforce learning algorithm and it still learns to run and so there's no changing the reinforcement algorithm it's very very general same for the Atari games dqn was the same dqn for every one of the games but then when it actually starts hitting the frontiers of what's not yet possible as well it's it's it's nice it learns from scratch for each one of these tasks but it would be even nicer if it could reuse things that's learned in the past to learn even more quickly for the next task and that's something that that's still at the frontier and not yet possible it always starts from scratch essentially how quickly do you think you see deeper for learning get deployed in the robots around us or the robots in you know they're getting deployed in it well today I think in practice the realistic scenario is one where it's it starts with supervised learning behavioral cloning humans do them do the work and I think actually a lot of businesses will be built that way where it's a human behind the scenes doing a lot of the work imagine facebook messenger assistant a system like that could be built with a human behind the curtains doing a lot of the work machine learning matches up with what the human does and starts making suggestions to the humans sodium has a small number of options available you can just to click and select and then over time as it gets pretty good you start infusing some reinforcement learning where you give it actual objectives not just matching the human behind the curtains but give it objectives of achievement like maybe how fast were these two people able to plan their their meeting or how fast were they able to book their flight or things like that how long did it take how happy were they with it but it would probably be bootstrapped of a lot of behavioral cloning of humans showing how this could be done so saw the behavioral clothing just supervised learning to mimic whatever the person is doing and then gradually layer on the reinforcement 
learning to have it think about longer time horizons is that a fair summary I'd say so yeah just because straight up reinforcement learning from scratch is is really fun to watch it's it's super intriguing and very few things more fun to watch than a reinforced learning robot starting from nothing and inventing things but it's just time consuming and it's not always safe thank you very much that was fascinating but I'm really glad we had the chance to chat well and ER thank you for having me very much appreciate itso thanks a lot Peter for joining me today um I think a lot of people know you as a well known machine learning and deep learning and robotics researcher like to have people here but about your story how did you end up doing the work that you do yeah it's a it's a good question and actually if you would have asked me as a as a 14 year old what I was aspiring to do it probably would not have been this in fact at the time I thought being a professional basketball player would be the right way to go I don't think I was able to achieve it I feel that machine let me left out that DeVos ball didn't work out yeah that didn't work out it was a lot of fun playing basketball but it didn't work out to try to make it into a career so what I really liked in school was physics and math and so from there it seemed pretty natural to study engineering which is applying physics and math in the real world and I should them after my undergrad in electrical engineering a she wasn't so sure what the Duke is which we anything engineering seemed interesting to me like understanding how anything works seems interesting trying to build anything is interesting and in some sense artificial intelligence went out because it seemed like it could somehow help all disciplines in some way and also it seems somehow a little more at the core of everything like you think about how a machine can think then maybe that's more the core of everything else than picking any specific discipline and 
saying you know a is the new electricity sounds like the fourteen-year-old here's know if you had an earlier version of that event you know in the past few years you've done a lot of work in deep reinforcement learning what's happening why is deeper enforcement learning suddenly taking off before I worked in deep reinforcement learning and I work a lot in reinforcement learning actually with you and ER at Stanford of course and so we worked on autonomous helicopter flight then later at Berkeley with some of my students who worked on getting a robot to learn to fall laundry and kind of what characterized the work was a combination of learning that enabled things that would not be possible without learning but also a lot of domain expertise in with the learning to get this to work and it it was very interesting because you needed the main expertise which is fun to acquire but at the same time was very time-consuming for every new application you wanted to succeed you need domain expertise plus machine learning expertise and for me was in 2012 with the imagenet breakthrough results from Jeff Hinton's group in Toronto Alex Ned showing that supervised learning all of a sudden could be done with far less engineering for the domain at hand there was very low engineering by a vision in Alex net made me think we really should revisit reinforcement learning under the same kind of viewpoint and see if we can get the deep version of reinforcement learning to work and do equally interesting things as had just happened in the supervised learning and so you know sounds like you saw earlier than most people the potential of deeper enforcement of learning so now looking into the future what do you see next what your prediction so that makes several ways to come in deeper also learning so I think what's interesting about deep reinforcement learning is down in some soon as there is many more questions none in supervised learning in supervised learning is about learning an 
input-output mapping with reinforcement learning there is the notion of where does the data even come from so that's the exploration problem when you have data how do you do credit assignment how do you understand what actions you took early on got you the reward later and then there's issues of safety when you have a system of times like collecting data essentially rather dangerous in most situations imagine the self-driving car company that says we're just gonna run deep reinforcement learning it's pretty likely that car will get into a lot of accidents before it does anything useful you need two negative examples of those oh right you do need some negative examples somehow yeah send positive ones hopefully so I think there's still a lot of challenges in deep reinforcement learning in terms of working out some of the specifics of how to get these things to work so the deep part is the representation within the reinforcement learning itself still has a lot of questions and what I feel is that with the advanced advances in deep learning somehow one part of the puzzle in reinforcement learning has been largely addressed which is the representation part so if if there is a pattern we can probably represent it with a deep network and capture that pattern and then how that tease apart the pattern is still a big challenge in reinforcement learning so I think big challenges are how to get systems to reason over long time horizons so right now a lot of the successes in deep reinforcement learning are very short horizon there are problems where if you act well over a five second horizon you act well over the entire problem and so a five second skill was something very different from a day-long skill or the ability to live a life as a robot or some software agent so I think there's a lot of challenges there I think safety has a lot of challenges in terms of how do you learn learn safely and also how do you keep learning once you're already pretty good so give an example 
again that a lot of people would be familiar with self-driving cars for a self-driving car to be better than a human driver I should that human drivers may begin to accidents bad accidents every three million miles or something and so that takes a long time to see the negative data once you're as good as a human driver but you want your cell down in car to be better than a human driver and so at that point the data collection becomes really really difficult to get that interesting data that makes your system improve there's a lot of challenges related to exploration that tie into that but one of the things I'm actually most excited about right now is seeing if we can actually take a step back and also learn the reinforcement learning algorithm so reinforcement is very complex credits and very complex explorations very complex and so maybe just like how deep learning for supervised learning was able to replace a lot of domain expertise maybe we can have programs that are learned that are reinforcement learning programs and the do all this instead of us designing the details the river-water functional during the whole program so this would be learning the entire reinforcement learning program so it would be imagine you have a reinforcement learning program whatever it is and you you throw it out some problem and then you see how long it takes to learn and then you say well that took a while now let another program modify this ring for learning program after the modification see how fast it learns if it learns more quickly that was a good modification and maybe you keep it and improve from there Wow I see right yeah this is direction yeah it's I think it's a lot to do with maybe the amount of compute that's becoming available so the more this would be running reinforcement learning in the inner loop whereas right now we were unreinforced millenials the final thing and so the more compute we get the more it becomes possible to maybe run something like reinforcement 
learning in the inner loop of a bigger algorithm so you know starting from the 14 euro you you've worked in the iPhone maybe what some 20-plus years now so so tell me a bit about how your understanding of AI has evolved over this at this time yeah so when I started looking at AI sorry interesting cuz it really coincided with coming to Stanford to do my master's degree there and there were some icons there like John McCarthy who I got to talk with but who had a very different approach to and in the year 2004 what most people are doing at the time but also talking with Daphne Koller and I think a lot of my initial thinking of AI was shaped by Daphne's thinking her AI class her brow was the graphical models class and kind of really being intrigued by how simply a distribution over many random variables and then being able to condition on some subsets of variables and throwing conclusions about others could actually give you so much if you can somehow make it computationally intractable which was definitely the challenge to make it computable and then from there when I start my PhD and her you you arrived at Stanford and I think you gave me a really good reality check that's that's not the right metric to evaluate your your work by and to really try to see the the connection from what you're working on to what impact it can can really have what change it can make rather than what's the math that happened to to be in your work right this doesn't mean I I that I did not realize that we got through that yeah it's actually one of the things that most often to people people asking them once if you're gonna cite only one thing that has stuck with you from Andrews advice it's it's making sure you can see the connection to where it's actually gonna do something um you know you've had and you're continuing to have an amazing career in AI so for some of the people you know listening to you on video now if they want to also enter or pursue a career in AI what what advice do you 
have for them?

I think it's a really good time to get into artificial intelligence. If you look at the demand for people, it's so high; there are so many job opportunities, so many things you can do, research-wise, building new companies, and so forth. So I'd say yes, it's definitely a smart decision. In terms of actually getting going, a lot of it you can self-study, whether you're in school or not. There are a lot of online courses, like your machine learning course. There is also, for example, Andrej Karpathy's deep learning course, which has videos online and is a great way to get started, and Berkeley has a deep reinforcement learning course with all the lectures online. Those are all good places to get started. A big part of what's important is to make sure you try things yourself, so not just reading things or watching videos, but trying things out. With frameworks like TensorFlow, Chainer, Theano, PyTorch, and so forth, whichever is your favorite, it's very easy to get going and get something up and running very quickly, to practice implementing things yourself and see what works and what doesn't. Just this past week there was an article in Mashable about a sixteen-year-old in the United Kingdom who is one of the leaders in Kaggle competitions. He just went out and learned things, found things online, learned everything himself, and never actually took any formal course per se. There's a sixteen-year-old being very competitive in Kaggle competitions, so it's definitely possible.

Yeah, we live in good times for people who want to learn. Absolutely. One question I bet you get asked sometimes is: if someone wants to enter AI, machine learning, and deep learning, should they apply to a PhD program, or should they get a job at a big company?

I think a lot of it has to do with how much mentoring you can get. In a PhD program, you're essentially guaranteed that the job of the professor who is your advisor is to look out
for you, to try to do everything they can to shape you, to help you become stronger at whatever you want to do, for example AI. So there's a very clear, dedicated person; sometimes you even have two advisors. That's literally their job, and that's why they are professors: most of what they like about being professors is often helping shape students to become more capable. Now, that doesn't mean it's not possible at companies. Many companies have really good mentors, people who love to help educate people who come in and strengthen them, and so forth. It's just that it might not be as much of a guarantee, as much of a given, compared to enrolling in a PhD program, where that is the crux of the program: you're going to learn, and somebody is there to help you learn.

So it really depends on the company and depends on the PhD program?

Absolutely, yeah. But I think the key thing is that you can learn a lot on your own, and you can learn a lot faster if somebody more experienced has actually taken it up as their responsibility to spend time with you and help accelerate your progress.

You've been one of the most visible leaders in deep reinforcement learning, so what are the things that deep reinforcement learning is already working really well at?

If you look at some deep reinforcement learning successes, they're very, very intriguing. For example, learning to play Atari games from pixels: processing the pixels, which are just numbers, that are somehow processed and turned into joystick actions. Or, for example, some of the work we did at Berkeley, where a simulated robot invents walking, and the reward it's given is as simple as: the further you go north, the better, and the less hard you impact the ground, the better. Somehow it decides that walking, or running, is the thing to invent, even though nobody showed it what walking or running looks like. Or a robot playing with children's toys, learning to kind of put
them together, to put a block into a matching opening, and so forth. So I think what's really striking is that it learns from raw sensory inputs all the way to raw controls, for example the torques at the motors. At the same time, it's very interesting that you can have a single algorithm, for example trust region policy optimization: you can have a robot learn to run, you can have a robot learn to stand up, and instead of a two-legged robot you can swap in a four-legged robot, run the same reinforcement learning algorithm, and it still learns to run. There's no changing of the reinforcement learning algorithm; it's very, very general. The same goes for the Atari games: it was the same DQN for every one of the games.

But then it starts hitting the frontiers of what's not yet possible. It's nice that it learns from scratch for each one of these tasks, but it would be even nicer if it could reuse things it learned in the past to learn the next task even more quickly. That's something that's still at the frontier and not yet possible; it essentially always starts from scratch.

How quickly do you think we'll see deep reinforcement learning deployed in the robots around us, the robots that are getting deployed today?

I think in practice the realistic scenario is one where it starts with supervised learning, with behavioral cloning: humans do the work. I actually think a lot of businesses will be built that way, with a human behind the scenes doing a lot of the work. Imagine a Facebook Messenger assistant. A system like that could be built with a human behind the curtains doing a lot of the work, with machine learning matching up with what the human does and starting to make suggestions to the human, so the human has a small number of options available and can just click and select. Then, over time, as it gets pretty good, you start infusing some reinforcement learning, where you give it actual objectives, not just matching the human behind the
curtains, but objectives of achievement: maybe how fast were these two people able to plan their meeting, how fast were they able to book their flight, how long did it take, how happy were they with it. But it would probably be bootstrapped off a lot of behavioral cloning, off humans showing how this can be done.

So start with behavioral cloning, just supervised learning to mimic whatever the person is doing, and then gradually layer on the reinforcement learning to have it think about longer time horizons. Is that a fair summary?

I'd say so, yes, because straight-up reinforcement learning from scratch is really fun to watch. It's super intriguing, and there are very few things more fun to watch than a reinforcement learning robot starting from nothing and inventing things. But it's just time-consuming, and it's not always safe.

Thank you very much. That was fascinating, and I'm really glad we had the chance to chat.

Well, Andrew, thank you for having me. I very much appreciate it.
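The "learn the reinforcement learning algorithm itself" loop Abbeel describes earlier, where one program modifies another and keeps modifications that make it learn faster, can be sketched in miniature. Everything below is an illustrative assumption, not his actual proposal: the inner "reinforcement learning program" is a toy two-armed-bandit learner parameterized only by its step size, and the outer program is plain random search over that step size.

```python
import random

def run_learner(step_size, n_steps=500, seed=0):
    """Toy inner 'RL program': a two-armed bandit learner. Returns how many
    steps it took to clearly prefer the better arm (lower is faster)."""
    rng = random.Random(seed)
    true_means = [0.3, 0.7]              # arm 1 is the better arm
    values = [0.0, 0.0]
    for t in range(n_steps):
        # epsilon-greedy action selection
        arm = rng.randrange(2) if rng.random() < 0.1 else values.index(max(values))
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        values[arm] += step_size * (reward - values[arm])
        if values[1] - values[0] > 0.3:  # 'learned': clearly prefers arm 1
            return t
    return n_steps                       # never got there within the budget

# Outer loop: another program repeatedly modifies the learner's step size
# and keeps a modification whenever the modified learner learns faster.
rng = random.Random(42)
best_step, best_time = 0.01, run_learner(0.01)
for _ in range(30):
    candidate = min(1.0, max(0.001, best_step * rng.uniform(0.5, 2.0)))
    t = run_learner(candidate)
    if t < best_time:                    # a good modification: keep it
        best_step, best_time = candidate, t
```

Running reinforcement learning in the inner loop like this is exactly why the idea is compute-hungry: every outer-loop candidate requires a full inner training run.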
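The recipe in the closing exchange, behavioral cloning first and then layering reinforcement learning on top with a real objective, can be sketched as follows. The assistant contexts, actions, and `reward` function are hypothetical stand-ins invented for illustration, and the reinforcement stage is a simple tabular update, not any specific production system.

```python
import random
from collections import Counter, defaultdict

# Stage 1: behavioral cloning. Supervised mimicry of the human behind the
# curtains: for each context, copy the action the human demonstrated most.
demos = [("schedule", "propose_times"), ("schedule", "propose_times"),
         ("travel", "search_flights"), ("travel", "search_flights"),
         ("travel", "book_flight")]
counts = defaultdict(Counter)
for context, action in demos:
    counts[context][action] += 1
cloned = {c: cnt.most_common(1)[0][0] for c, cnt in counts.items()}

# Stage 2: layer on reinforcement learning. Optimize an actual objective
# (hypothetical: how quickly the task was completed) rather than only
# matching the human.
ACTIONS = ["propose_times", "search_flights", "book_flight"]

def reward(context, action):
    best = {"schedule": "propose_times", "travel": "book_flight"}
    return 1.0 if action == best[context] else 0.2

q = {(c, a): 0.0 for c in counts for a in ACTIONS}
for c, a in cloned.items():
    q[(c, a)] = 0.5                      # start from the cloned behavior
rng = random.Random(0)
for _ in range(2000):
    c = rng.choice(["schedule", "travel"])
    a = rng.choice(ACTIONS) if rng.random() < 0.2 else max(ACTIONS, key=lambda x: q[(c, x)])
    q[(c, a)] += 0.1 * (reward(c, a) - q[(c, a)])
improved = {c: max(ACTIONS, key=lambda x: q[(c, x)]) for c in ("schedule", "travel")}
```

With these demonstrations, cloning copies the human's most frequent travel action, while the reinforcement stage, given the hypothetical reward, shifts the travel policy toward the action that scores better on the actual objective.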