Boris Sofman - Waymo, Cozmo, Self-Driving Cars, and the Future of Robotics | Lex Fridman Podcast #241

The Pursuit of Passion and Purpose: A Conversation with Boris Sofman

In a conversation that delved into passion, purpose, and the pursuit of happiness, roboticist Boris Sofman shared his insights on balancing personal fulfillment with professional growth. The discussion took up the question of how to turn one's passions into a generalizable approach that lets a person maximize their potential across different domains.

Sofman emphasized that it's essential to find the overlap of where passion meets growing opportunity and need in the world. He likened this concept to the startup argument, where if you're passionate about something and it meets the market, you can do what you love while also opening up a vast array of opportunities. This idea is not limited to technology; Sofman suggested that the same thought process could be applied to various fields, including design, marketing, sales, and more.

The key to success lies in finding a space that is going to continue to grow, with a surface area that increases and problems that never get stale. When individuals join companies or industries that are poised for growth, they create opportunities that were previously unknown. This growth creates a snowball effect, where the possibilities become endless, and one can have many "shots on goal" to find the perfect overlap of timing, passion, and skill set.

Sofman also emphasized the importance of finding balance in life. He shared his personal experience of working obsessively for long periods, which led to an imbalance in friendships, family relationships, and other aspects of his life. To correct this, he prioritized nurturing friendships, building strong family ties, and making time for loved ones.

The conversation took a lighter turn as Sofman mentioned his love for robotics, particularly his work on Cozmo, the toy robot created by Anki, the company he co-founded. He expressed hope that these technologies would continue to evolve and improve, bringing about significant changes in various aspects of life. The discussion concluded with words from Isaac Asimov, highlighting the potential for humans to be seen as capable of love even if they are perceived as robots.

The conversation with Boris Sofman serves as a reminder that pursuing one's passion and purpose is crucial for happiness and fulfillment. By finding the right balance between personal growth and professional opportunities, individuals can create a life that is filled with meaning, joy, and connection. The closing exchange summed up the spirit of the episode as love towards robots: "not the creepy kind," but "the good kind, just friendship" and fun.

The following is a conversation with Boris Sofman, who is the senior director of engineering and head of trucking at Waymo, the autonomous vehicle company, formerly the Google self-driving car project. Before that, Boris was the co-founder and CEO of Anki, a robotics company that created Cozmo, which in my opinion is one of the most incredible social robots ever built. It's a toy robot, but one with an emotional intelligence that creates a fun and engaging human-robot interaction. It was truly sad for me to see Anki shut down when it did. I had high hopes for those little robots. We talk about this story and the future of autonomous trucks, vehicles, and robotics in general. I spoke with Steve Viscelli recently on episode 237 about the human side of trucking; this episode looks more at the robotic side. This is the Lex Fridman Podcast. To support it, please check out our sponsors in the description. And now, here's my conversation with Boris Sofman.

Who is your favorite robot in science fiction, books or movies?

WALL-E and R2-D2, where they were able to convey such an incredible degree of intent, emotion, and character attachment without having any language whatsoever, purely through the richness of emotional interaction. So those were fantastic. And then the Terminator series. Really pretty wide range, right? But I kind of love this dynamic where you have this incredible Terminator itself that Arnold played, and then he was the inferior, previous-generation version that was totally outmatched in terms of specs by the new one, but still held his own. And so it was interesting, where you realize how many levels there are on the spectrum from human to the potentials in AI and robotics, to futures. And so, yeah, that movie, as much as it was kind of a dark world in a way, was actually quite fascinating,
gets the imagination going.

Well, from an engineering perspective, both the movies you mentioned, WALL-E and Terminator: the first one is probably achievable, you know, a humanoid robot, maybe not with the realism in terms of skin and so on, but that humanoid form. We have the humanoid form. It seems like a compelling form; maybe the challenge is that it's just super expensive to build. But you can imagine, maybe not a machine of war, but you could imagine Terminator-type robots walking around. And then the same, obviously, with WALL-E. So for people who don't know, you created the company Anki that created a small robot with a big personality called Cozmo that does exactly what WALL-E does, which is somehow, with very few basic visual tools, able to communicate a depth of emotion, and that's fascinating. But then again, the humanoid form is super compelling. So Cozmo is very distant from a humanoid form, and then the Terminator has a humanoid form, and you can imagine both of those actually being in our society.

It's true, and it's interesting, because it was very intentional to go really far away from human form when you think about a character like Cozmo or like WALL-E, where you can completely rethink the constraints you put on that character, what tools you leverage, and then how you actually create a personality and a level of intelligence and interactivity that actually matches the constraints that you're under, whether it's mechanical or sensors or AI of the day. This is why I was always really surprised by how much energy people put towards trying to replicate human form in a robot, because you actually take on some pretty significant constraints and downsides when you do that. The first of which is obviously the cost, where the articulation of a human body is just so magical, in both the precision as well as the dimensionality, that to replicate that even in a, quote, reasonably close form takes like a
giant amount of joints and actuators and motion and sensors and encoders and so forth. But then you're almost setting an expectation that the closer you try to get to human form, the more you expect the strengths to match. And that's not the way AI works: there's places where you're way stronger and there's places where you're weaker, and by moving away from human form you can actually change the rules and embrace your strengths and bypass your weaknesses.

And at the same time, the human form has way too many degrees of freedom to play with. It's kind of counterintuitive, just as you're saying, but when you have fewer constraints it's almost harder to master the communication of emotion. You see this with cartoons, like stick figures: you can communicate quite a lot with just very minimal, like two dots for eyes and a line for a smile. I think you can almost communicate arbitrary levels of emotion with just two dots and a line.

Yeah.

And that's enough, and if you focus on just that you can communicate the full range. And then if you do that, you can focus on the actual magic of human and dot-line interaction versus all the engineering mess.

That's right. Dimensionality, voice, all these sort of things actually become a crutch where you get lost in a search space, almost. And so some of the best animators that we've worked with, they almost study, when they come up, in building their expertise, by forcing these projects where all you have is a ball that can kind of jump and manipulate itself, or really, really aggressive constraints where you're forced to extract the deepest level of emotion. And so in a lot of ways, when we thought about Cozmo, you're right: if we had to describe it in one small phrase, it was bringing a Pixar character to life in the real world. It's what we were going for. And in a lot
of ways, what was interesting is that with WALL-E, which we studied incredibly deeply, and in fact some of our team had worked previously at Pixar and on that project, they intentionally constrained WALL-E as well, even though in an animated film you could do whatever you wanted to, because it forced you to really saturate the smaller amount of dimensions. But you sometimes end up getting a far more beautiful output, because you're pushing at the extremes of this emotional space in a way that you just wouldn't, because you get lost in a surface area if you have something that is just infinitely articulable.

So if we backtrack a little bit, you thought of Cozmo in 2011 and, in 2013, actually designed and built it. What is Anki? What is Cozmo? I guess, who is Cozmo? And what was the vision behind this incredible little robot?

We started Anki back while we were still in graduate school. So myself and my two co-founders, we were PhD students in the Robotics Institute at Carnegie Mellon, and so we were studying robotics, AI, machine learning, different areas; one of my co-founders was working on walking robots for a period of time. And so we all had a deeper passion for applications of robotics and AI. There's a spectrum, where there's people that get really fascinated by the theory of AI and machine learning and robotics, where whether it gets applied in the near future or not is less of a factor for them, but they love the pursuit of the challenge. And that's necessary, and there's a lot of incredible breakthroughs that happen there. We're probably closer to the other end of the spectrum, where we love the technology and all the evolution of it, but we were really driven by applications: how can you really reinvent experiences and functionality and build value that wouldn't have been possible without
these approaches. And that's what drove us. And we had some experiences through previous jobs and internships where we got to see the applied side of robotics, and at that time there were actually relatively few applications of robotics that were outside of pure research or industrial applications, military applications, and so forth. There were very few outside of it. So maybe iRobot was one exception, and maybe there were a few others, but for the most part there weren't that many. And so we got excited about consumer applications of robotics, where you could leverage way higher levels of intelligence through software to create value and experiences that were just not possible in those fields today. And we saw a pretty wide range of applications that varied in the complexity of what it would take to actually solve those. And what we wanted to do was to commercialize this into a company, but actually do a bottoms-up approach, where we could have a huge impact in a space that was ripe to have an impact at that time, and then build up off of that and move into other areas. And entertainment became the place to start, because you had relatively little innovation in the toy space, the entertainment space. You had these really rich experiences in video games and movies, but there was this chasm in between. And so we thought that we could really reinvent that experience. And there was a really fascinating transition technically that was happening at the time, where the cost of components was plummeting because of the mobile phone industry and then the smartphone industry. And so the cost of a microcontroller, of a camera, of a motor, of memory, of microphones was dropping by orders of magnitude. And then on top of that, with the iPhone coming out in, I think it was 2007, I believe, it started to become apparent within a couple of years that this could become a really incredible interface device and the brain
with much more computation behind a physical-world experience that wouldn't have been possible previously. And so we really got excited about that, and how we push all the complexity from the physical world into software by using really inexpensive components but putting huge amounts of complexity into the AI side. And so Cozmo became our second product, and the one that we're probably most proud of. The idea there was to create a physical character that had enough understanding and awareness of the physical world around it, in the context that mattered, to feel like he was alive, and to be able to have these emotional connections and experiences with people that you would typically only find inside of a movie. And the motivation very much was Pixar. We had an incredible respect and appreciation for what they were able to build in this really beautiful fashion in film. But it was always, one, it was virtual, and two, it was like a story on rails that had no interactivity to it. It was very fixed, and it obviously had a magic to it, but where you really start to hit a different level of experience is when you're actually able to physically interact with that robot.

And then that was your idea with Anki; the first product was the cars. So basically you take a toy, you add intelligence into it in the same way you would add intelligence into AI systems within a video game, but you're now bringing it into the physical space. So the idea is really brilliant, which is you're basically bringing video games to life.

Exactly, that's exactly right. We literally used that exact same phrase, because in the case of Drive, this was a parallel of the racing genre, and the goal was to effectively have a physical racing experience but have a virtual state at all times that matches what's happening in the physical world. And then you can have a video game off of that, and you can have different characters, different traits for your
cars, weapons and interactions and special abilities and all these sort of things that you think of virtually, but then you can have it physically. And one of the things that we were really surprised by, that really stood out and immediately led us to accelerate the path towards Cozmo, is that things that feel like they're really constrained and simple in the physical world have an amplified impact on people, where the exact same experience virtually would not have anywhere near the impact, but seeing it physically really stood out. And so effectively, with Drive, we were creating a video game engine for the physical world. And then with Cozmo, we expanded that video game engine to create a character and an animation and interaction engine on top of it that allowed us to start to create these much more rich experiences. And a lot of those elements were almost like a proving ground for what human-robot interaction would feel like in a domain that's much more forgiving, where you can make mistakes. In a game, it's okay if a car goes off the track or if Cozmo makes a mistake. And what's funny is, actually, we were so worried about that; in reality, we realized very quickly that those mistakes can be endearing, and if you make a mistake, as long as you realize you made a mistake and have the right emotional reaction to it, it builds even more empathy with the character.

That's brilliant, exactly. So when the thing you're optimizing for is fun, you have so much more freedom to fail, to explore. And also in the toy space, all this is really brilliant. I've got to ask you to backtrack: it seems for a roboticist to take a jump in the direction of fun is a brilliant move, because you have the freedom to explore, to design, all those kinds of things, and you can also build cheap robots. If you're not chasing perfection, and with toys, it's understood that you can go cheaper,
which means in robotics it's still expensive, but it's actually affordable by a large number of people. So it's a really brilliant space to explore.

Yeah, that's right. And in fact, we realized pretty quickly that perfection is actually not fun. Because in a traditional roboticist sense, the first path planner, and this is the part that I worked on out of the gate, was a lot of the AI systems where you have these vehicles and cars racing, making optimal maneuvers to try to get ahead. And you realize very quickly that that's actually not fun, because you want the chaos from mistakes. And so you start to intentionally almost add noise to the system in order to create more of a realism, in the exact same way a human player might start really ineffective and inefficient and then start to increase their quality bar as they progress. And there is a really, really aggressive constraint that's forced on you by being a consumer product, where the price point matters a ton, particularly in entertainment, where you can't make a thousand-dollar product unless you're going to meet the expectations of a thousand-dollar product. And so in order to make this work, your cost of goods had to be well under a hundred dollars. In the case of Cozmo, we got it under fifty dollars end-to-end, fully packaged and delivered.

And it was under two hundred dollars, the cost at retail?

Yeah.

So, okay, if we sit down at these early stages, if you go back to that, and you're sitting down and thinking about what Cozmo looks like from a design perspective and from a cost perspective, I imagine that was part of the conversation. First of all, what came first? Did you have a cost in mind? Is there a target you're trying to chase? Did you have a vision in mind, like size? Did you have, because there's
a lot of unique qualities to Cozmo. So for people who don't know, they should definitely check it out. There's a display, there's eyes on the little display, and those eyes, it's pretty low-resolution eyes, right, but they're still able to convey a lot of emotion. And there's this arm, like a lift, that sort of lifts stuff, but there's something about arm movement that adds even more depth. It's like the face communicates emotion and sadness and disappointment and happiness, and then the arms kind of communicate "I'm trying here."

Yeah, "I'm doing my best." Exactly. So it's interesting, because all of Cozmo is only four degrees of freedom, and two of them are the two treads, which is for basic movement. And so you literally have only a head that goes up and down, a lift that goes up and down, and then your two wheels. And you have sound and a screen, yeah, and a low-resolution screen. And with that, it's actually pretty incredible what you can come up with. Like you said, it's a really interesting give and take, because there's a lot of ideas far beyond that, obviously, as you can imagine. Like you said: how big is it, how many degrees of freedom, what does it look like, what does he sound like, how does he communicate? It's a formula that actually scales way beyond entertainment. This is the formula for human-robot interface more generally: you almost have this triangle between the physical aspects of it, the mechanics, the industrial design, what's mass-producible, the cost constraints, and so forth. You have the AI side of how do you understand the world around you, interact intelligently with it, execute what you want to execute, so perceive the environment, make intelligent decisions, and move forward. And then you have the character side of it. Most companies that have done anything in human-robot interaction really missed the mark or underinvest in the character side of it. They overinvest in the
mechanical side of it, and then have varied results on the AI side of it. And so the thinking is that if you put more mechanical flexibility into it, you're going to do better. You don't necessarily; you actually create a much higher bar for a high ROI, because now your price point goes up, your expectations go up, and if the AI can't meet it or the overall experience isn't there, you missed the mark.

So how did you, through those conversations, get the cost down so much and make it so simple? There's a big theme here, because you come from the mecca of robotics, which is Carnegie Mellon University robotics. For all the people I've interacted with that come from there, or just from the world experts at robotics, they would never build something like Cozmo.

Yeah.

And so where did that come from, the simplicity?

It came from this combination of a team that we had. It was quite cool, because, and by the way, you ask anybody that's experienced in the toy entertainment space: you'll never sell a product over $99. That was fundamentally false, and we believed it to be false. It was because the experience had to meet the mark. And so we pushed past that amount, but there was a pressure where the higher you go, the more seasonal you become and the tougher it becomes. And so on the cost side, we very quickly partnered up with some previous contacts that we worked with, where, just as an example, our head of mechanical engineering was one of the earliest heads of engineering at Logitech and has a billion units of consumer products in circulation that he's worked on. So crazy low-cost, high-volume consumer product experience, with a really great mechanical engineering team and just a very practical mindset, where we were not going to compromise on feasibility in the market in order to chase something that would be an enabler. And we pushed a huge amount of expectations onto the
software team, where, yes, we're going to use cheap, noisy motors and sensors, but we're going to fix it on the software side. Then we found on the design and character side there was a faction that was more from a game design background that thought that Cozmo should be very games-driven, where you create a whole bunch of game experiences and it's all about game mechanics. And then there was a faction, which my co-founder and I, the most involved in this, really believed in, which was character-driven. And the argument is that you will never compete with what you can do virtually from a game standpoint, but on the character side you actually put this into your wheelhouse and put it more towards your advantage, because a physical character has a massively higher impact physically than virtually.

Okay, can I just pause on that, because this is so brilliant. For people who don't know, Cozmo plays games with you, but there's also a depth of character. And actually, when I was playing with it, I wondered exactly what is the compelling aspect of this. Because to me, obviously I'm biased, but to me the character is what I enjoyed most, honestly, or what got me to return to it is the character.

That's right.

But that's a fascinating discussion. You're right: ultimately you cannot compete on the quality of the gaming experience. It's too restrictive; the physical world is just too restrictive, and you don't have a graphics engine, all this.

But on the character side, and clearly we moved in that direction as the winning path, we immediately went towards Pixar and partnered up with Carlos Baena. He had been at Pixar for nine years, he'd worked on tons of the movies, including WALL-E and others, and just immediately spoke the language and just clicked on how you think about that kind of magic and drive. And then we built out a team, you
know, with him as a really prominent driver of this, with different types of backgrounds, animators and character developers, where we put these constraints on the team but then got them to really try to create magic despite that. And we converged on this system that was at the overlap of character and the character AI. If you imagine the dimensionality of emotions, happy, sad, angry, surprised, confused, scared, you think of these extreme emotions, we almost put this challenge to populate this library of responses on how do you show the extreme response that goes to the extreme of the spectrum on angry or frustrated or whatever. And so that gave us a lot of intuition and learnings. And then we started parameterizing them, where it wasn't just a fixed recording, but they were parameterized and had randomness to them, where you could have infinite permutations of happy and surprised and so forth. And then we had a behavioral engine that took the context from the real world and would interpret it and then create probability mappings on what sort of responses you would have that actually made sense. And so if Cozmo saw you for the first time in a day, he'd be really surprised and happy, in the same way that the first time you walk in and your toddler sees you, they're so happy, but they're not going to be that happy for the entirety of your next two hours. But you have this spike in response. Or if you leave him alone for too long, he gets bored and starts causing trouble and nudging things off the table. Or if you beat him in a game, the most enjoyable emotions are him getting frustrated and grumpy, to a point where our testers and our customers would be like, "I had to let him win because I don't want him to be upset." And so you start to create this feedback loop where you see how powerful those emotions are. And just to give you an example, something as simple as eye contact: you
don't think about it in a movie, it just kind of happens, with camera angles and so forth, but that's not really a prominent source of interaction. What happens with a physical character like Cozmo, when he makes eye contact with you: it built a universal kind of connection, kids all the way through adults. And it was truly universal; it was not like people stopped caring after 10 or 12 years old. And so we started doing experiments, and we found that with something as simple as increasing the amount of eye contact, like the number of times in a minute that he'll look over for your approval to make eye contact, just by, I think, doubling it, we increased the play time engagement by 40 percent. You see these sort of interactions where you build that empathy. And so we studied pets, we studied virtual characters; a lot of times, actually, dogs are one of the most perfect influences behind these sort of interactions. And what we realized is that the games were not there to entertain you; the games were there to create context to bring out the character. And if you think about the types of games that you played, they're relatively simple, but they were always ones that create scenarios of either tension or winning or losing or surprise or whatever the case might be, and they were purely there to create context where an emotion could feel intelligent and not random. And in the end, it was all about the character.

So, yeah, there's so many elements to play with here. So you said dogs. What lessons do we draw from cats, who don't seem to give a damn about you? Is that just another character?

It's just another character. And so in early aspirations, we thought it would be really incredible if you had a diversity of characters, where you almost help encourage which direction it goes, just like in a role-playing game. And you had, think of like the seven dwarfs, sort of, and
initially we even thought that it would be amazing if their characters actually helped them have strengths and weaknesses in whatever they end up doing: some are scared, some are arrogant, some are super warm and friendly. And in the end we focused on one, because it made it very clear that, hey, we've got to build out enough depth here. Because you're trying to expand, it's almost like, how long can you maintain a fiction that this character is alive, to where the person's explorations don't hit a boundary, which happens almost immediately with typical toys, and even with video games. How long can we create that immersive experience, to where you expand the boundary? And one of the things we realized is that you're just way more forgiving when something has a personality and it's physical. That is the key that unlocks robotics interacting in the physical world more generally: when you don't have a personality and you make a mistake as a robot, the stupid robot made a mistake, why is it not perfect? When you have a character and you make a mistake, you have empathy and it becomes endearing and you're way more forgiving. And that was the key that I think goes far, far beyond entertainment.

It actually builds the depth of the personality, the mistakes. So let me ask the movie "Her" question, then. Cozmo feels like the early days of something that will obviously be prevalent throughout society at a scale that we cannot even imagine. My sense is, it seems obvious that these kinds of characters will permeate society and we'll be friends with them, we'll be interacting with them in different ways. I mean, you don't think of it this way, but when you play video games, they're often cold and impersonal. But even then, you think about role-playing games: you
become friends with certain characters in that game. They don't remember much about you; they're just telling a story. It's exactly what you're saying: they exist in that virtual world. But if they acknowledge that you exist in this physical world, if the characters in the game remember that you exist, that, for me, like Lex, they understand that I'm a human being who has hopes and dreams and so on, it seems like there's going to be billions, if not trillions, of Cozmos in the world. So if we look at that future, there are several questions to ask. How intelligent does that future Cozmo need to be to create fulfilling relationships, like friendships?

Yeah, it's a great question. And part of it was a recognition that it's going to take time to get there, because it has to be a lot more intelligent. Because what's good enough to be a magical experience for an eight-year-old, it's a higher bar to be a companion, like a pet in the home, or to help with a functional interface in an office environment or in a home and so forth. And so the idea was that you build on that and you get there, and as technology becomes more prevalent and less expensive and so forth, you can start to work up to it. But you're absolutely right. At the end of the day, we almost equated it to how the touchscreen created this really novel interface to physical devices like this. This is the extension of it, where you have much richer physical interaction in the real world. This is the enabler for it. And it shows itself in a few really obvious places. So just take something as simple as a voice assistant. Most people will never tolerate an Alexa or a Google Home just starting a conversation proactively when you weren't expecting it, because it feels weird. It's like you were listening, and now it feels intrusive. But if
you had a character um like a cat that touches you and gets your attention or a toddler like you never think twice about it what we found really kind of immediately is that um these types of characters like cosmo they would like roam around and kind of get your attention and we had a future version that was always on called vector people were way more forgiving and so you could initiate interaction in a way that is not acceptable for machines and in general um you know there's a lot of ways to customize it but it makes people who are skeptical of technology much more comfortable with it there were a couple of really really prominent examples of this so when we launched in europe and so we were in um i think like a dozen countries if i remember correctly but like we went pretty aggressively in launching in um germany and france and uh and uk and we were very worried in europe because there's obviously like a socially higher bar for privacy and you know security where you've heard about how many companies have had troubles on things that might have been okay in the u.s but like are just not okay in germany and france in particular um and so we were worried about this because you have um you know cosmo who's um uh you know and our future product vector where you have cameras you have microphones it's kind of connected and like you're playing with kids in these experiences and you're like this is like ripe to be like a nightmare if you're not careful yes um and uh and the journalists are like notoriously like really really tough on these sort of things um we were shocked and we prepared so much for what we would have to encounter we were shocked in that not once from any journalist or customer did we have any complaints beyond like a really casual kind of question and it was because of the character where um when the conversation came up it was almost like well of course he has to see
and hear how else is he going to be alive and interacting with you and it completely disarmed um this like fear of technology that enabled this interaction to be much more fluid and again like entertainment was a proving ground but that is like a you know there's like ingredients there that carry over to a lot of other uh elements down the road that's hilarious that we're a lot less concerned about privacy if the thing has value and charisma i mean that's true for all of human-to-human interaction too it's an understanding of intent where like well he's looking at me he can see me if he's not looking at me he can't see me right so it's almost like uh um you're communicating intent and with that intent people are like kind of more understanding and calmer and it's interesting it was just the earliest kind of version of starting to experiment with this but um it was an enabler and um and then you have like completely different dimensions where like you know kids with autism had like an incredible connection with cosmo that just went beyond anything we'd ever seen and we have like these just letters that we would receive from parents and we had some research projects kind of going on with some universities on studying this but um there's an interesting dimension there that got unlocked that just hadn't existed before um that has these really interesting kind of links into society and a potential building block of future experiences so if you look out into the future do you think we will have beyond a particular game you know a companion like uh like her like the movie her or like a cosmo that kind of asks you how your day went too right you know like a friend how many years away from that do you think we are what's your intuition good question so i think the idea of a different type of character like closer to like kind of a pet style companionship it will come way faster um and there's a few reasons one
is like to do something like in her that's like effectively almost general ai and the bar is so high that if you miss it by a bit you hit the uncanny valley where it just becomes creepy and like not um not appealing um because the closer you try to get to a human in form and interface and voice the harder it becomes whereas you have way more flexibility on still landing a really great experience if you embrace the idea of a character and that's um one of the other reasons why we didn't have a voice uh and also why like a lot of video game characters uh like the sims for example do not have a voice when you think about it it wasn't just a cost savings like for them it was actually for all of these purposes it was because when you have a voice you immediately narrow down the appeal to some particular demographic or age range or um kind of style or gender uh if you don't have a voice people interpret what they want to interpret and an eight-year-old might get a very different interpretation than a 40 year old but you create a dynamic range and so you can lean into these advantages much more um in something that doesn't resemble a human and so that'll come faster i don't know when a human-like one comes that's just uh still like just complete r&d at this point the chat interfaces are getting way more interesting and richer but it's still a long way to go to kind of pass the test of you know well let me like let's consider like let me play devil's advocate so google is a very large company that's servicing it's creating a very compelling product that wants to provide a service to a lot of people but let's go outside of that you said characters yeah it feels like and you also said that it requires general intelligence to be a successful participant in a relationship which could explain why i'm single but i honestly want to push back on that a little bit because i feel like is it possible that if you're just good at playing
a character yeah you're in in a movie there's a bunch of characters if you just understand what creates compelling characters and then you you just are that character and you exist in the world and other people find you and they connect with you just like you do when you talk to somebody at a bar i like this character this character is kind of shady i don't like them you pick the ones that you like and you know maybe it's somebody that's uh reminds you of your father or mother i don't know what it is but the the freudian thing but there's some kind of connection that happens and that's that that's the cosmo you connect to that's the future cosmo you connect and that's so i guess the statement i'm trying to make is it possible to achieve a depth of friendship without solving general intelligence i think so it's about intelligent kind of constraints right and just uh you set expectations and constraints such that in the space that's left you can be successful and so you can do that by having a very focused domain that you can operate in for example you're a customer support agent for a particular product and you create intelligence and a good interface around that or uh you know kind of in the personal companionship side you can't be everything to across the board you you kind of solve those constraints and i think uh i think it's possible my my worry is like i right now i don't see anybody that has picked up on where kind of cosmo left off yes and is pushing on it in the same way and so i don't know if it's a sort of thing where similar to like how you know in dot com there were all these concepts that we considered like you know that didn't work out or like failed or like were too early or whatnot and then 20 years later you have these like incredible successes on almost the same concept like it might be that sort of thing where like there's another pass at it that happens in five years or in 10 years but um it does feel like that appreciation of that like that 
this three-legged stool if you will between like you know the hardware the ai and the character um that balance it's hard i'm not aware of anywhere right now where like that same kind of aggressive drive with the value on the character is uh is happening and so to me just a prediction exactly as you said something that looks an awful lot like cosmo not in the actual physical form but in the three-legged stool something like that in some number of years would be a trillion dollar company i don't understand like it's obvious to me yeah that like character not just as robotic companions but in all our computers they'll be there it's like uh clippy was like two legs of that stool or something like that yeah i mean those are all different attempts and what's really confusing to me is these attempts are born and everybody gets excited and for some reason they die and then nobody else tries to pick it up and then maybe a few years later a crazy guy like you comes around with just enough brilliance and vision to create this thing and it's born a lot of people love it a lot of people get excited but maybe the timing is not right yet and then when the timing is right it just blows up and it just keeps blowing up more and more until it just blows up and i guess everything in the full span of human civilization collapses eventually and that wouldn't surprise me at all and like what's gonna be different in another five years or ten years or whatnot physical component costs will continue to come down uh in price and you know mobile devices and computation is going to become more and more prevalent as well as cloud as a big tool uh to offload cost um ai is going to be a massive transformation compared to what we dealt with uh where um everything from voice understanding to um uh to just you know kind of a broader contextual uh understanding and mapping of semantics and uh understanding scenes and so forth and then the
character side will continue to kind of you know progress as well because that magic does exist it just exists in different forms and you have just the brilliance that's happening in animation and you know these other areas where um that was a big unlock in um you know in film obviously uh and so i think yeah the pieces can reconnect and the building blocks are actually gonna be way more impressive than they were five years ago so in 2019 uh anki the company that created cosmo the company that you started had to shut down how did you feel at that time yeah it was tough uh that was a really emotional stretch and it was a really tough year like about a year ahead of that was actually a pretty brutal stretch because we were um kind of like life or death at many many moments um just navigating these insane kind of just ups and downs and um barriers and the thing that made it like um like just rewinding a tiny bit like what you know what ended up being really challenging about it as a business was um from a commercial standpoint and customer reception standpoint there's a lot of things you could point to that were like you know pretty big successes sold millions of units uh like you got to like pretty serious revenue like kind of close to 100 million annual revenue um uh number one kind of product in kind of various categories but it was pretty expensive it ended up being very seasonal where something like 85 percent of our volume was in q4 because it was a you know a present and it was expensive to market it and explain it and so forth um and even though the volume was like really sizeable and like the reviews were really fantastic um forecasting and planning for it and managing the cash operations was just brutal like it was absolutely brutal you don't think about this when you're starting a company or when you have a few million in you know in revenue because it's just your biggest costs are kind of just your head count and operations
and everything's ahead of you but we got to a point where um you know if you look at the entire year you have to operate your company pay all you know the people and so forth you have to pay for the manufacturing the marketing and everything else to do your sales in mostly november december and then get paid in december january by retailers and those swings were um were really rough um and just made it like so difficult because the more successful we became the more wild those swings became because you'd have to like spend you know tens of millions of dollars on inventory tens of millions of dollars on marketing and tens of millions of dollars on payroll and everything else and then there's the bigger dip and then you're waiting for the q4 yeah and it's not a business that like is recurring kind of month-to-month and predictable and it's just and then you're locking in your forecast in july um you know maybe august if you're lucky um and uh and it's also like very hit driven and seasonal where like you don't have the sort of continued uh kind of slow growth like you do in some other uh consumer electronics industries and so before then like hardware kind of like went out of favor too and so you had fitbit and gopro dropped from 10 billion revenue to 1 billion revenue and hardware companies are getting valued at like 1x revenue oftentimes um which is tough right and so we effectively kind of got caught in the middle where we were trying to quickly evolve out of entertainment and move into some other categories but you can't let go of that business because like that's what you're valued on that's what you're raising money on um but there's no path to kind of pure profitability just there because it was you know such you know uh specific type of price points and so forth and so um we tried really hard to make that transition and um yeah we had a financing round that fell apart at the last second and effectively there was just no path to kind of get
through that and get to the next kind of like holiday season and so we ended up um uh selling some of the assets and kind of winding down the company it was uh it was brutal like we i was very transparent with the company like in the the team while we were going through it where actually despite how challenging that period was very few people left i mean like people loved the vision the team the culture of the like kind of chemistry and kind of what we were doing there was just a huge amount of pride there and we wanted to see it through and we felt like we had a shot to kind of get through these checkpoints um we ended up uh and i mean by brutal i mean like literally like days of cash like three four different times uh runway like in the year you know kind of before it um where you're like playing games of chicken on negotiating credit line timelines and like repayment terms and how to get like a bridge loan from an investor it's just like level of stress that like is as hard as things might be anywhere else like you'll never come you know come close to that where you feel that like responsibility for you know 200 plus people right um and so we were very transparent during our fundraise on who we're talking to the challenges um that we have how it's going and when things are going well when things were tough um and so it wasn't a complete shock when it happened but it was just very emotional where like i you know like you know when we announced it finally that like um you know we you know basically we're just like watching kind of like you know the runway and trying to kind of time it and when we realized that like we didn't have any more outs we wanted to like kind of wind it down make sure that it was like clean and you know we could like kind of take care of people the best we could but yeah like broke down crying at all you know hands and somebody else had to step in for a bit and like it was just very very emotional but the beautiful part is like afterwards 
like everybody stayed at the office to like two three in the morning just like drinking and hanging out and telling stories and celebrating and it was just like one of the best uh for many people was like the best kind of work experience that they had and there was a lot of pride in what we did and there wasn't anything obvious we could point to that like hey if only we had done that different things would have been completely different it was just like the physics didn't line up uh and uh um but the experience was pretty uh incredible but it was hard like it was uh it had this feeling that there was this like incredible beauty in both the technology and products and the team that um uh you know there's there's a lot there that like in the you know right context could have been uh pretty incredible but it was um emotional just yeah just thinking i mean just looking at this company like you said the product and technology but the vision the implementation you got the cost down very low yeah and the compelling the nature of the product was great so many robotics companies failed at this at they the robot was too expensive it didn't have the personality it didn't really provide any value like a sufficient value to justify the price so like you succeeded where basically every single other robotics company or most of them that are like going the category of social robotics have kind of failed and i mean it's uh it's quite tragic i remember uh reading that i'm not sure if i talked to you before that happened or not but i remember you know i'm distant from this i remember being heartbroken reading that because like if if cosmo's not going to succeed what is going to succeed because that to me was incredible like it was an incredible idea cost is down the minimum the the it's just like the most minimal design in physical form that you could do it's really compelling the balance of games so it's a it's a fun toy it's a great gift for all kinds of age groups right it's just 
it's compelling in every single way and it seemed like uh it was a huge success and it it failing was i don't know there was heartbreak on many levels for me just as an external observer is i was thinking how hard is it to run a business that's that's what i was thinking like if this failed this must have failed because uh it's obviously not like yeah it's b it's business yeah maybe it's some aspect of the manufacturing and so on but i'm now realizing it's also not just that it's yeah sales marketing also it's everything right like how do you explain something that's like a new category to people that like how all these previous positions and so like uh you know it it had some of the hardest elements of if you were to pick a business it had some of the hardest uh um customer dynamics because like to sell a 150 product you got to convince both the child to want it and the parents to agree that it's valuable so you're having like this dual prong marketing challenge you have manufacturing you have like really high precision on the components that you need you have the ai challenges so there were a lot of tough elements but is this feeling where like just really great alignment of unique strength across kind of like all these different areas just an incredible like you know kind of character and animation team between this like carlos and there's like a character director day that came on board and like you know really great people there the ai side the um uh the manufacturing the you know where um like never missing a launch right and actually you know he kind of hitting that quality was um yeah it was it was heartbreaking but uh here's one neat thing is like we we had so much like fan mail from kind of kids parents like i actually like there was a bunch they collected in the end yeah that um i actually saved and like i never it was too emotional to open it and i still haven't opened it um and so i actually have this giant envelope of like a stack this much of like 
letters from you know kids and families just like every you know permutation you can imagine and so planning to kind of i don't know maybe like a five year you know eight year some year reunion just inviting everybody over and we'll just like kind of dig into it and um kind of bring back some memories but um you know good impact and uh um well i think there will be companies uh maybe waymo and google will be somehow involved that will carry this flag forward and will make you proud whether you're involved or not i think this is one of the greatest robotics companies in the history of robotics so you should be proud it's still tragic to know that you know because you read all the stories of apple and let's see spacex and like companies that were just on the verge of failure several times through that story and they just it's almost like a roll of the dice they succeeded and here's the roll of the dice that just happened to go and that's the appreciation that like when you really like talk to a lot of the founders like everybody goes through those moments and sometimes it really is a matter of like you know timing a little bit of luck like some things are just out of your control and um uh and you get a much deeper appreciation for um just the dimensionality of that challenge but um the great thing is that like a lot of the team actually like stayed together and so um there were actually a couple of companies where we kind of kept big chunks of the team together and we actually kind of helped align this uh um you know to help people out as well um and one of them was waymo where uh a majority of the ai and robotics team actually had the exact background uh that you would look for in like kind of av space it was a space that a lot of us like you know worked on in grad school were always passionate about and ended up uh you know maybe you know serendipitous timing from another perspective where like uh um
kind of landed in a really unique um circumstance it's actually been quite exciting too so it's interesting to ask you just your thoughts uh cosmo still lives on under digital dream labs i think are you tracking the progress there or is it too much pain is that something that you're excited to see where that goes so keeping an eye on it of course just out of curiosity and obviously just kind of care for the product line i think um it's deceptive how complex it is to manufacture and evolve that product line um and the amount of experiences that are required to complete the picture and be able to move that forward and i think that's going to make it pretty hard to do something really substantial with it it would be cool if like even the product in the way it was was able to be manufactured yes again that would be yeah which would be neat um but uh i think it's deceptive how tricky that is on like everything from the quality control the details and um and then like technology changes that force you to reinvent and update certain things um so uh i haven't been super close to it but just kind of keeping an eye on it yeah it's really interesting how it's deceptively difficult just as you're saying for example those same folks uh and i've spoken with them they partnered up with rick and morty uh creators to uh to do the butter robot yes i love the idea i just recently i've kind of half-assed watched rick and morty previously but now i just watched like the first season it's such a brilliant show i like i did not understand how brilliant that show is and obviously i think in season one is where the butter robot comes along for just a few minutes or whatever but i just fell in love with the butter robot that particular character just like you said there's characters you can create personalities you can create and that particular a robot who's doing a particular task realizes you know this like realizes that's the
existential question this the myth of sisyphus question that uh camus writes about it's like is this all there is because he moves butter but you know that realization that's a that's a beautiful little realization for a robot that my purpose is very limited with this particular task it's abuse it's humor of course it's darkness it's a beautiful mix but so they want to release that butter robot but something tells me that to do the same depth of personality as cosmo had the same richness it would be on the manufacturing on the ai on the storytelling on the design it's going to be very very difficult it could be a cool sort of uh toy for rick and morty fans but to create the same depth of existential angst yeah that the butter robot symbolizes is is really that's the brave effort you succeeded at with cosmo but it's not easy it's really studies and you can fail on almost any one of the kind of dimensions and like uh and yeah it takes you know yeah unique convergence of a lot of different skill sets to try to pull that off yeah on this topic let me ask you for some advice because uh as i've been watching rick and morty i i told myself i have to build the butter robot just as a hobby project and so uh i got a nice platform for it with treads and and there's a camera that moves up and down and so on um i'll probably paint it but the question i'd like to ask there's obvious technical questions i'm fine with communication the personality storytelling all those kinds of things i think i understand the process of that but how do you know when you got it right so with with cosmo how did you know this is great like or um something is off like yeah is this brainstorming with the team do you know it when you see it is it like love at first sight it's like this is right or like i guess if we think of it as an optimization space is there uncanny valley we're like that's not right or this is right or are a lot of characters right yeah we stayed away from uncanny valley just by 
having such a different mapping where it didn't try to look like a dog or a human or anything like that and so uh you avoided having like a weird pseudo similarity but not quite hitting the mark um but you could like just fall flat where just like a personality or a you know character emotion just didn't feel right and so it actually mirrored very closely kind of the iterations that a character director at pixar would have where you're running through it and you can virtually kind of like see what it'll look like we actually used like maya you know the animation tools and then we created a plug-in that perfectly matched it uh to the physical one and so you could like test it out virtually and then push a button and see it physically play out and there's like subtle differences and so you want to like make sure that that feedback loop is super easy to be able to test it live um and then sometimes like you would just feel it that it's right and intuitively know and then we also did user testing but it was very very often that like if we found it magical it would scale and be magical uh more broadly there were not too many cases where like we were pretty decent about not you know geeking out or getting too attached to something that was super unique to us um but trying to kind of like you know put a customer hat on and does it truly kind of feel magical and so in a lot of ways we just gave a lot of um autonomy to the character team to really think about the you know character board and mood boards and storyboards and like what's the background of this character and how would they react um and they went through a process that's actually pretty familiar but now had to operate under these unique constraints um but the moment where it felt right um kind of took a fairly similar journey to like a character in an animated film actually it's quite cool well the
thing that's really important to me and i wonder if it's possible well i hope it's possible pretty sure it's possible is for me even though i know how it works to make sure there's sufficient randomness in the process yeah probably because it would be machine learning based that i'm surprised by certain reactions i'm surprised by certain communication maybe that's in a form of a question um were you surprised by certain things cosmo did like certain interactions yeah we made it intentionally like uh so that there would be some surprise and like a decent amount of variability in how he'd respond in certain circumstances and so in the end like it's um this isn't general ai this is a giant like spectrum and library of like parametrized kind of emotional responses and an emotional engine that would like kind of map your current state of the game your emotions the world the people playing with you and so forth to what's happening um but we could make it feel spontaneous by creating enough diversity uh and randomness uh but still within the bounds of what felt like very realistic um to make that work and then what was really neat is that we could get statistics on how much of that space we were saturating um and then add more animations and more diversity in the places that would get hit more often so that you stay ahead of the um you know the curve and maximize the uh the chance that it stays feeling alive um and so but then when you like combine it like the permutations and kind of like the combinations of emotions stitched together sometimes surprised us because you see them in isolation but when you actually see them and you see them live you know relative to some event that happened in the game or whatnot like it was kind of cool to see the combination of the two and um uh and not too different in other robotics applications where like you get so used to thinking about like the modules of a system and how
things progress through a tech stack that the real magic is when all the pieces come together and you start getting the right emergent behavior um in a way that's easy to lose when you just kind of go too deep into any one piece of it yeah when the system is sufficiently complex there is something like emergent behavior and that's where the magic is you as a human being you can still appreciate the beauty of that magic at the system level first of all thank you for humoring me on this uh it's really really uh fascinating i think a lot of people would love this i'd love to just one last thing on the butter robot i promise in terms of uh speech yeah cosmo is able to communicate so much with just movement and face do you think speech is too much of a degree of freedom like is speech a feature or a bug of uh deep uh interaction emotional interaction yeah for a product it's too deep right now it's just not real uh it would immediately break the fiction because the state of the art is just not good enough um and that's on top of just narrowing down the demographic where like the way you speak to an adult versus the way you speak to a child is very different yet a dog is able to appeal to everybody and so right now there is no speech system that is like rich enough and subtly realistic enough to feel appropriate um and so we very very quickly kind of like moved away from it now speech understanding is a different matter where understanding intent that's a really valuable input um but giving it back requires like a you know way way higher bar given kind of where today's world is and so that realization that you can do surprisingly much with uh either no speech or kind of tonal like the way you know wall-e r2-d2 and kind of other characters are able to um it's quite powerful and it generalizes um across cultures and across ages really really well i think we're going to be in that world for a little while where it's still very much an unsolved problem on how to like
make something it touches on the uncanny valley thing so if you have legs and you're a big humanoid looking thing you have very different expectations and a much narrower degree of what's going to be acceptable by society than if you're a you know robot like uh like cosmo or wall-e or some other form where you can kind of like reinvent the character speech has that same property where speech is so well understood um in terms of expectations by humans that you have far less flexibility on how to deviate from that and lean into your strengths and avoid weaknesses but i wonder if there is obviously there's certain kinds of speech that activates the uncanny valley and breaks the illusion faster so i guess my intuition is we would be able to create some speech-based personalities sooner than others so for example i could think of a robot that doesn't know english and is learning english right yeah those kinds of personalities where you're like uh you're intentionally kind of like getting a toddler level of uh speech so that's exactly right so you can have like uh tie it into the experience where uh it is a more limited character or you embrace the lack of emotions as part or the lack of sorry dynamic range in the speech kind of capabilities emotions as like part of the character itself and you've seen that in like kind of fictional characters as well yeah um but that's why this podcast works and yeah like you kind of had that with like um i don't know i guess like you know data and some of the other yeah like um but yeah so you have to and that becomes a constraint that lets you meet the bar um see i honestly think like also if you add uh drunk and angry that gives you more constraints that allow you to be dumber from an nlp perspective like there's certain aspects so if you modify human behavior like let's just so forget the sort of artificial thing where you don't know english toddler thing if you just look at the full range of
humans i think we there's certain situations where we put up with uh like lower level of intelligence in our communication like if somebody's drunk we understand this issue that they're probably under the influence like we understand that they're not going to be making any sense anger is another one like that i'm sure there's a lot of other kind of situation yeah maybe uh yeah again language loss in translation that kind of stuff that i think if you if you play with that uh what is it the ukrainian boy that passed the turing test you know play with those ideas i think that's really interesting and then you can create compelling characters but you're right that's a dangerous sort of road to walk because uh you're adding degrees of freedom that can get you in trouble yeah and that's why like you have these um big pushes that like for most of the last decade plus like where you'd have like full like human replicas of robots really being down to like skin and like kind of in some places um my personal feeling is like man like that's not the direction that's most fruitful right now um beautiful art yeah it's not in terms of a uh rich deep fulfilling experience yeah you're right yeah and you're creating a minefield of potential places to feel off uh and then and then you're sidestepping where like the biggest kind of functional ai challenges are to actually have you know kind of like really rich productivity that actually kind of justifies a you know kind of the higher price points and that's that's part of the challenge is like yeah like robots are going to get to like thousands of dollars tens of thousands of dollars and so forth but you can imagine what sort of expectation of value that comes with it um and so that's where you want to be able to invest the the the time and uh and depth and so going down the full human replica route um creates a gigantic uh uh distraction and really really high bar that can end up sucking up so much of your resources so it's weird to
say but you happen to be one of the greatest at this point roboticist ever because you created this little guy you were part obviously of a great team that created the the little guy with a deep personality and you're now switching to an entirely well maybe not entirely but a different fascinating impactful robotics problem which is autonomous driving and more specifically the biggest version of autonomous driving which is autonomous trucking so you are at waymo now can you give us a big picture overview what is waymo what is waymo driver what is waymo one what is waymo via can you give an overview of the company and the vision behind the company for sure waymo by the way it's just it's been eye-opening on just how incredible the people and the talent is and how in one company you almost have to create i don't know 30 companies worth of like technology and capability to like kind of solve the full spectrum of it so um yeah so i've been at waymo since um 2019 so about two and a half years so waymo is uh focused on building what we call a driver which is creating the ability to have autonomous driving across different environments vehicle platforms domains and use cases uh you know as you know got started in uh 2009 it was a lot almost like an immediate successor to the grand challenge and urban challenges that were like incredible uh kind of catalyst for this whole space um and so google started this project and then eventually waymo spun out and so what waymo is doing is creating uh the systems both you know hardware software infrastructure and everything that goes into it to enable and to commercialize autonomous driving this hits on consumer transportation and ride sharing and kind of vehicles and urban environments and as you mentioned it hits on autonomous trucking to to transport goods so in a lot of ways it's transporting people and transporting goods um but at the end of the day the underlying capabilities that are required to do that are surprisingly
better aligned than one might expect where it's the fundamentals of um of being able to understand the world around you process it make intelligent decisions and prove that we are at a level of safety that enables uh large-scale autonomy so from a branding perspective sort of uh waymo driver is the system that's irrespective of a particular uh vehicle it's operating in there you have a set of sensors that perceive the world can act in that world and move this whatever the vehicle is whatever that vehicle platform that's right and so in the same way that you have a driver's license and like your ability to drive isn't tied to a particular make and model of a car and of course there's special licenses for other types of vehicles but the fundamentals of a human driver very very largely carry over and then there's uniquenesses related to a particular environment or domain or a particular um vehicle type that kind of add some extra additive challenges but that's exactly right it's the underlying systems that enable uh a physical vehicle without a human driver to uh very successfully accomplish the tasks that previously um weren't possible um without um you know 100 percent human driving and then there's waymo one which is the transporting people that's right from a brand perspective and just in case we refer to it so people know and then there's waymo via which is the trucking component why via by the way what is that what is that what's is it just like a cool sounding name that just yeah uh like is there is there an interesting story there just it is a pretty cool sounding name it's a cool sounding name i mean when you think about it it's just like well we're gonna transport it via this and that like so it's just kind of like an allusion to um the mechanics of transporting something yes cool um and uh and it is a pretty good grouping and the interesting thing is that even the groupings kind of blur where waymo one is like human transportation and uh there's a fully
autonomous service in the phoenix area that like every day is transporting people and it's pretty incredible to like just you know see that operate at reasonably large scale and just kind of happen and then on the via side it doesn't even have to be like long-haul trucking is a like a major focus of uh of ours but down the road you can stitch together the vehicle transportation as well for local delivery um also and a lot of the requirements for local delivery overlap very heavily with consumer transportation um obviously uh you know given that you're operating on a lot of the same roads um and uh and navigating the same safety challenges and so um yeah and waymo very much is a multi-product company that has ambitions in both they have different challenges and both are tremendous opportunities but the cool thing is is that there's a huge amount of leverage and this kind of core technology stack now gets pushed on by both sides and that adds its own unique challenges but the success case is that um the challenges that you push on um they get leveraged across all platforms and also from an engineering perspective the teams are integrated it's a mix so there's a huge amount of centralized kind of core teams that support all applications and so you think of something like the hardware team that develops the lasers the compute integrates into vehicle platforms this is an experience that carries over across um you know any application that we'd have and they have been flow with both then there's like really unique um perception challenges planning challenges like other you know types of challenges where there's a huge amount of leverage on a core tech stack but then there's like dedicated teams that think of how do you deal with a unique challenge for example an articulated trailer with varying loads that completely changes the physical dynamics of a vehicle that doesn't exist on a car but becomes one of the most important kind of unique new challenges on a truck so
what's the long-term dream of waymo via uh the autonomous trucking effort that waymo is doing yeah so we're starting with developing uh l4 autonomy for class 8 trucks these are 53-foot trailers that capture like a big perc a pretty sizable percentage of the goods transportation in the country long term the opportunity is obviously to expand to much more diverse types of vehicles types of goods transportation and start to really expand in both the volume and the route feasibility that's possible and so just like we did on the car side you start with a single route with a very specific operating kind of domain and constraints that allow you to solve the problem but then over time you start to really try to push against those boundaries and open up deeper feasibility across routes across surface streets across environmental conditions across the type of goods that you carry the versatility of those goods and how little supervision is necessary to just start to scale this network and long term there's actually it's a pretty incredible enabler where um you know today you have already a giant shortage of truck drivers it's uh over 80,000 truck driver shortage that's expected to grow to hundreds of thousands in the years ahead you have really really quickly increasing demand from e-commerce and just just distribution of uh where people are located um you have one of the deepest safety challenges of um of any profession in the u.s where um there's a huge huge kind of challenge around fatigue and around kind of the long routes that are driven and even beyond kind of the cost and necessity of it there are fundamental constraints built into our logistics network that are tied to the type of human constraints and regulatory constraints that are tied to trucking today for example our limits on how long a driver can be driving in a single day before they're they're not allowed to drive anymore which is a very important safety constraint what that does is it enforces limitations on
how far jumps with a single driver could be and makes you very subject to availability of drivers which influences where warehouses are built which influences how goods are transported which influences costs and so um you start to have an opportunity on everything from plugging into existing fleets and brokerages and the existing logistics network and just immediately start to have a huge opportunity to add value from you know cost and driving fuel insurance and safety standpoint all the way to completely reinventing the logistics network um across the united states and enabling something completely different than what it looks like today yeah i had uh it'll be published before this had a great conversation with steve viscelli where we talked about the manual driving and he echoed many of the same things that you were talking about but we talked about much of the the fascinating human stories of truck drivers he was also was a truck driver for for a bit as a grad student to try to understand the depth of the problem he's fascinating we have some drivers that have 4 million miles of lifetime driving experience it's pretty incredible and um yeah it's uh yeah learning from them like some of them are on the road for 300 days a year it's a very unique type of lifestyle so there's fascinating stuff there just like you said there's a shortage of actually people uh truck drivers taking the job counter to what i think is publicly believed so there's an excess of jobs and a shortage of people to take up those jobs and just like you said it's such a difficult problem and these are experts at driving it's solving this particular problem and it's fascinating to learn from them to understand you know how hard is this problem and that's the question i want to ask you from a perception from a robotics perspective what's your sense of how difficult is autonomous trucking maybe you can comment on which scenarios are super difficult which are more manageable is there is there a
way to kind of convert into words how difficult the problem is yeah it's a good question so there's um and as you can expect it's a mix some things become a lot uh uh a lot easier or at least more flexible um some things are harder and so you know on the things that are like uh the tailwinds the benefits um a big focus of automating trucking especially initially is really focusing on the long-haul freeway stretch of it where that's where a majority of the value is captured on a freeway you have a lot more structure and a lot more consistency across freeways across the u.s compared to surface streets where you have a way higher dimensionality of what can happen lack of structure lack of consistency and variability across cities so you can leverage that consistency to tackle at least in that respect a more constrained ai problem which has some benefits to it um you can itemize much more of the sort of things you might encounter and so forth and so those are benefits is there a canonical freeway and city we should be thinking about like is there is there a standard thing that's brought up in conversation often like here's a stretch of road um what is it like when people talk about traveling across country they'll talk about new york versus san francisco is that the route like is there a stretch of road that's like nice and clean and then there's like cities with difficulties in them that you kind of think of as the canonical problem to solve here right uh so starting with the car side um well waymo very intentionally picked the phoenix area and the san francisco area as a follow once we hit driverless where when you think of consumer transportation and ride sharing you know kind of economy a big percentage of that market is captured in the densest cities in the united states and so really pushing out and solving san francisco becomes a really huge opportunity and uh importance and um and you know places one dot on kind of like the spectrum of like kind of complexity
uh the phoenix area starting with chandler and then like kind of expanding more broadly in the phoenix uh metropolitan area it's i believe the fastest growing city in the us it's a uh kind of a higher medium-sized city but growing quickly and still captures a really wide range of kind of like complexities and so getting to driverless there actually exposes you a lot of the building blocks you need for the more complicated environments and so in a lot of ways there's a thesis that if you start to kind of place a few of these kind of dots where san francisco has these types of unique challenges dense pedestrians all this like complexity especially when you get into the downtown areas and so forth and phoenix has like a really interesting kind of spectrum of challenges maybe you know other ones like la kind of add freeway focus and so forth you start to kind of cover the full set of features that you might expect and it becomes faster and faster if you have the right systems in the right organization to then open up the fifth city and the tenth city and the 20th city on trucking there's uh similar properties where um obviously there's uniquenesses and freeways when you get into really dense environments and then the real opportunity uh to then you know get even more uh value is to think about how you expand with like some of the surface street challenges but for example right now we're looking um we have a big facility that we're uh finishing building in q1 in uh dallas area um that'll allow us to do testing from the dallas area on routes like dallas to houston dallas to phoenix um going out east and dallas to austin so that triangle um waymo should come to austin well waymo the car side was in austin for a while yes i know yeah come back yeah but uh trucking is actually texas is one of the best places to start uh because of both volume regulatory weather there's a lot of benefits um on trucking a huge opportunity is port of la going east so in a lot of ways a lot of
the work is to start to stitch together a network and converge to port of la where you have the biggest port in the united states um and the amount of goods going east from there is pretty tremendous and then obviously there's you know kind of channels everywhere and you have extra complexities as you get into like snow and inclement weather and so forth but um what's interesting about trucking is every single route segment that you add increases the value of the whole network and so it has this kind of network effect and cumulative effect that's very unique and so there's all these dimensions that we think about um and so in a lot of ways dallas as a really unique hub that opens up a lot of options has become a really valuable lever so the million questions i get asked first of all you mentioned level four for people who totally don't know there's these levels of automation that uh level four refers to uh kind of the first step that you could recognize is fully autonomous driving level five is really fully autonomous driving level four is kind of fully autonomous driving and then there are specific definitions depending on who you ask what that actually means but for you what does the level four mean and you mentioned freeway let's say like there's three parts of long-haul trucking maybe i'm wrong in this but there's freeway driving there's like truck stop and then there's more urban-y type of area so which of those do you want to tackle which of them do you include under level four like how do you think about this problem what do you focus on where's the biggest impact to be had in the short term so the goal is to we get we got to get to market as fast as we can because the moment you get to market you just learn so much and it influences everything that you do and it is um uh i mean one of the experiences that carried over from before is that you add constraints you figure out the right compromises you do whatever it takes because getting to market like is so
critical right and here with autonomous driving you can get to market in so many different ways that's right and so one of the simplified simplifications that we intentionally have put on is using what we call transfer hubs where you can imagine depots uh that are uh at the entry points to metropolitan areas like let's say dallas like the hub that we're building which does a few things that are very valuable so from a first product standpoint you can automate transfer hub to transfer hub and that path from the transfer hub to the you know the full freeway route can be a very intentional single route that you can select for the features that you feel you want to handle at that point in time then you build the hub specifically designed for autonomous trucking and that's what's going to happen actually like and you get you need to come out in january and check it out because it's going to be really cool it's the not only is it our main operating um headquarters for our fleet there but it will be the first uh fully ground-up designed driverless hub for autonomous drivers autonomous trucks in terms of where do they enter where do they depart how do you think about the flow of people goods everything it's like it's quite cool and it's really beautiful on how it's thought through and so early on it is totally reasonable to do the last five miles manually to get to the final kind of depot to avoid having to solve the general surface street problem which is obviously very complex now when the time comes and we are increasingly we're already we're pushing on some of this but we will increasingly be pushing on surface street capabilities to build out the value chain to go all the way deeper to depot instead of transfer hub to transfer hub and we have probably the best advantages in the world because of all the waymo experience on surface streets but that's not the highest roi right now where the highest roi is hub to hub and get the routes going and so when you ask what's l4 l4 can
be applied to any domain operating domain or scope but it's effectively for the places where we say we're ready for autonomous operation we are 100 percent operating uh with uh through the as a self-driving truck with no uh human behind the wheel that is l4 autonomy and it doesn't mean that you operate in every condition it doesn't mean you operate on every road but for a particularly well-defined area uh operating conditions routes kind of domain you are fully autonomous and that's the difference between l4 and l5 and most people would agree that at least any time in the foreseeable future l5 is just not even really worth thinking about because there's always going to be these extremes and so it's a race and a almost like a game where you think of what is the sequence of expanded capabilities that create the most value and teach us the most and create this feedback loop where we're building out and unlocking more and more capability over time i gotta ask you just curious so first of all i have to when i'm allowed to visit the dallas facility because it's super cool it's like robot on the giving and the receiving end it's the truck is a robot and the the hub is a robot yeah it's got to be very robot friendly so yeah that's great i will feel at home uh the what's the sensor suite like on the hub if you can just high level mention it is does the hub have like lidars and like is is it is the truck doing most of the intelligence or is the hub also intelligent yeah so most of it will be the truck and uh everything is like connected like so we uh we have our servers where we know exactly where every truck is we know exactly what's happening at a hub and so you can imagine like a large back-end system that over time starts to manage uh timings goods delivery windows all these sort of things and so you don't actually uh need to um there might be special cases where that is valuable to equip some sensors in the hub but a majority of the intelligence is going to be on the truck
because um whatever is relevant to the truck should be seen by the truck and can be relayed uh remotely for any sort of kind of cognizance or decision making but there's a distinct type of workflow where um where do you check trucks where do you want them to enter what if there's many operating at once where's the staging area to depart how do you set up the flow of humans and human cars and traffic so that you minimize the interaction between humans and kind of self-driving trucks uh and then how do you even intelligently select the locations of these transfer hubs that are both really great service locations for a metropolitan area and there could be over time many of them for a metropolitan area while at the same time leaning into the path of least resistance to lean into your current capabilities and strengths so that you minimize the amount of work that's necessary to unlock the next kind of big bar i have a million questions so first is the goal to have no human in the truck the goal is to have no human in the truck now of course right now we're testing with expert operators and so forth but um the goal is to um now there might be circumstances where it makes sense to have a human or uh and and obviously these trucks can also be manually driven so sometimes like our we talk with our fleet partners about how um you can buy a waymo equipped daimler truck down the road and on the routes that are autonomous it's autonomous on the routes that are not it's um human driven maybe there's l2 functionality that add safety systems and so forth but as soon as they become as soon as we expand in software the availability of driverless routes the hardware is forward compatible to just now start using them um in real time and so you can imagine uh this mixed use but at the end of the day the largest value proposition is where you're um able to have no constraints on how you can operate this truck um and it's 100 percent autonomous with nobody inside oh that's amazing so
the let me ask on the logistics front because you mentioned that also opportunity to revamp or build from scratch some of the ideas around logistics i don't want to throw too much shade but from talking to steve my understanding is logistics is not perhaps as great as it could be in the current uh trucking uh environment i'm not maybe you can break down why but there's probably competing companies there's just a mess maybe some of it is literally just it's old school like they it's just like it's not computer it's not computerized like truckers are almost like contractors there there's an independence and there's not a nice interface where they can communicate where they're going where they're at you know all those kinds of things and so there it just feels like there's so much opportunity to digitize everything to where you could optimize the use of human time optimize the use of all kinds of resources how much are you thinking about that problem how fascinating is that problem how difficult is it how much opportunity is there to revolutionize the space of logistics in autonomous trucking in trucking period it's pretty fascinating it's uh this is one of the most motivating aspects of all this where like yes there's like a mountain of problems that are like you wanna you have to solve to get to like the first checkpoints and first driverless and so forth and inevitably like in a space like this you plug in initially into the existing kind of system and start to kind of you know learn and iterate but um that opportunity is massive and so you know a couple of the factors that um play into it so first of all um there's obviously just the physical constraints of driving time driver availability some fleets have a 95 percent attrition rate you know right now because of just the demands and like you know kind of gaps in compensation and so forth and then it's also incredibly fragmented where you would be shocked at like when you when you look at industries like when you think
of the top 10 players like the biggest fleets like the walmarts and fedexes and so forth the percentage of the overall trucking market that's captured by the top 10 or 50 fleets is surprisingly small um the average kind of uh truck operation is like a one to five truck you know family business um and so and so there's just like a huge amount of like fragmentation which makes for um really interesting challenges in kind of stitching together through like bulletin boards and brokerages and some people run their own fleets and and this world's kind of like evolving um but it is one of the less digitized and optimized worlds that there is and the part that is optimized is optimized to the constraints of today and even within the constraints of today this is a 900 billion dollar industry in the u.s and it's continuing to grow it feels like from a business perspective if i were to predict that while trying to solve the autonomous trucking problem waymo might solve first the logistics problem like because that that would already be a huge impact yeah so on the way to solving autonomous trucking the human driven like there's so much opportunity to significantly improve the human driven trucking the timing the logistics so you use humans optimally the handoffs the like you know well even that you i mean you get really ambitious you start to expand this beyond like how does the uh fulfillment center work and like how does the transfer hub work how does a warehouse work to i mean there's a lot of opportunities to start to automate these chains and um a lot of the inefficiency today is because like you have a delay like port of la has a bunch of ships right now waiting outside of it because they can't dock because there's not enough labor inside of the port of la that means there's a big backlog of trucks which means there's a big backlog of deliveries which means the drivers aren't where they need to be and so you have this like huge chain reaction and your feasibility of
readjusting in this network is low because everything's tied to humans and manual kind of processes uh or distributed processes across a whole bunch of players and so one of the biggest enablers is um yes we have to solve autonomous trucking first and that by the way that's not like an overnight thing that's decades of continued kind of expansion and work but um the first checkpoint in the first route is like is not that far off but once you start enabling and you start to learn about how the constraints of autonomous trucking which are very different than the constraints of human trucking and again strengths and weaknesses how do you then start to leverage that and rethink a flow of goods uh more broadly and this is where like the learnings of like really partnering with some of the largest fleets in the us and the sort of learnings that they have about the industry and the sort of needs that they have and what would change if you just like really broke this one constraint that like holds up the whole network or what if you enabled this other constraint that actually drives the roadmap in a lot of ways because um this is not like an all or nothing problem it's uh you know you start to kind of unlock more and more functionality over time which functionality most enables this optimization ends up being kind of part of the discussion but you're totally right like you fast forward to like you know five years ten years uh 15 years and you think about like very generalized capability of automation and logistics as well as the ability to like poke into how those handoffs work the efficiency goes far beyond just direct cost of today's like unit economics of a truck they go towards reinventing the entire system um in the same way that uh you know you see you know these other industries that uh like when you get to enough scale you can really rethink um how you build around your new set of capabilities not the old set of capabilities yeah use the analogy metaphor or whatever
that autonomous trucking is like email versus mail and then with email you're still doing the communication but it opens up all kinds of comm varieties of communication that you didn't anticipate that's right constraints are just completely different um and yeah there's definitely a property of that here um and we're also still learning about it because there there is a lot of really um fascinating and sometimes really elegant things that the industry has done where there's companies whose entire existence is around despite the constraints optimizing as much as they can out of it and those lessons do carry over but it's an interesting kind of merger of worlds to think about like well what if this was completely different how would we approach it and the interesting thing is that for a really really really long time it's actually going to be the merger between how to use autonomy and how to use humans that leans into each each of their strengths yeah and then we're back to cozmo human robot interaction so and the interesting thing about waymo is because there's the passenger vehicle the the human the transportation of humans and transportation of goods you could see over time they might kind of meld together more because you you'll probably have like zero occupancy vehicles moving around so you have transportation goods for short distances and then for slightly longer distances and then slightly longer and then there'll be this then you just see the difference between a passenger vehicle and a truck is just size and you can have different sizes and all that kind of stuff and at the core you can have a waymo driver that doesn't as long as you have the same sensor suite you can just think of it as one problem and that's why over time these do come kind of converge where in a lot of ways a lot of the challenges we're solving are freeway driving which are going to carry over very well to the vehicles to the car side um but there are like then unique challenges like
uh you have a very different dynamics in your vehicle where you have to see much further out in order to have the proper like response time because you have an 80,000 pound fully loaded truck um that's a very very different type of braking profile than a than a car you have uh really interesting kind of dynamic limits because of the trailer where you actually it's very very hard to like physically like flip a car or do something like physically like most risk in a car is from just collisions um it's very hard to like in any normal operation to do something other than like you know unless you hit something it's actually kind of like roll over or something on a truck you actually have to drive much closer to the physical bounds of the safety limits um but you actually have like real constraints because you could uh you know you could have a really interesting interactions between the cabin and the trailer yeah there's something called jackknifing if you turn you know too quickly you have roll risks and so forth and so we spend a huge amount of time understanding those boundaries and those boundaries change based on the load that you have which is also an interesting difference you have to propagate that through the algorithm so that you're leveraging your dynamic range but always staying within the safety bounds but understanding what those safety bounds are and so we have this like really cool test facility where we like take it to the max and actually imagine a truck with these giant training wheels on the back of the trailer and you're pushing it past the safety limits uh in order to like try to actually see where it rolls and so you you you define this high dimensional boundary which then gets captured in software to stay safe and actually do the right thing but uh it's kind of fascinating the sort of uh you know kind of challenges you have there um but then all of these things drive really interesting challenges from perception to um unique
behavior prediction challenges, and obviously in the planner, where you have to think about merging and creating gaps with a 53-foot trailer and so forth. And then obviously the platform itself is very different, where you have different numbers of sensors, sometimes different types of sensors, and you also have unique blind spots because of the trailer, which you have to think about. So it's a really interesting spectrum, and in the end you try to capture these special cases as clean augmentations of the existing tech stack, because a majority of what we're solving is actually generalizable to freeway driving and different platforms. Over time they all start to merge, ideally, where the things that are unique are as minimal as possible, and that's where you get the most leverage. That's why Waymo can take on two trillion-dollar opportunities and be nowhere near 2x the cost or investment or size. In fact, it's much, much smaller than that because of the high degree of leverage.

So what kind of sensor suite, if you can speak to that, does a long-haul truck need to have? Lidar, vision, how many? What are we talking about here?

Yeah, so it's more than the cars. Very loosely, you can think of it as 2x, but it varies depending on the sensor. We have dozens of cameras, radar, and then multiple lidar as well. You'll see one difference: the cars have a central main sensor pod on the roof in the middle, and then some hood sensors for blind spots. The truck moves to two main sensor pods on the outsides, where you would typically have the mirrors next to the driver. They effectively go as far out as possible, toward the front of the cabin; not all the way in the front, but about where the mirrors for the driver would be. Those are the main sensor pods, and the reason they're there is that if you had one in the middle, the trailer is higher than the cabin and you
would be occluded with this awkward wedge.

Too much occlusion.

Too much occlusion, and so then you would add a lot of complexity to the software to make up for that, and just unnecessary components.

So many probably fascinating design choices. Really cool, because you could probably bring the lidar up higher and have it in the center or something. You have all kinds of choices, and you have to make the decisions here that ultimately probably will define the industry.

Right. But by having two on the side there are actually multiple benefits. One is that you're just beyond the trailer, so you can see fully flush with the trailer, and so you eliminate most of your blind spot except for right behind the trailer, which is great, because now the software carries over really well. The same perception system you use on the car side, largely that architecture can carry over; you can retrain some models and so forth, but you leverage it a lot. It also actually helps with redundancy, where there's a really nice built-in redundancy for all the lidar, cameras, and radar, where you can afford to have any one of them fail and you're still okay. And at scale, every one of them will fail.

And you will be able to detect when one of them fails, because with the redundancy they're giving you data that's inconsistent with the rest.

That's right, and it's not just that they no longer give data. It could be that they're fouled, or they stop giving data because some electrical thing gets cut, or part of your compute goes down. So what's neat is that you have way more sensors: part of it is field of view and occlusions, part of it is redundancy, and part of it is new use cases. There are new types of sensors to optimize for long range and the sensing horizon that we look for on our vehicles, which is unique to trucks because it's actually much further out than a car. But a majority are
actually used across both cars and trucks, and so we use the same compute, the same fundamental baseline sensors, cameras, radar, IMUs, and so you get great leverage from all of the infrastructure and the hardware development as a result.

So what about cameras? Lidar is this rich set of information that has its strengths and has some weaknesses; camera is this rich source of information that has some strengths and has its weaknesses. What role does lidar play, what role do vision cameras play, in this beautiful problem of autonomous trucking?

It is beautiful. There's so much that comes together.

And at which point do they come together?

Yeah, so let's start with lidar. Lidar has been one of Waymo's big strengths and advantages. We developed our own lidar in-house, we're many generations in, and in both cost and functionality it is the best in the space.

Which generation? Because I love versions that are increasing. Which version of the hardware stack is it at currently, officially, publicly?

So some parts iterate more than others. I'm trying to remember on the sensor side. The entire self-driving system, which includes sensors and compute, is fifth generation.

I can't wait until there are iPhone-style announcements for new versions of the Waymo hardware.

Yeah, well, we try to be careful, because when you change the hardware it takes a lot to retrain the models and everything. We just went through that in going from the Pacificas to the Jaguars, and so the Jaguars and the trucks have the same generation now. But yeah, the lidar is incredible, and so Waymo has leaned into that as a strength. A lot of the near-range perception system, which obviously carries over a lot from the car side, uses lidar as a very prominent primary
sensor. But then obviously everything has its strengths and weaknesses. In the near range lidar is a gigantic advantage, and it has its weaknesses when it comes to occlusions in certain areas, rain and weather, things like that. But it's an incredible sensor, and it gives you incredible density, perfect location precision, and consistency, which is a very valuable property to be able to apply an ML approach.

Can you elaborate on consistency?

Yeah. When you have a camera, the position of the sun, the time of day, and various other properties can have a big impact: whether there's glare, the field of view, things like that.

So, consistency of the signal in the face of a changing external environment.

Yeah, daytime, nighttime. It's about 3D physical existence, in effect. You're seeing beams of light that physically bounce off of something and come back, and so whatever the conditions are, the shape of a sensor reading from a human or from a car or from an animal has a reliability there, which ends up being valuable for the long tail of challenges. Now, lidar is the first sensor to drop off in terms of range. Ours has a really good range, but at the end of the day it drops off. So particularly for trucks, on top of the general redundancy that you want for near range, complemented through cameras and radar for occlusions and for complementary information and so forth, when you get to long range you have to be radar and camera primary, because your lidar data will fundamentally drop off after a certain point and you have to be able to see objects further out. Now, cameras have incredible range: with a high-density, high-resolution camera you can get data well past a kilometer, and it's potentially a huge value. Now, the signal drops off, the noise is higher, detecting is harder,
classifying is harder, and one that you might think about, localizing, is harder, because you can be off by two meters in where something's located a kilometer away, and that's the difference between being on the shoulder and being in your lane. So you have interesting challenges there that you have to solve, which have a bunch of approaches that come into it. Radar is interesting because it also has longer range than lidar, and it gives you speed information, so it becomes very, very useful for dynamic information on traffic flow, vehicle motions, animals, pedestrians, just things that might be useful signals. And it helps with weather conditions, where radar actually penetrates weather in a better way than other sensors. So it's interesting where we've started to converge towards not thinking about a problem as a lidar problem or a camera problem or a radar problem, but as a fusion problem, where these are all large-scale ML problems where you put data into the system and in many cases you just look for the signals that might be present in the union of all of these, and leave it to the system as much as possible to start to really identify how to extract that. Then there are places where we have to intervene and actually include more, but no single sensor is in a great position to really solve this problem without a huge extra challenge.

That's fascinating. There's a question that's probably still an open question: at which point do you fuse them? Do you solve the perception problem for each sensor suite individually, the lidar suite and the camera suite, or do you do some kind of heterogeneous fusion, or do you fuse at the very beginning? Is there a good answer, or at least an inkling of intuition?

Yeah, so people refer to this as early fusion or late fusion. Late fusion might be that you have the
camera pipeline and the lidar pipeline, and then you fuse them: when it gets to final semantics and classification and tracking, you fuse them together and figure out which one's best. There's more and more evidence that early fusion is important, and that is because late fusion does not allow you to pick up on the complementary strengths and weaknesses of the sensors. Weather is a great example. Rain is an incredibly hard problem for any single sensor to solve, because you have reflections from the lidar, you have weird noise from the camera, and so on. But if you do early fusion, the combination of all of them can help you filter and get to the real signal, which then gets you as close as possible to the original stack, and lets you be much more fluid about the strengths and weaknesses: your camera is much more susceptible to fouling on the actual lens from rain or random stuff, whereas lidar might be a little bit more resilient than other sensors. So there's an element of logic that always happens late in the game, but that fusion early on, especially as you move towards ML and large-scale data-driven approaches, just maximizes your ability to pull out the best signal you can out of each modality before you start making constraining decisions that end up being hard to unwind late in the stack.

So how much of this is a machine learning problem? What role does ML play in this whole problem of autonomous driving, autonomous trucking?

It's massive, and it's increasing over time. If you go back to the Grand Challenge days, the early days of AV development, there was ML, but it was not the mass-scale, data-style ML. It was learning models, but in a more structured kind of way, and it was a lot of heuristic and search-based
approaches in planning and so forth. You can make a lot of progress with these types of approaches across the board, an almost deceptive amount of progress. You can get pretty far, but then you start to really grind the further you get in some parts of the stack if you don't have an ability to absorb a massive amount of experience in a way that scales very sublinearly in terms of human labor and human attention. When you look at the stack, the perception side is probably the first to get really revolutionized by ML, and it goes back many years, because ML for computer vision and these types of approaches was a lot of the early push in deep learning. There's always a debate on the spectrum between end-to-end ML, which is a little bit too far, and how you architect it so that you have modules but enough ability to think about long-tail problems and so forth. But at the end of the day you have big parts of the system that are very ML and data driven, and we're increasingly moving in that direction all the way across the board, including behavior. Even when it's not a gigantic ML problem that covers a giant swath end to end, more and more parts of the system have this property where you want to be able to put more data into it and it gets better. That has been one of the realizations: as you drive tens of millions of miles and try to solve new expansions of domains without regressing in your old ones, it becomes intractable for a human to approach that in the way that robotics has traditionally approached some elements of the tech stack.

So are you trying to create a data pipeline specifically for the trucking problem? How much leveraging of the autonomous driving is there in terms of data collection, and how unique is the data required for the trucking problem?

So we use all the same infrastructure, so
labeling workflows, ML workflows, everything. That actually carries over quite well. We heavily reuse the data, even: almost every model that we have on a truck, we started with the latest car model.

Cool, so it's almost like a good background model.

Yeah. Despite the different domain and the different numbers and positions of sensors, there are a lot of signals that carry over across driving, and so it's almost like pre-training and getting a big boost out of the gate, where you can reduce the amount of data you need by a lot. And it goes both ways, actually, so we're increasingly thinking about our data strategy and how we leverage both of these. You think about how other agents react to a truck: it's a little bit different, but the fundamentals of what other vehicles on the road will do, there's a lot of carryover that's possible. In fact, just to give you an example, we're constantly adding more data from the trucking side, but as of right now, for one of our models, behavior prediction for other agents on the road like vehicles, 85 percent of that data comes from cars, and a lot of that 85 percent comes from surface streets, because we just had so much of it and it was really valuable. We're adding in more and more, particularly in the areas where we need more data, but you get a huge boost out of the gate.

All the different visual characteristics of roads, lane markings, pedestrians, all of that is still relevant.

It's all still relevant. And then just the fundamentals of how you detect a car: does it really change that much whether you're detecting it from a car or a truck? The fundamentals of how a person will walk around your vehicle will change a little bit, but the basics, there's a lot of signal in there that, as a starting point to a network, can actually be very valuable. Now, we do have some very unique challenges, where there's a
sparsity of events on a freeway. The frequency of events happening on a freeway, whether it's interesting objects in the road, or incidents, or even, as a human benchmark, how often a human has an accident on a freeway, is far more sparse than on a surface street. That leads to really interesting data problems where you can't just drive infinitely to encounter all the different permutations of things you might encounter, and so there you get into interesting tools like structured testing, data collection, data augmentation, and so forth. So there are really interesting technical challenges that push some of the research that enables these new suites of approaches.

What role does simulation play?

Really good question. Waymo simulates about a thousand miles for every mile it drives.

So across the board?

Across the board, yeah. So, for example, if we've driven over 20 million miles, that's over 20 billion miles in simulation. Now, how do you use simulation? It's multi-purpose. You use it for basic development: you want to make sure you have regression prevention and protection of everything you're doing. That's an easy one. When you encounter something interesting in the world, let's say there was an issue with how the vehicle behaved versus an ideal human, you can play that back in simulation and start augmenting your system and seeing how you would have reacted to that scenario with this improvement or this new area. You can create scenarios that become part of your regression set after that point. Then you start getting into hill climbing, where you say: I need to improve this system, I have these metrics that are really correlated with final performance, how do I know how well I'm doing? The actual physical driving is the least efficient form of testing, and it's expensive, it's time
consuming. So grabbing a large-scale batch of historical data and simulating it to get a signal of, over these last 100,000 miles, or just a random sample of 100,000 miles, how has this metric changed versus where we are today: you can do that far more efficiently in simulation than by just driving with that new system on board. And then you go all the way to the validation phase, where you actually see your human-relative safety, how well you're performing on the car side or the trucking side relative to a human. A lot of that safety case is actually driven by taking all of the physical operational driving, which probably includes a lot of interventions where the driver took over just in case, and then you simulate those forward and see whether anything would have happened. In most cases the answer is no, but you can simulate it forward, and you can even start to do really interesting things where you add virtual agents to create harder environments, you can fuzz the locations of physical agents, you can muck with the scene and stress test the scenario along a whole bunch of different dimensions. Effectively you're trying to more efficiently sample this infinite-dimensional space and encounter the problems as fast as possible, because what most people don't realize is that the hardest problem in autonomous driving is actually the evaluation problem, in many ways, not the actual autonomy problem. If you could in theory evaluate perfectly and instantaneously, you could solve that problem in a really fast feedback loop quite well. But the hardest part is being really smart about this suite of approaches: how can you get an accurate signal on how well you're doing as quickly as possible, in a way that correlates to physical driving?

And in the evaluation problem, which metric are you evaluating towards? We're talking about safety; what are the performance metrics that we're talking about?

So in the end you care about end safety;
that's, in the end, what keeps you from going driverless. That's what's deceptive: there are a lot of companies that have a great demo, but the path from a really great demo to being able to go driverless can be deceptively long, even when that demo looks like it's driverless quality. The difference is that the thing that keeps you from going driverless is not the stuff you encounter in a demo; it's the stuff that you encounter once in a hundred thousand miles or 500,000 miles. That is at the root of what is most challenging about going driverless, because any issue you encounter you can go and fix, but how do you know you didn't create five other issues that you haven't encountered yet? Those were painful learnings in Waymo's history that Waymo went through, and they led to us then finally being able to go driverless in Phoenix, and they are now at the heart of how we develop. Evaluation is simultaneously evaluating final end safety, how ready you are to go driverless, which may be as direct as what your human-relative collision rate is for all these types of scenarios and severities, to make sure that you're better than a human bar by a good amount. But that's not actually the most useful for development. For development it's much more analog metrics, and part of the art is finding the properties of driving that give you a way quicker signal, one that's more sensitive than a collision, that can correlate to the quality you care about and push the feedback loop to all of your development. A lot of these are, for example, comparisons to human drivers, like manual drivers: how do you do relative to a human driver along various dimensions and in various circumstances?

Can I ask a tricky question? If I brought you a truck, how would you test it? Okay, Alan Turing came along and you said, this one, you can't tell if it's a human driver or...

Yeah, exactly. Yeah,
but not the human, because, you know, humans are flawed. But yeah, how do you actually know you're ready? Basically, how do you know it's good enough?

Yeah, and by the way, this is the reason why Waymo released the safety framework for the car side. One, it sets the bar so nobody cuts below it and does something bad for the field that causes an accident. Two, it's to start the conversation on framing what this needs to look like. It's the same thing we'll end up doing for the trucking side. There it ends up being a different portfolio of approaches. There are easy things, like: are you compliant with all these fundamental rules of the road, like you never drive above the speed limit? That's actually pretty easy. You can fundamentally prove that it's either impossible to violate that rule, or you can itemize the scenarios where that comes up, do a test, and show that you pass that test and therefore can handle that scenario. Those are traditional structured-testing, systems-engineering approaches where you can just quantify. Fault rates is another example: when something fails, how do you deal with it? You're not going to drive and randomly wait for it to fail. You're going to force a failure and make sure that you can handle it, on closed courses, in simulation, or on the road, and run through all the permutations of failures, which you can oftentimes itemize for some parts of the system, like hardware. The hardest part is behavioral, where you have just infinite situations that could in theory happen, and you want to figure out the combinations of approaches that can work there. You can probably pass the Turing test pretty quickly, even if you're not completely ready for driverless, because the events that are really hard will not happen that often. Just to give you a perspective, a human has a serious accident on a
freeway, say a truck driver on a freeway, once every 1.3 million miles, and something that actually has a really serious injury happens once every 28 million miles. Those are really rare, so you could have a driver that looks like it's ready to go, but you have no signal on what happens there. That's where you start to get creative with combinations of sampling and statistical arguments, focused structured arguments where you can simulate those scenarios and show that you can handle them, and metrics that are correlated with what you care about but that you can measure much more quickly to get to a right answer. That's what makes it pretty hard. In the end, you end up borrowing a lot of properties from aerospace and space shuttles and so forth, where you don't get the chance to launch a million times just to say you're ready, because it's too expensive to fail. So you go through a huge amount of structured approaches in order to validate it, and then by thoroughness you can make a strong argument that you're ready to go. This is actually a harder problem in a lot of ways, though, because you can think of a space shuttle, or an airplane, as getting to a fixed point, and then you freeze the software, you prove it, and you're good to go. Here you have to get to a driverless quality bar but then continue to aggressively change the software even while you're driverless.

And also the full range of environment: there's an external environment, where with the shuttle you're basically testing the systems, the internal stuff, and you have a lot of control over the external stuff.

Yeah, and the hard part is: how do you know you didn't get worse in something that you just changed? So in a lot of ways the Turing test starts to fail pretty quickly, because you start to feel driverless quality pretty early in that curve. If
you think about it, in most really good AV demos, maybe you'll sit there for 30 minutes, right? So you've driven 15 miles or something like that. To go driverless, with the sort of rate of issues that you need to have, you won't even encounter them.

So let's try something different, then. Let's try a different version of the Turing test, which is like an IQ test. There are these questions of increasing difficulty, they're designed, you don't know them ahead of time, nobody knows the answer to them. Is it possible in the future to orchestrate basically an obstacle course, almost, that maybe changes every year, and that represents... if you can pass these, they don't necessarily represent the full spectrum.

That's it, yeah. They won't be conclusive, but you can at least get a really quick read and filter.

Yeah, because you didn't know them ahead of time. I don't know, probably construction zones, failures, or driving anywhere in Russia.

Yeah, weather, cut-ins, dense traffic, merging, lane closures, animals, foreign objects on the road that pop out on short notice, mechanical failures, a sensor breaking, a tire popping, weird behaviors by other vehicles like a hard brake or something reckless that they've done, fouling of sensors like bugs or bird poop or something. You have these extreme conditions where you have a nasty construction zone where everything shuts down and you have to get pulled to the other side of the freeway with a temporary lane. Those are the sort of conditions where we do that to ourselves: we itemize everything that could possibly happen to give you a starting point for how to think about what you need to develop. And at the end of the day there's no substitute for real miles. If you think of traditional ML, you know how
there's a validation set where you hold out some data? Real-world driving is the ultimate validation set. That's, in the end, the cleanest signal. But you can do a really good job of creating an obstacle course, and you're absolutely right: in the end, if there was such a thing as automating a kind of readiness test, it would be these extreme conditions, like a red-light runner, a really reckless pedestrian who's jaywalking, a cyclist who makes a really awkward maneuver. That's actually what keeps you from going driverless. In the end, that is the long tail.

Yeah, and it's interesting to think about that. To me, that is the Turing test. The Turing test means a lot of things, but to me, in driving, the Turing test is exactly this validation set that is handcrafted. I don't know if you know him, there's a guy named François Chollet. He thinks about how to design a test for general intelligence. He designs these IQ tests for machines, and the validation set for him is handcrafted, and it requires human genius or ingenuity to create a really good test. And you truly hold it out. It's an interesting perspective on the validation set, which is: make that as hard as possible. Not a generic representation of the data, but the hardest stuff.

Yeah. It's like Go: you'll never fully itemize all the world states that you'll expand, and so you have to come up with different approaches. This is where you start hitting the struggles of ML, where ML is fantastic at optimizing the average case. It's a really unique craft to think about how you deal with the worst case, which is what we care about in the AV space, when using an ML system on something that occurs super infrequently. You don't really care about the worst case on ads, because if you miss a few it's not a big deal, but you do care about it on the driving side,
and so typically you'll never fully enumerate the world, and so you have to take a step back and abstract away what the signals are that you care about, and the properties of a driver that correlate to defensive driving and avoiding nasty situations, so that even though you'll always be surprised by things you encounter, you feel good about your ability to generalize from what you've learned.

All right, let me ask you a tricky question. To me, the two companies that are building, at scale, some of the most incredible robots ever built are Waymo and Tesla. There are very distinct approaches, technically and philosophically, in these two systems. Let me ask you to play devil's advocate, and then the devil's advocate to the devil's advocate. It's a bit of a race; of course everyone can win. But if Waymo wins this race to level four, why would they win? What aspect of the approach do you think would be the winning aspect? And if Tesla wins, why would they win, and which aspect of their approach would be the reason? Just building some intuition, not from a business perspective or any of that, just technically. And we could summarize, and maybe you can correct me, some of the more distinct aspects: Waymo has a richer suite of sensors, it has lidar and vision. Tesla has now removed radar; they do vision only. Tesla has a larger fleet of vehicles operated by humans, so it's already deployed in the field, in a larger, what do you call it, operational domain. And Waymo is more focused on a specific domain and growing it with fewer vehicles. Both are fascinating approaches. I think there are a lot of brilliant ideas, and nobody knows the answer, so I'd love to get your comments on this lay of the land.

Yeah, for sure. So maybe I'll start with Waymo. And you're right, both are incredible companies, and just gigantic respect for everything Tesla has accomplished and how they've pushed the field forward as well. So on the
Waymo side, there is a fundamental advantage in the fact that it is focused and geared towards L4 from the very beginning. We've customized the sensor suite for it, the hardware, the compute, the infrastructure, the tech stack, and all of the investment inside the company. That's deceptively important, because there's a giant spectrum of problems you have to solve in order to really do this, from infrastructure to hardware to the autonomy stack to the safety framework. That's an advantage, because there's a reason why it's the fifth-generation hardware and why all of those learnings went into the Daimler program. It becomes such an advantage because you learn a lot as you drive, and you optimize for the best information you have, but fundamentally there's a big, big jump with every order of magnitude that you drive, in numbers of miles and in what you learn, and in the gap from decent progress, or L2 and so forth, to what it takes to actually go L4. At the end of the day, there's a feeling that Waymo has a long way to go, nobody's won, but there are a lot of advantages in all of these buckets, where it's the only company that has shipped a fully driverless service that you can go and use, and it's at a decently sizable scale. Those learnings can feed forward into how to solve the more general problems.

You see this process: you've deployed in Chandler. You don't know the timeline exactly, but you can see the steps, and they seem almost incremental. It's become more engineering than totally blind R&D, because it works in one place, and then you move to another place, and you grow it this way.

And just to give you an example, we fundamentally changed our hardware and our software stack almost entirely from what went driverless in Phoenix to the current generation of the system, on both sides, because the things that got us to driverless, even though it got to driverless
way like way beyond human relative safety um it is fundamentally not well set up to scale in an exponential fashion without like getting into like huge kind of scaling pains and so those learnings you just can't shortcut and so that's an advantage and so uh there's a lot of open challenges to kind of get through technical organizational like how do you solve problems that are increasingly broad and complex like this work on multiple products but there's the feeling that okay like ball's in our court there's a head start there now we got to go and solve it and i think that focus on l4 it's a fundamentally different problem if you think about it like um let's say we were designing an l2 truck that was meant to be safer and help a human you could do that with far fewer sensors far less complexity and provide value very quickly arguably with what we already have today just packaged up in a good product but you would take a huge risk in having a gap from even like compute and sensors not not to mention the software to then jump from that system to an l4 system so it's a huge risk basically so i can let me allow me to be the person that plays the devil's advocate and let's argue for the tesla approach so that the what you just laid out makes perfect sense and is exactly right there are some open questions here which is it's possible that investing more in faster data collection which is essentially what tesla's doing will get us there faster if the sensor suite doesn't matter yeah as much and machine learning can do a lot of the work this is the open question is how much is is the thing you mentioned before how much of driving can be end-to-end learned that's the open question obviously the waymo and the vision only machine learning approach will solve driving eventually both yeah the question is of timeline what's faster that's right and what you mentioned like if i were to make the opposite argument like what what puts tesla uh in the strongest position it's data that is 
their like superpower where they have an access to real-world data effectively with like a safety driver uh and uh you know like they've they found a way to like um get paid by safety drivers versus pay for safety drivers it's uh it's brilliant right yeah but you know all joking aside like um one it is incredible that they've built a business that's incredibly successful they can now be a foundation and bootstrap kind of like really aggressive investment in autonomy space uh if you can do it that's always like an incredible kind of advantage and then the data aspect of it um it is a giant amount of data if you can use it the right way to then solve the problem but the ability to collect um and filter through to the things that matter at real-world scale like a large distribution that is a that is huge like it's a big advantage um and so then the question becomes can you use it in the right way and do you have the right software systems and hardware systems in order to solve the problem and you're right that in the long term there's no reason to believe that pure camera systems can't solve the problem that humans obviously are solving with you know with vision systems but the question is when it's a risk it's a big so there's no argument that it's not a risk right like and it's already such a hard problem and so much of that problem by the way is um uh you know even beyond the perception side some of the hardest elements of the problem are on the behavioral side and decision making and the long tail safety case if you are adding risk and complexity on the input side from perception you're now making a really really hard problem like which is on its own is still like almost insurmountably hard even harder and so the question is just how much and this is where like you can easily get into a little bit of a kind of a trap where similar to how you how do you evaluate how good an av company's product is like you go and you do a trial kind of a test run with them a 
demo run which they've kind of optimized like crazy and so forth and like and it feels good do you do you put any weight in that right you know that that gap is kind of like you know pretty large still um same thing on the like perception case like the long tail of computer vision is really really hard and there's a lot of ways that that can come up and even if it doesn't happen that often at all when you think about the safety bar and what it takes to actually go full driverless not like incredible assistance driverless but full driverless that bar gets crazy high and not only do you have to solve it on the behavioral side but now you have to push computer vision beyond arguably where it's ever been pushed and so you now on top of the broader av challenge you have a really hard perception challenge as well so there's perception there's planning there's human robot interaction to me what's fascinating about what tesla is doing is in this march towards level four because it's in the hands of so many humans you get to see video you get to see humans i mean forget forget companies forget businesses it's fascinating for humans to be interacting with robots that's incredible and they're actually helping kind of push it forward and yeah and that is valuable by the way where even for us a decent percentage of our data is human driving yes um we intentionally have humans drive higher percentage than you might expect because that creates some of the best signals to train the autonomy and so that is uh on its own valuable so together we're kind of learning about this problem in an applied sense just like you had with cozmo like once when when you're chasing an actual product that people are going to use robot based product that people are going to use you have to contend with the reality of what it takes to build a robot that successfully perceives the world and operates in the world and what it takes to have a robot that interacts with other humans in the world and that that's 
like to me one of the most interesting problems humans have ever undertaken because you're in trying to create an intelligent agent that operates in a human world you're also understanding the nature of intelligence itself like how hard is driving it's still not answered to me yeah i still don't understand like all the subtle cues like even little things like um your interaction with a pedestrian where you look at each other and just go okay go right like that's hard to do without a human driver right and you're missing that dimension how do you communicate that so there's like really really interesting kind of like elements here now here's what's beautiful can you imagine that like when autonomous driving is solved how much of the technology foundation of that like space can go and have like tremendous just transformative impacts on on other problem areas and other other spaces that have subsets of these same problems like it's just incredible it's it's both a pro and a con is uh that autonomous driving is so um safety critical it's so so once you solve it it's beautiful because there's so many applications that are a lot less safety critical but it's also the the con of that is it's so safe it's so hard to solve and the same journalists that you mentioned and get excited for a demo are the ones who who write long articles about the failure of your company if there's one accident that's based on a robot and it's it's it's just society's so tense and waiting for failure of robots you're in such a high stake environment failure has such a high cost and it slows down development it slows down development yeah like the team like definitely noticed that like once you go driverless like we're driverless in phoenix and you continue to iterate your iteration pace slows down um because your fear of regression forces so much more rigor that you know obviously you know you have to find a compromise on like okay well how often do we release driverless builds because every 
time you release a driverless build you have to go through this like validation process which is very expensive so far so um it is interesting it's like it is just one of the hardest things there's no other industry where like uh you would not like you wouldn't release products way way quicker when you start to kind of provide even portions of the value that you provide healthcare maybe is the other one that's right but at the same time right like we've gotten there where you think of like surgery right like you have surgery there's always a risk but like it's really really bounded you know that there's an accident rate when you go out and drive your car today right like and you know what the fatality rate in the u.s is per year we're not banning driving because there was a car accident but the bar for us is way higher and we hold ourselves very seriously to it where you have to not only be better than a human but you probably have to like at scale be far better than a human by a big margin and you have to be able to like really really thoughtfully explain um all of the ways that we validate that in a way that becomes very comfortable for humans to understand because a bunch of jargon that we use internally just doesn't compute at the end of the day we have to be able to explain to society how do we quantify the risk and acknowledge that there is some non-zero risk but it's far above a human you know relative safety here's the thing to push back a little bit uh and bring cozmo back in the conversation you said something quite brilliant at the beginning of this conversation that i think probably applies for autonomous driving which is you know there's this desire to make autonomous cars much safer than human driven cars but if you create a product that's really compelling and is able to explain both the leadership and the engineers and the product itself can communicate intent then i think people may be able to be willing to put up with the thing that might be even riskier than humans 
because they understand the value of taking risks you mentioned the speed limit humans understand the value of going over the speed limit yeah humans understand the value of like going fast through a yellow light yeah to take in when you're in manhattan streets pushing through uh uh crossing pedestrians they understand that i mean this is a much more tense topic of discussion so this is just me talking so in with cozmo's case there was something about the way this particular robot communicated the energy it brought the intent it was able to communicate to the humans that you understood that of course he needs to have a camera yeah of course he needs to have this information and in that same way to me of course a car needs to take risks of course there's going to be accidents that's what like that's you know if you want a car that never has an accident have a car that just doesn't go anywhere yeah and so that but that's tricky because that's not a robotics problem like are not even under like due to you right obviously so there's a big difference though um yeah you are that's not a personal decision you're also impacting obviously kind of the rest of the road um and we're facilitating it right and so there's a higher kind of you know kind of ethical and moral bar which obviously then you know translates into as a society and from a regulatory standpoint kind of like what what comes out of it where it's hard for us to ever see this even being a debate in the sense that like you have to be beyond reproach from a safety standpoint because if you're wrong about this you could set the entire field back a decade right see i i this is me speaking i think if we look into the future there will be i personally believe this is me speaking yeah that there will be less and less focus on safety still very very high yeah meaning like after autonomy is very common and accepted it's not not not so common as everywhere but there has to be a transition because i think 
for innovation just like you were saying to explore ideas you have to take risks and i think if autonomy in the near term is to become prevalent in society i think people need to be more willing to understand the nature of risk the value of risk it's very difficult you're right of course with driving but that that's the fascinating nature of it this it's a it's a life-and-death situation that brings value to millions of people so you have to figure out what what do we value about this world how much do we value how deeply do we want to avoid hurting other humans that's right and there is a point where like you can imagine a scenario where waymo has a system that is uh even when it's like uh kind of beyond a you know human relative safety um and provably statistically will save lives there is a thoughtful navigation of you know that fact versus just kind of society readiness and perception and education of um society and regulators and everything else where like it's it's multi-dimensional um and it's not a purely logical uh argument but um ironically the logic can actually help with the emotions and just like any technology there's early adopters and then there's kind of like a curve that um happens after it but eventually celebrities you get the rock in a waymo vehicle and then everybody just comes and everybody calms down because the rock likes it yeah if you post uh yeah and it's like it's an open question on how this plays out i mean maybe we're pleasantly surprised and it just like people just realize that this is such an enabler of life and like efficiency and cost and everything that um there's a pull like at some point i fully believe that this will go from a thoughtful kind of you know you know movement and tiptoeing and like kind of like a push to society realizes how wonderful of an enabler this could become and it becomes more of a pull and um hard to know exactly how that will play out but at the end of the day like both the goods 
transportation and the people transportation side of it has that property where it's not easy there's a lot of open questions and challenges to navigate and there's obviously the technical problems to solve uh as a you know kind of prerequisite but um they they have such an opportunity that is um on a scale that very few industries in the last 20 30 years have even had a chance to tackle that maybe we're pleasantly surprised by how much how much that tipping point like in a really short amount of time actually turns into a societal pull to kind of embrace the benefits of this yeah i i hope so it seems like in the recent few decades there's been tipping points for technologies where like overnight things change it's uh like uh from taxis to ride sharing services all that that shift i mean there's just shift after shift after shift that requires digitization and technology i'm i hope we're pleasantly surprised in this so there's millions of long-haul trucks now in the united states do you see a future where there's millions of waymo trucks and maybe just broadly speaking waymo vehicles just like like ants running around the united states uh freeways and local roads yeah in other countries too like uh you look back decades from now and it might be one of those things that just feels so natural and then it becomes almost like a kind of interesting kind of oddity that we had none of it like uh you know kind of decades earlier and it'll take a long time to grow and scale very different challenges appear at every stage but over time like this is one of the most enabling technologies that um that we have in the world uh today um it'll feel like you know how was the world before the internet how was the world before mobile phones like it's gonna have that sort of a feeling to it on both sides it's hard to predict the future but do you sometimes uh think about weird ways it might change the world like surprising ways so obviously there's more direct ways where like there's 
increased efficiency it'll enable a lot of kind of logistics optimizations kind of things it will change our uh probably our roadways and all that kind of stuff but it could also change society in some kind of interesting ways do you ever think about how it might change cities how it might change their lives all that kind of yeah you can imagine city uh where people live versus work becoming more distributed because the pain of commuting becomes different just easier uh and i don't know there's a lot of options that open up the layout of cities themselves and how you think about car storage and parking obviously uh just enables a completely different type of uh uh type of experience in urban environments i i think there was like a statistic that uh something like 30 percent of the traffic uh in cities during rush hour is caused by a pursuit of parking uh or some like some really high stats so those obviously kind of open up a lot of options um flexibility on goods will enable new industries and businesses that never existed before because now the efficiency becomes more palatable goods delivery timing consistency and flexibility is going to change the way we distribute the logistics network will change the way we then can integrate with warehousing with shipping ports you can start to think about greater automation through the whole kind of stack and how that supply chain the ripples become much more uh agile versus like very grindy the way they are today where just the adaptation is like very tough and there's a lot of constraints that we have i think it'll be great for the environment it'll be great for safety where like probably about 95 percent of accidents today um statistically are due to just uh inattention or things that are preventable with uh with the strengths of automation yeah and it'll be one of those things where like industries will shift but the net creation is going to be massively positive and then we just have to be thoughtful about the negative implications that will 
happen in local area places um and adjust for those but i'm an optimist in general for the technology where you could argue a negative on any new technology but you start to kind of see that if there is a big demand for something like this the in almost all cases the like it's an enabling factor that's gonna kind of propagate through the um you know through society and particularly as life expectancies get longer and you know and so forth like there's a just a lot more need for um a greater percentage of the population to kind of just be serviced with a high level of efficiency because otherwise we can have a really hard time kind of scaling to what's ahead in the next 50 years um yeah and you're absolutely right every technology has uh negative consequences and positive consequences and we tend to focus on the negative a little bit too much in fact autonomous trucks are often brought up as an example of uh artificial intelligence and robots in general taking our jobs and as we've talked about briefly here we talk a lot with steve you know that's it is a concern that automation will take away certain jobs it will create other jobs so there's temporary pain uh hopefully temporary but pain is pain and all people suffer and that human suffering is really important to think about how uh but trucking is ver i mean there's a lot written on this is i would say far from the the thing that that would cause the most pain yeah there's even more positive properties about trucking where not only is there just a you know huge shortage which is going to increase the average age of truck drivers is getting closer to 50 because the younger people aren't wanting to come into it they're trying to like incentivize lower the age limit like all these sort of things um and the demand is just going to increase and the least favorable like it depends on the person but in most cases the least favorable types of routes are the massive long-haul routes where you're on the road away from your 
family 300 plus days a year steve talked about the pain of those kind of routes from a family perspective you're you're basically away from family it's not just hours you work insane hours but it's also just time away from family right and just obesity rate is through the roof because you're just sitting all day like it's really really tough and um and that's also where like the biggest kind of safety risk is because of fatigue and um and so when you think of the gradual evolution of how trucking comes in first of all it's not overnight it's gonna take decades to kind of phase in all the like there's just a long long long road ahead but the the routes and the portions of trucking that are going to require humans the longest and benefit the most from humans are the short-haul and most complicated kind of more urban routes which are also the more more pleasant ones which are um you know less continual driving time more um uh more flexibility on like you know geography and location and you get to kind of sleep at your own home and very importantly if you optimize the logistics you're going to use human you're going to use humans much better that's right and and thereby pay them much better because like one one of the biggest problems is truck drivers currently are paid by like how much they drive so you they really feel the pain of inefficient logistics yeah because like if they're just sitting around for hours which they often do not driving waiting yeah they're not getting paid for that time that's right and that so like logistics has a significant impact on the quality of life of a truck driver and high percentage of trucks are like uh empty because of inefficiencies in the system um yeah it's one of those things where like um and the other thing is when you increase the efficiency of a system like this the overall net like volume of the system tends to increase right like the the entire market cap of trucking is going to go up um when the 
efficiency improves uh and facilitates both growth in industries and better utilization of trucking um and so that on its own just creates more and more demand which um uh of all the places where ai comes in and starts to really um uh kind of reshape an industry this is one of those where like there's just a lot of positives that for at least any time in the foreseeable future seem really lined up in a good way um to um kind of come in and help with the shortage and start to kind of optimize for the routes that are most dangerous and most painful yeah so this is true for trucking but if we zoom out broader you know automation and ai does technology broadly i would say but you know automation is a thing that has a potential in the next couple of decades to shift the kind of jobs available to humans yes and so that results in like i said human suffering because people lose their jobs there's economic pain there and there's also a pain of meaning so for a lot of people work is a source of uh meaning it's a source of identity of of pride of you know pride in getting good at the job pride in craftsmanship and excellence which is what truck drivers talk about yeah but but that this is true for a lot of jobs and is that something you think about as a sort of a roboticist zooming out from the trucking thing um like where do you think it would be harder to find activity and work that's a source of identity and source of meaning in the future like i do think about it because you want to make sure that you you worry about the entire system like not just like the party economy plays in it but what are the ripple effects of it down the road and um on enough of a time window there's a lot of opportunity to put in the right policies and the right opportunities to kind of reshape and retrain and find those openings and so just to give you a few examples both trucking and cars we have remote assistance facilities that are there to interface with customers and monitor vehicles and 
provide like very focused kind of assistance on uh kind of areas where the vehicle may want to request help uh in understanding an environment so those are jobs that kind of get created and supported i remember like taking a tour of one of the amazon facilities where you've probably seen the kiva systems robots uh where you have these orange robots that have automated um the warehouse like kind of picking and collecting of items in this like really elegant and beautiful way um it's actually one of my favorite applications of robotics of all time um uh you know like i think i kind of came across that company in like 2006 it was just amazing and what was the warehouse or was the transport little thing so basically instead of a person going and walking around and picking the seven items in your order um these robots go and pick up a shelf and move it over in a row where like the seven shelves that contain the seven items are lined up and a you know laser or whatever points to what you need to get and you go and pick it and you place it to fill the order and so the people were fulfilling the final orders what was interesting about that is that when i was asking them about like kind of the impact on labor when they transitioned that warehouse the throughput increased so much that the jobs shifted towards the final fulfillment even though the robots took over entirely the the search of the items themselves and the labor the jobs stayed like nobody like that was actually the same amount of jobs uh roughly that were necessary but the throughput increased by i think over 2x or some some amount right like so um you have these situations that are not zero-sum games in this really interesting way and the optimist in me thinks that there's these types of solutions in almost any industry where the growth that's enabled creates opportunities that you can then leverage but you got to be intentional about finding those and really helping make those links because any even if you make the 
argument that like there's a net positive locally there's always tough hits that you got to be very careful about that's right you have to have an understanding of that link because there's a short period of time whether training is required or just mental transition or physical or whatever is required that's still going to be short-term pain the uncertainty of it there's families involved you know it it's i mean it's exceptionally it's difficult on a human level and you have to really think about that even you can't just look at economic metrics always it's human beings that's right and you can't even just uh take it as like okay well we need to like subsidize or whatever because like there is an element of just personal pride where right majority of people like people don't want to just be okay but like they want to actually like have a craft like you said and have a mission and feel like they're having a really positive impact and so um my personal belief is that there's a lot of transferability in skill set um that is possible especially if you create a bridge and an investment um to enable it um and to some degree that's our responsibility as well in this process uh you mentioned kiva robots amazon let me ask you about the astro robot which is i don't know if you've seen it it's amazon's announced that it's a home robot that they have a screen looks awfully a lot like cozmo has i think different vision probably what are your thoughts about like home robotics in this kind of space there's been a quite a bunch of home robots social robots that very unfortunately have closed their doors that um for various reasons perhaps they were too expensive there's manufacturing challenges all that kind of stuff what are your thoughts about amazon getting into this space yeah we had some signs that they were getting into it like long long long long ago maybe they're a little too interested in cozmo and uh yeah during our conversations but they're also very good partners actually 
for us as we kind of integrated a lot of shared technology but if i could also get your thoughts on you know you could think of alexa as a robot as well yeah echo do you see those as fundamentally different just because you can move and look around is that fundamentally different than the thing that just sits in place uh it opens up options um but uh you know my first reaction is i think like i have my doubts that this one's going to hit the mark because i think for the price point that it's at and the like kind of functionality and value propositions that they're trying to put out it's uh uh it's still searching for like the killer application that like justifies i think it was like a 1500 dollar price point or kind of somewhere around there that's a really high bar so there's enthusiasts and early adopters will obviously kind of pursue it but you have to like really really hit a high mark at that price point which we always tried to we were always very cautious about jumping too quickly to the more advanced systems that we really wanted to make but would have raised the bar so much you have to be able to hit it in today's cost structures and technologies the mobility is an angle that hasn't been utilized but it has to be utilized in the right way um and so that's going to be the biggest challenge is like can you meet the bar of what a con what the mass market consumer like you know think like you know our uh our neighbors our friends' parents like would they find a deep deep value like in you know in this at a mass scale that you know that justifies the price point i think that's in the end one of the biggest challenges for robotics especially consumer robotics where you have to kind of meet that bar uh it becomes very very hard um and there's also the higher bar just like you were saying with cozmo of you know a thing that can look one way and then turn around and look at you there's that's either a super desirable quality or super undesirable quality depending 
on how much you trust the thing that's right and so there's uh there's a problem of trust to solve there there's a problem of personalities the thing is the quote-unquote problem that cozmo solved so well yeah is that there you trust the thing yeah and that has to do with the company with the leadership with the intent that's communicated by the device and the company and everything together yeah exactly right uh and so um and i think they also have to retrace some of the like learnings on the character side where like as usual i think that's the place where it's uh a lot of companies are great at the hardware side of it and can you know think about those elements and then there's like you know the thinking about the ai challenges particularly the advantage of alexa is a pretty huge boost for them um the character side of it for technology companies is pretty new novel territory and so that will take some iterations but um yeah i mean i hope i hope there's continued progress in the space and that thread doesn't kind of go dormant for too long and it's not you know it's going to take a while to kind of evolve into like the ideal applications but you know this is one of um amazon's i guess like you could call it it's definitely like part of their dna but in many cases it's also a strength where they're very willing to like iterate uh kind of aggressively and um and move quickly and take risks you have deep pockets so you can yeah and they'll maybe have more misfires than an apple would um but uh you know it's different styles and different approaches and um you know at the end of the day it's like there's a few familiar uh kind of elements there for sure which was uh you know kind of you know homage is one way to put it yeah uh so why is it so hard at a high level um to build a robotics company a robotics company that lives for a long time so if you look at so i thought cozmo for sure would live for a very long time that to me was exceptionally 
successful vision and idea and implementation irobot is an example of a company that has pivoted in all the right ways to survive and arguably thrive by focusing on the having like a have a driver that constantly provides profit which is the vacuum cleaner and of course there's like amazon what they're what they're doing is they're almost like taking risks so they can afford it because they have other sources of revenue right but outside of those examples most robotics companies fail yeah why why do they fail why is it so hard to run a robotics company irobot's impressive because they found a really really great fit of where the technology could satisfy a really clear use case and need and they did it well and they didn't try to overshoot from a cost-to-benefit standpoint robotics is hard because it like tends to be more expensive it combines way more technologies than a lot of other types of companies do if i were to like say one thing that is maybe the biggest risk in like a robotics company failing is that it can be either a technology in search of an application or they try to bite off a kind of an offering that has a mismatch in kind of price to function um and uh just the mass market appeal isn't there and um consumer products are just hard it's just i mean after all the years i like definitely kind of feel a lot of the battle scars because you have um you know you not only do you have to like hit the function but you have to educate and explain get awareness up deal with different kinds of consumers like uh you know there's um there's a reason why a lot of technologies sometimes start in the enterprise space and then kind of continue forward in the consumer space even like you know you see ar like starting to kind of make that shift with hololens and so forth in some ways consumers and price points that they're willing to kind of uh be attracted in a mass market way and i don't mean like you know 10 000 enthusiasts bought it but i mean like you know 
2 million, 10 million, 50 million, that kind of mass-market interest, have bought it: that bar is very, very high. And typically robotics is novel enough and non-standardized enough that it pushes on price points so much that you can easily get out of range, where the capabilities of today's technology, or just the function that was picked, don't line up. And so that product-market fit is very important. So the space of killer apps, or rather super compelling apps, is much smaller, because it's easy to get outside the price range. Yeah. And most consumers... and it's not constant, right? That's why we picked off entertainment, because the quality was just so low in physical entertainment that we felt we could leapfrog that and still create a really compelling offering at a price point that was defensible, and that proved out to be true. And over time, that same opportunity opens up in healthcare, in home applications, in commercial applications, and in broader, more generalized interfaces. But there are missing pieces in order for that to happen, and all of those have to be present for it to line up. And we see these sorts of trends in technology, where technologies that start in one place evolve and grow into another. Some things start in gaming, some things start in space or aerospace, and then move into the consumer market. And sometimes it's just a timing thing, right? How many stabs at what became the iPhone were there over the 20 years before, that just weren't quite ready in the function relative to the price point and complexity? And sometimes it's a small detail of the implementation that makes all the difference, which is design. Design is so important. Yeah, like the new generation of UX, right? Yeah. It's tough, and oftentimes all of them have to be there, and it has to be like a perfect storm. But yeah,
history repeats itself in a lot of ways in a lot of these trends, which is pretty fascinating. Well, let me ask you about the humanoid form. What do you think about the Tesla Bot and humanoid robotics in general? Obviously, to me, autonomous driving, Waymo and the other companies working in the space, seems to be a great place to invest in a potentially revolutionary, focused robotics application. What's the role of humanoid robotics? Do you think Tesla Bot is ridiculous? Do you think it's super promising? Do you think it's interesting, full of mystery, nobody knows? What do you think about this thing? Yeah, I think today humanoid-form robotics is research. There are very few situations where you actually need a humanoid form to solve a problem. If you think about it, wheels are more efficient than legs; joints and degrees of freedom beyond a certain point just add a lot of complexity and cost. So if you're doing a humanoid robot, oftentimes it's in the pursuit of a humanoid robot, not in the pursuit of an application, for the time being, especially when you have the gaps in interface and AI that we talk about today. Anything Elon does, I'm interested in following. There's an element of that where, no matter how crazy it is, I'll pay attention; I'm curious to see what comes out of it. You can't ever ignore it. But it's definitely far afield from their core business, obviously. What was interesting to me, and I've disagreed with Elon a lot about this, is that to me the compelling aspect of the humanoid form, and of a lot of robots, Cozmo for example, is the human-robot interaction part. From Elon Musk's perspective, the Tesla Bot has nothing to do with the human; it's a form that's effective for the factory because the factory is designed for humans. But to me, the reason you might want to
argue for the humanoid form is because, you know, at a party, it's a nice way to fit into the party. Yeah. The humanoid form has a compelling notion to it, in the same way that Cozmo is compelling. I would argue, if we were arguing about this, that it's cheaper to build a Cozmo-like form. But if you wanted to make an argument, which I have with Jim Keller, that you could actually make a humanoid robot for pretty cheap, it's possible. And then the question is: if you're using it in an application where it can be flawed, where it can have a personality and be flawed in the same way that Cozmo is, then maybe it's interesting for integration into human society. That, to me, is an interesting application of a humanoid form, because humans are drawn, like I mentioned to you, to legged robots. We're drawn to legs and limbs and body language and all that kind of stuff, and even a face, even if you don't have the facial features, which you might not want to have in order to reduce the creepiness factor. So to me the humanoid form is compelling, but in terms of that being the right form for the factory environment, I'm not so sure. Yeah, for the factory environment, right off the bat: what are you optimizing for? Is it strength? Is it mobility? Is it versatility? That completely changes the look and feel of the robot that you create. And almost certainly the human form is over-designed for some dimensions and constrained for others. What are you grasping? Is it big? Is it little? So you would make it customizable for the different needs, if that was the optimization. And then for the other one, I could totally be wrong, but I still feel that the closer you try to get to a human, the more you're subject to the biases of what a human should be, and you lose flexibility to shift away from your weaknesses and towards your strengths. And that
changes over time. But there are ways to make really approachable and natural interfaces for robotic characters and deployments in these applications that do not look like a human at all, and that actually creates way more flexibility and capability and role and forgiveness and interface and everything else. Yeah, it's interesting, but I'm still confused by the magic I see in legged robots. Yeah, there is a magic. I'm absolutely amazed at it from a technical curiosity standpoint, and at the magic that the Boston Dynamics team can do, from walking and jumping and so forth. Now, there's been a long journey to try to find an application for that sort of technology, but wow, that's incredible technology, right? Yes. So then you go towards: okay, are you working back from a goal of what you're trying to solve, or are you working forward from a technology and then looking for a solution? I think it's a bi-directional search oftentimes, but the two have to meet. And that's where humanoid robots are kind of close to that, in that it is a decision about a form factor and a technology that it forces, one that doesn't have a clear justification for why that's the killer app from the other end. But I think the core fascinating idea with the Tesla Bot, the one that's carried by Waymo as well, is that when you're solving the general robotics problem of perception and control, where there's the very clear application of driving, as you get better and better at it, when you have, like, the Waymo Driver, yeah, the whole world starts to look like a robotics problem. So it's very interesting for now: detection, classification, segmentation, tracking, planning, it carries over. Yeah. So there's no reason, I mean, I'm not speaking for Waymo here, but, you know, moving goods: there's no reason, Transformer-like, this thing
couldn't, you know, take the goods up an elevator. Yeah. Slowly expand what it means to move goods, and expand more and more of the world into a robotics problem. Well, that's right, and you start to think of it as an end-to-end robotics problem, from loading, from everything. Yes. And even the truck itself: today's generation is integrating into today's understanding of what a vehicle is, right? The Pacifica, the Jaguar, the Freightliners from Daimler. There's nothing that stops us, down the road, after starting to get to scale, from expanding these partnerships to really rethink what the next generation of a truck would look like, one that is actually optimized for autonomy, not for today's world. Maybe that means a very different type of trailer. There are a lot of things you could rethink on that front, which is on its own very, very exciting. Let me ask you: like I said, you went to the mecca of robotics, which is CMU, Carnegie Mellon University. You got a PhD there. So maybe by way of advice, and maybe by way of story and memories, what does it take to get a PhD in robotics at CMU? And maybe you can throw in some advice for people who are thinking about doing work in artificial intelligence and robotics and are thinking about whether to get a PhD. I actually was at CMU for undergrad as well, and didn't know anything about robotics coming in. I was doing electrical and computer engineering and computer science, and really got more and more into AI, and then fell in love with autonomous driving. And at that point, that was, by a big margin, such an incredible central spot of investment in that area. What I would say is that robotics, for all the progress that's happened, is still a really young field. There's a huge amount of opportunity. Now, that opportunity has shifted, where something like autonomous driving has moved from
being very research- and academia-driven to being commercially driven, where you see the investments happening on the commercial side. Now, there are other areas that are much younger, and you see grasping and manipulation making the same sort of journey that autonomy made, and there are other areas as well. What I would say is that the space moves very quickly. Anything you do a PhD in, as in most areas, will evolve and change as technology changes and constraints change and hardware changes and the world changes. And the beautiful thing about robotics is that it's super broad. It's not a narrow space at all, and it can be a million different things in a million different industries. So it's a great opportunity to come in and get a broad foundation in AI, machine learning, computer vision, systems, hardware, sensors, all these separate things. You do need to go deep and find something that you're really, really passionate about. Obviously, just like any PhD, this is a five- or six-year endeavor, and you have to love it enough to go super deep, to learn all the things necessary to function at a deep level in that area, and then contribute to it in a way that hasn't been done before. And in robotics, that probably means more breadth, because robotics is rarely one particular narrow technology. And it means being able to collaborate with teams. One of the coolest aspects of the experience that I cherish in my PhD is that we actually had a pretty large AV project that, for that time, was a pretty serious initiative, where you got to partner with a larger team, and you had the experts in perception, and the experts in planning, and the staff, and the mechanical challenges. I was working on a project called UPI back then, which was basically the off-road version of the DARPA challenge. It was a DARPA-funded project for basically a large off-road vehicle that you would
drop, and then give it a waypoint 10 kilometers away, and it would have to navigate a completely unstructured off-road environment. Yeah. So forest, ditches, rocks, vegetation. It was a really, really interesting, hard problem, where the wheels would be up to my shoulders. It's gigantic, right? Yeah. By the way, AV, for people, stands for autonomous vehicles. Autonomous vehicles, yeah, sorry. What I think is the beauty of robotics, but also the expectation, is this: there are spaces in computer science where you can be very, very narrow and deep. With robotics, the necessity, but also the beauty of it, is that it forces you to be excited about that breadth and that partnership across the different disciplines that enable it. But that also opens up so many more doors, where you can go and do robotics in almost any category. Robotics isn't really an industry. It's like AI, right? It's the application of physical automation to all these other worlds. So you can do robotic surgery, you can do vehicles, you can do factory automation, you can do healthcare, or you can leverage the AI around the sensing to think about static sensors and scene understanding. So I think that's got to be the expectation and the excitement, and it breeds people who are probably a little bit more collaborative and more excited about working in teams. If I could briefly comment on the fact that the robotics people I've met in my life, from CMU and MIT, are really happy people. Yeah. I think it's the collaborative thing. Yeah, I think you're not sitting in, like, the fourth basement. Exactly. When you're doing machine learning, purely software, it's very tempting to just disappear into your own hole and never collaborate, and that breeds a little bit more of the silo mentality of, like, I have a problem, and it's almost negative to talk to somebody
else, or something like that. But robotics folks are just very collaborative, very friendly. And there's also an energy of, like, you get to confront the physics of reality often, which is humbling and also exciting. It's humbling when it fails and exciting when it finally works. It's like the purity of the passion. You've got to remember that right now robotics and AI are just all the rage, and autonomous vehicles and all this. 15 years ago, 20 years ago, it wasn't that deeply lucrative. People who went into robotics did it because they thought it was just the coolest thing in the world to make physical things intelligent in the real world. So there's a raw passion, where they went into it for the right reasons. It's a really great space. And that organizational challenge, by the way: when you think about the challenges in AV, we talk a lot about the technical challenges, but the organizational challenge is through the roof. You think about what it takes to build an AV system, and you have companies that are now thousands of people. You look at other really hard technical problems, like an operating system: it's pretty well established. You know that there's a file system, there's virtual memory, there's this, there's that, there's caching, and there's a reasonably well-established modularity and APIs and so forth, so you can scale it in an efficient fashion. That doesn't exist at anywhere near that level of maturity in autonomous driving right now. Tech stacks are being reinvented, organizational structures are being reinvented. You have problems like pedestrians that are not isolated problems: they're part sensing, part behavior prediction, part planning, part evaluation. And one of the biggest challenges is actually how do you solve these problems where the mental capacity of a human is starting to get strained on how do you
organize it and think about it, where you have this multi-dimensional matrix that all needs to work together. And that makes it kind of cool as well, because it's not solved at all: what does it take to actually scale this? And then you look at other gigantic challenges that have been successful and are way more mature, and there's a stability to them. Maybe the autonomous vehicle space will get there, but right now there are just as many organizational challenges as there are technical challenges: how do you solve these problems that touch on so many different areas, and efficiently tackle them while maintaining progress among all these constraints, while scaling? By way of advice, what advice would you give to somebody thinking about doing a robotics startup? You mentioned Cozmo. Somebody that wanted to carry the Cozmo flag forward, the Anki flag forward. Looking back at your experience, looking forward to the future that will obviously have such robots, what advice would you give to that person? Yeah, it was the greatest experience ever. There are things you learn navigating a startup that are very hard to encounter in a typical work environment. It's wonderful, but you've got to be ready for it. It's not as good as, you know, the glamour of a startup; there are just brutal emotional swings up and down. So having co-founders actually helps a ton. I cannot imagine doing it solo. Having at least somebody where, on your darkest days, you can really openly have that conversation and lean on somebody that's in the thick of it with you helps a lot. What I would say... What was the nature of the darkest days and the emotional swings? Is it worrying about the funding? Is it worrying about whether any of your ideas are
any good, or ever were good? Is it the self-doubt? Is it facing new challenges that have nothing to do with the technology, like organizational, human resources, that kind of stuff? Yeah, you come from a world in school where you feel that you put in a lot of effort and you'll get the right result, and input translates proportionally to output. You need to solve the problem set or do whatever, and just get it done. Now, a PhD tests that a little bit, but at the end of the day, if you put in the effort, you tend to come out with enough results to get a PhD. In the startup space, you could talk to 50 investors and they just don't see your vision, and it doesn't matter how hard you tried and pitched. You could work incredibly hard and have a manufacturing defect, and if you don't fix it, you're out of business. You need to raise money by a certain date, and you've got to have this milestone in order to have a good pitch, and you do it. You have to have this talent, and you just don't have it inside the company. Or you have to get 200 people, or however many people, along with you and bought into the journey. You're disagreeing with an investor, and they're your investor, so there's no walking away from it, right? And it tends to be those things where you just get clobbered in so many different ways that things end up being harder than you expect. It's such a gauntlet, but you learn so much in the process. And there are a lot of people that actually end up rooting for you and helping you from the outside, and you get great mentors, and you find fantastic people that step up in the company, and you have this magical period where it's life or death for the company, but you're all fighting for the same thing, and it's the most satisfying kind of
journey ever. The thing that makes it easier, and that I would recommend, is to be really, really thoughtful about the application. There's a saying about team and execution and market, and how important each of those is, and oftentimes the market wins. You come at it thinking that if you're smart enough and you work hard enough and you have the right talented team and so forth, you'll always find a way through, and it's surprising how much of the dynamics are driven by the industry you're in and the timing of you entering that industry. Waymo is a great example of it. I don't know if there'll ever be another company, or suite of companies, that has raised and continues to spend so much money at such an early phase of revenue generation and productization. From a P&L standpoint, it's an anomaly by any measure of any industry that's ever existed, except for maybe the U.S. space program. Right. But it's multiple trillion-dollar opportunities, and it's so unusual to find a market of that size that, with just the progress that shows the de-risking of it, you could apply whatever discounts you want off of that trillion-dollar market and it still justifies the investment that is happening, because being successful in that space makes all the investments feel trivial. Now, by the same token, the size of the market, the size of the target audience, the ability to capture that market share, how hard that's going to be, who the incumbents are: that's probably one of the lessons I appreciate more than anything else, that those things really, really do matter, and oftentimes can dominate the quality of the team or execution. Because if you miss the timing, or you do it in the wrong space, you run into the institutional headwinds of a particular environment. Let's say you have the greatest
idea in the world, but you barrel into healthcare, and it takes 10 years to innovate in healthcare because of a lot of challenges, right? There are fundamental laws of physics that you have to think about. So the combination of Anki and Waymo drives that point home for me: you can do a ton if you have the right market, the right opportunity, the right way to explain it, and you show the progress in the right sequence. It can really significantly change the course of your journey as a startup. How much of it is understanding the market, and how much of it is creating a new market? How do you think about it? The space of robotics is really interesting. You said exactly right: the space of applications is small, yeah, relative to the cost involved. So how much is truly revolutionary thinking about what the application is? Creating something that didn't really exist. This is pretty obvious to me with the whole space of home robotics, with everything that Cozmo did. I guess you could talk about it as a toy, and people will understand it, but Cozmo is much more than a toy. Yeah. And I don't think people fully understand the value of that. You have to create it, and the product will communicate it. Just like the iPhone: nobody understood the value of no keyboard and a thing that can do web browsing. I don't think they understand the value of that until you create it. Yeah, having a foot in the door and an entry point still helps, because at the end of the day, an iPhone replaced your phone. So it had a fundamental purpose, and all these things that it did better, right? Sure. And then you could do A, B, C on top of it. You even remember the early commercials, where it's always one application of what it could do, and then you get a phone call, right? That was intentionally sending a message of something familiar, but then, yes, you can send a text message, you can listen to
music, you can surf the web, right? And autonomous driving obviously anchors on that as well. You don't have to explain to somebody the functionality of an autonomous truck, right? There are nuances around it, but the functionality makes sense. In the home, you have a fundamental advantage. We always thought about this, because it was so painful to explain to people what our products did, and how to communicate that super cleanly, especially when something was so experiential. So you compare Anki to Nest. Nest had some beautiful products, where they started scaling and actually found really great success, and they had really clean and beautiful marketing messaging, because they anchored on reinventing existing categories, where it was a smart thermostat, right? So you're able to take what's familiar, anchor on that understanding, and then explain what's better about it. That's funny. You're right, Cozmo is a totally new thing. Like, what is this thing? Because we struggled. We spent a lot of money on marketing. We had a hard... we fought. We actually had far greater efficiency on Cozmo than on anything else, because we found a way to capture the emotion in some little shorts, to lean into the personality in our marketing, and it became viral, where we had these videos that would go and get hundreds of thousands of views, and get spread, and sometimes millions of views. But it was really, really hard. So finding a way to anchor on something that's familiar, but then grow into something that's not, is an advantage. But then again, there are successes otherwise. Alexa never had a comp, right? You could argue that that's very novel and very new. And there are a lot of other examples that created a category, like Kiva Systems. I mean, they came in and they...
Enterprise is a little easier; it's less susceptible to this, because if you can argue a clear value proposition, it's a more logical conversation that you can have with customers. It's a little bit less emotional and subjective. But yeah, in the home, you have to... Yeah, so a home robot: what does that mean? Yeah. And so then you really have to be crisp about the value proposition and what really makes it worth it. And we, by the way, went through that same ordeal. We almost hit a wall coming out of 2013, where we were so big on explaining why our stuff was so high-tech, and all the great technology in it, and how cool it is, and so forth, and we had to make a super hard pivot to why it is fun, and why the random family of four needs this, right? So it's learnings, but that's the challenge. I think robotics tends to sometimes fall into the new-category problem, and then you've got to be really crisp about why it needs to exist. Well, I think some of robotics, depending on the category, depending on the application, is a little bit of a marketing challenge. And I don't mean... I mean, the kind of marketing that Waymo is doing, that Tesla is doing, is showing off incredible engineering, incredible technology. But convincing, like you said, a family of four that this is transformative for your life, this is fun... They don't care about the tech in your thing. They don't, they really don't. They need to know why they want it. So some of that is just marketing. Yeah, that's why Roomba... yes, they didn't go and have this huge ramp into the entirety of robotics and so forth, but they built a really great business in the vacuum cleaner world. And everybody understands what a vacuum cleaner is. Most people
are annoyed by doing it, and now you have one that kind of does it itself, yeah, with various degrees of quality. But that is so compelling that it's easier to understand. And I think they have, like, 15% of the vacuum cleaner market, so it's pretty successful, right? I think we need more of those types of thoughtful stepping stones in robotics. But the opportunities are becoming bigger, because hardware's cheaper, compute's cheaper, cloud's cheaper, and AI's better, so there's a lot of opportunity. If we zoom out from specifically startups and robotics, what advice do you have for high school students and college students about career, and living a life that you can be proud of? You lived one heck of a life. You're very successful in several domains. If you can convert that into a generalizable potion, what advice would you give? Yeah, it's a very good question. It's very hard to go into a space that you're not passionate about and push hard enough to maximize your potential in it. There's always the saying of, like, okay, follow your passion. Great. Try to find the overlap of where your passion meets a growing opportunity and need in the world. It's not too different from the startup argument that we talked about, where your passion meets the market, right? That's a beautiful thing, where you can do what you love, but it also just opens up tons of opportunities because the world's ready for it. So if you're interested in technology, that might point to going and studying machine learning, because you don't have to decide what career you're going to go into, but it's going to be such a versatile space, one that's going to be at the root of everything that's in front of us, that you can have eight different careers in different
industries and be an absolute expert in this tool set that you wield, that can go and be applied. And that, by the way, doesn't apply to just technology, right? The same thought process applies to design, to marketing, to sales, to anything. But that versatility matters. When you're in a space that's going to continue to grow, it's just like: what company do you join? One that is going to grow, and the growth creates opportunities, where the surface area is just going to increase and the problems will never get stale. So you go into a career where you have that sort of growth in the world that you're in, and you end up having so much more opportunity that organically just appears, and you can then have more shots on goal to find that killer overlap of timing and passion and skill set and point in life, where you can just really be motivated and fall in love with something. And then at the same time, find a balance. There have been times in my life where I worked a little bit too obsessively and crazily, and I think I tried to correct that at the right opportunities. But I think I probably appreciate a lot more now friendships that go way back, family, and things like that. I kind of have the personality where I have so much desire to really try to optimize what I'm working on that I can easily go to an extreme, and now I'm trying to find that balance and make sure that I have the friendships, the family, the relationship with the kids, everything. I push really, really hard, but I try to find a balance. And I think people can be happy at actually many extremes on that spectrum, but it's easy to inadvertently make
a choice by how you approach it that then becomes really hard to unwind. So being very thoughtful about all of those dimensions makes a lot of sense. Those are all interrelated. But at the end of the day... Love. Passion and love. Yeah. Love towards, you said, family, friends, and hopefully one day, if your work pans out, Boris, love towards robots. Not the creepy kind, the good kind. That's a good kind. Just friendship and fun. Yeah, it's like another dimension to how we interface with the world. Yeah, of course. You're one of my favorite human beings, roboticists. You've created some incredible robots, and I think inspired countless people. And like I said, I hope Cozmo, I hope your work with Anki, lives on, and I can't wait to see what you do with Waymo. I mean, if we're talking about artificial intelligence technology that has the potential to revolutionize so much of our world, that's it right there. So thank you so much for the work you've done, and thank you for spending your valuable time talking with me. Thanks, Lex. Thanks for listening to this conversation with Boris Sofman. To support this podcast, please check out our sponsors in the description. And now, let me leave you with some words from Isaac Asimov: "If you were to insist I was a robot, you might not consider me capable of love in some mystic human sense." Thank you for listening, and hope to see you next time.
The following is a conversation with Boris Sofman, who is the senior director of engineering and head of trucking at Waymo, the autonomous vehicle company, formerly the Google self-driving car project. Before that, Boris was the co-founder and CEO of Anki, a robotics company that created Cozmo, which in my opinion is one of the most incredible social robots ever built. It's a toy robot, but one with an emotional intelligence that creates a fun and engaging human-robot interaction. It was truly sad for me to see Anki shut down when it did. I had high hopes for
those little robots. We talk about this story and the future of autonomous trucks, vehicles, and robotics in general. I spoke with Steve Viscelli recently on episode 237 about the human side of trucking; this episode looks more at the robotic side. This is the Lex Fridman Podcast. To support it, please check out our sponsors in the description. And now, here's my conversation with Boris Sofman.

Who is your favorite robot in science fiction, books or movies?

WALL-E and R2-D2, where they were able to convey such an incredible degree of intent, emotion, and character attachment without having any language whatsoever, purely through the richness of emotional interaction. So those were fantastic. And then the Terminator series. It's a pretty wide range, right? But I love this dynamic where you have this incredible Terminator itself that Arnold played, and then he was the inferior, previous-generation version that was totally outmatched in terms of specs by the new one, but still held his own. It was interesting, because you realize how many levels there are on the spectrum, from human to the potentials in AI and robotics to futures. So yeah, that movie, as much as it was a dark world in a way, was actually quite fascinating. It gets the imagination going.

Well, from an engineering perspective, of both the movies you mentioned, WALL-E and Terminator, the first one is probably achievable. You know, a humanoid robot, maybe not with the realism in terms of skin and so on, but that humanoid form. We have the humanoid form. It seems like a compelling form; maybe the challenge is that it's just super expensive to build. But you can imagine, maybe not a machine of war, but Terminator-type robots walking around. And then the same, obviously, with WALL-E. So, for people who don't know, you created the
company Anki, which created a small robot with a big personality called Cozmo, and it does exactly what WALL-E does, which is somehow, with very few basic visual tools, communicate a depth of emotion. That's fascinating. But then again, the humanoid form is super compelling. So Cozmo is very distant from a humanoid form, and then the Terminator has a humanoid form, and you can imagine both of those actually being in our society.

It's true, and it's interesting, because it was very intentional to go really far away from human form when you think about a character like Cozmo or WALL-E, where you can completely rethink the constraints you put on that character, what tools you leverage, and then how you actually create a personality and a level of intelligence and interactivity that actually matches the constraints that you're under, whether it's mechanical or sensors or the AI of the day. This is why I was always really surprised by how much energy people put towards trying to replicate human form in a robot, because you actually take on some pretty significant constraints and downsides when you do that. The first of which is obviously the cost: the articulation of a human body is just so magical, in both the precision as well as the dimensionality, that to replicate it even in a, quote, reasonably close form takes a giant number of joints and actuators, motion, sensors, encoders, and so forth. But then you're also setting an expectation: the closer you try to get to human form, the more you expect the strengths to match, and that's not the way AI works. There are places where you're way stronger and there are places where you're weaker, and by moving away from human form you can actually change the rules, embrace your strengths, and bypass your weaknesses.

And at the same time, the human form has way too many degrees of freedom to play with. It's kind of
counterintuitive, just as you're saying, but when you have fewer constraints it's almost harder to master the communication of emotion. You see this with cartoons, like stick figures: you can communicate quite a lot with something very minimal, like two dots for eyes and a line for a smile. I think you can almost communicate arbitrary levels of emotion with just two dots and a line.

Yeah.

And that's enough. If you focus on just that, you can communicate the full range, and then you can focus on the actual magic of human and dot-and-line interaction versus all the engineering mess.

That's right. Dimensionality, voice, all these sorts of things actually become a crutch where you almost get lost in a search space. Some of the best animators that we've worked with, when they come up building their expertise, almost study by forcing these projects where all you have is a ball that can jump and manipulate itself, or really aggressive constraints where you're forced to extract the deepest level of emotion. And so, in a lot of ways, when we thought about Cozmo, you're right: if we had to describe it in one small phrase, it was bringing a Pixar character to life in the real world. That's what we were going for. And what was interesting is that with WALL-E, which we studied incredibly deeply, and in fact some of our team had worked previously at Pixar and on that project, they intentionally constrained WALL-E as well, even though in an animated film you could do whatever you wanted to, because it forced you to really saturate the smaller number of dimensions. You sometimes end up getting a far more beautiful output, because you're pushing at the extremes of this emotional space in a way that you just wouldn't otherwise, because you get lost in a surface
area if you have something that is just infinitely articulable.

So if we backtrack a little bit: you thought of Cozmo in 2011 and 2013, and actually designed and built it. What is Anki? What is Cozmo? I guess, who is Cozmo? And what was the vision behind this incredible little robot?

We started Anki back while we were still in graduate school. Myself and my two co-founders, we were PhD students in the Robotics Institute at Carnegie Mellon, and so we were studying robotics, AI, machine learning, different areas; one of my co-founders was working on walking robots for a period of time. We all had a deeper passion for applications of robotics and AI. There's a spectrum: there are people who get really fascinated by the theory of AI, machine learning, and robotics, where whether it gets applied in the near future or not is less of a factor for them, but they love the pursuit of the challenge. That's necessary, and there are a lot of incredible breakthroughs that happen there. We're probably closer to the other end of the spectrum, where we love the technology and all the evolution of it, but we were really driven by applications: how can you really reinvent experiences and functionality and build value that wouldn't have been possible without these approaches? That's what drove us. We had some experiences through previous jobs and internships where we got to see the applied side of robotics, and at that time there were actually relatively few applications of robotics outside of pure research, industrial applications, military applications, and so forth. Maybe iRobot was one exception, and maybe there were a few others, but for the most part there weren't that many. And so we got excited about consumer applications of
robotics, where you could leverage way higher levels of intelligence through software to create value and experiences that were just not possible in those fields today. We saw a pretty wide range of applications that varied in the complexity of what it would take to actually solve them, and what we wanted to do was to commercialize this into a company, but with a bottom-up approach, where we could have a huge impact in a space that was ripe for it at that time, and then build up off of that and move into other areas. Entertainment became the place to start, because you had relatively little innovation in the toy space. In the entertainment space you had these really rich experiences in video games and movies, but there was this chasm in between, and so we thought that we could really reinvent that experience. There was also a really fascinating technical transition happening at the time, where the cost of components was plummeting because of the mobile phone industry and then the smartphone industry. The cost of a microcontroller, of a camera, of a motor, of memory, of microphones was dropping by orders of magnitude. On top of that, with the iPhone coming out in, I think it was 2007, it started to become apparent within a couple of years that this could become a really incredible interface device, and the brain, with much more computation, behind a physical-world experience that wouldn't have been possible previously. We really got excited about that, and about how we could push all the complexity from the physical world into software by using really inexpensive components but putting huge amounts of complexity into the AI side. Cozmo became our second product, and the one that we're probably most proud of. The idea there was to create a physical character that had enough understanding and awareness of the physical world around it, in the context that mattered, to feel like he was
alive, and to be able to have these emotional connections and experiences with people that you would typically only find inside of a movie. The motivation very much was Pixar. We had an incredible respect and appreciation for what they were able to build in this really beautiful fashion in film. But it was always, one, virtual, and two, a story on rails that had no interactivity to it. It was very fixed. It obviously had a magic to it, but you really start to hit a different level of experience when you're actually able to physically interact with that robot.

And then that was your idea with Anki. The first product was the cars. So basically you take a toy and you add intelligence into it, in the same way you would add intelligence into AI systems within a video game, but you're now bringing it into the physical space. The idea is really brilliant: you're basically bringing video games to life.

Exactly, that's exactly right. We literally used that exact same phrase, because in the case of Drive, this was a parallel of the racing genre. The goal was to effectively have a physical racing experience but have a virtual state at all times that matches what's happening in the physical world, and then you can have a video game on top of that. You can have different characters, different traits for the cars, weapons and interactions and special abilities, all these sorts of things that you think of virtually, but then you can have them physically. One of the things that we were really surprised by, that really stood out and immediately led us to accelerate the path towards Cozmo, is that things that feel really constrained and simple in the physical world have an amplified impact on people. The exact same experience virtually would not have anywhere near the impact, but seeing it physically really stood out. And so effectively,
with Drive we were creating a video game engine for the physical world, and then with Cozmo we expanded that video game engine to create a character, with an animation and interaction engine on top of it that allowed us to start to create these much richer experiences. A lot of those elements were almost a proving ground for what human-robot interaction would feel like in a domain that's much more forgiving, where you can make mistakes. In a game, it's okay if a car goes off the track or if Cozmo makes a mistake. What's funny is that we were so worried about that; in reality, we realized very quickly that those mistakes can be endearing. If you make a mistake, as long as you realize you made a mistake and have the right emotional reaction to it, it builds even more empathy with the character.

That's brilliant, exactly. So when the thing you're optimizing for is fun, you have so much more freedom to fail, to explore. And also in the toy space; all of this is really brilliant. I've got to ask you, backtracking: it seems that for a roboticist to jump in the direction of fun is a brilliant move, because you have the freedom to explore, to design, all those kinds of things, and you can also build cheap robots. If you're not chasing perfection, and with toys it's understood that you can go cheaper, it means a robot is still expensive but it's actually affordable by a large number of people. So it's a really brilliant space to explore.

Yeah, that's right. And in fact, we realized pretty quickly that perfection is actually not fun.

Yeah.

Because in a traditional roboticist sense, the first path planner, and this is the part that I worked on out of the gate, a lot of the AI systems, was one where you have these cars racing, making optimal maneuvers to try to get ahead, and
you realize very quickly that that's actually not fun, because you want the chaos from mistakes. And so you start to almost intentionally add noise to the system in order to create more realism, in the exact same way a human player might start out really ineffective and inefficient and then start to increase their quality bar as they progress. And there is a really, really aggressive constraint that's forced on you by being a consumer product, where the price point matters a ton, particularly in entertainment, where you can't make a thousand-dollar product unless you're going to meet the expectations of a thousand-dollar product. In order to make this work, your cost of goods had to be well under a hundred dollars. In the case of Cozmo, we got it under fifty dollars end to end, fully packaged and delivered, and it was under two hundred dollars at retail.

Yeah. So, okay, if we sit down at these early stages, if you go back to that, and you're sitting down and thinking about what Cozmo looks like from a design perspective and from a cost perspective, I imagine that was part of the conversation. First of all, what came first? Did you have a cost in mind? Is there a target you're trying to chase? Did you have a vision in mind, like size? Because there are a lot of unique qualities to Cozmo. For people who don't know, they should definitely check it out. There's a display; there are eyes on the little display, and those are pretty low-resolution eyes, right? But they're still able to convey a lot of emotion. And there's this arm, a lift that sort of lifts stuff, but there's something about the arm movement that adds even more depth. The face communicates emotion, sadness and disappointment and happiness, and then the arm kind of communicates, "I'm trying here."

Yeah, "I'm doing my best." Exactly. So
it's interesting, because all of Cozmo is only four degrees of freedom, and two of them are the two treads, which are for basic movement. So you literally have only a head that goes up and down, a lift that goes up and down, and then your two wheels. And you have sound and a screen, yeah, a low-resolution screen. And with that, it's actually pretty incredible what you can come up with. Like you said, it's a really interesting give and take, because there are a lot of ideas far beyond that, obviously, as you can imagine. Like you said: how big is it? How many degrees of freedom? What does it look like? What does he sound like? How does he communicate? It's a formula that actually scales way beyond entertainment. This is the formula for human-robot interfaces more generally: you almost have this triangle. There are the physical aspects of it: the mechanics, the industrial design, what's mass-producible, the cost constraints, and so forth. You have the AI side: how do you understand the world around you, interact intelligently with it, and execute what you want to execute, so perceive the environment, make intelligent decisions, and move forward. And then you have the character side of it. Most companies that have done anything in human-robot interaction really miss the mark or underinvest in the character side of it. They overinvest in the mechanical side of it, with varied results on the AI side. The thinking is that if you put more mechanical flexibility into it, you're going to do better. You don't necessarily; you actually create a much higher bar for a high ROI, because now your price point goes up, your expectations go up, and if the AI can't meet it or the overall experience isn't there, you miss the mark.

So how did you, through those conversations, get the cost down so much and make it so simple? There's a big theme here, because you come from the mecca of
robotics, which is Carnegie Mellon University robotics. All the people I've interacted with that come from there, or just the world experts at robotics, they would never build something like Cozmo.

Yeah.

And so where did that come from, the simplicity?

It came from this combination of a team that we had. It was quite cool. And by the way, ask anybody who's experienced in the toy and entertainment space and they'll tell you you'll never sell a product over $99. That was fundamentally false, and we believed it to be false; it was because the experience had to meet the mark. And so we pushed past that amount, but there was a pressure where the higher you go, the more seasonal you become and the tougher it becomes. On the cost side, we very quickly partnered up with some previous contacts that we had worked with. Just as an example, our head of mechanical engineering was one of the earliest heads of engineering at Logitech and has a billion units of consumer products in circulation that he's worked on.

Yeah.

So crazy low-cost, high-volume consumer product experience, with a really great mechanical engineering team and a very practical mindset, where we were not going to compromise on feasibility in the market in order to chase something that would be an enabler. And we pushed a huge amount of expectations onto the software team: yes, we're going to use cheap, noisy motors and sensors, but we're going to fix it on the software side. Then, on the design and character side, there was a faction that was more from a game design background that thought Cozmo should be very games-driven, where you create a whole bunch of game experiences and it's all about game mechanics. And then there was a faction, which my co-founder and I, the most involved in this, really believed in, that was character-driven. The argument is that you will never compete with what
you can do virtually from a game standpoint, but on the character side you actually put this into your wheelhouse and more towards your advantage, because a physical character has a massively higher impact physically than virtually.

Okay, can I just pause on that, because this is so brilliant. For people who don't know, Cozmo plays games with you, but there's also a depth of character. When I was playing with it, I wondered what exactly the compelling aspect of this is, because to me, and obviously I'm biased, the character is what I enjoyed most, honestly, or what got me to return to it.

That's right.

But that's a fascinating discussion. You're right: ultimately you cannot compete on the quality of the gaming experience. The physical world is just too restrictive, and you don't have a graphics engine and all this.

But on the character side, and clearly we moved in that direction as the winning path, we immediately went towards Pixar and partnered up with Carlos Baena. He had been at Pixar for nine years; he'd worked on tons of the movies, including WALL-E and others, and just immediately spoke the language and clicked on how you think about that kind of magic and drive. We built out a team, with him as a really prominent driver of this, with different types of backgrounds, animators and character developers, where we put these constraints on the team but then got them to really try to create magic despite that. We converged on this system that was at the overlap of character and character AI. If you imagine the dimensionality of emotions, happy, sad, angry, surprised, confused, scared, you think of these extreme emotions, and we almost put this challenge out to populate this library of responses: how
do you show the extreme response that goes to the extreme end of the spectrum on angry or frustrated or whatever? That gave us a lot of intuition and learnings. Then we started parameterizing them, where it wasn't just a fixed recording; they were parameterized and had randomness to them, so you could have infinite permutations of happy and surprised and so forth. And then we had a behavioral engine that took the context from the real world, interpreted it, and created probability mappings of what sort of responses would actually make sense. So if Cozmo saw you for the first time in a day, he'd be really surprised and happy, in the same way that the first time you walk in and your toddler sees you, they're so happy, but they're not going to be that happy for the entirety of the next two hours; you have this spike in response. Or if you leave him alone for too long, he gets bored and starts causing trouble and nudging things off the table. Or if you beat him in a game, the most enjoyable emotions are him getting frustrated and grumpy, to a point where our testers and our customers would say, "I had to let him win because I don't want him to be upset." So you start to create this feedback loop where you see how powerful those emotions are. Just to give you an example, take something as simple as eye contact. You don't think about it in a movie; it just kind of happens with camera angles and so forth, but it's not really a prominent source of interaction. When a physical character like Cozmo makes eye contact with you, it builds a universal connection, from kids all the way through adults. And it was truly universal; it was not like people stopped caring after 10 or 12 years old. And so we started doing experiments, and we found that with something as simple as increasing the amount of eye contact, the number of times in a minute that he'll look over for your
approval, to make eye contact, just by, I think, doubling it, we increased the playtime engagement by 40 percent. You see these sorts of interactions where you build that empathy. And so we studied pets, we studied virtual characters; a lot of times, actually, dogs are one of the most perfect influences behind these sorts of interactions. What we realized is that the games were not there to entertain you; the games were there to create context to bring out the character. If you think about the types of games that you played, they're relatively simple, but they were always ones that create scenarios of either tension or winning or losing or surprise or whatever the case might be. They were purely there to create context in which an emotion could feel intelligent and not random. In the end, it was all about the character.

So yeah, there are so many elements to play with here. You said dogs. What lessons do we draw from cats, who don't seem to give a damn about you? Is that just another character?

It's just another character. In early aspirations, we thought it would be really incredible if you had a diversity of characters, where you almost help encourage which direction it goes, just like in a role-playing game. Think of the seven dwarfs, sort of. Initially we even thought that it would be amazing if their characters actually helped them have strengths and weaknesses in whatever they end up doing: some are scared, some are arrogant, some are super warm and friendly. In the end we focused on one, because it made it very clear that, hey, we've got to build out enough depth here. It's almost like: how long can you maintain a fiction that this character is alive, to where the person's
explorations don't hit a boundary, which happens almost immediately with typical toys and even with video games? How long can we create that immersive experience, to where you expand the boundary? One of the things we realized is that you're just way more forgiving when something has a personality and it's physical. That is the key that unlocks robotics interacting in the physical world more generally. When you don't have a personality and you make a mistake as a robot, it's "the stupid robot made a mistake, why is it not perfect?" When you have a character and you make a mistake, you have empathy, it becomes endearing, and you're way more forgiving. That was the key that, I think, goes far, far beyond entertainment.

The mistakes actually build the depth of the personality. So let me ask the movie Her question, then. Cozmo feels like the early days of something that will obviously be prevalent throughout society at a scale that we cannot even imagine. My sense is that it seems obvious these kinds of characters will permeate society and we'll be friends with them; we'll be interacting with them in different ways. You don't think of it this way, but when you play video games, they're often cold and impersonal. But even then, think about role-playing games: you become friends with certain characters in that game. They don't remember much about you; they're just telling a story. It's exactly what you're saying: they exist in that virtual world. But if they acknowledge that you exist in this physical world, if the characters in the game remember that you exist, that, for me, Lex, they understand that I'm a human being who has hopes and dreams and so on, it seems like there are going to be billions, if not trillions, of Cozmos in the world. So if we look at that future, there are several questions to
ask. How intelligent does that future Cozmo need to be to create fulfilling relationships, like friendships?

Yeah, it's a great question. Part of it was a recognition that it's going to take time to get there, because it has to be a lot more intelligent. What's good enough to be a magical experience for an eight-year-old is one thing; it's a higher bar to be a companion, like a pet in the home, or to help with a functional interface in an office environment or in a home, and so forth. The idea was that you build on that and get there, and as the technology becomes more prevalent and less expensive, you can start to work up to it. But you're absolutely right. At the end of the day, we almost equated it to how the touchscreen created this really novel interface to physical devices. This is the extension of it, where you have much richer physical interaction in the real world; this is the enabler for it. And it shows itself in a few really obvious places. Take something as simple as a voice assistant. Most people will never tolerate an Alexa or a Google Home just starting a conversation proactively when you weren't expecting it, because it feels weird. It's like you were listening, and now it feels intrusive. But if you had a character, like a cat that touches you and gets your attention, or a toddler, you'd never think twice about it. What we found really immediately is that with these types of characters, like Cozmo, which would roam around and get your attention, and we had a future version that was always on, called Vector, people were way more forgiving. So you could initiate interaction in a way that is not acceptable for machines. And in general, there are a lot of ways to customize it, but it makes people who are skeptical of technology much
more comfortable with it. There were a couple of really prominent examples of this. When we launched in Europe, we were in, I think, a dozen countries, if I remember correctly, but we went pretty aggressively in launching in Germany, France, and the UK. And we were very worried in Europe, because there's obviously a socially higher bar for privacy and security, where you've heard about how many companies have had troubles over things that might have been okay in the U.S. but are just not okay in Germany and France in particular. We were worried about this because you have Cozmo, and our future product Vector, where you have cameras, you have microphones, it's connected, and you're playing with kids in these experiences, and this is ripe to be a nightmare if you're not careful.

Yes.

And the journalists are notoriously really, really tough on these sorts of things. We prepared so much for what we would have to encounter, and we were shocked in that not once, from any journalist or customer, did we have any complaints beyond a really casual question. And it was because of the character. When the conversation came up, it was almost like, "Well, of course he has to see and hear. How else is he going to be alive and interacting with you?" It completely disarmed this fear of technology, which enabled the interaction to be much more fluid. And again, entertainment was a proving ground, but there are ingredients there that carry over to a lot of other elements down the road.

That's hilarious, that we're a lot less concerned about privacy if the thing has value and charisma. I mean, that's true for all human-to-human interaction too.

It's an understanding of intent, where it's like, well, he's looking at me, he
can see me; if he's not looking at me, he can't see me, right? So it's almost like you're communicating intent, and with that intent people are more understanding and calmer. It's interesting. It was just the earliest version of starting to experiment with this, but it was an enabler. And then you have completely different dimensions, where kids with autism had an incredible connection with Cozmo that just went beyond anything we'd ever seen. We have these letters that we would receive from parents, and we had some research projects going on with some universities studying this. There's an interesting dimension there that got unlocked that just hadn't existed before, that has these really interesting links into society and is a potential building block of future experiences.

So if you look out into the future, do you think we will have, beyond a particular game, a companion, like in the movie Her, or like a Cozmo that asks you how your day went, like a friend? How many years away from that do you think we are? What's your intuition?

Good question. I think the idea of a different type of character, closer to a pet style of companionship, will come way faster, and there are a few reasons. One is that to do something like in Her, that's effectively almost general AI, and the bar is so high that if you miss it by a bit, you hit the uncanny valley, where it just becomes creepy and not appealing. The closer you try to get to a human in form and interface and voice, the harder it becomes, whereas you have way more flexibility in still landing a really great experience if you embrace the idea of a character. That's one of the other reasons why we didn't have a voice, and also why a lot of video game characters, like The Sims, for
example does not have a voice when you uh when you think about it it was it wasn't just a cost savings like for them it was actually for all of these purposes it was because when you have a voice you immediately narrow down the appeal to some particular demographic or age range or um kind of style or gender uh if you don't have a voice people interpret what they want to interpret and an eight-year-old might get a very different interpretation than a 40 year old but you create a dynamic range and so you just you can lean into these advantages much more um in something that doesn't resemble a human and so that'll come faster i don't know when a human like that's just uh still like just complete r&d at this point the the chat interfaces are getting way more interesting and richer but it's still a long way to go to kind of pass the test of you know well let me like let's consider like let me play devil's advocate so google is a very large company that's servicing it's creating a very compelling product that wants to provide a service to a lot of people but let's go outside of that you said characters yeah it feels like and you also said that it requires general intelligence to be a successful participant in a relationship which could explain why i'm single this is very but the i i honestly want to push back on that a little bit because i feel like is it possible that if you're just good at playing a character yeah you're in a movie there's a bunch of characters if you just understand what creates compelling characters and then you you just are that character and you exist in the world and other people find you and they connect with you just like you do when you talk to somebody at a bar i like this character this character is kind of shady i don't like them you pick the ones that you like and you know maybe it's somebody that's uh reminds you of your father or mother i don't know what it is but the the freudian thing but there's some kind of connection that happens
and that's that that's the cozmo you connect to that's the future cozmo you connect and that's so i guess the statement i'm trying to make is it possible to achieve a depth of friendship without solving general intelligence i think so it's about intelligent kind of constraints right and just uh you set expectations and constraints such that in the space that's left you can be successful and so you can do that by having a very focused domain that you can operate in for example you're a customer support agent for a particular product and you create intelligence and a good interface around that or uh you know kind of in the personal companionship side you can't be everything across the board you you kind of solve those constraints and i think uh i think it's possible my my worry is like i right now i don't see anybody that has picked up on where kind of cozmo left off yes and is pushing on it in the same way and so i don't know if it's a sort of thing where similar to like how you know in dot com there were all these concepts that we considered like you know that didn't work out or like failed or like were too early or whatnot and then 20 years later you have these like incredible successes on almost the same concept like it might be that sort of thing where like there's another pass at it that happens in five years or in 10 years but um it does feel like that appreciation of that like that this the three-legged stool if you will between like you know the hardware the ai and the character um that balance it's hard to i'm not aware of anywhere right now where like that same kind of aggressive drive with the value on the character is uh is happening and so to me just a prediction exactly as you said something that looks awfully a lot like cozmo not in the actual physical form but in the three-legged stool something like that in some number of years would be a trillion dollar company i don't understand like it's obvious to me yeah that like character not
just as robotic companions but in all our computers they'll be there it's like uh clippy was like two legs of that stool or something like that yeah i mean that those are all different attempts and what's really confusing to me is they they're born these attempts and they they everybody gets excited and for some reason they die and then nobody else tries to pick it up and then maybe a few years later a crazy guy like you comes comes around with just enough brilliance and vision to create this thing and it's born a lot of people love it a lot of people get excited but maybe the timing is not right yet and then and then when the timing is right it just blows up and it just keeps blowing up more and more until it just blows up and i guess everything in the full span of human civilization collapses eventually and that wouldn't surprise me at all and like what's gonna be different in another five years or ten years or whatnot physical component costs will continue to come down uh in price and you know mobile devices and computation's going to become more and more prevalent as well as cloud as a big tool uh to offload cost um ai is going to be a massive transformation compared to what we dealt with uh where um everything from voice understanding to um uh to just you know kind of a broader contextual uh understanding and mapping of of semantics and uh understanding scenes and so forth and then the character side will continue to kind of you know progress as well because that magic does exist it just exists in different forms and you have just the brilliance of what's happening in animation and you know these other areas where um that is that was a big unlock in um you know in film obviously uh and so i think yeah the pieces can reconnect and the building blocks are actually gonna be way more impressive than they were five years ago so so in 2019 uh anki the company that created cozmo the company that you started had to shut down how did you feel at that time yeah it was
tough uh that was a really emotional stretch and it was a really tough year like about a year ahead of that was actually a pretty brutal stretch because we were um kind of like life or death at many many moments um just navigating these insane kind of just ups and downs and um barriers and the thing that made it like um like just rewinding a tiny bit like what you know what ended up being really challenging about it as a business was um from a commercial standpoint and customer reception standpoint there's a lot of things you could point to that were like you know pretty big successes sold millions of units uh like you got to like pretty serious revenue like kind of close to 100 million annual revenue um uh number one kind of product in kind of various categories but it was pretty expensive it ended up being very seasonal where something like 85 percent of our volume was in q4 because it was a you know a present and and it was expensive to market it and explain it and so forth um and even though the volume was like really sizeable and like the reviews were really fantastic um forecasting and planning for it and managing the cash operations was just brutal like it was absolutely brutal you don't think about this when you're starting a company or when you have a few million in you know in revenue because it's just your biggest costs are kind of just your head count and operations and everything's ahead of you but we got to a point where um you know you if you look at the entire year you have to operate your company pay all you know the people and so forth you have to pay for the manufacturing the marketing and everything else to do your sales in mostly november december and then get paid in december january by retailers and those swings were pretty um were really rough um and just made it like so difficult because the more successful we became the more wild those swings became because you'd have to like spend you know tens of millions of dollars on
inventory tens of millions of dollars on marketing and tens of millions of dollars on payroll and everything else and then there's the bigger dip and then you're waiting for the q4 yeah and it's not a business that like is recurring kind of month-to-month and predictable and it's just and then you're locking in your forecast in july um you know maybe august if you're lucky um and uh and it's also like very hit driven and seasonal where like you don't have the sort of continued uh kind of slow growth like you do in some other uh consumer electronics industries and so before then like hardware kind of like went out of favor too and so you had fitbit and gopro dropped from 10 billion revenue to 1 billion revenue and hardware companies are getting valued at like 1x revenue oftentimes um which is tough right and so we effectively kind of got caught in the middle where we were trying to quickly evolve out of entertainment and move into some other categories but you can't let go of that business because like that's what you're valued on that's what you're raising money on um but there's no path to kind of pure profitability just there because it was you know such you know uh specific type of price points and so forth and so um we tried really hard to make that transition and um yeah we had a financing round that fell apart at the last second and effectively there was just no path to kind of get through that and get to the next kind of like holiday season and so we ended up um uh selling some of the assets and kind of winding down the company it was uh it was brutal like we i was very transparent with the company like in the team while we were going through it where actually despite how challenging that period was very few people left i mean like people loved the vision the team the culture the like kind of chemistry and kind of what we were doing there was just a huge amount of pride there and we wanted to see it through and we felt like we had a shot to kind
of get through these checkpoints um we ended up uh and i mean by brutal i mean like literally like days of cash like three four different times uh runway like in the year you know kind of before it um where you're like playing games of chicken on negotiating credit line timelines and like repayment terms and how to get like a bridge loan from an investor it's just like level of stress that like is as hard as things might be anywhere else like you'll never come you know come close to that where you feel that like responsibility for you know 200 plus people right um and so we were very transparent during our fundraise on who we're talking to the challenges um that we have how it's going and when things are going well when things were tough um and so it wasn't a complete shock when it happened but it was just very emotional where like i you know like you know when we announced it finally that like um you know we you know basically we're just like watching kind of like you know the runway and trying to kind of time it and when we realized that like we didn't have any more outs we wanted to like kind of wind it down make sure that it was like clean and you know we could like kind of take care of people the best we could but yeah like broke down crying at you know all-hands and somebody else had to step in for a bit and like it was just very very emotional but the beautiful part is like afterwards like everybody stayed at the office to like two three in the morning just like drinking and hanging out and telling stories and celebrating and it was just like one of the best uh for many people was like the best kind of work experience that they had and there was a lot of pride in what we did and there wasn't anything obvious we could point to that like hey if only we had done that different things would have been completely different it was just like the physics didn't line up uh and uh um but the experience was pretty uh incredible but it was hard like it was uh it had this
feeling that there was this like incredible beauty in both the technology and products and the team that um uh you know there's there's a lot there that like in the you know right context could have been uh pretty incredible but it was um emotional just yeah just thinking i mean just looking at this company like you said the product and technology but the vision the implementation you got the cost down very low yeah and the compelling the nature of the product was great so many robotics companies failed at this they the robot was too expensive it didn't have the personality it didn't really provide any value like a sufficient value to justify the price so like you succeeded where basically every single other robotics company or most of them that are like going in the category of social robotics have kind of failed and i mean it's uh it's quite tragic i remember uh reading that i'm not sure if i talked to you before that happened or not but i remember you know i'm distant from this i remember being heartbroken reading that because like if if cozmo's not going to succeed what is going to succeed because that to me was incredible like it was an incredible idea cost is down to the minimum the the it's just like the most minimal design in physical form that you could do it's really compelling the balance of games so it's a it's a fun toy it's a great gift for all kinds of age groups right it's just it's compelling in every single way and it seemed like uh it was a huge success and it it failing was i don't know there was heartbreak on many levels for me just as an external observer is i was thinking how hard is it to run a business that's that's what i was thinking like if this failed this must have failed because uh it's obviously not like yeah it's business yeah maybe it's some aspect of the manufacturing and so on but i'm now realizing it's also not just that it's yeah sales marketing also it's everything right like how do you explain something that's like a new
category to people that like have all these predispositions and so like uh you know it it had some of the hardest elements of if you were to pick a business it had some of the hardest uh um customer dynamics because like to sell a 150 dollar product you got to convince both the child to want it and the parents to agree that it's valuable so you're having like this dual prong marketing challenge you have manufacturing you have like really high precision on the components that you need you have the ai challenges so there were a lot of tough elements but it's this feeling where like just really great alignment of unique strength across kind of like all these different areas just an incredible like you know kind of character and animation team between this like carlos and there's like a character director day that came on board and like you know really great people there the ai side the um uh the manufacturing the you know where um like never missing a launch right and actually you know kind of hitting that quality was um yeah it was it was heartbreaking but uh here's one neat thing is like we we had so much like fan mail from kind of kids parents like i actually like there was a bunch they collected in the end yeah that um i actually saved and like i never it was too emotional to open it and i still haven't opened it um and so i actually have this giant envelope of like a stack this much of like letters from you know kids and families just like every you know permutation you can imagine and so planning to kind of i don't know maybe like a five year you know five year eight some year reunion just inviting everybody over and we'll just like kind of dig into it and um kind of bring back some memories but um you know good impact and uh um well i i think there will be companies uh maybe waymo and google will be somehow involved that will carry this flag forward and will uh will make you proud whether you're involved or not i think this is one of the greatest
robotics companies in the history of robotics so you should be proud it's still tragic to know that you know because you read all the stories of apple and and let's see spacex and like companies that were just on the verge of failure several times through that story and they just it's almost like a roll of the dice they succeeded and here's the roll of the dice that just happened to go and that's the appreciation that like when you really like talk to a lot of the founders like everybody goes through those moments and sometimes it really is a matter of like you know timing a little bit of luck like some things are just out of your control and um uh and you you get a much deeper appreciation for um just the dimensionality of of that challenge but um the great thing is that like a lot of the team actually like stayed together and so um there were actually a couple of companies that we we kind of kept big chunks of the team together and we actually kind of helped align this uh um you know to help people out as well um and one of them was waymo where uh a majority of the ai and robotics team actually had the exact background uh that you would look for in like kind of an av space it was a space that a lot of us like you know were you know worked on in grad school were always passionate about and ended up uh you know maybe the time you know serendipitous timings from another perspective where like uh um kind of landed in a really unique um circumstance it's actually been quite exciting too so it's interesting to ask you just your thoughts uh cozmo still lives on under digital dream labs i think is that are you tracking the progress there or is it too much pain is it are you is that something that you're excited to see where that goes so keeping an eye on it of course just out of curiosity and obviously just kind of care for the product line i think um it's deceptive how complex it is to manufacture and evolve that product line um and the amount of experiences that are required to
complete the picture and be able to move that forward and i think that's going to make it pretty hard to do something really substantial with it it would be cool if like even the product in the way it was was able to be manufactured yes again that would be yeah which would be neat um but uh it's i think it was it's deceptive how tricky that is on like everything from the quality control the details and um and then like technology changes that forces you to reinvent and update certain things um so uh i haven't been super close to it but just kind of keeping an eye on it yeah it's really interesting how it's deceptively difficult just as you're saying for example those same folks uh and i've spoken with them they're they partnered up with rick and morty uh creators to uh to do the butter robot yes i love the idea i just recently i've kind of half-assed watched rick and morty previously but now i just watched like the first season it's such a brilliant show i i like i did not understand how brilliant that show is and obviously i think in season one is where the butter robot comes along for just a few minutes or whatever but i just fell in love with the butter robot the sort of the that particular character just like you said there's characters you can create personalities you can create and that particular a robot who's doing a particular task realizes you know this like realizes that's the existential question this the myth of sisyphus question that uh camus writes about it's like is this all there is because he moves butter but you know that realization that's a that's a beautiful little realization for a robot that my purpose is very limited with this particular task it's absurd it's humor of course it's darkness it's a beautiful mix but so they want to release that butter robot but something tells me that to do the same depth of personality as cozmo had the same richness it would be on the manufacturing on the ai on the storytelling on the design it's going to
be very very difficult it could be a cool sort of uh toy for rick and morty fans but to create the same depth of existential angst yeah that the butter robot symbolizes is is really that's the brave effort you succeeded at with cozmo but it's not easy it's really studies and you can fail on almost any one of the kind of dimensions and like uh and yeah it takes you know yeah unique convergence of a lot of different skill sets to try to pull that off yeah on this topic let me ask you for some advice because uh as i've been watching rick and morty i i told myself i have to build the butter robot just as a hobby project and so uh i got a nice platform for it with treads and and there's a camera that moves up and down and so on um i'll probably paint it but the question i'd like to ask there's obvious technical questions i'm fine with communication the personality storytelling all those kinds of things i think i understand the process of that but how do you know when you got it right so with with cozmo how did you know this is great like or um something is off like yeah is this brainstorming with the team do you know it when you see it is it like love at first sight it's like this is right or like i guess if we think of it as an optimization space is there uncanny valley where like that's not right or this is right or are a lot of characters right yeah we stayed away from uncanny valley just by having such a different like mapping where it didn't try to look like a dog or a human or anything like that and so uh you avoided having like a weird pseudo similarity but not quite hitting the mark um but you could like just fall flat where just like a personality or a you know character emotion just didn't feel right and so it actually mirrored very closely to kind of the iterations that a character director at pixar would have where you're running through it and you can virtually kind of like see what it'll look like we we created a plug-in to where we actually used like
like maya the sim you know the animation tools and then we created a plug-in that perfectly matched it uh to the physical one and so you could like test it out virtually and then push a button and see it physically play out and there's like subtle differences and so you want to like make sure that that feedback loop is super easy to be able to test it live um and then sometimes like you would just feel it that it's right and intuitively know and then you'd also do we did user testing but it was very very often that like if we found it magical it would scale and be magical uh more broadly there were not too many cases where like like we were pretty decent about not like getting too you know geeking out or getting too attached to something that was super unique to us um but trying to kind of like you know put a customer hat on and does it truly kind of feel magical and so in a lot of ways we just gave a lot of um autonomy to the character team to really think about the you know character board and mood boards and storyboards and like what's the background of this character and how would they react um and they went through a process that's actually pretty familiar but now had to operate under these unique constraints um but the moment where it felt right um kind of took a fairly similar journey than like a character in an animated film actually it's quite cool well the the thing that's really important to me and i wonder if it's possible well i hope it's possible pretty sure it's possible is for me even though i know how it works to make sure there's sufficient randomness in the process yeah probably because it would be machine learning based that i'm surprised that i don't i'm surprised by certain reactions i'm surprised by certain communication maybe that's in a form of a question um were you surprised by certain things cozmo did like certain interactions yeah we made it intentionally like uh so that there would be some surprise and like a decent
amount of variability in how he'd respond in certain circumstances and so in the end like it's um this isn't general ai this is a giant like spectrum and library of like parametrized kind of emotional responses and an emotional engine that would like kind of map your current state of the game your emotions the world the people that are playing with you and so forth to what's happening um but we could make it feel spontaneous by creating enough diversity uh and randomness uh but still within the bounds of what felt like very realistic um to make that work and then what was really neat is that we could get statistics on how much of that space we were saturating um and then add more animations and more diversity in the places that would get hit more often so that you stay ahead of the um you know the curve and maximize the uh the chance that it it stays feeling alive um and so but then when you like combine it like the permutations and kind of like the combinations of emotions stitched together sometimes surprised us because you see them in isolation but when you actually see them and you see them live you know relative to some event that happened in the game or whatnot like it was kind of cool to see the combination of the two and um uh and not too different in other robotics applications where like you get you get so used to thinking about like the modules of a system and how things progress through a tech stack that the real magic is when all the pieces come together and you start getting the right emergent behavior um in a way that's easy to lose when you just kind of go too deep into any one piece of it yeah when the system is sufficiently complex there is something like emergent behavior and that's where the magic is you as a human being you can still appreciate the beauty of that magic of the fine at the system level first of all thank you for humoring me on this uh it's really really uh fascinating i think a lot of people would love this i i'd love to
just one last thing on the butter robot i promise in terms of uh speech yeah cozmo is able to communicate so much with just movement and face do you think speech is too much of a degree of freedom like is speech a feature or a bug of uh deep uh interaction emotional interaction yeah for a product it's too deep right now it's just not real uh it would immediately break the fiction because the state of the art is just not good enough um and that's on top of just narrowing down the demographic where like the way you speak to an adult versus the way you speak to a child is very different yet a dog is able to appeal to everybody and so right now there is no speech system that is like rich enough and and subtly realistic enough to feel appropriate um and so we very very quickly kind of like moved away from it now speech understanding is a different matter where understanding intent that's a really valuable input um but giving it back requires like a you know way way higher bar given kind of where today's world is and so that realization that you can do surprisingly much with uh either no speech or kind of tonal like the way you know wall-e r2d2 and kind of other characters are able to um it's quite powerful and it generalizes um across cultures and across ages really really well i think we're going to be in that world for a little while where it's still very much an unsolved problem on how to like make something it touches on the uncanny valley thing so if you have legs and you're a big humanoid looking thing you have very different expectations and a much narrower degree of what's going to be acceptable by society than if you're a you know robot like uh like cozmo or wall-e or some other form where you can kind of like reinvent the character speech has that same property where speech is so well understood um in terms of expectations by humans that you have far less flexibility on how to deviate from that and lean into your strengths and avoid weaknesses but i wonder if
there is obviously there's certain kinds of speech that activates the uncanny valley and breaks the illusion faster so i guess my intuition is we will solve certain we would be able to create some speech-based personalities sooner than others so for example i could i could think of a robot that doesn't know english and is learning english right yeah those kinds of personalities where you're like uh you're intentionally kind of like getting a toddler level of uh speech so that's exactly right so you can have like uh tie it into the experience where uh it is a more limited character or you embrace the lack of emotions as part or the lack of sorry dynamic range in the speech kind of capabilities emotions as like part of the character itself and you've seen that in like kind of fictional characters as well yeah um but that's why this podcast works and yeah like you kind of had that with like um i don't know i guess like you know data and some of the other yeah like um but yeah so you have to and that becomes a constraint that lets you meet the bar um see i i honestly think like also if you add uh drunk and angry that gives you more constraints that allow you to be dumber from an nlp perspective like there's certain aspects so if you modify human behavior like let's just so forget the sort of artificial thing where you don't know english toddler thing we if you just look at the full range of humans i think we there's certain situations where we put up with uh like lower level of intelligence in our communication like if somebody's drunk we understand this issue that they're probably under the influence like we understand that they're not going to be making any sense anger is another one like that i'm sure there's a lot of other kind of situation yeah maybe uh yeah again language lost in translation that kind of stuff that i think if you if you play with that uh what is it the ukrainian boy that passed the turing test you know play with those ideas i think that's really
interesting and then you can create compelling characters but you're right that's a dangerous sort of road to walk because uh you're adding degrees of freedom that can get you in trouble yeah and that's why like you have these um big pushes that like for most of the last decade plus like where you'd have like full like human replicas of robots really being down to like skin and like kind of in some places um my personal feeling is like man like that's not the direction that's most fruitful right now um beautiful art yeah it's not in terms of a uh rich deep fulfilling experience yeah you're right yeah and you're creating a minefield of potential places to feel off uh and then and then you're sidestepping where like the biggest kind of functional ai challenges are to actually have you know kind of like really rich productivity that actually kind of justifies a you know kind of the higher price points and that's that's part of the challenges like yeah like robots are going to get to like thousands of dollars tens of thousands of dollars and so forth but you can imagine what sort of expectation of value that comes with it um and so that's where you want to be able to invest the the the time and uh and depth and so going down the full human replica route um creates a gigantic uh uh distraction and really really high bar that can end up sucking up so much of your resources so it's weird to say but you happen to be one of the greatest at this point roboticist ever because you created this little guy you were part obviously of a great team that created the the little guy with a deep personality and you're now switching to an entirely well maybe not entirely but a different fascinating impactful robotics problem which is autonomous driving and more specifically the biggest version of autonomous driving which is autonomous trucking so you are at waymo now can you give us a big picture overview what is waymo what is waymo driver what is waymo one what is waymo via can you
give an overview of the company and the vision behind the company for sure waymo by the way it's just it's been eye-opening on just how incredible the people and the talent is and how in one company you almost have to create i don't know 30 companies worth of like technology and capability to like kind of solve the full spectrum of it so um yeah so i've been at waymo since um 2019 so about two and a half years so waymo is uh focused on building what we call a driver which is creating the ability to have autonomous driving across different environments vehicle platforms domains and use cases uh you know as you know got started in uh 2009 it was almost like an immediate successor to the grand challenge and urban challenges that were like incredible uh kind of catalyst for this whole space um and so google started this project and then eventually waymo spun out and so what waymo is doing is creating uh the systems both you know hardware software infrastructure and everything that goes into it to enable and to commercialize autonomous driving this hits on consumer transportation and ride sharing and kind of vehicles in urban environments and as you mentioned it hits on autonomous trucking to to transport goods so in a lot of ways it's transporting people and transporting goods um but at the end of the day the underlying capabilities that are required to do that are surprisingly better aligned than one might expect where it's the fundamentals of um of being able to understand the world around you process it make intelligent decisions and prove that we are at a level of safety that enables uh large-scale autonomy so from a branding perspective sort of uh waymo driver is the system that's irrespective of a particular uh vehicle it's operating in there you have a set of sensors that perceive the world can act in that world and move this whatever the vehicle is what's that vehicle platform that's right and so in the same way that you have a driver's license and
like your ability to drive isn't tied to a particular make and model of a car and of course there's special licenses for other types of vehicles but the fundamentals of a human driver very very largely carry over and then there's uniquenesses related to a particular environment or domain or a particular um vehicle type that kind of add some extra additive challenges but that's exactly right it's the underlying systems that enable uh a physical vehicle without a human driver to uh very successfully accomplish the tasks that previously um wasn't possible um without um you know 100% human driving and then there's waymo one which is the transporting people that's right from a brand perspective and just in case we refer to it so people know and then there's waymo via which is the trucking component why via by the way what is that is it just like a cool sounding name that just yeah uh like is there an interesting story there just it is a pretty cool sounding name it's a cool sounding name i mean when you think about it it's just like well we're gonna transport it via this and that like so it's just kind of like an allusion to um the mechanics of transporting something yes cool um and uh and it is a pretty good grouping and the interesting thing is that even the groupings kind of blur where waymo one is like human transportation and uh there's a fully autonomous service in the phoenix area that like every day is transporting people and it's pretty incredible to like just you know see that operate at reasonably large scale and just kind of happen and then on the via side it doesn't even have to be like long-haul trucking is a like a major focus of uh of ours but down the road you can stitch together the vehicle transportation as well for local delivery um also and a lot of the requirements for local delivery overlap very heavily with consumer transportation um obviously uh you know given that you're operating on a lot of the same
roads um and uh and navigating the same safety challenges and so um yeah and waymo very much is a multi-product company that has ambitions in both they have different challenges and both are tremendous opportunities but the cool thing is that there's a huge amount of leverage and this kind of core technology stack now gets pushed on by both sides and that adds its own unique challenges but the success case is that um the challenges that you push on um they get leveraged across all platforms and also from an engineering perspective the teams are integrated it's a mix so there's a huge amount of centralized kind of core teams that support all applications and so you think of something like the hardware team that develops the lasers the compute integrates into vehicle platforms this is an experience that carries over across um you know any application that we'd have and they have been flow with both then there's like really unique um perception challenges planning challenges like other you know types of challenges where there's a huge amount of leverage on a core tech stack but then there's like dedicated teams that think of how do you deal with a unique challenge for example an articulated trailer with varying loads that completely changes the physical dynamics of a vehicle that doesn't exist on a car but becomes one of the most important kind of unique new challenges on a truck so what's the long-term dream of waymo via uh the autonomous trucking effort that waymo is doing yeah so we're starting with developing uh l4 autonomy for class 8 trucks these are 53-foot trailers that capture like a pretty sizable percentage of the goods transportation in the country long term the opportunity is obviously to expand to much more diverse types of vehicles types of goods transportation and start to really expand in both the volume and the route feasibility that's possible and so just like we did on the car side you start with a single route with a very specific
operating kind of domain and constraints that allow you to solve the problem but then over time you start to really try to push against those boundaries and open up deeper feasibility across routes across surface streets across environmental conditions across the type of goods that you carry the versatility of those goods and how little supervision is necessary to just start to scale this network and long term there's actually it's a pretty incredible enabler where um you know today you have already a giant shortage of truck drivers it's uh over 80,000 truck driver shortage that's expected to grow to hundreds of thousands in the years ahead you have really really quickly increasing demand from e-commerce and just distribution of uh where people are located um you have one of the deepest safety challenges of um of any profession in the u.s where um there's a huge huge kind of challenge around fatigue and around kind of the long routes that are driven and even beyond kind of the cost and necessity of it there are fundamental constraints built into our logistics network that are tied to the type of human constraints and regulatory constraints that are tied to trucking today for example there are limits on how long a driver can be driving in a single day before they're not allowed to drive anymore which is a very important safety constraint what that does is it enforces limitations on how far jumps with a single driver could be and makes you very subject to availability of drivers which influences where warehouses are built which influences how goods are transported which influences costs and so um you start to have an opportunity on everything from plugging into existing fleets and brokerages and the existing logistics network and just immediately start to have a huge opportunity to add value from you know cost and driving fuel insurance and safety standpoint all the way to completely reinventing the logistics network um across the united states and enabling
something completely different than what it looks like today yeah i had uh it'll be published before this a great conversation with steve viscelli where we talked about the manual driving and he echoed many of the same things that you were talking about but we talked about much of the fascinating human stories of truck drivers he also was a truck driver for a bit as a grad student to try to understand the depth of the problem he's a fascinating wives we have some drivers that have 4 million miles of lifetime driving experience it's pretty incredible and um yeah it's uh yeah learning from them like some of them are on the road for 300 days a year it's a very unique type of lifestyle so there's fascinating stuff there just like you said there's a shortage of actually people uh truck drivers taking the job counter to what i think is publicly believed so there's an excess of jobs and a shortage of people to take up those jobs and just like you said it's such a difficult problem and these are experts at driving at solving this particular problem and it's fascinating to learn from them to understand you know how hard is this problem and that's the question i want to ask you from a perception from a robotics perspective what's your sense of how difficult is autonomous trucking maybe you can comment on which scenarios are super difficult which are more manageable is there a way to kind of convert into words how difficult the problem is yeah it's a good question so there's um and as you can expect it's a mix some things become a lot easier or at least more flexible um some things are harder and so you know on the things that are like uh the tailwinds the benefits um a big focus of automating trucking especially initially is really focusing on the long-haul freeway stretch of it where that's where a majority of the value is captured on a freeway you have a lot more structure and a lot more consistency across freeways across the u.s compared
to surface streets where you have a way higher dimensionality of what can happen lack of structure lack of consistency and variability across cities so you can leverage that consistency to tackle at least in that respect a more constrained ai problem which has some benefits to it um you can itemize much more of the sort of things you might encounter and so forth and so those are benefits is there a canonical freeway and city we should be thinking about like is there a standard thing that's brought up in conversation often like here's a stretch of road um what is it like when people talk about traveling across country they'll talk about new york to san francisco is that the route like is there a stretch of road that's like nice and clean and then there's like cities with difficulties in them that you kind of think of as the canonical problem to solve here right uh so starting with the car side um well waymo very intentionally picked the phoenix area and the san francisco area as a follow-on once we hit driverless where when you think of consumer transportation and ride sharing you know kind of economy a big percentage of that market is captured in the densest cities in the united states and so really pushing out and solving san francisco becomes a really huge opportunity and uh importance and um and you know places one dot on kind of like the spectrum of like kind of complexity uh the phoenix area starting with chandler and then like kind of expanding more broadly in the phoenix uh metropolitan area it's i believe the fastest growing city in the us it's a uh kind of a higher medium-sized city but growing quickly and still captures a really wide range of kind of like complexities and so getting to driverless there actually exposes you to a lot of the building blocks you need for the more complicated environments and so in a lot of ways there's a thesis that if you start to kind of place a few of these kind of dots where san francisco has these types of
unique challenges dense pedestrians all this like complexity especially when you get into the downtown areas and so forth and phoenix has like a really interesting kind of spectrum of challenges maybe you know other ones like la kind of add freeway focus and so forth you start to kind of cover the full set of features that you might expect and it becomes faster and faster if you have the right systems and the right organization to then open up the fifth city and the tenth city and the 20th city on trucking there's uh similar properties where um obviously there's uniquenesses in freeways when you get into really dense environments and then the real opportunity uh to then you know get even more uh value is to think about how you expand with like some of the surface street challenges but for example right now we're looking um we have a big facility that we're uh finishing building in q1 in uh dallas area um that'll allow us to do testing from the dallas area on routes like dallas to houston dallas to phoenix um going out east and dallas to austin so that triangle um waymo should come to austin well waymo the car side was in austin for a while yes i know yeah come back yeah but uh trucking is actually texas is one of the best places to start uh because of both volume regulatory weather there's a lot of benefits um on trucking a huge opportunity is port of la going east so in a lot of ways a lot of the work is to start to stitch together a network and converge to port of la where you have the biggest port in the united states um and the amount of goods going east from there is pretty tremendous and then obviously there's you know kind of channels everywhere and you have extra complexities as you get into like snow and inclement weather and so forth but um what's interesting about trucking is every single route segment that you add increases the value of the whole network and so it has this kind of network effect and cumulative effect that's very unique and so there's
all these dimensions that we think about um and so in a lot of ways dallas as a really unique hub that opens up a lot of options has become a really valuable lever so a million questions i gotta ask first of all you mentioned level four for people who totally don't know there's these levels of automation that uh level four refers to uh kind of the first step that you could recognize as fully autonomous driving level five is really fully autonomous driving level four is kind of fully autonomous driving and then there are specific definitions depending on who you ask what that actually means but for you what does the level four mean and you mentioned freeway let's say like there's three parts of long-haul trucking maybe i'm wrong in this but there's freeway driving there's like truck stop and then there's more urban-y type of area so which of those do you want to tackle which of them do you include under level four like how do you think about this problem what do you focus on where's the biggest impact to be had in the short term so the goal is to we got to get to market as fast as we can because the moment you get to market you just learn so much and it influences everything that you do and it is um uh i mean one of the experiences that carried over from before is that you add constraints you figure out the right compromises you do whatever it takes because getting to market like is so critical right and here with autonomous driving you can get to market in so many different ways that's right and so one of the simplifications that we intentionally have put on is using what we call transfer hubs where you can imagine depots uh that are uh at the entry points to metropolitan areas like let's say dallas like the hub that we're building which does a few things that are very valuable so from a first product standpoint you can automate transfer hub to transfer hub and that path from the transfer hub to the you know the full freeway route can be a
very intentional single route that you can select for the features that you feel you want to handle at that point in time then you build the hub specifically designed for autonomous trucking and that's what's going to happen actually like and you need to come out in january and check it out because it's going to be really cool not only is it our main operating um headquarters for our fleet there but it will be the first uh fully ground-up designed driverless hub for autonomous trucks in terms of where do they enter where do they depart how do you think about the flow of people goods everything it's like it's quite cool and it's really beautiful on how it's thought through and so early on it is totally reasonable to do the last five miles manually to get to the final kind of depot to avoid having to solve the general surface street problem which is obviously very complex now when the time comes and we're already pushing on some of this but we will increasingly be pushing on surface street capabilities to build out the value chain to go all the way depot to depot instead of transfer hub to transfer hub and we have probably the best advantages in the world because of all the waymo experience on surface streets but that's not the highest roi right now where the highest roi is hub to hub and get the routes going and so when you ask what's l4 l4 can be applied to any domain operating domain or scope but it's effectively for the places where we say we're ready for autonomous operation we are 100% operating uh as a self-driving truck with no uh human behind the wheel that is l4 autonomy and it doesn't mean that you operate in every condition it doesn't mean you operate on every road but for a particularly well-defined area uh operating conditions routes kind of domain you are fully autonomous and that's the difference between l4 and l5 and most people would agree that at least any time in the
foreseeable future l5 is just not even really worth thinking about because there's always going to be these extremes and so it's a race and almost like a game where you think of what is the sequence of expanded capabilities that create the most value and teach us the most and create this feedback loop where we're building out and unlocking more and more capability over time i gotta ask you just curious so first of all i have to when i'm allowed visit the dallas facility because it's super cool it's like robot on the giving and the receiving end the truck is a robot and the hub is a robot yeah it's got to be very robot friendly so yeah that's great i will feel at home uh what's the sensor suite like on the hub if you can just high level mention it does the hub have like lidars and like is the truck doing most of the intelligence or is the hub also intelligent yeah so most of it will be the truck and uh everything is like connected like so we uh we have our servers where we know exactly where every truck is we know exactly what's happening at a hub and so you can imagine like a large back-end system that over time starts to manage uh timings goods delivery windows all these sort of things and so you don't actually uh need to um there might be special cases where that is valuable to equip some sensors in the hub but a majority of the intelligence is going to be on the truck because um whatever is relevant to the truck should be seen by the truck and can be relayed uh remotely for any sort of kind of cognizance or decision making but there's a distinct type of workflow where um where do you check trucks where do you want them to enter what if there's many operating at once where's the staging area to depart how do you set up the flow of humans and human cars and traffic so that you minimize the interaction between humans and kind of self-driving trucks uh and then how do you even intelligently select the locations of these
transfer hubs that are both really great service locations for a metropolitan area and there could be over time many of them for a metropolitan area while at the same time leaning into the path of least resistance to lean into your current capabilities and strengths so that you minimize the amount of work that's necessary to unlock the next kind of big bar i have a million questions so first is the goal to have no human in the truck the goal is to have no human in the truck now of course right now we're testing with expert operators and so forth but um the goal is to um now there might be circumstances where it makes sense to have a human or uh and obviously these trucks can also be manually driven so sometimes like we talk with our fleet partners about how um you can buy a waymo equipped daimler truck down the road and on the routes that are autonomous it's autonomous on the routes that are not it's um human driven maybe there's l2 functionality that adds safety systems and so forth but as soon as we expand in software the availability of driverless routes the hardware is forward compatible to just now start using them um in real time and so you can imagine uh this mixed use but at the end of the day the largest value proposition is where you're um able to have no constraints on how you can operate this truck um and it's 100% autonomous with nobody inside oh that's amazing so the let me ask on the logistics front because you mentioned that also opportunity to revamp or build from scratch some of the ideas around logistics i don't want to throw too much shade but from talking to steve my understanding is logistics is not perhaps as great as it could be in the current uh trucking uh environment i'm not maybe you can break down why but there's probably competing companies there's just a mess maybe some of it is literally just it's old school like it's not computerized like truckers are almost like
contractors there's an independence and there's not a nice interface where they can communicate where they're going where they're at you know all those kinds of things and so there it just feels like there's so much opportunity to digitize everything to where you could optimize the use of human time optimize the use of all kinds of resources how much are you thinking about that problem how fascinating is that problem how difficult is it how much opportunity is there to revolutionize the space of logistics in autonomous trucking in trucking period it's pretty fascinating it's uh this is one of the most motivating aspects of all this where like yes there's like a mountain of problems that you have to solve to get to like the first checkpoints and first driverless and so forth and inevitably like in a space like this you plug in initially into the existing kind of system and start to kind of you know learn and iterate but um that opportunity is massive and so you know a couple of the factors that um play into it so first of all um there's obviously just the physical constraints of driving time driver availability some fleets have a 95% attrition rate you know right now because of just the demands and like you know kind of gaps in competition and so forth and then it's also incredibly fragmented where you would be shocked at like when you look at industries like when you think of the top 10 players like the biggest fleets like the walmarts and fedexes and so forth the percentage of the overall trucking market that's captured by the top 10 or 50 fleets is surprisingly small um the average kind of uh truck operation is like a one to five truck you know family business um and so there's just like a huge amount of like fragmentation which makes for um really interesting challenges in kind of stitching together through like bulletin boards and brokerages and some people run their own fleets and this world's kind of like evolving um
but it is one of the less digitized and optimized worlds that there is and the part that is optimized is optimized to the constraints of today and even within the constraints of today this is a 900 billion dollar industry in the u.s and it's continuing to grow it feels like from a business perspective if i were to predict that while trying to solve the autonomous trucking problem waymo might solve first the logistics problem like because that would already be a huge impact yeah so on the way to solving autonomous trucking the human driven like there's so much opportunity to significantly improve the human driven trucking the timing the logistics so you use humans optimally the handoffs the like you know well even that you i mean you get really ambitious you start to expand this beyond like how does the uh fulfillment center work and like how does the transfer hub work how does a warehouse work to i mean there's a lot of opportunities to start to automate these chains and um a lot of the inefficiency today is because like you have a delay like port of la has a bunch of ships right now waiting outside of it because they can't dock because there's not enough labor inside of the port of la that means there's a big backlog of trucks which means there's a big backlog of deliveries which means the drivers aren't where they need to be and so you have this like huge chain reaction and your feasibility of readjusting in this network is low because everything's tied to humans and manual kind of processes uh or distributed processes across a whole bunch of players and so one of the biggest enablers is um yes we have to solve autonomous trucking first and that by the way that's not like an overnight thing that's decades of continued kind of expansion and work but um the first checkpoint and the first route is like is not that far off but once you start enabling and you start to learn about how the constraints of autonomous trucking which are very different from the
constraints of human trucking and again strengths and weaknesses how do you then start to leverage that and rethink a flow of goods uh more broadly and this is where like the learnings of like really partnering with some of the largest fleets in the us and the sort of learnings that they have about the industry and the sort of needs that they have and what would change if you just like really broke this one constraint that like holds up the whole network or what if you enabled this other constraint that actually drives the roadmap in a lot of ways because um this is not like an all or nothing problem it's uh you know you start to kind of unlock more and more functionality over time which functionality most enables this optimization ends up being kind of part of the discussion but you're totally right like you fast forward to like you know five years ten years uh 15 years and you think about like very generalized capability of automation and logistics as well as the ability to like poke into how those handoffs work the efficiency goes far beyond just direct cost of today's like unit economics of a truck they go towards reinventing the entire system um in the same way that uh you know you see you know these other industries that uh like when you get to enough scale you can really rethink um how you build around your new set of capabilities not the old set of capabilities yeah i use the analogy metaphor or whatever that autonomous trucking is like email versus mail and then with email you're still doing the communication but it opens up all kinds of varieties of communication that you didn't anticipate that's right constraints are just completely different um and yeah there's definitely a property of that here um and we're also still learning about it because there is a lot of really um fascinating and sometimes really elegant things that the industry has done where there's companies whose entire existence is around despite the constraints optimizing as much
as they can out of it and those lessons do carry over but it's an interesting kind of merger of worlds to think about like well what if this was completely different how would we approach it and the interesting thing is that for a really really really long time it's actually going to be the merger between how to use autonomy and how to use humans that leans into each of their strengths yeah and then we're back to cozmo human robot interaction so and the interesting thing about waymo is because there's the passenger vehicle the transportation of humans and transportation of goods you could see over time they might kind of meld together more because you'll probably have like zero occupancy vehicles moving around so you have transportation of goods for short distances and then for slightly longer distances and then slightly longer and then there'll be this then you just see the difference between a passenger vehicle and a truck is just size and you can have different sizes and all that kind of stuff and at the core you can have a waymo driver that doesn't as long as you have the same sensor suite you can just think of it as one problem and that's why over time these do kind of converge where in a lot of ways a lot of the challenges we're solving are freeway driving which are going to carry over very well to the vehicles to the car side um but there are like then unique challenges like uh you have very different dynamics in your vehicle where you have to see much further out in order to have the proper like response time because you have an 80,000 pound fully loaded truck um that's a very very different type of braking profile than a car you have uh really interesting kind of dynamic limits because of the trailer where you actually it's very very hard to like physically like flip a car or do something like physically like most risk in a car is from just collisions um it's very hard to like in any normal operation to do something
other than like you know unless you hit something to actually kind of like roll over or something on a truck you actually have to drive much closer to the physical bounds of the safety limits um but you actually have like real constraints because you could uh you know you could have really interesting interactions between the cabin and the trailer yeah there's something called jackknifing if you turn you know too quickly you have roll risks and so forth and so we spend a huge amount of time understanding those boundaries and those boundaries change based on the load that you have which is also an interesting difference you have to propagate that through the algorithm so that you're leveraging your dynamic range but always staying within the safety bounds but understanding what those safety bounds are and so we have this like really cool test facility where we like take it to the max and actually imagine a truck with these giant training wheels on the back of the trailer and you're pushing it past the safety limits uh in order to like try to actually see where it rolls and so you define this high dimensional boundary which then gets captured in software to stay safe and actually do the right thing but uh it's kind of fascinating the sort of uh you know kind of challenges you have there um but then all of these things drive really interesting challenges from perception to um unique behavior prediction challenges and obviously in planner where you have to think about merging and creating gaps with a 53 foot trailer and so forth and then obviously the platform itself is very different where you have different numbers of sensors sometimes types of sensors and you also have unique blind spots that you have because of the trailer which you have to think about and so it's a really interesting spectrum and in the end you try to capture these special cases in a way that is cleanly augmentations of the existing tech stack because a majority of what
we're solving is actually generalizable to freeway driving um and different platforms and over time they all start to kind of merge ideally where the things that are unique are as minimal as possible and that's where you get the most leverage and that's why waymo can do you know take on two trillion dollar opportunities um and have nowhere near 2x the cost or investment or size in fact it's much much smaller than that because of the high degree of leverage so what kind of sensor suite can you speak to that uh that a long haul truck needs to have lidar vision how many what are we talking about here yeah so it's um more than the cars so very loosely you can think of it as like 2x but it varies depending on the sensor and so we have like dozens of cameras radar and then multiple lidar as well you'll see one difference where the cars have a central main sensor pod on the roof in the middle and then some kind of hood sensors for blind spots the truck moves to two main sensor pods on the outsides where you would typically have the mirrors next to the driver they effectively go as far out as possible um kind of up to the understanding of the front kind of on the cabin not all the way in the front but like kind of where the mirrors for the driver would be and so those are the main sensor pods and the reason they're there is because if you had one in the middle the trailer is higher than the cabin and you would be occluded with this like awkward wedge too much occlusion too much occlusion and so then you would add a lot of complexity to the software yeah to make up for that and just unnecessary components so many probably fascinating design choices really cool because you can probably bring up lidar higher and have it in the center or something you could have all kinds of choices you have to make the decisions here yeah that ultimately probably will define the industry right but by having two on the side there's actually multiple benefits so one is like um
you're just beyond the trailer so you can see fully flush with the trailer and so you eliminate most of your blind spot except for right behind the trailer um which is great because now the software carries over really well and the same perception system you use on the car side largely that architecture can carry over and you can retrain some models and so forth but you leverage it a lot it also actually helps with redundancy where there's a really nice built-in redundancy for all the lidar cameras and radar where you can afford to have any one of them fail and you're still okay and at scale every one of them will fail um and you will be able to detect when one of them fails because of the redundancy they're giving you the data that's inconsistent with the rest of that's right and it's not just like they no longer give data it could be like they're fouled or they stop giving data where some electrical thing gets cut or you know part of your compute goes down so what's neat is that like you have way more sensors part of it is field of view and occlusions part of it is redundancy and part of it is new use cases so there's um uh new types of sensors uh to optimize for long range and uh kind of the sensing horizon that we look for on our vehicles um that is unique to trucks because it actually is like kind of much like further out than um than a car but a majority are actually used across both cars and trucks and so we use the same compute the same uh fundamental baseline sensors cameras uh radar um imus and so you get a great leverage from all of the infrastructure and the hardware development as a result so what about cameras what role does so lidar is this rich set of information has its strengths um has some weaknesses camera is this rich source of information that has some strengths has its weaknesses what role does lidar play what role does vision cameras play in this beautiful problem of autonomous trucking ah it is
beautiful. There's so much that comes together.

And at which point do they come together?

Yeah, so let's start with lidar. Lidar has been one of Waymo's big strengths and advantages. We developed our own lidar in-house, we're many generations in, and in both cost and functionality it is the best in the space.

Which generation? I love versions that are increasing. Which version of the hardware stack is it at currently, officially, publicly?

So some parts iterate more than others, and I'm trying to remember on the sensor side. The entire self-driving system, which includes sensors and compute, is fifth generation.

I can't wait until there are iPhone-style announcements for new versions of the Waymo hardware.

Yeah, well, we try to be careful, because when you change the hardware it takes a lot to retrain the models and everything. We just went through that in going from the Pacificas to the Jaguars, and the Jaguars and the trucks have the same generation now. But yeah, the lidar is incredible, and Waymo has leaned into that as a strength. A lot of the near-range perception system, which obviously carries over a lot from the car side, uses lidar as a very prominent, primary sensor. But obviously everything has its strengths and weaknesses. In the near range, lidar is a gigantic advantage, and it has its weaknesses when it comes to occlusions in certain areas, rain and weather, things like that. But it's an incredible sensor, and it gives you incredible density, perfect location precision, and consistency, which is a very valuable property for being able to apply an ML approach.

Can you elaborate on consistency?

Yeah. When you have a camera, the position of the sun, the time of day, various other
properties can have a big impact: whether there's glare, the field of view, things like that.

So consistency of the signal in the face of a changing external environment?

Yeah, daytime, nighttime. It's about 3D physical existence, in effect. You're seeing beams of light that physically bounce off of something and come back, so whatever the conditions are, the shape of a sensor reading from a human, or from a car, or from an animal has a reliability to it, which ends up being valuable for the long tail of challenges. Now, lidar is the first sensor to drop off in terms of range. Ours has a really good range, but at the end of the day it drops off. So particularly for trucks, on top of the general redundancy that you want in the near range, with complements from cameras and radar for occlusions and complementary information and so forth, when you get to long range you have to be radar and camera primary, because your lidar data will fundamentally drop off after a certain point, and you have to be able to see objects further out. Now, cameras have incredible range: with a high-density, high-resolution camera, you can get data well past a kilometer, and that's potentially a huge value. But the signal drops off, the noise is higher, detecting is harder, classifying is harder, and, one more to think about, localizing is harder, because you can be off by two meters in where something is located a kilometer away, and that's the difference between being on the shoulder and being in your lane. So you have interesting challenges there that you have to solve, which have a bunch of approaches that come into it. Radar is interesting because it also has longer range than lidar, and it gives you speed information, so it becomes very, very useful for dynamic information: traffic flow,
vehicle motions, animals, pedestrians, just things that might be useful signals. And it helps with weather, where radar actually penetrates weather conditions better than other sensors. So it's interesting: we've started to converge towards not thinking about a problem as a lidar problem or a camera problem or a radar problem. It's a fusion problem, where these are all large-scale ML problems. You put data into the system, and in many cases you just look for the signals that might be present in the union of all of these and leave it to the system as much as possible to identify how to extract that. Then there are places where we have to intervene and actually include more, but no single sensor is in a great position to really solve this problem without a huge extra challenge.

That's fascinating. There's a question that's probably still an open question: at which point do you fuse them? Do you solve the perception problem for each sensor suite individually, the lidar suite and the camera suite, or do you do some kind of heterogeneous fusion, or do you fuse at the very beginning? Is there a good answer, or at least an inkling of intuition you can offer?

Yeah, so people refer to this as early fusion or late fusion. Late fusion might be that you have the camera pipeline and the lidar pipeline, and then when it gets to final semantics and classification and tracking, you fuse them together and figure out which one's best. There's more and more evidence that early fusion is important, and that is because late fusion does not allow you to pick up on the complementary strengths and weaknesses of the sensors. Weather is a great example: if you do late fusion, you have an incredibly hard problem for any single sensor in rain to solve that problem,
because you have reflections from the lidar, you have weird noise from the camera, and so on. But the combination of all of them can help you filter and get to the real signal, which then gets you as close as possible to the original stack, and lets you be much more fluid about the strengths and weaknesses. Your camera is much more susceptible to fouling on the actual lens from rain or random stuff, whereas you might be a little bit more resilient with other sensors. So there's an element of logic that always happens late in the game, but fusing early, especially as you move towards ML and large-scale data-driven approaches, just maximizes your ability to pull the best signal you can out of each modality before you start making constraining decisions that end up being hard to unwind late in the stack.

So how much of this is a machine learning problem? What role does ML, machine learning, play in this whole problem of autonomous driving, autonomous trucking?

It's massive, and it's increasing over time. If you go back to the Grand Challenge days, the early days of AV development, there was ML, but it was not the mass-scale data style of ML. It was learning models, but in a more structured way, and it was a lot of heuristic and search-based approaches and planning and so forth. You can make a lot of progress with these types of approaches across the board, an almost deceptive amount of progress. You can get pretty far, but then you start to really grind the further you get in some parts of the stack if you don't have an ability to absorb a massive amount of experience in a way that scales very sublinearly in terms of human labor and human attention. When you look at the stack, the perception side is probably the first to get really revolutionized by ML, and it goes back many years, because ML for
computer vision and these types of approaches took off, and that was a lot of the early push in deep learning. There's always a debate on the spectrum between end-to-end ML, which is a little bit too far, and how you architect it so that you have modules but enough ability to think about long-tail problems and so forth. But at the end of the day, you have big parts of the system that are very ML and data driven, and we're increasingly moving in that direction all the way across the board, including behavior. Even when it's not a gigantic ML problem that covers a giant swath end to end, more and more parts of the system have this property where you want to be able to put more data into it and it gets better. That has been one of the realizations: as you drive tens of millions of miles and try to solve new expansions of domains without regressing in your old ones, it becomes intractable for a human to approach that in the way that robotics has traditionally approached some elements of the tech stack.

So are you trying to create a data pipeline specifically for the trucking problem? How much leveraging of autonomous driving is there in terms of data collection, and how unique is the data required for the trucking problem?

So we use all the same infrastructure: labeling workflows, ML workflows, everything. That actually carries over quite well. We heavily reuse the data, even, where for almost every model that we have on a truck, we started with the latest car model.

Cool, so it's almost like a good background model.

Yeah, despite the different domain and the different numbers and positions of sensors, there are a lot of signals that carry over across driving. So it's almost like pre-training and getting a big boost out of the gate, where you can reduce the amount of data you need by a lot, and
it goes both ways, actually. So we're increasingly thinking about our data strategy in terms of how we leverage both of these. If you think about how other agents react to a truck, it's a little bit different, but the fundamentals of what other vehicles on the road will do have a lot of carryover. In fact, just to give you an example, we're constantly adding more data from the trucking side, but as of right now, for one of our models, behavior prediction for other agents on the road like vehicles, 85 percent of that data comes from cars, and a lot of that 85 percent comes from surface streets, because we just had so much of it and it was really valuable. We're adding in more and more, particularly in the areas where we need more data, but you get a huge boost out of the gate.

All the different visual characteristics of roads, lane markings, pedestrians, all of that is still relevant?

It's all still relevant. And then there are just the fundamentals: how you detect a car, does it really change that much whether you're detecting it from a car or a truck? The fundamentals of how a person will walk around your vehicle will change a little bit, but in the basics there's a lot of signal that, as a starting point for a network, can be very valuable. Now, we do have some very unique challenges, where there's a sparsity of events on a freeway. The frequency of events happening on a freeway, whether it's interesting objects in the road, or incidents, or even, as a human benchmark, how often a human has an accident on a freeway, is far more sparse than on a surface street. That leads to really interesting data problems, where you can't just drive infinitely to encounter all the different permutations of things you might encounter. So there you get into interesting tools like structured testing, data collection, data augmentation, and so forth.
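[Editor's sketch] The sparsity point above can be made concrete with a little arithmetic. The snippet below is only an illustration and not anything described in the conversation: it assumes events arrive as a Poisson process, and both rates (`SURFACE_MILES_PER_EVENT`, `FREEWAY_MILES_PER_EVENT`) are hypothetical numbers picked for the example.

```python
import math

# Hypothetical event rates, chosen only for illustration
# (these are NOT figures quoted in the conversation).
SURFACE_MILES_PER_EVENT = 10_000.0
FREEWAY_MILES_PER_EVENT = 1_000_000.0


def prob_at_least_one(miles_driven: float, miles_per_event: float) -> float:
    """Chance of seeing at least one event, assuming a Poisson process."""
    expected_events = miles_driven / miles_per_event
    return 1.0 - math.exp(-expected_events)


def miles_for_confidence(miles_per_event: float, confidence: float) -> float:
    """Miles needed so the chance of seeing at least one event reaches `confidence`."""
    return -miles_per_event * math.log(1.0 - confidence)


if __name__ == "__main__":
    rates = [("surface", SURFACE_MILES_PER_EVENT), ("freeway", FREEWAY_MILES_PER_EVENT)]
    for name, rate in rates:
        p = prob_at_least_one(100_000.0, rate)
        need = miles_for_confidence(rate, 0.95)
        print(f"{name}: P(>=1 event in 100k miles) = {p:.3f}; miles for 95% = {need:,.0f}")
```

Under this simple model, a 95 percent chance of seeing even a single event takes roughly three times the average number of miles between events, which is one way to see why driving alone cannot cover rare freeway situations and why structured testing and augmentation come in.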
And so there are really interesting technical challenges that push some of the research that enables these new suites of approaches.

What role does simulation play?

Really good question. Waymo simulates about a thousand miles for every mile it drives.

Across the board?

Across the board, yeah. So, for example, if we've driven over 20 million miles, that's over 20 billion miles in simulation.

Now, how do you use simulation?

It's multi-purpose. You use it for basic development: you want to make sure you have regression prevention and protection of everything you're doing. That's an easy one. When you encounter something interesting in the world, let's say there was an issue with how the vehicle behaved versus an ideal human, you can play that back in simulation and start augmenting your system and seeing how you would have reacted to that scenario with this improvement or this new area. You can create scenarios that become part of your regression set after that point. Then you start getting into hill climbing, where you say: I need to improve this system, I have these metrics that are really correlated with final performance, how do I know how well I'm doing? Operations, the actual physical driving, is the least efficient form of testing, and it's expensive and time consuming. So you grab a large-scale batch of historical data and simulate it to get a signal: over these last 100,000 miles, or just a random sample of 100,000 miles, how has this metric changed versus where we are today? You can do that far more efficiently in simulation than by just driving with that new system on board. And then you go all the way to the validation phase, where you actually see your human-relative safety, how well you're performing on the car side or the trucking side relative to a human. A lot of that safety case is actually driven by taking all of the
physical operational driving, which probably includes a lot of interventions where the driver took over just in case, and then you simulate those forward and see whether anything would have happened. In most cases the answer is no, but you can simulate it forward, and you can even start to do really interesting things where you add virtual agents to create harder environments. You can fuzz the locations of physical agents, you can muck with the scene, and you can stress test the scenario along a whole bunch of different dimensions. Effectively, you're trying to more efficiently sample this infinite-dimensional space and encounter the problems as fast as possible, because what most people don't realize is that the hardest problem in autonomous driving is actually the evaluation problem, in many ways, not the actual autonomy problem. If you could, in theory, evaluate perfectly and instantaneously, you could solve that problem in a really fast feedback loop quite well. The hardest part is being really smart about this suite of approaches: how can you get an accurate signal on how well you're doing, as quickly as possible, in a way that correlates to physical driving?

And in the evaluation problem, which metric are you evaluating towards? We're talking about safety. What are the performance metrics that we're talking about?

So in the end you care about end safety. That's what's deceptive: there are a lot of companies that have a great demo, and the path from a really great demo to being able to go driverless can be deceptively long, even when that demo looks like it's driverless quality. The difference is that the thing that keeps you from going driverless is not the stuff you encounter on a demo. It's the stuff that you encounter once in a hundred thousand miles or 500,000 miles. That is at the root of what is most challenging about going driverless, because any
issue you encounter, you can go and fix it, but how do you know you didn't create five other issues that you haven't encountered yet? Those were painful learnings in Waymo's history that led to us finally being able to go driverless in Phoenix, and they are now at the heart of how we develop. Evaluation is simultaneously evaluating final end safety, how ready you are to go driverless, which may be as direct as your human-relative collision rate for all these types of scenarios and severities, to make sure that you're better than a human bar by a good amount. But that's not actually the most useful for development. For development, it's much more analog metrics, and it's part of the art of finding the properties of driving that give you a way quicker signal, one that's more sensitive than a collision, that can correlate to the quality you care about and push the feedback loop to all of your development. A lot of these are, for example, comparisons to human drivers, manual drivers: how do you do relative to a human driver in various dimensions and various circumstances?

Can I ask a tricky question? If I brought you a truck, how would you test it? Say Alan Turing came along and said you can't tell if this one is a human driver or not.

Yeah, exactly.

But not the human, because humans are flawed. How do you actually know you're ready? Basically, how do you know it's good enough?

Yeah, and by the way, this is the reason why Waymo released the safety framework for the car side. One, it sets the bar so nobody cuts below it and does something bad for the field that causes an accident. Two, it's to start the conversation on framing what this needs to look like. We'll end up doing the same thing for the trucking side. There it ends up being a
different portfolio of approaches. There are easy things, like: are you compliant with all these fundamental rules of the road, like never driving above the speed limit? That's actually pretty easy. You can fundamentally prove that it's either impossible to violate that rule, or you can itemize the scenarios where that comes up, do a test, and show that you pass that test and therefore can handle that scenario. Those are traditional structured testing, system engineering approaches. Fault rates are another example: when something fails, how do you deal with it? You're not going to drive and randomly wait for it to fail. You're going to force a failure and make sure that you can handle it, on closed courses, in simulation, or on the road, and run through all the permutations of failures, which for some parts of the system, like hardware, you can oftentimes itemize. The hardest part is behavioral, where you have just infinite situations that could in theory happen, and you want to figure out the combinations of approaches that can work there. You can probably pass the Turing test pretty quickly even if you're not completely ready for driverless, because the events that are really hard will not happen that often. Just to give you a perspective, for a truck driver on a freeway, a serious event happens once every 1.3 million miles, and something that actually has a really serious injury is once every 28 million miles. Those are really rare, so you could have a driver that looks like it's ready to go, but you have no signal on what happens there. That's where you start to get creative with combinations of sampling and statistical arguments, focused structured arguments where you can simulate those scenarios and show that you can handle them, and metrics that are correlated
with what you care about but that you can measure much more quickly to get to the right answer. That's what makes it pretty hard. In the end, you end up borrowing a lot of properties from aerospace, from space shuttles and so forth, where you don't get the chance to launch a million times just to say you're ready, because it's too expensive to fail. So you go through a huge amount of structured approaches in order to validate it, and then by thoroughness you can make a strong argument that you're ready to go. This is actually a harder problem in a lot of ways, though, because with a space shuttle or an airplane you get to a fixed point, you freeze the software, you prove it, and you're good to go. Here you have to get to a driverless quality bar but then continue to aggressively change the software even while you're driverless.

And there's also the full range of the external environment. With the shuttle, you're basically testing the systems, the internal stuff, and you have a lot of control over the external stuff.

Yeah, and the hard part is: how do you know you didn't get worse at something that you just changed? So in a lot of ways the Turing test starts to fail pretty quickly, because you start to feel driverless quality pretty early in that curve. If you think about it, in most really good AV demos, maybe you'll sit there for 30 minutes, so you've driven 15 miles or something like that. The rate of issues that you need to have to go driverless, you won't even encounter.

So let's try something different, then. Let's try a different version of the Turing test, which is like an IQ test. There are these questions of increasing difficulty. They're designed so that you don't know them ahead of time. Nobody knows the answer
to them. So is it possible in the future to orchestrate basically an obstacle course, one that maybe changes every year? If you can pass these, they don't necessarily represent the full spectrum.

That's it, yeah. They won't be conclusive, but you can at least get a really quick read and filter.

Because you didn't know them ahead of time. I don't know, probably construction zones, failures, or driving anywhere in Russia.

Yeah: weather, cut-ins, dense traffic, merging, lane closures, animals, foreign objects on the road that pop out on short notice, mechanical failures, a sensor breaking, a tire popping, weird behaviors by other vehicles like a hard brake or something reckless that they've done, fouling of sensors by bugs or bird poop or something. You have these extreme conditions, like a nasty construction zone where everything shuts down and you have to get pulled to the other side of the freeway with a temporary lane. Those are the sort of conditions where we do that to ourselves: we itemize everything that could possibly happen to give you a starting point for how to think about what you need to develop. At the end of the day, there's no substitute for real miles. If you think of traditional ML, you know how there's a validation set where you hold out some data? Real-world driving is the ultimate validation set. In the end, that's the cleanest signal. But you can do a really good job of creating an obstacle course, and you're absolutely right: if there was such a thing as automating a readiness test, it would be these extreme conditions, like a red light runner, a really reckless pedestrian who's jaywalking, a cyclist who makes a really awkward maneuver. That's actually what keeps you from going
driverless. In the end, that is the long tail.

Yeah, and it's interesting to think about. The Turing test means a lot of things, but to me, in driving, the Turing test is exactly this validation set that is handcrafted. I don't know if you know him, but there's a guy named François Chollet. He thinks about how to design a test for general intelligence. He designs these IQ tests for machines, and the validation set for him is handcrafted, and it requires human genius or ingenuity to create a really good test. And you truly hold it out. It's an interesting perspective on the validation set, which is to make it as hard as possible: not a generic representation of the data, but the hardest stuff.

Yeah. It's like Go: you'll never fully itemize all the world states that you'll expand, so you have to come up with different approaches. This is where you start hitting the struggles of ML, where ML is fantastic at optimizing the average case. It's a really unique craft to think about how you deal with the worst case, which is what we care about in the AV space, when using an ML system on something that occurs super infrequently. You don't really care about the worst case on ads, because if you miss a few it's not a big deal, but you do care about it on the driving side. And you'll never fully enumerate the world, so you have to take a step back and abstract away: what are the signals that you care about, and the properties of a driver that correlate to defensive driving and avoiding nasty situations, so that even though you'll always be surprised by things you encounter, you feel good about your ability to generalize from what you've learned?

All right, let me ask you a tricky question. To me, the two companies that are building at scale some of the most incredible robots ever built are Waymo and Tesla. So there are
very distinct approaches, technically and philosophically, in these two systems. Let me ask you to play sort of devil's advocate, and then devil's advocate to the devil's advocate. It's a bit of a race. Of course, everyone can win, but if Waymo wins this race to level four, why would they win? What aspect of the approach do you think would be the winning aspect? And if Tesla wins, why would they win, and which aspect of their approach would be the reason? Just building some intuition, not from a business perspective or any of that, just technically. And we could summarize, and maybe you can correct me, some of the more distinct aspects: Waymo has a richer suite of sensors, with lidar and vision. Tesla has now removed radar, and they do vision only. Tesla has a larger fleet of vehicles operated by humans, so it's already deployed in the field in a larger, what do you call it, operational domain. And Waymo is more focused on a specific domain and growing it with fewer vehicles. Both are fascinating approaches, and I think both have a lot of brilliant ideas. Nobody knows the answer, so I'd love to get your comments on this lay of the land.

Yeah, for sure. So maybe I'll start with Waymo. And you're right, both are incredible companies, and just a gigantic respect for everything Tesla's accomplished and how they've pushed the field forward as well. On the Waymo side, there is a fundamental advantage in the fact that it is focused on and geared towards L4 from the very beginning. We've customized the sensor suite for it, the hardware, the compute, the infrastructure, the tech stack, and all of the investment inside the company. That's deceptively important, because there's a giant spectrum of problems you have to solve in order to really do this, from infrastructure to hardware to autonomy stack to the safety framework. And that's an advantage, because there's a reason why it's the fifth-generation hardware and why all of
those learnings went into the Daimler program. It becomes such an advantage because you learn a lot as you drive, and you optimize for the best information you have. But fundamentally, there's a big, big jump with every order of magnitude that you drive in numbers of miles, in what you learn, and in the gap from really decent progress for L2 and so forth to what it takes to actually go L4. At the end of the day, there's a feeling at Waymo that there's a long way to go. Nobody's won. But there are a lot of advantages in all of these buckets, where it's the only company that has shipped a fully driverless service that you can go and use, and it's at a decently sizable scale. Those learnings can feed forward into how to solve the more general problems.

You see this process. You've deployed in Chandler. You don't know the timeline exactly, but you can see the steps, and they seem almost incremental. It's become more engineering than totally blind R&D, because it works in one place, and then you move to another place and you grow it this way.

And just to give you an example, we fundamentally changed our hardware and our software stack almost entirely from what went driverless in Phoenix to what is the current generation of the system on both sides. The things that got us to driverless, even though it got to driverless way beyond human-relative safety, are fundamentally not well set up to scale in an exponential fashion without getting into huge scaling pains. Those learnings you just can't shortcut, so that's an advantage. There are a lot of open challenges to get through, technical and organizational: how do you solve problems that are increasingly broad and complex, like this work on multiple products? But there's the feeling that, okay, the ball's in our court. There's a head start there. Now we've got to go and solve it. And I think that
focus on L4, it's a fundamentally different problem if you think about it. Let's say we were designing an L2 truck that was meant to be safer and help a human. You could do that with far fewer sensors and far less complexity, and provide value very quickly, arguably with what we already have today, just packaged up in a good product. But you would take a huge risk in having a gap, even in compute and sensors, not to mention the software, to then jump from that system to an L4 system.

So it's a huge risk, basically. Allow me to be the person that plays devil's advocate and argue for the Tesla approach. What you just laid out makes perfect sense and is exactly right. There are some open questions here, which is that it's possible that investing more in faster data collection, which is essentially what Tesla's doing, will get us there faster, if the sensor suite doesn't matter as much and machine learning can do a lot of the work. This is the open question, the thing you mentioned before: how much of driving can be end-to-end learned? Obviously, the Waymo approach and the vision-only machine learning approach will both solve driving eventually. The question is one of timeline. What's faster?

That's right. And as you mentioned, if I were to make the opposite argument, what puts Tesla in the strongest position is data. That is their superpower, where they have access to real-world data, effectively with a safety driver, and they've found a way to get paid by safety drivers versus paying for safety drivers.

It's brilliant, right?

Yeah. But all joking aside, one, it is incredible that they've built a business that's incredibly successful, and that can now be a foundation to bootstrap really aggressive investment in the autonomy space. If you can do it, that's always an incredible advantage. And then the data aspect of it,
it is a giant amount of data, if you can use it the right way to then solve the problem. The ability to collect and filter through to the things that matter at real-world scale, across a large distribution, is huge. It's a big advantage. So then the question becomes: can you use it in the right way, and do you have the right software systems and hardware systems in order to solve the problem? And you're right that in the long term there's no reason to believe that pure camera systems can't solve the problem that humans obviously are solving with vision systems. But the question is when. It's a risk. There's no argument that it's not a risk, and it's already such a hard problem. So much of that problem, by the way, even beyond the perception side, some of the hardest elements of the problem, are on the behavioral side, decision making, and the long-tail safety case. If you are adding risk and complexity on the input side from perception, you're now making a really, really hard problem, which on its own is still almost insurmountably hard, even harder. So the question is just how much. And this is where you can easily get into a little bit of a trap, similar to how you evaluate how good an AV company's product is. You go and do a test run with them, a demo run, which they've optimized like crazy, and it feels good. Do you put any weight in that? You know that gap is still pretty large. Same thing on the perception case: the long tail of computer vision is really, really hard, and there are a lot of ways that it can come up. Even if it doesn't happen that often at all, when you think about the safety bar and what it takes to actually go full driverless, not incredible-assistance driverless but full driverless, that bar gets crazy high, and
not only do you have to solve it on the behavioral side, but now you have to push computer vision beyond, arguably, where it's ever been pushed. And so now, on top of the broader AV challenge, you have a really hard perception challenge as well. So there's perception, there's planning, there's human-robot interaction. To me, what's fascinating about what Tesla is doing, in this march towards Level 4, is that because it's in the hands of so many humans, you get to see video, you get to see humans. I mean, forget companies, forget businesses; it's fascinating for humans to be interacting with robots. That's incredible, and they're actually helping push it forward. Yeah, and that is valuable, by the way. Even for us, a decent percentage of our data is human driving. Yes. We intentionally have humans drive a higher percentage than you might expect, because that creates some of the best signals to train the autonomy. And so that has value on its own. So together we're learning about this problem in an applied sense, just like you had with Cozmo. When you're chasing an actual product that people are going to use, a robot-based product that people are going to use, you have to contend with the reality of what it takes to build a robot that successfully perceives the world and operates in the world, and what it takes to have a robot that interacts with other humans in the world. And that's, to me, one of the most interesting problems humans have ever undertaken, because in trying to create an intelligent agent that operates in a human world, you're also understanding the nature of intelligence itself. Like, how hard is driving? It's still not answered, to me. Yeah, I still don't understand all the subtle cues. Even little things like your interaction with a pedestrian, where you look at each other and just go, 'Okay, go.' Right? That's hard to do without a human driver, right? And you're missing that dimension. How do you communicate that? So
there's really, really interesting elements here. Now, here's what's beautiful: can you imagine, when autonomous driving is solved, how much of the technology foundation of that space can go and have tremendous, transformative impacts on other problem areas and other spaces that have subsets of these same problems? It's just incredible. It's both a pro and a con that autonomous driving is so safety critical. Once you solve it, it's beautiful, because there's so many applications that are a lot less safety critical. But the con of that is it's so hard to solve. And the same journalists that you mentioned, who get excited for a demo, are the ones who write long articles about the failure of your company if there's one accident that's based on a robot. Society is so tense and waiting for the failure of robots. You're in such a high-stakes environment; failure has such a high cost, and it slows down development. It slows down development, yeah. The team definitely noticed that once you go driverless, like we have in Phoenix, and you continue to iterate, your iteration pace slows down, because your fear of regression forces so much more rigor that, obviously, you have to find a compromise on, okay, how often do we release driverless builds? Because every time you release a driverless build, you have to go through this validation process, which is very expensive, and so forth. So it is interesting. It is just one of the hardest things. There's no other industry where you wouldn't release products way, way quicker when you start to provide even portions of the value that you provide. Healthcare, maybe, is the other one. That's right. But at the same time, we've gotten there, where you think of surgery, right? You have surgery; there's always a risk, but it's really,
really bounded. You know that there's an accident rate when you go out and drive your car today, right? And you know what the fatality rate in the U.S. is per year. We're not banning driving because there was a car accident. But the bar for us is way higher, and we hold ourselves very seriously to it, where you have to not only be better than a human, but you probably have to, at scale, be far better than a human, by a big margin. And you have to be able to really, really thoughtfully explain all of the ways that we validate, in a way that becomes very comfortable for humans to understand, because a bunch of jargon that we use internally just doesn't compute. At the end of the day, we have to be able to explain to society how we quantify the risk, and acknowledge that there is some non-zero risk, but that it's far above human-relative safety. Here's the thing, to push back a little bit and bring Cozmo back into the conversation. You said something quite brilliant at the beginning of this conversation that I think probably applies to autonomous driving. There's this desire to make autonomous cars much safer than human-driven cars. But if you create a product that's really compelling, and both the leadership and the engineers are able to explain, and the product itself can communicate intent, then I think people may be willing to put up with a thing that might be even riskier than humans, because they understand the value of taking risks. You mentioned the speed limit. Humans understand the value of going over the speed limit. Yeah. Humans understand the value of going fast through a yellow light. Yeah. When you're in Manhattan streets, pushing through crossing pedestrians, they understand that. I mean, this is a much more tense topic of discussion, so this is just me talking. In Cozmo's case, there was something about the way this particular robot communicated, the energy it brought, the intent it was able to
communicate to the humans, such that you understood that of course he needs to have a camera. Yeah. Of course he needs to have this information. And in that same way, to me, of course a car needs to take risks. Of course there's going to be accidents. If you want a car that never has an accident, have a car that just doesn't go anywhere. Yeah. But that's tricky, because that's not a robotics problem, like, not even, like, due to you, right? Obviously. So there's a big difference, though. Yeah. That's not a personal decision; you're also impacting, obviously, the rest of the road, and we're facilitating it, right? And so there's a higher ethical and moral bar, which obviously then translates, as a society and from a regulatory standpoint, into what comes out of it, where it's hard for us to ever see this even being a debate, in the sense that you have to be beyond reproach from a safety standpoint, because if you're wrong about this, you could set the entire field back a decade. Right. See, this is me speaking: I think if we look into the future, I personally believe that there will be less and less focus on safety. Still very, very high. Yeah. Meaning, after autonomy is very common and accepted; not so common that it's everywhere, but there has to be a transition. Because I think for innovation, just like you were saying, to explore ideas you have to take risks. And I think if autonomy in the near term is to become prevalent in society, people need to be more willing to understand the nature of risk, the value of risk. It's very difficult, you're right, of course, with driving, but that's the fascinating nature of it. It's a life-and-death situation that brings value to millions of people, so you have to figure out what we value about this world. How deeply do we want to avoid hurting other
humans? That's right. And there is a point where you can imagine a scenario where Waymo has a system that, even when it's beyond human-relative safety and provably, statistically will save lives, there is a thoughtful navigation of that fact versus society's readiness and perception, and education of society and regulators and everything else. It's multi-dimensional, and it's not a purely logical argument. But ironically, the logic can actually help with the emotions. And just like any technology, there's early adopters, and then there's a curve that happens after it. But eventually, celebrities: you get The Rock in a Waymo vehicle, and then everybody just calms down because The Rock likes it. Yeah. And it's an open question how this plays out. I mean, maybe we're pleasantly surprised, and people just realize that this is such an enabler of life and efficiency and cost and everything that there's a pull. At some point, I fully believe that this will go from a thoughtful movement and tiptoeing, a push, to society realizing how wonderful an enabler this could become, and it becomes more of a pull. It's hard to know exactly how that will play out. But at the end of the day, both the goods-transportation and the people-transportation sides of it have that property, where it's not easy. There's a lot of open questions and challenges to navigate, and there's obviously the technical problems to solve as a prerequisite. But they have such an opportunity, on a scale that very few industries in the last 20 or 30 years have even had a chance to tackle, that maybe we're pleasantly surprised by how quickly that tipping point, in a really short amount of time, actually turns into a societal pull to embrace the
benefits of this. Yeah, I hope so. It seems like in the recent few decades there have been tipping points for technologies where overnight things change, like from taxis to ride-sharing services. There's just shift after shift after shift that requires digitization and technology. I hope we're pleasantly surprised in this. So there's millions of long-haul trucks now in the United States. Do you see a future where there's millions of Waymo trucks, and maybe, just broadly speaking, Waymo vehicles, just like ants running around the United States' freeways and local roads? Yeah, and other countries too. You look back decades from now and it might be one of those things that just feels so natural, and then it becomes almost an interesting oddity that we had none of it decades earlier. It'll take a long time to grow and scale, and very different challenges appear at every stage. But over time, this is one of the most enabling technologies that we have in the world today. It'll feel like, you know, how was the world before the internet? How was the world before mobile phones? It's going to have that sort of a feeling to it, on both sides. It's hard to predict the future, but do you sometimes think about weird ways it might change the world, surprising ways? Obviously there's more direct ways, where it increases efficiency, it'll enable a lot of logistics optimizations, it will probably change our roadways and all that kind of stuff. But it could also change society in some interesting ways. Do you ever think about how it might change cities, how it might change lives, all that kind of stuff? Yeah, you can imagine where people live versus work becoming more distributed, because the pain of commuting becomes different, just easier. And there's a lot of options that open up. The layout of cities themselves, and how
you think about car storage and parking, obviously just enables a completely different type of experience in urban environments. I think there was a statistic that something like 30 percent of the traffic in cities during rush hour is caused by the pursuit of parking, or some really high stat. So those obviously open up a lot of options. Flexibility on goods will enable new industries and businesses that never existed before, because now the efficiency becomes more palatable. Goods delivery timing, consistency, and flexibility are going to change the way we distribute. The logistics network will change. The way we can then integrate with warehousing, with shipping ports: you can start to think about greater automation through the whole stack, and how the supply chain's ripples become much more agile, versus very grindy the way they are today, where adaptation is very tough and there's a lot of constraints that we have. I think it'll be great for the environment. It'll be great for safety, where probably about 95 percent of accidents today, statistically, are due to just inattention or things that are preventable with the strengths of automation. Yeah. And it'll be one of those things where industries will shift, but the net creation is going to be massively positive, and then we just have to be thoughtful about the negative implications that will happen in local places, and adjust for those. But I'm an optimist in general about the technology. You could argue a negative on any new technology, but you start to see that if there is a big demand for something like this, in almost all cases it's an enabling factor that's going to propagate through society. And particularly as life expectancies get longer and so forth, there's just a lot more need for a greater percentage of the population to be serviced with a high
level of efficiency, because otherwise we're going to have a really hard time scaling to what's ahead in the next 50 years. Yeah, and you're absolutely right. Every technology has negative consequences and positive consequences, and we tend to focus on the negative a little bit too much. In fact, autonomous trucks are often brought up as an example of artificial intelligence and robots in general taking our jobs. And as we've talked about briefly here, and we talked a lot with Steve, it is a concern that automation will take away certain jobs. It will create other jobs. So there's temporary pain, hopefully temporary, but pain is pain, and people suffer, and that human suffering is really important to think about. But trucking, I mean, there's a lot written on this, is, I would say, far from the thing that would cause the most pain. Yeah, there's even more positive properties about trucking, where not only is there just a huge shortage, which is going to increase. The average age of truck drivers is getting closer to 50, because younger people aren't wanting to come into it. They're trying to incentivize, lower the age limit, all these sorts of things. And the demand is just going to increase. And, it depends on the person, but in most cases the least favorable types of routes are the massive long-haul routes, where you're on the road, away from your family, 300-plus days a year. Steve talked about the pain of those kinds of routes from a family perspective. You're basically away from family. It's not just hours, you work insane hours, but it's also just time away from family. Right. And the obesity rate is through the roof, because you're just sitting all day. It's really, really tough. And that's also where the biggest safety risk is, because of fatigue. And so when you think of the gradual evolution of how trucking comes in, first of all, it's not overnight. It's going to take decades to
phase in. There's just a long, long, long road ahead. But the routes and the portions of trucking that are going to require humans the longest, and benefit the most from humans, are the short-haul and most complicated, more urban routes, which are also the more pleasant ones, with less continual driving time, more flexibility on geography and location, and you get to sleep at your own home. And very importantly, if you optimize the logistics, you're going to use humans much better. That's right. And thereby pay them much better, because one of the biggest problems is truck drivers currently are paid by how much they drive, so they really feel the pain of inefficient logistics. Yeah. Because if they're just sitting around for hours, which they often do, not driving, waiting, they're not getting paid for that time. That's right. So logistics has a significant impact on the quality of life of a truck driver. And a high percentage of trucks are empty because of inefficiencies in the system. Yeah. And the other thing is, when you increase the efficiency of a system like this, the overall net volume of the system tends to increase, right? The entire market cap of trucking is going to go up when the efficiency improves and facilitates both growth in industries and better utilization of trucking. And so that on its own just creates more and more demand. Of all the places where AI comes in and starts to really reshape an industry, this is one of those where there's just a lot of positives that, for at least any time in the foreseeable future, seem really lined up in a good way to come in and help with the shortage, and start to optimize for the routes that are most dangerous and most painful. Yeah. So this
is true for trucking, but if we zoom out broader, automation and AI, and technology broadly, I would say, but automation is a thing that has the potential in the next couple of decades to shift the kind of jobs available to humans. Yes. And so that results in, like I said, human suffering, because people lose their jobs. There's economic pain there, and there's also a pain of meaning. For a lot of people, work is a source of meaning. It's a source of identity, of pride: pride in getting good at the job, pride in craftsmanship and excellence, which is what truck drivers talk about. Yeah. But this is true for a lot of jobs. Is that something you think about as a roboticist, zooming out from the trucking thing? Do you think it will be harder to find activity and work that's a source of identity and meaning in the future? I do think about it, because you want to make sure that you worry about the entire system, not just the part autonomy plays in it, but what are the ripple effects of it down the road. And on enough of a time window, there's a lot of opportunity to put in the right policies and the right opportunities to reshape and retrain and find those openings. Just to give you a few examples: for both trucking and cars, we have remote assistance facilities that are there to interface with customers, monitor vehicles, and provide very focused assistance in areas where the vehicle may want to request help in understanding an environment. So those are jobs that get created and supported. I remember taking a tour of one of the Amazon facilities, where you've probably seen the Kiva Systems robots, these orange robots that have automated the warehouse picking and collecting of items in this really elegant and beautiful way. It's actually one of my favorite applications of robotics of all time. You know,
I think I came across the company in, like, 2006, and it was just amazing. And what was it, the warehouse, or was it the little transport thing? So basically, instead of a person going and walking around and picking the seven items in your order, these robots go and pick up a shelf and move it over into a row where the seven shelves that contain the seven items are lined up, and a laser or whatever points to what you need to get, and you go and pick it and place it to fill the order. So the people were fulfilling the final orders. What was interesting about that is, when I was asking them about the impact on labor when they transitioned that warehouse, the throughput increased so much that the jobs shifted towards the final fulfillment, even though the robots took over entirely the search for the items themselves. And the jobs stayed. That was actually roughly the same number of jobs that were necessary, but the throughput increased by, I think, over 2x or some amount. So you have these situations that are not zero-sum games in this really interesting way. And the optimist in me thinks that there's these types of solutions in almost any industry, where the growth that's enabled creates opportunities that you can then leverage. But you've got to be intentional about finding those and really helping make those links, because even if you make the argument that there's a net positive, locally there's always tough hits that you've got to be very careful about. That's right. You have to have an understanding of that link, because there's a short period of time, whether training is required, or just mental transition, or physical, or whatever is required, that's still going to be short-term pain. The uncertainty of it, there's families involved. I mean, it's exceptionally difficult on a human level, and you have to really think about that. You can't just look at economic metrics always. It's human
beings. That's right. And you can't even just take it as, okay, we need to subsidize or whatever, because there is an element of just personal pride, where the majority of people don't want to just be okay; they want to actually have a craft, like you said, and have a mission, and feel like they're having a really positive impact. And so my personal belief is that there's a lot of transferability in skill sets that is possible, especially if you create a bridge and an investment to enable it. And to some degree that's our responsibility as well in this process. You mentioned Kiva robots, Amazon. Let me ask you about the Astro robot, which, I don't know if you've seen it, Amazon has announced. It's a home robot that has a screen, looks awfully a lot like Cozmo, has, I think, a different vision probably. What are your thoughts about home robotics in this kind of space? There's been quite a bunch of home robots, social robots, that very unfortunately have closed their doors for various reasons. Perhaps they were too expensive, there's manufacturing challenges, all that kind of stuff. What are your thoughts about Amazon getting into this space? Yeah, we had some signs that they were getting into it long, long, long ago. Maybe they were a little too interested in Cozmo during our conversations. But they were also very good partners, actually, for us, as we integrated a lot of shared technology. But if I could also get your thoughts on this: you could think of Alexa as a robot as well. Yeah. Echo. Do you see those as fundamentally different, just because you can move and look around? Is that fundamentally different than the thing that just sits in place? It opens up options. But my first reaction is, I have my doubts that this one's going to hit the mark, because I think for the price point that it's at and the functionality and value propositions that
they're trying to put out, it's still searching for the killer application that justifies, I think it was like a $1,500 price point or somewhere around there. That's a really high bar. So enthusiasts and early adopters will obviously pursue it, but you have to really, really hit a high mark at that price point, which we always tried to. We were always very cautious about jumping too quickly to the more advanced systems that we really wanted to make, but that would have raised the bar so much. You have to be able to hit it in today's cost structures and technologies. The mobility is an angle that hasn't been utilized, but it has to be utilized in the right way. And so that's going to be the biggest challenge: can you meet the bar of what the mass-market consumer, think our neighbors, our friends, parents, would they find deep, deep value in this, at a mass scale that justifies the price point? I think that's in the end one of the biggest challenges for robotics, especially consumer robotics, where you have to meet that bar. It becomes very, very hard. And there's also the higher bar, just like you were saying with Cozmo, of a thing that can look one way and then turn around and look at you. That's either a super desirable quality or a super undesirable quality, depending on how much you trust the thing. That's right. And so there's a problem of trust to solve there. There's a problem of personality. It's the quote-unquote problem that Cozmo solved so well, which is that you trust the thing. Yeah. And that has to do with the company, with the leadership, with the intent that's communicated by the device and the company and everything together. Yeah, exactly right. And I think they also have to retrace some of the learnings on the character side, where, as usual, I think that's the place where
a lot of companies are great at the hardware side of it and can think about those elements. And then there's the thinking about the AI challenges, particularly with the advantage of Alexa, which is a pretty huge boost for them. The character side of it, for technology companies, is pretty novel territory, and so that will take some iterations. But yeah, I hope there's continued progress in the space and that thread doesn't go dormant for too long. It's going to take a while to evolve into the ideal applications. But this is one of Amazon's, I guess you could call it, it's definitely part of their DNA, but in many cases it's also a strength, where they're very willing to iterate aggressively and move quickly. And take risks. And take risks. You have deep pockets, so you can. Yeah, and they'll maybe have more misfires than an Apple would. But it's different styles and different approaches. And at the end of the day, there's a few familiar elements there, for sure, which was, you know, homage is one way to put it. Yeah. So why is it so hard, at a high level, to build a robotics company, a robotics company that lives for a long time? I thought Cozmo for sure would live for a very long time. That, to me, was an exceptionally successful vision and idea and implementation. iRobot is an example of a company that has pivoted in all the right ways to survive, and arguably thrive, by focusing on having a driver that constantly provides profit, which is the vacuum cleaner. And of course there's Amazon; what they're doing is almost taking risks that they can afford, because they have other sources of revenue. Right. But outside of those examples, most robotics companies fail. Yeah. Why do they fail? Why is it so hard to run a robotics company? iRobot's
impressive because they found a really, really great fit of where the technology could satisfy a really clear use case and need, and they did it well, and they didn't try to overshoot from a cost-to-benefit standpoint. Robotics is hard because it tends to be more expensive, and it combines way more technologies than a lot of other types of companies do. If I were to say one thing that is maybe the biggest risk in a robotics company failing, it's that it can be either a technology in search of an application, or they try to bite off an offering that has a mismatch in price to function, and the mass-market appeal just isn't there. And consumer products are just hard. After all the years in it, I definitely feel a lot of the battle scars, because not only do you have to hit the function, but you have to educate and explain, get awareness up, deal with different kinds of consumers. There's a reason why a lot of technologies sometimes start in the enterprise space and then continue forward in the consumer space. You see AR starting to make that shift with HoloLens and so forth. In some ways, with consumers and the price points that they're willing to be attracted to in a mass-market way, and I don't mean 10,000 enthusiasts bought it, but I mean 2 million, 10 million, 50 million, mass-market interest, that bar is very, very high. And typically robotics is novel enough and non-standardized enough that it pushes on price points so much that you can easily get out of range, where the capabilities of today's technology, or just a function that was picked, just doesn't line up. And so that product-market fit is very important. So the space of killer apps, or rather super compelling apps, is much smaller, because it's easy to get outside the price range. Yeah. Of most consumers. And
it's not constant, right? That's why we picked off entertainment, because the quality was just so low in physical entertainment that we felt we could leapfrog that and still create a really compelling offering at a price point that was defensible, and that proved out to be true. And over time that same opportunity opens up in healthcare, in home applications, in commercial applications, and in broader, more generalized interfaces. But there's missing pieces in order for that to happen, and all of those have to be present for it to line up. And we see these sorts of trends in technology, where technologies that start in one place evolve and grow to another. Some things start in gaming, some things start in space or aerospace, and then move into the consumer market. And sometimes it's just a timing thing, right? How many stabs at what became the iPhone were there over the 20 years before, that just weren't quite ready in the function relative to the price point and complexity? And sometimes it's a small detail of the implementation that makes all the difference, which is design. Design is so important. Yeah, the new-generation UX, right? Yeah. And it's tough, and oftentimes all of them have to be there, and it has to be a perfect storm. But yeah, history repeats itself in a lot of ways in a lot of these trends, which is pretty fascinating. Well, let me ask you about the humanoid form. What do you think about the Tesla Bot and humanoid robotics in general? Obviously, to me, autonomous driving, Waymo and the other companies working in the space, that seems to be a great place to invest in a potentially revolutionary, focused robotics application. What's the role of humanoid robotics? Do you think Tesla Bot is ridiculous? Do you think it's super promising? Do you think it's interesting, full of mystery,
nobody knows? What do you think about this thing? Yeah, I think today humanoid-form robotics is research. There's very few situations where you actually need a humanoid form to solve a problem. If you think about it, wheels are more efficient than legs. Joints and degrees of freedom beyond a certain point just add a lot of complexity and cost. So if you're doing a humanoid robot, oftentimes it's in the pursuit of a humanoid robot, not in the pursuit of an application, for the time being, especially when you have the gaps in interface and AI that we talk about today. But anything Elon does, I'm interested in following. So there's an element of that where, no matter how crazy it is, I'll pay attention. I'm curious to see what comes out of it. So you can't ever ignore it. But it's definitely far afield from their core business, obviously. What was interesting to me, and I've disagreed with Elon a lot about this, is that to me the compelling aspect of the humanoid form, and of a lot of robots, Cozmo for example, is the human-robot interaction part. From Elon Musk's perspective, the Tesla Bot has nothing to do with the human; it's a form that's effective for the factory, because the factory is designed for humans. But to me, the reason you might want to argue for the humanoid form is because, you know, at a party, it's a nice way to fit into the party. The humanoid form has a compelling notion to it, in the same way that Cozmo is compelling. I would argue, if we were arguing about this, that it's cheaper to build a Cozmo-like form. But if you wanted to make an argument, which I have with Jim Keller, you could actually make a humanoid robot for pretty cheap. It's possible. And then the question is, all right, if you're using an application where it can be flawed, it can have a personality
and be flawed in the same way that cozmo is that maybe it's interesting for integration to human society that to me is an interesting application of a humanoid form because humans are drawn like i mentioned to you legged robots we're drawn to legs and limbs and body language and all that kind of stuff and even a face even if you don't have the facial features which you might not want to have to reduce the creepiness factor all that kind of stuff but yeah that to me the humanoid form is compelling but in terms of that being the right form for the factory environment i'm not so sure yeah for the factory environment like right off the bat um what are you optimizing for is it strength is it mobility is it versatility right like that changes completely the look and feel of the robot that you create you know and uh almost certainly the human form is over designed for some dimensions and constrained for some dimensions and so like what are you grasping is it big is it little right so you would customize it and make it um customizable um for the different needs if that was the optimization right and then you know for the other one uh i could totally be wrong you know i still feel that the closer you try to get to a human the more you're subject to the um biases of what a human should be and you lose flexibility to shift away from your weaknesses uh and towards your strengths and that changes over time but there's ways to make really approachable and natural interfaces for robotic kind of characters and you know and uh you know and kind of deployments in these applications that do not at all look like a human directly but that actually creates way more flexibility and capability and role and forgiveness and interface and everything else yeah it's interesting but i'm still confused by the magic i see in legged robots yeah so there is a magic so i'm uh absolutely amazed at it from a technical curiosity standpoint and like the magic that like 
the boston dynamics team can do from uh you know like from walking and jumping and so forth now like there's been a long journey to try to find an application for that sort of um technology but wow that's incredible technology right yes so then you kind of go towards okay are you working back from a goal of what you're trying to solve or are you working forward from a technology and then looking for a solution and i think that's where um it's a kind of a bi-directional search oftentimes but the two have to meet and that's where humanoid robots is kind of close to that in that like it is a decision about a form factor and a technology that it forces um that doesn't have a clear justification on why that's the killer app or you know from the other end but i think the core fascinating idea with the tesla bot is the one that's carried by waymo as well is when you're solving the general robotics problem of perception control where there's the very clear application of driving it's as you get better and better at it when you have like waymo driver yeah the whole world starts to kind of look like a robotics problem so it's very interesting for now detection classification segmentation tracking planning like it carries yeah so there's no reason i mean i'm not speaking for waymo here but you know um moving goods there's no reason transformer like this thing couldn't you know uh take the goods up an elevator you know yeah like that like uh slowly expand yeah what it means to move goods and expand more and more of the world uh into a robotics problem well that's right and you start to like think of it as an end to end robotics problem from like loading from you know from everything yes and even like the truck itself um you know today's generation is integrating into today's understanding of what a vehicle is right the pacifica jaguar uh the freightliners from daimler there's nothing that stops us from like down the road after 
like starting to get to scale to like expand these partnerships to really rethink what would the next generation of a truck look like um that is actually optimized for autonomy not for today's world um and maybe that means a very different type of trailer maybe that like there's a lot of things you could rethink on that front which is on its own very very exciting let me ask you like i said you went to the mecca of robotics which is cmu carnegie mellon university you got a phd there so maybe by way of advice and maybe by way of story and memories what does it take to get a phd in robotics at cmu and maybe you can throw in there some advice for people who are thinking about doing work in artificial intelligence and robotics and are thinking about whether to get a phd it's like i actually went i was at cmu for undergrad as well and didn't know anything about robotics coming in and was doing you know electrical computer engineering computer science and really got more and more into kind of ai and then fell in love with autonomous driving and at that point like that was just by a big margin like such an incredible like central spot of development of investment in that area and so what i would say is that like robotics like for all the progress that's happened is still a really young field there's a huge amount of opportunity now that opportunity shifted where something like autonomous driving has moved from being very research and academics driven to being commercial driven where you see the investments happening in commercial now there's other areas that are much younger and you see like kind of grasping and manipulation making kind of the same sort of journey that like autonomy made and there's other areas as well what i would say is the space moves very quickly anything you do a phd in like it is in most areas will evolve and change as technology changes and constraints change and hardware changes and the world changes um and so the beautiful thing about robotics is it's 
super broad it's not a narrow space at all and it can be a million different things in a million different industries and so uh it's a great opportunity to come in and get a broad foundation on ai machine learning computer vision systems hardware sensors all these separate things you do need to like go deep and find something that you're like really really passionate about obviously like just like any phd this is like a five six year kind of uh endeavor and you have to love it enough to go super deep to learn all the things necessary to be super deeply functioning in that area and then contribute to it in a way that hasn't been done before and in robotics that probably means um more breadth because robotics is rarely kind of like one particular kind of narrow technology and it means being able to collaborate with teams where like one of the coolest aspects of like the experience that i kind of cherish in my phd is that we actually had a pretty large av project that for that time was like a pretty serious initiative where you got to like partner with a larger team and you had the experts in perception and the experts in planning and the staff and the mechanical challenge um so i was working on a project called upi back then uh which was basically the off-road version of the darpa challenge it was a darpa funded project for basically like a large off-road vehicle that you would like drop and then give it a waypoint 10 kilometers away and it would have to navigate a completely unstructured off-road environment yeah so like forest ditches rocks vegetation and so it was like a really really interesting kind of a hard problem where like wheels would be up to my shoulders it's like gigantic right yeah by the way av for people stands for autonomous vehicles autonomous vehicles yeah sorry um and so what i think is like the beauty of robotics but also kind of like the expectation is that um there's um spaces in computer science where you can be very very narrow and 
deep robotics one of the necessities but also the beauty of it is that it forces you to be excited about that breadth and that partnership across different disciplines that enable it but that also opens up so many more doors where you can go and you can do robotics in almost any category where robotics isn't really an industry it's like it's like ai right it's like the application of physical automation to uh you know to all these other worlds and so you can do robotic surgery you can do vehicles you can do factory automation you can do healthcare or you can do like uh leverage the ai around the sensing to think about static sensors and scene understanding so um so i think that's got to be the expectation and the excitement and it breeds people that are probably a little bit more collaborative and more excited about um working in teams uh if i could briefly comment on the fact that the robotics people i've met in my life from cmu and mit they're really happy people yeah because i think it's the collaborative thing yeah i think you're not like sitting in like the fourth basement uh exactly which when you're doing machine learning purely software it's very tempting to just disappear into your own hole yeah and never collaborate and that breeds a little bit more of the silo mentality of like i have a problem it's almost like negative to talk to somebody else or something like that but robotics folks are just very collaborative very friendly just and there's also an energy of like you get to confront the physics of reality often which is humbling and also exciting so it's humbling when it fails and exciting when it finally works it's like the purity of the passion you got to remember that like right now like robotics and ai is like just all the rage and autonomous vehicles and all this like 15 years ago and 20 years ago like it wasn't that deeply lucrative people went into robotics they did it because they thought it was 
just the coolest thing in the world to like make physical things intelligent in the real world and so there's like a raw passion where they went into it for the right reasons and so forth and so it's really great space and that organizational challenge by the way like um when you think about the challenges in av we talk a lot about the technical challenges the organizational challenge is through the roof where um you think about what it takes to build an av system and you have companies that are now thousands of people and um you know you look at other really hard technical problems like an operating system it's pretty well established like you kind of know that there's a file system there's virtual memory there's this there's that there's like caching and like and there's like a really reasonably well established modularity and apis and so forth and so you can kind of like scale it in an efficient fashion that doesn't exist anywhere near to that level of maturity in autonomous driving right now and tech stacks are being reinvented organizational structures are being reinvented you have problems like pedestrians that are not isolated problems they're part sensing part behavior prediction part planning part evaluation and like one of the biggest challenges is actually how do you solve these problems where the mental capacity of a human is starting to get strained on how do you organize it and think about it where you know you have this like multi-dimensional matrix that needs to all work together and so that makes it kind of cool as well because it's not like solved at all uh from you know like what does it take to actually scale this right and then you look at like other gigantic challenges that have you know that have been successful and are way more mature there's a stability to it and like maybe the autonomous vehicle space will get there but right now just as many uh technical challenges as there are there are like 
organizational challenges and how do you like solve these problems that touch on so many different areas and efficiently tackle them while like maintaining progress among all these constraints um while scaling by way of advice what advice would you give to uh somebody thinking about doing a robotics startup you mentioned cozmo somebody that wanted to carry the cozmo flag forward the anki flag forward looking back at your experience looking forward to the future that will obviously have such robots what advice would you give to that person yeah it was the greatest experience ever and it's like there's things you learn navigating a startup that you'll never like it's very hard to encounter that in like a typical kind of work environment and um and it's just it's wonderful you got to be ready for it it's not as good like you know the glamour of a startup there's just like just brutal emotional swings up and down and so um having co-founders actually helps a ton like i cannot imagine doing it solo but having at least somebody where on your darkest days you can kind of like really openly just like have that conversation and you know lean on to somebody that's in the thick of it with you helps a lot what i would say what was the nature of darkest days and the emotional swings is it worried about the funding is it worried about whether any of your ideas are any good or ever were good is it like the self-doubt uh is it like facing new challenges that have nothing to do with the technology like organizational human resources that kind of stuff what yeah you come from a world in school where you feel that uh you put in a lot of effort and you'll get the right result and input translates proportional to output and you know you need to solve the set or do whatever and just kind of get it done now phd tests that a little bit but at the end of the day you put in the effort you tend to like kind of come out with enough 
results to kind of get a phd in the startup space like you know like you could talk to 50 investors and they just don't see your vision and it doesn't matter how hard you kind of tried and pitched you could uh work incredibly hard and you have a manufacturing defect and if you don't fix it you're out of business um you need to raise money by a certain date and you got to have this milestone in order to like have a good pitch and you do it you have to have this talent and you just don't have it inside the company or um you know you have to get 200 people or however many people kind of like along with you and kind of buy in the journey um you're like disagreeing with an investor and they're your investors so it's just like you know it's like there's no walking away from it right so um and it tends to be like those things where you just kind of get clobbered in so many different ways that like things end up being harder than you expect and it's like such a gauntlet but you learn so much in the process and there's a lot of people that actually end up rooting for you and helping you like from the outside and you get great mentors and you like find fantastic people that step up in the company and you have this like magical period where everybody's like it's life or death for the company but like you're all fighting for the same thing and it's the most satisfying kind of journey ever um the things that make it easier and that i would recommend is like be really really thoughtful about the application like there's a saying of like kind of you know team and execution and market and like kind of how important are each of those um and oftentimes the market wins and you come at it thinking that if you're smart enough and you work hard enough and you have the right talented team and so forth like you'll always kind of find a way through and um it's surprising how much dynamics are driven by the industry you're in and 
the timing of you entering that industry um and so just uh waymo is a great example of it there is i don't know if there'll ever be another company or suite of companies that has raised and continues to spend so much money at such an early uh phase of revenue generation and product and productization um the you know from a p&l standpoint uh like it's an anomaly like by any measure of any industry that's ever existed um except for maybe the u.s space program uh like right uh like but it's like uh multiple trillion dollar opportunities which is so unusual to find that size of a market that just the progress that shows the de-risking of it you could apply whatever discounts you want off of that trillion-dollar market and it still justifies the investment that is happening because like being successful in that space makes all the investments feel trivial now by the same consequence like the size of the market the size of the target audience the ability to capture that market share how hard that's going to be who the incumbents are like that's probably one of the lessons i appreciate like more than anything else where like those things really really do matter and um oftentimes can dominate the quality of the team or execution because if you miss the timing or you do it in the wrong space you run into like the institutional kind of headwinds of a particular environment like let's say you have the greatest idea in the world but you barrel into healthcare but it takes 10 years to innovate in healthcare because of a lot of challenges right like there's fundamental uh laws of physics that you have to think about and so um the combination of like anki waymo kind of drives that point home for me where you can do a ton if you have the right market the right opportunity the right way to explain it and you show the progress in the right sequence it actually can really significantly change the course of your journey in a startup how much of it is understanding the market and how 
much of it's creating a new market so how do you think about like space robotics is really interesting you said exactly right the space of applications is small yeah you know relative to the cost involved so how much is like truly revolutionary thinking about like what is the application and then yeah but so creating something that didn't exist it didn't really exist like this is pretty obvious to me the whole space of home robotics just everything that cozmo did i guess you could talk about it as a toy and people will understand it but cozmo is much more than a toy yeah and i don't think people fully understand the value of that you have to create it and the product will communicate it like just like the iphone nobody understood the value of no keyboard and a thing that can do web browsing i don't think they understand the value of that until you create it yeah having a foot in the door and an entry point still helps because at the end of the day like an iphone replaced your phone and so it had a fundamental purpose and all these things that it did better right sure and so then you could do abc on top of it and uh and then like you even remember the early commercials where it's always like one application of what it could do and then you get a phone call right and so that was intentionally sending a message something familiar but then like yes you can send a text message you can listen to music you can surf the web right and so you know autonomous driving obviously anchors on that as well you don't have to explain to somebody the functionality of an autonomous truck right like there's nuances around it but the functionality makes sense um in the home you have a fundamental advantage like we always thought about this because it was so painful to explain to people what our products did and how to communicate that super cleanly especially when something was so experiential and so you compare like anki to nest nest um had some beautiful products where 
they started scaling and like actually found like really great success and they had like really clean and beautiful marketing messaging because they anchored on reinventing existing categories where it was a smart thermostat right and uh like and so you kind of are able to um take what's familiar anchor that understanding and then explain what's better about it that's funny you're right cozmo is like totally new thing like what is this thing because we struggled we spent like a lot of money on marketing we had a hard like we fought we actually had far greater efficiency on cozmo than um anything else because we found a way to capture the emotion in some little shorts to kind of lean into the personality in our marketing and it became viral where like we had these kind of videos that would like go and get like hundreds of thousands of views and like kind of like get spread and sometimes millions of views and so um but it was like really really hard um and so finding a way to kind of like anchor on something that's familiar but then grow into something that's not um is an advantage but then again like you don't have like there's successes otherwise like alexa never had a comp right uh you could argue that that's very novel and very new and um and there's a lot of other examples that kind of created a kind of a category out of like kiva systems i mean they like came in and they like uh enterprise is a little easier because if you can uh it's less susceptible to this because if you can argue a clear value proposition it's a more logical conversation that you can have um with customers it's not it's a little bit less emotional and um kind of subjective but yeah in the home you have to yeah so like a home robot it's like what does that mean yeah and so then you really have to be crisp about the value proposition and what like really makes it worth it like and we by the way went through that same ordeal we almost like we almost hit a wall coming out of 2013 
where we were so big on explaining why our stuff was so high-tech and all the kind of like great technology in it and how cool it is and so forth um to having to make a super hard pivot on why is it fun and why does the random kind of family of four need this right like so it's learnings but that's the challenge and i think like robotics tends to sometimes fall into the new category problem but then you gotta be really crisp about why it needs to exist well i think some of robotics depending on the category depending on the application is a little bit of a marketing uh challenge and i don't mean i mean it's the kind of marketing that waymo is doing that tesla is doing is like showing off incredible engineering incredible technology but convincing like you said a family of four that this is like this is transformative for your life this is fun this is you don't care about tech isn't your thing they don't they really don't like they need to know why they want it so some of that is just marketing yeah that's why like roomba like um yes they didn't you know like go and you know have this like you know huge huge con you know ramp into like the entirety of like kind of a robotics and so forth but like they built a really great business and um uh in a vacuum cleaner world and like everybody understands what a vacuum cleaner is um most people are annoyed by doing it um and now you have one that like kind of does it itself uh yeah various degrees of quality but that is so compelling that like it's easier to understand and like uh and they had a very kind of and i think they have like 15 percent of the vacuum cleaner market so it's like pretty successful right i think we need more of those um types of thoughtful stepping stones in robotics but the opportunities are becoming bigger because hardware's cheaper compute's cheaper cloud's cheaper and ai's better so there's a lot of opportunity if we zoom out from specifically 
startups and robotics what advice do you have to uh high school students college students about career and living a life that you can be proud of you lived one heck of a life you're very successful in several domains um if you can convert that into a generalizable potion what advice would you give yeah it's a very good question so it's very hard to go into a space that you're not passionate about and push like push hard enough to be you know to like maximize your potential uh in it and so there's a um there's always kind of like the saying of like okay follow your passion great try to find the overlap of where your passion overlaps with like a growing opportunity and need in the world where it's not too different than the startup kind of argument that we talked about where um if you are where your passion meets the market right you know i mean like because it's like uh um you know that's a beautiful thing where like you can do what you love but it also just opens up tons of opportunities because the world's ready for it right like and so um and so like if you're interested in technology um that might point to like go and study machine learning because you don't have to decide what career you're going to go into but it's going to be such a versatile space that's going to be at the root of like everything that's going to be in front of us that you can have eight different careers in different industries and be an absolute expert in this like kind of tool set that you wield that can go and be applied um and that by the way that doesn't apply to just technology right it's uh it could be the exact same thing if you want to um you know the same thought process applies to design to marketing to um you know to sales to anything but um that versatility where you like um when you're in a space that's gonna continue to grow um it's just like what company do you join one that just is going to grow and the growth creates opportunities where the surface area is just 
going to increase and the problems will never get stale and you can have you know many like and so you go into a career where you have that sort of growth in the world that you're in you end up having so much more opportunity that organically just appears and you can then have more shots on goal to find like that killer overlap of timing and passion and skill set and point in life where you can like you know just really be motivated and fall in love with something um and then at the same time like uh find a balance like there's been times in my life where i worked like a little bit too obsessively and you know and crazy and uh and i you know think we kind of like tried to correct that you know kind of the right opportunities but you know i think i probably appreciate a lot more now friendships that go way back um you know family and things like that and um and i kind of have the personality where i could use like i have like so much desire to really try to optimize like you know what i'm working on that i can easily go to kind of an extreme and now i'm trying to like kind of find that balance and make sure that i have the friendships the family like relationship with the kids everything that like i don't uh i push really really hard but kind of find a balance and i think people can be happy on actually many kind of extremes on that spectrum but it's easy to kind of inadvertently make a choice by how you approach it that then becomes really hard to unwind um and so being very thoughtful about kind of all of those dimensions makes a lot of sense and so um to come those are all interrelated um but at the end of the day oh love passion and love yeah love towards you said uh yeah family friends family and hopefully one day if your work pans out boris is love towards robots not the creepy kind the good kind that's the good kind just friendship and yeah and fun just yeah it's like another dimension to just how we interface with the world yeah of 
course you're one of my favorite human beings roboticists you've created some incredible robots and i think inspired countless people and like i said i hope cozmo i hope your work with anki lives on and um i can't wait to see what you do with waymo i mean that's if we're talking about artificial intelligence technology that has the potential to revolutionize so much of our world that's it right there so thank you so much for the work you've done and thank you for spending your valuable time talking with me thanks lex thanks for listening to this conversation with boris sofman to support this podcast please check out our sponsors in the description and now let me leave you with some words from isaac asimov if you were to insist i was a robot you might not consider me capable of love in some mystic human sense thank you for listening and hope to see you next time
