All About LLM Agents with the CEO of a Generative AI Startup (MindverseAI) - What's AI Episode 14

The Future of AI: Large Language Models and Agent-Based Software

They can be totally above the average human in terms of the way they analyze and reason to solve complicated problems. The second thing that is happening, or going to happen, is that they are going to be able to handle more signals. Nowadays they mostly handle text data and respond with text data as well, but in the future they will be able to absorb visual data, voice, and all the different kinds of signals, the different senses that humans have. And then they can respond not only with text output but also with actions, with different ways to deliver information to the end user. That's what I think is going to happen in five years. But if we think from a more long-term perspective, I believe all digital services are going to be changed by AI. I think the AI copilot, or AI agent, is the new form of software: in the future, every piece of software would be in the form of agents, and everyone would have an army of agents they can leverage to help them finish a lot of things. That's probably happening not very far out, maybe five to eight years. Yeah, that would be really cool.

And about just the language model part: we've seen the big jump from GPT-3 to ChatGPT, and from GPT-2 to GPT-3 as well. But would you agree that, following something like the Pareto rule, the jump from GPT-3 to ChatGPT was 20 percent of the work to get 80 percent of the results, and the final 20 percent will require 80 percent of the work? Do you think progress for the next big step will slow down compared to the GPT-3-to-ChatGPT jump, or do you believe we will still improve quite a lot in the following years?

I think it's a very interesting question. If you truly look into what OpenAI has done over the past few years, they are still trying to scale it up, and they still believe there is a lot more that can be mined from scaling the model to the next level. So my opinion is that, in terms of the large language model itself, in the next three to five years we can still make huge progress: make them more intelligent, make them handle longer context, and make them in general better and more powerful. I don't see it slowing down. But I do believe there are definitely some limitations to the large language model itself, so we need to build a framework around it to unleash its power more. On that front, we will see many companies and many researchers work to make it more autonomous, to make it, like you said, self-teaching, to connect it to the external world so it can use external tools, and to make it more adaptive as you use it. All these things are a different layer of innovation on top of large language models, and combining the two multiplies these different factors of innovation together.
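The "framework around the model" idea, in which the model is connected to external tools and iterated, can be sketched as a minimal loop. This is only an illustration of the concept, not any specific product's implementation: `ask_model` is a hypothetical stand-in for a real LLM call, and the tool names are made up.

```python
def agent_loop(ask_model, tools, goal, max_steps=5):
    """Minimal sketch of a framework around an LLM: the model either answers
    directly or picks a tool; the framework runs the tool and feeds the
    observation back for the next round of reasoning.

    `ask_model` stands in for an LLM call and returns either
    ("answer", text) or ("tool", tool_name, argument)."""
    observations = []
    for _ in range(max_steps):
        decision = ask_model(goal, observations)
        if decision[0] == "answer":
            return decision[1]
        _, name, arg = decision
        # Connect to the external world: execute the chosen tool and record it.
        observations.append((name, tools[name](arg)))
    return None  # step budget exhausted without a final answer
```

With a scripted stand-in model, one tool call followed by an answer looks like: `agent_loop(fake_model, {"search": lambda q: q.upper()}, "check weather")`. The step budget is the simplest guard against the open-ended looping the interview later mentions as a weakness of autonomous agents.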

In five to ten years, I can see the whole AI landscape still growing very, very fast. It's going to be as fast as the past five years, or even more. Yeah, that's super exciting. And for those who want to learn more, check out Mindverse and MindOS; I think it's a really good product and it's super promising. Really cool. I'm excited about anything and everything related to agents, and a bit scared, but I hope it will go well.

Also, I can share a little bit on MindOS. MindOS is currently still a closed beta product; we are experimenting with it with around 500 to 1,000 pilot users. It's not 100 percent ready yet, but we are iterating very fast, so it will probably be ready within two months, so it can be used by anyone in the world. Hopefully at that time it can help you a lot. And if you are interested in using the closed beta version of MindOS, please go to mindverse.ai and apply for trial use. You can test this immature version and give us your valuable feedback, which will be very much appreciated.

Awesome, thank you very much for your time; it was super valuable and insightful. I really enjoyed discussing large language models and Mindverse in general. As I said, it's a very interesting topic to tackle; it's basically research, but applied research, so that's amazing. And just like in research, it's super iterative, and you will just keep learning and improving, so it's really cool.

Felix, the CEO of Mindverse AI, did a PhD in computer science before working as a research scientist at Facebook and then switching to Alibaba to be the director of the neuro-symbolic lab. Now he is creating his own application, Mindverse AI. In the interview we discuss large language models, ChatGPT, Mindverse, and a lot of other topics that are really trending right now. I hope you enjoy it.

Awesome. I will start with my first usual question, which is: who are you? Since it's a very broad question, I would like to start with your academic background. What is your academic background?

Yeah, my name is Felix Tao. Just like you, I was a PhD student in the computer science domain. I got my bachelor's degree at Tsinghua University and then came to the States to get my PhD from UIUC, the University of Illinois, working especially on data mining and machine learning. That was the era before large language models happened, so everything was about training models specialized for solving particular industrial problems. After I graduated, I went to Facebook and Alibaba as a research scientist. That's mostly my research and academic background, and now I'm running a company called Mindverse, which is also an AI-related company, so I do a lot of research at Mindverse as well.

That's super interesting. I'd love to first talk about your PhD, just because, as you mentioned, I'm also doing one. Retrospectively, would you say that your PhD was worth it?

That's actually a question I asked myself a lot during my PhD. Honestly, in retrospect, I think it was worth it for me, but not necessarily worth it for all PhD students. For me it was very good training to get deep into the foundations of machine learning and AI and truly understand the evolution the technology has gone through. It prepared me better for the future, like my research scientist work at Facebook and Alibaba, and also at Mindverse. But I definitely need to say one thing: the AI industry and AI research academia are changing so fast that all the research papers I did during my PhD days are, I think, not useful anymore. So in terms of impact, in terms of how lasting the value of your research work can be, it is pretty challenging for students. I would say a PhD is worth it as a training process, but not worth it as a way to truly make a huge impact on the research area, because the area changes a lot each year.

That definitely makes sense. I recently talked with another startup CEO on this exact question of research and the PhD. I don't want to put words in his mouth, but basically the gist was that research may not be worth it right now, since we've made so much progress recently, and now is the time to apply, commercialize, and work on productization; plus there are so many opportunities with OpenAI, MindOS, and everything else to help you commercialize and use the models. So do you think investing in research, or pursuing a research career path, is still relevant, or should one go into a more hands-on developer or engineering role?

Yeah, I think it's a good question. I feel like the paradigm shift of large language models has made a huge impact on the research community. One way is that how we do research has changed, which lowers the bar for people to be able to do research. Previously, we needed to learn a lot of math and do a lot of tuning on neural networks and Bayesian networks to make something work, which required a lot of study in the related domain. But nowadays, with the help of large language models, we are basically building a lot of high-level structure on top of the model. So I would say, if you are not researching the foundation models themselves, it's probably better to skip the PhD and get into industry to work on things on top of large language models; that's where you can provide your maximum value. But if you are researching the fundamental math and mechanisms of large language models, or foundation models, then yes, research is still very valuable. That also means: previously we had the CV area and the NLP area, and in NLP we had so many different tasks. Many of those tasks are gone; they are not valid research problems anymore, because they can all be done by a single foundation model. So studying those models, those questions, is probably not valid anymore.

Yeah, it must be difficult, compared to back in the day, to find a research topic, especially for PhD students or relative beginners, where you can be impactful versus OpenAI, DeepMind, Meta, and every large company in the field. And not only that: the PhD itself takes a lot of time and may not be as valuable to you in terms of income compared to having a job. Even as a learning process, working in a startup, or having your own startup, means you learn a lot, you are super motivated, and you work way more just because you want to succeed; you learn a lot, just like in a PhD. But I don't know what will happen with the PhD, whether it will stay the way it is. I see more and more work being done affiliated with companies, so I assume this is where it will go: most PhDs will be people doing a PhD for Meta, for example, and things like that.

Yeah, sadly, I agree with most of your points, because given the resource requirements to get something done in the AI space, the research labs at universities are not going to be as well suited as industrial labs, for example the AI labs at Google, or FAIR, or OpenAI. I think this kind of shift has happened in many industries. For example, in database research, in the early days all the innovations were done by research labs, but after database systems became more commercialized and the commercial companies had more resources to pull into the research area and build their own products, many innovations were done by industrial labs and commercial companies. I think AI is having that kind of moment. The field is still going to be very innovation-driven, but I think the majority of innovation will come from industrial labs or startups.

Yeah, I definitely think so as well. And before diving into the topic, you mentioned that you worked at Facebook and Alibaba, so I would love to hear a bit more about what you were doing, first at Facebook.

My work at Facebook was very interesting, because Facebook is one of the biggest platforms in the world, with so much data, including documents, news articles, videos, and pictures. When I joined, Facebook was taking a new direction. Previously you could only see friends' posts in your news feed, but they saw that people had a lot of demand for truly using Facebook as an information source, to get news updates and interesting videos, not just friends' statuses. At that time we were thinking about how to bridge the billions of users on Facebook to the billions of pieces of information on the web, trying to bring information to people. We needed to develop a tool we called the content understanding tool. Basically, it is an AI platform into which you can put all the information, all the news articles, images, and videos, and it is able to understand the topics and the key concepts embedded in those contents. Facebook can use these signals, extracted by AI from the text or video data, to help users find relevant information based on their interests. I was the person who started this project in news feed, trying to understand text data first, then going from text data to image and video data, and trying to build a good AI foundation for content understanding. That was actually my first job, and it was really exciting to work on a project of this size, because you are basically handling the world's biggest database and trying to use AI to help understand it.

That's very cool. And just to put that into perspective, when was this exactly, what year?

I started working at Facebook in 2017.

Well, for AI that's a very long time ago compared to where we are now.

Yeah, at that time we didn't have the concept of large language models, so the way we understood text or video data was still quite traditional in the AI sense. We developed specialized models to extract topics or find people in videos; all those kinds of things were done by a set of different models. But nowadays, with the help of these foundation models, you can probably do them in a more elegant way, using one single model to handle the whole content understanding job.

Yeah, it must have been a very good role to learn a lot about natural language processing and all the intricacies of understanding text, since you didn't have all the tools that we now have, which make it much simpler, even though it's still extremely complex. The challenges were very different, I assume.

Yeah, at that time the NLP tools we were using required a lot of data labeling. For each particular task, for example topic extraction from news articles, we needed to label a lot of data. That's why Facebook, or any tech company investing heavily in AI, usually partners with data labeling companies, which hire people to label data for their tasks. But nowadays, as you said, all these tasks can just be a prompt; you need to design very smart prompts for the large language models. And the work goes into how to train a large language model that can observe all the data on the web and learn how to understand the whole world. It's a totally different topic.

Really cool. And speaking of different topics, I've read that you were the founding director of the neuro-symbolic lab at Alibaba, and that the goal was the "waking up of consciousness." This seems super different from what you were doing at Facebook prior to that, and I'd love to hear more about it.

Yeah, I was going through a very interesting time in AI history when I quit my Facebook job to join Alibaba; that was around 2019, into 2020.
so around that time there's a very big thing happening in AI industry uh we all know it's gp3 right so open air launched the gp3 around the middle mid of 2020 and around that time I feel like how we approach AI is done wrong because uh so the mindset all these AI researchers have before is always how can we Define the problem to a particular model and how can we Define the input and outputs of this model and how can we get data and no matter it's by labeling or by harnessing from the the web how can get data enough data for this particular task but after gp3 I realized that the the goal of AGI has a very good foundation very good starting point scientists in in the early last century when they defined the term AI they mean AGI that the AI can learn things and can do all different kinds of jobs but uh because it's too hard at that time that's why people start to uh have different directions like some people research on the vision problem how can machines see some people research on the NLP problems how can machines read and understand the text and then you go deeper and deeper uh you you you have a set of problems in NLP and for each problem you split into smaller problems uh that's the traditional mindset for AI researchers but with a after gpd3 comes out Everything's changed so that that's the time I realized that maybe we can do something different maybe we don't need to follow the traditional uh way of doing AI research so I think that's why I studied this research lab called neurosymbolic lab so I try to combine Foundation models large Lounge models together with a framework which is more like human brain for example the framework is able to have like truly long-term memories and so framework is truly able to have like perceptions getting all these different kind of information sources and put it make them useful and you just send these signals to the foundation models and let the foundation model to process all these signals for you so if we combine Foundation 
models from Combine large language models with all the memories perceptions and even like abilities to motor control to be able to control the the arms of Robotics those kind of things uh you're actually making something similar to a human right so you're making some some you are truly pushing AI to be like how a human process information how a human uh think about a problem how a human deliver its actions so I feel like it's time to do that so that's why we set up a very ambitious goal saying like waking up of the Consciousness in machine but uh it's probably not done yet even from today's perspective it's but I think it's much much more closer than five years ago to to truly achieve this like ambitious goal definitely just like with well I don't know if it's recent in terms of AI history but with clip we've seen that we could understand a concept from either text or images which is also something that we can do quite easily if we see the word cat or a cat or if we hear cat we can all link it to the same concept in our brain so that's pretty cool that right now we can do that with AI or with one model well one larger model so that's yeah it's really cool that we seem to be transitioning into Foundation model basically but yeah into larger and more general intelligence compared to being very good at a specific application where we were aiming at where we we definitely were aiming at that just a few years ago even in my masters was to do some few shots classification like on on be better on very specific tasks and that's already much different now so it's yeah pretty cool yeah and one thing I I'd like to add is I I also observe the evolution of AI and I think one padding is very interesting is we start this AI concept AI research uh field by having a bunch of like super intelligence researchers trying to tackle the general AI problem and then failed first and then we start to take this fragmented approach where we split the tasks into smaller ones and by having like 
hundreds of different AI tasks and having separate people working on each of them uh then it goes back to a more like unified approach more like a general approach and I don't know I I feel like then after this having having these large language models then we probably will enter into a new era where we find one large language model is not going to be sufficient enough we want to diverge from it again to have different types of large language models for different for example personalities for different like jobs so it's always like fragmented and then unified and then fragmented so so this is a very interesting pattern happening in AI That's why I think as you said even though nowadays we see only a few companies who are able to deliver the the top of large Lounge models but uh I still think in the Future No Matter it's startups or maybe research students they would be able to find the niche issue Niche problems to solve but let's see yeah and regarding splitting into smaller applications would you think wouldn't you think that that's already what we are doing with for example lots of people take large English models and fine-tune them to their own task or rather sometimes build a memory of some kind of data set that they have for example a medical book and then they just ask the large English model to to cite this medical book and so we already are are doing that like going from the large model and and adapting into our adapting it to be better on on our specific task yeah so do you think this is promising Avenue or or will we have a large language models soon enough that is better at each of these very specific tasks versus splitting it yeah I think that's a very interesting question uh so to be honest I I don't have a certain answer yet but my my belief is uh I feel like we need a different AIS for different tasks and just to make it more specialized in making more uh more of high quality in terms of solving that particular domain's problems uh for example my 
company is called a mind versus right so we have this core product called The Mind OS it's basically trying to solve this for different domains experts uh I I can't I would say that large language models has two things embedded in the model one is their reasoning ability so they are able to reason uh by complicated Logic No matter it's its domain agnostic so it's not related to some particular domain it's just how they like human beings they can just use logic to reason um you use logic to solve problems and the another layer of large language model is their common sense so this common sense of the understanding of the whole world is obtained one day in the pre-training stage when they go through all the web data but this Common Sense usually are not good enough for a particular domain so that's why people can particularly domain they probably need the reasoning part but not necessaries and not on the knowledge Parts embedded in large language model because they want to have their own specialized knowledge they want to have their own vertical data so we call that uh the grounding layer right so we use large language models for reasoning but we add a grounding layer on top of it to make the model uh tuned to a particular domain so they will be much much better solving at that domain's problem I think that's that's one of the major goals for mine OS so people can do it in a very easy manner they can simply updating documents and they can simply connecting domain specific abilities apis onto the model and the model itself would be able to plan and retrieve related information from this particular domain particular workflow particular use case and use the reasoning power of larger lineup models to to solve it better it's basically like replicating the process of a perfect human where we we are all able to eat world can do pretty much everything but yeah you have to go to university and do a PhD and everything if you want to be the best at something and so it's funny 
how we are basically trying to do the same thing with AI where we try to teach it the world in general just like we do to our children and then we we give it like lots of very specific documents just like we do with University students to become experts in one field so that's it completely makes sense yeah yeah totally I think the way how you put it is about the the perfect way to describe how we build these AI systems so so you need like all these pre-changing steps like training our kids from just born born to having like a college degree then you need this professional training to make it like expert in a particular domain so what we do in my universe what open ad has done over many other large language model has done is to finish the pre-training to make sure the AI is like a college grad graduate and but for for what we do is to have this like professional training agenda designer for each AI so they can be a super professional in their domain and they can have long-term memories so they can not only be perfect they are not only professionals they can also grow as you use it so they can grow more and more grounded into your particular domain and your particular uh use case so that's very exciting because for me in the future I think AI is not going to be a general model solving everything for everyone each person each business they they they need the AI to be more related to them to their life to be more grounded in their daily work so adding this grounding layer on top of the large language model is definitely something that can have a lot of innovation happening so I assume that is the main goal of mine OS is to build these additional layers above the the first initial pre-training and are you aiming to do that on most fields or are you specializing to to some some Fields what exactly are you building and aiming for with Minos yeah yeah so uh minor verse it's like our vision right so it's a term minor verse basically is our vision where in the future AI 
beings are gonna uh coexist with human beings in a forum a new Society where a lot of things humans do humans do are going to be delegated to Ai and AI is just going to be an integral part of our society and that's the vision and I think we can try to make it uh like a benefit a good Vision a good future for for Humanity and mind OS is our product so basically it's like operating system that can generate AI mind and we call it AI Geniuses in in our system so anyone can get on the system and trying to create a AI genius for a particular domain for example you can create a genius for for like Hotel Butler you can create a genius for your like assistant for Air Research right and genius for HR Support all these kind of things so to do that we we need to like you said we need to tackle a couple of technical so challenges one is to make them like easy to use and add this grounding layer on top of each AI genius so uh we are making it as general as possible to just to answer your question because the the foundation model itself is General right and the training process the professional training process is mostly alike in the real world so what do we do the grounding layer is basically adding the training procedure for different domains the the way how you train it is similar but the the materials the data you train it is different for different domains so minus is mostly trying to provide a tool set a engine a platform that different domains can use so we don't try to focus only one or two domains and we want to make it more like a create creativity platform where people can have their own creative way of generating AI Geniuses fourth or AI Minds for their on domain so that's a goal that's one of the biggest features in mine OS but we also do other things um to name one which I think is very interesting as well is like when you use chargept you know the reasoning power is there but it can only do one round of reasoning right after you ask something it gives you some 
results after reasoning but we can already see a trend in industry for example Auto GPT so the AI would be able to autonomously use its reasoning power and by different by many iterations so it's like you are multiplying the reasoning power of large language models by a big number for a particular task so they would be able to uh simplify a complex task to different subtasks and do it gradually iteratively so I think that's very also very important piece of work we do in mind OS to let AI to have the ability to deep think to do like a slow think so they can Leverage The Power of reasoning more and make the AI more powerful yeah I have lots of questions following that but with the last thing you said I personally using chatgpt and other models my main issue is hallucination and so I'm a bit afraid and skeptical when talking about chaining those prompts and requests just because I feel like it can just grow the risk of hallucination and just working upon these hallucinations and just like growing more and more the what like the wrong it can do and so is there anything you are doing to mitigate the the risk of of hallucination or is that a thing that the brands using mine OS need to tackle is that like is there something you guys do to to help with that I think Hallucination is one of the major reasons why people especially businesses don't use chat gpt2 actually and uh to me I think the solution to this can be twofolds right one is how what Obama is doing for example they are training gpd4 and it may be even like higher level of large Lounge models in the future I think one major goal of these new new models are to solve the hallucination issue so I think in one of the interviews by Down by Sam Altman or Elia I can remember whom they say that the hallucination issue in gbd4 is reduced by at least 60 percent so so that's one one area where people where we can make it better another area is what we are doing is a surrounding layer so by doing grounding layer we use 
like tactics for example and generating like a very special prompt to enforce the AI model to uh speak on things by the reference text not by things trained from the pre-training stage and uh we enforce that we also added this like we say citation system so everyone say he said we would like to ask him to add citations from the original source and uh everything marked with the citation would be more trustworthy than things that's not marked by citations so it can solve some issues when people perceive the result of the AI generated right um we we can have more Trust on things they they think have a good citation on and the things they don't have a good citation on but uh I would say the the hallucination issue is a fundamental flaw for large language models and I actually think it's a fundamental flaw in human mind as well yeah yeah so so humans sometimes do a lot of like bullshiting as well so uh yeah but I think it's it's getting solved gradually um in in areas like marketing and for example uh entertainment people are more like uh okay to have these hallucinations sometimes as long as the amount of Hallucination is controlled but in here is like medical as you said medical law and all this kind of very serious domains probably this is going to be a bigger issue so I see even different Industries when they adopt this large language model larger models they they have like different pacing for adoption yeah and that's actually how I used to describe it as well large English models I just I I used to compare it with Wikipedia where like back in the day you you can trust Wikipedia but you cannot really cite it or you cannot write something for school or whatever based on Wikipedia you need to check the sources and yeah confirm so it's pretty much the same thing with large language models right now you you always need to confirm what it says if it cannot cite where it took its information which is why I think like linking it to a memory based on on lots of 
documentation is super powerful and maybe the easiest solution to to tackle that but I agree that it may be just like a human bias that is just generalized to language models just because lots of people share fake news or or lie to their friends and stuff like that and the data is is from us so yeah it makes sense that it's doing the same thing yeah yeah totally totally uh that's a very good analogy um by the way I think uh just like you say the one thing we can always do to reduce the impact of Hallucination is trying to make the thinking process of AI as transparent as possible yeah so uh for example uh in the stage of retrieving information as external reference we make it transparent in the process of calling apis to finish a task we make it transparent so the user once they talk to AI chatbot with a powerful ah about all the actions all the informations used in the whole thinking process would be should be transparent so people can have more trust this cannot be done by a real person right we cannot when we talk to a professional we cannot ask them to to list all the thinking process for us but we can do it with AI so I think we have different a lot of a lot of ways to to reduce the impact of hallucination yeah like that I just mentioned that it's important to double check what the language model says just to be sure it's a truthful and so when it's chaining prompts and you are not controlling everything is there a way to ensure that what is happening in between the the input and the final results is is truthful and there were no hallucinations in between uh I I don't think we can can't nowadays we we don't have the proper tools to get into the like very detailed uh of the Computing process um down by large language models right because it's more like a black box um but uh I think to train a powerful AI you probably don't simply just to use large language models you actually build a framework on top of it and having this like thinking thought flows between 
different parts of the mind: you have thought flows from the memory to the perception area, from the perception area to the motor action area, and this high-level flow of information can be totally transparent to the users. I'm not sure whether OpenAI is developing something to visualize and make the hidden process transparent; I suspect it's very difficult to do. But we make the high-level process transparent, and we make sure everything the AI generates for the user has good references and good citations. I think that's one way to go, but it's not going to solve the problem 100 percent.

And could you identify the biggest challenge right now when trying to build this tool that produces specialized agents? Is there one specific challenge you are currently working on that is harder to solve than the others?

One thing is, as you said, hallucination. But beyond hallucination, one big challenge we try to tackle is how to make the AI as grounded as possible. People's knowledge of how to do a job is usually very complicated. It's not only about providing the AI a few documents, and it's not only about providing it a couple of tools: you always need to teach the AI how to behave in very different scenarios. We can definitely do that by giving instructions to the AI for different scenarios, but it still sometimes doesn't understand the best practices for a particular domain. For example, if we are building a marketing agent, it has tools to connect to Facebook and tools to find the product details of a company, but how to combine them into a better marketing campaign is quite challenging. It can use some pieces of information to give you good ideas, but for it to finish the job autonomously for you is very hard. So AutoGPT and similar technologies, like what we are doing, can be a good direction to go, but this autonomy has issues: they are not controllable, they can be very open-ended and not very focused on the particular task, and that turns out to waste a lot of money without solving your issue. So how can we build an AI that knows how to deliver a complicated, domain-specific task without wandering off into random ideas? That's very hard to do. Whether it's the hallucination issue or the autonomy issue, it's all about how we can steer the AI's behavior toward what we really want, using natural language. That's a key part, and that's why we need to build a really, really good framework on top of large language models: to deliver the know-how, and to deliver feedback to the AI so it can smartly incorporate the user's feedback into the way it thinks and performs tasks. That's very hard, and that's a major technical challenge we are facing; we try to solve it with the framework we developed.

And is the only way to tackle this through prompting, trying to make it understand how to do the job post-training, or are you also referring to fine-tuning, basically changing the brain of the AI to tweak its answers? Is it fully post-training with prompting, or is it also related to retraining or fine-tuning the models?

Yeah, that's also a very good point. I think currently most people are using the prompting approach. The way we do it is not to ask a human to write a better prompt; it's to let the AI take feedback and update its own prompt, update its own behavior, by updating its own instructions. That's one way to do it automatically, and if it can be done automatically it's a very efficient way to tune and control the AI's behavior. But I think this approach has limitations and cannot achieve total control over the AI's behavior, so I believe in the future we probably want
to have some training built into the process. We don't train the foundation model, we don't train the big one, but we can definitely train a behavior module on top of it, which can be a very small model, like an adapter. That model would be in charge of taking the user's feedback and updating its parameters automatically, gradually adapting to the user's preferences and making the AI more grounded, more suited to what each user really needs. But the prompting approach is going to last for a while before the real fine-tuning stage comes, because we haven't reached the limits of the prompting approach yet, and it's always more convenient to work on the prompting part.

Yeah, that definitely makes sense. I have a question that is mainly for a friend of mine; it's about Mindverse and MindOS, but it should also give context and help the listeners better understand the tool, so I'll just give my example. My friend is a recruiter at a company, trying to fill a broad range of roles, and right now he is looking for ways to use AI to improve his work. He doesn't have any background in AI or programming, other than playing with ChatGPT. How could someone like him, and I assume a lot of listeners fit the same profile, use MindOS or Mindverse to improve their work? What do they need to learn or do, what would be the steps to get into it and have an agent help them in the end?

So if I understand correctly, your friend is in the HR space, right?

Yeah.

Okay. I believe that in the future any professional, HR people, lawyers, researchers, will have their own way of using MindOS or any other agent platform to build their own agents, or just use other people's agents, for their work. For example, if we look at this HR job, you can find many different steps: some steps are about finding candidates on the web or from LinkedIn profiles, the next step is assessing how well each candidate fits the job description, and then maybe there's the interview process, communicating with the candidates, compensation, those kinds of things. For many of these different things you can definitely use MindOS to build one agent for the job. For example, in MindOS, when you try to find a good candidate on LinkedIn, you can add an endpoint to the Genius you create so it has the ability to browse the web, especially the LinkedIn website. Then you can create a very good workflow by dragging a few modules together, saying: for any candidate you find, first assess them based on, for example, their past experience, and give them a score from one to five. This second part is mostly done by issuing a natural-language command to the AI and asking it to do as you instructed, so it will start automatically browsing the web, getting all these LinkedIn profiles, and grading them by their experience. Then you can build another workflow on top of that, saying: after you grade them, please send all the candidates graded five to my email, and rank them by how close their current job is to our city, something like that. So it's very, very easy to set up those kinds of workflows in MindOS.

We actually have a very interesting feature we are currently building in MindOS: a collaboration network between AI agents, or as we call them, AI Geniuses. For example, you create one AI agent for your talent-acquisition process, you have another AI agent working on resume grading, and a third agent working on automated interviews with the candidates. So you can connect
them together and say: I need this talent-acquisition agent to first find the talent, then pass it to the second agent for grading and for finding fitting positions within our company, and then the third one starts an initial interview with the candidates. This way of working can be applied to all different types of jobs, not only HR. We can always find that each job has different components, each of which requires teaching the AI how to do it properly, and if we have a mechanism to combine these different AIs together, we can largely reduce the repetitive and tedious work people do. That's mostly the goal of MindOS, and I think that answers your question of how people can use MindOS to improve their productivity.

Yeah. So just to summarize, it would almost only be through natural language, so English, or I assume it could even work in French or other languages?

Yeah.

And do they need any other skills? For example, you said the agent could go on LinkedIn and scrape all the different profiles; do they need to do anything to connect it to LinkedIn?

We already provide a set of public skills, for example the ability to browse the web, the ability to run some code, the ability to find a flight for your travel, things like that. But the beauty of the grounding layer is that for your particular work you can always have your own vertical abilities. For example, you may need to connect to a vertical database to look up some information, or to find clients in your CRM, whatever it is; you can always connect these vertical skills, these vertical tools, to the AI. The AI itself will then be able to autonomously use this set of skills, and for each of your inquiries, each of your tasks, it will automatically come up with a plan using these different tools and solve your particular issue. If you think it's not good enough, add more tools and give more feedback to the AI, and it will become more powerful and adapt to your true needs. I think that's the power of MindOS. Like I said, it's a very good tool, you can use it for your needs, and we provide a lot of flexibility so users can build basically anything they want.

Really cool. I'm excited to see what people will do without needing coding skills or having to learn other things to use those models. I think the quantity of users will speed up the development progress as well, through all the investment it will stimulate. So I have a few final questions, mainly related to the future; difficult questions, but short ones. The first one: is there a thing that AI cannot do yet that you would love it to be able to do?

I think AI currently still doesn't have any self-consciousness. I'm not talking about self-consciousness in a risky sense; it's more that the AI should be self-aware of what it can and cannot do. That's very important; I think that's a major source of hallucination. If it knows it doesn't have sufficient knowledge or sufficient capability to finish something, it should be able to realize it is lacking that capability, and then it will stop generating hallucinations, generating false information. So I think it's very important to build some sort of self-awareness module into AI agents, or into the AI mind framework, so that they not only understand what they cannot do but also understand that they need to learn new things to grow, to be self-teaching. That would be something super helpful and super cool, and I don't see any AI tools or AI frameworks based on large language models that have that down yet. I
completely agree, and self-teaching will be super important. But I'm afraid it's also something humans are prone to; the typical "fake it till you make it" is basically exactly that: some humans assume they can do something, or just pretend they can, when they cannot, so AI is just doing the same thing.

Yeah, it's true, it's true.

Okay, I asked you about the biggest challenge facing your company right now, but would you have a different answer for the biggest challenge facing the whole AI industry in general, or is that the main priority for most people, you would assume?

I think about that a lot recently, especially after seeing the power of GPT-4. I think the biggest challenge is how we can control the impact on society as a whole. We all know this AI is very powerful; everyone's way of working and living is going to be fundamentally changed by this technology. But it can be good or it can be very risky, and if it happens too fast, a lot of people will lose their jobs. So I totally agree that AI in the long run can be very beneficial for society, but in the short term we probably want to be more conservative about pushing it forward. I think human society as a whole needs to be more prepared for the impact it's going to have on us, and we probably need more regulation and better technological tools to control the negative impacts of AI. I think that's a major thing stopping AI from being more powerful, which I think is good. It takes a collective effort from everyone involved in this AI wave, whether it's people like us, AI professionals, or a random person on the street with very little knowledge of AI; we should all pay more attention to this particular issue.

I completely agree. And speaking of the long run, how do you see AI evolving over the next five years? In your mind, where will we be in five years?

Five years is probably not going to bring a huge difference. I think in five years two things are going to happen. One is that AI's reasoning ability is going to be much better, so it can be well above the average human in terms of analyzing and reasoning to solve complicated problems. The second thing that is happening, or going to happen, is that AI is going to be able to handle more signals. Nowadays it mostly handles text data and responds with text data as well, but in the future it will be able to absorb visual data, voice, all these different signals, the different senses that humans have, and then it can respond not only with text output but also with actions, with different ways to deliver information to the end user. That's what's going to happen in five years, I think. But from a longer-term perspective, I believe all digital services are going to be changed by AI. I think the AI copilot, or the AI agent, is the new form of software: every piece of software will take the form of agents, and everyone will have an army of agents they can leverage to get a lot of things done. That's probably happening not very far out, five to eight years.

Yeah, that would be really cool. And about just the language-model part: we've seen the big jump from GPT-3 to ChatGPT, and from GPT-2 to GPT-3 as well. But would you agree that, say by the Pareto rule, the GPT-3 to ChatGPT jump was 20 percent of the work to get 80 percent of the results, and the final 20 percent will require 80 percent of the work? Do you think progress will slow down for the next big step, compared to GPT-3 to ChatGPT, or do you believe we will still improve quite a lot in the following years?

I think it's a very interesting
question. If you truly look into what OpenAI has done over the past few years, I think they are still trying to scale up, and they still believe a lot more can be mined from scaling the model to the next level. So my opinion on this is that, in terms of the large language model itself, in the next three to five years we can still make huge progress, making the models more intelligent, able to handle longer context, and just better and more powerful in general. I don't see it slowing down. But I do believe there are definitely some limitations in the large language model itself, so we need to build a framework around it to unleash more of its power. On that front, we will see many companies and many researchers working to make it more autonomous, to make it, as you said, self-teaching, to make it more powerful by connecting it to the external world and letting it use external tools, and to make it more adaptive as you use it. All of these are a different layer of innovation on top of large language models, and when you combine the two together, you just multiply these different factors of innovation. So in five to ten years I can see the whole AI landscape still growing very, very fast, as fast as the past five years.

Even more, yeah. That's super exciting. Well, first I will recommend everyone listening to check out Mindverse and MindOS; I think it's a really good product, and it's super promising. Really cool. I'm excited about anything and everything related to agents, and a bit scared, but I hope it will go well. Do you have anything you'd like to share with the audience, either about MindOS or your personal projects?

Yeah, I can share a little bit about MindOS. MindOS is currently still a closed-beta product. We are experimenting with it with around 500 to 1,000 pilot users. It's not 100 percent ready yet, but we are iterating on it very fast, so it will probably be ready within two months, so that it can be used by anyone in the world. Hopefully at that time it can help you a lot. And if you are interested in using the closed beta version of MindOS, please go to mindverse.ai and apply for a trial; you can test this immature version and give us your valuable feedback, which would be very much appreciated.

Awesome, thank you very much for your time. It was super valuable and insightful. I really enjoyed discussing large language models and Mindverse in general. As I said, it's a very interesting challenge; it's basically research, but applied research, and just like in research it's super iterative and you will just keep learning and improving, so it's really cool. So thank you very much for your time, I really appreciate it.

Felix Tao is the CEO of Mindverse AI. Felix did a PhD in computer science before working as a research scientist at Facebook and then moving to Alibaba to be the director of the neuro-symbolic lab. Now he is building his own application, Mindverse AI. In the interview we discuss large language models, ChatGPT, Mindverse, and a lot of other topics that are really trending right now. I hope you enjoy it.

Awesome. I will start with my first usual question, which is: who are you? And since it's a very broad question, I'd like to start with your academic background. What's your academic background?

Yeah, my name is Felix Tao. Just like you, I was a PhD student in the computer science domain. I got my bachelor's degree at Tsinghua University and then came to the States to get my PhD from UIUC, the University of Illinois, working mainly on data mining and machine learning. That was the era before large language models happened, so everything was about training models that could be specialized for
solving particular industrial problems. After I graduated I went to Facebook and Alibaba as a research scientist, so that's mostly my research and academic background. Now I'm running a company called Mindverse, which is also an AI-related company, so I do a lot of research at Mindverse as well.

That's super interesting. I'd love to first talk about your PhD, just because, as you mentioned, I'm also doing one. Retrospectively, would you say your PhD was worth it?

That's actually a question I asked myself a lot during my PhD. Honestly, in retrospect, I think it was worth it for me, but not necessarily worth it for all PhD students. For me it was a very good way, very good training, to get deep into the foundations of machine learning and AI and truly understand the evolution the technology has gone through, so it prepared me better for future work as a research scientist at Facebook and Alibaba, and also at Mindverse. But I do need to say one thing: the AI industry and AI research academia are changing so fast that all the research papers I wrote during my PhD days are, I think, not useful anymore. So in terms of impact, in terms of how lasting the value of your research work can be, it is pretty challenging for students. I would say it's worth it as a training process, but not as a way to truly make a huge, lasting impact on the research area, because the field changes so much each year.

Yeah, that definitely makes sense. I've recently talked with another startup CEO about this exact question of research and the PhD, and I don't want to put words in his mouth, but basically the gist was that research may not be worth it right now, since we've made so much progress recently, and now is the time to apply and commercialize, to work on productization. Plus there are so many opportunities with OpenAI, and now MindOS and everything, to help you commercialize and use the models. So do you think investing in research is still relevant, trying to pursue a research career path, or should one go into a more hands-on developer or engineering role?

Yeah, I think it's a good question. I feel the paradigm shift of large language models has had a huge impact on the research community. One aspect is that the way we do research has changed, which lowers the bar for people to be able to do research. Previously we needed to learn a lot of math, and a lot of tuning of neural networks and Bayesian networks, to make something work, which required a lot of study and learning in the related domain. But nowadays, with the help of large language models, we are basically building a lot of high-level structure on top of the model. So I would say: if you are not researching the foundation models themselves, it's probably better to skip the PhD and go into industry to work on things on top of large language models; that's where you can provide the maximum value. But if you are researching the fundamental math and mechanisms of large language models, the foundation models, then yes, research is still very valuable. That also means that where previously we had the CV area and the NLP area, and within NLP so many different tasks, many of those tasks are gone. They are no longer valid research problems, because they can all be done by a single foundation model, so studying those models, those questions, is probably not valid anymore.

Yeah, it must be difficult, compared to back in the day, to find a research topic, especially for PhD students or relative beginners: a topic where you can be impactful versus OpenAI, DeepMind, Meta, and every large
company in the field. And not only that: the PhD itself takes a lot of time, and it may not be as valuable to you in terms of income compared to having a job, or even in terms of the learning process. Working in a startup, or having your own startup, you learn a lot, you are super motivated, and you work way more because you want to succeed, so you learn a lot just like in a PhD. I don't know what will happen with the PhD, whether it will stay the way it is. I see more and more work being done affiliated with companies, so I assume that's where it will go: most PhDs will be people doing a PhD for Meta, for example, and things like that.

Yeah, sadly I agree with most of your points, because given the resource requirements to get something done in the AI space, the research labs at universities are not going to be as well suited as industrial labs, for example the AI labs at Google, at FAIR, or at OpenAI. I think this kind of shift has happened to many industries. For example, in database research, in the early days all the innovations were done by research labs, but after database systems became more commercialized and the commercial companies had more resources to pour into the research area and into building their own products, many innovations were done by the industrial labs, by the commercial companies. I think AI is having that kind of moment. The field is still going to be very innovation-driven, but I think the majority of innovation will come from industrial labs or startups.

Yeah, I definitely think so as well. And before diving into the main topic, you mentioned that you worked at Facebook and Alibaba, so I would love to hear a bit more about what you were doing, first at Facebook.

My work at Facebook was very interesting, because Facebook is one of the biggest platforms in the world, with so much data: documents, news articles, videos, pictures. When I joined, Facebook was moving in a new direction. Previously you could only see your friends' posts in your news feed, but they saw that people increasingly wanted to use Facebook as an information source, to get news updates and interesting videos, not just friends' statuses. So at that time we were thinking about how to bridge the billions of users on Facebook to the billions of pieces of information on the web, to bring information to people. We needed to develop what we called a content-understanding tool: basically an AI platform into which you can put all the information, all the news articles, images, and videos, and it will understand the topics and the key concepts embedded in those contents. Facebook could then use these signals, extracted by AI from the text or video data, to help users find relevant information based on their interests. I was the person who started this project in News Feed, first trying to understand text data, then going from text to image and video data, and trying to build a good AI foundation for content understanding. That was actually my first job, and it was really exciting to work on a project of this size, because you are basically handling the world's biggest database and trying to use AI to help understand it.

Very cool. And just to put that into perspective, when was this exactly, what year?

I started working at Facebook in 2017.
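The content-understanding task described above, extracting topic labels from an article, can be sketched in miniature. This is not Facebook's actual system, which, as Felix notes, relied on trained, specialized models; it is just a toy frequency-based keyword extractor, with the stopword list and sample article invented for illustration, to show the shape of the task:

```python
from collections import Counter
import re

# Minimal stopword list for the toy example; a real system would use a
# curated list or a trained model instead.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
             "that", "this", "for", "on", "with", "as", "are", "was", "by"}

def extract_topics(text: str, k: int = 3) -> list[str]:
    """Return the k most frequent non-stopword terms as rough topic labels."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return [word for word, _ in counts.most_common(k)]

article = ("The league announced the football season schedule. "
           "Fans expect the football matches to draw record crowds, "
           "and the league hopes the season boosts ticket sales.")
print(sorted(extract_topics(article)))  # ['football', 'league', 'season']
```

As the conversation goes on to note, today the same extraction would more likely be a single prompt to a large language model rather than a per-task pipeline like this.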
Well, for AI, that's a very long time ago compared to where we are now.

Yeah, at that time we didn't have the concept of large language models, so the way we understood that text or video data was still quite traditional in the AI sense: we developed specialized models to extract topics, or to find people in videos, and all these things were done by a set of different models. Nowadays, with the help of foundation models, you can probably do it in a more elegant way, with one single model handling the whole content-understanding job.

Yeah, it must have been a very good role to learn a lot about natural language processing and all the intricacies of understanding text, since you didn't have all the tools we now have, which make it much simpler, even though it's still extremely complex. The challenges were very different, I assume.

Yeah, at that time the NLP tools we were using required a lot of data labeling. For each particular task, for example topic extraction from news articles, we needed to label a lot of data. That's why Facebook, or any tech company investing heavily in AI, usually partnered with data-labeling companies that hire people to label data for their tasks. But nowadays, as you said, all these tasks can just be a prompt: you need to design very smart prompts for the large language models, and the work becomes how to train a large language model that can observe all the data on the web and learn to understand the whole world. It's a totally different topic.

Really cool. And speaking of different topics, I've read that you were the founding director of the neuro-symbolic lab at Alibaba, and that the goal was the "waking up of consciousness". This seems super different from what you were doing at Facebook prior to that, and I'd love to hear more about it.

Yeah, I was going through a very interesting time in AI history when I quit my Facebook job to join Alibaba; that was around 2019, 2020. Around that time a very big thing was happening in the AI industry; we all know it's GPT-3. OpenAI launched GPT-3 around the middle of 2020, and around that time I felt that the way we were approaching AI was being done wrong. The mindset all these AI researchers had before was always: how can we reduce the problem to a particular model, how can we define the inputs and outputs of this model, and how can we get enough data for this particular task, whether by labeling or by harvesting it from the web. But after GPT-3 I realized that the goal of AGI finally had a very good foundation, a very good starting point. When scientists defined the term AI early in the last century, they meant AGI: an AI that can learn things and do all different kinds of jobs. But because it was too hard at the time, people started to split off into different directions: some researched the vision problem, how machines can see; some researched NLP problems, how machines can read and understand text. Then you go deeper and deeper: you have a set of problems in NLP, and each problem splits into smaller problems. That's the traditional mindset of AI researchers. But after GPT-3 came out, everything changed. That's when I realized maybe we could do something different, maybe we didn't need to follow the traditional way of doing AI research. That's why I started this research lab called the neuro-symbolic lab: to combine foundation models, large language models, with a framework that is more like a human brain. For example, the framework is able to have truly long-term
memories, and the framework is truly able to have perception, taking in all these different kinds of information sources, making them useful, and sending these signals to the foundation model, letting the foundation model process all these signals for you. So if you combine foundation models, combine large language models, with memory, perception, and even abilities for motor control, being able to control robotic arms and things like that, you're actually making something similar to a human, right? You are truly pushing AI toward how a human processes information, how a human thinks about a problem, how a human delivers actions. I felt it was time to do that, so we set up a very ambitious goal: the waking up of consciousness in a machine. It's probably not done yet, even from today's perspective, but I think we are much, much closer to that ambitious goal than five years ago.

Definitely. Just like with CLIP, well, I don't know if it counts as recent in terms of AI history, we've seen that a model can understand a concept from either text or images, which is something we do quite easily: if we see the word "cat", or a cat, or if we hear "cat", we can all link it to the same concept in our brain. So it's pretty cool that we can now do that with AI, with one larger model. It's really cool that we seem to be transitioning into foundation models, into larger and more general intelligence, compared to being very good at one specific application, which is what we were definitely aiming at just a few years ago. Even in my master's the goal was to do few-shot classification, to be better on very specific tasks, and that's already much different now. So yeah, pretty cool.

Yeah, and one thing I'd like to add is that I also observe the evolution of AI, and I think one pattern is very interesting: we started this AI research field with a bunch of super intelligent researchers trying to tackle the general AI problem, and failing at first. Then we took a fragmented approach, splitting the field into smaller tasks, hundreds of different AI tasks, with separate people working on each of them. Then it goes back to a more unified, more general approach. And I feel that after these large language models, we will probably enter a new era where we find one large language model is not sufficient, and we want to diverge from it again, to have different types of large language models for different personalities, different jobs, for example. So it's always fragmented, then unified, then fragmented; that's a very interesting pattern happening in AI. That's why I think, even though nowadays we see only a few companies able to deliver the top tier of large language models, in the future, whether it's startups or maybe research students, they will be able to find niche problems to solve. But let's see.

Yeah. And regarding splitting into smaller applications, wouldn't you say that's already what we are doing? For example, lots of people take large language models and fine-tune them on their own task, or build a memory from some dataset they have, for example a medical book, and then ask the large language model to cite that medical book. So we are already going from the large model and adapting it to be better on our specific task. Do you think this is a promising avenue, or will we soon have a large language model that is better at each of these very specific tasks than the split versions?

Yeah, I think that's a very
interesting question uh so to be honest I I don't have a certain answer yet but my my belief is uh I feel like we need a different AIS for different tasks and just to make it more specialized in making more uh more of high quality in terms of solving that particular domain's problems uh for example my company is called a mind versus right so we have this core product called The Mind OS it's basically trying to solve this for different domains experts uh I I can't I would say that large language models has two things embedded in the model one is their reasoning ability so they are able to reason uh by complicated Logic No matter it's its domain agnostic so it's not related to some particular domain it's just how they like human beings they can just use logic to reason um you use logic to solve problems and the another layer of large language model is their common sense so this common sense of the understanding of the whole world is obtained one day in the pre-training stage when they go through all the web data but this Common Sense usually are not good enough for a particular domain so that's why people can particularly domain they probably need the reasoning part but not necessaries and not on the knowledge Parts embedded in large language model because they want to have their own specialized knowledge they want to have their own vertical data so we call that uh the grounding layer right so we use large language models for reasoning but we add a grounding layer on top of it to make the model uh tuned to a particular domain so they will be much much better solving at that domain's problem I think that's that's one of the major goals for mine OS so people can do it in a very easy manner they can simply updating documents and they can simply connecting domain specific abilities apis onto the model and the model itself would be able to plan and retrieve related information from this particular domain particular workflow particular use case and use the reasoning power 
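The grounding-layer idea described here, retrieving a domain's own documents and handing them to the model as the only allowed reference, can be sketched roughly like this. This is a toy illustration with a keyword-overlap retriever; none of the names (`ground`, `build_prompt`, the example documents) are actual MindOS APIs.

```python
def keyword_score(query: str, doc: str) -> int:
    """Count how many query words appear in the document (toy relevance)."""
    return sum(1 for word in query.lower().split() if word in doc.lower())

def ground(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Pick the top_k most relevant domain documents for the query."""
    ranked = sorted(documents, key=lambda d: keyword_score(query, d), reverse=True)
    return ranked[:top_k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Tell the model to answer only from the retrieved references."""
    refs = "\n".join(f"[{i + 1}] {d}" for i, d in enumerate(ground(query, documents)))
    return (
        "Answer using ONLY the references below, citing them as [n].\n"
        f"References:\n{refs}\n\nQuestion: {query}"
    )

docs = [
    "Aspirin is contraindicated in patients with active peptic ulcers.",
    "The hotel pool opens at 7 a.m. daily.",
    "Aspirin doses above 4 g per day risk toxicity in adults.",
]
prompt = build_prompt("What is the maximum safe aspirin dose?", docs)
print(prompt)
```

A real system would use embedding-based retrieval rather than keyword overlap, but the shape is the same: domain knowledge lives outside the model, and the model contributes the reasoning.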
It's basically like replicating the path of a well-rounded human: in general we can all do pretty much everything, but you have to go to university and do a PhD and so on if you want to be the best at something. So it's funny how we are basically doing the same thing with AI: we teach it the world in general, just like we do with our children, and then we give it lots of very specific documents, just like we do with university students, to make it an expert in one field. It completely makes sense.

Yeah, totally. I think the way you put it is about the perfect way to describe how we build these AI systems. You need the pre-training steps, like raising kids from newborn to a college degree, and then you need professional training to make them an expert in a particular domain. What OpenAI and many other large language model companies have done is the pre-training, making sure the AI is like a college graduate. What we do is provide this professional training, an agent designer for each AI, so each one can become a real professional in its domain. And they have long-term memories, so they are not only professionals, they can also grow as you use them; they grow more and more grounded in your particular domain and your particular use case. That's very exciting, because I think in the future AI is not going to be one general model solving everything for everyone. Each person, each business needs AI that is more related to them, to their life, more grounded in their daily work. So adding this grounding layer on top of the large language model is definitely somewhere a lot of innovation can happen.

So I assume the main goal of MindOS is to build these additional layers above the initial pre-training. Are you aiming to do that for most fields, or are you specializing in some fields? What exactly are you building and aiming for with MindOS?

Yeah. So Mindverse is our vision: the term basically names a future where AI beings coexist with human beings to form a new society, where a lot of the things humans do are delegated to AI, and AI becomes an integral part of our society. That's the vision, and I think we can try to make it a beneficial one, a good future for humanity. MindOS is our product. It's basically an operating system that can generate AI minds, which we call AI Geniuses in our system. Anyone can get on the system and create an AI Genius for a particular domain: a hotel butler, an assistant for AI research, an HR support genius, all these kinds of things. To do that, as you said, we need to tackle a couple of technical challenges. One is to make them easy to use and to add this grounding layer on top of each AI Genius. And to answer your question, we are making it as general as possible, because the foundation model itself is general, and the professional-training process is mostly alike across domains: the way you train is similar, but the materials, the data you train on, differ by domain. So MindOS mostly tries to provide a tool set, an engine, a platform that different domains can use. We don't focus on only one or two domains; we want it to be more of a creativity platform where people have their own creative ways of generating AI Geniuses, or AI minds, for their own domain. So that's one goal, and one of the biggest features of MindOS.

But we also do other things. To name one that I think is very interesting: when you use ChatGPT, the reasoning power is there, but it can only do one round of reasoning. After you ask something, it gives you a result after a single pass. But we can already see a trend in the industry, for example AutoGPT, where the AI autonomously applies its reasoning power over many iterations. It's like multiplying the reasoning power of the large language model by a big number for a particular task: it can break a complex task into subtasks and work through them gradually, iteratively. I think that's also a very important piece of work we do in MindOS, letting the AI deep-think, do slow thinking, so it can leverage the power of reasoning more and become more powerful.

Yeah, I have lots of questions following that, but on the last thing you said: personally, using ChatGPT and other models, my main issue is hallucination. So I'm a bit afraid and skeptical when we talk about chaining those prompts and requests, because I feel it can grow the risk of hallucination, building on top of hallucinations and compounding whatever wrong they can do. Is there anything you are doing to mitigate the risk of hallucination, or is that something the brands using MindOS need to tackle themselves? Is there something you do to help with that?

I think hallucination is one of the major reasons why people, and especially businesses, don't use ChatGPT, actually. To me the solution can be twofold. One is what OpenAI is doing: they are training GPT-4, and maybe even higher-level large language models in the future, and I think one major goal of these new models is to solve the hallucination issue.
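The multi-round "deep think" loop described a moment ago, breaking a goal into subtasks and iterating instead of answering in one pass, might be sketched like this. `plan` and `solve` are stubs standing in for LLM calls; everything here is illustrative rather than any real AutoGPT or MindOS interface.

```python
def plan(goal: str) -> list[str]:
    """Stand-in for an LLM call that decomposes a goal into subtasks."""
    return [f"research {goal}", f"draft {goal}", f"review {goal}"]

def solve(subtask: str, context: list[str]) -> str:
    """Stand-in for an LLM call that solves one subtask given prior results."""
    return f"done: {subtask} (with {len(context)} prior results)"

def deep_think(goal: str, max_rounds: int = 10) -> list[str]:
    """Iterate over subtasks, feeding each result into the next round."""
    results: list[str] = []
    for subtask in plan(goal)[:max_rounds]:  # cap rounds to stay controllable
        results.append(solve(subtask, results))
    return results

for step in deep_think("marketing plan"):
    print(step)
```

The `max_rounds` cap reflects the controllability concern that comes up later in the conversation: without a budget, an open-ended loop can burn money without converging.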
In one of the interviews, by Sam Altman or Ilya, I can't remember whom, they said that the hallucination issue in GPT-4 is reduced by at least 60 percent. So that's one area where it can get better. The other area is what we are doing with the grounding layer. There, we use tactics like generating a very specific prompt to force the model to speak only from the reference text, not from things learned in the pre-training stage, and we enforce that. We also added what we call a citation system: whenever the AI says something, we ask it to add citations from the original source, and anything marked with a citation is more trustworthy than things that are not. That helps when people judge an AI-generated result: they can place more trust in the things with good citations than in the things without. But I would say the hallucination issue is a fundamental flaw of large language models, and I actually think it's a fundamental flaw of the human mind as well. Humans do a lot of bullshitting too. But I think it's getting solved gradually. In areas like marketing or entertainment, people are more okay with some hallucination, as long as the amount is controlled; but in serious domains like medicine and law, as you said, this is going to be a bigger issue. So different industries adopt these large models at a different pace.

Yeah, and that's actually how I used to describe it as well. I used to compare large language models with Wikipedia: back in the day, you could trust Wikipedia, but you couldn't really cite it or write something for school based on it; you needed to check the sources and confirm. It's pretty much the same with large language models right now: you always need to confirm what a model says if it cannot cite where it took its information. That's why I think linking it to a memory built from lots of documentation is super powerful, and maybe the easiest way to tackle this. But I agree it may just be a human bias that generalizes to language models: lots of people share fake news or lie to their friends, and the data comes from us, so it makes sense the models do the same thing.

Yeah, totally, that's a very good analogy. By the way, just like you say, one thing we can always do to reduce the impact of hallucination is to make the thinking process of the AI as transparent as possible. For example, in the stage of retrieving information as external reference, we make it transparent; in the process of calling APIs to finish a task, we make it transparent. So once a user talks to a powerful AI chatbot, all the actions and all the information used in the whole thinking process should be visible, and people can have more trust. This cannot be done with a real person, right? When we talk to a professional, we cannot ask them to list their entire thinking process for us, but we can do it with AI. So I think we have a lot of ways to reduce the impact of hallucination.

Yeah, I like that. I just mentioned that it's important to double-check what the language model says, just to be sure it's truthful. So when it's chaining prompts and you are not controlling everything, is there a way to ensure that what happens between the input and the final result is truthful, and that there were no hallucinations in between?

I don't think we can, nowadays. We don't have the proper tools to look into the very detailed computing process inside large language models; it's more like a black box.
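The transparency idea described here, logging every retrieval and tool call so the user can audit the agent's high-level "thinking" trace, can be illustrated with a small sketch. The `TransparentAgent` class and its stubbed tools are hypothetical, not a real MindOS interface.

```python
class TransparentAgent:
    """Toy agent that records every step it takes in a user-visible trace."""

    def __init__(self) -> None:
        self.trace: list[str] = []  # visible log of every retrieval and tool call

    def retrieve(self, query: str) -> str:
        self.trace.append(f"RETRIEVE: {query}")
        return f"reference text for '{query}'"

    def call_tool(self, name: str, arg: str) -> str:
        self.trace.append(f"TOOL {name}: {arg}")
        return f"{name} result for '{arg}'"

    def answer(self, question: str) -> str:
        ref = self.retrieve(question)
        slot = self.call_tool("calendar", "find free slot")
        self.trace.append("ANSWER composed from retrieved reference and tool output")
        return f"Answer based on: {ref}; {slot}"

agent = TransparentAgent()
agent.answer("When can we meet?")
for step in agent.trace:
    print(step)
```

Nothing here opens up the model's internal computation, which matches the point being made: only the high-level flow between retrieval, tools, and the final answer is made auditable.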
But I think to train a powerful AI, you probably don't simply use a large language model by itself; you build a framework on top of it, with thought flows between different parts of the mind: flows from the memory to the perception area, from the perception area to the motor-action area. And this high-level flow of information can be made totally transparent to users. I'm not sure whether OpenAI is developing something to visualize and expose the hidden internal process; I suspect it's very difficult to do. But we make the high-level process transparent, and we make sure everything the AI generates for the user has good references and good citations. I think that's one way to go, but it's not going to solve the problem 100 percent.

And could you identify the biggest challenge right now in building this tool that produces specialized agents? Is there one specific challenge you are currently working on that is harder to solve than the others?

One thing, as you said, is hallucination. But beyond hallucination, one big challenge we try to tackle is how to make the AI as grounded as possible. People's knowledge of how to do a job is very complicated. It's not only about giving the AI a few documents, and not only about giving it a couple of tools; you always need to teach the AI how to behave in very different scenarios. We can definitely do that by giving the AI instructions for different scenarios, but it still sometimes doesn't understand the best practices of a particular domain. Say you are building a marketing agent: it has tools to connect to Facebook, and tools to find a company's product details, but how to combine them into a better marketing campaign is quite challenging. It can use some of that information to give you good ideas, but autonomously finishing the job for you is very hard. AutoGPT and similar technologies, like what we are doing, can be a good direction, but this autonomy has issues: it is not very controllable, it can be very open-ended and unfocused on the particular task, which often turns out to waste a lot of money without solving your issue. So how can we build an AI that knows how to deliver a complicated, domain-specific task without wandering into random ideas? That's very hard. I would say whether it's the hallucination issue or the autonomy issue, it all comes down to how we can steer the AI's behavior toward what we really want using natural language. That's why we need to build a really good framework on top of large language models, to deliver the know-how and to deliver feedback to the AI, so it can smartly incorporate user feedback into the way it thinks and performs tasks. That's very hard, and it's a major technical challenge we are facing, which we try to solve with the framework we've developed.

And is prompting the only way to tackle this, basically teaching it through instructions how to do the job, or are you also referring to fine-tuning, actually changing the brain of the AI to tweak its answers? Is it entirely prompting after training, or is it also related to retraining or fine-tuning the models?

Yeah, that's also a very good point. I think currently most people use the prompting approach. The way we do it is not to ask a human to write a better prompt; it's to have the AI get feedback and update its own prompt, its own behavior, by updating its own instructions.
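The self-updating-prompt idea just described, where the agent folds user feedback into its own instruction list instead of a human rewriting the prompt each time, might look something like this. The `incorporate` function is a stand-in for an LLM call that rewrites feedback into a rule; all names are illustrative.

```python
def incorporate(instructions: list[str], feedback: str) -> list[str]:
    """Stand-in for an LLM call that turns user feedback into a new instruction."""
    new_rule = f"When responding, remember: {feedback}"
    if new_rule not in instructions:  # avoid accumulating duplicate rules
        instructions = instructions + [new_rule]
    return instructions

# The agent's system prompt grows as feedback arrives, with no human editing it.
instructions = ["You are a helpful marketing assistant."]
for fb in ["keep answers under 100 words", "always cite sources"]:
    instructions = incorporate(instructions, fb)

prompt = "\n".join(instructions)
print(prompt)
```

This is the prompting-side mechanism; the adapter-style trained behavior module mentioned next would replace the string append with actual parameter updates on a small model.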
That's one way to do it automatically, and if it can be done automatically, it's a very efficient way to tune and control the agent's behavior. But I think this approach has limitations and cannot achieve total control over the AI's behavior. So I believe in the future we will probably want some training built into the process. We don't train the foundation model, the big one, but we can definitely train a behavior module on top of it, which can be a very small model, like an adapter. That module would be in charge of taking user feedback, updating its parameters automatically, and gradually adapting to the user's preferences, making the AI more grounded, more suited to what each user really needs. But the prompting approach is going to last for a while before the real fine-tuning stage comes, because we haven't reached the limits of prompting yet, and it's always more convenient to work on the prompting side.

Yeah, that definitely makes sense. And I have a question, mainly for a friend of mine; it's about Mindverse and MindOS, but it will also give context and help the listeners better understand the tool, so I'll just give his example. My friend is a recruiter at a company, trying to find people to fill a broad range of roles, and right now he is looking for ways to use AI to improve his work. He doesn't have any background in AI; he's not a programmer, and he hasn't done anything beyond playing with ChatGPT. How could someone like him, and I assume a lot of listeners fit the same profile, use MindOS or Mindverse to improve their work? What do they need to learn or do? What would be the steps to end up with an agent that helps them?

Yeah, so if I understand correctly, your friend is in the HR space, right?

Yeah.

Okay. So definitely, I believe that in the future any professional, HR people, lawyers, researchers, can have their own way of using MindOS or any other agent platform to build their own agents, or just use other people's agents, for their work. If we look at this HR job, you can find many different steps: some steps are about finding candidates from the web or from LinkedIn profiles, then assessing how well each candidate fits the job description, then maybe an interview process, then communicating with candidates about compensation and so on. For many of these things, you can definitely use MindOS to build an agent for your job. For example, in MindOS, when you try to find a good candidate on LinkedIn, we can add an endpoint to the Genius you create, giving it the ability to browse the web, especially LinkedIn. Then you can create a workflow by dragging a few modules together: for any candidate you find, first assess them based on, say, their past experience and give them a score from one to five. This is mostly done by issuing a natural-language command to the AI and asking it to do exactly what you ordered, so it will start automatically browsing the web, collecting those LinkedIn profiles, and grading them by experience. Then you can build another workflow on top of that: after grading, send all the candidates graded five to my email, ranked by how close their current location is to our city, something like that. It's very, very easy to set up those kinds of workflows in MindOS. And there's a very interesting feature of MindOS we are currently building: a collaboration network between AI agents, or as we call them, AI Geniuses.
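The recruiter workflow sketched above, score candidates from one to five, keep only the fives, and rank them by distance to the office, can be written out as a toy program. In MindOS this would be expressed in natural language and executed by an LLM; here a simple rule stands in for the LLM grader, and the candidate data is invented for illustration.

```python
def grade(years_experience: int) -> int:
    """Map experience to a 1-5 score (stand-in for an LLM assessment)."""
    return min(5, max(1, years_experience // 2))

candidates = [
    {"name": "Ana",  "years": 10, "km_to_office": 12},
    {"name": "Bo",   "years": 3,  "km_to_office": 2},
    {"name": "Cleo", "years": 11, "km_to_office": 5},
]

# Workflow: grade every candidate, keep only the fives, rank closest first.
top = [c for c in candidates if grade(c["years"]) == 5]
top.sort(key=lambda c: c["km_to_office"])

for c in top:
    print(c["name"])  # this shortlist would be emailed to the recruiter
```

The point of the platform, as described, is that a non-programmer never writes this code: they state the filtering and ranking rules in plain language, and the agent carries them out.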
For example, you create one AI agent for your talent-acquisition process, another AI agent working on resume grading, and a third agent doing an automated initial interview with the candidates. You can connect them together and say: I need the talent-acquisition agent to first find the talent, then pass it to the second agent for grading and for finding fitting positions within our company, and then the third one starts an initial interview with the candidates. This way of working can be applied to all different types of jobs, not only HR. Every job has different components, each of which requires teaching the AI how to do it properly, and if we have this mechanism to combine different AIs together, we can largely reduce the repetitive, tedious parts of how people work. That's mostly the goal of MindOS, and I think that answers your question about how people can use MindOS to improve their productivity.

Yeah. So, just to summarize, it would almost all be through natural language, so English, or I assume it could also work in French or other languages. And do they need any other skills? For example, when you say the agent could go on LinkedIn and scrape the different profiles, do you need to do anything related to connecting it to LinkedIn?

Yeah, we already provide a set of public skills, for example the ability to browse the web, to run some code, to find a flight for your travel, things like that. But the beauty of the grounding layer is that for your particular work you can always have your own vertical abilities. For example, you may need to connect to a vertical database to look for some information, or find clients in your CRM; whatever it is, you can always connect these vertical skills and tools to the AI.
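The agent collaboration network described above, sourcing feeding into grading feeding into interviewing, reduces to function composition in a sketch. Each "agent" is just a stub function here; in a real system each would be an LLM-backed service, and all names are illustrative.

```python
def sourcing_agent(role: str) -> list[str]:
    """Finds candidate names for a role (stubbed)."""
    return [f"{role}-candidate-{i}" for i in range(3)]

def grading_agent(candidates: list[str]) -> list[str]:
    """Keeps candidates that pass a (stubbed) screen."""
    return [c for c in candidates if not c.endswith("-1")]

def interview_agent(candidates: list[str]) -> list[str]:
    """Schedules initial interviews (stubbed)."""
    return [f"interview scheduled: {c}" for c in candidates]

def pipeline(role: str) -> list[str]:
    """Connect the agents so each one's output feeds the next."""
    return interview_agent(grading_agent(sourcing_agent(role)))

for line in pipeline("engineer"):
    print(line)
```

The design point is that each agent has a single, teachable responsibility, and chaining them covers a whole job without any one agent needing to understand the full process.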
So the AI itself will be able to autonomously use this set of skills: for each of your inquiries, each of your tasks, it will automatically come up with a plan using these different tools and solve your particular issue. If you think it's not good enough, add more tools and give more feedback to the AI, and it will become more powerful and adapt to your true needs. I think that's the power of MindOS. Like I said, it's a very good tool; you can use it for your own needs, and we provide a lot of flexibility, so users can basically build anything they want.

Really cool, yes. It's really cool, and I'm excited to see what people will do without needing coding skills or other prerequisites to use those models. I think the sheer quantity of users will speed up the development progress as well, along with all the investment it will stimulate. I have a few final questions, mainly about the future; they'll be difficult questions, but short ones. The first one: is there something AI cannot do yet that you would love it to be able to do?

I think AI currently still doesn't have self-consciousness. And I don't mean self-consciousness in a risky sense; it's more that the AI should be self-aware of what it can and cannot do. That's very important; I think that's a major source of hallucination. If it knows it doesn't have sufficient knowledge or sufficient capability to finish something, it should realize it is lacking that capability and then stop generating hallucinations, generating false information. So I think it's very important to build some sort of self-awareness module into AI agents, or into the AI mind framework, so they not only understand what they cannot do, but also understand that they need to learn new things and grow, to be self-teaching. That would be something super helpful and super cool, and I don't see any AI tools or AI frameworks based on large language models that have that today.

I completely agree, and self-teaching will be super important, but I'm afraid it's also something humans are prone to: the typical "fake it till you make it." Some humans assume they can do something, or fake that they can, when they cannot, and AI is just doing the same thing.

Yeah, it's true, it's true.

And okay, I asked you about the biggest challenge facing your company right now, but would you have a different answer for the biggest challenge facing the whole AI industry in general? Or is that the main priority for most people, you would assume?

I've thought about that a lot recently, especially after seeing the power of GPT-4. I think the biggest challenge is how we can control the impact on society as a whole. We all know this AI is very powerful, and everyone's way of working and living is going to be fundamentally changed by this technology, but it can be good or it can be very risky. If it happens too fast, I feel a lot of people will lose their jobs. So I totally agree that AI in the long run can be very beneficial for society, but in the short term we probably want to be more conservative about pushing it forward. I think human society as a whole needs to be better prepared for the impact it's going to have on us; we probably need more regulation, and we need better technological tools to control the negative impacts of AI. I think that's a major thing holding AI back from being more powerful, which I think is good. It takes a collective effort from everyone who is involved in this AI wave,
no matter whether it's people like us, AI professionals, or a random person on the street with very little knowledge of AI. I think we should all pay more attention to this particular issue.

I completely agree. And speaking of the long run, how do you see AI evolving over the next five years? Where will we be in five years, in your mind?

Five years is probably not going to bring a huge difference. I think in five years, two things are going to happen. One is that AI's reasoning ability is going to be much better, so it can be well above the average human at analyzing and reasoning through complicated problems. The second thing, which is kind of already happening, is that it will handle more signals. Nowadays models mostly handle text data and respond with text as well, but in the future they will be able to absorb visual data, voice, all the different signals and senses that humans have, and they will respond not only with text output but also with actions, with different ways of delivering information to the end user. That's what I think will happen in five years. From a longer-term perspective, I believe all digital services are going to be changed by AI. I think the AI copilot, or AI agent, is the new form of software: every piece of software will take the form of agents, and everyone will have an army of agents they can leverage to get a lot of things done. That's probably not very far off, maybe five to eight years.

Yeah, that would be really cool. And about the language model part: we've seen the big jump from GPT-3 to ChatGPT, and from GPT-2 to GPT-3 as well. But would you agree that, following something like the Pareto rule, GPT-3 to ChatGPT was 20 percent of the work to get 80 percent of the results, and the final 20 percent will require 80 percent of the work? Do you think progress will slow down for the next big step, compared to the GPT-3 to ChatGPT jump, or do you believe we will still improve quite a lot in the following years?

I think it's a very interesting question. If you truly look into what OpenAI has done over the past few years, they are still trying to scale up, and they still believe there is a lot more to be mined from scaling the model to the next level. My opinion is that for large language models themselves, in the next three to five years we can still make huge progress: making them more intelligent, handling longer context, just in general better and more powerful. I don't see it slowing down. But I do believe there are limitations to the large language model itself, so we need to build a framework around it to unleash more of its power. On that front, we will see many companies and many researchers making these models more autonomous, more, as you said, self-teaching, more powerful at connecting to the external world and using external tools, and more adaptive as you use them. All of these are a different layer of innovation on top of large language models, and when you combine the two, you multiply these different factors of innovation. So I can see that in five to ten years, the whole AI landscape will still be growing very fast, as fast as the past five years.

Yeah, even more. Yeah, that's super exciting. Well, first, I will recommend that everyone listening check out Mindverse and MindOS; I think it's a really good product and super promising. And I'm excited about everything related to agents, and a bit scared, but I hope it will go well. Do you have anything
you'd like to share to the audience in terms either of mine Os or your personal projects uh yeah it's uh I can share a little bit on mine OS so minus is currently still uh close the beta product um so we are currently um experimenting this product with like around 500 to 1000 and Pilots users um uh it's it's not uh 100 Reddit yet but we are iterating it very fast so it's probably going to be ready within two months so it can be used by anyone in the world so hopefully at that time um it can help you a lot and uh and if you are very interested in using the closed beta version of mine OS please go to my universe dot Ai and apply for trial use and the you can you can test this immature version and give us your valuable feedback that will be very much appreciated awesome thank you very much for your time it was super valuable and insightful I I really enjoyed discussing large English model and just mindverse in general I think it's as I said it's a very interesting topic to challenge it's basically research but like applied research so that's amazing and just like in research it's super iterative and you will just keep learning and improving so it's really cool and yeah so thank you very much for your time I really appreciate itforeign CEO of Mind versus AI Felix has done a PhD in computer science before working as a research scientist at Facebook and then switching to Alibaba to be the director of the neuro symbolic lab now he is creating his own application mindverse AI in the interview we discuss large language models Chad GPT mindverse and a lot of other topics that are really trending right now I hope you enjoy it awesome so I will start with my first usual question which is who are you and more specifically I will since it's a very broad question I I would like to start with your academic background so what's your academic background yeah my name is Felix Tao so just like you I was a PhD student in computer science domain so I got my bachelor degree in qinwa 
University, and then I came to the States to get my PhD from UIUC, the University of Illinois, working on data mining and machine learning. That was the era before large language models, so everything was about training models specialized for solving particular industrial problems. After I graduated, I went to Facebook and then Alibaba as a research scientist. That's mostly my research and academic background. Now I'm running a company called Mindverse, which is also an AI-related company, so I do a lot of research at Mindverse as well.

That's super interesting. I'd love to first talk about your PhD, just because, as you mentioned, I'm also doing one. Retrospectively, would you say your PhD was worth it?

That's actually a question I asked myself a lot during my PhD years. Honestly, in retrospect, I think it was worth it for me, but not necessarily for all PhD students. For me, it was very good training to get deep into the foundations of machine learning and AI, and to truly understand how the technology has evolved. It prepared me well for my future work as a research scientist at Facebook and Alibaba, and also at Mindverse. But I definitely need to say one thing: the AI industry and AI research academia are changing so fast that all the research papers I wrote during my PhD days are, I think, not useful anymore. So in terms of impact, of how lasting the value of your research work can be, it is pretty challenging for students. I would say it's worth it as a training process, but not as a way to make a huge, lasting impact on the research area, because it changes so much each year.

Yeah, that definitely makes sense. I recently talked with another startup CEO on this exact question
of research and the PhD, but mostly research. I don't want to put words in his mouth, but the gist was that research may not be worth it right now, since we've made so much progress recently, and now is the time to apply, commercialize, and work on productization. Plus there are so many opportunities, with OpenAI and MindOS and everything, to help you commercialize and use the models. So would you say investing in research is still relevant, like trying to pursue a research career path, or should one go into a more hands-on developer or engineering role?

Yeah, I think it's a good question. I feel the paradigm shift of large language models had a huge impact on the research community. One way is that how we do research has changed, which lowers the bar for people to be able to do research. Previously, we needed to learn a lot of math, and do a lot of tuning of neural networks and Bayesian networks, to make something work, which required a lot of study in the related domain. But nowadays, with the help of large language models, we are basically doing a lot of high-level structure on top of the model. So I would say: if you are not researching the foundation models themselves, it's probably better to skip the PhD and get into industry to work on things on top of large language models; that's where you can provide the maximum value. But if you are researching the fundamental math and mechanisms of large language models, of foundation models, then yes, research is still very valuable. That also means that whereas previously we had the computer vision area and the NLP area, and within NLP so many different tasks, many of those tasks are gone. They are not valid research problems anymore, because they can all be done by a single foundation model. So to study those models, those questions, is probably not valid
anymore.

Yeah, it must be difficult, compared to back in the day, to find a research topic, especially for PhD students or relative beginners, where you can be impactful versus OpenAI, DeepMind, Meta, and every large company in the field. And not only that: the PhD itself takes a lot of time and may not be as valuable to you in terms of income compared to having a job, or even compared to the learning process of working in a startup or having your own startup, where you learn a lot, you are super motivated, and you work way more just because you want to succeed. You learn a lot, just like in a PhD. But I don't know what will happen with the PhD; I don't know if it will stay the way it is. I see more and more work being done affiliated with companies, so I assume that's where it will go: most PhDs will be people doing a PhD for Meta, for example, and things like that.

Yeah, sadly I agree with most of your points, because with the resource requirements to get something done in the AI space, the research labs at universities are not going to be as suitable as industrial labs, for example the AI labs at Google, or FAIR, or OpenAI. I think this kind of shift has happened in many industries. For example, in database research, in the early days all the innovation was done by research labs, but after database systems became more commercialized, and commercial companies had more resources to pour into the research area and build their own products, many innovations were done by industrial labs, by commercial companies. I think AI is having that kind of moment. The field is still going to be very innovation-driven, but I think the majority of innovation will come from industrial labs or startups,
OK. Yeah, I definitely think so as well. Before diving into the topic: you mentioned that you worked at Facebook and Alibaba, so I'd love to hear a bit more. What were you doing first, at Facebook?

My work at Facebook was very interesting, because Facebook is one of the biggest platforms in the world, with so much data: documents, news articles, videos, pictures. When I joined, Facebook was taking a new direction. Previously you could only see your friends' posts in your news feed, but they saw that people increasingly wanted to use Facebook as an information source, to get news updates and interesting videos, not just friends' status updates. At that time we were thinking about how we could bridge the billions of users on Facebook to the billions of pieces of information on the web, trying to bring information to people. So we developed a tool we called the content understanding tool. Basically, it's an AI platform: you put all the information (news articles, images, videos) into the platform, and it understands the topics and the key concepts embedded in that content. Facebook could then use these signals, extracted by AI from the text or video data, to help users find relevant information based on their interests. I was the person who started this project in News Feed, trying to understand text data first, then moving from text to image and video data, and trying to build a good AI foundation for content understanding. That was my first job, and it was really exciting to work on a project of this size, because you are basically handling the world's biggest database and trying to use AI to help understand it.

That's very cool. And just to put that into perspective, when was this exactly? What year?
I started working at Facebook in 2017.

Well, for AI, that's a very long time ago compared to where we are now.

Yeah, at that time we didn't have the concept of large language models, so the way we understood text or video data was still quite traditional in the AI sense: we developed specialized models to extract topics, or find people in videos, and all these things were done by a set of different models. Nowadays, with the help of foundation models, you can probably do them in a more elegant way, using one single model to handle the whole content-understanding job.

Yeah, it must have been a very good role to learn a lot about natural language processing and all the intricacies of understanding text, since you didn't have all the tools we now have, which make it much simpler, even though it's still extremely complex. The challenges were very different, I assume.

Yeah, at that time the NLP tools we were using required a lot of data labeling. For each particular task, for example topic extraction from news articles, we needed to label a lot of data. That's why Facebook, and any tech company investing heavily in AI, usually partnered with data-labeling companies that hire people to label data for their tasks. But nowadays, as you said, each of these tasks can just be a prompt (you need to design very smart prompts for the large language model), and the main task becomes how to train a large language model that can observe all the data on the web and learn to understand the whole world. It's a totally different topic.

Really cool. And speaking of different topics: I've read that you were the founding director of the Neuro-Symbolic Lab at Alibaba, and that the
goal was the "waking up of consciousness." This seems super different from what you were doing at Facebook, and I'd love to hear more about it.

Yeah, I was going through a very interesting time in AI history when I quit my Facebook job to join Alibaba; that was around 2019, 2020. Around that time, a very big thing happened in the AI industry, which we all know as GPT-3. OpenAI launched GPT-3 around mid-2020, and around that time I felt that the way we approached AI was wrong. The mindset all AI researchers had before was always: how can we reduce the problem to a particular model, how can we define the inputs and outputs of this model, and how can we get enough data for this particular task, whether by labeling or by harvesting it from the web? But after GPT-3, I realized that the goal of AGI had a very good foundation, a very good starting point. Scientists in the last century, when they defined the term AI, meant AGI: an AI that can learn things and do all different kinds of jobs. But because it was too hard at the time, people went in different directions. Some researched vision problems: how can machines see? Some researched NLP problems: how can machines read and understand text? Then you go deeper and deeper; you have a set of problems in NLP, and each problem splits into smaller problems. That's the traditional mindset for AI researchers. But after GPT-3 came out, everything changed. That's when I realized maybe we could do something different, maybe we didn't need to follow the traditional way of doing AI research. That's why I started this research lab, the Neuro-Symbolic Lab: to combine foundation models, large language models, with a framework that is more like the human brain. For example, the
framework is able to have truly long-term memories; it is able to have perception, taking in all these different information sources, making them useful, and sending those signals to the foundation model, letting the foundation model process them for you. If we combine large language models with memory, perception, and even abilities like motor control, being able to control robotic arms and those kinds of things, you're actually making something similar to a human, right? You're truly pushing AI toward how a human processes information, how a human thinks about a problem, how a human delivers actions. I felt it was time to do that. That's why we set up a very ambitious goal: the "waking up of consciousness" in machines. It's probably not done yet, even from today's perspective, but I think we are much, much closer to that ambitious goal than five years ago.

Definitely. I don't know if it counts as recent in terms of AI history, but with CLIP we saw that a model could understand a concept from either text or images, which is also something we do quite easily: if we see the word "cat," or a cat, or if we hear "cat," we link it all to the same concept in our brain. So it's pretty cool that we can now do that with AI, with one larger model. It's really cool that we seem to be transitioning into foundation models, into larger and more general intelligence, compared to being very good at one specific application, which is definitely what we were aiming at just a few years ago. Even in my master's, the goal was things like few-shot classification, being better on very specific tasks, and that's already much different now. So yeah, pretty cool.

Yeah, and one thing I'd like to add is that I also
observe the evolution of AI, and I think one pattern is very interesting. We started this AI research field with a bunch of brilliant researchers trying to tackle the general AI problem, and failing at first. Then we took a fragmented approach, splitting the work into hundreds of different AI tasks with separate people working on each of them. Then it went back to a more unified, general approach. And I feel that after these large language models, we will probably enter a new era where we find that one large language model is not sufficient, and we want to diverge from it again, to have different types of large language models for different personalities, for different jobs. So it's always fragmented, then unified, then fragmented again. It's a very interesting pattern in AI. That's why, even though nowadays only a few companies are able to deliver top large language models, I still think that in the future, whether it's startups or research students, they will be able to find niche problems to solve. But let's see.

Yeah. Regarding splitting into smaller applications: wouldn't you say that's already what we are doing? For example, lots of people take large language models and fine-tune them on their own task, or build a memory over some dataset they have, for example a medical book, and then ask the large language model to cite that medical book. So we already are going from the large model and adapting it to be better on our specific task. Do you think this is a promising avenue, or will we soon enough have a large language model that is better at each of these very specific tasks,
versus splitting it?

Yeah, I think that's a very interesting question. To be honest, I don't have a certain answer yet, but my belief is that we need different AIs for different tasks, to make them more specialized and of higher quality at solving that particular domain's problems. For example, my company is called Mindverse, and we have this core product called MindOS, which is basically trying to solve this for experts in different domains. I would say that large language models have two things embedded in them. One is their reasoning ability: they are able to reason through complicated logic, and it's domain-agnostic, not tied to any particular domain; just like human beings, they can use logic to reason and solve problems. The other layer of a large language model is its common sense. This common sense, the understanding of the whole world, is obtained in the pre-training stage, when the model goes through all the web data. But this common sense is usually not good enough for a particular domain. That's why people in a particular domain probably need the reasoning part, but not necessarily the knowledge embedded in the large language model; they want their own specialized knowledge, their own vertical data. We call that the grounding layer: we use large language models for reasoning, but we add a grounding layer on top to tune the model to a particular domain, so it becomes much better at solving that domain's problems. That's one of the major goals of MindOS, and people can do it in a very easy manner: they can simply upload documents, or connect domain-specific abilities (APIs) to the model, and the model itself will be able to plan and retrieve related information from that particular domain, that particular workflow,
that particular use case, and use the reasoning power of large language models to solve it better.

It's basically like replicating the process of a perfect human: we all learn about the world and can do pretty much everything, but you have to go to university and do a PhD if you want to be the best at something. It's funny how we're basically trying to do the same thing with AI: we teach it the world in general, just like we do with our children, and then we give it lots of very specific documents, just like we do with university students, to make it an expert in one field. It completely makes sense.

Yeah, totally. I think the way you put it is about the perfect way to describe how we build these AI systems. You need all these pre-training steps, like raising kids from newborn to a college degree, and then you need professional training to make them an expert in a particular domain. What OpenAI and many other large language model builders have done is finish the pre-training, to make sure the AI is like a college graduate. What we do is provide that professional training, an agent designer for each AI, so they can be super professional in their domain. And they can have long-term memories, so they are not only professionals, they also grow as you use them: they become more and more grounded in your particular domain and your particular use case. That's very exciting, because I think in the future AI is not going to be one general model solving everything for everyone. Each person, each business needs an AI that is more related to them, to their life, more grounded in their daily work. Adding this grounding layer on top of the large language model is definitely somewhere a lot of innovation can happen.

So I assume that is the main goal of MindOS: to build these additional layers on top of the initial pre-training?
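The grounding layer Felix describes (domain documents plus the model's general reasoning) is often built as retrieval-augmented generation. Here is a minimal sketch of that idea; `call_llm` is a hypothetical stand-in for any chat-completion API, and the keyword-overlap retriever is a toy substitute for the embedding search a real system would use.

```python
import re

def call_llm(prompt: str) -> str:
    # Stand-in for a real chat-completion API call.
    return f"[model answer based on a prompt of {len(prompt)} chars]"

def tokens(text: str) -> set[str]:
    # Crude tokenizer: lowercase words longer than 3 characters.
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 3}

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    # Rank documents by naive keyword overlap with the query. A real
    # grounding layer would use embeddings and a vector index instead.
    ranked = sorted(documents, key=lambda d: len(tokens(query) & tokens(d)), reverse=True)
    return ranked[:k]

def grounded_answer(query: str, documents: list[str]) -> str:
    # Put only retrieved domain text in front of the model, so its
    # reasoning is applied to vertical data rather than pre-trained recall.
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    prompt = (
        "Answer using ONLY the reference material below. If the answer "
        "is not there, say you don't know.\n"
        f"References:\n{context}\n\nQuestion: {query}"
    )
    return call_llm(prompt)

docs = [
    "Ibuprofen should not exceed 1200 mg per day without medical advice.",
    "Paris is the capital of France.",
    "Aspirin can thin the blood.",
]
print(grounded_answer("What is the maximum daily dose of ibuprofen?", docs))
```

The design point is the split Felix draws: the model supplies the reasoning, while the retrieved documents supply the domain knowledge.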
And are you aiming to do that across most fields, or are you specializing in some? What exactly are you building and aiming for with MindOS?

Yeah. Mindverse is our vision: the term Mindverse basically describes a future where AI beings coexist with human beings to form a new society, where a lot of the things humans do are delegated to AI, and AI is an integral part of our society. That's the vision, and I think we can try to make it a beneficial future, a good future for humanity. MindOS is our product. It's basically an operating system that can generate AI minds; we call them AI Geniuses in our system. Anyone can get on the system and create an AI Genius for a particular domain. For example, you can create a Genius to be a hotel butler, a Genius to be an assistant for AI research, a Genius for HR support, all these kinds of things. To do that, we need to tackle a couple of technical challenges. One is to make them easy to use, and to add this grounding layer on top of each AI Genius. And to answer your question: we are making it as general as possible, because the foundation model itself is general, and the professional-training process is mostly alike across domains. The way you train it is similar, but the materials, the data you train it on, are different for different domains. So MindOS mostly tries to provide a tool set, an engine, a platform that different domains can use. We don't focus on only one or two domains; we want to make it a creativity platform where people can have their own creative way of generating AI Geniuses,
or AI Minds, for their own domain. That's the goal, and one of the biggest features of MindOS. But we also do other things. To name one that I think is very interesting: when you use ChatGPT, the reasoning power is there, but it can only do one round of reasoning. After you ask something, it gives you a result after one pass of reasoning. But we can already see a trend in industry, for example Auto-GPT, where the AI can use its reasoning power autonomously, over many iterations. It's like multiplying the reasoning power of the large language model by a big number for a particular task: it can break a complex task into subtasks and do them gradually, iteratively. I think that's also a very important piece of what we do in MindOS: letting the AI deep-think, do slow thinking, so it can leverage the power of reasoning more and become more powerful.

Yeah, I have lots of questions following that. But on the last thing you said: personally, using ChatGPT and other models, my main issue is hallucination. So I'm a bit afraid and skeptical about chaining those prompts and requests, because I feel it can grow the risk of hallucination, building on top of hallucinations and compounding the wrong it can do. Is there anything you're doing to mitigate the risk of hallucination, or is that something the businesses using MindOS need to tackle? Is there something you do to help with that?

I think hallucination is one of the major reasons people, especially businesses, don't use ChatGPT, actually. To me, the solution is twofold. One is what OpenAI is doing: they are training GPT-4, and maybe even higher-level large language models in the future, and I think one major goal of these newer models is to solve the hallucination issue.
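The Auto-GPT-style "slow thinking" loop described above (plan subtasks, execute them one by one, iterate) can be sketched as below. `plan` and `execute` are hypothetical stand-ins for model calls; the hard step budget is one simple answer to the runaway, open-ended behavior discussed later in the conversation.

```python
# Toy sketch of an iterative agent loop: decompose a task into subtasks,
# execute them in order, and stop at a hard iteration budget so the
# agent cannot run open-ended.

def plan(task: str) -> list[str]:
    # A real agent would ask the LLM to decompose the task.
    return [f"research: {task}", f"draft: {task}", f"review: {task}"]

def execute(subtask: str) -> str:
    # A real agent would call the LLM (and external tools) here.
    return f"done({subtask})"

def run_agent(task: str, max_steps: int = 5) -> list[str]:
    results = []
    for step, subtask in enumerate(plan(task)):
        if step >= max_steps:  # budget guard against runaway loops
            break
        results.append(execute(subtask))
    return results

print(run_agent("write a marketing plan"))
```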
I think in one of the interviews, by Sam Altman or Ilya, I can't remember which, they said the hallucination issue in GPT-4 was reduced by at least 60 percent. So that's one area where it can get better. Another area is what we are doing with the grounding layer. In the grounding layer we use tactics such as generating a very special prompt to force the AI model to speak only from the reference text, not from things learned in the pre-training stage. We enforce that, and we also added what we call a citation system: for everything the AI says, we ask it to add citations from the original source, and anything marked with a citation is more trustworthy than things that are not. That can solve some issues in how people perceive AI-generated results: they can put more trust in the things that have a good citation, and less in the things that don't. But I would say the hallucination issue is a fundamental flaw of large language models, and I actually think it's a fundamental flaw of the human mind as well. Humans sometimes do a lot of bullshitting too. But I think it's getting solved gradually. In areas like marketing or entertainment, people are more OK with some hallucination, as long as the amount is controlled. But in very serious domains, medical, as you said, or law, this is going to be a bigger issue. So I see different industries adopting large language models at different paces.

Yeah, and that's actually how I used to describe it as well. I used to compare large language models to Wikipedia: back in the day, you could trust Wikipedia, but you couldn't really cite it.
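The citation system Felix describes could look roughly like this: number the sources inside the prompt, ask for bracketed markers, and then flag any output sentence that carries no citation so the UI can present it as less trustworthy. This is a sketch of the general idea, not MindOS's actual implementation.

```python
import re

def citation_prompt(question: str, sources: list[str]) -> str:
    # Number the sources so the model can point back at them.
    numbered = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sources))
    return (
        f"Sources:\n{numbered}\n\n"
        "Answer the question using only these sources, and end every "
        "claim with its citation marker, like [1].\n\n"
        f"Question: {question}"
    )

def uncited_sentences(answer: str) -> list[str]:
    # Split into sentences and return the ones with no [n] marker,
    # so they can be flagged as less trustworthy.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", answer) if s.strip()]
    return [s for s in sentences if not re.search(r"\[\d+\]", s)]

answer = "The dose limit is 1200 mg [1]. It is also tasty."
print(uncited_sentences(answer))  # only the uncited second sentence
```

Flagging, rather than deleting, uncited claims matches the point made here: the citation does not eliminate hallucination, it tells the reader which statements to trust.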
You couldn't write something for school based on Wikipedia; you needed to check the sources and confirm. It's pretty much the same with large language models right now: you always need to confirm what the model says if it cannot cite where it took its information. That's why I think linking it to a memory based on lots of documentation is super powerful, and maybe the easiest way to tackle that. But I agree it may just be a human bias generalized to language models; lots of people share fake news or lie to their friends, and the data comes from us, so it makes sense that the model does the same thing.

Yeah, totally; that's a very good analogy. By the way, I think one thing we can always do to reduce the impact of hallucination is to make the thinking process of the AI as transparent as possible. For example, in the stage of retrieving information as external reference, we make it transparent; in the process of calling APIs to finish a task, we make it transparent. So when users talk to a powerful AI chatbot, all the actions and all the information used in the whole thinking process should be transparent, and people can have more trust. This cannot be done with a real person: when we talk to a professional, we cannot ask them to list their whole thinking process for us. But we can do it with AI. So I think we have a lot of ways to reduce the impact of hallucination.

I like that. I just mentioned that it's important to double-check what the language model says, to be sure it's truthful. So when it's chaining prompts and you are not controlling everything, is there a way to ensure that what happens between the input and the final result is truthful, and that there were no hallucinations in between?

I don't think we can. Nowadays we don't have the proper tools
to get into the details of the computing process done by large language models; it's more like a black box. But I think to build a powerful AI, you probably don't just use a large language model; you build a framework on top of it, with thought flows between different parts of the mind: thought flows from memory to the perception area, from the perception area to the motor-action area. And this high-level flow of information can be totally transparent to the users. I'm not sure whether OpenAI is developing something to visualize and make the hidden process transparent; I suspect it's very difficult to do. But we make the high-level process transparent, and we make sure everything the AI generates for the user has good references and good citations. I think that's one way to go, but it's not going to solve the problem 100%.

And could you identify the biggest challenge right now when trying to build this tool that produces specialized agents? Is there one specific challenge you are currently working on that is harder to solve than the others?

One thing is, as you said, hallucination. But beyond hallucination, one big challenge we try to tackle is how to make the AI as grounded as possible. People's knowledge of how to deliver a job is usually very complicated. It's not only about providing the AI a few documents, or a couple of tools; you always need to teach the AI how to behave in very different scenarios. We can definitely do that by giving the AI instructions for different scenarios, but it still sometimes doesn't understand the best practice for a particular domain. For example, if we're building a marketing agent, it has tools to connect to Facebook, and it has tools to find
the product details of a company. But how to combine them into a better marketing campaign is quite challenging. It can use some pieces of information to give you good ideas, but for it to autonomously finish the job for you is very hard. Auto-GPT and similar technologies, like what we are doing, can be a good direction to go, but this autonomy has issues: it's not controllable, it can be very open-ended and not focused on the particular task, which turns out to waste a lot of money without solving your issue. So how can we make an AI that knows how to deliver a complicated, domain-specific task without wandering off into random ideas? That's very hard to do. I would say that whether it's the hallucination issue or the autonomy issue, it's all about how we can steer the AI's behavior toward what we really want, using natural language. That's a key part, and that's why we need to build a really good framework on top of large language models: to deliver the know-how, to deliver feedback to the AI, so it can smartly incorporate users' feedback into the way it thinks and the way it performs tasks. That's very hard, and it's a major technical challenge we are facing, which we try to solve with the framework we developed.

And is the only way to tackle this through prompting, teaching it in context how to do the job, or are you also referring to fine-tuning, actually changing the brain of the AI to tweak its answers? Is it fully done after training, with prompting, or also by retraining or fine-tuning the models?

Yeah, that's also a very good point. I think currently most people use the prompting approach. The way we do it is not to ask a human to write a better prompt; it's to let the AI take feedback and update its own prompt, its own behavior, by updating its own instructions.
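A minimal sketch of that feedback loop might look like this: user feedback is accumulated as standing rules that condition every later answer, instead of a human rewriting the system prompt by hand. The class and method names here are illustrative, not MindOS's API; a real system would also ask the model to merge and deduplicate the rules.

```python
# Sketch of feedback-driven instruction updating: each piece of user
# feedback becomes a standing rule appended to the agent's own
# system prompt.

class SelfTuningAgent:
    def __init__(self, base_instructions: str):
        self.rules: list[str] = [base_instructions]

    def add_feedback(self, feedback: str) -> None:
        # Store the feedback as a new standing instruction.
        self.rules.append(f"User preference: {feedback}")

    def system_prompt(self) -> str:
        # The prompt sent with every request reflects all feedback so far.
        return "\n".join(self.rules)

agent = SelfTuningAgent("You are a helpful marketing assistant.")
agent.add_feedback("Always answer in bullet points.")
agent.add_feedback("Never mention competitor brands.")
print(agent.system_prompt())
```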
So that's one way to do it automatically, and if it can be done automatically, it's a very efficient way to tune and control the AI's behavior. But I think this approach has limitations and cannot achieve total control over the AI's behavior. So I believe in the future we will probably want some training built into the process. We don't train the foundation model, we don't train the big one, but we can definitely train a behavior module on top of it, which can be a very small model, like an adapter. This module would be in charge of taking user feedback, updating its parameters automatically, gradually adapting to the user's preferences, and making the AI more grounded and more suitable to what each user really needs. But the prompting approach is going to last for a while before the real fine-tuning stage comes, because we haven't reached the limits of the prompting approach yet, and it's always more convenient to work on the prompting part.

Yeah, that definitely makes sense. I have a question, mainly for a friend of mine, about Mindverse and MindOS, but it will also give context and help listeners better understand the tool. My friend is a recruiter at a company. He tries to find people to fill a broad range of roles, and right now he's trying to find ways to use AI to improve his work. He doesn't have any background in AI or programming, other than playing with ChatGPT. How could someone like him, and I assume a lot of listeners fit the same profile, use MindOS or Mindverse to improve their work? What do they need to learn or do? What would be the steps to end up with an agent helping them?

Yeah. So if I understand correctly, your friend is in the HR
Yeah, so if I understand correctly, your friend is in the HR space, right? Yeah. Okay, so definitely, I believe in the future any professional, HR people, lawyers, researchers, will have their own way of using MindOS, or any other agent platform, to build their own agents or just use other people's agents for their work. For example, if we look at this HR job, you can break it into different steps: some steps are about finding candidates from the web or from LinkedIn profiles, the next step is assessing how well each candidate fits the job description, then maybe there's the interview process, then communicating with the candidates about compensation, and so on. For many of these different things, you can definitely use MindOS to build an agent for your job.

For example, in MindOS, when you try to find a good candidate on LinkedIn, you can add an endpoint to the Genius you create that gives it the ability to browse the web, especially the LinkedIn website. Then you can create a good workflow by dragging a few modules together, saying: for any candidate you find, first assess them based on, for example, their past experience, and give them a score from one to five. This is mostly done by issuing a natural-language command to the AI and asking it to do what you tell it, so it will automatically start browsing the web, getting all these LinkedIn profiles, and grading them by their experience. Then you can build another workflow on top of that: after you grade them, please send all the candidates graded five to my email and rank them by the closeness of their current job to our city, something like that. So it's very, very easy to set up those kinds of workflows in MindOS.
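The scoring workflow just described can be approximated in plain Python to make the steps concrete. The scoring rule, the profile fields, and the filter threshold are all made-up assumptions; in MindOS these steps would be modules driven by natural-language commands rather than hand-written code.

```python
# Hypothetical sketch of the drag-and-drop workflow described above,
# expressed as plain functions: browse profiles -> score 1-5 -> keep the
# top-scored candidates -> rank them by distance.

def score_candidate(candidate):
    # Toy stand-in for "assess them based on their past experience":
    # one point per relevant year, capped at 5.
    return min(5, candidate["relevant_years"])

def workflow(candidates):
    scored = [(score_candidate(c), c) for c in candidates]
    top = [c for score, c in scored if score == 5]
    # "Rank them by the closeness of their current job to our city."
    return sorted(top, key=lambda c: c["distance_km"])

profiles = [
    {"name": "Ana", "relevant_years": 7, "distance_km": 12},
    {"name": "Ben", "relevant_years": 2, "distance_km": 3},
    {"name": "Chloe", "relevant_years": 5, "distance_km": 40},
]
for c in workflow(profiles):
    print(c["name"])  # Ana, then Chloe
```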
We actually have a very interesting feature we're currently building in MindOS: a collaboration network between AI agents, which we call AI Geniuses. For example, you create one AI agent for your talent-acquisition process, you have another AI agent working on resume grading, and a third agent working on automatic interviews with the candidates. You can connect them together and say: I need this talent-acquisition agent to first find the talent, then pass them to the second agent for grading and for finding fitting positions within our company, and then the third one starts an initial interview with the candidates. This way of working can be applied to all different types of jobs, not only HR. Every job has different components, each one requires teaching the AI how to do it properly, and if we have a mechanism to combine these different AIs together, we can largely reduce the repetitive and tedious work people do. That's mostly the goal of MindOS, and I think that answers your question of how they can use MindOS to improve their productivity.

Yeah, so just to summarize, it would almost only be through natural language, so English, or I assume it could even work in French or other languages. And do they need any other skills? For example, when you say the agent could go on LinkedIn and scrape all the different profiles, do you need to do anything related to connecting it to LinkedIn?

We already provide a set of public skills, for example the ability to browse the web, the ability to run some code, the ability to find a flight for your travel, things like that. But the beauty of the grounding layer is that for your particular work you can always have your own vertical abilities. For example, you may need to connect to a vertical database to look up some information, or find clients from your CRM.
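The collaboration network of chained agents can be sketched as a simple pipeline where each agent consumes the previous agent's output. The three agent functions below are stubs with invented names and data; real Geniuses would each be LLM-backed.

```python
# Hypothetical sketch of the agent "collaboration network": three agents
# wired in sequence, each consuming the previous one's output.

def talent_acquisition_agent(query):
    # Stub: would browse LinkedIn-style sources for candidates.
    return [{"name": "Ana"}, {"name": "Ben"}]

def grading_agent(candidates):
    # Stub: would grade each candidate's fit against the job description.
    return [dict(c, grade=5 if c["name"] == "Ana" else 3) for c in candidates]

def interview_agent(candidates):
    # Stub: would start an initial screening chat with top candidates.
    return [f"Scheduled interview with {c['name']}"
            for c in candidates if c["grade"] == 5]

def pipeline(query, agents):
    result = query
    for agent in agents:  # pass each agent's output to the next agent
        result = agent(result)
    return result

messages = pipeline("backend engineer",
                    [talent_acquisition_agent, grading_agent, interview_agent])
print(messages)  # ["Scheduled interview with Ana"]
```

The design choice worth noting is that each agent only needs to agree on the shape of the data it hands off, which is what lets agents built by different people be recombined.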
Whatever it is, you can always connect these vertical skills, these vertical tools, to the AI, so the AI itself will be able to autonomously use this set of skills. For each of your inquiries, each of your tasks, it will automatically come up with a plan using these different tools and solve your particular issue. If you think it's not good enough, add more tools to it and give more feedback to the AI, and it will become more powerful and adapt to your true needs. I think that's the power of MindOS. Like I said, it's a very good tool; you can use it for your own needs, and we provide a lot of flexibility so users can build basically anything they want.

Really cool, yes, and I'm excited to see what people will do without needing coding skills or having to learn other things to use those models. I think the sheer quantity of users will speed up the development progress as well, just based on all the investment it will stimulate. So I have a few final questions, mainly about the future; they'll be difficult questions but short ones. The first one is: is there a thing that AI cannot do yet but that you would love it to be able to do?

I think AI currently still doesn't have a kind of self-consciousness. I'm not saying self-consciousness in a risky sense; it's more that the AI should be self-aware of what it can and cannot do. That's very important; I think that's a major source of hallucination. If it knows it doesn't have sufficient knowledge or sufficient capability to finish something, it should be able to realize it is lacking that capability, and then it will stop generating hallucinations, generating false information. So I think it's very important to build some sort of self-awareness module into AI agents, or into the AI mind framework.
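The self-awareness idea, an agent that declines when a request falls outside what it knows, can be sketched as a wrapper around an answering function. The keyword-based capability check here is a deliberately naive stand-in; estimating what a model actually knows is an open research problem, and every name below is invented for illustration.

```python
# Hypothetical sketch of a "self-awareness module": before answering, the
# agent checks whether the question falls inside its known capabilities
# and declines instead of guessing.

KNOWN_TOPICS = {"python", "linkedin", "resume"}

def aware_answer(question, answer_fn):
    # Naive capability check: does the question touch a known topic?
    topic_known = any(t in question.lower() for t in KNOWN_TOPICS)
    if not topic_known:
        # Admitting ignorance instead of hallucinating an answer.
        return "I don't have enough knowledge to answer that reliably."
    return answer_fn(question)

reply = aware_answer("How do I parse a resume?", lambda q: "Use a parser.")
print(reply)  # "Use a parser."
unknown = aware_answer("What is the GDP of Atlantis?", lambda q: "...")
print(unknown)  # declines rather than guessing
```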
That way, the agents not only understand what they cannot do, but also understand that they need to learn new things to grow themselves, to be self-teaching. That would be something super helpful and super cool, and I don't see any AI tools or AI frameworks based on large language models that have that down yet.

I completely agree, and self-teaching will be super important, but I'm afraid it's also something humans are good at: the typical "fake it till you make it." Some humans assume they can do something, or just fake that they can, when they cannot, so AI is just doing the same thing.

Yeah, it's true, it's true.

And so, okay, I asked you about the biggest challenge facing your company right now, but would you have a different answer for the biggest challenge facing the whole AI industry in general, or would you assume that's the main priority for most people?

I've thought about that a lot recently, especially after seeing the power of GPT-4. I think the biggest challenge is how we can control the impact on society as a whole. We all know this AI is very powerful; everyone's way of working and living their life is going to be fundamentally changed by this technology. But it can be good or it can be very risky, and if it happens too fast, a lot of people will lose their jobs. So I totally agree that AI in the long run can be very beneficial for society, but in the short term we probably want to be more conservative about pushing it forward. I think human society as a whole needs to be more prepared for the impact it's going to have on us, and we probably need more regulation and better technological tools to control the negative impacts of AI. I think that's a major thing stopping AI from being more powerful, which I think is good. So I think it takes a collective effort from
everyone who is involved in this AI wave, whether it's people like us, AI professionals, or just a random person on the street with very little knowledge of AI. I think we should all pay more attention to this particular issue.

I completely agree. And speaking of the long run, how do you see AI evolving over the next five years? In your mind, where will we be in five years?

In five years, there's probably not going to be a huge difference. I think two things are going to happen. One is that AI's reasoning ability is going to be much better, so it can be well above the average human in terms of how it analyzes and reasons to solve complicated problems. The second thing that is happening, or is going to happen, is that it will be able to handle more signals. Nowadays it mostly handles text data and responds with text data as well, but in the future it will be able to absorb visual data, voice, all the different signals and senses that humans have, and then respond not only with text output but also with actions, with different ways of delivering information to the end user. That's what I think is going to happen in five years. But from a more long-term perspective, I believe all digital services are going to be changed by AI. I think the AI copilot, the AI agent, is the new form of software: every piece of software will take the form of agents, and everyone will be able to have an army of agents that they can leverage to help them finish a lot of things. That's probably not very far off, five to eight years.

Yeah, that would be really cool. And about just the language-model part: we've seen the big jump from GPT-3 to
ChatGPT, and from GPT-2 to GPT-3 as well. But would you agree that we may have hit something like the Pareto rule, whatever the exact percentages: GPT-3 to ChatGPT was 20 percent of the work to get 80 percent of the results, but the final 20 percent will require 80 percent of the work? Do you think progress will slow down compared to the GPT-3 to ChatGPT jump for the next big step, or do you believe we will still improve quite a lot in the following years?

I think it's a very interesting question. If you truly look into what OpenAI has done over the past few years, they are still trying to scale it up, and they still believe there's a lot more to be mined from scaling the model to the next level. So my opinion is that, in terms of the large language model itself, in the next three to five years we can still make huge progress to make models more intelligent, to let them handle longer context, and to make them better and more powerful in general. I don't see it slowing down. But I do believe there are definitely some limitations to the large language model itself, so we need to build a framework around it to unleash its power further. On that front, we will see many companies and many researchers working to make it more autonomous, to make it, like you said, self-teaching, to make it more powerful by connecting it to the external world so it can use external tools, and to make it more adaptive as you use it. All of these are a different layer of innovation on top of large language models, and combining them multiplies these different factors of innovation. So in five to ten years, I can see the whole AI landscape still growing very, very fast; it's going to be as fast as the past five years.

Or even more. Yeah, that's super exciting. Well, first, I'll recommend everyone listening check out Mindverse and MindOS; I think it's a really good product and super promising. Really cool. I'm excited about everything related to agents, and a bit scared, but I
hope it will go well. Do you have anything you'd like to share with the audience, either about MindOS or your personal projects?

Yeah, I can share a little bit about MindOS. MindOS is currently still a closed-beta product; we are experimenting with it with around 500 to 1,000 pilot users. It's not 100 percent ready yet, but we are iterating very fast, so it will probably be ready within two months, and then it can be used by anyone in the world. Hopefully at that time it can help you a lot. And if you are interested in using the closed-beta version of MindOS, please go to mindverse.ai and apply for a trial; you can test this early version and give us your valuable feedback, which would be very much appreciated.

Awesome, thank you very much for your time, it was super valuable and insightful. I really enjoyed discussing large language models and Mindverse in general. As I said, it's a very interesting topic and challenge; it's basically research, but applied research, so that's amazing, and just like in research it's super iterative, and you will just keep learning and improving, so it's really cool. Thank you very much for your time, I really appreciate it.