Learn Mistral AI – JavaScript Tutorial

**The Mistral AI Scrimba Course: Exploring the World of Large Language Models**

This part of the Scrimba course covers running a model locally with Ollama. The cost of hardware and electricity aside, the tokens you generate this way are completely free, and your data stays private because it never leaves your device. That sets local inference apart from calling a hosted API.

**The Power of Local Inference**

One of the most useful aspects of Ollama is that it can serve a model for any AI project you build locally. To explore this further, we'll take a closer look at the process of creating an AI agent with function calling and local inference with the help of Ollama. Click the cog wheel in the bottom right corner, then click "Download as Zip". Unzip the folder and rename it to create a new project. Here the project is named "Ollama hello world"; navigate into it in your terminal.

To get started, run `npm install` and then `npm start`. This spins up the project on port 3000, so we can open it in the browser and ask questions via the `question` parameter. The code uses the Ollama SDK and patterns familiar from the Mistral SDK: we call the chat method and pass in an object with the model set to `mistral` and a messages array whose entries contain a role and content.
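The request described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual server code, which is not shown here. The helper names `questionFromUrl`, `buildChatRequest`, and `answer` are hypothetical, and `client` is assumed to be an object with an async `chat(request)` method that resolves to `{ message: { content } }`, which is the shape the Ollama JavaScript SDK uses.

```javascript
// Sketch only: helper names are hypothetical, not taken from the course project.

// Pull the `question` query parameter out of a request URL such as
// "/?question=Why%20do%20stars%20shine".
function questionFromUrl(requestUrl) {
  const url = new URL(requestUrl, "http://localhost:3000");
  return url.searchParams.get("question") ?? "";
}

// Build the object passed to the chat method: a model name plus a
// messages array of { role, content } objects.
function buildChatRequest(question) {
  return {
    model: "mistral",
    messages: [{ role: "user", content: question }],
  };
}

// Ask the model and return the text of its reply.
// `client` is assumed to expose chat(request), as the Ollama SDK does.
async function answer(client, requestUrl) {
  const response = await client.chat(buildChatRequest(questionFromUrl(requestUrl)));
  return response.message.content;
}
```

With the `ollama` package installed and Ollama running locally, `client` would be the default export of that package (`import ollama from "ollama"`), and `answer(ollama, "/?question=Why%20do%20stars%20shine")` would resolve to the model's reply. Passing the client in as an argument also makes the function easy to try with a stub.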

**The Magic of Chat Completion**

When we type a question in the browser, such as "Why do stars shine", and hit Enter, the browser works for a little while before showing a reply. The response explains that stars shine because of nuclear fusion, in which hydrogen is fused into helium, releasing immense amounts of energy.

**A Sense of Accomplishment**

As we reach the end of this course, we should take a moment to acknowledge our achievement. Most people who start online courses give up before they reach the end, but not you – you're not a quitter. You've completed this course, and that's something to be proud of.

**Recap of Course Materials**

To help solidify our understanding, let's quickly recap the key takeaways from this course. We began by learning the basics of Mistral and their platform, including the chat completion API, which we interacted with through their JavaScript SDK. We also explored the various Mistral models, embeddings, and vector databases using Supabase. Additionally, we used LangChain to chunk text, which is a prerequisite for RAG (retrieval-augmented generation).
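To make the embeddings part of this recap concrete, here is a small sketch that uses the toy two-dimensional vectors from the embeddings lesson. It is an illustration under simplifying assumptions: real embeddings come from an embedding model and have hundreds or thousands of dimensions, and vector databases commonly compare them with cosine similarity rather than the plain Euclidean distance used here. The helper names `distance` and `nearest` are hypothetical.

```javascript
// Toy 2-D "embeddings" from the lesson's graph.
const animals = {
  cat: [4.5, 12.2],
  feline: [4.7, 12.6],
  dog: [7.5, 10.5],
  building: [15.3, 3.9],
};

// Straight-line (Euclidean) distance between two vectors of equal length.
function distance(a, b) {
  return Math.hypot(...a.map((value, i) => value - b[i]));
}

// Name of the entry in `space` closest to `target`, skipping names in `exclude`.
function nearest(target, space, exclude = []) {
  let best = null;
  for (const [name, vector] of Object.entries(space)) {
    if (exclude.includes(name)) continue;
    const d = distance(target, vector);
    if (best === null || d < best.d) best = { name, d };
  }
  return best ? best.name : null;
}

// king - man + woman, computed component by component.
const royals = { king: [2, 5], man: [1, 3], woman: [1, 4], queen: [2, 6.2] };
const result = royals.king.map((value, i) => value - royals.man[i] + royals.woman[i]);

console.log(nearest(animals.cat, animals, ["cat"])); // feline
console.log(result); // [ 2, 6 ]
console.log(nearest(result, royals)); // queen
```

A semantic search works the same way in spirit: embed the query, then return the stored items whose vectors lie closest to it.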

Finally, we created an AI agent with function calling and ran local inference with the help of Ollama. Along the way, we built a project from scratch and integrated its components. By completing this course, you've gained a set of skills that will serve you well in your journey as a developer.

**The Future of AI**

As we conclude this course, it's essential to remember that the world of AI is exploding, offering developers like yourself the possibility to create entirely new experiences and apps. The possibilities are endless, and the world is your oyster. We encourage you to keep building, to experiment, and to push the boundaries of what's possible with large language models.

**Join the Community**

To celebrate completing this course, we invite you to share your achievement on social media or join the Scrimba Discord community. Our Today I Did channel is a great place to connect with others who have also completed courses, and we love seeing people achieve their goals. We wish you all the best in your future endeavors and look forward to seeing what exciting projects you'll be working on next.

"Learn how to use Mistral AI to build intelligent apps, all the way from simple chat completions to advanced use cases like RAG and function calling. Per Borgen from Scrimba created this course in collaboration with Mistral AI. You'll get hands-on experience with Mistral's open-source models, including Mistral 7B and Mixtral 8x7B, and their commercial models. By the end of this course, you'll master essential AI engineering paradigms, enabling you to create sophisticated conversational user experiences and run AI models locally on your own computer.

Hi there, and welcome to this introduction to Mistral AI. My goal with this course is to teach you how to build magical stuff, and more specifically, how to do that using JavaScript and Mistral AI. If you don't know what Mistral is, it is a company that builds so-called foundational models, and that in 2023 twice managed to stun the AI community by launching small open-source foundational models that were on par with the best closed-source models out there. So as an AI engineer, Mistral is definitely something that deserves your attention.

In this course, we are going to start off by looking a little bit closer at Mistral in general and their platform before we dive into the API basics and how to use their JavaScript SDK, as this course is based around JavaScript. Their Python SDK is similar, though, so even if you prefer Python over JavaScript, you'll still get a ton of value from this course. We are also going to go through all of the models that Mistral offers at the time of recording this course, including their embedding model, which lets you work with vector databases, which you'll also get an introduction to, in order to give your AI apps domain knowledge. That could, for example, be proprietary company data, real-time information that the model hasn't been trained on, or extra in-depth knowledge about a specific subject that is too narrow for the AI to have been trained on. And we'll do this through a technique
called retrieval-augmented generation, a.k.a. RAG. You'll also learn how to build AI agents with function calling, enabling your apps to take action based upon the user prompt, a truly revolutionary paradigm. And finally, you'll learn how to run your models locally on your computer and interact with them both via the terminal and a web page.

Now, who am I? I've been a developer, instructor, and startup founder for almost 10 years now, and I'm also the CEO of the learning platform you're on now, which is Scrimba. I've created tutorials on JavaScript, React, and AI engineering, and in total they have been watched by literally millions of people through the Scrimba platform, Coursera, and YouTube. I love to connect with my students, so please click on either of these links if you're interested in connecting on either X or LinkedIn.

You'll also see lessons from two other teachers throughout this course, namely from Guil Hernandez, one of our brilliant instructors here at Scrimba, and we're also proud to have Sophia Yang, the head of developer relations at Mistral, contributing to this course. So as you probably understand now, this course is a collaboration between Mistral and Scrimba. We're not pulling this curriculum out of thin air; it has been created in partnership with the company itself. If you ever find yourself lacking some JavaScript skills or AI engineering concepts, please check out our Frontend Developer Career Path or this AI engineering course, as those will help you get up to speed. So with that, let's get started.

Hello, this is Sophia Yang from Mistral AI. I'd like to welcome you to the course and give you a brief introduction to Mistral. Mistral AI was founded last year by our three co-founders, Arthur, Timothée, and Guillaume. We first released our open-weight model, Mistral 7B, in September last year. We released Mixtral 8x7B, a mixture-of-experts model, and La Plateforme in December. We currently have offices in Paris, London, and the San Francisco Bay Area. We offer six models for all use cases and business needs, including two open
source models, Mistral 7B and Mixtral 8x7B. They're under the open-source Apache 2.0 license, and they're great to get started experimenting with. We also offer four optimized enterprise-grade models: Mistral Small for low-latency use cases, Mistral Medium for language-based tasks, and Mistral Large for your most sophisticated needs. We also offer an embedding model, which offers state-of-the-art embeddings for text.

To get started, you can use our chat assistant, Le Chat, to interact with our models right away. Just go to chat.mistral.ai and you can play with Le Chat there. There are several ways to use our models. We offer API endpoints for all of our models through the platform; you can subscribe and get an API key on the platform. This is the easiest to use and deploy. You can also use our models on cloud services, which provide the fastest deployment for enterprises, especially for those who already use cloud services. You can also self-deploy our models on your own on-prem infrastructure. This will give you more control and flexibility, but it's the most complex among the three. So it's a trade-off between ease of deployment and level of control, and you can choose whichever you want for your own use cases and your business needs. This course will focus on the platform and how to use the Mistral API for various tasks. Hope you enjoy the course.

Okay, in order to interact with the Mistral API, you need an API key, which you'll get through their platform, or La Plateforme, as they call it. So click on this image right here and you'll be taken to the Mistral homepage, and there you can click on the Build Now option. That'll take you to the authentication screen, so choose whichever authentication method you want. In the next step, you're asked to create a workspace name and check off whether you're a solo creator or doing this as a team member in a company. Whatever you choose, click Create Workspace, and there we go: this is the platform. In order to get access to the API, you have to provide a card, or subscribe, as they say here. However, you only pay for what you use, so
this is not an ongoing fixed subscription. Just add your card, and once you've done that, this box will go away and you can click on API Keys to create keys you can authenticate with. Click on Create New Key, give it a name and an expiration date, and then click Create Key. Now, you'll only see this key once, so be sure to save it as a Scrimba environment variable. You'll learn how to do that by clicking on this link right here. And please don't take the time to try to copy this API key right here; by the time you watch this scrim, this key is no longer active, as I've deleted it. So go ahead and follow these steps and set the env variables in Scrimba, and then in the next scrim my colleague Guil will teach you the basics of how to interact with the Mistral API through JavaScript.

Hey, in this tutorial we'll go over using the chat completion API, which allows you to chat with a model that's fine-tuned to follow instructions. So let's dive right in. We're going to use Mistral's JavaScript client, which I've installed and set up in this interactive scrim. I'm importing Mistral AI at the top of the JavaScript file, and I've instantiated a Mistral client using my API key, which I've stored as an environment variable on Scrimba. So we're ready to go.

The chat completion endpoint is designed to handle back-and-forth conversations. You feed it a prompt and a series of messages, and it generates a completion, or an appropriate continuation of that conversation. So now let's make our first chat request using Mistral's chat method. I'll declare a constant named `chatResponse` to store the response returned from the chat request, which we'll await with `await client.chat`, and pass the method an object containing the request body. The chat completion API accepts various parameters. The two required parameters are `model` and `messages`. Mistral has various pre-trained models you can use with the API; for our purposes, we'll use a model called `mistral-tiny`. Then I'll set the `messages` parameter to an array. This is a key part of the chat request, as it holds the prompts to generate completions for. This should be an array of message objects, each with `role` and `content` properties. `role` defines the role of the message; I'll set it to `user`, indicating that the message is from the user's perspective. Then I'll set `content` to the actual content of the user message. This is usually a question, like "What is the best French cheese?"

All right, and this is all we need to generate a chat completion. So let's log the response to the console, and the way to access the message content directly is like this. I'll run this code by clicking the Run button, and good: the API returns a humanlike response about the different types of French cheese.

All right, so what I want you to do now is personalize the AI response by updating the `content` property to something that interests you. You might not have realized this yet, but this isn't your typical video player. You are experiencing a fully interactive scrim that you can pause at any moment and jump right into the code and make changes to it. So go ahead and ask the AI a question, then click Run.

Okay, hopefully that was fun and you got some great responses. Now let's experiment with other parameters to make our response more interesting. We'll use the `temperature` parameter to set the creativity or randomness of the generated text, and this should be a value between 0 and 1. Now, the default temperature is 0.7, but as you get closer to 1, the output will be more random and creative, while lower values make the response more focused and deterministic. I'll set it right down the middle at 0.5 to strike a balance between creative and predictable
responses. And now I'll feed it a different question, like "I want a puppy. What is the most kid-friendly dog?" I'll run this code, and I get back a detailed conversational response about various dog breeds. Good. All right, I want you to go ahead and pause me now and try experimenting with different temperature values.

You can also provide custom system prompts to guide the behavior of the model. This time I'll set `role` to `system`, then set `content` to the instructions or prompt for the model. This is your chance to influence how the AI responds, so I'm instructing it that it's a friendly cheese connoisseur, and that when asked about cheese, it should reply concisely and humorously. Now, running this won't work, because we need to follow the system role with a user role and content. I'll set the `role` property in this second message object to `user`, then set this `content` property to ask, "What is the best French cheese?" I'll run this code, and I get back a fun and witty response about French cheese. Fortunately, it's always cheese season, right? All right, so that's it for the basics of working with the chat completion API.

Now that you've gotten to know the basics of how to set up a request to Mistral, let's have a look at some of the options and configurations you as a developer can adjust so that you can tweak the response you get from Mistral to your needs. Perhaps the most apparent one is adding support for streaming, because that is often a key feature of AI apps. For example, here on Hugging Face, the platform for open-source AI models and datasets, on the Mistral organization, there's a hosted version of one of their models along with a chat interface so that you can talk with it. So here I'll ask it the question "What's your favorite taco ingredient?", and when I send that, I immediately see the response getting built up token by token until it's done. This is a really pleasant user experience. So let's see how we can tweak this from just giving us the entire response to giving us one token at a time. So the
first thing we need to do is change this from `chat` to `chatStream`, like that. What then happens is that this `chatResponse` changes from being a regular object to being a so-called async iterable, meaning that we have to await as every item in this iterable becomes available to us. So `chatResponse` will gradually be built out as we get token by token from the Mistral API. The way to deal with this is to create an asynchronous for...of loop, so we'll write `for await`, and then `const chunk of chatResponse`. Every time the body of this for loop is executed, we get access to a new chunk. As for the chat response, this is an object with many properties, so we'll have to navigate almost in the same way as we navigated into `chatResponse.choices`, though instead of `message` it's called `delta`. So if we now try to console.log this and comment this one out, let's see what happens. And yes, we are getting a ton of stuff logged to the console super fast. So this kind of buildup of the response would happen almost instantly, and probably a lot faster than we could read it, though it's a much better user experience than having to wait until the entire thing is generated and then get the response in one go.

Okay, let's have a look at another cool configuration you can make to the request, and that is to tell Mistral that you want the reply in the format of JSON, that is, JavaScript Object Notation. Here is an example of a JSON string, and if you don't know what it is, it is essentially a very common schema that developers use when sending and processing information. So being able to get this kind of format from the AI is super helpful as you integrate it with your app. Doing this only requires two small settings. The first one is that you need to set the response format as an object of type `json_object`, like that. And then you also need to specify it in the prompt, so here I'll write "reply with JSON", like that. Here the data will be processed by code, and not by a human first and foremost,
so let's skip the streaming here, because it is mostly for the UX directed at humans, and go back to `chat` here, and finally uncomment this one, like that. So let's run the code, and yes, there we get a JSON object. I'll copy it from the console and paste it in here, and there we can see it is an object with an `answer` key that talks a little bit about good cheese, and then it also has a `cheese` key, which is an object that has three keys: `name`, `country`, and `type`. So you can imagine it being a lot easier to extract the metadata from this reply as opposed to simply getting a couple of sentences. I would recommend you play around with this, check out the documentation, and see what other configurations and modifications you can make to this response. Then once you're ready, I'll see you in the next scrim, where we'll dive more into what we've configured on this specific line, which is the models themselves that Mistral provides, as it's important to have a good overview in order to choose the right ones for the job. So I'll see you there.

Hey, in this scrim we're going to look at the various models Mistral offers. Be aware, though, that these are the models it offers at the time of recording this scrim. You should definitely click on this image right here so that you're taken to the landing page for their models, as there you can click around and check out their latest optimized commercial models as well as their open models.

Now, speaking of open models, Mistral rose to prominence in the AI community in 2023 when they launched their first model, Mistral 7B. That is a model that has so-called open weights, meaning that you can download it to your computer or upload it to a server and use it as a part of your application without paying Mistral a dime. One of the things that stunned the AI community was how powerful it was despite only having 7 billion parameters, as the leading open models back then had many more parameters than this, even an order of
magnitude more. A little later, Mistral launched the so-called Mixtral 8x7B, which is also an open model and has a unique architecture that allows it to be much more powerful, though only slightly more expensive to run inference on. The core idea behind this one is that it uses a mix of eight different so-called experts. The total number of parameters here is not actually 8 × 7 billion, which would be 56 billion, but around 46 billion, and when you run inference it only taps into two of these experts at a time, so it actually uses around 13 billion parameters when being run. Now, at this point you might be a little bit confused and want to know more about this. I don't want to go more into the technical details here, because I don't think it's that important in order to use these technologies, though if you are interested, feel free to click on this image right here and you'll be taken to an article which talks more in depth about the Mixtral model.

Moving on to the next models, those are Mistral Small, Mistral Medium, and Mistral Large. These are not so-called open-weight models, meaning that you can't simply download them from their website and get started locally. You either have to use them via a cloud provider that supports these models, or you can do self-hosting as well, though to do that you have to talk with the Mistral team. Now, if we compare these models side by side, with their performance on the MMLU test as the height of each bar, you can see that the commercial models are more powerful than the open models, though the small commercial model and Mixtral are within the same range. If you don't know what MMLU is, it is a common way to test LLMs. It's short for Massive Multitask Language Understanding, and it puts LLMs to the test through a range of different tasks, giving them a score from 0 to 100% based upon how well they perform. Now, looking at this image, it seems that we should always go for the Mistral Large model, but that's actually not the case, because the flip side of using a better model is very often that it is
more expensive. So if we plot these models out on a two-dimensional space, with the cost per million tokens on the x-axis and the MMLU score on the y-axis, you can see that the picture is definitely different, because Mistral Large is by far the most expensive model, over twice as expensive as Mistral Medium. So here, if you are able to get the job done with Medium, you should definitely choose that. One analogy you can think of here is hiring people at a company: in many cases, you probably don't want to hire a person who is overeducated or overqualified for the job, because most likely their hourly rate will be higher.

So how do you then decide which model to use? If you want to dive more into this subject, just click on this image here and you'll be taken to the guide in the docs, which specifically talks about model selection. There you can see some use-case examples of what kinds of typical tasks a model is suitable for. For example, Mistral Small works well for things like classification and customer support, whereas Mistral Medium is the ideal model for intermediate tasks that require moderate reasoning. That could be things like data extraction, summarizing a document, writing a job description, and so forth. And finally, if you want to do more complex tasks, Mistral Large is your go-to model. Later in this course, we are going to create a little agent that can call functions on behalf of users, in addition to doing so-called retrieval-augmented generation, a.k.a. RAG, and in those cases we are going to use the large model, as those require significant reasoning capabilities. And on that note, what exactly is RAG? Well, you'll find out in the next scrim.

Here at Scrimba, we use an app called Notion for note-taking, and with a team of several teachers, developers, people in operations, and so forth, we have a lot of internal documentation, and it quickly becomes chaotic. So here we have a Courses and Teaching page, which again contains a bunch of subpages, and they themselves also have sub
pages as well, so it is actually quite hard at times to get to the answer you want. That is why I was really glad when Notion launched their Ask AI feature, which essentially means that you can ask questions to Notion. So one day when I was working on our Coursera exports, I seemed to remember that we needed a widget for doing these exports, and I asked it about exactly that. It thought a little bit and then came with an answer: "Yes, you are correct. For Coursera courses, a type of item called plugin is used to embed scrims." This is quite interesting, because I asked for a widget, but the AI understood that I actually meant the plugins. So it shared with me, through this footnote here, the link to the document that talked about these Coursera plugins. This kind of user experience is a game changer for web apps. Suddenly it is much easier to find the information you need, and you also give the LLM access to proprietary data, as obviously the underlying model here does not have any knowledge about how we at Scrimba internally embed our scrims in Coursera courses.

Now, this whole experience was only possible through something called retrieval-augmented generation, which probably sounds very complex, but don't worry, we'll go through it step by step, and we won't refer to it through this long, complex name; we'll use the popularized expression RAG. Okay, so RAG consists mainly of two steps. There's the retrieval step: fetching the data you need to reply to the user's question. And there's the generation: taking whatever information you found and using that as context when generating the conversational reply back to the user. If we zoom in on the retrieval first, this is very often done in collaboration with a so-called vector database. That is a specific type of database that is optimized for storing information in a specific format that makes it easy for AI to reason about it: it stores so-called embeddings. Now, at this point you're probably a little bit confused. What's
this thing about vectors and embeddings and all of that? Don't worry about it; we'll get back to that later. For now, I just want to explain RAG on a very high level. What you do is take all of your data and shove it into a vector database in this specific embedded format. Then you take the search query, or the input from the user, and turn that into an embedding as well, as that gives you the opportunity to do a so-called semantic search and get search results that intelligently understand, for example, that no, Per wasn't looking for a widget, he was actually looking for this, and thus fetch the relevant data for the app. That is the retrieval part.

Once you've done that, you take the user input, that is, the question I asked, which was a very humanly written sentence along the lines of "I seem to remember something about a Coursera widget, blah blah blah", and then you combine that with the search results we got in the retrieval step and turn it into a single prompt that the LLM can use as input. So Mistral AI takes that prompt and the relevant context we retrieved and turns that into a very humanly readable response, in many cases with a footnote or link to the underlying data as well, thus providing the user a way of fact-checking the claim that the AI comes with.

Now, there's one thing that all of this relies on, which is our ability to turn data, for example a sentence, into numbers that the AI can understand. In other words, all of this relies on our ability to create something called embeddings. And what is an embedding? Well, it is what you get when you take a piece of data, for example the string "hello world", and run it through an AI model that turns it into a long array of numbers, also known as a vector. As we build out a RAG solution in this course, it is really important that you have an intuitive understanding of what this embedding concept is. So before we continue on with our RAG project, I'll leave the mic to my colleague Guil Hernandez, who will give you a primer on embeddings in the next
scrim.

Whether you realize it or not, AI-powered search shapes many parts of your daily lives. Every day, you interact with platforms sifting through massive amounts of data, from text and images to audio and video. Think about Amazon recommending products or search engines refining your queries. Social media platforms curate tailored content, while services like YouTube, Netflix, and Spotify offer suggestions based on your preferences. Now, advanced AIs, despite their capabilities, don't truly understand the real world as we do. They can't grasp the actual meaning or nuance of a video title, song, or news article. So how exactly do AIs and platforms like Spotify, Netflix, and YouTube truly get us? How is it that they appear to understand, predict, and respond to us as effectively as, if not better than, people?

Well, the magic behind this capability involves a blend of algorithms, AI models, and huge amounts of data, but a larger part of the answer involves embeddings. You see, when you present a question to an AI, it first needs to translate it into a format it can understand, so you can think of embeddings as the language that AI understands. The term "embedding" is a mathematical concept that refers to placing one object into a different space. Think of it like taking a word or sentence, which is in a content space, and transforming it into a different representation, like a set of numbers in a vector space, all while preserving its original meaning and the relationships between other words and phrases.

AI systems process lots of data, from user inputs to information in databases. At the heart of this processing are embeddings, which are vectors representing that data. Transforming content like search queries, photos, songs, or videos into vectors gives machines the power to effectively compare, categorize, and understand the content in a way that's almost human. So how is all of this possible? Well, it isn't exactly as easy as just turning data into vectors, so before we go any deeper, let's take a closer look at
what vectors are. Think of a vector as a coordinate or point in space, and to keep things simple, we'll have a look at this 2D graph with x- and y-axes. Let's say that a word like "cat" is translated into a vector like [4.5, 12.2], which is this point. This vector encapsulates the meaning and nuances of the word "cat" in a way an AI model can understand. Then we have the word "feline", represented by a nearby vector of [4.7, 12.6], so we'll place that point on the graph. Words that have similar meanings are numerically similar and tend to be closely positioned in the vector space, so this closeness implies that "cat" and "feline" have similar meanings. Now let's say we have the vector for "kitten", which might also be close to "cat" and "feline", but maybe slightly further apart due to its age-related nuance. A dog is different, but still in the same general domain of domesticated animals, so the word "dog" might be represented by a vector that's not too distant but clearly in a different region, let's say [7.5, 10.5]. Even a phrase like "man's best friend", which is a colloquial term for a dog, could be represented by a vector that's close to the vector for "dog". On the other hand, a word like "building" is not related in meaning to any of these, so its vector would be much further apart, let's say [15.3, 3.9].

Here's another example that demonstrates how embeddings might capture semantic meaning and relationships between words. Let's say we have the word "king", represented by the vector [2, 5]; then "man" is the vector [1, 3], and "woman" is represented by the vector [1, 4]. Now let's do some quick vector arithmetic. We'll start with the vector for "king", then subtract the vector for "man" to remove the male context, and add the vector for "woman" to introduce new context. After performing this vector math, our resulting vector is [2, 6], so we'll plot that point on the graph. And let's say there's another word in our space, "queen", represented by the vector [2, 6.2], right here. Well, this vector is extremely close to the resulting
vector, so we might identify "queen" as the most similar word based on that vector, just as a trained AI model would. Now, a two-dimensional graph is a massive simplification, as real-world embeddings often exist in much higher-dimensional spaces, sometimes spanning hundreds or even thousands of dimensions. For example, the actual vector embedding for the word "queen" might have values across multiple dimensions. Each dimension, or number, in this vector might capture a different semantic or contextual aspect of the word "queen", for instance royalty, Cleopatra, or even chess. This is what allows AIs to recognize and differentiate between these contexts when the word is used in different scenarios.

Now imagine embedding hundreds of thousands of words and phrases into this high-dimensional space. Some words will naturally gravitate closer to one another due to their similarities, forming clusters, while others are further apart or sparsely distributed in the space. These relationships between vectors are extremely useful. Think back to Spotify's method of embedding tracks in a vector space: tracks that are positioned closely together are likely to be played one after the other.

All right, so what else can we do with embeddings, and how are they used in the real world? Well, you can imagine how embeddings have revolutionized our daily experiences. For example, search engines have evolved to understand the essence of your queries and content, moving beyond mere keyword matching, and recommendation systems, with the aid of embeddings, suggest products, movies, or songs that truly resonate with our preferences and purchase history. For example, Netflix uses them to create a tailored and personalized platform to maximize engagement and retention. Also, in the healthcare industry, embeddings are used to analyze medical images and extract information doctors can use to diagnose diseases, and in the finance world, embeddings help with analyzing financial data and making predictions about stock prices or currency
exchange rates. So every time you interact with an AI chatbot, every time an app recommends something, behind the scenes embeddings are at work, translating data into meaning. All right, so how are these embeddings actually created? Well, let's dive into that next.

Before we create our embeddings, there's one important thing you need to learn, and that is how to split text, because as an AI engineer you'll find yourself having to split text again and again. Let's say that you are working on an internal employee handbook app which lets employees ask questions about the company policies. In that case you probably have a large data source, like the one you can see here in `handbook.txt`, which contains all of the data that you need to embed. However, creating one embedding of this entire thing would just be meaningless. There are far too many subjects and themes talked about in this handbook, so it wouldn't really have any specific semantic meaning of value. It would be far too broad. So what we're going to do is take this document and split it into chunks, and then we'll create an embedding of every single chunk.

Now, creating such chunks is actually a little bit complex, though luckily we have a tool to help us with that, and that is LangChain, one of the leading libraries for AI engineers. So what we'll do is enhance this function so that it uses the LangChain text splitter, because as you can see, this doesn't do much at the moment. It's simply an async function that fetches the handbook and calls `.text()` on the response, thus giving us all of the text in this handbook. Let's run the code and just see that it works. Yes, there we have it.

So now we can use LangChain to split this into smaller chunks. I'll import the LangChain library here as a dependency, and then let's figure out which specific tool we need to import from LangChain. The simplest one is the character text splitter, though the recommended one to use is the recursive character text splitter, so that's the one we're going to
use. So here we'll do import `RecursiveCharacterTextSplitter` from `langchain/text_splitter`, like that. Now we can create a new `RecursiveCharacterTextSplitter`. This is a constructor function that takes an object as the argument, and here you define two things: the size of the chunk and how much overlap you want between the chunks. We'll try, for example, 250 characters for the size of the chunk, which feels like a sentence or two, and we'll allow for some overlap, for example 40 characters. We'll call our splitter simply splitter, like that, and then we can do `splitter.createDocuments` and pass in the text. This is an async function, so we have to await it and store the result in a variable called, for example, output, like that.

Now if we log out the output, let's run the code. There I got an error, and that is because I have a typo in the text splitter name. I'll correct it, like that, and run the code again. Yes, there we go. As you can see in the console, there is a bunch of data there, and if we open the dev tools we'll be able to inspect it in a little more detail, so let's do that. Here, as you can see, it is an array which contains 2 180 objects. Let's open up one of these objects. There we can see that we have the text itself under the `pageContent` property, and under the lines property we get the exact lines this content comes from in the `handbook.txt` file. That is very handy in case you want to create footnotes or references to the original source in your app.

Now, what you want to make sure of when you inspect your data like this is that each of these chunks ideally only deals with one subject or theme. That is how you create good embeddings. If a given chunk is, quote unquote, polluted by different themes, it'll be harder for the embedding model to create a meaningful vector of it. Here you can see that this chunk deals with delegation of authority and responsibility, and the administration and the executive director, so definitely a coherent subject, though it's actually been split in the middle of two sentences, so it could probably be better as well. We have probably not struck the perfect balance here. You could argue that it would have been better to split this into two and then use the entire sentences, or maybe expand it at both ends and include both of the complete sentences.

In general, the shorter your chunks are, the more precise meaning the embedding will get, though you might also miss some wider context. The longer the chunks are, the more context they contain, but that can also produce too broad a scope of information, and this would reduce the quality of the similarity search that the vector database does, as the underlying meaning of the embedding could be ambiguous. It could point in two different directions, so to speak. In general, you want your chunks to be as small as possible, but you don't want to lose context, so it's definitely a balance to strike, and something you'll probably only find through experimentation: creating smaller and bigger chunks and actually seeing how it plays out in action in your app. For now we'll stick with this and see how it works as we continue building this RAG feature. Let's carry on.

In the previous scrim I wrote all of the code for you, but as you know, this is a Scrimba course, meaning that your job is to get your hands on the keyboard and write the code out yourself. So I left
out a couple of pieces for you, which you are now to implement through this challenge. I want you to refactor this function so that it, first of all, takes the path to the data or document as an argument, so that is to `handbook.txt` here. That'll make it a little bit more generalized, as it's really not good practice to have the path for the fetch request hardcoded in here on line seven. Secondly, I want you to return the split data as an array of strings, and just that, because that's how we want our data in the next step of building out this feature. So go ahead and solve this challenge right now.

Okay, hopefully that went well. First we'll specify that it takes a path here as the argument, which we'll use in the fetch request, and then of course we'll need to specify in the function invocation that we indeed want to get the data from `handbook.txt`. That was part one. Part two is returning the data as an array of strings. If you remember from the previous scrim when we inspected this data, it is actually an array of objects right now, but this time around we only want the data that is within the `pageContent` property, because we do not care about the location metadata at this point. So here we'll take the output and map through it, and for each of these chunk objects we'll return `chunk.pageContent`, like that. We can store that in a variable called `textArr`, for text array, and then simply return it. You can of course condense these into fewer lines of code, but I like to be explicit and only do one thing at a time on each line. So with that, we are done and ready to carry on.

Now it is finally time to use the Mistral API to create our very first embedding. As you can see, I have imported the Mistral client and added my API key, so we are ready to get going. The first thing I need is an example text chunk to create an embedding of. I happen to have copied one of them into my clipboard, so I'll paste it in here and call it example chunk. As you can see, it says "Professional ethics and behavior are expected of all Ambri employees. Further, Ambri expects each employee to display good judgment". This is quite a good text for embedding, because it deals with one subject, which is the expectations of character for Ambri employees.

Now I'll comment this one out, as we won't call this function right now. Instead, down here at line 22, we'll call the `client.embeddings` function. That is an async function, so we have to await it. Inside the parameter object we'll first specify what kind of model we want to use, and here Mistral provides an embedding model called `mistral-embed`. The second key is the input. Now, we can't just paste the example chunk like this, as this input isn't expecting a string. It's actually expecting an array of strings, so we have to do it like this. We'll store the response we get back from this in a const called, for example, embeddings response, like that, and then let's finally log it out.

I'll run the code, and yet again I had a typo: Mistral, with an r, is what we want to write. We'll try again, and there we go, we got something very interesting back. Let's paste it into the editor to inspect it a bit more, like that. Here we can see it has an ID, and under the data property we have an array that holds an embedding, and that embedding is a
long array of floating point numbers, all of which are seemingly very close to zero, though slightly more or slightly less. So this vector right here is an embedding of this specific text as transformed by this model. As we use this model to transform other pieces of text, the mathematical distance between the various vectors will be a reflection of how similar or how different the semantic meaning of the sentences in the various chunks is. So, pretty cool. With that, I think you are ready to take the next step in building this RAG feature, so in the next scrim I'll give you a challenge. Let's move on.

Okay, now it's your turn to create your very first embeddings. As you might have noticed already, I have removed the code I wrote in the previous scrim, because, yeah, this is Scrimba. You are going to write the code on your own; that's how you really learn. The only thing I've done is call this split documents function and store the result in a variable I'm calling handbook chunks, because you're going to use that when you create and invoke this create embeddings function. It takes the chunks, that is, these, as a parameter and turns them all into embeddings using the Mistral API.

Once you've done that, you are to return the data in a specific format. What we're doing here is prepping it before we upload it to the vector database. The service we are using for our vector database is called Supabase, which you'll learn more about very soon. The structure Supabase wants us to create is the following: it should be an array of objects, and each of the objects should contain two properties. One is called content, and that is just the raw text string that you find in each of the chunks. Secondly, the embedding property should simply be the embedded version of that string, a.k.a. the vector. Once you have the data in this format, just return it, and later on we'll take care of uploading it to Supabase. So go ahead and give this one your best shot. Good luck!

Okay, hopefully this went
well. I'll start by defining the function, like that. This will be one with asynchronous operations, so we need to define it as, yes, an async function. Inside of it we'll start with the Mistral client and the embeddings method. It takes two arguments: the model, which should be `mistral-embed`, and the input, which should be the chunks that we have passed into the function. Previously I added a string here, so I had to wrap it in square brackets like this, because the input is expecting an array of strings. Here, though, the chunks are already in the shape of an array, as it is this handbook chunks array right here, so we don't need to do that. But we do need to await this one and store the result in a variable, like that.

Let's now console log out embeddings and see if we get anything when we run the code. Let's call the function, pass in the handbook chunks, and see what we get out here on line 24. All right, so in our console you can see we have an object which contains a data array, which in turn contains objects with a property called embedding. So the data we want exists inside of `embeddings.data`. Then we can navigate into a random item in this array, for example the 12th one, and fetch out its embedding, like that. If we run the code again we should see, yes, one vector being logged out to the console. Really good.

Now we need to combine all of these vectors with all of our text chunks in this structure we've defined down here. To do that, I'll map through each of the chunks and return a new object which contains the chunk as the content, and the vector should be under the embedding key. We'll find it by navigating into `embeddings.data`, into one of the items, and then `.embedding`. There are a lot of embedding words here right now, so just bear with me and we'll try to make this work. Actually, I'm a little bit lazy, so I'll just copy this one right in here, and then we need to replace this with whatever index we are at at every step in the iteration. Luckily, map gives
us the index as the second parameter of the callback function, so we can simply replace this with `i`, like this. Let's store this in a variable called data and then finally return data, like that. I'll remove this one. Now we can call create embeddings and expect to get back the data and then log it out, but if we want to do that we also have to await it, because here we have asynchronous code. So console log, like that. Let's run this and see what we get. Yes, there we have a beautiful array with objects that contain two keys: content, which contains the raw text string, and embedding, which contains the vector itself. So we have the data just how we want it.

Now, if you solved this in a slightly different way, that's totally okay. There are certainly ways to condense this code and make it, quote unquote, DRYer. I'm not going to worry about that right now, but feel free to write this however you want. The important thing is that you got the intended result, not that my code and your code are mirror images of each other. So with that, we are ready to take the next very exciting step in our RAG journey, and that is to start learning about vector databases. For that, I'll hand things over to my colleague Guil, who will teach you about vector databases over the next couple of scrims.

In this course we're going to use Supabase to manage our vector database. Supabase is a full-fledged open-source backend platform that offers a PostgreSQL, or Postgres, database, which is a free and open-source database system recognized for its stability and advanced capabilities. While Postgres is not a dedicated vector database, Supabase does support a powerful Postgres extension called pgvector for storing embeddings and performing vector similarity searches. If you've worked with Supabase or Postgres, this should be pretty straightforward. If not, don't worry, you don't have to be a database expert to start using Supabase. It's quick and easy to set up, and the platform has a simple-to-use dashboard that makes Postgres
as easy to use as a spreadsheet.

So the first thing you want to do is head over to supabase.com. Once there, click to sign in, which you can do using your GitHub credentials. Next, on your dashboard's project page, click New Project. You'll first create a new organization with Supabase. You can use a company name or your own name. Choose the type of organization, in my case personal, set the pricing plan to free, then click Create Organization. After that, Supabase will ask you to create a new project, which comes with its own dedicated instance and full Postgres database. It will also set up an API so you can easily interact with your new database. So give your new project a name, like vector embeddings, create a password for your Postgres database, then choose the region that's geographically closest to you or your user base for best performance. Then click Create New Project, and after a short moment your new project should be all set up.

From here, you'll need to enable the pgvector extension in your new project. Click the database icon in the sidebar to go to the database page, then on the page's sidebar click Extensions. In the search field, search for "vector" and enable the extension. That should set you up to use Supabase to store, index, and query vector embeddings.

All right, next you'll need to integrate Supabase with your application, or in our case the scrims for this course. To do that, click on the project settings icon and navigate to the API section in the sidebar. Here you'll find your project URL and API keys. These are essential for integrating Supabase with your app. So first, copy your project URL, then save it as an environment variable on Scrimba. Remember, you can access your environment variables with the keyboard shortcut Command or Control + Shift + E, and be sure to name this variable `SUPABASE_URL`, exactly as shown here. Finally, copy your project API key, then save it as a Scrimba environment variable named `SUPABASE_API_KEY`, just like this.

Vector databases, or vector stores,
possess unique superpowers for managing vector embeddings, with the capacity to store and retrieve embeddings quickly and at scale. All right, so how do vector databases actually work? Well, embeddings essentially allow us to match content to a question. Unlike traditional databases that search for exact value matches in rows, vector databases are powered by complex algorithms that store, search, and quickly identify vectors. So instead of looking for exact matches, they use a similarity metric that uses all the information vectors provide about the meaning of the words and phrases to find the vectors most similar to a given query.

Storing custom information as embeddings in a vector database gives you the benefit of enabling users to interact with and receive responses exclusively from your own content. You have complete control over your data, ensuring it remains relevant and up to date. This can also help reduce the number of calls and token usage, and even allow the summarization and storage of chat histories, which helps AIs maintain a type of long-term memory, an important tool against the problem of hallucinations with AI models. So with all that said, vector databases are becoming a central part of how we build AI-powered software, and play a massive role in the advancement of large language models. These days you have various vector database options, from tools like Chroma to Pinecone, Supabase, and several others. All right, so next up I'll guide you through setting up your own vector database. See you soon.

Now we need to configure Supabase in our project so that we can start interacting with the database. As you can see, I've installed Supabase as a dependency and imported `createClient` from the Supabase JavaScript SDK. On line six we invoke this function, passing in the Supabase URL as the first parameter and the API key as the second, and then we have our Supabase client. However, now we have two clients here, the Mistral one and the Supabase one, so I want to make it a bit
more apparent that this one here is dealing with Mistral, so I'll rename it, like that, and then change the name here as well.

Now I want you to head over to your dashboard in Supabase and click into the vector embeddings project. From there, choose the SQL editor from the menu on the left-hand side, as this allows you to create tables in the database using a SQL query. Having tables is absolutely necessary in a SQL database, as that is how you store the data. I happen to have the query right here for you. As you can see, it's pretty straightforward: create table, we're calling it `handbook_docs`, and then we define the three columns we want our table to have. There's an ID, which has the data type `bigserial`; that is the primary key, so the identification field in this table. We'll have the content, which we'll specify as plain text. And finally there's the embedding, which is a vector of 1,024 dimensions. If you think this resembles our data structure down here, you are completely right. That is exactly why we formatted our data this way.

So go ahead and take the SQL and paste it into the editor. Hit Run, and then you should see under the results here "Success. No rows returned". That means that your table has been created. To view it, simply click on the table editor in the menu on the left-hand side. There you can see the very beginnings of a table that has an ID column, a content column, and an embedding column.

Now, to get our data all the way from the handbook, via the embed endpoint, into the structure we want, and then upload it to Supabase, we only have one line of code to write, and that is simply `supabase.from`. Here we'll specify our table, `handbook_docs`, then `.insert`, because we want to insert something. And what do we want to insert? Well, that is the data. This is also an async operation, so we have to await it, and when this line has executed and JavaScript moves on, we'll log out "Upload complete". Let's now run this. There we go, the upload should be complete. Let's head over to Supabase, and boom, there we go: we have our content and the corresponding embeddings in the vector database, meaning that we are ready to take the final step in this RAG feature, which is to perform the retrieval so that we can generate replies to the users for any question they might have about our employee handbook. So great job reaching this far. Let's carry on.

With all of our text chunks and embeddings safely stored at Supabase, we are finally ready to write the code for our RAG feature. As you can see here, I've changed index.js around a little bit, as I moved the old uploading code over to data.js. We won't be using that now, since we're actually going to do the retrieval and generation steps. So let's start by going through this code so that we're both on the same page.

The flow of this app contains four steps. The first one is getting the user input. Here I've just hardcoded it as a variable, where the user is asking whether or not they get an extra day off since December 25th falls on a Sunday. Now of course, in a real app the user would probably ask this in some kind of form, and you'd do some DOM manipulation to fetch it, though that's outside of the scope of this course, so we'll just keep it simple and use this input variable.

Next, we need to take this input and turn it into an embedding, as we need to see if the embedding of this string matches some of the embeddings we've created of the various chunks in our handbook. Creating this embedding should be a piece of cake for you by now, so I didn't bother going through that code with you, as you've done that before. Once we have this embedding stored in this variable, we'll pass it into another function that we've called retrieve matches, and this is where we are going to do the similarity search. I've not written the body of this function yet. Let's just continue with the flow and then get back to that, because once we've gotten the matches, a.k.a. the context, we'll pass both the context and the input
into a function called generate chat response, where we'll use these two in combination to get Mistral to formulate a reply to the user. So those are essentially the four steps of our RAG feature.

Now let's look at this retrieve matches function. Here we need to tell Supabase to do a similarity search, and if you read some of these descriptions you might be a little bit scared, because they're called things like Euclidean distance, negative inner product, or cosine distance. That certainly sounds complicated, though luckily we don't have to worry about any of that, as Supabase provides us with a SQL function that we can simply copy and paste, so that we don't have to dive into the underlying complexity. I've pasted this function into the function.sql file right here, changing a few things around, like the name of the function, which I want to be `match_handbook_docs`, as this matches the name of our table. I've also changed the vector to account for the number of dimensions Mistral gives our embeddings, plus updated this query down here to account for our `handbook_docs` name.

So what I want you to do now is copy this entire function, head over to Supabase, and click into the SQL editor. There, click on the New Query button and then paste in the function. Click on Run, and if you see "Success. No rows returned", it means that this function is now available in your database.

But now the question is, how do we access this function in our JavaScript? That is where Supabase is really user-friendly, because they have an `rpc` method, a so-called remote procedure call, which you can invoke anywhere in your code, just like they do in this snippet right here. So what we'll do is simply copy this and paste it into our code. Now, our function was not called match documents, it was called match handbook docs, and to begin with I don't want 10 matches, which is what you define here. I want just one. The match threshold sets a threshold for how similar embeddings should be in order to be included as a match. The
higher you put this, the more picky you are, so the fewer matches you'll see, but also the more similar they will actually be. And here the query embedding is the embedding that we passed into this function, in other words the embedding of this string right here. Finally, we can return data, and I happen to know that inside of the first item in the data array there is a property called content, and that is what we want.

So now let's comment out this line, console log out the context, and try to run this code. There we go, we get back a very relevant piece of context, which says: "Christmas Day. Full-time employees (employees who work at least 35 hours per week) receive one paid day off for each full day of holiday time. Holiday benefits for part-time employees", and then it stops. So we were able to retrieve very relevant information, though it's not formulated as a good reply to the user, and there's some information lacking here as well. Ideally we would have seen the sentence that was cut off midway, as that would have given us information about the part-time employee vacation policy for these kinds of situations.

So that leaves us with a couple of tasks to be done down here in the retrieve matches and the generate chat response functions. And who do you think is going to fix that up? Yes, you guessed it, that's you. So in the next scrim you are going to complete the retrieval and the generation process of this feature. Let's move on. I'll see you there.

Welcome to this two-part challenge, where you are going to complete the retrieve matches function and write the entire body of the generate chat response function. In the first one, we are to fix the fact that we didn't get enough data back by getting just one match. Instead, we are going to return five matches, and that involves updating this object and changing how you return the data. For the second challenge, you are to take whatever context you get back from the retrieval step and combine it with the user's
query or input and turn that into a prompt. This prompt should be sent to Mistral's API, and you can decide for yourself what model and what settings you'd like to use. Here you're going to do a little bit of prompt engineering, as you'll need to combine the context and the query into a single prompt. I don't want to dictate this for you. Instead, just think of how you would take two pieces of data, a context and a question, and turn them into one prompt that instructs the AI to answer the question based upon the provided context. I can disclose that it doesn't have to be complex. The AI is pretty capable of figuring out what you're trying to do, so just give it your best shot and see how it works. Finally, once you've done this, you probably want to log out the response here to inspect what kind of reply the AI generated for you. Okay, with that, best of luck. You got this.

Okay, hopefully this went well. Let's do it. I'll start with this one, and the first thing we need to do is of course update this number to be five instead of one. Then I'll check out the data. I'll remove this and just return the data, so that we get to inspect the underlying structure, as we are logging out the context here on line 15. Let's run the code and open up the console. There we go: this is an array of objects, where we are looking for the string inside of the content keys. So if we want to combine all of the content keys into a single string, I'll do `data.map`, and for each chunk I'll return `chunk.content`. If we return this, let's see what we get. Running the code, we get an array of strings. To combine that into a single string, we'll just do `.join` and specify that we want a space in between each of the strings. Logging this out, yes, that looks good.

Moving on to this one. Here we'll start by using the Mistral client and call the chat method, passing in the model, and let's try with the most capable one first, `mistral-large-latest`. The messages array only needs one object, or I'll at least try with that, and if it doesn't work I'll perhaps try to add a system message. But let's go straight to the user as a first solution. The content here is where we'll need to do a little bit of prompt engineering. I'll try the easy way first and simply write "Handbook context:", like that, and pass in the context, and then we'll write "Question:" and pass in the query. Now, we could of course have written "This is an extract from the handbook that contains relevant background info for the user question", though as you've probably understood, I like to start off simple and only make it more complex if needed. So this is actually all the prompt engineering I was looking for.

As the next step, we want to return whatever result we get from this, and to do that we of course have to await this function and store the response in a variable. I happen to know that the generated reply to the user lives inside `response.choices`, in the first item in that array, under `.message.content`.

So with that, it is the moment we've all been waiting for. Let's log out the response, like that, run the code, and see how this works. And yes: "Based on the handbook context provided, if Christmas Day falls on a Sunday, you as a full-time employee would still receive one paid day off for the holiday." Brilliant, that is exactly what we were looking for. It was able to figure out that December 25th was semantically related to Christmas Day, so that it could retrieve the relevant information, and then use that to generate a nice, human-readable reply. So phenomenally well done. You now have RAG as a tool in your tool belt as an AI engineer, and this will definitely come in handy throughout your career if you continue down the path of AI. So give yourself a pat on the back, and perhaps take a break at this point, as you've learned a lot and perhaps need to digest it. If not, in the next part of this course you are going to learn about an insanely exciting concept, which is function calling. It enables you to create AI agents that interact with the world on the user's behalf, so truly something that opens the door to a whole world of revolutionary user experiences.

Hey, and welcome to the section about function calling. This is a very exciting field that opens the door for you to AI agents, and by that I mean smart assistants that can interact with the world on behalf of your users just by interpreting what they say. So this is a new paradigm in terms of the user experience we developers can provide.

Now let's start off at a very high level, looking at what the architecture of such an agent typically looks like. Let's say that you are running an e-commerce website where you sell products to people, and you have a chat where users come to ask questions, for example things like "Is my package on its way?" What we'll do then is send this to our LLM along with some specific instructions as to what
kind of tools it has available to figure out the answer for the user. Then, if it is a good model, it'll look at this query and realize for itself that, hm, I actually need to call the fetch order function to give the user a good reply, and it'll instruct our software to actually perform this function call. This is often done via regular code, like if and else conditionals. So through the code you have written, you'll ensure that when the AI wants to call this function, you will actually perform the function call and get the order data, which you then return to the LLM. It will then read that data, which probably comes in the form of an object, and turn it into a human-readable response like "Yes, your order is expected to arrive" and blah blah blah, which then results in a happy user, all without you having to use manpower to do this. So you can imagine the power of this technology: it can drastically improve the customer service users get when talking with these chatbots we've all come across over the last few years. And of course that is just the start. Imagine how powerful it is when this is rolled out to all industries.

Okay, let's now have a look at the code. This here is by and large a very standard function call to the chat endpoint at Mistral, so all of this should be familiar to you, except for line 13 here, where we've added an array of tools. That comes from the import statement here on line two, and if we head over to tools.js, there is a function that searches through the data there. As you can see down in the tools array, we have described this function through a specific schema. This entire object's sole purpose is to describe for the AI what this function does. So it says that, yes, first of all it is a function, and its name is this, and here is a description of it as well. Plus it takes one parameter of type object, this one right here, and this object has a property called transaction ID, which is of type string, and it is also a required property. So I think
you can guess what I'm getting to here: this is all our AI sees. It never sees the content of this function. It just looks at the description and tries to decide whether or not it should be invoked, and with what kind of arguments, based upon the input from the user. So if we yet again head back to index.js and run this code with the prompt "Is the transaction T1001 paid?", you can see in the console that we get back an object. I happen to have copy-pasted one of these objects into output.js, just to make it a little bit easier to read. Here, as you probably remember from earlier, the `choices` message `content` property is actually empty. Now, this is where Mistral always adds the reply from the LLM, though now the LLM has nothing to say. Instead it has given us some instructions about what we should do. Here under `tool_calls` there's a `function` key, which has the name `getPaymentStatus` and the argument of a `transactionId` of T1001. So this is how it tells us: hey developer, now you've got to invoke a function. And the LLM isn't done reasoning about this issue. If it was, it would have said `stop` here, but instead it wants us to call a tool and then send back the result. So as you can understand, we developers have some work to do here, and we're also going to make this agent even more capable by adding another function. So I'll leave it at this so that the two of us can get back to work and start coding, and we'll do that in the next scrim.

So we've been able to get the Mistral model to tell us to call a function when someone asks about the payment status of an order. However, if we take a look at the data, you can see that there are also other things going on here: there's both an amount and a date for each of the orders. So there's definitely potential to make our agent more powerful. I'm now going to paste a function in here called `getPaymentDate`. It also takes in the transaction ID and does more or less the same thing as `getPaymentStatus`, though it instead returns the date for
when it was paid. Now, you could of course also have done this by simply making this one a bit more robust, so that it fetches various pieces of the data based upon the argument passed in. But the point here is not to build a production-ready system, but rather to give you some practice in building agents. So what I want you to do now is expand upon this tools array by adding a new object that describes the second function, because as you might remember, the AI doesn't read this code. It only knows about the function through how you describe it in the tools schema. So that is your first challenge. Once you've done that, I want you to verify that your solution works by changing the prompt that we give the agent in a way that gets the LLM to instruct us to call the newly added function. So go ahead and solve these two challenges, and I'll show you the solution when you return to me.

Okay, let's do this. I'll head back to the tools file again, and then I will simply copy this object, as I'm a little bit lazy and I've understood that the second object will look very similar to the first one. So I'll do that and simply change the name from `getPaymentStatus` to `getPaymentDate`, and the description to "Get the payment date of a transaction". The parameter can stay just the same, as it is identical to the previous function. And actually that was about it. Heading back to index.js, I'll change the prompt to "When was the transaction T1001 paid?" Let's run this, open up the console, and yes indeed, you can see that it is now instructing us to call `getPaymentDate` and not `getPaymentStatus`. So mission accomplished. Hopefully that forced you to get to know the schema a little bit more, as we're going to move to the next step now, where we'll start acting upon the instructions we get from the Mistral model.

Right now we are in this step of our flow: the LLM has just told us that we need to call a function, and now we're about to write the code we need in order to do that. So the first thing
I want to do is keep our messages array up to date, as it should include every single piece of dialogue going back and forth between the user, the app, and Mistral. So what I'll do is `messages.push`, and I'll push some part of this response, though not the entire thing. Let's have a look at how it was structured. If we head into `choices` and the first object in this array, we can see that there is a `message` object, and that is what we want. So we'll do `response.choices`, the first one, and then `message`, like that. As you can see, it has a role just like our user message has a role, though this one is from the assistant. The content is empty, since it didn't have anything to say to us. Instead it had some instructions about what tools we should call. So the next thing we want to do is write the if statement that checks whether we are indeed about to call a tool, and then write the code for the specific tool call. Now, I don't want to use the fact that there's a `tool_calls` here in our conditional. Instead, the right way to do this is to look at the `finish_reason` and the fact that it has the string value of `tool_calls`. So we'll do if `response.choices`, the first item, and then `finish_reason`: if this is equal to `tool_calls`, well, then our next step is to fetch out the name of the function we are to call and its arguments.

At this point I think I've written more than enough code for you. It is your turn to take over, so here is a challenge for you. I want you to get a hold of the name of the function that we should call and its arguments. We want the function name as a string but the arguments as an object. So I've set up the two variables for you, `functionName` and `functionArgs`. Both are just initialized as empty strings, but whatever expression you replace the second one with should be of the data type object. So now you'll have to dig through this one yourself and fetch out the relevant information, and once you're done, just return to me and I will show you the solution. The final
thing I'm going to do here is make sure that I close this if statement properly, like that, and indent this, and have the correct indentation for this one as well. And with that you are good to go. Best of luck!

Okay, hopefully this went well. Let's do this together. First I'll start with the function name, and here I'll need to navigate all the way down to `tool_calls`, `function`, and then `name`. I happen to see that both the name and the arguments are in the same object here, this `function` object. So to avoid too much repetition, I'm going to do `const functionObject`, like that, and then I'll paste this in, adding `.tool_calls`, which is an array, and we want the first item, and do `.function`, like that. Now I can do `functionObject.name` and `functionObject.arguments`. If you got to this point, good job, though we're not quite done yet, as `functionObject.arguments` is a string and we want it as an object. The way to do that is `JSON.parse`, which turns a JSON string into an object. Let's console log the function name and the arguments and then run the code to verify that it works, and I'll comment out the response down here. Running the code, and yes, the first one is `getPaymentDate` as a string and the second one is indeed an object. So very well done solving this challenge. Let's carry on.

It is time for us to do this step, which is to call the function so that we get the data we eventually can send back to Mistral. As you know, we have the function name and the function args, but this file doesn't yet have access to the functions, as they live in the tools file. So I will import these functions, `getPaymentDate` and `getPaymentStatus`, like that. Though now the question is: how do we go from just having the name as a string value, for example "getPaymentDate", to actually calling the function? Well, to help you with that, I'm going to wrap these functions in an object called `availableFunctions`. I'll add `getPaymentDate` and `getPaymentStatus`. What this gives us the
opportunity to do is use bracket notation to get a hold of a reference to any of these functions, because if we pass in the string "getPaymentDate", this will be a reference to that function, and if we add parentheses, we'll invoke the function. So that is pretty cool, and it is exactly what I want you to do right now in your challenge. Your job is simply to perform the function call. So go ahead and write the code to do that, and I'll see you when you return to me.

Okay, hopefully this went well. The way to do it is to grab a hold of the `availableFunctions` object, use bracket notation to pass in our function name, call it with parentheses, and finally pass in `functionArgs`, like that. We'll store this in a variable called `functionResponse` and then finally log it out. Let's run the code, and there we go: we have turned a prompt from the user into a real function call that returns the data our assistant asked for. Really good job reaching this far. We are making a ton of progress, so let's just keep up the pace and carry on.

So we've called the function, obtained the data we need, and now we need to send it back to our assistant. How do we do that? Well, as I've said earlier, it's important that we keep track of all the dialogue in this app, whether it's from the user, from the assistant, or in this case from the tool itself. And where do we keep track of this dialogue? Take a guess. Yes, it is in this messages array, which we are passing along to Mistral every time we interact with the API. So what we're going to do here first is `messages.push` and pass in an object. As you've seen before, we always have a role, though this time around it's not the role of user, and also not the role of assistant, which is what we had when we got the instructions to call the function. This time around the role belongs to the tool. The next piece of information we need to pass along is the name of the tool, which we have here in `functionName`. And finally the content. After we've told Mistral that we worked with a tool and given it the name of the tool, what do you think the content here should be? Take a guess. Yes, hopefully you understood that it is the response we got back from the function, because when Mistral gets all of this data, it should be able to decide what the next step should be.

Speaking of which, how do we then send this off to Mistral? Well, we could start a new `client.chat` down here and add all of the metadata again, though that's a very hard-coded and hacky solution. Instead we want to rerun this piece of code, then yet again check if we're instructed to call yet another function, and keep on going until the assistant tells us: yes, we are now done with the back and forth, and I have a good response for the user. Hearing that, what kind of programming paradigm does that sound like a job for? Yes, you guessed it: the loop. So we are going to wrap this entire thing in a loop and keep it running until we have a satisfying result. We'll do that in the next scrim, so I'll see you there.

Okay, we are ready to perform the final steps of our flow. We'll take the result from the function and send it to our assistant, who will then construct a reply and send it back so that our user is happy again. As I talked about in the previous scrim, we'll do this with the help of a loop. So this is a challenge where you are to start by creating a for loop that runs a maximum of five times, and inside of the for loop, if the `finish_reason` is ever set to the string `stop`, then I want you to simply return the response from the
assistant. So then you are to return from the entire function, and that'll also break out of the loop. As you can see, the `finish_reason` lives down here in the object you get from the assistant. Now you might ask at this point: why are we simply hardcoding a for loop that runs five times? Wouldn't it be better with a while loop that could run as many times as needed until the task is complete, or for example a recursive solution that would do the same thing? And yes, those could be better solutions, but they also open up the possibility of infinite loops, so in my opinion it would require some guardrails to implement such a solution. That is why we're simply going for a naive for loop that runs five times, which should be more than enough for our use case. Though of course, if you want to build on this after you've solved the challenge, you are more than free to do that, and actually I would encourage you to. Anyway, give this challenge your best shot, and I'll see you when you're done.

Okay, let's do this. We'll do `for`, `let i` equals zero, `i` should be less than five, and it should increment. Moving this all the way down here and indenting everything inside of the loop, like that. And checking here `response.choices`, the first item, and yes, it lives within that object: if the `finish_reason` is `stop`, then we'll simply return the content within the message. And finally, let's bring this up here and do `else if`, like that. Okay, the moment of truth. Let's see if we've been able to successfully implement this entire flow. I'll comment out this console log. We're asking when this transaction was paid. Let's run the code, and yes: the transaction T1001 was paid on October 5th, 2021. Wow, congrats! You've just built your very first AI agent. And while this of course is a dummy example, you now understand the basic building blocks, which gives you a foundation for building real-world products. So give yourself a pat on the back, and I'll see you in the next scrim.

Hi there! Now you are
going to learn how to run Mistral models locally on your computer. The tool we are going to use for that is called Ollama. It is an app that wraps large language models on your computer and lets you interact with them in various ways. So click on this image right here and you'll get to the Ollama page. There you can search for models, for example Mistral, click into it, and see the size of it (this one is 4.1 GB), and also read about how it performs compared to other open-source models. For now, let's head back to the homepage and click on the download button. Then choose whatever platform you use and click yet again on the download button so that you can complete the download and install Ollama on your computer. Once you've done that, open up the terminal on your computer and type `ollama run mistral`. That'll start the download process, and as it'll take some time, I'll fast forward. Once it's done, we get this little UI where we can type a message. So let's ask the model something, for example trying to use it as a motivational coach, which I often use large language models for. I'll type "Feeling a bit demotivated today, can you help me get started with my day?" and when I hit enter, Mistral starts typing out tip after tip, not by being run on a third-party server, but by being run on my computer. Let's just take a minute to acknowledge how cool this is, because aside from the cost of the hardware and the electricity, these tokens are completely free, and there's also 100% privacy, as the data stays on the device.

Now, what's perhaps even cooler is that you can use this as a model for any AI projects you build locally as well. So let's try to do that in the current scrim. Click on the cog wheel in the bottom right corner and then click "Download as zip". Then in your downloads folder you'll see this ZIP file, so just go ahead and double-click on it to unzip it. That'll give you a folder with a weird-looking name, which is the underlying ID for the scrim. So take this folder, rename it, and place it wherever
you keep your dev projects. For me that is in the dev directory, and I've named this project "Ollama hello world". So I'll navigate into it, and there you should do `npm install` and then `npm start`. That'll spin up the project on port 3000, meaning that you can head over to localhost:3000, and there you will see the browser telling you to ask a question via the `question` parameter. What's going on here is that this little Express router checks if there is a question in the URL parameter called `question`. If there isn't a question there, for example if you just visited the root page without any URL params, it'll just render out this string. Though if there is a question there, it'll execute the following lines of code. Here we're using the Ollama SDK, and the patterns being used here are probably quite familiar to you by now, because yes, this resembles the Mistral SDK quite a lot. We call the `chat` method and pass in an object where we specify the model, in this case `mistral`, and a messages array that has a role and some content. The content here is whatever Express finds in the URL parameter called `question`. So if we now type a question in the browser, for example "Why do stars shine?", and hit enter, we'll see that the browser works for a little while, and boom, eventually it gives you a reply, which says that stars shine due to nuclear fusion, where hydrogen atoms are turned into helium, releasing immense amounts of energy. And actually this continues until all of these atoms have turned into iron, which by the way means that the iron you use to fry your eggs was created billions of years ago in the center of a star. Wow, that is mind-blowing to think about, almost as mind-blowing as the fact that you have reached the end of this Mistral course. Most people who start a course here on Scrimba give up before they reach the end, but you, my friend, did not. You are not a quitter. So give yourself a pat on the back, and in the next scrim we'll do a quick recap of
everything you've learned.

Wow, you really did something special today: you completed this course. Please remember that most people who start online courses give up. You are not like them. So let's have a quick recap of what you've learned. Starting out, we looked at the basics of Mistral and their platform, along with the chat completion API that we interacted with through their JavaScript SDK. You have a solid grasp of the various Mistral models by now, and also know how to work with embeddings and vector databases with Supabase. You've also dipped your toes into LangChain and used it for chunking text, because you had to do that when you were learning about RAG, or retrieval-augmented generation. Finally, you know how to create AI agents with function calling and how to do local inference with the help of Ollama. Now it's time for you to celebrate your win. Do share that you've completed this course on social media, or if you want a less public way of doing that, you can check out Scrimba's Discord community, as there we have a "Today I did" channel where we love seeing people complete courses. And whatever you do, please keep on building. You have a unique set of skills here, and this is just the start. The world of AI is exploding, giving developers like yourself the possibility to create entirely new experiences and apps. The world is your oyster, so happy building and best of luck!

Learn how to use Mistral AI to build intelligent apps, all the way from simple chat completions to advanced use cases like RAG and function calling. Per Borgen from Scrimba created this course in collaboration with Mistral AI. You'll get hands-on experience with Mistral's open-source models, including Mistral 7B and Mixtral 8x7B, and their commercial models. By the end of this course, you'll master essential AI engineering paradigms, enabling you to create sophisticated conversational user experiences and run AI models locally on your own computer.

Hi there, and welcome to this introduction to Mistral AI. My goal with this
course is to teach you how to build magical stuff, and more specifically how to do that using JavaScript and Mistral AI. If you don't know what Mistral is, it is a company that builds so-called foundational models, and that in 2023 twice managed to stun the AI community by launching small open-source foundational models that were on par with the best closed-source models out there. So as an AI engineer, Mistral is definitely something that deserves your attention. In this course we are going to start off by looking a little bit closer at Mistral in general and their platform, before we dive into the API basics and how to use their JavaScript SDK, as this course is based around JavaScript. Their Python SDK is similar, though, so even if you prefer Python over JavaScript, you'll still get a ton of value from this course. We are also going to go through all of the models that Mistral offers at the time of recording this course, including their embedding model, which lets you work with vector databases. You'll also get an introduction to those, in order to give your AI apps domain knowledge, which for example could be proprietary company data, real-time information that the model hasn't been trained on, or extra in-depth knowledge about a specific subject that is too narrow for the AI to have been trained on. We'll do this through a technique called retrieval-augmented generation, a.k.a. RAG. You'll also learn how to build AI agents with function calling, enabling your apps to take action based upon the user prompt, a truly revolutionary paradigm. And finally, you'll learn how to run your models locally on your computer and interact with them both via the terminal and a web page.

Now, who am I? I've been a developer, instructor, and startup founder for almost 10 years now, and I'm also the CEO of the learning platform you're on now, which is Scrimba. I create tutorials on JavaScript, React, and AI engineering, and in total they have been watched by literally millions of people through the Scrimba
platform, Coursera, and YouTube. I love to connect with my students, so please click on either of these links if you're interested in connecting on X or LinkedIn. You'll also see lessons from two other teachers throughout this course, namely Guil Hernandez, one of our brilliant instructors here at Scrimba, and we're also proud to have Sophia Yang, the head of developer relations at Mistral, contributing to this course. So as you probably understand now, this course is a collaboration between Mistral and Scrimba. We're not pulling this curriculum out of thin air; it has been created in partnership with the company itself. If you ever find yourself lacking some JavaScript skills or AI engineering concepts, please check out our Frontend Developer Career Path or this AI engineering course, as those will help you get up to speed. So with that, let's get started.

Hello, this is Sophia Yang from Mistral AI. I'd like to welcome you to the course and give you a brief introduction to Mistral. Mistral AI was founded last year by our three co-founders, Arthur, Timothée, and Guillaume. We first released our open-weight model Mistral 7B in September last year. We released Mixtral 8x7B, a mixture-of-experts model, and our platform in December. We currently have offices in Paris, London, and the San Francisco Bay Area. We offer six models for all use cases and business needs, including two open-source models, Mistral 7B and Mixtral 8x7B. They're under the open-source Apache 2.0 license, and they're great to start experimenting with. We also offer four optimized enterprise-grade models, including Mistral Small for low-latency use cases, Mistral Medium for language-based tasks, and Mistral Large for your most sophisticated needs. We also offer an embedding model, which offers state-of-the-art embeddings for text. To get started, you can use our chat assistant, Le Chat, to interact with our models right away. Just go to chat.mistral.ai and you can play with Le Chat. There are several ways to use our models. We offer API endpoints for all of our models through the platform. You can subscribe and get an API key on the platform; this is the easiest way to use and deploy them. You can also use our models on cloud services, which provide the fastest deployment for enterprises, especially for those who already use cloud services. You can also self-deploy our models on your own on-prem infrastructure. This will give you more control and flexibility, but it's the most complex of the three, so it's a tradeoff between ease of deployment and level of control. You can choose whichever you want for your own use cases and business needs. This course will focus on the platform and how to use the Mistral API for various tasks. Hope you enjoy the course!

Okay, in order to interact with the Mistral API, you need an API key, which we'll get through their platform, or La Plateforme as they call it. So click on this image right here and you'll be taken to the Mistral homepage, and there you can click on the "Build now" option. That'll take you to the authentication screen, so choose whichever authentication method you want. In the next step you're asked to create a workspace name and check off whether you're a solo creator or doing this as a team member in a company. Whatever you choose, click "Create workspace", and there we go, this is the platform. In order to get access to the API, you have to provide a card, or "subscribe" as they say here. However, you only pay for what you use, so this is not an ongoing fixed subscription. Just add your card, and once you've done that, this box will go away and you can click on "API keys" to create keys you can authenticate with. Click on "Create new key", give it a name and an expiration date, and then create the key. Now, you'll only see this key once, so be sure to save it as a Scrimba environment variable. You can learn how to do that by clicking on this link right here. And please don't take the time to try and copy this API key
right here. By the time you watch this scrim, this key is no longer active, as I've deleted it. So go ahead and follow these steps and set the env variables in Scrimba, and then in the next scrim my colleague Guil will teach you the basics of how to interact with the Mistral API through JavaScript.

Hey, in this tutorial we'll go over using the chat completion API, which allows you to chat with a model that's fine-tuned to follow instructions. So let's dive right in. We're going to use Mistral's JavaScript client, which I've installed and set up in this interactive scrim. I'm importing Mistral AI at the top of the JavaScript file, and I've instantiated a Mistral client using my API key, which I've stored as an environment variable on Scrimba. So we're ready to go. The chat completion endpoint is designed to handle back-and-forth conversations. You feed it a prompt and a series of messages, and it generates a completion, or an appropriate continuation of that conversation. So now let's make our first chat request using Mistral's `chat` method. I'll declare a constant named `chatResponse` to store the response returned from the chat request, which we'll await with `await client.chat`, and pass the method an object containing the request body. The chat completion API accepts various parameters. The two required parameters are `model` and `messages`. Mistral has various pre-trained models you can use with the API. For our purposes we'll use a model called `mistral-tiny`. Then I'll set the `messages` parameter to an array, and this is a key part of the chat request, as it holds the prompts to generate a completion for. This should be an array of message objects, each with `role` and `content` properties. `role` defines the role of the message. I'll set it to `user`, indicating that the message is from the user's perspective. Then I'll set `content` to the actual content of the user message. This is usually a question, like "What is the best French cheese?" All right, and this is all we need to generate a chat completion. So let's log the response to the console, and the way to access the message content directly is like this. I'll run this code by clicking the Run button, and good, the API returns a humanlike response about the different types of French cheese.

All right, so what I want you to do now is personalize the AI response by updating the `content` property to something that interests you. You might not have realized this yet, but this isn't your typical video player. You are experiencing a fully interactive scrim that you can pause at any moment to jump right into the code and make changes to it. So go ahead and ask the AI a question, then click Run.

Okay, hopefully that was fun and you got some great responses. Now let's experiment with other parameters to make our response more interesting. We'll use the `temperature` parameter to set the creativity or randomness of the generated text. This should be a value between 0 and 1. The default temperature is 0.7, but as you get closer to one, the output will be more random and creative, while lower values make the response more focused and deterministic. I'll set it right down the middle at 0.5 to strike a balance between creative and predictable
responses. And now I'll feed it a different question, like "I want a puppy. What is the most kid-friendly dog?" I'll run this code, and I get back a detailed conversational response about various dog breeds. Good. All right, I want you to go ahead and pause me now and try experimenting with different temperature values.

You can also provide custom system prompts to guide the behavior of the model. This time I'll set `role` to `system`, then set `content` to the instructions or prompt for the model. This is your chance to influence how the AI responds. So I'm instructing it that it's a friendly cheese connoisseur, and that when asked about cheese, it should reply concisely and humorously. Now, running this won't work, because we need to follow the system role with a user role and content. I'll set the `role` property in this second message object to `user`, then set this `content` property to ask "What is the best French cheese?" I'll run this code, and I get back a fun and witty response about French cheese. Fortunately, it's always cheese season, right? All right, so that's it for the basics of working with the chat completion API.

Now that you've gotten to know the basics of how to set up a request to Mistral, let's have a look at some of the options and configurations you as a developer can adjust so that you tweak the response you get from Mistral to your needs. Perhaps the most apparent one is adding support for streaming, because that is often a key feature of AI apps. For example, here on Hugging Face, the platform for open-source AI models and datasets, on the Mistral organization there's a hosted version of one of their models along with a chat interface so that you can talk with it. So here I'll ask it the question "What's your favorite taco ingredient?", and when I send that, I immediately see the response getting built up token by token until it's done. This is a really pleasant user experience, so let's see how we can tweak this from giving us the entire response at once to giving us one token at a time. So the
first thing we need to do is change this from `chat` to `chatStream`, like that. What then happens is that this `chatResponse` changes from being a regular object to being a so-called async iterable, meaning that we have to await every item in this iterable as it becomes available to us. So `chatResponse` will kind of gradually be built out as we get token by token from the Mistral API. The way to deal with this is to create an asynchronous for...of loop, so we'll do `for await`, and then `const chunk of chatResponse`. Every time the body of this for loop is executed, we get access to a new chunk. As with the chat response, this is an object with many properties, so we'll have to navigate into it in almost the same way as we navigated into `chatResponse.choices`, though instead of `message` it's called `delta`. So if we now console log this and comment this one out, let's see what happens. And yes, we are getting a ton of stuff logged to the console super fast. So this kind of buildup of the response happens almost instantly, and probably a lot faster than we can read it, though it's a much better user experience than having to wait until the entire thing is generated and then getting the response in one go.

Okay, let's have a look at another cool configuration you can make to the request, and that is to tell Mistral that you want the reply in the format of JSON, that is, JavaScript Object Notation. Here is an example of a JSON string, and if you don't know what it is, it is essentially a very common format that developers use when sending and processing information. So being able to get this kind of format from the AI is super helpful as you integrate it with your app. Doing this only requires two small settings. The first one is that you need to set the response format as an object of type `json_object`, like that. Then you also need to specify it in the prompt, so here I'll write "Reply with JSON", like that. Here the data will be processed by code, and not by a human, first and foremost,
So let's skip the streaming here, because it is mostly for UX directed at humans, and go back to `chat` here, and finally uncomment this one, like that. Let's run the code, and yes, there we get a JSON object. I'll copy it from the console and paste it in here. There we can see it is an object with an `answer` key that talks a little bit about good cheese, and it also has a `cheese` key, which is an object that has three keys: `name`, `country`, and `type`. So you can imagine it being a lot easier to extract the metadata from this reply, as opposed to simply getting a couple of sentences. I would recommend you play around with this, check out the documentation, and see what other configurations and modifications you can make to this response. Once you're ready, I'll see you in the next scrim, where we'll dive more into what we've configured on this specific line, which is the models themselves that Mistral provides, as it's important to have a good overview in order to choose the right ones for the job. So I'll see you there.

Hey, in this scrim we're going to look at the various models Mistral offers. Be aware, though, that these are the models it offers at the time of recording this scrim. You should definitely click on this image right here so that you're taken to the landing page for their models, as there you can click around and check out their latest optimized commercial models as well as their open models.

Speaking of open models, Mistral rose to prominence in the AI community in 2023 when they launched their first model, Mistral 7B. That is a model with so-called open weights, meaning that you can download it to your computer or upload it to a server and use it as part of your application without paying Mistral a dime. One of the things that stunned the AI community was how powerful it was despite only having 7 billion parameters, as the leading open models back then had many more parameters than this, even an order of
magnitude more.

A little later, Mistral launched the so-called Mixtral 8x7B, which is also an open model and has a unique architecture that allows it to be much more powerful, though only slightly more expensive to run inference on. The core idea behind this one is that it uses a mix of eight different so-called experts. So the total number of parameters here is nominally 8 × 7 billion, which in practice comes to about 46 billion, though when you run inference it only taps into a subset of these experts and actually uses around 13 billion parameters when being run. Now, at this point you might be a little bit confused and want to know more about this. I don't want to go further into the technical details here, because I don't think it's that important in order to use these technologies. Though if you are interested, feel free to click on this image right here and you'll be taken to an article which talks more in depth about the Mixtral model.

Moving on to the next models, those are Mistral Small, Mistral Medium, and Mistral Large. These are not so-called open-weights models, meaning that you cannot simply download them from their website and get started locally. You either have to use them via a cloud provider that supports these models, or you can do self-hosting as well, though to do that you have to talk with the Mistral team.

Now, if we compare these models side by side, with their performance on the MMLU test as the height of each bar, you can see that the commercial models are more powerful than the open models, though the small commercial model and Mixtral are quite within the same range. If you don't know what MMLU is, it is a common way to test LLMs. It's short for Massive Multitask Language Understanding, and it puts LLMs to the test through a range of different tasks, giving them a score from 0 to 100% based upon how well they perform. Looking at this image, it seems that we should always go for the Mistral Large model, but that's actually not the case, because the flip side of using a better model is very often that it is
more expensive. So if we plot these models out in a two-dimensional space, with the cost per million tokens on the x-axis and the MMLU score on the y-axis, you can see that the picture is definitely different, because Mistral Large is by far the most expensive model, over twice as expensive as Mistral Medium. So if you are able to get the job done with Medium, you should definitely choose that one. An analogy you can think of here is hiring people at a company: in many cases you probably don't want to hire a person who is overeducated or overqualified for the job, because most likely their hourly rate will be higher.

So how do you then decide which model to use? If you want to dive more into this subject, just click on this image here and you'll be taken to the guide in the docs that specifically talks about model selection. There you can see some use case examples of what kinds of typical tasks a model is suitable for. For example, Mistral Small works well for things like classification and customer support, whereas Mistral Medium is the ideal model for intermediate tasks that require moderate reasoning. That could be things like data extraction, summarizing a document, writing a job description, and so forth. And finally, if you want to do more complex tasks, Mistral Large is your go-to model. Later in this course we are going to create a little agent that can call functions on behalf of users, in addition to doing so-called retrieval-augmented generation, a.k.a. RAG, and in those cases we are going to use the large model, as those require significant reasoning capabilities. And on that note, what exactly is RAG? Well, you'll find out in the next scrim.

Here at Scrimba, we use an app called Notion for note taking, and with a team of several teachers, developers, people in operations, and so forth, we have a lot of internal documentation, and it quickly becomes chaotic. Here we have a "Courses and teaching" page, which contains a bunch of sub pages, and they themselves also have sub
pages as well. So it is actually quite hard at times to get to the answer you're after, which is why I was really glad when Notion launched their Ask AI feature, which essentially means that you can ask questions to Notion. One day when I was working on our Coursera exports, I seemed to remember that we needed a widget for doing these exports, and I asked it about exactly that. It thought a little bit and then came back with an answer: "Yes, you are correct. For Coursera courses, a type of item called plug-in is used to embed scrims." This is quite interesting, because I asked for a widget, but the AI understood that I actually meant the plugins, so it shared with me, through this footnote here, the link to the document that talked about these Coursera plugins.

This kind of user experience is a game changer for web apps. Suddenly it is much easier to find the information you need, and you also give the LLM access to proprietary data, as obviously the underlying model here does not have any knowledge about how we at Scrimba internally embed our scrims in Coursera courses. This whole experience was only possible through something called retrieval-augmented generation, which probably sounds very complex, but don't worry, we'll go through it step by step, and we won't refer to it by this long, complex name. We'll use the popularized expression RAG.

Okay, so RAG consists of mainly two steps. There's the retrieval step: fetching the data you need to reply to the user's question. And there's the generation step: taking whatever information you found and using that as context when generating the conversational reply back to the user. If we zoom in on the retrieval first, this is very often done in collaboration with a so-called vector database. That is a specific type of database that is optimized for storing information in a specific format that makes it easy for AI to reason about it. So it stores so-called embeddings. Now, at this point you're probably a little bit confused: what's
this thing about vectors and embeddings and all of that? Don't worry about it, we'll get back to that later. For now I just want to explain RAG at a very high level. What you do is take all of your data and shove it into a vector database in this specific embedded format. Then you take the search query, or the input from the user, and turn that into an embedding as well, as that gives you the opportunity to do a so-called semantic search and get search results that intelligently understand, for example, that no, Per wasn't looking for a widget, he was actually looking for this, and thus fetch the relevant data for the app. That is the retrieval part.

Once you've done that, you take the user input, that is, the question I asked, which was a very humanly written sentence along the lines of "I seem to remember something about a Coursera widget, blah blah blah", and you combine that with the search results we got in the retrieval step and turn it into a single prompt that the LLM can use as input. Mistral AI takes that prompt and the relevant context we retrieved and turns that into a very human-readable response, in many cases with a footnote or link to the underlying data as well, thus giving the user a way of fact-checking the claims the AI comes up with.

Now, there's one thing that all of this relies on, which is our ability to turn data, for example a sentence, into numbers that the AI can understand, that is, to create something called embeddings. And what is an embedding? Well, it is what you get when you take a piece of data, for example the string "hello world", and run it through an AI model that turns it into a long array of numbers, also known as a vector. As we build out a RAG solution in this course, it is really important that you have an intuitive understanding of what this embedding concept is. So before we continue with our RAG project, I'll leave the mic to my colleague Guil Hernandez, who will give you a primer on embeddings in the next
scrim.

Whether you realize it or not, AI-powered search shapes many parts of your daily life. Every day you interact with platforms sifting through massive amounts of data, from text and images to audio and video. Think about Amazon recommending products, or search engines refining your queries. Social media platforms curate tailored content, while services like YouTube, Netflix, and Spotify offer suggestions based on your preferences. Now, advanced AIs, despite their capabilities, don't truly understand the real world as we do. They can't grasp the actual meaning or nuance of a video title, song, or news article. So how exactly do AIs and platforms like Spotify, Netflix, and YouTube truly get us? How is it that they appear to understand, predict, and respond to us as effectively as, if not better than, people?

Well, the magic behind this capability involves a blend of algorithms, AI models, and huge amounts of data, but a large part of the answer involves embeddings. You see, when you present a question to an AI, it first needs to translate it into a format it can understand. So you can think of embeddings as the language that AI understands. The term embedding is a mathematical concept that refers to placing one object into a different space. Think of it like taking a word or sentence, which is in a content space, and transforming it into a different representation, like a set of numbers in a vector space, all while preserving its original meaning and its relationships to other words and phrases.

AI systems process lots of data, from user inputs to information in databases. At the heart of this processing are embeddings, which are vectors representing that data. Transforming content like search queries, photos, songs, or videos into vectors gives machines the power to effectively compare, categorize, and understand the content in a way that's almost human. So how is all of this possible? Well, it isn't exactly as easy as just turning data into vectors, so before we go any deeper, let's take a closer look at
what vectors are. Think of a vector as a coordinate or point in space. To keep things simple, we'll have a look at this 2D graph with an x- and y-axis. Let's say that a word like "cat" is translated into a vector like (4.5, 12.2), which is this point. This vector encapsulates the meaning and nuances of the word "cat" in a way an AI model can understand. Then we have the word "feline", represented by a nearby vector of (4.7, 12.6), so we'll place that point on the graph. Words that have similar meanings are numerically similar and tend to be closely positioned in the vector space, so this closeness implies that "cat" and "feline" have similar meanings. Now let's say we have the vector for "kitten", which might also be close to "cat" and "feline", but maybe slightly further apart due to its age-related nuance. A dog is different, but still in the same general domain of domesticated animals, so the word "dog" might be represented by a vector that's not too distant but clearly in a different region, let's say (7.5, 10.5). Even a phrase like "man's best friend", which is a colloquial term for a dog, could be represented by a vector that's close to the vector for "dog". On the other hand, a word like "building" is not related in meaning to any of these, so its vector would be much further away, let's say (15.3, 3.9).

Here's another example that demonstrates how embeddings might capture semantic meaning and relationships between words. Let's say we have the word "king", represented by the vector (2, 5), then "man" is the vector (1, 3), and "woman" is represented by the vector (1, 4). Now let's do some quick vector arithmetic. We'll start with the vector for "king", then subtract the vector for "man" to remove the male context, and add the vector for "woman" to introduce the new context. After performing this vector math, our resulting vector is (2, 6), so we'll plot that point on the graph. Let's say there's another word in our space, "queen", represented by the vector (2, 6.2), right here. Well, this vector is extremely close to the resulting
vector, so we might identify "queen" as the most similar word based on that vector, just as a trained AI model would.

Now, a two-dimensional graph is a massive simplification, as real-world embeddings often exist in much higher-dimensional spaces, sometimes spanning hundreds or even thousands of dimensions. For example, the actual vector embedding for the word "queen" might have values across many dimensions. Each dimension, or number in this vector, might capture a different semantic or contextual aspect of the word "queen", for instance royalty, Cleopatra, or even chess. This is what allows AIs to recognize and differentiate between these contexts when the word is used in different scenarios.

Now imagine embedding hundreds of thousands of words and phrases into this high-dimensional space. Some words will naturally gravitate closer to one another due to their similarities, forming clusters, while others are further apart or sparsely distributed in the space. These relationships between vectors are extremely useful. Think back to Spotify's method of embedding tracks in a vector space: tracks that are positioned closely together are likely to be played one after the other.

All right, so what else can we do with embeddings, and how are they used in the real world? Well, you can imagine how embeddings have revolutionized our daily experiences. For example, search engines have evolved to understand the essence of your queries and content, moving beyond mere keyword matching, and recommendation systems, with the aid of embeddings, suggest products, movies, or songs that truly resonate with our preferences and purchase history. For example, Netflix uses them to create a tailored and personalized platform to maximize engagement and retention. Also, in the healthcare industry, embeddings are used to analyze medical images and extract information doctors can use to diagnose diseases. And in the finance world, embeddings help with analyzing financial data and making predictions about stock prices or currency
exchange rates. So every time you interact with an AI chatbot, every time an app recommends something, behind the scenes embeddings are at work, translating data into meaning. All right, so how are these embeddings actually created? Well, let's dive into that next.

Before we create our embeddings, there's one important thing you need to learn, and that is how to split text, because as an AI engineer you'll find yourself having to split text again and again. Let's say that you are working on an internal employee handbook app which lets employees ask questions about the company's policies. In that case you probably have a large data source, like the one you can see here in `handbook.txt`, which contains all of the data that you need to embed. However, creating one embedding of this entire thing would just be meaningless. There are far too many subjects and themes talked about in this handbook, so it wouldn't really have any specific semantic meaning of value. It would be far too broad. So what we're going to do is take this document and split it into chunks, and then we'll create an embedding of every single chunk.

Now, creating such chunks is actually a little bit complex, though luckily we have a tool to help us with that, and that is LangChain, one of the leading libraries for AI engineers. What we'll do is enhance this function so that it uses the LangChain text splitter, because as you can see, it doesn't do much at the moment. It's simply an async function that fetches the handbook and calls `.text()` on the response, thus giving us all of the text in this handbook. Let's run the code and just see that it works. Yes, there we have it. So now we can use LangChain to split this into smaller chunks. I'll import the LangChain library here as a dependency, and then let's figure out which specific tool we need to import from LangChain. The simplest one is the `CharacterTextSplitter`, though the recommended one to use is the `RecursiveCharacterTextSplitter`, so that's the one we're going to
use. So here we'll do `import { RecursiveCharacterTextSplitter } from 'langchain/text_splitter'`, like that. Now we can create a new `RecursiveCharacterTextSplitter`. This is a constructor function that takes an object as the argument, and here you define two things: the size of the chunk and how much overlap you want between the chunks. We'll try, for example, 250 characters for the size of the chunk, as that feels like a sentence or two, and we'll allow for some overlap, for example 40 characters. We'll call our splitter simply `splitter`, like that, and then we can do `splitter.createDocuments` and pass in the text. This is an async function, so we have to await it and store the result in a variable called, for example, `output`, like that. Now if we log out the output, let's run the code. There I got an error, and that is because I have a typo: I misspelled the text splitter module in the import path, and it should be `text_splitter`, like that. Let's run the code again. Yes, there we go. As you can see in the console, there is a bunch of data there, and if we open the dev tools we'll be able to inspect it in a little more detail, so let's do that. Here, as you can see, it is an array which contains 2 180 objects. Let's open up one of these objects, and there we can see that we have the text itself under the `pageContent` property, and under the `lines` property we get the exact lines this content comes from in the `handbook.txt` file. That is very handy in case you want to create footnotes or references to the original source in your app.

Now, what you want to make sure of when you inspect your data like this is that each of these chunks ideally only deals with one subject or theme. That is how you create good embeddings. If a given chunk is, quote unquote, polluted by different themes, it'll be harder for the embedding model to create a meaningful vector of it. Here you can see that this chunk deals with delegation of authority and responsibility, and the administration and the executive director, so definitely a coherent subject, though it's actually been split in the middle of two sentences, so it could probably be better as well. We have probably not struck the perfect balance here. You could argue that it would have been better to split this into two and then use the entire sentences, or maybe expand it at both ends and include both of the complete sentences.

In general, the shorter your chunks are, the more precise a meaning the embedding will get, though you might also miss some wider context. The longer the chunks are, the more context they contain, but that can also produce too broad a scope of information, and this would reduce the quality of the similarity search that the vector database does, as the underlying meaning of the embedding could be ambiguous. It could point in two different directions, so to speak. In general, you want your chunks to be as small as possible without losing context. So it's definitely a balance to strike, and something you'll probably only find through experimentation: creating smaller and bigger chunks and actually seeing how it plays out in action in your app. For now we'll stick with this and see how it works as we continue building this RAG feature. Let's carry on.

In the previous scrim I wrote all of the code for you, but as you know, this is a Scrimba course, meaning that your job is to get your hands on the keyboard and write the code out yourself. So I left
out a couple of pieces for you, which you are now to implement through this challenge. I want you to refactor this function so that it, first of all, takes the path to the data or document as an argument, that is, the path to `handbook.txt` here. That'll make it a little bit more generalized, as it's really not good practice to have the path for the fetch request hardcoded in here on line seven. Secondly, I want you to return the split data as an array of strings, and just that, because that's how we want our data in the next step of building out this feature. So go ahead and solve this challenge right now.

Okay, hopefully that went well. First we'll specify that it takes a path as the argument, which we'll use in the fetch request, and then of course we'll need to specify in the function invocation that we indeed want to get the data from `handbook.txt`. That was part one. Part two is returning the data as an array of strings. If you remember from the previous scrim, when we inspected this data, it is actually an array of objects right now, but this time around we only want the data that is within the `pageContent` property, because we do not care about the location metadata at this point. So here we'll take the output and map through it, and for each of these chunk objects we'll return `chunk.pageContent`, like that. We can store that in a variable called `textArr`, for text array, and then simply return it. Now, you can of course condense this into fewer lines of code, but I like to be explicit and only do one thing at a time on each line. So with that, we are done and ready to carry on.

Now it is finally time to use the Mistral API to create our very first embedding. As you can see, I have imported the Mistral client and added my API key, so we are ready to get going. The first thing I need is an example text chunk to create an embedding of. I happen to have copied one of them to my clipboard, so I'll paste it in here and call it `exampleChunk`. As you can see, it says "professional ethics and behavior are expected of all Ambri employees. Further, Ambri expects each employee to display good judgment". So this is quite a good text for embedding, because it deals with one subject, which is the character expected of Ambri employees. Now I'll comment this one out, as we won't call this function right now. Instead, down here at line 22, we'll call the `client.embeddings` function. That is an async function, so we have to await it, and inside the object we pass as the parameter we'll specify first what kind of model we want to use, and here Mistral provides an embedding model called `mistral-embed`. The second key in the object is the `input`. Now, we can't just pass the example chunk in as it is, as this input isn't expecting a string. It's actually expecting an array of strings, so we have to wrap it in an array, like this. We'll store the response we get back in a const called, for example, `embeddingsResponse`, like that, and then let's finally log it out. I'll run the code, and yet again I had a typo: Mistral with an r is what we want to write. We'll try again, and there we go, we got something very interesting back. Let's paste it into the editor to inspect it a bit more, like that. Here we can see it has an `id`, and under the `data` property we have an array that holds an embedding, and that embedding is a
long array of floating point numbers, all of which are seemingly very close to zero, though slightly more or slightly less. So this vector right here is an embedding of this specific text as transformed by this model, and as we use this model to transform other pieces of text, the mathematical distance between the various vectors will be a reflection of how similar or how different the semantic meaning of the sentences in the various chunks is. So pretty cool. With that, I think you are ready to take the next step in building this RAG feature, so in the next scrim I'll give you a challenge. Let's move on.

Okay, now it's your turn to create your very first embeddings. As you might have noticed already, I have removed the code I wrote in the previous scrim, because, yeah, this is Scrimba. You are going to write the code on your own; that's how you really learn. The only thing I've done is call the document-splitting function and store the result in a variable I'm calling `handbookChunks`, because you're going to use that when you create and invoke this `createEmbeddings` function. It takes the chunks, that is, these, as a parameter and turns them all into embeddings using the Mistral API. Once you've done that, you are to return the data in a specific format. What we're doing here is prepping it before we upload it to the vector database, and the service we are using for our vector database is called Supabase, which you'll learn more about very soon.

The structure Supabase wants us to create is the following. It should be an array of objects, and each of the objects should contain two properties: one called `content`, which is just the raw text string that you find in each of the chunks, and secondly the `embedding` property, which should simply be the embedded version of that string, a.k.a. the vector. Once you have the data in this format, just return it, and later on we'll take care of uploading it to Supabase. So go ahead and give this one your best shot. Good luck.

Okay, hopefully this went
well. I'll start by defining the function, like that. This will be one with asynchronous operations, so we need to define it as, yes, an async function. Inside of it we'll start with the Mistral client and the `embeddings` method. It takes an object with two properties: the model, which should be `mistral-embed`, and the input, which should be the chunks that we have passed into the function. Previously I added a single string here, so I had to wrap it in square brackets like this, because the input is expecting an array of strings. Here, though, `chunks` is already in the shape of an array, as it is this `handbookChunks` array right here, so we don't need to do that. But we do need to await this one and store the result in a variable, like that. Let's now console log out `embeddings` and see if we get anything when we run the code. Let's call the function, pass in the handbook chunks, and see what we get out here on line 24.

All right, so in our console you can see we have an object which contains a `data` array, which in turn contains objects with a property called `embedding`. So the data we want exists inside of `embeddings.data`. Then we can navigate into a random item in this array, for example the 12th one, and fetch out its embedding, like that. If we run the code again, we should see, yes, one vector being logged out to the console. Really good.

Now we need to combine all of these vectors with all of our text chunks in this structure we've defined down here. To do that, I'll map through each of the chunks and return a new object which contains the chunk as the `content`, and the vector should be under the `embedding` key. We'll find it by navigating into `embeddings.data`, into one of the items, and then `.embedding`. So there are a lot of embedding words here right now. Just bear with me and we'll try to make this work. Actually, I'm a little bit lazy, so I'll just copy this one right in here, and then we need to replace this with whatever index we are at at every step in the iteration. Luckily, `map` gives
us the index as the second parameter of the callback function, so we can simply replace this with `i`, like this. Let's store this in a variable called `data` and then finally return `data`, like that. I'll remove this one. Now we can call `createEmbeddings` and expect to get back the data and then log it out, but if we want to do that we also have to await it, because here we have asynchronous code. So console log, like that. Let's run this and see what we get. Yes, there we have a beautiful array with objects that contain two keys: `content`, which contains the raw text string, and `embedding`, which contains the vector itself. So we have the data just how we want it.

Now, if you solved this in a slightly different way, that's totally okay. There are certainly ways to condense this code and make it, quote unquote, DRYer. I'm not going to worry about that right now, but feel free to write this however you want. The important thing is that you got the intended result, not that my code and your code are mirror images of each other. So with that, we are ready to take the next very exciting step in our RAG journey, and that is to start learning about vector databases. For that, I'll hand things over to my colleague Guil, who will teach you about vector databases over the next couple of scrims.

In this course we're going to use Supabase to manage our vector database. Supabase is a full-fledged open-source backend platform that offers a PostgreSQL, or Postgres, database, which is a free and open-source database system recognized for its stability and advanced capabilities. While Postgres is not a dedicated vector database, Supabase does support a powerful Postgres extension called pgvector for storing embeddings and performing vector similarity searches. If you've worked with Supabase or Postgres, this should be pretty straightforward. If not, don't worry, you don't have to be a database expert to start using Supabase. It's quick and easy to set up, and the platform has a simple-to-use dashboard that makes Postgres
as easy to use as a spreadsheet.

So the first thing you want to do is head over to supabase.com. Once there, click to sign in, which you can do using your GitHub credentials. Next, on your dashboard's projects page, click "New project". You'll first create a new organization with Supabase. You can use a company name or your own name. Choose the type of organization, in my case "Personal", set the pricing plan to "Free", then click "Create organization". After that, Supabase will ask you to create a new project, which comes with its own dedicated instance and full Postgres database. It will also set up an API so you can easily interact with your new database. So give your new project a name, like "vector embeddings", create a password for your Postgres database, then choose a region that's geographically close to you or your user base for best performance. Then click "Create new project", and after a short moment your new project should be all set up.

From here you'll need to enable the pgvector extension in your new project. Click the database icon in the sidebar to go to the database page, then in the page's sidebar click "Extensions". In the search field, search for "vector" and enable the extension. That should set you up to use Supabase to store, index, and query vector embeddings.

All right, next you'll need to integrate Supabase with your application, or in our case the scrims for this course. To do that, click on the project settings icon and navigate to the API section in the sidebar. Here you'll find your project URL and API keys. These are essential for integrating Supabase with your app. So first copy your project URL, then save it as an environment variable on Scrimba. Remember, you can access your environment variables with the keyboard shortcut Command or Control + Shift + E, and be sure to name this variable `SUPABASE_URL`, exactly as shown here. Finally, copy your project API key, then save it as a Scrimba environment variable named `SUPABASE_API_KEY`, just like this.

Vector databases, or vector stores,
possess unique superpowers for managing vector embeddings, with the capacity to store and retrieve embeddings quickly and at scale. All right, so how do vector databases actually work? Well, embeddings essentially allow us to match content to a question. Unlike traditional databases that search for exact value matches in rows, vector databases are powered by complex algorithms that store, search, and quickly identify vectors. So instead of looking for exact matches, they use a similarity metric that uses all the information vectors provide about the meaning of the words and phrases to find the vectors most similar to a given query.

So storing custom information as embeddings in a vector database gives you the benefit of enabling users to interact with and receive responses exclusively from your own content. You have complete control over your data, ensuring it remains relevant and up to date. This can also help reduce the number of calls and token usage, and even allow the summarization and storage of chat histories, which helps AIs maintain a type of long-term memory, an important tool against the problem of hallucinations with AI models. So with all that said, vector databases are becoming a central part of how we build AI-powered software and play a massive role in the advancement of large language models. These days you have various vector database options, from tools like Chroma to Pinecone, Supabase, and several others. All right, so next up I'll guide you through setting up your own vector database. See you soon.

Now we need to configure Supabase in our project so that we can start interacting with the database. As you can see, I've installed Supabase as a dependency and imported `createClient` from the Supabase JavaScript SDK. On line six we invoke this function, passing in the Supabase URL as the first parameter and the API key as the second, and then we have our Supabase client. However, now we have two clients here, the Mistral one and the Supabase one, so I want to make it a bit
more apparent that this one here is dealing with Mistral. So I'll rename it, like that, and then change the name here as well.

Now I want you to head over to your dashboard in Supabase and click into the vector embeddings project. From there, choose the SQL Editor from the menu on the left-hand side, as this allows you to create tables in the database using a SQL query. Having tables is absolutely necessary in a SQL database, as that is how you store the data. I happen to have the query right here for you. As you can see, it's pretty straightforward: `create table`, we're calling it `handbook_docs`, and then we define the three columns we want our table to have. An `id`, which has the data type `bigserial`; that is the primary key, so the identification field in this table. We'll have the `content`, which we'll specify as plain `text`. And finally there's the `embedding`, which is a vector of 1,024 dimensions. If you think this resembles our data structure down here, you are completely right; that is exactly why we formatted our data this way. So go ahead and take the SQL and paste it into the editor, hit Run, and then you should see under the results here "Success. No rows returned". That means that your table has been created. To view it, simply click on the Table Editor in the menu on the left-hand side. There you can see this is the very beginnings of a table that has an `id` column, a `content` column, and an `embedding` column.

Now, to get our data all the way from the handbook, via the embed endpoint, and finally into the structure we want, and then upload it to Supabase, we only have one line of code to write, and that is simply `supabase.from`. Here we'll specify our table, `handbook_docs`, then `.insert`, because we want to insert something. And what do we want to insert? Well, that is the data. This is also an async operation, so we have to await it. And when this line has executed and JavaScript moves on, we'll log out "Upload complete". Let's now run this, and there we go, the upload should be complete. Let's head over to Supabase and, boom, there we go: we have our content and the corresponding embeddings in the vector database, meaning that we are ready to take the final step in this RAG feature, which is to perform the retrieval so that we can generate replies to the users for any question they might have about our employee handbook. So great job reaching this far. Let's carry on.

With all of our text chunks and embeddings safely stored in Supabase, we are finally ready to write the code for our RAG feature. As you can see here, I've changed index.js around a little bit, as I moved the old uploading code over to data.js, since we won't be using that now; we're now actually going to do the retrieval and generation steps. So let's start by going through this code so that we're both on the same page. The flow of this app contains four steps. The first one is getting the user input. Here I've just hardcoded it as a variable, where the user is asking whether or not they get an extra day off since December 25th falls on a Sunday. Now, of course, in a real app the user would probably ask this in some kind of form, and you'd do some DOM manipulation to fetch it, though that's outside the scope of this course, so we'll just keep it simple and use this `input` variable. Next, we need to take this input and turn it into an embedding, as we need to see if the embedding of this string matches some of the embeddings we've created of the various chunks in our handbook. Now, creating this embedding should be a piece of cake for you by now, so I didn't bother going through that code with you, as you've done that before. Once we have this embedding stored in this variable, we'll pass it into another function that we've called `retrieveMatches`, and this is where we are going to do the similarity search. Now, I've not written the body of this function yet. Let's just continue on with the flow and then get back to that, because once we've gotten the matches, a.k.a. the context, we'll pass both the context and the input
into a function called `generateChatResponse`, where we'll use these two in combination to get Mistral to formulate a reply to the user. So those are essentially the four steps of our RAG feature.

Now let's look at this `retrieveMatches` function. Here we need to tell Supabase to do a similarity search, and if you read some of these descriptions you might be a little bit scared, because they're called things like Euclidean distance, negative inner product, or cosine distance. That certainly sounds complicated, though luckily we don't have to worry about any of that, as Supabase provides us with a SQL function that we can simply copy and paste so that we don't have to dive into the underlying complexity. I've pasted this function into the function.sql file right here, changing a few things, like the name of the function, which I want to be `match_handbook_docs`, to match the name of our table. I've also changed the vector to account for the number of dimensions Mistral gives our embeddings, plus updated this query down here to account for our `handbook_docs` name. So what I want you to do now is copy this entire function, head over to Supabase, and click into the SQL Editor. There, click on the New Query button and then paste in the function. Click on Run, and if you see "Success. No rows returned", it means that this function is now available in your database.

But now the question is, how do we access this function in our JavaScript? That is where Supabase is really user friendly, because they have an `rpc` method, a so-called remote procedure call, which you can invoke anywhere in your code, just like they do in this snippet right here. So what we'll do is simply copy this and paste it into our code. Now, our function was not called `match_documents`, it was called `match_handbook_docs`, and to begin with I don't want 10 matches, which is what is defined here; I want just one. Now, the match threshold sets a threshold for how similar embeddings should be in order to be included as a match. The
higher you put this, the more picky you are, so the fewer matches you'll see, but also the more similar they will actually be. And here the query embedding is the embedding that we passed into this function, in other words the embedding of this string right here. Finally, we can return `data`, and I happen to know that inside the first item in the `data` array there is a property called `content`, and that is what we want. So now let's comment out this line, console log out the context, and try to run this code. And there we go, we get back a very relevant piece of context, which says: "Christmas Day. Full-time employees (employees who work at least 35 hours per week) receive one paid day off for each full day of holiday time. Holiday benefits for part-time employees", and then it stops. So we were able to retrieve very relevant information, though it's not formulated as a good reply to the user, and there's also some information lacking here. Ideally we would have seen the sentence that was cut off midway, as that would have given us information about the part-time employee vacation policy for these kinds of situations. So that leaves us with a couple of tasks to be done down here in the `retrieveMatches` and `generateChatResponse` functions. And who do you think is going to fix that up? Yes, you guessed it, that's you. So in the next scrim you are going to complete the retrieval and the generation process of this feature. Let's move on. I'll see you there.

Welcome to this two-part challenge, where you are going to complete the `retrieveMatches` function and write the entire body of the `generateChatResponse` function. In the first one, we are to fix the fact that we didn't get enough data back by getting just one match. So instead we are going to return five matches, and that involves updating this object and changing how you return the data. For the second challenge, you are to take whatever context you get back from the retrieval step and combine it with the user's
query or input and turn that into a prompt. This prompt should be sent to Mistral's API, and you decide for yourself what model and what settings you'd like to use. Here you're going to do a little bit of prompt engineering, as you'll need to combine the context and the query into a single prompt. I don't want to dictate this for you. Instead, just think of how you would take two pieces of data, a context and a question, and turn them into one prompt that instructs the AI to answer the question based upon the provided context. Now, I can disclose that it doesn't have to be complex. The AI is pretty capable of figuring out what you're trying to do, so just give it your best shot and see how it works. Finally, once you've done this, you probably want to log out the response here to inspect what kind of reply the AI generated for you. Okay, with that, best of luck. You got this.

Okay, hopefully this went well. Let's do it. I'll start with this one, and the first thing we need to do is, of course, update this number to be five instead of one. Then I'll check out the data by logging it out. Actually, I'll remove this and just return the data so that we get to inspect the underlying structure, as we are already logging out the context here on line 15. Let's run the code, opening up the console, and there we go. So this is an array of objects, where we are looking for the string inside each of the `content` keys. So if we want to combine all of the content into a single string, I'll do `data.map`, and for each chunk I'll return `chunk.content`. If we return this, let's see what we get. Running the code, and there we get an array of strings. To combine that into a single string we'll just do `.join` and specify that we want a space in between each of the strings. Logging this out, yes, that looks good.

Moving on to this one here, we'll start by using the Mistral client and call the `chat` method, passing in the model. Let's try the most capable one first, `mistral-large-latest`. The messages array only needs one object, or I'll at least try with that, and if it doesn't work I'll perhaps try to add a system message, but let's go straight to the user message as a first solution. The content here is where we'll need to do a little bit of prompt engineering. I'll try the easy way first and simply do "Handbook context:", like that, then pass in the context, like that, and then we'll do "Question:" and pass in the query. Now, we could of course have written "This is an extract from the handbook that contains relevant background info for the user question", though, as you've probably understood, I like to start off simple and only make it more complex if needed. So this is actually all the prompt engineering I was looking for. As the next step, we want to return whatever result we get from this, and to do that we of course have to await this function and store the response in a variable. I happen to know that the generated reply to the user lives inside `response.choices`, in the first item of that array, under `message.content`. So with that, it is the moment we've all been waiting for. Let's log out the response, like that, run the code, and see how this works. And yes: "Based on the handbook context provided, if Christmas Day falls on a Sunday, you as a full-time employee would still receive one paid day off for the holiday." Brilliant, that is exactly what we were looking for. It was able to figure out that December 25th was semantically related to Christmas Day, so it was able to retrieve the relevant information and use that to generate a nice, human-readable reply. So phenomenally well done. You now have RAG as a tool in your tool belt as an AI engineer, and this will definitely come in handy throughout your career if you continue down the path of AI. So give yourself a pat on the back, and perhaps take a break at this point, as you've learned a lot and perhaps need to digest it. If not, in the next part of this course you are going to learn about an insanely exciting concept, which is function calling, which enables you to create AI agents that interact with the world on the user's behalf. So truly something that opens the door to a whole world of revolutionary user experiences.

Hey, and welcome to the section about function calling. This is a very exciting field that opens the door for you to AI agents, and by that I mean smart assistants that can interact with the world on behalf of your users just by interpreting what they say. So this is a new paradigm in terms of the user experience we developers can provide. Now, let's start off at a very high level, looking at what the architecture of such an agent typically looks like. Let's say that you are running an e-commerce website where you sell products to people, and you have a chat where users come to ask questions, for example things like "Is my package on its way?" What we'll do then is send this to our LLM along with some specific instructions as to what
kind of tools it has available to figure out the answer for the user. Then, if it is a good model, it'll look at this query and realize for itself that, hmm, I actually need to call the fetch order function to give the user a good reply, and then it'll instruct our software to actually perform this function call. This is often done via regular code, like if and else conditionals. So through the code you have written, you'll ensure that when the AI wants to call this function, you will actually perform the function call and get the order data, which you then return to the LLM. It will then read that data, which probably comes in the form of an object, and turn that into a human-readable response like "Yes, your order is expected to arrive" and blah blah blah, which again results in a happy user, all without you having to use manpower to do this. So you can imagine the power of this technology, as it can drastically improve the customer service users get when talking with these chatbots we've all come across over the last few years. And of course that is just the start. Imagine how powerful it is when this is rolled out to all industries.

Okay, let's now have a look at the code. This here is by and large a very standard function call to the chat endpoint at Mistral, so all of this should be familiar to you, except for line 13 here, where we've added an array of tools, and that comes from the import statement here on line two. If we head over to tools.js, we find a function that searches through that data, and as you can see, down in the tools array we have described this function through a specific schema. This entire object's sole purpose is to describe for the AI what this function does. So it says that, first of all, it is a function, its name is this, and here is a description of it as well. Plus, it takes one parameter of type object, this one right here, and this object has a property called `transactionId`, which is of type string and is also a required property. So I think
you can guess what I'm getting at here. This is all our AI sees. It never sees the content of this function. It just looks at the description and tries to decide whether or not it should be invoked, and with what kind of arguments, based upon the input from the user. So if we yet again head back to index.js and try to run this code with the prompt "Is the transaction T1001 paid?", then you can see in the console that we get back an object. I happen to have copy-pasted one of these objects into output.js just to make it a little bit easier to read. Here, as you probably remember from earlier, is the choices message content property, which is actually empty. Now, this is where Mistral always adds the reply from the LLM, though now the LLM has nothing to say. Instead, it has given us some instructions about what we should do. Here, under `tool_calls`, there's a `function` key, which has the name of the `getPaymentStatus` function and the argument of `transactionId` T1001. So this is how it tells us, "Hey developer, now you have to invoke a function." And the LLM isn't done reasoning about this issue. If it was, it would have said `stop` here, but instead it wants us to call a tool and then send back the result. So, as you can understand, we developers have some work to do here, and we're also going to make this agent even more capable by adding another function. So I'll leave it at this so that the two of us can get back to work and start coding, and we'll do that in the next scrim.

So we've been able to get the Mistral model to tell us to call a function when someone asks about the payment status of an order. However, if we take a look at the data, you can see that there are also other things going on here. There's both an amount and a date for each of the orders, so there's definitely potential to make our agent more powerful. I'm now going to paste a function in here called `getPaymentDate`. It also takes in the transaction ID and does more or less the same thing as `getPaymentStatus`, though it instead returns the date for
when it was paid. Now, you could of course also have done this by simply making this one a bit more robust, to fetch various pieces from the data based upon the argument passed in, but the point here is not to build a production-ready system but rather to give you some practice in building agents. So what I want you to do now is expand upon this tools array by adding a new object that describes the second function, because, as you might remember, the AI doesn't read this code. It only knows about the function through how you describe it in the tools schema. So that is your first challenge. Once you've done that, I want you to verify that your solution works by changing the prompt that we give the agent here in a way that would get the LLM to instruct us to call the newly added function. So go ahead and solve these two challenges, and I'll show you the solution when you return to me.

Okay, let's do this. I'll head back to the tools file again, and then I will simply copy this object, as I'm a little bit lazy and I've understood that the second object here will look very similar to the first one. So I'll do like that and simply change `getPaymentStatus` to `getPaymentDate`, and the description to "Get the payment date of a transaction". The parameter can stay just the same, as it is identical to the previous function's. And actually, that was about it. Heading back to index.js, I'll change this to "When was the transaction T1001 paid?" Let's run this, open up the console, and yes indeed, you can see that it is now instructing us to call `getPaymentDate` and not `getPaymentStatus`. So mission accomplished. Hopefully that forced you to get to know the schema a little bit more, as we're going to move to the next step now, where we'll start acting upon the instructions we get from the Mistral model.

Right now we are in this step of our flow. The LLM has just told us that we need to call a function, and now we're about to write the code we need in order to do that. So the first thing
I want to do is make sure we keep our messages array up to date, as it should include every single piece of dialogue going back and forth between the user, the app, and Mistral. So what I'll do is `messages.push`, and I'll push some part of this response, though not the entire thing. Let's have a look at how it was structured. If we head into `choices` and the first object in this array, we can see that there is a `message` object, and that is what we want. So we'll do `response.choices`, the first one, and then `message`, like that. As you can see, it has a role, just like our user message has a role, though this is from the assistant. The content is empty since it didn't have anything to say to us. Instead, it had some instructions about what tools we should call. So the next thing we want to do is write the if statement that checks if we indeed are about to call a tool, and then write the code for the specific tool call. Now, I don't want to use the fact that there's a `tool_calls` here in our conditional. Instead, the right way to do this is to look at the `finish_reason` and the fact that it has the string value of `tool_calls`. So we'll do if `response.choices`, the first item, and then `finish_reason`: if this is equal to `tool_calls`, well, then our next step is to fetch out the name of the function we are to call and its arguments. And at this point I think I've written more than enough code for you. It is your turn to take over.

So here is a challenge for you. I want you to get a hold of the name of the function that we should call and its arguments. We want the function name as a string but the arguments as an object. I've set up the two variables for you, `functionName` and `functionArgs`. Both are just initialized as empty strings, but whatever expression you replace this with should be of the data type object. So now you'll have to dig through this one yourself and fetch out the relevant information, and once you're done, just return to me and I will show you the solution. The final
thing I'm going to do here is make sure that I close this if statement properly, like that, and indent this, and have the correct indentation for this one as well. With that, you are good to go. Best of luck.

Okay, hopefully this went well. Let's do this together. First I'll start with the function name, and here I'll need to navigate all the way down to `tool_calls`, `function`, and then `name`. I happen to see that both the name and the arguments are in the same object here, this `function` object. So to avoid too much repetition, I'm going to do `const functionObject`, like that, and then I'll paste this in, adding `.tool_calls`, which is an array, and we want the first item, and then `.function`, like that. Now I can do `functionObject.name` and `functionObject.arguments`. So if you got to this point, good job, though we're not quite done yet, as `functionObject.arguments` is a string and we want it as an object. The way to do that is to use `JSON.parse`, and that should turn the JSON string we have into an object. Let's console log out the function name and the arguments and then run the code to verify that it works. I'll comment out the response down here. Running the code, and yes, the first one is `getPaymentDate` as a string and the second one is indeed an object. So very well done solving this challenge. Let's carry on.

It is time for us to do this step, which is to call the function so that we get the data we eventually can send back to Mistral. As you know, we have the function name and the function args, but this file doesn't yet have access to the functions, as they live in the tools file. So I will import these functions, `getPaymentDate` and `getPaymentStatus`, like that. Though now the question is, how do we go from just having a string value, for example "getPaymentDate", to actually calling the function? Well, to help you with that, I'm going to wrap these functions in an object called `availableFunctions`. I'll add `getPaymentDate` and `getPaymentStatus`. What this gives us the
opportunity to do is to use bracket notation to get a hold of a reference to any of these functions, because if we pass in the string "getPaymentDate", this would be a reference to that function, and if we add parentheses we'll invoke the function. So that is pretty cool, and it is exactly what I want you to do right now in your challenge. Your job is simply to perform the function call. So go ahead and write the code to do that, and then I'll see you when you return to me.

Okay, hopefully this went well. The way to do it is to grab a hold of the `availableFunctions` object, then use bracket notation to pass in our function name, call it with parentheses, and finally pass in `functionArgs`, like that. We'll store this in a variable called `functionResponse` and then finally log it out. Let's run the code, and there we go. We have turned a prompt from the user into a real function call that returns the data our assistant asked for. Really good job reaching this far. We are making a ton of progress, so let's just keep up the pace and carry on.

So we've called the function and obtained the data we need, and now we need to send it back to our assistant. So how do we do that? Well, as I've said earlier, it's important that we keep track of all the dialogue in this app, whether it's from the user, from the assistant, or in this case from the tool itself. And where do we keep track of this dialogue? Take a guess. Yes, it is in this messages array, which we are passing along to Mistral every time we interact with the API. So what we're going to do here first is `messages.push` and pass in an object. As you've seen before, we always have a role, though this time around it's not the role of `user`, and also not the role of `assistant`, which is what we had when we got the instructions to call the function. This time around the role belongs to the `tool`. The next piece of information we need to pass along is the name of the tool, which we have here in `functionName`. And finally, the content. After we've told Mistral that we worked with the tool and gave it the name of the tool, what do you think the content here should be? Take a guess. Yes, hopefully you understood that it is the response we got back from the function, because when Mistral gets all of this data, it should be able to decide what the next step should be.

And speaking of which, how do we then send this off to Mistral? Well, we could start a new `client.chat` down here and add all of the metadata again, though that's a very hard-coded and hacky solution. Instead, we want to rerun this piece of code, then yet again check if we're instructed to call yet another function, and keep on going until the assistant tells us that, yes, we are now done with the back and forth and I have a good response for the user. And hearing that, what kind of programming paradigm does that sound like a job for? Yes, you guessed it: the loop. So we are going to wrap this entire thing in a loop and keep it running until we have a satisfying result. We'll do that in the next scrim, so I'll see you there.

Okay, we are ready to perform the final steps of our flow. We'll take the result from the function and send it to our assistant, who will then construct a reply and send it back so that our user is happy again. As I talked about in the previous scrim, we'll do this with the help of a loop. So this is a challenge where you are to start by creating a for loop that runs a maximum of five times, and inside of the for loop, if the finish reason is ever set to the string `stop`, then I want you to simply return the response from the
assistant. So then you are to return from the entire function, and that'll also break out of the loop. As you can see, the finish reason lives down here in the object you get from the assistant. Now, you might ask at this point, well, why are we simply hardcoding a for loop that runs five times? Wouldn't it be better with a while loop that could run as many times as you need until the task is complete, or for example a recursive solution that would do the same thing? And yes, those could be better solutions, but they also open up the possibility of infinite loops, so it would require some guardrails, in my opinion, to implement such a solution, which is why we're simply going for a naive for loop that runs five times. That should be more than enough for our use case, though of course, if you want to build on this after you've solved the challenge, you are more than free to do that, and actually I would encourage you to. Anyway, give this challenge your best shot, and then I'll see you when you're done.

Okay, let's do this. We'll do `for`, `let i` equals zero, `i` should be less than five, and it should increment. Moving this all the way down here and indenting everything inside of the loop, like that, and checking here `response.choices`, the first item, and yes, it lives within that object: if the finish reason is `stop`, then we'll simply return the content within the message. And finally, let's bring this up here and do `else if`, like that. Okay, the moment of truth. Let's see if we've been able to successfully implement this entire flow. I'll comment out this console log. We're asking when this transaction was paid. Let's run the code, and yes: "The transaction T1001 was paid on October 5th, 2021." Wow, congrats! You've just built your very first AI agent. And while this of course is a dummy example, you now understand the basic building blocks, which gives you a foundation for building real-world products. So give yourself a pat on the back, and then I'll see you in the next scrim.

Hi there. Now you are
going to learn how to run Mistral models locally on your computer, and the tool we are going to use for that is called Ollama. It is an app that wraps large language models on your computer and lets you interact with them in various ways. So click on this image right here and you'll get to the Ollama page. There you can search for models, for example Mistral, click into it, and see the size of it (this one is 4.1 GB), and also read about how it performs compared to other open-source models. For now, let's head back to the homepage and click on the Download button. Then choose whatever platform you use and click yet again on the Download button so that you can complete the download and install Ollama on your computer. Once you've done that, open up the terminal on your computer and type `ollama run mistral`. That'll start the download process, and as it'll take some time, I'll fast forward. Once it's done, we get this little UI where we can type a message. So let's ask the model something, for example trying to use it as a motivational coach, which I often use large language models for. I'll type "Feeling a bit demotivated today, can you help me get started with my day?" and when I hit enter, Mistral starts typing out tip after tip, not by being run on a third-party server but by being run on my computer. Let's just take a minute and acknowledge how cool this is, because aside from the cost of the hardware and the electricity, these tokens are completely free, and there's also 100% privacy, as the data stays on the device.

Now, what's perhaps even cooler is that you can use this as a model for any AI projects you build locally as well. So let's try to do that. In the current scrim, click on the cog wheel in the bottom right corner and then click Download as Zip. Then in your downloads folder you'll see this ZIP file, so just go ahead and double-click on it to unzip it. That'll give you a folder with a weird-looking name, which is the underlying ID for the scrim. So take this folder, rename it, and place it wherever
you keep your dev projects. For me, that is in the dev directory, and I've named this project Ollama hello world. So I'll navigate into it, and there you should run `npm install` and then `npm start`. That'll spin up the project on port 3000, meaning that you can head over to localhost:3000, and there you will see the browser telling you to ask a question via the `question` parameter.

What's going on here is that this little Express router checks if there is a question in the URL parameter called `question`. If there isn't a question there, for example if you just visited the root page without any URL params, it'll just render out this string. If there is a question there, though, it'll execute the following lines of code. Here we're using the Ollama SDK, and the patterns being used here are probably quite familiar to you by now, because yes, this resembles the Mistral SDK quite a lot. We call the chat method and pass in an object where we specify the model, in this case Mistral, and a messages array that has a role and some content. The content here is whatever Express finds in the URL parameter called `question`.

So if we now type in a question in the browser, for example "Why do stars shine?", and hit enter, then we'll see that the browser will work for a little while and, boom, eventually it gives you a reply, which says that stars shine due to nuclear fusion, where hydrogen atoms are turned into helium, thus releasing immense amounts of energy. And actually, this continues until all of these atoms have turned into iron, which by the way means that the iron you use to fry your eggs with was created billions of years ago in the center of a star. Wow, that is mind-blowing to think about.

Almost as mind-blowing as the fact that you have reached the end of this Mistral course. Most people who start a course here on Scrimba give up before they reach the end, but you, my friend, do not. You are not a quitter. So give yourself a pat on the back, and in the next scrim we'll do a quick recap of
everything you've learned.

Wow, you really did something special today: you completed this course. Please remember that most people who start online courses give up. You are not like them. So let's have a quick recap of what you've learned. Starting out, we looked at the basics of Mistral and their platform, along with the chat completion API that we interacted with through their JavaScript SDK. You have a solid grasp of the various Mistral models right now, and also know how to work with embeddings and vector databases with Supabase. You've also dipped your toes into LangChain and using it for chunking text, because you had to do that when you were learning about RAG, or retrieval-augmented generation. Finally, you know how to create AI agents with function calling and how to do local inference with the help of Ollama.

Now it's time for you to celebrate your win. Do share that you've completed this course on social media, or if you want a less public way of doing that, you can check out Scrimba's Discord community, as there we have a "Today I Did" channel where we love seeing people completing courses. And whatever you do, please keep on building. You have a unique set of skills here, and this is just a start. The world of AI is exploding, giving developers like yourself the possibility to create entirely new experiences and apps. The world is your oyster, so happy building, and best of luck.
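
For reference, the Express route and Ollama call walked through in the local-inference lesson can be sketched roughly like this. It is a minimal sketch, not the course's actual file: the names `createQuestionHandler` and `chat` and the wording of the hint string are assumptions made for illustration. The request shape (a `model` plus a `messages` array of role/content pairs) and the `response.message.content` field follow the `chat` method of the `ollama` npm package. The chat function is passed in as an argument so the handler can be tried out without a model running.

```javascript
// Sketch only: `createQuestionHandler` and `chat` are illustrative names,
// not taken from the course project.
function createQuestionHandler(chat, model = "mistral") {
  // Returns an Express-style (req, res) handler.
  return async function handleQuestion(req, res) {
    const question = req.query.question;

    // No ?question=... in the URL: render a hint instead of calling the model.
    if (!question) {
      res.send("Ask a question via the `question` parameter");
      return;
    }

    try {
      // Same request shape the lesson describes: a model name plus a
      // messages array whose entries have a role and some content.
      const response = await chat({
        model,
        messages: [{ role: "user", content: question }],
      });
      res.send(response.message.content);
    } catch (err) {
      // Most likely the local Ollama app is not running or the model is missing.
      res.status(500).send(`Local inference failed: ${err.message}`);
    }
  };
}

// Wiring it up (assumes `npm install express ollama` and a running Ollama app):
//
//   import express from "express";
//   import ollama from "ollama";
//
//   const app = express();
//   app.get("/", createQuestionHandler((request) => ollama.chat(request)));
//   app.listen(3000);
```

With that wiring in place, visiting `localhost:3000/?question=Why do stars shine` would send the question to the locally running model, while visiting `localhost:3000` on its own would only show the hint string.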