The ULTIMATE In-Game AI Assistant _ Speech to Speech - Groq API - GPT4V ++

**Creating a Deepfakes System using Mixol and Gro**

We've been working on creating a deepfakes system that uses a mix of machine learning models to generate realistic audio and video. The system is based on a combination of pre-trained language models, speech recognition technology, and computer vision techniques. In this article, we'll explore the details of how our system works and provide a step-by-step guide on how to set it up.

**Fetching Upto-Date Information**

One of the challenges in creating a deepfakes system is ensuring that the information used is up-to-date and accurate. To address this issue, we've been working on developing a script that can fetch more recent information from various sources. The script uses the Mixol model to retrieve relevant data and then uses the Gro API to process the information.

We've also been experimenting with different types of models, including language models like BERT and RoBERTa. These models have proven to be effective in generating realistic text and speech. However, they require a significant amount of training data to learn from, which can be time-consuming and resource-intensive.

**Transcribe Function**

When we speak into the microphone, our system uses the Faster Whisper model to transcribe our words. This allows us to generate text that is almost indistinguishable from what was spoken. The transcription process is an important step in creating a deepfakes system, as it enables us to create realistic audio and video.

The transcription process involves using the Mixol model to generate text based on the audio input. The model is trained on a large dataset of text and speech, which allows it to learn patterns and relationships between words and sounds. Once the transcription is complete, we use the Gro API to process the text and generate realistic audio and video.

**Gro API Function**

The Gro API is an important component of our deepfakes system. It allows us to process user input and generate realistic audio and video. The API uses a combination of machine learning models and natural language processing techniques to analyze user input and generate output.

We've been experimenting with different types of prompts, including those that require the model to describe images or explain lore from Elden Ring. These prompts are designed to test the limits of our system and ensure that it can generate realistic output in a variety of contexts.

**Text-to-Speech Function**

One of the key features of our deepfakes system is its text-to-speech functionality. This allows us to create realistic audio by synthesizing speech from text. The system uses a combination of machine learning models and natural language processing techniques to generate realistic speech.

We've been using the 11Labs model for this purpose, which has proven to be effective in generating realistic speech. However, we're always looking for ways to improve the quality of our output and are experimenting with different types of models and prompts.

**Main Function**

The main function of our deepfakes system is a loop that runs indefinitely. It uses a combination of machine learning models and natural language processing techniques to analyze user input and generate realistic audio and video.

When we start the system, it begins by recording user input using the microphone. The audio is then transcribed using the Faster Whisper model, which generates text based on the audio input. The text is then processed using the Gro API, which analyzes the text and generates output.

The output can be in the form of a description of an image or an explanation of lore from Elden Ring. We've been experimenting with different types of prompts, including those that require the model to describe images or explain lore from the game.

**About Prompts**

We've developed two main prompts for our deepfakes system: "Explain related lore" and "Describe item location in the image". The first prompt requires the model to provide a short description of an image that is related to Elden Ring. The second prompt requires the model to describe the location of an item within an image.

These prompts are designed to test the limits of our system and ensure that it can generate realistic output in a variety of contexts. We're always looking for ways to improve the quality of our output and are experimenting with different types of models and prompts.

**System Messages**

Our deepfakes system uses a combination of system messages to communicate with the user. These messages provide feedback on the status of the system, including whether the input has been accepted or rejected.

We've developed two main system messages: one that responds to "describe image" prompts and another that responds to "Describe item location in the image" prompts. These messages are designed to provide helpful feedback to the user and ensure that they understand what the system is doing.

**Voice Recording Function**

One of the key features of our deepfakes system is its voice recording functionality. This allows us to record audio from the microphone and use it as input for the model.

When we press the "record" button, the system records audio from the microphone using the Faster Whisper model. The recorded audio is then transcribed into text, which is used as input for the Gro API. The output of the Gro API is then processed by the system to generate realistic audio and video.

**Conclusion**

In conclusion, our deepfakes system uses a combination of machine learning models, speech recognition technology, and computer vision techniques to generate realistic audio and video. We've developed a script that can fetch more recent information from various sources and process it using the Gro API.

The system is designed to be flexible and adaptable, allowing us to test different types of prompts and analyze user input in a variety of contexts. With its text-to-speech functionality, our deepfakes system has the potential to revolutionize the field of AI and create new possibilities for entertainment and education.

"WEBVTTKind: captionsLanguage: enwhat weapon does the tree Sentinel in Lim grave drop the tree Sentinel in Lim grave drops the golden hbert weapon it's a great early Faith weapon scaling primarily with strength faith and dexterity so I'm just going to click the side button off my mouse here to get some extra information about this item The decus Medallion is a key item you find in Elden ring divided into two halves the left half which you see in the image is found early in the game in limb grave and shows the her tree so what you just saw here was the AI assistant system I have created for any video game I have been using it for Elden ring but you can't adapt it to any kind of video game you love to play so let me show you how this works so here you can kind of see an overview of how I set this up so I am controlling this by using my PC Mouse since I play with this controller here I can kind of have my mouse free so I can press like the scroll button just to ask a question and I can press the side button on my mouse to take a screenshot that will be processed by gp4 sent over to this python Hub and it will be using the information from the screenshot via a prompt and I'm using the grock API you can insert whatever llm you want here and we get the response back uh from like an 11 Labs voice I just set up uh the other way is to yeah we can press scroll to speech I can ask a question and it kind of goes the same way uh it's get transcribed into the python Hub and we get the response back in every lab so it's a pretty easy setup but I works it works pretty good and I'm very happy with how it's been performing so far before we do many more examples and take a look at the code let me introduce you to today's sponsor are you curious to learn more about the technology that enables AI then you're in for a tweet with brilliant.org the sponsor of today's video brilliant simplifies learning by offering engaging courses in computer science math and much more it's all about an interactive hands-on experience dive into the vectors course to explore the core of physics graphics and machine learning you get to apply vectors in exciting ways from programming a video game to using vectors in machine learning then Unleash Your creativity with the course creative coding where you learn to bring ideas to life using variables and Loops brilliant makes these complex topics approachable and enjoyable with interactive exercises and guided problem solving you gain skills that stick so if you're curious about turning abstract Concepts into real world applications brilliant is the place to start head over to brilliant.org allabout AI to start your Learning Journey or just click the link in the description for a free 30-day trial the first 200 of you that signs up will receive 20% off an annual premium subscription a big thanks to brilliant for sponsoring this video now let's go back to the project before we take a look at how I kind of set this up in Python let's just fire this up and do a few more examples okay so the first thing I want to do is travel to this church here take a screenshot of the statue to get some more information right okay so I'm just going to go here turn around and do like a screenshot and let's see if we get some information about I think this is Queen America so I wanted to see if we can get some information about this lore here sure thing so the image shows the Third Church of maica a place dedicated to maica a big-time goddess in Elden ring this church like others honoring ma can have cool items like sacred tears those are key for upgrading your sacred flask which is super helpful for healing or EMP powering your character's abilities what can you tell me about Queen Marica in Elden ring queen MAA also known as MAA the Eternal is a crucial character in Elden Ring's lore she was once the queen of the lands between and the vessel of the Elden ring a powerful artifact that binds the world together maica had a significant impact on the world and the course of events in the game ma was the mother of the demigods powerful beings who were her Offspring with God Godfrey the first Elden Lord yeah that's pretty cool I just think it's very cool like if you discover some lore in the game you can just ask more about it so I really like that because the Elden ring story is very complicated so sometimes I don't even know what's happening here okay so one thing that can be a bit confusing is kind of the stats right so if we go to status here and we go to base stat we can see we have something called Poise so let's ask about that so in my base stats I have this thing called Poise it's 34 what does even Poise do in Elden ring Poise in Elden ring determines How likely you are to be staggered or interrupted during attacks if you have a high Poise you can resist interruptions and keep attacking making it useful for heavy slow weapons however Poise doesn't have a significant impact on faster weapons at 34 Poise you'll have some resistance to interruptions but won't be as resistant as with higher Poise values okay yeah that was pretty good right if you didn't know what Poise was I think that was a perfect answer of course you can follow up with more questions if you had that too so yeah pretty good so let's ask a bit about kind of attribute points here so I'm playing a samurai so what kind of stats should I focus on should I do like Arcane or should I do dexterity wigor I'm kind of unsure here it's my first time playing as a Samurai and Elden ring you'll want to focus on dexterity decks and vigor decks will increase your damage and let you use better weapons while Vigor enhances your HP allowing you to take more damage for a samurai I'd recommend investing in deck first then Vigor strength Str Str can be useful too but it's not as important for a DEX focused build Arcane is more suited for magic oriented builds so you can ignore it for your Samurai yeah I think that was pretty good right again I kind of get exactly what I asked for and like decks and vigor is pretty good things to do when you are playing like a samurai right okay so I just found this fire grease here so let me just screenshot that and let's see what kind of information we will get from this sure thing so in the image you see the item fire grease this item lets you add fire damage to your weapon which is super useful against enemies weak to fire yeah again I think that was quite okay okay so I wanted to try something I'm pretty sure we can't get an answer to so I want to try to screenshot the Rya lucaria here and see if we can get yeah a good answer back in the image you see a character gazing at storm V Castle the first major dungeon in Elden ring this ominous Fortress is ruled by godric the grafted a demigod and shardbearer defeating him advances the main story in simpler terms It's a Spooky complicated Place ruled by a powerful creepy dude you need to defeat okay that wasn't too bad it guessed stormw castle that was the previous Castle right but we are in L of the Lakes now so I'm just going to try to ask about this instead I'm aneria of lakes and I'm standing in front of this big castle where can this be it sounds like you're in the realm of Leia of the lakes and standing in front of car and study hall this Grand Castle is located in the northeastern part of Leia and it's hard to miss with its imposing architecture keep an eye out for the magical barriers surrounding the castle and be prepared for a challenging encounter okay great so is there a boss inside there absolutely the car and studyed Hall is home to renala Queen of the full moon she's the demigod boss you'll encounter here prepare for a unique and challenging battle as rala has a variety of powerful spells and abilities at her disposal good luck okay I got to say that was pretty good right okay so I was quite impressed by that I didn't think we will get it at least not with the screenshot uh we didn't but like when I kind of precised more where I am and what I was looking at yeah again very impressive yeah I don't think we have to do any more examples I think we kind of prove that it works of course it's not perfect that needs a lot more work so I have been thinking about doing some kind of rag that we can fetch more upto-date information because this is just using like the mixol model and the information it has I don't didn't add add any later information I just wanted to take a quick look at the python code here so I know some of you are interested in that for the transcribe thing when I speech to the microphone we use the faster whisper and we have this onclick here so this is kind of how I control the actions so we have the button middle to kind of control the the recording part and we have X2 that is the button the side of my mouse to kind of trigger this screenshot uh function we have here and we have record audio Yeah of course and yeah we convert those to base 64 we have something to actually play the audio we get back from 11 Labs here you can see we have the gro uh function and this can be replaced by open AI you can do your local models you have a lot of options here it doesn't really matter what kind of llm you use but of course you want to use like an llm that has up to-date information about the game you are playing right here is the text to speech function we are just using 11 labs in this case also here you have options whatever you want you can just put in whatever you want and here is kind of our main function that kind of runs this whole Loop uh it looks a bit confusing uh I'm sure the code can be much more optimized but I just wanted to get this demo up and running so if you are interested in trying this out I will be uploading this script with some I I might actually do like a a video tutorial on how to set this up yourself uh on my YouTube members page so if you want to join just click the link in the description below become a member and just follow the instructions in the intro video there and I can invite you to the GitHub if you become a member so you can just uh try this yourself also let me just show you a quick about the prompts we are using because we have kind of uh set this to be kind of this ELD ring experts right so here is where we take the screenshot so the image description we get back from gp4 is going to be fed into this and we have this prompt as an Elden ring expert explain related lore from the image description very short uh and a conversational way I guess in a very short and conversational way uh the other prompt we have is you are an elering expert in elering only describe lore from eling related to items location in the image keep it short and precise that was the system message for describe image and we also have the system message for our Gro API so basically you're an eldering expert keep the responses short and conversational since we are using text to speech I also I always want to try to keep the responses like uh conversational because it just feels more natural right okay so here we can kind of see how the back work so when I press my side button I kind of save the screenshot to my path here and here we get the description back from GPT 4 and this gets fed into the prompt and this is what Gro returns so in the image you're seeing a character gasing at stor castle that was wrong of course that is kind of how the screen recording part works so let me show you about the voice recording so here you kind of see I started recording and this is what I said and here we kind of got the response back we started recording again and I followed up with so is there a boss inside there the carry on study is the home to rala Queen of the full moon okay great so yeah you can see this works so when we run this now this is just going to run in like a true Loop right so when I press the scroll button now you can see we can record and I can just stop it again with the scroll button so basically we just start with the scroll button and we end with the scroll button and when it's finished it kind of goes back into this state where we are just waiting to press one of the buttons again so it's in a true Loop so it always run in the background we don't after to we start it or anything so yeah that is basically what I wanted to share with you today hope you enjoyed it if you want to become a member of the channel and try this out for yourself just follow the link in the description also don't forget to check out brilliant.org have a great day and I'll see you probably on Wednesdaywhat weapon does the tree Sentinel in Lim grave drop the tree Sentinel in Lim grave drops the golden hbert weapon it's a great early Faith weapon scaling primarily with strength faith and dexterity so I'm just going to click the side button off my mouse here to get some extra information about this item The decus Medallion is a key item you find in Elden ring divided into two halves the left half which you see in the image is found early in the game in limb grave and shows the her tree so what you just saw here was the AI assistant system I have created for any video game I have been using it for Elden ring but you can't adapt it to any kind of video game you love to play so let me show you how this works so here you can kind of see an overview of how I set this up so I am controlling this by using my PC Mouse since I play with this controller here I can kind of have my mouse free so I can press like the scroll button just to ask a question and I can press the side button on my mouse to take a screenshot that will be processed by gp4 sent over to this python Hub and it will be using the information from the screenshot via a prompt and I'm using the grock API you can insert whatever llm you want here and we get the response back uh from like an 11 Labs voice I just set up uh the other way is to yeah we can press scroll to speech I can ask a question and it kind of goes the same way uh it's get transcribed into the python Hub and we get the response back in every lab so it's a pretty easy setup but I works it works pretty good and I'm very happy with how it's been performing so far before we do many more examples and take a look at the code let me introduce you to today's sponsor are you curious to learn more about the technology that enables AI then you're in for a tweet with brilliant.org the sponsor of today's video brilliant simplifies learning by offering engaging courses in computer science math and much more it's all about an interactive hands-on experience dive into the vectors course to explore the core of physics graphics and machine learning you get to apply vectors in exciting ways from programming a video game to using vectors in machine learning then Unleash Your creativity with the course creative coding where you learn to bring ideas to life using variables and Loops brilliant makes these complex topics approachable and enjoyable with interactive exercises and guided problem solving you gain skills that stick so if you're curious about turning abstract Concepts into real world applications brilliant is the place to start head over to brilliant.org allabout AI to start your Learning Journey or just click the link in the description for a free 30-day trial the first 200 of you that signs up will receive 20% off an annual premium subscription a big thanks to brilliant for sponsoring this video now let's go back to the project before we take a look at how I kind of set this up in Python let's just fire this up and do a few more examples okay so the first thing I want to do is travel to this church here take a screenshot of the statue to get some more information right okay so I'm just going to go here turn around and do like a screenshot and let's see if we get some information about I think this is Queen America so I wanted to see if we can get some information about this lore here sure thing so the image shows the Third Church of maica a place dedicated to maica a big-time goddess in Elden ring this church like others honoring ma can have cool items like sacred tears those are key for upgrading your sacred flask which is super helpful for healing or EMP powering your character's abilities what can you tell me about Queen Marica in Elden ring queen MAA also known as MAA the Eternal is a crucial character in Elden Ring's lore she was once the queen of the lands between and the vessel of the Elden ring a powerful artifact that binds the world together maica had a significant impact on the world and the course of events in the game ma was the mother of the demigods powerful beings who were her Offspring with God Godfrey the first Elden Lord yeah that's pretty cool I just think it's very cool like if you discover some lore in the game you can just ask more about it so I really like that because the Elden ring story is very complicated so sometimes I don't even know what's happening here okay so one thing that can be a bit confusing is kind of the stats right so if we go to status here and we go to base stat we can see we have something called Poise so let's ask about that so in my base stats I have this thing called Poise it's 34 what does even Poise do in Elden ring Poise in Elden ring determines How likely you are to be staggered or interrupted during attacks if you have a high Poise you can resist interruptions and keep attacking making it useful for heavy slow weapons however Poise doesn't have a significant impact on faster weapons at 34 Poise you'll have some resistance to interruptions but won't be as resistant as with higher Poise values okay yeah that was pretty good right if you didn't know what Poise was I think that was a perfect answer of course you can follow up with more questions if you had that too so yeah pretty good so let's ask a bit about kind of attribute points here so I'm playing a samurai so what kind of stats should I focus on should I do like Arcane or should I do dexterity wigor I'm kind of unsure here it's my first time playing as a Samurai and Elden ring you'll want to focus on dexterity decks and vigor decks will increase your damage and let you use better weapons while Vigor enhances your HP allowing you to take more damage for a samurai I'd recommend investing in deck first then Vigor strength Str Str can be useful too but it's not as important for a DEX focused build Arcane is more suited for magic oriented builds so you can ignore it for your Samurai yeah I think that was pretty good right again I kind of get exactly what I asked for and like decks and vigor is pretty good things to do when you are playing like a samurai right okay so I just found this fire grease here so let me just screenshot that and let's see what kind of information we will get from this sure thing so in the image you see the item fire grease this item lets you add fire damage to your weapon which is super useful against enemies weak to fire yeah again I think that was quite okay okay so I wanted to try something I'm pretty sure we can't get an answer to so I want to try to screenshot the Rya lucaria here and see if we can get yeah a good answer back in the image you see a character gazing at storm V Castle the first major dungeon in Elden ring this ominous Fortress is ruled by godric the grafted a demigod and shardbearer defeating him advances the main story in simpler terms It's a Spooky complicated Place ruled by a powerful creepy dude you need to defeat okay that wasn't too bad it guessed stormw castle that was the previous Castle right but we are in L of the Lakes now so I'm just going to try to ask about this instead I'm aneria of lakes and I'm standing in front of this big castle where can this be it sounds like you're in the realm of Leia of the lakes and standing in front of car and study hall this Grand Castle is located in the northeastern part of Leia and it's hard to miss with its imposing architecture keep an eye out for the magical barriers surrounding the castle and be prepared for a challenging encounter okay great so is there a boss inside there absolutely the car and studyed Hall is home to renala Queen of the full moon she's the demigod boss you'll encounter here prepare for a unique and challenging battle as rala has a variety of powerful spells and abilities at her disposal good luck okay I got to say that was pretty good right okay so I was quite impressed by that I didn't think we will get it at least not with the screenshot uh we didn't but like when I kind of precised more where I am and what I was looking at yeah again very impressive yeah I don't think we have to do any more examples I think we kind of prove that it works of course it's not perfect that needs a lot more work so I have been thinking about doing some kind of rag that we can fetch more upto-date information because this is just using like the mixol model and the information it has I don't didn't add add any later information I just wanted to take a quick look at the python code here so I know some of you are interested in that for the transcribe thing when I speech to the microphone we use the faster whisper and we have this onclick here so this is kind of how I control the actions so we have the button middle to kind of control the the recording part and we have X2 that is the button the side of my mouse to kind of trigger this screenshot uh function we have here and we have record audio Yeah of course and yeah we convert those to base 64 we have something to actually play the audio we get back from 11 Labs here you can see we have the gro uh function and this can be replaced by open AI you can do your local models you have a lot of options here it doesn't really matter what kind of llm you use but of course you want to use like an llm that has up to-date information about the game you are playing right here is the text to speech function we are just using 11 labs in this case also here you have options whatever you want you can just put in whatever you want and here is kind of our main function that kind of runs this whole Loop uh it looks a bit confusing uh I'm sure the code can be much more optimized but I just wanted to get this demo up and running so if you are interested in trying this out I will be uploading this script with some I I might actually do like a a video tutorial on how to set this up yourself uh on my YouTube members page so if you want to join just click the link in the description below become a member and just follow the instructions in the intro video there and I can invite you to the GitHub if you become a member so you can just uh try this yourself also let me just show you a quick about the prompts we are using because we have kind of uh set this to be kind of this ELD ring experts right so here is where we take the screenshot so the image description we get back from gp4 is going to be fed into this and we have this prompt as an Elden ring expert explain related lore from the image description very short uh and a conversational way I guess in a very short and conversational way uh the other prompt we have is you are an elering expert in elering only describe lore from eling related to items location in the image keep it short and precise that was the system message for describe image and we also have the system message for our Gro API so basically you're an eldering expert keep the responses short and conversational since we are using text to speech I also I always want to try to keep the responses like uh conversational because it just feels more natural right okay so here we can kind of see how the back work so when I press my side button I kind of save the screenshot to my path here and here we get the description back from GPT 4 and this gets fed into the prompt and this is what Gro returns so in the image you're seeing a character gasing at stor castle that was wrong of course that is kind of how the screen recording part works so let me show you about the voice recording so here you kind of see I started recording and this is what I said and here we kind of got the response back we started recording again and I followed up with so is there a boss inside there the carry on study is the home to rala Queen of the full moon okay great so yeah you can see this works so when we run this now this is just going to run in like a true Loop right so when I press the scroll button now you can see we can record and I can just stop it again with the scroll button so basically we just start with the scroll button and we end with the scroll button and when it's finished it kind of goes back into this state where we are just waiting to press one of the buttons again so it's in a true Loop so it always run in the background we don't after to we start it or anything so yeah that is basically what I wanted to share with you today hope you enjoyed it if you want to become a member of the channel and try this out for yourself just follow the link in the description also don't forget to check out brilliant.org have a great day and I'll see you probably on Wednesday\n"