Fine-Tuning a Language Model: A Step-by-Step Guide
Creating a Fine-Tuning Job
In this article, we will walk through fine-tuning a language model, in this case GPT-3.5 Turbo, for a specific extraction task. To start, we run a script that uploads our dataset and returns a file ID, then use that ID to create the fine-tuning job. Once the data is uploaded, the fine-tuning process can begin.
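For reference, the upload step boils down to a single API call. Below is a minimal sketch using the OpenAI Python SDK (v1.x); the file path is a placeholder for your prepared JSONL training file, not the exact code from the original script.

```python
# Minimal sketch of the upload step (OpenAI Python SDK v1.x).
# "dataset.jsonl" is a placeholder path for the prepared training file.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

training_file = client.files.create(
    file=open("dataset.jsonl", "rb"),
    purpose="fine-tune",
)
print("Training file ID:", training_file.id)  # used when creating the job
```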
Running the Fine-Tuning Job
To initiate the fine-tuning job, we run the job-creation script with the uploaded file's ID. OpenAI then validates the training file and trains a new model over several epochs, and we can follow the progress in the fine-tuning dashboard. This process may take some time, but it's worth the wait: the resulting model should be more accurate and better suited to our specific task.
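The job-creation script is essentially one call as well, plus an optional status check if you prefer to poll from code instead of watching the dashboard. Here is a sketch under the assumption that you use the OpenAI Python SDK; the file ID and base-model name are placeholders.

```python
# Sketch: create the fine-tuning job from the uploaded file ID, then poll it.
from openai import OpenAI

client = OpenAI()

job = client.fine_tuning.jobs.create(
    training_file="file-abc123",  # placeholder: ID returned by the upload step
    model="gpt-3.5-turbo",        # placeholder: pick the snapshot to fine-tune
)

# Check on progress; the fine-tuning dashboard shows the same information.
job = client.fine_tuning.jobs.retrieve(job.id)
print(job.status)            # e.g. "validating_files", "running", "succeeded"
print(job.fine_tuned_model)  # populated once training has completed
```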
Analyzing the Results
Once the fine-tuning job is complete, we can analyze the results to see how well the model performed. The key metric here is the training loss, which measures how well the model is predicting the next token in a sequence. The lower the number, the better: a low final loss indicates that the model has reached a good level of accuracy on the training data.
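For context, the training loss shown in the dashboard is the usual language-modeling objective: the average negative log-probability the model assigns to the correct next token. This is the standard definition rather than anything documented specifically for the fine-tuning UI.

```latex
% Average next-token cross-entropy over N positions; lower is better.
\mathcal{L} = -\frac{1}{N} \sum_{i=1}^{N} \log p_\theta\!\left(x_i \mid x_{<i}\right)
```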
The graph shows the loss at each step of training: it starts higher, trends downward, and then stabilizes as the model learns from the data. Once the curve flattens, the model has learned about as much as it can from this dataset and is ready for testing.
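The per-step numbers behind the graph can also be pulled from the job's event stream. A sketch, assuming the standard events endpoint; the job ID is a placeholder and the exact message wording may differ.

```python
# Sketch: list fine-tuning job events, which include per-step training loss.
from openai import OpenAI

client = OpenAI()

events = client.fine_tuning.jobs.list_events(
    fine_tuning_job_id="ftjob-abc123",  # placeholder job ID
    limit=50,
)
for event in events.data:
    print(event.message)  # e.g. "Step 10/78: training loss=0.45" (format may vary)
```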
Testing the Model
With the fine-tuning job complete, we can now test the model in the playground. We select our latest fine-tuned model and adjust the parameters to suit our needs; in this case, we increase the maximum response length and set the temperature to 0.5.
We then copy a snippet of text from Wikipedia and append a "Response:" marker at the end, matching the format of the training examples. This is an important step, as it cues the model to produce the structured CSV answer it was trained on.
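Outside the playground, the same test is a single chat-completion call against the fine-tuned model. A sketch with placeholder values: the model name stands in for the actual ft:gpt-3.5-turbo ID produced by the job, the system prompt is paraphrased, and the snippet is shortened.

```python
# Sketch: query the fine-tuned model with a Wikipedia snippet + "Response:" marker.
from openai import OpenAI

client = OpenAI()

wikipedia_snippet = (
    "Snow Crash is a science fiction novel by the American writer "
    "Neal Stephenson, published in 1992. ..."
)

completion = client.chat.completions.create(
    model="ft:gpt-3.5-turbo:my-org::abc123",  # placeholder fine-tuned model ID
    temperature=0.5,
    max_tokens=100,
    messages=[
        {"role": "system", "content": "You are an expert problem solver. Think step by step."},
        # Ending the prompt with "Response:" mirrors the training examples.
        {"role": "user", "content": wikipedia_snippet + "\n\nResponse:"},
    ],
)
print(completion.choices[0].message.content)
# Expected shape: title,author,year,genre — e.g. "Snow Crash,Neal Stephenson,1992,Science Fiction"
```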
Testing With and Without the Response Marker
To test the model, we paste in our input text and see how it performs. In this case, the model correctly extracts the title, author, year, and genre. However, if we drop the "Response:" marker from the end of the prompt, the output becomes noticeably less stable. This highlights the importance of matching the prompt format used in the fine-tuning data at inference time.
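This is easier to see against the shape of the training data itself. Each line of the JSONL file is a chat-format example whose user message ends with the same "Response:" marker, so the model learns to treat that marker as the cue to emit the CSV row. An illustrative example follows (paraphrased from the dataset shown in the walkthrough, and wrapped across lines here for readability; in the actual file each example is a single line).

```json
{"messages": [
  {"role": "system", "content": "You are an expert problem solver. Think step by step."},
  {"role": "user", "content": "Published in 1961, Catch-22 is a satirical war novel by Joseph Heller... List the title, author, year of release and genre in CSV format.\n\nResponse:"},
  {"role": "assistant", "content": "Catch-22,Joseph Heller,1961,Satirical Fiction"}
]}
```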
Limitations and Future Directions
While fine-tuning was successful in this example, there are limitations to consider. We trained on only 26 examples here, far fewer than a real use case would typically need; for complex tasks or more elaborate output schemas, considerably more data, and possibly more advanced techniques, may be necessary. The quality of the training data also has a large impact on the performance of the resulting model.
Future directions include fine-tuning open-source and small language models (such as Phi-2) and experimenting with different parameters and techniques. As language models continue to evolve, it's worth staying up to date on the latest developments and best practices.
Conclusion
Fine-tuning a language model is a powerful technique for improving its accuracy on a specific task while reducing cost, latency, and token usage compared with a larger foundation model. By following these steps and adjusting the parameters to suit our needs, we can unlock more of the potential of this technology. Whether you're a researcher or a developer putting a model into production, fine-tuning is an exciting area to explore.
"WEBVTTKind: captionsLanguage: enI put out this poll early this morning to figure out what video I should make today 40% of you wanted a video on chbt 3.5 turbo fine tuning for like a special specific task so we are going to do that and don't worry I will cover the other topics too in a future video but yeah let's do some fine tuning so when should you actually use fine tuning so let's say you have a very specific task you want to complete then fine tuning might be something for you you have a very specific requested format output you want to do every time uh that is what we are going to look at today uh you also you have to have the data sets required to find tuna model so you kind of need to go and look at that uh data sets are of course very important when it comes to fine tuning uh so everything relies on how the quality of your data sets are right uh maybe you tried prompt engineering to get theard outputs uh without getting like consistently getting the results you want or like the prompt is huge let's say it's like many examples The Prompt is like maybe 2 3,000 tokens then you might consider fine-tuning it instead of using that so you can save some tokens uh so what we want to do is like in this case train chat GPT 3.5 turbo to try to outperform or be equal to at least GPT 4 on a specific task and by doing that we can get like more value out of running a fine tune model uh by cheaper and saving tokens and stuff instead of just using like a foundation model like gbd4 so today's example is going to be we want a CSV format only response that's an easy example to understand and why fine tune jgpt turbo unlike a specific task so this could be price a find chat GPT 3.5 would still be cheaper than using like the new GPT 4 Turbo if you are on the API right speed uh a find model is much faster than a foundation model and it's a better user experience if you put this into production right every user wants quicker response uh save tokens like I said uh you can shrink down the input uh you put in like if you have a lot of examples and stuff a long prompt you can fine tune on that to kind of shrink it so we kind of only need to put in the essentials I'm going to show you that and of course output tokens we can kind of fine-tune it to just get what we want and yeah you get more control and there's a whole a lot of other stuff too I might cover in a future video uh so I think that's it I think we just going to head over to chat interface I'm just going to show you what I meant uh by yeah picking a task here you can see we are on chat TBT 4 uh I put up just a simple text here we have a year we have an author we have like a name of a book and we have some information about it and uh task I gave it here is think to the text carefully and list systematically in a CSV format the title author year of release and shra right and we get the response based on the provided text the CSV format would be as follows so we have this right and we have kind of what output we want here The Alchemist po Coello 1988 allegorical fiction okay I don't know what that is uh okay so we follow up with a new text we give a second example of a text and you can see we get kind of the same response here okay that's good so you can see deep D4 clearly understands the task here so let's go over to 3.5 here okay so you can see we kind of do the same thing give it a text we ask for it the first time yeah perfect that is this response the kind of response we want right uh but then we feed like a second text here into chat GPT 3.5 and 
yeah it starts off well but then no it kind of Misses here you can see it divides it into too many uh so we end up with like too many rows here or columns so yeah that didn't work so you can see clearly now that t chbt 4 is outperforming 3.5 as expected right so this is kind of the starting point we have now when we are going in and trying to fine tune 3.5 on some data from gbt 4 and try to equal out like the differences here by using fine tuning so the next step then is to going to start creating our data sets we kind of need for fine tuning so for this I like to use just GPT 4 to create our synthetic data sets in this case if you already had a data sets then you can just use that right uh but this saves us a lot of time to Preparing the data set so Ive created this script you can see here to make it easy to create these synthetic data sets so this is going to be put on in like Json put out in like Json L and we can just slot it straight in like the fine-tuning steps so like if you're interested in the scripts everything I'm going to use today you can find a link in the description below and you can uh support me by becoming a member and you will get access to this GitHub where I'm going to put up all the codes I'm a bit behind on that now but I'm going to do it tomorrow I think so yeah let's dive into this synthetic data Creator I kind of uh made here so we're going to start off by creating the text we need to solve this problem for our data set right so you can see your task is to create a text that can be used to complete the assignment here so examples so we give a few few shot examples here so basically it's a text the same as we use in chat gbt forite here's the same assignment and here we kind of force the response we want so we have the title the author the year of release and the shra right same in example two exactly the same but a different example and we have the same here in example three so we give three examples of what kind of response we want right and we finish off with create a similar text to the example above just a text so this is yeah just to get the text right and that produces a random text each time we run this in a loop and the next part is to create the answers so you can see we have the exact same uh yeah same kind of examples here but at the end here we have we going to feed in like it's a placeholder for a new text to be created your task is to complete the assignments in order and think two is step by step CSV format so when we uh replace this placeholder here with a new text that we created in the first example here so like you can think of it like the response here the new text is going to be fed into this placeholder here and we going to get a new response that hopefully is this right just this so that is how we create our data sets so pretty simple setup it's just like you got to get your head around how we think about examples and stuff but I think this is pretty straightforward and when we have that complete you can see here we kind of set this up in uh again you can see we use GPT 4 to run this and it's pretty much set up here you can see kind of our placeholder so we're going to feed prob one into that and prob one that is of course the first we look at so this is going to feed the text and we have a system prompt that is kind of you're an expert Problem Solver thinking a step by step way use reasoning Chain of Thought common sense I I just left this as is so I don't know how should I explain this yes everything gets like U appended into this 
schema. Json L file and here you can can see we can set how many examples we want to create I thought for this video uh I'm only going to create 30 examples 30 data sets or examples of data sets in the Json L output and we're going to do some handpicking so we're going to go through each example and kind of pick out the best examples so we're going to remove the ones I don't like so hopefully we can like clean up our data set and improve it that way too so we get an even better result so for now I think we just going to run this script here and create our 30 data sets and then we're going to take a look at them before we do the fine-tuning part okay so I'm just going to set this off now and we just going to wait for our data sets to be complete so this is going to create 30 examples then we're going to take a look at each and every one of them and remove the ones we don't like so yeah uh see you soon okay so that was it uh that didn't take too long maybe like 3 4 minutes uh so we can see we saved 30 examples uh so let's open that jonl file and yeah start picking out uh examples okay so here you can kind of see the structure now uh of our data set so I just want to take a closer look here uh you can see here is kind of our input this is our system prompt right and here you can see the tech we created so publish in 1961 Catch 22 so here is just an information about the book so this goes all the way down here right and here you can see kind of the response so we have Catch 22 that's the title author 61 and satirical fiction perfect so this is the outputs we are looking for so you can see example two also has this uh murder on the Orient Express AG got Christi 34 mystery so I'm happy with that so I'm just going to go through this now and see if I found any examples that look bad and I'm going to remove them okay so I got all the way down to example 14 before I found this errors so you can see kind of the output here is okay we have the title we have the author we have the year of release and the shre but in the shre part here it kind of comes up with two things so black com and antiwar so this is wrong right so I'm just going to remove example 14 right like this and yeah now we have cleaned up the data set a bit and I'm just going to continue going through this and see if we find any more errors okay so I went over the data set uh I found four errors I removed so of the 30 we created we ended up with you can see here 26 data sets and that means we are kind of ready now to to fine-tune our model so let's go over to that script and take a look at it okay so this is the first part of the two scripts we need for this the first part is to kind of upload the file we have created the Json L here so we just put our file path here and we use this client create files and the purpose is fine tune and when we run this now we're going to get this file ID and on the next script we're going to use that file ID and we're going to paste it in here training file idea right and here is an important part this is kind of what model you want to fine tune so I'm picking like the newest 3.5 turbo version here uh if you have other things you maybe want to put pick another model to find tune you just adjust that here in the model name right and yeah I think we just ready to upload this and get our ID and then we can move on to the second script and after that we can kind of move on to this uh open AI finetuning interface they created so we can kind of follow along how our fine tuning is going so yeah I think we just 
going to run these two scripts now and yeah uh complete our fine tuning okay so let's just run it okay file uploaded successful good now let's copy this uh file ID here and go back to our script here let's fill in the file ID right okay and now we can actually create the fine tuning job okay so let's run it and now let's go over to the interface here you can see we have started the fine tuning job here that's good uh let me turn off this dark mode so we can see better uh okay so I think we're just going to let this run you can kind of see we have three EPO here is the date and stuff it's validating the files and yeah you can kind of see the metrics here so I'm just going to let this run and we can kind of watch how this develops over time now okay and that was the fine-tuning job complete you can see we completed seven 78 steps we train 19,800 tokens uh we got to remember 26 examples that's not that's not much right uh you kind of need a lot more if you have some very specific thing you need to do but this is just for this video so what I did is I took a screenshot of this I went over to gbd4 here I uploaded this and I asked can you explain these results and yeah you can see trading loss this is a crucial usual metric the trading loss is a numeric value that represents how well the model is performing the lower number the better model is predicting the next token in a sequence here the loss was uh 1 0.12 suggests that the model has achieved a good level of accuracy and you can kind of see the graph shows loss over each step of the trading starts higher Trends downwards which is ideal that means that the model is improving making fewer mistakes as it learns uh the law seems to St stabilize which generally means that the model has learned as much as it can from this data so we don't have a big data set here so yeah it kind of learn everything it could it even went to zero so there was nothing more to extract from this fine tuning job so what is left now is just to test it and see if yeah it worked to test this now we just going to go to the playground you can go to models here right and we're going to scoll scroll down you can see fine tunes I have a lot of bad fine tune models here but all the way down here you can find our latest fine tune right okay so let us select that let's set the temperature to zero yeah let's set the length too it doesn't matter I went to Wikipedia I just found a a Sci-Fi Noel here snow crash Neil Stevenson so let's copy some text from Wikipedia and add a message mess okay so remember we got to end it with uh response right response and let's test it yeah good good good so we got the title the author the year and um science fiction that's the shre uh let's grab I found also something about Dune here that's another novel uh okay so let's reload this uh yeah that's a good model let's try it at 0.5 increase the length of it paste in the Vicky response right yeah June Frank Harbert 1965 science fiction so you can copy this we can go to like a spreadsheet here can paste it in data split and yeah here's our input so perfect I think it's working again this was a very simple example right but imagine like you need a very structured big schema that has a lot of variables uh I think you can get a lot of this out uh by fine-tuning this model because we only get this out right but you always got to remember to put in like the response here part here that's important let's see what happens if we don't put in response here okay doing Frank Herbert Science Fiction it's trying okay 
Dune Frank it got it at the end but it would be much more stable if we always add response here at the end right because that it was trained from from the from the if we go to our data sets here you can kind of see it always had response here right so that's important to remember but yeah I think this is just going to conclude this and the conclusion is this works great if you have a specific task as we talk about in the intro or we have some other variables that kind of fits fine tuning and yeah I'm pretty excited about this also trying to find you in other models open source models going forward small large Lang small large language small language models that's also going to be interesting like 5 2 and stuff but I think we just going to wrap it up here and of course I'm going to cover other topics in other video hope you learned something and again if you want to support me and get the codes and stuff just go to the link in the description below and become a member it doesn't matter what tier you pick that's up to you but yeah thank you for tuning in have a great day and I'll see you on WednesdayI put out this poll early this morning to figure out what video I should make today 40% of you wanted a video on chbt 3.5 turbo fine tuning for like a special specific task so we are going to do that and don't worry I will cover the other topics too in a future video but yeah let's do some fine tuning so when should you actually use fine tuning so let's say you have a very specific task you want to complete then fine tuning might be something for you you have a very specific requested format output you want to do every time uh that is what we are going to look at today uh you also you have to have the data sets required to find tuna model so you kind of need to go and look at that uh data sets are of course very important when it comes to fine tuning uh so everything relies on how the quality of your data sets are right uh maybe you tried prompt engineering to get theard outputs uh without getting like consistently getting the results you want or like the prompt is huge let's say it's like many examples The Prompt is like maybe 2 3,000 tokens then you might consider fine-tuning it instead of using that so you can save some tokens uh so what we want to do is like in this case train chat GPT 3.5 turbo to try to outperform or be equal to at least GPT 4 on a specific task and by doing that we can get like more value out of running a fine tune model uh by cheaper and saving tokens and stuff instead of just using like a foundation model like gbd4 so today's example is going to be we want a CSV format only response that's an easy example to understand and why fine tune jgpt turbo unlike a specific task so this could be price a find chat GPT 3.5 would still be cheaper than using like the new GPT 4 Turbo if you are on the API right speed uh a find model is much faster than a foundation model and it's a better user experience if you put this into production right every user wants quicker response uh save tokens like I said uh you can shrink down the input uh you put in like if you have a lot of examples and stuff a long prompt you can fine tune on that to kind of shrink it so we kind of only need to put in the essentials I'm going to show you that and of course output tokens we can kind of fine-tune it to just get what we want and yeah you get more control and there's a whole a lot of other stuff too I might cover in a future video uh so I think that's it I think we just going to head over to chat 
interface I'm just going to show you what I meant uh by yeah picking a task here you can see we are on chat TBT 4 uh I put up just a simple text here we have a year we have an author we have like a name of a book and we have some information about it and uh task I gave it here is think to the text carefully and list systematically in a CSV format the title author year of release and shra right and we get the response based on the provided text the CSV format would be as follows so we have this right and we have kind of what output we want here The Alchemist po Coello 1988 allegorical fiction okay I don't know what that is uh okay so we follow up with a new text we give a second example of a text and you can see we get kind of the same response here okay that's good so you can see deep D4 clearly understands the task here so let's go over to 3.5 here okay so you can see we kind of do the same thing give it a text we ask for it the first time yeah perfect that is this response the kind of response we want right uh but then we feed like a second text here into chat GPT 3.5 and yeah it starts off well but then no it kind of Misses here you can see it divides it into too many uh so we end up with like too many rows here or columns so yeah that didn't work so you can see clearly now that t chbt 4 is outperforming 3.5 as expected right so this is kind of the starting point we have now when we are going in and trying to fine tune 3.5 on some data from gbt 4 and try to equal out like the differences here by using fine tuning so the next step then is to going to start creating our data sets we kind of need for fine tuning so for this I like to use just GPT 4 to create our synthetic data sets in this case if you already had a data sets then you can just use that right uh but this saves us a lot of time to Preparing the data set so Ive created this script you can see here to make it easy to create these synthetic data sets so this is going to be put on in like Json put out in like Json L and we can just slot it straight in like the fine-tuning steps so like if you're interested in the scripts everything I'm going to use today you can find a link in the description below and you can uh support me by becoming a member and you will get access to this GitHub where I'm going to put up all the codes I'm a bit behind on that now but I'm going to do it tomorrow I think so yeah let's dive into this synthetic data Creator I kind of uh made here so we're going to start off by creating the text we need to solve this problem for our data set right so you can see your task is to create a text that can be used to complete the assignment here so examples so we give a few few shot examples here so basically it's a text the same as we use in chat gbt forite here's the same assignment and here we kind of force the response we want so we have the title the author the year of release and the shra right same in example two exactly the same but a different example and we have the same here in example three so we give three examples of what kind of response we want right and we finish off with create a similar text to the example above just a text so this is yeah just to get the text right and that produces a random text each time we run this in a loop and the next part is to create the answers so you can see we have the exact same uh yeah same kind of examples here but at the end here we have we going to feed in like it's a placeholder for a new text to be created your task is to complete the assignments in order and think 
two is step by step CSV format so when we uh replace this placeholder here with a new text that we created in the first example here so like you can think of it like the response here the new text is going to be fed into this placeholder here and we going to get a new response that hopefully is this right just this so that is how we create our data sets so pretty simple setup it's just like you got to get your head around how we think about examples and stuff but I think this is pretty straightforward and when we have that complete you can see here we kind of set this up in uh again you can see we use GPT 4 to run this and it's pretty much set up here you can see kind of our placeholder so we're going to feed prob one into that and prob one that is of course the first we look at so this is going to feed the text and we have a system prompt that is kind of you're an expert Problem Solver thinking a step by step way use reasoning Chain of Thought common sense I I just left this as is so I don't know how should I explain this yes everything gets like U appended into this schema. Json L file and here you can can see we can set how many examples we want to create I thought for this video uh I'm only going to create 30 examples 30 data sets or examples of data sets in the Json L output and we're going to do some handpicking so we're going to go through each example and kind of pick out the best examples so we're going to remove the ones I don't like so hopefully we can like clean up our data set and improve it that way too so we get an even better result so for now I think we just going to run this script here and create our 30 data sets and then we're going to take a look at them before we do the fine-tuning part okay so I'm just going to set this off now and we just going to wait for our data sets to be complete so this is going to create 30 examples then we're going to take a look at each and every one of them and remove the ones we don't like so yeah uh see you soon okay so that was it uh that didn't take too long maybe like 3 4 minutes uh so we can see we saved 30 examples uh so let's open that jonl file and yeah start picking out uh examples okay so here you can kind of see the structure now uh of our data set so I just want to take a closer look here uh you can see here is kind of our input this is our system prompt right and here you can see the tech we created so publish in 1961 Catch 22 so here is just an information about the book so this goes all the way down here right and here you can see kind of the response so we have Catch 22 that's the title author 61 and satirical fiction perfect so this is the outputs we are looking for so you can see example two also has this uh murder on the Orient Express AG got Christi 34 mystery so I'm happy with that so I'm just going to go through this now and see if I found any examples that look bad and I'm going to remove them okay so I got all the way down to example 14 before I found this errors so you can see kind of the output here is okay we have the title we have the author we have the year of release and the shre but in the shre part here it kind of comes up with two things so black com and antiwar so this is wrong right so I'm just going to remove example 14 right like this and yeah now we have cleaned up the data set a bit and I'm just going to continue going through this and see if we find any more errors okay so I went over the data set uh I found four errors I removed so of the 30 we created we ended up with you can see here 26 data sets 
and that means we are kind of ready now to to fine-tune our model so let's go over to that script and take a look at it okay so this is the first part of the two scripts we need for this the first part is to kind of upload the file we have created the Json L here so we just put our file path here and we use this client create files and the purpose is fine tune and when we run this now we're going to get this file ID and on the next script we're going to use that file ID and we're going to paste it in here training file idea right and here is an important part this is kind of what model you want to fine tune so I'm picking like the newest 3.5 turbo version here uh if you have other things you maybe want to put pick another model to find tune you just adjust that here in the model name right and yeah I think we just ready to upload this and get our ID and then we can move on to the second script and after that we can kind of move on to this uh open AI finetuning interface they created so we can kind of follow along how our fine tuning is going so yeah I think we just going to run these two scripts now and yeah uh complete our fine tuning okay so let's just run it okay file uploaded successful good now let's copy this uh file ID here and go back to our script here let's fill in the file ID right okay and now we can actually create the fine tuning job okay so let's run it and now let's go over to the interface here you can see we have started the fine tuning job here that's good uh let me turn off this dark mode so we can see better uh okay so I think we're just going to let this run you can kind of see we have three EPO here is the date and stuff it's validating the files and yeah you can kind of see the metrics here so I'm just going to let this run and we can kind of watch how this develops over time now okay and that was the fine-tuning job complete you can see we completed seven 78 steps we train 19,800 tokens uh we got to remember 26 examples that's not that's not much right uh you kind of need a lot more if you have some very specific thing you need to do but this is just for this video so what I did is I took a screenshot of this I went over to gbd4 here I uploaded this and I asked can you explain these results and yeah you can see trading loss this is a crucial usual metric the trading loss is a numeric value that represents how well the model is performing the lower number the better model is predicting the next token in a sequence here the loss was uh 1 0.12 suggests that the model has achieved a good level of accuracy and you can kind of see the graph shows loss over each step of the trading starts higher Trends downwards which is ideal that means that the model is improving making fewer mistakes as it learns uh the law seems to St stabilize which generally means that the model has learned as much as it can from this data so we don't have a big data set here so yeah it kind of learn everything it could it even went to zero so there was nothing more to extract from this fine tuning job so what is left now is just to test it and see if yeah it worked to test this now we just going to go to the playground you can go to models here right and we're going to scoll scroll down you can see fine tunes I have a lot of bad fine tune models here but all the way down here you can find our latest fine tune right okay so let us select that let's set the temperature to zero yeah let's set the length too it doesn't matter I went to Wikipedia I just found a a Sci-Fi Noel here snow crash Neil 
Stevenson so let's copy some text from Wikipedia and add a message mess okay so remember we got to end it with uh response right response and let's test it yeah good good good so we got the title the author the year and um science fiction that's the shre uh let's grab I found also something about Dune here that's another novel uh okay so let's reload this uh yeah that's a good model let's try it at 0.5 increase the length of it paste in the Vicky response right yeah June Frank Harbert 1965 science fiction so you can copy this we can go to like a spreadsheet here can paste it in data split and yeah here's our input so perfect I think it's working again this was a very simple example right but imagine like you need a very structured big schema that has a lot of variables uh I think you can get a lot of this out uh by fine-tuning this model because we only get this out right but you always got to remember to put in like the response here part here that's important let's see what happens if we don't put in response here okay doing Frank Herbert Science Fiction it's trying okay Dune Frank it got it at the end but it would be much more stable if we always add response here at the end right because that it was trained from from the from the if we go to our data sets here you can kind of see it always had response here right so that's important to remember but yeah I think this is just going to conclude this and the conclusion is this works great if you have a specific task as we talk about in the intro or we have some other variables that kind of fits fine tuning and yeah I'm pretty excited about this also trying to find you in other models open source models going forward small large Lang small large language small language models that's also going to be interesting like 5 2 and stuff but I think we just going to wrap it up here and of course I'm going to cover other topics in other video hope you learned something and again if you want to support me and get the codes and stuff just go to the link in the description below and become a member it doesn't matter what tier you pick that's up to you but yeah thank you for tuning in have a great day and I'll see you on Wednesday\n"