Uploading Training Data Files to OpenAI for Fine Tuning AI Models

Colby Fayock
Published a year ago
Updated a year ago

Once we have our training data, we need a way to deliver it to OpenAI.

We can do that by uploading our training data file directly to OpenAI, which gives us a file we can then reference when we create our fine-tune.

Instructor: [0:00] Once you have all of your training data ready to use for fine-tuning, your next step is to upload that file and make it available to OpenAI to use and associate with your fine-tune. As the example request here shows, this is going to be pretty similar to a lot of the other APIs we've used, except this time, we're going to use the createFile method.

[0:16] To start off, inside of my project, I'm going to create a new script and I'm going to call it upload-training-data. Inside of that file, I added some boilerplate, where I'm requiring my environment variables, I'm requiring the OpenAI library, I'm configuring it, and I'm even creating a new instance of OpenAI.

[0:31] To make this work, we're pretty much going to do exactly what OpenAI is doing in the example request, where we can copy this createFile method.

[0:38] Inside of my run function, which is going to be immediately invoked, I'm going to go ahead and paste that in, where the only thing I'm going to do is make sure I update my source, where it's inside of /source/data/prompts.jsonl or wherever your local file's located.

[0:52] Because this is using fs, we need to make sure that we import that into our project. At the top of my file, I'm going to go ahead and require the fs library. Now, once we get that response, I want to actually see it. I'm going to go ahead and console.log the response, and specifically its data property.
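The script described so far might look roughly like the sketch below. It assumes the v3 `openai` Node SDK, which is the version that exposes `createFile`; newer SDK versions expose file uploads as `openai.files.create` instead. The helper name `uploadTrainingData`, the `dotenv` package, the `OPENAI_API_KEY` variable name, and the `path/to/prompts.jsonl` placeholder are illustrative assumptions, not taken from the lesson.

```javascript
// upload-training-data.js: a sketch, assuming the v3 `openai` Node SDK.
const fs = require('fs');

// Uploads a local JSONL file for fine-tuning and returns the file record.
// `openai` is any client exposing createFile(stream, purpose), such as an
// OpenAIApi instance from the v3 SDK. The helper name is hypothetical.
async function uploadTrainingData(openai, filePath) {
  const response = await openai.createFile(
    fs.createReadStream(filePath), // stream the file rather than buffering it
    'fine-tune' // purpose: marks the file as fine-tuning training data
  );
  // The v3 SDK wraps the API payload in an axios-style response object,
  // so the file record itself lives on response.data.
  return response.data;
}

// Wiring it up, mirroring the boilerplate the lesson describes. The dotenv
// package and the OPENAI_API_KEY name are assumptions; the path is a
// placeholder for wherever your local file lives.
//
//   require('dotenv').config();
//   const { Configuration, OpenAIApi } = require('openai');
//   const openai = new OpenAIApi(
//     new Configuration({ apiKey: process.env.OPENAI_API_KEY })
//   );
//   uploadTrainingData(openai, 'path/to/prompts.jsonl')
//     .then((data) => console.log(data));
```

Passing the client in as an argument keeps the upload step easy to exercise with a stand-in client, without making a network call.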

[1:07] Now when I try to run that script, we can see that we got a successful response, where the status is uploaded. We can see we even have our file name, our purpose, and we have a file ID, which is what we're going to need to keep and store to reference later.
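Since the file ID is the one value needed later, it can help to pull it out of the response explicitly. The sketch below checks only the fields mentioned above (`status` and `id`). The function name `getUploadedFileId` and the sample ID in the comment are hypothetical, and treating anything other than `uploaded` as a failure is an assumption based on the output shown in the lesson.

```javascript
// Returns the file ID from an upload response payload (response.data),
// or throws if the payload does not look like a successful upload.
function getUploadedFileId(file) {
  if (!file || typeof file.id !== 'string' || file.id.length === 0) {
    throw new Error('Upload response did not include a file ID');
  }
  // Assumption: a fresh upload reports the status "uploaded", as in the lesson.
  if (file.status !== 'uploaded') {
    throw new Error(`Unexpected upload status: ${file.status}`);
  }
  return file.id;
}

// Example with a made-up payload:
//   getUploadedFileId({ id: 'file-abc123', status: 'uploaded' });
```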

[1:20] What we'll do in the next lesson is take that file ID and pass it to the fine-tune creation method, which is going to allow us to start that process of creating our own custom model. Be sure to save that file ID somewhere. What I like to do, if I'm just trying to work through this myself, is just add it as a comment so I know that I can use that for my next file.
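If a comment feels too easy to lose, one alternative is to write the file ID to a small local JSON file that a later script can read back. This is only a sketch of that idea, not what the lesson does; the function names and the `trainingFileId` key are hypothetical.

```javascript
const fs = require('fs');

// Writes the file ID to a small JSON file so a later script can read it.
function saveFileId(fileId, outPath) {
  if (typeof fileId !== 'string' || fileId.length === 0) {
    throw new Error('Expected a non-empty file ID');
  }
  const contents = JSON.stringify({ trainingFileId: fileId }, null, 2);
  fs.writeFileSync(outPath, contents + '\n');
}

// Reads the file ID back from the JSON file written by saveFileId.
function loadFileId(outPath) {
  return JSON.parse(fs.readFileSync(outPath, 'utf8')).trainingFileId;
}
```

A follow-up script could then call `loadFileId` with the same path instead of relying on a value pasted in by hand.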

[1:38] In review, in order to use our prompt data to fine-tune a model, we first need to upload it to OpenAI. To do this, we can use the OpenAI Node SDK's createFile method and pass in a ReadStream of our local file. With a successful response, we get a file ID, which we'll be able to use when fine-tuning our model.
