Tokens

Instructor: Tom Chant

Explore what tokens are in OpenAI API requests and how they affect the amount of text returned. Tokens are the chunks of text that OpenAI processes; roughly speaking, one token is about 75% of a word.

By setting the maxTokens property in our API request, we can cap the length of the text we receive. If maxTokens is too low, or left at its default of 16, the response is cut off and the text is incomplete, so it's crucial to set the value high enough to receive complete responses.

Checking the finish reason in the API response tells us whether the text was cut short. It's important to understand that maxTokens does not make the text more concise; it only determines whether the response has room to finish.
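As a minimal sketch of that check: the completions response carries a `finish_reason` field on each entry in `choices`. The helper name `isComplete` and the two sample objects below are made up for illustration; they are hand-written stand-ins, not real API output.

```javascript
// Returns true when the first completion ended on its own ("stop"),
// false when it was cut off by the token limit ("length").
function isComplete(response) {
  const choice = response.choices[0];
  return choice.finish_reason === 'stop';
}

// Hand-written samples shaped like a completions response (not real output).
const truncated = { choices: [{ text: 'Once upon a', finish_reason: 'length' }] };
const finished = { choices: [{ text: 'The end.', finish_reason: 'stop' }] };

console.log(isComplete(truncated)); // false
console.log(isComplete(finished)); // true
```

In an app, a `false` result is the cue to raise the token limit and send the request again.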

Prompt design, not maxTokens, is the main tool for controlling text length. A well-designed prompt is the best way to get a response of the desired length from the OpenAI API.

[00:00] Okay, let's talk about tokens and the maxTokens property. So we know that when we add this maxTokens of 60 to our API request, we get plenty of text back. We also know that if we don't set maxTokens, we get much less text back and it's actually

[00:19] not complete and therefore doesn't really make sense. And that's what we can actually see right here. So what exactly is going on with tokens and what even are tokens? OpenAI breaks down chunks of text for processing.

[00:36] Now I say text, it depends on which model you're using and what you're doing. It could also be code that gets broken down into chunks, but we're working with text, so we need to think about tokens in the context of text. Now you might think that each word or each syllable would be a token, but it's actually not as simple as that.

[00:55] Roughly speaking, a token is about 75% of a word. So 100 tokens is about 75 words. So the 60 token limit that we put on this fetch request would bring us back a maximum of about 45 words.
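The rule of thumb can be turned into a quick estimate. This is only a sketch: the 0.75 ratio is the approximation from the lesson, the function names are invented here, and real token counts vary with the text.

```javascript
// Rule of thumb from the lesson: one token is roughly 75% of a word.
const WORDS_PER_TOKEN = 0.75;

// Estimate how many words fit in a given token budget.
function estimateWords(tokens) {
  return Math.round(tokens * WORDS_PER_TOKEN);
}

// Estimate how many tokens a given number of words needs.
function estimateTokens(words) {
  return Math.ceil(words / WORDS_PER_TOKEN);
}

console.log(estimateWords(100)); // 75
console.log(estimateWords(16)); // 12
console.log(estimateTokens(75)); // 100
```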

[01:12] When we didn't use maxTokens at all, this one here actually defaulted to 16, which as we can see here is way too short for our needs. So that's an important lesson. If you don't allow enough tokens, your completion will be cut short. So you'll actually get less text back from OpenAI.
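Here is a rough sketch of how the limit might be passed in a fetch request. Note that in the request body the property is spelled `max_tokens`. The helper name, the model name and the API key are placeholders assumed for illustration, and nothing is sent over the network here; the function only builds the options object.

```javascript
// Build the options for a fetch() call to the completions endpoint
// (https://api.openai.com/v1/completions).
// The model name and API key are placeholders, not values from the lesson.
function buildCompletionRequest(prompt, apiKey, maxTokens) {
  const body = { model: 'text-davinci-003', prompt };
  // Only send max_tokens when a limit is given. If it is left out,
  // the API falls back to its default of 16 tokens.
  if (maxTokens !== undefined) {
    body.max_tokens = maxTokens;
  }
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(body),
  };
}

const withLimit = buildCompletionRequest('A movie idea', 'YOUR_API_KEY', 60);
console.log(JSON.parse(withLimit.body).max_tokens); // 60
```

The resulting object would be passed as the second argument to `fetch`, with the endpoint URL as the first.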

[01:32] The best way to make sure that your maxTokens setting isn't causing you problems is to check the object you get with your response. If you see a finish reason of length, that is a bad sign.

[01:51] That means the text that you're getting back has been cut short. And if you see a finish reason of stop, that is a good sign. That means that OpenAI actually got to the end of the process and it's given you all of the text that it wanted to give you. So you might be thinking, okay, so a token is a chunk of text, but why does that matter?

[02:09] Well, there are some good reasons for knowing about tokens and being able to limit them. Each token incurs a charge and it takes time to process. So that gives you an incentive. If you limit the number of tokens, you can keep costs down and keep performance up.
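To get a feel for the cost side, here is a back-of-the-envelope sketch. The price used below is a made-up placeholder, not a real OpenAI rate, and the function name is invented for illustration; check the current pricing page for actual figures.

```javascript
// Estimate spend for a number of tokens at a given price per 1,000 tokens.
function estimateCost(totalTokens, pricePer1kTokens) {
  return (totalTokens / 1000) * pricePer1kTokens;
}

// Hypothetical numbers: 1,000,000 requests of 60 tokens each,
// at an assumed (not real) price of $0.02 per 1,000 tokens.
const HYPOTHETICAL_PRICE_PER_1K = 0.02;
const totalTokens = 1000000 * 60;

console.log(estimateCost(totalTokens, HYPOTHETICAL_PRICE_PER_1K).toFixed(2)); // "1200.00"
```

Halving the token count halves the estimate, which is the incentive to keep the limit no higher than a full response needs.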

[02:27] And that's really important, of course, when you run out of free credit, or if you're creating a production app that scales to millions and millions of users. And why shouldn't it? Now, there's something really important about maxTokens that we need to understand.

[02:42] maxTokens does not help us control how concise a text is. As we saw in our app, we get an incomplete response when the token count is low, not a more concise one. So as a tool to control how verbose or how expressive OpenAI is, maxTokens is useless.

[03:01] And that raises the question, how should I use it? And the answer is we should set it high enough to allow a full response from OpenAI. So you might just have to do a little bit of experimentation with that each time, making sure the text that you get back from the API is not cut short.
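That experimentation could be sketched as a small loop that tries increasing limits until the finish reason is stop. This is one possible approach, not something shown in the lesson: `sendRequest` is a hypothetical function you would supply to call the API, and `fakeRequest` is a stand-in so the sketch runs offline.

```javascript
// Try increasing token limits until the completion finishes on its own.
// `sendRequest` is a hypothetical function you supply: it takes a
// max_tokens value and resolves to the parsed API response.
async function findSufficientMaxTokens(sendRequest, candidates) {
  for (const maxTokens of candidates) {
    const response = await sendRequest(maxTokens);
    if (response.choices[0].finish_reason === 'stop') {
      return { maxTokens, text: response.choices[0].text };
    }
  }
  return null; // every candidate was cut short
}

// Stand-in for the real API so the sketch runs offline:
// it pretends the full answer needs 50 tokens.
const fakeRequest = async (maxTokens) => ({
  choices: [{
    text: 'sample text',
    finish_reason: maxTokens >= 50 ? 'stop' : 'length',
  }],
});

findSufficientMaxTokens(fakeRequest, [16, 60, 100]).then((result) => {
  console.log(result.maxTokens); // 60
});
```

Each attempt in a loop like this is a billed request, so in practice you would settle on a value once and reuse it rather than probing on every call.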

[03:18] So how can we control the length of text we get from OpenAI? Well, we do that with prompt design. Good prompt design is everything. And good prompt design is the best way to ensure that the text we get from OpenAI is the length we want. Now I actually think that the text that we got back when we had maxTokens set to 60 was

[03:38] just a little bit too long. So as we go through this project and we learn more about prompt design, we will come back and just do a little bit of refactoring here. But for now, I want to keep up the momentum, keep moving forwards. So let's start tackling our next API call, which is to generate a full synopsis from

[03:56] our one-sentence movie idea. When you're ready for that, let's move on.