Learn OpenAI API

A free visual AI and machine learning lesson with an interactive 3D visualization, plain-English theory, and quiz.

The OpenAI chat completions endpoint is the simplest LLM API: send a list of messages, get back the next assistant message. That's it. Everything else — streaming, function calling, JSON mode, vision, voice — is built on top of this. If you understand one HTTPS POST request, you understand 80% of what shipping an LLM app means.

Anatomy of a chat completion

A request is a JSON body with: `model` (e.g. 'gpt-4o-mini'), `messages` (list of `{role, content}` pairs with roles 'system', 'user', 'assistant'), and optional knobs like `temperature` (randomness 0-2), `max_tokens` (cap), `stream` (true to receive tokens as they generate).

POST https://api.openai.com/v1/chat/completions
Authorization: Bearer YOUR_API_KEY

{
  'model': 'gpt-4o-mini',
  'messages': [
    {'role': 'system', 'content': 'You are concise.'},
    {'role': 'user',   'content': 'What is 2+2?'}
  ],
  'temperature': 0.7
}

→ {'choices': [{'message': {'role': 'assistant', 'content': '4'}}], 'usage': {...}}

One HTTPS request, one JSON response

Token-based pricing

OpenAI bills per million tokens, with separate rates for input and output (output is usually 2-4× more expensive). 1000 tokens ≈ 750 English words. A typical chat turn is 50-200 input tokens + 50-500 output tokens. Watch the cost like you'd watch infrastructure spend — long contexts (RAG with many docs) and chatty users can rack up bills fast.

gpt-4o-mini — fast and cheap, the default for most apps ($0.15/$0.60 per 1M tokens)
gpt-4o — flagship, best quality, ~30× more expensive
o1-preview — reasoning model, slow but solves harder problems
Add `stream=True` to receive tokens incrementally — feels like typing
Add `response_format={'type': 'json_object'}` for guaranteed JSON
Use `tools` parameter to enable function calling

Anatomy of a chat completion

Token-based pricing

Practice questions

Related AI learning resources