OpenAI - AI platform for developers

OpenAI - A complete guide to the AI platform that changed the tech world

In November 2022, OpenAI released ChatGPT, and more than a million users signed up within 5 days. Today OpenAI is valued at $500 billion, and its products - from GPT-5 through DALL-E to Sora - set the direction of artificial intelligence development. But OpenAI is more than ChatGPT. For developers, it is primarily an API platform with models, tools, and SDKs for building next-generation AI applications.

What is OpenAI?

OpenAI is an American AI research and technology company founded in December 2015 by Sam Altman, Elon Musk, Ilya Sutskever, Greg Brockman, and others. It initially operated as a non-profit with the mission of developing "safe and beneficial" artificial general intelligence (AGI).

Starting in 2019, OpenAI changed its structure - from a pure non-profit to a hybrid "capped profit" model, and in 2025 it became a public benefit corporation (PBC) overseen by the OpenAI Foundation. The driver was simple: AI development requires enormous computational and financial resources.

The company is headquartered in San Francisco. Its CEO is Sam Altman, and it employs thousands of staff and researchers.

History and key moments

2015-2018: The beginning

OpenAI was founded with a $1 billion pledge from Altman, Musk, Reid Hoffman, Peter Thiel, AWS, and others. The goal was to conduct open AI research, hence the "Open" in the name. In 2018, Elon Musk left the board after an unsuccessful attempt to take control of the organization.

2019-2022: GPT and the breakthrough

GPT-2 arrived in 2019 and surprised observers with the quality of its generated text. GPT-3 (2020), with 175 billion parameters, opened the era of large language models. DALL-E (image generation) and Codex (code generation) followed in 2021. GPT-3.5 and ChatGPT (November 2022) triggered the global AI boom.

2023: Turbulence and GPT-4

In March 2023, OpenAI released GPT-4 - a multimodal model that understands text and images. In November, the board removed Sam Altman as CEO. He returned five days later, after practically all employees threatened to leave. The crisis led to a reconstituted board.

2024-2025: Reasoning models and GPT-5

OpenAI introduced the o1/o3/o4-mini series of "thinking" models, which reason step by step before answering. GPT-5 followed in 2025, unifying general intelligence, reasoning, coding, and multimodality in one model family. The same period brought Sora 2 (video), Operator (web agent), ChatGPT Atlas (browser), and GPT-OSS (open-weight models).

In May 2025, OpenAI acquired io, the company founded by Jony Ive (Apple's former design chief), for $6.5 billion, planning to create a new category of personal devices.

OpenAI products

ChatGPT

OpenAI's flagship consumer product. A chat interface for interacting with GPT models:

ChatGPT Free - free access to GPT-4o mini
ChatGPT Plus ($20/mo) - GPT-4o, GPT-5, DALL-E, data analysis
ChatGPT Pro ($200/mo) - unlimited access to all models, o1-pro, deep research
ChatGPT Team ($25/person/mo) - version for teams
ChatGPT Enterprise - corporate version with compliance and SSO

DALL-E / gpt-image

Models that generate images from text descriptions. DALL-E 3 is integrated with ChatGPT. In 2025, OpenAI moved to gpt-image models built on GPT's multimodal capabilities, replacing the earlier diffusion models.

Sora

A text-to-video model that generates videos at up to 1920x1080 resolution. Sora 2 (2025) offers improved physics simulation and synchronized audio.

Whisper

A speech recognition model - audio-to-text transcription with support for many languages (including Polish). Open-source, and also available via the API.

Operator

An AI agent for automating browser tasks - logging in, filling in forms, navigating websites.

OpenAI models for developers

GPT-5 family

Model	Input/1M tokens	Output/1M tokens	Context	Purpose
GPT-5.2	$1.25	$10.00	128K	Latest flagship model
GPT-5	$1.25	$10.00	128K	General + reasoning
GPT-5 Mini	-	-	128K	Cost/performance balance
GPT-5 Nano	-	-	128K	Minimal cost

GPT-4 family

Model	Input/1M tokens	Output/1M tokens	Context	Purpose
GPT-4o	$2.50	$10.00	128K	Versatile flagship
GPT-4o Mini	$0.15	$0.60	128K	Fast and cheap
GPT-4.1	$2.00	$8.00	1M	Coding, instruction following
GPT-4.1 Mini	$0.40	$1.60	1M	Cheap with 1M context
GPT-4.1 Nano	$0.10	$0.40	1M	Fastest

Reasoning models (O-series)

Model	Input/1M tokens	Output/1M tokens	Context	Purpose
o1	$15.00	$60.00	200K	Deep reasoning
o3	$2.00	$8.00	200K	Next-generation reasoning
o4-mini	$1.10	$4.40	200K	Fast reasoning

O-series models spend "reasoning tokens" on internal thinking - these tokens are billed as output but do not appear in the API response. A response with 500 visible tokens may be billed for 2,000 or more output tokens.
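To make the billing rule concrete, here is a minimal sketch of the arithmetic. The helper name `response_cost` and the token counts are illustrative assumptions; the prices are the o3 figures from the table above.

```python
# Hypothetical helper: estimates the cost of a single reasoning-model response.
def response_cost(input_tokens, visible_output_tokens, reasoning_tokens,
                  input_price_per_m, output_price_per_m):
    # Reasoning tokens are invisible in the response but billed at the output rate.
    billed_output = visible_output_tokens + reasoning_tokens
    total = input_tokens * input_price_per_m + billed_output * output_price_per_m
    return total / 1_000_000

# Assumed request: 1,000 input tokens, 500 visible output tokens,
# 1,500 hidden reasoning tokens, at o3 prices ($2.00 in / $8.00 out per 1M).
cost = response_cost(1_000, 500, 1_500, 2.00, 8.00)
print(f"${cost:.4f}")
```

In this example the hidden reasoning tokens account for three quarters of the billed output, so budgeting on visible tokens alone would underestimate the cost considerably.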

Specialized models

gpt-image-1.5 - image generation
gpt-4o-transcribe / whisper-1 - audio transcription
gpt-4o-mini-tts - text-to-speech
text-embedding-3-small/large - text embeddings

How to choose a model?

Prototyping: GPT-5 Nano or GPT-4o Mini (cheapest)
General applications: GPT-4o or GPT-5
Complex reasoning: o3 or GPT-5.2
Coding: GPT-4.1 (optimized for instruction following)
Bulk processing: GPT-4o Mini + Batch API (50% cheaper)

Responses API

The Responses API is OpenAI's newest API, combining the simplicity of Chat Completions with the built-in tools of the Assistants API. It replaces the earlier Assistants API (deprecated, with removal planned for August 2026).

Basic usage

Code

Python

from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4o",
    input="Explain what React is in 3 sentences"
)
print(response.output_text)

With built-in tools

Code

Python

response = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "web_search_preview"}],
    input="What are the latest changes in Next.js 15?"
)
print(response.output_text)

Streaming

Code

Python

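stream = client.responses.create(
    model="gpt-4o",
    input="Write a tutorial about TypeScript generics",
    stream=True
)
for event in stream:
    # Print only text deltas; other event types (created, completed, ...) carry no text
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)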

Chat Completions API

The classic API for generating responses in a conversation format:

Code

Python

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful programming assistant."},
        {"role": "user", "content": "How do I create a custom hook in React?"}
    ]
)
print(response.choices[0].message.content)

Structured Outputs

Constrain the response to a defined JSON schema:

Code

Python

from pydantic import BaseModel

class CodeReview(BaseModel):
    issues: list[str]
    suggestions: list[str]
    score: int

response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Analyze the code and return a review."},
        {"role": "user", "content": "function add(a,b){return a+b}"}
    ],
    response_format=CodeReview
)
review = response.choices[0].message.parsed
print(f"Score: {review.score}")
print(f"Issues: {review.issues}")

Function calling

Lets the model request calls to external functions:

Code

Python

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Warsaw?"}],
    tools=tools,
    tool_choice="auto"
)

tool_call = response.choices[0].message.tool_calls[0]
print(f"Function: {tool_call.function.name}")
print(f"Args: {tool_call.function.arguments}")
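The snippet above stops at printing the requested call. A minimal sketch of the next step, running the function locally, might look like this. The `get_weather` stub, its return data, and the `dispatch_tool_call` helper are hypothetical, and the call is simulated with literal values so the block runs without an API key.

```python
import json

def get_weather(location: str, unit: str = "celsius") -> dict:
    # Stub standing in for a real weather lookup (made-up data).
    return {"location": location, "temperature": 21, "unit": unit}

# Maps tool names the model may request to local Python callables.
AVAILABLE_TOOLS = {"get_weather": get_weather}

def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the requested function and return its result as a JSON string."""
    if name not in AVAILABLE_TOOLS:
        raise ValueError(f"Unknown tool: {name}")
    kwargs = json.loads(arguments_json)  # arguments arrive as a JSON-encoded string
    result = AVAILABLE_TOOLS[name](**kwargs)
    return json.dumps(result)

# Simulated values of tool_call.function.name and tool_call.function.arguments
output = dispatch_tool_call("get_weather", '{"location": "Warsaw"}')
print(output)
```

With Chat Completions, the returned string would then be appended to the conversation as a message with role "tool" and the matching `tool_call_id`, followed by a second request so the model can phrase the final answer.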

Vision (image analysis)

Code

Python

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What do you see in this screenshot?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}}
            ]
        }
    ]
)
print(response.choices[0].message.content)

Embeddings

Embeddings convert text into numerical vectors that capture semantic meaning. Similar texts produce similar vectors.

Code

Python

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="TypeScript is a superset of JavaScript with static types"
)
embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}")

Embedding use cases

Semantic search - finding documents by meaning, not keywords
RAG (Retrieval-Augmented Generation) - supplying context to models
Classification - categorizing texts
Recommendations - suggesting similar content
Anomaly detection - identifying unusual patterns

Embedding model comparison

Model	Dimensions	Price/1M tokens	Use case
text-embedding-3-small	1536	$0.02	Most applications
text-embedding-3-large	3072	$0.13	Maximum quality

For most applications, text-embedding-3-small offers sufficient quality at roughly 1/6 the cost of the large version.

Agents SDK

The Agents SDK is OpenAI's framework for building AI agents - programs that can autonomously carry out multi-step tasks, using tools and delegating work to other agents.

Basic agent (Python)

Code

Python

from agents import Agent, Runner

agent = Agent(
    name="code-reviewer",
    instructions="You are an expert code reviewer. Analyze code for bugs, security issues, and performance problems.",
    model="gpt-4o"
)

result = Runner.run_sync(agent, "Review this function: def add(a,b): return a+b")
print(result.final_output)

Agent with tools

Code

Python

from agents import Agent, Runner, function_tool

@function_tool
def search_docs(query: str) -> str:
    """Search project documentation"""
    return f"Found documentation about: {query}"

@function_tool
def run_tests(file_path: str) -> str:
    """Run tests for a specific file"""
    return f"All tests passed for {file_path}"

agent = Agent(
    name="dev-assistant",
    instructions="Help developers by searching docs and running tests.",
    tools=[search_docs, run_tests],
    model="gpt-4o"
)

result = Runner.run_sync(agent, "Check if the auth module has documentation and run its tests")
print(result.final_output)

Agents SDK (TypeScript)

Code

TypeScript

import { Agent, run } from "@openai/agents"

const agent = new Agent({
  name: "assistant",
  instructions: "You are a helpful coding assistant.",
  model: "gpt-4o",
})

const result = await run(agent, "Explain React hooks")
console.log(result.finalOutput)

Handoffs (delegation)

Agents can delegate tasks to other agents:

Code

Python

from agents import Agent, Runner

researcher = Agent(
    name="researcher",
    instructions="Research topics thoroughly and provide detailed findings."
)

writer = Agent(
    name="writer",
    instructions="Write clear, engaging content based on research.",
    handoffs=[researcher]
)

result = Runner.run_sync(writer, "Write an article about WebAssembly")
print(result.final_output)

Built-in tools

web_search - internet search
file_search - document search
code_interpreter - Python code execution
image_generation - image generation
computer_use - interface automation

Codex CLI

An open-source tool for agent-style coding in the terminal:

Code

Bash

npx @openai/codex "Add error handling to the auth module"

Codex works directly with your repository - it reads files, proposes changes, and lets you review and accept them. It integrates with the Agents SDK via MCP (Model Context Protocol).

Cost optimization

Prompt caching

OpenAI automatically caches repeated prompt prefixes:

Model family	Savings
GPT-5	90% on cached tokens
GPT-4.1	75%
GPT-4o / O-series	50%
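As a rough illustration of what these discounts mean for a single request, the sketch below computes input cost with and without a cache hit. The helper `input_cost` and the token counts are assumptions; the price and discount are the GPT-4o figures quoted in this guide.

```python
# Hypothetical helper: input-side cost of one request with a partially cached prompt.
def input_cost(total_tokens, cached_tokens, price_per_m, cache_discount):
    uncached_tokens = total_tokens - cached_tokens
    # Cached tokens are billed at the regular price minus the discount.
    cached_price = price_per_m * (1 - cache_discount)
    total = uncached_tokens * price_per_m + cached_tokens * cached_price
    return total / 1_000_000

# Assumed prompt: 10,000 tokens, of which an 8,000-token prefix is served from cache.
with_cache = input_cost(10_000, 8_000, price_per_m=2.50, cache_discount=0.50)
without_cache = input_cost(10_000, 0, price_per_m=2.50, cache_discount=0.50)
print(f"with cache: ${with_cache:.4f}, without: ${without_cache:.4f}")
```

Because caching works on prefixes, the practical takeaway is to keep the stable part of a prompt (system instructions, shared context) at the beginning and the variable part at the end.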

Batch API

For tasks that do not need an immediate response (data analysis, content generation):

Code

Python

batch = client.batches.create(
    input_file_id="file-abc123",
    endpoint="/v1/chat/completions",
    completion_window="24h"
)

The Batch API processes requests within 24 hours at a 50% discount.
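The `input_file_id` above refers to a previously uploaded JSONL file with one request per line. A minimal sketch of building that file's contents follows; the helper name `build_batch_lines`, the prompts, and the `request-N` id scheme are illustrative assumptions.

```python
import json

def build_batch_lines(prompts, model="gpt-4o-mini"):
    """Return one JSON-encoded Batch API request per prompt."""
    lines = []
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"request-{i}",      # used to match results back to inputs
            "method": "POST",
            "url": "/v1/chat/completions",    # must match the batch's endpoint
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        lines.append(json.dumps(request, ensure_ascii=False))
    return lines

lines = build_batch_lines(["Summarize document A", "Summarize document B"])
jsonl = "\n".join(lines) + "\n"
print(jsonl)
```

The resulting text would be saved to a `.jsonl` file and uploaded with `client.files.create(..., purpose="batch")`; the returned file id is what `client.batches.create` expects.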

Cascade architecture

An effective strategy is to route requests through models of increasing capability:

GPT-4o Mini handles 80% of queries, the simple ones ($0.15/1M input)
GPT-4o handles 15% of queries, the medium ones ($2.50/1M input)
o3 handles 5% of queries, the complex ones ($2.00/1M input + reasoning)
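A quick sketch of what this split does to the average input price, plus a toy router. It assumes the 80/15/5 shares apply to input tokens and ignores output and reasoning tokens; the `complexity` score and its thresholds are hypothetical and would come from your own classifier or heuristics.

```python
# (model, share of traffic, input price per 1M tokens) from the list above
TIERS = [
    ("gpt-4o-mini", 0.80, 0.15),
    ("gpt-4o", 0.15, 2.50),
    ("o3", 0.05, 2.00),
]

# Weighted average input price across the cascade.
blended = sum(share * price for _, share, price in TIERS)
print(f"Blended input price: ${blended:.3f} per 1M tokens")

def pick_model(complexity: float) -> str:
    """Toy router: complexity is an assumed score in [0, 1]."""
    if complexity < 0.5:
        return "gpt-4o-mini"
    if complexity < 0.9:
        return "gpt-4o"
    return "o3"

print(pick_model(0.2), pick_model(0.7), pick_model(0.95))
```

Under these assumptions the blended input price is about $0.595 per 1M tokens, roughly a quarter of sending everything to GPT-4o.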

Fine-tuning

Training a model on your own data:

Code

Python

file = client.files.create(
    file=open("training_data.jsonl", "rb"),
    purpose="fine-tune"
)

job = client.fine_tuning.jobs.create(
    training_file=file.id,
    model="gpt-4.1-2025-04-14"
)

Fine-tuning is available for GPT-4.1, GPT-4o, and o4-mini. Training cost: $1.50-$100/h depending on the model.
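The `training_data.jsonl` file holds one chat-formatted example per line. Below is a minimal sketch of preparing such data. The example content and the `validate_example` check are illustrative assumptions, and the check is deliberately simple (it only requires string contents and a final assistant message).

```python
import json

# Each training example is a short conversation ending with the desired answer.
examples = [
    {"messages": [
        {"role": "system", "content": "You are a concise code reviewer."},
        {"role": "user", "content": "def add(a,b): return a+b"},
        {"role": "assistant", "content": "Works, but add type hints and a docstring."},
    ]},
]

def validate_example(example: dict) -> bool:
    """Simplified sanity check, not the full validation done by the API."""
    messages = example.get("messages", [])
    return (
        bool(messages)
        and all(isinstance(m.get("content"), str) for m in messages)
        and messages[-1].get("role") == "assistant"
    )

# One JSON object per line, skipping anything that fails the check.
jsonl = "\n".join(json.dumps(e) for e in examples if validate_example(e))
print(jsonl)
```

Writing `jsonl` to `training_data.jsonl` yields a file in the shape the upload call above expects.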

SDKs and libraries

Python

Code

Bash

pip install openai

Code

Python

from openai import OpenAI

client = OpenAI()  # automatically reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

Node.js / TypeScript

Code

Bash

npm install openai

Code

TypeScript

import OpenAI from "openai"

const client = new OpenAI()

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
})
console.log(response.choices[0].message.content)

.NET

Code

Bash

dotnet add package OpenAI

REST API (curl)

Code

Bash

curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Audio API

Transcription (Speech-to-Text)

Code

Python

# Open the file in a context manager so the handle is closed after the upload
with open("meeting.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="gpt-4o-transcribe",
        file=audio_file
    )
print(transcript.text)

Text-to-Speech

Code

Python

# Stream the synthesized audio straight to disk
with client.audio.speech.with_streaming_response.create(
    model="gpt-4o-mini-tts",
    voice="alloy",
    input="TypeScript is a superset of JavaScript."
) as response:
    response.stream_to_file("output.mp3")

Image generation

Code

Python

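import base64

response = client.images.generate(
    model="gpt-image-1",
    prompt="A futuristic city with flying cars, cyberpunk style",
    size="1024x1024",
    n=1
)
# gpt-image models return base64-encoded image data rather than a URL
image_bytes = base64.b64decode(response.data[0].b64_json)
with open("city.png", "wb") as f:
    f.write(image_bytes)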

Security and privacy

API data policy

Data sent through the API is not used to train models (by default)
OpenAI retains API data for 30 days for abuse monitoring
Zero Data Retention (ZDR) is available to eligible customers
SOC 2 Type II compliance

Rate limits

Limits depend on the account's usage tier (Tier 1-5), which rises with spend:

Tier	Requirements	RPM (GPT-4o)	TPM (GPT-4o)
Tier 1	$5 paid	500	30,000
Tier 2	$50 spent	5,000	450,000
Tier 3	$100 spent	5,000	800,000
Tier 5	$1,000 spent	10,000	30,000,000
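When a request exceeds these limits, the API responds with a rate-limit error, and the usual remedy is to retry with exponential backoff. The sketch below is a generic, library-independent version. The helper `with_backoff` and its parameters are assumptions, and the failing call is simulated so the block runs offline.

```python
import random
import time

def with_backoff(call, max_retries=5, base_delay=1.0,
                 retry_on=(Exception,), sleep=time.sleep):
    """Call `call()` and retry on the given exceptions with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return call()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error
            # 1s, 2s, 4s, ... plus random jitter to avoid synchronized retries
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)

# Simulated endpoint that fails twice before succeeding.
attempts = {"count": 0}

def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("simulated 429")
    return "ok"

# sleep is replaced with a no-op here so the demo finishes instantly
result = with_backoff(flaky, retry_on=(RuntimeError,), sleep=lambda seconds: None)
print(result, attempts["count"])
```

With the official Python SDK, `retry_on=(openai.RateLimitError,)` would be the natural choice; the client also retries some errors on its own, configurable through its `max_retries` argument.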

Klucze API

Code

Bash

export OPENAI_API_KEY="sk-..."

Never commit API keys to a repository. Use environment variables or a secrets manager.
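One way to follow this advice in application code is to read the key from the environment and fail fast when it is missing. The helper `load_api_key` below is a hypothetical name, and the demo passes a plain dict instead of the real environment so it runs anywhere.

```python
import os

def load_api_key(env=os.environ) -> str:
    """Return the API key from the environment or raise a clear error."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key

# Demo with a stand-in environment; in real code call load_api_key() with no arguments.
demo_key = load_api_key({"OPENAI_API_KEY": "sk-test"})
print(demo_key[:3] + "...")  # never log the full key
```

Failing at startup with an explicit message tends to be easier to debug than an authentication error surfacing on the first request.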

Comparison with competitors

Feature	OpenAI	Anthropic (Claude)	Google (Gemini)	Meta (Llama)
Top model	GPT-5.2	Claude Opus 4	Gemini 2.5 Pro	Llama 3.3 70B
Price (input/1M)	from $0.10	from $0.25	from $0.075	Free (self-hosted)
Context	up to 1M tokens	200K	2M	128K
Vision	Yes	Yes	Yes	Yes (Vision variants)
Agents	Agents SDK	Agent SDK	Vertex AI Agents	No official SDK
Open-weight	GPT-OSS (Apache 2)	No	Gemma (Apache 2)	Yes (Llama License)
Fine-tuning	Yes	No	Yes	Yes (self-hosted)

Practical applications

Chatbot with context

Code

Python

messages = [
    {"role": "system", "content": "Jesteś asystentem sklepu internetowego. Odpowiadaj po polsku."},
    {"role": "user", "content": "Szukam butów do biegania w rozmiarze 43"}
]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)

messages.append(response.choices[0].message)
messages.append({"role": "user", "content": "A co z modelami wodoodpornymi?"})

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)

Document analysis with RAG

Code

Python

from openai import OpenAI
import numpy as np

client = OpenAI()

def get_embedding(text):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

docs = ["React is a library for building UIs", "Next.js is a React framework with SSR"]
doc_embeddings = [get_embedding(d) for d in docs]

query_embedding = get_embedding("How do I render on the server side?")
similarities = [cosine_similarity(query_embedding, de) for de in doc_embeddings]
best_doc = docs[np.argmax(similarities)]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": f"Answer based on: {best_doc}"},
        {"role": "user", "content": "How do I render on the server side?"}
    ]
)

Content moderation

Code

Python

response = client.moderations.create(
    model="omni-moderation-latest",
    input="Check this content for violations"
)
print(response.results[0].flagged)

FAQ

How much does the OpenAI API cost?

You are billed per token (a fragment of text). The cheapest model (GPT-4.1 Nano) costs $0.10 per 1M input tokens. New accounts receive free starter credits.

How do GPT models differ from O-series models?

GPT models (GPT-4o, GPT-5) are general-purpose models - fast and versatile. O-series models (o1, o3) are reasoning models - they "think" before answering and do better on complex logic and math problems, but they are slower and more expensive.

Can I fine-tune OpenAI models?

Yes, fine-tuning is available for GPT-4.1, GPT-4o, and o4-mini. You need training data in JSONL format.

Is data sent to the API safe?

By default, API data is not used to train models. OpenAI retains it for 30 days for abuse monitoring. Zero Data Retention is available for sensitive data.

How do I choose between OpenAI and the alternatives?

OpenAI has the broadest ecosystem of developer tools and the strongest reasoning models. Anthropic (Claude) stands out for safety and long context. Google (Gemini) offers the longest context (2M) and integration with GCP. Meta (Llama) provides the best open-source models for self-hosting.

Does OpenAI have open-source models?

Yes, GPT-OSS (gpt-oss-120b and gpt-oss-20b) was released in August 2025 under the Apache 2.0 license.

What is the difference between the Responses API and Chat Completions?

The Responses API is the newer API with built-in tools (web search, file search, code interpreter). Chat Completions is the older, simpler API. OpenAI recommends the Responses API for new projects.

Are ChatGPT and the API the same thing?

No. ChatGPT is a consumer product (a chat application). The API is a service for developers who want to integrate the models into their own applications. They are priced separately.

Summary

OpenAI is a leader in artificial intelligence, offering the broadest ecosystem of developer tools - from language and vision models, through SDKs for building agents, to fine-tuning and embedding tools.

For developers, the key pieces are: the Responses API as the main entry point, the Agents SDK for building multi-step workflows, Structured Outputs for typed output, function calling for integrating external services, and a wide range of models from the cheapest (GPT-4.1 Nano) to the most capable (o3, GPT-5.2).

Whether you are building a chatbot, a RAG system, an AI agent, or a data processing pipeline, the OpenAI platform provides tools for practically any AI task.

OpenAI - A complete guide to the AI platform that changed the tech world

In November 2022, OpenAI released ChatGPT and within 5 days, over a million users signed up. Today, OpenAI is a company valued at $500 billion, and its products - from GPT-5 through DALL-E to Sora - define the direction of artificial intelligence development. But OpenAI is more than just ChatGPT. For developers, it's primarily an API platform with models, tools, and SDKs that enable building next-generation AI applications.

What is OpenAI?

OpenAI is an American AI research and technology company founded in December 2015 by Sam Altman, Elon Musk, Ilya Sutskever, Greg Brockman, and others. It initially operated as a non-profit organization with the mission of developing "safe and beneficial" artificial general intelligence (AGI).

Since 2019, OpenAI has undergone a structural transformation - from a pure non-profit to a hybrid "capped profit" model, and in 2025, it transformed into a public benefit corporation (PBC) under the oversight of the OpenAI Foundation. This evolution was driven by a simple fact: AI development requires enormous computational and financial resources.

The company is headquartered in San Francisco. Its CEO is Sam Altman, and the company employs thousands of workers and researchers.

History and key moments

2015-2018: The beginning

OpenAI was founded with a $1 billion pledge from Altman, Musk, Reid Hoffman, Peter Thiel, AWS, and others. The goal was to conduct open AI research, hence the "Open" in the name. In 2018, Elon Musk left the board after an unsuccessful attempt to take control of the organization.

2019-2022: GPT and the breakthrough

In 2019, GPT-2 was created, surprising everyone with the quality of generated text. GPT-3 (2020) with 175 billion parameters launched the era of large language models. In 2021, DALL-E (image generation) and Codex (code generation) appeared. GPT-3.5 and ChatGPT (November 2022) triggered the global AI boom.

2023: Turbulence and GPT-4

In March 2023, OpenAI released GPT-4 - a multimodal model understanding text and images. In November, the board removed Sam Altman as CEO. Five days later, he returned after practically all employees threatened to leave. The crisis led to a board restructuring.

2024-2025: Reasoning models and GPT-5

OpenAI introduced the o1/o3/o4-mini model series with "thinking" - models can reason step by step before providing an answer. In 2025, GPT-5 appeared, unifying general intelligence, reasoning, coding, and multimodality in one model family. Also released were Sora 2 (video), Operator (web agent), ChatGPT Atlas (browser), and GPT-OSS (open-weight models).

In May 2025, OpenAI acquired for $6.5 billion the company IO founded by Jony Ive (former Apple design chief), planning to create a new category of personal devices.

OpenAI products

ChatGPT

OpenAI's flagship consumer product. A chat interface for interacting with GPT models:

ChatGPT Free - free access to GPT-4o mini
ChatGPT Plus ($20/mo) - GPT-4o, GPT-5, DALL-E, data analysis
ChatGPT Pro ($200/mo) - unlimited access to all models, o1-pro, deep research
ChatGPT Team ($25/person/mo) - team version
ChatGPT Enterprise - corporate version with compliance and SSO

DALL-E / gpt-image

Image generation models from text descriptions. DALL-E 3 integrated with ChatGPT. In 2025, OpenAI transitioned to gpt-image models based on GPT's multimodal capabilities, replacing earlier diffusion models.

Sora

A text-to-video model generating videos up to 1920x1080 resolution. Sora 2 (2025) offers improved physics simulation and synchronized audio.

Whisper

A speech recognition model - audio-to-text transcription with multi-language support (including Polish). Open-source, also available via API.

Operator

An AI agent for automating browser tasks - logging in, filling forms, navigating websites.

OpenAI models for developers

GPT-5 family

Model	Input/1M tokens	Output/1M tokens	Context	Purpose
GPT-5.2	$1.25	$10.00	128K	Latest flagship model
GPT-5	$1.25	$10.00	128K	General + reasoning
GPT-5 Mini	-	-	128K	Cost/performance balance
GPT-5 Nano	-	-	128K	Minimum costs

GPT-4 family

Model	Input/1M tokens	Output/1M tokens	Context	Purpose
GPT-4o	$2.50	$10.00	128K	Versatile flagship
GPT-4o Mini	$0.15	$0.60	128K	Fast and cheap
GPT-4.1	$2.00	$8.00	1M	Coding, instructions
GPT-4.1 Mini	$0.40	$1.60	1M	Cheap with 1M context
GPT-4.1 Nano	$0.10	$0.40	1M	Fastest

Reasoning models (O-series)

Model	Input/1M tokens	Output/1M tokens	Context	Purpose
o1	$15.00	$60.00	200K	Deep reasoning
o3	$2.00	$8.00	200K	Next-gen reasoning
o4-mini	$1.10	$4.40	200K	Fast reasoning

O-series models consume "reasoning tokens" for internal thinking - these tokens are billed as output but not visible in the API response. A response with 500 visible tokens might consume 2000+ tokens.

Specialist models

gpt-image-1.5 - image generation
gpt-4o-transcribe / whisper-1 - audio transcription
gpt-4o-mini-tts - text-to-speech
text-embedding-3-small/large - text embeddings

How to choose a model?

Prototyping: GPT-5 Nano or GPT-4o Mini (cheapest)
General applications: GPT-4o or GPT-5
Complex reasoning: o3 or GPT-5.2
Coding: GPT-4.1 (optimized for instructions)
Batch processing: GPT-4o Mini + Batch API (50% cheaper)

Responses API

The Responses API is OpenAI's newest API, combining the simplicity of Chat Completions with the built-in tools of the Assistants API. It replaces the earlier Assistants API (deprecated, removal in August 2026).

Basic usage

Code

Python

from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4o",
    input="Explain what React is in 3 sentences"
)
print(response.output_text)

With built-in tools

Code

Python

response = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "web_search_preview"}],
    input="What are the latest changes in Next.js 15?"
)
print(response.output_text)

Streaming

Code

Python

stream = client.responses.create(
    model="gpt-4o",
    input="Write a tutorial about TypeScript generics",
    stream=True
)
for event in stream:
    if hasattr(event, "delta"):
        print(event.delta, end="", flush=True)

Chat Completions API

The classic API for generating responses in conversation format:

Code

Python

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful programming assistant."},
        {"role": "user", "content": "How do I create a custom hook in React?"}
    ]
)
print(response.choices[0].message.content)

Structured Outputs

Forcing responses in a defined JSON schema:

Code

Python

from pydantic import BaseModel

class CodeReview(BaseModel):
    issues: list[str]
    suggestions: list[str]
    score: int

response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Analyze the code and return a review."},
        {"role": "user", "content": "function add(a,b){return a+b}"}
    ],
    response_format=CodeReview
)
review = response.choices[0].message.parsed
print(f"Score: {review.score}")
print(f"Issues: {review.issues}")

Function calling

Allows the model to invoke external functions:

Code

Python

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string", "description": "City name"},
                    "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
                },
                "required": ["location"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Warsaw?"}],
    tools=tools,
    tool_choice="auto"
)

tool_call = response.choices[0].message.tool_calls[0]
print(f"Function: {tool_call.function.name}")
print(f"Args: {tool_call.function.arguments}")

Vision (image analysis)

Code

Python

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What do you see in this screenshot?"},
                {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}}
            ]
        }
    ]
)
print(response.choices[0].message.content)

Embeddings

Embeddings convert text into numerical vectors that reflect semantic meaning. Similar texts generate similar vectors.

Code

Python

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="TypeScript is a superset of JavaScript with static types"
)
embedding = response.data[0].embedding
print(f"Dimensions: {len(embedding)}")

Embedding use cases

Semantic search - finding documents by meaning, not keywords
RAG (Retrieval-Augmented Generation) - providing context to models
Classification - categorizing texts
Recommendations - suggesting similar content
Anomaly detection - identifying unusual patterns

Embedding model comparison

Model	Dimensions	Price/1M tokens	Use case
text-embedding-3-small	1536	$0.02	Most applications
text-embedding-3-large	3072	$0.13	Maximum quality

For most applications, text-embedding-3-small offers sufficient quality at 1/6th the cost of the large version.

Agents SDK

The Agents SDK is OpenAI's framework for building AI agents - programs that can autonomously execute multi-step tasks, using tools and delegating work to other agents.

Basic agent (Python)

Code

Python

from agents import Agent, Runner

agent = Agent(
    name="code-reviewer",
    instructions="You are an expert code reviewer. Analyze code for bugs, security issues, and performance problems.",
    model="gpt-4o"
)

result = Runner.run_sync(agent, "Review this function: def add(a,b): return a+b")
print(result.final_output)

Agent with tools

Code

Python

from agents import Agent, Runner, function_tool

@function_tool
def search_docs(query: str) -> str:
    """Search project documentation"""
    return f"Found documentation about: {query}"

@function_tool
def run_tests(file_path: str) -> str:
    """Run tests for a specific file"""
    return f"All tests passed for {file_path}"

agent = Agent(
    name="dev-assistant",
    instructions="Help developers by searching docs and running tests.",
    tools=[search_docs, run_tests],
    model="gpt-4o"
)

result = Runner.run_sync(agent, "Check if the auth module has documentation and run its tests")
print(result.final_output)

Agents SDK (TypeScript)

Code

TypeScript

import { Agent, run } from "@openai/agents"

const agent = new Agent({
  name: "assistant",
  instructions: "You are a helpful coding assistant.",
  model: "gpt-4o",
})

const result = await run(agent, "Explain React hooks")
console.log(result.finalOutput)

Handoffs (delegation)

Agents can delegate tasks to other agents:

Code

Python

from agents import Agent, Runner

researcher = Agent(
    name="researcher",
    instructions="Research topics thoroughly and provide detailed findings."
)

writer = Agent(
    name="writer",
    instructions="Write clear, engaging content based on research.",
    handoffs=[researcher]
)

result = Runner.run_sync(writer, "Write an article about WebAssembly")
print(result.final_output)

Built-in tools

web_search - internet search
file_search - document search
code_interpreter - Python code execution
image_generation - image creation
computer_use - interface automation

Codex CLI

An open-source tool for agent-style coding in the terminal:

Code

Bash

npx @openai/codex "Add error handling to the auth module"

Codex works directly with your repository: it reads files, proposes changes, and lets you review and accept them. It integrates with the Agents SDK via MCP (Model Context Protocol).

Cost optimization

Prompt caching

OpenAI automatically caches repeated prompt prefixes:

Model family	Savings
GPT-5	90% on cached tokens
GPT-4.1	75%
GPT-4o / O-series	50%
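
To see what these discounts mean for a bill, here is a minimal sketch that computes the input cost of a request when part of its prompt is served from cache. The discount values come from the table above. The function name, the dictionary keys, and the sample token counts are illustrative assumptions, and the $2.50 per 1M price is the GPT-4o input rate quoted elsewhere in this guide.

```python
# Cached-token discounts from the table above (fraction saved on cached input tokens).
# The keys are informal labels for the model families, not exact API model IDs.
CACHE_DISCOUNT = {"gpt-5": 0.90, "gpt-4.1": 0.75, "gpt-4o": 0.50}


def effective_input_cost(total_tokens: int, cached_tokens: int,
                         price_per_million: float, discount: float) -> float:
    """Return the input cost in dollars when part of the prompt is served from cache."""
    if not 0 <= cached_tokens <= total_tokens:
        raise ValueError("cached_tokens must be between 0 and total_tokens")
    uncached = total_tokens - cached_tokens
    cached_price = price_per_million * (1 - discount)
    return (uncached * price_per_million + cached_tokens * cached_price) / 1_000_000


# Hypothetical request: 10,000 input tokens, 8,000 of them a cached prefix.
cost = effective_input_cost(10_000, 8_000, 2.50, CACHE_DISCOUNT["gpt-4o"])
print(f"${cost:.4f}")
```

Because only the repeated prefix is discounted, putting stable content (system prompt, shared instructions) at the start of the prompt and variable content at the end tends to maximize the cached share.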

Batch API

For tasks that don't require immediate responses (data analysis, content generation):

Code

Python

batch = client.batches.create(
    input_file_id="file-abc123",
    endpoint="/v1/chat/completions",
    completion_window="24h"
)

The Batch API processes requests within 24 hours at a 50% discount.
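
The input file referenced by input_file_id is a JSONL file with one request per line. The sketch below builds such lines for Chat Completions requests. The helper name, the custom_id scheme, and the default model are assumptions made for illustration.

```python
import json


def build_batch_lines(prompts: list[str], model: str = "gpt-4o-mini") -> list[str]:
    """Turn prompts into JSONL lines, one Chat Completions request per line."""
    lines = []
    for i, prompt in enumerate(prompts):
        request = {
            "custom_id": f"request-{i}",    # lets you match results back to inputs
            "method": "POST",
            "url": "/v1/chat/completions",  # should match the endpoint given to batches.create
            "body": {
                "model": model,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        lines.append(json.dumps(request))
    return lines


lines = build_batch_lines(["Summarize report A", "Summarize report B"])
print(len(lines), "requests prepared")
```

Writing the joined lines to a .jsonl file and uploading it with purpose="batch" would produce the file ID passed as input_file_id.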

Cascade architecture

A cost-efficient strategy is to route each request to the cheapest model that can handle it, escalating to stronger models only when needed:

GPT-4o Mini handles 80% of simple queries ($0.15/1M input)
GPT-4o handles 15% of medium queries ($2.50/1M input)
o3 handles 5% of complex queries ($2.00/1M input + reasoning)
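
A rough sketch of the cascade follows, using the traffic shares and input prices from the list above. The complexity score and its thresholds are hypothetical (in practice they might come from a small classifier or a heuristic), and the blended price ignores output and reasoning tokens and assumes requests of equal size.

```python
# (share of traffic, input price in $ per 1M tokens) taken from the list above.
TIERS = {
    "gpt-4o-mini": (0.80, 0.15),
    "gpt-4o": (0.15, 2.50),
    "o3": (0.05, 2.00),
}


def blended_input_price(tiers: dict[str, tuple[float, float]]) -> float:
    """Average input price per 1M tokens, weighted by each tier's traffic share."""
    return sum(share * price for share, price in tiers.values())


def pick_model(complexity: float) -> str:
    """Map a complexity score in [0, 1] to a model. Thresholds are illustrative only."""
    if complexity < 0.6:
        return "gpt-4o-mini"
    if complexity < 0.9:
        return "gpt-4o"
    return "o3"


print(f"Blended input price: ${blended_input_price(TIERS):.3f} per 1M tokens")
```

Under these assumptions the blended input price is well below the GPT-4o rate, which is the point of the cascade: most traffic never reaches the expensive tiers.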

Fine-tuning

Training a model on your own data:

Code

Python

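# Upload the training file; the context manager closes the handle afterwards
with open("training_data.jsonl", "rb") as f:
    file = client.files.create(file=f, purpose="fine-tune")

# Start a fine-tuning job on a pinned model snapshot
job = client.fine_tuning.jobs.create(
    training_file=file.id,
    model="gpt-4.1-2025-04-14"
)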

Fine-tuning is available for GPT-4.1, GPT-4o, and o4-mini. Training cost: $1.50-$100/h depending on the model.
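
The training file holds one chat-format example per line. The sketch below shows one way to serialize and sanity-check such lines before uploading. The helper names are hypothetical, and the validator is a simplified check (it only knows the system, user, and assistant roles), not the platform's full validation.

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}


def to_training_line(system: str, user: str, assistant: str) -> str:
    """Serialize one training example as a single JSONL line in chat format."""
    example = {
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
            {"role": "assistant", "content": assistant},
        ]
    }
    return json.dumps(example, ensure_ascii=False)


def validate_line(line: str) -> bool:
    """Check for valid JSON, known roles, string content, and a final assistant reply."""
    try:
        messages = json.loads(line)["messages"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return False
    if not isinstance(messages, list) or not messages:
        return False
    for m in messages:
        if not isinstance(m, dict) or m.get("role") not in ALLOWED_ROLES:
            return False
        if not isinstance(m.get("content"), str):
            return False
    return messages[-1]["role"] == "assistant"


line = to_training_line("You are a support bot.", "Where is my order?", "Let me check that for you.")
print(validate_line(line))
```

Running a check like this locally catches malformed lines before you pay for an upload and a failed job.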

SDKs and libraries

Python

Code

Bash

pip install openai

Code

Python

from openai import OpenAI

client = OpenAI()  # automatically reads OPENAI_API_KEY from env

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

Node.js / TypeScript

Code

Bash

npm install openai

Code

TypeScript

import OpenAI from "openai"

const client = new OpenAI()

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
})
console.log(response.choices[0].message.content)

.NET

Code

Bash

dotnet add package OpenAI

REST API (curl)

Code

Bash

curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Audio API

Transcription (Speech-to-Text)

Code

Python

# Open the file in a context manager so the handle is closed after the upload
with open("meeting.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="gpt-4o-transcribe",
        file=audio_file
    )
print(transcript.text)

Text-to-Speech

Code

Python

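# Stream the audio to disk instead of buffering the whole response in memory
with client.audio.speech.with_streaming_response.create(
    model="gpt-4o-mini-tts",
    voice="alloy",
    input="TypeScript is a superset of JavaScript."
) as response:
    response.stream_to_file("output.mp3")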

Image generation

Code

Python

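import base64

response = client.images.generate(
    model="gpt-image-1",
    prompt="A futuristic city with flying cars, cyberpunk style",
    size="1024x1024",
    n=1
)

# gpt-image-1 returns base64-encoded image data rather than a URL
image_bytes = base64.b64decode(response.data[0].b64_json)
with open("city.png", "wb") as f:
    f.write(image_bytes)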

Security and privacy

API data policy

Data sent via API is not used for model training (by default)
OpenAI retains API data for 30 days for abuse monitoring
Zero Data Retention (ZDR) available for qualifying customers
SOC 2 Type II compliance

Rate limits

Rate limits depend on your usage tier (Tier 1-5), which rises with cumulative spending. The table shows selected tiers:

Tier	Requirements	RPM (GPT-4o)	TPM (GPT-4o)
Tier 1	$5 deposit	500	30,000
Tier 2	$50 spent	5,000	450,000
Tier 3	$100 spent	5,000	800,000
Tier 5	$1,000 spent	10,000	30,000,000
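
When a request exceeds these limits, a common remedy is to retry with exponential backoff. The sketch below is a generic wrapper, not part of the OpenAI SDK: RateLimited is a stand-in exception, and the function name, default delays, and jitter factor are assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Stand-in for an HTTP 429 error; a real client raises its own exception type."""


def with_backoff(call: Callable[[], T], max_retries: int = 5,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry `call` on RateLimited, doubling the wait each attempt and adding jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise RuntimeError("max_retries must be at least 1")


# Demo with a fake call that is rate-limited twice, then succeeds.
# Waits are recorded instead of slept so the example runs instantly.
attempts = {"count": 0}
waits: list[float] = []


def flaky_call() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RateLimited()
    return "ok"


result = with_backoff(flaky_call, sleep=waits.append)
print(result, len(waits))
```

The jitter spreads retries from many clients over time, so they do not all hit the API again at the same instant.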

API keys

Code

Bash

export OPENAI_API_KEY="sk-..."

Never commit API keys to your repository. Use environment variables or secret managers.
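
One way to enforce this in application code is to read the key from the environment at startup and fail fast when it is missing. This is a minimal sketch; the helper name and the error message are illustrative assumptions.

```python
import os
from typing import Mapping


def load_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the API key from the environment, failing fast if it is absent."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it or use a secret manager")
    return key


# Passing a mapping explicitly makes the function easy to test without touching os.environ.
print(load_api_key({"OPENAI_API_KEY": "sk-example"}))
```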

Comparison with competitors

Feature	OpenAI	Anthropic (Claude)	Google (Gemini)	Meta (Llama)
Best model	GPT-5.2	Claude Opus 4	Gemini 2.5 Pro	Llama 3.3 70B
Price (input/1M)	from $0.10	from $0.25	from $0.075	Free (local)
Context	up to 1M tokens	200K	2M	128K
Vision	Yes	Yes	Yes	Yes (with Vision)
Agents	Agents SDK	Agent SDK	Vertex AI Agents	No official
Open-weight	GPT-OSS (Apache 2)	No	Gemma (Apache 2)	Yes (Llama License)
Fine-tuning	Yes	No	Yes	Yes (local)

Practical use cases

Chatbot with context

Code

Python

messages = [
    {"role": "system", "content": "You are an online store assistant. Respond helpfully."},
    {"role": "user", "content": "I'm looking for running shoes in size 10"}
]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)

messages.append(response.choices[0].message)
messages.append({"role": "user", "content": "What about waterproof models?"})

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)
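
Because the full messages list is sent with every call, long conversations grow in cost and can eventually exceed the context window. The sketch below is one simple way to cap the history, under the assumption that dropping the oldest turns is acceptable. The helper names and the max_turns default are hypothetical, and cutting by message count can separate a user message from its reply.

```python
def _role(message) -> str:
    """Read the role from either a plain dict or an SDK message object."""
    return message["role"] if isinstance(message, dict) else message.role


def trim_history(messages: list, max_turns: int = 10) -> list:
    """Keep all system messages plus the most recent `max_turns` other messages."""
    system = [m for m in messages if _role(m) == "system"]
    rest = [m for m in messages if _role(m) != "system"]
    if max_turns <= 0:
        return system
    return system + rest[-max_turns:]


history = [{"role": "system", "content": "You are an online store assistant."}]
history += [{"role": "user", "content": f"question {i}"} for i in range(5)]

trimmed = trim_history(history, max_turns=2)
print([m["content"] for m in trimmed])
```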

Document analysis with RAG

Code

Python

from openai import OpenAI
import numpy as np

client = OpenAI()

def get_embedding(text):
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text
    )
    return response.data[0].embedding

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

docs = ["React is a library for building UI", "Next.js is a React framework with SSR"]
doc_embeddings = [get_embedding(d) for d in docs]

query_embedding = get_embedding("How to render on the server side?")
similarities = [cosine_similarity(query_embedding, de) for de in doc_embeddings]
best_doc = docs[np.argmax(similarities)]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": f"Answer based on: {best_doc}"},
        {"role": "user", "content": "How to render on the server side?"}
    ]
)
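
With more than a handful of documents, it usually helps to pass the top few matches to the model rather than only the single best one. The sketch below ranks documents by cosine similarity and keeps the top k. It uses pure Python and toy 3-dimensional vectors in place of real embeddings so that it runs without an API call; the top_k helper and the sample data are assumptions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], doc_vecs: list[list[float]],
          docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query, best match first."""
    scored = sorted(
        zip(docs, doc_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [doc for doc, _ in scored[:k]]


# Toy vectors stand in for embeddings returned by the embeddings endpoint.
docs = ["doc about SSR", "doc about UI", "doc about databases"]
doc_vecs = [[0.9, 0.1, 0.0], [0.5, 0.5, 0.0], [0.0, 0.1, 0.9]]

context = "\n".join(top_k([1.0, 0.0, 0.0], doc_vecs, docs, k=2))
print(context)
```

The joined context string can then take the place of best_doc in the system message.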

Content moderation

Code

Python

response = client.moderations.create(
    model="omni-moderation-latest",
    input="Check this content for violations"
)
print(response.results[0].flagged)

FAQ

How much does using the OpenAI API cost?

Charges are based on tokens (text fragments). The cheapest model (GPT-4.1 Nano) costs $0.10/1M input tokens. New accounts receive free credits to get started.

How do GPT models differ from o-series models?

GPT models (GPT-4o, GPT-5) are general-purpose models: fast and versatile. The o-series models (o1, o3) are reasoning models: they "think" before answering, which makes them better at complex logical and mathematical problems but slower and more expensive.

Can I fine-tune OpenAI models?

Yes, fine-tuning is available for GPT-4.1, GPT-4o, and o4-mini. You need training data in JSONL format.

Is data sent to the API secure?

API data is not used for model training by default. OpenAI retains it for 30 days for monitoring. For sensitive data, Zero Data Retention is available.

How do I choose between OpenAI and the alternatives?

OpenAI has the broadest developer tool ecosystem and the best reasoning models. Anthropic (Claude) excels in safety and long context. Google (Gemini) offers the longest context (2M) and GCP integration. Meta (Llama) has the best open-source models for self-hosting.

Does OpenAI have open-source models?

Yes, GPT-OSS (gpt-oss-120b and gpt-oss-20b) were released in August 2025 under the Apache 2.0 license.

What is the difference between the Responses API and Chat Completions?

The Responses API is the newer API with built-in tools (web search, file search, code interpreter). Chat Completions is the older, simpler API. OpenAI recommends the Responses API for new projects.

Are ChatGPT and the API the same thing?

No. ChatGPT is a consumer product (chat application). The API is a service for developers to integrate models into their own applications. They have separate pricing plans.

Summary

OpenAI is the leader in artificial intelligence, offering the broadest developer tool ecosystem - from language and vision models, through SDKs for building agents, to fine-tuning and embedding tools.

For developers, the key offerings are: the Responses API as the main entry point, the Agents SDK for building multi-step workflows, Structured Outputs for typed output, function calling for external service integration, and a rich model library from the cheapest (GPT-4.1 Nano) to the most powerful (o3, GPT-5.2).

Whether you're building a chatbot, RAG system, AI agent, or data processing pipeline - the OpenAI platform provides tools for virtually any AI task.