Kimi | CodeWorlds

Kimi - complete guide to Moonshot's AI

What is Kimi?

Kimi is a family of advanced large language models (LLMs) created by the Chinese company Moonshot AI, founded in March 2023. The Kimi chatbot was officially made available to the public in November 2023, and since then it has evolved from a simple conversational assistant into one of the most powerful open-source models in the world.

The latest model in the family, Kimi K2.5 (January 2026), is a natively multimodal model with a Mixture of Experts (MoE) architecture totaling 1 trillion parameters, of which 32 billion are active during inference. This makes Kimi both powerful and efficient: each token is processed by only about 3% of the parameters rather than by the full model.
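The efficiency claim comes down to simple arithmetic. The sketch below uses only the two figures quoted above; the function name is illustrative.

```python
# Figures quoted in the text: 1 trillion total parameters, 32 billion active.
TOTAL_PARAMS = 1_000_000_000_000
ACTIVE_PARAMS = 32_000_000_000


def active_fraction(active: int, total: int) -> float:
    """Share of the model's parameters used in a single forward pass."""
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share per token: {fraction:.1%}")  # prints "Active share per token: 3.2%"
```

Compute per token scales with the active parameters, while memory still has to hold all of them, which is why an MoE model is cheap to query but demanding to host.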

Moonshot AI stands out with its consistent open-source approach. Most models in the Kimi family are available for download and modification, making them an attractive alternative to closed models from OpenAI or Anthropic.

Why Kimi?

Key advantages of Kimi

MoE architecture - 1 trillion parameters with 32B active provides excellent quality-to-cost ratio
Agent Swarm - Unique technology coordinating up to 100 AI agents working in parallel
Native multimodality - Vision and text trained together from the start, not bolted on separately
Open source - Model weights publicly available under a modified MIT license
Kimi Code CLI - Terminal-based AI coding tool, open source under Apache 2.0
Affordable API pricing - $0.60/million input tokens, $2.50/million output tokens
256K context window - Handles very long documents and codebases

Kimi K2.5 vs Claude Sonnet 4.5 vs GPT-4o

Feature	Kimi K2.5	Claude Sonnet 4.5	GPT-4o
Architecture	MoE 1T/32B	Undisclosed	Undisclosed
Context window	256K	200K	128K
Price (input/output)	$0.60/$2.50	$3/$15	$2.50/$10
SWE-Bench Verified	76.8%	70.3%	69.1%
Multimodality	Text + image + video	Text + image	Text + image + audio
Open source	Yes (modified MIT)	No	No
Agent Swarm	Yes (up to 100 agents)	No	No
Visual coding (OCRBench)	92.3%	87.1%	85.4%

Evolution of Kimi models

Kimi K1.5 (January 2025)

The first model that put Moonshot AI on the global map. According to Moonshot AI, K1.5 matched OpenAI o1's performance in mathematics, coding, and multimodal reasoning.

Kimi-VL (April 2025)

An open-source vision model with 16 billion parameters (MoE architecture, 3B active). Compact yet surprisingly effective at visual tasks.

Kimi-Dev (June 2025)

A coding-focused model with 72B parameters, based on Qwen2.5-72B. At release it achieved state-of-the-art results among open-source models on the SWE-bench Verified benchmark, making it a serious alternative to commercial coding models.

Kimi K2 (July 2025)

A breakthrough moment - a model with 1 trillion parameters (MoE, 32B active), trained on 15.5 trillion tokens. Released under a modified MIT license.

In September 2025, an updated version of K2 appeared with a doubled context window (128K → 256K tokens) and improved performance on agentic tasks.

Kimi K2 Thinking (November 2025)

A version of K2 optimized for advanced reasoning. It can perform 200-300 sequential tool calls autonomously. Benchmarks showed it outperforming GPT-5 and Claude Sonnet 4.5 on tests such as Humanity's Last Exam (44.9%) and BrowseComp (60.2%).

Reported training cost: approximately $4.6 million, a fraction of what the largest AI companies spend.

Kimi K2.5 (January 2026)

The latest model, a multimodal evolution of K2. It adds native vision capabilities through the MoonViT encoder (400M parameters). It processes both images and video, enabling agentic tasks such as replicating user journeys on websites based solely on video recordings.

Four modes of operation

Kimi K2.5 offers four modes tailored to different needs:

Instant

Quick answers to simple questions. Minimal latency, ideal for everyday tasks like translations, summaries, or quick code questions.

Thinking

Step-by-step reasoning mode. The model "thinks aloud," breaking complex problems into smaller parts. Great for debugging, mathematics, and logical puzzles.

Agent

A single agent with tool access. It can browse the internet, execute code, read files, and carry out multi-step tasks autonomously. Supports up to 200-300 sequential tool calls.
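The loop behind a tool-using agent can be sketched in a few lines. This is a minimal illustration, not Moonshot's implementation: the tool registry, the reply format, and the scripted stand-in for the model are all hypothetical, and the cap of 300 simply mirrors the upper figure quoted above.

```python
import json
from typing import Callable

# Hypothetical tool registry; a real agent would expose search, code execution, file access, etc.
TOOLS: dict[str, Callable[..., str]] = {
    "add": lambda a, b: str(a + b),
}


def scripted_model(messages: list[dict]) -> dict:
    """Stand-in for the model: asks for one tool call, then gives a final answer."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        return {"tool": "add", "arguments": json.dumps({"a": 2, "b": 3})}
    return {"answer": f"The result is {tool_results[-1]['content']}"}


def run_agent(model: Callable[[list[dict]], dict], prompt: str, max_tool_calls: int = 300) -> str:
    """Call the model repeatedly, executing requested tools until it answers."""
    messages = [{"role": "user", "content": prompt}]
    for _ in range(max_tool_calls):
        reply = model(messages)
        if "answer" in reply:
            return reply["answer"]
        # Execute the requested tool and feed the result back to the model.
        tool = TOOLS[reply["tool"]]
        result = tool(**json.loads(reply["arguments"]))
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("tool-call budget exhausted before a final answer")


print(run_agent(scripted_model, "What is 2 + 3?"))  # prints "The result is 5"
```

Everything here is sequential: each tool result must come back before the model decides on the next step.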

Agent Swarm (Beta)

The most advanced mode. It decomposes a task into subtasks and delegates them to a swarm of sub-agents (up to 100) working in parallel.

Agent Swarm - breakthrough technology

Agent Swarm is the most distinguishing feature of Kimi K2.5. Instead of a single agent executing tasks sequentially, Agent Swarm coordinates a swarm of up to 100 specialized sub-agents working in parallel.

How does Agent Swarm work?

Task decomposition - The orchestrator analyzes the task and splits it into independent subtasks
Agent allocation - Each subtask is assigned to a specialized sub-agent
Parallel execution - Sub-agents work simultaneously, coordinating through the orchestrator
Result aggregation - The orchestrator collects and combines results into a coherent response
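The four steps above follow a fan-out/fan-in pattern, which can be sketched with Python's standard thread pool. This is only an illustration of the control flow: the `sub_agent` function is a hypothetical placeholder for a model-driven agent, and the real orchestrator is a trained model rather than fixed code.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(items: list[str], chunk_size: int) -> list[list[str]]:
    """Task decomposition: split the work into independent subtasks."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def sub_agent(sites: list[str]) -> dict[str, str]:
    """Hypothetical sub-agent; a real one would call the model and its tools."""
    return {site: f"summary of {site}" for site in sites}


def run_swarm(sites: list[str], chunk_size: int = 10, max_agents: int = 100) -> dict[str, str]:
    subtasks = decompose(sites, chunk_size)
    # Agent allocation: one worker per subtask, capped at the swarm size.
    workers = max(1, min(max_agents, len(subtasks)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Parallel execution: every subtask runs at the same time.
        partial_results = list(pool.map(sub_agent, subtasks))
    # Result aggregation: merge the partial results into one report.
    report: dict[str, str] = {}
    for partial in partial_results:
        report.update(partial)
    return report


sites = [f"site-{n}" for n in range(1, 51)]
report = run_swarm(sites)
print(len(report), "sites analyzed by", len(decompose(sites, 10)), "sub-agents")
```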

Technical details

Agent Swarm uses Parallel-Agent Reinforcement Learning (PARL) with a trainable orchestrator. Training uses staged reward shaping to prevent "serial collapse" (agents reverting to sequential behavior) and "spurious parallelism" (fake parallelism without real benefits).

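The Critical Steps metric emphasizes latency optimization: it counts the steps on the longest dependent chain of work (the critical path) rather than the total number of steps across all agents, so what matters is not just correctness but also speed.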
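One way to picture a latency-oriented metric like this is to compare total work with the longest chain of dependent work. The sketch below assumes that each stage costs the orchestrator's own steps plus the steps of its slowest parallel sub-agent. That definition is an assumption made for illustration, and the `Stage` structure is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    orchestrator_steps: int
    # Steps taken by each sub-agent running in parallel during this stage.
    sub_agent_steps: list[int] = field(default_factory=list)


def total_steps(stages: list[Stage]) -> int:
    """All work performed, regardless of parallelism."""
    return sum(s.orchestrator_steps + sum(s.sub_agent_steps) for s in stages)


def critical_steps(stages: list[Stage]) -> int:
    """Steps on the longest dependent chain: only the slowest sub-agent counts."""
    return sum(s.orchestrator_steps + max(s.sub_agent_steps, default=0) for s in stages)


stages = [
    Stage(orchestrator_steps=1, sub_agent_steps=[4, 6, 5]),  # fan-out stage
    Stage(orchestrator_steps=2),                             # aggregation stage
]
print("total steps:", total_steps(stages))        # prints "total steps: 18"
print("critical steps:", critical_steps(stages))  # prints "critical steps: 9"
```

Under this reading, adding more sub-agents only helps if it shortens the slowest branch, which is what discourages fake parallelism.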

Results

Up to 1,500 coordinated tool calls in a single task
Execution time reduction of up to 4.5x compared to sequential approach
BrowseComp benchmark: 78.4% (Agent Swarm) vs 60.2% in single-agent mode

Code

TEXT

Example: task "analyze 50 competitor websites"

Traditional agent:
  → site 1 → site 2 → ... → site 50 → report
  Time: ~25 minutes

Agent Swarm:
  → [agent 1: sites 1-10] [agent 2: sites 11-20] ... [agent 5: sites 41-50]
  → aggregator → report
  Time: ~6 minutes
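The timings in this example imply a speedup a little below the ideal for five agents. The sketch uses only the approximate numbers from the example; attributing the gap to orchestration and aggregation overhead is an assumption.

```python
SEQUENTIAL_MINUTES = 25  # traditional agent, from the example
SWARM_MINUTES = 6        # Agent Swarm, from the example
NUM_AGENTS = 5           # agents shown in the example

speedup = SEQUENTIAL_MINUTES / SWARM_MINUTES
efficiency = speedup / NUM_AGENTS  # 1.0 would mean perfect scaling

print(f"Speedup: {speedup:.1f}x")                # prints "Speedup: 4.2x"
print(f"Parallel efficiency: {efficiency:.0%}")  # prints "Parallel efficiency: 83%"
```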

Kimi Code CLI

Kimi Code CLI is an open-source terminal tool for AI-assisted coding, comparable to Claude Code from Anthropic. It works directly in the terminal and supports code reading/editing, shell command execution, and multi-step agentic tasks.

Installation

Code

Bash

pip install kimi-cli

Requirements:

Python 3.10+ (3.13 recommended)
uv (Python package manager)
On Windows: WSL 2

Basic usage

Code

Bash

kimi chat "Explain this code"

kimi chat "Refactor the parseUserInput function in src/utils.ts"

kimi chat "Write unit tests for the auth module"

Shell mode

Press Ctrl-X during a session to toggle the built-in shell mode - you can execute commands without leaving Kimi.

MCP (Model Context Protocol)

Kimi Code CLI supports custom tools via MCP:

Code

Bash

kimi mcp add my-tool --command "node my-tool-server.js"

kimi mcp list

kimi chat --mcp-config-file ./project-mcp.json "Analyze the project"
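A config file such as project-mcp.json can be generated with a short script. The sketch below assumes the file follows the `mcpServers` layout commonly used by MCP clients; check the Kimi Code CLI documentation for the exact schema it expects. The server name and command are taken from the commands above.

```python
import json
from pathlib import Path


def build_mcp_config(name: str, command: str, args: list[str]) -> dict:
    """Build a config with one stdio server (assumed `mcpServers` layout)."""
    return {"mcpServers": {name: {"command": command, "args": args}}}


def save_config(config: dict, path: str) -> None:
    """Write the config as pretty-printed JSON."""
    Path(path).write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")


config = build_mcp_config("my-tool", "node", ["my-tool-server.js"])
print(json.dumps(config, indent=2))
# save_config(config, "project-mcp.json")  # uncomment to write the file
```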

IDE integration

Kimi Code CLI supports the Agent Client Protocol (ACP), enabling integration with editors:

VS Code - Dedicated Kimi Code extension with chat panel, slash commands, diff preview
Cursor - Via ACP
Zed - Via ACP
JetBrains - Via ACP

Kimi API - getting started

Registration and API key

SDK installation

Code

Bash

pip install openai

The Kimi API is compatible with the OpenAI format, so you can use the official OpenAI SDK.

Simple Python example

Code

Python

from openai import OpenAI

# Point the official OpenAI client at Kimi's OpenAI-compatible endpoint.
client = OpenAI(
    api_key="your-kimi-api-key",  # replace with your key; avoid committing it
    base_url="https://api.moonshot.cn/v1",
)

response = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[
        {"role": "system", "content": "You are a helpful programming assistant."},
        {"role": "user", "content": "Write a bubble sort function in TypeScript."},
    ],
    temperature=0.7,
)

# The generated text is in the first choice.
print(response.choices[0].message.content)

TypeScript example

Code

TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-kimi-api-key",
  baseURL: "https://api.moonshot.cn/v1",
});

async function askKimi(prompt: string): Promise<string> {
  const response = await client.chat.completions.create({
    model: "kimi-k2.5",
    messages: [
      { role: "system", content: "You are a helpful coding assistant." },
      { role: "user", content: prompt },
    ],
    temperature: 0.7,
  });

  return response.choices[0].message.content ?? "";
}

const answer = await askKimi("Explain the difference between map and flatMap in TypeScript");
console.log(answer);

Streaming

Code

Python

from openai import OpenAI

client = OpenAI(
    api_key="your-kimi-api-key",
    base_url="https://api.moonshot.cn/v1",
)

stream = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[
        {"role": "user", "content": "Write a tutorial about React Hooks"}
    ],
    stream=True,
)

for chunk in stream:
    # Some chunks may carry no choices or an empty delta; skip those.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Image analysis

Code

Python

from openai import OpenAI

client = OpenAI(
    api_key="your-kimi-api-key",
    base_url="https://api.moonshot.cn/v1",
)

response = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this user interface and suggest UX improvements."},
                {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
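For a local file rather than a public URL, OpenAI-format APIs generally accept a base64 data URL in the same `image_url` field. The sketch below builds such a message. It assumes the Kimi endpoint accepts data URLs, and the helper names are illustrative.

```python
import base64
import mimetypes
from pathlib import Path


def bytes_to_data_url(data: bytes, mime: str) -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def image_message(prompt: str, path: str) -> dict:
    """Build a user message that pairs a text prompt with a local image."""
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image file: {path}")
    url = bytes_to_data_url(Path(path).read_bytes(), mime)
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": url}},
        ],
    }


# Usage (needs a real file on disk):
# messages = [image_message("Describe this user interface.", "screenshot.png")]
```

The resulting dictionary can be passed in the `messages` list exactly like the URL-based message above.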

Visual coding - from image to code

One of the most impressive capabilities of Kimi K2.5 is converting screenshots into working code. With a 92.3% score on OCRBench, the model can read a user interface from a screenshot and generate corresponding React, Vue, or plain HTML code.

Code

Python

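from openai import OpenAI

client = OpenAI(
    api_key="your-kimi-api-key",
    base_url="https://api.moonshot.cn/v1",
)

# Send the screenshot together with an instruction describing the target stack.
response = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Convert this UI screenshot to a React component using Tailwind CSS."},
                {"type": "image_url", "image_url": {"url": "https://example.com/dashboard.png"}},
            ],
        }
    ],
)

# The reply contains the generated component source.
print(response.choices[0].message.content)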

Pricing

Kimi K2.5 API

Variant	Input (per 1M tokens)	Output (per 1M tokens)	Context
K2.5 Instant	$0.60	$2.50	256K
K2.5 Thinking	$0.60	$2.50	256K
K2.5 Agent	$0.60	$2.50	256K

Cost comparison

Model	Input	Output	Ratio to Kimi
Kimi K2.5	$0.60	$2.50	1x (baseline)
Claude Sonnet 4.5	$3.00	$15.00	5-6x the price
GPT-4o	$2.50	$10.00	about 4x the price
Gemini 1.5 Pro	$3.50	$10.50	about 4-6x the price

Among the models compared here, Kimi K2.5 has the lowest list price for both input and output tokens. For startups and budget-constrained projects, this is a compelling argument.
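To see what the list prices mean for a concrete workload, the comparison can be turned into a small estimator. The prices are the ones from the table above; the workload of 10 million input and 2 million output tokens is an arbitrary example.

```python
# USD per 1M tokens as (input, output), taken from the cost comparison table.
PRICES = {
    "Kimi K2.5": (0.60, 2.50),
    "Claude Sonnet 4.5": (3.00, 15.00),
    "GPT-4o": (2.50, 10.00),
    "Gemini 1.5 Pro": (3.50, 10.50),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for the given token counts."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example workload: 10M input tokens and 2M output tokens.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 10_000_000, 2_000_000):.2f}")
```

For this workload the estimate is $11.00 for Kimi K2.5, against $60.00 for Claude Sonnet 4.5, $45.00 for GPT-4o, and $56.00 for Gemini 1.5 Pro.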

Free access

Kimi.com offers free access to the chatbot with daily limits. For basic tasks - writing, translations, simple code questions - the free plan is perfectly sufficient.

Benchmarks

Coding

Benchmark	Kimi K2.5	Claude Sonnet 4.5	GPT-4o
SWE-Bench Verified	76.8%	70.3%	69.1%
HumanEval	92.1%	90.4%	90.2%
LiveCodeBench	68.5%	64.8%	62.3%

Reasoning and knowledge

Benchmark	Kimi K2.5	Claude Sonnet 4.5	GPT-4o
MMMU Pro	78.5%	74.1%	72.6%
Humanity's Last Exam	44.9%	38.2%	35.7%
GPQA Diamond	71.2%	68.4%	67.5%

Vision and multimodality

Benchmark	Kimi K2.5	Claude Sonnet 4.5	GPT-4o
OCRBench	92.3%	87.1%	85.4%
VideoMMMU	86.6%	-	78.2%
MathVista	74.8%	71.5%	70.1%

Agentic tasks

Benchmark	Kimi K2.5 (Swarm)	Kimi K2.5 (Agent)	Claude Sonnet 4.5
BrowseComp	78.4%	60.2%	52.1%
WebArena	71.3%	58.7%	54.8%

Practical applications

Code refactoring

Kimi K2.5 is well suited to refactoring large codebases: with a 256K-token context window, you can pass it multiple files in a single request.

Code

TypeScript

const prompt = `
Refactor the following React code from class components to functional components with hooks.
Maintain identical behavior and TypeScript types.

${classComponentCode}
`;
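When several files go into one prompt, it helps to check that they fit the context window before sending. The sketch below packs files until a token budget is reached. The four-characters-per-token estimate is a rough heuristic rather than the model's tokenizer, and the reserve for the answer is an arbitrary assumption.

```python
CONTEXT_TOKENS = 256_000  # context window quoted in the text
CHARS_PER_TOKEN = 4       # rough heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return len(text) // CHARS_PER_TOKEN + 1


def pack_files(
    files: dict[str, str],
    instruction: str,
    budget: int = CONTEXT_TOKENS,
    reserve_for_output: int = 16_000,
) -> tuple[str, list[str]]:
    """Append files to the prompt while they fit; return the prompt and skipped paths."""
    remaining = budget - reserve_for_output - estimate_tokens(instruction)
    parts = [instruction]
    skipped: list[str] = []
    for path, source in files.items():
        block = f"\n--- {path} ---\n{source}\n"
        cost = estimate_tokens(block)
        if cost <= remaining:
            parts.append(block)
            remaining -= cost
        else:
            skipped.append(path)
    return "".join(parts), skipped


files = {"a.ts": "x" * 40, "b.ts": "y" * 4000}
prompt, skipped = pack_files(files, "Refactor", budget=100, reserve_for_output=0)
print("skipped:", skipped)  # prints "skipped: ['b.ts']"
```

Files that do not fit are reported rather than silently truncated, so they can be sent in a follow-up request.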

Test generation

Code

TypeScript

const prompt = `
Generate unit tests (Jest + React Testing Library) for the UserProfile component.
Cover scenarios: loading state, error state, successful render, user interaction.

${userProfileComponent}
`;

Code review

Code

TypeScript

const prompt = `
Review this pull request for:
- Potential bugs
- Performance issues
- Security (OWASP Top 10)
- TypeScript best practices compliance

${diffContent}
`;

API documentation

Code

TypeScript

const prompt = `
Based on this NestJS code, generate OpenAPI documentation in YAML format.
Include all endpoints, parameters, response types, and error codes.

${nestjsControllers}
`;

Kimi vs Claude Code - CLI tools comparison

Feature	Kimi Code CLI	Claude Code
License	Apache 2.0	Closed
Base model	Kimi K2.5 (open source)	Claude (closed)
IDE integration	VS Code, Cursor, Zed, JetBrains	VS Code, JetBrains
MCP	Yes	Yes
Agent Swarm	Yes	No
Installation	pip (Python 3.10+)	npm
API price	$0.60/$2.50 per 1M tokens	$3/$15 per 1M tokens
Shell mode	Ctrl-X toggle	Built-in
GitHub Stars	6,400+	40,000+
Maturity	Newer, active development	More mature

Both tools have their strengths. Claude Code is more mature and has a larger community. Kimi Code is cheaper, open-source, and offers unique Agent Swarm capabilities. The choice depends on priorities: budget and openness vs stability and ecosystem.

Moonshot AI - the company behind Kimi

Moonshot AI is a Chinese startup founded in March 2023. The company quickly secured funding from major tech players:

February 2024 - $1 billion round led by Alibaba Group, valuation $2.5 billion
October 2025 - ~$600 million round led by IDG Capital with participation from Tencent, valuation $3.8 billion

Moonshot AI stands out with its open-source strategy in a region where most AI companies focus on closed models. Their approach builds developer community trust and accelerates adoption.

Limitations and challenges

Regional availability - API hosted in China, which may mean higher latency from Europe
Documentation - Some documentation available primarily in Chinese
Ecosystem - Smaller ecosystem of tools and integrations than OpenAI or Anthropic
Windows support - Kimi Code CLI requires WSL 2, no native Windows support
Agent Swarm in Beta - Technology still in testing phase, possible instabilities
Geopolitics - A Chinese AI model may raise regulatory concerns in some organizations

FAQ

Is Kimi K2.5 free?

The chatbot on kimi.com is free with daily limits. The API is paid ($0.60/$2.50 per million input/output tokens). The model weights are openly available, so you can run the model locally if you have the appropriate hardware.

Can I run Kimi locally?

Yes, the model is available on Hugging Face. However, the full model requires significant GPU resources due to its 1 trillion parameters. The INT4 quantized version is more accessible.
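A back-of-the-envelope calculation shows why quantization matters at this scale. The sketch counts weight storage only, assuming every parameter is stored at the stated precision; real checkpoints also need memory for the KV cache and activations, and may keep some layers at higher precision.

```python
TOTAL_PARAMS = 1_000_000_000_000  # 1 trillion parameters, as quoted in the text

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"BF16": 2.0, "INT8": 1.0, "INT4": 0.5}


def weight_memory_gb(total_params: int, bytes_per_param: float) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return total_params * bytes_per_param / 1e9


for precision, size in BYTES_PER_PARAM.items():
    print(f"{precision}: ~{weight_memory_gb(TOTAL_PARAMS, size):,.0f} GB")
```

Even at INT4 the weights alone come to roughly 500 GB, so "more accessible" still means a multi-GPU server rather than a workstation.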

Is the Kimi API compatible with OpenAI?

Yes, the Kimi API uses a format compatible with OpenAI. You can use the official OpenAI SDK, changing only the base_url and api_key.

How does Kimi K2.5 handle Polish?

The model supports multiple languages, including Polish. The quality of Polish responses is good, though it achieves the best results in English and Chinese, which were its primary training languages.

How does Agent Swarm differ from a regular agent?

An agent executes tasks sequentially - one after another. Agent Swarm decomposes a task into subtasks and assigns them to multiple agents working in parallel, drastically reducing the completion time of complex tasks.

Can Kimi Code CLI replace Claude Code?

It depends on your needs. Kimi Code is cheaper and open-source, but Claude Code is more mature and has a larger ecosystem. For budget projects, or when openness matters to you, Kimi Code is a solid alternative.

Summary

Kimi from Moonshot AI is one of the most interesting players in the AI market in 2026. The combination of a 1 trillion parameter model, open weights, affordable API pricing, and breakthrough Agent Swarm technology makes Kimi K2.5 a serious alternative to closed models from OpenAI, Anthropic, or Google.

For programmers, three aspects are particularly interesting: SWE-Bench results (76.8%), visual coding with OCRBench (92.3%), and Kimi Code CLI as an open-source alternative to Claude Code.

If you're looking for a powerful AI model for coding that won't break the budget - Kimi K2.5 deserves serious consideration.