Kimi

Kimi is an advanced AI model by Moonshot AI with a 1 trillion parameter MoE architecture. Guide to Kimi K2.5, Agent Swarm, Kimi Code CLI, API, and practical programming applications.

Kimi - complete guide to Moonshot's AI

What is Kimi?

Kimi is a family of large language models (LLMs) created by the Chinese company Moonshot AI, founded in March 2023. The Kimi chatbot became publicly available in November 2023 and has since evolved from a simple conversational assistant into one of the most capable open-source model families.

The latest model in the family, Kimi K2.5 (January 2026), is a natively multimodal model with a Mixture of Experts (MoE) architecture totaling 1 trillion parameters, of which 32 billion are active during inference. Only about 3.2% of the weights are used for any given token, so per-token compute is closer to that of a 32B model, although all weights must still be held in memory.
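To make the "active parameters" idea concrete, here is a minimal sketch of top-k expert routing, the general mechanism behind MoE layers. The expert count, the value of k, and the random gate scores are hypothetical values chosen for illustration; they are not Kimi's actual configuration.

```python
import math
import random


def top_k_route(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)[:k]
    # Softmax over the selected experts only, so their weights sum to 1.
    peak = max(gate_scores[i] for i in top)
    exps = [math.exp(gate_scores[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


# Hypothetical sizes, for illustration only.
NUM_EXPERTS = 16
TOP_K = 2

rng = random.Random(0)
scores = [rng.gauss(0, 1) for _ in range(NUM_EXPERTS)]  # stand-in for a learned gate
routes = top_k_route(scores, TOP_K)
print(f"token routed to {len(routes)} of {NUM_EXPERTS} experts: {routes}")

# Fraction of parameters active per token, using the figures quoted above.
active_fraction = 32e9 / 1e12
print(f"active fraction: {active_fraction:.1%}")
```

Only the selected experts run for a given token; the rest of the layer's weights stay idle, which is where the compute savings come from.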

Moonshot AI stands out with its consistent open-source approach. Most models in the Kimi family are available for download and modification, making them an attractive alternative to closed models from OpenAI or Anthropic.

Why Kimi?

Key advantages of Kimi

  1. MoE architecture - 1 trillion parameters with 32B active provides excellent quality-to-cost ratio
  2. Agent Swarm - Unique technology coordinating up to 100 AI agents working in parallel
  3. Native multimodality - Vision and text trained together from the start, not bolted on separately
  4. Open source - Model weights publicly available under a modified MIT license
  5. Kimi Code CLI - Terminal-based AI coding tool, open source under Apache 2.0
  6. Affordable API pricing - $0.60/million input tokens, $2.50/million output tokens
  7. 256K context window - Handles very long documents and codebases

Kimi K2.5 vs Claude Sonnet 4.5 vs GPT-4o

| Feature | Kimi K2.5 | Claude Sonnet 4.5 | GPT-4o |
| --- | --- | --- | --- |
| Architecture | MoE 1T/32B | Dense | Dense |
| Context window | 256K | 200K | 128K |
| Price per 1M tokens (input/output) | $0.60/$2.50 | $3/$15 | $2.50/$10 |
| SWE-Bench Verified | 76.8% | 70.3% | 69.1% |
| Multimodality | Text + image + video | Text + image | Text + image + audio |
| Open source | Yes (modified MIT) | No | No |
| Agent Swarm | Yes (up to 100 agents) | No | No |
| Visual coding | 92.3% OCRBench | Good | Good |

Evolution of Kimi models

Kimi K1.5 (January 2025)

The first model that put Moonshot AI on the global competition map. K1.5 matched OpenAI o1's performance in mathematics, coding, and multimodal reasoning.

Kimi-VL (April 2025)

An open-source vision model with 16 billion parameters (MoE architecture, 3B active). Compact yet surprisingly effective at visual tasks.

Kimi-Dev (June 2025)

A coding-focused model with 72B parameters, based on Qwen2.5-72B. It achieved state-of-the-art results among open-source models on the SWE-bench Verified benchmark, becoming a serious alternative to commercial coding models.

Kimi K2 (July 2025)

A breakthrough moment - a model with 1 trillion parameters (MoE, 32B active), trained on 15.5 trillion tokens. Released under a modified MIT license.

In September 2025, an updated version of K2 appeared with a doubled context window (128K → 256K tokens) and improved performance on agentic tasks.

Kimi K2 Thinking (November 2025)

A version of K2 optimized for advanced reasoning. It can perform 200-300 sequential tool calls autonomously. Benchmarks showed it outperforming GPT-5 and Claude Sonnet 4.5 on tests such as Humanity's Last Exam (44.9%) and BrowseComp (60.2%).

Training cost: approximately $4.6 million - a fraction of what the largest AI companies spend.

Kimi K2.5 (January 2026)

The latest model, a multimodal evolution of K2. It adds native vision capabilities through the MoonViT encoder (400M parameters). It processes both images and video, enabling agentic tasks such as replicating user journeys on websites based solely on video recordings.

Four modes of operation

Kimi K2.5 offers four modes tailored to different needs:

Instant

Quick answers to simple questions. Minimal latency, ideal for everyday tasks like translations, summaries, or quick code questions.

Thinking

Step-by-step reasoning mode. The model "thinks aloud," breaking complex problems into smaller parts. Great for debugging, mathematics, and logical puzzles.

Agent

A single agent with tool access. It can browse the internet, execute code, read files, and carry out multi-step tasks autonomously. It can chain roughly 200-300 sequential tool calls in a single task.
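The sequential tool-calling pattern can be sketched as a simple loop: ask the model for the next action, run the requested tool, append the result, and repeat until a final answer arrives or a step budget runs out. This is a generic illustration, not Kimi's implementation; `run_agent`, `fake_model`, and the action dictionary layout are hypothetical names introduced here.

```python
from typing import Callable


def run_agent(model_step: Callable[[list], dict], tools: dict, task: str,
              max_steps: int = 300) -> str:
    """Run tools one at a time until the model returns a final answer."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        action = model_step(history)
        if action["type"] == "final":
            return action["content"]
        # Execute the requested tool and feed its result back to the model.
        result = tools[action["tool"]](**action["args"])
        history.append({"role": "tool", "name": action["tool"], "content": str(result)})
    raise RuntimeError("step budget exhausted without a final answer")


def fake_model(history: list) -> dict:
    """Stand-in for a real model: request one tool call, then answer."""
    if len(history) == 1:
        return {"type": "tool", "tool": "add", "args": {"a": 2, "b": 3}}
    return {"type": "final", "content": f"The sum is {history[-1]['content']}"}


answer = run_agent(fake_model, {"add": lambda a, b: a + b}, "What is 2 + 3?")
print(answer)
```

The `max_steps` cap plays the role of the tool-call budget mentioned above: each iteration is one sequential call.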

Agent Swarm (Beta)

The most advanced mode. It decomposes a task into subtasks and delegates them to a swarm of sub-agents (up to 100) working in parallel.

Agent Swarm - breakthrough technology

Agent Swarm is the most distinguishing feature of Kimi K2.5. Instead of a single agent executing tasks sequentially, Agent Swarm coordinates a swarm of up to 100 specialized sub-agents working in parallel.

How does Agent Swarm work?

  1. Task decomposition - The orchestrator analyzes the task and splits it into independent subtasks
  2. Agent allocation - Each subtask is assigned to a specialized sub-agent
  3. Parallel execution - Sub-agents work simultaneously, coordinating through the orchestrator
  4. Result aggregation - The orchestrator collects and combines results into a coherent response
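The four steps above can be sketched with a thread pool: split the work, run the pieces concurrently, and merge the results. This is a toy illustration of the pattern only. The function names are hypothetical, the sub-agent does placeholder work, and nothing here reflects how Moonshot's orchestrator is actually built.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(items, num_agents):
    """Step 1: split items into contiguous chunks, at most one per sub-agent."""
    if not items:
        return []
    size = -(-len(items) // num_agents)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def sub_agent(chunk):
    """Steps 2-3: hypothetical worker; a real one would call a model and tools."""
    return {site: len(site) for site in chunk}


def run_swarm(items, num_agents):
    chunks = decompose(items, num_agents)
    # Step 3: sub-agents run concurrently.
    with ThreadPoolExecutor(max_workers=num_agents) as pool:
        partials = list(pool.map(sub_agent, chunks))
    # Step 4: the orchestrator merges partial results into one report.
    merged = {}
    for partial in partials:
        merged.update(partial)
    return merged


sites = [f"site-{n}" for n in range(1, 51)]
report = run_swarm(sites, num_agents=5)
print(len(report), "sites analyzed")
```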

Technical details

Agent Swarm uses Parallel-Agent Reinforcement Learning (PARL) with a trainable orchestrator. Training uses staged reward shaping to prevent "serial collapse" (agents reverting to sequential behavior) and "spurious parallelism" (fake parallelism without real benefits).

The Critical Steps metric emphasizes latency optimization - what matters is not just correctness but also speed.
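The text does not spell out how Critical Steps is computed. One plausible reading, by analogy with a critical path, is that each stage costs the orchestrator's own steps plus the steps of its slowest sub-agent, since sub-agents within a stage run in parallel. The sketch below implements that assumed definition; the stage data is made up.

```python
def critical_steps(stages):
    """Assumed definition: per stage, orchestrator steps + the slowest sub-agent."""
    return sum(main + max(subs, default=0) for main, subs in stages)


def total_steps(stages):
    """Total work performed, regardless of parallelism."""
    return sum(main + sum(subs) for main, subs in stages)


# Each stage: (orchestrator_steps, [steps taken by each parallel sub-agent])
stages = [
    (2, [10, 12, 9]),  # three sub-agents in parallel
    (1, [5, 5]),       # two sub-agents in parallel
    (3, []),           # orchestrator-only stage (final aggregation)
]

print("critical steps:", critical_steps(stages))
print("total steps:", total_steps(stages))
```

Under this reading, adding sub-agents only helps when it shortens the slowest branch, which is why the metric rewards real parallelism rather than merely spawning more agents.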

Results

  • Up to 1,500 coordinated tool calls in a single task
  • Execution time reduction of up to 4.5x compared to sequential approach
  • BrowseComp benchmark: 78.4% (Agent Swarm) vs 60.2% (Agent mode)
Code
TEXT
Example: task "analyze 50 competitor websites"

Traditional agent:
  → site 1 → site 2 → ... → site 50 → report
  Time: ~25 minutes

Agent Swarm:
  → [agent 1: sites 1-10] [agent 2: sites 11-20] ... [agent 5: sites 41-50]
  → aggregator → report
  Time: ~6 minutes
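The timings in the example above can be reproduced with simple arithmetic. The per-site time (0.5 minutes) follows from 25 minutes for 50 sites; the one-minute aggregation overhead is an assumption chosen here to match the roughly 6-minute figure.

```python
def sequential_minutes(num_sites, minutes_per_site):
    """One agent visits every site in turn."""
    return num_sites * minutes_per_site


def swarm_minutes(num_sites, num_agents, minutes_per_site, aggregation_minutes):
    """Agents work in parallel; wall time is the busiest agent plus aggregation."""
    sites_per_agent = -(-num_sites // num_agents)  # ceiling division
    return sites_per_agent * minutes_per_site + aggregation_minutes


seq = sequential_minutes(50, 0.5)
par = swarm_minutes(50, num_agents=5, minutes_per_site=0.5, aggregation_minutes=1.0)
speedup = seq / par
print(f"sequential: {seq} min, swarm: {par} min, speedup: {speedup:.1f}x")
```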

Kimi Code CLI

Kimi Code CLI is an open-source terminal tool for AI-assisted coding, comparable to Claude Code from Anthropic. It works directly in the terminal and supports code reading/editing, shell command execution, and multi-step agentic tasks.

Installation

Code
Bash
pip install kimi-cli

Requirements:

  • Python 3.10+ (3.13 recommended)
  • uv (Python package manager)
  • On Windows: WSL 2

Basic usage

Code
Bash
kimi chat "Explain this code"

kimi chat "Refactor the parseUserInput function in src/utils.ts"

kimi chat "Write unit tests for the auth module"

Shell mode

Press Ctrl-X during a session to toggle the built-in shell mode - you can execute commands without leaving Kimi.

MCP (Model Context Protocol)

Kimi Code CLI supports custom tools via MCP:

Code
Bash
kimi mcp add my-tool --command "node my-tool-server.js"

kimi mcp list

kimi chat --mcp-config-file ./project-mcp.json "Analyze the project"
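A project-level MCP config file might look like the output of the snippet below. The layout is an assumption: it follows the `mcpServers` convention used by many MCP clients, and the server name and command mirror the `kimi mcp add` example above. Check the Kimi Code CLI documentation for the exact schema before relying on it.

```python
import json

# Assumed layout: a top-level "mcpServers" object keyed by server name.
# "my-tool" and its command are taken from the `kimi mcp add` example.
mcp_config = {
    "mcpServers": {
        "my-tool": {
            "command": "node",
            "args": ["my-tool-server.js"],
        }
    }
}

config_text = json.dumps(mcp_config, indent=2)
print(config_text)  # save this output as ./project-mcp.json
```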

IDE integration

Kimi Code CLI supports the Agent Client Protocol (ACP), enabling integration with editors:

  • VS Code - Dedicated Kimi Code extension with chat panel, slash commands, diff preview
  • Cursor - Via ACP
  • Zed - Via ACP
  • JetBrains - Via ACP

Kimi API - getting started

Registration and API key

Register at platform.moonshot.ai and generate an API key in the dashboard.

SDK installation

Code
Bash
pip install openai

The Kimi API is compatible with the OpenAI format, so you can use the official OpenAI SDK and only change the base URL and API key.

Simple Python example

Code
Python
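from openai import OpenAI

# The client talks to Kimi through the OpenAI-compatible endpoint.
client = OpenAI(
    api_key="your-kimi-api-key",  # placeholder; load the real key from a secret store
    base_url="https://api.moonshot.cn/v1",  # confirm this matches the platform that issued your key
)

response = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[
        {"role": "system", "content": "You are a helpful programming assistant."},
        {"role": "user", "content": "Write a bubble sort function in TypeScript."}
    ],
    temperature=0.7,
)

# The generated text is in the first choice.
print(response.choices[0].message.content)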

TypeScript example

Code
TypeScript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "your-kimi-api-key",
  baseURL: "https://api.moonshot.cn/v1",
});

async function askKimi(prompt: string): Promise<string> {
  const response = await client.chat.completions.create({
    model: "kimi-k2.5",
    messages: [
      { role: "system", content: "You are a helpful coding assistant." },
      { role: "user", content: prompt },
    ],
    temperature: 0.7,
  });

  return response.choices[0].message.content ?? "";
}

// Top-level await requires an ES module context.
const answer = await askKimi("Explain the difference between map and flatMap in TypeScript");
console.log(answer);

Streaming

Code
Python
from openai import OpenAI

client = OpenAI(
    api_key="your-kimi-api-key",
    base_url="https://api.moonshot.cn/v1",
)

stream = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[
        {"role": "user", "content": "Write a tutorial about React Hooks"}
    ],
    stream=True,
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
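If you need the complete text after streaming finishes, the chunks can be accumulated instead of printed. The sketch below assumes chunks have the OpenAI SDK shape used above (`choices[0].delta.content`) and also skips chunks with an empty `choices` list, which OpenAI-format streams can contain. `collect_stream` and `fake_chunk` are hypothetical helpers, and the stream is simulated so the example runs without network access.

```python
from types import SimpleNamespace


def collect_stream(stream):
    """Join the text deltas of a chat completion stream into one string."""
    parts = []
    for chunk in stream:
        if not chunk.choices:  # some chunks carry no choices at all
            continue
        piece = chunk.choices[0].delta.content
        if piece:  # content is None or empty on role-only and final chunks
            parts.append(piece)
    return "".join(parts)


def fake_chunk(text):
    """Build an object shaped like a streamed chunk, for local testing."""
    return SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content=text))])


fake_stream = [
    fake_chunk("Hello"),
    fake_chunk(None),
    SimpleNamespace(choices=[]),
    fake_chunk(", world"),
]
text = collect_stream(fake_stream)
print(text)
```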

Image analysis

Code
Python
from openai import OpenAI

client = OpenAI(
    api_key="your-kimi-api-key",
    base_url="https://api.moonshot.cn/v1",
)

response = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this user interface and suggest UX improvements."},
                {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
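For local files, OpenAI-format endpoints commonly accept a base64 data URL in place of a public image URL. The helper below builds such a message; whether the Kimi endpoint accepts data URLs for a given model is an assumption to verify against the API documentation, and `image_message` is a hypothetical helper name.

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build a user message that embeds an image as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:{mime_type};base64,{encoded}"}},
        ],
    }


# In practice: image_bytes = open("screenshot.png", "rb").read()
sample_bytes = b"\x89PNG\r\n\x1a\n"  # just the PNG signature, as a stand-in
message = image_message("Describe this user interface.", sample_bytes)
print(message["content"][1]["image_url"]["url"][:22])
```

The returned dictionary can be passed directly as an element of the `messages` list in the request shown above.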

Visual coding - from image to code

One of the most impressive capabilities of Kimi K2.5 is converting screenshots into working code. With a 92.3% score on OCRBench, the model can read a user interface from a screenshot and generate corresponding React, Vue, or plain HTML code.

Code
Python
# Reuses the `client` created in the earlier examples.
response = client.chat.completions.create(
    model="kimi-k2.5",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Convert this UI screenshot to a React component using Tailwind CSS."},
                {"type": "image_url", "image_url": {"url": "https://example.com/dashboard.png"}},
            ],
        }
    ],
)

# The reply contains the generated component source as text.
print(response.choices[0].message.content)

Pricing

Kimi K2.5 API

| Variant | Input (per 1M tokens) | Output (per 1M tokens) | Context |
| --- | --- | --- | --- |
| K2.5 Instant | $0.60 | $2.50 | 256K |
| K2.5 Thinking | $0.60 | $2.50 | 256K |
| K2.5 Agent | $0.60 | $2.50 | 256K |

Cost comparison

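| Model | Input (per 1M tokens) | Output (per 1M tokens) | Cost relative to Kimi |
| --- | --- | --- | --- |
| Kimi K2.5 | $0.60 | $2.50 | 1x (baseline) |
| Claude Sonnet 4.5 | $3.00 | $15.00 | 5-6x more expensive |
| GPT-4o | $2.50 | $10.00 | 4x more expensive |
| Gemini 1.5 Pro | $3.50 | $10.50 | 4-6x more expensive |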

Kimi K2.5 is the cheapest of the four models compared here, at roughly a quarter to a sixth of the per-token price of the others. For startups and budget-constrained projects, this is a compelling argument.
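To see what the price table means for a real bill, the per-million rates can be applied to a token count. The prices below are copied from the table; the workload (10M input tokens, 2M output tokens) is a made-up example.

```python
# (input, output) price in USD per 1M tokens, from the cost comparison table.
PRICES = {
    "Kimi K2.5": (0.60, 2.50),
    "Claude Sonnet 4.5": (3.00, 15.00),
    "GPT-4o": (2.50, 10.00),
    "Gemini 1.5 Pro": (3.50, 10.50),
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of a workload at the listed per-million-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical monthly workload: 10M input tokens, 2M output tokens.
costs = {model: cost_usd(model, 10_000_000, 2_000_000) for model in PRICES}
for model, cost in costs.items():
    print(f"{model}: ${cost:.2f}")
```

The overall ratio depends on the mix of input and output tokens, which is why the table gives a range rather than a single multiplier.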

Free access

Kimi.com offers free access to the chatbot with daily limits. For basic tasks - writing, translations, simple code questions - the free plan is perfectly sufficient.

Benchmarks

Coding

| Benchmark | Kimi K2.5 | Claude Sonnet 4.5 | GPT-4o |
| --- | --- | --- | --- |
| SWE-Bench Verified | 76.8% | 70.3% | 69.1% |
| HumanEval | 92.1% | 90.4% | 90.2% |
| LiveCodeBench | 68.5% | 64.8% | 62.3% |

Reasoning and knowledge

| Benchmark | Kimi K2.5 | Claude Sonnet 4.5 | GPT-4o |
| --- | --- | --- | --- |
| MMMU Pro | 78.5% | 74.1% | 72.6% |
| Humanity's Last Exam | 44.9% | 38.2% | 35.7% |
| GPQA Diamond | 71.2% | 68.4% | 67.5% |

Vision and multimodality

| Benchmark | Kimi K2.5 | Claude Sonnet 4.5 | GPT-4o |
| --- | --- | --- | --- |
| OCRBench | 92.3% | 87.1% | 85.4% |
| VideoMMMU | 86.6% | - | 78.2% |
| MathVista | 74.8% | 71.5% | 70.1% |

Agentic tasks

| Benchmark | Kimi K2.5 (Swarm) | Kimi K2.5 (Agent) | Claude Sonnet 4.5 |
| --- | --- | --- | --- |
| BrowseComp | 78.4% | 60.2% | 52.1% |
| WebArena | 71.3% | 58.7% | 54.8% |

Practical applications

Code refactoring

Kimi K2.5 excels at refactoring large codebases. With a 256K token context window, you can pass it multiple files simultaneously.

Code
TypeScript
// classComponentCode: source of the class component, loaded elsewhere as a string.
const prompt = `
Refactor the following React code from class components to functional components with hooks.
Maintain identical behavior and TypeScript types.

${classComponentCode}
`;
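When passing several files at once, it helps to check that they fit the context window before sending. The sketch below packs files into one prompt under a token budget. The 4-characters-per-token ratio is a rough heuristic rather than Kimi's tokenizer, and `pack_files`, the reserve size, and the `### path` header format are assumptions introduced for illustration.

```python
def pack_files(files: dict[str, str], max_tokens: int = 256_000,
               reserve: int = 16_000, chars_per_token: int = 4):
    """Concatenate files into one prompt, skipping any that exceed the budget.

    `reserve` leaves room for the instructions and the model's reply.
    Returns (prompt_text, skipped_paths).
    """
    budget_chars = (max_tokens - reserve) * chars_per_token
    parts, skipped, used = [], [], 0
    for path, source in files.items():
        block = f"### {path}\n{source}\n"
        if used + len(block) > budget_chars:
            skipped.append(path)
            continue
        parts.append(block)
        used += len(block)
    return "\n".join(parts), skipped


# Tiny budget so the example shows a file being skipped.
files = {"a.ts": "x" * 100, "b.ts": "y" * 1000}
prompt_text, skipped = pack_files(files, max_tokens=100, reserve=50)
print("skipped:", skipped)
```

For precise limits, replace the character heuristic with counts from the provider's own tokenizer.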

Test generation

Code
TypeScript
// userProfileComponent: source of the component under test, loaded elsewhere as a string.
const prompt = `
Generate unit tests (Jest + React Testing Library) for the UserProfile component.
Cover scenarios: loading state, error state, successful render, user interaction.

${userProfileComponent}
`;

Code review

Code
TypeScript
// diffContent: the pull request diff as a string (for example, the output of git diff).
const prompt = `
Review this pull request for:
- Potential bugs
- Performance issues
- Security (OWASP Top 10)
- TypeScript best practices compliance

${diffContent}
`;

API documentation

Code
TypeScript
// nestjsControllers: source of the NestJS controllers, loaded elsewhere as a string.
const prompt = `
Based on this NestJS code, generate OpenAPI documentation in YAML format.
Include all endpoints, parameters, response types, and error codes.

${nestjsControllers}
`;

Kimi vs Claude Code - CLI tools comparison

| Feature | Kimi Code CLI | Claude Code |
| --- | --- | --- |
| License | Apache 2.0 | Closed |
| Base model | Kimi K2.5 (open source) | Claude (closed) |
| IDE integration | VS Code, Cursor, Zed, JetBrains | VS Code |
| MCP | Yes | Yes |
| Agent Swarm | Yes | No |
| Installation | pip (Python 3.10+) | npm |
| API price | $0.60/$2.50 per 1M tokens | $3/$15 per 1M tokens |
| Shell mode | Ctrl-X toggle | Built-in |
| GitHub Stars | 6,400+ | 40,000+ |
| Maturity | Newer, active development | More mature |

Both tools have their strengths. Claude Code is more mature and has a larger community. Kimi Code is cheaper, open-source, and offers unique Agent Swarm capabilities. The choice depends on priorities: budget and openness vs stability and ecosystem.

Moonshot AI - the company behind Kimi

Moonshot AI is a Chinese startup founded in March 2023. The company quickly secured funding from major tech players:

  • February 2024 - $1 billion round led by Alibaba Group, at a $2.5 billion valuation
  • October 2025 - ~$600 million round led by IDG Capital with participation from Tencent, at a $3.8 billion valuation

Moonshot AI stands out with its open-source strategy in a region where most AI companies focus on closed models. Their approach builds developer community trust and accelerates adoption.

Limitations and challenges

  1. Regional availability - API hosted in China, which may mean higher latency from Europe
  2. Documentation - Some documentation available primarily in Chinese
  3. Ecosystem - Smaller ecosystem of tools and integrations than OpenAI or Anthropic
  4. Windows support - Kimi Code CLI requires WSL 2, no native Windows support
  5. Agent Swarm in Beta - Technology still in testing phase, possible instabilities
  6. Geopolitics - A Chinese AI model may raise regulatory concerns in some organizations

FAQ

Is Kimi K2.5 free?

The chatbot on kimi.com is free with daily limits. The API is paid ($0.60/$2.50 per million tokens). Model weights are open-source and you can run the model locally if you have the appropriate hardware.

Can I run Kimi locally?

Yes, the model is available on Hugging Face. However, the full model requires significant GPU resources due to its 1 trillion parameters. The INT4 quantized version is more accessible.
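A back-of-the-envelope estimate shows why quantization matters at this scale. The calculation below covers weights only, using standard bytes-per-parameter figures for each precision; it ignores the KV cache, activations, and quantization metadata, so real requirements are higher.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"BF16": 2.0, "FP8": 1.0, "INT4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


TOTAL_PARAMS = 1e12  # 1 trillion parameters, as stated above

for precision in BYTES_PER_PARAM:
    print(f"{precision}: {weight_memory_gb(TOTAL_PARAMS, precision):,.0f} GB")
```

Note that the full parameter count matters here, not the 32B active parameters: every expert must be loaded even though only a few run per token.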

Is the Kimi API compatible with OpenAI?

Yes, the Kimi API uses a format compatible with OpenAI. You can use the official OpenAI SDK, changing only the base_url and api_key.

How does Kimi K2.5 handle Polish?

The model supports multiple languages, including Polish. The quality of Polish responses is good, though it achieves the best results in English and Chinese, which were its primary training languages.

How does Agent Swarm differ from a regular agent?

An agent executes tasks sequentially - one after another. Agent Swarm decomposes a task into subtasks and assigns them to multiple agents working in parallel, drastically reducing the completion time of complex tasks.

Can Kimi Code CLI replace Claude Code?

It depends on your needs. Kimi Code is cheaper and open-source, while Claude Code is more mature and has a larger ecosystem. For budget-constrained projects, or when openness matters to you, Kimi Code is a solid alternative.

Summary

Kimi from Moonshot AI is one of the most interesting players in the AI market in 2026. The combination of a 1 trillion parameter model, open weights, affordable API pricing, and breakthrough Agent Swarm technology makes Kimi K2.5 a serious alternative to closed models from OpenAI, Anthropic, or Google.

For programmers, three aspects are particularly interesting: SWE-Bench results (76.8%), visual coding with OCRBench (92.3%), and Kimi Code CLI as an open-source alternative to Claude Code.

If you're looking for a powerful AI model for coding that won't break the budget, Kimi K2.5 deserves serious consideration.