Google DeepMind • Multimodal

Gemini 2.5 Pro

Google's most capable multimodal model with a groundbreaking 1M token context window. Excels at vision understanding, document analysis, and complex reasoning across text, image, and code.

Overview

Gemini 2.5 Pro is a major step forward in multimodal AI from Google DeepMind. Built on a natively multimodal architecture, it processes text, images, code, and documents in a single model. Its 1 million token context window can hold an entire codebase, a lengthy document, or an extended conversation in one prompt, so the input does not have to be split into chunks.

Context Window (tokens): 1M

Multimodal: Text + Image + Code

Training Data Cutoff: 2026

Benchmark Performance: Top-Tier

Performance Benchmarks

MMLU-Pro: 91.8%

Vision Understanding: 94.2%

DocVQA: 93.1%

HumanEval: 89.7%

MATH: 87.5%

Key Capabilities

🧠

Multimodal Understanding

Natively processes text, images, and code in a unified architecture. Seamlessly combines inputs for complex reasoning across modalities.

👁️

Vision Analysis

Scores 94.2% on the Vision Understanding benchmark listed above. Analyzes charts, diagrams, screenshots, photos, and handwritten content.

📄

Document Parsing

Extracts structured data from PDFs, invoices, receipts, and other complex documents while preserving their formatting, scoring 93.1% on DocVQA.

📐

Long Context Processing

1M token context window enables analysis of entire codebases, books, and lengthy document collections in a single prompt.
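
Before sending a whole codebase in one prompt, it can help to check roughly whether it fits. The sketch below is an illustration only: it assumes a heuristic of about 4 characters per token, which is not the model's actual tokenizer, and the names `estimateTokens`, `fitsInContext`, and `SourceFile` are hypothetical helpers, not part of any SDK.

```typescript
// Assumption: ~4 characters per token is a rough rule of thumb for English
// text and code. Real token counts depend on the model's tokenizer.
const CHARS_PER_TOKEN = 4;

// The 1M token context window stated on this page.
const CONTEXT_WINDOW_TOKENS = 1_000_000;

// Hypothetical shape for a file that will be pasted into the prompt.
interface SourceFile {
  path: string;
  text: string;
}

// Estimate the token count of a string using the heuristic above.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Sum the estimates for all files and check them against the window,
// leaving room for the reply (defaults to the 4096 used in the code example).
function fitsInContext(
  files: SourceFile[],
  reservedForOutput = 4096,
): { totalTokens: number; fits: boolean } {
  const totalTokens = files.reduce(
    (sum, file) => sum + estimateTokens(file.text),
    0,
  );
  return {
    totalTokens,
    fits: totalTokens + reservedForOutput <= CONTEXT_WINDOW_TOKENS,
  };
}
```

Because the estimate is approximate, it is safer to treat a result close to the limit as "does not fit" and trim the input.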

Code Example

TypeScript: using Gemini 2.5 Pro via the RusorAgent API for image analysis

import { RusorAgent } from "rusoragent-ai";

// Initialize the client with your API key
const client = new RusorAgent({
  apiKey: process.env.RUSORAGENT_API_KEY,
});

// Multimodal: Analyze an image with Gemini 2.5 Pro
async function analyzeImage() {
  const response = await client.chat.completions.create({
    model: "gemini-2.5-pro",
    messages: [
      {
        role: "user",
        content: [
          {
            type: "text",
            text: "Describe this image in detail. Extract any text or data visible."
          },
          {
            type: "image_url",
            image_url: {
              url: "https://example.com/chart.png"
            }
          }
        ]
      }
    ],
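    max_tokens: 4096, // upper bound on the length of the reply
  });

  // Print the model's reply; guard against an empty choices array
  console.log(response.choices[0]?.message?.content ?? "(no content returned)");
}

// Surface request failures instead of leaving the promise rejection unhandled
analyzeImage().catch((error) => {
  console.error("Image analysis failed:", error);
  process.exitCode = 1;
});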
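
For the document-parsing use case, the reply is plain text, so it usually needs to be validated before it is used as structured data. The following is a minimal sketch, assuming the prompt asked the model to answer with a JSON object; the `Invoice` fields (`vendor`, `total`, `currency`) and the function `parseInvoiceReply` are hypothetical and chosen only for illustration.

```typescript
// Hypothetical target shape; adjust the fields to match your own prompt.
interface Invoice {
  vendor: string;
  total: number;
  currency: string;
}

// Pull the first {...} span out of the reply, parse it, and check field types.
// Returns null when the reply holds no valid invoice object.
function parseInvoiceReply(reply: string): Invoice | null {
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) return null;

  let data: unknown;
  try {
    data = JSON.parse(reply.slice(start, end + 1));
  } catch {
    return null; // the span was not valid JSON
  }

  if (typeof data !== "object" || data === null) return null;
  const { vendor, total, currency } = data as Record<string, unknown>;

  if (
    typeof vendor !== "string" ||
    typeof total !== "number" ||
    typeof currency !== "string"
  ) {
    return null; // a field is missing or has the wrong type
  }
  return { vendor, total, currency };
}
```

In the example above, this would be called on `response.choices[0]?.message?.content` once the prompt requests JSON output; a `null` result is a signal to retry or to fall back to manual review.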
            

Start Using Gemini 2.5 Pro Today

Access Google DeepMind's most powerful multimodal model through our unified API. Pay only for what you use.

View Pricing & Get Started