AGI Index

Categories

Agents: tool use, planning, autonomy, and task completion.
Coding: software engineering and repository-level development.
Reasoning: math, logic, long-context analysis, and planning.
Science: research assistance, discovery, simulation, and lab workflows.
Multimodal: image, audio, video, document, and mixed-input understanding.
Robotics: embodied control and real-world physical task performance.
Reliability: robustness, reproducibility, safety, and consistency.

Monthly score

Each month, every category receives a score based on recent public evidence. The overall index is a weighted blend that emphasizes agents, coding, reasoning, and science while still tracking multimodal, robotics, and reliability progress.
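A weighted blend of this kind can be sketched as follows. The page does not publish the actual weights or the score scale, so the numbers in WEIGHTS and the 0 to 100 scale are illustrative assumptions only; the one property taken from the text is that agents, coding, reasoning, and science carry more weight than multimodal, robotics, and reliability.

```python
# Hypothetical weights: the source states only that agents, coding,
# reasoning, and science are emphasized. These values are NOT the
# index's real weights.
WEIGHTS = {
    "agents": 0.20,
    "coding": 0.20,
    "reasoning": 0.20,
    "science": 0.16,
    "multimodal": 0.08,
    "robotics": 0.08,
    "reliability": 0.08,
}


def overall_index(scores: dict[str, float]) -> float:
    """Return the weighted average of per-category scores.

    Scores are assumed to be on a 0-100 scale (an assumption, not
    something the source specifies).
    """
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())
    # Dividing by the total keeps the result on the input scale even if
    # the weights do not sum to exactly 1.
    return sum(WEIGHTS[c] * scores[c] for c in WEIGHTS) / total_weight


# Made-up example scores for one month.
example_scores = {
    "agents": 60.0,
    "coding": 70.0,
    "reasoning": 65.0,
    "science": 50.0,
    "multimodal": 55.0,
    "robotics": 30.0,
    "reliability": 45.0,
}
print(round(overall_index(example_scores), 1))
```

Under these assumed weights, a one-point gain in an emphasized category such as agents moves the overall index more than a one-point gain in robotics.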

Confidence

Confidence is higher when signals are independently verifiable, reproducible, and tied to concrete evaluations. It is lower when evidence depends on claims without public methods.
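One way to picture this rule is as a checklist over each piece of evidence. The sketch below is hypothetical: the Signal fields mirror the criteria named above, but the three-level labels and the thresholds are assumptions made for illustration, not the index's published method.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    """One piece of evidence, described by the criteria named in the text."""
    independently_verifiable: bool
    reproducible: bool
    tied_to_concrete_evaluation: bool
    public_methods: bool


def confidence_label(signal: Signal) -> str:
    """Map a signal to a coarse confidence label (assumed tiers)."""
    # Claims without public methods are treated as low confidence,
    # following the second sentence of the paragraph above.
    if not signal.public_methods:
        return "low"
    met = sum([
        signal.independently_verifiable,
        signal.reproducible,
        signal.tied_to_concrete_evaluation,
    ])
    if met == 3:
        return "high"
    if met >= 1:
        return "medium"
    return "low"
```

For example, a benchmark result with a public method that others have rerun and verified would be labeled "high" here, while a vendor claim with no published method would be labeled "low" regardless of its other properties.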
