Methodology
How the score works
AGI Index is a monthly progress score from 0 to 100. It summarizes public signals across seven capability areas and discounts weak, promotional, or hard-to-verify claims.
Categories
- Agents: tool use, planning, autonomy, and task completion.
- Coding: software engineering and repository-level development.
- Reasoning: math, logic, long-context analysis, and planning.
- Science: research assistance, discovery, simulation, and lab workflows.
- Multimodal: image, audio, video, document, and mixed-input understanding.
- Robotics: embodied control and real-world physical task performance.
- Reliability: robustness, reproducibility, safety, and consistency.
Monthly score
Each category receives a monthly score based on recent public evidence. The overall index is a weighted blend of the seven category scores: agents, coding, reasoning, and science carry the most weight, while multimodal, robotics, and reliability still contribute.
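The weighted blend can be sketched in a few lines of Python. This is a minimal illustration, not the index's actual formula: the function name `overall_index` and every number in `ILLUSTRATIVE_WEIGHTS` are assumptions, chosen only so that agents, coding, reasoning, and science weigh more than the other three categories. The real weights are not published in this section.

```python
# Hypothetical weights: the methodology only says that agents, coding,
# reasoning, and science are emphasized. These exact values are assumptions.
ILLUSTRATIVE_WEIGHTS = {
    "agents": 0.20,
    "coding": 0.20,
    "reasoning": 0.20,
    "science": 0.16,
    "multimodal": 0.08,
    "robotics": 0.08,
    "reliability": 0.08,
}


def overall_index(category_scores, weights=ILLUSTRATIVE_WEIGHTS):
    """Blend per-category scores (each 0 to 100) into one 0 to 100 score."""
    missing = set(weights) - set(category_scores)
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")

    # Normalize by the total weight so the result stays on the 0 to 100
    # scale even if the weights do not sum to exactly 1.
    total_weight = sum(weights.values())
    blended = sum(weights[name] * category_scores[name] for name in weights)
    blended /= total_weight

    # Clamp to guard against out-of-range inputs.
    return max(0.0, min(100.0, blended))
```

Calling `overall_index` with a dictionary that holds one score per category returns the blended value. Dividing by the total weight means the weights can be adjusted without having to renormalize them by hand.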
Confidence
Confidence is higher when signals are independently verifiable, reproducible, and tied to concrete evaluations. It is lower when the evidence rests on claims whose methods are not public.
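One way to picture these criteria is as a checklist applied to each signal. The sketch below is an assumption for illustration only: the `Signal` class, its field names, and the equal weighting of the four checks are hypothetical and do not describe how the index actually computes confidence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """Hypothetical record of how well one public signal is supported."""
    independently_verifiable: bool
    reproducible: bool
    tied_to_concrete_evaluation: bool
    public_methods: bool


def confidence(signal: Signal) -> float:
    """Return the fraction of criteria the signal meets, from 0.0 to 1.0.

    Equal weighting of the four checks is an assumption of this sketch.
    """
    checks = (
        signal.independently_verifiable,
        signal.reproducible,
        signal.tied_to_concrete_evaluation,
        signal.public_methods,
    )
    # Booleans sum as 0 or 1, so this counts the criteria that hold.
    return sum(checks) / len(checks)
```

Under this sketch, a benchmark result with published methods and an independent reproduction meets every check, while an announcement with no public methods or evaluation meets none.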