Methodology

How the score works

AGI Index is a monthly progress score from 0 to 100. It summarizes public signals across seven capability areas and discounts weak, promotional, or hard-to-verify claims.

Categories

  • Agents: tool use, planning, autonomy, and task completion.
  • Coding: software engineering and repository-level development.
  • Reasoning: math, logic, long-context analysis, and planning.
  • Science: research assistance, discovery, simulation, and lab workflows.
  • Multimodal: image, audio, video, document, and mixed-input understanding.
  • Robotics: embodied control and real-world physical task performance.
  • Reliability: robustness, reproducibility, safety, and consistency.

Monthly score

Each category is scored from recent public evidence. The overall index is a weighted blend of the seven category scores: agents, coding, reasoning, and science carry the most weight, while multimodal, robotics, and reliability still contribute.
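
As a rough illustration of the weighted blend, the sketch below combines seven category scores into one 0 to 100 value. The weights are hypothetical: the methodology states only which categories are emphasized, not the numbers, and the names WEIGHTS and overall_index are made up for this example.

```python
# Hypothetical weights (assumption): chosen only so that agents, coding,
# reasoning, and science count more than the other three categories.
# The actual weights used by the index are not published here.
WEIGHTS = {
    "agents": 0.20,
    "coding": 0.20,
    "reasoning": 0.20,
    "science": 0.16,
    "multimodal": 0.08,
    "robotics": 0.08,
    "reliability": 0.08,
}


def overall_index(category_scores):
    """Blend per-category scores (each 0 to 100) into one overall score."""
    missing = set(WEIGHTS) - set(category_scores)
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")

    # Normalize by the weight total so the result stays on the 0-100 scale
    # even if the weights do not sum to exactly 1.
    total_weight = sum(WEIGHTS.values())
    blended = sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS)
    blended /= total_weight

    # Clamp to guard against out-of-range inputs.
    return max(0.0, min(100.0, blended))


if __name__ == "__main__":
    example = {name: 50.0 for name in WEIGHTS}
    example["coding"] = 70.0
    print(round(overall_index(example), 1))
```

Dividing by the weight total is a design choice that keeps the blend well-behaved if a weight is later adjusted without rebalancing the others.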

Confidence

Confidence is higher when signals are independently verifiable, reproducible, and tied to concrete evaluations. It is lower when the evidence rests on claims whose methods are not public.
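
One way to picture this is a simple checklist over each signal. The sketch below is an assumption, not the index's actual formula: the Signal fields mirror the criteria named above, while the equal weighting of the three checks and the 0.5 penalty for non-public methods are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    """One piece of public evidence (hypothetical structure)."""
    independently_verifiable: bool
    reproducible: bool
    tied_to_evaluation: bool
    public_methods: bool


def confidence(signal):
    """Return a confidence value between 0.0 and 1.0 for one signal."""
    checks = [
        signal.independently_verifiable,
        signal.reproducible,
        signal.tied_to_evaluation,
    ]
    # Fraction of the three criteria met (equal weighting is an assumption).
    score = sum(checks) / len(checks)

    # Hypothetical penalty: halve confidence when methods are not public.
    if not signal.public_methods:
        score *= 0.5
    return score


if __name__ == "__main__":
    benchmark = Signal(True, True, True, True)
    press_release = Signal(False, False, True, False)
    print(confidence(benchmark), confidence(press_release))
```

Under these assumptions a fully documented benchmark result scores 1.0, while an announcement that cites an evaluation but publishes no methods scores far lower.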