◆ Methodology
Your overall score is a weighted average of dimensions. Each dimension below
explains what it measures, why it matters, how it's computed, and what to do to grow it.
Cross-tool dimensions
These dimensions score across Claude Code, Codex, and Cowork where the underlying signal exists. Source coverage is noted per dimension.
Customization
What it measures.
Whether you configure your AI environment — CLAUDE.md, AGENTS.md, and MCP configs.
Why it matters.
The biggest leap from "user" to "operator" is when you start shaping the tool to your work, not the other way around.
How it's measured.
Sum of custom_mcp_config_writes
and claude_md_writes
(or agents_md_writes
for Codex)
over the rolling 30 days, log-normalized. Skill file authorship is tracked separately under Custom Skills.
How to grow it.
Add a CLAUDE.md to your project that captures your conventions. Add one MCP server you'd use weekly.
Sources: Claude · Codex · Cowork
Parallel Agents
What it measures.
Whether you're running multiple agents in parallel within a session.
Why it matters.
Parallel orchestration is how senior operators compress hours into minutes — branching work across subagents and converging the results.
How it's measured.
Weighted blend of breadth (sessions with orchestration), depth (parallel agent turns), and peak (max parallel agents in any session).
How to grow it.
Use the Agent tool to fan out independent subtasks. When two pieces of work don't depend on each other, launch them at the same time.
Sources: Claude · Codex · Cowork
Background Work
(was Context Leverage)
What it measures.
Whether you delegate work to run unattended — Cron, Task, or Monitor primitives.
Why it matters.
Senior operators don't sit and wait. They queue work, walk away, and come back to results.
How it's measured.
Sessions in which you used context-leverage tools (TaskCreate, ScheduleWakeup, CronCreate, Monitor, etc.), log-normalized to a reference of 80.
How to grow it.
The next time a subtask will take a while, schedule it via Task and switch contexts. Use Cron for nightly jobs you'd otherwise babysit.
Sources: Claude
Tool Breadth
What it measures. How broadly you engage with skills and MCP servers.
Why it matters.
Skills and MCPs are how Claude Code reaches outside the conversation — into your tools, services, and data.
How it's measured.
Distinct count of skill_counts
+ mcp_server_counts
in the 30-day window, log-normalized to a reference of 100.
How to grow it.
Install one MCP server you'd actually use weekly. Try a built-in skill you've been ignoring.
Sources: Claude · Codex · Cowork
Planning
What it measures. Whether you use plan mode before non-trivial work.
Why it matters. Plans cut rework. Power users front-load thinking.
How it's measured.
Sessions with plan mode invocations (or update_plan
calls in Codex), log-normalized to a reference of 10.
How to grow it. When a task touches more than two files, plan first.
Sources: Claude · Codex
Repetition NEW
What it measures.
When you use a skill, do you go deep — or just try once?
Why it matters.
Repeat usage signals that a skill has become real practice, not a one-off experiment.
How it's measured.
Total invocations across skill_counts
in the 30-day window, log-normalized to a reference of 20.
How to grow it.
Pick one or two skills and use them repeatedly across a week, instead of trying twenty different things once each.
Sources: Claude · Codex · Cowork
Custom Skills NEW
What it measures.
How many of your own custom skills you've authored and actively practice.
Why it matters.
Authoring a skill is one thing; making it part of your daily flow is the real signal. Writing your own automation and then using it repeatedly shows both investment and payoff.
How it's measured.
Log-normalized blend of two signals: the set of skill names you authored (detected via Write/Edit/apply_patch calls on
~/.claude/skills/<name>/...
or ~/.codex/skills/<name>/...
paths in your transcripts) intersected with your skill_counts
keys, plus the summed practice counts for those authored skills.
Namespace suffix-match applies: a plugin-installed skill named
<plugin>:<name>
still credits you if <name>
matches a skill you authored locally and shipped back.
How to grow it.
Author a skill that automates a recurring task. Use it in real workflows. Share it back to your company plugin — the namespace suffix-match means you still get credit. Author another.
Sources: Claude · Codex · Cowork
Multi-Tasking NEW
What it measures.
How many concurrent sessions you run at peak — multi-tasking across distinct workstreams.
Why it matters.
Operators who run several workstreams at once are squeezing more value from the tool than serial users.
How it's measured.
Peak max_concurrent_sessions
over the 30-day window, log-normalized to a reference of 15.
How to grow it.
Try Cowork or run two terminals. The next time you're blocked on one task, start the next.
Sources: Claude · Codex · Cowork
Tracked but no longer scored
We removed these from the score because they were too easily saturated, didn't separate users in practice, or measured the wrong thing. We still capture the raw data so we can revisit if the user base diversifies.
-
Prompt Efficiency — was constant 100 across users.
-
Tool Diversity — saturated at the top of the cohort.
-
Conversation Depth — saturated similarly.
-
Basic Chat — measured the floor, not the ceiling.