Methodology

How AILOS measures AI leadership capability.

The 7-minute diagnostic is a structured instrument. This page documents what it measures, how it scores, the research it draws from, and the current state of validation.

For the short version, download the one-page methodology PDF.


01 — The instrument

A 15-item, 5-vector self-report instrument.

Items: 15 Likert items (3 per vector), plus 2 free-text reflection prompts.
Scale: 5-point agreement scale (1 = strongly disagree, 5 = strongly agree), with anchored verbal labels.
Length: median completion time of about 7 minutes, including reflections.
Output: per-vector score (0–100, normalized), aggregate maturity score, and a maturity stage (Aware → Exploring → Operating → Scaling → Compounding).
Scoring: vector score = mean of items, rescaled to 0–100. Aggregate = weighted mean across the 5 vectors. Stage cut-points are fixed thresholds (not relative to a cohort), so individual scores are stable as the sample grows.
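The scoring rule above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the production scorer: the page does not publish the vector weights or the stage cut-points, so the equal weights and the 20-point thresholds below are hypothetical placeholders. The linear mapping from the 1–5 scale to 0–100 is also an assumption about what "rescaled" means.

```python
from statistics import mean

# HYPOTHETICAL: the published methodology does not state the weights.
VECTOR_WEIGHTS = {
    "strategic_framing": 1.0,
    "decision_architecture": 1.0,
    "team_adoption": 1.0,
    "operational_integration": 1.0,
    "ethical_risk_stance": 1.0,
}

# HYPOTHETICAL cut-points (lower bound, stage), checked from highest to lowest.
STAGE_CUT_POINTS = [
    (80.0, "Compounding"),
    (60.0, "Scaling"),
    (40.0, "Operating"),
    (20.0, "Exploring"),
    (0.0, "Aware"),
]


def vector_score(items):
    """Mean of 1-5 Likert responses, linearly rescaled to 0-100 (assumed mapping)."""
    if not items or any(not 1 <= x <= 5 for x in items):
        raise ValueError("each vector needs at least one response in the range 1-5")
    return (mean(items) - 1) / 4 * 100


def aggregate_score(vector_scores, weights=VECTOR_WEIGHTS):
    """Weighted mean of the per-vector scores."""
    total_weight = sum(weights[name] for name in vector_scores)
    weighted_sum = sum(score * weights[name] for name, score in vector_scores.items())
    return weighted_sum / total_weight


def maturity_stage(aggregate):
    """Map an aggregate score to a stage using fixed, cohort-independent thresholds."""
    for lower_bound, stage in STAGE_CUT_POINTS:
        if aggregate >= lower_bound:
            return stage
    raise ValueError("aggregate score must be between 0 and 100")


if __name__ == "__main__":
    responses = {
        "strategic_framing": [4, 5, 3],
        "decision_architecture": [3, 3, 3],
        "team_adoption": [2, 3, 4],
        "operational_integration": [1, 2, 3],
        "ethical_risk_stance": [5, 5, 5],
    }
    scores = {name: vector_score(items) for name, items in responses.items()}
    total = aggregate_score(scores)
    print(scores, total, maturity_stage(total))
```

Because the thresholds are constants rather than percentiles, the same responses always map to the same stage, regardless of how many other leaders have taken the diagnostic.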

02 — Construct definitions

The five vectors.

Strategic Framing

Reading the system

What it is. The leader's ability to locate AI inside the business model — where it creates leverage, where it adds risk, and what it changes about the value chain.

What it isn't. Not tool knowledge or prompt fluency.

Example item: “I can articulate where AI changes the unit economics of my team's work.”

Decision Architecture

Deciding with AI in the loop

What it is. How well the leader structures decisions when AI is a participant — what AI gets to recommend, what stays human, and how the trade-offs are made explicit.

What it isn't. Not delegating decisions to a model.

Example item: “For each recurring decision, I know which parts AI assists and which parts a human owns.”

Team Adoption

Moving the bench, not just yourself

What it is. The leader's effectiveness at building AI capability across their team — psychological safety to experiment, shared norms, and visible behavior change.

What it isn't. Not a count of tool licenses.

Example item: “My team openly shares what worked and what failed with AI in the last week.”

Operational Integration

Workflows that hold up

What it is. How AI is embedded into actual workflows, review gates, and quality checks — versus sitting on the side as a novelty.

What it isn't. Not pilots or sandboxes.

Example item: “Our standard operating workflows have explicit AI insertion and review points.”

Ethical & Risk Stance

Responsible by default

What it is. The leader's clarity on data, IP, bias, and disclosure trade-offs — and the ability to make those calls in front of stakeholders.

What it isn't. Not policy memorization.

Example item: “I can explain, to a non-technical exec, why we did or did not use AI for a given task.”

03 — Research basis

Where the vectors come from.

The vectors are derived from published research on AI in management, knowledge work, and organizational adoption. The instrument synthesizes these frames; it is not an endorsement by, or a publication of, the institutions named.

Oxford — Future of Work program — Frames AI as a complement-vs-substitute decision at task level, not job level.
MIT Sloan / MIT CSAIL — AI & Business strategy — Distinction between AI-augmented decision making and AI-automated execution.
Stanford HAI — Human-Centered AI — Responsible deployment patterns and human-in-the-loop review.
Harvard Business School — Generative AI & knowledge work — Effect of AI on individual vs team performance and the “jagged frontier.”
World Economic Forum — Future of Jobs — Skill-shift forecasts that shape the team-adoption vector.

04 — Validation status

Honest current state.

"Grounded in research" is a basis claim. Validation is a separate, ongoing process. We publish status, not marketing.

Content validity

Items reviewed against the published research base above and pre-tested with practitioner readers.

Status: complete

Internal consistency (Cronbach's α)

Per-vector reliability measured on the design-partner sample. Target α ≥ 0.70 per vector.

Status: pilot in progress — figures published when N is sufficient
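For readers unfamiliar with the statistic, here is a minimal sketch of how Cronbach's α could be computed for a single vector, using the standard formula α = k/(k−1) · (1 − Σ item variances / variance of respondent totals). The function name and the input layout (one row of item scores per respondent) are illustrative assumptions, and the data in the example is invented purely to exercise the code; it is not pilot data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for one vector.

    responses: list of respondents, each a list of k item scores (assumed layout).
    Uses sample variances (n - 1 denominator) throughout.
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2 or any(len(row) != k for row in responses):
        raise ValueError("every respondent needs the same k >= 2 items")

    item_columns = list(zip(*responses))           # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return k / (k - 1) * (1 - item_variance_sum / total_variance)


if __name__ == "__main__":
    # Invented illustrative data: 5 respondents x 3 items for one vector.
    sample = [[4, 5, 4], [2, 2, 3], [3, 3, 3], [5, 4, 5], [1, 2, 2]]
    alpha = cronbach_alpha(sample)
    print(f"alpha = {alpha:.2f}, meets 0.70 target: {alpha >= 0.70}")
```

With only 3 items per vector, α is sensitive to sample size, which is consistent with withholding figures until N is sufficient.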

Test–retest stability

Re-take cadence built into the product (quarterly). Stability target: r ≥ 0.75 over 30 days for leaders without intervention.

Status: pilot in progress
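The stability target is stated as a correlation, so one plausible way to check it is a Pearson r between each leader's first and second score. The sketch below assumes that reading; the function names and the paired-list input are hypothetical, and the example numbers are invented rather than drawn from the pilot.

```python
from math import sqrt
from statistics import mean


def pearson_r(first, second):
    """Pearson correlation between paired first-take and re-take scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")
    mean_first, mean_second = mean(first), mean(second)
    covariance = sum((x - mean_first) * (y - mean_second) for x, y in zip(first, second))
    spread_first = sqrt(sum((x - mean_first) ** 2 for x in first))
    spread_second = sqrt(sum((y - mean_second) ** 2 for y in second))
    if spread_first == 0 or spread_second == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return covariance / (spread_first * spread_second)


def meets_stability_target(first, second, target=0.75):
    """True when test-retest r reaches the stated target (r >= 0.75)."""
    return pearson_r(first, second) >= target


if __name__ == "__main__":
    # Invented scores for five leaders, taken 30 days apart with no intervention.
    take_1 = [62.5, 41.7, 75.0, 58.3, 33.3]
    take_2 = [66.7, 45.8, 70.8, 58.3, 37.5]
    print(f"r = {pearson_r(take_1, take_2):.2f}")
```

A high r shows that leaders keep their rank order between takes; it does not by itself show that absolute scores are unchanged.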

Construct validity

Confirmatory factor analysis on the 5-vector structure once enrolled-cohort N is sufficient.

Status: scheduled

Updated values are published on this page as design-partner data crosses each threshold. We do not publish reliability figures derived from an insufficient sample.

05 — Limitations & responsible use

What this instrument is not for.

• It is a self-report instrument. It measures the leader's perception of their own capability, not observed performance.
• It is a leading indicator of AI leadership capability — useful for development planning, cohort design, and 12-week roadmaps. It is not appropriate as a performance-review or hiring instrument.
• Scores are stable across cohorts (fixed cut-points), so an individual's score is not affected by who else takes the diagnostic.
• Re-take cadence is quarterly. Re-taking more often will not meaningfully reflect capability change.
• Data handling, retention, and team-rollout privacy are documented on the Trust page.
