Methodology

How the research gets done.

Every WireSift claim ties back to a primary document. This page documents the methodology behind the AI Adoption Tracker — the source tiering, extraction approach, models used, and validation layers — so any reader can audit how a finding was produced.

Last updated: May 2026

The source hierarchy

WireSift uses a tiered source framework, where every claim in our research is anchored to the strongest available evidence. Tiers run from 1 (strongest) to 7 (weakest), and the tier of each source is disclosed in our claim ledger.

Tier 1 — primary documents under regulatory or legal obligation (10-K filings, earnings call transcripts, court filings, peer-reviewed research). Earnings calls are Tier 1 because management speaks under securities-disclosure obligations.
Tier 2 — primary documents not under formal obligation (investor day presentations, board letters, official communications).
Tier 3 — direct journalism by reputable outlets quoting Tier 1 sources verbatim.
Tiers 4–7 — reduced rigor, used as supporting context only and labeled accordingly.

The AI Adoption Tracker pipeline

The AI Adoption Tracker reads every S&P 500 Q1 2026 earnings call transcript through a structured extraction pipeline, producing a comparable, auditable dataset. The pipeline runs in four stages.

1. Source acquisition

Earnings call transcripts are pulled from Financial Modeling Prep (FMP), which sources directly from the call audio. Each transcript is cached locally with a SHA-256 hash so we can verify integrity and detect any upstream revisions.
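
A minimal sketch of the hash-and-cache step, assuming a flat cache directory with one text file and one hash file per transcript. The function names, file layout, and status labels are illustrative assumptions, not the tracker's actual code.

```python
import hashlib
from pathlib import Path


def sha256_hex(text: str) -> str:
    """Hex SHA-256 digest of the transcript text, encoded as UTF-8."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def cache_transcript(cache_dir: Path, key: str, text: str) -> str:
    """Cache a transcript and report 'new', 'unchanged', or 'revised'.

    Hypothetical layout: <key>.txt holds the transcript and <key>.sha256
    holds its digest. A revised transcript is reported but does not
    overwrite the cached copy, so the change can be reviewed first.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    body_path = cache_dir / f"{key}.txt"
    hash_path = cache_dir / f"{key}.sha256"
    digest = sha256_hex(text)

    if hash_path.exists():
        cached = hash_path.read_text(encoding="utf-8").strip()
        return "unchanged" if cached == digest else "revised"

    body_path.write_text(text, encoding="utf-8")
    hash_path.write_text(digest, encoding="utf-8")
    return "new"
```

Comparing digests rather than full text keeps the revision check cheap, and the stored digest doubles as an integrity check on the cached file.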

2. Structured extraction

Each transcript runs through a single-pass extraction using Anthropic’s Claude Sonnet 4.6 with a versioned schema. The schema captures:

Mentions: every distinct AI claim made by management, with the verbatim quote, speaker, role, section (prepared remarks vs. Q&A), specificity score (1–5), ai_scope, time horizon, and named entities (products, partners, models, customers).
AI revenue disclosure: whether management disclosed AI revenue, the disclosure method (GAAP segment, ARR, run-rate, bookings, qualitative), and the verbatim quote.
Disclosure gaps: analyst questions where management declined to quantify a number — often as informative as the disclosures themselves.
AI activities: investments (capex, opex, R&D, acquisitions), realized outcomes (productivity gains, cost savings), partnerships (model providers, hyperscalers, chip vendors).
Specificity score on a 1–5 scale: 1 = aspirational, 2 = directional, 3 = operational, 4 = quantified, 5 = financialized (specific dollar / margin / revenue figure tied to AI).
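
One way to picture a single "mention" record is the sketch below. Only ai_scope, quote_verbatim, and the five specificity levels come from this page. The other field names, the section labels, and the class layout are assumptions made for illustration; the published schema is the authority.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Specificity(IntEnum):
    """The 1-5 specificity scale described above."""
    ASPIRATIONAL = 1
    DIRECTIONAL = 2
    OPERATIONAL = 3
    QUANTIFIED = 4
    FINANCIALIZED = 5


# Hypothetical labels for the two transcript sections.
SECTIONS = {"prepared_remarks", "qa"}


@dataclass
class Mention:
    quote_verbatim: str
    speaker: str
    role: str
    section: str
    specificity: Specificity
    ai_scope: str
    time_horizon: str
    named_entities: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Coerce plain ints and reject anything outside 1-5 (raises ValueError).
        self.specificity = Specificity(self.specificity)
        if self.section not in SECTIONS:
            raise ValueError(f"unknown section: {self.section!r}")
```

Validating at construction time means a malformed record is rejected when it is created rather than when it is charted.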

Every extracted claim must be backed by a verbatim quote present in the source transcript. The pipeline fails closed when a quote can’t be located in the source — extraction quality is non-negotiable.
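
The fail-closed quote check could look roughly like this. The sketch assumes whitespace and curly quotes are normalized before matching, which is our illustrative choice; the helper names and the exception type are hypothetical.

```python
import re


class QuoteNotFoundError(ValueError):
    """Raised when an extracted quote cannot be located in its source."""


# Map curly quotes to straight ones so typography does not break matching.
_QUOTE_MAP = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
})


def _normalize(text: str) -> str:
    """Straighten quotes and collapse runs of whitespace to single spaces."""
    return re.sub(r"\s+", " ", text.translate(_QUOTE_MAP)).strip()


def require_verbatim(quote: str, transcript: str) -> None:
    """Raise unless the quote appears in the transcript (fail closed)."""
    needle = _normalize(quote)
    if not needle or needle not in _normalize(transcript):
        raise QuoteNotFoundError(f"quote not found in source: {quote[:60]!r}")
```

Raising an exception, rather than logging and continuing, is what makes the check fail closed: an unverifiable claim stops the run instead of slipping into the dataset.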

3. Quality gates

Three layers of validation run on every extraction before it enters the public dataset:

Layer 1 (per-extraction): automated checks for quote integrity, scope coherence, schema compliance, and internal consistency. Flags surface for manual review before publishing.
Layer 2 (cross-model): a stratified random sample (~10%) is independently re-extracted by Claude Opus 4.6 and compared field-by-field. Aggregate agreement on substantive judgments runs at 80%+; disagreements are reviewed manually.
Layer 3 (string-match cross-check): for every partner-leaderboard claim, we run a naive case-insensitive string match for the partner’s brand and product names across the full transcript corpus. Any company that mentions the partner in raw text but isn’t in the count gets its surrounding context pulled and classified: was the LLM extraction correctly conservative (analyst question, passing reference, non-AI context, idiomatic use of an ambiguous term like “bedrock”), or did it miss a real partnership that should be added? The May 2026 run found one categorization bug (a parenthetically named partner that wasn’t rolling up to its canonical bucket); five empty-quote rows were re-routed to the snapshot-level quote-integrity filter described below. Results are logged in the project’s public repo.
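
The Layer 3 cross-check can be sketched as below, assuming transcripts are held in a dictionary keyed by ticker. The function name, the context window size, and the data shapes are illustrative assumptions.

```python
import re


def uncounted_mentions(
    transcripts: dict[str, str],
    aliases: list[str],
    counted: set[str],
    window: int = 80,
) -> dict[str, list[str]]:
    """Find companies that mention a partner in raw text but are not counted.

    Returns, for each such ticker, the text surrounding every hit so a
    reviewer can classify it. Matching is deliberately naive and
    case-insensitive, so false positives are expected.
    """
    pattern = re.compile("|".join(re.escape(a) for a in aliases), re.IGNORECASE)
    hits: dict[str, list[str]] = {}
    for ticker, text in transcripts.items():
        if ticker in counted:
            continue  # already on the leaderboard for this partner
        contexts = [
            text[max(0, m.start() - window): m.end() + window]
            for m in pattern.finditer(text)
        ]
        if contexts:
            hits[ticker] = contexts
    return hits
```

A substring match like this flags "Trust is the bedrock of our brand" as readily as a real product mention, which is why every hit is classified by hand and never added to the count automatically.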

4. Snapshot-level filters

Two rules apply at the moment data is published to this site, on top of whatever the LLM extracted:

Quote-integrity filter. Any partnership row without a non-empty quote_verbatim field is excluded from public counts and per-company drill-downs. This enforces the “no quote, no claim” rule consistently, even when the LLM extracted a partner correctly but couldn’t attribute a per-partner quote (e.g. management lists multiple partners in one sentence). The row stays in the underlying database for audit; the public dataset shows only quote-substantiated claims.
Partner-name canonicalization. The LLM records the exact wording management used (“AWS” vs. “Amazon Web Services” vs. “Bedrock”; “GCP” vs. “Google Cloud” vs. “Vertex”). Without normalization, single ecosystems would fragment across multiple leaderboard rows. We apply a published alias map plus a parenthetical-prefix rule: Google (Google Gemini / Google Health) rolls up to Google because the leading text matches a known canonical, but Hyperscalers (unnamed) stays as-is because its prefix doesn’t. The full alias list lives in the snapshot script in our public repo.
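
Both snapshot rules can be sketched in a few lines. The alias entries below are a hypothetical excerpt, and the canonical label "AWS" is an assumption; the real alias map lives in the snapshot script. Function names are illustrative.

```python
import re

# Hypothetical excerpt of an alias map: lowercase alias -> canonical name.
ALIASES = {
    "aws": "AWS",
    "amazon web services": "AWS",
    "bedrock": "AWS",
    "google": "Google",
    "gcp": "Google",
    "google cloud": "Google",
    "vertex": "Google",
}

# Matches "Prefix (anything)" and captures the prefix.
_PAREN = re.compile(r"^([^()]+?)\s*\(.*\)$")


def canonicalize(name: str) -> str:
    """Apply the alias map, then the parenthetical-prefix rule."""
    cleaned = name.strip()
    if cleaned.lower() in ALIASES:
        return ALIASES[cleaned.lower()]
    m = _PAREN.match(cleaned)
    if m and m.group(1).lower() in ALIASES:
        return ALIASES[m.group(1).lower()]
    return cleaned  # unknown prefix: leave the label as management said it


def publishable(rows: list[dict]) -> list[dict]:
    """Quote-integrity filter: keep only rows with a non-empty quote_verbatim."""
    return [r for r in rows if (r.get("quote_verbatim") or "").strip()]
```

For example, canonicalize("Google (Google Gemini / Google Health)") returns "Google", while canonicalize("Hyperscalers (unnamed)") returns the label unchanged because "Hyperscalers" is not a known alias.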

Editorial choices we disclose

A few editorial calls are applied at render time on the public tracker. Each is disclosed in the chart’s source line:

Big Tech treated as Information Technology. Four companies are reclassified into the IT sector regardless of their GICS label, because their AI commentary is dominated by tech-AI use cases and treating them as their nominal sector obscures the real picture:
- Alphabet (GOOG) — GICS Communication Services. AI commentary is Cloud, Search, and Workspace.
- Amazon (AMZN) — GICS Consumer Discretionary. AI commentary is dominated by AWS.
- Meta (META) — GICS Communication Services. AI commentary is AI infrastructure, Llama, and AR/VR.
- Tesla (TSLA) — GICS Consumer Discretionary. AI commentary is FSD, Optimus, and robotaxi.
Reclassification is applied consistently in the maturity chart, the company drill-down, and any callout that uses a non-tech framing. The principle: where a GICS classification would force a misleading read, we relabel and disclose. The full list of reclassifications and their public footnotes is published in the source.
Alphabet share classes consolidated. Alphabet’s Class A (GOOGL) and Class C (GOOG) shares represent the same legal entity and the same earnings call. GOOGL is excluded from all aggregates so Alphabet appears once, under GOOG. Disclosed in the chart source line.
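
A sketch of how these two render-time rules might be applied, assuming each company is a dictionary carrying its ticker and GICS sector. The function and field names are hypothetical; the tickers and sector labels are the ones listed above.

```python
# Render-time sector overrides for the four reclassified companies.
SECTOR_OVERRIDES = {
    "GOOG": "Information Technology",
    "AMZN": "Information Technology",
    "META": "Information Technology",
    "TSLA": "Information Technology",
}

# Alphabet's Class A shares are dropped so Alphabet appears once, under GOOG.
EXCLUDED_TICKERS = {"GOOGL"}


def render_sector(ticker: str, gics_sector: str) -> str:
    """Sector shown on the tracker: the override if one exists, else GICS."""
    return SECTOR_OVERRIDES.get(ticker, gics_sector)


def tracker_universe(companies: list[dict]) -> list[dict]:
    """Drop excluded tickers and attach the displayed sector to each company."""
    return [
        {**c, "sector": render_sector(c["ticker"], c["gics_sector"])}
        for c in companies
        if c["ticker"] not in EXCLUDED_TICKERS
    ]
```

Keeping the original gics_sector alongside the displayed sector leaves the reclassification visible and reversible in the data itself.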

Versioning and change tracking

The extraction schema is semver-versioned. Old extractions are never deleted — when the schema changes (a new field, a refined controlled vocabulary, a renamed enum), prior data stays in its original schema version and the change is logged in our public changelog. This means a finding shipped under schema v2.0 can always be reproduced from the v2.0 record.

What we won’t do

A few discipline points worth naming explicitly:

No LLM-side normalization of quantifications. Numbers are extracted as raw strings; any normalization (units, currency, period) happens downstream in deterministic Python code. This keeps the audit trail clean — what the model extracted is what management said.
No company-specific prompt engineering. The same prompt runs against every transcript. We don’t tune extraction for individual companies, which would bias comparability.
No editorial paraphrase as a substitute for the quote. Every claim shows the verbatim quote, even when it’s long or awkward. The quote is the evidence; our framing around it is commentary.
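
To illustrate the first point, downstream normalization of a raw dollar string could look like the sketch below. The accepted formats, the unit table, and the function name are assumptions for illustration; the actual normalization rules are in the public repo.

```python
import re
from decimal import Decimal
from typing import Optional

# Hypothetical unit table: lowercase suffix -> multiplier.
_SCALE = {
    "thousand": Decimal(10) ** 3, "k": Decimal(10) ** 3,
    "million": Decimal(10) ** 6, "m": Decimal(10) ** 6,
    "billion": Decimal(10) ** 9, "b": Decimal(10) ** 9, "bn": Decimal(10) ** 9,
}

# "$", a number with optional thousands separators and decimals, optional unit.
_USD = re.compile(r"^\$\s*([0-9][0-9,]*(?:\.[0-9]+)?)\s*([A-Za-z]+)?$")


def normalize_usd(raw: str) -> Optional[Decimal]:
    """Convert a raw string such as '$2.5 billion' to a dollar amount.

    Returns None for anything the pattern does not recognize, so
    unparseable strings are left for review instead of being guessed at.
    """
    m = _USD.match(raw.strip())
    if not m:
        return None
    unit = (m.group(2) or "").lower()
    if unit and unit not in _SCALE:
        return None
    number = Decimal(m.group(1).replace(",", ""))
    return number * _SCALE.get(unit, Decimal(1))
```

Because the function is deterministic and the raw string is stored next to the normalized value, any published figure can be traced back to the exact words management used.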

Open methodology

The full pipeline source code, schema, prompts, and changelog are public on GitHub. Anyone can audit how a claim was produced — or fork the pipeline against a different universe of companies.

Questions

Methodology questions, data licensing inquiries, or audit requests: info@wiresift.com.
