The pipeline

How It Works

From raw news coverage to a published Bunting Rating is a five-step process. Each step is designed to preserve the differences between outlets rather than smooth them out. Here is what happens at each stage.

Pipeline steps

  1. Outlet selection

    We track news outlets across the left, centre, and right of US politics. Each outlet is classified by political lean and by type: corporate (large commercial or institutional) or independent (podcast, newsletter, or standalone digital publisher). Independents are deliberately oversampled relative to their audience share, because a list weighted only toward large corporate outlets would miss many of the places politically engaged readers actually go for news. You can see the full list on the Outlets page.

  2. Story discovery

    We ingest content from each outlet daily via RSS feeds and, for audio-visual outlets, YouTube transcripts. Articles are fetched in full where available; video content is processed from its transcript. The goal at this stage is breadth: pulling in everything an outlet publishes, not filtering by topic in advance.

  3. Topic clustering

    An AI model scans the incoming content and surfaces stories that are being covered across the spectrum, grouping articles and transcripts that address the same underlying event or policy. This step is driven by what the outlets are actually covering, not by a hand-curated topic list. Stories covered by only one lean do not make it to a rating: the rating requires presence across the spectrum to be meaningful.

  4. Per-lean analysis

    Each lean is analysed in isolation before any synthesis happens. The left-lean sources are summarised as a group; the centre sources as a group; the right sources as a group. The framing summaries for each lean are produced independently, so there is no single voice that could unconsciously average out the differences. Verbatim headline quotes are extracted alongside the summaries so readers can always check the AI's characterisation against what outlets actually wrote.

  5. Bunting Rating and publishing

    The three per-lean framing summaries, the framing dimension vectors, the corporate-versus-independent contrast, and the headline vocabulary data are combined into the Bunting Rating score. The result is then published as a Substack post with the full framing summaries, verbatim headline pull-quotes, and the rating in context. Social cards for Instagram, TikTok, X, and Bluesky are generated automatically from the same data.
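
The grouping logic behind steps 1, 3, and 4 can be illustrated with a minimal Python sketch. It is not the project's actual code: the names `Outlet`, `Item`, `story_id`, `group_by_story_and_lean`, and `rateable_stories` are hypothetical, and the `min_leans` threshold is an assumption. The text above only says that stories covered by one lean are dropped and that a rating needs presence across the spectrum; the sketch reads that as requiring all three leans by default.

```python
from collections import defaultdict
from dataclasses import dataclass

LEANS = ("left", "centre", "right")


@dataclass(frozen=True)
class Outlet:
    name: str
    lean: str  # one of LEANS, assigned by hand (step 1)
    kind: str  # "corporate" or "independent"


@dataclass(frozen=True)
class Item:
    outlet: Outlet
    headline: str
    story_id: str  # hypothetical label produced by the clustering step (step 3)


def group_by_story_and_lean(items):
    """Return {story_id: {lean: [items]}} so each lean can be analysed in isolation."""
    stories = defaultdict(lambda: defaultdict(list))
    for item in items:
        stories[item.story_id][item.outlet.lean].append(item)
    return {sid: dict(by_lean) for sid, by_lean in stories.items()}


def rateable_stories(items, min_leans=len(LEANS)):
    """Keep only stories covered by at least `min_leans` distinct leans.

    The default (all three leans) is an assumption made for this sketch.
    """
    grouped = group_by_story_and_lean(items)
    return {
        sid: by_lean
        for sid, by_lean in grouped.items()
        if len(by_lean) >= min_leans
    }
```

Each surviving story maps to one bucket per lean, and each bucket would then be summarised independently, as step 4 describes.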


On AI

The pipeline uses Claude (Anthropic's AI model) for three steps: topic clustering, the per-lean framing summaries, and the synthesis step that produces the social-card copy. These are the steps where reading hundreds of articles per day at human speed is not practical. Without AI at those three points, the project would not be viable as a daily analysis tool.

Everything else is deterministic or human-authored. The outlet classifications were set by hand and are not changed by the model. The verbatim headline quotes are lifted directly from the source content, not written by AI. The Bunting Rating score is calculated by formula from the component measurements. The limitations that relying on AI introduces, including the risk that Claude's own training data carries a systematic lean, are documented on the Methodology page.
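
The page does not say how the verbatim guarantee is enforced. One plausible deterministic guard, shown here only as a sketch under that assumption, is to accept a quote only if it appears word for word in the fetched source text once whitespace differences are ignored. The function names `normalise_whitespace` and `is_verbatim` are hypothetical.

```python
import re


def normalise_whitespace(text: str) -> str:
    """Collapse runs of whitespace (including line breaks) into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def is_verbatim(quote: str, source_text: str) -> bool:
    """Return True only if the non-empty quote occurs exactly in the source text."""
    cleaned = normalise_whitespace(quote)
    return bool(cleaned) and cleaned in normalise_whitespace(source_text)
```

A check like this is a plain string comparison, so it involves no model output and gives the same answer on every run.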