# Narrative Framing for Air Pollution, Energy Transition, and Animal Welfare
## Why narrative framing?
Narrative framing analyses can serve multiple purposes:
- Understand how a topic is being discussed: Identify which narratives dominate coverage versus those that remain marginal or absent. This helps advocates understand the current discourse landscape—what frames are being amplified, which perspectives are under-represented, and where there might be opportunities to shift narratives. For example, an analysis might reveal that air pollution discussions focus heavily on individual vehicle emissions while neglecting industrial sources, pointing to potential advocacy gaps.
- Prioritise regions, messaging and outlets: Use data to identify gaps and opportunities—which regions or outlets are influential but missing key narratives, or which audiences lack exposure to critical frames. For instance, an analysis might reveal that certain media outlets heavily cover industrial pollution but rarely connect it to policy solutions, indicating a clear messaging gap. This enables more targeted grantmaking and campaign design, ensuring resources go to outlets and regions where narrative shifts are most needed and feasible.
- Measure change over time: Track how narratives evolve before, during, and after advocacy campaigns or major events. Detect whether specific frames are gaining or losing traction, measure campaign impact by comparing pre- and post-intervention coverage, and identify emerging trends early. This provides evidence-based feedback loops for grantees and helps demonstrate the effectiveness of narrative change initiatives.
## Examples
### Jakarta — Air pollution causes
Context: Analysis of Jan 2020 – Oct 2025 Indonesian media coverage on air pollution in Jakarta, focusing on how different causes are discussed. The corpus spans 14,469 articles from major Indonesian outlets in Bahasa Indonesia, capturing how journalists frame pollution sources—from vehicle emissions to seasonal weather patterns.
Results summary: Transport emissions dominate coverage (41% of articles), reflecting Jakarta’s heavy traffic and vehicle-related pollution discourse. Natural and meteorological factors come next, appearing in 8.5% of articles, with notable seasonal spikes during dry periods when weather conditions exacerbate pollution.

Frames identified:
| Frame | Description | Key Keywords | Share |
|---|---|---|---|
| Transport Emissions | Vehicle emissions from cars, motorcycles, buses, trucks, and road traffic | kendaraan bermotor, lalu lintas, emisi kendaraan, uji emisi | 41.1% |
| Natural Factors | Meteorological and seasonal factors affecting air quality (weather patterns, El Niño, rainfall) | cuaca, angin, musim kemarau, El Nino, curah hujan rendah | 8.5% |
| Industrial Emissions | Factory and manufacturing emissions, including smelters, steel, and cement production | pabrik, industri, smelter, industri baja, industri semen | 6.3% |
| Power Plant Emissions | Coal-fired and fossil-fuel power plant emissions | PLTU, pembangkit listrik, coal-fired power plant | 3.3% |
| Biomass Burning | Agricultural fires, forest fires, and land clearing through burning | pembakaran lahan, kebakaran hutan, pembakaran biomassa | 2.1% |
| Waste Burning | Open burning of municipal waste and landfill fires | pembakaran sampah, open burning, landfill fire | 1.9% |
| Household Emissions | Household cooking and heating using fossil fuels or biomass | pembakaran rumah tangga, kompor kayu, bahan bakar padat | 0.5% |
| Construction Dust | Construction activities, roadworks, and resuspended dust | debu konstruksi, pembangunan, road dust, pekerjaan jalan | 0.4% |
Note: Percentages represent the share of articles that discuss each frame (occurrence-based, threshold ≥0.2). Articles can discuss multiple frames.
### Philippines — Renewable energy
### Brazil — Animal welfare
## Method overview
The pipeline follows a hybrid LLM-to-classifier approach: we start with flexible LLM exploration to discover and annotate narrative frames, then scale up with a fine-tuned transformer classifier. This balances domain adaptability (frames tailored to each question and context) with computational efficiency (fast inference over large corpora).
```mermaid
flowchart LR
subgraph Collection["1. Data Collection & Preparation"]
direction TB
subgraph CollectionSub[ ]
direction TB
A["Content discovery<br/>(media, TV, radio, forums, etc.)"]
A2["Scrape & extract text<br/>(using Scrapy)"]
B["Chunk text<br/>(using SpaCy language model)"]
A --> A2 --> B
end
end
subgraph Discovery["2. Frame Induction & Annotation"]
direction TB
subgraph DiscoverySub[ ]
direction TB
C["LLM: Induce frames<br/>(with or without user guidance)"]
D["LLM: Label samples<br/>(multi-label distributions)"]
C --> D
end
end
subgraph Classification["3. Scalable Classification"]
direction TB
subgraph ClassificationSub[ ]
direction TB
E["Train transformer classifier<br/>(fine-tune on LLM labels)"]
F["Classify all chunks<br/>(fast inference)"]
E --> F
end
end
subgraph Analysis["4. Aggregation & Reporting"]
direction TB
subgraph AnalysisSub[ ]
direction TB
G["Aggregate to document level<br/>(length-weighted attention)"]
H["Results analysis<br/>(e.g. time series & outlets breakdowns)"]
I["Generate reports<br/>(interactive HTML + static plots)"]
G --> H --> I
end
end
Collection --> Discovery
Discovery --> Classification
Classification --> Analysis
classDef nodeBox fill:#ffffff33,stroke:#333,stroke-width:1px
classDef somePaddingClass padding-bottom:5em
classDef transparent fill:#ffffff00,stroke-width:0
Collection:::somePaddingClass
CollectionSub:::transparent
Discovery:::somePaddingClass
DiscoverySub:::transparent
Classification:::somePaddingClass
ClassificationSub:::transparent
Analysis:::somePaddingClass
AnalysisSub:::transparent
class A,A2,B,C,D,E,F,G,H,I nodeBox
style Collection fill:#e1f5ff,stroke:#0277bd,stroke-width:2px
style Discovery fill:#fff3e0,stroke:#ef6c00,stroke-width:2px
style Classification fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
style Analysis fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px
```
Content discovery (search/filters): We start by defining the slice of content we care about—whether from media articles, TV news transcripts, radio programs, forums, Reddit, or other sources—in a way that is both broad enough to catch variation and precise enough to be actionable. For media analysis, using Media Cloud collections lets us anchor each run in a country and time window, and then layer topical filters (for instance, city names or issue cues) to focus coverage. Similar approaches work for other platforms: TV news and radio transcripts, forum posts, Reddit threads, or other text corpora can be collected through their respective APIs or scraping tools. The intent is to bias toward recall at this stage: we would rather include a few borderline documents and filter them downstream than miss legitimate phrasing that differs from our initial keywords. Every run is captured in a small YAML file so the choices are explicit and replicable.
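For illustration, such a run configuration might look like the sketch below; the field names and values are hypothetical, not the pipeline's actual schema:

```yaml
# Hypothetical run config; field names are illustrative, not the actual schema.
run_id: jakarta-air-pollution-2025
source: mediacloud
collection: indonesia-national        # Media Cloud collection anchoring the run
language: id
date_range:
  start: 2020-01-01
  end: 2025-10-31
query: "(polusi udara OR kualitas udara) AND Jakarta"
excludes:
  - "horoskop"                        # regex excludes applied downstream
```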
Scrape and extract: To reason about narratives we need full passages, not just headlines or snippets. We fetch pages and extract the main text, then remove boilerplate and navigation tails that otherwise drown the signal (things like widgets, “follow us” blocks, or stock tickers). The trimming rules live in config so we can adapt them by outlet or country. This step trades a little engineering effort for cleaner inputs and more stable downstream classification.
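A minimal sketch of the cleanup and chunking steps, assuming boilerplate is trimmed with configurable regex patterns and chunks are built from spaCy sentence boundaries (the patterns and chunk size here are illustrative):

```python
# Sketch of cleanup + chunking; patterns and chunk size are illustrative.
import re
import spacy

# Trimming rules would normally come from per-outlet/per-country config.
BOILERPLATE_PATTERNS = [
    r"(?i)follow us on.*$",
    r"(?i)baca juga:.*$",  # "read also:" teasers common on Indonesian sites
]

def clean_text(raw: str) -> str:
    """Strip navigation tails and widget text from the extracted article body."""
    text = raw
    for pattern in BOILERPLATE_PATTERNS:
        text = re.sub(pattern, "", text, flags=re.MULTILINE)
    return text.strip()

# A blank pipeline with a rule-based sentencizer is enough for chunking;
# a full spaCy language model could be swapped in for better segmentation.
nlp = spacy.blank("id")
nlp.add_pipe("sentencizer")

def chunk_text(text: str, max_sents: int = 3) -> list[str]:
    """Group consecutive sentences into short spans for chunk-level classification."""
    sents = [s.text.strip() for s in nlp(text).sents if s.text.strip()]
    return [" ".join(sents[i:i + max_sents]) for i in range(0, len(sents), max_sents)]
```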
Frame induction (LLM): We ask an LLM to propose a compact set of categories tailored to the question and context (e.g., causes of air pollution in Jakarta) by feeding it a random sample of passages (200 in the examples above) in several consecutive batches, followed by a consolidation call. Users can inject guidance, e.g. to make the LLM include or exclude certain frames. After a shallow manual comparison of several models, through visual inspection of their framing results, I selected OpenAI GPT‑4.1 for this step. The resulting schema (names, short definitions, examples, keywords) is passed along to the annotation step.
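A sketch of one induction batch using the OpenAI client; the prompt wording and the "frames" response key are illustrative, and the consolidation call would follow the same pattern over the batch outputs:

```python
# Sketch of one frame-induction batch; prompt wording and the "frames"
# response key are illustrative, not the pipeline's actual prompts.
import json
from openai import OpenAI

client = OpenAI()

INDUCE_PROMPT = (
    "You will read a batch of news passages about {question}. Propose a compact "
    "set of narrative frames covering how the passages discuss it. Return a JSON "
    "object with key 'frames': a list of objects with keys 'name', 'definition', "
    "'examples', and 'keywords'. {guidance}"
)

def induce_frames(passages: list[str], question: str, guidance: str = "") -> list[dict]:
    resp = client.chat.completions.create(
        model="gpt-4.1",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": INDUCE_PROMPT.format(question=question, guidance=guidance)},
            {"role": "user", "content": "\n\n---\n\n".join(passages)},
        ],
    )
    return json.loads(resp.choices[0].message.content)["frames"]
```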
Frame application to samples (LLM): We then use another LLM as a probabilistic annotator on a sample of passages (2,000 in the examples above). Each passage gets a distribution over frames (not just a single label) plus a brief rationale. We typically use a smaller GPT‑4 variant (e.g., gpt-4.1-mini) for this step to balance cost and quality, since we need to label thousands of examples. This does two things: it reveals ambiguous cases that keyword-based approaches would mis-label, and it gives us enough labeled data to train a supervised model.
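The annotation call might look like the following sketch; prompt wording and response keys are again illustrative:

```python
# Sketch of the probabilistic annotation step; prompt wording and response
# keys are illustrative, not the pipeline's actual prompts.
import json
from openai import OpenAI

client = OpenAI()

LABEL_PROMPT = (
    "Using the frame schema below, score this passage with a value in [0, 1] "
    "for each frame (multiple frames may apply) and give a one-line rationale. "
    "Return a JSON object with keys 'scores' (frame name -> float) and 'rationale'.\n\n"
    "Schema:\n{schema}"
)

def label_passage(passage: str, schema_json: str) -> dict:
    resp = client.chat.completions.create(
        model="gpt-4.1-mini",  # smaller model: thousands of calls, so cost matters
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": LABEL_PROMPT.format(schema=schema_json)},
            {"role": "user", "content": passage},
        ],
    )
    return json.loads(resp.choices[0].message.content)
```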
Supervised classifier (transformers): Next, we fine‑tune a multi‑label transformer classifier on those LLM‑labeled passages using Hugging Face transformers. We start with a pre-trained language model (e.g., indobenchmark/indobert-base-p1 for Bahasa Indonesia, distilbert-base-uncased for English) and adapt it to our frame classification task: the encoder layers learn to recognize frame-relevant patterns, while a new classification head outputs a probability score for each frame using sigmoid activation. This gives us cheap, fast inference over tens of thousands of chunks while freezing the labeling policy defined by the schema.
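A minimal fine-tuning sketch with Hugging Face transformers, assuming `train_texts` and `train_scores` hold the LLM-annotated sample; the frame list is abridged and the hyperparameters are illustrative:

```python
# Minimal multi-label fine-tuning sketch; assumes train_texts / train_scores
# come from the LLM annotation step. Hyperparameters are illustrative.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "indobenchmark/indobert-base-p1"  # distilbert-base-uncased for English
FRAMES = ["transport_emissions", "natural_factors", "industrial_emissions"]  # abridged

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME,
    num_labels=len(FRAMES),
    problem_type="multi_label_classification",  # sigmoid head + BCE loss
)

class FrameDataset(torch.utils.data.Dataset):
    """Pairs tokenized passages with the LLM's soft per-frame scores."""
    def __init__(self, texts, scores):
        self.enc = tokenizer(texts, truncation=True, padding=True)
        self.labels = torch.tensor(scores, dtype=torch.float)  # soft targets work with BCE
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = self.labels[i]
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="frame-clf", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=FrameDataset(train_texts, train_scores),
)
trainer.train()
```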
Classify the corpus: We classify content at the chunk level (typically sentences or short spans) to avoid burying weaker frames in long documents. Light keyword gating and regex excludes from earlier steps help keep us on topic without reintroducing brittle rules. Results are cached per document to support iterative runs and easy re‑aggregation.
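Chunk-level inference with per-document caching might then look like this sketch, reusing `tokenizer`, `model`, and `FRAMES` from the training snippet (the cache layout is illustrative):

```python
# Chunk-level inference with per-document caching; reuses tokenizer, model,
# and FRAMES from the training sketch. Cache layout is illustrative.
import json
import pathlib
import torch

CACHE_DIR = pathlib.Path("cache/predictions")
CACHE_DIR.mkdir(parents=True, exist_ok=True)

@torch.no_grad()
def classify_document(doc_id: str, chunks: list[str]) -> list[dict]:
    """Return one {frame: probability} dict per chunk, cached per document."""
    cache_file = CACHE_DIR / f"{doc_id}.json"
    if cache_file.exists():  # supports iterative runs and easy re-aggregation
        return json.loads(cache_file.read_text())
    enc = tokenizer(chunks, truncation=True, padding=True, return_tensors="pt")
    probs = torch.sigmoid(model(**enc).logits)  # independent per-frame probabilities
    preds = [dict(zip(FRAMES, p.tolist())) for p in probs]
    cache_file.write_text(json.dumps(preds))
    return preds
```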
Aggregate and report: Finally, we aggregate chunk‑level predictions to document‑level profiles and summaries over time. A length‑weighted aggregator estimates how much attention each frame receives within a document (article, post, thread, etc.); an occurrence view answers a different question—what share of documents mention a frame at all.
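As a sketch, the two aggregation views could be computed like this; word counts stand in for whatever length measure the aggregator actually uses, and the 0.2 threshold mirrors the occurrence note in the Jakarta example:

```python
# Two document-level views over chunk predictions; word counts are a stand-in
# for the actual length measure, and the 0.2 threshold mirrors the Jakarta note.
def attention_share(chunks: list[str], preds: list[dict], frame: str) -> float:
    """Length-weighted attention: how much of the document dwells on a frame."""
    lengths = [len(c.split()) for c in chunks]
    total = sum(lengths)
    if total == 0:
        return 0.0
    return sum(n * p[frame] for n, p in zip(lengths, preds)) / total

def occurs(preds: list[dict], frame: str, threshold: float = 0.2) -> bool:
    """Occurrence view: does the document discuss the frame at all?"""
    return any(p[frame] >= threshold for p in preds)
```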
Keyword-based approaches have significant limitations for narrative analysis:
- Paraphrases and semantic variations: Journalists describe the same concept in many ways. An article discussing "road dust resuspension from heavy traffic" contains the same frame as one mentioning "vehicle emissions," but keyword matching would miss this connection.
- Language evolution: Terms change over time and across regions. What counts as "construction pollution" in Jakarta might be discussed as "infrastructure development impacts" elsewhere, requiring constant keyword list updates.
- Implied meaning: Media often conveys frames through context rather than explicit terms. A passage describing "stagnant air during the dry season" implies natural factors affecting pollution, even without using keywords like "El Niño" or "low rainfall."
- Cross-language nuance: In multilingual contexts, keywords must be translated carefully, but semantic understanding captures equivalent concepts across languages automatically.
Our approach uses LLMs to capture semantic meaning, then scales it with a classifier—combining the flexibility of language understanding with the efficiency needed for large-scale analysis.
## Get in touch
I am interested in hearing from others working on similar problems, or exploring how these tools could be applied in new contexts or developed further to be more useful. Whether you have ideas for improvements, questions about the approach, or want to collaborate on applications, I’d love to hear from you: please reach out to me.