v17: Conflict-Realism Calibration — War Zones Now Score Accurately

2026-06-11

The geo scoring engine was built for breaking news but miscalibrated for ongoing conflicts. A missile strike from 18 hours ago was nearly dead in the recency window. Countries at war were averaged down by their own routine diplomatic and economic coverage. v17 fixes all of that: a longer half-life, lower scoring thresholds, a smaller article pool, and a heatmap that colors by peak signal rather than diluted average.

The original geo scoring system was designed around a simple assumption: news is breaking news. Something happens, it gets reported, the signal spikes, and then it decays as the story is replaced by the next one. That assumption works for isolated incidents — a surprise election result, a natural disaster, a single cyber attack. It fails badly for sustained conflicts, where the "news" is not a single event but an ongoing situation generating dozens of articles per day, none of which individually looks alarming compared to an actual breaking story.

Russia and Ukraine are at war. Iran, Israel, Gaza, and Lebanon are in active conflict. Yet the geo scores for those countries were sitting in the 4–6 range — "elevated" at best. The problem was not bad data. The problem was that the scoring system was systematically discounting exactly the kind of sustained, high-intensity coverage that active war zones produce.

Four compounding problems

Recency half-life of 6 hours: An article about a missile strike published 18 hours ago scored 10 × e^(−18/6) ≈ 0.50 out of 10 on recency. Nearly dead. In an active conflict, yesterday's strike is still relevant: the war did not end overnight.
Urgency threshold too conservative: The lexicon scores urgency as (matched keywords / 8) × 10. A headline like "Russia fires missiles at Kyiv, 12 killed" might match 4–5 keywords — scoring 5–6.25 out of 10. The threshold of 8 keywords required to max out urgency was calibrated for very dense, multi-incident breaking bulletins.
Sentiment threshold too conservative: Negative sentiment required matching 6 words to max out. A typical war-reporting article matching 4 negative words scored 6.7/10. The threshold was designed for catastrophic events, not sustained conflict.
Geo pool averaged 100 articles: Russia generates 30–50 geo articles per day, spanning war reports, diplomatic statements, economic sanctions, sports, and cultural coverage. The top 100 were all averaged together, so routine coverage pulled the mean toward 5.0 even when the war articles scored 8/10 and the conflict was at peak intensity.

What changed

Parameter	Before (v16)	After (v17)	Rationale
Geo recency half-life	6 hours	24 hours	Wars last months. A strike from yesterday is still a live signal.
Urgency threshold	keywords / 8	keywords / 6	4-keyword war headlines now score 6.7+ instead of 5.0
Sentiment threshold	words / 6	words / 4	4 negative words now max out sentiment instead of requiring 6
Geo article pool	Top 100	Top 30	Concentrates score on highest-urgency articles; prevents routine coverage from diluting conflict signal
Geo blend weight	27%	40%	Geopolitical news is the primary driver of world-threat perception
Heatmap color basis	All-article average per country	Top-10 scoring articles per country	Colors reflect peak signal intensity, not diluted average across all mentions
Heatmap critical threshold	≥ 8.0	≥ 7.0	Recalibrated for the new scoring range
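
As a compact restatement of the table, the two parameter sets could be written as a pair of immutable config objects. This is only a sketch: the class name, field names, and the `changed_fields` helper are hypothetical and are not taken from the actual codebase; only the values come from the table.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class GeoScoringParams:
    """Hypothetical container for the parameters listed in the table."""
    recency_half_life_hours: float     # divisor in the exponential recency decay
    urgency_keyword_threshold: int     # keyword matches needed to max out urgency
    sentiment_word_threshold: int      # negative-word matches needed to max out sentiment
    geo_pool_size: int                 # top-N articles averaged into the geo score
    geo_blend_weight: float            # share of the geo score in the overall blend
    heatmap_top_n: Optional[int]       # None means "average every article per country"
    heatmap_critical_threshold: float  # score at or above which a country renders critical


V16 = GeoScoringParams(6.0, 8, 6, 100, 0.27, None, 8.0)
V17 = GeoScoringParams(24.0, 6, 4, 30, 0.40, 10, 7.0)


def changed_fields(old: GeoScoringParams, new: GeoScoringParams) -> dict:
    """Return {field_name: (old_value, new_value)} for every field that differs."""
    return {
        f.name: (getattr(old, f.name), getattr(new, f.name))
        for f in fields(old)
        if getattr(old, f.name) != getattr(new, f.name)
    }


for name, (before, after) in changed_fields(V16, V17).items():
    print(f"{name}: {before} -> {after}")
```

Keeping both versions side by side like this makes it straightforward to re-run old and new scoring on the same articles when checking a calibration change.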

Geo half-life: 6h → 24h

The exponential recency decay function is the same: recency = 10 × e^(−ageHours / halfLife). Only the halfLife parameter changes for geo articles. At 6 hours, an article 24 hours old scored about 0.18/10 on recency, essentially zero. At 24 hours, that same article scores 3.7/10, still decaying but still contributing meaningfully to the signal. By 48 hours it scores 1.4/10; by 72 hours (the cutoff) it scores 0.5/10. This means sustained daily coverage of an active war zone accumulates properly rather than resetting every morning.

recency(t) = 10 × e^(−t / 24)   where t = article age in hours
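
The decay can be checked with a few lines of Python. This is a minimal sketch of the formula as stated, not the production code; the function and argument names are assumptions. Note that because the formula uses base e, the halfLife parameter behaves as a time constant: the score falls to about 37% of its starting value after one halfLife, not to 50%.

```python
import math


def recency(age_hours: float, half_life_hours: float) -> float:
    """Recency on a 0-10 scale: 10 * e^(-age / half_life)."""
    return 10.0 * math.exp(-age_hours / half_life_hours)


# Compare the v16 (6 h) and v17 (24 h) settings at a few article ages.
for age in (18, 24, 48, 72):
    v16 = recency(age, 6)
    v17 = recency(age, 24)
    print(f"{age:>2} h old: v16 = {v16:.2f}, v17 = {v17:.2f}")
```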

Lower thresholds

The urgency and sentiment thresholds were originally set for maximum caution — only truly overwhelming article content would score near 10. The problem is that war reporting is not "overwhelming" in keyword density; it is consistently severe. A headline matching 5 urgency keywords (war, attack, missile, killed, crisis) is an extremely serious article. With the old threshold it scored 6.25/10 on urgency. With the new threshold it scores 8.3/10 — which is accurate. Matching 4 negative sentiment words (dead, attack, critical, terror) now maxes out sentiment instead of requiring 6.
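
The effect of the threshold change can be reproduced with a small sketch. It assumes both urgency and sentiment use the same shape, (matches / threshold) × 10, capped at 10, which is what "max out" implies; the function name is hypothetical and keyword matching itself is left out.

```python
def capped_ratio_score(matches: int, threshold: int) -> float:
    """Score on a 0-10 scale: (matches / threshold) * 10, capped at 10 (assumed cap)."""
    return min(matches / threshold, 1.0) * 10.0


# Urgency: v16 divides by 8, v17 divides by 6.
for keywords in (4, 5):
    old = capped_ratio_score(keywords, 8)
    new = capped_ratio_score(keywords, 6)
    print(f"urgency, {keywords} keywords: v16 = {old:.2f}, v17 = {new:.2f}")

# Sentiment: v16 divides by 6, v17 divides by 4.
old = capped_ratio_score(4, 6)
new = capped_ratio_score(4, 4)
print(f"sentiment, 4 negative words: v16 = {old:.2f}, v17 = {new:.2f}")
```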

Geo pool: 100 → 30 articles

The geo score is the mean of the top-N scored geo articles. Averaging 100 articles means that 60–70 routine articles about the same country — economic policy, sports, cultural events — pull the mean down even when 30 of those articles are about active armed conflict. Reducing the pool to 30 means the score reflects the 30 most alarming geo stories in the last 72 hours. Routine coverage does not disappear from the system — it is still in Firestore, still displayed in the feed — it just no longer dilutes the scoring signal.
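
The dilution effect is easy to see with made-up numbers. In the sketch below, the function name, the empty-input fallback, and the article scores are all illustrative assumptions: 30 conflict articles at 8.0 and 70 routine articles at 3.0.

```python
def geo_score(article_scores: list, pool_size: int) -> float:
    """Mean of the top-N article scores; returns 0.0 for an empty pool (assumed)."""
    top = sorted(article_scores, reverse=True)[:pool_size]
    return sum(top) / len(top) if top else 0.0


# Hypothetical 72-hour window: 30 conflict articles and 70 routine ones.
scores = [8.0] * 30 + [3.0] * 70

print(f"pool of 100: {geo_score(scores, 100):.2f}")  # routine coverage dilutes the mean
print(f"pool of 30:  {geo_score(scores, 30):.2f}")   # only the top 30 count
```

With these numbers the 100-article pool averages 4.5 while the 30-article pool averages 8.0, the same gap the paragraph above describes.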

Heatmap: coloring by peak signal

The geo heatmap previously colored each country by the average score across all articles mentioning that country in the last 24 hours. Russia might appear in 40 articles: 10 about missile strikes scoring 8–9, and 30 about sanctions, energy prices, diplomatic statements, and sports scoring 2–4. The average was around 4.5, producing an "elevated" yellow color. v17 changes the basis to the mean of the top-10 highest-scoring articles per country. The same Russia example now produces a mean of 8.2 from its ten highest-scoring conflict articles, correctly rendering as critical red. Countries not in active conflict still render at their true level, because their top-10 articles are not war reports.
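
A minimal sketch of the two color bases, using invented data: one hypothetical country with ten conflict articles and thirty routine ones, and another with routine coverage only. The function name, the (country, score) tuple layout, and the scores are assumptions; only the top-10 basis and the 7.0 critical threshold come from this release.

```python
from collections import defaultdict

CRITICAL_THRESHOLD = 7.0  # v17 heatmap critical threshold
TOP_N = 10                # v17 heatmap basis


def heatmap_scores(articles, top_n=None):
    """Per-country mean score. top_n=None averages every article (old basis);
    an integer averages only that many highest-scoring articles (new basis)."""
    by_country = defaultdict(list)
    for country, score in articles:
        by_country[country].append(score)

    result = {}
    for country, scores in by_country.items():
        ranked = sorted(scores, reverse=True)
        chosen = ranked if top_n is None else ranked[:top_n]
        result[country] = sum(chosen) / len(chosen)
    return result


# Hypothetical 24-hour sample.
articles = (
    [("country_a", 8.5)] * 10    # conflict reporting
    + [("country_a", 3.0)] * 30  # routine coverage
    + [("country_b", 3.0)] * 40  # no conflict at all
)

old_basis = heatmap_scores(articles)
new_basis = heatmap_scores(articles, TOP_N)
for country in sorted(new_basis):
    critical = new_basis[country] >= CRITICAL_THRESHOLD
    print(f"{country}: all-article mean = {old_basis[country]:.2f}, "
          f"top-{TOP_N} mean = {new_basis[country]:.2f}, critical = {critical}")
```

The country without conflict coverage scores the same under both bases, which is the property the last sentence above relies on.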

Rollout note

The new urgency and sentiment thresholds only apply to articles scored after the v17 deployment. Existing articles in Firestore have their scores cached and will not be automatically re-scored. As the 72-hour article window turns over — roughly by Saturday — all articles in the scoring pool will carry v17-calibrated scores. The recency half-life change, pool-size change, and blend weight change take effect on every poll cycle immediately, regardless of when individual articles were scored.