v3.1: AI Scoring Cost Fix — Score Once, Cache Forever

A single logic error was re-scoring the same articles on every 2-minute poll cycle, sending identical headlines to Claude and Gemini hundreds of times per day. The fix is simple: score an article once, store the result, and never send it to a model again.

Running WakeUpNeo.ai in production surfaces real numbers. This month's Claude API bill came in at $32. For a project at this stage, that number is too high — and the cause turned out to be a straightforward logic error rather than anything inherent to the architecture.

What was happening

The core scoring function, calculateMeter, runs on a 2-minute schedule. Every time it fires, it fetches the last 72 hours of articles — potentially 400 to 500 documents — and sends the top 50 of them to Claude and the top 50 to Gemini, unconditionally. It had no memory of whether it had already scored those articles in a previous cycle.

An article published at 9am on Monday would be fetched, sent to Claude, scored, and written back. Two minutes later it was fetched again, sent to Claude again, scored again, and written back again. This repeated 720 times a day for as long as the article stayed in the top 50, up to 2,160 times before it aged out of the 72-hour window.

50 articles × 720 cycles/day × 30 days = 1,080,000 article scorings/month, per model
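The arithmetic above can be checked with a few lines of TypeScript. The figures come from the text; the variable names are illustrative only.

```typescript
// Figures from the text: 2-minute poll cycle, 50 articles per model per cycle.
const pollIntervalMinutes = 2;
const batchSizePerModel = 50;
const daysPerMonth = 30;

// 24 hours * 60 minutes / 2 minutes per cycle
const cyclesPerDay = (24 * 60) / pollIntervalMinutes;

// Scorings sent to a single model over a 30-day month
const scoringsPerMonth = batchSizePerModel * cyclesPerDay * daysPerMonth;

console.log(cyclesPerDay);     // 720
console.log(scoringsPerMonth); // 1080000
```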

The fix

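Each article already stores a claudeEnhanced flag and a numeric score after its first AI pass. The fix adds a pre-flight split: before making any API call, articles are divided into two buckets. Articles that already carry the flag and a score keep their stored result and are not sent anywhere. Only articles that have never been scored go to Claude and Gemini.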

The batch cap also dropped from 50 to 20 new articles per cycle. News sources collectively publish roughly 150 new articles per day — about 6 per hour, or fewer than 1 every 2 minutes on average. A cap of 20 is more than sufficient to keep up with any realistic ingest burst.
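The pre-flight split and the batch cap described above might look roughly like the sketch below. It is not the project's actual code: only claudeEnhanced, score, and fetchedAt are named in this post, so the Article shape, the splitForScoring helper, the deferred bucket, and the newest-first ordering are all assumptions made for illustration.

```typescript
// Hypothetical article shape; id and title are assumed fields.
interface Article {
  id: string;
  title: string;
  fetchedAt: number; // epoch milliseconds
  claudeEnhanced?: boolean;
  score?: number;
}

const MAX_NEW_PER_CYCLE = 20; // batch cap from the text

interface SplitResult {
  cached: Article[];   // already scored: reuse the stored result
  toScore: Article[];  // never scored: send to the models this cycle
  deferred: Article[]; // never scored, over the cap: wait for a later cycle
}

function splitForScoring(
  articles: Article[],
  cap: number = MAX_NEW_PER_CYCLE
): SplitResult {
  const cached: Article[] = [];
  const unscored: Article[] = [];

  for (const article of articles) {
    const alreadyScored =
      article.claudeEnhanced === true && typeof article.score === "number";
    if (alreadyScored) {
      cached.push(article);
    } else {
      unscored.push(article);
    }
  }

  // Assumption: newest articles are scored first when a burst exceeds the cap.
  unscored.sort((a, b) => b.fetchedAt - a.fetchedAt);

  return {
    cached,
    toScore: unscored.slice(0, cap),
    deferred: unscored.slice(cap),
  };
}

// Usage: one previously scored article, one new article.
const sample: Article[] = [
  { id: "a", title: "Old headline", fetchedAt: 1000, claudeEnhanced: true, score: 62 },
  { id: "b", title: "New headline", fetchedAt: 2000 },
];
const result = splitForScoring(sample);
console.log(result.cached.map((a) => a.id));  // [ 'a' ]
console.log(result.toScore.map((a) => a.id)); // [ 'b' ]
```

With this shape, a cycle that finds no unscored articles makes no API call at all, which is where the savings come from.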

Cost impact

Model                 Before     After
Claude (Haiku)        ~$30/mo    ~$1.20/mo
Gemini (Flash Lite)   ~$2/mo     ~$0.09/mo
Combined              ~$32/mo    ~$1.30/mo

The ~96% reduction (from ~$32 to ~$1.30 per month) comes entirely from eliminating redundant work: no model was swapped, no prompt was shortened, no quality tradeoff was made. Claude Haiku 4.5 and Gemini Flash Lite remain the active models. Each article still receives the same depth of AI analysis on its first pass through the pipeline.

What does not change

The recency factor in each article's score is recalculated fresh on every meter cycle from the article's fetchedAt timestamp — it does not use the cached value. So the meter's time-sensitivity is preserved: older articles naturally decay in influence even though their AI-derived sentiment and urgency scores are frozen at the time of first scoring. This is correct behaviour. The semantic content of an article does not change after it is published.
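A minimal sketch of how a frozen AI score can be combined with a freshly computed recency factor on each cycle. The post does not give the actual decay curve, so the linear falloff across the 72-hour window is an assumption, as are the names recencyFactor and weightedScore.

```typescript
const WINDOW_HOURS = 72; // article window from the text
const MS_PER_HOUR = 60 * 60 * 1000;

// Hypothetical linear decay: 1.0 at fetch time, 0.0 at the edge of the window.
function recencyFactor(fetchedAtMs: number, nowMs: number): number {
  const ageHours = (nowMs - fetchedAtMs) / MS_PER_HOUR;
  return Math.min(1, Math.max(0, 1 - ageHours / WINDOW_HOURS));
}

// The cached AI score is read as-is; only the recency multiplier is recomputed.
function weightedScore(
  cachedScore: number,
  fetchedAtMs: number,
  nowMs: number
): number {
  return cachedScore * recencyFactor(fetchedAtMs, nowMs);
}

// An article scored 80 and fetched 36 hours ago carries half its weight.
const nowMs = 100 * MS_PER_HOUR;
const fetchedAtMs = nowMs - 36 * MS_PER_HOUR;
console.log(weightedScore(80, fetchedAtMs, nowMs)); // 40
```

Because the multiplier depends only on fetchedAt and the current time, it costs nothing to recompute and needs no model call.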