Expose last success and coverage, not just a schedule

April 11, 2026 · 10:30 PM UTC · ~1 min · 144w

I have seen charts treat a missing row as a zero, models treat “no update today” as “price unchanged,” and on-call engineers assume that hourly runs mean the data is at most an hour old while half the targets had failed open. None of these is primarily a modeling bug. Each comes from missing context about fetch health.

Useful fields are boring: timestamp of last successful run per source or partition, counts you got versus counts you expected, a clear failed state versus an empty extract, and enough error retention to debug without SSH. If those ride along with the dataset, consumers can join them or gate transforms. If they do not, people guess.
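As a rough illustration of those fields, here is a minimal Python sketch of a per-source health record that could ship next to a dataset. Every name in it (`FetchHealth`, `RunState`, `usable`, the 0.95 coverage threshold) is hypothetical and chosen for the example, not taken from any particular pipeline.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class RunState(Enum):
    OK = "ok"          # fetch completed and returned rows
    EMPTY = "empty"    # fetch completed; the source really had nothing
    FAILED = "failed"  # fetch did not complete; the data is unknown, not zero


@dataclass(frozen=True)
class FetchHealth:
    """Hypothetical health record for one source or partition."""
    source: str
    state: RunState
    last_success: Optional[datetime]  # None if the source never succeeded
    rows_got: int
    rows_expected: int
    last_error: Optional[str] = None  # retained so debugging needs no SSH

    @property
    def coverage(self) -> float:
        """Share of expected rows that actually arrived, capped at 1.0."""
        if self.rows_expected <= 0:
            # Nothing was expected: a completed run is fully covered,
            # a failed run is not.
            return 0.0 if self.state is RunState.FAILED else 1.0
        return min(self.rows_got / self.rows_expected, 1.0)


def usable(health: FetchHealth, min_coverage: float = 0.95) -> bool:
    """Gate a transform: reject failed runs and low-coverage extracts."""
    return (
        health.state is not RunState.FAILED
        and health.coverage >= min_coverage
    )


if __name__ == "__main__":
    record = FetchHealth(
        source="prices",
        state=RunState.OK,
        last_success=datetime(2026, 4, 11, 21, 0, tzinfo=timezone.utc),
        rows_got=90,
        rows_expected=100,
    )
    print(record.coverage, usable(record))
```

Keeping `EMPTY` and `FAILED` as separate states is the point of the sketch: a consumer joining on this record can tell "the source had nothing" apart from "we do not know."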

Pick a maximum staleness that matches the decision, alert when last success crosses it, and tune the crawl after that signal is trustworthy. A schedule string in the repo does not substitute for those metrics.
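The staleness rule can be sketched in a few lines of Python. This assumes last-success timestamps are timezone-aware and already collected per source; the function names and the one-hour budget are hypothetical, and a source that has never succeeded is treated as stale.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def is_stale(
    last_success: Optional[datetime],
    max_staleness: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """True when the last success is older than the allowed staleness."""
    if last_success is None:
        # Never succeeded: there is no fresh data to rely on.
        return True
    current = now if now is not None else datetime.now(timezone.utc)
    return current - last_success > max_staleness


def stale_sources(
    last_success_by_source: Dict[str, Optional[datetime]],
    max_staleness: timedelta,
    now: Optional[datetime] = None,
) -> List[str]:
    """Names of sources that should trigger an alert, sorted for stable output."""
    return sorted(
        name
        for name, last_success in last_success_by_source.items()
        if is_stale(last_success, max_staleness, now)
    )


if __name__ == "__main__":
    now = datetime(2026, 4, 11, 22, 30, tzinfo=timezone.utc)
    seen = {
        "prices": now - timedelta(minutes=30),
        "fx": now - timedelta(hours=3),
        "news": None,
    }
    print(stale_sources(seen, timedelta(hours=1), now))
```

The check reads only the last-success timestamps, never the schedule, so a job that runs hourly but keeps failing still shows up as stale.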