There is a special kind of dashboard bug that does not look dramatic at first.

No page crash. No missing deploy. No giant red stack trace. Just two numbers that should agree, quietly disagreeing.

That happened in my financial markets report. The Risk Overview header showed an uncapped risk score of 778. The SVG trend chart showed “Today” at 858.

One page. One concept. Two answers.

For a decorative dashboard, that might be annoying. For a report I actually use to understand market risk, it matters. A decision-support tool does not get to be mostly right in the places where the reader is trying to build confidence.

The bug was not in the math

My first instinct was to suspect the scoring engine. That would have been the exciting explanation: a model issue, a threshold mistake, a weighting problem, something worth a whole architecture rewrite.

The actual cause was less glamorous and more useful.

The HTML build was reading the daily risk snapshot before the current run wrote today’s snapshot to disk. The header used the live health.score_uncapped value from the current run. The chart used risk_score_daily.json, which still held the prior trading day’s saved point.

So both numbers were “real.” They just came from two different moments in the pipeline.

The fix was small: write the JSONL history and upsert the daily snapshot before building the HTML. Then the banner and the chart both read from the same current run.
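Here is a minimal sketch of that ordering, not the report's actual code. The function names, the risk_history.jsonl file name, and the "date" and "score_uncapped" row fields are assumptions made for illustration; only risk_score_daily.json comes from the real pipeline.

```python
import json
from pathlib import Path


def append_history(history_path: Path, record: dict) -> None:
    """Append one run record as a single line of JSON (JSONL)."""
    with history_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def upsert_daily_snapshot(snapshot_path: Path, run_date: str, score: float) -> list[dict]:
    """Insert or replace the row for run_date, keeping one row per date."""
    rows = []
    if snapshot_path.exists():
        rows = json.loads(snapshot_path.read_text(encoding="utf-8"))
    by_date = {row["date"]: row for row in rows}
    by_date[run_date] = {"date": run_date, "score_uncapped": score}
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    rows = [by_date[d] for d in sorted(by_date)]
    snapshot_path.write_text(json.dumps(rows, indent=2), encoding="utf-8")
    return rows


def build_html(live_score: float, snapshot_path: Path) -> str:
    """Stand-in for the real HTML build: header from live state, chart from disk."""
    rows = json.loads(snapshot_path.read_text(encoding="utf-8"))
    chart_today = rows[-1]["score_uncapped"]
    return f"<header>{live_score}</header><svg data-today='{chart_today}'></svg>"


def generate_report(out_dir: Path, run_date: str, live_score: float) -> str:
    """Persist first, render second, so both numbers come from the same run."""
    record = {"date": run_date, "score_uncapped": live_score}
    append_history(out_dir / "risk_history.jsonl", record)  # hypothetical file name
    snapshot_path = out_dir / "risk_score_daily.json"
    upsert_daily_snapshot(snapshot_path, run_date, live_score)
    return build_html(live_score, snapshot_path)
```

The point of the sketch is the order inside generate_report. If build_html ran before the upsert, the chart would read whatever row the previous run left behind.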

That is the kind of bug that makes me appreciate boring sequencing. In a pipeline, “what happens first” is not housekeeping. It is product behavior.

Freshness is a user experience feature

The same session started with a freshness check. The morning report was not showing yet, so I checked GitHub Actions. There was no failure. The weekday cron run had simply started late, which scheduled GitHub Actions workflows can do when the service is under heavy load. The run was late, not broken.

I manually triggered the workflow, watched the report build, and confirmed Pages deployed.

That sounds operational, but it is also UX. If someone expects a daily report at 7 a.m. Eastern and it appears later, the page can feel stale even when the system is healthy. If the chart says “today” but is reading yesterday’s saved point, the page can feel untrustworthy even when the model is healthy.

The lesson I keep learning: data products need to explain their timing as clearly as their numbers.

The reader should not need to know how GitHub Actions scheduling works. But the system should be built with the humility that schedules are best-effort, local clones can get stale, and persisted files have to be updated in the right order.
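One way to make timing explicit is to label the latest chart point by the date it was actually saved, and only call it “Today” when that date matches the current Eastern date. This is a small sketch of the idea, not the report's real code. The function names and the “As of” wording are assumptions.

```python
from datetime import date, datetime
from zoneinfo import ZoneInfo


def today_eastern() -> date:
    """Current calendar date in US Eastern time."""
    return datetime.now(ZoneInfo("America/New_York")).date()


def latest_point_label(snapshot_date: date, today: date) -> str:
    """Label the newest chart point honestly, based on when it was saved."""
    if snapshot_date == today:
        return "Today"
    return f"As of {snapshot_date.isoformat()}"
```

A caller would pass today_eastern() as the second argument. With a label like this, a late run shows an older date instead of a stale number presented as current.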

The small checklist that saved the day

This was not a heroic debugging session. It was a good one because the loop stayed tight:

  1. Notice the mismatch.
  2. Identify which number came from live state and which came from persisted history.
  3. Move snapshot persistence earlier in the report generation sequence.
  4. Sync the public deploy clone before patching it.
  5. Push both repos and verify the workflow.

The “sync the public clone” step deserves its own little spotlight. My local financial-reports clone was 56 commits behind origin/main. If I had patched first and pulled later, I could have created a needless merge mess or overwritten context I needed.

So the durable habit is simple: pull before porting.
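The habit can also be turned into a guard that runs before any patching. The sketch below shells out to git fetch and git rev-list --count and refuses to continue if the local clone is behind. The function names and the injectable run parameter are assumptions added for illustration and testing.

```python
import subprocess


def commits_behind(repo_dir: str, upstream: str = "origin/main", run=subprocess.run) -> int:
    """Return how many commits the local HEAD is behind the upstream branch."""
    # Refresh remote-tracking refs so the count reflects the real remote state.
    run(["git", "fetch", "--quiet"], cwd=repo_dir, check=True)
    result = run(
        ["git", "rev-list", "--count", f"HEAD..{upstream}"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return int(result.stdout.strip())


def ensure_up_to_date(repo_dir: str, upstream: str = "origin/main", run=subprocess.run) -> None:
    """Raise before patching if the clone is missing upstream commits."""
    behind = commits_behind(repo_dir, upstream, run=run)
    if behind:
        raise RuntimeError(
            f"{repo_dir} is {behind} commits behind {upstream}; pull before patching"
        )
```

Calling ensure_up_to_date on the deploy clone at the start of a porting session turns “I should have pulled” into an error message instead of a merge mess.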

Not fancy. Very effective.

What I like about this kind of work

The positive part of this session is that the report keeps becoming more real.

Early on, the financial agent was an impressive build: APIs, scoring, static HTML, AI-generated narrative, deployment. Now it is becoming a tool with product expectations. I notice when the morning run is late. I notice when two risk numbers disagree. I care whether the chart and header speak from the same state.

That is a good sign.

It means the artifact has crossed from “look what I built” into “I rely on this enough to be annoyed when it is wrong.”

That is where product sense sharpens. Real use creates better bugs. Better bugs create better systems.

Trust is built in the unglamorous layer

People talk a lot about AI features: analysis, summaries, forecasts, recommendations. I like those too. But the parts that make me trust the system are often quieter:

  • The chart reads the same current snapshot as the banner.
  • The workflow run is visible and verifiable.
  • The public site deploys from source changes.
  • The daily history has one canonical row per date.
  • The report can be rebuilt without hand-editing HTML.

None of that is flashy. All of it matters.
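Two of those quiet guarantees are easy to check mechanically: one row per date, and a chart point for the run date that matches the banner. The following is a sketch of such a check under assumed names. The check_snapshot function and the "date" and "score_uncapped" row fields are hypothetical, not taken from the report's code.

```python
def check_snapshot(rows: list[dict], run_date: str, live_score: float) -> list[str]:
    """Return a list of consistency problems; an empty list means all clear."""
    problems = []

    # The daily history should hold exactly one canonical row per date.
    dates = [row["date"] for row in rows]
    if len(dates) != len(set(dates)):
        problems.append("duplicate dates in daily history")

    # The chart's point for this run must exist and match the banner value.
    todays = [row for row in rows if row["date"] == run_date]
    if not todays:
        problems.append(f"no snapshot row for {run_date}")
    elif todays[-1]["score_uncapped"] != live_score:
        problems.append(
            f"snapshot {todays[-1]['score_uncapped']} != live score {live_score}"
        )
    return problems
```

Run just before the HTML build, a check like this would have flagged the 778 versus 858 mismatch as a missing row for the current date rather than leaving it for a reader to spot.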

A dashboard does not earn trust by looking confident. It earns trust by being internally consistent, fresh enough for its purpose, and honest about how it was built.

This is one of my favorite parts of building with AI agents in Cursor: the agent can help move fast, but the judgment still has to come from use. I had to look at the page, feel the mismatch, ask why, and insist that the answer be boringly correct.

That is the work I want more of: cheerful, practical, evidence-seeking iteration. A live report gets a little more trustworthy. A workflow gets a little cleaner. A future reader has one less reason to wonder which number to believe.