A data story
Cloudy answers a practical question for Swedish locations: what is normal here, what has happened recently, and how much should that change the next few weeks? This deck follows the product as it is: the data we ingest, the cleanup rules, the read path, the normals view, the weekly outlook, the spatial estimate, and the deploy path.
PART 1
What the data gives us, and the two gaps we close: space between stations, and recency.
Roadmap
SMHI gives us cloud cover at 109 weather stations across Sweden. That tells us how cloudy it is at each station, but nothing about the air in between. A seasonal normal tells us how cloudy a place usually is in a given week of the year, but it is flat year over year and has no sense of what the last few weeks actually did. We close those two gaps with two kinds of model: spatial models fill in space, and a damped persistence model adds recency in time.
Roadmap
This grid is the map for the deck. The horizontal axis is time: on the left, the seasonal normal; on the right, recent conditions. The vertical axis is space: at the top, values at stations; at the bottom, estimates at any point. The shipped product fills the grid with small, explicit pieces: normals, damped persistence, kNN spatial estimates, and their composition. Lightning is area-based, so it uses its own simpler shape: count strike-days in a circle and compare them to the seasonal normal.
PART 2
Two live SMHI sources: what each contains, what it does not, and how it is served.
Data
The system serves two SMHI observation feeds. Each has its own ingest module and carries a (source, source_version) tag inside its natural key, so feeds can never collide. Cloud is a station time series. Lightning is a country-wide event stream. Everything else in the app is derived from those two sources.
| Source | What it is | Provider | source tag |
|---|---|---|---|
| SMHI cloud cover (param 16) | hourly cloud % at stations | SMHI metobs | smhi-metobs |
| SMHI lightning | per-discharge strike events | SMHI | smhi-lightning |
The ingest modules are ingest/cloud.py and ingest/lightning.py. The read path and model views are built from these tables and their rollups; no proxy weather source is served to users.
Data
SMHI cloud cover is the only cloud source served to users. It gives hourly total cloud cover at each station, but only at the station: there is nothing in between, nothing sub-hourly, and no wind, temperature, or humidity to explain the value.
Data
Unlike cloud, lightning covers the whole country, not just stations: one row per discharge. It is the storm-activity signal. It carries no cloud cover, no stored density raster, and no altitude.
Data
The app has two exploration pages beside Normals and Predictions. Exploration shows cloud and lightning as synchronized time-series charts for the selected location. Map shows lightning events in the current viewport and time window. These are the pages that make the raw feeds inspectable before they become seasonal normals or weekly outlooks.
Data
The Exploration chart asks for cloud at a chosen resolution — anywhere from hourly to yearly. Aggregating cloud_hourly on every request would be slow, so on each ingest we precompute per-station buckets at six fixed resolutions and store them in cloud_rollups. The read path then serves those rows directly: a chart is a lookup, not an aggregation. Each bucket also keeps its counts — observed, expected, missing — so a gappy week is reported as gappy instead of silently averaging over holes. Because it materializes six resolutions across every station and every bucket, this derived table is the largest on disk, which the next slide shows.
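A minimal sketch of how one bucket's counts and mean could be derived, so that a gappy bucket reports its gaps. The names `BucketSummary` and `summarize_bucket` are hypothetical and not taken from the codebase; the sketch assumes missing hours arrive as `None` and that the expected count is known from the bucket's resolution.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class BucketSummary:
    """Hypothetical stand-in for the count and mean columns of one rollup row."""

    observed_count: int
    expected_count: int
    missing_count: int
    mean_cloud_pct: float | None


def summarize_bucket(hourly_values: list[float | None], expected_count: int) -> BucketSummary:
    """Summarize one bucket; None entries are hours with no usable reading."""
    present = [v for v in hourly_values if v is not None]
    observed = len(present)
    return BucketSummary(
        observed_count=observed,
        expected_count=expected_count,
        missing_count=max(expected_count - observed, 0),
        # An empty bucket has no mean, rather than a misleading 0.
        mean_cloud_pct=fmean(present) if present else None,
    )


# A "day" bucket expects 24 hourly readings; here only three usable ones arrived.
day = summarize_bucket([100.0, None, 50.0, 75.0], expected_count=24)
print(day)
```

Keeping the counts next to the mean lets a chart flag a bucket built from 3 of 24 hours instead of presenting it like a fully observed day.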
# ingest/cloud.py: rebuilt from cloud_hourly on every ingest
ROLLUP_RESOLUTIONS = ("hour", "6h", "day", "week", "month", "year")

class CloudRollup(SQLModel, table=True):
    station_id: int
    resolution: str  # one of ROLLUP_RESOLUTIONS
    bucket_start: datetime
    observed_count: int
    expected_count: int
    missing_count: int
    mean_cloud_pct: float | None
    p05_cloud_pct: float | None
    p50_cloud_pct: float | None
    p95_cloud_pct: float | None
Data
The live database holds two raw sources plus the precomputed serving rollups. The chart below is a snapshot of table size on disk, in megabytes; longer bars are larger tables. The top bar, cloud_rollups, is derived data, not raw input, and is the single largest table at 5.7 GB because it stores every station at six serving resolutions.
PART 3
Where raw values are normalized, where they are preserved, and how writes stay repeatable.
Cleanup
Every raw cloud reading passes through a single function before it becomes a percentage. Sentinels (113, 9999, -9999), negatives, anything over 100, and octa readings above 8 all return None. The 113 code means the sky was obscured or not observable, so we treat it as unknown rather than as overcast. Octas convert with value / 8 * 100. Because the fix lives in one place, there is no second code path to keep in sync.
MISSING_SENTINELS = frozenset({113, 9999, -9999})

def normalize_cloud_pct(raw, *, octas=False):
    if raw is None or raw == "":
        return None
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return None
    if value in MISSING_SENTINELS or value < 0:
        return None
    if octas:
        if value > 8:
            return None
        return value / 8.0 * 100.0
    if value > 100:
        return None
    return value
Cleanup
Strike events get no value sanitizer. We do not filter peak current or any other field against a physical range, because the raw measurements are the signal we want to keep. The only thing we drop is structurally malformed CSV lines: a row that fails to parse is caught, counted, and skipped. Everything that parses is stored verbatim, including the raw quality geometry, so it stays available for later use.
for line in csv.DictReader(f, delimiter=";"):
    try:
        ts = datetime(int(line["year"]), ...)
        row = {"peak_current_ka": float(line["peakCurrent"]), ...}
        rows.append(row)
    except (KeyError, ValueError, TypeError):
        skipped += 1
Cleanup
Each ingest unit is replaced or upserted inside one transaction, so re-running a day or a station does not duplicate rows or leave a half-written state. The (source, source_version) pair is part of every natural key. Corrected cloud archives use delete-then-insert; rolling cloud updates use upsert; lightning replaces one day and refreshes its rollup. New data lands without rewriting the full archive.
with engine.begin() as conn:  # one transaction: replace the whole day
    conn.execute(
        delete(LightningEvent).where(
            LightningEvent.day == day,
            LightningEvent.source_version == SOURCE_VERSION,
        )
    )
    if rows:
        conn.execute(insert(LightningEvent), rows)
    refresh_sweden_daily_rollups(conn, day, day)
PART 4
The stack and the decisions that keep the read path fast and the Python/TypeScript contract mechanical.
Architecture
Pydantic validates every request against bounds and enums before any query runs. A latitude outside the Sweden envelope, a radius that is not 50 or 100, or a longitude without its latitude all return a 422 instead of a bad result. Responses are plain TypedDicts, so they describe shape at type-check time and cost nothing at runtime.
class PredictionsCloudQuery(BaseModel):
    model_config = {"populate_by_name": True}
    lat: Annotated[float | None, Field(ge=54.0, le=70.0)] = None
    lon: Annotated[float | None, Field(ge=9.0, le=26.0)] = None
    radius_km: Literal[50, 100] = 50

    @model_validator(mode="after")
    def _location_is_a_pair(self) -> Self:
        if (self.lat is None) ^ (self.lon is None):
            raise ValueError("lat and lon must be provided together")
        return self
Architecture
The frontend does not hand-write API types. FastAPI emits an OpenAPI schema from the same response models, and openapi-typescript turns that schema into schema.gen.ts, which the frontend imports. The generation runs offline — no server, no database — so it works the same in CI and on a laptop. Rename a field in Python and the dependent TypeScript stops compiling.
# frontend/scripts/gen-api.sh
( cd "${BACKEND_DIR}" && uv run python -c \
"import json, cloudy.api as a; print(json.dumps(a.create_app().openapi(), indent=2))" \
) > "${SCHEMA_JSON}"
node -e "JSON.parse(require('fs').readFileSync(process.argv[1],'utf8'))" "${SCHEMA_JSON}"
pnpm exec openapi-typescript "${SCHEMA_JSON}" -o "${SCHEMA_TS}"
Architecture
The Sweden-wide normal is a percentile scan over about 10 million rows, which takes roughly 10 seconds when run live — too slow for a request. So we run that scan once per ingest, off the request path, and write the result to a table. The read path then serves the materialized rows. Located queries touch only a few stations, so those stay live and sub-second.
def refresh_sweden_normals(
    engine, source="smhi-metobs", source_version="1.0"
) -> int:
    written = 0
    with engine.begin() as conn:
        for period in ("day", "month", "year"):
            bucket = f"EXTRACT({_PERIOD_FIELD[period]} FROM ts_utc)::int"
            rows = conn.execute(
                text(_NORMAL_SQL.format(bucket=bucket, station_filter=_SWEDEN_FILTER))
            ).all()
            conn.execute(delete(CloudNormal).where(...))
            if rows:
                conn.execute(insert(CloudNormal), [... for row in rows])
            written += len(rows)
    return written
Architecture
A real deployment would use a shared cache like Redis. Here it is an in-memory LRU. The route code never names either one: it talks to a Cache Protocol whose values are JSON strings only. Swapping the backend is a new implementation plus a config value, not a change to any route.
class Cache(Protocol):
    def get(self, key: str) -> str | None: ...
    def set(self, key: str, value: str, ttl_s: int) -> None: ...

@lru_cache  # one cache instance per process
def get_cache() -> Cache:
    backend = get_settings().cache_backend
    if backend == "memory":
        return MemoryCache()
    raise ValueError(f"unknown cache backend: {backend!r} (supported: memory)")
Goals
Cloud and lightning need different shapes of answer. Cloud is a value at a point, and the nearest stations may be far away, so the work is estimating across distance. Lightning is regional — a strike lands somewhere in an area, and how far that is from a station does not matter — so the work is counting in a circle. Recency (the damped model) is explored for both, but cloud is the clean case.
PART 5
A 1-2 week outlook that nudges the seasonal normal toward what the recent weeks actually did.
Prediction
A seasonal normal says how cloudy a week usually is, but it has no idea what just happened. The fix is small: take the normal for the week and add a fraction of how far the recent weeks have run above or below it. That fraction is alpha, the lag-k autocorrelation of weekly anomalies. Weekly anomalies persist (lag-1 is about 0.3 across Sweden, higher in some places); monthly ones barely do. We clamp alpha to [0, 1]: floored at 0 so a noisy negative value cannot flip the signal, capped at 1 so we never amplify an anomaly. At alpha = 0 the forecast collapses back to the normal.
| normal (this week) | recent week ran | gap a | α | forecast = normal + α·a |
|---|---|---|---|---|
| 65% | 80% — cloudier | +15 | 0.30 (lead 1) | 65 + 0.30×15 = 69.5% |
| 65% | 50% — clearer | −15 | 0.30 | 65 − 0.30×15 = 60.5% |
| 65% | 80% — cloudier | +15 | 0.10 (lead 2) | 65 + 0.10×15 = 66.5% |
| 65% | 80% — cloudier | +15 | 0.00 (no persistence) | 65 + 0 = 65% — the normal |
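The rows above can be reproduced with a few lines. This is a sketch of the arithmetic only, not the shipped predictor; the function name `damped_forecast` is hypothetical, and the inputs are the table's own illustrative values.

```python
def damped_forecast(normal_pct: float, recent_pct: float, alpha: float) -> float:
    """Return normal + alpha * (recent - normal), with alpha clamped to [0, 1]."""
    alpha = max(0.0, min(1.0, alpha))  # floor 0, cap 1, as described above
    gap = recent_pct - normal_pct      # positive means cloudier than normal
    return normal_pct + alpha * gap


# (normal, recent week, alpha) for the four table rows.
rows = [
    (65.0, 80.0, 0.30),
    (65.0, 50.0, 0.30),
    (65.0, 80.0, 0.10),
    (65.0, 80.0, 0.00),
]
forecasts = [round(damped_forecast(*row), 1) for row in rows]
print(forecasts)
```

Because alpha is clamped before use, a noisy negative estimate behaves exactly like alpha = 0 and returns the normal.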
from statistics import fmean

def fit_alpha(anomalies, lead):
    # Lag-`lead` autocorrelation of the anomalies, clamped to [0, 1].
    present = [a for a in anomalies if a is not None]
    if len(present) <= lead:
        return 0.0
    mean = fmean(present)
    var = sum((a - mean) ** 2 for a in present) / len(present)
    if var == 0:
        return 0.0
    pairs = [
        (a, b)
        for t in range(len(anomalies) - lead)
        if (a := anomalies[t]) is not None and (b := anomalies[t + lead]) is not None
    ]
    if not pairs:  # gaps can leave no complete (t, t + lead) pair
        return 0.0
    cov = sum((a - mean) * (b - mean) for a, b in pairs) / len(pairs)
    return max(0.0, min(1.0, cov / var))  # floor 0, cap 1
Prediction
To trust the outlook we score it the way it would actually run. At each weekly origin we rebuild the normal from only the weeks up to that origin, so a past forecast is never measured against data from its own future. We let about two years of weeks accumulate first as warm-up, then start scoring. The baseline is the normal itself, which always predicts an anomaly of zero. Skill is the fraction by which the model cuts the baseline's error: 1 minus model MAE over baseline MAE.
climatology = {woy: total[woy] / count[woy] for woy in total}
causal = [
    None if v is None else v - climatology[woys[i]]
    for i, v in enumerate(values[: origin + 1])
]
prediction = predict(causal, woys, origin, lead)
target = actual - climatology[target_woy]
model_err.append(abs(target - prediction))
base_err.append(abs(target))  # the normal predicts zero anomaly
skill = 1.0 - fmean(model_err) / base_mae
Prediction
Averaged over Sweden the gain is small but consistent: lead-1 median skill is +1.9%, and the model beats the normal at 98.2% of stations. Stockholm is a clearer example. The chart plots rolling 52-week mean absolute error, not raw cloud: gray is the seasonal normal, blue is the damped outlook, and lower is better. Across the backtest the normal is off by 23.0 points on average; the damped outlook is off by 16.6.
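To connect the Stockholm errors to the skill definition, here is a small sketch of the score itself. The function name `skill_score` is hypothetical, and the example assumes the two quoted averages (23.0 and 16.6 points) are mean absolute errors over the same backtest, so each is passed as a single-value error list.

```python
from statistics import fmean


def skill_score(model_errors: list[float], baseline_errors: list[float]) -> float:
    """Skill = 1 - MAE(model) / MAE(baseline); positive means the model wins."""
    base_mae = fmean(baseline_errors)
    if base_mae == 0:
        raise ValueError("baseline MAE is zero; skill is undefined")
    return 1.0 - fmean(model_errors) / base_mae


# Stockholm averages quoted above, treated as MAEs for illustration only.
stockholm_skill = skill_score([16.6], [23.0])
print(f"{stockholm_skill:.1%}")
```

Under that assumption the Stockholm outlook cuts the baseline error by a little over a quarter, far above the Sweden-wide median, which is why it reads as the clearer example.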
PART 6
Estimate the cloud normal anywhere in Sweden from the nearest stations, climbing three rungs of precision.
Spatial
For a location without a station, the useful signal is nearby stations. The product estimates the local cloud normal from the 5 nearest stations: nearest-station normal as the simple floor, kNN average as the shipped estimate, and a learned model as a benchmark check. The first two are direct statistics on real station observations. The benchmark exists to answer one question: does learning beat the average?
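The first two rungs can be sketched in a few lines. This is an illustration under stated assumptions, not the shipped estimator: the kNN estimate is taken to be an unweighted mean of the neighbours' normals, the function `knn_normal` is hypothetical, and the station coordinates and normals below are made-up values.

```python
from math import asin, cos, radians, sin, sqrt
from statistics import fmean

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    h = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def knn_normal(lat, lon, stations, k=5):
    """Return (nearest-station normal, mean normal of the k nearest stations).

    stations is a list of (lat, lon, normal_pct) tuples.
    """
    if not stations:
        raise ValueError("need at least one station")
    ranked = sorted(stations, key=lambda s: haversine_km(lat, lon, s[0], s[1]))
    nearest = ranked[:k]
    return nearest[0][2], fmean([s[2] for s in nearest])


# Made-up normals at three real city coordinates, for illustration only.
stations = [
    (59.33, 18.07, 70.0),  # near Stockholm
    (59.86, 17.64, 60.0),  # near Uppsala
    (57.71, 11.97, 80.0),  # near Gothenburg
]
floor, estimate = knn_normal(59.40, 18.00, stations, k=2)
print(floor, estimate)
```

With k=2 the distant third station is ignored, so the estimate is the mean of the two nearby normals while the floor is just the closest one.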
Spatial
The evaluation uses the same neighbour rule as serving: when the origin is a station, that station is excluded from its own neighbour list. That gives leave-station-out scoring directly from the data shape. Whole stations go to disjoint folds, and serving reuses the same feature writers, so the benchmark is measured on the same inputs the shipped estimate uses.
def nearest_neighbours(points, k=DEFAULT_NEIGHBOURS):
    neighbours = {}
    for origin in points:
        ranked = sorted(
            (
                (other.id, haversine_km(origin.lat, origin.lon, other.lat, other.lon))
                for other in points
                if other.id != origin.id
            ),
            key=lambda pair: pair[1],
        )
        neighbours[origin.id] = ranked[:k]
    return neighbours
The benchmark model was LightGBM with 400 trees, learning rate 0.05, 31 leaves, fit on MAE (regression_l1). Features are the 5 nearest stations' cloud values plus distance, bearing, lat/lon, and seasonal sin/cos. It was a path we explored and then discarded: the result on the next slide is kept as evidence, but the learned model itself is not in the codebase. Only the kNN estimate and the shared feature writers it was measured against ship.
Spatial
The deciding score is against held-out station observations, because that is the data the product serves. On that score the learned benchmark and kNN are a near-tie, and kNN is slightly better: 6.20 pp median weekly MAE versus 6.36. Each bar is weekly median MAE in cloud percentage points; lower is better.
PART 7
Strike chance is regional, so we count strikes in a circle; only normals exist so far.
Lightning
Lightning is regional, not tied to a station: a discharge lands somewhere in an area, and how far it is from a weather station does not matter. So we count strikes inside a circle (default radius 10 km, secondary 25 km) and work in lightning-days — calendar days with at least one strike in the circle. The probability for a month is lightning-days divided by the days actually observed, not by the days on the calendar, so missing coverage cannot inflate the number.
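A minimal sketch of the lightning-day probability described above. The function name `lightning_day_probability`, the input shapes, and the strike coordinates are assumptions for illustration, not taken from the codebase.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres (Earth radius 6371 km)."""
    p1, p2 = radians(lat1), radians(lat2)
    h = sin((p2 - p1) / 2) ** 2 + cos(p1) * cos(p2) * sin(radians(lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def lightning_day_probability(strikes, lat, lon, radius_km, observed_days):
    """Lightning-days in the circle divided by the days actually observed.

    strikes is an iterable of (day, lat, lon); observed_days is the set of
    dates with coverage. Returns None when nothing was observed.
    """
    if not observed_days:
        return None
    hit_days = {
        day
        for day, s_lat, s_lon in strikes
        if day in observed_days and haversine_km(lat, lon, s_lat, s_lon) <= radius_km
    }
    return len(hit_days) / len(observed_days)


# Made-up July: coverage exists for 20 of 31 days.
observed = {date(2024, 7, d) for d in range(1, 21)}
strikes = [
    (date(2024, 7, 3), 59.35, 18.10),   # inside the circle
    (date(2024, 7, 3), 59.30, 18.00),   # same day, counted once
    (date(2024, 7, 10), 59.35, 18.10),  # inside the circle
    (date(2024, 7, 12), 57.71, 11.97),  # hundreds of km away, ignored
]
prob = lightning_day_probability(strikes, 59.33, 18.07, 10.0, observed)
print(prob)
```

Dividing by the 20 observed days rather than the 31 calendar days gives the rate over the days that could actually have recorded a strike.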
The current-month figure is a linear extrapolation expressed in expected lightning-days: the days observed so far plus a climatology tail, where the tail is the monthly lightning-day rate times the days remaining. It is an expected count, not a compounded probability. The damped-persistence machinery does run on weekly lightning-days, but lightning is bursty and seasonal, so that output is shown only as indicative.
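The current-month extrapolation is simple enough to write out. This sketch assumes the monthly lightning-day rate is a per-day fraction between 0 and 1; the function name `expected_lightning_days` and the example numbers are hypothetical.

```python
def expected_lightning_days(observed_so_far: int, days_remaining: int, monthly_rate: float) -> float:
    """Lightning-days seen so far plus a climatology tail for the rest of the month."""
    if observed_so_far < 0 or days_remaining < 0:
        raise ValueError("counts must be non-negative")
    if not 0.0 <= monthly_rate <= 1.0:
        raise ValueError("monthly_rate is a per-day fraction in [0, 1]")
    tail = monthly_rate * days_remaining  # expected lightning-days still to come
    return observed_so_far + tail


# Made-up example: 2 lightning-days so far, 11 days left, climatological rate 0.1 per day.
expected = expected_lightning_days(2, 11, 0.1)
print(round(expected, 2))
```

The result is an expected count of days, so it can exceed 1 and should not be read as a probability.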
PART 8
The deployment is deliberately plain: static frontend, container API, serverless Postgres, and a raw archive cache for ingest.
Deploy
The deployed system has four moving parts. Cloudflare Pages serves the deck and React app. Fly.io runs the FastAPI container. Neon stores Postgres. Cloudflare R2 stores the gitignored SMHI raw archive so scheduled ingest jobs can replay files before downloading anything missing. Terraform creates and wires the infrastructure; GitHub Actions deploys app code only after tests pass.
# infra/terraform/main.tf
module "neon" {
  source = "./modules/neon"
}
module "backend_fly" {
  source       = "./modules/backend_fly"
  database_url = module.neon.database_url
}
module "frontend_pages" {
  source  = "./modules/frontend_pages"
  api_url = module.backend_fly.backend_url
}
PART 9
Where the work goes next, now that space and time are both covered.
Next
Space and time are already combined: the kNN average gives the normal at a point, and the damped step adds recency. The bottom-right corner is filled by composition, so the next work is operational freshness, lightning, and denser cloud inputs.
Three directions are open.
Keep it current automatically: every model here is a cheap recomputation, not a trained artifact. The normals are averages, α is one autocorrelation, and the lightning rate is a count, so a scheduled job can rebuild affected rollups, refit α, and regenerate the backtest as new SMHI data lands.
Push lightning past climatology: it is the thinnest corner, still normals plus an indicative damped nudge.
Sharpen the cloud estimate with denser inputs: kNN×damped is a good fit for sparse station data; materially sharper cloud needs satellite cloud or a high-resolution analysis served directly, not more complexity on the same station set.