The Data

Thunder Bay Transit publishes GTFS schedule and realtime feeds — the same standard behind Google Maps and NextLift. Those apps show where buses are now and discard the data. We store every position, delay, and cancellation as a raw event.

A background job periodically rolls the raw events into the aggregates shown on the Metrics tab. All metrics are measured at timepoint stops (the ones flagged timepoint=1 in the GTFS feed's stop_times.txt).

Why Metrics

“Without an explicit SLO, users often develop their own beliefs about desired performance, which may be unrelated to the beliefs held by the people designing and operating the service.”

The quote comes from Google's SRE Handbook. Google helped define the GTFS transit standard and is known for running complex systems with legendary reliability. The handbook has strongly influenced how I think about building and operating software systems.

The handbook calls its metrics service level indicators (SLIs) — the same idea as a Key Performance Indicator (KPI), but focused on what the user experiences rather than what the operator reports. An indicator becomes a service level objective (SLO) when stakeholders commit to a target: not just “we track on-time performance” but “we agree 75% is the floor.”

Baseball stats can’t tell you who wins tonight. Transit metrics are the same — they can’t say whether every rider got where they needed to be. What they can do is show whether the system is trending in the right direction over time.

Six-Hour Chunks

Architecture

Each service day is sliced into three six-hour windows. Every metric is computed per window, per route. Chunks store raw counts and sums — not percentages. A weekly or system-wide number sums the counts across chunks and divides once at the end, so a busy route with 200 trips outweighs a quiet one with 10 instead of both counting the same.
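The difference between summing counts and averaging percentages can be sketched as follows. This is a minimal illustration, not the site's actual code: the `Chunk` class and its field names are hypothetical, and the trip counts are made up to mirror the 200-trip and 10-trip example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """One sealed window for one route (hypothetical field names)."""
    route: str
    on_time: int   # trips that arrived inside the on-time window
    total: int     # trips observed in this chunk


def otp_percent(chunks):
    """Sum the raw counts across chunks, then divide once at the end."""
    on_time = sum(c.on_time for c in chunks)
    total = sum(c.total for c in chunks)
    return None if total == 0 else 100.0 * on_time / total


chunks = [
    Chunk(route="busy", on_time=180, total=200),   # 90% on its own
    Chunk(route="quiet", on_time=5, total=10),     # 50% on its own
]

weighted = otp_percent(chunks)   # (180 + 5) / (200 + 10), about 88.1%
naive = (90.0 + 50.0) / 2        # 70.0: both routes count the same
```

Averaging the two percentages would report 70%, even though 185 of 210 trips were on time.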

Once a window closes the chunk is sealed and never changes. Midnight to 6 AM has no chunk — a handful of late-night trips run in that window, but not enough to produce meaningful metrics, so we leave them out.
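One way to assign an event to its window is sketched below. The boundaries are an assumption inferred from the text: with midnight to 6 AM excluded, three six-hour windows would start at 06:00, 12:00, and 18:00 local time. The function name is hypothetical.

```python
from datetime import datetime


def chunk_index(ts):
    """Return 0, 1 or 2 for the window containing ts, or None before 6 AM.

    Assumed windows: 06:00-12:00, 12:00-18:00, 18:00-24:00 (local time).
    """
    if ts.hour < 6:
        return None          # midnight to 6 AM has no chunk
    return (ts.hour - 6) // 6


morning = chunk_index(datetime(2024, 1, 15, 7, 30))    # 0
evening = chunk_index(datetime(2024, 1, 15, 19, 5))    # 2
overnight = chunk_index(datetime(2024, 1, 15, 2, 0))   # None
```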

On-Time Performance (OTP)

Schedule

A trip is "on time" if it arrives no more than 1 minute early and no more than 5 minutes late relative to the schedule. This window is commonly used by North American agencies.

Early departures are penalized because if a bus leaves a stop before the scheduled time, you miss it — that's worse than a bus running late.
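The asymmetric window can be expressed as a small classifier. This is a sketch under assumptions: the names are hypothetical, deviations are taken in seconds, and the boundaries (exactly 1 minute early, exactly 5 minutes late) are treated as on time, which the text does not specify.

```python
EARLY_LIMIT_S = -60    # 1 minute early
LATE_LIMIT_S = 300     # 5 minutes late


def classify(deviation_s):
    """Classify a timepoint observation.

    deviation_s is actual minus scheduled time in seconds,
    so a negative value means the bus was early.
    """
    if deviation_s < EARLY_LIMIT_S:
        return "early"
    if deviation_s > LATE_LIMIT_S:
        return "late"
    return "on_time"
```

A bus 2 minutes early (`classify(-120)`) counts against OTP, while a bus 4 minutes late (`classify(240)`) does not.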

Typical range for a mid-size Canadian city: 65–85%. Above 90% is world-class. Below 60% indicates a systemic problem.

Headway Coefficient of Variation (Cv)

Regularity

This metric measures how evenly spaced buses are along a route. It is the coefficient of variation (Cv) of the headways: the standard deviation of the gaps between buses divided by the average gap. If buses arrive at perfectly even intervals, this number is zero, regardless of whether those intervals match the published schedule. In practice, buses bunch up or leave big gaps, and the higher the number, the more unpredictable your wait becomes.

Think of it like darts. The bullseye is the average gap between buses. Each actual arrival is a throw. A low Cv means the darts cluster around the bullseye: gaps between buses stay consistent. A high Cv means the darts are scattered across the board: gaps are all over the place.

Below 0.3 is good — riders perceive the service as regular. A Cv near zero isn’t the goal either — that would be like standing too close to the dartboard, where hitting the bullseye says nothing about your aim. Some variance is inevitable and healthy. Above 0.5, gaps feel random and riders lose trust in the system.

Cv only judges regularity, not whether the bus is on time. A route that runs every 20 minutes when the schedule promised every 10 still has Cv = 0 if those gaps are perfectly even. To see whether the service matches its promise, look at EWT below.
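As a rough sketch, Cv can be computed from a list of observed arrival times at one stop. The function name is hypothetical, and using the population standard deviation (`pstdev`) rather than the sample version is an assumption; the text does not say which one is used.

```python
from statistics import mean, pstdev


def headway_cv(arrival_minutes):
    """Coefficient of variation of the gaps between consecutive arrivals."""
    gaps = [b - a for a, b in zip(arrival_minutes, arrival_minutes[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return None                      # not enough data to judge regularity
    return pstdev(gaps) / mean(gaps)


even = headway_cv([0, 20, 40, 60])         # gaps 20, 20, 20 -> 0.0
bunched = headway_cv([0, 2, 30, 32, 60])   # gaps 2, 28, 2, 28 -> about 0.87
```

The evenly spaced example scores 0 even if the schedule promised a bus every 10 minutes, which is why Cv has to be read alongside a schedule-based metric.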

Excess Wait Time (EWT)

Delay

Excess Wait Time is arguably the most important operational metric. Transit managers publish a schedule, and that schedule is a promise. EWT measures how many extra minutes you actually wait beyond that promise because buses aren’t arriving at regular intervals. It’s the gap between what was committed to and what was delivered.

If a route runs every 15 minutes and riders show up at random times, you'd expect to wait 7.5 minutes on average (half the headway). But if two buses arrive together and then nothing comes for 30 minutes, the average headway is still 15 minutes, yet the average wait doubles to 15 minutes. EWT captures exactly this gap.

Why does EWT matter more than average delay? Because it counts people, not just buses. A long gap doesn't just mean one late bus — it means every person who showed up during that gap is standing at the stop, waiting. The longer the gap, the more people accumulate, and the longer each of them waits. EWT weights gaps by the number of riders they actually affect, making it a social metric: it measures total human time wasted, not just vehicle timing.

The two timelines below have the same average headway (15 min), but the rider experience is very different:

Bigger gaps hurt more because more riders show up during them. EWT captures that — it weights each gap by how many people it affects, then subtracts the wait you’d expect if buses ran on time.
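A common way to compute this, sketched here under the assumption that riders arrive at uniformly random times, is to weight each headway by itself: a gap of length h catches riders in proportion to h, and each of them waits h / 2 on average. The function names are hypothetical, and the text does not confirm that this exact formula is the one used on the Metrics tab.

```python
def average_wait(headways):
    """Expected wait (minutes) for a rider arriving at a random time.

    Each gap h is hit with probability proportional to h and costs h / 2
    on average, which gives sum(h^2) / (2 * sum(h)).
    """
    total = sum(headways)
    if total == 0:
        raise ValueError("headways must span a positive amount of time")
    return sum(h * h for h in headways) / (2 * total)


def excess_wait(actual_headways, scheduled_headways):
    """EWT = actual average wait minus scheduled average wait."""
    return average_wait(actual_headways) - average_wait(scheduled_headways)


scheduled = [15, 15]   # a bus every 15 minutes
bunched = [0, 30]      # two buses together, then a 30-minute gap

promised = average_wait(scheduled)       # 7.5
delivered = average_wait(bunched)        # 15.0
ewt = excess_wait(bunched, scheduled)    # 7.5
```

Both patterns have the same 15-minute average headway, but the bunched one costs the average rider 7.5 extra minutes.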

Route Finder (RAPTOR)

The route finder uses RAPTOR (Round-bAsed Public Transit Optimized Router), an algorithm developed at Microsoft Research for computing optimal multi-leg transit journeys. RAPTOR works directly on the timetable, scanning routes round by round, where each round allows one more vehicle in the journey. Round 1 finds all destinations reachable by a single bus, round 2 finds journeys with one transfer, and so on. Because it scans the timetable in order and needs no priority queue, RAPTOR is typically faster than graph-based approaches such as Dijkstra or A* for transit routing.
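The round structure can be illustrated with a heavily simplified sketch. It is not the route finder's implementation: the toy timetable is made up, transfers take zero time, there are no walking links, and every route is scanned each round instead of only those serving recently improved stops.

```python
import math

# Hypothetical toy timetable. Each trip lists its time (in minutes) at each
# stop of the route; arrival and departure share one time for simplicity.
ROUTES = {
    "R1": {"stops": ["A", "B", "C"], "trips": [[0, 10, 20], [30, 40, 50]]},
    "R2": {"stops": ["C", "D"], "trips": [[25, 35], [55, 65]]},
}


def earliest_trip(trips, i, ready):
    """First trip (sorted by time) leaving stop index i at or after ready."""
    for trip in trips:
        if trip[i] >= ready:
            return trip
    return None


def raptor(routes, source, target, depart, max_rounds=4):
    """Return (arrival time, vehicles used) for the target, or None."""
    best = {source: depart}      # earliest known arrival per stop
    used = {source: 0}           # round in which that arrival was found
    marked = {source}            # stops improved in the previous round
    for k in range(1, max_rounds + 1):
        prev = dict(best)        # arrivals using at most k - 1 vehicles
        improved = set()
        for route in routes.values():
            trip = None
            for i, stop in enumerate(route["stops"]):
                # Riding the current trip: does it improve this stop?
                if trip is not None and trip[i] < best.get(stop, math.inf):
                    best[stop] = trip[i]
                    used[stop] = k
                    improved.add(stop)
                # Board, or switch to an earlier trip, at stops reached
                # in the previous round.
                if stop in marked and (trip is None or prev[stop] <= trip[i]):
                    trip = earliest_trip(route["trips"], i, prev[stop])
        if not improved:
            break
        marked = improved
    return (best[target], used[target]) if target in best else None


direct = raptor(ROUTES, "A", "C", depart=0)         # (20, 1)
one_transfer = raptor(ROUTES, "A", "D", depart=0)   # (35, 2)
```

Stop D is only reached in round 2, because getting there requires boarding a second vehicle at C.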

References

| Source | What It Covers | Editor Note |
| --- | --- | --- |
| Human Transit: Beyond On-Time Performance | Why frequency and network design matter more than schedule adherence | One of many excellent posts on the Human Transit blog. The reason we track wait time and headway regularity alongside OTP |
| Delling, Pajor, Werneck: Round-Based Public Transit Routing (PDF) | RAPTOR, the algorithm powering the route finder feature | Our initial approach was graph-based; we’re still working on understanding this paper properly |
| Google SRE Book: Introduction | Origins of the SRE discipline, error budgets, and treating operations as a software problem | The reference book tech companies use for structuring on-call rotations and reliability practices |

Data Collection

Thunder Bay Transit publishes three GTFS Realtime feeds — vehicle positions, trip updates, and service alerts. We poll each on a fixed interval and store every response as a raw event. The official feed URLs are listed on the City of Thunder Bay Transit Open Data page.

| Source | Method | Frequency |
| --- | --- | --- |
| Vehicle positions | GTFS-RT feed | Every 6 s |
| Trip delays | GTFS-RT feed | Every 60 s |
| Service alerts | GTFS-RT feed | Every 60 s |
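The polling cadence in the table above could be driven by something like the following sketch. The feed keys and function names are hypothetical, and the actual fetch and storage steps are left out because the feed URLs are not given here.

```python
# Intervals from the table, in seconds. The keys are illustrative names.
POLL_INTERVAL_S = {
    "vehicle_positions": 6,
    "trip_updates": 60,
    "service_alerts": 60,
}


def due_feeds(now_s, last_polled_s):
    """Return the feeds whose interval has elapsed since their last poll."""
    return sorted(
        name
        for name, interval in POLL_INTERVAL_S.items()
        if now_s - last_polled_s.get(name, float("-inf")) >= interval
    )


def polls_per_day(name):
    """How many raw responses one feed produces in 24 hours."""
    return 86_400 // POLL_INTERVAL_S[name]


positions_per_day = polls_per_day("vehicle_positions")   # 14400
```

At a 6-second interval, the vehicle positions feed alone produces 14,400 raw responses per day, which is why the aggregates are computed by a background job.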

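The feeds are encoded as Protocol Buffers (protobuf), a compact binary serialization format developed at Google. Each message is smaller and cheaper to parse than an equivalent text encoding such as JSON or XML. The gain per message is small, but multiplied across Google-scale traffic it can save a significant amount of bandwidth, compute, and electricity.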
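To give a feel for where the size saving comes from, the sketch below hand-encodes one integer the way protobuf does (a base-128 "varint") and compares it with a JSON rendering. It is an illustration only: a real feed would be decoded with generated protobuf classes, a real field also carries a small tag in front of the value, and the `delay` key here is just an example name.

```python
import json


def encode_varint(n):
    """Encode a non-negative int as a protobuf base-128 varint."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = n & 0x7F          # take the low 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)   # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


binary = encode_varint(300)                  # b'\xac\x02', 2 bytes
text = json.dumps({"delay": 300}).encode()   # b'{"delay": 300}', 14 bytes
```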