On the night of 30 June 2012, a one-second adjustment to the world’s clocks took down chunks of the internet. Reddit went dark. So did several other large sites. The cause was not load or malice. A leap second is an extra second inserted occasionally to keep clock time aligned with the Earth’s rotation, and a bug in how the Linux kernel handled that second sent affected machines into a spin. One second of time handled wrongly, and major services fell over at once.
That is the thing about time in software. It feels like an implementation detail right up until it is not, and then it produces the worst kind of bug: intermittent, off by exactly one unit, and quiet enough to run wrong for days before anyone notices. I want to walk through the two failures that taught me to treat time as a design decision, and the small set of rules that made an entire category of these bugs stop.
The job that fired an hour late for two days
I run scheduled jobs that send time-sensitive notifications. They worked for months. Then one weekend they started going out an hour late. Not failing, just late, by exactly sixty minutes, and because an hour late is not broken, nobody flagged it. It ran wrong for about two days before I spotted the pattern. The cause was daylight saving. The scheduler was configured in the server’s local time, the clocks shifted, and every job slid with them. The schedule had not changed. The meaning of the numbers in it had.
That is the trap with local time. It is not a fixed reference. In any zone that observes daylight saving it jumps twice a year, and anything anchored to it inherits the jump.
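To put a number on the jump, here is a minimal sketch that asks the runtime how far a zone sits from UTC at two different moments. The helper names `wallClock` and `offsetMinutes` are my own, Europe/London is only a stand-in for whatever the server happened to be set to, and the code assumes a runtime with full timezone data behind `Intl.DateTimeFormat`, such as a current Node.js.

```typescript
type WallClock = { year: number; month: number; day: number; hour: number; minute: number; second: number };

// Wall-clock fields of an instant as seen in an IANA timezone.
function wallClock(instant: Date, timeZone: string): WallClock {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone, hour12: false,
    year: "numeric", month: "2-digit", day: "2-digit",
    hour: "2-digit", minute: "2-digit", second: "2-digit",
  }).formatToParts(instant);
  const f: Record<string, number> = {};
  for (const p of parts) f[p.type] = Number(p.value);
  // Some engines print midnight as hour 24 when hour12 is false, hence % 24.
  return { year: f["year"], month: f["month"], day: f["day"],
           hour: f["hour"] % 24, minute: f["minute"], second: f["second"] };
}

// Minutes the zone is ahead of UTC at that instant (negative if behind).
function offsetMinutes(instant: Date, timeZone: string): number {
  const w = wallClock(instant, timeZone);
  const asIfUtc = Date.UTC(w.year, w.month - 1, w.day, w.hour, w.minute, w.second);
  return Math.round((asIfUtc - instant.getTime()) / 60000);
}

const winter = new Date("2024-01-15T12:00:00Z");
const summer = new Date("2024-07-15T12:00:00Z");
console.log(offsetMinutes(winter, "Europe/London"), offsetMinutes(summer, "Europe/London")); // 0 60
console.log(offsetMinutes(winter, "UTC"), offsetMinutes(summer, "UTC"));                     // 0 0
```

A job pinned to 09:00 in a zone like this fires at 09:00 UTC in January and 08:00 UTC in July. The number in the schedule is the same; the instant it names is not.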
The team that lived in three timezones
The second pressure came from people. The team spans the UK, Eastern Europe, and the United States. When someone in one zone set a deadline and someone in another read it, whose day were we talking about? A deadline of the 27th is not a fact until you say the 27th where. Without one answer, the same stored date meant a different real moment to each person, and arguments about whether something was late became arguments about whose clock was authoritative.
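The ambiguity is easy to reproduce. The sketch below takes one stored instant and asks which calendar day it falls on in three zones. The zone names are illustrative picks for the UK, Eastern Europe, and the United States, not the team's actual settings, and `calendarDate` is a hypothetical helper built on the standard `Intl.DateTimeFormat`.

```typescript
// Calendar date ("YYYY-MM-DD") that an instant falls on in a given IANA zone.
function calendarDate(instant: Date, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone, year: "numeric", month: "2-digit", day: "2-digit",
  }).formatToParts(instant);
  const f: Record<string, string> = {};
  for (const p of parts) f[p.type] = p.value;
  return `${f["year"]}-${f["month"]}-${f["day"]}`;
}

// One stored moment, three readers.
const instant = new Date("2024-05-27T22:30:00Z");
for (const zone of ["Europe/London", "Europe/Bucharest", "America/New_York"]) {
  console.log(zone, calendarDate(instant, zone));
}
// Europe/London 2024-05-27
// Europe/Bucharest 2024-05-28
// America/New_York 2024-05-27
```

Two people see the 27th and one already sees the 28th, from the same row in the database.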
The rules that made it stop
The fix was not clever code. It was a handful of decisions applied everywhere, without exception.
- Store every timestamp in UTC. The database never holds a local time. UTC does not observe daylight saving, so a stored moment means the same thing in January and July. That alone removes the whole clocks-changed class at the storage layer.
- Pick one timezone for everything human-facing and write it down. We standardised on a single zone for schedules and reports, stated in plain words in the project documentation. Which zone you pick matters less than the fact that there is exactly one, and nobody has to guess.
- Declare schedules against that fixed zone, explicitly, not the server’s local clock. A job then fires at the same wall-clock time in the agreed zone whatever the host is set to, and if the agreed zone is UTC it never shifts at all.
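As a rough sketch of what those three rules look like in code: normalise to UTC on the way into storage, name the agreed zone once, and make every schedule carry that zone explicitly. Everything named here (`toStoredUtc`, `AGREED_TZ`, `Schedule`, `isDueThisMinute`, `dailyDigest`) is hypothetical, and I have assumed UTC as the agreed zone purely for illustration.

```typescript
// Rule 1: whatever offset a timestamp arrives with, store it as a UTC ISO string.
function toStoredUtc(input: string): string {
  const instant = new Date(input);
  if (isNaN(instant.getTime())) throw new Error("unparseable timestamp: " + input);
  return instant.toISOString();
}

// Rule 2: one agreed zone, written down in exactly one place (assumed value).
const AGREED_TZ = "UTC";

// Rule 3: a schedule states its zone instead of inheriting the server's.
interface Schedule { hour: number; minute: number; timeZone: string }
const dailyDigest: Schedule = { hour: 9, minute: 0, timeZone: AGREED_TZ };

// True when `now`, read on the schedule's own wall clock, matches hour and minute.
function isDueThisMinute(schedule: Schedule, now: Date): boolean {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: schedule.timeZone, hour12: false, hour: "2-digit", minute: "2-digit",
  }).formatToParts(now);
  const f: Record<string, number> = {};
  for (const p of parts) f[p.type] = Number(p.value);
  // % 24 guards against engines that print midnight as hour 24.
  return f["hour"] % 24 === schedule.hour && f["minute"] === schedule.minute;
}

// The same instant written with two different offsets stores identically.
console.log(toStoredUtc("2024-07-01T09:00:00+01:00")); // 2024-07-01T08:00:00.000Z
console.log(toStoredUtc("2024-07-01T04:00:00-04:00")); // 2024-07-01T08:00:00.000Z
```

The point of `isDueThisMinute` taking `now` as an argument, and reading the zone from the schedule, is that nothing in it consults the host's configuration.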
The subtle one: when is something actually overdue?
Those rules fix storage and scheduling. There is a third trap in the logic that compares times, and people get it wrong even after sorting out the rest. Say a task is due on the 27th. The naive check asks whether the current time is past the deadline. But the 27th with no time attached usually gets stored as the very start of that day, so the instant the 27th begins, at midnight, the naive check declares the task overdue, even though the person has the whole day to do it.
// naive and wrong: marks the task overdue at 00:00 on the due date
const overdue = now > task.deadline;
What you usually mean by overdue is that the due day has fully elapsed. So compare against the start of today, computed in your one chosen zone. This assumes the deadline itself is stored as the start of the due day in that same zone.
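// zonedStartOfDay: a helper returning the UTC instant at which the current day began in AGREED_TZ
const startOfTodayUtc = zonedStartOfDay(now, AGREED_TZ);
// overdue only once the whole due day is behind us
const overdue = task.deadline < startOfTodayUtc;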
Now a task due on the 27th becomes overdue only once the 27th is genuinely behind us. It is a small change with an outsized effect on trust, because nothing erodes faith in a reminder faster than it crying wolf a day early.
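The snippet above leans on `zonedStartOfDay` without defining it. Here is one way it could be written with nothing but the built-in `Intl.DateTimeFormat`. Treat it as a sketch, not the definitive version: the value of `AGREED_TZ` is an assumption, `wallClock` and `offsetMs` are helper names I made up, and it does not handle the rare zones where a daylight saving change skips midnight itself. A date library with timezone support would do the same job.

```typescript
const AGREED_TZ = "Europe/London"; // assumed value, for illustration only

type WallClock = { year: number; month: number; day: number; hour: number; minute: number; second: number };

// Wall-clock fields of an instant as seen in an IANA timezone.
function wallClock(instant: Date, timeZone: string): WallClock {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone, hour12: false,
    year: "numeric", month: "2-digit", day: "2-digit",
    hour: "2-digit", minute: "2-digit", second: "2-digit",
  }).formatToParts(instant);
  const f: Record<string, number> = {};
  for (const p of parts) f[p.type] = Number(p.value);
  // Some engines print midnight as hour 24 when hour12 is false, hence % 24.
  return { year: f["year"], month: f["month"], day: f["day"],
           hour: f["hour"] % 24, minute: f["minute"], second: f["second"] };
}

// How far the zone is ahead of UTC at that instant, in milliseconds.
function offsetMs(instant: Date, timeZone: string): number {
  const w = wallClock(instant, timeZone);
  const asIfUtc = Date.UTC(w.year, w.month - 1, w.day, w.hour, w.minute, w.second);
  return asIfUtc - Math.floor(instant.getTime() / 1000) * 1000;
}

// UTC instant at which the current calendar day began in `timeZone`.
function zonedStartOfDay(now: Date, timeZone: string): Date {
  const today = wallClock(now, timeZone);
  const midnightAsIfUtc = Date.UTC(today.year, today.month - 1, today.day);
  // First guess uses the offset at the naive instant; the second pass
  // corrects it if the offset changed between the guess and real midnight.
  let candidate = midnightAsIfUtc - offsetMs(new Date(midnightAsIfUtc), timeZone);
  candidate = midnightAsIfUtc - offsetMs(new Date(candidate), timeZone);
  return new Date(candidate);
}

// Overdue only once the whole due day has elapsed in the agreed zone.
function isOverdue(deadline: Date, now: Date): boolean {
  return deadline.getTime() < zonedStartOfDay(now, AGREED_TZ).getTime();
}
```

With London on summer time, a task due on 27 June is stored as `2024-06-26T23:00:00Z`, and `isOverdue` stays false until `now` reaches `2024-06-27T23:00:00Z`, which is midnight on the 28th in the agreed zone.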
Test time by freezing the clock
There is one practical habit that makes all of this safe to change later. Time-dependent logic is almost impossible to test if it depends on the real clock, because the answer changes every time you run it. So in tests, freeze the clock. Inject the current time as a value rather than calling the system clock inside your logic, and you can write a test that says: given it is 23:59 on the 26th in the agreed zone, a task due on the 27th is not yet overdue, and one minute later, at the start of the 27th, it is still not overdue, and only after the 27th has fully elapsed does it become overdue. The same technique, with the clock frozen on either side of a clock change, would have caught the daylight saving drift before it ever shipped, because tests like these pin down exactly what the logic means at the boundaries where it tends to break.
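Here is roughly what that looks like. To keep the sketch short I have assumed the agreed zone is UTC, so the start of the day can be computed without any timezone lookup. `Clock`, `systemClock`, `fixedClock`, and `startOfDayUtc` are names invented for this example, and the plain `throw` stands in for whatever assertion your test runner provides.

```typescript
// The clock is a value the logic receives, never something it reaches out for.
type Clock = () => Date;
const systemClock: Clock = () => new Date();
const fixedClock = (iso: string): Clock => () => new Date(iso);

// Start of the current day, assuming the agreed zone is UTC.
function startOfDayUtc(now: Date): Date {
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate()));
}

// Production code uses the default clock; tests pass a frozen one.
function isOverdue(deadline: Date, clock: Clock = systemClock): boolean {
  return deadline.getTime() < startOfDayUtc(clock()).getTime();
}

// A task due on the 27th, stored as the start of that day.
const dueOn27th = new Date("2024-06-27T00:00:00Z");

// Each row freezes the clock at a boundary and states the expected answer.
const boundaryCases: Array<[string, boolean]> = [
  ["2024-06-26T23:59:00Z", false], // the night before
  ["2024-06-27T00:00:00Z", false], // the due day has only just begun
  ["2024-06-27T23:59:59Z", false], // last second of the due day
  ["2024-06-28T00:00:00Z", true],  // the due day has fully elapsed
];

for (const [frozenAt, expected] of boundaryCases) {
  const actual = isOverdue(dueOn27th, fixedClock(frozenAt));
  if (actual !== expected) {
    throw new Error("at " + frozenAt + " expected " + expected + " but got " + actual);
  }
}
```

Because the clock is a parameter, the four cases give the same answer on every run, on any machine, in any season.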
It is also worth storing the intended timezone alongside the UTC timestamp. UTC is the right thing to compute with, but a human reading a report often wants to know which local moment was meant. Keep both, compute in UTC, and display in the one agreed zone. The small extra column is far cheaper than the conversation that starts with someone asking why the report says one time and their calendar says another.
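A possible shape for that record, as a sketch only. The field names `utc` and `intendedZone`, the helper `formatIn`, and the choice of Europe/London as the agreed zone are all assumptions made for the example.

```typescript
// The stored pair: the instant to compute with, and the zone the author meant.
interface StoredMoment {
  utc: string;          // ISO 8601 in UTC, used for every comparison
  intendedZone: string; // IANA zone name, kept for display and for questions later
}

// Render a stored UTC instant as "YYYY-MM-DD HH:mm" on a given zone's wall clock.
function formatIn(utcIso: string, timeZone: string): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone, hour12: false,
    year: "numeric", month: "2-digit", day: "2-digit",
    hour: "2-digit", minute: "2-digit",
  }).formatToParts(new Date(utcIso));
  const f: Record<string, string> = {};
  for (const p of parts) f[p.type] = p.value;
  // Some engines print midnight as "24" when hour12 is false.
  const hour = f["hour"] === "24" ? "00" : f["hour"];
  return `${f["year"]}-${f["month"]}-${f["day"]} ${hour}:${f["minute"]}`;
}

const AGREED_TZ = "Europe/London"; // assumed value

const meeting: StoredMoment = {
  utc: "2024-07-01T13:00:00.000Z",
  intendedZone: "America/New_York",
};

console.log(formatIn(meeting.utc, AGREED_TZ));            // 2024-07-01 14:00
console.log(formatIn(meeting.utc, meeting.intendedZone)); // 2024-07-01 09:00
```

The report shows 14:00 in the agreed zone, and when someone says their calendar shows 09:00, the stored zone answers the question without anyone reconstructing it from memory.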
The numbers were never the problem. The problem is that the same number means different moments to different observers, and local time shifts under you twice a year while the clock itself occasionally gains a single inserted second. Pick one meaning, store it in UTC, and enforce it everywhere. None of this is difficult. What it demands is consistency. And consistency is exactly what turns time from a source of mysterious bugs into something you stop having to think about.





