The Session ID That Wouldn't Stop Changing

Senior Backend Engineer with 18+ years of experience building scalable web applications, primarily in PHP and Node.js. I operate as a T-shaped developer, combining clean backend architecture with a practical understanding of cloud infrastructure (AWS) and edge computing. I focus on system reliability and performance. A recent highlight: I reduced CI pipeline time from 30 to under 5 minutes by identifying I/O bottlenecks (moving operations to tmpfs) and optimizing test execution. I use AI-assisted tools (LLMs, coding assistants) to speed up prototyping, debugging, and everyday development tasks. I work closely with DevOps teams and understand production environments, which helps me deliver reliable and maintainable systems.

I was implementing a feature where the session container would track a lastActivity timestamp, updated on every authenticated request. Standard stuff. I wrote it, tested it locally with curl, and noticed something odd: I kept getting a new Set-Cookie header value on every response. Not occasionally. On every single one. A week later I was sending a pull request to mezzio/mezzio-session-cache.

The Setup: Two Backends, One Session

Our system had a constraint: two backend applications, written in different languages, sharing a single user session. One was the main PHP/Mezzio app. The other was a service in a different stack that needed to read from, and update the lastActivity timestamp on, the same session container.

There are a few ways to make polyglot session sharing work. We landed on a shared cache backend (Redis) with a well-defined session structure. Both apps could read and write through their own libraries, as long as they agreed on the storage format and the cookie name. The session ID was the contract.

That contract is the part that quietly broke.

A Missing Escape Hatch

My first instinct was the usual list of suspects. Was something calling regenerateId() in a middleware I didn't know about? Was there a logout being triggered somehow? Was a misconfigured cache layer evicting and recreating sessions?

After a bit of digging through the call stack, I ended up in the library itself. And there it was: CacheSessionPersistence was regenerating the session ID whenever the session data changed. Not on login. Not on privilege escalation. On every write.

That's when the real question hit me: why on earth would a library do that by default?

Reading Code Before Changing It

When you find behavior that surprises you in someone else's code, the wrong move is to immediately label it broken. The right move is to assume the maintainers had a reason, and find out what it was.

The reason, in this case, is session fixation.

Session fixation is a class of attack where an attacker tricks a victim into using a session ID the attacker already knows. If the session ID never changes after authentication or privilege changes, the attacker can hijack that session and act as the victim. The standard defense is to regenerate the session ID at any point where the trust level of the session changes: at minimum on login, ideally at any meaningful state transition.

Mezzio's library went further: it regenerated on every data change. Defense-in-depth taken to its logical extreme. The cost (a few extra cookie writes) is invisible to most apps. The benefit (making session fixation nearly impossible to exploit through normal usage) is real. For a single-application setup, you'd never notice the trade-off. For anything distributed, it changes the session ID from a stable contract into an ephemeral token that rotates with every write.

For our architecture, that was catastrophic. Both backends depended on the session ID remaining stable. The moment our PHP app updated a lastActivity timestamp, the library regenerated the ID on the spot. The other backend, still holding the previous ID, was now pointing at a session that no longer existed. A classic stale reference problem, except the staleness wasn't caused by TTL or eviction. It was caused by the library deliberately rotating identifiers under us. The contract broke.
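The failure mode is easy to reproduce in miniature. What follows is a minimal sketch, not the library's code: InMemorySessionStore is a hypothetical stand-in for the shared Redis cache, and its write() method imitates the regenerate-on-every-write behavior described above. It assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical in-memory stand-in for the shared cache: session ID => data.
final class InMemorySessionStore
{
    /** @var array<string, array<string, mixed>> */
    private array $items = [];

    public function __construct(private bool $autoRegenerate = true)
    {
    }

    /** Creates a new session and returns its ID. */
    public function create(array $data): string
    {
        $id = bin2hex(random_bytes(16));
        $this->items[$id] = $data;

        return $id;
    }

    /** Returns the stored data, or null if the ID is unknown. */
    public function read(string $id): ?array
    {
        return $this->items[$id] ?? null;
    }

    /**
     * Persists changed data and returns the ID to use afterwards.
     * With auto-regeneration on, the old entry is removed and a fresh ID issued.
     */
    public function write(string $id, array $data): string
    {
        if ($this->autoRegenerate) {
            unset($this->items[$id]);
            $id = bin2hex(random_bytes(16));
        }
        $this->items[$id] = $data;

        return $id;
    }
}

// Both backends start out holding the same session ID.
$store    = new InMemorySessionStore(autoRegenerate: true);
$sharedId = $store->create(['user' => 42]);

// The PHP app updates lastActivity; the write rotates the ID.
$phpAppId = $store->write($sharedId, ['user' => 42, 'lastActivity' => time()]);

// The other backend still holds $sharedId and now finds nothing under it.
$staleRead = $store->read($sharedId);
$freshRead = $store->read($phpAppId);
```

Constructing the store with autoRegenerate: false makes write() return the same ID it was given, which is the stable-contract behavior both backends were relying on.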

I wasn't alone in hitting this. Months before my PR, an issue described the same underlying problem from a different angle: users losing their sessions when a response failed to reach the client, because the old session ID had already been destroyed the moment the session was persisted. Two production scenarios, one missing escape hatch.

The Decision: Where Does the Fix Belong?

Three paths on the table:

  • Patch around it in our app. Avoid writing to the session in certain code paths, or build a parallel storage mechanism that bypasses the library. Every developer on the team would need to remember the rule. Invisible discipline. The worst kind of debt.
  • Fork the library. Maintain our own version with regeneration disabled. Now I own a fork of a security-sensitive library. Every upstream release becomes a merge conflict review. Every security advisory becomes a question of "does that affect our fork too?"
  • Contribute upstream. Add an opt-in to disable auto-regeneration, keep the secure default, let maintainers decide if it's worth merging. Rejection risk is real. Review iterations are real. But the change either lands and stops being my problem, or it doesn't and I'm back to one of the other two paths anyway.

It wasn't a hard call. Patching around it costs the team forever. Forking costs me forever. Contributing upstream costs me a week, with non-zero rejection risk. The expected cost of that week was lower than the certain cost of the other two options, by a wide margin.

What tipped it was the security angle. A fork of a session library is the kind of thing that would haunt me in three years when someone reports a CVE and I have to figure out whether our fork is affected, whether the upstream patch applies cleanly, and whether I still remember why we forked in the first place. Patching around the library has the same problem in a different shape: someone new joins the team, doesn't know we patched around the library, writes to the session in what looks like a perfectly reasonable place, and session sync breaks in production. It's happened to me before, on the receiving end. I wasn't going to set that up for someone else.

Only upstream made sense long-term.

The Cost: A week of focused work, with review iterations spread across that same week. Real rejection risk.

The Win: A library-level escape hatch maintained by the project, not by us. Zero ongoing maintenance cost.

Writing the Change

The shape of the contribution was deliberately small:

  • A new auto_regenerate config option, defaulting to true (no behavior change for existing users).
  • A constructor argument on CacheSessionPersistence that threaded the option through.
  • The factory updated to read the option from config.
  • Tests covering both the default behavior and the new opt-out path.
  • Psalm baseline updated to reflect the new types.
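The first three items in that list can be sketched roughly as follows. This is an illustration of the shape only, not the merged code: SketchSessionPersistence and SketchSessionPersistenceFactory are hypothetical, trimmed-down stand-ins, the factory here takes a plain config array rather than a container, and the 'PHPSESSION' fallback is an assumption made for the sketch.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for the persistence class; the real one also needs
// a cache pool and further cookie settings.
final class SketchSessionPersistence
{
    public function __construct(
        private string $cookieName,
        private bool $autoRegenerate = true, // secure default stays on
    ) {
    }

    public function cookieName(): string
    {
        return $this->cookieName;
    }

    public function regeneratesOnChange(): bool
    {
        return $this->autoRegenerate;
    }
}

// Hypothetical factory: reads the option from config and threads it through.
final class SketchSessionPersistenceFactory
{
    /** @param array<string, mixed> $appConfig merged application config */
    public function __invoke(array $appConfig): SketchSessionPersistence
    {
        $config = $appConfig['mezzio-session-cache'] ?? [];

        return new SketchSessionPersistence(
            (string) ($config['cookie_name'] ?? 'PHPSESSION'), // assumed fallback
            (bool) ($config['auto_regenerate'] ?? true),       // absent key keeps old behavior
        );
    }
}

$factory = new SketchSessionPersistenceFactory();

// No option set: behaves exactly as before the change.
$default = $factory([]);

// Explicit opt-out.
$optOut = $factory([
    'mezzio-session-cache' => ['cookie_name' => 'sid', 'auto_regenerate' => false],
]);
```

The point of defaulting to true in both the constructor and the factory is that an existing installation that never mentions the option sees no behavior change at all.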

End result, from a user's perspective, looks like this:

<?php
// config/autoload/session.global.php

declare(strict_types=1);

return [
    'mezzio-session-cache' => [
        'cookie_name'     => 'sid',
        'auto_regenerate' => false, // opt out of regeneration on every write
    ],
];

Config-level rather than a runtime toggle, because flipping session ID regeneration mid-request is the kind of thing that gets you into trouble you didn't ask for.

That last point about Psalm matters. When you write code in someone else's project, you're not just meeting a functional bar, you're meeting their quality bar. Mezzio uses Psalm at a strict level. The PR wasn't done when the feature worked; it was done when every static analysis check passed at the level the maintainers expected.

The Review Process Was the Real Test

The PR went through a proper review with two maintainers from the project. The feedback was sharp and useful:

  • Tests were requested specifically for the false case. Fair. I'd tested the default but not the new branch as thoroughly as I should have.
  • A type-casting suggestion on the factory: (bool) $autoRegenerate to remove a Psalm baseline entry and prevent a class of common errors.
  • One small inline tweak suggested via GitHub's review UI.
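To see why a cast at the factory boundary helps, here is a small sketch under stated assumptions: configurePersistence() is a hypothetical function standing in for a constructor with a bool parameter, and the config array imitates a loosely typed value (an int where a bool was intended).

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a constructor with a strictly typed flag.
function configurePersistence(bool $autoRegenerate): string
{
    return $autoRegenerate ? 'regenerate on change' : 'keep ID stable';
}

// A loosely typed config value: an int where a bool was intended.
$config = ['auto_regenerate' => 0];

try {
    // Under strict_types, an int is not accepted for a bool parameter.
    $uncast = configurePersistence($config['auto_regenerate']);
} catch (TypeError $e) {
    $uncast = 'TypeError';
}

// Casting at the boundary turns the value into a real bool first.
$cast = configurePersistence((bool) $config['auto_regenerate']);
```

The cast follows PHP's usual truthiness rules, so it is not a parser: the string 'false' is non-empty and would still cast to true. It guards against type mismatches, not against semantically wrong config values.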

The standard was what stood out. Every comment was the kind of thing a careful engineer would say in your own team's review. No gatekeeping, no theater. Just "this needs to be a little more robust, here's why."

I addressed every point, pushed updated commits, and the PR was merged into the 1.13.x branch. The feature shipped in version 1.13.0.

Looking Back

The bug wasn't in our code. The fix didn't belong in our code either. The work was upstream, so I went upstream.

Two things stuck with me from this PR. First: when a library has a default that surprises you, that default is part of the API. It's not a bug and it's not laziness; it's a deliberate choice the maintainers made for users who haven't thought about the edge case. Working with that choice rather than against it is the difference between contributing and forking.

Second: turning off auto_regenerate doesn't remove the session fixation defense. It moves it, from the library into the application. That's a fair trade when you've already got the rest of the controls in place (regeneration on login, on privilege escalation, on logout). It's a security regression when you don't.
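As a rough illustration of what "moving the defense into the application" means, the sketch below rotates the ID only at trust transitions and leaves ordinary writes alone. AppSession and the login(), logout(), and recordActivity() functions are hypothetical names invented for this example; they are not the Mezzio session API. It assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

// Hypothetical minimal session object, for illustration only.
final class AppSession
{
    private string $id;

    /** @var array<string, mixed> */
    private array $data = [];

    public function __construct()
    {
        $this->id = bin2hex(random_bytes(16));
    }

    public function id(): string
    {
        return $this->id;
    }

    public function get(string $key): mixed
    {
        return $this->data[$key] ?? null;
    }

    public function set(string $key, mixed $value): void
    {
        $this->data[$key] = $value; // ordinary writes keep the ID stable
    }

    public function clear(): void
    {
        $this->data = [];
    }

    public function regenerate(): void
    {
        $this->id = bin2hex(random_bytes(16));
    }
}

function login(AppSession $session, int $userId): void
{
    $session->regenerate(); // trust level rises: rotate the ID
    $session->set('userId', $userId);
}

function logout(AppSession $session): void
{
    $session->clear();
    $session->regenerate(); // trust level drops: rotate again
}

function recordActivity(AppSession $session, int $now): void
{
    $session->set('lastActivity', $now); // no rotation for routine updates
}

$session     = new AppSession();
$anonymousId = $session->id();

login($session, 42);
$loggedInId = $session->id();

recordActivity($session, time());
$afterActivityId = $session->id();

logout($session);
$loggedOutId = $session->id();
```

An ID planted before login stops being valid the moment login() runs, while the routine lastActivity update leaves the ID that both backends share untouched.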

The rest is the general pattern for working with other people's code: read more than you write, take time to understand the maintainers' position, don't fork what you can fix upstream. This PR just made it concrete.


The PR discussed here is mezzio/mezzio-session-cache#51, merged into the 1.13.x branch in May 2024. The auto_regenerate option shipped in version 1.13.0 and remains in the library today. A follow-up contribution to the same project, #52, implementing InitializePersistenceIdInterface, shipped in 1.14.0 a month later. The library has continued on its 1.x line; the latest release at the time of writing is 1.17.0.

Michał Iżewski | Senior Backend Engineer
