The person best placed to explain why a change was made is the person making it. At that moment the reasoning is complete in their head: they know what they tried, what they rejected, and why this version won. They are also the one person for whom none of it needs writing down, because to them it isn't a question. It's just what they decided. So they don't write it. The gap only opens later, for future-them or the next person, who inherits the change without the reasoning and has to reconstruct it from the code alone.
That is the shape of the problem, and it is the reason "just write down your decisions" quietly fails. The record is owed by exactly the person least motivated to produce it, at the one moment it feels redundant to them. Call the cost of extracting it anyway the authoring tax. It looks small on any single change. In aggregate it falls on the people with the least room to pay it.
We felt this early, building Spelunk, because our first instinct was the obvious one: give people a place to record decisions, and ask them to record decisions. It demos well. Then you watch who is actually meant to keep it fed.
The tax lands on someone already buried
Think about where a decision gets made. It is usually in the middle of the work, under time pressure, with a review open and three other things half-done. That is exactly the moment a "please also write down why" step is most expensive and least likely to happen, and it is the moment the reasoning is clearest to the author and least in need of stating, at least to them.
Now look at the reviewer, because this is where it compounds. Reviewers are already buried. A modern review queue is full of large, machine-generated diffs that a human is expected to read, judge, and sign off, and the volume has gone up while the day has not. Into that, hand-authored memory adds a second thing to review: a parallel stream of rationale that also has to be read, checked against the diff, and approved. You have taken a person who is already struggling to keep up with the code and asked them to keep up with a running commentary on the code as well.
Something has to give, and it is always the same thing. The scarce judgement goes to the code, because the code is what ships. The written rationale is what gets skimmed, waved through, or quietly skipped. And the decisions that most deserve a careful record are usually the ones made under the most pressure, which makes them exactly the ones most likely to slip through with an empty or wrong note attached.
A record that only gets written when someone happens to have spare time will have its worst holes exactly where the work was hardest, because that is precisely when no one had spare time.
Let the work write the record
The alternative we landed on is to stop asking. The reasoning around a change is already being produced: it's in the diff, in the review discussion, in the decision that gets made when one approach wins over another. That material exists whether or not anyone files a separate note about it. So capture it from the work itself, at the point the work happens, rather than levying a tax for a second, hand-authored copy.
When memory is captured this way, a few things fall out that we didn't fully appreciate until we were living with them.
The diff is the proof. A decision that is tied to the change that enacted it doesn't need to be taken on faith. You can see the code the reasoning is about, next to the reasoning, anchored to the same point in history. There is no gap between the claim and the evidence for it, because the claim was captured from the evidence.
Nothing depends on remembering to write it down. The failure mode of hand-authored memory is silence: the most contested change is the one nobody had a spare minute to annotate. When the record comes from the work, the contested change carries its record for the same reason it carried its diff. The busiest moments are covered, not skipped.
Drift is visible, because the record is tied to the code. A hand-written note about why a function is the way it is starts rotting the moment the function changes and the note doesn't. The note stays confidently wrong. When the record is anchored to the code rather than sitting beside it, a change that moves the ground under an old decision is something the system can see, rather than something a human has to notice and manually reconcile.
Being wrong later, without losing the trail
None of this means the first decision is the last word. Reasoning gets revised. The retry that backed off for 800ms turns out to need a different number; the guard clause that was load-bearing stops being necessary; the library you hand-rolled around ships the feature you needed and you go back to it.
The thing you want when that happens is not a clean slate. It is the trail. You want the new decision to supersede the old one while keeping the old one legible, so the next person, or the next agent, can see not just what the current answer is but that there was a previous answer and why it changed. A record that quietly overwrites itself is only marginally better than no record, because it can't tell you that the ground has moved.
So supersede is a first-class move in Spelunk, and it is recoverable. A decision that replaced an earlier one still points back to what it replaced. The history of a decision is a history, not a single mutable cell. This matters most in exactly the situation the authoring tax fails: the high-pressure, frequently-revised, genuinely-contested changes, where the story of how the thinking evolved is worth more than any single snapshot of it.
What we're actually claiming
We are not claiming that writing things down is bad. Deliberate documentation has its place, and sometimes the right move is to sit and author a careful account of a hard decision. The claim is narrower and, we think, more useful: a memory system that can only be fed by hand will be starved exactly where it is needed most, because the moments that most deserve a record are the moments with the least room to make one.
So we built Spelunk to take the record from the work rather than to bill you for it. The diff is the proof, the record is tied to the code so drift shows up on its own, and when a decision changes the old one stays on the trail rather than being erased. That is less a feature list than a stance: the record should be there for the changes that were hardest to make, which are the changes nobody had the time to write up.
git tracks what changed. Spelunk remembers why.
Spelunk is open source and code-aware, built to be called from whatever agent you already use. Repo and docs: spelunk.cloud.