<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-US"><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://fidelitas.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://fidelitas.io/" rel="alternate" type="text/html" hreflang="en-US" /><updated>2026-05-18T23:40:23+02:00</updated><id>https://fidelitas.io/feed.xml</id><title type="html">fidelitas.io</title><subtitle>A technology blog covering distributed systems, performance engineering, developer tools, and the bleeding edge of software craft.</subtitle><author><name>fidelitas.io</name></author><entry><title type="html">Reciprocal knowledge elicitation: when the interview goes both ways</title><link href="https://fidelitas.io/2026/05/reciprocal-knowledge-elicitation/" rel="alternate" type="text/html" title="Reciprocal knowledge elicitation: when the interview goes both ways" /><published>2026-05-18T23:30:00+02:00</published><updated>2026-05-18T23:30:00+02:00</updated><id>https://fidelitas.io/2026/05/reciprocal-knowledge-elicitation</id><content type="html" xml:base="https://fidelitas.io/2026/05/reciprocal-knowledge-elicitation/"><![CDATA[<p>A few years ago I sat in on a knowledge transfer session that was, on
paper, going to be the answer to every team’s prayer.</p>

<p>The senior engineer we were going to lose to retirement (eight months’
notice, a clean handover, every manager’s dream) had agreed to spend
an hour a week with a junior. They had a Confluence page open. There
was a structured template. There was a coffee. The template had
sections for “decision rationale”, “known edge cases”, and
“institutional history”. Everyone was very pleased with how organised
it all was.</p>

<p>Eight months later he retired. The Confluence page was forty pages
long. About six of those forty pages were genuinely useful. The other
thirty-four were a mixture of obvious things, things we already had
runbooks for, and an unsettling amount of “you’ll know it when you
see it” hedging.</p>

<p>The useful six pages were the ones where the junior had pushed back.
“Wait, but why?” “What if the queue’s full?” “What did you do that
time in 2019 when prod crashed at 3am?” Those questions had pulled
out the knowledge that wasn’t in his head as words — it was in his
head as <em>patterns</em>, and it only came out when something tugged at
the pattern from the right angle.</p>

<p>That experience was my first real encounter with the gap between
“asking an expert what they know” and “actually getting what they
know out of them”. I didn’t have a name for it at the time. There
is now an emerging body of research on closing that gap. It calls
the approach <strong>reciprocal knowledge elicitation</strong>, and I think it’s
going to matter more than most people in the field currently realise.</p>

<h2 id="what-is-reciprocal-knowledge-elicitation">What is reciprocal knowledge elicitation?</h2>

<p>The short version: <strong>reciprocal knowledge elicitation is an approach
where the human and the AI both contribute to extracting, refining,
and validating knowledge during the same interaction</strong>, instead of
the AI passively collecting whatever the human happens to volunteer.</p>

<p>In a traditional setup — a structured interview, a survey, a knowledge
capture form — the expert is the source, the system is the recorder.
Information flows one way. The system’s only job is to not lose what
it’s given.</p>

<p>In a reciprocal setup, the system has a job <em>during</em> the conversation.
It asks targeted follow-ups. It summarises what it thinks it just
heard. It surfaces internal contradictions. It points at gaps the
expert didn’t realise were gaps. It can, when configured well, even
<em>challenge</em> the expert’s assumptions: “you said the timeout is
60 seconds, but earlier you mentioned the upstream times out at 45 —
which one wins?”</p>

<p>That isn’t just nicer UX. It’s a substantively different <em>epistemic</em>
process. Done right, reciprocal elicitation does for knowledge what
peer review does for a paper, or what code review does for a pull
request — it pressure-tests the artifact through structured
disagreement, in a way the artifact’s author cannot do alone.</p>

<p>The most active research community working on this is gathering
around a workshop series called RKEHAC (“Reciprocal Knowledge
Elicitation for Human-Agent Collaboration”), which is co-located
with the Hybrid Human-AI Intelligence conference; the
<a href="https://eavise.gitlab.io/rkehac/2026/">2026 edition runs in Brussels in July</a>.
Earlier iterations are written up in the
<a href="https://dl.acm.org/doi/10.1145/3623809.3623977">ACM Digital Library</a>.
This is a small but growing field, and it’s worth paying attention to.</p>

<h2 id="why-one-way-elicitation-keeps-failing">Why one-way elicitation keeps failing</h2>

<p>The deeper issue with traditional knowledge elicitation is that most
of the knowledge worth capturing isn’t <em>propositional</em> — it isn’t a
set of facts the expert could write down if you just gave them the
right form. It’s <em>tacit</em>.</p>

<p>Tacit knowledge is the kind that:</p>

<ul>
  <li>Lives in pattern-matching rather than facts (“this looks like the
thing we saw in Q3 of 2022”)</li>
  <li>Surfaces only in context (“if the queue is full and it’s a Monday,
the right move is X; on any other day it’s Y”)</li>
  <li>Includes the negative space (“we tried that in 2019, it almost
worked, here’s why we never did it again”)</li>
  <li>Doesn’t compress well into bullet points without losing what makes
it useful</li>
</ul>

<p>The retired engineer in my opening story knew thousands of small
rules of that shape. He didn’t think of them as rules. He couldn’t
enumerate them on demand. They came out when the situation matched.</p>

<p>This is, by the way, the same reason new hires don’t become useful
by reading the wiki. The wiki has the facts. The job has the
patterns.
<a href="https://www.apqc.org/training/knowledge-elicitation">APQC’s work on knowledge elicitation</a>
has been making this case for years — that the value of structured
elicitation is mostly in surfacing the patterns the expert doesn’t
think to mention. What’s new is that we now have the tooling, in
the form of capable language models, to actually do this at scale
without booking a forty-hour series of interviews.</p>

<h2 id="how-a-reciprocal-loop-actually-works">How a reciprocal loop actually works</h2>

<p>The skeleton of a working reciprocal elicitation session, stripped
to its essentials, looks something like this:</p>

<ol>
  <li><strong>The human describes a task, a decision, or an exception.</strong> Loose.
Conversational. Not a form field. Whatever shape they want to
describe it in.</li>
  <li><strong>The AI asks one — and only one — targeted follow-up.</strong> Not five.
Not a clarification form. A single question that pushes on whatever
was vague, incomplete, or surprising.</li>
  <li><strong>The AI summarises its current understanding back to the human.</strong>
In the human’s own vocabulary, preferably. With the unknowns
labeled as unknowns.</li>
  <li><strong>The human corrects, extends, or rejects.</strong> “That’s mostly right
but you’ve inverted the failure mode.” “You’re missing the case
where the upstream is degraded but not down.”</li>
  <li><strong>The refined version gets stored, <em>with a trace</em> of why it
reached that final shape.</strong> The trace is the part that gets
skipped, and the trace is the part that actually matters six
months later.</li>
</ol>

<p>Steps 2 through 4 loop until the conversation runs out of pressure,
not until a timer expires. That last detail is more important than
it sounds. Most of the bad versions of “AI interviewer” I’ve seen
in the wild stop too early — they capture the first plausible answer
and move on, because their underlying architecture treats the
conversation as a sequential form-fill rather than an iterative
search. The good ones keep going until they can’t find a new angle
to push from.</p>
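
<p>To make the shape of that loop concrete, here is a minimal Python
sketch of steps 1 through 5. Treat it as an illustration under
assumptions, not a reference implementation: <code class="language-plaintext highlighter-rouge">Session</code>,
<code class="language-plaintext highlighter-rouge">run_session</code>, <code class="language-plaintext highlighter-rouge">ask_follow_up</code>, <code class="language-plaintext highlighter-rouge">summarise</code> and <code class="language-plaintext highlighter-rouge">get_reply</code> are
names invented for this post, the model calls are passed in as plain
functions rather than tied to any real API, and both the stopping
rule (the interviewer returns <code class="language-plaintext highlighter-rouge">None</code> when it has no new angle) and
the sign-off check (the reply starts with “yes”) are deliberately
crude placeholders.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Session:
    """Hypothetical container for one elicitation session."""
    description: str                            # step 1: the human's loose description
    trace: list = field(default_factory=list)   # every question, reply and summary, in order
    artifact: Optional[str] = None              # latest summary the human signed off on


def run_session(
    description: str,
    ask_follow_up: Callable[[Session], Optional[str]],  # ONE question, or None when out of angles
    summarise: Callable[[Session], str],                # restates the current understanding
    get_reply: Callable[[str], str],                    # shows text to the human, returns their reply
) -> Session:
    session = Session(description=description)
    session.trace.append(("human", description))

    while True:
        # Step 2: exactly one targeted follow-up per turn.
        question = ask_follow_up(session)
        if question is None:
            break  # stop when there is no new angle, not when a timer expires
        session.trace.append(("ai_question", question))
        session.trace.append(("human", get_reply(question)))

        # Step 3: summarise back. Step 4: the human corrects, extends or rejects.
        summary = summarise(session)
        session.trace.append(("ai_summary", summary))
        verdict = get_reply(summary)
        session.trace.append(("human", verdict))

        # Step 5: keep only what the human signed off on; the trace rides along.
        if verdict.strip().lower().startswith("yes"):
            session.artifact = summary
    return session
</code></pre></div></div>

<p>Passing the model calls in as functions keeps the loop testable with
scripted replies. In a real system, <code class="language-plaintext highlighter-rouge">ask_follow_up</code> and
<code class="language-plaintext highlighter-rouge">summarise</code> would wrap whatever model you use, and a rejected
summary would feed back into the next question through the trace.</p>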

<p>The thing this loop produces, when it works, is qualitatively
different from a transcript. It’s closer to a <em>negotiated artifact</em>
than a recording. The human contributed the raw material. The AI
contributed structure, scepticism, and the discipline of not letting
ambiguity slide. Both ended up with something neither would have
produced alone.</p>

<p>That’s why the word “reciprocal” matters. Not because the AI is
“learning” in any deep sense during the session (most of the time
it isn’t — that’s a separate question about how the stored knowledge
gets fed back into the model). The reciprocity is in the immediate
exchange: both parties are doing intellectual work, and both walk
away with more than they brought.</p>

<h2 id="where-its-actually-being-used">Where it’s actually being used</h2>

<p>A non-exhaustive list of places I’ve seen this approach show real
results in the last eighteen months:</p>

<p><strong>Requirements engineering.</strong> This is where reciprocal elicitation
fits most naturally. Stakeholders are notoriously bad at writing
down what they want, partly because they don’t <em>know</em> what they want
until somebody pushes back. An AI that asks “you said you want the
report to be ‘fast’. What’s fast in this context — under a second,
under five seconds, under a minute?” will, on a long enough
timeline, extract better requirements than any survey ever has.
This is, incidentally, why I’ve spent enough time recently with
<a href="https://eliciteer.ai">Eliciteer</a> to recommend it: it’s built
specifically around this pattern. You write a briefing, share a
link, and the system runs the interview against whoever you sent
the link to, asking adaptive follow-ups along the way. It is, in
spirit, exactly the kind of tool the RKEHAC line of work is
arguing for.</p>

<p><strong>Incident post-mortems.</strong> Most post-mortems get done in a meeting
where the loudest person remembers wrong, and the quiet engineer
who actually saw the failure at 3am doesn’t get a word in. A
reciprocal interview, async, against each participant separately,
produces <em>strictly</em> better source material. You then merge the
versions and look at where they disagree. Where they disagree is
where the post-mortem’s real lessons live.</p>

<p><strong>Knowledge transfer before someone leaves.</strong> The exact scenario I
opened this post with. The thing that pulled the useful six pages
out of forty was the junior asking “wait, but why?” That role can
be played, fairly competently, by a model that’s been briefed to
ask exactly that kind of question. I would have killed for one back
then.</p>

<p><strong>Domain expert onboarding for AI products.</strong> If you’re building
anything that needs domain knowledge — legal, medical, industrial,
financial — you will, sooner or later, need to interview experts.
Doing it reciprocally compresses fifty hours of human interviewer
time into something an expert can do in three hours of their own,
at their own pace. The output also tends to be more honest, because
the expert isn’t performing for another human; they’re talking to
a system that doesn’t judge.</p>

<p><strong>Compliance and audit trails.</strong> This is the underrated use case.
The <em>trace</em> of why a knowledge artifact reached its final shape —
which questions were asked, which answers were accepted, which were
rejected and why — turns out to be exactly the kind of evidence a
serious audit wants to see. If you’ve ever tried to defend a
decision to a regulator with “the SME just kind of knew” as your
documentation, you’ll understand why this matters.</p>

<h2 id="design-principles-id-defend">Design principles I’d defend</h2>

<p>If you’re designing one of these, or evaluating a tool that claims
to do it, here are the things I’d insist on. None of these are
novel; most of them have been in the academic literature for years.
The reason to write them down anyway is that most products that
claim to be doing reciprocal elicitation are, in fact, doing
sequential form-fill with extra steps.</p>

<p><strong>One question at a time.</strong> Not three. Not “first, please tell us
about X, Y, and Z.” If you find yourself stacking sub-questions
into a single prompt, you’ve turned your interview back into a form
and the depth advantage is gone.</p>

<p><strong>Distinguish facts from judgments from assumptions.</strong> “The pod
restarts every 12 hours” is a fact. “The pod <em>should</em> restart
every 12 hours” is a judgment. “I think someone set that to 12
hours because Kubernetes was being weird in 2021” is an assumption.
Treating them as the same kind of statement is how knowledge bases
end up indistinguishable from urban legend.</p>
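
<p>One way to keep the three apart is to make the distinction part of
the stored record rather than a matter of phrasing. The sketch below
is a hypothetical data model, not something taken from a particular
tool: <code class="language-plaintext highlighter-rouge">Kind</code>, <code class="language-plaintext highlighter-rouge">Statement</code> and <code class="language-plaintext highlighter-rouge">needs_verification</code> are invented
names, and the <code class="language-plaintext highlighter-rouge">source</code> field is an assumption about what you would
want to keep.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    FACT = "fact"              # observed or measurable
    JUDGMENT = "judgment"      # what the expert thinks should be true
    ASSUMPTION = "assumption"  # a guess about why or how


@dataclass(frozen=True)
class Statement:
    text: str
    kind: Kind
    source: str  # who said it (hypothetical field)


statements = [
    Statement("The pod restarts every 12 hours", Kind.FACT, "sme"),
    Statement("The pod should restart every 12 hours", Kind.JUDGMENT, "sme"),
    Statement("Someone set 12 hours because Kubernetes was being weird in 2021",
              Kind.ASSUMPTION, "sme"),
]


def needs_verification(items):
    """Return everything that is not a fact, so it can be checked before anyone relies on it."""
    return [s for s in items if s.kind is not Kind.FACT]


for s in needs_verification(statements):
    print(s.kind.value, "-", s.text)
</code></pre></div></div>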

<p><strong>Use examples and counterexamples aggressively.</strong> “Tell me about a
time the system did the right thing here, and a time it didn’t.”
This is the single most reliable prompt for unlocking pattern-based
knowledge. It works in interviews. It works in interrogations. It
works on a 6-year-old trying to explain a story they made up. The
mechanism is universal.</p>

<p><strong>Let the human review the AI’s summary before storing.</strong> This is
non-negotiable. If the artifact in the knowledge base wasn’t
explicitly signed off on by the human, it isn’t their knowledge
anymore — it’s the model’s interpretation of their knowledge, and
those two things diverge faster than anyone wants to admit.</p>

<p><strong>Keep the trace.</strong> Not just the final answer. The path to it.
Which questions were asked. Which were skipped. Where the human
pushed back, and what they pushed back against. This is the part
that pays for itself the day someone asks “who decided this and
why?” and you don’t have to guess.</p>
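
<p>What “keeping the trace” might look like in storage is sketched
below, under assumptions: <code class="language-plaintext highlighter-rouge">TraceEvent</code> and its fields are a made-up
schema, and JSON Lines is just one convenient append-only format.
The point is only that pushback is recorded as a first-class event
rather than overwritten by the final answer.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import json
from dataclasses import dataclass, asdict


@dataclass
class TraceEvent:
    turn: int
    actor: str   # "human" or "ai"
    action: str  # "question", "answer", "summary", "pushback" or "skipped"
    text: str


def pushbacks(trace):
    """Return the events where the human pushed back on the AI's understanding."""
    return [e for e in trace if e.action == "pushback"]


def to_jsonl(trace):
    """Serialise the trace as one JSON object per line, in order."""
    return "\n".join(json.dumps(asdict(e)) for e in trace)


trace = [
    TraceEvent(1, "human", "answer", "The timeout is 60 seconds"),
    TraceEvent(2, "ai", "summary", "Requests fail after 60s"),
    TraceEvent(3, "human", "pushback", "No, the upstream times out at 45s first"),
]
print(to_jsonl(pushbacks(trace)))
</code></pre></div></div>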

<p><strong>Don’t lie about your epistemic state.</strong> If the system isn’t sure
what it just heard, it should say so. “I think you’re saying X, but
that contradicts what you said earlier about Y. Which one stands?”
This is awkward. It is also the difference between a knowledge
artifact you can build on and one you can’t.</p>

<h2 id="a-worked-example-bug-triage">A worked example: bug triage</h2>

<p>Let me ground this in something concrete, because abstract design
principles without a use case are how blog posts end up doing
nothing useful.</p>

<p>Say you’ve got a user-reported bug. The user is on the other side
of the planet, they don’t have the patience for a meeting, and
your support engineer is going to have to reconstruct what happened
from a single sentence in a ticket.</p>

<p>The form-based version of this is the standard “browser, OS,
reproduction steps” template. About 40% of tickets get filled in
properly. The other 60% say things like “doesn’t work on Mac” and
that’s it.</p>

<p>The reciprocal version, in practice, looks something like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>User:   The export button doesn't work on my computer.

AI:     Got it. One thing would help me reproduce. When you
        click "Export", does anything happen at all (a loading
        spinner, an error message, a frozen tab), or does the
        button just stay there with no response?

User:   It spins for a bit then nothing.

AI:     OK, that's useful. The spinner running and then stopping
        suggests the request started but didn't complete.
        Roughly how long does it spin before stopping?

User:   Maybe 30 seconds? Feels like a timeout.

AI:     That matches a timeout pattern. Last one for now — does
        this happen for every export, or only for certain ones?
        (For example: a particular date range, a particular
        report type, a particular project.)

User:   Only on the big quarterly report. The monthly works fine.

AI:     Summary so far, please correct anything wrong:
        - Export only fails for the quarterly report
        - Monthly export works
        - Failure pattern: button shows a spinner for ~30s,
          then nothing happens (no error message)
        - Likely cause based on this pattern: request timeout
          on the server, possibly related to report size

User:   Yes, exactly.
</code></pre></div></div>

<p>Compare that to “doesn’t work on Mac”. The reciprocal version got
to a probable root cause (server-side timeout on large report
generation) in four exchanges, <em>and</em> it has a trace that the
support engineer can paste into the ticket. The human did about
twenty seconds of typing. The AI did the work of pulling the
shape of the problem out of them.</p>

<p>You could do this with a human interviewer too, of course. You
could do it with a human interviewer better, in fact. The problem
is your support team has six people and three hundred tickets a
day, and the math doesn’t work. The reciprocal version scales.
That’s the actual pitch: not “the AI is smarter” but “the
discipline of a structured interview, run cheaply enough to apply
to every ticket”.</p>

<h2 id="doing-it-without-building-it-yourself">Doing it without building it yourself</h2>

<p>This is the practical question most teams will land on after
reading the above. Building a reciprocal elicitation pipeline from
scratch — managing the conversation state, the follow-up policy,
the summary loop, the trace storage — is a substantial piece of
engineering, and the academic literature is honest about how easy
it is to do it badly.</p>

<p>The honest answer is that you have three options, in increasing
order of effort:</p>

<ol>
  <li><strong>Use an existing product.</strong> <a href="https://eliciteer.ai">Eliciteer</a>
is the one I happen to know best — it implements the core
reciprocal loop (briefing, async interview, adaptive
follow-ups, structured summary) as a hosted service, with a
CLI and Python SDK for piping results into your own systems.
It’s the closest thing I’ve seen to an off-the-shelf implementation
of the loop the RKEHAC work describes. There are others worth looking at, but most of
the ones I’ve evaluated stop at “AI-generated survey questions”
without the adaptive depth that makes this approach work.</li>
  <li><strong>Build a thin wrapper on top of a frontier model.</strong> If your use
case is narrow and you have specific reasons to control the
prompt logic, building your own loop on top of an LLM API is
feasible. Expect to spend more time on the conversation policy
— when to ask, when to summarise, when to stop — than on the
model itself.</li>
  <li><strong>Build it properly from scratch.</strong> Only worth doing if you
have unusual requirements (regulated industries, on-prem,
custom fine-tunes). In that case, look at the
<a href="https://eavise.gitlab.io/rkehac/2026/">RKEHAC workshop proceedings</a>
and budget six months.</li>
</ol>
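
<p>For option two, the conversation policy is where the effort goes, so
it helps to see how small the core decision can be. The sketch below
is one possible policy under stated assumptions: <code class="language-plaintext highlighter-rouge">TurnState</code>,
<code class="language-plaintext highlighter-rouge">next_action</code> and the two thresholds are hypothetical, and how you
count “open questions” or “stale turns” is the genuinely hard part
that this sketch leaves out.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from dataclasses import dataclass


@dataclass
class TurnState:
    open_questions: int         # vague or contradictory points still unresolved
    answers_since_summary: int  # human answers not yet reflected back
    stale_turns: int            # consecutive turns that surfaced nothing new


def next_action(state, summarise_every=3, max_stale=2):
    """Decide whether to ask, summarise or stop. Thresholds are illustrative guesses."""
    out_of_pressure = state.open_questions == 0 or state.stale_turns >= max_stale
    if state.answers_since_summary >= summarise_every:
        return "summarise"
    if out_of_pressure:
        # Reflect back anything unsummarised before ending the session.
        return "summarise" if state.answers_since_summary else "stop"
    return "ask"


print(next_action(TurnState(open_questions=2, answers_since_summary=1, stale_turns=0)))
</code></pre></div></div>

<p>Keeping the policy a pure function of a small state object makes it
cheap to test against recorded sessions before any model is involved.</p>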

<p>For most teams, option one is the right answer. I say this not
because I want to sell you anything — I have no financial
relationship with any of the products I named — but because the
specific failure mode of “we built our own AI interviewer” tends
to be six months of engineering effort that produces a worse
result than something already on the shelf. There are interesting
problems in this space worth doing yourself. The bare conversation
loop is not one of them.</p>

<h2 id="frequently-asked-questions">Frequently asked questions</h2>

<p><strong>Is reciprocal knowledge elicitation the same as a chatbot interview?</strong>
No. A chatbot interview, in the way the term is normally used,
follows a scripted decision tree. A reciprocal elicitation system
adapts its questioning based on what’s already been said, surfaces
contradictions actively, and produces a negotiated artifact rather
than a transcript. The key difference is whether the system is
<em>pushing</em> on the answers it gets, or just collecting them.</p>

<p><strong>Doesn’t this just push the bias into the AI?</strong>
A fair concern. The honest answer: yes, <em>some</em> bias is shifted from
the human side (memory, salience, social desirability) to the model
side (training data, prompt design, hallucination). The mitigation
is the trace and the human review step. If you can see why the
model went where it went, and the human signs off on the artifact
before it’s stored, you’ve at least got auditable bias instead of
invisible bias. That’s a real improvement, but it isn’t a fix.</p>

<p><strong>How long should one of these sessions be?</strong>
The literature suggests diminishing returns past about 45 minutes
of interaction time, and frustration sets in past 60. The
practical advice is: design for 15–30 minute sessions, with the
option to come back. Long async sessions, split across days, tend
to produce richer artifacts than single sittings — because the
human has time to think about the questions in between.</p>

<p><strong>Can I use this to extract knowledge from junior engineers, not
just senior ones?</strong>
Yes, and you probably should more often than you do. Junior
engineers see things senior engineers have stopped seeing because
they’ve normalised them. A reciprocal session aimed at “what
confused you in your first month?” produces some of the most
useful onboarding documentation a team can have. I have, more
than once, learned things about my own systems by reading the
output of such sessions.</p>

<p><strong>What’s the difference between this and just having a smart
interviewer?</strong>
A smart interviewer is, almost by definition, better — for the one
interview they’re doing. They don’t scale. The point of reciprocal
elicitation isn’t to beat the best human interviewer; it’s to do
better than the median form-fill at near-zero marginal cost.</p>

<h2 id="why-i-think-this-matters">Why I think this matters</h2>

<p>Two reasons.</p>

<p>The first is mundane: institutional knowledge is leaving most
organisations faster than they can capture it. Retirements,
attrition, layoffs, reorgs. Forms haven’t worked. Wiki pages
haven’t worked. Recorded interviews haven’t really worked either,
because nobody reads them. A method that produces <em>negotiated
structured knowledge</em>, at scale, asynchronously, at near-zero
incremental cost per expert, is the first new tool we’ve had for
this problem in about thirty years. That alone makes it worth
paying attention to.</p>

<p>The second is less mundane. A reciprocal elicitation loop is, if
you squint, a small instance of the broader pattern we’re all
going to have to figure out: how do humans and AI agents <em>think
together</em>, in a way that takes both their strengths seriously?
The research community has called this “hybrid human-AI
intelligence”, and it’s still mostly aspirational. Reciprocal
knowledge elicitation is one of the first concrete, well-defined
problems where you can actually see what the answer might look
like in practice. It’s a tractable special case of a much larger
question.</p>

<p>That, more than the immediate productivity wins, is why I think
RKEHAC and the work coming out of it deserve more attention than
they’re currently getting. It’s not just a better way to do
expert interviews. It’s a small, well-instrumented dress
rehearsal for the kind of collaboration that’s about to become
the default.</p>]]></content><author><name>fidelitas.io</name></author><category term="knowledge-management" /><category term="ai" /><category term="human-ai-collaboration" /><category term="tacit-knowledge" /><category term="interviews" /><summary type="html"><![CDATA[Traditional knowledge elicitation is a one-way street: an expert talks, a system listens. A new wave of human-agent research is rewriting that contract. Here's what reciprocal knowledge elicitation actually means, why it matters for capturing tacit knowledge, and how to design one without losing the plot.]]></summary></entry><entry><title type="html">Rust vs. Go in 2026 — a calmer take</title><link href="https://fidelitas.io/2026/05/rust-vs-go-2026/" rel="alternate" type="text/html" title="Rust vs. Go in 2026 — a calmer take" /><published>2026-05-18T16:00:00+02:00</published><updated>2026-05-18T16:00:00+02:00</updated><id>https://fidelitas.io/2026/05/rust-vs-go-2026</id><content type="html" xml:base="https://fidelitas.io/2026/05/rust-vs-go-2026/"><![CDATA[<p>The Rust-vs-Go discourse online is, charitably, not great.</p>

<p>If you spend any time on Hacker News or the relevant subreddits you
will get the impression that this is a fundamentally tribal question.
One camp believes the borrow checker is the second coming and anyone
still writing garbage-collected code is committing professional
malpractice. The other camp believes Rust is a self-indulgent
academic exercise and real engineers ship Go. Both camps tend to
have very strong opinions on what kind of person prefers their
opponent’s language.</p>

<p>I have, for the last five years, shipped meaningful production code
in both. For most of that time my answer to “which should I use?”
was a cowardly “it depends,” which is the answer that gets the most
nods at meetups and the fewest fights, and is, on careful inspection,
correct but useless.</p>

<p>In 2026 I think we have enough collective scar tissue to be a bit
more specific. So here’s my attempt at a calmer version. Not the
hot take. The cooler, slower take, where I try to commit to actual
positions and explain them.</p>

<h2 id="the-tldr-table">The TL;DR table</h2>

<p>If you only have thirty seconds:</p>

<table>
  <thead>
    <tr>
      <th>Workload</th>
      <th>What I’d reach for first</th>
      <th>Why</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>HTTP/gRPC service, normal CRUD</td>
      <td>Go</td>
      <td><code class="language-plaintext highlighter-rouge">net/http</code>, generics, and a 4-second compile beat everything else in this niche</td>
    </tr>
    <tr>
      <td>Latency-critical, p99.9 budget under 1ms</td>
      <td>Rust</td>
      <td>No GC tail. The first time you see a 30ms Go GC pause in a hot path you’ll understand</td>
    </tr>
    <tr>
      <td>CLI tool you want people to install</td>
      <td>Rust, narrowly</td>
      <td>Cargo + static binaries is the most painless distribution story going</td>
    </tr>
    <tr>
      <td>Embedded, firmware, WASM</td>
      <td>Rust</td>
      <td>Go’s runtime is too heavy. This isn’t a close call.</td>
    </tr>
    <tr>
      <td>Internal batch jobs and data pipelines</td>
      <td>Go</td>
      <td>Channels are the right shape for the work, throughput rarely the bottleneck</td>
    </tr>
    <tr>
      <td>Library you want consumed from C, Python, Ruby, Node</td>
      <td>Rust</td>
      <td>C ABI + cbindgen is the lingua franca. Go’s FFI story is a journey nobody enjoys</td>
    </tr>
    <tr>
      <td>Anything where the spec changes weekly for the next year</td>
      <td>Go</td>
      <td>The “blank page to production” gradient is gentler</td>
    </tr>
    <tr>
      <td>Anything that has to be correct for the next five years</td>
      <td>Rust</td>
      <td>The type system compounds</td>
    </tr>
  </tbody>
</table>

<p>If your case isn’t in there, read on. If it is, you can probably stop
reading and go to lunch.</p>

<h2 id="whats-changed-since-2021">What’s changed since 2021</h2>

<p>It’s worth pausing on this because a lot of online discourse is
still arguing about the 2021 versions of both languages.</p>

<h3 id="rust-in-2026">Rust, in 2026</h3>

<ul>
  <li><strong>Compile times are not the deal-breaker they were.</strong> They are
still slow. They are not “go make coffee” slow anymore. With
<code class="language-plaintext highlighter-rouge">cargo check</code>, workspace splitting, <code class="language-plaintext highlighter-rouge">sccache</code>, and <code class="language-plaintext highlighter-rouge">mold</code> as your
linker, an incremental rebuild on a medium-sized service is
comfortably under ten seconds. It is still slower than Go’s
equivalent, and probably always will be. You will live.</li>
  <li><strong>Async is, mostly, fine.</strong> Tokio won and is the default.
<code class="language-plaintext highlighter-rouge">async fn</code> in traits works without <code class="language-plaintext highlighter-rouge">async-trait</code> ceremony. The
<code class="language-plaintext highlighter-rouge">Send</code> bound and pinning footguns are still there but they’re
documented now and most of the mainstream crates handle them
for you.</li>
  <li><strong>The ecosystem is enormous.</strong> The hard problem stopped being
“is there a crate for this?” and became “which of the eleven
crates is actively maintained?” This is, on balance, a better
problem to have.</li>
</ul>

<h3 id="go-in-2026">Go, in 2026</h3>

<ul>
  <li><strong>Generics happened, and the sky didn’t fall.</strong> The stdlib has
absorbed them tastefully (<code class="language-plaintext highlighter-rouge">slices</code>, <code class="language-plaintext highlighter-rouge">maps</code>, <code class="language-plaintext highlighter-rouge">cmp</code>). Go code in
2026 is materially more expressive than it was in 2021 without
being less readable, which is a trick I would not have predicted.</li>
  <li><strong>The runtime got quietly, dramatically faster.</strong> GC pauses,
goroutine scheduling, escape analysis: all materially better.
The same Go service in 2026 vs 2021 with zero source changes
is meaningfully faster.</li>
  <li><strong>Modules stopped being a meme.</strong> <code class="language-plaintext highlighter-rouge">go.mod</code>, <code class="language-plaintext highlighter-rouge">go.sum</code>, <code class="language-plaintext highlighter-rouge">go work</code>
finally feel like a coherent story.</li>
  <li><strong>The language has stayed deliberately small.</strong> This is the
thing the Go team gets the least credit for, and it is, in my
view, the most important property the language has.</li>
</ul>

<p>Both languages, in other words, are noticeably better than the
languages people are still arguing about online. Both are also
<em>more themselves</em> — Rust has gotten Rustier, Go has gotten Go-ier.
Which is to say: the trade-off between them is sharper now, not
fuzzier.</p>

<h2 id="where-rust-earns-its-keep">Where Rust earns its keep</h2>

<p>Three cases. I will name them.</p>

<p><strong>Case 1: tail latency that actually has to hold.</strong> If your SLO is
“p99.9 under a millisecond” and you can’t afford to burn entire
cores on amortising GC, Go will, sooner or later, surprise you.
Rust won’t. I’ve been part of two metrics-ingestion rewrites where
the math came out in Rust’s favour by roughly 2.5x on
throughput and an order of magnitude on tail latency. Those numbers
are real and they were not the result of micro-optimising Go for
six months first — we tried that, it helped some, the GC was still
the GC.</p>

<p><strong>Case 2: memory pressure you can’t predict.</strong> Streaming pipelines,
parsers, anything that fans in a million small allocations per
second. Rust’s lack of GC isn’t free: you pay for it in ownership
annotations and <code class="language-plaintext highlighter-rouge">'a</code> lifetimes that occasionally make grown engineers
cry. But the worst case is predictable, which is precisely what you
need when no worst case in your test suite survives contact with
reality.</p>

<p><strong>Case 3: software you can’t easily redeploy.</strong> Firmware, agents in
customer data centers, browser-side WASM, kernel modules, anything
where “we’ll just push a hotfix” is not a sentence you get to say.
Rust’s “if it compiles, it tends to work” property is, in this
class of problem, the difference between a quiet on-call rotation
and a recurring nightmare. I have lived both. I know which one I
prefer.</p>

<p>A worked example, badly anonymised: a metrics ingestion service my
team ran absorbed ~3.2M data points per second per pod, peak. It
started in Go. It worked, but cost us 80 pods and uncomfortable
p99.9 latency. We rewrote it in Rust over about ten weeks.
Throughput went up by a factor of 2.4. p99.9 ingest latency
dropped from ~4ms to ~700µs. Pod count went from 80 to 32. The
rewrite paid for itself, conservatively, in seven weeks of saved
cloud spend.</p>
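<p>The payback arithmetic is simple enough to sketch. The Go snippet below is only an illustration: the pod counts come from the paragraph above, while <code class="language-plaintext highlighter-rouge">podCostPerWeek</code> and <code class="language-plaintext highlighter-rouge">rewriteCost</code> are hypothetical placeholders in arbitrary units, chosen so the result lands near the seven weeks quoted. They are not real figures.</p>

```go
package main

import (
	"fmt"
	"math"
)

// paybackWeeks returns how many weeks of saved pod spend it takes to cover
// a one-off rewrite cost. The cost inputs are hypothetical placeholders.
func paybackWeeks(podsBefore, podsAfter int, podCostPerWeek, rewriteCost float64) float64 {
	weeklySaving := float64(podsBefore-podsAfter) * podCostPerWeek
	if weeklySaving <= 0 {
		return math.Inf(1) // no saving means the rewrite never pays for itself
	}
	return rewriteCost / weeklySaving
}

func main() {
	// 80 -> 32 pods is from the text; 250 and 84000 are made-up cost units.
	weeks := paybackWeeks(80, 32, 250, 84000)
	fmt.Printf("payback after %.1f weeks\n", weeks)
}
```

<p>The useful part is the shape, not the numbers: if the weekly saving is small relative to the engineering cost, the same function tells you the rewrite is a multi-year bet.</p>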

<p>That kind of math is uncommon. When it shows up, it is <em>very</em>
loud.</p>

<h2 id="where-go-earns-its-keep">Where Go earns its keep</h2>

<p>Three cases here too. Different cases.</p>

<p><strong>Case 1: CRUD that’s shaped like CRUD.</strong> A frankly enormous
fraction of internal services do this: take a request, call two
other services, write to a database, return JSON. Go does this
without ceremony. The stdlib does the boring parts. The code
that comes out is approximately the code anyone on the team would
write, in approximately the same way. That last property —
<em>everyone writes Go the same way</em> — is genuinely valuable at
scale and undervalued in language debates.</p>

<p><strong>Case 2: teams that change a lot.</strong> A new hire is productive in
Go in days. A new hire is productive in Rust in… longer. If
your org has rotations, contractors, “we’re seconding one
engineer for a quarter” arrangements, or anything else that
churns the people on the codebase, Go’s onboarding gradient is
straightforwardly friendlier. Rust pays back over years. If
you don’t have years, you don’t get the payback.</p>

<p><strong>Case 3: pipelines and orchestration.</strong> Goroutines plus channels
remain the cleanest concurrency primitives in any mainstream
language I’ve used. Tokio is great. It is not as natural for
“a thousand small workers reading from a queue” as Go’s primitives
are, and pretending otherwise is something I think the Rust
community could be slightly more honest about.</p>
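<p>For reference, this is roughly the shape I mean by “a thousand small workers reading from a queue”. It’s a minimal sketch, not production code: <code class="language-plaintext highlighter-rouge">processAll</code> and the doubling <code class="language-plaintext highlighter-rouge">handle</code> function are hypothetical names made up for the example, and a real worker would do I/O and handle errors.</p>

```go
package main

import (
	"fmt"
	"sync"
)

// processAll fans jobs out to `workers` goroutines and collects one result
// per job. Results arrive in completion order, not input order.
func processAll(jobs []int, workers int, handle func(int) int) []int {
	queue := make(chan int)
	results := make(chan int)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for job := range queue { // each worker pulls until the queue closes
				results <- handle(job)
			}
		}()
	}

	// Feed the queue, then close it so the workers' range loops end.
	go func() {
		for _, j := range jobs {
			queue <- j
		}
		close(queue)
	}()

	// Close results once every worker has returned.
	go func() {
		wg.Wait()
		close(results)
	}()

	out := make([]int, 0, len(jobs))
	for r := range results {
		out = append(out, r)
	}
	return out
}

func main() {
	jobs := make([]int, 10000)
	for i := range jobs {
		jobs[i] = i
	}
	out := processAll(jobs, 1000, func(n int) int { return n * 2 })
	fmt.Println("processed", len(out), "jobs")
}
```

<p>Closing the <code class="language-plaintext highlighter-rouge">queue</code> channel is what ends each worker’s <code class="language-plaintext highlighter-rouge">range</code> loop, so there is no separate shutdown signal to manage.</p>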

<p>The thing nobody mentions in these debates: the median Go program
written by the median Go programmer is <em>strikingly</em> better than the
median Rust program written by the median Rust programmer. Not
because Rust is worse — it’s better, on raw expressiveness — but
because Rust gives you more rope, and Go won’t let you go looking
for rope at all. There’s a real virtue in a language that limits
how clever you can be.</p>

<h2 id="how-i-actually-decide-in-practice">How I actually decide, in practice</h2>

<p>When the technical answer is “either of these works”, which is
honestly most of the time, here’s how I break the tie:</p>

<ol>
  <li><strong>Who is going to maintain this in three years?</strong> If the answer
is “we have no idea, probably people we haven’t hired yet,
possibly a contractor in a region with a thin Rust community” —
lean Go. If the answer is “the same five people, with maybe one
rotation” — lean Rust.</li>
  <li><strong>What’s the cost of a runtime panic?</strong> Low — either works.
High, in the way that costs money or trust or shows up in the
news — Rust’s type system pays for itself.</li>
  <li><strong>How much is the spec going to change in the next year?</strong> A
lot — lean Go. The cost of rewriting Rust code under a moving
spec is, in my experience, materially higher than the cost of
rewriting Go code. Rust pays off once the spec calms down.
Until it does, you pay the tax for nothing.</li>
</ol>

<h2 id="things-i-will-not-argue-about">Things I will not argue about</h2>

<p>Some statements I see repeated online and have stopped engaging
with, because they are either wrong, outdated, or bad-faith:</p>

<ul>
  <li>“Rust is too complex to ship in a team.” Demonstrably false.
Many teams ship Rust every week, including some of the largest
infrastructure organisations on the planet. Pick a different
argument.</li>
  <li>“Go isn’t a real systems language.” It runs Kubernetes. It
runs Docker. It runs most of the cloud-native ecosystem your
Rust service is deployed inside. It is, by any sane definition,
a systems language.</li>
  <li>“GC is always a problem at scale.” It isn’t. Plenty of
high-throughput services run garbage-collected runtimes. The
question is whether <em>your</em> workload is GC-pressure-bound. Most
workloads aren’t.</li>
  <li>“The borrow checker is just a fancy linter.” It is, and that
linter prevents an entire category of CVEs. The CVE-prevention
business is, depending on your industry, worth quite a lot of
money.</li>
</ul>

<h2 id="what-id-actually-do-on-a-tuesday">What I’d actually do, on a Tuesday</h2>

<p>If I were starting a new project tomorrow:</p>

<ul>
  <li>A new product where everything’s in motion: Go. No hesitation.</li>
  <li>A service in our hot path with a meaningful latency budget: Rust.</li>
  <li>A CLI tool I plan to distribute to other engineers: Rust. Cargo
builds a single self-contained binary, fully static if you target
musl. It is the single best distribution story in any language
ecosystem in 2026 and I am tired of pretending otherwise.</li>
  <li>A library that other teams in other languages will consume: Rust,
exposed via a C ABI. This is, again, not a close call.</li>
  <li>An internal Lambda or Cloud Function: Go, because cold start
still matters and Go’s binaries are small enough that nobody
has to think about it.</li>
</ul>

<p>I no longer feel guilty using both in the same organisation. The
trick is being honest about <em>which problem you have</em>, not which
language matches your team’s identity. Far too many teams choose
their language because of identity (“we’re a Rust shop”) rather
than fit, and that’s how you end up with a monorepo full of Rust
config-loading scripts that take eight seconds to start, and Go
services rewritten in Rust because it sounded cool, not because
the math worked.</p>

<p>Pick the boring tool for the boring problem. Pick the precise
tool for the precise problem. Be willing to be wrong; be willing
to revise. That’s the entire thing. It just sounds less impressive
than tribalism does.</p>]]></content><author><name>fidelitas.io</name></author><category term="rust" /><category term="go" /><category term="languages" /><category term="opinion" /><summary type="html"><![CDATA[After five years of shipping both in production, my take has stopped being 'it depends' and started being 'it really depends — here is precisely on what'.]]></summary></entry><entry><title type="html">A love letter to cURL — the world’s most underrated debugger</title><link href="https://fidelitas.io/2026/05/a-love-letter-to-curl/" rel="alternate" type="text/html" title="A love letter to cURL — the world’s most underrated debugger" /><published>2026-05-17T10:15:00+02:00</published><updated>2026-05-17T10:15:00+02:00</updated><id>https://fidelitas.io/2026/05/a-love-letter-to-curl</id><content type="html" xml:base="https://fidelitas.io/2026/05/a-love-letter-to-curl/"><![CDATA[<p>I want to start with a confession. For about a decade of my career, I
used <code class="language-plaintext highlighter-rouge">curl</code> the way most engineers do: to copy-paste the contents of
the “Copy as cURL” button from Chrome DevTools into a shell, run it
once, see what came back, and never think about the tool again.</p>

<p>I want to start with a confession because, looking back, that was an
embarrassing waste of <code class="language-plaintext highlighter-rouge">curl</code>. Sort of like owning a piano and only
playing the C-major scale with one finger. There’s a whole instrument
in there. Most engineers I know — including, until recently, me —
don’t realise it.</p>

<p>This is a long-overdue thank-you post to the most underrated debugger
in our profession.</p>

<h2 id="a-brief-biased-argument">A brief, biased argument</h2>

<p>I have nothing against GUI HTTP clients. Postman has, on net,
probably made the world a better place for the average web developer.
Insomnia has a nicer UI than Postman. Bruno is open source and ships
faster than either. Hoppscotch runs in a browser. These are all good
tools and they all have their place.</p>

<p>Their place is <em>exploration</em>. Building a request from scratch.
Sharing a collection with the rest of the team. Onboarding a new
hire. Generating client code from a spec. Those are real, valuable
workflows and a GUI is the right shape for them.</p>

<p>The place a GUI is wrong for is <em>debugging</em>. Debugging happens
late, on someone else’s box, over an SSH session, when something is
already on fire and the GUI is not installed and you don’t have the
patience to install it. In that mode, the tool that wins is the one
that’s already there.</p>

<p><code class="language-plaintext highlighter-rouge">curl</code> is already there. It is, in fact, the most “already there”
piece of software on the modern internet. There is <code class="language-plaintext highlighter-rouge">curl</code> in your
container. There is <code class="language-plaintext highlighter-rouge">curl</code> in your Raspberry Pi. There is, somewhat
absurdly, <code class="language-plaintext highlighter-rouge">curl</code> running on Mars right now. It will outlive you. You
should probably learn it properly.</p>

<h2 id="the-six-flags-that-get-you-80-of-the-way">The six flags that get you 80% of the way</h2>

<p>Memorise these. They go together.</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">-sS</span> <span class="se">\</span>
     <span class="nt">-i</span> <span class="se">\</span>
     <span class="nt">-X</span> POST <span class="se">\</span>
     <span class="nt">-H</span> <span class="s1">'Content-Type: application/json'</span> <span class="se">\</span>
     <span class="nt">-H</span> <span class="s1">'Authorization: Bearer eyJhbGciOi...'</span> <span class="se">\</span>
     <span class="nt">--data</span> <span class="s1">'{"hello":"world"}'</span> <span class="se">\</span>
     https://api.example.com/v1/things
</code></pre></div></div>

<p>What’s going on:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">-s</code> makes curl shut up about the progress meter. <code class="language-plaintext highlighter-rouge">-S</code> puts
errors back on stderr. The combination — <code class="language-plaintext highlighter-rouge">-sS</code> — is what you
want approximately always. Without <code class="language-plaintext highlighter-rouge">-S</code>, a silent failure is
<em>genuinely silent</em>. You will, at some point, lose half an hour
to this. Save yourself.</li>
  <li><code class="language-plaintext highlighter-rouge">-i</code> includes the response headers in stdout. They are, more
often than not, the part of the response you actually care
about. Run <code class="language-plaintext highlighter-rouge">curl</code> without <code class="language-plaintext highlighter-rouge">-i</code> and you’ll spend the next ten
minutes wondering why your authenticated endpoint returns an
empty body. It’s because it’s returning a <code class="language-plaintext highlighter-rouge">307</code> and you can’t
see it.</li>
  <li><code class="language-plaintext highlighter-rouge">-X POST</code> does what it says. There’s <code class="language-plaintext highlighter-rouge">-d</code>/<code class="language-plaintext highlighter-rouge">--data</code> for the body.
There’s also <code class="language-plaintext highlighter-rouge">--data-raw</code>, <code class="language-plaintext highlighter-rouge">--data-binary</code>, <code class="language-plaintext highlighter-rouge">--data-urlencode</code>,
and a couple of others, each for a slightly different shape of
payload. The mnemonic I use: <code class="language-plaintext highlighter-rouge">--data</code> is for inline JSON, <code class="language-plaintext highlighter-rouge">--data-binary</code>
is for the moment you discover that <code class="language-plaintext highlighter-rouge">--data @file</code> strips the
newlines and carriage returns out of your payload.</li>
</ul>

<p>That’s the boring 80%. The fun starts past it.</p>

<h2 id="when-you-need-to-know-what-actually-went-over-the-wire">When you need to know what <em>actually</em> went over the wire</h2>

<p>The single most useful flag in <code class="language-plaintext highlighter-rouge">curl</code>, and almost nobody talks about it:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>curl <span class="nt">--trace-ascii</span> /tmp/trace.txt https://api.example.com/healthz
</code></pre></div></div>

<p>This writes every byte sent and received, in a readable form, to
<code class="language-plaintext highlighter-rouge">/tmp/trace.txt</code>. Not “the headers”. Every byte. Including the bytes
that are causing you to lose your mind.</p>

<p>A non-exhaustive list of bugs I have personally solved by <code class="language-plaintext highlighter-rouge">cat</code>-ing
a <code class="language-plaintext highlighter-rouge">--trace-ascii</code> output and squinting at it:</p>

<ul>
  <li>A reverse proxy that was quietly stripping <code class="language-plaintext highlighter-rouge">Transfer-Encoding:
chunked</code> and breaking streaming uploads in production. The proxy
config was managed by another team. The bug had been live for
about six weeks. We found it in twenty minutes.</li>
  <li>An HTTP client library appending an extra <code class="language-plaintext highlighter-rouge">\r\n</code> to a JSON body,
which the server’s body parser silently treated as “end of body,
but the body is invalid, return 400 with no detail.”</li>
  <li>A load balancer rejecting HTTP/2 in production but accepting it
in staging. Root cause: a misconfigured ALPN protocol list, six
letters different. The “X-Service-Version” header was right.
The TLS handshake metadata was not.</li>
  <li>An “invalid signature” error caused by a single zero-width space
pasted from someone’s Slack message into a config file. I am
not making that up. I wish I were.</li>
</ul>

<p>I want to be clear here: I could not have caught any of these in
Postman. Postman would have shown me “400 Bad Request” and a body
that said “invalid request”. <code class="language-plaintext highlighter-rouge">--trace-ascii</code> showed me the bytes.
The bytes were lying. The bytes are <em>always</em> lying. You just have
to look at them.</p>

<h2 id="the-timing-breakdown">The timing breakdown</h2>

<p>This is the one I tell every junior engineer about and they look at
me like I just handed them a cheat code, which, in fairness, I sort
of did. Add this to your <code class="language-plaintext highlighter-rouge">~/.curlrc</code>:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># ~/.curlrc</span>
<span class="c"># In a curl config file, quote with double quotes and keep the value on one line.</span>
<span class="nt">-w</span> <span class="s2">"\nnamelookup:     %{time_namelookup}s\nconnect:        %{time_connect}s\nappconnect:     %{time_appconnect}s\npretransfer:    %{time_pretransfer}s\nstarttransfer:  %{time_starttransfer}s\ntotal:          %{time_total}s\nhttp_code:      %{http_code}\nsize_download:  %{size_download} bytes\n"</span>
</code></pre></div></div>

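<p>Now every <code class="language-plaintext highlighter-rouge">curl</code> invocation tells you, for free, <em>exactly where</em> the
time went. Each value is cumulative from the start of the request, so
the phases fall out by subtraction. DNS slow? That’s <code class="language-plaintext highlighter-rouge">namelookup</code>.
TLS handshake slow? That’s <code class="language-plaintext highlighter-rouge">appconnect</code> minus <code class="language-plaintext highlighter-rouge">connect</code>. Server slow?
That’s <code class="language-plaintext highlighter-rouge">starttransfer</code> minus <code class="language-plaintext highlighter-rouge">pretransfer</code>. Response body slow to
arrive? That’s <code class="language-plaintext highlighter-rouge">total</code> minus <code class="language-plaintext highlighter-rouge">starttransfer</code>.</p>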
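<p>Because the <code class="language-plaintext highlighter-rouge">-w</code> values are cumulative from the start of the request, turning them into per-phase durations is mechanical subtraction. A minimal Go sketch, assuming you’ve already parsed the numbers out of curl’s output; the <code class="language-plaintext highlighter-rouge">Timings</code> and <code class="language-plaintext highlighter-rouge">Phases</code> types are hypothetical names for this example, and the sample values are made up.</p>

```go
package main

import "fmt"

// Timings holds curl's cumulative -w values, in seconds since the request began.
type Timings struct {
	NameLookup, Connect, AppConnect, PreTransfer, StartTransfer, Total float64
}

// Phases is the time spent in each stage, derived by subtraction. The small
// gap between appconnect and pretransfer is left unattributed.
type Phases struct {
	DNS, TCP, TLS, ServerWait, Download float64
}

func (t Timings) Phases() Phases {
	p := Phases{
		DNS:        t.NameLookup,
		TCP:        t.Connect - t.NameLookup,
		ServerWait: t.StartTransfer - t.PreTransfer,
		Download:   t.Total - t.StartTransfer,
	}
	// curl reports time_appconnect as 0 when there was no TLS handshake.
	if t.AppConnect > 0 {
		p.TLS = t.AppConnect - t.Connect
	}
	return p
}

func main() {
	// Made-up sample values, not a real measurement.
	t := Timings{0.012, 0.030, 0.110, 0.111, 0.260, 0.275}
	p := t.Phases()
	fmt.Printf("dns=%.3fs tcp=%.3fs tls=%.3fs server=%.3fs download=%.3fs\n",
		p.DNS, p.TCP, p.TLS, p.ServerWait, p.Download)
}
```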

<p>If you’ve ever spent half an hour in a meeting arguing about whether
your latency problem is “the network” or “the application”, and the
answer turned out to be “TLS handshake on a cold connection”, this
is the tool that would have ended that meeting in 30 seconds.</p>

<h2 id="replaying-traffic-badly-with-shell">Replaying traffic, badly, with shell</h2>

<p>Most teams I’ve worked with have at least once gone through the
ritual of “we need a load testing tool”. They evaluate k6, locust,
vegeta, gatling, sometimes JMeter (don’t), pick one, document
something on Confluence, and then nobody ever uses it again because
running it requires too much setup.</p>

<p><code class="language-plaintext highlighter-rouge">curl</code> plus <code class="language-plaintext highlighter-rouge">xargs</code> plus a directory of captured requests will, in
my experience, cover 70% of the use cases of an actual load testing
tool, with zero setup:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># Replay 1000 captured requests at staging, 20 in parallel,</span>
<span class="c"># count response codes.</span>
<span class="nb">ls </span>captured/<span class="k">*</span>.json | <span class="nb">head</span> <span class="nt">-1000</span> |
  xargs <span class="nt">-I</span><span class="o">{}</span> <span class="nt">-P20</span> <span class="se">\</span>
    curl <span class="nt">-sS</span> <span class="nt">-o</span> /dev/null <span class="nt">-w</span> <span class="s1">'%{http_code}\n'</span> <span class="se">\</span>
         <span class="nt">-H</span> <span class="s1">'Content-Type: application/json'</span> <span class="se">\</span>
         <span class="nt">--data-binary</span> @<span class="o">{}</span> <span class="se">\</span>
         https://staging.api.example.com/v1/things |
  <span class="nb">sort</span> | <span class="nb">uniq</span> <span class="nt">-c</span> | <span class="nb">sort</span> <span class="nt">-rn</span>
</code></pre></div></div>

<p>I have used variants of that one-liner to:</p>

<ul>
  <li>Confirm a new validation schema rejects ~0.03% of historical
traffic, none of which integration tests had ever caught.</li>
  <li>Bisect which day of which release introduced a sudden 403 spike
(it was a Tuesday, the spike was one customer with leading
whitespace in their API key, and the customer’s lawyers were
involved).</li>
  <li>Smoke-test a new region in a CDN before swinging DNS, by
combining the above with <code class="language-plaintext highlighter-rouge">--resolve</code> (see next section).</li>
</ul>

<p>I’m not saying don’t use a real load tester for real load testing.
I’m saying: a surprising amount of “real” load testing is actually
this, with a fancier wrapper.</p>

<h2 id="tls-is-the-rest-of-curls-superpower">TLS is the rest of <code class="language-plaintext highlighter-rouge">curl</code>’s superpower</h2>

<p>There is a whole second <code class="language-plaintext highlighter-rouge">curl</code> hiding inside <code class="language-plaintext highlighter-rouge">curl</code>, and it’s about
TLS. A non-exhaustive tour:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># What certificate is the server actually serving (subject, issuer, expiry)?</span>
curl <span class="nt">-vI</span> https://api.example.com 2&gt;&amp;1 | <span class="nb">grep</span> <span class="nt">-E</span> <span class="s1">'^\*'</span>

<span class="c"># Cap the TLS version, to see if the new version is the</span>
<span class="c"># problem or the old version is.</span>
curl <span class="nt">--tls-max</span> 1.2 https://api.example.com

<span class="c"># Mutual TLS, in two flags.</span>
curl <span class="nt">--cert</span> client.pem <span class="nt">--key</span> client.key https://api.example.com/private

<span class="c"># Pin a hostname to a specific IP, bypassing DNS, GeoDNS, Anycast,</span>
<span class="c"># and whatever other routing voodoo is between you and the server.</span>
<span class="c"># This is the trick.</span>
curl <span class="nt">--resolve</span> api.example.com:443:203.0.113.42 https://api.example.com
</code></pre></div></div>

<p>That last one — <code class="language-plaintext highlighter-rouge">--resolve</code> — is the killer feature. The first
time you use it to hit one specific edge node in a CDN, in a
specific region, bypassing every routing layer between you and
that one box, you will wonder how you ever debugged a CDN issue
any other way. (Answer: very, very slowly, while arguing with
support.)</p>

<h2 id="the-real-argument">The real argument</h2>

<p>You could do all of the above in a GUI. You could also, in
principle, drive a screw with a butter knife. The question is
whether the tool fits the situation.</p>

<p>Debugging fits a specific situation. The situation is: tired, late,
on someone else’s machine, with no patience for friction. In that
situation, the tool that wins is the one that’s already installed,
scriptable, pipeable, honest about what’s happening at the byte
level, and old enough that the answer to any question you have is
on Stack Overflow.</p>

<p><code class="language-plaintext highlighter-rouge">curl</code> is all five. It has been all five since 1996. It will still
be all five in 2050. It works in a Docker <code class="language-plaintext highlighter-rouge">FROM scratch</code> image
(well, with the static build). It works over SSH on a box that
hasn’t been updated since 2014. It works on Mars.</p>

<p>Spend a weekend with <code class="language-plaintext highlighter-rouge">man curl</code>. Yes, the whole thing — it’s not
that long, and most of it is genuinely useful. Take notes. Build
yourself a cheat sheet, preferably on actual paper. The next time
something is on fire at 3am, you’ll save half an hour and possibly
your sanity.</p>

<p>Daniel Stenberg, if you’re reading this: thank you. Genuinely.
We all owe you several beers.</p>]]></content><author><name>fidelitas.io</name></author><category term="tooling" /><category term="http" /><category term="debugging" /><summary type="html"><![CDATA[Postman is great. Insomnia is great. But the moment something goes wrong in production, the tool I reach for first is the one that's been on every Unix box since 1996.]]></summary></entry><entry><title type="html">Performance engineering as a habit, not a project</title><link href="https://fidelitas.io/2026/05/performance-engineering-as-a-habit/" rel="alternate" type="text/html" title="Performance engineering as a habit, not a project" /><published>2026-05-16T18:00:00+02:00</published><updated>2026-05-16T18:00:00+02:00</updated><id>https://fidelitas.io/2026/05/performance-engineering-as-a-habit</id><content type="html" xml:base="https://fidelitas.io/2026/05/performance-engineering-as-a-habit/"><![CDATA[<p>Most performance work in the industry happens in one of two modes.</p>

<p>The first is heroics. Somebody — usually a tenured engineer with a
reputation for caring about this stuff — disappears for three weeks
with a stack of profilers, comes back with a deck full of flamegraphs,
deletes a <code class="language-plaintext highlighter-rouge">toString()</code> somewhere, drops the cloud bill by 18%, and is
celebrated at the next all-hands. Within four months the bill is back
where it was. The engineer is now annoyed. Nobody can pinpoint exactly
when the regression came back. It just sort of… did.</p>

<p>The second is reactive. A latency dashboard turns red. PagerDuty
goes off. A small task force is convened in a war room called
something like <code class="language-plaintext highlighter-rouge">#perf-tigers-q3</code>. The fires get put out. The task
force dissolves. Within four months the dashboard turns red again,
sometimes for the same reason.</p>

<p>Both modes work, sort of. They also share a fatal property: they
treat performance as a <em>project</em>. Projects end. The regression doesn’t
care that the project ended. It just patiently waits for the next
quarter and starts climbing again.</p>

<p>The teams I’ve seen ship genuinely fast systems do something else.
They don’t have a “performance initiative” with a project lead and an
OKR. They have a habit. The habit is boring. The boringness is the
point.</p>

<h2 id="the-habit-in-three-rules">The habit, in three rules</h2>

<p>If I had to compress eight years of doing this to a single page:</p>

<ol>
  <li><strong>Measure on every PR. No exceptions, no opt-outs, no “we’ll add
benchmarks once the feature stabilises.”</strong></li>
  <li><strong>Compare against yesterday, not against an SLO.</strong> SLOs catch
fires. Diffs catch the people lighting the matches.</li>
  <li><strong>Make regression a build failure</strong>, not a Slack ping somebody
gets around to reading on Friday.</li>
</ol>

<p>That’s it. That’s most of the trick. Everything below is plumbing,
diff-formatting, and politics. Particularly the politics.</p>

<h2 id="why-measuring-every-pr-matters-more-than-you-think">Why measuring every PR matters more than you think</h2>

<p>Almost no production performance regressions come from a single bad
commit. They come from twenty 0.6% regressions, none of which is big
enough to be worth arguing about in code review, stacked over a
quarter. By the time anyone notices, the original cause is buried
in a git log nobody is going to bisect.</p>

<p>The defence is a benchmark suite that runs on every PR. Same shape
as your test suite. Not a separate thing. Not a “performance team”
thing. Just part of CI.</p>

<p>It doesn’t have to be elaborate. It has to be:</p>

<ul>
  <li>Reliable. Hermetic runner, pinned hardware, no noisy neighbours.
If your CI runs on shared compute in a hyperscaler, do the perf
runs on a dedicated machine somewhere quiet. I’ve used a literal
Mac Mini under a desk for this. It worked great.</li>
  <li>Statistically honest. One run is not a benchmark, it’s a coin
flip. I aim for at least eight runs with <code class="language-plaintext highlighter-rouge">benchstat</code> (Go) or
<code class="language-plaintext highlighter-rouge">criterion</code> (Rust) or <code class="language-plaintext highlighter-rouge">pytest-benchmark</code> (Python) doing the
statistical heavy lifting.</li>
  <li>Boringly visible. The diff has to show up in the PR. Slack
doesn’t count. Email doesn’t count. Anything that requires a
human to go look at it will, over a long enough timeline, be
forgotten.</li>
</ul>

<p>For a Go service, the bones of it look like:</p>

<div class="language-go highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">package</span> <span class="n">bench</span>

<span class="k">import</span> <span class="p">(</span>
    <span class="s">"context"</span>
    <span class="s">"testing"</span>
<span class="p">)</span>

<span class="k">func</span> <span class="n">BenchmarkCreateOrder</span><span class="p">(</span><span class="n">b</span> <span class="o">*</span><span class="n">testing</span><span class="o">.</span><span class="n">B</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">srv</span> <span class="o">:=</span> <span class="n">newTestServer</span><span class="p">(</span><span class="n">b</span><span class="p">)</span>
    <span class="n">b</span><span class="o">.</span><span class="n">ReportAllocs</span><span class="p">()</span>
    <span class="n">b</span><span class="o">.</span><span class="n">ResetTimer</span><span class="p">()</span>

    <span class="k">for</span> <span class="n">i</span> <span class="o">:=</span> <span class="m">0</span><span class="p">;</span> <span class="n">i</span> <span class="o">&lt;</span> <span class="n">b</span><span class="o">.</span><span class="n">N</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span> <span class="p">{</span>
        <span class="k">if</span> <span class="n">_</span><span class="p">,</span> <span class="n">err</span> <span class="o">:=</span> <span class="n">srv</span><span class="o">.</span><span class="n">CreateOrder</span><span class="p">(</span><span class="n">context</span><span class="o">.</span><span class="n">Background</span><span class="p">(),</span> <span class="n">validOrder</span><span class="p">());</span> <span class="n">err</span> <span class="o">!=</span> <span class="no">nil</span> <span class="p">{</span>
            <span class="n">b</span><span class="o">.</span><span class="n">Fatal</span><span class="p">(</span><span class="n">err</span><span class="p">)</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Five lines, plus setup. Run it on every PR. Store the results.
That’s it.</p>

<h2 id="compare-against-yesterday-not-against-an-slo">Compare against yesterday, not against an SLO</h2>

<p>SLOs are great. SLOs catch fires, page on-call, force
prioritization, all the things they’re supposed to do.</p>

<p>SLOs are also completely useless at catching the kind of regression
I’m describing. A 4% bump in p99 latency does not violate any
reasonable SLO. Repeated on each of fifty deploys, it compounds to
roughly seven times your original tail latency. You will not catch
this with thresholds. You will always catch it with diffs.</p>
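<p>The compounding is easy to check. A small sketch, assuming each regression multiplies the one before it (the function name <code class="language-plaintext highlighter-rouge">compound</code> is just for illustration):</p>

```go
package main

import (
	"fmt"
	"math"
)

// compound returns the overall slowdown factor after `steps` regressions of
// `perStep` each (0.04 means +4%), assuming every step multiplies the last.
func compound(perStep float64, steps int) float64 {
	return math.Pow(1+perStep, float64(steps))
}

func main() {
	fmt.Printf("20 x +0.6%%: %.2fx\n", compound(0.006, 20))
	fmt.Printf("50 x +4%%:   %.2fx\n", compound(0.04, 50))
}
```

<p>Under that assumption, twenty 0.6% regressions come to about 1.13x and fifty 4% bumps come to about 7.1x. No single step in either sequence would trip an SLO.</p>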

<p>The single most useful artefact in a perf-focused team’s day is not
a Grafana dashboard. It’s a comment in a PR that looks like this:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>benchmark                  before     after      delta
---------------------------------------------------------
BenchmarkCreateOrder       412 µs     438 µs     +6.3%   *
BenchmarkListOrders         91 µs      93 µs     +2.2%
BenchmarkAuth               18 µs      19 µs     +5.6%   *
BenchmarkSerializeOrder    7.2 µs     7.3 µs     +1.4%

* indicates statistically significant (n=8, p&lt;0.01)
</code></pre></div></div>

<p>Posted by a bot. Before human review. Every PR.</p>
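<p>The delta column is nothing exotic. A rough sketch of how a bot might compute it, assuming it compares the median of the runs on each side; <code class="language-plaintext highlighter-rouge">median</code> and <code class="language-plaintext highlighter-rouge">deltaPercent</code> are hypothetical helpers, the timings are invented, and a tool like <code class="language-plaintext highlighter-rouge">benchstat</code> does a good deal more, including the significance test behind the asterisk.</p>

```go
package main

import (
	"fmt"
	"sort"
)

// median returns the middle value of xs (the mean of the two middle values
// for an even count). It copies xs so the caller's slice is left untouched.
func median(xs []float64) float64 {
	s := append([]float64(nil), xs...)
	sort.Float64s(s)
	n := len(s)
	if n == 0 {
		return 0
	}
	if n%2 == 1 {
		return s[n/2]
	}
	return (s[n/2-1] + s[n/2]) / 2
}

// deltaPercent compares the medians of two sets of runs. It assumes the
// "before" median is non-zero.
func deltaPercent(before, after []float64) float64 {
	b := median(before)
	return (median(after) - b) / b * 100
}

func main() {
	// Hypothetical timings in µs, eight runs each; not real measurements.
	before := []float64{410, 412, 411, 413, 412, 414, 412, 411}
	after := []float64{437, 438, 439, 438, 436, 438, 440, 438}
	fmt.Printf("delta: %+.1f%%\n", deltaPercent(before, after))
}
```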

<p>The point of this is not really the numbers. The point is <strong>social</strong>.
Once “+6.3%” is sitting in the review thread, the conversation
changes. The author justifies it (“we added a required signature
check, this is expected”), or they back it out, or they file a
follow-up. Either way: the regression is <em>seen</em>. Seen regressions
get fixed. Invisible ones don’t.</p>

<p>This is, deeply, a sociotechnical fix wearing a technical hat. The
tooling is the easy part.</p>

<h2 id="make-regression-a-build-failure">Make regression a build failure</h2>

<p>The first time you turn this on, CI will be on fire for a week.</p>

<p>This is good. You are paying down debt you’d otherwise pay in
production, in a smaller and more controlled way. The week sucks.
The quarters that follow are noticeably calmer.</p>

<p>A reasonable opening policy (yours should get stricter over time):</p>

<ul>
  <li>Block PRs with a &gt;10% regression on hot-path benchmarks.</li>
  <li>Warn but don’t block for 3–10%.</li>
  <li>Allow opt-out — with a written justification in the PR description.</li>
</ul>

<p>That last bit is the important one. “Migrating to a new crypto
library, +14% but mandated for FIPS, follow-up tracked at JIRA-1234”
is a perfectly fine justification. “Refactor, will fix later” is
exactly how regressions get accepted, and it’s the thing the policy
exists to stop. The justification doesn’t have to be approved by
anyone, it just has to <em>exist</em>. The act of writing it forces the
author to think about whether the regression is actually fine.</p>
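<p>As a sketch, the opening policy above fits in one function. This is an illustration rather than the contents of any real enforcement script: <code class="language-plaintext highlighter-rouge">classify</code> and the <code class="language-plaintext highlighter-rouge">Verdict</code> type are hypothetical, and treating a justified opt-out as a warning instead of a silent pass is an assumption.</p>

```go
package main

import "fmt"

// Verdict is what the CI job reports for one benchmark.
type Verdict string

const (
	Pass  Verdict = "pass"
	Warn  Verdict = "warn"
	Block Verdict = "block"
)

// classify applies the opening policy: block above 10%, warn from 3% to 10%,
// and let a written justification downgrade a block to a warning.
// delta is a fraction, so 0.063 means +6.3%.
func classify(delta float64, justified bool) Verdict {
	switch {
	case delta > 0.10 && justified:
		return Warn // opted out, but still visible in the PR (assumed behaviour)
	case delta > 0.10:
		return Block
	case delta >= 0.03:
		return Warn
	default:
		return Pass
	}
}

func main() {
	fmt.Println(classify(0.063, false)) // inside the 3-10% band
	fmt.Println(classify(0.14, false))  // over budget, no justification
	fmt.Println(classify(0.14, true))   // over budget, justified in the PR
}
```

<p>Tightening the policy later is then a matter of changing two constants, which is about as much ceremony as it deserves.</p>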

<p>A skeleton GitHub Actions job:</p>

<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">name</span><span class="pi">:</span> <span class="s">perf</span>
<span class="na">on</span><span class="pi">:</span> <span class="pi">[</span><span class="nv">pull_request</span><span class="pi">]</span>

<span class="na">jobs</span><span class="pi">:</span>
  <span class="na">bench</span><span class="pi">:</span>
    <span class="na">runs-on</span><span class="pi">:</span> <span class="s">self-hosted-perf</span>  <span class="c1"># pinned hardware, not a hyperscaler runner</span>
    <span class="na">steps</span><span class="pi">:</span>
      <span class="pi">-</span> <span class="na">uses</span><span class="pi">:</span> <span class="s">actions/checkout@v4</span>
        <span class="na">with</span><span class="pi">:</span> <span class="pi">{</span> <span class="nv">fetch-depth</span><span class="pi">:</span> <span class="nv">2</span> <span class="pi">}</span>

      <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Bench base</span>
        <span class="na">run</span><span class="pi">:</span> <span class="pi">|</span>
          <span class="s">git checkout ${{ github.event.pull_request.base.sha }}</span>
          <span class="s">go test -bench=. -benchmem -count=8 ./... | tee /tmp/base.txt</span>

      <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Bench head</span>
        <span class="na">run</span><span class="pi">:</span> <span class="pi">|</span>
          <span class="s">git checkout ${{ github.event.pull_request.head.sha }}</span>
          <span class="s">go test -bench=. -benchmem -count=8 ./... | tee /tmp/head.txt</span>

      <span class="pi">-</span> <span class="na">name</span><span class="pi">:</span> <span class="s">Compare and enforce</span>
        <span class="na">run</span><span class="pi">:</span> <span class="pi">|</span>
          <span class="s">benchstat /tmp/base.txt /tmp/head.txt | tee /tmp/diff.txt</span>
          <span class="s">./scripts/enforce-budget.py /tmp/diff.txt --max-regression 0.10</span>
</code></pre></div></div>
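
<p>The <code class="language-plaintext highlighter-rouge">enforce-budget.py</code> script is not shown in this post, so here is a rough sketch of what the core of such a script could look like. Everything in it is an assumption: the function names are invented, it pulls signed percentages out of the <code class="language-plaintext highlighter-rouge">benchstat</code> text with a regular expression rather than parsing the table properly, and it treats every metric as lower-is-better (true for time/op and allocations, wrong for throughput columns such as MB/s).</p>

```python
import re

# Signed percentages such as "+14.00%" or "-5.0%". The "± 2%" noise column
# has no leading sign, so it never matches.
DELTA = re.compile(r"(?<![\w.])([+-]\d+(?:\.\d+)?)%")

# Hand-written sample in the style of benchstat output, for illustration only.
SAMPLE = """\
name      old time/op  new time/op  delta
Parse-8   1.00ns ± 1%  1.14ns ± 2%  +14.00%  (p=0.000 n=8+8)
Encode-8  2.00ns ± 1%  1.90ns ± 1%   -5.00%  (p=0.001 n=8+8)
Hash-8    3.00ns ± 2%  3.01ns ± 2%     ~     (p=0.400 n=8+8)
"""


def worst_regression(diff_text):
    """Return the largest positive delta as a fraction (0.14 means +14%)."""
    worst = 0.0
    for match in DELTA.finditer(diff_text):
        worst = max(worst, float(match.group(1)) / 100.0)
    return worst


def enforce(diff_text, max_regression):
    """Return a process exit code: 0 within budget, 1 over budget."""
    worst = worst_regression(diff_text)
    if worst > max_regression:
        print(f"FAIL: worst regression {worst:.1%} exceeds budget {max_regression:.0%}")
        return 1
    print(f"OK: worst regression {worst:.1%} is within budget {max_regression:.0%}")
    return 0
```

<p>Rows that <code class="language-plaintext highlighter-rouge">benchstat</code> marks with <code class="language-plaintext highlighter-rouge">~</code> (no statistically significant change) carry no percentage, so they are ignored, which is the behaviour you want from a gate that should not fire on noise.</p>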

<p>You can buy fancier versions of this off the shelf. They are not, in
my experience, better. The simple version above has shipped to
production three times for me and worked all three times.</p>

<h2 id="the-part-thats-actually-hard">The part that’s actually hard</h2>

<p>Tools are 20% of this. The other 80% is the team agreeing, out loud,
that performance is <em>a feature</em>. Not a nice-to-have. Not a
“non-functional requirement”, which is a phrase invented by people
who didn’t want to do the work. A feature. Same priority as
“can the user log in”.</p>

<p>I will spare you the inspirational paragraph. Here are the small
rituals that, in practice, calcify the habit:</p>

<ul>
  <li>A 30-minute weekly perf review. Fixed dashboard. What got faster
this week, what got slower, what’s the spend trend. Cancel it
exactly zero times in the first three months even if there’s
“nothing to discuss”.</li>
  <li>Performance budgets <em>in design docs</em>. “This endpoint must serve
p99 &lt; 80ms at 1000 RPS on a standard pod” should be a sentence
in the doc before any code gets written. If it isn’t, the doc
isn’t done.</li>
  <li>Profile every release. Save the flamegraph. Diff against last
release. You’d be amazed what shows up.</li>
</ul>

<p>The last one is my favourite, and the one that gets dropped first.
Don’t drop it.</p>
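
<p>If “diff against last release” sounds like more tooling than you have, a crude version fits in a few lines. The sketch below assumes your profiler can emit collapsed (“folded”) stacks, one <code class="language-plaintext highlighter-rouge">frame;frame;frame count</code> entry per line, which is the input format flamegraph tooling commonly uses. The function names and the sample stacks are invented for illustration.</p>

```python
from collections import Counter


def parse_folded(text):
    """Parse collapsed stacks: one 'frame;frame;frame count' entry per line."""
    samples = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        stack, _, count = line.rpartition(" ")
        samples[stack] += int(count)
    return samples


def diff_profiles(previous, current, top=5):
    """Return the stacks whose share of total samples grew the most."""
    prev_total = sum(previous.values()) or 1
    curr_total = sum(current.values()) or 1
    growth = []
    for stack in set(previous) | set(current):
        before = previous[stack] / prev_total  # Counter returns 0 if missing
        after = current[stack] / curr_total
        growth.append((after - before, stack))
    growth.sort(reverse=True)
    return [(stack, round(delta, 4)) for delta, stack in growth[:top] if delta > 0]


last_release = parse_folded("main;parse 60\nmain;encode 40\n")
this_release = parse_folded(
    "main;parse 50\nmain;encode 30\nmain;encode;regexp.Compile 20\n"
)
print(diff_profiles(last_release, this_release))
```

<p>Comparing shares of the total rather than raw sample counts keeps the diff meaningful when the two profiles were captured for different durations.</p>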

<h2 id="when-you-know-it-worked">When you know it worked</h2>

<p>The signal that the habit has taken hold is that <strong>nobody mentions
performance anymore</strong>. Bench diffs are part of every review the way
green checkmarks are. Regressions get reverted the same day. The
CFO stops being surprised by the cloud bill, which means they stop
asking about it, which means you get to spend your one-on-ones on
something else.</p>

<p>Performance, done well, is invisible. Performance, done badly, is
a permanent low-grade crisis with a different name every quarter.
Pick one.</p>]]></content><author><name>fidelitas.io</name></author><category term="performance" /><category term="engineering-practice" /><category term="observability" /><summary type="html"><![CDATA[Most performance work is reactive: the bill goes up, the dashboard turns red, somebody opens an investigation ticket. Here's how to make it boring instead.]]></summary></entry><entry><title type="html">Your event-driven architecture is a distributed monolith — and that’s fine, if you measure it</title><link href="https://fidelitas.io/2026/05/event-driven-distributed-monolith/" rel="alternate" type="text/html" title="Your event-driven architecture is a distributed monolith — and that’s fine, if you measure it" /><published>2026-05-15T12:30:00+02:00</published><updated>2026-05-15T12:30:00+02:00</updated><id>https://fidelitas.io/2026/05/event-driven-distributed-monolith</id><content type="html" xml:base="https://fidelitas.io/2026/05/event-driven-distributed-monolith/"><![CDATA[<p>So you rewrote the monolith into events.</p>

<p>Every team now owns a service. Every service owns its own database. The
architecture diagram has 47 arrows on it and a legend that says
“asynchronous, fire-and-forget”. Standups are shorter. Deploys are
calmer. The principal engineer who pushed for this rewrite has, in fact,
been promoted.</p>

<p>And then on a Tuesday around 3pm somebody from billing pings you on
Slack and asks why the dashboard team’s deploy is causing refund events
to vanish.</p>

<p>Welcome to the distributed monolith. The good news is you’re not alone:
this is, in my completely unscientific estimate, what about three out of
every four “event-driven” architectures look like in practice. The bad
news is the people who built it almost never realise it until something
breaks loudly. And the worse news is that the usual fix — “let’s just
move some of these back to synchronous calls” — is, in my experience,
the wrong fix.</p>

<p>This post is about the <em>right</em> fix, which is mostly boring and mostly
about discipline.</p>

<h2 id="what-actually-happened">What actually happened</h2>

<p>The pitch for going event-driven was: services are decoupled, deploys
are independent, blast radius is small. That pitch is sometimes true.
It is true when the <em>contract</em> between services is explicit, versioned,
and validated. It is almost never the default state.</p>

<p>What actually happens, more often:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>order-service ──▶  orders.created  ─┐
                                    ├──▶ inventory-service
                                    ├──▶ shipping-service
                                    ├──▶ billing-service
                                    ├──▶ analytics
                                    └──▶ that one Lambda nobody owns
</code></pre></div></div>

<p>The bus is technically decoupling those services. The schema <em>of the
event</em> is not. Inventory, shipping, billing, analytics, and the
mystery Lambda all parse the same JSON blob, and they all parse it in
slightly different ways. Field gets renamed. Field gets repurposed.
Field gets a new value that the old consumers interpret as “unknown,
proceed with defaults”. Defaults are wrong. Refunds vanish.</p>

<p>You didn’t decouple anything. You hid the coupling behind a message
broker, where nobody can see it without <code class="language-plaintext highlighter-rouge">grep</code> and a war room.</p>

<h2 id="the-thing-about-message-brokers">The thing about message brokers</h2>

<p>A message broker is a transport. It is not a contract, it is not a
schema, it is not an interface. It is a pipe. If you stick a typed
RPC call into a pipe and call the result “decoupled”, you’ve changed
the shape of the coupling from “synchronous and observable” to
“asynchronous and silent”. The total amount of coupling didn’t go
down. It just got harder to find.</p>

<p>If you only take one thing from this post: the bus is not the contract.</p>

<h2 id="step-one-schemas-are-not-folklore">Step one: schemas are not folklore</h2>

<p>A schema is folklore if the only way to find out what an <code class="language-plaintext highlighter-rouge">orders.created</code>
event looks like in 2026 is to ask the engineer who originally wrote
the producer. A schema is not folklore if you can <code class="language-plaintext highlighter-rouge">pip install</code> it,
or at least open a JSON Schema file in a registry and <code class="language-plaintext highlighter-rouge">diff</code> two
versions.</p>

<p>The smallest possible registry that works is a directory of files in
a repo every service depends on:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">//</span><span class="w"> </span><span class="err">schemas/orders.created/v</span><span class="mi">2</span><span class="err">.json</span><span class="w">
</span><span class="p">{</span><span class="w">
  </span><span class="nl">"$schema"</span><span class="p">:</span><span class="w"> </span><span class="s2">"https://json-schema.org/draft/2020-12/schema"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"title"</span><span class="p">:</span><span class="w"> </span><span class="s2">"orders.created.v2"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"object"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"required"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"order_id"</span><span class="p">,</span><span class="w"> </span><span class="s2">"customer_id"</span><span class="p">,</span><span class="w"> </span><span class="s2">"created_at"</span><span class="p">,</span><span class="w"> </span><span class="s2">"total_cents"</span><span class="p">,</span><span class="w"> </span><span class="s2">"currency"</span><span class="p">],</span><span class="w">
  </span><span class="nl">"properties"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"order_id"</span><span class="p">:</span><span class="w">    </span><span class="p">{</span><span class="w"> </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"string"</span><span class="p">,</span><span class="w"> </span><span class="nl">"format"</span><span class="p">:</span><span class="w"> </span><span class="s2">"uuid"</span><span class="w"> </span><span class="p">},</span><span class="w">
    </span><span class="nl">"customer_id"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"string"</span><span class="p">,</span><span class="w"> </span><span class="nl">"format"</span><span class="p">:</span><span class="w"> </span><span class="s2">"uuid"</span><span class="w"> </span><span class="p">},</span><span class="w">
    </span><span class="nl">"created_at"</span><span class="p">:</span><span class="w">  </span><span class="p">{</span><span class="w"> </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"string"</span><span class="p">,</span><span class="w"> </span><span class="nl">"format"</span><span class="p">:</span><span class="w"> </span><span class="s2">"date-time"</span><span class="w"> </span><span class="p">},</span><span class="w">
    </span><span class="nl">"total_cents"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"integer"</span><span class="p">,</span><span class="w"> </span><span class="nl">"minimum"</span><span class="p">:</span><span class="w"> </span><span class="mi">0</span><span class="w"> </span><span class="p">},</span><span class="w">
    </span><span class="nl">"currency"</span><span class="p">:</span><span class="w">    </span><span class="p">{</span><span class="w"> </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"string"</span><span class="p">,</span><span class="w"> </span><span class="nl">"pattern"</span><span class="p">:</span><span class="w"> </span><span class="s2">"^[A-Z]{3}$"</span><span class="w"> </span><span class="p">}</span><span class="w">
  </span><span class="p">},</span><span class="w">
  </span><span class="nl">"additionalProperties"</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>Two important things in there.</p>

<p>The first is <code class="language-plaintext highlighter-rouge">additionalProperties: false</code>. Yes, really. I know this
is unfashionable. I know “be liberal in what you accept” is the
default advice. It is also the reason nobody can ever delete a field
from a payload, and the reason your bus carries 14kB of legacy junk
per event seven years after launch. Set it to false. Bump the version
when you need a new field. Pay the cost up front.</p>

<p>The second thing is that the schema lives in a repo, not in a wiki.
Wikis go stale. Repos have CI. You want a CI job that fails when a
producer ships a payload that doesn’t validate against the registered
schema, and you want that job to be louder than a green checkmark.</p>
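
<p>To make the CI check concrete, here is a deliberately tiny stand-in for a real JSON Schema validator, using only the standard library. It is a sketch under loud assumptions: it understands only <code class="language-plaintext highlighter-rouge">required</code>, <code class="language-plaintext highlighter-rouge">additionalProperties: false</code>, and the <code class="language-plaintext highlighter-rouge">string</code> and <code class="language-plaintext highlighter-rouge">integer</code> types, and it ignores <code class="language-plaintext highlighter-rouge">format</code>, <code class="language-plaintext highlighter-rouge">pattern</code> and <code class="language-plaintext highlighter-rouge">minimum</code> entirely. In a real pipeline you would reach for a proper validator library; the function name <code class="language-plaintext highlighter-rouge">violations</code> and the sample payload are made up.</p>

```python
import json

# Trimmed copy of the orders.created v2 schema (formats and patterns omitted).
SCHEMA = json.loads("""
{
  "type": "object",
  "required": ["order_id", "customer_id", "created_at", "total_cents", "currency"],
  "properties": {
    "order_id":    {"type": "string"},
    "customer_id": {"type": "string"},
    "created_at":  {"type": "string"},
    "total_cents": {"type": "integer"},
    "currency":    {"type": "string"}
  },
  "additionalProperties": false
}
""")

JSON_TYPES = {"string": str, "integer": int}


def violations(payload, schema):
    """List the ways `payload` breaks `schema`; an empty list means it passes."""
    problems = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in payload:
            problems.append(f"missing required field: {name}")
    if schema.get("additionalProperties") is False:
        for name in sorted(set(payload) - set(properties)):
            problems.append(f"unknown field: {name}")
    for name, rules in properties.items():
        if name not in payload:
            continue
        expected = JSON_TYPES.get(rules.get("type"))
        value = payload[name]
        # bool is a subclass of int in Python, but it is not a JSON integer.
        if expected and (not isinstance(value, expected) or isinstance(value, bool)):
            problems.append(f"wrong type for {name}: expected {rules['type']}")
    return problems


GOOD = {
    "order_id": "7d9f3c1e-0000-4000-8000-000000000001",
    "customer_id": "7d9f3c1e-0000-4000-8000-000000000002",
    "created_at": "2026-05-15T12:30:00Z",
    "total_cents": 1999,
    "currency": "EUR",
}
```

<p>A CI step would run this against sample payloads captured from the producer’s test suite and exit non-zero if the list is non-empty.</p>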

<p>This is unglamorous work. It is the single most valuable hour-per-week
you can spend on an event-driven system, and it almost never gets
prioritized until after the first big incident. Prioritize it before.</p>

<h2 id="step-two-subscriptions-are-not-folklore-either">Step two: subscriptions are not folklore either</h2>

<p>Question: which services consume <code class="language-plaintext highlighter-rouge">orders.created</code>?</p>

<p>If you can’t answer that without grep-ing across every repo in your
org, the answer is “you have no idea, and neither does anyone else”.
Which means you also have no idea what breaks when you change the
event.</p>

<p>The fix is some mechanism — any mechanism — that makes consumer
subscriptions explicit and discoverable. I have, at various
employers, shipped three versions of this:</p>

<ol>
  <li>A <code class="language-plaintext highlighter-rouge">subscriptions.yaml</code> at the root of every service repo. Cheap,
ugly, works. The downside is everyone forgets to update it.</li>
  <li>A decorator on the handler function that registers it in a tiny
internal library. Less forgetful, but only works inside one
language ecosystem.</li>
  <li>A handler-naming convention strict enough to grep for with
<code class="language-plaintext highlighter-rouge">ripgrep</code>. Honestly? My favourite of the three. The constraint
is the documentation.</li>
</ol>

<p>Whatever you pick, run a scheduled job that crawls the org, extracts
the declarations, and writes them into a graph. Render that graph
somewhere visible. I once had a printed version of this graph on the
wall next to my desk. People stopped asking me questions about
ownership. Highly recommend.</p>
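
<p>The “write them into a graph” step is less work than it sounds. Here is a minimal sketch that assumes the crawl has already produced a mapping of service name to consumed topics (however you extracted it), then inverts it and emits Graphviz DOT text. The <code class="language-plaintext highlighter-rouge">DECLARED</code> data, the <code class="language-plaintext highlighter-rouge">refunds.issued</code> topic and both function names are hypothetical.</p>

```python
from collections import defaultdict

# Hypothetical crawl result: service name -> topics it declares it consumes.
DECLARED = {
    "inventory-service": ["orders.created"],
    "shipping-service": ["orders.created"],
    "billing-service": ["orders.created", "refunds.issued"],
    "analytics": ["orders.created", "refunds.issued"],
}


def consumers_by_topic(declared):
    """Invert service -> topics into topic -> sorted list of consumers."""
    index = defaultdict(list)
    for service, topics in declared.items():
        for topic in topics:
            index[topic].append(service)
    return {topic: sorted(services) for topic, services in index.items()}


def to_dot(index):
    """Render the topic -> consumer edges as Graphviz DOT text."""
    lines = ["digraph subscriptions {"]
    for topic in sorted(index):
        for service in index[topic]:
            lines.append(f'  "{topic}" -> "{service}";')
    lines.append("}")
    return "\n".join(lines)


print(to_dot(consumers_by_topic(DECLARED)))
```

<p>The inverted index is the useful artifact: “which services consume this topic” becomes a dictionary lookup instead of an org-wide search.</p>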

<h2 id="step-three-trace-context-on-every-message-no-exceptions">Step three: trace context, on every message, no exceptions</h2>

<p>The thing that made the monolith debuggable was the call stack. The
thing that makes a distributed system debuggable is distributed
tracing. There is no third option. There is, in particular, no
“we’ll just check the logs in each service” option — anyone who has
tried to reconstruct a multi-service flow from interleaved Loki
queries at 2am can tell you exactly how that goes.</p>

<p>Every message has to carry trace context end-to-end: W3C Trace
Context, plus OpenTelemetry baggage if you use it, in whatever format your stack speaks natively.
The wrapper is about ten lines:</p>

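<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># producer side; `bus` stands for your broker client
</span><span class="kn">from</span> <span class="nn">opentelemetry</span> <span class="kn">import</span> <span class="n">propagate</span><span class="p">,</span> <span class="n">trace</span>

<span class="n">tracer</span> <span class="o">=</span> <span class="n">trace</span><span class="p">.</span><span class="n">get_tracer</span><span class="p">(</span><span class="n">__name__</span><span class="p">)</span>  <span class="c1"># used by the consumer side below</span>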

<span class="k">def</span> <span class="nf">publish</span><span class="p">(</span><span class="n">topic</span><span class="p">,</span> <span class="n">payload</span><span class="p">):</span>
    <span class="n">headers</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="n">propagate</span><span class="p">.</span><span class="n">inject</span><span class="p">(</span><span class="n">headers</span><span class="p">)</span>
    <span class="n">bus</span><span class="p">.</span><span class="n">publish</span><span class="p">(</span><span class="n">topic</span><span class="p">,</span> <span class="n">payload</span><span class="p">,</span> <span class="n">headers</span><span class="o">=</span><span class="n">headers</span><span class="p">)</span>

<span class="c1"># consumer side
</span><span class="k">def</span> <span class="nf">handle</span><span class="p">(</span><span class="n">msg</span><span class="p">):</span>
    <span class="n">ctx</span> <span class="o">=</span> <span class="n">propagate</span><span class="p">.</span><span class="n">extract</span><span class="p">(</span><span class="n">msg</span><span class="p">.</span><span class="n">headers</span><span class="p">)</span>
    <span class="k">with</span> <span class="n">tracer</span><span class="p">.</span><span class="n">start_as_current_span</span><span class="p">(</span><span class="sa">f</span><span class="s">"handle.</span><span class="si">{</span><span class="n">msg</span><span class="p">.</span><span class="n">topic</span><span class="si">}</span><span class="s">"</span><span class="p">,</span> <span class="n">context</span><span class="o">=</span><span class="n">ctx</span><span class="p">):</span>
        <span class="n">process</span><span class="p">(</span><span class="n">msg</span><span class="p">.</span><span class="n">payload</span><span class="p">)</span>
</code></pre></div></div>

<p>Once that’s deployed everywhere, the question “what happens when I
publish <code class="language-plaintext highlighter-rouge">orders.created</code>?” becomes a single query in your tracing
tool, and you can stop being the team’s human DAG diagram.</p>
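
<p>If you’re wondering what actually travels in those headers: with W3C Trace Context it is a single <code class="language-plaintext highlighter-rouge">traceparent</code> value of the form <code class="language-plaintext highlighter-rouge">version-traceid-parentid-flags</code>. The sketch below is illustration only, using the standard library; in practice the propagator does this for you, and the function names here are invented.</p>

```python
import re
import secrets

# Version 00: 32 hex chars of trace id, 16 of parent (span) id, 2 of flags.
TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled=True):
    """Start a fresh trace, as a producer with no incoming context would."""
    trace_id = secrets.token_hex(16)
    span_id = secrets.token_hex(8)
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


def child_traceparent(parent):
    """Keep the trace id, mint a new span id: what each hop does per message."""
    match = TRACEPARENT.match(parent)
    if not match or set(match.group(1)) == {"0"} or set(match.group(2)) == {"0"}:
        # Malformed or all-zero ids are invalid, so start a fresh trace instead.
        return new_traceparent()
    trace_id, _, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
outgoing = child_traceparent(incoming)
```

<p>The trace id staying constant across every hop is the whole trick: it is what lets the tracing backend stitch forty spans from a dozen services into one request.</p>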

<p>If you only have time to do one of the three things in this post,
do this one. Schemas help you ship safely. Subscriptions help you
plan. Trace context is what saves your weekend.</p>

<h2 id="the-fan-out-problem">The fan-out problem</h2>

<p>When you finally have visibility, you will notice a thing that may
make you uncomfortable: the average user request, traced
end-to-end, produces approximately 40 spans. Sometimes 80. Once,
on a particularly memorable Black Friday morning, I watched a
single “add to cart” produce 312.</p>

<p>This isn’t <em>inherently</em> a problem. Fan-out is fine. Fan-out is
sometimes the whole reason to use events. But unbounded, unwatched
fan-out is how your tail latency budget evaporates without anyone
making a decision about it.</p>

<p>A rough rubric I use, with no claim to it being scientific:</p>

<table>
  <thead>
    <tr>
      <th>fan-out per request</th>
      <th>what I do</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>under 10</td>
      <td>nothing, ship it</td>
    </tr>
    <tr>
      <td>10–30</td>
      <td>sample traces, watch the p99.9 tail</td>
    </tr>
    <tr>
      <td>30–80</td>
      <td>start asking who’s publishing inside a loop</td>
    </tr>
    <tr>
      <td>80+</td>
      <td>treat as an incident, even if nothing’s on fire yet</td>
    </tr>
  </tbody>
</table>
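
<p>The rubric is easy to automate once spans are exported somewhere you can query. A minimal sketch, assuming each exported span is a dict with a <code class="language-plaintext highlighter-rouge">trace_id</code> key (your backend’s field name may differ) and treating each band in the table as half-open, so exactly 30 lands in the 30–80 row:</p>

```python
from collections import Counter


def spans_per_trace(spans):
    """Count spans per trace id; `spans` is an iterable of dicts with "trace_id"."""
    return Counter(span["trace_id"] for span in spans)


def fanout_action(span_count):
    """Map a per-request span count onto the rubric above."""
    if span_count < 10:
        return "nothing, ship it"
    if span_count < 30:
        return "sample traces, watch the p99.9 tail"
    if span_count < 80:
        return "start asking who's publishing inside a loop"
    return "treat as an incident"


exported = [{"trace_id": "a"}] * 12 + [{"trace_id": "b"}] * 3
for trace_id, count in sorted(spans_per_trace(exported).items()):
    print(trace_id, count, fanout_action(count))
```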

<p>The biggest culprit, in basically every postmortem I’ve been part
of, is a service that publishes events <em>inside a loop iterating over
a collection</em>. The author insists this is fine, because “we’re just
emitting domain events”. You are not. You are emitting 4,000
domain events, each of which fans out to seven consumers, each of
which writes a row to a database. The math doesn’t care about your
intent.</p>

<p>If you find one of these in code review, kill it. There is almost
always a single “batch” event that captures the same intent. If
there genuinely isn’t, that’s a design smell worth pulling on for
half an hour.</p>
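
<p>Here is the arithmetic, runnable. The sketch uses an in-memory stand-in for the broker that simply counts deliveries; <code class="language-plaintext highlighter-rouge">CountingBus</code>, the topic names and both functions are invented for illustration, and the seven consumers come from the example above.</p>

```python
class CountingBus:
    """In-memory stand-in for a broker that counts downstream deliveries."""

    def __init__(self, consumers_per_topic):
        self.consumers_per_topic = consumers_per_topic
        self.published = 0

    def publish(self, topic, payload):
        self.published += 1

    @property
    def deliveries(self):
        # Every published event is handled once by every consumer.
        return self.published * self.consumers_per_topic


def reprice_in_loop(bus, item_ids):
    # The anti-pattern: one event per element of the collection.
    for item_id in item_ids:
        bus.publish("item.repriced", {"item_id": item_id})


def reprice_as_batch(bus, item_ids):
    # Same intent, one event carrying the whole collection.
    bus.publish("items.repriced", {"item_ids": list(item_ids)})


looped, batched = CountingBus(7), CountingBus(7)
reprice_in_loop(looped, range(4000))
reprice_as_batch(batched, range(4000))
print(looped.deliveries, batched.deliveries)
```

<p>The batch version still makes each consumer do work proportional to the collection, but it does that work in one handler invocation, with one trace and one chance to fail, instead of four thousand.</p>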

<h2 id="the-thing-nobody-tells-you">The thing nobody tells you</h2>

<p>Here’s the part the “events are decoupled!” pitch leaves out: an
event-driven system is <em>strictly more work to operate</em> than a
synchronous one. Not less. More.</p>

<p>It has more failure modes. It has worse default error messages.
It needs idempotent consumers. It needs replay tooling. It needs
schema discipline, subscription discipline, fan-out discipline,
trace context discipline, dead-letter discipline. Most of those
words are unglamorous and none of them ship features.</p>
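
<p>Of that list, “idempotent consumers” is the one people most often nod at and then skip, so here is the minimal shape. It is a sketch with assumptions stated up front: events are assumed to carry a unique <code class="language-plaintext highlighter-rouge">event_id</code> field, and the in-memory set stands in for what would need to be a durable store updated atomically with the handler’s side effects.</p>

```python
class IdempotentConsumer:
    """Wrap a handler so that redelivered events are processed only once."""

    def __init__(self, handler):
        self.handler = handler
        self.seen = set()  # stand-in for a durable store of processed ids

    def handle(self, event):
        """Return True if the event was processed, False if it was a duplicate."""
        event_id = event["event_id"]  # assumed to be unique per event
        if event_id in self.seen:
            return False
        self.handler(event)
        # Record the id only after the handler succeeds, so a failed attempt
        # can be retried on redelivery.
        self.seen.add(event_id)
        return True


refunds = []
consumer = IdempotentConsumer(lambda event: refunds.append(event["amount_cents"]))
first = consumer.handle({"event_id": "evt-1", "amount_cents": 500})
again = consumer.handle({"event_id": "evt-1", "amount_cents": 500})  # redelivery
```

<p>This is also what makes replay tooling safe to use: replaying a day of traffic through a consumer like this repeats none of the side effects it has already applied.</p>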

<p>In exchange, you get a system where teams can deploy independently
without coordinating, where you can replay a day of traffic to
recover from a bug, and where the blast radius of any one component
failing is contained. Those are real wins. They are not free.</p>

<p>The mistake is going event-driven and then pretending you’re not
running a distributed system. You are. Pay the operational tax,
keep paying it, and the architecture will keep paying you back.
Skip it and, well. See you in the war room.</p>]]></content><author><name>fidelitas.io</name></author><category term="distributed-systems" /><category term="observability" /><category term="architecture" /><summary type="html"><![CDATA[Async messaging didn't decouple your services; it hid the coupling. Here's how to find the seams again with traces, schemas, and a healthy fear of fan-out.]]></summary></entry><entry><title type="html">Welcome to fidelitas.io — a manifesto, of sorts</title><link href="https://fidelitas.io/2026/05/welcome-to-fidelitas/" rel="alternate" type="text/html" title="Welcome to fidelitas.io — a manifesto, of sorts" /><published>2026-05-12T09:00:00+02:00</published><updated>2026-05-12T09:00:00+02:00</updated><id>https://fidelitas.io/2026/05/welcome-to-fidelitas</id><content type="html" xml:base="https://fidelitas.io/2026/05/welcome-to-fidelitas/"><![CDATA[<p>OK so. I have, against my better judgment, started another technology blog.</p>

<p>There are already approximately ten million of them. Most are dead. The
ones that aren’t dead are mostly either thinly-veiled marketing for whatever
consultancy the author runs, or LinkedIn-grade “I’ve been thinking about
leadership” takes from people whose actual job is to ship code. I read maybe
four programming blogs a week and find one of them useful per quarter.</p>

<p>So why this one. Honestly, mostly for me. The job I do for a living involves
writing a lot of internal documents that nobody outside the company will ever
read — incident reviews, design docs, that one wiki page about the queue that
keeps coming up in interviews — and at some point I noticed that the
documents I’d put real effort into were the ones I’d want to read myself if
I worked somewhere else. So I figured: why not just write them somewhere
else.</p>

<p>That’s the whole pitch, really. There’s no business model. There is, as of
this evening, no audience either.</p>

<h2 id="what-this-is-going-to-look-like">What this is going to look like</h2>

<p>Roughly four shapes of post, in descending order of how likely I am to
finish them:</p>

<p><strong>Field notes.</strong> Short. Something broke in production this week, here is
the dumb reason, here is the slightly less dumb fix. The kind of post you
wish you’d read on Monday and didn’t see until Friday’s incident review.</p>

<p><strong>Tooling.</strong> Opinionated reviews of stuff on my <code class="language-plaintext highlighter-rouge">$PATH</code>. Sometimes these
will be about a CLI that’s older than the internet. Sometimes they’ll be
about whatever is currently being hyped on Hacker News. No affiliate
links — I make zero money from this and would prefer to keep it that way.</p>

<p><strong>Deep dives.</strong> Longer, slower, occasionally pedantic. When something is
worth explaining properly, I’d rather explain it properly than turn it
into a Twitter thread.</p>

<p><strong>Essays.</strong> Once in a while. Nothing on the calendar. These tend to write
themselves at 11pm after a hard week.</p>

<h2 id="what-this-is-not-going-to-look-like">What this is <em>not</em> going to look like</h2>

<p>A few promises, mostly to keep me honest later:</p>

<ul>
  <li>No tracking. No analytics, no fingerprinting, no Google Tag Manager
hiding inside a “performance monitoring” library. The simplest way for
me to find out if you’re reading is to ask, and I don’t intend to ask.</li>
  <li>No newsletter. I don’t want your email and I really don’t want a
Substack revenue line item. Subscribe via RSS or don’t.</li>
  <li>No popups, no cookie banners, no GDPR consent modal. The site has
nothing to consent <em>to</em>. (If you live in a jurisdiction where this
paragraph is, technically, the cookie banner, then welcome.)</li>
  <li>No AI-generated filler. I use tooling to spellcheck and occasionally
argue with me about a paragraph. I do not ship words I didn’t write
and stand behind.</li>
  <li>No drive-by hot takes. If a post is going to be strong, it’d better
earn it.</li>
</ul>

<p>That last one is going to be the hardest. I have a lot of hot takes.
Roughly 60% of them, when I sit down and try to write them out, dissolve
into “actually it depends.” This is, on reflection, mostly a good thing.</p>

<h2 id="whats-already-in-the-queue">What’s already in the queue</h2>

<p>I have three pieces drafted, in varying states of readiness:</p>

<ol>
  <li>Why your event-driven architecture quietly became a distributed monolith,
and the small handful of things that bring it back from the dead.</li>
  <li>Performance engineering as a habit, not a project. (This one’s been in
the drafts folder for over a year.)</li>
  <li>A love letter to <code class="language-plaintext highlighter-rouge">curl</code>. Long overdue.</li>
</ol>

<p>And one I haven’t started but keep thinking about: Rust vs. Go, but
calmer. The conversation about those two has gotten so dumb online that
I’d like to attempt a sober version. We’ll see if I have the energy.</p>

<h2 id="housekeeping">Housekeeping</h2>

<p>If you want to follow along, the <a href="/feed.xml">RSS feed</a>
is the only mechanism. There’s no comments section, which means there’s no
place to argue with me directly. That’s deliberate. Disagreement is fine,
even welcome, but it should live on your own blog where you have to take
responsibility for it. (See also: the entire history of Twitter.)</p>

<p>Right. That’s enough preamble. Let’s write something.</p>]]></content><author><name>fidelitas.io</name></author><category term="meta" /><category term="writing" /><category term="software-craft" /><summary type="html"><![CDATA[Why I'm starting another technology blog in 2026, and what I'm hoping to do differently this time.]]></summary></entry></feed>