technical reading

Reading an RFC With AI: Protocol Intent vs Implementation Detail

Scholia · · 11 min read
Engineer's dual-monitor setup with IETF RFC PDF on one screen and terminal reference implementation on the other

Key Anchors


The first thing most engineers notice when they open an RFC is the header block: a dense rectangle of metadata — document number, category, date, obsoletes, updates — sitting above the Abstract like a bureaucratic toll booth. RFC 9110, the 2022 revision of HTTP semantics, opens with exactly this: "This document obsoletes RFC 7230, RFC 7231, RFC 7232, RFC 7233, RFC 7235, and RFC 7538." Six documents, one sentence. Reading an RFC with AI changes nothing about that first encounter — the header is still there, the obsolescence chain is still opaque — but it changes what you do next. The question is whether you skip past the header toward the protocol body, or whether you stop and ask what it means that six documents collapsed into one. That choice determines whether you read the RFC or merely consult it.


The Header Is a Compression History

RFC 9110's obsolescence list is not administrative housekeeping. It is the document's first argument. Each of the six RFCs it replaces was itself a deliberate factoring of HTTP/1.1 — RFC 7230 handled message syntax and routing, RFC 7231 handled semantics and content, RFC 7232 handled conditional requests. The 2014 split was a design decision: separate the wire format from the application semantics so they could evolve independently. The 2022 recombination is a different design decision: the separation had created more cross-reference friction than it resolved.

A reader who skips the header misses this. They arrive at the normative text without knowing that the document is itself a position in an ongoing argument about how to factor a protocol specification. Every "see Section 7.4" cross-reference they encounter later is a residue of that factoring history. The header is the author's compressed answer to the question: what went wrong with the last version of this document?

The same logic applies to the Status of This Memo field. "Standards Track" means something precise: this document is on the path toward becoming an Internet Standard, which requires demonstrated interoperability across independent implementations. "Informational" means the IETF is publishing it but not endorsing it as a requirement. "Experimental" means the protocol is being tried in the field before the design is frozen. These are not bureaucratic categories — they are the document's epistemic status, and they change how every normative statement in the body should be read. An Experimental RFC's "MUST" is a hypothesis. A Standards Track RFC's "MUST" is a constraint on every implementation that claims conformance.

The Abstract Is a Thesis, Not a Summary

The Abstract of RFC 9110 reads: "The Hypertext Transfer Protocol (HTTP) is a stateless application-level protocol for distributed, collaborative, hypermedia information systems." That sentence has been in HTTP specifications since RFC 2616. Most engineers read past it. It is worth stopping on "stateless."

Stateless (sans état) here is a precise architectural constraint, not a description of what HTTP implementations do in practice. It means that each request must contain all the information necessary to understand it — the server is not required to retain any session context between requests. This is the constraint that makes HTTP horizontally scalable: any server in a pool can handle any request because no server holds privileged state. The entire edifice of cookies, sessions, and JWT tokens is the application layer's answer to the problem that statelessness creates. Reading the Abstract as a thesis rather than a summary means asking: what does this constraint rule out, and what problems does ruling it out solve?

The Abstract also names the document's scope: "HTTP semantics." Not syntax, not transport, not caching policy in full. The scope declaration is the author's way of telling you which questions this document will answer and which it will redirect. When RFC 9110 says "the specific mechanism by which this is accomplished is outside the scope of this document," that is not evasion — it is the document honoring its own scope declaration.

RFC 2119 and the Grammar of Obligation

The Requirements Language section — typically a single paragraph pointing to RFC 2119 — is the most underread load-bearing section in any IETF document. RFC 2119 defines a small vocabulary: MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, OPTIONAL. Each word carries a precise meaning:

"MUST: This word, or the terms 'REQUIRED' or 'SHALL', mean that the definition is an absolute requirement of the specification." (Bradner, RFC 2119, §1)

The distinction between MUST and SHOULD is the distinction between a conformance requirement and a design recommendation. A MUST violation means your implementation is non-conforming — it cannot claim to implement the protocol. A SHOULD violation means your implementation has made a deliberate trade-off that the specification authors considered but did not mandate against. The commentary that follows a SHOULD in a well-written RFC is the author's compressed reasoning about when that trade-off is acceptable.

Engineers reading an RFC for implementation guidance often treat SHOULD as a softer MUST — something to do unless it's inconvenient. This is the most common misreading in protocol engineering. The SHOULD keyword is not a weakened obligation; it is an invitation to read the surrounding prose, which will tell you the specific conditions under which the recommendation can be set aside. Skipping that prose and treating SHOULD as "try to do this" is how interoperability bugs get introduced.

A close-up of a printed RFC page with the words MUST, SHOULD, and MAY highlighted in three different colors of ink, surrounded by handwritten margin annotations on yellowed paper

The practical move when reading any RFC is to make two passes through the normative sections: one pass collecting every MUST and MUST NOT (these are your conformance checklist), and a second pass collecting every SHOULD and SHOULD NOT with its surrounding prose (these are your design decision log). The second pass is where the protocol's intent lives. The first pass is where the implementation boundary lives. Conflating them produces implementations that pass conformance tests and fail in production.

The Protocol Body: Intent vs. Mechanism

RFC 9110's Section 9 covers methods — GET, POST, PUT, DELETE, and the rest. The section opens with a property that most engineers know by name but rarely read in its specified form: safety and idempotency. A method is safe if it is defined to have no observable side effects on the server. A method is idempotent if multiple identical requests have the same effect as a single request.

"Request methods are considered 'safe' if their defined semantics are essentially read-only... The purpose of distinguishing between safe and unsafe methods is to allow automated retrieval processes (spiders) and cache performance optimization." (Fielding et al., RFC 9110, §9.2.1)

The parenthetical "(spiders)" is doing significant work. The safety property is not defined for the benefit of the application developer — it is defined for the benefit of intermediaries: caches, proxies, crawlers, and link prefetchers. A GET request is safe not because it cannot modify server state (it can, as a side effect) but because the specification says intermediaries are entitled to assume it won't. This is the difference between protocol intent and implementation detail. The intent is to give intermediaries a signal they can act on. The implementation detail is what your server actually does when it receives a GET.

The idempotency property carries the same structure. PUT is idempotent because the specification says a conforming server must treat multiple identical PUT requests as equivalent to one. This is a constraint on the server's behavior, not a description of what most servers do. A server that processes the second PUT differently from the first is non-conforming, regardless of whether any client ever sends two identical PUTs. The specification is defining a contract, not describing an observation.

Reading the protocol body with this lens — every normative statement is a contract clause, not a description — changes what you look for. You are not asking "what does this mean my code should do?" You are asking "what does this mean any conforming implementation must guarantee, and what does that guarantee enable at the system level?"

Security Considerations as Compressed Threat Model

Most engineers read the Security Considerations section last, if at all. This is the wrong order. In a well-written RFC, the Security Considerations section is the author's compressed threat model — the list of attacks the protocol was designed to resist, the attacks it was not designed to resist, and the attacks it inadvertently enables. Reading it first gives you the adversarial frame that makes the normative constraints in the protocol body legible.

RFC 9110's Security Considerations runs to several thousand words. It covers request smuggling, response splitting, message integrity, sensitive information in URIs, and the privacy implications of the Referer header. Each of these is a place where the protocol's design choices created an attack surface. The section on request smuggling, for instance, explains why the Content-Length and Transfer-Encoding headers have strict precedence rules in Section 6.3 — rules that look like arbitrary tie-breaking until you understand that ambiguity in message framing is the root cause of a class of cache-poisoning and firewall-bypass attacks.

The Security Considerations section is also where the authors are most candid about what the protocol does not solve. RFC 9110 is explicit that HTTP provides no end-to-end integrity or confidentiality — those are delegated to TLS. This is not a gap in the specification; it is a deliberate layering decision. But a reader who has not read the Security Considerations section will not know which security properties they are getting from HTTP and which they need to supply from the transport layer.

Reading an RFC With AI: What the Co-Reader Is Actually For

The question engineers most often bring to an RFC is "what does this mean for my implementation?" That is a good question, but it is the second question. The first question is "what problem was this constraint designed to solve?" A summarize-first AI tool optimised for the first question will give you a paraphrase of the normative text — which you already have. It will not give you the threat model that motivated the constraint, the factoring history that explains why the constraint lives in this section rather than another, or the interoperability failure that the SHOULD's surrounding prose is trying to prevent.

The cognitive-science literature on the fluency illusion documents a specific failure mode: smooth, confident summaries feel like comprehension. A reader who receives a clean paraphrase of RFC 9110 Section 9.2.1 may feel they understand safe methods — and may be wrong in exactly the ways that matter for production systems. Scholia's approach is the opposite: the LAND-before-LIFT move echoes the exact normative keyword the engineer tripped on, then pivots to the protocol decision it encodes, with the full document loaded so the Security Considerations section and the Requirements Language section are both in context when the method semantics are being read. Reading an RFC with AI is most useful when the AI refuses to flatten the document into a summary and instead walks the reader through the generative logic — the question the constraint was designed to answer.

If you are stuck on a specific normative clause — a SHOULD whose surrounding prose you cannot parse, a cross-reference that leads somewhere unexpected, a Security Considerations paragraph that seems to contradict the protocol body — upload the RFC to scholiaai.com and work through it with the full document in context.


Frequently Asked Questions

What is the best approach for reading an RFC with AI?

Treat the AI as a co-reader, not a summarizer. The most productive move is to ask about the intent behind a specific normative keyword — why does this constraint exist, what failure mode does it prevent — rather than asking for a paraphrase of the section. The RFC is already on your screen; what you need is the generative logic behind the constraint, not a smoother version of the words.

What does RFC 2119 mean for protocol spec reading?

RFC 2119 defines a small vocabulary with precise meanings. MUST is an absolute conformance requirement — violating it means your implementation cannot claim to implement the protocol. SHOULD is a design recommendation with documented exceptions; the surrounding prose tells you when the exception is acceptable. Treating SHOULD as a softer MUST is the most common misreading in protocol engineering and a reliable source of interoperability bugs.

How should engineers read the Security Considerations section of an IETF RFC?

Read it before the protocol body, not after. In a well-written RFC, the Security Considerations section is the author's compressed threat model — it names the attacks the protocol resists, the attacks it delegates to other layers, and the attack surfaces the design inadvertently creates. Reading it first gives you the adversarial frame that makes the normative constraints in the protocol body legible.

What is the difference between protocol intent and implementation detail in an RFC?

Protocol intent is the system-level guarantee a normative constraint enables — what intermediaries, clients, and servers are entitled to assume about conforming implementations. Implementation detail is how a specific stack achieves that conformance. RFCs specify the former and deliberately leave the latter open. Conflating them produces implementations that pass conformance tests and fail in production.

Why does the RFC header matter for understanding the document?

The obsolescence list encodes the document's factoring history — which design decisions were reversed and why. The Status of This Memo field sets the epistemic weight of every normative statement in the body. A Standards Track RFC's MUST is a constraint on every conforming implementation; an Experimental RFC's MUST is a hypothesis under test. Skipping the header means reading the normative text without knowing what kind of claim it is making.


Stuck on the passage?

Scholia walks one passage at a time with the full-book context of the edition you uploaded. Open the PDF or EPUB you're reading at scholiaai.com and we'll land on the exact line you tripped on — then lift to mechanism.

The AI Co-Reader for Philosophy

Scholia loads your full edition first, then walks one passage at a time.

It's the structural opposite of a summariser — LAND before LIFT, with the whole book in view. Not a database, not a translation, not a chat-with-PDF that forgot the argument by page 40.

Keep reading

Closer to this passage