Text protocols — the ones our human eyes can read and understand — arose out of necessity. Now that much of our communication involves machines talking to each other, they might seem like a relic of a bygone era. Couldn’t we do away with them entirely, and let the machines speak to each other more efficiently, without slowing things down for our human brains?
Today I’m going to make the case that neither humans nor machines are ready for a world without text protocols. My argument is based on the essential qualities the IETF prefers — that protocols should be built on a mixture of pragmatism, reductionism, and simplicity.
The more complex the protocol, the less success it tends to have. Simpler protocols just work better, and text is simple. But that’s not to say there haven’t been and won’t be some stunningly successful non-text binary protocols, which we depend on daily.
Why text protocols aren’t universally loved
Not everyone is going to agree with me on this. The critics of text protocols have some decent points. Most obviously, text protocols can be hugely inefficient in terms of overhead. Have you looked at XML? The amount of textual overhead needed to specify state in XML is fierce; it can even require a complete precompiled schema, and sometimes all the XML has to be read before processing can even begin (how can you tell it is well-formed XML if you haven’t seen the closing </end> marker?).
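To make that overhead concrete, here is a rough sketch in Python comparing the same small record — a hypothetical, made-up example of a 32-bit ID and a 16-bit port number — packed as binary versus spelled out as XML:

```python
import struct

# Hypothetical record: a 32-bit id and a 16-bit port, as a binary
# protocol might carry it ("!" = network byte order)
binary = struct.pack("!IH", 1234, 8080)  # 6 bytes on the wire

# The same information as a (minimal, schema-free) XML fragment
xml = "<record><id>1234</id><port>8080</port></record>"

print(len(binary), len(xml))  # 6 bytes vs 47 bytes
```

The field names and structure here are invented for illustration; real protocols differ, but the ratio is typical — the markup dwarfs the payload.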
For complex binary objects like movies, or music being streamed, or encrypted file state (think file sharing protocols), it’s usually not sensible to re-encode the data into text form.
When attaching these things to an email you have no choice but to use a text protocol, and the explosion in file size is obvious. We now have to consider pre-compressing data (ZIP) just to attach a PDF to a mail message, and only short movie clips can be sent, mainly because the mail provider doesn’t want to store that much data, but also because it’s simply too much to push through email.
We can encode it in text to send, but should we?
Sometimes, we do. The Unix-to-Unix Copy (UUCP) system, used to move data between machines, included UUENCODE to ensure binary data could be safely sent over text-only channels. And USENET (the old distributed news/chat framework, somewhat like what Reddit or Twitter is now) used similar methods to send movies as ‘chunks’ of data.
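The idea survives in most language runtimes today. As a sketch, Python’s standard binascii module can still produce a UUENCODE-style line from arbitrary bytes:

```python
import binascii

# UUENCODE consumes up to 45 raw bytes per output line
chunk = bytes(range(45))
line = binascii.b2a_uu(chunk)

# 45 binary bytes become 60 printable characters, plus a leading
# length character ('M' for 45) and a trailing newline: every byte
# in the result is safe to send over a text-only channel
print(line)
```

Each 3 raw bytes become 4 printable characters, which is where the roughly one-third size penalty of text-armoured binary comes from.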
So yes, we can do this. But, in general, it’s worth avoiding. Some protocols are designed from day one for machine-to-machine use. The Network File System (NFS), for instance, was designed in the mid-1980s from the ground up as a binary protocol, using binary-encoded data in the ‘Sun RPC’ format. Implemented as a text-encoded protocol, NFS would have been extremely slow.
So there are, indeed, plenty of reasons to think text protocols can slow things down. But they’re still really handy.
Text protocols have a history of pragmatism
Text protocols are a product of how the Internet Protocol suite emerged. They came out of simple forms of written communication between humans. But I should be clear here; at no time did the Internet Protocol suite mean ‘only text’. There has always been a mix of text and binary being exchanged over IP packets. The ‘protocols’ here are the applications, the things that sit ‘on top’ of the TCP and UDP formats inside IP packets. And, in many ways, these things have continued to be specified (when appropriate) as text.
That said, let’s think back to the late 1960s and early 1970s. This was during a time when bandwidth was low, and a lot of computing power was needed to recover from data loss, congestion and jitter. Protocols had to be simple to function.
Most forms of data transmission at the time supported visible, ASCII-encoded text, a lineage that ran from the days of Morse code through to punched tape. The Telex machine was a familiar device, found in any bank, travel agent, or government department, and it was confined to the International Telegraph Alphabet, which is basically the alphabet, plus numbers and some minor punctuation marks.
Given that you didn’t know how reliable the transmission would be, and you wanted to be able to use it in as many circumstances as possible, text-encoded messages were the obvious answer. Pragmatically, the messages you were sending were things that humans wanted to say to each other anyway, as Machine-to-Machine networking was still in its infancy and not for general use (although IBM was designing its own network architecture, Systems Network Architecture (SNA), which would give it significant advantage in wide-scale computer-network deployments).
Independent network researchers at the time tended to use text protocols. Anyone can read them, and anyone can write them from a specification, itself written in text, in a rigidly policed simple format called the ‘Request for Comments’ (RFC).
But these first ‘bootstrap’ simple protocols, which defined the early stages of the Internet, were not solely ‘text’ protocols. The packets themselves were always in binary. Messages in BGP are in binary. Messages in DNS are in binary. Messages at lower levels have to be sent and received in binary all of the time.
So why am I talking about them as if they were text protocols?
The command and response messages defined in them often were defined in terms of text. The basic point of investing in a computer network was to be able to ‘type’ into another computer, across a phone network. This is what defined remote access. You might be sending punched-card or paper-tape data (if you wanted it faster than typing), but the interface was typically a teletype, and the premise of the protocols above the network layer of the ARPANET (which became the Internet) was that we could do the same communication over TELNET.
So what’s TELNET?
“The TELNET protocol is based upon the notion of a virtual teletype, employing a 7-bit ASCII character set. The primary function of a User TELNET, then, is to provide the means by which its users can ‘hit’ all the keys on that virtual teletype” (from RFC 206, 1971).
This was how text could be used in other protocols.
For example, many of us who debugged Simple Mail Transport Protocol (SMTP) messages did this by using the TELNET command, to basically ‘speak’ via mail. Like this:
(The colour coding of the original transcript is lost here: lines beginning with a three-digit status code are responses from the mail server, the unnumbered command lines are my input, and the first few lines are locally generated output from my computer.)
$ telnet mail.host 25
Connected to mail.host.
Escape character is '^]'.
220 mail.host ESMTP Sendmail 8.15.2/8.15.2; Tue, 27 Jul 2021 04:02:47 GMT
HELO example.com
250 mail.host Hello localhost [127.0.0.1], pleased to meet you
MAIL FROM: firstname.lastname@example.org
250 2.1.0 email@example.com... Sender ok
RCPT TO: firstname.lastname@example.org
250 2.1.5 email@example.com... Recipient ok
DATA
354 Enter mail, end with "." on a line by itself
Date: Tue, 27 Jul 2021 09:29:00 +1000 (AEST)
Subject: hello
From: firstname.lastname@example.org
To: email@example.com

Hello
.
250 2.0.0 16R42lVS058551 Message accepted for delivery
QUIT
221 2.0.0 mail.host closing connection
Connection closed by foreign host.
That’s it. That’s an entire sequence of client-to-server exchange over the SMTP protocol.
This was me typing mostly in words that English speakers can understand.
The ‘client’ side was typed by me using the TELNET protocol, but over the special TCP port number 25, which connects to the mail service to speak SMTP. And, you can do this for other protocols: the original Hyper-Text Transport Protocol (HTTP) over port 80, is similarly defined by query and response codes that are textual, and a web server fetch is as simple as
GET /path/to/data typed into a TELNET connection to a webserver on port 80.
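As a sketch, the same request can be built byte-for-byte in code (the host name and path here are placeholder values); note that the entire message is plain 7-bit ASCII, exactly what you would type into a TELNET session:

```python
# A minimal HTTP/1.0 request, built exactly as you would type it
# into TELNET. 'example.com' and '/' are placeholder values.
host = "example.com"
request = (
    "GET / HTTP/1.0\r\n"
    f"Host: {host}\r\n"
    "\r\n"                # blank line ends the request headers
).encode("ascii")         # the whole request is 7-bit ASCII

print(request)
# Sending these bytes over a TCP connection to port 80 returns the
# page, which is why typing them into a TELNET session works too.
```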
Of course, if you want to do more complex data exchange over HTTP you very rapidly reach the limits of what you want to type by hand, but it is still technically possible, in many cases, to implement Internet protocols simply by typing them.
You could not do this with RIP (the simple routing protocol) or with BGP. You can’t do this with TCP itself. But for applications ‘on top’ of the infrastructure, you often can.
By typing out Internet protocols, it was possible to define a robust, workable set of protocols for specific applications, on top of the basic ARPA/Internet protocols. All you had to do was define your application queries, command sequences, and responses using text. Admittedly, it was inefficient, inasmuch as your commands, and even your data, might have to be encoded as seven-bit printable ASCII characters; this isn’t always a very efficient use of a data link. The overhead of encoding binary data into ASCII is approximately one-third (in both the UUENCODE and Base64 schemes).
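That one-third figure is easy to verify; a quick sketch with Python’s standard base64 module on an arbitrary (hypothetical) binary payload:

```python
import base64

raw = bytes(3000)                # hypothetical 3000-byte binary payload
encoded = base64.b64encode(raw)  # every 3 raw bytes -> 4 ASCII characters

print(len(encoded) / len(raw))   # 4/3: roughly one-third larger on the wire
```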
Text protocols are nice, simple and reductionist
One of the hallmarks of the IETF protocol suite has been a strong drive to take pragmatism and simplicity to the logical (reductionist) endpoint; you could add that feature, but how about you don’t?
In SMTP, the mail protocol, the ‘S’ part stands for ‘simple’. There were jokes made at the time: “If it’s that simple, how come the protocol takes half an inch of line-printer paper to write out?”
Still, SMTP was simple. It was for sending mail, and only that. Not for reading it, or for storing it, or for accessing a filter list of rules about it, or for checking on delivery progress. Just sending it.
All the other tasks still have to be done. So we have Post Office Protocol (POP) and its successor Internet Message Access Protocol (IMAP), which handle mailbox storage, reading, marking and searching. We have SIEVE for mail filtering. Each one is confined to the specific functional role it needs, and although you can use them to do things beyond their initial scope, it’s generally best to try not to. Protocols are kept small and focused.
One protocol in particular, Simple Network Management Protocol (SNMP), has taken this to heart. SNMP came from a time when a competing protocol specification was actively under design by another standards group, the CCITT (which became the ITU). That standard, CMIP/CMIS, continues to be used, but almost exclusively by the telecoms sector for managing switching, telephony, and networking systems. It is a large, complex, baroque protocol family with several elements, and is hard to understand and implement. SNMP is a far simpler, reductionist protocol. It uses the binary ASN.1 encoding, with its data models expressed as Management Information Base (MIB) definitions, which now exist for almost any network-attached host, switch, router, and device of any kind.
Because SNMP was small, robust, and cheap (in implementation terms), it was widely adopted and implemented.
Let’s not drop text, but let’s also think about what it really is
Is a text protocol always efficient? No, far from it! We can see above that it can incur significant network coding overhead, and so is sometimes both slower, and more ‘costly’ to use.
But, it’s easy to describe, to discuss, to talk about. After all, the primary role of text is to be communicative!
We need to be mindful of the worldwide textual representation problems, which are inherently about more than just 7-bit ASCII encoded data, and indeed much of the modern Internet is now working hard to define how to ‘upscale’ text protocols to work in non-Western languages.
This activity is often characterized as being about International Domain Names (IDNs), and indeed we now have many non-Western DNS labels in operation, but really it’s about far more than this. Without explicit changes and checks in the standards, you can’t necessarily send mail to Fréderick@France.example.com, because that ‘é’ with an accent may not be accepted all the way through the mail delivery path. That’s a problem we have to fix.
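The DNS side of this already has a text-compatible answer: IDN labels travel as plain ASCII ‘xn--’ strings. A sketch using Python’s standard ‘idna’ codec (which implements the older IDNA2003 mapping; modern software generally uses UTS 46, for instance via the third-party idna package), with a made-up domain name for illustration:

```python
# Encode an accented domain name into its ASCII-compatible form.
# 'frança.example' is an invented name used purely for illustration.
name = "frança.example"
ascii_form = name.encode("idna")

print(ascii_form)  # an 'xn--' label: pure ASCII, safe for legacy software
```

So the wire format stays text, just a more carefully defined text — which is the nuance the next paragraph is asking for.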
In fixing that, we don’t have to reject the use of ‘text’ as such, but just be more nuanced about what we think ‘text’ means.
Perhaps it simply means ‘things humans can understand as-is’. And that’s an important aspect to keep, for all kinds of reasons.
At the core, an ability to express and even implement your network function over a text protocol makes it far more likely other people will both read it, and be willing to implement it, compared to the costs of writing, debugging and deploying a binary protocol.
The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.