What are ping and traceroute, really?

Posted on 21 Jun 2021

Category: Tech matters



“I’ll drop you a line.”

“I’ll get back to you on that.”

“Yeah, remind me about that.”

Or you might simply say “ping me.” In fact, it’s not that uncommon on WhatsApp or your SMS client to send the one-word message: “ping”.

Figure 1 Ping sent as a single message on a phone.

Maybe this is geek talk, leaking into the ‘real’ world.

When we say ‘ping’ as geeks, what do we really mean? 

Ping is a fascinating side of the IP stack. It relies on an independent protocol of its own, which sits alongside the User Datagram Protocol (UDP) and Transmission Control Protocol (TCP) that carry our web and other traffic. That protocol is the Internet Control Message Protocol (ICMP). It still travels ‘inside’ an IP network packet, but it’s not like UDP or TCP.

Ping is all about what we call ‘the control plane’

When we talk about data protocols, there is a way of thinking about them that divides work into two kinds:

  • Things you do that actually send useful data.
  • Things you do that are about how the data is sent.

The first one is what we call ‘the data plane’. This is also sometimes what people call ‘the goodput’, which is a way of saying ‘the thing we really want to do, the useful bit’.

When you use the web (reading this blog article, for example) you’re using IP packets over IPv4 or IPv6 to send data. Your browser requests “hey, give me this new blog article” and the reply from our servers says “sure, here’s all the data to render that web page”.

That’s goodput, the thing you wanted; the thing you asked for and got. 

The second one is what is known as the control plane, and it doesn’t count as goodput because it’s not actually sending data you want to see.

This is about how that data is sent. Ping is an instance of this control plane: it’s about the behaviour of the network itself as it sends and receives IP packets with data. Ping is the colloquial name for one use of ICMP, and ICMP is not exactly the same as the other protocols carried in the Internet Protocol (IP); it sits alongside TCP and UDP. This is why we sometimes talk about ‘the Internet Protocol Suite’: there is a ‘suite’ of related protocols that make things work on top of IP. It’s not one thing, it’s a few. TCP, UDP, and ICMP for starters.

ICMP is its own protocol, and it’s always been there alongside IP

ICMP isn’t enormously complicated. If you look at the Wikipedia entry, one of the things highlighted is the RFC number. It’s defined by RFC 792.

Veterans of the IETF scene will note there are only three digits there, and understand what that means. It’s one of the early ones, dating back to 1981. That’s figuratively in the dark ages, or shortly after (the real dark ages are when the Internet was the ARPAnet, and ran a protocol called NCP but that’s a story for another day. NCP was turned off in 1983, so ICMP dates from the same window).

ICMP is part of the inheritance from before we had a 9,000+ documented RFC burden. It’s been ‘grandfathered in’.

The role of ICMP is to provide information about the path the data is taking from its point of origin to its destination.

ICMP travels inside an ordinary IP packet, but despite that, it’s not really goodput. It’s there to control ‘how things are done’, and is therefore part of the control plane. It does, however, help manage how the actual goodput works.

ICMP has subtypes, and is able to be used to do a number of things.

They form a group of about 10 to 15 different sub-functions, which are fully detailed in the RFC and the Wikipedia page, so I won’t reproduce the complete list here, but a brief summary would look like this:

| ICMP subtype | Function |
| --- | --- |
| ECHO | Two separate message types for a request and a reply. This is what most people mean when they say ‘ping’ |
| DESTINATION UNREACHABLE | 16 subtypes for different reasons things can’t get through |
| SOURCE QUENCH | (Deprecated) basic congestion control: slow down, send me less |
| REDIRECT | Four subtypes for different ways to redirect the IP flow |
| ROUTER | Two subtypes, to solicit and offer routing (how things get forwarded off the local network) |
| TIME EXCEEDED | Two failures due to ‘time’ but actually one is a special concept called time-to-live (TTL), which is … sort of not always about time |
| TIMESTAMP | A pair of subtypes to do request and response about time |
| <other> | A number of other ideas, officially now marked as deprecated (like source quench) |

Phew! Seven ICMP subtypes, with their own sub-subtypes, and even more I didn’t list that are probably best forgotten as they may have once been good ideas, but have been ‘deprecated’ since then.
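If it helps to see the summary above as something a program could use, here is a rough sketch in Python. The type numbers are the ICMPv4 values from the IANA registry; the names `IcmpType`, `TIME_EXCEEDED_CODES` and `describe` are made up for this illustration and don’t come from any library.

```python
from enum import IntEnum

class IcmpType(IntEnum):
    """ICMPv4 message type numbers for the subtypes summarised above."""
    ECHO_REPLY = 0
    DESTINATION_UNREACHABLE = 3
    SOURCE_QUENCH = 4            # deprecated
    REDIRECT = 5
    ECHO_REQUEST = 8
    ROUTER_ADVERTISEMENT = 9
    ROUTER_SOLICITATION = 10
    TIME_EXCEEDED = 11
    TIMESTAMP_REQUEST = 13
    TIMESTAMP_REPLY = 14

# 'Time exceeded' has two codes, matching the two 'time' failures.
TIME_EXCEEDED_CODES = {
    0: "TTL exceeded in transit",
    1: "fragment reassembly time exceeded",
}

def describe(icmp_type: int, code: int = 0) -> str:
    """Turn a (type, code) pair into a readable label (illustrative helper)."""
    try:
        name = IcmpType(icmp_type).name
    except ValueError:
        return f"unknown type {icmp_type}"
    if icmp_type == IcmpType.TIME_EXCEEDED:
        return f"{name}: {TIME_EXCEEDED_CODES.get(code, 'unknown code')}"
    return name
```

For example, `describe(11, 0)` labels the message that traceroute depends on, and `describe(8)` labels the request half of a ping.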

Most of these subtypes are on the control plane.

‘Destination unreachable’ is telling you that whatever you tried to send in the IP packet can’t reach the other end (the destination address you put in the packet). It usually means one of two things. Either there is a routing problem, in which case the ‘source’ address of this ICMP packet won’t be the address you sent to (it will be the address of something along the way, telling you things can’t get any further), or a firewall has decided not to allow your packet to flow, and is using this ICMP message to tell you to “go away”.

‘Redirect’ is trying to teach you how to get there, because where you want to go is not the place the network thinks you need to be. That’s pretty unusual. 

‘Time exceeded’ is wonderful, because it does (in one case) refer literally to a timer limit. If your IP packets have to be fragmented, they can (and often do) arrive either out of order, or not at all. You (a receiver) have to sit there and hang onto the fragments, waiting for the missing one(s) to arrive. This response is how you say back to the sender “nah: the fragment didn’t come, I can’t hold onto this any more”, which will allow the sender to resend the packet, if it can.

There is another handy way to use ‘time exceeded’, though…

So what about ping and traceroute?

Before we can get into the specifics of ping or traceroute, we need to examine ‘time exceeded’ more closely. An ‘ICMP TIME EXCEEDED’ message is sent back when a very useful property of all IP packets, the TTL, runs out. This is a field in the outer IP packet, which ICMP, UDP and TCP all share because it’s part of the IP ‘header’ (every IP packet, whether it’s ICMP, UDP or TCP, has a header, followed by the payload).

It does what it sounds like — it warns of its own demise. 

It’s basically a counter. Each time the packet hits a routing element, the figure goes down by at least one (on modern routers, almost always by exactly one). Imagine heading along a path, and at each stone on the path you lose some of the coins in your pockets.

The thing is, there isn’t one set number of coins to start with, nor are the same number of coins necessarily taken out at each step. So the message might not make it all the way from A to B. When you’re all out of coins, you don’t get to finish walking the path. You’re done. You can send a message home, but that’s it.

Figure 2 — TTL being set to one, meaning it finishes at the first element and sends the message home.

The initial value isn’t a constant; the sender can choose to set this to a low count. This is going to come back in a few paragraphs, so hang onto that thought: TTL is being decremented, TTL zero means stop, and TTL isn’t a fixed value at the start — it’s an equation with several unknowns. 
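To make ‘a field in the IP header’ a bit more concrete, here is a small sketch that builds a bare 20-byte IPv4 header and reads the TTL back out of it. The field layout is the standard IPv4 one (TTL in byte 8, protocol number in byte 9), but the function names and the addresses (taken from the documentation ranges) are invented for this example, and the header checksum is simply left as zero.

```python
import struct

# Protocol numbers carried in the IPv4 header's 'protocol' field.
PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}

def build_ipv4_header(ttl: int, protocol: int) -> bytes:
    """Build a minimal 20-byte IPv4 header (checksum left as zero)."""
    version_ihl = (4 << 4) | 5        # IPv4, header length of five 32-bit words
    return struct.pack(
        "!BBHHHBBH4s4s",
        version_ihl, 0, 20,           # version/IHL, DSCP/ECN, total length
        0, 0,                         # identification, flags/fragment offset
        ttl, protocol, 0,             # TTL, protocol, header checksum
        bytes([192, 0, 2, 1]),        # source (documentation address)
        bytes([198, 51, 100, 7]),     # destination (documentation address)
    )

def read_ttl_and_protocol(header: bytes) -> tuple:
    """Return (ttl, protocol name) from bytes 8 and 9 of an IPv4 header."""
    ttl, protocol = struct.unpack_from("!BB", header, 8)
    return ttl, PROTOCOLS.get(protocol, f"protocol {protocol}")
```

The same two bytes sit in the same place whether the payload is ICMP, UDP or TCP, which is why a sender can pick any starting TTL it likes for any kind of packet.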

It’s also important to note the element that makes the ‘ICMP TIME EXCEEDED’ announcement announces its IP address. In other words, when your TTL has terminated its path, we learn exactly where it happened.

Figure 3 – TTL being set to two, and the message returning at point 2.

So in Figure 3 above, sender A learns the location of point 2 along the path.

Theoretically, you could start with a TTL of 1, and reach the first element. You learn the location of element one.

You could then set the TTL to 2, and learn the location of point two.

And then you set the TTL to 3, and so forth. By increasing your TTL figure each time, you can see every point along the path. Each time your message gets a bit further, then comes back to you. Eventually, it will reach its destination and you’ll have a pretty good idea of how it got there.

But bear in mind, the route back may not be the same.

Figure 4 — A demonstration of the return path differing from the path from source A to destination B.

So it’s not a perfect method. There are actually several things that can make it inaccurate, but this is just a basic way of tracing the route.

And that, right there, is the basic method behind ‘traceroute’.
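Here is the same idea as a toy simulation. It doesn’t touch the network at all: the ‘path’ is just a list of made-up addresses, and `probe` and `trace` are hypothetical helpers written for this post. It also assumes every router takes exactly one off the TTL, which is the usual case. A real traceroute needs raw sockets and has to cope with lost probes and changing paths.

```python
from typing import List, Tuple

def probe(path: List[str], ttl: int) -> Tuple[str, bool]:
    """Send one simulated probe along `path` with the given starting TTL.

    `path` lists every hop in order, ending with the destination.
    Returns (address that answered, whether it was the destination).
    """
    for hop_number, address in enumerate(path, start=1):
        if hop_number == len(path):
            return address, True       # arrived: the destination itself replies
        ttl -= 1                       # each router spends one unit of TTL
        if ttl == 0:
            return address, False      # this router reports 'time exceeded'
    raise ValueError("path must not be empty")

def trace(path: List[str], max_ttl: int = 30) -> List[str]:
    """Raise the TTL one step at a time until the destination answers."""
    seen = []
    for ttl in range(1, max_ttl + 1):
        address, reached = probe(path, ttl)
        seen.append(address)
        if reached:
            break
    return seen
```

Calling `trace(["10.0.0.1", "192.0.2.1", "198.51.100.7"])` walks TTL 1, 2, 3 and collects one address per step, which is exactly the list a traceroute prints.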

Ok, what about ping?

Going back to our ICMP subtype table, ‘echo’ is the one that we all know and love. Echo is the primary ICMP packet type that the ping command uses by default. Echo, aka ping, is how you can find out, “If I wanted to send some data to you, are you there, and, for this echo packet, how long did it take, all up, for me to send it and get your reply back?”

Unlike traceroute, which tells us where this packet travelled, for ping we’re looking at how long it took.

But if you know how ‘long’ something takes, you can get a pretty good idea of distance. If someone travelled in a reasonably straight line for an hour at 100km/h, you can guess they’ve travelled around 100km.
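For the curious, this is roughly what an echo request looks like as bytes. It’s a sketch only: it builds the ICMP part (type 8, code 0, checksum, identifier, sequence number, optional payload) and does not send anything, since that needs a raw socket and usually elevated privileges. The function names are my own.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum, as used by ICMP."""
    if len(data) % 2:
        data += b"\x00"                       # pad to whole 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                        # fold carries into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an ICMP echo request with a valid checksum."""
    # First pass: checksum field set to zero, as the checksum rules require.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    # Second pass: same header with the real checksum filled in.
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload
```

A handy property to check: running `internet_checksum` over the finished packet gives zero, which is how a receiver verifies it. The reply is the same layout with type 0, echoing back the identifier, sequence number and payload so the sender can match it to the request and time the round trip.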

The ‘ping me’ idea is the echo ICMP packet. It’s a message sent out to a recipient to determine whether they’re receptive to receiving messages, and what the response might look like.

It’s a little bit like sonar.

Think of submarine films like The Hunt for Red October. This is a literal, actual, audible sonic signal — send out a ping and get an echo back.

And the same idea of ‘how long did it take?’ is in sonar. The time to send and receive the ping-echo in sonar, combined with the speed of sound in water (better have a table handy: it varies by depth, temperature, salinity, and current), tells you the distance. Well, in ICMP echo, the round-trip time (RTT) tells you the effective ‘distance’ as a time, and is a good rough approximation for a few things:

  1. How far away are you, in time, all things being as best they can be? This is the ‘smallest’ RTT you see, when doing a sequence of pings. 
  2. How variable is the time, as a rough indication of congestion and delay? This is the range of RTT you see, how bad it can be, how good it can be and how much it ranges in-between. This is what we call both delay and jitter: jitter is the variability, and delay is the ‘how long?’ part.
  3. How reliable are you? What is the loss of packets over this series of pings?
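Those three questions map onto three numbers you can pull out of any run of pings. The sketch below assumes you already have the RTTs in milliseconds, with `None` standing in for a probe that never got a reply; `summarise_pings` is a made-up helper, and using the plain range as a stand-in for jitter is a simplification.

```python
from typing import List, Optional

def summarise_pings(rtts_ms: List[Optional[float]]) -> dict:
    """Summarise a ping run; None marks a probe that got no reply."""
    replies = [rtt for rtt in rtts_ms if rtt is not None]
    sent = len(rtts_ms)
    loss_percent = 100.0 * (sent - len(replies)) / sent if sent else 0.0
    if not replies:
        return {"min_ms": None, "max_ms": None, "range_ms": None,
                "loss_percent": loss_percent}
    return {
        "min_ms": min(replies),                    # question 1: best-case delay
        "max_ms": max(replies),
        "range_ms": max(replies) - min(replies),   # question 2: crude jitter
        "loss_percent": loss_percent,              # question 3: reliability
    }
```

So `summarise_pings([10.0, None, 14.0, 12.0])` reports a best case of 10ms, a spread of 4ms, and one probe in four lost.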

All of these are informative. They lead to more questions, as well as answers. The best ping might be 10ms. That tells me that you are not more than about 1,000km from me, because light in optical fibre covers roughly 200km per millisecond and the 10ms has to cover the trip there and back. If it’s 350ms, you might be in London for all I know; if it never drops below 300, I would probably guess at that, but you might be on a very slow congested link and actually be closer. I might need to look at how variable this delay is, to try and figure things out.

If I see 10 ping packets sent out, but only 4 replies arrive, then I have 60% packet loss in ICMP. That’s pretty bad. I’m not happy. Is it me? Is it you? Is it something in-between? It’s hard to know (but we will come back to this).
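The distance reasoning is simple arithmetic, sketched here under one loud assumption: light in optical fibre travels at roughly 200km per millisecond, about two-thirds of its speed in a vacuum. The constant and the function name are mine. The result is an upper bound only, because queuing, routing detours and processing time all eat into the RTT without adding any distance.

```python
# Assumption: light in optical fibre covers roughly 200 km per millisecond
# (about two-thirds of its speed in a vacuum).
FIBRE_KM_PER_MS = 200.0

def max_distance_km(rtt_ms: float, km_per_ms: float = FIBRE_KM_PER_MS) -> float:
    """Upper bound on the one-way distance implied by a round-trip time."""
    one_way_ms = rtt_ms / 2            # the RTT covers the trip out and back
    return one_way_ms * km_per_ms
```

Under that assumption, a best-case RTT of 10ms puts the other end within about 1,000km, and 300ms allows for up to 30,000km of fibre, which is plenty to get most of the way around the planet.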

Echo is ping, but the Internet can decide to de-preference your ping; it may not actually mean what you think

Ping is the go-to diagnostic you use when you can’t get a web page to reply, you think your home router has been disconnected, or you’re worried there is some congestion on the line. Any how-to guide is going to suggest ICMP as the first diagnostic.

But they really should tell you about an important caveat. Routers and other devices like firewalls along the way can play with ping. They can give it lower preference than other packets (so you acquire more delay, and the result isn’t indicative of the goodput any more); they can prioritize it (so it’s over-optimistic); or they can rate limit it, so you see either slower responses or effectively more loss (dropped packets) than your data will see. For these reasons and others, you need to be skeptical of the value of a ‘ping’ diagnostic, against other information. It’s a tool but it’s not categorically ‘right’ all the time.

You can think of it this way: you may know that they travelled for an hour at 100km/h, but maybe they forgot to tell you about that detour for roadworks or how long they stopped to refuel.

Your guess was reasonable, but could be wrong.

Sometimes, people abuse ping

Geeks. Love them or loathe them, you sometimes have to admire them. Ping is part of the protocol stack; we need it to work and we want it to work (and please don’t over-filter ICMP), but it can be used to wreak havoc. Firstly, ICMP can ‘flood’ a network. It’s possible to use it to ask ALL the hosts on a local segment to reply, and the net effect of this is to cause a lot of surplus traffic to suddenly exist. Unfortunately, bad guys can use things like this to cause people to send unnecessary data. This is why people wind up filtering ICMP packets: it’s possible to abuse them to reduce overall network effectiveness.

Another problem with ICMP is actually a niche problem for gamers, and not a big deal. This is the concept of ‘the low ping cheater’ (known in gaming communities as something else). This is the person who looks at the RTT, figures out they have lower RTT (faster packets) than anyone else, and then enters the game with a preloaded advantage. They can get data to the game server before anyone else can. The net result is ‘quick draw McGraw’ will always win any shoot-em-up game because no matter how quickly you react, their packets tend to get there first.

Good games try to recognize this effect and socialize out the problem. Low ping cheaters reduce gameplay for everyone; the gamers themselves try to police this, or the game introduces modes of data service that ‘equalize’ for delay to even things out. (Basically, we’re all caught in the economics of game theory, playing games).

Lastly, there is lovely circularity about ‘this is the control plane not the data plane’ because ICMP packets actually can carry data, both ways: sender and receiver. 

IP over everything includes IP over ICMP

One of the ‘tricks of the trade’ of getting through firewalls and restrictions on packet flow is to see that ICMP packets are allowed to flow, but not others.

If we can send ICMP, and the far end has been set up in advance to cooperate (or we can arrange this via a call out to somebody), then we can implement the entire IP stack as a ‘tunnel’ over ICMP. It may be slower, it may have more loss, and it may require IP fragmentation and reassembly, but as long as we can send and receive ICMP, we can actually send IP ‘inside’ ICMP. And yes, that includes sending ICMP inside ICMP (the outer ICMP itself is going to one place only; the far end has to de-encapsulate, removing these packets from the tunnel, and insert its own address as ‘sender’ before sending these IP packets out into the global Internet. It’s similar to running a VPN except using ICMP as the carrier).

Yes, there’s an IPv6 version

ICMP is part of the IP protocol suite, but we really mean IPv4. The version of ICMP we have in IPv6 is called ICMPv6. It’s subtly different to IPv4 ICMP, but it’s equally important. It has different options and capabilities that reflect changes in the IP protocol stack between version 4 and version 6. Arguably, some of these are just as ‘odd’ as the legacy and deprecated features of ICMP, but it pays to be cautious about assuming they don’t get used. In fact, ICMPv6 is important for stateless auto-configuration. It is used for the address solicitation, router discovery and renumbering processes. That makes it even more vital that ICMPv6 packets are allowed to flow properly on the local segment. For end-to-end uses, the same ping and TTL functionality exists (IPv6 calls the TTL field ‘Hop Limit’), and the same model of path discovery for traceroute can be done in IPv6.

What are ping and traceroute, really? They’re part of the control plane, and the data plane, kind of…

So there you have it; ping is ICMP echo, and traceroute is a tool that uses ICMP to infer things about the network. Both ping and traceroute are primarily diagnostic tools, but ICMP itself is also part of the active control of end-to-end IP, and an important part that was ‘baked in’ to the protocol suite back in the early days. They’re important, but you need to be careful about over-reading their behaviour, and do your due diligence on their diagnostic value to really determine how your IP packets are flowing.


