Route to the Internet’s future

By on 15 Aug 2014

Category: Tech matters



Yesterday APNIC Research Scientist George Michaelson commented on media coverage of the Internet “running out of space”, quoting research by APNIC Chief Scientist Geoff Huston. Today, Geoff continues the discussion.

Around 1990 the Internet Engineering Task Force (IETF) was alerted to a looming problem: long before the Internet was a commercial reality, it looked like we would hit two really solid walls if we wanted to scale the Internet into a global communications system.

The first problem was that the Internet Protocol’s 32-bit binary address was just too small. It was looking likely that we were going to run out of addresses in the mid ’90s.

The second problem was that the Internet’s routing system was growing at an uncontrolled rate. Up until then we had used re-purposed low cost IBM XT machinery as routers, but the routing system was growing faster than Moore’s Law and we were running out of cheap routing grunt. We’d soon need phenomenally expensive high end supercomputers instead.

We “solved” the address crunch by accident. The mainstream effort was to design a successor protocol to IPv4 that had really, really big address fields. We did that by 1995, and it’s called IPv6. However, as well as IPv6, out of that effort came a form of low impact semi-transparent address sharing: Network Address Translation (NAT). These days IPv6 is still “some time away”: around 3% of the Internet can use IPv6, and the take up is still quite patchy. In the meantime NATs were adopted at a surprising rate, and today NATs drive the Internet. While hard numbers are hard to come by, we suspect that almost the entirety of the Internet’s client population is located behind at least one level of NAT, and many clients live behind two or more. IPv6 is now 20 years old and still not here, and we have run out of IPv4 addresses. This was the envisaged “disaster” scenario, but NATs have kept the Internet working despite that.

What about that routing problem? In 1993 we introduced another stopgap measure, similar to NATs: so-called “classless” routing protocols. This held back the onslaught of the routing explosion for a couple of years. Then we changed the way addresses were allocated. The introduction of the Regional Internet Registry (RIR) framework achieved one major outcome, which is little talked about but perhaps more effective than any other measure we took. The RIR system turned addresses from completely free resources that were allocated on a cemetery plot basis (once and forever) into a leased resource, where continued rights to use an address were associated with an annual payment. I would argue that this monetisation of addresses pushed the evolving Internet industry from address profligacy to address conservatism. The changes were pervasive. Home consumers were given one IP address, and the customer’s CPE included built-in NATs to allow their home network to grow to arbitrary sizes without using more of those expensive public addresses. Interestingly enough, the monetisation level was not high, but the results were significant. We shifted over to a model of address allocation that supported routing aggregation by network providers, and over the ensuing years the growth trajectory in the routing system was significantly lower, in relative terms, than that of the early 1990s.
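To make the aggregation point concrete, here is a minimal Python sketch using the standard library `ipaddress` module. The prefixes and the `aggregate` helper are invented for illustration and are not real allocations: contiguous customer routes carved from one provider block collapse into a single announcement, while scattered prefixes cannot be merged and each takes its own routing entry.

```python
import ipaddress

def aggregate(prefixes):
    """Collapse contiguous prefixes into the fewest covering CIDR blocks."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]

# Hypothetical customer routes allocated out of one provider block.
provider_block = ["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24", "10.1.3.0/24"]

# Hypothetical prefixes scattered across the address space.
scattered = ["10.1.0.0/24", "10.9.1.0/24", "172.16.5.0/24"]

print(aggregate(provider_block))  # one covering /22
print(aggregate(scattered))       # nothing to merge, three entries remain
```

Under these assumptions the provider announces one route instead of four, which is the effect that provider-based addressing had on routing table growth.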

While the Internet’s growth has pressed on, the routing system and the addressing system have not grown at the same rate, due largely, in the case of routing, to the widespread deployment of NATs and provider-based addressing, and the ubiquitous use of classless routing protocols.

This story is visible in a plot of the past 20 years of routing table growth. If you look closely at this time series you can see the Internet boom and bust of the late ’90s and the onslaught of the mobile Internet in the past 10 years of growth.

However, today’s question is: “Is routing growing faster than Moore’s Law?”

If the answer is “yes” then routing gets more expensive. If “no” then the unit costs of routing will fall.

My presentation at NANOG 60 at the start of this year looked at the routing system and tried to forecast the next five years of routing table growth. Nothing in these projection numbers gives any cause for concern. The growth of the IPv4 Internet is linear, at an approximate rate of 40,000 – 50,000 additional entries per year. This is a growth rate that is well below Moore’s Law. Yes, we cross threshold points from time to time, but we cross them at a leisurely pace. Recently, the IPv4 Internet passed 512K routing entries, and I’ve seen a few comments on this. But in fact it’s not a “real” threshold.
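A rough way to see the gap is to put the two curves side by side. The sketch below takes the figures quoted above (a table of about 512K entries growing by 40,000 to 50,000 entries a year) and compares them with a capacity that doubles at constant cost. The two-year doubling period and the function names are assumptions made for the illustration, not measured values.

```python
def linear_table_size(start, per_year, years):
    """Routing table size under steady linear growth."""
    return start + per_year * years

def moore_capacity(start, years, doubling_years=2.0):
    """Capacity available at constant cost if it doubles every
    `doubling_years` (an assumed reading of Moore's Law)."""
    return start * 2 ** (years / doubling_years)

START = 512_000  # roughly today's IPv4 table size, per the text

for years in (0, 5, 10):
    low = linear_table_size(START, 40_000, years)
    high = linear_table_size(START, 50_000, years)
    capacity = moore_capacity(START, years)
    print(f"+{years:2d} years: table {low:,} to {high:,} entries, "
          f"same-cost capacity {capacity:,.0f}")
```

Linear growth adds a fixed number of entries each year while the assumed capacity curve multiplies, so the gap between them widens over time and the unit cost of routing falls.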

Let me explain…

The conventional way to get a router operating at a reasonably high packet throughput rate is to use very high speed TCAM banks (so-called “ternary content addressable memory”) on the line cards. The destination address field of the incoming packet is presented to the TCAM unit and a very short time later out comes what is, in effect, an outbound interface queue identifier where a router needs to send the packet onward. This is then used as input to the switching engine to take the packet and switch it over to the outbound interface queue.
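The lookup described above can be sketched in software. This is only an illustration of the idea, not how any vendor builds a line card: a real TCAM compares the destination against every entry in parallel in hardware, while the loop below checks entries one at a time. The class name, the prefixes and the queue numbers are all hypothetical, and the sketch handles IPv4 only.

```python
import ipaddress

class TcamSketch:
    """Software stand-in for a TCAM: each entry is a (value, mask) pair,
    where the mask marks which bits must match and the rest are "don't care"."""

    def __init__(self):
        self.entries = []  # tuples of (value, mask, prefix_length, queue_id)

    def add(self, prefix, queue_id):
        net = ipaddress.ip_network(prefix)
        self.entries.append(
            (int(net.network_address), int(net.netmask), net.prefixlen, queue_id)
        )
        # Keep longer prefixes first so the first hit is the longest match,
        # standing in for the priority encoder of a hardware TCAM.
        self.entries.sort(key=lambda entry: entry[2], reverse=True)

    def lookup(self, destination):
        addr = int(ipaddress.ip_address(destination))
        for value, mask, _length, queue_id in self.entries:
            if (addr & mask) == value:
                return queue_id
        return None  # no matching entry

tcam = TcamSketch()
tcam.add("0.0.0.0/0", 0)    # default route
tcam.add("10.0.0.0/8", 1)
tcam.add("10.1.0.0/16", 2)

print(tcam.lookup("10.1.2.3"))    # matches the /16
print(tcam.lookup("10.200.0.1"))  # falls back to the /8
print(tcam.lookup("192.0.2.1"))   # only the default route matches
```

Each routing entry occupies a slot whether or not traffic ever hits it, which is why the size of the routing table translates directly into TCAM demand.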

TCAM is expensive, and high speed TCAM is (evidently) extremely expensive. Router vendors don’t provision more than “necessary”, as this makes their products more expensive or reduces their margins on the product. They tend to build equipment with headroom for around five to ten years of anticipated growth.

Many vendors look at conventional projections of routing table size (such as page 27 here), double them and use this in their equipment. It should not be surprising to see currently shipping routing equipment with linecards that carry 2M TCAM slots.

Of course it’s not just IPv4 – there’s IPv6 as well, but here the projected numbers are not as large (page 30 here). However, IPv6 prefixes are 64 bits, as compared to IPv4’s 32 bits, so they conventionally use 2 TCAM slots per rotting entry.

Yes, you could expand TCAM sizes to, say, 20M or more, but this will add to the router’s power and heat budget as well as the capital cost budget, and I suspect this would make its set of line cards look a lot like power-hungry, overheating, expensive bloatware. But if you wait for five to ten years, Moore’s Law will come to your assistance and you will be able to do 20M TCAMs for the power and price of, say, 2M TCAMs today. Most vendors provision the TCAMs in their linecards to ride around five to ten years ahead of the curve. If you have ten year old equipment you may well need a TCAM transplant to cope with today’s FIB size.
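The “five to ten years” figure is easy to sanity-check. Going from 2M to 20M slots is a factor of ten, or a little over three doublings. The sketch below multiplies that by a few candidate doubling periods; the periods and the function name are assumptions chosen for the illustration, and TCAM need not track any of them exactly.

```python
import math

def years_until_affordable(current_slots, target_slots, doubling_years):
    """Years until `target_slots` costs what `current_slots` costs today,
    assuming capacity per unit cost doubles every `doubling_years`."""
    doublings = math.log2(target_slots / current_slots)
    return doublings * doubling_years

# Assumed doubling periods, from optimistic to pessimistic.
for period in (1.5, 2.0, 3.0):
    years = years_until_affordable(2_000_000, 20_000_000, period)
    print(f"doubling every {period} years: about {years:.1f} years")
```

With these assumed periods the wait comes out at roughly five, six and a half, and ten years respectively.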

Yes, you could do FIB compression to reduce the TCAM demands – but so far the cost of TCAM has been low enough that the added computational overhead to perform FIB compression is basically a dud proposition. But I suppose that it’s reassuring, to some extent, to believe that if things get really, really tough then FIB compression could be used!
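For readers who have not met the idea, one simple form of FIB compression merges two sibling prefixes that share a next hop into their parent prefix, which forwards packets identically while using one slot instead of two. The sketch below is a hypothetical illustration of that single technique, not any vendor’s algorithm; the function name, prefixes and next hop labels are made up.

```python
import ipaddress

def compress_fib(fib):
    """Repeatedly merge sibling prefixes that share a next hop into their parent.

    `fib` maps prefix strings to next hops. Because the two halves together
    cover every address in the parent, replacing them with the parent leaves
    longest-match forwarding unchanged.
    """
    table = {ipaddress.ip_network(p): hop for p, hop in fib.items()}
    merged = True
    while merged:
        merged = False
        # Work from the longest prefixes upward so merges can cascade.
        for net in sorted(table, key=lambda n: n.prefixlen, reverse=True):
            if net not in table or net.prefixlen == 0:
                continue
            hop = table[net]
            parent = net.supernet()
            left, right = parent.subnets()
            if table.get(left) == hop and table.get(right) == hop:
                del table[left], table[right]
                table[parent] = hop
                merged = True
    return {str(net): hop for net, hop in table.items()}

# Hypothetical forwarding table: three entries towards A, one towards B.
fib = {
    "10.0.0.0/24": "A",
    "10.0.1.0/24": "A",
    "10.0.2.0/23": "A",
    "10.0.4.0/24": "B",
}
print(compress_fib(fib))
```

Here four entries shrink to two. The catch is that the compressed table has to be recomputed as routes change, which is the computational overhead mentioned above.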

What does all this mean? Well it means that older routing equipment in the Internet will, in time, fall over – inevitably.

Units with 256K TCAMs died some time ago. For those network operators running dual stack networks, 512K TCAMs actually died a few months back, because the combined total of IPv4 and IPv6 TCAM slots exceeded 512K a few months ago. The “we’ve hit 512K and we died” story only applies to networks that are IPv4 only.
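The dual stack arithmetic can be sketched as follows. The two-slots-per-IPv6-entry figure comes from the description above; the route counts, the reading of “512K” as 512 × 1024 slots, and the function names are assumptions made purely for illustration.

```python
def tcam_slots_needed(ipv4_routes, ipv6_routes, slots_per_ipv6_route=2):
    """Total TCAM slots, with each IPv6 entry taking several IPv4-sized slots."""
    return ipv4_routes + ipv6_routes * slots_per_ipv6_route

def fits(capacity, ipv4_routes, ipv6_routes, slots_per_ipv6_route=2):
    """True if the combined tables fit in a TCAM of `capacity` slots."""
    needed = tcam_slots_needed(ipv4_routes, ipv6_routes, slots_per_ipv6_route)
    return needed <= capacity

CAPACITY = 512 * 1024   # assumed meaning of a "512K" TCAM
IPV4_ROUTES = 500_000   # hypothetical table size, just under the limit
IPV6_ROUTES = 18_000    # hypothetical IPv6 table size

print(fits(CAPACITY, IPV4_ROUTES, 0))            # IPv4-only network
print(fits(CAPACITY, IPV4_ROUTES, IPV6_ROUTES))  # dual stack network
```

With these made-up numbers the IPv4-only router still has room while the dual stack router has already overflowed, which is why dual stack operators reached the limit first.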

Nothing in the Internet’s routing table gives me cause for concern, and absolutely nothing I see is telling me to hit the alarm button – there really is not much to see here. And while that is not exactly a newsworthy story, for a few tens of thousands of network operators out there, and for the vendors of equipment who service these network operators, the fact that BGP growth is steady and highly predictable is actually a really, really good story!


The views expressed by the authors of this blog are their own and do not necessarily reflect the views of APNIC. Please note a Code of Conduct applies to this blog.

3 Comments

  1. Andrew McRae

    I love the Freudian slip of those IPv6 “rotting” entries 🙂
    TCAMs don’t quite scale according to Moore’s Law, because the limiting factor is the ‘CAM’ part of TCAM, i.e. parallel matching of a large number of entries. That is why the number of entries in a TCAM has not scaled in line with Moore’s Law (though it has certainly scaled).
    We’ve seen these ‘the sky is falling’ stories about the Internet for many years. I am reminded of Mike O’Dell’s (UUNET architect) axiom: The only real problem is scaling.

  2. Geoff Huston Post author

    Thanks for pointing out the typo, Andrew. I think it’s possibly very true that many routing entries are just sitting there rotting away, so I hope that the page’s editors leave the text as is! 🙂

    Yes, CAM memory evidently does not track Moore’s Law precisely, but then again neither does routing, so for the most part the story about scaling routing over the past decade is one that is largely a message of capability and stability.

    The story I like best about the Internet’s imminent collapse is Bob Metcalfe’s prediction back in 1995, and NANOG’s T-shirt in response (see the NANOG 08 T-shirt at http://www.nanog.org/meetings/t-shirts if you want to re-live the moment!)

  3. Geoff Huston Post author

    I must confess that I’ve never tried to design hardware, and certainly not Content Addressable Memory. So it’s not surprising that I’ve received a comment that a detail in the description above is evidently incorrect. It appears that for IPv6 lookups the TCAM is configured to be 128 bits wide, or 4 times the size of an IPv4 TCAM entry. My apologies for this inaccuracy.

    Given that the low order 64 bits are, for the most part, used for the local interface identifier, the prefixes used in the IPv6 routing table are no longer than 64 bits, and I was under the assumption that a hardware designer would optimise the TCAM arrangement for IPv6 and handle only prefixes up to 64 bits in length in TCAM and throw the rest into a far slower software lookaside process. (I know I would, but maybe that’s a small part of the reason why nobody has ever tasked me to design hardware!) But it appears that the design of TCAM for IPv6 really does handle the absolute extreme case, and the TCAM for IPv6 is 128 bits wide.
