Posts Tagged internet

The surprising centres of the Internet

A previous post, on “Barabási – Albert preferential attachment and the Internet”, gave a plot of the Internet, as a sparsity map of its regular adjacency matrix, with the axes ordered by each AS’s eigencentrality:

Sparsity plot of the Internet adjacency matrix, for the UCLA IRL 2013-06 data-set, with the nodes ordered by their Eigencentrality ranking.

Each connection in the BGP AS graph is represented as a dot, connecting the AS on the one axis to the AS on the other. As the BGP AS graph is undirected, the plot ends up symmetric. The top-right corner of this plot shows that the most highly-ranked ASes are very densely interconnected. The distinct outline probably is indicative (characteristic?) of a tree-like hierarchy in the data.

Who are these top-ranked ASes though? Are they large, well-known telecommunications companies? The answer might be surprising.


Barabási – Albert preferential attachment and the Internet

The Barabási – Albert paper “Emergence of Scaling in Random Networks” helped popularise the preferential-attachment model of graphs, and its relevance to a number of real-world graphs. There’s a small, somewhat trivial tweak that can be made to that model which nevertheless changes its characteristics slightly, with the result possibly being more relevant to the BGP AS graph. GNU Octave and Java implementations are given for the original BA-model and the tweak. The tweak can also potentially improve performance of vectorised implementations, such as in Octave/Matlab.


Are IPv6 addresses too small?

IPv4 addresses are running out. IPv6 seems to offer the promise of a vast, effectively limitless address space. However, is that true? Could it be that even IPv6 addresses are too limited in size to allow potential future routing problems to be tackled? Maybe, if we want scalable, efficient routing. How could that be though?

IPv4 addresses are all but exhausted, IANA having run out in 2011, APNIC shortly thereafter and RIPE in 2012. The idea is that we switch to IPv6, overcoming the obstacles to transition that have now been engineered into it (e.g. usually a completely unrelated address space from IPv4, which significantly hampers IPv6 from making transparent use of existing IPv4 forwarding capabilities).

IPv6 addresses are 128 bits in length, compared to the 32 bits of IPv4. So in theory IPv6 provides a massive address space compared to IPv4, of 2^128 addresses. Exactly how massive? Well, some have calculated that that is enough to assign an IPv6 address to every atom on the surface of the earth, and still have addresses to spare. If every IPv6 address were assigned to a node with just 512 bytes of memory, then some have calculated the system in total would require more energy than it would take to boil all the oceans on earth – presuming that system exists only as pure energy! So by those estimates, the IPv6 address space seems like it is so massive that we couldn’t possibly ever come close to using all of it, at least not while we’re confined to earth.

Those estimates however assume the addresses are used as pure numbers with perfect efficiency. In reality this will not be the case. To make computer networking efficient we need to encode information into the structure of the addresses, particularly so when those networks become large. The lower, least-significant 64 bits of an IPv6 address are effectively reserved for use on local networks, to allow for auto-configuration of IPv6 addresses on ethernets. Changing this would be quite difficult. So IPv6 actually offers only 64 bits to distinguish between different networks, and then a further 64 bits to distinguish between hosts on a given network. Further, of the upper, most-significant 64 bits (i.e. the “network prefix” portion of an IPv6 address), already at least the first, most-significant 3 bits are taken up to distinguish different blocks of IPv6 space. This means the global networking portion of the IPv6 address space only has about 61 bits available to it.
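As a quick sanity check on those numbers, here is a minimal Python sketch of the bit budget described above. The constant names are my own, and the 3-bit figure is simply the "at least 3 bits" mentioned in the text.

```python
# Bit budget of an IPv6 address, as described in the text above.
TOTAL_BITS = 128          # full IPv6 address
INTERFACE_ID_BITS = 64    # low 64 bits: hosts on a local network
RESERVED_PREFIX_BITS = 3  # leading bits that select a block of IPv6 space

network_bits = TOTAL_BITS - INTERFACE_ID_BITS        # bits naming a network
routable_bits = network_bits - RESERVED_PREFIX_BITS  # bits left for global routing


def space(bits: int) -> int:
    """Number of distinct values representable in `bits` bits."""
    return 2 ** bits


print(f"all addresses:    2^{TOTAL_BITS} ~ {space(TOTAL_BITS):.3e}")
print(f"network prefixes: 2^{network_bits} ~ {space(network_bits):.3e}")
print(f"globally usable:  2^{routable_bits} ~ {space(routable_bits):.3e}")
```

The point is only that the headline 2^128 figure shrinks by many orders of magnitude once the structure of the address is taken into account.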

IPv4 and IPv6 BGP FIB sizes: # of prefixes on a logarithmic scale. Courtesy of Geoff Huston.

Why could this be a problem? Isn’t 61 bits more than enough to distinguish between networks globally (i.e., as on the Internet)? Well, that’s an interesting question. The global Internet routing tables have continued to grow at a super-linear pace, as can be seen in the plot above (courtesy of Geoff Huston), with at least quadratic growth and possibly exponential modes of growth. IPv4 routing tables have continued to grow despite the exhaustion of the address-space, by means of de-aggregation. IPv6 routing tables are also growing fast, though from a much smaller base. There have been concerns over the years that this growth might one day become a problem, that router memory might not be able to cope with it, and that this could bottleneck Internet growth.

At present with BGP, every distinct network publishes its prefix globally and effectively every other network must keep a note of that. Hence, the number of entries in the Internet routing tables grows in direct proportion to the number of distinct networks. As each entry uses an amount of memory in logarithmic proportion to the size of the network, the memory needed for the routing table grows slightly faster than the network itself. So as the Internet grows at a rapid pace, the amount of memory needed at each router grows even more rapidly.
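To make that growth rate concrete, here is a rough Python sketch. It assumes an idealised table with one entry per network, where each entry needs just enough bits to tell all the networks apart. Real router tables carry far more per entry, so treat the function names and the model as illustrative only.

```python
def bits_per_entry(num_networks: int) -> int:
    """Minimum bits needed to give each of `num_networks` a distinct label."""
    return max(1, (num_networks - 1).bit_length())


def table_bits(num_networks: int) -> int:
    """Idealised table size: one entry per network, N * log2(N) bits overall."""
    return num_networks * bits_per_entry(num_networks)


for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9} networks -> {bits_per_entry(n):>2} bits/entry, "
          f"{table_bits(n):>10} bits in total")
```

Doubling the number of networks slightly more than doubles the memory, which is the "slightly faster than the network" growth described above.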

If we wanted to tackle this routing table growth (and it’s not clear we need to), we would have to re-organise addressing and routing slightly. Schemes where routing table memory growth happens more slowly than growth of the network are possible. Hierarchical routing schemes existed before, and BGP even had support for hierarchical routing through CIDR and aggregation. However, as hierarchical routing could lead to very inefficient routing (packets taking very roundabout paths, compared to the shortest path), other than in carefully coördinated networks, it never gained traction for Internet routing, and support for aggregation has now been deprecated from BGP. Other schemes involving tunnelling and encapsulation are possible, but they suffer from similar, intrinsic inefficiencies.

A more promising scheme is Cowen Landmark Routing. This scheme guarantees both sub-linear growth in routing table sizes, relative to growth of the network, and efficient routing. In this scheme, networks are associated with landmarks, and packet addresses contain the address of the landmark and the destination node. One problem with turning this into a practical routing system for the Internet is the addressing. How do you fit 2 network node identifiers into an IPv6 address? Currently organisations running networks are identified by AS numbers, which are now 32-bit. However, 2 × 32 bits = 64 bits, and there are no more than 61 bits available at present in IPv6 for inter-networking.
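The squeeze can be shown in a few lines of Python. This is only a sketch of the arithmetic: packing a landmark and a destination as two 32-bit AS numbers is the assumption from the paragraph above, and `pack_landmark_address` is a made-up helper, not part of any real scheme.

```python
AS_NUMBER_BITS = 32   # current AS number width
AVAILABLE_BITS = 61   # bits left for inter-networking in an IPv6 prefix

REQUIRED_BITS = 2 * AS_NUMBER_BITS  # landmark + destination


def pack_landmark_address(landmark_as: int, dest_as: int) -> int:
    """Pack (landmark, destination) AS numbers into one integer field."""
    for value in (landmark_as, dest_as):
        if not 0 <= value < 2 ** AS_NUMBER_BITS:
            raise ValueError(f"AS number out of range: {value}")
    return (landmark_as << AS_NUMBER_BITS) | dest_as


worst_case = pack_landmark_address(2 ** 32 - 1, 2 ** 32 - 1)
print("bits required:", REQUIRED_BITS)
print("bits available:", AVAILABLE_BITS)
print("worst case fits:", worst_case.bit_length() <= AVAILABLE_BITS)
```

The shortfall is only 3 bits, but a fixed-width field has to accommodate the worst case, so it does not fit.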

Generally, any routing scheme that seeks to address routing table growth is near certain to want to impose some kind of structure on the addressing in similar ways. The 64 bits of IPv6 doesn’t give much room to manoeuvre. Longer addresses or, perhaps better, flexible-length addresses, might have been better.

Are IPv6 addresses too short? Quite possibly, if we ever wish to try to tackle routing table growth while retaining efficient routing.


Cerf and Kahn on why you want to keep IP fragmentation

In “A Protocol for Packet Network Intercommunication”, Vint Cerf and Bob Kahn explain the basic, core design decisions in TCP/IP, which they created. They describe the end-to-end principle. What fascinates me most though is their explanation of why they incorporated fragmentation into IP:

We believe the long range growth and development of internetwork communication would be seriously inhibited by specifying how much larger than the minimum a packet size can be, for the following reasons.

  1. If a maximum permitted packet size is specified then it becomes impossible to completely isolate the internal packet size parameters of one network from the internal packet size parameters of all other networks.
  2. It would be very difficult to increase the maximum permitted packet size in response to new technology (e.g. large memory systems, higher data rate communication facilities, etc.) since this would require the agreement and then implementation by all participating networks.
  3. Associative addressing and packet encryption may require the size of a particular packet to expand during transit for incorporation of new information.

Fragmentation generally is undesirable if it can be avoided, as it has a performance cost. The fragmenting router may do so on a slow-path, for example; and re-assembly at the end-host may introduce delay. As a consequence, end hosts have for a long while generally performed path-MTU-discovery (PMTUD) to discover the right overall MTU to a destination. This allows them to generate IP packets of just the right size (if the upper-level protocol doesn’t support some kind of segmentation, as TCP does, this may still require the host to generate IP fragments), and so to set the “Don’t Fragment” bit on all packets and generally avoid intermediary fragmentation. Unfortunately, PMTUD relies on ICMP messages which are sent out-of-band, and as the internet became bigger, more and more less-than-clueful people became involved in the design and administration of the equipment needed to route IP packets. Routers started to silently ignore over-size packets and (even more commonly) firewalls started to stupidly filter out nearly all ICMP – including the important “Destination Unreachable: Fragmentation Needed” ICMP message needed for PMTUD. As a consequence, end-host path-MTU discovery can be fragile. When it fails to work, the end-result is a “Path MTU blackhole”: packets get dropped for being too big at a router while the ICMP messages sent back to the host get dropped (usually elsewhere), meaning it never learns to reduce its packet sizes. With IP fragmentation communication may be slow, but with PMTU blackholing it becomes impossible.
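The failure mode is easy to see in a toy model. The Python sketch below is a deliberately simplified simulation, not real networking code: a path is just a list of link MTUs, the sender always sets “Don’t Fragment”, and the function and parameter names are my own.

```python
from typing import Optional, Sequence


def first_bottleneck(size: int, link_mtus: Sequence[int]) -> Optional[int]:
    """MTU of the first link too small for the packet, or None if it fits."""
    for mtu in link_mtus:
        if size > mtu:
            return mtu
    return None


def discover_path_mtu(initial_size: int, link_mtus: Sequence[int],
                      icmp_filtered: bool, max_attempts: int = 10) -> Optional[int]:
    """Return a packet size that gets through, or None for a blackhole."""
    size = initial_size
    for _ in range(max_attempts):
        bottleneck = first_bottleneck(size, link_mtus)
        if bottleneck is None:
            return size          # packet delivered end to end
        if icmp_filtered:
            continue             # "Fragmentation Needed" is lost; sender learns nothing
        size = bottleneck        # ICMP arrived: shrink to the reported MTU
    return None


path = [9000, 1500, 1400]
print(discover_path_mtu(9000, path, icmp_filtered=False))
print(discover_path_mtu(9000, path, icmp_filtered=True))
```

With ICMP delivered, the sender converges on the smallest link MTU after a couple of tries. With ICMP filtered, it keeps sending the same over-size packet and nothing ever arrives.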

As a consequence of this, some upper-level application protocols actually implement their own blackhole detection, on top of any lower-layer PMTU/segmentation support. An example is EDNS0, which specifies that EDNS0 implementations must take path-MTU into account (above the transport layer!).

So now the internet is crippled by an effective 1500-byte MTU. Though our equipment generally is capable of sending much larger datagrams, we have collectively failed to heed Cerf & Kahn’s wise words. The internet cannot use the handy tool of encapsulation to encrypt packets, or to reroute them to mobile users. Possibly the worst aspect is that IPv6 completely removed fragmentation by intermediary routers. While there’s a good argument that end-to-end packet resizing may be preferable to intermediary fragmentation, IPv6 still relies on out-of-band signalling of over-size packets without addressing that mechanism’s fragility problem, so it has likely cast the MTU-mess into stone for the next generation of inter-networking.

Updated: Some clarifications. Added consequence of how PMTU breaks due to ICMP filtering. Added how ULPs now have to work around these transport layer failings. Added why fragmentation was removed from IPv6, and word-smithed the conclusion a bit.


Letter to the BBC Trust regarding concerns with the BBC iPlayer

This is an edited version of a submission of mine to the BBC Trust, for its recent review of BBC ondemand services.

Overview

The BBC has, no doubt in good faith, acted to frustrate a number of legitimate 3rd party clients for iPlayer which were created outwith the BBC. I believe this is not in keeping with the Charter for the BBC, nor with its remit for BBC Online, with regard to its obligations to provide broad access to all UK residents.

Further, in building BBC iPlayer on the proprietary technology of a single-vendor I believe the BBC helps promote that technology and that single vendor. I believe this distorts the market by requiring that all iPlayer access devices must build on “Flash”. I believe the BBC is acting anti-competitively by restricting 3rd party entrepreneurs from having a choice of technology vendors. Further, due to certain measures, the BBC has managed to make itself the arbiter of who may and who may not access the service.

I believe this situation is unhealthy and damaging to the public interest. I believe in fact there already are examples of the BBC having misused this power contrary to the public interest.

I would urge the BBC Trust to require the BBC to re-examine its iPlayer technologies and:

  • Investigate the use of multi-vendor, open standards, including HTML5 video
  • Engage with vendors and other interested parties in order to draw up, in an open and transparent manner, whatever further protocols are required (e.g. content rights protection protocols) to allow truly neutral access to BBC iPlayer
  • Eliminate the reliance on proprietary, single-vendor technology.
  • Draw down its development of end user/device iPlayer clients, and transfer all possible responsibility for device development to the free market, as is right and proper, and normal for other services such as radio and TV (both analogue and digital)

Background

Flash

As you are aware, the BBC chose to implement its iPlayer catchup/ondemand system on top of the “Flash” platform. This is a technology of Adobe Inc, a US corporation.

While some specifications for “Flash” have been published by Adobe, several key aspects of “Flash” remain undocumented and are highly proprietary to Adobe – particularly the RTMPE and RTMPS protocols, which the BBC use in iPlayer. Further, as far as I am aware, there are no useful implementations of the current version of “Flash” technology other than those offered by Adobe or licensed from Adobe.

Adobe make end-user clients freely available for a number of popular operating systems and computers, such as Microsoft Windows on PC (32 and 64), OS-X on PPC/PC, Linux on 32bit PC. Adobe also license versions of “Flash” to embedded-computer device manufacturers. It is unclear to me whether Adobe require royalties on such players; however, from experience, Adobe do charge an initial porting fee.

Competing video content delivery technologies include:

  • HTML5 video
    • Multi-vendor, industry standard, openly specified by the W3C
    • Implemented in more recent versions of Microsoft Internet Explorer (IE9), Google Chrome, Firefox
    • A number of online video sites have beta-tests of HTML5, including Youtube
    • NB: Firefox bundles only Ogg Theora codec support at this time
  • Microsoft Silverlight
    • Openly specified, with the usual patent issues around video.
    • 2nd source implementation available from the Mono project.
    • Has had some high profile deployment.

3rd Party iPlayer Clients

Since the iPlayer service was started, a number of 3rd party clients were created. These included a plugin for “XBMC”, a software solution to turn a PC into a “media centre”; and “get_iplayer” which is a command line tool to allow Unix/Linux users to access iPlayer.

Both these clients were respectful of the BBC’s content protection policies and restrictions and were careful to honour them. These clients were also subject to the BBC’s “geographic IP checks” (geo-IP) which allowed the BBC to ensure users of the tools were in the UK (to the same extent as it could ensure this for the official BBC iPlayer). These tools did not rely on Adobe technology, and their source is freely licensed. Some users perceived a number of technical advantages, such as being better integrated with their respective environments, having lower CPU usage, etc.

At several stages, the BBC made technical changes to the service which appeared to have no purpose but to frustrate unofficial, 3rd party clients. E.g. the BBC started requiring certain HTTP User-Agent strings.

For quite a while though, the BBC made no further changes, and the clients concerned worked very well for their users. That changed very recently, when the BBC implemented a protocol which requires that the client “prove” that it is the official BBC iPlayer client by providing a numeric value that is calculated from the binary data making up the BBC iPlayer client and from data provided by the server. This numeric value obviously cannot easily be provided by 3rd party clients, unless of course they first download the official iPlayer SWF file from the BBC.

These measures taken by the BBC objectively do not add any immediate end-user value to the service. These measures objectively do interfere with the ability of 3rd party clients, being used legitimately by UK licence fee payers, to access the service. These measures have indeed reduced the base of clients able to access iPlayer, with the developers of get_iplayer having decided to abandon further work due to the repeated frustrating measures taken by the BBC.

Concerns

Concern 1: Public Value and Emerging Communications

In the licence for BBC Online, the Trust asked the following of the BBC with regard to emerging communications, in part II, 5.6:

BBC Online should contribute to the promotion of this purpose in a variety of ways including making its output available across a wide range of IP-enabled platforms and devices.

Additionally, the BBC’s general Charter Agreement states, in clause 12, (1): “must do what is reasonably practicable to ensure that viewers … and other users … are able to access UK Public Services”.

My concern is that the BBC had made available a service, which was being used by UK licence fee payers, in full compliance with all usage restrictions, and it then took technical steps aimed directly at blocking access to the service. To me it is most contrary to the BBC’s remit and its Charter agreement to act to block legitimate access!

I believe this is one example of an abuse, though well-intentioned and in good faith, of the power the BBC has acquired by gaining control over the end-user platform (i.e. by restricting access to BBC iPlayer to the BBC’s Flash-based player, and to a select few approved device-specific clients, such as the Apple iPhone).

Further, Adobe Flash is a roadblock to wide access. Whereas multi-vendor standards tend to be widely available across devices, Adobe Flash is not. This can be because certain embedded system vendors can not afford the moneys required to get Adobe’s attention to port Flash to new platforms (from experience, it’s a not insubstantial amount even for large multi-nationals; which can be an impossibly huge amount for small, low-budget startups).

In essence, the single-vendor nature of Flash is a significant obstacle to the goal of wide deployment over a range of platforms. With all the best will in the world, it is not within the resources of Adobe alone to support all the different kinds of things people would like to play video on, at a price suited to small startups.

Similarly, it is not within the resources of the BBC to properly engage with every aspiring builder of iPlayer access devices.

Concern 2: Market Impact

The BBC Online licence asks the BBC to consider the market impact, in part I, 1:

BBC Online should, at all times, balance the potential for creating public value against the risk of negative market impact.

I will set out my stall and say that I believe that in many spheres of human activity the public interest is best served through free market activity, in a reasonably unrestricted form. While I cherish the BBC for its content, and I agree with its public service remit, I do not believe the public interest is served by having the BBC dictate the end-user platform. Rather, I believe software and device vendors should be free to create and develop products, and the public to pick which is best.

The BBC have managed unfortunately to acquire the power to dictate whose devices and software do and do not have access to iPlayer. It has managed to do so under the guise of “content protection”. Under this pretext the BBC has persuaded the BBC Trust that, unlike prior modern history, the BBC alone must provide or approve of all code for iPlayer viewing that runs on end-user devices. The BBC Trust has gone along with this on a temporary basis.

This gives the BBC an unprecedented power, which it has not had in any other area of end-user device technology in living memory.

As detailed above, the BBC has already wielded this power to cut off clients which it does not approve of – though these clients and their users were acting as legitimately as any other. The BBC claims this is a necessary requirement. However, there is no reason why the BBC cannot, in conjunction with the internet community and interested vendors, prescribe a standard way for clients to implement the BBC’s content protection. Such a standard would afford the BBC all the same protections as it has at present, both technical and legal such as under the Copyright &Related… Act 2003, while allowing 3rd party access on a neutral basis.

There is no need for the BBC to be the king-maker of the device market, and the BBC should not have this power. The BBC should not be allowed to build a new empire over end-user devices. It should be scaling down its iPlayer client development – not up! It should focus on back-end delivery technologies, and provide a standards specified interface to those technologies, just as it does for over-the-air broadcast with DVB-T and DVB-T2, and its participation in setting those standards.

With regard to Adobe: even if the BBC did not make use of Flash to restrict 3rd party iPlayer clients, the reliance on a single vendor is a problem. Adobe may have a reasonably good history of being open and providing royalty free clients for most popular platforms. However, Adobe are a for-profit US corporation, and so they are in no way beholden to the UK public interest. They would be perfectly entitled to start charging royalties in the future if they wished. They have charged device manufacturers royalties in the past in certain scenarios (exactly which is not public, to the best of my knowledge).

Further, Adobe are but a single software company. As stated previously, they do not have the resources to work with every little TV/STB startup. As such, they apply basic economics and raise their fees for enabling Flash on new embedded platforms (porting fees), to the point that demand diminishes to a manageable point. This means small startups may be excluded from enabling iPlayer on their devices.

Finally, the internet has been developing multi-vendor standards for video, as part of HTML5. It is reasonably widely acknowledged that multi-vendor standards are more favourable to the public interest than single-vendor proprietary technologies, all other things being equal.

As such, the BBC ought to have a duty to favour the use of multi-vendor, openly specified standards – to advance the public good.

Conclusion

My conclusions are as given in my introduction. I believe the BBC has, albeit in good faith, misused the control it has gained over end-user platforms with iPlayer by deliberately interfering with usage of iPlayer by legitimate clients.

I believe the BBC, no doubt in good faith and with all good intention, has made a mistake in choosing to build its iPlayer video delivery systems on the proprietary technologies of a single vendor. I believe that these acts and decisions are not in keeping with the Charter, nor with its remit for BBC Online or iPlayer. I believe the public interest demands that the BBC should seek to step back from its current heavy involvement in the intimate mechanics of how end-user devices access iPlayer.

The BBC instead should publish technical documents describing the protocols used to access its services. It should engage all interested parties to formulate and develop any required new protocols, in an open and transparent manner. This is similar to how the BBC acts in other areas, such as with DVB.

I think this is required in order to allow the free market to be able to innovate and so create a range of exciting and useful iPlayer clients.

I think it is vitally important to all this that the BBC use openly specified and published protocols, available on non-discriminatory terms that are implementable by royalty-free software. Such software often forms the infrastructure for future innovation.


Making the Internet Scale Through NAT

(just want to dump this here for future reference)

It’s widely acknowledged that the internet has a scaling problem ahead of it, and an even more imminent addressing problem. The core of internet routing is completely exposed to peripheral growth in mobile/multihomed leaf networks. Various groups are looking at solutions. The IRTF’s RRG has been discussing various solutions. The IETF have a LISP WG developing one particular solution, which is quite advanced.

The essential, common idea is to split an internet address into 2 distinct parts, the “locator” and the “ID” of the actual host. The core routing fabric of the internet then needs only to be concerned with routing to “locator” addresses. Figuring out how to map onward to the (presumably far more numerous and/or less aggregatable) “ID” of the specific end-host to deliver to is then a question that need only concern a boundary layer between the core of the internet and the end-hosts. There are a whole bunch of details here (including the thorny question of what exactly the “ID” is in terms of scope and semantics) which we’ll try to skip over as much as possible for fear of getting bogged down in them. We will note that some proposals, in order to be as transparent and invisible to end-host transport protocols as possible, use a “map and encap” approach – effectively tunneling packets over the core of the internet. Some other proposals use a NAT-like approach, like Six-One or GSE, with routers translating from inside to outside. Most proposals come with reasonable amounts of state, and some proposals appear to have quite complex architectures and/or control planes. E.g. LISP in particular seems quite complex, relative to the “dumb” internet architecture we’re used to, as it seems to try to solve every possible IP multi-homing and mobile-IP problem known to man. Some proposals also rely on IPv6 deployment.

Somewhat related proposals are shim6, which adds a multi-homing capable “shim” layer in between IP and transport protocols like TCP (or “Upper Layer Protocols” / ULPs), and MultiPath-TCP (MPTCP), which aims to add multi-pathing extensions to TCP. Shim6 adds a small, additional state machine to whatever state machines the ULPs require, and is not backwards compatible with non-shimmed hosts (which isn’t a great problem of itself). Shim6 requires IPv6. MPTCP is still in a very early stage, and does not appear to have described any possible proposals yet.

Not too infrequently, well-engineered solutions to some problem may not quite have a high enough unilateral benefit/cost ratio for that solution to enjoy widespread adoption. Then cheap, quick hacks that address the immediate problem without causing too many problems can win out. The clear example in the IP world is NAT. It is therefore interesting to consider what will happen if none of the solutions currently being engineered, mentioned above, gain traction.

E.g. there are quite reasonable solutions which are IPv6 specific, and IPv6 is not exactly setting the world alight. There are other proposals which are v4/v6 agnostic, but require a significant amount of bilateral deployment to be useful, e.g. between ISP and customers, and do not have any obvious immediate advantages to most parties. Etc. So if, for whatever reasons, cost/benefit ratio means none of these solutions are rolled out, and the internet stays effectively v4-only for long enough, then the following will happen:

  1. Use of NAT will become ever more common, as pressure on IPv4 addresses increases
  2. As the use of NAT increases, the pressure on the transport layer port space (now press-ganged into service as an additional cross-host flow identifier) will increase, causing noticeable problems to end-user applications.
  3. As NAT flow-ID resources become ever more precious, so applications will become more careful to multiplex application protocols over available connections.
  4. However, with increased internet growth, even NAT will become more and more a luxury, and so ever more applications will be forced to rely on application layer proxies, to allow greater concentration of applications to public IP connections – at which stage you no longer have an internet.

The primary problems in this scenario are:

  • The transport protocol’s port number space has been repurposed as a cross-host flow-ID
  • That port number space is far too constraining for this purpose, as it’s just 2 bytes
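To put a rough number on how constraining 2 bytes is, here is a back-of-the-envelope Python sketch. It assumes, pessimistically, that every concurrent flow through the NAT consumes one external port on a public IP. Real NATs can sometimes reuse a port across different remote endpoints, so this is a simplification, and the function name is my own.

```python
PORT_BITS = 16                         # the transport port field is 2 bytes
PORTS_PER_PUBLIC_IP = 2 ** PORT_BITS   # 65536 possible port values


def flows_per_host(hosts_behind_nat: int, public_ips: int = 1) -> int:
    """Average concurrent flows each host gets, one external port per flow."""
    return (PORTS_PER_PUBLIC_IP * public_ips) // hosts_behind_nat


for hosts in (100, 1_000, 10_000):
    print(f"{hosts:>6} hosts behind one public IP -> "
          f"{flows_per_host(hosts)} flows each")
```

Under that assumption, a concentrator with thousands of hosts behind a single address leaves each host only a handful of simultaneous connections, which a single modern web page can easily exceed.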

The quickest hack fix is to extend the transport IDs. With TCP this is relatively easy to do, by defining a new TCP option. E.g. say we add a 2 * 4-byte host ID option, one ID for src, one for the dst. When replying to a packet that carried the option, you would set the dst to the received src, obviously. This would give NAT concentrators an additional number space to help associate packets with flows and the right NAT mappings.
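As a sketch of what such an option might look like on the wire, here is some Python using the standard `struct` module. The layout (1-byte kind, 1-byte length, then two 4-byte IDs) follows the usual kind/length/value shape of TCP options, but the kind value below is a placeholder I picked for illustration. No such option is registered anywhere.

```python
import struct

OPTION_KIND = 253            # placeholder value, purely hypothetical
OPTION_FORMAT = "!BBII"      # kind, length, src host ID, dst host ID
OPTION_LENGTH = struct.calcsize(OPTION_FORMAT)  # 2 + 2*4 = 10 bytes


def encode_option(src_id: int, dst_id: int) -> bytes:
    """Build the option bytes carrying a source and destination host ID."""
    return struct.pack(OPTION_FORMAT, OPTION_KIND, OPTION_LENGTH, src_id, dst_id)


def decode_option(data: bytes) -> tuple:
    """Return (src_id, dst_id) from option bytes, checking kind and length."""
    kind, length, src_id, dst_id = struct.unpack(OPTION_FORMAT, data)
    if kind != OPTION_KIND or length != OPTION_LENGTH:
        raise ValueError("not a host-ID option")
    return src_id, dst_id


def reply_option(received: bytes) -> bytes:
    """Option for a reply: the received src becomes the dst, and vice versa."""
    src_id, dst_id = decode_option(received)
    return encode_option(src_id=dst_id, dst_id=src_id)


request = encode_option(src_id=0x0A000001, dst_id=0x0A000002)
print(len(request), decode_option(reply_option(request)))
```

The option comes to 10 bytes, which is the 2*4+2 figure that makes it awkward to squeeze into an already crowded TCP header.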

This only fixes things for TCP, plus the space in the TCP header for options is becoming quite crowded and it’s not always possible to fit another 2*4+2 bytes in (properly aligned). IP also has an option mechanism, so we could define the option for IP and it would work for all IP protocols. However, low level router designers yelp in disgust at such suggestions, as they dislike having to parse out variably placed fields in HDL; indeed it is common for high-speed hardware routers to punt packets with IP options out to a slow, software data-path. The remaining possibility is a Shim style layer, i.e. in between IP and the ULPs, but that has 0 chance of graceful fallback compatibility with existing transports.

Let’s go with the IP option header as the least worst choice. It might lead to slower connectivity, but then again if your choice is between “no connectivity, because a NAT box in the middle is maxed out” and “slow connectivity, but connectivity still” then it’s better than nothing. Further, if such use of an IP option became wide-spread you can bet the hardware designers would eventually hold their noses and figure out a way to make common case forwarding fast even in its presence – only NAT concentrators would need to care about it.

Ok, so where are we now? We’ve got an IP option that adds 2 secondary host IDs to the IP header, thus allowing NAT concentrators to work better. Further, NAT concentrators could now even be stateless when it comes to processing packets that have this option. All a concentrator has to do is copy the dst address from the option into the IP header dst. This would allow the NATed hosts to be reachable from the outside world! The IDs don’t even have to be 4 bytes each, per se.
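Here is a toy Python model of that stateless step, just to show how little the concentrator needs to do. The `Packet` class and its field names are invented for this sketch, and the addresses are documentation examples.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Packet:
    src: str                       # IP header source
    dst: str                       # IP header destination
    opt_src: Optional[str] = None  # secondary host ID of the sender, if any
    opt_dst: Optional[str] = None  # secondary host ID of the target, if any


def concentrator_inbound(pkt: Packet, concentrator_addr: str) -> Packet:
    """Stateless inbound handling: no per-flow mapping table is consulted.

    If the packet is addressed to this concentrator and carries the option,
    the option's dst is copied into the IP header dst.
    """
    if pkt.dst == concentrator_addr and pkt.opt_dst is not None:
        return replace(pkt, dst=pkt.opt_dst)
    return pkt


incoming = Packet(src="198.51.100.7", dst="203.0.113.1", opt_dst="10.0.0.1")
print(concentrator_inbound(incoming, "203.0.113.1"))
```

Packets without the option fall through untouched, so a real box would presumably still hand those to its ordinary, stateful NAT path.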

Essentially you now have split addressing into 2. You could think of it as a split in terms of “internet” or “global network ID” and “end-host” ID, a bit like 6-to-1 or other older proposals I can’t remember the name of now, however it’s better to think of it as being 2 different addressing “scopes”:

  • The current forwarding scope, i.e. the IP header src/dst
  • The next forwarding scope, i.e. the option src/dst, when the dst differs from the current scope dst

NAT boxes now become forwarding-scope context change points. End-host addressing in a sense can be thought of as end-host@concentrator. Note that the common, core scope of the internet can be blissfully unbothered by any details of the end-host number space. Indeed, if there’s no need for cross-scope communication then they can even use different addressing technologies, e.g. one IPv4 the other IPv6 (with some modification to the “copy address” scheme above).

Note that the end-host IDs no longer need be unique, e.g. 10.0.0.1@concentrator1 can be a different host from 10.0.0.1@concentrator2. In an IPv4 world, obviously the end-host IDs (or outside/non-core scope IDs) would not be globally unique. However, this scheme has benefits even with globally unique addresses, e.g. public-IP@concentrator still is beneficial because it would allow the core internet to not have to carry the prefixes for stub/leaf public IP prefixes.

You could take it a bit further and allow for prepending of scopes in this option, so you have a stack of scopes – like the label stack in MPLS – if desired. This would allow a multi-layered onion of forwarding: end-host@inner@outer. Probably not needed though.
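A rough sketch of how such a scope stack might behave, with the `end-host@inner@outer` notation parsed into a list and each scope-change point popping the next destination into the header. The function names and the list representation are my own assumptions, not part of any defined scheme.

```python
from typing import List, Tuple


def parse_scoped(addr: str) -> List[str]:
    """'end-host@inner@outer' -> ['outer', 'inner', 'end-host'] (outermost first)."""
    return list(reversed(addr.split("@")))


def cross_scope(header_dst: str, stack: List[str]) -> Tuple[str, List[str]]:
    """At a scope-change point, pop the next scope's dst into the header.

    With an empty stack there is nothing left to pop, so the header
    destination is returned unchanged.
    """
    if not stack:
        return header_dst, stack
    return stack[0], stack[1:]


# The sender puts the outermost scope in the IP header and the rest in the option.
scopes = parse_scoped("10.0.0.1@inner@outer")
header_dst, stack = scopes[0], scopes[1:]

# Forwarded on 'outer' until the first scope-change point, then on 'inner', etc.
hop1 = cross_scope(header_dst, stack)
hop2 = cross_scope(*hop1)
```

With a single entry in the stack this reduces to the plain 2-scope case of end-host@concentrator.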

What do we have now:

  • An IP option that adds a secondary addressing scope to the packet
  • A 2-layer system of forwarding, extendible to n-layer
  • Decoupling of internet ID space from end-host ID space
  • Compatible with existing ULPs
  • Backwards compatible with IPv4 for at least some use cases
  • Allows NAT to scale
  • Relatively minor extension to existing, widely accepted and deployed NAT architecture
  • Allows scalable use of PI addressing
  • No per-connection signalling required
  • Minimal state

On the last point, the system clearly is not dynamic, as Shim6 tries to be. However, if the problem being solved is provider independence in the sense of being able to change providers without having to renumber internally, then this hack is adequate. Further, even if the reachability of network@concentrator is not advertised to the internet, the reachability of concentrator must be advertised at least. So there is some potential for further layering of reachability mechanisms onto this scheme.

No doubt most readers are thinking “You’re clearly on crack”. Which is what I’d have thought if I’d read the above a year or 5 ago. However, there seems to be a high level of inertia on today’s internet for solving the addressing crunch through NAT. There seems to be little incentive to deploy IPv6. IPv6 also doesn’t solve multi-homing or routing scalability, the solutions for which all have complexity and/or compatibility issues to varying degrees. Therefore I think it is at least worth considering what happens if the internet sticks with IPv4 and NAT even as the IPv4 addressing crunch bites.

As I show above, it is at least plausible to think there may be schemes that are compatible with IPv4 and allow the internet to scale up, which can likely be extended later to allow general connectivity across the NAT/scope boundaries, still with a level of backwards compatibility. Further, this scheme does not require any dynamic state or signalling protocols, beyond initial administrative setup. However, the scheme does allow for future augmentation to add dynamic behaviours.

In short, schemes can be devised which, while not solving every problem immediately, can deliver incremental benefits, by solving just a few pressing problems first and being extendible to the remaining problems over time. It’s at least worth thinking about them.

[Corrections for typos, nits and outright crack-headedness would be appreciated, as well as any comments]


Wisdom of Squid for More Security Sensitive Applications

CVE-2009-2621 was released last week. The issues seem to include a buffer-overflow, which implies a potentially easy remote root shell exploit [1]. So it’s perhaps worth repeating my previous claim that Squid really should not be used for security-sensitive applications.

This is not to take away from Squid. As the Swiss Army knife of web-caching/proxying software it’s awesomely valuable software. It’s just that building for security is somewhat at odds with software intended to be highly flexible and featureful.

Even if Squid’s code were 100% defect-free, managing its default feature-set remains a significant security problem. E.g. my own ISP’s Squid-using Pædo-sieve still honours CONNECT to anywhere, proving just how hard it is to configure such software to disable unneeded features – and they were made aware of the problem many months ago. Prior to that they had forgotten to lock down access to the internal in-band management protocol (accessible by using CONNECT proxying to localhost). They only recently disabled SMTP on the Pædo-sieve, possibly because they received email sent through it via the Squid proxy.

1. The bug report mentions this is triggerable via ESI, which I don’t think is relevant to a Pædo-sieve, but as it was in some generic string storage code there may well be many other paths to the bug.

Also, bugs were found in the HTTP header parser, where it will assert if fed a very large header. Relying on assert() is somewhat fragile, as assertions can be compiled out (e.g. by building with NDEBUG defined). Indeed, the fix for the ESI buffer overflow is to add an assert, it seems. Not confidence inspiring, security-wise at least.
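To illustrate the general point (in Python rather than Squid’s own code, with a made-up limit and made-up function names): an assertion is a debugging aid that the build or runtime can strip, whereas an explicit check stays put.

```python
MAX_HEADER_BYTES = 64 * 1024  # hypothetical limit, for illustration only


def check_header_size_fragile(header: bytes) -> None:
    # Removed entirely when Python runs with -O, much as C's assert()
    # expands to nothing when NDEBUG is defined.
    assert len(header) <= MAX_HEADER_BYTES, "header too large"


def check_header_size(header: bytes) -> None:
    # An explicit check is always executed, whatever the optimisation flags.
    if len(header) > MAX_HEADER_BYTES:
        raise ValueError("header too large")
```

Only the second form can be relied on to reject oversized input in a production build.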
