Archive for Technology

BBC Response to my iPlayer DRM FoI Request

Got my response from the BBC regarding who is forcing them to implement content protection, as their upper management has publicly stated, and how they came to their decision on it. It’s pretty disappointing really.

They’ve refused the part of my request seeking to determine exactly who is requiring it of the BBC, on the grounds that this information is held “for the purposes of ‘journalism, art or literature’”. The second part they refused generally on the grounds of it being too burdensome – my request was phrased poorly, it seems. They did release one document, though it is heavily redacted.

Feel free to read the “final response” and the released document, ERTG “Content Protection Update”.

Suggestions on what, if anything, to do next would be useful.

Comments (3)

Letter to the BBC Trust regarding concerns with the BBC iPlayer

This is an edited version of a submission of mine to the BBC Trust, for its recent review of BBC on-demand services.

Overview

The BBC has, no doubt in good faith, acted to frustrate a number of legitimate 3rd party clients for iPlayer which were created outwith the BBC. I believe this is not in keeping with the Charter for the BBC, nor with its remit for BBC Online, with regard to its obligations to provide broad access to all UK residents.

Further, in building BBC iPlayer on the proprietary technology of a single vendor, I believe the BBC helps promote that technology and that single vendor. I believe this distorts the market by requiring that all iPlayer access devices must build on “Flash”. I believe the BBC is acting anti-competitively by restricting 3rd party entrepreneurs from having a choice of technology vendors. Further, due to certain measures, the BBC has managed to make itself the arbiter of who may and who may not access the service.

I believe this situation is unhealthy and damaging to the public interest. In fact, I believe there are already examples of the BBC having misused this power contrary to the public interest.

I would urge the BBC Trust to require the BBC to re-examine its iPlayer technologies and:

  • Investigate the use of multi-vendor, open standards, including HTML5 video
  • Engage with vendors and other interested parties in order to draw up, in an open and transparent manner, whatever further protocols are required (e.g. content rights protection protocols) to allow truly neutral access to BBC iPlayer
  • Eliminate the reliance on proprietary, single-vendor technology
  • Draw down its development of end user/device iPlayer clients, and transfer all possible responsibility for device development to the free market, as is right and proper, and normal for other services such as radio and TV (both analogue and digital)

Background

Flash

As you are aware, the BBC chose to implement its iPlayer catch-up/on-demand system on top of the “Flash” platform. This is a technology of Adobe Inc., a US corporation.

While some specifications for “Flash” have been published by Adobe, several key aspects of “Flash” remain undocumented and are highly proprietary to Adobe – particularly the RTMPE and RTMPS protocols, which the BBC use in iPlayer. Further, there are, to my knowledge, no useful implementations of the current version of “Flash” technology other than those offered by Adobe or licensed from Adobe.

Adobe make end-user clients freely available for a number of popular operating systems and computers, such as Microsoft Windows on PC (32- and 64-bit), OS X on PPC/PC, and Linux on 32-bit PC. Adobe also license versions of “Flash” to embedded-computer device manufacturers. It is unclear to me whether Adobe require royalties on such players; however, from experience, Adobe do charge an initial porting fee.

Competing video content delivery technologies include:

  • HTML5 video
    • Multi-vendor, industry standard, openly specified by the W3C
    • Implemented in more recent versions of Microsoft Internet Explorer (IE9), Google Chrome and Firefox
    • A number of online video sites have beta tests of HTML5, including YouTube
    • NB: Firefox bundles only Ogg Theora codec support at this time
  • Microsoft Silverlight
    • Openly specified, with the usual patent issues around video.
    • 2nd source implementation available from the Mono project.
    • Has had some high profile deployment.

3rd Party iPlayer Clients

Since the iPlayer service was started, a number of 3rd party clients have been created. These included a plugin for “XBMC”, a software solution to turn a PC into a “media centre”; and “get_iplayer”, a command-line tool that allows Unix/Linux users to access iPlayer.

Both these clients were respectful of the BBC’s content protection policies and restrictions and were careful to honour them. These clients were also subject to the BBC’s “geographic IP checks” (geo-IP), which allowed the BBC to ensure users of the tools were in the UK (to the same extent as it could ensure this for the official BBC iPlayer). These tools did not rely on Adobe technology, and their source is freely licensed. Some users perceived a number of technical advantages, such as being better integrated with their respective environments, having lower CPU usage, etc.

At several stages, the BBC made technical changes to the service which appeared to have no purpose but to frustrate unofficial, 3rd party clients. E.g. the BBC started requiring certain HTTP User-Agent strings.

For quite a while though, the BBC made no further changes, and the clients concerned worked very well for their users – until very recently, when the BBC implemented a protocol which requires that the client “prove” that it is the official BBC iPlayer client by providing a numeric value that is calculated from the binary data making up the BBC iPlayer client and from data provided by the server. This numeric value can obviously not be easily provided by 3rd party clients – unless, of course, they first download the official iPlayer SWF file from the BBC.
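The BBC’s exact algorithm is not public, and the details are beside the point, but the general shape of such a challenge–response is easy to illustrate: a keyed digest over the client binary, which only a party holding the official SWF bytes can produce. Here is a minimal sketch in Python, assuming an HMAC-SHA256 construction – the function, key handling and parameters are illustrative assumptions, not the BBC’s actual scheme:

    import hashlib
    import hmac

    def prove_official_client(swf_bytes: bytes, server_challenge: bytes) -> bytes:
        """Hypothetical 'SWF verification' response: a keyed digest over the
        client binary. Only a party holding the official SWF bytes (or willing
        to fetch them first) can compute it. Not the BBC's actual algorithm."""
        return hmac.new(server_challenge, swf_bytes, hashlib.sha256).digest()

    # A 3rd party client can still answer, but only by first downloading the
    # official SWF from the BBC and hashing it on demand:
    #   response = prove_official_client(open("iplayer.swf", "rb").read(), challenge)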

These measures taken by the BBC objectively do not add any immediate end-user value to the service. These measures objectively do interfere with the ability of 3rd party clients, being used legitimately by UK licence fee payers, to access the service. These measures have indeed reduced the base of clients able to access iPlayer, with the developers of get_iplayer having decided to abandon further work due to the repeated obstructive measures taken by the BBC.

Concerns

Concern 1: Public Value and Emerging Communications

In the licence for BBC Online, the Trust asked the following of the BBC with regard to emerging communications, in part II, 5.6:

BBC Online should contribute to the promotion of this purpose in a variety of ways including making its output available across a wide range of IP-enabled platforms and devices.

Additionally, the BBC’s general Charter Agreement states, in clause 12(1), that the BBC “must do what is reasonably practicable to ensure that viewers … and other users … are able to access UK Public Services”.

My concern is that the BBC had made available a service, which was being used by UK licence fee payers in full compliance with all usage restrictions, and it then took technical steps aimed directly at blocking access to the service. To me it is most contrary to the BBC’s remit and its Charter Agreement to act to block legitimate access!

I believe this is one example of an abuse, though well-intentioned and in good faith, of the power the BBC has acquired by gaining control over the end-user platform (i.e. by restricting access to BBC iPlayer to the BBC’s Flash-based player, and to a select few approved, device-specific clients, such as the one for the Apple iPhone).

Further, Adobe Flash is a roadblock to wide access. Whereas multi-vendor standards tend to be widely available across devices, Adobe Flash is not. This can be because certain embedded-system vendors cannot afford the money required to get Adobe’s attention to port Flash to new platforms (from experience, it’s a not insubstantial amount even for large multi-nationals, and can be an impossibly huge amount for small, low-budget startups).

In essence, the single-vendor nature of Flash is a significant obstacle to the goal of wide deployment over a range of platforms. With all the best will in the world, it is not within the resources of Adobe alone to support all the different kinds of things people would like to play video on, at a price suited to small startups.

Similarly, it is not within the resources of the BBC to properly engage with every aspiring builder of iPlayer access devices.

Concern 2: Market Impact

The BBC Online licence asks the BBC to consider the market impact, in part I, 1:

BBC Online should, at all times, balance the potential for creating public value against the risk of negative market impact.

I will set out my stall and say that I believe that in many spheres of human activity the public interest is best served through free market activity, in a reasonably unrestricted form. While I cherish the BBC for its content, and I agree with its public service remit, I do not believe the public interest is served by having the BBC dictate the end-user platform. Rather, I believe software and device vendors should be free to create and develop products, and the public to pick which is best.

The BBC has unfortunately managed to acquire the power to dictate whose devices and software do and do not have access to iPlayer. It has managed to do so under the guise of “content protection”. Under this pretext the BBC has persuaded the BBC Trust that, in a break with prior modern history, the BBC alone must provide or approve all code for iPlayer viewing that runs on end-user devices. The BBC Trust has gone along with this on a temporary basis.

This gives the BBC an unprecedented power – one it has not had in any other area of end-user device technology in living memory.

As detailed above, the BBC has already wielded this power to cut off clients which it does not approve of – though these clients and their users were acting as legitimately as any other. The BBC claims this is a necessary requirement. However, there is no reason why the BBC cannot, in conjunction with the internet community and interested vendors, prescribe a standard way for clients to implement the BBC’s content protection – one that would afford the BBC all the same protections as it has at present, both technical and legal (such as under the Copyright & Related… Act 2003), while allowing 3rd party access on a neutral basis.

There is no need for the BBC to be the king-maker of the device market, and the BBC should not have this power. The BBC should not be allowed to build a new empire over end-user devices. It should be scaling down its iPlayer client development – not up! It should focus on back-end delivery technologies, and provide a standards-specified interface to those technologies, just as it does for over-the-air broadcast with DVB-T and DVB-T2, and through its participation in setting those standards.

With regard to Adobe: even if the BBC did not make use of Flash to restrict 3rd party iPlayer clients, the reliance on a single vendor is a problem. Adobe may have a reasonably good history of being open and providing royalty-free players for most popular platforms. However, Adobe are a for-profit US corporation, and so they are in no way beholden to the UK public interest. They would be perfectly entitled to start charging royalties in the future if they wished. They have charged device manufacturers royalties in the past in certain scenarios (exactly which is not public, to the best of my knowledge).

Further, Adobe are but a single software company. As stated previously, they do not have the resources to work with every little TV/STB startup. As such, they apply basic economics and raise their fees for enabling Flash on new embedded platforms (porting fees) to the point that demand diminishes to a manageable level. This means small startups may be excluded from enabling iPlayer on their devices.

Finally, the internet has been developing multi-vendor standards for video, as part of HTML5. It is reasonably widely acknowledged that multi-vendor standards are more favourable to the public interest than single-vendor proprietary technologies, all other things being equal.

As such, the BBC ought to have a duty to favour the use of multi-vendor, openly specified standards – to advance the public good.

Conclusion

My conclusions are as given in my introduction. I believe the BBC has, albeit in good faith, misused the control it has gained over end-user platforms with iPlayer by deliberately interfering with usage of iPlayer by legitimate clients.

I believe the BBC, no doubt in good faith and with all good intention, has made a mistake in choosing to build its iPlayer video delivery systems on the proprietary technologies of a single vendor. I believe that these acts and decisions are not in keeping with the Charter, nor with its remit for BBC Online or iPlayer. I believe the public interest demands that the BBC should seek to step back from its current heavy involvement in the intimate mechanics of how end-user devices access iPlayer.

The BBC instead should publish technical documents describing the protocols used to access its services. It should engage all interested parties to formulate and develop any required new protocols, in an open and transparent manner. This is similar to how the BBC acts in other areas, such as with DVB.

I think this is required in order to allow the free market to be able to innovate and so create a range of exciting and useful iPlayer clients.

I think it is vitally important to all this that the BBC use openly specified and published protocols, available on non-discriminatory terms that are implementable by royalty-free software. Such software often forms the infrastructure for future innovation.

Comments (6)

Network Activity Visualisations

I’m tinkering on a network protocol simulator at the moment. For debug purposes, it can provide some basic visualisation of what’s going on, e.g. highlighting links which are transmitting messages. These visualisations can sometimes be mesmerising, I find. To avoid turning the front page of this blog into a blinking mess, I’ve put them on their own page.

Comments (2)

Making the Internet Scale Through NAT

(just want to dump this here for future reference)

It’s widely acknowledged that the internet has a scaling problem ahead of it, and an even more imminent addressing problem. The core of internet routing is completely exposed to peripheral growth in mobile/multihomed leaf networks. Various groups are looking at solutions: the IRTF’s RRG has been discussing a number of them, and the IETF have a LISP WG developing one particular solution, which is quite advanced.

The essential, common idea is to split an internet address into 2 distinct parts, the “locator” and the “ID” of the actual host. The core routing fabric of the internet then needs only to be concerned with routing to “locator” addresses. Figuring out how to map onward to the (presumably far more numerous and/or less aggregatable) “ID” of the specific end-host to deliver to is then a question that need only concern a boundary layer between the core of the internet and the end-hosts.

There are a whole bunch of details here (including the thorny question of what exactly the “ID” is in terms of scope and semantics) which we’ll try to skip over as much as possible for fear of getting bogged down in them. We will note that some proposals, in order to be as transparent and invisible to end-host transport protocols as possible, use a “map and encap” approach – effectively tunneling packets over the core of the internet. Some other proposals use a NAT-like approach, like Six-One or GSE, with routers translating from inside to outside. Most proposals come with reasonable amounts of state, and some appear to have quite complex architectures and/or control planes. LISP in particular seems quite complex, relative to the “dumb” internet architecture we’re used to, as it seems to try to solve every possible IP multi-homing and mobile-IP problem known to man. Some proposals also rely on IPv6 deployment.

Somewhat related proposals are shim6, which adds a multi-homing-capable “shim” layer in between IP and transport protocols like TCP (or “Upper Layer Protocols” / ULPs), and MultiPath-TCP (MPTCP), which aims to add multi-pathing extensions to TCP. Shim6 adds a small, additional state machine alongside whatever state machines the ULPs require, and is not backwards compatible with non-shimmed hosts (which isn’t a great problem in itself). Shim6 requires IPv6. MPTCP is still at a very early stage, and does not appear to have produced any concrete proposals yet.

Not too infrequently, well-engineered solutions to some problem may not quite have a high enough unilateral benefit/cost ratio for that solution to enjoy widespread adoption. Then cheap, quick hacks that address the immediate problem without causing too many problems can win out – the clear example in the IP world being NAT. It is therefore interesting to consider what will happen if none of the solutions currently being engineered, mentioned above, gain traction.

E.g. there are quite reasonable solutions which are IPv6-specific, and IPv6 is not exactly setting the world alight. There are other proposals which are v4/v6 agnostic, but require a significant amount of bilateral deployment to be useful, e.g. between ISPs and customers, and do not have any obvious immediate advantages to most parties. Etc. So if, for whatever reasons, the cost/benefit ratio means none of these solutions are rolled out, and the internet stays effectively v4-only for long enough, then the following will happen:

  1. Use of NAT will become ever more common, as pressure on IPv4 addresses increases
  2. As the use of NAT increases, the pressure on the transport layer port space (now press-ganged into service as an additional cross-host flow identifier) will increase, causing noticeable problems to end-user applications.
  3. As NAT flow-ID resources become ever more precious, so applications will become more careful to multiplex application protocols over available connections.
  4. However, with increased internet growth, even NAT will become more and more of a luxury, and so ever more applications will be forced to rely on application-layer proxies, to allow greater concentration of applications onto public IP connections – at which stage you no longer have an internet.

The primary problems in this scenario are:

  • The transport protocol’s port number space has been repurposed as a cross-host flow-ID
  • That port number space is far too constraining for this purpose, as it’s just 2 bytes

The quickest hack fix is to extend the transport IDs. With TCP this is relatively easy to do, by defining a new TCP option. E.g. say we add a 2 * 4-byte host ID option, one ID for src, one for the dst. When replying to a packet that carried the option, you would set the dst to the received src, obviously. This would give NAT concentrators an additional number space to help associate packets with flows and the right NAT mappings.
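For concreteness, here is a rough sketch of what packing such an option might look like; the option kind, layout and field names are invented for illustration, not a real TCP option:

    import struct

    HOST_ID_OPT_KIND = 0xFD  # an experimental TCP option kind; purely illustrative

    def pack_host_id_option(src_id: int, dst_id: int) -> bytes:
        """Hypothetical TCP option carrying two 4-byte host IDs:
        kind (1 byte), length (1 byte), src ID (4 bytes), dst ID (4 bytes)."""
        return struct.pack("!BBII", HOST_ID_OPT_KIND, 10, src_id, dst_id)

    def reply_option(received: bytes, own_id: int) -> bytes:
        """When replying to a packet carrying the option, the received src ID
        becomes our dst ID."""
        _kind, _length, peer_src, _peer_dst = struct.unpack("!BBII", received)
        return pack_host_id_option(own_id, peer_src)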

This only fixes things for TCP; plus, the space in the TCP header for options is becoming quite crowded and it’s not always possible to fit another 2*4+2 bytes in (properly aligned). IP also has an option mechanism, so we could define the option for IP and it would work for all IP protocols. However, low-level router designers yelp in disgust at such suggestions, as they dislike having to parse out variably placed fields in HDL; indeed it is common for high-speed hardware routers to punt packets with IP options out to a slow, software data-path. The remaining possibility is a shim-style layer, i.e. in between IP and the ULPs, but that has no chance of graceful fallback compatibility with existing transports.

Let’s go with the IP option header as the least worst choice. It might lead to slower connectivity, but then again if your choice is between “no connectivity, because a NAT box in the middle is maxed out” and “slow connectivity, but connectivity still”, then it’s better than nothing. Further, if such use of an IP option became widespread you can bet the hardware designers would eventually hold their noses and figure out a way to make common-case forwarding fast even in its presence – only NAT concentrators would need to care about it.

OK, so where are we now? We’ve got an IP option that adds 2 secondary host IDs to the IP header, thus allowing NAT concentrators to work better. Further, a NAT concentrator could now even be stateless when it comes to processing packets that have this option: all it has to do is copy the dst address from the option into the IP header dst. This would allow the NATed hosts to be reachable from the outside world! The IDs don’t even have to be 4 bytes each, per se.
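As a sketch of just how little an option-aware concentrator would have to do on the inbound path (the packet representation here is simplified for illustration):

    from dataclasses import dataclass

    @dataclass
    class Packet:
        src: str      # IP header src (current scope)
        dst: str      # IP header dst (current scope)
        opt_src: str  # option: sender's host ID in the next scope
        opt_dst: str  # option: destination's host ID in the next scope

    def concentrator_inbound(pkt: Packet, own_addr: str) -> Packet:
        """Stateless scope change at the NAT concentrator: the packet arrives
        addressed to the concentrator; forwarding it into the inner scope just
        means copying the option's dst into the IP header dst. No per-flow
        mapping state is needed for packets that carry the option."""
        assert pkt.dst == own_addr, "not addressed to this concentrator"
        pkt.dst = pkt.opt_dst
        return pkt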

Essentially you have now split addressing in two. You could think of it as a split in terms of an “internet” or “global network ID” and an “end-host” ID, a bit like Six-One or other older proposals whose names I can’t remember now; however, it’s better to think of it as being 2 different addressing “scopes”:

  • The current forwarding scope, i.e. the IP header src/dst
  • The next forwarding scope, i.e. the option src/dst, when the dst differs from the current scope dst

NAT boxes now become forwarding-scope context change points. End-host addressing in a sense can be thought of as end-host@concentrator. Note that the common, core scope of the internet can be blissfully unbothered by any details of the end-host number space. Indeed, if there’s no need for cross-scope communication then they can even use different addressing technologies, e.g. one IPv4 the other IPv6 (with some modification to the “copy address” scheme above).

Note that the end-host IDs no longer need be unique, e.g. 10.0.0.1@concentrator1 can be a different host from 10.0.0.1@concentrator2. In an IPv4 world, obviously the end-host IDs (or outside/non-core scope IDs) would not be globally unique. However, this scheme has benefits even with globally unique addresses, e.g. public-IP@concentrator still is beneficial because it would allow the core internet to not have to carry the prefixes for stub/leaf public IP prefixes.

You could take it a bit further and allow for prepending of scopes in this option, so you have a stack of scopes – like the label stack in MPLS – if desired. This would allow a multi-layered onion of forwarding: end-host@inner@outer. Probably not needed though.

What do we have now:

  • An IP option that adds a secondary addressing scope to a packet
  • A 2-layer system of forwarding, extendible to n-layer
  • Decoupling of internet ID space from end-host ID space
  • Compatible with existing ULPs
  • Backwards compatible with IPv4 for at least some use cases
  • Allows NAT to scale
  • Relatively minor extension to existing, widely accepted and deployed NAT architecture
  • Allows scalable use of PI addressing
  • No per-connection signalling required
  • Minimal state

On the last point, the system clearly is not dynamic, as Shim6 tries to be. However, if the problem being solved is provider independence, in the sense of being able to change providers without having to renumber internally, then this hack is adequate. Further, even if the reachability of network@concentrator is not advertised to the internet, the reachability of the concentrator must at least be advertised. So there is some potential for further layering of reachability mechanisms onto this scheme.

No doubt most readers are thinking “You’re clearly on crack” – which is what I’d have thought if I’d read the above a year or 5 ago. However, there seems to be a high level of inertia on today’s internet towards solving the addressing crunch through NAT. There seems to be little incentive to deploy IPv6. IPv6 also doesn’t solve multi-homing or routing scalability, the solutions for which all have complexity and/or compatibility issues to varying degrees. Therefore I think it is at least worth considering what happens if the internet sticks with IPv4 and NAT even as the IPv4 addressing crunch bites.

As I show above it is at least plausible to think there may be schemes that are compatible with IPv4 and allow the internet to scale up, which can likely be extended later to allow general connectivity across the NAT/scope boundaries, still with a level of backwards compatibility. Further, this scheme does not require any dynamic state or signalling protocols, beyond initial administrative setup. However, the scheme does allow for future augmentation to add dynamic behaviours.

In short, schemes can be devised which, while not solving every problem immediately, can deliver incremental benefits, by solving just a few pressing problems first and being extendible to the remaining problems over time. It’s at least worth thinking about them.

[Corrections for typos, nits and outright crack-headedness would be appreciated, as well as any comments]

Comments (4)

Killing Free Software with Kindness

Free Software and Open-Source has gained a massive foothold in industry. Mobile phones, digital TVs, laser printers, etc. nearly all build on free software. It is actually hard these days to find embedded devices which do not contain free software. By any measure of deployment, free software is now wildly successful. That success brings economic benefits to free software hackers, in terms of employment.

Free Software has its roots in a desire to be able to tinker on software. In particular, it’s distinguished from hobbyist programming by a desire to be able to fix or improve software in devices one owns or has access to. As such, Free Software is an ethical construct which argues that vendors have a duty to their users to provide them with the ability to modify the code on the device (this is the “freedom” part). It has been further developed to argue that free software is economically a more efficient way to develop software.

As free software gained adoption, there has perhaps been an increasing trend amongst its developers to weigh the economic aspects far more heavily than the freedom to modify code. E.g. Linus Torvalds has stated he is fine with vendors using DRM on Linux devices to lock down the code, which could prevent users from modifying those devices. Certain former BusyBox developers seem more concerned that big corporations use free software than that they provide source. There are many other calls for free-software copyright holders to be “pragmatic” about licensing.

Effectively, many developers now take an economic view of free software. They view free software perhaps purely as an advertisement of skills, a professional tool to build reputation, and whatnot. The ethical “end-user freedom” aspect is no longer significant for many, who view it as unpragmatic. This leads to a “softly-softly” approach to Free Software licence enforcement being advocated by some, effectively tolerating vendors who do little to honour their apparent licence obligations.

However, I would argue that this is short-sighted. Firstly, if one does not believe in enforcing free software licences like the GPL, then what difference does one think there is between GPL and BSD? Secondly, consider the following:

Imagine 2 companies:

  • BadCorp
  • GoodCorp

BadCorp uses Linux, as it’s cheap and saves them precious per-unit $$ in royalties. Their upper management have no great appreciation for free software and/or open source beyond expediency. They will release only as little code and material as they have to. They have a relatively large in-house legal team who are certainly capable of figuring out what their responsibilities are in distributing such software. They have been engaged with in the past by free software organisations about their licensing failings; however, they still fail to provide full sources to the GPL software they distribute in their devices – i.e. there can be no doubt this behaviour is deliberate.

GoodCorp is medium-sized. They also use Linux because it’s cheap. However, they have people in their top-level management, perhaps founder-engineers, who still appreciate some of the reasoning behind free software, such as ethics and longer-term efficiency. The company doesn’t have huge resources, but they are willing to put a little engineering effort into longer-term issues, like going through the community processes to get code reviewed and integrated upstream, as they can see some longer-term value in doing so, even if it is less tangible. They are meticulous about making sure they comply with the GPL and they even GPL some (or, in the case of AngelCorp, all) of their device/function-specific code in order to contribute back to the free software pool.

If BadCorp fails to meet its obligations, again and again, and no real action is ever taken against it, it can be argued that this effectively punishes GoodCorp. GoodCorp is spending some of its scarce resources on complying with the GPL and engaging with the community, which BadCorp is not. This puts GoodCorp at a competitive disadvantage to BadCorp. The rational thing for GoodCorp to do is to emulate BadCorp, freeing up those resources and lowering its costs. If nothing changes, eventually one of the more short-term-focused, businessy types in GoodCorp’s management will prevail with such an argument over the founder-engineer types.

So by not enforcing free software licences, you reward those companies that pay lip service to their obligations and you punish those that care about meeting their obligations, by allowing the one to free-ride on the efforts of the other. Economics dictates that the equilibrium point for the ratio of leechers to contributors will be much higher with a laissez-faire approach to licence enforcement than it would be if the licence were enforced [1].

In short: even if you take a “pragmatic” view of free software – even if you are uncomfortable with the idea of suing companies that use customised free software in their devices – if you still believe that these customisations should be published (in accordance with those free software licences which often demand this), then it is important to the longer-term economic viability of free software that enforcement takes place.

Basically, and apologies for loading the terminology, are you for BadCorp or for GoodCorp?

[1] This is true up to the point where licence enforcement activity makes proprietary software more attractive. Some people would argue this point is quite low – that only a laissez-faire enforcement approach allows significant adoption of free software. However, that seems unlikely to me, as I feel that free software saw significant adoption during the period in the late 90s and early 2000s when most developers still emphasised the ethical, freedom aspects of free software. This particular argument needs further investigation, in particular to see which way the evidence lies – i.e. when the laissez-faire attitude took hold relative to adoption.

Leave a Comment

Thread-Level Parallelism and the UltraSPARC T1/T2

[I wrote this last year for something else – maybe sort of interesting given there’s been news recently about Intel’s Larrabee]


For a long period of time, increases in CPU performance were achieved by maximising sequential performance: increasing the instructions/cycle, using ever deeper, multi-issue pipelines and ever more speculative execution – thus extracting ever more instruction-level parallelism (ILP) from blocks of code, at ever higher clock rates. [3][4][5][6] Processors such as the DEC Alpha 21264, MIPS R12000 and Intel Pentium 4 were the result of this. As an example, the Pentium 4 Extreme Edition featured a 4-issue, out-of-order, 31-stage pipeline, and clocked at up to 3.2GHz, consuming 130W in the process.[4]

Unfortunately, though this approach served the processor industry well throughout the 1990s and most of the 2000s, its doom had already been foretold. David Wall showed empirically that there were practical limits to the amount of ILP available[1]. These limits were surprisingly low for many typical codes – between 4 and 10. Others observed that the disparity in the growth of CPU compute power, relative to memory access times, was increasing at such a rate that the latter would govern system performance ever more – dubbed the “memory wall”[2]. This meant all the gains made by complex, ILP-extracting logic would eventually be reduced to nothing, relative to the time spent waiting on memory.

Faced with these apparent hard limits on the gains that could be had from ILP, other levels of parallelism were investigated. Tullsen et al argued for thread-level parallelism (TLP): extracting parallelism by issuing instructions to multiple functional units from multiple threads of execution.[8] This is sometimes called “Chip Multi-Threading” or CMT. Olukotun et al, of the Computer Systems Laboratory of MIPS’ birthplace, argued that multiple, simpler processor-cores on a chip would be a more efficient use of the transistor budget for a chip than a single, complex, super-scalar core.[3] This is sometimes called “Chip Multi-Processing” or CMP.

Intel added CMT to their P4 CPUs, calling it “Hyper-Threading”. While it improved performance on many workloads, it was known to reduce performance on single-threaded numerical loads. Intel’s Core/Core 2 Duo range are not CMT, but are CMP (being multi-core).

A 3rd, more practical factor would also come into play: energy usage.[4][6] If an ILP-maximising processor is less efficient than a TLP-maximising processor, then the latter can potentially save a significant amount of energy. If the memory wall means complex ILP logic will be to no avail, then ditching that logic should save power without affecting performance (in the aggregate).

In 2006, Sun released a brand new implementation of UltraSPARC, the T1 – codenamed “Niagara”. Like several other server-orientated processors, the UltraSPARC T1 was both CMP and CMT, having up to 8 cores, each core handling 4 threads. The Niagara differed reasonably radically from those other offerings in that it was designed specifically with the “memory wall” in mind, as well as power-consumption. To this end, rather than using complex, super-scalar processor cores, the T1 cores instead were very simple. Each core is single-issue, in-order, non-speculative and just 6 stages deep. The pipeline is the classic 5-stage RISC pipeline, with a “Thread Select” stage added between IF and ID. Each core is therefore relatively slow compared to a traditional super-scalar design. The T1 clocks at 1 to 1.2GHz due to the short pipeline. Rather than trying to mask memory latency with speculation, the T1 switches processing to another thread. So, when given a highly-multithreaded workload, the T1 can make very efficient use of both its compute and memory resources by amortising the memory-wait time for any one thread across the execution time of a large number of other threads. Thread selection is the only speculative logic in the T1. It is aimed at highly threaded, memory/IO bound workloads (e.g. the server half of client/server). [4] [5][6]
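As a back-of-the-envelope illustration of that amortisation argument (a toy model, not Sun’s figures): if each thread computes for C cycles and then stalls on memory for S cycles, a single-issue core that switches to another ready thread on each stall stays busy roughly in proportion to how many threads it has available to hide the stalls behind.

    def core_utilisation(threads: int, compute_cycles: float, stall_cycles: float) -> float:
        """Toy model of fine-grained multithreading on a single-issue core:
        each thread alternates C compute cycles with S stall cycles, and the
        core switches to another ready thread whenever one stalls."""
        busy_fraction = compute_cycles / (compute_cycles + stall_cycles)
        return min(1.0, threads * busy_fraction)

    # A thread that computes for 20 cycles then waits ~100 cycles on memory keeps
    # a core only ~17% busy on its own; four such threads get it to ~67%.
    print(core_utilisation(1, 20, 100))  # ~0.17
    print(core_utilisation(4, 20, 100))  # ~0.67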

This approach has additional benefits. Less speculation means fewer wasted cycles. Rather than spending design time on complex, stateful logic to handle speculation and out-of-order issue, the designers can instead concentrate on optimising the chip’s power usage. The many-core approach also physically distributes heat better. The T1 uses just 72W, compared to the 130W of the P4EE, despite the T1’s greater transistor count, and it achieves much higher peak performance/watt on server workloads. [5][6]

The T1 has just one FPU, shared by all cores. This is a weakness for which the T1 has been much criticised. Indiscriminate use of floating point by programmers, particularly less sophisticated ones, is not unheard of in target applications like web applications (PHP, etc.), and the T1 may not perform at all well in such situations. The T1 core also includes a ‘Modular Arithmetic Unit’ (MAU) to accelerate cryptographic operations.

In late 2007, Sun released the UltraSPARC T2. This CPU is very similar to the T1. Changes include a doubling of the number of threads per core to 8; an additional execution unit per core and 2 issues per cycle; an 8-stage pipeline; an FPU per core (fixing the above major sore point of the T1); and the ability to do SMP.[11] This allows for at least 32 cores and 256 threads in a single server (4-way SMP). A single-CPU T2 system soon held the SPECweb world record.[13]

Intel have gone down a somewhat similar path with their Atom processor – a very simple x86 CPU with CMT, but there are no CMP versions of it (yet).[10] Intel are also working on a massively-CMP x86 CPU, called “Larrabee”, using simple, in-order cores.[9] This will have 16 or more cores initially. It is to be aimed at the graphics market, rather than as a general-purpose CPU. Cavium offer the ‘OCTEON’, a MIPS64-based CMP with up to 16 cores, targeted at networking/communications applications.[12]

To date there are no simple-core, massively-CMP/CMT processors available in general-purpose systems other than those using the UltraSPARC T1 and T2. The simple-core, massively-CMP/CMT approach they use is undoubtedly the future, given the long-standing research results and more recent practical experience. It may take longer for such chips to become prevalent as desktop CPUs though, given the need to continue to expend transistors on maintaining high sequential performance for existing, commonly minimally-threaded desktop software.

1. “Limits of Instruction-Level Parallelism”, Wall, DEC WRL research report, 1993

2. “Hitting the Memory Wall: Implications of the Obvious“, Wulf, Mckee, Computer Architecture News, 1995

3. “The Case for a Single-Chip Multiprocessor“, Olukotun, Nayfeh, Hammond, Wilson, Chang, IEEE Computer, 1996

4. “Performance/Watt: The New Server Focus”, James Laudon, 1st Workshop on Design, Architecture and Simulation of CMP (dasCMP), 2005.

5. “Niagara: a 32-way multithreaded Sparc processor”, Kongetira, Aingaran, Olukotun, IEEE Micro, 2005.

6. “A Power-Efficient High-Throughput 32-Thread SPARC Processor”, Leon, Tam, Weisner, Schumacher, IEEE Solid-State Circuits, 2007.

8. “Simultaneous Multithreading: Maximizing On-Chip Parallelism“, Tullsen, Eggers, Levy. 22nd An. Int. Symposium on Comp. Arch., 1995.

9. “Larrabee: a many-core x86 architecture for visual computing“, Seiler, et al. ACM Transactions on Graphics, Aug 2008.

10. http://www.intel.com/products/processor/atom/index.htm

11. “UltraSPARC T2: A Highly-Threaded, Power-Efficient, SPARC SOC“, Shah, et al. A-SSCC, 2007.

12. http://www.cavium.com/OCTEON_MIPS64.html

13. http://blogs.sun.com/bmseer/entry/sun_s_even_faster_specweb2005


Leave a Comment

Wisdom of Squid for More Security Sensitive Applications

CVE-2009-2621 was released last week. The issues seem to include a buffer overflow, which implies a potentially easy remote root-shell exploit [1]. So it’s perhaps worth repeating my previous claim that Squid really should not be used for security-sensitive applications.

This is not to take away from Squid. As the Swiss Army knife of web-caching/proxying software it’s awesomely valuable. It’s just that building for security is somewhat at odds with software intended to be highly flexible and featureful.

Even if Squid’s code were 100% defect-free, managing its default feature set remains a significant security problem. E.g. my own ISP’s Squid-using Pædo-sieve still honours CONNECT to anywhere, proving just how hard it is to configure such software to disable unneeded features – and they were made aware of the problem many months ago. Prior to that they had forgotten to lock down access to the internal in-band management protocol (accessible by using CONNECT proxying to localhost). They only recently disabled SMTP on the Pædo-sieve, possibly because they received email sent through it via the Squid proxy.

[1] The bug report mentions this is triggerable via ESI, which I don’t think is relevant to a Pædo-sieve, but as it was in some generic string-storage code there may well be many other paths to the bug.

Also, bugs were found in the HTTP header parser, where it will assert if fed a very large header. Relying on assert() is somewhat fragile, as asserts can be disabled at compile time. Indeed, the fix for the ESI buffer overflow is to add an assert, it seems. Not confidence-inspiring, security-wise at least.

Leave a Comment

Good Password Hygiene

In a hypothetically perfect world, we’d be able to remember infinite numbers of passwords. However, for most people that’s not possible. Instead we have to find a way to make best use of that limited memory. Here’s how:

  • Do not use passwords that are easy to guess, e.g. anything directly related to you, like your name or the names of family/friends/pets, etc.; or your date of birth; or your favourite colour, band, etc.
  • Ideally, use a longish random string as your password, of at least 10 characters (but longer is better).
  • The same applies for password-recovery questions, which often ask for information that is in the public domain (e.g. mother’s maiden name, date of birth). Do not provide real answers! Instead just make something up, or use another random string if possible.
  • Do not re-use passwords across different websites, unless you truly do not care about what is on those sites, and what they can do in your name with that password.
  • Do not be afraid to write them down if you can store them securely. E.g. if your home is reasonably secure, it’s fine to store most passwords on paper there. This goes against advice from many well-meaning, but utterly-wrong “experts”.
  • If you trust that a computer or device is sufficiently secure, it’s perfectly fine to store passwords on it, e.g. in a text file. Also, many programs support saving passwords, and if you trust those programs then it’s perfectly OK to use those features.
  • Consider using disk-encryption products like PGPDisk, TrueCrypt, BitLocker or the built-in capabilities of many Linux/Unix distributions (some of which offer this at install time) to protect your data with a master key. This is particularly recommended for laptops.
  • Any computer running Microsoft Windows likely cannot be considered secure and should not be trusted with more sensitive information. Portable devices should not be considered secure, unless their contents are known to be encrypted and they automatically lock themselves after a short period of inactivity (i.e. don’t trust your phone too much for storing sensitive data).

Basically, in an ideal world, all your day-to-day passwords for your various online accounts should be unguessable, random strings; you’d never have to remember any of them; you would just, at certain times, have to enter a master pass-phrase (which should be unguessable, but still memorable and much longer than a password) without which the passwords would effectively not be accessible.
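Generating that kind of unguessable random string takes only a decent password manager or a few lines of code; here is a minimal sketch (the alphabet and default length are just reasonable choices – adjust to taste and to what each site allows):

    import secrets
    import string

    def random_password(length: int = 16) -> str:
        """Generate an unguessable password from a cryptographically secure
        random source (letters, digits and a few symbols)."""
        alphabet = string.ascii_letters + string.digits + "!@#%^&*-_"
        return "".join(secrets.choice(alphabet) for _ in range(length))

    print(random_password())    # e.g. for an ordinary website login
    print(random_password(24))  # longer, where the site allows it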

Remember, security is a compromise between convenience and consequence. The ideal level of compromise will differ between different people, and between different situations. E.g., obviously, it’s probably a good idea to tolerate a good bit of inconvenience with your online banking login details and commit these solely to memory. If you have too many accounts to memorise the details, then store them very securely, e.g. buy a strong box or small safe, and obscure which details belong to what accounts – hopefully this buys enough time to contact banks and have the details changed if your house is burgled and the box stolen.

Common sense goes a long way. Unfortunately the “experts” you sometimes hear from don’t always have it.

Comments (2)

Sharing DNS caches Considered Harmful

Eircom have been having problems with internet connectivity. It’s hard to get information about exactly what they’re seeing, but there seem to be 2 aspects to it:

  1. Eircom are getting hit with a lot of packets
  2. Customers have sometimes been directed to strange sites by Eircom’s DNS servers

Justin Mason has a good overview of the news coverage. There are some points of his worth correcting, though:

I.e. DDoS levels of incoming DNS packets are consistent with a poisoning attack on up-to-date DNS servers, which randomise QID.
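The rough arithmetic behind that observation: with only the 16-bit query ID randomised, an off-path attacker has to spray forged answers to win the race against the real one, and the expected volume per race is in the tens of thousands of packets; add source-port randomisation and it grows by another factor of tens of thousands. A back-of-the-envelope sketch:

    # Expected number of forged responses an off-path attacker must send, on
    # average, to match a randomised 16-bit query ID (and, optionally, a
    # randomised source port) before the legitimate answer arrives.
    QID_SPACE = 2 ** 16           # possible transaction IDs
    PORT_SPACE = 2 ** 16 - 1024   # roughly, the usable ephemeral source ports

    print("QID only:   ~%d forged packets per race" % (QID_SPACE // 2))
    print("QID + port: ~%d forged packets per race" % ((QID_SPACE * PORT_SPACE) // 2))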

The moral of the story here is that using recursive, caching DNS servers that are shared on any significant scale, like ISP nameservers or (even worse) OpenDNS, is just unhygienic. They’re fundamentally flawed in today’s internet environment, as they’re juicy targets for poisoning, until DNSSEC is widely deployed. And when DNSSEC finally is deployed, shared, recursive nameservers remain a bad idea as they terminate the chain of trust – the connection between the nameserver and the client can still be spoofed.

In short:

  • Technical users and systems admins should install local, recursive nameservers. Preferably on a per-system basis.
  • Operating system vendors should provide easily-installed recursive nameservers and should consider installing and configuring their use by default. (Fedora provides a convenient ‘caching-nameserver’ package, and also a new dnssec-conf package with F11, though not enabled by default).
  • Consumer router vendors already ship with recursive servers, but tend to forward queries to the ISP – they should stop doing that and just provide full recursive service (hosts already do caching of lookup results anyway).

Widely shared, recursive nameservers are a major security risk and really should be considered an anachronism. It’s time to start getting rid of them…

Comments (11)

Mail-Followup-To Considered Harmful

Dear Interwebs,

If you happen to have responsibility for some software that processes mail, please take the time to check the following:

  • That your software always errs on the side of preserving the Reply-To header, when passing on email. E.g. email list software in particular should not overwrite existing Reply-To headers, even if a list is configured to set a default Reply-To.
  • That your software can process Reply-To headers that have multiple email addresses.
  • That your software provides adequate, interactive cues to its user when they reply to an email, so as to discern what the user wants, in the context of what the sender prefers (so far as that is known).
  • If your software supports various weird, non-standard headers, like Mail-Followup-To, Mail-Copies-To, Mail-Replies-To, etc. deprecate and remove this support. No amount of extra headers can obviate the need for the cues in the previous point – all they do is make the situation worse overall.
  • If your software must support these headers, do not let such support cause the standard and well-supported Reply-To header to be automatically ignored.

If you’re a user of software that honours Mail-Followup-To and/or has buggy Reply-To behaviour, please file bugs and/or send them patches.

So why are Mail-Followup-To et al harmful? The answer is that they increase the number of ways that the wishes of the sender, the respondent and the state of the email may interact. What was already a tricky and murky area is made even murkier when you try to add new ways to indicate the desired reply path. E.g. witness Thunderbird’s help on MFT, or Mutt’s help on mailing lists. I defy anyone to keep in their head all the rules of how their MUA behaves in the presence of the various combinations of Reply-To, From, To, Cc, Mail-Followup-To and Mail-Reply-To. For the few who can, I defy them to keep track of how their MUA’s support interacts with the support in other MUAs.

Further, Mail-Followup-To has never been standardised. The header is described in DJB’s Mail-Followup-To and the dead IETF DRUMS draft. Both specs effectively say that, in the absence of MFT, a “followup” reply should go to the union of the From address (i.e. From, or Reply-To if set), the To and the Cc. However, these descriptions carry little authority. Unfortunately Mutt, one popular MUA, behaves differently when in list-reply mode and does not fall back to (From|Reply-To)+To+Cc in the absence of MFT – at least when I last investigated. This means Mutt basically does not inter-operate with other MUAs when it comes to the standards-track means of indicating reply preferences.
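To make that fallback concrete, here is a small sketch of the recipient selection the two descriptions effectively call for on a “followup” reply (header parsing and de-duplication are simplified, and this is not any particular MUA’s logic):

    from email.message import Message
    from email.utils import getaddresses

    def followup_recipients(msg: Message) -> list[str]:
        """'Followup' reply per the DJB / DRUMS descriptions: use
        Mail-Followup-To if present, otherwise the union of
        (Reply-To or From) + To + Cc."""
        mft = msg.get_all("Mail-Followup-To")
        if mft:
            headers = mft
        else:
            headers = ((msg.get_all("Reply-To") or msg.get_all("From") or [])
                       + (msg.get_all("To") or [])
                       + (msg.get_all("Cc") or []))
        seen, out = set(), []
        for _name, addr in getaddresses(headers):
            if addr and addr.lower() not in seen:  # de-duplicate, preserving order
                seen.add(addr.lower())
                out.append(addr)
        return out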

Before, we had the problem of trying to get a few cases of broken Reply-To handling fixed (e.g. lists that blindly over-write Reply-To), plus the UI design issue of figuring out where a user intends replies to go, without annoying them. Now we’ve added the problems of fractured interoperability with new same-but-different headers, plus the bugs and deviations in the implementations of said new same-but-different headers.

Mail-Followup-To == more mess == even more brokenness.

See also: Archived DRUMS discussion on Mail-Followup-To.

Comments (2)
