posted by Nicholas Blachford on Tue 13th Jul 2004 21:56 UTC
"Next Gen, Page 2/3"

Guiding Principles

"Things should be made as simple as possible, but not any simpler" - Albert Einstein

Software is complex, and the longer it exists the more complex it becomes. By starting again we can consider all the requirements and produce a design to fit, rather than modifying an existing design, which is difficult and often leads to failure. So, when we start with the design or construction it should be simple. Simplicity is a good thing: it may make designing more difficult, but the end result is easier to construct, easier to maintain and less prone to bugs. In the hardware world it's also likely to be faster. Indeed, this is how Seymour Cray designed his machines even as far back as the 1950s, and these machines later inspired the creation of RISC.

Hardware
This system is going to be more than software. While it would be possible to design an OS only and get many of the advantages, you would also be missing a lot, especially in the form of performance enhancements. So, we'll start with what the physical system shall be: the hardware it shall use.

Hardware is changing. Processor manufacturers are hitting the limits of super-scalar processors which can be mass produced and cooled in a reasonable manner.

The solution they are switching to is single-chip multi-core multi-threading ("Mulcoth") processors, where a number of CPU cores are built on a single die and each of these cores can run multiple threads. The recently announced POWER5 CPU does this, and other manufacturers (Intel, HP, Sun, AMD, Motorola) will join IBM in the future. Sun in particular is following this strategy very aggressively: it plans to put 8 simple cores on a single chip, each running 4 threads simultaneously. In the future I can see single-core single-threaded CPUs becoming a thing of the past for the desktop.

In the future, physical limitations will increasingly constrain how CPUs can be designed, forcing simpler designs [TISC]. Increasing the number of CPU cores on a single chip may eventually be the only way to increase performance.

If your system can take advantage of parallelism, Mulcoth CPUs are going to bring a big performance advantage even if individual cores are slower than single-core solutions. In fact, slowing the cores down may actually boost performance, as lower-clocked cores can use smaller transistors, freeing up room on the die for more cache and additional cores. All modern processors are limited by memory: the more there is on chip, the faster they'll run. Using low-clocked cores also means low power consumption is possible.
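To put rough numbers on the "if your system can take advantage of parallelism" condition, here is a minimal sketch using Amdahl's law. The function name `speedup` and the figures (8 cores, each running at half the speed of a single fast core) are my own assumptions for illustration, not measurements of any real chip.

```python
def speedup(cores, parallel_fraction, core_speed=1.0):
    """Estimated speedup over one full-speed core (Amdahl's law).

    cores             -- number of cores on the chip
    parallel_fraction -- share of the work that can run in parallel (0..1)
    core_speed        -- speed of each core relative to the fast single core
    """
    serial_fraction = 1.0 - parallel_fraction
    # Serial work runs on one core; parallel work is shared across all cores.
    time = (serial_fraction + parallel_fraction / cores) / core_speed
    return 1.0 / time


if __name__ == "__main__":
    # Assumed figures: 8 cores, each half the speed of the single fast core.
    for fraction in (0.5, 0.9, 0.99):
        print(fraction, round(speedup(8, fraction, core_speed=0.5), 2))
```

Under these assumed figures, a workload that is 90% parallel comes out a little over twice as fast as the single fast core, while one that is only 50% parallel ends up slower. The model ignores cache and memory effects entirely, so treat it as a way of thinking rather than a prediction.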

If we want a new platform it should take account of these changes and make use of them. Do it properly and we could have the fastest system on the market. One system which would be perfectly suited to this sort of processor is BeOS: the entire system is heavily threaded and multi-tasks very well, so a Mulcoth chip would run BeOS like a dream. You can actually take even more advantage of multiple cores than BeOS does, but I'll come back to that when I discuss the OS.

Mulcoth CPUs aren't the only new technology on the way. FPGAs have long been predicted to appear in desktop systems but have yet to do so. Stream processors are another type of CPU which will probably turn up some day.

Stream Processors
Stream processors are an advancement on DSPs (Digital Signal Processors) which are CPUs designed specifically for high compute applications.

Many DSP processes can be broken apart into a stream - a sequential series of algorithms. In many cases DSP problems can be further divided across multiple streams and further divisions can be made within the algorithms making them suitable for SIMD (Single Instruction Multiple Data) processing.
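As a toy illustration of that idea, the sketch below treats a stream as a fixed chain of stages and then splits the input across several independent streams. The stage names (`scale`, `clip`), the gain and limit values, and the helper `run_streams` are all hypothetical; real stream processors would run the chunks concurrently and use SIMD inside each stage, which plain Python does not show.

```python
def scale(samples, gain):
    """Stage 1: multiply every sample by a gain."""
    for s in samples:
        yield s * gain


def clip(samples, limit):
    """Stage 2: clamp every sample to the range [-limit, limit]."""
    for s in samples:
        yield max(-limit, min(limit, s))


def run_stream(samples):
    """One stream: the stages applied one after another."""
    return list(clip(scale(samples, 2.0), 1.0))


def run_streams(samples, streams):
    """Split the input into contiguous chunks and run one stream per chunk.

    The chunks are processed one after another here; the point is only that
    no chunk depends on another, so hardware could process them in parallel.
    """
    size = max(1, -(-len(samples) // streams))  # ceiling division, at least 1
    chunks = [samples[i:i + size] for i in range(0, len(samples), size)]
    result = []
    for chunk in chunks:
        result.extend(run_stream(chunk))
    return result


if __name__ == "__main__":
    data = [0.25, 0.5, 0.75, -1.0, 0.1]
    # Splitting across streams gives the same answer as a single stream.
    print(run_streams(data, 2) == run_stream(data))
```

Because each sample is handled independently of its neighbours in these two stages, the split is safe. A stage that needs history, such as a filter, would need overlapping chunks or carried state.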

Experimental parallel stream processors have been developed which take account of this divisibility and can process data at rates up to 100 times faster than even the most powerful desktop CPUs [Stream]. Additionally, within the algorithms data tends to be "local", so these processors do not need to constantly access a high bandwidth memory. This means their actual processing speed may be close to their theoretical peak - something very uncommon in general purpose processors.

Custom processors such as 3D graphics processors are very high performance but cannot be programmed to do other tasks. Shaders can be programmed, but this is still limited and difficult. Stream processors, on the other hand, are highly programmable, so many different kinds of operations are possible. As if to rub the CPU manufacturers' noses in it, these types of processors have low power requirements.

So I think we can use one of these in our new platform. But where do we get them? Sony's new Cell processor [Cell] will allow this sort of processing. Each Cell has a number of cores, all of which access an on-chip high speed memory, and these can be configured to process data as a stream. Cell processors will be made in vast numbers from the get go and will also be sold to third parties, so they should be cheap, fast, and available. You'll not want to run your OS on them - they're not designed for that - but for video, audio and other high compute processing they will blow everything else into next week.

FPGAs
An FPGA (Field Programmable Gate Array) is a "customisable chip": it provides the parts and you tell it what to assemble itself as. They are not as fast as full custom chips, but modern full custom chips cost $15 million+ just to develop. Stream processors will be able to do many of the tasks an FPGA would usually do, but stream processors are best suited to, well, streams. Not everything is a stream.

There may be cases where a stream processor can work but the cumulative latency may be too great - complex real time audio or video processing are areas where this could be an issue. So there are, as you can see, some areas where stream processors may be at a disadvantage due to their architecture. General purpose processors can do anything, but performance is considerably lower than either a stream processor or an FPGA. In these cases the FPGA will provide a solution.
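One way latency could accumulate is if every stage in the stream waits for a full block of samples before passing anything on. The sketch below works through that arithmetic. The model itself, the function names, and the figures (256-sample blocks, 48 kHz audio, a 10 ms budget) are assumptions of mine for illustration and not properties of any particular stream processor.

```python
def pipeline_latency_us(stages, block_samples, sample_rate_hz):
    """Latency in whole microseconds if every stage buffers one full block."""
    return stages * block_samples * 1_000_000 // sample_rate_hz


def meets_budget(stages, block_samples, sample_rate_hz, budget_us):
    """True if the summed stage latency fits within the real time budget."""
    return pipeline_latency_us(stages, block_samples, sample_rate_hz) <= budget_us


if __name__ == "__main__":
    # Assumed figures: 256-sample blocks at 48 kHz, 10 ms (10,000 us) budget.
    for stages in (1, 2, 4):
        latency = pipeline_latency_us(stages, 256, 48_000)
        print(stages, latency, meets_budget(stages, 256, 48_000, 10_000))
```

With these assumed figures a single stage adds about 5.3 ms, which fits, but four stages add about 21 ms, which is already well over the budget even though each stage on its own looks harmless.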

I don't know if the FPGA would be used much at the beginning, as they are difficult to design for, but they are cheap and there are free tools available, so why not? Pre-programmed libraries, on the other hand, will be easy for any programmer to use.
