Cell (microprocessor)

From Biocrawler, the free encyclopedia.

(IBM Microelectronics) Cell microprocessor

The Cell is a microprocessor jointly developed by IBM, Toshiba and Sony. The Cell architecture is intended to scale from handheld devices to mainframe computers through parallel processing. Sony plans to use the chip in its PlayStation 3 game console, which is scheduled for release in the second quarter of 2006.

History

In 2000, IBM, Sony Computer Entertainment Inc., and Toshiba Corp. formed an alliance to design and build the processor. Design work began in the design centers in March 2001. [1] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_cellbriefing) The Cell was designed over a period of four years, using enhanced versions of the design tools used for the POWER4 processor. Over 400 engineers from the three companies worked together in 10 of IBM's design centers. [2] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_eetimes)

On May 17, 2005, Sony Computer Entertainment confirmed the specification of the Cell processor that will ship in the forthcoming PlayStation 3 console. This Cell will have one Power Processor Element (PPE) on the core, with seven active SPEs ("Synergistic Processing Elements", see below) and one SPE reserved for redundancy. It will be clocked at 3.2 GHz, although the processor has reportedly been clocked as high as 5.2 GHz under laboratory conditions. The chips are being fabricated using IBM's 90 nm SOI (silicon on insulator) process at its fab in East Fishkill, New York. Full production may later switch to a 65 nm or 45 nm process jointly developed by IBM and Toshiba at their Nagasaki fabrication plant. Sony is also currently using its own 90 nm process to produce the integrated GS/EE chip for the PSX, the Japan-only combination PlayStation 2/DVR unit. (This usage of "PSX" is distinct from the commonly used informal designation of the original PlayStation.)

Open Specs

As of May 5, 2005, IBM developers had mailed patches for the Cell processor to the Linux kernel mailing list (available here (http://lkml.org/lkml/2005/5/13/217)). Arnd Bergmann of IBM will describe and present the Linux-based Cell architecture at LinuxTag 2005 (June 22-25). [3] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_linuxtag)

Architecture

Cell's die (false-color)
While the Cell chip can have a number of different configurations, the basic configuration is composed of one "Power Processor Element" ("PPE") (sometimes called "Processing Element", or "PE") and eight "Synergistic Processing Elements" ("SPE") [4] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_cellbriefing). Because of the nature of its applications, Cell is optimized for single-precision floating-point computation. "This design decision is based [on] the real time nature of game workloads and other media applications: most often, saturation is mathematically the right solution." [5] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_ibmresearch)

Power Processor Element

The PPE is based on the POWER Architecture, which is the basis of IBM's existing POWER line and is related to the PowerPC used by Apple Computer and others. The PPE is not intended to do most of the system's computation; it acts as a controller for the eight SPEs, which handle most of the computational workload. It has 32 KB Level 1 instruction and data caches and a 512 KB Level 2 cache. [6] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_seminar)

Synergistic Processing Elements

Each SPE is composed of a "Synergistic Processing Unit" ("SPU") and an SMF unit (DMA, MMU, and bus interface [7] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_scei)). [8] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_ibmresearch) An SPE is a general-purpose RISC processor with a 128-bit SIMD organization [9] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_seminar) for single- and double-precision instructions. It has 256 KB of high-speed local memory for instructions and data, which is also visible to the PPE so that it can be loaded with data and programs as needed [10] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_seminar). It has 128 registers, each 128 bits wide [11] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_seminar). It occupies 14.5 mm² (90 nm process) [12] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_seminar). Each SPE also has its own DMA unit, connected to the EIB through an MMU that performs address translation.
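
The register figures above can be turned into a small worked calculation. The following Python sketch is illustrative only: the constant and function names are made up for this example, and the 32-bit and 64-bit element widths are the standard IEEE 754 sizes rather than figures from the cited sources.

```python
REGISTER_BITS = 128    # width of one SPE register (from the text)
REGISTER_COUNT = 128   # number of registers per SPE (from the text)


def lanes_per_register(element_bits: int) -> int:
    """Number of elements of the given width that fit in one register."""
    return REGISTER_BITS // element_bits


single_lanes = lanes_per_register(32)  # single precision, assumed 32-bit
double_lanes = lanes_per_register(64)  # double precision, assumed 64-bit
register_file_bytes = REGISTER_COUNT * REGISTER_BITS // 8

print(single_lanes, double_lanes, register_file_bytes)
```

Under these assumptions one SIMD instruction operates on four single-precision or two double-precision values at once, and the register file holds 2 KB per SPE.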

The high-speed local memory is called the 'Local Store'. It services loads and stores, DMA transactions, and instruction fetches into an instruction-line buffer. [13] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_isscc2)

In general use, the system will load the SPEs with small programs and chain the SPEs together so that each handles one step of a complex operation. For instance, a set-top box could load programs for reading a DVD, decoding video and audio, and driving the display, and the data would be passed from SPE to SPE until it finally ends up on the TV. At 4 GHz, each SPE delivers 32 GFLOPS, giving the eight SPEs a combined 256 GFLOPS. The performance of the PPE's VMX unit is unclear, but it should contribute around 32 GFLOPS in addition to the SPEs.
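
The peak figures quoted above follow from simple arithmetic. The sketch below reproduces them in Python under two stated assumptions that are not spelled out in the text: a 128-bit SIMD unit processes four single-precision values per instruction, and each lane's multiply-add is counted as two floating-point operations. The function and constant names are hypothetical.

```python
def peak_gflops(clock_ghz: float, flops_per_cycle: int, units: int = 1) -> float:
    """Theoretical peak: clock rate x operations per cycle x number of units."""
    return clock_ghz * flops_per_cycle * units


SIMD_LANES = 4       # assumption: 128-bit register / 32-bit single precision
FLOPS_PER_LANE = 2   # assumption: one multiply-add counted as two operations
FLOPS_PER_CYCLE = SIMD_LANES * FLOPS_PER_LANE

per_spe = peak_gflops(4.0, FLOPS_PER_CYCLE)            # one SPE at 4 GHz
all_spes = peak_gflops(4.0, FLOPS_PER_CYCLE, units=8)  # eight SPEs at 4 GHz

print(per_spe, all_spes)
```

These are theoretical peaks that assume every SPE issues a multiply-add on every cycle; sustained performance on real workloads would be lower.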

"The SPU is an in-order dual-issue statically scheduled architecture. Two SIMD instructions can be issued per cycle: one compute instruction and one memory operation. The SPU branch architecture does not include dynamic branch prediction, but instead relies on compiler-generated branch prediction using "prepare-to-branch" instructions to redirect instruction prefetch to branch targets."[14] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_ibmresearch)

Element Interconnect Bus

The Element Interconnect Bus (EIB) is the unit that enables communication from one core to another. [15] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_seminar)

Memory controller and I/O

The memory controller, a dual XDR controller, is incorporated in the Cell processor (25.6 GB/s at 3.2 GHz). This replaces the north bridge, as in Athlon 64 processors. The processor also features two reconfigurable I/O interfaces called FlexIO (76.8 GB/s at 6.4 GHz) that eliminate the need for a south bridge. [16] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_seminar)
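
Dividing each quoted bandwidth by its quoted clock gives the number of bytes moved per cycle. The Python sketch below does only that arithmetic; it assumes the bandwidth figures are peak values and that each clock figure refers to the same interface as the bandwidth next to it, neither of which the text states explicitly. The function name is made up for this example.

```python
def bytes_per_cycle(bandwidth_gb_s: float, clock_ghz: float) -> float:
    """Bytes transferred per cycle of the quoted clock at peak bandwidth."""
    return bandwidth_gb_s / clock_ghz


xdr = bytes_per_cycle(25.6, 3.2)     # memory controller figures from the text
flexio = bytes_per_cycle(76.8, 6.4)  # FlexIO figures from the text

print(f"XDR memory: {xdr:.1f} bytes per cycle ({xdr * 8:.0f} bits)")
print(f"FlexIO:     {flexio:.1f} bytes per cycle ({flexio * 8:.0f} bits)")
```

Under these assumptions the memory interface moves 8 bytes per cycle and the FlexIO interfaces together move 12 bytes per cycle.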

Broadband Engine

Much less information is available about the 'Broadband Engine'; most of it comes from patent applications. It is believed that the Cell allows multiple processing cores to be placed on one die, and the patent (http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=/netahtml/search-adv.htm&r=1&f=G&l=50&d=PTXT&p=1&p=1&S1=) showed four cores on one die, called the "Broadband Engine", potentially giving over 1 TFLOPS of theoretical performance. The companies designing the chip have said that they intend to scale performance for various uses, both low-end and high-end, by varying the number of cores on the chip and the number of units in a single core, and by linking multiple chips to each other via a network or memory bus.

Initial speculation about TFLOPS performance was largely based on claims of a 65 nm SOI process. Although IBM, Sony and Toshiba were following this plan at first, Intel's and AMD's renewed interest in multi-core processing, together with Sony's wish for a first-mover advantage in next-generation gaming consoles, may have forced them to go with a 90 nm SOI process very similar to the manufacturing process used for Intel's Prescott core. However, the 'Broadband Engine' integrated into the Cell helps it attain enough bandwidth for a theoretical 1 TFLOPS, though real-world systems may rarely reach such a figure.

Similar multiple-core designs include Sun Microsystems' MAJC (pronounced "magic"). The first MAJC chip was originally designed for multimedia processing, although Sun have subsequently repositioned the MAJC chip as a high-end graphics processor for workstations. In addition, Stanford University's Imagine Stream Processor (http://cva.stanford.edu/imagine/project/im_arch.html) shares a similar conceptual underpinning.

Facts

The following figures appear to describe the most common configuration:

  • 256 GFLOPS in single-precision operations at 4 GHz. [17] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_slashdot20050211)
  • 25-30 GFLOPS in double-precision operations at 4 GHz. [18] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_slashdot20050211) (probably 26 [19] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_rwt2))
  • 234 million transistors [20] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_itmanager)
  • 1 PPE with 32 KB Level 1 instruction and data caches and a 512 KB Level 2 cache [21] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_macworld)
  • 8 SPEs, each with 256 KB of local memory for instructions and data [22] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_macworld)
  • 0.9-1.3V nominal supply voltage [23] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_cellbriefing)
  • 10 digital thermal sensors [24] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_cellbriefing)
  • 5 power management states (Dynamic Power Management) [25] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_cellbriefing)
  • 221 mm² die (90 nm process) [26] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_macworld)
  • Power consumption is not yet known: one source [27] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_macworld) reports 30 W, another [28] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_rwt2) 50-80 W or more.

Architecture compared

In some ways the Cell system resembles early Seymour Cray designs in reverse. The famed CDC 6600 used a single very fast processor to handle the mathematical calculations, while a series of ten slower processors ran smaller programs to keep the main memory fed with data. In the Cell the situation is reversed: because of the complex encodings used in industry, reading the data is no longer the hard part. The problem today is efficiently decoding that data into an ever-less-compressed form as quickly as possible.

In other ways the Cell resembles a modern desktop computer on a single chip.

Modern graphics cards have multiple elements very similar to the SPEs, known as vertex shader units, with attached high-speed memory. Programs, known as shaders, are loaded onto the units to process the basic geometry fed from the computer's CPU, apply styles, and display the result.

The main differences are that the Cell's SPEs are much more general purpose than shader units, and the ability to chain the SPEs under program control offers considerably more flexibility, allowing the Cell to handle graphics, sound, or anything else.

Devices

Blade server

IBM has already presented a blade server prototype based on two Cell processors, running Linux kernel 2.6.11. [29] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_techon2) The processors ran at 2.4-2.8 GHz. IBM expects to run them at 3 GHz, giving 200 GFLOPS per CPU (or 400 GFLOPS per board), and to put seven boards in a single rack for a total of 2.8 TFLOPS. This is equivalent to the 70th-ranked supercomputer in the TOP500 list as of 11/2004 (http://www.top500.org/lists/plists.php?Y=2004&M=11), or the 125th as of 06/2005 (http://www.top500.org/lists/plists.php?TB=2&M=06&Y=2005). However, those supercomputers use between 600 and 1000 CPUs.

IBM probably plans to build 16 TFLOPS racks. [30] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_seminar) [31] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_rwt2) At that rate, 64 racks would deliver about 1 petaFLOPS (a million GFLOPS).
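
The board, rack, and multi-rack totals in this section are straightforward multiplications of the quoted figures. The Python sketch below reproduces them; the constant names are made up for this example, and all results are theoretical peaks rather than measured values.

```python
GFLOPS_PER_CPU = 200   # expected per Cell at 3 GHz (from the text)
CPUS_PER_BOARD = 2     # prototype blade (from the text)
BOARDS_PER_RACK = 7    # planned rack (from the text)

board_gflops = GFLOPS_PER_CPU * CPUS_PER_BOARD
rack_gflops = board_gflops * BOARDS_PER_RACK

PROJECTED_RACK_TFLOPS = 16  # the projected larger rack (from the text)
RACKS = 64
total_tflops = PROJECTED_RACK_TFLOPS * RACKS

print(board_gflops, "GFLOPS per board")
print(rack_gflops / 1000, "TFLOPS per seven-board rack")
print(total_tflops, "TFLOPS for 64 projected racks")
```

The last figure, 1024 TFLOPS, is slightly more than one petaFLOPS, which is why the text rounds it to a million GFLOPS.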

Video Games

Sony's PlayStation 3 video game console will use a 3.2 GHz Cell processor, providing 218 GFLOPS.

Home Cinema

Toshiba will probably manufacture HDTVs using this technology. It has already demonstrated a system that decodes 48 MPEG-2 streams simultaneously on a 1920x1080 screen. [32] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_techon)

Software engineering

The PPE is the conductor and the SPEs are the orchestra. The PPE should be used to control synchronization, perform random memory accesses, communicate with devices, and run the operating system. SPEs should be used to execute repetitive tasks with limited memory access. Because the Cell is flexible, there are several ways to use it: [33] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_scei)

Job queue

The PPE maintains the job queue, schedules jobs in SPEs, and monitors progress. Each SPE has a mini kernel whose role is to get a job, execute it, and synchronize with the PPE. [34] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_scei) (More here (http://www.research.scea.com/research/html/CellGDC05/26.html))
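
The job-queue model can be sketched on an ordinary machine with threads standing in for the processing elements. The Python example below is an analogy only, not Cell code: the names ppe_run and spe_mini_kernel are hypothetical, the standard-library queue and threading modules replace the real hardware mechanisms, and a None sentinel is one assumed way of telling a worker that no jobs remain.

```python
import queue
import threading

NUM_SPES = 8  # basic configuration described in the article


def spe_mini_kernel(spe_id, jobs, results):
    """Hypothetical SPE-side loop: get a job, execute it, report back."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel from the PPE: no more work
            jobs.task_done()
            break
        job_id, func, arg = job
        results.put((job_id, spe_id, func(arg)))
        jobs.task_done()


def ppe_run(job_list):
    """Hypothetical PPE side: own the queue, hand out jobs, collect results."""
    jobs = queue.Queue()
    results = queue.Queue()
    workers = [
        threading.Thread(target=spe_mini_kernel, args=(i, jobs, results))
        for i in range(NUM_SPES)
    ]
    for worker in workers:
        worker.start()
    for job_id, (func, arg) in enumerate(job_list):
        jobs.put((job_id, func, arg))
    for _ in workers:
        jobs.put(None)   # one sentinel per worker
    jobs.join()          # wait until every job has been processed
    for worker in workers:
        worker.join()
    by_id = {}
    while not results.empty():
        job_id, _spe_id, value = results.get()
        by_id[job_id] = value
    return [by_id[i] for i in range(len(job_list))]


def square(x):
    return x * x


squares = ppe_run([(square, n) for n in range(10)])
print(squares)
```

The point of the design is that scheduling decisions stay in one place: the workers never choose what to run, they only ask the central queue for the next job.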

Self-multitasking of SPEs

The kernel and the scheduling are distributed across the SPEs. Tasks are synchronized using mutexes or semaphores, as in a conventional operating system. Tasks that are ready to run are either being run by an SPE or held in a queue; all other tasks wait. Tasks are kept in shared memory. This maximizes the utilization of the SPEs and leaves the PPE with nothing to do. [35] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_scei) (More here (http://www.research.scea.com/research/html/CellGDC05/37.html))
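
In contrast to a central dispatcher, self-multitasking lets every worker take its next task directly from a shared pool protected by a mutex. The Python sketch below illustrates the idea with threads; it is an analogy under stated assumptions, not Cell code. SharedTaskPool, spe_scheduler and run_self_scheduled are hypothetical names, and the main thread only starts the workers and waits for them, standing in for whatever boots the SPEs.

```python
import threading
from collections import deque

NUM_SPES = 8  # basic configuration described in the article


class SharedTaskPool:
    """Hypothetical stand-in for the task list kept in shared memory."""

    def __init__(self, tasks):
        self._ready = deque(tasks)
        self._lock = threading.Lock()  # mutex guarding the shared state
        self.results = {}

    def take(self):
        """Remove and return the next ready task, or None if none is left."""
        with self._lock:
            return self._ready.popleft() if self._ready else None

    def store(self, task_id, value):
        with self._lock:
            self.results[task_id] = value


def spe_scheduler(pool):
    """Each worker schedules itself: no central dispatcher is involved."""
    while True:
        task = pool.take()
        if task is None:
            return
        task_id, func, arg = task
        pool.store(task_id, func(arg))


def run_self_scheduled(tasks):
    pool = SharedTaskPool(tasks)
    spes = [
        threading.Thread(target=spe_scheduler, args=(pool,))
        for _ in range(NUM_SPES)
    ]
    for spe in spes:
        spe.start()
    for spe in spes:
        spe.join()
    return pool.results


def cube(x):
    return x ** 3


cubes = run_self_scheduled([(i, cube, i) for i in range(6)])
print(cubes)
```

A worker that finishes early simply takes the next task, so no worker sits idle while tasks remain in the pool.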

Stream processing

Each SPE has a program. Data comes from an input stream and is sent to the SPEs. When an SPE has finished processing, the output data is sent to an output stream. [36] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_scei) (More here (http://www.research.scea.com/research/html/CellGDC05/41.html))
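
The stream model can be illustrated with chained Python generators, where each stage plays the role of one SPE running a fixed program. This is a conceptual sketch only: the stage functions decode, scale and clamp are invented placeholders, and real SPEs would run concurrently and pass data by DMA rather than by lazy iteration.

```python
def spe_stage(program, stream):
    """One stage: apply a fixed program to every item of the input stream."""
    for item in stream:
        yield program(item)


def build_pipeline(programs, source):
    """Chain the stages so the output of each one feeds the next."""
    stream = iter(source)
    for program in programs:
        stream = spe_stage(program, stream)
    return stream


# Hypothetical per-stage programs, chosen only to make the data flow visible.
def decode(x):
    return x * 2


def scale(x):
    return x + 1


def clamp(x):
    return min(x, 8)  # saturate at an arbitrary upper bound


output = list(build_pipeline([decode, scale, clamp], [1, 2, 3, 4, 5]))
print(output)
```

Each item flows through decode, scale and clamp in order, so the input 1, 2, 3, 4, 5 becomes 3, 5, 7, 8, 8.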

Software development

Both the PPE and the SPEs are programmable in C/C++ using a common API provided by libraries. No assembly is required to access SIMD instructions, because the compiler provides built-in functions for them. A compiler, debugger, IDE, performance analyzer, and Cell emulator should be made available. [37] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_scei)

Acronyms

  • EIB: Element Interconnect Bus [38] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_isscc)
  • LS: Local Store (SPE's local memory) [39] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_rwt1)
  • MIC: Memory Interface Controller [40] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_isscc)
  • PPE: Power Processor Element [41] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_isscc)
  • SPE: Synergistic Processing Element [42] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_isscc)
  • SPU: Synergistic Processing Unit (the cited ISSCC paper uses "Streaming Processor") [43] (http://www.biocrawler.com/encyclopedia/Cell_%28microprocessor%29#endnote_isscc2)
  • STI: Sony Computer Entertainment Inc., Toshiba Corp., IBM

References

  1. ^  "Open sourcing of Cell coming to fruition (http://www.itmanagersjournal.com/article.pl?sid=05/06/08/2046215)", IT Manager's Journal, (June 10, 2005)
  2. ^  "Unleashing the power: A programming example of large FFTs on Cell (broadcast replay) (http://www.power.org/news/events/barcelona)", power.org, (June 9, 2005)
  3. ^  "IBM Discloses Cell Based Blade Server Board Prototype (http://techon.nikkeibp.co.jp/english/NEWS_EN/20050525/105050/?ST=english)", Tech-On!, (May 25, 2005)
  4. ^  "IBM will unlock door to Cell (http://www.eetimes.com/news/latest/showArticle.jhtml?articleID=163106213)", EETimes.com, (May 23, 2005)
  5. ^  "Toshiba Demonstrates Cell Microprocessor Simultaneously Decoding 48 MPEG-2 Streams (http://techon.nikkeibp.co.jp/english/NEWS_EN/20050425/104149/?ST=english)", Tech-On!, (April 25, 2005)
  6. ^  "CELL: A New Platform for Digital Entertainment (http://www.research.scea.com/research/html/CellGDC05/)", Sony Computer Entertainment Inc., (March 9, 2005)
  7. ^  "CELL Microprocessor Revisited (http://www.realworldtech.com/page.cfm?ArticleID=RWT022805234129)", Real World Technologies, (February 28, 2005)
  8. ^  "Power Efficient Processor Design and the Cell Processor (http://www.cerc.utexas.edu/vlsi-seminar/spring05/slides/2005.02.16.hph.pdf)", IBM, (February 16, 2005)
  9. ^  "Prospects For the CELL Microprocessor Beyond Games (http://slashdot.org/article.pl?sid=05/02/11/1352224)", Slashdot, (February 11, 2005)
  10. ^  "ISSCC 2005: The CELL Microprocessor (http://www.realworldtech.com/page.cfm?ArticleID=RWT021005084318)", Real World Technologies, (February 10, 2005)
  11. ^  "A 4.8GHz Fully Pipelined Embedded SRAM in the Streaming Processor of a CELL Processor (http://www.ibm.com/chips/techlib/techlib.nsf/techdocs/372E2BE9229AC34987256FC00074A13A/$file/ISSCC-26.7-Cell_SRAM.PDF)", Sony Computer Entertainment Inc., Toshiba Corp., IBM, (February 9, 2005)
  12. ^  "The Design and Implementation of a First-Generation CELL Processor (http://www.ibm.com/chips/techlib/techlib.nsf/techdocs/7FB9EC5D5BBF51ED87256FC000742186/$file/ISSCC-10.2-Cell_Design.PDF)", Sony Computer Entertainment Inc., Toshiba Corp., IBM, (February 8, 2005)
  13. ^  "IBM, Sony, Toshiba unveil nine-core Cell processor (http://www.macworld.com/news/2005/02/07/celldetails/index.php?lsrc=mcrss-0205)", Macworld, (February 7, 2005)
  14. ^  "Cell Microprocessor Briefing (http://pc.watch.impress.co.jp/docs/2005/0208/kaigai153.htm)", IBM, Sony Computer Entertainment Inc., Toshiba Corp., (February 7, 2005)
  15. ^  "The Cell Processor Programming Model (http://www.linuxtag.org/typo3site/freecongress-details.html?talkid=156)." LinuxTag 2005. Accessed on June 11, 2005.
  16. ^  "IBM Research - Cell (http://www.research.ibm.com/cell/)." IBM. Accessed on June 11, 2005.

Wikipedia (http://en.wikipedia.org/wiki/Main_Page) Cell_(microprocessor) (http://en.wikipedia.org/wiki/Cell_(microprocessor)) version history (http://en.wikipedia.org/w/index.php?title=Cell_(microprocessor)&action=history) GNU Free Documentation License (http://en.wikipedia.org/wiki/Wikipedia:Text_of_the_GNU_Free_Documentation_License) CC-by-sa (http://creativecommons.org/licenses/by-sa/2.5/)