Publication | Open Access
Corona: System Implications of Emerging Nanophotonic Technology
Citations: 459
References: 26
Year: 2008
Unknown Venue
Engineering, Nano-optics, Computer Architecture, Programmable Photonics, Optical Computing, Photonic Integrated Circuit, Parallel Computing, Pin Limitations, Nanoscale Science, Manycore Processor, Nanophotonics, Photonics, Electrical Engineering, Many-core Microprocessors, Optical Interconnects, Nanoscale System, Nanotechnology, Computer Engineering, Emerging Nanophotonic Technology, Nano Application, Microelectronics, Nanomaterials, Applied Physics, Many-core Architecture, Thread Corona System, Optoelectronics
Many-core microprocessors are projected to reach 10 teraflops per chip, but pin limitations, the energy cost of electrical signaling, and non-scalable global wires constrain the bandwidth needed to feed them; silicon nanophotonics offers a promising, power-efficient path. Corona is a 3D many-core design that uses nanophotonic links for both inter-core and off-stack communication: a photonic crossbar connects its 256 low-power multithreaded cores at 20 TB/s. A 1024-thread system was simulated with synthetic benchmarks and scaled SPLASH-2 workloads. Corona has a peak floating-point performance of 10 teraflops and 10 TB/s of memory bandwidth, and the authors believe it can deliver 2 to 6 times the performance of an electrically connected counterpart with the same on-stack interconnect power on many memory-intensive workloads, while also reducing power.
We expect that many-core microprocessors will push performance per chip from the 10 gigaflop to the 10 teraflop range in the coming decade. To support this increased performance, memory and inter-core bandwidths will also have to scale by orders of magnitude. Pin limitations, the energy cost of electrical signaling, and the non-scalability of chip-length global wires are significant bandwidth impediments. Recent developments in silicon nanophotonic technology have the potential to meet these off- and on-stack bandwidth requirements at acceptable power levels. Corona is a 3D many-core architecture that uses nanophotonic communication for both inter-core communication and off-stack communication to memory or I/O devices. Its peak floating-point performance is 10 teraflops. Dense wavelength division multiplexed, optically connected memory modules provide 10 terabytes per second of memory bandwidth. A photonic crossbar fully interconnects its 256 low-power multithreaded cores at 20 terabytes per second of bandwidth. We have simulated a 1024-thread Corona system running synthetic benchmarks and scaled versions of the SPLASH-2 benchmark suite. We believe that, in comparison with an electrically connected many-core alternative that uses the same on-stack interconnect power, Corona can provide 2 to 6 times more performance on many memory-intensive workloads, while simultaneously reducing power.
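The headline figures in the abstract can be broken down into rough per-core budgets. The sketch below is illustrative arithmetic only, not a calculation taken from the paper: it assumes decimal units (1 TB = 10^12 bytes), an even split of every resource across the 256 cores, and that the simulated 1024-thread system uses all 256 cores. The names `per_core_budget`, `PEAK_FLOPS`, `MEM_BW`, and `XBAR_BW` are hypothetical.

```python
# Headline figures quoted in the abstract (decimal units assumed).
PEAK_FLOPS = 10e12   # 10 teraflops peak floating-point performance
MEM_BW = 10e12       # 10 TB/s memory bandwidth
XBAR_BW = 20e12      # 20 TB/s photonic crossbar bandwidth
CORES = 256          # low-power multithreaded cores
THREADS = 1024       # threads in the simulated system


def per_core_budget():
    """Divide the aggregate figures evenly across cores (an assumption)."""
    return {
        "threads_per_core": THREADS // CORES,
        "gflops_per_core": PEAK_FLOPS / CORES / 1e9,
        "mem_bytes_per_flop": MEM_BW / PEAK_FLOPS,
        "xbar_gb_per_s_per_core": XBAR_BW / CORES / 1e9,
    }


if __name__ == "__main__":
    for name, value in per_core_budget().items():
        print(f"{name}: {value}")
```

Under these assumptions each core runs 4 threads, peaks at roughly 39 gigaflops, and has about 78 GB/s of crossbar bandwidth, while the machine as a whole offers one byte of memory bandwidth per peak floating-point operation.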