Unconventional HPC Architectures
1. Maxeler
2. IBM
3. Heidelberg University
4. University of Manchester
- 5. SURF
Description
Moore’s Law, which stated that “the complexity for minimum component costs has increased at a rate of roughly a factor of two per year”, is slowing down due to the enormous cost of developing new process generations, together with feature sizes approaching silicon interatomic distances. Since the end of Dennard scaling (i.e., the roughly constant power density across technology nodes, which allowed the operating frequency to rise with each node) more than 15 years ago, the main way forward has been to use the growing transistor budget to build increasingly parallel architectures. Now, other ways need to be found to deliver further advances in performance. This can be achieved by a combination of innovations at all levels: technology (3D stacking, using physics to carry out computations, etc.), architecture (e.g., specialisation, computing in/near memory, dataflow), software, algorithms, and new ways to represent information (e.g., neuromorphic, which codes information “in time” with “spikes”; quantum; mixed precision). A closely related challenge is the energy required to move data, which can be orders of magnitude higher than the energy of the computation itself.
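To make the neuromorphic idea concrete, the Python sketch below shows latency coding with a leaky integrate-and-fire neuron: a stronger input crosses the firing threshold sooner, so the magnitude of the input is carried by *when* the neuron spikes rather than by a stored numeric value. This is a minimal illustration, not taken from the paper; the function name and parameters (`lif_spike_times`, `tau`, `v_th`) are assumptions chosen for the example.

```python
def lif_spike_times(current, t_max=0.05, dt=1e-4, tau=0.02, v_th=1.0):
    """Leaky integrate-and-fire neuron driven by a constant input current.

    Illustrative parameters only: tau is the membrane time constant (s),
    v_th the firing threshold. Stronger inputs reach threshold sooner, so
    the timing of the first spike encodes the input's magnitude.
    """
    v, spikes = 0.0, []
    for step in range(int(t_max / dt)):
        v += (dt / tau) * (current - v)  # leaky integration toward the input
        if v >= v_th:                    # threshold crossing: emit a spike
            spikes.append(step * dt)
            v = 0.0                      # reset the membrane potential
    return spikes

# Larger inputs spike earlier: the information is coded "in time".
for i in (1.2, 2.0, 4.0):
    print(f"input {i}: first spike at {lif_spike_times(i)[0] * 1e3:.1f} ms")
```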
These challenges give rise to new unconventional HPC architectures which, through some form of specialisation, achieve more efficient and faster computation. This paper covers a range of such architectures that are currently emerging. While these architectures and their underlying requirements are naturally diverse, AI stands out as a technology that drives the development of both novel architectures and computational techniques, owing to its widespread adoption and its computing requirements. The ever-increasing demands of AI applications call for more efficient computation of data-centric algorithms. Furthermore, minimising data movement and improving data locality play an important role in achieving high performance while limiting power dissipation, and this is reflected in both architectures and programming models. We also cover models of computation that differ from those used in conventional CPU architectures, or from models that are purely focussed on achieving performance through parallelisation. Finally, we address the challenges of developing, porting and maintaining applications for these new architectures.
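As a minimal illustration of the data-locality point (again not from the paper itself), the sketch below uses a tiled matrix multiplication: each block-sized tile of the operands is fetched once and reused across a whole tile of the result, so traffic between memory and compute shrinks even though the arithmetic is unchanged. The block size and names are assumptions chosen for the example.

```python
import numpy as np

def blocked_matmul(A, B, block=64):
    """Tiled (blocked) matrix multiply, a classic data-locality optimisation.

    Each block x block tile of A and B is loaded once and reused across a
    whole tile of C, reducing data movement relative to streaming full rows
    and columns. Real HPC codes get this effect from tuned BLAS libraries or
    dataflow hardware; this version only illustrates the principle.
    """
    n = A.shape[0]
    C = np.zeros((n, n))
    for i in range(0, n, block):
        for j in range(0, n, block):
            for k in range(0, n, block):
                # One tile-sized chunk of data feeds block**2 output updates.
                C[i:i + block, j:j + block] += (
                    A[i:i + block, k:k + block] @ B[k:k + block, j:j + block]
                )
    return C

n = 256
A, B = np.random.rand(n, n), np.random.rand(n, n)
assert np.allclose(blocked_matmul(A, B), A @ B)  # same result, better locality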
Files
ETP4HPC_WP_Unconventional-HPC-arch_20220425.pdf (1.5 MB)