Published September 22, 2023 | Version v1
Conference paper Open

RED-SEA Project: Intermediate Results

Description

The overall goal of RED-SEA is to design a new generation of European interconnection network that enables exascale computing in Europe through an economically viable and technologically efficient interconnect. The project builds on European interconnect technology (BXI) combined with standard, mature technology (Ethernet); on earlier EU-funded initiatives such as ExaNeSt, EuroEXA, ECOSCALE, Mont-Blanc, the DEEP projects, and the European Processor Initiative (EPI); and on open standards and compatible APIs.

To achieve this overall goal, the RED-SEA project is organized around four fundamental pillars:

i)       architecture and co-design, with the aim of optimizing the fit with the other EuroHPC projects and with the EPI processors;

ii)      development of a high-performance, low-latency, seamless bridge with Ethernet;

iii)     network resource management, including congestion and quality of service; and

iv)     end-to-end functions implemented in the network.

This paper presents the main achievements reached at the project's midpoint by the two Spanish partners participating in the project, namely the Universitat Politècnica de València (UPV) and the Universidad de Castilla-La Mancha (UCLM), which contribute to pillars 1 and 3. In this regard, the following are worth highlighting:

i)       the definition of the network requirements and the network architecture, an initial list of applications, and the modeling of the BXI3 architecture so that the performance of the project's proposals can be evaluated;

ii)      the characterization of application congestion, and proposals to reduce this congestion by optimizing collective communication primitives.

Files

23_RED_SEA_Sarteco.pdf

Additional details

Related works

Is part of
Conference proceeding: 978-84-09-54466-0 (ISBN)

Funding

RED-SEA – Network Solution for Exascale Architectures 955776
European Commission
