BigScience’s global collaborative effort to develop BLOOM, one of the largest open multilingual NLP (Natural Language Processing) models in the world, has been recognized in the annual HPCwire Readers’ and Editors’ Choice Awards, presented at the 2022 International Conference for High Performance Computing, Networking, Storage, and Analysis (SC22) in Dallas, Texas. The award for Best HPC Collaboration of the year was given jointly to the GENCI, IDRIS and Hugging Face teams.

BigScience was a community adventure as well as a research and engineering challenge. It gathered more than 1,200 researchers from academia and industry (startups, SMEs, large groups) across 38 countries with the goal of developing and training BLOOM on a public HPC infrastructure, the Jean Zay supercomputer of GENCI (Grand Equipement National de Calcul Intensif), hosted and operated at IDRIS (Institut du développement et des ressources en informatique scientifique, CNRS).

Orchestrated by Hugging Face, the open-source AI start-up, 30 working groups set to work between mid-2021 and mid-2022, addressing every step in building such a large language model (LLM): data governance, choice of input data and sources, modeling, model evaluation, engineering (including optimization and scaling), generalization, ethical AI and legal frameworks. Along the way, the project introduced the ROOTS open multilingual dataset and the RAIL open AI license.

The final and biggest version of BLOOM, with 176 billion parameters over 70 layers, learned from a total of 1.61 terabytes of text spanning 46 natural languages and 13 programming languages. The engineering working group attained state-of-the-art throughput with this Transformer-based model on the latest NVIDIA A100 80 GB partition of the Jean Zay supercomputer (more than 400 A100 GPUs out of the more than 3,100 GPUs in the total configuration).
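As a rough sanity check, the 176-billion-parameter figure can be reproduced from the standard Transformer parameter-count formula. The hidden size (14,336) and tokenizer vocabulary (250,880) below are assumptions taken from BLOOM's published configuration, not from this press release:

```python
# Back-of-the-envelope parameter count for a dense Transformer.
n_layers = 70         # from the text
d_model = 14_336      # assumption: BLOOM-176B hidden size
vocab = 250_880       # assumption: BLOOM tokenizer vocabulary size

# Attention (4*d^2) + MLP (8*d^2) weights per layer ~= 12*d^2
core = 12 * n_layers * d_model**2
embeddings = vocab * d_model
total = core + embeddings

print(f"~{total / 1e9:.1f}B parameters")
```

The estimate lands within a percent or so of the stated 176B, with the embedding matrix contributing roughly 3.6B of the total.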

From right to left: Pierre-François Lavallée (IDRIS), Tom Tabor (HPCwire), Stéphane Requena (GENCI)

With the support of experts from IDRIS, Hugging Face, Microsoft and NVIDIA (using the Megatron-DeepSpeed framework), the model reached a sustained performance of 156 TFlop/s per GPU (about 50% of the A100's BF16 peak performance).
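That throughput figure can be tied back to the training duration with the common 6·N·D estimate of training FLOPs. This is only a sketch: the token count (~366 billion) and the GPU count (48 nodes × 8 A100s) are assumptions drawn from BLOOM's published training details, not stated in this release:

```python
# Idealized training-time estimate from sustained throughput.
n_params = 176e9          # from the text
tokens = 366e9            # assumption: approximate BLOOM training-token count
n_gpus = 48 * 8           # assumption: 8 A100 GPUs per Jean Zay node
flops_per_gpu = 156e12    # sustained 156 TFlop/s per GPU, from the text

total_flops = 6 * n_params * tokens             # classic 6*N*D approximation
seconds = total_flops / (n_gpus * flops_per_gpu)
print(f"~{seconds / 86400:.0f} days at full sustained throughput")
```

The idealized result (~75 days) sits below the reported 3.5-month wall-clock time, which is expected once checkpointing, restarts and evaluation overheads are included.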

The training of BLOOM-176B took 3.5 months, consuming 1,082,990 compute hours across 48 Jean Zay nodes and requiring a total power consumption of 433 MWh, representing a carbon footprint of only 25 tons of CO2-equivalent emissions.
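The compute and energy figures above are mutually consistent, as a quick check shows. The per-node GPU count (8 A100s) is an assumption about Jean Zay's A100 partition, not stated in the text:

```python
# Cross-checking the stated compute, duration and energy figures.
gpu_hours = 1_082_990      # total compute hours, from the text
n_gpus = 48 * 8            # assumption: 8 A100 GPUs per Jean Zay node
energy_kwh = 433_000       # 433 MWh, from the text

hours_per_gpu = gpu_hours / n_gpus
days = hours_per_gpu / 24                   # wall-clock days if all GPUs ran continuously
avg_kw_per_gpu = energy_kwh / gpu_hours     # average draw per GPU-hour, incl. node overhead

print(f"~{days:.0f} days of continuous running, ~{avg_kw_per_gpu * 1000:.0f} W per GPU-hour")
```

The roughly 117 days of continuous GPU time is in the same ballpark as the reported 3.5 months, and the ~400 W average per GPU-hour is plausible for an A100 plus its share of node and cooling overhead.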

BLOOM is openly available under a RAIL (Responsible AI License) that restricts potentially harmful use cases that BLOOM could enable. More information here:

