Artificial intelligence (AI) researchers have long suspected that the computational demands of training advanced AI models carry a substantial environmental cost. Recent research quantifies that concern: training models for natural-language processing (NLP), the subfield concerned with enabling machines to understand and generate human language, carries a significant carbon footprint. The size of that footprint has surprised many within the AI community and highlights the environmental cost of AI’s rapid advancement.
The Environmental Cost of NLP Model Training
The NLP field has made remarkable progress in recent years, reaching significant milestones in tasks like machine translation, sentence completion, and even the generation of convincing fake news articles, as exemplified by OpenAI’s GPT-2 model. However, these breakthroughs are largely attributed to training ever larger models on vast datasets of text scraped from the internet. This approach is both computationally expensive and highly energy-intensive.
The study examined four prominent NLP models that have driven major performance leaps: the Transformer, ELMo, BERT, and GPT-2. The researchers trained each model on a single Graphics Processing Unit (GPU) for up to 24 hours to measure its power draw. They then multiplied that power draw by the training hours reported in each model’s original research paper to estimate the total energy consumed over a complete training run. Finally, they converted this energy consumption into pounds of carbon dioxide equivalent, using the average energy mix in the United States, which closely mirrors the energy mix used by Amazon Web Services (AWS), a leading cloud service provider.
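The calculation described above (measured power draw, multiplied by training time, converted to carbon dioxide equivalent) can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' own code. The function name and parameters are hypothetical, and the default data-center overhead factor (1.58) and emissions factor (0.954 pounds per kilowatt-hour) are assumptions that are not stated in this article; replace them with figures appropriate to your own hardware and energy mix. The sketch also folds all hardware power into a single per-device figure, which is a simplification.

```python
def training_co2e_lbs(avg_power_watts, training_hours, num_devices=1,
                      overhead_factor=1.58, lbs_co2e_per_kwh=0.954):
    """Estimate pounds of CO2-equivalent emitted by one training run.

    avg_power_watts:   average measured power draw per device, in watts
    training_hours:    total wall-clock training time, in hours
    num_devices:       number of devices running in parallel
    overhead_factor:   ASSUMED multiplier for data-center overhead (cooling etc.)
    lbs_co2e_per_kwh:  ASSUMED emissions factor for the local energy mix
    """
    if avg_power_watts < 0 or training_hours < 0 or num_devices < 1:
        raise ValueError("power and hours must be non-negative; devices >= 1")

    # Watts x hours gives watt-hours; divide by 1000 for kilowatt-hours.
    energy_kwh = (overhead_factor * training_hours * num_devices
                  * avg_power_watts) / 1000.0

    # Convert energy to emissions using the assumed emissions factor.
    return energy_kwh * lbs_co2e_per_kwh


# Hypothetical run: 8 devices drawing 250 W each for 100 hours.
example_lbs = training_co2e_lbs(avg_power_watts=250, training_hours=100,
                                num_devices=8)
print(f"Estimated emissions: {example_lbs:.1f} lbs CO2e")
```

Because the result is a product of its inputs, doubling either the training time or the number of devices doubles the estimate, which is why longer and larger training runs raise emissions so quickly.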
The findings indicate that the computational and environmental costs of training these models grow in proportion to model size. These costs then rise sharply when additional tuning steps are used to improve the model’s final accuracy. Notably, a tuning process called neural architecture search, which tries to optimize a model by incrementally adjusting its neural network design through extensive trial and error, was found to incur extraordinarily high costs for little performance gain. Without this tuning process, the most computationally expensive model, BERT, had a carbon footprint of approximately 1,400 pounds of carbon dioxide equivalent. This figure is comparable to the emissions of a round-trip trans-American flight for one passenger.
Beyond Baseline Calculations: The Full Development Pipeline
The researchers emphasize that these figures represent baseline estimations. Emma Strubell, the lead author of the paper and a PhD candidate at the University of Massachusetts, Amherst, points out that training a single model is the minimum effort involved. In real-world scenarios, AI researchers are more likely to develop entirely new models from scratch or adapt existing models to new datasets. Both of these processes typically involve numerous additional rounds of training and tuning.
To give a fuller picture of the carbon footprint of the complete AI development pipeline, Strubell and her colleagues conducted a case study using a model from their previous research. They found that building and validating a final, publication-worthy model required training 4,789 individual models over a six-month period. That effort emitted more than 78,000 pounds of carbon dioxide equivalent, a figure likely representative of typical work in the field. This broader view underscores the substantial environmental impact of developing advanced AI models.
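To put the case-study numbers in proportion, the short Python sketch below works through two ratios using only figures quoted in this article. The variable and function names are hypothetical, and because the article gives the total as "over 78,000 pounds" and the BERT figure as "approximately 1,400 pounds", both results should be read as rough estimates.

```python
# Figures quoted in the article (both approximate).
MODELS_TRAINED = 4_789          # models trained during the six-month case study
PIPELINE_CO2E_LBS = 78_000      # lower bound on total emissions, in pounds
BERT_BASELINE_CO2E_LBS = 1_400  # approximate single-run BERT footprint, in pounds


def average_per_model(total_lbs, model_count):
    """Average emissions attributable to each trained model."""
    return total_lbs / model_count


def multiple_of_baseline(total_lbs, baseline_lbs):
    """How many baseline training runs the total is equivalent to."""
    return total_lbs / baseline_lbs


avg = average_per_model(PIPELINE_CO2E_LBS, MODELS_TRAINED)
ratio = multiple_of_baseline(PIPELINE_CO2E_LBS, BERT_BASELINE_CO2E_LBS)

print(f"Average per model: {avg:.1f} lbs CO2e")
print(f"Equivalent baseline runs: {ratio:.1f}")
```

On these figures, each individual model in the case study accounts for a small amount of carbon dioxide equivalent, but the total across the whole pipeline comes to dozens of times the single-run BERT baseline.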
Driving Towards Sustainable AI
The growing awareness of the environmental impact of AI, particularly in areas like NLP, is prompting a search for more sustainable practices. Researchers are exploring methods to reduce the energy consumption of model training, such as developing more efficient algorithms, utilizing specialized hardware, and optimizing training strategies. The push for greener AI development is crucial as the technology continues to advance and its applications become more widespread.
Consider the implications of these findings for the future of AI development. As the demand for sophisticated AI models grows, so too does the imperative to mitigate their environmental consequences. The insights from this research serve as a critical call to action for the AI community to prioritize sustainability alongside innovation.

