In February 2020, the European Commission published a document providing an overview of the objectives Europe is pursuing in its digital transformation. The text mentions that the ICT sector is responsible for between 5% and 9% of global electricity consumption and 2% of all emissions, and with the current trend of increasing digitalisation these numbers are expected to go up. It is clear that AI is going to play an important role in helping us reduce our environmental footprint, but while we tend to see AI solutions as intangible, the reality is that they require many hours of computation or, in other words, consume a considerable amount of electricity.
It is only recently, thanks to studies such as this one, that attention has begun to be paid to the environmental impact of some of the latest AI models in use. In that study, the researchers measured the computational effort required to train a selection of the most modern NLP (Natural Language Processing) models, with the aim of quantifying the approximate financial and environmental costs associated with training them. They found that training a BERT model, one of the current state-of-the-art models developed by Google, for 24 hours on a GPU produced roughly the same amount of CO2 (approx. 650 kg) as a round-trip trans-American flight for one person. They also discovered that, while the environmental cost increased proportionally with the size of the models, it skyrocketed when some sort of tuning step was introduced. Using a NAS (Neural Architecture Search) tuning process caused the environmental cost to explode and reach the equivalent of five times the average lifetime emissions of a US car (approx. 284,000 kg). While this last case may be an extreme one, the reality is that the development pipeline of an AI solution includes much more than just training: it includes data collection, data processing and cleaning, and then several training iterations. So, in order to get a more realistic estimate, the researchers used data from a previous model they had been developing. Using this case study, they estimated a total emission of approximately 35,000 kg of CO2 (roughly four times the yearly emissions from an average home’s energy use), which we can take as a more realistic estimate of the cost of a typical AI development cycle.
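To put these numbers in perspective, the arithmetic behind such estimates is fairly simple: the energy consumed is roughly the power drawn by the hardware multiplied by the training time and a data-centre overhead factor, and the emissions follow from the carbon intensity of the local electricity grid. The sketch below is only illustrative; the power draw, overhead (PUE) and carbon intensity values are assumptions, not figures from the study.

```python
# Back-of-envelope CO2 estimate for a training run.
# All numbers are illustrative assumptions, not figures from the study.

gpu_count = 1                # GPUs used for training
gpu_power_kw = 0.25          # assumed average power draw per GPU (kW)
hours = 24                   # duration of the training run
pue = 1.6                    # assumed data-centre overhead (Power Usage Effectiveness)
grid_kg_co2_per_kwh = 0.45   # assumed carbon intensity of the local grid

energy_kwh = gpu_count * gpu_power_kw * hours * pue
co2_kg = energy_kwh * grid_kg_co2_per_kwh

print(f"Energy: {energy_kwh:.1f} kWh, CO2: {co2_kg:.1f} kg")
# With these assumptions a single-GPU day comes to ~9.6 kWh and ~4.3 kg CO2e.
# The figures reported in the study are far larger because real training runs
# use many accelerators, run for much longer, and are repeated many times
# during development.
```

The same formula also makes clear why location matters: the grid carbon intensity term can vary by an order of magnitude between regions, a point that comes up again below.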
Admittedly, one can argue that there are multiple variables affecting the computational performance of these models, or that NLP models are among the most computationally expensive in AI, but the sheer magnitude of the results should be a cause for concern. One solution already being worked on is the development of more energy-efficient hardware. While this is likely to reduce some of the environmental cost, the increasing use and development of power-hungry AI models may make it insufficient in the future. A second, simpler solution would be to increase the use of renewable energy sources in the AI pipeline. Thanks to the growing use of cloud services such as AWS, the power-intensive computational parts of model development can be moved to data centres in regions where the carbon intensity of the electricity grid is lower. Companies like Amazon and Google are already investing heavily in renewable energy for their services, which should help reduce the global footprint of AI. But even if this solution proves effective, the problem should not be tackled only from a hardware perspective; it should also be tackled from a software perspective. There is a movement called “Green AI” which seeks to make AI both greener and more inclusive. It classifies current research into “Red AI” and “Green AI”. The former encompasses research focused on obtaining state-of-the-art accuracy through the use of massive computational power. While the value of this research should not be underestimated, the aim of the movement is to encourage research that takes a different path and focuses on optimising efficiency rather than accuracy. The ultimate aim is not to completely change the current AI research paradigm, but to promote awareness among researchers and the use of efficiency as an additional metric when evaluating a model, rather than just its accuracy (and related measures). There are already projects such as “experiment-impact-tracker” or “Carbontracker” that aim to create simple tools to help researchers track and predict the energy consumption and carbon emissions of training AI models.
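As an illustration of how lightweight this kind of tracking can be, the snippet below shows roughly how Carbontracker wraps a training loop, following its documented usage pattern (the exact API may differ between versions, and train_one_epoch is just a placeholder standing in for your own training code):

```python
from carbontracker.tracker import CarbonTracker

def train_one_epoch():
    # Placeholder for your own training step.
    pass

max_epochs = 10

# Monitors energy use during the first epochs and predicts the total
# consumption and CO2 emissions of the full training run.
tracker = CarbonTracker(epochs=max_epochs)

for epoch in range(max_epochs):
    tracker.epoch_start()
    train_one_epoch()
    tracker.epoch_end()

# Stop tracking and print the final consumption/emissions report.
tracker.stop()
```

Reporting numbers like these alongside accuracy would be a small step towards the efficiency metric that the Green AI movement advocates.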
These findings may seem very negative at first, but it is necessary to bear in mind, as mentioned before, that there are multiple factors affecting the overall carbon footprint of AI. Most importantly, the potential benefit that the use of these models may have in reducing environmental impact is not being taken into account. Take, for example, an AI model used to reduce aircraft emissions by optimising air routes. On the one hand, such a model would probably not be as computationally expensive as the ones studied, and on the other hand, the emissions generated during its development would be practically negligible compared with the benefit it creates. So, while the environmental impact of AI models is not currently an urgent issue, it certainly needs to be taken into account, and awareness raised among researchers, in order to prevent it from becoming a future problem. Last but not least, I believe it is important to highlight the unintended effects that the constant development of new models with immense computational costs can have on the research community. Keeping up to date with AI research increasingly means spending large amounts of money on specialised equipment or relying on cloud computing services such as AWS, Google Cloud and Microsoft Azure. These costs can create an insurmountable barrier for researchers, especially those from smaller institutions and developing countries, who want to contribute to the world of AI. There is a growing gap in the research field between academia and industry, in favour of the latter. In the long run this could lead to a number of problems, so the development of more efficient models is a necessity.
Hope you’ve enjoyed this blog and don’t forget to visit datascience.aero for other interesting blogs about data science in aviation.