Data emissions and how to deal with them

Clara Argerich

2022-09-21 12:18:31

Sustainability is becoming a major topic for energy-intensive industries, and the need to reduce carbon footprints and emissions is all over the news. With the European Union aiming to reach net-zero emissions by 2050, virtually all of these industries and their stakeholders are working towards a greener future. At the same time, data businesses have experienced huge growth as companies start to realise the value of data and the benefits of storing, analysing and understanding it. However, data also has a carbon footprint, as described in this previous post from our colleague Pablo Hernández, so we must understand the impact of artificial intelligence and the emissions it generates.

Right now, according to Climate Care, internet emissions account for 3.7% of all global emissions, more than aviation. This number is expected to double by 2025. Data centers alone account for 0.3% of global emissions, and with the current growth in data use, these numbers are expected to keep increasing in the coming years. So, if we want Big Data and artificial intelligence to play bigger roles in a greener future, we must also account for the contribution of virtual emissions to global emissions.

How can industries and stakeholders reduce the impact of their data emissions and how can data science play a role? Here are some key aspects to consider:

  • Awareness: it is important to understand that data flow requires energy and thus generates emissions. For instance, information on the topic can be found at Eco2Clouds, a European project developed under the FP7 framework and devoted to developing methods, guidelines and technology that allow cloud computing to properly take into account ecological concerns such as energy consumption and CO2 footprint.
  • Define sustainability goals: the key to reducing carbon footprint with data is balance. For instance, as mentioned in our previous post, an artificial intelligence model used to optimise flight trajectories, and thereby reduce CO2 aviation emissions, is beneficial overall even though running it generates emissions of its own. Another way of reducing virtual emissions is to ensure that data is processed and stored using green energy sources. Maintaining a sustainability agenda is likely to become mandatory in the near future, as environmental policies are expected to grow. For instance, the European Investment Bank Group has approved supporting €1 trillion of investment in climate and the environment by 2030 (see this link).
  • From a data science point of view, optimisation of data processes and efficiency is always a goal.
    • Define optimised data pipelines and ETL pipelines: reduce the number of interactions within the cloud, avoid data duplication and avoid storing useless data. Data science techniques for dimensionality reduction are key at this stage. By understanding the meaning hidden in large datasets, we can greatly reduce the number of entries required.
    • Use artificial intelligence as a tool to optimise data center infrastructure and energy consumption. Much of the literature describes how artificial intelligence techniques can support optimisation problems, from capacity planning and efficiency analysis to infrastructure optimisation. More detailed information for the interested reader can be found in this article by Google and this post by Linesight.
    • Smart data solutions: Big Data can be described by the five Vs: value, variety, volume, velocity and veracity. Smart Data reduces volume by using only the information that is useful for a given problem. Variety may or may not be reduced, but value, velocity and veracity (accuracy) should all increase as volume decreases. Reducing data leads to artificial intelligence models that cost less computation to run, and therefore use less energy and generate fewer emissions.
  • Small actions: there are guidelines that help everyday internet users reduce their carbon footprint. For instance, this fine analysis shows how reducing attachments in emails could have a positive impact on total email emissions.
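
To make the pipeline and Smart Data points above a bit more concrete, here is a minimal sketch in Python of two very simple volume-reduction steps: dropping duplicate records and dropping columns whose value never changes. It is only an illustration, not a full dimensionality reduction technique, and every name in it (the `readings` sample data and the functions `drop_duplicate_rows`, `drop_constant_columns` and `reduce_volume`) is hypothetical and invented for this example.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def drop_duplicate_rows(rows: List[Row]) -> List[Row]:
    """Keep only the first occurrence of each identical record."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # order-independent fingerprint of the record
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def drop_constant_columns(rows: List[Row]) -> List[Row]:
    """Remove columns whose value is the same in every record."""
    if len(rows) < 2:
        return rows  # with fewer than two records, "constant" is meaningless
    keep = [col for col in rows[0] if len({row[col] for row in rows}) > 1]
    return [{col: row[col] for col in keep} for row in rows]


def cell_count(rows: List[Row]) -> int:
    """Number of stored values, used here as a rough proxy for data volume."""
    return sum(len(row) for row in rows)


def reduce_volume(rows: List[Row]) -> Tuple[List[Row], float]:
    """Apply both steps and return the reduced data and the fraction of cells saved."""
    before = cell_count(rows)
    reduced = drop_constant_columns(drop_duplicate_rows(rows))
    saved = 0.0 if before == 0 else 1 - cell_count(reduced) / before
    return reduced, saved


# Hypothetical sensor readings, made up for illustration only.
readings = [
    {"sensor": "a", "unit": "C", "temp": 20.5},
    {"sensor": "a", "unit": "C", "temp": 20.5},  # exact duplicate of the first record
    {"sensor": "b", "unit": "C", "temp": 21.0},
    {"sensor": "c", "unit": "C", "temp": 19.8},
]

reduced, saved = reduce_volume(readings)
print(f"{len(reduced)} records kept, {saved:.0%} of stored values removed")
```

On real datasets, the same idea would usually be implemented with a data-frame library and complemented by proper dimensionality reduction methods. The point of the sketch is simply that less stored and processed data tends to mean less computation, and therefore less energy.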

To sum up, it is important to take action in every field in order to achieve net-zero emissions. Data science is a powerful tool that provides solutions across many different fields, but it also comes with its own carbon footprint, which must be accounted for in the fight to reduce climate impact.
