Taking a next step in data mining in aviation with Google’s Federated Learning

Samuel Cristobal

2018-02-27 16:11:06

A few weeks ago we explored the need for smart(er) data protection mechanisms in aviation to enable open data and, ultimately, foster research and applications. In the previous post we explained why it is not possible to assess how removing certain variables affects our data mining models unless the analysis covers the complete data set, private fields included, from the outset. In practice, data owners might agree to provide access to their private data as long as it is not shared; i.e. the data stays, and is analysed, on their premises. This leads to a scenario in which data mining exercises are performed in a decentralised manner.

Fortunately, a solution to this problem already exists in the machine learning domain: Google’s Federated Learning algorithm. According to Google, “Federated Learning enables mobile phones to collaboratively learn a shared prediction model while keeping all the training data on device, decoupling the ability to do machine learning from the need to store the data in the cloud.” Note that this goes beyond the use of local models that make predictions on mobile devices, because it brings model training to the device as well.

In short, with Federated Learning each device downloads the current model, improves it by learning from the device’s own data, and then summarises the changes in a small, focused update. Only these changes are shared, and they are combined with the updates from other devices to improve the overall, shared model. The model and its updates are integrated through a continuous release channel; since only the changes to the trained model are shared, the private data never leaves the device, so it remains confidential and privacy is respected.
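The download, local training, and update-sharing cycle described above can be illustrated with a minimal sketch of federated averaging in plain Python. This is not Google’s implementation: the function names (`local_update`, `aggregate`, `make_device_data`), the tiny linear model, the learning rate, and the synthetic data are all assumptions chosen only to make the idea concrete.

```python
import random


def local_update(weights, data, lr=0.05, epochs=5):
    """Train a copy of the shared model on one device's private data.

    Returns only the change in the weights (the small, focused update);
    the data itself never leaves this function.
    """
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y      # prediction error on one sample
            w -= lr * err * x          # stochastic gradient descent step
            b -= lr * err
    return (w - weights[0], b - weights[1])


def aggregate(weights, updates, sizes):
    """Merge device updates into the shared model, weighted by sample count."""
    total = sum(sizes)
    dw = sum(u[0] * n for u, n in zip(updates, sizes)) / total
    db = sum(u[1] * n for u, n in zip(updates, sizes)) / total
    return (weights[0] + dw, weights[1] + db)


def make_device_data(rng, n):
    """Hypothetical private data set: noisy samples of y = 2x + 1."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.05)) for x in xs]


rng = random.Random(0)
# Three devices, each holding a private data set of a different size.
devices = [make_device_data(rng, n) for n in (20, 35, 50)]

shared = (0.0, 0.0)  # initial shared model (slope, intercept)
for _ in range(30):  # 30 communication rounds
    updates = [local_update(shared, d) for d in devices]
    shared = aggregate(shared, updates, [len(d) for d in devices])

print("shared model (slope, intercept):", shared)
```

Only the two-number update returned by `local_update` crosses the device boundary in this sketch. Weighting each update by the number of local samples is one common design choice; a production system would typically also add safeguards such as secure aggregation, which are omitted here.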

What does this mean for the aviation industry? Imagine every aircraft running the same Federated Learning algorithm, learning from its own flight performance and contributing updates to a single, shared machine learning model. This is similar to the Safety Management Systems that airline operators run today, but built on a shared scheme. With this approach, private aircraft data remains confidential, and no internet connection is required in flight. All processing is done on board, and the overall model is improved and made available to all users when the aircraft arrives at the hub, much as data is retrieved from the Quick Access Recorders in a Flight Data Monitoring system. Furthermore, since the model is constantly updated, it can be used to make local, on-board predictions while offline that still take the latest improvements into account.

Learn more about Federated Learning and how it works in detail in Google’s Research Blog.

© datascience.aero