The weak case for Trusted Third Parties in Data Science

David Perez

2019-01-17 11:25:45

Trusted third parties are sought after whenever confidential data needs to be managed. We’ve covered this topic on the Datascience.aero blog before.

A trusted third party is an entity that collects confidential information, which the data owner normally considers sensitive for competitive or legal reasons. The trusted party performs only the agreed computations, and the results are distributed as agreed by all parties. It then usually deletes the data, normally within a few hours or days, so that the confidential datasets cannot leak in the future.

‘Trusted’ signifies that the data owners have confidence that the system will protect their interests. However, ‘trusted’ also implicitly means that there is no way to guarantee that the protection is effective. Surprisingly, in practice the secure deletion of the data is not normally certified by an independent security auditor who could vouch for correct governance.

The agreed computations could be, for instance, blind-benchmarking operations in which each data owner receives a comparison of its own performance against an industry average. The industry average of a particular operational indicator is usually computed only when at least three participants contribute to it, which is normally referred to as the “rule of the three”.
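As a minimal sketch of the computation a trusted third party would run, the snippet below applies the “rule of the three” before releasing an average. The function name, the airline labels and the delay figures are hypothetical and chosen only for illustration; a real benchmark would define its own indicators.

```python
from statistics import mean

MIN_PARTICIPANTS = 3  # threshold from the "rule of the three"


def blind_benchmark(values_by_owner):
    """Return each owner's own value next to the industry average.

    Returns None when fewer than MIN_PARTICIPANTS owners contributed,
    because the average would then reveal too much about each of them.
    """
    if len(values_by_owner) < MIN_PARTICIPANTS:
        return None
    industry_average = mean(values_by_owner.values())
    return {
        owner: {"own": value, "industry_average": industry_average}
        for owner, value in values_by_owner.items()
    }


# Hypothetical indicator: average departure delay in minutes per airline.
delays = {"airline_a": 12.0, "airline_b": 9.0, "airline_c": 15.0}
report = blind_benchmark(delays)
```

Each participant would only ever receive its own entry of `report`, never the values of the others.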

While trusted third parties, whether independent industry consultants, authorities or sector associations, have been the standard solution to the blind-benchmarking challenge in many industries, the current state of the art in cryptography enables more sophisticated technical approaches. These alternative approaches provide a level of security that guarantees data confidentiality and allow more secure protocols.

Blind benchmarking based on trusted third parties, however, carries both risks and limitations.

In terms of risks, centralising data collection brings the obvious risk of cybersecurity breaches: holding all of an industry’s data at a single point of collection makes that point much more attractive to potential attackers. Additionally, transmitting the data across networks and relying on secure deletion introduce further risks.

This setup also presents important limitations. For instance, the security of the “rule of the three” protocol depends on how often each of the different parties occurs in the data. If a particular indicator is computed at a hub airport, the airline that mainly operates from that hub can very easily leak important information. As many large airports are the main hub of one airline, the average indicator at that hub is very likely to be close to the value of the airline that operates from it, thus releasing unwanted information about that airline’s operations.
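The hub effect can be illustrated with a small numerical sketch. The flight counts and delay values below are invented for the example, and the flight-weighted average is an assumption about how such an indicator might be aggregated.

```python
def weighted_average(observations):
    """Flight-weighted average of an indicator.

    `observations` maps an airline name to a (flights, average_delay) pair.
    """
    total_flights = sum(flights for flights, _ in observations.values())
    weighted_sum = sum(flights * delay for flights, delay in observations.values())
    return weighted_sum / total_flights


# Hypothetical hub airport: one airline operates 90% of the flights.
hub = {
    "hub_airline": (900, 14.0),
    "airline_b": (50, 6.0),
    "airline_c": (50, 10.0),
}

hub_average = weighted_average(hub)
hub_airline_delay = hub["hub_airline"][1]

# Distance between the published average and the hub airline's private value.
leak_gap = abs(hub_average - hub_airline_delay)
```

Three airlines participate, so the “rule of the three” is satisfied, yet `leak_gap` stays well under one minute: anyone who sees the published average learns the hub airline’s value almost exactly.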

Trusted third parties are normally operated manually. Information is uploaded, for instance through FTP, and then handled by hand, which leads to considerable overhead. In this scenario it is impossible to run more sophisticated computations, such as real-time alarms when a metric has grown on a particular day, or ad-hoc computations in which only a few participants are involved. These are just a few of the interesting blind-benchmarking protocols that such a setup rules out.

The alternative to a trusted third party is to use state-of-the-art cryptographic techniques. A cryptographic protocol can be verified to operate in your interests. The system, including the protocol, should be verifiably secure; in other words, it should not need your trust.

This system would also keep the data inside each participant’s safe corporate storage system, thus avoiding the need for a centralised collection system. This reduces the risks of transmission and of data management in centralised storage, and makes the data a less attractive target for attackers. In addition, an automated system that uses cryptographic protocols allows peer-to-peer communication between participants, who share only the information needed to perform the computations, which is never sufficient to reveal confidential information. This allows for increased flexibility in comparing datasets, more insightful metrics between hub operators, and operations between various stakeholders such as air navigation service providers and airline operations.
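The text does not name a specific technique, so the following is only one possible illustration: additive secret sharing, a standard building block of secure multi-party computation. All names, the modulus and the sample values are assumptions, and the exchange between participants is simulated inside a single process rather than over a network.

```python
import secrets

MODULUS = 2**61 - 1  # assumed modulus, far larger than any realistic sum


def make_shares(value, n_parties):
    """Split `value` into n random-looking shares that sum to it mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last_share = (value - sum(shares)) % MODULUS
    return shares + [last_share]


def secure_sum(private_values):
    """Simulate a peer-to-peer sum in which no raw value leaves its owner."""
    n = len(private_values)
    # Row i holds the shares participant i sends out, one per participant.
    shares_sent = [make_shares(value, n) for value in private_values]
    # Participant j adds up the shares it received and publishes only that sum.
    partial_sums = [
        sum(shares_sent[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    # The published partial sums add up to the total of the private values.
    return sum(partial_sums) % MODULUS


# Hypothetical private indicator per airline: delay in whole minutes.
private_delays = [12, 9, 15]
total = secure_sum(private_delays)
average = total / len(private_delays)
```

Each individual share is a uniformly random number, so a participant that receives one learns nothing about the sender’s value; only the final total, and hence the industry average, becomes known.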

DataBeacon is a data platform exploring the opportunities in Artificial Intelligence and cryptographic systems for aviation. DataBeacon works on new ways to provide blind-benchmarking capabilities to bring new insights to aviation without compromising data ownership.

DataBeacon is developed for and supported by a wide aviation consortium formed by several operational stakeholders, including airlines, airports, air navigation service providers and Eurocontrol.

If you need more information on DataBeacon, please email us at [email protected].


© datascience.aero