When we first launched the Datascience.aero blog in 2017, we hoped to regularly update our readers on the evolution of data-science-based solutions as applied to the aviation industry. We presented these updates alongside other breakthrough ideas that might inspire others with new applications. In this spirit, we have continued to publish our thoughts on techniques such as data mining with Federated Learning, blockchain, Artificial Intelligence, Deep Learning and cryptography, among many others; additionally, we have published thoughts on applications such as predictive maintenance, evidence-based training and fuel efficiency.
One of the most widespread branches of Artificial Intelligence is Natural Language Processing (NLP), which helps overcome communication barriers between humans and computers. In fields such as health, these techniques show very promising potential in the analysis of health records, clinical trials and patient reports. Of course, we also have examples like Apple’s Siri or Amazon’s Alexa, which are capable of responding to our questions and commands, or our e-mail provider, which automatically filters spam, classifies our emails and even proposes a (not always) relevant reply.
Although we have dedicated several blog posts to this technique (see here for a complete catalogue), the ongoing initiatives in the aviation industry are still worth a dedicated article.
Based on the analysis of textual data or reports, there are three main applications under development, at varying levels of maturity:
- Among airlines, customer outreach and marketing in general. These techniques already help airlines better identify and understand target customers and tailor their marketing campaigns to reach and engage their passengers. In fact, most airlines have data science profiles among their staff who are dedicated to business intelligence tasks. For these applications, these staff members have the advantage of easy access to extensive data that they can process for internal purposes. For other types of applications requiring operational data, lack of data access and the inability to prepare that data may limit further development, as we have discussed in this blog.
- In spite of limited access to this kind of data, research has been conducted in recent years on the automatic analysis of safety reports. Safety reports are, naturally, highly confidential. To facilitate analysis and benchmarking, significant effort has been devoted to standardizing these reports across responsible institutions and national authorities, to no real avail. Currently, the reporting forms include multiple-choice fields (e.g. to select the type of incident), fixed-format fields (e.g. date and time) and also free text. Typically, the best description of the event and its circumstances is provided by the people involved (pilots, controllers, etc.) in the free-text box. While analysis and investigation are sensitive processes, the possibility of analyzing vast amounts of reports, which are already duly anonymized, could significantly enhance safety intelligence: precursor analysis, underlying trends, hidden hazards, etc.
- Predictive maintenance is a key area for airlines, as maintenance, repair and overhaul (MRO) entails a huge cost for them. The size of this market ($118B in 2019 according to Forbes) justifies the resources that airlines are allocating to predict component failures, including monitoring and analyzing different performance indicators. In this sense, the role of NLP techniques is crucial in enabling the analysis of technical documents (such as aircraft manuals) as well as pilots’ and mechanics’ notes.
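To make the safety-report case a little more concrete, here is a minimal sketch of how the free-text box of a report could be mined for recurring terms. Everything in it is an assumption made for illustration: the `SafetyReport` structure, the tiny stop-word list, the helper names and the three sample narratives are hypothetical and do not come from any real reporting system or dataset. It uses only the Python standard library and simple word counting, which is far cruder than what a production NLP pipeline would use.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "on", "in",
              "was", "during", "after", "at", "with"}


@dataclass
class SafetyReport:
    """Hypothetical report layout mirroring the three kinds of fields."""
    incident_type: str  # multiple-choice field
    date: str           # fixed-format field, e.g. "2019-03-02"
    narrative: str      # free-text field written by the people involved


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower())
            if t not in STOP_WORDS]


def term_report_counts(reports):
    """Count, for each term, how many report narratives mention it."""
    counts = Counter()
    for report in reports:
        # A set ensures a term is counted once per report.
        counts.update(set(tokenize(report.narrative)))
    return counts


# Invented sample narratives, not real safety data.
reports = [
    SafetyReport("runway incursion", "2019-03-02",
                 "Vehicle crossed the runway during landing clearance."),
    SafetyReport("bird strike", "2019-05-11",
                 "Bird strike on the left engine after takeoff."),
    SafetyReport("runway incursion", "2019-07-23",
                 "Aircraft entered the runway without clearance."),
]

counts = term_report_counts(reports)
for term, n in sorted(counts.items()):
    if n > 1:
        print(f"{term}: mentioned in {n} reports")
```

Terms that recur across many narratives (here, for instance, those shared by the two invented runway incursion reports) are the kind of signal that, at scale, could hint at precursors or underlying trends. A realistic system would also need domain vocabulary, abbreviation handling and proper anonymization checks, none of which are attempted here.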
There is also huge potential in the application of NLP techniques for voice recognition purposes and, in particular, for conversations between pilots and controllers. While strong regulations currently prevent development in this area, digitalizing controllers’ voice recordings could have numerous applications, from improving safety analysis to enabling the automation of some control tasks; this, in turn, could reduce controllers’ workload.