Now that the ultra-hype period has passed, I think it is time to talk about Google Duplex. If you haven’t heard about it, there is a great post on the Google AI blog, or you can watch the following video.
At first it might seem scary, then it might feel exciting. Are we actually approaching an AI capable of passing the Turing test? Well, certainly not, despite the impressive “Aham” trick that Google’s data scientists insist it learned by itself. We are still ages away from any general form of AI. The difference here is that each conversation has a clear context and goal. When working in AI, it is more effective to first tackle smaller, well-defined intelligent tasks.
Google Duplex is a good example of the current state of AI development. Machine learning applications no longer need to be built from scratch, and algorithms do not need to be explicitly coded. AI in computer science is no longer about programming, but about putting the right pieces together and filling the gaps to create a well-rounded training set. In this case, Google Duplex is a mix of natural language processing, speech recognition, artificial voice generation (text-to-speech), interpretation and context management, all supported by deep neural networks. It was reportedly trained on billions of US conversations provided by Google Voice users. The effort is no longer about programming a system for every possible conversation or scenario, but rather about creating a sufficiently large set of examples, paired with a good learning algorithm, so that most cases are covered.
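To make the “putting the pieces together” idea concrete, here is a deliberately simplified sketch of how such a pipeline could be wired up. Every function below is a hypothetical stub standing in for a trained model; the names, slot logic and phrases are my own illustrative assumptions, not Duplex’s actual components.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a Duplex-style dialogue pipeline.
# Each stage is a stub standing in for a trained model.

@dataclass
class DialogueContext:
    goal: str                       # the precise task, e.g. "book a table"
    slots: dict = field(default_factory=dict)

def speech_to_text(audio: bytes) -> str:
    # Placeholder for a speech-recognition model.
    return audio.decode("utf-8")

def understand(utterance: str, ctx: DialogueContext) -> DialogueContext:
    # Placeholder for NLU: extract slot values relevant to the goal.
    if "7pm" in utterance:
        ctx.slots["time"] = "7pm"
    return ctx

def respond(ctx: DialogueContext) -> str:
    # Placeholder for response generation conditioned on the context.
    if "time" in ctx.slots:
        return f"Great, {ctx.slots['time']} works."
    return "What time would you like?"

def text_to_speech(text: str) -> bytes:
    # Placeholder for a voice-generation model.
    return text.encode("utf-8")

# One turn of the conversation loop:
ctx = DialogueContext(goal="book a table")
heard = speech_to_text(b"Do you have a table at 7pm?")
ctx = understand(heard, ctx)
reply = text_to_speech(respond(ctx))
print(reply.decode())  # Great, 7pm works.
```

The point of the sketch is the architecture, not the stubs: each stage is replaceable by a learned model, and the “programming” that remains is the glue and the training data.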
No, this is not a step towards full automation. Full automation would not require humans in the loop, and therefore no fancy chat bots either. What could be useful is the contextual information over a very precise topic. A communication support tool that automatically recognises the conversation and tries to detect disagreements could improve safety beyond our best expectations. Miscommunication has been a recurring topic in aviation safety, especially when it involves so many different accents and non-native speakers.*
In a 2017 report**, the CAA identified 267 reports in the MOR database as “related to miscommunication in some way, either the primary focus of the reported incident or concern of the reporter (less common), or ancillary to another event (as was the case with the majority of MORs which contained language-related miscommunication).”
Having a support system like Duplex could help both speakers. The system does not need to speak, as it could run separately, listening to the conversation in the background and providing contextual information in real time. If the context doesn’t match what the speaker is understanding, they could ask for further clarification. After all, humans are still in control, at least for now.
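As a toy illustration of the background-listening idea, the snippet below compares an instruction against its readback and flags likely mismatches. It assumes transcripts are already available and uses a simple string-similarity ratio with an arbitrary threshold; a real system would need proper speech recognition and phraseology-aware matching.

```python
from difflib import SequenceMatcher

# Hypothetical sketch: flag a readback that differs suspiciously
# from the original instruction. The 0.75 threshold is an
# illustrative assumption, not a calibrated value.

def readback_mismatch(instruction: str, readback: str,
                      threshold: float = 0.75) -> bool:
    """Return True when the readback looks too different to be safe."""
    similarity = SequenceMatcher(None, instruction.lower(),
                                 readback.lower()).ratio()
    return similarity < threshold

print(readback_mismatch("descend flight level 80",
                        "descend flight level 80"))  # False: exact readback
print(readback_mismatch("turn left heading 090",
                        "roger"))                    # True: no real readback
```

Even this crude check captures the idea: the tool never joins the conversation, it only surfaces a warning so the humans can ask for clarification.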
–
* Fatal miscommunication: English in aviation safety https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.0883-2919.2004.00368.x
** Aviation English Research Project: Data analysis findings and best practice recommendations https://publicapps.caa.co.uk/docs/33/CAP1375%20Mar17.pdf