Emotionally intelligent machines may not be as far away as they seem, as recent news reports and studies suggest.
Over the last few decades, artificial intelligence (AI) has become increasingly good at reading emotional reactions in humans.
But reading is not the same as understanding.
AI can now, among other things, recognise faces, turn face sketches into photos, recognise speech and play Go.
Recently, researchers have developed an AI that is able to tell whether a person is a criminal just by looking at their facial features. The AI mistakenly categorised innocents as criminals in only around 6 percent of the cases, while it was able to successfully identify around 83 percent of the criminals. This leads to a staggering overall accuracy of almost 90 percent.
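The "almost 90 percent" figure can be checked with a little arithmetic. The sketch below assumes that overall accuracy is the share of correct decisions across both groups, and that criminals and non-criminals are equally represented in the test set; that 50/50 split is an assumption, not something stated above. The function name `overall_accuracy` and the 1-in-100 scenario are hypothetical and only there for illustration.

```python
def overall_accuracy(true_positive_rate, false_positive_rate, positive_share):
    """Share of correct decisions, weighting each class by its prevalence."""
    true_negative_rate = 1.0 - false_positive_rate
    return (true_positive_rate * positive_share
            + true_negative_rate * (1.0 - positive_share))


# Rates quoted in the text; the 50/50 class balance is an assumption.
balanced = overall_accuracy(0.83, 0.06, positive_share=0.5)

# Hypothetical population in which only 1 person in 100 is a criminal.
rare = overall_accuracy(0.83, 0.06, positive_share=0.01)

print(f"balanced classes: {balanced:.3f}")
print(f"rare positives:   {rare:.3f}")
```

With balanced classes the result is about 0.885, which matches the "almost 90 percent" quoted. The second call shows that the headline accuracy depends on how common the positive class is, so the number should be read together with the make-up of the test set.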
It is argued that, if we are to comfortably live and interact with robots, these machines should be able to understand and appropriately react to human emotions.
The company claims to be able to correctly identify emotions with 80% accuracy.
Sony is even trying to develop a robot able to form emotional bonds with people.
These stories have dominated the news in recent months, but where is research really going?
The most recent tools go in the direction of detecting emotions from facial expressions. This is a direct consequence of the universality of facial expressions as studied by Paul Ekman.
The six universally recognised emotions are the following:
|EXPRESSION||MOTION CUES||PSEUDO-MUSCLES USED|
|Happiness||raising and lowering of mouth corners||6 linear muscles|
|Sadness||lowering of mouth corners; raise inner portion of brows||6 linear muscles|
|Surprise||eyes open wide to expose more white; jaw drops slightly||3 linear muscles|
|Fear||mouth opens slightly||5 linear muscles; 1 sphincter for the mouth|
|Disgust||upper lip is raised; nose bridge is wrinkled||6 linear muscles|
|Anger||lips pressed firmly||4 linear muscles; 1 sphincter for the mouth|
It seems to follow directly that if we can teach a machine to recognise facial expressions, it will automatically recognise the emotion conveyed through that expression. Easier said than done, of course, but more and more tools and experiments are already appearing.
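To make the idea concrete, here is a toy sketch of the lookup step: given a set of motion cues that some upstream detector has already found in a face, pick the expression from the table whose cues overlap best. The cue identifiers are hypothetical shorthand for the table's motion cues, the mapping of cues to Surprise, Fear and Anger is an assumption based on Ekman's six basic emotions, and `guess_expression` is an illustrative name, not part of any tool mentioned here. Real systems learn this mapping from data rather than hard-coding it.

```python
# Hypothetical cue identifiers, paraphrased from the table above.
# The Surprise, Fear and Anger rows are assumed labels (Ekman's six emotions).
EXPRESSION_CUES = {
    "Happiness": {"mouth_corners_raised"},
    "Sadness": {"mouth_corners_lowered", "inner_brows_raised"},
    "Surprise": {"eyes_wide_open", "jaw_dropped"},
    "Fear": {"mouth_slightly_open"},
    "Disgust": {"upper_lip_raised", "nose_bridge_wrinkled"},
    "Anger": {"lips_pressed"},
}


def guess_expression(observed_cues):
    """Return the best-matching expression, or None if no cue matches."""
    observed = set(observed_cues)

    def score(item):
        _, cues = item
        # Jaccard similarity: shared cues divided by all cues involved.
        return len(cues & observed) / len(cues | observed)

    label, cues = max(EXPRESSION_CUES.items(), key=score)
    # If even the best candidate shares no cue, report no match.
    return label if cues & observed else None


print(guess_expression(["upper_lip_raised", "nose_bridge_wrinkled"]))
print(guess_expression(["eyes_wide_open"]))
```

The hard part in practice is everything this sketch leaves out: reliably detecting the cues in an image in the first place, which is where the learning-based tools described next come in.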
Riganelli et al. (2017, July) try to recognise human emotions, processing images streamed in real-time from a mobile device using open source libraries and convolutional neural networks.
Other studies go in the direction of understanding emotions from speech: a good overview can be found in Deng et al. (2013, May), where they lay the basis for understanding speech using deep learning. However, this is only half of the story; the other half is to extract emotions from the recognised speech.
Han et al. (2014) is one of the most cited papers; its authors claim a 20% improvement over other methods. Once again, deep learning is not a magic wand, just a better tool that still needs improvement, a lot of work and the study of different methodologies to find the most effective ones.
Accuracy is still very low, but improving.
On the applications side, there are already companies “selling” intelligent machines able to recognise emotions. Besides the various emotional robots that now frequently appear on the main news channels, there are companies developing software, such as Affectiva, which claims it can recognise emotions through a simple webcam and thus allow for the development (they have an SDK) of applications for marketing, customer care and so on.
You can try various demos that they have on their website; a really fun one searches for Giphy gifs based on your facial expression. What I noticed is that it reacts to “major” changes in your facial expression: for example, I deliberately showed an angry expression and it came back with angry gifs.
Another, more sophisticated demo tool lets you measure your reaction while you watch certain advertisements.
Probably, the best strategy in the near future will be to mix facial expressions and speech recognition to obtain significant results.
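One simple way to mix the two signals is so-called late fusion: each recogniser produces its own per-emotion probabilities, and the two are combined with a weighted average. The sketch below is a minimal illustration under that assumption; the function `fuse`, the `face_weight` parameter and all the scores are made up for the example and do not come from any of the tools or papers mentioned here.

```python
EMOTIONS = ["happiness", "sadness", "surprise", "fear", "disgust", "anger"]


def fuse(face_probs, speech_probs, face_weight=0.5):
    """Weighted average of two per-emotion probability dictionaries."""
    fused = {
        emotion: face_weight * face_probs.get(emotion, 0.0)
        + (1.0 - face_weight) * speech_probs.get(emotion, 0.0)
        for emotion in EMOTIONS
    }
    total = sum(fused.values())
    if total == 0:
        raise ValueError("neither input assigns any probability")
    # Renormalise so the fused scores sum to 1.
    return {emotion: p / total for emotion, p in fused.items()}


# Made-up scores: the face model leans towards happiness,
# the speech model towards surprise.
face = {"happiness": 0.5, "surprise": 0.4, "fear": 0.1}
speech = {"happiness": 0.1, "surprise": 0.7, "fear": 0.2}

fused = fuse(face, speech)
best = max(fused, key=fused.get)
print(best, round(fused[best], 2))
```

In this toy case the face alone would suggest happiness, while the fused scores favour surprise because both recognisers give it substantial weight. Choosing `face_weight` is itself an open question and would normally be tuned on labelled data.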
There are of course many questions that remain more philosophical than technical: for example, once the machine recognises an emotion, will it be able to respond adequately?
Deng, L., Hinton, G., & Kingsbury, B. (2013, May). New types of deep neural network learning for speech recognition and related applications: An overview. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on (pp. 8599-8603). IEEE.
Han, K., Yu, D., & Tashev, I. (2014). Speech emotion recognition using deep neural network and extreme learning machine. In Fifteenth Annual Conference of the International Speech Communication Association.
Riganelli, M., Franzoni, V., Gervasi, O., & Tasso, S. (2017, July). EmEx, a Tool for Automated Emotive Face Recognition Using Convolutional Neural Networks. In International Conference on Computational Science and Its Applications (pp. 692-704). Springer, Cham.