Artificial intelligence is making miracles happen. A group of researchers at the Massachusetts Institute of Technology (MIT) recently developed an AI that predicts social interactions: their algorithm can predict whether a person is about to go for a hug, handshake, high-five, or kiss.
Like every AI, this algorithm needs data to train on. The researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) used television shows like “Desperate Housewives” and “The Office” to train the algorithm. In addition to the shows, they also used more than 600 hours of YouTube videos for training.
To get a picture of how this AI works, think about how a baby learns to walk. She first watches people and the patterns of their movements. Then she tries to walk. She may fall. Once again, she watches people walking, learns from her mistake, and corrects what went wrong. She tries again, lowering her chances of falling each time. She repeats the process until she can walk while keeping her balance. The AI works in a similar manner.
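The trial-and-error loop in the analogy above is essentially how machine learning training works: make an attempt, measure the error, correct in proportion to the mistake, and repeat. Here is a minimal toy sketch of that cycle in Python; the target value and learning rate are purely illustrative and have nothing to do with the CSAIL model.

```python
# A toy "learn by trial and error" loop, mirroring the baby-walking analogy:
# guess, measure the error, adjust, and repeat until the error shrinks.
# The target and learning rate are illustrative, not from the paper.

target = 10.0        # the "correct" behavior we want to learn
guess = 0.0          # start with no knowledge
learning_rate = 0.1

for step in range(100):
    error = target - guess          # how wrong was the attempt?
    guess += learning_rate * error  # correct in proportion to the mistake

print(round(guess, 2))  # → 10.0 (the error shrinks toward zero each pass)
```

Each pass reduces the remaining error by a fixed fraction, so the guess converges on the target, much as the baby's balance improves with each attempt.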
The algorithm “looks” at the videos to study the patterns people exhibit just before greeting each other. It learns those patterns using deep-learning techniques and predicts which of the four actions people will make when they greet each other.
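At its core, this kind of prediction is a four-way classification problem: given features extracted from the frames leading up to a greeting, pick the most likely of the four actions. The sketch below is not the CSAIL model; all weights and feature values are made up, and a real system would learn them from video with a deep network rather than hard-code them.

```python
# A minimal sketch of a four-way action classifier, NOT the CSAIL model:
# it scores a hypothetical feature vector against per-greeting weight
# vectors and picks the highest-scoring greeting. All numbers are made up.

GREETINGS = ["hug", "handshake", "high-five", "kiss"]

# Hypothetical "learned" weights: one weight vector per greeting.
weights = {
    "hug":       [0.9, 0.1, 0.2],
    "handshake": [0.2, 0.8, 0.1],
    "high-five": [0.1, 0.3, 0.9],
    "kiss":      [0.7, 0.2, 0.4],
}

def predict(features):
    """Return the greeting whose weight vector best matches the features."""
    scores = {
        g: sum(w * f for w, f in zip(weights[g], features))
        for g in GREETINGS
    }
    return max(scores, key=scores.get)

# A made-up feature vector, standing in for what a deep network would
# extract from the frames just before a greeting.
print(predict([0.2, 0.9, 0.3]))  # → handshake
```

In the actual research, the features are not hand-picked numbers but representations learned automatically from hundreds of hours of unlabeled video.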
CSAIL PhD student Carl Vondrick is the first author of the paper detailing the research, which will be presented at the Conference on Computer Vision and Pattern Recognition (CVPR). Vondrick reports an accuracy of 43 percent after the rigorous 600-hour training, meaning the algorithm correctly predicts one of the four greetings 43 percent of the time. That is an improvement over the roughly 36 percent accuracy of existing state-of-the-art algorithms.
The increase in accuracy is attributed to the algorithm’s ability to train itself on unlabeled video and to predict “visual representations,” which, simply put, are complete scenes rather than individual details.
“Humans automatically learn to anticipate actions through experience, which is what made us interested in trying to imbue computers with the same sort of common sense,” said Vondrick in a statement. “We wanted to show that just by watching large amounts of video, computers can gain enough knowledge to consistently make predictions about their surroundings.”
This sort of AI can have many practical applications. It could be used to build friendly robots that greet people in the proper manner. It could also train on CCTV footage of roads and highways to report accidents or other emergencies. Another application could be in interrogation rooms, where the algorithm might predict whether a person is telling the truth.
However, an accuracy of 43 percent is not yet high enough for real-world deployment. While the researchers have made a major improvement over previous algorithms, there is still a long way to go. But we are getting there. The basic principles and foundations have been laid; the researchers need to make their algorithms more accurate so that we see AI more in practice than in papers. Nevertheless, being able to tell a hug from a handshake is one step forward.
Mazhar Naqvi is a CS grad student with research interests in computer networks and security. He can be reached at email@example.com, and you can follow him on LinkedIn at https://www.linkedin.com/in/mazharnaqvi