Adam Cheyer has a dream — and given that he sold his two previous companies to Steve Jobs and Samsung, it would be wise not to dismiss his ambition.
Cheyer is the co-creator of Siri, the pioneering intelligent personal assistant technology that was purchased by Apple Inc. in 2010. He is also the co-founder of Viv, which created a next-generation artificial intelligence-powered assistant system acquired by Samsung in 2016. Yet, despite his well-established record as a visionary in the AI field, Cheyer is not a happy man when it comes to the current state of his industry.
Cheyer’s dream is that AI-driven assistants will become as important to the world as the internet or mobile technology. In an appearance at the Re-Work Deep Learning Summit in San Francisco on Thursday, Cheyer expressed frustration that virtual assistants have not yet reached the same level of significance.
“There are billions of inquiries that flow through assistants every day, but it’s not important,” Cheyer complained. “We as a community need to get to that level, or at least try.”
Cheyer’s plea was symbolic of the conundrum faced by the AI and machine learning ecosystem today. It’s powerful technology with a growing set of intriguing use cases, but is it essential on a global basis? Not even close — at least not yet.
Over two days at the Deep Learning Summit, attendees were provided with evidence that, while progress is being made in the development of machine learning algorithms, the impact of virtual personal assistants and other facets of AI remains far more limited than it could be.
As Nikhil Mane, a conversation engineer at Autodesk Inc., told attendees during one conference session, “If you want to do this right, you have to be OK with failing.”
Assistants still not perfect
Virtual assistants have achieved impressive adoption rates, with more than half of all smartphones expected to include the technology in 2019. But they remain hampered by clunky interfaces and continued difficulty fully grasping conversational speech.
Cheyer pointed out that assistant technology still has a hard time handling third-party integrations. He cited an example in which users can verbally check whether tickets to a play are available, but then have to exit the assistant and use a different app to buy them.
“Essentially third parties are second-class citizens,” Cheyer said. “It has to feel like one experience.”
While a number of researchers presented plenty of evidence at the conference that voice recognition technology has advanced, there is still a gap between what machines hear and what they actually understand.
Cathy Pearl, head of conversation design outreach at Google LLC, has spent a great deal of time testing assistants and their level of understanding. She cited one example in which a colleague left a speech-to-text program running and began to play the trumpet. Rather than interpreting the new sound as music, the program transcribed what it heard literally as “woo, woo, woo.”
In another example, Pearl asked an assistant what time her local library was open tomorrow. Instead of the hours, the Google researcher received a listing of the “top-rated” libraries in her area, evidence of the bias in many virtual assistant programs toward mining data from popular websites.
“These computers don’t really have a lot of common sense,” Pearl said. “It’s human nature. We like to feel understood.”
Progress in adversarial networks
Despite frustration among industry leaders such as Cheyer and Pearl, there are also instances where AI and machine learning are clearly having real impact.
Ian Goodfellow, senior staff research scientist for Google Brain, has emerged as something of a celebrity in the AI community for his groundbreaking work on generative adversarial networks, or GANs. His concept of pitting two neural networks against each other — one generating candidate data, the other judging it — has attracted a great deal of attention in the machine learning field, and the “GANfather” offered further evidence of progress at the conference.
Goodfellow showed video clips in which GANs interpreted the dance moves of one person and mapped them convincingly onto video of another individual. The demonstration also raised the prospect that the technology could give whole new meaning to “fake news.”
A team of researchers recently employed GANs to automatically generate designs of dental crowns that can be 3-D-printed and applied to patients within hours instead of weeks. The concept is currently being tested with a small group of doctors in Southern California.
“We can actually use generative models to produce useful objects in the real world,” Goodfellow told the gathering.
Advances in visual recognition are also opening new doors for the visually impaired. Seeing AI, an AI-based smartphone camera app, narrates the world around its users, letting visually impaired people perceive their surroundings in ways they never could before.
The app can now identify and read aloud currency denominations, handwriting, storefront names and numbers; describe the people in a room and their distance from the user; and even tell whether an aluminum can is a Coke or a Pepsi. The app was trained on millions of photos submitted by volunteers.
“Our networks have started to get a lot more robust,” said Anirudh Koul, founder of the Seeing AI app and head of artificial intelligence and research at Aira Corp.
Beta projects show promise
Researchers are also working on new intelligent technology in beta test mode that could become significant over time. Google is moving to place machine learning capability in the hands of advertisers through a new tool in development called Responsive Search Ads.
Advertisers can create up to 15 different headlines and as many as four descriptions for one promotion and then submit the whole package to Google. The search giant then tests different combinations using its machine learning algorithms to determine the version most likely to get the best click-through results.
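Google has not published the algorithm behind Responsive Search Ads, but this kind of combination testing is commonly framed as a multi-armed bandit problem. The sketch below — with made-up headlines, descriptions and click-through rates — uses a simple epsilon-greedy strategy to show the core idea: every headline-description pairing is an “arm,” and traffic gradually concentrates on whichever pairing earns the most clicks.

```python
import itertools
import random

random.seed(7)

# Hypothetical ad copy supplied by an advertiser.
headlines = ["Fast delivery", "Free returns", "24/7 support"]
descriptions = ["Shop the sale today.", "Trusted by millions."]

# Every headline x description pairing is one candidate ad ("arm").
arms = list(itertools.product(headlines, descriptions))

# Hidden click-through rates, used only to simulate user behavior;
# the first pairing is deliberately much better than the rest.
true_ctr = [0.15] + [0.02] * (len(arms) - 1)

clicks = [0] * len(arms)
shows = [0] * len(arms)

def estimate(i):
    """Observed click-through rate for arm i (0 if never shown)."""
    return clicks[i] / shows[i] if shows[i] else 0.0

def pick(eps=0.1):
    """Epsilon-greedy: usually serve the best-looking ad, sometimes explore."""
    if random.random() < eps:
        return random.randrange(len(arms))
    return max(range(len(arms)), key=estimate)

for _ in range(20000):
    i = pick()
    shows[i] += 1
    if random.random() < true_ctr[i]:  # simulated user click
        clicks[i] += 1

best = max(range(len(arms)), key=estimate)
print("winning ad:", arms[best], f"({estimate(best):.1%} observed CTR)")
```

With 15 headlines and four descriptions the space grows to 43,680 ordered combinations for a three-headline ad, which is why handing the search to a learning system beats testing variants by hand.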
Another Google researcher, Zornitsa Kozareva, is spearheading a project to place machine learning models directly on devices. Kozareva, who previously worked on the development of Amazon.com Inc.’s Alexa, has been pioneering the use of on-device neural networks and Google has released a machine learning kit called Learn2Compress to facilitate the mobile model.
One potential application is that a home microwave would come equipped with natural language processing and enough intelligence to immediately know what to do when a user commands it to heat up the spaghetti. “We are trying to push the boundaries,” said Kozareva.
The promise of AI is that, with enough data, we can raise the level of perfection in an imperfect world. The question is how much of a difference data and machine learning will ultimately make.
“When you gather data it is not a complete reflection of the world,” Joshua Kroll, a postdoctoral scholar at the UC Berkeley School of Information, cautioned in a presentation on Friday.

Adam Cheyer still believes that assistants will become the next globally transformative technology, but his dream is still a long way from being realized.