A New Way for Machines to See, Taking Shape in Toronto
By CADE METZ
New York Times
NOV. 28, 2017
At a Google laboratory in Toronto, Geoffrey Hinton and Sara Sabour, holding a two-piece pyramid puzzle, are researching a system that could let computers see more like humans. Credit Christopher Wahl for The New York Times
TORONTO — In 2012, Geoffrey Hinton changed the way machines see the world.
Along with two graduate students at the University of Toronto, Mr. Hinton, a professor there, built a system that could analyze thousands of photos and teach itself to identify common objects like flowers and cars with an accuracy that didn’t seem possible.
He and his students soon moved to Google, and the mathematical technique that drove their system — called a neural network — spread across the tech world. This is how autonomous cars recognize things like street signs and pedestrians.
But as Mr. Hinton himself points out, his idea has had its limits. If a neural network is trained on images that show a coffee cup only from the side, for example, it is unlikely to recognize a coffee cup turned upside down.
Now Mr. Hinton and Sara Sabour, a young Google researcher, are exploring an alternative mathematical technique that he calls a capsule network. The idea is to build a system that sees more like a human. If a neural network sees the world in two dimensions, a capsule network can see it in three.
Mr. Hinton, a 69-year-old British expatriate, opened Google’s artificial intelligence lab in Toronto this year. The new lab is emblematic of what some believe to be the future of cutting-edge tech research: Much of it is expected to happen outside the United States, in Europe, China and longtime A.I. research centers, like Toronto, that are more welcoming to immigrant researchers.
Ms. Sabour is an Iranian researcher who wound up in Toronto after the United States government denied her a visa to study computer vision at the University of Washington.
Her task is to turn Mr. Hinton’s conceptual idea into a mathematical reality, and the project is bearing fruit. They recently published a paper showing that in certain situations their method can more accurately recognize objects when viewing them from unfamiliar angles.
“It can generalize much better than the traditional neural nets everyone is now using,” Ms. Sabour said.
When I walked into his office this month, Mr. Hinton, dressed in his usual button-down shirt and sweater, handed me two large white blocks. They looked like something he had found at the bottom of an old toy chest.
He explained that the blocks were two halves of a pyramid, and he asked if I could put the pyramid back together. That didn’t seem too hard. The blocks were oddly shaped, but each had only five sides. All I had to do was find the two sides that matched and line them up. But I couldn’t.
Most people fail this test, he told me, including two tenured professors at the Massachusetts Institute of Technology. One declined to try, and the other insisted it wasn’t possible. It is possible. But we all failed, Mr. Hinton explained, because the puzzle undercuts the natural way we see something like a pyramid.
We do not recognize an object by looking at one side and then another and then another. We picture the whole thing sitting in three-dimensional space. And because of the way the puzzle cuts the pyramid in two, it prevents us from picturing it in 3-D space as we normally would.
With his capsule networks, Mr. Hinton aims to finally give machines the same three-dimensional perspective that humans have — allowing them to recognize a coffee cup from any angle after learning what it looks like from only one. This is not something that neural networks can do.
“It is a fact that is ignored by researchers in computer vision,” he said. “And that is a huge mistake.”
Loosely modeled on the web of neurons in the human brain, neural networks are algorithms that can learn discrete tasks by identifying patterns in large amounts of data. By analyzing thousands of car photos, for instance, a neural network can learn to recognize a car.
This mathematical idea dates back to the 1950s, but the concept has found real-world applications in recent years, thanks to improvements in processing power and the large amounts of data generated by the internet. Over the last five years, neural networks have accelerated the progress of everything from smartphone digital assistants to language translation services to autonomous robots.
But these methods are still a long way from delivering machines with true intelligence — and new research is needed to deliver the kinds of autonomous machines that so many of the top tech companies are now promising, including conversational computers and driverless cars.
Mr. Hinton, who is a kind of godfather figure for the A.I. community, is part of a small but increasingly vocal group of specialists who are working to push the industry into these alternative areas of research.
Oren Etzioni, chief executive of the Allen Institute for Artificial Intelligence, based in Seattle, lamented what he called the industry’s myopia. Its current focus on neural networks, he said, will hurt the progress of A.I. in the long run.
Eric Horvitz, who oversees much of the A.I. work at Microsoft, argued that neural networks and related techniques were small advances compared with technologies that would arrive in the years to come.
“Right now, what we are doing is not a science but a kind of alchemy,” he said.
Mr. Hinton acknowledges that his project in Toronto has so far shown only preliminary results. And others, like Mr. Etzioni and Mr. Horvitz, believe that very different techniques will be needed to achieve truly intelligent machines. Mr. Etzioni said that although machine learning methods would remain at the center of A.I. work, they must be augmented with other techniques. They are fundamentally limited because they learn from data. The right data isn’t always available.
But Mr. Hinton believes his capsule networks can eventually expand to a wider array of situations, accelerating the progress of computer vision and things like conversational computing. Capsule networks are an attempt to mimic the brain’s network of neurons in a more complex and structured way, and he explained that this added structure could help other forms of artificial intelligence as well.
He certainly understands that many will be skeptical of his technique. But Mr. Hinton also pointed out that five years ago, many were skeptical of neural networks.
“History is going to repeat itself,” he said. “I think.”