Make sure you don't include any background scenery in those selfies:
Google neural network can tell where photos were taken better than humans
By Ryan Whitwam
on February 26, 2016 at 9:32 am
When you look at a photo, you might be able to use various visual cues to determine where in the world it was taken. Maybe it’s the architecture or the vegetation that tips you off. Whatever it is, this is something machines have always had trouble doing, until now. Google computer vision researcher Tobias Weyand and his team have created a neural network that is capable of looking at a photo and determining where in the world it was taken with higher accuracy than a human.
To train their neural network, the team started by dividing the world into 26,000 squares based on how many photos there are from an area. For example, there are more pictures taken in big cities, so the resolution of the grid is much higher in those areas. The team ignored areas like the oceans and polar regions, where very few photos are taken.
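The adaptive grid described above can be sketched as a recursive subdivision: keep splitting a square while it holds too many photos, and drop squares that hold too few. This is a minimal illustration, not the team's actual partitioning code; the `Cell` type, the `build_cells` function, and the `max_photos` and `min_photos` thresholds are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees


@dataclass
class Cell:
    """A latitude/longitude rectangle (hypothetical helper type)."""
    south: float
    west: float
    north: float
    east: float


def contains(cell: Cell, p: Point) -> bool:
    # Half-open bounds so a point on a shared edge lands in exactly one cell.
    lat, lon = p
    return cell.south <= lat < cell.north and cell.west <= lon < cell.east


def build_cells(points: List[Point], cell: Cell, max_photos: int,
                min_photos: int, max_depth: int = 20) -> List[Cell]:
    """Split `cell` into quadrants until each leaf holds at most `max_photos`.

    Cells with fewer than `min_photos` photos are dropped, which mimics
    ignoring oceans and polar regions. Thresholds are illustrative assumptions.
    """
    inside = [p for p in points if contains(cell, p)]
    if len(inside) < min_photos:
        return []          # too sparse: ignore this area
    if len(inside) <= max_photos or max_depth == 0:
        return [cell]      # small enough: keep as one grid square
    mid_lat = (cell.south + cell.north) / 2
    mid_lon = (cell.west + cell.east) / 2
    quadrants = [
        Cell(cell.south, cell.west, mid_lat, mid_lon),
        Cell(cell.south, mid_lon, mid_lat, cell.east),
        Cell(mid_lat, cell.west, cell.north, mid_lon),
        Cell(mid_lat, mid_lon, cell.north, cell.east),
    ]
    leaves: List[Cell] = []
    for quadrant in quadrants:
        leaves.extend(build_cells(inside, quadrant, max_photos,
                                  min_photos, max_depth - 1))
    return leaves
```

Dense regions end up covered by many small squares and sparse regions by a few large ones, so the number of squares tracks where photos are actually taken.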
The team collected a database of 126 million photos with location data from the web. They used 91 million of those images to teach the neural network (called PlaNet) how to identify the location of a photo on the grid, and the other 34 million to validate the model. To test their model, the team used 2.3 million photos gathered from Flickr that had geolocation data attached. PlaNet was able to identify the location of 3.6% of those images to the street level. It placed 10.1% in the correct city, 28.4% in the right country, and 48% on the correct continent.
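Scoring of this kind can be sketched by measuring the great-circle distance between each predicted and true location and counting how many fall within a radius for each level. The article does not give the radii, so the kilometer values below are assumptions for illustration only, and `haversine_km` and `accuracy_at_thresholds` are hypothetical helper names.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees

EARTH_RADIUS_KM = 6371.0

# Assumed radii for each level; the article does not state the exact values.
THRESHOLDS_KM: Dict[str, float] = {
    "street": 1.0,
    "city": 25.0,
    "country": 750.0,
    "continent": 2500.0,
}


def haversine_km(a: Point, b: Point) -> float:
    """Great-circle distance between two points, in kilometers."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def accuracy_at_thresholds(predicted: List[Point], actual: List[Point],
                           thresholds: Dict[str, float]) -> Dict[str, float]:
    """Fraction of predictions within each distance threshold of the truth."""
    errors = [haversine_km(p, t) for p, t in zip(predicted, actual)]
    return {
        level: sum(1 for e in errors if e <= radius) / len(errors)
        for level, radius in thresholds.items()
    }
```

Because each radius contains the smaller ones, the fractions are cumulative: a photo located to the street level also counts toward the city, country, and continent figures.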
The above results may not sound impressive at first, but it turns out humans are much worse at this than we would think. The Google team tested their neural network against 10 humans with travel experience, who would presumably have a pretty good handle on what different areas of the world look like. The test made use of an online game, which you can play right now if you like. It uses Google Street View data to display different areas of the globe, asking the participant to guess where each one is. PlaNet managed to win the majority of games with a median error of a little over 1,100 kilometers. By comparison, the humans had a median error of over 2,300 kilometers.
Weyand speculates that PlaNet is so much better than humans because it has seen more images of distant lands than any human would ever be able to see in a lifetime. The team even devised a clever way to find the location of photos that weren’t taken outside. The neural network can identify the location of other photos in an album, and assuming the indoor photo was taken in the same place, it assigns a location.
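The album idea can be illustrated with a simple voting rule: photos the model is confident about vote on a location, and photos with weak predictions (such as indoor shots) inherit the winner. This is a simplified stand-in for illustration, not PlaNet's internal mechanism; `fill_album_locations`, the string cell IDs, and the `min_confidence` cutoff are all hypothetical.

```python
from collections import Counter
from typing import List, Optional, Tuple

Prediction = Tuple[str, float]  # (grid cell id, model confidence from 0 to 1)


def fill_album_locations(predictions: List[Prediction],
                         min_confidence: float = 0.5) -> List[Optional[str]]:
    """Assign each photo in one album a grid cell.

    Confident predictions are kept. Low-confidence photos are assumed to
    have been taken in the same place as the rest of the album and receive
    the most common confident cell. Returns None for every photo if no
    prediction in the album is confident.
    """
    confident = [cell for cell, conf in predictions if conf >= min_confidence]
    if not confident:
        return [None] * len(predictions)
    album_cell = Counter(confident).most_common(1)[0][0]
    return [cell if conf >= min_confidence else album_cell
            for cell, conf in predictions]
```

For example, an album with two confident "paris" photos and one low-confidence indoor shot guessed as "tokyo" would have the indoor shot reassigned to "paris".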
What’s particularly interesting about this neural network is that it doesn’t need to carry around all that photo data with it to operate. The model itself is only 377MB, which you could fit on a smartphone. PlaNet is currently just a research project, but it’s not outside the realm of possibility that Google could one day integrate it into its products.