Here is an interesting technology for ground truthing the real world. Google is able to index the real world through computer vision and optical character recognition (OCR) technologies. In the process of building maps, Google’s “Ground Truth” technology extracts street names and addresses from pictures.
I wrote about this technology some time ago in my article titled Image2Geographic-Coordinate, location referencing system. I now realize that it is one of the awesome technologies behind the production of Google Maps.
The article below, courtesy of Alexis C. Madrigal, sheds more light on how Google is leveraging this technology to build great maps.
“Behind every Google Map, there is a much more complex map that’s the key to your queries but hidden from your view. The deep map contains the logic of places: their no-left-turns and freeway on-ramps, speed limits and traffic conditions. This is the data that you’re drawing from when you ask Google to navigate you from point A to point B — and last week, Google showed me the internal map and demonstrated how it was built. It’s the first time the company has let anyone watch how the project it calls GT, or “Ground Truth,” actually works.
Google opened up at a key moment in its evolution. The company began as an online search company that made money almost exclusively from selling ads based on what you were querying for. But then the mobile world exploded. Where you’re searching from has become almost as important as what you’re searching for. Google responded by creating an operating system, brand, and ecosystem in Android that has become the only significant rival to Apple’s iOS.
And for good reason. If Google’s mission is to organize all the world’s information, the most important challenge — far larger than indexing the web — is to take the world’s physical information and make it accessible and useful.
“If you look at the offline world, the real world in which we live, that information is not entirely online,” Manik Gupta, the senior product manager for Google Maps, told me. “Increasingly as we go about our lives, we are trying to bridge that gap between what we see in the real world and [the online world], and Maps really plays that part.”
This is not just a theoretical concern. Mapping systems matter on phones precisely because they are the interface between the offline and online worlds. If you’re at all like me, you use mapping more than any other application except for the communications suite (phone, email, social networks, and text messaging).
Google is locked in a battle with the world’s largest company, Apple, about who will control the future of mobile phones. Whereas Apple’s strengths are in product design, supply chain management, and retail marketing, Google’s most obvious realm of competitive advantage is in information. Geo data — and the apps built to use it — are where Google can win just by being Google. That didn’t matter on previous generations of iPhones because they used Google Maps, but now Apple’s created its own service. How the two operating systems incorporate geo data and present it to users could become a key battleground in the phone wars.
But that would entail actually building a better map.
***
The office where Google has been building the best representation of the world is not a remarkable place. It has all the free food, ping pong, and Google Maps-inspired Christoph Niemann cartoons that you’d expect, but it’s still a low-slung office building just off the 101 in Mountain View in the burbs.
I was slated to meet with Gupta and the engineering ringleader on his team, former NASA engineer Michael Weiss-Malik, who’d spent his 20 percent time working on Google Mars, and Nick Volmar, an “operator” who actually massages map data.
“So you want to make a map,” Weiss-Malik tells me as we sit down in front of a massive monitor. “There are a couple of steps. You acquire data through partners. You do a bunch of engineering on that data to get it into the right format and conflate it with other sources of data, and then you do a bunch of operations, which is what this tool is about, to hand massage the data. And out the other end pops something that is higher quality than the sum of its parts.”
This is what they started out with, the TIGER data from the US Census Bureau (though the base layer could and does come from a variety of sources in different countries).
On first inspection, this data looks great. The roads look like they are all there and you’ve got the freeways differentiated. This is a good map to the untrained eye. But let’s look closer. There are issues where the digital data does not match the physical world. I’ve circled a few obvious ones below.
And that’s just from comparing the map to the satellite imagery. But there are also a variety of other tools at Google’s disposal. One is bringing in data from other sources, say the US Geological Survey. But Google’s Ground Truthers can also bring another exclusive asset to bear on the maps problem: the Street View cars’ tracks and imagery. In keeping with Google’s more-data-is-better-data mantra, the maps team, largely driven by Street View, is publishing more imagery data every two weeks than Google possessed total in 2006.*
Let’s step back a tiny bit to recall with wonderment the idea that a single company decided to drive cars with custom cameras over every road they could access. Google is up to five million miles driven now. Each drive generates two kinds of really useful data for mapping. One is the actual tracks the cars have taken; these are proof-positive that certain routes can be taken. The other are all the photos. And what’s significant about the photographs in Street View is that Google can run algorithms that extract the traffic signs and can even paste them onto the deep map within their Atlas tool. So, for a particularly complicated intersection like this one in downtown San Francisco, that could look like this:
Google Street View wasn’t built to create maps like this, but the geo team quickly realized that computer vision could get them incredible data for ground truthing their maps. Not to detour too much, but what you see above is just the beginning of how Google is going to use Street View imagery. Think of them as the early web crawlers (remember those?) going out in the world, looking for the words on pages. That’s what Street View is doing. One of its first uses is finding street signs (and addresses) so that Google’s maps can better understand the logic of human transportation systems. But as computer vision and OCR improve, any word that is visible from a road will become a part of Google’s index of the physical world.
Later in the day, Google Maps VP Brian McClendon put it like this: “We can actually organize the world’s physical written information if we can OCR it and place it,” McClendon said. “We use that to create our maps right now by extracting street names and addresses, but there is a lot more there.”
More like what? “We already have what we call ‘view codes’ for 6 million businesses and 20 million addresses, where we know exactly what we’re looking at,” McClendon continued. “We’re able to use logo matching and find out where are the Kentucky Fried Chicken signs … We’re able to identify and make a semantic understanding of all the pixels we’ve acquired. That’s fundamental to what we do.”
For now, though, computer vision transforming Street View images directly into geo-understanding remains in the future. The best way to figure out if you can make a left turn at a particular intersection is still to have a person look at a sign — whether that’s a human driving or a human looking at an image generated by a Street View car.
There is an analogy to be made to one of Google’s other impressive projects: Google Translate. What looks like machine intelligence is actually only a recombination of human intelligence. Translate relies on massive bodies of text that have been translated into different languages by humans; it then is able to extract words and phrases that match up. The algorithms are not actually that complex, but they work because of the massive amounts of data (i.e. human intelligence) that go into the task on the front end.
Google Maps has executed a similar operation. Humans are coding every bit of the logic of the road onto a representation of the world so that computers can simply duplicate (infinitely, instantly) the judgments that a person already made.
This reality is incarnated in Nick Volmar, the operator who has been showing off Atlas while Weiss-Malik and Gupta explain it. He probably uses twenty-five keyboard shortcuts switching between types of data on the map and he shows the kind of twitchy speed that I associate with long-time designers working with Adobe products or professional Starcraft players. Volmar has clearly spent thousands of hours working with this data. Weiss-Malik told me that it takes hundreds of operators to map a country. (Rumor has it many of these people work in the Bangalore office, out of which Gupta was promoted.)
The sheer amount of human effort that goes into Google’s maps is just mind-boggling. Every road that you see slightly askew in the top image has been hand-massaged by a human. The most telling moment for me came when we looked at a couple of the several thousand user reports of problems with Google Maps that come in every day. The Geo team tries to address the majority of fixable problems within minutes. One complaint reported that Google did not show a new roundabout that had been built in a rural part of the country. The satellite imagery did not show the change, but a Street View car had recently driven down the street and its tracks showed the new road perfectly.
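[Editor's illustration, not part of the quoted article.] The idea that a car's recorded track can reveal a road the map is missing can be sketched in a few lines of Python. This is only a rough sketch under stated assumptions, not Google's method: the function names, the 15-meter tolerance, and the flat-earth projection are all hypothetical choices made for the example. It flags every track point that lies farther than the tolerance from any mapped road segment.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def to_local_xy(lat, lon, ref_lat, ref_lon):
    """Project lat/lon to approximate meters around a reference point.

    Uses a simple equirectangular approximation, which is only reasonable
    over short distances (an assumption of this sketch).
    """
    x = math.radians(lon - ref_lon) * EARTH_RADIUS_M * math.cos(math.radians(ref_lat))
    y = math.radians(lat - ref_lat) * EARTH_RADIUS_M
    return x, y


def point_segment_distance(p, a, b):
    """Distance in meters from point p to the line segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Clamp the projection of p onto the segment to the range [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def unmatched_track_points(track, roads, tolerance_m=15.0):
    """Return the track points that are not near any mapped road.

    track: list of (lat, lon) points recorded by a vehicle.
    roads: list of polylines, each a list of (lat, lon) vertices.
    """
    ref_lat, ref_lon = track[0]
    roads_xy = [
        [to_local_xy(lat, lon, ref_lat, ref_lon) for lat, lon in road]
        for road in roads
    ]
    unmatched = []
    for lat, lon in track:
        p = to_local_xy(lat, lon, ref_lat, ref_lon)
        nearest = min(
            (point_segment_distance(p, a, b)
             for road in roads_xy
             for a, b in zip(road, road[1:])),
            default=math.inf,
        )
        if nearest > tolerance_m:
            unmatched.append((lat, lon))
    return unmatched


# Made-up example: one mapped east-west road and a track that leaves it.
mapped_roads = [[(37.0, -122.0), (37.0, -121.99)]]
drive_track = [(37.0, -121.999), (37.0, -121.995), (37.001, -121.995)]
suspect_points = unmatched_track_points(drive_track, mapped_roads)
# suspect_points == [(37.001, -121.995)], roughly 111 m north of the mapped road
```

A run of consecutive unmatched points would be a hint that a vehicle drove on something the map does not contain. A person would still have to decide what to draw, which is the article's point.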
Volmar began to fix the map, quickly drawing the new road and connecting it to the existing infrastructure. In his haste (and perhaps with the added pressure of three people watching his every move), he did not draw a perfect circle of points. Weiss-Malik and I detoured into another conversation for a couple of minutes. By the time I looked back at the screen, Volmar had redrawn the circle with perfect precision and upgraded a few other things while he was at it. The actions were impressively automatic. This is an operation that promotes perfectionism.
And that’s how you get your maps to look like this:
Some details are worth pointing out. In the top at the center, trails have been mapped out and coded as places for walking. All the parking lots have been mapped out. All the little roads, say, to the left of the small dirt patch on the right, have also been coded. Several of the actual buildings have been outlined. Down at the bottom left, a road has been marked as a no-go. At each and every intersection, there are arrows that delineate precisely where cars can and cannot turn.
Now imagine doing this for every tile on Google’s map in the United States and 30 other countries over the last four years. Every roundabout perfectly circular, every intersection with the correct logic. Every new development. Every one-way street. This is a task of a nearly unimaginable scale. This is not something you can put together with a few dozen smart engineers.
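[Editor's illustration, not part of the quoted article.] To see why hand-coded turn arrows matter for navigation, here is a minimal Python sketch of route finding that respects turn restrictions. It is an assumption-laden toy, not a description of Google's routing: the function `shortest_route`, the `(from, via, to)` format for banned turns, and the four-node example network are all invented for illustration. It runs Dijkstra's algorithm over (previous node, current node) states so that a banned turn can be rejected at the moment it would be made.

```python
import heapq
import itertools


def shortest_route(edges, banned_turns, start, goal):
    """Find the cheapest route that makes no banned turn.

    edges: dict mapping a directed road segment (u, v) to its length.
    banned_turns: set of (u, v, w) triples, meaning a vehicle arriving
        at v from u may not continue to w.
    Returns (total_length, [nodes...]) or None if no legal route exists.
    """
    graph = {}
    for (u, v), length in edges.items():
        graph.setdefault(u, []).append((v, length))

    tie_breaker = itertools.count()  # keeps heap entries comparable on equal cost
    queue = [(0.0, next(tie_breaker), start, None, [start])]
    settled = set()  # (previous node, current node) states already expanded

    while queue:
        cost, _, node, prev, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if (prev, node) in settled:
            continue
        settled.add((prev, node))
        for nxt, length in graph.get(node, []):
            if (prev, node, nxt) in banned_turns:
                continue  # this is the "no left turn" sign doing its job
            heapq.heappush(
                queue, (cost + length, next(tie_breaker), nxt, node, path + [nxt])
            )
    return None


# Made-up network: the direct turn at B toward C is forbidden when coming from A.
roads = {("A", "B"): 1.0, ("B", "C"): 1.0, ("B", "D"): 1.0, ("D", "C"): 1.0}

unrestricted = shortest_route(roads, set(), "A", "C")
# unrestricted == (2.0, ['A', 'B', 'C'])

restricted = shortest_route(roads, {("A", "B", "C")}, "A", "C")
# restricted == (3.0, ['A', 'B', 'D', 'C'])
```

The geometry is identical in both calls. Only the encoded restriction changes the answer, which is why a map with the right lines but the wrong intersection logic would still send drivers the wrong way.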
I came away convinced that the geographic data Google has assembled is not likely to be matched by any other company. The secret to this success isn’t, as you might expect, Google’s facility with data, but rather its willingness to commit humans to combining and cleaning data about the physical world. Google’s map offerings build in the human intelligence on the front end, and that’s what allows its computers to tell you the best route from San Francisco to Boston.
***
It’s probably better not to think of Google Maps as a thing like a paper map. Geographic information systems represent a jump from paper maps like the abacus to the computer. “I honestly think we’re seeing a more profound change, for map-making, than the switch from manuscript to print in the Renaissance,” University of London cartographic historian Jerry Brotton told the Sydney Morning Herald. “That was huge. But this is bigger.”
The maps we used to keep folded in our glove compartments were a collection of lines and shapes that we overlaid with human intelligence. Now, as we’ve seen, a map is a collection of lines and shapes with Nick Volmar’s (and hundreds of others’) intelligence encoded within.
It’s common when we discuss the future of maps to reference the Borgesian dream of a 1:1 map of the entire world. It seems like a ridiculous notion that we would need a complete representation of the world when we already have the world itself. But to take scholar Nathan Jurgenson’s conception of augmented reality seriously, we would have to believe that every physical space is, in his words, “interpenetrated” with information. All physical spaces already are also informational spaces. We humans all hold a Borgesian map in our heads of the places we know and we use it to navigate and compute physical space. Google’s strategy is to bring all our mental maps together and process them into accessible, useful forms.
Their MapMaker product makes that ambition clear. Project managed by Gupta during his time in India, it’s the “bottom up” version of Ground Truth. It’s a publicly accessible way to edit Google Maps by adding landmarks and data about your piece of the world. It’s a way of sucking data out of human brains and onto the Internet. And it’s a lot like Google’s open competitor, Open Street Map, which has proven that it, too, can harness the crowd’s intelligence.
As we slip and slide into a world where our augmented reality is increasingly visible to us off and online, Google’s geographic data may become its most valuable asset. Not solely because of this data alone, but because location data makes everything else Google does and knows more valuable.
Or as my friend and sci-fi novelist Robin Sloan put it to me, “I maintain that this is Google’s core asset. In 50 years, Google will be the self-driving car company (powered by this deep map of the world) and, oh, P.S. they still have a search engine somewhere.”
Of course, they will always need one more piece of geographic information to make all this effort worthwhile: You. Where you are, that is. Your location is the current that makes Google’s giant geodata machine run. They’ve built this whole playground as an elaborate lure for you. As good and smart and useful as it is, good luck resisting taking the bait.”