In guessing what Google and the marketplace have in store for us, I’m taking into account both what’s technologically feasible, now and on the horizon, and what I think people will demand.
I blogged a while back about virtualization and privacy issues for both Street View and Maps/Earth, which, for starters, points to a [predicted] future version of Street View that erases people and even cars from the imagery you see. So let’s start there.
The main reason for removing people and cars is the same reason you’d want the base Google Earth imagery to lack clouds:
- These things tend to block your view. You can’t really look behind them in a simple (essentially 2D) panoramic image.
- They only represent one snapshot in time vs. a broader/more virtualized essence of the place.
- They make it confusing to add dynamic versions of the same things on top of what’s permanently baked into the imagery. You can already see this with 3D buildings on top of 2D ones. Similarly, adding dynamic lighting and shadows is hard when there are already shadows in the imagery.
- And in the case of people, we tend to not like being included in commercial imagery without our permission — perhaps cars and clouds feel the same way, but they don’t represent any known marketing demographic, nor do they sue.
Now, that doesn’t mean there shouldn’t ever be people or cars or clouds in Google’s version(s) of the Earth. It just means those should only be included for a much better purpose than "they just happened to get caught in the camera’s lens" and they should be first removed, and then re-added as needed.
For example, dynamic weather overlays make a lot of sense, once you remove the baked-in clouds. Moving 3D avatars also make sense in an opt-in application, e.g., in some 3D augmented-reality social network. And why not represent the real-time traffic layer with some corresponding density of cars?
All of those issues are solvable, if you first remove the items from the imagery and then add them back. The simplest method of removal is oversampling. Imagine taking multiple photos of the same place over a number of days. Most of the time, pixel for pixel, transient objects will show up in only one of the images, so you can basically vote on each pixel — majority wins. But this requires multiple passes over each city, when Google is presently racing to get coverage everywhere, and there are alignment issues, so give it some time (and perhaps some public pressure).
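To make the voting idea concrete, here is a minimal sketch in Python. It assumes the photos are already perfectly aligned, the same size, and reduced to simple grayscale values that match exactly when nothing is in the way; real imagery would need alignment and some tolerance for lighting changes. The function name `vote_pixels` and the tiny sample "images" are made up for illustration and say nothing about how Google actually does it.

```python
from collections import Counter


def vote_pixels(images):
    """Per-pixel majority vote across aligned images of equal size.

    Each image is a list of rows; each row is a list of pixel values.
    Assumption: pixels of the permanent scene match exactly between passes.
    """
    if not images:
        raise ValueError("need at least one image")
    height, width = len(images[0]), len(images[0][0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Count how often each value appears at this pixel position.
            votes = Counter(img[y][x] for img in images)
            # The most common value wins; transient objects get outvoted.
            row.append(votes.most_common(1)[0][0])
        result.append(row)
    return result


# Three hypothetical passes over the same 2x3 patch of street.
pass_a = [[10, 99, 10], [20, 20, 20]]  # a car covers pixel (0, 1)
pass_b = [[10, 10, 10], [20, 20, 77]]  # a pedestrian covers pixel (1, 2)
pass_c = [[10, 10, 10], [20, 20, 20]]  # nothing in the way

clean = vote_pixels([pass_a, pass_b, pass_c])
```

With three passes, each transient object appears in only one image, so it loses the vote 2 to 1 and `clean` contains only the empty street. A per-pixel median would be a reasonable swap for the strict vote if the values vary slightly between passes.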
One obvious thing that Google can do right away is better integrate Street View with its generally-strong (though not always right) driving directions — now with impressive "route-dragging." It turns out, there’s a very compelling reason for integrating Street View with driving directions: most people can’t read maps.
Basically, there are two kinds of people in the world: those who maintain re-orientable 2D/3D mental maps of the world around them, and those who navigate mainly by visual cues and landmarks, i.e., what they can see from their direct 1st-person perspective.
The people who can re-orient their mental maps have what we call high spatial cognition. The vast majority of the human race, however, has just enough spatial cognition to reach for a bag of potato chips and avoid walking into walls. That doesn’t mean those people are dumb — until recently, high spatial cognition was generally only useful for throwing spears. There are many other kinds of intelligence. And those with high spatial intelligence are simply more cut out for jobs in 3D graphics, architecture, industrial design, and perhaps billiards. People who can’t tell the difference between panoramic 2D images and actual 3D graphics might be really good at business or marketing, for example, but still get lost in a cul-de-sac.
The point is that driving directions, for the vast majority of the world, would do much better with a series of 1st-person images or video that shows you where to get off the highway with an actual image of the exit sign and off-ramp, saying, "turn right here" in the same way that one might say to a driver, "turn where that blue car just went." That, and as I discovered this weekend, the Home Depot on Route 17 is not quite where Google Maps says it is. Having images of my route would be a nice "sanity check."
Frankly, I’d expect this feature way before true virtualization as described above. But the closest they come to it right now is that you can use Street View when interactively examining your driving directions — drag the little yellow AOL-like guy around to see just one Street View image at a time. Perhaps they’ll soon offer to replace or combine the series of overhead street maps (which they include when you print) with 1st person views of each stage of navigation along the route.
What might come after that? The main thing to do, I’m guessing, is integrate the vast quantities of Street View data with Google Earth in a more 3D fashion, solving these 2D panoramas for depth and turning everything into textures and polygons.
The ultimate goal would be a view of the Earth not only with basic buildings, but with well detailed street-level views so you could zoom down and even [virtually] walk around. Go one step further, and Google could take their $10 per store photography project and recreate virtual interiors of semi-public places, perhaps with links or embedded shopping when you go inside.
That’s when I think we might start to see avatars in Google Earth — when there’s some actual reason for people to have a sense of their own body and of other people in the virtual world, not that a Second Life / GE combination is likely, for reasons I’ve already outlined.
What else might there be? Well, integrating live street cameras would be a neat trick. All it really takes is knowing the position and orientation of the camera and coming up with a Flash video player that can project 2D movies onto an arbitrary 3D polygon, rather than always in the plane of your screen. It’s even simpler if the video is in a separate window.
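As a rough sketch of the geometry involved, the Python below computes the four world-space corners of a rectangle that a street camera's video could be textured onto, given only the camera's position and heading. Everything here is an assumption for illustration: the function name `video_quad`, a level (no pitch or roll) camera, axes with x east, y north, and z up, a heading measured clockwise from north, and a known horizontal field of view. The actual texture mapping would be up to whatever 3D player does the rendering.

```python
import math


def video_quad(position, yaw_deg, hfov_deg, aspect, distance):
    """Corners (top-left, top-right, bottom-right, bottom-left) of a quad
    that exactly fills the camera's view at `distance` in front of it.

    Assumptions: level camera, x = east, y = north, z = up,
    yaw measured clockwise from north, aspect = width / height.
    """
    px, py, pz = position
    yaw = math.radians(yaw_deg)
    # Unit vectors for where the camera looks and what is to its right.
    forward = (math.sin(yaw), math.cos(yaw))
    right = (math.cos(yaw), -math.sin(yaw))
    # Half-extents of the view rectangle at the chosen distance.
    half_w = distance * math.tan(math.radians(hfov_deg) / 2)
    half_h = half_w / aspect
    # Center of the quad, straight ahead of the camera.
    cx = px + forward[0] * distance
    cy = py + forward[1] * distance

    def corner(side, up):
        return (cx + side * half_w * right[0],
                cy + side * half_w * right[1],
                pz + up * half_h)

    return [corner(-1, 1), corner(1, 1), corner(1, -1), corner(-1, -1)]


# Hypothetical camera 2 m above the origin, facing north, 90-degree view.
quad = video_quad((0.0, 0.0, 2.0), yaw_deg=0, hfov_deg=90, aspect=2.0, distance=10.0)
```

Each video frame would then be drawn as a texture stretched across those four corners, so the footage sits in the scene where the camera is actually pointing rather than flat on your screen.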
Having the density of image capture points (aka "nodes") be high enough to resemble moving video is also on the horizon. The cameras Google is reportedly driving around with are capable of capturing and processing 360-degree imagery at 30 FPS [edit: see the first comment below for a cool example of panoramic video]. The only real limitations are server-side storage for the higher density of nodes (not a big deal for Google) and bandwidth to your browser. Caveat: virtualizing the scene to a real-time 3D model is still preferable to streaming video because we don’t always want to move in the exact path (and direction) that Google took. Just imagine trying to drive down a street backwards: with Street View Video, time would seem to run backwards.
Clickable store-front information is an obvious must-have feature — show menus for restaurants, and so on. And like Google Earth, we’ll want to turn on information layers beyond the mere street names. Given the vast store of information already in Google Earth, adding this to Street View shouldn’t be too hard — even rendering KML should be doable, given enough time and energy.
The only really hard problem is the one I mentioned earlier: turning essentially 2D panoramic Street View images into 3D models. The state of the art still requires some manual labor, which kind of hampers any planet-scale effort. But new camera technology for recording distance per pixel could soon save the day, as could an interesting recent acquisition.
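To see why distance per pixel matters so much, here is a small Python sketch of the step it would unlock: turning one panorama pixel plus its measured distance into a 3D point. It assumes the panorama is stored in an equirectangular layout (columns map to compass direction, rows to elevation angle), with the center column facing forward; that layout, the axis convention, and the function name `panorama_pixel_to_point` are my assumptions, not anything Google has described.

```python
import math


def panorama_pixel_to_point(u, v, width, height, depth):
    """Back-project an equirectangular panorama pixel to a 3D point.

    Assumptions: x = right, y = forward, z = up, relative to the camera;
    the center column (u = width / 2) faces forward; `depth` is the
    measured distance from the camera to whatever that pixel shows.
    """
    lon = (u / width - 0.5) * 2 * math.pi  # -pi (left) .. +pi (right)
    lat = (0.5 - v / height) * math.pi     # +pi/2 (top) .. -pi/2 (bottom)
    x = depth * math.cos(lat) * math.sin(lon)
    y = depth * math.cos(lat) * math.cos(lon)
    z = depth * math.sin(lat)
    return (x, y, z)


# Hypothetical 4096x2048 panorama: the center pixel, 5 m away.
point = panorama_pixel_to_point(2048, 1024, 4096, 2048, 5.0)
```

Run that over every pixel and you get a cloud of 3D points around each capture location. Stitching those points into clean textures and polygons is still the hard part, but at least the depth would no longer have to be guessed by hand.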
At that point, and as browsers are more and more capable of 3D, I expect to see Google Street View, Maps, and Earth become one truly integrated product, offering different perspectives on the same digital earth.
As always, I’m merely an observer. Google doesn’t tell me their plans. But let’s check back in a year or so and see how much of this comes to pass.