Turning 2D Photos into First Rate 3D Experiences
Google Earth Blog: Look Around with Google Panoramio
Frank Taylor has the news about Google’s Panoramio geo/photo/sharing site adding a more "Photosynth-like" feature that lets you navigate from 2D photo to 2D photo based on their computed areas of overlap. In essence, the software seems to figure out if and where two photos overlap, compute the viewing transformation from each photo to every overlapping match, and then do ye olde cross-fade warp (a simple trapezoidal stretch) when you pick a source and destination to navigate between.
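For the curious, here is roughly what I mean by a "trapezoidal stretch". This is only my guess at the general technique, not Panoramio's actual code: a minimal Python sketch that fits a 3x3 projective transform (a homography) to four matched corner points, then interpolates the corners and the two opacities for a cross-fade. The function names are made up for illustration.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point correspondences")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography_from_corners(src, dst):
    """Fit the 3x3 projective map sending four src points to four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence, with h[2][2] fixed at 1.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map one 2D point through the homography (with perspective divide)."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def transition_frame(src_corners, dst_corners, t):
    """Corner positions plus (outgoing, incoming) opacity at t in [0, 1]."""
    corners = [((1 - t) * x + t * u, (1 - t) * y + t * v)
               for (x, y), (u, v) in zip(src_corners, dst_corners)]
    return corners, 1.0 - t, t


# Hypothetical example: a unit square photo stretched into a trapezoid.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
trapezoid = [(0, 0), (2, 0.5), (2, 1.5), (0, 2)]
H = homography_from_corners(square, trapezoid)
```

Four point pairs pin down the eight unknowns exactly. A real matcher would have many noisy correspondences and would fit them more robustly, which I'm skipping here.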
The effect is somewhat similar to Photosynth, in that you can navigate a well-covered real-world scene based purely on existing 2D photos. The downside is very similar to Photosynth — the transformations are still lacking something that’s hard to pin down. Each photo is still 2D, and the warping or flying around isn’t enough to preserve the implied 3D perspective you get when you view a photo from the exact angle it was taken. Move a little to either side and the 3D effect diminishes, while disorientation increases rapidly. Both apps try to minimize the downside by fading a picture out when it gets too far off base, leaving lots of missing information temporally and spatially.
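The fade-out trick can be sketched as an opacity that depends on how far the current view direction has drifted from the direction the photo was shot in. I have no idea what curve either app actually uses; the thresholds (10 and 30 degrees) and the function names below are pure assumptions for illustration.

```python
import math


def view_angle_deg(a, b):
    """Angle in degrees between two 3D direction vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    cos_angle = max(-1.0, min(1.0, dot / (na * nb)))  # clamp rounding error
    return math.degrees(math.acos(cos_angle))


def fade_weight(angle_deg, full_until=10.0, gone_at=30.0):
    """Opacity of a photo: 1 near its original view, easing to 0 far off.

    full_until and gone_at are made-up thresholds, not measured values.
    """
    if angle_deg <= full_until:
        return 1.0
    if angle_deg >= gone_at:
        return 0.0
    s = (angle_deg - full_until) / (gone_at - full_until)
    return 1.0 - (3 * s * s - 2 * s * s * s)  # smoothstep falloff
```

A curve like this hides the worst distortion, but it also explains the holes: whenever every nearby photo has a weight near zero, there is simply nothing left to draw.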
Now, I’m deeply impressed with Photosynth’s ability to find those per-pixel correspondences among so many photos so quickly, and now with Panoramio’s as well. That’s the real heart of each technology. But both apps will need to make a big leap in UI before they’ll be really useful.
It ultimately needs to be as simple as Google Earth’s UI — grab the panoramic-composite-of-many-photos and click or slide the view to where you want to go, the way you can grab the earth or fly from place to place.
Here’s an example of how it works well (IMO) for a pre-defined video sequence. It’s not exactly the same hard problem to solve, but it shows some of the cool new (invisible, intuitive) UI I’m getting at.
So the real-time viewing transformations ultimately need to choose and blend the most relevant parts of each overlapping 2D image, preserving correct depth at each pixel, compensating for irksome lens deformations and divergent fields of view, seamlessly sliding the new composite 3D view around.
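To make "choose and blend the most relevant parts" a little more concrete, here is a toy sketch for a single output pixel. It assumes you somehow already have, for each overlapping photo, a color sample, a depth, and the angle between that photo's view and the current one (getting those is the hard part, and I'm not showing it). The weighting formula and the 5% depth tolerance are my own assumptions.

```python
def blend_pixel(samples, depth_tolerance=0.05):
    """Blend candidate samples for one output pixel.

    samples: list of (rgb, depth, angle_deg) tuples, one per source photo.
    Samples much deeper than the nearest one are treated as occluded and
    dropped; the rest are averaged, favoring photos shot from closer to
    the current viewing angle.
    """
    if not samples:
        return None
    nearest = min(depth for _, depth, _ in samples)
    visible = [(rgb, angle) for rgb, depth, angle in samples
               if depth - nearest <= depth_tolerance * nearest]
    weights = [1.0 / (1.0 + angle) for _, angle in visible]
    total = sum(weights)
    return tuple(
        sum(w * rgb[i] for (rgb, _), w in zip(visible, weights)) / total
        for i in range(3)
    )


# Hypothetical samples: two photos agree on depth, a third sees something
# far behind the surface and gets rejected.
pixel = blend_pixel([
    ((255, 0, 0), 10.0, 0.0),
    ((0, 0, 255), 10.0, 0.0),
    ((0, 255, 0), 50.0, 0.0),
])
```

Doing this for every pixel of every frame is exactly the cost I'm complaining about, and it says nothing yet about lens distortion or mismatched fields of view.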
Neither product is quite there yet — the 3D point cloud of Photosynth isn’t really working for me either. Both apps frankly make me a little queasy to use because the "world" is constantly warping or fading in unnatural ways.
The problem is that this is very hard to do right. Hell if I know how to solve those per-pixel transforms cheaply, i.e., in real-time. It’d take someone much smarter than me. However, if someone can show how to solve the problem at all — and I’ve seen some academic work on this, making 2D photos pop out in 3D with some serious pre-computation — then the results can be pre-digested in such a way as to make them navigable in real-time. That part is something I’d at least know how to do.
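By "pre-digested" I mean something like the following: do all the expensive matching offline, store the results in a lookup structure, and leave only a cheap query for run time. This is a bare-bones sketch under my own assumptions (photos reduced to 2D positions, a caller-supplied `overlap_fn` standing in for the expensive part, invented function names), not a description of how either product works.

```python
import math


def build_navigation_index(photos, overlap_fn):
    """Offline step: record, for each photo, its overlapping neighbors.

    photos: dict mapping photo id -> (x, y) camera position.
    overlap_fn(a, b): the expensive test, called once per pair.
    Returns a dict mapping photo id -> list of (bearing_deg, neighbor_id).
    """
    index = {pid: [] for pid in photos}
    ids = sorted(photos)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if overlap_fn(a, b):
                ax, ay = photos[a]
                bx, by = photos[b]
                bearing = math.degrees(math.atan2(by - ay, bx - ax)) % 360
                index[a].append((bearing, b))
                index[b].append(((bearing + 180) % 360, a))
    return index


def next_photo(index, current, heading_deg):
    """Run-time step: pick the neighbor closest to the requested heading."""
    neighbors = index.get(current, [])
    if not neighbors:
        return None

    def gap(bearing):
        d = abs(bearing - heading_deg) % 360
        return min(d, 360 - d)

    return min(neighbors, key=lambda n: gap(n[0]))[1]
```

The offline pass is quadratic in the number of photos, which is fine because nobody is waiting on it. The run-time query only touches one photo's short neighbor list.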
Still, the best results, IMO, will be when the many 2D photos can be easily "splatted" (projected, dissected, virtualized) onto and above the 3D landscape & buildings of Google Earth or Microsoft’s Virtual Earth (VE). The goal should be to make it all viewable from many more angles than originally captured, not just to move around a set of limited 2D photos in pseudo-3D. Once you can do that, moving around the 2D images in 3D becomes a much more natural activity. In fact, do this well enough, and you’d never even notice that the photos are only 2D to begin with.
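The core of "splatting" a photo onto 3D geometry is just running the camera backwards: for each point on the terrain or a building, work out which pixel of the photo saw it. Here is a minimal pinhole-camera sketch of that lookup. It assumes you already know the camera's position, orientation, and focal length, and it ignores lens distortion and occlusion; the function name and parameters are mine.

```python
def project_point(point, cam_pos, rotation, focal_px, width, height):
    """Find the photo pixel that sees a 3D world point, or None.

    rotation: 3x3 world-to-camera matrix as a list of rows; the camera
    looks down its own +z axis. focal_px is the focal length in pixels.
    """
    rel = [p - c for p, c in zip(point, cam_pos)]
    xc, yc, zc = (sum(r * v for r, v in zip(row, rel)) for row in rotation)
    if zc <= 0:
        return None  # point is behind the camera
    u = width / 2 + focal_px * xc / zc
    v = height / 2 + focal_px * yc / zc
    if not (0 <= u < width and 0 <= v < height):
        return None  # point falls outside the photo
    return (u, v)


# Hypothetical camera at the origin, looking straight down +z.
IDENTITY = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
center = project_point((0, 0, 10), (0, 0, 0), IDENTITY, 100, 200, 200)
```

Once every surface point knows which photo pixels cover it, the photos stop being flat cards and become paint on the model, viewable from wherever you like.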
Sorry if that was all a bit too technical today. My head is full of computer science. Ask questions in the comments and I can try to clarify.