More on PhotoSynth


Yechezkal writes in with some additional notes on PhotoSynth, which has already been covered elsewhere.

While Microsoft demoed this interesting app, it turns out it was more of an acquisition than an R&D effort on their part. The company SeaDragon Software was recently sold to MS. And some of the principals have much more interesting videos and papers on a Home Page at the University of Washington, where it looks like the bulk of the R&D was actually done (this video was apparently the technical one presented at SIGGRAPH this year).

Watching the long video, I’m left with a much better impression of PhotoSynth. Stitching multiple photos into a 3D landscape is very cool. But being able to navigate between a series of photos for more or less detail of a scene is even cooler. The integration with MS’s new virtual globe seems somewhat tangential though: an aerial shot was used to georeference a set of photos, but I didn’t see any evidence of PhotoSynth being integrated into said virtual globe.

PhotoSynth is indeed interesting, but its limits are evident in the segment where they navigate along the Great Wall. It reminded me of the old MS image-based hardware push called Talisman, which failed for various reasons. I’m also curious about how well PhotoSynth handles photos taken under wildly different conditions: day and night, different weather. I noticed some big discontinuities when switching between shots where the sun is simply at different angles. It’s understandable. Computer vision is a hard problem.

Check out Paul Debevec’s work in this area (his work on imaging in general is some of the most important, IMO).

So all of this is to say that, as a stand-alone product, I think it’s potentially quite cool. I’d use it if it isn’t too expensive. As part of a virtual globe, I have some doubts about the tech I’ve seen so far.

The most useful aspect for general use of virtual globes would be the surface reconstruction feature, though that unfortunately seems a bit toned down: you get a cloud of 3D points, not the solid reconstruction with texture detail that would bring a virtual scene to life. You do get a quality picture now, but only when your view is locked closely to one of the original quality photos.

The hardest technical aspect seems to be the registration of hundreds of images to find key similarities. That seems simpler and less error-prone than using GPS tagging of the sources, as I probably would have done. But if that could be done off-line (not sure how long it takes, even with PhotoSynth), what I’d want to see in the end is an actual 3D model with textured LOD and shadows either removed or aligned to the same time of day. Once that pre-baked base model is available for streaming, it seems less of a stretch to let people plug in their own oriented and mapped photos on top of it in a standard virtual globe.
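To make the registration idea a bit more concrete, here is a rough sketch of one common way to find "key similarities" between two photos: compare feature descriptors and keep only the matches whose best candidate clearly beats the second best. This is a generic nearest-neighbor ratio test, not necessarily what PhotoSynth does. The function name `match_features`, the `ratio` threshold, and the idea that each photo has already been reduced to a list of numeric descriptors are all assumptions made for illustration.

```python
import math


def match_features(desc_a, desc_b, ratio=0.75):
    """Match descriptors from photo A to photo B (hypothetical helper).

    desc_a, desc_b: lists of equal-length numeric tuples, one per feature.
    Returns (i, j) index pairs where desc_a[i] is unambiguously closest
    to desc_b[j], i.e. best distance < ratio * second-best distance.
    """
    matches = []
    if len(desc_b) < 2:
        # The ratio test needs at least two candidates to compare.
        return matches
    for i, da in enumerate(desc_a):
        # Distance from this descriptor to every candidate in the other photo.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        (best, j), (second, _) = dists[0], dists[1]
        # Reject ambiguous matches, e.g. repeated textures like bricks.
        if best < ratio * second:
            matches.append((i, j))
    return matches
```

Run over every pair of photos, this is quadratic in the number of images, which is one reason it would make sense to do the work off-line.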

Anyway, I have no idea what GE is doing in this regard. So far, they let you put your photos as icons on the earth. It would be fairly easy for them to let you orient those photos in 3D, even automatically fade in/out the ones that are closest to your current view (like PhotoSynth), or let you click on the photo to align your browser’s view with the one from the photo.
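The fade in/out idea is simple enough to sketch. The following is only an illustration under stated assumptions: each photo is assumed to already carry a position and a viewing direction, and the `Photo` class, the `angle_weight` factor, and the `fade_start`/`fade_end` thresholds are made-up names and values, not anything from GE or PhotoSynth.

```python
import math
from dataclasses import dataclass


@dataclass
class Photo:
    name: str
    position: tuple   # (x, y, z) in scene units (assumed known)
    direction: tuple  # vector the camera was facing (assumed known)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _normalize(v):
    n = math.sqrt(_dot(v, v))
    if n == 0:
        raise ValueError("direction vector must be non-zero")
    return tuple(x / n for x in v)


def view_distance(photo, cam_pos, cam_dir, angle_weight=10.0):
    """Score how far a photo's view is from the current view (lower is closer).

    Adds positional distance to the angle (in radians) between the two
    viewing directions, scaled by the hypothetical angle_weight.
    """
    d = math.dist(photo.position, cam_pos)
    cos_a = _dot(_normalize(photo.direction), _normalize(cam_dir))
    angle = math.acos(max(-1.0, min(1.0, cos_a)))
    return d + angle_weight * angle


def photo_opacities(photos, cam_pos, cam_dir, fade_start=5.0, fade_end=20.0):
    """Return {photo name: opacity in [0, 1]}.

    Fully opaque when the score is at or below fade_start, invisible at or
    beyond fade_end, with a linear fade in between.
    """
    result = {}
    for p in photos:
        score = view_distance(p, cam_pos, cam_dir)
        t = (fade_end - score) / (fade_end - fade_start)
        result[p.name] = max(0.0, min(1.0, t))
    return result
```

Clicking a photo to align the browser's view would then just be the reverse: set the camera's position and direction to the values stored on that `Photo`.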

The hard work would be in registering thousands of photos, as SeaDragon has done. But if GE or someone else can take that a step further and come up with a more detailed 3D model from the source photos, then that could be much better streamed as part of GE or some other virtual globe as layer data.

Apart from the technical aspects, I guess the final question I’d ask is this: if it takes hundreds of photos to accurately reconstruct a site, doesn’t a base 3D model (hand-made or automatic) with good detail, augmented by individual photos, make more sense than hoping 100 people take good photos? After all, if a place is important enough for hundreds of people to photograph, it may be high on the list for someone to model in SketchUp. On the other hand, there’s some aesthetic value in seeing the combined effort of so many people, something that says “we were all here,” which I think makes PhotoSynth interesting as a Flickr-style enhancement.

  1. #1 by Yechezkal on August 6, 2006 - 1:08 pm

    Microsoft recently bought Vexcel, which is in the "close range photogrammetry" business, i.e., making CAD models from imagery data.

    So they are clearly following up on your line of thought about capturing 3D data from imagery and using this to augment mixed reality worlds (and possibly to provide better image drapes, as well as 3D navigation, modeling, and simulation as an overlay to the real world).

    http://www.vexcel.com/image_gallery/crange/index.html
