The billions of images available on a site like Flickr have stimulated the imagination of many researchers. After designing tools that use Flickr to edit your photos, another team at the University of Washington (UW) is using our vacation photos to create 3D models of world landmarks. Recreating original scenes is challenging because the photos we put on Flickr and similar sites vary widely in quality. Still, with such a large number of pictures available, the researchers have been able to reconstruct accurate virtual 3D models of landmarks, including Notre Dame Cathedral in Paris and the Statue of Liberty in New York City.
You can see above an example of the accuracy reached by the computer scientists. They used 56 images of the Pisa Duomo, taken by 8 photographers, to reconstruct a digital version. The top shows a surface model merged from the 56 depth maps. On the bottom left, you can compare it with a partial model of the Duomo acquired with a time-of-flight laser scanning system. The false-color rendering on the bottom right shows the registered models overlaid on top of each other. The team found that "90% of the reconstructed samples are within 0.128m of the laser scanned model of this 51m high building." (Credit: UW)
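The "90% within 0.128m" figure quoted above is a point-to-model distance statistic. Here is a minimal, brute-force sketch of how such a metric might be computed, assuming both the reconstruction and the laser-scanned reference are available as point samples (the data and the `fraction_within` helper are illustrative; the paper's comparison is against the scanned model itself):

```python
import math

def fraction_within(reconstructed, reference, tolerance):
    """Fraction of reconstructed points whose nearest reference point
    lies within `tolerance` (brute-force nearest neighbor)."""
    hits = 0
    for p in reconstructed:
        if min(math.dist(p, q) for q in reference) <= tolerance:
            hits += 1
    return hits / len(reconstructed)

# Toy data: one of the two reconstructed points lies within 0.128 m
# of the reference sample, so the score is 0.5.
recon = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
ref = [(0.0, 0.0, 0.1)]
print(fraction_within(recon, ref, 0.128))  # → 0.5
```

A real evaluation over millions of samples would use a spatial index (e.g. a k-d tree) rather than this quadratic scan, but the statistic itself is this simple.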
This research project was led by Steve Seitz, a UW associate professor of computer science and engineering, and Michael Goesele, a former postdoctoral researcher at the UW who is now a professor at Technische Universität Darmstadt in Germany. Other researchers involved in the project include Noah Snavely, a UW doctoral student in computer science and engineering; Brian Curless, a UW associate professor of computer science and engineering; and Hugues Hoppe, a researcher at Microsoft Research in Redmond, Washington.
But how were these researchers able to reach such accuracy? "To make the 3D digital model, the researchers first download photos of a landmark. For instance, they might download the roughly 60,000 pictures on Flickr that are tagged with the words 'Statue of Liberty.' The computer finds photos that it will be able to use in the reconstruction and discards pictures that are of low quality or have obstructions. Photo Tourism, a tool developed at the UW, then calculates where each person was standing when he or she took the photo. By comparing two photos of the same object that were taken from slightly different perspectives, the software applies principles of computer vision to figure out the distance to each point."
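The last step quoted above rests on triangulation: once you know where two photographers stood and the direction each camera was looking toward the same scene point, the point's 3D position falls out geometrically. A minimal sketch of that principle, recovering the point as the midpoint of the closest approach of the two viewing rays (the toy setup and function names are illustrative, not the researchers' actual implementation):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def triangulate(p1, d1, p2, d2):
    """Closest-approach midpoint of rays p1 + t*d1 and p2 + s*d2."""
    w0 = tuple(a - b for a, b in zip(p1, p2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b          # zero only if the rays are parallel
    t = (b * e - c * d) / denom
    s = (a * e - b * d) / denom
    q1 = tuple(p + t * x for p, x in zip(p1, d1))
    q2 = tuple(p + s * x for p, x in zip(p2, d2))
    return tuple((u + v) / 2 for u, v in zip(q1, q2))

# Two photographers standing 1 m apart, both looking at a point 2 m
# in front of and between them: the rays meet at (0.5, 0.0, 2.0).
point = triangulate((0, 0, 0), (0.5, 0, 2), (1, 0, 0), (-0.5, 0, 2))
print(point)  # → (0.5, 0.0, 2.0)
```

Because real rays never intersect exactly (noisy camera estimates), taking the midpoint of their closest approach is a common way to get a robust single point; repeating this for every matched pixel yields the depth maps the researchers merge into a surface.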
And how long does it take to digitally reconstruct a building starting from online images? "In tests, a computer took less than two hours to make a 3D reconstruction of St. Peter's Basilica in Rome, using 151 photos taken by 50 different photographers. A reconstruction of Notre Dame Cathedral used 206 images taken by 92 people. All the calculations and image sorting were performed automatically."
This research was presented at the International Conference on Computer Vision (ICCV 2007), held last month in Rio de Janeiro, Brazil. Here is a link to the paper, "Multi-View Stereo for Community Photo Collections" (PDF format, 8 pages, 8.75 MB), from which the above image has been extracted. The UW researchers also presented another paper, "Scene Summarization for Online Image Collections" (PDF format, 8 pages, 2.27 MB), which is also worth reading.
Finally, for your viewing pleasure, here are three additional links about this research project, which contain many more images and references.
Sources: University of Washington News, November 1, 2007; and various websites