The idea that most man-made objects can be represented with sweep surfaces (cylinders, tubes, boxes, etc.) isn’t that new. Second Life primitives used exactly the same principle, with some interesting extensions for cuts, twists, tapers and so on.
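For anyone who hasn't played with sweep surfaces, here is a minimal sketch in Python of the general idea: take a closed 2D profile, push it along a straight path, and optionally twist and taper it on the way. The function name `sweep_profile` and its parameters are my own invention for illustration; this is not Second Life's prim code and not the paper's method.

```python
import math

def sweep_profile(profile, path_length=1.0, steps=8, twist=0.0, taper=0.0):
    """Sweep a closed 2D profile along the z axis (hypothetical helper).

    profile: list of (x, y) points describing the cross-section
    twist:   total rotation in radians, applied linearly from bottom to top
    taper:   fraction the profile shrinks by at the top (0 = none, 1 = a point)
    Returns (vertices, quads); each quad is a tuple of four vertex indices.
    """
    n = len(profile)
    vertices = []
    for i in range(steps + 1):
        t = i / steps                      # 0 at the bottom, 1 at the top
        angle = twist * t
        scale = 1.0 - taper * t
        c, s = math.cos(angle), math.sin(angle)
        for x, y in profile:
            # Rotate the profile point, scale it, then place it at height t.
            vertices.append((scale * (c * x - s * y),
                             scale * (s * x + c * y),
                             path_length * t))

    quads = []
    for i in range(steps):
        for j in range(n):
            a = i * n + j                  # current ring, current point
            b = i * n + (j + 1) % n        # current ring, next point (wraps)
            quads.append((a, b, b + n, a + n))
    return vertices, quads

# A square cross-section swept into a box that twists 90 degrees and
# shrinks to half size at the top.
square = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]
vertices, quads = sweep_profile(square, steps=4, twist=math.pi / 2, taper=0.5)
```

Swap the square for a many-sided polygon and you get a cylinder or cone; a handful of numbers describes the whole shape, which is what makes this kind of primitive easy to fit and edit.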
But selecting photographic imagery based on implicit primitives, in-painting (hallucinating) the background and unseen object views, and (occasionally) relighting the object are all extremely clever and very useful. Combine this with a system that can relight virtual objects based on scene shadows and you have a paint program that can revise reality, at least virtually, in a way that would fool almost anyone.
The end goal of all this work is something I used to call “parametric 3D video”, which roughly means we take one or more 2D video streams, split out the objects, backgrounds, and people into separate and fully adjustable pieces, send them as 3D content rather than pixels, and then re-synthesize the result from any angle at the receiving end, along with any changes you want to make.
3D (color + depth) video capture makes the problem much easier. Techniques like the ones in this paper are still needed to finish the job, but they can be much more automatic in terms of finding and cutting out objects.
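To give a rough feel for why depth helps with the "finding and cutting" part, here is a toy sketch: group neighbouring pixels into one region as long as their depth values don't jump by more than a threshold. The function `segment_by_depth` and the `max_jump` parameter are assumptions of mine for illustration; real depth data is noisy and has holes, so a real system would need far more than this.

```python
from collections import deque

def segment_by_depth(depth, max_jump=0.1):
    """Label 4-connected regions of a depth map (hypothetical helper).

    Two neighbouring pixels join the same region when their depths differ
    by less than max_jump. Returns (labels, region_count).
    """
    h, w = len(depth), len(depth[0])
    labels = [[-1] * w for _ in range(h)]
    next_label = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx] != -1:
                continue                   # already part of a region
            labels[sy][sx] = next_label
            queue = deque([(sy, sx)])
            while queue:                   # breadth-first flood fill
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] == -1
                            and abs(depth[ny][nx] - depth[y][x]) < max_jump):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label

# A near object (1.0 m) in front of a far wall (5.0 m).
depth = [
    [5.0, 5.0, 5.0, 5.0],
    [5.0, 1.0, 1.0, 5.0],
    [5.0, 1.0, 1.0, 5.0],
    [5.0, 5.0, 5.0, 5.0],
]
labels, count = segment_by_depth(depth, max_jump=0.5)
```

With color alone, separating that object from a similarly colored wall takes user strokes or a clever prior; with depth, the discontinuity does most of the work.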