This idea reportedly comes from a competition that Meta (Space Glasses) is holding. The idea is to project a holographic image of your phone in space (using said glasses) and let you virtually interact with it, instead of taking the phone out of your pocket to do exactly the same.
Why is it a brilliant idea?
It’s simple. People get accustomed to their phones’ UIs. Projecting the phone holographically requires not a single new thought and changes nothing about the core experience. Well, it does lose out on touching that sleek and sexy touch-screen and feeling the nicely balanced weight of the phone in your hand. And of course certain phone experiences lose key sensors like accelerometers (to a degree) and cameras (entirely).
So I guess that means you couldn’t run old augmented reality apps on your holographic phone for a recursive experience. Oh well. There goes a nice photo op.
Why is this a stupid idea?
Your head mounted device can [eventually] paint pixels anywhere you look. It can detect touch anywhere it can see your hands. Why would we limit ourselves to drawing a 4″ screen when we have an infinitely large screen on our head?
It’s a lot like saying, “Hey, we got used to small CRT TVs so let’s draw a small TV inside our brand new 60″ flat screen TV so people don’t have to learn something new.”
Interfaces for AR will run the gamut from holographic virtual actors who become your daily assistant, to making every physical surface in the world potentially interactive by touch, sight and sound. Why would we limit ourselves to UI mechanisms that were designed around the limits of small screens and touch?
Just for the experience of not having to take our phone out of our pocket? Are we really that lazy? If so, ask yourself how much you’d be willing to pay to use your phone without taking it out of your pocket. I’d pay maybe $1.
This really comes down to a core question about AR. Is it about being the ultimate hands-free device, principally meant to deliver us from holding our phone in our hands or up to our faces? Or is it about re-imagining the analog world with new digital layer(s) of content on top?
I can see an app like this being very popular, at least in the way the fart app is popular. That’s only because people’s imaginations are presently too limited. They just haven’t seen the best ideas yet.
On the other hand, it’s turning out that the most popular interface for your new 60″ flat-screen TV with billions of streaming video options is not some fancy new Xbox-like natural UI, but rather just your phone.
So what do I know? People may ultimately find ‘stupid’ brilliant.
The idea that most man-made objects can be represented with sweep surfaces (cylinders, tubes, squares, etc.) isn’t that new. Second Life primitives used exactly the same principle, with some interesting extensions for cuts, twists, tapers and so on.
But selecting photographic imagery based on implicit primitives, in-painting (hallucinating) the background and unseen object views, and (occasional) relighting of the object is all extremely clever and very useful. Combine this and a system that can relight virtual objects based on scene shadows and you have a paint program that can revise reality, at least virtually, but in a way that would fool almost anyone.
The end-goal of all this work is something I used to call “parametric 3D video” — which roughly means we take one or more 2D video streams, split out the objects, backgrounds, people into separate and fully adjustable pieces, send them as 3D content vs. pixels, and then re-synthesize the result from any angle at the receiving end, along with any changes you want to make.
3D (color + depth) video capture makes the problem much easier. Techniques like this paper are still needed to finish the job, but they can be much more automatic in terms of finding and cutting objects.
Here’s a great* Verge article on the AWE 2013 conference that wrapped up last week. I had the honor of speaking, along with a number of my co-workers. Although, as Tish points out, we’re not actually doing AR at Syntertainment, we’re very passionate about the field as one of several key enabling technologies.
* yes, of course mentioning me positively supports my opinion of your article, even if you get my name slightly wrong.
I was checking the Oculus VR site today to see when my dev kit will ship (April, apparently) and I noticed this interesting job posting:
- Design and implement techniques for optical flow and structure from motion and camera sensors
- Relevant computer vision research (3d optical flow, structure from motion, feature tracking, SLAM)
What this means to me is that Oculus may be trying to solve AR as well as VR.
Now, it’s possible they’re just trying to use optical flow to create an absolute reference frame for the rendered VR scene. The tiny gyros that track your head rotation tend to drift over time and corresponding magnetometers (basically, compasses) aren’t always reliable enough to help. It’s also desirable to know your absolute lateral translation at any given time, which onboard accelerometers can only guess at. So some amount of computer vision makes sense for VR, implying there will be cameras mounted on the VR glasses, if not already. (I’d certainly encourage them to put cameras facing your eyes too).
However, “structure from motion” and SLAM would be better suited to 3D tracking such that a video camera could overlay virtual 3D objects on video of the real world, like many of the AR demos you see on phones and tablets today. Done well, it gives much more precise 3D transformations than VR might need, and SLAM is traditionally quite expensive to compute.
In fact, if you were doing VR glasses with cameras, tracking your arms and legs would be the first problem I’d solve. Getting 3D input right is very important. Oculus seems to be thinking of that as well. (see this other job posting).
So let’s assume AR uses for this SLAM work and see where that leads. The idea is you’d put a camera (or two) on the goggles to capture the real world at a similar field-of-view to the VR display, pipe that video into your 3D renderer with some depth-per-pixel, and you’d theoretically have a better AR solution than Google Glass or similar HUD approaches.
On the plus side, pixels captured from the real world can be correctly occluded by pixels in the virtual scene. That’s very hard to do with see-through AR, which is plagued by the ghostly transparent images we’ve come to call “holograms,” for their ethereal qualities. They’re transparent because there’s currently no solution on the market to stop the natural light that comes from the real world other than to darken the glasses everywhere, so they go from AR to VR anyway…
Of course, latency is the main issue with the camera-based-AR approach. If you turn your head with see-through AR, the real world is visible with zero lag, and the virtual part can be dialed down whenever it would be nauseating. With camera-based AR, you typically have a few frames of latency added to the queue, which can induce nausea and disorientation, especially if the goal is to just walk around with these and not hit or be hit by “things.” Latency has to be extremely low, like 4 ms.
There are tricks to avoid this. Cheaper cameras use what’s called a “rolling shutter” which means they’re effectively only capturing one line of pixels at a time, while that line sweeps across the image at some high rate. This is what causes those funny skewing artifacts when you move your cell phone while taking pictures or video — the picture wasn’t all taken at one point in time like an old film camera or high-end capture device.
The challenge is how to couple a rolling shutter to a “beam-chasing” rendering algorithm, which is the same idea applied in reverse: to the visual output instead of the input. If you do that right (and I’m sure there is a way), then the distance between the camera’s “scan line” and the currently rendered “raster line” is your actual latency, which would be measured in low milliseconds instead of full frames. Cool stuff.
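To put rough numbers on that, here is a minimal sketch of the scan-line arithmetic. It assumes, hypothetically, that the camera and the display are locked to the same frame rate and line count and both sweep top to bottom, and it ignores vertical blanking. The function name and the example figures are illustrative only.

```python
def scanline_latency_ms(camera_line: int, display_line: int,
                        lines_per_frame: int, frame_rate_hz: float) -> float:
    """Latency between the line the camera is capturing now and the line
    the display is drawing now, assuming both sweep in lockstep."""
    # Time spent on a single line of the sweep, in milliseconds.
    line_time_ms = 1000.0 / (frame_rate_hz * lines_per_frame)
    # How many lines the display trails the camera by, wrapping at the frame edge.
    gap_lines = (camera_line - display_line) % lines_per_frame
    return gap_lines * line_time_ms


# Assumed example: 1080 lines at 60 Hz, display trailing the camera by 100 lines.
chased = scanline_latency_ms(500, 400, lines_per_frame=1080, frame_rate_hz=60)
full_frame = 1000.0 / 60
print(f"beam-chased: {chased:.2f} ms, one full frame: {full_frame:.2f} ms")
```

With these made-up numbers the gap works out to about a millisecond and a half, against nearly 17 ms for a single buffered frame.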
But that’s not a full solution — the 99% of the scan lines we rendered in the past (this “frame”) would be stuck in their old positions while our heads turn. But at least the newest rendered pixels would be more correct, right?
A full solution includes a very low-latency image capture of the physical scene, 3D pose estimation (SLAM or similar) while a high-fidelity render of the virtual scene is conducted, and then a very high-frame-rate (120-240 Hz) continuous re-rendering of that scene based on the latest head rotation and translation measurements from the gyros and accelerometers.
This is indeed a kind of magic. Could Oculus do it? They seem to have raised enough money. So best of luck to them.
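Why the re-rendering rate matters can be sketched with simple pinhole-camera math: the longer an image sits on screen while the head turns, the further its pixels are from where they should be. The function below is an illustration under assumed numbers (a 1000-pixel-wide view, a 90 degree field of view, a head turning at 100 degrees per second), not a description of any shipping system.

```python
import math


def reprojection_shift_px(delta_yaw_deg: float, image_width_px: int,
                          horizontal_fov_deg: float) -> float:
    """Horizontal shift, in pixels near the image center, caused by a small yaw change."""
    # Pinhole model: focal length in pixels from image width and field of view.
    focal_px = (image_width_px / 2) / math.tan(math.radians(horizontal_fov_deg) / 2)
    return focal_px * math.tan(math.radians(delta_yaw_deg))


HEAD_SPEED_DEG_PER_S = 100  # assumed head rotation speed

# Error if the image is one 60 Hz frame old versus one 240 Hz update old.
stale_frame = reprojection_shift_px(HEAD_SPEED_DEG_PER_S / 60, 1000, 90)
fast_update = reprojection_shift_px(HEAD_SPEED_DEG_PER_S / 240, 1000, 90)
print(f"60 Hz: {stale_frame:.1f} px off, 240 Hz: {fast_update:.1f} px off")
```

Under these assumptions, going from 60 Hz to 240 Hz cuts the error from roughly a dozen pixels to a few, which is the whole point of re-rendering from the latest gyro reading.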
TL;DR: an artist is iterating through every combination of pixels to produce every possible digital photograph so as to explore the concept of infinity.
This is less interesting to me as an exploration of infinity, since there are a provably finite number of unique combinations of pixels for any given resolution and color depth.
Of course, for any reasonably sized image, say 640×480 at 24bpp, the number is exceedingly large. It’s about 2 to the 7 million, in this case, which would take more time than the age of the universe to merely count, given computers that today can typically count to only 2^64 sometime before you die. That’s not really that interesting, because counting images is no way to find anything interesting in there.
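The arithmetic is easy to check. This sketch counts the bits in a 640×480, 24bpp image and estimates how many decimal digits the total number of images has. The counting speed of ten billion per second is an assumption chosen purely for illustration.

```python
import math

WIDTH, HEIGHT, BITS_PER_PIXEL = 640, 480, 24

# Every image at this size is one particular string of this many bits.
total_bits = WIDTH * HEIGHT * BITS_PER_PIXEL

# There are 2**total_bits distinct images. Rather than print that number,
# count its decimal digits: floor(total_bits * log10(2)) + 1.
decimal_digits = math.floor(total_bits * math.log10(2)) + 1

# Assumed counting speed (hypothetical): ten billion counts per second.
COUNTS_PER_SECOND = 10**10
SECONDS_PER_YEAR = 365.25 * 24 * 3600
years_to_count_to_2_64 = 2**64 / COUNTS_PER_SECOND / SECONDS_PER_YEAR

print(f"{total_bits:,} bits per image")
print(f"number of images has about {decimal_digits:,} decimal digits")
print(f"counting to 2^64 takes about {years_to_count_to_2_64:.0f} years")
```

So "2 to the 7 million" is really 2 to the 7,372,800, a number with over two million digits, while a machine at the assumed speed needs most of a lifetime just to reach 2^64.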
What’s more interesting to me is the underlying idea that every image is just a number. If you see an array of colored pixels as the mere bits they are, then it’s more obvious that there is a distinct integer or index matched to each unique image. A paint program, like Photoshop, is not actually helping you “draw” anything, in this sense, but merely changing which of those strings of bits are being displayed at any given time.
Yes, so therefore a paint program is just navigating through a pre-determined space of all possible images of a given size. Painting just one pixel is enough to move a little or a lot in that finite space.
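To make "every image is just a number" concrete, here is a minimal sketch that treats raw pixel bytes as one big integer and back again. The function names and the tiny 2×2 image are made up for the example, and the byte order is an arbitrary choice.

```python
def image_to_index(pixels: bytes) -> int:
    """Read the raw pixel bytes of an image as one big integer."""
    return int.from_bytes(pixels, byteorder="big")


def index_to_image(index: int, width: int, height: int,
                   bytes_per_pixel: int = 3) -> bytes:
    """Recover the raw pixel bytes for an index at a fixed image size."""
    return index.to_bytes(width * height * bytes_per_pixel, byteorder="big")


# A tiny 2x2 image at 24 bits per pixel, all black: index 0.
width, height = 2, 2
black = bytes(width * height * 3)

# "Painting" the lowest byte of the last pixel moves to the very next image.
small_step = bytearray(black)
small_step[-1] = 1

# Painting the highest byte of the first pixel jumps almost to the far end.
big_step = bytearray(black)
big_step[0] = 255

print(image_to_index(black))
print(image_to_index(bytes(small_step)))
print(image_to_index(bytes(big_step)))
```

The same one-pixel edit is a step of 1 or a leap of 255 × 2^88 depending on where the pixel sits, which is what "a little or a lot" means in that finite space.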
You didn’t make that nice piece of art, you merely steered the computer towards it.
But what’s even more interesting is the idea that all of those images already and provably exist. That’s right, if you know the number, you know the image and vice-versa. Whether anyone’s ever seen the funny picture of George W. Bush lighting his hair on fire isn’t the point — that image definitely exists, and at 640×480 is going to be very clear. Whether it’s a photo of anything real is another story. And since movies are just strings of strings of bits, the same goes for video too.
So, one might ask, what the hell is copyright for digital imagery but a claim of owning a specific number, or set of numbers that represent the same approximate image? This would go for books and music too, btw, but let’s not wander.
I was so fascinated by this concept when I was younger that I wrote an April Fools post for an old computer graphics usenet group back in 1993. Thanks to Google usenet archives, I have recovered the text (yes, I really wrote this and posted it anonymously while I worked at Disney):
October 21, 1993 — [Geneva] Two Swiss scientists announced last week their stunning discovery of a method for generating and storing any conceivable picture using ordinary personal computers. Called The Database of Every Picture Imaginable, or DOEPI, their system is currently seeking patent and copyright protection in virtually every industrialized nation, including the United States.
Other image generation and storage technologies have been introduced in the past to help cope with the incredible demands of Multimedia and Video-Dialtone but, according to co-inventor Dr. Francois La Tete, of the Alpine Institute, a well-respected Swiss mathematical society, DOEPI is the first system which is capable of storing literally every image. “Our proprietary algorithm is the first of its kind,” says La Tete. “It can compress every image into such a compact space that the software can run with less than one megabyte of memory.”
Indeed, the performance of their system is impressive. Independent experts have confirmed that when fed a “bit-index-code” (a string of 1’s and 0’s which tells the database how to find the proper image), the database can produce the correct image in less than two seconds.
According to La Tete, the original developer of the software, the technique uses an extremely simple principle but one which obviously has eluded the rest of the world so far. Like many inventors, he came up with the idea indirectly. “I was trying to automate the process of collecting images from various FTP sites,” he explains shyly, “when I realized that I could simply create the images myself.” From that point on, his personal computer worked day and night to generate the images he desired.
But Alfonso Marzipani, La Tete’s business partner and former Human Genome Project director, immediately realized the practical side of the invention. He understood that a technique that could generate any picture could be used to create pictures never before seen. “We used [La Tete's] stuff to make a movie about some scary dinosaurs,” explains Marzipani. “Then we saw the same exact thing in a movie I can’t name for legal reasons. I said, ‘Frankie, we’re on to something big.’”
One year later, Marzipani claims the database is complete. The energetic team has, in addition to filing for software patents, filed for blanket copyrights on all of the images stored in the database. “If we make the images first, we should own them, right?” claims Marzipani.
Copyright Offices seem to agree. Nearly twenty countries have already granted blanket copyrights to Marzipani’s operating company, La Monde, SpA. Other countries, like the United States, are more cautious. According to US Copyright Chairperson Ingrid Dingot, such a broad copyright must be seriously considered. “A blanket copyright might mean that they might own any image anyone else tries to create,” says Dingot. “That might have an impact on the US economy.”
To conclusively determine if DOEPI actually does contain every image, the USCO has enlisted James Farrel, an independent data retrieval expert. Farrel has begun the laborious process of printing copies of all of the images in the database, one by one. It is estimated that a full dump of the database will require several trillion years and more paper than exists on the planet.
In the meantime, he has used a more direct approach. Seven major entertainment companies have come together to donate their accumulated libraries of images to Farrel’s effort in the hopes of proving the DOEPI database incomplete. “If we can find even one image they don’t have,” says Arturo Nakagawa, CEO of Sony America, “then their claim is false.”
So far, Farrel has searched for nearly ten thousand images with complete success. “It’s incredible,” says Farrel. “After they compute the bit-index from the control image, it takes only a second or two to find the matching image in the database. Every damn time.”
But according to La Tete, the software isn’t perfect. He admits that the size of the bit-index-code, sometimes in excess of one megabyte, or eight million bits, is overly cumbersome and hints that the next generation of the software will reduce this requirement by a hundred times or more. “At that point,” says La Tete, “our software will be used by nearly every person on the planet.”
But it appears as if La Tete and Marzipani may have their way before the improved software is ever released. “We may have to grant the copyrights,” admits Dingot candidly. “Even the one for sequencing a series of still images as a motion picture.” Indeed, the future of ownership of visual imagery seems bleak.
But the battle isn’t over. Michael Eisner, CEO of the Walt Disney Company, has countered threat with threat. “If their database contains every image, then it contains Disney property,” says Eisner. “Those two owe the Disney Company a large amount of money.” This battle may go over to Disney, with a record of success at this kind of clear-cut property claim. But only time will tell who will win the war.
[AP] -end included article. Reprinted without permission.
I had attributed this to some fake users, but now you know.
Here’s a great interview with my former CEO/CTO, the brilliant Michael T. Jones, in the Atlantic magazine. Link goes to the extended version.
[BTW, The Atlantic seems to be on a tear about Google lately (in a good way), with John Hanke and Niantic last month and lots on Glass recently as well. With that and Google getting out of federal antitrust hot water, it seems they're definitely doing something right on the PR front.]
Quoting Michael on the subject of personal maps:
The major change in mapping in the past decade, as opposed to in the previous 6,000 to 10,000 years, is that mapping has become personal.
It’s not the map itself that has changed. You would recognize a 1940 map and the latest, modern Google map as having almost the same look. But the old map was a fixed piece of paper, the same for everybody who looked at it. The new map is different for everyone who uses it. You can drag it where you want to go, you can zoom in as you wish, you can switch modes–traffic, satellite—you can fly across your town, even ask questions about restaurants and directions. So a map has gone from a static, stylized portrait of the Earth to a dynamic, inter-active conversation about your use of the Earth.
I think that’s officially the Big Change, and it’s already happened, rather than being ahead.
It’s a great article and interview, but I’m not so sure about the “already happened” bit. I think there’s still a lot more to do. From what I see and can imagine, maps are not that personal yet. Maps are still mostly objective today. Making maps more personal ultimately means making them more subjective, which is quite challenging but not beyond what Google could do.
He’s of course 100% correct that things like layers, dynamic point of view (e.g., 2D pan, 3D zoom) and the like have made maps much more customized and personally useful than a typical 1940s paper map, such that a person can make them more personal on demand. But we also have examples from the 1940s and even the 1640s that are way more personal than today.
For example, consider the classic pirate treasure map at right, or an architectural blueprint of a home, or an X-ray that a surgeon marks up to plan an incision (not to mention the lines drawn ON the patient — can’t get much more personal than that).
Michael is right that maps will become even more personal, but only after one or two likely things happen next IMO: companies like Google know enough about you to truly personalize your world for you automatically, AND/OR someone solves personalization with you, collaboratively, such that you have better control of your personal data and your world.
This last bit goes to the question of the “conversation,” which I’ll get to by the end.
First up, we should always honor the value that Google’s investments in “Ground Truth” have brought us, where other companies have knowingly devolved or otherwise strangled their own mapping projects, despite the efforts of a few brave souls (e.g., to make maps cheaper to source and/or more personal to deliver). But “Ground Truth” is, by its very nature, objective. It’s one truth for everyone, at least thus far.
We might call the more personalized form of truth “Personal Truth” — hopefully not to confuse it with religion or metaphysics about things we can’t readily resolve. It concerns “beliefs” much of the time, but beliefs about the world vs. politics or philosophy. It’s no less grounded in reality than ground truth. It’s just a ton more subjective, using more personal filters to view narrow and more personally-relevant slices of the same [ultimately objective] ground truth. In other words, someone else’s “personal truth” might not be wrong to you, but wrong for you.
Right now, let’s consider what a more personal map might mean in practice.
A theme park map may be one of the best modern (if not cutting edge) examples of a personal map in at least one important sense — not that it’s unique per visitor (yet) — but that it conveys extra personally useful information to one or more people, but certainly not to everyone.
It works like this. You’re at the theme park. You want to know what’s fun and where to go. Well, here’s a simplified depiction of what’s fun and where to go, leaving out the crowds, the lines, the hidden grunge and the entire real world outside the park. It answers your biggest contextual questions without reflecting “ground truth” in any strict sense of the term.
Case in point: the “Indiana Jones” ride above is actually contained in a big square building outside the central ring of the park you see here. But yet you only see the entrance/exit temple. The distance you travel to get to and from the ride is just part of the normal (albeit long) line. So Disney safely elides that seemingly irrelevant fact.
Who wants to bet that the scale of the Knott’s Berry Farm map is anywhere near ground truth?
Now imagine that the map can be dynamically customized to reveal only what you’d like or want to see right now. You have toddlers in tow? Let’s shrink most of the rollercoasters and instead blow up the kiddie land in more detail. You’re hungry? Let’s enhance the images of food pavilions with yummy food photos. For those into optimizing their experience, let’s also show the crowds and queues visually, perhaps in real-time.
A Personal Map of The World is one that similarly shows “your world” — the places and people you care most about or are otherwise relevant to you individually, or at least people like you, collectively.
Why do you need to see the art museum on your map if you don’t like seeing art? Why do you need to see the mall if you’re not going shopping or hanging out?
The answer, I figure, is that Google doesn’t really know what you do or don’t care about today or tomorrow, at least not yet. You might actually want to view fine art or go shopping, or plan an outing with someone else who does. That’s often called “a date.” No one wants to “bubble” you, I hope. So you currently get the most conservative and broadest view possible.
How would Google find out what you plan to do with a friend or spouse unless you searched for it? Well, you could manually turn on a layer: like “art” or “shopping” or “fun stuff.” But a layer is far more like a query than a conversation IMO — “show me all of the places that sell milk or cheese” becomes the published “dairy layer” that’s both quite objective and not much more personal than whether someone picks Google or Bing as their search engine.
Just having more choices about how to get information isn’t what makes something personal. It makes it more customized perhaps… For truly personal experiences, you might think back to the treasure map at the top. It’s about the treasure. The map exists to find it.
Most likely, you want to see places on the map that Google could probably already guess you care about: your home, your friends’ homes, your favorite places to go. You’d probably want to see your work and the best commute options with traffic at the appropriate times, plus what’s interesting near those routes, like places that sell milk or flowers on the way home.
Are those more personal than an art layer or even a dairy layer? Perhaps.
Putting that question aside for a moment, an important and well known information design technique focuses on improving “signal to noise” by not just adding information but more importantly removing things of lesser import. You can’t ever show everything on a map, so best to show what matters and make it clear, right?
City labels, for example, usually deter adjacent labels of less importance (e.g., neighborhoods) to better stand out. You can actually see a ring of “negative space” around an important label if it’s done properly.
In the theme park map example, we imagined some places enlarged and stylized to better convey their meaning to you, like with the toddler-friendly version we looked at. That’s another way to enhance signal over noise — make it more personally relevant. Perhaps, in the general case, your house is not just a literal photo of the structure from above, but rather represented by a collage of your family, some great dinners you remember, your comfy bed or big TV, or all of the above — whatever means the most to you.
That’s also more personal, is it not?
Another key set of tools in this quest concerns putting you in charge of your data, so you can edit that map to suit and even pick from among many different contexts.
Google already has a way to edit in their “my maps” feature. But even with the vast amount of information they collect about us, it’s largely a manual or right-click-to-add kind of effort. Why couldn’t they draw an automatic “my maps” based on what they know about us already? Why isn’t that our individual “base layer” whenever we’re signed in, collecting up our searches in an editable visual history of what we seem to care about most?
Consider also, why don’t they show subjective distances instead of objective ones, esp. on your mobile devices? This is another dimension of “one size fits all” vs. the truly personal experience to which we aspire.
A “subjective distance” map also mirrors the theme park examples above. If you’re driving on a highway, places of interest (say gas stations) six miles down the road but near an off-ramp are really much “closer” than something that’s perhaps only 15 feet off the highway, but 20 feet below, behind a sound wall and a maze of local streets and speed bumps.
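One way to sketch “subjective distance” is to rank places by travel time over the road network instead of by straight-line distance. The example below runs Dijkstra's shortest-path algorithm over a made-up road graph. The node names and the minutes on each edge are hypothetical, invented only to mirror the off-ramp versus sound-wall scenario.

```python
import heapq


def travel_minutes(graph: dict, start: str, goal: str) -> float:
    """Dijkstra's shortest path over edges weighted by travel time in minutes."""
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry; a cheaper route was already found
        for neighbor, minutes in graph.get(node, {}).items():
            new_cost = cost + minutes
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return float("inf")  # goal is unreachable


# Hypothetical road network with made-up travel times in minutes.
roads = {
    "highway_here": {"offramp_six_miles_on": 6.0, "next_exit": 2.0},
    "offramp_six_miles_on": {"gas_station_far": 1.0},
    "next_exit": {"local_street_maze": 5.0},
    "local_street_maze": {"gas_station_behind_wall": 8.0},
}

far_station = travel_minutes(roads, "highway_here", "gas_station_far")
near_station = travel_minutes(roads, "highway_here", "gas_station_behind_wall")
print(f"six miles away: {far_station} min, 15 feet away: {near_station} min")
```

With these invented numbers the station six miles down the road is about half as “far” as the one behind the sound wall, and a map drawn in minutes rather than miles would place it accordingly.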
How do you depict that visually? Well, for one, you need to start playing more loosely with real world coordinates and scale, as those cartoon maps above already do quite well. Google doesn’t seem to play with scale yet (not counting the coolness of continuous zoom — the third dimension). I’m not saying it’s easy, given how tiled map rendering works today. But it’s certainly possible and likely desirable, especially with “vector” and semantic techniques.
For a practical and well known example, consider subway maps. They show time-distance and conceptual-distance while typically discarding Cartesian relationships (which is the usual mode for most maps we use today).
I have no idea where these places (below) are in the real world, and yet I could use this to estimate travel time and get somewhere interesting. And in this case, I don’t even need a translator.
Consider next the role of context. Walking is a very different context than driving when computing and depicting more personalized distance relationships. If I’m walking, I want to see where I can easily walk and what else is on the way. I almost certainly don’t want to walk two hours past lunch to reach a better restaurant. I’m hungry now. And I took the train to work today, don’t you remember?
Google must certainly know most of that by Now (and by “Now” I mean “Google Now”). So why restrict its presence to impersonal pop up cards?
Similarly, restaurants nearby are not filtered by Cartesian distance, but rather by what’s in this neighborhood, in my interest graph, and near something else I might also want to walk to (e.g., dinner, movie, coffee == date) based on the kinds of places we (my wife and I) might like.
Context is everything in the realm of personal maps. And it seems context must be solicited in some form. It’s extremely hard to capture automatically partly because we often have more than one active context at a time — I’m a husband, a father, a programmer, a designer, a consumer, a commuter, and a friend all at the same time. So what do I want right now?
Think about how many times you have bought a one-time gift on Amazon only to see similar items come up in future recommendations. That’s due to an unfortunate lack of context about why you bought that and what you want right now. On the other hand, when I finish reading a book on my Kindle, Amazon wisely assumes I’m in the mood to buy another one and makes solid recommendations. That’s also using personal context, by design.
The trick, it turns out, is figuring out how to solicit this information in a way that is not creepy, leaky, or invasive. That same “fun factor” Michael talks about that made Google Earth so compelling is very useful for addressing this problem too.
Given what we’ve seen, I think Google is probably destined to go the route of its “Now” product to address this question. Rather than have a direct conversation with users to learn their real-time context and intent and thus truly personalize maps, search, ads, etc., Google will use every signal and machine learning trick they can to more passively sift that information from the cumulative data streams around you: your mails, your searches, your location, and so on.
I don’t mean to be crude, but it’s kind of like learning what I like to eat from living in my sewer pipes. Why not just ask me, inspector?
I mean, learning where my house is from watching my phone’s GPS is a nice machine learning trick, but I’m also right there in the phone book. Or again, just ask me if you think you can provide me with better service by using that information. If you promise not to sell it or share it and also delete it when I want you to, I’m more than happy to share, esp if it improves my view of the world.
So why not just figure out how to better ask and get answers from people, like other people do?
If the goal is to make us smarter, then why not start with what WE, the users, already know, individually and collectively?
And more importantly, is it even possible to make more personal maps without making the whole system more personal, more human?
The answer to what Google can and will do probably comes down to a mix of their company culture, science, and the very idea of ground truth. Data is more factual than opinions, by definition. Algorithms are more precise than dialog. It’s hard to gauge, test, and improve based on anyone’s opinions or anything subjective like what someone “means” or “wants” vs. what they “did” based on the glimpses one can collect. Google would need a way of “indexing” people, perhaps in real-time, which is not likely to happen for some time. Or will it?
When it comes to “Personal Truth” vs. “Ground Truth,” the perception and context of users are what matter most. And the best way to learn and represent the information is without a doubt to engage people more directly, more humanely, with personalized information on the way in and on the way out.
This, I think, is what Michael is driving at when he uses the word “conversation.” But with complete respect to Michael, the Geo team, and Google as a whole, I think it’s still quite early days, and I’m looking forward to what comes next.
In 2002, I wrote a sci-fi short story titled Blockbuster that depicted a world in which individual people go IPO. In such a world, investors would want to start young, picking the winners early, funding their education and public launches in exchange for “stock” in the person and, effectively, a cut of their future earnings. I mean, if corporations can be people, then why can’t people be corporations?
It’s a dystopian vision, in which the main character, the most successful such funded individual in history, groomed from childhood for ultimate success, bankrupts himself in every possible way to finally become “fully self-owned.”
We should never do this in real life.
It’s not that there is no value in attending a top kindergarten, a great private school, the best Ivy League university, the best startup incubator, etc. There’s generally a reason why these are vaunted. But when we place so much value on the promise of success vs. the actual evaluation of good ideas, we set ourselves up for a kind of “blockbuster effect” that plagues movies and books, in which only the most fundable ideas get supported, because no one wants to take a risk. Everyone wants a winner.
For example, the funding model sounds reasonable. Instead of a loan, individuals promise a cut of future earnings. Imagine if universities worked this way (ignoring alumni contributions, which are voluntary). Universities would by nature want to accept only students who will likely make a lot of money. And that would often mean picking students who start out with a lot of money, valuable connections, or who attended only the best kindergartens, etc.
And that’s unfortunately a lot like the admission process for the best Ivy League universities already. Do we want all of life to be like that? What do we lose by focusing only on the fat head and losing the long tail?
Of course, there’s something to be said for bringing back the benefactor model, in which the wealthy don’t just invest in people (artists, writers, visionaries) for the sake of profit, but to make the world a richer place.
I tend not to wade into the whole “Apple Screwed Up Maps” thing. For one thing, I don’t have a dog in this fight. Yes, I indirectly helped Google (before Keyhole became Google Earth). And I more directly helped Microsoft in ways we can’t get into. I do have friends in most of these companies, but they know me well enough to know that I speak my mind or not at all.
Mostly, I really just want maps to work well everywhere, and that’s best served by healthy competition, great (free and open) data, and really good crowd-sourcing for keeping things accurate and fresh.
If anything, what I’m most disappointed by is that Apple had the golden opportunity to crowd-source their map data. If something was wrong somewhere on the globe, it could be fixed in 20 seconds by a dedicated user. Everyone else would see an improved result, well before reporters started harping on it.
Alas, they ditched Google’s mostly automated Ground Truth. They barely used Open Street Map, and not where (and how) it counted. Waze, as well, would have been a great ally to improve their ground truth and real-time updates.
But here’s the real insight worth considering: try running Google’s “new” iOS maps app and then run Google Earth on the same device, switching back and forth for the same areas.
Tell me if you can spot the differences.
- Google Maps on iOS has turn by turn directions. GEarth has this on other platforms, in the form of similar animated tours.
- Google Maps on iOS has traffic info — but this can be added to Google Earth too as a layer.
- Google Maps on iOS apparently uses the same web services that their web maps do for directions, navigation, etc.
- The icons are slightly different, but the road label rendering looks the same.
- Map rendering and manipulation are virtually identical.
What I take from this is that the team may have used engine code from Google Earth to power their new maps app, stripping out some features but keeping others. I’m guessing they spent the last few months adding dedicated UX specific to the more targeted use case: directions, traffic, turn-by-turn, etc.
If true, that’s exactly the kind of convergence I’d hoped to see when Google bought Keyhole.
But what’s most remarkable about that is that Google Earth never left iOS. It was there throughout the whole “Apple booted Google” fiasco. All it was missing were some UI tweaks and the above features, which I figure were left out of the iOS version initially because of ‘locked-up’ features like “turn by turn.” So in a sense, Google fixed that and has now re-released it under the name “Maps.”
Of course, the Google Maps browser version was also available the whole time. But people like the native “Maps” app entry point, it seems.
If true, this means that in the next update or two, Google can add 3D buildings to the iOS Maps app with relative ease and compete handily with the 3D buildings Apple acquired with C3 Technologies. Oops.
But does Google really want their iOS Maps app to be so great?
That’s a harder question to suss out, and I bet it depends who at Google you ask. I have no doubt that Android sales improved this Christmas due to Apple’s map problems. People just can’t risk having their maps suck, even if only 0.05% of users had problems. But I expect the Google Geo team just wants to be the best possible solution everywhere.
So what we have now is a Google Maps app that could totally rock anything Apple does on their own platform, and more on Google’s own terms. At some point, “people” can even force Apple to make the default maps provider user-selectable, so geospatial links will open whichever app is so registered.
I mean, this is basically what happened to Microsoft with IE bundling, right? Just a matter of time, given ‘reality’ is creeping back in. That is, no doubt, what Apple was afraid of — losing control of a differentiating feature on their own devices — and rightly so. But they seem to have played their hand rather poorly and that’s the inevitable result.
Polygons/triangles/quads are great for efficient low-level 2D/3D rendering — they’re butt simple graphics primitives that don’t require overly complex shaders or hierarchical composition languages to represent and render.
However (and no offense to the legendary Hugues Hoppe), they truly suck for representing dynamic levels of detail, such as you need when zooming significantly in and out. They don’t compress as well as other forms because the detail is way too explicit and often wrongly expressed for the given need.
It’s like trying to express the function a*sin(b) as a long series of undulating (X, Y) sample points instead of, well, just “a*sin(b).” The sample points are invariably not the ones you really want to ideally reconstruct the curve. And god help you if you want to change the key parameters to alter the waveform on the fly. The sample points are missing the essential mathematical (trig, in this case) relationship.
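To make the a*sin(b) comparison concrete, here is a minimal Python sketch. The function names (`wave`, `sample_wave`, `interpolate`) and the choice of eight samples with linear interpolation are assumptions made purely for illustration; real mesh formats and real reconstruction filters are more involved.

```python
import math

def wave(a, b):
    """Parametric form: the whole curve is one amplitude plus the sine relationship."""
    return a * math.sin(b)

def sample_wave(a, n, b_max=2 * math.pi):
    """Explicit form: n + 1 evenly spaced (b, y) sample points of a*sin(b)."""
    return [(i * b_max / n, wave(a, i * b_max / n)) for i in range(n + 1)]

def interpolate(samples, b):
    """Reconstruct y at b by linear interpolation between neighboring samples."""
    for (b0, y0), (b1, y1) in zip(samples, samples[1:]):
        if b0 <= b <= b1:
            t = (b - b0) / (b1 - b0)
            return y0 + t * (y1 - y0)
    raise ValueError("b is outside the sampled range")

samples = sample_wave(a=1.0, n=8)
b = 3 * math.pi / 8                      # falls between two sample points
error = wave(1.0, b) - interpolate(samples, b)

# Changing the amplitude is free in the parametric form...
taller = wave(2.0, b)
# ...but the sampled form has to be regenerated from scratch.
taller_samples = sample_wave(a=2.0, n=8)
```

With only eight samples, the reconstruction between points misses the true curve by several percent of the amplitude, and nothing in the sample list records that the points came from a sine at all.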
Mostly, polygons suck for later editing the objects we create. It takes years of training to get good results in the first place, vs. the functional equivalent of play-dough, which anyone should be able to use. And parametric approaches, as above, are more easily mutated on the fly, which is the key element for easy editing.
The most success to date I’ve had in my 20 year dream to obsolete polygons was with Second Life, where I wrote their 3D Prim generation system, still in use today. I wanted to do much more than the simple convolution volumes we ultimately shipped, but it was a good step in the right direction and at least proved the approach viable. However, one doesn’t create technology for its own sake — you always need to do what’s right for the product.
The ultimate vision I was hoping for then was more like what Uformia is now doing — giving us the ability to mash up and blend 3D models with ease. And fortunately for all of us, Uformia has found a real use case that obviously needs true volume modeling: 3D printing.
3D printing is notoriously hampered but not pampered by the polygonal meshes one tries to feed to these systems. Polygons have zero volume and can cut, tear, and inter-penetrate each other without violating any rules of physics. Real material is just the opposite. Using polygons is like trying to make a tasty vodka martini using only origami (and even then, paper has real volume, even if we don’t think of it that way).
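One common sanity check that illustrates the problem: a triangle mesh can only enclose a volume if every edge is shared by exactly two triangles. The Python sketch below is an assumption-laden illustration (the function name `is_watertight` and the index-tuple mesh format are made up here) and says nothing about how any particular printing tool validates its input.

```python
from collections import Counter

def is_watertight(triangles):
    """Return True if every edge is shared by exactly two triangles.

    `triangles` is a list of (i, j, k) vertex-index tuples. This is only a
    necessary condition for a printable solid: it does not catch
    self-intersections or inconsistent face orientation.
    """
    edge_counts = Counter()
    for i, j, k in triangles:
        for u, v in ((i, j), (j, k), (k, i)):
            edge_counts[frozenset((u, v))] += 1
    return all(count == 2 for count in edge_counts.values())

# A closed tetrahedron, and the same shape with one face torn off.
tetrahedron = [(0, 1, 2), (0, 3, 1), (1, 3, 2), (2, 3, 0)]
open_shell = tetrahedron[:3]

closed_ok = is_watertight(tetrahedron)
open_ok = is_watertight(open_shell)
```

The polygon format itself is perfectly happy to store `open_shell`. It is up to a separate check like this one to notice that the result has no inside.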
Uformia can apparently prove their models are viable, and even aid in building supporting micro-structures. I’m guessing they do some sort of guided parametric evolution to fit their model to the input polygons, but it could easily be smarter than that. I ordered my $100 copy, so I intend to find out.
The main downside of procedural/parametric modeling is, as always, the quality and availability of the tools. So I fully support this company giving a run at getting that part right.
What’s the next step? Blending arbitrary models is a good start, not entirely unseen for ye olde polygonal modelers. The real kicker comes when we can take two models and say “make A more like B, right here in this part but not that other part.” If we solve that, then we can imagine a real open ecosystem for 3D designs that truly credits (and rewards) the creators’ original designs while allowing easy mashups of the results.
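As a toy illustration of “make A more like B, right here,” here is a Python sketch that blends two implicit (signed distance) models with a weight that is strong near a chosen focus point and fades to zero elsewhere. Everything in it is an assumption for illustration: the shapes, the smoothstep falloff, and the plain linear blend. It is not a description of Uformia’s method, and a linear blend of two distance fields is no longer a true distance field.

```python
import math

def sphere(p, center, radius):
    """Signed distance from point p to a sphere (negative inside)."""
    return math.dist(p, center) - radius

def box(p, center, half):
    """Signed distance from point p to an axis-aligned cube of half-width `half`."""
    q = [abs(pc - cc) - half for pc, cc in zip(p, center)]
    outside = math.hypot(*[max(c, 0.0) for c in q])
    inside = min(max(q), 0.0)
    return outside + inside

def local_weight(p, focus, falloff):
    """1.0 at the focus point, fading smoothly to 0.0 at distance `falloff`."""
    t = min(math.dist(p, focus) / falloff, 1.0)
    return 1.0 - t * t * (3.0 - 2.0 * t)  # inverted smoothstep

def morph(p, focus, falloff):
    """Model A (unit sphere) pulled toward model B (cube) only near `focus`."""
    a = sphere(p, (0.0, 0.0, 0.0), 1.0)
    b = box(p, (0.0, 0.0, 0.0), 1.0)
    w = local_weight(p, focus, falloff)
    return (1.0 - w) * a + w * b

corner = (1.0, 1.0, 1.0)
near = morph(corner, focus=corner, falloff=1.0)            # fully cube-like here
far = morph((-1.0, 0.0, 0.0), focus=corner, falloff=1.0)   # still the sphere here
```

At the focus, the surface sits on the cube’s corner. On the opposite side it is untouched sphere. The edit is one point and one radius, with no vertices to push around.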
(Evolver is a good example of that trend for humanoid avatars at least. I met those guys maybe 7 years ago when they were just deciding to form a company.)
I’ve long been hoping I didn’t have to write this stuff myself — it’s quite hard, probably way over my head — and I just want to use it for some future projects I have in mind. Also, it’s notoriously difficult to make money selling 3D modeling tools. The most successful business model to date is “sell to Autodesk and let them figure it out.” But I’m rooting for this one to get the UX right and hit it out of the park.
Meridian seems to be doing some interesting work with easy-to-make indoor geospatial experiences for museums, tours, shopping, and so on. Their site is quite sparse on exactly how the tracking part works, but I’d guess it’s the usual wi-fi triangulation with some accelerometer-driven “dead reckoning” or they’d be bragging about it.
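For readers unfamiliar with the term, here is a minimal Python sketch of dead reckoning corrected by an occasional Wi-Fi position fix. It is a guess at the general technique only, not at Meridian’s implementation. The function names, the 0.7 m stride, and the 0.2 trust factor are all assumed values chosen for illustration.

```python
import math

def dead_reckon(position, heading_deg, distance):
    """Advance an (x, y) position in meters along a compass heading.

    Heading is measured clockwise from north (+y), as a compass reports it.
    """
    theta = math.radians(heading_deg)
    x, y = position
    return (x + distance * math.sin(theta), y + distance * math.cos(theta))

def fuse(estimate, wifi_fix, trust=0.2):
    """Nudge the drifting estimate part of the way toward a noisy Wi-Fi fix."""
    return tuple(e + trust * (w - e) for e, w in zip(estimate, wifi_fix))

pos = (0.0, 0.0)
for heading in (0, 0, 90, 90):            # two steps north, then two steps east
    pos = dead_reckon(pos, heading, 0.7)  # assumed 0.7 m stride per detected step
walked = pos
pos = fuse(pos, wifi_fix=(2.0, 1.0))      # hypothetical Wi-Fi position estimate
```

Step counting alone drifts over time, and Wi-Fi alone is jumpy. Blending the two is the usual compromise.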
The good news for AR enthusiasts is that the more attention paid to solving more precise location and orientation, indoor and out, the easier it will be to augment our perception of and interaction with the world, regardless of what device it’s rendered on. A rising tide lifts all boats here.
This also intersects nicely with where I’d always hoped KML would go: we desperately need a standard markup language for the real world that’s location-aware, and not just lat/long. Indoor location demonstrates the need for ‘location’ to work in multiple different coordinate reference systems, not just the common Mercator projection or WGS-84 lat/long.
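To show what “multiple coordinate reference systems” means in practice, here is a minimal Python sketch that converts a point on a building’s floor plan (meters from a surveyed corner) into approximate WGS-84 latitude and longitude. The function name, the rotation convention, and the example origin are assumptions for illustration. The flat-earth approximation only holds over building-scale distances, and floor level is ignored entirely.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS-84 semi-major axis

def local_to_wgs84(x, y, origin_lat, origin_lon, rotation_deg=0.0):
    """Convert floor-plan meters (x, y) to approximate (lat, lon) in degrees.

    rotation_deg is how far the plan's +y axis is rotated clockwise from
    true north. Uses a local flat-earth approximation around the origin.
    """
    r = math.radians(rotation_deg)
    east = x * math.cos(r) + y * math.sin(r)
    north = -x * math.sin(r) + y * math.cos(r)
    lat = origin_lat + math.degrees(north / EARTH_RADIUS_M)
    lon = origin_lon + math.degrees(
        east / (EARTH_RADIUS_M * math.cos(math.radians(origin_lat)))
    )
    return lat, lon

# Hypothetical building corner at 37 N, 122 W, with the plan turned 90 degrees.
lat, lon = local_to_wgs84(0.0, 100.0, 37.0, -122.0, rotation_deg=90.0)
```

An exhibit at “(12 m, 30 m) on floor 2” is the natural way to describe an indoor location. Lat/long is just one of several frames it may need to be expressed in.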
I’m glad to see ARML 2.0 is moving along in that direction as well, and asking for comments from what I can see. But now imagine when things like Meridian’s editor can output some standard ML that any webapp or maps app could consume and render without requiring a dedicated app. Now we’re talking!