A few years ago, I documented some of the cool experiences I worked on at Disney Imagineering starting in 1994. Now, inspired by John Carmack exploring Scheme as the language of VR for Oculus, I figured it would be helpful to talk about the software stack a bit. And I’ll finish with a few thoughts on Scheme for VR in the future.
First, as always, I suck at taking credit, in the company of such amazing co-workers. So for the real kudos, please thank Scott Watson (now CTO of Disney R&D) and JWalt Adamczyk (Oscar Winner and amazing solo VR artist/hacker) and our whole team for building much of this system before I even arrived. Thant Tessman esp. deserves credit for the Scheme bindings and interop layer.
This Disney gig was my first “big company” job after college, not counting my internships at Bell Labs. My one previous startup, Worldesign, tried to be a cutting edge VR concept studio about 20 years too early. But Peter Wong and I managed to scrape together a pretty killer CAVE experience (a hot air balloon time travel ride) for only $30,000, which represented exactly all of the money in the world to us. The startup went broke before we even started that work. But because we’d borrowed ample SGI equipment, it did get me noticed by this secret and amazing Disney*Vision Aladdin VR project I knew nothing about.
I had to join on faith.
I quickly learned that Disney was using multiple SGI “Onyx” supercomputers, each costing about a million dollars, to render VR scenes for just one person each. Each “rack” (think refrigerator-sized computer case) had about the same rendering power as an Xbox, using the equivalent of today’s “SLI” to couple three RealityEngine 3D graphics cards (each card holding dozens of i860 CPUs) in series to render just 20fps each for a total of 60fps for each VR participant. In theory, anyway.
Disney was really buying themselves a peek ahead of Moore’s Law, roughly 10 years, and they knew it. This was a research project, for sure, but using hundreds of thousands of live “guests” in the park to tell us if we were onto something. (Guests are what Disney calls humans who don’t work there…)
I talked previously about the optically-excellent-but-quite-heavy HMD (driven by Eric Haseltine and others). Remember this was an ultra-low-latency system, using monochrome CRTs to avoid any hint of pixels or screen doors. So let’s dive into the software environment that inspired me for another 20 years.
Even with supercomputers with 4-8 beefy CPUs each (yes, sounds like nothing today), it took a while to re-compile the C++ core of the ride. “SGI Doom” and “Tron 3D lightcycles” filled some of those lapses in productivity…
This code was built on top of the excellent SGI Performer 3D engine/library written by Michael Jones, Remi Arnaud, John Rohlf, Chris Tanner and others, with customizations to handle that 3-frame latency introduced by the “TriClops” (SLI) approach. The SGI folks were early masters of multi-core asynchronous programming, and we later went on to build Intrinsic Graphics games-middleware and then Google Earth. But let’s focus on the Scheme part here.
Above the C++ performance layer, Scott, Thant, JWalt and team had built a nice “show programming” layer with C++ bindings to send data back and forth. Using Scheme, the entire show could be programmed, with functions prototyped and later ported to C++ as needed. But the coolest thing about it was that the show never stopped (you know the old saying…) unless you wanted to recompile the low-level code. The VR experience continued to run at 60fps while you interactively defined Scheme functions or commands to change any aspect of the show.
So imagine using Emacs (or your favorite editor), writing a cool real-time particle system function to match the scarab’s comet-like tail from the Aladdin movie, and hitting two keys to send that function into the world. Voilà, the particle system I wrote was running instantly on my screen or HMD. When I wanted to tweak it, I just sent the new definition down and I’d see it just as fast. Debugging was similar. I could write code to inspect values and get the result back to my Emacs session, or visually depict it with objects in-world. I prototyped new control filters in Scheme and ported them to C++ when performance became an issue, getting the best of both worlds.
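To make that workflow concrete, here is a rough sketch of the hot-swap idea in Python rather than Scheme. It is not the Disney code: the names `show_functions`, `send_definition`, `run_frame` and `particle_size` are invented for illustration, and `exec` stands in for the editor sending a definition to the running show. The one point it tries to show is that the frame loop looks functions up by name on every frame, so a new definition takes effect on the next frame without a restart.

```python
# Hypothetical sketch: a frame loop that looks up show functions by name
# every frame, so a new definition takes effect without a restart.

show_functions = {}  # name -> callable, shared by the loop and the "editor"


def send_definition(source):
    """Evaluate source text and install any functions it defines."""
    namespace = {}
    exec(source, namespace)
    for name, value in namespace.items():
        if callable(value) and not name.startswith("__"):
            show_functions[name] = value


def run_frame(frame_number):
    """Run one frame using whatever definitions are current right now."""
    update = show_functions.get("particle_size")
    return update(frame_number) if update else None


send_definition("def particle_size(frame):\n    return 1.0 + 0.5 * frame\n")
first = run_frame(10)

# Tweak the function while the "show" keeps running: only the table changes.
send_definition("def particle_size(frame):\n    return 2.0\n")
second = run_frame(11)
```

In a real system the loop would run on its own thread at the display rate and the definitions would arrive over a socket or pipe from the editor. The late lookup by name is the part that matters.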
The Scheme layer was fairly incapable of crashing the C++ side (with much effort, to be honest). So for me, this kind of system became the gold standard for rapid prototyping for all future projects. Thant even managed to get multi-threading working in Scheme using continuations. So we were able to escape the single-threaded nature of the thing.
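For readers who have not seen the trick, threads built on continuations are cooperative: each task runs until it gives up control, and a scheduler resumes the next one from where it left off. Here is a minimal sketch, assuming Python generators as a stand-in for Scheme continuations (Python has no `call/cc`). It is not Thant’s implementation, and the task names are made up.

```python
from collections import deque


def task(name, steps, log):
    """A cooperative task: do one step of work, then yield control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # the suspended generator plays the role of a saved continuation


def run_round_robin(tasks):
    """Resume each task in turn until all of them have finished."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)
            ready.append(current)  # not finished, so queue it again
        except StopIteration:
            pass  # task completed; drop it


log = []
run_round_robin([task("particles", 2, log), task("camera", 3, log)])
```

After the run, `log` shows the two tasks interleaved one step at a time, even though everything executed on a single OS thread.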
Thant and I also worked a bit on a hierarchical control structure for code and data to serve as a real-time “registry” for all show contents: something to hang an entire virtual world off so everyone can reference the same data in an organized fashion. That work later led me to build what became KML at Keyhole, now a geographic standard (but forget the XML part; our original JSON-like syntax is superior).
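As an illustration only, a registry like that can be pictured as a tree of named nodes addressed by paths. The sketch below is in Python, and the class name, the slash-separated path convention and the example entries are all assumptions for the sake of the example, not a description of what we actually built.

```python
class Registry:
    """A tree of named nodes addressed by slash-separated paths."""

    def __init__(self):
        self._root = {}

    def set(self, path, value):
        """Store a value at the path, creating parent nodes as needed."""
        node = self._root
        *parents, leaf = path.strip("/").split("/")
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value

    def get(self, path):
        """Return the value or the whole subtree stored at the path."""
        node = self._root
        for name in path.strip("/").split("/"):
            node = node[name]
        return node


registry = Registry()
registry.set("/show/scarab/tail/particle_count", 200)
registry.set("/show/scarab/tail/color", "gold")
tail = registry.get("/show/scarab/tail")  # the whole subtree under "tail"
```

Because every piece of the show hangs off one tree, code and tools can refer to the same data by the same path instead of passing private references around.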
BTW, apart from programming the actual Aladdin show, my first real contribution to this work was getting it all to run at 60fps. That required inventing some custom occlusion culling, because the million dollar hardware was severely constrained in terms of the pixel fill complexity. We went from 20fps to 60fps in about two weeks with some cute hacks, though the Scheme part always stayed at 24fps, as I recall. Similarly, animating complex 3D characters was also too slow for 60fps, so I rewrote that system to beef it up and eventually separated those 3 graphics cards so each could run its own show, about a 10x performance improvement in six months.
The original three-frame latency increased the nausea factor, not surprisingly. So we worked extra hard to make something not far from Carmack’s “time warp” method, sans programmable shaders. We rendered a wider FOV than needed and set the head angle at the very last millisecond in the pipeline, thanks to some SGI hacks for us. That and a bunch of smoothing and prediction on the 60fps portions of the show made for a very smooth ride, all told.
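The arithmetic behind that late head-angle step can be sketched roughly as follows. This is a simplification in Python, not what ran on the SGI hardware: it handles yaw only, assumes positive yaw means turning right, and assumes a linear mapping from degrees to pixels, which is only an approximation for a flat perspective image. The function name, parameters and example numbers are invented.

```python
def late_window_start(render_yaw_deg, latest_yaw_deg,
                      wide_fov_deg, view_fov_deg, wide_width_px):
    """Return the left pixel column of the view window inside the wide image."""
    px_per_deg = wide_width_px / wide_fov_deg
    view_width_px = view_fov_deg * px_per_deg
    centered_start = (wide_width_px - view_width_px) / 2
    # Head turned right since render time -> slide the window right.
    shift_px = (latest_yaw_deg - render_yaw_deg) * px_per_deg
    # Clamp so the window never leaves the rendered image.
    max_start = wide_width_px - view_width_px
    return int(round(min(max(centered_start + shift_px, 0), max_start)))


# Scene rendered at yaw 0 with an 80 degree FOV into 1600 columns;
# the display shows 60 degrees, and the head has since turned 3 degrees right.
start = late_window_start(0.0, 3.0, 80.0, 60.0, 1600)
```

The extra 10 degrees rendered on each side is the budget for head motion during the pipeline delay. Once the head turns further than that, the window hits the edge of the image and the correction runs out.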
(I do recall getting the then-Senate-majority leader visibly nauseated under the collar for one demo in particular, but only because we broke the ride controls that day and I used my mouse to mirror his steering motions, with 2-3 seconds of human-induced latency as a result).
This Disney project attracted and redistributed some amazing people also worth mentioning. I got to work with Dr. Randy Pausch. Jesse Schell (also in his first real gig as a jr. show programmer) went on to great fame in the gaming world. Aaron Pulkka went on to an excellent career as well. I’m barely even mentioning the people on the art and creative leadership side, resulting in a VR demo that is still better than at least half of what I see today.
So can Scheme help Carmack and company make it easier to build VR worlds? Absolutely. A dynamic language is exactly what VR needs, esp. one strong in the ways of Functional Reactive Programming, closures, monads, and more.
Is it the right language? If you asked my wise friend Brian Beckman, he’d probably recommend Clojure for any lisp-derived syntax today, since it benefits from the JVM for easy interoperability with Java, Scala and more. Brian is the one who got me turned onto Functional Reactive Programming in the first place, and with Scott Isaacs, helped inspire Read/Write World at Microsoft, which was solving a similar problem to John’s, but for the real world…
Syntactically, lisp-derivatives aren’t that hard to learn IMO, but it does take some brain warping to get right. I worked with CS legend Danny Hillis for a time and he tried to get me to write the next VR system in Lisp directly. He told me he could write lisp that outperformed C++, and I believed him. But I balked at the learning curve for doing that myself. If other young devs balk at Scheme due to simple inertia, that’s a downside, unfortunately.