AusGamers John Carmack Oculus Rift HMD Demo Session
Post by Steve Farrelly @ 03:04pm 12/07/12
AusGamers sat down with id Software's John Carmack to check out his tinkering with the Oculus Rift HMD prototype and tech. Read on for the full demo session transcript and impressions...
AusGamers: So John, let’s talk about all of... this [gestures to head-mounted display apparatus].
John Carmack: So the way this all came to be: after we finished RAGE last year, I decided that I was just going to treat myself to a little technology toy. I was going to go out and see what the state of head-mounted displays and virtual reality stuff was now.
We were involved in this back in the early 90s in the heyday of VR when people were doing... they licensed Wolfenstein, Doom, and Quake and all these things, and they did a bunch of really crummy products with it -- none of these companies were going to amount to anything.
And VR had its flash-in-the-pan, sort of crashed and burned, and it wound up just being military stuff. Those companies survived as little niche products -- making DARPA, DoD products -- where you could sell 20 or 50 thousand dollar head-mounted displays to companies.
But a lot of people, me included, sort of felt that well, 15 years have passed. We’ve got computers that are a million times more powerful than we had back then. Can’t we just go out and buy that head-mounted display that we wanted back in the early 90s? And the truth is, you really couldn’t. There had been surprisingly little progress in head-mounted displays -- at least at that non-exotic, high-end.
So, when at the end of last year, I went and spent about fifteen-hundred dollars on a head-mount and a tracker for it from one of the integrator companies, and it was just terrible. It was terrible in a lot of ways: the display wasn’t good, there was huge latency on the tracking, the build quality wasn’t good -- just lots of bad things.
And I had a little bit of time, so I decided to dig into it a little bit more, and take things apart and find out, well exactly what is wrong with this? There’s obviously the display’s not what you’d want, where the head-mounts for consumers, they will talk about them as it’s a 120 inch TV and it’s 20 feet away from you, or some metric that’s not really sensible on there.
In gamer terms, if you go and actually measure it, these things had, like, a 26 to 30 degree field of view, which is toilet-paper tubes, it’s completely useless for immersion on there, compared to a game that wants a 60 to 80 degree field of view or so.
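[Editor's sketch, not part of the interview: the "120 inch TV, 20 feet away" pitch can be turned into a field of view with basic trigonometry. The 16:9 aspect ratio and the reading of "120 inch" as a diagonal are assumptions; the function name is ours.]

```python
import math

def fov_degrees(extent, distance):
    """Angle subtended by a flat extent viewed head-on from its centre line."""
    return math.degrees(2 * math.atan((extent / 2) / distance))

# Assumed reading of the marketing claim: 120-inch diagonal, 16:9, 20 feet away.
diagonal_in = 120.0
distance_in = 20 * 12.0
width_in = diagonal_in * 16 / math.hypot(16, 9)

diagonal_fov = fov_degrees(diagonal_in, distance_in)
horizontal_fov = fov_degrees(width_in, distance_in)
print(f"diagonal: {diagonal_fov:.1f} deg, horizontal: {horizontal_fov:.1f} deg")
```

[Under those assumptions the result comes out at roughly 28 degrees diagonally and 25 degrees horizontally, in the same range as the measured figures quoted above.]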
So that’s the obvious part. It’s obvious that that little screen that you’re looking at isn’t going to throw you into the world. And there’s a whole lot of other not-as-obvious things, to do with the latency and the precision of the tracking and things, where... when you’ve got something on your head; when you’re trying to pretend that it’s a virtual world around you that you’re walking around in, the demands on the accuracy and the latency there, are so much more than in traditional gaming.
With a traditional console gaming controller, you’re actually doing a pretty sophisticated task. Where if you’re trying to aim or move on here, you are making this very precise little adjustment with your thumb, and then you’re integrating over time. So you’re saying “I want to be moving at three-quarters speed for two-tenths of a second”, and in some ways it’s very amazing that people like professional players can do these amazingly skilled things that they can on here, because that’s a tough task.
Now I’ve always been this big booster of 60 frames-per-second. I still think one of the real triumphs for RAGE was that we were able to get visuals of 60 frames-per-second, because even on a console controller, it makes a difference, at 60 hertz it is better than 30 hertz. But there are limits to what you need on this.
One of these experiments that I did in the last six months, was I dug out an old Sony Trinitron Cathode Ray Tube that you can drive at 170Hz. So I was doing experiments to say: “well ok, 60 is better than 30. Is 120 better than 60? Or 170?”, and my conclusion is that with any controller pad, it doesn’t really matter which, it can be a little bit smoother in some cases, but it’s hard to make a case for that, but with a more direct controller like a mouse -- where you are just directly converting the horizontal motion of your hand to an angular rotation of your eye -- that’s a lot easier to track, and most people that are gamers can tell the difference if they’re sat down with a mouse at 120Hz display versus a 60Hz display. Your average... like, a casual gamer, still won’t notice the difference, but if you care about it, you can tell there.
With a head-mounted display, there’s really zero degrees of separation, because your head’s there; your brain knows what’s supposed to happen. You have to train your brain to say what to expect, based on this other motion. It knows from the time you were born, how the world’s supposed to respond, and the response is very, very important there, and the numbers that I came down to there is I think 20 milliseconds from motion to photons striking your eyes is about where you want to be for your brain accepting this as looking into a virtual world, rather than just controlling the game with your head -- because that is what most head-mounted... so have you ever tried a head-mounted display at a trade-show or anything?
AusGamers: Not since the 90s.
John: Ok. Well they really haven’t done much better, but this is best when I’ve got somebody that’s tried things in past years, to contrast the difference here. What you had in those days and really, until recently, is you could control something with your head, you could train yourself to do it slowly and “ok, yeah, it’s moving where your head is going”, but if you do something like this [shakes head rapidly] it’s going to be all out of phase. You keep yourself from doing that if you’re in these things for very long. But when you do get all of that other latency out -- if you can get it down to this very responsive level -- you can start hitting these points where you might buy it as reality.
And the first time that I was able to get to that, was... one of the great things about these computers is we just have these insanely powerful graphics cards. So while the output here might be 60 frames-per-second going to the display, you can render more than that and let it tear. Now normally if you render it at 80 frames per second, you’ve got an ugly tear line and a big shear and those are really to be avoided. And it was one of the things I was proud of with RAGE, that we avoided tearing in as many cases as we could.
But I found out that on my little R&D test program, I could render this wall at 1000 frames-per-second on my high-end video card. So I render two screens a thousand times a second, which means I’ve got fifteen/sixteen bands of new screen coming in, and if the screen’s like an LCD that’s got a little bit of break between it, you can still tell that it’s doing some horizontal shimmery thing there, but it’s close to interesting.
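[Editor's sketch, not part of the interview: the band count follows from dividing the render rate by the display refresh rate. The function names are ours, and the staleness bound counts only the render interval, ignoring every other source of delay.]

```python
def bands_per_refresh(render_fps, refresh_hz):
    """Number of separately rendered horizontal bands in one scanned-out frame
    when rendering runs unsynchronised and tearing is allowed."""
    return render_fps / refresh_hz

def max_band_age_ms(render_fps):
    """Upper bound, in milliseconds, on the age of the newest finished render
    at the moment scan-out reaches a band (render interval only)."""
    return 1000.0 / render_fps

bands = bands_per_refresh(1000, 60)
age_ms = max_band_age_ms(1000)
print(f"{bands:.1f} bands per refresh, each at most {age_ms:.1f} ms old")
```

[At 1000 renders per second on a 60Hz display this gives about sixteen to seventeen bands per refresh, close to the "fifteen/sixteen" quoted.]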
Now if you’re moving your head horizontally or vertically, those horizontal lines are still going to detract from the experience, but I found that in a roll orientation, tipping like this [gestures moving head in an arc], where the lines aren’t sort of staying constant, where they’re moving with it, your brain decorrelates that, and I want to have all of this other stuff out. And I take this, and I rotate the display like that [gestures arcing head movement again], it felt -- for the first time ever -- like I had a cut-out -- a plastic cut-out -- and I was looking into the same old world. Like the world was solid and I was just moving this cut-out in front of me.
And that’s where I came to this about 20 millisecond number. Now you can’t get that with horizontal and vertical on any 60Hz display, the tear lines are too distracting on there, so what you need is a faster display rate. So that’s one of the axes that needs to get better.
Field of view is one of them, you need this immersive world, you can’t be looking at these postage-stamp sized screens there, you need something that blocks you off from the world. Much faster response rate is another, and then the third part is: most head-mount demos just track orientation -- they track which direction you’re looking -- but there is a huge amount of information that you get from these tiny little millimeter-level movements. Even when you’re just turning side to side, your eyeball is actually following an arc in space.
Most 3D games treat you as this disembodied eyeball just moving it in space, I can hack this a little bit and that’s what I do in this demo by having a little model of the distance to your neck, the distance down to your spine, so when you look around, you’re following a little bit of a trajectory. And that helps, but what you really want is the actual data, where there are sensors that you can use, like a Sixense Razer Hydra or a TrackIR, they can provide position information at a millimeter level on there, so that if you sway your body side-to-side, it tracks that.
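[Editor's sketch, not part of the interview: one simple way to build an analytic head/neck model is to rotate a fixed neck-to-eye offset by the tracked orientation, so the eye sweeps an arc instead of spinning in place. This is not Carmack's code; the offsets, axis conventions and function name are hypothetical.]

```python
import math

def eye_position(yaw_deg, pitch_deg, up_m=0.10, forward_m=0.08):
    """Eye midpoint relative to a neck pivot, in metres.

    Axes: x right, y up, z forward. Positive yaw turns right,
    positive pitch looks up. up_m and forward_m are hypothetical
    offsets, not values taken from the demo.
    """
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    # Pitch first: rotate the (up, forward) offset about the x axis.
    y = up_m * math.cos(pitch) + forward_m * math.sin(pitch)
    z = -up_m * math.sin(pitch) + forward_m * math.cos(pitch)
    # Then yaw: swing the forward component about the vertical axis.
    x = z * math.sin(yaw)
    z = z * math.cos(yaw)
    return (x, y, z)

# Looking straight ahead versus turned 90 degrees to the right.
ahead = eye_position(0, 0)
right = eye_position(90, 0)
print(ahead, right)
```

[With these placeholder offsets, a 90 degree turn moves the eye about eight centimetres sideways, which is the kind of small translation an orientation-only tracker would miss.]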
They each have their trade-offs though. TrackIR is pretty accurate inside a little field of view, it’s got a little camera on top of your monitor, so you’re like “This is awesome, I’m looking at this” and then you look over here and one of its little tracking dots goes outside the field of view and the view snaps, and it’s horribly oppressive in a head-mounted display. But inside that window, it’s great.
The Sixense Hydra has an electromagnetic tracking system on there. If you take apart one of those and put it on a head-mount, you get orientation and positioning that’s reasonably accurate. In the best cases, when it’s setup and calibrated right, and the base-station’s right there, you can lean over something, and look at something. You can even crouch down on the ground, and put your hand on the virtual floor. That’s deeply cool.
But while it’s precise -- it gives these precise readings in millimeters on there -- it’s a very lucky configuration space, so it’s right in just this area. You look over here and do the same thing and you’re tilted at a different angle. And I’ve been trying to calibrate around this little configuration space, but it’s not quite good enough. But still...
So those are the three axes that are really important in a virtual reality experience: wide field of view, fast response rate and then the addition of positional tracking.
So what I’ve got here, is a very wide field of view. It has an analytic model for this head/neck, plus a little bit of position tracking. It’s pretty fast on the updates, there’s no video processing going into it, but it is an older LCD panel that takes 20 to 30 milliseconds to transition. So this is above my perceived reality threshold, but it’s still better than anything you’ve ever seen as a VR demo.
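[Editor's sketch, not part of the interview: a back-of-envelope comparison of the prototype against the 20 millisecond target. The 20 ms budget and the 20 to 30 ms panel transition are quoted above; assuming a 60Hz refresh and treating the stages as simply additive are our simplifications.]

```python
BUDGET_MS = 20.0  # motion-to-photon target quoted in the interview

def frame_period_ms(refresh_hz):
    """Time to scan out one full frame at the given refresh rate."""
    return 1000.0 / refresh_hz

def over_budget_ms(stage_ms, budget_ms=BUDGET_MS):
    """Positive result: the summed stages exceed the budget by that many ms."""
    return sum(stage_ms) - budget_ms

# Assumed stages: one 60Hz frame of scan-out plus the LCD transition time.
best_case = over_budget_ms([frame_period_ms(60), 20.0])
worst_case = over_budget_ms([frame_period_ms(60), 30.0])
print(f"over budget by {best_case:.1f} to {worst_case:.1f} ms")
```

[Even in this simplified tally the panel transition alone uses up the whole budget, which is consistent with the prototype sitting above the threshold.]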
So I have been tinkering with this for a few months, just in little odd bits of time, trying some experiments, getting some contacts with some of the companies doing this. And it was just my pet little project, because I was interested in this, when the company decided that we were going to do the Doom 3: BFG Edition.
The thought was “ok, first we’re going to make it 60 frames-per-second on consoles” -- which is much harder than you think. An eight year-old PC game, you think you’d just compile it over here and it runs great, but I have spent a lot of work these last couple of months, getting that out. But still, it runs great, plays great, but how do you make people care about an eight year-old title? It’s still good looking, and in fact, when you go and look at a lot of the top-tier games here at E3, there are a lot of games that don’t look as good as that eight year-old title running right now. But still, nobody’s going to front-page any of this, as it’s a re-release, a remaster of an old title.
So I was thinking “well, I’ve been doing all of this experimental work with stereoscopy and the virtual reality stuff, let’s use this as a bit of a lever”. Both Microsoft and Sony support HDMI 3D now on the consoles natively, so they’re interested in that -- on the 3DTV experience. I’m not a huge booster of 3DTVs, I think we did a great job, but... have you played it out there, that version [BFG Edition on the E3 show floor]? It’s interesting, but not super-spectacular out there.
The head-mounted display though, is another world entirely. At its least, if you take the Sony device there [gestures to his modified Sony HMZ-T1 device], it’s pitched as a 3DTV for your head. There’s no ghosting, so it’s positive there -- you won’t have to worry about the glass power running out separately there, but it’s not virtual reality on there.
So I have hot-glued my sensors on there, and I used to use this for my demos, because it’s consumer; it’s available; it’s pretty neat, but it’s not “awesome”. So what I’ve got here though, does take that a few steps more in different ways, and it really is pretty awesome.
And it’s sort of cheating here to say “well, we’re going to use this as a promotional device for Doom 3”, because not many people have head-mounted displays here. Lots of people have 3DTVs -- they’ll be able to do that. I don’t know how many units Sony has sold of [the HMZ-T1]. Some of the earlier ones, they moved maybe 10 or 20 thousand of these that have been out there; most of them are probably in dustbins by now, because honestly, they weren’t all that cool or useful for a lot of different things.
But what we’ve got here -- this actual display -- [gestures to the Oculus Rift prototype HMD] is something that another guy, Palmer Luckey, built this base here. I was personally doing all this stuff with laser projectors and lenses in welding glasses, and these other things that I was putting together myself, but he had gone and built this in his workshop, that was a pretty innovative way of doing things.
Where it’s using a six inch display panel, that’s sort of a mini-tablet size on there. It’s a 1280x800 panel, which is not very high resolution. You have two sets of optics, so there’s a pair of lenses for each eye in here, and you look at half of it. So each eye gets 640x800 -- not high resolution -- and it’s stretched over in this enormous field of view.
So you can resolve pixels in here, but that’s the easiest thing to correct, we just need to wait. There are people making 1080p displays in that size. Toshiba’s got a two and a half P display prototype at that size -- the resolution is going to be the easiest thing to address.
But the exciting part of it is this huge immersive field of view -- 90 degrees horizontal, 110 vertical -- and that throws you in there, and the reason why all displays didn’t have something like this is, if you use simple lenses to achieve this wide field of view, then you wind up with a huge fish eye...
--- Part two, Demo begins here ---
It's hard to convey in words the sensation of Carmack's foray into the world of HMD gaming. Ultimately, as rudimentary as his prototype was, it still worked quite well, and having no other peripheral light sensations piercing the casing helped me transport into the testbed world of Doom III.
Initially it does look quite low-res, but as John points out, this is something that will likely change before any such device with his stamp on it even makes it to market. What was most impressive though was the smooth frame-rate and responsiveness to head movement.
I started out using my thumb on the controller to move the camera, but once I realised my head could do the majority of that, it became far more natural to just move in the game-world as I would in real-life. This sensation was furthered when I started to physically react to fireballs being hurled at me from Imps. I began crouching and stiffening up and I'm not afraid to admit my heart-rate went up as well.
A few minutes into my eyes-in session and I was moving about the game-space like a pro, peeking in and around corners and eating up the virtual 3D world around me as if I were actually there. Obviously the testbed was not nearly as challenging as the actual Doom III game, but it gave great insight into how this technology, specifically for shooters, can work. And again, the only major issue I had was in the overall resolution. It's also important to point out, for all the FPS keyboard and mouse warriors out there, twitch play with head movement coupled with controller in-hand was just as precise and far more engaging. If this technology goes mass market, honestly it's the only way I'd ever want to play a first-person shooter.
(see box out)
...going to a flat display, it comes out through these optics, so it’s all warpy there, with the lines bending, but when you look at it through the optics, it comes out straight and clear on this.
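[Editor's sketch, not part of the interview: one common way to pre-warp an image for simple lenses is a radial polynomial, where each screen point samples the undistorted render from a little farther out, squeezing the picture toward the centre so the lens can stretch it back. The interview does not say which method the prototype uses; the coefficients and function name here are hypothetical.]

```python
def prewarp_sample_point(x, y, k1=0.22, k2=0.24):
    """Map a lens-centred screen point (roughly -1..1 on each axis) to the
    point in the undistorted render that should be drawn there.

    k1 and k2 are hypothetical coefficients chosen for illustration;
    real values depend on the lenses.
    """
    r2 = x * x + y * y                     # squared distance from lens centre
    scale = 1.0 + k1 * r2 + k2 * r2 * r2   # grows toward the edges
    return (x * scale, y * scale)

centre = prewarp_sample_point(0.0, 0.0)
edge = prewarp_sample_point(0.5, 0.0)
print(centre, edge)
```

[The centre of the image is left alone while points toward the edge are pulled inward, giving the bent-line look on the flat panel described above.]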
So we’ll get you started in this, find out if it’s reasonably comfortable in here. We can loosen or tighten this strap as necessary, and I’ll hand you the headphones. It’s black for a second, and here’s your headphones. Here’s your controller.
So usually what you see here is, he gets his bearings, starts walking down the stairs and gets to the point where he starts fighting these monsters...
AusGamers: That is, insane.
John: [laughs] That’s the vision that people always thought VR would be like.
AusGamers: That is amazing. Initially, I was playing with the camera like I’m used to doing with the stick, but eventually you just start totally tracking everything on your own.
John: Yeah. I hoped that the gross motions would be the most important thing, but actually, the subtle motions are very important on there. Just as you’re moving around, these small little things that you’re doing makes you feel much more like you’re actually there.
AusGamers: Absolutely. That’s the thing, when I play games, I look in every corner, so being...
John: I go to so much... even in the games I’ve made on here, I have so much more fun looking through on here, and I see all these little touches that the artists and designers had in there, that I never even knew were there. Like this chunk of RAGE that I’ve got cut-off for my test-bed, just sitting there looking at the detail in this pipe and the bricks and the plants and stuff.
AusGamers: I guess the next level then John, is resolution, because obviously that was the only thing that was missing there.
John: Yeah, that’s going to be changing automatically, because there are people that... there are 1080p panels of that size now, Toshiba demoed 2500 resolution panels, it’s not in production, but if they make that at 120Hz update rate, this will be really, really fabulous.
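[Editor's sketch, not part of the interview: average pixels per degree gives a rough feel for why the current panel looks coarse and what a 1080p panel would change. The per-eye figures of 640 pixels across 90 degrees are quoted earlier; splitting a 1920-pixel-wide panel evenly between the eyes at the same field of view is our assumption, and lens distortion is ignored.]

```python
def pixels_per_degree(pixels, fov_deg):
    """Average angular resolution, ignoring uneven spreading by the lenses."""
    return pixels / fov_deg

# Prototype: 1280x800 panel, so 640 pixels per eye across 90 degrees.
current = pixels_per_degree(1280 / 2, 90)
# Assumed: a 1920x1080 panel split the same way at the same field of view.
with_1080p = pixels_per_degree(1920 / 2, 90)
print(f"{current:.1f} px/deg now, {with_1080p:.1f} px/deg with 1080p")
```

[That works out to roughly 7 pixels per degree today against roughly 11 with a 1080p panel, under the assumptions stated.]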
AusGamers: So how long do you predict before that’s consumer accessible?
John: The Kickstarter for this kit [Oculus Rift] should be starting in a couple of weeks out there, and I’m hoping that we can get some of the materials there available in time for QuakeCon. I don’t know if that’s going to be possible, with the needs for the displays and everything. But certainly over the next year, if we get, I think the first batch will be a hundred kits or something, and I’m sure they’ll find a taker or two for that.
And we get people that are going to actually make things with these. Not just assembling the kit, but experimenting and figuring out how we can put adjustable optics and adjustable sensors there; adjusting things in the software. I’m sure people will think it through and port it to Unity and other things that are more commonly available there, and user experimentation should bear a lot of fruit. That’s going to be people exploring in very different directions.
Then somebody... Sony was in here yesterday taking a look at this, they want to get a couple of kits to play with there. And when I showed Microsoft -- they haven’t seen this cool version, but I showed them something with the Sony head-mount and was saying “VR should be on your radar”. I know people have kind of wiped that out of their memories because of the bad old days with the Virtual Boy and all of those colossal failures, but “it’s all here now, and you should be thinking about this”. And it might not be the direction you want to go, but somebody, whether it’s Microsoft, Sony, Google, Apple, AMD, Intel, Nvidia, some company should be able to do something with this.
And what’s going to be great is, people can make a real difference. People can buy these kits and figure out what is the best way to fit it all on your face, how high the nose will be, what will be the optimal face and how wide this piece will be. Doom 3 is going to be a wonderful testbed for this. It’s a real application that people care enough to be in this. It won’t just be heating up your own poor applications for it. We’ll have the source-code available.
Plans for the kit... I’m excited for what’s going to happen eventually. We know the technologies will get better, with the resolution improving automatically. We’ll get better sensors integrated, some people will do optical-tracking work on there.