Softkinetic-Optrima 3-D camera

The days of rifling through couch cushions for a television remote could be coming to an end, as 3-D gesture-recognition technology finds its way into set top boxes following a deal between Intel and Softkinetic-Optrima.

Like a hyperevolved descendant of The Clapper, the devices will let television viewers navigate menus and control volume by moving their arms in predefined patterns.

Gesture recognition technology, previously somewhat arcane, gathered momentum last year when Microsoft demoed its Project Natal to enormous acclaim. Natal applies similar technology to hard-core gaming on the Xbox, letting users play fighting games by actually punching and kicking in the air, using technology from Microsoft’s acquisition of Israel-based gesture-recognition company 3DV.

In addition to a partnership with EA Sports for games, Softkinetic Optrima plans to apply gesture recognition to the lean-back television experience, allowing people to turn up the volume by moving their hand in a circle, switch the channel by swiping to the right, pause by extending their hands in a “stop” gesture, and so on.

Softkinetic-Optrima’s gesture-recognition technology, which links up with cameras with radarlike properties, will be bundled in a box running on top of Intel’s powerful Atom Processor CE4100.

That chip will appear in Orange’s cable services in Africa, Europe and the Middle East by the end of this year or early next year, and likely in the United States as cable and satellite providers incorporate Intel’s chip, which also supports 3-D television. The jury is still out on 3-D TV, but regardless of whether people are willing to don 3-D glasses, Softkinetic Optrima’s gesture-recognition technique might come in handy (so to speak) because it works with regular broadcasts and menus.

How It Works

The 3-D camera Softkinetic-Optrima uses for these Intel-inside boxes (prototype pictured above) produces a depth map that records the distance of each pixel from the camera. It works more like radar than like a traditional two-lens stereoscopic camera (like the one used by Earthmine to make more detailed maps than Google’s). That’s because stereo cameras need visible light to make a 3-D image, and people often watch television or play videogames in relative darkness. Making matters worse, a purely optical solution can’t distinguish between a white shirt and the white wall behind it.
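To see why a depth map sidesteps the white-shirt-on-white-wall problem, here is a minimal sketch in Python. It is not SoftKinetic-Optrima's software: the tiny 4x4 depth map, the 2-meter cutoff and the `foreground_mask` helper are all made up for illustration. The point is only that a viewer and the wall behind them differ in distance even when they match in color.

```python
# Hypothetical 4x4 depth map in meters (illustrative values only).
# The 3.0 readings stand in for the wall, the ~1.2 readings for a viewer.
DEPTH_MAP = [
    [3.0, 3.0, 3.0, 3.0],
    [3.0, 1.2, 1.3, 3.0],
    [3.0, 1.2, 1.2, 3.0],
    [3.0, 1.3, 1.2, 3.0],
]


def foreground_mask(depth_map, max_distance_m):
    """Mark every pixel closer than max_distance_m as foreground (True)."""
    return [[depth < max_distance_m for depth in row] for row in depth_map]


# Assumed cutoff: anything nearer than 2 meters counts as the viewer.
mask = foreground_mask(DEPTH_MAP, max_distance_m=2.0)
foreground_pixels = sum(cell for row in mask for cell in row)
print(foreground_pixels)
```

A real system would build a body model from the foreground pixels rather than just counting them, but the separation step relies on the same idea.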

SoftKinetic-Optrima’s software analyzes the output from the radarlike camera, assigns depth to each pixel, and creates a body model for controller-free, gesture-based interaction. Images courtesy SoftKinetic-Optrima.


Because you shouldn’t have to turn on the light or change shirts just to switch the channel on your television, the current generation of 3-D gesture-recognition cameras shine their own invisible infrared light on their subjects and judge the distance of each point based either on the time the light takes to return (the “time of flight” method) or on deformations in a projected grid. Until recently, they were too expensive to be included in consumer devices, so SoftKinetic focused on industrial uses prior to its acquisition of Optrima, which makes the cameras. As tends to happen with technology, the price of gesture-recognition cameras has dropped significantly over time, to the point where set-top box manufacturers can include them in standard cable or satellite boxes.
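The arithmetic behind the time-of-flight method is simple: light travels to the subject and back, so the distance is the speed of light multiplied by the round-trip time, divided by two. The sketch below, in Python, shows only that textbook relationship. The function name and the 20-nanosecond sample reading are assumptions for illustration, not details of the company's cameras.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the meter


def distance_from_round_trip(round_trip_seconds):
    """Distance to a surface, given how long emitted light took to return.

    The light covers the distance twice (out and back), hence the division.
    """
    return SPEED_OF_LIGHT_M_PER_S * round_trip_seconds / 2


# Assumed sample reading: a 20-nanosecond round trip.
distance_m = distance_from_round_trip(20e-9)
print(f"{distance_m:.2f} m")
```

A 20-nanosecond round trip works out to roughly 3 meters, about the distance from a couch to a television, which gives a sense of how fine the timing has to be.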

The company’s software analyzes 3-D camera data at 50 frames per second, recognizing gestures and movements or recreating the bodies of one or more people in front of the camera on the television screen, like a lower-resolution version of the cameras-and-dots technique used to capture the movements of athletes for sports videogames. In the case of 3-D programming, it can place your avatar within the scene based on the size of the room, where you’re standing in it, your height, and so on, and allow you to grab objects that appear behind other objects.
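As a rough idea of what recognizing a gesture from 50 frames per second of tracking data might involve, here is a toy swipe detector in Python. It is a sketch under stated assumptions, not the company's algorithm: it assumes the software has already reduced each frame to a single horizontal hand position in millimeters, with larger values meaning farther to the viewer's right, and the `detect_swipe` function and its thresholds are hypothetical.

```python
FRAME_RATE_HZ = 50  # frame rate quoted in the article


def detect_swipe(hand_x_mm, min_distance_mm=300, max_duration_s=0.5):
    """Return "right", "left" or None for per-frame hand positions.

    A swipe is assumed to be a horizontal move of at least
    min_distance_mm completed within max_duration_s.
    """
    window = int(max_duration_s * FRAME_RATE_HZ)  # frames allowed per swipe
    for end in range(1, len(hand_x_mm)):
        start = max(0, end - window)
        displacement = hand_x_mm[end] - hand_x_mm[start]
        if displacement >= min_distance_mm:
            return "right"
        if displacement <= -min_distance_mm:
            return "left"
    return None


# Made-up tracks: a quick move (20 mm per frame) and a slow drift (2 mm per frame).
quick_move = [20 * frame for frame in range(20)]
slow_drift = [2 * frame for frame in range(100)]
print(detect_swipe(quick_move), detect_swipe(slow_drift))
```

The time window is what keeps a slow drift of the hand from changing the channel. A shipping product would also need to handle noise, several people in the room, and gestures such as circles and the "stop" pose.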

My take on all this…

  • Are we ready to control televisions with movements?
  • And what about the privacy issues associated with pointing a connected camera at your living room, 24 hours a day?

According to Intel, we’ll need this technology in part to deal with the fire hose of content streaming through our television sets, which will grow stronger as internet-delivered television becomes commonplace.

By the year 2015, it’s expected there will be billions of consumer devices delivering billions of hours of video content, music, video games and web browsing, so naturally we’ll need much more sophisticated ways to organize and deliver content in interactive and intuitive ways. Intel suggests gesture control has to be one of them.