Ready to use your voice and hands to control your PC? How about a virtual orchestra or a flying quadrocopter?
Microsoft Research this morning is officially expanding the Kinect motion sensor beyond its Xbox 360 game console to traditional Windows machines. The company is releasing a development kit that students, researchers and other noncommercial software developers can use to create Windows 7 programs that sense their surroundings and respond to voice commands and gestures.
The beta release of the free Kinect software development kit, or SDK, could fuel the grassroots Kinect applications that have until now been considered hacks. Microsoft says it’s also working on a version of the SDK for commercial software programs.
Over time, the move by Microsoft could help the motion sensor find a place in such settings as kitchens, doctors’ offices or auto repair shops, where grubby hands or sterile gloves make a keyboard and mouse difficult or undesirable to use.
But developers are thinking bigger than that. Applications built during a coding marathon that Microsoft hosted over the past 24 hours include a virtual orchestra that users can conduct with their hands, and a miniature quadrocopter that can similarly be controlled with gestures.
The programs use the $150 Kinect sensor currently sold for Microsoft’s Xbox 360.
Kinect on Windows will represent “an inflection point” for computing, said Anoop Gupta, Microsoft Research distinguished scientist, in an interview this morning. It will take some time to spread to the hundreds of millions of people who have PCs, but in the evolution of computing “this will be a pretty memorable moment,” he predicted.
Microsoft isn’t giving a timeframe for the release of a commercial version of the SDK, and it isn’t talking about how the sensor could ultimately be integrated into its own PC products. However, it’s not difficult to envision the sensor being used to control the new Windows 8 interface, or working in conjunction with Skype for video conferencing, after Microsoft’s acquisition of Skype is complete.
Features of the beta SDK include:
Raw Sensor Streams. Developers have access to raw data streams from the depth sensor, the color camera sensor and the four-element microphone array. These will allow them to build upon the low-level streams generated by the Kinect sensor.
Skeletal Tracking. The SDK can track the skeleton image of one or two people moving within the Kinect field of view, making it possible to create gesture-driven applications.
Advanced Audio Capabilities. Audio processing capabilities include sophisticated noise suppression and echo cancellation, beam formation to identify the current sound source, and integration with the Windows speech recognition API.
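To give a rough sense of what a gesture-driven application might do with skeletal tracking data, here is a minimal Python sketch. It does not use the Kinect SDK itself. The frame format (a dictionary mapping joint names to x, y, z positions in meters, with y pointing up), the joint names `head` and `hand_right`, and the thresholds are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical frame format: joint name -> (x, y, z) in meters, y pointing up.
# This is an illustrative stand-in, not the data layout of the Kinect SDK.
Joint = Tuple[float, float, float]
Skeleton = Dict[str, Joint]


def hand_raised(skeleton: Skeleton, margin: float = 0.10) -> bool:
    """Return True when the right hand is more than `margin` meters above the head."""
    return skeleton["hand_right"][1] > skeleton["head"][1] + margin


def swipe_right(frames: List[Skeleton], min_distance: float = 0.40) -> bool:
    """Return True when the right hand never moves back in x across the
    frames and travels at least `min_distance` meters in +x overall."""
    xs = [frame["hand_right"][0] for frame in frames]
    if len(xs) < 2:
        return False
    steady = all(b >= a for a, b in zip(xs, xs[1:]))
    return steady and (xs[-1] - xs[0]) >= min_distance


if __name__ == "__main__":
    # Six made-up frames in which the right hand drifts to the right.
    frames = [
        {"head": (0.0, 1.6, 2.0), "hand_right": (0.1 * i, 1.2, 1.8)}
        for i in range(6)
    ]
    print("swipe right:", swipe_right(frames))
    print("hand raised:", hand_raised(frames[-1]))
```

A real application would read joint positions from the sensor on every frame and would likely smooth them before applying rules like these, since tracked joints tend to jitter.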