Ikea Assistant

A cognitive assistant that helps assemble a stool



About

Ikea Assistant is a cloudlet-based cognitive assistant application for Google Glass that helps users assemble an Ikea stool.  It uses an R-CNN (region-based convolutional neural network) to detect which step of the assembly the user is on and whether any mistakes have been made.  It then gives the user verbal guidance, telling them what to do next or what to fix.  This research was done in collaboration with Professor Satyanarayanan and Dr. Zhuo Chen of Carnegie Mellon University and was funded by the NSF.  The cloudlet interface code was built on the Gabriel Cognitive Assistant platform.
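
The sketch below shows, at a high level, how such a pipeline fits together: a frame comes in from the Glass, the cloudlet runs detection, and a spoken instruction goes back.  It is only an illustration, not the actual Gabriel interface; the function names (grab_frame, detect_assembly_state, instruction_for_state) and the state labels are placeholders for the real client, detector, and guidance components.

    def grab_frame():
        # Placeholder for a camera frame streamed from the Google Glass client.
        return None

    def detect_assembly_state(frame):
        # Placeholder for cloudlet-side R-CNN inference on the frame.
        return "base_attached"

    def instruction_for_state(state):
        # Placeholder guidance logic: map a symbolic assembly state to spoken feedback.
        instructions = {
            "parts_laid_out": "Attach the first leg to the seat.",
            "base_attached": "Screw in the remaining legs.",
            "complete": "The stool is assembled.",
        }
        return instructions.get(state, "I can't tell which step you are on.")

    def assist_once():
        frame = grab_frame()
        state = detect_assembly_state(frame)
        return instruction_for_state(state)   # sent back to the Glass for text-to-speech

    print(assist_once())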


How it Works

The core component of this application is an R-CNN, trained to identify the different states of the construction process.  An R-CNN is a convolutional neural network extended for object detection: it proposes candidate regions of an image and classifies the contents of each one.  The network operates by convolving learned filters over the image and feeding their outputs into further layers of filters.  This filter stacking, trained with supervised learning, yields a feature extractor and classifier with state-of-the-art accuracy.  The R-CNN allows the system to understand not only which objects are present in the image, but also how the scene is configured.  Once the scene composition has been established, some simple logic lets the system point out mistakes and check for completion, as sketched below.
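
As a concrete illustration, that "simple logic" can be thought of as a few rules over the detector's output.  This is only a sketch under assumed label names (seat, leg_attached, leg_upside_down) and rules; it is not the actual logic used in the application.

    def guidance(detections):
        # `detections` is assumed to be the list of class labels the R-CNN found
        # in the current frame, e.g. ["seat", "leg_attached", "leg_attached"].
        legs_attached = detections.count("leg_attached")
        if "leg_upside_down" in detections:
            return "One leg is upside down.  Flip it so the threaded end faces the seat."
        if legs_attached == 4:
            return "All four legs are attached.  The stool is complete."
        if legs_attached > 0:
            return "%d of 4 legs attached.  Screw in the next one." % legs_attached
        return "Start by placing the seat upside down and attaching the first leg."

    print(guidance(["seat", "leg_attached", "leg_upside_down"]))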


Challenges

The main challenges in this project were related to the R-CNN.  My inexperience with machine learning meant that I had to learn not only how a neural network operates, but also how to use and train one.  I also had to learn the Gabriel Cognitive Assistant framework and OpenCV.


My Contribution

I trained the R-CNN using a GUI training interface currently in development at Carnegie Mellon University. The dataset was not provided to me, so I collected it myself, capturing the objects of interest under different lighting conditions and from different positions. I also wrote the code (in Keras) that ran the R-CNN and sent the correct instructions to the user; a simplified sketch of what that inference code looks like appears below.
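
The sketch treats the network as a Keras model that takes a preprocessed frame and returns class scores; the real R-CNN also returned object regions.  The model file name, input size, and label list are placeholders, not the actual trained artifacts.

    import cv2
    import numpy as np
    from keras.models import load_model

    # Placeholder model file and label set.
    model = load_model("stool_rcnn.h5")
    labels = ["seat", "leg_detached", "leg_attached", "leg_upside_down"]

    def classify_frame(frame_bgr):
        # Preprocess a camera frame with OpenCV and run it through the network.
        image = cv2.resize(frame_bgr, (224, 224))        # match the assumed input size
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)   # OpenCV gives BGR; model expects RGB
        batch = np.expand_dims(image.astype("float32") / 255.0, axis=0)
        scores = model.predict(batch)[0]                 # one score per label
        return labels[int(np.argmax(scores))]

    frame = cv2.imread("frame.jpg")
    print(classify_frame(frame))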