Sketching based Big Data Acceleration on Low Power Cores

Wireless medical technologies have created opportunities for new methods of preventive care using implanted and body-worn biomedical devices. Designing the technologies that enable these applications requires both reliable delivery of the patient's vital physiological signs and careful energy management in power-constrained devices. Because battery replacement is costly and, for implanted devices, carries even higher risk, these devices must be designed and developed for minimum energy consumption.

Deep Neural Nets for Embedded Big Data Applications

We explore the use of deep neural networks (DNNs) for embedded big data applications. Deep neural networks have been demonstrated to outperform state-of-the-art solutions for a variety of complex classification tasks, such as image recognition. The ability to train a network to perform both feature abstraction and classification provides a number of key benefits. One key benefit is that it reduces the burden on the developer to produce efficient, optimal feature engineering, which typically requires expert domain knowledge and significant time. A second key benefit is that the network's complexity can be adjusted to achieve the desired accuracy. Despite these benefits, DNNs have yet to be fully realized in an embedded setting. In this research, we explore novel architecture optimizations and develop optimal static mappings of neural networks onto highly parallel, highly granular hardware processors such as many-cores and embedded GPUs.
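To make the point about adjustable network complexity concrete, the following is a minimal sketch that estimates the parameter count and multiply-accumulate (MAC) operations of a fully connected network as its hidden-layer width changes. The layer sizes (784 inputs, 10 outputs) and the function names are illustrative assumptions and are not taken from this project.

```python
def mlp_cost(layer_sizes):
    """Return (parameters, MACs per inference) for a fully connected network.

    layer_sizes is a hypothetical list such as [inputs, hidden, outputs].
    Each layer has in*out weights plus one bias per output neuron.
    """
    params = 0
    macs = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        params += (n_in + 1) * n_out  # weights plus biases
        macs += n_in * n_out          # one MAC per weight
    return params, macs


if __name__ == "__main__":
    # Assumed example: sweep the hidden width to trade accuracy for cost.
    for hidden in (32, 128, 512):
        params, macs = mlp_cost([784, hidden, 10])
        print(f"hidden={hidden:4d}  params={params:7d}  MACs={macs:7d}")
```

A sweep like this gives a first-order view of how much memory and compute each candidate network would demand from an embedded target before any accuracy evaluation is run.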

A Low Power Wearable Tongue Drive System for People with Severe Disabilities

This work demonstrates an ultra-low-power multi-sensor Tongue Drive System (TDS) that lets individuals with severe disabilities control their environment using tongue movement. We propose an ultra-low-power local processor that performs all signal processing at the sensor side rather than transmitting all raw data. The proposed TDS significantly reduces transmission power consumption and consequently increases battery life. Assuming the TDS user issues one command per second, the proposed local processing reduces the data volume that must be wirelessly transmitted to a PC or smartphone by a factor of 1500, from 12 kbit/s to approximately 8 bit/s. The proposed processor consists of three blocks: an I2C interface for communication, External Magnetic Field (EMF) attenuation, and a logistic regression classifier for command classification. The processor is implemented in 65-nm CMOS technology, occupies 0.016 mm², and consumes 3.9 nJ of energy, which is 41 times smaller than the implementation in previous work. For demonstration, the complete TDS on a headset with FPGA, Bluetooth, battery, and sensors has been tested. The detection accuracy is 90.12%.
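The data-rate arithmetic and the classification step can be illustrated with a short sketch. The two rates come from the text above; everything else (the feature vector, the weights, the number of commands, and the function names) is a hypothetical placeholder, and the multinomial logistic regression shown here is a generic formulation rather than the processor's actual implementation.

```python
import math

RAW_RATE_BPS = 12_000  # raw sensor stream, from the text (12 kbit/s)
LOCAL_RATE_BPS = 8     # after local processing, from the text (about 8 bit/s)


def reduction_factor(raw_bps, processed_bps):
    """Ratio of raw to locally processed wireless data rate."""
    return raw_bps / processed_bps


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases):
    """Generic multinomial logistic regression.

    weights holds one row per command; returns (command index, probability).
    """
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


if __name__ == "__main__":
    print(reduction_factor(RAW_RATE_BPS, LOCAL_RATE_BPS))  # 1500.0

    # Hypothetical 2-feature, 3-command model for illustration only.
    toy_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
    toy_biases = [0.0, 0.0, 0.0]
    print(classify([2.0, 0.5], toy_weights, toy_biases))
```

Sending only the winning command index once per second, instead of the raw magnetic sensor samples, is what accounts for the drop from 12 kbit/s to roughly 8 bit/s.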