
Real-Time Machine Learning for Resource Constrained Systems with Low-Precision Mathematics, 10-R6130

Principal Investigators
Mike Koets
Inclusive Dates 
01/04/21 to 05/04/21

Background

A typical image analysis application requires tens of millions of floating-point operations per frame and hundreds of megabytes of memory. Executing deep neural networks (DNNs) on embedded platforms, such as spacecraft, unmanned aerial vehicles, and remote sensors, may be impossible due to memory or power constraints, or may suffer from extreme latency and low throughput. One approach to reducing the computational demand of a DNN is to quantize its mathematics from 32-bit floating-point to 8-bit integer arithmetic, but even this substantially simplified implementation may not be practical on an extremely resource-constrained system. The approach can be extended to 2-bit or 1-bit representations, producing a low-precision neural network that further reduces memory demands and allows more efficient computation.

Prior research into low-precision neural networks shows that aggressive quantization can impair DNN performance; this impairment has been observed in image classification problems that discern among many categories. We explored the impact of low-precision DNN implementations on the simpler problem of binary image classification. We also characterized the performance of low-precision DNNs on a resource-constrained, space-rated microprocessor, which allowed us to make design decisions that align with the business areas this research could target.
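To make the float-to-integer step concrete, the following is a minimal sketch of affine 8-bit quantization, the standard scheme in which a tensor's value range is mapped linearly onto [-128, 127] with a stored scale and zero point. The function names and the example array are illustrative, not taken from the project's implementation.

```python
import numpy as np

def quantize_int8(x):
    """Affine quantization of a float32 array to 8-bit integers.
    The observed value range is mapped linearly onto [-128, 127];
    the scale and zero point are kept for dequantization."""
    scale = (x.max() - x.min()) / 255.0
    zero_point = np.round(-128.0 - x.min() / scale)
    q = np.clip(np.round(x / scale + zero_point), -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float32 values from the int8 representation."""
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 4)).astype(np.float32)
q, s, z = quantize_int8(weights)
# Reconstruction error stays within one quantization step (the scale).
assert np.abs(dequantize(q, s, z) - weights).max() <= s
```

The memory saving is immediate (one byte per value instead of four), and on hardware with integer SIMD support the arithmetic is cheaper as well.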

Approach

We researched state-of-the-art low-precision neural networks and implemented networks with 2-bit (ternary) and 1-bit (binary) numerical precision to provide further efficiency improvements. We identified several techniques for maintaining accuracy while using 2-bit and 1-bit representations in the deep neural network. We then trained these networks to perform the simpler binary classification task with greater than 95% accuracy.
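As an illustration of what the 1-bit and 2-bit representations look like, the sketch below quantizes a weight tensor to {-α, +α} (binary, XNOR-Net-style scaling) and to {-α, 0, +α} (ternary, with a magnitude threshold). These are common techniques from the low-precision literature, shown here as assumptions; the summary does not specify which variants the project used.

```python
import numpy as np

def binarize(w):
    """Binarize weights to {-alpha, +alpha}; alpha = mean |w| is the
    per-tensor scaling used in XNOR-Net-style binary networks."""
    alpha = np.abs(w).mean()
    return alpha * np.sign(w), alpha

def ternarize(w, t=0.7):
    """Ternarize weights to {-alpha, 0, +alpha}. Magnitudes below a
    threshold (t * mean|w|, a common heuristic) are zeroed, adding
    sparsity on top of the 2-bit encoding."""
    delta = t * np.abs(w).mean()
    mask = np.abs(w) > delta
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return alpha * np.sign(w) * mask, alpha

rng = np.random.default_rng(1)
w = rng.normal(size=(3, 4))
b, alpha_b = binarize(w)
q, alpha_t = ternarize(w)
```

During training, such methods typically keep full-precision "shadow" weights, quantize them on the forward pass, and pass gradients through the quantizer via a straight-through estimator, which is one way accuracy is preserved despite the aggressive quantization.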

Accomplishments

This research resulted in techniques for training low-precision DNNs that maintain high accuracy on the binary classification problem. One major finding is that the modifications needed to achieve this higher accuracy lowered the theoretical performance on the resource-constrained, space-rated microprocessor. The lower performance stems from the processor's lack of support for the custom mathematical operations and memory-movement patterns that many of the most advanced accuracy-restoring techniques for low-precision DNNs rely on. While the microprocessor we were targeting lacked this support, an alternative deployment route through Field Programmable Gate Arrays (FPGAs) was identified. These devices provide a large pool of reconfigurable hardware that allows the creation of highly customized, high-performance implementations of such algorithms.
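The hardware dependence described above can be illustrated with a small sketch. When weights and activations are constrained to {-1, +1} and bit-packed (1 for +1, 0 for -1), a dot product reduces to an XNOR followed by a population count, so hardware with bit-manipulation support (such as FPGA logic) evaluates it in a few operations, while a processor without such support must unpack back to integers and loses the gain. The packing convention and helper names below are illustrative.

```python
def binary_dot(a_bits, w_bits, n):
    """Dot product of two {-1,+1} vectors packed into n-bit integers:
    dot = 2 * popcount(xnor(a, w)) - n."""
    xnor = ~(a_bits ^ w_bits) & ((1 << n) - 1)   # 1 where signs agree
    return 2 * bin(xnor).count("1") - n          # popcount, then rescale

# Verify against the unpacked computation for a small example.
a = [+1, -1, -1, +1]
w = [+1, +1, -1, -1]
pack = lambda v: sum((1 << i) for i, x in enumerate(v) if x > 0)
assert binary_dot(pack(a), pack(w), len(a)) == sum(x * y for x, y in zip(a, w))
```

On reconfigurable hardware, the XNOR and popcount stages can be laid out as dedicated logic operating on wide bit vectors every cycle, which is the kind of custom operation the targeted microprocessor could not express efficiently.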