Neural processor development is reducing our reliance on remote server access to process deep learning operations in an increasingly edge-driven world. By employing in-memory processing, parallelization techniques, and algorithm-hardware co-design, memristor crossbar arrays are known to efficiently compute large scale matrix-vector multiplications. However, state-of-the-art implementations of negative weights require duplicative column wires, and high precision weights using single-bit memristors further distributes computations. These constraints dramatically increase chip area and resistive losses, which lead to increased power consumption and reduced accuracy. In this paper, we develop an adaptive precision method by varying the number of memristors at each crosspoint. We also present a weight mapping algorithm designed for implementation on our crossbar array. This novel algorithm-hardware solution is described as the radix-X Convolutional Neural Network Crossbar Array, and demonstrate how to efficiently represent negative weights using a single column line, rather than double the number of additional columns. Using both simulation and experimental results, we verify that our radix-5 CNN array achieves a validation accuracy of 90.5% on the CIFAR-10 dataset, a 4.5% improvement over binarized neural networks whilst simultaneously reducing crossbar area by 46% over conventional arrays by removing the need for duplicate columns to represent signed weights.
|Number of pages||12|
|Publication status||Published - 22 Jun 2019|