Recently, the team of Professors Qian He and Wu Huaqiang from the Institute of Microelectronics of Tsinghua University and the Future Chip Technology Advanced Innovation Center, together with their collaborators, published a research paper titled “Fully hardware-implemented memristor convolutional neural network” online in Nature, reporting a complete hardware implementation of a convolutional neural network based on memristor array chips.
In processing convolutional neural networks (CNNs), the energy efficiency of this integrated storage and computing system is two orders of magnitude higher than that of cutting-edge graphics processing units (GPUs), which can be said to break through the “von Neumann bottleneck” to a certain extent: while greatly improving computing power, it completes complex calculations with lower power consumption and lower hardware cost.
Schematic diagram of multiple memristor array chips working together. (Photo from: Tsinghua News Network, the same below)
Memory and computing integrated system based on memristor chip
What is a memristor?
A memristor, short for memory resistor, is the fourth basic circuit element after the resistor, capacitor, and inductor. It represents the relationship between magnetic flux and electric charge. Its existence was first predicted in 1971 by Leon Chua (Cai Shaotang), a professor at the University of California, Berkeley, and Hewlett-Packard successfully developed one in 2008.
In simple terms, the resistance of such a component changes with the amount of current that has passed through it. Even after the current stops, the resistance stays at its last value until a reverse current pushes it back, which is to say that the component can “remember” how much current has passed.
This effect is similar to the behavior of synapses between neurons. In addition, memristors are small, consume little operating power, and can be integrated at large scale (including three-dimensional integration).
Artificial neural networks have achieved remarkable results in recent years. What happens if memristors are connected into arrays to serve as the hardware of an artificial neural network?
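The "memory" behavior described above can be illustrated with a toy model. The sketch below is only an illustration, not a physical device model and not anything taken from the paper: the class name ToyMemristor, the linear relation between stored charge and resistance, and all numeric values are assumptions chosen to keep the example small.

```python
class ToyMemristor:
    """Toy charge-controlled resistor (hypothetical, for illustration only)."""

    def __init__(self, r_min=100.0, r_max=16000.0, q_max=1.0):
        self.r_min = r_min    # resistance when fully switched on (ohms, assumed)
        self.r_max = r_max    # resistance when fully switched off (ohms, assumed)
        self.q_max = q_max    # charge needed to sweep the whole range (assumed)
        self.q = 0.0          # net charge that has passed through so far

    def apply_current(self, current, duration):
        """Pass a constant current for some time and update the stored state."""
        self.q += current * duration
        # Clamp so the state stays between fully off and fully on.
        self.q = min(max(self.q, 0.0), self.q_max)

    @property
    def resistance(self):
        """Resistance interpolated linearly from the stored charge."""
        fraction = self.q / self.q_max
        return self.r_max - fraction * (self.r_max - self.r_min)
```

In this sketch, a forward current lowers the resistance, zero current leaves it where it was, and a reverse current of the same size restores the starting value.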
Current international research in this area is still limited to verifying simple network structures or to simulations based on a small amount of device data. A complete hardware implementation based on memristor arrays still faces many challenges. In terms of devices, fabricating high-performance memristor arrays remains difficult. In terms of systems, the inherent non-ideal characteristics of the devices (such as device-to-device variation, stuck conductance, and conductance state drift) reduce recognition accuracy. In terms of architecture, implementing convolution on a memristor array requires sampling and computing multiple input blocks one after another in a serial sliding manner, which cannot match the computational efficiency of a fully connected structure.
The team of Professors Qian He and Wu Huaqiang fabricated high-performance memristor arrays by optimizing the materials and device structure. In May 2017, the research group reported in Nature Communications the first brain-like computing based on an array of 1,024 oxide memristors, increasing the integration scale of oxide memristors by an order of magnitude. This allowed the chip to complete face recognition tasks more efficiently, reducing energy consumption to less than one thousandth of the original.
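The serial sliding problem mentioned above can be made concrete with a small sketch. It is plain Python rather than a hardware model, and the function names (extract_patches, array_read, serial_convolution) are hypothetical. Each call to array_read stands in for one multiply-accumulate pass over a single array, so the read counter shows why one array handling a convolution needs many sequential steps.

```python
def extract_patches(image, k):
    """Return every k x k patch of a 2D list, flattened, in row-major order."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(rows - k + 1):
        for c in range(cols - k + 1):
            patches.append(
                [image[r + i][c + j] for i in range(k) for j in range(k)]
            )
    return patches


def array_read(weights, inputs):
    """One read of the array: a weighted sum of the applied inputs."""
    return sum(w * x for w, x in zip(weights, inputs))


def serial_convolution(image, kernel):
    """Slide one kernel over the image, one patch per read."""
    k = len(kernel)
    flat_kernel = [w for row in kernel for w in row]
    outputs, reads = [], 0
    for patch in extract_patches(image, k):
        outputs.append(array_read(flat_kernel, patch))
        reads += 1  # each patch costs one sequential read
    return outputs, reads
```

For a 4 x 4 input and a 2 x 2 kernel, this takes 9 sequential reads, whereas a fully connected layer applied to the same input would need only one.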
Memristor neural network
To address the loss of recognition accuracy caused by non-ideal device characteristics, they proposed a new hybrid training algorithm. It needs only a small number of image samples to train the neural network and fine-tunes only some of the weights in the last layer of the network. With it, the storage and computing integrated architecture reaches a recognition accuracy of 96.19% on the handwritten digit set, comparable to that of software. They also proposed a spatial parallel mechanism that programs the same convolution kernel into multiple groups of memristor arrays. Each group can process a different convolution input block in parallel, increasing the degree of parallelism and speeding up the convolution calculation.
On this basis, the team built a complete storage and computing integrated system implemented entirely in hardware. The system integrates 8 arrays, each containing 2,048 memristors, to improve the efficiency of parallel computing. The team ran a convolutional neural network algorithm efficiently on the system, successfully verified its image recognition function, and demonstrated the feasibility of a fully hardware implementation of the storage-computing integrated architecture.
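The spatial parallel mechanism described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the team's implementation: the function name parallel_convolution and the num_copies parameter are hypothetical, input blocks are given as already flattened lists, and the "parallel" groups are simulated one after another in ordinary Python. The step counter is the point: with more copies of the same kernel, fewer time steps are needed to cover all input blocks.

```python
def parallel_convolution(patches, flat_kernel, num_copies):
    """Process input blocks in batches, one block per kernel copy per step."""
    # The same kernel is "programmed" into every group of arrays.
    copies = [list(flat_kernel) for _ in range(num_copies)]
    outputs, steps = [], 0
    for start in range(0, len(patches), num_copies):
        batch = patches[start:start + num_copies]
        # Within one step, each copy handles a different input block.
        outputs.extend(
            sum(w * x for w, x in zip(copy, patch))
            for copy, patch in zip(copies, batch)
        )
        steps += 1
    return outputs, steps
```

With five input blocks, one copy of the kernel needs five steps, while two copies need three, and the outputs are identical in both cases.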
Storage and Computing Integrated System Architecture
The team of Professors Qian He and Wu Huaqiang has long been devoted to research on storage-computing integration technology for artificial intelligence and has achieved innovative breakthroughs at multiple levels, including device performance optimization, process integration, circuit design, and architecture and algorithms. It has published many papers in journals such as Nature Communications, Nature Electronics, and Advanced Materials, and at top academic conferences such as the International Electron Devices Meeting (IEDM) and the International Solid-State Circuits Conference (ISSCC).
Professor Wu Huaqiang from the Institute of Microelectronics of Tsinghua University is the corresponding author of this paper, and Yao Peng, a doctoral student of the Institute of Microelectronics of Tsinghua University, is the first author. The research work was supported by the National Natural Science Foundation of China, the National Key R&D Program, the Beijing Municipal Science and Technology Commission, the Beijing National Research Center for Information Science and Technology, and Huawei Technologies Co., Ltd.