Speeding up MLP Execution through Difference Forward Propagation


Abstract

At present, the Multi-Layer Perceptron (MLP) is without doubt the most widely used neural network model in applications. It is therefore important, from an engineering point of view, to design and test methods that improve MLP efficiency at run time. This paper describes a simple but effective method to reduce execution time for MLP networks processing sequential input. This case is very common and includes all kinds of temporal processing, such as speech, video, and, in general, signals varying in time. The proposed technique requires neither specialized hardware nor large amounts of additional memory. The method is based on the ubiquitous idea of difference transmission, widely used in signal coding. For each neuron, the activation value at a given moment is compared with the activation value computed at the previous forward pass of the network: if no relevant change has occurred, the neuron performs no computation; otherwise, it propagates to the connected neurons the difference between its two activations multiplied by its outgoing weights. The method requires quantizing the unit activation function, which introduces an error that is analyzed empirically. In particular, the effectiveness of the method is verified on two speech recognition tasks with two different neural network architectures. The results show a drastic reduction of execution time on both architectures and no significant change in recognition quality.
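The skip-or-propagate rule described in the abstract can be illustrated with a minimal sketch. The layer class, the quantization step, and the tanh nonlinearity below are illustrative assumptions, not the paper's actual implementation: each layer caches the quantized input from the previous forward pass and updates its pre-activation sums using only the units whose quantized activation changed, multiplied by their outgoing weights.

```python
import numpy as np

def quantize(x, step=0.05):
    # Map activations to discrete levels so that small fluctuations collapse
    # to "no change" (the step size is an assumed parameter, not from the paper).
    return np.round(x / step) * step

class DiffLayer:
    """One MLP layer that propagates activation *differences* between
    consecutive forward passes instead of recomputing full activations."""

    def __init__(self, n_in, n_out, rng):
        self.W = rng.standard_normal((n_in, n_out)) * 0.1
        self.b = np.zeros(n_out)
        self.prev_in = np.zeros(n_in)   # quantized input seen at the previous pass
        self.pre_act = self.b.copy()    # running pre-activation sums

    def forward(self, x, step=0.05):
        xq = quantize(x, step)
        delta = xq - self.prev_in            # change since the previous pass
        changed = np.nonzero(delta)[0]       # units whose quantized value moved
        if changed.size:
            # Only changed units do work: add their difference times
            # their outgoing weights; unchanged units are skipped entirely.
            self.pre_act += delta[changed] @ self.W[changed]
            self.prev_in = xq
        return np.tanh(self.pre_act)
```

Because the cached pre-activation sums are updated exactly by the propagated differences, the output matches a full forward pass on the quantized input; the approximation error comes only from the quantization itself, and the saving grows with how slowly the input sequence varies.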
