Efficient Digital Implementation of Extreme Learning Machines for Classification

Sergio Decherchi, Paolo Gastaldo, Alessio Leoncini, Rodolfo Zunino.

The availability of compact, fast circuitry for the support of artificial neural systems is a long-standing and critical requirement for many important applications. This work addresses the implementation of the powerful Extreme Learning Machine (ELM) model on reconfigurable digital hardware. The design strategy first provides a training procedure for ELMs, which effectively trades off prediction accuracy and network complexity. This in turn facilitates the optimization of hardware resources.

The research analyzes two implementation approaches: one involving FPGA devices, and one embedding low-cost, low-performance devices such as CPLDs. Experimental results show that, in both cases, the design approach yields efficient digital architectures with satisfactory performance and limited cost.

Available material

This page makes available the VHDL code that implements the ELM classifier. All the material is packed into a password-protected zip file. Please contact Paolo Gastaldo (paolo.gastaldo@unige.it) to get the password.

The implementation refers to a classifier tested on the MNIST dataset; the neural network has 81 inputs and 18 neurons in the hidden layer. The training phase was completed offline; hence, the number of neurons and the hidden weights are hardcoded into the digital design.
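As a rough illustration of the network dimensions given above, the following Python sketch computes the forward pass of a single-hidden-layer network with 81 inputs and 18 hidden neurons. It is not the VHDL design distributed here: the hard-limiter activation, the single output, the function names, and the random placeholder weights are all assumptions made for the example.

```python
import random

N_INPUTS = 81   # from the text
N_HIDDEN = 18   # from the text
N_OUTPUTS = 1   # assumption: the page does not state the number of outputs


def hard_limit(v):
    """Assumed activation: +1 for non-negative input, -1 otherwise."""
    return 1.0 if v >= 0.0 else -1.0


def elm_forward(x, hidden_w, hidden_b, output_w):
    """Forward pass: fixed hidden layer followed by a linear output layer."""
    hidden = [
        hard_limit(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(hidden_w, hidden_b)
    ]
    return [sum(w * h for w, h in zip(row, hidden)) for row in output_w]


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder weights; in the real design they come from offline training.
    hidden_w = [[rng.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(N_HIDDEN)]
    hidden_b = [rng.uniform(-1, 1) for _ in range(N_HIDDEN)]
    output_w = [[rng.uniform(-1, 1) for _ in range(N_HIDDEN)] for _ in range(N_OUTPUTS)]
    x = [rng.uniform(0, 1) for _ in range(N_INPUTS)]
    print(elm_forward(x, hidden_w, hidden_b, output_w))
```

Because the hidden weights are fixed after training, a hardware version can store them as constants, which is consistent with the hardcoding mentioned above.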

A signed two's-complement fixed-point representation is adopted; 16 bits encode all numerical quantities, with 4 bits for the integer part (including the sign bit) and 12 bits for the fractional part.
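To make the number format concrete, the sketch below quantizes a real value to a 16-bit word with 12 fractional bits. With the sign bit counted among the 4 integer bits, the representable range is -8 to 8 - 2^-12 in steps of 2^-12. The function names are hypothetical, and saturating out-of-range values is an assumption; the hardware may handle overflow differently.

```python
WORD_BITS = 16
FRAC_BITS = 12


def to_fixed(x):
    """Quantize x to a 16-bit two's-complement word, returned as 0..65535."""
    scaled = round(x * (1 << FRAC_BITS))
    lowest = -(1 << (WORD_BITS - 1))
    highest = (1 << (WORD_BITS - 1)) - 1
    scaled = max(lowest, min(highest, scaled))  # assumption: saturate on overflow
    return scaled & ((1 << WORD_BITS) - 1)


def to_bits(word):
    """Render a 16-bit word as a string of 0s and 1s."""
    return format(word, "016b")


if __name__ == "__main__":
    for value in (1.0, 0.5, -1.0, 100.0):
        print(value, to_bits(to_fixed(value)))
```

For example, 1.0 maps to 0001000000000000 and -1.0 maps to 1111000000000000.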

The following files are provided:

Implementation on FPGA devices:

Implementation on CPLD devices:

Common files:



  1. Implement the VHDL code with BinaryELM as the top-level entity
  2. Start the simulation by using the testbench Test.vhd; this step generates the Test.out file. The file contains the outputs of the ELM network represented as binary values
  3. Convert the binary values into real values (16 bits, 4 bits for the integer part, 12 bits for the fractional part) by using this converter (win32 source | linux source)
  4. Evaluate the classification performance of the hardware implementation by using this spreadsheet
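The conversion in step 3 can be sketched in Python as follows. This is not the converter distributed with the material; it only illustrates the arithmetic. The function names are hypothetical, and the layout of Test.out (one 16-bit binary word per line) is an assumption, since the page does not describe the file format.

```python
WORD_BITS = 16
FRAC_BITS = 12


def bits_to_real(bits):
    """Decode a 16-character binary string as a signed two's-complement value."""
    if len(bits) != WORD_BITS or set(bits) - {"0", "1"}:
        raise ValueError("expected 16 binary digits, got %r" % bits)
    raw = int(bits, 2)
    if raw >= 1 << (WORD_BITS - 1):  # sign bit set: value is negative
        raw -= 1 << WORD_BITS
    return raw / float(1 << FRAC_BITS)


def decode_text(text):
    """Decode every non-empty line (assumed: one binary word per line)."""
    return [bits_to_real(line.strip()) for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    sample = "0001000000000000\n1111000000000000\n0000100000000000\n"
    print(decode_text(sample))
```

To decode an actual simulation output under the same layout assumption, read the file contents and pass them to decode_text.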