Asynchronous programmable logic for reconfigurable network on chip architectures

Yu, J 2019, Asynchronous programmable logic for reconfigurable network on chip architectures, Doctor of Philosophy (PhD), Engineering, RMIT University.


Document type: Thesis
Collection: Theses

Attached Files
Name Description MIMEType Size
Yu.pdf Thesis application/pdf 10.38MB
Title Asynchronous programmable logic for reconfigurable network on chip architectures
Author(s) Yu, J
Year 2019
Abstract Since their invention, reconfigurable systems have evolved from a simple logic replacement function to an important technology that enables hardware to be flexibly tuned, reshaped and altered at will to optimally suit the design purpose. Reconfigurable architectures such as field programmable gate arrays (FPGA) have become popular due to their low non-recurring engineering costs and faster time-to-market and are now found in diverse applications from large server farms to embedded and portable systems. FPGAs have more recently emerged as reconfigurable co-processor platforms tightly coupled to microprocessors within integrated system on chip architectures. However, it is becoming clear that issues such as extreme process variability in deep sub-micron transistors will impact the ability to achieve timing closure in high-performance synchronous FPGAs in the same way as it affects conventional ASIC architectures. This is particularly true for low-voltage, low power embedded and portable devices.

This has motivated renewed research into asynchronous approaches, which eliminate the distributed clock and replace it with a handshake scheme that automatically adapts to changes in circuit behaviour due to the combined effects of process variability, supply voltage and temperature. As one of a number of alternative asynchronous approaches, Null Convention Logic (NCL) is a symbolically complete, quasi-delay insensitive logic system that is inherently self-determined, locally autonomous and self-synchronising. Thus, NCL circuit design is largely insensitive to the delay of each individual logic element and can be set up to be 'correct-by-construction'. In contrast to conventional Boolean gates, NCL gates are designed as majority (threshold) logic with state-holding (hysteresis) behaviour. However, NCL circuits cannot be easily mapped onto conventional FPGA systems which contain no appropriate delay-insensitive functions. Therefore, this thesis proposes and analyses a novel NCL-based reconfigurable logic block that is intended to form one component of an asynchronous reconfigurable system on a chip.
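
The threshold-with-hysteresis behaviour described above can be illustrated with a small behavioural model. This is a minimal sketch, not taken from the thesis: the class name `ThresholdGate` and its `step` method are hypothetical. It assumes the commonly described unweighted THmn gate, whose output is asserted once at least m of its n inputs are asserted and is cleared only when all inputs have returned to zero. Weighted-threshold gates are not modelled.

```python
class ThresholdGate:
    """Behavioural model of an unweighted NCL THmn gate (illustrative only)."""

    def __init__(self, m, n):
        if not 1 <= m <= n:
            raise ValueError("threshold m must satisfy 1 <= m <= n")
        self.m = m
        self.n = n
        self.out = 0  # gate starts in the NULL (de-asserted) state

    def step(self, inputs):
        """Apply one set of input values and return the resulting output."""
        if len(inputs) != self.n:
            raise ValueError("expected %d inputs" % self.n)
        asserted = sum(1 for x in inputs if x)
        if asserted >= self.m:
            self.out = 1   # threshold reached: assert the output
        elif asserted == 0:
            self.out = 0   # all inputs de-asserted: clear the output
        # otherwise hold the previous output (hysteresis)
        return self.out


gate = ThresholdGate(2, 3)          # a TH23 gate
trace = [gate.step(v) for v in ([1, 0, 0], [1, 1, 0], [1, 0, 0], [0, 0, 0])]
```

In this trace the third step keeps the output asserted even though only one input remains high, which is the state-holding property that distinguishes these gates from plain Boolean majority functions.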

A dual-rail NCL reconfigurable logic block (NRLB) has been designed and simulated using a commercial 28nm FDSOI CMOS process. The Boolean logic equations representing the 27 fundamental dual-rail NCL gates were first decomposed and common terms identified. A look-up table was then created that allowed these logic terms to be re-combined under the control of a configuration memory (shift register) so that the equations for each of the fundamental dual-rail NCL gates could be created as required. The table is 'fracturable' in that it can also be partitioned and set up as a pair of 2-input NCL registers. While the basic area and latency of the dual-rail 2-output NRLB are comparable with previous single-rail asynchronous reconfigurable systems, it is demonstrated that the mapping process can be more productive, and that it results in a gate resource usage that is much less than comparable asynchronous architectures. Furthermore, the NRLB has more flexibility, and it is more suitable for performing fundamental NCL threshold gate functions. The cell also exhibits a static power consumption figure similar to that of a commercial Intel Stratix V FPGA device.

A customised CAD flow is introduced to provide a specific CAD toolset for this new NCL-based reconfigurable architecture. The flow is based on the open-source Verilog-to-Routing (VTR) CAD toolset, with its architecture files tuned to match the 28nm Stratix V FPGA characteristics, augmented with a simple Verilog to dual-rail NCL conversion process. The VTR CAD flow can produce various statistics, such as area and timing analysis, to support comparison between the conventional FPGA system and the NCL-based reconfigurable system.
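
To give a sense of what a single-rail to dual-rail conversion involves, the sketch below models one Boolean AND as a dual-rail NCL function. It is an illustration under stated assumptions, not the conversion process used in the thesis. It assumes the usual dual-rail encoding, in which each signal is a pair of rails with NULL as (0, 0), DATA0 as (1, 0) and DATA1 as (0, 1), and it uses a textbook-style input-complete AND mapping. All names (`encode`, `decode`, `HysteresisGate`, `DualRailAnd`) are hypothetical.

```python
NULL = (0, 0)  # both rails low: no data present


def encode(bit):
    """Encode a Boolean value as a (rail0, rail1) pair."""
    return (0, 1) if bit else (1, 0)


def decode(pair):
    """Return 0 or 1 for DATA, None for NULL; (1, 1) is illegal."""
    if pair == NULL:
        return None
    if pair == (1, 0):
        return 0
    if pair == (0, 1):
        return 1
    raise ValueError("illegal dual-rail state (1, 1)")


class HysteresisGate:
    """State-holding gate: set when set_fn is true, reset when all inputs are 0."""

    def __init__(self, set_fn):
        self.set_fn = set_fn
        self.out = 0

    def step(self, *inputs):
        if self.set_fn(*inputs):
            self.out = 1
        elif not any(inputs):
            self.out = 0
        # otherwise hold the previous value
        return self.out


class DualRailAnd:
    """Dual-rail AND: one gate drives each output rail (assumed mapping)."""

    def __init__(self):
        # rail1 (result is 1): both operands are DATA1, i.e. a TH22 gate
        self.z1 = HysteresisGate(lambda a1, b1: a1 and b1)
        # rail0 (result is 0): every complete input pair containing a DATA0
        self.z0 = HysteresisGate(
            lambda a0, b0, a1, b1: (a0 and b0) or (a0 and b1) or (a1 and b0)
        )

    def step(self, a, b):
        a0, a1 = a
        b0, b1 = b
        return (self.z0.step(a0, b0, a1, b1), self.z1.step(a1, b1))


and_gate = DualRailAnd()
results = {}
for x in (0, 1):
    for y in (0, 1):
        and_gate.step(NULL, NULL)                       # NULL wavefront
        results[(x, y)] = decode(and_gate.step(encode(x), encode(y)))
```

One single-rail AND becomes two state-holding gates and twice the wiring, which gives some intuition for why dual-rail designs occupy more area than their single-rail equivalents.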

Due to the size expansion from single-rail to dual-rail logic operation and the sub-optimal mapping optimisation in VTR, the average size of an asynchronous benchmark mapped onto the NCL-based reconfigurable system is typically in the order of nine times larger than the equivalent synchronous design mapped onto conventional FPGA devices. However, using a set of standard benchmarks, the NCL-based reconfigurable system shows an improvement in input to output latency of 51% on average compared to the synchronous case.

Finally, an existing Virtual Channel Network-on-Chip (NoC) Router was implemented in both a synchronous (FPGA) style and an asynchronous style using the CAD flow, and its performance was analysed and compared. This router system is intended to provide an improved interconnection solution to current FPGA systems, and therefore overall latency is an important consideration. The specific NoC architecture was selected as its five sub-modules have different circuit sizes and algorithmic complexities, and it is, therefore, able to demonstrate a range of area and latency behaviours. By tuning the various parameters controlling their size and depth on three of the sub-modules (input module, VC allocator and switch allocator), a range of sub-modules of varying sizes and complexity were derived. A best-case latency reduction of 20% to 30% was found for the NRLB compared to the synchronous case, with the percentage improvement increasing with the module complexity. The final two modules (output module and crossbar) exhibit small gate delays compared to the others as well as a small number of pipeline stages. As a result, they show a similar or even worse latency performance on the NRLB architecture. In general, all five sub-modules show the expected area ratios, from 8x to 12x larger than the synchronous case. Finally, a complete NoC router was implemented on the NCL-based reconfigurable system as an example of a medium to large scale hardware design. In this case, a 50% latency improvement was observed, similar to the previous benchmark tests. These experiments indicate that the latency benefit of this asynchronous reconfigurable approach, when compared to synchronous FPGA systems, increases with the size of the designs and the complexity of the algorithms.
Degree Doctor of Philosophy (PhD)
Institution RMIT University
School, Department or Centre Engineering
Subjects Circuits and Systems
Keyword(s) NCL
programmable logic
network on chip
FPGA
asynchronous design
null convention logic
Created: Fri, 10 Jan 2020, 10:47:53 EST by Keely Chapman