Architecture Synthesis of High Performance Application-Specific Processors

Mauricio Breternitz Jr

doi:10.5281/zenodo.1035864

Published April 21, 1984 | Version v1

Thesis Open

Mauricio Breternitz Jr¹

1. Carnegie-Mellon University

Contributors

Supervisor:

Shen, John Paul

Abstract

A new method for designing Application-Specific Processors (ASPs) for computation-intensive scientific and/or embedded applications is presented. Target application areas include scientific and engineering programs and mission-oriented signal-processing systems requiring very high numerical computation and memory bandwidths. The application code, written in a conventional high-level language (HLL) such as FORTRAN or C, is the input to the synthesis process. The latest powerful VLSI chips are used as the primitive building blocks for design implementation. The eventual performance of the application-specific processor in executing the application code is the primary goal of the synthesis task. Advanced code scheduling techniques that go beyond basic block boundaries are employed to achieve high performance via exploitation of fine-grain parallelism. The Application-Specific Processor Design (ASPD) method divides the task of designing a special-purpose processor architecture into Specification Optimization (behavioral) and Implementation Optimization (structural) phases. An architectural template resembling a scalable Very Long Instruction Word (VLIW) processor and a suite of compilation tools are used to generate an optimized processor specification. The designer quickly explores various cost versus performance tradeoff points by performing repeated compilation for scaled architectures. The powerful microcode compilation techniques of Percolation Scheduling and Enhanced Pipeline Scheduling extract and enhance parallelism in the application object code to generate highly parallelized code, which serves as the optimized specification for the architecture. Further performance/efficiency enhancement is obtained in Implementation Optimization by tailoring the implementation template to the execution requirements of the optimized processor specification. A scalable implementation template constrains the implementation style.
Graph-coloring algorithms that exploit special graph characteristics are used to minimize the amount of hardware needed to support execution of the optimized application microcode without impairing code performance. Compilation techniques that allocate data over multiple memory banks are used to enhance concurrent access. The entire architecture synthesis procedure has been implemented and applied to numerous examples. Speedups in the range of 2.6 to 7.7 over contemporary RISC processors have been obtained. The computation times needed for the synthesis of these examples are on the order of a few seconds.
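
As a rough illustration of how graph coloring can minimize hardware without lengthening a schedule, the sketch below colors an interval graph of value lifetimes. The abstract does not say which special graph characteristics the thesis exploits, so the interval-graph structure, the function name `allocate_registers`, and the half-open lifetime encoding are all assumptions made for this example rather than the thesis's actual algorithm. For interval graphs, a greedy pass in order of start time uses the minimum possible number of colors (here, registers).

```python
import heapq


def allocate_registers(lifetimes):
    """Assign values to registers by greedy interval-graph coloring.

    lifetimes: dict mapping a value name to a half-open (start, end)
    interval in schedule cycles. This encoding is a hypothetical choice
    for illustration, not taken from the thesis.
    Returns (assignment, register_count).
    """
    assignment = {}
    free = []      # min-heap of register indices that can be reused
    active = []    # min-heap of (end_cycle, register) for live values
    next_reg = 0

    # Visit values in order of the cycle in which they become live.
    for name, (start, end) in sorted(lifetimes.items(), key=lambda kv: kv[1][0]):
        # Release every register whose value is dead by this cycle.
        while active and active[0][0] <= start:
            _, reg = heapq.heappop(active)
            heapq.heappush(free, reg)
        if free:
            reg = heapq.heappop(free)
        else:
            reg = next_reg
            next_reg += 1
        assignment[name] = reg
        heapq.heappush(active, (end, reg))

    return assignment, next_reg


if __name__ == "__main__":
    example = {"a": (0, 3), "b": (1, 4), "c": (3, 5), "d": (4, 6)}
    regs, count = allocate_registers(example)
    print(regs, count)
```

In this example at most two values are live in any cycle, so two registers suffice and the schedule itself is left untouched. The same idea would apply to any resource whose usage can be described by non-overlapping time intervals.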

Files

Name	Size	Checksum
mauricioBreternitzPhdThesis.pdf	8.3 MB	md5:8db0678662838561bd5a469f9ac3de34

