![Page 1: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/1.jpg)
Hybridthreads Compiler
Fabrice Baijot
Jim Stevens
![Page 2: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/2.jpg)
What we will talk about
• Give an overview of the current state of affairs in the Hybridthreads Compiler project.
• Answer the following questions:
– Why are you writing a compiler?
– How do you represent C in hardware?
– What is HIF?
– How does the compilation process work?
![Page 3: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/3.jpg)
Hthreads System
• OS kernel that runs on hybrid CPU/FPGAs
• Allows for creation of threads in both software and hardware.
• Hardware threads are custom hardware implementations of threads.
• Assume everyone here knows enough about Hthreads for us to go on.
![Page 4: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/4.jpg)
Hthreads Diagram
![Page 5: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/5.jpg)
HWTI
• Provides primitives that allow hardware threads to communicate with the rest of the system (read, write, pthread_mutex_lock)
• Provides a local memory for the hardware thread that is integrated into the global address space.
• Provides primitives for using the local memory to implement a function call stack and a heap.
![Page 6: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/6.jpg)
HWTI Diagram
![Page 7: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/7.jpg)
Why are we writing a compiler?
![Page 8: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/8.jpg)
Reason 1: Open hthreads to Software Developers
• Allow software developers to create hardware thread cores without needing to know details of hardware design.
• Developers expect every new architecture to have a C compiler.
![Page 9: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/9.jpg)
Reason 2: Design space exploration
• Having a C to VHDL compilation system allows for threads to simply be recompiled to cross the software/hardware boundary.
• Developers can test more system configurations if they do not have to write VHDL for hardware threads.
![Page 10: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/10.jpg)
Reason 3: Physical Thread Level Parallelism of C Threads
• C has limited fine-grained parallelism due to the underlying von Neumann model, and there is not a lot we can do in the compiler about that.
• Other C to hardware projects in the past have focused on creating accelerators with fine-grained parallelism, often modifying the input language to increase the available fine-grained parallelism.
• We are attacking the coarse-grained parallelism that is made available by the user when threads are created.
• Input language is NOT modified!!!
• Independent, physically concurrent hardware threads are highly deterministic and potentially high performance.
![Page 11: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/11.jpg)
Reason 4: Potential Base for Partial Reconfiguration
• User can create hardware and software images for all threads in the system.
• Use an hthread attribute or some other mechanism to pick where a thread is executed at run time.
• System reconfigures itself to multiplex different hardware threads into the FPGA fabric during the life of the system.
![Page 12: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/12.jpg)
How to represent C in hardware
![Page 13: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/13.jpg)
Requirements
• To represent C in hardware, you need:
– Primitive arithmetic/logical operations
– Primitive control flow operations
– Memory Model
– Function Call Model
![Page 14: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/14.jpg)
Primitive Operations
• Most of C's primitive arithmetic/logical operations can be directly represented in synthesizable VHDL
– Integer add, subtract, and, or, etc.
• More complicated primitives can be implemented by using simple state machines or instantiating vendor-provided IP
– Integer divide, floating point ops, etc.
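As an illustration of how a primitive such as integer divide could be handled by a simple state machine, here is a minimal Python sketch of an unsigned restoring divider that produces one quotient bit per simulated clock cycle. The function name, the state names, and the choice of restoring division are assumptions made for this example; the slides do not say which algorithm or IP core the compiler actually uses.

```python
def divide_fsm(dividend, divisor, width=32):
    """Unsigned restoring division, one quotient bit per simulated cycle.

    Hypothetical sketch: the states (INIT, STEP, DONE) are illustrative and
    are not taken from the Hybridthreads compiler.
    """
    mask = (1 << width) - 1
    dividend &= mask
    divisor &= mask
    if divisor == 0:
        raise ZeroDivisionError("divisor is zero")

    state = "INIT"
    remainder = quotient = bit = 0
    cycles = 0
    while state != "DONE":
        if state == "INIT":
            # Clear the working registers and start at the most significant bit.
            remainder, quotient, bit = 0, 0, width - 1
            state = "STEP"
        elif state == "STEP":
            # Shift in the next dividend bit, then subtract if it fits.
            remainder = (remainder << 1) | ((dividend >> bit) & 1)
            quotient <<= 1
            if remainder >= divisor:
                remainder -= divisor
                quotient |= 1
            state = "DONE" if bit == 0 else "STEP"
            bit -= 1
        cycles += 1
    return quotient, remainder, cycles
```

With the default width, a call such as `divide_fsm(100, 7)` spends one cycle in INIT and 32 cycles in STEP, which shows why such a primitive costs many states while an add or subtract fits in one.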
![Page 15: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/15.jpg)
Control Flow
• Use a state machine to handle all control flow operations.
• If statements, all forms of loops, and unconditional branches (goto, continue, and break) can all be supported with the state machine based model.
• This is highly sequential, but we do not care since we are worried about TLP rather than ILP.
• Still allows for basic ILP optimizations (schedule multiple operations in the same state).
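The state machine model above can be sketched in Python by giving each basic block of a small loop its own state. The example below models the C loop `for (i = 1; i <= n; i++) total += i;`. The state names and the function are hypothetical and only illustrate the idea; they are not output of the compiler. The BODY state updates `total` and `i` together, which mirrors scheduling two independent operations in the same state.

```python
def sum_to_n_fsm(n):
    """Run a three-state machine that sums 1..n and record the visited states."""
    state = "ENTRY"
    total = i = 0
    trace = []
    while state != "EXIT":
        trace.append(state)
        if state == "ENTRY":
            # Loop initialisation.
            total, i = 0, 1
            state = "TEST"
        elif state == "TEST":
            # Conditional branch: choose the next state from the loop test.
            state = "BODY" if i <= n else "EXIT"
        elif state == "BODY":
            # Both registers are updated from their old values in one state.
            total, i = total + i, i + 1
            state = "TEST"
    return total, trace
```

Every `if`, loop, `break`, `continue`, or `goto` reduces to the same thing in this model: an assignment to the next-state variable.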
![Page 16: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/16.jpg)
Memory Model
• Since we are running in the hthreads system, global memory access is done via bus transactions.
• Thread's stack and heap are in a BRAM that is part of the hardware thread.
• Since BRAM has such low latency, there is no need for the thread to have a cache.
• Hardware thread interface provides unified API for accessing both the main memory and the local BRAM.
• Pointers have been a problem in the past for C to hardware, but are trivial in this model.
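A rough Python model of this memory scheme is shown below: one `read`/`write` interface decides from the address whether an access hits the local BRAM or becomes a bus transaction. The class name, the base address `LOCAL_BASE`, and the word size are assumptions chosen for the sketch and do not come from the HWTI specification.

```python
class UnifiedMemory:
    """Toy model of one address space covering local BRAM and global memory."""

    LOCAL_BASE = 0x80000000  # hypothetical base address of the thread's BRAM
    WORD = 4                 # bytes per 32-bit word

    def __init__(self, local_words=1024):
        self.local = [0] * local_words  # stands in for the BRAM
        self.global_mem = {}            # stands in for main memory
        self.bus_transactions = 0       # counts simulated bus accesses

    def _local_index(self, addr):
        """Return the BRAM word index for addr, or None if addr is global."""
        if addr % self.WORD:
            raise ValueError("unaligned word access")
        offset = addr - self.LOCAL_BASE
        if 0 <= offset < self.WORD * len(self.local):
            return offset // self.WORD
        return None

    def read(self, addr):
        index = self._local_index(addr)
        if index is not None:
            return self.local[index]
        self.bus_transactions += 1
        return self.global_mem.get(addr, 0)

    def write(self, addr, value):
        index = self._local_index(addr)
        if index is not None:
            self.local[index] = value
        else:
            self.bus_transactions += 1
            self.global_mem[addr] = value
```

In this model a pointer is just an address value, so dereferencing a pointer held on the local stack is `mem.read(mem.read(UnifiedMemory.LOCAL_BASE))`, whether the target lives in BRAM or in main memory.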
![Page 17: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/17.jpg)
Function Call Model
• Thread has a function call stack and can implement function calls using standard RISC style calling conventions.
• State machines handle everything (push parameters and return address, update stack pointer, jump to function start, etc.)
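The calling convention described above can be sketched as a state machine with an explicit stack. The Python example below computes a factorial by pushing a parameter and a return address for each call, jumping to the function's entry state, and popping the frame on return. The frame layout (two words per frame), the state names, and the stack size are assumptions made for illustration, not the layout the compiler generates.

```python
def factorial_fsm(n, stack_words=256):
    """Factorial via an explicit call stack driven by a state machine."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if 2 * (n + 1) > stack_words:
        raise OverflowError("call stack too small for this recursion depth")

    stack = [None] * stack_words  # local memory standing in for the BRAM
    sp = 0                        # stack pointer: index of the next free slot
    ret_val = 0                   # return-value register

    # Initial call: push the parameter and the return address (a state name).
    stack[sp], stack[sp + 1] = n, "HALT"
    sp += 2
    state = "ENTRY"

    while state != "HALT":
        if state == "ENTRY":
            param = stack[sp - 2]
            if param == 0:
                ret_val = 1
                state = "RETURN"
            else:
                # Recursive call: push the callee's frame, jump to the start.
                stack[sp], stack[sp + 1] = param - 1, "AFTER_CALL"
                sp += 2
                state = "ENTRY"
        elif state == "AFTER_CALL":
            # Back in the caller's frame: multiply by the caller's parameter.
            ret_val = ret_val * stack[sp - 2]
            state = "RETURN"
        elif state == "RETURN":
            # Pop the frame and jump to the saved return address.
            state = stack[sp - 1]
            sp -= 2
    return ret_val
```

Because the return address is stored as data on the stack, recursion needs no special support beyond the stack itself.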
![Page 18: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/18.jpg)
Compilation Process
• Compiler is divided into two modules
– hifgen – Compiles C to HIF
– hif2vhdl – Compiles HIF to VHDL
• Produces both .hif and .o files for each C file.
![Page 19: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/19.jpg)
Flow diagram of compiler
![Page 20: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/20.jpg)
HIF
• Defined the Hardware Intermediate Form to act as a linear representation of GCC's architecture independent GIMPLE IF.
• Acts as both an intermediate form for compilation as well as an object file.
• Has three address code style with a few additions.
![Page 21: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/21.jpg)
Example HIF Program

    #### Created from factorial.c line 2
    @function @signed @int 32 factorial

    # Argument declaration section
    @param @signed @int 32 hif_n

    # Variable declaration section
    @declare @signed @int 32 hif_D.1279
    @declare @signed @int 32 hif_D.1280
    @declare @signed @int 32 hif_D.1281

    @if hif_n != 0 @goto hif_label0
    @mov hif_D.1279 1
    @return hif_D.1279
    @goto hif_label1
    @label hif_label0 : #else
    @sub hif_D.1280 hif_n 1
    @call factorial hif_D.1280
    @castmov hif_D.1281 @returnVal
    @mul hif_D.1279 hif_D.1281 hif_n
    @return hif_D.1279
    @label hif_label1 : #end if

    int factorial(int n)
    {
        if (n == 0) return 1;
        else return factorial(n-1)*n;
    }
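To make the three-address structure of the example concrete, the following Python sketch splits the HIF text from this slide into `(opcode, operands)` pairs. It is inferred only from this one example and is not the real HIF grammar: it assumes `#` always starts a comment and that tokens are separated by whitespace. The actual hif2vhdl front end is built on PLY; this sketch uses plain string splitting to stay self-contained.

```python
FACTORIAL_HIF = """\
#### Created from factorial.c line 2
@function @signed @int 32 factorial

# Argument declaration section
@param @signed @int 32 hif_n

# Variable declaration section
@declare @signed @int 32 hif_D.1279
@declare @signed @int 32 hif_D.1280
@declare @signed @int 32 hif_D.1281

@if hif_n != 0 @goto hif_label0
@mov hif_D.1279 1
@return hif_D.1279
@goto hif_label1
@label hif_label0 : #else
@sub hif_D.1280 hif_n 1
@call factorial hif_D.1280
@castmov hif_D.1281 @returnVal
@mul hif_D.1279 hif_D.1281 hif_n
@return hif_D.1279
@label hif_label1 : #end if
"""


def parse_hif(text):
    """Split HIF text into (opcode, operands) pairs, dropping comments.

    Assumption: '#' starts a comment anywhere on a line, as in the example.
    """
    instructions = []
    for line in text.splitlines():
        code = line.split("#", 1)[0].strip()
        if not code:
            continue  # blank line or comment-only line
        opcode, *operands = code.split()
        instructions.append((opcode.lstrip("@"), operands))
    return instructions
```

Each resulting pair has the shape "operation, destination, sources", for example `("sub", ["hif_D.1280", "hif_n", "1"])`, which is what makes the form easy to map onto states later.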
![Page 22: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/22.jpg)
hifgen Requirements
• Must generate proper HIF to implement all of the C language
– We are ignoring other GCC input languages for now, although it is possible to support them.
• Module must be flexible enough to adjust to any changes in GIMPLE and GCC
• Must not interfere with normal GCC compilation
![Page 23: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/23.jpg)
hifgen Implementation
• Added a module into GCC that implements a walk of the GIMPLE tree that generates HIF as a side effect.
• Use predefined GCC macros to access GIMPLE data structure.
• Reorders statements to match HIF syntax.
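The idea of a tree walk that emits HIF as a side effect can be sketched in Python with nested tuples standing in for GIMPLE expression nodes. This is not GCC code and does not use the GCC macros mentioned above. The temporary names (`hif_t0`, ...) are hypothetical, and only the `@sub` and `@mul` mnemonics and their "destination first" operand order are taken from the example HIF program.

```python
import itertools

# Operators mapped to mnemonics that appear in the example HIF program.
OPCODES = {"-": "@sub", "*": "@mul"}


def emit_expr(node, out, temps):
    """Post-order walk: emit code for the children, then for this node.

    node is either a leaf (a variable name or constant) or a tuple
    (operator, left, right). Returns the name that holds the node's value.
    """
    if not isinstance(node, tuple):
        return str(node)
    op, left, right = node
    lhs = emit_expr(left, out, temps)
    rhs = emit_expr(right, out, temps)
    dest = f"hif_t{next(temps)}"  # hypothetical temporary name
    out.append(f"{OPCODES[op]} {dest} {lhs} {rhs}")
    return dest


def generate(node):
    """Return the emitted lines and the name holding the final result."""
    out = []
    result = emit_expr(node, out, itertools.count())
    return out, result
```

For the expression `(n - 1) * n`, written as `("*", ("-", "hif_n", 1), "hif_n")`, the walk emits the subtraction before the multiplication, which is the kind of statement ordering a linear form requires.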
![Page 24: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/24.jpg)
hif2vhdl Requirements
• Implement the semantics of HIF in VHDL hardware threads.
• Must generate efficient finite state machines.
– Otherwise there is no point in moving a thread to hardware.
• Implement HWTI protocol properly.
• Must be flexible and extensible enough to be used as a research platform.
![Page 25: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/25.jpg)
hif2vhdl Implementation
• Implemented a proof-of-concept version in Python
– Chose Python for flexibility and development speed.
– May implement in another language (e.g. Haskell, OCaml, or C) in the future if there is a very good reason.
• Using PLY (Python Lex-Yacc) as the basis for the parser.
![Page 26: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/26.jpg)
User’s View
• HTC script provides a GCC-like interface for our compiler.
• To compile from C to HIF:
    htc -c thread1.c
• To compile from HIF to VHDL:
    htc -o thread1.vhd -m mainfunction thread1.hif
![Page 27: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/27.jpg)
Current Complete Version
• Completed for Senior Honors Project
• Works with most C constructs
• Installed on the ITTC network
– To use, source the following file into a bash environment:
    .../projects/hybridthreads/tools/hif2vhdl_v1/init.sh
– This sets up the 'htc' command.
![Page 28: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/28.jpg)
Limitations of Current Version
• Types missing
– Only supports 32-bit signed integers.
– All integer types get mapped into 32-bit signed integers.
– Floating point is not supported at all.
• Function pointers not supported
• Pointer aliasing partially implemented.
• Translation from C to HIF fails in some corner cases.
![Page 29: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/29.jpg)
Ground Up Redesign
• Redesigned HIF to support all types that are in GIMPLE
– Primitives have a bit length, type modifiers, and an integer/floating point specifier
– Also has support for arrays, pointers, structs, etc.
• Redesigning the compiler structure to reflect our experience and new knowledge
– The first version of the compiler was implemented before we had a chance to study compilation in detail.
![Page 30: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/30.jpg)
Current Development Work
• Have implemented hifgen module for new HIF language and are currently testing it.
• Have completed front end of new hif2vhdl module.
• Next step is to create the framework for the middle end of the hif2vhdl module that will later be used to implement optimizations.
– HIF CFG
– HIF PCFG
• Also working on implementation details of representing new HIF in hardware.
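The slides do not describe how the HIF CFG is built, so the following is only a sketch of the textbook approach: split a linear instruction list into basic blocks at labels and after branches, then connect the blocks. It assumes instructions are `(opcode, operands)` pairs, that the last operand of `if` and `goto` is the target label, and that `return` ends a block. All of these are assumptions drawn from the example HIF program, not from the compiler's implementation.

```python
def split_blocks(instrs):
    """Cut a linear instruction list into basic blocks."""
    blocks, current = [], []
    for op, args in instrs:
        if op == "label" and current:
            # A label starts a new block because it can be a jump target.
            blocks.append(current)
            current = []
        current.append((op, args))
        if op in ("if", "goto", "return"):
            # A branch or return ends the current block.
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


def build_cfg(instrs):
    """Return (blocks, edges), where edges maps block index to successors."""
    blocks = split_blocks(instrs)
    label_to_block = {
        block[0][1][0]: index
        for index, block in enumerate(blocks)
        if block[0][0] == "label"
    }
    edges = {}
    for index, block in enumerate(blocks):
        op, args = block[-1]
        successors = []
        if op in ("if", "goto"):
            successors.append(label_to_block[args[-1]])  # taken branch
        if op not in ("goto", "return") and index + 1 < len(blocks):
            successors.append(index + 1)                 # fall-through
        edges[index] = successors
    return blocks, edges
```

A graph of this kind is the usual starting point for the classic optimizations listed under future work, since most of them are defined per basic block or over the edges between blocks.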
![Page 31: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/31.jpg)
Future Work
• Add optimizers into hif2vhdl module
– Classic compiler optimizations
– Specialized hardware thread optimizations
• Explore different implementation techniques for hardware threads.
• Partial reconfiguration research.
• Write papers!!!
![Page 32: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/32.jpg)
Possible Hardware Thread Model
![Page 33: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/33.jpg)
Conclusions
• Designed representations for C in hardware.
• Created the Hardware Intermediate Form.
• Developed hifgen and hif2vhdl modules.
• Currently working to create a feature-complete system.
![Page 34: Hybridthreads Compiler Fabrice Baijot Jim Stevens](https://reader035.vdocuments.net/reader035/viewer/2022062719/56649ef25503460f94c04408/html5/thumbnails/34.jpg)
Questions?