LLVM Notes and Resources
This is the CMPT 383 page for LLVM (September 2015).
LLVM is a compiler infrastructure that can be used with many different programming languages and target architectures.
LLVM IR: A Common Intermediate Representation
- A separate front-end for each programming language parses programs in that language and translates them to LLVM IR.
  - C/C++ to LLVM IR (Clang)
  - D to LLVM IR
  - Haskell to LLVM IR
  - Swift to LLVM IR
- Separate back-ends for each target architecture translate the platform-independent IR into platform-specific assembly code.
  - LLVM IR to Intel 32-bit architecture (x86)
  - LLVM IR to Intel/AMD 64-bit architecture
  - LLVM IR to PowerPC
  - LLVM IR to Cell Broadband Engine (PlayStation)
  - LLVM IR to ARM (mobile CPUs)
  - LLVM IR to Nvidia GPUs
- A common IR reduces the problem of building \(m \times n\) compilers (one for each pairing of \(m\) languages with \(n\) targets) to building \(m + n\) components: \(m\) front-ends and \(n\) back-ends.
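The saving can be checked with a little arithmetic. The sketch below counts both cases for the front-ends and back-ends listed above; the function names are illustrative only and not part of LLVM.

```python
# Languages and targets taken from the lists above.
FRONT_ENDS = ["C/C++ (Clang)", "D", "Haskell", "Swift"]
BACK_ENDS = ["x86", "x86-64", "PowerPC", "Cell", "ARM", "Nvidia GPU"]


def direct_compilers(m, n):
    """Translators needed if every language targets every architecture directly."""
    return m * n


def shared_ir_components(m, n):
    """Components needed if all front-ends and back-ends meet at one common IR."""
    return m + n


m, n = len(FRONT_ENDS), len(BACK_ENDS)
print(direct_compilers(m, n))      # 24
print(shared_ir_components(m, n))  # 10
```

Adding a fifth language then costs one new front-end rather than six new compilers.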
LLVM IR: Not Just Internal
- Many compiler systems have internal intermediate representations.
- LLVM IR has a defined printable syntax and semantics as given by the LLVM Language Reference Manual.
- LLVM IR can actually be used to directly write programs!
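As a rough illustration of writing a program directly in LLVM IR, the sketch below holds a small hand-written module as plain text and saves it to a file. The Python wrapper, the helper name `save_ir`, and the file name in the comment are assumptions made for this example; only the text inside the string is LLVM IR.

```python
# A complete, hand-written LLVM IR module held as plain text.
# It defines main(), which computes 40 + 2 and returns the result.
HAND_WRITTEN_IR = """\
define i32 @main() {
entry:
  %x = add i32 40, 2
  ret i32 %x
}
"""


def save_ir(path, text=HAND_WRITTEN_IR):
    """Write the IR text to a file (hypothetical helper for this example)."""
    with open(path, "w") as f:
        f.write(text)


if __name__ == "__main__":
    # "main.ll" is an arbitrary file name chosen for this sketch.
    save_ir("main.ll")
    print(HAND_WRITTEN_IR)
```

A file saved this way can then be handed to LLVM tools such as `lli` or `llc`; the exact invocation depends on the installed LLVM version.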
Rich, Strongly-Typed IR
The first-class types represent values that can be stored in registers and computed by instructions.
- Integers of any width in bits.
  - i1: 1-bit integers, used for Boolean values.
  - i8, i16, i32, i64: common integer types.
  - i128, i256: wide integers using SIMD registers.
  - i43: 43-bit integers are syntactically valid, but not directly supported by typical back-ends.
- Floating point types
  - half: 16 bits
  - float: 32 bits
  - double: 64 bits
  - fp128: 128 bits
- Pointer types
  - i8*: byte pointers
  - double*
- Vectors of integers, floats
  - abstraction of SIMD registers, e.g., 128-bit SSE2 registers on Intel
  - treated as first-class types
  - <16 x i8>: 16 8-bit integers
  - <8 x i16>: 8 16-bit integers
  - <2 x i64>
  - <4 x float>
  - <2 x i8*>
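To make the fixed-width integer and vector types more concrete, here is a minimal Python model of how a plain `add` behaves on an N-bit integer and on a vector of such integers. It is a sketch of the wraparound arithmetic only, not of how LLVM implements these types, and all function names are invented for the example.

```python
def wrap_unsigned(value, bits):
    """Reduce value modulo 2**bits, the bit pattern an iN register holds."""
    return value & ((1 << bits) - 1)


def as_signed(value, bits):
    """Read an N-bit pattern as a two's-complement signed integer."""
    value = wrap_unsigned(value, bits)
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def add_int(a, b, bits):
    """Model `add iN a, b`: the result wraps around at N bits."""
    return wrap_unsigned(a + b, bits)


def add_vector(xs, ys, bits):
    """Model `add <K x iN> xs, ys`: each lane is added independently."""
    if len(xs) != len(ys):
        raise ValueError("vector operands must have the same number of lanes")
    return [add_int(x, y, bits) for x, y in zip(xs, ys)]


print(add_int(200, 100, 8))                          # 44, since 300 mod 256 = 44
print(add_int(1, 1, 1))                              # 0, i1 addition acts like XOR
print(as_signed(add_int(127, 1, 8), 8))              # -128
print(add_vector([250, 1, 2, 3], [10, 1, 1, 1], 8))  # [4, 2, 3, 4]
```

The same helpers work for unusual widths such as i43, which is one way to see why any width is meaningful in the IR even when hardware does not support it directly.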
Instructions
- The statements of the LLVM language are instructions operating on the first-class types.
- Most instructions are in 3-operand form: an operation on two register values produces a single result value.
- Control flow instructions are limited:
  - branch instructions to a single label
  - conditional branch instructions to one of two labels depending on an i1 value
  - indirect branch instructions through address calculation
  - call instructions
- Well-formed LLVM IR is in static single assignment (SSA) form.
  - There is only one assignment (definition) of any given variable.
  - Phi nodes merge the values arriving along different control-flow paths into a single new variable.
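The role of a phi node is easiest to see on a small function with two paths. The sketch below models a maximum function in Python, with the corresponding LLVM IR shown in comments. The function, block, and register names are made up for this example, and the Python `phi` helper is only a model of the selection a phi node performs.

```python
def phi(predecessor, incoming):
    """Model a phi node: pick the value paired with the block we arrived from."""
    return incoming[predecessor]


def ssa_max(a, b):
    # entry:
    #   %cmp = icmp sgt i32 %a, %b
    #   br i1 %cmp, label %then, label %else
    cmp = a > b
    predecessor = "then" if cmp else "else"

    # then:
    #   br label %done
    # else:
    #   br label %done

    # done:
    #   %max = phi i32 [ %a, %then ], [ %b, %else ]
    #   ret i32 %max
    max_value = phi(predecessor, {"then": a, "else": b})
    return max_value


print(ssa_max(3, 7))  # 7
print(ssa_max(9, 2))  # 9
```

Each name (`cmp`, `predecessor`, `max_value`) is assigned exactly once, mirroring the single-assignment rule; the choice between `a` and `b` is made by the phi rather than by reassigning a variable on each path.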
Generating LLVM IR
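As a small sketch of what generating IR involves, the following Python function flattens a nested arithmetic expression into a sequence of 3-operand instructions, giving every intermediate result a fresh name so that each is defined exactly once. The tuple encoding of expressions, the `%t` naming scheme, and the function name are assumptions made for this example; real front-ends typically build IR through LLVM's own APIs rather than by concatenating text.

```python
import itertools

# Map source-level operators to LLVM IR integer opcodes.
OPCODES = {"+": "add", "-": "sub", "*": "mul"}


def emit_expression(expr, ty="i32"):
    """Translate a nested tuple expression into textual 3-operand instructions.

    expr is a register name such as "%a", an integer constant,
    or a tuple (operator, left, right).
    Returns (instruction_lines, operand_holding_the_result).
    """
    counter = itertools.count(1)
    lines = []

    def walk(node):
        if isinstance(node, tuple):
            op, left, right = node
            lhs = walk(left)
            rhs = walk(right)
            dest = f"%t{next(counter)}"  # fresh name: defined exactly once
            lines.append(f"{dest} = {OPCODES[op]} {ty} {lhs}, {rhs}")
            return dest
        return str(node)  # a register name or constant is already an operand

    result = walk(expr)
    return lines, result


# (a + b) * (c - 1)
lines, result = emit_expression(("*", ("+", "%a", "%b"), ("-", "%c", 1)))
for line in lines:
    print(line)
# %t1 = add i32 %a, %b
# %t2 = sub i32 %c, 1
# %t3 = mul i32 %t1, %t2
```

Because a new temporary is created for every operation, the output is in SSA form without any extra work; phi nodes only become necessary once control flow is involved.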
Updated Thu Oct. 22 2015, 09:32 by cameron.