Just-in-time (JIT) compilation is a dynamic compilation technique that converts source code or an intermediate representation (such as bytecode) into machine code while the program is running. Unlike traditional static compilation, a JIT compiler selectively compiles code fragments at runtime and can optimize them according to the current execution context.
Below we use Rust to build a minimal JIT example, based on rustyjit.
Basic Idea
- At runtime, allocate a page-aligned (4 KiB) block of memory called code_buffer, for example with mmap or posix_memalign, and mark it readable, writable, and executable;
- Emit the binary encoding of a host instruction sequence into code_buffer;
- Cast the address of the emitted code's entry point in code_buffer to a function pointer;
- Execute that function pointer.
Prerequisites
FFI Programming
Allocating page-aligned, executable memory requires calling libc interfaces, which in Rust means going through FFI (see Rust FFI Programming - libc crate). In short, we depend on the libc crate, which already provides Rust bindings for the functions we need. The main ones are:
- mmap()
- memset()
- mprotect()
- posix_memalign()
Note that some of these interfaces are exposed in Rust as wrapped variants, usually associated functions, so they cannot always be called under their raw C API names.
Trait
A trait can tell the compiler what characteristics a particular type has and what it can share with other types. We can use traits to define this shared behavior in an abstract way, and we can also use trait bounds to require that a generic type implement some specific behavior.
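Because Rust puts a strong emphasis on safety, a raw pointer cannot be indexed directly. We therefore implement the Index and IndexMut traits for our buffer type to get indexed read and write access to the memory.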
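As a small illustration of shared behavior and trait bounds, here is a minimal sketch. The trait name ByteSink and the function emit_all are hypothetical names made up for this example; they are not part of rustyjit or the standard library.

```rust
// Hypothetical trait for illustration: anything that can receive emitted bytes.
trait ByteSink {
    fn push_byte(&mut self, byte: u8);
}

// A plain Vec<u8> is the simplest implementor.
impl ByteSink for Vec<u8> {
    fn push_byte(&mut self, byte: u8) {
        self.push(byte);
    }
}

// The trait bound `S: ByteSink` restricts this generic function to types
// that implement the shared behavior.
fn emit_all<S: ByteSink>(sink: &mut S, bytes: &[u8]) {
    for &b in bytes {
        sink.push_byte(b);
    }
}
```

Calling emit_all(&mut out, &[0x48, 0xc7, 0xc0]) on an empty Vec<u8> named out appends those three bytes to it. Any other type that implements ByteSink could be passed in the same way.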
Building a Minimal JIT
First define a struct that contains a pointer to code_buffer:
struct JitMemory {
    code_buffer: *mut u8,
}
Then we write a new method to allocate and initialize memory:
use std::mem;

// Define the page-size constant
const PAGE_SIZE: usize = 4096;

impl JitMemory {
    fn new(num_pages: usize) -> JitMemory {
        let code_buffer: *mut u8;
        unsafe {
            let size: usize = num_pages * PAGE_SIZE;
            // posix_memalign stores the address of the allocation here
            let mut contents: *mut libc::c_void = std::ptr::null_mut();
            // Allocate memory and align it to the page size
            let rc = libc::posix_memalign(&mut contents, PAGE_SIZE, size);
            assert_eq!(rc, 0, "posix_memalign failed");
            // Set memory permissions to executable, readable, and writable
            let rc = libc::mprotect(
                contents,
                size,
                libc::PROT_EXEC | libc::PROT_READ | libc::PROT_WRITE,
            );
            assert_eq!(rc, 0, "mprotect failed");

            // Initialize memory contents to the x86 'RET' instruction (0xc3)
            libc::memset(contents, 0xc3, size);
            code_buffer = contents as *mut u8;
        }

        JitMemory { code_buffer }
    }
}
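Since new takes a page count rather than a byte count, a caller that knows only the length of its machine code has to round up to whole pages. A minimal sketch of that arithmetic follows; the helper name pages_for is hypothetical, and it assumes the same 4096-byte page size as above.

```rust
// Same page size as in the article (an assumption; real systems may differ).
const PAGE_SIZE: usize = 4096;

/// Number of whole pages needed to hold `code_len` bytes.
/// Always returns at least one page so the buffer is never empty.
fn pages_for(code_len: usize) -> usize {
    let pages = (code_len + PAGE_SIZE - 1) / PAGE_SIZE; // round up
    pages.max(1)
}
```

With this helper, a 7-byte instruction sequence needs one page, and a sequence one byte longer than a page needs two.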
Next we implement indexed access so that instruction bytes can be written into the buffer:
use std::ops::{Index, IndexMut};

// Implement the Index trait to allow indexed access to memory
impl Index<usize> for JitMemory {
    type Output = u8; // Declares the associated Output type for Index<usize> as u8

    fn index(&self, index: usize) -> &u8 {
        // No bounds check: the caller must keep index inside the allocated buffer
        unsafe { &*self.code_buffer.add(index) }
    }
}

// Implement the IndexMut trait to allow indexed mutation of memory
impl IndexMut<usize> for JitMemory {
    fn index_mut(&mut self, index: usize) -> &mut u8 {
        // No bounds check here either
        unsafe { &mut *self.code_buffer.add(index) }
    }
}
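The raw-pointer implementation above performs no bounds checking. To show what the Index and IndexMut traits look like with checking in place, here is a sketch on a Vec-backed stand-in. CheckedBuffer is a hypothetical type for illustration only; its memory is ordinary heap memory and is not executable.

```rust
use std::ops::{Index, IndexMut};

// Hypothetical stand-in for JitMemory: Vec-backed, not executable memory.
struct CheckedBuffer {
    bytes: Vec<u8>,
}

impl CheckedBuffer {
    fn new(size: usize) -> Self {
        // Pre-fill with 0xc3 (x86 RET), like the memset in JitMemory::new.
        CheckedBuffer { bytes: vec![0xc3; size] }
    }
}

impl Index<usize> for CheckedBuffer {
    type Output = u8;

    fn index(&self, index: usize) -> &u8 {
        // Slice indexing panics on an out-of-range index.
        &self.bytes[index]
    }
}

impl IndexMut<usize> for CheckedBuffer {
    fn index_mut(&mut self, index: usize) -> &mut u8 {
        &mut self.bytes[index]
    }
}
```

Writing buf[0] = 0x48 works exactly as with JitMemory, while an index past the end panics instead of touching unrelated memory. The same check could be added to JitMemory by storing the buffer size next to the pointer.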
Finally, we write a simple assembly program, emit it into memory, and run it:
fn run_jit() -> extern "C" fn() -> i64 {
    let mut jit: JitMemory = JitMemory::new(1);

    // Write x86-64 machine code into JIT memory to generate a function that returns 0x30
    jit[0] = 0x48; // mov RAX, 0x30
    jit[1] = 0xc7;
    jit[2] = 0xc0;
    jit[3] = 0x30;
    jit[4] = 0;
    jit[5] = 0;
    jit[6] = 0;
    // jit[7] is still 0xc3 (RET) from the memset in new, so the function returns here

    // Convert the memory pointer into a function pointer and return it.
    // extern "C" pins the calling convention: the result comes back in RAX.
    unsafe { mem::transmute(jit.code_buffer) }
}

fn main() {
    // Generate and call the JIT-compiled function, then print the result
    let fun: extern "C" fn() -> i64 = run_jit();
    println!("{:#x}", fun());
}
Compile and run it to see the output:
$ cargo run
...
0x30
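The seven bytes written by hand above are the x86-64 encoding of mov rax, imm32: a REX.W prefix (0x48), opcode 0xc7, ModRM byte 0xc0 selecting RAX, and a 32-bit little-endian immediate that the CPU sign-extends to 64 bits. As a sketch, the encoding can be generated for any immediate; the function name encode_mov_rax_imm32_ret is hypothetical, and it only builds the bytes without executing them.

```rust
/// Encode `mov rax, imm32` followed by `ret` for x86-64.
/// Layout: REX.W (0x48), opcode 0xc7, ModRM 0xc0 (register RAX),
/// four immediate bytes, then 0xc3 (RET).
fn encode_mov_rax_imm32_ret(imm: i32) -> [u8; 8] {
    let i = imm.to_le_bytes(); // x86 immediates are stored little-endian
    [0x48, 0xc7, 0xc0, i[0], i[1], i[2], i[3], 0xc3]
}
```

For an immediate of 0x30 this yields the same bytes as the hand-written sequence plus the trailing RET, so the result could be copied into code_buffer byte by byte through the IndexMut implementation.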