RANDOM CHAOS

Jank gains a Clojure-aware custom IR to close the gap with the JVM

via Hacker News

Original source

Jank now has its own custom IR

Jank, a native Clojure dialect compiled through LLVM, has shipped its own intermediate representation tailored to Clojure semantics. Until now, the project leaned entirely on LLVM IR for optimization. LLVM, however, operates far below the level of vars, transients, persistent data structures, and lazy sequences, so the heavy polymorphism and indirection in Clojure code left it with little room to optimize. The new IR sits well above both LLVM IR and JVM bytecode and encodes Clojure-specific concepts directly, so the compiler can reason about them.

The representation is SSA-based and organized as a control flow graph of basic blocks, each terminated by a single branching, returning, or throwing instruction. It is stored as C++ data structures in memory but can be rendered as Clojure data for debugging and testing, and it explicitly lifts vars and constants while exposing operations like var-deref and dynamic-call. C++ codegen now flows from this IR rather than directly from the AST, with generated variable names mirroring their IR counterparts.
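To make that shape concrete, here is a minimal C++ sketch of an SSA-style control flow graph with one terminator per basic block and a renderer that prints blocks as Clojure-style maps. Everything in it is an assumption for illustration: the `sketch` namespace, the struct names, the `build_example` helper, and the map keys (`:op`, `:name`, and so on) are hypothetical and are not taken from jank's source. Only the concepts (basic blocks, a single terminator, `var-deref`, `dynamic-call`, rendering as Clojure data) come from the description above.

```cpp
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Hypothetical names throughout; this is not jank's actual IR API.
namespace sketch
{
  // Non-terminating instructions. Each one defines exactly one SSA value.
  struct var_deref    { std::string result, var; };
  struct constant     { std::string result, value; };
  struct dynamic_call { std::string result, callee; std::vector<std::string> args; };
  using instruction = std::variant<var_deref, constant, dynamic_call>;

  // Terminators: a block ends by branching, returning, or throwing.
  struct branch { std::string condition, then_block, else_block; };
  struct ret    { std::string value; };
  struct throw_ { std::string value; };
  using terminator = std::variant<branch, ret, throw_>;

  // Holding the terminator in its own field means a block cannot
  // have zero or several of them.
  struct basic_block
  {
    std::string name;
    std::vector<instruction> body;
    terminator term;
  };

  struct ir_function
  {
    std::string name;
    std::vector<basic_block> blocks;
  };

  // Render each node as a Clojure-style map (assumed key names).
  inline std::string render(var_deref const &i)
  { return "{:op :var-deref :name " + i.result + " :var #'" + i.var + "}"; }

  inline std::string render(constant const &i)
  { return "{:op :constant :name " + i.result + " :value " + i.value + "}"; }

  inline std::string render(dynamic_call const &i)
  {
    std::string out{ "{:op :dynamic-call :name " + i.result + " :fn " + i.callee + " :args [" };
    for(std::size_t n{}; n < i.args.size(); ++n)
    { out += (n == 0 ? "" : " ") + i.args[n]; }
    return out + "]}";
  }

  inline std::string render(branch const &t)
  {
    return "{:op :branch :condition " + t.condition + " :then " + t.then_block
           + " :else " + t.else_block + "}";
  }

  inline std::string render(ret const &t)
  { return "{:op :ret :value " + t.value + "}"; }

  inline std::string render(throw_ const &t)
  { return "{:op :throw :value " + t.value + "}"; }

  inline std::string render(basic_block const &b)
  {
    std::string out{ "{:name " + b.name + " :instructions [" };
    for(auto const &i : b.body)
    { out += std::visit([](auto const &x) { return render(x); }, i) + " "; }
    out += std::visit([](auto const &x) { return render(x); }, b.term);
    return out + "]}";
  }

  // Roughly what (defn f [x] (if (even? x) :even :odd)) might lower to.
  inline ir_function build_example()
  {
    return ir_function{
      "f",
      { basic_block{ "entry",
                     { var_deref{ "v0", "clojure.core/even?" },
                       dynamic_call{ "v1", "v0", { "x" } } },
                     branch{ "v1", "then", "else" } },
        basic_block{ "then", { constant{ "v2", ":even" } }, ret{ "v2" } },
        basic_block{ "else", { constant{ "v3", ":odd" } }, ret{ "v3" } } }
    };
  }
}
```

Calling `sketch::render` on each block of `sketch::build_example()` yields one map per block. Because the var lookup and the call appear as distinct, named instructions, a later pass could in principle recognize a `var-deref` of a known var and replace the `dynamic-call` with something cheaper, which is the kind of reasoning that is hard to recover once the code has been lowered to LLVM IR.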

The rewrite took roughly six weeks and ships with no optimization passes enabled yet; the author chose to merge the pipeline first and then iterate benchmark by benchmark. As far as the author is aware, no other Clojure dialect has taken this step, which positions jank to pursue language-aware optimizations that the JVM and LLVM cannot easily perform.

This is an AI-generated summary. Read the original for the full story.