Nov 13, 2025
PolyBlocks is another interesting ML compiler, built on MLIR. It's a startup incubated at IISc Bangalore and run by Uday Bondhugula, who co-authored a paper on compiler optimizations for GPGPUs back in 2008 (17 years ago)!
Some of the compiler passes to keep in mind:
- fusion
- tiling
- use hardware acceleration (like tensor cores)
- constant folding
- perform redundant computation to avoid global memory accesses where profitable
- pack into buffers
- loop transformation
- unroll-and-jam (register tiling?)
- vectorization
- reorder execution for better spatial, temporal and group reuse
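
To make one of these passes concrete, here is a minimal sketch of loop tiling applied to a matrix multiply, written in plain Python for readability. It is only an illustration of the transformation, not how PolyBlocks implements it; the function names and the `tile` parameter are made up for this example.

```python
def matmul_naive(A, B):
    """Reference triple loop: C[i][j] = sum over p of A[i][p] * B[p][j]."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                C[i][j] += A[i][p] * B[p][j]
    return C


def matmul_tiled(A, B, tile=2):
    """Same computation, with each loop split into an outer loop over tiles
    and an inner loop within a tile. `tile` is a hypothetical tuning knob."""
    n, k, m = len(A), len(B), len(B[0])
    C = [[0.0] * m for _ in range(n)]
    for i0 in range(0, n, tile):          # tile loops
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                # intra-tile loops; min() handles sizes not divisible by tile
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = C[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += A[i][p] * B[p][j]
                        C[i][j] = acc
    return C


A = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
B = [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]]
print(matmul_tiled(A, B) == matmul_naive(A, B))
```

The tiled version does the same arithmetic, but each small block of A, B and C is reused while it is still in fast memory (cache, shared memory or registers on real hardware). In Python this brings no speedup; the point is only the loop structure a compiler would generate.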
Scheduling approaches:
Nov 7, 2025
Tags: ml, compiler, onnx, ggml, sdkit, worklog
Wrote a simple script to convert ONNX to GGML. It auto-generates C++ code that calls the corresponding ggml function for each ONNX operator. This file can then be compiled and run like a normal C++ ggml program, and will produce the same results as the original model in PyTorch.
The generated file can work on multiple backends (CPU, CUDA, ROCm, Vulkan, Metal, etc.) by passing the right build option at configure time (cmake -B), e.g. -D GGML_CUDA=1 for CUDA.
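
The operator-by-operator code generation described above can be sketched roughly like this. This is not the actual script: the mapping table, the node format (plain dicts rather than real ONNX protobuf nodes) and the helper names are assumptions made for illustration. The ggml functions it names (ggml_add, ggml_mul, ggml_relu, ggml_tanh) do exist, and `ctx` stands for the `ggml_context *` that the generated C++ would have in scope.

```python
# Hypothetical table from ONNX op types to ggml function names.
# Only simple elementwise ops are listed here.
ONNX_TO_GGML = {
    "Add": "ggml_add",
    "Mul": "ggml_mul",
    "Relu": "ggml_relu",
    "Tanh": "ggml_tanh",
}


def emit_node(op_type, inputs, output):
    """Return one line of C++ that builds the ggml tensor for one ONNX node."""
    fn = ONNX_TO_GGML.get(op_type)
    if fn is None:
        raise NotImplementedError(f"no ggml mapping for ONNX op {op_type!r}")
    args = ", ".join(["ctx", *inputs])
    return f"ggml_tensor * {output} = {fn}({args});"


def emit_graph(nodes):
    """Emit C++ for a list of nodes, assumed to be in topological order."""
    return "\n".join(
        emit_node(n["op_type"], n["inputs"], n["output"]) for n in nodes
    )


# Stand-in for an ONNX graph computing relu(x * w + b).
nodes = [
    {"op_type": "Mul", "inputs": ["x", "w"], "output": "t0"},
    {"op_type": "Add", "inputs": ["t0", "b"], "output": "t1"},
    {"op_type": "Relu", "inputs": ["t1"], "output": "y"},
]
generated = emit_graph(nodes)
print(generated)
```

A real converter would also have to emit the tensor declarations, weight loading and graph execution boilerplate. Operators such as MatMul need extra care too, because ggml's operand and dimension-order conventions differ from ONNX's.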