Nov 18, 2025
Tags: ml, compiler, sdkit, onnx, ggml
Successfully compiled the VAE of Stable Diffusion 1.5 using graph-compiler.
The compiled model is terribly slow because I haven’t written any performance optimizations, and it (conservatively) converts a lot of intermediate tensors to contiguous copies. But we don’t need any clever optimizations to get to decent performance, just basic ones.
It’s pretty exciting because I was able to bypass the need to port the model to C++ manually. Instead, I was able to just compile the exported ONNX model and get the same output values as the original PyTorch implementation (given the same input and weights). I could compile to any platform supported by ggml by just changing one flag (e.g. CPU, CUDA, ROCm, Vulkan, Metal etc).
Nov 7, 2025
Tags: ml, compiler, onnx, ggml, sdkit, worklog
Wrote a simple script to convert ONNX to GGML. It auto-generates C++ code that calls the corresponding ggml functions (for each ONNX operator). This file can then be compiled and run like a normal C++ ggml program, and will produce the same results as the original model in PyTorch.
The generated file can work on multiple backends: CPU, CUDA, ROCm, Vulkan, Metal etc, by providing the correct compiler flags during cmake -B, e.g. -D GGML_CUDA=1 for CUDA.