Easy Diffusion v3

Nov 5, 2025

Following up on the deep-dive on ML compilers:

sdkit v3 won’t use general-purpose ML compilers. They aren’t yet ready for sdkit’s target platforms, and getting them there needs a lot of work (well beyond sdkit v3’s scope). But I’m quite certain that sdkit v4 will use them, and sdkit v3 will start taking steps in that direction.

For sdkit v3, I see two possible paths:

1. Use an array of vendor-specific compilers (like TensorRT-RTX, MIGraphX, OpenVINO, etc.), one for each target platform.
2. Auto-generate ggml code from ONNX (or PyTorch), and beat it on the head until it meets sdkit v3’s performance goals. Hand-tune kernels, contribute to ggml, and take advantage of ggml’s multi-backend kernels.
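
To make the first path a bit more concrete, here is a rough sketch of what "one compiler per target platform" could look like as a lookup. This is only an illustration, not sdkit code: the function name `pick_compiler`, the vendor keys, and the idea of selecting by GPU vendor string are all assumptions made for the example. The pairing of each toolkit with a hardware vendor reflects what those toolkits target, and the table is deliberately incomplete, matching the "etc." above.

```python
# Hypothetical sketch: map a hardware vendor to a vendor-specific compiler.
# None of these names come from sdkit; they only illustrate the dispatch idea.

VENDOR_COMPILERS = {
    "nvidia": "TensorRT-RTX",  # NVIDIA GPUs
    "amd": "MIGraphX",         # AMD GPUs
    "intel": "OpenVINO",       # Intel hardware
}


def pick_compiler(vendor: str) -> str:
    """Return the compiler name for a vendor, or raise if none is listed."""
    key = vendor.strip().lower()
    if key not in VENDOR_COMPILERS:
        # A real implementation would need a fallback path here
        # (this sketch simply refuses unknown platforms).
        raise ValueError(f"no vendor-specific compiler listed for {vendor!r}")
    return VENDOR_COMPILERS[key]


if __name__ == "__main__":
    for name in ("nvidia", "AMD", "intel"):
        print(name, "->", pick_compiler(name))
```

The cost of this path shows up in the table itself: every new platform means another row, and another toolchain to package and test. The second path trades that for a single ggml code base with more hand-tuning.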

Both approaches provide a big step up from sdkit v2 in terms of install size and performance. So it makes sense to tap into one of these first, and leave ML compilers for v4 (as another leap forward).