GPT-Generated Unified Format (GGUF)
- is a file format for storing and distributing large language models (LLMs)
- is designed for the GGML tensor library and the executors built on it, such as llama.cpp
- is the successor to the earlier GGML, GGMF, and GGJT file formats
- is designed for fast model loading: a model ships as a single file that can be memory-mapped, and it commonly stores quantized weights, which helps inference on consumer-grade CPUs
- GGUF files are typically produced by converting a model trained in a framework such as PyTorch, not by training directly in GGUF
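
To make the file layout a little more concrete, here is a minimal sketch of reading the fixed-size header at the start of a GGUF file. It assumes a little-endian file of version 2 or later, where the header is the 4-byte magic `GGUF`, a 32-bit version, and two 64-bit counts (tensors and metadata key-value pairs). The names `GGUFHeader` and `read_gguf_header` are hypothetical helpers for illustration, not part of any library, and the sketch does not parse the metadata or tensor data that follow the header.

```python
import io
import struct
from typing import BinaryIO, NamedTuple

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


class GGUFHeader(NamedTuple):
    """Hypothetical container for the fixed-size GGUF header fields."""
    version: int
    tensor_count: int
    metadata_kv_count: int


def read_gguf_header(stream: BinaryIO) -> GGUFHeader:
    """Read the fixed-size header from a binary stream.

    Assumes little-endian byte order and GGUF version >= 2
    (64-bit counts); both are assumptions of this sketch.
    """
    magic = stream.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic is {magic!r}")

    raw_version = stream.read(4)
    if len(raw_version) != 4:
        raise ValueError("truncated header: missing version")
    (version,) = struct.unpack("<I", raw_version)
    if version < 2:
        raise ValueError("version 1 uses 32-bit counts; not handled here")

    raw_counts = stream.read(16)
    if len(raw_counts) != 16:
        raise ValueError("truncated header: missing counts")
    tensor_count, metadata_kv_count = struct.unpack("<QQ", raw_counts)
    return GGUFHeader(version, tensor_count, metadata_kv_count)


# Usage with a synthetic in-memory header (not a real model file):
sample = io.BytesIO(GGUF_MAGIC + struct.pack("<IQQ", 3, 2, 5))
header = read_gguf_header(sample)
print(header)
```

For a real file, the same function could be called on a handle opened with `open(path, "rb")`. Reading only the header is cheap, which is one reason a quick sanity check like this can be useful before loading a multi-gigabyte model.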