GPT-Generated Unified Format (GGUF)
  • is a file format for storing and distributing large language models (LLMs)
  • is designed primarily for the GGML tensor library and executors built on it, such as llama.cpp
  • is the successor to the earlier GGML, GGMF, and GGJT file formats
  • is optimized for fast model loading (a single file that can be memory-mapped) and, usually with quantized weights, for inference on consumer-grade hardware, especially CPUs
  • GGUF models are typically created by converting models trained in frameworks like PyTorch into the GGUF format
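
To make the format concrete, here is a minimal sketch that reads the fixed-size header at the start of a GGUF file. It assumes a little-endian file at version 2 or later, where the header is the 4-byte magic "GGUF", a 32-bit version, a 64-bit tensor count, and a 64-bit metadata key/value count (version 1 used 32-bit counts and is not handled). The function name read_gguf_header and the returned dictionary keys are illustrative only and are not part of any library.

```python
import struct

GGUF_MAGIC = b"GGUF"
# Little-endian: 4-byte magic, uint32 version, uint64 tensor count,
# uint64 metadata key/value count (assumed layout for GGUF v2 and later).
HEADER_FORMAT = "<4sIQQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes


def read_gguf_header(path):
    """Return the basic header fields of a GGUF file as a dict (hypothetical helper)."""
    with open(path, "rb") as f:
        raw = f.read(HEADER_SIZE)
    if len(raw) < HEADER_SIZE:
        raise ValueError("file is too short to contain a GGUF header")

    magic, version, tensor_count, kv_count = struct.unpack(HEADER_FORMAT, raw)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file (magic was {magic!r})")

    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

Calling read_gguf_header("model.gguf") on a converted model (the file name is a placeholder) returns the version and the two counts. The metadata key/value pairs and tensor descriptions follow the header and are not parsed by this sketch.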