Lemonade by AMD: Fast Open Source Local LLM Server
AMD introduces Lemonade, a fast, open-source local LLM server supporting GPUs and NPUs, ready to deploy on any PC in minutes.
Key Features
- GPU and NPU Support — Full utilization of AMD hardware acceleration
- Multimodal Capabilities — Text, image generation, speech recognition and synthesis
- OpenAI API Compatible — Works out-of-the-box with hundreds of applications
- Lightweight Backend — Native C++ backend, only 2MB
- One-Minute Install — Simple installer with automatic configuration
- Multi-Engine Compatible — Supports llama.cpp, Ryzen AI SW, FastFlowLM, etc.
- Multiple Models at Once — Load and run multiple models simultaneously
- Cross-Platform — Windows, Linux, macOS (beta)
Use Cases
Run large models like gpt-oss-120b or Qwen-Coder-Next locally for advanced tool use, without cloud dependency.
Learn more: https://lemonade-server.ai