- The enhanced vLLM Docker images are built on top of the official vLLM container to provide secure inference services for encrypted models.
- To preserve security, we enforce additional restrictions on container startup.
- The requirements are outlined below, including immutable and configurable flags.
Want to understand how security is implemented?
See our Security Architecture documentation for detailed technical insights.
Supported vLLM Versions
All images are available on our Docker Hub:

| Platform | Image Name | Base Image |
|---|---|---|
| GPU (x86_64/ARM64) | koalavault/vllm-openai:v0.11.0 | vllm/vllm-openai:v0.11.0 |
| CPU (x86_64/ARM64) | koalavault/vllm-cpu:v0.11.0 | koalavault/vllm-cpu-base-amd64:v0.11.0, koalavault/vllm-cpu-base-arm64:v0.11.0 |
`latest` tag: Points to the latest vLLM version (currently `v0.11.0`) with the most recent CryptoTensors build
Pull the Docker Image
- GPU (x86/arm64)
- CPU (x86/arm64)
- Apple Silicon (arm)
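The pull commands for the platforms above follow the standard `docker pull` form, using the image names from the table in Supported vLLM Versions (the Apple Silicon variant is not listed in the table, so only the two documented images are shown here):

```shell
# Pull the GPU image (x86_64/ARM64)
docker pull koalavault/vllm-openai:v0.11.0

# Pull the CPU image (x86_64/ARM64)
docker pull koalavault/vllm-cpu:v0.11.0
```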
Deploy with Docker
Deploying encrypted models with KoalaVault is almost identical to deploying standard models; the only difference is a few additional flags in the container startup command that ensure model decryption occurs in a secure environment. A typical Docker deployment command looks like the following:

- NVIDIA GPU (x86/arm64)
- CPU (x86/arm64)
- Apple Silicon (arm)
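As a minimal sketch, a GPU deployment follows the usual vLLM container invocation. `MODEL_NAME` is a placeholder for your encrypted model, and the KoalaVault-specific security flags described above are deployment-specific, so they appear only as a placeholder comment rather than real flag names:

```shell
# Sketch of a GPU deployment based on the standard vLLM container
# invocation; assumes the koalavault/vllm-openai image from the table above.
docker run --rm --gpus all \
  -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  koalavault/vllm-openai:v0.11.0 \
  --model MODEL_NAME
# ...plus the additional KoalaVault security flags required for
# decryption (see the startup requirements at the top of this page).
```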
Additional Notes
All other requirements and best practices remain the same as for the official vLLM container. Please refer to the vLLM documentation for further details on system requirements, GPU support, and advanced configuration: 👉 vLLM Official Documentation