Are you the model seller?
Model sellers automatically have access to their own models (public or private) without purchase. You can skip directly to the Deploy Model section.
Buy Model
Before purchasing, make sure you have completed the Account Setup.
1
Browse & Select Model
- Browse the marketplace to discover available models
- Find a model that fits your needs (e.g., koalavault/qwen3-0.6b for testing)
- Click on the model to view details, pricing, and specifications
2
Create Order
- On the model page, click Purchase in the top right corner
- Select your preferred pricing plan
- Review the order details and total amount
- Click Create Order
3
Make the Payment
After creating your order, you’ll see the payment address and instructions on the order detail page. Send the USDT payment on the BSC network using a supported exchange or wallet.
Payment Requirements:
- Must use USDT (Tether) cryptocurrency
- Must be on BSC (BNB Smart Chain) network
- Token standard: BEP-20
New to Crypto Payments?
Don’t worry! Follow our beginner-friendly payment guide with step-by-step instructions for Binance, crypto wallets, and more.
4
Submit Transaction ID
After completing your payment, you’ll receive a Transaction ID (txid) from your exchange or wallet. Submit this to KoalaVault for verification:
- Find your order: Go to your order detail page or Subscriptions to view pending orders
- Submit txid: Paste your txid (starts with 0x...) and click Confirm Payment
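A quick local check before pasting can catch copy errors such as a truncated hash or a wallet address pasted by mistake. The sketch below assumes only that a BSC transaction hash, like any EVM transaction hash, is `0x` followed by 64 hexadecimal characters. The helper name `normalize_txid` is hypothetical and is not part of KoalaVault; the platform performs its own verification.

```python
import re

# An EVM-style transaction hash: "0x" followed by 32 bytes (64 hex characters).
TXID_PATTERN = re.compile(r"0x[0-9a-fA-F]{64}")


def normalize_txid(raw):
    """Strip surrounding whitespace and confirm the value looks like a txid.

    Hypothetical helper for illustration only. It checks the format, not
    whether the transaction exists or paid the right address.
    """
    txid = raw.strip()
    if not TXID_PATTERN.fullmatch(txid):
        raise ValueError("expected 0x followed by 64 hex characters")
    return txid.lower()
```

A 42-character value starting with `0x` is a wallet address, not a transaction ID, and is rejected by this check.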
Download Model
1
Install Required Tools
Install Koava with HuggingFace support, which includes the hf command for downloading models:
2
Download Model
- Go to your purchased model’s detail page on the KoalaVault platform
- Click the Deploy tab and copy the download command
- Run the command to download the encrypted model:
Replace <PUBLISHER_USERNAME>/<MODEL_NAME> with your actual model identifier from the Deploy tab.
Example (using the free demo model):
Deploy Model
Before deploying, ensure you have Docker installed on your system (Install Docker). GPU support is recommended for optimal performance.
For detailed system requirements and hardware specifications, see the vLLM Installation Guide.
Currently, KoalaVault only supports:
- Models in safetensors format (other formats like GGUF are not supported yet)
- Deployment via vLLM inference engine only
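To confirm that a downloaded model directory matches the format restriction above, the file names can be sorted by extension. This is a minimal sketch: the function name `classify_weight_files` is hypothetical, and only `.gguf` is flagged as unsupported because it is the one format the text names explicitly.

```python
from pathlib import PurePath


def classify_weight_files(filenames):
    """Split file names into safetensors weights and known-unsupported GGUF files.

    Hypothetical helper for illustration; other files (configs, tokenizers)
    are ignored rather than reported.
    """
    supported, unsupported = [], []
    for name in filenames:
        suffix = PurePath(name).suffix.lower()
        if suffix == ".safetensors":
            supported.append(name)
        elif suffix == ".gguf":
            unsupported.append(name)
    return supported, unsupported


# Example usage against a local directory (the path is a placeholder):
# import os
# supported, unsupported = classify_weight_files(os.listdir("/path/to/model"))
```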
1
Pull the Docker Image
Pull the KoalaVault-enhanced vLLM Docker image for your architecture:
- GPU (x86/arm64)
- CPU (x86/arm64)
- Apple Silicon (arm)
2
Get API Key and Set Environment Variable
- Generate your KoalaVault API key
- Set the API key as an environment variable:
3
Deploy with Docker
Run the Docker container with the downloaded model:
<PUBLISHER_KOALAVAULT_USERNAME> is the KoalaVault username of the model publisher (who sells the model), not necessarily the publisher’s username on HuggingFace.
- Nvidia GPU (x86/arm64)
- CPU (x86/arm64)
- Apple Silicon (arm)
4
Test Your Deployment
Once the container is running, you can test it with a simple request:
You should see a JSON response with the model’s generated text.
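As an illustration of such a request, the sketch below uses Python's standard library. It assumes the KoalaVault image exposes the standard vLLM OpenAI-compatible `/v1/completions` endpoint and that the container's port is mapped to `localhost:8000`, which is the vLLM default. Neither is confirmed here, so adjust the URL and model name to match your `docker run` command. The function names are hypothetical.

```python
import json
import urllib.request


def build_completion_request(model, prompt, max_tokens=32):
    """Build the JSON body for an OpenAI-style text completion request."""
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}


def send_completion(base_url, payload, timeout=60):
    """POST the payload to <base_url>/v1/completions and return the parsed JSON."""
    request = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_text(response):
    """Return the generated text of the first choice in a completion response."""
    return response["choices"][0]["text"]


# Example usage (assumed host, port and model name; requires a running container):
# payload = build_completion_request("koalavault/qwen3-0.6b", "Hello, my name is")
# print(extract_text(send_completion("http://localhost:8000", payload)))
```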
Understanding Model Encryption
Curious what happens without proper authentication/decryption?
There are two scenarios to understand:
- KoalaVault images without authorization: Will fail to start with “No suitable decryption key available” error
- Standard vLLM images with encrypted models: Will start but produce gibberish or crash during inference
  - ✅ Container starts and model loads (due to valid safetensors format)
  - ❌ Produces gibberish output or crashes (due to encrypted tensor data)
  - ❌ No meaningful text generation possible (of course!)
1
Run Without Decryption
Try running the encrypted model with standard vLLM images (without KoalaVault authentication):
- Nvidia GPU (x86/arm64)
- CPU (x86/arm64)
- Apple Silicon (arm)
2
Test the Deployment
Try making a request to the running container:
The request may complete but return nonsensical text, or the container may crash during processing.
Why? The tensor data (model weights) is encrypted. Standard vLLM images can read the safetensors file format, but without proper decryption via KoalaVault authentication, the model processes encrypted weights, resulting in nonsensical output or system instability.
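The effect can be reproduced on a toy file. A safetensors file starts with an 8-byte little-endian header length, followed by a JSON header (dtype, shape, byte offsets) and then the raw tensor bytes. The sketch below builds a minimal file of that shape by hand and scrambles only the tensor bytes with a single-byte XOR. The XOR is a stand-in chosen for illustration and is an assumption: KoalaVault's actual encryption scheme is not described here.

```python
import json
import struct


def build_safetensors(values):
    """Build a minimal safetensors-style blob holding one float32 vector named "w"."""
    data = struct.pack(f"<{len(values)}f", *values)
    header = {"w": {"dtype": "F32", "shape": [len(values)], "data_offsets": [0, len(data)]}}
    header_bytes = json.dumps(header).encode("utf-8")
    # Layout: 8-byte little-endian header length, JSON header, raw tensor bytes.
    return struct.pack("<Q", len(header_bytes)) + header_bytes + data


def parse_safetensors(blob):
    """Return (header, values) for the single tensor "w" in the blob."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    header = json.loads(blob[8:8 + header_len])
    start, end = header["w"]["data_offsets"]
    raw = blob[8 + header_len:][start:end]
    count = header["w"]["shape"][0]
    return header, list(struct.unpack(f"<{count}f", raw))


def scramble_tensor_bytes(blob, key=0x5A):
    """XOR only the tensor bytes, leaving the header intact (stand-in for encryption)."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    split = 8 + header_len
    return blob[:split] + bytes(b ^ key for b in blob[split:])


plain = build_safetensors([1.0, 2.0, 3.0])
scrambled = scramble_tensor_bytes(plain)

plain_header, plain_values = parse_safetensors(plain)
scrambled_header, scrambled_values = parse_safetensors(scrambled)

# The header still parses and reports the same dtype and shape...
print(scrambled_header == plain_header)
# ...but the decoded weights no longer match the originals.
print(scrambled_values == plain_values)
```

Because the header is untouched, a loader that only validates the file format accepts the scrambled file, and the problem shows up later as meaningless weights.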