Open WebUI: Best Frontend for Local LLMs on Your Homelab
Why Open WebUI Beats the Alternatives for Local LLM Hosting
You've got Ollama running on your homelab—solid choice—but you're stuck using CLI commands to chat with your models. Open WebUI changes that. It's a self-hosted web interface built specifically for local LLMs, with conversation history, RAG document uploads, model switching, and multi-user support. On my T5810 with 24GB RAM running Ubuntu 24.04.1 LTS, it cuts interaction friction to near-zero while keeping everything behind your firewall.
This post covers a production-grade Docker setup for Open WebUI 0.3.15 with persistent storage, Ollama integration, and proper user isolation—the setup you'd run on actual homelab infrastructure, not a tutorial toy.
Prerequisites: What You Need Before Starting
- Ollama 0.1.48+ running and accessible (local or remote). Test with curl http://localhost:11434/api/tags
- Docker 26.1.3+ and Docker Compose 2.27.0+ installed
- At least 4GB RAM free after Ollama allocation (Open WebUI is lightweight, but you're running two services)
- Ubuntu 24.04.1 LTS or equivalent (RHEL derivatives work identically)
- Port 8080 or 3000 available for the web interface (check: sudo ss -tlnp | grep -E ':(8080|3000)' — ss ships with Ubuntu 24.04, while netstat needs the net-tools package)
Gotcha #1: If Ollama and Open WebUI are on different hosts, Ollama must listen on 0.0.0.0:11434, not just 127.0.0.1. Edit /etc/systemd/system/ollama.service, add Environment="OLLAMA_HOST=0.0.0.0:11434" under [Service], then sudo systemctl restart ollama.
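If you'd rather not edit the unit file in place (a reinstall can overwrite it), the same change works as a systemd drop-in. This is a sketch assuming the standard ollama.service install; sudo systemctl edit ollama opens the override file for you:

```ini
# /etc/systemd/system/ollama.service.d/override.conf
# Created via: sudo systemctl edit ollama
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
```

Follow with sudo systemctl daemon-reload && sudo systemctl restart ollama, then confirm reachability from another machine with curl against port 11434.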
Setting Up Docker Compose for Open WebUI + Persistent Storage
Create a project directory and docker-compose.yml that handles both Ollama and Open WebUI in one network. This way they discover each other automatically.
mkdir -p ~/homelab/openwebui
cd ~/homelab/openwebui
Create the compose file:
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    ports:
      - "11434:11434"
    environment:
      - OLLAMA_HOST=0.0.0.0:11434
    volumes:
      - ollama_data:/root/.ollama
    networks:
      - webui

  open-webui:
    image: ghcr.io/open-webui/open-webui:0.3.15
    container_name: open-webui
    restart: unless-stopped
    ports:
      - "8080:8080"
    environment:
      - OLLAMA_API_BASE_URL=http://ollama:11434/api
      - WEBUI_SECRET_KEY=your-secure-random-key-here-change-this
      - WEBUI_AUTH=true
    volumes:
      - webui_data:/app/backend/data
    depends_on:
      - ollama
    networks:
      - webui

volumes:
  ollama_data:
    driver: local
  webui_data:
    driver: local

networks:
  webui:
    driver: bridge
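Optionally, a healthcheck on the open-webui service lets monitoring tools see actual readiness rather than just "container started". This fragment assumes the image ships curl and serves a /health endpoint; both hold for recent 0.3.x images as far as I can tell, but verify with docker exec open-webui curl http://localhost:8080/health before relying on it:

```yaml
    # Add under the open-webui service definition
    healthcheck:
      test: ["CMD", "curl", "-fsS", "http://localhost:8080/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 20s
```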
Important: Replace your-secure-random-key-here-change-this with actual random output: openssl rand -base64 32. Store it in a password manager—you'll need it if you ever redeploy.
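To keep the key out of the compose file entirely, you can generate it into an .env file and reference it with a variable instead. This sketch assumes you change the compose line to WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY}; docker compose loads .env from the project directory automatically:

```shell
# Generate a 32-byte random key and write it to .env in the project dir.
key=$(openssl rand -base64 32)
printf 'WEBUI_SECRET_KEY=%s\n' "$key" > .env
chmod 600 .env   # keep the secret readable only by you
```

Add .env to .gitignore if the directory is version-controlled.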
Bring everything up:
docker compose up -d
docker compose logs -f open-webui
Wait for the "Application startup complete" message, then hit http://localhost:8080 in your browser. You'll land on a signup page for the first user (admin).
First-Login Configuration: Users, Models, and RAG Setup
Creating Your Admin Account
Sign up with email and password. This becomes your admin account. Once logged in, click the user icon (top-right) → Settings.
Connecting Your Ollama Models
Go to Settings → Admin tab. You'll see "Ollama API URL"—it should already show http://ollama:11434/api from the compose environment variable. Click Refresh to sync available models.
If you pulled models directly into Ollama (e.g., ollama pull mistral:latest), they appear here. Test by selecting a model and sending a message. Response time depends on your hardware; on my T5810, mistral:7b takes 3–5 seconds per token.
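If you want a real tokens-per-second number instead of a stopwatch feel, Ollama's /api/generate response (with "stream": false) includes eval_count (generated tokens) and eval_duration (nanoseconds). Here's the arithmetic against a captured response; the JSON values are made-up example numbers:

```shell
# eval_count / (eval_duration in seconds) = tokens per second.
# To capture a real response, replace resp with:
#   curl -s http://localhost:11434/api/generate \
#     -d '{"model":"mistral:latest","prompt":"Say hi","stream":false}'
resp='{"eval_count":120,"eval_duration":30000000000}'
echo "$resp" | python3 -c 'import sys, json
d = json.load(sys.stdin)
print(round(d["eval_count"] / (d["eval_duration"] / 1e9), 1))'
# prints 4.0
```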
Enabling RAG (Retrieval-Augmented Generation)
Open WebUI's RAG lets you upload PDFs, docs, or text files and have the model reference them during chat. Go to Settings → Documents.
Click the folder icon to upload files. Supported formats: PDF, TXT, Markdown, CSV. The interface chunks documents automatically. In a conversation, reference uploaded docs by typing #document-name.
Gotcha #2: RAG chunking happens client-side by default. For large document collections (100+ files), this chokes the browser. Use the Web Search plugin instead if you're indexing a lot of content—it queries external search APIs and pulls fresh results.
Multi-User Setup: Creating and Managing Team Members
Open WebUI's auth is token-based and stores users in the persistent webui_data volume. Add team members from Settings → Admin → Users.
Click Add User, set email and temporary password. They sign in once, then change their password. Each user gets isolated conversation history and custom model preferences.
To revoke access: Go to the user row, click the three-dot menu, select Delete. Their conversations remain in the database but they can't access anything.
For team homelabs: If you want SSO (LDAP/OAuth), that's not built-in at v0.3.15, but the community is working on it. For now, manual user creation is the standard.
Managing Model Performance and Memory
Open WebUI doesn't allocate VRAM—Ollama does. But you can control which models load in Open WebUI's dropdown and set per-model preferences.
Go to Settings → Models. You'll see every model Ollama knows about. Pull new models directly from the Ollama container:
docker exec ollama ollama pull neural-chat:7b
Refresh in Open WebUI and the model appears instantly. No service restart needed.
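If you provision several models at once, a small loop keeps it repeatable. This prints the commands as a dry run so you can eyeball them first; the model names are just examples. Pipe the output to sh to actually execute:

```shell
# Emit one pull command per model; pipe to sh to run them for real.
models="mistral:latest neural-chat:7b llama3:8b"
for m in $models; do
  echo "docker exec ollama ollama pull $m"
done
```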
Pro move: For models you use rarely, keep them pulled but out of memory. Ollama has no settings UI for this; it reads limits from environment variables. Add OLLAMA_MAX_LOADED_MODELS=1 to the ollama service's environment block (or the systemd unit) so only one model stays resident at a time, preventing OOM on memory-constrained hardware. OLLAMA_NUM_PARALLEL, by contrast, caps concurrent requests per loaded model, not how many models load. Idle models also unload on their own after five minutes by default (tunable via OLLAMA_KEEP_ALIVE).
Persistent Storage and Backup Strategy
Everything persists in Docker volumes: ollama_data (models and Ollama state) and webui_data (conversations, users, documents). Back both up weekly.
#!/bin/bash
# backup-openwebui.sh
set -euo pipefail
BACKUP_DIR="/mnt/nas/backups/openwebui"
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p "$BACKUP_DIR"
# For a guaranteed-consistent snapshot of the SQLite DB, run
# `docker compose stop` first and `docker compose start` after.
docker run --rm -v webui_data:/data -v "$BACKUP_DIR":/backup \
  alpine tar czf /backup/webui-$DATE.tar.gz -C /data .
docker run --rm -v ollama_data:/data -v "$BACKUP_DIR":/backup \
  alpine tar czf /backup/ollama-$DATE.tar.gz -C /data .
# Keep only the last 30 days of archives
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +30 -delete
Run this via cron: 0 2 * * * /home/user/backup-openwebui.sh (2 AM daily).
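Before trusting the retention line with real backups, you can sanity-check its behavior on throwaway files (GNU touch -d assumed, which is standard on Ubuntu):

```shell
# Files older than 30 days match -mtime +30 and get deleted; newer ones survive.
tmp=$(mktemp -d)
touch -d '40 days ago' "$tmp/webui-old.tar.gz"
touch "$tmp/webui-new.tar.gz"
find "$tmp" -name '*.tar.gz' -mtime +30 -delete
ls "$tmp"    # prints webui-new.tar.gz
rm -rf "$tmp"
```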
Common Issues and Troubleshooting
Open WebUI Can't Connect to Ollama
If you see "Error connecting to Ollama" in Settings, check:
docker exec open-webui curl -v http://ollama:11434/api/tags
If that fails, the containers aren't on the same network. Verify:
docker network ls
docker network inspect openwebui_webui | grep -A5 "Containers"
Both should appear in the webui network. If not, nuke and restart: docker compose down && docker compose up -d.
Models Don't Appear in Open WebUI After Pulling
Click the Refresh button in Settings → Admin. If still missing, verify the model is actually pulled:
docker exec ollama ollama list
If it's there but Open WebUI doesn't show it, restart the web container: docker compose restart open-webui.
Memory Exhaustion During Long Conversations
Ollama handles context limits, but long chats can bloat the conversation history in Open WebUI. Go to Settings → Parameters and lower Context Length to 2048 or 4096 (defaults to 8192). This limits how much prior conversation the model references.
Can't Log Back In After First Setup
If the login page resets you repeatedly, check that WEBUI_AUTH=true is set and webui_data volume is mounted correctly. Verify:
docker inspect open-webui | grep -A5 Mounts
Should show /app/backend/data mapped to a volume, not a local path. If it's a path (like /root/something), the volume didn't mount and auth data is ephemeral.
What You've Built and Next Steps
You now have a production-grade local LLM frontend—Open WebUI on homelab infrastructure—with Ollama integration, persistent storage, multi-user auth, and RAG capability. It runs 24/7 without cloud dependencies and costs nothing to operate after hardware.
Next moves:
- Set up a reverse proxy (nginx or Caddy) if you want to access it from outside your network—use TLS and basic auth
- Integrate it with your knowledge base via RAG or connect a vector database (Milvus, Weaviate) for semantic search across large document sets
- Experiment with different model sizes: smaller models (3B–7B) for quick summarization, larger ones (13B+) for deep analysis when latency isn't critical
Related: Check out Open WebUI's GitHub for plugin development and community extensions. The project moves fast—stay on v0.3.x or later for stability.