DeepSeek R1 Distill Deployment on Ubuntu 24.04 LTS
DeepSeek R1 was released on January 20, 2025. It is an open-source LLM (Large Language Model) developed by DeepSeek, a company based in China.
01 Install Docker
sudo apt update
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "${UBUNTU_CODENAME:-$VERSION_CODENAME}") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install docker-ce
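To confirm that the installation succeeded, you can run Docker's test image (this requires the Docker daemon to be running, which the package install normally starts):

```shell
# Pull and run Docker's hello-world test image, removing the container afterwards
sudo docker run --rm hello-world
```

If Docker is working, it prints a "Hello from Docker!" message and exits.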
02 Install NVIDIA Container Toolkit (Optional)
If you have an NVIDIA graphics card with the proprietary driver installed, you need to install the NVIDIA Container Toolkit so that Docker containers can access the GPU.
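One possible install sequence, following NVIDIA's documented apt-based steps (verify against the current NVIDIA Container Toolkit documentation before running, as repository URLs can change):

```shell
# Add NVIDIA's signing key and apt repository for the Container Toolkit
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list

# Install the toolkit, wire it into Docker, and restart the daemon
sudo apt update
sudo apt install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```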
03 Install Ollama
CPU only
sudo docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama --restart always ollama/ollama
NVIDIA Display Card only
sudo docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama --restart always ollama/ollama
AMD Display Card only
You need to install ROCm driver beforehand.
sudo docker run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --name ollama --restart always ollama/ollama:rocm
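Whichever variant you chose, you can check that the container is up and the Ollama API is answering (this assumes the container from the commands above is running on the default port 11434):

```shell
# Show the running ollama container
sudo docker ps --filter name=ollama

# Query Ollama's version endpoint; a JSON reply means the server is up
curl http://localhost:11434/api/version
```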
04 Import DeepSeek R1 Distill Models
The following commands pull 4-bit quantized (Q4_0) models. Choose one according to your hardware:
Multi-core CPU and 8GB RAM
sudo docker exec -it ollama ollama run hf.co/bartowski/DeepSeek-R1-Distill-Qwen-1.5B-GGUF:Q4_0
Multi-core CPU and 16GB RAM
sudo docker exec -it ollama ollama run hf.co/bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF:Q4_0
Multi-core CPU and 32GB RAM
sudo docker exec -it ollama ollama run hf.co/bartowski/DeepSeek-R1-Distill-Qwen-14B-GGUF:Q4_0
Multi-core CPU and 64GB RAM
sudo docker exec -it ollama ollama run hf.co/bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF:Q4_0
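The RAM-to-model mapping above can be sketched as a small helper function (the thresholds simply restate the table; `pick_model` is a hypothetical name, not part of Ollama):

```shell
# pick_model RAM_GB — print the 4-bit distill model matching the RAM tiers above
pick_model() {
  ram_gb=$1
  if   [ "$ram_gb" -ge 64 ]; then echo "hf.co/bartowski/DeepSeek-R1-Distill-Qwen-32B-GGUF:Q4_0"
  elif [ "$ram_gb" -ge 32 ]; then echo "hf.co/bartowski/DeepSeek-R1-Distill-Qwen-14B-GGUF:Q4_0"
  elif [ "$ram_gb" -ge 16 ]; then echo "hf.co/bartowski/DeepSeek-R1-Distill-Qwen-7B-GGUF:Q4_0"
  else                            echo "hf.co/bartowski/DeepSeek-R1-Distill-Qwen-1.5B-GGUF:Q4_0"
  fi
}

# Example: a 16GB machine gets the 7B model
pick_model 16
```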
Once the model has been imported, you can interact with it in the terminal. Enter /bye to quit.
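Besides the interactive prompt, you can also query the model over Ollama's REST API (this assumes the 1.5B model pulled above and a running container; adjust the model name to match the one you imported):

```shell
# Send a single non-streaming prompt to the model via Ollama's generate endpoint
curl http://localhost:11434/api/generate -d '{
  "model": "hf.co/bartowski/DeepSeek-R1-Distill-Qwen-1.5B-GGUF:Q4_0",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```

The response is a JSON object whose "response" field holds the model's answer.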
05 Install Open WebUI
Without Login
sudo docker run -d -p 3000:8080 -v ollama:/root/.ollama -v open-webui:/app/backend/data -e WEBUI_AUTH=False --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
With Login
sudo docker run -d -p 3000:8080 -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
Once login is enabled, it is hard to disable; in practice you would need to remove and recreate the Open WebUI container and its data volume.
Once Open WebUI is installed, you can access the frontend at:
http://localhost:3000
06 Install AnythingLLM (Optional)
sudo docker pull mintplexlabs/anythingllm
export STORAGE_LOCATION=$HOME/anythingllm && \
mkdir -p $STORAGE_LOCATION && \
touch "$STORAGE_LOCATION/.env" && \
sudo docker run -d -p 3001:3001 \
--cap-add SYS_ADMIN \
-v ${STORAGE_LOCATION}:/app/server/storage \
-v ${STORAGE_LOCATION}/.env:/app/server/.env \
-e STORAGE_DIR="/app/server/storage" \
--restart always \
mintplexlabs/anythingllm
To access the frontend:
http://localhost:3001
07 Mobile Phone (Optional)
Install PocketPal AI from the Google Play Store or the Apple App Store and import models onto the phone. Choose a 1.5B or a 7B/8B model depending on the amount of RAM in your phone.
Reference
- DeepSeek Official Site
- DeepSeek GitHub
- Ollama Official Site
- Open WebUI Official Site
- Open WebUI GitHub
- Huggingface - bartowski
- Anything LLM Desktop
- PocketPal AI Google Play Store
- PocketPal AI GitHub