Redesign Docker build system to single-stage architecture for reliable model loading

Replaced the complex two-stage build (builder → runtime) with a
single-stage Dockerfile that builds and runs Ollama in one image. This
fixes model loading failures caused by missing CUDA libraries and
LD_LIBRARY_PATH mismatches in the previous multi-stage design.
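
For context, the failure mode looked roughly like this. The snippet
below is an illustrative reconstruction, not the actual previous
Dockerfile; the image tags, paths, and copied libraries in it are all
assumptions:

    FROM nvidia/cuda:11.4.3-devel-ubuntu20.04 AS builder
    # ... toolchain install and Ollama build elided ...

    FROM ubuntu:20.04
    COPY --from=builder /src/ollama/ollama /usr/local/bin/ollama
    # Only hand-picked CUDA libraries were copied over; any .so missed
    # here is simply absent from the runtime image.
    COPY --from=builder /usr/local/cuda-11.4/lib64/libcudart.so* /usr/local/lib/
    # This path existed in the builder stage but not in this image, so
    # the dynamic linker could not resolve the GPU backend's
    # dependencies and model loading failed.
    ENV LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64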

Changes:
- Add docker/Dockerfile: Single-stage build with GCC 10, CMake 4, Go 1.25.3, CUDA 11.4 (sketched after this list)
- Clone source from https://github.com/dogkeeper886/ollama37
- Compile Ollama with "CUDA 11" preset for Tesla K80 (compute capability 3.7)
- Keep complete CUDA toolkit and all libraries in final image (~20GB)
- Update docker-compose.yml: Simplified config, use ollama37:latest image
- Update docker/README.md: New build instructions and architecture docs
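
A condensed sketch of the new single-stage layout, using the versions,
clone URL, and preset named above. The base image tag, the install
commands, and the /src/ollama and LD_LIBRARY_PATH paths are
assumptions, not a copy of docker/Dockerfile:

    FROM nvidia/cuda:11.4.3-devel-ubuntu20.04

    # Toolchain: GCC 10, Go 1.25.3 (CMake 4 install elided; details
    # simplified).
    RUN apt-get update && apt-get install -y gcc-10 g++-10 git wget && \
        rm -rf /var/lib/apt/lists/*
    RUN wget -qO- https://go.dev/dl/go1.25.3.linux-amd64.tar.gz | \
        tar -C /usr/local -xz
    ENV PATH=/usr/local/go/bin:$PATH

    # Build Ollama from source with the "CUDA 11" preset, which targets
    # compute capability 3.7 (Tesla K80).
    RUN git clone https://github.com/dogkeeper886/ollama37 /src/ollama
    WORKDIR /src/ollama
    RUN cmake --preset "CUDA 11" && \
        cmake --build --preset "CUDA 11" && \
        go build -o /usr/local/bin/ollama .

    # Build and runtime are the same image: the full CUDA toolkit stays
    # in place, so LD_LIBRARY_PATH points at libraries that exist.
    ENV LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64
    EXPOSE 11434
    ENTRYPOINT ["ollama", "serve"]

Assuming the image is built as docker build -t ollama37:latest docker/,
the compose file below can reference it directly.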

Trade-off: a larger image (~20GB vs ~3GB) in exchange for guaranteed
compatibility and reliable GPU backend operation. All libraries remain
accessible at correct paths, so models load properly on the Tesla K80.

Tested: Successfully runs gemma3:1b on a Tesla K80

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Author: Shang Chieh Tseng
Date:   2025-11-10 09:19:22 +08:00
Commit: 6dbd8ed44e (parent 0293c53746)

3 changed files with 232 additions and 161 deletions

docker-compose.yml

@@ -2,10 +2,9 @@ version: "3.8"
 services:
   ollama:
-    image: ollama37-runtime:latest
-    container_name: ollama37-runtime
+    image: ollama37:latest
+    container_name: ollama37
     runtime: nvidia
-    user: "${UID:-1000}:${GID:-1000}"
     deploy:
       resources:
         reservations:
@@ -16,9 +15,8 @@ services:
     ports:
       - "11434:11434"
     volumes:
-      - ${HOME}/.ollama:${HOME}/.ollama
+      - ollama-data:/root/.ollama
     environment:
-      - HOME=${HOME}
       - OLLAMA_HOST=0.0.0.0:11434
       - NVIDIA_VISIBLE_DEVICES=all
       - NVIDIA_DRIVER_CAPABILITIES=compute,utility
@@ -29,6 +27,7 @@ services:
       timeout: 10s
       retries: 3
       start_period: 5s
+volumes:
+  ollama-data:
+    name: ollama-data