ollama37/docker/runtime/Dockerfile
Shang Chieh Tseng 4810471b33 Redesign Docker build system to two-stage architecture with builder/runtime separation
Redesigned the Docker build system from a single-stage monolithic design to a clean
two-stage architecture that separates build environment from compilation process while
maintaining library path compatibility.

## Architecture Changes

### Builder Image (docker/builder/Dockerfile)
- Provides base environment: CUDA 11.4, GCC 10, CMake 4, Go 1.25.3
- Built once, cached for subsequent builds (~90 min first time)
- Removed config file copying (cuda-11.4.sh, gcc-10.conf, go.sh)
- Added comprehensive comments explaining each build step
- Added git installation for runtime stage source cloning

### Runtime Image (docker/runtime/Dockerfile)
- Two-stage build using ollama37-builder as base for BOTH stages
- Stage 1 (compile): Clone source from GitHub → CMake configure → Build C/C++/CUDA → Build Go
- Stage 2 (runtime): Copy artifacts from stage 1 → Setup environment → Configure server
- Both stages use identical base image to ensure library path compatibility
- Removed -buildvcs=false flag (VCS info embedded from git clone)
- Comprehensive comments documenting library paths and design rationale
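
With `-buildvcs=false` removed, one way to confirm that VCS information really was embedded is to inspect the binary with `go version -m`. The sketch below is a hypothetical helper (the name `vcs_revision` does not come from the repository) that pulls the `vcs.revision` build setting out of that command's output.

```shell
# Hypothetical helper, not part of the repository: reads `go version -m`
# output on stdin and prints the embedded commit hash, if one is present.
vcs_revision() {
  awk '$1 == "build" && $2 ~ /^vcs\.revision=/ {
    sub(/^vcs\.revision=/, "", $2)
    print $2
  }'
}

# Example use inside the builder or runtime container:
#   go version -m /usr/local/bin/ollama | vcs_revision
```

An empty result would suggest the binary was built without VCS stamping, for example from a source tree that was not a git checkout.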

### Makefile (docker/Makefile)
- Simplified from 289 to 145 lines (about 50% fewer lines)
- Removed: run, stop, logs, shell, test targets (use docker-compose instead)
- Removed: build orchestration targets (start-builder, copy-source, run-cmake, etc.)
- Removed: artifact copying (handled internally by multi-stage build)
- Focus: Build images only (build, build-builder, build-runtime, clean, help)
- All runtime operations delegated to docker-compose.yml
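
The build order the remaining targets imply can be sketched as a dry run. This is only an illustration, not the Makefile's actual contents: the function name `plan_build` and the runtime tag `ollama37` are assumptions, while `ollama37-builder` is the image name the runtime Dockerfile uses in its `FROM` lines.

```shell
# Dry-run sketch: prints the docker commands in the order the two images
# must be built. The runtime tag "ollama37" is an assumption;
# "ollama37-builder" is the name the runtime Dockerfile references in FROM.
plan_build() {
  runtime_tag="${1:-ollama37}"
  echo "docker build -t ollama37-builder docker/builder"
  echo "docker build -t ${runtime_tag} docker/runtime"
}

# Print the plan; pipe it to `sh` from the repository root to run it.
plan_build
```

The builder image has to exist locally before the runtime build starts, because both runtime stages name it as their base.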

### Documentation (docker/README.md)
- Completely rewritten for new two-stage architecture
- Added "Build System Components" section with file structure
- Documented why both runtime stages use builder base (library path compatibility)
- Updated build commands to use Makefile
- Updated runtime commands to use docker-compose
- Added comprehensive troubleshooting section
- Added build time and image size tables
- Reference to archived single-stage design

## Key Design Decision

**Problem**: Compiled binaries have hardcoded library paths
**Solution**: Use ollama37-builder as base for BOTH compile and runtime stages
**Trade-off**: Larger image (~18GB) vs guaranteed library compatibility
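
A quick way to check the compatibility claim is to ask the dynamic linker which libraries it cannot resolve. The helper below is a hypothetical sketch (the name `missing_libs` and the image tag `ollama37` are assumptions) and presumes `ldd` is available in the image.

```shell
# Hypothetical helper: reads `ldd` output on stdin and prints the name of
# every shared library the dynamic linker could not resolve.
missing_libs() {
  awk '/=> not found/ { print $1 }'
}

# Example use against a built runtime image (the tag "ollama37" is assumed):
#   docker run --rm --entrypoint ldd ollama37 /usr/local/bin/ollama | missing_libs
```

No output means every load-time dependency resolved. The check covers only libraries linked at load time, so any library the binary opens later at run time would need the same check applied to it directly.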

## Benefits

- Cleaner separation of concerns (builder env vs compilation vs runtime)
- Builder image cached after first build (90 min → <1 min rebuilds)
- Runtime rebuilds only take ~10 min (pulls latest code from GitHub)
- No library path mismatches (identical base images)
- No complex artifact extraction (multi-stage COPY)
- Simpler Makefile focused on image building
- Runtime management via docker-compose (industry standard)

## Files Changed

Modified:
- docker/builder/Dockerfile - Added comments, removed COPY config files
- docker/runtime/Dockerfile - Converted to two-stage build
- docker/Makefile - Simplified to focus on image building only
- docker/README.md - Comprehensive rewrite for new architecture

Deleted:
- docker/builder/README.md - No longer needed
- docker/builder/cuda-11.4.sh - Generated in Dockerfile
- docker/builder/gcc-10.conf - Generated in Dockerfile
- docker/builder/go.sh - Generated in Dockerfile

Archived:
- docker/Dockerfile → docker/Dockerfile.single-stage.archived

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-10 13:14:49 +08:00


# Ollama37 Runtime Image
# Two-stage build: compile stage builds the binary, runtime stage packages it
# Both stages use ollama37-builder base to maintain identical library paths
# This ensures the compiled binary can find all required runtime libraries
# Stage 1: Compile ollama37 from source
FROM ollama37-builder AS builder
# Clone ollama37 source code from GitHub
RUN cd /usr/local/src \
    && git clone https://github.com/dogkeeper886/ollama37.git
# Set working directory for build
WORKDIR /usr/local/src/ollama37
# Configure build with CMake
# Use "CUDA 11" preset for Tesla K80 compute capability 3.7 support
# Set LD_LIBRARY_PATH to find GCC 10 and system libraries during build
RUN bash -c 'LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64:/usr/lib64:$LD_LIBRARY_PATH \
CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ \
cmake --preset "CUDA 11"'
# Build C/C++/CUDA libraries with CMake
# Compile all GGML CUDA kernels and Ollama native libraries
RUN bash -c 'LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64:/usr/lib64:$LD_LIBRARY_PATH \
CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ \
cmake --build build -j$(nproc)'
# Build Go binary
# VCS info is embedded automatically since we cloned from git
RUN go build -o /usr/local/bin/ollama .
# Stage 2: Runtime environment
# Use ollama37-builder as base to maintain library path compatibility
# The compiled binary has hardcoded library paths that match this environment
FROM ollama37-builder AS runtime
# Copy the entire source directory including compiled libraries
# This preserves the exact directory structure the binary expects
COPY --from=builder /usr/local/src/ollama37 /usr/local/src/ollama37
# Copy the ollama binary to system bin directory
COPY --from=builder /usr/local/bin/ollama /usr/local/bin/ollama
# Setup library paths for runtime
# The binary expects libraries in these exact paths:
# /usr/local/src/ollama37/build/lib/ollama - Ollama CUDA/GGML libraries
# /usr/local/lib64 - GCC 10 runtime libraries (libstdc++, libgcc_s)
# /usr/local/cuda-11.4/lib64 - CUDA 11.4 runtime libraries
# /usr/lib64 - System libraries
ENV LD_LIBRARY_PATH=/usr/local/src/ollama37/build/lib/ollama:/usr/local/lib64:/usr/local/cuda-11.4/lib64:/usr/lib64
# Configure Ollama server to listen on all interfaces
ENV OLLAMA_HOST=0.0.0.0:11434
# Expose Ollama API port
EXPOSE 11434
# Create persistent volume for model storage
# Models downloaded by Ollama will be stored here
RUN mkdir -p /root/.ollama
VOLUME ["/root/.ollama"]
# Configure health check to verify Ollama is running
# Uses 'ollama list' command to check if the service is responsive
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
CMD /usr/local/bin/ollama list || exit 1
# Set entrypoint and default command
# Container runs 'ollama serve' by default to start the API server
ENTRYPOINT ["/usr/local/bin/ollama"]
CMD ["serve"]