Mirror of https://github.com/dogkeeper886/ollama37.git (synced 2025-12-10 07:46:59 +00:00)
Redesign Docker build system to two-stage architecture with builder/runtime separation
Redesigned the Docker build system from a single-stage monolithic design to a clean two-stage architecture that separates the build environment from the compilation process while maintaining library path compatibility.

## Architecture Changes

### Builder Image (docker/builder/Dockerfile)
- Provides the base environment: CUDA 11.4, GCC 10, CMake 4, Go 1.25.3
- Built once, cached for subsequent builds (~90 min first time)
- Removed config file copying (cuda-11.4.sh, gcc-10.conf, go.sh)
- Added comprehensive comments explaining each build step
- Added git installation for runtime-stage source cloning

### Runtime Image (docker/runtime/Dockerfile)
- Two-stage build using ollama37-builder as the base for BOTH stages
- Stage 1 (compile): Clone source from GitHub → CMake configure → Build C/C++/CUDA → Build Go
- Stage 2 (runtime): Copy artifacts from stage 1 → Set up environment → Configure server
- Both stages use an identical base image to ensure library path compatibility
- Removed the -buildvcs=false flag (VCS info is embedded from the git clone)
- Comprehensive comments documenting library paths and design rationale

### Makefile (docker/Makefile)
- Simplified from 289 to 145 lines (-50% complexity)
- Removed: run, stop, logs, shell, test targets (use docker-compose instead)
- Removed: build orchestration targets (start-builder, copy-source, run-cmake, etc.)
- Removed: artifact copying (handled internally by the multi-stage build)
- Focus: build images only (build, build-builder, build-runtime, clean, help)
- All runtime operations delegated to docker-compose.yml

### Documentation (docker/README.md)
- Completely rewritten for the new two-stage architecture
- Added a "Build System Components" section with the file structure
- Documented why both runtime stages use the builder base (library path compatibility)
- Updated build commands to use the Makefile
- Updated runtime commands to use docker-compose
- Added a comprehensive troubleshooting section
- Added build time and image size tables
- Added a reference to the archived single-stage design

## Key Design Decision

**Problem**: Compiled binaries have hardcoded library paths
**Solution**: Use ollama37-builder as the base for BOTH the compile and runtime stages
**Trade-off**: Larger image (~18GB) vs guaranteed library compatibility

## Benefits

- ✅ Cleaner separation of concerns (builder env vs compilation vs runtime)
- ✅ Builder image cached after the first build (90 min → <1 min rebuilds)
- ✅ Runtime rebuilds take only ~10 min (pulls the latest code from GitHub)
- ✅ No library path mismatches (identical base images)
- ✅ No complex artifact extraction (multi-stage COPY)
- ✅ Simpler Makefile focused on image building
- ✅ Runtime management via docker-compose (industry standard)

## Files Changed

Modified:
- docker/builder/Dockerfile - Added comments, removed COPY of config files
- docker/runtime/Dockerfile - Converted to two-stage build
- docker/Makefile - Simplified to focus on image building only
- docker/README.md - Comprehensive rewrite for the new architecture

Deleted:
- docker/builder/README.md - No longer needed
- docker/builder/cuda-11.4.sh - Generated in Dockerfile
- docker/builder/gcc-10.conf - Generated in Dockerfile
- docker/builder/go.sh - Generated in Dockerfile

Archived:
- docker/Dockerfile → docker/Dockerfile.single-stage.archived

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
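As a rough usage sketch (not part of the commit itself): the Makefile targets named above build the images, and day-to-day runtime goes through docker-compose. The compose service names and image tags are not shown in this commit, so the commands below stick to generic forms:

```bash
# From the docker/ directory of the repo

# One-time (then cached) builder image: CUDA 11.4, GCC 10, CMake 4, Go 1.25.3
make build-builder    # ~90 min on first build, <1 min when cached

# Runtime image: clones the latest source from GitHub, compiles, packages (~10 min)
make build-runtime

# Runtime operations are delegated to docker-compose
docker compose up -d      # start the Ollama server
docker compose logs -f    # follow server logs
```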
@@ -1,46 +1,73 @@
-FROM rockylinux/rockylinux:8
-
-# Install only CUDA runtime libraries (not the full toolkit)
-# The host system provides the NVIDIA driver at runtime via --gpus flag
-RUN dnf -y install dnf-plugins-core\
-&& dnf -y config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo\
-&& dnf -y install cuda-cudart-11-4 libcublas-11-4 \
-&& dnf clean all
-
-# Create directory structure
-RUN mkdir -p /usr/local/bin /usr/local/lib/ollama
-
-# Copy the ollama binary from build output
-COPY docker/output/ollama /usr/local/bin/ollama
-
-# Copy all shared libraries from build output (includes ollama libs + GCC 10 runtime libs)
-COPY docker/output/lib/ /usr/local/lib/ollama/
-
-# Set library path to include our ollama libraries first
-# This includes:
-# - Ollama CUDA/GGML libraries
-# - GCC 10 runtime libraries (libstdc++.so.6, libgcc_s.so.1)
-# - System CUDA libraries
-ENV LD_LIBRARY_PATH=/usr/local/lib/ollama:/usr/local/cuda-11.4/lib64:/usr/lib64
-
-# Base image already sets these, but we can override if needed:
-# NVIDIA_DRIVER_CAPABILITIES=compute,utility
-# NVIDIA_VISIBLE_DEVICES=all
-
-# Ollama server configuration
+# Ollama37 Runtime Image
+# Two-stage build: compile stage builds the binary, runtime stage packages it
+# Both stages use ollama37-builder base to maintain identical library paths
+# This ensures the compiled binary can find all required runtime libraries
+
+# Stage 1: Compile ollama37 from source
+FROM ollama37-builder as builder
+
+# Clone ollama37 source code from GitHub
+RUN cd /usr/local/src\
+&& git clone https://github.com/dogkeeper886/ollama37.git
+
+# Set working directory for build
+WORKDIR /usr/local/src/ollama37
+
+# Configure build with CMake
+# Use "CUDA 11" preset for Tesla K80 compute capability 3.7 support
+# Set LD_LIBRARY_PATH to find GCC 10 and system libraries during build
+RUN bash -c 'LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64:/usr/lib64:$LD_LIBRARY_PATH \
+CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ \
+cmake --preset "CUDA 11"'
+
+# Build C/C++/CUDA libraries with CMake
+# Compile all GGML CUDA kernels and Ollama native libraries
+RUN bash -c 'LD_LIBRARY_PATH=/usr/local/lib:/usr/local/lib64:/usr/lib64:$LD_LIBRARY_PATH \
+CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ \
+cmake --build build -j$(nproc)'
+
+# Build Go binary
+# VCS info is embedded automatically since we cloned from git
+RUN go build -o /usr/local/bin/ollama .
+
+# Stage 2: Runtime environment
+# Use ollama37-builder as base to maintain library path compatibility
+# The compiled binary has hardcoded library paths that match this environment
+FROM ollama37-builder as runtime
+
+# Copy the entire source directory including compiled libraries
+# This preserves the exact directory structure the binary expects
+COPY --from=builder /usr/local/src/ollama37 /usr/local/src/ollama37
+
+# Copy the ollama binary to system bin directory
+COPY --from=builder /usr/local/bin/ollama /usr/local/bin/ollama
+
+# Setup library paths for runtime
+# The binary expects libraries in these exact paths:
+# /usr/local/src/ollama37/build/lib/ollama - Ollama CUDA/GGML libraries
+# /usr/local/lib64 - GCC 10 runtime libraries (libstdc++, libgcc_s)
+# /usr/local/cuda-11.4/lib64 - CUDA 11.4 runtime libraries
+# /usr/lib64 - System libraries
+ENV LD_LIBRARY_PATH=/usr/local/src/ollama37/build/lib/ollama:/usr/local/lib64:/usr/local/cuda-11.4/lib64:/usr/lib64
+
+# Configure Ollama server to listen on all interfaces
 ENV OLLAMA_HOST=0.0.0.0:11434
 
-# Expose the Ollama API port
+# Expose Ollama API port
 EXPOSE 11434
 
-# Create a data directory for models
+# Create persistent volume for model storage
+# Models downloaded by Ollama will be stored here
 RUN mkdir -p /root/.ollama
 VOLUME ["/root/.ollama"]
 
-# Health check
+# Configure health check to verify Ollama is running
+# Uses 'ollama list' command to check if the service is responsive
 HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
 CMD /usr/local/bin/ollama list || exit 1
 
+# Set entrypoint and default command
+# Container runs 'ollama serve' by default to start the API server
 ENTRYPOINT ["/usr/local/bin/ollama"]
 CMD ["serve"]
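A quick way to sanity-check the key design decision above (identical base images so the binary finds every library it was linked against) is to run the dynamic linker against the packaged binary. The image tag `ollama37-runtime` is an assumption here; substitute whatever tag `make build-runtime` actually produces:

```bash
# List unresolved shared libraries; no output means every dependency was found
docker run --rm --entrypoint ldd ollama37-runtime /usr/local/bin/ollama | grep "not found"

# Show the library search path baked into the runtime stage via ENV
docker run --rm --entrypoint bash ollama37-runtime -c 'echo $LD_LIBRARY_PATH'
```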