Reorganize Docker build infrastructure for better maintainability

- Restructure from ollama37/ to docker/ with clear separation
- Separate builder and runtime images into dedicated directories
- Group environment scripts in builder/scripts/ subdirectory
- Add comprehensive root-level README.md (257 lines)
- Add .dockerignore files for optimized build contexts
- Enhance shell scripts with shebangs and documentation headers
- Update docker-compose.yml to build locally instead of pulling
- Add environment variables for GPU and host configuration
- Remove duplicate Dockerfile and confusing nested structure

New structure:

  docker/
  ├── README.md (comprehensive documentation)
  ├── docker-compose.yml (local build support)
  ├── builder/ (build environment: CUDA 11.4 + GCC 10 + Go 1.24)
  │   ├── Dockerfile
  │   ├── README.md
  │   ├── .dockerignore
  │   └── scripts/ (organized environment setup)
  └── runtime/ (production image)
      ├── Dockerfile
      ├── README.md
      └── .dockerignore

This reorganization eliminates confusion, removes duplication, and provides a professional, maintainable structure for Tesla K80 builds.
2025-12-09 23:37:06 +00:00 · 2025-10-28 14:47:39 +08:00
parent 736cbdf52a
commit 8dc4ca7ccc
11 changed files with 569 additions and 0 deletions
--- a/docker/README.md
+++ b/docker/README.md
@@ -0,0 +1,257 @@
+# Ollama37 Docker Build System
+
+Docker infrastructure for running Ollama on NVIDIA Tesla K80 GPUs (CUDA Compute Capability 3.7).
+
+## Overview
+
+This directory contains Docker build configurations for the ollama37 project, which maintains support for legacy NVIDIA Tesla K80 GPUs. The official Ollama project dropped support for Compute Capability 3.7 when transitioning to CUDA 12, but this fork preserves compatibility using CUDA 11.4.
+
+## Directory Structure
+
+```
+docker/
+├── README.md                  # This file - overview of Docker build system
+├── docker-compose.yml         # Simplified deployment configuration
+├── builder/                   # Build environment image
+│   ├── Dockerfile             # Builder base image (CUDA 11.4 + GCC 10 + Go 1.24)
+│   ├── README.md              # Builder documentation
+│   ├── .dockerignore          # Build-context exclusions
+│   └── scripts/               # Environment setup scripts
+│       ├── cuda-11.4.sh       # CUDA 11.4 PATH configuration
+│       ├── gcc-10.sh          # GCC 10 library paths
+│       └── go-1.24.2.sh       # Go 1.24.2 PATH configuration
+└── runtime/                   # Production runtime image
+    ├── Dockerfile             # Multi-stage build for ollama37 binary
+    ├── README.md              # Runtime image documentation
+    └── .dockerignore          # Build-context exclusions
+```
+
+## Components
+
+### Builder Image (`builder/`)
+
+The builder image provides a complete compilation environment for building ollama37 from source with Tesla K80 support.
+
+**Base:** Rocky Linux 8  
+**Key Software:**
+- CUDA 11.4 Toolkit (the release paired with driver branch 470; CUDA 12 removed Compute Capability 3.7)
+- NVIDIA Driver 470 (compatible with Tesla K80)
+- GCC 10 (custom-built from source)
+- CMake 4.0.0
+- Go 1.24.2
+
+**Purpose:** Compile ollama37 binary with optimized CUDA kernels for Tesla K80 GPUs.
+
+See [builder/README.md](builder/README.md) for detailed information.
+
+### Runtime Image (`runtime/`)
+
+The runtime image is a minimal production image containing only the compiled ollama37 binary and required CUDA libraries.
+
+**Base:** Rocky Linux 8  
+**Build:** Multi-stage build using builder image  
+**Size:** Optimized for production deployment  
+
+**Purpose:** Run ollama37 server with NVIDIA GPU acceleration.
+
+See [runtime/README.md](runtime/README.md) for usage instructions.
+
+## Quick Start
+
+### Option 1: Build from Source (Recommended for Development)
+
+Build both builder and runtime images locally:
+
+```bash
+cd ollama37/docker  # adjust to wherever you cloned the repository
+
+# Build the builder image first (takes ~30-60 minutes)
+docker build -t ollama37-builder:local -f builder/Dockerfile builder/
+
+# Build the runtime image (uses local builder)
+docker build -t ollama37:local -f runtime/Dockerfile runtime/
+
+# Run with docker-compose
+docker-compose up -d
+```
+
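+### Option 2: Use Pre-built Images
+
+Pull and run the pre-built image from Docker Hub. The bundled `docker-compose.yml` builds and tags `ollama37:local`, so pull the published image directly:
+
+```bash
+docker pull dogkeeper886/ollama37
+docker run -d --name ollama37 --runtime nvidia \
+  -p 11434:11434 -v ollama37-data:/root/.ollama dogkeeper886/ollama37
+```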
+
+### Option 3: Manual Docker Run
+
+Run the runtime image directly:
+
+```bash
+docker run -d \
+  --name ollama37 \
+  --runtime nvidia \
+  -p 11434:11434 \
+  -v ollama37-data:/root/.ollama \
+  ollama37:local
+```
+
+## Usage
+
+### Start the Server
+
+```bash
+docker-compose up -d
+```
+
+### Check Logs
+
+```bash
+docker-compose logs -f
+```
+
+### Pull a Model
+
+```bash
+docker exec -it ollama37 ollama pull llama3.2:3b
+```
+
+### Run a Chat Session
+
+```bash
+docker exec -it ollama37 ollama run llama3.2:3b
+```
+
+### API Access
+
+The Ollama API is available at `http://localhost:11434`:
+
+```bash
+curl http://localhost:11434/api/generate -d '{
+  "model": "llama3.2:3b",
+  "prompt": "Why is the sky blue?"
+}'
+```
+
+### Stop the Server
+
+```bash
+docker-compose down
+```
+
+## Tesla K80 Support
+
+### Hardware Requirements
+
+- **GPU:** NVIDIA Tesla K80 (Compute Capability 3.7)
+- **VRAM:** 12GB per GPU (24GB for dual-GPU K80)
+- **Driver:** NVIDIA Driver 470 or compatible
+- **Runtime:** nvidia-docker2 or NVIDIA Container Toolkit
+
+### Recommended Models
+
+Based on 12GB VRAM per GPU:
+
+| Model | Size | Quantization | Context Length |
+|-------|------|--------------|----------------|
+| Llama 3.2 3B | 3B | Full precision | 8K |
+| Qwen 2.5 7B | 7B | Q4_K_M | 4K |
+| Llama 3.1 8B | 8B | Q4_0 | 4K |
+| Mistral 7B | 7B | Q4_K_M | 4K |
+
+For larger models, use aggressive quantization (Q4_0, Q4_K_M) or multi-GPU setups.
+
+### Multi-GPU Support
+
+Tesla K80 dual-GPU configurations are supported:
+
+```bash
+# Use both GPUs
+docker run --gpus all ...
+
+# Use specific GPU
+docker run --gpus '"device=0"' ...
+
+# Split model across GPUs
+docker run -e CUDA_VISIBLE_DEVICES=0,1 ...
+```
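+
+The same device selection can also be expressed in Compose rather than through `docker run` flags. The fragment below is a sketch and not part of this repository's `docker-compose.yml`: it assumes a Compose version that implements the `deploy.resources.reservations.devices` syntax, and it would take the place of the `runtime: nvidia` line for the `ollama37` service.
+
+```yaml
+services:
+  ollama37:
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              device_ids: ["0"]   # or use `count: all` instead to reserve every GPU
+              capabilities: [gpu]
+```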
+
+## Build Configuration
+
+### Environment Variables
+
+The build sets these environment variables so that CMake picks up the custom-built GCC 10:
+
+- `CC=/usr/local/bin/gcc` - Use custom GCC 10
+- `CXX=/usr/local/bin/g++` - Use custom G++ 10
+
+### CUDA Architecture Targets
+
+The build includes Compute Capability 3.7 in `CMAKE_CUDA_ARCHITECTURES`:
+
+```cmake
+set(CMAKE_CUDA_ARCHITECTURES "37;50;61;70;75;80")
+```
+
+This ensures CUDA kernels are compiled for Tesla K80 (CC 3.7).
+
+## Troubleshooting
+
+### GPU Not Detected
+
+Check NVIDIA runtime and driver installation:
+
+```bash
+docker run --rm --gpus all nvidia/cuda:11.4.0-base-ubuntu20.04 nvidia-smi
+```
+
+### Out of Memory Errors
+
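+Reduce the context length (for example through the API's `num_ctx` option) or use more aggressive quantization:
+
+```bash
+curl http://localhost:11434/api/generate -d '{"model": "llama3.2:3b", "prompt": "Why is the sky blue?", "options": {"num_ctx": 2048}}'
+```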
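+
+To lower the default context length for every request rather than per call, upstream Ollama reads the `OLLAMA_CONTEXT_LENGTH` environment variable. Whether this fork honors it depends on the upstream release it tracks, so treat the Compose fragment below as an assumption to verify rather than a documented setting of this project:
+
+```yaml
+services:
+  ollama37:
+    environment:
+      - OLLAMA_HOST=0.0.0.0
+      - OLLAMA_CONTEXT_LENGTH=2048   # assumed to be supported by the fork's Ollama version
+```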
+
+### Build Failures
+
+Ensure sufficient disk space and memory:
+
+- Disk space: 50GB+ recommended
+- RAM: 16GB+ recommended for building
+- Swap: 8GB+ recommended if RAM is limited
+
+## Development
+
+### Rebuilding After Changes
+
+```bash
+# Rebuild builder image
+docker build --no-cache -t ollama37-builder:local -f builder/Dockerfile builder/
+
+# Rebuild runtime image
+docker build --no-cache -t ollama37:local -f runtime/Dockerfile runtime/
+```
+
+### Testing Local Builds
+
+```bash
+# Run with local image
+docker run --rm -it --gpus all ollama37:local ollama --version
+```
+
+## Contributing
+
+See the main project repository for contribution guidelines:
+- [ollama37 GitHub Repository](https://github.com/dogkeeper886/ollama37)
+
+## License
+
+This project maintains the same license as the upstream Ollama project.
+
+## Resources
+
+- [Ollama Documentation](https://github.com/ollama/ollama/tree/main/docs)
+- [CUDA 11.4 Documentation](https://docs.nvidia.com/cuda/archive/11.4.0/)
+- [Tesla K80 Specifications](https://www.nvidia.com/en-gb/data-center/tesla-k80/)
+- [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html)
--- a/docker/builder/.dockerignore
+++ b/docker/builder/.dockerignore
@@ -0,0 +1,28 @@
+# Exclude documentation
+*.md
+README.md
+
+# Exclude runtime files
+../runtime/
+
+# Exclude volume data
+../volume/
+
+# Exclude docker-compose
+../docker-compose.yml
+
+# Exclude git
+.git/
+.gitignore
+.gitattributes
+
+# Exclude editor files
+.vscode/
+.idea/
+*.swp
+*.swo
+*~
+
+# Exclude OS files
+.DS_Store
+Thumbs.db
--- a/docker/builder/Dockerfile
+++ b/docker/builder/Dockerfile
@@ -0,0 +1,51 @@
+FROM rockylinux/rockylinux:8
+
+# Update OS and install CUDA Toolkit 11.4 and NVIDIA driver 470
+RUN dnf -y update\
+    && dnf -y install epel-release\
+    && dnf -y config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo\
+    && dnf -y module install nvidia-driver:470-dkms\
+    && dnf -y install cuda-11-4
+
+# Post install, setup path (scripts live in builder/scripts/ within the build context)
+COPY scripts/cuda-11.4.sh /etc/profile.d/cuda-11.4.sh
+
+# Install gcc 10
+RUN dnf -y install wget unzip lbzip2\
+    && dnf -y groupinstall "Development Tools"\
+    && cd /usr/local/src\
+    && wget https://github.com/gcc-mirror/gcc/archive/refs/heads/releases/gcc-10.zip\
+    && unzip gcc-10.zip\
+    && cd gcc-releases-gcc-10\
+    && contrib/download_prerequisites\
+    && mkdir /usr/local/gcc-10\
+    && cd /usr/local/gcc-10\
+    && /usr/local/src/gcc-releases-gcc-10/configure --disable-multilib\
+    && make -j "$(nproc)"\
+    && make install
+
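+# Post install, setup path and register the GCC 10 library directory with the dynamic linker
+COPY scripts/gcc-10.sh /etc/profile.d/gcc-10.sh
+RUN echo "/usr/local/lib64" > /etc/ld.so.conf.d/gcc-10.conf && ldconfig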
+
+# Install cmake
+ENV LD_LIBRARY_PATH="/usr/local/lib64:/usr/local/cuda-11.4/lib64"
+ENV PATH="/usr/local/cuda-11.4/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
+RUN dnf -y install openssl-devel\
+    && cd /usr/local/src\
+    && wget https://github.com/Kitware/CMake/releases/download/v4.0.0/cmake-4.0.0.tar.gz\
+    && tar xvf cmake-4.0.0.tar.gz\
+    && mkdir /usr/local/cmake-4\
+    && cd /usr/local/cmake-4\
+    && /usr/local/src/cmake-4.0.0/configure\
+    && make -j "$(nproc)"\
+    && make install
+
+# Install go
+RUN cd /usr/local\
+    && wget https://go.dev/dl/go1.24.2.linux-amd64.tar.gz\
+    && tar xvf go1.24.2.linux-amd64.tar.gz
+
+# Post install, setup path
+COPY scripts/go-1.24.2.sh /etc/profile.d/go-1.24.2.sh
+ENV PATH="$PATH:/usr/local/go/bin"
--- a/docker/builder/README.md
+++ b/docker/builder/README.md
@@ -0,0 +1,49 @@
+# Ollama CUDA 11.4 Builder Image for Tesla K80 (Compute Capability 3.7)
+
+This Docker image provides a build environment for compiling [Ollama37](https://github.com/dogkeeper886/ollama37) for older NVIDIA GPUs such as the **Tesla K80**, which has a compute capability of `3.7`. It bundles the CUDA 11.4 toolkit and the compilers needed to build the project from source.
+
+## 🔧 Key Features
+
+- **Base Image:** Rocky Linux 8
+- **CUDA Toolkit 11.4:** Provides the compiler and libraries for GPU acceleration.
+- **NVIDIA Driver 470:** Installed from the `nvidia-driver:470-dkms` module stream for compatibility with Tesla K80 GPUs (compute capability 3.7).
+- **GCC 10:** Compiled from source inside the image and used as the C/C++ compiler for the build.
+- **CMake 4.0.0:** Compiled from source inside the image and used to configure and drive the C/C++ build.
+- **Go 1.24.2:** Used to compile the Go code, including the parts that rely on cgo.
+
+This image balances support for legacy hardware such as the Tesla K80 with the toolchain versions needed to build Ollama37.
+
+
+## 🚀 How To Use
+
+This container is intended for compiling projects that target NVIDIA GPUs with compute capability `3.7`.
+
+### Quick Example Usage:
+
+```bash
+docker run --rm -it dogkeeper886/ollama37-builder bash
+```
+
+Once you are inside the container (`dogkeeper886/ollama37-builder`):
+
+1. Clone the source and enter the project directory:
+    ```bash
+    cd /usr/local/src \
+        && git clone https://github.com/dogkeeper886/ollama37 \
+        && cd ollama37
+    ```
+2. Configure and build the native components with CMake, using the custom-built GCC:
+    ```bash
+    CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake -B build \
+        && cmake --build build
+    ```
+3. Build the Go binary with the bundled Go toolchain:
+    ```bash
+    go build -o ollama .
+    ```
+
+## 🎯 Contributing
+
+We're thrilled to welcome your contributions! Should you encounter any issues or have ideas for improving this Docker image, please submit them as an issue on the GitHub repository: [https://github.com/dogkeeper886/ollama-k80-lab](https://github.com/dogkeeper886/ollama-k80-lab).
+
+We are committed to continually enhancing our projects and appreciate all feedback. Thank you!
--- a/docker/builder/scripts/cuda-11.4.sh
+++ b/docker/builder/scripts/cuda-11.4.sh
@@ -0,0 +1,6 @@
+#!/bin/sh
+# cuda-11.4.sh - CUDA 11.4 environment configuration for Tesla K80 support
+# This script configures PATH and LD_LIBRARY_PATH for CUDA 11.4 toolkit
+
+export PATH=/usr/local/cuda-11.4/bin${PATH:+:${PATH}}
+export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
--- a/docker/builder/scripts/gcc-10.sh
+++ b/docker/builder/scripts/gcc-10.sh
@@ -0,0 +1,5 @@
+#!/bin/sh
+# gcc-10.sh - GCC 10 library path configuration
+# This script configures LD_LIBRARY_PATH for custom-built GCC 10
+
+export LD_LIBRARY_PATH=/usr/local/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
--- a/docker/builder/scripts/go-1.24.2.sh
+++ b/docker/builder/scripts/go-1.24.2.sh
@@ -0,0 +1,5 @@
+#!/bin/sh
+# go-1.24.2.sh - Go 1.24.2 environment configuration
+# This script adds Go 1.24.2 binary directory to PATH
+
+export PATH=/usr/local/go/bin${PATH:+:${PATH}}
--- a/docker/docker-compose.yml
+++ b/docker/docker-compose.yml
@@ -0,0 +1,16 @@
+services:
+  ollama37:
+    build:
+      context: ./runtime
+      dockerfile: Dockerfile
+    image: ollama37:local
+    container_name: ollama37
+    ports:
+      - "11434:11434"
+    restart: unless-stopped
+    runtime: nvidia
+    volumes:
+      - ./volume:/root/.ollama
+    environment:
+      - NVIDIA_VISIBLE_DEVICES=all
+      - OLLAMA_HOST=0.0.0.0
--- a/docker/runtime/.dockerignore
+++ b/docker/runtime/.dockerignore
@@ -0,0 +1,28 @@
+# Exclude documentation
+*.md
+README.md
+
+# Exclude builder files
+../builder/
+
+# Exclude volume data
+../volume/
+
+# Exclude docker-compose
+../docker-compose.yml
+
+# Exclude git
+.git/
+.gitignore
+.gitattributes
+
+# Exclude editor files
+.vscode/
+.idea/
+*.swp
+*.swo
+*~
+
+# Exclude OS files
+.DS_Store
+Thumbs.db
--- a/docker/runtime/Dockerfile
+++ b/docker/runtime/Dockerfile
--- a/docker/runtime/README.md
+++ b/docker/runtime/README.md
@@ -0,0 +1,124 @@
+# Docker Image for Ollama on NVIDIA K80 GPU
+
+## Description
+
+This Docker image provides a ready-to-use environment for running Ollama, a local Large Language Model (LLM) runner, specifically optimized to leverage the capabilities of an NVIDIA K80 GPU. This setup is ideal for AI researchers and developers looking to experiment with models in a controlled home lab setting.
+
+The project repository, [dogkeeper886/ollama-k80-lab](https://github.com/dogkeeper886/ollama-k80-lab), offers insights into configuring and using the image effectively. The Dockerfile included in this image is designed for ease of use and efficiency:
+
+- **Build Stage**: Compiles Ollama from source using GCC and CMake.
+- **Runtime Environment**: Utilizes Rocky Linux 8 with necessary GPU drivers and libraries pre-configured.
+
+This setup ensures that users can start experimenting with AI models without the hassle of manual environment configuration, making it a perfect playground for innovation in AI research.
+
+## Features
+
+- **GPU Acceleration**: Fully supports NVIDIA K80 GPUs to accelerate model computations.
+- **Multi-Modal AI**: Supports vision-language models like Qwen2.5-VL for image understanding.
+- **Advanced Reasoning**: Built-in thinking support for enhanced AI reasoning capabilities.
+- **Pre-built Binary**: Contains the compiled Ollama binary for immediate use.
+- **CUDA Libraries**: Includes necessary CUDA libraries and drivers for GPU operations.
+- **Enhanced Tool Support**: Improved tool calling and WebP image input support.
+- **Environment Variables**: Configured to facilitate seamless interaction with the GPU and network settings.
+
+## Usage
+
+### Prerequisites
+
+Ensure you have Docker installed on your system and that your NVIDIA K80 GPU is properly set up. You may need the NVIDIA Container Toolkit to enable GPU support in Docker containers.
+
+### Pulling the Image
+
+To pull the image from Docker Hub, use:
+
+```bash
+docker pull dogkeeper886/ollama37
+```
+
+### Running the Container
+
+To run the container with GPU support, execute:
+
+```bash
+docker run --runtime=nvidia --gpus all -p 11434:11434 dogkeeper886/ollama37
+```
+
+This command will start Ollama and expose it on port `11434`, allowing you to interact with the service.
+
+## Ollama37 Docker Compose
+
+This `docker-compose.yml` file sets up an Ollama37 container for a more streamlined and persistent environment. It uses a volume to persist data and restarts the container automatically if it fails.
+
+### Prerequisites
+
+*   Docker
+*   Docker Compose
+
+### Usage
+
+1.  **Save the `docker-compose.yml` file:** Save the content provided below into a file named `docker-compose.yml` in a convenient directory.
+
+2.  **Run the container:** Open a terminal in the directory where you saved the file and run the following command:
+
+    ```bash
+    docker-compose up -d
+    ```
+
+    This command downloads the `dogkeeper886/ollama37` image (if not already present) and starts the Ollama container in detached mode.
+
+    ```yml
+    services:
+      ollama37:
+        image: dogkeeper886/ollama37
+        container_name: ollama37
+        ports:
+          - "11434:11434"
+        restart: unless-stopped # Automatically restart the container
+        runtime: nvidia # Utilize NVIDIA GPU runtime
+        volumes:
+          - ./volume:/root/.ollama # Persist Ollama data
+    ```
+
+    **Explanation of key `docker-compose.yml` directives:**
+
+    *   `services.ollama37.image: dogkeeper886/ollama37`: Defines the Docker image to use.
+    *   `ports: - "11434:11434"`: Maps port 11434 on the host machine to port 11434 inside the container, making Ollama accessible.
+    *   `volumes: - ./volume:/root/.ollama`:  **Important:**  This mounts a directory named `volume`, located next to the `docker-compose.yml` file, at `/root/.ollama` inside the container, so downloaded models and Ollama configuration data persist even if the container is stopped or removed.  Docker creates the directory on first start if it does not already exist.
+    *   `restart: unless-stopped`:  Restarts the container automatically if it exits or the Docker daemon restarts, unless you stopped it explicitly (for example with `docker-compose down`).
+    *   `runtime: nvidia`: Explicitly instructs Docker to use the NVIDIA runtime, ensuring GPU acceleration.
+
+3.  **Accessing Ollama:** After running the container, you can interact with Ollama using its API.  Refer to the Ollama documentation for usage details.
+
+### Stopping the Container
+
+To stop the container, run:
+
+```bash
+docker-compose down
+```
+
+This will stop and remove the container, but the data stored in the `./volume` directory will be preserved.
+
+## 📦 Version History
+
+### v1.3.0 (2025-07-01)
+
+This release expands model support while maintaining full Tesla K80 compatibility:
+
+**New Model Support:**
+- **Qwen2.5-VL**: Multi-modal vision-language model for image understanding
+- **Gemma 3n**: Efficient models designed for execution on everyday devices such as laptops, tablets or phones
+
+**Documentation Updates:**
+- Updated installation guides for Tesla K80 compatibility
+
+### v1.2.0 (2025-05-06)
+
+This release introduces support for Qwen3 models, marking a significant step in our commitment to keeping the Tesla K80 compatible with leading open-source language models. Testing includes successful execution of Gemma 3 12B, Phi-4 Reasoning 14B, and Qwen3 14B, ensuring compatibility with models expected to be widely used in May 2025.
+
+## 🎯 Contributing
+
+We're thrilled to welcome your contributions! Should you encounter any issues or have ideas for improving this Docker image, please submit them as an issue on the GitHub repository: [https://github.com/dogkeeper886/ollama-k80-lab](https://github.com/dogkeeper886/ollama-k80-lab).
+
+We are committed to continually enhancing our projects and appreciate all feedback. Thank you!