From 83973336d644a6cad685e63a3a0e002aea7ea32c Mon Sep 17 00:00:00 2001
From: Shang Chieh Tseng
Date: Fri, 8 Aug 2025 11:44:59 +0800
Subject: [PATCH] Optimize Docker build performance with parallel compilation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Add -j$(nproc) flag to cmake build in ollama37.Dockerfile
- Use all available CPU cores for faster compilation
- Add sync-upstream.md documentation for future maintenance

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude
---
 docs/sync-upstream.md | 147 ++++++++++++++++++++++++++++++++++++++++++
 ollama37.Dockerfile   |   2 +-
 2 files changed, 148 insertions(+), 1 deletion(-)
 create mode 100644 docs/sync-upstream.md

diff --git a/docs/sync-upstream.md b/docs/sync-upstream.md
new file mode 100644
index 00000000..9af84817
--- /dev/null
+++ b/docs/sync-upstream.md
@@ -0,0 +1,147 @@
+# Syncing with Upstream Ollama
+
+This document describes the process for syncing the ollama37 fork with the official ollama/ollama repository while preserving CUDA Compute Capability 3.7 support for Tesla K80 GPUs.
+
+## Prerequisites
+
+- Git configured with the upstream remote: `https://github.com/ollama/ollama.git`
+- Understanding of which files contain CUDA 3.7 specific changes
+
+## Key Files to Preserve
+
+When merging from upstream, always preserve CUDA 3.7 support in these files:
+
+1. **`ml/backend/ggml/ggml/src/ggml-cuda/CMakeLists.txt`** - Contains `CMAKE_CUDA_ARCHITECTURES "37;50;61;70;75;80"`
+2. **`CMakePresets.json`** - Keep both "CUDA 11" (with arch 37) and "CUDA 12" presets
+3. **`README.md`** - Maintain ollama37 specific documentation
+4. **`docs/*.md`** - Keep our custom documentation
+
+## Sync Process
+
+### 1. Create a New Branch
+
+```bash
+git checkout main
+git checkout -b sync-upstream-models
+```
+
+### 2. Add Upstream Remote (if not exists)
+
+```bash
+git remote add upstream https://github.com/ollama/ollama.git
+```
+
+### 3. Fetch Latest Changes
+
+```bash
+git fetch upstream main
+```
+
+### 4. Merge Upstream Changes
+
+```bash
+git merge upstream/main -m "Merge upstream ollama/ollama main branch while preserving CUDA 3.7 support"
+```
+
+### 5. Resolve Conflicts
+
+Common conflict resolutions:
+
+#### CMakePresets.json
+- **Resolution**: Keep both CUDA 11 (with architecture 37) and CUDA 12 configurations
+- **Example**: Preserve the "CUDA 11" preset with `"CMAKE_CUDA_ARCHITECTURES": "37;50;52;53;60;61;70;75;80;86"`
+
+#### README.md and Documentation
+- **Resolution**: Keep our version with `git checkout --ours README.md`
+- **Reason**: Maintains ollama37 specific instructions and branding
+
+#### Model Support Files
+- **Resolution**: Accept upstream changes for new model support
+- **Files**: `model/models/models.go`, new model directories
+- **Example**: Accept new imports like `_ "github.com/ollama/ollama/model/models/gptoss"`
+
+#### Backend/Tools Updates
+- **Resolution**: Generally accept upstream improvements
+- **Files**: `tools/tools.go`, `ml/backend/ggml/ggml.go`
+- **Caution**: Verify no CUDA 3.7 specific code is removed
+
+#### Test Files
+- **Resolution**: Accept upstream test additions
+- **Files**: `integration/utils_test.go`
+- **Action**: Include new model tests in the test lists
+
+### 6. Commit the Merge
+
+```bash
+git add -A
+git commit -m "Merge upstream ollama/ollama main branch while preserving CUDA 3.7 support
+
+- Added support for new [model name] from upstream
+- Preserved CUDA Compute Capability 3.7 (Tesla K80) support
+- Kept CUDA 11 configuration alongside CUDA 12
+- Maintained all documentation specific to ollama37 fork
+- [List other significant changes]"
+```
+
+### 7. Test the Build
+
+Build with Docker to verify CUDA 3.7 support:
+
+```bash
+docker build -f ollama37.Dockerfile -t ollama37:test .
+```
+
+Or build manually:
+
+```bash
+CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake -B build
+CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake --build build
+go build -o ollama .
+```
+
+### 8. Test with Tesla K80
+
+```bash
+# Run the server
+./ollama serve
+
+# In another terminal, test a model
+ollama run llama3.2:1b
+```
+
+### 9. Merge to Main
+
+After successful testing:
+
+```bash
+git checkout main
+git merge sync-upstream-models
+git push origin main
+```
+
+## Troubleshooting
+
+### Conflict in CUDA Architecture Settings
+- Always ensure "37" is included in CMAKE_CUDA_ARCHITECTURES
+- CUDA 11 is required for Compute Capability 3.7 support
+- CUDA 12 dropped support for architectures below 50
+
+### Build Failures
+- Check that GCC 10 is being used (required for CUDA 11.4)
+- Verify CUDA 11.x is installed (not CUDA 12)
+- Ensure all CUDA 3.7 specific patches are preserved
+
+### New Model Integration Issues
+- New models should work automatically if they don't have specific CUDA version requirements
+- If a model requires CUDA 12 features, document the limitation in README.md
+
+## Recent Sync History
+
+- **2025-08-08**: Synced with upstream, added gpt-oss model support
+- **Previous syncs**: See git log for merge commits from upstream/main
+
+## Notes
+
+- This process maintains our fork's ability to run on Tesla K80 GPUs while staying current with upstream features
+- Always test thoroughly before merging to main
+- Document any model-specific limitations discovered during testing
\ No newline at end of file
diff --git a/ollama37.Dockerfile b/ollama37.Dockerfile
index 15a2b81a..80972bfb 100644
--- a/ollama37.Dockerfile
+++ b/ollama37.Dockerfile
@@ -5,7 +5,7 @@ FROM dogkeeper886/ollama37-builder AS builder
 COPY . /usr/local/src/ollama37
 WORKDIR /usr/local/src/ollama37
 RUN CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake -B build \
-    && CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake --build build \
+    && CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake --build build -j$(nproc) \
     && go build -o ollama .
 
 # ===== Stage 2: Runtime image =====