mirror of
https://github.com/dogkeeper886/ollama37.git
synced 2025-12-18 19:56:59 +00:00
Optimize Docker build performance with parallel compilation
- Add `-j$(nproc)` flag to the cmake build in ollama37.Dockerfile
- Use all available CPU cores for faster compilation
- Add sync-upstream.md documentation for future maintenance

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
147
docs/sync-upstream.md
Normal file
@@ -0,0 +1,147 @@
# Syncing with Upstream Ollama

This document describes the process for syncing the ollama37 fork with the official ollama/ollama repository while preserving CUDA Compute Capability 3.7 support for Tesla K80 GPUs.
## Prerequisites

- Git configured with the upstream remote: `https://github.com/ollama/ollama.git`
- An understanding of which files contain CUDA 3.7 specific changes
## Key Files to Preserve

When merging from upstream, always preserve CUDA 3.7 support in these files:

1. **`ml/backend/ggml/ggml/src/ggml-cuda/CMakeLists.txt`** - Contains `CMAKE_CUDA_ARCHITECTURES "37;50;61;70;75;80"`
2. **`CMakePresets.json`** - Keep both "CUDA 11" (with arch 37) and "CUDA 12" presets
3. **`README.md`** - Maintain ollama37-specific documentation
4. **`docs/*.md`** - Keep our custom documentation
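A quick way to confirm the list above survived a merge is a grep sketch like the following; the `check_arch` helper is illustrative, and it assumes it is run from the repository root:

```shell
# Post-merge sanity check: confirm architecture 37 is still listed
# in each key file. The pattern '"37;' matches both the CMake form
# ("37;50;...") and the JSON form in CMakePresets.json.
check_arch() {
  if grep -q '"37;' "$1" 2>/dev/null; then
    echo "OK: $1"
  else
    echo "MISSING 37: $1"
  fi
}

check_arch ml/backend/ggml/ggml/src/ggml-cuda/CMakeLists.txt
check_arch CMakePresets.json
```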
## Sync Process

### 1. Create a New Branch

```bash
git checkout main
git checkout -b sync-upstream-models
```
### 2. Add the Upstream Remote (if it doesn't exist yet)

```bash
git remote add upstream https://github.com/ollama/ollama.git
```
### 3. Fetch the Latest Changes

```bash
git fetch upstream main
```
### 4. Merge Upstream Changes

```bash
git merge upstream/main -m "Merge upstream ollama/ollama main branch while preserving CUDA 3.7 support"
```
### 5. Resolve Conflicts

Common conflict resolutions:

#### CMakePresets.json

- **Resolution**: Keep both the CUDA 11 (with architecture 37) and CUDA 12 configurations
- **Example**: Preserve the "CUDA 11" preset with `"CMAKE_CUDA_ARCHITECTURES": "37;50;52;53;60;61;70;75;80;86"`

#### README.md and Documentation

- **Resolution**: Keep our version with `git checkout --ours README.md`
- **Reason**: Maintains ollama37-specific instructions and branding

#### Model Support Files

- **Resolution**: Accept upstream changes for new model support
- **Files**: `model/models/models.go`, new model directories
- **Example**: Accept new imports such as `_ "github.com/ollama/ollama/model/models/gptoss"`

#### Backend/Tools Updates

- **Resolution**: Generally accept upstream improvements
- **Files**: `tools/tools.go`, `ml/backend/ggml/ggml.go`
- **Caution**: Verify that no CUDA 3.7-specific code is removed

#### Test Files

- **Resolution**: Accept upstream test additions
- **Files**: `integration/utils_test.go`
- **Action**: Include new model tests in the test lists
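The resolution policy above can be sketched as a small helper for use during a conflicted merge; the `resolve_merge` name and the file lists in the example are illustrative, not exhaustive, and should be adjusted per merge:

```shell
# Resolve merge conflicts according to a keep-ours / take-theirs split.
resolve_merge() {
  # $1: space-separated paths to keep from our fork (--ours)
  # $2: space-separated paths to take from upstream (--theirs)
  for f in $1; do git checkout --ours -- "$f"; done
  for f in $2; do git checkout --theirs -- "$f"; done
  git add -- $1 $2
}

# Example usage during a conflicted merge:
# resolve_merge "README.md docs" "model/models/models.go"
```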
### 6. Commit the Merge

```bash
git add -A
git commit -m "Merge upstream ollama/ollama main branch while preserving CUDA 3.7 support

- Added support for new [model name] from upstream
- Preserved CUDA Compute Capability 3.7 (Tesla K80) support
- Kept the CUDA 11 configuration alongside CUDA 12
- Maintained all documentation specific to the ollama37 fork
- [List other significant changes]"
```
### 7. Test the Build

Build with Docker to verify CUDA 3.7 support:

```bash
docker build -f ollama37.Dockerfile -t ollama37:test .
```

Or build manually:

```bash
CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake -B build
CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake --build build
go build -o ollama .
```
### 8. Test with a Tesla K80

```bash
# Run the server
./ollama serve

# In another terminal, test a model
ollama run llama3.2:1b
```
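Before running the model test, it can help to confirm the GPU actually reports Compute Capability 3.7. This sketch assumes a driver whose `nvidia-smi` supports the `compute_cap` query field (available in recent drivers, including the 470 series used with the K80); the optional argument exists only so the helper can be exercised without real hardware:

```shell
# Query the GPU's compute capability; a Tesla K80 should report 3.7.
compute_cap() {
  # $1 (optional): path to an nvidia-smi binary; defaults to the real one.
  "${1:-nvidia-smi}" --query-gpu=compute_cap --format=csv,noheader
}

command -v nvidia-smi >/dev/null && compute_cap || echo "nvidia-smi not found"
```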
### 9. Merge to Main

After successful testing:

```bash
git checkout main
git merge sync-upstream-models
git push origin main
```
## Troubleshooting

### Conflicts in CUDA Architecture Settings

- Always ensure "37" is included in `CMAKE_CUDA_ARCHITECTURES`
- CUDA 11 is required for Compute Capability 3.7 support
- CUDA 12 dropped support for architectures below 50

### Build Failures

- Check that GCC 10 is being used (required for CUDA 11.4)
- Verify that CUDA 11.x is installed (not CUDA 12)
- Ensure all CUDA 3.7-specific patches are preserved
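The GCC check above can be automated with a short sketch; the `compiler_major` helper and the `/usr/local/bin/gcc` path are the ones this fork's build commands already use, but the helper itself is illustrative:

```shell
# Print a compiler's major version number. cut handles both GCC's
# major-only -dumpversion output ("10") and the full form ("10.3.0").
compiler_major() {
  "$1" -dumpversion | cut -d. -f1
}

if [ "$(compiler_major /usr/local/bin/gcc 2>/dev/null)" = "10" ]; then
  echo "GCC 10 found"
else
  echo "warning: expected GCC 10 at /usr/local/bin/gcc"
fi
```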
### New Model Integration Issues

- New models should work automatically if they don't have specific CUDA version requirements
- If a model requires CUDA 12 features, document the limitation in README.md
## Recent Sync History

- **2025-08-08**: Synced with upstream; added gpt-oss model support
- **Previous syncs**: See the git log for merge commits from upstream/main
## Notes

- This process maintains our fork's ability to run on Tesla K80 GPUs while staying current with upstream features
- Always test thoroughly before merging to main
- Document any model-specific limitations discovered during testing
@@ -5,7 +5,7 @@ FROM dogkeeper886/ollama37-builder AS builder
 COPY . /usr/local/src/ollama37
 WORKDIR /usr/local/src/ollama37
 RUN CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake -B build \
-    && CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake --build build \
+    && CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake --build build -j$(nproc) \
     && go build -o ollama .
 
 # ===== Stage 2: Runtime image =====