# Syncing with Upstream Ollama
This document describes the process for syncing the ollama37 fork with the official ollama/ollama repository while preserving CUDA Compute Capability 3.7 support for Tesla K80 GPUs.
## Prerequisites

- Git configured with the upstream remote: https://github.com/ollama/ollama.git
- Understanding of which files contain CUDA 3.7 specific changes
## Key Files to Preserve
When merging from upstream, always preserve CUDA 3.7 support in these files:
- `ml/backend/ggml/ggml/src/ggml-cuda/CMakeLists.txt` - Contains `CMAKE_CUDA_ARCHITECTURES "37;50;61;70;75;80"` (see the sketch below)
- `CMakePresets.json` - Keep both "CUDA 11" (with arch 37) and "CUDA 12" presets
- `README.md` - Maintain ollama37-specific documentation
- `docs/*.md` - Keep our custom documentation
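For reference, the line to preserve in that `CMakeLists.txt` would look something like the sketch below; the exact surrounding context in the file may differ.

```cmake
# Keep Compute Capability 3.7 (Tesla K80) alongside the newer architectures
set(CMAKE_CUDA_ARCHITECTURES "37;50;61;70;75;80")
```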
## Sync Process

### 1. Create a New Branch

```bash
git checkout main
git checkout -b sync-upstream-models
```
### 2. Add the Upstream Remote (if not already added)

```bash
git remote add upstream https://github.com/ollama/ollama.git
```
### 3. Fetch Latest Changes

```bash
git fetch upstream main
```
### 4. Merge Upstream Changes

```bash
git merge upstream/main -m "Merge upstream ollama/ollama main branch while preserving CUDA 3.7 support"
```
### 5. Resolve Conflicts
Common conflict resolutions:
#### CMakePresets.json

- Resolution: Keep both CUDA 11 (with architecture 37) and CUDA 12 configurations
- Example: Preserve the "CUDA 11" preset with `"CMAKE_CUDA_ARCHITECTURES": "37;50;52;53;60;61;70;75;80;86"` (see the sketch below)
#### README.md and Documentation

- Resolution: Keep our version with `git checkout --ours README.md` (see the sketch below)
- Reason: Maintains ollama37-specific instructions and branding
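The same resolution applies to our other documentation; a sketch, assuming `docs/` is where the conflicted docs live:

```bash
# Keep the fork's version of conflicted documentation, then stage it
git checkout --ours README.md docs/
git add README.md docs/
```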
#### Model Support Files

- Resolution: Accept upstream changes for new model support
- Files: `model/models/models.go`, new model directories
- Example: Accept new imports like `_ "github.com/ollama/ollama/model/models/gptoss"` (see the sketch below)
#### Backend/Tools Updates

- Resolution: Generally accept upstream improvements
- Files: `tools/tools.go`, `ml/backend/ggml/ggml.go`
- Caution: Verify no CUDA 3.7 specific code is removed (see the diff check below)
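While the merge is still in progress, `git diff --ours` shows what the merge result would change relative to our side, which makes removed CUDA 3.7 code easier to spot:

```bash
# Review upstream's changes to these files relative to our branch
git diff --ours -- tools/tools.go ml/backend/ggml/ggml.go
```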
#### Test Files

- Resolution: Accept upstream test additions
- Files: `integration/utils_test.go`
- Action: Include new model tests in the test lists
### 6. Commit the Merge

```bash
git add -A
git commit -m "Merge upstream ollama/ollama main branch while preserving CUDA 3.7 support

- Added support for new [model name] from upstream
- Preserved CUDA Compute Capability 3.7 (Tesla K80) support
- Kept CUDA 11 configuration alongside CUDA 12
- Maintained all documentation specific to ollama37 fork
- [List other significant changes]"
```
### 7. Test the Build
Build with Docker to verify CUDA 3.7 support:
```bash
docker build -f ollama37.Dockerfile -t ollama37:test .
```
Or build manually:
```bash
CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake -B build
CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake --build build
go build -o ollama .
```
### 8. Test with Tesla K80

```bash
# Run the server
./ollama serve

# In another terminal, test a model
ollama run llama3.2:1b
```
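To confirm the model is actually running on the K80 rather than falling back to CPU, `nvidia-smi` in a third terminal is a quick check:

```bash
# GPU memory usage and an ollama process should appear while the model is loaded
nvidia-smi
```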
### 9. Merge to Main
After successful testing:
```bash
git checkout main
git merge sync-upstream-models
git push origin main
```
## Troubleshooting

### Conflict in CUDA Architecture Settings

- Always ensure `37` is included in `CMAKE_CUDA_ARCHITECTURES` (see the check below)
- CUDA 11 is required for Compute Capability 3.7 support
- CUDA 12 dropped support for architectures below 50
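A quick post-merge sanity check, assuming the file paths listed earlier are unchanged:

```bash
# Both files should still list architecture 37
grep -n "CMAKE_CUDA_ARCHITECTURES" \
  CMakePresets.json \
  ml/backend/ggml/ggml/src/ggml-cuda/CMakeLists.txt
```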
### Build Failures

- Check that GCC 10 is being used (required for CUDA 11.4); the version checks below can confirm this
- Verify CUDA 11.x is installed (not CUDA 12)
- Ensure all CUDA 3.7 specific patches are preserved
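The toolchain versions can be confirmed directly; this sketch assumes the compiler path used in the manual build step above:

```bash
/usr/local/bin/gcc --version   # expect GCC 10.x (required for CUDA 11.4)
nvcc --version                 # expect CUDA 11.x, not 12.x
```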
### New Model Integration Issues
- New models should work automatically if they don't have specific CUDA version requirements
- If a model requires CUDA 12 features, document the limitation in README.md
## Recent Sync History
- 2025-08-08: Synced with upstream, added gpt-oss model support
- Previous syncs: See git log for merge commits from upstream/main
## Notes
- This process maintains our fork's ability to run on Tesla K80 GPUs while staying current with upstream features
- Always test thoroughly before merging to main
- Document any model-specific limitations discovered during testing