mirror of https://github.com/dogkeeper886/ollama37.git synced 2025-12-10 15:57:04 +00:00

Files

Shang Chieh Tseng 83973336d6 Optimize Docker build performance with parallel compilation

- Add -j$(nproc) flag to cmake build in ollama37.Dockerfile
- Use all available CPU cores for faster compilation
- Add sync-upstream.md documentation for future maintenance

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-08-08 11:44:59 +08:00

4.2 KiB

Raw Blame History

Syncing with Upstream Ollama

This document describes the process for syncing the ollama37 fork with the official ollama/ollama repository while preserving CUDA Compute Capability 3.7 support for Tesla K80 GPUs.

Prerequisites

Git configured with the upstream remote: https://github.com/ollama/ollama.git
Understanding of which files contain CUDA 3.7 specific changes

Key Files to Preserve

When merging from upstream, always preserve CUDA 3.7 support in these files:

ml/backend/ggml/ggml/src/ggml-cuda/CMakeLists.txt - Contains CMAKE_CUDA_ARCHITECTURES "37;50;61;70;75;80"
CMakePresets.json - Keep both "CUDA 11" (with arch 37) and "CUDA 12" presets
README.md - Maintain ollama37 specific documentation
docs/*.md - Keep our custom documentation

Sync Process

1. Create a New Branch

git checkout main
git checkout -b sync-upstream-models

2. Add Upstream Remote (if not exists)

git remote add upstream https://github.com/ollama/ollama.git

3. Fetch Latest Changes

git fetch upstream main

4. Merge Upstream Changes

git merge upstream/main -m "Merge upstream ollama/ollama main branch while preserving CUDA 3.7 support"

5. Resolve Conflicts

Common conflict resolutions:

CMakePresets.json

Resolution: Keep both CUDA 11 (with architecture 37) and CUDA 12 configurations
Example: Preserve the "CUDA 11" preset with "CMAKE_CUDA_ARCHITECTURES": "37;50;52;53;60;61;70;75;80;86"

README.md and Documentation

Resolution: Keep our version with git checkout --ours README.md
Reason: Maintains ollama37 specific instructions and branding

Model Support Files

Resolution: Accept upstream changes for new model support
Files: model/models/models.go, new model directories
Example: Accept new imports like _ "github.com/ollama/ollama/model/models/gptoss"

Backend/Tools Updates

Resolution: Generally accept upstream improvements
Files: tools/tools.go, ml/backend/ggml/ggml.go
Caution: Verify no CUDA 3.7 specific code is removed

Test Files

Resolution: Accept upstream test additions
Files: integration/utils_test.go
Action: Include new model tests in the test lists

6. Commit the Merge

git add -A
git commit -m "Merge upstream ollama/ollama main branch while preserving CUDA 3.7 support

- Added support for new [model name] from upstream
- Preserved CUDA Compute Capability 3.7 (Tesla K80) support
- Kept CUDA 11 configuration alongside CUDA 12
- Maintained all documentation specific to ollama37 fork
- [List other significant changes]"

7. Test the Build

Build with Docker to verify CUDA 3.7 support:

docker build -f ollama37.Dockerfile -t ollama37:test .

Or build manually:

CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake -B build
CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake --build build
go build -o ollama .

8. Test with Tesla K80

# Run the server
./ollama serve

# In another terminal, test a model
ollama run llama3.2:1b

9. Merge to Main

After successful testing:

git checkout main
git merge sync-upstream-models
git push origin main

Troubleshooting

Conflict in CUDA Architecture Settings

Always ensure "37" is included in CMAKE_CUDA_ARCHITECTURES
CUDA 11 is required for Compute Capability 3.7 support
CUDA 12 dropped support for architectures below 50

Build Failures

Check that GCC 10 is being used (required for CUDA 11.4)
Verify CUDA 11.x is installed (not CUDA 12)
Ensure all CUDA 3.7 specific patches are preserved

New Model Integration Issues

New models should work automatically if they don't have specific CUDA version requirements
If a model requires CUDA 12 features, document the limitation in README.md

Recent Sync History

2025-08-08: Synced with upstream, added gpt-oss model support
Previous syncs: See git log for merge commits from upstream/main

Notes

This process maintains our fork's ability to run on Tesla K80 GPUs while staying current with upstream features
Always test thoroughly before merging to main
Document any model-specific limitations discovered during testing

4.2 KiB Raw Blame History