mirror of
https://github.com/dogkeeper886/ollama37.git
synced 2025-12-09 23:37:06 +00:00
This commit represents a complete rework after pulling the latest changes from official ollama/ollama repository and re-applying Tesla K80 compatibility patches. ## Key Changes ### CUDA Compute Capability 3.7 Support (Tesla K80) - Added sm_37 (compute 3.7) to CMAKE_CUDA_ARCHITECTURES in CMakeLists.txt - Updated CMakePresets.json to include compute 3.7 in "CUDA 11" preset - Using 37-virtual (PTX with JIT compilation) for maximum compatibility ### Legacy Toolchain Compatibility - **NVIDIA Driver**: 470.256.02 (last version supporting Kepler/K80) - **CUDA Version**: 11.4.4 (last CUDA 11.x supporting compute 3.7) - **GCC Version**: 10.5.0 (required by CUDA 11.4 host_config.h) ### CPU Architecture Trade-offs Due to GCC 10.5 limitation, sacrificed newer CPU optimizations: - Alderlake CPU variant enabled WITHOUT AVX_VNNI (requires GCC 11+) - Still supports: SSE4.2, AVX, F16C, AVX2, BMI2, FMA - Performance impact: ~3-7% on newer CPUs (acceptable for K80 compatibility) ### Build System Updates - Modified ml/backend/ggml/ggml/src/ggml-cuda/CMakeLists.txt for compute 3.7 - Added -Wno-deprecated-gpu-targets flag to suppress warnings - Updated ml/backend/ggml/ggml/src/CMakeLists.txt for Alderlake without AVX_VNNI ### Upstream Sync Merged latest llama.cpp changes including: - Enhanced KV cache management with ISWA and hybrid memory support - Improved multi-modal support (mtmd framework) - New model architectures (Gemma3, Llama4, Qwen3, etc.) - GPU backend improvements for CUDA, Metal, and ROCm - Updated quantization support and GGUF format handling ### Documentation - Updated CLAUDE.md with comprehensive build instructions - Documented toolchain constraints and CPU architecture trade-offs - Removed outdated CI/CD workflows (tesla-k80-*.yml) - Cleaned up temporary development artifacts ## Rationale This fork maintains Tesla K80 GPU support (compute 3.7) which was dropped in official Ollama due to legacy driver/CUDA requirements. The toolchain constraint creates a deadlock: - K80 → Driver 470 → CUDA 11.4 → GCC 10 → No AVX_VNNI We accept the loss of cutting-edge CPU optimizations to enable running modern LLMs on legacy but still capable Tesla K80 hardware (12GB VRAM per GPU). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
57 lines
2.3 KiB
Bash
Executable File
57 lines
2.3 KiB
Bash
Executable File
#!/bin/sh
|
|
#
|
|
# Mac ARM users, rosetta can be flaky, so to use a remote x86 builder
|
|
#
|
|
# docker context create amd64 --docker host=ssh://mybuildhost
|
|
# docker buildx create --name mybuilder amd64 --platform linux/amd64
|
|
# docker buildx create --name mybuilder --append desktop-linux --platform linux/arm64
|
|
# docker buildx use mybuilder
|
|
|
|
|
|
set -eu
|
|
|
|
. $(dirname $0)/env.sh
|
|
|
|
mkdir -p dist
|
|
NOVULKAN=${NOVULKAN:-""}
|
|
|
|
docker buildx build \
|
|
--output type=local,dest=./dist/ \
|
|
--platform=${PLATFORM} \
|
|
${OLLAMA_COMMON_BUILD_ARGS} \
|
|
--target archive${NOVULKAN} \
|
|
-f Dockerfile \
|
|
.
|
|
|
|
if echo $PLATFORM | grep "amd64" > /dev/null; then
|
|
outDir="./dist"
|
|
if echo $PLATFORM | grep "," > /dev/null ; then
|
|
outDir="./dist/linux_amd64"
|
|
fi
|
|
docker buildx build \
|
|
--output type=local,dest=${outDir} \
|
|
--platform=linux/amd64 \
|
|
${OLLAMA_COMMON_BUILD_ARGS} \
|
|
--build-arg FLAVOR=rocm \
|
|
--target archive \
|
|
-f Dockerfile \
|
|
.
|
|
fi
|
|
|
|
# buildx behavior changes for single vs. multiplatform
|
|
echo "Compressing linux tar bundles..."
|
|
if echo $PLATFORM | grep "," > /dev/null ; then
|
|
tar c -C ./dist/linux_arm64 --exclude cuda_jetpack5 --exclude cuda_jetpack6 . | pigz -9vc >./dist/ollama-linux-arm64.tgz
|
|
tar c -C ./dist/linux_arm64 ./lib/ollama/cuda_jetpack5 | pigz -9vc >./dist/ollama-linux-arm64-jetpack5.tgz
|
|
tar c -C ./dist/linux_arm64 ./lib/ollama/cuda_jetpack6 | pigz -9vc >./dist/ollama-linux-arm64-jetpack6.tgz
|
|
tar c -C ./dist/linux_amd64 --exclude rocm . | pigz -9vc >./dist/ollama-linux-amd64.tgz
|
|
tar c -C ./dist/linux_amd64 ./lib/ollama/rocm | pigz -9vc >./dist/ollama-linux-amd64-rocm.tgz
|
|
elif echo $PLATFORM | grep "arm64" > /dev/null ; then
|
|
tar c -C ./dist/ --exclude cuda_jetpack5 --exclude cuda_jetpack6 bin lib | pigz -9vc >./dist/ollama-linux-arm64.tgz
|
|
tar c -C ./dist/ ./lib/ollama/cuda_jetpack5 | pigz -9vc >./dist/ollama-linux-arm64-jetpack5.tgz
|
|
tar c -C ./dist/ ./lib/ollama/cuda_jetpack6 | pigz -9vc >./dist/ollama-linux-arm64-jetpack6.tgz
|
|
elif echo $PLATFORM | grep "amd64" > /dev/null ; then
|
|
tar c -C ./dist/ --exclude rocm bin lib | pigz -9vc >./dist/ollama-linux-amd64.tgz
|
|
tar c -C ./dist/ ./lib/ollama/rocm | pigz -9vc >./dist/ollama-linux-amd64-rocm.tgz
|
|
fi
|