Manual Build Guide for Ollama37

This document provides comprehensive instructions for building Ollama37 from source, optimized for the Tesla K80 and other CUDA Compute Capability 3.7 hardware.

Quick Build Options

# Build ollama37 image for Tesla K80/Compute 3.7 support
docker build -f ollama37.Dockerfile -t ollama37 .

This Dockerfile uses a multi-stage build process:

  1. Stage 1 (Builder): Uses dogkeeper886/ollama37-builder base image with pre-installed CUDA 11.4, GCC 10, and CMake 4
  2. Stage 2 (Runtime): Creates a minimal Rocky Linux 8 runtime image with only the compiled binary and required libraries

The build process automatically:

  • Configures CMake with GCC 10 and CUDA 3.7 support
  • Compiles the C++ components with Tesla K80 optimizations
  • Builds the Go binary
  • Creates a runtime image with proper CUDA environment variables
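
If a compile step fails, it can help to stop at the first stage and inspect it. Docker's --target flag builds only the named stage; the stage name builder below is an assumption, so check ollama37.Dockerfile for the actual name:

# Build only the builder stage (stage name "builder" is assumed; verify in ollama37.Dockerfile)
docker build -f ollama37.Dockerfile --target builder -t ollama37-build .

# Open a shell in the builder image to inspect compiled artifacts
docker run --rm -it ollama37-build /bin/bash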

Native Build

For native builds, install the following prerequisites:

  • Rocky Linux 8 (or compatible)
  • git - For cloning the repository
  • cmake - For managing the C++ build process
  • go 1.24.2+ - The Go compiler and toolchain
  • gcc 10 - GNU C and C++ compilers (gcc and g++)
  • CUDA Toolkit 11.4

Quick Build Steps:

# Clone repository
git clone https://github.com/dogkeeper886/ollama37
cd ollama37

# Configure and build
CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake -B build
CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake --build build

# Build Go binary
go build -o ollama .

Detailed Installation Guides

CUDA 11.4 Installation on Rocky Linux 8

Prerequisites:

  • A Rocky Linux 8 system or container
  • Root privileges
  • Internet connectivity

Steps:

  1. Update the system:

    dnf -y update
    
  2. Install EPEL Repository:

    dnf -y install epel-release
    
  3. Add NVIDIA CUDA Repository:

    dnf -y config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo
    
  4. Install NVIDIA Driver (Version 470):

    dnf -y module install nvidia-driver:470-dkms
    
  5. Install CUDA Toolkit 11.4:

    dnf -y install cuda-11-4
    
  6. Set up CUDA Environment Variables:

    # Create /etc/profile.d/cuda-11.4.sh
    echo "export PATH=/usr/local/cuda-11.4/bin:${PATH}" > /etc/profile.d/cuda-11.4.sh
    echo "export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64:${LD_LIBRARY_PATH}" >> /etc/profile.d/cuda-11.4.sh
    
    # Apply changes
    source /etc/profile.d/cuda-11.4.sh
    

Verification:

# Check CUDA compiler
nvcc --version

# Check driver
nvidia-smi

GCC 10 Installation Guide

Steps:

  1. Update and Install Prerequisites:

    dnf -y install wget unzip lbzip2
    dnf -y groupinstall "Development Tools"
    
  2. Download GCC 10 Source Code:

    cd /usr/local/src
    wget https://github.com/gcc-mirror/gcc/archive/refs/heads/releases/gcc-10.zip
    
  3. Extract Source Code:

    unzip gcc-10.zip
    cd gcc-releases-gcc-10
    
  4. Download Prerequisites:

    contrib/download_prerequisites
    
  5. Create a Build Directory (GCC is compiled here and installed to the default /usr/local prefix):

    mkdir /usr/local/gcc-10
    
  6. Configure GCC Build:

    cd /usr/local/gcc-10
    /usr/local/src/gcc-releases-gcc-10/configure --disable-multilib
    
  7. Compile GCC:

    make -j $(nproc)
    
  8. Install GCC:

    make install
    
  9. Post-Install Configuration:

    # Create environment script
    echo "export LD_LIBRARY_PATH=/usr/local/lib64:\$LD_LIBRARY_PATH" > /etc/profile.d/gcc-10.sh
    
    # Configure dynamic linker
    echo "/usr/local/lib64" > /etc/ld.so.conf.d/gcc-10.conf
    ldconfig
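
Verification (both compilers were installed to the default /usr/local prefix in the steps above):

# Check the new compilers
/usr/local/bin/gcc --version
/usr/local/bin/g++ --version
# Both should report GCC 10.x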
    

CMake 4.0 Installation Guide

  1. Install OpenSSL Development Libraries:

    dnf -y install openssl-devel
    
  2. Download CMake Source Code:

    cd /usr/local/src
    wget https://github.com/Kitware/CMake/releases/download/v4.0.0/cmake-4.0.0.tar.gz
    
  3. Extract the Archive:

    tar xvf cmake-4.0.0.tar.gz
    
  4. Create a Build Directory (CMake is compiled here and installed to the default /usr/local prefix):

    mkdir /usr/local/cmake-4
    
  5. Configure CMake:

    cd /usr/local/cmake-4
    /usr/local/src/cmake-4.0.0/configure
    
  6. Compile CMake:

    make -j $(nproc)
    
  7. Install CMake:

    make install
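
Verification (configure's default prefix is /usr/local, so the new binary should be found first on PATH):

# Check the installed version
cmake --version
# Should report cmake version 4.0.0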
    

Go 1.24.2 Installation Guide

  1. Download Go Distribution:

    cd /usr/local
    wget https://go.dev/dl/go1.24.2.linux-amd64.tar.gz
    
  2. Extract the Archive:

    tar xvf go1.24.2.linux-amd64.tar.gz
    
  3. Post Install Configuration:

    echo 'export PATH=$PATH:/usr/local/go/bin' > /etc/profile.d/go.sh
    source /etc/profile.d/go.sh
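
Verification:

# Confirm the Go toolchain is on PATH
go version
# Should report go1.24.2 linux/amd64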
    

Complete Ollama37 Compilation Guide

Prerequisites: All components installed as per the guides above:

  • Rocky Linux 8
  • Git
  • CMake 4.0
  • Go 1.24.2
  • GCC 10
  • CUDA Toolkit 11.4

Compilation Steps:

  1. Navigate to Build Directory:

    cd /usr/local/src
    
  2. Clone the Repository:

    git clone https://github.com/dogkeeper886/ollama37
    cd ollama37
    
  3. CMake Configuration: Set compiler variables and configure the build system:

    CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake -B build
    
  4. CMake Build: Compile the C++ components:

    CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake --build build
    
  5. Go Build: Compile the Go components:

    go build -o ollama .
    
  6. Verification:

    ./ollama --version
    

Tesla K80 Specific Optimizations

The Ollama37 build includes several Tesla K80-specific optimizations:

CUDA Architecture Support

  • CMake Configuration: CMAKE_CUDA_ARCHITECTURES "37;50;61;70;75;80" (this can be overridden at configure time, as sketched below)
  • Build Files: Located in ml/backend/ggml/ggml/src/ggml-cuda/CMakeLists.txt
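
To shorten compile times on a K80-only machine, the architecture list can be restricted when configuring. This is a sketch that assumes the project honors the standard CMAKE_CUDA_ARCHITECTURES cache variable rather than hard-coding the list in the CMakeLists.txt above:

# Build CUDA kernels for Compute 3.7 only (assumes the cache variable is not hard-coded)
CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ \
  cmake -B build -DCMAKE_CUDA_ARCHITECTURES="37"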

CUDA 11 Compatibility

  • Uses CUDA 11 toolchain (CUDA 12 dropped Compute Capability 3.7 support)
  • Environment variables configured for CUDA 11.4 paths
  • Driver version 470 for maximum compatibility

Performance Tuning

  • Optimized memory management for the Tesla K80's 12GB of VRAM per GPU (24GB across the board's two GPUs)
  • Kernel optimizations for Kepler architecture
  • Reduced precision operations where appropriate

Troubleshooting

Common Issues

CUDA Version Conflicts:

# Check CUDA version
nvcc --version
# Should show CUDA 11.4

# If wrong version, check PATH
echo $PATH
# Should include /usr/local/cuda-11.4/bin
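
If the CUDA paths are missing, re-source the profile script created during installation, or export the paths for the current shell:

# Restore the CUDA 11.4 environment
source /etc/profile.d/cuda-11.4.sh

# Or set the paths manually
export PATH=/usr/local/cuda-11.4/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64:$LD_LIBRARY_PATH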

GCC Version Issues:

# Check GCC version
/usr/local/bin/gcc --version
# Should show GCC 10.x

# If build fails, ensure CC and CXX are set
export CC=/usr/local/bin/gcc
export CXX=/usr/local/bin/g++

Memory Issues:

  • Tesla K80 has 12GB of VRAM per GPU (24GB total across its two GPUs) - adjust model sizes accordingly
  • Monitor GPU memory usage with nvidia-smi
  • Use quantized models (Q4, Q8) for better memory efficiency, as shown below
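
For example, pull an explicitly quantized variant (the tag below is illustrative; check the Ollama model library for tags that actually exist):

# Pull a Q4-quantized model (tag is illustrative)
./ollama pull llama3.2:3b-instruct-q4_0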

Build Verification

After successful compilation, verify Tesla K80 support:

# Check if ollama detects your GPU
./ollama serve &
./ollama run llama3.2 "Hello Tesla K80!"

# Monitor GPU utilization
watch -n 1 nvidia-smi

Performance Optimization Tips

  1. Model Selection: Use quantized models (Q4_0, Q8_0) for better performance on Tesla K80
  2. Memory Management: Monitor VRAM usage and adjust context sizes accordingly (see the request example below)
  3. Temperature Control: Ensure adequate cooling for sustained workloads
  4. Power Management: Tesla K80 requires proper power delivery (300W for the dual-GPU board)
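
One way to reduce the context window is the num_ctx option on Ollama's generate API. A minimal sketch, assuming the server is running on the default port (the model name and context size are illustrative):

# Request a smaller context window to fit comfortably in K80 VRAM
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Hello Tesla K80!",
  "options": { "num_ctx": 2048 }
}'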

Docker Alternative

If manual compilation proves difficult, the pre-built Docker image provides the same Tesla K80 optimizations:

docker pull dogkeeper886/ollama37
docker run --runtime=nvidia --gpus all -p 11434:11434 dogkeeper886/ollama37

This image includes all the optimizations and dependencies pre-configured for Tesla K80 hardware.