ollama37/docs/manual-build.md
2025-10-28 15:53:49 +08:00

Manual Build Guide for Ollama37

This document provides comprehensive instructions for building Ollama37 from source on various platforms, specifically optimized for Tesla K80 and CUDA Compute Capability 3.7 hardware.

⚠️ Important: Kernel Compatibility Notice

Recent kernel updates in Fedora, Ubuntu, and Rocky Linux have broken compatibility with:

  • NVIDIA Driver 470 (required for Tesla K80 / Compute Capability 3.7)
  • CUDA 11.4 nvcc compiler

Solution: Compile a compatible kernel from source (Linux 5.14.x) before installing NVIDIA drivers.

Recommended Linux Distribution: Rocky Linux 9

  • Rocky Linux 8 has docker-ce compatibility issues
  • Rocky Linux 9 provides better stability and container support

Native Build Overview

For native builds on Rocky Linux 9, you'll need to follow these steps in order:

Installation Steps:

  1. Install GCC 10 - Required for kernel compilation and ollama37 source builds
  2. Compile Custom Kernel (Linux 5.14.x) - Required for NVIDIA 470 compatibility
  3. Install NVIDIA Driver 470 & CUDA 11.4 - Tesla K80 GPU support
  4. Install CMake 4.0 - Build system
  5. Install Go 1.24.2 - Go compiler
  6. Compile Ollama37 (Optional - if not using pre-built binaries)

Quick Native Build (after prerequisites):

# Clone repository
git clone https://github.com/dogkeeper886/ollama37
cd ollama37

# If compiling from source (requires GCC 10):
cmake -B build
cmake --build build -j$(nproc)
go build -o ollama .

# If using pre-built binary (GCC 10 not required):
# Just download and run the ollama binary

Detailed Installation Guide for Rocky Linux 9

Step 1: GCC 10 Installation

Why Install GCC 10 First?

GCC 10 is required for:

  • Compiling the custom Linux kernel (Step 2)
  • Building ollama37 from source (Step 6)
  • CUDA 11.4 compatibility (CUDA 11.4 nvcc is not compatible with GCC 11.5+)

Rocky Linux 9 ships with GCC 11.5 by default, which is:

  • Incompatible with CUDA 11.4 nvcc compiler
  • Not recommended for kernel compilation with NVIDIA drivers
  • Sufficient for running pre-built binaries (if you skip Steps 2 and 6)
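Before starting a source build, it can help to confirm which GCC is actually on PATH. The following is a minimal sketch, not part of the Ollama37 tooling: the function name is_gcc10 is hypothetical, and it only inspects the major version reported by gcc -dumpversion.

```shell
# Succeeds only when the given version string has major version 10.
# is_gcc10 is a hypothetical helper name used for illustration.
is_gcc10() {
    case "$1" in
        10|10.*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Check the compiler on PATH, if one is installed.
if command -v gcc >/dev/null 2>&1; then
    gcc_version=$(gcc -dumpversion)
    if is_gcc10 "$gcc_version"; then
        echo "gcc $gcc_version: suitable for kernel and ollama37 builds"
    else
        echo "gcc $gcc_version: install GCC 10 before compiling (Step 1)"
    fi
fi
```

Running this again after the symlink changes later in this step should report a 10.x version.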

Installation Steps

Complete installation script:

# Install prerequisites
dnf -y groupinstall "Development Tools"

# Download and extract GCC 10 source
cd /usr/local/src
wget https://github.com/gcc-mirror/gcc/archive/refs/heads/releases/gcc-10.zip
unzip gcc-10.zip
cd gcc-releases-gcc-10

# Download GCC prerequisites (GMP, MPFR, MPC, ISL)
contrib/download_prerequisites

# Create build directory and configure
mkdir /usr/local/gcc-10
cd /usr/local/gcc-10
/usr/local/src/gcc-releases-gcc-10/configure --disable-multilib

# Compile and install (1-2 hours depending on CPU)
make -j $(nproc)
make install

Note: The compilation step make -j $(nproc) takes 1-2 hours depending on CPU performance. $(nproc) expands to the number of available CPU cores, so make runs one job per core.

Post-Install Configuration:

# Configure dynamic linker to include both system and GCC 10 library paths
cat > /etc/ld.so.conf.d/gcc-10.conf << 'EOF'
/usr/lib64
/usr/local/lib64
EOF

ldconfig

# Update system compiler symlinks to use GCC 10
rm -f /usr/bin/cc /usr/bin/gcc /usr/bin/g++ /usr/bin/c++
ln -s /usr/local/bin/gcc /usr/bin/cc
ln -s /usr/local/bin/gcc /usr/bin/gcc
ln -s /usr/local/bin/g++ /usr/bin/g++
ln -s /usr/local/bin/g++ /usr/bin/c++

Verify Installation:

# Verify GCC 10 installation
gcc --version
# Should output: gcc (GCC) 10.x.x

g++ --version
# Should output: g++ (GCC) 10.x.x

# Verify symlinks are correct
which cc
# Should output: /usr/bin/cc

ls -al /usr/bin/cc
# Should show: /usr/bin/cc -> /usr/local/bin/gcc

Step 2: Kernel Compilation (Required for NVIDIA 470 Compatibility)

Why Compile a Custom Kernel?

Recent kernel updates in Rocky Linux 9, Fedora, and Ubuntu have broken compatibility with:

  • NVIDIA Driver 470 (required for Tesla K80 / Compute Capability 3.7)
  • CUDA 11.4 nvcc compiler

Solution: Use Linux kernel 5.14.x, which maintains stable NVIDIA 470 driver support.

Prerequisites

System Requirements:

  • Rocky Linux 9 (clean installation recommended)
  • Root privileges
  • At least 20GB free disk space
  • Stable internet connection
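The requirements above can be checked with a short preflight script. This is a sketch under stated assumptions: has_free_gb is a hypothetical helper, and /usr/src is used as the directory to measure because the kernel source is unpacked under it in this guide.

```shell
# Succeeds when the filesystem holding DIR has at least MIN_GB GiB available.
# Usage: has_free_gb DIR MIN_GB   (hypothetical helper, not part of any tool)
has_free_gb() {
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}

# Fall back to / when /usr/src does not exist on this machine.
build_dir=/usr/src
[ -d "$build_dir" ] || build_dir=/

if has_free_gb "$build_dir" 20; then
    echo "disk space: at least 20GB free under $build_dir"
else
    echo "disk space: less than 20GB free under $build_dir"
fi

# The kernel install steps need root privileges.
if [ "$(id -u)" -ne 0 ]; then
    echo "warning: not running as root"
fi
```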

Install Build Tools:

dnf -y groupinstall "Development Tools"
dnf -y install ncurses-devel

Download Kernel Source

  1. Navigate to source directory:

    cd /usr/src/kernels
    
  2. Download Linux 5.14.x kernel:

    wget https://www.kernel.org/pub/linux/kernel/v5.x/linux-5.14.tar.xz
    

    Note: Check kernel.org for the latest 5.14.x stable release, and adjust the file and directory names in these commands to match.

  3. Extract the archive:

    tar xvf linux-5.14.tar.xz
    cd linux-5.14
    

Configure Kernel

  1. Copy existing kernel configuration:

    # First, check available kernel configurations
    ls /usr/src/kernels
    
    # Copy config from the running kernel (adjust version as needed)
    # Example: cp /usr/src/kernels/5.14.0-570.52.1.el9_6.x86_64/.config .config
    cp /usr/src/kernels/$(uname -r)/.config .config
    
  2. Open menuconfig to adjust settings:

    make menuconfig
    
  3. Required Configuration Changes:

    Navigate and DISABLE the following options:

    a) Disable Module Signature Verification:

    Enable loadable module support
      → [ ] Module signature verification  (press N to disable)
    

    b) Disable Trusted Keys:

    Cryptographic API
      → Certificates for signature checking
        → [ ] Provide system-wide ring of trusted keys  (press N)
        → System trusted keys filename = "" (delete any content, leave empty)
    

    c) Disable BTF Debug Info:

    Kernel hacking
      → Compile-time checks and compiler options
        → [ ] Generate BTF typeinfo  (press N to disable CONFIG_DEBUG_INFO_BTF)
    

    Why disable these?

    • Module signatures: Signature verification blocks loading of the unsigned NVIDIA proprietary driver
    • Trusted keys: Conflicts with out-of-tree driver compilation
    • BTF debug: Can cause build failures and is unnecessary for production use
  4. Save configuration:

    • Press <Save>
    • Confirm default filename .config
    • Press <Exit> to quit menuconfig
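After saving, the resulting .config can be checked from the shell before starting the long compile. The sketch below is illustrative only: check_kernel_config is a hypothetical helper, and the mapping of the menu entries to CONFIG_MODULE_SIG, CONFIG_SYSTEM_TRUSTED_KEYRING, CONFIG_SYSTEM_TRUSTED_KEYS, and CONFIG_DEBUG_INFO_BTF is an assumption based on the upstream kernel option names.

```shell
# Report options that are still enabled in a kernel .config file.
# Returns 0 when everything listed in step 3 is off, 1 otherwise.
# Usage: check_kernel_config /usr/src/kernels/linux-5.14/.config
check_kernel_config() {
    cfg=$1
    status=0
    for opt in MODULE_SIG SYSTEM_TRUSTED_KEYRING DEBUG_INFO_BTF; do
        if grep -Eq "^CONFIG_${opt}=(y|m)\$" "$cfg"; then
            echo "still enabled: CONFIG_${opt}"
            status=1
        fi
    done
    # The trusted-keys filename must be empty (or the option absent).
    if grep -Eq '^CONFIG_SYSTEM_TRUSTED_KEYS=".+"$' "$cfg"; then
        echo "not empty: CONFIG_SYSTEM_TRUSTED_KEYS"
        status=1
    fi
    return $status
}

# Check the config in the current directory, if there is one.
if [ -f .config ]; then
    check_kernel_config .config && echo "kernel config looks ready"
fi
```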

Compile Kernel

  1. Compile kernel (using all CPU cores):

    make -j$(nproc)
    

    Estimated time: 30-60 minutes depending on CPU performance

  2. Install kernel modules:

    make modules_install
    
  3. Install kernel:

    make install
    

Reboot and Verify

  1. Reboot system:

    reboot
    
  2. After reboot, verify kernel version:

    uname -r
    # Should output the version you built, e.g. 5.14.0 for linux-5.14.tar.xz (5.14.21 if you built that release)
    

Troubleshooting Kernel Compilation

Issue: BTF-related build errors

BTF: .tmp_vmlinux.btf: pahole (pahole) is not available
Failed to generate BTF for vmlinux

Solution:

  • Disable CONFIG_DEBUG_INFO_BTF in menuconfig (see step 3c under Configure Kernel)

Issue: Module signing key errors

Can't read private key

Solution:

  • Disable CONFIG_MODULE_SIG_ALL and clear CONFIG_SYSTEM_TRUSTED_KEYS in menuconfig
  • Ensure the "System trusted keys filename" field is completely empty

Step 3: NVIDIA Driver 470 & CUDA 11.4 Installation

Prerequisites:

  • Rocky Linux 9 system running custom kernel 5.14.x (from Step 2)
  • Root privileges
  • Internet connectivity
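Because the stock Rocky Linux 9 kernel also reports a 5.14.0 version, checking only for "5.14" does not prove that the custom kernel is running. The sketch below assumes the distribution kernel can be recognized by an ".elN" tag in its release string (as in 5.14.0-570.52.1.el9_6.x86_64) and that the custom build carries no such tag; is_custom_5_14 is a hypothetical helper name.

```shell
# Succeeds for a 5.14.x release string that has no distribution ".elN" tag.
is_custom_5_14() {
    case "$1" in
        *.el[0-9]*)  return 1 ;;   # distribution kernel, e.g. ...el9_6.x86_64
        5.14|5.14.*) return 0 ;;   # kernel built from the 5.14.x source
        *)           return 1 ;;
    esac
}

running=$(uname -r)
if is_custom_5_14 "$running"; then
    echo "kernel $running: custom 5.14.x kernel is running"
else
    echo "kernel $running: reboot into the custom 5.14.x kernel first (Step 2)"
fi
```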

Steps:

  1. Update the system:

    dnf -y update
    
  2. Install EPEL Repository:

    dnf -y install epel-release
    
  3. Add NVIDIA CUDA Repository:

    dnf -y config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel9/x86_64/cuda-rhel9.repo
    
  4. Install NVIDIA Driver (Version 470):

    dnf -y module install nvidia-driver:470-dkms
    

    Note: If the module install fails, you may need to install directly:

    dnf -y install nvidia-driver-470 nvidia-driver-470-dkms
    
  5. Install CUDA Toolkit 11.4:

    dnf -y install cuda-11-4
    
  6. Set up CUDA Environment Variables:

    # Create /etc/profile.d/cuda-11.4.sh
    cat > /etc/profile.d/cuda-11.4.sh << 'EOF'
    #!/bin/sh
    # cuda-11.4.sh - CUDA 11.4 environment configuration for Tesla K80 support

    export PATH=/usr/local/cuda-11.4/bin${PATH:+:${PATH}}
    export LD_LIBRARY_PATH=/usr/local/cuda-11.4/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
    EOF

    # Apply changes
    source /etc/profile.d/cuda-11.4.sh
    
  7. Reboot to load NVIDIA driver:

    reboot

Verification:

# Check CUDA compiler
nvcc --version
# Should show: Cuda compilation tools, release 11.4

# Check driver and GPU
nvidia-smi
# Should show Tesla K80 GPU(s) with driver version 470.x
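The same checks can be scripted so that a wrong toolkit version is caught automatically. This is a minimal sketch: nvcc_release is a hypothetical helper that relies only on the "release X.Y" text quoted above, and the nvidia-smi call uses its standard --query-gpu options.

```shell
# Extract the "release X.Y" number from `nvcc --version` output on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Compare the toolkit on PATH against the expected 11.4 release.
if command -v nvcc >/dev/null 2>&1; then
    release=$(nvcc --version | nvcc_release)
    if [ "$release" = "11.4" ]; then
        echo "nvcc release $release: OK"
    else
        echo "nvcc release $release: expected 11.4, check PATH"
    fi
fi

# Print GPU name and driver version, one line per GPU.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader
fi
```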

Step 4: CMake 4.0 Installation

  1. Install OpenSSL Development Libraries:

    dnf -y install openssl-devel
    
  2. Download CMake Source Code:

    cd /usr/local/src
    wget https://github.com/Kitware/CMake/releases/download/v4.0.0/cmake-4.0.0.tar.gz
    
  3. Extract the Archive:

    tar xvf cmake-4.0.0.tar.gz
    
  4. Create Installation Directory:

    mkdir /usr/local/cmake-4
    
  5. Configure CMake:

    cd /usr/local/cmake-4
    /usr/local/src/cmake-4.0.0/configure
    
  6. Compile CMake:

    make -j $(nproc)
    
  7. Install CMake:

    make install
    
  8. Verify Installation:

    cmake --version
    # Should output: cmake version 4.0.0
    

Step 5: Go 1.24.2 Installation

  1. Download Go Distribution:

    cd /usr/local
    wget https://go.dev/dl/go1.24.2.linux-amd64.tar.gz
    
  2. Extract the Archive:

    tar xvf go1.24.2.linux-amd64.tar.gz
    
  3. Post Install Configuration:

    cat > /etc/profile.d/go-1.24.2.sh << 'EOF'
    #!/bin/sh
    # go-1.24.2.sh - Go 1.24.2 environment configuration

    export PATH=/usr/local/go/bin${PATH:+:${PATH}}
    EOF

    source /etc/profile.d/go-1.24.2.sh
    
  4. Verify Installation:

    go version
    # Should output: go version go1.24.2 linux/amd64

Step 6: Ollama37 Compilation (Optional - For Custom Builds)

Prerequisites: All components installed as per the guides above:

  • GCC 10 (from Step 1)
  • Rocky Linux 9 with custom kernel 5.14.x (from Step 2)
  • CUDA Toolkit 11.4 (from Step 3)
  • CMake 4.0 (from Step 4)
  • Go 1.24.2 (from Step 5)
  • Git

Compilation Steps:

  1. Navigate to Build Directory:

    cd /usr/local/src
    
  2. Clone the Repository:

    git clone https://github.com/dogkeeper886/ollama37
    cd ollama37
    
  3. CMake Configuration: Configure the build system. If CMake does not pick up GCC 10, export CC and CXX first (see Troubleshooting):

    cmake -B build
    
  4. CMake Build: Compile the C++ components (parallel build):

    cmake --build build -j$(nproc)
    

    Note: -j$(nproc) enables parallel compilation using all available CPU cores. You can specify a number like -j4 to limit the number of parallel jobs.

  5. Go Build: Compile the Go components:

    go build -o ollama .
    
  6. Verification:

    ./ollama --version
    
  7. Optional: Install System-Wide:

    cp ollama /usr/local/bin/
    cp -r lib/ollama /usr/local/lib/
    

Tesla K80 Specific Optimizations

The Ollama37 build includes several Tesla K80-specific optimizations:

CUDA Architecture Support

  • CMake Configuration: CMAKE_CUDA_ARCHITECTURES "37;50;61;70;75;80"
  • Build Files: Located in ml/backend/ggml/ggml/src/ggml-cuda/CMakeLists.txt
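To confirm that an architecture list still covers the K80 (Compute Capability 3.7), a check along these lines may be useful. It is a sketch only: has_cuda_arch is a hypothetical helper, and the grep simply prints whatever CMAKE_CUDA_ARCHITECTURES lines exist in the file named above when run from the repository root.

```shell
# Succeeds when ARCH appears as a whole entry in a semicolon-separated list.
# Usage: has_cuda_arch "37;50;61;70;75;80" 37
has_cuda_arch() {
    case ";$1;" in
        *";$2;"*) return 0 ;;
        *)        return 1 ;;
    esac
}

if has_cuda_arch "37;50;61;70;75;80" 37; then
    echo "architecture list includes compute capability 3.7"
fi

# Show the setting in the source tree, if the file is present.
cmake_file=ml/backend/ggml/ggml/src/ggml-cuda/CMakeLists.txt
if [ -f "$cmake_file" ]; then
    grep -n 'CMAKE_CUDA_ARCHITECTURES' "$cmake_file"
fi
```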

CUDA 11 Compatibility

  • Uses CUDA 11 toolchain (CUDA 12 dropped Compute Capability 3.7 support)
  • Environment variables configured for CUDA 11.4 paths
  • Driver version 470 for maximum compatibility
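The environment script from Step 3 builds PATH and LD_LIBRARY_PATH with the ${VAR:+:${VAR}} expansion, which appends the old value only when it is non-empty. The sketch below isolates that idiom in a hypothetical prepend_path function to show why it matters: a plain "dir:$VAR" would leave a trailing colon when VAR is empty, and an empty LD_LIBRARY_PATH entry is treated as the current directory.

```shell
# Prepend a directory to a colon-separated list without leaving a
# trailing colon when the existing list is empty.
#   $1 = directory to add, $2 = current value of the variable (may be empty)
prepend_path() {
    printf '%s\n' "$1${2:+:$2}"
}

prepend_path /usr/local/cuda-11.4/lib64 ""
# prints /usr/local/cuda-11.4/lib64

prepend_path /usr/local/cuda-11.4/bin "/usr/local/bin:/usr/bin"
# prints /usr/local/cuda-11.4/bin:/usr/local/bin:/usr/bin
```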

Performance Tuning

  • Optimized memory management for the Tesla K80's 12GB of VRAM per GPU
  • Kernel optimizations for Kepler architecture
  • Reduced precision operations where appropriate
  • Enhanced VMM pool with granularity alignment
  • Progressive memory allocation fallback (4GB → 2GB → 1GB → 512MB)

Troubleshooting

NVIDIA Driver Issues

Issue: nvidia-smi shows "Failed to initialize NVML"

Solution:

# Check if driver is loaded
lsmod | grep nvidia

# If not loaded, load manually
modprobe nvidia

# Check dmesg for errors
dmesg | grep -i nvidia

Issue: Driver loads but CUDA version mismatch

Solution:

# Check CUDA version
nvcc --version

# Check driver CUDA support
nvidia-smi

# Ensure PATH points to CUDA 11.4
echo $PATH | grep cuda-11.4

CUDA Compilation Issues

Issue: nvcc not found

Solution:

# Check if CUDA is in PATH
which nvcc

# If not, source environment
source /etc/profile.d/cuda-11.4.sh

# Verify
nvcc --version

Issue: "nvcc fatal: Unsupported gpu architecture 'compute_37'"

Solution: This error means you're using CUDA 12 instead of CUDA 11.4. Ensure:

# Check CUDA version
nvcc --version
# Must show CUDA 11.4

# If wrong version, check PATH
echo $PATH
# Should include /usr/local/cuda-11.4/bin BEFORE any other CUDA paths
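When PATH is long, the ordering is easier to check with a small helper than by eye. This sketch assumes CUDA directories can be recognized by "/cuda" in their path, as in the /usr/local/cuda-11.4 layout used in this guide; first_cuda_dir is a hypothetical name.

```shell
# Print the first entry of a colon-separated PATH value that contains "/cuda".
first_cuda_dir() {
    printf '%s\n' "$1" | tr ':' '\n' | grep '/cuda' | head -n 1
}

# Prints nothing when no CUDA directory is on PATH.
first_cuda_dir "$PATH"
```

If the printed directory is not /usr/local/cuda-11.4/bin, another toolkit is shadowing CUDA 11.4.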

GCC Version Issues

Issue: CMake can't find GCC 10

Solution:

# Check GCC version
/usr/local/bin/gcc --version
# Should show GCC 10.x

# If build fails, explicitly set CC and CXX
export CC=/usr/local/bin/gcc
export CXX=/usr/local/bin/g++

Issue: CUDA compilation fails with GCC 11 errors

Solution:

# CUDA 11.4 is not compatible with GCC 11+
# You MUST use GCC 10 for compilation
# Ensure you've installed GCC 10 (Step 1)

# Verify compiler paths
which gcc  # Should point to /usr/local/bin/gcc
/usr/local/bin/gcc --version  # Should show 10.x

Memory Issues

Issue: Out of memory during model loading

Solution:

  • Tesla K80 has 12GB VRAM per GPU
  • Use quantized models (Q4_0, Q8_0) for better memory efficiency
  • Reduce context length: inside an ollama run session, enter /set parameter num_ctx 2048
  • Monitor GPU memory: watch -n 1 nvidia-smi
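For a compact view of memory pressure, the nvidia-smi query output can be reduced to a percentage per GPU. This is a sketch: gpu_mem_percent is a hypothetical helper that expects "index, used MiB, total MiB" lines, which is what the --query-gpu call below produces with csv,noheader,nounits formatting.

```shell
# Read "index, used_mib, total_mib" lines on stdin and print percent used.
# Lines with a non-numeric or zero total are skipped.
gpu_mem_percent() {
    awk -F', *' '$3 > 0 {
        printf "GPU %s: %d%% of %s MiB used\n", $1, ($2 * 100) / $3, $3
    }'
}

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=index,memory.used,memory.total \
               --format=csv,noheader,nounits | gpu_mem_percent
fi
```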

Build Verification

After successful compilation, verify Tesla K80 support:

# Check if ollama detects your GPU
./ollama serve &

# Pull a small model
./ollama pull llama3.2:3b

# Test inference
./ollama run llama3.2:3b "Hello Tesla K80!"

# Monitor GPU utilization
watch -n 1 nvidia-smi
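Because ollama serve is started in the background, the pull can race with server startup. One way to avoid this is to poll until the server answers. The retry function below is a hypothetical helper, and the endpoint in the usage comment assumes this fork keeps upstream Ollama's default port 11434 and /api/version route.

```shell
# Run a command up to N times, sleeping one second between failed attempts.
# Usage: retry N command [args...]
retry() {
    attempts=$1
    shift
    while [ "$attempts" -gt 0 ]; do
        "$@" && return 0
        attempts=$((attempts - 1))
        [ "$attempts" -gt 0 ] && sleep 1
    done
    return 1
}

# Example (assumed default endpoint; run after `./ollama serve &`):
#   retry 30 curl -fsS http://localhost:11434/api/version && ./ollama pull llama3.2:3b
```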

Performance Optimization Tips

  1. Model Selection: Use quantized models (Q4_0, Q8_0) for better performance on Tesla K80
  2. Memory Management: Monitor VRAM usage and adjust context sizes accordingly
  3. Temperature Control: Ensure adequate cooling for sustained workloads
  4. Power Management: Tesla K80 requires proper power delivery (300W board power for the dual-GPU card)
  5. Multi-GPU: Each K80 board exposes two GPUs; set CUDA_VISIBLE_DEVICES=0,1 to use both GPUs of a single board

Summary: Installation Paths

Path 1: Pre-built Binary (Easier)

  1. Skip GCC 10 installation (not needed for pre-built binaries)
  2. Compile custom kernel 5.14.x
  3. Install NVIDIA Driver 470 & CUDA 11.4
  4. Install CMake 4.0
  5. Install Go 1.24.2
  6. Download and run pre-built ollama37 binary

Path 2: Compile from Source (Advanced - Requires All Steps)

  1. Install GCC 10 (required for kernel and ollama37 compilation)
  2. Compile custom kernel 5.14.x (uses GCC 10)
  3. Install NVIDIA Driver 470 & CUDA 11.4
  4. Install CMake 4.0
  5. Install Go 1.24.2
  6. Compile ollama37 from source (uses GCC 10)

Choose the path that best fits your requirements and technical expertise.