Manual Build Guide for Ollama37
This document provides comprehensive instructions for building Ollama37 from source on various platforms, specifically optimized for Tesla K80 and CUDA Compute Capability 3.7 hardware.
⚠️ Important: Kernel Compatibility Notice
Recent kernel updates in Fedora, Ubuntu, and Rocky Linux have broken compatibility with:
- NVIDIA Driver 470 (required for Tesla K80 / Compute Capability 3.7)
- CUDA 11.4 nvcc compiler
Solution: Compile a compatible kernel from source (Linux 5.14.x) before installing NVIDIA drivers.
Recommended Linux Distribution: Rocky Linux 9
- Rocky Linux 8 has docker-ce compatibility issues
- Rocky Linux 9 provides better stability and container support
Native Build Overview
For native builds on Rocky Linux 9, you'll need to follow these steps in order:
Installation Steps:
- Install GCC 10 - Required for kernel compilation and ollama37 source builds
- Compile Custom Kernel (Linux 5.14.x) - Required for NVIDIA 470 compatibility
- Install NVIDIA Driver 470 - Tesla K80 GPU driver support
- Install CUDA 11.4 Toolkit - CUDA development environment
- Install CMake 4.0 - Build system
- Install Go 1.25.3 - Go compiler
- Compile Ollama37 (Optional - if not using pre-built binaries)
Quick Native Build (after prerequisites):
# Clone repository
git clone https://github.com/dogkeeper886/ollama37
cd ollama37
# If compiling from source (requires GCC 10):
CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake -B build
CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake --build build -j$(nproc)
go build -o ollama .
# If using pre-built binary (GCC 10 not required):
# Just download and run the ollama binary
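Before running the quick build, it can help to confirm that the required tools are actually on PATH. The following is a minimal sketch; the helper name missing_tools is hypothetical, and the tool list in the usage comment is an assumption based on the commands shown above.

```shell
#!/bin/sh
# Print every named tool that is NOT found on PATH, one per line.
# Prints nothing when all tools are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage on the build host (tool list is an assumption):
#   missing_tools git cmake go nvcc
```

An empty result suggests the prerequisites are installed; any name printed points at the step that still needs to be completed.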
Detailed Installation Guide for Rocky Linux 9
Step 1: GCC 10 Installation
Why Install GCC 10 First?
GCC 10 is required for:
- Compiling the custom Linux kernel (Step 2)
- Building ollama37 from source (Step 7)
- CUDA 11.4 compatibility (CUDA 11.4 nvcc is not compatible with GCC 11.5+)
Rocky Linux 9 ships with GCC 11.5 by default, which is:
- ❌ Incompatible with CUDA 11.4 nvcc compiler
- ❌ Not recommended for kernel compilation with NVIDIA drivers
- ✅ Sufficient for running pre-built binaries (if you skip Steps 2 and 7)
Installation Steps
Complete installation script:
# Install prerequisites
dnf -y groupinstall "Development Tools"
# Download and extract GCC 10 source
cd /usr/local/src
wget https://github.com/gcc-mirror/gcc/archive/refs/heads/releases/gcc-10.zip
unzip gcc-10.zip
cd gcc-releases-gcc-10
# Download GCC prerequisites (GMP, MPFR, MPC, ISL)
contrib/download_prerequisites
# Create build directory and configure
mkdir /usr/local/gcc-10
cd /usr/local/gcc-10
/usr/local/src/gcc-releases-gcc-10/configure --disable-multilib
# Compile and install (1-2 hours depending on CPU)
make -j $(nproc)
make install
Note: The compilation step, make -j $(nproc), takes 1-2 hours depending on CPU performance. $(nproc) expands to the number of available CPU cores, so make runs one job per core.
Post-Install Configuration:
# Configure dynamic linker to include both system and GCC 10 library paths
cat > /etc/ld.so.conf.d/gcc-10.conf << 'EOF'
/usr/lib64
/usr/local/lib64
EOF
ldconfig
# Update system compiler symlinks to use GCC 10
rm -f /usr/bin/cc
ln -s /usr/local/bin/gcc /usr/bin/cc
Verify Installation:
# Verify GCC 10 installation
gcc --version
# Should output: gcc (GCC) 10.x.x
g++ --version
# Should output: g++ (GCC) 10.x.x
# Verify symlinks are correct
which cc
# Should output: /usr/bin/cc
ls -al /usr/bin/cc
# Should show: /usr/bin/cc -> /usr/local/bin/gcc
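The version check above can also be scripted, for example at the top of a build script. This is a sketch only: the function names are hypothetical, and it assumes the compiler supports -dumpfullversion (available in GCC 7 and later).

```shell
#!/bin/sh
# Extract the major version from a dotted version string such as "10.5.0".
gcc_major() {
    printf '%s\n' "$1" | cut -d. -f1
}

# Print "ok" when the major version is 10, otherwise a short warning.
check_gcc10() {
    if [ "$(gcc_major "$1")" = "10" ]; then
        echo "ok"
    else
        echo "unexpected GCC version: $1"
    fi
}

# Usage on the build host:
#   check_gcc10 "$(/usr/local/bin/gcc -dumpfullversion)"
```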
Step 2: Kernel Compilation (Required for NVIDIA 470 Compatibility)
Why Compile a Custom Kernel?
Recent kernel updates in Rocky Linux 9, Fedora, and Ubuntu have broken compatibility with:
- NVIDIA Driver 470 (required for Tesla K80 / Compute Capability 3.7)
- CUDA 11.4 nvcc compiler
Solution: Use Linux kernel 5.14.x, which maintains stable NVIDIA 470 driver support.
Prerequisites
System Requirements:
- Rocky Linux 9 (clean installation recommended)
- Root privileges
- At least 20GB free disk space
- Stable internet connection
Install Build Tools:
dnf -y groupinstall "Development Tools"
dnf -y install ncurses-devel
Download Kernel Source
- Navigate to the source directory:
cd /usr/src/kernels
- Download the Linux 5.14.x kernel source:
wget https://www.kernel.org/pub/linux/kernel/v5.x/linux-5.14.tar.xz
Note: Check kernel.org for the latest 5.14.x stable release.
- Extract the archive:
tar xvf linux-5.14.tar.xz
cd linux-5.14
Configure Kernel
- Copy the existing kernel configuration:
# First, check available kernel configurations
ls /usr/src/kernels
# Copy config from the running kernel (adjust version as needed)
# Example: cp /usr/src/kernels/5.14.0-570.52.1.el9_6.x86_64/.config .config
cp /usr/src/kernels/$(uname -r)/.config .config
- Open menuconfig to adjust settings:
make menuconfig
- Required configuration changes:
Navigate to and DISABLE the following options:
a) Disable Module Signature Verification:
   Enable loadable module support
     → [ ] Module signature verification (press N to disable)
b) Disable Trusted Keys:
   Cryptographic API → Certificates for signature checking
     → [ ] Provide system-wide ring of trusted keys (press N)
     → System trusted keys filename = "" (delete any content, leave empty)
c) Disable BTF Debug Info:
   Kernel hacking → Compile-time checks and compiler options
     → [ ] Generate BTF typeinfo (press N to disable CONFIG_DEBUG_INFO_BTF)
Why disable these?
- Module signatures: Signature verification prevents the unsigned NVIDIA proprietary driver from loading
- Trusted keys: Conflicts with out-of-tree driver compilation
- BTF debug: Can cause build failures and is unnecessary for production use
- Save the configuration:
  - Press <Save>
  - Confirm the default filename .config
  - Press <Exit> to quit menuconfig
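After saving, the resulting .config can be checked from the shell before starting the long compile. The sketch below assumes the standard symbol names CONFIG_MODULE_SIG and CONFIG_DEBUG_INFO_BTF for the menu entries above; the function names are hypothetical.

```shell
#!/bin/sh
# Return success (0) when OPTION is not set to y or m in the given .config file.
# A disabled option appears as "# CONFIG_X is not set" or is absent entirely.
config_disabled() {
    ! grep -Eq "^$1=(y|m)\$" "$2"
}

# Report the state of two of the options this guide asks you to disable.
report_config() {
    for opt in CONFIG_MODULE_SIG CONFIG_DEBUG_INFO_BTF; do
        if config_disabled "$opt" "$1"; then
            echo "$opt: disabled"
        else
            echo "$opt: STILL ENABLED"
        fi
    done
}

# Usage from the kernel source directory:
#   report_config .config
```

If either option is reported as still enabled, reopen make menuconfig and repeat the corresponding change before compiling.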
Compile Kernel
- Compile the kernel (using all CPU cores):
make -j$(nproc)
Estimated time: 30-60 minutes depending on CPU performance
- Install kernel modules:
make modules_install
- Install the kernel:
make install
Reboot and Verify
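- Reboot the system:
reboot
- After reboot, verify the kernel version:
uname -r
# Should start with 5.14 and match the source you built
# (5.14.0 for linux-5.14.tar.xz, 5.14.21 if you downloaded that point release)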
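Rocky Linux 9's stock kernel also reports a 5.14.0 base (for example 5.14.0-570.52.1.el9_6.x86_64), so the version number alone does not show that the custom kernel booted. The sketch below tells the two apart by the .el9 suffix. It assumes a kernel built from kernel.org sources carries no such suffix, and the function name is hypothetical.

```shell
#!/bin/sh
# Classify a kernel release string as printed by `uname -r`.
kernel_kind() {
    case "$1" in
        *.el9*)      echo "distribution" ;;  # stock Rocky Linux 9 kernel
        5.14|5.14.*) echo "custom-5.14" ;;   # self-built 5.14.x kernel
        *)           echo "other" ;;
    esac
}

# Usage after reboot:
#   kernel_kind "$(uname -r)"
```

A result of "distribution" usually means the boot loader still selected the stock kernel entry.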
Troubleshooting Kernel Compilation
Issue: BTF-related build errors
BTF: .tmp_vmlinux.btf: pahole (pahole) is not available
Failed to generate BTF for vmlinux
Solution:
- Disable CONFIG_DEBUG_INFO_BTF in menuconfig (see "c) Disable BTF Debug Info" under Configure Kernel above)
Issue: Module signing key errors
Can't read private key
Solution:
- Disable CONFIG_MODULE_SIG_ALL and clear CONFIG_SYSTEM_TRUSTED_KEYS in menuconfig
- Ensure the "System trusted keys filename" field is completely empty
Step 3: NVIDIA Driver 470 Installation
Prerequisites:
- Rocky Linux 9 system running custom kernel 5.14.x (from Step 2)
- Root privileges
- Internet connectivity
Steps:
- Update the system:
dnf -y update
- Install required dependencies:
dnf -y install epel-release
dnf -y install libglvnd-devel.x86_64
- Switch to text mode (runlevel 3):
init 3
Note: This exits the graphical interface. You'll need to log in via the text console.
- Download NVIDIA Driver 470.256.02:
cd /tmp
wget https://us.download.nvidia.com/tesla/470.256.02/NVIDIA-Linux-x86_64-470.256.02.run
- Install the NVIDIA driver:
chmod +x NVIDIA-Linux-x86_64-470.256.02.run
sh NVIDIA-Linux-x86_64-470.256.02.run
Installation prompts:
  - Accept the license agreement
  - If asked about DKMS, select "Yes" to register with DKMS
  - If asked about 32-bit compatibility libraries, select based on your needs
  - If asked about X configuration, select "Yes" if you use a graphical interface
- Reboot to load the NVIDIA driver:
reboot
Verification:
# Check driver and GPU
nvidia-smi
# Should show Tesla K80 GPU(s) with driver version 470.256.02
Step 4: CUDA 11.4 Toolkit Installation
Prerequisites:
- NVIDIA Driver 470 installed and verified (from Step 3)
- Root privileges
Steps:
- Download the CUDA 11.4.0 installer:
cd /tmp
wget https://developer.download.nvidia.com/compute/cuda/11.4.0/local_installers/cuda_11.4.0_470.42.01_linux.run
- Run the CUDA installer:
sh cuda_11.4.0_470.42.01_linux.run
Installation prompts:
  - Accept the license agreement
  - IMPORTANT: Deselect the "Driver" option (the driver was already installed in Step 3)
  - Keep selected: CUDA Toolkit, CUDA Samples, CUDA Demo Suite, CUDA Documentation
  - Confirm installation
- Set up the CUDA environment:
Create two configuration files:
a) PATH configuration in /etc/profile.d/:
cat > /etc/profile.d/cuda-11.4.sh << 'EOF'
#!/bin/sh
# cuda-11.4.sh - CUDA 11.4 PATH configuration for Tesla K80 support
export PATH=/usr/local/cuda-11.4/bin${PATH:+:${PATH}}
EOF
# Apply PATH changes
source /etc/profile.d/cuda-11.4.sh
b) Dynamic linker configuration:
The CUDA installer creates /etc/ld.so.conf.d/cuda-11-4.conf automatically with the following content:
/usr/local/cuda-11.4/lib64
/usr/local/cuda-11.4/targets/x86_64-linux/lib
If the file doesn't exist or needs to be recreated:
cat > /etc/ld.so.conf.d/cuda-11-4.conf << 'EOF'
/usr/local/cuda-11.4/lib64
/usr/local/cuda-11.4/targets/x86_64-linux/lib
EOF
# Update dynamic linker cache
ldconfig
Verification:
# Check CUDA compiler
nvcc --version
# Should show: Cuda compilation tools, release 11.4, V11.4.48
# Check driver and CUDA compatibility
nvidia-smi
# Should show Tesla K80 GPU(s) with driver version 470.256.02 and CUDA Version: 11.4
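For scripted checks, the release number can be pulled out of the nvcc banner rather than read by eye. This is a sketch that assumes the banner contains a line of the form "Cuda compilation tools, release 11.4, V11.4.48" as shown above; the function name is hypothetical.

```shell
#!/bin/sh
# Read `nvcc --version` text on stdin and print the release number (e.g. "11.4").
# Prints nothing if no "release X.Y," line is present.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\),.*/\1/p'
}

# Usage on the build host:
#   nvcc --version | nvcc_release
```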
Step 5: CMake 4.0 Installation
- Install OpenSSL development libraries:
dnf -y install openssl-devel
- Download the CMake source code:
cd /usr/local/src
wget https://github.com/Kitware/CMake/releases/download/v4.0.0/cmake-4.0.0.tar.gz
- Extract the archive:
tar xvf cmake-4.0.0.tar.gz
- Create the build directory:
mkdir /usr/local/cmake-4
- Configure CMake:
cd /usr/local/cmake-4
/usr/local/src/cmake-4.0.0/configure
- Compile CMake:
make -j $(nproc)
- Install CMake:
make install
- Verify the installation:
cmake --version
# Should output: cmake version 4.0.0
Step 6: Go 1.25.3 Installation
- Download the Go distribution:
cd /usr/local
wget https://go.dev/dl/go1.25.3.linux-amd64.tar.gz
- Extract the archive:
tar xvf go1.25.3.linux-amd64.tar.gz
- Post-install configuration:
# The file name must end in .sh so that /etc/profile sources it at login
cat > /etc/profile.d/go.sh << 'EOF'
#!/bin/sh
# go.sh - Go environment configuration
export PATH=/usr/local/go/bin${PATH:+:${PATH}}
EOF
# Apply the configuration
source /etc/profile.d/go.sh
- Verify the installation:
go version
# Should output: go version go1.25.3 linux/amd64
Step 7: Ollama37 Compilation (Optional - For Custom Builds)
Prerequisites: All components installed as per the guides above:
- GCC 10 (from Step 1)
- Rocky Linux 9 with custom kernel 5.14.x (from Step 2)
- NVIDIA Driver 470 (from Step 3)
- CUDA Toolkit 11.4 (from Step 4)
- CMake 4.0 (from Step 5)
- Go 1.25.3 (from Step 6)
- Git
Compilation Steps:
- Navigate to the build directory:
cd /usr/local/src
- Clone the repository:
git clone https://github.com/dogkeeper886/ollama37
cd ollama37
- CMake configuration: set compiler variables and configure the build system:
CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake -B build
- CMake build: compile the C++ components (parallel build):
CC=/usr/local/bin/gcc CXX=/usr/local/bin/g++ cmake --build build -j$(nproc)
Note: -j$(nproc) enables parallel compilation using all available CPU cores. You can specify a number such as -j4 to limit the number of parallel jobs.
- Go build: compile the Go components:
go build -o ollama .
- Verification:
./ollama --version
- Optional: install system-wide:
cp ollama /usr/local/bin/
cp -r lib/ollama /usr/local/lib/
Tesla K80 Specific Optimizations
The Ollama37 build includes several Tesla K80-specific optimizations:
CUDA Architecture Support
- CMake Configuration: CMAKE_CUDA_ARCHITECTURES "37;50;61;70;75;80"
- Build Files: Located in ml/backend/ggml/ggml/src/ggml-cuda/CMakeLists.txt
CUDA 11 Compatibility
- Uses CUDA 11 toolchain (CUDA 12 dropped Compute Capability 3.7 support)
- Environment variables configured for CUDA 11.4 paths
- Driver version 470 for maximum compatibility
Performance Tuning
- Optimized memory management for the Tesla K80's 12GB of VRAM per GPU (24GB per board)
- Kernel optimizations for Kepler architecture
- Reduced precision operations where appropriate
- Enhanced VMM pool with granularity alignment
- Progressive memory allocation fallback (4GB → 2GB → 1GB → 512MB)
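The fallback ladder in the last item can be pictured with a small shell model. This is only an illustration of the 4GB → 2GB → 1GB → 512MB sequence listed above, not the project's allocator code: the function names are hypothetical, and the real implementation presumably reacts to failed allocations rather than to a free-memory figure passed in.

```shell
#!/bin/sh
# Print the candidate pool sizes in MiB, largest first: 4096 2048 1024 512.
fallback_sizes() {
    size=4096
    while [ "$size" -ge 512 ]; do
        echo "$size"
        size=$((size / 2))
    done
}

# Pick the first candidate that fits into the given free memory (MiB).
# Prints 0 when even the smallest candidate does not fit.
pick_pool_size() {
    free_mib=$1
    for s in $(fallback_sizes); do
        if [ "$s" -le "$free_mib" ]; then
            echo "$s"
            return 0
        fi
    done
    echo 0
}

# Example: pick_pool_size 3000   (prints 2048)
```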
Troubleshooting
NVIDIA Driver Issues
Issue: nvidia-smi shows "Failed to initialize NVML"
Solution:
# Check if driver is loaded
lsmod | grep nvidia
# If not loaded, load manually
modprobe nvidia
# Check dmesg for errors
dmesg | grep -i nvidia
Issue: Driver loads but CUDA version mismatch
Solution:
# Check CUDA version
nvcc --version
# Check driver CUDA support
nvidia-smi
# Ensure PATH points to CUDA 11.4
echo $PATH | grep cuda-11.4
CUDA Compilation Issues
Issue: nvcc not found
Solution:
# Check if CUDA is in PATH
which nvcc
# If not, source environment
source /etc/profile.d/cuda-11.4.sh
# Verify
nvcc --version
Issue: "nvcc fatal: Unsupported gpu architecture 'compute_37'"
Solution: This error means you're using CUDA 12 instead of CUDA 11.4. Ensure:
# Check CUDA version
nvcc --version
# Must show CUDA 11.4
# If wrong version, check PATH
echo $PATH
# Should include /usr/local/cuda-11.4/bin BEFORE any other CUDA paths
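When several CUDA toolkits are installed, the first CUDA directory on PATH decides which nvcc runs. A minimal sketch for finding that entry follows; the function name is hypothetical, and the match on "/cuda" assumes the toolkits live in directories named like /usr/local/cuda-11.4.

```shell
#!/bin/sh
# Print the first PATH entry that contains "/cuda"; print nothing if there is none.
first_cuda_entry() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -m 1 '/cuda'
}

# Usage:
#   first_cuda_entry "$PATH"
# The result should be /usr/local/cuda-11.4/bin.
```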
GCC Version Issues
Issue: CMake can't find GCC 10
Solution:
# Check GCC version
/usr/local/bin/gcc --version
# Should show GCC 10.x
# If build fails, explicitly set CC and CXX
export CC=/usr/local/bin/gcc
export CXX=/usr/local/bin/g++
Issue: CUDA compilation fails with GCC 11 errors
Solution:
# CUDA 11.4 is not compatible with GCC 11+
# You MUST use GCC 10 for compilation
# Ensure you've installed GCC 10 (Step 1)
# Verify compiler paths
which gcc # Should point to /usr/local/bin/gcc
gcc --version # Should show 10.x
Memory Issues
Issue: Out of memory during model loading
Solution:
- Tesla K80 has 12GB VRAM per GPU
- Use quantized models (Q4_0, Q8_0) for better memory efficiency
- Reduce context length: start the model with ollama run <model>, then enter /set parameter num_ctx 2048 at the prompt
- Monitor GPU memory: watch -n 1 nvidia-smi
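To see why quantization matters on a 12GB GPU, a rough estimate of weight storage helps. The sketch below assumes the GGML block layouts, where Q4_0 packs 32 weights into 18 bytes (about 4.5 bits per weight) and Q8_0 packs 32 weights into 34 bytes (about 8.5 bits per weight). It ignores the KV cache and runtime overhead, so actual usage is higher. The function name and the 3000-million-parameter example are illustrative.

```shell
#!/bin/sh
# Estimate weight storage in MiB (integer arithmetic, rounded down).
#   $1 = parameter count in millions
#   $2 = bits per weight times 10 (45 for Q4_0, 85 for Q8_0)
weights_mib() {
    echo $(( $1 * 1000000 * $2 / 80 / 1048576 ))
}

# Example for a model with 3000 million parameters:
#   weights_mib 3000 45   (Q4_0, prints 1609)
#   weights_mib 3000 85   (Q8_0, prints 3039)
```

Comparing the estimate with the free memory reported by nvidia-smi gives a first indication of whether a model is likely to fit on one GPU.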
Build Verification
After successful compilation, verify Tesla K80 support:
# Check if ollama detects your GPU
./ollama serve &
# Pull a small model
./ollama pull llama3.2:3b
# Test inference
./ollama run llama3.2:3b "Hello Tesla K80!"
# Monitor GPU utilization
watch -n 1 nvidia-smi
Performance Optimization Tips
- Model Selection: Use quantized models (Q4_0, Q8_0) for better performance on Tesla K80
- Memory Management: Monitor VRAM usage and adjust context sizes accordingly
- Temperature Control: Ensure adequate cooling for sustained workloads
- Power Management: Tesla K80 requires proper power delivery (the dual-GPU board is rated at 300W)
- Multi-GPU: For dual K80 setups, use CUDA_VISIBLE_DEVICES=0,1 to leverage both GPUs
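The device list for CUDA_VISIBLE_DEVICES can be generated instead of typed by hand. Keep in mind that one K80 board exposes two CUDA devices. The helper below is a hypothetical sketch that builds the comma-separated list for a given device count.

```shell
#!/bin/sh
# Print "0,1,...,N-1" for N CUDA devices.
device_list() {
    n=$1
    i=0
    list=""
    while [ "$i" -lt "$n" ]; do
        list="${list:+$list,}$i"
        i=$((i + 1))
    done
    printf '%s\n' "$list"
}

# Usage (assumes two visible devices):
#   CUDA_VISIBLE_DEVICES=$(device_list 2) ./ollama serve
```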
Summary: Installation Paths
Path 1: Pre-built Binary (Easier)
- ❌ Skip GCC 10 installation (not needed for pre-built binaries)
- Compile custom kernel 5.14.x
- Install NVIDIA Driver 470
- Install CUDA 11.4 Toolkit
- Install CMake 4.0
- Install Go 1.25.3
- Download and run pre-built ollama37 binary
Path 2: Compile from Source (Advanced - Requires All Steps)
- ✅ Install GCC 10 (required for kernel and ollama37 compilation)
- Compile custom kernel 5.14.x (uses GCC 10)
- Install NVIDIA Driver 470
- Install CUDA 11.4 Toolkit
- Install CMake 4.0
- Install Go 1.25.3
- Compile ollama37 from source (uses GCC 10)
Choose the path that best fits your requirements and technical expertise.