# CI/CD Plan for Ollama37

This document describes the CI/CD pipeline for building and testing Ollama37 with Tesla K80 (CUDA compute capability 3.7) support.

## Infrastructure Overview
```
┌────────────────────────────────────────────────────────────┐
│                           GITHUB                           │
│                   dogkeeper886/ollama37                    │
│                                                            │
│  Push to main ───────────────────────────────────────────┐ │
└──────────────────────────────────────────────────────────│─┘
                                                           │
                                                           ▼
┌────────────────────────────────────────────────────────────┐
│                         CI/CD NODE                         │
│                                                            │
│  Hardware:                                                 │
│  - Tesla K80 GPU (compute capability 3.7)                  │
│  - NVIDIA Driver 470.x                                     │
│                                                            │
│  Software:                                                 │
│  - Rocky Linux 9.7                                         │
│  - Docker 29.1.3 + Docker Compose 5.0.0                    │
│  - NVIDIA Container Toolkit                                │
│  - GitHub Actions Runner (self-hosted, labels: k80, cuda11)│
│                                                            │
│  Services:                                                 │
│  - TestLink (http://localhost:8090) - Test management      │
│  - TestLink MCP - Claude Code integration                  │
│                                                            │
└──────────────────────────────────────────────────────────┬─┘
                                                           │
                                                           ▼
┌────────────────────────────────────────────────────────────┐
│                         SERVE NODE                         │
│                                                            │
│  Services:                                                 │
│  - Ollama (production)                                     │
│  - Dify (LLM application platform)                         │
│                                                            │
└────────────────────────────────────────────────────────────┘
```
## Build Strategy: Docker-Based

We use the two-stage Docker build system located in `/docker/`:

### Stage 1: Builder Image (Cached)

Image: `ollama37-builder:latest` (~15GB)

Contents:
- Rocky Linux 8
- CUDA 11.4 toolkit
- GCC 10 (built from source)
- CMake 4.0 (built from source)
- Go 1.25.3

Build time: ~90 minutes (first time only, then cached)

Build command:

```bash
cd docker && make build-builder
```

### Stage 2: Runtime Image (Per Build)

Image: `ollama37:latest` (~18GB)

Process:
- Clone source from GitHub
- Configure with CMake ("CUDA 11" preset)
- Build C/C++/CUDA libraries
- Build Go binary
- Package runtime environment

Build time: ~10 minutes

Build command:

```bash
cd docker && make build-runtime
```
## Pipeline Stages

### Stage 1: Docker Build

Trigger: Push to `main` branch

Steps:
- Checkout repository
- Ensure builder image exists (build if not)
- Build runtime image:
  ```bash
  make build-runtime
  ```
- Verify image created successfully

Test Cases:
- TC-BUILD-001: Builder Image Verification
- TC-BUILD-002: Runtime Image Build
- TC-BUILD-003: Image Size Validation
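The Stage 1 checks could be scripted roughly as below. The image names and Makefile targets come from this plan; the helper names and the size bounds are illustrative assumptions, not project requirements.

```bash
#!/usr/bin/env bash
# Sketch of the Stage 1 build checks (TC-BUILD-001/002/003).

# TC-BUILD-001: does an image exist in the local Docker cache?
image_exists() { docker image inspect "$1" >/dev/null 2>&1; }

# TC-BUILD-003 helpers: convert a byte count to whole GiB, then range-check it.
bytes_to_gib() { echo $(( $1 / 1024 / 1024 / 1024 )); }
size_within()  { [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]; }

# Typical Stage 1 flow (requires Docker and the /docker Makefile):
#   image_exists ollama37-builder:latest || (cd docker && make build-builder)
#   (cd docker && make build-runtime)
#   gib=$(bytes_to_gib "$(docker image inspect --format '{{.Size}}' ollama37:latest)")
#   size_within "$gib" 10 25 || echo "unexpected runtime image size: ${gib} GiB"
```

The range check is deliberately loose (10-25 GiB around the documented ~18GB) so routine layer-size drift does not fail the build.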
### Stage 2: Container Startup

Steps:
- Start container with GPU:
  ```bash
  docker compose up -d
  ```
- Wait for health check to pass
- Verify Ollama server is responding

Test Cases:
- TC-RUNTIME-001: Container Startup
- TC-RUNTIME-002: GPU Detection
- TC-RUNTIME-003: Health Check
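The startup checks could look like the sketch below. The endpoint comes from `OLLAMA_HOST` later in this document; the compose service name `ollama` and the timeout values are assumptions.

```bash
#!/usr/bin/env bash
# Sketch of the Stage 2 startup checks (TC-RUNTIME-002/003).

# TC-RUNTIME-003: poll the Ollama root endpoint until it answers or we time out.
wait_for_ollama() {
  local url=${1:-http://localhost:11434} tries=${2:-30} delay=${3:-2}
  local i
  for (( i = 0; i < tries; i++ )); do
    curl -fsS "$url" >/dev/null 2>&1 && return 0
    sleep "$delay"
  done
  return 1
}

# TC-RUNTIME-002: look for the K80 in nvidia-smi output captured from the container.
gpu_detected() { grep -qi 'Tesla K80' <<< "$1"; }

# Typical Stage 2 flow:
#   docker compose up -d
#   wait_for_ollama http://localhost:11434 30 2 || exit 1
#   gpu_detected "$(docker compose exec ollama nvidia-smi)" || exit 1
```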
### Stage 3: Inference Tests

Steps:
- Pull test model (gemma3:4b)
- Run inference tests
- Verify CUBLAS legacy fallback

Test Cases:
- TC-INFERENCE-001: Model Pull
- TC-INFERENCE-002: Basic Inference
- TC-INFERENCE-003: API Endpoint Test
- TC-INFERENCE-004: CUBLAS Fallback Verification
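The inference checks could be driven through Ollama's HTTP API as sketched below. The model name and `/api/generate` endpoint come from this document; the response-validation rule and the log string grepped for in the fallback check are assumptions.

```bash
#!/usr/bin/env bash
# Sketch of the Stage 3 inference checks (TC-INFERENCE-001..004).

# Build the JSON body for a one-shot (non-streaming) /api/generate call.
generate_body() {
  printf '{"model":"%s","prompt":"%s","stream":false}' "$1" "$2"
}

# Minimal sanity check on a generate response: non-empty and has a "response" field.
looks_like_response() { [ -n "$1" ] && grep -q '"response"' <<< "$1"; }

# Typical Stage 3 flow:
#   curl -s http://localhost:11434/api/pull -d '{"name":"gemma3:4b"}'     # TC-INFERENCE-001
#   out=$(curl -s http://localhost:11434/api/generate \
#           -d "$(generate_body gemma3:4b 'Say hello')")                  # TC-INFERENCE-002/003
#   looks_like_response "$out" || exit 1
#   docker compose logs | grep -qi 'cublas' || exit 1                     # TC-INFERENCE-004 (log text is an assumption)
```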
### Stage 4: Cleanup & Report

Steps:
- Stop container:
  ```bash
  docker compose down
  ```
- Report results to TestLink
- Clean up resources
## Test Case Design

### Build Tests (Suite: Build Tests)
| ID | Name | Type | Description |
|---|---|---|---|
| TC-BUILD-001 | Builder Image Verification | Automated | Verify builder image exists with correct tools |
| TC-BUILD-002 | Runtime Image Build | Automated | Build runtime image from GitHub source |
| TC-BUILD-003 | Image Size Validation | Automated | Verify image sizes are within expected range |
### Runtime Tests (Suite: Runtime Tests)
| ID | Name | Type | Description |
|---|---|---|---|
| TC-RUNTIME-001 | Container Startup | Automated | Start container with GPU passthrough |
| TC-RUNTIME-002 | GPU Detection | Automated | Verify Tesla K80 detected inside container |
| TC-RUNTIME-003 | Health Check | Automated | Verify Ollama health check passes |
### Inference Tests (Suite: Inference Tests)
| ID | Name | Type | Description |
|---|---|---|---|
| TC-INFERENCE-001 | Model Pull | Automated | Pull gemma3:4b model |
| TC-INFERENCE-002 | Basic Inference | Automated | Run simple prompt and verify response |
| TC-INFERENCE-003 | API Endpoint Test | Automated | Test /api/generate endpoint |
| TC-INFERENCE-004 | CUBLAS Fallback Verification | Automated | Verify legacy CUBLAS functions used |
## GitHub Actions Workflow

File: `.github/workflows/build-test.yml`

Triggers:
- Push to `main` branch
- Pull request to `main` branch
- Manual trigger (`workflow_dispatch`)

Runner: Self-hosted with labels `[self-hosted, k80, cuda11]`

Jobs:
- `build` - Build Docker runtime image
- `test` - Run inference tests in container
- `report` - Report results to TestLink
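A minimal sketch of what `build-test.yml` might look like, assuming the stages above map one-to-one onto jobs. Anything beyond the documented `make` targets and labels is an assumption, in particular the `npm test` invocation and the hypothetical `scripts/report-testlink.sh` helper.

```yaml
name: build-test
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  workflow_dispatch:

jobs:
  build:
    runs-on: [self-hosted, k80, cuda11]
    steps:
      - uses: actions/checkout@v4
      - name: Ensure builder image exists
        run: cd docker && (docker image inspect ollama37-builder:latest || make build-builder)
      - name: Build runtime image
        run: cd docker && make build-runtime
  test:
    needs: build
    runs-on: [self-hosted, k80, cuda11]
    steps:
      - uses: actions/checkout@v4
      - name: Start container
        run: docker compose up -d
      - name: Run inference tests
        run: npm test                 # test-runner invocation is an assumption
      - name: Stop container
        if: always()
        run: docker compose down
  report:
    needs: test
    if: always()
    runs-on: [self-hosted, k80, cuda11]
    steps:
      - name: Record results in TestLink
        run: ./scripts/report-testlink.sh   # hypothetical helper script
```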
## TestLink Integration
Project: ollama37
Test Suites:
- Build Tests
- Runtime Tests
- Inference Tests
Test Plan: Created per release/sprint
Builds: Created per CI run (commit SHA)
Execution Recording:
- Each test case result recorded via TestLink API
- Pass/Fail status with notes
- Linked to specific build/commit
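Recording a result could go through TestLink's standard XML-RPC API (`tl.reportTCResult` at `/lib/api/xmlrpc/v1/xmlrpc.php`). In the sketch below the plan ID, build name, and the use of this document's TC IDs as TestLink external IDs are all placeholders.

```bash
#!/usr/bin/env bash
# Build the XML-RPC payload for TestLink's tl.reportTCResult method.
# $1=devKey  $2=test case external id  $3=test plan id
# $4=build name  $5=status: p(ass)/f(ail)/b(locked)  $6=notes
tl_report_payload() {
  cat <<EOF
<?xml version="1.0"?>
<methodCall><methodName>tl.reportTCResult</methodName><params><param><value><struct>
<member><name>devKey</name><value><string>$1</string></value></member>
<member><name>testcaseexternalid</name><value><string>$2</string></value></member>
<member><name>testplanid</name><value><int>$3</int></value></member>
<member><name>buildname</name><value><string>$4</string></value></member>
<member><name>status</name><value><string>$5</string></value></member>
<member><name>notes</name><value><string>$6</string></value></member>
</struct></value></param></params></methodCall>
EOF
}

# Typical call from the report stage:
#   tl_report_payload "$TESTLINK_API_KEY" TC-BUILD-002 1 "$GITHUB_SHA" p "built ok" |
#     curl -s -H 'Content-Type: text/xml' --data-binary @- \
#          "$TESTLINK_URL/lib/api/xmlrpc/v1/xmlrpc.php"
```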
## Makefile Targets for CI

| Target | Description | When to Use |
|---|---|---|
| `make build-builder` | Build base image | First-time setup |
| `make build-runtime` | Build from GitHub | Normal CI builds |
| `make build-runtime-no-cache` | Fresh GitHub clone | When cache is stale |
| `make build-runtime-local` | Build from local | Local testing |
## Environment Variables

### Build Environment

| Variable | Value | Description |
|---|---|---|
| `BUILDER_IMAGE` | `ollama37-builder` | Builder image name |
| `RUNTIME_IMAGE` | `ollama37` | Runtime image name |

### Runtime Environment

| Variable | Value | Description |
|---|---|---|
| `OLLAMA_HOST` | `0.0.0.0:11434` | Server listen address |
| `NVIDIA_VISIBLE_DEVICES` | `all` | GPU visibility |
| `OLLAMA_DEBUG` | `1` (optional) | Enable debug logging |
| `GGML_CUDA_DEBUG` | `1` (optional) | Enable CUDA debug |

### TestLink Environment

| Variable | Value | Description |
|---|---|---|
| `TESTLINK_URL` | `http://localhost:8090` | TestLink server URL |
| `TESTLINK_API_KEY` | (configured) | API key for automation |
## Prerequisites

### One-Time Setup on CI/CD Node

1. Install GitHub Actions Runner:

   ```bash
   mkdir -p ~/actions-runner && cd ~/actions-runner
   curl -o actions-runner-linux-x64-2.321.0.tar.gz -L \
     https://github.com/actions/runner/releases/download/v2.321.0/actions-runner-linux-x64-2.321.0.tar.gz
   tar xzf ./actions-runner-linux-x64-2.321.0.tar.gz
   ./config.sh --url https://github.com/dogkeeper886/ollama37 --token YOUR_TOKEN --labels k80,cuda11
   sudo ./svc.sh install && sudo ./svc.sh start
   ```

2. Build Builder Image (one-time, ~90 min):

   ```bash
   cd /home/jack/src/ollama37/docker
   make build-builder
   ```

3. Verify GPU Access in Docker:

   ```bash
   docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
   ```

4. Start TestLink:

   ```bash
   cd /home/jack/src/testlink-code
   docker compose up -d
   ```
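The one-time setup can be sanity-checked with a small preflight script. The paths, image name, and URL are the ones used in this document; the `check` helper and the systemd unit glob are assumptions.

```bash
#!/usr/bin/env bash
# Preflight sketch: report pass/fail for each prerequisite without aborting,
# so every missing piece shows up in one run.

check() {
  # Run a command quietly and print one ok/FAIL line for it.
  if "$@" >/dev/null 2>&1; then echo "ok:   $*"; else echo "FAIL: $*"; fi
}

preflight() {
  check systemctl is-active 'actions.runner.*'        # runner service running
  check docker image inspect ollama37-builder:latest  # builder image cached
  check docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi  # GPU access
  check curl -fsS http://localhost:8090               # TestLink reachable
}

# On the CI/CD node, run: preflight
```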
## Monitoring & Logs

### View CI/CD Logs

```bash
# GitHub Actions Runner logs
journalctl -u actions.runner.* -f

# Docker build logs
docker compose logs -f

# TestLink logs
cd /home/jack/src/testlink-code && docker compose logs -f
```

### Test Results

- TestLink Dashboard: http://localhost:8090
- GitHub Actions: https://github.com/dogkeeper886/ollama37/actions
## Troubleshooting

### Builder Image Missing

```bash
cd docker && make build-builder
```

### GPU Not Detected in Container

```bash
# Check UVM device files on host
ls -l /dev/nvidia-uvm*

# Create if missing
nvidia-modprobe -u -c=0

# Restart container
docker compose restart
```

### Build Cache Stale

```bash
cd docker && make build-runtime-no-cache
```

### TestLink Connection Failed

```bash
# Check TestLink is running
curl http://localhost:8090

# Restart if needed
cd /home/jack/src/testlink-code && docker compose restart
```