# CI/CD Plan for Ollama37
This document describes the CI/CD pipeline for building and testing Ollama37 with Tesla K80 (CUDA compute capability 3.7) support.

## Infrastructure Overview
```
┌─────────────────────────────────────────────────────────────────────────┐
│ GITHUB │
│ dogkeeper886/ollama37 │
│ │
│ Push to main ──────────────────────────────────────────────────────┐ │
└─────────────────────────────────────────────────────────────────────│───┘
│
▼
┌─────────────────────────────────────────────────────────────────────────┐
│ CI/CD NODE │
│ │
│ Hardware: │
│ - Tesla K80 GPU (compute capability 3.7) │
│ - NVIDIA Driver 470.x │
│ │
│ Software: │
│ - Rocky Linux 9.7 │
│ - Docker 29.1.3 + Docker Compose 5.0.0 │
│ - NVIDIA Container Toolkit │
│ - GitHub Actions Runner (self-hosted, labels: k80, cuda11) │
│ │
│ Services: │
│ - TestLink (http://localhost:8090) - Test management │
│ - TestLink MCP - Claude Code integration │
│ │
└─────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────┐
│ SERVE NODE │
│ │
│ Services: │
│ - Ollama (production) │
│ - Dify (LLM application platform) │
│ │
└─────────────────────────────────────────────────────────────────────────┘
```
## Build Strategy: Docker-Based

We use the two-stage Docker build system located in `/docker/`:

### Stage 1: Builder Image (Cached)

**Image:** `ollama37-builder:latest` (~15GB)

**Contents:**
- Rocky Linux 8
- CUDA 11.4 toolkit
- GCC 10 (built from source)
- CMake 4.0 (built from source)
- Go 1.25.3

**Build time:** ~90 minutes (first time only, then cached)

**Build command:**
```bash
cd docker && make build-builder
```
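
To confirm the cached builder image really carries the expected toolchain, a quick spot check along these lines works (a minimal sketch; it assumes the tools are on the image's default `PATH`):

```bash
# Print the toolchain versions baked into the builder image
docker run --rm ollama37-builder:latest bash -c \
  'gcc --version | head -1; cmake --version | head -1; go version; nvcc --version | tail -1'
```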
### Stage 2: Runtime Image (Per Build)

**Image:** `ollama37:latest` (~18GB)

**Process:**
1. Clone source from GitHub
2. Configure with CMake ("CUDA 11" preset)
3. Build C/C++/CUDA libraries
4. Build Go binary
5. Package runtime environment

**Build time:** ~10 minutes

**Build command:**
```bash
cd docker && make build-runtime
```
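
For orientation, the runtime build corresponds roughly to the commands below (a sketch only; the Dockerfile under `docker/` is authoritative, and the "CUDA 11" preset name comes from the project's CMake presets):

```bash
# Rough equivalent of the runtime image build steps
git clone https://github.com/dogkeeper886/ollama37.git && cd ollama37
cmake --preset "CUDA 11"            # configure the C/C++/CUDA libraries
cmake --build --preset "CUDA 11"    # compile the GGML/CUDA backend
go build -o ollama .                # build the Go binary
```

Doing this inside the builder image keeps the CI node itself free of compiler installs; only Docker and the NVIDIA Container Toolkit are required on the host.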
## Pipeline Stages

### Stage 1: Docker Build

**Trigger:** Push to `main` branch

**Steps:**
1. Checkout repository
2. Ensure builder image exists (build if not)
3. Build runtime image: `make build-runtime`
4. Verify image created successfully

**Test Cases:**
- TC-BUILD-001: Builder Image Verification
- TC-BUILD-002: Runtime Image Build
- TC-BUILD-003: Image Size Validation
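
In shell terms the job boils down to something like the sketch below (illustrative only; the real steps live in `.github/workflows/build-test.yml`):

```bash
# Build the builder image only if it is missing, then build and verify the runtime image
docker image inspect ollama37-builder:latest >/dev/null 2>&1 || (cd docker && make build-builder)
(cd docker && make build-runtime)
docker image inspect ollama37:latest --format 'ollama37:latest size: {{.Size}} bytes'
```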
### Stage 2: Container Startup

**Steps:**
1. Start container with GPU: `docker compose up -d`
2. Wait for health check to pass
3. Verify Ollama server is responding

**Test Cases:**
- TC-RUNTIME-001: Container Startup
- TC-RUNTIME-002: GPU Detection
- TC-RUNTIME-003: Health Check
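
Run by hand, this stage looks roughly like the following (it assumes the compose file publishes port 11434 and names the service `ollama37`; adjust to whatever `docker/docker-compose.yml` actually uses):

```bash
# Start the container, wait until the API answers, then confirm the K80 is visible inside it
cd docker && docker compose up -d
until curl -fsS http://localhost:11434/api/version >/dev/null; do sleep 5; done
docker compose exec ollama37 nvidia-smi
```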
### Stage 3: Inference Tests

**Steps:**
1. Pull test model (gemma3:4b)
2. Run inference tests
3. Verify CUBLAS legacy fallback

**Test Cases:**
- TC-INFERENCE-001: Model Pull
- TC-INFERENCE-002: Basic Inference
- TC-INFERENCE-003: API Endpoint Test
- TC-INFERENCE-004: CUBLAS Fallback Verification
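
Executed manually, the core checks look roughly like this (assuming port 11434 is published on the host; the CUBLAS check simply greps the server log, so the exact string depends on the build):

```bash
# Pull the test model, exercise /api/generate, then look for the legacy CUBLAS path in the logs
curl -fsS http://localhost:11434/api/pull -d '{"model": "gemma3:4b", "stream": false}'
curl -fsS http://localhost:11434/api/generate \
  -d '{"model": "gemma3:4b", "prompt": "Why is the sky blue?", "stream": false}'
docker compose logs | grep -i cublas
```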
### Stage 4: Cleanup & Report

**Steps:**
1. Stop container: `docker compose down`
2. Report results to TestLink
3. Clean up resources
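
The teardown itself is small; something like the following is enough (the prune is optional and shown only as one way to reclaim space between runs):

```bash
# Stop the test container and optionally reclaim space left over from the build
cd docker && docker compose down
docker image prune -f
```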
## Test Case Design

### Build Tests (Suite: Build Tests)

| ID | Name | Type | Description |
|----|------|------|-------------|
| TC-BUILD-001 | Builder Image Verification | Automated | Verify builder image exists with correct tools |
| TC-BUILD-002 | Runtime Image Build | Automated | Build runtime image from GitHub source |
| TC-BUILD-003 | Image Size Validation | Automated | Verify image sizes are within expected range |

### Runtime Tests (Suite: Runtime Tests)

| ID | Name | Type | Description |
|----|------|------|-------------|
| TC-RUNTIME-001 | Container Startup | Automated | Start container with GPU passthrough |
| TC-RUNTIME-002 | GPU Detection | Automated | Verify Tesla K80 detected inside container |
| TC-RUNTIME-003 | Health Check | Automated | Verify Ollama health check passes |

### Inference Tests (Suite: Inference Tests)

| ID | Name | Type | Description |
|----|------|------|-------------|
| TC-INFERENCE-001 | Model Pull | Automated | Pull gemma3:4b model |
| TC-INFERENCE-002 | Basic Inference | Automated | Run simple prompt and verify response |
| TC-INFERENCE-003 | API Endpoint Test | Automated | Test /api/generate endpoint |
| TC-INFERENCE-004 | CUBLAS Fallback Verification | Automated | Verify legacy CUBLAS functions used |
## GitHub Actions Workflow

**File:** `.github/workflows/build-test.yml`

**Triggers:**
- Push to `main` branch
- Pull request to `main` branch
- Manual trigger (workflow_dispatch)

**Runner:** Self-hosted with labels `[self-hosted, k80, cuda11]`

**Jobs:**
1. `build` - Build Docker runtime image
2. `test` - Run inference tests in container
3. `report` - Report results to TestLink
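
Because `workflow_dispatch` is enabled, a run can also be started from the CLI (this assumes the GitHub CLI `gh` is installed and authenticated):

```bash
# Trigger the workflow manually against main and follow its progress
gh workflow run build-test.yml --ref main
gh run watch
```

This is handy for re-running the pipeline after fixing something on the CI node without pushing an empty commit.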
## TestLink Integration

**URL:** http://localhost:8090

**Project:** ollama37

**Test Suites:**
- Build Tests
- Runtime Tests
- Inference Tests

**Test Plan:** Created per release/sprint

**Builds:** Created per CI run (commit SHA)

**Execution Recording:**
- Each test case result recorded via TestLink API
- Pass/Fail status with notes
- Linked to specific build/commit
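
TestLink's automation interface is XML-RPC; a quick connectivity check from the CI node can look like this (the endpoint path below is the usual one for TestLink 1.9.x and may differ in other versions):

```bash
# Ping the TestLink XML-RPC API; any well-formed response confirms the URL and service are up
curl -fsS -H 'Content-Type: text/xml' \
  "${TESTLINK_URL:-http://localhost:8090}/lib/api/xmlrpc/v1/xmlrpc.php" \
  -d '<?xml version="1.0"?><methodCall><methodName>tl.sayHello</methodName></methodCall>'
```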
## Makefile Targets for CI

| Target | Description | When to Use |
|--------|-------------|-------------|
| `make build-builder` | Build base image | First-time setup |
| `make build-runtime` | Build from GitHub | Normal CI builds |
| `make build-runtime-no-cache` | Fresh GitHub clone | When cache is stale |
| `make build-runtime-local` | Build from local source | Local testing |

## Environment Variables

### Build Environment

| Variable | Value | Description |
|----------|-------|-------------|
| `BUILDER_IMAGE` | ollama37-builder | Builder image name |
| `RUNTIME_IMAGE` | ollama37 | Runtime image name |

### Runtime Environment

| Variable | Value | Description |
|----------|-------|-------------|
| `OLLAMA_HOST` | 0.0.0.0:11434 | Server listen address |
| `NVIDIA_VISIBLE_DEVICES` | all | GPU visibility |
| `OLLAMA_DEBUG` | 1 (optional) | Enable debug logging |
| `GGML_CUDA_DEBUG` | 1 (optional) | Enable CUDA debug logging |
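
As an illustration of how these variables reach the container (the compose file under `docker/` is the authoritative place to set them), a one-off debug run might look like:

```bash
# Manual debug run of the runtime image with the variables from the table above
docker run --rm --gpus all -p 11434:11434 \
  -e OLLAMA_HOST=0.0.0.0:11434 \
  -e OLLAMA_DEBUG=1 \
  -e GGML_CUDA_DEBUG=1 \
  ollama37:latest
```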
### TestLink Environment

| Variable | Value | Description |
|----------|-------|-------------|
| `TESTLINK_URL` | http://localhost:8090 | TestLink server URL |
| `TESTLINK_API_KEY` | (configured) | API key for automation |

## Prerequisites

### One-Time Setup on CI/CD Node

1. **Install GitHub Actions Runner:**

   ```bash
   mkdir -p ~/actions-runner && cd ~/actions-runner
   curl -o actions-runner-linux-x64-2.321.0.tar.gz -L \
     https://github.com/actions/runner/releases/download/v2.321.0/actions-runner-linux-x64-2.321.0.tar.gz
   tar xzf ./actions-runner-linux-x64-2.321.0.tar.gz
   ./config.sh --url https://github.com/dogkeeper886/ollama37 --token YOUR_TOKEN --labels k80,cuda11
   sudo ./svc.sh install && sudo ./svc.sh start
   ```

2. **Build Builder Image (one-time, ~90 min):**

   ```bash
   cd /home/jack/src/ollama37/docker
   make build-builder
   ```

3. **Verify GPU Access in Docker:**

   ```bash
   docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
   ```

4. **Start TestLink:**

   ```bash
   cd /home/jack/src/testlink-code
   docker compose up -d
   ```
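
After the four steps above, a quick sanity check that the node is ready can be done in one go (run `svc.sh` from the runner directory; image and URL names as configured above):

```bash
# Confirm the runner service, the GPU, the builder image, and TestLink are all in place
cd ~/actions-runner && sudo ./svc.sh status
nvidia-smi
docker image inspect ollama37-builder:latest >/dev/null && echo "builder image present"
curl -fsS http://localhost:8090 >/dev/null && echo "TestLink reachable"
```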
## Monitoring & Logs

### View CI/CD Logs

```bash
# GitHub Actions Runner logs
journalctl -u actions.runner.* -f

# Docker build logs
docker compose logs -f

# TestLink logs
cd /home/jack/src/testlink-code && docker compose logs -f
```

### Test Results

- **TestLink Dashboard:** http://localhost:8090
- **GitHub Actions:** https://github.com/dogkeeper886/ollama37/actions

## Troubleshooting

### Builder Image Missing

```bash
cd docker && make build-builder
```

### GPU Not Detected in Container

```bash
# Check UVM device files on host
ls -l /dev/nvidia-uvm*

# Create if missing
nvidia-modprobe -u -c=0

# Restart container
docker compose restart
```
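
If the device files look fine but the server still reports no GPU, compare what the container sees with what the host sees (the service name `ollama37` is an assumption; use the one from your compose file):

```bash
# Check GPU and UVM device visibility from inside the running container
docker compose exec ollama37 nvidia-smi
docker compose exec ollama37 sh -c 'ls -l /dev/nvidia*'
```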
### Build Cache Stale

```bash
cd docker && make build-runtime-no-cache
```

### TestLink Connection Failed

```bash
# Check TestLink is running
curl http://localhost:8090

# Restart if needed
cd /home/jack/src/testlink-code && docker compose restart
```