Add GitHub Actions CI/CD pipeline and test framework

- Add .github/workflows/build-test.yml for automated testing
- Add tests/ directory with TypeScript test runner
- Add docs/CICD.md documentation
- Remove .gitlab-ci.yml (migrated to GitHub Actions)
- Update .gitignore for test artifacts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Author: Shang Chieh Tseng
Date: 2025-12-15 14:06:44 +08:00
Parent: 2b5aeaf86b
Commit: d11140c016
23 changed files with 3014 additions and 50 deletions

docs/CICD.md (new file, 318 lines)
# CI/CD Plan for Ollama37
This document describes the CI/CD pipeline for building and testing Ollama37 with Tesla K80 (CUDA compute capability 3.7) support.
## Infrastructure Overview
```
┌─────────────────────────────────────────────────────────────────────────┐
│ GITHUB │
│ dogkeeper886/ollama37 │
│ │
│ Push to main ──────────────────────────────────────────────────────┐ │
└─────────────────────────────────────────────────────────────────────│───┘
┌─────────────────────────────────────────────────────────────────────────┐
│ CI/CD NODE │
│ │
│ Hardware: │
│ - Tesla K80 GPU (compute capability 3.7) │
│ - NVIDIA Driver 470.x │
│ │
│ Software: │
│ - Rocky Linux 9.7 │
│ - Docker 29.1.3 + Docker Compose 5.0.0 │
│ - NVIDIA Container Toolkit │
│ - GitHub Actions Runner (self-hosted, labels: k80, cuda11) │
│ │
│ Services: │
│ - TestLink (http://localhost:8090) - Test management │
│ - TestLink MCP - Claude Code integration │
│ │
└─────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────┐
│ SERVE NODE │
│ │
│ Services: │
│ - Ollama (production) │
│ - Dify (LLM application platform) │
│ │
└─────────────────────────────────────────────────────────────────────────┘
```
## Build Strategy: Docker-Based
We use the two-stage Docker build system located in `/docker/`:
### Stage 1: Builder Image (Cached)
**Image:** `ollama37-builder:latest` (~15GB)
**Contents:**
- Rocky Linux 8
- CUDA 11.4 toolkit
- GCC 10 (built from source)
- CMake 4.0 (built from source)
- Go 1.25.3
**Build time:** ~90 minutes (first time only, then cached)
**Build command:**
```bash
cd docker && make build-builder
```
### Stage 2: Runtime Image (Per Build)
**Image:** `ollama37:latest` (~18GB)
**Process:**
1. Clone source from GitHub
2. Configure with CMake ("CUDA 11" preset)
3. Build C/C++/CUDA libraries
4. Build Go binary
5. Package runtime environment
**Build time:** ~10 minutes
**Build command:**
```bash
cd docker && make build-runtime
```
## Pipeline Stages
### Stage 1: Docker Build
**Trigger:** Push to `main` branch
**Steps:**
1. Checkout repository
2. Ensure builder image exists (build if not)
3. Build runtime image: `make build-runtime`
4. Verify image created successfully
**Test Cases:**
- TC-BUILD-001: Builder Image Verification
- TC-BUILD-002: Runtime Image Build
- TC-BUILD-003: Image Size Validation
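The build stage above can be sketched as a shell script. The image names come from the Makefile targets described later; treat the script as a sketch, not the exact CI job:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Return 0 if the named image exists in the local Docker cache.
image_exists() {
  docker image inspect "$1" >/dev/null 2>&1
}

# Stage 1 sketch: reuse the cached builder image, then rebuild the
# runtime image and fail the job if it did not appear.
build_stage() {
  if ! image_exists ollama37-builder:latest; then
    (cd docker && make build-builder)   # first run only, ~90 min
  fi
  (cd docker && make build-runtime)
  image_exists ollama37:latest || { echo "runtime image missing" >&2; return 1; }
}
```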
### Stage 2: Container Startup
**Steps:**
1. Start container with GPU: `docker compose up -d`
2. Wait for health check to pass
3. Verify Ollama server is responding
**Test Cases:**
- TC-RUNTIME-001: Container Startup
- TC-RUNTIME-002: GPU Detection
- TC-RUNTIME-003: Health Check
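A minimal sketch of the startup checks, assuming the compose service is named `ollama` and that Ollama's `/api/version` endpoint answers once the server is up:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Poll the Ollama HTTP API until it responds, or give up after N tries.
wait_for_ollama() {
  local url=${1:-http://localhost:11434} tries=${2:-30}
  for _ in $(seq "$tries"); do
    curl -sf "$url/api/version" >/dev/null && return 0
    sleep 2
  done
  echo "Ollama did not become healthy" >&2
  return 1
}

# TC-RUNTIME-002: confirm the Tesla K80 is visible inside the container.
gpu_visible() {
  docker compose exec ollama nvidia-smi -L | grep -q "Tesla K80"
}
```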
### Stage 3: Inference Tests
**Steps:**
1. Pull test model (gemma3:4b)
2. Run inference tests
3. Verify CUBLAS legacy fallback
**Test Cases:**
- TC-INFERENCE-001: Model Pull
- TC-INFERENCE-002: Basic Inference
- TC-INFERENCE-003: API Endpoint Test
- TC-INFERENCE-004: CUBLAS Fallback Verification
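These steps map onto the Ollama HTTP API roughly as follows. The request fields follow the public `/api/pull` and `/api/generate` endpoints; the log text matched for the CUBLAS check and the `ollama` service name are assumptions:

```bash
#!/usr/bin/env bash
set -euo pipefail

OLLAMA_URL=${OLLAMA_URL:-http://localhost:11434}

# TC-INFERENCE-001: pull the test model.
pull_model() {
  curl -sf "$OLLAMA_URL/api/pull" -d "{\"model\": \"$1\"}" >/dev/null
}

# TC-INFERENCE-002/003: run one prompt through /api/generate.
run_prompt() {
  curl -sf "$OLLAMA_URL/api/generate" \
    -d "{\"model\": \"$1\", \"prompt\": \"$2\", \"stream\": false}"
}

# TC-INFERENCE-004: look for legacy CUBLAS usage in the server log.
check_cublas_fallback() {
  docker compose logs ollama 2>&1 | grep -qi cublas
}
```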
### Stage 4: Cleanup & Report
**Steps:**
1. Stop container: `docker compose down`
2. Report results to TestLink
3. Clean up resources
## Test Case Design
### Build Tests (Suite: Build Tests)
| ID | Name | Type | Description |
|----|------|------|-------------|
| TC-BUILD-001 | Builder Image Verification | Automated | Verify builder image exists with correct tools |
| TC-BUILD-002 | Runtime Image Build | Automated | Build runtime image from GitHub source |
| TC-BUILD-003 | Image Size Validation | Automated | Verify image sizes are within expected range |
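TC-BUILD-003 can be implemented as a size probe plus a bounds check. The 15-20 GB band below is an example derived from the image sizes quoted earlier in this document, not a hard requirement:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Image size in whole gigabytes, from Docker's stored byte count.
image_size_gb() {
  docker image inspect --format '{{.Size}}' "$1" | awk '{printf "%d", $1 / 1e9}'
}

# Pure bounds check: is $1 within [$2, $3]?
within_range() {
  [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Example: fail if the runtime image drifts outside 15-20 GB.
validate_runtime_size() {
  within_range "$(image_size_gb ollama37:latest)" 15 20
}
```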
### Runtime Tests (Suite: Runtime Tests)
| ID | Name | Type | Description |
|----|------|------|-------------|
| TC-RUNTIME-001 | Container Startup | Automated | Start container with GPU passthrough |
| TC-RUNTIME-002 | GPU Detection | Automated | Verify Tesla K80 detected inside container |
| TC-RUNTIME-003 | Health Check | Automated | Verify Ollama health check passes |
### Inference Tests (Suite: Inference Tests)
| ID | Name | Type | Description |
|----|------|------|-------------|
| TC-INFERENCE-001 | Model Pull | Automated | Pull gemma3:4b model |
| TC-INFERENCE-002 | Basic Inference | Automated | Run simple prompt and verify response |
| TC-INFERENCE-003 | API Endpoint Test | Automated | Test /api/generate endpoint |
| TC-INFERENCE-004 | CUBLAS Fallback Verification | Automated | Verify legacy CUBLAS functions used |
## GitHub Actions Workflow
**File:** `.github/workflows/build-test.yml`
**Triggers:**
- Push to `main` branch
- Pull request to `main` branch
- Manual trigger (workflow_dispatch)
**Runner:** Self-hosted with labels `[self-hosted, k80, cuda11]`
**Jobs:**
1. `build` - Build Docker runtime image
2. `test` - Run inference tests in container
3. `report` - Report results to TestLink
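The `workflow_dispatch` trigger can be exercised from a checkout with the GitHub CLI, assuming `gh` is authenticated against the repository:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Kick off the pipeline manually and show the resulting run.
trigger_ci() {
  gh workflow run build-test.yml --ref main
  gh run list --workflow build-test.yml --limit 1
}
```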
## TestLink Integration
**URL:** http://localhost:8090
**Project:** ollama37
**Test Suites:**
- Build Tests
- Runtime Tests
- Inference Tests
**Test Plan:** Created per release/sprint
**Builds:** Created per CI run (commit SHA)
**Execution Recording:**
- Each test case result recorded via TestLink API
- Pass/Fail status with notes
- Linked to specific build/commit
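Recording a result maps onto TestLink's XML-RPC API (`tl.reportTCResult`). The sketch below assumes `TESTLINK_URL`, `TESTLINK_API_KEY`, and a `TESTPLAN_ID` variable are exported; the struct member names follow the TestLink API, but the plan/build wiring is illustrative:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Record one execution result against the current CI build (commit SHA).
report_result() {
  local tcid=$1 status=$2 notes=$3   # status: p (pass) or f (fail)
  curl -sf "$TESTLINK_URL/lib/api/xmlrpc/v1/xmlrpc.php" \
    -H 'Content-Type: text/xml' \
    --data "<?xml version=\"1.0\"?>
<methodCall><methodName>tl.reportTCResult</methodName><params><param><value><struct>
<member><name>devKey</name><value><string>$TESTLINK_API_KEY</string></value></member>
<member><name>testcaseexternalid</name><value><string>$tcid</string></value></member>
<member><name>testplanid</name><value><int>$TESTPLAN_ID</int></value></member>
<member><name>buildname</name><value><string>$GITHUB_SHA</string></value></member>
<member><name>status</name><value><string>$status</string></value></member>
<member><name>notes</name><value><string>$notes</string></value></member>
</struct></value></param></params></methodCall>"
}
```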
## Makefile Targets for CI
| Target | Description | When to Use |
|--------|-------------|-------------|
| `make build-builder` | Build base image | First time setup |
| `make build-runtime` | Build from GitHub | Normal CI builds |
| `make build-runtime-no-cache` | Fresh GitHub clone | When cache is stale |
| `make build-runtime-local` | Build from local | Local testing |
## Environment Variables
### Build Environment
| Variable | Value | Description |
|----------|-------|-------------|
| `BUILDER_IMAGE` | ollama37-builder | Builder image name |
| `RUNTIME_IMAGE` | ollama37 | Runtime image name |
### Runtime Environment
| Variable | Value | Description |
|----------|-------|-------------|
| `OLLAMA_HOST` | 0.0.0.0:11434 | Server listen address |
| `NVIDIA_VISIBLE_DEVICES` | all | GPU visibility |
| `OLLAMA_DEBUG` | 1 (optional) | Enable debug logging |
| `GGML_CUDA_DEBUG` | 1 (optional) | Enable CUDA debug |
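Outside of compose, the same runtime environment can be reproduced with a plain `docker run`. Port and image name are as used elsewhere in this document; the two debug variables are optional:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Start the runtime image with GPU access and debug logging enabled.
run_debug_container() {
  docker run -d --gpus all \
    -p 11434:11434 \
    -e OLLAMA_HOST=0.0.0.0:11434 \
    -e NVIDIA_VISIBLE_DEVICES=all \
    -e OLLAMA_DEBUG=1 \
    -e GGML_CUDA_DEBUG=1 \
    ollama37:latest
}
```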
### TestLink Environment
| Variable | Value | Description |
|----------|-------|-------------|
| `TESTLINK_URL` | http://localhost:8090 | TestLink server URL |
| `TESTLINK_API_KEY` | (configured) | API key for automation |
## Prerequisites
### One-Time Setup on CI/CD Node
1. **Install GitHub Actions Runner:**
```bash
mkdir -p ~/actions-runner && cd ~/actions-runner
curl -o actions-runner-linux-x64-2.321.0.tar.gz -L \
https://github.com/actions/runner/releases/download/v2.321.0/actions-runner-linux-x64-2.321.0.tar.gz
tar xzf ./actions-runner-linux-x64-2.321.0.tar.gz
./config.sh --url https://github.com/dogkeeper886/ollama37 --token YOUR_TOKEN --labels k80,cuda11
sudo ./svc.sh install && sudo ./svc.sh start
```
2. **Build Builder Image (one-time, ~90 min):**
```bash
cd /home/jack/src/ollama37/docker
make build-builder
```
3. **Verify GPU Access in Docker:**
```bash
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
```
4. **Start TestLink:**
```bash
cd /home/jack/src/testlink-code
docker compose up -d
```
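The one-time setup can be sanity-checked with a script like the following. The runner unit name pattern and the TestLink port match the steps above; treat it as a sketch:

```bash
#!/usr/bin/env bash
set -euo pipefail

# Warn (without aborting) for each missing prerequisite.
check_prereqs() {
  systemctl list-units 'actions.runner.*' --no-legend 2>/dev/null | grep -q running \
    || echo "WARN: GitHub Actions runner service not running"
  docker image inspect ollama37-builder:latest >/dev/null 2>&1 \
    || echo "WARN: builder image missing (run: cd docker && make build-builder)"
  docker run --rm --gpus all ubuntu nvidia-smi >/dev/null 2>&1 \
    || echo "WARN: GPU not reachable from Docker"
  curl -sf http://localhost:8090 >/dev/null \
    || echo "WARN: TestLink not reachable on :8090"
}
```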
## Monitoring & Logs
### View CI/CD Logs
```bash
# GitHub Actions Runner logs
journalctl -u actions.runner.* -f
# Docker build logs
docker compose logs -f
# TestLink logs
cd /home/jack/src/testlink-code && docker compose logs -f
```
### Test Results
- **TestLink Dashboard:** http://localhost:8090
- **GitHub Actions:** https://github.com/dogkeeper886/ollama37/actions
## Troubleshooting
### Builder Image Missing
```bash
cd docker && make build-builder
```
### GPU Not Detected in Container
```bash
# Check UVM device files on host
ls -l /dev/nvidia-uvm*
# Create if missing
nvidia-modprobe -u -c=0
# Restart container
docker compose restart
```
### Build Cache Stale
```bash
cd docker && make build-runtime-no-cache
```
### TestLink Connection Failed
```bash
# Check TestLink is running
curl http://localhost:8090
# Restart if needed
cd /home/jack/src/testlink-code && docker compose restart
```