Add LLM Judge container infrastructure

- Add cicd/docker-compose.judge.yml for stable reference Ollama
- Runs on port 11435 (separate from test subject on 11434)
- Uses dogkeeper886/ollama37:latest from DockerHub
- Add cicd/README.md documenting CI infrastructure

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Author: Shang Chieh Tseng
Date: 2025-12-15 22:25:13 +08:00
Parent: 0e66cc6f93
Commit: 6b84acd7d7
2 changed files with 87 additions and 0 deletions

cicd/README.md (new file, 54 lines)

@@ -0,0 +1,54 @@
# CI/CD Infrastructure

This folder contains CI/CD infrastructure components separate from the main build system.

## Components

### LLM Judge (`docker-compose.judge.yml`)

A stable reference Ollama instance for evaluating test results.

**Purpose:**
- Acts as a secondary judge alongside simple exit-code checking
- Analyzes test logs semantically to detect hidden issues
- Uses a stable DockerHub image (not the build being tested)

**Architecture:**
```
Port 11434 → ollama37 (test subject, local build)
Port 11435 → ollama37-judge (stable reference, DockerHub)
```
**Usage:**
```bash
# Start the judge container
cd cicd
docker compose -f docker-compose.judge.yml up -d

# Check status
docker compose -f docker-compose.judge.yml ps

# Pull the model used for judging (first run only)
curl -X POST http://localhost:11435/api/pull -d '{"name": "gemma3:1b"}'

# Stop the judge
docker compose -f docker-compose.judge.yml down
```
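As a minimal sketch (assuming the judge is up and `gemma3:1b` has been pulled as above), a judge query is an ordinary Ollama generate request against port 11435; the prompt wording here is illustrative, not part of this repository:

```bash
# Illustrative judge query; the prompt text is an assumption, not the
# actual prompt used by the test runner.
JUDGE_URL="http://localhost:11435/api/generate"
PAYLOAD='{"model": "gemma3:1b", "prompt": "Review this test log for hidden failures and answer PASS or FAIL.", "stream": false}'
echo "$PAYLOAD"
# To send it: curl -s -X POST "$JUDGE_URL" -d "$PAYLOAD"
```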
## Folder Structure

```
cicd/
├── docker-compose.judge.yml   # LLM Judge container
├── README.md                  # This file
└── scripts/                   # (future) CI helper scripts
```

## Related Components

| Component | Location | Purpose |
|-----------|----------|---------|
| Test subject | `docker/docker-compose.yml` | Ollama build being tested |
| Test runner | `tests/src/` | Executes tests, uses judge |
| Test cases | `tests/testcases/` | YAML test definitions |
| Workflows | `.github/workflows/` | CI pipeline definitions |
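The division of duties above can be sketched as follows; `check_result` and the verdict strings are hypothetical names for illustration, not the actual test-runner interface:

```bash
# Hypothetical combination of exit-code check and judge verdict:
# a test only passes if both the process exited cleanly AND the judge
# found nothing suspicious in the logs.
check_result() {
  exit_code=$1   # exit code of the test subject run
  verdict=$2     # verdict string extracted from the judge model's reply
  if [ "$exit_code" -eq 0 ] && [ "$verdict" = "PASS" ]; then
    echo "PASS"
  else
    echo "FAIL"
  fi
}

check_result 0 PASS   # prints PASS
check_result 0 FAIL   # prints FAIL (judge spotted a hidden issue)
check_result 1 PASS   # prints FAIL (non-zero exit code)
```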

cicd/docker-compose.judge.yml (new file, 33 lines)

@@ -0,0 +1,33 @@
services:
  # LLM Judge - stable reference version for evaluating test results
  # Runs on port 11435 to avoid conflict with test subject (11434)
  ollama-judge:
    image: dogkeeper886/ollama37:latest
    container_name: ollama37-judge
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    ports:
      - "11435:11434"
    volumes:
      - ollama-judge-data:/root/.ollama
    environment:
      - OLLAMA_HOST=0.0.0.0:11434
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "/usr/local/bin/ollama", "list"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 5s

volumes:
  ollama-judge-data:
    name: ollama-judge-data