Implement Go-based test runner framework for Tesla K80 testing

Add comprehensive test orchestration framework:

Test Runner (cmd/test-runner/):
- config.go: YAML configuration loading and validation
- server.go: Ollama server lifecycle management (start/stop/health checks)
- monitor.go: Real-time log monitoring with pattern matching
- test.go: Model testing via Ollama API (pull, chat, validation)
- validate.go: Test result validation (GPU usage, response quality, log analysis)
- report.go: Structured reporting (JSON and Markdown formats)
- main.go: CLI interface with run/validate/list commands

Test Configurations (test/config/):
- models.yaml: Full test suite with quick/full/stress profiles
- quick.yaml: Fast smoke test with gemma2:2b

Updated Workflow:
- tesla-k80-tests.yml: Use test-runner instead of shell scripts
- Run quick tests first, then full tests if passing
- Generate structured JSON reports for pass/fail checking
- Upload test results as artifacts

Features:
- Multi-model testing with configurable profiles
- API-based testing (not CLI commands)
- Real-time log monitoring for GPU events and errors
- Automatic validation of GPU loading and response quality
- Structured JSON and Markdown reports
- Graceful server lifecycle management
- Interrupt handling (Ctrl+C cleanup)

Addresses limitations of shell-based testing by providing:
- Better error handling and reporting
- Programmatic test orchestration
- Reusable test framework
- Clear pass/fail criteria
- Detailed test metrics and timing
This commit is contained in:
Shang Chieh Tseng
2025-10-30 11:04:48 +08:00
parent aaaf334e7f
commit d59284d30a
10 changed files with 1631 additions and 113 deletions

View File

@@ -20,132 +20,74 @@ jobs:
exit 1
fi
ls -lh ollama
file ollama
./ollama --version
- name: Run Go unit tests
- name: Build test-runner
run: |
go test -v -race -timeout 10m ./...
continue-on-error: false
cd cmd/test-runner
go mod init github.com/ollama/ollama/cmd/test-runner || true
go mod tidy
go build -o ../../test-runner .
cd ../..
ls -lh test-runner
- name: Start ollama server (background)
- name: Validate test configuration
run: |
./ollama serve > ollama.log 2>&1 &
echo $! > ollama.pid
echo "Ollama server started with PID $(cat ollama.pid)"
./test-runner validate --config test/config/quick.yaml
- name: Wait for server to be ready
- name: Run quick tests
run: |
for i in {1..30}; do
if curl -s http://localhost:11434/api/tags > /dev/null 2>&1; then
echo "Server is ready!"
exit 0
fi
echo "Waiting for server... attempt $i/30"
sleep 2
done
echo "Server failed to start"
cat ollama.log
exit 1
./test-runner run --profile quick --config test/config/quick.yaml --output test-report-quick --verbose
timeout-minutes: 10
- name: Run integration tests
- name: Check quick test results
run: |
go test -v -timeout 20m ./integration/...
continue-on-error: false
- name: Clear server logs for model test
run: |
# Truncate log file to start fresh for model loading test
> ollama.log
- name: Pull gemma2:2b model
run: |
echo "Pulling gemma2:2b model..."
./ollama pull gemma2:2b
echo "Model pull completed"
timeout-minutes: 15
- name: Run inference with gemma2:2b
run: |
echo "Running inference test..."
./ollama run gemma2:2b "Hello, this is a test. Please respond with a short greeting." --verbose
echo "Inference completed"
timeout-minutes: 5
- name: Wait for logs to flush
run: sleep 3
- name: Analyze server logs with Claude
run: |
echo "Analyzing ollama server logs for proper model loading..."
# Create analysis prompt
cat > log_analysis_prompt.txt << 'EOF'
Analyze the following Ollama server logs from a Tesla K80 (CUDA Compute Capability 3.7) system.
Verify that:
1. The model loaded successfully without errors
2. CUDA/GPU acceleration was properly detected and initialized
3. The model is using the Tesla K80 GPU (not CPU fallback)
4. There are no CUDA compatibility warnings or errors
5. Memory allocation was successful
6. Inference completed without errors
Respond with:
- "PASS" if all checks pass and model loaded properly with GPU acceleration
- "FAIL: <reason>" if there are critical issues
- "WARN: <reason>" if there are warnings but model works
Be specific about what succeeded or failed. Look for CUDA errors, memory issues, or CPU fallback warnings.
Server logs:
---
EOF
cat ollama.log >> log_analysis_prompt.txt
# Run Claude in headless mode to analyze
claude -p log_analysis_prompt.txt > log_analysis_result.txt
echo "=== Claude Analysis Result ==="
cat log_analysis_result.txt
# Check if analysis passed
if grep -q "^PASS" log_analysis_result.txt; then
echo "✓ Log analysis PASSED - Model loaded correctly on Tesla K80"
exit 0
elif grep -q "^WARN" log_analysis_result.txt; then
echo "⚠ Log analysis has WARNINGS - Review needed"
cat log_analysis_result.txt
exit 0 # Don't fail on warnings, but they're visible
else
echo "✗ Log analysis FAILED - Model loading issues detected"
cat log_analysis_result.txt
if ! jq -e '.summary.failed == 0' test-report-quick.json; then
echo "Quick tests failed!"
jq '.results[] | select(.status == "FAILED")' test-report-quick.json
exit 1
fi
echo "Quick tests passed!"
- name: Upload quick test results
if: always()
uses: actions/upload-artifact@v4
with:
name: quick-test-results
path: |
test-report-quick.json
test-report-quick.md
ollama.log
retention-days: 7
- name: Run full tests (if quick tests passed)
if: success()
run: |
./test-runner run --profile full --config test/config/models.yaml --output test-report-full --verbose
timeout-minutes: 45
- name: Check full test results
if: success()
run: |
if ! jq -e '.summary.failed == 0' test-report-full.json; then
echo "Full tests failed!"
jq '.results[] | select(.status == "FAILED")' test-report-full.json
exit 1
fi
echo "All tests passed!"
- name: Upload full test results
if: always()
uses: actions/upload-artifact@v4
with:
name: full-test-results
path: |
test-report-full.json
test-report-full.md
ollama.log
retention-days: 14
- name: Check GPU memory usage
if: always()
run: |
echo "=== GPU Memory Status ==="
nvidia-smi --query-gpu=memory.used,memory.total --format=csv
- name: Stop ollama server
if: always()
run: |
if [ -f ollama.pid ]; then
kill $(cat ollama.pid) || true
rm ollama.pid
fi
pkill -f "ollama serve" || true
- name: Upload logs and analysis
if: always()
uses: actions/upload-artifact@v4
with:
name: ollama-test-logs-and-analysis
path: |
ollama.log
log_analysis_prompt.txt
log_analysis_result.txt
retention-days: 7