Mirror of https://github.com/dogkeeper886/ollama37.git (synced 2025-12-10 07:46:59 +00:00)
Implement Go-based test runner framework for Tesla K80 testing
Add a comprehensive test orchestration framework.

Test Runner (cmd/test-runner/):
- config.go: YAML configuration loading and validation
- server.go: Ollama server lifecycle management (start/stop/health checks)
- monitor.go: Real-time log monitoring with pattern matching
- test.go: Model testing via the Ollama API (pull, chat, validation)
- validate.go: Test result validation (GPU usage, response quality, log analysis)
- report.go: Structured reporting (JSON and Markdown formats)
- main.go: CLI interface with run/validate/list commands

Test Configurations (test/config/):
- models.yaml: Full test suite with quick/full/stress profiles
- quick.yaml: Fast smoke test with gemma2:2b

Updated Workflow:
- tesla-k80-tests.yml: Use test-runner instead of shell scripts
- Run quick tests first, then full tests if they pass
- Generate structured JSON reports for pass/fail checking
- Upload test results as artifacts

Features:
- Multi-model testing with configurable profiles
- API-based testing (not CLI commands)
- Real-time log monitoring for GPU events and errors
- Automatic validation of GPU loading and response quality
- Structured JSON and Markdown reports
- Graceful server lifecycle management
- Interrupt handling (Ctrl+C cleanup)

Addresses limitations of shell-based testing by providing:
- Better error handling and reporting
- Programmatic test orchestration
- A reusable test framework
- Clear pass/fail criteria
- Detailed test metrics and timing
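As a rough illustration of the config.go piece described above, here is a minimal Go sketch of how loading a test/config/*.yaml file could look. The struct fields, file layout, and the use of gopkg.in/yaml.v3 are assumptions for illustration only; just the profile names (quick/full/stress), the gemma2:2b smoke test, and the config paths come from this commit.

// Hypothetical sketch of test-runner config loading. Field names and YAML
// layout are assumptions; only the profile names and gemma2:2b are taken
// from the commit message.
package main

import (
	"fmt"
	"os"

	"gopkg.in/yaml.v3"
)

// TestConfig mirrors a possible test/config/*.yaml layout.
type TestConfig struct {
	Profiles map[string]Profile `yaml:"profiles"`
}

// Profile groups the models and limits for one test run
// (e.g. "quick", "full", or "stress").
type Profile struct {
	Models     []ModelTest `yaml:"models"`
	TimeoutMin int         `yaml:"timeout_minutes"`
}

// ModelTest describes a single model to pull and chat with via the Ollama API.
type ModelTest struct {
	Name   string `yaml:"name"`   // e.g. "gemma2:2b"
	Prompt string `yaml:"prompt"` // prompt sent through the chat endpoint
}

// LoadConfig reads and validates a YAML test configuration.
func LoadConfig(path string) (*TestConfig, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read config: %w", err)
	}
	var cfg TestConfig
	if err := yaml.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("parse config: %w", err)
	}
	if len(cfg.Profiles) == 0 {
		return nil, fmt.Errorf("config %s defines no profiles", path)
	}
	return &cfg, nil
}

func main() {
	cfg, err := LoadConfig("test/config/quick.yaml")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("loaded %d profile(s)\n", len(cfg.Profiles))
}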
.github/workflows/tesla-k80-tests.yml (vendored): 168 changed lines
@@ -20,132 +20,74 @@ jobs:
           exit 1
         fi
         ls -lh ollama
         file ollama
         ./ollama --version

-      - name: Run Go unit tests
-        run: |
-          go test -v -race -timeout 10m ./...
-        continue-on-error: false
-
-      - name: Start ollama server (background)
-        run: |
-          ./ollama serve > ollama.log 2>&1 &
-          echo $! > ollama.pid
-          echo "Ollama server started with PID $(cat ollama.pid)"
-
-      - name: Wait for server to be ready
-        run: |
-          for i in {1..30}; do
-            if curl -s http://localhost:11434/api/tags > /dev/null 2>&1; then
-              echo "Server is ready!"
-              exit 0
-            fi
-            echo "Waiting for server... attempt $i/30"
-            sleep 2
-          done
-          echo "Server failed to start"
-          cat ollama.log
-          exit 1
-
-      - name: Run integration tests
-        run: |
-          go test -v -timeout 20m ./integration/...
-        continue-on-error: false
-
-      - name: Clear server logs for model test
-        run: |
-          # Truncate log file to start fresh for model loading test
-          > ollama.log
-
-      - name: Pull gemma2:2b model
-        run: |
-          echo "Pulling gemma2:2b model..."
-          ./ollama pull gemma2:2b
-          echo "Model pull completed"
-        timeout-minutes: 15
-
-      - name: Run inference with gemma2:2b
-        run: |
-          echo "Running inference test..."
-          ./ollama run gemma2:2b "Hello, this is a test. Please respond with a short greeting." --verbose
-          echo "Inference completed"
-        timeout-minutes: 5
-
-      - name: Wait for logs to flush
-        run: sleep 3
-
-      - name: Analyze server logs with Claude
-        run: |
-          echo "Analyzing ollama server logs for proper model loading..."
-
-          # Create analysis prompt
-          cat > log_analysis_prompt.txt << 'EOF'
-          Analyze the following Ollama server logs from a Tesla K80 (CUDA Compute Capability 3.7) system.
-
-          Verify that:
-          1. The model loaded successfully without errors
-          2. CUDA/GPU acceleration was properly detected and initialized
-          3. The model is using the Tesla K80 GPU (not CPU fallback)
-          4. There are no CUDA compatibility warnings or errors
-          5. Memory allocation was successful
-          6. Inference completed without errors
-
-          Respond with:
-          - "PASS" if all checks pass and model loaded properly with GPU acceleration
-          - "FAIL: <reason>" if there are critical issues
-          - "WARN: <reason>" if there are warnings but model works
-
-          Be specific about what succeeded or failed. Look for CUDA errors, memory issues, or CPU fallback warnings.
-
-          Server logs:
-          ---
-          EOF
-
-          cat ollama.log >> log_analysis_prompt.txt
-
-          # Run Claude in headless mode to analyze
-          claude -p log_analysis_prompt.txt > log_analysis_result.txt
-
-          echo "=== Claude Analysis Result ==="
-          cat log_analysis_result.txt
-
-          # Check if analysis passed
-          if grep -q "^PASS" log_analysis_result.txt; then
-            echo "✓ Log analysis PASSED - Model loaded correctly on Tesla K80"
-            exit 0
-          elif grep -q "^WARN" log_analysis_result.txt; then
-            echo "⚠ Log analysis has WARNINGS - Review needed"
-            cat log_analysis_result.txt
-            exit 0 # Don't fail on warnings, but they're visible
-          else
-            echo "✗ Log analysis FAILED - Model loading issues detected"
-            cat log_analysis_result.txt
-            exit 1
-          fi
-
-      - name: Check GPU memory usage
-        if: always()
-        run: |
-          echo "=== GPU Memory Status ==="
-          nvidia-smi --query-gpu=memory.used,memory.total --format=csv
-
-      - name: Stop ollama server
-        if: always()
-        run: |
-          if [ -f ollama.pid ]; then
-            kill $(cat ollama.pid) || true
-            rm ollama.pid
-          fi
-          pkill -f "ollama serve" || true
-
-      - name: Upload logs and analysis
-        if: always()
-        uses: actions/upload-artifact@v4
-        with:
-          name: ollama-test-logs-and-analysis
-          path: |
-            ollama.log
-            log_analysis_prompt.txt
-            log_analysis_result.txt
-          retention-days: 7
+      - name: Build test-runner
+        run: |
+          cd cmd/test-runner
+          go mod init github.com/ollama/ollama/cmd/test-runner || true
+          go mod tidy
+          go build -o ../../test-runner .
+          cd ../..
+          ls -lh test-runner
+
+      - name: Validate test configuration
+        run: |
+          ./test-runner validate --config test/config/quick.yaml
+
+      - name: Run quick tests
+        run: |
+          ./test-runner run --profile quick --config test/config/quick.yaml --output test-report-quick --verbose
+        timeout-minutes: 10
+
+      - name: Check quick test results
+        run: |
+          if ! jq -e '.summary.failed == 0' test-report-quick.json; then
+            echo "Quick tests failed!"
+            jq '.results[] | select(.status == "FAILED")' test-report-quick.json
+            exit 1
+          fi
+          echo "Quick tests passed!"
+
+      - name: Upload quick test results
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: quick-test-results
+          path: |
+            test-report-quick.json
+            test-report-quick.md
+            ollama.log
+          retention-days: 7
+
+      - name: Run full tests (if quick tests passed)
+        if: success()
+        run: |
+          ./test-runner run --profile full --config test/config/models.yaml --output test-report-full --verbose
+        timeout-minutes: 45
+
+      - name: Check full test results
+        if: success()
+        run: |
+          if ! jq -e '.summary.failed == 0' test-report-full.json; then
+            echo "Full tests failed!"
+            jq '.results[] | select(.status == "FAILED")' test-report-full.json
+            exit 1
+          fi
+          echo "All tests passed!"
+
+      - name: Upload full test results
+        if: always()
+        uses: actions/upload-artifact@v4
+        with:
+          name: full-test-results
+          path: |
+            test-report-full.json
+            test-report-full.md
+            ollama.log
+          retention-days: 14
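The result-check steps above gate on the JSON report emitted by the test-runner: `jq -e '.summary.failed == 0'` and `jq '.results[] | select(.status == "FAILED")'`. Those two paths imply a report with a summary.failed count and a results array carrying per-test status fields. Below is a minimal Go sketch of structs that would produce JSON of that shape; every field other than summary.failed and results[].status is an assumption for illustration, not the actual report.go schema.

// Hypothetical sketch of the report shape implied by the workflow's jq checks.
// Only `.summary.failed` and `.results[].status` come from the workflow;
// all other fields are illustrative assumptions.
package main

import (
	"encoding/json"
	"os"
	"time"
)

// Summary carries the aggregate counts the workflow gates on.
type Summary struct {
	Total  int `json:"total"`
	Passed int `json:"passed"`
	Failed int `json:"failed"` // checked by: jq -e '.summary.failed == 0'
}

// Result is one model test; Status is matched against "FAILED" by jq.
type Result struct {
	Model    string        `json:"model"`
	Status   string        `json:"status"` // e.g. "PASSED" or "FAILED"
	Duration time.Duration `json:"duration_ns"`
	Error    string        `json:"error,omitempty"`
}

// Report is what test-runner would write to test-report-<profile>.json.
type Report struct {
	Summary Summary  `json:"summary"`
	Results []Result `json:"results"`
}

func main() {
	report := Report{
		Summary: Summary{Total: 1, Passed: 1, Failed: 0},
		Results: []Result{{Model: "gemma2:2b", Status: "PASSED", Duration: 42 * time.Second}},
	}
	enc := json.NewEncoder(os.Stdout)
	enc.SetIndent("", "  ")
	_ = enc.Encode(report) // mirrors the JSON the jq checks would read
}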