mirror of
https://github.com/dogkeeper886/ollama37.git
synced 2025-12-22 05:37:01 +00:00
Add LogCollector for precise test log boundaries
Problem: Tests used `docker compose logs --since=5m` which caused:
- Log overlap between tests
- Logs from previous tests included
- Missing logs if test exceeded 5 minutes
Solution:
- New LogCollector class runs `docker compose logs --follow`
- Marks test start/end boundaries
- Writes test-specific logs to /tmp/test-{testId}-logs.txt
- Test steps access via TEST_ID environment variable
Changes:
- tests/src/log-collector.ts: New LogCollector class
- tests/src/executor.ts: Integrate LogCollector, set TEST_ID env
- tests/src/cli.ts: Start/stop LogCollector for runtime/inference
- All test cases: Use log collector with fallback to docker compose
Also updated docs/CICD.md with:
- Test runner CLI documentation
- Judge modes (simple, llm, dual)
- Log collector integration
- Updated test case list (12b, 27b models)
- Model unload strategy
- Troubleshooting guide
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
docs/CICD.md: 533 lines changed
@@ -1,4 +1,4 @@
-# CI/CD Plan for Ollama37
+# CI/CD Pipeline for Ollama37
 
 This document describes the CI/CD pipeline for building and testing Ollama37 with Tesla K80 (CUDA compute capability 3.7) support.
 
@@ -24,185 +24,269 @@ This document describes the CI/CD pipeline for building and testing Ollama37 wit
 │   - Rocky Linux 9.7                                                     │
 │   - Docker 29.1.3 + Docker Compose 5.0.0                                │
 │   - NVIDIA Container Toolkit                                            │
-│   - GitHub Actions Runner (self-hosted, labels: k80, cuda11)            │
+│   - GitHub Actions Runner (self-hosted)                                 │
-│                                                                         │
-│   Services:                                                             │
-│   - TestLink (http://localhost:8090) - Test management                  │
-│   - TestLink MCP - Claude Code integration                              │
-│                                                                         │
-└─────────────────────────────────────────────────────────────────────────┘
-                                    │
-                                    ▼
-┌─────────────────────────────────────────────────────────────────────────┐
-│                              SERVE NODE                                 │
-│                                                                         │
-│   Services:                                                             │
-│   - Ollama (production)                                                 │
-│   - Dify (LLM application platform)                                     │
 │                                                                         │
 └─────────────────────────────────────────────────────────────────────────┘
 ```
 
-## Build Strategy: Docker-Based
+## Test Framework
 
-We use the two-stage Docker build system located in `/docker/`:
+### Test Runner CLI
 
-### Stage 1: Builder Image (Cached)
+The test runner is located in `tests/src/` and provides a CLI tool:
 
-**Image:** `ollama37-builder:latest` (~15GB)
+```bash
+cd tests
+npm run dev -- run [options]
+```
 
-**Contents:**
+**Commands:**
+- `run` - Execute test cases
+- `list` - List all available test cases
 
+**Options:**
+
+| Option | Default | Description |
+|--------|---------|-------------|
+| `-s, --suite <suite>` | all | Filter by suite (build, runtime, inference) |
+| `-i, --id <id>` | - | Run specific test by ID |
+| `-w, --workers <n>` | 1 | Parallel worker count |
+| `-d, --dry-run` | false | Preview without executing |
+| `-o, --output <format>` | console | Output format: console, json, junit |
+| `--no-llm` | false | Skip LLM, use simple exit code check only |
+| `--judge-model <model>` | gemma3:12b | Model for LLM judging |
+| `--dual-judge` | true | Run both simple and LLM judge |
+| `--ollama-url <url>` | localhost:11434 | Test subject server |
+| `--judge-url <url>` | localhost:11435 | Separate judge instance |
+
+### Judge Modes
+
+The test framework supports three judge modes:
+
+| Mode | Flag | Description |
+|------|------|-------------|
+| **Simple** | `--no-llm` | Exit code checking only (exit 0 = pass) |
+| **LLM** | `--judge-model` | Semantic analysis of test logs using LLM |
+| **Dual** | `--dual-judge` | Both must pass (default) |
+
+**LLM Judge:**
+- Analyzes test execution logs semantically
+- Detects hidden issues (e.g., CUDA errors with exit 0)
+- Uses configurable model (default: gemma3:12b)
+- Batches tests for efficient judging
+
+**Simple Judge:**
+- Fast, deterministic
+- Checks exit codes only
+- Fallback when LLM unavailable
+
+### Log Collector
+
+The test framework includes a log collector that solves log overlap issues:
+
+**Problem:** `docker compose logs --since=5m` can include logs from previous tests or miss logs if a test exceeds 5 minutes.
+
+**Solution:** LogCollector class that:
+1. Runs `docker compose logs --follow` as background process
+2. Marks test start/end boundaries
+3. Writes test-specific logs to `/tmp/test-{testId}-logs.txt`
+4. Provides precise logs for each test
+
+Test steps access logs via:
+```bash
+LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+```
+
+## GitHub Workflows
+
+Located in `.github/workflows/`:
+
+| Workflow | Purpose |
+|----------|---------|
+| `build.yml` | Docker image build verification |
+| `runtime.yml` | Container startup and GPU detection |
+| `inference.yml` | Model inference tests (4b, 12b, 27b) |
+| `full-pipeline.yml` | Orchestrates all stages sequentially |
+
+### Workflow Inputs
+
+| Parameter | Default | Options | Description |
+|-----------|---------|---------|-------------|
+| `judge_mode` | dual | simple, llm, dual | Judge strategy |
+| `judge_model` | gemma3:12b | Any model | LLM for evaluation |
+| `use_existing_container` | false | true, false | Reuse running container |
+| `keep_container` | false | true, false | Leave container running |
+
+### Example: Run Inference Tests
+
+```bash
+# Manual trigger via GitHub Actions UI
+# Or via gh CLI:
+gh workflow run inference.yml \
+  -f judge_mode=dual \
+  -f judge_model=gemma3:12b
+```
+
+## Test Suites
+
+### Build Suite (3 tests)
+
+| ID | Name | Timeout | Description |
+|----|------|---------|-------------|
+| TC-BUILD-001 | Builder Image Verification | 2m | Verify builder image exists |
+| TC-BUILD-002 | Runtime Image Build | 30m | Build runtime image |
+| TC-BUILD-003 | Image Size Validation | 30s | Check image sizes |
+
+### Runtime Suite (3 tests)
+
+| ID | Name | Timeout | Description |
+|----|------|---------|-------------|
+| TC-RUNTIME-001 | Container Startup | 2m | Start container with GPU |
+| TC-RUNTIME-002 | GPU Detection | 2m | Verify K80 detected |
+| TC-RUNTIME-003 | Health Check | 3m | API health verification |
+
+### Inference Suite (5 tests)
+
+| ID | Name | Model | Timeout | Description |
+|----|------|-------|---------|-------------|
+| TC-INFERENCE-001 | Model Pull | gemma3:4b | 10m | Pull and warmup 4b model |
+| TC-INFERENCE-002 | Basic Inference | gemma3:4b | 3m | Simple prompt test |
+| TC-INFERENCE-003 | API Endpoint Test | gemma3:4b | 2m | REST API verification |
+| TC-INFERENCE-004 | Medium Model | gemma3:12b | 10m | 12b inference (single GPU) |
+| TC-INFERENCE-005 | Large Model Dual-GPU | gemma3:27b | 15m | 27b inference (dual GPU) |
+
+### Model Unload Strategy
+
+Each model size test unloads its model after completion:
+
+```
+4b tests (001-003) → unload 4b
+12b test (004)     → unload 12b
+27b test (005)     → unload 27b
+```
+
+Workflow-level cleanup (`if: always()`) provides a safety fallback.
+
+## Test Case Structure
+
+Test cases are YAML files in `tests/testcases/{suite}/`:
+
+```yaml
+id: TC-INFERENCE-002
+name: Basic Inference
+suite: inference
+priority: 2
+timeout: 180000
+
+dependencies:
+  - TC-INFERENCE-001
+
+steps:
+  - name: Run simple math question
+    command: docker exec ollama37 ollama run gemma3:4b "What is 2+2?"
+    timeout: 120000
+
+  - name: Check for errors in logs
+    command: |
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs --since=5m 2>&1)
+      fi
+      # Check for CUDA errors...
+
+criteria: |
+  Expected:
+  - Model responds with "4" or equivalent
+  - NO CUBLAS_STATUS_ errors
+  - NO CUDA errors
+```
+
+## Build System
+
+### Docker Images
+
+**Builder Image:** `ollama37-builder:latest` (~15GB)
 - Rocky Linux 8
 - CUDA 11.4 toolkit
-- GCC 10 (built from source)
+- GCC 10, CMake 4.0, Go 1.25.3
-- CMake 4.0 (built from source)
+- Build time: ~90 minutes (cached)
-- Go 1.25.3
 
-**Build time:** ~90 minutes (first time only, then cached)
+**Runtime Image:** `ollama37:latest` (~18GB)
+- Built from GitHub source
+- Build time: ~10 minutes
 
+### Build Commands
 
-**Build command:**
 ```bash
-cd docker && make build-builder
+cd docker
+
+# Build base image (first time only)
+make build-builder
+
+# Build runtime from GitHub
+make build-runtime
+
+# Build without cache
+make build-runtime-no-cache
+
+# Build from local source
+make build-runtime-local
 ```
 
-### Stage 2: Runtime Image (Per Build)
+## Running Tests Locally
 
-**Image:** `ollama37:latest` (~18GB)
+### Prerequisites
 
-**Process:**
+1. Docker with NVIDIA runtime
-1. Clone source from GitHub
+2. Node.js 20+
-2. Configure with CMake ("CUDA 11" preset)
+3. Tesla K80 GPU (or compatible)
-3. Build C/C++/CUDA libraries
-4. Build Go binary
-5. Package runtime environment
 
-**Build time:** ~10 minutes
+### Quick Start
 
-**Build command:**
 ```bash
-cd docker && make build-runtime
+# Start the container
+cd docker && docker compose up -d
+
+# Install test runner
+cd tests && npm ci
+
+# Run all tests with dual judge
+npm run dev -- run --dual-judge
+
+# Run specific suite
+npm run dev -- run --suite inference
+
+# Run single test
+npm run dev -- run --id TC-INFERENCE-002
+
+# Simple mode (no LLM)
+npm run dev -- run --no-llm
+
+# JSON output
+npm run dev -- run -o json > results.json
 ```
 
-## Pipeline Stages
+### Test Output
 
-### Stage 1: Docker Build
+Results are saved to `/tmp/`:
+- `/tmp/build-results.json`
+- `/tmp/runtime-results.json`
+- `/tmp/inference-results.json`
 
-**Trigger:** Push to `main` branch
+JSON structure:
+```json
-**Steps:**
+{
-1. Checkout repository
+  "summary": {
-2. Ensure builder image exists (build if not)
+    "total": 5,
-3. Build runtime image: `make build-runtime`
+    "passed": 5,
-4. Verify image created successfully
+    "failed": 0,
+    "timestamp": "2025-12-17T...",
-**Test Cases:**
+    "simple": { "passed": 5, "failed": 0 },
-- TC-BUILD-001: Builder Image Verification
+    "llm": { "passed": 5, "failed": 0 }
-- TC-BUILD-002: Runtime Image Build
+  },
-- TC-BUILD-003: Image Size Validation
+  "results": [...]
+}
-### Stage 2: Container Startup
+```
 
-**Steps:**
-1. Start container with GPU: `docker compose up -d`
-2. Wait for health check to pass
-3. Verify Ollama server is responding
 
-**Test Cases:**
-- TC-RUNTIME-001: Container Startup
-- TC-RUNTIME-002: GPU Detection
-- TC-RUNTIME-003: Health Check
 
-### Stage 3: Inference Tests
 
-**Steps:**
-1. Pull test model (gemma3:4b)
-2. Run inference tests
-3. Verify CUBLAS legacy fallback
 
-**Test Cases:**
-- TC-INFERENCE-001: Model Pull
-- TC-INFERENCE-002: Basic Inference
-- TC-INFERENCE-003: API Endpoint Test
-- TC-INFERENCE-004: CUBLAS Fallback Verification
 
-### Stage 4: Cleanup & Report
 
-**Steps:**
-1. Stop container: `docker compose down`
-2. Report results to TestLink
-3. Clean up resources
 
-## Test Case Design
 
-### Build Tests (Suite: Build Tests)
 
-| ID | Name | Type | Description |
-|----|------|------|-------------|
-| TC-BUILD-001 | Builder Image Verification | Automated | Verify builder image exists with correct tools |
-| TC-BUILD-002 | Runtime Image Build | Automated | Build runtime image from GitHub source |
-| TC-BUILD-003 | Image Size Validation | Automated | Verify image sizes are within expected range |
 
-### Runtime Tests (Suite: Runtime Tests)
 
-| ID | Name | Type | Description |
-|----|------|------|-------------|
-| TC-RUNTIME-001 | Container Startup | Automated | Start container with GPU passthrough |
-| TC-RUNTIME-002 | GPU Detection | Automated | Verify Tesla K80 detected inside container |
-| TC-RUNTIME-003 | Health Check | Automated | Verify Ollama health check passes |
 
-### Inference Tests (Suite: Inference Tests)
 
-| ID | Name | Type | Description |
-|----|------|------|-------------|
-| TC-INFERENCE-001 | Model Pull | Automated | Pull gemma3:4b model |
-| TC-INFERENCE-002 | Basic Inference | Automated | Run simple prompt and verify response |
-| TC-INFERENCE-003 | API Endpoint Test | Automated | Test /api/generate endpoint |
-| TC-INFERENCE-004 | CUBLAS Fallback Verification | Automated | Verify legacy CUBLAS functions used |
 
-## GitHub Actions Workflow
 
-**File:** `.github/workflows/build-test.yml`
 
-**Triggers:**
-- Push to `main` branch
-- Pull request to `main` branch
-- Manual trigger (workflow_dispatch)
 
-**Runner:** Self-hosted with labels `[self-hosted, k80, cuda11]`
 
-**Jobs:**
-1. `build` - Build Docker runtime image
-2. `test` - Run inference tests in container
-3. `report` - Report results to TestLink
 
-## TestLink Integration
 
-**URL:** http://localhost:8090
 
-**Project:** ollama37
 
-**Test Suites:**
-- Build Tests
-- Runtime Tests
-- Inference Tests
 
-**Test Plan:** Created per release/sprint
 
-**Builds:** Created per CI run (commit SHA)
 
-**Execution Recording:**
-- Each test case result recorded via TestLink API
-- Pass/Fail status with notes
-- Linked to specific build/commit
 
-## Makefile Targets for CI
 
-| Target | Description | When to Use |
-|--------|-------------|-------------|
-| `make build-builder` | Build base image | First time setup |
-| `make build-runtime` | Build from GitHub | Normal CI builds |
-| `make build-runtime-no-cache` | Fresh GitHub clone | When cache is stale |
-| `make build-runtime-local` | Build from local | Local testing |
 
 ## Environment Variables
 
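The dual judge mode documented above requires both verdicts to agree before a test passes. A minimal sketch of that combination rule, assuming boolean outcomes from each judge (the function name is illustrative, not the repo's API):

```typescript
// Dual-judge combination: a test passes only when BOTH the exit-code
// (simple) check and the LLM judge report success.
export function dualVerdict(simpleOk: boolean, llmOk: boolean): "pass" | "fail" {
  return simpleOk && llmOk ? "pass" : "fail";
}
```

This is why the LLM judge can catch "hidden" failures: a step that exits 0 but logs CUDA errors fails the LLM half, and the dual verdict fails with it.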
@@ -222,76 +306,19 @@ cd docker && make build-runtime
 | `OLLAMA_DEBUG` | 1 (optional) | Enable debug logging |
 | `GGML_CUDA_DEBUG` | 1 (optional) | Enable CUDA debug |
 
-### TestLink Environment
+### Test Environment
 
-| Variable | Value | Description |
+| Variable | Description |
-|----------|-------|-------------|
+|----------|-------------|
-| `TESTLINK_URL` | http://localhost:8090 | TestLink server URL |
+| `TEST_ID` | Current test ID (set by executor) |
-| `TESTLINK_API_KEY` | (configured) | API key for automation |
+| `OLLAMA_HOST` | Test subject URL |
 
-## Prerequisites
 
-### One-Time Setup on CI/CD Node
 
-1. **Install GitHub Actions Runner:**
-   ```bash
-   mkdir -p ~/actions-runner && cd ~/actions-runner
-   curl -o actions-runner-linux-x64-2.321.0.tar.gz -L \
-     https://github.com/actions/runner/releases/download/v2.321.0/actions-runner-linux-x64-2.321.0.tar.gz
-   tar xzf ./actions-runner-linux-x64-2.321.0.tar.gz
-   ./config.sh --url https://github.com/dogkeeper886/ollama37 --token YOUR_TOKEN --labels k80,cuda11
-   sudo ./svc.sh install && sudo ./svc.sh start
-   ```
 
-2. **Build Builder Image (one-time, ~90 min):**
-   ```bash
-   cd /home/jack/src/ollama37/docker
-   make build-builder
-   ```
 
-3. **Verify GPU Access in Docker:**
-   ```bash
-   docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
-   ```
 
-4. **Start TestLink:**
-   ```bash
-   cd /home/jack/src/testlink-code
-   docker compose up -d
-   ```
 
-## Monitoring & Logs
 
-### View CI/CD Logs
 
-```bash
-# GitHub Actions Runner logs
-journalctl -u actions.runner.* -f
 
-# Docker build logs
-docker compose logs -f
 
-# TestLink logs
-cd /home/jack/src/testlink-code && docker compose logs -f
-```
 
-### Test Results
 
-- **TestLink Dashboard:** http://localhost:8090
-- **GitHub Actions:** https://github.com/dogkeeper886/ollama37/actions
 
 ## Troubleshooting
 
-### Builder Image Missing
 
-```bash
-cd docker && make build-builder
-```
 
 ### GPU Not Detected in Container
 
 ```bash
-# Check UVM device files on host
+# Check UVM device files
 ls -l /dev/nvidia-uvm*
 
 # Create if missing
@@ -301,18 +328,72 @@ nvidia-modprobe -u -c=0
 docker compose restart
 ```
 
-### Build Cache Stale
+### LLM Judge Timeout
 
 ```bash
+# Use simple mode
+npm run dev -- run --no-llm
+
+# Or try a smaller judge model
+npm run dev -- run --judge-model gemma3:4b
+```
+
+### Log Collector Issues
+
+If a test step can't find logs:
+```bash
+# Check log file exists
+ls -l /tmp/test-*-logs.txt
+
+# Fallback to direct logs
+docker compose logs --since=5m
+```
+
+### Build Failures
+
+```bash
+# Clean build
 cd docker && make build-runtime-no-cache
 
+# Check builder image
+docker images | grep ollama37-builder
 ```
 
-### TestLink Connection Failed
+## Error Patterns
 
-```bash
+The test framework checks for these critical errors:
-# Check TestLink is running
 
-curl http://localhost:8090
+| Pattern | Severity | Description |
+|---------|----------|-------------|
+| `CUBLAS_STATUS_*` | Critical | CUDA/cuBLAS error (K80-specific) |
+| `CUDA error` | Critical | General CUDA failure |
+| `cudaMalloc failed` | Critical | GPU memory allocation failure |
+| `out of memory` | Critical | VRAM exhausted |
+| `level=ERROR` | Warning | Ollama application error |
+| `panic`, `fatal` | Critical | Runtime crash |
+| `id=cpu library=cpu` | Critical | CPU-only fallback (GPU not detected) |
 
+## File Structure
 
-# Restart if needed
+```
-cd /home/jack/src/testlink-code && docker compose restart
+tests/
+├── src/
+│   ├── cli.ts            # CLI entry point
+│   ├── executor.ts       # Test execution engine
+│   ├── judge.ts          # LLM/simple judging
+│   ├── loader.ts         # YAML test case parser
+│   ├── log-collector.ts  # Docker log collector
+│   ├── reporter.ts       # Output formatters
+│   └── types.ts          # Type definitions
+├── testcases/
+│   ├── build/            # Build test cases
+│   ├── runtime/          # Runtime test cases
+│   └── inference/        # Inference test cases
+└── package.json
+
+.github/workflows/
+├── build.yml             # Build verification
+├── runtime.yml           # Container/GPU tests
+├── inference.yml         # Model inference tests
+└── full-pipeline.yml     # Complete pipeline
 ```
tests/src/cli.ts

@@ -9,6 +9,7 @@ import { TestExecutor } from "./executor.js";
 import { LLMJudge } from "./judge.js";
 import { Reporter, TestLinkReporter } from "./reporter.js";
 import { RunnerOptions, Judgment } from "./types.js";
+import { LogCollector } from "./log-collector.js";
 
 const __dirname = path.dirname(fileURLToPath(import.meta.url));
 const defaultTestcasesDir = path.join(__dirname, "..", "testcases");
@@ -68,9 +69,12 @@ program
     log("=".repeat(60));
 
     const loader = new TestLoader(options.testcasesDir);
-    const executor = new TestExecutor(path.join(__dirname, "..", ".."));
+    const projectRoot = path.join(__dirname, "..", "..");
     const judge = new LLMJudge(options.judgeUrl, options.judgeModel);
 
+    // LogCollector will be created if running runtime or inference tests
+    let logCollector: LogCollector | undefined;
 
     // Load test cases
     log("\nLoading test cases...");
     let testCases = await loader.loadAll();
@@ -107,10 +111,38 @@ program
       process.exit(0);
     }
 
+    // Check if we need log collector (runtime or inference tests)
+    const needsLogCollector = testCases.some(
+      (tc) => tc.suite === "runtime" || tc.suite === "inference"
+    );
+
+    if (needsLogCollector) {
+      log("\nStarting log collector...");
+      logCollector = new LogCollector(projectRoot);
+      try {
+        await logCollector.start();
+        log("  Log collector started");
+      } catch (error) {
+        log(`  Warning: Log collector failed to start: ${error}`);
+        log("  Tests will run without precise log collection");
+        logCollector = undefined;
+      }
+    }
+
+    // Create executor with optional log collector
+    const executor = new TestExecutor(projectRoot, logCollector);
+
     // Execute tests (progress goes to stderr via executor)
     const workers = parseInt(options.workers);
     const results = await executor.executeAll(testCases, workers);
 
+    // Stop log collector if running
+    if (logCollector) {
+      log("\nStopping log collector...");
+      await logCollector.stop();
+      log("  Log collector stopped");
+    }
 
     // Judge results
     log("\nJudging results...");
     let judgments: Judgment[];
tests/src/executor.ts

@@ -1,6 +1,7 @@
 import { exec } from 'child_process'
 import { promisify } from 'util'
 import { TestCase, TestResult, StepResult } from './types.js'
+import { LogCollector } from './log-collector.js'
 
 const execAsync = promisify(exec)
@@ -17,9 +18,12 @@ export class TestExecutor {
   private workingDir: string
   private totalTests: number = 0
   private currentTest: number = 0
+  private logCollector: LogCollector | null = null
+  private currentTestId: string | null = null
 
-  constructor(workingDir: string = process.cwd()) {
+  constructor(workingDir: string = process.cwd(), logCollector?: LogCollector) {
     this.workingDir = workingDir
+    this.logCollector = logCollector || null
   }
 
   // Progress output goes to stderr (visible in console)
@@ -34,12 +38,24 @@ export class TestExecutor {
     let exitCode = 0
     let timedOut = false
 
+    // Build environment with TEST_ID for log access
+    const env = { ...process.env }
+    if (this.currentTestId) {
+      env.TEST_ID = this.currentTestId
+    }
+
+    // Update log file before each step so test can access current logs
+    if (this.logCollector && this.currentTestId) {
+      this.logCollector.writeCurrentLogs(this.currentTestId)
+    }
+
     try {
       const result = await execAsync(command, {
         cwd: this.workingDir,
         timeout,
         maxBuffer: 10 * 1024 * 1024, // 10MB buffer
-        shell: '/bin/bash'
+        shell: '/bin/bash',
+        env
       })
       stdout = result.stdout
       stderr = result.stderr
@@ -77,8 +93,14 @@ export class TestExecutor {
     const timestamp = new Date().toISOString().substring(11, 19)
 
     this.currentTest++
+    this.currentTestId = testCase.id
     this.progress(`[${timestamp}] [${this.currentTest}/${this.totalTests}] ${testCase.id}: ${testCase.name}`)
 
+    // Mark test start for log collection
+    if (this.logCollector) {
+      this.logCollector.markTestStart(testCase.id)
+    }
+
     for (let i = 0; i < testCase.steps.length; i++) {
       const step = testCase.steps[i]
       const stepTimestamp = new Date().toISOString().substring(11, 19)
@@ -106,6 +128,12 @@ export class TestExecutor {
 
     const totalDuration = Date.now() - startTime
 
+    // Mark test end for log collection
+    if (this.logCollector) {
+      this.logCollector.markTestEnd(testCase.id)
+    }
+    this.currentTestId = null
+
     // Combine all logs
     const logs = stepResults.map(r => {
       return `=== Step: ${r.name} ===
tests/src/log-collector.ts (new file, 200 lines)

@@ -0,0 +1,200 @@
+import { spawn, ChildProcess } from "child_process";
+import { writeFileSync } from "fs";
+import path from "path";
+
+interface LogEntry {
+  timestamp: Date;
+  line: string;
+}
+
+interface TestMarker {
+  start: number;
+  end?: number;
+}
+
+/**
+ * LogCollector runs `docker compose logs --follow` as a background process
+ * and captures logs with precise test boundaries.
+ *
+ * This solves the log overlap problem where `docker compose logs --since=5m`
+ * could include logs from previous tests or miss logs if a test exceeds 5 minutes.
+ */
+export class LogCollector {
+  private process: ChildProcess | null = null;
+  private logs: LogEntry[] = [];
+  private testMarkers: Map<string, TestMarker> = new Map();
+  private workingDir: string;
+  private isRunning: boolean = false;
+
+  constructor(workingDir: string) {
+    // workingDir should be the project root, docker-compose.yml is in docker/
+    this.workingDir = path.join(workingDir, "docker");
+  }
+
+  /**
+   * Start the log collector background process
+   */
+  async start(): Promise<void> {
+    if (this.isRunning) {
+      return;
+    }
+
+    return new Promise((resolve, reject) => {
+      // Spawn docker compose logs --follow
+      this.process = spawn("docker", ["compose", "logs", "--follow", "--timestamps"], {
+        cwd: this.workingDir,
+        stdio: ["ignore", "pipe", "pipe"],
+      });
+
+      this.isRunning = true;
+
+      // Handle stdout (main log output)
+      this.process.stdout?.on("data", (data: Buffer) => {
+        const lines = data.toString().split("\n").filter((l) => l.trim());
+        for (const line of lines) {
+          this.logs.push({
+            timestamp: new Date(),
+            line: line,
+          });
+        }
+      });
+
+      // Handle stderr (docker compose messages)
+      this.process.stderr?.on("data", (data: Buffer) => {
+        const lines = data.toString().split("\n").filter((l) => l.trim());
+        for (const line of lines) {
+          this.logs.push({
+            timestamp: new Date(),
+            line: `[stderr] ${line}`,
+          });
+        }
+      });
+
+      this.process.on("error", (err) => {
+        this.isRunning = false;
+        reject(err);
+      });
+
+      this.process.on("close", (code) => {
+        this.isRunning = false;
+      });
+
+      // Give it a moment to start
+      setTimeout(() => {
+        if (this.isRunning) {
+          resolve();
+        } else {
+          reject(new Error("Log collector failed to start"));
+        }
+      }, 500);
+    });
+  }
+
|
/**
|
||||||
|
* Stop the log collector background process
|
||||||
|
*/
|
||||||
|
async stop(): Promise<void> {
|
||||||
|
if (!this.process || !this.isRunning) {
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
return new Promise((resolve) => {
|
||||||
|
this.process!.on("close", () => {
|
||||||
|
this.isRunning = false;
|
||||||
|
resolve();
|
||||||
|
});
|
||||||
|
|
||||||
|
this.process!.kill("SIGTERM");
|
||||||
|
|
||||||
|
// Force kill after 5 seconds
|
||||||
|
setTimeout(() => {
|
||||||
|
if (this.isRunning && this.process) {
|
||||||
|
this.process.kill("SIGKILL");
|
||||||
|
}
|
||||||
|
resolve();
|
||||||
|
}, 5000);
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Mark the start of a test - records current log index
|
||||||
|
*/
|
||||||
|
markTestStart(testId: string): void {
|
||||||
|
this.testMarkers.set(testId, {
|
||||||
|
start: this.logs.length,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Mark the end of a test - records current log index and writes logs to file
|
||||||
|
*/
|
||||||
|
markTestEnd(testId: string): void {
|
||||||
|
const marker = this.testMarkers.get(testId);
|
||||||
|
if (marker) {
|
||||||
|
marker.end = this.logs.length;
|
||||||
|
|
||||||
|
// Write logs to temp file for test steps to access
|
||||||
|
const logs = this.getLogsForTest(testId);
|
||||||
|
const filePath = `/tmp/test-${testId}-logs.txt`;
|
||||||
|
writeFileSync(filePath, logs);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Get logs for a specific test (between start and end markers)
|
||||||
|
*/
|
||||||
|
getLogsForTest(testId: string): string {
|
||||||
|
const marker = this.testMarkers.get(testId);
|
||||||
|
if (!marker) {
|
||||||
|
return "";
|
||||||
|
}
|
||||||
|
|
||||||
|
const endIndex = marker.end ?? this.logs.length;
|
||||||
|
const testLogs = this.logs.slice(marker.start, endIndex);
|
||||||
|
|
||||||
|
return testLogs.map((entry) => entry.line).join("\n");
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Get all logs collected so far
|
||||||
|
*/
|
||||||
|
getAllLogs(): string {
|
||||||
|
return this.logs.map((entry) => entry.line).join("\n");
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Get logs since a specific index
|
||||||
|
*/
|
||||||
|
getLogsSince(startIndex: number): string {
|
||||||
|
return this.logs.slice(startIndex).map((entry) => entry.line).join("\n");
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Get current log count (useful for tracking)
|
||||||
|
*/
|
||||||
|
getLogCount(): number {
|
||||||
|
return this.logs.length;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Check if the collector is running
|
||||||
|
*/
|
||||||
|
isActive(): boolean {
|
||||||
|
return this.isRunning;
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Write current test logs to file (call during test execution)
|
||||||
|
* This allows test steps to access logs accumulated so far
|
||||||
|
*/
|
||||||
|
writeCurrentLogs(testId: string): void {
|
||||||
|
const marker = this.testMarkers.get(testId);
|
||||||
|
if (!marker) {
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
const testLogs = this.logs.slice(marker.start);
|
||||||
|
const filePath = `/tmp/test-${testId}-logs.txt`;
|
||||||
|
writeFileSync(filePath, testLogs.map((entry) => entry.line).join("\n"));
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -27,8 +27,12 @@ steps:
   - name: Verify model loading in logs
     command: |
-      cd docker
-      LOGS=$(docker compose logs 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs 2>&1)
+      fi

       echo "=== Model Loading Check ==="

@@ -58,8 +62,12 @@ steps:
   - name: Verify llama runner started
     command: |
-      cd docker
-      LOGS=$(docker compose logs 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs 2>&1)
+      fi

       echo "=== Llama Runner Check ==="

@@ -74,8 +82,12 @@ steps:
   - name: Check for model loading errors
     command: |
-      cd docker
-      LOGS=$(docker compose logs 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs 2>&1)
+      fi

       echo "=== Model Loading Error Check ==="

@@ -97,8 +109,12 @@ steps:
   - name: Display model memory allocation from logs
     command: |
-      cd docker
-      LOGS=$(docker compose logs 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs 2>&1)
+      fi

       echo "=== Model Memory Allocation ==="
       echo "$LOGS" | grep -E '(model weights|kv cache|compute graph|total memory).*device=' | tail -8
@@ -17,8 +17,12 @@ steps:
   - name: Check for inference errors in logs
     command: |
-      cd docker
-      LOGS=$(docker compose logs --since=5m 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs --since=5m 2>&1)
+      fi

       echo "=== Inference Error Check ==="

@@ -47,8 +51,12 @@ steps:
   - name: Verify inference request in logs
     command: |
-      cd docker
-      LOGS=$(docker compose logs --since=5m 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs --since=5m 2>&1)
+      fi

       echo "=== Inference Request Verification ==="

@@ -69,8 +77,12 @@ steps:
   - name: Display recent CUDA activity from logs
     command: |
-      cd docker
-      LOGS=$(docker compose logs --since=5m 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs --since=5m 2>&1)
+      fi

       echo "=== Recent CUDA Activity ==="
       echo "$LOGS" | grep -iE "(CUDA|cuda|device=CUDA)" | tail -5 || echo "No recent CUDA activity logged"
@@ -22,8 +22,12 @@ steps:
   - name: Verify API requests logged successfully
     command: |
-      cd docker
-      LOGS=$(docker compose logs --since=5m 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs --since=5m 2>&1)
+      fi

       echo "=== API Request Log Verification ==="

@@ -40,8 +44,12 @@ steps:
   - name: Check for API errors in logs
     command: |
-      cd docker
-      LOGS=$(docker compose logs --since=5m 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs --since=5m 2>&1)
+      fi

       echo "=== API Error Check ==="

@@ -64,8 +72,12 @@ steps:
   - name: Display API response times from logs
     command: |
-      cd docker
-      LOGS=$(docker compose logs --since=5m 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs --since=5m 2>&1)
+      fi

       echo "=== API Response Times ==="
@@ -27,8 +27,12 @@ steps:
   - name: Verify model loaded to GPU
     command: |
-      cd docker
-      LOGS=$(docker compose logs --since=5m 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs --since=5m 2>&1)
+      fi

       echo "=== Model Loading Check for gemma3:12b ==="

@@ -63,8 +67,12 @@ steps:
   - name: Check for inference errors
     command: |
-      cd docker
-      LOGS=$(docker compose logs --since=5m 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs --since=5m 2>&1)
+      fi

       echo "=== Inference Error Check ==="
@@ -39,8 +39,12 @@ steps:
   - name: Verify model loaded across GPUs
     command: |
-      cd docker
-      LOGS=$(docker compose logs --since=10m 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs --since=10m 2>&1)
+      fi

       echo "=== Model Loading Check for gemma3:27b ==="

@@ -96,8 +100,12 @@ steps:
   - name: Check for inference errors
     command: |
-      cd docker
-      LOGS=$(docker compose logs --since=10m 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs --since=10m 2>&1)
+      fi

       echo "=== Inference Error Check ==="
@@ -22,12 +22,21 @@ steps:
   - name: Capture startup logs
     command: |
-      cd docker && docker compose logs 2>&1 | head -100
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        head -100 /tmp/test-${TEST_ID}-logs.txt
+      else
+        cd docker && docker compose logs 2>&1 | head -100
+      fi

   - name: Check for startup errors in logs
     command: |
-      cd docker
-      LOGS=$(docker compose logs 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs 2>&1)
+      fi

       # Check for critical errors
       if echo "$LOGS" | grep -qE "(level=ERROR|CUBLAS_STATUS_|CUDA error|cudaMalloc failed)"; then
@@ -30,8 +30,12 @@ steps:
   - name: Verify GPU detection in Ollama logs
     command: |
-      cd docker
-      LOGS=$(docker compose logs 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs 2>&1)
+      fi

       echo "=== GPU Detection Check ==="

@@ -60,8 +64,12 @@ steps:
   - name: Check for GPU-related errors in logs
     command: |
-      cd docker
-      LOGS=$(docker compose logs 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs 2>&1)
+      fi

       echo "=== GPU Error Check ==="

@@ -82,8 +90,12 @@ steps:
   - name: Display GPU memory status from logs
     command: |
-      cd docker
-      LOGS=$(docker compose logs 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs 2>&1)
+      fi

       echo "=== GPU Memory Status ==="
       echo "$LOGS" | grep -E "gpu memory.*library=CUDA" | tail -4
@@ -30,8 +30,12 @@ steps:
   - name: Verify server listening in logs
     command: |
-      cd docker
-      LOGS=$(docker compose logs 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs 2>&1)
+      fi

       echo "=== Server Status Check ==="

@@ -46,8 +50,12 @@ steps:
   - name: Check for runtime errors in logs
     command: |
-      cd docker
-      LOGS=$(docker compose logs 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs 2>&1)
+      fi

       echo "=== Runtime Error Check ==="

@@ -71,8 +79,12 @@ steps:
   - name: Verify API request handling in logs
     command: |
-      cd docker
-      LOGS=$(docker compose logs 2>&1)
+      # Use log collector file if available, fallback to docker compose logs
+      if [ -f "/tmp/test-${TEST_ID}-logs.txt" ]; then
+        LOGS=$(cat /tmp/test-${TEST_ID}-logs.txt)
+      else
+        LOGS=$(cd docker && docker compose logs 2>&1)
+      fi

       echo "=== API Request Logs ==="