- Remove CUDA initialization checks from TC-RUNTIME-002 (ggml_cuda_init and
load_backend only appear when a model is loaded, not at startup)
- Fix bash integer comparison error in TC-RUNTIME-003
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Rename step to "Verify UVM device files" for clarity
- Add "WARNING:" prefix when UVM device is missing
- Add "SUCCESS:" prefix when device is present
- Add confirmation message after UVM fix is applied
- Separate ls command for cleaner output
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
New options:
- --dual-judge: Run both the simple and LLM judges; fail if either fails
- --judge-url: Separate LLM Judge server URL (default: localhost:11435)
- --judge-model: Model for LLM judging (default: gemma3:4b)
Dual judge logic:
- Simple judge checks exit codes
- LLM judge analyzes logs semantically
- Final result: FAIL if either judge says FAIL
- Combines reasons from both judges on failure
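
The dual judge rule above can be sketched as a small pure function. This is only an illustration: the `Verdict` and `JudgeResult` types and the `combineJudges` name are hypothetical, and the runner's actual data shapes are not shown here.

```typescript
// Hypothetical types; the real runner's result shape may differ.
type Verdict = "PASS" | "FAIL";

interface JudgeResult {
  verdict: Verdict;
  reason: string;
}

// FAIL if either judge says FAIL; on failure, join the reasons of the
// judges that failed so the report shows both points of view.
function combineJudges(simple: JudgeResult, llm: JudgeResult): JudgeResult {
  const failures: string[] = [];
  if (simple.verdict === "FAIL") failures.push(`simple: ${simple.reason}`);
  if (llm.verdict === "FAIL") failures.push(`llm: ${llm.reason}`);

  if (failures.length > 0) {
    return { verdict: "FAIL", reason: failures.join("; ") };
  }
  return { verdict: "PASS", reason: "both judges passed" };
}
```

With this shape, a run whose commands all exit 0 still fails when the LLM judge finds a problem in the logs, which is the point of running both.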
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Strip ANSI escape codes from stdout/stderr to reduce log size
(spinner animations were ~95% of inference log size)
- Add [TIMEOUT] indicator when commands are killed due to timeout
for clearer failure diagnosis
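
A minimal sketch of both changes, assuming captured output is available as a string. The names `stripAnsi` and `formatOutput` are hypothetical, and the pattern covers only CSI sequences (cursor movement, line erase, colors), which is the form spinner animations typically use. The executor's real implementation may handle more cases.

```typescript
// Matches CSI escape sequences: ESC [ + parameter bytes + intermediate
// bytes + one final byte. Other escape families are not handled here.
const ANSI_CSI = /\x1b\[[0-?]*[ -\/]*[@-~]/g;

// Remove CSI sequences so logs keep only the visible text.
function stripAnsi(text: string): string {
  return text.replace(ANSI_CSI, "");
}

// Clean the captured output and append a [TIMEOUT] marker when the
// command was killed for exceeding its time limit.
function formatOutput(raw: string, timedOut: boolean): string {
  const clean = stripAnsi(raw);
  return timedOut ? `${clean}\n[TIMEOUT]` : clean;
}
```

Stripping before writing the log, rather than when reading it, is what reduces the stored size.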
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Tesla K80 needs ~60-180s to load the model into VRAM on first inference.
Add a warmup step with a 5-minute timeout to preload the model before
subsequent inference tests run.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Redundant test - if TC-INFERENCE-002 (Basic Inference) passes,
CUBLAS fallback is already working. Any errors would cause
inference to fail, making a separate error-check test unnecessary.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add step to check/create /dev/nvidia-uvm device files
- Use nvidia-modprobe -u -c=0 if UVM devices missing
- Restart container after creating UVM devices
- Update criteria to clarify GPU detection requirements
- Increase timeout to 120s for container restart
Fixes issue where nvidia-smi works but Ollama only detects CPU.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Timeout: 900s -> 3600s (60 min) for runtime image build
- Add tee to capture full build log to /tmp/build-runtime.log
- Add step to show last 200 lines of build log for debugging
- Helps diagnose build failures with proper log capture
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Separate progress output (stderr) from JSON results (stdout)
- Add timestamps, test counters, and step progress to executor
- Update CLI to use stderr for progress messages
- Update workflow to capture JSON to file while showing progress
- Add --silent flag to suppress npm banner noise
This allows real-time visibility into test execution during CI runs
while preserving clean JSON output for artifact collection.
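
The stderr/stdout split can be sketched as follows. The `Progress` interface and the function names are hypothetical; the executor's actual message format is not shown here. In Node.js, `console.error` writes to stderr and `console.log` writes to stdout.

```typescript
// Hypothetical shape of one progress event.
interface Progress {
  testIndex: number; // 1-based position of the current test
  testCount: number; // total number of tests in the run
  step: string;      // description of the step being executed
}

// Build a timestamped progress line such as "[<ISO time>] [2/5] <step>".
function formatProgress(p: Progress, now: Date): string {
  return `[${now.toISOString()}] [${p.testIndex}/${p.testCount}] ${p.step}`;
}

// Progress is for humans watching the run: send it to stderr.
function reportProgress(p: Progress): void {
  console.error(formatProgress(p, new Date()));
}

// Results are for machines: send only JSON to stdout.
function emitResults(results: unknown): void {
  console.log(JSON.stringify(results, null, 2));
}
```

Because nothing but JSON reaches stdout, redirecting stdout to a file captures a parseable artifact while progress lines remain visible in the CI log.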
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add .github/workflows/build-test.yml for automated testing
- Add tests/ directory with TypeScript test runner
- Add docs/CICD.md documentation
- Remove .gitlab-ci.yml (migrated to GitHub Actions)
- Update .gitignore for test artifacts
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>