ollama37/tests/testcases/inference/TC-INFERENCE-002.yml
Commit d11140c016 by Shang Chieh Tseng (2025-12-15 14:06:44 +08:00): Add GitHub Actions CI/CD pipeline and test framework
- Add .github/workflows/build-test.yml for automated testing
- Add tests/ directory with TypeScript test runner
- Add docs/CICD.md documentation
- Remove .gitlab-ci.yml (migrated to GitHub Actions)
- Update .gitignore for test artifacts

id: TC-INFERENCE-002
name: Basic Inference
suite: inference
priority: 2
timeout: 180000
dependencies:
  - TC-INFERENCE-001
steps:
  - name: Run simple math question
    command: docker exec ollama37 ollama run gemma3:4b "What is 2+2? Answer with just the number." 2>&1
    timeout: 120000
  - name: Check GPU memory usage
    command: docker exec ollama37 nvidia-smi --query-compute-apps=pid,used_memory --format=csv 2>/dev/null || echo "No GPU processes"
criteria: |
  Basic inference should work on Tesla K80.
  Expected:
  - Model responds to the math question
  - Response should indicate "4" (accept variations: "4", "four", "The answer is 4", etc.)
  - GPU memory should be allocated during inference
  - No CUDA errors in output
  This is AI-generated output - accept reasonable variations.
  Focus on the model producing a coherent response.
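
The steps above are plain shell commands with millisecond timeouts, which the commit's TypeScript test runner presumably executes in order. A minimal sketch of such a step executor, assuming Node.js; the TestStep shape and runStep helper here are hypothetical illustrations mirroring the YAML keys, not the actual runner from tests/:

// step-runner.ts -- a sketch, not the actual ollama37 test runner.
import { exec } from "node:child_process";

interface TestStep {
  name: string;
  command: string;
  timeout?: number; // per-step timeout in ms, like `timeout: 120000` above
}

interface StepResult {
  name: string;
  stdout: string;
  exitCode: number;
}

function runStep(step: TestStep, defaultTimeout = 180_000): Promise<StepResult> {
  return new Promise((resolve, reject) => {
    // exec's timeout option kills the child after N ms, matching the
    // per-step `timeout` field in the test case YAML.
    exec(step.command, { timeout: step.timeout ?? defaultTimeout }, (err, stdout) => {
      if (err?.killed) {
        reject(new Error(`step "${step.name}" timed out`));
      } else {
        // Keep non-zero exits as results so the criteria check can still
        // inspect output (e.g. the `|| echo "No GPU processes"` fallback).
        resolve({ name: step.name, stdout, exitCode: err?.code ?? 0 });
      }
    });
  });
}

A runner would await runStep for each entry under steps and pass the collected stdout to the criteria evaluation.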
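
The criteria block is free-form prose, which suggests it is evaluated by an LLM judge or a human rather than by exact matching. The "4"/"four" variation rule, though, can be pre-checked deterministically. A hypothetical helper, not part of the actual framework:

// Lenient answer check for the math step; accepts the variations the
// criteria lists ("4", "four", "The answer is 4", ...). Hypothetical.
function answersFour(output: string): boolean {
  const text = output.toLowerCase();
  // \b keeps "4" from matching inside "14" and "four" inside "fourteen".
  return /\b4\b/.test(text) || /\bfour\b/.test(text);
}

// A similar guard could flag failures per the "No CUDA errors" line:
const hasCudaError = (output: string): boolean => /cuda\s+error/i.test(output);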