Mirror of https://github.com/dogkeeper886/ollama37.git, synced 2025-12-22 05:37:01 +00:00
Add GitHub Actions CI/CD pipeline and test framework
- Add .github/workflows/build-test.yml for automated testing
- Add tests/ directory with TypeScript test runner
- Add docs/CICD.md documentation
- Remove .gitlab-ci.yml (migrated to GitHub Actions)
- Update .gitignore for test artifacts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
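The commit message mentions a TypeScript test runner under tests/. As a rough sketch of how that runner might model the test-case format shown in the diff below, the following type declarations mirror the YAML keys; the interface names and comments are illustrative assumptions, not code from the repository.

```typescript
// Hypothetical shape of a test-case file as consumed by the runner.
// Field names mirror the YAML keys in TC-INFERENCE-002.yml; everything
// else here is an illustrative assumption, not the repo's actual code.
interface TestStep {
  name: string;       // human-readable step label
  command: string;    // shell command executed for this step
  timeout?: number;   // per-step timeout in milliseconds (optional)
}

interface TestCase {
  id: string;              // e.g. "TC-INFERENCE-002"
  name: string;            // e.g. "Basic Inference"
  suite: string;           // suite grouping, e.g. "inference"
  priority: number;        // scheduling priority (semantics assumed)
  timeout: number;         // overall test timeout in milliseconds
  dependencies?: string[]; // test IDs that must pass first
  steps: TestStep[];       // shell commands run in order
  criteria: string;        // free-text pass/fail criteria
}
```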
tests/testcases/inference/TC-INFERENCE-002.yml (new file, 28 lines added)
@@ -0,0 +1,28 @@
+id: TC-INFERENCE-002
+name: Basic Inference
+suite: inference
+priority: 2
+timeout: 180000
+
+dependencies:
+  - TC-INFERENCE-001
+
+steps:
+  - name: Run simple math question
+    command: docker exec ollama37 ollama run gemma3:4b "What is 2+2? Answer with just the number." 2>&1
+    timeout: 120000
+
+  - name: Check GPU memory usage
+    command: docker exec ollama37 nvidia-smi --query-compute-apps=pid,used_memory --format=csv 2>/dev/null || echo "No GPU processes"
+
+criteria: |
+  Basic inference should work on Tesla K80.
+
+  Expected:
+  - Model responds to the math question
+  - Response should indicate "4" (accept variations: "4", "four", "The answer is 4", etc.)
+  - GPU memory should be allocated during inference
+  - No CUDA errors in output
+
+  This is AI-generated output - accept reasonable variations.
+  Focus on the model producing a coherent response.
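Given the schema sketched above, a minimal sketch of loading and executing such a test case might look like the following. It assumes the js-yaml package and Node's built-in child_process and fs modules; runTestCase and all of its logic are hypothetical, not the repository's actual runner.

```typescript
// Minimal runner sketch: parse a test-case YAML file and run its steps
// in order. Uses the TestCase/TestStep interfaces sketched earlier.
// Assumes the js-yaml package; this is not the repo's actual runner.
import { execSync } from "node:child_process";
import { readFileSync } from "node:fs";
import * as yaml from "js-yaml";

function runTestCase(path: string): boolean {
  const tc = yaml.load(readFileSync(path, "utf8")) as TestCase;
  console.log(`Running ${tc.id}: ${tc.name} (suite: ${tc.suite})`);

  for (const step of tc.steps) {
    console.log(`  step: ${step.name}`);
    try {
      // Fall back to the overall test timeout when a step has none.
      const output = execSync(step.command, {
        timeout: step.timeout ?? tc.timeout,
        encoding: "utf8",
      });
      console.log(output.trim());
    } catch (err) {
      // A non-zero exit code or a timeout lands here.
      console.error(`  step failed: ${step.name}`, err);
      return false;
    }
  }

  // The criteria block is free text ("accept reasonable variations"),
  // so pass/fail judgment would happen elsewhere, against tc.criteria.
  return true;
}
```

Dependency ordering (the dependencies list) and evaluation of the free-text criteria block are left out of this sketch; the criteria's own wording ("accept reasonable variations") suggests that judgment happens outside simple string matching.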