Remove TC-INFERENCE-004: CUBLAS Fallback Verification

Redundant test: if TC-INFERENCE-002 (Basic Inference) passes,
the CUBLAS fallback is already working. Any CUBLAS error would cause
inference to fail, making a separate error-check test unnecessary.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Shang Chieh Tseng
2025-12-15 20:38:35 +08:00
parent 8d65fd4211
commit 3f3f68f08d

@@ -1,32 +0,0 @@
id: TC-INFERENCE-004
name: CUBLAS Fallback Verification
suite: inference
priority: 4
timeout: 120000
dependencies:
  - TC-INFERENCE-002
steps:
  - name: Check for CUBLAS errors in logs
    command: cd docker && docker compose logs 2>&1 | grep -i "CUBLAS_STATUS" | grep -v "SUCCESS" | head -10 || echo "No CUBLAS errors"
  - name: Check compute capability detection
    command: cd docker && docker compose logs 2>&1 | grep -iE "compute|capability|cc.*3" | head -10 || echo "No compute capability logs"
  - name: Verify no GPU errors
    command: cd docker && docker compose logs 2>&1 | grep -iE "error|fail" | grep -i gpu | head -10 || echo "No GPU errors"
criteria: |
  CUBLAS should work correctly on Tesla K80 using legacy fallback.
  Expected:
  - No CUBLAS_STATUS_ARCH_MISMATCH errors
  - No CUBLAS_STATUS_NOT_SUPPORTED errors
  - Compute capability 3.7 may be mentioned in debug logs
  - No fatal GPU-related errors
  The K80 uses legacy CUBLAS functions (cublasSgemmBatched)
  instead of modern Ex variants. This should work transparently.
  Accept warnings. Only fail on actual CUBLAS errors.
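
For reference, the "only fail on actual CUBLAS errors" criterion can be sketched as a small shell helper. This is an illustrative sketch, not part of the removed test: the function names `cublas_error_lines` and `check_cublas_logs` are hypothetical, and it assumes a POSIX shell without `pipefail`. Under that assumption a pipeline's exit status is that of its last command, so in the removed steps the `|| echo ...` fallback depends on `head`, which exits 0 even on empty input. The sketch tests the captured output instead.

```shell
#!/bin/sh
# Hypothetical helper: reads log text on stdin and prints up to 10 lines
# that mention a CUBLAS status other than SUCCESS.
cublas_error_lines() {
    grep -i "CUBLAS_STATUS" | grep -v "SUCCESS" | head -10
}

# Hypothetical wrapper: prints the offending lines and returns 1 when any
# are found; otherwise prints a confirmation message and returns 0.
check_cublas_logs() {
    errors=$(cublas_error_lines)
    if [ -n "$errors" ]; then
        printf '%s\n' "$errors"
        return 1
    fi
    echo "No CUBLAS errors"
}
```

With these definitions loaded, the first removed step would correspond to `cd docker && docker compose logs 2>&1 | check_cublas_logs`, which exits non-zero only when a non-SUCCESS CUBLAS status appears in the logs.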