ollama37

mirror of https://github.com/dogkeeper886/ollama37.git synced 2025-12-22 05:37:01 +00:00

Author	SHA1	Message	Date
Shang Chieh Tseng	2c5094db92	Add LogCollector for precise test log boundaries Problem: Tests used `docker compose logs --since=5m` which caused: - Log overlap between tests - Logs from previous tests included - Missing logs if test exceeded 5 minutes Solution: - New LogCollector class runs `docker compose logs --follow` - Marks test start/end boundaries - Writes test-specific logs to /tmp/test-{testId}-logs.txt - Test steps access via TEST_ID environment variable Changes: - tests/src/log-collector.ts: New LogCollector class - tests/src/executor.ts: Integrate LogCollector, set TEST_ID env - tests/src/cli.ts: Start/stop LogCollector for runtime/inference - All test cases: Use log collector with fallback to docker compose Also updated docs/CICD.md with: - Test runner CLI documentation - Judge modes (simple, llm, dual) - Log collector integration - Updated test case list (12b, 27b models) - Model unload strategy - Troubleshooting guide 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-17 17:46:49 +08:00
Shang Chieh Tseng	82ab6cc96e	Refactor model unload: each test cleans up its own model - TC-INFERENCE-003: Add unload step for gemma3:4b at end - TC-INFERENCE-004: Remove redundant 4b unload at start - TC-INFERENCE-005: Remove redundant 12b unload at start Each model size test now handles its own VRAM cleanup. Workflow-level unload remains as safety fallback for failures. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-17 17:20:44 +08:00
Shang Chieh Tseng	1a185f7926	Add comprehensive Ollama log checking and configurable LLM judge mode Test case enhancements: - TC-RUNTIME-001: Add startup log error checking (CUDA, CUBLAS, CPU fallback) - TC-RUNTIME-002: Add GPU detection verification, CUDA init checks, error detection - TC-RUNTIME-003: Add server listening verification, runtime error checks - TC-INFERENCE-001: Add model loading logs, layer offload verification - TC-INFERENCE-002: Add inference error checking (CUBLAS/CUDA errors) - TC-INFERENCE-003: Add API request log verification, response time display Workflow enhancements: - Add judge_mode input (simple/llm/dual) to all workflows - Add judge_model input to specify LLM model for judging - Configurable via GitHub Actions UI without code changes 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-16 23:27:57 +08:00
Shang Chieh Tseng	d11140c016	Add GitHub Actions CI/CD pipeline and test framework - Add .github/workflows/build-test.yml for automated testing - Add tests/ directory with TypeScript test runner - Add docs/CICD.md documentation - Remove .gitlab-ci.yml (migrated to GitHub Actions) - Update .gitignore for test artifacts 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-15 14:06:44 +08:00

4 Commits
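
The boundary marking described in commit 2c5094db92 can be sketched roughly as follows. This is a minimal illustration, not the contents of tests/src/log-collector.ts: the class name `LogBoundaryBuffer`, its methods, and the helper `logPathFor` are hypothetical, and only the `/tmp/test-{testId}-logs.txt` path pattern comes from the commit message. The sketch leaves out process handling and file I/O and shows only how a continuously followed log stream might be sliced per test.

```typescript
// Hypothetical sketch; names and structure are assumptions, not the project's code.
class LogBoundaryBuffer {
  private lines: string[] = [];                    // every complete log line seen so far
  private partial = "";                            // trailing text not yet ended by "\n"
  private startIndex = new Map<string, number>();  // testId -> line count at test start

  // Feed a raw chunk of log output; a chunk may end in the middle of a line.
  ingest(chunk: string): void {
    const parts = (this.partial + chunk).split("\n");
    this.partial = parts.pop() ?? "";              // keep the unfinished tail for the next chunk
    this.lines.push(...parts);
  }

  // Start boundary: remember how many complete lines existed before the test began.
  markStart(testId: string): void {
    this.startIndex.set(testId, this.lines.length);
  }

  // End boundary: return only the lines that arrived after markStart for this test.
  markEnd(testId: string): string[] {
    const from = this.startIndex.get(testId);
    if (from === undefined) {
      throw new Error(`markStart was not called for ${testId}`);
    }
    this.startIndex.delete(testId);
    return this.lines.slice(from);
  }
}

// Path pattern taken from the commit message; the function itself is illustrative.
function logPathFor(testId: string): string {
  return `/tmp/test-${testId}-logs.txt`;
}
```

In a real runner, `ingest` would presumably be fed from the stdout of a long-running `docker compose logs --follow` process, and the lines returned by `markEnd` would be written to the path from `logPathFor`. Slicing by line index rather than by a time window such as `--since=5m` is what avoids both the overlap between tests and the loss of logs when a test runs longer than the window.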