- TC-INFERENCE-004: gemma3:12b single-GPU test
- TC-INFERENCE-005: gemma3:27b dual-GPU test (K80 layer split)
- Each test unloads the previous model before loading the next
- Workflows unload all 3 model sizes after the inference suite
- The 27b test verifies that both GPUs have memory allocated (see the sketch below)
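A minimal sketch of what the dual-GPU assertion could look like; the nvidia-smi query flags are real, but the 1 GiB threshold and the surrounding test shape are illustrative assumptions, not the suite's actual values:

```typescript
import { execSync } from "node:child_process";

// Returns the indices of GPUs whose used memory exceeds the threshold.
// Assumed threshold: 1 GiB indicates model layers are resident.
function gpusWithModelLoaded(minUsedMiB = 1024): number[] {
  // CSV output gives one "index, memory.used" row per GPU, no units/header
  const out = execSync(
    "nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits",
    { encoding: "utf8" },
  );
  return out
    .trim()
    .split("\n")
    .map((row) => row.split(",").map((v) => Number(v.trim())))
    .filter(([, usedMiB]) => usedMiB >= minUsedMiB)
    .map(([index]) => index);
}

// TC-INFERENCE-005: with gemma3:27b split across the K80's two GPUs,
// both should report significant memory usage.
if (gpusWithModelLoaded().length < 2) {
  throw new Error("expected model layers on both GPUs");
}
```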
- Add unloadModel() method to LLMJudge class
- CLI calls unloadModel() after judging completes
- Workflows unload gemma3:4b after inference tests
- Uses the Ollama API with keep_alive: 0 to trigger an immediate unload (sketched below)
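A hedged sketch of the unload call; the LLMJudge field names (baseUrl, model) are assumptions about the class internals, but sending keep_alive: 0 to Ollama's /api/generate endpoint is the documented way to unload a model immediately:

```typescript
class LLMJudge {
  constructor(
    private baseUrl = "http://localhost:11434", // assumed default
    private model = "gemma3:4b",
  ) {}

  // Ollama unloads the model right away when a request arrives with
  // keep_alive: 0 and no prompt to evaluate.
  async unloadModel(): Promise<void> {
    const res = await fetch(`${this.baseUrl}/api/generate`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model: this.model, keep_alive: 0 }),
    });
    if (!res.ok) {
      throw new Error(`unload failed: ${res.status} ${await res.text()}`);
    }
  }
}
```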
The '|| true' was swallowing the test runner's exit codes, causing
workflows to pass even when tests failed. Added a separate 'Check test
results' step that reads the JSON summary and fails the workflow if any
tests failed (see the sketch after the list below).
Affected workflows:
- build.yml
- runtime.yml
- inference.yml
- full-pipeline.yml
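A sketch of the logic the 'Check test results' step could run; the summary path and JSON shape (total/failed counts) are assumptions about the test runner's output, not taken from the actual suite:

```typescript
import { readFileSync } from "node:fs";

// Assumed location and shape of the runner's JSON summary.
const summaryPath = process.argv[2] ?? "test-results/summary.json";
const summary = JSON.parse(readFileSync(summaryPath, "utf8")) as {
  total: number;
  failed: number;
};

console.log(`${summary.failed}/${summary.total} tests failed`);

// Exiting non-zero is what actually fails the workflow step -- unlike
// '|| true', nothing downstream can swallow this exit code.
if (summary.failed > 0) {
  process.exit(1);
}
```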
Separate workflows for flexibility:
- build.yml: Build verification (standalone + reusable)
- runtime.yml: Container & runtime tests with container lifecycle
- inference.yml: Inference tests with optional container management
- full-pipeline.yml: Orchestrates all stages with LLM judge
Each workflow can be triggered independently for targeted testing, or
the full pipeline can be run for complete validation.