ollama37

matt/ollama37

mirror of https://github.com/dogkeeper886/ollama37.git synced 2025-12-21 13:17:05 +00:00

Commit Graph

Author	SHA1	Message	Date

Shang Chieh Tseng

82ab6cc96e

Refactor model unload: each test cleans up its own model

- TC-INFERENCE-003: Add unload step for gemma3:4b at end
- TC-INFERENCE-004: Remove redundant 4b unload at start
- TC-INFERENCE-005: Remove redundant 12b unload at start

Each model size test now handles its own VRAM cleanup.
Workflow-level unload remains as safety fallback for failures.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2025-12-17 17:20:44 +08:00
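The per-test cleanup described in commit 82ab6cc96e can be sketched roughly as follows. This is not the repository's test code, only an illustration under stated assumptions: the server is assumed to listen on the default local address, and the helper names (`unload_payload`, `unload_model`, `loaded_model`) are hypothetical. The unload itself relies on the Ollama API behavior where a generate request with `keep_alive` set to 0 and no prompt asks the server to release the model.

```python
import json
import urllib.request
from contextlib import contextmanager

# Assumption: Ollama is reachable on its default local address.
OLLAMA_URL = "http://localhost:11434"


def unload_payload(model: str) -> dict:
    """Build the request body that asks the server to unload a model.

    A generate request with keep_alive=0 and no prompt unloads the model
    instead of running inference.
    """
    return {"model": model, "keep_alive": 0}


def unload_model(model: str, base_url: str = OLLAMA_URL) -> None:
    """Send the unload request (hypothetical helper, not from the repo)."""
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(unload_payload(model)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        response.read()  # drain the body; the content is not needed here


@contextmanager
def loaded_model(model: str, unload=unload_model):
    """Yield the model name and always unload it afterwards.

    The finally clause runs whether the test body passes or raises,
    so each test frees its own VRAM.
    """
    try:
        yield model
    finally:
        unload(model)
```

A test body would then run inside `with loaded_model("gemma3:4b"): ...`. The `unload` parameter is injectable so the pattern can be exercised without a running server. A workflow-level unload can still follow the whole suite as a fallback, as the commit message notes.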

Shang Chieh Tseng

806232d95f

Add multi-model inference tests for gemma3 12b and 27b

- TC-INFERENCE-004: gemma3:12b single GPU test
- TC-INFERENCE-005: gemma3:27b dual-GPU test (K80 layer split)
- Each test unloads previous model before loading next
- Workflows unload all 3 model sizes after inference suite
- 27b test verifies both GPUs have memory allocated

2025-12-17 17:01:25 +08:00
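The dual-GPU check mentioned in commit 806232d95f ("verifies both GPUs have memory allocated") could look something like the sketch below. How the repository actually performs this check is not shown in the log, so everything here is an assumption: the use of `nvidia-smi` with its `--query-gpu` CSV output, the function names, and the `min_mib` threshold are all illustrative.

```python
import subprocess


def query_gpu_memory() -> str:
    """Return per-GPU used memory as CSV text, e.g. one 'index, MiB' row per GPU.

    Assumption: nvidia-smi is installed and on PATH.
    """
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,memory.used",
            "--format=csv,noheader,nounits",
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def parse_memory_used(csv_text: str) -> dict[int, int]:
    """Parse 'index, used_mib' rows into a {gpu_index: used_mib} mapping."""
    used = {}
    for line in csv_text.strip().splitlines():
        index, mib = (field.strip() for field in line.split(","))
        used[int(index)] = int(mib)
    return used


def gpus_with_allocation(used: dict[int, int], min_mib: int = 1024) -> list[int]:
    """List GPU indices whose used memory is at least min_mib.

    min_mib is a hypothetical threshold that separates a real model
    allocation from idle driver overhead.
    """
    return [index for index, mib in sorted(used.items()) if mib >= min_mib]
```

A test for the 27b layer split might then assert that `gpus_with_allocation(parse_memory_used(query_gpu_memory()))` contains both device indices after the model loads. Parsing is kept separate from the `nvidia-smi` call so it can be checked against captured output on a machine without GPUs.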

2 Commits