ollama37

mirror of https://github.com/dogkeeper886/ollama37.git synced 2025-12-10 15:57:04 +00:00

Shang Chieh Tseng 1aa80e9411 Simplify test profiles to focus on Tesla K80 capabilities

Changes to test/config/models.yaml:

Quick profile:
- Use gemma3:4b (was gemma2:2b)
- Single prompt: 'Hello, respond with a brief greeting.'
- Timeout: 60s
- Purpose: Fast smoke test (~5 min)

Full profile:
- Removed: gemma2:2b and gemma3:4b (redundant with the quick test)
- Only gemma3:12b remains (largest model for a single K80)
- Single prompt: 'Hello, respond with a brief greeting.' (same as quick)
- Timeout: 120s (sufficient; the model loads in ~24s)
- Purpose: Validate Phase 2 memory optimization for large models

Rationale:
- Quick test validates basic functionality with gemma3:4b
- Full test validates single-GPU capability with gemma3:12b
- No need to test intermediate sizes if both the 4B and 12B models pass
- Identical prompts make the two profiles easier to compare
- Tests the critical optimization: a 12B model on a single K80
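
For illustration, the two profiles described above might be laid out roughly as follows. This is only a sketch: the actual schema of test/config/models.yaml is not shown here, so every key name (profiles, timeout, models, name, prompts) is an assumption. Only the model tags, the prompt, and the timeout values come from the commit message.

```yaml
# Hypothetical sketch of test/config/models.yaml.
# Key names are assumed; the real file may be structured differently.
profiles:
  quick:
    # Purpose: fast smoke test (~5 min)
    timeout: 60s
    models:
      - name: "gemma3:4b"
        prompts:
          - "Hello, respond with a brief greeting."
  full:
    # Purpose: validate Phase 2 memory optimization for large models
    timeout: 120s
    models:
      - name: "gemma3:12b"
        prompts:
          - "Hello, respond with a brief greeting."
```

Keeping the prompt identical in both profiles means any difference in the responses or timing is attributable to the model size rather than the input.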

2025-10-30 11:57:30 +08:00

models.yaml: Simplify test profiles to focus on Tesla K80 capabilities (2025-10-30 11:57:30 +08:00)
quick.yaml: Add Claude AI-powered response validation and update test model (2025-10-30 11:42:10 +08:00)