Mirror of https://github.com/dogkeeper886/ollama37.git (synced 2025-12-10 15:57:04 +00:00)
Changes to test/config/models.yaml:

Quick profile:
- Use gemma3:4b (was gemma2:2b)
- Single prompt: 'Hello, respond with a brief greeting.'
- Timeout: 60s
- Purpose: Fast smoke test (~5 min)

Full profile:
- REMOVED: gemma2:2b, gemma3:4b (redundant with quick test)
- ONLY gemma3:12b (largest model for single K80)
- Single prompt: 'Hello, respond with a brief greeting.' (same as quick)
- Timeout: 120s (sufficient - loads in ~24s)
- Purpose: Validate Phase 2 memory optimization for large models

Rationale:
- Quick test validates basic functionality with gemma3:4b
- Full test validates single-GPU capability with gemma3:12b
- No need to test multiple sizes if both work
- Consistent prompts make comparison easier
- Tests the critical optimization: 12B model on single K80
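
For illustration, the two profiles described above might look roughly like this in test/config/models.yaml. This is only a sketch: the actual schema of the file is not shown here, so every key name (profiles, timeout, models, name, prompts) is an assumption. Only the model tags, the prompt, and the timeout values come from this message.

```yaml
# Hypothetical layout. Key names are assumptions, not the real schema;
# the values (models, prompt, timeouts) are taken from the commit message.
profiles:
  quick:
    # Fast smoke test (~5 min)
    timeout: 60s
    models:
      - name: gemma3:4b
        prompts:
          - "Hello, respond with a brief greeting."
  full:
    # Validates the Phase 2 memory optimization on a single K80
    timeout: 120s
    models:
      - name: gemma3:12b
        prompts:
          - "Hello, respond with a brief greeting."
```

Keeping the prompt identical in both profiles means the model size is the only variable that differs between the two runs, apart from the longer timeout.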