ollama37

mirror of https://github.com/dogkeeper886/ollama37.git synced 2025-12-10 07:46:59 +00:00


Shang Chieh Tseng 68f9b1580e Add timing instrumentation and user progress messages for model loading

Problem: Model loading takes 2-3 minutes on first load with no user feedback,
causing confusion about whether the system is frozen or working.

Root Cause: GPU initialization (reserveWorstCaseGraph) takes ~164 seconds on
Tesla K80 GPUs because CUDA kernels are compiled at load time (PTX JIT for
compute capability 3.7). This is by design: the step validates GPU
compatibility before committing to a full load.

Solution:
1. Add comprehensive timing instrumentation to identify bottlenecks
2. Add user-facing progress messages explaining the delay

Changes:
- cmd/cmd.go: Update spinner with informative message for users
- llama/llama.go: Add timing logs for CGO model loading
- runner/llamarunner/runner.go: Add detailed timing for llama runner
- runner/ollamarunner/runner.go: Add timing + stderr messages for new engine
- server/sched.go: Add timing for scheduler load operation

User Experience:
Before: Silent wait with blinking cursor for 2-3 minutes
After: Rotating spinner with message "loading model (may take 1-3 min on first load)"

Performance Metrics Captured:
- GGUF file reading: ~0.4s
- GPU kernel compilation: ~164s (bottleneck identified)
- Model weight loading: ~0.002s
- Total end-to-end: ~165s

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-11-12 19:09:37 +08:00

| File | Last commit | Date |
| --- | --- | --- |
| cache_test.go | Runner for Ollama engine | 2025-02-13 17:09:26 -08:00 |
| cache.go | Sync with upstream ollama/ollama and restore Tesla K80 (compute 3.7) support | 2025-11-05 14:03:05 +08:00 |
| image_test.go | Sync with upstream ollama/ollama and restore Tesla K80 (compute 3.7) support | 2025-11-05 14:03:05 +08:00 |
| image.go | Sync with upstream ollama/ollama and restore Tesla K80 (compute 3.7) support | 2025-11-05 14:03:05 +08:00 |
| runner.go | Add timing instrumentation and user progress messages for model loading | 2025-11-12 19:09:37 +08:00 |