ollama37/ml/backend
Shang Chieh Tseng 08f38b19ea Fix Tesla K80 multi-GPU model switching deadlocks and silent failures
Resolves two critical issues preventing robust model switching:

1. Scheduler deadlock: Fixed improper loop control flow that prevented
   model unloading from triggering after conflict detection. Added proper
   multi-GPU conflict detection and unload sequencing (see the first
   sketch after this list).

2. Silent inference failures: Changed critical cudaSetDevice() calls from
   graceful error handling back to CUDA_CHECK, so models can no longer
   appear to load successfully while failing silently during inference
   (see the second sketch after this list).

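A hedged sketch of the fail-fast change in item 2. CUDA_CHECK is written here
as a generic abort-on-error macro; the exact macro definition and call sites
in the CUDA backend may differ:

```cpp
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Generic abort-on-error macro in the spirit of the backend's CUDA_CHECK.
#define CUDA_CHECK(call)                                                   \
    do {                                                                   \
        cudaError_t err_ = (call);                                         \
        if (err_ != cudaSuccess) {                                         \
            fprintf(stderr, "CUDA error at %s:%d: %s\n", __FILE__,         \
                    __LINE__, cudaGetErrorString(err_));                   \
            abort();                                                       \
        }                                                                  \
    } while (0)

void select_device(int device) {
    // Before: a failed cudaSetDevice() was logged and swallowed, so the model
    // appeared to load but later kernels targeted the wrong device and failed
    // silently during inference.
    // After: abort immediately so the failure is visible and the scheduler's
    // unload-and-retry recovery can run instead of serving broken inference.
    CUDA_CHECK(cudaSetDevice(device));
}
```
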
Result: robust Tesla K80 dual-GPU model switching with self-healing
recovery, instead of requiring a system reboot.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-10 01:30:10 +08:00