ollama37

mirror of https://github.com/dogkeeper886/ollama37.git synced 2025-12-12 00:37:04 +00:00

Files

Shang Chieh Tseng 46213c5880 Fix Tesla K80 VMM pool crash by aligning to granularity

- Fix CUDA_ERROR_INVALID_VALUE from cuMemAddressReserve by aligning max_pool_size to GPU granularity
- Set dynamic max_pool_size based on 90% of actual GPU memory instead of static 32GB
- Add memory availability check before allocation to prevent OOM
- Tested on Tesla K80 dual GPU setup with successful model loading and chat completions

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-08-08 17:48:31 +08:00
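
The sizing logic described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the helper names `align_down`, `vmm_max_pool_size`, and `fits_in_free_memory` are hypothetical, and rounding down (rather than up) is an assumption made so the pool never exceeds the 90% target. In a real CUDA build the total and free byte counts would presumably come from `cuMemGetInfo` and the granularity from `cuMemGetAllocationGranularity`; plain integers stand in for them here so the arithmetic can be checked without a GPU.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: round `size` down to a multiple of `granularity`.
// cuMemAddressReserve rejects sizes that are not a multiple of the
// allocation granularity, which is the failure the commit describes.
static size_t align_down(size_t size, size_t granularity) {
    assert(granularity > 0);
    return size - (size % granularity);
}

// Hypothetical helper: choose a pool size of about 90% of total device
// memory, then align it down to the allocation granularity.
// Dividing before multiplying avoids overflow for large totals.
static size_t vmm_max_pool_size(size_t total_bytes, size_t granularity) {
    size_t target = total_bytes / 10 * 9;
    return align_down(target, granularity);
}

// Hypothetical helper: the availability check done before allocating,
// so an oversized request is refused instead of running out of memory.
static bool fits_in_free_memory(size_t request_bytes, size_t free_bytes) {
    return request_bytes <= free_bytes;
}
```

For example, with a total of 1000 bytes and a granularity of 64, the 90% target is 900 and the aligned pool size is 896.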

ggml

Fix Tesla K80 VMM pool crash by aligning to granularity

2025-08-08 17:48:31 +08:00

backend.go

next ollama runner (#7913)

2025-02-13 16:31:21 -08:00