ollama37/server/sched.go at 4b8a2e341a9b4e713180b483f42316665c5faea3

mirror of https://github.com/dogkeeper886/ollama37.git synced 2025-12-11 08:17:03 +00:00

Jesse Gross 6cd566872b sched: Lift parallel restriction for multimodal models except mllama

The Go runner does not have a problem with supporting parallel
requests for most multimodal models. Now that we won't be potentially
falling back to server.cpp, this restriction can be lifted.

However, the new mllama model can't support parallel requests, so we
will need to keep a restriction for that.

2024-11-06 13:32:18 -08:00

29 KiB
