gpt-oss works best with a context length of at least 8k. However, on GPUs with a limited amount of VRAM, the larger context carries a significant performance hit. In those cases, we fall back to the Ollama default of 4k.
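
A minimal sketch of that selection logic, assuming hypothetical names (`pickContextLength`, `isGptOss`, `freeVRAM`, `vramThreshold`) rather than the actual functions in the codebase:

```go
package main

import "fmt"

const (
	defaultCtx = 4096 // Ollama default context length
	gptOssCtx  = 8192 // preferred context length for gpt-oss
)

// pickContextLength prefers an 8k context for gpt-oss, but falls back to
// the 4k default when the available VRAM is below a chosen threshold.
// Names and threshold handling here are illustrative assumptions only.
func pickContextLength(isGptOss bool, freeVRAM, vramThreshold uint64) int {
	if isGptOss && freeVRAM >= vramThreshold {
		return gptOssCtx
	}
	return defaultCtx
}

func main() {
	fmt.Println(pickContextLength(true, 24<<30, 16<<30)) // ample VRAM: 8192
	fmt.Println(pickContextLength(true, 8<<30, 16<<30))  // limited VRAM: 4096
}
```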