Remove the no-longer-supported max VRAM var

The OLLAMA_MAX_VRAM env var was a temporary workaround for OOM
scenarios. Once concurrency was introduced it was no longer wired up,
and its single, simplistic value doesn't map to multi-GPU setups. Users
can still set `num_gpu` to limit memory usage and avoid OOM if we get
our predictions wrong.
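
As a rough illustration of the `num_gpu` fallback mentioned above, the sketch below builds a request body that passes `num_gpu` in the `options` object of an `/api/generate` call. The `generateRequest` struct and `buildGenerateRequest` helper are hypothetical names made up for this example, not part of this change. The model name and the value 20 are placeholders. It also assumes `num_gpu` is interpreted as the number of model layers offloaded to the GPU.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// generateRequest mirrors only the fields this sketch needs from the
// /api/generate request body. The struct itself is hypothetical.
type generateRequest struct {
	Model   string         `json:"model"`
	Prompt  string         `json:"prompt"`
	Options map[string]any `json:"options,omitempty"`
}

// buildGenerateRequest is a hypothetical helper that caps GPU offload by
// setting the num_gpu option on a single request.
func buildGenerateRequest(model, prompt string, numGPU int) ([]byte, error) {
	req := generateRequest{
		Model:   model,
		Prompt:  prompt,
		Options: map[string]any{"num_gpu": numGPU},
	}
	return json.Marshal(req)
}

func main() {
	// "llama3" and 20 are placeholder values for illustration only.
	body, err := buildGenerateRequest("llama3", "Why is the sky blue?", 20)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

Lowering the value should reduce VRAM use at the cost of running more of the model on the CPU, which is why it can stand in for the removed variable when the memory prediction is off.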
Daniel Hiltgen
2024-07-22 09:08:11 -07:00
parent 80ee9b5e47
commit cc269ba094
3 changed files with 2 additions and 16 deletions

@@ -1344,7 +1344,6 @@ func NewCLI() *cobra.Command {
 envVars["OLLAMA_TMPDIR"],
 envVars["OLLAMA_FLASH_ATTENTION"],
 envVars["OLLAMA_LLM_LIBRARY"],
-envVars["OLLAMA_MAX_VRAM"],
 })
 default:
 appendEnvDocs(cmd, envs)