Support multiple variants for a given llm lib type

In some cases we may want multiple variants of the llm library for a given GPU type or CPU.
This adds an optional Variant that we can use to select the optimal library,
plus logic to try multiple variants in case some fail to load.

This can be useful for scenarios such as ROCm v5 vs. v6 incompatibility,
or potentially for different CPU feature sets.
Author: Daniel Hiltgen
Date:   2024-01-05 12:13:08 -08:00
Commit: 8da7bef05f (parent: b24e8d17b2)
16 changed files with 428 additions and 212 deletions
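
As a rough sketch of the idea in the commit message (not the code in this change): each library build can carry an optional variant tag, the builds for a type are kept in preference order, and the loader walks that order until one of them actually loads. The LibraryVariant type and the loadDynamic/loadBestVariant functions below are invented names for illustration only.

package main

import (
	"errors"
	"fmt"
)

// LibraryVariant describes one build of the llm library: a library type
// (cpu, cuda, rocm, ...) plus an optional variant tag such as "v5" or "v6".
type LibraryVariant struct {
	LibType string
	Variant string // empty when there is only one build for the type
	Path    string // location of the dynamic library
}

// loadDynamic stands in for whatever dlopens the library; here it simply
// rejects the "v6" build so the fallback path is visible in the demo.
func loadDynamic(v LibraryVariant) error {
	if v.Variant == "v6" {
		return errors.New("symbol lookup failed")
	}
	return nil
}

// loadBestVariant walks the variants in preference order and returns the
// first one that loads, so a single incompatible build does not take the
// whole library type down.
func loadBestVariant(variants []LibraryVariant) (LibraryVariant, error) {
	var lastErr error
	for _, v := range variants {
		if err := loadDynamic(v); err != nil {
			lastErr = err
			continue
		}
		return v, nil
	}
	return LibraryVariant{}, fmt.Errorf("no usable variant: %v", lastErr)
}

func main() {
	rocm := []LibraryVariant{
		{LibType: "rocm", Variant: "v6", Path: "/tmp/llm/rocm_v6/ext_server.so"},
		{LibType: "rocm", Variant: "v5", Path: "/tmp/llm/rocm_v5/ext_server.so"},
	}
	v, err := loadBestVariant(rocm)
	if err != nil {
		fmt.Println("falling back to cpu:", err)
		return
	}
	fmt.Printf("loaded %s %s from %s\n", v.LibType, v.Variant, v.Path)
}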


@@ -1,6 +1,8 @@
 package llm
 
 import (
+	"fmt"
+
 	"github.com/jmorganca/ollama/api"
 )
@@ -8,5 +10,6 @@ func newDefaultExtServer(model string, adapters, projectors []string, opts api.O
 	// On windows we always load the llama.cpp libraries dynamically to avoid startup DLL dependencies
 	// This ensures we can update the PATH at runtime to get everything loaded
-	return newDynamicShimExtServer(AvailableShims["cpu"], model, adapters, projectors, opts)
+	// This should never happen as we'll always try to load one or more cpu dynamic libraries before hitting default
+	return nil, fmt.Errorf("no available default llm library on windows")
 }
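
To make the new comment concrete, here is a hedged sketch of the kind of caller that makes the plain error return safe: the cpu dynamic libraries registered in AvailableShims are tried first, and newDefaultExtServer is only reached when none of them could be loaded. pickCPUExtServer and its preference list are invented for illustration; only AvailableShims, newDynamicShimExtServer, and newDefaultExtServer come from the diff above, with the signatures the diff implies.

// Hypothetical caller in package llm, not code from this commit.
func pickCPUExtServer(model string, adapters, projectors []string, opts api.Options) (extServer, error) {
	var lastErr error
	// "cpu" is the only key shown in the diff; a variant-aware build could
	// register several cpu entries and list them here in preference order.
	for _, name := range []string{"cpu"} {
		lib, ok := AvailableShims[name]
		if !ok {
			continue
		}
		srv, err := newDynamicShimExtServer(lib, model, adapters, projectors, opts)
		if err == nil {
			return srv, nil
		}
		lastErr = err
	}
	if lastErr != nil {
		return nil, lastErr
	}
	// No cpu dynamic library was registered at all; on windows this now
	// surfaces the explicit "no available default llm library" error.
	return newDefaultExtServer(model, adapters, projectors, opts)
}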