Revamp ROCm support

This refines where we extract the LLM libraries to by adding a new
OLLAMA_HOME env var, that defaults to `~/.ollama` The logic was already
idempotenent, so this should speed up startups after the first time a
new release is deployed.  It also cleans up after itself.

We now build only a single ROCm version (latest major) on both windows
and linux.  Given the large size of ROCms tensor files, we split the
dependency out.  It's bundled into the installer on windows, and a
separate download on windows.  The linux install script is now smart and
detects the presence of AMD GPUs and looks to see if rocm v6 is already
present, and if not, then downloads our dependency tar file.

For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us.  For Windows, we now use go's windows
dynamic library loading logic to access the amdhip64.dll APIs to query
the GPU information.
This commit is contained in:
Daniel Hiltgen
2024-02-15 17:15:09 -08:00
parent 2e20110e50
commit 6c5ccb11f9
27 changed files with 1091 additions and 588 deletions

View File

@@ -67,6 +67,43 @@ You can see what features your CPU has with the following.
cat /proc/cpuinfo| grep flags | head -1
```
## AMD Radeon GPU Support
Ollama leverages the AMD ROCm library, which does not support all AMD GPUs. In
some cases you can force the system to try to use a close GPU type. For example
The Radeon RX 5400 is `gfx1034` (also known as 10.3.4) however, ROCm does not
support this patch-level, the closest support is `gfx1030`. You can use the
environment variable `HSA_OVERRIDE_GFX_VERSION` with `x.y.z` syntax. So for
example, to force the system to run on the RX 5400, you would set
`HSA_OVERRIDE_GFX_VERSION="10.3.0"` as an environment variable for the server.
At this time, the known supported GPU types are the following: (This may change from
release to release)
- gfx900
- gfx906
- gfx908
- gfx90a
- gfx940
- gfx941
- gfx942
- gfx1030
- gfx1100
- gfx1101
- gfx1102
This will not work for all unsupported GPUs. Reach out on [Discord](https://discord.gg/ollama)
or file an [issue](https://github.com/ollama/ollama/issues) for additional help.
## Installing older versions on Linux
If you run into problems on Linux and want to install an older version you can tell the install script
which version to install.
```sh
curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION="0.1.27" sh
```
## Known issues
* N/A