ollama37

mirror of https://github.com/dogkeeper886/ollama37.git synced 2025-12-17 19:27:00 +00:00

Author	SHA1	Message	Date
Shang Chieh Tseng	ef14fb5b26	Sync with upstream ollama/ollama and restore Tesla K80 (compute 3.7) support This commit represents a complete rework after pulling the latest changes from official ollama/ollama repository and re-applying Tesla K80 compatibility patches. ## Key Changes ### CUDA Compute Capability 3.7 Support (Tesla K80) - Added sm_37 (compute 3.7) to CMAKE_CUDA_ARCHITECTURES in CMakeLists.txt - Updated CMakePresets.json to include compute 3.7 in "CUDA 11" preset - Using 37-virtual (PTX with JIT compilation) for maximum compatibility ### Legacy Toolchain Compatibility - NVIDIA Driver: 470.256.02 (last version supporting Kepler/K80) - CUDA Version: 11.4.4 (last CUDA 11.x supporting compute 3.7) - GCC Version: 10.5.0 (required by CUDA 11.4 host_config.h) ### CPU Architecture Trade-offs Due to GCC 10.5 limitation, sacrificed newer CPU optimizations: - Alderlake CPU variant enabled WITHOUT AVX_VNNI (requires GCC 11+) - Still supports: SSE4.2, AVX, F16C, AVX2, BMI2, FMA - Performance impact: ~3-7% on newer CPUs (acceptable for K80 compatibility) ### Build System Updates - Modified ml/backend/ggml/ggml/src/ggml-cuda/CMakeLists.txt for compute 3.7 - Added -Wno-deprecated-gpu-targets flag to suppress warnings - Updated ml/backend/ggml/ggml/src/CMakeLists.txt for Alderlake without AVX_VNNI ### Upstream Sync Merged latest llama.cpp changes including: - Enhanced KV cache management with ISWA and hybrid memory support - Improved multi-modal support (mtmd framework) - New model architectures (Gemma3, Llama4, Qwen3, etc.) - GPU backend improvements for CUDA, Metal, and ROCm - Updated quantization support and GGUF format handling ### Documentation - Updated CLAUDE.md with comprehensive build instructions - Documented toolchain constraints and CPU architecture trade-offs - Removed outdated CI/CD workflows (tesla-k80-*.yml) - Cleaned up temporary development artifacts ## Rationale This fork maintains Tesla K80 GPU support (compute 3.7) which was dropped in official Ollama due to legacy driver/CUDA requirements. The toolchain constraint creates a deadlock: - K80 → Driver 470 → CUDA 11.4 → GCC 10 → No AVX_VNNI We accept the loss of cutting-edge CPU optimizations to enable running modern LLMs on legacy but still capable Tesla K80 hardware (12GB VRAM per GPU). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>	2025-11-05 14:03:05 +08:00
Yoshi	3515cc377c	docs: fix typos and remove trailing whitespaces (#11554 )	2025-07-28 11:19:13 -07:00
Daniel Hiltgen	1c6669e64c	Re-remove cuda v11 (#10694 ) * Re-remove cuda v11 Revert the revert - drop v11 support requiring drivers newer than Feb 23 This reverts commit `c6bcdc4223`. * Simplify layout With only one version of the GPU libraries, we can simplify things down somewhat. (Jetsons still require special handling) * distinct sbsa variant for linux arm64 This avoids accidentally trying to load the sbsa cuda libraries on a jetson system which results in crashes. * temporary prevent rocm+cuda mixed loading	2025-06-23 14:07:00 -07:00
Daniel Hiltgen	c6bcdc4223	Revert "remove cuda v11 (#10569 )" (#10692 ) Bring back v11 until we can better warn users that their driver is too old. This reverts commit `fa393554b9`.	2025-05-13 13:12:54 -07:00
Daniel Hiltgen	fa393554b9	remove cuda v11 (#10569 ) This reduces the size of our Windows installer payloads by ~256M by dropping support for nvidia drivers older than Feb 2023. Hardware support is unchanged. Linux default bundle sizes are reduced by ~600M to 1G.	2025-05-06 17:33:19 -07:00
frob	ccc8c6777b	cleanup: remove OLLAMA_TMPDIR and references to temporary executables (#10182 ) * cleanup: remove OLLAMA_TMPDIR * cleanup: ollama doesn't use temporary executables anymore --------- Co-authored-by: Richard Lyons <frob@cloudstaff.com>	2025-04-08 15:01:39 -07:00
copeland3300	5e0b904e88	docs: add flags to example linux log output command (#9852 )	2025-03-25 09:52:23 -07:00
frob	4df98f3eb5	Move cgroups fix out of AMD section. (#9072 ) Co-authored-by: Richard Lyons <frob@cloudstaff.com>	2025-02-25 08:52:50 -08:00
Azis Alvriyanto	b901a712c6	docs: improve syntax highlighting in code blocks (#8854 )	2025-02-07 09:55:07 -08:00
Stefan Weil	abfdc4710f	all: fix typos in documentation, code, and comments (#7021 )	2024-12-10 12:58:06 -08:00
Daniel Hiltgen	d863298210	docs: Link to AMD guide on multi-GPU guidance (#7744 )	2024-11-20 16:00:46 -08:00
Daniel Hiltgen	ac07160c8d	doc: capture numeric group requirement (#6941 ) Docker uses the container filesystem for name resolution, so we can't guide users to use the name of the host group. Instead they must specify the numeric ID.	2024-11-12 09:13:23 -08:00
Daniel Hiltgen	6606e4243c	docs: Capture docker cgroup workaround (#7519 ) GPU support can break on some systems after a while. This captures a known workaround to solve the problem.	2024-11-12 09:12:50 -08:00
Daniel Hiltgen	9246e6dd15	Verify permissions for AMD GPU (#6736 ) This adds back a check which was lost many releases back to verify /dev/kfd permissions which when lacking, can lead to confusing failure modes of: "rocBLAS error: Could not initialize Tensile host: No devices found" This implementation does not hard fail the serve command but instead will fall back to CPU with an error log. In the future we can include this in the GPU discovery UX to show detected but unsupported devices we discovered.	2024-09-11 11:38:25 -07:00
frob	b73b0940ef	Disable paging for journalctl (#6154 ) Users using `journalctl` to get logs for issue logging sometimes don't realize that paging is causing information to be missed.	2024-08-05 00:10:53 -04:00
Daniel Hiltgen	52abc8acb7	Document older win10 terminal problems We haven't found a workaround, so for now recommend updating.	2024-07-03 17:32:14 -07:00
Daniel Hiltgen	ef757da2c9	Better nvidia GPU discovery logging Refine the way we log GPU discovery to improve the non-debug output, and report more actionable log messages when possible to help users troubleshoot on their own.	2024-07-03 10:50:40 -07:00
Daniel Hiltgen	9d8a4988e8	Implement log rotation for tray app	2024-06-19 12:53:34 -07:00
Daniel Hiltgen	f77713bf1f	Add isolated gpu test to troubleshooting	2024-05-23 09:33:25 -07:00
Patrick Devine	3bade04e10	doc updates for the faq/troubleshooting (#4565 )	2024-05-21 15:30:09 -07:00
alwqx	8800c8a59b	chore: fix typo in docs (#4536 )	2024-05-20 14:19:03 -07:00
Daniel Hiltgen	8cc0ee2efe	Doc container usage and workaround for nvidia errors	2024-05-09 09:26:45 -07:00
Daniel Hiltgen	0a74cb31d5	Safeguard for noexec We may have users that run into problems with our current payload model, so this gives us an escape valve.	2024-04-01 16:48:33 -07:00
Bruce MacDonald	a5ba0fcf78	doc: faq gpu compatibility (#3142 )	2024-03-21 05:21:34 -04:00
Daniel Hiltgen	6459377ae0	Add ROCm support to linux install script (#2966 )	2024-03-14 18:00:16 -07:00
Jeffrey Morgan	6d3adfbea2	Update troubleshooting.md	2024-03-11 13:22:28 -07:00
Daniel Hiltgen	69f0227813	Refined ROCm troubleshooting docs	2024-03-07 11:22:37 -08:00
Daniel Hiltgen	6c5ccb11f9	Revamp ROCm support This refines where we extract the LLM libraries to by adding a new OLLAMA_HOME env var, that defaults to `~/.ollama` The logic was already idempotenent, so this should speed up startups after the first time a new release is deployed. It also cleans up after itself. We now build only a single ROCm version (latest major) on both windows and linux. Given the large size of ROCms tensor files, we split the dependency out. It's bundled into the installer on windows, and a separate download on windows. The linux install script is now smart and detects the presence of AMD GPUs and looks to see if rocm v6 is already present, and if not, then downloads our dependency tar file. For Linux discovery, we now use sysfs and check each GPU against what ROCm supports so we can degrade to CPU gracefully instead of having llama.cpp+rocm assert/crash on us. For Windows, we now use go's windows dynamic library loading logic to access the amdhip64.dll APIs to query the GPU information.	2024-03-07 10:36:50 -08:00
Daniel Hiltgen	29e90cc13b	Implement new Go based Desktop app This focuses on Windows first, but coudl be used for Mac and possibly linux in the future.	2024-02-15 05:56:45 +00:00
Daniel Hiltgen	e7dbb00331	Add container hints for troubleshooting Some users are new to containers and unsure where the server logs go	2024-01-29 08:53:41 -08:00
Daniel Hiltgen	d88c527be3	Build multiple CPU variants and pick the best This reduces the built-in linux version to not use any vector extensions which enables the resulting builds to run under Rosetta on MacOS in Docker. Then at runtime it checks for the actual CPU vector extensions and loads the best CPU library available	2024-01-11 08:42:47 -08:00
Matt Williams	291700c92d	Clean up documentation (#1506 ) * Clean up documentation Will probably need to update with PRs for new release. Signed-off-by: Matt Williams <m@technovangelist.com> * Correcting to fit in 0.1.15 changes Signed-off-by: Matt Williams <m@technovangelist.com> * Update README.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * addressing comments Signed-off-by: Matt Williams <m@technovangelist.com> * more api cleanup Signed-off-by: Matt Williams <m@technovangelist.com> * its llava not llama Signed-off-by: Matt Williams <m@technovangelist.com> * Update docs/troubleshooting.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Updated hosting to server and documented all env vars Signed-off-by: Matt Williams <m@technovangelist.com> * remove last of the cli descriptions Signed-off-by: Matt Williams <m@technovangelist.com> * Update README.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * update further per conversation with jeff earlier today Signed-off-by: Matt Williams <m@technovangelist.com> * cleanup the doc readme Signed-off-by: Matt Williams <m@technovangelist.com> * move upgrade to faq Signed-off-by: Matt Williams <m@technovangelist.com> * first change Signed-off-by: Matt Williams <m@technovangelist.com> * updated Signed-off-by: Matt Williams <m@technovangelist.com> * Update docs/faq.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/api.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/api.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/api.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/api.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/api.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/api.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/README.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/api.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/api.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/api.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update README.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/README.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/api.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/api.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/api.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/README.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/README.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/README.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * examples in parent Signed-off-by: Matt Williams <m@technovangelist.com> * add exapmle for create model. Signed-off-by: Matt Williams <m@technovangelist.com> * update faq Signed-off-by: Matt Williams <m@technovangelist.com> * update create model api Signed-off-by: Matt Williams <m@technovangelist.com> * Update docs/api.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/faq.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/troubleshooting.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * update the readme in docs Signed-off-by: Matt Williams <m@technovangelist.com> * update a few more things Signed-off-by: Matt Williams <m@technovangelist.com> * Update docs/troubleshooting.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/faq.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update README.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/modelfile.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> * Update docs/troubleshooting.md Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com> --------- Signed-off-by: Matt Williams <m@technovangelist.com> Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>	2023-12-22 09:10:01 -08:00

32 Commits