Commit Graph

  • b5dac79d2c Update README.md [branch: main] Shang Chieh Tseng 2025-11-13 18:14:16 +08:00
  • 68f9b1580e Add timing instrumentation and user progress messages for model loading Shang Chieh Tseng 2025-11-12 19:09:37 +08:00
  • 84210db18a Delete useless files Shang Chieh Tseng 2025-11-12 12:52:36 +08:00
  • 4cf745b40a Update README.md Shang Chieh Tseng 2025-11-12 12:50:13 +08:00
  • 8d376e0f9b Add local development build support to Docker build system Shang Chieh Tseng 2025-11-12 06:51:05 +08:00
  • 7d9b59c520 Improve GPU detection and add detailed model loading logs Shang Chieh Tseng 2025-11-11 23:28:00 +08:00
  • db00f2d5f4 Create dockerhub-readme.md Shang Chieh Tseng 2025-11-10 20:35:43 +08:00
  • 738a8ba2da Improve Docker runtime Dockerfile documentation and accuracy Shang Chieh Tseng 2025-11-10 14:18:08 +08:00
  • 4810471b33 Redesign Docker build system to two-stage architecture with builder/runtime separation Shang Chieh Tseng 2025-11-10 13:14:49 +08:00
  • 6dbd8ed44e Redesign Docker build system to single-stage architecture for reliable model loading Shang Chieh Tseng 2025-11-10 09:19:22 +08:00
  • 0293c53746 Fix Docker container to run as host user and use host .ollama directory Shang Chieh Tseng 2025-11-09 18:00:42 +08:00
  • 8380ca93f8 Fix Docker build system: add library paths, GCC 10 runtime libs, and Go build flags Shang Chieh Tseng 2025-11-09 00:05:12 +08:00
  • 6237498297 Fix Makefile to use custom-built GCC 10 instead of non-existent gcc-toolset-10 Shang Chieh Tseng 2025-11-08 21:20:26 +08:00
  • f2c94bb9af Add Docker builder image with CUDA 11.4, GCC 10, CMake 4, and Go 1.25.3 Shang Chieh Tseng 2025-11-08 21:03:38 +08:00
  • 71fc994a63 Fix Docker build: clean host artifacts after copy to prevent conflicts Shang Chieh Tseng 2025-11-08 17:16:46 +08:00
  • 94bbfbb2e7 Add Docker-based build system with GPU-enabled builder and runtime containers Shang Chieh Tseng 2025-11-07 12:48:05 +08:00
  • 5744fb792a Remove hardcoded compiler paths from CMakePresets.json for portability Shang Chieh Tseng 2025-11-06 23:38:46 +08:00
  • 92ba15bcb1 Fix multi-GPU memory allocation for large models (deepseek-r1:14b) Shang Chieh Tseng 2025-11-06 14:13:29 +08:00
  • d948926581 Fix Tesla K80 CUBLAS compatibility with two-tier fallback strategy Shang Chieh Tseng 2025-11-05 23:52:45 +08:00
  • ef14fb5b26 Sync with upstream ollama/ollama and restore Tesla K80 (compute 3.7) support Shang Chieh Tseng 2025-11-05 14:03:05 +08:00
  • fabe2c5cb7 Revert Phase 1 memory optimization to fix multi-GPU stability Shang Chieh Tseng 2025-10-30 19:10:23 +08:00
  • d002de9af4 Fix multi-GPU OOM errors by disabling Phase 2 graph correction Shang Chieh Tseng 2025-10-30 18:15:46 +08:00
  • c8f6b24358 Update tesla-k80-multi-gpu-tests.yml Shang Chieh Tseng 2025-10-30 17:48:42 +08:00
  • 40b956b23c Fix false positive CPU backend error in test configuration Shang Chieh Tseng 2025-10-30 16:00:20 +08:00
  • 1906882ce6 Fix test-runner log monitor to properly follow log file Shang Chieh Tseng 2025-10-30 15:55:20 +08:00
  • f1d4c7f969 Fix test config: don't treat CPU backend loading as failure Shang Chieh Tseng 2025-10-30 15:39:17 +08:00
  • 6bbdf3e148 Fix test-runner GPU detection by preserving startup events Shang Chieh Tseng 2025-10-30 15:27:40 +08:00
  • d9d3f7b0b4 Fix GitHub Actions workflows to upload build libraries and remove LD_LIBRARY_PATH Shang Chieh Tseng 2025-10-30 15:08:34 +08:00
  • d8ea75a3e2 Fix test-runner to inherit LD_LIBRARY_PATH for CUDA backend loading Shang Chieh Tseng 2025-10-30 14:08:24 +08:00
  • c022e79e77 Add LD_LIBRARY_PATH to GitHub Actions workflows for CUDA library discovery Shang Chieh Tseng 2025-10-30 13:28:44 +08:00
  • bc8992d014 Add RPATH for CUDA libraries in Linux builds Shang Chieh Tseng 2025-10-30 12:51:59 +08:00
  • 46f1038724 Fix Claude validation response format parsing Shang Chieh Tseng 2025-10-30 12:34:02 +08:00
  • c8b7015a2c Move test-runner temp directory into project Shang Chieh Tseng 2025-10-30 12:25:25 +08:00
  • 9b487aa5f5 Rename validateConfig function to validateConfigFile to avoid conflict Shang Chieh Tseng 2025-10-30 12:16:55 +08:00
  • a7b3f6eda5 Fix test-runner variable name conflict Shang Chieh Tseng 2025-10-30 12:15:12 +08:00
  • 5895b414f4 Fix cross-workflow artifact download using dawidd6/action-download-artifact Shang Chieh Tseng 2025-10-30 12:12:59 +08:00
  • a171c8a087 Fix test workflows to use build artifacts instead of local binary Shang Chieh Tseng 2025-10-30 12:07:28 +08:00
  • 6c3876a30d Add multi-GPU test workflow and rename single-GPU workflow Shang Chieh Tseng 2025-10-30 12:04:50 +08:00
  • 1aa80e9411 Simplify test profiles to focus on Tesla K80 capabilities Shang Chieh Tseng 2025-10-30 11:57:30 +08:00
  • 4de7dd453b Add Claude AI-powered response validation and update test model Shang Chieh Tseng 2025-10-30 11:42:10 +08:00
  • d59284d30a Implement Go-based test runner framework for Tesla K80 testing Shang Chieh Tseng 2025-10-30 11:04:48 +08:00
  • aaaf334e7f Update tesla-k80-ci.yml Shang Chieh Tseng 2025-10-30 11:02:14 +08:00
  • b402b073c5 Split Tesla K80 workflows into build and test; add test framework plan Shang Chieh Tseng 2025-10-30 10:59:52 +08:00
  • 7e317fdd74 Add Phase 2 summary documentation for CC 3.7 graph correction Shang Chieh Tseng 2025-10-30 10:27:25 +08:00
  • 296d537a2c Update CLAUDE.md: Document Phase 2 CC 3.7 graph correction Shang Chieh Tseng 2025-10-30 00:16:38 +08:00
  • 6d87524e22 Fix gemma3:12b to load on single Tesla K80 GPU Shang Chieh Tseng 2025-10-30 00:15:59 +08:00
  • d04ea50ced Fix gpt-oss model architecture to match GGUF tensor format Shang Chieh Tseng 2025-10-29 23:34:03 +08:00
  • 241a03402e Optimize GPU memory estimation for single-GPU preference on Tesla K80 Shang Chieh Tseng 2025-10-29 19:58:20 +08:00
  • 5077ab3fb4 Document Phase 9 completion: Fix CUDA backend loading for CC 3.7 Shang Chieh Tseng 2025-10-29 17:44:36 +08:00
  • 66fca1b685 Remove remaining MMA/WMMA template instances for CC 3.7 optimization Shang Chieh Tseng 2025-10-29 15:24:08 +08:00
  • 771044bead Complete CC 3.7-only CUDA optimization for Tesla K80 support Shang Chieh Tseng 2025-10-29 15:21:08 +08:00
  • 135b799b13 Update command. Shang Chieh Tseng 2025-10-29 14:21:03 +08:00
  • 6024408ea5 Update command. Shang Chieh Tseng 2025-10-28 18:42:49 +08:00
  • 92acf0f91e Add GitHub Actions workflow for Tesla K80 CI/CD Shang Chieh Tseng 2025-10-28 18:09:49 +08:00
  • fe0fd5b494 Update manual-build.md Shang Chieh Tseng 2025-10-28 17:20:03 +08:00
  • e6e91af024 Separate NVIDIA driver and CUDA toolkit installation steps Shang Chieh Tseng 2025-10-28 16:55:38 +08:00
  • 35c4d078f7 Fix step reference in troubleshooting: GCC 10 is Step 1, not Step 5 Shang Chieh Tseng 2025-10-28 15:56:49 +08:00
  • 417b451af1 Add system compiler symlink updates to use GCC 10 by default Shang Chieh Tseng 2025-10-28 15:53:49 +08:00
  • c788de5f8b Fix GCC 10 dynamic linker config to include both /usr/lib64 and /usr/local/lib64 Shang Chieh Tseng 2025-10-28 15:51:41 +08:00
  • e549dcb710 Reorganize installation steps: Move GCC 10 to Step 1 before kernel compilation Shang Chieh Tseng 2025-10-28 15:35:10 +08:00
  • 29706d14d7 Consolidate GCC 10 installation steps into single script format Shang Chieh Tseng 2025-10-28 15:28:11 +08:00
  • 85d98064d1 Fix kernel config copy path to use /usr/src/kernels for Rocky Linux 9 Shang Chieh Tseng 2025-10-28 15:27:40 +08:00
  • 8dc4ca7ccc Reorganize Docker build infrastructure for better maintainability Shang Chieh Tseng 2025-10-28 14:47:39 +08:00
  • 736cbdf52a Remove unused file. Shang Chieh Tseng 2025-10-22 22:35:41 +08:00
  • 3364327801 Implement single GPU preference for multi-GPU model loading [branch: feature/single-gpu-preference] Shang Chieh Tseng 2025-09-15 22:54:49 +08:00
  • 29cb9d3a27 Remove GitHub Actions workflows from fork Shang Chieh Tseng 2025-08-11 19:22:12 +08:00
  • c61e0ce554 Update README.md for v1.4.0: GPT-OSS support and Tesla K80 memory improvements [tag: v1.4.0] Shang Chieh Tseng 2025-08-10 01:42:38 +08:00
  • 08f38b19ea Fix Tesla K80 multi-GPU model switching deadlocks and silent failures Shang Chieh Tseng 2025-08-10 01:30:10 +08:00
  • 46213c5880 Fix Tesla K80 VMM pool crash by aligning to granularity Shang Chieh Tseng 2025-08-08 17:48:31 +08:00
  • e4113f080a Merge upstream ollama/ollama with Tesla K80 support preserved Shang Chieh Tseng 2025-08-08 15:17:24 +08:00
  • 2be9575694 Fix BF16 compatibility for Tesla K80 (Compute Capability 3.7) Shang Chieh Tseng 2025-08-08 15:15:49 +08:00
  • 83973336d6 Optimize Docker build performance with parallel compilation Shang Chieh Tseng 2025-08-08 11:44:59 +08:00
  • 0cd81c838a Merge upstream ollama/ollama main branch while preserving CUDA 3.7 support Shang Chieh Tseng 2025-08-08 10:43:29 +08:00
  • 114c3f2265 tests: add integration coverage for oss-gpt (#11696) Daniel Hiltgen 2025-08-07 15:06:57 -07:00
  • f2e9c9aff5 server: Reduce gpt-oss context length for small VRAM GPUs Jesse Gross 2025-08-07 13:49:26 -07:00
  • aa9d889522 Merge pull request #11765 from ollama/drifkin/thinking-without-content Devon Rifkin 2025-08-06 19:02:23 -07:00
  • 735c41f9ca openai: always provide reasoning Devon Rifkin 2025-08-06 18:54:20 -07:00
  • 223a619468 Merge pull request #11761 from ollama/drifkin/openai-tool-names Devon Rifkin 2025-08-06 17:53:25 -07:00
  • 759dd78dd6 openai: when converting role=tool messages, propagate the tool name Devon Rifkin 2025-08-06 17:00:24 -07:00
  • 44bc36d063 docs: update the faq (#11760) Patrick Devine 2025-08-06 16:55:57 -07:00
  • 8f14e1f5f6 Merge pull request #11759 from ollama/drifkin/oai-tool-calling Devon Rifkin 2025-08-06 16:11:31 -07:00
  • 203c137810 openai: allow for content _and_ tool calls in the same message Devon Rifkin 2025-08-06 15:50:30 -07:00
  • fa8be9e35c clean up debugging (#11756) Daniel Hiltgen 2025-08-06 13:31:22 -07:00
  • 8a75e9ee15 Update downloading to pulling in api.md (#11170) Gao feng 2025-08-07 02:33:09 +08:00
  • 4742e12c23 docs: update turbo model name (#11707) Parth Sareen 2025-08-05 17:29:08 -07:00
  • 2d06977ade Merge pull request #11705 from ollama/drifkin/fn-schema Devon Rifkin 2025-08-05 17:02:42 -07:00
  • 30f8a68c4c tools: support anyOf types Devon Rifkin 2025-08-05 16:46:24 -07:00
  • e378e33421 win: static link msvc libs (#11612) Daniel Hiltgen 2025-08-05 16:10:42 -07:00
  • fcec04bf42 gptoss: fix memory calc (#11700) Michael Yang 2025-08-05 15:56:12 -07:00
  • ee92ca3e1d docs: add docs for Ollama Turbo (#11687) Jeffrey Morgan 2025-08-05 13:09:10 -07:00
  • 8253ad4d2b ggml: Prevent kv cache quantization on gpt-oss Jesse Gross 2025-08-05 12:42:07 -07:00
  • fa7776fd24 gpt-oss (#11672) Michael Yang 2025-08-05 12:21:16 -07:00
  • 0d38b66502 kvcache: Log contents of cache when unable to find a slot Jesse Gross 2025-08-04 16:44:23 -07:00
  • 4183bb0574 kvcache: Enable SWA to retain additional entries Jesse Gross 2025-07-30 14:42:57 -07:00
  • ff89ba90bc fixing broken AMD driver link (#11579) Sajal Kulshreshtha 2025-07-31 00:32:54 +05:30
  • 6dcc5dfb9c Revert "CI: switch back to x86 macos builder" (#11588) Daniel Hiltgen 2025-07-30 08:56:01 -07:00
  • 25911a6e6b mac: disable bf16 on unsupported OS versions (#11585) Daniel Hiltgen 2025-07-30 08:50:54 -07:00
  • 8afa6e83f2 CI: switch back to x86 macos builder (#11572) Daniel Hiltgen 2025-07-29 16:41:25 -07:00
  • ea85e27bbd Increase performance for Gemma3n models on NVGPUs by enabling CUDA Graph execution (#11525) Oliver Simons 2025-07-29 21:37:06 +02:00
  • c116a7523d kvcache: Don't shift empty batches Jesse Gross 2025-07-28 11:29:25 -07:00