mirror of https://github.com/dogkeeper886/ollama37.git synced 2025-12-09 23:37:06 +00:00

Shang Chieh Tseng 7c029749bc docs: restructure README and create comprehensive manual build guide

- Restructure README.md for better readability and organization
- Reduce README word count by 75% while maintaining key information
- Move detailed installation guides to docs/manual-build.md
- Add Tesla K80-specific build instructions and optimizations
- Update CLAUDE.md with new documentation structure and references
- Improve title formatting with emoji and clear tagline

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-07-20 09:11:43 +08:00

Ollama37 🚀

Tesla K80 Compatible Ollama Fork

Run modern LLMs on NVIDIA Tesla K80 and other CUDA Compute Capability 3.7 GPUs. While official Ollama dropped legacy GPU support, Ollama37 keeps your Tesla K80 hardware functional with the latest models and features.

Key Features

⚡ Tesla K80 Support - Full compatibility with CUDA Compute Capability 3.7
🔄 Always Current - Synced with upstream Ollama for latest models and fixes
🛠️ Optimized Build - CUDA 11 toolchain for maximum legacy GPU compatibility
💰 Cost Effective - Leverage existing hardware without expensive upgrades

Quick Start

Docker (Recommended)

# Pull and run
docker pull dogkeeper886/ollama37
docker run --runtime=nvidia --gpus all -p 11434:11434 dogkeeper886/ollama37
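The container may take a few seconds before the API answers on port 11434. A minimal sketch of a readiness check follows, assuming `curl` is installed and that the server exposes the upstream `/api/version` endpoint; `wait_for` is a hypothetical helper name, not something shipped with Ollama37 or Docker.

```shell
# Retry a probe command until it succeeds or the attempts run out.
# Usage: wait_for ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# (wait_for is a hypothetical helper, not part of Ollama or Docker.)
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # Run the probe quietly; success ends the wait.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example call (commented out so that sourcing this file has no side effects):
# wait_for 30 2 curl -fsS http://localhost:11434/api/version && echo "Ollama is up"
```

Passing the probe as arguments keeps the retry loop independent of `curl`, so the same helper could wrap `docker exec` or any other check.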

Docker Compose

services:
  ollama:
    image: dogkeeper886/ollama37
    ports: ["11434:11434"]
    volumes: ["./.ollama:/root/.ollama"]
    runtime: nvidia
    restart: unless-stopped

docker-compose up -d

Usage

Run Your First Model

# Download and run a model
ollama pull llama3.2
ollama run llama3.2 "Why is the sky blue?"

# Interactive chat
ollama run gemma3

Supported Models

All models from ollama.com/library including Llama 3.2, Gemma 3, Qwen 2.5, Phi-4, and Code Llama.

REST API

# Generate response
curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "prompt": "Hello Tesla K80!"}'

# Chat
curl http://localhost:11434/api/chat -d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hello!"}]}'
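In upstream Ollama, both endpoints stream their reply as a sequence of JSON objects unless the request sets `"stream": false`, and this fork is assumed to behave the same way. The sketch below builds a non-streaming request body; `generate_payload` is a hypothetical helper and does no JSON escaping, so it only suits prompts without quotes or backslashes.

```shell
# Print a request body for /api/generate that asks for one JSON reply
# instead of a stream. Hypothetical helper; inputs are NOT JSON-escaped.
#   $1: model name
#   $2: prompt text (must not contain double quotes or backslashes)
generate_payload() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# Example call (commented out; requires a running server):
# curl -s http://localhost:11434/api/generate -d "$(generate_payload llama3.2 'Hello Tesla K80')"
```

For arbitrary prompts, a JSON-aware tool is a safer way to build the body than `printf`.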

Technical Details

Tesla K80 Support

Compute Capability 3.7 Support: Maintained via CMAKE_CUDA_ARCHITECTURES "37;50;61;70;75;80"
CUDA 11 Toolchain: Compatible with legacy GPUs (CUDA 12 dropped Compute Capability 3.7 support)
Optimized Builds: Tesla K80-specific performance tuning
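The architecture list writes each compute capability without the dot, so `37` stands for 3.7. As a rough illustration of reading that list, the sketch below checks whether a given capability is among the build targets; `arch_supported` is a hypothetical helper, not part of the build system.

```shell
# Return success if compute capability $1 (e.g. "37" for 3.7) appears in the
# semicolon-separated architecture list $2. Hypothetical helper.
arch_supported() {
  # Wrap both sides in ";" so that "7" does not match "37" or "70".
  case ";$2;" in
    *";$1;"*) return 0 ;;
    *) return 1 ;;
  esac
}

ARCHS="37;50;61;70;75;80"   # the value quoted in this README

for cc in 37 35 80; do
  if arch_supported "$cc" "$ARCHS"; then
    echo "sm_$cc: targeted"
  else
    echo "sm_$cc: not targeted"
  fi
done
# prints:
#   sm_37: targeted
#   sm_35: not targeted
#   sm_80: targeted
```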

Recent Updates

v1.3.0 (2025-07-19): Added Gemma 3, Qwen2.5VL, latest upstream sync
v1.2.0 (2025-05-06): Qwen3, Gemma 3 12B, Phi-4 14B support

Building from Source

Docker Build

docker build -f ollama37.Dockerfile -t ollama37 .

Manual Build

For detailed manual compilation instructions, including CUDA 11.4, GCC 10, and CMake setup, see our Manual Build Guide (docs/manual-build.md).

Contributing

Found an issue or want to contribute? Check our GitHub issues or submit Tesla K80-specific bug reports and compatibility fixes.

License

Same license as upstream Ollama. See LICENSE file for details.

Advanced Usage

Custom Models

# Import a GGUF model (the Modelfile's FROM line must point to the .gguf file)
ollama create custom-model -f Modelfile

# Customize existing model
echo 'FROM llama3.2
PARAMETER temperature 0.8
SYSTEM "You are a helpful Tesla K80 expert."' > Modelfile
ollama create tesla-expert -f Modelfile
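When several variants of a model are needed, the Modelfile can be generated from parameters. The sketch below is one possible approach, covering only the three instructions used above; `write_modelfile` is a hypothetical helper, and it assumes the system prompt contains no double quotes.

```shell
# Write a three-line Modelfile. Hypothetical helper, not part of Ollama.
#   $1: base model        (e.g. llama3.2)
#   $2: temperature       (e.g. 0.8)
#   $3: system prompt     (must not contain double quotes)
#   $4: output file path
write_modelfile() {
  cat > "$4" <<EOF
FROM $1
PARAMETER temperature $2
SYSTEM "$3"
EOF
}

# Example call (commented out; the second line requires Ollama to be installed):
# write_modelfile llama3.2 0.8 "You are a helpful Tesla K80 expert." Modelfile
# ollama create tesla-expert -f Modelfile
```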

CLI Commands

ollama list              # List models
ollama show llama3.2     # Model info
ollama ps                # Running models
ollama stop llama3.2     # Stop model
ollama serve             # Start server
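These commands compose in scripts. The sketch below pulls a model only when it is missing, assuming `ollama list` prints a header row followed by one row per model with the `name:tag` in the first column; `model_installed` is a hypothetical helper.

```shell
# Succeed if the model named in $1 appears in `ollama list` output on stdin.
# Hypothetical helper. Assumes a header row, then "name:tag" in column 1.
model_installed() {
  want=$1
  # A bare name is treated as the ":latest" tag.
  case "$want" in
    *:*) ;;
    *) want="$want:latest" ;;
  esac
  awk -v want="$want" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# Example call (commented out; requires Ollama to be installed):
# ollama list | model_installed llama3.2 || ollama pull llama3.2
```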

Libraries & Community

ollama-python | ollama-js
Discord | Reddit

See the API documentation for the complete REST API reference.
