mirror of
https://github.com/dogkeeper886/ollama-k80-lab.git
synced 2025-12-09 23:37:07 +00:00
Update documentation for v1.3.0 release
- Add v1.3.0 release notes with new model support (Qwen2.5-VL, Qwen3 Dense & Sparse, improved MLLama)
- Update both main README.md and ollama37/README.md for consistency
- Add CLAUDE.md for future Claude Code instances
- Enhanced Docker Hub documentation

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
CLAUDE.md (new file, 92 lines)
@@ -0,0 +1,92 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is a laboratory for running Ollama (a local LLM runner) on NVIDIA K80 GPUs with custom Docker builds optimized for CUDA 11.4 compatibility. The project focuses on LLM-powered workflow automation for software quality assurance, integrating with tools like Dify, the VS Code Continue plugin, N8N, and auto-webui.

## Docker Commands

### Running Ollama

```bash
# Pull and run the custom K80-optimized Ollama image
docker pull dogkeeper886/ollama37
docker run --runtime=nvidia --gpus all -p 11434:11434 dogkeeper886/ollama37

# Using docker-compose (recommended for persistent data)
cd ollama37/
docker-compose up -d

# Stop the service
docker-compose down
```
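
Once the container is up, a quick smoke test against the API confirms the server is reachable (a sketch; the model tag is an example, not a project default):

```bash
# List models known to the server (empty on a fresh install)
curl http://localhost:11434/api/tags

# Pull a model, then run a one-off, non-streaming generation
curl http://localhost:11434/api/pull -d '{"name": "qwen3:14b"}'
curl http://localhost:11434/api/generate -d '{"model": "qwen3:14b", "prompt": "Say hello.", "stream": false}'
```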

### Building Custom Images

```bash
# Build the builder image (contains CUDA 11.4, GCC 10, CMake, Go)
cd ollama37-builder/
docker build -t dogkeeper886/ollama37-builder .

# Build the runtime image
cd ollama37/
docker build -t dogkeeper886/ollama37 .
```
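
After building, it is worth checking that the runtime image can see the GPU (a sketch; it assumes the image's entrypoint can be overridden and that the NVIDIA container runtime injects `nvidia-smi`):

```bash
# Should list the Tesla K80(s) if the NVIDIA runtime is wired up correctly
docker run --rm --runtime=nvidia --gpus all --entrypoint nvidia-smi dogkeeper886/ollama37
```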

## Architecture

### Core Components

1. **ollama37-builder/**: Multi-stage Docker build environment
   - Rocky Linux 8 base with NVIDIA 470 drivers
   - CUDA 11.4 toolkit for K80 GPU compatibility
   - Custom-compiled GCC 10, CMake 4.0, Go 1.24.2
   - Environment setup scripts for proper library paths

2. **ollama37/**: Runtime Docker image
   - Compiled Ollama binary optimized for K80
   - Minimal runtime environment with required CUDA libraries
   - Exposes the Ollama API on port 11434
   - Persistent volume support for model storage

3. **dify/**: Workflow automation configurations
   - YAML workflow definitions for LLM-powered QA tasks
   - Python utilities for Atlassian/Jira integration (`format_jira_ticket.py`)
   - Workflow templates: BugBlitz, QualityQuest, ER2Test, etc.
   - Knowledge base with PDF documentation for various systems

4. **mcp-servers/**: Model Context Protocol integrations
   - Web browser MCP server for enhanced LLM capabilities

### Key Environment Variables

- `OLLAMA_HOST=0.0.0.0:11434` - API endpoint
- `LD_LIBRARY_PATH="/usr/local/lib64:/usr/local/cuda-11.4/lib64"` - CUDA libraries
- `NVIDIA_DRIVER_CAPABILITIES=compute,utility` - GPU capabilities
- `NVIDIA_VISIBLE_DEVICES=all` - GPU visibility
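
For illustration, here is how these variables would be wired together in a plain `docker run` invocation (a sketch; the image likely sets these already, so explicit flags are only needed to override its defaults):

```bash
docker run --runtime=nvidia --gpus all -p 11434:11434 \
  -e OLLAMA_HOST=0.0.0.0:11434 \
  -e LD_LIBRARY_PATH="/usr/local/lib64:/usr/local/cuda-11.4/lib64" \
  -e NVIDIA_DRIVER_CAPABILITIES=compute,utility \
  -e NVIDIA_VISIBLE_DEVICES=all \
  dogkeeper886/ollama37
```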

### Hardware Requirements

- NVIDIA K80 GPU
- NVIDIA Tesla K80 driver installed
- NVIDIA Container Runtime for Docker
- Sufficient storage for model downloads (models stored in `./volume/` when using docker-compose)

## Development Workflow

### Model Testing

The project supports running various LLM models optimized for K80:

- Qwen2.5-VL (multi-modal vision-language model)
- Qwen3 Dense & Sparse variants
- Improved MLLama models
- Gemma 3 12B
- Phi-4 Reasoning 14B
- DeepSeek-R1:32B
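
Any of these can be exercised interactively through the CLI inside the running container (a sketch; the container name and model tag are assumptions — check `docker ps` and the Ollama model library for exact values):

```bash
# Interactive chat with one of the tested models
docker exec -it ollama37 ollama run deepseek-r1:32b
```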

### Quality Assurance Integration

The Dify workflows enable automated processing of:

- Jira ticket to Markdown conversion
- Requirements analysis and test generation
- Documentation refinement
- Bug report processing

### Persistent Data

When using docker-compose, model data persists in the `./volume/` directory, which is mapped to `/root/.ollama` inside the container.
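
To confirm persistence end to end (a sketch; the compose service name is an assumption — check `docker-compose ps` for the real one):

```bash
# Pull a model through the compose service, then verify it landed on the host
docker-compose exec ollama37 ollama pull qwen3:14b
ls ./volume/models/
```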

README.md (13 lines changed)
@@ -20,6 +20,19 @@ This repository includes a customized version of Ollama, specifically optimized

### 📦 Version History

#### v1.3.0 (2025-07-01)

This release expands model support while maintaining full Tesla K80 compatibility:

**New Model Support:**
- **Qwen2.5-VL**: Multi-modal vision-language model for image understanding
- **Qwen3 Dense & Sparse**: Enhanced Qwen3 model variants
- **Improved MLLama**: Better support for Meta's LLaMA models

**Documentation Updates:**
- Updated installation guides for Tesla K80 compatibility
- Enhanced Docker Hub documentation with latest model information

#### v1.2.0 (2025-05-06)

This release introduces support for Qwen3 models, marking a significant step in our commitment to keeping the Tesla K80 current with leading open-source language models. Testing includes the successful execution of Gemma 3 12B, Phi-4 Reasoning 14B, and Qwen3 14B, ensuring compatibility with models expected to be widely used in May 2025.

ollama37/README.md
@@ -14,8 +14,11 @@ This setup ensures that users can start experimenting with AI models without the

## Features

- **GPU Acceleration**: Fully supports NVIDIA K80 GPUs to accelerate model computations.
- **Multi-Modal AI**: Supports vision-language models like Qwen2.5-VL for image understanding.
- **Advanced Reasoning**: Built-in thinking support for enhanced AI reasoning capabilities.
- **Pre-built Binary**: Contains the compiled Ollama binary for immediate use.
- **CUDA Libraries**: Includes the necessary CUDA libraries and drivers for GPU operations.
- **Enhanced Tool Support**: Improved tool calling and WebP image input support.
- **Environment Variables**: Configured to facilitate seamless interaction with the GPU and network settings.

## Usage
@@ -99,6 +102,19 @@ This will stop and remove the container, but the data stored in the `.ollama` di

## 📦 Version History

### v1.3.0 (2025-07-01)

This release expands model support while maintaining full Tesla K80 compatibility:

**New Model Support:**
- **Qwen2.5-VL**: Multi-modal vision-language model for image understanding
- **Qwen3 Dense & Sparse**: Enhanced Qwen3 model variants
- **Improved MLLama**: Better support for Meta's LLaMA models

**Documentation Updates:**
- Updated installation guides for Tesla K80 compatibility
- Enhanced Docker Hub documentation with latest model information

### v1.2.0 (2025-05-06)

This release introduces support for Qwen3 models, marking a significant step in our commitment to keeping the Tesla K80 current with leading open-source language models. Testing includes the successful execution of Gemma 3 12B, Phi-4 Reasoning 14B, and Qwen3 14B, ensuring compatibility with models expected to be widely used in May 2025.