Update README.md
@@ -1,45 +1,69 @@

# ollama-k80-lab

# ollama-k80-lab: Exploring Local LLMs & Workflow Automation

[License: MIT](https://opensource.org/licenses/MIT)

## Overview

This project explores running Ollama, a local LLM runner, with an NVIDIA K80 GPU. The goal is to assess performance, explore limitations, and demonstrate the potential of this combination for local LLM experimentation and deployment. We also investigate the usage of Dify for building LLM applications and the Model Context Protocol (MCP) with VS Code's Continue plugin for filesystem operations.

This project is a laboratory for experimenting with running Ollama, a local Large Language Model (LLM) runner, on NVIDIA K80 GPUs. While the K80 is an older card, the project aims to overcome its hardware limitations through custom compilation, demonstrating the potential for accessible and cost-effective LLM experimentation. Beyond basic execution, it explores leveraging LLMs to improve software quality assurance workflows, integrating with tools such as Dify, VS Code's 'Continue' plugin, N8N, and auto-webui. This is more than just getting Ollama running; it's about integrating LLMs into practical development workflows.

## Motivation

* **Local LLM Exploration:** Ollama simplifies running Large Language Models locally. This project leverages Ollama's ease of use with the power of a GPU.
* **K80 Utilization:** The NVIDIA K80, while an older GPU, remains a viable option for LLM inference, particularly for smaller to medium-sized models.
* **Dify Integration:** Dify provides a robust framework for building LLM applications (chatbots, agents, etc.). We aim to demonstrate seamless integration between Ollama and Dify for rapid prototyping and deployment.
* **Cost-Effective Experimentation:** Running LLMs locally avoids the costs associated with cloud-based APIs, enabling broader access and experimentation.

* **Democratizing Local LLM Access:** Ollama simplifies running LLMs locally, but compatibility can be a barrier. This project aims to remove those barriers, making LLM experimentation more accessible, even with older hardware.
* **K80 Hardware Utilization:** The NVIDIA K80 offers a viable option for LLM inference, especially for smaller to medium-sized models. This project seeks to maximize its utility.
* **LLM-Powered Software Quality Assurance:** This project investigates how LLMs can automate tedious software quality assurance tasks and improve overall efficiency.
* **Cost-Effective Experimentation:** Local LLM execution avoids the costs associated with cloud-based APIs, enabling wider access and experimentation for developers and researchers.
* **Career Development:** This project serves as a practical platform for developing skills in prompt engineering, LLM application development, and workflow automation within a software quality assurance context.

## Modified Version

## Customized Ollama Build

This repository includes a customized version of Ollama, specifically optimized for running on an NVIDIA K80 GPU. This build incorporates specific configurations to address potential compatibility issues with the K80 architecture. For more details, contributions, and the full build process, visit our GitHub page:

[ollama37](https://github.com/dogkeeper886/ollama37)

This repository includes a customized version of Ollama, specifically optimized for running on an NVIDIA K80 GPU. This involves compiling from source with specific configurations to address potential compatibility issues. For detailed build instructions, contributions, and the full build process, please visit: [ollama37](https://github.com/dogkeeper886/ollama37). An illustrative build-environment sketch follows the considerations below.

**Key Features of the Custom Build:**

**Key Build Considerations:**

* **CUDA 11.4 Support:** The build is configured to work with CUDA Toolkit 11.4, which is a common and well-supported version for the K80.
* **GCC 10 Compatibility:** We built GCC 10 from source to ensure compatibility with the Ollama build process.
* **CMake Manual Compilation:** CMake was compiled manually to avoid potential issues with pre-built binaries.
* **Go Installation:** The project includes instructions for installing Go, a key component of the Ollama build.

* **CUDA 11.4 Support:** The build is configured to work with CUDA Toolkit 11.4, a common and well-supported version for the K80.
* **GCC 10 Compatibility:** Built from source to ensure compatibility with the Ollama build process.
* **Manual CMake Compilation:** Compiled manually to avoid potential issues with pre-built binaries.
* **Go Installation:** Includes instructions for installing Go, a key component of the Ollama build.
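
The considerations above amount to a fairly conventional from-source build. The sketch below is illustrative only: the install paths, the `CUDAARCHS` value, and the final build commands are assumptions for a typical setup, and the authoritative, tested steps live in the [ollama37](https://github.com/dogkeeper886/ollama37) repository.

```bash
# Illustrative environment for building Ollama against a Tesla K80 (compute capability 3.7).
# Paths are assumptions -- adjust to where CUDA 11.4 and the locally built GCC 10 live.
export PATH=/usr/local/cuda-11.4/bin:$PATH   # CUDA Toolkit 11.4
export CC=/usr/local/bin/gcc-10              # GCC 10 built from source
export CXX=/usr/local/bin/g++-10
export CUDAARCHS=37                          # picked up by recent CMake as CMAKE_CUDA_ARCHITECTURES

git clone https://github.com/dogkeeper886/ollama37
cd ollama37
# The exact build invocation varies by Ollama version; follow the ollama37 README for this tree.
go generate ./...   # on older trees this compiles the CUDA runners via CMake
go build .
```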

## LLM-Powered Workflow Exploration

Beyond simply running Ollama, this project explores integrating LLMs into practical workflows. Here's a breakdown of the tools and techniques being investigated (a minimal sketch of how these tools reach the local Ollama API follows the list):

* **Dify Integration:** Leveraging Dify's platform for building LLM applications (chatbots, agents, workflows) and integrating them with Ollama.
* **VS Code 'Continue' Plugin & Model Context Protocol (MCP):** Investigating filesystem operations and data manipulation within LLM workflows using the 'Continue' plugin and the Model Context Protocol.
* **N8N Integration:** Exploring the use of N8N, a visual automation platform, to orchestrate LLM-powered quality assurance tasks.
* **auto-webui Usage:** Investigating the integration of LLMs into automated web UI testing and analysis pipelines.
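
All of these tools ultimately point at the same local Ollama HTTP endpoint (port 11434, as exposed in the Docker setup below). As a rough illustration of that shared integration point, the commands below list the available models and send a test prompt; the base URL and the `llama3` model name are placeholders for whatever you have running and pulled locally.

```bash
# Quick checks against the local Ollama API that Dify, 'Continue', and N8N would be configured with.
curl http://localhost:11434/api/tags          # list the models the integrations can use

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Summarize the failed test cases in one sentence.",
  "stream": false
}'
```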

## Setup and Running

**Prerquisites:**

**Prerequisites:**

* NVIDIA K80 GPU
* CUDA Toolkit 11.4
* GCC 10 (or later)
* Go (version compatible with Ollama - check Ollama documentation)
* CMake
* Git

**Steps:**

1. Clone the repository: `git clone https://github.com/dogkeeper886/ollama37`
2. Follow the instructions in the `ollama37` repository for building and installing Ollama.

1. **Pull the Docker Image:** To get the pre-built Ollama environment, pull the image from Docker Hub using this command:

```bash
docker pull dogkeeper886/ollama37/ollama-k80-lab
```

2. **Run the Docker Container:** Start the Ollama container with GPU support using the following command. This command also exposes Ollama on port 11434, which you'll use to interact with it. A short verification sketch follows the flag notes below.

```bash
docker run --runtime=nvidia --gpus all -p 11434:11434 dogkeeper886/ollama37/ollama-k80-lab
```

* `--runtime=nvidia`: Specifies that the container should use the NVIDIA runtime for GPU acceleration.
* `--gpus all`: Makes all available GPUs accessible to the container.
* `-p 11434:11434`: Maps port 11434 on the host machine to port 11434 inside the container.
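
Once the container is up, a quick sanity check can save time. The commands below are a hedged example rather than part of the official instructions: `ollama-k80` is a hypothetical container name (add `--name ollama-k80` to the run command to use it), and `llama3` stands in for any model small enough for the K80's per-GPU memory.

```bash
# Confirm the K80 is visible inside the container, then pull and run a first model.
docker exec -it ollama-k80 nvidia-smi
docker exec -it ollama-k80 ollama pull llama3   # placeholder model name
docker exec -it ollama-k80 ollama run llama3 "Hello from the K80 lab"

# Or talk to the API through the host port mapped above:
curl http://localhost:11434/api/version
```
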
For detailed build instructions and further customization, refer to the [GitHub repository](https://github.com/dogkeeper886/ollama-k80-lab/ollama37/README.md).
## Video Showcase