docs: update examples to use llama3.1 (#6718)

Jeffrey Morgan
2024-09-09 22:47:16 -07:00
committed by GitHub
parent 4a8069f9c4
commit 83a9b5271a
6 changed files with 45 additions and 45 deletions

@@ -32,7 +32,7 @@ When using the API, specify the `num_ctx` parameter:
```shell
curl http://localhost:11434/api/generate -d '{
"model": "llama3",
"model": "llama3.1",
"prompt": "Why is the sky blue?",
"options": {
"num_ctx": 4096
@@ -247,12 +247,12 @@ The `keep_alive` parameter can be set to:
For example, to preload a model and leave it in memory use:
```shell
curl http://localhost:11434/api/generate -d '{"model": "llama3", "keep_alive": -1}'
curl http://localhost:11434/api/generate -d '{"model": "llama3.1", "keep_alive": -1}'
```
To unload the model and free up memory use:
```shell
curl http://localhost:11434/api/generate -d '{"model": "llama3", "keep_alive": 0}'
curl http://localhost:11434/api/generate -d '{"model": "llama3.1", "keep_alive": 0}'
```
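The same pattern can be sketched for the `/api/chat` endpoint; this is illustrative only and assumes `/api/chat` honors the `keep_alive` field the same way `/api/generate` does:
```shell
# Illustrative sketch (assumption): preload a model via the chat endpoint and keep it loaded indefinitely
curl http://localhost:11434/api/chat -d '{"model": "llama3.1", "keep_alive": -1}'

# Illustrative sketch (assumption): unload the model via the chat endpoint
curl http://localhost:11434/api/chat -d '{"model": "llama3.1", "keep_alive": 0}'
```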
Alternatively, you can change how long all models stay loaded in memory by setting the `OLLAMA_KEEP_ALIVE` environment variable when starting the Ollama server. The `OLLAMA_KEEP_ALIVE` variable accepts the same types of values as the `keep_alive` parameter mentioned above. Refer to the section explaining [how to configure the Ollama server](#how-do-i-configure-ollama-server) to correctly set the environment variable.
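As a minimal sketch, assuming the variable accepts a duration string in the same format as the `keep_alive` parameter, the server could be started like this:
```shell
# Minimal sketch: keep all loaded models in memory for one hour of inactivity
OLLAMA_KEEP_ALIVE=1h ollama serve
```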