backend: Support graph computation that does not return an output
There are two cases where we may not have an output after computing:
- Prompt processing where the length of the input exceeds the batch size
- Internal memory management operations such as cache defrag and shift
@@ -49,7 +49,7 @@ type Context interface {
 	FromIntSlice(s []int32, shape ...int) (Tensor, error)
 
 	Forward(Tensor)
-	Compute(Tensor) Tensor
+	Compute(...Tensor)
 	Close()
 }
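To make the change concrete, here is a minimal Go sketch of how a caller might drive the new variadic Compute signature. The Tensor and Context interfaces are reduced from the diff above; the processBatch helper and its logits/wantOutput parameters are hypothetical, illustrating the two cases from the commit message rather than showing the actual call sites in the repository.

```go
package example

// Tensor and Context mirror the interfaces shown in the diff above,
// reduced to only the methods needed for this sketch.
type Tensor interface{}

type Context interface {
	Forward(Tensor)
	Compute(...Tensor)
	Close()
}

// processBatch is a hypothetical helper showing how a caller might use
// the variadic Compute. When the batch produces no output that needs to
// be read back (prompt processing past the batch size, or a cache
// defrag/shift pass), Compute is called with no tensors; otherwise the
// output tensors are passed so their results are materialized.
func processBatch(ctx Context, logits Tensor, wantOutput bool) {
	ctx.Forward(logits) // build/schedule the graph that produces logits

	if wantOutput {
		ctx.Compute(logits) // compute the graph and read back the output tensor
	} else {
		ctx.Compute() // compute for side effects only; nothing is returned
	}
}
```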