backend: Support graph computation that does not return an output

There are two cases where we may not have an output after computing:
 - Prompt processing where the length of the input exceeds the batch
   size
 - Internal memory management operations such as cache defrag and shift
Author: Jesse Gross
Date: 2025-02-03 19:35:12 -08:00
Committed by: Jesse Gross
Parent: 0e38297f87
Commit: 4d4463b2bd
3 changed files, 22 insertions(+), 14 deletions(-)


@@ -49,7 +49,7 @@ type Context interface {
 	FromIntSlice(s []int32, shape ...int) (Tensor, error)
 	Forward(Tensor)
-	Compute(Tensor) Tensor
+	Compute(...Tensor)
 	Close()
 }
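
As a rough illustration of the call-site impact (a minimal sketch in Go, not code from this commit; the Tensor and Context stubs and the runBatch helper are hypothetical stand-ins for Ollama's ml package types):

package sketch

// Minimal stand-ins mirroring the interface shown in the diff; the real
// definitions live in Ollama's ml package and carry more methods.
type Tensor interface{}

type Context interface {
	Forward(Tensor)
	Compute(...Tensor) // variadic: calling with no tensors runs the graph without returning an output
	Close()
}

// runBatch is a hypothetical helper showing how a caller might use the new
// signature: output tensors are passed only when a result is needed.
func runBatch(ctx Context, logits Tensor, wantOutput bool) {
	// Schedule the graph that produces logits.
	ctx.Forward(logits)

	if wantOutput {
		// Final prompt batch or a decode step: materialize the logits.
		ctx.Compute(logits)
	} else {
		// Intermediate prompt-processing batch (input longer than the batch
		// size) or internal memory management (cache shift/defrag): execute
		// the graph purely for its side effects.
		ctx.Compute()
	}
}

The variadic signature lets the same method cover both "compute and read back a result" and "compute only for side effects" without introducing a second entry point.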