mirror of
https://github.com/dogkeeper886/ollama37.git
synced 2025-12-19 12:17:02 +00:00
When truncating inputs to fit the context window at the beginning of a sequence, we remove the minimum number of inputs possible. However, this may leave the truncation point in the middle of a set of inputs that the model specified should not be split up. To avoid this, we also remove the rest of the partial batch.
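The idea can be sketched in Go as follows. This is a minimal illustration, not the code from this commit: the `input` type, its `sameBatch` field, and the `truncateStart` function are hypothetical names chosen for the example, and it assumes `numKeep` is smaller than `numCtx` and that groups do not nest.

```go
package main

import "fmt"

// input is a hypothetical stand-in for one prompt element.
// sameBatch > 0 means the next sameBatch inputs must stay together with this one.
type input struct {
	token     int
	sameBatch int
}

// truncateStart keeps the first numKeep inputs, then drops as few of the
// following inputs as possible so that at most numCtx remain. If the cut
// would land inside an unsplittable group, the rest of that group is
// dropped as well. Assumes numKeep < numCtx (an assumption of this sketch).
func truncateStart(inputs []input, numCtx, numKeep int) []input {
	if len(inputs) <= numCtx {
		return inputs
	}

	// Index of the first input retained after the kept prefix.
	start := numKeep + (len(inputs) - numCtx)

	// Look at every input before the cut. If one of them opens a group
	// that extends past the cut, move the cut to the end of that group.
	for i := 0; i < start && i < len(inputs); i++ {
		end := i + inputs[i].sameBatch + 1 // one past the group's last member
		if inputs[i].sameBatch > 0 && start < end {
			start = end
		}
	}
	if start > len(inputs) {
		start = len(inputs)
	}

	// Copy into a fresh slice so the caller's slice is not modified.
	out := make([]input, 0, numKeep+len(inputs)-start)
	out = append(out, inputs[:numKeep]...)
	out = append(out, inputs[start:]...)
	return out
}

func main() {
	inputs := make([]input, 10)
	for i := range inputs {
		inputs[i].token = i
	}
	inputs[3].sameBatch = 2 // inputs 3, 4 and 5 form one unsplittable group

	// With numCtx = 6 and numKeep = 1 the minimal cut would start at
	// index 5, which is inside the group, so index 5 is dropped too.
	for _, in := range truncateStart(inputs, 6, 1) {
		fmt.Print(in.token, " ")
	}
	fmt.Println()
}
```

Note that the result can be shorter than `numCtx`: in this sketch, keeping a group whole takes priority over filling the context window.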
7.5 KiB