ollama37

mirror of https://github.com/dogkeeper886/ollama37.git synced 2025-12-10 07:46:59 +00:00

Files

Shang Chieh Tseng d04ea50ced Fix gpt-oss model architecture to match GGUF tensor format

The gpt-oss model architecture code expected fused tensors (attn_qkv,
ffn_gate_up_exps) but the actual GGUF files contain separate tensors
(attn_q/k/v, ffn_gate_exps/up_exps), causing nil pointer panics during
model loading.
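
As a rough illustration of the failure mode described above, the sketch below looks tensors up by name and shows that a fused name absent from the file yields a nil pointer. The `Linear` type, the `missing` helper, and the map-based lookup are hypothetical stand-ins for illustration, not the repository's actual loader. Only the tensor names come from this commit message.

```go
package main

import "fmt"

// Linear is a hypothetical stand-in for a weight-bearing layer.
type Linear struct{ Scale float64 }

// Forward dereferences the receiver, so calling it on a nil *Linear panics,
// much like using a tensor that was never loaded.
func (l *Linear) Forward(x float64) float64 { return l.Scale * x }

// missing reports which of the wanted tensor names are absent from the file.
func missing(tensors map[string]*Linear, names ...string) []string {
	var out []string
	for _, n := range names {
		if tensors[n] == nil {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	// Assumed file contents: separate projections only, no fused attn_qkv.
	file := map[string]*Linear{
		"attn_q": {Scale: 1},
		"attn_k": {Scale: 2},
		"attn_v": {Scale: 3},
	}

	// The fused name is not in the file, so a field bound to it stays nil.
	fmt.Println("missing (fused):", missing(file, "attn_qkv"))

	// The separate names are all present, so each projection can be applied.
	fmt.Println("missing (separate):", missing(file, "attn_q", "attn_k", "attn_v"))
	q, k, v := file["attn_q"], file["attn_k"], file["attn_v"]
	fmt.Println(q.Forward(1), k.Forward(1), v.Forward(1))
}
```

Checking for missing names before the forward pass turns a panic into a reportable error, which is one possible design choice and not necessarily what this commit does.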

Changes:
- model/models/gptoss/model.go: Updated AttentionBlock to use separate
  Query/Key/Value fields instead of fused QKV, modified Forward() to
  compute projections separately
- model/models/gptoss/model.go: Updated MLPBlock to use separate Gate/Up
  fields instead of fused GateUp, simplified Forward() logic
- fs/ggml/type.go: Reorganized MXFP4 tensor type constant ordering
- ml/backend/ggml/ggml/include/ggml.h: Moved GGML_TYPE_MXFP4 to end of
  enum to match GGUF file format specification
- ml/backend/ggml/ggml/src/ggml.c: Updated type name array to match
  reordered enum
- CLAUDE.md: Documented gpt-oss model compatibility fix
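
To show why the enum ordering changes above matter, here is a minimal sketch assuming that a file stores each tensor's type as a numeric ID. The type names, values, and tables are hypothetical and do not reproduce the real GGML enum. The point is only that a reader whose table order differs from the writer's decodes the same ID as a different type.

```go
package main

import "fmt"

// TensorType is an illustrative numeric type ID as it might be stored in a file.
type TensorType uint32

// Hypothetical IDs; these are not the real GGML values.
const (
	TypeA TensorType = iota
	TypeB
	TypeC
)

// writerNames is the table the file writer is assumed to use.
var writerNames = []string{"A", "B", "C"}

// readerNames is a reader table that declares C before B.
var readerNames = []string{"A", "C", "B"}

// name maps a stored ID to a type name using the given table.
func name(table []string, t TensorType) string {
	if int(t) >= len(table) {
		return "unknown"
	}
	return table[t]
}

func main() {
	stored := TypeC // the ID written into the file
	fmt.Println("writer sees:", name(writerNames, stored))
	fmt.Println("reader sees:", name(readerNames, stored))
}
```

Keeping the constant order and the name array in sync with the file format avoids this kind of silent mismatch.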

Result: the gpt-oss:20b model now loads and runs successfully on a Tesla K80,
with all 25 layers offloaded to the GPU.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-10-29 23:34:03 +08:00

ggml_test.go

ggml: fix crash for array head counts

2025-04-27 11:38:06 -07:00

ggml.go

gptoss: fix memory calc (#11700)

2025-08-05 15:56:12 -07:00

gguf_test.go

gguf: fix write order (#11068)

2025-06-16 10:42:32 -07:00

gguf.go

add new gemma model (#11204)

2025-06-25 21:47:09 -07:00

type.go

Fix gpt-oss model architecture to match GGUF tensor format

2025-10-29 23:34:03 +08:00