Michael Yang
adff143bcd
fix: mllama quality (#10807)
* fix mllama convert
- transform attn_gate and ffn_gate
- swap attention heads for vision models
* fix mllama
- apply the mlp gate in the correct place
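The "swap attention heads" step likely refers to the interleaved-rope head permutation commonly applied when converting Hugging Face checkpoints to GGUF tensors: within each attention head, the two rope halves are interleaved by reordering rows of the q/k projection weights. A minimal sketch of that transform (the function name and dimensions here are illustrative assumptions, not taken from the commit):

```python
import numpy as np

def permute_heads(w: np.ndarray, n_head: int) -> np.ndarray:
    """Reorder rows of an attention projection so the two rope halves
    are interleaved per head (HF layout -> GGUF-style layout).

    w: weight matrix of shape (n_head * head_dim, hidden_size)
    """
    head_dim = w.shape[0] // n_head
    return (
        w.reshape(n_head, 2, head_dim // 2, *w.shape[1:])
         .swapaxes(1, 2)
         .reshape(w.shape)
    )

# Example: 2 heads, head_dim = 4 -- rows are reordered within each head,
# and applying the transform twice restores the original matrix.
w = np.arange(8 * 4).reshape(8, 4)
p = permute_heads(w, n_head=2)
```

With these dimensions the permutation is its own inverse, which is a convenient sanity check when debugging a conversion pipeline.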
2025-05-22 11:30:49 -07:00