Fix Claude validation response format parsing

The Claude AI validator was receiving detailed explanations with markdown
formatting (e.g., '**PASS**') instead of the expected plain 'PASS' or
'FAIL: <reason>' verdict.

Updated the validation prompt to explicitly require responses to start
with either 'PASS' or 'FAIL: <reason>' without any additional formatting,
explanations, or markdown before the verdict.

This fixes the 'Warning: Unexpected Claude response format' error that
was causing valid test results to be incorrectly marked as unclear.
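
For illustration only, here is a minimal sketch of how a verdict in this
format could be parsed. The names Verdict and parseVerdict are hypothetical
and are not taken from this repository; the actual parser in the validator
may differ. The sketch assumes only the first line of the reply matters and
treats anything that does not start with PASS or FAIL: as unclear, which is
why a reply such as '**PASS**' would trigger the warning.

```go
package main

import "strings"

// Verdict is a hypothetical result type used only in this sketch.
type Verdict int

const (
	VerdictUnclear Verdict = iota // reply did not start with PASS or FAIL:
	VerdictPass
	VerdictFail
)

// parseVerdict is a hypothetical helper, not the repository's code.
// It inspects the first line of the reply and returns the verdict plus
// the failure reason (or the unrecognized line when the verdict is unclear).
func parseVerdict(response string) (Verdict, string) {
	line := strings.TrimSpace(response)
	// Keep only the first line; later lines are ignored.
	if i := strings.IndexByte(line, '\n'); i >= 0 {
		line = strings.TrimSpace(line[:i])
	}
	switch {
	case strings.HasPrefix(line, "PASS"):
		return VerdictPass, ""
	case strings.HasPrefix(line, "FAIL:"):
		reason := strings.TrimSpace(strings.TrimPrefix(line, "FAIL:"))
		return VerdictFail, reason
	default:
		// Markdown such as "**PASS**" lands here because the line
		// starts with "*" rather than with the verdict word.
		return VerdictUnclear, line
	}
}
```

With a prefix check like this, 'FAIL: Response is gibberish' yields a failure
with the reason 'Response is gibberish', while '**PASS**' is reported as
unclear. Requiring the model to start with the bare verdict keeps the parser
simple instead of teaching it to strip markdown.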
Shang Chieh Tseng
2025-10-30 12:34:02 +08:00
parent c8b7015a2c
commit 46f1038724

@@ -228,11 +228,16 @@ func (v *Validator) validateWithClaude(prompt *PromptTest, simpleCheckPassed boo
 4. Appears to be from a working LLM model (not system errors or failures)
 5. Has reasonable quality for a 4B parameter model
-Respond with ONLY one of these formats:
+CRITICAL: Your response MUST start with exactly one of these two words:
 - "PASS" if the response is valid and acceptable
 - "FAIL: <brief reason>" if the response has issues
-Be concise. One line only.`)
+Example valid responses:
+- "PASS"
+- "FAIL: Response is gibberish"
+- "FAIL: Contains error messages instead of proper response"
+Do NOT include explanations, markdown formatting, or additional text before the verdict. Start your response with PASS or FAIL: only.`)
 // Write to temp file
 promptFile := filepath.Join(v.claudeTempDir, fmt.Sprintf("prompt_%d.txt", os.Getpid()))