outcome — a structured label that describes what happened. There are five possible values.
The five outcomes
success
Everything worked exactly as expected. The agent completed the task without issues, confusion, or guesswork.
Use this when the tool performed correctly and the instructions were clear. Success reports are just as valuable as failures: they tell you what’s working.
Example: “Generated a 5-tab xlsx with pivot tables. All formulas working correctly. Instructions were very clear.”
failure
The tool returned an error, an unexpected result, or didn’t work at all.
Use this when something went technically wrong: a 4xx/5xx response, an empty result when data was expected, or output that clearly didn’t match the request.
Example: “search_index returned 0 results for ‘machine learning papers’ but the index has 2,000+ entries. Possible encoding issue with the query.”
confusing
The tool worked (or might have worked), but the instructions were unclear, ambiguous, or contradictory, leading to guesswork.
Use this when the agent had to guess at parameters, tried multiple approaches, or wasn’t sure which tool to use. The tool doesn’t need to have failed; confusion is its own category.
Example: “The skill says to ‘run the generator’ but there are 3 tools with ‘generate’ in the name. Unclear which to use for PDF output.”
gave_up
The agent could not complete the task and stopped trying.
Use this when the agent exhausted its options and gave up, whether due to repeated failures, total confusion, or hitting a dead end with no path forward.
Example: “Tried 4 different parameter combinations for the export tool. None produced valid output. Gave up and told the user I couldn’t complete the task.”
request
The agent wanted a feature or capability that doesn’t exist yet.
Use this when the tool works fine but is missing something that would make it significantly more useful. This is product feedback, not a bug report.
Example: “The export tool only supports PDF. A CSV option would make this much more useful for downstream processing.”
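Because outcome is limited to five values, it can help to validate the label before sending a report. The following is a minimal sketch: the five string values come from this page, while the `Outcome` class and the `parse_outcome` helper are hypothetical names used only for illustration.

```python
from enum import Enum


class Outcome(str, Enum):
    """The five outcome labels described above (class name is illustrative)."""
    SUCCESS = "success"
    FAILURE = "failure"
    CONFUSING = "confusing"
    GAVE_UP = "gave_up"
    REQUEST = "request"


def parse_outcome(value: str) -> Outcome:
    """Return the matching Outcome, or raise ValueError for an unknown label."""
    try:
        return Outcome(value)
    except ValueError:
        allowed = ", ".join(o.value for o in Outcome)
        raise ValueError(
            f"unknown outcome {value!r}; expected one of: {allowed}"
        ) from None
```

Calling `parse_outcome("gave_up")` returns `Outcome.GAVE_UP`, while a label outside the five values, such as `"error"`, raises `ValueError`.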
Choosing the right outcome
If multiple outcomes seem to apply, use the one that best describes the primary thing that happened:

| Situation | Outcome |
|---|---|
| Tool returned a 500 error | failure |
| Tool returned success but result was wrong | failure |
| Instructions were unclear but tool worked | confusing |
| Had to guess at a parameter | confusing |
| Tried everything, nothing worked | gave_up |
| Tool works but is missing a feature | request |
| Task completed without issues | success |
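The table can be sketched as a small decision function. This is only an illustration: the `Observation` fields and the `choose_outcome` name are hypothetical, and the precedence order (gave_up, then failure, then confusing, then request) is an assumption, since this page asks you to judge the primary thing that happened and does not define a strict ranking.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """What the agent noticed during a task (field names are illustrative)."""
    stopped_without_finishing: bool = False  # tried everything, nothing worked
    tool_errored: bool = False               # e.g. a 500 response
    result_was_wrong: bool = False           # tool reported success, output wrong
    instructions_unclear: bool = False
    guessed_parameter: bool = False
    missing_feature: bool = False


def choose_outcome(obs: Observation) -> str:
    """Map an observation to one outcome label, using an assumed precedence."""
    if obs.stopped_without_finishing:
        return "gave_up"
    if obs.tool_errored or obs.result_was_wrong:
        return "failure"
    if obs.instructions_unclear or obs.guessed_parameter:
        return "confusing"
    if obs.missing_feature:
        return "request"
    return "success"
```

For example, `choose_outcome(Observation(guessed_parameter=True))` returns `"confusing"`, matching the "Had to guess at a parameter" row.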
The details field
Always include a details string with your report. This is where the real value is: a natural-language description of what you tried, what happened, and why you chose this outcome.
Be specific:
- What input did you use?
- What did you expect to happen?
- What actually happened?
- Which tool or endpoint were you using?
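One way to make sure a details string answers all four questions is to assemble it from named parts. The sketch below is an assumption about shape, not the reporting API: only the `outcome` and `details` field names come from this page, and `build_details`, `build_report`, and the dictionary layout are hypothetical.

```python
import json


def build_details(tool: str, input_used: str, expected: str, actual: str) -> str:
    """Combine the four specifics into one natural-language details string."""
    return (
        f"Tool: {tool}. "
        f"Input: {input_used}. "
        f"Expected: {expected}. "
        f"Actual: {actual}."
    )


def build_report(outcome: str, details: str) -> dict:
    """Pair an outcome label with its details; reject an empty details string."""
    if not details.strip():
        raise ValueError("details must be a non-empty string")
    return {"outcome": outcome, "details": details}


# Example based on the search_index failure described earlier on this page.
report = build_report(
    "failure",
    build_details(
        tool="search_index",
        input_used="'machine learning papers'",
        expected="at least one result",
        actual="0 results",
    ),
)
print(json.dumps(report, indent=2))
```

Free-form prose works just as well for details; the helper only guards against leaving one of the four specifics out.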