Data Contracts for AI Agents
The Consistency Problem
LLMs are probabilistic. If you ask GPT-4 to "extract the user's budget," it might return:
- "$5,000"
- "5000"
- "Five thousand dollars"
- "They have a budget of 5k"
This variance breaks downstream code. Your database expects a number.
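A minimal sketch of why this breaks: naive parsing with `int()` (standing in for whatever your ingestion layer does) only survives one of the four outputs above. The `parse_budget` helper is illustrative, not from any library.

```python
# Hypothetical raw LLM outputs for "extract the user's budget"
raw_outputs = ["$5,000", "5000", "Five thousand dollars", "They have a budget of 5k"]

def parse_budget(raw: str) -> int:
    # The database column expects an integer; anything else blows up here
    return int(raw)

for raw in raw_outputs:
    try:
        print(parse_budget(raw))
    except ValueError:
        print(f"rejected: {raw!r}")
```

Only `"5000"` parses; the other three raise `ValueError` before they ever reach the database.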
Enter Pydantic & Zod
The fix is to enforce a Data Contract: a strict schema that defines exactly what the output must look like.
When building agents, we use libraries like instructor (Python) or OpenAI's structured outputs feature to force the LLM to adhere to a JSON schema.
```json
{
  "type": "object",
  "properties": {
    "budget": {
      "type": "integer",
      "description": "The budget in USD, numbers only."
    },
    "intent": {
      "type": "string",
      "enum": ["buy", "inquire", "support"]
    }
  },
  "required": ["budget", "intent"]
}
```
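The same contract can be expressed as a Pydantic model, which is what instructor uses under the hood. A minimal sketch, assuming Pydantic v2; the `UserRequest` class name is illustrative:

```python
from typing import Literal

from pydantic import BaseModel, Field

class UserRequest(BaseModel):
    # Mirrors the JSON schema: budget must be an integer, intent one of three values
    budget: int = Field(description="The budget in USD, numbers only.")
    intent: Literal["buy", "inquire", "support"]

# A well-formed payload parses; anything else raises ValidationError
req = UserRequest.model_validate({"budget": 5000, "intent": "buy"})
print(req.budget)
```

Zod plays the same role on the TypeScript side: the schema is both documentation and a runtime guard.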
If the LLM's output doesn't match the schema, we reject it and ask again automatically. This is the secret to reliable agents.
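That reject-and-retry loop can be sketched with plain stdlib code. This is a simplified stand-in for what instructor automates: `call_llm` is a hypothetical function you supply, and the hand-rolled `validate` replaces real schema validation.

```python
import json

ALLOWED_INTENTS = {"buy", "inquire", "support"}

def validate(payload: dict) -> bool:
    # Manual check mirroring the schema: integer budget, known intent
    if not isinstance(payload.get("budget"), int):
        return False
    return payload.get("intent") in ALLOWED_INTENTS

def extract_with_retry(call_llm, prompt: str, max_attempts: int = 3) -> dict:
    # call_llm is a hypothetical (prompt: str) -> str function
    feedback = ""
    for _ in range(max_attempts):
        raw = call_llm(prompt + feedback)
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            feedback = "\nYour last reply was not valid JSON. Try again."
            continue
        if validate(payload):
            return payload
        feedback = "\nYour last reply did not match the schema. Try again."
    raise ValueError("no schema-conformant output after retries")
```

Each failure is fed back into the prompt, so the model gets a chance to self-correct before we give up.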
Anthony Hunter
Chief Systems Architect at Nexaira.