
Data Contracts for AI Agents

The Consistency Problem

LLMs are probabilistic. If you ask GPT-4 to "extract the user's budget," it might return:

  • "$5,000"
  • "5000"
  • "Five thousand dollars"
  • "They have a budget of 5k"

This variance breaks downstream code. Your database expects a number.

Enter Pydantic & Zod

The solution is to enforce a data contract: a strict schema that defines exactly what the output must look like.

When building agents, we use libraries such as instructor (Python), or OpenAI's native Structured Outputs feature, to force the LLM to adhere to a JSON Schema.

{
  "type": "object",
  "properties": {
    "budget": {
      "type": "integer",
      "description": "The budget in USD, numbers only."
    },
    "intent": {
      "type": "string",
      "enum": ["buy", "inquire", "support"]
    }
  },
  "required": ["budget", "intent"]
}
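The schema above maps naturally onto a Pydantic model. Here is a minimal sketch (the model and field names are illustrative, not from a real codebase); instructor accepts such a model via its response_model argument and validates every LLM response against it:

```python
from enum import Enum
from pydantic import BaseModel, Field, ValidationError

class Intent(str, Enum):
    BUY = "buy"
    INQUIRE = "inquire"
    SUPPORT = "support"

class LeadExtraction(BaseModel):
    # Mirrors the JSON Schema: an integer budget and a constrained intent.
    budget: int = Field(description="The budget in USD, numbers only.")
    intent: Intent

# Conforming output validates cleanly:
lead = LeadExtraction(budget=5000, intent="buy")

# Non-conforming output (prose instead of a number) is rejected:
try:
    LeadExtraction(budget="five thousand dollars", intent="buy")
except ValidationError:
    print("rejected: budget must be an integer")
```

Because the contract lives in one typed model, downstream code can rely on lead.budget being an int rather than re-parsing free text.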

If the LLM's output doesn't match the schema, we reject it and automatically ask again. This validate-and-retry loop is what makes agents reliable.
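That retry loop can be sketched in a few lines. The call_llm function below is a hypothetical stand-in for a real model client, stubbed here so the first attempt returns malformed output and the second conforms:

```python
import json

def call_llm(prompt: str, attempt: int) -> str:
    # Hypothetical stub for a real LLM client: first attempt is
    # malformed ("5k" instead of an integer), second attempt conforms.
    if attempt == 0:
        return '{"budget": "5k", "intent": "buy"}'
    return '{"budget": 5000, "intent": "buy"}'

def validate(payload: dict) -> dict:
    # Minimal manual check against the contract's schema.
    if not isinstance(payload.get("budget"), int):
        raise ValueError("budget must be an integer")
    if payload.get("intent") not in {"buy", "inquire", "support"}:
        raise ValueError("intent must be one of: buy, inquire, support")
    return payload

def extract_with_retry(prompt: str, max_attempts: int = 3) -> dict:
    last_error = None
    for attempt in range(max_attempts):
        raw = call_llm(prompt, attempt)
        try:
            return validate(json.loads(raw))
        except (ValueError, json.JSONDecodeError) as err:
            # A real system would feed the error back into the prompt
            # so the model can self-correct on the next attempt.
            last_error = err
    raise RuntimeError(f"no valid output after {max_attempts} attempts: {last_error}")

result = extract_with_retry("extract the user's budget")
```

Libraries like instructor wrap exactly this pattern, re-prompting with the validation error until the output parses or a retry limit is hit.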

Anthony Hunter

Chief Systems Architect at Nexaira.