TIL: Long, repetitive field names can make Gemini structured output brittle

Recently I ran into a weird issue while working with structured output.

The schema was valid. Gemini saw it, accepted it, and used it as the source of truth. But once the schema got bigger (more fields, more nesting, more long, repetitive property names), extraction started falling apart. Shorter version of the same schema? Worked much better.
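One way to try a shorter variant without touching downstream code is to alias the long property names before sending the schema, then map the aliases back on the parsed response. This is a minimal sketch in plain Python, not part of any Gemini SDK. The helper names (`shorten_schema`, `restore_names`) and the example field names are made up for illustration. It assumes a schema built only from `properties`, `required`, and `items`, and it applies aliases globally by name.

```python
from typing import Any


def shorten_schema(schema: dict[str, Any], aliases: dict[str, str]) -> dict[str, Any]:
    """Return a copy of `schema` with property names replaced by short aliases."""
    out = dict(schema)
    if "properties" in schema:
        out["properties"] = {
            aliases.get(name, name): shorten_schema(sub, aliases)
            for name, sub in schema["properties"].items()
        }
    if "required" in schema:
        out["required"] = [aliases.get(name, name) for name in schema["required"]]
    # Only the single-schema form of `items` is handled in this sketch.
    if isinstance(schema.get("items"), dict):
        out["items"] = shorten_schema(schema["items"], aliases)
    return out


def restore_names(data: Any, aliases: dict[str, str]) -> Any:
    """Map short aliases in a parsed response back to the original long names."""
    reverse = {short: long for long, short in aliases.items()}
    if isinstance(data, dict):
        return {reverse.get(k, k): restore_names(v, aliases) for k, v in data.items()}
    if isinstance(data, list):
        return [restore_names(v, aliases) for v in data]
    return data


# Hypothetical long names and their short aliases.
ALIASES = {
    "customer_billing_address_street_line_one": "bill_street",
    "customer_billing_address_postal_code": "bill_zip",
}

long_schema = {
    "type": "object",
    "properties": {
        "customer_billing_address_street_line_one": {"type": "string"},
        "customer_billing_address_postal_code": {"type": "string"},
    },
    "required": ["customer_billing_address_street_line_one"],
}

short_schema = shorten_schema(long_schema, ALIASES)
restored = restore_names({"bill_street": "1 Main St"}, ALIASES)
```

If the long names carried meaning the model needs, that meaning can move into each property's `description` instead of the key itself.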

OpenAI has a pretty good explanation for why this kind of thing can happen. In their Structured Outputs docs, they describe converting JSON Schema into a context-free grammar (CFG) and constraining decoding so the model can only generate tokens that keep the output valid. So this is not just prompt engineering with better formatting. The schema shape itself affects generation.

Gemini also supports structured output via JSON Schema, although the Google docs I found do not really explain the exact decoding path.

My takeaway is simple: if structured output starts failing, don’t just stare at the prompt and blame the model. Look at the schema. Large, nested, repetitive schemas can become fragile even when they are technically valid.
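To make "look at the schema" a bit more concrete, here is a rough sketch that counts properties, nesting depth, and property-name length for a JSON Schema. The function name `schema_stats` and the choice of metrics are assumptions for illustration. Nothing here reflects documented Gemini limits, and the sketch only walks `properties` and `items`.

```python
from typing import Any


def schema_stats(schema: dict[str, Any], depth: int = 1) -> dict[str, int]:
    """Collect rough size metrics for a JSON Schema made of objects and arrays."""
    stats = {"properties": 0, "max_depth": depth, "name_chars": 0, "longest_name": 0}
    children = []
    for name, sub in schema.get("properties", {}).items():
        stats["properties"] += 1
        stats["name_chars"] += len(name)
        stats["longest_name"] = max(stats["longest_name"], len(name))
        children.append(sub)
    # Array element schemas count as one more nesting level.
    if isinstance(schema.get("items"), dict):
        children.append(schema["items"])
    for child in children:
        sub_stats = schema_stats(child, depth + 1)
        stats["properties"] += sub_stats["properties"]
        stats["name_chars"] += sub_stats["name_chars"]
        stats["longest_name"] = max(stats["longest_name"], sub_stats["longest_name"])
        stats["max_depth"] = max(stats["max_depth"], sub_stats["max_depth"])
    return stats


# Hypothetical schema used only to exercise the function.
example = {
    "type": "object",
    "properties": {
        "order_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "sku": {"type": "string"},
                    "quantity": {"type": "integer"},
                },
            },
        },
        "note": {"type": "string"},
    },
}

stats = schema_stats(example)
```

Comparing these numbers between a schema that works and one that fails is a cheap first check before rewriting the prompt.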

Structure matters. More than it seems at first.


Sources