In modern AI engineering, getting a response from an LLM is easy—but getting it in a format your application can actually read, like JSON, is often a challenge. Whether you’re building a dashboard, a data pipeline, or a structured agent, the ability to receive structured data enables a more reliable and automated experience.
But what happens if the model adds conversational filler or Markdown formatting? How do you ensure your data is always valid?
In this post, I’ll explain how to get structured JSON from Ollama and, more importantly, how to enforce a specific schema so your data is perfectly formatted every time.
Why Use JSON Responses in Ollama?
Raw text responses are great for humans, but they’re difficult for code to parse. By forcing the model to output JSON, you eliminate the need for fragile regular expressions (regex) and allow your system to directly consume the AI’s output as a Python dictionary or a JavaScript object.
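As a quick illustration, a valid JSON reply drops straight into json.loads, no regex required (the raw string below is just a stand-in for a model reply):

import json

# Stand-in for a JSON-mode reply from the model
raw = '{"name": "Canada", "capital": "Ottawa"}'

data = json.loads(raw)   # parse into a plain Python dictionary
print(data["capital"])   # -> Ottawa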
Method 1: The Simple “JSON Mode”
The quickest way to get structured data is to tell Ollama that you want JSON. By setting the format parameter to "json", you instruct the model to return its answer as a single JSON object.
Example query and response
Query:
messages=[{'role': 'user', 'content': 'Tell me about Canada in JSON.'}],
format="json"
Response:
{
  "name": "Canada",
  "capitalCity": "Ottawa",
  "officialLanguage": "English and French",
  "continent": "North America"
}
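If you want to try this yourself, here is a minimal, runnable sketch of that call using the official ollama Python client (llama3.1:latest is simply the model used later in this post; any locally pulled model works):

import ollama

client = ollama.Client()
resp = client.chat(
    model='llama3.1:latest',
    messages=[{'role': 'user', 'content': 'Tell me about Canada in JSON.'}],
    format="json"  # the entire reply will be one JSON object
)
print(resp.message.content)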
While this approach is simple, the model still decides the field names (for example, capitalCity vs. capital). If you need a specific structure, you need a schema.
Method 2: Enforcing a JSON Schema
If your data must follow a strict structure, you can provide a JSON Schema directly to the format field. This is the most robust approach because it constrains the model to generate only the fields you define.
Example query and response
Query:
messages=[{'role': 'user', 'content': 'Tell me about Canada.'}],
format=schema
Response:
{
  "name": "Canada",
  "capital": "Ottawa",
  "languages": ["English", "French"]
}
Implementation code
import ollama

def test():
    # Create a client
    client = ollama.Client()

    # Define the schema
    schema = {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "capital": {"type": "string"},
            "languages": {
                "type": "array",
                "items": {"type": "string"}
            }
        },
        "required": ["name", "capital", "languages"]
    }

    print("Testing with schema in 'format'...")
    try:
        # Send request to Ollama, constraining output to the schema
        resp = client.chat(
            model='llama3.1:latest',
            messages=[{'role': 'user', 'content': 'Tell me about Canada.'}],
            format=schema
        )
        print("Success!")
        print(resp.message.content)
    except Exception as e:
        print(f"Failed with schema: {e}")

    print("\nTesting with 'json' in 'format'...")
    try:
        # Same request, but with plain JSON mode
        resp = client.chat(
            model='llama3.1:latest',
            messages=[{'role': 'user', 'content': 'Tell me about Canada in JSON.'}],
            format="json"
        )
        print("Success!")
        print(resp.message.content)
    except Exception as e:
        print(f"Failed with 'json': {e}")

if __name__ == "__main__":
    # Call the function
    test()
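As a side note, if you already describe your data with Pydantic models, you don't have to hand-write the schema dictionary. A minimal sketch, assuming Pydantic v2 (the Country model here is purely illustrative):

from pydantic import BaseModel

class Country(BaseModel):
    name: str
    capital: str
    languages: list[str]

# model_json_schema() produces a standard JSON Schema dict,
# equivalent to the hand-written schema above
schema = Country.model_json_schema()

You can then pass this schema to format exactly as before and, if you like, validate the reply afterwards with Country.model_validate_json(resp.message.content).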
What Happens Behind the Scenes?
When you run this code, Ollama ensures the model doesn’t preface its answer with “Sure! Here is the info.” It goes straight to the data. Here is the actual output from our test:
Testing with schema in 'format'...
{
  "name": "Canada",
  "capital": "Ottawa",
  "languages": ["English", "French", "Other (Indigenous)"]
}
Testing with 'json' in 'format'...
{
  "name": "Canada",
  "capitalCity": "Ottawa",
  "region": "North America",
  "landArea": 9984670,
  "population": 37742154,
  "language": ["English", "French"],
  "currency": "Canadian dollar (CAD)",
  "government": "Parliamentary democracy and constitutional monarchy",
  "climateZone": ["Maritime", "Continental", "Arctic"],
  "interestingFacts": [
    {"description": "Home to the world's longest road, the Trans-Canada Highway"},
    {"description": "Has 10 provinces and three territories"}
  ]
}
Step-by-Step: How to Get Reliable JSON
- Step 1: Define Your Schema
Identify exactly which fields you need. If you need a list of ratings, define that field as an array. If you need an explanation, define it as a string.
- Step 2: Pass the Schema to the Format Field
Instead of passing the string “json”, pass your schema dictionary directly into the format parameter.
- Step 3: Parse the Result
Once you receive the response, you can safely parse it, confident that it matches your expected structure. A short sketch of this step follows below.
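To make Step 3 concrete, here is a sketch that parses and double-checks the reply from the implementation code above (the jsonschema package is an optional, third-party safety net, not part of Ollama itself):

import json
from jsonschema import validate  # optional: pip install jsonschema

# resp and schema come from the implementation code above
data = json.loads(resp.message.content)  # parse the reply into a dict
validate(instance=data, schema=schema)   # raises ValidationError if the shape drifts
print(data["capital"])                   # -> Ottawa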
Conclusion
Getting JSON from Ollama is a straightforward process that significantly improves the reliability of your AI applications. By defining and enforcing a schema at generation time, you ensure your data is structured, predictable, and easy to consume.
While simple JSON mode works well for quick experiments, enforcing a strict schema is the pro move for production-grade AI engineering. It’s one of the best ways to keep your data clean, your code robust, and your systems reliable.
Akash Gupta
Senior VoIP Engineer and AI Enthusiast

AI and VoIP Blog
Thank you for visiting the blog. Hit the subscribe button to receive the next post right in your inbox. If you find this article helpful, don't forget to share your feedback in the comments and hit the like button. This helps me know which topics resonate with you, so I can create more content that keeps you informed.
Thank you for reading, and stay tuned for more insights and guides!
