Proxy Structuring Engine

Overview

Large Language Models (LLMs) demonstrate remarkable capabilities, but their non-deterministic nature presents challenges for production deployments. While these models excel at generating content, the structure of their responses is inconsistent and can deviate from required formats.

The Proxy Structuring Engine (PSE) solves this by steering the model at inference time. The structuring engine acts as a set of bumpers, keeping outputs on track without compromising the model's creativity or intention.

Use Cases

Advanced Agents & Chatbots - Deploy multi-step AI agents that perform reliably on every run.

Data Pipelines & APIs - Generate structured data and consistent API responses at scale.

Automated Code Generation - Generate code that adheres to strict specifications.

How It Works

Your Rules, Their Playground - The engine enforces structure boundaries while giving models creative freedom within those constraints.

Error Correction Built-In - Automatically repairs mid-generation mistakes before they compound.

Dynamic State Architecture - PSE uses layered state machines to guide generation in real time, ensuring structural validity while preserving the LLM's creative freedom. Unlike rigid templates, it adapts to the evolving context of generation.
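
To make the idea of state-machine-guided generation concrete, here is a minimal sketch of the general technique (constrained decoding by logit masking). It is not the PSE's implementation or API. The vocabulary, the `TRANSITIONS` table, `fake_model`, and every other name below are hypothetical, chosen only for illustration. At each step the state machine decides which tokens are structurally valid, every other token is masked out, and the model's own preferences choose among what remains.

```python
import math

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ["{", "}", '"name"', ":", '"Ada"', '"Bob"', "hello", ","]

# Hypothetical state machine for the shape {"name": <string>}.
# Each state maps an allowed token to the state that follows it.
TRANSITIONS = {
    "start": {"{": "key"},
    "key": {'"name"': "colon"},
    "colon": {":": "value"},
    "value": {'"Ada"': "close", '"Bob"': "close"},
    "close": {"}": "done"},
}


def mask_logits(logits, state):
    """Set the logit of every token the current state does not allow to -inf."""
    allowed = TRANSITIONS[state]
    return [x if tok in allowed else -math.inf for tok, x in zip(VOCAB, logits)]


def fake_model(prefix):
    """Stand-in for an LLM: it always prefers 'hello', then '"Bob"', then '"Ada"'."""
    prefs = {"hello": 5.0, '"Bob"': 2.0, '"Ada"': 1.0}
    return [prefs.get(tok, 0.0) for tok in VOCAB]


def generate():
    """Greedy decoding in which the state machine filters each step."""
    state, out = "start", []
    while state != "done":
        logits = mask_logits(fake_model(out), state)
        best = max(range(len(VOCAB)), key=lambda i: logits[i])
        token = VOCAB[best]
        out.append(token)
        state = TRANSITIONS[state][token]
    return "".join(out)


if __name__ == "__main__":
    print(generate())
```

Left unconstrained, the stand-in model would emit "hello" at every step. With the mask applied, the output always has the required shape, and the model's preference still decides the value wherever the structure leaves a choice open.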

What This Means for You

Build with Confidence - Create reliable AI-powered systems you can trust in production.

Save Time and Resources - Say goodbye to tedious post-processing and output cleanup.

New AI Applications - Unlock complex systems that were previously impossible without guaranteed reliability.

Benchmarks

We benchmarked the PSE against leading structured-output methods and found that it delivers superior performance in many cases.

All benchmarks were conducted with Llama-3.1-8B-Instruct on a single Mac Studio with 192GB of RAM.
See our benchmark repository for evaluation code, methodology, and more.

[Figure: Quality Comparison (performance metrics)]
[Figure: Generation Time Comparison (latency metrics)]

The PSE lets models generate naturally and prioritizes coherent outputs, while still outperforming methods that sacrifice quality for performance gains.

Getting Started

The PSE is currently distributed as a developer SDK on PyPI. Get started in minutes by installing the Python package:

pip install pse

The Python library is also open source under the Apache 2.0 license, and the source code is available on GitHub.

Check out the examples in the GitHub repository to see some of the ways you can use the PSE in practice.

We've also created two ready-to-run Google Colab notebooks for you to try: