Incremental LLM Data Stream Parsers
A high-performance parsing suite published for Python, TypeScript, Dart, C#, and Kotlin. It parses incomplete JSON, YAML, and XML streams from large language models and yields partial results incrementally, in real time.
The Problem
In structured-output tasks, large language models stream their responses token by token in formats such as JSON, YAML, or XML. Standard parsers, however, require a complete document and throw syntax errors when fed an incomplete fragment. Waiting for the entire stream to finish leaves users staring at a static spinner and discards the responsiveness that streaming is meant to provide.
Image showcasing the problem: Standard parsers breaking on partial, streaming JSON/YAML/XML tokens
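The failure described above is easy to reproduce with Python's standard `json` module. This is a minimal sketch: the fragment and the `try_parse` helper are made up for illustration and are not part of the library.

```python
import json

# A response cut off mid-stream, as it might look after the first few tokens.
fragment = '{"user": {"name": "Ada", "bio": "Writes comp'


def try_parse(text: str):
    """Return (value, None) on success or (None, error message) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        return None, exc.msg


value, error = try_parse(fragment)
print("parsed:", value)   # nothing usable until the stream completes
print("error:", error)
```

A standard parser gives the application nothing to render until the final closing brace arrives.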
The Solution
To resolve this, we built high-performance, incremental stream parsers for five programming languages. Each parser acts as middleware between the model and the application: it receives incoming text chunks, tracks the syntax state, and repairs the partial document by completing truncated keys, terminating open strings, and closing unbalanced brackets. Frontends can then render a valid partial data tree after every chunk instead of waiting for the full response.
Real-Time JSON Parsing
The JSON parser scans and reconstructs truncated JSON as it arrives. As chunks stream from the LLM, it builds a partial tree, completing escape sequences and closing open keys, strings, and brackets so that every intermediate state is a valid JSON object.
Image showing the solution in the works: stream-parsing JSON in real-time, demonstrating auto-completed keys and balanced brackets
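The healing idea can be sketched in a few lines of Python. This is not the library's implementation: `heal_partial_json` is a hypothetical helper that assumes the text is only truncated inside a string value or between values, and it does not repair truncated keys, literals such as `tru`, or partial `\u` escapes.

```python
import json


def heal_partial_json(text: str) -> str:
    """Close any open string, object, or array so that `text` can be parsed."""
    closers = []        # stack of closing brackets still owed
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]" and closers:
            closers.pop()

    healed = text
    if in_string:
        if escaped:
            healed = healed[:-1]    # drop a dangling backslash
        healed += '"'               # terminate the open string
    healed = healed.rstrip()
    if healed.endswith(","):
        healed = healed[:-1]        # a trailing comma has no element yet
    elif healed.endswith(":"):
        healed += " null"           # a key whose value has not arrived
    return healed + "".join(reversed(closers))


print(json.loads(heal_partial_json('{"user": {"bio": "Loves hik')))
```

Running the healer on the accumulated buffer after every chunk yields a parseable snapshot each time, at the cost of rescanning the buffer. A production parser would keep the scanner state between chunks instead.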
Usage
Dart
import 'package:llm_json_stream/llm_json_stream.dart';
final parser = JsonStreamParser(chunkStream);
// Reactive property chunk extraction
parser.getStringProperty('user.bio').stream.listen((chunk) {
  print('Bio fragment: $chunk');
});
// List element increment callbacks
parser.onElement('items').listen((item) {
  print('New list element parsed: $item');
});
TypeScript
import { JsonStreamParser } from 'llm-json-stream-ts';
const parser = new JsonStreamParser(responseStream);
// Pure JS/TS async iterator stream loop
for await (const chunk of parser.getStringProperty('user.bio')) {
  console.log('Bio fragment:', chunk);
}
// Reactively receive items as they finish healing
parser.onElement('items', (item) => {
  console.log('New list element parsed:', item);
});
Python
from llm_json_stream import JsonStreamParser
parser = JsonStreamParser(async_iterable)
# Run inside an async function (for example under asyncio.run)
# Stream-based processing via async generators
async for chunk in parser.get_string_property("user.bio"):
    print(f"Bio fragment: {chunk}")
# Direct await resolution (dual-behavior object)
complete_bio = await parser.get_string_property("user.bio")
C#
using LlmJsonStream;
var parser = new JsonStreamParser(responseStream);
// Reactive async stream iteration
await foreach (var chunk in parser.GetStringProperty("user.bio"))
{
    Console.WriteLine("Bio fragment: " + chunk);
}
// Await directly for full property completion
string completeBio = await parser.GetStringProperty("user.bio");
Kotlin
import llm.json.stream.JsonStreamParser
val parser = JsonStreamParser(responseFlow)
// Reactive Flow collection for UI mapping
parser.getStringProperty("user.bio").flow.collect { chunk ->
    println("Bio fragment: $chunk")
}
// Direct suspend await
val completeBio = parser.getStringProperty("user.bio").await()
Real-Time YAML Parsing
The YAML parser handles indentation-sensitive data blocks. It tracks indentation, line breaks, and document boundaries, and emits key-value pairs incrementally without violating the indentation rules that YAML depends on.
Image showing the solution in the works: live indentation checking and structural alignment of partially streamed YAML keys
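One way to stay safe with indentation-sensitive input is to parse only lines that have been terminated by a newline and to hold back the unfinished tail. The sketch below illustrates that idea under strong assumptions: both functions are hypothetical, only nested `key: value` mappings are supported, scalars stay strings, and the whole buffer is re-parsed after every chunk.

```python
def split_complete_lines(buffer: str):
    """Split buffered text into (complete lines, unfinished tail)."""
    head, sep, tail = buffer.rpartition("\n")
    if not sep:
        return [], buffer          # no newline seen yet, nothing is safe to parse
    return head.split("\n"), tail


def parse_block_mappings(lines):
    """Parse nested `key: value` lines into dicts using indentation depth."""
    root = {}
    stack = [(-1, root)]           # (indent of the owning key, mapping)
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or ":" not in stripped:
            continue
        indent = len(line) - len(line.lstrip(" "))
        key, _, value = stripped.partition(":")
        while stack[-1][0] >= indent:
            stack.pop()            # leave mappings that are at least this deep
        parent = stack[-1][1]
        value = value.strip()
        if value:
            parent[key.strip()] = value
        else:
            child = {}             # a key with no value opens a nested mapping
            parent[key.strip()] = child
            stack.append((indent, child))
    return root


buffer = ""
for chunk in ["a: 1\nb:\n  c", ": 2\n  d: pa", "ssed\n"]:
    buffer += chunk
    lines, pending = split_complete_lines(buffer)
    print(parse_block_mappings(lines), "pending:", repr(pending))
```

Holding back the last line avoids emitting a half-written key such as `  c` as if it were complete, which is the main hazard when a chunk boundary falls in the middle of a line.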
Usage
Dart
import 'package:llm_yaml_stream/llm_yaml_stream.dart';
final parser = YamlStreamParser(chunkStream);
// Reactive property chunk extraction
parser.getStringProperty('config.environments').stream.listen((chunk) {
  print('YAML chunk: $chunk');
});
TypeScript
import { YamlStreamParser } from 'llm-yaml-stream-ts';
const parser = new YamlStreamParser(asyncIterable);
// Scan spacing, line breaks, and document boundaries incrementally
for await (const chunk of parser.getStringProperty('config.environments')) {
  console.log('YAML chunk:', chunk);
}
Python
from llm_yaml_stream import YamlStreamParser
parser = YamlStreamParser(async_iterable)
# Run inside an async function (for example under asyncio.run)
# Clean reactive generator iteration for streaming YAML data blocks
async for chunk in parser.get_string_property("config.environments"):
    print(f"YAML chunk: {chunk}")
C#
using LlmYamlStream;
var parser = new YamlStreamParser(responseStream);
// Reactive async stream iteration
await foreach (var chunk in parser.GetStringProperty("config.environments"))
{
    Console.WriteLine("YAML chunk: " + chunk);
}
Kotlin
import llm.yaml.stream.YamlStreamParser
val parser = YamlStreamParser(responseFlow)
// Reactive Flow collection for UI mapping
parser.getStringProperty("config.environments").flow.collect { chunk ->
    println("YAML chunk: $chunk")
}
Real-Time XML Parsing
The XML parser handles streaming document trees. By tracking open and closed tags, attributes, and text nodes incrementally, it closes open tags on the fly so that structured document interfaces can render live without parse errors.
Image showing the solution in the works: incremental XML element nesting and auto-tag closure mid-stream
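Tag tracking can be sketched with a stack of open element names. The helper below is a hypothetical illustration, not the library's parser: it assumes attribute values never contain `<` or `>`, and it does not handle comments, CDATA sections, or partially streamed entities such as `&am`.

```python
import re
import xml.etree.ElementTree as ET

# Matches opening, closing, and self-closing tags; captures the "/" markers and the name.
TAG = re.compile(r"<(/?)([A-Za-z_][\w:.-]*)([^<>]*?)(/?)>")


def heal_partial_xml(text: str) -> str:
    """Drop a half-written trailing tag and close every element still open."""
    last_open = text.rfind("<")
    if last_open > text.rfind(">"):
        text = text[:last_open]            # e.g. a dangling "<desc" or "</tit"
    open_tags = []
    for closing, name, _attrs, self_closing in TAG.findall(text):
        if closing:
            if open_tags and open_tags[-1] == name:
                open_tags.pop()
        elif not self_closing:
            open_tags.append(name)
    return text + "".join(f"</{name}>" for name in reversed(open_tags))


partial = '<channel><item id="1"><title>Hello</title><desc'
healed = heal_partial_xml(partial)
root = ET.fromstring(healed)               # parses without error
print(healed)
print(root.find("item/title").text)
```

Because the healed snapshot is well-formed, a standard tree API such as `xml.etree.ElementTree` can be used on every intermediate state.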
Usage
Dart
import 'package:llm_xml_stream/llm_xml_stream.dart';
final parser = XmlStreamParser(chunkStream);
// Reactive text node chunk extraction
parser.getElementText('channel.item').stream.listen((chunk) {
  print('XML tag chunk: $chunk');
});
TypeScript
import { XmlStreamParser } from 'llm-xml-stream-ts';
const parser = new XmlStreamParser(asyncIterable);
// Stream text nodes, nested attributes, and elements reactively
for await (const chunk of parser.getElementText('channel.item')) {
  console.log('XML tag chunk:', chunk);
}
Python
from llm_xml_stream import XmlStreamParser
parser = XmlStreamParser(async_iterable)
# Run inside an async function (for example under asyncio.run)
# Track open and closed tags and output incremental text nodes
async for chunk in parser.get_element_text("channel.item"):
    print(f"XML tag chunk: {chunk}")
C#
using LlmXmlStream;
var parser = new XmlStreamParser(responseStream);
// Reactive async stream iteration
await foreach (var chunk in parser.GetElementText("channel.item"))
{
    Console.WriteLine("XML tag chunk: " + chunk);
}
Kotlin
import llm.xml.stream.XmlStreamParser
val parser = XmlStreamParser(responseFlow)
// Reactive Flow collection for UI mapping
parser.getElementText("channel.item").flow.collect { chunk ->
    println("XML tag chunk: $chunk")
}
Feature Highlights
Built for performance and straightforward integration. These cross-platform features are designed to keep JSON, YAML, and XML output structurally valid at every point in the stream.
Streaming Strings
Yields partial strings safely by automatically balancing unclosed quotes and escape sequences mid-chunk.
String quotes auto-balance mechanism
Reactive Lists
Fires a callback for each list element as soon as that element has finished healing.
Incremental array yield pipeline
Smart Casts
Coerces streamed property values of the wrong type into the requested target type.
Stream value structural healer
Yap Filter
Ignores conversational chatter and preambles, focusing strictly on target data objects.
LLM conversational chatter filter
Observability
Tracks how often syntax healing occurs, along with raw byte ratios and individual healing events.
Real-time healing metrics dashboard
Flexible API
Method-chained builders that follow the idioms of each language across all five SDKs.
Multi-platform native SDK API
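Two of the features above, the Yap Filter and Smart Casts, can be illustrated with a small Python sketch. Both helpers are hypothetical and assume a complete reply whose first `{` or `[` starts the payload, so chatter that itself contains a bracket would need a stricter rule.

```python
import json


def extract_payload(text: str):
    """Skip conversational preamble and decode the first JSON value found."""
    starts = [i for i in (text.find("{"), text.find("[")) if i != -1]
    if not starts:
        return None
    # raw_decode stops at the end of the value, so trailing chatter is ignored.
    value, _end = json.JSONDecoder().raw_decode(text, min(starts))
    return value


def smart_cast(value, target: type):
    """Coerce a loosely typed value (for example "42") into the requested type."""
    if isinstance(value, target):
        return value
    if target is bool and isinstance(value, str):
        return value.strip().lower() in {"true", "yes", "1"}
    return target(value)


reply = 'Sure! Here is the data:\n{"count": "42", "active": "true"} Hope that helps!'
data = extract_payload(reply)
print(smart_cast(data["count"], int), smart_cast(data["active"], bool))
```

Keeping the filter and the cast as separate steps means each can be tested and tuned independently of the healing logic.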
Robustness & Testing
Built for fault tolerance. The stream parsing engines are run against a strict test matrix covering deep structural nesting, random network stalls, chunk-boundary splits, and reconstruction of multi-byte UTF-8 characters that arrive split across chunks.
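The multi-byte case is worth a concrete illustration. When a network chunk ends in the middle of a UTF-8 sequence, decoding each chunk on its own fails or produces replacement characters. The sketch below uses Python's standard incremental decoder, which buffers the incomplete bytes until the rest arrives. The `decode_chunks` helper is an assumption for illustration, not the library's code.

```python
import codecs


def decode_chunks(byte_chunks):
    """Yield text from byte chunks, reassembling characters split across chunks."""
    # errors="replace": a stream that ends mid-character yields U+FFFD instead of raising.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    for chunk in byte_chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    tail = decoder.decode(b"", final=True)   # flush anything still buffered
    if tail:
        yield tail


data = "héllo 🌍".encode("utf-8")
# Split inside "é" (2 bytes) and inside "🌍" (4 bytes).
for piece in decode_chunks([data[:2], data[2:9], data[9:]]):
    print(repr(piece))
```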
$ run-tests --verbose --all-languages
✓ JSON Stream Suite (240 tests passed)
  ✓ Escaped unicode fragments heal [0.2ms]
  ✓ Multi-byte character stream boundaries [1.1ms]
  ✓ Deep tree reconstruction [0.6ms]
  ✓ Partially parsed array streams [0.3ms]
✓ YAML Stream Suite (150 tests passed)
  ✓ Indentation repair under broken maps [1.3ms]
  ✓ Multi-line block string healing [0.8ms]
✓ XML Stream Suite (114 tests passed)
  ✓ Closing nested tree tags [0.5ms]
  ✓ Incomplete attributes repair [0.3ms]
Test Suites: 3 passed, 3 total
Tests:       504 passed, 504 total
Snapshots:   0 total
Time:        0.784s
Ran all test suites successfully across Python, TypeScript, Dart, C#, and Kotlin.
LLM Provider Setup
Pipe streamed token text from any major LLM provider directly into a parser instance. The TypeScript snippets below write each text delta to the parser as it arrives.
OpenAI
const parser = new JsonStreamParser();
// Stream choices directly to parser
for await (const chunk of openaiStream) {
  const text = chunk.choices[0]?.delta?.content;
  if (text) parser.write(text);
}
Anthropic
const parser = new JsonStreamParser();
// Feed Anthropic text deltas (other delta types carry no text field)
for await (const event of claudeStream) {
  if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
    parser.write(event.delta.text);
  }
}
Google Gemini
const parser = new JsonStreamParser();
// Pipe Gemini text segments
for await (const chunk of geminiStream) {
  parser.write(chunk.text);
}