MULTI-LANGUAGE & MULTI-FORMAT UTILITY

Incremental LLM Data Stream Parsers

A high-performance parsing suite published for Python, TypeScript, Dart, C#, and Kotlin. It parses incomplete JSON, YAML, and XML streamed from large language models and yields partial results incrementally, in real time.

The Problem

In structured output tasks, large language models stream their responses token by token in formats like JSON, YAML, or XML. Standard parsers, however, require a complete document and throw syntax errors when fed incomplete fragments. Waiting for the entire stream to finish leaves users staring at a static spinner and throws away the responsiveness that streaming is meant to provide.

DIAGRAM / IMAGE PLACEHOLDER

Image showcasing the problem: Standard parsers breaking on partial, streaming JSON/YAML/XML tokens
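
The failure is easy to reproduce with any strict parser. The sketch below uses Python's standard `json` module on a fragment that stops mid-string; the fragment and the `try_parse` helper are illustrative and not part of the parsers described here.

```python
import json


def try_parse(text: str):
    """Parse text with the strict standard parser, reporting failures as a string."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as error:
        return f"parse error: {error.msg}"


# What has arrived so far from the model: the string and both objects are still open.
partial = '{"user": {"bio": "Hel'

# The same document once the stream has finished.
complete = '{"user": {"bio": "Hello"}}'

partial_result = try_parse(partial)    # a "parse error: ..." string
complete_result = try_parse(complete)  # a regular dict

print(partial_result)
print(complete_result)
```

Until the final closing brace arrives, a strict parser has nothing to hand to the UI.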

The Solution

To resolve this, we built high-performance, incremental stream parsers in five programming languages. Each parser acts as resilient reactive middleware: it intercepts incoming text chunks, tracks the syntax state, and heals the fragment by completing broken keys, closing open strings, and closing unbalanced brackets as the data arrives. Frontends can then render a valid partial data tree immediately, giving low-latency, frame-by-frame generative UI experiences.

Real-Time JSON Parsing

Engineered to scan and reconstruct incomplete JSON structures in real time. As chunks stream from the LLM, the parser builds a partial tree, handling escape sequences and closing unterminated keys, strings, and brackets so that it can yield valid JSON objects on the fly.

DIAGRAM / IMAGE PLACEHOLDER

Image showing the solution in the works: stream-parsing JSON in real-time, demonstrating auto-completed keys and balanced brackets
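
As a rough illustration of the healing idea, here is a minimal sketch in Python. It is not the library's implementation: `heal_json` is a hypothetical helper, and it only closes open strings and containers, drops a trailing comma, and gives a key without a value the placeholder `null`. It does not repair half-written literals such as `tru` or truncated `\u` escapes.

```python
import json


def heal_json(fragment: str) -> str:
    """Return `fragment` with open strings and containers closed (sketch only)."""
    stack = []                 # open containers, "{" or "["
    in_string = False
    escaped = False            # previous character was a backslash inside a string
    expecting_key = False      # the next string in the current object is a key
    string_is_key = False      # the most recent string is an object key
    key_without_colon = False  # a key has closed but no ":" has followed yet

    for ch in fragment:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
                key_without_colon = string_is_key
            continue
        if ch == '"':
            in_string = True
            string_is_key = expecting_key
            expecting_key = False
        elif ch in "{[":
            stack.append(ch)
            expecting_key = ch == "{"
        elif ch in "}]":
            if stack:
                stack.pop()
        elif ch == ",":
            expecting_key = bool(stack) and stack[-1] == "{"
        elif ch == ":":
            key_without_colon = False

    healed = fragment
    if in_string:
        if escaped:
            healed = healed[:-1]  # drop a dangling backslash
        healed += '"'             # close the open string
        key_without_colon = string_is_key
    else:
        healed = healed.rstrip()
        if healed.endswith(","):
            healed = healed[:-1]  # drop a trailing comma

    # Give a key that has no value yet a placeholder value.
    if healed.endswith(":"):
        healed += " null"
    elif key_without_colon:
        healed += ": null"

    # Close every container that is still open, innermost first.
    closers = "".join("}" if opener == "{" else "]" for opener in reversed(stack))
    return healed + closers


# Heal a growing buffer after every chunk and parse the result.
buffer = ""
for chunk in ['{"user": {"bi', 'o": "Hel', 'lo"}, "items": [1, 2,']:
    buffer += chunk
    print(json.loads(heal_json(buffer)))
```

Re-scanning the whole buffer on every chunk keeps the sketch short; a production parser would more likely keep the scanner state between chunks instead.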

Usage

DART USAGE
import 'package:llm_json_stream/llm_json_stream.dart';

final parser = JsonStreamParser(chunkStream);

// Reactive property chunk extraction
parser.getStringProperty('user.bio').stream.listen((chunk) {
  print('Bio fragment: $chunk');
});

// List element increment callbacks
parser.onElement('items').listen((item) {
  print('New list element parsed: $item');
});
TYPESCRIPT USAGE
import { JsonStreamParser } from 'llm-json-stream-ts';

const parser = new JsonStreamParser(responseStream);

// Register the list callback first: the for-await loop below does not
// return until the 'user.bio' string has finished streaming
parser.onElement('items', (item) => {
  console.log('New list element parsed:', item);
});

// Pure JS/TS async iterator stream loop
for await (const chunk of parser.getStringProperty('user.bio')) {
  console.log('Bio fragment:', chunk);
}
PYTHON USAGE
from llm_json_stream import JsonStreamParser

parser = JsonStreamParser(async_iterable)

# Stream-based processing via async generators (run inside an async function)
async for chunk in parser.get_string_property("user.bio"):
    print(f"Bio fragment: {chunk}")

# Direct await resolution (dual-behavior object)
complete_bio = await parser.get_string_property("user.bio")
C# USAGE
using LlmJsonStream;

var parser = new JsonStreamParser(responseStream);

// Reactive async stream iteration
await foreach (var chunk in parser.GetStringProperty("user.bio"))
{
    Console.WriteLine("Bio fragment: " + chunk);
}

// Await directly for full property completion
string completeBio = await parser.GetStringProperty("user.bio");
KOTLIN USAGE
import llm.json.stream.JsonStreamParser

val parser = JsonStreamParser(responseFlow)

// Reactive Flow collection for UI mapping
parser.getStringProperty("user.bio").flow.collect { chunk ->
    println("Bio fragment: $chunk")
}

// Direct suspend await
val completeBio = parser.getStringProperty("user.bio").await()

Real-Time YAML Parsing

Designed to handle indentation-sensitive data as it streams. The YAML parser tracks indentation, line breaks, and document boundaries, and emits key-value pairs incrementally without violating the indentation rules that YAML's structure depends on.

DIAGRAM / IMAGE PLACEHOLDER

Image showing the solution in the works: live indentation checking and structural alignment of partially streamed YAML keys
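
One way to picture the indentation tracking is a line-buffered scanner that emits a pair only once its line is complete. The sketch below is an assumption-laden illustration, not the library's algorithm: `iter_yaml_pairs` is a hypothetical helper that understands only block mappings with plain scalar values (no lists, quoting, anchors, or multi-line strings).

```python
from typing import Iterable, Iterator, List, Tuple


def iter_yaml_pairs(chunks: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (dotted.path, value) pairs as soon as each YAML line is complete."""
    buffer = ""
    path: List[Tuple[int, str]] = []  # (indent, key) of every open parent mapping

    for chunk in chunks:
        buffer += chunk
        # Only complete lines are processed; a partial last line stays buffered.
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            line = line.rstrip("\r")
            stripped = line.strip()
            if not stripped or stripped.startswith("#"):
                continue  # skip blank lines and comments
            indent = len(line) - len(line.lstrip(" "))
            key, separator, value = stripped.partition(":")
            if not separator:
                continue  # not a "key: value" line; outside this sketch's scope
            # Leave every mapping that is indented at least as deep as this line.
            while path and path[-1][0] >= indent:
                path.pop()
            value = value.strip()
            if value:
                yield ".".join([k for _, k in path] + [key]), value
            else:
                path.append((indent, key))  # a nested mapping starts here


chunks = ["config:\n  env", "ironments: prod\n  region: eu", "-west\n"]
for dotted_path, value in iter_yaml_pairs(chunks):
    print(dotted_path, "=", value)
```

Holding back the unfinished last line is what prevents a half-received key such as `env` from being reported as a real key.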

Usage

DART USAGE
import 'package:llm_yaml_stream/llm_yaml_stream.dart';

final parser = YamlStreamParser(chunkStream);

// Reactive property chunk extraction
parser.getStringProperty('config.environments').stream.listen((chunk) {
  print('YAML chunk: $chunk');
});
TYPESCRIPT USAGE
import { YamlStreamParser } from 'llm-yaml-stream-ts';

const parser = new YamlStreamParser(asyncIterable);

// Scan spacing, carriage returns and document boundaries incrementally
for await (const chunk of parser.getStringProperty('config.environments')) {
  console.log('YAML chunk:', chunk);
}
PYTHON USAGE
from llm_yaml_stream import YamlStreamParser

parser = YamlStreamParser(async_iterable)

# Clean reactive generator iteration for streaming YAML data blocks
async for chunk in parser.get_string_property("config.environments"):
    print(f"YAML chunk: {chunk}")
C# USAGE
using LlmYamlStream;

var parser = new YamlStreamParser(responseStream);

// Reactive async stream iteration
await foreach (var chunk in parser.GetStringProperty("config.environments"))
{
    Console.WriteLine("YAML chunk: " + chunk);
}
KOTLIN USAGE
import llm.yaml.stream.YamlStreamParser

val parser = YamlStreamParser(responseFlow)

// Reactive Flow collection for UI mapping
parser.getStringProperty("config.environments").flow.collect { chunk ->
    println("YAML chunk: $chunk")
}

Real-Time XML Parsing

Optimized to parse streaming document trees. By tracking opening and closing tags, nested elements, attributes, and text nodes incrementally, the XML parser closes open tags on the fly so that a live, structured document can be rendered without parse errors.

DIAGRAM / IMAGE PLACEHOLDER

Image showing the solution in the works: incremental XML element nesting and auto-tag closure mid-stream
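
A minimal sketch of mid-stream tag closure, assuming simple markup: `heal_xml` is a hypothetical helper, not the library's API. It drops a tag that was cut off mid-way and appends closing tags for every element still open. It does not handle `>` inside attribute values, CDATA sections, or truncated entities such as `&am`.

```python
import re
import xml.etree.ElementTree as ET

# Matches opening, closing, and self-closing tags; comments and <?...?> are ignored.
TAG = re.compile(r"<(/?)([A-Za-z_][\w.\-]*)[^<>]*?(/?)>")


def heal_xml(fragment: str) -> str:
    """Return `fragment` with every still-open element closed (sketch only)."""
    # Drop a tag that was cut off mid-way, e.g. "<ite" or '<item id="4'.
    last_open = fragment.rfind("<")
    if last_open > fragment.rfind(">"):
        fragment = fragment[:last_open]

    stack = []  # names of the elements that are currently open
    for closing, name, self_closing in TAG.findall(fragment):
        if closing:
            if stack and stack[-1] == name:
                stack.pop()
        elif not self_closing:
            stack.append(name)

    # Close the open elements, innermost first.
    return fragment + "".join(f"</{name}>" for name in reversed(stack))


partial = "<channel><item>Hello wor"
root = ET.fromstring(heal_xml(partial))
print(root.find("item").text)
```

The healed string is handed to the standard `xml.etree.ElementTree` parser here only to show that the result is well-formed.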

Usage

DART USAGE
import 'package:llm_xml_stream/llm_xml_stream.dart';

final parser = XmlStreamParser(chunkStream);

// Reactive text node chunk extraction
parser.getElementText('channel.item').stream.listen((chunk) {
  print('XML tag chunk: $chunk');
});
TYPESCRIPT USAGE
import { XmlStreamParser } from 'llm-xml-stream-ts';

const parser = new XmlStreamParser(asyncIterable);

// Stream text nodes, nested attributes, and elements reactively
for await (const chunk of parser.getElementText('channel.item')) {
  console.log('XML tag chunk:', chunk);
}
PYTHON USAGE
from llm_xml_stream import XmlStreamParser

parser = XmlStreamParser(async_iterable)

# Track open and closed tags and output incremental text nodes
async for chunk in parser.get_element_text("channel.item"):
    print(f"XML tag chunk: {chunk}")
C# USAGE
using LlmXmlStream;

var parser = new XmlStreamParser(responseStream);

// Reactive async stream iteration
await foreach (var chunk in parser.GetElementText("channel.item"))
{
    Console.WriteLine("XML tag chunk: " + chunk);
}
KOTLIN USAGE
import llm.xml.stream.XmlStreamParser

val parser = XmlStreamParser(responseFlow)

// Reactive Flow collection for UI mapping
parser.getElementText("channel.item").flow.collect { chunk ->
    println("XML tag chunk: $chunk")
}

Feature Highlights

Engineered for high performance and straightforward integration. These cross-platform features are designed to keep the yielded JSON, YAML, and XML structurally valid at every point in the stream.

FEATURE_01

Streaming Strings

Yields partial strings safely by automatically balancing unclosed quotes and escape sequences mid-chunk.

DIAGRAM / IMAGE PLACEHOLDER

String quotes auto-balance mechanism

FEATURE_02

Reactive Lists

Triggers a callback as soon as each list element has been fully received and healed.

DIAGRAM / IMAGE PLACEHOLDER

Incremental array yield pipeline
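
The idea can be sketched as a scanner that tracks nesting depth and emits an element when its terminating comma or bracket arrives. This is an illustration under assumptions, not the library's code: `iter_array_elements` is a hypothetical helper and expects the stream to be a single top-level JSON array.

```python
import json
from typing import Any, Iterable, Iterator


def iter_array_elements(chunks: Iterable[str]) -> Iterator[Any]:
    """Yield each element of a streamed top-level JSON array once it is complete."""
    depth = 0          # 1 means "directly inside the top-level array"
    in_string = False
    escaped = False
    current = []       # characters of the element being collected

    for chunk in chunks:
        for ch in chunk:
            if in_string:
                current.append(ch)
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
            elif ch == '"':
                in_string = True
                current.append(ch)
            elif ch in "[{":
                depth += 1
                if depth > 1:
                    current.append(ch)  # a nested container belongs to the element
            elif ch in "]}":
                depth -= 1
                if depth >= 1:
                    current.append(ch)
                elif "".join(current).strip():
                    yield json.loads("".join(current))  # last element of the array
                    current = []
            elif ch == "," and depth == 1:
                yield json.loads("".join(current))      # element ended by a comma
                current = []
            elif depth >= 1:
                current.append(ch)


for element in iter_array_elements(['[{"id": 1}, {"i', 'd": 2}', ', "x,y"]']):
    print("New list element parsed:", element)
```

Commas and brackets inside strings are ignored by the depth tracking, so an element such as `"x,y"` is not split.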

FEATURE_03

Smart Casts

Coerces streamed property values that arrive with the wrong type into the expected target type.

DIAGRAM / IMAGE PLACEHOLDER

Stream value structural healer
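
A small sketch of what such coercion could look like. The `smart_cast` function, its accepted spellings for booleans, and its error behavior are assumptions made for illustration; the library's actual casting rules are not documented here.

```python
from typing import Any

# Spellings accepted as booleans in this sketch (an assumption, not a spec).
TRUE_WORDS = {"true", "yes", "1"}
FALSE_WORDS = {"false", "no", "0"}


def smart_cast(value: Any, target: type) -> Any:
    """Coerce a streamed value to `target`, raising ValueError if it cannot be read."""
    if target is bool:
        if isinstance(value, bool):
            return value
        text = str(value).strip().lower()
        if text in TRUE_WORDS:
            return True
        if text in FALSE_WORDS:
            return False
        raise ValueError(f"cannot read {value!r} as a boolean")
    if target is float:
        return float(str(value).strip())
    if target is int:
        number = float(str(value).strip())
        if not number.is_integer():
            raise ValueError(f"cannot read {value!r} as an integer")
        return int(number)
    if target is str:
        return str(value)
    raise TypeError(f"unsupported target type: {target!r}")


# A model that quotes its numbers and booleans still produces usable values.
print(smart_cast("42", int), smart_cast("TRUE", bool), smart_cast(" 3.5 ", float))
```

Raising on unreadable input, rather than guessing, keeps a bad value from silently entering the rendered data.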

FEATURE_04

Yap Filter

Ignores conversational chatter and preambles, focusing strictly on target data objects.

DIAGRAM / IMAGE PLACEHOLDER

LLM conversational chatter filter
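
A minimal sketch of preamble filtering, assuming the payload is JSON: the hypothetical `skip_preamble` generator discards everything before the first `{` or `[`. It is deliberately naive; a preamble that itself contains a bracket would start the payload too early, and trailing chatter after the payload is not removed.

```python
from typing import Iterable, Iterator


def skip_preamble(chunks: Iterable[str]) -> Iterator[str]:
    """Drop streamed text that precedes the first "{" or "[" (sketch only)."""
    started = False
    for chunk in chunks:
        if not started:
            # Find the earliest opening bracket in this chunk, if any.
            positions = [i for i in (chunk.find("{"), chunk.find("[")) if i != -1]
            if not positions:
                continue  # still preamble; discard the whole chunk
            chunk = chunk[min(positions):]
            started = True
        yield chunk


stream = ["Sure! Here is ", 'the JSON:\n{"a"', ": 1}"]
print("".join(skip_preamble(stream)))
```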

FEATURE_05

Observability

Reports healing events, how often each kind of syntax repair occurs, and raw byte ratios while the stream is running.

DIAGRAM / IMAGE PLACEHOLDER

Real-time healing metrics dashboard

FEATURE_06

Flexible API

Method-chained builders that follow each language's native idioms across all five SDKs.

DIAGRAM / IMAGE PLACEHOLDER

Multi-platform native SDK API

Robustness & Testing

Engineered for fault tolerance. The stream parsing engines run against a strict test matrix covering deep structural nesting, random network stalls, chunk-boundary splits, and reconstruction of multi-byte UTF-8 characters that arrive split across chunks.

INTEGRATION TEST PIPELINE RUNTIME LOG
$ run-tests --verbose --all-languages

 JSON Stream Suite (240 tests passed)
   Escaped unicode fragments heal [0.2ms]
   Multi-byte character stream boundaries [1.1ms]
   Deep tree reconstruction [0.6ms]
   Partially parsed array streams [0.3ms]

 YAML Stream Suite (150 tests passed)
   Indentation repair under broken maps [1.3ms]
   Multi-line block string healing [0.8ms]

 XML Stream Suite (114 tests passed)
   Closing nested tree tags [0.5ms]
   Incomplete attributes repair [0.3ms]

Test Suites: 3 passed, 3 total
Tests:       504 passed, 504 total
Snapshots:   0 total
Time:        0.784s
Ran all test suites successfully across Python, TypeScript, Dart, C#, and Kotlin.
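
The multi-byte boundary case can be illustrated with Python's standard incremental decoder, which buffers an incomplete UTF-8 sequence until the remaining bytes arrive. This shows the general technique only; how each SDK handles byte boundaries internally is not specified here, and `decode_chunks` is a hypothetical helper.

```python
import codecs
from typing import Iterable, Iterator


def decode_chunks(byte_chunks: Iterable[bytes]) -> Iterator[str]:
    """Decode UTF-8 byte chunks, holding back characters split across chunks."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    for chunk in byte_chunks:
        text = decoder.decode(chunk)  # incomplete trailing bytes stay buffered
        if text:
            yield text
    # Flush at end of stream; raises UnicodeDecodeError if bytes are still missing.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# "café ☕" with both non-ASCII characters split across chunk boundaries.
chunks = [b"caf\xc3", b"\xa9 \xe2\x98", b"\x95"]
print(list(decode_chunks(chunks)))
```

Decoding each chunk independently with `bytes.decode` would fail on the first chunk, because `\xc3` alone is not a complete character.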

LLM Provider Setup

Configure and pipe streaming token text from any major Large Language Model provider directly into the parser instance with minimal orchestration.

OpenAI

OPENAI CONNECTION (JS)
const parser = new JsonStreamParser();

// Stream choices directly to parser
for await (const chunk of openaiStream) {
  const text = chunk.choices[0]?.delta?.content;
  if (text) parser.write(text);
}

Anthropic

CLAUDE CONNECTION (JS)
const parser = new JsonStreamParser();

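// Feed Anthropic text deltas; other delta types (such as tool-input JSON)
// carry no `text` field and are skipped
for await (const event of claudeStream) {
  if (event.type === 'content_block_delta' && event.delta.type === 'text_delta') {
    parser.write(event.delta.text);
  }
}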

Google Gemini

GEMINI CONNECTION (JS)
const parser = new JsonStreamParser();

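// Pipe Gemini text segments. `chunk.text` is a property in @google/genai;
// the older @google/generative-ai SDK exposes it as a method, chunk.text()
for await (const chunk of geminiStream) {
  if (chunk.text) parser.write(chunk.text);
}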