Project Overview

This entry marks the beginning of documentation for what I’m calling the Progressive State Transformer (PST) - a hierarchical narrative generation system designed to create coherent, structured stories using LLMs, which I’ve been working on (on and off) for the past few months. The core innovation is maintaining persistent state across multiple narrative layers, allowing for more cohesive long-form text generation than traditional single-pass approaches. If the system can eventually produce good novels, could the same approach be applied to factual content in enterprise knowledge base systems?

Architectural Foundations

The system is built on a multi-layered sequential architecture that maps to narrative structure theory. Each layer maintains specialized state representing distinct narrative levels:

Act → Sequence → Scene → Beat → Line

Unlike conventional LLM text generation, this approach:

  1. Maintains hierarchical state outside model parameters
  2. Implements directional information flow between sequential layers
  3. Leverages the autoregressive nature of decoder-only transformers

Each layer serves a specific narrative function:

  • Acts: Major structural divisions marking fundamental shifts in narrative terrain. Each act serves a distinct dramatic function (establishment, complication, resolution) while maintaining coherence with the controlling idea.

  • Sequences: Intermediate structures containing multiple scenes unified by a tactical objective. Sequences create momentum toward act-level goals while maintaining their own dramatic integrity. Sequence-based screenwriting commonly treats eight to ten sequences as the structural foundation of a feature film.

  • Scenes: The fundamental battlegrounds of dramatic conflict. Each scene represents a discrete unit of dramatic action occurring in continuous time and space, containing a measurable value change (positive to negative or vice versa).

  • Beats: Atomic units of dramatic action—the smallest divisible moments of meaning. Each beat represents a single action/reaction exchange or emotional shift that accumulates into scene-level change.

  • Lines: The actual textual elements—dialogue, action description, interior thought—that manifest the beat-level intentions in language. Word choice, syntax, and rhythm operate at this resolution.
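
In code, this hierarchy lives outside the model as ordinary nested objects that only ever grow by appending. The field names in my implementation are still in flux, so treat the following as an illustrative sketch rather than the real schema:

```python
from dataclasses import dataclass, field

# Illustrative only: these fields are placeholders, not the actual PST schema.

@dataclass
class Beat:
    intent: str                                       # what this beat should accomplish dramatically
    lines: list[str] = field(default_factory=list)    # generated Line-level text

@dataclass
class Scene:
    setting: str
    value_change: str                                 # e.g. "hope -> despair"
    beats: list[Beat] = field(default_factory=list)

@dataclass
class Sequence:
    tactical_objective: str
    scenes: list[Scene] = field(default_factory=list)

@dataclass
class Act:
    dramatic_function: str                            # establishment, complication, resolution
    sequences: list[Sequence] = field(default_factory=list)

@dataclass
class Story:
    controlling_idea: str
    acts: list[Act] = field(default_factory=list)
```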

Implementation Details

The current implementation consists of:

  1. Sequential processing architecture:

    • Recognizes the autoregressive nature of decoder-only transformers
    • Implements narrative layers as directional sequences rather than holographic scaffolds
    • Simplifies reconciliation by appending new elements to existing sequences
  2. Transformation flow:

    • Higher layers provide context for generating lower layers
    • Each layer’s output becomes the input context for the next layer down (a sketch of this flow follows the list)
  3. Infrastructure:

    • Python-based CLI implementation
    • Mistral models deployed on Hetzner GEX44 with Ollama
    • State persistence using sequential field storage
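
To make the transformation flow concrete, here is a stripped-down sketch of the top-down pass. It collapses each layer to a single generation step for brevity, the prompt-assembly details are placeholders rather than the actual PST code, and the HTTP call uses Ollama’s standard /api/generate endpoint:

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"   # default local Ollama endpoint
MODEL = "mistral"                                     # whatever tag was pulled into Ollama

def complete(prompt: str, max_tokens: int = 512) -> str:
    """One non-streaming completion against the local Ollama instance."""
    resp = requests.post(OLLAMA_URL, json={
        "model": MODEL,
        "prompt": prompt,
        "stream": False,
        "options": {"num_predict": max_tokens},
    })
    resp.raise_for_status()
    return resp.json()["response"]

LAYERS = ["act", "sequence", "scene", "beat", "line"]

def generate_layer(layer: str, context: str, few_shot: str) -> str:
    # Base-model prompting: curated examples first, then the accumulated
    # higher-layer context, then an open continuation cue for this layer.
    prompt = f"{few_shot}\n\n{context}\n\n[{layer.upper()}]\n"
    return complete(prompt).strip()

def top_down_pass(premise: str, examples: dict[str, str]) -> str:
    """Single pass through the hierarchy: each layer's output is appended to
    the running context, which is all the next layer down gets to see."""
    context = premise
    for layer in LAYERS:
        output = generate_layer(layer, context, examples.get(layer, ""))
        context += f"\n[{layer.upper()}]\n{output}"   # directional, top-down flow
    return context
```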

Current Progress

I’m currently working on reverse inference through the hierarchy using existing novels. This involves:

  1. Taking complete works and decomposing them into their sequential narrative elements
  2. Creating examples at each narrative level to use for few-shot prompting
  3. Analyzing the transitions between layers to understand optimal sequencing

This reverse engineering serves two purposes:

  • Creating a robust evaluation set based on professional writing
  • Building a curated selection of examples for few-shot prompting that implicitly guide generation
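
The schema for these decomposed examples isn’t final. Conceptually, each record pairs a span of the source novel with its place in the hierarchy and enough surrounding context to be reused later as a few-shot demonstration; something along these lines (all field names and values are placeholders):

```python
# Illustrative record produced by reverse inference on an existing novel.
# Field names and values are placeholders, not a finalized schema.
example_record = {
    "source": "<novel title>",
    "layer": "scene",                          # act | sequence | scene | beat | line
    "parent_context": "<summary of the enclosing sequence>",
    "content": "<the decomposed scene-level summary itself>",
    "value_change": "<e.g. confidence -> doubt>",
    "transition": "<how this element hands off to the next one>",
}
```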

The most challenging aspect has been selecting the right few-shot examples that effectively communicate the desired pattern to the base model without explicit instructions. Since I’m using base models rather than fine-tuned ones, the selection of examples becomes critical for guiding the model’s behavior at each layer.
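
In practice the prompt for a given layer is just a handful of those records rendered in a consistent surface format, followed by the partially filled-in target for the model to continue. A rough sketch of the assembly step, with naive selection by layer standing in for the eventual selection system:

```python
def render_example(rec: dict) -> str:
    """Render one decomposed record in the surface format the model is
    expected to continue; consistent formatting matters more than wording."""
    return (
        f"CONTEXT: {rec['parent_context']}\n"
        f"{rec['layer'].upper()}: {rec['content']}\n"
    )

def build_few_shot_prompt(records: list[dict], layer: str,
                          target_context: str, k: int = 3) -> str:
    # Naive selection: the first k records for the requested layer. The planned
    # selection system would score and choose examples at runtime instead.
    shots = [render_example(r) for r in records if r["layer"] == layer][:k]
    return "\n".join(shots) + f"\nCONTEXT: {target_context}\n{layer.upper()}:"
```

Whatever the model generates after the final layer tag is taken as that layer’s output.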

Future Work

Near-term objectives include:

  1. Completing the reverse inference analysis on a corpus of 5-10 novels
  2. Implementing the first complete generation pipeline
  3. Generating fan-fiction extensions of existing stories for evaluation by friends familiar with the source material

Longer term, the system will evolve to generate original stories based on high-level user inputs, potentially with interactive guidance during the generation process.

Technical Challenges

The primary challenges remain:

  1. Implicit Prompting with Base Models: Since I’m using base models without fine-tuning, I need to create carefully crafted few-shot examples that implicitly guide the model toward proper generation at each layer.

  2. Few-Shot Selection System: Developing an LLM-based understanding of each example’s impact on generation, requiring a separate perception system to analyze and select optimal examples at runtime.

  3. Sequential Thinking: Working with the inherently sequential nature of autoregressive transformers while maintaining hierarchical narrative coherence across long-form text.

  4. Context Window Management: Exploring whether we can truncate older sequence elements while preserving just terminal elements with sufficient context markers, implement some form of compression for historical elements, or employ a hybrid approach. This becomes critical when generating long-form narratives that exceed model context limits.
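
For the truncation option, the rough idea is to keep the most recent elements of each sequence verbatim and replace older ones with short summary markers. A sketch, with the summarization step stubbed out (in practice it would be another LLM call, and the budget would be measured in tokens rather than characters):

```python
def summarize(text: str) -> str:
    # Stub: in practice this would be another LLM call that compresses an older
    # narrative element into a one- or two-sentence marker.
    return text[:120] + "..."

def truncate_history(elements: list[str], keep_last: int = 2,
                     budget_chars: int = 8000) -> str:
    """Keep the most recent elements verbatim, compress older ones, and drop
    from the oldest end if the result still exceeds the budget."""
    older, recent = elements[:-keep_last], elements[-keep_last:]
    context = [summarize(e) for e in older] + recent
    while context and len("\n".join(context)) > budget_chars:
        context.pop(0)   # drop the oldest (already compressed) marker first
    return "\n".join(context)
```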