Research Workflow Automation System

  • Python
  • Automation
  • LLM
  • Research Workflow
Overview

Academic research involves repetitive, time-consuming tasks: data cleaning, literature searches, statistical analysis, figure generation, and writing. This system automates the entire research pipeline—from data to final PDF—with a single prompt and zero human intervention.

The workflow runs 9 sequential stages with 60+ minute execution time, handles interruptions with resumable execution, manages token overflow across stages, and validates outputs using Python scripts rather than LLM self-verification.

System Architecture
Orchestrator + Skills Pattern

The system uses a master orchestrator that coordinates all stages, reads progress tracking for resume capability, and handles errors and feedback loops. Each stage is implemented as a separate skill with self-contained instructions.

Linear Workflow with Feedback Loop:

[diagram: linear-workflow]

Key Components
| Component      | Description                                         | Location                              |
|----------------|-----------------------------------------------------|---------------------------------------|
| Orchestrator   | Master coordinator running stages in sequence       | workflow/skills/orchestrator/SKILL.md |
| Skills         | Individual pipeline stages with instructions        | workflow/skills/<stage>/SKILL.md      |
| Shared Scripts | Reusable utilities for progress, context, feedback  | workflow/scripts/*.py                 |
| State Files    | JSON files tracking progress, context, decisions    | exam_paper/*.json                     |
Model Tiering Strategy

Different pipeline stages require different reasoning capabilities. The orchestrator uses a three-tier model system to balance cost and quality:

| Model Level | Model    | Used For                          | Stages                                                                        |
|-------------|----------|-----------------------------------|-------------------------------------------------------------------------------|
| high        | opus[1m] | Deep reasoning, complex synthesis | Research Questions (2), Write Paper (8)                                       |
| medium      | sonnet   | Data inspection, code generation  | Load & Profile (1), Score & Rank (3), Analysis (5), Figures (6), Lit Review (7) |
| low         | haiku    | Simple downloads, mechanical tasks | Acquire Data (0, 4), Compile & Review (9)                                    |
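The routing in the table above amounts to a static stage-to-tier map. A minimal sketch, assuming the orchestrator resolves models by tier (the dictionary shape is an illustration, not the actual configuration format):

```python
# Stage numbers and tiers mirror the table above.
STAGE_MODEL_TIER = {
    0: "low", 1: "medium", 2: "high", 3: "medium", 4: "low",
    5: "medium", 6: "medium", 7: "medium", 8: "high", 9: "low",
}

# Model labels copied from the table; treat them as placeholders.
TIER_TO_MODEL = {"high": "opus[1m]", "medium": "sonnet", "low": "haiku"}

def model_for_stage(stage: int) -> str:
    """Resolve a pipeline stage to its model name via its cost tier."""
    return TIER_TO_MODEL[STAGE_MODEL_TIER[stage]]
```

Keeping the mapping declarative makes the cost/quality trade-off auditable in one place.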
Key Innovations
Python-Based Validation

LLMs hallucinate when verifying “did this work?” and cannot reliably check file existence. The system uses Python-based file system validation with pre-emptive feasibility checks:

import os

def _validate_outputs(expected_outputs: dict) -> None:
    """Validate that expected output files exist and are non-empty."""
    for name, path in expected_outputs.items():
        if not os.path.exists(path):
            raise ValueError(f"Missing required output: {name} at {path}")
        if os.path.getsize(path) == 0:
            raise ValueError(f"Empty output file: {name} at {path}")

The _validate_outputs() function checks file existence and size directly via the OS, raising ValueError if an expected output is missing or empty. complete_stage() calls this validation before marking a stage complete.

Token Management: Context Bundles + Pruning

Problem: 9 stages × large JSON files = token overflow. Each stage needs all previous context.

Solution: Two-part system. Context bundles capture semantic decisions (why) rather than raw outputs (what). Each stage adds a compressed layer with:

  • key_decisions - What was decided and why
  • forward_references - Pointers to preserved files
  • stage_summary - Stage-specific output summary
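A compressed layer with those three fields might be appended like this. A minimal sketch under stated assumptions: the bundle is a plain dict, and the real logic in workflow/scripts/context_manager.py almost certainly does more (serialization, size accounting).

```python
def add_bundle_layer(bundle: dict, stage: int, key_decisions: list,
                     forward_references: list, stage_summary: str) -> dict:
    """Append a stage's semantic layer instead of its raw outputs."""
    bundle.setdefault("layers", []).append({
        "stage": stage,
        "key_decisions": key_decisions,            # what was decided and why
        "forward_references": forward_references,  # pointers to preserved files
        "stage_summary": stage_summary,            # compact stage output summary
    })
    return bundle
```

Downstream stages read the small layered bundle rather than every upstream JSON file, which is where most of the token savings come from.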

Selective pruning rules specify:

  • can_prune - Files deletable after each stage
  • must_preserve - Files required for downstream stages
  • summary_in_context - What summaries remain in context

Pruning modes: safe (after checkpoint stages), aggressive (after every eligible stage), off (debugging). Result: ~80% token reduction while maintaining full resumability.
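Combining the rule lists with the three modes could look like the sketch below. The rule-dict keys mirror the lists above; the `checkpoint` flag and function name are assumptions, not the actual API of workflow/scripts/context_manager.py.

```python
import os

def prune_stage_files(rules: dict, mode: str, checkpoint: bool) -> list:
    """Delete prunable files according to the active pruning mode."""
    if mode == "off":
        return []  # debugging: keep everything
    if mode == "safe" and not checkpoint:
        return []  # safe mode only prunes after checkpoint stages
    deleted = []
    preserved = set(rules.get("must_preserve", []))
    for path in rules.get("can_prune", []):
        if path in preserved:
            continue  # never delete files needed downstream
        if os.path.exists(path):
            os.remove(path)
            deleted.append(path)
    return deleted
```

Checking `must_preserve` even for files listed in `can_prune` keeps a misconfigured rule set from deleting a downstream dependency.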

Feedback Loop State Management

When analysis fails, the system re-runs stages 3-5 while preserving state. cycle_state.json tracks feedback loop iterations with:

  • current_cycle - Current iteration number
  • max_cycles - Maximum allowed iterations
  • failed_candidates - Variables that failed analysis
  • failure_reasons - Why each candidate failed

The reset_stage_progress() function deletes progress.json to enable re-entry into stages 3-5. Fast-track mode skips web searches whose inputs are unchanged, runs only the primary model and Table 1, and applies score penalties to failed candidates. Files from stages 3-5 are never pruned during active feedback cycles.
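The cycle bookkeeping described above can be sketched as follows. Field names follow cycle_state.json as documented; the function bodies, the `fast_track` flag, and the overflow behavior are assumptions rather than the code in workflow/scripts/feedback_utils.py.

```python
import os

def start_feedback_cycle(state: dict, failed: dict) -> dict:
    """Record failed candidates and arm a re-run of stages 3-5.

    `failed` maps candidate variable names to failure reasons.
    """
    state["current_cycle"] += 1
    if state["current_cycle"] > state["max_cycles"]:
        raise RuntimeError("Feedback loop exceeded max_cycles; aborting")
    state["failed_candidates"].extend(failed.keys())
    state["failure_reasons"].update(failed)
    state["fast_track"] = True  # skip unchanged searches on the re-run
    return state

def reset_stage_progress(progress_path: str) -> None:
    """Delete progress.json so stages 3-5 can re-enter."""
    if os.path.exists(progress_path):
        os.remove(progress_path)
```

The `max_cycles` guard matters: without it, a candidate set that can never pass analysis would loop the pipeline indefinitely.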

Resources
  • GitHub Repository: https://github.com/DamarisDeng/paper-writing-system
  • workflow/scripts/progress_utils.py - Progress tracking implementation
  • workflow/scripts/context_manager.py - Context bundle and pruning system
  • workflow/scripts/feedback_utils.py - Feedback loop management
  • workflow/scripts/feasibility_validator.py - Pre-emptive validation