The Terminal That (Almost) Never Dies: Building a Persistent Terminal Daemon for Electron

By Avi Peltz

This post discusses how Hoook built a feature allowing terminals to survive application restarts, crediting Andreas Asprou for spearheading the effort. The core problem: terminals are ephemeral processes, which creates friction when users update the application or experience crashes, losing all running processes across multiple worktrees.

The Problem: Electron Terminals Are Fragile

In standard Electron + node-pty setups, the terminal lifecycle follows: App starts → spawn PTY → user runs commands → app closes → PTY dies. The PTY spawns in the main Node process, terminating when the application closes and killing child processes.

The Tempting Solution: Just Use tmux

The initial approach leveraged tmux for persistence:

# Spawn session
tmux new-session -d -s "hoook-pane-123"

# Attach from Electron
tmux attach-session -t "hoook-pane-123"

# Session survives app restart ✓

Why this approach failed:

  • Extra dependency — users must install tmux separately
  • Not cross-platform — tmux doesn't work on Windows, requiring alternative implementations
  • xterm incompatibility — tmux hijacks scrollbars, selection, and hotkeys, making the experience extremely clunky

Architecture: A Detachable Daemon

The solution splits terminal management across three process layers:

ELECTRON MAIN PROCESS
• tRPC router for renderer IPC
• DaemonTerminalManager (client)
• History persistence to disk
• Workspace/worktree metadata
         ↓ (Unix Domain Socket)
TERMINAL HOST DAEMON
• Long-running Node.js process
• Owns all PTY sessions
• Headless xterm emulator per session
• Broadcasts data/exit events to clients
         ↓ (Binary framed protocol)
PTY SUBPROCESS (per session)
• Isolated process per terminal
• Owns the actual node-pty instance
• Handles backpressure independently
• 128KB batched output, 32ms flush interval

The Electron app becomes a client of the terminal daemon, enabling:

  • App restarts that reconnect to still-running sessions
  • Multiple windows attaching to the same daemon
  • Crash recovery through cold restore from disk history

Spawning the Daemon

const daemon = spawn(process.execPath, [daemonScript], {
  env: {
    ...process.env,
    ELECTRON_RUN_AS_NODE: "1", // Run as plain Node.js
  },
  detached: true,
  stdio: "ignore",
});
daemon.unref(); // Don't wait for daemon to exit

Setting ELECTRON_RUN_AS_NODE=1 tells the Electron binary to behave as a plain Node.js runtime, with no Chromium, which is exactly what a background service needs.
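One startup detail worth spelling out is how the daemon claims its socket path. The sketch below is an assumption about how this could work, not Hoook's actual code, and the socket location is hypothetical: probe the socket first, and treat a connection error as a stale socket file that is safe to unlink.

```typescript
import * as fs from "fs";
import * as net from "net";
import * as os from "os";
import * as path from "path";

// Hypothetical socket path; the real daemon's location may differ.
const SOCKET_PATH = path.join(os.tmpdir(), "hoook-terminal-daemon.sock");

// Probe the socket before listening: a live daemon answers the probe,
// while a stale socket file produces a connection error.
function listenOnce(socketPath: string): Promise<net.Server> {
  return new Promise((resolve, reject) => {
    const probe = net.connect(socketPath);
    probe.on("connect", () => {
      probe.destroy();
      reject(new Error("daemon already running"));
    });
    probe.on("error", () => {
      try {
        fs.unlinkSync(socketPath); // stale socket file, safe to remove
      } catch {
        // file did not exist; nothing to clean up
      }
      const server = net.createServer();
      server.on("error", reject);
      server.listen(socketPath, () => resolve(server));
    });
  });
}
```

If a live daemon answers the probe, a second app instance should attach as a client rather than spawn another daemon.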

The Protocol: NDJSON Over Unix Sockets

Communication uses newline-delimited JSON over Unix domain sockets:

// Request from main process
{"id":"req-abc123","type":"createOrAttach","payload":{"sessionId":"pane-1","cols":80,"rows":24}}

// Response from daemon
{"id":"req-abc123","ok":true,"payload":{"isNew":true,"snapshot":{...},"pid":12345}}

// Event pushed to clients (stream socket)
{"type":"data","sessionId":"pane-1","data":"$ npm install\r\n"}

Why Unix sockets? They offer speed (no TCP overhead), security (file permissions), and native backpressure support through kernel buffers.
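The framing layer is small enough to sketch. The helper names below are illustrative rather than Hoook's actual code; the whole trick behind NDJSON over a stream socket is buffering partial lines until a newline completes a message.

```typescript
type Json = Record<string, unknown>;

// Encode one message as a newline-terminated JSON line.
function encodeNdjson(msg: Json): string {
  return JSON.stringify(msg) + "\n";
}

// Stateful decoder: feed it raw socket chunks, get back complete messages.
// A partial trailing line stays buffered until its "\n" arrives.
function createNdjsonDecoder() {
  let buffer = "";
  return (chunk: string): Json[] => {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // last element is an incomplete line
    return lines.filter((l) => l.length > 0).map((l) => JSON.parse(l));
  };
}
```

Because each message is exactly one line, a single split on "\n" copes with arbitrarily fragmented socket reads.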

The Two-Socket Split

The initial single-socket protocol suffered from head-of-line blocking. When a terminal produces output faster than the socket can drain, the kernel buffer fills; every subsequent socket.write() queues, so small RPC responses end up stuck behind megabytes of output from commands like cat bigfile.log.

Protocol v2 splits communication:

Main Process          Daemon
    │◄────── Control Socket ──────────►│ RPC only (low latency)
    │ request/response                 │
    │                                  │
    │◄────── Stream Socket ───────────►│ Events only (can backpressure)
    │ data, exit, errors               │

The stream socket can back up independently while RPC traffic stays responsive.
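On the client side, the split means two independent dispatch paths: control replies matched to pending requests by id, and stream events fanned out to per-session listeners. A sketch with illustrative names:

```typescript
type ControlReply = { id: string; ok: boolean; payload?: unknown };
type StreamEvent = { type: string; sessionId: string; data?: string };

class DaemonClient {
  private pending = new Map<string, (reply: ControlReply) => void>();
  private listeners = new Map<string, (ev: StreamEvent) => void>();

  // Register a pending request; resolved when its reply arrives.
  request(id: string): Promise<ControlReply> {
    return new Promise((resolve) => this.pending.set(id, resolve));
  }

  // Called for every message arriving on the control socket.
  onControlMessage(reply: ControlReply) {
    const resolve = this.pending.get(reply.id);
    if (resolve) {
      this.pending.delete(reply.id);
      resolve(reply);
    }
  }

  // Called for every event arriving on the stream socket.
  onStreamEvent(ev: StreamEvent) {
    this.listeners.get(ev.sessionId)?.(ev);
  }

  subscribe(sessionId: string, fn: (ev: StreamEvent) => void) {
    this.listeners.set(sessionId, fn);
  }
}
```

Since the two sockets never share a write queue, a flood of stream events cannot delay a control reply.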

Session Lifecycle: Create, Attach, Survive, Restore

Sessions progress through defined states:

CREATED ──► ALIVE ──► (clients attach/detach) ──► TERMINATING ──► EXIT
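Those states can be captured as a small transition table. This is a sketch of the diagram above; the real implementation surely tracks more, such as attach counts and termination timers.

```typescript
type SessionState = "created" | "alive" | "terminating" | "exit";

// Legal next-states for each state; clients attaching and detaching
// keeps a session in "alive" rather than changing its state.
const transitions: Record<SessionState, SessionState[]> = {
  created: ["alive"],
  alive: ["alive", "terminating"],
  terminating: ["exit"],
  exit: [],
};

function canTransition(from: SessionState, to: SessionState): boolean {
  return transitions[from].includes(to);
}
```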

The attachment mechanism handles three scenarios:

async createOrAttach(params: CreateSessionParams) {
  // 1. Already in daemon? Just attach.
  if (daemon.hasSession(params.sessionId)) {
    return daemon.attach(params.sessionId);
  }

  // 2. Not in daemon, but history on disk? Cold restore.
  const metadata = await historyReader.readMetadata(params.sessionId);
  if (metadata && !metadata.endedAt) {
    // Unclean shutdown detected!
    return {
      isColdRestore: true,
      scrollback: await historyReader.readScrollback(params.sessionId),
      previousCwd: metadata.cwd,
    };
  }

  // 3. Truly new session
  return daemon.createSession(params);
}

Cold Restore: Recovering from Daemon Crashes

When the daemon dies uncleanly (machine reboot, crash, kill -9), its sessions are lost, but the disk history remains:

~/.hoook/terminal-sessions/
└── workspace-abc/
    └── pane-123/
        ├── meta.json # {"startedAt": ..., "cwd": "/project"}
        └── scrollback.txt # Full terminal output

On next app launch, unclean shutdown detection (no endedAt in metadata) triggers cold restore. Users see previous activity and can resume in the same directory.

// Session restored - scrollback shown but read-only
// User sees: "Session Restored - Press Enter to start new shell"
// Old scrollback preserved, new shell spawns in same directory
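The detection hinges entirely on the endedAt field in meta.json. A sketch of both halves, using the metadata shape shown above (field names beyond those in the example are assumptions):

```typescript
interface SessionMeta {
  startedAt: number;
  cwd: string;
  endedAt?: number; // present only after a clean shutdown
}

// Called from the clean-shutdown path: stamp the session as ended.
function markCleanShutdown(meta: SessionMeta): SessionMeta {
  return { ...meta, endedAt: Date.now() };
}

// Called on launch: metadata without endedAt means the daemon
// died uncleanly, so the session qualifies for cold restore.
function needsColdRestore(meta: SessionMeta | null): boolean {
  return meta !== null && meta.endedAt === undefined;
}
```

The asymmetry is deliberate: a crash never gets the chance to write endedAt, so its absence is the crash signal.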

Backpressure: The Hidden Challenge

Terminals generate output rapidly: commands like cat /dev/urandom | base64 will flood any buffer. Without careful backpressure handling, the results include:

  • Memory exhaustion (unbounded queues)
  • UI freezes (blocked event loops)
  • Lost data (dropped writes)

Multi-level backpressure flows from PTY to UI:

PTY stdout
  │
  ▼ (if daemon buffer full, pause subprocess reads)
PTY Subprocess internal buffer (8MB high watermark, 64MB hard limit)
  │
  ▼ (if session buffer full, pause subprocess stdout)
Daemon session buffer
  │
  ▼ (if client socket full, pause session output)
Main process stream socket
  │
  ▼ (if renderer can't keep up, events queue in main)
Renderer xterm.js
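Every hop in this chain uses the same mechanism: a write that reports a full downstream buffer pauses the upstream source until a drain event. A minimal sketch of one link, with structural types standing in for the real net.Socket and node-pty handles:

```typescript
// Structural stand-ins so the sketch has no external dependencies.
interface WritableLike {
  write(data: string): boolean; // false means the buffer is full
  once(event: "drain", cb: () => void): void;
}

interface PauseablePty {
  pause(): void;  // stop reading from the PTY
  resume(): void; // start reading again
}

// One link in the backpressure chain: stop reading from the PTY while
// the downstream buffer is full, resume once it drains.
function pipeWithBackpressure(pty: PauseablePty, sink: WritableLike, data: string) {
  const ok = sink.write(data);
  if (!ok) {
    pty.pause();
    sink.once("drain", () => pty.resume());
  }
}
```

node-pty's pty handle exposes pause() and resume() for flow control, and net.Socket.write returns false when the kernel buffer is full, so the real wiring can follow this shape at each level.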

PTY subprocess batches output aggressively:

// Collect output for up to 32ms or 128KB, whichever comes first
const FLUSH_INTERVAL_MS = 32;
const MAX_BATCH_SIZE = 128 * 1024;

let batch = "";
let flushTimeout: NodeJS.Timeout | null = null;

pty.onData((data) => {
  batch += data;

  if (batch.length >= MAX_BATCH_SIZE) {
    flush();
  } else if (!flushTimeout) {
    flushTimeout = setTimeout(flush, FLUSH_INTERVAL_MS);
  }
});

function flush() {
  if (flushTimeout) {
    clearTimeout(flushTimeout);
    flushTimeout = null;
  }
  sendToDaemon(batch); // framed write to the daemon socket
  batch = "";
}

Batching bounds both per-write overhead and string-concatenation cost while keeping visual updates at approximately 30fps (one flush per 32ms).

The Headless Emulator: State Without a Screen

Each daemon session runs a headless xterm.js emulator. Despite lacking a screen, the emulator provides:

  • Accurate snapshots: When clients attach, the current screen state serializes—not just raw scrollback. Users see exactly what appeared on screen, including cursor position.
  • Terminal mode tracking: Application mode, bracketed paste, mouse tracking—all parsed and tracked so reconnecting clients receive correct state.
  • CWD detection: Parsing OSC escape sequences reveals the shell's current directory even for sessions created hours ago.

// On attach, serialize current state
const snapshot = {
  snapshotAnsi: emulator.serialize(), // Screen content as ANSI
  rehydrateSequences: emulator.getRehydrateSequences(), // Mode restore
  cwd: emulator.getCwd(), // Parsed from OSC 7
  modes: emulator.getModes(), // Cursor visible, etc.
  cols: emulator.cols,
  rows: emulator.rows,
};

Lessons Learned

1. Protocol Versioning from Day One

When introducing the two-socket split, existing daemons couldn't speak the new protocol. Graceful handling:

// Client detects version mismatch
if (response.protocolVersion !== EXPECTED_VERSION) {
  await shutdownStaleDaemon();
  await startNewDaemon();
  return retry();
}

Always include version negotiation in protocols.

2. React StrictMode Double-Mounts Are Real

React 18's StrictMode double-mounts components in development:

  • Mount → createOrAttach() → receive cold restore
  • Unmount (StrictMode cleanup)
  • Mount again → createOrAttach() → ???

If the second mount re-reads from disk, the cold-restore flag might have disappeared, since endedAt may already have been written. The solution is a sticky cache:

// Cache cold restore until explicitly acknowledged
private coldRestoreInfo = new Map();

createOrAttach(paneId) {
  if (this.coldRestoreInfo.has(paneId)) {
    return this.coldRestoreInfo.get(paneId); // Return cached
  }
  // ... actual logic
}

ackColdRestore(paneId) {
  this.coldRestoreInfo.delete(paneId); // User acknowledged, clear cache
}

3. Don't Kill on Disconnect

When the Electron app closes, daemon sessions don't terminate:

async cleanup() {
  // Close history writers (marks clean shutdown)
  for (const writer of this.historyWriters.values()) {
    await writer.close();
  }

  // Disconnect from daemon, but DON'T send kill
  this.disposeClient();

  // Sessions keep running for next app launch
}

Persistence, not cleanup, should be the default.

4. Concurrency Limits Prevent Spawn Storms

Opening workspaces with 10 terminal panes previously spawned 10 sessions simultaneously, overwhelming the daemon. A semaphore with priority was added:

// Max 3 concurrent createOrAttach operations
private limiter = new PrioritySemaphore(3);

// Focused pane gets priority 0, background panes get 1
const priority = isFocusedPane ? 0 : 1;
await this.limiter.acquire(priority);

Active terminals appear first; background tabs hydrate gradually.
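PrioritySemaphore is named above but not shown; here is a plausible implementation under the stated semantics, where lower numbers acquire first and release() hands the permit to the best-priority waiter. This is a guess at the shape, not Hoook's code.

```typescript
class PrioritySemaphore {
  private queue: { priority: number; resolve: () => void }[] = [];
  private available: number;

  constructor(permits: number) {
    this.available = permits;
  }

  // Resolve immediately if a permit is free; otherwise queue by priority.
  acquire(priority: number): Promise<void> {
    if (this.available > 0) {
      this.available--;
      return Promise.resolve();
    }
    return new Promise((resolve) => {
      this.queue.push({ priority, resolve });
      // Stable sort: lower priority number runs sooner; FIFO within a tier.
      this.queue.sort((a, b) => a.priority - b.priority);
    });
  }

  // Hand the permit to the best waiter, or return it to the pool.
  release(): void {
    const next = this.queue.shift();
    if (next) next.resolve();
    else this.available++;
  }
}
```

Each release() wakes the lowest-priority-number waiter, so the focused pane's createOrAttach always jumps ahead of background hydration.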

The Future: Cloud Backends

The abstraction boundary supports local persistence and future expansion. The TerminalRuntime interface remains provider-neutral:

interface TerminalRuntime {
  capabilities: {
    persistent: boolean;
    coldRestore: boolean;
  };
  createOrAttach(params: CreateSessionParams): Promise<AttachResult>; // AttachResult elided here
  write(sessionId: string, data: string): Promise<void>;
  // ...
}

Today, LocalTerminalRuntime wraps the daemon. Tomorrow, CloudTerminalRuntime could wrap SSH connections or remote tmux sessions—same interface, different backend. The renderer remains agnostic about terminal location.

Conclusion

For deeper exploration, check the Hoook desktop source for complete implementation details.