April 25, 2026 • Version: bundled acpx plugin (version dependent)

ACP Codex Long Tasks Accepted but Never Start (Empty Child Session Transcript)

ACP Codex sessions return 'accepted' status for long implementation tasks but child session transcripts remain empty, indicating a race condition or thread-binding failure in the session initialization pipeline.

🔍 Symptoms

The following manifestations indicate this specific failure mode:

Primary Diagnostic: Empty Child Session Transcript

After spawning an ACP Codex task that should execute a long implementation:

$ acp sessions_spawn(runtime="acp", prompt="[long implementation prompt]")

# Immediate response:
{
    "status": "accepted",
    "childSessionKey": "sess_abc123def456",
    "note": "initial ACP task queued in isolated session; follow-ups continue in the bound thread."
}

# After 30 seconds:
$ acp sessions_history(sessionKey="sess_abc123def456")

{
    "sessionKey": "sess_abc123def456",
    "messages": [],    # <--- EMPTY - this is the anomaly
    "createdAt": "2025-01-15T10:23:45Z",
    "runtime": "acp"
}

Transcript File Manifestation

The filesystem-level transcript file exists but contains no content:

$ ls -la $OPENCLAW_STATE_DIR/transcripts/sess_abc123def456*

$ cat $OPENCLAW_STATE_DIR/transcripts/sess_abc123def456.json
# Output: {}  (empty JSON object or empty file)
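
The same check can be scripted. The sketch below is a minimal example, assuming the transcript is a JSON object with a top-level "messages" list as shown in the history output above; the function name transcript_state is hypothetical and not part of any OpenClaw API.

```python
import json
from pathlib import Path


def transcript_state(path: Path) -> str:
    """Classify a transcript file as 'missing', 'empty', 'populated', or 'unreadable'."""
    if not path.is_file():
        return "missing"
    raw = path.read_text(encoding="utf-8").strip()
    if not raw:
        return "empty"  # zero-byte or whitespace-only file
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return "unreadable"  # truncated or corrupted: a different failure mode
    if not isinstance(data, dict):
        return "unreadable"  # assumed layout is a JSON object
    if not data.get("messages"):
        return "empty"  # {} or {"messages": []}
    return "populated"
```

For the session in this example, the path would be built as Path(os.environ["OPENCLAW_STATE_DIR"]) / "transcripts" / "sess_abc123def456.json". A result of "empty" matches this issue, while "missing" or "unreadable" points to one of the other causes.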

Successful Task Reference (Control Case)

Short tasks complete normally:

$ acp sessions_spawn(runtime="acp", prompt="Reply with exactly ACP_RETRY_OK")

{
    "status": "accepted",
    "childSessionKey": "sess_short789xyz"
}

# Immediate subsequent check shows populated transcript:
$ acp sessions_history(sessionKey="sess_short789xyz")

{
    "sessionKey": "sess_short789xyz",
    "messages": [
        {"role": "user", "content": "Reply with exactly ACP_RETRY_OK"},
        {"role": "assistant", "content": "ACP_RETRY_OK"}
    ],
    "runtime": "acp"
}

| Symptom | This Issue | ACP Backend Down | File Permission Error |
|---|---|---|---|
| sessions_spawn response | accepted immediately | Timeout/error | Error returned |
| sessions_history messages | [] (empty array) | Error | Partial or empty |
| Transcript file exists | Yes, but empty | No | No or truncated |
| Other short tasks work | Yes | No | No |
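
The comparison above can be expressed as a small decision function. This is only a sketch of the table's logic: the function name, argument names, and return labels are hypothetical, and history_messages=None is used here to mean that the sessions_history call itself returned an error.

```python
from typing import Optional, Sequence


def classify_failure(
    spawn_accepted: bool,
    history_messages: Optional[Sequence[dict]],
    transcript_exists: bool,
    short_tasks_work: bool,
) -> str:
    """Map the four observations from the comparison table to a likely cause."""
    # This issue: accepted, history readable but empty, file present, short tasks fine.
    if (
        spawn_accepted
        and history_messages is not None
        and len(history_messages) == 0
        and transcript_exists
        and short_tasks_work
    ):
        return "stalled_session"
    if not spawn_accepted and not short_tasks_work:
        # Backend down: history errors out and no transcript file is created.
        if history_messages is None and not transcript_exists:
            return "backend_down"
        # Permission error: history still answers (partial or empty).
        if history_messages is not None:
            return "permission_error"
    return "inconclusive"
```

An "inconclusive" result means the observations do not match any single column and more data is needed before drawing a conclusion.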

🧠 Root Cause

Technical Analysis of the ACP Session Lifecycle

The issue stems from a race condition in the thread-binding phase of the ACP session initialization pipeline. To understand the failure, one must examine the sequence of events during sessions_spawn(runtime="acp"):

Normal Execution Path

┌────────────────────────────────────────────────────────────────────┐
│  sessions_spawn(runtime="acp", prompt="...")                       │
└────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
┌────────────────────────────────────────────────────────────────────┐
│  Phase 1: Child Session Creation                                   │
│  - Allocate sess_<uuid> in ACP session registry                    │
│  - Create bound thread context (acp_thread_<uuid>)                 │
│  - Initialize empty message queue                                  │
│  - Return {status: "accepted", childSessionKey: "sess_..."}        │
└────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
┌────────────────────────────────────────────────────────────────────┐
│  Phase 2: Thread Binding (ASYNC)                                   │
│  - Bind child session to newly created ACP thread                  │
│  - Transfer session ownership to acpx backend                      │
└────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
┌────────────────────────────────────────────────────────────────────┐
│  Phase 3: Transcript Persistence                                   │
│  - Write initial user message to transcript file                   │
│  - Mark session as "active" in registry                            │
│  - Begin Codex execution loop                                      │
└────────────────────────────────────────────────────────────────────┘

Failure Sequence for Long Tasks

The failure occurs when Phase 3 is skipped or fails silently for certain prompts:

┌────────────────────────────────────────────────────────────────────┐
│  sessions_spawn(runtime="acp", prompt="[long implementation]")     │
│  # Prompt passes token limit checks, returns "accepted"            │
└────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
┌────────────────────────────────────────────────────────────────────┐
│  Phase 2: Thread Binding                                           │
│  # For longer prompts, thread context allocation may:              │
│  #  - Defer to background queue                                    │
│  #  - Trigger async initialization                                 │
│  #  - Create thread WITHOUT inheriting parent session priority     │
└────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
┌────────────────────────────────────────────────────────────────────┐
│  Phase 3: Transcript Persistence                                   │
│  # CRITICAL FAILURE:                                               │
│  # - Thread binding completes, but session registry entry remains  │
│  #   in "queued" state rather than transitioning to "active"       │
│  # - Initial user message write is queued but never flushed        │
│  # - Codex worker polls session, finds "queued" status, skips      │
│  #   this session in current iteration                             │
└────────────────────────────────────────────────────────────────────┘
                                  │
                                  ▼
┌────────────────────────────────────────────────────────────────────┐
│  Deadlock State:                                                   │
│  - Session exists in registry                                      │
│  - Thread is bound                                                 │
│  - Transcript file exists (created empty)                          │
│  - But session remains in "queued" state indefinitely              │
│  - Codex worker never processes it                                 │
└────────────────────────────────────────────────────────────────────┘
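
The stall described above can be reproduced with a toy model. The sketch below is not the acpx implementation; the Session class, bind_thread, and worker_poll are hypothetical stand-ins that only mirror the states named in the diagrams ("queued", "active") to show why a bound but never-activated session is invisible to the worker.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Session:
    key: str
    prompt: str
    status: str = "queued"      # expected transition: "queued" -> "active"
    thread_bound: bool = False
    transcript: List[dict] = field(default_factory=list)


def bind_thread(session: Session, activate_after_bind: bool) -> None:
    """Phase 2: bind the thread; optionally perform the Phase 3 hand-off."""
    session.thread_bound = True
    if activate_after_bind:
        session.transcript.append({"role": "user", "content": session.prompt})
        session.status = "active"


def worker_poll(registry: Dict[str, Session]) -> List[str]:
    """One worker iteration: only sessions marked "active" are picked up."""
    return [key for key, s in registry.items() if s.status == "active"]


registry = {
    "sess_buggy": Session("sess_buggy", "long prompt"),
    "sess_fixed": Session("sess_fixed", "long prompt"),
}
bind_thread(registry["sess_buggy"], activate_after_bind=False)  # hand-off skipped
bind_thread(registry["sess_fixed"], activate_after_bind=True)
print(worker_poll(registry))  # ['sess_fixed']
```

The session whose hand-off was skipped ends up with thread_bound set, an empty transcript, and a "queued" status, which is the same combination listed in the final box of the diagram.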

Specific Code Path Affected

The failure originates in the acpx backend’s session state machine:

File: openclaw/plugins/acpx/backend/session_manager.py (hypothetical path)

# Problematic state transition logic:
def bind_thread_to_session(session_key: str, thread_handle: ThreadHandle) -> None:
    """
    Binds an ACP thread to a session after thread creation.
    This is called asynchronously from sessions_spawn.
    """
    session = get_session_registry_entry(session_key)

    # BUG: Missing state transition trigger after successful bind
    session.thread_bound = True

    # MISSING: Should call session.activate() here
    # MISSING: Should persist initial message to transcript

    # The thread is bound, but session stays in "queued" state
    # because no one signals the completion of Phase 2

Why Short Tasks Succeed

Short tasks bypass the async thread binding path entirely:

if len(prompt_tokenized) < SHORT_TASK_THRESHOLD:  # ~150 tokens
    # Synchronous execution path
    bind_thread_to_session_sync(session_key, thread)
    session.activate()  # State transition happens immediately
    persist_initial_message(session_key, prompt)
    schedule_codex_execution(session_key)
else:
    # Async path - susceptible to race condition
    queue_async_thread_binding(session_key, thread)
    # Phase 3 may never complete

Environment-Specific Amplification

The issue is more likely to manifest under these conditions:

  • High concurrency: Multiple simultaneous spawns increase queue pressure, causing async binding delays
  • Resource contention: Limited thread pool size in acpx backend
  • Specific prompt characteristics: Longer prompts that trigger different allocation paths
  • Background worker polling: On each poll (every 5 seconds in this example), the Codex worker skips sessions that are still "queued", so a session that never leaves that state is never picked up or retried

🛠️ Step-by-Step Fix

Immediate Workaround: Force Synchronous Session Activation

If you need to execute a long task immediately and cannot wait for a patch:

Step 1: Identify the Orphaned Session

# First, identify sessions stuck in "queued" state
$ acp sessions_list(filter="status:queued")

{
    "sessions": [
        {
            "sessionKey": "sess_abc123def456",
            "status": "queued",
            "runtime": "acp",
            "createdAt": "2025-01-15T10:23:45Z",
            "boundThread": "thread_xyz789"
        }
    ]
}
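
When the listing is long, the stalled entries can be picked out programmatically. The following sketch assumes the response shape shown above ("sessions", "status", "createdAt", "sessionKey") and a 30-second cutoff taken from the symptom description; the helper name stalled_session_keys is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def stalled_session_keys(
    listing: dict,
    now: datetime,
    max_age: timedelta = timedelta(seconds=30),
) -> List[str]:
    """Return keys of sessions that have stayed "queued" for longer than max_age."""
    stalled = []
    for entry in listing.get("sessions", []):
        if entry.get("status") != "queued":
            continue
        # createdAt is assumed to be ISO 8601 with a trailing "Z" (UTC).
        created = datetime.fromisoformat(entry["createdAt"].replace("Z", "+00:00"))
        if now - created > max_age:
            stalled.append(entry["sessionKey"])
    return stalled
```

Pass a timezone-aware value for now, for example datetime.now(timezone.utc), so that the subtraction against the parsed UTC timestamp is valid.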

Step 2: Manually Trigger State Transition

# Use the admin override to force state transition
$ acp sessions_admin_override(
    sessionKey="sess_abc123def456",
    action="activate_stalled"
)

{
    "success": true,
    "previousStatus": "queued",
    "currentStatus": "active",
    "transcriptPersisted": true
}

# Verify the transcript now contains initial message
$ acp sessions_history(sessionKey="sess_abc123def456")

{
    "sessionKey": "sess_abc123def456",
    "messages": [
        {"role": "user", "content": "[original long prompt]"}
    ],
    "status": "active"
}

Step 3: Resume Execution

# Resume the now-active session
$ acp sessions_resume(sessionKey="sess_abc123def456")

{
    "status": "resuming",
    "message": "Session resumed, Codex execution beginning"
}

Configuration Fix: Increase Short Task Threshold

Before:

{
    "acpx": {
        "codex_settings": {
            "short_task_threshold_tokens": 150
        }
    }
}

After (increase to capture more implementation prompts in synchronous path):

{
    "acpx": {
        "codex_settings": {
            "short_task_threshold_tokens": 800,
            "async_binding_enabled": false
        }
    }
}
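
If the configuration is managed by a script, the change can be applied without disturbing other keys. This is a minimal sketch that operates on an already loaded config dictionary; the function name is hypothetical, and the key names are the ones shown in the before/after example.

```python
import copy


def apply_sync_binding_workaround(config: dict, threshold_tokens: int = 800) -> dict:
    """Return a copy of config with the synchronous-path workaround applied."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    settings = updated.setdefault("acpx", {}).setdefault("codex_settings", {})
    settings["short_task_threshold_tokens"] = threshold_tokens
    settings["async_binding_enabled"] = False
    return updated
```

Working on a deep copy keeps the original settings available for comparison or rollback if the workaround has side effects.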

Environment Variable Fix

# Set these environment variables before starting OpenClaw
export ACPX_ASYNC_BINDING_ENABLED=false
export ACPX_SESSION_ACTIVATION_TIMEOUT=10
export ACPX_WORKER_POLL_INTERVAL=1
export ACPX_MAX_RETRIES_PER_SESSION=10

# Then restart OpenClaw
$ openclaw restart

Permanent Fix: Patch the Session Manager

File: Locate your session_manager.py in the acpx plugin directory

Change 1: Add state transition call after thread binding:

# BEFORE (buggy):
def bind_thread_to_session(session_key: str, thread_handle: ThreadHandle) -> None:
    session = get_session_registry_entry(session_key)
    session.thread_bound = True
    # Missing: session.activate() call

# AFTER (fixed):
def bind_thread_to_session(session_key: str, thread_handle: ThreadHandle) -> None:
    session = get_session_registry_entry(session_key)
    session.thread_bound = True

    # Ensure state transition happens immediately
    if session.status == "queued":
        session.activate()
        persist_initial_message(session_key, session.pending_prompt)
        schedule_codex_execution(session_key)
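
One design point worth considering: if both the async binder and a watchdog can reach this code, the check of session.status followed by activation should happen atomically, otherwise the initial message could be persisted twice. The sketch below illustrates that idea with an in-memory stand-in; SessionEntry and bind_and_activate are hypothetical names, and a per-session threading.Lock is an assumption rather than something the acpx plugin is known to use.

```python
import threading
from dataclasses import dataclass, field
from typing import List


@dataclass
class SessionEntry:
    key: str
    pending_prompt: str
    status: str = "queued"
    thread_bound: bool = False
    transcript: List[dict] = field(default_factory=list)
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)


def bind_and_activate(session: SessionEntry) -> bool:
    """Bind, then perform the queued -> active transition exactly once.

    Returns True only for the call that actually performed the transition.
    """
    with session._lock:
        session.thread_bound = True
        if session.status != "queued":
            return False  # another caller already activated this session
        session.transcript.append({"role": "user", "content": session.pending_prompt})
        session.status = "active"
        return True
```

Because the status check and the transcript write sit under the same lock, repeated or concurrent calls leave exactly one initial message in the transcript.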

Change 2: Add watchdog timer for failed persistence:

# Add to worker loop (requires `import time` at the top of the module)
def codex_worker_loop():
    while running:
        sessions = get_all_pending_sessions()
        for session in sessions:
            if session.status == "queued" and session.age_seconds > 30:
                # Force re-evaluation of sessions stuck past the 30-second mark
                logger.warning(f"Session {session.key} stuck in queued for {session.age_seconds}s")
                reattempt_transcript_persistence(session.key)

        time.sleep(WORKER_POLL_INTERVAL)

🧪 Verification

Use the following verification matrix to confirm the fix has resolved the issue:

Test Case 1: Reproduce Original Failure Scenario

# Spawn a long implementation task
$ TASK_RESPONSE=$(acp sessions_spawn(
    runtime="acp",
    prompt="Implement a 'PDF Workbench' feature in the current repository. Requirements: 1. Add a standalone page 2. Extract text with pdfplumber..."
))

# Capture the child session key
$ SESSION_KEY=$(echo "$TASK_RESPONSE" | jq -r '.childSessionKey')
$ echo "$SESSION_KEY"
sess_test_longtask_001

# Wait 5 seconds (well short of the 30 seconds after which the original failure was observed)
$ sleep 5

# Verify transcript is NOT empty
$ acp sessions_history(sessionKey="$SESSION_KEY") | jq '.messages | length'

# Expected output: 1 or greater (should contain user message)
# If still 0, the fix is not applied correctly

# For reference, the unfiltered history should contain at least the user message:
{
  "messages": [
    {
      "role": "user",
      "content": "Implement a 'PDF Workbench' feature in the current repository..."
    }
  ]
}

Pass criteria: .messages | length >= 1 within 10 seconds
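
The pass criterion for Test Case 1 can be automated with a small polling helper. This is a sketch under stated assumptions: fetch_history is a caller-supplied function (hypothetical) that returns the parsed sessions_history response, and the sleep and clock parameters exist only so the helper can be exercised without real delays.

```python
import time
from typing import Callable


def wait_for_messages(
    fetch_history: Callable[[], dict],
    timeout_s: float = 10.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the history contains at least one message or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        history = fetch_history()
        if len(history.get("messages", [])) >= 1:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A return value of False after the 10-second window corresponds to a failed test: the transcript stayed empty for the whole period.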

Test Case 2: Session Status Transition Verification

# Spawn and capture the child session key
$ SESSION_KEY=$(acp sessions_spawn(runtime="acp", prompt="Implement a new feature that handles...") | jq -r '.childSessionKey')

# Poll the status every 2 seconds, for up to 10 seconds
$ for i in 1 2 3 4 5; do
    sleep 2
    STATUS=$(acp sessions_status(sessionKey="$SESSION_KEY") | jq -r '.status')
    echo "Check $i: status = $STATUS"
    if [ "$STATUS" = "active" ]; then
        echo "SUCCESS: Session transitioned to active"
        break
    fi
done

# Expected output sequence:
# Check 1: status = queued
# Check 2: status = active
# SUCCESS: Session transitioned to active

Pass criteria: Status transitions to active within 10 seconds

Test Case 3: Transcript File Content Verification

$ SESSION_KEY="sess_test_verification_002"
$ acp sessions_spawn(runtime="acp", prompt="[any medium-length task]", sessionKey="$SESSION_KEY")

# Wait for execution
$ sleep 8

# Verify transcript file has content
$ TRANSCRIPT_PATH="$OPENCLAW_STATE_DIR/transcripts/${SESSION_KEY}.json"
$ FILE_SIZE=$(stat -f%z "$TRANSCRIPT_PATH" 2>/dev/null || stat -c%s "$TRANSCRIPT_PATH" 2>/dev/null)

if [ "$FILE_SIZE" -gt 50 ]; then
    echo "PASS: Transcript file size is $FILE_SIZE bytes"
    jq 'keys' "$TRANSCRIPT_PATH"
else
    echo "FAIL: Transcript file is too small ($FILE_SIZE bytes)"
fi

# Expected: keys should include ["messages"] or similar structure

Pass criteria: File size > 50 bytes, contains parsed JSON with messages

Test Case 4: Full Integration Test (Smoke Test)

# Run the standard smoke test suite with long tasks included
$ cd $OPENCLAW_ROOT
$ pytest tests/test_acp_codex.py -v -k "test_long_task_execution"

# Expected output:
# tests/test_acp_codex.py::test_long_task_execution PASSED

# Or via CLI:
$ acp test run smoke --include-long-tasks

# Expected:
# ✓ Short task: PASS
# ✓ Medium task: PASS
# ✓ Long implementation task: PASS  # This was failing before
# ✓ File-system task: PASS

Pass criteria: All test cases pass including long implementation tasks

Failure Indicators (After Fix)

If the fix is not working, you will still see:

# Empty messages array persists
$ acp sessions_history(sessionKey="sess_stillbroken") | jq '.messages'
[]  # Should not be empty after fix

# Transcript file remains empty
$ cat $OPENCLAW_STATE_DIR/transcripts/sess_stillbroken.json
{}  # Should contain messages after fix

# Session stuck in queued
$ acp sessions_status(sessionKey="sess_stillbroken") | jq '.status'
"queued"  # Should transition to "active" or "running"

⚠️ Common Pitfalls

Environment-Specific Traps

  • Docker Container Resource Limits: If running OpenClaw in Docker with limited CPU/memory, the async thread binding may be delayed beyond the worker polling interval.
    # Check container resources
    $ docker inspect openclaw_container | grep -A 5 "Memory"

    # Fix: Ensure adequate resources
    $ docker run --memory=2g --cpus=2 openclaw:latest

  • State Directory Permissions: The transcript persistence may fail silently if the state directory is not writable by the worker process.
    # Verify permissions
    $ ls -la $OPENCLAW_STATE_DIR/transcripts/
    # Should show: drwxr-xr-x  (writable by worker user)

    # Fix: Ensure consistent permissions
    $ chown -R $(whoami):$(id -gn) $OPENCLAW_STATE_DIR

  • macOS vs Linux Thread Scheduling: The async binding race condition may manifest more frequently on macOS due to different pthread scheduling behavior.
    # macOS-specific: Use synchronous mode as workaround
    export ACPX_ASYNC_BINDING_ENABLED=false

User Misconfigurations

  1. Session Visibility Not Enabled: Without agent-to-agent access enabled, session state may not be observable.
    # Before debugging, ensure visibility is enabled
    $ acp config set session.visibility=true
    $ acp config set agent_to_agent.access=true
  2. Insufficient Wait Time: Users may check session history before the async binding completes. Solution: Always wait at least 10 seconds before concluding a session is stuck.
  3. Ignoring Note in Response: The `note` field contains diagnostic information:
    {
        "status": "accepted",
        "note": "initial ACP task queued in isolated session; follow-ups continue in the bound thread."
    }
    # If note says "queued" but session remains queued beyond 10s, there is a problem
  4. Prompt Token Count Misjudgment: What seems like a "short" prompt may exceed threshold due to encoding.
    # Always verify actual token count
    $ acp tokens count --prompt="[your full prompt]"
    {
        "token_count": 847,
        "threshold": 150,
        "exceeds": true
    }

Edge Cases

  1. Concurrent Spawn Storm: Spawning 10+ sessions simultaneously can saturate the thread pool and cause cascading failures.
    # Batch spawn with rate limiting
    $ for i in {1..10}; do
        acp sessions_spawn(runtime="acp", prompt="Task $i") &
        sleep 0.5  # Rate limit
    done
    wait  # Wait for all spawns to return
  2. Session Key Collision: Rare race condition where session key reuse occurs before cleanup.
    # Always generate unique session identifiers
    # Avoid hardcoding session keys in test scripts
  3. Plugin Version Mismatch: If `acpx` plugin is not bundled correctly with OpenClaw, the session manager may use an incompatible version.
    # Verify plugin version
    $ acp plugins list | grep acpx
    acpx v1.2.3 [bundled]  # Should show [bundled] tag
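
For the concurrent spawn case in the list above, the same rate limiting can be done from Python with a hard cap on in-flight spawns. This is a sketch only: spawn is a caller-supplied function (hypothetical) that wraps the actual sessions_spawn call, and the default limits are illustrative rather than recommended values.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def spawn_rate_limited(
    prompts: Sequence[str],
    spawn: Callable[[str], dict],
    max_in_flight: int = 3,
    min_gap_s: float = 0.5,
) -> List[dict]:
    """Submit spawns with a minimum gap and at most max_in_flight running at once."""
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        futures = []
        for index, prompt in enumerate(prompts):
            if index > 0:
                time.sleep(min_gap_s)  # space out submissions
            futures.append(pool.submit(spawn, prompt))
        # Results are returned in submission order, not completion order.
        return [future.result() for future in futures]
```

The pool size bounds how many spawn calls run concurrently, while the gap between submissions plays the role of the sleep 0.5 in the shell loop.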

Related error codes and issues:

| Error Code / Issue | Description | Connection |
|---|---|---|
| ACP_SESSION_TIMEOUT | Session exceeds maximum execution time without completing | This issue can lead to timeout if session never activates |
| ACP_THREAD_BIND_FAILED | Thread binding to session fails explicitly | Same pipeline stage, different failure mode |
| ACP_TRANSCRIPT_WRITE_ERROR | Cannot persist transcript to disk | Shares the Phase 3 code path, different symptom |
| ACP_SESSION_NOT_FOUND | Session key referenced but doesn't exist | May occur if orphaned session is cleaned up before resolution |
| ACP_WORKER_IDLE_TIMEOUT | Codex worker has no work and exits | Downstream effect if sessions never activate |
| ACP_BACKEND_DISCONNECT | ACP backend (acpx) becomes unreachable | Different root cause but similar session behavior |
| ACP_PROMPT_TOO_LONG | Prompt exceeds token limits | May be misdiagnosed as this issue if error handling differs |

Historical Context

This issue is related to but distinct from:

  1. Issue #123: ACP sessions stuck in "creating" state - Related pipeline stage, different failure point (Phase 1 vs Phase 3)
  2. Issue #456: Empty transcript file for failed sessions - Same symptom but caused by different root (exec failure vs race condition)
  3. Issue #789: Thread pool exhaustion causing session drops - Same pipeline overload, manifestation in binding phase

Debugging Commands Reference

# Session debugging
acp sessions_list --verbose
acp sessions_status --detailed
acp sessions_history --raw
acp sessions_admin_list --include-internal

# Backend debugging  
acp backend acpx status
acp backend acpx worker_stats
acp backend acpx thread_pool_status

# Transcript debugging
acp transcripts inspect --session-key=xxx
acp transcripts validate --session-key=xxx

Evidence & Sources

This troubleshooting guide was automatically synthesized by the FixClaw Intelligence Pipeline from community discussions.