Test Suite Reference¶

This page is auto-generated from pytest docstrings at build time. Do not edit manually. See docs/_scripts/gen_tests_doc.py.

Modules: 35 Tests: 108

How it works¶

The generator scans tests/ for test_*.py modules, reads docstrings of modules, test classes, and test functions (including async def), and renders them here. The first sentence becomes a short summary in the index; the full docstring appears in Details.

Writing good test docstrings¶

async def test_outbox_retry_backoff():
    """
    First two sends fail → item moves to *retry* with exponential backoff;
    on the 3rd attempt the item becomes *sent*.
    """

Modules¶

tests/e2e/vars/test_cancel_preserves_vars.py 1 test (marks: asyncio)

E2E: cancelling a running task does not erase previously written coordinator.vars.

test_cancel_midflow_preserves_coordinator_vars — (no docstring)

tests/e2e/vars/test_conditional_routing_xfail.py 1 test (marks: asyncio, xfail)

XFails (placeholder): conditional routing by coordinator.vars is not implemented yet.

test_conditional_routing_by_vars — (no docstring)

tests/e2e/vars/test_coord_fn_vars_echo_e2e.py 1 test (marks: asyncio)

E2E (slow): scheduler-driven coordinator_fn → coordinator_fn chain.

test_scheduler_runs_coordfn_chain_and_child_sees_vars — (no docstring)

tests/integration/adapters/test_adapter_misc.py 5 tests (marks: asyncio)

test_cmd_from_node_alias_supported — cmd.input_inline may provide 'from_node' (single) instead of 'from_nodes' (list).
test_cmd_unknown_adapter_causes_permanent_fail — If cmd specifies an unknown adapter, the worker must fail the task permanently and must NOT call handler.iter_batches.
test_fallback_to_iter_batches_when_no_adapter — If neither cmd nor handler specifies an adapter, worker must fallback to handler.iter_batches.
test_handler_adapter_used_when_cmd_absent — When cmd does not specify an adapter, the worker may use the handler's adapter if it is known.
test_handler_unknown_adapter_ignored_when_cmd_valid — If the handler proposes an unknown adapter but cmd provides a valid one, the worker must follow cmd and succeed.

tests/integration/adapters/test_empty_upstream.py 1 test (marks: asyncio)

test_empty_upstream_finishes_with_zero_count — (no docstring)

tests/integration/adapters/test_input_adapter_contract.py 3 tests (marks: asyncio)

Integration tests for the orchestrator-first input adapter contract.

test_empty_upstream_is_not_error — Valid adapter + empty upstream → normal finish (count=0), not a failure.
test_missing_required_args_yields_bad_input_args — cmd.input_inline with 'pull.from_artifacts.rechunk:size' must include 'size'.
test_rechunk_without_meta_list_key_treats_each_artifact_as_single_item — No meta_list_key → each upstream artifact's meta is a single logical item (no heuristics).

tests/integration/adapters/test_rechunk_args.py 3 tests (marks: asyncio)

test_rechunk_invalid_meta_list_key_type_fails — (no docstring)
test_rechunk_invalid_size_fails_early — (no docstring)
test_rechunk_missing_size_fails_early — (no docstring)

tests/integration/adapters/test_rechunk_determinism.py 1 test (marks: asyncio)

test_rechunk_without_meta_key_treats_each_artifact_as_one_item — (no docstring)

tests/integration/adapters/test_rechunk_strict_mode.py 2 tests (marks: asyncio, cfg)

test_rechunk_requires_meta_key_in_strict_mode — In strict mode, rechunk without meta_list_key must fail early.
test_rechunk_with_meta_key_passes_in_strict_mode — (no docstring)

tests/integration/adapters/test_unknown_adapter.py 1 test (marks: asyncio)

test_cmd_unknown_adapter_permanent_fail — (no docstring)

tests/integration/adapters/test_worker_adapter_precedence.py 1 test (marks: asyncio)

test_cmd_input_inline_overrides_handler_load_input — When configurations conflict, cmd.input_inline must take precedence over handler-provided load_input.

tests/integration/streaming/test_isolation.py 2 tests (marks: asyncio)

Isolation of aggregation and cross-talk guard between tasks.

test_metrics_cross_talk_guard — Two concurrent tasks (same node_ids across different task_ids) do not interfere: final 'count' values are isolated per task.
test_metrics_isolation_between_tasks — Two back-to-back tasks must keep metric aggregation isolated per task document.

tests/integration/streaming/test_metrics_dedup.py 3 tests (marks: asyncio)

Idempotency & event ordering for aggregated metrics.

test_metrics_dedup_persists_across_coord_restart — Metric deduplication survives a coordinator restart: duplicates of STATUS (BATCH_OK/TASK_DONE) sent during the restart must not double-count.
test_metrics_idempotent_on_duplicate_status_events — Duplicate STATUS events (BATCH_OK / TASK_DONE) must not double-increment aggregated metrics.
test_status_out_of_order_does_not_break_aggregation — BATCH_OK STATUS events arrive out of order → the aggregated metric remains correct.

tests/integration/streaming/test_metrics_exactness.py 3 tests (marks: asyncio)

Exactness and accounting of aggregated metrics.

test_metrics_multistream_exact_sum — Three upstreams -> one downstream: analyzer's aggregated 'count' must equal the sum of all totals.
test_metrics_partial_batches_exact_count — With a remainder in the last upstream batch, analyzer's 'count' must still exactly equal total.
test_metrics_single_stream_exact_count — Single upstream -> single downstream: analyzer's aggregated 'count' must equal the total items.

tests/integration/streaming/test_multistream_fanin.py 1 test (marks: asyncio)

Multistream fan-in to a single downstream consumer.

test_multistream_fanin_stream_to_one_downstream — Multi-stream fan-in: three upstream indexers stream into one analyzer.

tests/integration/streaming/test_start_when.py 2 tests (marks: asyncio)

Tests for streaming/async fan-in behavior: early vs delayed downstream start.

test_after_upstream_complete_delays_start — Without start_when, the downstream must not start until the upstream is fully finished.
test_start_when_first_batch_starts_early — Downstream should start while upstream is still running when `start_when=first_batch` is set.

tests/integration/vars/test_coord_fn_chain.py 1 test (marks: asyncio, parametrize)

test_coord_fn_chain_set_merge_incr_unset — Integration scenario: n1: vars.set → routing.sla=<param>, limits.max_batches=<param> n2: vars.merge → flags=<param> n3: vars.incr → counters.pages += <param> n4: vars.unset → limits.max_batches Verifies:.

tests/integration/vars/test_unset_forms.py 1 test (marks: asyncio, parametrize)

test_vars_unset_kwargs_list_in_coord_fn — Coordinator function 'vars.unset' must accept kwargs form `keys=[...]`.

tests/test_artifacts_data.py 2 tests (marks: asyncio)

test_merge_generic_creates_complete_artifact — coordinator_fn: merge.generic combines results of nodes 'a' and 'b'.
test_partial_shards_and_stream_counts — Source w1 (indexer) emits batches with batch_uid → worker creates partial artifacts.

tests/test_cancel_and_restart.py 5 tests (marks: asyncio)

test_cancel_before_any_start_keeps_all_nodes_idle — Cancel the task before any node can start: no node must enter 'running'.
test_cancel_on_deferred_prevents_retry — Node 'flaky' fails on first attempt → becomes deferred with backoff.
test_cascade_cancel_prevents_downstream — Cancel the task while upstream runs: downstream must not start.
test_restart_higher_epoch_ignores_old_batch_ok — After a node restarts with a higher epoch, the coordinator must ignore a stale BATCH_OK from a lower epoch.
test_restart_higher_epoch_ignores_old_events — After accepting epoch>=1, re-inject an old event (epoch=0).

tests/test_chaos_models.py 4 tests (marks: asyncio)

End-to-end streaming smoke tests under chaos.

test_chaos_coordinator_restart — Restart the Coordinator while the task is running.
test_chaos_delays_and_duplications — Chaos mode: small broker/consumer jitter + message duplications (no drops).
test_chaos_worker_restart_mid_stream — Restart the 'enricher' worker in the middle of the stream.
test_e2e_streaming_with_kafka_chaos — With chaos enabled (broker/consumer jitter and message duplications, no drops), the full pipeline should still complete and produce artifacts at indexer/enricher/ocr stages.

tests/test_concurrency_limits.py 4 tests (marks: asyncio)

test_concurrent_tasks_respect_global_limit — (no docstring)
test_max_global_running_limit — (no docstring)
test_max_type_concurrency_limits — (no docstring)
test_multi_workers_same_type_rr_distribution — (no docstring)

tests/test_lease_heartbeat_resume.py 11 tests (marks: asyncio, cfg)

Heartbeat / grace-window / resume tests.

test_deferred_retry_ignores_grace_gate — A DEFERRED retry must not be throttled by discovery grace window.
test_grace_gate_blocks_then_allows_after_window — Grace window should delay start initially and allow it after the window elapses.
test_heartbeat_hard_marks_task_failed — If hard heartbeat window is exceeded, the task should be marked FAILED.
test_heartbeat_soft_deferred_then_recovers — With a short soft heartbeat window the task becomes DEFERRED, then recovers and finishes.
test_heartbeat_tolerates_clock_skew — Worker clock skew should not cause a hard timeout; lease deadlines must be non-decreasing.
test_heartbeat_updates_lease_deadline_simple — Heartbeats should move the lease.deadline_ts_ms forward while the node runs.
test_lease_expiry_cascades_cancel — Permanent fail upstream should cascade-cancel downstream nodes.
test_no_task_resumed_on_worker_restart — On a cold worker restart there must be no TASK_RESUMED event emitted by the worker.
test_resume_inflight_worker_restarts_with_local_state — Restart with the same worker_id: coordinator should adopt inflight work without new epoch.
test_task_discover_complete_artifacts_skips_node_start — If artifacts are 'complete' during discovery, node should auto-finish without starting its handler.
test_worker_restart_with_new_id_bumps_epoch — Restart with a new worker_id must bump attempt_epoch; stale heartbeats from the old epoch are ignored.

tests/test_outbox_delivery.py 6 tests (marks: asyncio, cfg, use_outbox)

test_outbox_backoff_caps_with_jitter — Exponential backoff is capped by max and jitter stays within a reasonable window.
test_outbox_crash_between_send_and_mark_sent — Send succeeds, coordinator crashes before mark(sent) → after restart the outbox may resend, but there is only ONE effective delivery for (topic,key,dedup_id).
test_outbox_dedup_survives_restart — Deduplication by (topic,key,dedup_id) persists across coordinator restart: re-enqueue of the same envelope after restart does not create a second outbox row or send.
test_outbox_dlq_after_max_retries — After reaching max retries, the outbox row is moved to a terminal state with attempts >= max.
test_outbox_exactly_once_fp_uniqueness — Two enqueues with identical (topic,key,dedup_id) → single outbox row and exactly one real send.
test_outbox_retry_backoff — First two _raw_send calls fail → outbox goes to 'retry' with exponential backoff (with jitter), then on the 3rd attempt becomes 'sent'.

tests/test_pipeline_smoke.py 1 test (marks: asyncio)

End-to-end streaming smoke test.

test_e2e_streaming_with_kafka_sim — Full pipeline should complete and produce artifacts at indexer/enricher/ocr stages.

tests/test_reliability_failures.py 7 tests (marks: asyncio, cfg)

Tests around source-like roles: idempotent metrics, retries, fencing, coordinator restart adoption, cascade cancel, and heartbeat/lease updates.

test_coordinator_restart_adopts_inflight_without_new_epoch — When the coordinator restarts, it should adopt in-flight work without incrementing the worker's attempt_epoch unnecessarily (i.e., source keeps epoch=1).
test_explicit_cascade_cancel_moves_node_to_deferred — Explicit cascade cancel should move a running node to a cancelling/deferred/queued state within the configured cancel_grace window.
test_heartbeat_updates_lease_deadline — Heartbeats from a worker must extend the lease deadline in the task document.
test_idempotent_metrics_on_duplicate_events — Duplicated STATUS events (BATCH_OK/TASK_DONE) must not double-count metrics.
test_permanent_fail_cascades_cancel_and_task_failed — Permanent failure in an upstream node should cause the task to fail, while dependents get cancelled/deferred/queued depending on race windows.
test_status_fencing_ignores_stale_epoch — Status fencing must ignore events from a stale attempt_epoch.
test_transient_failure_deferred_then_retry — A transient error should defer the node and succeed on retry according to retry_policy (max>=2).

tests/test_scheduler_fanin_routing.py 7 tests (marks: asyncio, xfail)

Fan-in behavior (ANY/ALL/COUNT:N) and coordinator_fn merge smoke tests.

test_coordinator_fn_merge_without_worker — coordinator_fn node should run without a worker and produce artifacts that a downstream analyzer can consume via pull.from_artifacts.
test_edges_vs_routing_priority — If explicit graph edges are present and a node also has routing.on_success, edges should take precedence (routing target should not run).
test_fanin_all_waits_all_parents — Fan-in ALL: without start_when hint, downstream should only start after both parents are finished.
test_fanin_any_starts_early — Fan-in ANY: downstream should start as soon as at least one parent streams (start_when='first_batch'), even if other parents are not yet finished.
test_fanin_count_n — Fan-in COUNT:N placeholder: downstream should start when at least N parents are ready.
test_fanout_one_upstream_two_downstreams_mixed_start_when — One upstream → two downstreams: A has start_when=first_batch (starts early), B has no start_when (waits for completion).
test_routing_on_failure_triggers_remediator_only — On upstream TASK_FAILED(permanent=True), only the 'on_failure' remediator should run; 'on_success' must not.

tests/unit/vars/test_detectors.py 3 tests (marks: asyncio)

test_detector_function_with_merge — DI: callable detector.
test_detector_object_block_and_soft — DI: object detector.
test_detector_string_import — DI: string import "module:factory".

tests/unit/vars/test_incr.py 5 tests (marks: asyncio, parametrize)

test_vars_incr_concurrency_sum — Concurrency: 10x1 + 5x2 → exact sum.
test_vars_incr_creates_missing_leaf — Creating a missing leaf should create it and increment.
test_vars_incr_float_and_negative_and_type_conflict — Mixed numeric types are allowed; negative increments also work.
test_vars_incr_rejects_nan_inf — NaN / +/-Inf must be rejected.
test_vars_incr_validates_key — Key validation (segments).

tests/unit/vars/test_merge.py 4 tests (marks: asyncio)

test_vars_merge_block_sensitive_with_external_detector — External detector: block when block_sensitive=True, otherwise write but count sensitive_hits in logs.
test_vars_merge_deep_overwrite_mixed_types — Deep-merge with leaf overwrite; mixed types (dict → leaf).
test_vars_merge_limits_and_value_size_str_and_bytes — Limits: number of paths, depth, full path length, and value size (str/bytes).
test_vars_merge_noop_empty — Empty input → no-op (touched=0), 'vars' field is not created.

tests/unit/vars/test_set.py 7 tests (marks: asyncio, parametrize)

test_vars_set_block_sensitive_with_external_detector — With block_sensitive=True the detector blocks.
test_vars_set_deterministic_keys_order_and_sensitive_hits — Deterministic sort order in logs + sensitive_hits counter.
test_vars_set_dotted_and_nested_combo_and_logging — Dotted + nested forms in one call; verify values and structured logs.
test_vars_set_limits_depth_and_path_len — Limits: key depth and full path length.
test_vars_set_limits_paths_count — Limit on number of paths in a single $set.
test_vars_set_limits_value_size_str_and_bytes — Limit on value size (string/bytes).
test_vars_set_validates_key_segments — Validate segments: $, '.', NUL, segment length.

tests/unit/vars/test_unset.py 4 tests (marks: asyncio, parametrize)

test_vars_unset_accepts_args_and_kwargs — The adapter should accept both *args (varargs keys) and kwargs form 'keys=[...]'.
test_vars_unset_path_length_limit — Validate full path length limit (including prefix 'coordinator.vars.').
test_vars_unset_simple_nested_and_noop — Remove a simple leaf and a nested branch; missing key is a no-op (operation remains valid and accounted in 'touched').
test_vars_unset_validates_paths — Validate key/path formats: $, '.', NUL, segment length.

tests/unit/worker/test_error_classification.py 1 test

Unit test: default RoleHandler.classify_error marks config/programming errors as permanent.

test_programming_errors_are_permanent — (no docstring)

tests/unit/worker/test_error_policy.py 2 tests

test_classify_error_non_permanent_for_other_exceptions — (no docstring)
test_classify_error_permanent_for_common_programming_faults — (no docstring)

tests/unit/worker/test_pull_adapters_rechunk.py 2 tests (marks: asyncio)

Unit tests for PullAdapters.iter_from_artifacts_rechunk() selection rules.

test_rechunk_with_meta_list_key_splits_that_list — (no docstring)
test_rechunk_without_meta_list_key_wraps_meta_as_single_item — (no docstring)

tests/e2e/vars/test_cancel_preserves_vars.py¶

E2E: cancelling a running task does not erase previously written coordinator.vars.

Scenario: - n1: coordinator_fn (vars.set) writes task variables. - s : long-running cancellable 'source' node (worker-based). - Cancel the task while 's' is running. Assertions: - Task (or node) transitions away from 'running' due to cancel (implementation-dependent). - coordinator.vars written by n1 remain intact after cancellation.

Tests¶

Test	Summary	Marks	Location
test_cancel_midflow_preserves_coordinator_vars	(no docstring)	asyncio	`tests/e2e/vars/test_cancel_preserves_vars.py:27`

Details¶

test_cancel_midflow_preserves_coordinator_vars¶

marks: asyncio • location: tests/e2e/vars/test_cancel_preserves_vars.py:27

No docstring.

tests/e2e/vars/test_conditional_routing_xfail.py¶

XFails (placeholder): conditional routing by coordinator.vars is not implemented yet.

Intended behavior (when implemented): - n1 sets coordinator.vars.routing.sla = "gold". - Downstream 'gold_only' should run when sla == "gold". - Downstream 'silver_only' should NOT run in this case.

Current status: - Marked xfail until coordinator supports variable-driven conditional routing.

Tests¶

Test	Summary	Marks	Location
test_conditional_routing_by_vars	(no docstring)	asyncio, xfail	`tests/e2e/vars/test_conditional_routing_xfail.py:25`

Details¶

test_conditional_routing_by_vars¶

marks: asyncio, xfail • location: tests/e2e/vars/test_conditional_routing_xfail.py:25

No docstring.

tests/e2e/vars/test_coord_fn_vars_echo_e2e.py¶

E2E (slow): scheduler-driven coordinator_fn → coordinator_fn chain.

Graph: n1: coordinator_fn (vars.set) → routing.sla="gold", limits.max_batches=5 n2: coordinator_fn (vars.echo) → reads coordinator.vars and writes an artifact echo

Checks: - scheduler starts nodes automatically (no manual _run_coordinator_fn) - both nodes finish - echo artifact exists for n2 and contains an exact copy of coordinator.vars - updated_at present (and sane)

Tests¶

Test	Summary	Marks	Location
test_scheduler_runs_coordfn_chain_and_child_sees_vars	(no docstring)	asyncio	`tests/e2e/vars/test_coord_fn_vars_echo_e2e.py:26`

Details¶

test_scheduler_runs_coordfn_chain_and_child_sees_vars¶

marks: asyncio • location: tests/e2e/vars/test_coord_fn_vars_echo_e2e.py:26

No docstring.

tests/integration/adapters/test_adapter_misc.py¶

Tests¶

Test	Summary	Marks	Location
test_cmd_from_node_alias_supported	cmd.input_inline may provide 'from_node' (single) instead of 'from_nodes' (list).	asyncio	`tests/integration/adapters/test_adapter_misc.py:213`
test_cmd_unknown_adapter_causes_permanent_fail	If cmd specifies an unknown adapter, the worker must fail the task permanently and must NOT call handler.iter_batches.	asyncio	`tests/integration/adapters/test_adapter_misc.py:179`
test_fallback_to_iter_batches_when_no_adapter	If neither cmd nor handler specifies an adapter, worker must fallback to handler.iter_batches.	asyncio	`tests/integration/adapters/test_adapter_misc.py:147`
test_handler_adapter_used_when_cmd_absent	When cmd does not specify an adapter, the worker may use the handler's adapter if it is known.	asyncio	`tests/integration/adapters/test_adapter_misc.py:102`
test_handler_unknown_adapter_ignored_when_cmd_valid	If the handler proposes an unknown adapter but cmd provides a valid one, the worker must follow cmd and succeed.	asyncio	`tests/integration/adapters/test_adapter_misc.py:262`

Details¶

test_cmd_from_node_alias_supported¶

marks: asyncio • location: tests/integration/adapters/test_adapter_misc.py:213

cmd.input_inline may provide 'from_node' (single) instead of 'from_nodes' (list).

test_cmd_unknown_adapter_causes_permanent_fail¶

marks: asyncio • location: tests/integration/adapters/test_adapter_misc.py:179

If cmd specifies an unknown adapter, the worker must fail the task permanently and must NOT call handler.iter_batches.

test_fallback_to_iter_batches_when_no_adapter¶

marks: asyncio • location: tests/integration/adapters/test_adapter_misc.py:147

If neither cmd nor handler specifies an adapter, worker must fallback to handler.iter_batches.

test_handler_adapter_used_when_cmd_absent¶

marks: asyncio • location: tests/integration/adapters/test_adapter_misc.py:102

When cmd does not specify an adapter, the worker may use the handler's adapter if it is known.

test_handler_unknown_adapter_ignored_when_cmd_valid¶

marks: asyncio • location: tests/integration/adapters/test_adapter_misc.py:262

If the handler proposes an unknown adapter but cmd provides a valid one, the worker must follow cmd and succeed.

tests/integration/adapters/test_empty_upstream.py¶

Tests¶

Test	Summary	Marks	Location
test_empty_upstream_finishes_with_zero_count	(no docstring)	asyncio	`tests/integration/adapters/test_empty_upstream.py:21`

Details¶

test_empty_upstream_finishes_with_zero_count¶

marks: asyncio • location: tests/integration/adapters/test_empty_upstream.py:21

No docstring.

tests/integration/adapters/test_input_adapter_contract.py¶

Integration tests for the orchestrator-first input adapter contract.

Covers: - Unknown adapter in cmd → permanent TASK_FAILED (bad_input_adapter). - Missing required adapter args in cmd → permanent TASK_FAILED (bad_input_args). - Empty upstream is not an error (finished with count=0). - Rechunk without meta_list_key treats each artifact as a single logical item (no heuristics).

Tests¶

Test	Summary	Marks	Location
test_empty_upstream_is_not_error	Valid adapter + empty upstream → normal finish (count=0), not a failure.	asyncio	`tests/integration/adapters/test_input_adapter_contract.py:80`
test_missing_required_args_yields_bad_input_args	cmd.input_inline with 'pull.from_artifacts.rechunk:size' must include 'size'.	asyncio	`tests/integration/adapters/test_input_adapter_contract.py:49`
test_rechunk_without_meta_list_key_treats_each_artifact_as_single_item	No meta_list_key → each upstream artifact's meta is a single logical item (no heuristics).	asyncio	`tests/integration/adapters/test_input_adapter_contract.py:127`

Details¶

test_empty_upstream_is_not_error¶

marks: asyncio • location: tests/integration/adapters/test_input_adapter_contract.py:80

Valid adapter + empty upstream → normal finish (count=0), not a failure.

test_missing_required_args_yields_bad_input_args¶

marks: asyncio • location: tests/integration/adapters/test_input_adapter_contract.py:49

cmd.input_inline with 'pull.from_artifacts.rechunk:size' must include 'size'. Omission → early permanent failure (bad_input_args). Handler.iter_batches() must NOT be called.

test_rechunk_without_meta_list_key_treats_each_artifact_as_single_item¶

marks: asyncio • location: tests/integration/adapters/test_input_adapter_contract.py:127

No meta_list_key → each upstream artifact's meta is a single logical item (no heuristics). If upstream emits 2 partial artifacts, probe should aggregate 'count' == 2 (not N of inner lists).

tests/integration/adapters/test_rechunk_args.py¶

Tests¶

Test	Summary	Marks	Location
test_rechunk_invalid_meta_list_key_type_fails	(no docstring)	asyncio	`tests/integration/adapters/test_rechunk_args.py:93`
test_rechunk_invalid_size_fails_early	(no docstring)	asyncio	`tests/integration/adapters/test_rechunk_args.py:60`
test_rechunk_missing_size_fails_early	(no docstring)	asyncio	`tests/integration/adapters/test_rechunk_args.py:50`

Details¶

test_rechunk_invalid_meta_list_key_type_fails¶

marks: asyncio • location: tests/integration/adapters/test_rechunk_args.py:93

No docstring.

test_rechunk_invalid_size_fails_early¶

marks: asyncio • location: tests/integration/adapters/test_rechunk_args.py:60

No docstring.

test_rechunk_missing_size_fails_early¶

marks: asyncio • location: tests/integration/adapters/test_rechunk_args.py:50

No docstring.

tests/integration/adapters/test_rechunk_determinism.py¶

Tests¶

Test	Summary	Marks	Location
test_rechunk_without_meta_key_treats_each_artifact_as_one_item	(no docstring)	asyncio	`tests/integration/adapters/test_rechunk_determinism.py:24`

Details¶

test_rechunk_without_meta_key_treats_each_artifact_as_one_item¶

marks: asyncio • location: tests/integration/adapters/test_rechunk_determinism.py:24

No docstring.

tests/integration/adapters/test_rechunk_strict_mode.py¶

Tests¶

Test	Summary	Marks	Location
test_rechunk_requires_meta_key_in_strict_mode	In strict mode, rechunk without meta_list_key must fail early.	asyncio, cfg	`tests/integration/adapters/test_rechunk_strict_mode.py:15`
test_rechunk_with_meta_key_passes_in_strict_mode	(no docstring)	asyncio, cfg	`tests/integration/adapters/test_rechunk_strict_mode.py:60`

Details¶

test_rechunk_requires_meta_key_in_strict_mode¶

marks: asyncio, cfg • location: tests/integration/adapters/test_rechunk_strict_mode.py:15

In strict mode, rechunk without meta_list_key must fail early.

test_rechunk_with_meta_key_passes_in_strict_mode¶

marks: asyncio, cfg • location: tests/integration/adapters/test_rechunk_strict_mode.py:60

No docstring.

tests/integration/adapters/test_unknown_adapter.py¶

Tests¶

Test	Summary	Marks	Location
test_cmd_unknown_adapter_permanent_fail	(no docstring)	asyncio	`tests/integration/adapters/test_unknown_adapter.py:24`

Details¶

test_cmd_unknown_adapter_permanent_fail¶

marks: asyncio • location: tests/integration/adapters/test_unknown_adapter.py:24

No docstring.

tests/integration/adapters/test_worker_adapter_precedence.py¶

Tests¶

Test	Summary	Marks	Location
test_cmd_input_inline_overrides_handler_load_input	When configurations conflict, cmd.input_inline must take precedence over handler-provided load_input.	asyncio	`tests/integration/adapters/test_worker_adapter_precedence.py:49`

Details¶

test_cmd_input_inline_overrides_handler_load_input¶

marks: asyncio • location: tests/integration/adapters/test_worker_adapter_precedence.py:49

When configurations conflict, cmd.input_inline must take precedence over handler-provided load_input.

tests/integration/streaming/test_isolation.py¶

Isolation of aggregation and cross-talk guard between tasks.

Tests¶

Test	Summary	Marks	Location
test_metrics_cross_talk_guard	Two concurrent tasks (same node_ids across different task_ids) do not interfere: final 'count' values are isolated per task.	asyncio	`tests/integration/streaming/test_isolation.py:62`
test_metrics_isolation_between_tasks	Two back-to-back tasks must keep metric aggregation isolated per task document.	asyncio	`tests/integration/streaming/test_isolation.py:17`

Details¶

test_metrics_cross_talk_guard¶

marks: asyncio • location: tests/integration/streaming/test_isolation.py:62

Two concurrent tasks (same node_ids across different task_ids) do not interfere: final 'count' values are isolated per task.

test_metrics_isolation_between_tasks¶

marks: asyncio • location: tests/integration/streaming/test_isolation.py:17

Two back-to-back tasks must keep metric aggregation isolated per task document.

tests/integration/streaming/test_metrics_dedup.py¶

Idempotency & event ordering for aggregated metrics.

Tests¶

Test	Summary	Marks	Location
test_metrics_dedup_persists_across_coord_restart	Metric deduplication survives a coordinator restart: duplicates of STATUS (BATCH_OK/TASK_DONE) sent during the restart must not double-count.	asyncio	`tests/integration/streaming/test_metrics_dedup.py:155`
test_metrics_idempotent_on_duplicate_status_events	Duplicate STATUS events (BATCH_OK / TASK_DONE) must not double-increment aggregated metrics.	asyncio	`tests/integration/streaming/test_metrics_dedup.py:21`
test_status_out_of_order_does_not_break_aggregation	BATCH_OK STATUS events arrive out of order → the aggregated metric remains correct.	asyncio	`tests/integration/streaming/test_metrics_dedup.py:82`

Details¶

test_metrics_dedup_persists_across_coord_restart¶

marks: asyncio • location: tests/integration/streaming/test_metrics_dedup.py:155

Metric deduplication survives a coordinator restart: duplicates of STATUS (BATCH_OK/TASK_DONE) sent during the restart must not double-count. Skips if the fixture coord has no restart method`.

test_metrics_idempotent_on_duplicate_status_events¶

marks: asyncio • location: tests/integration/streaming/test_metrics_dedup.py:21

Duplicate STATUS events (BATCH_OK / TASK_DONE) must not double-increment aggregated metrics.

We monkeypatch AIOKafkaProducerMock.send_and_wait to produce the same STATUS event twice for status topics. The Coordinator should deduplicate by envelope key and keep metrics stable.

test_status_out_of_order_does_not_break_aggregation¶

marks: asyncio • location: tests/integration/streaming/test_metrics_dedup.py:82

BATCH_OK STATUS events arrive out of order → the aggregated metric remains correct. We hold the first BATCH_OK so later ones arrive first.

tests/integration/streaming/test_metrics_exactness.py¶

Exactness and accounting of aggregated metrics.

Tests¶

Test	Summary	Marks	Location
test_metrics_multistream_exact_sum	Three upstreams -> one downstream: analyzer's aggregated 'count' must equal the sum of all totals.	asyncio	`tests/integration/streaming/test_metrics_exactness.py:56`
test_metrics_partial_batches_exact_count	With a remainder in the last upstream batch, analyzer's 'count' must still exactly equal total.	asyncio	`tests/integration/streaming/test_metrics_exactness.py:105`
test_metrics_single_stream_exact_count	Single upstream -> single downstream: analyzer's aggregated 'count' must equal the total items.	asyncio	`tests/integration/streaming/test_metrics_exactness.py:15`

Details¶

test_metrics_multistream_exact_sum¶

marks: asyncio • location: tests/integration/streaming/test_metrics_exactness.py:56

Three upstreams -> one downstream: analyzer's aggregated 'count' must equal the sum of all totals.

test_metrics_partial_batches_exact_count¶

marks: asyncio • location: tests/integration/streaming/test_metrics_exactness.py:105

With a remainder in the last upstream batch, analyzer's 'count' must still exactly equal total.

test_metrics_single_stream_exact_count¶

marks: asyncio • location: tests/integration/streaming/test_metrics_exactness.py:15

Single upstream -> single downstream: analyzer's aggregated 'count' must equal the total items.

tests/integration/streaming/test_multistream_fanin.py¶

Multistream fan-in to a single downstream consumer.

Tests¶

Test	Summary	Marks	Location
test_multistream_fanin_stream_to_one_downstream	Multi-stream fan-in: three upstream indexers stream into one analyzer.	asyncio	`tests/integration/streaming/test_multistream_fanin.py:17`

Details¶

test_multistream_fanin_stream_to_one_downstream¶

marks: asyncio • location: tests/integration/streaming/test_multistream_fanin.py:17

Multi-stream fan-in: three upstream indexers stream into one analyzer. Analyzer should start early on first batch and eventually see the full flow.

tests/integration/streaming/test_start_when.py¶

Tests for streaming/async fan-in behavior: early vs delayed downstream start.

Tests¶

Test	Summary	Marks	Location
test_after_upstream_complete_delays_start	Without start_when, the downstream must not start until the upstream is fully finished.	asyncio	`tests/integration/streaming/test_start_when.py:85`
test_start_when_first_batch_starts_early	Downstream should start while upstream is still running when `start_when=first_batch` is set.	asyncio	`tests/integration/streaming/test_start_when.py:27`

Details¶

test_after_upstream_complete_delays_start¶

marks: asyncio • location: tests/integration/streaming/test_start_when.py:85

Without start_when, the downstream must not start until the upstream is fully finished. We first assert a negative hold window while the upstream is running, then assert the downstream starts and finishes after the upstream completes.

test_start_when_first_batch_starts_early¶

marks: asyncio • location: tests/integration/streaming/test_start_when.py:27

Downstream should start while upstream is still running when start_when=first_batch is set.

tests/integration/vars/test_coord_fn_chain.py¶

Tests¶

Test	Summary	Marks	Location
test_coord_fn_chain_set_merge_incr_unset	Integration scenario: n1: vars.set → routing.sla=, limits.max_batches= n2: vars.merge → flags= n3: vars.incr → counters.pages += n4: vars.unset → limits.max_batches Verifies:.	asyncio, parametrize	`tests/integration/vars/test_coord_fn_chain.py:27`

Details¶

test_coord_fn_chain_set_merge_incr_unset¶

marks: asyncio, parametrize • location: tests/integration/vars/test_coord_fn_chain.py:27

Integration scenario: n1: vars.set → routing.sla=, limits.max_batches= n2: vars.merge → flags= n3: vars.incr → counters.pages += n4: vars.unset → limits.max_batches

Verifies: - each node finishes (status=finished) - final coordinator.vars matches expected - updated_at increases across steps (if tracked by inmemory_db)

tests/integration/vars/test_unset_forms.py¶

Tests¶

Test	Summary	Marks	Location
test_vars_unset_kwargs_list_in_coord_fn	Coordinator function 'vars.unset' must accept kwargs form `keys=[...]`.	asyncio, parametrize	`tests/integration/vars/test_unset_forms.py:20`

Details¶

test_vars_unset_kwargs_list_in_coord_fn¶

marks: asyncio, parametrize • location: tests/integration/vars/test_unset_forms.py:20

Coordinator function 'vars.unset' must accept kwargs form keys=[...].

tests/test_artifacts_data.py¶

Tests¶

Test	Summary	Marks	Location
test_merge_generic_creates_complete_artifact	coordinator_fn: merge.generic combines results of nodes 'a' and 'b'.	asyncio	`tests/test_artifacts_data.py:158`
test_partial_shards_and_stream_counts	Source w1 (indexer) emits batches with batch_uid → worker creates partial artifacts.	asyncio	`tests/test_artifacts_data.py:90`

Details¶

test_merge_generic_creates_complete_artifact¶

marks: asyncio • location: tests/test_artifacts_data.py:158

coordinator_fn: merge.generic combines results of nodes 'a' and 'b'.

We verify: * the task finishes * merge node 'm' has a 'complete' artifact * nodes 'a' and 'b' have artifacts (partial/complete — does not matter)

test_partial_shards_and_stream_counts¶

marks: asyncio • location: tests/test_artifacts_data.py:90

Source w1 (indexer) emits batches with batch_uid → worker creates partial artifacts. Completion marks a 'complete' artifact. Analyzer reads via rechunk and accumulates a counter.

We verify: * number of partial shards at w1 == ceil(total / batch_size) * a 'complete' artifact exists at w1 * w2 has node.stats.count == total

tests/test_cancel_and_restart.py¶

Tests¶

Test	Summary	Marks	Location
test_cancel_before_any_start_keeps_all_nodes_idle	Cancel the task before any node can start: no node must enter 'running'.	asyncio	`tests/test_cancel_and_restart.py:231`
test_cancel_on_deferred_prevents_retry	Node 'flaky' fails on first attempt → becomes deferred with backoff.	asyncio	`tests/test_cancel_and_restart.py:295`
test_cascade_cancel_prevents_downstream	Cancel the task while upstream runs: downstream must not start.	asyncio	`tests/test_cancel_and_restart.py:97`
test_restart_higher_epoch_ignores_old_batch_ok	After a node restarts with a higher epoch, the coordinator must ignore a stale BATCH_OK from a lower epoch.	asyncio	`tests/test_cancel_and_restart.py:366`
test_restart_higher_epoch_ignores_old_events	After accepting epoch>=1, re-inject an old event (epoch=0).	asyncio	`tests/test_cancel_and_restart.py:169`

Details¶

test_cancel_before_any_start_keeps_all_nodes_idle¶

marks: asyncio • location: tests/test_cancel_and_restart.py:231

Cancel the task before any node can start: no node must enter 'running'.

test_cancel_on_deferred_prevents_retry¶

marks: asyncio • location: tests/test_cancel_and_restart.py:295

Node 'flaky' fails on first attempt → becomes deferred with backoff. Cancel the task right after TASK_FAILED(epoch=1) and ensure no higher epoch is accepted.

test_cascade_cancel_prevents_downstream¶

marks: asyncio • location: tests/test_cancel_and_restart.py:97

Cancel the task while upstream runs: downstream must not start.

test_restart_higher_epoch_ignores_old_batch_ok¶

marks: asyncio • location: tests/test_cancel_and_restart.py:366

After a node restarts with a higher epoch, the coordinator must ignore a stale BATCH_OK from a lower epoch. Also ensure injected stale event doesn't create duplicate metrics.

test_restart_higher_epoch_ignores_old_events¶

marks: asyncio • location: tests/test_cancel_and_restart.py:169

After accepting epoch>=1, re-inject an old event (epoch=0). Coordinator must ignore it by fencing.

tests/test_chaos_models.py¶

End-to-end streaming smoke tests under chaos.

Pipelines covered: - w1(indexer) -> w2(enricher) -> w5(ocr) -> w4(analyzer) ___________/ / _____ w3(merge) _/

Checks: - all nodes finish successfully (including the merge node in the extended graph) - artifacts are produced for indexer/enricher/ocr (streaming stages) - resiliency under worker and coordinator restarts

Tests¶

Test	Summary	Marks	Location
test_chaos_coordinator_restart	Restart the Coordinator while the task is running.	asyncio	`tests/test_chaos_models.py:344`
test_chaos_delays_and_duplications	Chaos mode: small broker/consumer jitter + message duplications (no drops).	asyncio	`tests/test_chaos_models.py:277`
test_chaos_worker_restart_mid_stream	Restart the 'enricher' worker in the middle of the stream.	asyncio	`tests/test_chaos_models.py:304`
test_e2e_streaming_with_kafka_chaos	With chaos enabled (broker/consumer jitter and message duplications, no drops), the full pipeline should still complete and produce artifacts at indexer/enricher/ocr stages.	asyncio	`tests/test_chaos_models.py:245`

Details¶

test_chaos_coordinator_restart¶

marks: asyncio • location: tests/test_chaos_models.py:344

Restart the Coordinator while the task is running. Expect the pipeline to recover and finish.

test_chaos_delays_and_duplications¶

marks: asyncio • location: tests/test_chaos_models.py:277

Chaos mode: small broker/consumer jitter + message duplications (no drops). Expect the pipeline to finish and produce artifacts for w1/w2/w5.

test_chaos_worker_restart_mid_stream¶

marks: asyncio • location: tests/test_chaos_models.py:304

Restart the 'enricher' worker in the middle of the stream. Expect the coordinator to fence and the pipeline to still finish.

test_e2e_streaming_with_kafka_chaos¶

marks: asyncio • location: tests/test_chaos_models.py:245

With chaos enabled (broker/consumer jitter and message duplications, no drops), the full pipeline should still complete and produce artifacts at indexer/enricher/ocr stages.

tests/test_concurrency_limits.py¶

Tests¶

Test	Summary	Marks	Location
test_concurrent_tasks_respect_global_limit	(no docstring)	asyncio	`tests/test_concurrency_limits.py:302`
test_max_global_running_limit	(no docstring)	asyncio	`tests/test_concurrency_limits.py:153`
test_max_type_concurrency_limits	(no docstring)	asyncio	`tests/test_concurrency_limits.py:196`
test_multi_workers_same_type_rr_distribution	(no docstring)	asyncio	`tests/test_concurrency_limits.py:252`

Details¶

test_concurrent_tasks_respect_global_limit¶

marks: asyncio • location: tests/test_concurrency_limits.py:302

No docstring.

test_max_global_running_limit¶

marks: asyncio • location: tests/test_concurrency_limits.py:153

No docstring.

test_max_type_concurrency_limits¶

marks: asyncio • location: tests/test_concurrency_limits.py:196

No docstring.

test_multi_workers_same_type_rr_distribution¶

marks: asyncio • location: tests/test_concurrency_limits.py:252

No docstring.

tests/test_lease_heartbeat_resume.py¶

Heartbeat / grace-window / resume tests.

Covers: - Soft heartbeat -> deferred -> recovery - Hard heartbeat -> task failed - Resume with local worker state (no new epoch) - Discover w/ complete artifacts -> skip start - Grace-gate delay, and that deferred retry ignores the gate - Heartbeat extends lease deadline - Restart with a new worker_id bumps attempt_epoch and fences stale heartbeats

Tests¶

Test	Summary	Marks	Location
test_deferred_retry_ignores_grace_gate	A DEFERRED retry must not be throttled by discovery grace window.	cfg, asyncio	`tests/test_lease_heartbeat_resume.py:258`
test_grace_gate_blocks_then_allows_after_window	Grace window should delay start initially and allow it after the window elapses.	cfg, asyncio	`tests/test_lease_heartbeat_resume.py:209`
test_heartbeat_hard_marks_task_failed	If hard heartbeat window is exceeded, the task should be marked FAILED.	cfg, asyncio	`tests/test_lease_heartbeat_resume.py:77`
test_heartbeat_soft_deferred_then_recovers	With a short soft heartbeat window the task becomes DEFERRED, then recovers and finishes.	cfg, asyncio	`tests/test_lease_heartbeat_resume.py:39`
test_heartbeat_tolerates_clock_skew	Worker clock skew should not cause a hard timeout; lease deadlines must be non-decreasing.	asyncio, cfg	`tests/test_lease_heartbeat_resume.py:556`
test_heartbeat_updates_lease_deadline_simple	Heartbeats should move the lease.deadline_ts_ms forward while the node runs.	cfg, asyncio	`tests/test_lease_heartbeat_resume.py:348`
test_lease_expiry_cascades_cancel	Permanent fail upstream should cascade-cancel downstream nodes.	asyncio, cfg	`tests/test_lease_heartbeat_resume.py:625`
test_no_task_resumed_on_worker_restart	On a cold worker restart there must be no TASK_RESUMED event emitted by the worker.	cfg, asyncio	`tests/test_lease_heartbeat_resume.py:292`
test_resume_inflight_worker_restarts_with_local_state	Restart with the same worker_id: coordinator should adopt inflight work without new epoch.	cfg, asyncio	`tests/test_lease_heartbeat_resume.py:113`
test_task_discover_complete_artifacts_skips_node_start	If artifacts are 'complete' during discovery, node should auto-finish without starting its handler.	asyncio	`tests/test_lease_heartbeat_resume.py:173`
test_worker_restart_with_new_id_bumps_epoch	Restart with a new worker_id must bump attempt_epoch; stale heartbeats from the old epoch are ignored.	asyncio, cfg	`tests/test_lease_heartbeat_resume.py:394`

Details¶

test_deferred_retry_ignores_grace_gate¶

marks: cfg, asyncio • location: tests/test_lease_heartbeat_resume.py:258

A DEFERRED retry must not be throttled by discovery grace window.

test_grace_gate_blocks_then_allows_after_window¶

marks: cfg, asyncio • location: tests/test_lease_heartbeat_resume.py:209

Grace window should delay start initially and allow it after the window elapses.

test_heartbeat_hard_marks_task_failed¶

marks: cfg, asyncio • location: tests/test_lease_heartbeat_resume.py:77

If hard heartbeat window is exceeded, the task should be marked FAILED.

test_heartbeat_soft_deferred_then_recovers¶

marks: cfg, asyncio • location: tests/test_lease_heartbeat_resume.py:39

With a short soft heartbeat window the task becomes DEFERRED, then recovers and finishes.

test_heartbeat_tolerates_clock_skew¶

marks: asyncio, cfg • location: tests/test_lease_heartbeat_resume.py:556

Worker clock skew should not cause a hard timeout; lease deadlines must be non-decreasing.

test_heartbeat_updates_lease_deadline_simple¶

marks: cfg, asyncio • location: tests/test_lease_heartbeat_resume.py:348

Heartbeats should move the lease.deadline_ts_ms forward while the node runs.

test_lease_expiry_cascades_cancel¶

marks: asyncio, cfg • location: tests/test_lease_heartbeat_resume.py:625

Permanent fail upstream should cascade-cancel downstream nodes.

test_no_task_resumed_on_worker_restart¶

marks: cfg, asyncio • location: tests/test_lease_heartbeat_resume.py:292

On a cold worker restart there must be no TASK_RESUMED event emitted by the worker.

test_resume_inflight_worker_restarts_with_local_state¶

marks: cfg, asyncio • location: tests/test_lease_heartbeat_resume.py:113

Restart with the same worker_id: coordinator should adopt inflight work without new epoch.

test_task_discover_complete_artifacts_skips_node_start¶

marks: asyncio • location: tests/test_lease_heartbeat_resume.py:173

If artifacts are 'complete' during discovery, node should auto-finish without starting its handler.

test_worker_restart_with_new_id_bumps_epoch¶

marks: asyncio, cfg • location: tests/test_lease_heartbeat_resume.py:394

Restart with a new worker_id must bump attempt_epoch; stale heartbeats from the old epoch are ignored.

tests/test_outbox_delivery.py¶

Tests¶

Test	Summary	Marks	Location
test_outbox_backoff_caps_with_jitter	Exponential backoff is capped by max and jitter stays within a reasonable window.	asyncio, use_outbox, cfg	`tests/test_outbox_delivery.py:366`
test_outbox_crash_between_send_and_mark_sent	Send succeeds, coordinator crashes before mark(sent) → after restart the outbox may resend, but there is only ONE effective delivery for (topic,key,dedup_id).	asyncio, use_outbox	`tests/test_outbox_delivery.py:214`
test_outbox_dedup_survives_restart	Deduplication by (topic,key,dedup_id) persists across coordinator restart: re-enqueue of the same envelope after restart does not create a second outbox row or send.	asyncio, use_outbox	`tests/test_outbox_delivery.py:305`
test_outbox_dlq_after_max_retries	After reaching max retries, the outbox row is moved to a terminal state with attempts >= max.	asyncio, use_outbox, cfg	`tests/test_outbox_delivery.py:419`
test_outbox_exactly_once_fp_uniqueness	Two enqueues with identical (topic,key,dedup_id) → single outbox row and exactly one real send.	asyncio	`tests/test_outbox_delivery.py:138`
test_outbox_retry_backoff	First two _raw_send calls fail → outbox goes to 'retry' with exponential backoff (with jitter), then on the 3^rd attempt becomes 'sent'.	asyncio	`tests/test_outbox_delivery.py:46`

Details¶

test_outbox_backoff_caps_with_jitter¶

marks: asyncio, use_outbox, cfg • location: tests/test_outbox_delivery.py:366

Exponential backoff is capped by max and jitter stays within a reasonable window.

test_outbox_crash_between_send_and_mark_sent¶

marks: asyncio, use_outbox • location: tests/test_outbox_delivery.py:214

Send succeeds, coordinator crashes before mark(sent) → after restart the outbox may resend, but there is only ONE effective delivery for (topic,key,dedup_id).

test_outbox_dedup_survives_restart¶

marks: asyncio, use_outbox • location: tests/test_outbox_delivery.py:305

Deduplication by (topic,key,dedup_id) persists across coordinator restart: re-enqueue of the same envelope after restart does not create a second outbox row or send.

test_outbox_dlq_after_max_retries¶

marks: asyncio, use_outbox, cfg • location: tests/test_outbox_delivery.py:419

After reaching max retries, the outbox row is moved to a terminal state with attempts >= max.

test_outbox_exactly_once_fp_uniqueness¶

marks: asyncio • location: tests/test_outbox_delivery.py:138

Two enqueues with identical (topic,key,dedup_id) → single outbox row and exactly one real send.

test_outbox_retry_backoff¶

marks: asyncio • location: tests/test_outbox_delivery.py:46

First two _raw_send calls fail → outbox goes to 'retry' with exponential backoff (with jitter), then on the 3^rd attempt becomes 'sent'.

tests/test_pipeline_smoke.py¶

End-to-end streaming smoke test.

Pipeline: w1(indexer) -> w2(enricher) -> w5(ocr) -> w4(analyzer) ___________/ / _____ w3(merge) _/

Checks: - all nodes finish successfully - artifacts are produced for indexer/enricher/ocr

Tests¶

Test	Summary	Marks	Location
test_e2e_streaming_with_kafka_sim	Full pipeline should complete and produce artifacts at indexer/enricher/ocr stages.	asyncio	`tests/test_pipeline_smoke.py:135`

Details¶

test_e2e_streaming_with_kafka_sim¶

marks: asyncio • location: tests/test_pipeline_smoke.py:135

Full pipeline should complete and produce artifacts at indexer/enricher/ocr stages.

tests/test_reliability_failures.py¶

Tests around source-like roles: idempotent metrics, retries, fencing, coordinator restart adoption, cascade cancel, and heartbeat/lease updates.

Design: - All runtime knobs come via coord_cfg/worker_cfg fixtures (see conftest.py). - For one-off tweaks, use @pytest.mark.cfg(coord={...}, worker={...}). - Workers are started with in-memory DB and handlers built from helper builders.

Tests¶

Test	Summary	Marks	Location
test_coordinator_restart_adopts_inflight_without_new_epoch	When the coordinator restarts, it should adopt in-flight work without incrementing the worker's attempt_epoch unnecessarily (i.e., source keeps epoch=1).	asyncio	`tests/test_reliability_failures.py:218`
test_explicit_cascade_cancel_moves_node_to_deferred	Explicit cascade cancel should move a running node to a cancelling/deferred/queued state within the configured cancel_grace window.	cfg, asyncio	`tests/test_reliability_failures.py:273`
test_heartbeat_updates_lease_deadline	Heartbeats from a worker must extend the lease deadline in the task document.	cfg, asyncio	`tests/test_reliability_failures.py:322`
test_idempotent_metrics_on_duplicate_events	Duplicated STATUS events (BATCH_OK/TASK_DONE) must not double-count metrics.	asyncio	`tests/test_reliability_failures.py:36`
test_permanent_fail_cascades_cancel_and_task_failed	Permanent failure in an upstream node should cause the task to fail, while dependents get cancelled/deferred/queued depending on race windows.	cfg, asyncio	`tests/test_reliability_failures.py:125`
test_status_fencing_ignores_stale_epoch	Status fencing must ignore events from a stale attempt_epoch.	asyncio	`tests/test_reliability_failures.py:169`
test_transient_failure_deferred_then_retry	A transient error should defer the node and succeed on retry according to retry_policy (max>=2).	asyncio	`tests/test_reliability_failures.py:83`

Details¶

test_coordinator_restart_adopts_inflight_without_new_epoch¶

marks: asyncio • location: tests/test_reliability_failures.py:218

When the coordinator restarts, it should adopt in-flight work without incrementing the worker's attempt_epoch unnecessarily (i.e., source keeps epoch=1).

test_explicit_cascade_cancel_moves_node_to_deferred¶

marks: cfg, asyncio • location: tests/test_reliability_failures.py:273

Explicit cascade cancel should move a running node to a cancelling/deferred/queued state within the configured cancel_grace window.

test_heartbeat_updates_lease_deadline¶

marks: cfg, asyncio • location: tests/test_reliability_failures.py:322

Heartbeats from a worker must extend the lease deadline in the task document.

test_idempotent_metrics_on_duplicate_events¶

marks: asyncio • location: tests/test_reliability_failures.py:36

Duplicated STATUS events (BATCH_OK/TASK_DONE) must not double-count metrics. We duplicate STATUS envelopes at the Kafka level; aggregator must remain stable.

test_permanent_fail_cascades_cancel_and_task_failed¶

marks: cfg, asyncio • location: tests/test_reliability_failures.py:125

Permanent failure in an upstream node should cause the task to fail, while dependents get cancelled/deferred/queued depending on race windows.

test_status_fencing_ignores_stale_epoch¶

marks: asyncio • location: tests/test_reliability_failures.py:169

Status fencing must ignore events from a stale attempt_epoch. We finish the task, then send a forged event with attempt_epoch=0; stats must remain unchanged.

test_transient_failure_deferred_then_retry¶

marks: asyncio • location: tests/test_reliability_failures.py:83

A transient error should defer the node and succeed on retry according to retry_policy (max>=2).

tests/test_scheduler_fanin_routing.py¶

Fan-in behavior (ANY/ALL/COUNT:N) and coordinator_fn merge smoke tests.

Covers: - ANY fan-in: downstream starts as soon as any parent streams a first batch. - ALL fan-in: downstream starts only after all parents are done (no start_when). - COUNT:N fan-in: xfail placeholder until the coordinator supports it. - Edges vs routing priority: xfail placeholder for precedence logic. - coordinator_fn merge: runs without a worker and feeds downstream via artifacts.

Tests¶

Test	Summary	Marks	Location
test_coordinator_fn_merge_without_worker	coordinator_fn node should run without a worker and produce artifacts that a downstream analyzer can consume via pull.from_artifacts.	asyncio	`tests/test_scheduler_fanin_routing.py:294`
test_edges_vs_routing_priority	If explicit graph edges are present and a node also has routing.on_success, edges should take precedence (routing target should not run).	asyncio, xfail	`tests/test_scheduler_fanin_routing.py:230`
test_fanin_all_waits_all_parents	Fan-in ALL: without start_when hint, downstream should only start after both parents are finished.	asyncio	`tests/test_scheduler_fanin_routing.py:127`
test_fanin_any_starts_early	Fan-in ANY: downstream should start as soon as at least one parent streams (start_when='first_batch'), even if other parents are not yet finished.	asyncio	`tests/test_scheduler_fanin_routing.py:69`
test_fanin_count_n	Fan-in COUNT:N placeholder: downstream should start when at least N parents are ready.	asyncio, xfail	`tests/test_scheduler_fanin_routing.py:176`
test_fanout_one_upstream_two_downstreams_mixed_start_when	One upstream → two downstreams: A has start_when=first_batch (starts early), B has no start_when (waits for completion).	asyncio	`tests/test_scheduler_fanin_routing.py:431`
test_routing_on_failure_triggers_remediator_only	On upstream TASK_FAILED(permanent=True), only the 'on_failure' remediator should run; 'on_success' must not.	asyncio, xfail	`tests/test_scheduler_fanin_routing.py:358`

Details¶

test_coordinator_fn_merge_without_worker¶

marks: asyncio • location: tests/test_scheduler_fanin_routing.py:294

coordinator_fn node should run without a worker and produce artifacts that a downstream analyzer can consume via pull.from_artifacts.

test_edges_vs_routing_priority¶

marks: asyncio, xfail • location: tests/test_scheduler_fanin_routing.py:230

If explicit graph edges are present and a node also has routing.on_success, edges should take precedence (routing target should not run).

test_fanin_all_waits_all_parents¶

marks: asyncio • location: tests/test_scheduler_fanin_routing.py:127

Fan-in ALL: without start_when hint, downstream should only start after both parents are finished.

test_fanin_any_starts_early¶

marks: asyncio • location: tests/test_scheduler_fanin_routing.py:69

Fan-in ANY: downstream should start as soon as at least one parent streams (start_when='first_batch'), even if other parents are not yet finished. We run a single 'indexer' worker so upstream parents execute sequentially.

test_fanin_count_n¶

marks: asyncio, xfail • location: tests/test_scheduler_fanin_routing.py:176

Fan-in COUNT:N placeholder: downstream should start when at least N parents are ready. Marked xfail until coordinator supports 'count:n'.

test_fanout_one_upstream_two_downstreams_mixed_start_when¶

marks: asyncio • location: tests/test_scheduler_fanin_routing.py:431

One upstream → two downstreams: A has start_when=first_batch (starts early), B has no start_when (waits for completion).

test_routing_on_failure_triggers_remediator_only¶

marks: asyncio, xfail • location: tests/test_scheduler_fanin_routing.py:358

On upstream TASK_FAILED(permanent=True), only the 'on_failure' remediator should run; 'on_success' must not.

tests/unit/vars/test_detectors.py¶

Tests¶

Test	Summary	Marks	Location
test_detector_function_with_merge	DI: callable detector.	asyncio	`tests/unit/vars/test_detectors.py:38`
test_detector_object_block_and_soft	DI: object detector.	asyncio	`tests/unit/vars/test_detectors.py:18`
test_detector_string_import	DI: string import "module:factory".	asyncio	`tests/unit/vars/test_detectors.py:63`

Details¶

test_detector_function_with_merge¶

marks: asyncio • location: tests/unit/vars/test_detectors.py:38

DI: callable detector. Block big binary payload when block_sensitive=True, otherwise log hit.

test_detector_object_block_and_soft¶

marks: asyncio • location: tests/unit/vars/test_detectors.py:18

DI: object detector. With block_sensitive=True — raise; without — write and count hit.

test_detector_string_import¶

marks: asyncio • location: tests/unit/vars/test_detectors.py:63

DI: string import "module:factory". Factory returns an object with is_sensitive().

tests/unit/vars/test_incr.py¶

Tests¶

Test	Summary	Marks	Location
test_vars_incr_concurrency_sum	Concurrency: 10x1 + 5x2 → exact sum.	asyncio	`tests/unit/vars/test_incr.py:27`
test_vars_incr_creates_missing_leaf	Creating a missing leaf should create it and increment.	asyncio	`tests/unit/vars/test_incr.py:17`
test_vars_incr_float_and_negative_and_type_conflict	Mixed numeric types are allowed; negative increments also work.	asyncio	`tests/unit/vars/test_incr.py:61`
test_vars_incr_rejects_nan_inf	NaN / ±Inf must be rejected.	asyncio, parametrize	`tests/unit/vars/test_incr.py:42`
test_vars_incr_validates_key	Key validation (segments).	asyncio, parametrize	`tests/unit/vars/test_incr.py:52`

Details¶

test_vars_incr_concurrency_sum¶

marks: asyncio • location: tests/unit/vars/test_incr.py:27

Concurrency: 10x1 + 5x2 → exact sum.

test_vars_incr_creates_missing_leaf¶

marks: asyncio • location: tests/unit/vars/test_incr.py:17

Creating a missing leaf should create it and increment.

test_vars_incr_float_and_negative_and_type_conflict¶

marks: asyncio • location: tests/unit/vars/test_incr.py:61

Mixed numeric types are allowed; negative increments also work. If a non-numeric value already exists at the leaf, the adapter should raise.

test_vars_incr_rejects_nan_inf¶

marks: asyncio, parametrize • location: tests/unit/vars/test_incr.py:42

NaN / ±Inf must be rejected.

test_vars_incr_validates_key¶

marks: asyncio, parametrize • location: tests/unit/vars/test_incr.py:52

Key validation (segments).

tests/unit/vars/test_merge.py¶

Tests¶

Test	Summary	Marks	Location
test_vars_merge_block_sensitive_with_external_detector	External detector: block when block_sensitive=True, otherwise write but count sensitive_hits in logs.	asyncio	`tests/unit/vars/test_merge.py:101`
test_vars_merge_deep_overwrite_mixed_types	Deep-merge with leaf overwrite; mixed types (dict → leaf).	asyncio	`tests/unit/vars/test_merge.py:15`
test_vars_merge_limits_and_value_size_str_and_bytes	Limits: number of paths, depth, full path length, and value size (str/bytes).	asyncio	`tests/unit/vars/test_merge.py:64`
test_vars_merge_noop_empty	Empty input → no-op (touched=0), 'vars' field is not created.	asyncio	`tests/unit/vars/test_merge.py:52`

Details¶

test_vars_merge_block_sensitive_with_external_detector¶

marks: asyncio • location: tests/unit/vars/test_merge.py:101

External detector: block when block_sensitive=True, otherwise write but count sensitive_hits in logs.

test_vars_merge_deep_overwrite_mixed_types¶

marks: asyncio • location: tests/unit/vars/test_merge.py:15

Deep-merge with leaf overwrite; mixed types (dict → leaf).

test_vars_merge_limits_and_value_size_str_and_bytes¶

marks: asyncio • location: tests/unit/vars/test_merge.py:64

Limits: number of paths, depth, full path length, and value size (str/bytes).

test_vars_merge_noop_empty¶

marks: asyncio • location: tests/unit/vars/test_merge.py:52

Empty input → no-op (touched=0), 'vars' field is not created.

tests/unit/vars/test_set.py¶

Tests¶

Test	Summary	Marks	Location
test_vars_set_block_sensitive_with_external_detector	With block_sensitive=True the detector blocks.	asyncio	`tests/unit/vars/test_set.py:138`
test_vars_set_deterministic_keys_order_and_sensitive_hits	Deterministic sort order in logs + sensitive_hits counter.	asyncio	`tests/unit/vars/test_set.py:118`
test_vars_set_dotted_and_nested_combo_and_logging	Dotted + nested forms in one call; verify values and structured logs.	asyncio	`tests/unit/vars/test_set.py:15`
test_vars_set_limits_depth_and_path_len	Limits: key depth and full path length.	asyncio	`tests/unit/vars/test_set.py:84`
test_vars_set_limits_paths_count	Limit on number of paths in a single $set.	asyncio	`tests/unit/vars/test_set.py:72`
test_vars_set_limits_value_size_str_and_bytes	Limit on value size (string/bytes).	asyncio	`tests/unit/vars/test_set.py:102`
test_vars_set_validates_key_segments	Validate segments: $, '.', NUL, segment length.	asyncio, parametrize	`tests/unit/vars/test_set.py:63`

Details¶

test_vars_set_block_sensitive_with_external_detector¶

marks: asyncio • location: tests/unit/vars/test_set.py:138

With block_sensitive=True the detector blocks. Without it, write succeeds and logs record sensitive_hits=1.

test_vars_set_deterministic_keys_order_and_sensitive_hits¶

marks: asyncio • location: tests/unit/vars/test_set.py:118

Deterministic sort order in logs + sensitive_hits counter.

test_vars_set_dotted_and_nested_combo_and_logging¶

marks: asyncio • location: tests/unit/vars/test_set.py:15

Dotted + nested forms in one call; verify values and structured logs.

test_vars_set_limits_depth_and_path_len¶

marks: asyncio • location: tests/unit/vars/test_set.py:84

Limits: key depth and full path length.

test_vars_set_limits_paths_count¶

marks: asyncio • location: tests/unit/vars/test_set.py:72

Limit on number of paths in a single $set.

test_vars_set_limits_value_size_str_and_bytes¶

marks: asyncio • location: tests/unit/vars/test_set.py:102

Limit on value size (string/bytes).

test_vars_set_validates_key_segments¶

marks: asyncio, parametrize • location: tests/unit/vars/test_set.py:63

Validate segments: $, '.', NUL, segment length.

tests/unit/vars/test_unset.py¶

Tests¶

Test	Summary	Marks	Location
test_vars_unset_accepts_args_and_kwargs	The adapter should accept both *args (varargs keys) and kwargs form 'keys=[...]'.	asyncio	`tests/unit/vars/test_unset.py:76`
test_vars_unset_path_length_limit	Validate full path length limit (including prefix 'coordinator.vars.').	asyncio	`tests/unit/vars/test_unset.py:64`
test_vars_unset_simple_nested_and_noop	Remove a simple leaf and a nested branch; missing key is a no-op (operation remains valid and accounted in 'touched').	asyncio	`tests/unit/vars/test_unset.py:14`
test_vars_unset_validates_paths	Validate key/path formats: $, '.', NUL, segment length.	asyncio, parametrize	`tests/unit/vars/test_unset.py:55`

Details¶

test_vars_unset_accepts_args_and_kwargs¶

marks: asyncio • location: tests/unit/vars/test_unset.py:76

The adapter should accept both *args (varargs keys) and kwargs form 'keys=[...]'.

test_vars_unset_path_length_limit¶

marks: asyncio • location: tests/unit/vars/test_unset.py:64

Validate full path length limit (including prefix 'coordinator.vars.').

test_vars_unset_simple_nested_and_noop¶

marks: asyncio • location: tests/unit/vars/test_unset.py:14

Remove a simple leaf and a nested branch; missing key is a no-op (operation remains valid and accounted in 'touched').

test_vars_unset_validates_paths¶

marks: asyncio, parametrize • location: tests/unit/vars/test_unset.py:55

Validate key/path formats: $, '.', NUL, segment length.

tests/unit/worker/test_error_classification.py¶

Unit test: default RoleHandler.classify_error marks config/programming errors as permanent. We don't pin the exact reason string, only the permanence contract.

Tests¶

Test	Summary	Marks	Location
test_programming_errors_are_permanent	(no docstring)	—	`tests/unit/worker/test_error_classification.py:15`

Details¶

test_programming_errors_are_permanent¶

location: tests/unit/worker/test_error_classification.py:15

No docstring.

tests/unit/worker/test_error_policy.py¶

Tests¶

Test	Summary	Marks	Location
test_classify_error_non_permanent_for_other_exceptions	(no docstring)	—	`tests/unit/worker/test_error_policy.py:18`
test_classify_error_permanent_for_common_programming_faults	(no docstring)	—	`tests/unit/worker/test_error_policy.py:10`

Details¶

test_classify_error_non_permanent_for_other_exceptions¶

location: tests/unit/worker/test_error_policy.py:18

No docstring.

test_classify_error_permanent_for_common_programming_faults¶

location: tests/unit/worker/test_error_policy.py:10

No docstring.

tests/unit/worker/test_pull_adapters_rechunk.py¶

Unit tests for PullAdapters.iter_from_artifacts_rechunk() selection rules.

Contract: - If meta_list_key is provided and points to a list → chunk that list. - Otherwise → treat the entire meta as a single item (wrap into a one-element list), no heuristics.

Notes: * We keep these tests unit-level (no Kafka/Coordinator). To let the adapter terminate its polling loop, we set eof_on_task_done=True and append a synthetic {status: "complete"} marker for the source node. Our stub count_documents recognizes such queries.

Tests¶

Test	Summary	Marks	Location
test_rechunk_with_meta_list_key_splits_that_list	(no docstring)	asyncio	`tests/unit/worker/test_pull_adapters_rechunk.py:70`
test_rechunk_without_meta_list_key_wraps_meta_as_single_item	(no docstring)	asyncio	`tests/unit/worker/test_pull_adapters_rechunk.py:113`

Details¶

test_rechunk_with_meta_list_key_splits_that_list¶

marks: asyncio • location: tests/unit/worker/test_pull_adapters_rechunk.py:70

No docstring.

test_rechunk_without_meta_list_key_wraps_meta_as_single_item¶

marks: asyncio • location: tests/unit/worker/test_pull_adapters_rechunk.py:113

No docstring.