Test Suite Reference¶
This page is auto-generated from pytest docstrings at build time. Do not edit manually. See
docs/_scripts/gen_tests_doc.py.
Modules: 35 Tests: 108
How it works¶
The generator scans tests/ for test_*.py modules, reads docstrings of modules, test classes, and
test functions (including async def), and renders them here. The first sentence becomes a short
summary in the index; the full docstring appears in Details.
Writing good test docstrings¶
async def test_outbox_retry_backoff():
"""
First two sends fail → item moves to *retry* with exponential backoff;
on the 3rd attempt the item becomes *sent*.
"""
Modules¶
tests/e2e/vars/test_cancel_preserves_vars.py 1 test (marks: asyncio)
- test_cancel_midflow_preserves_coordinator_vars — (no docstring)
tests/e2e/vars/test_conditional_routing_xfail.py 1 test (marks: asyncio, xfail)
- test_conditional_routing_by_vars — (no docstring)
tests/e2e/vars/test_coord_fn_vars_echo_e2e.py 1 test (marks: asyncio)
- test_scheduler_runs_coordfn_chain_and_child_sees_vars — (no docstring)
tests/integration/adapters/test_adapter_misc.py 5 tests (marks: asyncio)
- test_cmd_from_node_alias_supported — cmd.input_inline may provide 'from_node' (single) instead of 'from_nodes' (list).
- test_cmd_unknown_adapter_causes_permanent_fail — If cmd specifies an unknown adapter, the worker must fail the task permanently and must NOT call handler.iter_batches.
- test_fallback_to_iter_batches_when_no_adapter — If neither cmd nor handler specifies an adapter, worker must fallback to handler.iter_batches.
- test_handler_adapter_used_when_cmd_absent — When cmd does not specify an adapter, the worker may use the handler's adapter if it is known.
- test_handler_unknown_adapter_ignored_when_cmd_valid — If the handler proposes an unknown adapter but cmd provides a valid one, the worker must follow cmd and succeed.
tests/integration/adapters/test_empty_upstream.py 1 test (marks: asyncio)
- test_empty_upstream_finishes_with_zero_count — (no docstring)
tests/integration/adapters/test_input_adapter_contract.py 3 tests (marks: asyncio)
- test_empty_upstream_is_not_error — Valid adapter + empty upstream → normal finish (count=0), not a failure.
- test_missing_required_args_yields_bad_input_args — cmd.input_inline with 'pull.from_artifacts.rechunk:size' must include 'size'.
- test_rechunk_without_meta_list_key_treats_each_artifact_as_single_item — No meta_list_key → each upstream artifact's meta is a single logical item (no heuristics).
tests/integration/adapters/test_rechunk_args.py 3 tests (marks: asyncio)
- test_rechunk_invalid_meta_list_key_type_fails — (no docstring)
- test_rechunk_invalid_size_fails_early — (no docstring)
- test_rechunk_missing_size_fails_early — (no docstring)
tests/integration/adapters/test_rechunk_determinism.py 1 test (marks: asyncio)
- test_rechunk_without_meta_key_treats_each_artifact_as_one_item — (no docstring)
tests/integration/adapters/test_rechunk_strict_mode.py 2 tests (marks: asyncio, cfg)
- test_rechunk_requires_meta_key_in_strict_mode — In strict mode, rechunk without meta_list_key must fail early.
- test_rechunk_with_meta_key_passes_in_strict_mode — (no docstring)
tests/integration/adapters/test_unknown_adapter.py 1 test (marks: asyncio)
- test_cmd_unknown_adapter_permanent_fail — (no docstring)
tests/integration/adapters/test_worker_adapter_precedence.py 1 test (marks: asyncio)
- test_cmd_input_inline_overrides_handler_load_input — When configurations conflict, cmd.input_inline must take precedence over handler-provided load_input.
tests/integration/streaming/test_isolation.py 2 tests (marks: asyncio)
- test_metrics_cross_talk_guard — Two concurrent tasks (same node_ids across different task_ids) do not interfere: final 'count' values are isolated per task.
- test_metrics_isolation_between_tasks — Two back-to-back tasks must keep metric aggregation isolated per task document.
tests/integration/streaming/test_metrics_dedup.py 3 tests (marks: asyncio)
- test_metrics_dedup_persists_across_coord_restart — Metric deduplication survives a coordinator restart: duplicates of STATUS (BATCH_OK/TASK_DONE) sent during the restart must not double-count.
- test_metrics_idempotent_on_duplicate_status_events — Duplicate STATUS events (BATCH_OK / TASK_DONE) must not double-increment aggregated metrics.
- test_status_out_of_order_does_not_break_aggregation — BATCH_OK STATUS events arrive out of order → the aggregated metric remains correct.
tests/integration/streaming/test_metrics_exactness.py 3 tests (marks: asyncio)
- test_metrics_multistream_exact_sum — Three upstreams -> one downstream: analyzer's aggregated 'count' must equal the sum of all totals.
- test_metrics_partial_batches_exact_count — With a remainder in the last upstream batch, analyzer's 'count' must still exactly equal total.
- test_metrics_single_stream_exact_count — Single upstream -> single downstream: analyzer's aggregated 'count' must equal the total items.
tests/integration/streaming/test_multistream_fanin.py 1 test (marks: asyncio)
- test_multistream_fanin_stream_to_one_downstream — Multi-stream fan-in: three upstream indexers stream into one analyzer.
tests/integration/streaming/test_start_when.py 2 tests (marks: asyncio)
- test_after_upstream_complete_delays_start — Without start_when, the downstream must not start until the upstream is fully finished.
- test_start_when_first_batch_starts_early — Downstream should start while upstream is still running when `start_when=first_batch` is set.
tests/integration/vars/test_coord_fn_chain.py 1 test (marks: asyncio, parametrize)
- test_coord_fn_chain_set_merge_incr_unset — Integration scenario: n1: vars.set → routing.sla=<param>, limits.max_batches=<param> n2: vars.merge → flags=<param> n3: vars.incr → counters.pages += <param> n4: vars.unset → limits.max_batches Verifies:.
tests/integration/vars/test_unset_forms.py 1 test (marks: asyncio, parametrize)
- test_vars_unset_kwargs_list_in_coord_fn — Coordinator function 'vars.unset' must accept kwargs form `keys=[...]`.
tests/test_artifacts_data.py 2 tests (marks: asyncio)
- test_merge_generic_creates_complete_artifact — coordinator_fn: merge.generic combines results of nodes 'a' and 'b'.
- test_partial_shards_and_stream_counts — Source w1 (indexer) emits batches with batch_uid → worker creates partial artifacts.
tests/test_cancel_and_restart.py 5 tests (marks: asyncio)
- test_cancel_before_any_start_keeps_all_nodes_idle — Cancel the task before any node can start: no node must enter 'running'.
- test_cancel_on_deferred_prevents_retry — Node 'flaky' fails on first attempt → becomes deferred with backoff.
- test_cascade_cancel_prevents_downstream — Cancel the task while upstream runs: downstream must not start.
- test_restart_higher_epoch_ignores_old_batch_ok — After a node restarts with a higher epoch, the coordinator must ignore a stale BATCH_OK from a lower epoch.
- test_restart_higher_epoch_ignores_old_events — After accepting epoch>=1, re-inject an old event (epoch=0).
tests/test_chaos_models.py 4 tests (marks: asyncio)
- test_chaos_coordinator_restart — Restart the Coordinator while the task is running.
- test_chaos_delays_and_duplications — Chaos mode: small broker/consumer jitter + message duplications (no drops).
- test_chaos_worker_restart_mid_stream — Restart the 'enricher' worker in the middle of the stream.
- test_e2e_streaming_with_kafka_chaos — With chaos enabled (broker/consumer jitter and message duplications, no drops), the full pipeline should still complete and produce artifacts at indexer/enricher/ocr stages.
tests/test_concurrency_limits.py 4 tests (marks: asyncio)
- test_concurrent_tasks_respect_global_limit — (no docstring)
- test_max_global_running_limit — (no docstring)
- test_max_type_concurrency_limits — (no docstring)
- test_multi_workers_same_type_rr_distribution — (no docstring)
tests/test_lease_heartbeat_resume.py 11 tests (marks: asyncio, cfg)
- test_deferred_retry_ignores_grace_gate — A DEFERRED retry must not be throttled by discovery grace window.
- test_grace_gate_blocks_then_allows_after_window — Grace window should delay start initially and allow it after the window elapses.
- test_heartbeat_hard_marks_task_failed — If hard heartbeat window is exceeded, the task should be marked FAILED.
- test_heartbeat_soft_deferred_then_recovers — With a short soft heartbeat window the task becomes DEFERRED, then recovers and finishes.
- test_heartbeat_tolerates_clock_skew — Worker clock skew should not cause a hard timeout; lease deadlines must be non-decreasing.
- test_heartbeat_updates_lease_deadline_simple — Heartbeats should move the lease.deadline_ts_ms forward while the node runs.
- test_lease_expiry_cascades_cancel — Permanent fail upstream should cascade-cancel downstream nodes.
- test_no_task_resumed_on_worker_restart — On a cold worker restart there must be no TASK_RESUMED event emitted by the worker.
- test_resume_inflight_worker_restarts_with_local_state — Restart with the same worker_id: coordinator should adopt inflight work without new epoch.
- test_task_discover_complete_artifacts_skips_node_start — If artifacts are 'complete' during discovery, node should auto-finish without starting its handler.
- test_worker_restart_with_new_id_bumps_epoch — Restart with a new worker_id must bump attempt_epoch; stale heartbeats from the old epoch are ignored.
tests/test_outbox_delivery.py 6 tests (marks: asyncio, cfg, use_outbox)
- test_outbox_backoff_caps_with_jitter — Exponential backoff is capped by max and jitter stays within a reasonable window.
- test_outbox_crash_between_send_and_mark_sent — Send succeeds, coordinator crashes before mark(sent) → after restart the outbox may resend, but there is only ONE effective delivery for (topic,key,dedup_id).
- test_outbox_dedup_survives_restart — Deduplication by (topic,key,dedup_id) persists across coordinator restart: re-enqueue of the same envelope after restart does not create a second outbox row or send.
- test_outbox_dlq_after_max_retries — After reaching max retries, the outbox row is moved to a terminal state with attempts >= max.
- test_outbox_exactly_once_fp_uniqueness — Two enqueues with identical (topic,key,dedup_id) → single outbox row and exactly one real send.
- test_outbox_retry_backoff — First two _raw_send calls fail → outbox goes to 'retry' with exponential backoff (with jitter), then on the 3rd attempt becomes 'sent'.
tests/test_pipeline_smoke.py 1 test (marks: asyncio)
- test_e2e_streaming_with_kafka_sim — Full pipeline should complete and produce artifacts at indexer/enricher/ocr stages.
tests/test_reliability_failures.py 7 tests (marks: asyncio, cfg)
- test_coordinator_restart_adopts_inflight_without_new_epoch — When the coordinator restarts, it should adopt in-flight work without incrementing the worker's attempt_epoch unnecessarily (i.e., source keeps epoch=1).
- test_explicit_cascade_cancel_moves_node_to_deferred — Explicit cascade cancel should move a running node to a cancelling/deferred/queued state within the configured cancel_grace window.
- test_heartbeat_updates_lease_deadline — Heartbeats from a worker must extend the lease deadline in the task document.
- test_idempotent_metrics_on_duplicate_events — Duplicated STATUS events (BATCH_OK/TASK_DONE) must not double-count metrics.
- test_permanent_fail_cascades_cancel_and_task_failed — Permanent failure in an upstream node should cause the task to fail, while dependents get cancelled/deferred/queued depending on race windows.
- test_status_fencing_ignores_stale_epoch — Status fencing must ignore events from a stale attempt_epoch.
- test_transient_failure_deferred_then_retry — A transient error should defer the node and succeed on retry according to retry_policy (max>=2).
tests/test_scheduler_fanin_routing.py 7 tests (marks: asyncio, xfail)
- test_coordinator_fn_merge_without_worker — coordinator_fn node should run without a worker and produce artifacts that a downstream analyzer can consume via pull.from_artifacts.
- test_edges_vs_routing_priority — If explicit graph edges are present and a node also has routing.on_success, edges should take precedence (routing target should not run).
- test_fanin_all_waits_all_parents — Fan-in ALL: without start_when hint, downstream should only start after both parents are finished.
- test_fanin_any_starts_early — Fan-in ANY: downstream should start as soon as at least one parent streams (start_when='first_batch'), even if other parents are not yet finished.
- test_fanin_count_n — Fan-in COUNT:N placeholder: downstream should start when at least N parents are ready.
- test_fanout_one_upstream_two_downstreams_mixed_start_when — One upstream → two downstreams: A has start_when=first_batch (starts early), B has no start_when (waits for completion).
- test_routing_on_failure_triggers_remediator_only — On upstream TASK_FAILED(permanent=True), only the 'on_failure' remediator should run; 'on_success' must not.
tests/unit/vars/test_detectors.py 3 tests (marks: asyncio)
- test_detector_function_with_merge — DI: callable detector.
- test_detector_object_block_and_soft — DI: object detector.
- test_detector_string_import — DI: string import "module:factory".
tests/unit/vars/test_incr.py 5 tests (marks: asyncio, parametrize)
- test_vars_incr_concurrency_sum — Concurrency: 10x1 + 5x2 → exact sum.
- test_vars_incr_creates_missing_leaf — Creating a missing leaf should create it and increment.
- test_vars_incr_float_and_negative_and_type_conflict — Mixed numeric types are allowed; negative increments also work.
- test_vars_incr_rejects_nan_inf — NaN / +/-Inf must be rejected.
- test_vars_incr_validates_key — Key validation (segments).
tests/unit/vars/test_merge.py 4 tests (marks: asyncio)
- test_vars_merge_block_sensitive_with_external_detector — External detector: block when block_sensitive=True, otherwise write but count sensitive_hits in logs.
- test_vars_merge_deep_overwrite_mixed_types — Deep-merge with leaf overwrite; mixed types (dict → leaf).
- test_vars_merge_limits_and_value_size_str_and_bytes — Limits: number of paths, depth, full path length, and value size (str/bytes).
- test_vars_merge_noop_empty — Empty input → no-op (touched=0), 'vars' field is not created.
tests/unit/vars/test_set.py 7 tests (marks: asyncio, parametrize)
- test_vars_set_block_sensitive_with_external_detector — With block_sensitive=True the detector blocks.
- test_vars_set_deterministic_keys_order_and_sensitive_hits — Deterministic sort order in logs + sensitive_hits counter.
- test_vars_set_dotted_and_nested_combo_and_logging — Dotted + nested forms in one call; verify values and structured logs.
- test_vars_set_limits_depth_and_path_len — Limits: key depth and full path length.
- test_vars_set_limits_paths_count — Limit on number of paths in a single $set.
- test_vars_set_limits_value_size_str_and_bytes — Limit on value size (string/bytes).
- test_vars_set_validates_key_segments — Validate segments: $, '.', NUL, segment length.
tests/unit/vars/test_unset.py 4 tests (marks: asyncio, parametrize)
- test_vars_unset_accepts_args_and_kwargs — The adapter should accept both *args (varargs keys) and kwargs form 'keys=[...]'.
- test_vars_unset_path_length_limit — Validate full path length limit (including prefix 'coordinator.vars.').
- test_vars_unset_simple_nested_and_noop — Remove a simple leaf and a nested branch; missing key is a no-op (operation remains valid and accounted in 'touched').
- test_vars_unset_validates_paths — Validate key/path formats: $, '.', NUL, segment length.
tests/unit/worker/test_error_classification.py 1 test
- test_programming_errors_are_permanent — (no docstring)
tests/unit/worker/test_error_policy.py 2 tests
- test_classify_error_non_permanent_for_other_exceptions — (no docstring)
- test_classify_error_permanent_for_common_programming_faults — (no docstring)
tests/unit/worker/test_pull_adapters_rechunk.py 2 tests (marks: asyncio)
- test_rechunk_with_meta_list_key_splits_that_list — (no docstring)
- test_rechunk_without_meta_list_key_wraps_meta_as_single_item — (no docstring)
tests/e2e/vars/test_cancel_preserves_vars.py¶
E2E: cancelling a running task does not erase previously written coordinator.vars.
Scenario: - n1: coordinator_fn (vars.set) writes task variables. - s : long-running cancellable 'source' node (worker-based). - Cancel the task while 's' is running. Assertions: - Task (or node) transitions away from 'running' due to cancel (implementation-dependent). - coordinator.vars written by n1 remain intact after cancellation.
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_cancel_midflow_preserves_coordinator_vars | (no docstring) | asyncio | tests/e2e/vars/test_cancel_preserves_vars.py:27 |
Details¶
test_cancel_midflow_preserves_coordinator_vars¶
marks: asyncio • location: tests/e2e/vars/test_cancel_preserves_vars.py:27
No docstring.
tests/e2e/vars/test_conditional_routing_xfail.py¶
XFails (placeholder): conditional routing by coordinator.vars is not implemented yet.
Intended behavior (when implemented): - n1 sets coordinator.vars.routing.sla = "gold". - Downstream 'gold_only' should run when sla == "gold". - Downstream 'silver_only' should NOT run in this case.
Current status: - Marked xfail until coordinator supports variable-driven conditional routing.
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_conditional_routing_by_vars | (no docstring) | asyncio, xfail | tests/e2e/vars/test_conditional_routing_xfail.py:25 |
Details¶
test_conditional_routing_by_vars¶
marks: asyncio, xfail • location: tests/e2e/vars/test_conditional_routing_xfail.py:25
No docstring.
tests/e2e/vars/test_coord_fn_vars_echo_e2e.py¶
E2E (slow): scheduler-driven coordinator_fn → coordinator_fn chain.
Graph: n1: coordinator_fn (vars.set) → routing.sla="gold", limits.max_batches=5 n2: coordinator_fn (vars.echo) → reads coordinator.vars and writes an artifact echo
Checks: - scheduler starts nodes automatically (no manual _run_coordinator_fn) - both nodes finish - echo artifact exists for n2 and contains an exact copy of coordinator.vars - updated_at present (and sane)
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_scheduler_runs_coordfn_chain_and_child_sees_vars | (no docstring) | asyncio | tests/e2e/vars/test_coord_fn_vars_echo_e2e.py:26 |
Details¶
test_scheduler_runs_coordfn_chain_and_child_sees_vars¶
marks: asyncio • location: tests/e2e/vars/test_coord_fn_vars_echo_e2e.py:26
No docstring.
tests/integration/adapters/test_adapter_misc.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_cmd_from_node_alias_supported | cmd.input_inline may provide 'from_node' (single) instead of 'from_nodes' (list). | asyncio | tests/integration/adapters/test_adapter_misc.py:213 |
| test_cmd_unknown_adapter_causes_permanent_fail | If cmd specifies an unknown adapter, the worker must fail the task permanently and must NOT call handler.iter_batches. | asyncio | tests/integration/adapters/test_adapter_misc.py:179 |
| test_fallback_to_iter_batches_when_no_adapter | If neither cmd nor handler specifies an adapter, worker must fallback to handler.iter_batches. | asyncio | tests/integration/adapters/test_adapter_misc.py:147 |
| test_handler_adapter_used_when_cmd_absent | When cmd does not specify an adapter, the worker may use the handler's adapter if it is known. | asyncio | tests/integration/adapters/test_adapter_misc.py:102 |
| test_handler_unknown_adapter_ignored_when_cmd_valid | If the handler proposes an unknown adapter but cmd provides a valid one, the worker must follow cmd and succeed. | asyncio | tests/integration/adapters/test_adapter_misc.py:262 |
Details¶
test_cmd_from_node_alias_supported¶
marks: asyncio • location: tests/integration/adapters/test_adapter_misc.py:213
cmd.input_inline may provide 'from_node' (single) instead of 'from_nodes' (list).
test_cmd_unknown_adapter_causes_permanent_fail¶
marks: asyncio • location: tests/integration/adapters/test_adapter_misc.py:179
If cmd specifies an unknown adapter, the worker must fail the task permanently and must NOT call handler.iter_batches.
test_fallback_to_iter_batches_when_no_adapter¶
marks: asyncio • location: tests/integration/adapters/test_adapter_misc.py:147
If neither cmd nor handler specifies an adapter, worker must fallback to handler.iter_batches.
test_handler_adapter_used_when_cmd_absent¶
marks: asyncio • location: tests/integration/adapters/test_adapter_misc.py:102
When cmd does not specify an adapter, the worker may use the handler's adapter if it is known.
test_handler_unknown_adapter_ignored_when_cmd_valid¶
marks: asyncio • location: tests/integration/adapters/test_adapter_misc.py:262
If the handler proposes an unknown adapter but cmd provides a valid one, the worker must follow cmd and succeed.
tests/integration/adapters/test_empty_upstream.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_empty_upstream_finishes_with_zero_count | (no docstring) | asyncio | tests/integration/adapters/test_empty_upstream.py:21 |
Details¶
test_empty_upstream_finishes_with_zero_count¶
marks: asyncio • location: tests/integration/adapters/test_empty_upstream.py:21
No docstring.
tests/integration/adapters/test_input_adapter_contract.py¶
Integration tests for the orchestrator-first input adapter contract.
Covers: - Unknown adapter in cmd → permanent TASK_FAILED (bad_input_adapter). - Missing required adapter args in cmd → permanent TASK_FAILED (bad_input_args). - Empty upstream is not an error (finished with count=0). - Rechunk without meta_list_key treats each artifact as a single logical item (no heuristics).
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_empty_upstream_is_not_error | Valid adapter + empty upstream → normal finish (count=0), not a failure. | asyncio | tests/integration/adapters/test_input_adapter_contract.py:80 |
| test_missing_required_args_yields_bad_input_args | cmd.input_inline with 'pull.from_artifacts.rechunk:size' must include 'size'. | asyncio | tests/integration/adapters/test_input_adapter_contract.py:49 |
| test_rechunk_without_meta_list_key_treats_each_artifact_as_single_item | No meta_list_key → each upstream artifact's meta is a single logical item (no heuristics). | asyncio | tests/integration/adapters/test_input_adapter_contract.py:127 |
Details¶
test_empty_upstream_is_not_error¶
marks: asyncio • location: tests/integration/adapters/test_input_adapter_contract.py:80
Valid adapter + empty upstream → normal finish (count=0), not a failure.
test_missing_required_args_yields_bad_input_args¶
marks: asyncio • location: tests/integration/adapters/test_input_adapter_contract.py:49
cmd.input_inline with 'pull.from_artifacts.rechunk:size' must include 'size'. Omission → early permanent failure (bad_input_args). Handler.iter_batches() must NOT be called.
test_rechunk_without_meta_list_key_treats_each_artifact_as_single_item¶
marks: asyncio • location: tests/integration/adapters/test_input_adapter_contract.py:127
No meta_list_key → each upstream artifact's meta is a single logical item (no heuristics). If upstream emits 2 partial artifacts, probe should aggregate 'count' == 2 (not N of inner lists).
tests/integration/adapters/test_rechunk_args.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_rechunk_invalid_meta_list_key_type_fails | (no docstring) | asyncio | tests/integration/adapters/test_rechunk_args.py:93 |
| test_rechunk_invalid_size_fails_early | (no docstring) | asyncio | tests/integration/adapters/test_rechunk_args.py:60 |
| test_rechunk_missing_size_fails_early | (no docstring) | asyncio | tests/integration/adapters/test_rechunk_args.py:50 |
Details¶
test_rechunk_invalid_meta_list_key_type_fails¶
marks: asyncio • location: tests/integration/adapters/test_rechunk_args.py:93
No docstring.
test_rechunk_invalid_size_fails_early¶
marks: asyncio • location: tests/integration/adapters/test_rechunk_args.py:60
No docstring.
test_rechunk_missing_size_fails_early¶
marks: asyncio • location: tests/integration/adapters/test_rechunk_args.py:50
No docstring.
tests/integration/adapters/test_rechunk_determinism.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_rechunk_without_meta_key_treats_each_artifact_as_one_item | (no docstring) | asyncio | tests/integration/adapters/test_rechunk_determinism.py:24 |
Details¶
test_rechunk_without_meta_key_treats_each_artifact_as_one_item¶
marks: asyncio • location: tests/integration/adapters/test_rechunk_determinism.py:24
No docstring.
tests/integration/adapters/test_rechunk_strict_mode.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_rechunk_requires_meta_key_in_strict_mode | In strict mode, rechunk without meta_list_key must fail early. | asyncio, cfg | tests/integration/adapters/test_rechunk_strict_mode.py:15 |
| test_rechunk_with_meta_key_passes_in_strict_mode | (no docstring) | asyncio, cfg | tests/integration/adapters/test_rechunk_strict_mode.py:60 |
Details¶
test_rechunk_requires_meta_key_in_strict_mode¶
marks: asyncio, cfg • location: tests/integration/adapters/test_rechunk_strict_mode.py:15
In strict mode, rechunk without meta_list_key must fail early.
test_rechunk_with_meta_key_passes_in_strict_mode¶
marks: asyncio, cfg • location: tests/integration/adapters/test_rechunk_strict_mode.py:60
No docstring.
tests/integration/adapters/test_unknown_adapter.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_cmd_unknown_adapter_permanent_fail | (no docstring) | asyncio | tests/integration/adapters/test_unknown_adapter.py:24 |
Details¶
test_cmd_unknown_adapter_permanent_fail¶
marks: asyncio • location: tests/integration/adapters/test_unknown_adapter.py:24
No docstring.
tests/integration/adapters/test_worker_adapter_precedence.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_cmd_input_inline_overrides_handler_load_input | When configurations conflict, cmd.input_inline must take precedence over handler-provided load_input. | asyncio | tests/integration/adapters/test_worker_adapter_precedence.py:49 |
Details¶
test_cmd_input_inline_overrides_handler_load_input¶
marks: asyncio • location: tests/integration/adapters/test_worker_adapter_precedence.py:49
When configurations conflict, cmd.input_inline must take precedence over handler-provided load_input.
tests/integration/streaming/test_isolation.py¶
Isolation of aggregation and cross-talk guard between tasks.
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_metrics_cross_talk_guard | Two concurrent tasks (same node_ids across different task_ids) do not interfere: final 'count' values are isolated per task. | asyncio | tests/integration/streaming/test_isolation.py:62 |
| test_metrics_isolation_between_tasks | Two back-to-back tasks must keep metric aggregation isolated per task document. | asyncio | tests/integration/streaming/test_isolation.py:17 |
Details¶
test_metrics_cross_talk_guard¶
marks: asyncio • location: tests/integration/streaming/test_isolation.py:62
Two concurrent tasks (same node_ids across different task_ids) do not interfere: final 'count' values are isolated per task.
test_metrics_isolation_between_tasks¶
marks: asyncio • location: tests/integration/streaming/test_isolation.py:17
Two back-to-back tasks must keep metric aggregation isolated per task document.
tests/integration/streaming/test_metrics_dedup.py¶
Idempotency & event ordering for aggregated metrics.
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_metrics_dedup_persists_across_coord_restart | Metric deduplication survives a coordinator restart: duplicates of STATUS (BATCH_OK/TASK_DONE) sent during the restart must not double-count. | asyncio | tests/integration/streaming/test_metrics_dedup.py:155 |
| test_metrics_idempotent_on_duplicate_status_events | Duplicate STATUS events (BATCH_OK / TASK_DONE) must not double-increment aggregated metrics. | asyncio | tests/integration/streaming/test_metrics_dedup.py:21 |
| test_status_out_of_order_does_not_break_aggregation | BATCH_OK STATUS events arrive out of order → the aggregated metric remains correct. | asyncio | tests/integration/streaming/test_metrics_dedup.py:82 |
Details¶
test_metrics_dedup_persists_across_coord_restart¶
marks: asyncio • location: tests/integration/streaming/test_metrics_dedup.py:155
Metric deduplication survives a coordinator restart: duplicates of STATUS
(BATCH_OK/TASK_DONE) sent during the restart must not double-count.
Skips if the fixture coord has no restart method`.
test_metrics_idempotent_on_duplicate_status_events¶
marks: asyncio • location: tests/integration/streaming/test_metrics_dedup.py:21
Duplicate STATUS events (BATCH_OK / TASK_DONE) must not double-increment aggregated metrics.
We monkeypatch AIOKafkaProducerMock.send_and_wait to produce the same STATUS event twice for status topics. The Coordinator should deduplicate by envelope key and keep metrics stable.
test_status_out_of_order_does_not_break_aggregation¶
marks: asyncio • location: tests/integration/streaming/test_metrics_dedup.py:82
BATCH_OK STATUS events arrive out of order → the aggregated metric remains correct. We hold the first BATCH_OK so later ones arrive first.
tests/integration/streaming/test_metrics_exactness.py¶
Exactness and accounting of aggregated metrics.
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_metrics_multistream_exact_sum | Three upstreams -> one downstream: analyzer's aggregated 'count' must equal the sum of all totals. | asyncio | tests/integration/streaming/test_metrics_exactness.py:56 |
| test_metrics_partial_batches_exact_count | With a remainder in the last upstream batch, analyzer's 'count' must still exactly equal total. | asyncio | tests/integration/streaming/test_metrics_exactness.py:105 |
| test_metrics_single_stream_exact_count | Single upstream -> single downstream: analyzer's aggregated 'count' must equal the total items. | asyncio | tests/integration/streaming/test_metrics_exactness.py:15 |
Details¶
test_metrics_multistream_exact_sum¶
marks: asyncio • location: tests/integration/streaming/test_metrics_exactness.py:56
Three upstreams -> one downstream: analyzer's aggregated 'count' must equal the sum of all totals.
test_metrics_partial_batches_exact_count¶
marks: asyncio • location: tests/integration/streaming/test_metrics_exactness.py:105
With a remainder in the last upstream batch, analyzer's 'count' must still exactly equal total.
test_metrics_single_stream_exact_count¶
marks: asyncio • location: tests/integration/streaming/test_metrics_exactness.py:15
Single upstream -> single downstream: analyzer's aggregated 'count' must equal the total items.
tests/integration/streaming/test_multistream_fanin.py¶
Multistream fan-in to a single downstream consumer.
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_multistream_fanin_stream_to_one_downstream | Multi-stream fan-in: three upstream indexers stream into one analyzer. | asyncio | tests/integration/streaming/test_multistream_fanin.py:17 |
Details¶
test_multistream_fanin_stream_to_one_downstream¶
marks: asyncio • location: tests/integration/streaming/test_multistream_fanin.py:17
Multi-stream fan-in: three upstream indexers stream into one analyzer. Analyzer should start early on first batch and eventually see the full flow.
tests/integration/streaming/test_start_when.py¶
Tests for streaming/async fan-in behavior: early vs delayed downstream start.
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_after_upstream_complete_delays_start | Without start_when, the downstream must not start until the upstream is fully finished. | asyncio | tests/integration/streaming/test_start_when.py:85 |
| test_start_when_first_batch_starts_early | Downstream should start while upstream is still running when start_when=first_batch is set. |
asyncio | tests/integration/streaming/test_start_when.py:27 |
Details¶
test_after_upstream_complete_delays_start¶
marks: asyncio • location: tests/integration/streaming/test_start_when.py:85
Without start_when, the downstream must not start until the upstream is fully finished. We first assert a negative hold window while the upstream is running, then assert the downstream starts and finishes after the upstream completes.
test_start_when_first_batch_starts_early¶
marks: asyncio • location: tests/integration/streaming/test_start_when.py:27
Downstream should start while upstream is still running when start_when=first_batch is set.
tests/integration/vars/test_coord_fn_chain.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_coord_fn_chain_set_merge_incr_unset | Integration scenario: n1: vars.set → routing.sla=, limits.max_batches= n2: vars.merge → flags= n3: vars.incr → counters.pages += n4: vars.unset → limits.max_batches Verifies:. | asyncio, parametrize | tests/integration/vars/test_coord_fn_chain.py:27 |
Details¶
test_coord_fn_chain_set_merge_incr_unset¶
marks: asyncio, parametrize • location: tests/integration/vars/test_coord_fn_chain.py:27
Integration scenario: n1: vars.set → routing.sla=, limits.max_batches= n2: vars.merge → flags= n3: vars.incr → counters.pages += n4: vars.unset → limits.max_batches
Verifies: - each node finishes (status=finished) - final coordinator.vars matches expected - updated_at increases across steps (if tracked by inmemory_db)
tests/integration/vars/test_unset_forms.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_vars_unset_kwargs_list_in_coord_fn | Coordinator function 'vars.unset' must accept kwargs form keys=[...]. |
asyncio, parametrize | tests/integration/vars/test_unset_forms.py:20 |
Details¶
test_vars_unset_kwargs_list_in_coord_fn¶
marks: asyncio, parametrize • location: tests/integration/vars/test_unset_forms.py:20
Coordinator function 'vars.unset' must accept kwargs form keys=[...].
tests/test_artifacts_data.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_merge_generic_creates_complete_artifact | coordinator_fn: merge.generic combines results of nodes 'a' and 'b'. | asyncio | tests/test_artifacts_data.py:158 |
| test_partial_shards_and_stream_counts | Source w1 (indexer) emits batches with batch_uid → worker creates partial artifacts. | asyncio | tests/test_artifacts_data.py:90 |
Details¶
test_merge_generic_creates_complete_artifact¶
marks: asyncio • location: tests/test_artifacts_data.py:158
coordinator_fn: merge.generic combines results of nodes 'a' and 'b'.
We verify: * the task finishes * merge node 'm' has a 'complete' artifact * nodes 'a' and 'b' have artifacts (partial/complete — does not matter)
test_partial_shards_and_stream_counts¶
marks: asyncio • location: tests/test_artifacts_data.py:90
Source w1 (indexer) emits batches with batch_uid → worker creates partial artifacts. Completion marks a 'complete' artifact. Analyzer reads via rechunk and accumulates a counter.
We verify: * number of partial shards at w1 == ceil(total / batch_size) * a 'complete' artifact exists at w1 * w2 has node.stats.count == total
tests/test_cancel_and_restart.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_cancel_before_any_start_keeps_all_nodes_idle | Cancel the task before any node can start: no node must enter 'running'. | asyncio | tests/test_cancel_and_restart.py:231 |
| test_cancel_on_deferred_prevents_retry | Node 'flaky' fails on first attempt → becomes deferred with backoff. | asyncio | tests/test_cancel_and_restart.py:295 |
| test_cascade_cancel_prevents_downstream | Cancel the task while upstream runs: downstream must not start. | asyncio | tests/test_cancel_and_restart.py:97 |
| test_restart_higher_epoch_ignores_old_batch_ok | After a node restarts with a higher epoch, the coordinator must ignore a stale BATCH_OK from a lower epoch. | asyncio | tests/test_cancel_and_restart.py:366 |
| test_restart_higher_epoch_ignores_old_events | After accepting epoch>=1, re-inject an old event (epoch=0). | asyncio | tests/test_cancel_and_restart.py:169 |
Details¶
test_cancel_before_any_start_keeps_all_nodes_idle¶
marks: asyncio • location: tests/test_cancel_and_restart.py:231
Cancel the task before any node can start: no node must enter 'running'.
test_cancel_on_deferred_prevents_retry¶
marks: asyncio • location: tests/test_cancel_and_restart.py:295
Node 'flaky' fails on first attempt → becomes deferred with backoff. Cancel the task right after TASK_FAILED(epoch=1) and ensure no higher epoch is accepted.
test_cascade_cancel_prevents_downstream¶
marks: asyncio • location: tests/test_cancel_and_restart.py:97
Cancel the task while upstream runs: downstream must not start.
test_restart_higher_epoch_ignores_old_batch_ok¶
marks: asyncio • location: tests/test_cancel_and_restart.py:366
After a node restarts with a higher epoch, the coordinator must ignore a stale BATCH_OK from a lower epoch. Also ensure injected stale event doesn't create duplicate metrics.
test_restart_higher_epoch_ignores_old_events¶
marks: asyncio • location: tests/test_cancel_and_restart.py:169
After accepting epoch>=1, re-inject an old event (epoch=0). Coordinator must ignore it by fencing.
tests/test_chaos_models.py¶
End-to-end streaming smoke tests under chaos.
Pipelines covered: - w1(indexer) -> w2(enricher) -> w5(ocr) -> w4(analyzer) ___________/ / _____ w3(merge) _/
Checks: - all nodes finish successfully (including the merge node in the extended graph) - artifacts are produced for indexer/enricher/ocr (streaming stages) - resiliency under worker and coordinator restarts
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_chaos_coordinator_restart | Restart the Coordinator while the task is running. | asyncio | tests/test_chaos_models.py:344 |
| test_chaos_delays_and_duplications | Chaos mode: small broker/consumer jitter + message duplications (no drops). | asyncio | tests/test_chaos_models.py:277 |
| test_chaos_worker_restart_mid_stream | Restart the 'enricher' worker in the middle of the stream. | asyncio | tests/test_chaos_models.py:304 |
| test_e2e_streaming_with_kafka_chaos | With chaos enabled (broker/consumer jitter and message duplications, no drops), the full pipeline should still complete and produce artifacts at indexer/enricher/ocr stages. | asyncio | tests/test_chaos_models.py:245 |
Details¶
test_chaos_coordinator_restart¶
marks: asyncio • location: tests/test_chaos_models.py:344
Restart the Coordinator while the task is running. Expect the pipeline to recover and finish.
test_chaos_delays_and_duplications¶
marks: asyncio • location: tests/test_chaos_models.py:277
Chaos mode: small broker/consumer jitter + message duplications (no drops). Expect the pipeline to finish and produce artifacts for w1/w2/w5.
test_chaos_worker_restart_mid_stream¶
marks: asyncio • location: tests/test_chaos_models.py:304
Restart the 'enricher' worker in the middle of the stream. Expect the coordinator to fence and the pipeline to still finish.
test_e2e_streaming_with_kafka_chaos¶
marks: asyncio • location: tests/test_chaos_models.py:245
With chaos enabled (broker/consumer jitter and message duplications, no drops), the full pipeline should still complete and produce artifacts at indexer/enricher/ocr stages.
tests/test_concurrency_limits.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_concurrent_tasks_respect_global_limit | (no docstring) | asyncio | tests/test_concurrency_limits.py:302 |
| test_max_global_running_limit | (no docstring) | asyncio | tests/test_concurrency_limits.py:153 |
| test_max_type_concurrency_limits | (no docstring) | asyncio | tests/test_concurrency_limits.py:196 |
| test_multi_workers_same_type_rr_distribution | (no docstring) | asyncio | tests/test_concurrency_limits.py:252 |
Details¶
test_concurrent_tasks_respect_global_limit¶
marks: asyncio • location: tests/test_concurrency_limits.py:302
No docstring.
test_max_global_running_limit¶
marks: asyncio • location: tests/test_concurrency_limits.py:153
No docstring.
test_max_type_concurrency_limits¶
marks: asyncio • location: tests/test_concurrency_limits.py:196
No docstring.
test_multi_workers_same_type_rr_distribution¶
marks: asyncio • location: tests/test_concurrency_limits.py:252
No docstring.
tests/test_lease_heartbeat_resume.py¶
Heartbeat / grace-window / resume tests.
Covers: - Soft heartbeat -> deferred -> recovery - Hard heartbeat -> task failed - Resume with local worker state (no new epoch) - Discover w/ complete artifacts -> skip start - Grace-gate delay, and that deferred retry ignores the gate - Heartbeat extends lease deadline - Restart with a new worker_id bumps attempt_epoch and fences stale heartbeats
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_deferred_retry_ignores_grace_gate | A DEFERRED retry must not be throttled by discovery grace window. | cfg, asyncio | tests/test_lease_heartbeat_resume.py:258 |
| test_grace_gate_blocks_then_allows_after_window | Grace window should delay start initially and allow it after the window elapses. | cfg, asyncio | tests/test_lease_heartbeat_resume.py:209 |
| test_heartbeat_hard_marks_task_failed | If hard heartbeat window is exceeded, the task should be marked FAILED. | cfg, asyncio | tests/test_lease_heartbeat_resume.py:77 |
| test_heartbeat_soft_deferred_then_recovers | With a short soft heartbeat window the task becomes DEFERRED, then recovers and finishes. | cfg, asyncio | tests/test_lease_heartbeat_resume.py:39 |
| test_heartbeat_tolerates_clock_skew | Worker clock skew should not cause a hard timeout; lease deadlines must be non-decreasing. | asyncio, cfg | tests/test_lease_heartbeat_resume.py:556 |
| test_heartbeat_updates_lease_deadline_simple | Heartbeats should move the lease.deadline_ts_ms forward while the node runs. | cfg, asyncio | tests/test_lease_heartbeat_resume.py:348 |
| test_lease_expiry_cascades_cancel | Permanent fail upstream should cascade-cancel downstream nodes. | asyncio, cfg | tests/test_lease_heartbeat_resume.py:625 |
| test_no_task_resumed_on_worker_restart | On a cold worker restart there must be no TASK_RESUMED event emitted by the worker. | cfg, asyncio | tests/test_lease_heartbeat_resume.py:292 |
| test_resume_inflight_worker_restarts_with_local_state | Restart with the same worker_id: coordinator should adopt inflight work without new epoch. | cfg, asyncio | tests/test_lease_heartbeat_resume.py:113 |
| test_task_discover_complete_artifacts_skips_node_start | If artifacts are 'complete' during discovery, node should auto-finish without starting its handler. | asyncio | tests/test_lease_heartbeat_resume.py:173 |
| test_worker_restart_with_new_id_bumps_epoch | Restart with a new worker_id must bump attempt_epoch; stale heartbeats from the old epoch are ignored. | asyncio, cfg | tests/test_lease_heartbeat_resume.py:394 |
Details¶
test_deferred_retry_ignores_grace_gate¶
marks: cfg, asyncio • location: tests/test_lease_heartbeat_resume.py:258
A DEFERRED retry must not be throttled by discovery grace window.
test_grace_gate_blocks_then_allows_after_window¶
marks: cfg, asyncio • location: tests/test_lease_heartbeat_resume.py:209
Grace window should delay start initially and allow it after the window elapses.
test_heartbeat_hard_marks_task_failed¶
marks: cfg, asyncio • location: tests/test_lease_heartbeat_resume.py:77
If hard heartbeat window is exceeded, the task should be marked FAILED.
test_heartbeat_soft_deferred_then_recovers¶
marks: cfg, asyncio • location: tests/test_lease_heartbeat_resume.py:39
With a short soft heartbeat window the task becomes DEFERRED, then recovers and finishes.
test_heartbeat_tolerates_clock_skew¶
marks: asyncio, cfg • location: tests/test_lease_heartbeat_resume.py:556
Worker clock skew should not cause a hard timeout; lease deadlines must be non-decreasing.
test_heartbeat_updates_lease_deadline_simple¶
marks: cfg, asyncio • location: tests/test_lease_heartbeat_resume.py:348
Heartbeats should move the lease.deadline_ts_ms forward while the node runs.
test_lease_expiry_cascades_cancel¶
marks: asyncio, cfg • location: tests/test_lease_heartbeat_resume.py:625
Permanent fail upstream should cascade-cancel downstream nodes.
test_no_task_resumed_on_worker_restart¶
marks: cfg, asyncio • location: tests/test_lease_heartbeat_resume.py:292
On a cold worker restart there must be no TASK_RESUMED event emitted by the worker.
test_resume_inflight_worker_restarts_with_local_state¶
marks: cfg, asyncio • location: tests/test_lease_heartbeat_resume.py:113
Restart with the same worker_id: coordinator should adopt inflight work without new epoch.
test_task_discover_complete_artifacts_skips_node_start¶
marks: asyncio • location: tests/test_lease_heartbeat_resume.py:173
If artifacts are 'complete' during discovery, node should auto-finish without starting its handler.
test_worker_restart_with_new_id_bumps_epoch¶
marks: asyncio, cfg • location: tests/test_lease_heartbeat_resume.py:394
Restart with a new worker_id must bump attempt_epoch; stale heartbeats from the old epoch are ignored.
tests/test_outbox_delivery.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_outbox_backoff_caps_with_jitter | Exponential backoff is capped by max and jitter stays within a reasonable window. | asyncio, use_outbox, cfg | tests/test_outbox_delivery.py:366 |
| test_outbox_crash_between_send_and_mark_sent | Send succeeds, coordinator crashes before mark(sent) → after restart the outbox may resend, but there is only ONE effective delivery for (topic,key,dedup_id). | asyncio, use_outbox | tests/test_outbox_delivery.py:214 |
| test_outbox_dedup_survives_restart | Deduplication by (topic,key,dedup_id) persists across coordinator restart: re-enqueue of the same envelope after restart does not create a second outbox row or send. | asyncio, use_outbox | tests/test_outbox_delivery.py:305 |
| test_outbox_dlq_after_max_retries | After reaching max retries, the outbox row is moved to a terminal state with attempts >= max. | asyncio, use_outbox, cfg | tests/test_outbox_delivery.py:419 |
| test_outbox_exactly_once_fp_uniqueness | Two enqueues with identical (topic,key,dedup_id) → single outbox row and exactly one real send. | asyncio | tests/test_outbox_delivery.py:138 |
| test_outbox_retry_backoff | First two _raw_send calls fail → outbox goes to 'retry' with exponential backoff (with jitter), then on the 3rd attempt becomes 'sent'. | asyncio | tests/test_outbox_delivery.py:46 |
Details¶
test_outbox_backoff_caps_with_jitter¶
marks: asyncio, use_outbox, cfg • location: tests/test_outbox_delivery.py:366
Exponential backoff is capped by max and jitter stays within a reasonable window.
test_outbox_crash_between_send_and_mark_sent¶
marks: asyncio, use_outbox • location: tests/test_outbox_delivery.py:214
Send succeeds, coordinator crashes before mark(sent) → after restart the outbox may resend, but there is only ONE effective delivery for (topic,key,dedup_id).
test_outbox_dedup_survives_restart¶
marks: asyncio, use_outbox • location: tests/test_outbox_delivery.py:305
Deduplication by (topic,key,dedup_id) persists across coordinator restart: re-enqueue of the same envelope after restart does not create a second outbox row or send.
test_outbox_dlq_after_max_retries¶
marks: asyncio, use_outbox, cfg • location: tests/test_outbox_delivery.py:419
After reaching max retries, the outbox row is moved to a terminal state with attempts >= max.
test_outbox_exactly_once_fp_uniqueness¶
marks: asyncio • location: tests/test_outbox_delivery.py:138
Two enqueues with identical (topic,key,dedup_id) → single outbox row and exactly one real send.
test_outbox_retry_backoff¶
marks: asyncio • location: tests/test_outbox_delivery.py:46
First two _raw_send calls fail → outbox goes to 'retry' with exponential backoff (with jitter), then on the 3rd attempt becomes 'sent'.
tests/test_pipeline_smoke.py¶
End-to-end streaming smoke test.
Pipeline: w1(indexer) -> w2(enricher) -> w5(ocr) -> w4(analyzer) ___________/ / _____ w3(merge) _/
Checks: - all nodes finish successfully - artifacts are produced for indexer/enricher/ocr
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_e2e_streaming_with_kafka_sim | Full pipeline should complete and produce artifacts at indexer/enricher/ocr stages. | asyncio | tests/test_pipeline_smoke.py:135 |
Details¶
test_e2e_streaming_with_kafka_sim¶
marks: asyncio • location: tests/test_pipeline_smoke.py:135
Full pipeline should complete and produce artifacts at indexer/enricher/ocr stages.
tests/test_reliability_failures.py¶
Tests around source-like roles: idempotent metrics, retries, fencing, coordinator restart adoption, cascade cancel, and heartbeat/lease updates.
Design: - All runtime knobs come via coord_cfg/worker_cfg fixtures (see conftest.py). - For one-off tweaks, use @pytest.mark.cfg(coord={...}, worker={...}). - Workers are started with in-memory DB and handlers built from helper builders.
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_coordinator_restart_adopts_inflight_without_new_epoch | When the coordinator restarts, it should adopt in-flight work without incrementing the worker's attempt_epoch unnecessarily (i.e., source keeps epoch=1). | asyncio | tests/test_reliability_failures.py:218 |
| test_explicit_cascade_cancel_moves_node_to_deferred | Explicit cascade cancel should move a running node to a cancelling/deferred/queued state within the configured cancel_grace window. | cfg, asyncio | tests/test_reliability_failures.py:273 |
| test_heartbeat_updates_lease_deadline | Heartbeats from a worker must extend the lease deadline in the task document. | cfg, asyncio | tests/test_reliability_failures.py:322 |
| test_idempotent_metrics_on_duplicate_events | Duplicated STATUS events (BATCH_OK/TASK_DONE) must not double-count metrics. | asyncio | tests/test_reliability_failures.py:36 |
| test_permanent_fail_cascades_cancel_and_task_failed | Permanent failure in an upstream node should cause the task to fail, while dependents get cancelled/deferred/queued depending on race windows. | cfg, asyncio | tests/test_reliability_failures.py:125 |
| test_status_fencing_ignores_stale_epoch | Status fencing must ignore events from a stale attempt_epoch. | asyncio | tests/test_reliability_failures.py:169 |
| test_transient_failure_deferred_then_retry | A transient error should defer the node and succeed on retry according to retry_policy (max>=2). | asyncio | tests/test_reliability_failures.py:83 |
Details¶
test_coordinator_restart_adopts_inflight_without_new_epoch¶
marks: asyncio • location: tests/test_reliability_failures.py:218
When the coordinator restarts, it should adopt in-flight work without incrementing the worker's attempt_epoch unnecessarily (i.e., source keeps epoch=1).
test_explicit_cascade_cancel_moves_node_to_deferred¶
marks: cfg, asyncio • location: tests/test_reliability_failures.py:273
Explicit cascade cancel should move a running node to a cancelling/deferred/queued state within the configured cancel_grace window.
test_heartbeat_updates_lease_deadline¶
marks: cfg, asyncio • location: tests/test_reliability_failures.py:322
Heartbeats from a worker must extend the lease deadline in the task document.
test_idempotent_metrics_on_duplicate_events¶
marks: asyncio • location: tests/test_reliability_failures.py:36
Duplicated STATUS events (BATCH_OK/TASK_DONE) must not double-count metrics. We duplicate STATUS envelopes at the Kafka level; aggregator must remain stable.
test_permanent_fail_cascades_cancel_and_task_failed¶
marks: cfg, asyncio • location: tests/test_reliability_failures.py:125
Permanent failure in an upstream node should cause the task to fail, while dependents get cancelled/deferred/queued depending on race windows.
test_status_fencing_ignores_stale_epoch¶
marks: asyncio • location: tests/test_reliability_failures.py:169
Status fencing must ignore events from a stale attempt_epoch. We finish the task, then send a forged event with attempt_epoch=0; stats must remain unchanged.
test_transient_failure_deferred_then_retry¶
marks: asyncio • location: tests/test_reliability_failures.py:83
A transient error should defer the node and succeed on retry according to retry_policy (max>=2).
tests/test_scheduler_fanin_routing.py¶
Fan-in behavior (ANY/ALL/COUNT:N) and coordinator_fn merge smoke tests.
Covers: - ANY fan-in: downstream starts as soon as any parent streams a first batch. - ALL fan-in: downstream starts only after all parents are done (no start_when). - COUNT:N fan-in: xfail placeholder until the coordinator supports it. - Edges vs routing priority: xfail placeholder for precedence logic. - coordinator_fn merge: runs without a worker and feeds downstream via artifacts.
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_coordinator_fn_merge_without_worker | coordinator_fn node should run without a worker and produce artifacts that a downstream analyzer can consume via pull.from_artifacts. | asyncio | tests/test_scheduler_fanin_routing.py:294 |
| test_edges_vs_routing_priority | If explicit graph edges are present and a node also has routing.on_success, edges should take precedence (routing target should not run). | asyncio, xfail | tests/test_scheduler_fanin_routing.py:230 |
| test_fanin_all_waits_all_parents | Fan-in ALL: without start_when hint, downstream should only start after both parents are finished. | asyncio | tests/test_scheduler_fanin_routing.py:127 |
| test_fanin_any_starts_early | Fan-in ANY: downstream should start as soon as at least one parent streams (start_when='first_batch'), even if other parents are not yet finished. | asyncio | tests/test_scheduler_fanin_routing.py:69 |
| test_fanin_count_n | Fan-in COUNT:N placeholder: downstream should start when at least N parents are ready. | asyncio, xfail | tests/test_scheduler_fanin_routing.py:176 |
| test_fanout_one_upstream_two_downstreams_mixed_start_when | One upstream → two downstreams: A has start_when=first_batch (starts early), B has no start_when (waits for completion). | asyncio | tests/test_scheduler_fanin_routing.py:431 |
| test_routing_on_failure_triggers_remediator_only | On upstream TASK_FAILED(permanent=True), only the 'on_failure' remediator should run; 'on_success' must not. | asyncio, xfail | tests/test_scheduler_fanin_routing.py:358 |
Details¶
test_coordinator_fn_merge_without_worker¶
marks: asyncio • location: tests/test_scheduler_fanin_routing.py:294
coordinator_fn node should run without a worker and produce artifacts that a downstream analyzer can consume via pull.from_artifacts.
test_edges_vs_routing_priority¶
marks: asyncio, xfail • location: tests/test_scheduler_fanin_routing.py:230
If explicit graph edges are present and a node also has routing.on_success, edges should take precedence (routing target should not run).
test_fanin_all_waits_all_parents¶
marks: asyncio • location: tests/test_scheduler_fanin_routing.py:127
Fan-in ALL: without start_when hint, downstream should only start after both parents are finished.
test_fanin_any_starts_early¶
marks: asyncio • location: tests/test_scheduler_fanin_routing.py:69
Fan-in ANY: downstream should start as soon as at least one parent streams (start_when='first_batch'), even if other parents are not yet finished. We run a single 'indexer' worker so upstream parents execute sequentially.
test_fanin_count_n¶
marks: asyncio, xfail • location: tests/test_scheduler_fanin_routing.py:176
Fan-in COUNT:N placeholder: downstream should start when at least N parents are ready. Marked xfail until coordinator supports 'count:n'.
test_fanout_one_upstream_two_downstreams_mixed_start_when¶
marks: asyncio • location: tests/test_scheduler_fanin_routing.py:431
One upstream → two downstreams: A has start_when=first_batch (starts early), B has no start_when (waits for completion).
test_routing_on_failure_triggers_remediator_only¶
marks: asyncio, xfail • location: tests/test_scheduler_fanin_routing.py:358
On upstream TASK_FAILED(permanent=True), only the 'on_failure' remediator should run; 'on_success' must not.
tests/unit/vars/test_detectors.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_detector_function_with_merge | DI: callable detector. | asyncio | tests/unit/vars/test_detectors.py:38 |
| test_detector_object_block_and_soft | DI: object detector. | asyncio | tests/unit/vars/test_detectors.py:18 |
| test_detector_string_import | DI: string import "module:factory". | asyncio | tests/unit/vars/test_detectors.py:63 |
Details¶
test_detector_function_with_merge¶
marks: asyncio • location: tests/unit/vars/test_detectors.py:38
DI: callable detector. Block big binary payload when block_sensitive=True, otherwise log hit.
test_detector_object_block_and_soft¶
marks: asyncio • location: tests/unit/vars/test_detectors.py:18
DI: object detector. With block_sensitive=True — raise; without — write and count hit.
test_detector_string_import¶
marks: asyncio • location: tests/unit/vars/test_detectors.py:63
DI: string import "module:factory". Factory returns an object with is_sensitive().
tests/unit/vars/test_incr.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_vars_incr_concurrency_sum | Concurrency: 10x1 + 5x2 → exact sum. | asyncio | tests/unit/vars/test_incr.py:27 |
| test_vars_incr_creates_missing_leaf | Creating a missing leaf should create it and increment. | asyncio | tests/unit/vars/test_incr.py:17 |
| test_vars_incr_float_and_negative_and_type_conflict | Mixed numeric types are allowed; negative increments also work. | asyncio | tests/unit/vars/test_incr.py:61 |
| test_vars_incr_rejects_nan_inf | NaN / ±Inf must be rejected. | asyncio, parametrize | tests/unit/vars/test_incr.py:42 |
| test_vars_incr_validates_key | Key validation (segments). | asyncio, parametrize | tests/unit/vars/test_incr.py:52 |
Details¶
test_vars_incr_concurrency_sum¶
marks: asyncio • location: tests/unit/vars/test_incr.py:27
Concurrency: 10x1 + 5x2 → exact sum.
test_vars_incr_creates_missing_leaf¶
marks: asyncio • location: tests/unit/vars/test_incr.py:17
Creating a missing leaf should create it and increment.
test_vars_incr_float_and_negative_and_type_conflict¶
marks: asyncio • location: tests/unit/vars/test_incr.py:61
Mixed numeric types are allowed; negative increments also work. If a non-numeric value already exists at the leaf, the adapter should raise.
test_vars_incr_rejects_nan_inf¶
marks: asyncio, parametrize • location: tests/unit/vars/test_incr.py:42
NaN / ±Inf must be rejected.
test_vars_incr_validates_key¶
marks: asyncio, parametrize • location: tests/unit/vars/test_incr.py:52
Key validation (segments).
tests/unit/vars/test_merge.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_vars_merge_block_sensitive_with_external_detector | External detector: block when block_sensitive=True, otherwise write but count sensitive_hits in logs. | asyncio | tests/unit/vars/test_merge.py:101 |
| test_vars_merge_deep_overwrite_mixed_types | Deep-merge with leaf overwrite; mixed types (dict → leaf). | asyncio | tests/unit/vars/test_merge.py:15 |
| test_vars_merge_limits_and_value_size_str_and_bytes | Limits: number of paths, depth, full path length, and value size (str/bytes). | asyncio | tests/unit/vars/test_merge.py:64 |
| test_vars_merge_noop_empty | Empty input → no-op (touched=0), 'vars' field is not created. | asyncio | tests/unit/vars/test_merge.py:52 |
Details¶
test_vars_merge_block_sensitive_with_external_detector¶
marks: asyncio • location: tests/unit/vars/test_merge.py:101
External detector: block when block_sensitive=True, otherwise write but count sensitive_hits in logs.
test_vars_merge_deep_overwrite_mixed_types¶
marks: asyncio • location: tests/unit/vars/test_merge.py:15
Deep-merge with leaf overwrite; mixed types (dict → leaf).
test_vars_merge_limits_and_value_size_str_and_bytes¶
marks: asyncio • location: tests/unit/vars/test_merge.py:64
Limits: number of paths, depth, full path length, and value size (str/bytes).
test_vars_merge_noop_empty¶
marks: asyncio • location: tests/unit/vars/test_merge.py:52
Empty input → no-op (touched=0), 'vars' field is not created.
tests/unit/vars/test_set.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_vars_set_block_sensitive_with_external_detector | With block_sensitive=True the detector blocks. | asyncio | tests/unit/vars/test_set.py:138 |
| test_vars_set_deterministic_keys_order_and_sensitive_hits | Deterministic sort order in logs + sensitive_hits counter. | asyncio | tests/unit/vars/test_set.py:118 |
| test_vars_set_dotted_and_nested_combo_and_logging | Dotted + nested forms in one call; verify values and structured logs. | asyncio | tests/unit/vars/test_set.py:15 |
| test_vars_set_limits_depth_and_path_len | Limits: key depth and full path length. | asyncio | tests/unit/vars/test_set.py:84 |
| test_vars_set_limits_paths_count | Limit on number of paths in a single $set. | asyncio | tests/unit/vars/test_set.py:72 |
| test_vars_set_limits_value_size_str_and_bytes | Limit on value size (string/bytes). | asyncio | tests/unit/vars/test_set.py:102 |
| test_vars_set_validates_key_segments | Validate segments: $, '.', NUL, segment length. | asyncio, parametrize | tests/unit/vars/test_set.py:63 |
Details¶
test_vars_set_block_sensitive_with_external_detector¶
marks: asyncio • location: tests/unit/vars/test_set.py:138
With block_sensitive=True the detector blocks. Without it, write succeeds and logs record sensitive_hits=1.
test_vars_set_deterministic_keys_order_and_sensitive_hits¶
marks: asyncio • location: tests/unit/vars/test_set.py:118
Deterministic sort order in logs + sensitive_hits counter.
test_vars_set_dotted_and_nested_combo_and_logging¶
marks: asyncio • location: tests/unit/vars/test_set.py:15
Dotted + nested forms in one call; verify values and structured logs.
test_vars_set_limits_depth_and_path_len¶
marks: asyncio • location: tests/unit/vars/test_set.py:84
Limits: key depth and full path length.
test_vars_set_limits_paths_count¶
marks: asyncio • location: tests/unit/vars/test_set.py:72
Limit on number of paths in a single $set.
test_vars_set_limits_value_size_str_and_bytes¶
marks: asyncio • location: tests/unit/vars/test_set.py:102
Limit on value size (string/bytes).
test_vars_set_validates_key_segments¶
marks: asyncio, parametrize • location: tests/unit/vars/test_set.py:63
Validate segments: $, '.', NUL, segment length.
tests/unit/vars/test_unset.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_vars_unset_accepts_args_and_kwargs | The adapter should accept both *args (varargs keys) and kwargs form 'keys=[...]'. | asyncio | tests/unit/vars/test_unset.py:76 |
| test_vars_unset_path_length_limit | Validate full path length limit (including prefix 'coordinator.vars.'). | asyncio | tests/unit/vars/test_unset.py:64 |
| test_vars_unset_simple_nested_and_noop | Remove a simple leaf and a nested branch; missing key is a no-op (operation remains valid and accounted in 'touched'). | asyncio | tests/unit/vars/test_unset.py:14 |
| test_vars_unset_validates_paths | Validate key/path formats: $, '.', NUL, segment length. | asyncio, parametrize | tests/unit/vars/test_unset.py:55 |
Details¶
test_vars_unset_accepts_args_and_kwargs¶
marks: asyncio • location: tests/unit/vars/test_unset.py:76
The adapter should accept both *args (varargs keys) and kwargs form 'keys=[...]'.
test_vars_unset_path_length_limit¶
marks: asyncio • location: tests/unit/vars/test_unset.py:64
Validate full path length limit (including prefix 'coordinator.vars.').
test_vars_unset_simple_nested_and_noop¶
marks: asyncio • location: tests/unit/vars/test_unset.py:14
Remove a simple leaf and a nested branch; missing key is a no-op (operation remains valid and accounted in 'touched').
test_vars_unset_validates_paths¶
marks: asyncio, parametrize • location: tests/unit/vars/test_unset.py:55
Validate key/path formats: $, '.', NUL, segment length.
tests/unit/worker/test_error_classification.py¶
Unit test: default RoleHandler.classify_error marks config/programming errors as permanent. We don't pin the exact reason string, only the permanence contract.
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_programming_errors_are_permanent | (no docstring) | — | tests/unit/worker/test_error_classification.py:15 |
Details¶
test_programming_errors_are_permanent¶
location: tests/unit/worker/test_error_classification.py:15
No docstring.
tests/unit/worker/test_error_policy.py¶
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_classify_error_non_permanent_for_other_exceptions | (no docstring) | — | tests/unit/worker/test_error_policy.py:18 |
| test_classify_error_permanent_for_common_programming_faults | (no docstring) | — | tests/unit/worker/test_error_policy.py:10 |
Details¶
test_classify_error_non_permanent_for_other_exceptions¶
location: tests/unit/worker/test_error_policy.py:18
No docstring.
test_classify_error_permanent_for_common_programming_faults¶
location: tests/unit/worker/test_error_policy.py:10
No docstring.
tests/unit/worker/test_pull_adapters_rechunk.py¶
Unit tests for PullAdapters.iter_from_artifacts_rechunk() selection rules.
Contract: - If meta_list_key is provided and points to a list → chunk that list. - Otherwise → treat the entire meta as a single item (wrap into a one-element list), no heuristics.
Notes:
* We keep these tests unit-level (no Kafka/Coordinator). To let the adapter
terminate its polling loop, we set eof_on_task_done=True and append a
synthetic {status: "complete"} marker for the source node. Our stub
count_documents recognizes such queries.
Tests¶
| Test | Summary | Marks | Location |
|---|---|---|---|
| test_rechunk_with_meta_list_key_splits_that_list | (no docstring) | asyncio | tests/unit/worker/test_pull_adapters_rechunk.py:70 |
| test_rechunk_without_meta_list_key_wraps_meta_as_single_item | (no docstring) | asyncio | tests/unit/worker/test_pull_adapters_rechunk.py:113 |
Details¶
test_rechunk_with_meta_list_key_splits_that_list¶
marks: asyncio • location: tests/unit/worker/test_pull_adapters_rechunk.py:70
No docstring.
test_rechunk_without_meta_list_key_wraps_meta_as_single_item¶
marks: asyncio • location: tests/unit/worker/test_pull_adapters_rechunk.py:113
No docstring.