src.utils.curve.simple_coverage_refiner
Classes
- class src.utils.curve.simple_coverage_refiner.SimpleCoverageRefiner(merge_radius=None, z_tolerance=2.0, min_voxel_size=0.1, min_curve_length=None, min_segment_length=0.5, chain_radius_factor=1.0, nthreads=-1, snap_enable=False, snap_max_shift=None, snap_min_neighbors=1, snap_smoothing_iterations=2, snap_target_radius=None, snap_pull_radius=None, snap_densify_step=0.5, snap_truncate_iterations=2, hallucination_drop_enable=False, hallucination_drop_radius=None, hallucination_drop_threshold=0.5, hallucination_drop_densify_step=0.5, hallucination_min_supported_length=None, endpoint_extension_enable=False, endpoint_extension_radius=None, endpoint_extension_step=0.5, endpoint_extension_max_length=None, coord_center=None, x_orig_f64=None, kdtree_provider=None, sce=None)
- Author:
Alberto M. Esmoris Pena
Coverage-refinement and post-optimisation cleanup helper for the curves produced by SimpleCurveExtractor. Wraps the H30/H31 snap-to-input shifts, the H32 hallucination-drop splitting, the H36 endpoint-extension walks, the iterative coverage-refinement extractor, and the fixed cleanup cascade that runs at the end of the SCE global-optimisation phase.
Instances are constructed once per consuming SCE pipeline phase from the private orchestrator helpers _step_coverage_refinement and _step_global_optimization (with a third single-use construction inside _concatenate_orphan_into_neighbour for the static densify_polyline_3d_arc helper). Each instance takes a snapshot of the relevant SCE attributes (24 fields) plus a callable hook to the SCE input-cloud KDTree cache and a back-reference to the SCE instance for cross-class invocations (skeleton extraction, topological pruning, geometry refinement, chain-merge, post-opt relabel / gap-split / orphan-merge – all of which remain on SCE for future refactors).
The three driver-style entry points are:
refine_coverage() – iteratively detects uncovered regions of the input curve cloud and extracts supplementary curves through the SCE sub-pipeline.
run_post_opt_cleanup() – executes the fixed H32..H37 cascade at the end of _step_global_optimization.
snap_polylines_to_input() (and the H32 drop_hallucinated_features / H36 extend_polyline_endpoints passes invoked by the cleanup driver and the orchestrator’s snap loops).
- Variables:
merge_radius (float or None) – Endpoint-merge radius forwarded to refine_coverage() and the post-opt cascade.
z_tolerance (float) – Z-tolerance threshold used by the post-opt Z-jump split (delegated to SimplePolylineSanitizer).
min_voxel_size (float) – Per-instance scale used to derive radii on point clouds at any resolution.
min_curve_length (float or None) – Lower bound for retained full-curve length, consumed by refine_coverage().
min_segment_length (float) – Lower bound for kept sub-curves emitted by the supplementary extractor inside refine_coverage().
chain_radius_factor (float) – Multiplier applied to skel_vs to derive the per-cluster chain radius inside refine_coverage().
nthreads (int) – Outer-OMP thread budget for the batched stabilize call (forwarded by SCE when invoking refine_coverage()).
snap_enable (bool) – H30 master switch for snap_polylines_to_input().
snap_max_shift (float) – 3D shift cap (m) per interior vertex.
snap_min_neighbors (int) – Minimum number of contributing missed points required to commit a per-vertex shift.
snap_smoothing_iterations (int) – Number of 3-point moving-average smoothing passes applied to the per-vertex shift sequence.
snap_target_radius (float) – Coverage radius used to flag MISSED input points (3D Euclidean).
snap_pull_radius (float) – Maximum pull distance from a missed input point to its candidate interior vertex (3D Euclidean).
snap_densify_step (float) – Densify step for the per-polyline 3D arc-length sampling driving the MISSED-input classification.
snap_truncate_iterations (int) – Number of snap-then-truncate cycles inside run_post_opt_cleanup().
hallucination_drop_enable (bool) – H32 master switch.
hallucination_drop_radius (float) – Support radius used by drop_hallucinated_features().
hallucination_drop_threshold (float) – Length-weighted per-feature hallucinated fraction above which the feature is split.
hallucination_drop_densify_step (float) – 3D arc-length step for the support sampling.
hallucination_min_supported_length (float) – Lower bound on supported sub-polyline 3D length; below this the sub-polyline is dropped.
endpoint_extension_enable (bool) – H36 master switch.
endpoint_extension_radius (float) – Support radius used by the per-step input-neighbour test.
endpoint_extension_step (float) – Walk step length (m).
endpoint_extension_max_length (float) – Hard cap on per-endpoint extension length (m).
_coord_center (np.ndarray or None) – Coordinate centring offset forwarded by SCE.
_X_orig_f64 (np.ndarray or None) – Float64 reservoir of the input curve points in the centred frame.
_kdtree_provider (callable or None) – Callable returning the lazily-built input-cloud KDTree (None triggers a per-call rebuild fallback).
_sce (SimpleCurveExtractor) – Back-reference to the SimpleCurveExtractor instance for cross-class invocations of methods that remain on SCE.
- __init__(merge_radius=None, z_tolerance=2.0, min_voxel_size=0.1, min_curve_length=None, min_segment_length=0.5, chain_radius_factor=1.0, nthreads=-1, snap_enable=False, snap_max_shift=None, snap_min_neighbors=1, snap_smoothing_iterations=2, snap_target_radius=None, snap_pull_radius=None, snap_densify_step=0.5, snap_truncate_iterations=2, hallucination_drop_enable=False, hallucination_drop_radius=None, hallucination_drop_threshold=0.5, hallucination_drop_densify_step=0.5, hallucination_min_supported_length=None, endpoint_extension_enable=False, endpoint_extension_radius=None, endpoint_extension_step=0.5, endpoint_extension_max_length=None, coord_center=None, x_orig_f64=None, kdtree_provider=None, sce=None)
Snapshot the SCE configuration relevant to the coverage-refinement and post-optimisation cleanup cascade.
- refine_coverage(smooth_curves, metadata, X_orig, skel_vs, spacing, zc_sigma)
Identify uncovered regions of the input curve point cloud and extract supplementary curves.
For each spatial cluster of input curve points that are far from any existing output curve, the full C++ sub-pipeline (stabilize, skeleton, prune, refine) is run to produce additional curves. This handles quarry path edges where only one side was captured by the initial extraction.
Uses only existing parameters: no new ones.
- Parameters:
smooth_curves (list of dict) – Current output curves.
metadata (list of dict) – Per-curve metadata.
X_orig (np.ndarray) – Centered input curve points.
skel_vs (float) – Skeleton voxel size.
spacing (float) – PCHIP resampling spacing.
zc_sigma (float) – Z-consistency sigma.
- Returns:
(smooth_curves, metadata) with supplementary curves appended.
- Return type:
tuple
- run_post_opt_cleanup(smooth_curves, metadata, X_orig)
Run the H32..H37 post-optimization cleanup cascade, called at the end of the SCE _step_global_optimization phase. The order is fixed and reproduces the original inline block verbatim:
H32 (a) – drop hallucinated features.
H36 – snap/truncate iteration loop: snap_truncate_iterations cycles of snap-to-input followed by truncate.
H36 – extend polyline endpoints through any remaining supported input neighbourhood.
H36 – post-extension truncation iterated to a fixed point via until_stable_count() (cap from the module-level _ENDPOINT_TRUNCATE_FIXED_POINT constant).
H32 (b) – relabel Z-disjoint CIDs (delegated to SimpleCurveExtractor).
H36 final – split endpoint-pair gaps (delegated to SimpleCurveExtractor).
H37 – merge orphan fragments (delegated to SimpleCurveExtractor).
Extracted in iter4 (Phase 6, L-16). Not intended for use outside the SCE _step_global_optimization phase.
- Parameters:
smooth_curves (list of dict) – Polylines (mutated by the inner cleanup chain).
metadata (list of dict) – Per-feature metadata dicts (mutated by the inner cleanup chain).
X_orig (np.ndarray) – Centred curve-class input points (N x 3); reused for snap, truncate, and endpoint-extension passes.
- Returns:
(smooth_curves, metadata).
- Return type:
tuple
- snap_polylines_to_input(smooth_curves, metadata, X_orig)
H30 – shift interior polyline vertices toward the input curve-class points that are currently MISSED by every polyline (beyond snap_target_radius from any polyline).
Each missed point pulls its nearest interior vertex (within snap_pull_radius); per-vertex shifts are the average of those pull vectors, capped at snap_max_shift, smoothed by a 3-point moving average snap_smoothing_iterations times along the polyline, and reverted whenever the SI guard detects a new intra-polyline crossing. Endpoints are never shifted – their positions are load-bearing for the same-CID merge cascade and for T-junction classification upstream of the metrics pipeline.
Motivation: ~65 % of missed curve-class points at the 4 m threshold are “INTERIOR-adjacent”: a polyline passes through their neighbourhood but sits 4-8 m away. An early iteration of this pass targeted the full-input centroid, which lowered coverage by 0.51 pp because it pushed already-covered points on the other side of the polyline outside the 4 m window while only marginally reducing the distance to missed points on the near side. Targeting only MISSED points makes the shift monotone w.r.t. coverage: if the cap and guards are satisfied, the shift strictly improves coverage at the moved vertex (up to the ~0.2 % of shifts the SI guard reverts).
Pass ordering: after the Stage 7j merge cascade, before the H25 sanitize. The local SI guard (per-polyline, first-hit C++ check) reverts shifts that introduce intra-polyline crossings; the sanitizer handles any residual cases. Same-CID inter-feature crossings are not guarded here – the Stage 7g/quater resolve pass already addressed those earlier and the H25 baseline has SI = 0.
Hyperparameter sweep result on the production cloud (inicorta.laz, 3 m / 2 m split metric radii at H31): baseline cov 95.24 % at 4 m / 4 m metric -> cov 99.26 % at 3 m, dev 0.74 % at 2 m, SI 0 (preserved), gaps 1 (preserved). The current defaults at the production fixture min_voxel_size = 0.1 (snap_max_shift = 3.5, snap_passes = 30, snap_post_passes = 20, snap_pull_radius = 40.0, snap_target_radius = 3.0) sit at or just beyond the saturation knee of the H31 sweep. Phase-9 SR-1 rewired snap_max_shift, snap_pull_radius and snap_target_radius to derive from min_voxel_size (coefficients \(35\), \(400\), \(30\) respectively) so the values above are reproduced automatically on the fixture and scale with the input voxel resolution on other clouds.
- Parameters:
smooth_curves (list of dict) – Current polylines.
metadata (list of dict) – Per-polyline metadata.
X_orig (np.ndarray) – Centered curve-class input points (N x 3 in the same coordinate frame as sc['points']).
- Returns:
Snapped (smooth_curves, metadata).
- Return type:
tuple
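The per-vertex shift post-processing and the min_voxel_size scaling described above can be sketched as follows. This is a minimal illustration, not the class's implementation: the helper names (derive_snap_radii, cap_shift, smooth_shifts) are hypothetical, and it assumes that endpoint shifts stay pinned at zero during smoothing and that the cap is applied to the 3D norm of each shift.

```python
import math

# Coefficients quoted in the text for the Phase-9 SR-1 rewiring.
SNAP_MAX_SHIFT_COEF = 35.0
SNAP_PULL_RADIUS_COEF = 400.0
SNAP_TARGET_RADIUS_COEF = 30.0


def derive_snap_radii(min_voxel_size):
    """Hypothetical helper: scale the three snap radii from min_voxel_size."""
    return {
        "snap_max_shift": SNAP_MAX_SHIFT_COEF * min_voxel_size,
        "snap_pull_radius": SNAP_PULL_RADIUS_COEF * min_voxel_size,
        "snap_target_radius": SNAP_TARGET_RADIUS_COEF * min_voxel_size,
    }


def cap_shift(shift, max_shift):
    """Rescale a 3D shift so that its Euclidean norm does not exceed max_shift."""
    norm = math.sqrt(sum(c * c for c in shift))
    if norm <= max_shift or norm == 0.0:
        return tuple(shift)
    scale = max_shift / norm
    return tuple(c * scale for c in shift)


def smooth_shifts(shifts, iterations):
    """3-point moving average over interior shifts (assumption: endpoints untouched)."""
    cur = [tuple(s) for s in shifts]
    n = len(cur)
    for _ in range(iterations):
        nxt = list(cur)
        for i in range(1, n - 1):
            nxt[i] = tuple(
                (cur[i - 1][k] + cur[i][k] + cur[i + 1][k]) / 3.0 for k in range(3)
            )
        cur = nxt
    return cur
```

With min_voxel_size = 0.1 the first helper reproduces the fixture values quoted above (3.5, 40.0 and 3.0, up to floating-point rounding).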
- drop_hallucinated_features(smooth_curves, metadata, X_orig)
H32 – split features whose hallucinated polyline length exceeds hallucination_drop_threshold of their total length into supported sub-polylines.
A densified vertex (3D linear interpolation along the polyline at hallucination_drop_densify_step spacing) is hallucinated when no input curve-class point lies within hallucination_drop_radius. The per-feature score is the length-weighted ratio (segment with both endpoints hallucinated counts fully; one endpoint hallucinated counts for half), matching the validation metric.
Features at or below the threshold are kept verbatim. Features above the threshold are split: each maximal contiguous run of supported densified vertices becomes a new feature, the first such run inheriting the original CURVE_ID and the rest receiving fresh CURVE_ID values so no phantom same-CID gap pair is created. Sub-polylines whose 2D length is below hallucination_min_supported_length are dropped. If every supported run on a feature is too short, the whole feature is dropped (legacy behaviour from the H32 first iteration).
- Parameters:
smooth_curves (list of dict) – Current polylines.
metadata (list of dict) – Per-polyline metadata.
X_orig (np.ndarray) – Centered curve-class input points (N x 3 in the same coordinate frame as sc['points']).
- Returns:
(smooth_curves, metadata) with hallucinated material removed.
- Return type:
tuple
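The length-weighted score and the split into supported runs can be illustrated with the sketch below. It is a hedged reading of the description above, not the class's code: the function names are hypothetical, and the per-vertex support flags are taken as a precomputed list of booleans rather than derived from a radius query on the input cloud.

```python
import math


def hallucinated_fraction(points, supported):
    """Length-weighted hallucinated fraction of a densified polyline.

    A segment counts fully when both endpoints are unsupported and for half
    its length when exactly one endpoint is unsupported.
    """
    total = 0.0
    hallucinated = 0.0
    for i in range(len(points) - 1):
        seg_len = math.dist(points[i], points[i + 1])  # 3D segment length
        n_bad = (not supported[i]) + (not supported[i + 1])
        total += seg_len
        hallucinated += seg_len * n_bad / 2.0
    return hallucinated / total if total > 0.0 else 0.0


def supported_runs(supported):
    """Return (start, stop) pairs of maximal supported runs, stop exclusive."""
    runs = []
    start = None
    for i, ok in enumerate(supported):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(supported)))
    return runs
```

A feature whose fraction exceeds hallucination_drop_threshold would then be cut at the run boundaries, with the first run keeping the original CURVE_ID as described above.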
- extend_polyline_endpoints(smooth_curves, metadata, X_orig)
H36 – extend each polyline endpoint outward along its tangent through supported regions.
The snap pass deliberately never moves endpoints because they are load-bearing for the H3 merge cascade and the T-junction classification. After the snap-truncate loop saturates, however, a thin layer of input curve-class points remains MISSED just past the polyline endpoints. This pass walks each endpoint outward in steps of endpoint_extension_step while:
the next step still has at least one input curve-class point within endpoint_extension_radius (support guard), AND
the new segment does not cross any segment of any other polyline already in the output (crossing guard).
The walk stops at the first violation or after endpoint_extension_max_length metres. By construction the extension can introduce neither hallucinated material (support guard) nor new self-/inter-feature crossings (crossing guard).
- Parameters:
smooth_curves (list of dict) – Current polylines.
metadata (list of dict) – Per-polyline metadata.
X_orig (np.ndarray) – Centered curve-class input points (N x 3 in the same coordinate frame as sc['points']).
- Returns:
(smooth_curves, metadata) with extensions appended/prepended.
- Return type:
tuple
- static densify_polyline_3d_arc(pts, step)
Densify an N x 3 polyline with linear 3D interpolation, spacing measured along the 3D arc. This matches the validation metric’s 3D densification: a steep z-jump segment that is short in XY but long in 3D gets enough intermediate samples for the per-vertex support test to flag the runaway, while a gently-sloping segment gets the same sampling rate as a flat one.
The arithmetic runs in float64 even when the input is float32 (the SCE default when structure_space_bits = 32); float32 input densification at the segment-length boundary differs from the metric’s float64 densification by up to ~1 ULP, which is enough to flip the hallucinated/supported classification on a borderline vertex (H36 diagnostic).
- Parameters:
pts (np.ndarray) – (N, 3) polyline.
step (float) – 3D arc-length spacing (m).
- Returns:
Densified (M, 3) polyline.
- Return type:
np.ndarray
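A minimal sketch of 3D arc-length densification is given below. It assumes a per-segment subdivision that keeps every original vertex and inserts evenly spaced samples no further apart than step; the actual sampling rule of densify_polyline_3d_arc may differ, and the name densify_3d_arc_sketch is hypothetical. The cast to float64 mirrors the note above.

```python
import numpy as np


def densify_3d_arc_sketch(pts, step):
    """Linearly densify an (N, 3) polyline with spacing measured in 3D."""
    # Promote to float64 up-front so float32 inputs do not perturb the samples.
    pts = np.asarray(pts, dtype=np.float64)
    if len(pts) < 2:
        return pts.copy()
    out = [pts[0]]
    for a, b in zip(pts[:-1], pts[1:]):
        seg_len = float(np.linalg.norm(b - a))  # 3D length, not XY length
        n = max(int(np.ceil(seg_len / step)), 1)
        for k in range(1, n + 1):
            out.append(a + (b - a) * (k / n))
    return np.vstack(out)
```

A segment from (0, 0, 0) to (3, 0, 4) is 3 m long in XY but 5 m long in 3D, so a 1 m step yields five sub-segments rather than three.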
- static walk_endpoint(endpoint, neighbour, step, max_len, input_tree, radius, feat_idx, crosses_any, register_segment)
Walk outward from endpoint along the tangent away from neighbour while each step has input support within radius AND does not cross any other polyline segment already registered in crosses_any.
After each accepted step the new segment is immediately registered via register_segment so that subsequent steps on the same walk can detect crossings with the in-flight extension.
- Parameters:
endpoint (np.ndarray) – Tip of the polyline (3D).
neighbour (np.ndarray) – Penultimate vertex of the polyline (3D); together with endpoint it defines the outgoing tangent.
step (float) – Step length (m).
max_len (float) – Maximum walk length (m).
input_tree – KDTree over input curve-class XY coordinates.
radius (float) – Support neighbourhood radius (m).
feat_idx (int) – Index of the polyline being extended; passed to crosses_any so its own original segments are excluded from the crossing test.
crosses_any – Callable (a, b, fi) -> bool returning True when segment a-b crosses any registered segment not owned by feature fi.
register_segment – Callable (fi, a, b) that adds segment a-b to the spatial index of feature fi so subsequent crossing checks see it.
- Returns:
List of (x, y, z) extension points in walk order (first new vertex first).
- Return type:
list of np.ndarray
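The guarded walk can be sketched as below. This is an illustration under stated assumptions rather than the static method itself: the KDTree radius query is replaced by a hypothetical has_support(point) callable, points are plain tuples, and the small tolerance on max_len is an arbitrary choice for the example.

```python
import math


def walk_endpoint_sketch(endpoint, neighbour, step, max_len, has_support,
                         feat_idx, crosses_any, register_segment):
    """Walk from endpoint along the neighbour -> endpoint tangent."""
    direction = [e - n for e, n in zip(endpoint, neighbour)]
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0.0 or step <= 0.0:
        return []  # no tangent or no progress possible
    direction = [c / norm for c in direction]

    added = []
    cur = tuple(endpoint)
    walked = 0.0
    while walked + step <= max_len + 1e-12:
        nxt = tuple(c + step * u for c, u in zip(cur, direction))
        if not has_support(nxt):             # support guard
            break
        if crosses_any(cur, nxt, feat_idx):  # crossing guard
            break
        # Register immediately so later steps see the in-flight extension.
        register_segment(feat_idx, cur, nxt)
        added.append(nxt)
        cur = nxt
        walked += step
    return added
```

The returned list is in walk order, so a caller extending the tail of a polyline can append it directly, while a caller extending the head would reverse it before prepending.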
- static polyline_si_cell_size(pts)
Compute the SI-grid cell size for a single polyline using the same max(5 * median(seg_len), 1e-3) rule that polyline_has_si() (and the legacy per-polyline scan) uses. Lifted out of polyline_has_si() so the batched-SI fast path in snap_polylines_to_input() can compute all cell sizes up-front and feed a single pyvl3dpp.curve_self_intersection_batch_d call.
The arithmetic preserves the input dtype so the 5*median(seg_len) value stays bit-identical with the per-polyline call (a float32 input produces a cell size computed from float32 medians; a float64 input from float64 medians).
- Parameters:
pts (np.ndarray) – Polyline points (N x ?, only the XY columns are used).
- Returns:
Cell size as a float; falls back to 1e-3 when the polyline has no segments.
- Return type:
float
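The cell-size rule can be written in a few lines of NumPy. The sketch below is an assumption about how the rule might be coded (the name si_cell_size_sketch is hypothetical); it deliberately avoids casting the input so the median is taken in the input dtype, as the note above requires.

```python
import numpy as np


def si_cell_size_sketch(pts):
    """max(5 * median(XY segment length), 1e-3), keeping the input dtype."""
    pts = np.asarray(pts)
    if len(pts) < 2:
        return 1e-3  # no segments: fall back to the floor value
    seg = np.diff(pts[:, :2], axis=0)          # XY columns only
    seg_len = np.sqrt((seg * seg).sum(axis=1))
    return float(max(5 * np.median(seg_len), 1e-3))
```

For a polyline with XY segment lengths 1, 2 and 3 the median is 2, so the cell size is 10 regardless of the Z values.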
- static polyline_has_si(pts)
Return True when pts has at least one 2D self-crossing. Uses the same C++ first-hit scan as remove_self_ix on the SimplePolylineSanitizer collaborator, but stops at the boolean answer.
- Parameters:
pts (np.ndarray) – Polyline points (N x 3).
- Returns:
True if any crossing is found.
- Return type:
bool
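For intuition only, a brute-force 2D self-crossing test is sketched below. It is not the C++ first-hit scan and does not use the SI grid: it is a plain O(n²) orientation test over non-adjacent segment pairs, it counts only proper crossings (touching or collinear overlaps are ignored), and all names are hypothetical.

```python
def _orient(p, q, r):
    """Signed area of triangle p, q, r in the XY plane."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def _proper_cross(a, b, c, d):
    """True when segments a-b and c-d cross at a single interior point."""
    o1, o2 = _orient(a, b, c), _orient(a, b, d)
    o3, o4 = _orient(c, d, a), _orient(c, d, b)
    return o1 * o2 < 0 and o3 * o4 < 0


def has_self_crossing_2d(pts):
    """Return True at the first pair of non-adjacent segments that cross."""
    n_seg = len(pts) - 1
    for i in range(n_seg):
        for j in range(i + 2, n_seg):  # skip the adjacent segment
            if _proper_cross(pts[i], pts[i + 1], pts[j], pts[j + 1]):
                return True
    return False
```

Only the first two coordinates of each point are read, so Z differences never prevent a crossing from being reported.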
- static until_stable_count(fn, smooth_curves, metadata, max_iter)
Iterate fn(smooth_curves, metadata) up to max_iter times, breaking when the curve count is unchanged across a single iteration.
Mirrors the original inline pattern verbatim:
The pre-iteration count is captured BEFORE calling fn (n_before = len(smooth_curves)).
fn is then invoked exactly once per iteration.
The break test compares the new len(smooth_curves) against n_before; if equal, the loop exits without running another fn call.
Extracted in iter4 (Phase 6, L-16) for use in run_post_opt_cleanup()’s post-extension truncation fixed point.
- Parameters:
fn (callable) – Callable taking (smooth_curves, metadata) and returning the updated tuple.
smooth_curves (list of dict) – Curve dicts.
metadata (list of dict) – Per-segment metadata dicts.
max_iter (int) – Maximum number of iterations.
- Returns:
(smooth_curves, metadata) after the loop.
- Return type:
tuple
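The three steps listed above translate almost directly into a loop. The sketch below follows that description; the function name carries a _sketch suffix to signal that it is an illustration and not the class's own code.

```python
def until_stable_count_sketch(fn, smooth_curves, metadata, max_iter):
    """Call fn until the curve count stops changing or max_iter is reached."""
    for _ in range(max_iter):
        n_before = len(smooth_curves)          # captured BEFORE calling fn
        smooth_curves, metadata = fn(smooth_curves, metadata)
        if len(smooth_curves) == n_before:     # count unchanged: fixed point
            break
    return smooth_curves, metadata
```

Note that the fixed point is detected by count only: a pass that changes curve geometry but leaves the number of curves intact also ends the loop, and detecting stability always costs one extra fn call that changes nothing.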