src.clustering.simple_curve_extractor
- src.clustering.simple_curve_extractor.DEFAULT_MERGE_SCORE_DISTANCE_WEIGHT = 0.4
Default merge-score IDW weights used by the C++ CurveStabilizer when scoring endpoint merges. The three components combine a distance score, a tangent alignment score, and a component-size score. The sum of the three weights must be \(1.0\). Overridable via the public kwargs merge_score_distance_weight, merge_score_tangent_weight, and merge_score_component_weight.
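As a rough illustration of how three weights that sum to \(1.0\) can combine per-component scores, the sketch below validates the weights and forms a weighted sum. Only the distance weight 0.4 comes from the text above; the tangent and component weights, the function name merge_score, and the assumption that each component score lies in [0, 1] are hypothetical. The scoring inside the C++ CurveStabilizer may differ.

```python
import math

DISTANCE_WEIGHT = 0.4   # documented default
TANGENT_WEIGHT = 0.4    # assumed value, for illustration only
COMPONENT_WEIGHT = 0.2  # assumed value, for illustration only


def merge_score(distance_score, tangent_score, component_score,
                w_distance=DISTANCE_WEIGHT,
                w_tangent=TANGENT_WEIGHT,
                w_component=COMPONENT_WEIGHT):
    """Weighted sum of three component scores; weights must sum to 1.0."""
    total = w_distance + w_tangent + w_component
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"merge-score weights must sum to 1.0, got {total}")
    return (w_distance * distance_score
            + w_tangent * tangent_score
            + w_component * component_score)
```

Rejecting weight triples that do not sum to \(1.0\) keeps the combined score on the same scale as its components.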
- class src.clustering.simple_curve_extractor.Polyline(points: ndarray, arc_lengths: ndarray, slopes: ndarray)
- Author:
Alberto M. Esmoris Pena
Compact container for a curve segment: points, arc lengths, and per-vertex slopes. Use to_dict() to emit the legacy dict shape (points, arc_lengths, slopes) still consumed by the rest of the class.
- Variables:
points (np.ndarray) – Polyline points (N x 3 array).
arc_lengths (np.ndarray) – Cumulative arc lengths per vertex, with arc_lengths[0] == 0.
slopes (np.ndarray) – Per-vertex slopes (N,).
- points: ndarray
- arc_lengths: ndarray
- slopes: ndarray
- to_dict()
Emit the legacy {'points', 'arc_lengths', 'slopes'} dict shape consumed throughout the class.
- Returns:
Legacy curve dict.
- Return type:
dict
- __init__(points: ndarray, arc_lengths: ndarray, slopes: ndarray) None
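The container shape described above can be sketched as follows. This is a dependency-free stand-in, not the library class: PolylineSketch and cumulative_arc_lengths are hypothetical names, and plain lists replace the np.ndarray fields.

```python
import math
from dataclasses import dataclass


@dataclass
class PolylineSketch:
    """Hypothetical stand-in for Polyline using lists instead of ndarrays."""
    points: list
    arc_lengths: list
    slopes: list

    def to_dict(self):
        # Plain dict with the three legacy keys, in declaration order.
        return {"points": self.points,
                "arc_lengths": self.arc_lengths,
                "slopes": self.slopes}


def cumulative_arc_lengths(points):
    """Cumulative 3D arc length per vertex; the first entry is always 0."""
    arc = [0.0]
    for p, q in zip(points, points[1:]):
        arc.append(arc[-1] + math.dist(p, q))
    return arc
```

Computing the arc lengths from the vertices guarantees the arc_lengths[0] == 0 invariant stated in the variable list.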
- class src.clustering.simple_curve_extractor.FeatureMeta(CURVE_ID: int, SEG_ID: int, CONF_GEOM: float, CONF_TOP: float, STAB_TOP: float, SRC_PTS: int, SKEL_MTH: str, CLUSTER_ID: int, LENGTH_3D: float, LENGTH_2D: float, AVG_GRADE: float, MAX_GRADE: float)
- Author:
Alberto M. Esmoris Pena
Compact container for the 12 per-feature metadata fields emitted by SimpleCurveExtractor. Added in Stage 6.3 T9 to replace the inline 12-key dict literals at the metadata creation sites. Internally the rest of the class continues to consume plain dict instances, so to_dict() is invoked at the creation sites to preserve byte-identical downstream behaviour.
- Variables:
CURVE_ID – Curve identifier (int).
SEG_ID – Segment identifier within the curve (int). -1 for bridge segments.
CONF_GEOM – Geometric confidence score.
CONF_TOP – Topological confidence score.
STAB_TOP – Topological stability score.
SRC_PTS – Number of source points that contributed to the feature.
SKEL_MTH – Skeleton extraction method identifier (e.g., 'soft_2.5d_manifold', 'bridge', 'coverage_refine').
CLUSTER_ID – Originating cluster identifier, or -1 if not applicable.
LENGTH_3D – 3D polyline length.
LENGTH_2D – 2D polyline length.
AVG_GRADE – Average grade along the polyline.
MAX_GRADE – Maximum grade along the polyline.
- CURVE_ID: int
- SEG_ID: int
- CONF_GEOM: float
- CONF_TOP: float
- STAB_TOP: float
- SRC_PTS: int
- SKEL_MTH: str
- CLUSTER_ID: int
- LENGTH_3D: float
- LENGTH_2D: float
- AVG_GRADE: float
- MAX_GRADE: float
- to_dict()
Emit the legacy dict shape consumed throughout the class. Field order matches the dataclass declaration.
- Returns:
Legacy metadata dict.
- Return type:
dict
- classmethod bridge(cid, length_3d, length_2d, slope)
Build the bridge-segment metadata literal used by post_chain_merge (SimpleTopologicalStitcher) when stitching two curves through a synthetic connector segment. Bridge segments carry no topological confidence, no source-point count and a sentinel CLUSTER_ID = -1; the grade fields are derived from the slope magnitude. Extracted in iter4 (Phase 1, L-23) to fold the repeated 12-field literal into a single source.
- Parameters:
cid (int) – Curve identifier of the bridged pair.
length_3d (float) – Bridge 3D length.
length_2d (float) – Bridge 2D (planar) length.
slope (float) – Bridge per-segment slope value.
- Returns:
Legacy metadata dict.
- Return type:
dict
- classmethod coverage_refine(cid, src_pts, length_3d, length_2d)
Build the coverage-refinement metadata literal used by SimpleCurveExtractor._refine_coverage_chain() when emitting supplementary coverage segments. Coverage-refinement segments carry no topological / geometric confidence and a sentinel CLUSTER_ID = -1; grades are not computed (set to 0.0). Extracted in iter4 (Phase 1, L-23).
- Parameters:
cid (int) – Curve identifier assigned by the refinement pass.
src_pts (int) – Number of source points feeding the refinement segment.
length_3d (float) – Segment 3D length.
length_2d (float) – Segment 2D (planar) length.
- Returns:
Legacy metadata dict.
- Return type:
dict
- __init__(CURVE_ID: int, SEG_ID: int, CONF_GEOM: float, CONF_TOP: float, STAB_TOP: float, SRC_PTS: int, SKEL_MTH: str, CLUSTER_ID: int, LENGTH_3D: float, LENGTH_2D: float, AVG_GRADE: float, MAX_GRADE: float) None
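A minimal sketch of the 12-field container and its bridge factory, under stated assumptions. The class name FeatureMetaSketch is hypothetical. The documented facts are SEG_ID = -1 for bridges, CLUSTER_ID = -1, and grades derived from the slope magnitude; the zero values for CONF_GEOM, CONF_TOP, STAB_TOP and SRC_PTS, and the use of abs(slope) for both grade fields, are assumptions made for illustration.

```python
from dataclasses import dataclass, asdict


@dataclass
class FeatureMetaSketch:
    CURVE_ID: int
    SEG_ID: int
    CONF_GEOM: float
    CONF_TOP: float
    STAB_TOP: float
    SRC_PTS: int
    SKEL_MTH: str
    CLUSTER_ID: int
    LENGTH_3D: float
    LENGTH_2D: float
    AVG_GRADE: float
    MAX_GRADE: float

    def to_dict(self):
        # asdict preserves the dataclass declaration order.
        return asdict(self)

    @classmethod
    def bridge(cls, cid, length_3d, length_2d, slope):
        grade = abs(slope)  # assumption: grade equals the slope magnitude
        return cls(
            CURVE_ID=cid, SEG_ID=-1,
            CONF_GEOM=0.0,  # assumption: value not stated in the docs
            CONF_TOP=0.0, STAB_TOP=0.0, SRC_PTS=0,  # assumed "none" values
            SKEL_MTH="bridge", CLUSTER_ID=-1,
            LENGTH_3D=length_3d, LENGTH_2D=length_2d,
            AVG_GRADE=grade, MAX_GRADE=grade,
        ).to_dict()
```

Returning a dict from the factory mirrors the documented return type, so callers that expect the legacy shape need no changes.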
- class src.clustering.simple_curve_extractor.SimpleCurveExtractor(**kwargs)
- Author:
Alberto M. Esmoris Pena
Extract clean, parameterized 3D curves from previously classified or clustered point clouds. The pipeline has six stages:
Adaptive Stabilization & Iterative Manifold Merging – Normalize density, build a radius graph, smooth tangents, and iteratively merge disconnected endpoints.
Soft 2.5D Manifold Skeleton Extraction – Compute an exact Euclidean distance transform on a soft-Z occupancy grid, apply topological thinning, and reproject to 3D.
Monotonic Topological Cleanup & Staged Pruning – Detect junctions, prune twigs, remove low-centrality edges, validate cycles, and assign segment/curve IDs.
Resilient Geometry & Dual-Track Z – RDP simplification, PCHIP resampling of X(s)/Y(s)/Z(s), and per-vertex slope computation. Internal slope convention is dZ/ds_3D = sin(theta) (where s_3D is the cumulative 3D arc length); bounded in \([-1, 1]\) and well-defined on near-vertical segments. The curve-pcloud writer exposes slope_output_form to emit the SLOPE column as 'sin' (default), civil-engineering 'tan' (= dZ/dxy), or the inclination angle in degrees 'angle_deg' (= atan2(dZ, dxy) * 180/pi).
Multi-Format Industrial Output – Write GeoPackage, Shapefile, and CSV with full metadata.
Reverse Label Propagation (optional) – Map curve/segment IDs back to the original point cloud via nearest-neighbor assignment.
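The three slope forms named in the geometry stage are related by elementary trigonometry. The sketch below converts the internal sin(theta) value into the other two forms; convert_slope is a hypothetical helper, not the writer's code, and returning a signed infinity for 'tan' on vertical segments is an assumption about how that edge case could be handled.

```python
import math


def convert_slope(sin_theta, form="sin"):
    """Convert dZ/ds_3D = sin(theta) into the requested output form."""
    if not -1.0 <= sin_theta <= 1.0:
        raise ValueError("sin(theta) must lie in [-1, 1]")
    if form == "sin":
        return sin_theta
    # cos(theta) = dxy/ds_3D, non-negative because dxy is a length.
    cos_theta = math.sqrt(1.0 - sin_theta * sin_theta)
    if form == "tan":
        # dZ/dxy is unbounded on vertical segments.
        if cos_theta == 0.0:
            return math.copysign(math.inf, sin_theta)
        return sin_theta / cos_theta
    if form == "angle_deg":
        return math.degrees(math.atan2(sin_theta, cos_theta))
    raise ValueError(f"unknown slope form: {form!r}")
```

The 'sin' form stays finite on vertical segments, which is why it suits the internal representation, while 'tan' diverges there.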
Stages 1–4 (core numerics) are delegated to C++ via the pyvl3dpp PyBind11 bridge, including Stage 4 PCHIP resampling which runs in parallel over segments.
Public API (A31): the JSON specification accepts 12 public kwargs enumerated in _SCE_PARAMS_PUBLIC, 3 runtime overrides in _SCE_PARAMS_RUNTIME, and 38 hidden tuning knobs in _SCE_PARAMS_HIDDEN, for a total of 53 instance-attribute names in _SCE_PARAMS (counts post Phase-8 KR-1 reduction). Scale-coupled knobs use a sentinel-driven default in __init__(): when the user leaves them at None they are derived from min_voxel_size so they track the input coordinate scale. Phase-7 + Phase-9 promoted these to the sentinel pattern: rdp_tolerance, min_twig_length, output_point_spacing, z_consistency_sigma, snap_densify_step, hallucination_drop_densify_step, endpoint_extension_step, cid_gap_split_planar_radius (derived from merge_radius), plus the 11 Phase-9 SR-1 additions: min_curve_length, curve_pcloud_step, snap_max_shift, snap_target_radius, snap_pull_radius, hallucination_drop_radius, hallucination_min_supported_length, endpoint_extension_radius, endpoint_extension_max_length, cid_relabel_z_disjoint_threshold, cid_gap_split_dz_threshold. The instance also derives self._segment_grid_cell (Phase-9 SR-2, formerly the module constant DEFAULT_SEGMENT_GRID_CELL).
Internal layout: cluster() is a thin orchestrator that composes 19 _step_* pipeline phases. Curve dicts carry a {'points', 'arc_lengths', 'slopes'} shape wrapped by the Polyline dataclass; per-segment metadata carries 12 fields wrapped by the FeatureMeta dataclass. Both dataclasses emit plain dict via to_dict() at creation sites to preserve byte-identical downstream behaviour.
- Variables:
curve_source (str) – Source of point labels. One of "classification", "predictions" (or the equivalent singular alias "prediction"), or the name of an extra dimension in the LAS file. The three built-in keywords are matched case-insensitively; custom attribute names are passed through verbatim. For 2-D floating-point prediction arrays, hard labels are derived via argmax along the class axis; 2-D integer arrays and 1-D arrays are used verbatim.
curve_classes (list of int) – Class/cluster IDs that represent curve points.
spatial_scale (float) – Skeleton-extraction voxel size. When \(0.0\), the stabilization voxel size is used.
merge_radius (float or None) – 2D (XY) radius of cylindrical endpoint merging. Pairs must satisfy \(\|p_a^{xy} - p_b^{xy}\| \le r\) and \(|z_a - z_b| \le \Delta z_{\max}\). Also governs Z-artifact detection (bridge Z-jump split, vertical-column split, Z-reversal split). When None (default) or set to a non-positive value no merging is done; the internal enable flag is merge_radius is not None and merge_radius > 0.
z_tolerance (float) – Soft-Z occupancy weighting sigma. Also governs \(\Delta z_{\max}\) for cylindrical merge.
max_merge_iterations (int) – Maximum iterations for Stage 1 convergence.
tangent_smoothing_iterations (int) – Iterations for tangent field smoothing in Stage 1.
rdp_tolerance (float) – Ramer-Douglas-Peucker simplification tolerance in point cloud units. When unset (JSON null) __init__() derives it from min_voxel_size via the sentinel block so it tracks the input coordinate scale.
occupancy_threshold (float) – Binary threshold for the soft-Z occupancy grid.
min_twig_length (float) – Minimum branch length below which twigs are pruned. When unset (JSON null) __init__() derives it as \(50 \times\) min_voxel_size via the sentinel block.
angular_threshold_k (float) – Multiplier for junction angular spread threshold detection.
betweenness_percentile (float) – Percentile below which edges are pruned by betweenness centrality.
ridge_fraction (float) – Minimum distance-transform fraction for cycle validation.
output_point_spacing (float) – PCHIP resampling spacing (0 means auto-derive from voxel size).
z_consistency_sigma (float) – Width of the 1D Gaussian filter along arc-length for Z consistency (0 means auto: 5 * voxel_size).
propagate_labels (bool) – Whether to map curve and segment IDs back to the original point cloud.
output_gpkg (str or None) – Path for GeoPackage output, or None to skip.
output_shp (str or None) – Path for Shapefile output, or None to skip.
output_csv (str or None) – Path for CSV output, or None to skip.
curve_pcloud (str or None) – Path for the curve point cloud output (LAS format), or None to skip. When set, a LAS file is written containing only the curve points resampled at curve_pcloud_step spacing, with CURVE_ID and SEG_ID as extra dimensions.
curve_pcloud_step (float or None) – Distance between consecutive points in the curve point cloud, in the same spatial units as the input point cloud. Default None (sentinel): derived as \(1 \times\) min_voxel_size at instance construction (Phase-9 SR-1); reproduces the H-series default \(0.1\) at the production min_voxel_size=0.1 fixture. Production JSONs may set the kwarg explicitly to override.
epsg (int or None) – EPSG code for the coordinate reference system, or None to read from LAS VLRs.
min_voxel_size (float) – Minimum voxel size for adaptive voxelization. Prevents pathological cases with duplicate or noisy points.
min_segment_length (float) – Minimum 3D arc-length for a curve segment to survive post-filtering. When \(0.0\), no filtering is applied.
chain_radius_factor (float) – Multiplier applied to the skeleton voxel size to compute the 2D chain radius for cylindrical segment chaining. Default \(3.0\).
min_curve_length (float or None) – Minimum arc-length for a curve to be retained in the final output. Curves shorter than this value are discarded after all chaining and merging. Also gates the adopt-short-CIDs pass at line ~8924 (skipped when min_curve_length <= 0), so the value is load-bearing, not a pure post-filter. Setting it to \(0.0\) disables both the final filter and the adopt-short pass. Default None (sentinel): derived as \(500 \times\) min_voxel_size at instance construction (Phase-9 SR-1, updated 2026-05-02); reproduces the production-tuned value \(50.0\) m on the inicorta fixture (min_voxel_size = 0.1). The threshold is measured as a 2D (XY-plane) length. Note: this kwarg is scene-coupled, not just scale-coupled: the right value depends on the characteristic length of the curves the user wants to extract from a specific scene, not only on the input voxel resolution. Override explicitly in the JSON spec when the scene’s curve-length distribution differs from inicorta’s.
snap_enable (bool) – When True the snap-to-input post-pass shifts interior polyline vertices toward the input curve-class points the extraction missed. Defaults to True. Setting it False reproduces the H25 baseline output bit-for-bit.
snap_max_shift (float or None) – Maximum 2D distance (in metres) a single interior vertex can be shifted in one snap pass. Smaller caps reduce the number of near-self-crossings the sanitiser later clips, which in turn preserves more coverage. The H31 sweep at the 3 m/2 m metric showed snap_max_shift = 3.5 to be the sweet spot, balancing per-pass reach against sanitiser attrition. Default None (sentinel): derived as \(35 \times\) min_voxel_size at instance construction (Phase-9 SR-1); reproduces the H-series sweep-winner value \(3.5\) m at the production min_voxel_size=0.1 fixture. Production JSONs may set the kwarg explicitly to override.
snap_min_neighbors (int) – Minimum number of missed input points that must pull a given interior vertex before that vertex is shifted. Acts as a sparsity filter that suppresses noisy single-point pulls. Default 4.
snap_smoothing_iterations (int) – Number of 3-point moving-average passes applied to the per-vertex delta sequence before the shifts are committed. Prevents adjacent vertices from shifting in opposing directions. Default 2.
snap_target_radius (float or None) – A curve-class input point is treated as MISSED (eligible to pull a vertex) when its distance to every densified polyline vertex exceeds this radius. Match it to the coverage radius of the validation metric. Default None (sentinel): derived as \(30 \times\) min_voxel_size at instance construction (Phase-9 SR-1); reproduces the H-series sweep-winner value \(3.0\) m at the production min_voxel_size=0.1 fixture (aligned with the production coverage threshold). Setting it lower (more aggressive snap) was tried at H36b and cost coverage because the snap created more hallucinated material than the truncation pass could absorb without removing supported coverage. Production JSONs may set the kwarg explicitly to override.
snap_pull_radius (float or None) – Maximum distance from a missed point to its nearest interior polyline vertex; missed points farther than this are ignored. Larger values pull more distant misses toward their nearest polyline; gain saturates once this exceeds typical inter-cluster spacing. Default None (sentinel): derived as \(400 \times\) min_voxel_size at instance construction (Phase-9 SR-1); reproduces the H-series sweep-winner value \(40.0\) m at the production min_voxel_size=0.1 fixture. Production JSONs may set the kwarg explicitly to override.
snap_densify_step (float) – Spacing (in metres) of the densified polyline used to classify input points as covered or missed. Should match the validation metric’s densification step. Default 0.5 m.
snap_passes (int) – Number of times the snap pass is applied BEFORE the H25 sanitiser. Each pass re-evaluates which input points are still MISSED, so successive passes progressively close the remaining coverage gap. Default 30.
snap_post_passes (int) – Number of additional snap passes applied AFTER the H25 sanitiser. The sanitiser clips loop-prone polyline material the pre-sanitise snap may have created, which in turn destroys coverage on the input points that previously sat inside the clipped segments; the post-pass re-fits the clipped polylines toward those newly-missed points. The per-polyline SI guard prevents any post-pass shift from introducing new self-crossings, so no further sanitiser run is required. Default 20 (sweep winner at the 3 m / 2 m metric); set to 0 to reproduce the H30 (single-stage) snap behaviour.
hallucination_drop_enable (bool) – When True (the default), features whose hallucinated polyline length exceeds hallucination_drop_threshold of their total length are dropped at the end of the global-optimisation step. Hallucinated material is polyline length without any input curve-class support within hallucination_drop_radius.
hallucination_drop_radius (float or None) – Neighbourhood radius (m) used to classify a densified polyline vertex as supported. Default None (sentinel): derived as \(25 \times\) min_voxel_size at instance construction (Phase-9 SR-1); reproduces the H-series sweep-winner value \(2.5\) m at the production min_voxel_size=0.1 fixture (identical to the validation metric’s --hallucination-radius). The truncate’s KDTree is built from a float64 reservoir of the curve-class input points (self._X_orig_f64), so distance comparisons match the metric bit-for-bit. Production JSONs may set the kwarg explicitly to override.
hallucination_drop_threshold (float) – Per-feature hallucination ratio above which the feature is truncated at its hallucinated runs (or dropped if no supported run survives). Default 0.025, identical to the validation metric’s --hallucination-feature-threshold. H36 made the SCE-internal classification bit-exact with the metric (float64 input KDTree + float64-arithmetic densification), so the thresholds can be matched without any safety margin.
hallucination_drop_densify_step (float) – Polyline densification step (m) used to compute the per-feature hallucination ratio. Match the validation metric. Default 0.5 m.
hallucination_min_supported_length (float or None) – Minimum length (m) of a supported sub-polyline kept when a hallucinated feature is split. Hallucinated features above the per-feature threshold are truncated at their hallucinated runs; each contiguous supported run becomes a new feature, but only if its 2D length is at least this many metres. Shorter supported runs are discarded. Default None (sentinel): derived as \(1 \times\) min_voxel_size at instance construction (Phase-9 SR-1); reproduces the H-series sweep-winner value \(0.1\) m at the production min_voxel_size=0.1 fixture. Short supported pockets still contribute coverage and dropping them at the H32 default of 3 m was found to cost up to 0.3 pp of coverage on the production cloud. Production JSONs may set the kwarg explicitly to override.
snap_truncate_iterations (int) – Number of snap/truncate cycles run after the initial truncation pass. Each cycle performs one snap_polylines_to_input followed by one drop_hallucinated_features (both delegated to SimpleCoverageRefiner) so coverage gains from snap shifts compound without ever leaving hallucinated material in the output. Default 5, which empirically saturates the residual coverage budget at modest runtime cost.
endpoint_extension_enable (bool) – When True (the default), a final pass walks each polyline endpoint outward along its tangent and adds short extensions while the next step still has input curve-class support. The pass cannot introduce hallucinated material because it stops at the first unsupported step.
endpoint_extension_radius (float or None) – Neighbourhood radius (m) used to test whether a candidate extension point is supported by the input cloud. Default None (sentinel): derived as \(25 \times\) min_voxel_size at instance construction (Phase-9 SR-1); reproduces the H-series sweep-winner value \(2.5\) m at the production min_voxel_size=0.1 fixture (matching the hallucination radius). Production JSONs may set the kwarg explicitly to override.
endpoint_extension_step (float) – Distance (m) between successive extension steps. Smaller values track curved input clusters more precisely; larger values are faster. Default 0.5 m.
endpoint_extension_max_length (float or None) – Maximum extension length (m) per endpoint. Caps the walk so a particularly long stretch of supported material does not run away. Default None (sentinel): derived as \(150 \times\) min_voxel_size at instance construction (Phase-9 SR-1); reproduces the H-series sweep-winner value \(15.0\) m at the production min_voxel_size=0.1 fixture, long enough to add meaningful coverage past saturated endpoints; the per-step crossing guard (with in-walk segment registration) plus the post-extension truncation pass keep self-intersections at zero, gaps at zero, and hallucinated features at zero even at this length. Production JSONs may set the kwarg explicitly to override.
cid_relabel_z_disjoint_enable (bool) – When True (the default), any CURVE_ID grouping two or more features whose z-extents are separated by more than cid_relabel_z_disjoint_threshold is broken into multiple CURVE_IDs (one per z-connected component). Eliminates phantom same-CID gap pairs left behind by Z-jump and Z-reversal splits.
cid_relabel_z_disjoint_threshold (float or None) – Z separation (m) above which two features sharing a CURVE_ID are considered disjoint and split apart into two CURVE_IDs. Default None (sentinel): derived as \(50 \times\) min_voxel_size at instance construction (Phase-9 SR-1); reproduces the H-series sweep-winner value \(5.0\) m at the production min_voxel_size=0.1 fixture, which is well above z_tolerance (so legitimate Z-noise stays grouped) and well below the observed bad-pair gap (~22 m) in the H31 baseline. Production JSONs may set the kwarg explicitly to override.
cid_gap_split_dz_threshold (float or None) – Z separation (m) above which the H36 endpoint-pair gap splitter breaks two same-CID features into different CURVE_IDs (when their closest endpoints are also planar-close within cid_gap_split_planar_radius). Default None (sentinel): derived as \(30 \times\) min_voxel_size at instance construction (Phase-9 SR-1); reproduces the H-series sweep-winner value \(3.0\) m at the production min_voxel_size=0.1 fixture. Production JSONs may set the kwarg explicitly to override.
merge_orphans_enable (bool) – When True (default), the final orphan-fragment merge pass reassigns whole-CID fragments shorter than merge_orphans_max_length to a planar-close, z-overlapping non-orphan neighbour CURVE_ID. Setting False disables the pass and reproduces the pre-H37 CURVE_ID landscape bit-for-bit.
nthreads (int) – Number of threads (-1 means all available).
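The sentinel pattern used by the scale-coupled parameters above can be sketched as follows. The multipliers are the ones quoted in the parameter descriptions; the names SCALE_FACTORS and resolve_scale_coupled are hypothetical, and the real __init__() sentinel block may be organised differently.

```python
# Multipliers quoted in the parameter descriptions (times min_voxel_size).
SCALE_FACTORS = {
    "min_twig_length": 50,
    "curve_pcloud_step": 1,
    "min_curve_length": 500,
    "snap_max_shift": 35,
    "snap_target_radius": 30,
    "snap_pull_radius": 400,
    "hallucination_drop_radius": 25,
    "hallucination_min_supported_length": 1,
    "endpoint_extension_radius": 25,
    "endpoint_extension_max_length": 150,
    "cid_relabel_z_disjoint_threshold": 50,
    "cid_gap_split_dz_threshold": 30,
}


def resolve_scale_coupled(kwargs, min_voxel_size):
    """Fill every None (or missing) scale-coupled knob from min_voxel_size."""
    resolved = dict(kwargs)
    for name, factor in SCALE_FACTORS.items():
        # Only None triggers derivation; an explicit 0.0 is kept as given.
        if resolved.get(name) is None:
            resolved[name] = factor * min_voxel_size
    return resolved
```

Testing against None rather than truthiness matters here: min_curve_length = 0.0 is a meaningful explicit value that disables filtering, so it must not be replaced by the derived default.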
- Collaborators:
SimpleCurveWriter – multi-format output sink (GeoPackage, Shapefile, CSV, curve LAS). Constructed once per pipeline run by the private _step_output_writers phase; receives a snapshot of the relevant SCE attributes (output_gpkg, output_shp, output_csv, curve_pcloud, curve_pcloud_step, slope_output_form, epsg, out_prefix, _coord_center) and emits the polyline outputs without further SCE coupling. See SimpleCurveWriter for the column schemas and dbase-name truncation rationale.
SimpleGlobalCurveOptimizer – energy-minimising global optimiser. Constructed once per pipeline run by the private _step_global_optimization phase; receives a snapshot of the relevant SCE attributes (merge_radius, min_segment_length, z_tolerance, min_voxel_size) and runs the three-pass per-vertex descent plus the mixed-CID bridging pass. See SimpleGlobalCurveOptimizer for the energy formulation.
SimplePolylineSanitizer – polyline-sanitisation helper (Z-jump split, Z-reversal split, dedup consecutive points, self-intersection clip via the pyvl3dpp.curve_self_intersection_{f,d} C++ binding). Constructed inside the private _step_merge_resolve_split_pipeline and _step_global_optimization phases (and the refine_coverage path on SimpleCoverageRefiner plus the _merge_resolve_sanitize_triplet cleanup path); receives a snapshot of the relevant SCE attributes (merge_radius, z_tolerance, min_voxel_size) and runs the split / dedup / self-intersection-clip cascade. See SimplePolylineSanitizer for the criteria.
SimpleCoverageRefiner – coverage-refinement and post-optimisation cleanup helper (H30/H31 snap-to-input, H32 hallucination drop, H36 endpoint extension, iterative coverage refinement, and the run_post_opt_cleanup cascade driver). Constructed inside the private _step_coverage_refinement and _step_global_optimization phases (and a single-use construction inside _concatenate_orphan_into_neighbour for the static densify_polyline_3d_arc helper); receives a snapshot of the relevant SCE attributes (24 fields) plus a callable hook to the input-cloud KDTree cache and a back-reference to this extractor for cross-class invocations. See SimpleCoverageRefiner for the criteria.
SimpleTopologicalStitcher – endpoint-merger and chain-builder helper (union-find endpoint merge with bridge creation, direction-aware per-CURVE_ID chain builder, post-chain merge cascade with its fixed-point wrapper, and the sub-chain attachment helper). Constructed inside the private _step_endpoint_merge, _step_chain_segments, _step_merge_resolve_split_pipeline, _merge_resolve_sanitize_triplet, _step_adopt_short_cids and _step_global_optimization phases (and the R4 refine_coverage path on SimpleCoverageRefiner); receives a snapshot of the relevant SCE attributes (merge_radius, spatial_scale) plus a back-reference to this extractor for the _max_dz cross-class invocation. See SimpleTopologicalStitcher for the criteria.
SimpleCrossingHandler – cross-feature and cross-CID crossing resolver (C++-backed bbox-grid crossing detection, truncation application, same-CID and cross-CID resolvers, fixed-point driver, and the coverage-refine drop dispatcher that delegates to _drop_coverage_crossing_features()). Constructed inside the private _step_merge_resolve_split_pipeline, _merge_resolve_sanitize_triplet, _step_overlap_and_cross_cid_cleanup and _step_adopt_short_cids phases; receives a snapshot of the relevant SCE attribute (_segment_grid_cell) plus a back-reference to this extractor for symmetry with R3/R4/R5. The unit-test public entry point _drop_coverage_crossing_features() intentionally stays on SCE; the live instance-mode dispatcher drop_cov_refine_by_predicate() on R6 forwards to it via deferred import. See SimpleCrossingHandler for the criteria.
- static extract_clustering_args(spec)
Extract the arguments to initialize/instantiate a SimpleCurveExtractor from a key-word specification. The public API is 12 kwargs plus 3 runtime overrides; hidden tuning knobs are also accepted. Unknown keys raise ClusteringException so typos and stale legacy names fail loudly instead of silently falling back to defaults.
- Parameters:
spec – The key-word specification containing the arguments.
- Returns:
The arguments to initialize/instantiate a SimpleCurveExtractor.
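The fail-loudly behaviour for unknown keys can be illustrated with a small sketch. Everything here is a stand-in: ClusteringExceptionSketch replaces the library's ClusteringException, ALLOWED_KEYS is a hypothetical four-name subset of the accepted parameters, and extract_args_sketch is not the real method.

```python
class ClusteringExceptionSketch(Exception):
    """Hypothetical stand-in for the library's ClusteringException."""


# Hypothetical subset of accepted names; the full lists live in _SCE_PARAMS.
ALLOWED_KEYS = {"curve_source", "curve_classes", "merge_radius", "nthreads"}


def extract_args_sketch(spec):
    """Return the recognised kwargs, rejecting any unknown key."""
    unknown = sorted(set(spec) - ALLOWED_KEYS)
    if unknown:
        raise ClusteringExceptionSketch(
            f"Unknown SimpleCurveExtractor keys: {unknown}"
        )
    return dict(spec)
```

With this check a misspelled key such as merge_raduis is reported at construction time instead of leaving merge_radius at its default.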
- __init__(**kwargs)
Initialize an instance of SimpleCurveExtractor. The constructor accepts the 12 public kwargs (A31), 3 runtime overrides, and the hidden tuning knobs enumerated in _SCE_PARAMS_HIDDEN.
- Parameters:
kwargs – Public, runtime, or hidden SimpleCurveExtractor kwargs.
- cluster(pcloud)
Run the full curve-extraction pipeline on the given point cloud.
The pipeline executes the following steps:
Extract curve points from the point cloud using the configured mask and center coordinates.
Adaptive stabilization and iterative manifold merging (Stage 1 C++).
Soft 2.5D skeleton extraction with exact EDT and topological thinning (Stage 2 C++).
Monotonic topological pruning: junction detection, twig pruning, betweenness-based edge pruning, cycle validation, and segment/curve ID assignment (Stage 3 C++).
Geometry refinement: RDP simplification, PCHIP resampling, reversal removal, and slope computation (Stage 4 C++).
Post-filtering by minimum segment length.
Metadata computation and optional endpoint merging across curves.
Direction-aware chaining of segments into continuous polylines, followed by optional post-chain merging.
Multi-format output (GeoPackage, Shapefile, CSV) and optional reverse label propagation.
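The optional endpoint merging in the steps above relies on the cylindrical criterion given for merge_radius: a planar distance bound plus a separate vertical bound. A minimal sketch of that test follows; can_merge_endpoints is a hypothetical name, and max_dz is passed in directly because the way it is derived from z_tolerance is not specified here.

```python
import math


def can_merge_endpoints(p_a, p_b, merge_radius, max_dz):
    """Cylindrical test: XY distance <= merge_radius and |dz| <= max_dz."""
    # Merging is disabled when merge_radius is None or non-positive.
    if merge_radius is None or merge_radius <= 0:
        return False
    planar = math.hypot(p_a[0] - p_b[0], p_a[1] - p_b[1])
    return planar <= merge_radius and abs(p_a[2] - p_b[2]) <= max_dz
```

Bounding XY and Z separately, rather than using one 3D sphere, keeps endpoints that overlap in plan view but lie at different heights from being joined.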
- Parameters:
pcloud (PointCloud) – The input point cloud with classified or clustered points.
- Returns:
The point cloud, optionally extended with curve and segment ID features.
- Return type:
PointCloud
- get_curves_dict()
Return the in-memory representation of the last extracted curves, in world coordinates. Each entry is a plain dict carrying the polyline vertices under 'points' (Nx3 np.ndarray) plus the per-feature metadata (CURVE_ID, SEG_ID, LENGTH_2D, LENGTH_3D, …) that the SHP/CSV/GPKG writers serialise. Polylines with fewer than 2 vertices are excluded so the result is directly consumable by downstream evaluators.
- Returns:
List of curve dicts, or None when cluster() has not been called yet (or the extraction returned no curves).
- Return type:
list of dict or None
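Because the return value may be None, downstream code has to guard before iterating. The sketch below sums the 2D length per CURVE_ID from such a list; length_2d_by_curve is a hypothetical helper, and it assumes only that each dict exposes the CURVE_ID and LENGTH_2D keys listed above.

```python
from collections import defaultdict


def length_2d_by_curve(curves):
    """Total LENGTH_2D per CURVE_ID; a None input yields an empty dict."""
    totals = defaultdict(float)
    for curve in curves or []:
        totals[curve["CURVE_ID"]] += float(curve["LENGTH_2D"])
    return dict(totals)
```

A caller would pass the result of get_curves_dict() straight in, and an empty dict signals that no curves are available yet.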