src.eval.simple_curve_evaluator

Classes

SimpleCurveEvaluator(**kwargs)

class src.eval.simple_curve_evaluator.SimpleCurveEvaluator(**kwargs)
Author:

Alberto M. Esmoris Pena

Class to evaluate the geometric quality of a Simple Curve Extractor (SCE) output against an input point cloud. Given a point cloud labelled with a designated curve_class and an in-memory list of polyline dicts produced by the SCE (one entry per feature), the evaluator reports five Key Performance Indicators (KPIs):

  1. Coverage \((\%)\) — fraction of curve-class input points whose closest extracted-curve point is within coverage_radius metres in 3D Euclidean distance.

    \[\mathrm{Coverage} = \frac{ \left| \{ p \in C : \min_{q \in S} \lVert p - q \rVert_{2}^{3D} \leq r_{\mathrm{cov}} \} \right| }{|C|}\]
  2. Deviation \((\%)\) — length-weighted ratio of the extracted polyline 3D arc length that traverses populated input regions with no curve-class point nearby. A densified polyline vertex is “deviating” when no curve-class input neighbour lies within hallucination_radius (i.e. it is hallucinated) AND at least one any-class input neighbour lies within hallucination_radius (i.e. it sits inside the cloud). The polyline length partitions cleanly into covered length, deviation length, and pure-hallucination length (off-cloud); thus \(\mathrm{Deviation} + \mathrm{PureHallucination} = \mathrm{Hallucination}\) holds segment-exactly under the both/one segment-mask convention.

  3. Hallucination \((\%)\) — length-weighted ratio of extracted curve material whose densified vertices have no input curve-class support within hallucination_radius (3D). The polylines are densified at densify_step along their 3D arc. The evaluator also reports the count of features whose individual hallucination score exceeds hallucination_feature_threshold.

  4. Self-intersections — count of unordered pairs of 2D polyline segments that cross each other and share the same CURVE_ID. Computed in 2D because crossings describe XY topology.

  5. Gaps — number of endpoint pairs, summed over all CURVE_ID values, that belong to different features of the same curve and whose 2D distance lies in \((\varepsilon, r_{\mathrm{gap}}]\). T-junctions (an endpoint sitting close to another feature’s interior vertex) can optionally be excluded.

Coverage, Deviation and Hallucination are 3D metrics; Self-intersections and Gaps are 2D.
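The Coverage formula above can be sketched with a nearest-neighbour query. This is a minimal illustration, not the evaluator's code: the function name `coverage_fraction` is hypothetical, and it assumes that the set S is the array of (densified) extracted-curve vertices. The fourth input point shows why the distance is taken in 3D: it is close to the curve in XY but 5 m away in Z.

```python
import numpy as np
from scipy.spatial import KDTree


def coverage_fraction(curve_class_xyz, extracted_xyz, coverage_radius):
    """Fraction of curve-class points whose nearest extracted vertex is within the radius (3D)."""
    if len(curve_class_xyz) == 0 or len(extracted_xyz) == 0:
        return 0.0
    tree = KDTree(extracted_xyz)
    # Distance from every curve-class point to its closest extracted vertex.
    nearest_dist, _ = tree.query(curve_class_xyz, k=1)
    return float(np.mean(nearest_dist <= coverage_radius))


# C: curve-class input points, S: extracted-curve vertices (toy data).
C = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [3.0, 0.0, 5.0]])
S = np.array([[0.0, 0.1, 0.0], [1.0, 0.1, 0.0], [2.0, 0.1, 0.0], [3.0, 0.1, 0.0]])
print(coverage_fraction(C, S, coverage_radius=0.5))  # 0.75
```

Multiplying the returned fraction by 100 gives the percentage form of the KPI.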

Variables:
  • coverage_radius (float) – Coverage 3D radius (m).

  • deviation_radius (float) – Radius (m) used only by the n_isolated context KPI (an input curve-class point is flagged isolated when it has no other input neighbour within this radius). The length-anchored deviation_pct is governed by hallucination_radius.

  • hallucination_radius (float) – Hallucination 3D radius (m).

  • hallucination_feature_threshold (float) – Per-feature score above which a feature is flagged as hallucinated.

  • densify_step (float) – Polyline densification step (m, 3D arc).

  • gap_radius (float) – Maximum endpoint-to-endpoint 2D distance for the gap test (m).

  • gap_eps (float) – Endpoint distance below which the pair is considered touching, not a gap.

  • exclude_t_junctions (bool) – When True, skip endpoint pairs where one endpoint sits within t_radius of the other feature’s interior vertices.

  • t_radius (float) – Radius (m) for the T-junction test.

  • self_intersection_grid_cell (float) – Cell size (m) of the bounding-box grid used by count_self_intersections() to localise candidate segment pairs. The grid turns the scan from \(O(N^{2})\) to \(O(N k)\). The value should be commensurate with the typical polyline segment length (default 5.0 m suits polylines whose vertices are spaced roughly 0.5 m apart, as densified polylines typically are): too small wastes memory on empty cells, too large defeats the spatial pruning. Case-dependent.

  • self_intersection_eps (float) – Tolerance on the parametric-interval test that detects whether two 2D line segments cross. Two segments are considered to cross when their parameters t and u both fall in \((-\varepsilon, 1+\varepsilon)\). Default 1e-6 is unitless on the parameter space and rarely needs tuning, but is exposed for completeness.

  • self_intersection_parallel_threshold (float) – Magnitude below which the 2D cross-product of two segment direction vectors is treated as zero (parallel segments → no crossing). Scales with the squared magnitude of the segment direction vectors, so for very small or very large coordinate ranges this threshold may need to be adjusted. Default 1e-12.

  • degenerate_segment_length (float) – 3D segment lengths below this threshold are skipped during densification (no intermediate points inserted) to avoid divide-by-zero and n_insert explosions on near-degenerate polyline segments. Scales with the unit of the input coordinates; for centimetre-scale data the default 1e-12 may be larger than legitimate short segments and a smaller value is needed. Default 1e-12.

  • curve_class (int) – Classification value identifying curve points in the input cloud.

  • report_path (str or None) – Path to write the textual report.

  • hallucination_report_path (str or None) – Path to write the per-feature hallucination report.

  • summary_plot_path (str or None) – Path to write the KPI summary plot.

  • hallucination_plot_path (str or None) – Path to write the per-feature hallucination histogram.
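As an illustration of how the variables above fit together, the following sketch builds a kwargs dictionary with one entry per documented key. Only the values marked as documented defaults come from this page; every other value is an assumption chosen for the example, and the commented-out constructor call is shown only to indicate where the dictionary would be used.

```python
# Keys mirror the documented variables. Values tagged "assumption" are illustrative only.
evaluator_kwargs = {
    "curve_class": 1,                              # assumption: class code of curve points
    "coverage_radius": 0.5,                        # assumption (m)
    "deviation_radius": 1.0,                       # assumption (m), used only by n_isolated
    "hallucination_radius": 0.5,                   # assumption (m)
    "hallucination_feature_threshold": 0.5,        # assumption (per-feature score)
    "densify_step": 0.5,                           # assumption (m, 3D arc)
    "gap_radius": 2.0,                             # assumption (m)
    "gap_eps": 1e-3,                               # assumption (m)
    "exclude_t_junctions": True,                   # assumption
    "t_radius": 5.0,                               # default in the count_gaps() signature
    "self_intersection_grid_cell": 5.0,            # documented default
    "self_intersection_eps": 1e-6,                 # documented default
    "self_intersection_parallel_threshold": 1e-12, # documented default
    "degenerate_segment_length": 1e-12,            # documented default
    "report_path": None,                           # no textual report
    "hallucination_report_path": None,             # no per-feature report
    "summary_plot_path": None,                     # no summary plot
    "hallucination_plot_path": None,               # no histogram
}

# Basic consistency checks implied by the descriptions above.
assert 0.0 < evaluator_kwargs["gap_eps"] < evaluator_kwargs["gap_radius"]
assert evaluator_kwargs["densify_step"] > evaluator_kwargs["degenerate_segment_length"]

# from src.eval.simple_curve_evaluator import SimpleCurveEvaluator
# evaluator = SimpleCurveEvaluator(**evaluator_kwargs)
```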

static extract_eval_args(spec)

Extract the arguments to initialize a SimpleCurveEvaluator from a keyword spec.

Parameters:

spec – The dictionary holding the spec arguments.

Returns:

The kwargs dictionary.

Return type:

dict

__init__(**kwargs)

Initialize a SimpleCurveEvaluator.

Parameters:

kwargs – The attributes for the evaluator. See the class-level docstring for the supported keys.

eval(pcloud=None, curves=None, **kwargs)

Evaluate an in-memory list of extracted curves against an input point cloud.

Parameters:
  • pcloud (PointCloud) – VL3D point cloud whose curve-class points are the ground truth. Required.

  • curves (list of dict) – List of curve dicts produced by SimpleCurveExtractor.get_curves_dict() (each entry carries the polyline under 'points' plus CURVE_ID and friends). Required.

Returns:

The evaluation result.

Return type:

SimpleCurveEvaluation

__call__(pcloud=None, curves=None, **kwargs)

Pipeline-friendly wrapper around eval(). Writes the report and the plots when the corresponding paths are set.

eval_args_from_state(state)

Forward the pipeline state’s point cloud and the in-memory curves channel to eval().

Parameters:

state (SimplePipelineState) – The pipeline’s state.

Returns:

The kwargs for invoking the evaluator.

Return type:

dict

has_report_paths()
Returns:

True when at least one report path is set.

Return type:

bool

has_plot_paths()
Returns:

True when at least one plot path is set.

Return type:

bool

static densify_polyline_3d(pts_xyz, step, degenerate_length=1e-12)

Resample a 3D polyline at uniform step spacing along the 3D arc. Measuring the step along the 3D arc rather than its XY projection matters because a steep z-jump segment that is short in XY but long in 3D would otherwise be sampled too sparsely, letting a hallucinated z-runaway slip past the per-vertex support tests.

Parameters:
  • pts_xyz – (N, M) polyline vertices. Missing Z (M < 3) is padded with zeros.

  • step – Densification step (m, 3D arc).

  • degenerate_length – 3D segment lengths below this value are skipped (no intermediate points inserted) to avoid divide-by-zero on near-degenerate segments. Default 1e-12.

Returns:

(K, 3) densified vertices.

Return type:

np.ndarray
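A minimal sketch of 3D-arc densification follows. It is not the method's actual implementation: the name `densify_polyline_3d_sketch` is hypothetical, and it assumes that the original vertices are kept and that evenly spaced points are inserted on each segment so that consecutive samples are at most `step` apart. The first example is a segment that is 0.1 m long in XY but about 3 m long in 3D.

```python
import numpy as np


def densify_polyline_3d_sketch(pts_xyz, step, degenerate_length=1e-12):
    """Insert points so consecutive vertices are at most `step` apart along the 3D arc."""
    pts = np.asarray(pts_xyz, dtype=float)
    if pts.ndim != 2 or pts.shape[0] == 0:
        return np.zeros((0, 3))
    if pts.shape[1] < 3:
        # Pad missing coordinates (e.g. Z) with zeros.
        pts = np.hstack([pts, np.zeros((pts.shape[0], 3 - pts.shape[1]))])
    pts = pts[:, :3]
    out = [pts[0]]
    for a, b in zip(pts[:-1], pts[1:]):
        seg_len = float(np.linalg.norm(b - a))
        if seg_len >= degenerate_length:
            # Number of interior points needed to keep spacing <= step.
            n_insert = int(np.ceil(seg_len / step)) - 1
            for i in range(1, n_insert + 1):
                out.append(a + (b - a) * (i / (n_insert + 1)))
        # Near-degenerate segments get no interior points; the end vertex is still kept.
        out.append(b)
    return np.vstack(out)


steep = densify_polyline_3d_sketch([[0.0, 0.0, 0.0], [0.1, 0.0, 3.0]], step=1.0)
print(steep.shape)  # (5, 3)
flat = densify_polyline_3d_sketch([[0.0, 0.0], [2.0, 0.0]], step=1.0)
print(flat.shape)   # (3, 3)
```

With an XY-only length the steep segment would receive no intermediate points at a 1 m step; along the 3D arc it receives three.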

static segs_cross_2d(p1, p2, p3, p4, eps=1e-06, parallel_threshold=1e-12)

Return True if 2D segment p1-p2 crosses p3-p4.

Parameters:
  • p1 – First endpoint of segment A.

  • p2 – Second endpoint of segment A.

  • p3 – First endpoint of segment B.

  • p4 – Second endpoint of segment B.

  • eps – Tolerance on the parametric-interval test. The segments are considered to cross when both t and u fall in \((-\varepsilon, 1+\varepsilon)\). Unitless on the parameter space. Default 1e-6.

  • parallel_threshold – Magnitude of the cross-product below which the two direction vectors are treated as parallel (no crossing). Scales with the squared coordinate magnitudes so very large or very small coordinate ranges may need a different value. Default 1e-12.

Returns:

True if the segments cross in 2D.

Return type:

bool
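The parametric test described above can be sketched as follows. The function name `segs_cross_2d_sketch` is hypothetical and the code is a reconstruction from the parameter descriptions, not the class's own source. It solves p1 + t (p2 - p1) = p3 + u (p4 - p3) and checks that both t and u fall in the open interval (-eps, 1 + eps).

```python
def segs_cross_2d_sketch(p1, p2, p3, p4, eps=1e-6, parallel_threshold=1e-12):
    """Return True if 2D segment p1-p2 crosses p3-p4 (parametric-interval test)."""
    d1x, d1y = p2[0] - p1[0], p2[1] - p1[1]   # direction of segment A
    d2x, d2y = p4[0] - p3[0], p4[1] - p3[1]   # direction of segment B
    denom = d1x * d2y - d1y * d2x             # 2D cross product of the directions
    if abs(denom) < parallel_threshold:
        return False                           # treated as parallel: no crossing
    wx, wy = p3[0] - p1[0], p3[1] - p1[1]
    t = (wx * d2y - wy * d2x) / denom          # parameter along segment A
    u = (wx * d1y - wy * d1x) / denom          # parameter along segment B
    return (-eps < t < 1.0 + eps) and (-eps < u < 1.0 + eps)


print(segs_cross_2d_sketch((0, 0), (2, 2), (0, 2), (2, 0)))    # True
print(segs_cross_2d_sketch((0, 0), (1, 0), (0, 1), (1, 1)))    # False (parallel)
print(segs_cross_2d_sketch((0, 0), (1, 0), (2, -1), (2, 1)))   # False (t = 2)

# Two crossing segments of length ~1e-7: the cross product is ~2e-14, so the
# default threshold classifies them as parallel unless it is lowered.
tiny = ((0.0, 0.0), (1e-7, 1e-7), (0.0, 1e-7), (1e-7, 0.0))
print(segs_cross_2d_sketch(*tiny))                              # False
print(segs_cross_2d_sketch(*tiny, parallel_threshold=1e-20))    # True
```

The last two calls illustrate why parallel_threshold may need adjusting for very small coordinate ranges.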

static compute_hallucination(curves, input_curve_xyz, radius, densify_step, feature_threshold, curve_tree=None, degenerate_length=1e-12)

Compute per-feature hallucination scores in 3D.

A point along an extracted polyline is hallucinated when no input curve-class point lies within radius metres in 3D Euclidean distance. The per-feature hallucination score is the length-weighted ratio of hallucinated polyline material on that feature: contiguous segments where both endpoints are hallucinated count fully; segments where exactly one endpoint is hallucinated count for half their length. Features whose score exceeds feature_threshold are flagged.

Parameters:
  • curves – List of curve dicts (each with a 'points' Nx3 array).

  • input_curve_xyz – (N, 3) array of input curve-class XYZ coordinates.

  • radius – 3D Euclidean radius (m).

  • densify_step – 3D densification step (m).

  • feature_threshold – Threshold on the per-feature score above which the feature is flagged.

  • curve_tree (scipy.spatial.KDTree or None) – Optional precomputed KDTree(input_curve_xyz); reusing it across helper calls avoids a redundant tree build. Default None (the tree is built internally).

  • degenerate_length – Forwarded to densify_polyline_3d(). Default 1e-12.

Returns:

Tuple (overall, n_flagged, n_features_scored, per_feat) where per_feat is a list of tuples (fi, score, hall_len, feat_len).

Return type:

tuple
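The per-feature score described above can be sketched for a single, already densified feature. The helper names are hypothetical, and the sketch assumes that "within radius" means a nearest-neighbour distance less than or equal to the radius. Segments with both endpoints hallucinated count fully, and segments with exactly one hallucinated endpoint count for half their length.

```python
import numpy as np
from scipy.spatial import KDTree


def both_one_length(dense_xyz, vertex_mask):
    """Length selected by a vertex mask under the both/one segment convention."""
    seg_len = np.linalg.norm(np.diff(dense_xyz, axis=0), axis=1)
    both = vertex_mask[:-1] & vertex_mask[1:]   # both endpoints flagged: full length
    one = vertex_mask[:-1] ^ vertex_mask[1:]    # exactly one flagged: half length
    return float(seg_len[both].sum() + 0.5 * seg_len[one].sum()), float(seg_len.sum())


def feature_hallucination_score(dense_xyz, input_curve_xyz, radius):
    """Return (score, hall_len, feat_len) for one densified feature."""
    nearest_dist, _ = KDTree(input_curve_xyz).query(dense_xyz, k=1)
    hallucinated = nearest_dist > radius        # no curve-class support within radius
    hall_len, feat_len = both_one_length(dense_xyz, hallucinated)
    score = hall_len / feat_len if feat_len > 0.0 else 0.0
    return score, hall_len, feat_len


# Dense vertices at x = 0..4; curve-class support exists only at x = 0, 1, 2.
dense = np.column_stack([np.arange(5.0), np.zeros(5), np.zeros(5)])
support = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
print(feature_hallucination_score(dense, support, radius=0.5))  # (0.375, 1.5, 4.0)
```

In this toy case the segment from x = 2 to x = 3 contributes 0.5 m and the segment from x = 3 to x = 4 contributes 1 m, giving 1.5 m of hallucinated length out of 4 m.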

static compute_deviation(curves, input_curve_xyz, input_all_xyz, radius, densify_step, curve_tree=None, all_tree=None, degenerate_length=1e-12)

Compute the length-anchored deviation in 3D.

A densified polyline vertex is in a deviation region when (i) no input curve-class neighbour lies within radius (i.e. the vertex is hallucinated) AND (ii) at least one any-class input neighbour lies within radius (i.e. the vertex sits inside the cloud). Aggregation uses the both/one segment-mask convention shared with compute_hallucination(). With radius == hallucination_radius the result satisfies \(\mathrm{deviation\_length} + \mathrm{pure\_hallucination\_length} = \mathrm{hallucination\_length}\) segment-exact, because the deviation and pure-hallucination masks are mutually exclusive at every dense vertex.

Parameters:
  • curves – List of curve dicts (each with a 'points' Nx3 array).

  • input_curve_xyz – (N, 3) array of input curve-class XYZ coordinates.

  • input_all_xyz – (M, 3) array of all input point XYZ coordinates (M >= N).

  • radius – 3D Euclidean radius (m). Use the same value as hallucination_radius for the segment-exact partition with hallucination.

  • densify_step – 3D densification step (m).

  • curve_tree (scipy.spatial.KDTree or None) – Optional precomputed KDTree(input_curve_xyz). Default None.

  • all_tree (scipy.spatial.KDTree or None) – Optional precomputed KDTree(input_all_xyz). Default None.

  • degenerate_length – Forwarded to densify_polyline_3d(). Default 1e-12.

Returns:

Tuple (overall_fraction, dev_length, total_length).

Return type:

tuple
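The segment-exact partition can be checked on a toy polyline. In this sketch (hypothetical helper names, one pre-densified feature, "within radius" taken as distance less than or equal to the radius) the both/one convention is written as a per-segment weight of 0.5 times the sum of the two endpoint flags, which gives 1 for both, 0.5 for one, and 0 for none. Because that weight is linear in the mask and the deviation and pure-hallucination masks are disjoint and add up to the hallucination mask, the lengths add up exactly.

```python
import numpy as np
from scipy.spatial import KDTree


def masked_length(dense_xyz, mask):
    """Both/one convention: full length if both endpoints are flagged, half if one."""
    seg_len = np.linalg.norm(np.diff(dense_xyz, axis=0), axis=1)
    weight = 0.5 * (mask[:-1].astype(float) + mask[1:].astype(float))
    return float(np.sum(seg_len * weight))


def deviation_partition(dense_xyz, input_curve_xyz, input_all_xyz, radius):
    """Return (deviation_length, pure_hallucination_length, hallucination_length)."""
    d_curve, _ = KDTree(input_curve_xyz).query(dense_xyz, k=1)
    d_all, _ = KDTree(input_all_xyz).query(dense_xyz, k=1)
    hallucinated = d_curve > radius            # (i) no curve-class neighbour
    in_cloud = d_all <= radius                 # (ii) some any-class neighbour
    deviation = hallucinated & in_cloud
    pure = hallucinated & ~in_cloud
    return (masked_length(dense_xyz, deviation),
            masked_length(dense_xyz, pure),
            masked_length(dense_xyz, hallucinated))


# Dense vertices at x = 0..5; curve-class points at x = 0, 1; other classes at x = 2, 3.
dense = np.column_stack([np.arange(6.0), np.zeros(6), np.zeros(6)])
curve_pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
other_pts = np.array([[2.0, 0.0, 0.0], [3.0, 0.0, 0.0]])
all_pts = np.vstack([curve_pts, other_pts])
dev_len, pure_len, hall_len = deviation_partition(dense, curve_pts, all_pts, radius=0.5)
print(dev_len, pure_len, hall_len)  # 2.0 1.5 3.5
```

Dividing dev_len by the total polyline length (5 m here) gives a deviation fraction of 0.4 for this example.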

static count_self_intersections(curves, grid_cell=5.0, eps=1e-06, parallel_threshold=1e-12)

Count 2D self-intersections within the same CURVE_ID.

Backed by the C++ pyvl3dpp.curve_self_intersection_d (per-polyline first-hit scan) and pyvl3dpp.curve_segments_cross_d (cross-feature pair enumeration) entry points; both use the same grid-backed scan as the SimpleCurveExtractor and run in \(O(N k)\). The pure-Python double-loop driven by segs_cross_2d() (Phase 2 hotspot, ~5.6 M calls per evaluation) has been retired here. The segs_cross_2d() helper is kept on the class as a reference primitive (no other call sites in the evaluator).

Implementation contract per CURVE_ID group:

  • Cross-feature crossings. Counted exactly via pyvl3dpp.curve_segments_cross_d(), which emits each unordered (fa, fb) pair once with strict fa < fb. endpoint_touch_skip is left at the default False (cross-CID semantics — every crossing pair is reported).

  • Intra-feature crossings. Counted as the number of features that have at least one self-crossing, via pyvl3dpp.curve_self_intersection_d() (boolean first-hit). Polylines with multiple self-crossings contribute 1 here rather than the exact pair count; the consumer (the strict KPI gate n_self_intersections == 0) treats both reports identically (0 stays 0; any non-zero polyline still fails the gate).

Parameters:
  • curves – List of curve dicts (each entry must carry 'points' and CURVE_ID).

  • grid_cell – Cell size (m) of the bounding-box grid used to localise candidate segment pairs. The value should be commensurate with the typical polyline segment length: too small wastes memory on empty cells, too large defeats the spatial pruning. Case-dependent — exposed as self_intersection_grid_cell on SimpleCurveEvaluator. Default 5.0.

  • eps – Retained for API back-compatibility. The tolerance is applied inside the C++ scan; the value passed here is informational only. Default 1e-6.

  • parallel_threshold – Retained for API back-compatibility. The cross-product threshold is applied inside the C++ scan; the value passed here is informational only. Default 1e-12.

Returns:

Total number of crossings (intra-feature polylines contribute 1 each; cross-feature pairs are exact).

Return type:

int
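The role of grid_cell can be illustrated with a pure-Python sketch of grid-based candidate pruning. This is not the C++ scan: the function name is hypothetical, and it assumes that each segment is registered in every cell overlapped by its axis-aligned bounding box and that only segments sharing a cell are tested against each other.

```python
import math
from collections import defaultdict
from itertools import combinations


def candidate_segment_pairs(segments, grid_cell=5.0):
    """Return index pairs (i, j), i < j, of segments whose bounding boxes share a grid cell."""
    cells = defaultdict(list)
    for idx, ((x1, y1), (x2, y2)) in enumerate(segments):
        ix0, ix1 = math.floor(min(x1, x2) / grid_cell), math.floor(max(x1, x2) / grid_cell)
        iy0, iy1 = math.floor(min(y1, y2) / grid_cell), math.floor(max(y1, y2) / grid_cell)
        # Register the segment in every cell its bounding box overlaps.
        for ix in range(ix0, ix1 + 1):
            for iy in range(iy0, iy1 + 1):
                cells[(ix, iy)].append(idx)
    pairs = set()
    for members in cells.values():
        # Indices are appended in increasing order, so each pair comes out as (i, j), i < j.
        pairs.update(combinations(members, 2))
    return pairs


segments = [
    ((0.0, 0.0), (1.0, 1.0)),          # A
    ((0.0, 1.0), (1.0, 0.0)),          # B, crosses A
    ((100.0, 100.0), (101.0, 101.0)),  # C, far away
]
print(candidate_segment_pairs(segments, grid_cell=5.0))  # {(0, 1)}
```

Only one of the three possible pairs reaches the exact crossing test. A much larger cell would put all three segments in one cell and restore the quadratic scan, while a much smaller cell would spread each segment over many cells.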

static count_gaps(curves, gap_radius, gap_eps, exclude_t_junctions=False, t_radius=5.0)

Count endpoint-to-endpoint gaps within the same CURVE_ID.

Parameters:
  • curves – List of curve dicts.

  • gap_radius – Maximum 2D distance for a candidate gap pair (m).

  • gap_eps – Endpoint distance below which a pair is counted as touching, not a gap.

  • exclude_t_junctions – When True, skip pairs where an endpoint sits within t_radius of the other feature’s interior vertices.

  • t_radius – Radius for the T-junction test.

Returns:

A pair (total, n_t_junctions). The second value is always 0 when exclude_t_junctions=False.

Return type:

tuple
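The gap test can be sketched as a brute-force scan over endpoint pairs. The function name is hypothetical, the dictionary key "CURVE_ID" is an assumption about how the curve dicts store the identifier, and the T-junction exclusion is omitted for brevity, so the sketch returns only the total.

```python
from collections import defaultdict
from itertools import combinations

import numpy as np


def count_gaps_sketch(curves, gap_radius, gap_eps):
    """Count endpoint pairs of different features, same CURVE_ID, with 2D distance in (eps, r]."""
    groups = defaultdict(list)
    for curve in curves:
        pts = np.asarray(curve["points"], dtype=float)
        if len(pts) < 2:
            continue
        # Keep only the XY coordinates of the first and last vertex.
        groups[curve["CURVE_ID"]].append((pts[0, :2], pts[-1, :2]))
    total = 0
    for feats in groups.values():
        for ends_a, ends_b in combinations(feats, 2):
            for pa in ends_a:
                for pb in ends_b:
                    d = float(np.linalg.norm(pa - pb))
                    if gap_eps < d <= gap_radius:
                        total += 1
    return total


curves = [
    {"CURVE_ID": 1, "points": [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]]},
    {"CURVE_ID": 1, "points": [[10.5, 0.0, 3.0], [20.0, 0.0, 3.0]]},  # 0.5 m gap in XY
    {"CURVE_ID": 1, "points": [[20.0, 0.0, 0.0], [30.0, 0.0, 0.0]]},  # touches in XY
    {"CURVE_ID": 2, "points": [[10.2, 0.0, 0.0], [10.2, 5.0, 0.0]]},  # different curve
]
print(count_gaps_sketch(curves, gap_radius=1.0, gap_eps=1e-3))  # 1
```

The second and third features differ by 3 m in Z but coincide in XY, so their endpoints count as touching, consistent with the gap test being a 2D metric.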