src.report.taut_string_reclassification_report

Classes

TautStringReclassificationReport(**kwargs)

class src.report.taut_string_reclassification_report.TautStringReclassificationReport(**kwargs)
Author:

Alberto M. Esmoris Pena

The TautStringReclassificationReport class is the per-cluster sign-disambiguation trace produced by the TautStringReclassifier. It records the decision signal, the centroid, the point count, the per-side ground statistics, and the input/output class distribution so operators can audit unexpected labelings on real data.

See Report. See also TautStringReclassifier.

Variables:
  • yin (np.ndarray) – The vector of point-wise input labels.

  • yout (np.ndarray) – The vector of point-wise output labels.

  • yinlut (dict) – The look-up table whose keys are input class names and whose values are the corresponding input class indices.

  • youtlut (dict) – The look-up table whose keys are output class names and whose values are the corresponding output class indices.

  • records_dict (dict) – Dict keyed by snake_case field names; each value is an (N_clusters,) numpy array as produced by the C++ backend (seed_point_idx, cluster_idx, centroid_x, centroid_y, centroid_z, point_count, mode, signal, g_plus_count, g_minus_count, z_plus_median, z_minus_median, z_plus_min_base, z_minus_min_base, n_horizontal_norm).

  • counters_dict (dict) – Dict with the eight scalar counters returned by the C++ backend (total_wall_clusters, n_a_majority, n_a_lower_median, n_a_lower_base, n_a_inconclusive, n_b_pure_mode_b, n_skipped_near_horizontal, n_skipped_too_small).

  • total_wall_clusters (int) – Scalar mirror of counters_dict['total_wall_clusters'] (the size of the seed list before any skip).

MODE_NAMES = {0: 'A', 1: 'B', 2: 'Skipped'}
SIGNAL_NAMES = {0: 'majority', 1: 'lower-ground-median', 2: 'lower-ground-base', 3: 'inconclusive', 4: 'near-horizontal', 5: 'pure-mode-B', 6: 'too-small'}
COUNTER_LABELS = (('total_wall_clusters', 'total_wall_clusters'), ('n_a_majority', 'N_A_majority'), ('n_a_lower_median', 'N_A_lower_median'), ('n_a_lower_base', 'N_A_lower_base'), ('n_a_inconclusive', 'N_A_inconclusive'), ('n_b_pure_mode_b', 'N_B_pure_mode_B'), ('n_skipped_near_horizontal', 'N_skipped_near_horizontal'), ('n_skipped_too_small', 'N_skipped_too_small'))
CSV_COLUMNS = ('seed_point_idx', 'cluster_idx', 'centroid_x', 'centroid_y', 'centroid_z', 'point_count', 'mode', 'signal', 'g_plus_count', 'g_minus_count', 'z_plus_median', 'z_minus_median', 'z_plus_min_base', 'z_minus_min_base', 'n_horizontal_norm')
INLINE_ROWS_LIMIT = 20
__init__(**kwargs)

Initialize an instance of TautStringReclassificationReport.

Parameters:

kwargs – The keyword arguments defining the report’s attributes (yin, yout, yinlut, youtlut, records_dict, counters_dict, total_wall_clusters).
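The expected shape of these keyword arguments can be illustrated with a small standalone check. This is only a sketch: check_report_kwargs is a hypothetical helper that is not part of the documented class, plain lists stand in for the numpy arrays, and the class names and values in example_kwargs are made-up placeholders.

```python
# Field and counter names are copied from the Variables list of this class.
RECORD_FIELDS = (
    "seed_point_idx", "cluster_idx", "centroid_x", "centroid_y", "centroid_z",
    "point_count", "mode", "signal", "g_plus_count", "g_minus_count",
    "z_plus_median", "z_minus_median", "z_plus_min_base", "z_minus_min_base",
    "n_horizontal_norm",
)
COUNTER_KEYS = (
    "total_wall_clusters", "n_a_majority", "n_a_lower_median",
    "n_a_lower_base", "n_a_inconclusive", "n_b_pure_mode_b",
    "n_skipped_near_horizontal", "n_skipped_too_small",
)


def check_report_kwargs(kwargs):
    """Hypothetical helper: list the problems found in the report kwargs."""
    problems = []
    records = kwargs.get("records_dict", {})
    counters = kwargs.get("counters_dict", {})
    # An empty records dict is allowed (no wall clusters were processed).
    missing_fields = [f for f in RECORD_FIELDS if f not in records]
    if records and missing_fields:
        problems.append(f"missing record fields: {missing_fields}")
    # Every column should hold one value per cluster.
    lengths = {len(column) for column in records.values()}
    if len(lengths) > 1:
        problems.append(f"record columns differ in length: {sorted(lengths)}")
    missing_counters = [k for k in COUNTER_KEYS if k not in counters]
    if missing_counters:
        problems.append(f"missing counters: {missing_counters}")
    # The scalar attribute is documented as a mirror of the counter.
    if kwargs.get("total_wall_clusters") != counters.get("total_wall_clusters"):
        problems.append("total_wall_clusters does not mirror counters_dict")
    return problems


# Placeholder input: two clusters, all record values set to zero.
example_kwargs = {
    "yin": [0, 0, 1, 1],
    "yout": [0, 1, 1, 1],
    "yinlut": {"ground": 0, "wall": 1},   # placeholder class names
    "youtlut": {"ground": 0, "wall": 1},  # placeholder class names
    "records_dict": {name: [0, 0] for name in RECORD_FIELDS},
    "counters_dict": {**{k: 0 for k in COUNTER_KEYS}, "total_wall_clusters": 2},
    "total_wall_clusters": 2,
}
print(check_report_kwargs(example_kwargs))
```

In practice the dictionaries come from the C++ backend, so a check like this would mostly be useful in unit tests that build a report by hand.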

to_class_distribution(title, lut, y)

Generate a string representing a class distribution. Mirrors DirectionalReclassificationReport.to_class_distribution().

Parameters:
  • title (str) – The title or name for the class distribution representation.

  • lut (dict) – The look-up table whose keys are class names and whose values are class indices.

  • y (np.ndarray) – The vector of point-wise labels.

Returns:

String representing the class distribution.

Return type:

str
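As a rough illustration of what a class distribution string may contain, the sketch below counts the labels per class and formats one line per class. The layout (column widths, percentage column) is an assumption; the actual format produced by to_class_distribution is not specified here. The function name class_distribution is hypothetical, and a plain list stands in for the np.ndarray of labels.

```python
from collections import Counter


def class_distribution(title, lut, y):
    """Sketch: one line per class with its absolute count and percentage."""
    counts = Counter(int(label) for label in y)
    total = len(y)
    lines = [title]
    for name, idx in lut.items():
        n = counts.get(idx, 0)
        # Guard against an empty label vector.
        pct = 100.0 * n / total if total else 0.0
        lines.append(f"{name:<16}{n:>10}{pct:>9.2f}%")
    return "\n".join(lines)


# Placeholder class names and labels, for illustration only.
print(class_distribution("Input distribution", {"ground": 0, "wall": 1}, [0, 0, 1, 1]))
```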

to_file(path, out_prefix=None)

Write the per-cluster trace to a CSV file with one row per cluster. The columns are the fifteen fields defined by the TautStringReclassifier ClusterSignalRecord C++ struct (seed_point_idx, cluster_idx, centroid_x, centroid_y, centroid_z, point_count, decoded mode, decoded signal, g_plus_count, g_minus_count, z_plus_median, z_minus_median, z_plus_min_base, z_minus_min_base, n_horizontal_norm).

Parameters:
  • path (str) – Path to the CSV file where the report must be written.

  • out_prefix (str) – The output prefix to expand the path (OPTIONAL). When the path starts with "*" the prefix replaces the leading wildcard, mirroring Report.to_file().

Returns:

Nothing, the output is written to the CSV file.
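The behaviour described above can be sketched with the standard csv module. This is an illustrative approximation, not the class's implementation: write_trace_csv and expand_path are hypothetical names, the header row is an assumption, and plain lists stand in for the numpy arrays of records_dict.

```python
import csv

CSV_COLUMNS = (
    "seed_point_idx", "cluster_idx", "centroid_x", "centroid_y", "centroid_z",
    "point_count", "mode", "signal", "g_plus_count", "g_minus_count",
    "z_plus_median", "z_minus_median", "z_plus_min_base", "z_minus_min_base",
    "n_horizontal_norm",
)
MODE_NAMES = {0: "A", 1: "B", 2: "Skipped"}
SIGNAL_NAMES = {
    0: "majority", 1: "lower-ground-median", 2: "lower-ground-base",
    3: "inconclusive", 4: "near-horizontal", 5: "pure-mode-B", 6: "too-small",
}


def expand_path(path, out_prefix=None):
    """Replace a leading '*' in the path with the output prefix, if given."""
    if out_prefix is not None and path.startswith("*"):
        return out_prefix + path[1:]
    return path


def write_trace_csv(records, path, out_prefix=None):
    """Sketch: write one CSV row per cluster, decoding mode and signal."""
    target = expand_path(path, out_prefix)
    n_rows = len(records[CSV_COLUMNS[0]]) if records else 0
    with open(target, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(CSV_COLUMNS)  # header row (an assumption)
        for i in range(n_rows):
            row = []
            for col in CSV_COLUMNS:
                value = records[col][i]
                if col == "mode":
                    value = MODE_NAMES.get(int(value), f"unknown({int(value)})")
                elif col == "signal":
                    value = SIGNAL_NAMES.get(int(value), f"unknown({int(value)})")
                row.append(value)
            writer.writerow(row)
    return target
```

With out_prefix="/tmp/run" and path="*_trace.csv", the sketch writes to /tmp/run_trace.csv; a path without a leading "*" is used unchanged.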

num_clusters()

Number of per-cluster records held by this report. Returns 0 when the records dict is empty (no wall clusters processed).

Returns:

The number of per-cluster records.

Return type:

int
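Since every value in records_dict is documented as an (N_clusters,) array, the count can be read from any one column. A minimal sketch, assuming that reading of the data layout (count_clusters is a hypothetical stand-in for the method):

```python
def count_clusters(records_dict):
    """Sketch: number of per-cluster records, 0 for an empty dict."""
    if not records_dict:
        return 0
    # All columns share the same length, so the first one is enough.
    first_column = next(iter(records_dict.values()))
    return len(first_column)
```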

decode_mode(mode_int)

Decode an integer mode value into its string name. An unknown value is returned as "unknown(<int>)" rather than raising an exception, so the corresponding CSV row still flags the corrupt value.

Parameters:

mode_int (int) – The integer mode value emitted by the C++ backend.

Returns:

The decoded mode name.

Return type:

str

decode_signal(signal_int)

Decode an integer signal value into its string name. An unknown value is returned as "unknown(<int>)" rather than raising an exception, so the corresponding CSV row still flags the corrupt value.

Parameters:

signal_int (int) – The integer signal value emitted by the C++ backend.

Returns:

The decoded signal name.

Return type:

str
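Both decoders follow the same lookup-with-fallback pattern, which can be sketched as below. The two tables are copied from MODE_NAMES and SIGNAL_NAMES; the shared decode helper is a hypothetical simplification and the standalone functions are stand-ins for the methods.

```python
MODE_NAMES = {0: "A", 1: "B", 2: "Skipped"}
SIGNAL_NAMES = {
    0: "majority", 1: "lower-ground-median", 2: "lower-ground-base",
    3: "inconclusive", 4: "near-horizontal", 5: "pure-mode-B", 6: "too-small",
}


def decode(value, names):
    """Look the integer up in the table; echo unknown values instead of raising."""
    value = int(value)
    return names.get(value, f"unknown({value})")


def decode_mode(mode_int):
    return decode(mode_int, MODE_NAMES)


def decode_signal(signal_int):
    return decode(signal_int, SIGNAL_NAMES)


print(decode_mode(1), decode_signal(3), decode_signal(42))
```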

records_to_inline_table()

Render the per-cluster records as a compact inline table for log output. At most INLINE_ROWS_LIMIT rows are rendered; the caller is expected to have already checked the record count.

Returns:

A string with one line per cluster.

Return type:

str
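A row-limited rendering of this kind could look like the sketch below. The choice of columns and the key=value layout are assumptions made for illustration; inline_table and INLINE_COLUMNS are hypothetical names, and only INLINE_ROWS_LIMIT comes from the class.

```python
INLINE_ROWS_LIMIT = 20
# Hypothetical subset of columns, chosen to keep each log line short.
INLINE_COLUMNS = ("cluster_idx", "point_count", "mode", "signal")


def inline_table(records, limit=INLINE_ROWS_LIMIT):
    """Sketch: one 'key=value' line per cluster, capped at `limit` rows."""
    if not records:
        return ""
    n_rows = min(limit, len(records[INLINE_COLUMNS[0]]))
    lines = []
    for i in range(n_rows):
        cells = [f"{col}={records[col][i]}" for col in INLINE_COLUMNS]
        lines.append("  ".join(cells))
    return "\n".join(lines)
```

Capping the row count keeps the log readable for large inputs, while the CSV written by to_file() remains the complete record.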