model.tdcomp package

Submodules

model.tdcomp.classwise_sampler module

class model.tdcomp.classwise_sampler.ClasswiseSampler(**kwargs)

Bases: TrainingDataComponent

Author:

Alberto M. Esmoris Pena

Training data component based on sampling from the input training data to satisfy a given class-wise distribution.

The component arguments of a class-wise sampler are:

target_class_distribution - list of int

The target distribution for each class.

replace - bool

Whether to sample with replacement (i.e., the same data point may be drawn more than once) or without replacement.

__init__(**kwargs)

Initialize the class-wise sampler training data component.

Parameters:

kwargs – The attributes for the class-wise training data component.

__call__(X, y)

Apply the class-wise sampler to transform the input training data.

See TrainingDataComponent.__call__().

compute_classwise_sampling(X, y)

Compute the class-wise sampling on the given input.

Parameters:
  • X – The features of the training data points.

  • y – The expected classes of the training data points.

Returns:

The transformed training data.

Return type:

tuple (X, y)
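The class-wise sampling described above can be sketched with NumPy as follows. This is a minimal illustration, not the component's actual implementation. It assumes that each entry of target_class_distribution is the number of points to draw for the class with that index, and that classes are labeled 0, 1, 2, and so on. The function name classwise_sample and the seed parameter are hypothetical.

```python
import numpy as np


def classwise_sample(X, y, target_class_distribution, replace=False, seed=0):
    """Hypothetical sketch: draw a given number of points from each class.

    Assumes target_class_distribution[c] is the number of points wanted
    for class c (an assumption, not confirmed by the documentation).
    """
    rng = np.random.default_rng(seed)
    selected = []
    for class_idx, target in enumerate(target_class_distribution):
        # Indices of all points that belong to the current class
        candidates = np.flatnonzero(y == class_idx)
        if candidates.size == 0:
            continue  # Nothing to sample for this class
        if not replace:
            # Without replacement we cannot draw more points than exist
            target = min(target, candidates.size)
        selected.append(rng.choice(candidates, size=target, replace=replace))
    if not selected:
        return X[:0], y[:0]
    idx = np.concatenate(selected)
    return X[idx], y[idx]


# Seven points of class 0 and three points of class 1
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1])
X_new, y_new = classwise_sample(X, y, [4, 4], replace=True)
```

With replace=True the minority class can reach its target by repeating points. With replace=False this sketch caps the draw at the number of available points, which is one possible design choice among several.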

model.tdcomp.smote module

class model.tdcomp.smote.SMOTE(**kwargs)

Bases: TrainingDataComponent

Author:

Alberto M. Esmoris Pena

Training data component based on the Synthetic Minority Oversampling TEchnique (SMOTE). It transforms the input data by finding the k-nearest neighbors of each point and interpolating between the point and its neighbors to generate new samples.

__init__(**kwargs)

Initialize the SMOTE training data component.

Parameters:

kwargs – The attributes for the SMOTE training data component.

__call__(X, y)

Apply SMOTE to transform the input training data.

See TrainingDataComponent.__call__().
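The interpolation idea behind SMOTE can be sketched with plain NumPy as shown next. This is a brute-force illustration for a single minority class, not necessarily how this component is implemented. The function name smote_sketch and its parameters (n_new, k, seed) are hypothetical.

```python
import numpy as np


def smote_sketch(X_min, n_new, k=5, seed=0):
    """Hypothetical sketch: generate n_new synthetic minority-class points.

    X_min holds the minority-class points (at least two rows).
    """
    rng = np.random.default_rng(seed)
    n = X_min.shape[0]
    k = min(k, n - 1)  # A point cannot have more than n-1 neighbors
    # Brute-force pairwise squared distances (fine for small inputs only)
    d2 = ((X_min[:, None, :] - X_min[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # Exclude each point from its own neighbors
    knn = np.argsort(d2, axis=1)[:, :k]  # Indices of the k nearest neighbors
    # Pick a random base point and one of its neighbors for each new sample
    base = rng.integers(0, n, size=n_new)
    neigh = knn[base, rng.integers(0, k, size=n_new)]
    # Interpolate at a random position on the segment base -> neighbor
    lam = rng.random((n_new, 1))
    return X_min[base] + lam * (X_min[neigh] - X_min[base])


X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
X_synth = smote_sketch(X_min, n_new=6, k=2)
```

Each synthetic point lies on the segment between an existing minority point and one of its neighbors, so the new samples stay inside the region already covered by that class.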

model.tdcomp.training_data_component module

exception model.tdcomp.training_data_component.TrainingDataComponentException(message='')

Bases: VL3DException

Author:

Alberto M. Esmoris Pena

Class for exceptions related to components for training data pipelines. See Model.

__init__(message='')

class model.tdcomp.training_data_component.TrainingDataComponent(**kwargs)

Bases: object

Author:

Alberto M. Esmoris Pena

Abstract class providing the interface governing any training data component.

Variables:

component_args (dict) – The keyword arguments for the component of the training data pipeline.

static extract_component_args(spec)

Extract the arguments to initialize/instantiate a TrainingDataComponent from a keyword specification.

Parameters:

spec – The keyword specification containing the arguments.

Returns:

The arguments to initialize/instantiate a TrainingDataComponent.

__init__(**kwargs)

Root initialization for any instance of type TrainingDataComponent.

Parameters:

kwargs – The attributes for the TrainingDataComponent.

abstractmethod __call__(X, y)

Run the component on the given training data and return the transformed version of the training data.

Parameters:
  • X – The input to the model, typically the attributes.

  • y – The expected classes.

Returns:

The new attributes and expected classes for model training.

Return type:

tuple (X, y)
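The contract of __call__ (take X and y, return a new tuple (X, y)) can be illustrated with a small stand-in. The sketch below does not import the real package: TrainingDataComponentSketch is a hypothetical substitute for TrainingDataComponent, and it assumes that the constructor stores its keyword arguments in component_args. DropClass and its ignored_class argument are also hypothetical.

```python
from abc import ABC, abstractmethod

import numpy as np


class TrainingDataComponentSketch(ABC):
    """Hypothetical stand-in for the abstract TrainingDataComponent."""

    def __init__(self, **kwargs):
        # Assumption: keyword arguments are kept in component_args
        self.component_args = kwargs

    @abstractmethod
    def __call__(self, X, y):
        """Return the transformed training data as a tuple (X, y)."""


class DropClass(TrainingDataComponentSketch):
    """Hypothetical component that removes every point of one class."""

    def __call__(self, X, y):
        keep = y != self.component_args["ignored_class"]
        return X[keep], y[keep]


X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 2, 1, 2])
X_new, y_new = DropClass(ignored_class=2)(X, y)
```

Because every component maps (X, y) to (X, y), components written this way can be applied one after another without any glue code.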

static build_pipeline(spec)

Build a pipeline from a training data pipeline specification.

Parameters:

spec (list) – A list of dictionaries where each dictionary represents a training data component.

Returns:

The built pipeline.

Return type:

list of TrainingDataComponent

static build_component(spec)

Build a training data component from its specification.

Parameters:

spec – The specification of the training data component to be built.

Returns:

The built training data component.

Return type:

TrainingDataComponent
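How a list of dictionaries could be turned into a pipeline and applied to the training data is sketched below. The exact specification format is not given here, so the keys "component" and "component_args", the REGISTRY lookup, the helper run_pipeline, and the toy components KeepFirst and ShiftLabels are all assumptions made for illustration only.

```python
import numpy as np


class KeepFirst:
    """Hypothetical component that keeps only the first n points."""

    def __init__(self, **kwargs):
        self.n = kwargs["n"]

    def __call__(self, X, y):
        return X[:self.n], y[:self.n]


class ShiftLabels:
    """Hypothetical component that adds a constant offset to the labels."""

    def __init__(self, **kwargs):
        self.offset = kwargs["offset"]

    def __call__(self, X, y):
        return X, y + self.offset


# Assumed mapping from component name to class
REGISTRY = {"KeepFirst": KeepFirst, "ShiftLabels": ShiftLabels}


def build_component(spec):
    # Assumed keys: "component" names the class, "component_args" its kwargs
    return REGISTRY[spec["component"]](**spec.get("component_args", {}))


def build_pipeline(spec):
    # One component per dictionary, in the order given
    return [build_component(comp_spec) for comp_spec in spec]


def run_pipeline(pipeline, X, y):
    # Each component receives the output of the previous one
    for component in pipeline:
        X, y = component(X, y)
    return X, y


spec = [
    {"component": "KeepFirst", "component_args": {"n": 3}},
    {"component": "ShiftLabels", "component_args": {"offset": 1}},
]
pipeline = build_pipeline(spec)
X, y = run_pipeline(pipeline, np.arange(10.0).reshape(5, 2), np.array([0, 1, 0, 1, 1]))
```

Keeping the pipeline as a plain list preserves the order of the specification, so the order in which components appear in the list is the order in which they transform the data.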

Module contents

Author:

Alberto M. Esmoris Pena

The training data components package contains components that can be used in the context of training data pipelines to select or transform the training data when fitting a model.