vcnet package

Submodules

vcnet.classifiers module

class vcnet.classifiers.Classifier(hp)

Bases: LightningModule

Simple fully convolutional classifier that can be used in the VCNet pipeline.

Args:

hp (Dict): configuration of the classifier (hyperparameters) and the dataset

configure_optimizers()

Choose what optimizers and learning-rate schedulers to use in your optimization. Normally you’d need one. But in the case of GANs or similar you might have multiple. Optimization with multiple optimizers only works in the manual optimization mode.

Return:

Any of the following options.

  • Single optimizer.

  • List or Tuple of optimizers.

  • Two lists - The first list has multiple optimizers, and the second has multiple LR schedulers (or multiple lr_scheduler_config).

  • Dictionary, with an "optimizer" key, and (optionally) a "lr_scheduler" key whose value is a single LR scheduler or lr_scheduler_config.

  • None - Fit will run without any optimizer.

The lr_scheduler_config is a dictionary which contains the scheduler and its associated configuration. The default configuration is shown below.

lr_scheduler_config = {
    # REQUIRED: The scheduler instance
    "scheduler": lr_scheduler,
    # The unit of the scheduler's step size, could also be 'step'.
    # 'epoch' updates the scheduler on epoch end whereas 'step'
    # updates it after an optimizer update.
    "interval": "epoch",
    # How many epochs/steps should pass between calls to
    # `scheduler.step()`. 1 corresponds to updating the learning
    # rate after every epoch/step.
    "frequency": 1,
    # Metric to monitor for schedulers like `ReduceLROnPlateau`
    "monitor": "val_loss",
    # If set to `True`, will enforce that the value specified 'monitor'
    # is available when the scheduler is updated, thus stopping
    # training if not found. If set to `False`, it will only produce a warning
    "strict": True,
    # If using the `LearningRateMonitor` callback to monitor the
    # learning rate progress, this keyword can be used to specify
    # a custom logged name
    "name": None,
}

When there are schedulers in which the .step() method is conditioned on a value, such as the torch.optim.lr_scheduler.ReduceLROnPlateau scheduler, Lightning requires that the lr_scheduler_config contains the keyword "monitor" set to the metric name that the scheduler should be conditioned on.

# The ReduceLROnPlateau scheduler requires a monitor
def configure_optimizers(self):
    optimizer = Adam(...)
    return {
        "optimizer": optimizer,
        "lr_scheduler": {
            "scheduler": ReduceLROnPlateau(optimizer, ...),
            "monitor": "metric_to_track",
            "frequency": "indicates how often the metric is updated",
            # If "monitor" references validation metrics, then "frequency" should be set to a
            # multiple of "trainer.check_val_every_n_epoch".
        },
    }


# In the case of two optimizers, only one using the ReduceLROnPlateau scheduler
def configure_optimizers(self):
    optimizer1 = Adam(...)
    optimizer2 = SGD(...)
    scheduler1 = ReduceLROnPlateau(optimizer1, ...)
    scheduler2 = LambdaLR(optimizer2, ...)
    return (
        {
            "optimizer": optimizer1,
            "lr_scheduler": {
                "scheduler": scheduler1,
                "monitor": "metric_to_track",
            },
        },
        {"optimizer": optimizer2, "lr_scheduler": scheduler2},
    )

Metrics can be made available to monitor by simply logging them using self.log('metric_to_track', metric_val) in your LightningModule.
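For instance, a minimal sketch inside a LightningModule (the metric name and loss computation are illustrative):

def validation_step(self, batch, batch_idx):
    x, y = batch
    loss = self.loss(self(x), y)       # illustrative loss computation
    self.log("metric_to_track", loss)  # now usable as "monitor": "metric_to_track"
    return loss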

Note:

Some things to know:

  • Lightning calls .backward() and .step() automatically in case of automatic optimization.

  • If a learning rate scheduler is specified in configure_optimizers() with key "interval" (default “epoch”) in the scheduler configuration, Lightning will call the scheduler’s .step() method automatically in case of automatic optimization.

  • If you use 16-bit precision (precision=16), Lightning will automatically handle the optimizer.

  • If you use torch.optim.LBFGS, Lightning handles the closure function automatically for you.

  • If you use multiple optimizers, you will have to switch to ‘manual optimization’ mode and step them yourself.

  • If you need to control how often the optimizer steps, override the optimizer_step() hook.

forward(x: tensor)

Same as torch.nn.Module.forward().

Args:

x (torch.tensor): input examples to classify.

Return:

Your model’s output

training_step(batch, batch_idx)

Here you compute and return the training loss and some additional metrics for e.g. the progress bar or logger.

Args:

batch: The output of your data iterable, normally a DataLoader.

batch_idx: The index of this batch.

dataloader_idx: The index of the dataloader that produced this batch (only if multiple dataloaders are used).

Return:
  • Tensor - The loss tensor

  • dict - A dictionary which can include any keys, but must include the key 'loss' in the case of automatic optimization.

  • None - In automatic optimization, this will skip to the next batch (but is not supported for multi-GPU, TPU, or DeepSpeed). For manual optimization, this has no special meaning, as returning the loss is not required.

In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model specific.

Example:

def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss

To use multiple optimizers, you can switch to ‘manual optimization’ and control their stepping:

def __init__(self):
    super().__init__()
    self.automatic_optimization = False


# Multiple optimizers (e.g.: GANs)
def training_step(self, batch, batch_idx):
    opt1, opt2 = self.optimizers()

    # do training_step with encoder
    ...
    opt1.step()
    # do training_step with decoder
    ...
    opt2.step()

Note:

When accumulate_grad_batches > 1, the loss returned here will be automatically normalized by accumulate_grad_batches internally.
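A minimal training sketch for this class, assuming hp holds the configuration described above, dm is a Lightning data module such as vcnet.data.DataCatalog, and that the Lightning import path below matches the one used by the package:

import lightning.pytorch as pl
from vcnet.classifiers import Classifier

clf = Classifier(hp)                 # hp: classifier/dataset configuration
trainer = pl.Trainer(max_epochs=10)  # standard Lightning training loop
trainer.fit(clf, datamodule=dm)      # dm: e.g. a DataCatalog instance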

class vcnet.classifiers.sklearnClassifier(hp)

Bases: object

Wrapper for using sklearn classifiers in the VCNet pipeline.

Example of minimal configuration:

{
    "dataset": {
        "target": "income",
    },
    "classifier_params": {
        "skname": "RandomForestClassifier",
        "kwargs": {
            "n_estimators": 50,
        },
    },
}

Attributes:

hp (Dict): configuration of the classifier (hyperparameters) and the dataset

fit(X: DataFrame)

Fits the model.

Args:

X (pd.DataFrame): dataset to train the model on
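A hedged usage sketch, assuming a pandas DataFrame that contains the target column named in the configuration (the CSV path is illustrative):

import pandas as pd
from vcnet.classifiers import sklearnClassifier

hp = {
    "dataset": {"target": "income"},
    "classifier_params": {
        "skname": "RandomForestClassifier",
        "kwargs": {"n_estimators": 50},
    },
}

df = pd.read_csv("adult.csv")  # illustrative dataset with an "income" column
clf = sklearnClassifier(hp)
clf.fit(df)                    # the target column is read from hp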

vcnet.data module

class vcnet.data.DataCatalog(config: dict)

Bases: LightningDataModule

Generic framework for datasets, using sklearn processing. This class is subclassed by OnlineCatalog and CsvCatalog. OnlineCatalog allows the user to easily load online datasets, while CsvCatalog allows easy use of local datasets.

Parameters

data_name: str

What name the dataset should have.

df: pd.DataFrame

The complete Dataframe. This is equivalent to the combination of df_train and df_test, although not shuffled.

df_train: pd.DataFrame

Training portion of the complete Dataframe.

df_test: pd.DataFrame

Testing portion of the complete Dataframe.

df_val: pd.DataFrame

Validation portion of the complete Dataframe.

scaling_method: str, default: MinMax

Type of used sklearn scaler. Can be set with the property setter to any sklearn scaler. Set to “Identity” for no scaling.

encoding_method: str, default: OneHot_drop_binary

Type of OneHotEncoding {OneHot, OneHot_drop_binary}. Additional drop binary decides if one column is dropped for binary features. Can be set with the property setter to any sklearn encoder. Set to “Identity” for no encoding.

Returns

Data

Warning

Imputation works only for continuous variables.
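A sketch of a plausible configuration assembled from the parameters documented above; the exact schema of the config dict is an assumption and may differ:

from vcnet.data import DataCatalog

config = {
    "data_name": "adult",                      # dataset name
    "target": "income",                        # label column
    "continuous": ["age", "hours_per_week"],   # numerical features
    "categorical": ["workclass", "education"], # features to one-hot encode
    "immutables": ["age"],                     # features VCNet must not change
    "scaling_method": "MinMax",                # or "Standard" / "Identity"
    "encoding_method": "OneHot_drop_binary",   # or "OneHot" / "Identity"
}
catalog = DataCatalog(config)
catalog.prepare_data()  # load and preprocess (see prepare_data() below)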

property categorical: List[str]
property continuous: List[str]
data_unloader(X, y) DataFrame

Recreates a dataframe from the (numpy) arrays of data and labels.

It applies the inverse transformation required internally by VCNet. In particular, it reverses the one-hot encoding to recreate readable categorical features for the user.

Returns:

DataFrame: Dataframe with the same columns as input dataframe

Warning

In case the pre-processing included missing values imputations, this step is not reversed and the output dataset contains the imputed values.

property df_test: DataFrame
property df_train: DataFrame
property df_val: DataFrame
property encoder: BaseEstimator

Contains a fitted sklearn encoder:

Returns

sklearn.preprocessing.BaseEstimator

get_pipeline_element(key: str) Callable

Returns a specific element of the transformation pipeline.

Parameters

key: str

Element of the pipeline we want to return

Returns

Pipeline element

property immutables: List[str]
property imputer: BaseEstimator

Contains a fitted sklearn imputer:

Returns

sklearn.preprocessing.BaseEstimator

inverse_transform(df: DataFrame) DataFrame

Transforms output after prediction back into original form. Only possible for DataFrames with preprocessing steps.

Parameters

df: pd.DataFrame

Contains normalized and encoded data.

Returns

output: pd.DataFrame

Prediction output denormalized and decoded

prepare_data(raw_pd: DataFrame = None)

Data preparation

Parameters

raw_pd: pd.DataFrame, optional

Defaults to None.

property raw_df_test: DataFrame
property raw_df_train: DataFrame
property raw_df_val: DataFrame
property scaler: BaseEstimator

Contains a fitted sklearn scaler.

Returns

sklearn.preprocessing.BaseEstimator

property target: str
test_dataloader() DataLoader

An iterable or collection of iterables specifying test samples.

For data processing use the following pattern:

  • download in prepare_data()

  • process and split in setup()

However, the above are only necessary for distributed processing.

Warning

do not assign state in prepare_data

Note:

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Note:

If you don’t need a test dataset and a test_step(), you don’t need to implement this method.

train_dataloader() DataLoader

An iterable or collection of iterables specifying training samples.

The dataloader you return will not be reloaded unless you set Trainer.reload_dataloaders_every_n_epochs to a positive integer.

For data processing use the following pattern:

  • download in prepare_data()

  • process and split in setup()

However, the above are only necessary for distributed processing.

Warning

do not assign state in prepare_data

Note:

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

transform(df: DataFrame) DataFrame

Transforms input for prediction into correct form. Only possible for DataFrames without preprocessing steps.

Using this method is recommended to keep encodings and normalization correct.

Parameters

df: pd.DataFrame

Contains raw (not normalized and not encoded) data.

Returns

output: pd.DataFrame

Prediction input normalized and encoded
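Together with inverse_transform(), this gives a round trip between raw and model-ready data. A short sketch, where raw_df is assumed to be an unprocessed DataFrame with the catalog's columns:

model_ready = catalog.transform(raw_df)            # scale + one-hot encode
restored = catalog.inverse_transform(model_ready)  # back to readable form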

val_dataloader() DataLoader

An iterable or collection of iterables specifying validation samples.

The dataloader you return will not be reloaded unless you set Trainer.reload_dataloaders_every_n_epochs to a positive integer.

It’s recommended that all data downloads and preparation happen in prepare_data().

Note:

Lightning tries to add the correct sampler for distributed and arbitrary hardware. There is no need to set it yourself.

Note:

If you don’t need a validation dataset and a validation_step(), you don’t need to implement this method.

class vcnet.data.NumpyDataset(*arrs)

Bases: TensorDataset

data_loader(batch_size=128, shuffle=True, num_workers=4)
features(test=False)
target(test=False)
class vcnet.data.Rounder(precisions)

Bases: object

A class dedicated to rounding values to a given number of decimals.

Parameters

precisions: Dict[str, int]

Map from attribute name to the decimal precision to apply.

inverse_transform(df)
vcnet.data.decode(fitted_encoder: BaseEstimator, features: List[str], df: DataFrame) DataFrame

Pipeline function to decode data with fitted sklearn OneHotEncoder.

Parameters

fitted_encoder: sklearn OneHotEncoder

Encodes input data.

features: list

List of categorical features.

df: pd.DataFrame

Data we want to decode.

Returns

output: pd.DataFrame

Whole DataFrame with decoded values

vcnet.data.descale(fitted_scaler: BaseEstimator, features: List[str], df: DataFrame) DataFrame

Pipeline function to de-normalize data with fitted sklearn scaler.

Parameters

fitted_scaler: sklearn Scaler

Normalizes input data.

features: list

List of continuous features.

df: pd.DataFrame

Data we want to de-normalize.

Returns

output: pd.DataFrame

Whole DataFrame with de-normalized values

vcnet.data.encode(fitted_encoder: BaseEstimator, features: List[str], df: DataFrame) DataFrame

Pipeline function to encode data with fitted sklearn OneHotEncoder.

Parameters

fitted_encoder: sklearn OneHotEncoder

Encodes input data.

features: list

List of categorical features.

df: pd.DataFrame

Data we want to encode.

Returns

output: pd.DataFrame

Whole DataFrame with encoded values

vcnet.data.fit_encoder(encoding_method, df)

Parameters

encoding_method: {“OneHot”, “OneHot_drop_binary”, “Identity”}

String indicating what encoding method to use or sklearn.preprocessing function.

df: pd.DataFrame

DataFrame containing only categorical data.

Returns

sklearn.base.BaseEstimator
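A sketch of the fit/encode/decode round trip using the pipeline functions above; the column values are illustrative:

import pandas as pd
from vcnet.data import fit_encoder, encode, decode

df = pd.DataFrame({"workclass": ["Private", "State-gov", "Private"]})
enc = fit_encoder("OneHot_drop_binary", df)  # fit on the categorical columns
df_enc = encode(enc, ["workclass"], df)      # one-hot columns replace "workclass"
df_dec = decode(enc, ["workclass"], df_enc)  # recovers the categorical column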

vcnet.data.fit_imputer(imputation_method, df)

Parameters

imputation_method: {“SimpleImputer”,”Identity”}

String indicating what imputation method to use, or an sklearn.impute function.

df: pd.DataFrame

DataFrame only containing continuous data.

Returns

sklearn.base.BaseEstimator

vcnet.data.fit_rounder(df)

Function that builds a Rounder from a dataframe.

Parameters

df: pd.DataFrame

DataFrame only containing continuous data.

Returns

Rounder

vcnet.data.fit_scaler(scaling_method, df)

Parameters

scaling_method: {“MinMax”, “Standard”, “Identity”}

String indicating what scaling method to use or sklearn.preprocessing function.

df: pd.DataFrame

DataFrame only containing continuous data.

Returns

sklearn.base.BaseEstimator

vcnet.data.impute(fitted_imputer: BaseEstimator, features: List[str], df: DataFrame) DataFrame

Pipeline function to impute missing values in the dataset with a fitted sklearn Imputer.

Parameters

fitted_imputer: sklearn Imputer

Imputes missing values.

features: list

List of numerical features.

df: pd.DataFrame

Data we want to modify.

Returns

output: pd.DataFrame

Whole DataFrame without missing values (in the selected features)
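A sketch combining fit_imputer() and impute(); the values are illustrative:

import numpy as np
import pandas as pd
from vcnet.data import fit_imputer, impute

df = pd.DataFrame({"age": [22.0, np.nan, 58.0]})
imputer = fit_imputer("SimpleImputer", df)  # sklearn SimpleImputer under the hood
df_full = impute(imputer, ["age"], df)      # NaN replaced by an imputed value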

vcnet.data.order_data(feature_order: List[str], df: DataFrame) DataFrame

Restores the correct input feature order for the ML model

Only works for encoded data

Parameters

feature_order: list

List of input features in the correct order.

df: pd.DataFrame

Data we want to order.

Returns

output: pd.DataFrame

Whole DataFrame with ordered features

vcnet.data.round(fitted_rounder: Rounder, features: List[str], df: DataFrame) DataFrame

Pipeline function to round the numerical attributes of the data.

Parameters

fitted_rounder: Rounder

Rounds the attributes of the data at a fitted level of precision.

features: list

List of continuous features.

df: pd.DataFrame

Data we want to round.

Returns

output: pd.DataFrame

Whole DataFrame with rounded values
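A sketch combining fit_rounder() and round(); the values are illustrative:

import pandas as pd
from vcnet.data import fit_rounder, round

df = pd.DataFrame({"age": [22.0, 35.0], "rate": [0.53, 0.41]})
rounder = fit_rounder(df)  # infers the per-column precision from df
noisy = pd.DataFrame({"age": [22.0001, 34.9999], "rate": [0.5304, 0.4097]})
clean = round(rounder, ["age", "rate"], noisy)  # back to the fitted precision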

vcnet.data.scale(fitted_scaler: BaseEstimator, features: List[str], df: DataFrame) DataFrame

Pipeline function to normalize data with fitted sklearn scaler.

Parameters

fitted_scaler: sklearn Scaler

Normalizes input data.

features: list

List of continuous features.

df: pd.DataFrame

Data we want to normalize.

Returns

output: pd.DataFrame

Whole DataFrame with normalized values
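A sketch of the fit_scaler()/scale()/descale() round trip; the column names are illustrative:

import pandas as pd
from vcnet.data import fit_scaler, scale, descale

df = pd.DataFrame({"age": [22, 35, 58], "hours_per_week": [40, 50, 20]})
scaler = fit_scaler("MinMax", df)
df_norm = scale(scaler, ["age", "hours_per_week"], df)         # values in [0, 1]
df_back = descale(scaler, ["age", "hours_per_week"], df_norm)  # original scale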

vcnet.models module

class vcnet.models.PHVCNet(model_config: Dict, classifier: Module)

Bases: VCNetBase

Class for the Post-hoc VCNet model architecture (immutable version). Post-hoc VCNet uses a torch classifier already trained on a classification task and trains only the counterfactual generators.

The classifier provided to this class is assumed to take the examples to be classified as input.

classif(z: tensor, x: tensor, x_mut: tensor, x_immut: tensor) tensor

Forward function of the classification layers. It predicts the class of an example z prepared by the encode_classif function.

Args:

z (torch.tensor): examples represented in their latent space for classification.

Returns:

torch.tensor: example classification. Dimension self.class_size.

training_step(batch, batch_idx)

Training step for lightning

Args:

batch (torch.tensor): batch

batch_idx (torch.tensor): list of example indices

Returns:

float: loss measure for the batch

class vcnet.models.VCNet(model_config: Dict)

Bases: VCNetBase

Class for the VCNet model architecture (immutable version). VCNet is a joint learning architecture: during the training phase, both the classifier and the counterfactual generators are fitted.

classif(z: tensor, x: tensor, x_mut: tensor, x_immut: tensor) tensor

Forward function of the classification layers. It predicts the class of an example z prepared by the encode_classif function.

Args:

z (torch.tensor): examples represented in their latent space for classification.

Returns:

torch.tensor: example classification. Dimension self.class_size.

loss_functions(recon_x, x, mu, logvar, output_class, y_true)
pre_encode(x_mut: tensor, x_immut: tensor) tensor

Function that prepares the examples (x) with shared pre-coding layers.

The default behavior is to pass x through unchanged.

training_step(batch, batch_idx)

Training step for lightning

Args:

batch (torch.tensor): batch

batch_idx (torch.tensor): list of example indices

Returns:

float: loss measure for the batch
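For orientation, a generic sketch of what a joint cVAE-plus-classification loss such as loss_functions() typically combines; the exact terms and weights used by VCNet are not specified here, so this is not its actual implementation:

import torch
import torch.nn.functional as F

def joint_loss(recon_x, x, mu, logvar, output_class, y_true):
    # Reconstruction term of the cVAE (MSE is one common choice)
    recon = F.mse_loss(recon_x, x, reduction="sum")
    # KL divergence between N(mu, exp(logvar)) and the prior N(0, I)
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    # Supervised classification term
    clf = F.cross_entropy(output_class, y_true)
    return recon + kld + clf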

class vcnet.models.VCNetBase(model_config: Dict)

Bases: LightningModule

Class for the general VCNet architecture with handling of immutable features. This class is abstract. It specifies a VCNet model with a classifier and a conditional variational auto-encoder (cVAE), and a training procedure. The training procedure of a VCNet architecture consists of training the cVAE in the classical way. The VCNet trick lies in generating counterfactuals by switching the predicted class of an example to generate a modified example using the cVAE.

The VCNet architecture natively handles immutable features.

Note that this VCNet architecture handles only numerical features. The user of this class has to manage the encoding of categorical features outside this class.

The current class implements the cVAE and declares the abstract functions.

classif(z: tensor, x: tensor, x_mut: tensor, x_imm: tensor) tensor

Forward function of the classification layers. It predicts the class of an example z prepared by the encode_classif function.

Args:

z (torch.tensor): examples represented in their latent space for classification.

Returns:

torch.tensor: example classification. Dimension self.class_size.

configure_optimizers()

Setup of the optimizer

counterfactuals(x: tensor) tensor

Generation of counterfactuals for the example x.

Warning

This function has been tested only for binary classification. Its use on multiclass problems has yet to be evaluated.
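A hedged end-to-end sketch combining counterfactuals() with a DataCatalog; the batch layout, the variable names and the use of forward_pred() for labels are assumptions:

x, y = batch                      # assumed batch layout from the dataloader
x_cf = model.counterfactuals(x)   # one counterfactual per example
df_cf = catalog.data_unloader(
    x_cf.detach().numpy(),                           # decoded features
    model.forward_pred(x_cf).argmax(dim=1).numpy(),  # predicted labels (assumed)
)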

decode(z_prime: tensor, c: tensor) tensor

C-VAE decoding, computes P(x|z, c)

Args:

z_prime (torch.tensor): latent code to decode.

c (torch.tensor): conditioning of the VAE. For VCNet, the decoding is conditioned by the class and the immutable features [class, x_immutable]. Its dimension is therefore class_size - 1 + len(x_immutable).

Returns:

torch.tensor: reconstructed example.

encode(z: tensor, x_mut: tensor, x_immut: tensor) tensor

C-VAE encoding

Args:

z (torch.tensor): pre-encoded input representation. None or tensor of size defined by latent_size_share.

x_mut (torch.tensor): mutable part of the input tensor.

x_immut (torch.tensor): immutable part of the input tensor.

Returns:

tuple(torch.tensor, torch.tensor): representation of the gaussian distribution in the latent space (mu, sigma). Tensors of dimension latent_size.

forward(x: tensor)

Forward function used during the training phase of a VCNet model. It goes through the three parts of the model: the pre-coding, the C-VAE and the classification. Finally, it returns the reconstructed example, the output class and the VAE distribution parameters.

Args:

x (torch.tensor): input examples

forward_pred(x: tensor) tensor

Forward function for prediction in the test phase (prediction task). It prepares the examples and then classifies them.

Args:

x (torch.tensor): input examples.

loss_functions(recon_x, x, mu, sigma)
pre_encode(x_mut: tensor, x_imm: tensor) tensor

Function that prepares the examples (x) with shared pre-coding layers.

The default behavior is to pass x through unchanged.

reparameterize(mu: tensor, sigma: tensor) tensor

C-VAE reparameterization trick

Args:

mu (torch.tensor): size latent_size

sigma (torch.tensor): size latent_size

Returns:

torch.tensor: size latent_size
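The standard form of this trick, shown as a generic sketch (not necessarily the exact implementation used here):

import torch

def reparameterize(mu: torch.Tensor, sigma: torch.Tensor) -> torch.Tensor:
    # Sample eps ~ N(0, I) with the same shape as sigma, then shift and
    # scale it; the result is a differentiable sample from N(mu, sigma^2).
    eps = torch.randn_like(sigma)
    return mu + eps * sigma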

training_step(batch: tensor, batch_idx) float

Training step for lightning

Args:

batch (torch.tensor): batch

batch_idx (torch.tensor): list of example indices

Returns:

float: loss measure for the batch

Module contents

The top-level vcnet package re-exports the main classes from its submodules: DataCatalog (from vcnet.data), PHVCNet and VCNet (from vcnet.models), and sklearnClassifier (from vcnet.classifiers). Their full documentation is given in the submodule sections above.