Study of the plausibility of generated counterfactuals

The objective of this notebook is to study the plausibility of the counterfactuals generated by VCNet. We want to assess two properties of the generated counterfactuals:

Are the generated counterfactuals in the distribution of the examples? If not, some generated examples may be seen as unrealistic.
Are the generated counterfactuals diverse? If not, VCNet always generates the same counterfactuals, which would indicate overfitting when learning the distribution of instances.
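
The first property can also be checked numerically. One possible proxy, sketched below, is the distance from each counterfactual to its nearest real example of the target class: the larger it is, the further the counterfactual lies from the data. The function name and the random arrays are hypothetical stand-ins, not part of the vcnet package.

```python
import numpy as np

def nearest_neighbor_distance(candidates, reference):
    """Euclidean distance from each candidate row to its closest reference row."""
    # Pairwise differences, shape (n_candidates, n_reference, n_features)
    diffs = candidates[:, None, :] - reference[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    return dists.min(axis=1)

rng = np.random.default_rng(0)
# Hypothetical stand-ins: real examples of the target class,
# and two sets of counterfactuals (one realistic, one not)
target_class = rng.normal(loc=0.0, scale=1.0, size=(500, 5))
plausible_cf = rng.normal(loc=0.0, scale=1.0, size=(100, 5))
implausible_cf = rng.normal(loc=8.0, scale=1.0, size=(100, 5))

d_plausible = nearest_neighbor_distance(plausible_cf, target_class).mean()
d_implausible = nearest_neighbor_distance(implausible_cf, target_class).mean()
print(f"mean NN distance, in-distribution:  {d_plausible:.2f}")
print(f"mean NN distance, off-distribution: {d_implausible:.2f}")
```

This is only one of several possible plausibility scores; it complements the visual inspection used in the rest of the notebook.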

In this study, we will use synthetic data and compare the generation of counterfactuals by the Wachter method and by VCNet.

Let us first install some useful libraries for these experiments, and then load all the required libraries.

[ ]:

!pip install openTSNE
!pip install seaborn
!pip install tensorboard
!pip install tensorboardX

[ ]:

import pandas as pd
import numpy as np
import torch
import matplotlib.pyplot as plt
import seaborn as sns
import lightning as L
from lightning.pytorch.loggers import TensorBoardLogger
from openTSNE import TSNE

[ ]:

from vcnet import DataCatalog, VCNet
from vcnet import PHVCNet
from vcnet import SKLearnClassifier

Test VCNet on synthetic data

In this section, we test VCNet on simple synthetic datasets. The datasets are blobs with two classes: class centers are drawn at random from a uniform distribution, and samples are then generated around each center according to a normal distribution whose per-feature standard deviation is drawn once per class.

[ ]:

np.random.seed(531)

class_size = 2
nbpd = 1000
num_features = 5

[ ]:

class_centers = np.random.uniform(low=-10, high=10, size=(class_size, num_features) )
class_spreads = np.random.uniform(low=0, high=3, size=(class_size, num_features) )

print(class_centers)

[ ]:

samples = []
labels = []

for c in range(class_size):
    # center and per-feature spread (standard deviation) of class c
    class_center = class_centers[c]  # (num_features,)
    scale = class_spreads[c]  # (num_features,)

    # sample
    generated_samples = np.random.normal(loc=class_center, scale=scale, size=(nbpd // class_size, num_features))
    samples.append(generated_samples)
    labels.append(np.full(nbpd // class_size, c))  # y

samples = np.vstack(samples)  # (nbpd, num_features)
labels = np.concatenate(labels)  # (nbpd,)

# create a dataframe
data = pd.DataFrame(samples, columns=[f"x_{i+1}" for i in range(num_features)])
data["y"] = labels

# shuffle rows
data = data.sample(frac=1).reset_index(drop=True)
data.head()

[ ]:

sns.scatterplot(data=data, x="x_1", y="x_3", hue="y")

[ ]:

dataset_settings = {
    "target":"y",
    "continuous" : data.columns[:-1].to_list(),
    "categorical": [],
    "immutables" : [],
    "scaling_method": "MinMax",
    'encoding_method': "Identity"
}
dataset = DataCatalog(dataset_settings)

[ ]:

dataset_settings = dataset.prepare_data(data)
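
The `"scaling_method": "MinMax"` setting presumably rescales each continuous feature to the range [0, 1]. How `DataCatalog` implements it is not shown here, so the following is only a sketch of the usual min-max transformation and its inverse, with statistics taken from the training split; all names in it are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical train/test splits with two features on different scales
train = rng.normal(loc=[0.0, 5.0], scale=[1.0, 3.0], size=(200, 2))
test = rng.normal(loc=[0.0, 5.0], scale=[1.0, 3.0], size=(50, 2))

# The statistics come from the training split only
feat_min = train.min(axis=0)
feat_max = train.max(axis=0)

def scale(x):
    """Map each feature so that the training minimum is 0 and the maximum is 1."""
    return (x - feat_min) / (feat_max - feat_min)

def unscale(z):
    """Inverse of scale: map scaled values back to the original units."""
    return z * (feat_max - feat_min) + feat_min

train_scaled = scale(train)
test_scaled = scale(test)      # may fall slightly outside [0, 1]
restored = unscale(test_scaled)
print(train_scaled.min(axis=0), train_scaled.max(axis=0))
```

An inverse step of this kind is what makes it possible to read generated counterfactuals in the original feature units.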

Learn the classifier

We now train a basic random forest classifier, which is expected to reach good accuracy on these easily separable datasets.

[ ]:

hp = {
    "dataset": dataset_settings,
    "classifier_params": {
        "skname": "RandomForestClassifier",
        "kwargs": {
            "n_estimators": 50,
        }
    }
}

classifier = SKLearnClassifier(hp)
classifier.fit(dataset.df_train)

Learn a post-hoc counterfactual generator

We now train a post-hoc VCNet model on top of the fitted classifier to generate realistic counterfactuals.

[ ]:

hp["vcnet_params"]= {
        "lr": 1e-3,
        "epochs": 200,
        "lambda_KLD": 5,
        "lambda_BCE": 1,
        "latent_size": 16,
        "latent_size_share": 64,
        "mid_reduce_size": 32
    }

# Define the post-hoc VCNet model
vcnet = PHVCNet(hp, classifier)

# Finally, fit it using a Lightning module
logger = TensorBoardLogger("tb_logs", name="PHVCNet")
trainer = L.Trainer(max_epochs=hp['vcnet_params']['epochs'], enable_checkpointing=False, logger=logger)
trainer.fit(model=vcnet, train_dataloaders=dataset.train_dataloader())
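
The `lambda_KLD` hyperparameter suggests that the training loss contains a Kullback-Leibler term, as in a variational autoencoder. The exact loss used by the package is not shown in this notebook; the sketch below only illustrates the standard closed-form KL divergence between a diagonal Gaussian and a standard normal, and how a weight such as `lambda_KLD` would scale it. The reconstruction term (mean squared error) and all function names are assumptions.

```python
import numpy as np

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), one value per row."""
    return -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar), axis=1)

def weighted_vae_loss(x, x_recon, mu, logvar, lambda_kld):
    # Mean squared reconstruction error per sample (an assumed choice)
    recon = np.mean((x - x_recon) ** 2, axis=1)
    kld = kl_to_standard_normal(mu, logvar)
    # Batch average of reconstruction error plus weighted KL term
    return np.mean(recon + lambda_kld * kld)

rng = np.random.default_rng(0)
x = rng.uniform(size=(8, 5))                       # batch of scaled inputs
x_recon = x + rng.normal(scale=0.1, size=x.shape)  # imperfect reconstruction
mu = rng.normal(scale=0.5, size=(8, 16))           # latent means
logvar = rng.normal(scale=0.1, size=(8, 16))       # latent log-variances

loss_low = weighted_vae_loss(x, x_recon, mu, logvar, lambda_kld=0.5)
loss_high = weighted_vae_loss(x, x_recon, mu, logvar, lambda_kld=5.0)
print(f"lambda_KLD=0.5: {loss_low:.3f}, lambda_KLD=5: {loss_high:.3f}")
```

A larger weight pulls the latent distribution closer to the prior, usually at the cost of reconstruction quality.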

Some basic verification on the test set

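We finally check the prediction accuracy of the classifier and the validity of the generated counterfactuals. A counterfactual is counted as valid when its predicted class differs from the class predicted for the original example.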

[ ]:

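vcnet.eval()
# note: cf, clcf and rlcf keep the values of the last batch after the loop
for ldata, labels in dataset.test_dataloader():
    # classifier outputs for the original examples
    cl = vcnet.forward_pred(ldata)
    # counterfactuals and the classifier outputs for them
    cf, clcf = vcnet.counterfactuals(ldata)
    rlcf = dataset.data_unloader(cf, clcf)

    # accuracy: the thresholded prediction matches the true label
    acc = torch.sum((cl[:, 0] > 0.5).int() == labels[:, 0]) / len(ldata)
    # validity: the counterfactual is predicted in a different class than the original
    validity = torch.sum((cl[:, 0] > 0.5).int() != (clcf[:, 0] > 0.5).int()) / len(ldata)
    print(f"Accuracy: {acc}, validity:{validity}")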
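
The loop above reports accuracy and validity batch by batch. If the test loader yields several batches, pooling the outputs first gives a single figure for the whole test set. The sketch below shows the idea with NumPy arrays; the batches and the function name are hypothetical, and the 0.5 threshold mirrors the one used in the notebook.

```python
import numpy as np

def accuracy_and_validity(pred_probs, cf_probs, labels, threshold=0.5):
    """Accuracy of the predictions and share of counterfactuals that flip the class."""
    pred = (pred_probs > threshold).astype(int)
    cf_pred = (cf_probs > threshold).astype(int)
    accuracy = np.mean(pred == labels)
    validity = np.mean(pred != cf_pred)
    return accuracy, validity

# Hypothetical batches: (output on original, output on counterfactual, true label)
batches = [
    (np.array([0.9, 0.2, 0.7]), np.array([0.1, 0.8, 0.6]), np.array([1, 0, 0])),
    (np.array([0.4, 0.95]), np.array([0.7, 0.3]), np.array([0, 1])),
]

# Pool all batches before computing the metrics
pred_probs = np.concatenate([b[0] for b in batches])
cf_probs = np.concatenate([b[1] for b in batches])
labels_all = np.concatenate([b[2] for b in batches])

acc, validity = accuracy_and_validity(pred_probs, cf_probs, labels_all)
print(f"Accuracy: {acc:.2f}, validity: {validity:.2f}")
```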

[ ]:

#torch.dstack((cl[:, 0],clcf[:, 0]))

[ ]:

rlcf

Visualisation of the spread of generated counterfactuals: realism and diversity

In this visualisation, we compute a t-SNE projection to visualize the dataset in 2 dimensions.

The t-SNE projection is computed on the training set; we then reuse the same projection to visualize the test set (x) and its counterfactuals (o) in the same space.

[ ]:

# TSNE
tsne = TSNE(
    perplexity=30,
    metric="euclidean",
    n_jobs=8,
    random_state=42,
    verbose=True,
)

# Create a projection from the Train dataset
embedding_train = tsne.fit( dataset.df_train.drop(columns='y').to_numpy() )

[ ]:

# use the projection to project the generated counterfactuals of the test set
embedding_cf_test = embedding_train.transform( cf.numpy() )

[ ]:

# use the same projection to project the data of the test set
embedding_test = embedding_train.transform( dataset.df_test.drop(columns='y').to_numpy() )
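
The key point of the cells above is that the projection is fitted once, on the training data, and then reused unchanged for the test examples and their counterfactuals, so that all three sets share the same 2-D space. The sketch below illustrates this fit-once, transform-many pattern with a plain PCA written in NumPy. PCA is a stand-in chosen to keep the example dependency-free; it is not the t-SNE projection used in this notebook, and the class name and data are hypothetical.

```python
import numpy as np

class PCAProjection:
    """Linear 2-D projection fitted once, then reused on new data."""

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        # Rows of vt are the principal directions, by decreasing variance
        _, _, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = vt[:2]
        return self

    def transform(self, X):
        # Center with the training mean, then project on the training axes
        return (X - self.mean_) @ self.components_.T

rng = np.random.default_rng(0)
train = rng.normal(size=(300, 5))
test = rng.normal(size=(60, 5))
counterfactuals = test + rng.normal(scale=0.3, size=test.shape)

proj = PCAProjection().fit(train)         # fitted on the training set only
emb_train = proj.transform(train)
emb_test = proj.transform(test)           # same axes as the training set
emb_cf = proj.transform(counterfactuals)  # same axes again
print(emb_train.shape, emb_test.shape, emb_cf.shape)
```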

[ ]:

sns.color_palette("Paired")
plotvalues = pd.DataFrame({'x':embedding_train[:,0], 'y':embedding_train[:,1], 'c':dataset.df_train['y']})
fig=sns.kdeplot(data=plotvalues, x="x", y="y", hue="c", fill=False, levels=5)
plt.scatter( embedding_test[:,0], embedding_test[:,1], c=(clcf[:, 0] > 0.5).int(), cmap='Paired', marker='x')
plt.scatter( embedding_cf_test[:,0], embedding_cf_test[:,1], c=(clcf[:, 0] > 0.5).int(), cmap='Paired', marker='o')
fig.axes.get_xaxis().set_visible(False)
fig.axes.get_yaxis().set_visible(False)
plt.legend('',frameon=False)
plt.show()

Same experiment, with VCNet

The post-hoc version seems to work correctly. We now investigate the joint-learning model of VCNet, which trains the classifier and the counterfactual generator together.

[ ]:

hp["vcnet_params"]= {
        "lr":  2e-3,
        "epochs" : 10,
        "lambda_KLD": 0.5,
        "lambda_CE": 0.93,
        "lambda_BCE": 1,
        "latent_size" : 19,
        "latent_size_share" :  304,
        "mid_reduce_size" : 152
    }

vcnet = VCNet(hp)

[ ]:

# Finally, fit it using a Lightning module
trainer = L.Trainer(max_epochs=hp['vcnet_params']['epochs'])
trainer.fit(model=vcnet, train_dataloaders=dataset.train_dataloader())

[ ]:

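vcnet.eval()
# the batch variable is named ldata so that the `data` DataFrame is not overwritten
for ldata, labels in dataset.test_dataloader():
    cl = vcnet.forward_pred(ldata)
    cf, clcf = vcnet.counterfactuals(ldata)
    rlcf = dataset.data_unloader(cf, clcf)

    # accuracy: the thresholded prediction matches the true label
    acc = torch.sum((cl[:, 0] > 0.5).int() == labels[:, 0]) / len(ldata)
    # validity: the counterfactual is predicted in a different class than the original
    validity = torch.sum((cl[:, 0] > 0.5).int() != (clcf[:, 0] > 0.5).int()) / len(ldata)
    print(f"Accuracy: {acc}, validity:{validity}")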

[ ]:

# we simply project the new counterfactuals into the same space as before
embedding_cf_test = embedding_train.transform( cf.numpy() )

[ ]:

sns.color_palette("Paired")
plotvalues = pd.DataFrame({'x':embedding_train[:,0], 'y':embedding_train[:,1], 'c':dataset.df_train['y']})
fig=sns.kdeplot(data=plotvalues, x="x", y="y", hue="c", fill=False, levels=5)
plt.scatter( embedding_test[:,0], embedding_test[:,1], c=(clcf[:, 0] > 0.5).int(), cmap='Paired', marker='x')
plt.scatter( embedding_cf_test[:,0], embedding_cf_test[:,1], c=(clcf[:, 0] > 0.5).int(), cmap='Paired', marker='o')
fig.axes.get_xaxis().set_visible(False)
fig.axes.get_yaxis().set_visible(False)
plt.legend('',frameon=False)
plt.show()

With the joint version, the spread of the counterfactuals seems better than with the post-hoc version.
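
This visual impression can be complemented by a number. One simple option, sketched below, is the mean pairwise distance between counterfactuals: a value close to zero means that the generator collapses onto nearly identical points. The function name and the random arrays are hypothetical and only illustrate the computation.

```python
import numpy as np

def mean_pairwise_distance(points):
    """Average Euclidean distance over all pairs of distinct rows."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    n = len(points)
    # The diagonal is zero, so dividing the full sum by n * (n - 1)
    # averages the off-diagonal entries only
    return dists.sum() / (n * (n - 1))

rng = np.random.default_rng(0)
# Hypothetical stand-ins for two sets of generated counterfactuals
collapsed_cf = rng.normal(loc=0.0, scale=0.05, size=(200, 5))
spread_cf = rng.normal(loc=0.0, scale=1.0, size=(200, 5))

div_collapsed = mean_pairwise_distance(collapsed_cf)
div_spread = mean_pairwise_distance(spread_cf)
print(f"collapsed: {div_collapsed:.3f}, spread: {div_spread:.3f}")
```

In this notebook, the same function could be applied to `cf.numpy()` after each of the two models, computed separately per target class so that the distance between the two blobs does not dominate the score.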