Population Sampling for Bias Modeling

The first step in bias modeling is to create a population object.

import mlsim

pop = mlsim.bias.Population()

The default population is not completely i.i.d. and balanced. Every population is defined by four variables: \(A\), \(Z\), \(Y\), and \(X\). A population has a sample method and an attribute for each component sampler.

Object Structure

class mlsim.bias.Population(demographic_sampler=<class 'mlsim.bias.bias_components.Demographic'>, target_sampler=<class 'mlsim.bias.bias_components.Target'>, feature_sampler=<class 'mlsim.bias.bias_components.Feature'>, feature_noise_sampler=<class 'mlsim.bias.bias_components.FeatureNoise'>, parameter_dictionary={})

Object describing a population, so that both samples from the population and biased samples can be drawn, given a sampler type and a parameter dictionary.

The constructor takes one sampler for each factor of the joint data distribution.
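As a rough illustration of what "one sampler per factor" could look like, the sketch below chains four toy samplers using only the Python standard library. It does not use mlsim and is not its implementation: the class name ToyPopulation, the sampler functions, the assumed factorization, and the assumed roles of the variables (A as a demographic group, Z as a true target, Y as an observed label, X as a feature) are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical stand-ins, NOT mlsim's samplers. Assumed roles:
# A = demographic group, Z = true target, Y = observed label, X = feature.


def sample_demographic(rng):
    # P(A): two equally likely groups
    return rng.choice([0, 1])


def sample_target(rng, a):
    # P(Z | A): independent of A in this toy case
    return int(rng.random() < 0.5)


def sample_label(rng, z, a):
    # P(Y | Z, A): the label copies the true target (no label noise)
    return z


def sample_feature(rng, z, a):
    # P(X | Z, A): a Gaussian feature centred on the target value
    return rng.gauss(float(z), 1.0)


class ToyPopulation:
    """Chains one sampler per assumed factor: P(A) P(Z|A) P(Y|Z,A) P(X|Z,A)."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def sample(self, n):
        rows = []
        for _ in range(n):
            a = sample_demographic(self.rng)
            z = sample_target(self.rng, a)
            y = sample_label(self.rng, z, a)
            x = sample_feature(self.rng, z, a)
            rows.append({"A": a, "Z": z, "Y": y, "X": x})
        return rows


toy_rows = ToyPopulation(seed=0).sample(1000)
```

Swapping any single function, for example making sample_target depend on a, changes one factor of the joint distribution while leaving the others untouched, which is the appeal of composing a population from separate samplers.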

Sampling bias

Populations also have samplers that insert sampling biases rather than population-level biases. This makes it possible to create a population with one set of biases and then use the same object to draw further datasets whose samples are additionally biased. For example, you may want training and audit datasets with different distributions, to demonstrate the impact of biased sampling at one of those stages.
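The difference between a population-level bias and a sampling bias can be sketched without mlsim. In the standard-library example below, a single toy population is drawn once, and two datasets are then selected from it with different group-dependent keep probabilities. The function biased_sample, the helper share_of_group, and the chosen probabilities are hypothetical, and A is assumed to denote a demographic group.

```python
import random

rng = random.Random(0)

# Hypothetical toy population: each row has a group A and a label Y,
# both drawn uniformly, so the population itself is balanced.
population = [
    {"A": rng.choice([0, 1]), "Y": rng.choice([0, 1])}
    for _ in range(10000)
]


def biased_sample(rows, keep_prob, rng):
    """Keep each row with a probability that depends on its group A."""
    return [r for r in rows if rng.random() < keep_prob[r["A"]]]


def share_of_group(rows, group):
    """Fraction of rows that belong to the given group."""
    return sum(r["A"] == group for r in rows) / len(rows)


# Audit data: both groups are kept at the same rate (no sampling bias).
audit = biased_sample(population, {0: 0.2, 1: 0.2}, rng)

# Training data: group 1 is under-sampled relative to group 0.
train = biased_sample(population, {0: 0.2, 1: 0.05}, rng)
```

Comparing share_of_group(train, 1) with share_of_group(audit, 1) shows that group 1 is under-represented in the training data even though the underlying population is unchanged.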