Population Sampling for Bias Modeling
The basic entity in bias modeling is the population object, created as follows:
import mlsim
pop = mlsim.bias.Population()
The default is a completely iid and balanced population. All populations are
defined by the following variables: \(A\), \(Z\), \(Y\), \(X\). A population has a sample method and attributes for each component sampler.
Object Structure
- class mlsim.bias.Population(demographic_sampler=<class 'mlsim.bias.bias_components.Demographic'>, target_sampler=<class 'mlsim.bias.bias_components.Target'>, feature_sampler=<class 'mlsim.bias.bias_components.Feature'>, feature_noise_sampler=<class 'mlsim.bias.bias_components.FeatureNoise'>, parameter_dictionary={})
Object describing a population, built from a sampler type for each component and a parameter dictionary, so that both samples from the population and biased samples can be drawn.
It takes one sampler for each factor of the joint data distribution.
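The idea of composing one sampler per factor of the joint distribution can be sketched with plain Python. This is a toy illustration only, not mlsim's implementation: the function names, the parameter keys, and the roles assigned to the variables (\(A\) as a group attribute, \(Z\) as a latent label, \(Y\) as an observed label, \(X\) as a single feature) are all assumptions made for the example.

```python
import random

# Hypothetical toy samplers, one per factor of P(A) P(Z|A) P(Y|A,Z) P(X|A,Z,Y).
# None of these names come from mlsim; they only illustrate the composition.

def sample_a(rng, params):
    """Draw A ~ Bernoulli(p_a)."""
    return int(rng.random() < params["p_a"])

def sample_z(rng, a, params):
    """Draw Z | A ~ Bernoulli(p_z[a])."""
    return int(rng.random() < params["p_z"][a])

def sample_y(rng, a, z, params):
    """Draw Y | A, Z: copy Z, flipping it with a group-dependent probability."""
    return 1 - z if rng.random() < params["flip"][a] else z

def sample_x(rng, a, z, y, params):
    """Draw X | A, Z, Y: a Gaussian feature centered on the latent label Z."""
    return rng.gauss(float(z), params["noise_sd"])

def sample_population(n, params, seed=0):
    """Draw n rows (a, z, y, x) by chaining the four factor samplers."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = sample_a(rng, params)
        z = sample_z(rng, a, params)
        y = sample_y(rng, a, z, params)
        x = sample_x(rng, a, z, y, params)
        rows.append((a, z, y, x))
    return rows

# Balanced, unbiased toy settings: equal groups, equal base rates, no label flips.
unbiased = {"p_a": 0.5, "p_z": (0.5, 0.5), "flip": (0.0, 0.0), "noise_sd": 1.0}
rows = sample_population(1000, unbiased)
```

Swapping one function or one parameter (for example, a nonzero `flip` for a single group) changes one factor of the joint distribution and leaves the others untouched, which is the appeal of a one-sampler-per-factor design.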
Sampling bias
Populations also have samplers that introduce sampling biases rather than population-level biases. This makes it possible to create a population with one set of biases and then use the same object to draw further datasets whose samples are additionally biased. For example, you may want training and audit datasets with different distributions, to demonstrate the impact of biased sampling at one of those stages.
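The difference between a population-level bias and a sampling bias can be illustrated with a small standalone sketch. It does not use mlsim's sampling-bias API; the helper names and the `keep_prob` parameter are hypothetical. The same balanced population is drawn once, and two datasets are then taken from it with different group-dependent retention rates.

```python
import random

def draw_population(n, seed=0):
    """Balanced toy population of (a, y) pairs from independent fair coin flips."""
    rng = random.Random(seed)
    return [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(n)]

def biased_subsample(rows, keep_prob, seed=1):
    """Keep each row with a probability that depends on its group a."""
    rng = random.Random(seed)
    return [row for row in rows if rng.random() < keep_prob[row[0]]]

def share_of_group_1(rows):
    """Fraction of rows whose group attribute a equals 1."""
    return sum(a for a, _ in rows) / len(rows)

population = draw_population(10_000)

# Training data under-samples group 1; audit data keeps every row.
train = biased_subsample(population, keep_prob={0: 1.0, 1: 0.2})
audit = biased_subsample(population, keep_prob={0: 1.0, 1: 1.0})

print(f"group-1 share in train: {share_of_group_1(train):.2f}")
print(f"group-1 share in audit: {share_of_group_1(audit):.2f}")
```

The underlying population is unchanged in both cases. Only the selection step differs, so any gap between the two datasets is attributable to sampling rather than to the population itself.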