API Reference
- mlsim.anomaly.geometric_2d_gmm_sp(r_clusters, cluster_size, cluster_spread, p_sp_clusters, domain_range, k, N, p_clusters=None)
Sample from a Gaussian mixture model with Simpson’s Paradox (SP) and spread means; return the data in a data frame.
- r_clusters : scalar in [0,1]
correlation coefficient of clusters
- cluster_size : 2 vector
variance in each direction of each cluster
- cluster_spread : scalar in [0,1]
Pearson correlation of means
- p_sp_clusters : scalar in [0,1]
portion of clusters with SP
- p_clusters : vector in [0,1)^k, optional
probability of membership of a sample in each cluster (controls the relative size of clusters); default is [1.0/k]*k for uniform
- domain_range : [xmin, xmax, ymin, ymax]
planned region for points to be in; means will be in the middle 80%
- k : integer
number of clusters
- N : scalar
number of points
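The parameter descriptions above can be made concrete with a small standalone sketch. It does not call mlsim and is not the library's implementation: the helper names `inner_region`, `default_p_clusters`, and `sample_cluster` are hypothetical, and reading "middle 80%" as trimming 10% from each side of each axis is an assumption.

```python
import math
import random


def inner_region(domain_range, keep=0.8):
    """Return the central `keep` fraction of [xmin, xmax, ymin, ymax].

    Assumption: "middle 80%" trims an equal margin from both sides of each axis.
    """
    xmin, xmax, ymin, ymax = domain_range
    margin_x = (1.0 - keep) / 2.0 * (xmax - xmin)
    margin_y = (1.0 - keep) / 2.0 * (ymax - ymin)
    return [xmin + margin_x, xmax - margin_x, ymin + margin_y, ymax - margin_y]


def default_p_clusters(k):
    """Uniform membership probabilities, matching the documented default [1.0/k]*k."""
    return [1.0 / k] * k


def sample_cluster(mean, cluster_size, r, n, rng):
    """Draw n points from a 2D Gaussian with per-axis variances `cluster_size`
    and correlation coefficient `r` between the two axes."""
    sx, sy = math.sqrt(cluster_size[0]), math.sqrt(cluster_size[1])
    points = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mean[0] + sx * z1
        # Mixing z1 into y induces correlation r while keeping variance sy**2.
        y = mean[1] + sy * (r * z1 + math.sqrt(1 - r * r) * z2)
        points.append((x, y))
    return points


rng = random.Random(0)
region = inner_region([0, 10, 0, 20])   # region in which cluster means would be placed
weights = default_p_clusters(4)         # four equally likely clusters
pts = sample_cluster(mean=(5.0, 10.0), cluster_size=(1.0, 4.0), r=0.6, n=500, rng=rng)
```

In this reading, `cluster_size` sets how wide each cluster is along each axis, while `r_clusters` sets how strongly the two coordinates move together inside a cluster.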
- mlsim.anomaly.geometric_indep_views_gmm_sp(d, r_clusters, cluster_size, cluster_spread, p_sp_clusters, domain_range, k, N, p_clusters=None, numeric_categorical=False)
Sample from a Gaussian mixture model with Simpson’s Paradox (SP) and spread means; return the data in a data frame.
- d : integer
number of independent views, groups of 3 columns with SP
- r_clusters : scalar in [0,1] or list of d
correlation coefficient of clusters
- cluster_size : 2 vector or list of d
variance in each direction of each cluster
- cluster_spread : scalar in [0,1] or list of d
Pearson correlation of means
- p_sp_clusters : scalar in [0,1] or list of d
portion of clusters with SP
- p_clusters : vector in [0,1)^k or list of d vectors, optional
probability of membership of a sample in each cluster (controls the relative size of clusters); default is [1.0/k]*k for uniform
- domain_range : [xmin, xmax, ymin, ymax] or list of d
planned region for points to be in; means will be in the middle 80%
- k : integer or list of d
number of clusters
- N : scalar
number of points, shared across all views
- numeric_categorical : bool, default False
use numerical (ordinal) values instead of letters
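Most parameters above accept either a single setting or a "list of d". A minimal sketch of how such an argument could be expanded to one value per view follows. The helper `per_view` is hypothetical and not part of mlsim; whether the library broadcasts single values this way is an assumption.

```python
def per_view(value, d, is_single=None):
    """Expand `value` to a list with one entry per view.

    A single setting is repeated d times; a list is accepted only if it
    already has exactly d entries. `is_single` decides which case applies.
    """
    if is_single is None:
        # Default rule: anything that is not a list or tuple is a single setting.
        is_single = lambda v: not isinstance(v, (list, tuple))
    if is_single(value):
        return [value] * d
    if len(value) != d:
        raise ValueError(f"expected {d} per-view values, got {len(value)}")
    return list(value)


def is_one_vector(v):
    """True for a flat 2 vector such as [1.0, 2.0], False for a list of vectors."""
    return not isinstance(v[0], (list, tuple))


d = 3
r_per_view = per_view(0.7, d)                                  # scalar repeated for each view
k_per_view = per_view([2, 3, 4], d)                            # already one value per view
size_per_view = per_view([1.0, 2.0], d, is_single=is_one_vector)  # one 2 vector shared by all views
```

Vector-valued parameters such as cluster_size need the extra predicate because a flat 2 vector and a list of d scalars are otherwise indistinguishable when d is 2.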
- mlsim.anomaly.plot_clustermat(z, fmt=None)
Black-and-white matshow for clustering and feature allocation matrices.
- Parameters:
z (nparray, square to be plotted)
fmt (str; if z is not a square, names the format z is in)
fmt options:
- ‘crplist’ : a list of values from zero to k
- ‘ibplist’ : a list of lists of varying lengths
- ‘list’ : a list rather than an nparray, but otherwise ready to plot
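To illustrate the two list formats, the sketch below turns each into a binary matrix with one row per sample. This is one plausible interpretation and not necessarily what plot_clustermat does internally: the helpers `crplist_to_matrix` and `ibplist_to_matrix` are hypothetical, and treating each inner list of an ‘ibplist’ as the feature indices of one sample is an assumption.

```python
def crplist_to_matrix(assignments):
    """One row per sample, one column per cluster; a 1 marks the sample's cluster."""
    k = max(assignments) + 1
    return [[1 if a == c else 0 for c in range(k)] for a in assignments]


def ibplist_to_matrix(features):
    """One row per sample, one column per feature; a 1 marks each feature held.

    Inner lists may differ in length, and an empty list gives an all-zero row.
    """
    k = max((max(f) for f in features if f), default=-1) + 1
    return [[1 if c in f else 0 for c in range(k)] for f in features]


crp = [0, 1, 1, 2]             # each sample belongs to exactly one cluster
ibp = [[0], [0, 2], [], [1, 2]]  # each sample may hold several features, or none

crp_matrix = crplist_to_matrix(crp)
ibp_matrix = ibplist_to_matrix(ibp)
```

A clustering gives exactly one 1 per row, while a feature allocation can give any number, which is why the second format needs lists of varying lengths.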