mimic._MimicCalibration

class mimic._MimicCalibration(threshold_pos=5, record_history=False)[source]

Mimic calibration: a method for calibrating the predicted probabilities of a binary classification model.

__init__(self, threshold_pos=5, record_history=False)[source]
Parameters:
threshold_pos: int

The number of positives in each bin at the initial binning step.

record_history: bool

Whether to record the bin-merging process.

construct_initial_bin(self, sorted_score, sorted_target, threshold_pos)[source]

Build the initial bins so that each bin contains threshold_pos positives (default 5).

Parameters:
sorted_score: the probabilities from the model, sorted in increasing order,

i.e. the pre-calibration scores.

sorted_target: the targets, in order of increasing score.

The target takes two values (binary).

threshold_pos: number of positives in each bin, default=5
Returns:
bin_info: 2-D array, shape (number of bins, 7).
[[bl_index, score_min, score_max, score_mean, nPos_temp, total_temp, nPosRate_temp]]

total_number_pos: integer

The total number of positives.
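
The initial binning step can be sketched in plain Python. This is an illustration only, not the library's implementation: `make_row` and `initial_bins` are hypothetical helper names, and placing leftover samples in a final, smaller bin is an assumption, since the text above does not say how the tail is handled.

```python
def make_row(index, scores, n_pos):
    """Summarize one bin as [index, min, max, mean, n_pos, total, pos_rate]."""
    total = len(scores)
    # scores arrive sorted, so the first and last entries are the min and max.
    return [index, scores[0], scores[-1], sum(scores) / total,
            n_pos, total, n_pos / total]


def initial_bins(sorted_score, sorted_target, threshold_pos=5):
    """Hypothetical helper: close a bin once it holds threshold_pos positives."""
    bins, scores, n_pos = [], [], 0
    for score, target in zip(sorted_score, sorted_target):
        scores.append(score)
        n_pos += target
        if n_pos == threshold_pos:
            bins.append(make_row(len(bins), scores, n_pos))
            scores, n_pos = [], 0
    if scores:
        # Assumption: leftover samples form a final, smaller bin.
        bins.append(make_row(len(bins), scores, n_pos))
    return bins, sum(row[4] for row in bins)


scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
targets = [0, 1, 0, 1, 1, 1]
bin_info, total_number_pos = initial_bins(scores, targets, threshold_pos=2)
```

With threshold_pos=2, the first four samples form one bin (two positives out of four) and the last two form a second bin.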

fit(self, X, y, sample_weight=None)[source]

Perform mimic calibration.

Parameters:
X: array-like, shape (number of rows, 1)

the predicted probabilities from the binary model.

y: array-like, shape (number of rows, 1)

binary target; each element is 0 or 1.

Returns:
self : object

Returns an instance of self.

get_bin_boundary(self, current_binning, boundary_choice)[source]
Parameters:
current_binning: array-like, shape (num_bins, 7)
[[bl_index, score_min, score_max, score_mean, nPos_temp, total_temp, PosRate_temp]]

boundary_choice: int

0: choose score_min, i.e. the left boundary of the bin
1: choose score_max, i.e. the right boundary of the bin
2: choose score_mean, i.e. the mean score of the bin

Returns:
boundary_table: array-like, shape (num_bins, 1)
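
As a minimal sketch of the boundary_choice mapping, assuming the seven-column row layout listed above: `bin_boundary` and `BOUNDARY_COLUMN` are hypothetical names, and nested lists stand in for the (num_bins, 1) array.

```python
# Columns 1, 2 and 3 of a binning row hold score_min, score_max and score_mean.
BOUNDARY_COLUMN = {0: 1, 1: 2, 2: 3}


def bin_boundary(current_binning, boundary_choice):
    """Hypothetical helper: pick one score column, one value per bin."""
    column = BOUNDARY_COLUMN[boundary_choice]
    return [[row[column]] for row in current_binning]


# Two example bins: [index, min, max, mean, n_pos, total, pos_rate]
example_binning = [
    [0, 0.1, 0.4, 0.25, 2, 4, 0.5],
    [1, 0.5, 0.6, 0.55, 2, 2, 1.0],
]
left = bin_boundary(example_binning, 0)
right = bin_boundary(example_binning, 1)
mean = bin_boundary(example_binning, 2)
```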
get_params(self, deep=True)

Get parameters for this estimator.

Parameters:
deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : mapping of string to any

Parameter names mapped to their values.

merge_bins(self, binning_input, increasing_flag)[source]
Parameters:
binning_input: array-like, shape (number of bins, 7)
[[bl_index, score_min, score_max, score_mean, nPos_temp, total_temp, PosRate_temp]]

increasing_flag: bool
Returns:
result: array-like, shape (number of bins, 7)

The bins, merged so that the positive rate increases from bin to bin.

increasing_flag: bool
output_history_result(self, show_history_array=[])[source]

Output the merging history.

Parameters:
show_history_array: array-like

the given history index.
Returns:
score-posRate-array : array-like

[[score_array, nPosRate_array, i]]

predict(self, pre_calib_prob)[source]

Prediction function of mimic calibration. Returns a 1-D array of probabilities calibrated with mimic calibration.

Parameters:
pre_calib_prob: array-like

the probability prediction from the binary model.

Returns:
calib_prob : array-like

the mimic-calibrated probability.
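
The text above does not state how a score is mapped to a calibrated probability. The sketch below assumes piecewise-linear interpolation between per-bin boundary scores and per-bin positive rates, clamped at both ends. That rule is an assumption, and `interpolate` and `calibrate` are hypothetical names, not part of the class.

```python
from bisect import bisect_right


def interpolate(x, xs, ys):
    """Piecewise-linear interpolation over increasing xs, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)  # now xs[i - 1] <= x < xs[i]
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def calibrate(pre_calib_prob, boundaries, pos_rates):
    """Assumed rule: map each raw score through the bin boundary/rate pairs."""
    return [interpolate(p, boundaries, pos_rates) for p in pre_calib_prob]


boundaries = [0.2, 0.6]   # e.g. one boundary score per bin
pos_rates = [0.1, 0.5]    # positive rate of each bin
calib_prob = calibrate([0.0, 0.4, 1.0], boundaries, pos_rates)
```

Scores below the first boundary or above the last one take the positive rate of the nearest bin. A score halfway between two boundaries lands halfway between the two rates.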

run_merge_function(self, current_binning, record_history=False)[source]

Keeps merging bins until the positive rate increases from bin to bin.

Parameters:
current_binning: array-like, shape (number of bins, 7)
[[bl_index, score_min, score_max, score_mean, nPos_temp, total_temp, PosRate_temp]]

record_history: bool
Returns:
result: array-like, shape (number of bins, 7)

The final binning result.
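
The merge loop can be sketched as repeated left-to-right passes. This is an illustration under stated assumptions, not the library's code: `merge_rows`, `merge_once` and `merge_until_increasing` are hypothetical names, a bin is merged into its predecessor whenever its positive rate is not strictly larger, and the merged mean score is the count-weighted mean of the two bins.

```python
def merge_rows(a, b):
    """Combine two adjacent bins [index, min, max, mean, n_pos, total, rate]."""
    n_pos, total = a[4] + b[4], a[5] + b[5]
    mean = (a[3] * a[5] + b[3] * b[5]) / total  # count-weighted mean score
    return [a[0], a[1], b[2], mean, n_pos, total, n_pos / total]


def merge_once(binning):
    """One pass; returns (bins, True if the rates were already increasing)."""
    merged, is_increasing = [list(binning[0])], True
    for row in binning[1:]:
        if row[6] > merged[-1][6]:
            merged.append(list(row))
        else:
            # Assumption: a non-increasing rate triggers a merge.
            is_increasing = False
            merged[-1] = merge_rows(merged[-1], row)
    for i, row in enumerate(merged):
        row[0] = i  # renumber the bins after merging
    return merged, is_increasing


def merge_until_increasing(binning):
    """Repeat passes until the positive rate increases from bin to bin."""
    is_increasing = False
    while not is_increasing:
        binning, is_increasing = merge_once(binning)
    return binning


start = [
    [0, 0.0, 0.1, 0.05, 2, 10, 0.2],
    [1, 0.1, 0.3, 0.20, 5, 10, 0.5],
    [2, 0.3, 0.5, 0.40, 4, 10, 0.4],
    [3, 0.5, 0.9, 0.70, 9, 10, 0.9],
]
final = merge_until_increasing(start)
final_rates = [row[6] for row in final]
```

Here the third bin (rate 0.4) breaks the ordering, so it is merged with the second bin, leaving three bins. Every pass that finds a violation removes at least one bin, so the loop terminates.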

score(self, X, y, sample_weight=None)

Returns the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.

Parameters:
X : array-like, shape = (n_samples, n_features)

Test samples. For some estimators this may be a precomputed kernel matrix instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

y : array-like, shape = (n_samples) or (n_samples, n_outputs)

True values for X.

sample_weight : array-like, shape = [n_samples], optional

Sample weights.

Returns:
score : float

R^2 of self.predict(X) wrt. y.
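
The R^2 definition above, (1 - u/v), can be checked with a few lines of plain Python. `r_squared` is a hypothetical helper written for this illustration; it ignores sample_weight and is not the estimator's own method.

```python
def r_squared(y_true, y_pred):
    """Compute 1 - u/v, the unweighted coefficient of determination."""
    mean_true = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1 - u / v


y_true = [0, 1, 1, 0]
y_pred = [0.1, 0.8, 0.9, 0.2]
good = r_squared(y_true, y_pred)           # u = 0.1, v = 1.0
constant = r_squared(y_true, [0.5] * 4)    # always predicts the mean of y_true
```

The first call gives approximately 0.9. The constant model gives 0.0, matching the statement above about a model that always predicts the expected value of y.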

Notes

The R2 score used when calling score on a regressor will use multioutput='uniform_average' from version 0.23 to keep consistent with metrics.r2_score. This will influence the score method of all the multioutput regressors (except for multioutput.MultiOutputRegressor). To specify the default value manually and avoid the warning, please either call metrics.r2_score directly or make a custom scorer with metrics.make_scorer (the built-in scorer 'r2' uses multioutput='uniform_average').

set_params(self, **params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:
self