mimic._MimicCalibration
-
class mimic._MimicCalibration(threshold_pos=5, record_history=False)
Mimic calibration: a method for calibrating the predicted probabilities of a binary classification model.
-
__init__(self, threshold_pos=5, record_history=False)
Parameters: - threshold_pos: int
the number of positives in each bin at the initial binning step.
- record_history: bool
whether to record the bin-merging process.
-
construct_initial_bin(self, sorted_score, sorted_target, threshold_pos)
Build the initial bins so that each bin contains threshold_pos positives (default 5).
Parameters: - sorted_score: the model's predicted probabilities,
i.e. the pre-calibrated scores, sorted in increasing order.
- sorted_target: the targets, in order of increasing score.
The target has two classes.
- threshold_pos: the number of positives in each bin, default=5
Returns: - bin_info: 2-D array, shape (number of bins, 7).
- [[bl_index, score_min, score_max, score_mean,
nPos_temp, total_temp, nPosRate_temp]]
- total_number_pos: integer
the total number of positives.
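The initial binning step can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the function names `make_bin_row` and `construct_initial_bin_sketch` are hypothetical, the rows follow the seven-column layout listed above, and placing any leftover samples in one final bin is an assumption made here for illustration.

```python
def make_bin_row(index, scores, n_pos):
    """Build one row: [index, score_min, score_max, score_mean, n_pos, total, pos_rate]."""
    total = len(scores)
    return [index, min(scores), max(scores), sum(scores) / total,
            n_pos, total, n_pos / total]


def construct_initial_bin_sketch(sorted_score, sorted_target, threshold_pos=5):
    """Walk the score-sorted samples and close a bin at every threshold_pos-th positive."""
    bin_info = []
    total_number_pos = 0
    scores, n_pos = [], 0
    for score, target in zip(sorted_score, sorted_target):
        scores.append(score)
        n_pos += target
        total_number_pos += target
        if n_pos == threshold_pos:
            bin_info.append(make_bin_row(len(bin_info), scores, n_pos))
            scores, n_pos = [], 0
    if scores:
        # Assumption: samples left over after the last full bin form one final bin.
        bin_info.append(make_bin_row(len(bin_info), scores, n_pos))
    return bin_info, total_number_pos
```

With `threshold_pos=2`, the scores `[0.1, 0.2, 0.3, 0.4, 0.5, 0.6]` and targets `[0, 1, 0, 1, 1, 1]` give two bins, the first covering scores 0.1 to 0.4 and the second covering 0.5 to 0.6.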
-
fit(self, X, y, sample_weight=None)
Perform mimic calibration.
Parameters: - X: array-like, shape (number of rows, 1)
the predicted probabilities from the binary model.
- y: array-like, shape (number of rows, 1)
the binary target; each element is 0 or 1.
Returns: - self : object
Returns an instance of self.
-
get_bin_boundary(self, current_binning, boundary_choice)
Parameters: - current_binning: array-like, shape (num_bins, 7)
- [[bl_index, score_min, score_max, score_mean,
nPos_temp, total_temp, PosRate_temp]]
- boundary_choice: int
0: choose score_min, i.e. the left boundary of the bin.
1: choose score_max, i.e. the right boundary of the bin.
2: choose score_mean, i.e. the mean score of the bin.
Returns: - boundary_table: array-like, shape (num_bins, 1)
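The mapping from boundary_choice to a column of the binning table can be sketched as follows. This is an illustrative reading of the parameter description, not the library's code; `get_bin_boundary_sketch` is a hypothetical name and plain Python lists stand in for arrays.

```python
def get_bin_boundary_sketch(current_binning, boundary_choice):
    """Pick one score column per bin: 0 -> score_min, 1 -> score_max, 2 -> score_mean."""
    # Columns: [bl_index, score_min, score_max, score_mean, nPos, total, PosRate]
    column_for_choice = {0: 1, 1: 2, 2: 3}
    if boundary_choice not in column_for_choice:
        raise ValueError("boundary_choice must be 0, 1, or 2")
    column = column_for_choice[boundary_choice]
    # One single-element row per bin, matching the documented shape (num_bins, 1).
    return [[row[column]] for row in current_binning]
```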
-
get_params(self, deep=True)
Get parameters for this estimator.
Parameters: - deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: - params : mapping of string to any
Parameter names mapped to their values.
-
merge_bins(self, binning_input, increasing_flag)
Merge bins so that the positive rate increases from each bin to the next.
Parameters: - binning_input: array-like, shape (number of bins, 7)
- [[bl_index, score_min, score_max,
score_mean, nPos_temp, total_temp, PosRate_temp]]
- increasing_flag: bool
Returns: - result: array-like, shape (number of bins, 7)
the binning after merging.
- increasing_flag: bool
-
output_history_result(self, show_history_array=[])
Output the merging history.
Parameters: - show_history_array: array-like
the history indices to output.
Returns: - score-posRate-array : array-like
[[score_array, nPosRate_array, i]]
-
predict(self, pre_calib_prob)
Prediction function of mimic calibration. It returns a 1-D array of calibrated probabilities.
Parameters: - pre_calib_prob: array-like
the predicted probabilities from the binary model.
Returns: - calib_prob : array-like
the mimic-calibrated probability.
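This page does not say how a new score is mapped to a calibrated probability. One plausible reading, stated here purely as an assumption, is linear interpolation between the (mean score, positive rate) pairs of the final bins, with scores outside the range clamped to the first or last bin. The sketch below illustrates that idea only; `interpolate_calibrated` and its arguments are hypothetical names.

```python
from bisect import bisect_right


def interpolate_calibrated(pre_calib_prob, bin_scores, bin_pos_rates):
    """Linearly interpolate positive rates between bin mean scores.

    Assumes bin_scores is strictly increasing and has the same length
    as bin_pos_rates. This is an illustrative assumption, not the
    documented behaviour of predict.
    """
    calibrated = []
    for p in pre_calib_prob:
        if p <= bin_scores[0]:
            calibrated.append(bin_pos_rates[0])      # clamp below the first bin
        elif p >= bin_scores[-1]:
            calibrated.append(bin_pos_rates[-1])     # clamp above the last bin
        else:
            hi = bisect_right(bin_scores, p)         # first bin score greater than p
            lo = hi - 1
            weight = (p - bin_scores[lo]) / (bin_scores[hi] - bin_scores[lo])
            calibrated.append(
                bin_pos_rates[lo] + weight * (bin_pos_rates[hi] - bin_pos_rates[lo]))
    return calibrated
```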
-
run_merge_function(self, current_binning, record_history=False)
Keep merging bins until the positive rate increases from each bin to the next.
Parameters: - current_binning: array-like, shape (number of bins, 7)
- [[bl_index, score_min, score_max,
score_mean, nPos_temp, total_temp, PosRate_temp]]
- record_history: bool
Returns: - result: array-like, shape (number of bins, 7)
the final binning result.
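The merge loop can be sketched in pure Python as below. It is a simplified illustration, not the library's code: the function names are hypothetical, the single-pass helper takes only the binning (the documented merge_bins also receives increasing_flag), and merging a bin into its left neighbour whenever its positive rate is not strictly greater is an assumption about the merge rule.

```python
def merge_two_rows(left, right):
    """Combine two adjacent bins; rows are
    [index, score_min, score_max, score_mean, n_pos, total, pos_rate]."""
    n_pos = left[4] + right[4]
    total = left[5] + right[5]
    # Weight each bin's mean score by its sample count.
    score_mean = (left[3] * left[5] + right[3] * right[5]) / total
    return [left[0], min(left[1], right[1]), max(left[2], right[2]),
            score_mean, n_pos, total, n_pos / total]


def merge_bins_sketch(binning_input):
    """One pass over the bins; returns (result, increasing_flag)."""
    result = []
    increasing_flag = True
    for row in binning_input:
        if result and row[6] <= result[-1][6]:
            # Positive rate did not increase: fold this bin into the previous one.
            result[-1] = merge_two_rows(result[-1], row)
            increasing_flag = False
        else:
            result.append([len(result)] + list(row[1:]))  # renumber the bin
    return result, increasing_flag


def run_merge_function_sketch(current_binning):
    """Repeat single passes until a pass finds the positive rates already increasing."""
    result, increasing_flag = merge_bins_sketch(current_binning)
    while not increasing_flag:
        result, increasing_flag = merge_bins_sketch(result)
    return result
```

Each pass that performs a merge reduces the number of bins by at least one, so the loop terminates after at most as many passes as there are initial bins.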
-
score(self, X, y, sample_weight=None)
Returns the coefficient of determination R^2 of the prediction.
The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.
Parameters: - X : array-like, shape = (n_samples, n_features)
Test samples. For some estimators this may be a precomputed kernel matrix instead, shape = (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
- y : array-like, shape = (n_samples) or (n_samples, n_outputs)
True values for X.
- sample_weight : array-like, shape = [n_samples], optional
Sample weights.
Returns: - score : float
R^2 of self.predict(X) wrt. y.
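The definition of R^2 given above can be checked with a few lines of pure Python. The helper name `r2_sketch` is hypothetical and sample weights are ignored; it only restates the formula 1 - u/v.

```python
def r2_sketch(y_true, y_pred):
    """Unweighted coefficient of determination, 1 - u/v."""
    mean_true = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1 - u / v


y_true = [0, 1, 1, 0]
perfect = r2_sketch(y_true, [0, 1, 1, 0])           # predictions equal the targets
constant = r2_sketch(y_true, [0.5, 0.5, 0.5, 0.5])  # always predicts the mean of y_true
inverted = r2_sketch(y_true, [1, 0, 0, 1])          # every prediction is wrong
```

Here `perfect` is 1.0, `constant` is 0.0, and `inverted` is -3.0, which shows how the score becomes negative when predictions are worse than the constant mean.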
Notes
The R2 score used when calling score on a regressor will use multioutput='uniform_average' from version 0.23 to keep consistent with metrics.r2_score. This will influence the score method of all the multioutput regressors (except for multioutput.MultiOutputRegressor). To specify the default value manually and avoid the warning, please either call metrics.r2_score directly or make a custom scorer with metrics.make_scorer (the built-in scorer 'r2' uses multioutput='uniform_average').
-
set_params(self, **params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns: - self
-