socube.train package

Submodules

socube.train.metrics module

socube.train.metrics.getCurve(label: numpy.ndarray, score: numpy.ndarray, average: str = 'macro', curve: str = 'roc')

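Compute a receiver operating characteristic (ROC) curve or a precision-recall (PR) curve, selected by curve, for a multiclass task, together with the area under the curve (AUC). The ROC curve is built from the false positive rate (FPR) and the true positive rate (TPR).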

Parameters
  • label (np.ndarray) – the true labels, as a 1D ndarray

  • score (np.ndarray) – the predicted scores, as a 1D ndarray

  • average (str) – how to average over classes in the multiclass case: “macro”, “micro” or “binary”. Averaging is needed because the conventional ROC curve is defined for binary classification only.

  • curve (str) – the curve type, “roc” or “prc”. By default, it is “roc”.

Returns

A tuple of three elements, (x, y, AUC). When curve is “roc”, x is FPR and y is TPR. When curve is “prc”, x is FNR and y is TNR.
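As a rough illustration of what an (x, y, AUC) triple holds in the “roc” case, the sketch below builds a binary ROC curve by sweeping a threshold over the sorted scores and integrates it with the trapezoid rule. It is dependency-free and is not socube's implementation; the names roc_points and trapezoid_auc are hypothetical, and it assumes 1 is the positive class and that both classes are present.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) lists for a binary task where 1 is the positive class."""
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos  # assumes pos > 0 and neg > 0
    # Visit samples from the highest score to the lowest, as if lowering a threshold.
    order = sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for rank, i in enumerate(order):
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        is_last = rank == len(order) - 1
        if is_last or scores[order[rank + 1]] != scores[i]:
            fpr.append(fp / neg)
            tpr.append(tp / pos)
    return fpr, tpr


def trapezoid_auc(x, y):
    """Area under the piecewise-linear curve through the points (x[k], y[k])."""
    return sum((x[k] - x[k - 1]) * (y[k] + y[k - 1]) / 2 for k in range(1, len(x)))


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(labels, scores)
auc = trapezoid_auc(fpr, tpr)
print(fpr, tpr, auc)
```

For multiclass labels, the average argument decides how such per-class curves are combined; that step is not shown here.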

socube.train.metrics.evaluateReport(label: numpy.ndarray, score: numpy.ndarray, roc_plot_file: Optional[str] = None, prc_plot_file: Optional[str] = None, average: str = 'macro', threshold: float = 0.5) → pandas.core.series.Series

Evaluate model performance with multiple indicators and generate a report, optionally saving the ROC and PR curve plots to files.

Parameters
  • label (np.ndarray) – the true labels, as a 1D ndarray

  • score (np.ndarray) – the predicted scores, as a 1D ndarray

  • roc_plot_file (Optional[str]) – the file name of the ROC curve plot. If None, the plot is not saved.

  • prc_plot_file (Optional[str]) – the file name of the PR curve plot. If None, the plot is not saved.

  • average (str) – how to average over classes in the multiclass case: “macro”, “micro” or “binary”. Averaging is needed because the conventional ROC curve is defined for binary classification only.

  • threshold (float) – the threshold applied to the predicted score.

Returns

a pandas.Series object as the report.
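To show the role of threshold, the sketch below turns scores into hard labels and derives a few common indicators from them. The choice of indicators, the ">=" comparison and the name threshold_report are assumptions made for illustration; the actual report fields produced by evaluateReport are not listed in this reference.

```python
def threshold_report(labels, scores, threshold=0.5):
    """Binarize scores at `threshold` and compute a few standard indicators."""
    # Assumption: a score equal to the threshold counts as positive.
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(labels),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


report = threshold_report([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8], threshold=0.5)
print(report)
```

Lowering the threshold generally trades precision for recall, which is why threshold-free summaries such as AUC are usually reported alongside thresholded ones.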

socube.train.metrics.binaryRate(label: numpy.ndarray, predict: numpy.ndarray, positive_first: bool = False) → Tuple[float]

For a binary classification task, compute the confusion matrix and derive from it the true positive rate (TPR), false negative rate (FNR), false positive rate (FPR) and true negative rate (TNR).

Parameters
  • label (np.ndarray) – the true labels, as a 1D ndarray

  • predict (np.ndarray) – the predicted labels, as a 1D ndarray

  • positive_first (bool) – If True, 0 is regarded as the positive class. Otherwise, 1 is regarded as the positive class. By default, it is False.

Returns

a tuple of TPR, FNR, FPR, TNR
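The four rates follow directly from the confusion-matrix counts: TPR = TP / (TP + FN), FNR = FN / (TP + FN), FPR = FP / (FP + TN) and TNR = TN / (FP + TN). The sketch below is a dependency-free illustration of that arithmetic and of the positive_first switch. The name binary_rates is hypothetical, and it assumes both classes occur in label.

```python
def binary_rates(labels, preds, positive_first=False):
    """Return (TPR, FNR, FPR, TNR) for 0/1 labels and predictions."""
    positive = 0 if positive_first else 1  # which label counts as "positive"
    tp = fn = fp = tn = 0
    for y, p in zip(labels, preds):
        if y == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    # Assumes at least one positive and one negative sample (no zero division).
    return tp / (tp + fn), fn / (tp + fn), fp / (fp + tn), tn / (fp + tn)


labels = [1, 1, 1, 0, 0]
preds = [1, 1, 0, 0, 1]
rates = binary_rates(labels, preds)
flipped = binary_rates(labels, preds, positive_first=True)
print(rates, flipped)
```

Swapping the positive class exchanges the roles of the two label groups, so TPR and TNR trade places, as do FNR and FPR.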

socube.train.strategy module

class socube.train.strategy.EarlyStopping(descend: bool = True, patience: int = 7, threshold: float = 1e-05, verbose: int = 0, delta: float = 0, path='checkpoint.pt')

Bases: object

Stops training early if the validation loss does not improve within a given patience.

Parameters
  • descend (bool, default True) – If True, the loss is minimized. If False, the loss is maximized.

  • patience (int, default 7) – How long to wait after the validation loss last improved.

  • verbose (int, default 0) – Prints a message whose level is greater than verbose for each validation loss improvement.

  • delta (float, default 0) – Minimum change in the monitored quantity to qualify as an improvement.

  • path (str, default 'checkpoint.pt') – Path where the checkpoint is saved.

property earlyStop

Whether the early stopping point has been reached.

Returns

Boolean value
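The patience logic described above can be sketched as a small counter that resets on every improvement. The class below is a hypothetical stand-in, not socube's EarlyStopping: it omits checkpoint saving and verbose output, and both the update method and the way delta is applied are assumptions.

```python
class PatienceStopper:
    """Minimal patience-based stopper (illustrative only)."""

    def __init__(self, descend=True, patience=7, delta=0.0):
        self.descend = descend    # True: lower scores are better
        self.patience = patience  # non-improving updates tolerated before stopping
        self.delta = delta        # minimum change that counts as an improvement
        self.best = None
        self.counter = 0

    @property
    def early_stop(self):
        """True once `patience` consecutive updates brought no improvement."""
        return self.counter >= self.patience

    def update(self, score):
        """Record a new validation score and return whether it improved."""
        if self.best is None:
            improved = True
        elif self.descend:
            improved = score < self.best - self.delta
        else:
            improved = score > self.best + self.delta
        if improved:
            self.best = score
            self.counter = 0
        else:
            self.counter += 1
        return improved


stopper = PatienceStopper(patience=2)
losses = [1.0, 0.8, 0.81, 0.79, 0.80, 0.80, 0.70]
stopped_at = None
for epoch, loss in enumerate(losses):
    stopper.update(loss)
    if stopper.early_stop:
        stopped_at = epoch
        break
print(stopped_at, stopper.best)
```

With patience=2 the loop stops at epoch 5, after two consecutive epochs without improvement over the best loss of 0.79, so the final 0.70 is never seen. A larger patience trades longer training for robustness to noisy validation curves.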

Module contents

socube.train.getCurve(label: numpy.ndarray, score: numpy.ndarray, average: str = 'macro', curve: str = 'roc')

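Compute a receiver operating characteristic (ROC) curve or a precision-recall (PR) curve, selected by curve, for a multiclass task, together with the area under the curve (AUC). The ROC curve is built from the false positive rate (FPR) and the true positive rate (TPR).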

Parameters
  • label (np.ndarray) – the true labels, as a 1D ndarray

  • score (np.ndarray) – the predicted scores, as a 1D ndarray

  • average (str) – how to average over classes in the multiclass case: “macro”, “micro” or “binary”. Averaging is needed because the conventional ROC curve is defined for binary classification only.

  • curve (str) – the curve type, “roc” or “prc”. By default, it is “roc”.

Returns

A tuple of three elements, (x, y, AUC). When curve is “roc”, x is FPR and y is TPR. When curve is “prc”, x is FNR and y is TNR.

socube.train.evaluateReport(label: numpy.ndarray, score: numpy.ndarray, roc_plot_file: Optional[str] = None, prc_plot_file: Optional[str] = None, average: str = 'macro', threshold: float = 0.5) → pandas.core.series.Series

Evaluate model performance with multiple indicators and generate a report, optionally saving the ROC and PR curve plots to files.

Parameters
  • label (np.ndarray) – the true labels, as a 1D ndarray

  • score (np.ndarray) – the predicted scores, as a 1D ndarray

  • roc_plot_file (Optional[str]) – the file name of the ROC curve plot. If None, the plot is not saved.

  • prc_plot_file (Optional[str]) – the file name of the PR curve plot. If None, the plot is not saved.

  • average (str) – how to average over classes in the multiclass case: “macro”, “micro” or “binary”. Averaging is needed because the conventional ROC curve is defined for binary classification only.

  • threshold (float) – the threshold applied to the predicted score.

Returns

a pandas.Series object as the report.

socube.train.binaryRate(label: numpy.ndarray, predict: numpy.ndarray, positive_first: bool = False) → Tuple[float]

For a binary classification task, compute the confusion matrix and derive from it the true positive rate (TPR), false negative rate (FNR), false positive rate (FPR) and true negative rate (TNR).

Parameters
  • label (np.ndarray) – the true labels, as a 1D ndarray

  • predict (np.ndarray) – the predicted labels, as a 1D ndarray

  • positive_first (bool) – If True, 0 is regarded as the positive class. Otherwise, 1 is regarded as the positive class. By default, it is False.

Returns

a tuple of TPR, FNR, FPR, TNR

class socube.train.EarlyStopping(descend: bool = True, patience: int = 7, threshold: float = 1e-05, verbose: int = 0, delta: float = 0, path='checkpoint.pt')

Bases: object

Stops training early if the validation loss does not improve within a given patience.

Parameters
  • descend (bool, default True) – If True, the loss is minimized. If False, the loss is maximized.

  • patience (int, default 7) – How long to wait after the validation loss last improved.

  • verbose (int, default 0) – Prints a message whose level is greater than verbose for each validation loss improvement.

  • delta (float, default 0) – Minimum change in the monitored quantity to qualify as an improvement.

  • path (str, default 'checkpoint.pt') – Path where the checkpoint is saved.

property earlyStop

Whether the early stopping point has been reached.

Returns

Boolean value