Matthews Correlation Coefficient: Python Code


2023-11-07 04:58 | Source: compiled from the web

Contents: Introduction to the Matthews correlation coefficient; Python code and data format; Note; Excerpts from the original text

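Introduction to the Matthews correlation coefficient

The introduction to the Matthews correlation coefficient comes from Wikipedia, where readers can look it up for themselves.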

Python code and data format

Here is the Python code:

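import pandas as pd
from math import sqrt


def get_data():
    # Read the 2x2 confusion matrix from data.csv (columns "zero" and "one").
    df = pd.read_csv('data.csv')
    # int() yields plain Python integers, so the products below cannot overflow.
    TP = int(df.iloc[0]["zero"])
    FP = int(df.iloc[1]["zero"])
    FN = int(df.iloc[0]["one"])
    TN = int(df.iloc[1]["one"])
    return TP, FP, FN, TN


def calculate_data(TP, FP, FN, TN):
    # Numerator of the Matthews correlation coefficient formula
    numerator = (TP * TN) - (FP * FN)
    # Denominator of the Matthews correlation coefficient formula
    denominator = sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
    # If any of the four sums is zero, set the denominator to 1 (MCC becomes 0).
    if denominator == 0:
        denominator = 1
    result = numerator / denominator
    return result


if __name__ == '__main__':
    TP, FP, FN, TN = get_data()
    result = calculate_data(TP, FP, FN, TN)
    print(result)  # print the result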

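Data table format: the original image is not available. As the code reads it, data.csv has two columns named zero and one and two data rows: the first row holds TP (zero) and FN (one), the second row holds FP (zero) and TN (one).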
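As a rough illustration of the layout that get_data() reads, the sketch below parses a small CSV with the standard library instead of pandas. The counts (6, 1, 2, 3) and the helper name read_confusion_counts are made up for this example; only the column names zero and one and the row order are taken from the code above.

```python
import csv
import io

# Hypothetical contents of data.csv: a header row plus two data rows.
# Row 0 holds TP (column "zero") and FN (column "one");
# row 1 holds FP (column "zero") and TN (column "one").
SAMPLE_CSV = "zero,one\n6,1\n2,3\n"


def read_confusion_counts(text):
    """Return (TP, FP, FN, TN) from CSV text laid out as described above."""
    rows = list(csv.DictReader(io.StringIO(text)))
    TP = int(rows[0]["zero"])
    FP = int(rows[1]["zero"])
    FN = int(rows[0]["one"])
    TN = int(rows[1]["one"])
    return TP, FP, FN, TN


counts = read_confusion_counts(SAMPLE_CSV)
print(counts)  # (6, 2, 1, 3)
```

Saving the same three lines of text as data.csv should give the pandas version the same four counts.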

Note

Throughout, this article uses the original text from the web page.

Excerpts from the original text

While there is no perfect way of describing the confusion matrix of true and false positives and negatives by a single number, the Matthews correlation coefficient is generally regarded as being one of the best such measures. Other measures, such as the proportion of correct predictions (also termed accuracy), are not useful when the two classes are of very different sizes. For example, assigning every object to the larger set achieves a high proportion of correct predictions, but is not generally a useful classification.
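The point about unequal class sizes can be checked with a small sketch. The data set below (5 positives, 95 negatives) is hypothetical, and the function names accuracy and mcc are chosen for this example; the zero-denominator handling follows the convention stated with the MCC formula in this article.

```python
from math import sqrt


def accuracy(TP, FP, FN, TN):
    """Proportion of correct predictions."""
    return (TP + TN) / (TP + FP + FN + TN)


def mcc(TP, FP, FN, TN):
    """Matthews correlation coefficient; a zero denominator is replaced by 1."""
    denominator = sqrt((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))
    if denominator == 0:
        denominator = 1
    return (TP * TN - FP * FN) / denominator


# Hypothetical data set: 5 positives and 95 negatives.
# The classifier assigns every object to the larger (negative) class.
TP, FP, FN, TN = 0, 0, 5, 95

acc = accuracy(TP, FP, FN, TN)
score = mcc(TP, FP, FN, TN)
print(acc)    # 0.95
print(score)  # 0.0
```

Accuracy looks high because the negative class dominates, while the MCC of zero indicates that the predictions carry no information about the true class.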

The MCC can be calculated directly from the confusion matrix using the formula:

$$\text{MCC} = \frac{\mathit{TP} \times \mathit{TN} - \mathit{FP} \times \mathit{FN}}{\sqrt{(\mathit{TP} + \mathit{FP})(\mathit{TP} + \mathit{FN})(\mathit{TN} + \mathit{FP})(\mathit{TN} + \mathit{FN})}}$$

In this equation, TP is the number of true positives, TN the number of true negatives, FP the number of false positives and FN the number of false negatives. If any of the four sums in the denominator is zero, the denominator can be arbitrarily set to one; this results in a Matthews correlation coefficient of zero, which can be shown to be the correct limiting value.
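As a worked example with made-up counts (TP = 6, FP = 2, FN = 1, TN = 3, chosen only for illustration), the formula gives:

```latex
\begin{aligned}
\text{MCC} &= \frac{6 \times 3 - 2 \times 1}{\sqrt{(6+2)(6+1)(3+2)(3+1)}} \\
           &= \frac{16}{\sqrt{8 \times 7 \times 5 \times 4}}
            = \frac{16}{\sqrt{1120}} \approx 0.478
\end{aligned}
```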

The MCC can be calculated with the formula:

$$\text{MCC} = \sqrt{\mathit{PPV} \times \mathit{TPR} \times \mathit{TNR} \times \mathit{NPV}} - \sqrt{\mathit{FDR} \times \mathit{FNR} \times \mathit{FPR} \times \mathit{FOR}}$$

using the positive predictive value, the true positive rate, the true negative rate, the negative predictive value, the false discovery rate, the false negative rate, the false positive rate, and the false omission rate.
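A quick numerical check of the rate-based formula against the confusion-matrix formula might look like the sketch below. The counts are hypothetical and were picked so that none of the rates has a zero denominator; the variable names simply mirror the abbreviations in the text.

```python
from math import sqrt, isclose

# Hypothetical counts, chosen so that no rate has a zero denominator.
TP, FP, FN, TN = 6, 2, 1, 3

PPV = TP / (TP + FP)  # positive predictive value
TPR = TP / (TP + FN)  # true positive rate
TNR = TN / (TN + FP)  # true negative rate
NPV = TN / (TN + FN)  # negative predictive value
FDR = FP / (TP + FP)  # false discovery rate
FNR = FN / (TP + FN)  # false negative rate
FPR = FP / (TN + FP)  # false positive rate
FOR = FN / (TN + FN)  # false omission rate

# Rate-based formula
mcc_from_rates = sqrt(PPV * TPR * TNR * NPV) - sqrt(FDR * FNR * FPR * FOR)

# Confusion-matrix formula
mcc_direct = (TP * TN - FP * FN) / sqrt(
    (TP + FP) * (TP + FN) * (TN + FP) * (TN + FN)
)

print(isclose(mcc_from_rates, mcc_direct))  # True
```

The two agree because the product of the four "good" rates is TP²·TN² divided by the product of the four sums, and the product of the four "bad" rates is FP²·FN² divided by the same quantity.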

The original formula as given by Matthews was:

$$\begin{aligned}
N &= \mathit{TN} + \mathit{TP} + \mathit{FN} + \mathit{FP} \\
S &= \frac{\mathit{TP} + \mathit{FN}}{N} \\
P &= \frac{\mathit{TP} + \mathit{FP}}{N} \\
\text{MCC} &= \frac{\mathit{TP}/N - S \times P}{\sqrt{PS(1 - S)(1 - P)}}
\end{aligned}$$

This is equal to the formula given above. As a correlation coefficient, the Matthews correlation coefficient is the geometric mean of the regression coefficients of the problem and its dual. The component regression coefficients of the Matthews correlation coefficient are Markedness (Δp) and Youden’s J statistic (Informedness or Δp’). Markedness and Informedness correspond to different directions of information flow and generalize Youden’s J statistic, the δp statistics and (as their geometric mean) the Matthews Correlation Coefficient to more than two classes.
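Both claims in this passage can be checked numerically. The sketch below uses hypothetical counts and takes informedness as TPR + TNR − 1 and markedness as PPV + NPV − 1, which are the usual definitions and are not spelled out in the text; for these counts both quantities are positive, so a plain geometric mean applies.

```python
from math import sqrt, isclose

# Hypothetical counts for illustration only.
TP, FP, FN, TN = 6, 2, 1, 3

# Matthews' original formulation
N = TN + TP + FN + FP
S = (TP + FN) / N
P = (TP + FP) / N
mcc_matthews = (TP / N - S * P) / sqrt(P * S * (1 - S) * (1 - P))

# Confusion-matrix formula
mcc_direct = (TP * TN - FP * FN) / sqrt(
    (TP + FP) * (TP + FN) * (TN + FP) * (TN + FN)
)

# Assumed definitions: informedness = TPR + TNR - 1, markedness = PPV + NPV - 1
informedness = TP / (TP + FN) + TN / (TN + FP) - 1
markedness = TP / (TP + FP) + TN / (TN + FN) - 1
mcc_geometric = sqrt(informedness * markedness)

print(isclose(mcc_matthews, mcc_direct))   # True
print(isclose(mcc_geometric, mcc_direct))  # True
```

When the classifier does worse than chance, informedness and markedness are both negative and the MCC takes the negative square root of their product.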

Some scientists claim the Matthews correlation coefficient to be the most informative single score to establish the quality of a binary classifier prediction in a confusion matrix context.

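In abstract terms, the confusion matrix is as follows (the original image is not available; this is the standard layout):

                     Predicted positive    Predicted negative
Actual positive      TP                    FN
Actual negative      FP                    TN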
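To connect the four cells of the confusion matrix to actual predictions, here is a minimal sketch that counts them from two label lists. The function name confusion_counts, the sample labels, and the convention that 1 marks the positive class are all assumptions made for this example.

```python
def confusion_counts(actual, predicted):
    """Count (TP, FP, FN, TN) for binary labels where 1 is the positive class."""
    TP = FP = FN = TN = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            TP += 1  # predicted positive, actually positive
        elif p == 1 and a == 0:
            FP += 1  # predicted positive, actually negative
        elif p == 0 and a == 1:
            FN += 1  # predicted negative, actually positive
        else:
            TN += 1  # predicted negative, actually negative
    return TP, FP, FN, TN


# Hypothetical labels for eight objects.
actual = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 1, 0, 1, 0]

cells = confusion_counts(actual, predicted)
print(cells)  # (3, 1, 1, 3)
```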



