There are several ways to evaluate a classification model.

| Metric name / Evaluation method | Definition | Code |
| --- | --- | --- |
| Accuracy | Out of 100 predictions, how many does your model get correct? E.g. 95% accuracy means it gets 95/100 predictions correct. | `torchmetrics.Accuracy()` or `sklearn.metrics.accuracy_score()` |
| Precision | Proportion of true positives over the total number of positive predictions (true positives + false positives). Higher precision leads to fewer false positives (where the model predicts 1 when it should have been 0). | `torchmetrics.Precision()` or `sklearn.metrics.precision_score()` |
| Recall | Proportion of true positives over the total number of true positives and false negatives. Higher recall leads to fewer false negatives (where the model predicts 0 when it should have been 1). | `torchmetrics.Recall()` or `sklearn.metrics.recall_score()` |
| F1-score | Combines precision and recall into one metric (their harmonic mean); 1 is best, 0 is worst. | `torchmetrics.F1Score()` or `sklearn.metrics.f1_score()` |
| Confusion matrix | Compares the predicted values with the true values in a tabular way; if the model is 100% correct, all values in the matrix lie on the diagonal from top left to bottom right. | `torchmetrics.ConfusionMatrix()` or `sklearn.metrics.confusion_matrix()` |
| Classification report | Collection of some of the main classification metrics such as precision, recall and F1-score. | `sklearn.metrics.classification_report()` |
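As a quick reference, here is a minimal sketch of computing all of these metrics on a binary classification problem. The `y_true`/`y_pred` tensors are made-up example data (not from the original text), and the `task="binary"` argument assumes torchmetrics v0.11 or later:

```python
# Minimal sketch: computing the metrics above on made-up binary data.
import torch
from torchmetrics import Accuracy, Precision, Recall, F1Score, ConfusionMatrix
from sklearn.metrics import classification_report

y_true = torch.tensor([0, 1, 1, 0, 1, 0, 1, 1])  # ground-truth labels (example data)
y_pred = torch.tensor([0, 1, 0, 0, 1, 1, 1, 1])  # model predictions (example data)

# torchmetrics requires the task type to be specified (v0.11+)
accuracy = Accuracy(task="binary")
precision = Precision(task="binary")
recall = Recall(task="binary")
f1 = F1Score(task="binary")
confmat = ConfusionMatrix(task="binary")

print(f"Accuracy:  {accuracy(y_pred, y_true):.4f}")
print(f"Precision: {precision(y_pred, y_true):.4f}")
print(f"Recall:    {recall(y_pred, y_true):.4f}")
print(f"F1-score:  {f1(y_pred, y_true):.4f}")
print(f"Confusion matrix:\n{confmat(y_pred, y_true)}")

# scikit-learn's classification report summarises precision/recall/F1 per class
print(classification_report(y_true.numpy(), y_pred.numpy()))
```

For multi-class problems, the same classes accept `task="multiclass"` together with a `num_classes` argument.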
