Learning Evaluation Metrics in ML - Lesson 7 - Confusion Matrix

In statistical analysis of binary classification and information retrieval systems, the F-score or F-measure is a measure of predictive performance.
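The most common form of the F-score is the F1-score, the harmonic mean of precision and recall. The general F-beta form weights recall beta times as much as precision. Below is a minimal sketch in plain Python; the function name `f_beta` and the example counts are illustrative assumptions, not something taken from this lesson.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from raw counts; beta=1 gives the F1-score.

    tp, fp, fn are the numbers of true positives, false positives,
    and false negatives. Returns 0.0 when the score is undefined.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 6 true positives, 2 false positives, 4 false negatives.
f1 = f_beta(tp=6, fp=2, fn=4)
print(round(f1, 4))
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model scores well on F1 only when precision and recall are both reasonably high.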

A confusion matrix is a table that summarizes and visualizes the performance of a classification algorithm by comparing its predicted classes against the actual classes. Table 5.1 shows an example in which benign tissue is labeled healthy and malignant tissue is labeled cancerous.
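To make the idea concrete, here is a small sketch that tallies the four cells of a binary confusion matrix for the healthy/cancerous setting. The labels in `actual` and `predicted` are made-up illustration data (not the contents of Table 5.1), and `confusion_counts` is a hypothetical helper name.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive="cancerous"):
    """Count TP, FP, FN, and TN for a binary classification task."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if a == positive and p == positive:
            counts["TP"] += 1  # cancerous, predicted cancerous
        elif a != positive and p == positive:
            counts["FP"] += 1  # healthy, predicted cancerous
        elif a == positive and p != positive:
            counts["FN"] += 1  # cancerous, predicted healthy
        else:
            counts["TN"] += 1  # healthy, predicted healthy
    return {key: counts[key] for key in ("TP", "FP", "FN", "TN")}


# Made-up example labels, for illustration only.
actual = ["cancerous", "cancerous", "healthy", "healthy", "healthy", "cancerous"]
predicted = ["cancerous", "healthy", "healthy", "cancerous", "healthy", "cancerous"]

cells = confusion_counts(actual, predicted)
print(cells)
```

Metrics such as accuracy, precision, and recall can all be computed from these four counts.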

Evaluation metrics are quantitative measures used to evaluate the performance and effectiveness of a statistical or machine learning model. These metrics provide insights into how well the model is performing and help in comparing different models or algorithms. When evaluating a machine learning model, it is crucial to assess its predictive ability, generalization capability, and overall quality.

Different evaluation metrics are available depending on the specific machine learning task. Some of the most popular are precision, recall, F1-score, accuracy, the confusion matrix, log-loss, and AUC-ROC for classification, and Mean Absolute Error, Mean Squared Error, R-squared, and adjusted R-squared for regression.
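As a rough illustration of the regression metrics in this list, the sketch below computes Mean Absolute Error, Mean Squared Error, and R-squared in plain Python. The function names and the sample values are assumptions chosen for the example, and `r_squared` assumes the true values are not all identical.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot; assumes y_true is not constant."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up values, for illustration only.
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
print(mae, mse, r2)
```

MAE and MSE are in the units (or squared units) of the target, while R-squared is unitless, which makes it easier to compare across datasets.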

In machine learning, and specifically in statistical classification, a confusion matrix (also called an error matrix) is a specific table layout that allows the performance of an algorithm to be visualized. The algorithm is usually a supervised learning one; in unsupervised learning the table is usually called a matching matrix. Each row of the matrix represents the instances of a predicted class, while each column represents the instances of an actual class (or vice versa). The name comes from the fact that the table makes it easy to see whether the system is confusing two classes, that is, mislabeling one as the other.
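The row and column convention can be sketched as follows. This is an illustrative example in plain Python with made-up labels; `confusion_matrix_rows_predicted` and `transpose` are hypothetical helper names. Rows index the predicted class and columns index the actual class, and transposing the matrix gives the opposite convention.

```python
def confusion_matrix_rows_predicted(actual, predicted, labels):
    """Build a matrix where matrix[i][j] counts samples predicted as
    labels[i] whose actual class is labels[j]."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[p]][index[a]] += 1
    return matrix


def transpose(matrix):
    """Swap rows and columns (rows become actual, columns predicted)."""
    return [list(row) for row in zip(*matrix)]


# Made-up example labels, for illustration only.
labels = ["healthy", "cancerous"]
actual = ["cancerous", "cancerous", "cancerous", "healthy", "healthy"]
predicted = ["cancerous", "healthy", "healthy", "healthy", "healthy"]

rows_predicted = confusion_matrix_rows_predicted(actual, predicted, labels)
rows_actual = transpose(rows_predicted)
print(rows_predicted)
print(rows_actual)
```

Libraries differ on which convention they use. scikit-learn's `confusion_matrix`, for instance, puts actual classes on rows and predicted classes on columns, so it is worth checking the orientation before reading off false positives and false negatives.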
