Evaluation Performance Measures

Oct 21, 2023

This section covers the confusion matrix, accuracy, precision, recall, F1 score, and the AUC-ROC curve. It draws on research into machine learning methods and how their performance is evaluated.

Confusion Matrix

The confusion matrix summarizes the outcome of a classification problem. It shows how accurate the model's predictions are by counting the correct and incorrect predictions and breaking them down by class. To put it another way, the confusion matrix reveals how confused the model is when making predictions. The matrix shows not only how many errors the model makes but also which types of error occur. The table below shows the general format of a confusion matrix for a two-class (binary) problem.

                    Predicted Positive    Predicted Negative
Actual Positive     TP                    FN
Actual Negative     FP                    TN

The confusion matrix is a 2x2 matrix, as shown in the table above, with rows for the true class and columns for the predicted class. TP, FN, FP, and TN are abbreviations for True Positive, False Negative, False Positive, and True Negative, respectively. The elements of the confusion matrix are defined as follows:

·True Negative (TN): The number of instances predicted by the model to be negative that are actually negative.

·True Positive (TP): The number of instances predicted by the model to be positive that are actually positive.

·False Negative (FN): The number of instances predicted by the model to be negative that are actually positive.

·False Positive (FP): The number of instances predicted by the model to be positive that are actually negative.
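The four counts defined above can be tallied directly from paired labels. The following is a minimal sketch, not the author's implementation; the function name `confusion_counts`, the `positive` parameter, and the sample labels are all assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FN, FP, TN) for a binary problem.

    `positive` is the label treated as the positive class (an assumption;
    any other label is treated as negative).
    """
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive and predicted == positive:
            tp += 1  # predicted positive, actually positive
        elif actual == positive:
            fn += 1  # predicted negative, actually positive
        elif predicted == positive:
            fp += 1  # predicted positive, actually negative
        else:
            tn += 1  # predicted negative, actually negative
    return tp, fn, fp, tn


# Hypothetical labels, made up for this example.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
print(f"TP={tp} FN={fn} FP={fp} TN={tn}")  # TP=3 FN=1 FP=1 TN=3
```

The return order (TP, FN, FP, TN) follows a row-by-row reading of a matrix whose rows are the true class and whose columns are the predicted class.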

Accuracy

Accuracy is an intuitive performance metric. It indicates how close the predictions are to the true values. As shown in Equation 1, accuracy is the number of correct predictions divided by the total number of predictions.

Accuracy = (TP + TN) / (TP + TN + FP + FN)    (1)

The problem with accuracy is that it does not tell the whole story about the data. To illustrate, consider COVID-19 cases: if 99% of the people in a community test negative for the virus, a model that labels everyone as negative reaches 99% accuracy, but this does not ensure safety, because the remaining one percent still needs to be handled. High accuracy alone does not provide all of the information required to analyze the data. As a result, additional evaluation metrics are required.
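The limitation described above can be made concrete with a small calculation. This is a sketch under stated assumptions: the population of 1,000 people with 10 positives is hypothetical, and the "model" is simply one that predicts negative for everyone. Accuracy here is the number of correct predictions divided by all predictions.

```python
def accuracy(tp, tn, fp, fn):
    """Correct predictions divided by all predictions (0.0 if there are none)."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Hypothetical community: 990 people are negative, 10 are positive.
# A model that predicts "negative" for everyone produces these counts:
tp, fn = 0, 10   # every positive case is missed
tn, fp = 990, 0  # every negative case is labeled correctly

acc = accuracy(tp, tn, fp, fn)
print(f"accuracy = {acc:.2%}, missed positive cases = {fn}")
# accuracy = 99.00%, missed positive cases = 10
```

The model scores 99% while detecting none of the positive cases, which is why accuracy should be read alongside metrics that focus on the positive class.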

Precision

Precision measures how many of the model's positive predictions are correct. In a crack-detection task, for example, it is the fraction of images the model flags as cracked that really are cracked. According to Equation 2, it is the ratio of true positives to the total number of predicted positives in the confusion matrix.

Precision = TP / (TP + FP)    (2)
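As a minimal sketch, precision can be computed from the true-positive and false-positive counts alone. The counts below are hypothetical, and returning 0.0 when the model makes no positive predictions is a convention chosen for this example, not something the article specifies.

```python
def precision(tp, fp):
    """True positives divided by all predicted positives.

    Returns 0.0 when nothing was predicted positive (an assumed convention).
    """
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


# Hypothetical counts: 50 images flagged as cracked, 40 of them really cracked.
print(precision(tp=40, fp=10))  # 0.8
```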

Recall

Recall is also known as sensitivity. It measures how many of the actual positive examples the model catches by identifying them as positive. As seen in Equation 3, it is the ratio of correctly identified positive cases to all actual positive cases.

Recall = TP / (TP + FN)    (3)
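Recall can be sketched the same way, using the true-positive and false-negative counts. The numbers are hypothetical, and the 0.0 fallback for a dataset with no positive cases is an assumed convention.

```python
def recall(tp, fn):
    """True positives divided by all actual positives.

    Returns 0.0 when there are no actual positives (an assumed convention).
    """
    actual_positive = tp + fn
    return tp / actual_positive if actual_positive else 0.0


# Hypothetical counts: 60 actual positives, 40 of them caught by the model.
print(round(recall(tp=40, fn=20), 4))  # 0.6667
```

Unlike precision, recall ignores false positives entirely, so a model that predicts positive for everything reaches a recall of 1.0.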

F1 Score

The F1 score combines precision and recall into a single measure, their harmonic mean. It gives weight to both false positives and false negatives while preventing a large number of true negatives from influencing the result. The F1 score computation is detailed in Equation 4.

F1 = 2 × (Precision × Recall) / (Precision + Recall)    (4)
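The following sketch computes the F1 score as the harmonic mean of precision and recall and checks it against the equivalent count-based form 2TP / (2TP + FP + FN), which makes it visible that true negatives play no part. The counts are hypothetical and the function names are assumptions for illustration.

```python
def f1_from_rates(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    denominator = precision + recall
    return 2 * precision * recall / denominator if denominator else 0.0


def f1_from_counts(tp, fp, fn):
    """Equivalent form using counts; TN does not appear anywhere."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical counts, made up for this example.
tp, fp, fn = 40, 10, 20
p = tp / (tp + fp)  # precision = 0.8
r = tp / (tp + fn)  # recall is about 0.667

print(round(f1_from_rates(p, r), 4))         # 0.7273
print(round(f1_from_counts(tp, fp, fn), 4))  # 0.7273
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model cannot reach a high F1 score by doing well on only one of precision or recall.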

AUC-ROC Curve

In the research we conducted, an essential metric was the area under the ROC curve (AUC), which we applied to both the undersampled data and the whole dataset. AUC is generally a better measure than accuracy for evaluating learning algorithms, and here it is assessed from the false positive and true positive rates on the positive (fraudulent) class. The area under the ROC curve is commonly used to compare classifiers. The area under the curve (AUC) is determined by the formula below:
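One standard way to compute this area, offered as a sketch and not necessarily the formula used in this research, is the rank-based form: AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative example, with ties counted as one half. The function name `auc_score`, the label encoding (1 for the positive class, 0 for the negative class), and the sample scores are assumptions for illustration.

```python
def auc_score(y_true, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly.

    Labels are assumed to be 1 (positive) and 0 (negative). Ties between a
    positive and a negative score count as half a correct pair.
    """
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    correct = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                correct += 1.0
            elif pos == neg:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical labels and model scores.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc_score(y_true, scores))  # 0.75
```

This pairwise version compares every positive with every negative, so it is only suitable for small samples. A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of the two classes.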