06 Model Evaluation
Last updated on 2025-07-10
Model Evaluation: Comparing Logistic Regression and SVM
We compare two classifiers:
- Logistic Regression
- SVM
Using metrics: Accuracy, Precision, Recall, F1 Score, Confusion Matrix, and ROC-AUC.
Step 1: Load the Data
PYTHON
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import pandas as pd
# Load dataset
data = load_breast_cancer()
X = data.data
y = data.target
# Split dataset
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.3, random_state=31
)
# Normalize (Standardize) features
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
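Step 2: Train the Models

The evaluation code below assumes two fitted models, `svm_model` and `lr_model`. A minimal training step is sketched here (the hyperparameters are illustrative defaults, not prescribed by the lesson); Step 1 is repeated so the block runs on its own. Note that `SVC` needs `probability=True` for the `predict_proba` call used in the ROC section.

```python
# (Recreates Step 1 so this block is self-contained)
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=31
)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# probability=True enables predict_proba, needed for the ROC curve later
svm_model = SVC(probability=True, random_state=31)
svm_model.fit(X_train, y_train)

lr_model = LogisticRegression(max_iter=1000)
lr_model.fit(X_train, y_train)
```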
Step 3: Evaluation Metrics
What is Accuracy?
Accuracy is the proportion of correct predictions:
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]
Accuracy can be misleading when classes are imbalanced.
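To see why, consider a toy example (the labels below are made up for illustration): with 9 negatives and 1 positive, a classifier that always predicts "negative" still scores 90% accuracy while missing every positive case.

```python
from sklearn.metrics import accuracy_score, recall_score

# Imbalanced toy data: 1 positive, 9 negatives
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10  # always predicts the majority class

print(accuracy_score(y_true, y_pred))  # 0.9 -- looks good
print(recall_score(y_true, y_pred))    # 0.0 -- misses every positive
</imports>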
What are Precision, Recall and F1 Score?
- Precision: \(\frac{TP}{TP + FP}\)
- Recall: \(\frac{TP}{TP + FN}\)
- F1 Score: harmonic mean of precision and recall
\[ F1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \]
PYTHON
from sklearn.metrics import precision_score, recall_score, f1_score
for name, model in [("SVM", svm_model), ("Logistic Regression", lr_model)]:
    y_pred = model.predict(X_test)
    print(f"\n{name} Metrics:")
    print(f"Precision: {precision_score(y_test, y_pred):.2f}")
    print(f"Recall: {recall_score(y_test, y_pred):.2f}")
    print(f"F1 Score: {f1_score(y_test, y_pred):.2f}")
What is a Confusion Matrix?
A confusion matrix shows the breakdown of correct and incorrect classifications.
| | Predicted Positive | Predicted Negative |
|---|---|---|
| Actual Positive | True Positive (TP) | False Negative (FN) |
| Actual Negative | False Positive (FP) | True Negative (TN) |
PYTHON
from sklearn.metrics import ConfusionMatrixDisplay
import matplotlib.pyplot as plt
fig, axs = plt.subplots(1, 2, figsize=(12, 5))
ConfusionMatrixDisplay.from_estimator(svm_model, X_test, y_test, ax=axs[0])
axs[0].set_title("SVM Confusion Matrix")
ConfusionMatrixDisplay.from_estimator(lr_model, X_test, y_test, ax=axs[1])
axs[1].set_title("Logistic Regression Confusion Matrix")
plt.tight_layout()
plt.show()
What is the ROC Curve?
ROC = Receiver Operating Characteristic Curve
- Plots the true positive rate (TPR) against the false positive rate (FPR)
- AUC = Area Under the ROC Curve; closer to 1 = better model
PYTHON
from sklearn.metrics import roc_curve, auc
svm_probs = svm_model.predict_proba(X_test)[:, 1]
lr_probs = lr_model.predict_proba(X_test)[:, 1]
svm_fpr, svm_tpr, _ = roc_curve(y_test, svm_probs)
lr_fpr, lr_tpr, _ = roc_curve(y_test, lr_probs)
svm_auc = auc(svm_fpr, svm_tpr)
lr_auc = auc(lr_fpr, lr_tpr)
plt.figure(figsize=(8, 6))
plt.plot(svm_fpr, svm_tpr, label=f"SVM (AUC = {svm_auc:.2f})")
plt.plot(lr_fpr, lr_tpr, label=f"Logistic Regression (AUC = {lr_auc:.2f})")
plt.plot([0, 1], [0, 1], 'k--')
plt.xlabel("False Positive Rate")
plt.ylabel("True Positive Rate")
plt.title("ROC Curve")
plt.legend()
plt.show()