<Logistic Regression>
1. Classification
2. Hypothesis Representation
3. Decision boundary
4. Cost function
5. Simplified cost function and gradient descent
6. One-vs-all
Classification
Email: Spam / Not Spam?
Online Transactions: Fraudulent (Yes / No)?
Tumor: Malignant / Benign?
→ y ∈ {0, 1}
- 0: "Negative Class"
- 1: "Positive Class"

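hθ(x) = θ^Tx (linear regression output, thresholded at 0.5)
- hθ(x) ≥ 0.5, predict "y=1"
- hθ(x) < 0.5, predict "y=0"
Classification: y = 0 or 1
With a linear hypothesis, hθ(x) can be > 1 or < 0, even though y is only ever 0 or 1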
Hypothesis Representation
Sigmoid function

hθ(x) = g(z) = 1 / (1 + e^(-z))

z = θ^Tx

hθ(x): the estimated probability that y = 1 for input x, i.e. hθ(x) = P(y=1|x; θ)
- P(y=0|x; θ) + P(y=1|x; θ) = 1
- P(y=0|x; θ) = 1 - P(y=1|x; θ)
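
The hypothesis can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the values of `theta` and `x` are made up for the example, and `x[0] = 1` is assumed to be the intercept term.

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x), read as P(y=1 | x; theta)."""
    return sigmoid(np.dot(theta, x))

# Hypothetical parameters and input (not from the post); x[0] = 1 is the intercept.
theta = np.array([-3.0, 1.0, 1.0])
x = np.array([1.0, 2.0, 2.0])

p1 = hypothesis(theta, x)  # P(y=1 | x; theta)
p0 = 1.0 - p1              # P(y=0 | x; theta)
print(f"P(y=1) = {p1:.4f}, P(y=0) = {p0:.4f}")
```

The output always lies strictly between 0 and 1, which is what lets it be read as a probability.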
Decision boundary
Linear decision boundary
hθ(X) = g(θ₀ + θ₁X₁ + θ₂X₂ + ...)

Non-linear decision boundaries
hθ(X) = g(θ₀ + θ₁X₁ + θ₂X₂ + θ₃X₁^2 + θ₄X₂^2 + ...)
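
Since g(z) ≥ 0.5 exactly when z ≥ 0, the decision boundary is the set of points where the argument of g equals 0. The sketch below illustrates both cases; the parameter values are assumptions chosen for illustration (a line X₁ + X₂ = 3 and a unit circle X₁^2 + X₂^2 = 1), not values from this post.

```python
import numpy as np

def predict(theta, features):
    """Predict y=1 when theta^T x >= 0, which is the same as g(theta^T x) >= 0.5."""
    return (features @ theta >= 0).astype(int)

# Linear boundary, hypothetical theta = [-3, 1, 1]: y=1 when X1 + X2 >= 3.
theta_lin = np.array([-3.0, 1.0, 1.0])
pts = np.array([[1.0, 1.0], [2.0, 2.0]])
X_lin = np.column_stack([np.ones(len(pts)), pts])  # columns: 1, X1, X2
linear_preds = predict(theta_lin, X_lin)

# Non-linear boundary, hypothetical theta = [-1, 0, 0, 1, 1]:
# y=1 when X1^2 + X2^2 >= 1, i.e. on or outside the unit circle.
theta_circ = np.array([-1.0, 0.0, 0.0, 1.0, 1.0])
pts2 = np.array([[0.5, 0.5], [1.0, 1.0]])
X_circ = np.column_stack([np.ones(len(pts2)), pts2, pts2 ** 2])  # 1, X1, X2, X1^2, X2^2
circle_preds = predict(theta_circ, X_circ)

print(linear_preds, circle_preds)
```

The boundary is a property of θ and the chosen features, not of the training set; adding squared terms is what makes a curved boundary possible.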

Cost function
Cost function in Logistic regression
Cost(hθ(x), y) =
- −log(hθ(x)) if y = 1

- −log(1 − hθ(x)) if y = 0
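
A small sketch of the per-example cost shows why this form is used: a confident correct prediction costs almost nothing, while a confident wrong prediction is penalized heavily. The function name `cost` and the sample values of h are illustrative assumptions.

```python
import numpy as np

def cost(h, y):
    """Per-example logistic cost: -log(h) if y == 1, -log(1 - h) if y == 0."""
    return -np.log(h) if y == 1 else -np.log(1.0 - h)

# Illustrative predicted probabilities h = h_theta(x).
for h in (0.99, 0.5, 0.01):
    print(f"h={h}: cost if y=1 is {cost(h, 1):.4f}, cost if y=0 is {cost(h, 0):.4f}")
```

As h approaches 0 with y = 1 (or approaches 1 with y = 0), the cost grows without bound.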

Simplified cost function and gradient descent
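Simplified cost function
Cost(hθ(x), y) = -y log(hθ(x)) - (1 - y) log(1 - hθ(x))
J(θ) = -(1/m) Σᵢ [y⁽ⁱ⁾ log(hθ(x⁽ⁱ⁾)) + (1 - y⁽ⁱ⁾) log(1 - hθ(x⁽ⁱ⁾))]

min J(θ): find the θ that minimizes J(θ) over the m training examples

Gradient Descent
Repeat until convergence, updating all θⱼ simultaneously:
θⱼ := θⱼ - α (1/m) Σᵢ (hθ(x⁽ⁱ⁾) - y⁽ⁱ⁾) xⱼ⁽ⁱ⁾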
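
A minimal sketch of batch gradient descent for logistic regression, assuming the standard cross-entropy cost J(θ) and the update θ := θ - α ∇J(θ). The toy data, the learning rate `alpha`, and the iteration count are made-up values for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_J(theta, X, y):
    """J(theta) = -(1/m) * sum(y*log(h) + (1-y)*log(1-h))."""
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient_descent(X, y, alpha=0.1, iters=1000):
    """Batch gradient descent; every theta_j is updated simultaneously."""
    theta = np.zeros(X.shape[1])
    m = len(y)
    for _ in range(iters):
        grad = X.T @ (sigmoid(X @ theta) - y) / m  # (1/m) * sum((h - y) * x_j)
        theta = theta - alpha * grad
    return theta

# Toy data (made up): first column is the intercept term, second is one feature.
X = np.array([[1, 0.5], [1, 1.0], [1, 1.5], [1, 2.0], [1, 2.5], [1, 3.0]])
y = np.array([0, 0, 1, 0, 1, 1])

theta = gradient_descent(X, y)
print("theta:", theta, "J(theta):", cost_J(theta, X, y))
```

The update has the same form as in linear regression; only the definition of hθ(x) differs.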

One-vs-all
Multiclass classification
Email foldering/tagging: Work(y=0), Friends(y=1), Family(y=2), Hobby(y=3)
Medical diagnosis: Not ill(y=0), Cold(y=1), Flu(y=2)
Weather: Sunny(y=0), Cloudy(y=1), Rain(y=2), Snow(y=3)
Binary classification vs. Multi-class classification

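One-vs-all (one-vs-rest):

→ For each class i, train a binary classifier that predicts the probability that y = i
For a new input, pick the class i whose classifier outputs the highest probability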

Example:
Three binary classification models (one linear score per class)
2X₁ + 3X₂ + 4
-X₁ + 5X₂ - 3
-4X₁ + X₂ + 2
When X₁ = 1 and X₂ = 2:
Sigmoid output
2*1 + 3*2 + 4 = 12 → 99.9994%
-1*1 + 5*2 - 3 = 6 → 99.75%
-4*1 + 1*2 + 2 = 0 → 50%
Softmax output
e^12 / (e^12 + e^6 + e^0) = 99.75%
e^6 / (e^12 + e^6 + e^0) = 0.25%
e^0 / (e^12 + e^6 + e^0) = 0.0006%
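
The arithmetic in this example can be checked with a short NumPy sketch. The coefficients are the three models listed above; the variable names and the max-subtraction trick in the softmax (a common way to avoid overflow, which does not change the result) are additions for illustration.

```python
import numpy as np

# One row per model: coefficients of X1 and X2, plus the constant terms.
W = np.array([[2.0, 3.0], [-1.0, 5.0], [-4.0, 1.0]])
b = np.array([4.0, -3.0, 2.0])
x = np.array([1.0, 2.0])  # X1 = 1, X2 = 2

scores = W @ x + b  # linear score of each model

# Sigmoid: each model gives an independent probability; they need not sum to 1.
sigmoid_out = 1.0 / (1.0 + np.exp(-scores))

# Softmax: scores are normalized jointly, so the outputs sum to 1.
exp_scores = np.exp(scores - scores.max())
softmax_out = exp_scores / exp_scores.sum()

# One-vs-all picks the class with the highest output.
predicted_class = int(np.argmax(sigmoid_out))

print("scores: ", scores)
print("sigmoid:", sigmoid_out)
print("softmax:", softmax_out)
print("predicted class:", predicted_class)
```

Both outputs rank the classes in the same order, so the predicted class is the same either way; only the softmax values form a probability distribution over the classes.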