pacman::p_load(ggplot2, dplyr)


【A】 A Simple Example


 0  1 
98 33 

Call:
glm(formula = y ~ x1 + x2, family = binomial, data = D)

Deviance Residuals: 
   Min      1Q  Median      3Q     Max  
-2.377  -0.627  -0.510  -0.154   2.119  

Coefficients:
            Estimate Std. Error z value    Pr(>|z|)    
(Intercept)  -2.5402     0.4500   -5.64 0.000000017 ***
x1            0.0627     0.0240    2.62     0.00892 ** 
x2            0.1099     0.0326    3.37     0.00076 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 147.88  on 130  degrees of freedom
Residual deviance: 116.45  on 128  degrees of freedom
AIC: 122.4

Number of Fisher Scoring iterations: 5
(Intercept)          x1          x2 
  -2.540212    0.062735    0.109896 

Given x1=3, x2=4, what are the predicted logit, odds, and probability?

   logit      odd     prob 
-1.91242  0.14772  0.12871 

🗿 : What if x1=2, x2=3?

   logit      odd     prob 
-2.08505  0.12430  0.11056 
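
Both results above can be reproduced by hand from the printed coefficients. The following is a minimal base R sketch: the vector `b` is copied from the coefficient output, and `predict_point()` is a hypothetical helper name introduced for illustration, not a function from the original notebook.

```r
# Coefficients copied from the coef() output above
b <- c("(Intercept)" = -2.540212, x1 = 0.062735, x2 = 0.109896)

# Hypothetical helper: logit, odds and probability for one (x1, x2) point
predict_point <- function(x1, x2, b) {
  logit <- sum(b * c(1, x1, x2))   # linear predictor b0 + b1*x1 + b2*x2
  odd   <- exp(logit)              # odds = exp(logit)
  prob  <- odd / (1 + odd)         # probability = odds / (1 + odds)
  c(logit = logit, odd = odd, prob = prob)
}

predict_point(3, 4, b)
predict_point(2, 3, b)
```

If the fitted glm object is available, `predict()` on a one-row data frame should return the same logit by default, and the probability with `type = "response"`.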


💡 : What glm(family=binomial) does: in the space of \(\{x\}\), it finds the (class) boundary that separates \(y\)

We can plot the line where logit = 0 (equivalently odds = 1, prob = 0.5) on the plane of \(X\)

Furthermore, for any target probability, we can convert its logit and the coefficients into the intercept and slope of the corresponding line in the \((x_1, x_2)\) plane:

\[f(x) = b_0 + b_1 x_1 + b_2 x_2 \; \Rightarrow \; x_2 = \frac{f - b_0}{b_2} - \frac{b_1}{b_2}x_1\]
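
As a worked instance, and assuming the coefficient values printed earlier, setting \(f = 0\) (that is, prob = 0.5) gives the class boundary:

\[x_2 = \frac{0 - (-2.540212)}{0.109896} - \frac{0.062735}{0.109896}x_1 \approx 23.11 - 0.571\,x_1\]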

  prob    logit
1  0.1 -2.19722
2  0.2 -1.38629
3  0.3 -0.84730
4  0.4 -0.40547
5  0.5  0.00000
6  0.6  0.40547
7  0.7  0.84730
8  0.8  1.38629
9  0.9  2.19722
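
The table above, together with the intercept and slope of each probability line, could be built along these lines. This is a sketch in base R: `b` is copied from the printed coefficients, and the data frame name `contours` and its column names are assumptions made for illustration.

```r
# Coefficients copied from the coef() output above: b0, b1, b2
b <- c(-2.540212, 0.062735, 0.109896)

# Probability levels and their logits
p <- seq(0.1, 0.9, by = 0.1)
contours <- data.frame(prob = p, logit = log(p / (1 - p)))

# Rearranged line: x2 = (f - b0)/b2 - (b1/b2) * x1
contours$intercept <- (contours$logit - b[1]) / b[3]
contours$slope     <- -b[2] / b[3]

print(contours, digits = 5)
```

All nine lines share the same slope, so they are parallel; only the intercept moves with the probability level. Base R's `qlogis(p)` computes the same logit as `log(p / (1 - p))`.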

Then mark the probability contours on the scatter plot.

🗿 : What do the blue/cyan lines mean?
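
A rough sketch of how such a plot might be drawn with ggplot2 follows. The data frame `D` is not included in this text, so the code simulates a stand-in with arbitrary ranges for `x1` and `x2`; the points are purely illustrative and are not the original data. The name `contours` and the line colour are also assumptions.

```r
# Coefficients copied from the coef() output above: b0, b1, b2
b <- c(-2.540212, 0.062735, 0.109896)

# Simulated stand-in for D (NOT the original data): 131 random points
set.seed(1)
D <- data.frame(x1 = runif(131, 0, 40), x2 = runif(131, 0, 30))
D$y <- rbinom(131, 1, plogis(b[1] + b[2] * D$x1 + b[3] * D$x2))

# One line per probability level: x2 = (logit - b0)/b2 - (b1/b2) * x1
p <- seq(0.1, 0.9, by = 0.1)
contours <- data.frame(prob = p,
                       intercept = (qlogis(p) - b[1]) / b[3],
                       slope = -b[2] / b[3])

# Draw only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  g <- ggplot(D, aes(x1, x2)) +
    geom_point(aes(colour = factor(y))) +        # observed classes
    geom_abline(data = contours,
                aes(intercept = intercept, slope = slope),
                colour = "steelblue")            # probability contours
  print(g)
}
```

Because the contours are parallel and evenly spaced in logit, the probability at any point can be read off approximately from the two lines it falls between.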

🗿 : Given any point in the figure above, how can you tell its (predicted) probability approximately?



【B】 Logistic Regression

Logistic Function & Logistic Regression
  • Linear Model: \(y = f(x) = b_0 + b_1x_1 + b_2x_2 + ...\)

  • Generalized Linear Model (GLM): \(Link(y) = f(x)\)

  • Logistic Regression: \(logit(y) = log(\frac{p}{1-p}) = f(x) \text{ where } p = prob[y=1]\)

  • Logistic Function: \(Logistic(F_x) = \frac{1}{1+Exp(-F_x)} = \frac{Exp(F_x)}{1+Exp(F_x)}\)

🗿 : What are the definitions of the logit and logistic functions? What is the relationship between them?
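
One way to explore this question numerically is the short base R sketch below. The function names `logit` and `logistic` are defined here for illustration, following the formulas in the list above.

```r
# Definitions following the formulas in the list above
logit    <- function(p) log(p / (1 - p))      # probability -> log-odds
logistic <- function(f) 1 / (1 + exp(-f))     # log-odds -> probability

p <- c(0.1, 0.5, 0.9)
f <- logit(p)

# Applying one function after the other recovers the input
all.equal(logistic(f), p)
all.equal(logit(logistic(f)), f)

# Both algebraic forms of the logistic function agree
all.equal(1 / (1 + exp(-f)), exp(f) / (1 + exp(f)))
```

Base R ships the same pair as `qlogis()` (logit) and `plogis()` (logistic), so the hand-written versions are only needed for exposition.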