Lecture 3. Generalized Linear Models
• Let yi, i = 1, . . . , n be a binary response:

      yi = 1 ("success") or 0 ("failure"),

  associated with covariates x1i, . . . , xki through xiᵀβ, where

      xi = (1, x1i, . . . , xki)ᵀ,   β = (β0, β1, . . . , βk)ᵀ

  are both p × 1 vectors with p = k + 1, and

      xiᵀβ = β0 + β1x1i + . . . + βkxki.
• Probit model assumes

      pi = P(yi = 1) = Φ(xiᵀβ),

  where Φ(·) is the cdf of the standard normal distribution.
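The probit probability can be computed with nothing beyond the standard library, since Φ can be written in terms of the error function. This is a minimal sketch; the function names `norm_cdf` and `probit_prob` are my own, not from the lecture.

```python
import math

def norm_cdf(z):
    # Standard normal cdf: Phi(z) = (1 + erf(z / sqrt(2))) / 2
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_prob(x, beta):
    # P(y = 1) = Phi(x^T beta); x and beta are plain lists, x[0] = 1 for the intercept
    eta = sum(xj * bj for xj, bj in zip(x, beta))
    return norm_cdf(eta)

# With beta = (0, 1) and covariate x1 = 0, the linear predictor is 0, so p = Phi(0) = 0.5
print(probit_prob([1.0, 0.0], [0.0, 1.0]))
```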
• Logit model assumes

      pi = P(yi = 1) = 1 / (1 + e^(−xiᵀβ)),   1 − pi = P(yi = 0) = e^(−xiᵀβ) / (1 + e^(−xiᵀβ)).
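The logit probability is the logistic (inverse-logit) function of the linear predictor. A minimal sketch, with the name `logit_prob` chosen here for illustration:

```python
import math

def logit_prob(x, beta):
    # p = 1 / (1 + exp(-x^T beta)), the logistic function of the linear predictor
    eta = sum(xj * bj for xj, bj in zip(x, beta))
    return 1.0 / (1.0 + math.exp(-eta))

# A zero linear predictor gives p = 1/2, the same as the probit model at 0
print(logit_prob([1.0, 0.0], [0.0, 1.0]))
```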
• Use the maximum likelihood estimation (MLE) method to estimate β. The probability mass function
  (pmf) for the Bernoulli trials yi is

      p(yi) = pi^yi (1 − pi)^(1−yi).
By independence of the yi's, the joint likelihood function is

      L(β) = p(y1)p(y2) · · · p(yn)

and the log likelihood function is

      l(β) = log L(β) = Σᵢ₌₁ⁿ log p(yi)
           = Σᵢ₌₁ⁿ [ yi log(pi / (1 − pi)) + log(1 − pi) ]
           = Σᵢ₌₁ⁿ [ (yi − 1)xiᵀβ − log(1 + e^(−xiᵀβ)) ],

where the last line uses the logit model, under which log(pi / (1 − pi)) = xiᵀβ.
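The two forms of the log likelihood above should agree numerically under the logit model. A small sketch checking this, with hypothetical function names and toy data:

```python
import math

def loglik_direct(y, X, beta):
    # l(beta) = sum_i [ y_i log p_i + (1 - y_i) log(1 - p_i) ] with logit p_i
    total = 0.0
    for yi, xi in zip(y, X):
        eta = sum(a * b for a, b in zip(xi, beta))
        pi = 1.0 / (1.0 + math.exp(-eta))
        total += yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return total

def loglik_simplified(y, X, beta):
    # The simplified form: l(beta) = sum_i [ (y_i - 1) x_i^T beta - log(1 + exp(-x_i^T beta)) ]
    total = 0.0
    for yi, xi in zip(y, X):
        eta = sum(a * b for a, b in zip(xi, beta))
        total += (yi - 1) * eta - math.log(1.0 + math.exp(-eta))
    return total

# Toy data: intercept plus one covariate
y = [1, 0, 1]
X = [[1.0, 0.2], [1.0, -1.3], [1.0, 0.7]]
beta = [0.3, -0.5]
print(loglik_direct(y, X, beta), loglik_simplified(y, X, beta))
```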
The first derivative is

      ∂l(β)/∂β = Σᵢ₌₁ⁿ (yi − pi)xi.

That is, in matrix form,

      Xᵀe = 0,   where e = y − p̂.
• Newton’s method: find zero of a function f
f(xi)
xi+1 = xi − 0 , i = 0, 1, . . .
f (xi)
given a starting value x0.
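The scalar iteration above can be sketched directly; `newton` and its stopping rule are illustrative choices, not part of the lecture:

```python
def newton(f, fprime, x0, tol=1e-10, max_iter=100):
    # Iterate x_{i+1} = x_i - f(x_i) / f'(x_i) until the step is tiny
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Root of f(x) = x^2 - 2 starting from x0 = 1 converges to sqrt(2)
print(newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.0))
```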
• Newton-Raphson:

      β^(t+1) = β^(t) + I⁻¹(β^(t)) Xᵀ(y − p̂),

  where I(β^(t)) is the information matrix with elements (negative second derivatives)

      −∂²l(β) / (∂βa ∂βb).
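For the logit model the information matrix works out to XᵀWX with W = diag(pi(1 − pi)), so the Newton-Raphson update can be sketched with numpy as below. This is a minimal illustration assuming numpy is available; the function name `fit_logistic` and the fixed iteration count are my own choices.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    # Newton-Raphson for logistic regression:
    #   beta <- beta + I(beta)^{-1} X^T (y - p),
    # where I(beta) = X^T W X and W = diag(p_i (1 - p_i))
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        eta = X @ beta
        prob = 1.0 / (1.0 + np.exp(-eta))
        W = prob * (1.0 - prob)            # diagonal of the weight matrix
        info = X.T @ (W[:, None] * X)      # information matrix X^T W X
        beta = beta + np.linalg.solve(info, X.T @ (y - prob))
    return beta
```

At convergence the score Xᵀ(y − p̂) is (numerically) zero, which matches the first-order condition Xᵀe = 0 derived above.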