Show that the Hessian of logistic regression is negative semi-definite
Answers
Here I derive all the necessary properties and identities for the solution to be self-contained, but apart from that this derivation is clean and easy. Let us formalize our notation and write the loss function a little more compactly. Consider $m$ samples $\{x_i, y_i\}$ such that $x_i \in \mathbb{R}^d$ and $y_i \in \mathbb{R}$. Recall that in binary logistic regression we typically have the hypothesis function $h_\theta$ be the logistic function. Formally
$$h_\theta(x_i) = \sigma(\omega^T x_i) = \sigma(z_i) = \frac{1}{1 + e^{-z_i}},$$
where $\omega \in \mathbb{R}^d$ and $z_i = \omega^T x_i$. The loss function (which I believe is missing a negative sign in the OP's post) is then defined as:
$$l(\omega) = \sum_{i=1}^m -\left( y_i \log \sigma(z_i) + (1 - y_i) \log\left(1 - \sigma(z_i)\right) \right)$$
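This loss is easy to sanity-check numerically. A minimal NumPy sketch (the names `sigmoid`, `loss`, `X`, `y`, `w` are my own, not from the post):

```python
import numpy as np

def sigmoid(z):
    # Logistic function sigma(z) = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + np.exp(-z))

def loss(w, X, y):
    # Negative log-likelihood from the post:
    # l(w) = -sum_i [ y_i log sigma(z_i) + (1 - y_i) log(1 - sigma(z_i)) ],
    # with z_i = w^T x_i.  X is (m, d), y is (m,), w is (d,).
    s = sigmoid(X @ w)
    return -np.sum(y * np.log(s) + (1.0 - y) * np.log(1.0 - s))
```

As a quick check of the sign convention: at $\omega = 0$ every prediction is $\sigma(0) = 1/2$, so the loss equals $m \log 2 > 0$ regardless of the labels.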
There are two important properties of the logistic function which I derive here for future reference. First, note that
$$1 - \sigma(z) = 1 - \frac{1}{1 + e^{-z}} = \frac{e^{-z}}{1 + e^{-z}} = \frac{1}{1 + e^{z}} = \sigma(-z).$$
Also note that
$$\frac{\partial}{\partial z}\sigma(z) = \frac{\partial}{\partial z}(1 + e^{-z})^{-1} = e^{-z}(1 + e^{-z})^{-2} = \frac{1}{1 + e^{-z}} \cdot \frac{e^{-z}}{1 + e^{-z}} = \sigma(z)\left(1 - \sigma(z)\right)$$
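Both identities can be verified numerically, the derivative against a central finite difference (a sketch; the helper names are my own):

```python
import numpy as np

def sigmoid(z):
    # Logistic function sigma(z) = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-5.0, 5.0, 101)

# Identity 1: 1 - sigma(z) = sigma(-z)
assert np.allclose(1.0 - sigmoid(z), sigmoid(-z))

# Identity 2: sigma'(z) = sigma(z)(1 - sigma(z)),
# checked against a central finite difference.
h = 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2.0 * h)
assert np.allclose(numeric, sigmoid(z) * (1.0 - sigmoid(z)), atol=1e-8)
```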
Instead of taking derivatives with respect to components, here we will work directly with vectors (you can review derivatives with vectors here). The Hessian of the loss function $l(\omega)$ is given by $\nabla^2 l(\omega)$, but first recall that $\frac{\partial z}{\partial \omega} = \frac{\partial x^T \omega}{\partial \omega} = x^T$ and $\frac{\partial z}{\partial \omega^T} = \frac{\partial \omega^T x}{\partial \omega^T} = x$.
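To preview where this derivation lands: the standard closed form is $\nabla^2 l(\omega) = \sum_i \sigma(z_i)(1 - \sigma(z_i))\, x_i x_i^T = X^T D X$ with $D$ diagonal and nonnegative, so the Hessian of the loss is positive semi-definite (equivalently, the Hessian of the log-likelihood $-l$ is negative semi-definite, as the question asks). A numerical sanity check of that closed form, with random data and names of my own choosing:

```python
import numpy as np

def sigmoid(z):
    # Logistic function sigma(z) = 1 / (1 + e^{-z})
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
m, d = 50, 3
X = rng.normal(size=(m, d))   # rows are the samples x_i
w = rng.normal(size=d)

s = sigmoid(X @ w)
# Standard closed form for the Hessian of the loss:
# H = sum_i sigma(z_i)(1 - sigma(z_i)) x_i x_i^T = X^T D X
D = np.diag(s * (1.0 - s))
H = X.T @ D @ X

# v^T H v = sum_i d_i (x_i^T v)^2 >= 0 for every v, so all eigenvalues
# of the symmetric matrix H are nonnegative: H is positive semi-definite.
eigs = np.linalg.eigvalsh(H)
assert np.all(eigs >= -1e-10)
```

Note that the Hessian does not depend on the labels $y_i$ at all, only on the predictions $\sigma(z_i)$ and the data $x_i$.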