-
Class for building and using a multinomial logistic
regression model with a ridge estimator.
There are some modifications, however, compared to the paper of le Cessie and
van Houwelingen (1992):
If there are k classes for n instances with m attributes, the parameter
matrix B to be calculated will be an m*(k-1) matrix.
The probability for class j with the exception of the last class is
Pj(Xi) = exp(Xi*Bj)/((sum[j=1..(k-1)]exp(Xi*Bj))+1)
The last class has probability
1-(sum[j=1..(k-1)]Pj(Xi))
= 1/((sum[j=1..(k-1)]exp(Xi*Bj))+1)
The (negative) multinomial log-likelihood is thus:
L = -sum[i=1..n]{
      sum[j=1..(k-1)](Yij * ln(Pj(Xi)))
      + (1 - (sum[j=1..(k-1)]Yij))
        * ln(1 - sum[j=1..(k-1)]Pj(Xi))
    } + ridge * (B^2)
In order to find the matrix B for which L is minimised, a quasi-Newton method
is used to search for the optimal values of the m*(k-1) variables.
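The probabilities and the penalised negative log-likelihood described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the class's actual code: the names class_probabilities and negative_log_likelihood are hypothetical, B is stored as a list of m rows with k-1 columns, class labels are given as indices 0..k-1 (equivalent to the indicator values Yij), and "ridge * (B^2)" is read as the ridge value times the sum of squared entries of B. The quasi-Newton search itself is not shown.

```python
import math


def class_probabilities(x, B):
    """Return [P_1(x), ..., P_k(x)] for one instance x.

    x: list of m attribute values.
    B: m x (k-1) parameter matrix, given as a list of m rows (assumed layout).
    """
    num_free = len(B[0])  # k-1 classes have their own parameter column
    # Linear scores Xi*Bj for the first k-1 classes.
    scores = [sum(x[a] * B[a][j] for a in range(len(x))) for j in range(num_free)]
    # The last class has a fixed score of 0, which yields the "+1" in the denominator.
    scores.append(0.0)
    # Subtracting the maximum score leaves the ratios unchanged but avoids overflow.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def negative_log_likelihood(X, labels, B, ridge):
    """Penalised negative log-likelihood L for instances X and class indices labels.

    With one-hot indicators Yij, the inner sums of the formula reduce to
    ln(P_y(Xi)), where y is the class index of instance i.
    """
    nll = 0.0
    for x, y in zip(X, labels):
        nll -= math.log(class_probabilities(x, B)[y])
    # Assumption: ridge * (B^2) means ridge times the sum of squared entries of B.
    penalty = ridge * sum(b * b for row in B for b in row)
    return nll + penalty
```

With all entries of B set to zero, every class receives probability 1/k and the penalty vanishes, which is a convenient sanity check before handing L to an optimiser.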
Implements stochastic gradient descent for learning various linear models (binary class SVM, binary class logistic regression, squared loss, Huber loss and epsilon-insensitive loss linear regression).
Implements stochastic gradient descent for learning a linear binary class SVM or binary class logistic regression on text data.
-
SMOreg implements the support vector machine for regression.