Linear Regressions

Linear regression is probably one of the simplest forms of machine learning algorithm. It trains a linear model on an existing dataset (usually historical data). Once trained, the model can make predictions about future values of the data.
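As a rough illustration of what "training" and "predicting" mean here, the sketch below fits a line to a few points by ordinary least squares, using only the standard library. LineModel, fit_line and predict_line are hypothetical names chosen for this example; they are not part of quotek, and the library's own solver may work differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper type (not part of quotek): y = slope * x + intercept.
struct LineModel {
  double slope;
  double intercept;
};

// Fit a line to (x, y) pairs by ordinary least squares.
LineModel fit_line(const std::vector<double>& x, const std::vector<double>& y) {
  assert(x.size() == y.size() && x.size() >= 2);
  const double n = static_cast<double>(x.size());

  // Means of both series.
  double mean_x = 0.0, mean_y = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    mean_x += x[i];
    mean_y += y[i];
  }
  mean_x /= n;
  mean_y /= n;

  // Covariance-like and variance-like sums.
  double sxy = 0.0, sxx = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    sxy += (x[i] - mean_x) * (y[i] - mean_y);
    sxx += (x[i] - mean_x) * (x[i] - mean_x);
  }
  assert(std::fabs(sxx) > 0.0);  // the x values must not all be identical

  const double slope = sxy / sxx;
  return {slope, mean_y - slope * mean_x};
}

// Predict y for a new x with a fitted model.
double predict_line(const LineModel& m, double x) {
  return m.slope * x + m.intercept;
}
```

Calling fit_line on historical points and then predict_line on a new x mirrors the train/predict split of the class documented below.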

Figure (_images/ml_linreg.jpg): Typical linear regression (real data = red crosses, model = blue line)

Note: A linear regression can fit far more than affine models. To do so, add polynomial features to your dataset, e.g. with quotek::ml::polynomial_features(dataset& X, int degree);
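To make the idea of polynomial features concrete, here is a minimal standard-library sketch that expands a single feature x into the columns x, x^2, ..., x^degree. Rows and expand_polynomial are hypothetical names for this example. The exact layout produced by quotek::ml::polynomial_features (for instance whether it adds a constant column or cross terms) is not stated here, so treat this as an assumption about the general technique rather than the library's behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dataset: one inner vector per sample (row).
using Rows = std::vector<std::vector<double>>;

// Expand a single-feature column into rows [x, x^2, ..., x^degree].
Rows expand_polynomial(const std::vector<double>& x, int degree) {
  assert(degree >= 1);
  Rows out;
  out.reserve(x.size());
  for (double v : x) {
    std::vector<double> row;
    row.reserve(static_cast<std::size_t>(degree));
    double power = 1.0;
    for (int d = 1; d <= degree; ++d) {
      power *= v;            // power now holds v^d
      row.push_back(power);
    }
    out.push_back(row);
  }
  return out;
}
```

A model that is linear in these expanded columns is a polynomial in the original x, which is why a linear regression can fit curved data once the features are expanded.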

class quotek::ml::linearRegression

The linearRegression class runs linear regression learning algorithms on datasets.

Public Functions

linearRegression()

Simplest class constructor.

linearRegression(bool regularize)

Second class constructor, which takes the regularization setting.

Parameters
  • regularize -

    Whether features must be regularized to avoid overfitting.
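The documentation does not say which form of regularization is applied. As an illustration only, the sketch below assumes an L2 (ridge) penalty on a one-feature model without intercept, where the slope w minimizes sum((y - w*x)^2) + lambda * w^2. ridge_slope and lambda are hypothetical names, not quotek identifiers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Closed-form ridge slope for y ~ w * x (no intercept):
//   w = sum(x*y) / (sum(x*x) + lambda)
// lambda = 0 gives the plain least-squares slope; larger lambda shrinks w toward 0.
double ridge_slope(const std::vector<double>& x,
                   const std::vector<double>& y,
                   double lambda) {
  assert(x.size() == y.size() && !x.empty());
  assert(lambda >= 0.0);
  double sxy = 0.0, sxx = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    sxy += x[i] * y[i];
    sxx += x[i] * x[i];
  }
  assert(sxx + lambda > 0.0);
  return sxy / (sxx + lambda);
}
```

Shrinking the coefficients this way trades a little training accuracy for a model that is less sensitive to noise, which is the usual motivation for a regularization flag.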

~linearRegression()

Object Destructor

int train(dataset &X)

train takes a dataset and creates a fitting model according to the provided data. Note: it assumes that the last column of the dataset stores the expected results (y).

Parameters
  • X -

    Dataset to model.
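The "last column holds y" convention can be pictured with the following standard-library sketch, which separates a row-major table into feature rows and a target vector. Rows and split_last_column are hypothetical names for this example; how quotek performs the split internally is not documented here.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a dataset: one inner vector per sample (row).
using Rows = std::vector<std::vector<double>>;

// Split each row into its features (all but the last value) and its target
// (the last value), following the "last column is y" convention.
std::pair<Rows, std::vector<double>> split_last_column(const Rows& data) {
  Rows features;
  std::vector<double> targets;
  features.reserve(data.size());
  targets.reserve(data.size());
  for (const auto& row : data) {
    assert(row.size() >= 2);  // at least one feature plus the target
    features.emplace_back(row.begin(), row.end() - 1);
    targets.push_back(row.back());
  }
  return {features, targets};
}
```

The two-argument train(X, y) overload covers the case where this split has already been done by the caller.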

int train(dataset &X, dvector &y)

Same as the regular train(), except that it assumes the expected results are in a separate, m-dimensional vector.

Parameters
  • X -

    Dataset to model.

  • y -

    Results vector.

int predict(dataset &data, std::vector<double> &y)

predict takes a small dataset and tries to guess the outputs according to the previously learned model.

Parameters
  • data -

    Data to predict outputs for.

  • y -

    Predicted outputs, stored as a vector of doubles.

double predict(dataset &X)

predict takes a single-row dataset and tries to guess the output according to the previously learned model.

Return
Predicted output, as a double.
Parameters
  • X -

    Data to predict output for (the dataset must have exactly one row).

Public Members

bool regularize

stores whether we will use regularization or not.

VectorXd coefficients

stores the coefficients for each dimension of the dataset.
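For intuition about what the coefficients represent, a linear model's prediction for one sample is a weighted sum of its features. The sketch below shows that weighted sum with standard containers. apply_coefficients is a hypothetical name, and whether quotek stores an intercept term among the coefficients is not specified here, so the example assumes one coefficient per feature and no intercept.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Weighted sum of features: y = c[0]*x[0] + c[1]*x[1] + ...
// Assumes one coefficient per feature and no separate intercept term.
double apply_coefficients(const std::vector<double>& coefficients,
                          const std::vector<double>& features) {
  assert(coefficients.size() == features.size());
  double y = 0.0;
  for (std::size_t i = 0; i < coefficients.size(); ++i) {
    y += coefficients[i] * features[i];
  }
  return y;
}
```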

Example

#include <quotek/quotek.hpp>
#include <quotek/linearregression.hpp>

int main(int argc, char** argv) {

  //We declare a dataset X, and expected results y.
  quotek::ml::dataset X;
  quotek::ml::dvector y;

  X =  MatrixXd(10,1);
  y =  VectorXd(10);

  //We fill both dataset and expected results with our data.
  X << 1,2,3,4,5,6,7,8,9,10;
  y << 2,4,6,8,10,12,14,16,18,20;

  //We train our Linear regression with our data.
  quotek::ml::linearRegression l1;
  l1.train(X,y);

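  //Then we can make predictions with our trained linear regression object.
  quotek::ml::dataset Xpred = MatrixXd(3,1);

  Xpred << 150, 200, 500;

  //Xpred has 3 rows, so we use the predict() overload that fills a vector
  //of outputs (the overload returning a double expects a single row).
  std::vector<double> preds;
  l1.predict(Xpred, preds);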

}