この記事は何？

線形回帰のpython(+ numpy)による実装例です．

数式

訓練データを $({\bf X}, {\bf y})$ とし，推定する重みベクトルを ${\bf \hat{w}}$ とすると， ${\bf \hat{w}}$ は以下の式で表されます． \begin{align} {\bf \hat{w}}=({\bf X}^T{\bf X})^{-1}{\bf X}^T{\bf y} \end{align}

この式を実装するだけです，非常に簡単です．

実装

線形回帰のコードを下に示します．

class LinearRegression(object):

    def __init__(self, is_bias=False):
        self.is_bias = is_bias

    def _add_bias(self, X):
        N = X.shape[0]
        bias = np.ones((N, 1))
        res = np.c_[bias, X]
        return res

    def fit(self, X, y):
        if self.is_bias:
            X_train = self._add_bias(X)
        else:
            X_train = X
        self.w_hat = np.linalg.inv(
            (X_train.T).dot(X_train)).dot(X_train.T).dot(y)

    def predict(self, X):
        if self.is_bias:
            X_test = self._add_bias(X)
        else:
            X_test = X
        return X_test.dot(self.w_hat)

バイアス項を追加するかどうかを選択できるようにしています．

実行例

一次元データに対して線形回帰を行った例を示します．データセットを以下の式で生成しました．

$ \begin{align} {\bf y}={\bf X}{\bf w} + b + \epsilon \end{align} $

$\epsilon$ はガウスノイズです．

if __name__ == '__main__':
    N = 20
    D = 1
    bias = 1.5
    sigma = 0.1
    X = np.random.rand(N, D)
    w = np.ones((D, 1))
    y = X.dot(w) + bias + np.random.normal(loc=0, scale=sigma, size=(N, 1))

    model = LinearRegression(is_bias=True)
    model.fit(X, y)
    xx = np.linspace(0, 1, 10).reshape(10, 1)
    yy_true = xx.dot(w) + bias
    yy_pred = model.predict(xx)

    plt.plot(X, y, "bo", label="train data")
    plt.plot(xx, yy_true, "b", label="true")
    plt.plot(xx, yy_pred, "r", label="predict")
    plt.title("Linear Regression")
    plt.xlabel("x")
    plt.ylabel("y")
    plt.legend(loc="best")
    plt.show()