% Please read the full homework problem 3 (from homework 2) for a
% description of what we're doing here.

% Start by loading the data
cd ~/matlab/Data
load HW2_prob3.mat

% Now, let's divide the data (x and y), using the first seven years
% to train a multiple regression equation, and the last four as an
% independent verification:
train_set = 1:7;
x_train = x(train_set,:);
y_train = y(train_set,:);

ver_set = 8:11;
x_ind = x(ver_set,:);
y_ind = y(ver_set,:);

% Start by predicting y using only the first column of x. Do this using
% the normal equations:
cols = 1;                          % Use only the first column.
X = [ones(7,1) x_train(:,cols)];   % Leading column of ones = intercept
a = (X'*X) \ (X'*y_train);         % The \ operator solves the linear
                                   % system without forming the inverse
                                   % explicitly. It is equivalent to:
                                   % a = inv(X'*X) * (X'*y_train);

% Generate dependent and independent predictions of y:
y_pred_dep = X * a;
y_pred_ind = [ones(4,1) x_ind(:,cols)] * a;

% Now, evaluate the explained and unexplained variance of y, and the
% root mean square error of the forecast based on the dependent and
% independent predictions of y. All three are expressed as fractions
% (the RMSE is in units of the standard deviation of y). The lines
% below have no trailing semicolon so that the values are displayed.
corr(y_pred_dep, y_train)^2            % Explained variance of y,
                                       % based on dependent data
1-corr(y_pred_dep, y_train)^2          % Unexplained variance of y,
                                       % based on dependent data
sqrt(1-corr(y_pred_dep, y_train)^2)    % Normalized RMSE, dependent data

corr(y_pred_ind, y_ind)^2              % Explained var., independ. data
1-corr(y_pred_ind, y_ind)^2            % Unexplained var., independ. data
sqrt(1-corr(y_pred_ind, y_ind)^2)      % Normalized RMSE, independ. data

% Now, repeat this, adding one predictor time series each time. To save
% room, let's just do this in a for loop. We'll just remember the
% correlations.

% Initialize outputs (preallocating keeps MATLAB happy)
corr_dependent = zeros(6,1);
corr_independent = zeros(6,1);

for i = 1:6
    cols = 1:i;                        % Use the first i predictors
    X = [ones(7,1) x_train(:,cols)];
    a = (X'*X) \ (X'*y_train);
    y_pred_dep = X * a;
    y_pred_ind = [ones(4,1) x_ind(:,cols)] * a;
    corr_dependent(i) = corr(y_pred_dep, y_train);
    corr_independent(i) = corr(y_pred_ind, y_ind);
end

% Finally, evaluate the explained and unexplained variance, and the
% normalized RMSE for the dependent and independent data:
var_expl_dependent = corr_dependent.^2;
var_unexpl_dependent = 1-var_expl_dependent;
rmse_dependent = sqrt(var_unexpl_dependent);

var_expl_independent = corr_independent.^2;
var_unexpl_independent = 1-var_expl_independent;
rmse_independent = sqrt(var_unexpl_independent);
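
% Optional sketch (an addition, not part of the homework solution):
% sqrt(1 - r^2) is an RMSE only in a normalized sense. For a
% least-squares fit with an intercept, evaluated on its own training
% data, it equals the RMSE divided by the standard deviation of y.
% For independent data it is a correlation-based skill measure and can
% differ from the actual RMSE, because correlation ignores bias and
% amplitude errors. The helper rmse_direct and all demo_* variables
% below are hypothetical names using made-up numbers, chosen so that
% nothing above is overwritten.
rmse_direct = @(y_hat, y_obs) sqrt(mean((y_obs(:) - y_hat(:)).^2));

demo_x = [1; 2; 3; 4; 5];                   % made-up predictor
demo_y = [2.1; 3.9; 6.2; 7.8; 10.1];        % made-up predictand
demo_X = [ones(5,1) demo_x];                % intercept + one predictor
demo_a = (demo_X'*demo_X) \ (demo_X'*demo_y);
demo_fit = demo_X * demo_a;

demo_rmse = rmse_direct(demo_fit, demo_y);  % RMSE in the units of y
demo_norm = demo_rmse / std(demo_y, 1);     % std(...,1) normalizes by N
demo_r = corr(demo_fit, demo_y);            % sqrt(1 - demo_r^2) should
                                            % match demo_norm

% To get an RMSE in the units of y for the regression above, the same
% helper could be applied to the script's own predictions, for example
% rmse_direct(y_pred_ind, y_ind) for the most recently fitted model.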