01B. Efficient programming

Mingyang Lu

01/07/2024

This code has been adapted from numericalR, originally written for R, with adjustments for MATLAB. While the following discussion holds in general, MATLAB differs from R and Python, so an optimization or convention that works in one language may not carry over to another. Always verify and adapt the code to ensure correctness and good performance in MATLAB.

Avoid growing vectors

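The following code becomes very slow for large n, because each append may force MATLAB to allocate a larger array and copy the existing elements into it.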
% Set the value of n to 10
n = 10;
 
% Create an empty array v
v = [];
 
% Use a for loop to iterate from 1 to n
for i = 1:n
    % Append the square of i to array v
    v = [v, i^2];
end
 
disp(v);
1 4 9 16 25 36 49 64 81 100
A better approach is to preallocate the array at its final length and then fill it in place.
% Set the value of n to 10
n = 10;
 
% Create a numeric array v filled with zeros
v = zeros(1, n);
 
% Use a for loop to iterate from 1 to n
for i = 1:n
    % Assign the square of i to the corresponding element in the array v
    v(i) = i^2;
end
 
disp(v);
1 4 9 16 25 36 49 64 81 100
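The cost of growing an array only becomes visible for larger n. The following is a rough timing sketch, not a benchmark: the size n = 2e4 and the variable names v_grow, v_pre, t_grow, and t_pre are illustrative choices, and the measured times depend on the machine and the MATLAB release.

```matlab
% Rough timing sketch: growing an array vs. preallocating it.
% n is an arbitrary illustrative size; timings vary by machine and release.
n = 2e4;

% Version 1: grow the array one element at a time
tic;
v_grow = [];
for i = 1:n
    v_grow = [v_grow, i^2]; %#ok<AGROW>
end
t_grow = toc;

% Version 2: preallocate the full array, then fill it in place
tic;
v_pre = zeros(1, n);
for i = 1:n
    v_pre(i) = i^2;
end
t_pre = toc;

fprintf('growing: %.4f s, preallocated: %.4f s\n', t_grow, t_pre);

% Both versions must produce the same values
assert(isequal(v_grow, v_pre));
```

The final assert only confirms that the two versions agree; the printed times are what illustrate the difference.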

Vectorize code

An even better approach is the following. It uses vector operations instead.
% Set the value of n to 10
n = 10;
 
% Generate the sequence 1, 2, ..., n
v = 1:n;
 
% Square each element in the array v
v = v.^2;
 
disp(v);
1 4 9 16 25 36 49 64 81 100
Explicit iteration, e.g., with a for loop, is often slower than vector operations in interpreted languages. For example, the following code calculates the mean and the population standard deviation (SD) of a series of numbers.
% Iterations with a for loop
num = 100;
 
% Initialize variables
my_sum = 0;
my_sum2 = 0;
 
% Loop through a range from 1 to num
for i = 1:num
    my_sum = my_sum + i;
    my_sum2 = my_sum2 + i^2;
end
 
% Calculate mean and standard deviation
my_mean = my_sum / num;
my_sd = sqrt(my_sum2 / num - my_mean^2);
 
% Print the mean and standard deviation
fprintf('%.4f %.4f\n', my_mean, my_sd);
50.5000 28.8661
While the above code is typical for C or Fortran, a better approach for MATLAB is to use vector operations.
% Vectorization
num = 100;
 
% Create the sequence 1, 2, ..., num
v = 1:num;
 
% Calculate mean, mean square, and standard deviation
my_mean = mean(v);
my_mean_square = mean(v.^2);
my_sd = sqrt(my_mean_square - my_mean^2);
 
% Print the mean and standard deviation
fprintf('%.4f %.4f\n', my_mean, my_sd);
50.5000 28.8661
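The formula above divides by the number of elements, so it gives the population SD. As a cross-check, the built-in std can be compared against it. This is a minimal sketch; the variable names sd_manual, sd_pop, and sd_sample are illustrative and do not appear in the original code.

```matlab
% Cross-check of the manual formula against the built-in std.
v = 1:100;

% Manual population SD: sqrt(mean of squares minus squared mean)
sd_manual = sqrt(mean(v.^2) - mean(v)^2);

% std with weight flag 1 normalizes by N (population SD)
sd_pop = std(v, 1);

% Default std normalizes by N-1 (sample SD), so it is slightly larger
sd_sample = std(v);

fprintf('%.4f %.4f %.4f\n', sd_manual, sd_pop, sd_sample);
```

The first two printed values should both match the 28.8661 shown above, while the third is slightly larger because of the N-1 normalization.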
Another example computes the standard deviation of each column of a matrix.
rng('default'); % Set the random number generator to its default settings
 
mat = randn(4); % Generate a random matrix of size 4x4
disp(mat);
    0.5377    0.3188    3.5784    0.7254
    1.8339   -1.3077    2.7694   -0.0631
   -2.2588   -0.4336   -1.3499    0.7147
    0.8622    0.3426    3.0349   -0.2050
 
% Calculate means, means square, and standard deviations
means = mean(mat);
means2 = mean(mat.^2);
sd = sqrt(means2 - means.^2);
 
disp(sd);
1.5215 0.6756 1.9606 0.4300
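arrayfun can also be used to compute standard deviations (similar to apply in R). The second argument of std, 1, selects normalization by N (population SD), so the column results match those above.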
% Calculate standard deviation for each column using arrayfun
sd_col = arrayfun(@(col) std(mat(:, col),1), 1:4);
disp(sd_col);
1.5215 0.6756 1.9606 0.4300
 
% Calculate standard deviation for each row using arrayfun
sd_row = arrayfun(@(row) std(mat(row, :),1), 1:4);
disp(sd_row);
1.3290 1.5917 1.1017 1.2292
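For this particular task, std also accepts a dimension argument, so the same numbers can be obtained without arrayfun. The sketch below regenerates the same matrix with rng('default') and reuses the names sd_col and sd_row for comparison.

```matlab
% Column-wise and row-wise population SD using the dimension argument of std.
rng('default');   % same seed as above, so mat is the same 4x4 matrix
mat = randn(4);

% Weight flag 1 normalizes by N; dimension 1 works down each column
sd_col = std(mat, 1, 1);

% Dimension 2 works along each row and returns a column vector
sd_row = std(mat, 1, 2);

disp(sd_col);
disp(sd_row.');   % transpose only for compact display
```

arrayfun remains useful when the function being applied has no built-in dimension argument.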

Performance evaluation

% Timing the functions
tic;f1(1e7);toc; % using for loop
Elapsed time is 0.011023 seconds.
tic;f2(1e7);toc; % using vectorization
Elapsed time is 0.023213 seconds.
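It's surprising that in MATLAB the for loop is faster here. One likely reason is that MATLAB's JIT (just-in-time) compiler optimizes simple loops. In addition, the vectorized version has to allocate arrays of 1e7 elements for v and v.^2, whereas the loop works only with scalars.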
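A single tic/toc measurement can be noisy, so the comparison is worth repeating with timeit, which runs the function several times and returns the median time. The sketch below is self-contained: loop_stats and vec_stats are illustrative stand-ins for f1 and f2, and num = 1e6 is an assumed, smaller problem size chosen to keep the run short.

```matlab
% Sketch: more robust timing with timeit (median of several runs).
num = 1e6;  % illustrative size, smaller than the 1e7 used above

t_loop = timeit(@() loop_stats(num));
t_vec  = timeit(@() vec_stats(num));
fprintf('loop: %.6f s, vectorized: %.6f s\n', t_loop, t_vec);

% Both approaches should agree up to floating-point rounding
r_loop = loop_stats(num);
r_vec  = vec_stats(num);
assert(all(abs(r_loop - r_vec) <= 1e-6 * abs(r_vec)));

% --- Local functions (illustrative stand-ins for f1 and f2) ---
function result = loop_stats(num)
    % Accumulate the sum and the sum of squares with a for loop
    my_sum = 0;
    my_sum2 = 0;
    for i = 1:num
        my_sum = my_sum + i;
        my_sum2 = my_sum2 + i^2;
    end
    my_mean = my_sum / num;
    result = [my_mean, sqrt(my_sum2 / num - my_mean^2)];
end

function result = vec_stats(num)
    % Same statistics using vector operations
    v = 1:num;
    my_mean = mean(v);
    result = [my_mean, sqrt(mean(v.^2) - my_mean^2)];
end
```

Which version wins may differ across MATLAB releases and hardware, so the printed times are best read as a local measurement rather than a general rule.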

Define MATLAB functions

function result = f1(num) % using for loop
    my_sum = 0;
    my_sum2 = 0;

    for i = 1:num
        my_sum = my_sum + i;
        my_sum2 = my_sum2 + i^2;
    end

    my_mean = my_sum / num;
    my_sd = sqrt(my_sum2 / num - my_mean^2);
    result = [my_mean, my_sd];
end
 
function result = f2(num) % using vectorization
    v = 1:num;

    my_mean = mean(v);
    % Mean of the squares (not the variance); the variance is obtained below
    my_mean_square = mean(v.^2);
    my_sd = sqrt(my_mean_square - my_mean^2);
    result = [my_mean, my_sd];
end