Basic C++ for numerical computations: vectors
Norbert Pozar
This short note gives a few basic tips on how to efficiently write C++ code with a numerical performance in mind, focusing on vectors.
Introduction
Quick reminder before we begin: Your code should be as easy to understand as possible, easy to maintain and flexible. While it should be designed from the beginning with performance in mind, especially in terms of how data is stored in memory, flexibility is almost as important, especially in the early stages.
Vectors
C++ has the Standard
Templates Library (STL) that provides some basic containers for
storing elements. The vector
STL container stores data in the same way the C array does, but with
additional functionality.
After defining a variable of type vector
, you can use it
pretty much as a drop-in replacement of the C array. When optimization
is enabled, the performance of vector
and the C array is
basically the same.
vector
has a few advantages:
- it remembers its size
- it allocates and deallocates memory automatically, no need to touch pointers
- fits nicely with other C++ libraries
Example: l_2-norm of a vector
Let us first look at the code for computing the l_2-norm of a vector using the standard C arrays.
#include <math>
int main() {
int n = 100;
double u[n];
// initialize
for (int i = 0; i < n; ++i) {
[i] = i;
u}
// l2-norm
double accum = 0.;
for (int i = 0; i < n; ++i) {
+= u[i] * u[i];
accum }
double norm = sqrt(accum);
}
The same code can be easily rewritten using C++ and
vector
. Notice that we use u.size()
, which
returns the size of the vector. No need to remember the size
anymore.
#include <vector>
#include <cmath>
using namespace std;
int main() {
int n = 100;
<double> u(n); // create a vector with 100 elements, 0. by
vector// default
// initialize (note that we use u.size() instead of n)
for (int i = 0; i < u.size(); ++i) {
[i] = i;
u}
// l2-norm
double accum = 0.;
for (int i = 0; i < n; ++i) {
+= u[i] * u[i];
accum }
double norm = sqrt(accum);
}
Of course, it makes sense to define a function that computes the l_2-norm. First the C code:
double l2_norm(double const* u, int n) {
double accum = 0.;
for (int i = 0; i < n; ++i) {
+= u[i] * u[i];
accum }
return sqrt(accum);
}
Because the C array does not know its size, we have to pass the size as the second argument of the function.
This becomes easier with vector
:
double l2_norm(vector<double> const& u) {
double accum = 0.;
for (int i = 0; i < u.size(); ++i) {
+= u[i] * u[i];
accum }
return sqrt(accum);
}
The symbol &
makes u
a reference
to a vector
of double
s. This makes the code
more efficient. Without &
, u
would be
passed by value, which means that it would be copied into a
local variable every time l2_norm
is called. However, this
is an unnecessary and slow operation. With &
, if you
call l2_norm
in your code, for example:
<double> v;
vector...
double norm = l2_norm(v);
u
inside l2_norm
is just a different name
for the same vector<double>
that is stored in
v
. In particular, if you modify u
inside
l2_norm
, v
is modified as well. Hence the
const
keyword, which tells the caller that v
will not be modified, even though it is passed by
reference.
If your compiler supports the new C++11 standard (for g++ 4.6 and
newer use the -std=c++0x
command line option), the code can
be simplified somewhat more using the new range-based
loop:
double l2_norm(vector<double> const& u) {
double accum = 0.;
for (double x : u) {
+= x * x;
accum }
return sqrt(accum);
}
Example: Discretized Dirichlet energy
Let \Omega = (0,1). We want to find the minimizer of the Dirichlet energy
I(u):= \int_\Omega |u'(x)|^2 \;dx
over all functions in the set
\mathcal{K}:=\{v \in {\rm Lip}(\bar \Omega): v(0) = h_0, v(1) = h_1\}.
Here {\rm Lip} denotes all Lipschitz continuous functions on \bar\Omega. Lipschitz continuous functions are differentiable almost everywhere with bounded derivative, and therefore the integral is well-defined and finite.
To discretize this problem, we subdivide the domain into N \geq 1 equal subintervals (x_i, x_{i+1}), where x_i = i/N, i = 0, \ldots, N. Suppose that u is a piece-wise linear function on \Omega, linear on each (x_i, x_{i+1}). Function u is uniquely prescribed by its values v_i:= u(x_i) at the nodes x_i. We shall assume that the boundary values of u at x_0 = 0 and x_N = 1 are given values h_0, h_1. We introduce the discretized Dirichlet energy functional J(v_1, \ldots, v_{N-1}) as
J(v_1, \ldots, v_{N-1}):=\int_\Omega |u'(x)|^2 \;dx = \sum_{i=0}^{N-1}\frac{(v_{i+1}-v_i)^2}{\Delta x},
where \Delta x = x_{i+1} - x_i = 1/N.
Minimization of J over all vectors in \mathbb{R}^{N-1} is equivalent to minimization of the original energy I over all piece-wise linear functions on \Omega, linear on each subinterval (x_i, x_{i+1}), with boundary values u(0) = v_0 = h_0 and u(1) = v_N = h_1. It is a discretized version of the original problem in the sense that we restricted the search for the minimizer to a smaller set. Therefore the minimizer of the discretized problem will in general be only an approximation of the real minimizer.
An array or vector
are a natural choice for storing the
values of v_0, \ldots, v_N. The number
of nodes (vector
’s size) is N +
1.
It is easy to compute the discretized energy J, given the values v_0, \ldots, v_N.
double dirichlet_energy(vector<double> const& u) {
int N = u.size() - 1;
double accum = 0;
for (int i = 0; i < N; ++i) {
+= pow(u[i+1] - u[i], 2.);
accum }
return N * accum; // dx = 1 / N
}
The gradient of J has N-1 components:
\frac{\partial J}{\partial v_i} (v_1, \ldots, v_{N-1}) = -2 \frac{v_{i-1} - 2v_i + v_{i+1}}{\Delta x}.
However, it is convenient to store it as an array of N+1 elements and pad it on both sides by zeros.
<double> dirichlet_energy_gradient(vector<double> const& u) {
vectorint N = u.size() - 1;
double c = -2. * N; // c = -2. / dx
<double> grad(u.size()); // vector of N+1 zeros
vectorfor (int i = 1; i < N; ++i) {
[i] = c * (u[i-1] - 2. * u[i] + u[i+1]);
grad}
return grad;
}
In a high-performance code one often wants to avoid unnecessary memory allocations and deallocations. Therefore it is better to return the gradient in a preallocated vector, which can be reused multiple times.
void dirichlet_energy_gradient(vector<double>& grad, vector<double> const& u) {
// check that the vectors have the same size
if (grad.size() != u.size()) {
throw runtime_error("dirichlet_energy_gradient: size of vectors"
"does not match");
}
int N = u.size() - 1;
double c = -2. * N; // c = -2. / dx
for (int i = 1; i < N; ++i) {
[i] = c * (u[i-1] - 2. * u[i] + u[i+1]);
grad}
[0] = grad[N] = 0.; // do not forget the padding
grad}
We can also provide the function that returns the gradient directly, for convenience.
<double> dirichlet_energy_gradient(vector<double> const& u) {
vector<double> grad(u.size());
vector(grad, u);
dirichlet_energy_gradientreturn grad;
}
Now we write a code to minimize the discretized Dirichlet energy using the steepest descent method (also called gradient descent), for simplicity. The minimization proceeds iteratively, at each iteration k we find the minimum of the functional J along the line of steepest descent, that is, along the line v_{k-1} - t g_{k-1}, where v_{k-1} is the solution after k-1 iterations, and g_{k-1} = \nabla J(v_{k-1}) is the gradient of J at v_{k-1}. The minimum t_k along this line must satisfy
\nabla J(v_{k-1} - t_k g_{k-1}) \cdot g_{k-1} = 0.
Since \nabla J is linear in this case, we can rewrite it as
0 = \nabla J(v_{k-1}) \cdot g_{k-1} - t_k \nabla J(g_{k-1}) \cdot g_{k-1} = g_{k-1} \cdot g_{k-1} - t_k \nabla J(g_{k-1}) \cdot g_{k-1}.
This gives a simple formula for t_k:
t_k = \frac{\|g_{k-1}\|^2}{\nabla J(g_{k-1}) \cdot g_{k-1}}.
The following full algorithm is implemented in
find_dirichlet_minimizer
below:
Set v_0 to be the initial guess. For k = 1, \ldots, k_{\rm max} do:
- Compute the gradient, g_{k-1} = \nabla J(v_{k-1}).
- If \|g_{k-1}\| \leq tol, exit. v_{k-1} is the resulting approximation and k-1 is the number of iterations performed.
- Compute the minimizer along the line, t_k = \frac{\|g_{k-1}\|^2}{\nabla J(g_{k-1}) \cdot g_{k-1}}.
- Set v_k = v_{k-1} - t_k g_{k-1}.
The code follows:
#include <cmath>
#include <vector>
#include <iostream>
#include <stdexcept>
using namespace std;
void check_size(vector<double> const& u, vector<double> const& v) {
// check that the vectors have the same size
if (u.size() != v.size()) {
throw runtime_error("vector sizes do not match");
}
}
double dirichlet_energy(vector<double> const& u) {
int N = u.size() - 1;
double accum = 0;
for (int i = 0; i < N; ++i) {
accum += pow(u[i+1] - u[i], 2.);
}
return N * accum; // dx = 1 / N
}
void dirichlet_energy_gradient(vector<double>& grad, vector<double> const& u) {
check_size(u, grad);
int N = u.size() - 1;
double c = -2. * N; // c = -2. / dx
for (int i = 1; i < N; ++i) {
grad[i] = c * (u[i-1] - 2. * u[i] + u[i+1]);
}
grad[0] = grad[N] = 0.; // do not forget the padding
}
double l2_norm(vector<double> const& u) {
double accum = 0.;
for (int i = 0; i < u.size(); ++i) {
accum += u[i] * u[i];
}
return sqrt(accum);
}
double dot(vector<double> const& u, vector<double> const& v) {
check_size(u, v);
double accum = 0.;
for (int i = 0; i < u.size(); ++i) {
accum += u[i] * v[i];
}
return accum;
}
void add_to(vector<double>& u, double a, vector<double> const& v) {
check_size(u, v);
for (int i = 0; i < u.size(); ++i) {
u[i] += a * v[i];
}
}
int find_dirichlet_minimizer(vector<double>& u, double tolerance = 1e-5) {
const int max_iter = 100000;
vector<double> grad(u.size());
vector<double> ggrad(u.size());
int iter;
for (iter = 0; iter < max_iter; ++iter) {
dirichlet_energy_gradient(grad, u);
double grad_norm = l2_norm(grad);
if (grad_norm <= tolerance) {
break;
}
// find the mimimum along the line u - t grad
dirichlet_energy_gradient(ggrad, grad);
double t = grad_norm * grad_norm / dot(ggrad, grad);
// update
add_to(u, -t, grad);
}
return iter;
}
int main() {
// test the minimization
int n = 128;
vector<double> u(n + 1);
for (int i = 0; i <= n; ++i) {
double x = static_cast<double>(i) / n;
u[i] = exp(x);
}
// quick test that the energy and gradient are compatible
vector<double> grad(u.size());
vector<double> v(u); // copy of u
dirichlet_energy_gradient(grad, u);
double delta = 1e-3;
add_to(v, delta, grad);
double Ju = dirichlet_energy(u);
double Jv = dirichlet_energy(v);
// should be small
cout << "Relative Taylor difference = "
<< ((Ju + delta * dot(grad, grad) - Jv) / Ju) << endl;
// find the minimizer
int num_iter = find_dirichlet_minimizer(u);
// report results
cout << "Number of iterations = " << num_iter << endl;
double J = dirichlet_energy(u);
double J_exact = pow(u[n] - u[0], 2.);
cout << "Energy at minimizer = " << J << endl;
cout << "Difference from exact energy = " << (J - J_exact) << endl;
dirichlet_energy_gradient(grad, u);
cout << "Norm of the residual = " << l2_norm(grad) << endl;
return 0;
}