Basic C++ for numerical computations: vectors

Norbert Pozar

This short note gives a few basic tips on how to efficiently write C++ code with a numerical performance in mind, focusing on vectors.


Quick reminder before we begin: Your code should be as easy to understand as possible, easy to maintain and flexible. While it should be designed from the beginning with performance in mind, especially in terms of how data is stored in memory, flexibility is almost as important, especially in the early stages.


C++ has the Standard Templates Library (STL) that provides some basic containers for storing elements. The vector STL container stores data in the same way the C array does, but with additional functionality.

After defining a variable of type vector, you can use it pretty much as a drop-in replacement of the C array. When optimization is enabled, the performance of vector and the C array is basically the same.

vector has a few advantages:

Example: \(l_2\)-norm of a vector

Let us first look at the code for computing the \(l_2\)-norm of a vector using the standard C arrays.

The same code can be easily rewritten using C++ and vector. Notice that we use u.size(), which returns the size of the vector. No need to remember the size anymore.

Of course, it makes sense to define a function that computes the \(l_2\)-norm. First the C code:

Because the C array does not know its size, we have to pass the size as the second argument of the function.

This becomes easier with vector:

The symbol & makes u a reference to a vector of doubles. This makes the code more efficient. Without &, u would be passed by value, which means that it would be copied into a local variable every time l2_norm is called. However, this is an unnecessary and slow operation. With &, if you call l2_norm in your code, for example:

u inside l2_norm is just a different name for the same vector<double> that is stored in v. In particular, if you modify u inside l2_norm, v is modified as well. Hence the const keyword, which tells the caller that v will not be modified, even though it is passed by reference.

If your compiler supports the new C++11 standard (for g++ 4.6 and newer use the -std=c++0x command line option), the code can be simplified somewhat more using the new range-based loop:

Example: Discretized Dirichlet energy

Let \(\Omega = (0,1)\). We want to find the minimizer of the Dirichlet energy

\[ I(u):= \int_\Omega |u'(x)|^2 \;dx \]

over all functions in the set

\[ \mathcal{K}:=\{v \in {\rm Lip}(\bar \Omega): v(0) = h_0, v(1) = h_1\}. \]

Here \({\rm Lip}\) denotes all Lipschitz continuous functions on \(\bar\Omega\). Lipschitz continuous functions are differentiable almost everywhere with bounded derivative, and therefore the integral is well-defined and finite.

To discretize this problem, we subdivide the domain into \(N \geq 1\) equal subintervals \((x_i, x_{i+1})\), where \(x_i = i/N\), \(i = 0, \ldots, N\). Suppose that \(u\) is a piece-wise linear function on \(\Omega\), linear on each \((x_i, x_{i+1})\). Function \(u\) is uniquely prescribed by its values \(v_i:= u(x_i)\) at the nodes \(x_i\). We shall assume that the boundary values of \(u\) at \(x_0 = 0\) and \(x_N = 1\) are given values \(h_0, h_1\). We introduce the discretized Dirichlet energy functional \(J(v_1, \ldots, v_{N-1})\) as

\[ J(v_1, \ldots, v_{N-1}):=\int_\Omega |u'(x)|^2 \;dx = \sum_{i=0}^{N-1}\frac{(v_{i+1}-v_i)^2}{\Delta x}, \]

where \(\Delta x = x_{i+1} - x_i = 1/N\).

Minimization of \(J\) over all vectors in \(\mathbb{R}^{N-1}\) is equivalent to minimization of the original energy \(I\) over all piece-wise linear functions on \(\Omega\), linear on each subinterval \((x_i, x_{i+1})\), with boundary values \(u(0) = v_0 = h_0\) and \(u(1) = v_N = h_1\). It is a discretized version of the original problem in the sense that we restricted the search for the minimizer to a smaller set. Therefore the minimizer of the discretized problem will in general be only an approximation of the real minimizer.

An array or vector are a natural choice for storing the values of \(v_0, \ldots, v_N\). The number of nodes (vector’s size) is \(N + 1\).

It is easy to compute the discretized energy \(J\), given the values \(v_0, \ldots, v_N\).

The gradient of \(J\) has \(N-1\) components:

\[ \frac{\partial J}{\partial v_i} (v_1, \ldots, v_{N-1}) = -2 \frac{v_{i-1} - 2v_i + v_{i+1}}{\Delta x}. \]

However, it is convenient to store it as an array of \(N+1\) elements and pad it on both sides by zeros.

In a high-performance code one often wants to avoid unnecessary memory allocations and deallocations. Therefore it is better to return the gradient in a preallocated vector, which can be reused multiple times.

We can also provide the function that returns the gradient directly, for convenience.

Now we write a code to minimize the discretized Dirichlet energy using the steepest descent method (also called gradient descent), for simplicity. The minimization proceeds iteratively, at each iteration \(k\) we find the minimum of the functional \(J\) along the line of steepest descent, that is, along the line \(v_{k-1} - t g_{k-1}\), where \(v_{k-1}\) is the solution after \(k-1\) iterations, and \(g_{k-1} = \nabla J(v_{k-1})\) is the gradient of \(J\) at \(v_{k-1}\). The minimum \(t_k\) along this line must satisfy

\[ \nabla J(v_{k-1} - t_k g_{k-1}) \cdot g_{k-1} = 0. \]

Since \(\nabla J\) is linear in this case, we can rewrite it as

\[ 0 = \nabla J(v_{k-1}) \cdot g_{k-1} - t_k \nabla J(g_{k-1}) \cdot g_{k-1} = g_{k-1} \cdot g_{k-1} - t_k \nabla J(g_{k-1}) \cdot g_{k-1}. \]

This gives a simple formula for \(t_k\):

\[ t_k = \frac{\|g_{k-1}\|^2}{\nabla J(g_{k-1}) \cdot g_{k-1}}. \]

The following full algorithm is implemented in find_dirichlet_minimizer below:

Set \(v_0\) to be the initial guess. For \(k = 1, \ldots, k_{\rm max}\) do:

  1. Compute the gradient, \(g_{k-1} = \nabla J(v_{k-1})\).
  2. If \(\|g_{k-1}\| \leq tol\), exit. \(v_{k-1}\) is the resulting approximation and \(k-1\) is the number of iterations performed.
  3. Compute the minimizer along the line, \(t_k = \frac{\|g_{k-1}\|^2}{\nabla J(g_{k-1}) \cdot g_{k-1}}\).
  4. Set \(v_k = v_{k-1} - t_k g_{k-1}\).

The code follows:

#include <cmath>
#include <vector>
#include <iostream>
#include <stdexcept>

using namespace std;

void check_size(vector<double> const& u, vector<double> const& v) {
    // check that the vectors have the same size
    if (u.size() != v.size()) {
        throw runtime_error("vector sizes do not match");

double dirichlet_energy(vector<double> const& u) {
    int N = u.size() - 1;
    double accum = 0;
    for (int i = 0; i < N; ++i) {
        accum += pow(u[i+1] - u[i], 2.);
    return N * accum;  // dx = 1 / N

void dirichlet_energy_gradient(vector<double>& grad, vector<double> const& u) {
    check_size(u, grad);
    int N = u.size() - 1;
    double c = -2. * N;  // c = -2. / dx
    for (int i = 1; i < N; ++i) {
        grad[i] = c * (u[i-1] - 2. * u[i] + u[i+1]);
    grad[0] = grad[N] = 0.;  // do not forget the padding

double l2_norm(vector<double> const& u) {
    double accum = 0.;
    for (int i = 0; i < u.size(); ++i) {
        accum += u[i] * u[i];
    return sqrt(accum);

double dot(vector<double> const& u, vector<double> const& v) {
    check_size(u, v);
    double accum = 0.;
    for (int i = 0; i < u.size(); ++i) {
        accum += u[i] * v[i];
    return accum;

void add_to(vector<double>& u, double a, vector<double> const& v) {
    check_size(u, v);

    for (int i = 0; i < u.size(); ++i) {
        u[i] += a * v[i];

int find_dirichlet_minimizer(vector<double>& u, double tolerance = 1e-5) {
    const int max_iter = 100000;
    vector<double> grad(u.size());
    vector<double> ggrad(u.size());

    int iter;
    for (iter = 0; iter < max_iter; ++iter) {
        dirichlet_energy_gradient(grad, u);
        double grad_norm = l2_norm(grad);

        if (grad_norm <= tolerance) {

        // find the mimimum along the line u - t grad
        dirichlet_energy_gradient(ggrad, grad);
        double t = grad_norm * grad_norm / dot(ggrad, grad);

        // update
        add_to(u, -t, grad);

    return iter;

int main() {
    // test the minimization
    int n = 128;

    vector<double> u(n + 1);

    for (int i = 0; i <= n; ++i) {
        double x = static_cast<double>(i) / n;
        u[i] = exp(x);

    // quick test that the energy and gradient are compatible
    vector<double> grad(u.size());
    vector<double> v(u); // copy of u

    dirichlet_energy_gradient(grad, u);

    double delta = 1e-3;
    add_to(v, delta, grad);

    double Ju = dirichlet_energy(u);
    double Jv = dirichlet_energy(v);

    // should be small
    cout << "Relative Taylor difference = "
         << ((Ju + delta * dot(grad, grad) - Jv) / Ju) << endl;

    // find the minimizer
    int num_iter = find_dirichlet_minimizer(u);

    // report results
    cout << "Number of iterations = " << num_iter << endl;

    double J = dirichlet_energy(u);
    double J_exact = pow(u[n] - u[0], 2.);

    cout << "Energy at minimizer = " << J << endl;
    cout << "Difference from exact energy = " << (J - J_exact) << endl;

    dirichlet_energy_gradient(grad, u);
    cout << "Norm of the residual = " << l2_norm(grad) << endl;

    return 0;