Me: Josh Bevan - jbevan@bu.edu
Get Help from RCS: help@scc.bu.edu

Our website: rcs.bu.edu
Tutorial eval: rcs.bu.edu/eval
This notebook: http://scv.bu.edu/examples/matlab/Tutorials/PerfOpt/

What is "performance optimization"?

Programs run with finite resources: memory, CPU, disk, time, etc. For small or simple programs these limitations are unimportant: the program finishes fast enough and within your constraints, so the limits have no noticeable effect. However, most non-trivial programs eventually run up against one of these constraints, usually memory or running time.

Performance optimization, in the context of this tutorial, therefore means improving a program to minimize the effect of our computing environment's limitations.

You can break performance-optimization ("PO") techniques into several categories:

  1. Memory access efficiency
  2. Vectorization
  3. Use of "a priori" knowledge to specialize approach/methods
  4. Making use of/avoiding computer and language strengths/weaknesses
  5. Algorithmic improvement: asymptotic/"big O"

Not 6: Parallelization. It improves performance but does not "optimize"; in fact, it usually carries an efficiency penalty.

Brief sidebar: recall vectorization. Vectorization is the process of performing the same operation on multiple pieces of data at the same time. The specifics here are tied to MATLAB's language and implementation, but the idea applies to most other interpreted languages as well.

It often takes the form of converting "tight" loops into operations on vectors/matrices.

Here's an example:
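The original code cell isn't preserved in this text export; here's a minimal sketch of such a loop-to-vector conversion (the data and the squaring operation are my illustrative choices):

    % Loop version: operate on one element at a time
    x = rand(1e6,1);
    y = zeros(size(x));          % preallocate the output
    tic
    for i = 1:numel(x)
        y(i) = x(i)^2;
    end
    toc

    % Vectorized version: one elementwise operation on the whole array
    tic
    y2 = x.^2;
    toc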

Depending on your MATLAB version, the two may or may not be close in speed: newer versions are better able to JIT (just-in-time) compile simple patterns. Consider this very similar version and compare:
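The tutorial's exact variant isn't preserved here; one "very similar" version that often times very differently (an assumption on my part) is the same loop without preallocation, which forces repeated reallocation and gives the JIT a less regular pattern to work with:

    % Same loop, but y now grows one element per iteration
    x = rand(1e6,1);
    clear y
    tic
    for i = 1:numel(x)
        y(i) = x(i)^2;           %#ok<SAGROW>  y is reallocated as it grows
    end
    toc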

We can easily create situations where MATLAB has a hard time JITing. Try encapsulating operations inside functions:
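A sketch of that idea: move the loop body into a function and time it again (square_one is an illustrative helper, defined as a local function at the end of the script or in its own file square_one.m):

    % The same work, but each iteration now pays function-call overhead
    % and presents a less transparent loop body to the JIT
    x = rand(1e6,1);
    y = zeros(size(x));
    tic
    for i = 1:numel(x)
        y(i) = square_one(x(i));
    end
    toc

    function s = square_one(v)
        s = v^2;
    end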

Consider a simple example that exercises all of the above categories of optimization: we want to calculate the cumulative sum (prefix sum) of an array of numbers.
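The notebook's code isn't reproduced in this export; a minimal sketch of a deliberately naive implementation, timed over a range of sizes to produce the plot discussed next (the sizes in Ns are illustrative):

    % Naive prefix sum: recompute each partial sum from scratch
    Ns = 2.^(10:14);                 % problem sizes to time
    t  = zeros(size(Ns));
    for k = 1:numel(Ns)
        N = Ns(k);
        x = rand(N,1);
        y = [];
        tic
        for i = 1:N
            s = 0;
            for j = 1:i              % O(i) work at step i
                s = s + x(j);
            end
            y(i) = s;                %#ok<SAGROW>  y also grows every step
        end
        t(k) = toc;
    end
    loglog(Ns, t, 'o-'); xlabel('N'); ylabel('running time (s)')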

We can see from the plot above that our running time does not increase linearly as we increase $N$. We can talk about the "asymptotic" behavior of our program using "big O" notation, which describes how running time depends on some critical parameter(s). In this case our critical parameter is $N$: step $i$ sums $i$ elements, so the total work is $\sum_{i=1}^{N} i = N(N+1)/2$, and our program scales as $\mathcal{O}(N^2)$.

What can we improve?

  1. Memory access efficiency
  2. Vectorization
  3. Use of "a priori" knowledge to specialize approach/methods
  4. Making use of/avoiding computer and language strengths/weaknesses
  5. Algorithmic improvement: asymptotic/"big O"

Memory access efficiency
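Applied to the prefix-sum code above, the first win is cheap memory access: preallocate the output instead of letting it grow. (Relatedly, MATLAB stores arrays column-major, so nested loops over matrices should walk down columns.) A minimal sketch:

    % Same algorithm, but y is preallocated: memory is reserved once, not N times
    N = 1e4; x = rand(N,1);
    y = zeros(N,1);
    for i = 1:N
        s = 0;
        for j = 1:i
            s = s + x(j);
        end
        y(i) = s;
    end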

Vectorization
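Next, replace the hand-written inner loop with a vectorized call; a sketch:

    % The inner accumulation becomes one vectorized sum
    N = 1e4; x = rand(N,1);
    y = zeros(N,1);
    for i = 1:N
        y(i) = sum(x(1:i));      % still O(i) work per step, but done in optimized code
    end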

a priori knowledge
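Here is where knowing the problem pays off: each prefix sum is just the previous one plus the next element, so there is no need to re-add the whole prefix at every step. A sketch:

    % A priori knowledge: y(i) = y(i-1) + x(i)
    N = 1e4; x = rand(N,1);
    y = zeros(N,1);
    y(1) = x(1);
    for i = 2:N
        y(i) = y(i-1) + x(i);    % O(1) work per step -> O(N) total
    end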

Language strengths/weaknesses
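MATLAB's headline strength is its library of compiled built-ins; for this problem the whole job is one call (a sketch, reusing the x from above):

    % Play to MATLAB's strength: optimized built-ins
    y = cumsum(x);               % the entire prefix sum in one call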

Fibonacci sequence: 1, 1, 2, 3, 5, 8, 13, ...
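The flip side is a weakness: MATLAB's per-call overhead makes deep recursion expensive. A sketch comparing a naively recursive Fibonacci with an iterative one (fib_rec and fib_iter are my illustrative names, each in its own file or as local functions; try timing fib_rec(25) against fib_iter(25)):

    function f = fib_rec(n)
        % exponentially many calls, each paying MATLAB's call overhead
        if n <= 2
            f = 1;
        else
            f = fib_rec(n-1) + fib_rec(n-2);
        end
    end

    function f = fib_iter(n)
        % linear work, no recursion
        a = 1; b = 1;
        for k = 3:n
            t = a + b; a = b; b = t;
        end
        f = b;
    end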

Algorithmic improvements

https://en.wikipedia.org/wiki/Fenwick_tree
"A flat array of N values can either store the values or the prefix sums. In the first case, computing prefix sums requires linear time; in the second case, updating the array values requires linear time (in both cases, the other operation can be performed in constant time)."

Advanced Vectorization Methods

Sometimes we have two sets of data that need to interact across all pairs of elements. Imagine you have a series of partially full boxes and a variety of items you could pack into them, but you can't exceed a maximum weight.

https://www.mathworks.com/help/matlab/ref/bsxfun.html
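A sketch of bsxfun on the boxes-and-items scenario (the weights and the limit below are made up for illustration):

    % Every (box, item) pairing at once: expand a column against a row
    boxWeights  = [3; 7; 2; 5];     % current weight of each box
    itemWeights = [1 4 2 6];        % weight of each candidate item
    maxWeight   = 8;

    totals = bsxfun(@plus, boxWeights, itemWeights);   % 4x4 matrix of all totals
    fits   = totals <= maxWeight;                      % true where the item still fits

Since R2016b, implicit expansion does the same thing without bsxfun: totals = boxWeights + itemWeights.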

Another "advanced" technique: Use linear algebra when you can (and know how):

Can we write a short program using loops to find the "neighbor sum" for each element?
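Sure; assuming "neighbor sum" means the sum of each element's left and right neighbors (my reading of the exercise), a loop version looks like:

    x  = rand(10,1);
    ns = zeros(size(x));
    for i = 2:numel(x)-1
        ns(i) = x(i-1) + x(i+1);    % interior elements have two neighbors
    end
    ns(1)   = x(2);                 % endpoints have only one neighbor
    ns(end) = x(end-1);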

Is there a better way?
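Yes: the neighbor sum is a linear operation, so it is a matrix-vector product with a banded matrix. A sketch, reusing the same x:

    % Ones on the sub- and super-diagonals pick out each element's neighbors
    n   = numel(x);
    A   = spdiags(ones(n,2), [-1 1], n, n);
    ns2 = A * x;                    % same result as the loop, one operation

An equivalent one-liner is conv(x, [1;0;1], 'same'), which applies the same neighbor stencil via convolution.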

Instrumentation/metrics

Program bottlenecks and code performance will often defy your intuition. Therefore it's important to empirically measure what's going on.

Consider a simple example: based on what we've learned, which of these is faster?
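The two candidate snippets from the live demo aren't preserved in this export; as a sketch of the measurement itself, timeit is more trustworthy than a single tic/toc because it warms up and repeats the code (loop_square is an illustrative stand-in for one candidate):

    x = rand(1e6,1);
    t_vec  = timeit(@() x.^2);              % vectorized candidate
    t_loop = timeit(@() loop_square(x));    % loop candidate
    fprintf('vectorized: %g s, loop: %g s\n', t_vec, t_loop);

    function y = loop_square(x)
        y = zeros(size(x));
        for i = 1:numel(x)
            y(i) = x(i)^2;
        end
    end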

For non-trivial programs this measurement approach is insufficient. Let's take a look at a "real" program and see how we can profile it in MATLAB.
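The demo program itself isn't included in this export; the mechanics of profiling any program look like this (myProgram is a placeholder for the code under study):

    profile on          % start collecting per-function, per-line timings
    myProgram();        % run the code under study
    profile viewer      % open the interactive report of where time went
    profile off         % stop the profiler when finished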

Recommended starting places for algorithms:

https://en.wikipedia.org/wiki/Introduction_to_Algorithms (CLRS)

Handbook of Mathematical Functions aka Abramowitz and Stegun:

https://en.wikipedia.org/wiki/Abramowitz_and_Stegun and https://dlmf.nist.gov/