# The art of vectorizing – Part 1

Vectors and matrices are at the heart of any Matlab program. Matlab is extremely efficient at doing any operation with these. Vectorizing is the art of transforming a calculation done element by element into an operation on vectors. Here I start a serie of posts on how to vectorize your code.

I had a very good teacher in chemistry when I was in college. He used to say something like this :

To understand chemistry is easy, you just have to understand the very few building blocks on which all this science is based upon. And they don’t take more space than a metro ticket…

Programming is similar, I believe. They are some important principles and all the rest is just addon knowledge built on these principles. If you were to ask me, vectorizing is one of the few Matlab core principles that should sit on this metro ticket. Some day I will make a post to actually draw this metro ticket in detail…

I already talked about why you should vectorize in numerous occasions here, here and here. So today I am going to dive straight into the hard core. How to vectorize?

They are many ways to process data in blocks and I want to cover all the tips and tricks I have used in my Matlab life so I will make as many posts as needed on this.

But I guess to start, we should talk about what I consider the most important Matlab operator : the point.

Such a small thing and yet I remember my first days with Matlab. This point was not intuitive. Part of the reason is because I was not taught Mathematic this way.

Let’s assume you have two matrices A and B of the same size. There are two ways to multiply them element by element :

```A=rand(100,1);
B=rand(100,1);
for i=1:100
C(i)=A(i)*B(i);
end
```

Or you could use the magical . operator. The point actually modify the multiplication operator so that it operates element by element, instead of doing a complicated matrix multiplication.

```A=rand(100,1);
B=rand(100,1);
C=A.*B;
```

Please note that the small point. A.*B is not equal to A*B. Here A*B will give you an error because you are not using properly the matrix multiplication (mathematically correct operations would be A’*B or A*B’ here where ‘ is the transpose). So the point MODIFIES the definition of * to make a multiplication of the entire A and B, element by element.

If you were to ask me, I believe this should be the default behavior of matrix multiplication but I can’t restart Mathematic history…

The point is (I couldn’t resist…) that this is EXTREMELY usefull and is probably one of the simplest form of vectorization you could do. The vectorize version A.*B is much faster than the for loop as Matlab operates on the entire matrix at once via its built-in routines.

In the same way, you can modify the behavior of mathematical operators like / 0r ^.

So you could equally do A./B to divide element by element and A.^B to take the power of each element of A with the corresponding element in B.

Already you can do a lot with this.

Vectorizing is more or less the art of getting rid of for loops. With this point operator, there is a type of for loops that you can easily simplify.

For instance let’s assume you are doing the following :

```A=rand(100,1);
B=rand(100,1);
for i=1:100
if B(i)>0.5
C(i)=A(i)^2;
else
C(i)=exp(B(i));
end
end
```

Vectorizing this is straight forward if you make use of the point and create an intermediate boolean matrix D.

```A=rand(100,1);
B=rand(100,1);
D=(B>0.5);
C=D.*(A.^2)+(~D).*exp(B);
```

Here D is a matrix of 1 and 0 of the same size of B. For each element in B that is superior to 0.5, there is a 1 or true in the corresponding element in D (and vice versa).

~ is the logical not operator so it is the exact opposite of D.

C is the sum of two matrix which are filled with A(i)^2 when D(i) is 1 or exp(B(i)) when ~(D(i)) is 1.

This entry was posted in Beginners, Optimizing your code. Bookmark the permalink.

### 19 Responses to The art of vectorizing – Part 1

1. Brian says:

Hi, I would like you to know that I have found your website really useful, and even enjoyable. over the years, I look topics up via Google and I’ve come across your website so many times that I recognize it without reading anything on the page! I’m finally going to bookmark your website, but, I’ve been learning from you all along. I’m an undergrad by the way; Applied Math and Materials Engineering major – thank you for your hard work.

2. Richard says:

I have always heard vectorizing is the way go but never tried too hard to go out of my way to make sure I was doing it — I guess because what I was doing just wasn’t that time sensitive. Today I thought I’d try out your suggestion. However I’m seeing some unusual results.

Can you think of any reason why if I use tic…toc around the 2 examples you give above, the for loop comes out on top?

Here’s the code:

A=rand(100,1);
B=rand(100,1);

tic
for i=1:100
if B(i)>0.5
C(i)=A(i)^2;
else
C(i)=exp(B(i));
end
end
toc

tic
D=(B>0.5);
C=D.*(A.^2)+(~D).*exp(B);
toc

Here’s the results:

Elapsed time is 0.000007 seconds.
Elapsed time is 0.000777 seconds.

Am I missing something?

• Jerome says:

Hi Richard,

I think you are seeing a mixture of preallocation time + JIT (for Just In Time) compilation. JIT can speed up computation of for loop quite significantly. The JIT is especially efficient on these simple for loops.

Try to read my post on FOR loop here :
http://www.matlabtips.com/matlab-is-no-longer-slow-at-for-loops/

Try to deactivate the JIT and compare. Also typically, the first time you run the code should be much slower (especially on the for loop side).

• Richard says:

Hey Jerome,

Thanks for the reply. That does make a lot of sense. I think I did run it a couple times before the numbers in my original post and I had about the JIT improving things. I have an older version of MATLAB kicking around on another computer that I may try this on out of curiosity, as well.

Thanks again for the reply and for this site (I just stumbled upon it and plan on digging around a bit more).
Cheers

3. Pieter van Vugt says:

I tried the following code, but i ran into trouble. A as a vector of real numbers, some of which are NaN, because sometimes a division by zero occurs when calculating A. I want to replace these NaN’s with zeros.

NANs=(isnan(A));
A=(~NANs).*A + (NANs).*0;

It doesn’t seem to work, because, 0 * NaN = NaN.

Do you have a clever way to do this in a vectorized manner, or do I have to resort to a for-loop?

• Jerome says:

A(isnan(A))=0 ?

• Pieter van Vugt says:

Aah! I didn’t expect that to work, because I know e.g. “A([0 1])=0” returns an error because the first index is zero. However your suggestion works because isnan() returns an array of logicals the same size as A, and not an array of numbers.

Brilliant in its simplicity! Thanks

• Jayden says:

That’s the thiinkng of a creative mind

4. rached says:

Hi everyboody,
Thanks a lot for this helpful work 😀 !
Please, can we vectorize every FOR or while loops?
Thanks!

• Jerome says:

I don’t think you should vectorize every loops. First, sometimes, it is not possible or easy to do so. Second, sometimes it goes against clarity and provide very little improvement in efficiency. Use common sense and proper evaluation of your bottle necks.

5. rached says:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
for i=1:size(img,3)

%image de reference
refLoc = model.ref(:,:,i);
refLoc2 = model.ref2(:,:,i);
%canal local de l’image
imgLoc = img(:,:,i);

%si region est foreground, l’update est plus lent

model.ref(:,:,i) = refLoc;
model.ref2(:,:,i) = refLoc2;
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%Thanks again for every thing!!!!

• Pieter van Vugt says:

my GUESS is, it is possible if all variables are either matrices of the same size as “img”, or scalars. So if “noupdateMask” is the same size as as “img(:,:,i)” (i.e., two-dimensional), then you would have to make it three dimensional with repmat(). This way you avoid ambiguities where MATLAB doesn’t know how to apply the maths to the different dimensions.

Do note however, that you exchange CPU time for memory usage. Every intermediate “mask”, as well as “noupdateMask”, and “refLoc” will become size(img,3) times larger in memory. Which might slow things down too for large data sets (such as a movie?), or even freeze up the PC. (Experience!)

You already have this vectorized along two dimensions, which is already pretty good, I’d say.

(Jerome, if you disagree, feel free to delete this reply)

6. rached says:

Thank you very much Pieter van Vugt! I find your response good and i thought the same thing!!
For information:“mask”, as well as “noupdateMask”, and “refLoc” have the same dimension as «img»!!
I am waiting for Jerome’s confirmation 🙂 and it will be pleasant 🙂

• Jerome says:

Yes, I think Pieter is right. Make sure you ran the profiler first to check if this loop is a bottle neck to your calculation. If it is the case, than maybe vectorizing further could help.

7. rached says:

I use the profiler and the loop is not taking a lot of time!! So there is no need for vectorization!!
Thanks a lot Jerome and Pieter !!!Both of you were very helpful for me !!!
i am very grateful!!! 😀

Hey Jerome. Just a question about your last example. It seems counter intuitive to do it the way you do this, since the for loop would take advantage of temporal memory? (something something more cache hits) Am I wrong in my line of reasoning?

9. Pingback: MatLab Basic 2 | New ThinKing

10. love spell says:

What’s up,I log on to your blog named “The art of vectorizing – Part 1 | Matlabtips.com” on a regular basis.Your humoristic style is witty, keep up the good work! And you can look our website about love spell.

11. Thomas says:

Great post! Have nice day ! 🙂 quziz

This site uses Akismet to reduce spam. Learn how your comment data is processed.