Copy on Write

Since I was recently talking about Pointers in Matlab, I thought I should complete the picture by saying a few words about the Copy-On-Write mechanism in Matlab. This is rather advanced and undocumented Matlab, so I hope I am clear enough. The details may change in a new release of Matlab if MathWorks decides to change its internal mechanism, but for now this is still valid.

You can read the Wikipedia article here:

http://en.wikipedia.org/wiki/Copy-on-write

But I still want to spell out what this means for us Matlab programmers.

Copy-On-Write is used to optimize memory usage. When a programmer makes a copy of a variable and never changes it afterwards, duplicating the data in memory is a total waste. So it makes sense to duplicate the variable in memory if and only if the program actually changes the value of that copy, and to do so at the moment of the change. Hence the name Copy-On-Write.

This has several important consequences for you. The first is that Matlab actually uses pointers in the background. Whenever you copy a variable to a new variable, the copy is effectively just a pointer to the same data until you change its value. Matlab does this in a smart way, so that it is completely transparent to you.
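To make "transparent" concrete, here is a tiny sketch (the variable names and values are just for illustration). Whatever sharing happens internally, the two variables behave as if they had always been independent:

```matlab
A = [1 2 3];
B = A;        % internally, B may simply share A's data at this point
B(2) = 20;    % writing to B gives it its own data

% From the programmer's side nothing unusual is visible:
% A keeps its original values and only B has changed.
assert(isequal(A, [1 2 3]));
assert(isequal(B, [1 20 3]));
```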

The second, and probably much more important, consequence is that Matlab can spend a lot of time on unexpected lines. Consider the following code:

A=rand(400,400,400);
B=A;
B(1,1,1)=0;

If you run the Profiler on this, you will get the following result:

time   calls  line
4.72       1    1 A=rand(400,400,400);
0.10       1    2 B=A;
2.48       1    3 B(1,1,1)=0;

If you did not know about Copy-On-Write, you would be quite puzzled by how much time Matlab spends on line 3. A simple value change that takes 2.5 s! But now you know that this is because Matlab makes the actual copy of the variable on line 3, not on line 2.
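If you do not want to open the Profiler, a rough tic/toc sketch along the same lines might look like this. The size n is my own arbitrary choice (smaller than above, to go easy on memory), and the timings will of course depend on your machine and Matlab release:

```matlab
% Minimal timing sketch; n is an arbitrary size chosen for illustration.
n = 200;
A = rand(n, n, n);                 % about 64 MB of doubles

tic; B = A;          tCopy  = toc; % only shares the data, nothing duplicated yet
tic; B(1,1,1) = 0;   tWrite = toc; % first write to B triggers the real copy

fprintf('B = A        : %.4f s\n', tCopy);
fprintf('B(1,1,1) = 0 : %.4f s\n', tWrite);
```

On a large enough array I would expect the second timing to be clearly larger than the first, for the reason explained above.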

You can often use this mechanism to optimize your code, for example when you want to completely change the content of a big matrix but need to keep a copy of the old matrix in order to compute the new one.
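Here is one possible reading of that pattern, as a sketch only. The update rule 2*old + 1 and the matrix size are made up for illustration, and the comments describe the behaviour I would expect from Copy-On-Write rather than anything documented:

```matlab
% Hypothetical example: the update rule and the size are invented.
A = rand(500);

old = A;            % cheap: old and A share the same data for now
A = 2*old + 1;      % A receives a brand-new result; old keeps the original values

% By contrast, an indexed write such as A(1,1) = 0 made while old still
% shares the data would first force a full duplicate of the matrix.
assert(isequal(A, 2*old + 1));
```

The point is that keeping old around costs almost nothing here, because the old data is never duplicated: A simply moves on to a new array.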

If you want to see these hidden pointers in action, you can use the undocumented command ‘format debug’, as explained here:

http://www.mathworks.com/matlabcentral/newsreader/view_thread/15988
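A session with it could look roughly like the sketch below. Since ‘format debug’ is undocumented, I am assuming here that your release still supports it and that it prints a data pointer field (usually labelled pr) along with each value; the exact layout may differ or be missing entirely:

```matlab
format debug     % undocumented; output layout may vary between releases

A = [1 2 3]      % no semicolon, so the debug information is displayed
B = A            % the data pointer shown for B should match the one for A
B(1) = 0         % after the write, B should show a different data pointer

format short     % switch back to a regular display format
```

If the pointer stays the same after B = A and changes only after B(1) = 0, you have just watched Copy-On-Write happen.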
