How to load Tiff stacks fast, really fast!

Loading Tiff files has been slow for many years in Matlab. With the recent introduction of the TIFF library, things have improved a lot. But still, when it comes to loading large datasets stored in Tiff files, Matlab functions are not as good as they could be. Today I am going to introduce a few lines of code that will make all of this a thing of the past.

Quite some time ago, I introduced “inlining” as a way to boost your code’s efficiency. The basic principle is to look for Matlab functionalities that are not built-in functions. If these happen to be slowing down your calculations, you can open the underlying M-file and fish out the few lines of code that are relevant to you. Today’s post is basically a comprehensive example of this technique so that, even if you don’t care about Tiff files, it is still useful as a guideline for optimization.

Okay, now that the introduction is done, let’s dig in.

Let’s suppose you have a Tiff file named ImageStack.tif with a series of images stored in it. In good old Matlab code, you would use this code to load it into a 3D matrix:

FileTif='ImageStack.tif';
InfoImage=imfinfo(FileTif);
mImage=InfoImage(1).Width;
nImage=InfoImage(1).Height;
NumberImages=length(InfoImage);

FinalImage=zeros(nImage,mImage,NumberImages,'uint16');
for i=1:NumberImages
   FinalImage(:,:,i)=imread(FileTif,'Index',i);
end

imfinfo is used to get the size of the movie stack so that the big matrix can be preallocated. Nothing is especially fancy in this code. This would be the way most people load a tif stack if there were no performance issues.

With this particular code and a decent dataset of 1575 images (256 by 256 pixels) in a single Tiff file, it takes approximately 200 seconds to run on my computer.
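For anyone who wants to reproduce this kind of measurement, here is a minimal sketch using tic/toc. It is not my actual benchmark: the file name SyntheticStack.tif and the tiny stack size (20 frames of 64 by 48 pixels) are assumptions chosen so that the example runs anywhere, so the absolute timing will be nothing like 200 seconds.

```matlab
% Build a small synthetic 16-bit stack (hypothetical name and size).
FileTif = fullfile(tempdir, 'SyntheticStack.tif');
if exist(FileTif, 'file') == 2
    delete(FileTif);                      % avoid appending to an old file
end
nImage = 64; mImage = 48; NumberImages = 20;
Reference = zeros(nImage, mImage, NumberImages, 'uint16');
for i = 1:NumberImages
    Reference(:,:,i) = i;                 % frame i is filled with the value i
    if i == 1
        imwrite(Reference(:,:,i), FileTif, 'Compression', 'none');
    else
        imwrite(Reference(:,:,i), FileTif, 'WriteMode', 'append', 'Compression', 'none');
    end
end

% Time the plain imfinfo + imread loop.
tic;
InfoImage = imfinfo(FileTif);
Loaded = zeros(InfoImage(1).Height, InfoImage(1).Width, numel(InfoImage), 'uint16');
for i = 1:numel(InfoImage)
    Loaded(:,:,i) = imread(FileTif, 'Index', i);
end
ElapsedSeconds = toc;
fprintf('Loaded %d frames in %.3f seconds.\n', numel(InfoImage), ElapsedSeconds);
```

Swap in your own file name to time a real stack; only the loop between tic and toc matters for the comparison.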

To give you an idea of how awful this is, ImageJ, a program very widely used in microscopy, takes approximately 3 seconds to load the same stack.

To help solve this issue, Mathworks modified imread to allow feeding some additional info and avoid some overhead within imread, as mentioned in the help :

Note: When reading images from a multi-image TIFF file, passing the output of imfinfo as the value of the ‘Info’ argument helps imread locate the images in the file more quickly.

So the new version of the code, as of a couple of years ago, was:

FileTif='ImageStack.tif';
InfoImage=imfinfo(FileTif);
mImage=InfoImage(1).Width;
nImage=InfoImage(1).Height;
NumberImages=length(InfoImage);
FinalImage=zeros(nImage,mImage,NumberImages,'uint16');

for i=1:NumberImages
   FinalImage(:,:,i)=imread(FileTif,'Index',i,'Info',InfoImage);
end

This new code takes 40 seconds on my computer. A big improvement. But we are still quite far from ImageJ and its 3 seconds.

For a few years, nothing happened on this front. Many voices were raised to clearly bring this issue to a higher priority at Mathworks.

And then, a miracle happened (Hallelujah!) with Matlab 2011!

Mathworks decided to port the TIFF library to Matlab. The new code to read a Tiff stack is now:

FileTif='ImageStack.tif';
InfoImage=imfinfo(FileTif);
mImage=InfoImage(1).Width;
nImage=InfoImage(1).Height;
NumberImages=length(InfoImage);
FinalImage=zeros(nImage,mImage,NumberImages,'uint16');

TifLink = Tiff(FileTif, 'r');
for i=1:NumberImages
   TifLink.setDirectory(i);
   FinalImage(:,:,i)=TifLink.read();
end
TifLink.close();
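As a side note, the Tiff object can also report the frame size and count by itself. The following is only a sketch of that variation, not something I benchmarked: it assumes a small synthetic file (hypothetical name TiffOnlyStack.tif) and I make no claim that walking the directories is faster than imfinfo.

```matlab
% Write a small synthetic stack so the example is self-contained.
FileTif = fullfile(tempdir, 'TiffOnlyStack.tif');
if exist(FileTif, 'file') == 2
    delete(FileTif);
end
Reference = zeros(32, 40, 6, 'uint16');
for i = 1:size(Reference, 3)
    Reference(:,:,i) = 100 * i;
    if i == 1
        imwrite(Reference(:,:,i), FileTif);
    else
        imwrite(Reference(:,:,i), FileTif, 'WriteMode', 'append');
    end
end

TifLink = Tiff(FileTif, 'r');
nImage = TifLink.getTag('ImageLength');   % number of rows
mImage = TifLink.getTag('ImageWidth');    % number of columns

% Count the images by walking the directories once.
NumberImages = 1;
while ~TifLink.lastDirectory()
    TifLink.nextDirectory();
    NumberImages = NumberImages + 1;
end

FinalImage = zeros(nImage, mImage, NumberImages, 'uint16');
for i = 1:NumberImages
    TifLink.setDirectory(i);
    FinalImage(:,:,i) = TifLink.read();
end
TifLink.close();
```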

Again, a big improvement: the very same file is now loaded in 19 seconds. Still, this is rather slow compared to ImageJ, so I decided to really push on this front, as I load Tiff files in Matlab many times every single day, some having many more images than 1575 (up to 10000 and more).

It turned out, as you dig into Mathworks’ implementation of the TIFF library, that they did a very poor job of limiting the overhead when dealing with Tiff stacks. This is especially annoying as I believe that the main advantage of this move was to get faster at stacks.

Indeed, when you run the profiler on this code, you should get this:

As you can see, the number one process is Tiff.getTag. getTag is used to get some properties of the image. So they actually repeated the mistake they made with imread, as this function is called 28350 times to read my stack (that is, 18 calls for each of the 1575 images)!
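To get this kind of ranking yourself from the command line, a sketch like the following can be used. It assumes a small synthetic stack (hypothetical name ProfiledStack.tif) and only relies on profile, Tiff and imwrite; the function names and call counts you will see depend on your Matlab release, so do not expect the exact numbers quoted above.

```matlab
% Write a small synthetic stack to profile against.
FileTif = fullfile(tempdir, 'ProfiledStack.tif');
if exist(FileTif, 'file') == 2
    delete(FileTif);
end
NumberImages = 10;
for i = 1:NumberImages
    Frame = zeros(64, 64, 'uint16') + uint16(i);
    if i == 1
        imwrite(Frame, FileTif);
    else
        imwrite(Frame, FileTif, 'WriteMode', 'append');
    end
end

% Profile the Tiff-class loader.
profile on
TifLink = Tiff(FileTif, 'r');
for i = 1:NumberImages
    TifLink.setDirectory(i);
    Frame = TifLink.read();
end
TifLink.close();
profile off

% List the five entries with the largest total time.
Stats = profile('info');
[~, Order] = sort([Stats.FunctionTable.TotalTime], 'descend');
for k = 1:min(5, numel(Order))
    Entry = Stats.FunctionTable(Order(k));
    fprintf('%-45s %6d calls %8.3f s\n', Entry.FunctionName, Entry.NumCalls, Entry.TotalTime);
end
```

Calling profile viewer instead of the last block opens the same data in the graphical report, where you can click through the call tree.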

What we want to do now is to use the profiler to select the pieces of code that are relevant and get rid of the rest. So within the profiler I clicked on Tiff.read and realized that Tiff.read makes a call to Tiff.readAllStrips, which in turn makes many calls to Tiff.readEncodedStrip. There, deeply buried within a loop that goes over all the pixels of the data, was the real call to tifflib, the original compiled library.

Oh surprise, most of the time is not spent reading the data! Most of the time is spent checking that the image has a certain color profile (look at the call to Photometrics).

This is a golden example of why inlining can be extremely efficient.

So I went through all these functions, copied and pasted some code and tried to make a new loader that makes smarter use of tifflib for stacks. This is the new code:

FileTif='ImageStack.tif';
InfoImage=imfinfo(FileTif);
mImage=InfoImage(1).Width;
nImage=InfoImage(1).Height;
NumberImages=length(InfoImage);
FinalImage=zeros(nImage,mImage,NumberImages,'uint16');
FileID = tifflib('open',FileTif,'r');
rps = tifflib('getField',FileID,Tiff.TagID.RowsPerStrip);

for i=1:NumberImages
   tifflib('setDirectory',FileID,i);
   % Go through each strip of data.
   rps = min(rps,nImage);
   for r = 1:rps:nImage
      row_inds = r:min(nImage,r+rps-1);
      stripNum = tifflib('computeStrip',FileID,r);
      FinalImage(row_inds,:,i) = tifflib('readEncodedStrip',FileID,stripNum);
   end
end
tifflib('close',FileID);

What this code does is bypass the M-file wrapper written by Mathworks (the one that is very bad at stacks) around their built-in MEX file. So I now make direct calls to tifflib.

The problem is that Matlab does not place tifflib within your search path, so you MUST copy the compiled library from your own distribution of Matlab into your function folder. On my Mac, this file is at:

/Applications/MATLAB_R2011b.app/toolbox/matlab/imagesci/private and is called tifflib.mexmaci64. I copied this file into the folder where my M-file code is located.

This also means that, in my case, this function will work only on 64-bit Macs until I copy the MEX files for the other platforms.

Keep in mind that I also removed lots of overhead that checks the particular Tiff type (in this example it is a chunky file), so you might want to create several loaders depending on the file type (instead of checking the file type at every pixel like Mathworks did). The current code works for my particular application.
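One way to organize this is to inspect the file layout once, before the loop, and then pick a loader. The sketch below only shows the check itself, using the documented Tiff class rather than tifflib. The file name LayoutCheck.tif and the loader names are hypothetical placeholders, not functions that exist in Matlab or in my code.

```matlab
% Write a single-frame test file so the example is self-contained.
FileTif = fullfile(tempdir, 'LayoutCheck.tif');
if exist(FileTif, 'file') == 2
    delete(FileTif);
end
imwrite(zeros(32, 32, 'uint16'), FileTif);

% Query the layout once per file instead of once per strip or pixel.
TifLink = Tiff(FileTif, 'r');
IsTiled  = TifLink.isTiled();
IsChunky = TifLink.getTag('PlanarConfiguration') == Tiff.PlanarConfiguration.Chunky;
TifLink.close();

% Choose a loader name (hypothetical labels for illustration only).
if IsTiled
    LoaderName = 'tiled';
elseif IsChunky
    LoaderName = 'chunky-strips';      % the case handled by the code above
else
    LoaderName = 'separate-planes';
end
fprintf('Selected loader: %s\n', LoaderName);
```

The point of the design is simply that the branch is taken once per file, so the inner reading loop stays free of type checks.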

With this in mind, using TIC/TOC routines, this code now takes 1.5 seconds. Yes, I am not joking, Matlab is now FASTER than ImageJ.

I hope Mathworks is reading this for their next release… They might consider changing their wrapper, as I am not the only one around who uses Tiff stacks…

NOTA: A few months after I posted about it, Mathworks released a bug fix for the TIFF class to deal with this issue. The new class is far better and gives very decent loading times. I recommend you download it and replace your local copy of Tiff.m with the bug fix. Direct calls to these new libraries still provide a little boost, but not as drastic.


25 Responses to How to load Tiff stacks fast, really fast!

  1. Simon P says:

    Unfortunately I am not seeing this ; I tried the TifLink approach vs. imread (though in fairness, this was 2012a), and here is my output:


    >> tic ; im = load_image('Image_Registration_4_an167951_2012_05_08_main_325'); toc
    Elapsed time is 0.709723 seconds.
    >> tic ; im = load_image('Image_Registration_4_an167951_2012_05_08_main_325'); toc
    Elapsed time is 1.043148 seconds.
    >>

    The first call was made with imread, with the ‘Info’ correction, while the second call was made with the use of Tiff. Here is the relevant code (I would just set the 1 to 0 to test imread):


    if (1) % sadly this did not help :(
    TifLink = Tiff(fullpath, 'r');
    for f=1:length(frames) % TIFF nice!
    infile_idx = frames(f);
    % skip inappropriately sized frames
    if (imf(infile_idx).Width == imf(1).Width & imf(infile_idx).Height == imf(1).Height)
    TifLink.setDirectory(frames(f));
    im(:,:,f) = TifLink.read();
    else
    disp(['load_image::not allowed to have files with disparate frame sizes -- skipping frame ' num2str(infile_idx) ' in ' fullpath]);
    end
    end
    TifLink.close();
    else % imread
    for f=1:length(frames)
    infile_idx = frames(f);
    % skip inappropriately sized frames
    if (imf(infile_idx).Width == imf(1).Width & imf(infile_idx).Height == imf(1).Height)
    im(:,:,f) = imread(fullpath, infile_idx, 'Info', imf);
    else
    disp(['load_image::not allowed to have files with disparate frame sizes -- skipping frame ' num2str(infile_idx) ' in ' fullpath]);
    end
    end
    end

    Will try the mex suggestion but could it be that Mathworks actually improved imread in the most recent iteration?

    • Jerome says:

      I suppose you do preallocation of the big image matrix?
      Because if you don’t the first loop with Tiff is doing it for the second loop which might bias your measurements.

      • Jerome says:

        Oops, sorry. I misunderstood your code. It’s possible they changed imread recently to use the Tiff library in a better way. I am using 2011 here. Maybe I should upgrade to try it out.
        In my hands, imread had a lot of overhead with Tiff stacks. I am curious to see if the last approach works out nicely for you.

        • Simon P says:

          Yea, sorry about the messy code — just used a regular code tag and didn’t realize it would kill my indents.

          Anyways, yes I do preallocation. Will post once I try the mex file thing.

          As an aside, I did this on a mac ; but the same holds true when I did it on a linux box.

          • Jerome says:

            I saw some improvements both on mac and windows, I believe. From the release notes, it doesn’t seem like they changed imread in 2012.
            Have you tried this with very big stacks, like 500-800 Mb on HD?

  2. Simon P says:

    Jerome — Main reason I was curious about these workarounds was because I am typically in the 1-2 GB regime.

    And I just tried your tifflib mex file hack, and that worked splendid — 27 s to 2 s.

    Thanks so much for this!

  3. Masood Samim says:

    hey, do you know how to read very large (25 Gig) multipage files. The imfinfo returns an error (below) and says the file is corrupt, but i know it is because it’s too big.

    The indicated starting byte of the first image file directory (1048584) exceeds the
    image file size (-335978638), the image may be corrupt.

    Error in ==> imtifinfo at 57
    raw_tags = tifftagsread ( filename );

    or any other way to read large file very fast and take advantage of the imfinfo.

    Thanks

    • Jerome says:

      TIFF files are not supposed to be that big according to the standard of the file format.
      Usually these huge TIFFs are created using ImageJ. To make this work, ImageJ actually departs from the standard and stops recording frames into directories. I think you would greatly benefit from migrating to HDF5 files. These are much faster than TIFF at such large sizes and they are designed for this exact purpose.
      Check my post on this particular issue.
      http://www.matlabtips.com/how-to-store-large-datasets/

  4. micha says:

    hey, I tried your code using Matlab 2012b. Unfortunately, for me it did not work — i copied the tifflib to my folder and it is running. but it takes pretty much the same time as with imread — around 45 seconds for 20000 frames of a 170000 frames file with 128×80 pixels. With ImageJ the whole 170000 frames file is loaded in less than a second. the file is 16bit, could this be a problem?

    do you have any other suggestions?

    also: take care: you mixed up ImageWidth and ImageHeight … you will notice, if you use a non-squared image ;)

    any helpful comment would be much appreciated =)
    cheers

    • Jerome says:

      I suppose you tried the last code?
      16 bits is not a problem here, I believe.
      Have you run the profiler on your machine? Where is the bottleneck?

      • Vincent says:

        There is indeed a typo in the code. ‘mImage’ in loop needs to be replaced with ‘nImage’ (see below). For tiff stacks where Width >> Height, the code is much slower than imread (using the most recent patch to Tiff).


        InfoImage=imfinfo(FileTif);
        mImage=InfoImage(1).Width;
        nImage=InfoImage(1).Height;
        NumberImages=length(InfoImage);
        FinalImage=zeros(nImage,mImage,NumberImages,'uint16');
        FileID = tifflib('open',FileTif,'r');
        rps = tifflib('getField',FileID,Tiff.TagID.RowsPerStrip);

        for i=1:NumberImages
           tifflib('setDirectory',FileID,i);
           % Go through each strip of data.
           rps = min(rps,nImage);
           for r = 1:rps:nImage
              r
              row_inds = r:min(nImage,r+rps-1);
              stripNum = tifflib('computeStrip',FileID,r);
              FinalImage(row_inds,:,i) = tifflib('readEncodedStrip',FileID,stripNum);
           end
        end
        tifflib('close',FileID);

        • Jerome says:

          Thanks.
          I corrected the typo.
          As I said in the NOTA, Mathworks fixed their Tiff class for good (after nearly half a decade of extreme slowness). So the trick is no longer worth the effort. Basically the new version loads the entire image as one single ‘Strip’, if possible, instead of doing it nearly at the pixel level.
          You can still get a boost in speed by taking the new libraries and using them to get rid of one for loop (the one that goes across the strips, i.e. for r=1:rps:nImage). In my hands, it gives something like a 10-20 % boost. I am not sure it is worth it since, to do so, you possibly have to break compatibility with future releases of Matlab.

          • Jerome says:

            This is probably why the proposed code is not good with elongated images, as it is still processing each image as multiple strips. For elongated images, that’s a lot of strips…
            You can try my proposed trick in my last comment in that particular case.

  5. Daniel says:

    I tried using your code, it doesn’t seem to help me. It is considerably slower than just using imread+file info. Maybe it is because my tiff files are a bit different, they are 1312×1082 pixel images (up to 750 of them). Maybe imread works better for those larger images, I don’t know what the rtifc.mex is doing…

    • Jerome says:

      If I understand correctly, you have 750 files? That’s a very different case than what I propose here as I am working with single tiff file with many images in it.
      If you do have a single file, then you should try using the mex file approach.

      • Daniel says:

        I was a bit unclear I guess. I have one multi-page file with 750 images in it. I tried the mex-file approach (if that is the last code you present above), but I still get a better performance with imread.
        If you’re interested I can send you the profiler reports.

        The images are LZW compressed, if that makes a difference.
        Also, the machine I am running the code on is just a standard laptop, not a lot of processing power…

  6. Noah says:

    Hi Jerome! This code is fantastic. Thank you so much for posting it. I wonder if a similar solution is possible for writing a very large multi-page tiff. So given a 3D matrix Img(x,y,time) – do you know the syntax to write a multi-page tiff ( with “time” number of pages and each page of dimensions X and Y) using the tifflib mex?

  7. cristina says:

    Super useful :) thank you

  8. Move Left says:

    The code often fails with:

    Error using tifflib
    The strip number must be positive and less than or equal to 1000.

    Error in testtiffi (line 26)
    FinalImage(row_inds,:,i) = tifflib('readEncodedStrip',FileID,stripNum);

  9. Nico says:

    Hi,

    Thanks for your work !

    I encountered the same problem (or so): I got the error

    Error using tifflib
    The strip number must be positive and less than or equal to 64.

    (and stripNum stopped at 64…)

    I use matlab 2013a

    Where can the library fix be found?

  10. Erik says:

    Great article & site.

    I have used similar techniques on functions like interp1, which contains all sorts of error trap code and coping mechanisms to handle vectors of different orientations. I can’t remember the figures but got a massive speed increase by essentially writing my own in the end.

  11. dario says:

    I receive the error

    The strip number must be positive and less than or equal to 1.

    And the resulting strip number is always 1, no matter the value of r

    Can you help me? Do you have a Tiff file to test whether the problem is my image?

    thanks

    • Jerome says:

      Hello,

      Thanks for your message. I found this fix about 2 years ago on an older version of Matlab. Mathworks has fixed their Tiff library implementation in the meantime. They also changed their underlying code, so my code probably does not work well with the new library. Besides, the current Tiff class (in 2013b) gives good loading results.
