Structure, cell or both?

In this post, I talk about the two heterogeneous containers that are available to you in Matlab, the structure and the cell. I explain when you probably want to use one or the other and when you probably should not. As often, I end with some ideas for slightly more advanced programmers on how to combine cells and structures.

As you have most certainly read my post on variable ID cards, you are aware that any variable in the Matlab environment has a data type. Just as you should pick the right clothes for your day's activities (would you go running in a suit?), you should choose the right data type for your particular computing task. On many occasions, I have seen programs that were hard to read only because their variables were not appropriately stored. The way you organize your dataset has a great deal of influence on the way your program is structured.

Of all the data types, the most common are numerical datatypes like double. All of these are homogeneous containers. What does this mean?

Any variable needs to be stored somewhere in memory. When you create an array of elements in Matlab, the memory space needed for one element is repeated once per element. Homogeneous containers like double arrays use contiguous blocks of memory where all elements have the same size. As a result, you only need to know the location of the first element and the size of one element to know where all elements are located.
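
A quick way to see this uniform element size is to ask Matlab how many bytes an array occupies. The following is only a small sketch: the variable names are mine, and it relies on the bytes field that whos returns when it is called with an output argument.

```matlab
% Three homogeneous arrays with 1000 elements each (names are illustrative).
d = zeros(1, 1000);            % double: 8 bytes per element
s = zeros(1, 1000, 'single');  % single: 4 bytes per element
b = zeros(1, 1000, 'int8');    % int8:   1 byte per element

infoD = whos('d');
infoS = whos('s');
infoB = whos('b');

% Bytes per element = total bytes / number of elements.
bytesPerElement = [infoD.bytes, infoS.bytes, infoB.bytes] / 1000;
disp(bytesPerElement)
```

Because every element of a given array costs the same number of bytes, the position of element k can be computed directly from the position of the first one.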

This is very handy, but in some cases you have a list of elements that are not all the same size. A very good example of this is a list of strings. I am going to pick an example I have already used.

For instance, if you want to store how to say hello in multiple languages, you can write:

HelloLanguage{1}='Hello';
HelloLanguage{2}='Bonjour';
HelloLanguage{3}='Buenos días';

{} is used to access elements in a cell. So :

>> HelloLanguage{3}

ans =

Buenos días

A cell is, as we said, a heterogeneous container. Its elements can be of any size or class, which is why, in this particular example, it is useful for storing strings of different lengths.
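
To see why a regular character array is not enough here, consider this small sketch. The greeting list is shortened and the variable names are my own; the try/catch is only there so the failed attempt does not stop the script.

```matlab
% A cell happily holds character vectors of different lengths.
greetings = {'Hello', 'Bonjour', 'Hola'};
lengths = cellfun(@numel, greetings);   % number of characters in each entry

% Stacking the same words as rows of a regular char array fails,
% because every row of a homogeneous array must have the same length.
try
    stacked = ['Hello'; 'Bonjour'];
    stackingWorked = true;
catch err
    stackingWorked = false;
    disp(err.message)
end
```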

As with standard arrays, cells are indexed with numbers and can have any number of dimensions. You can do:

>> HelloLanguage{3,2,1}='Konnichiwa'

HelloLanguage =

    'Hello'    'Bonjour'       'Buenos días'
         []              []               []
         []    'Konnichiwa'               []
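
To make the automatic growth above explicit, this short sketch (with a shortened greeting list of my own) checks the size of the cell and the content of the compartments that Matlab had to create along the way.

```matlab
greetings = {'Hello', 'Bonjour', 'Hola'};   % starts as a 1x3 cell
greetings{3, 2} = 'Konnichiwa';             % writing outside the current size grows the cell

newSize = size(greetings);                  % now 3 rows by 3 columns
fillerIsEmpty = isempty(greetings{2, 1});   % compartments never assigned hold []
emptyMask = cellfun(@isempty, greetings);   % logical map of the unfilled compartments
disp(emptyMask)
```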

Before we move on to how to use this new object in more detail, I also want to introduce the cell mate, the structure. As opposed to cells, structures are not indexed through numbers but through field names. To use a structure in this particular example, you could do:

HelloLanguageStruct.English='Hello';
HelloLanguageStruct.French='Bonjour';
HelloLanguageStruct.Spanish='Buenos días';

And then:

>> HelloLanguageStruct

HelloLanguageStruct =

    English: 'Hello'
     French: 'Bonjour'
    Spanish: 'Buenos días'

Please note the use of the dot (.) followed by the field name. You are free to choose any field name that suits you, as long as it is a valid Matlab name (it starts with a letter and contains only letters, digits and underscores), so this object is quite flexible too. Already you can see that cells and structures serve different, complementary purposes. Cells are meant to be used when it makes sense to index individual elements with a number, while structures provide some additional classification of your objects through their field names.
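
Field names do not have to be typed literally. As a small sketch (the variable names are mine), fieldnames lists the fields, isfield tests for one, and the parenthesized form s.(name) reads a field whose name is held in a variable:

```matlab
greeting = struct();
greeting.English = 'Hello';
greeting.French  = 'Bonjour';

names = fieldnames(greeting);             % column cell array of field names
lang = 'French';
word = greeting.(lang);                   % dynamic field access
hasGerman = isfield(greeting, 'German');  % false: no such field was created

% Walk over every field without knowing the names in advance.
for k = 1:numel(names)
    fprintf('%s: %s\n', names{k}, greeting.(names{k}));
end
```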

Also keep in mind that individual elements in structures or cells can be absolutely anything, INCLUDING another cell or structure.
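
As an illustration of such nesting, here is a minimal sketch with made-up names: a structure that holds a cell, which in turn holds a matrix and another structure. The indexing operators are simply chained from the outside in.

```matlab
% A structure holding a cell, which in turn holds a matrix and another structure.
record.name = 'Ada';
record.data = {magic(3), struct('unit', 'mm')};

centerValue = record.data{1}(2, 2);    % element (2,2) of the matrix in the first compartment
unitName = record.data{2}.unit;        % field of the structure in the second compartment
```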

You could therefore store large arrays of different sizes in multiple compartments of a cell, like so:

>> CellStorage{1}=rand(10)

CellStorage =

[10x10 double]

The usage of {} is a little counter-intuitive at first because () also works on these objects.

Indeed :

>> CellStorage{1}

ans =

0.1622 0.4505 0.1067 0.4314 0.8530 0.4173 0.7803 0.2348 0.5470 0.9294

0.7943 0.0838 0.9619 0.9106 0.6221 0.0497 0.3897 0.3532 0.2963 0.7757

0.3112 0.2290 0.0046 0.1818 0.3510 0.9027 0.2417 0.8212 0.7447 0.4868

0.5285 0.9133 0.7749 0.2638 0.5132 0.9448 0.4039 0.0154 0.1890 0.4359

0.1656 0.1524 0.8173 0.1455 0.4018 0.4909 0.0965 0.0430 0.6868 0.4468

0.6020 0.8258 0.8687 0.1361 0.0760 0.4893 0.1320 0.1690 0.1835 0.3063

0.2630 0.5383 0.0844 0.8693 0.2399 0.3377 0.9421 0.6491 0.3685 0.5085

0.6541 0.9961 0.3998 0.5797 0.1233 0.9001 0.9561 0.7317 0.6256 0.5108

0.6892 0.0782 0.2599 0.5499 0.1839 0.3692 0.5752 0.6477 0.7802 0.8176

0.7482 0.4427 0.8001 0.1450 0.2400 0.1112 0.0598 0.4509 0.0811 0.7948

>> CellStorage(1)

ans =

[10x10 double]

The reason for this behavior is that {} gives access to the inside of a particular element in a cell, while () travels along the cell dimensions. It is as if {} gave the direct memory address of the data stored in the element, while () only gave back the memory address of the cell element (not the data). In essence, CellStorage(1) is still a cell, while CellStorage{1} is the actual matrix stored in there.
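
You can check this difference yourself with class. The sketch below uses a small cell of my own making; it also shows that () can slice several compartments at once and still returns a cell.

```matlab
storage = {rand(10), 'test', 42};

viaParens = storage(1);      % a 1x1 cell that still wraps the matrix
viaBraces = storage{1};      % the 10x10 double itself

classParens = class(viaParens);
classBraces = class(viaBraces);

subCell = storage(2:3);      % () can also slice out a smaller cell
```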

This is why this will give you an error:

>> CellStorage(2)='test'

Conversion to cell from char is not possible.

You can overcome that with:

>> CellStorage(2)={'test'}

Here you convert 'test' to a cell that can now be copied in. The alternative command does not need this conversion, as you directly access the inside:

>> CellStorage{3}='test'
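
The same distinction matters when you want to empty or remove a compartment. In this sketch (variable names are mine), assigning [] with {} empties the content but keeps the compartment, whereas assigning [] with () deletes the compartment altogether:

```matlab
storage = {1, 'test', [1 2 3]};

storage{2} = [];             % content replaced by [], the cell still has 3 compartments
countAfterEmptying = numel(storage);

storage(2) = [];             % compartment removed, the cell shrinks to 2 compartments
countAfterDeleting = numel(storage);
```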

Now that you know how to use cells, you are probably wondering why not use them ALL the time. Indeed, these objects are much MORE flexible than standard arrays.

However, this flexibility comes at a cost: memory size. As each element can be of any size, each one of them must store its own data type, size, and so on.

They all need to keep their individual ID card. This takes some space, so you should always try to avoid growing these cells or structures to a large size, especially if you could replace them with more standard homogeneous containers.
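
If you want to measure this cost on your own machine, a sketch along these lines can help. It stores the same 1000 numbers once as a double array and once as a cell with one number per compartment. The variable names are mine, and the byte count reported for the cell depends on your release and platform, so I do not quote a figure for it here.

```matlab
n = 1000;
asArray = rand(1, n);            % one homogeneous block of doubles
asCell = num2cell(asArray);      % the same values, one per cell compartment

infoArray = whos('asArray');
infoCell = whos('asCell');

fprintf('double array: %d bytes\n', infoArray.bytes);
fprintf('cell array:   %d bytes\n', infoCell.bytes);
```

The double array needs exactly 8 bytes per value, while the cell has to carry the ID card of every compartment on top of the data.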

Structures also have some additional capabilities that can be confusing. Like cells, structures can be indexed. Indeed, using the example above, one can do:

>> HelloLanguageStruct(1)

ans =

    English: 'Hello'
     French: 'Bonjour'
    Spanish: 'Buenos días'

But you could also do:

>> HelloLanguageStruct(2).English='test'

HelloLanguageStruct =

1×2 struct array with fields:

    English
    French
    Spanish

What is happening here?

Now HelloLanguageStruct is an ARRAY of structures. So HelloLanguageStruct(1) and HelloLanguageStruct(2) are both structures. BUT, and this is important, this is NOT like a cell because ALL of the elements of this array of structures MUST contain the same fields.

This is why you will get:

>> HelloLanguageStruct(2)

ans =

    English: 'test'
     French: []
    Spanish: []

The fields that were not filled in are there but empty. This particular behavior can be very useful if you want to create a small database of people's names, addresses and phone numbers, as every person will need the same fields.
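
Here is what such a small database could look like. This is only a sketch with invented names and numbers; it also shows two convenient consequences of all elements sharing the same fields: you can collect one field across the whole array, and adding a field to one element adds it (empty) to all the others.

```matlab
people = struct('name', {}, 'phone', {});   % empty struct array with two fields
people(1).name = 'Ada';
people(1).phone = '555-0100';
people(2).name = 'Grace';                   % phone is left unset for this entry

phoneMissing = isempty(people(2).phone);    % unset fields read back as []
allNames = {people.name};                   % collect one field across the whole array

people(1).email = 'ada@example.com';        % adding a field to one element...
emailEverywhere = isfield(people, 'email') && isempty(people(2).email);  % ...adds it to all
```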

In some ways, the array of structures is a homogeneous container of heterogeneous containers.

The million dollar question now is: what if you actually need an array of structures that have different fields?

In that case you should combine both and store all of your individual structures in a cell, like so:

>> MultiDictionary{1}=struct('English','Hello','French','Bonjour','Spanish','Buenos días');

>> MultiDictionary{2}=struct('English','Goodbye','French','Au revoir');

>> MultiDictionary{1}

ans =

    English: 'Hello'
     French: 'Bonjour'
    Spanish: 'Buenos días'

>> MultiDictionary{2}

ans =

    English: 'Goodbye'
     French: 'Au revoir'

That way, you can store various structures with different numbers of fields in the same big cell.
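
Since the structures in such a cell no longer share the same fields, code that reads them should check before accessing a field. A minimal sketch, with a dictionary and a fallback text of my own choosing, could look like this:

```matlab
dictionary = { ...
    struct('English', 'Hello',   'French', 'Bonjour', 'Spanish', 'Hola'), ...
    struct('English', 'Goodbye', 'French', 'Au revoir')};

% Number of fields stored in each structure of the cell.
fieldCounts = cellfun(@(s) numel(fieldnames(s)), dictionary);

% Look up a field that only some entries have, with a fallback.
spanish = cell(size(dictionary));
for k = 1:numel(dictionary)
    if isfield(dictionary{k}, 'Spanish')
        spanish{k} = dictionary{k}.Spanish;
    else
        spanish{k} = '(no translation stored)';
    end
end
disp(spanish)
```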

This entry was posted in Intermediate.

6 Responses to Structure, cell or both?

  1. Archit says:

    How do i define a struct containing a particular number of fields and no more i.e. no additional fields should be created by accident. Eg. struct Velocity has fields Vx, Vy, Vz and if i try to access VX then it should be an error.
    What happens now is it creates an additional field in the entire struct. I do not want this behaviour
    Thanks for ur great blog.

    • Jerome says:

      I don’t think you can achieve that with structures.
      I think you need to do some object oriented programming to achieve that.
      I would define an object with your 3 properties Vx, Vy and Vz and that’s it.

      • Archit says:

        Just curious, why have u not covered a tutorial on OOP in MATLAB
        Is it because of lack of time or is it because OOP is not that popular in MATLAB

        • Jerome says:

          I want to. I didn’t have time to do that yet. OOP is getting more and more popular. Each new release of Matlab introduce more and more built-in objects.

  2. Archit says:

    Thanks in advance

  3. kris says:

    Appreciate the overview – you are very good at explaining the material and making it easy to understand for us newbies.
