A few months ago I wrote a piece of code capable of reading large "RAW" files on disk into VTK ImageData structures. It works pretty well, and I’ve seen some simply amazing results from it in ezViz. (One researcher volume renders his 1080x540x1080 dataset in 38 seconds load-to-PNG with it!) It does some fancy stuff like supporting various data types (char, integer, float, signed, unsigned, etc.) and byte-swapping. It wasn’t all that hard to write, although it does require that the entire file fit in memory in one chunk.

Well, a few weeks ago a researcher sent me an email saying that he was getting an error, "Unable to allocate memory", quickly followed by a segmentation fault. I recognized the error as a sanity check I added near the malloc in that code. I figured he simply didn’t have enough memory on that machine, and suggested he try it on Amethyst (with its 128G of memory). Yesterday he finally got a chance to try it, and still got the same error. We talked for a bit, he gave me access to his data & scripts, and I tried it myself.

The file he was loading was a 1080x540x1080 grid of floats. Do the math and you get 1080 x 540 x 1080 x 4 (bytes per float) = 2,519,424,000 bytes, or roughly 2.5G of memory. That should have fit easily into the 128G of RAM on the machine, so what was going wrong? I initially thought I had hit a 2GB memory barrier, so I began looking for a 64-bit version of malloc. From what I could find, 64-bit Linux automatically uses a 64-bit-capable malloc, so that wasn’t the problem.

After a bit of searching, I finally found the problem.  The code looked like this:

    if ((data = (unsigned char*)malloc(iSize * jSize * kSize *
                                       NumComponents * NumberSize))
             == NULL) {
        perror("Unable to allocate memory!\n\t");
        return NULL;
    }

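Seems simple enough, right? So what was going wrong? After some diagnostics, I realized that the large multiplication (i*j*k*components*size) was all being done with ints, which top out at 2^31 - 1 (2,147,483,647). The 2,519,424,000-byte product overflowed that limit and came out as a big negative number. You can’t allocate negative memory: once that value is converted to malloc's unsigned size argument it turns into an impossibly large request, so malloc returned NULL and my sanity check fired.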
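To make the numbers concrete, here is a minimal sketch that models the wraparound. The function names are hypothetical (they are not from the loader itself), and it assumes a 32-bit `int` and one component per voxel, which matches the 4-bytes-per-float math above. Signed overflow is formally undefined behavior in C, so the sketch computes the exact product in 64 bits and then works out what the low 32 bits look like as a two's-complement value, rather than actually overflowing an `int`.

```c
#include <stdint.h>

/* Exact byte count, with every multiplication carried out in 64 bits
 * because the first operand is widened before multiplying. */
int64_t exact_bytes(int i, int j, int k, int components, int size)
{
    return (int64_t)i * j * k * components * size;
}

/* Model of what a 32-bit signed int typically ends up holding:
 * keep the low 32 bits, then read them as two's complement. */
int64_t wrapped_as_int32(int64_t exact)
{
    uint32_t low = (uint32_t)exact;           /* reduce modulo 2^32 */
    if (low > (uint32_t)INT32_MAX)
        return (int64_t)low - 4294967296LL;   /* subtract 2^32 */
    return (int64_t)low;
}
```

For the 1080x540x1080 float grid, `exact_bytes(1080, 540, 1080, 1, 4)` is 2,519,424,000, and `wrapped_as_int32` of that value is -1,775,543,296, which is the kind of "big negative number" described above.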

The solution was simple enough: typecast everything to unsigned long:

    if ((data = (unsigned char*)malloc((unsigned long)iSize *
                                       (unsigned long)jSize *
                                       (unsigned long)kSize *
                                       (unsigned long)NumComponents *
                                       (unsigned long)NumberSize))
             == NULL) {
        perror("Unable to allocate memory!\n\t");
        return NULL;
    }

That fixed it. Works like a charm now. Type conversion in C is one of the trickier problems to find and figure out, and once again I got burned by it.
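A more defensive variant of the same fix is to do the multiplication in `size_t` (the type malloc actually takes) and refuse to allocate if the product would overflow. The sketch below is only an illustration: `checked_volume_bytes` and `alloc_volume` are hypothetical helpers, not part of the original code. One reason to reach for `size_t` rather than `unsigned long` is that `unsigned long` is still 32 bits on 64-bit Windows, so the cast above is only safe on platforms like 64-bit Linux where it is 64 bits wide.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Multiply the dimensions into *out using size_t arithmetic.
 * Returns 0 on success, or -1 if a dimension is negative or the
 * product would not fit in a size_t. */
int checked_volume_bytes(const int *dims, size_t ndims, size_t *out)
{
    size_t total = 1;
    for (size_t n = 0; n < ndims; n++) {
        if (dims[n] < 0)
            return -1;
        size_t d = (size_t)dims[n];
        /* total * d overflows exactly when total > SIZE_MAX / d */
        if (d != 0 && total > SIZE_MAX / d)
            return -1;
        total *= d;
    }
    *out = total;
    return 0;
}

/* Hypothetical wrapper: returns NULL on bad sizes, overflow,
 * a zero-byte request, or malloc failure. */
unsigned char *alloc_volume(int iSize, int jSize, int kSize,
                            int numComponents, int numberSize)
{
    const int dims[5] = { iSize, jSize, kSize, numComponents, numberSize };
    size_t bytes;

    if (checked_volume_bytes(dims, 5, &bytes) != 0 || bytes == 0)
        return NULL;
    return (unsigned char *)malloc(bytes);
}
```

The caller still checks for NULL exactly as before; the difference is that an oversized request is caught before malloc ever sees a wrapped-around size.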
[tag:c][tag:memory][tag:malloc][tag:64bit]