Arrays VS Structures
By Pete Freitag
Array's and Structures (structures in CF are called hashtables, or associative arrays) are two very different data structures. There is some confusion about how arrays work in CF, and from what I can understand they are based on native java arrays.
Once common misconception is that the following code will only allocate two slots of memory:
<cfset a = ArrayNew(1)> <cfset a[1] = "One"> <cfset a[10] = "Ten">
What is really going on here is that cf is allocating an Array with 10 elements , elements 2-9 are null. The memory allocated is about 8 bytes for each element in the array (these are references the objects they hold). The memory is allocated in one contiguous block, so if you have a very large range in your Array, it can be fairly taxing on your memory manager.
You can see the size of the array, and the length of the array is 10 elements by using either ArrayLen
or the .size()
method (undocumented).
<cfoutput> Len: #ArrayLen(a)# Size: #a.size()# </cfoutput>
It is a common misconception that CF will only allocate two slots of memory, but from my understanding it will allocate all ten, so beware of this.
If your wasting a lot of space in the array, the other option is to use a structure:
<cfset a = StructNew()> <cfset a[1] = "One"> <cfset a[10] = "Ten"> <cfoutput>Size: #a.size()#</cfoutput>
You will notice that with a structure the size of the data structure will be just 2, instead of 10. Also note that the index of the structure is treated as a string, so you can index that structure with any string, not just unsigned integers as in Arrays.
A performance tradeoff decision must be made. Array's are very fast for retrieving elements, and for looping over. They can be slow to add elements to if you don't know the size of the array ahead of time (because they need to allocate contiguous blocks of memory). If your using array's with non-consecutive indexes you will require a lot of memory, and possibly lots of dynamic resizing.
Structures or hashtables have an expensive lookup cost compared to arrays because the string values are converted to an integer using a hash function that is used to locate a hash bucket then (because hash functions can create identical hashes for two different input keys) if the bucket has more than one item each item is compared with the key value. However if you have lots of non-consecutive indexes you will not have wasted memory like you would with an array.
So bottom line is if you have somewhat consecutive non negative integer indexes arrays are a good bet. Otherwise use a structure.
Arrays VS Structures was first published on May 25, 2005.
The Fixinator Code Security Scanner for ColdFusion & CFML is an easy to use security tool that every CF developer can use. It can also easily integrate into CI for automatic scanning on every commit.
Try Fixinator
CFBreak
The weekly newsletter for the CFML Community
Comments
Array's don't iterate through the entire array every time you look something up. Think of an Array as a chunk of memory with slots for each item in the array, if you know you want item 10, the exact memory address of the item can be calculated by taking the address of the first item, and adding 10 * size of each item (object references are all the same size). So an array item lookup is O(1) constant time.
And for structures your forgetting about looking up the hash item, when you generate the hash of the item it has to be put in a sorted sequence, then you have to do a binary search to find the item, which I think would give you a O(log(n)) - my big O is a bit rusty...
As for synchronization both java.util.Hashtable and java.util.Vector are both synchronized, so that would not have any effect on performance.