No VM I know of will store an Integer[] array like an int[] array for the following reasons:
- There can be null Integer objects in the array and you have no bits left for indicating this in an int array. The VM could store this 1-bit information per array slot in a hiden bit-array though.
- You can synchronize in the elements of an Integer array. This is much harder to overcome as the first point, since you would have to store a monitor object for each array slot.
- The elements of Integer[] can be compared for identity. You could for example create two Integer objects with the value 1 via new and store them in different array slots and later you retrieve them and compare them via ==. This must lead to false, so you would have to store this information somewhere. Or you keep a reference to one of the Integer objects somewhere and use this for comparison and you have to make sure one of the == comparisons is false and one true. This means the whole concept of object identity is quiet hard to handle for the optimized Integer array.
- You can cast an Integer[] to e.g. Object[] and pass it to methods expecting just an Object[]. This means all the code which handles Object[] must now be able to handle the special Integer[] object too, making it slower and larger.
Taking all this into account, it would probably be possible to make a special Integer[] which saves some space in comparison to a naive implementation, but the additional complexity will likely affect a lot of other code, making it slower in the end.
The overhead of using Integer[] instead of int[] can be quiet large in space and time. On a typical 32 bit VM an Integer object will consume 16 byte (8 byte for the object header, 4 for the payload and 4 additional bytes for alignment) while the Integer[] uses as much space as int[]. In 64 bit VMs (using 64bit pointers, which is not always the case) an Integer object will consume 24 byte (16 for the header, 4 for the payload and 4 for alignment). In addition a slot in the Integer[] will use 8 byte instead of 4 as in the int[]. This means you can expect an overhead of 16 to 28 byte per slot, which is a factor of 4 to 7 compared to plain int arrays.
The performance overhead can be significant too for mainly two reasons:
- Since you use more memory, you put on much more pressure on the memory subsystem, making it more likely to have cache misses in the case of Integer[]. For example if you traverse the contents of the int[] in a linear manner, the cache will have most of the entries already fetched when you need them (since the layout is linear too). But in case of the Integer array, the Integer objects itself might be scattered randomly in the heap, making it hard for the cache to guess where the next memory reference will point to.
- The garbage collection has to do much more work because of the additional memory used and because it has to scan and move each Integer object separately, while in the case of int[] it is just one object and the contents of the object doesn't have to be scanned (they contain no reference to other objects).
To sum it up, using an int[] in performance critical work will be both much faster and memory efficient than using an Integer array in current VMs and it is unlikely this will change much in the near future.
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…