Hopefully we all have some idea of how the ArrayList and LinkedList are structured internally. The ArrayList holds a pointer to a single Object array, which grows as the number of elements exceed the size of the array. The ArrayList's underlying Object array grows by about 50% whenever we run out of space. The LinkedList has pointers to the link nodes at the front and end of the list. Each link node is an object and has pointers forward and backwards and to the object that is being held in the list. The memory requirements of the LinkedList is about 4x larger than an equivalently sized ArrayList. In addition, the ArrayList allows random access into the middle, whereas the LinkedList has a lookup complexity of O(n).
One of the questions I enjoy asking here is this: Which list uses more bytes when serialized? LinkedList or ArrayList? Think about this for a bit before reading on, considering the internal structure of each list.
One of the questions I enjoy asking here is this: Which list uses more bytes when serialized? LinkedList or ArrayList? Think about this for a bit before reading on, considering the internal structure of each list.
Most programmers guess that the LinkedList would have a larger serialized size than the ArrayList. However, that is never the case:
import java.io.*; import java.util.*; public class ListWritingSize { public static void main(String[] args) throws IOException { test(new LinkedList<String>()); test(new ArrayList<String>()); } public static void test(List<String> list) throws IOException { for (int i = 0; i < 10; i++) { list.add("hello world"); } ByteArrayOutputStream baos = new ByteArrayOutputStream(); ObjectOutputStream out = new ObjectOutputStream(baos); out.writeObject(list); out.close(); System.out.println(list.getClass().getSimpleName() + " used " + baos.toByteArray().length + " bytes"); } }
When we run this, we see that the LinkedList uses 107 bytes, whilst the ArrayList uses 117 bytes. Instead of just naively writing the contents of the collections to the stream, the lists both contain
writeObject()
andreadObject()
methods. These write the contents of the lists to the stream.Why the 10 byte difference?
The ArrayList and LinkedList work similarly, in that they write out the number of elements and the actual objects contained in the list. However, ArrayList also writes out the size of the underlying array, used to recreate an identical ArrayList to what was serialized.
This additional
int
is what makes up the 10 byte difference in the serialization size.At this point I need to also point out that the String "hello world" is only serialized once, after that only a reference number to the object is written.
What about Vector?
The old Vector class implements serialization in a naive way. They simply do the default serialization, which writes the entire
Object[]
as-is into the stream. Thus if we insert a bunch of elements into the List, then clear it, the difference between Vector and ArrayList is enormous.import java.util.*; import java.io.*; public class VectorWritingSize { public static void main(String[] args) throws IOException { test(new LinkedList<String>()); test(new ArrayList<String>()); test(new Vector<String>()); } public static void test(List<String> list) throws IOException { insertJunk(list); for (int i = 0; i < 10; i++) { list.add("hello world"); } ByteArrayOutputStream baos = new ByteArrayOutputStream(); ObjectOutputStream out = new ObjectOutputStream(baos); out.writeObject(list); out.close(); System.out.println(list.getClass().getSimpleName() + " used " + baos.toByteArray().length + " bytes"); } private static void insertJunk(List<String> list) { for(int i = 0; i<1000 * 1000; i++) { list.add("junk"); } list.clear(); } }
When we run this code, we get the following output:
LinkedList used 107 bytes ArrayList used 117 bytes Vector used 1310926 bytes
Vector can use a staggering amount of bytes when being serialized. The lesson here? Don't ever use Vector as Lists in objects that are Serializable. The potential for disaster is too great.
No comments:
Post a Comment