Java Performance Hints
This document contains some hints on how to ensure good performance (both memory use and execution speed) in Java code. The hints are sometimes intentionally kept short, without examples, so that the reader has to study the subject (and understand it) before implementing the hint. Some hints may be of theoretical use only and save nanoseconds at best; others are crucial for good performance. Most are generally applicable, but some are only valid for Java 5 or Java 6. Java 7 is not yet covered.
- Strings
- Collections
- Arrays
- Garbage Collection
- GUIs and Swing
- Threads
- Files
- Classes (and their fields and methods)
- Objects in general
- Miscellaneous
- Articles and Books
- Performance related sites
- Footnotes
Strings
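- Never use the String constructor; use literals where possible. There are two exceptions: when you have a char[] and want to turn it into a String, or when you use substring() on a large String (in Java 5 and 6 the substring shares the large backing array, and new String(...) makes a trimmed copy).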
- String.equals() is faster than String.equalsIgnoreCase().
- Use StringBuilder when constructing a String where applicable, not the "+" operator (unless it's a single statement, String s = a + b + c;) or String.concat().
- StringBuilder is not synchronized and is thus faster than StringBuffer (J2SE 5). (escape analysis footnote)
- Use the capacity argument in the String[Buffer|Builder] constructor; a buffer that is too small has to be enlarged and copied, which reduces performance.
- In some cases using String.intern() can improve performance as you can use "==" to compare those Strings, but the risk is that you swamp the heap with objects that can never be removed.
- aString.length() == 0 is faster than aString.equals(""). With Java 6, use isEmpty().
- Calling String.toString() is pointless.
- All methods in String that return a modified String actually return a new instance, since strings are immutable.
- String.split(regex) actually simply invokes Pattern.compile(regex).split(this, limit), and compile() returns a new Pattern each time. If the split is a frequent operation, it is thus beneficial to create and reuse a single Pattern instance for performing the splits.
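Two of the hints above, reusing a compiled Pattern for frequent splits and giving StringBuilder a capacity up front, can be sketched as follows. The class and method names (StringHints, splitFields, join) are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper class; all names here are illustrative only. */
final class StringHints {
    // Compiled once and reused, instead of calling line.split(",") every time.
    private static final Pattern COMMA = Pattern.compile(",");

    static String[] splitFields(String line) {
        return COMMA.split(line);
    }

    // Joins the parts with a separator using a presized StringBuilder.
    static String join(List<String> parts, char separator) {
        int capacity = parts.size(); // room for the separators
        for (String part : parts) {
            capacity += part.length();
        }
        StringBuilder sb = new StringBuilder(capacity);
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                sb.append(separator);
            }
            sb.append(parts.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] fields = splitFields("given,last,city");
        System.out.println(join(Arrays.asList(fields), '/')); // given/last/city
    }
}
```

Whether either change is measurable depends on how often the code runs, so it is worth profiling before and after.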
Collections
- Choosing the correct data structure is critical for performance, since different structures are good at different things. Some are good at random access, others are good for insertion and deletion of elements at random positions and so on, as described in the Java tutorial. Also note that memory consumption varies: a doubly linked list consumes more memory than an ArrayList, for example, since each node must hold references to the next and previous element. Some structures allocate only enough space to fit the actual number of elements, whereas others preallocate space for future elements. In some cases, a plain array may actually be optimal for your purposes.
- Choose a non-synchronized variant where applicable (ArrayList vs Vector, HashMap vs Hashtable). (escape analysis footnote)
- Collections can often be created with an initial size which, if chosen appropriately, avoids needless resizing of the internal data structure.
- Iterating over the values in a Map using Map.keySet() in combination with Map.get() performs worse than iterating using Map.entrySet().
- List.subList(int, int) can be used to create a new view of a List, thus avoiding creation of a new data structure.
- Maintaining data integrity is important. But instead of copying data from your internal data structure to an identical one so that the original cannot be altered when you hand it to some other entity, use Collections.unmodifiableXxxx(). This avoids the creation of a new, potentially large data structure. It does require that the objects it holds are immutable as well, though.
- You never have to create an empty Collection, use Collections.emptyList(), emptyMap() and emptySet() (Java 5).
- When iterating over a List it may be worthwhile to check if it implements RandomAccess and retrieve elements using get() instead of using an iterator, as described in Learning From Code: Fast Random Access.
- The hashCode() method of a Map key has to distribute the returned values over a large range to be efficient; otherwise you end up with a lot of elements in a few buckets and retrieval time complexity degrades to O(n).
- Do not create Map keys by concatenating various types into a String:
  // This is really bad (perhaps not in this simple example, but if repeated many times):
  Map<String, Person> map = ...
  map.put(givenName + "/" + lastName, somePerson); // An implicit StringBuilder and a new String!
  Person p = map.get(givenName + "/" + lastName); // An implicit StringBuilder and a new String!

  // This is better, and more flexible:
  Map<Key, Person> map = ...
  map.put(new Key(givenName, lastName), somePerson);
  Person p = map.get(new Key(givenName, lastName));

  class Key {
    private final String given;
    private final String last;

    Key(final String givenName, final String lastName) {
      given = givenName;
      last = lastName;
    }

    @Override
    public boolean equals(final Object o) {
      if (o == this) {
        return true;
      }
      if (!(o instanceof Key)) {
        return false;
      }
      final Key key = (Key) o;
      // (If some field is a primitive, start with that one, it's faster.)
      return given.equals(key.given) && last.equals(key.last);
    }

    @Override
    public int hashCode() {
      // This could potentially be calculated once and stored in a private field.
      return given.hashCode() + 31 * last.hashCode();
    }
  }
- Some data structures have a load factor, for example HashMap. It is used to determine when to increase the size of the internal data structure (usually an array). Resizing is not free: a new, larger internal data structure must be created and the elements copied to it. So if you can avoid resizing, you should. This is for example the case when you know exactly how many mappings you will put in your HashMap. You should then state the initial capacity of course, but also either specify load factor 1 (when your key has a good hashCode() algorithm), or divide the initial capacity by the load factor. Otherwise the internal array is resized when you have added only a fraction of your mappings.
  Also note that the HashMap(Map) constructor uses load factor 0.75 too. This means that in the following code:
  HashMap<Integer, String> map = new HashMap<Integer, String>(256, 1);
  for (int i = 0; i < 256; ++i)
    map.put(Integer.valueOf(i), "dummy");
  HashMap<Integer, String> map2 = new HashMap<Integer, String>(map);
  map2 will not have internal capacity 256, but actually 512 (not 256/0.75 ≈ 342, because HashMap capacity is always a power of two). That may not be what you wanted.
And finally note that using load factor 1 may affect the lookup performance (unless you have a very good hash code distribution), as stated in the HashMap documentation.
- When applicable, a primitive collection (such as an ArrayList-like class that holds primitives such as int, long and so on) can be quite beneficial since it avoids the creation of primitive wrapper objects. There are no such collections in the standard API, but there are several open source implementations available, and you can easily create one yourself.
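As a minimal sketch of two of the collection hints above, the following shows a HashMap sized so that a known number of entries fits under the default load factor 0.75, and iteration over Map.entrySet() instead of keySet() plus get(). The class and method names (MapHints, presized, sumValues) are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper class; all names here are illustrative only. */
final class MapHints {
    // Requests enough capacity that 'expected' entries stay below the
    // default load factor 0.75, so no resize happens while filling the map.
    static <K, V> HashMap<K, V> presized(int expected) {
        return new HashMap<K, V>((int) (expected / 0.75f) + 1);
    }

    // One pass over the entries; no extra get() lookup per key.
    static int sumValues(Map<String, Integer> map) {
        int sum = 0;
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            sum += entry.getValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> stock = presized(3);
        stock.put("apples", 1);
        stock.put("pears", 2);
        stock.put("plums", 3);
        System.out.println(sumValues(stock)); // 6
    }
}
```

HashMap rounds the requested capacity up to a power of two, so the table may end up larger than the number passed in.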
Arrays
- System.arraycopy() is faster than manual looping and copying.
- Multidimensional array creation is slower than single dimension array creation.
- clone() and System.arraycopy() are equally fast, see benchmark at The Java Specialists' Newsletter, which also contains an example of a flawed benchmark giving a false result. Also note there are Arrays.copyOf() and copyOfRange() methods in Java 6.
- Use Arrays.asList(T...) to provide a Collection view of an array, do not copy the elements to a new data structure. Use Collections.unmodifiableList() to make it immutable if needed.
- Use Collections.addAll(Collection<? super T>, T...) (Java 5) to copy all the elements of an array to a Collection.
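The array hints above can be sketched as follows. Only standard library calls are used (System.arraycopy, Arrays.asList, Collections.addAll); the wrapper class ArrayHints and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper class; all names here are illustrative only. */
final class ArrayHints {
    // Copies src into a new array of the same length without a manual loop.
    static int[] copy(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    // Fixed-size List view backed by the array: no elements are copied.
    static List<String> view(String[] array) {
        return Arrays.asList(array);
    }

    // Appends all array elements to an existing collection.
    static List<String> appendTo(List<String> target, String[] array) {
        Collections.addAll(target, array);
        return target;
    }

    public static void main(String[] args) {
        String[] names = {"a", "b"};
        List<String> view = view(names);
        view.set(0, "z");             // writes through to the array
        System.out.println(names[0]); // z
        System.out.println(appendTo(new ArrayList<String>(), names).size()); // 2
    }
}
```

Because the view shares storage with the array, wrap it in Collections.unmodifiableList() before handing it out if callers must not change the array.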
Garbage Collection
- Invoking System.gc() (or equivalently Runtime.gc()) will most likely not improve performance, see article Garbage collection and performance. When the GC runs it will also pause all other threads, so if several threads keep invoking System.gc() then all threads will be interrupted frequently! A full GC may also perform a compaction of the old generation, which is not cheap if the heap is large and packed with objects.
- There are many options to the java command that tune the garbage collector, see this list.
- Explicit nulling of a variable may help the GC, but probably it does not. It is most likely to help when the object is very large and is incorrectly scoped. See article Garbage collection and performance, Referencing objects, Nulling variables and garbage collection and Tuning Garbage Collection for 1.3.1, 1.4.2, 1.5 and 1.6.
- Garbage collection of objects in the young generation is quicker than in the older generations.
GUIs and Swing
- When implementing a cell renderer, such as javax.swing.table.TableCellRenderer, make sure that the getXxxRendererComponent(...) method returns this instead of creating a new renderer each time. Preferably don't create any object instances in that method at all since it can be called very often. Need a bold font for example? Don't derive a new one for each painted cell, do it once.
- When changing a model, do so in bulk if possible, since this generates fewer events and probably does not require several small enlargements of the internal data structure.
- Use javax.swing.Timer instead of a java.util.Timer (or even worse: a thread you created yourself) for GUI related timeouts or periodic events.
- Lazy instantiation can improve GUI performance.
- Use constructors that take a complete model as an argument (see for example JList, JComboBox and JTable) instead of adding components one by one after creation.
- Perceived performance is not the same thing as actual performance in GUIs. Keep the GUI responsive by executing long tasks in a non-EDT thread. A progress bar or similar is a good way to show that something is going on.
- If you are doing some manual drawing which takes way too long to perform, using an offscreen buffer might help.
- Components performing custom painting can be optimized by using the clip region. See Painting in AWT and Swing.
- Synchronizing methods of AWT and Swing subcomponents may impair performance as described in Swing 'Urban Legends'.
Threads
- Reuse threads in a pool (since they are quite heavyweight), do not create them over and over again to perform short tasks.
- Use wait()/notify() to pause threads, do not sleep or create no-op-loops.
- Too many threads may impact performance (thrashing).
- Use java.util.Timer instead of creating the same behaviour with your own threads. See also swing Timer.
- In order to achieve low contention, keep synchronized blocks as short as possible and lock only what really needs to be locked (synchronize on an appropriate object). When you repeatedly call some method which acquires a lock, it may be wise to synchronize before the loop so that the current thread already owns the lock in each iteration. See article Threading lightly, Part 2: Reducing contention.
- Instead of sharing an object between threads and having to synchronize access to it, in some cases using a ThreadLocal can benefit performance, see article Threading lightly, Part 3: Sometimes it's best not to share.
- Synchronizing a method of a subclass of java.awt.Component may be inappropriate as explained in Core Java Technologies Technical Tips.
- In rare cases, using volatile instead of synchronized can be beneficial, as explained in Managing volatility.
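The first hint in this list, reusing pooled threads for short tasks, might look like the sketch below, which uses java.util.concurrent.ExecutorService (Java 5). The class name PoolDemo, the pool size of 4 and the sum-of-squares workload are arbitrary choices for the example. A real application would normally keep one long-lived pool rather than create one per call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical demo class; all names here are illustrative only. */
final class PoolDemo {
    // Computes 1*1 + 2*2 + ... + n*n with one short task per term.
    // The n tasks share four pooled threads instead of starting n threads.
    static long sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Long>> results = new ArrayList<Future<Long>>();
            for (int i = 1; i <= n; i++) {
                final long value = i;
                results.add(pool.submit(new Callable<Long>() {
                    @Override
                    public Long call() {
                        return value * value;
                    }
                }));
            }
            long sum = 0;
            for (Future<Long> result : results) {
                sum += result.get(); // blocks until that task has finished
            }
            return sum;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(10)); // 385
    }
}
```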
Files
- Always use a buffered variant when reading from or writing to a file (or stream).
- Always close() the stream once you are finished with it, otherwise it may continue to waste system resources. If you have nested streams you only need to close the topmost wrapping stream, the others are closed automatically.
- Note that with some operations flush() needs to be called. With other operations it is a complete waste of time since it is also done automatically. A close() always contains a flush.
- Use java.nio where applicable.
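A minimal sketch of the buffering and closing hints follows. It assumes Java 7 or later for try-with-resources and Java 8 for UncheckedIOException, neither of which this document otherwise covers; on Java 5 or 6 the same effect needs a try/finally block that calls close(). The class FileHints and its methods are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper class; all names here are illustrative only. */
final class FileHints {
    // Closing the outermost (buffered) stream flushes it and closes the FileWriter too.
    static void writeLines(File file, String... lines) {
        try (BufferedWriter out = new BufferedWriter(new FileWriter(file))) {
            for (String line : lines) {
                out.write(line);
                out.newLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads through a buffer instead of hitting the file for every character.
    static int countLines(File file) {
        try (BufferedReader in = new BufferedReader(new FileReader(file))) {
            int count = 0;
            while (in.readLine() != null) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the lines to a temporary file, counts them back, then deletes the file.
    static int roundTrip(String... lines) {
        try {
            File file = File.createTempFile("hints", ".txt");
            try {
                writeLines(file, lines);
                return countLines(file);
            } finally {
                file.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("one", "two", "three")); // 3
    }
}
```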
Classes (and their fields and methods)
- Inner classes can be made static if they do not use the implicit reference to the enclosing class. This saves a few bytes per instance, which may matter if you create a lot of those objects.
- Deep class hierarchies slightly impact performance.
- There is no need to assign the default value to instance variables, it is done automatically. The explicit assignment causes extra instructions to be executed.
  class Foo {
    private Bar bar = null; // Pointless initialization.
  }
- Declaring classes or methods final does not necessarily improve performance, see article Is that your final answer?. Note that other resources often state the opposite! Declaring fields and variables as final is a good idea though.
- Trying to save memory by choosing a byte-wise smaller primitive (such as a byte instead of an int) to represent a certain field most likely will not do what you expect. The compiler will reserve four bytes for each field no matter what the type is (even booleans!), as described in article Determining Memory Usage in Java.
- When creating an API, do not force the users of it to use a certain argument type to your methods where possible. Use method overloading to support various types to avoid needless object creation. Choose the return type in a similar manner. Also see article Design for performance, The self-return idiom makes for more usable API design, Whose object is it, anyway? and the Java tutorial.
- Regarding serialization of objects, declaring fields as transient might improve performance.
- When serializing objects, implementing writeObject(ObjectOutputStream) and readObject(ObjectInputStream) might improve performance (but increases the risk of doing something wrong).
- When serializing objects, implementing Externalizable might improve performance (but significantly increases the risk of doing something wrong).
- Inner classes and anonymous classes are in reality just ordinary classes, and as such they have to be loaded and require memory. As stated in The Java Tutorial:
  "When considering whether to use an inner class, keep in mind that application startup time and memory footprint are typically directly proportional to the number of classes you load. The more classes you create, the longer your program takes to start up and the more memory it will take."
  Simple one-statement event listeners can be realised by EventHandler instead, which however introduces the possibility of errors that are not identified at compile time, plus the overhead of reflection.
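The transient hint above can be illustrated with a small sketch: a field that can be recomputed from other state is marked transient, so it is neither written to nor read from the stream. The class Temperature and its members are invented for this example, and the actual saving depends on how large the skipped data is.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical example class; all names here are illustrative only. */
final class Temperature implements Serializable {
    private static final long serialVersionUID = 1L;

    private final double celsius;
    // Derived from celsius, so it is skipped during serialization.
    private transient String label;

    Temperature(double celsius) {
        this.celsius = celsius;
    }

    // Rebuilt lazily, also after deserialization when the field is null again.
    String label() {
        if (label == null) {
            label = celsius + " C";
        }
        return label;
    }

    boolean hasCachedLabel() {
        return label != null;
    }

    static byte[] serialize(Temperature t) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(t);
            out.close(); // also flushes
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Temperature deserialize(byte[] data) {
        try {
            ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data));
            try {
                return (Temperature) in.readObject();
            } finally {
                in.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```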
Objects in general
- Lazy initialization may improve performance. An example:
  class Foo {
    private AReallyLargeObject large;

    // Note: this method is not thread safe.
    public AReallyLargeObject get() {
      // The object is never created if no one ever invokes this method.
      if (large == null)
        large = new AReallyLargeObject();
      return large;
    }
  }
  You should however not use double-checked locking, which is broken (unless the field is declared volatile, Java 5 and later).
- Beware of creating many short-lived objects, such as in a loop. Object creation as such is however not expensive, see article Urban performance legends, revisited.
- Object pools for non-heavyweight objects (examples of heavyweight objects are threads or network and database connections) are not considered a performance benefit, see article Garbage collection and performance and Urban performance legends, revisited.
- Overriding Object.finalize() is often a bad idea since it decreases performance. For C++ programmers: finalize() is not a destructor! See article Garbage collection and performance.
- Immutable objects are rumored to impact performance negatively, but this is often not the case. As a matter of fact, they can sometimes improve performance, see article Garbage collection and performance and The Java Tutorials.
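One commonly used thread-safe alternative to double-checked locking for lazy initialization is the initialization-on-demand holder idiom, sketched below. It relies on the JVM initializing a class only on first use. The names LazyHolderDemo, Large and Holder are hypothetical, and the creation counter exists only to make the laziness visible.

```java
/** Hypothetical demo class; all names here are illustrative only. */
final class LazyHolderDemo {
    // Counts constructions, for demonstration only.
    static int created = 0;

    static final class Large {
        Large() {
            created++;
        }
    }

    // Holder is not initialized until get() first touches Holder.INSTANCE.
    // Class initialization is thread safe, so no explicit locking is needed.
    private static final class Holder {
        static final Large INSTANCE = new Large();
    }

    static Large get() {
        return Holder.INSTANCE;
    }

    public static void main(String[] args) {
        Large first = get();
        Large second = get();
        System.out.println(first == second); // true
        System.out.println(created);         // 1
    }
}
```

This idiom only fits static, per-class singletons; a lazily created instance field still needs synchronization or a volatile field.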
Miscellaneous
- No performance tip helps when you are using a stupid algorithm or a stupid design.
- Consider using the java -server and -X and -XX options. Also see Java HotSpot VM Options and Java 6 JVM options.
- Too much logging takes a lot of time, especially if it involves file access. Note that in a call such as
  myLogger.log(logLevel, dateFormatter.format(new Date()) + ": " + anObject)
  in which the log() method decides that due to the log level this message shall not be logged, there is still a performance impact since the method arguments have already been evaluated (especially if anObject.toString() is a non-trivial method).
- switch is faster than a nested if-else.
- Handling primitives is faster than handling the corresponding wrapper classes.
- Do not use a complex expression as the for loop condition, since it is evaluated once for each iteration:
  for (int i = 0; i < calculateComplexMaxValue(); ++i) {...} // Bad

  int maxValue = calculateComplexMaxValue(); // Good
  for (int i = 0; i < maxValue; ++i) {...}
- Looping downwards is in theory faster than looping upwards since the JVM instruction set contains optimized instructions (if<cond>) for comparison with 0 (and potentially some more values depending on the JVM implementation, such as -1, 1 .. 5).
  for (int i = startValue; i >= 0; --i) {...}
- Flipping a boolean (true → false or false → true) is more efficient using b ^= true than b = !b. See the test case.
- Do not assume that performance tips are valid (which includes this document). They may be obsolete or perhaps they have never been correct. Maybe they are only valid for specific Java versions or special situations.
- Avoid loading classes you do not need.
- There is probably no point in trying to improve performance if you do not have a known problem with it. It will only result in micro-improvements. You also need a performance target and a known bottleneck.
- Avoid using synchronization in read-only or single-threaded contexts.
- Shifting is faster than multiplication / division (where applicable, powers of two).
- Move code out of loops where possible.
- Reflection may impair performance. In other situations it may improve it.
- Do not do something repeatedly if you only have to do it once. Example:
  for (int i = 0; i < array.length; i++)
    array[i] *= SOME_NUMBER * doSomeStuff(someArg);
  is quite a lot faster if it is written like this (assuming doSomeStuff() has no side effects and returns the same value each time):
  double mult = SOME_NUMBER * doSomeStuff(someArg);
  for (int i = 0; i < array.length; i++)
    array[i] *= mult;
- Optimizations are often made in new JDK versions, abandon the old ones. Furthermore one can be very sure deprecated APIs are not maintained.
- Your micro benchmarking code is probably flawed and does not do what you expect it to do. Read these articles: Dynamic compilation and performance measurement and Anatomy of a flawed microbenchmark.
- Auto(un)boxing (J2SE 5) can impact performance in a negative way. It does not eliminate conversions between primitives and objects, they are just performed behind the scenes. See language guide Autoboxing.
- Do not use java.net.URL.equals() or URL.hashCode(), and thus do not put URLs in a Map, since those methods may perform name resolution to determine if two host names actually refer to the same host. This is stated in the javadoc of URL, but is easy to miss! Use class URI instead. Further reading at blogspot.com.
- java.util.concurrent.CopyOnWriteArrayList can be used to implement an Observer whose listener list can safely be modified while it is concurrently being iterated over. Applicable for event listeners for example. See Be a good (event) listener for more information.
- The javax.xml.xpath API (Java 5) is very convenient to use, but degrades performance severely.
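The logging hint earlier in this list can be sketched with java.util.logging: checking isLoggable() before building the message avoids evaluating the arguments when the level is disabled. The class LogGuardDemo, the method expensiveDescription() and the call counter are hypothetical, and the logger level is set to INFO explicitly so the behaviour does not depend on the environment.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical demo class; all names here are illustrative only. */
final class LogGuardDemo {
    private static final Logger LOG = Logger.getLogger(LogGuardDemo.class.getName());

    // Counts how often the expensive message part is built, for demonstration only.
    static int formatCalls = 0;

    static {
        LOG.setLevel(Level.INFO); // FINE messages are disabled
    }

    // Stands in for a costly toString() or date formatting call.
    static String expensiveDescription() {
        formatCalls++;
        return "state dump";
    }

    // The argument is built even though the message is then discarded.
    static void logUnguarded() {
        LOG.log(Level.FINE, "details: " + expensiveDescription());
    }

    // The argument is only built when FINE is actually enabled.
    static void logGuarded() {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.log(Level.FINE, "details: " + expensiveDescription());
        }
    }

    public static void main(String[] args) {
        logGuarded();
        System.out.println(formatCalls); // 0
        logUnguarded();
        System.out.println(formatCalls); // 1
    }
}
```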
Articles and Books
- Java theory and practice: Urban performance legends
- Java theory and practice: Urban performance legends, revisited (allocation, deallocation, garbage collection, stack allocation, escape analysis (Mustang))
- Java theory and practice: Garbage collection and performance
- Java theory and practice: Dynamic compilation and performance measurement
- Don't forget about memory, How to monitor your Java applications' Windows memory usage
- Eye on performance: Tuning garbage collection in the HotSpot JVM
- Java bytecode: Understanding bytecode makes you a better programmer
- Java theory and practice: Hashing it out, Defining hashCode() and equals() effectively and correctly
- Java Platform Performance: Strategies and Tactics
- Bitwise Optimization in Java: Bitfields, Bitboards, and Beyond
- The Java Specialists' Newsletter [Issue 029] - Determining Memory Usage in Java
- Java Performance Documentation
- Java SE 6 Performance White Paper
- Java Tuning White Paper
- Improvements to Program Execution Speed describes performance enhancements in various JDK versions.
Performance related sites
- Java performance tuning
- Java performance (Wikipedia)
Footnotes
Escape analysis and lock elision: As first described in JavaOne session TS-3412 2006 (but it states the wrong flag, UseEscapeAnalysis) and now documented in Java HotSpot Virtual Machine Performance Enhancements, escape analysis is available and enabled by default. Old JDK versions use the -XX:+DoEscapeAnalysis flag ("not stable"). It is for the server compiler only.