Immutable Collections
@PaulSandoz
JavaOne 2017 Immutable Collections CON6079

Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Agenda
• A recap of unmodifiable collections in the JDK
• A brief overview of immutable collections in external Java libraries and JVM-based platforms
• Immutable collections leveraging persistent data structures

When referring to immutable collections, no claims are made about the immutability of the collection’s elements.

Advantages of immutability
• Don’t need to think about concurrency and data races
• Resistant to misbehaving libraries
• Are constants that may be optimized at runtime
• Implementations can optimize time and space, for both representation and transformation

Immutable collections wish list
• Manifests immutability (of the collections, not their elements)
• Sealed (not publicly extensible)
• Provide a bridge to mutable collections (not extension of)
• Efficient construction, updates, and copying

Unmodifiable in the JDK
• The JDK has the notion of unmodifiable collections
• Unmodifiable is a runtime property of a collection
• Modifying (add, put, remove, …) methods throw UnsupportedOperationException
• No way to directly query whether a collection is unmodifiable
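Because unmodifiability is only a runtime property, the one portable way to detect it is to attempt a mutation and see whether it throws. The following is a minimal sketch of that idea; the class and method names (`UnmodifiableProbe`, `rejectsAdd`) are made up for illustration and are not JDK APIs.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

class UnmodifiableProbe {
    // Returns true if add() is rejected. The static type Collection<String>
    // says nothing about modifiability, so the probe has to try a mutation.
    static boolean rejectsAdd(Collection<String> c) {
        try {
            if (c.add("probe")) {
                c.remove("probe"); // undo the probe if it changed the collection
            }
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsAdd(new ArrayList<>()));  // false
        System.out.println(rejectsAdd(List.of("a", "b")));  // true
    }
}
```

Probing by mutation is intrusive, which is one reason the lack of a direct query matters.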

Two forms of unmodifiable
• Unmodifiable view or wrapper to a source or backing collection
List
• Directly unmodifiable
List
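The two forms above can be contrasted in a few lines. This sketch assumes `Collections.unmodifiableList` as the wrapper and `List.of` as the directly unmodifiable factory; the class and method names are illustrative only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class TwoForms {
    // Form 1: an unmodifiable view. Writes through the view fail, but the
    // view still reflects later changes made to the backing list.
    static List<String> viewAfterSourceChange() {
        List<String> source = new ArrayList<>(List.of("a", "b"));
        List<String> view = Collections.unmodifiableList(source);
        source.add("c"); // mutate through the backing list
        return view;
    }

    // Form 2: directly unmodifiable. There is no separate backing list to mutate.
    static List<String> direct() {
        return List.of("a", "b");
    }

    public static void main(String[] args) {
        System.out.println(viewAfterSourceChange()); // [a, b, c]
        System.out.println(direct());                // [a, b]
    }
}
```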

Immutability with unmodifiable collections
• When wrapping ensure the source collection is never accessible*
List
List
• List.of and friends behave as if the source is never accessible
* Except, of course, to the unmodifiable wrapper
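One way to keep the source inaccessible is to wrap a private copy that never escapes. A minimal sketch, assuming a hypothetical helper named `Snapshots.freeze` (not a JDK method):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

class Snapshots {
    // Copies the caller's elements into a fresh list that never leaves this
    // method, then wraps it. Only the wrapper references the private copy,
    // so the result behaves as an immutable list (elements aside).
    static <T> List<T> freeze(Collection<? extends T> source) {
        return Collections.unmodifiableList(new ArrayList<>(source));
    }

    public static void main(String[] args) {
        List<String> source = new ArrayList<>(List.of("a", "b"));
        List<String> frozen = freeze(source);
        source.add("c");            // does not affect the frozen list
        System.out.println(frozen); // [a, b]
    }
}
```

The cost is a full copy on every call, which is the inefficiency the wish list points at.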

JDK collections as immutable collections
✗ Manifests immutability
✗ Sealed
• Provide a bridge to mutable collections
✗ Efficient construction, updates and copying

Unmodifiable is a reasonable abstraction for mutable collections but not for immutable collections

Guava’s immutable collections
• Defines sealed types such as ImmutableList, ImmutableMap, …
• These implement the corresponding JDK mutable collection type (ImmutableList implements List)
• Copying is smart: ImmutableList.copyOf(otherCollection) avoids an actual copy when it is safe to do so

Guava’s collections are a good compromise
✔ Manifests immutability
✔ Sealed
✘ Provide a bridge to mutable collections
✘ Efficient construction (✔✘), updates (✘), and copying (✔✘)

Eclipse collections: something for everyone
✔ Manifests immutability
✘ Sealed
✔ Provide a bridge to mutable collections
✘ Efficient construction, updates, and copying

Vavr (Java), Clojure, Scala
✔ Manifests immutability
✔ Sealed*
✔ Provide a bridge to mutable collections*
✔ Efficient construction, updates, and copying
* Not completely verified but believed to be mostly true

Vavr (Java), Clojure, Scala
✔ Efficient updates (addition, removal, replace, merge)
• The immutable collection implementations leverage persistent data structures for maps, sets and vectors (non-linked lists)

Persistent data structures
• A persistent data structure preserves the previous version of itself when modified
• Hash Array Mapped Tries (HAMTs) are the basis of efficient persistent (immutable) maps, sets, and vectors
• Provide structural sharing between a new and previous version of a collection
• Effectively constant time for many operations
• Cache friendly
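Structural sharing is easiest to see in the simplest persistent structure, a singly linked stack. This is not a HAMT, only an illustration of how a new version can reuse the old one untouched; `PStack` is a hypothetical name used for this sketch.

```java
final class PStack<T> {
    private final T head;
    private final PStack<T> tail;
    private final int size;

    private PStack(T head, PStack<T> tail, int size) {
        this.head = head;
        this.tail = tail;
        this.size = size;
    }

    static <T> PStack<T> empty() {
        return new PStack<>(null, null, 0);
    }

    // Returns a new version whose tail IS the previous version (shared, not copied).
    PStack<T> push(T value) {
        return new PStack<>(value, this, size + 1);
    }

    // Returns the previous version; nothing is modified.
    PStack<T> pop() {
        if (size == 0) throw new IllegalStateException("empty stack");
        return tail;
    }

    T peek() {
        if (size == 0) throw new IllegalStateException("empty stack");
        return head;
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        PStack<String> v1 = PStack.<String>empty().push("a").push("b");
        PStack<String> v2 = v1.push("c");
        System.out.println(v1.size());      // 2 (the old version is preserved)
        System.out.println(v2.size());      // 3
        System.out.println(v2.pop() == v1); // true (v2 shares all of v1)
    }
}
```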

Trie
In computer science, a trie, also called digital tree and sometimes radix tree or prefix tree (as they can be searched by prefixes), is a kind of search tree — an ordered tree data structure that is used to store a dynamic set or associative array where the keys are usually strings
A trie for keys "A","to", "tea", "ted", "ten", "i", "in", and "inn".
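The definition above can be sketched as a small string trie holding the keys from the figure caption. The class name `StringTrie` and its methods are illustrative, and a `HashMap` per node is only one of several possible child representations.

```java
import java.util.HashMap;
import java.util.Map;

class StringTrie {
    private final Map<Character, StringTrie> children = new HashMap<>();
    private boolean terminal; // true if a key ends at this node

    void insert(String key) {
        StringTrie node = this;
        for (char c : key.toCharArray()) {
            node = node.children.computeIfAbsent(c, x -> new StringTrie());
        }
        node.terminal = true;
    }

    boolean contains(String key) {
        StringTrie node = find(key);
        return node != null && node.terminal;
    }

    boolean hasPrefix(String prefix) {
        return find(prefix) != null;
    }

    // Walks one edge per character; returns null if the path does not exist.
    private StringTrie find(String s) {
        StringTrie node = this;
        for (char c : s.toCharArray()) {
            node = node.children.get(c);
            if (node == null) return null;
        }
        return node;
    }

    public static void main(String[] args) {
        StringTrie trie = new StringTrie();
        for (String k : new String[] {"A", "to", "tea", "ted", "ten", "i", "in", "inn"}) {
            trie.insert(k);
        }
        System.out.println(trie.contains("tea"));  // true
        System.out.println(trie.contains("te"));   // false (prefix only, not a key)
        System.out.println(trie.hasPrefix("te"));  // true
    }
}
```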

Hash Array Mapped Trie
• Symbol is a 5 bit sequence
• String is fixed in size, 32 bits, consisting of 7 symbols (last symbol is truncated to 2 bits)
• String is the hashCode of an Object (the key)

Hash Array Mapped Trie
0xCAFEBABE, bit positions 32 down to 01, grouped into 5-bit symbols:
  bits    32-31  30-26  25-21  20-16  15-11  10-06  05-01
  value   11     00101  01111  11101  01110  10101  11110
  symbol  s7     s6     s5     s4     s3     s2     s1
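The symbol split for 0xCAFEBABE can be checked mechanically. This sketch reuses the `symbolAtDepth` shift-and-mask from the talk's code; the wrapper class `HashSymbols` and the `symbols` helper are assumptions added for illustration. Depth 0 corresponds to s1 (the lowest 5 bits) and depth 6 to s7 (the top 2 bits).

```java
import java.util.Arrays;

class HashSymbols {
    static final int BITS_PER_SYMBOL = 5;
    static final int SYMBOL_COUNT = 7; // six full 5-bit symbols plus one 2-bit symbol

    // Extracts the symbol at depth d, counting from the least significant bits.
    static int symbolAtDepth(int h, int d) {
        return (h >>> (d * BITS_PER_SYMBOL)) & ((1 << BITS_PER_SYMBOL) - 1);
    }

    // All symbols of a hash code, ordered s1..s7.
    static int[] symbols(int h) {
        int[] result = new int[SYMBOL_COUNT];
        for (int d = 0; d < SYMBOL_COUNT; d++) {
            result[d] = symbolAtDepth(h, d);
        }
        return result;
    }

    public static void main(String[] args) {
        // s1=11110, s2=10101, s3=01110, s4=11101, s5=01111, s6=00101, s7=11
        System.out.println(Arrays.toString(symbols(0xCAFEBABE))); // [30, 21, 14, 29, 15, 5, 3]
    }
}
```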

HAMT properties
• Wide branching factor, 32
• Limited tree depth, 6
• Effectively constant time lookup
$O(\log_{32} N) = O(\log_2 N / \log_2 32) = O(\log_2 N / 5)$

HAMT properties
• Good structural sharing (for updates, merging and splitting) but also good memory usage and cache coherency
• The basis for vectors, where the index is the hash code (see also Relaxed Radix Balanced trees), and multi-maps
• Can be applied to mutable collections, for efficient construction of an immutable collection
• Efficiently implemented in Java

A naive implementation
public class PMap…
    public Optional…
    private Optional…
        if (_k == SUB_LAYER_NODE) { PMap…
    static int symbolAtDepth(int h, int d) {
        return (h >>> (d * 5) & (32 - 1));
    }
}
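Since the slide code survives only in fragments, here is a self-contained sketch of what a naive HAMT-based persistent map could look like. It keeps the slide's `SUB_LAYER_NODE` sentinel and `symbolAtDepth`, but everything else is an assumption made for illustration rather than the author's implementation: the name `NaivePMap`, the `get`/`put` signatures, the full 32-slot table per layer, and the flat-array collision node. Keys and values are assumed non-null, and removal is omitted.

```java
import java.util.Arrays;
import java.util.Optional;

final class NaivePMap<K, V> {
    private static final Object SUB_LAYER_NODE = new Object();
    private static final Object COLLISION_NODE = new Object();
    private static final int MAX_DEPTH = 6; // depths 0..6 consume all 32 hash bits

    // Two cells per slot: [k, v], [SUB_LAYER_NODE, NaivePMap],
    // [COLLISION_NODE, Object[]{k1, v1, k2, v2, ...}], or [null, null] when empty.
    private final Object[] nodes;

    private NaivePMap(Object[] nodes) { this.nodes = nodes; }

    static <K, V> NaivePMap<K, V> empty() { return new NaivePMap<>(new Object[32 * 2]); }

    public Optional<V> get(K k) { return get(k, k.hashCode(), 0); }

    @SuppressWarnings("unchecked")
    private Optional<V> get(K k, int h, int d) {
        int i = symbolAtDepth(h, d) * 2;
        Object _k = nodes[i];
        if (_k == null) return Optional.empty();
        if (_k == SUB_LAYER_NODE) return ((NaivePMap<K, V>) nodes[i + 1]).get(k, h, d + 1);
        if (_k == COLLISION_NODE) {
            Object[] kvs = (Object[]) nodes[i + 1];
            for (int j = 0; j < kvs.length; j += 2)
                if (kvs[j].equals(k)) return Optional.ofNullable((V) kvs[j + 1]);
            return Optional.empty();
        }
        return _k.equals(k) ? Optional.ofNullable((V) nodes[i + 1]) : Optional.empty();
    }

    // Returns a new map; this map is left untouched.
    public NaivePMap<K, V> put(K k, V v) { return put(k, v, k.hashCode(), 0); }

    @SuppressWarnings("unchecked")
    private NaivePMap<K, V> put(K k, V v, int h, int d) {
        int i = symbolAtDepth(h, d) * 2;
        Object _k = nodes[i];
        Object[] copy = nodes.clone(); // only layers on the path to k are copied
        if (_k == SUB_LAYER_NODE) {
            copy[i + 1] = ((NaivePMap<K, V>) nodes[i + 1]).put(k, v, h, d + 1);
        } else if (_k == COLLISION_NODE) {
            copy[i + 1] = putCollision((Object[]) nodes[i + 1], k, v);
        } else if (_k == null || _k.equals(k)) {
            copy[i] = k;
            copy[i + 1] = v;
        } else if (d == MAX_DEPTH) { // same 32-bit hash, different keys
            copy[i] = COLLISION_NODE;
            copy[i + 1] = new Object[] {_k, nodes[i + 1], k, v};
        } else { // two keys share this symbol: push both one layer down
            NaivePMap<K, V> sub = NaivePMap.<K, V>empty()
                    .put((K) _k, (V) nodes[i + 1], _k.hashCode(), d + 1)
                    .put(k, v, h, d + 1);
            copy[i] = SUB_LAYER_NODE;
            copy[i + 1] = sub;
        }
        return new NaivePMap<>(copy);
    }

    private static Object[] putCollision(Object[] kvs, Object k, Object v) {
        for (int j = 0; j < kvs.length; j += 2) {
            if (kvs[j].equals(k)) { // replace the value of an existing key
                Object[] c = kvs.clone();
                c[j + 1] = v;
                return c;
            }
        }
        Object[] c = Arrays.copyOf(kvs, kvs.length + 2); // append a new pair
        c[kvs.length] = k;
        c[kvs.length + 1] = v;
        return c;
    }

    static int symbolAtDepth(int h, int d) { return (h >>> (d * 5)) & (32 - 1); }

    public static void main(String[] args) {
        NaivePMap<String, Integer> m1 = NaivePMap.<String, Integer>empty().put("Aa", 1);
        NaivePMap<String, Integer> m2 = m1.put("BB", 2); // "Aa" and "BB" share a hashCode
        System.out.println(m2.get("BB")); // Optional[2]
        System.out.println(m1.get("BB")); // Optional.empty (m1 is unchanged)
    }
}
```

Allocating 64 cells per layer regardless of occupancy is what makes this version naive: most cells stay null.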

A compact representation
public class PMap…
    // [..., k, v, ...] or
    // [..., SUB_LAYER_NODE, PMap, ...] or
    // [..., COLLISION_NODE, CollisionNode, ...]
    // invariant: a sub-layer will not consist of a single mapping node
    @Stable private final Object[] nodes;

A better representation
private Optional…
    int nodeCount = bitmapCountFrom(bitmap, symbol);
    Object _k = nodes[nodeCount * 2];
    if (_k == SUB_LAYER_NODE) { PMap…

private static int bitmapCountFrom(int bitmap, int symbol) {
    return Integer.bitCount(bitmap & ((1 << symbol) - 1));
}
Compiles to POPCNT on x64
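To make the bitmap indexing concrete: bit s of the bitmap records whether symbol s has an entry, and the entry's position in the dense array is the number of set bits below s. The sketch below reuses `bitmapCountFrom` from the slide; the class `CompactNode` and the helpers `isPresent` and `valueAt` are assumptions added for illustration.

```java
class CompactNode {
    // Number of present symbols below 'symbol', i.e. its index in the dense array.
    static int bitmapCountFrom(int bitmap, int symbol) {
        return Integer.bitCount(bitmap & ((1 << symbol) - 1));
    }

    static boolean isPresent(int bitmap, int symbol) {
        return (bitmap & (1 << symbol)) != 0;
    }

    // Looks up the value for 'symbol' in a dense [k, v, k, v, ...] array,
    // or returns null if the symbol has no entry.
    static Object valueAt(int bitmap, Object[] nodes, int symbol) {
        if (!isPresent(bitmap, symbol)) return null;
        int nodeCount = bitmapCountFrom(bitmap, symbol);
        return nodes[nodeCount * 2 + 1];
    }

    public static void main(String[] args) {
        // Only symbols 3, 14 and 30 are present, so 6 cells are stored instead of 64.
        int bitmap = (1 << 3) | (1 << 14) | (1 << 30);
        Object[] nodes = {"k3", "v3", "k14", "v14", "k30", "v30"};
        System.out.println(valueAt(bitmap, nodes, 14)); // v14
        System.out.println(valueAt(bitmap, nodes, 30)); // v30
        System.out.println(valueAt(bitmap, nodes, 5));  // null
    }
}
```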

A better representation
• Space is required only for present nodes, using the bitmap (made possible with HotSpot optimizations)
• Further refinements (and tradeoffs) possible
• Sub nodes and entries could be separated for more cache friendly traversal (see Steindorfer’s work on compressed HAMTs, a.k.a. CHAMP)
• Hash codes could be cached

Persistent Map API
public void forEach(BiConsumer
public Optional
public PMap
public PMap

Persistent collections API
• Modifying methods return a new collection
• An implementation shares unmodified structure with the previous collection
• Mutable builders are required to construct a collection efficiently in a confined manner
• For example, closure/thread confined construction then freezing
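Confined construction followed by freezing can be sketched with a small builder. This uses a plain `ArrayList` rather than a persistent structure, and the class `ConfinedListBuilder` is hypothetical; the point is only that the mutable buffer stays private until `freeze()` hands it over without copying.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ConfinedListBuilder<T> {
    private List<T> buffer = new ArrayList<>(); // never leaves the builder while mutable

    ConfinedListBuilder<T> add(T element) {
        if (buffer == null) throw new IllegalStateException("already frozen");
        buffer.add(element);
        return this;
    }

    // Freezes without copying: the builder gives up its only reference to the
    // buffer, so afterwards nothing can reach it except through the wrapper.
    List<T> freeze() {
        if (buffer == null) throw new IllegalStateException("already frozen");
        List<T> frozen = Collections.unmodifiableList(buffer);
        buffer = null;
        return frozen;
    }

    public static void main(String[] args) {
        List<String> list = new ConfinedListBuilder<String>().add("a").add("b").freeze();
        System.out.println(list); // [a, b]
    }
}
```

The builder itself is not thread safe, so it should stay confined to one thread or closure until frozen.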

Demo: Visualizing HAMT-based persistent maps
https://github.com/PaulSandoz/per/

Summary
• Unmodifiable is a reasonable abstraction for mutable but not immutable
• For efficient immutable collections we need persistent collections
• Sets, maps and vectors using HAMTs have proven to be effective in many libraries and platforms

What about Java?
• We shall continue to improve on unmodifiable in the JDK
• Selective sedimentation of persistent collections into the Java platform?
• Claim: it is possible to optimize such collections very aggressively with internal APIs, HotSpot, and safely contained unsafe mechanisms

References
• Fast And Space Efficient Trie Searches, Bagwell https://pdfs.semanticscholar.org/93a1/fe7f226cfbc7cb2bceac39308a66c8aef0b0.pdf
• Ideal Hash Trees, Bagwell http://lampwww.epfl.ch/papers/idealhashtrees.pdf
• RRB-Trees: Efficient Immutable Vectors, Bagwell and Rompf https://infoscience.epfl.ch/record/169879/files/RMTrees.pdf
• Optimizing Hash-Array Mapped Tries for Fast Lean Immutable JVM Collections, Steindorfer and Vinju https://michael.steindorfer.name/publications/oopsla15.pdf
• Efficient Immutable Collections - PhD Thesis - Steindorfer https://michael.steindorfer.name/publications/phd-thesis-efficient-immutable-collections.pdf
• Cache-Aware Lock-Free Concurrent Hash Tries, Prokopec, Bagwell, Odersky https://infoscience.epfl.ch/record/166908/files/ctries-techreport.pdf