Deep dive into the OpenJ9 GC technologies

Charlie Gracie IBM Senior Software Developer charlie_gracie@ca..com @crgracie charliegracie

1 Important Disclaimers

• THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. • WHILST EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. • ALL PERFORMANCE DATA INCLUDED IN THIS PRESENTATION HAVE BEEN GATHERED IN A CONTROLLED ENVIRONMENT. YOUR OWN TEST RESULTS MAY VARY BASED ON HARDWARE, SOFTWARE OR INFRASTRUCTURE DIFFERENCES. • ALL DATA INCLUDED IN THIS PRESENTATION ARE MEANT TO BE USED ONLY AS A GUIDE. • IN ADDITION, THE INFORMATION CONTAINED IN THIS PRESENTATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM, WITHOUT NOTICE. • IBM AND ITS AFFILIATED COMPANIES SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. • NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF: • CREATING ANY WARRANT OR REPRESENTATION FROM IBM, ITS AFFILIATED COMPANIES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS

2 About me

• Senior Software Developer on the IBM Runtime Technologies Team • Project Lead on Eclipse OMR and a committer on Eclipse OpenJ9 • Has worked on Garbage Collection Technology for over 10 years with a focus on scalability

3 OpenJ9 GC technologies

• Open J9 Project • Garbage collection overview • Open J9 GC technologies • Questions

4 Eclipse OpenJ9 Created Sept 2017

http://www.eclipse.org/openj9 https://github.com/eclipse/openj9

Dual License: v2.0 Apache 2.0

Users and contributors very welcome https://github.com/eclipse/openj9/blob/master/CONTRIBUTING.md

5 The place to get OpenJDK builds

For both: • OpenJDK with Hotspot • OpenJDK with Eclipse OpenJ9 https://adoptopenjdk.net/releases.html?va riant=openjdk9-openj9

6 Eclipse OMR Created March 2016

http://www.eclipse.org/omr https://github.com/eclipse/omr https://developer.ibm.com/open/omr/

Dual License: Eclipse Public License v2.0 Apache 2.0

Users and contributors very welcome https://github.com/eclipse/omr/blob/master/CONTRIBUTING.md

7 Garbage Collection

“Garbage Collection (GC) is form automatic memory management. The garbage collector attempts to reclaim memory occupied by objects that are no longer is use by the program.”

8 Garbage Collection

Positives: 1. Automatic memory management 2. Help reduce certain categories of bugs

Negatives: 1. Requires additional resources 2. Can cause application pauses 3. May introduce runtime costs 4. Application has little control of when memory is reclaimed

9 OpenJ9 GC goals

1. Implement re-usable technology 2. Provide highly scalable GC technology 3. Favor smaller memory consumption

10 Garbage collection policies

• -Xgcpolicy: 1. optthruput – stop the world parallel collector

11 Garbage collection policies

• -Xgcpolicy: 1. optthruput – stop the world parallel collector 2. optavgpause – concurrent collector

12 Garbage collection policies

• -Xgcpolicy: 1. optthruput – stop the world parallel collector 2. optavgpause – concurrent collector 3. gencon – generational copying collector

13 Garbage collection policies

• -Xgcpolicy: 1. optthruput – stop the world parallel collector 2. optavgpause – concurrent collector 3. gencon – generational copying collector 4. balanced – region based generational collector

14 Garbage collection policies

• -Xgcpolicy: 1. optthruput – stop the world parallel collector 2. optavgpause – concurrent collector 3. gencon – generational copying collector 4. balanced – region based generational collector 5. metronome – incremental soft realtime collector

15 -Xgcpolicy:optthruput

• Parallel global collector • GC operations are completed in Stop the world (STW) pauses • Mark, sweep and optional compaction collector • All GC operations are completed in parallel by multiple helper threads • GC native memory overhead for mark map and work packets

16 -Xgcpolicy:optthruput heap

• Flat heap • 1 continuous block of memory

Heap

17 -Xgcpolicy:optthruput heap

• Flat heap • 1 continuous block of memory • Divided into 2 logical memory areas for allocation • SOA – small object area • LOA – large object area

SOA LOA Heap

18 -Xgcpolicy:optthruput heap

• Allocation is a first fit algorithm • OpenJ9 uses TLHs to improve allocation performance*

Heap _freelist _freelist

19 -Xgcpolicy:optthruput heap

• Allocation is a first fit algorithm • OpenJ9 uses TLHs to improve allocation performance*

Heap _freelist _freelist

20 -Xgcpolicy:optthruput heap

• Allocation is a first fit algorithm • OpenJ9 uses TLHs to improve allocation performance*

Heap _freelist _freelist

21 -Xgcpolicy:optthruput heap

• Allocation is a first fit algorithm • OpenJ9 uses TLHs to improve allocation performance* • Allocate until SOA is completely full

Heap _freelist _freelist

22 -Xgcpolicy:optthruput heap

• Allocation is a first fit algorithm • OpenJ9 uses TLHs to improve allocation performance* • Allocate until SOA is completely full • Now perform a GC

Heap _freelist _freelist

23 -Xgcpolicy:optthruput GC

24 -Xgcpolicy:optthruput GC

GC 1 GC 1 Start End

25 -Xgcpolicy:optthruput GC

GC 1 GC 1 Start End

26 -Xgcpolicy:optthruput GC

GC 1 GC 1 GC 2 GC 2 Start End Start End

27 -Xgcpolicy:optthruput GC

GC 1 GC 1 GC 2 GC 2 Start End Start End

28 -Xgcpolicy:optthruput GC

Each GC is divided into 3 phases: 1. Marking – finds all of the live objects 2. Sweeping – reclaims memory for dead objects 3. Compaction – (optional) defragment the heap

29 -Xgcpolicy:optthruput marking

A B D E F G H I J K L M N K L Heap

30 -Xgcpolicy:optthruput marking

Root Set A D I L

A B C D E F G H I J K L M N K L Heap

31 -Xgcpolicy:optthruput marking

Root Set Mark map A D I L

A B C D E F G H I J K L M N K L Heap

32 -Xgcpolicy:optthruput marking

Root Set Mark map A D I L

A B C D E F G H I J K L M N K L Heap

33 -Xgcpolicy:optthruput marking

Root Set Mark map

A 1 D I L

A B C D E F G H I J K L M N K L Heap

34 -Xgcpolicy:optthruput marking Work Stack Output Root Set Input Mark map

A 1 D I A L

A B C D E F G H I J K L M N K L Heap

35 -Xgcpolicy:optthruput marking Work Stack Output Root Set Input Mark map

A 1 1 D I D A L

A B C D E F G H I J K L M N K L Heap

36 -Xgcpolicy:optthruput marking Work Stack Output Root Set Input Mark map

A 1 1 1 D I I D A L

A B C D E F G H I J K L M N K L Heap

37 -Xgcpolicy:optthruput marking Work Stack Output Root Set Input Mark map L A 1 1 1 1 D I I D A L

A B C D E F G H I J K L M N K L Heap

38 -Xgcpolicy:optthruput marking Work Stack Input Output Mark map L 1 1 1 1 I D A

A B C D E F G H I J K L M N K L Heap

39 -Xgcpolicy:optthruput marking Work Stack Work List Input Output Mark map L 1 1 1 1 I D A

A B C D E F G H I J K L M N K L Heap

40 -Xgcpolicy:optthruput marking Work Stack Work List Input Output Mark map AA L 1 1 1 1 D DI I ID L LA

A B C D E F G H I J K L M N K L Heap

41 -Xgcpolicy:optthruput marking Work Stack Work List Input Output Mark map AQ L 1 1 1 1 DS I IP D LY A

A B C D E F G H I J K L M N K L Heap

42 -Xgcpolicy:optthruput marking Work Stack Work List Input Output Mark map AQ L 1 1 1 1 DS I IP D LY A

A B C D E F G H I J K L M N K L Heap

43 -Xgcpolicy:optthruput marking Work Stack Work List Input Output Mark map AQ 1 1 1 1 DS I IP D LY A

A B C D E F G H I J K L M N K L Heap

44 -Xgcpolicy:optthruput marking Work Stack Work List Input Output Mark map AQ 1 1 1 1 1 DS I IP D LY A M

A B C D E F G H I J K L M N K L Heap

45 -Xgcpolicy:optthruput marking Work Stack Work List Input Output Mark map AQ 1 1 1 1 1 DS IP D LY A M

A B C D E F G H I J K L M N K L Heap

46 -Xgcpolicy:optthruput marking Work Stack Work List Input Output Mark map AQ 1 1 1 1 1 1 DS IP D K LY A M

A B C D E F G H I J K L M N K L Heap

47 -Xgcpolicy:optthruput marking Work Stack Work List Input Output Mark map AQ 1 1 1 1 1 1 DS IP K LY A M

A B C D E F G H I J K L M N K L Heap

48 -Xgcpolicy:optthruput marking Work Stack Work List Input Output Mark map AQ 1 1 1 1 1 1 1 DS F IP K LY A M

A B C D E F G H I J K L M N K L Heap

49 -Xgcpolicy:optthruput marking Work Stack Work List Input Output Mark map AQ 1 1 1 1 1 1 1 DS F IP K LY M

A B C D E F G H I J K L M N K L Heap

50 -Xgcpolicy:optthruput marking Work Stack Work List Input Output Mark map AQ 1 1 1 1 1 1 1 DS F IP K LY M

A B C D E F G H I J K L M N K L Heap

51 -Xgcpolicy:optthruput marking Work Stack Work List Input Output Mark map AQ B 1 1 1 1 1 1 1 1 DS F IP K LY M

A B C D E F G H I J K L M N K L Heap

52 -Xgcpolicy:optthruput marking Work Stack Work List Input Output Mark map AQ B 1 1 1 1 1 1 1 1 D SF I PK L YM

A B C D E F G H I J K L M N K L Heap

53 -Xgcpolicy:optthruput marking Work Stack Work List Input Output Mark map

1 1 1 1 1 1 1 1

A B C D E F G H I J K L M N K L Heap

54 -Xgcpolicy:optthruput sweeping

Mark map

1 1 1 1 1 1 1 1

_freelist

A B C D E F G H I J K L M N K L Heap

55 -Xgcpolicy:optthruput sweeping

Mark map

1 1 1 1 1 1 1 1

_freelist

A B C D E F G H I J K L M N K L Heap

56 -Xgcpolicy:optthruput sweeping

Mark map

1 1 1 1 1 1 1 1

_freelist

A B D E F G H I J K L M N K L Heap

57 -Xgcpolicy:optthruput sweeping

Mark map

1 1 1 1 1 1 1 1

_freelist

A B D F G H I J K L M N K L Heap

58 -Xgcpolicy:optthruput sweeping

Mark map

1 1 1 1 1 1 1 1

_freelist

A B D F I J K L M N K L Heap

59 -Xgcpolicy:optthruput sweeping

Mark map

1 1 1 1 1 1 1 1

_freelist

A B D F I K L M N K L Heap

60 -Xgcpolicy:optthruput sweeping

Mark map

1 1 1 1 1 1 1 1

_freelist

A B D F I K M L Heap

61 -Xgcpolicy:optthruput sweeping

Mark map

1 1 1 1 1 1 1 1

_freelist

A B D F I K M L Heap

62 -Xgcpolicy:optthruput compaction

Mark map

1 1 1 1 1 1 1 1

_freelist

A B D F I K M L Heap

63 -Xgcpolicy:optthruput compaction

Compact table

_freelist

A B D F I K M L Heap

64 -Xgcpolicy:optthruput compaction

Compact table

256

_freelist

A B D F I K M L Heap

65 -Xgcpolicy:optthruput compaction

Compact table

256 512

_freelist

A B D F I K M L Heap

66 -Xgcpolicy:optthruput compaction

Compact table

256 512 1024

_freelist

A B D F I K M L Heap

67 -Xgcpolicy:optthruput compaction

Compact table

256 512 1024 1280 1536 2048

_freelist

A B D F I K M L Heap

68 -Xgcpolicy:optthruput compaction

Compact table

256 512 1024 1280 1536 2048

_freelist

A B D F I K M L Heap

69 -Xgcpolicy:optthruput compaction

Compact table

256 512 1024 1280 1536 2048

A.field1 = D

A B D F I K M L Heap

70 -Xgcpolicy:optthruput compaction

Compact table

256 512 1024 1280 1536 2048

A.field1 = (D – compactTable[0])

A B D F I K M L Heap

71 -Xgcpolicy:optthruput compaction

Compact table

256 512 1024 1280 1536 2048

A.field1 = (D – 256)

A B D F I K M L Heap

72 -Xgcpolicy:optavgpause

• Concurrent global collector • Mark and sweep phases are completed concurrently with the application. If required, compaction is completed STW • Utilizes a low priority background to complete concurrent work • Application threads also perform concurrent GC work! • Improves STW pause times and application responsiveness • Introduces the requirement for an object write barrier • GC native memory overhead for mark map, work packets and card table

73 -Xgcpolicy:optavgpause heap

• Heap layout is the same as optthruput

SOA LOA Heap

74 -Xgcpolicy:optavgpause heap

• Allocation is the same as with optthruput

Heap _freelist _freelist

75 -Xgcpolicy:optavgpause heap

• Allocation is the same as with optthruput

Heap _freelist _freelist

76 -Xgcpolicy:optavgpause heap

• Allocation is the same as with optthruput

Heap _freelist _freelist

77 -Xgcpolicy:optavgpause heap

• Allocation is the same as with optthruput

Heap _freelist _freelist

78 -Xgcpolicy:optavgpause heap

• Allocation is the same as with optthruput • Start a concurrent GC before the heap memory is exhausted

Heap _freelist _freelist

Kick off concurrent GC

79 -Xgcpolicy:optavgpause heap

• Allocation is the same as with optthruput • Start a concurrent GC before the heap memory is exhausted • Application runs during the GC so allocations continue

Heap _freelist _freelist

Kick off concurrent GC

80 -Xgcpolicy:optavgpause heap

• Allocation is the same as with optthruput • Start a concurrent GC before the heap memory is exhausted • Application runs during the GC so allocations continue

Heap _freelist _freelist

Kick off concurrent GC Finish GC

81 -Xgcpolicy:optavgpause GC

82 -Xgcpolicy:optavgpause GC

GC 1 Start

83 -Xgcpolicy:optavgpause GC

GC 1 Start

84 -Xgcpolicy:optavgpause GC

GC 1 GC 1 Start End

85 -Xgcpolicy:optavgpause GC

GC 1 GC 1 Start End

86 -Xgcpolicy:optavgpause GC

GC 1 GC 1 GC 2 Start End Start

87 -Xgcpolicy:optavgpause GC

GC 1 GC 1 GC 2 Start End Start

88 -Xgcpolicy:optavgpause GC

GC 1 GC 1 GC 2 GC 2 Start End Start End

89 -Xgcpolicy:optavgpause GC

GC 1 GC 1 GC 2 GC 2 Start End Start End

90 -Xgcpolicy:optavgpause write barrier

• Why is there a write barrier required?

91 -Xgcpolicy:optavgpause write barrier

A

B

92 -Xgcpolicy:optavgpause write barrier

A

B C

93 -Xgcpolicy:optavgpause write barrier

• How is the write barrier implemented?

94 -Xgcpolicy:optavgpause write barrier

• How is the write barrier implemented?

private void setField(Object A, Object C) { A.field1 = C; }

95 -Xgcpolicy:optavgpause write barrier

• How is the write barrier implemented?

private void setField(Object A, Object C) { A.field1 = C; if (concurrentGCActive) { cardTable->dirtyCard(A); } }

96 -Xgcpolicy:optavgpause write barrier

Mark map

1 1

Card table

A B C Heap

97 -Xgcpolicy:optavgpause write barrier

Mark map

1 1

Card table

A B C Heap

98 -Xgcpolicy:optavgpause write barrier

Mark map

1 1

Card table

1

A B C Heap

99 -Xgcpolicy:optavgpause GC

Mark map

1 1

Card table

1

A B C Heap

100 -Xgcpolicy:optavgpause GC

Mark map

1 1

Card table

1

A B C Heap

101 -Xgcpolicy:optavgpause GC

Mark map

1 1

Card table

1

A B C Heap

102 -Xgcpolicy:optavgpause GC

Mark map

1 1

Card table

1

A B C Heap

103 -Xgcpolicy:optavgpause GC

Mark map

1 1 1

Card table

1

A B C Heap

104 -Xgcpolicy:gencon

• Generational copy collector • Focuses collection on a small part of the heap • Provides a significant reduction in GC STW pause times • Introduces another write barrier for the remembered set • GC native memory overhead for mark map, work packets, card table and copy scan caches • Concurrent Global collector from optavgpause

105 -Xgcpolicy:gencon heap

• Heap is divided into Nursery and Tenure areas • Generally the Nursery is smaller than the Tenure area

Nursery Tenure Heap

106 -Xgcpolicy:gencon heap

• Heap is divided into Nursery and Tenure spaces • Generally the Nursery is smaller than Tenure • The Nursery is divided into 2 logical spaces: Allocate and Survivor • Objects are allocated in Allocate space*

Allocate Survivor Tenure Heap

107 -Xgcpolicy:gencon heap

• Heap is divided into Nursery and Tenure spaces • Generally the Nursery is smaller than Tenure • The Nursery is divided into 2 logical spaces: Allocate and Survivor • Objects are allocated in Allocate space*

Heap _freelist _freelist

108 -Xgcpolicy:gencon heap

• Heap is divided into Nursery and Tenure spaces • Generally the Nursery is smaller than Tenure • The Nursery is divided into 2 logical spaces: Allocate and Survivor • Objects are allocated in Allocate space*

Heap _freelist _freelist

109 -Xgcpolicy:gencon heap

• Heap is divided into Nursery and Tenure spaces • Generally the Nursery is smaller than Tenure • The Nursery is divided into 2 logical spaces: Allocate and Survivor • Objects are allocated in Allocate space*

Heap _freelist _freelist

110 -Xgcpolicy:gencon heap

• Heap is divided into Nursery and Tenure spaces • Generally the Nursery is smaller than Tenure • The Nursery is divided into 2 logical spaces: Allocate and Survivor • Objects are allocated in Allocate space*

Heap _freelist _freelist

111 -Xgcpolicy:gencon GC

112 -Xgcpolicy:gencon GC

Scavenge 1

113 -Xgcpolicy:gencon GC

Scavenge 1

114 -Xgcpolicy:gencon GC

Scavenge Scavenge 1 2

115 -Xgcpolicy:gencon GC

Scavenge Scavenge 1 2

116 -Xgcpolicy:gencon GC

Scavenge Scavenge Global 1 2 Start

117 -Xgcpolicy:gencon GC

Scavenge Scavenge Global 1 2 Start

118 -Xgcpolicy:gencon GC

Scavenge Scavenge Global Scavenge 1 2 Start 3

119 -Xgcpolicy:gencon GC

Scavenge Scavenge Global Scavenge 1 2 Start 3

120 -Xgcpolicy:gencon GC

Scavenge Scavenge Global Scavenge Global 1 2 Start 3 End

121 -Xgcpolicy:gencon GC

Scavenge Scavenge Global Scavenge Global 1 2 Start 3 End

122 -Xgcpolicy:gencon generational write barrier

• Why is there a write barrier required?

123 -Xgcpolicy:gencon generational write barrier

• Why is there a write barrier required?

• The GC needs to be able to find objects in the nursery which are only referenced from tenure space

124 -Xgcpolicy:gencon generational write barrier

• How is the write barrier implemented?

private void setField(Object A, Object C) { A.field1 = C; }

125 -Xgcpolicy:gencon generational write barrier

• How is the write barrier implemented? private void setField(Object A, Object C) { A.field1 = C; if (A is tenured) { if (C is NOT tenured) { remember(A); } } }

126 -Xgcpolicy:gencon generational write barrier

• How is the write barrier implemented? private void setField(Object A, Object C) { A.field1 = C; if (A is tenured) { if (C is NOT tenure) { remember(A); } if (concurrentGCActive) { cardTable->dirtyCard(A); } } } 127 -Xgcpolicy:gencon GC

A B C D E F G H

Allocate Survivor Tenure 128 -Xgcpolicy:gencon GC

Root Set A D F

A B C D E F G H

Allocate Survivor Tenure 129 -Xgcpolicy:gencon GC

Root Set A D F

A B C D E F G H

Allocate Survivor Tenure 130 -Xgcpolicy:gencon GC Work List

Root Set Scan cache A Copy cache D F

A B C D E F G H

Allocate Survivor Tenure 131 -Xgcpolicy:gencon GC

Root Set Scan cache A Copy cache D F

A B C D E F G H

Allocate Survivor Tenure 132 -Xgcpolicy:gencon GC

Root Set Scan cache A Copy cache A1 D F

B C D E F G A1 H

Allocate Survivor Tenure 133 -Xgcpolicy:gencon GC

Root Set Scan cache A Copy cache A1 D F

B C D E F G A1 H

Allocate Survivor Tenure 134 -Xgcpolicy:gencon GC

Root Set Scan cache 1 A Copy cache A1 D F

B C D E F G A1 H

Allocate Survivor Tenure 135 -Xgcpolicy:gencon GC

Root Set Scan cache 1 A Copy cache A1 D F

B C D E F G A1 H

Allocate Survivor Tenure 136 -Xgcpolicy:gencon GC

Root Set Scan cache 1 A Copy cache A1 D1 D1 F

B C E F G A1 D1 H

Allocate Survivor Tenure 137 -Xgcpolicy:gencon GC

Root Set Scan cache 1 A Copy cache A1 D1 D1 F

B C E F G A1 D1 H

Allocate Survivor Tenure 138 -Xgcpolicy:gencon GC

Root Set Scan cache 1 A Copy cache A1 D1 F1 D1 F1

B C E G A1 D1 F1 H

Allocate Survivor Tenure 139 -Xgcpolicy:gencon GC

Work list Scan cache

Copy cache A1 D1 F1

B C E G A1 D1 F1 H

Allocate Survivor Tenure 140 -Xgcpolicy:gencon GC

Work list Scan cache 1 1 1 A 1 D 1 F 1 AA1 DD1 FF1 Copy cache

B C E G A1 D1 F1 H

Allocate Survivor Tenure 141 -Xgcpolicy:gencon GC

Work list Scan cache A1 D1 F1 1 1 1 A 1 D 1 F 1 Q R L Copy cache A1 D1 F1

B C E G A1 D1 F1 H

Allocate Survivor Tenure 142 -Xgcpolicy:gencon GC

Work list Scan cache A1 D1 F1 C1 1 1 1 A 1 D 1 F 1 Q R L Copy cache A1 D1 F1 C1

B E G A1 D1 F1 C1 H

Allocate Survivor Tenure 143 -Xgcpolicy:gencon GC

Work list Scan cache A1 D1 F1 C1 1 1 1 A 1 D 1 F 1 Q R L Copy cache A1 D1 F1 C1

B E G A1 D1 F1 C1 H

Allocate Survivor Tenure 144 -Xgcpolicy:gencon GC

Work list Scan cache A1 D1 F1 C1 1 1 1 A 1 D 1 F 1 Q R L Copy cache A1 D1 F1 C1

B E G A1 D1 F1 C1 H

Allocate Survivor Tenure 145 -Xgcpolicy:gencon GC

Work list Scan cache A1 D1 F1 C1 1 1 1 A 1 D 1 F 1 Q R L Copy cache A1 D1 F1 C1

B E G A1 D1 F1 C1 H

Allocate Survivor Tenure 146 -Xgcpolicy:gencon GC

Work list Scan cache A1 D1 F1 C1 1 1 1 A 1 D 1 F 1 Q R L Copy cache A1 D1 F1 C1

B E G A1 D1 F1 C1 H

Allocate Survivor Tenure 147 -Xgcpolicy:gencon GC

Work list Scan cache A1 D1 F1 C1 1 1 1 A 1 D 1 F 1 Q R L Copy cache A1 D1 F1 C1

B E G A1 D1 F1 C1 H

Allocate Survivor Tenure 148 -Xgcpolicy:gencon GC

Work list Scan cache

Copy cache

B E G A1 D1 F1 C1 H

Allocate Survivor Tenure 149 -Xgcpolicy:gencon GC

B E G A1 D1 F1 C1 H

Allocate Survivor Tenure 150 -Xgcpolicy:gencon GC

A1 D1 F1 C1 H

Survivor Allocate Tenure 151 -Xgcpolicy:gencon GC

A1 D1 F1 C1 H

Survivor Allocate Tenure 152 -Xgcpolicy:balanced

• Region based generational collector • Partial Garbage Collections (PGC) focus on high ROI regions • Goal of no global collections • NUMA aware allocation and GC • Provides a significant reduction in max GC STW pause times • Introduces a write barrier to track inter region references • GC native memory overhead for 2 mark maps, work packets, remembered set card table and copy scan caches • Full parallel global as a fall back if the PGCs can not keep up

153 -Xgcpolicy:balanced heap

• Heap is divided into a fixed number of regions • Region size is always a power of 2 • Attempts to have between 1000-2000 regions • Bigger heap == bigger region size • Regions are evenly distributed across NUMA nodes

Heap

154 -Xgcpolicy:balanced heap

• Allocate from Eden regions • Eden can be any set of completely free regions • Attempts to pick regions from each NUMA node • Application threads allocate from the NUMA node they are running on

Heap

155 -Xgcpolicy:balanced heap

• No non-array object can be larger than region size • Object size > region size == OutOfMemoryError • Large arrays are allocated as arraylets • Arrays less than region size are allocated as normal arrays

156 -Xgcpolicy:balanced arraylets

• Arraylets are a dis-contiguous representation of arrays • Array is create from construct and an arraylet spine and 1 or more arraylet leaves • An arraylet spine is allocate like a normal object • Each leaf consumes an entire region

157 -Xgcpolicy:balanced arraylets

Leaf 1 • To access an element you calculate Arraylet spine the leaf (index / leaf size) Header • Then you calculate the index into that leaf (index % leaf size)

Data Leaf 2

158 -Xgcpolicy:balanced arraylets

• Array element access is slower due to an extra dereference • JNI critical sections causes the entire data section of the arraylet to be copied into a native memory buffer

159 -Xgcpolicy:balanced heap

• What does allocation look like?

Heap

160 -Xgcpolicy:balanced heap

Heap

161 -Xgcpolicy:balanced heap

Heap

162 -Xgcpolicy:balanced heap

Heap

163 -Xgcpolicy:balanced heap

• Time for a PGC

Heap

164 -Xgcpolicy:balanced GC

165 -Xgcpolicy:balanced GC

PGC 1

166 -Xgcpolicy:balanced GC

PGC 1

167 -Xgcpolicy:balanced GC

GMP PGC 1 start

168 -Xgcpolicy:balanced GC

GMP PGC 1 start

169 -Xgcpolicy:balanced GC

GMP PGC 1 start PGC 2

170 -Xgcpolicy:balanced GC

GMP PGC 1 start PGC 2

171 -Xgcpolicy:balanced GC

GMP GMP PGC 1 start PGC 2 End

172 -Xgcpolicy:balanced GC

GMP GMP PGC 1 start PGC 2 End

173 -Xgcpolicy:balanced GC

GMP GMP PGC 1 start PGC 2 End PGC 3

174 -Xgcpolicy:balanced GC

GMP GMP PGC 1 start PGC 2 End PGC 3

175 -Xgcpolicy:balanced Global Mark Phase (GMP)

• Does not reclaim any memory! • Performs a marking phase only • Scheduled to run in between PGCs • Builds an accurate mark map of the whole heap • Mark map is used to predict region ROI for PGC • Scrubs RSCL

176 -Xgcpolicy:balanced PGC

• Select regions for inclusion • Always select Eden regions • Use GMP mark map to predict regions with low live counts • If RSCL has too many entries it is not a good candidate • RSCL for selected regions becomes a root

Heap 177 -Xgcpolicy:balanced PGC

• Select regions for inclusion • Always select Eden regions • Use GMP mark map to predict regions with low live counts • If RSCL has too many entries it is not a good candidate • RSCL for selected regions becomes a root

CS CS CS CS Heap 178 -Xgcpolicy:balanced PGC

• Select regions for inclusion • Always select Eden regions • Use GMP mark map to predict regions with low live counts • If RSCL has too many entries it is not a good candidate • RSCL for selected regions becomes a root

CS CS CS CS CS CS Heap 179 -Xgcpolicy:balanced PGC

• Perform the copy forward operation

Heap 180 -Xgcpolicy:balanced PGC

• Perform the copy forward operation • Pick the next free regions for Eden

Heap 181 -Xgcpolicy:balanced inter region write barrier

• Why is there a write barrier required?

182 -Xgcpolicy:balanced inter region write barrier

• Why is there a write barrier required?

• Balanced PGCs can select any region to be included in the collect. • Similar to the generational barrier the GC needs to know which regions reference a given region

183 -Xgcpolicy:balanced inter region write barrier

• How is the write barrier implemented?

private void setField(Object A, Object C) { A.field1 = C; }

184 -Xgcpolicy:balanced inter region write barrier

• How is the write barrier implemented? private void setField(Object A, Object C) { A.field1 = C; if (findRegion(A) != findRegion(C)) { addRSCLEntryFor(C, A); } }

185 -Xgcpolicy:metronome

• Incremental soft realtime collector • Ultra low STW pause times of 3ms • Incremental mark and sweep phases with no compactor • Application receives 70% of the utilization • In any timing windows of 100ms application will run for 70ms • Provides a significant reduction in max GC STW pause times • Uses a Snapshot At The Beginning (SATB) write barrier • High percentage of floating garbage • GC native memory overhead for a mark map, work packets and a remembered set

186 -Xgcpolicy:metronome heap

• Heap is divided into fixed sized regions • Region size is 64k • Bigger heap == more regions

Heap

187 -Xgcpolicy:metronome heap

• Cell based allocation • Regions are assigned a cell size on first use • Cell sizes range from 16 bytes to 2048 bytes • Only objects of that cell size can be allocated in a region of a particular cell size

Heap

188 -Xgcpolicy:metronome heap

• Objects larger than 2048 bytes consume contiguous regions • If no contiguous regions a full STW GC is completed before OOM • Large arrays are allocated as arraylets • Arraylet leaves are 2048 bytes • Each arraylet leaf region contains 32 arraylet leaves

Heap

189 -Xgcpolicy:metronome heap

• What does allocation look like?

Heap

190 -Xgcpolicy:metronome heap

• What does allocation look like?

Heap

191 -Xgcpolicy:metronome heap

• What does allocation look like?

Heap

192 -Xgcpolicy:metronome heap

• What does allocation look like?

Heap

193 -Xgcpolicy:metronome heap

• Time to start a GC before the heap is exhausted • GCs are triggered with 50% heap free

Heap

194 -Xgcpolicy:metronome heap

• While the GC is happening heap is still consumed • All objects allocated during the GC are kept alive for this GC

Heap

195 -Xgcpolicy:metronome heap

• At marking end the heap basically full

Heap

196 -Xgcpolicy:metronome heap

• Sweep phase will start freeing memory

Heap

197 -Xgcpolicy:metronome heap

• Sweep phase will start freeing memory

Heap

198 -Xgcpolicy:metronome heap

• At sweep complete the application goes back to getting 100% utilization • If less than 50% of the heap is freed another GC is triggered

Heap

199 -Xgcpolicy:metronome GC

200 -Xgcpolicy:metronome GC

GC Start

201 -Xgcpolicy:metronome GC

GC Start

202 -Xgcpolicy:metronome GC

GC Start

203 -Xgcpolicy:metronome GC

GC Start

204 -Xgcpolicy:metronome GC

GC Start

205 -Xgcpolicy:metronome GC

GC Start

206 -Xgcpolicy:metronome GC

GC Start

207 -Xgcpolicy:metronome GC

GC Start

208 -Xgcpolicy:metronome GC

GC Start

209 -Xgcpolicy:metronome GC

GC Start

210 -Xgcpolicy:metronome GC

GC Start GC End

211 -Xgcpolicy:metronome GC

GC Start GC End

212 -Xgcpolicy:metronome SATB write barrier

• A snapshot at the beginning barrier makes the guarantee that all objects alive at the beginning of the GC will be alive at the end • This causes a lot of floating garbage but ensures correctness for the incremental collector • The barrier has to be performed before the store

213 -Xgcpolicy:metronome SATB write barrier

• How is the write barrier implemented?

private void setField(Object A, Object C) { A.field1 = C; }

214 -Xgcpolicy:metronome SATB write barrier

• How is the write barrier implemented? private void setField(Object A, Object C) { Object temp = A.field1; if (barrier isActive) { remember(temp): } A.field1 = C; }

215 -Xgcpolicy:metronome GC

• Metronome uses the same marking and sweeping as optthruput • Root scanning, marking and sweeping all have yield points • The remembered set is added as a new root • If the pause time has been reached the GC pauses and lets the application run again

216 GC policy quick guide

GC Policy Domain Memory used in addition to Throughput heap (% of Xmx) gencon* Webservers, desktop applications, 9 Highest runs most apps well balanced Very large heaps, large NUMA 13.5 High systems metronome Soft-realtime systems, trading 4.5 Low-Medium applications, etc. optthruput Small heaps with little GC 4.5 Medium optavgpause Why not gencon? 6 Low-Medium

217 Links

• Eclipse OMR https://www.eclipse.org/omr/

• Eclipse OpenJ9 https://www.eclipse.org/openj9

• AdoptOpenJDK https://adoptopenjdk.net/?variant=openjdk8-openj9

218 Questions???

219