Reference Object Processing in On-The-Fly Garbage Collection

Tomoharu Ugawa, Kochi University of Technology Richard Jones, Carl Ritson, University of Kent Weak Pointers

• Weak pointers are a mechanism to allow mutators to communicate with GC • java.lang.ref.Reference • Specification requires us to process weak references atomically from the view point of mutators • Fully concurrent (on-the-fly) GC never stops all mutators Java reference types stronger • Strong - usual references. • Soft - used for caches that the GC can reclaim. • Weak - used for canonicalize mappings (e.g., interned strings) that do not prevent GC from reclaiming their keys or values. • Phantom - used for scheduling pre-mortem cleanup actions more flexibility than finalisers. weaker Java reference types stronger • Strong - usual references. • Soft - used for caches that the GC can reclaim. reduce to strong/weak references • Weak - used for canonicalize mapping (e.g., interned strings) that do not prevent GC from reclaiming their keys or values. • Phantom - used for scheduling pre-mortem cleanup actions more flexibility than finalisers. no interaction with mutators weaker root strongly Reachability reachable

Strongly - can be reached without traversing any strong other references. reference Weakly - not strongly reachable but can be reached by traversing weak references.

weak reference • No formal specification • Specification is written in English. • There are errors in implementations • We formalised the specification root strongly GC actions reachable

The GC finds all strongly reachable objects and reclaims others.

The GC “clears” references whose referents are weakly reachable. • Weak reference to Strongly - to be retained • Weak reference to Weakly - to be cleared root strongly Reference.get() reachable

Reference Normal object object

Reference.get() - returns a strong reference to its target or null if the GC has cleared. get() may make some objects that were weakly reachable strongly reachable. root strongly Reference.get() reachable

Reference Normal object object get() Reference.get() - returns a strong reference to its target or null if the GC has cleared. get() may make some objects that were weakly reachable strongly reachable. Race A O B Race A O B

• Collector clears weak reference A Race get() A O B

• Collector clears weak reference A Race get() A O B

• Collector clears weak reference A • Mutator makes O strongly reachable by creating a strong reference to the upstream from the OpenJDK mailing list

“I've been tuning a Java 7u51, Solaris 10, T4 system with 24G heap. My customer is not very happy with the remark pauses of up to 2 seconds.” Thomas Viessmann

“It looks like the application is using a lot of Reference objects. The time Pauses (sec) spent in remark is dominated by 2 reference processing.” Bengt Rutisson

1

blue: reference processing 0 Solutions

• Stop the world - Pauseless GC [Click et al., 2005], Staccato [McCloskey et al., 2008] • Stop all mutators and process references

• Lock - “On-the-fly” GC [Domani et al., 2000] • Block any mutator that calls get()

• On-the-fly - Metronome-TS [Auerbach et al., 2008] • Implementation technique is not public Global GC State

GC and mutator race in TRACING • GC want to finish tracing and then start clearing • Mutator force GC to do more work

by collector by collector (atomic) REPEAT by mutator (atomic)

get() start tracing

start tracing no tracing work NORMAL TRACING CLEARING

GC traces strongly reachable objects TRACING State

• GC traverses strong references • to colour strongly reachable objects black Write barrier • REPEAT • insertion barrier [Dijkstra] deletion barrier (a.k.a. snapshot) [Yuasa] • NORMAL TRACING CLEARING • Read barrier for Reference.get() TRACING State

• GC traverses strong references • to colour strongly reachable objects black Write barrier • REPEAT • insertion barrier [Dijkstra] deletion barrier (a.k.a. snapshot) [Yuasa] • NORMAL TRACING CLEARING • Read barrier for Reference.get() Insertion Barrier

INVARIANT: Root is grey (allows root to refer to white objects) GC repeat tracing until it finds root black • REPEAT • Reference.get() changes the state to REPEAT to notify the GC that the root may not be black NORMAL TRACING CLEARING

root get() Deletion Barrier

INVARIANT: Root is black --- root is never rescanned

Reference.get() colours target grey • REPEAT • Reference.get() changes the state to REPEAT to notify the GC that the root may not be black NORMAL TRACING CLEARING

root get() CLEARING

Once GC enters CLEARING state

Reference.get() returns null if its target is white • REPEAT • no more objects become strongly reachable • GC clears weak references whose targets are white NORMAL TRACING CLEARING

get() returns null Evaluation

• Jikes RVM • Sapphire on-the-fly copying collector • Trigger GC immediately after the previous GC completes • Configuration • Core i7-4770 (4-core, 3.4 GHz) • 1 GB heap • 2 collector threads Pause Time distribution

• Stop-the-world GC, or • Block mutator.get() with “lock”

100 100 100 STW STW STW lock ins lock ins lock ins lock del 10 lock del 10 lock del 10

1 1 1 avrora2009 lusearch2009 0.1 jython2006 0.1 0.1 0 6 12 18 24 30 0 6 12 18 24 30 0 6 12 18 24 30 Note 3ms logarithmic 100 100 100 buckets STW scale STW STW lock ins lock ins lock ins

Frequency (%) 10 lock del 10 lock del 10 lock del

1 1 1

0.1 pmd2009 0.1 sunflow2009 0.1 xalan2009 0 6 12 18 24 30 0 6 12 18 24 30 0 6 12 18 24 30 Pause times (ms) Pause Time distribution

100 STW lock ins 10 lock del

1

0.1 xalan2009 0 6 12 18 24 30 Pause time Reference Processing Phase Time

Mutators blocked Mutators running 100 100 STW OTF ins lock ins OTF del 10 10 lock del

1 1 avrora2009

0.1 0.1 0 6 12 18 24 30 0 6 12 18 24 30

100 100 STW OTF ins lock ins OTF del 10 10 lock del lusearch2009 1 1

0.1 0.1 0 30 60 90 120 150 180 210 240 270 300 0 30 60 90 120 150 180 210 240 270 300

100 100 STW OTF ins lock ins OTF del 10 lock del 10

1 1 xalan2009 0.1 0.1 0 6 12 18 24 30 0 6 12 18 24 30 Execution times Conclusion

• Reference types are frequently used in a significant number of programs. • On-the-fly GC must not ignore reference types. • Formalised the definition of reference types. • On-the-fly reference processing. • Model checked with SPIN. • Implemented in Jikes RVM. • On-the-fly reference processing phases are longer in the worst case, but with deletion barrier, not by much. • Overall execution time is not increased significantly by processing references on-the-fly, and is often reduced.

http://github.com/perlfu/sapphire Questions? Reference type usage

2 2 2 T = 3790ms T = 4480ms T = 3005ms

pmd2009 3 1 xalan2009 1 sunflow2009 1

0 0 0

2 2 T = 4139ms T = 2651ms x-axis: 1 lusearch2009 1 luindex2009 Normalised execution time 0 0

2 2 T = 13639 T = 5112ms ms

1 1 Calls to get() per msec x 10 Jython2006 avrora2009

0 0 Model Checking

• Model checked with SPIN • Correctness: appears to mutators to be processed by GC atomically • Terminates only with deletion barrier

inline getRef(i, ret) { r r r do::(refState == NORMAL || reference objects refState == REPEAT) -> o o o if::CLEARED[i] -> ret = NULL    normal objects ::else -> ret = i x fi; root break ::(refState == TRACING) -> Figure 7: The model if::(!CLEARED[i] && (mark[i] == WHITE)) -> CAS(refState, TRACING, REPEAT) /* continue */ ::(!CLEARED[i] && (mark[i] != WHITE)) -> ret = i; while(true) break int i =random.nextInt{ (5); ::else -> switch (i) ret = NULL; case 0: x{ = vr .get();break; 0 break case 1: x = vr .get();break; 1 fi case 2: x = vr .get();break; 2 ::(refState == CLEANING) -> case 3: if (x = null) x = x.next; break; if::(!CLEARED[i] && (mark[i] == WHITE)) -> case 4: x =null;break;! ret = NULL } ::(!CLEARED[i] && (mark[i] != WHITE)) -> } ret = i ::else -> ret = NULL Figure 8: Simple mutator fi; break od; d_step { /* d_step is an atomic action */ since o is weakly-reachable through w,themutatormayalsoseeo, getRef_arg = i; getRef_ret = ret contrary to P2.! Since bounded model checking does not deal with infinite state, }; we checked the properties for the limited model shown in Fig. 7. } This model has three pairs of reference and normal objects, namely Figure 9: Promela model of a Reference.get() method with an r0,r1,r2 for references and o0,o1,o2 for the corresponding nor- insertion barrier mal objects. These normal objects are linked in a list, but there are no other strong references to them. We assumed that all reference objects remain directly strongly reachable from the root and that the mutator can always call get() methods on them. However, we found that, with an insertion barrier, the mutator Fig. 8 shows the mutator’s pseudocode: vri is a local variable can continually prevent the collector from breaking out of the whose value is a reference object ri,andx is another local variable. The mutator repeatedly and arbitrarily calls a get() method to load termination loop, even if we assume weakly fair scheduling. The the referent to x,loadsthe‘next’objectofx,orclearsx.Sincewe reason for this is that, while the collector is tracing or checking focus on the behaviour of references, the mutator does not write to if the work queue is empty, a mutator has a chance to load a any object. Thus, our model does not have write barriers. white referent to a local variable x and then clear x.Themutator Fig. 9 shows the model of the get() method on the reference changes refState to REPEAT when it loads a reference with get(), thus forcing the collector to trace again. However, if the mutator object ri,foracollectorusinganinsertionbarrier.Thismodelis faithful to Fig. 4.Thereturnvalueispassedtothecallerthrough has cleared x,thecollectorwillnotfind,andhenceshade,anew the parameter ret . mark[i] and CLEARED[i] represent the colour of white referent: the number of white objects is not reduced and so no progress is made. Fortunately, the deletion barrier version does oi and whether ri has been cleared or not, respectively. When get() make progress, since get() shades white objects grey. returns oi,itsetsi to ret .InordertocheckP2,themodelalsoputs i and ret in global variables getRef_arg and getRef_ret. For the collector side, our model is faithful to the pseudocode in Fig. 3 and 5.Attheendofacycle,thecollectorreclaimswhite 7. Evaluation objects by calling reclaim():weintroduceafourthobjectstate We built our OTF reference processing framework in Jikes RVM RECLAIMED.Ourmodelofreclaim() reclaims white objects and and evaluated it with our new implementation of the Sapphire col- reverts the black objects to white. P1 and P2 can be interpreted as: lector [12], running DaCapo benchmarks that would run (10 from the 2006 and 6 from the 2009 suite). All measurements were per- P1 ((x = NULL)= (mark[x] = RECLAIMED)) ! ! ⇒ ! formed on a 4-core, 3.4 GHz Intel Core i7-4770 CPU running P2 (RETNULLi = (x = i)) (i =1, 2, 3) Ubuntu Linux 12.04.4. ! ⇒ ¬♦ where RETNULLi (getRef_arg= i) (getRef_ret= NULL). 7.1 Reference Type Usage We have model≡ checked these properties∧ with models both for collectors with an insertion barrier and a deletion barrier. We also To understand the behaviour of the benchmarks, we measured how tried to model check the termination property. often reference types were used. Fig. 10 shows the number of calls of a get() method per second in each 10 ms time window; the x- P3 (Termination) GC eventually terminates. axis is the normalised elapsed time of the program. inlineinline getRef(i, getRef(i, ret) ret) { { r r r r r r do::(refStatedo::(refState == == NORMAL NORMAL || || referencereference objects objects refStaterefState == == REPEAT) REPEAT) -> -> if::CLEARED[i]if::CLEARED[i] -> -> ret ret = NULL = NULL oo oo oo    normalnormal objects objects ::else::else -> -> ret ret = i = i x fi;fi; root breakbreak ::(refState::(refState == == TRACING) TRACING) -> -> Figure 7: The model Figure 7: The model if::(!CLEARED[i]if::(!CLEARED[i] && && (mark[i] (mark[i] == WHITE)) == WHITE)) -> -> CAS(refState,CAS(refState, TRACING, TRACING, REPEAT) REPEAT) /*/* continue continue */ */ ::(!CLEARED[i]::(!CLEARED[i] && && (mark[i] (mark[i] != WHITE)) != WHITE)) -> -> ret = i; while(true) ret = i; while(true) break int i =random.nextInt{ (5); break int i =random.nextInt{ (5); ::else -> switch (i) ::else -> switch (i) { ret = NULL; case 0: x{= vr 0.get();break; ret = NULL; case 0: x = vr 0.get();break; break case 1: x = vr 1.get();break; break case 1: x = vr 1.get();break; fi case 2: x = vr 2.get();break; fi case 2: x = vr .get();break; ::(refState == CLEANING) -> case 3: if (x =2 null) x = x.next; break; ::(refState == CLEANING) -> case 3: if (x ! = null) x = x.next; break; if::(!CLEARED[i] && (mark[i] == WHITE)) -> case 4: x =null;break; if::(!CLEARED[i] && (mark[i] == WHITE)) -> case 4: x =null;break;! ret = NULL } ::(!CLEARED[i]ret = NULL && (mark[i] != WHITE)) -> } } ret::(!CLEARED[i] = i && (mark[i] != WHITE)) -> } ::elseret -> = ret i = NULL Figure 8: Simple mutator fi; ::else -> ret = NULL Figure 8: Simple mutator breakfi; od; break d_stepod; { /* d_step is an atomic action */ since o is weakly-reachable through w,themutatormayalsoseeo, getRef_argd_step { /* = d_step i; is an atomic action */ getRef_ret = ret sincecontraryo is to weakly-reachableP2.! through w,themutatormayalsoseeo, getRef_arg = i; Since bounded model checking does not deal with infinite state, }; getRef_ret = ret contrary to P2.! } weSince checked bounded the properties model checking for the limited does not model deal shown with infinite in Fig. state,7. }; } weThis checked model has the three properties pairs of for reference the limited and normal model objects, shown innamely Fig. 7. Figure 9: Promela model of a Reference.get() method with an Thisr0,r1 model,r2 for has references three pairs and ofo reference0,o1,o2 for and the normal corresponding objects, namely nor- insertion barrier mal objects. These normal objects are linked in a list, but there are Figure 9: Promela model of a Reference.get() method with an r0,r1,r2 for references and o0,o1,o2 for the corresponding nor- insertion barrier malno other objects. strong These references normal to objects them. are We linked assumed in thata list, all but reference there are noobjects other remain strong directly references strongly to them. reachable We assumed from the that root all and reference that the mutator can always call get() methods on them. objects remain directly strongly reachable from the root and that However, we found that, with an insertion barrier, the mutator Fig. 8 shows the mutator’s pseudocode: vri is a local variable the mutator can always call get() methods on them. can continually prevent the collector from breaking out of the whose value is a reference object ri,andx is another local variable. However, we found that, with an insertion barrier, the mutator Fig. 8 shows the mutator’s pseudocode: vri is a local variable termination loop, even if we assume weakly fair scheduling. The The mutator repeatedly and arbitrarily calls a get() method to load can continually prevent the collector from breaking out of the whose value is a reference object ri,andx is another local variable. reason for this is that, while the collector is tracing or checking the referent to x,loadsthe‘next’objectofx,orclearsx.Sincewe termination loop, even if we assume weakly fair scheduling. The Thefocus mutator on the behaviour repeatedly of and references, arbitrarily the calls mutator a get does() method not write to load to if the work queue is empty, a mutator has a chance to load a the referent to x,loadsthe‘next’objectofx,orclearsx.Sincewe whitereason referent for this to a is local that, variable whilex theand collector then clear is tracingx.Themutator or checking any object. Thus, our model does not have write barriers. if the work queue is empty, a mutator has a chance to load a focusFig. on9 theshows behaviour the model of references, of the get the() method mutator on does the not reference write to changes refState to REPEAT when it loads a reference with get(), thuswhite forcing referent the collector to a local to variable trace again.x and However, then clear if thex.Themutator mutator anyobject object.ri,foracollectorusinganinsertionbarrier.Thismodelis Thus, our model does not have write barriers. changes refState to REPEAT when it loads a reference with get(), faithfulFig. to9 shows Fig. 4.Thereturnvalueispassedtothecallerthrough the model of the get() method on the reference has cleared x,thecollectorwillnotfind,andhenceshade,anew thus forcing the collector to trace again. However, if the mutator objectthe parameterri,foracollectorusinganinsertionbarrier.Thismodelisret . mark[i] and CLEARED[i] represent the colour of white referent: the number of white objects is not reduced and so nohas progress cleared is made.x,thecollectorwillnotfind,andhenceshade,anew Fortunately, the deletion barrier version does faithfuloi and whether to Fig.r4i .Thereturnvalueispassedtothecallerthroughhas been cleared or not, respectively. When get() makewhite progress, referent: since theget number() shades of white white objects objects is grey. not reduced and so thereturns parameteroi,itsetsreti.tomarkret .Inordertocheck[i] and CLEARED[iP2] represent,themodelalsoputs the colour of no progress is made. Fortunately, the deletion barrier version does oi iandandret whetherin globalri has variables been clearedgetRef_arg or not,and respectively.getRef_ret When. get() make progress, since get() shades white objects grey. returnsFor theoi,itsets collectorPropertiesi to side,ret .Inordertocheck our model is faithfulP2,themodelalsoputs to the pseudocode 7. Evaluation iinand Fig.ret3 andin global5.Attheendofacycle,thecollectorreclaimswhite variables getRef_arg and getRef_ret. objectsFor the by calling collectorreclaim side, our():weintroduceafourthobjectstate model is faithful to the pseudocode We built our OTF reference processing framework in Jikes RVM • inNoRECLAIMED Fig. dangling3 and .Ourmodelofpointer5.Attheendofacycle,thecollectorreclaimswhite is createdreclaim() reclaims white objects and and7. evaluated Evaluation it with our new implementation of the Sapphire col- •objectsrevertsIf a variable the by black calling is not objects reclaimnull, its to target white.():weintroduceafourthobjectstate hasP1 notand beenP2 can reclaimed be interpreted as: lectorWe [ built12], running our OTF DaCapo reference benchmarks processing that framework would run in (10 Jikes from RVM RECLAIMED.Ourmodelofreclaim() reclaims white objects and theand 2006 evaluated and 6 from it with the our 2009 new suite). implementation All measurements of the were Sapphire per- col- P1 !((x = NULL)= (mark[x] = RECLAIMED)) reverts the black! objects⇒ to white. P1! and P2 can be interpreted as: formedlector on [12 a], 4-core, running 3.4 DaCapo GHz Intel benchmarks Core i7-4770 that would CPU run running (10 from P2 (RETNULLi = (x = i)) (i =1, 2, 3) Ubuntu Linux 12.04.4. • Once get()! of a Reference⇒ returns¬♦ null, it will never the 2006 and 6 from the 2009 suite). All measurements were per- P1returns! its(( xtarget= NULL)= (mark[x] = RECLAIMED)) formed on a 4-core, 3.4 GHz Intel Core i7-4770 CPU running where RETNULL! i (getRef_arg⇒ = i) ! (getRef_ret= NULL). 7.1 Reference Type Usage P2 (RETNULL≡i = (x = i))∧ (i =1, 2, 3) Ubuntu Linux 12.04.4. We! have model checked⇒ ¬♦ these properties with models both for collectors with an insertion barrier and a deletion barrier. We also inlineTo getRef(i, understand ret) the behaviour { of the benchmarks, we measured how where RETNULLr i (getRef_argr r = i) (getRef_ret= NULL). do::(refStateoften reference == types NORMAL were || used. Fig. 10 shows the number of calls tried to model check≡ the termination property.∧ 7.1 Reference Type Usage We have model checked these propertiesreference with objects models both for ofrefState a get() method == REPEAT) per second -> in each 10 ms time window; the x- To understand the behaviour of the benchmarks, we measured how collectorsP3 (Termination) witho an GC insertiono eventually barriero terminates. and a deletion barrier. We also axisif::CLEARED[i] is the normalised -> elapsed ret = NULL time of the program. tried to model check the termination property.normal objects often::else reference types -> ret were = i used. Fig. 10 shows the number of calls x fi;of a get() method per second in each 10 ms time window; the x- root P3 (Termination) GC eventually terminates. breakaxis is the normalised elapsed time of the program. ::(refState == TRACING) -> Figure 7: The model if::(!CLEARED[i] && (mark[i] == WHITE)) -> CAS(refState, TRACING, REPEAT) /* continue */ ::(!CLEARED[i] && (mark[i] != WHITE)) -> ret = i; while(true) break int i =random.nextInt{ (5); ::else -> switch (i) ret = NULL; case 0: x{ = vr .get();break; 0 break case 1: x = vr .get();break; 1 fi case 2: x = vr .get();break; 2 ::(refState == CLEANING) -> case 3: if (x = null) x = x.next; break; if::(!CLEARED[i] && (mark[i] == WHITE)) -> case 4: x =null;break;! ret = NULL } ::(!CLEARED[i] && (mark[i] != WHITE)) -> } ret = i ::else -> ret = NULL Figure 8: Simple mutator fi; break od; d_step { /* d_step is an atomic action */ since o is weakly-reachable through w,themutatormayalsoseeo, getRef_arg = i; getRef_ret = ret contrary to P2.! Since bounded model checking does not deal with infinite state, }; we checked the properties for the limited model shown in Fig. 7. } This model has three pairs of reference and normal objects, namely Figure 9: Promela model of a Reference.get() method with an r0,r1,r2 for references and o0,o1,o2 for the corresponding nor- insertion barrier mal objects. These normal objects are linked in a list, but there are no other strong references to them. We assumed that all reference objects remain directly strongly reachable from the root and that the mutator can always call get() methods on them. However, we found that, with an insertion barrier, the mutator Fig. 8 shows the mutator’s pseudocode: vri is a local variable can continually prevent the collector from breaking out of the whose value is a reference object ri,andx is another local variable. The mutator repeatedly and arbitrarily calls a get() method to load termination loop, even if we assume weakly fair scheduling. The the referent to x,loadsthe‘next’objectofx,orclearsx.Sincewe reason for this is that, while the collector is tracing or checking focus on the behaviour of references, the mutator does not write to if the work queue is empty, a mutator has a chance to load a any object. Thus, our model does not have write barriers. white referent to a local variable x and then clear x.Themutator Fig. 9 shows the model of the get() method on the reference changes refState to REPEAT when it loads a reference with get(), thus forcing the collector to trace again. However, if the mutator object ri,foracollectorusinganinsertionbarrier.Thismodelis faithful to Fig. 4.Thereturnvalueispassedtothecallerthrough has cleared x,thecollectorwillnotfind,andhenceshade,anew the parameter ret . mark[i] and CLEARED[i] represent the colour of white referent: the number of white objects is not reduced and so no progress is made. Fortunately, the deletion barrier version does oi and whether ri has been cleared or not, respectively. When get() make progress, since get() shades white objects grey. returns oi,itsetsi to ret .InordertocheckP2,themodelalsoputs i and ret in global variables getRef_arg and getRef_ret. For the collector side, our model is faithful to the pseudocode in Fig. 3 and 5.Attheendofacycle,thecollectorreclaimswhite 7. Evaluation objects by calling reclaim():weintroduceafourthobjectstate We built our OTF reference processing framework in Jikes RVM RECLAIMED.Ourmodelofreclaim() reclaims white objects and and evaluated it with our new implementation of the Sapphire col- reverts the black objects to white. P1 and P2 can be interpreted as: lector [12], running DaCapo benchmarks that would run (10 from the 2006 and 6 from the 2009 suite). All measurements were per- P1 ((x = NULL)= (mark[x] = RECLAIMED)) ! ! ⇒ ! formed on a 4-core, 3.4 GHz Intel Core i7-4770 CPU running P2 (RETNULLi = (x = i)) (i =1, 2, 3) Ubuntu Linux 12.04.4. ! ⇒ ¬♦ where RETNULLi (getRef_arg= i) (getRef_ret= NULL). 7.1 Reference Type Usage We have model≡ checked these properties∧ with models both for collectors with an insertion barrier and a deletion barrier. We also To understand the behaviour of the benchmarks, we measured how tried to model check the termination property. often reference types were used. Fig. 10 shows the number of calls of a get() method per second in each 10 ms time window; the x- P3 (Termination) GC eventually terminates. axis is the normalised elapsed time of the program. Race

• Mutator makes an object strongly reachable • Collector clears weak reference

target object strongly not strongly reachable reachable get() not cleared GC weak reference cleared Answer

• Handshake ensure that GC changes to TRACING after the get() that change the state to REPEAT returned • Second handshake ensures all mutators acknowledge GC is in TRACING • CAS tells which thread won the race CAS=> REPEAT ack get() handshake-1 handshake-2 CAS=>fail

req → TRACING CAS=>CLEARING