coccinelle semantic patch language

Nicholas Mc Guire

March 5, 2019

History

Code refactoring in Linux
Started around 2006
First larger application presented in 2007 at OLS (SCSI driver cleanup)
2010: first public release, coccinelle-0.1
The original goal: automate the task of keeping device drivers up to date with the latest kernel interfaces as much as possible
Well established in the Linux kernel by now, with a large set of scripts in scripts/coccinelle
The goal now: automated evolution in the Linux kernel
See https://hal.archives-ouvertes.fr/hal-01022704/PDF/faults-in-linux-2.6-tocs.pdf to see the impact

Copyright OSADL eG, 2018, CC-BY-SA-4.0

Short profile

COCCINELLE/Semantic Patches (SmPL)
Maintainer: Julia Lawall, Gilles Muller, Nicolas Palix, Michal Marek
List: [email protected] (moderated for non-subscribers)
Web page: http://coccinelle.lip6.fr/
Tree/type: git git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild.git misc
Status: Supported
Files: scripts/coccinelle/, scripts/coccicheck

Repo: https://github.com/coccinelle/coccinelle
Coccinelle is used in a number of projects other than the Linux kernel

What is coccinelle/spatch

program matching and transformation engine
analysis at the semantic level, emits at the lexical level
part of a collaborative tools effort
maintainer: Julia Lawall

Note: most examples here are from Gilles Muller and Julia Lawall.

What is coccinelle

Coccinelle is a tool to automatically analyze and rewrite code.
Beyond string processing: Coccinelle is aware of the C language structure
Beyond static code checking: Coccinelle can generate code to actually fix the bugs it finds
Has its own C parser to retain spacing and comments
Mechanism
  The semantic patch is transformed to CTL (computation tree logic)
  The code is transformed to a CFG (control-flow graph)
  Model checking is used to check the validity of the CTL on the CFG
Supports isomorphisms to handle equivalences (e.g. !var, var == NULL)
Embedded Python scripting for postprocessing and reporting

Supporting evolution background

evolutions in library APIs
  generate changes needed in client code
extended to the Linux kernel
  maintenance of kernel API users
  maintenance of out-of-tree code

Manual maintenance of API changes is too error prone and too expensive.

Coccinelle in the Linux kernel

number of patches accepted: 1000++
location: scripts/coccinelle/*
integrated in DLC: make coccicheck
eliminates a lot of legacy constructs, notably for out-of-tree code
potential for regression testing

Capabilities

context sensitive:
  renaming of function
  adding of function arguments
  reorganizing a data structure
bug hunting:
  finding bugs
  proposing fixes for bug patterns
analysis:
  locating semantic patterns in a large code base

Simple example - switching to a helper function

@@
struct resource *res;
@@
- (res->end - res->start) + 1
+ resource_size(res)

@@
struct resource *res;
@@
- res->end - res->start
+ BAD(resource_size(res))
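The second rule marks open-coded sizes that forget the + 1. The arithmetic can be checked with a minimal Python model, assuming that start and end are both inclusive, as the first rule implies. The function is an illustrative stand-in rather than kernel code, and the addresses are made up.

```python
def resource_size(start, end):
    """Number of addresses in the inclusive range [start, end]."""
    return end - start + 1


# Hypothetical 4 KiB region: 0x1000 up to and including 0x1fff.
start, end = 0x1000, 0x1FFF

correct = resource_size(start, end)   # what the first rule rewrites to
off_by_one = end - start              # what the second rule marks with BAD()

print(hex(correct), hex(off_by_one))
```

The open-coded difference comes out one short of the real size, which is why the second rule wraps such uses in BAD() for manual review instead of rewriting them silently.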

Simple example - switching to macro helper

@haskernel@
@@
#include

@depends on haskernel@
type t;
identifier f;
@@
- (sizeof(((t*)0)->f))
+ FIELD_SIZEOF(t, f)

API transformation

@@
struct device dev;
expression E;
type T;
@@
- dev.driver_data = (T)E
+ dev_set_drvdata(&dev, E)

@@
struct device *dev;
expression E;
type T;
@@
- dev->driver_data = (T)E
+ dev_set_drvdata(dev, E)

Refactoring with coccinelle

@@
@@
- kcalloc(1,
+ kzalloc(
  ...)

Looks trivial, but try to make this change in 16.2M LoC manually... could you guarantee that you handled every instance in just 25k LoC?
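To see why a plain text search gives no such guarantee, here is a small sketch of a line-based regular expression looking for the same call. The sample source and the helper name naive_hits are invented for illustration. The sketch shows only what a textual match sees, not how Coccinelle works internally.

```python
import re

# Naive textual pattern for the call the semantic patch rewrites.
NAIVE = re.compile(r"kcalloc\(1,")

# Hypothetical C fragment: one plain call, one call with different
# spacing split over two lines, and one mention inside a comment.
SAMPLE = """\
p = kcalloc(1, sizeof(*p), GFP_KERNEL);
q = kcalloc( 1 ,
            sizeof(*q), GFP_KERNEL);
/* kcalloc(1, n, flags) used to be here */
"""


def naive_hits(text):
    """Return the 1-based line numbers on which the regex matches."""
    return [n for n, line in enumerate(text.splitlines(), 1)
            if NAIVE.search(line)]


print(naive_hits(SAMPLE))
```

The regex misses the call on line 2 because of its spacing and reports the comment on line 4. The semantic patch matches on C structure, so layout does not matter to it and comments are not treated as code.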

Bug hunting and elimination

Often "high-level" faults and misunderstandings that will repeat
Testing for such errors is not efficient, actually infeasible
Code base far too large for manual review and exhaustive search
Easy to automate -> regression testing

Semantic encoding of bugs allows searching for "related bugs", and writing semantic scanners often helps clarify the problem!

Bugs are (often) patterns

checking at syntax level
  coding styles/checkpatch.pl
  gcc
checking at context level
  sparse
checking at semantic pattern level
  coccinelle

Scanning with spatch

@@
expression lock1,lock2;
expression flags;
@@
*spin_lock_irqsave(lock1,flags)
... when != flags
*spin_lock_irqsave(lock2,flags)

Search for cases where an inner spin_lock_irqsave() saves into the same flags variable as the outer one - report only.

Scanning with spatch - false positive

diff -u -p linux-3.12.9/drivers/net/ethernet/natsemi/ns83820.c /tmp/nothing/drivers/net/ethernet/natsemi/ns83820.c
--- linux-3.12.9/drivers/net/ethernet/natsemi/ns83820.c
+++ /tmp/nothing/drivers/net/ethernet/natsemi/ns83820.c
@@ -562,7 +562,6 @@ static inline int rx_refill(struct net_d
        dprintk("rx_refill(%p)\n", ndev);
        if (gfp == GFP_ATOMIC)
-               spin_lock_irqsave(&dev->rx_info.lock, flags);
        for (i=0; idata - PTR_ALIGN(skb->data, 16));
                if (gfp != GFP_ATOMIC)
-                       spin_lock_irqsave(&dev->rx_info.lock, flags);
                res = ns83820_add_rx_skb(dev, skb);
                if (gfp != GFP_ATOMIC)
                        spin_unlock_irqrestore(&dev->rx_info.lock, flags);

False positive: the two calls are guarded by mutually exclusive conditions (gfp == GFP_ATOMIC and gfp != GFP_ATOMIC), so they never both execute in the same call. The scanner might be too generic!

Scanning with spatch - code

>>>>    unsigned long flags = 0;

        if (unlikely(nr_rx_empty(dev) <= 2))
                return 0;

        dprintk("rx_refill(%p)\n", ndev);
        if (gfp == GFP_ATOMIC)
>>>>            spin_lock_irqsave(&dev->rx_info.lock, flags);
        for (i=0; i
                /* extra 16 bytes for alignment */
                skb = __netdev_alloc_skb(ndev, REAL_RX_BUF_SIZE+16, gfp);
                if (unlikely(!skb))
                        break;
                skb_reserve(skb, skb->data - PTR_ALIGN(skb->data, 16));
                if (gfp != GFP_ATOMIC)
>>>>                    spin_lock_irqsave(&dev->rx_info.lock, flags);
                res = ns83820_add_rx_skb(dev, skb);
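The rx_refill() report can be reproduced with a toy path search over a hand-built control-flow graph. This is a simplified sketch and not Coccinelle's CTL engine. The node names, the three dictionaries and the function double_saves are all hypothetical, and the graph is a simplified reading of the excerpt above.

```python
from collections import deque


def double_saves(succ, irqsave, uses):
    """Find pairs (a, b) where both nodes save into the same flags variable
    and b is reachable from a without an intermediate node using it."""
    found = []
    for a, flags in irqsave.items():
        seen = set()
        queue = deque(succ.get(a, []))
        while queue:
            n = queue.popleft()
            if n in seen:
                continue
            seen.add(n)
            if irqsave.get(n) == flags:
                found.append((a, n))        # second save reached: report it
            elif flags in uses.get(n, set()):
                continue                    # flags used first: path rejected
            else:
                queue.extend(succ.get(n, []))
    return sorted(found)


# Hypothetical CFG modelled on rx_refill(); branch conditions are NOT modelled.
succ = {
    "if_atomic": ["lock_outer", "loop"],
    "lock_outer": ["loop"],
    "loop": ["if_not_atomic"],
    "if_not_atomic": ["lock_inner", "add_skb"],
    "lock_inner": ["add_skb"],
    "add_skb": ["unlock"],
    "unlock": ["loop"],
}
irqsave = {"lock_outer": "flags", "lock_inner": "flags"}
uses = {"unlock": {"flags"}}   # spin_unlock_irqrestore() reads flags

print(double_saves(succ, irqsave, uses))
```

The search reports the pair (lock_outer, lock_inner) because a path exists in the graph, even though the two guarding conditions exclude each other at run time. A scanner that matches on control-flow paths alone will report such a case.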

bug - race with very small race window

static void advance_transaction(struct acpi_ec *ec, u8 status)
{
        unsigned long flags;
        struct transaction *t = ec->curr;
>>>>>>  spin_lock_irqsave(&ec->lock, flags);
        if (!t)
                goto unlock;
        ...

The race window is one line of code here: ec->curr is read before ec->lock is taken.

The fix

 static void advance_transaction(struct acpi_ec *ec, u8 status)
 {
        unsigned long flags;
-       struct transaction *t = ec->curr;
+       struct transaction *t;
        spin_lock_irqsave(&ec->lock, flags);
+       t = ec->curr;
        if (!t)

But this could be a "serial" bug, repeated elsewhere....
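The difference between the two versions can be played through deterministically in a small Python model. Everything here is an assumption made for illustration: the Ec class, the start_transaction writer and the interfere callback that stands in for a concurrent CPU. It is not the ACPI EC driver.

```python
import threading


class Ec:
    """Stand-in for struct acpi_ec: a lock and the current transaction."""

    def __init__(self):
        self.lock = threading.Lock()
        self.curr = None


def start_transaction(ec):
    """Hypothetical writer that installs a transaction under the lock."""
    with ec.lock:
        ec.curr = "txn"


def advance_racy(ec, interfere):
    t = ec.curr            # read before taking the lock
    interfere()            # a concurrent writer runs inside the window
    with ec.lock:
        return t           # the decision is made on a possibly stale value


def advance_fixed(ec, interfere):
    interfere()            # same interference at the same point
    with ec.lock:
        t = ec.curr        # read under the lock
        return t


ec_a = Ec()
stale = advance_racy(ec_a, lambda: start_transaction(ec_a))

ec_b = Ec()
fresh = advance_fixed(ec_b, lambda: start_transaction(ec_b))

print(stale, fresh)
```

In the racy variant the function still sees no transaction although one was installed before the lock was taken. Moving the read under the lock closes the window.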

Bug pattern scanner

@assign@
expression s,var;
position p1,p2,p3;
identifier func;
statement S1;
identifier member;
@@
func@p1(...){
...
var = s->member@p2
...
(
spin_lock_irqsave@p3(&s->lock,...);
|
spin_lock@p3(&s->lock,...);
)
if (!var)
S1
...
}

@script:python@
p1 << assign.p1;
p2 << assign.p2;
p3 << assign.p3;
fn << assign.func;
@@
print("%s:%s possible assign without lock at lines %s (related ? lock at line %s)" % (p1[0].file,fn,p1[0].line,p3[0].line))

Result

linux-stable-rt4-clean/drivers/acpi/ec.c:advance_transaction possible assign without lock at lines 175 (related ? lock at line 180)

This was a "one-off", but now we can do regression testing at the semantic level - it will pop up again....
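One way to turn such reports into a regression check is to compare each run against a list of findings that have already been reviewed. The sketch below assumes the message format shown above. The helper names and the idea of keying on file:function, so that shifting line numbers do not matter, are illustrative choices and not part of Coccinelle.

```python
def finding_key(line):
    """Keep only 'file:function', dropping the line numbers."""
    return line.split(" possible assign", 1)[0]


def new_findings(current, baseline):
    """Return report lines whose file:function key is not in the baseline."""
    known = {finding_key(line) for line in baseline}
    return [line for line in current if finding_key(line) not in known]


baseline = []   # nothing reviewed yet
current = [
    "linux-stable-rt4-clean/drivers/acpi/ec.c:advance_transaction "
    "possible assign without lock at lines 175 (related ? lock at line 180)",
]

for line in new_findings(current, baseline):
    print("NEW:", line)
```

Once a finding has been fixed or accepted, it is added to the baseline, and only patterns that reappear in new places are reported in later runs.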

bug - overlooked API dependency

A proposed patch for spin_lock_bh/spin_unlock_bh caused a crash in rt because the API can be used in an unbalanced but equivalent form:

balanced:
  spin_lock_bh/spin_unlock_bh
unbalanced:
  spin_lock_bh/spin_unlock/local_bh_enable
  local_bh_disable/spin_lock/spin_unlock_bh
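Why the unbalanced forms count as equivalent can be shown with a toy state model. It assumes that spin_lock_bh() behaves like local_bh_disable() followed by spin_lock(), and that spin_unlock_bh() behaves like spin_unlock() followed by local_bh_enable(). That decomposition is a simplifying assumption for this sketch, and the Cpu class and run helper are hypothetical.

```python
class Cpu:
    """Toy model: a bottom-half disable counter plus a set of held locks."""

    def __init__(self):
        self.bh_disabled = 0
        self.held = set()

    def local_bh_disable(self):
        self.bh_disabled += 1

    def local_bh_enable(self):
        assert self.bh_disabled > 0, "unbalanced local_bh_enable"
        self.bh_disabled -= 1

    def spin_lock(self, lock):
        assert lock not in self.held, "lock already held"
        self.held.add(lock)

    def spin_unlock(self, lock):
        self.held.remove(lock)

    def spin_lock_bh(self, lock):
        self.local_bh_disable()   # assumed decomposition
        self.spin_lock(lock)

    def spin_unlock_bh(self, lock):
        self.spin_unlock(lock)    # assumed decomposition
        self.local_bh_enable()


def run(ops):
    """Apply (method name, args) pairs to a fresh Cpu; return its end state."""
    cpu = Cpu()
    for name, args in ops:
        getattr(cpu, name)(*args)
    return cpu.bh_disabled, frozenset(cpu.held)


balanced = [("spin_lock_bh", ("l",)), ("spin_unlock_bh", ("l",))]
unbalanced_1 = [("spin_lock_bh", ("l",)), ("spin_unlock", ("l",)),
                ("local_bh_enable", ())]
unbalanced_2 = [("local_bh_disable", ()), ("spin_lock", ("l",)),
                ("spin_unlock_bh", ("l",))]

print(run(balanced) == run(unbalanced_1) == run(unbalanced_2))
```

In this model all three sequences end with bottom halves enabled and no lock held. A patch that assumes only the balanced pairing therefore has to look for the mixed forms as well.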

API pattern scanner

@r1@
identifier f;
expression E;
position p1,p2,p3;
@@
f(...) {
<...
(
spin_lock_bh(E)@p1;
...
spin_unlock(E)@p2;
...
local_bh_enable()@p3;
|
local_bh_disable()@p2;
...
spin_lock(E)@p3;
...
spin_unlock_bh(E)@p1;
)
...>
}

@script:python@
p1 << r1.p1;
p2 << r1.p2;
p3 << r1.p3;
@@
print("file:%s at lines %s %s %s" % (p1[0].file,p1[0].line, p2[0].line, p3[0].line))

One of the cases found

in linux-3.12.9/net/core/sock.c

        ...
        spin_lock_bh(&sk->sk_lock.slock);

        if (!sk->sk_lock.owned)
                /*
                 * Note : We must disable BH
                 */
                return false;

        __lock_sock(sk);
        sk->sk_lock.owned = 1;
        spin_unlock(&sk->sk_lock.slock);
        /*
         * The sk_lock has mutex_lock() semantics here:
         */
        mutex_acquire(&sk->sk_lock.dep_map, 0, 0, _RET_IP_);
        local_bh_enable();
        ...

Run example/results

root@debian:/usr/src# spatch --sp-file spin_lock_bh2.cocci \
    --dir linux-3.12.9
...
HANDLING: linux-3.12.9/security/tomoyo/file.c
HANDLING: linux-3.12.9/security/tomoyo/securityfs_if.c
HANDLING: linux-3.12.9/security/tomoyo/gc.c
...
file:linux-3.12.9/net/core/sock.c at lines 2383 2393 2398
file:linux-3.12.9/net/core/sock.c at lines 2336 2340 2345
file:linux-3.12.9/net/ipv4/inet_hashtables.c at lines 571 577 581

As it only applies in these three cases, one could now consider rebalancing the API or adjusting the proposed patch...
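Report lines in this format are easy to post-process outside of spatch. The following sketch groups them per file. It assumes the exact "file:... at lines ..." layout printed by the script rule above, and the names REPORT and group_reports are invented for this example.

```python
import re
from collections import defaultdict

# Matches the report lines printed by the script rule; HANDLING lines
# and anything else are ignored.
REPORT = re.compile(r"^file:(?P<path>\S+) at lines (?P<lines>\d+(?: \d+)*)$")

SAMPLE = """\
HANDLING: linux-3.12.9/security/tomoyo/gc.c
file:linux-3.12.9/net/core/sock.c at lines 2383 2393 2398
file:linux-3.12.9/net/core/sock.c at lines 2336 2340 2345
file:linux-3.12.9/net/ipv4/inet_hashtables.c at lines 571 577 581
"""


def group_reports(output):
    """Map each reported file to a list of line-number tuples."""
    by_file = defaultdict(list)
    for line in output.splitlines():
        m = REPORT.match(line.strip())
        if m:
            numbers = tuple(int(n) for n in m.group("lines").split())
            by_file[m.group("path")].append(numbers)
    return dict(by_file)


for path, hits in group_reports(SAMPLE).items():
    print(path, hits)
```

Grouping by file makes it quick to see that the three findings sit in only two source files, which helps when deciding between rebalancing the API and adjusting the patch.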

Conclusion

describing bugs as "semantic artifacts" helps understanding
systematic scanning of the entire kernel for semantic patterns is feasible
coccinelle is a versatile tool for scanning and refactoring
it's not that hard to use, notably because there is a lot of help
  URL: http://coccinelle.lip6.fr/
  mailing list: https://systeme.lip6.fr/mailman/listinfo/cocci
available as a package for many distributions