coccinelle semantic patch language
Nicholas Mc Guire
March 5, 2019

History

Code refactoring in Linux
Started around 2006
First larger application presented in 2007 at OLS (SCSI driver cleanup)
2010: first public release of the source code, coccinelle-0.1
The original goal - automate the task of keeping device [...] as possible
Well established in the Linux kernel by now, with a large set of scripts in scripts/coccinelle
The goal now - automated evolution in the Linux kernel
See https://hal.archives-ouvertes.fr/hal-01022704/PDF/faults-in-linux-2.6-tocs.pdf for the impact
Copyright OSADL eG, 2018, CC-BY-SA-4.0

Short profile

COCCINELLE/Semantic Patches (SmPL)
Maintainer: Julia Lawall
Repo: https://github.com/coccinelle/coccinelle
Coccinelle is used in a number of projects other than the Linux kernel
What is coccinelle/spatch

program matching and transformation engine
analysis at the semantic level, emits at the lexical level
Note: most examples here are from Gilles Muller and Julia Lawall.
What is coccinelle

Coccinelle is a tool to automatically analyze and rewrite C code.
Beyond string processing - Coccinelle is aware of the C language structure
Beyond static code-checking - Coccinelle can generate code to actually fix the bugs it finds
Has its own C parser to retain spacing and comments

Mechanism
  The semantic patch is transformed to CTL (computation tree logic)
  The code is transformed to a CFG (control-flow graph)
  Model checking is used to check the validity of the CTL on the CFG
  Supports isomorphisms to handle equivalences (e.g. !var, var == NULL)
  Embedded Python scripting for postprocessing and reporting

Supporting evolution background

evolutions in library APIs
  generate changes needed in client code
extended to the Linux kernel
  maintenance of kernel API users
  maintenance of out-of-tree code
Manual maintenance of API changes is too error-prone and too expensive.
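The isomorphism support listed under Mechanism can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the function names are hypothetical, it works on plain strings rather than parsed C, and it handles just the two NULL-test spellings named on the slide plus the reversed comparison (added here as an assumption). Coccinelle itself applies isomorphisms to parsed code, not to text.

```python
import re

# Toy canonicalisation of NULL tests. This is a string-level stand-in for
# the idea only; it is not how Coccinelle implements isomorphisms.
_IDENT = r"[A-Za-z_]\w*"
_NULL_TEST_FORMS = [
    re.compile(rf"^!\s*({_IDENT})$"),            # !var
    re.compile(rf"^({_IDENT})\s*==\s*NULL$"),    # var == NULL
    re.compile(rf"^NULL\s*==\s*({_IDENT})$"),    # NULL == var (assumed extra form)
]


def canonical_null_test(condition):
    """Return 'var == NULL' for any recognised spelling, else the stripped input."""
    text = condition.strip()
    for form in _NULL_TEST_FORMS:
        match = form.match(text)
        if match:
            return f"{match.group(1)} == NULL"
    return text


def same_test(a, b):
    """True if both conditions reduce to the same canonical form."""
    return canonical_null_test(a) == canonical_null_test(b)
```

With this, a pattern written once as `!ptr` would also be compared successfully against code spelled `ptr == NULL`, which is the convenience an isomorphism provides to the author of a semantic patch.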
Coccinelle in the Linux kernel

number of patches accepted: more than 1000
location: scripts/coccinelle/*
Capabilities

context sensitive:
  renaming of functions
  adding of function arguments
  reorganizing a data structure
bug hunting:
  finding bugs
  proposing fixes for bug patterns
analysis:
  locating semantic patterns in a large code base
Simple example - switching to a helper function

    @@
    struct resource *res;
    @@
    - (res->end - res->start) + 1
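The line that replaces the removed expression is not shown here. The kernel does provide a resource_size() helper for struct resource, and the sketch below assumes that is the helper meant. It uses a stand-in Resource class (hypothetical, Python rather than C) purely to show why the open-coded form carries a `+ 1`: both bounds of a resource are inclusive.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Stand-in for the kernel's struct resource; both bounds are inclusive."""
    start: int
    end: int


def open_coded_size(res):
    # The expression the semantic patch removes.
    return (res.end - res.start) + 1


def resource_size(res):
    # Same arithmetic behind one name, as a helper function would offer.
    return res.end - res.start + 1
```

A region from 0x1000 to 0x1fff therefore has size 0x1000, not 0xfff. Routing every caller through one helper removes the chance of forgetting the `+ 1` at an individual call site.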
Simple example - switching to macro helper

    @haskernel@
    @@

    #include

    - (sizeof(((t*)0)->f))
    + FIELD_SIZEOF(t, f)
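FIELD_SIZEOF(t, f) yields the size of member f from the type t alone, which is what the open-coded sizeof(((t*)0)->f) achieves with a null-pointer cast. As a rough analogy in Python, ctypes exposes the same information on the structure class without creating an instance. The Packet layout and the field_sizeof name are hypothetical and exist only for this sketch.

```python
import ctypes


class Packet(ctypes.Structure):
    # Hypothetical layout, used only for illustration.
    _fields_ = [
        ("id", ctypes.c_uint32),
        ("payload", ctypes.c_char * 16),
    ]


def field_sizeof(struct_type, field_name):
    """Size in bytes of a member, read from the type alone (no instance needed)."""
    return getattr(struct_type, field_name).size
```

Calling field_sizeof(Packet, "payload") reads the field descriptor on the class, much as the C macro only inspects the type of the member and never dereferences the null pointer.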
API transformation

    @@
    struct device dev;
    expression E;
    type T;
    @@
    - dev.driver_data = (T)E

    @@
    struct device *dev;
    expression E;
    type T;
    @@
    - dev->driver_data = (T)E
    + dev_set_drvdata(dev, E)
Refactoring with coccinelle

    @@
    @@
    - kcalloc(1,

Looks trivial - but try to make this change manually in 16.2M LoC... could you guarantee that you handled every instance in just 25k LoC?
Bug hunting and elimination

Often "high-level" faults and misunderstandings that will repeat
Testing for such errors is not efficient, actually infeasible
Easy to automate -> regression testing
Semantic encoding of bugs allows searching for "related bugs", and writing semantic scanners often helps clarify the problem!
Bugs are (often) patterns

checking at syntax level
  coding styles/checkpatch.pl
  gcc
checking at context level
  sparse
checking at semantic pattern level
  coccinelle
Scanning with spatch

    @@
    expression lock1,lock2;
    expression flags;
    @@

    *spin_lock_irqsave(lock1,flags)
    ... when != flags
    *spin_lock_irqsave(lock2,flags)

Search for cases where an inner lock saves its flags into the variable already used by the outer lock - report only.
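To make the pattern concrete, here is a toy Python sketch of the underlying idea: walk a control-flow graph from one spin_lock_irqsave call and report a second call that saves into the same flags variable when some path between them never mentions that variable. Everything here is an assumption made for illustration (the CFG contents, the function names, the regex-based statement matching); Coccinelle's real matching works via CTL model checking on a CFG built by its own C parser.

```python
import re

# Toy control-flow graph: node id -> (statement text, successor ids).
# The statements are hypothetical and only serve to exercise the search.
CFG = {
    0: ("spin_lock_irqsave(&a->lock, flags);", [1]),
    1: ("if (a->busy)", [2, 3]),
    2: ("spin_lock_irqsave(&b->lock, flags);", [4]),
    3: ("a->count++;", [4]),
    4: ("spin_unlock_irqrestore(&a->lock, flags);", []),
}

LOCK_CALL = re.compile(r"spin_lock_irqsave\((.+),\s*(\w+)\)")


def lock_flags(stmt):
    """Return the flags variable if stmt is a spin_lock_irqsave call, else None."""
    match = LOCK_CALL.search(stmt)
    return match.group(2) if match else None


def mentions(stmt, name):
    """True if the identifier name appears as a whole word in stmt."""
    return re.search(rf"\b{re.escape(name)}\b", stmt) is not None


def find_nested_saves(cfg):
    """Pairs (outer, inner) where inner reuses outer's flags variable and
    some path between the two never mentions that variable."""
    hits = []
    for outer, (stmt, succs) in cfg.items():
        flags = lock_flags(stmt)
        if flags is None:
            continue
        seen, stack = set(), list(succs)
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            text, nexts = cfg[node]
            if lock_flags(text) == flags:
                hits.append((outer, node))   # second save into the same variable
            elif not mentions(text, flags):
                stack.extend(nexts)          # flags untouched, keep walking
            # otherwise flags is used here, so this path is abandoned
    return sorted(hits)
```

On the toy graph, find_nested_saves(CFG) reports the pair (0, 2). Note that the search follows graph edges only and never evaluates branch conditions, so two calls guarded by contradictory conditions would still be reported.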
Scanning with spatch - false positive

    diff -u -p linux-3.12.9/drivers/net/ethernet/natsemi/ns83820.c /tmp/nothing/drivers/net/ethernet/natsemi/ns83820.c
    --- linux-3.12.9/drivers/net/ethernet/natsemi/ns83820.c
    +++ /tmp/nothing/drivers/net/ethernet/natsemi/ns83820.c
    @@ -562,7 +562,6 @@ static inline int rx_refill(struct net_d
     dprintk("rx_refill(%p)\n", ndev);
     if (gfp == GFP_ATOMIC)

False positive - the scanner might be too generic!
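One plausible reading of why this hit is a false positive: in the flagged rx_refill code, the first spin_lock_irqsave is guarded by `gfp == GFP_ATOMIC` and the second by `gfp != GFP_ATOMIC`, so a single pass through those lines never executes both. The Python sketch below models only those two conditions; the function name and the string labels are hypothetical, and the rest of rx_refill is deliberately left out.

```python
def lock_sites_executed(gfp_is_atomic):
    """Which of the two flagged spin_lock_irqsave sites run for a given gfp.

    Mirrors only the two if-conditions visible in the flagged code;
    everything else in the function is omitted.
    """
    executed = []
    if gfp_is_atomic:          # if (gfp == GFP_ATOMIC)
        executed.append("first")
    if not gfp_is_atomic:      # if (gfp != GFP_ATOMIC)
        executed.append("second")
    return executed
```

For either value of the argument exactly one site is returned. A scanner that follows control-flow edges without reasoning about the values tested in the conditions cannot see this, which is one way a pattern ends up too generic.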
Scanning with spatch - code

    >>>> unsigned long flags = 0;
         if (unlikely(nr_rx_empty(dev) <= 2))
                 return 0;

         dprintk("rx_refill(%p)\n", ndev);
         if (gfp == GFP_ATOMIC)
    >>>>         spin_lock_irqsave(&dev->rx_info.lock, flags);
         /* extra 16 bytes for alignment */
         skb = __netdev_alloc_skb(ndev, REAL_RX_BUF_SIZE+16, gfp);
         if (unlikely(!skb))
                 break;
         skb_reserve(skb, skb->data - PTR_ALIGN(skb->data, 16));
         if (gfp != GFP_ATOMIC)
    >>>>         spin_lock_irqsave(&dev->rx_info.lock, flags);
         res = ns83820_add_rx_skb(dev, skb);

bug - race with very small race window

    static void advance_transaction(struct acpi_ec *ec, u8 status)
    {
            unsigned long flags;
            struct transaction *t = ec->curr;

The fix

    static void advance_transaction(struct acpi_ec *ec, u8 status)
    {
            unsigned long flags;
    -       struct transaction *t = ec->curr;

Bug pattern scanner

    @assign@
    expression s,var;
    position p1,p2,p3;
    identifier func;
    statement S1;
    identifier member;
    @@
    func@p1(...){
    ...
    var = s->member@p2
    ...

    @script:python@
    p1 << assign.p1;
    p2 << assign.p2;
    p3 << assign.p3;
    fn << assign.func;
    @@
    print("%s:%s possible assign without lock at lines %s (related ? lock at line %s)" % (p1[0].file,fn,p1[0].line,p3[0].line))

Result

    linux-stable-rt4-clean/drivers/acpi/ec.c:advance_transaction

bug - overlooked API dependency

A proposed patch for spin_lock_bh/spin_unlock_bh caused a crash in RT because the API can be used in an unbalanced but equivalent form:

API pattern scanner

    @r1@
    identifier f;
    expression E;
    position p1,p2,p3;
    @@
    f(...) {
    <...
    (
    spin_lock_bh(E)@p1;
    ...
    spin_unlock(E)@p2;

One of the cases found

in linux-3.12.9/net/core/sock.c

    ...
    spin_lock_bh(&sk->sk_lock.slock);
    if (!sk->sk_lock.owned)
            /*
             * Note : We must disable BH
    __lock_sock(sk);
    sk->sk_lock.owned = 1;
    spin_unlock(&sk->sk_lock.slock);
    /*
     * The sk_lock has mutex_lock() semantics here:
     */
    mutex_acquire(&sk->sk_lock.dep_map, 0, 0, _RET_IP_);
    local_bh_enable();
    ...

Run example/results

    root@debian:/usr/src# spatch --sp-file spin_lock_bh2.cocci \
        --dir linux-3.12.9
    ...
    HANDLING: linux-3.12.9/security/tomoyo/file.c

Conclusion

describing bugs as "semantic artifacts" helps understanding
systematic scanning of the entire kernel for semantic patterns is feasible
available as a package for many distributions