Shannon and Group Testing: Finding Needles in Haystacks
Shannon and group testing: Finding needles in haystacks
Oliver Johnson
Twitter: @BristOliver
School of Mathematics, University of Bristol
Inaugural Lecture, 1st November 2018

Outline of the talk
1 Catering conundrum
2 How did I get here?
3 Shannon and Information theory
4 Group testing

Section 1: Toxic talk treat teaser
Professor J is giving his inaugural lecture at Blistor University.
7 plates of delicious post-lecture snacks.
Professor J's evil nemesis, Dr X, has poisoned one of them.
Whoever eats that snack will fall asleep for 24 hours.
How to find the poisoned food, as efficiently as possible?
Can pay any PhD student $10 to eat what we tell them.

How to solve the mystery?
One idea: pay 7 PhD students to eat one snack each.
One will fall asleep. Will cost us $70.
Better idea: use the following strategy (only costs $30).

        Olives  Nuts  Breadsticks  Crisps  Dip  Cheese straws  Jelly
PhD 1     ✓      ×        ✓          ×      ✓         ×          ✓
PhD 2     ✓      ✓        ×          ×      ✓         ✓          ×
PhD 3     ✓      ✓        ✓          ✓      ×         ×          ×

(✓ = student eats the snack, × = student does not.)

Outcome
Solution: breadsticks were poisoned.
This strategy would always work.
But what if more than one snack poisoned?

More interesting problems
What if you had 500 snacks, with 10 poisoned?
How many students would we need?
What should we get them to eat?
How should we find the poisoned snacks?
What if some students are immune to poison ... or fall asleep anyway?
This is group testing.
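
As an illustration that is not part of the lecture, the three-student strategy in the table above can be sketched as a lookup. Each snack is eaten by a different non-empty subset of the students, so the pattern of who falls asleep identifies the snack. The names `SNACKS`, `EATS` and `find_poisoned` are hypothetical, and the sketch assumes exactly one snack is poisoned.

```python
# Hypothetical sketch of the 7-snack, 3-student strategy (names are illustrative).
SNACKS = ["Olives", "Nuts", "Breadsticks", "Crisps", "Dip", "Cheese straws", "Jelly"]

# EATS[s][j] == 1 means PhD student s+1 eats snack j, as in the strategy table.
EATS = [
    [1, 0, 1, 0, 1, 0, 1],  # PhD 1
    [1, 1, 0, 0, 1, 1, 0],  # PhD 2
    [1, 1, 1, 1, 0, 0, 0],  # PhD 3
]


def find_poisoned(asleep):
    """Return the snack whose column in EATS matches who fell asleep.

    `asleep` is a sequence of three booleans, one per student.
    Assumes exactly one snack is poisoned.
    """
    pattern = tuple(int(a) for a in asleep)
    matches = [
        SNACKS[j]
        for j in range(len(SNACKS))
        if tuple(row[j] for row in EATS) == pattern
    ]
    if len(matches) != 1:
        raise ValueError("outcome does not match exactly one poisoned snack")
    return matches[0]


if __name__ == "__main__":
    # PhD 1 and PhD 3 asleep, PhD 2 awake.
    print(find_poisoned([True, False, True]))
```

Three students give 2^3 - 1 = 7 distinct non-empty sleep patterns, which is why 7 snacks can be separated for $30 rather than $70.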

Section 2: Brief history of me
Academic careers are a matter of luck.
Successes are always team efforts.
Survivorship bias and impostor syndrome.
Will briefly say how I got here, via Birmingham and Cambridge.

Birmingham
Attended King Edward's School.
Lucky to be encouraged towards maths ... and Maths Olympiads in particular.
UK team (Sweden 1991, Russia 1992).
Learnt a lot of tricks ... also that I can't do geometry.

Cambridge
Queens' College Cambridge as undergrad and PhD student.
Took diverse range of courses from General Relativity through to Galois Theory ... including Information Theory.
Realised I wasn't smart enough to be a number theorist.

Cambridge (cont.)
PhD (Entropy and Limit Theorems) with Yuri Suhov.
Lucky to be exposed to Russian School.
Applied for lots of jobs ... got the last one (JRF at Christ's College).
After 4 years at Christ's, job at Bristol came up ... I didn't get it!
3 more postdoc years later, second time lucky.

Bristol
2006, moved to Bristol as Lecturer in Statistics.
Fantastic city, great department.
Found interesting new directions.
Including collaborating with biologists (a bit) and engineers (a lot!).
Lucky to have some great PhD students: Matt, Leo, Dan, Vaia, Tom, Jennifer, Zichen, Chrys.

Section 3: Information theory
Claude Shannon (1916–2001)

It's rare to talk about maths and science as opportunities to revel in discovery.
We speak, instead about their practical benefits – to society, the economy, our prospects for employment. STEM courses are the means to job security, not joy. Studying them becomes the academic equivalent of eating your vegetables – something valuable, and state sanctioned, but vaguely distasteful.
– from A Mind at Play (2017) by Jimmy Soni and Rob Goodman

Shannon's 1948 paper
One of the most influential scientific papers ever.
110k citations and counting.
Impact comparable with e.g. Einstein's work on relativity?

What did Shannon do?
from A Mathematical Theory of Communication (1948)
Boole had introduced binary arithmetic (0s and 1s).
Shannon realised any information can be represented as series of these 'bits'.
Understood we can compress information down to a limit (entropy).
Think about "amount of stuff" a message contains.
Remove redundancy.
Key Idea: Predictable messages are compressible.

We all live in Shannon's world
Phones, Hard Drives, Broadband

Channels and noise
Shannon also modelled noisy (imperfect) 'communications channels'.
Imagine we send message X:
X = (0 0 1 0 1 0 0 1 ...)
Measurements inaccurate, environment has interference etc.
Through noisy channel may receive
Y = (0 1 1 0 1 0 0 0 ...) ≠ X
Could model this as Y = X + Z, where Z is noise (randomly flip bits).

Coping with noise
Shannon realised noise needn't be a problem.
Can make messages longer (deliberately introduce redundancy).
e.g. Naively just repeat message 3 times – call this 'rate 1/3'.
Even with some errors know what was (probably) sent.
In fact, better strategies available.
In general rate is 'proportion of bits that are message'.
How big can rate be?

Shannon capacity
Shannon introduced 'capacity' C of a channel, gave general formula.
Think about "width of a pipe".
Roughly speaking:
1 [Achievability] For any rate < C, there is a strategy so the message gets through.
2 [Converse] For any rate > C, no such strategy exists.
Key Idea: Some problems can't be solved.

Section 4: Group testing
1942/3, US wanted to test all men joining army for disease; potentially very expensive.
Disease rare: test outcomes known with high probability (predictable).
Dorfman's idea: pool blood from a group of people, test it together.
If disease present in any blood sample, test is positive.
If no disease present, test is negative.

Standard noiseless group testing

         Items              Outcome
Test 1   1 1 1 1 0 0 0 0    Positive
Test 2   0 0 0 0 1 1 1 1    Positive
Test 3   1 1 0 0 0 0 0 0    Negative
Test 4   0 0 1 0 0 0 0 0    Positive
Test 5   0 0 0 0 1 1 0 0    Positive
Test 6   0 0 0 0 1 0 0 0    Positive

Represent pooling strategy via binary test matrix.
Rows are tests, columns are people ('items').
Put a 1 if item is in test.
Red denotes having the disease ('being defective') – rare.

In practice
Want to find defective items in as few tests as possible.

Why should we care?
It's fun!
Applications in ...
  DNA testing for rare genetic conditions,
  counting disease prevalence,
  cognitive radio,
  data forensics,
  database management,
  and many more.

My work on group testing algorithms
Been thinking how to find defective items.
Describe work with Aldridge, Baldassini, Scarlett.
Give computationally feasible algorithm (recipe) ... with provable performance guarantee.
Find defective items from test matrix and outcomes only.
Negative test: all items in it are non-defective.
Positive test with one item in: that item is defective.

How well can we do in general?
Suppose N items with K defective, often consider $K \sim N^{\theta}$.
How well can we do with T tests?
Theorem (BJA 2013 – 'strong counting bound')
Any matrix and any algorithm has success probability satisfying
$$\mathbb{P}(\mathrm{suc}) \le \frac{2^T}{\binom{N}{K}}.$$
Hence (folklore) need $T \gtrsim \log_2 \binom{N}{K}$ tests ('magic number').
Information theory argument gives this too.
Need this many tests to learn $\log_2 \binom{N}{K}$ bits of information.

DD algorithm (ABJ 2014): Example
Stage 1

         Item: 1 2 3 4 5 6 7   Outcome
Test 1         1 0 1 0 0 1 0   Negative
Test 2         1 1 0 1 0 0 1   Positive
Test 3         1 0 0 0 1 0 0   Negative
Test 4         0 1 1 0 1 1 0   Positive
Test 5         1 0 1 1 0 1 0   Positive

First, look at negative tests.
Test 1 is negative, so items 1,3,6 are non-defective.
Test 3 is negative, so items 1,5 are non-defective.
Hence items 2,4,7 are possible defectives (PDs).
This is the (pre-existing) COMP algorithm.

DD example: Stage 2

         Item: 2 4 7   Outcome
Test 2         1 1 1   Positive
Test 4         1 0 0   Positive
Test 5         0 1 0   Positive

Restrict to submatrix corresponding to the PD set.
Test 4 is positive with one PD item in, so item 2 is defective.
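
The two stages above can be sketched in a few lines of Python. This is a minimal illustration of the two rules as stated on the slides (negative tests clear their items; a positive test containing exactly one possible defective pins that item down). It is not the authors' implementation, and the function name `comp_dd` and its return format are assumptions made for the example.

```python
def comp_dd(matrix, outcomes):
    """Apply the two stages from the slides to a binary test matrix.

    matrix:   list of rows, one per test; matrix[t][j] == 1 if item j+1 is in test t+1.
    outcomes: list of booleans, True for a positive test.
    Returns (possible_defectives, definite_defectives) as sets of 1-based item numbers.
    """
    n_items = len(matrix[0])

    # Stage 1 (COMP): every item appearing in a negative test is non-defective.
    non_defective = set()
    for row, positive in zip(matrix, outcomes):
        if not positive:
            non_defective.update(j for j in range(n_items) if row[j])
    possible = set(range(n_items)) - non_defective

    # Stage 2: a positive test containing exactly one possible defective
    # shows that this item must be defective.
    definite = set()
    for row, positive in zip(matrix, outcomes):
        if positive:
            pds_in_test = [j for j in possible if row[j]]
            if len(pds_in_test) == 1:
                definite.add(pds_in_test[0])

    return {j + 1 for j in possible}, {j + 1 for j in definite}


if __name__ == "__main__":
    # The 5-test, 7-item example from the Stage 1 slide.
    example_matrix = [
        [1, 0, 1, 0, 0, 1, 0],
        [1, 1, 0, 1, 0, 0, 1],
        [1, 0, 0, 0, 1, 0, 0],
        [0, 1, 1, 0, 1, 1, 0],
        [1, 0, 1, 1, 0, 1, 0],
    ]
    example_outcomes = [False, True, False, True, True]
    print(comp_dd(example_matrix, example_outcomes))
```

On the example matrix this sketch returns items 2, 4, 7 as possible defectives. Applying the Stage 2 rule to every positive test marks item 2 (via Test 4) and also item 4 (via Test 5) as defective, while item 7 only appears alongside other possible defectives and so stays undetermined by these two rules.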