Some Applications in Human Behavior Modeling (And Open Questions)
Total Page:16
File Type:pdf, Size:1020Kb
Some applications in human behavior modeling (and open questions) Jerry Zhu University of Wisconsin-Madison Simons Institute Workshop on Spectral Algorithms: From Theory to Practice Oct. 2014 1 / 30 Outline Human Memory Search Machine Teaching 2 / 30 Verbal fluency Say as many animals as you can without repeating in one minute. 3 / 30 Semantic \runs" 1. cow, horse, chicken, pig, elephant, lion, tiger, porcupine, gopher, rat, mouse, duck, goose, horse, bird, pelican, alligator, crocodile, iguana, goose 2. elephant, tiger, dog, cow, horse, sheep, cat, lynx, elk, moose, antelope, deer, tiger, wolverine, bobcat, mink, rabbit, wolf, coyote, fox, cow, zebra 3. cat, dog, horse, chicken, duck, cow, pig, gorilla, giraffe, tiger, lion, ostrich, elephant, squirrel, gopher, rat, mouse, gerbil, hamster, duck, goose 4. cat, dog, sheep, goat, elephant, tiger, dog, deer, lynx, wolf, mountain goat, bear, giraffe, moose, elk, hyena, aardvark, platypus, lion, skunk, wolverine, raccoon 5. dog, cat, leopard, elephant, monkey, sea lion, tiger, leopard, bird, squirrel, deer, antelope, snake, beaver, robin, panda, vulture 6. deer, muskrat, bear, fish, raccoon, zebra, elephant, giraffe, cat, dog, mouse, rat, bird, snake, lizard, lamb, hippopotamus, elephant, skunk, lion, tiger 4 / 30 7. dog, cat, ferret, fish, cow, horse, pig, sheep, elephant, tiger, lion, bear, giraffe, bird, groundhog, ox 8. antelope, baboon, cat, dog, elephant, frog, giraffe, hippopotamus, jaguar, dog, cat, horse, pig, chicken, bird, hippopotamus, cow 9. dog, cat, cow, sheep, lamb, rooster, chicken, tiger, elephant, monkey, orangutan, lemur, crocodile, giraffe, owl, bird, lion, elephant, hippopotamus, rhinoceros, penguin, seal, walrus, whale, dolphin 10. dog, cat, bear, deer, puma, opossum, moose, bear, wolf, coyote, lion, tiger, elephant, giraffe, hyena, cougar, mountain lion, antelope, elk Memory search K = 12, obj = 14.1208 human hippopotamus whale dolphin muskyshark squid 1 jaguar walrusporpoise baboon alligator muskox iguana boa onstrictor eel manatee c polar bear tiger crocodile cobra snake crab elephant turtle gorilla rhinoceroscougar monkey rattlesnake panda kangaroo sea lion moleseal clam cow oxlion puma pig cheetahbisonlamb otter worm 0.5 wolf ocelotaardvark chipmunk armadillo cockroach rabbit gilamonster tortoise python ant moosedog wallaby elk raccoon llamalemur skunk muskrat wildebeest ape chimp fish anemone prairie dog salamander amphibiantadpole bear panther porcupine tasmanian devil Guinea pig bobcat goat hedgehogwolverine chameleon octopus frog ram fox caterpillar leopard badger mongoosecat snail platypus dingo hyena lobster centipede buffaloyak shrimp 0 alpaca pony mule donkey deer opossum newt insect horse gazelle impalasloth chinchilla rodent lizard spider coyote tapir wombat reindeer beaver gerbil lynx anteaterkoala racoon orangutan gibbonjackrabbit vole penguin ladybug antelope bunny groundhog mouse jackal sheep woodchuck weasel gecko wasp zebra camelcaribou shrew marten duck mare giraffe pelican hornet beetleflea fly −0.5 bull squirrel hamster parrot butterflygnat hare roadrunnerswan moth gopher colt raven bee rat fowlcrow woodpecker roosterquail bat loonflamingooriole mosquito ostrich vulture wren owl sparrow emu heron bumblebee buzzard pigeon gull seagullturkey crane dove bird ibis bluejay canary −1 chickenhawkcockatiel eagle falcon robinchickadeepeacock cricketgrasshopper grouse cardinal hummingbird goose −1.5 −1 −0.5 0 0.5 1 1.5 5 / 30 Memory search K = 12, obj = 14.1208 human hippopotamus whale dolphin muskyshark squid 1 jaguar walrusporpoise baboon alligator muskox iguana boa onstrictor eel manatee c polar bear tiger crocodile cobra snake crab elephant turtle gorilla rhinoceroscougar monkey rattlesnake panda kangaroo sea lion moleseal clam cow oxlion puma pig cheetahbisonlamb otter worm 0.5 wolf ocelotaardvark chipmunk armadillo cockroach rabbit gilamonster tortoise python ant moosedog wallaby elk raccoon llamalemur skunk muskrat wildebeest ape chimp fish anemone prairie dog salamander amphibiantadpole bear panther porcupine tasmanian devil Guinea pig bobcat goat hedgehogwolverine chameleon octopus frog ram fox caterpillar leopard badger mongoosecat snail platypus dingo hyena lobster centipede buffaloyak shrimp 0 alpaca pony mule donkey deer opossum newt insect horse gazelle impalasloth chinchilla rodent lizard spider coyote tapir wombat reindeer beaver gerbil lynx anteaterkoala racoon orangutan gibbonjackrabbit vole penguin ladybug antelope bunny groundhog mouse jackal sheep woodchuck weasel gecko wasp zebra camelcaribou shrew marten duck mare giraffe pelican hornet beetleflea fly −0.5 bull squirrel hamster parrot butterflygnat hare roadrunnerswan moth gopher colt raven bee rat fowlcrow woodpecker roosterquail bat loonflamingooriole mosquito ostrich vulture wren owl sparrow emu heron bumblebee buzzard pigeon gull seagullturkey crane dove bird ibis bluejay canary −1 chickenhawkcockatiel eagle falcon robinchickadeepeacock cricketgrasshopper grouse cardinal hummingbird goose −1.5 −1 −0.5 0 0.5 1 1.5 6 / 30 Censored random walk [Abbott, Austerweil, Griffiths 2012] I V: n animal word types in English I P : (dense) n × n transition matrix I Censored random walk: observing only the first token of each type x1; x2; : : : ; xt;::: ) a1; : : : ; an I (star example) 7 / 30 The estimation problem Given m censored random walks n (1) (1) (m) (m)o D = a1 ; :::; an ; :::; a1 ; :::; an , estimate P . 8 / 30 Each observed step is an absorbing random walk (with Kwang-Sung Jun) I P (ak+1 j a1; : : : ; ak) may contain infinite latent steps I Instead, model this observed step as an absorbing random walk with absorbing states V n fa1; : : : ; akg QR P ) I 0 I 9 / 30 Maximum Likelihood −1 I Fundamental matrix N = (I − Q) , Nij is the expected number of visits to j before absorption when starting at i Pk I p(ak+1 j a1; : : : ; ak) = i=1 NkiRi1 Pm Pn (i) (i) (i) I log likelihood i=1 k=1 log p(ak+1 j a1 ; :::; ak ) I Nonconvex, gradient method 10 / 30 Other estimators I PRW: pretend a1; : : : ; an not censored I PFirst2: Use only a1; a2 in each walk (consistent) 11 / 30 Star graph 2 x-axis: m, y-axis: kPb − P kF 12 / 30 2D Grid 13 / 30 Erd¨os-R´enyiwith p = log(n)=n 14 / 30 Ring graph 15 / 30 Questions I Consistency? I Rate? 16 / 30 Outline Human Memory Search Machine Teaching 17 / 30 Passive learning: iid 1. given training data D = (x1; y1) ::: (xn; yn) ∼ p(x; y) 2. finds θ^ consistent with D Risk jθ^ − θ∗j = O(n−1) Learning a noiseless 1D threshold classifier θ∗ x ∼ uniform[0; 1] −1; x < θ∗ y = 1; x ≥ θ∗ Θ = fθ : 0 ≤ θ ≤ 1g 18 / 30 Risk jθ^ − θ∗j = O(n−1) Learning a noiseless 1D threshold classifier θ∗ x ∼ uniform[0; 1] −1; x < θ∗ y = 1; x ≥ θ∗ Θ = fθ : 0 ≤ θ ≤ 1g Passive learning: iid 1. given training data D = (x1; y1) ::: (xn; yn) ∼ p(x; y) 2. finds θ^ consistent with D 18 / 30 Learning a noiseless 1D threshold classifier θ∗ x ∼ uniform[0; 1] −1; x < θ∗ y = 1; x ≥ θ∗ Θ = fθ : 0 ≤ θ ≤ 1g Passive learning: iid 1. given training data D = (x1; y1) ::: (xn; yn) ∼ p(x; y) 2. finds θ^ consistent with D Risk jθ^ − θ∗j = O(n−1) 18 / 30 Risk jθ^ − θ∗j = O(2−n) Active learning (sequential experimental design) Binary search θ∗ 19 / 30 Active learning (sequential experimental design) Binary search θ∗ Risk jθ^ − θ∗j = O(2−n) 19 / 30 θ∗ Risk jθ^ − θ∗j = , 8 > 0 Machine teaching What is the minimum training set a helpful teacher can construct? 20 / 30 θ∗ Machine teaching What is the minimum training set a helpful teacher can construct? Risk jθ^ − θ∗j = , 8 > 0 20 / 30 Comparing the three θ∗ θ∗ θ∗ { O(1/n){ O(1/2n ) passive learning "waits" active learning "explores" teaching "guides" The teacher knows θ∗ and the learning algorithm. 21 / 30 non-iid, n = d + 1 Example 2: Teaching a Gaussian distribution d Given a training set x1 : : : xn 2 R , let the learning algorithm be n 1 X µ^ = x n i i=1 n 1 X Σ^ = (x − µ^)(x − µ^)> n − 1 i i i=1 How to teach N(µ∗; Σ∗) to the learner quickly? 22 / 30 Example 2: Teaching a Gaussian distribution d Given a training set x1 : : : xn 2 R , let the learning algorithm be n 1 X µ^ = x n i i=1 n 1 X Σ^ = (x − µ^)(x − µ^)> n − 1 i i i=1 How to teach N(µ∗; Σ∗) to the learner quickly? non-iid, n = d + 1 22 / 30 An Optimization View of Machine Teaching A D A(D) θ∗ −1 Α−1 (θ ∗ ) Α Θ DD min jDj D2D s.t. A(D) = θ∗ The objective is the teaching dimension [Goldman, Kearns 1995] of θ∗ with respect to A; Θ. 23 / 30 Example 3: Works on Humans Human categorization on 1D stimuli [Patil, Z, Kope´c,Love 2014] human training set human test accuracy machine teaching 72.5% iid 69.8% (statistically significant) 24 / 30 New task: teaching humans how to label a graph Given: I a graph G = (V; E) ∗ I target labels y : V 7! {−1; 1g I a label-completion cognitive model A (graph diffusion algorithm) such as: I mincut I harmonic function [Z, Gharahmani, Lafferty 2003] I local global consistency [Zhou et al. 2004] I ::: Find the smallest seed set: min jSj S⊆V s.t. A(y∗(S)) = y∗ (inverse problem of semi-supervised learning) 25 / 30 Example: A = local-global consistency [Zhou et al. 2004] F = (1 − α)(I − αD−1=2WD−1=2)−1y∗(S) 1 y = sgn F −1 26 / 30 jA−1(y∗)j is large Turns out 4649 out of 216 = 65536 training sets S lead to the target label completion 27 / 30 Optimal seed set 1: jSj = 3 28 / 30 Optimal seed set 2: jSj = 3 29 / 30 Questions I How does jSj relate to (spectral) properties of G? I How to solve for S? 30 / 30.