Computer Vision Stochastic Grammars for Scene Parsing

Song-Chun Zhu, Ying Nian Wu
August 18, 2020

Contents

0.1 This Book in the Series

Part I Stochastic Grammars in Vision

1 Introduction
  1.1 Vision as Joint Parsing of Objects, Scenes, and Events
  1.2 Unified Representation for Models, Algorithms, and Attributes
    1.2.1 Three Families of Probabilistic Models
    1.2.2 Dynamics of Three Inferential Channels
    1.2.3 Three Attributes associated with each Node
  1.3 Missing Themes in Popular Data-Driven Approaches
    1.3.1 Top-Down Inference in Space and Time
    1.3.2 Vision is to be a Continuous Computational Process
    1.3.3 Resolving Ambiguities and Preserving Distinct Solutions
    1.3.4 Vision is Driven by a Large Number of Tasks
  1.4 Scope of Book: Compositional Patterns in the Middle Entropy Regime
    1.4.1 Information Scaling and Three Entropy Regimes
    1.4.2 Organization of Book

2 Overview of Stochastic Grammar
  2.1 Grammar as a Universal Representation of Intelligence
  2.2 An Empiricist’s View of Grammars
  2.3 The Formalism of Grammars
  2.4 The Mathematical Structure of Grammars
  2.5 Stochastic Grammar
  2.6 Ambiguity and Overlapping Reusable Parts
  2.7 Stochastic Grammar with Context

3 Spatial And-Or Graph
  3.1 Three New Issues in Image Grammars in Contrast to Language
  3.2 Visual Vocabulary
    3.2.1 The Hierarchical Visual Vocabulary – the "Lego Land"
    3.2.2 Image Primitives
    3.2.3 Basic Geometric Groupings
    3.2.4 Parts and Objects
  3.3 Relations and configurations
    3.3.1 Relations
    3.3.2 Configurations
    3.3.3 The Reconfigurable Graphs
  3.4 Parse Graph for Objects and Scenes
  3.5 Knowledge Representation with And-Or Graph
    3.5.1 And-Or graph
    3.5.2 Stochastic Models on the And-Or graph
  3.6 Examples in Related Work
    3.6.1 Probabilistic Geometric Grammars
    3.6.2 Mixture of Deformable Part-based Models and Object Detection Grammars
    3.6.3 Probabilistic Program Induction
    3.6.4 Recursive Cortical Networks

4 Learning the And-Or Graph
  4.1 Overview of the Learning Problem
    4.1.1 Learning Parameters in Stochastic Context-Free Grammar
    4.1.2 Probability Model for AOG
  4.2 Learning Parameters in AOG
    4.2.1 Maximum Likelihood Learning of Θ
    4.2.2 Learning and Pursuing the Relation Set
    4.2.3 Examples of Sampling and Synthesis from AOG
    4.2.4 Summary of the Parameter Learning
  4.3 Structure Learning: Block Pursuit and Graph Compression
    4.3.1 Hybrid Image Templates (HIT) as Terminal Nodes
    4.3.2 AOT: Reconfigurable Object Templates
    4.3.3 Learning AOT from Images
    4.3.4 Inference on AOTs
    4.3.5 Example: The Synthesized 1D Text AOT Learning
  4.4 Structure Learning: Unsupervised Structure Learning
    4.4.1 Algorithm Framework
    4.4.2 And-Or Fragments
  4.5 Structure Learning: Pruning from Full Graph
    4.5.1 General Framework
    4.5.2 Example: Learning Image Tangram Model

5 Parsing Algorithms for Inference in And-Or Graphs
  5.1 Classic Search and Parsing Algorithms
    5.1.1 Heuristic Search in And-Graph, Or-Graph and And-Or-Graph
    5.1.2 Bottom-Up Chart Parsing
    5.1.3 Top-Down Earley Parser and Generalization
    5.1.4 Inside-Outside Algorithm for Parsing and Learning
    5.1.5 Figure of Merit Parsing
  5.2 Scheduling Top-down and Bottom-up Processes for Object Parsing
    5.2.1 Integrating α-β-γ Processes in Inference
    5.2.2 Learning the α, β and γ Processes
  5.3 Example I: Integrating the α, β and γ for Image Parsing
    5.3.1 Experiment I: Evaluating Information Contributions of the α, β and γ Processes Individually
    5.3.2 Experiment II: Object Parsing in a Greedy Pursuit Manner by Integrating the α, β and γ Processes
  5.4 Example II: Recognition on Object Categories

6 Attributed And-Or Graph
  6.1 Introduction of Attribute Grammar
  6.2 Attributed Graph Grammar Model
  6.3 Example I: Parsing the Perspective Man-made World
  6.4 Example II: Single-View 3D Scene Reconstruction and Parsing
    6.4.1 Attribute Hierarchy
    6.4.2 Attribute Scene Grammar
    6.4.3 Probabilistic Formulation for 3D Scene Parsing
    6.4.4 Inference
  6.5 Example III: Human-Centric Indoor Scene Synthesis Using Stochastic Grammar
    6.5.1 Representation
    6.5.2 Probabilistic Formulation
    6.5.3 Synthesizing Scene Configurations
  6.6 Example IV: Joint Parsing of Human Attributes, Parts and Pose
  6.7 Summarization

7 Temporal And-Or Graph
  7.1 Introduction
  7.2 Atomic Action Models
    7.2.1 2D HOI in Time - A Simplified Atomic Action Model
    7.2.2 Modeling Human Object Interaction in 3D and Time
    7.2.3 Part-Level 3DHOI
    7.2.4 Hand Object Interaction
    7.2.5 Concurrent HOI’s and hoi’s in STC-AOG
  7.3 Event representation by T-AOG
    7.3.1 The T-AOG for Events
    7.3.2 Parse Graph
    7.3.3 Example I: Synthesizing New Events by T-AoG
    7.3.4 Example II: Group Activity Parsing by ST-AoG
  7.4 Parsing with Event Grammars
    7.4.1 Formulation of Event Parsing
    7.4.2 Generating Parse Graphs of Single Events
    7.4.3 Runtime Incremental Parsing
    7.4.4 Generalized Earley Parser
    7.4.5 Multi-agent Event Parsing
  7.5 Learning the T-AoG
  ...

Details

  • File Type
    pdf
  • Content Languages
    English
  • File Pages
    316 pages