Proximity and Semi-Metric Analysis of Social Networks
Advanced Knowledge Integration In Assessing Terrorist Threats LDRD-DR
Network Analysis Component
LAUR 02-6557

Luis M Rocha¹
Modeling, Algorithms, and Informatics Group (CCS-3)
Los Alamos National Laboratory, MS B256

¹ Eugene Gavrilov and Jim Gattiker participated in the work described here by producing the necessary databases and generating data sets and proximity measures as specified throughout the text.

TABLE OF CONTENTS

1. Data Sets
2. Proximity Relations
   2.1 Mathematical Background
      2.1.1 Crisp Relations
      2.1.2 Fuzzy Relations
      2.1.3 Binary Relations and Graphs
      2.1.4 Properties of Graphs
      2.1.5 Composition of Graphs
      2.1.6 Transitive Closure
      2.1.7 Similarity Relations
      2.1.8 Proximity Relations
   2.2 Proximity Relations as Co-Occurrence Probabilities
   2.3 Extracting Proximity Relations from the Datasets
3. Analysis of Proximity Graphs
   3.1 NetStat
   3.2 Comparison with PEOPLE_DOCUMENTS1 Subset used by Lattice Group
4. Semi-metric Behavior of Proximity Graphs
   4.1 From Proximity to Distance
   4.2 Semi-metric behavior: latent associations
   4.3 Characterizing Semi-metric behavior
5. Analysis of Semi-Metric Behavior in Datasets: Capturing Indirect and Latent Associations
   5.1 Semi-metric behavior of Distance Graphs
   5.2 Identification of Latent Pairs
   5.3 Comparison with Lattice Subset
6. Confidence in Results: Capturing Latent Knowledge
   6.1 Expert Test
   6.2 Random Deletions Test
7. Conclusions and Future Work
References
Appendix A: Proximity Data
   A1. Pairs of names with PDP1 ≥ 3
   A2. Pairs of names with transitive closure of PDP1 ≥ 3
   A3. Pairs of cities with CDP1 ≥ 3
   A4. Pairs of cities with transitive closure of CDP1 ≥ 3
   A5. Pairs of people names with PDP2 ≥ 3
   A6. Pairs of people names with transitive closure of PDP2 ≥ 3
   A7. Pairs of events with EGP ≥ 3
   A8. Pairs of events with transitive closure of EGP ≥ 3
   A9. Pairs of group names with GEP ≥ 3
   A10. Pairs of group names with transitive closure of GEP ≥ 3
   A11. Pairs of events with EPP ≥ 3
   A12. Pairs of events with transitive closure of EPP ≥ 3
   A13. Pairs of people names with PEP ≥ 3
   A14. Pairs of people names with transitive closure of PEP ≥ 3
   A15. Pairs of group names with GPP ≥ 3
   A16. Pairs of group names with transitive closure of GPP ≥ 3
   A17. Pairs of people names with PGP ≥ 3
   A18. Pairs of people names with transitive closure of PGP ≥ 3
   A19. Pairs of people names with PRP ≥ 3
   A20. Pairs of people names with transitive closure of PRP ≥ 3
   A21. Pairs of related people names with RPP ≥ 3
   A22. Pairs of related people names with transitive closure of RPP ≥ 3
Appendix B: Semi-Metric Pairs
   B1: PDP1
      B1.1: Pairs of people names with s ≥ 2
      B1.2: Pairs of people names with rs ≥ 0.25
      B1.3: Pairs of people names with b ≥ 2
   B2: CDP1
      B2.1: Pairs of city names with s ≥ 2
      B2.2: Pairs of city names with rs ≥ 0.25
      B2.3: Pairs of city names with b ≥ 2
   B3: PDP2
      B3.1: Pairs of people names with s ≥ 2
      B3.2: Pairs of people names with rs ≥ 0.25
      B3.3: Pairs of people names with b ≥ 2
   B4: EGP
      B4.1: Pairs of events with s ≥ 2
      B4.2: Pairs of events with rs ≥ 0.25
      B4.3: Pairs of events with b ≥ 2
   B5: GEP
      B5.1: Pairs of group names with s ≥ 2
      B5.2: Pairs of group names with rs ≥ 0.25
      B5.3: Pairs of group names with b ≥ 2
   B6: EPP
      B6.1: Pairs of events with s ≥ 2
      B6.2: Pairs of events with rs ≥ 0.25
      B6.3: Pairs of events with b ≥ 2
   B7: PEP
      B7.1: Pairs of people names with s ≥ 2
      B7.2: Pairs of people names with rs ≥ 0.25
      B7.3: Pairs of people names with b ≥ 2
   B8: GPP
      B8.1: Pairs of group names with s ≥ 2
      B8.2: Pairs of group names with rs ≥ 0.25
      B8.3: Pairs of group names with b ≥ 2
   B9: PGP
      B9.1: Pairs of people names with s ≥ 2
      B9.2: Pairs of people names with rs ≥ 0.25
      B9.3: Pairs of people names with b ≥ 2
   B10: PRP
      B10.1: Pairs of people names with s ≥ 2
      B10.2: Pairs of people names with rs ≥ 0.25
      B10.3: Pairs of people names with b ≥ 2
   B11: RPP
      B11.1: Pairs of related people names with s ≥ 2
      B11.2: Pairs of related people names with rs ≥ 0.25
      B11.3: Pairs of related people names with b ≥ 2

1. DATA SETS

We received several data sets for the Network Analysis component. From the document reports stored in the original Lotus Notes database, Eugene Gavrilov created an SQL database for use by the entire project. From the first load of this database, we selected the relation between people names and documents as being of interest; we refer to it as the PEOPLE_DOCUMENTS1 dataset. We also studied a second relation, between cities and documents: CITIES_DOCUMENTS1.

Later, when new documents and additional relations were added to the Lotus Notes database, Eugene Gavrilov updated the SQL database. From this updated database we again extracted the relation between people names and documents: PEOPLE_DOCUMENTS2. We furthermore extracted four additional relations of interest:

1. EVENTS_GROUPS: A Relation between terrorist events and terrorist organizations.
2. EVENTS_PEOPLE: A Relation between terrorist events and people names.
3. GROUPS_PEOPLE: A Relation between terrorist organizations