Unstructured Data Analysis in ArcGIS

James Jones - Esri
Julia Bell - Esri
Scott Graff - Microsoft

What is Unstructured Data?
• Does not have a recognizable structure or is loosely structured
• Can be in a variety of formats and storage mechanisms
  • Word Documents
  • Email
  • Social Media Posts
  • PowerPoint
  • PDF
  • Share drive

“Every two days we create as much information as we did up to 2003”
Eric Schmidt, 2010

What does that look like? Every minute…
• 144 million e-mails are sent
• Twitter sees 350,000 new tweets
• Facebook has 510,000 comments posted and 293,000 statuses updated
• 15.2 million text messages are sent
• 954,000 new Microsoft Office documents are created

How much spatial information are we missing out on?
How can we capture this information in ArcGIS?

How to Integrate Unstructured Data into ArcGIS

Native Esri Capability

What are you looking for?
• Coordinates
• Custom Locations
• User defined keywords

What is the best tool?
• ArcGIS Pro w/ LocateXT
• ArcGIS Pro for Intelligence
• ArcGIS Enterprise w/ LocateXT

How is it best used?
• Data is at least somewhat understood
• Data benefits from identifiable and repeating patterns
• Little to no programming experience available/needed

ArcGIS LocateXT
Extract Locations from Unstructured Data

Extracting Locations with ArcGIS
• LocateXT Extension for ArcGIS Desktop and Enterprise
• Available for ArcMap 9.1 and later
• Available in ArcGIS Pro at 2.3
• 100% feature functionality as of ArcGIS Pro 2.4
• Uses pattern-matching regular expressions (regex) to search for coordinates in a variety of formats
• Uses a custom location list to match/extract other patterns (place names, codes, other terms)

Extracting Locations in ArcGIS Pro
• New option added to the “Add Data” button
• Allows a user to drag and drop documents or copied text into a window
• Can create a new feature class or append to an existing one
• Included with ArcGIS Pro for Intelligence

Extracting Locations in ArcGIS Pro
• Two geoprocessing tools added
• Located in Conversion Tools -> To Geodatabase
  • Extract Locations from Document
  • Extract Locations from Text

Extracting Custom Attributes
• Ability to create custom attributes based on content within the document or near a location
  • Triggered by location extraction
  • Based on keywords
• Tag locations based on keywords
• Scrape/harvest portions of a document based on keywords
• Ability to extract based on:
  • Number of characters/words
  • Number of lines/blank line
  • Stop string
• Previously built in a separate LocateXT desktop application (until Pro 2.4)

Extracting Addresses
• Ability to extract addresses from documents based on a combination of:
  • State
  • ZIP code
  • Ex. VA 22182
• The combination of extracted text and pre-text is geocoded

Explore Unstructured Data through LocateXT and Custom Attributes

How to Integrate Unstructured Data into ArcGIS

Native Esri Capability vs. Third Party Integration

What are you looking for?
• Native Esri: Coordinates, Custom Locations, User defined keywords
• Third Party: Locations, People/Organizations, Events, Dates, Relationships

What is the best tool?
• Native Esri: ArcGIS Pro w/ LocateXT, ArcGIS Pro for Intelligence, ArcGIS Enterprise w/ LocateXT
• Third Party: Natural Language Processing

How is it best used?
• Native Esri:
  • Data is at least somewhat understood
  • Data benefits from identifiable and repeating patterns
  • Little to no programming experience available/needed
• Third Party:
  • Data is not well understood
  • Data does not contain identifiable and/or repeating patterns
  • Integration needed

Integrating NLP with ArcGIS

Integrating NLP Capabilities with ArcGIS
• Many NLP offerings have Python APIs/SDKs or communicate over REST
• Integrates near seamlessly with ArcPy
  • Create Python Toolboxes/Script Tools
  • Allows users to extract relevant data from data local to their machine or as part of an Enterprise pipeline
• ArcGIS.Learn has incorporated support for NLP
  • Entity Recognition
[Diagram labels: NLTK, Apps, Desktop, APIs, ArcGIS, Tools]

Processing Unstructured Data Using ArcGIS and Microsoft Azure
Source -> Processing -> Storage -> Analysis -> Apps/Visualization

Key Take-aways:
1. Leverages modern, serverless processes and integration apps
2. Allows for a variety of NLP processes to be run
3. Deep analytics with ArcGIS and Azure Cognitive Services/Machine Learning

Building an Unstructured Pipeline to Understand World Events
• Source / Processing: Logic Apps / Power Automate watch RSS feeds and websites. Microsoft Cognitive Services extracts entities and analyzes sentiment.
• Storage: Data is passed to ArcGIS GeoEvent Server for ingestion into the ArcGIS Platform. JSON files are stored in Azure Storage. Tabular data is stored in an Azure SQL Data Warehouse.
• Analysis: ArcGIS Pro for Intelligence allows non-GIS intel professionals access to a custom experience of ArcGIS Pro to provide deep analysis. Microsoft Cognitive Services further enriches data by running computer vision against embedded images.
• Apps/Visualization: Operations Dashboard and ArcGIS Insights allow for very tailored views of the data to be quickly analyzed and viewed by decision makers and non-GIS professionals.
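
The hand-off from entity extraction to mapping in a pipeline like this can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the input record shape, the `GAZETTEER` lookup (a stand-in for a real geocoding step), and the function name `entities_to_geojson` are all hypothetical and do not reflect the actual schema of any Microsoft or Esri service. It shows one plausible way to turn extracted place-name entities into GeoJSON point features, a format that mapping platforms commonly accept.

```python
import json

# Hypothetical gazetteer standing in for a real geocoding step.
# Values are (longitude, latitude) in decimal degrees.
GAZETTEER = {
    "Paris": (2.3522, 48.8566),
    "Nairobi": (36.8219, -1.2921),
}


def entities_to_geojson(record, gazetteer=GAZETTEER):
    """Turn one enriched record into a GeoJSON FeatureCollection.

    `record` uses an assumed shape (not a real service schema):
    {"source": str, "sentiment": str,
     "entities": [{"text": str, "category": str}, ...]}
    """
    features = []
    for entity in record.get("entities", []):
        if entity.get("category") != "Location":
            continue  # only place names can be mapped directly
        coords = gazetteer.get(entity["text"])
        if coords is None:
            continue  # unknown place: skip rather than guess
        features.append({
            "type": "Feature",
            # GeoJSON orders coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": list(coords)},
            "properties": {
                "name": entity["text"],
                "source": record.get("source"),
                "sentiment": record.get("sentiment"),
            },
        })
    return {"type": "FeatureCollection", "features": features}


# Made-up record for illustration only.
sample = {
    "source": "example-rss-item",
    "sentiment": "negative",
    "entities": [
        {"text": "Paris", "category": "Location"},
        {"text": "Red Cross", "category": "Organization"},
        {"text": "Atlantis", "category": "Location"},
    ],
}

collection = entities_to_geojson(sample)
print(json.dumps(collection, indent=2))
```

In this sketch, non-location entities and places missing from the lookup are skipped rather than guessed, so only "Paris" becomes a feature; the sentiment label travels along as an attribute for later filtering.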
ArcGIS Pro for Intelligence
Multi-int workstation for the intelligence professional
• Create and manage intelligence information
• Visualize and display your data in maps, charts, and timelines
• Perform spatial, temporal, relational, and predictive analysis
• Produce and disseminate intelligence products

NLP Integration with ArcGIS
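
For comparison with the NLP route, the pattern-matching approach described in the LocateXT slides (regular expressions that search text for coordinates) can be illustrated with a minimal sketch. The two patterns and the function name `extract_coordinates` below are assumptions made for illustration; they are not LocateXT's actual expressions, and they cover only two simple decimal-degree formats.

```python
import re

# Signed decimal degrees written as "lat, lon", e.g. "38.9187, -77.2311".
DD_PATTERN = re.compile(
    r"(?<![\d.])(-?\d{1,2}\.\d+)\s*,\s*(-?\d{1,3}\.\d+)(?!\d)"
)

# Decimal degrees with hemisphere letters, e.g. "1.2921S 36.8219E".
HEMI_PATTERN = re.compile(
    r"(?<![\d.])(\d{1,2}\.\d+)\s*([NS])[\s,]+(\d{1,3}\.\d+)\s*([EW])\b"
)


def extract_coordinates(text):
    """Return (lat, lon, matched_text) tuples in order of appearance."""
    hits = []
    for m in DD_PATTERN.finditer(text):
        lat, lon = float(m.group(1)), float(m.group(2))
        hits.append((m.start(), lat, lon, m.group(0)))
    for m in HEMI_PATTERN.finditer(text):
        # South and West hemispheres become negative values.
        lat = float(m.group(1)) * (-1 if m.group(2) == "S" else 1)
        lon = float(m.group(3)) * (-1 if m.group(4) == "W" else 1)
        hits.append((m.start(), lat, lon, m.group(0)))
    # Discard number pairs that cannot be a latitude/longitude.
    hits = [h for h in hits if -90 <= h[1] <= 90 and -180 <= h[2] <= 180]
    hits.sort(key=lambda h: h[0])
    return [(lat, lon, raw) for _, lat, lon, raw in hits]


# Made-up text for illustration only.
sample = (
    "Convoy sighted near 38.9187, -77.2311 at dawn. "
    "Second report places it at 1.2921S 36.8219E. "
    "Invoice total was 12.50, 999.99 and is not a location."
)

results = extract_coordinates(sample)
for lat, lon, raw in results:
    print(f"{raw!r} -> lat={lat}, lon={lon}")
```

The range check is what rejects the invoice figures: they match the pattern syntactically, but 999.99 is not a valid longitude. This suits data with "identifiable and repeating patterns"; text without such patterns is where the NLP integration becomes the better fit.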
