Web Data Extraction, Applications and Techniques: A Survey


Emilio Ferrara (a,*), Pasquale De Meo (b), Giacomo Fiumara (c), Robert Baumgartner (d)

(a) Center for Complex Networks and Systems Research, Indiana University, Bloomington, IN 47408, USA
(b) Univ. of Messina, Dept. of Ancient and Modern Civilization, Polo Annunziata, I-98166 Messina, Italy
(c) Univ. of Messina, Dept. of Mathematics and Informatics, viale F. Stagno D'Alcontres 31, I-98166 Messina, Italy
(d) Lixto Software GmbH, Austria

Abstract

Web Data Extraction is an important problem that has been studied by means of different scientific tools and in a broad range of applications. Many approaches to extracting data from the Web have been designed to solve specific problems and operate in ad hoc domains. Other approaches, by contrast, heavily reuse techniques and algorithms developed in the field of Information Extraction. This survey aims at providing a structured and comprehensive overview of the literature in the field of Web Data Extraction. We provide a simple classification framework in which existing Web Data Extraction applications are grouped into two main classes, namely applications at the Enterprise level and at the Social Web level. At the Enterprise level, Web Data Extraction techniques emerge as a key tool for performing data analysis in Business and Competitive Intelligence systems as well as for business process re-engineering. At the Social Web level, Web Data Extraction techniques make it possible to gather large amounts of structured data continuously generated and disseminated by Web 2.0, Social Media and Online Social Network users; this offers unprecedented opportunities to analyze human behavior at a very large scale. We also discuss the potential for cross-fertilization, i.e., the possibility of re-using Web Data Extraction techniques originally designed to work in a given domain in other domains.

Keywords: Web Information Extraction, Web Data Mining, Business Intelligence, Knowledge Engineering, Knowledge-based Systems, Information Retrieval

* Corresponding author. Email addresses: [email protected] (Emilio Ferrara), [email protected] (Pasquale De Meo), [email protected] (Giacomo Fiumara), [email protected] (Robert Baumgartner)

Preprint submitted to Knowledge-Based Systems, June 5, 2014

Contents

1 Introduction
  1.1 Challenges of Web Data Extraction techniques
  1.2 Related work
  1.3 Our contribution
  1.4 Organization of the survey
2 Techniques
  2.1 Tree-based techniques
    2.1.1 Addressing elements in the document tree: XPath
    2.1.2 Tree edit distance matching algorithms
  2.2 Web wrappers
    2.2.1 Wrapper generation and execution
    2.2.2 The problem of wrapper maintenance
  2.3 Hybrid systems: learning-based wrapper generation
3 Web Data Extraction Systems
  3.1 The main phases associated with a Web Data Extraction System
  3.2 Layer cake comparisons
4 Applications
  4.1 Enterprise Applications
    4.1.1 Context-aware advertising
    4.1.2 Customer care
    4.1.3 Database building
    4.1.4 Software Engineering
    4.1.5 Business Intelligence and Competitive Intelligence
    4.1.6 Web process integration and channel management
    4.1.7 Functional Web application testing
    4.1.8 Comparison shopping
    4.1.9 Mashup scenarios
    4.1.10 Opinion mining
    4.1.11 Citation databases
    4.1.12 Web accessibility
    4.1.13 Main content extraction
    4.1.14 Web (experience) archiving
    4.1.15 Summary
  4.2 Social Web Applications
    4.2.1 Extracting data from a single Online Social Web platform
    4.2.2 Extracting data from multiple Online Social Web platforms
  4.3 Opportunities for cross-fertilization
5 Conclusions

1. Introduction

Web Data Extraction systems are a broad class of software applications aimed at extracting information from Web sources [79, 11]. A Web Data Extraction system usually interacts with a Web source and extracts data stored in it: for instance, if the source is an HTML Web page, the extracted information could consist of elements in the page as well as the full text of the page itself. The extracted data may then be post-processed, converted into the most convenient structured format, and stored for further use [131, 63].
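To make this basic step concrete, the following minimal sketch extracts both individual elements and the full text from an HTML page. It is only an illustration: the survey does not prescribe any particular language or library, the page content and the "price" class name are invented, and Python's standard html.parser module is used here simply because it needs no external dependencies.

    from html.parser import HTMLParser

    # Hypothetical HTML page standing in for a generic Web source.
    PAGE = """
    <html>
      <head><title>Example product page</title></head>
      <body>
        <h1>ACME Widget</h1>
        <span class="price">19.99</span>
        <p>A sturdy widget for everyday use.</p>
      </body>
    </html>
    """

    class SimpleExtractor(HTMLParser):
        """Collects the page title, elements marked as prices, and the full text."""

        def __init__(self):
            super().__init__()
            self._current_tag = None
            self._is_price = False
            self.title = ""
            self.prices = []
            self.full_text = []

        def handle_starttag(self, tag, attrs):
            self._current_tag = tag
            self._is_price = dict(attrs).get("class") == "price"

        def handle_endtag(self, tag):
            self._current_tag = None
            self._is_price = False

        def handle_data(self, data):
            text = data.strip()
            if not text:
                return
            if self._current_tag == "title":
                self.title = text
            if self._is_price:
                self.prices.append(text)
            self.full_text.append(text)

    extractor = SimpleExtractor()
    extractor.feed(PAGE)

    # Post-process into a simple structured record, ready to be stored.
    record = {
        "title": extractor.title,
        "prices": extractor.prices,
        "full_text": " ".join(extractor.full_text),
    }
    print(record)

In a full system, this hand-written logic roughly corresponds to what the survey calls a wrapper (Section 2.2); in practice wrappers are often generated from examples rather than coded by hand (Section 2.2.1).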
Web Data Extraction systems find extensive use in a wide range of applications, including the analysis of text-based documents available to a company (e-mails, support forums, technical and legal documentation, and so on), Business and Competitive Intelligence [9], crawling of Social Web platforms [17, 52], and Bio-Informatics [99]. The importance of Web Data Extraction systems stems from the fact that a large (and steadily growing) amount of information is continuously produced, shared and consumed online: Web Data Extraction systems make it possible to collect this information efficiently with limited human effort. The availability and analysis of the collected data are an indispensable requirement for understanding the complex social, scientific and economic phenomena that generate the information itself. For example, collecting the digital traces produced by users of Social Web platforms like Facebook, YouTube or Flickr is the key step toward understanding, modeling and predicting human behavior [68, 94, 3].

In the commercial field, the Web provides a wealth of public-domain information. A company can probe the Web to acquire and analyze information about the activity of its competitors. This process is known as Competitive Intelligence [22, 125], and it is crucial for quickly identifying the opportunities offered by the market, anticipating competitors' decisions, and learning from their failures and successes.

1.1. Challenges of Web Data Extraction techniques

The design and implementation of Web Data Extraction systems have been discussed from different perspectives, and they leverage scientific methods from various disciplines, including Machine Learning, Logic and Natural Language Processing.

In the design of a Web Data Extraction system, many factors must be taken into account. Some of them are independent of the specific application domain in which we plan to perform Web Data Extraction. Other factors, by contrast, heavily depend on the particular features of the application domain: as a consequence, technological solutions that are effective in some application contexts may be unsuitable in others. In its most general formulation, the problem of extracting data from the Web is hard because it is constrained by several requirements. The key challenges encountered in the design of a Web Data Extraction system can be summarized as follows:

• Web Data Extraction techniques often require the help of human experts. A first challenge consists of providing a high degree of automation by reducing human effort as much as possible. Human feedback, however, may play an important role in raising the level of accuracy achieved by a Web Data Extraction system. A related challenge is therefore to identify a reasonable trade-off between building highly automated Web Data Extraction procedures and achieving accurate performance.

• Web Data Extraction techniques should be able to process large volumes of data in a relatively short time. This requirement is particularly stringent in the field of Business and Competitive Intelligence, because a company needs to perform timely analyses of market conditions.

• Applications in the field of the Social Web or, more generally, those dealing with personal data must provide solid privacy guarantees. Therefore, potential (even unintentional) attempts to violate user privacy should be promptly and adequately identified and counteracted.

• Approaches relying on Machine Learning often require a large training set of manually labeled Web pages. In general, labeling pages is time-consuming and error-prone, and therefore in many cases we cannot assume that labeled pages are available.

• Oftentimes, a Web Data Extraction tool has to routinely extract data from a Web source that can evolve over time. Web sources change continuously, and structural changes happen without forewarning and are therefore unpredictable. In real-world scenarios the need thus emerges to maintain these systems, which may stop working correctly if they lack the flexibility to detect and handle structural modifications of the underlying Web sources.

1.2. Related work

The theme of Web Data Extraction is covered by a number of reviews. Laender et al. [79] presented a survey that offers a rigorous taxonomy to classify Web Data Extraction systems. The authors introduced a set of criteria and a qualitative analysis of various Web Data Extraction tools. Kushmerick [77] profiled finite-state approaches to the Web Data Extraction problem. The author analyzed both wrapper induction approaches (i.e., approaches capable of automatically generating wrappers by exploiting suitable examples) and maintenance ones (i.e., methods to update a wrapper each time the structure of the Web source changes). That paper also discussed Web Data Extraction techniques derived from Natural Language Processing and Hidden Markov Models. On the wrapper induction problem, Flesca et al. [45] and Kaiser and Miksch [64] surveyed approaches, techniques and tools; the latter paper, in particular, provided a model describing the architecture of an Information Extraction system. Chang et al. [19] introduced a three-dimensional categorization of Web Data Extraction systems, based on task difficulty, the techniques used and the degree of automation. In 2007, Fiumara [44] applied these criteria to classify four state-of-the-art Web Data Extraction systems. A relevant survey on Information Extraction is due to Sarawagi [105] and, in our opinion, anybody who intends to approach this discipline should read it. Recently, some authors have focused on unstructured data management systems (UDMSs) [36], i.e., software systems that analyze raw text data, extract some structure from it (e.g., person names and locations), integrate the structure (e.g., objects like New York and NYC are merged into a single object), and use the integrated structure to build a database.
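To illustrate the wrapper-maintenance problem that several of these works address, the short sketch below shows how an extraction rule tied to a page's structure silently stops working when that structure changes. It is a deliberately simplified illustration under assumed inputs: the two page snapshots are invented and a regular expression stands in for a real wrapper rule; actual wrappers and their maintenance are covered in Sections 2.2.1 and 2.2.2.

    import re

    # Two hypothetical snapshots of the same product page, before and after a redesign.
    PAGE_V1 = '<div id="product"><span class="price">19.99</span></div>'
    PAGE_V2 = '<div id="product"><div class="pricing"><span class="amount">19.99</span></div></div>'

    # A naive wrapper: a single extraction rule tied to the original markup.
    PRICE_RULE = re.compile(r'<span class="price">([^<]+)</span>')

    def extract_price(page: str):
        """Apply the extraction rule; return None when the rule no longer matches."""
        match = PRICE_RULE.search(page)
        return match.group(1) if match else None

    print(extract_price(PAGE_V1))  # '19.99' -- the rule works on the original layout
    print(extract_price(PAGE_V2))  # None    -- the data is still there, but the rule broke

    # Wrapper maintenance is the task of detecting this kind of silent failure and
    # regenerating or adapting the extraction rule so that extraction keeps working.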