Proceedings in One File
Total Page:16
File Type:pdf, Size:1020Kb
1st Symposium on Information Management and Big Data 8th, 9th and 10th September 2014 Cusco - Peru PROCEEDINGS © SIMBig, Andina University of Cusco and the authors of individual articles. Proceeding Editors: J. A. Lossio-Ventura and H. Alatrista-Salas TABLE OF CONTENTS SIMBig 2014 Conference Organization Committee SIMBig 2014 Conference Reviewers SIMBig 2014 Paper Contents SIMBig 2014 Keynote Speaker Presentations Full and Short Papers The SIMBig 2014 Organization Committee confirms that full and concise papers accepted for this publication: • Meet the definition of research in relation to creativity, originality, and increasing humanity's stock of knowledge; • Are selected on the basis of a peer review process that is independent, qualified expert review; • Are published and presented at a conference having national and international significance as evidenced by registrations and participation; and • Are made available widely through the Conference web site. Disclaimer: The SIMBig 2014 Organization Committee accepts no responsibility for omissions and errors. 3 SIMBig 2014 Organization Committee GENERAL CHAIR • Juan Antonio LOSSIO VENTURA, Montpellier 2 University, LIRMM, Montpellier, France • Hugo ALATRISTA SALAS, UMR TETIS, Irstea, France LOCAL CHAIR • Cristhian GANVINI VALCARCEL Andina University of Cusco, Peru • Armando FERMIN PEREZ, National University of San Marcos, Peru • Cesar A. BELTRAN CASTAÑON, GRPIAA, Pontifical Catholic University of Peru • Marco RIVAS PEÑA, National University of San Marcos, Peru 4 SIMBig 2014 Conference Reviewers • Salah Ait-Mokhtar, Xerox Research Centre Europa, FRANCE • Jérôme Azé, LIRMM - University of Montpellier 2, FRANCE • Cesar A. Beltrán Castañón, GRPIAA - Pontifical Catholic University of Peru, PERU • Sandra Bringay, LIRMM - University of Montpellier 3, FRANCE • Oscar Corcho, Ontology Engineering Group - Polytechnic University of Madrid, SPAIN • Gabriela Csurka, Xerox Research Centre Europa, FRANCE • Mathieu d'Aquin, Knowledge Media institute (KMi) - Open University, UK • Ahmed A. A. Esmin, Federal University of Lavras, BRAZIL • Frédéric Flouvat, PPME Labs - University of New Caledonia, NEW CALEDONIA • Hakim Hacid, Alcatel-Lucent, Bell Labs, FRANCE • Dino Ienco, Irstea, FRANCE • Clement Jonquet, LIRMM - University of Montpellier 2, FRANCE • Eric Kergosien, Irstea, FRANCE • Pierre-Nicolas Mougel, Hubert Curien Labs - University of Saint-Etienne, FRANCE • Phan Nhat Hai, Oregon State University, USA • Thomas Opitz, University of Montpellier 2 - LIRMM, FRANCE • Jordi Nin, Barcelona Supercomputing Center (BSC) - Technical University of Catalonia (BarcelonaTECH), SPAIN • Gabriela Pasi, Information Retrieval Lab - University of Milan Bicocca, ITALY • Miguel Nuñez del Prado Cortez, Intersec Labs - Paris, FRANCE • José Manuel Perea Ortega, European Commission - Joint Research Centre (JRC)- Ispra, ITALY • Yoann Pitarch, IRIT - Toulouse, FRANCE • Pascal Poncelet, LIRMM - University of Montpellier 2, FRANCE • Julien Rabatel, LIRMM, FRANCE • Mathieu Roche, Cirad - TETIS and LIRMM, FRANCE • Nancy Rodriguez, LIRMM - University of Montpellier 2, FRANCE • Hassan Saneifar, Raja University, IRAN • Nazha Selmaoui-Folcher, PPME Labs - University of New Caledonia, NEW CALEDONIA • Maguelonne Teisseire, Irstea - LIRMM, FRANCE • Boris Villazon-Terrazas, iLAB Research Center - iSOCO, SPAIN • Pattaraporn Warintarawej, Prince of Songkla University, THAILAND • Osmar Zaïane, Department of Computing Science, University of Alberta, CANADA 5 SIMBig 2014 Paper Contents Title and Authors Page Keynote Speaker Presentations 7 Pascal Poncelet Mathieu Roche Hakim Hacid Mathieu d’Aquin Long Papers Identification of Opinion Leaders Using Text Mining Technique in 8 Virtual Community, Chihli Hung and Pei-Wen Yeh Quality Metrics for Optimizing Parameters Tuning in Clustering 14 Algorithms for Extraction of Points of Interest in Human Mobility, Miguel Nunez Del Prado Cortez and Hugo Alatrista-Salas A Cloud-based Exploration of Open Data: Promoting Transparency 22 and Accountability of the Federal Government of Australia, Richard Sinnott and Edwin Salvador Discovery Of Sequential Patterns With Quantity Factors, Karim 33 Guevara Puente de La Vega and César Beltrán Castañón Online Courses Recommendation based on LDA, Rel Guzman 42 Apaza, Elizabeth Vera Cervantes, Laura Cruz Quispe and Jose Ochoa Luna Short Papers ANIMITEX project: Image Analysis based on Textual Information, 49 Hugo Alatrista-Salas, Eric Kergosien, Mathieu Roche and Maguelonne Teisseire A case study on Morphological Data from Eimeria of Domestic 53 Fowl using a Multiobjective Genetic Algorithm and R&P for Learning and Tuning Fuzzy Rules for Classification, Edward Hinojosa Cárdenas and César Beltrán Castañón SIFR Project: The Semantic Indexing of French Biomedical Data 58 Resources, Juan Antonio Lossio-Ventura, Clement Jonquet, Mathieu Roche and Maguelonne Teisseire Spanish Track Mathematical modelling of the performance of a computer system, 62 Felix Armando Fermin Perez Clasificadores supervisados para el análisis predictivo de muerte y 67 sobrevida materna, Pilar Hidalgo 6 SIMBig 2014 Keynote Speakers Mining Social Networks: Challenges and Directions Pascal Poncelet, Lirmm, Montpellier, France Professor Pascal Poncelet talk was focused on analysis of data associate to social network. In his presentation, Professor Poncelet summarized several techniques through knowledge discovery on databases process. NLP approaches: How to Identify Relevant Information in Social Networks? Mathieu Roche, UMR TETIS, Irstea, France In this talk, doctor Roche outlined last tendencies in text mining task. Several techniques (sentiment analysis, opinion mining, entity recognition, ...) on different corpus (tweets, blogs, sms,...) were detailed in this presentation. Social Network Analysis: Overview and Applications Hakim Hacid, Zayed University, Dubai, UAE Doctor Hacid presented a complete overview concerning techniques associated to explote social web data. Techniques as social network analysis, social information retrieval and mashups full completion were presented. Putting intelligence in Web Data With Examples Education Mathieu d’Aquin, Knowledge Media Institute, The Open University, UK Doctor d'Aquin presented several web mining approaches addressed to education issues. In this talk doctor d'Aquin detailed themes like intelligent web information and knowledge processing, the semantic web, among others. 7 Identification of Opinion Leaders Using Text Mining Technique in Virtual Community Chihli Hung Pei-Wen Yeh Department of Information Management Department of Information Management Chung Yuan Christian University Chung Yuan Christian University Taiwan 32023, R.O.C. Taiwan 32023, R.O.C. [email protected] [email protected] significantly and consumers are further influenced Abstract by other consumers without any geographic limitation (Flynn et al., 1996). Word of mouth (WOM) affects the buying behavior of information receivers stronger than Nowadays, making buying decisions based on advertisements. Opinion leaders further affect WOM becomes one of collective decision-making others in a specific domain through their new strategies. It is nature that all kinds of human information, ideas and opinions. Identification groups have opinion leaders, explicitly or of opinion leaders has become one of the most implicitly (Zhou et al., 2009). Opinion leaders important tasks in the field of WOM mining. usually have a stronger influence on other Existing work to find opinion leaders is based members through their new information, ideas and mainly on quantitative approaches, such as representative opinions (Song et al., 2007). Thus, social network analysis and involvement. how to identify opinion leaders has increasingly Opinion leaders often post knowledgeable and attracted the attention of both practitioners and useful documents. Thus, the contents of WOM are useful to mine opinion leaders as well. This researchers. research proposes a text mining-based approach As opinion leadership is relationships between to evaluate features of expertise, novelty and members in a society, many existing opinion leader richness of information from contents of posts identification tasks define opinion leaders by for identification of opinion leaders. According analyzing the entire opinion network in a specific to experiments in a real-world bulletin board domain, based on the technique of social network data set, this proposed approach demonstrates analysis (SNA) (Kim, 2007; Kim and Han, 2009). high potential in identifying opinion leaders. This technique depends on relationship between initial publishers and followers. A member with the greatest value of network centrality is 1 Introduction considered as an opinion leader in this network This research identifies opinion leaders using the (Kim, 2007). technique of text mining, since the opinion leaders However, a junk post does not present useful affect other members via word of mouth (WOM) information. A WOM with new ideas is more on social networks. WOM defined by Arndt (1967) interesting. A spam link usually wastes readers' is an oral person-to-person communication means time. A long post is generally more useful than a between an information receiver and a sender, who short one (Agarwal et al., 2008). A focused exchange the experiences of a brand, a product or a document is more significant than a vague one. service based on a non-commercial purpose. That is, different documents may contain different Internet provides human beings with a new way