
Wright State University
CORE Scholar
Theses and Dissertations, 2016

Personalized and Adaptive Semantic Information Filtering for Social Media
Pavan Kapanipathi
Wright State University

Part of the Computer Engineering Commons, and the Computer Sciences Commons

Repository Citation
Kapanipathi, Pavan, "Personalized and Adaptive Semantic Information Filtering for Social Media" (2016). Browse all Theses and Dissertations. 1522.
https://corescholar.libraries.wright.edu/etd_all/1522

A dissertation submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy

By
PAVAN KAPANIPATHI
M.S., Wright State University, 2012
B.S., Visvesvaraya Technological University, 2007

2016
Wright State University
Dayton, Ohio 45435-0001

WRIGHT STATE UNIVERSITY
SCHOOL OF GRADUATE STUDIES

April 6, 2016

I HEREBY RECOMMEND THAT THE THESIS PREPARED UNDER MY SUPERVISION BY Pavan Kapanipathi ENTITLED Personalized and Adaptive Semantic Information Filtering for Social Media BE ACCEPTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Doctor of Philosophy.

Amit Sheth, Ph.D.
Dissertation Director

Michael Raymer, Ph.D.
Director, Computer Science Ph.D. Program

Robert E. W. Fyffe, Ph.D.
Vice President for Research and Dean of the Graduate School

Committee on Final Examination
Amit Sheth, Ph.D.
Krishnaprasad Thirunarayan, Ph.D.
Derek Doran, Ph.D.
Prateek Jain, Ph.D.

ABSTRACT

Kapanipathi, Pavan. PhD., Department of Computer Science and Engineering, Wright State University, 2016.
Personalized and Adaptive Semantic Information Filtering for Social Media.

The short-text and real-time nature of social media platforms has introduced challenges for personalized filtering, such as a lack of semantic context and a dynamically changing vocabulary. Semantic techniques and technologies can be leveraged to address these challenges and to build a personalized filtering system for social media content.

Social media has experienced immense growth in recent times. These platforms are becoming increasingly common for information seeking and consumption, and as their popularity grows, information overload poses a significant challenge to users. For instance, Twitter alone generates around 500 million tweets per day, and it is impractical for users to parse through such an enormous stream to find information that is interesting to them. This situation necessitates efficient personalized filtering mechanisms that let users consume relevant, interesting information from social media.

Building a personalized filtering system involves understanding users’ interests and utilizing these interests to deliver relevant information to users. These tasks primarily involve analyzing and processing social media text, which is challenging due to its short length and the real-time nature of the medium. The challenges include: (1) Lack of semantic context: Social media posts are on average short, which provides limited semantic context for textual analysis. This is particularly detrimental for topic identification, which is a necessary task for mining users’ interests; (2) Dynamically changing vocabulary: Most social media websites, such as Twitter and Facebook, generate posts that are of current (timely) interest to users. Due to this real-time nature, information relevant to topics evolves dynamically, reflecting changes in the real world.
This in turn changes the vocabulary associated with these dynamic topics of interest, making it harder to filter relevant information; (3) Scalability: The number of users on social media platforms is very large, which makes it difficult for centralized systems to scale to deliver relevant information to users.

This dissertation is devoted to exploring semantics and Semantic Web technologies to address the above-mentioned challenges in building a personalized information filtering system for social media. In particular, the necessary semantics are derived from crowd-sourced knowledge bases such as Wikipedia to improve context for understanding short text and dynamic topics on social media.

Contents

1 Introduction
1.1 Social Media: Consuming Collected Intelligence
1.2 Challenges for Social Media Filtering
1.3 Semantic Approaches for Social Data Filtering
1.3.1 Enhancing Semantic Context using Hierarchical Interest Graphs
1.3.2 Harnessing Evolving Knowledge Base for Continuous Filtering
1.3.3 Scalable Content Dissemination
1.4 Dissertation Organization
2 Background and Related Work
2.1 World Wide Web
2.1.1 Web 2.0 and The Social Web
2.1.2 Semantics and The Semantic Web
2.1.2.1 RDF – The Resource Description Framework
2.1.2.2 Ontologies and Vocabularies
2.1.2.3 SPARQL: Querying the Semantic Web
2.1.2.4 Linked Open Data
2.1.2.5 Wikipedia
2.2 Related Work on Information Filtering for Social Media
2.2.1 Architecture of Information Filtering Systems
2.2.1.1 User Modeling
2.2.1.2 Filtering Module
2.2.2 Scalability Aspects for an Information Filtering System
3 Hierarchical Interest Graphs from Tweets
3.1 Overview
3.2 Inferring User Interests
3.2.1 Hierarchy preprocessor
3.2.1.1 Categories clean up
3.2.1.2 Hierarchical transformation
3.2.2 User Interests Generator
3.2.2.1 Identifying primitive interests
3.2.2.2 Scoring User Interests
3.2.3 Inferring the Hierarchical Interest Graph
3.2.3.1 Bell activation function
3.2.3.2 Bell log activation function
3.2.3.3 Priority intersect activation function
3.3 Tweet Recommendation using Hierarchical Interest Graphs
3.4 Results and Evaluation
3.4.1 Wikipedia Hierarchy Evaluation
3.4.2 Quality of Interests Identified
3.4.2.1 Top-k relevancy
3.4.2.2 Ranking evaluation
3.4.2.3 Highest ranking interest
3.4.3 Finding Implicit Interests
3.4.4 Comparison Against Twopics
3.4.5 Tweet Recommendation Evaluation
3.4.5.1 Dataset
3.4.5.2 Evaluation Approach
3.4.5.3 Comparison Methods
3.4.5.4 Evaluation Settings
3.4.6 Results
3.5 Conclusion
4 Filtering Tweets for Dynamically Evolving Interests
4.1 Overview
4.2 Study of Hashtag Behavior During Dynamic Topics
4.2.1 Dataset for Analysis
4.2.2 Frequency Analysis of Hashtags for Dynamic Topics
4.2.3 Co-occurrence Study of Impacting Hashtags
4.3 Event Filtering in Twitter using Hashtags
4.3.1 Topic Wiki Processor
4.3.2 Semantic Enrichment: A Weighted Concepts Representation of Hashtag
4.3.3 Hashtag Analyzer
4.3.3.1 Jaccard Co-efficient
4.3.3.2 Cosine Similarity
4.3.3.3 Weighted Subsumption Measure
4.4 Evaluation
4.4.1 Experimental Setup
4.4.2 Evaluation Results and Discussion
4.5 Conclusion
5 Scalable and Privacy-Aware Dissemination of Content
5.1 Overview
5.2 Background
5.2.1 Distributed Content Dissemination and PubSubHubbub
5.2.2 Semantics in Distributed Content Dissemination Platforms
5.2.3 WebID
5.2.4 PPO - The Privacy Preference Ontology
5.3 Extending PubSubHubbub for Privacy-Aware Content Dissemination
5.3.1 Motivations for Extending PuSH
5.3.2 PuSH extension
5.3.3 Distributed Social Graph
5.3.4 Generating Privacy Preference
5.3.5 Semantic Dissemination of Content
5.4 Implementation and Use Case in SMOB
5.4.1 SMOB Hub-User Initial Interaction
5.4.2 SMOB Followee - Publishing
5.4.3 SMOB Semantic Hub - Distribution
5.5 Adapting Semantic Hub for Social Data Filtering
5.5.1 Architecture
5.5.1.1 Semantic Filter
5.5.1.2 User Profile Generator
5.5.1.3 Semantic Hub
5.5.2 Implementation
...