On the Feasibility of Internet-Scale Author Identification
Total pages: 16
File type: PDF, Size: 1020 KB
Recommended publications
Bostonbarjournal
Boston Bar Journal, a publication of the Boston Bar Association. Volume 53, Number 1, January/February 2009.
In this issue: Web 2.0 and the Lost Generation; Web 2.0: What’s Evidence Between “Friends”?; The New E-Discovery Frontier — Seeking Facts in the Web 2.0 World (and Other Miscellany); Massachusetts Adopts Comprehensive Information Security Regulations; What Happens When the College Rumor Mill Goes Online?; The Future of Online Networking.
Officers of the Boston Bar Association: President, Kathy B. Weinman; President-Elect, John J. Regan; Vice President, Donald R. Frederico; Treasurer, Lisa C. Goodheart; Secretary, James D. Smeallie.
Members of the Council: Lisa M. Cukier, Damon P. Hart, Edward Notis-McConarty, Paul T. Dacier, Christine Hughes, Maureen A. O’Rourke, Anthony M. Doniger, Julia Huston, Laura S. Peabody, Bruce E. Falby, Kimberly Y. Jones, Mala M. Rafik, George P. Field, Wayne M. Kennard, Rebecca B. Ransom, Laurie Flynn, Grace H. Lee, Douglas B. Rosner, Lawrence M. Friedman, James D. Masterman, Charles E. Walker, Jr., Elizabeth Shea Fries, Wm. Shaw McDermott, Mark J. Warner, Randy M. Gioia.
Contents: President’s Page; BBJ Editorial Policy. Departments: Opening Statement, Web 2.0 and the Lost Generation, by Donald R. Frederico; First Principles, Web 2.0: What’s Evidence Between “Friends”?, by Seth P.
Quantitative Authorship Attribution: a History and an Evaluation of Techniques
QUANTITATIVE AUTHORSHIP ATTRIBUTION: A HISTORY AND AN EVALUATION OF TECHNIQUES. A thesis submitted in partial fulfillment of the requirements for the degree of Master of Arts in the Department of Linguistics. © Jack Grieve 2005. SIMON FRASER UNIVERSITY, Summer 2005. All rights reserved. This work may not be reproduced in whole or in part, by photocopy or other means, without the permission of the author.
Name: Jack William Grieve. Degree: Master of Arts. Title of Thesis: Quantitative Authorship Attribution: A History and an Evaluation of Techniques. Examining Committee: Dr. Zita McRobbie, Chair, Associate Professor, Department of Linguistics; Dr. Paul McFetridge, Senior Supervisor, Associate Professor, Department of Linguistics; Dr. Maria Teresa Taboada, Supervisor, Assistant Professor, Department of Linguistics; Dr. Fred Popowich, External Examiner, Professor, School of Computing Science. Date Defended:
SIMON FRASER UNIVERSITY PARTIAL COPYRIGHT LICENCE. The author, whose copyright is declared on the title page of this work, has granted to Simon Fraser University the right to lend this thesis, project or extended essay to users of the Simon Fraser University Library, and to make partial or single copies only for such users or in response to a request from the library of any other university, or other educational institution, on its own behalf or for one of its users. The author has further granted permission to Simon Fraser University to keep or make a digital copy for use in its circulating collection. The author has further agreed that permission for multiple copying of this work for scholarly purposes may be granted by either the author or the Dean of Graduate Studies. It is understood that copying or publication of this work for financial gain shall not be allowed without the author's written permission.
Stylometric Analysis of Parliamentary Speeches: Gender Dimension
Stylometric Analysis of Parliamentary Speeches: Gender Dimension. Justina Mandravickaitė (Vilnius University, Lithuania; Baltic Institute of Advanced Technology, Lithuania) and Tomas Krilavičius (Vytautas Magnus University, Lithuania; Baltic Institute of Advanced Technology, Lithuania).
Abstract: Relation between gender and language has been studied by many authors, however, there is still some uncertainty left regarding gender influence on language usage in the professional environment. Often, the studied data sets are too small or texts of individual authors are too short in order to capture differences of language usage wrt gender successfully. This study draws from a larger corpus of speeches transcripts of the Lithuanian Parliament (1990–2013) to explore language differences of political debates by gender via stylometric analysis. Experimental set up consists of stylistic features that indicate lexical style and do not require external linguistic tools, namely the most frequent words, in combination with unsupervised machine learning algorithms.
[...] to capture the differences in the language due to the gender (Newman et al., 2008; Herring and Martinson, 2004). Some results show that gender differences in language depend on the context, e.g., people assume male language in a formal setting and female in an informal environment (Pennebaker, 2011). We investigate gender impact to the language use in a professional setting, i.e., transcripts of speeches of the Lithuanian Parliament debates. We study language wrt style, i.e., male and female style of the language usage by applying computational stylistics or stylometry. Stylometry is based on the two hypotheses: (1) human stylome hypothesis, i.e., each individual has a unique style (Van Halteren et al., 2005); (2) unique style of individual can be measured (Stamatatos, 2009), stylometry allows gaining meta-knowledge (Daelemans, 2013), i.e., what can be learned from the text about the author - gender (Luyckx et al., 2006; Argamon et al.,
Revista Comunicação E Sociedade, Vol. 25
Comunicação e Sociedade, vol. 25, 2014, pp. 252–266. The (non)regulation of the blogosphere: the ethics of online debate. Elsa Costa e Silva, Centro de Estudos de Comunicação e Sociedade, Universidade do Minho, Instituto de Ciências Sociais, 4710-057 Braga, Portugal.
Abstract: New technologies have enabled innovative possibilities of communication, but have also imposed new upon matters of ethics, particularly in regards to platforms such as blogs. This article builds upon the reflections and experiences on this issue, in an attempt to map and analyse the ethical questions underlying the Portuguese blogosphere. In order to reflect on the possibility and opportunity to create a code of ethics for bloggers, attention is permanently paid to the political blogosphere.
Keywords: Code of conduct; values; political blogs; freedom of expression.
1. Introduction. The debate about ethics in communication (a theme that is perhaps as old as the first reflections on the human ability to interact with others) remains extremely dynamic in contemporary societies, fuelled by innovative technological capabilities that bring new challenges to this endless questioning. Ethics is a construction whose pillars oscillate between different perspectives (Fidalgo, 2007), ranging from a greater primacy of the freedom of the ‘I’, to a primacy of the collective responsibility, from the justification by the norms (to do that which is correct) to the justification by the purposes (the good to be achieved). Values and rules underlie ethics, providing a standard for human behaviour and forms of procedure. It is never a finished process, but the result of ongoing tensions and negotiations between interested members of the community (Christofoletti, 2011).
Detection of Translator Stylometry Using Pair-Wise Comparative Classification and Network Motif Mining
Detection of Translator Stylometry using Pair-wise Comparative Classification and Network Motif Mining. Heba Zaki Mohamed Abdallah El-Fiqi, M.Sc. (Computer Sci.) Cairo University, Egypt; B.Sc. (Computer Sci.) Zagazig University, Egypt. SCIENTIA MANU ET MENTE. A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the School of Engineering and Information Technology, University of New South Wales, Australian Defence Force Academy. © Copyright 2013 by Heba El-Fiqi.
Abstract: Stylometry is the study of the unique linguistic styles and writing behaviours of individuals. The identification of translator stylometry has many contributions in fields such as intellectual-property, education, and forensic linguistics. Despite the research proliferation on the wider research field of authorship attribution using computational linguistics techniques, the translator stylometry problem is more challenging and there is no sufficient machine learning literature on the topic. Some authors even claimed that detecting who translated a piece of text is a problem with no solution; a claim we will challenge in this thesis. In this thesis, we evaluated the use of existing lexical measures for the translator stylometry problem. It was found that vocabulary richness could not identify translator stylometry. This encouraged us to look for non-traditional representations to discover new features to unfold translator stylometry. Network motifs are small sub-graphs that aim at capturing the local structure of a real network. We designed an approach that transforms the text into a network then identifies the distinctive patterns of a translator by employing network motif mining.
3.3.5.Haulath.Pdf
CONTENTS (SL. NO., ARTICLE, AUTHOR)
1. DATA PRE-PROCESSING FOR EFFECTIVE INTRUSION DETECTION (RIYAD AM)
2. AN ENHANCED AUTHENTICATION SYSTEM USING MULTIMODAL BIOMETRICS (MOHAMED BASHEER. K.P, DR.T. ABDUL RAZAK)
6. GRADIENT FEATURE EXTRACTION USING FOR MALAYALAM PALM LEAF DOCUMENT IMAGE (GEENA K.P)
7. INTERNET ADDICTION (JESNA K)
8. VANETS AND ITS APPLICATION: PRESENT AND FUTURE (JISHA K)
9. DISTRIBUTED OPERATING SYSTEM AND AMOEBA (KHAIRUNNISA K)
5. INDIVIDUAL SOCIAL MEDIA USAGE POLICY: ORGANIZATION INFORMATION SECURITY THROUGH DATA MINING (REJEESH.E, MOHAMED JAMSHAD K, ANUPAMA M)
3. APPLICATION OF DATA MINING TECHNIQUES ON NETWORK SECURITY (O.JAMSHEELA)
4. SECURITY PRIVACY AND TRUST IN SOCIAL MEDIA (MS HAULATH K)
10. SECURITY AND PRIVACY ISSUES AND SOLUTIONS FOR WIRELESS SYSTEM NETWORKS (WSN) AND RFID (RESHMA M, SHABEER THIRUVAKALATHIL)
11. ARTIFICIAL INTELLIGENCE IN CYBER DEFENSE (SHAMEE AKTHAR. K. ASKARALI. K.T)
SECURITY PRIVACY AND TRUST IN SOCIAL MEDIA. Haulath K, Assistant professor, Department of Computer Science, EMEA College, Kondotty.
Abstract— Channels social interactions using extremely accessible and scalable publishing methods over the internet. Connecting individuals, communities, organization. Exchange of idea Sharing message and collaboration through security privacy and trust.
Classification of Social Media
1. [...]
2. Social Blogs. A blog (a truncation of the expression weblog) is a discussion or informational site published on the World Wide Web consisting of discrete entries (“posts”) typically displayed in reverse chronological order (the most recent post appears first). Until 2009, blogs were usually the work of a single individual, occasionally of a small group, and often covered a
Handbook for Bloggers and Cyber-Dissidents
HANDBOOK FOR BLOGGERS AND CYBER-DISSIDENTS. REPORTERS WITHOUT BORDERS, MARCH 2008. © 2008 Reporters Without Borders.
CONTENTS
BLOGGERS, A NEW SOURCE OF NEWS (Clothilde Le Coz)
WHAT’S A BLOG? (LeMondedublog.com)
THE LANGUAGE OF BLOGGING (LeMondedublog.com)
CHOOSING THE BEST TOOL (Cyril Fiévet, Marc-Olivier Peyer and LeMondedublog.com)
HOW TO SET UP AND RUN A BLOG (The Wordpress system)
WHAT ETHICS SHOULD BLOGGERS HAVE? (Dan Gillmor)
GETTING YOUR BLOG PICKED UP BY SEARCH-ENGINES (Olivier Andrieu)
WHAT REALLY MAKES A BLOG SHINE? (Mark Glaser)
PERSONAL ACCOUNTS
• SWITZERLAND: “” (Picidae)
• EGYPT: “When the line between journalist and activist disappears” (Wael Abbas)
• THAILAND: “The Web was not designed for bloggers” (Jotman)
HOW TO BLOG ANONYMOUSLY WITH WORDPRESS AND TOR (Ethan Zuckerman)
TECHNICAL WAYS TO GET ROUND CENSORSHIP (Nart Villeneuve)
ENSURING YOUR E-MAIL IS TRULY PRIVATE (Ludovic Pierrat)
THE 2008 GOLDEN SCISSORS OF CYBER-CENSORSHIP (Clothilde Le Coz)
INTRODUCTION: BLOGGERS, A NEW SOURCE OF NEWS. By Clothilde Le Coz. Bloggers cause anxiety. Governments are wary of these men and women, who are posting news, without being professional journalists. Worse, bloggers sometimes raise sensitive issues which the media, now known as "traditional", do not dare cover. Blogs have in some countries become a source of news in their own right. Nearly 120,000 blogs are created every day. Certainly the blogosphere is not just adorned by gems of courage and truth.
Identifying Idiolect in Forensic Authorship Attribution: an N-Gram Textbite Approach Alison Johnson & David Wright University of Leeds
Identifying idiolect in forensic authorship attribution: an n-gram textbite approach. Alison Johnson & David Wright, University of Leeds.
Abstract. Forensic authorship attribution is concerned with identifying authors of disputed or anonymous documents, which are potentially evidential in legal cases, through the analysis of linguistic clues left behind by writers. The forensic linguist “approaches this problem of questioned authorship from the theoretical position that every native speaker has their own distinct and individual version of the language [...], their own idiolect” (Coulthard, 2004: 31). However, given the difficulty in empirically substantiating a theory of idiolect, there is growing concern in the field that it remains too abstract to be of practical use (Kredens, 2002; Grant, 2010; Turell, 2010). Stylistic, corpus, and computational approaches to text, however, are able to identify repeated collocational patterns, or n-grams, two to six word chunks of language, similar to the popular notion of soundbites: small segments of no more than a few seconds of speech that journalists are able to recognise as having news value and which characterise the important moments of talk. The soundbite offers an intriguing parallel for authorship attribution studies, with the following question arising: looking at any set of texts by any author, is it possible to identify ‘n-gram textbites’, small textual segments that characterise that author’s writing, providing DNA-like chunks of identifying material? Drawing on a corpus of 63,000 emails and 2.5 million words written by 176 employees of the former American energy corporation Enron, a case study approach is adopted, first showing through stylistic analysis that one Enron employee repeatedly produces the same stylistic patterns of politely encoded directives in a way that may be considered habitual.
Social Media Identity Deception Detection: a Survey
Social Media Identity Deception Detection: A Survey. AHMED ALHARBI, HAI DONG, XUN YI, ZAHIR TARI, and IBRAHIM KHALIL, School of Computing Technologies, RMIT University, Australia.
Social media have been growing rapidly and become essential elements of many people’s lives. Meanwhile, social media have also come to be a popular source for identity deception. Many social media identity deception cases have arisen over the past few years. Recent studies have been conducted to prevent and detect identity deception. This survey analyses various identity deception attacks, which can be categorized into fake profile, identity theft and identity cloning. This survey provides a detailed review of social media identity deception detection techniques. It also identifies primary research challenges and issues in the existing detection techniques. This article is expected to benefit both researchers and social media providers.
CCS Concepts: • Security and privacy → Social engineering attacks; Social network security and privacy.
Additional Key Words and Phrases: identity deception, fake profile, Sybil, sockpuppet, social botnet, identity theft, identity cloning, detection techniques.
ACM Reference Format: Ahmed Alharbi, Hai Dong, Xun Yi, Zahir Tari, and Ibrahim Khalil. 2021. Social Media Identity Deception Detection: A Survey. ACM Comput. Surv. 37, 4, Article 111 (January 2021), 35 pages. https://doi.org/10.1145/1122445.1122456
1 INTRODUCTION. Social media are web-based communication tools that facilitate people with creating customized user profiles to interact and share information with each other. Social media have been growing rapidly in the last two decades and have become an integral part of many people’s lives. Users in various social media platforms totalled almost 3.8 billion at the start of 2020, with a total growth of 321 million worldwide in the past 12 months.
Learning Stylometric Representations for Authorship Analysis
Learning Stylometric Representations for Authorship Analysis. STEVEN H. H. DING, School of Information Studies, McGill University, Canada; BENJAMIN C. M. FUNG, School of Information Studies, McGill University, Canada; FARKHUND IQBAL, College of Technological Innovation, Zayed University, UAE; WILLIAM K. CHEUNG, Department of Computer Science, Hong Kong Baptist University, Hong Kong.
Authorship analysis (AA) is the study of unveiling the hidden properties of authors from a body of exponentially exploding textual data. It extracts an author’s identity and sociolinguistic characteristics based on the reflected writing styles in the text. It is an essential process for various areas, such as cybercrime investigation, psycholinguistics, political socialization, etc. However, most of the previous techniques critically depend on the manual feature engineering process. Consequently, the choice of feature set has been shown to be scenario- or dataset-dependent. In this paper, to mimic the human sentence composition process using a neural network approach, we propose to incorporate different categories of linguistic features into distributed representation of words in order to learn simultaneously the writing style representations based on unlabeled texts for authorship analysis. In particular, the proposed models allow topical, lexical, syntactical, and character-level feature vectors of each document to be extracted as stylometrics. We evaluate the performance of our approach on the problems of authorship characterization and authorship verification
Blog Rules: a Business Guide to Managing Policy, Public Relations, and Legal Issues Has You Covered
Other Books by Nancy Flynn: Instant Messaging Rules: A Business Guide to Managing Policies, Security, and Legal Issues for Safe IM Communication, Nancy Flynn (AMACOM); E-Mail Rules: A Business Guide to Managing Policies, Security, and Legal Issues for E-Mail and Digital Communication, Nancy Flynn and Randolph Kahn, Esq. (AMACOM); The ePolicy Handbook: Designing and Implementing Effective E-Mail, Internet, and Software Policies, Nancy Flynn (AMACOM); Writing Effective E-Mail: Improving Your Electronic Communication, Nancy Flynn and Tom Flynn (Crisp); E-Mail Management, Nancy Flynn (Thomson Learning).
Blog Rules: A Business Guide to Managing Policy, Public Relations, and Legal Issues. Nancy Flynn. American Management Association. New York • Atlanta • Brussels • Chicago • Mexico City • San Francisco • Shanghai • Tokyo • Toronto • Washington, D. C.
Special discounts on bulk quantities of AMACOM books are available to corporations, professional associations, and other organizations. For details, contact Special Sales Department, AMACOM, a division of American Management Association, 1601 Broadway, New York, NY 10019. Tel.: 212-903-8316. Fax: 212-903-8083. Web site: www.amacombooks.org
This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional service. If legal advice or other expert assistance is required, the services of a competent professional person should be sought. Although this book is designed to provide accurate and authoritative information in regard to the subject matter covered, it is sold with the understanding that the publisher and author are not engaged in rendering legal, regulatory, technology, or other professional service.
SOCIAL MEDIA USE and MISUSE: a GUIDE for PSYCHIATRISTS the Content of This Booklet (“Content”) Is for Informational Purposes Only
SOCIAL MEDIA USE AND MISUSE: A GUIDE FOR PSYCHIATRISTS
The content of this booklet (“Content”) is for informational purposes only. The Content is not intended to be a substitute for professional legal advice or judgment, or for other professional advice. Always seek the advice of your attorney with any questions you may have regarding the Content. Never disregard professional legal advice or delay in seeking it because of the Content. ©2015 Professional Risk Management Services, Inc. (PRMS). All rights reserved.
Contents: Introduction; Websites; Facebook; LinkedIn; Doximity, Sermo and Other Medically-Oriented Social Networking Communities; Blogging; Twitter; Listservs; Social Media in the Medical Office; Managing Your Online Reputation – Yes You Do Have One; Asking Patients to Rate You; Googling Patients; Physicians and Social Media – Further Guidance; Conclusion; Glossary.
INTRODUCTION
The way patients seek out healthcare information is changing. Whereas once patients would pick up the phone and call their doctors, today’s patients are more likely to look to the Internet for information. Psychiatrists are taking note of this change and are looking for guidance on using technology to enhance physician-patient communication and to provide patients with accurate and reliable information. Many are also recognizing that in order to be found by prospective patients, they need an online presence and are exploring options for achieving this. To help in this endeavor, in the following pages we will discuss various forms of social media and ways in which they may be used by you, in your practice, and in your care of patients.