A Proposed Technique for Tracing Origin of Spam on the Usenet

Total Page:16

File Type:pdf, Size:1020Kb

A Proposed Technique for Tracing Origin of Spam on the Usenet A proposed technique for tracing origin of spam on the Usenet by Dirk Bertels, BComp A dissertation submitted to the School of Computing in partial fulfillment of the requirements for the degree of Bachelor of Computing with Honours University of Tasmania June 2006 This thesis contains no material which has been accepted for the award of any other degree or diploma in any tertiary institution. To the candidate’s knowledge and belief, the thesis contains no material previously published or written by another person except where due reference is made in the text of the thesis. Signed Dirk Bertels Hobart, June 2006 Abstract The Usenet, a worldwide distributed decentralized conferencing system, is widely targeted by spammers who use a variety of techniques in order to obscure their identity. One of these techniques is called path preload, in which the path header is spoofed by means of attaching a false section at the beginning of this path. The process of detecting and confirming path preload is laborious and requires a thorough understanding of the Usenet. A technique which downloads a particular article from several servers, and compares their path headers is explored as to its usefulness regarding path preload detection. This document begins with a general background on the Usenet, highlighting those aspects that are relevant to the research, especially the topics of Usenet headers and spam. This leads to a description of the proposed technique and the development of a tool capable of implementing this technique. The tool essentially downloads a spam article from different servers, and analyses their headers. A map is constructed from the data gathered showing the servers that the articles traverse in order to reach their destiny. Since each article is downloaded from more than one location, some commonality may be found in these trajectories. If this commonality occurs at the beginning of the trajectory, the article is said to have a common path. Once a common path is established, a more heuristic approach is taken in order to establish some preliminary conclusions as to whether this common path is likely to be path preload or not. This approach requires some knowledge about various aspects of the Usenet and involves additional anti-spam techniques alongside the common path technique. Several sessions have been conducted and the results outlined, analysed, and discussed at the end of this document, followed by some thoughts on possible future advances and further improvements. Acknowledgements The author likes to thank the following people for their contribution: Prof. Christopher Lueg Supervisor Dr. Vishv Malhotra Co-supervisor Index 1. Introduction ...................................................................................................... 1 2. The Usenet – an overview ................................................................................ 4 2.1. Origin of the Usenet..................................................................................................4 2.2. Current state of the Usenet........................................................................................5 2.3. Usenet Addressing and Newsgroups.........................................................................6 Crossposting..............................................................................................................7 2.4. The Usenet Networking System................................................................................8 The UUCP network...................................................................................................8 NNTP - The News Network Transport Protocol .......................................................9 2.5. Request For Comments (RFC) standards..................................................................9 RFC2822: Standard for Interchange of Usenet messages. .......................................9 RFC 977: Network News Transfer Protocol...........................................................10 2.6. Usenet Structure and Propagation...........................................................................11 2.7. Current handling of Usenet traffic ..........................................................................12 2.8. Usenet Servers and Clients......................................................................................13 The INN Server software.........................................................................................13 NNTP clients - Usenet readers................................................................................14 3. Visualisation - Mapping various aspects of the Usenet............................... 15 Tree maps................................................................................................................16 Traffic flow maps ....................................................................................................17 Conversation maps..................................................................................................18 4. Spamming........................................................................................................ 19 4.1. History of spam.......................................................................................................19 4.2. Usenet Spam ...........................................................................................................20 Definition of Usenet spam.......................................................................................20 Crossposting............................................................................................................20 Excessive Crossposting - ECP ................................................................................20 Excessive multiposting (EMP) ................................................................................20 Breidbart Index .......................................................................................................21 4.3. Spam Cancel ...........................................................................................................22 4.4. Spam and anti-spam Techniques.............................................................................23 Spam traps...............................................................................................................24 4.5. Spam news servers statistics from 2006..................................................................24 4.6. Is the Spammer working from a server or a client? ................................................25 4.7. Report spam............................................................................................................ 25 5. Networking and Spamming Tools .................................................................26 5.1. Usenet Control Clients ........................................................................................... 27 5.2. Basic networking commands.................................................................................. 28 dig and nslookup .................................................................................................... 28 whois 29 traceroute ............................................................................................................... 29 5.3. Additional networking tools................................................................................... 30 Geolocation ............................................................................................................ 30 Server networking tools.......................................................................................... 31 6. Headers ............................................................................................................32 6.1. General discussion.................................................................................................. 32 6.2. Description of main headers................................................................................... 34 6.3. The ‘path’ header.................................................................................................... 35 The posted stamp.................................................................................................... 36 The mismatch stamp ............................................................................................... 36 6.4. The ‘nntp-posting-host’ header .............................................................................. 38 6.5. The x-trace header.................................................................................................. 38 6.6. The message-id header ........................................................................................... 39 6.7. Other Headers relevant to the research................................................................... 40 The xref header....................................................................................................... 40 The injection-info header ....................................................................................... 41 6.8. Headers of binary newsgroups ............................................................................... 41 7. Common Path and Path Preload...................................................................43 7.1. Definitions: Path preload, Common Path, Adjacent Server, and Boundary server.43 7.2. Why path preload is used ....................................................................................... 45 7.3. When is a common path ‘path preload’?...............................................................
Recommended publications
  • Maximum Internet Security: a Hackers Guide - Networking - Intrusion Detection
    - Maximum Internet Security: A Hackers Guide - Networking - Intrusion Detection Exact Phrase All Words Search Tips Maximum Internet Security: A Hackers Guide Author: Publishing Sams Web Price: $49.99 US Publisher: Sams Featured Author ISBN: 1575212684 Benoît Marchal Publication Date: 6/25/97 Pages: 928 Benoît Marchal Table of Contents runs Pineapplesoft, a Save to MyInformIT consulting company that specializes in Internet applications — Now more than ever, it is imperative that users be able to protect their system particularly e-commerce, from hackers trashing their Web sites or stealing information. Written by a XML, and Java. In 1997, reformed hacker, this comprehensive resource identifies security holes in Ben co-founded the common computer and network systems, allowing system administrators to XML/EDI Group, a think discover faults inherent within their network- and work toward a solution to tank that promotes the use those problems. of XML in e-commerce applications. Table of Contents I Setting the Stage 1 -Why Did I Write This Book? 2 -How This Book Will Help You Featured Book 3 -Hackers and Crackers Sams Teach 4 -Just Who Can Be Hacked, Anyway? Yourself Shell II Understanding the Terrain Programming in 5 -Is Security a Futile Endeavor? 24 Hours 6 -A Brief Primer on TCP/IP 7 -Birth of a Network: The Internet Take control of your 8 -Internet Warfare systems by harnessing the power of the shell. III Tools 9 -Scanners 10 -Password Crackers 11 -Trojans 12 -Sniffers 13 -Techniques to Hide One's Identity 14 -Destructive Devices IV Platforms
    [Show full text]
  • Trendmicro™ Hosted Email Security
    TrendMicro™ Hosted Email Security Best Practice Guide Trend Micro Incorporated reserves the right to make changes to this document and to the products described herein without notice. The names of companies, products, people, characters, and/or data mentioned herein are fictitious and are in no way intended to represent any real individual, company, product, or event, unless otherwise noted. Complying with all applicable copyright laws is the responsibility of the user. Copyright © 2016 Trend Micro Incorporated. All rights reserved. Trend Micro, the Trend Micro t-ball logo, and TrendLabs are trademarks or registered trademarks of Trend Micro, Incorporated. All other brand and product names may be trademarks or registered trademarks of their respective companies or organizations. No part of this publication may be reproduced, photocopied, stored in a retrieval system, or transmitted without the express prior written consent of Trend Micro Incorporated. Authors : Michael Mortiz, Jefferson Gonzaga Editorial : Jason Zhang Released : June 2016 Table of Contents 1 Best Practice Configurations ................................................................................................................................. 8 1.1 Activating a domain ....................................................................................................................................... 8 1.2 Adding Approved/Blocked Sender ................................................................................................................ 8 1.3 HES order
    [Show full text]
  • Memetic Proliferation and Fan Participation in the Simpsons
    THE UNIVERSITY OF HULL Craptacular Science and the Worst Audience Ever: Memetic Proliferation and Fan Participation in The Simpsons being a Thesis submitted for the Degree of PhD Film Studies in the University of Hull by Jemma Diane Gilboy, BFA, BA (Hons) (University of Regina), MScRes (University of Edinburgh) April 2016 Craptacular Science and the Worst Audience Ever: Memetic Proliferation and Fan Participation in The Simpsons by Jemma D. Gilboy University of Hull 201108684 Abstract (Thesis Summary) The objective of this thesis is to establish meme theory as an analytical paradigm within the fields of screen and fan studies. Meme theory is an emerging framework founded upon the broad concept of a “meme”, a unit of culture that, if successful, proliferates among a given group of people. Created as a cultural analogue to genetics, memetics has developed into a cultural theory and, as the concept of memes is increasingly applied to online behaviours and activities, its relevance to the area of media studies materialises. The landscapes of media production and spectatorship are in constant fluctuation in response to rapid technological progress. The internet provides global citizens with unprecedented access to media texts (and their producers), information, and other individuals and collectives who share similar knowledge and interests. The unprecedented speed with (and extent to) which information and media content spread among individuals and communities warrants the consideration of a modern analytical paradigm that can accommodate and keep up with developments. Meme theory fills this gap as it is compatible with existing frameworks and offers researchers a new perspective on the factors driving the popularity and spread (or lack of popular engagement with) a given media text and its audience.
    [Show full text]
  • Newscache – a High Performance Cache Implementation for Usenet News 
    THE ADVANCED COMPUTING SYSTEMS ASSOCIATION The following paper was originally published in the Proceedings of the USENIX Annual Technical Conference Monterey, California, USA, June 6-11, 1999 NewsCache – A High Performance Cache Implementation for Usenet News _ _ _ Thomas Gschwind and Manfred Hauswirth Technische Universität Wien © 1999 by The USENIX Association All Rights Reserved Rights to individual papers remain with the author or the author's employer. Permission is granted for noncommercial reproduction of the work for educational or research purposes. This copyright notice must be included in the reproduced paper. USENIX acknowledges all trademarks herein. For more information about the USENIX Association: Phone: 1 510 528 8649 FAX: 1 510 548 5738 Email: [email protected] WWW: http://www.usenix.org NewsCache – A High Performance Cache Implementation for Usenet News Thomas Gschwind Manfred Hauswirth g ftom,M.Hauswirth @infosys.tuwien.ac.at Distributed Systems Group Technische Universitat¨ Wien Argentinierstraße 8/E1841 A-1040 Wien, Austria, Europe Abstract and thus provided to its clients are defined by the news server’s administrator. Usenet News is reaching its limits as current traffic strains the available infrastructure. News data volume The world-wide set of cooperating news servers makes increases steadily and competition with other Internet up the distribution infrastructure of the News system. services has intensified. Consequently bandwidth re- Articles are distributed among news servers using the quirements are often beyond that provided by typical Network News Transfer Protocol (NNTP) which is de- links and the processing power needed exceeds a sin- fined in RFC977 [2]. In recent years several exten- gle system’s capabilities.
    [Show full text]
  • Usenet News HOWTO
    Usenet News HOWTO Shuvam Misra (usenet at starcomsoftware dot com) Revision History Revision 2.1 2002−08−20 Revised by: sm New sections on Security and Software History, lots of other small additions and cleanup Revision 2.0 2002−07−30 Revised by: sm Rewritten by new authors at Starcom Software Revision 1.4 1995−11−29 Revised by: vs Original document; authored by Vince Skahan. Usenet News HOWTO Table of Contents 1. What is the Usenet?........................................................................................................................................1 1.1. Discussion groups.............................................................................................................................1 1.2. How it works, loosely speaking........................................................................................................1 1.3. About sizes, volumes, and so on.......................................................................................................2 2. Principles of Operation...................................................................................................................................4 2.1. Newsgroups and articles...................................................................................................................4 2.2. Of readers and servers.......................................................................................................................6 2.3. Newsfeeds.........................................................................................................................................6
    [Show full text]
  • NBAR2 Standard Protocol Pack 1.0
    NBAR2 Standard Protocol Pack 1.0 Americas Headquarters Cisco Systems, Inc. 170 West Tasman Drive San Jose, CA 95134-1706 USA http://www.cisco.com Tel: 408 526-4000 800 553-NETS (6387) Fax: 408 527-0883 © 2013 Cisco Systems, Inc. All rights reserved. CONTENTS CHAPTER 1 Release Notes for NBAR2 Standard Protocol Pack 1.0 1 CHAPTER 2 BGP 3 BITTORRENT 6 CITRIX 7 DHCP 8 DIRECTCONNECT 9 DNS 10 EDONKEY 11 EGP 12 EIGRP 13 EXCHANGE 14 FASTTRACK 15 FINGER 16 FTP 17 GNUTELLA 18 GOPHER 19 GRE 20 H323 21 HTTP 22 ICMP 23 IMAP 24 IPINIP 25 IPV6-ICMP 26 IRC 27 KAZAA2 28 KERBEROS 29 L2TP 30 NBAR2 Standard Protocol Pack 1.0 iii Contents LDAP 31 MGCP 32 NETBIOS 33 NETSHOW 34 NFS 35 NNTP 36 NOTES 37 NTP 38 OSPF 39 POP3 40 PPTP 41 PRINTER 42 RIP 43 RTCP 44 RTP 45 RTSP 46 SAP 47 SECURE-FTP 48 SECURE-HTTP 49 SECURE-IMAP 50 SECURE-IRC 51 SECURE-LDAP 52 SECURE-NNTP 53 SECURE-POP3 54 SECURE-TELNET 55 SIP 56 SKINNY 57 SKYPE 58 SMTP 59 SNMP 60 SOCKS 61 SQLNET 62 SQLSERVER 63 SSH 64 STREAMWORK 65 NBAR2 Standard Protocol Pack 1.0 iv Contents SUNRPC 66 SYSLOG 67 TELNET 68 TFTP 69 VDOLIVE 70 WINMX 71 NBAR2 Standard Protocol Pack 1.0 v Contents NBAR2 Standard Protocol Pack 1.0 vi CHAPTER 1 Release Notes for NBAR2 Standard Protocol Pack 1.0 NBAR2 Standard Protocol Pack Overview The Network Based Application Recognition (NBAR2) Standard Protocol Pack 1.0 is provided as the base protocol pack with an unlicensed Cisco image on a device.
    [Show full text]
  • The Internet Is a Semicommons
    GRIMMELMANN_10_04_29_APPROVED_PAGINATED 4/29/2010 11:26 PM THE INTERNET IS A SEMICOMMONS James Grimmelmann* I. INTRODUCTION As my contribution to this Symposium on David Post’s In Search of Jefferson’s Moose1 and Jonathan Zittrain’s The Future of the Internet,2 I’d like to take up a question with which both books are obsessed: what makes the Internet work? Post’s answer is that the Internet is uniquely Jeffersonian; it embodies a civic ideal of bottom-up democracy3 and an intellectual ideal of generous curiosity.4 Zittrain’s answer is that the Internet is uniquely generative; it enables its users to experiment with new uses and then share their innovations with each other.5 Both books tell a story about how the combination of individual freedom and a cooperative ethos have driven the Internet’s astonishing growth. In that spirit, I’d like to suggest a third reason that the Internet works: it gets the property boundaries right. Specifically, I see the Internet as a particularly striking example of what property theorist Henry Smith has named a semicommons.6 It mixes private property in individual computers and network links with a commons in the communications that flow * Associate Professor, New York Law School. My thanks for their comments to Jack Balkin, Shyam Balganesh, Aislinn Black, Anne Chen, Matt Haughey, Amy Kapczynski, David Krinsky, Jonathon Penney, Chris Riley, Henry Smith, Jessamyn West, and Steven Wu. I presented earlier versions of this essay at the Commons Theory Workshop for Young Scholars (Max Planck Institute for the Study of Collective Goods), the 2007 IP Scholars conference, the 2007 Telecommunications Policy Research Conference, and the December 2009 Symposium at Fordham Law School on David Post’s and Jonathan Zittrain’s books.
    [Show full text]
  • The Design, Implementation and Operation of an Email Pseudonym Server
    The Design, Implementation and Operation of an Email Pseudonym Server David Mazieres` and M. Frans Kaashoek MIT Laboratory for Computer Science 545 Technology Square, Cambridge MA 02139 Abstract Attacks on servers that provide anonymity generally fall into two categories: attempts to expose anonymous users and attempts to silence them. Much existing work concentrates on withstanding the former, but the threat of the latter is equally real. One particularly effective attack against anonymous servers is to abuse them and stir up enough trouble that they must shut down. This paper describes the design, implementation, and operation of nym.alias.net, a server providing untraceable email aliases. We enumerate many kinds of abuse the system has weath- ered during two years of operation, and explain the measures we enacted in response. From our experiences, we distill several principles by which one can protect anonymous servers from similar attacks. 1 Introduction Anonymous on-line speech serves many purposes ranging from fighting oppressive government censorship to giving university professors feedback on teaching. Of course, the availability of anonymous speech also leads to many forms of abuse, including harassment, mail bombing and even bulk emailing. Servers providing anonymity are particularly vulnerable to flooding and denial-of-service attacks. Concerns for the privacy of legitimate users make it impractical to keep usage logs. Even with logs, the very design of an anonymous service generally makes it difficult to track down attackers. Worse yet, attempts to block problematic messages with manually-tuned filters can easily evolve into censorship—people unhappy with anonymous users will purposefully abuse a server if by doing so they can get legitimate messages filtered.
    [Show full text]
  • Technical and Legal Analysis of Comcast's Network Management
    Technical and Legal Analysis of Comcast’s Network Management Practices by Satish Sunder Rajan Gopal B.Tech., Pondicherry University, 2009 A thesis submitted to the Faculty of the Graduate School of the University of Colorado in partial fulfillment of the requirement for the degree of Master of Science Department of Interdisciplinary Telecommunications 2011 This thesis entitled: Technical and Legal Analysis of Comcast’s Network Management Practices written by Satish Sunder Rajan Gopal has been approved for the Interdisciplinary Telecommunication Program Paul Ohm Dale Hatfield Preston Padden Date The final copy of this thesis has been examined by the signatories, and we Find that both the content and the form meet acceptable presentation standards Of scholarly work in the above mentioned discipline. Satish Sunder Rajan Gopal (M.S, Telecommunications) Technical and Legal Analysis of Comcast’s Network Management Practices Thesis directed by associate professor Paul Ohm Comcast took a controversial decision by targeting P2P (peer to peer) specific protocols to control congestion in upstream traffic over its network. In 2008, FCC required Comcast to stop and reveal details of their current network management practices that violated Network Neutrality obligations. Many concluded that Comcast’s actions were against Internet Engineering Task Force (IETF) standard. However, rules of network neutrality, a policy statement architected by FCC cannot be enforced. And IETF, a standards body does not control Comcast’s actions. This research focuses on a hypothetical infringement suit which shows how an internet service provider could be liable for infringement, when it deviates from IETF’s protocol standards and while controlling copyrighted material over its network.
    [Show full text]
  • Newsreaders.Com: Getting Started: Netiquette
    Usenet Netiquette NewsReaders.com: Getting Started: Netiquette I hope you will try Giganews or UsenetServer for your news subscription -- especially as a holiday present for your friends, family, or yourself! Getting Started (Netiquette) Consult various Netiquette links The Seven Don'ts of Usenet Google Groups Posting Style Guide Introduction to Usenet and Netiquette from NewsWatcher manual Quoting Style by Richard Kettlewell Other languages: o Voila! Francois. o Usenet News German o Netiquette auf Deutsch! Some notes from me: Read a newsgroup for a while before posting o By reading the group without posting (known as "lurking") for a while, or by reading the group archives on Google, you get a sense of the scope and tone of the group. Consult the FAQ before posting. o See Newsgroups page to find the FAQ for a particular group o FAQ stands for Frequently Asked Questions o a FAQ has not only the questions, but the answers! o Thus by reading the FAQ you don't post questions that have been asked (and answered) ad infinitum See the configuring software section regarding test posts (only post tests to groups with name ending in ".test") and not posting a duplicate post in the mistaken belief that your initial post did not go through. Don't do "drive through" posts. o Frequently someone who has an urgent question will post to a group he/she does not usually read, and then add something like "Please reply to me directly since I don't normally read this group." o Newsgroups are meant to be a shared experience.
    [Show full text]
  • The Good Samaritan Exemption and the Cda
    THE GOOD SAMARITAN EXEMPTION AND THE CDA Excerpted from Chapter 37 (Defamation and Torts) of E-Commerce and Internet Law: A Legal Treatise With Forms, Second Edition, a 4-volume legal treatise by Ian C. Ballon (Thomson/West Publishing 2015) SANTA CLARA UNIVERSITY LAW PRESENTS “HOT TOPICS IN INTERNET, CLOUD, AND PRIVACY LAW” SANTA CLARA UNIVERSITY LAW SCHOOL APRIL 23, 2015 Ian C. Ballon Greenberg Traurig, LLP Silicon Valley: Los Angeles: 1900 University Avenue, 5th Fl. 1840 Century Park East, Ste. 1900 East Palo Alto, CA 914303 Los Angeles, CA 90067 Direct Dial: (650) 289-7881 Direct Dial: (310) 586-6575 Direct Fax: (650) 462-7881 Direct Fax: (310) 586-0575 [email protected] <www.ianballon.net> Google+, LinkedIn, Twitter, Facebook: IanBallon This paper has been excerpted from E-Commerce and Internet Law: Treatise with Forms 2d Edition (Thomson West 2015 Annual Update), a 4-volume legal treatise by Ian C. Ballon, published by West LegalWorks Publishing, 395 Hudson Street, New York, NY 10014, (212) 337-8443, www.ianballon.net. Ian C. Ballon Silicon Valley 1900 University Avenue Shareholder 5th Floor Internet, Intellectual Property & Technology Litigation East Palo Alto, CA 94303 T 650.289.7881 Admitted: California, District of Columbia and Maryland F 650.462.7881 JD, LLM, CIPP Los Angeles 1840 Century Park East [email protected] Los Angeles, CA 90067 Google+, LinkedIn, Twitter, Facebook: Ian Ballon T 310.586.6575 F 310.586.0575 Ian Ballon represents Internet, technology, and entertainment companies in copyright, intellectual property and Internet litigation, including the defense of privacy and behavioral advertising class action suits.
    [Show full text]
  • Design of a Machine Learning Based Predictive Analytics System for Spam Problem A.S
    Vol. 132 (2017) ACTA PHYSICA POLONICA A No. 3 Special issue of the 3rd International Conference on Computational and Experimental Science and Engineering (ICCESEN 2016) Design of a Machine Learning Based Predictive Analytics System for Spam Problem A.S. Yüksela;∗, Ş.F. Çankayaa and İ.S. Üncüb aSüleyman Demirel University, Faculty of Engineering, Department of Computer Engineering, Isparta, Turkey bSüleyman Demirel University, Faculty of Technology, Department of Electrical and Electronic Engineering, Isparta, Turkey Spamming is the act of abusing an electronic messaging system by sending unsolicited bulk messages. Filtering of these messages is merely another line of defence and does not prevent spam messages from circulating in email systems. This problem causes users to distrust email systems, suspect even legitimate emails and leads to substantial investment in technologies to counter the spam problem. Spammers threaten users by abusing the lack of accountability and verification features of communicating entities. To contribute to the fight against spamming, a cloud-based system that analyses the email server logs and uses predictive analytics with machine learning to build trust identities that model the email messaging behavior of spamming and legitimate servers has been designed. The system constructs trust models for servers, updating them regularly to tune the models. This study proposed that this approach will not only minimize the circulation of spam in email messaging systems, but will also be a novel step in the direction of trust identities and accountability in email infrastructure. DOI: 10.12693/APhysPolA.132.500 PACS/topics: spam, predictive analytics, machine learning, trust identities 1. Introduction viable because spammers can manage their mailing lists at a low cost.
    [Show full text]