Web Users' Activities Tracking Based on the Beacons Implementation

Total Page:16

File Type:pdf, Size:1020Kb

Web Users' Activities Tracking Based on the Beacons Implementation Web Users' Activities Tracking based on the Beacons Implementation Asyaev Grigorii, Medvedev Maxim, Mursalimov Ainur, Sinkov Anton South Ural State University 76, Lenin Prospekt, Chelyabinsk, Russian Federation Tel: 79617930538, E-mails: [email protected], [email protected], [email protected], [email protected] ABSTRACT The article considers the concept of web beacons, the purpose and principles of embedding web beacons into websites, the ways of the manual and automated detection of them and minimizing the effectiveness of tracking user’s activities in the Internet. The results of a study on the use of web beacons on the most popular websites are presented. Keywords: web, beacons, cookies, activity, tracking, user, privacy, advertisement, browser fingerprint 1. INTRODUCTION The interaction between the Internet users and website owners (in the broadest sense) is associated with a global age-old problem – on the one hand, the owners (particularly, the advertisers) declare that they want to constantly progress in understanding and predicting user interests and requests, on the other hand, users accuse the websites owners of collecting their personal information, which is essential for advertisers in order to show the content that depends on what pages the users visit. Tracking is important, for example, for social networks to exclude the need for users to authenticate once more when logging after starting another session, and online stores that collect information about user purchases, the current content of their virtual shopping cart and their preferences. Usually, the so-called cookies are used for this. There are 1st party cookies hosted on the domain where the user is located at the moment (which is needed to exclude the need to re-authenticate during a second session) and 3rd party cookies hosted on a third-party domain (used by advertisers to collect data on user’s actions and user's computer statistics). Users can disable the cookies (3rd party in most cases). This, in turn, is an obstacle for advertisers and malefactor who collect statistics. Thus, the creators of websites came to a simple but effective solution to the problem – the use of the so- called web beacons (clear GIFs, cleardots, web bugs [1], tracking pixels, tracking bugs etc.) allowing to monitor the activity of users on the network. In turn, malefactors that use web beacons can compromise the privacy of victims by sending them spam and phishing e-mails. Since the principles of operation of web beacons are based on the basic principles of HTTP functioning, and their implementation doesn’t 210 require any additional software and is done through the simplest HTML tags, user tracking technology being carried out by web beacons on the one hand is simple, on the other hand is very resistant to tracking itself by third-party software. The most important thing is that it is impossible to disable beacons like it is with cookies, which can lead to violations of the privacy of the Internet users, but it is possible to limit the functioning of web beacons. 2. DESCRIPTION OF THE WEB BEACON A web beacon is an image or an inline frame that is placed on a web page and, when loaded as the user visits this page or as an e-mail is opened, sends the information necessary for tracking user activities to the "owner" of the beacon [2]. The scenarios of using web beacons are most often limited by using them to collect statistics on website visits, to collect web analytics in order to optimize the display of content on the webpages, and to display targeted advertisements. Attackers are used to find out whether the e-mail address to which the beacon is sent is valid, for the purpose of sending to spam and phishing e-mails. When sending a GET request to the server, the address on which the request is sent can be converted by the server so as to transfer information about the user to the server (discussed later). Beacons most often work in conjunction with cookies: the server while sending responses to the web browser, adds a Set-Cookie header to it, and the web browser creates a cookie file while downloading the beacon [2]. 3. THE IMPLEMENTATION OF THE WEB BEACONS IN WEBPAGES Depending on the implementation method, web beacons can be classified as follows. 1. In the simplest case, the web beacon is an ordinary image in JPG, PNG, GIF format with a size of 1x1 pixels. This provides imperceptibility, complemented most often with setting the style attribute of style sheet language CSS for the HTML <img> tag, and also it provides beacon's fast loading that does not affect page load speed. In general, the format of such a web beacon in HTML looks as follows: <img src= "https://beaconurl.com/beacon.jpg" width="1" height="1">, or <img src= "https://beaconurl.com/beacon.png" style= "position:absolute; visibility:hidden">, or <img src= "https://beaconurl.com/beacon.gif" style= "display:none"> and so on. Being a simple picture, such a beacon is downloaded by the browser, and the server (1st party or 3rd party server, as it is with cookies) that posted the image on the page gets information that a web beacon has been downloaded to a computer having a certain IP address. Such beacons are imperceptible to the user, but theoretically it is not necessary – the beacon can be any image of any format and size. 211 Having limited functionality, web beacons-images are usually used to keep statistics of visiting different webpages by an individual user, in order to find out his interests by the contents of visited pages, because within a single session of work the user's IP address is constant. Figure 1 shows an example of a web beacon that is located on the https://www.theguardian.com/international page and sends data about the user software to a third party server: the version and type of OS, the location data (country, region), user's system language, as well as other information. The web beacons function in conjunction with cookies. Figure 1. Web beacon put in a GIF file collecting user’s software details, geolocation data and using cookies. Ironically, the information collected about the user is actually sent to the third-party server by the user himself : while the browser is downloading the web beacon it is also sending the GET request to the web beacon's URL. The information collected by the server is encoded in the URL's query string. For example: let the web beacon that is further downloaded by the browser be located at "https://beaconurl.com/beacon.gif". Accordingly, the GET request sent by the browser to the server will look, for example, like this: GET https://beaconurl.com/beacon.gif?devicetype=PC&datetime=20170912195151&ostype=microsoft&os mane=win10&screenresolution=1920x1080&country=en&region=chelyabinsk HTTP / 1.0 The parameters of the query string (stored after a "?" symbol) do not affect the display of the content and are ignored by the server, but the URL from the request is stored in the log files of the server for analysis. Here, after the "?" symbol, a structure is sent as parameters of the query string, which after being parsed by the server is converted to a list consisting of key-value pairs [3]. 2. Web beacons implemented as a link to an executable script. In the HTML language they look the same as web beacons-images, but the src attribute of the <img> tag contains the path to the executable script (for this type of beacon - the PHP script mostly). Usually, a different approach, called 301- redirect, is used. For example, let there be some server "beaconurl.com" and in the directory beaconurl.com/path the following files are stored: "beacon.jpg", "beacon.php". The HTML page displays the following: <img src = "https://beaconurl.com/path/beacon.jpg" width = "1" height = "1">. 212 As soon as the browser makes a request to the address "https://beaconurl.com/path/beacon.jpg", the request is redirected with the HTTP code 30x (x = 1,2,3,5 or 7) to the address "https://beaconurl.com/ path/beacon.php", which contains a script that allows the server to obtain the necessary information. The script is executed, in the address bar the user sees not the source link, but a link to the script. Such web beacons are not common and are replaced by more simple and reliable ones. 3. Iframe-beacons. They are the most "harmless" among all the others. They are to make sure that the user has viewed some content. Most of these beacons do not send information about the user to third- party servers. Used wherever a reliable collection of information is required. Web beacons-images are easily detected and disabled, since in most cases they are formatted in HTML-code according to some templates. To solve this problem, advertisers use Iframes inline frames [3]. The simplest examples of using Iframes are built-in web pages of audio/video players from third-party hosting websites. An inline frame is a certain area of a web page within which another webpage or its fragment is loaded, specified in the attributes of the <iframe> </iframe> paired tag. Navigation through this webpage (its fragment) is carried out regardless of navigation through the main webpage. A web beacon made in the form of the inline frame is an 1x1 pixel area imperceptible for the user, often with CSS style attributes that prevent it from being displayed, for example: <iframe src="https://beaconurl.com/beacon.html" style="display: none; width: 1px; height: 1px; opacity: 0;"></iframe> The beacon.html document contains executable JavaScript code.
Recommended publications
  • The Internet and Web Tracking
    Grand Valley State University ScholarWorks@GVSU Technical Library School of Computing and Information Systems 2020 The Internet and Web Tracking Tim Zabawa Grand Valley State University Follow this and additional works at: https://scholarworks.gvsu.edu/cistechlib ScholarWorks Citation Zabawa, Tim, "The Internet and Web Tracking" (2020). Technical Library. 355. https://scholarworks.gvsu.edu/cistechlib/355 This Project is brought to you for free and open access by the School of Computing and Information Systems at ScholarWorks@GVSU. It has been accepted for inclusion in Technical Library by an authorized administrator of ScholarWorks@GVSU. For more information, please contact [email protected]. Tim Zabawa 12/17/20 Capstone Project Cover Page: 1. Introduction 2 2. How we are tracked online 2 2.1 Cookies 3 2.2 Browser Fingerprinting 4 2.3 Web Beacons 6 3. Defenses Against Web Tracking 6 3.1 Cookies 7 3.2 Browser Fingerprinting 8 3.3 Web Beacons 9 4. Technological Examples 10 5. Why consumer data is sought after 27 6. Conclusion 28 7. References 30 2 1. Introduction: Can you remember the last time you didn’t visit at least one website throughout your day? For most people, the common response might be “I cannot”. Surfing the web has become such a mainstay in our day to day lives that a lot of us have a hard time imagining a world without it. What seems like an endless trove of data is right at our fingertips. On the surface, using the Internet seems to be a one-sided exchange of information. We, as users, request data from companies and use it as we deem fit.
    [Show full text]
  • Beacons and Their Uses for Digital Forensics Purposes
    Beacons and Their Uses for Digital Forensics Purposes An Investigation Prof. Martin Oliver Luke Lubbe Department of Computer Science Department of Computer Science University of Pretoria University of Pretoria [email protected] [email protected] Abstract— This article relates to the field of digital forensics websites which hosted the web bug and more importantly the with a particular focus on web (World Wide Web) beacons and IP address of the requesting browsers computer. Examples of how they can be utilized for digital forensic purposes. A web detected web beacons on a website are shown in Figure 1.0 beacon or more commonly “web bug” is an example of a hidden resource reference in a webpage, which when the webpage is The reason for the widespread presence of web beacons is loaded, is requested from a third party source. The purpose of a that they have proven to be a very useful method for tracking web beacon is to track the browsing habits of a particular IP user activity on the internet without impairing the users address. This paper proposes a novel technique that utilizes the browsing experience. The ability to track a user’s internet use presence of web beacons to create a unique ID for a website, to and habits has become invaluable for web analysis and test this a practical investigation is performed. The practical subsequently internet marketing. Web analysis is the act of investigation involves an automated scanning of web beacons on studying internet user’s behavior with the objective of a number of websites, this scanning process involves identifying identifying opportunities for improvements to the user which beacons are present on a web page and recording the experience or site performance.
    [Show full text]
  • Integrated Thesis-Sep 12
    DEMOGRAPHICS OF ADWARE AND SPYWARE Except where reference is made to the work of others, the work described in this thesis is my own or was done in collaboration with my advisory committee. This thesis does not include proprietary or classified information. _______________________________________________ Kavita Sanyasi Arumugam Certificate of Approval: _____________________________ _____________________________ Dean Hendrix David A Umphress, Chair Associate Professor Associate Professor Computer Science and Computer Science and Software Engineering Software Engineering _____________________________ _____________________________ Cheryl Seals George T. Flowers Assistant Professor Interim Dean Computer Science and Graduate School Software Engineering DEMOGRAPHICS OF ADWARE AND SPYWARE Kavita Arumugam A Thesis Submitted To the Graduate Faculty of Auburn University in Partial Fulfillment of the Requirements for the Degree of Master of Science Auburn, Alabama December 17, 2007 DEMOGRAPHICS OF ADWARE AND SPYWARE Kavita Arumugam Permission is granted to Auburn University to make copies of this thesis at its discretion, upon the request of individuals or institutions and at their expense. The author reserves all publication rights. ____________________________ Signature of Author ____________________________ Date of Graduation iii THESIS ABSTRACT DEMOGRAPHICS OF ADWARE AND SPYWARE Kavita Arumugam Master of Science, December 17, 2007 (B.E., Sir M Visveswaraya Institute of Technology, 2004) 60 Typed Pages Directed by David Umphress The World Wide Web is the most popular use of the Internet. Information can be accessed from this network of web pages. Unknown to users, web pages can access their personal information and, sometimes, also provide information that the user has not asked for. Various kinds of software are used in the web pages. Web pages use Java, Perl scripts, XML, etc.
    [Show full text]
  • RUGGEDCOM ROX V1.16 Device Management 3
    Preface Introduction 1 Using ROX 2 RUGGEDCOM ROX v1.16 Device Management 3 System Administration 4 Setup and Configuration 5 User Guide Upgrades 6 For RX1000, RX1000P, RX1100, RX1100P 3/2014 RC1098-EN-03 RUGGEDCOM ROX User Guide Copyright © 2014 Siemens AG All rights reserved. Dissemination or reproduction of this document, or evaluation and communication of its contents, is not authorized except where expressly permitted. Violations are liable for damages. All rights reserved, particularly for the purposes of patent application or trademark registration. This document contains proprietary information, which is protected by copyright. All rights are reserved. No part of this document may be photocopied, reproduced or translated to another language without the prior written consent of Siemens AG. Disclaimer Of Liability Siemens has verified the contents of this manual against the hardware and/or software described. However, deviations between the product and the documentation may exist. Siemens shall not be liable for any errors or omissions contained herein or for consequential damages in connection with the furnishing, performance, or use of this material. The information given in this document is reviewed regularly and any necessary corrections will be included in subsequent editions. We appreciate any suggested improvements. We reserve the right to make technical improvements without notice. Registered Trademarks ROX™, Rugged Operating System On Linux™, CrossBow™ and eLAN™ are trademarks of Siemens AG. ROS® is a registered trademark of Siemens AG. Linux® is the registered trademark of Linus Torvalds in the U.S. and other countries. The registered trademark Linux® is used pursuant to a sublicense from LMI, the exclusive licensee of Linus Torvalds, owner of the mark on a world-wide basis.
    [Show full text]
  • 1. What Is a Cookie?
    This Cookie Policy explains how cookies and similar technologies (collectively, “Cookie(s)”) are used when you visit our Site. A “Site” includes our websites, emails, and other applications owned and operated by The New York Times Company (the “Company”, “our”, or “us”) as well as any other services that display this Cookie Policy. This policy explains what these technologies are and why they are used, as well as your right to control their use. We may change this Cookie Policy at any time. Please take a look at the “LAST REVISED” date at the top of this page to see when this Cookie Policy was last revised. Any change in this Cookie Policy will become effective when we make the revised Cookie Policy available on or through the Site. If you have any question, please contact us by email at [email protected], or write to us at the following address: CyberReady, LLC, PO BOX 1180, Justin, TX 76247, attn.: Privacy Counsel. 1. What Is A Cookie? A cookie is a small text file (often including a unique identifier), that is sent to a user’s browser from a website's computers and stored on a user’s computer's hard drive or on a tablet or mobile device (collectively, “Computer”). A Cookie stores a small amount of data on your Computer about your visit to the Site. We may also use “web beacons” (also known as “clear GIFs” or “pixel tags”) or similar technologies on our Site to enable us to know whether you have visited a web page or received a message.
    [Show full text]
  • Poster: Web Beacon Based Detection of Data Leakage for Android
    Poster: Web Beacon Based Detection of Data Leakage for Android GyeongRyun Bae Hae Young Lee Dept. of Information Security DuDu IT Seoul Women’s University Seoul, Republic of Korea Seoul, Republic of Korea [email protected] [email protected] Abstract—This poster presents a web beacon based data supplied by Android. Fig. 2 shows an example of a web beacon leakage detection method, called B2-D2, in which data leakage that was injected into a short text message. Note that each carried out by spy apps can be detected by injecting web beacons beacon would have a unique name in the final release. into Android and tracing triggered beacons. Compared to the existing solutions, B2-D2 is very lightweight and does not require a system modification. Also, by injecting JavaScript tags instead of web beacons, B2-D2 would be able to exploit cross-site scripting vulnerabilities against attackers. Through preliminary experiments, we have confirmed that it works against many spy apps. Keywords—data leakage detection, mobile privacy, Android, web beacons I. INTRODUCTION Due to a dramatic increase of Android malware, especially spy apps, sensitive data theft from Android devices has become a major form of attack in recent years [1]. Although researchers have proposed a large number of solutions for detecting or preventing data leakage from Android devices, most of them are heavy, e.g., static and/or dynamic analysis based solutions [2,3], Fig. 1. Overview of B2-D2 and/or need system modifications [4,5]. In this poster, we propose a web beacon based data leakage detection method (B2-D2) for Android.
    [Show full text]
  • Uses Cookies and Similar Technologies in Order to Distinguish You from Other Users
    TERMS AND CONDITIONS BACKGROUND: This website www.cowinbuilding.co.uk (“Our Site”) uses cookies and similar technologies in order to distinguish you from other users. By using cookies, We are able to provide you with a better experience and to improve Our Site by better understanding how you use it. Please read this Cookie Policy carefully and ensure that you understand it. Your acceptance of Our Cookie Policy is deemed to occur if you continue using Our Site. If you do not agree to Our Cookie Policy, please stop using Our Site immediately. 1. Definitions and Interpretation 1. In this Cookie Policy, unless the context otherwise requires, the following expressions have the following meanings: “cookie” means a small file consisting of letters and numbers that Our Site downloads to your computer or device; “web beacon” means a small, transparent image file (usually only 1- pixel x 1-pixel in size) used for tracking user behaviour and activity around Our Site; “We/Us/Our” means Cowin Building Contractors Ltd [, a company registered in England under 4849635, whose registered address is 125 John Wilson Business park, Chestfield, Whitstable, Kent CT5 3RB and whose main trading address is] OR [of] Unit 29 The Oaks Business Park, Invicta Way, Manston, Kent, CT12 5FN. 2. How Does Our Site Use Cookies? 1. We may use cookies on Our Site for a number of reasons, all of which are designed to improve your experience of using it. Cookies allow you to navigate around Our Site better and enable Us to tailor and improve Our Site by saving your preferences and understanding your use of it.
    [Show full text]
  • Gmail Return Notice of Opening
    Gmail Return Notice Of Opening Libidinous Aleks uncanonising very severally while Ramesh remains caparisoned and lapstrake. Unprompted Hermy sometimes dialyses any chara entreats close-up. Fredrick mercurializes his mixes shield half-price or prayingly after Mario isomerizing and commutes gratis, splendrous and showerless. Why ensure patient management systems not assert limits for certain biometric data? You are absolutely right. Didn't Open to total sample of email IDs that money NOT opened your email Additionally whenever a recipient opens your email a notification. However, some email clients require the recipient to return a receipt manually. Want to arms about setting up email read receipts and animate to socket them? This is a handy function if you find yourself typing the same response over and over to different emails. Today have gmail returns to open your opens the tracking: opening the form of us on the tool for free email with the main message? And opened and tracking pixel in one of personal website. But this only happens when you actually click a link, not just open an email. TIME may receive compensation for some links to products and services on this website. Gmail add-on quickstart Docs Translate Form Notifications Slides Progress Bars Slides. They will have a method of correction either way so you can help prevent this from continuing. What floor do if you everything getting emails after unsubscribing? Receive notice of gmail inbox later time they opened, opening or later for? Everyone uses these notifications differently. Instead, the recipient you get a notification that trait have requested to fee the original message.
    [Show full text]
  • Privacy Overview 2
    CHAPTER Privacy Overview 2 RIVACY IS AN area of growing importance for people and organizations. This growth roughly corresponds with the growth of the internet, which has made it possible for peo- Pple to share all types of information very quickly. Organizations collect and use information to conduct business, governments collect and use information to provide services and security for their citizens, and individuals share information to get goods and services. People also share information to network while looking for jobs and to catch up with old friends. However, this in- creased collection of information comes with questions about proper information use. This chapter provides an overview of privacy issues. Because privacy is a very large field, it is impossible to discuss all of the unique and interesting issues in the field of privacy. For the most part, this chapter is limited to the areas where information technology and privacy meet. This chapter also will address general privacy concepts. Chapter 2 Topics This chapter covers the following topics and concepts: • Why privacy is an issue • What privacy is • How privacy is different from information security • What the sources of privacy law are • What threats to privacy there are in the Information Age • What workplace privacy is • What general principles for privacy protection exist in information systems Chapter 2 Goals When you complete this chapter, you will be able to: • Describe basic privacy principles • Explain the difference between information security and privacy © mirjanajovic/DigitalVision Vectors/Getty Images 31 © Jones & Bartlett Learning, LLC. NOT FOR SALE OR DISTRIBUTION. 32 PART I | Fundamental Concepts • Describe threats to privacy • Explain the important issues regarding workplace privacy • Describe the general principles for privacy protection in information systems Why Is Privacy an Issue? Advances in technology are forcing people to think about how their personal data is used and shared.
    [Show full text]
  • Glossary a - Basic Terms Related to the Internet
    Glossary A - Basic Terms Related to the Internet (Note: The source use for the definition is identified in parentheses following the definition.) Add-on, Plug-in, or Add-in - A software product designed to enhance another software product. It usually cannot be run independently. (multiple) Adware - Software that performs certain functions for advertisers, such as sending an ad to a specific website when it is being visited by the consumer that is being tracked by the adware. Adware may be installed on a computer as part of a bundle of software that a consumer purchases, or it may be embedded into a free download. (multiple) Analytics - The discovery, interpretation, and communication of meaningful patterns in data. Especially valuable in areas rich with recorded information, analytics relies on the simultaneous application of statistics, computer programming, and operations research to quantify performance. Firms may apply analytics to business data to describe, predict, and improve business performance. (Wikipedia) App - A Web application (Web app) is an application program that is stored on a remote server and delivered over the Internet through a browser interface. (TechTarget) Algorithm - In mathematics and computer science, an algorithm (e.g., Listeni/'ælg?r?ð?m/ AL-g?-ri-dh?m) is a self-contained step-by-step set of operational instructions to perform a calculation, data processing, and automated reasoning. A computer algorithm is basically an instance of logic written in software by software developers to be effective for the intended "target" computer to produce output from given input. (Wikipedia) Algorithms are used by the behavioral advertising industry to profile consumers.
    [Show full text]
  • Airport Terminal Beacons Recommended Practice
    Airport Terminal Beacons Recommended Practice Page | 1 1.0 Table of Contents 2.0 INTRODUCTION ........................................................................................ 4 3.0 BACKGROUND OF AIRPORT TERMINAL BEACONS ......................... 4 4.0 TECHNOLOGY DISCUSSION .................................................................. 6 4.1. What is an Airport Terminal Beacon? ............................................................................... 6 4.2. Building Beacon Business Models .................................................................................... 7 4.2.1. Introduction .......................................................................................................................... 7 4.2.2. Overview .............................................................................................................................. 7 4.2.3. Impact on Technology Deployment .................................................................................. 8 4.2.4. Building the Business Case ............................................................................................... 8 4.2.5. Options for Implementation ............................................................................................... 8 4.2.6. Recommendation ................................................................................................................ 8 4.2.7. Implementation Approach .................................................................................................. 9 4.3. Common Use
    [Show full text]
  • Lecture Notes
    Lecture #22: Privacy CS106E, Young This lecture we look at privacy issues. We begin by considering how computing technology has affected our personal privacy. As we discover, our movements and actions can be tracked much more easily in today’s society, and technology allows processing of that information much more easily as well. We consider laws related to privacy in the US and in Europe. Consumer privacy advocates sometimes point out that when using a service you are either a “customer or a product”. We consider what that means and whether or not we should be concerned with use of our data by commercial entities. As we discover, even if the commercial entity we are working with has the greatest intention to use our data in a reasonable manner, if their computer security is poor, that data may end up in public anyway. We look at tools and techniques used to attack or defend privacy including web beacons or tracking pixels, third party cookies, email tracking, and hiding one’s tracks online using the TOR network. We consider whether or not anonymity online is a good or bad thing, then look at some ways in which governments might use that information. China provides an example of how a government can use the power of computer technology to monitor and potentially modify citizens’ behavior. We look at their proposed Social Credit System, which aims to take all information on citizens and give them a score. We look at how pilot projects have used those scores to control access to government services or even to help with match making on dating websites.
    [Show full text]