Data Privacy on the Web
McMaster Software Freedom
March 3, 2020

Outline

1 Introduction
2 Data Scraping Demo
3 Theory
4 Tutorial
5 Wrapping Up

1 Introduction

The Presenters
- Sil Hamilton, 3rd Year English & Multimedia
- S.M Mukarram Nainar, 2nd Year Mathematics & Physics

McMaster Software Freedom
- Student group formed to promote software freedom and computer literacy on campus
- Bi-monthly drop-in meetings to discuss a wide variety of topics:
  - data privacy
  - operating systems
  - current affairs
  - programming
- See macswf.ca for more information and scheduled meetings

Privacy and Why It Matters
- Privacy is a complicated topic, and views on it depend mainly on personal politics
- Companies track a lot, but the steps required to counteract it are easy
- These steps do not need to negatively affect your experience!
- Information is power!
  - advertising is (surprisingly) effective
  - that should worry you

2 Data Scraping Demo

Panopticlick
- Go to https://panopticlick.eff.org/
  - note the items being gathered

3 Theory

Entropy
- A measure of information
- One bit of entropy cuts the number of possibilities in half
- 33 bits of entropy are enough to uniquely identify anyone globally
  - log2(7 billion) ≈ 32.7

The Web
How does the internet work?
- Servers
- Addresses & DNS
- HTTP & TLS
- JavaScript

Servers
- The Web follows a "client-server" model
- You are a client; everything you do runs through a server
- Servers are just other people's computers

Addresses & DNS
- The internet primarily runs on the IPv4 protocol, which is used to assign addresses to connected devices
- An address in this case means a unique series of numbers that differentiates devices
- Limited space: 2^32 possible addresses
- Addresses are rented out in blocks to countries, institutions, and companies, then rented to you (by your ISP)
- Your address is typically leased dynamically, but does not change often

HTTP & TLS
Base protocols
- HTTP is stateless
  - the protocol itself doesn't store information
  - however, both the client and the server can keep state: cookies, localStorage
- HTTP verbs
  - GET, POST, etc.
- TLS encrypts and authenticates the connection
  - covered in more detail in the next workshop

JavaScript
- Arbitrary code running on your client
- A huge risk, since you (usually) can't know what code does until you run it
- Blocking (at least some) JavaScript is the best way to avoid tracking
- It can also be quite heavy on your computer

Knowing Yourself
Moving on from the wider web: how do you fit in?
- IP Address
- Useragents
- Cookies & localStorage
- Referers
- Passwords
- Fonts & more

IP Address
- Your IP address is necessarily visible to everyone you connect to
- This means you have a consistent identity when surfing
- It is the primary method for tracking individuals over time
- Your ISP will keep logs of your activity
- VPNs and public networks may be used to mitigate this
  - Mullvad
  - NordVPN (possibly compromised)
  - ExpressVPN, etc.
- Public networks introduce their own security implications

Useragent
    Mozilla/5.0 (platform; rv:geckoversion) Gecko/geckotrail Firefox/firefoxversion
- A string read by websites to detect your browser version
- Sent by your web browser in the header of an HTTP request
- Contains information about your specifics
  - Compatibility
  - Rendering engine
  - Operating system
  - Browser
- Can be fairly unique depending on your context
- Mitigated by spoofing (e.g. Useragent Switcher)

Cookies & localStorage
- Cookies are the primary method for enabling persistence; they are manipulated with HTTP headers and used for:
  - login information
  - settings
- SameSite cookies (locality) are circumvented by advertising domains
- Partly superseded by localStorage (Web API)
  - accessed and modified via JS (client-side scripts)
  - meant to be read only by the client
  - no expiration date, but limited to under 10 MB
  - essentially the same as cookies from a tracking company's point of view
- Among other mitigations, Cookie AutoDelete is good

Referers
- The Referer HTTP header often contains the address of the site visited immediately prior
- This enables gathering information for analysis
- HTTPS sites will not pass the referer along to non-secured sites
- Danger crops up when websites receive referers that link to sensitive information
- The referer is shared with third-party sites even without leaving a page, e.g. CDNs
- A website can dictate a referrer-policy (two Rs!)
- Add-ons can delete the referer after the fact, e.g. uMatrix

Passwords

Passwords Continued
- Passwords do not need to be complicated (for us)!
  - Six "random" words with irregular capitalization and special characters are good enough
- Best practice is to have a unique password for each service you use (don't re-use them)
  - https://haveibeenpwned.com/
- Various convenient password managers exist
  - KeePassXC
  - Firefox Lockwise

Fonts & Other "Features"
- Websites can request the list of locally installed fonts with JS
- This leaks a lot of information (more entropy, etc.)

Do Not Track
- Essentially useless!
- Adds extra entropy

4 Tutorial

Content Blocking
- Most add-ons are a one-time install, no configuration necessary
- uMatrix: an excellent all-in-one filter & blocking tool

Pi-hole
- Useful tool for DNS filtering
- Demonstration

5 Wrapping Up

Final Notes
- Presentation slides will be available on our website
  - macswf.ca
- Thank you!
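
Appendix: a worked version of the entropy arithmetic

Returning to the Entropy slide, the arithmetic there can be sketched in a few lines of Python. This is a minimal illustration, not the method used by Panopticlick or any other tool. The function names and the trait frequencies are hypothetical and made up for this example, and adding the bits together assumes the traits are statistically independent, which real browser traits usually are not.

```python
import math


def bits_to_identify(population: int) -> float:
    """Bits of entropy needed to single out one member of a population."""
    return math.log2(population)


def surprisal_bits(share: float) -> float:
    """Bits revealed by a trait shared by the given fraction of users."""
    return -math.log2(share)


# Population figure used on the Entropy slide.
world_population = 7_000_000_000
needed = bits_to_identify(world_population)

# Hypothetical trait frequencies, invented purely for illustration.
trait_share = {
    "useragent": 1 / 2000,
    "timezone": 1 / 20,
    "installed_fonts": 1 / 50000,
}

# Summing per-trait bits assumes the traits are independent of each other.
revealed = sum(surprisal_bits(share) for share in trait_share.values())

print(f"bits needed to identify one person: {needed:.1f}")
print(f"bits revealed by the example traits: {revealed:.1f}")
print(f"bits still missing: {max(needed - revealed, 0.0):.1f}")
```

A trait shared by half of all users reveals exactly one bit, matching the "cuts the possibilities in half" rule. Rarer traits reveal more, which is why a handful of them can approach the roughly 33 bits needed to single out one person.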