Web-browsing history
As seen from the client side
Michael Sonntag
Institute of Networks and Security

ELEMENTS OF WEB-BROWSING HISTORY
History
  The list of URLs visited (+ at which time, intentionality…)
  Provides general information on time and location of activity
  URLs may also contain form information: GET request parameters
  Might be very extensive, as a single page may consist of very many independent elements = URLs
    Pictures, ads, JavaScript, CSS, iframes, redirections…
    Depends on how “History” is interpreted/stored by the browser
Cookies
  Which websites were visited when + additional information
  May allow determining whether the user was logged in
  Can survive much longer than the history
    Depends on the expiry date of the cookie and the configuration
  Theoretically: The collection of third-party cookies can determine the site
  Practically: Too much overlap; too many cookies
ELEMENTS OF WEB-BROWSING HISTORY
Cache
  The content of the pages visited
  Incomplete: E.g. ads will rarely be cached (“no-cache” headers)
  Provides the full content of what was seen, e.g. (in theory!) webmail content
    More exactly: What was delivered by the server to the client and was marked as “cacheable”
  Mostly not that interesting, as the relevant parts (e.g. the actual Facebook content, the chat messages etc.) are nowadays “dynamic” and therefore no longer cached
  What you get is the “standard border” every page has plus the icons, stylesheets, scripts, fonts etc.
INTENTIONALITY
Did the user visit the webpage intentionally?
  In general: If it is in the cache/history/cookie file: Yes
    See also: Bookmarks!
BUT: What about pop-ups?
  E.g.: Pornography ads (no one views such content intentionally)!
Password-protected pages?
  Images/JavaScript can supply passwords too when opening a URL!
Plugins, prefetching…?
Investigating other files, trying it out, inspecting the content etc. is needed to verify whether a page that was visited was actually intended to be visited (and whether the content was expected!)
Usually this should not be a problem:
  Logging in to the mail account
  Visiting a website after entering log-in data (username + password)
  Downloading files
INCREASING PROBLEM: DYNAMIC PAGES
Web browsing history works well for “a page”
  Server sends something, browser displays it, user clicks on a link
But today content is often retrieved dynamically by JavaScript
  One “page” with lots of different content over time
Problems:
  JavaScript requests are rarely cached
  JavaScript requests do not show up in the browser history
  The “page” might be completely empty at the start
“Repeating” works less well
  Many pages change very frequently
    Not every few weeks, but every few minutes/seconds!
  The “page” might be the same, but the content (see above) not
  Content may be adjusted to the viewer, i.e. dependent on language, browser, previous browsing (cookies!) etc.
WEB BROWSING PROCEDURE (classical!)
User enters the URL
Browser determines the IP address for the host part
Browser connects to the IP address (+ port if specified)
Sends request
  With additional information, e.g. what compression is allowed
  May contain cookie(s)
Retrieves response
  Headers and actual content
  Header may contain cookie
  Saved to memory (and perhaps to disk in the cache file)
    Depends on headers, settings etc.
Connection is closed
  Note: HTTP 1.1 typically keeps the connection open for further requests (incl. pipelining). This is especially useful for images, stylesheets, scripts etc. from the same site!
THE HTTP PROTOCOL
Basis of HTTP is a reliable stream protocol (usually TCP)
The HTTP state diagram is very simple
  With some exceptions, e.g. authorization
  There is only a single request and a single response
HTTP request methods:
  GET: Retrieve some content
    Should never change the state on the server!
      Especially important if caching takes place somewhere
    Parameters (optional) are encoded in the URL
  POST: Send data for processing and retrieve result
    To be used for requests changing the server state!
    Parameters are sent in the request body
  HEAD, PUT, DELETE, TRACE, OPTIONS, CONNECT…
    Of less importance and (very) rare
    HEAD (check freshness), PUT (file upload)
    Forget the rest (or: suspicious if present, so investigate!)
THE HTTP PROTOCOL
The response always includes a status code
  1xx Informational
  2xx Success
  3xx Redirection (request should be sent again differently)
  4xx Client-side error (e.g. incorrect request, resource does not exist)
  5xx Server-side error (should not be retried)
Caching of HTTP: Commonly performed through proxies
  A cached copy must either be validated with the source (HEAD)
  Or it is "fresh enough" according to client, server, and cache
Note: Browsers often ignore this
  E.g. IE can be configured to never check for a newer version even if the cached page has already expired!
  This has no influence on what proxies on the transmission path do!
THE HTTP PROTOCOL
Local (=browser) caches
  If a page has expired, it is not necessarily deleted from the local cache
    It might remain there for much longer
  Can keep even pages marked as "no-cache" and "no-store"
    "no-cache": Should not be reused for future requests without revalidation
      But might still be written to disk (e.g. Firefox)
    "no-store": Should only be held in memory
      Users are still allowed to use "Save As"!
  All dynamically generated pages are typically marked as expiring in the past, no-cache etc.
    Why? Because previously there was no specification, so browsers did different things, and servers provided headers for every browser make and version to be on the safe side!
    Such pages must normally be generated anew for every visitor (=no caches on transmission path) and every visit (=no caching in browser)
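The caching directives above can be turned into a rough triage helper for response headers found in a cache or a capture. The following is a minimal sketch, not a model of any particular browser: the function names and the returned labels are made up for illustration, and the sample values are the `cache-control` and `expires` headers from the Google response example in these slides.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def disk_cache_expectation(headers):
    """Rough guess at what a well-behaved browser cache would do.

    `headers` is a dict with lower-case header names. The labels are
    illustrative only; real browsers apply more rules and, as noted on
    the slide, do not always follow them.
    """
    cc = parse_cache_control(headers.get("cache-control", ""))
    if "no-store" in cc:
        return "memory only (no-store)"
    if "no-cache" in cc:
        return "may be stored, must be revalidated before reuse (no-cache)"
    if cc.get("max-age") == "0" or headers.get("expires") in ("0", "-1"):
        return "may be stored, already expired on arrival"
    return "cacheable"


google_response = {"cache-control": "private, max-age=0", "expires": "-1"}
print(disk_cache_expectation(google_response))
```

For forensic purposes the interesting cases are the first three: finding such a response on disk anyway says something about the browser, not about the server's intent.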
THE HTTP PROTOCOL
Local (=browser) caches
  This cache can be very large and contain very old files
    Important for computer forensics!
  Manual deletion or cleaner programs are simple and effective
    But must be used every time after surfing
  Attention: Many such programs just delete the files; only the more serious ones overwrite them securely!
    Deleted by the browser before cleaning: no overwriting!
    Also, fragments of files might remain in unused areas, so all free sectors and slack spaces would have to be cleaned every time!
  See also swap file/partition, hibernation file (URLs)
  See also “Private browsing mode”
THE HTTP PROTOCOL EXAMPLE: HTTPS://WWW.GOOGLE.AT/
Request:
  GET / HTTP/1.1
  Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
  Accept-Encoding: gzip, deflate, br
  Accept-Language: de,en-US;q=0.7,en;q=0.3
  Cache-Control: max-age=0
  Connection: keep-alive
  Cookie: NID=119=mg3iUhrNJHA5CfE6CG5KVd…017-12-14-7; OGPC=5061821-25:
  DNT: 1
  Host: www.google.at
  Upgrade-Insecure-Requests: 1
  User-Agent: Mozilla/5.0 (Windows NT 6.1; W…) Gecko/20100101 Firefox/57.0
THE HTTP PROTOCOL EXAMPLE: HTTPS://WWW.GOOGLE.AT/
Response:
  HTTP/1.1 200 OK
  alt-svc: hq=":443"; ma=2592000; quic=51303431; quic=51303339; quic=51303338; quic=51303337; quic=51303335,quic=":443"; ma=2592000; v="41,39,38,37,35"
  cache-control: private, max-age=0
  content-encoding: br
  content-type: text/html; charset=UTF-8
  date: Thu, 14 Dec 2017 07:32:10 GMT
  expires: -1
  p3p: CP="This is not a P3P policy! See g.co/p3phelp for more info."
  server: gws
  set-cookie:
Annotations:
  cache-control: private, max-age=0 means “Do not cache this response”
  expires: -1 refers to the page, not the cookie!
HTTP/2
The new(er) protocol introduces a few changes
  Data compression: Not really a problem
    The content stays the same; compression was negotiable before too
    Mostly an issue for wiretapping
  Request pipelining: More pages over one connection
    Was also part of HTTP/1.1
    Wiretapping issue
  Multiplexing requests: Concurrent/interleaved requests possible
    Similar to pipelining: Wiretapping issue
  Server push: Problem for intentionality
    The server can send you data before you requested it, because you are going to need it anyway (e.g. stylesheets, scripts)
    This can be “repurposed” to send you arbitrary data
      Which might be cached (even if not shown, because it is not needed!)
COOKIES
What is a "cookie"?
  Small (max. 4 kB) text file with information
  Originates from the server
  Stored locally
  Transmitted back to the server on "matching" requests
Content (with exemplary data):
  Name: "session-id"
  Value: "303-1195544-4348244"
  Domain: ".amazon.de"
  Website path: "/"
    Sent to all requests ("/") of subdomains of ".amazon.de"
  Expiry date and time: 15.10.2007, 00:02:22
    None: Till browser is closed ("session cookie")
  Secure (https): *
    Will be sent also on non-HTTPS connections
The data may have any meaning
  Very rarely this is some "plain-text data"
    Some part of it might be the IP address or the user name
  But usually it is just a (more or less!) random unique number
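The cookie fields listed above map directly onto the attributes of a `Set-Cookie` header. As a small sketch, Python's standard `http.cookies` module can take such a header value apart. The header string below is assembled from the exemplary data on the slide; the expiry attribute is deliberately left out (so it illustrates the "None" case, a session cookie), and `describe` is a helper invented for this illustration.

```python
from http.cookies import SimpleCookie

# Header value assembled from the slide's example data (no Expires attribute).
header_value = "session-id=303-1195544-4348244; Domain=.amazon.de; Path=/"

jar = SimpleCookie()
jar.load(header_value)


def describe(morsel):
    """Summarise the forensically interesting attributes of one cookie."""
    return {
        "name": morsel.key,
        "value": morsel.value,
        "domain": morsel["domain"],
        "path": morsel["path"],
        # No expiry information at all: kept only until the browser is closed.
        "session_cookie": morsel["expires"] == "" and morsel["max-age"] == "",
        # Without the Secure flag the cookie is also sent over plain HTTP.
        "secure_only": bool(morsel["secure"]),
    }


info = describe(jar["session-id"])
for field, value in info.items():
    print(f"{field}: {value}")
```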
THIRD-PARTY COOKIES
“Real” third-party cookies: Site A sends a cookie with the Domain value set to site B
  Would NOT be sent back to site A, but only to site B!
  Browsers generally do not accept such cookies any more
Third-party cookies: Page is from site A, but some iframe/picture/… is from site B, and the request for it sets a cookie for site B
  The cookie originates at site B and is sent back to site B, but the “overall”/“encompassing”/“top-level” page is from site A
  Many browsers block these now (optionally/by default); ad blockers often drop them too
Forensically the difference is unimportant: All are either stored or not, and all are stored in the same way
  Theoretically a unique collection of third-party cookies (=same time) could identify a website (=verify; must be known to compare)
IE: INTERESTING FILES/LOCATIONS
Where can we find information on what users did with IE?
Attention: Locations change slightly with OS version/language!
IE: COOKIE FILE STRUCTURE
Each cookie file contains all cookies for a single domain
The information is stored line-by-line; 9 lines = 1 cookie
Example:
  __utma        Name
  36557369.378120483.1187701792.1189418701.1190710388.4        Value
  hotel.at/     Domain
  1088          Flags
  2350186496    Expiration time (UTC; "LoVal","HiVal")
  32111674
  2116717664    Creation time (UTC; format as above)
  29884241
  *             Secure (here: False)
  __utmb        (start of the next cookie)
  …
Note: Additional information on the cookies is in the index.dat file in the same directory!
  Number of hits, suspected as advertisement
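The two-line timestamps in the cookie file can be combined into a readable date. The sketch below assumes that "LoVal"/"HiVal" are the low and high 32-bit halves of a Windows FILETIME value (100-nanosecond intervals since 1601-01-01 UTC); the function name is made up for this example, and the input numbers are the ones from the example cookie above.

```python
from datetime import datetime, timedelta, timezone

# FILETIME counts 100 ns ticks from this instant.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(lo_val, hi_val):
    """Combine the low/high 32-bit halves and convert to an aware datetime."""
    ticks = (hi_val << 32) | lo_val
    # 10 ticks = 1 microsecond; integer maths avoids floating-point rounding.
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


created = filetime_to_datetime(2116717664, 29884241)
expires = filetime_to_datetime(2350186496, 32111674)
print("created:", created.isoformat())
print("expires:", expires.isoformat())
```

A plausibility check is available in the example itself: the creation time should fall close to the last Unix timestamp embedded in the `__utma` value (1190710388).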
IE: NEWER VERSIONS
Additional information can be found under
IE >=10
Most files are now in separate directories named “Low”, located within the “old” directory structure
  “Low” = “Low integrity files”: No problem if damaged/lost
All index.dat files have been replaced by a single database
  Located in “%USERDIR%\AppData\Local\Microsoft\Windows\WebCache”
  V01.chk (checkpoint file), V01.log (transaction log), V01res*.jrs/V01*.log (reserved transaction logs; for clean shutdown in emergency cases, e.g. disk full)
  WebCacheV01.dat (ESE database)
Format: Windows ESE-DB (Extensible Storage Engine)
  Same as Exchange, Active Directory, Desktop Search…
Problem: While Windows is running, the database is locked
  Shadow Copy needed (or image investigation)
As before: Index only; actual content is unchanged!
ESEDatabaseView: http://www.nirsoft.net/utils/ese_database_view.html
Or others (libesedb also reads damaged files)
Windows: esentutl.exe: E.g. recovery of crashed DB

IE >=10
The database consists of several “containers”
  Which are not necessarily consistently numbered
    Probably a DB viewer limitation; use the name of the container!
  Especially interesting: “History”/“MSHist01*”, “Cookies”, “Content”, “DOMStore”, “iedownload”
    Attention: Some names occur several times (they also depend on a location, i.e. a directory!)
InPrivate browsing data is stored there as well, but deleted at the end (=when the window is closed)
  “Deletion”: Carving for such entries is possible, as deleted records will not be reused until that database part is reorganized
    B-tree rebalancing: Parts will be overwritten (not necessarily all!)
Attention: Contains other things besides IE history/…!
  Example: Windows 8.1: All searches are logged here too
    Searches: https://www.windowssearch.com/search?q=
    While typing: https://www.windowssearch.com/suggestions?q=
    Even if offline at that point in time!
IE >=10
Content: Structure is identical for all containers
  But different columns are used (=others remain empty)
Common: FileSize, SecureDirectory (in which subdirectory), Filename, ResponseHeaders, AccessCount, CreationTime (when created; entry!), ExpiryTime (how long valid), ModifiedTime (last change), AccessTime (last access)
Specialized:
  Cookies: URL = user@domain
  History: No creation time
  DOMStore: No expiry/modified time
  iedownload: -
  Content (=Cache): -
IE >=10
Accessing it is difficult: Always open
  Solution: Offline or Shadow Copy
    Or simply as a different user with appropriate permissions!
Example: Use “Hobocopy” (=copy tool with VSS support)
  D:\Temp>HoboCopy.exe C:\Users\sonntag\AppData\Local\Microsoft\Windows\WebCache d:\temp\hobocopy WebCacheV01.dat
  HoboCopy (c) 2006 Wangdera Corporation. [email protected]
  Starting a full copy from C:\Users\sonntag.ADS2-FIM\AppData\Local\Microsoft\Windows\WebCache to d:\temp\hobocopy
  Copied directory
  Backup successfully completed.
  Backup started at 2014-08-06 10:03:16, completed at 2014-08-06 10:03:21.
  1 files (32.06 MB, 1 directories) copied, 16 files skipped
Then open it with a viewer (e.g. ESEDatabaseView)
  Select “Containers” from the combo box to identify which container is which data store
  Select container, select interesting files, use “Save as” to export e.g. as CSV
Hobocopy: https://github.com/candera/hobocopy/
MICROSOFT EDGE
Very similar to IE 10+
  Also stored in ESE-DB in the same format
  Cache: Similar to IE 10+
  Bookmarks: Similar to IE 10+
  History, Cookies: Similar to IE 10+
New:
  Cortana search: Stored in ESE DB in container “DependencyEntry 5”
  Reading list: Separate ESE DB, separate for each user
InPrivate browsing: Still stored in DB and deleted on exit
  Data carving remains useful
http://www.dataforensics.org/microsoft-edge-browser-forensics/
https://articles.forensicfocus.com/2015/07/27/project-spartan-forensics/
MICROSOFT EDGE
Base directory: C:\Users\
Newer versions: These directories still exist, but most of the content has been moved into the ESE-DB (still in file systems: Cache, DOM store)
FIREFOX: INTERESTING FILES/LOCATIONS
Where can we find data on what users did with Firefox? Profile ID is a random string generated once
FIREFOX: SQLITE COOKIE DB
Contains:
  Organizational data: id, appId, inBrowserElement
  BaseDomain: Domain name without host (useful for searching!)
  Name: Name of contained data
  Value: Value of contained data
  Host: Where to send it to
  Path: Sub-path restriction for host (rarely used)
  Expiry: Date of expiry
  LastAccessed: Date of last access
  CreationTime: Date of first access
  isSecure: Send only over encrypted connections?
  isHttpOnly: Access by JavaScript forbidden?
FIREFOX: SQLITE HISTORY DB
places.sqlite; contains 13 tables
  FavIcons, hosts, inputHistory, keywords, sequences, annotations, downloads (incl. destination filename)…
  Bookmarks: Hierarchical structure (“parent”), position, title, date added/modified, GUID
“moz_places”: Main table; connects GUID to URL
  Contains: URL, title, rev_host (=reversed hostname, for searching; www.mozilla.com becomes moc.allizom.www), visit_count, typed (=manually), last visit date, GUID, favIcon reference…
“moz_historyvisits”: Entry created when visiting a page
  id, from_visit (=id of previous page), place_id (=see above), visit_date (=timestamp), visit_type (1=link, 2=typed, 3=bookmark, 4=embedded, 7=download…), session
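The relationship between the two tables can be sketched as a single join. The example below runs against a tiny in-memory database with a heavily simplified schema (only the columns named on the slide that the query needs), not against a real places.sqlite. Two points go beyond the slide and are stated here as assumptions: `place_id` refers to `moz_places.id`, and `visit_date` holds microseconds since the Unix epoch. The demo row and the helper names are invented.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# visit_type codes as listed on the slide.
VISIT_TYPES = {1: "link", 2: "typed", 3: "bookmark", 4: "embedded", 7: "download"}
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

QUERY = """
    SELECT v.visit_date, v.visit_type, p.url, p.title
    FROM moz_historyvisits AS v
    JOIN moz_places AS p ON p.id = v.place_id
    ORDER BY v.visit_date
"""


def list_visits(conn):
    """Return (UTC time, visit type, URL, title) for every recorded visit."""
    visits = []
    for visit_date, visit_type, url, title in conn.execute(QUERY):
        when = UNIX_EPOCH + timedelta(microseconds=visit_date)
        visits.append((when, VISIT_TYPES.get(visit_type, str(visit_type)), url, title))
    return visits


def demo_connection():
    """Build a stand-in database; a real investigation opens a copy of places.sqlite."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE moz_places (id INTEGER PRIMARY KEY, url TEXT, title TEXT);
        CREATE TABLE moz_historyvisits (
            id INTEGER PRIMARY KEY, from_visit INTEGER, place_id INTEGER,
            visit_date INTEGER, visit_type INTEGER);
        INSERT INTO moz_places VALUES (1, 'http://www.mozilla.com/', 'Mozilla');
        INSERT INTO moz_historyvisits VALUES (1, 0, 1, 1192174225000000, 2);
    """)
    return conn


for visit in list_visits(demo_connection()):
    print(visit)
```

When working on real evidence, the query should be run on a copy of the file, since opening a SQLite database can modify it.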
https://developer.mozilla.org/en-US/docs/Mozilla/Tech/Places
FIREFOX: CACHE (>=32)
“index” file: Index of cache entries
  4 bytes version (32-47: 0x00000001)
  4 bytes last changed timestamp (Unix, 32 bit, hex, big endian)
  4 bytes dirty flag
  Records of 36 bytes each
    20 bytes SHA1 hash of URL
      This is the name of the file in the subdirectories!
      Format: “:http://………./…./…..”
    4 bytes “frecency” (=frequency+recency)
      See https://developer.mozilla.org/en-US/docs/Mozilla/Tech/Places/Frecency_algorithm
      “amount of revisitation, the type of those visits, how recent they were, and whether the URI was bookmarked or tagged”
    4 bytes expiration date (Unix, 32 bit, hex, big endian)
    1 byte flags
    3 bytes file size
Subdirectories “doomed” and “entries”: Actual cache files
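The index header and the file-naming rule described above can be sketched in a few lines. This assumes the three header fields are big-endian unsigned 32-bit integers, that the hashed key is the URL prefixed with ":", and that the file name is the hash written as upper-case hex; the last point in particular is an assumption, not something the slide states. Function names and the demo header bytes are made up.

```python
import hashlib
import struct
from datetime import datetime, timezone

# version, last-changed timestamp, dirty flag: three big-endian uint32 values
INDEX_HEADER = struct.Struct(">III")


def parse_index_header(data):
    """Decode the 12-byte header at the start of the cache index file."""
    version, changed, dirty = INDEX_HEADER.unpack_from(data, 0)
    return {
        "version": version,
        "last_changed": datetime.fromtimestamp(changed, tz=timezone.utc),
        "dirty": bool(dirty),
    }


def entry_file_name(url):
    """SHA1 of the cache key (':' + URL), spelled as upper-case hex (assumed)."""
    key = ":" + url
    return hashlib.sha1(key.encode("utf-8")).hexdigest().upper()


# Synthetic header, built only to exercise the parser.
demo_header = struct.pack(">III", 1, 1192174225, 0)
print(parse_index_header(demo_header))
print(entry_file_name("http://www.gmx.net/de"))
```

The second function is useful in the other direction as well: given a suspected URL, it yields the file name to look for in the "entries" subdirectory.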
FIREFOX: CACHE (>=32)
Cache files contain the content, checksums, and some metadata
File format:
  File exactly as it is received, i.e. might be compressed (or not, whatever the server decided to send!)
    The very last 4 bytes of the cache file are the data length in this file
    Other view: This is the offset of the beginning of the metadata
  4 bytes hash for the whole file
  2 bytes hash for every chunk (rounded up) of 262,144 bytes data
  4 bytes version number (32-44: 0x00000001, >=45: 0x00000002)
    Version 3 (>=51): Cache files can store alternative data (=processed, e.g. decompressed)
  4 bytes fetch count
  4 bytes last fetch date, 4 bytes last modified date
  4 bytes frecency, 4 bytes expiration date
  4 bytes URL (“key”) length, 4 bytes flags (added!), URL + ‘\0’
  Attribute name + ‘\0’, attribute value + ‘\0’
    Can be various, typically includes “request-method”, “response-head”
  4 bytes data length/metadata offset
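The "last 4 bytes = data length = metadata offset" rule is enough to separate the delivered content from the metadata block. A minimal sketch, assuming the offset is a big-endian unsigned 32-bit integer (carried over from the byte order stated for the index file); the function name and the synthetic test file are invented for illustration.

```python
import struct


def split_cache_file(raw):
    """Split a cache file into (content, metadata) using the trailing offset.

    The metadata part returned here excludes the final 4-byte offset field.
    """
    if len(raw) < 4:
        raise ValueError("too short to be a cache file")
    (offset,) = struct.unpack(">I", raw[-4:])
    if offset > len(raw) - 4:
        raise ValueError("metadata offset points past the end of the file")
    return raw[:offset], raw[offset:-4]


# Synthetic file: content, then placeholder metadata bytes, then the offset.
content = b"body exactly as delivered by the server"
metadata = b"\x00" * 10
demo_file = content + metadata + struct.pack(">I", len(content))

body, meta = split_cache_file(demo_file)
print(len(body), len(meta))
```

The content part may still be compressed, so a second step (e.g. gzip or Brotli decoding, depending on the stored response headers) can be necessary before it is readable.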
https://hg.mozilla.org/mozilla-central/file/default/netwerk/cache2/CacheFileMetadata.h
FIREFOX: CACHE (>=32)
Flags: Currently only a single one exists: “Pinned”
Pinned flag (value “1”):
  Resource will always be cached (regardless of size, headers etc.)
  Will never be evicted from the cache, even if “old”
  Will not be deleted when the cache is cleared
  Will be overwritten if a new version becomes known
Uses: Unclear!
  Documentation: Pinned pages, offline web applications
  Actual use: Pinned pages do NOT set this flag!
https://bugzilla.mozilla.org/show_bug.cgi?id=1032254
GOOGLE CHROME
File locations:
GOOGLE CHROME
History: “History” file; SQLite database
  Most important: “downloads”, “urls”, “visits”
Downloads: Target path, start & end time, number of bytes (announced/actually downloaded), MIME type, URL of tab and site from which downloaded…
URLs (single line for every URL visited): URL, title, visit count, typed count (=manually typing URL), last visit time, hidden (???)
Visits: Actual “history”, i.e. one line for every single visit
  URL (=reference to URL table), visit time, from-visit (=from which URL did the user come), transition (how the user got there: link, typed, reload…), duration of visit
  visit_source: If URL came from a different device (=synchronization across multiple browsers on different devices)
https://developer.chrome.com/extensions/history
GOOGLE CHROME
Login data: “Login data” file; SQLite database
  Origin URL: URL where the form came from
  Action URL: URL where the data was sent to
  Username element/value: ID of element and content
  Password element/value, Submit element: See above!
    Note: The password value is encrypted using DPAPI (which is based on the user password)
  Creation date, how often used etc.
Web data: “Web data” file; SQLite database
  Auto-fill information (stored form content), credit cards, token services, keywords etc.
Current/Last Session/Tabs: Information on tabs/sessions
Top sites: Most frequently visited sites
https://digital-forensics.sans.org/summit-archives/dfirprague14/Give_Me_the_Password_and_Ill_Rule_the_World_Francesco_Picasso.pdf
GOOGLE CHROME
Cache: Subdirectory “Cache”
  Index file: Main hash table “index”
  Data files: Blocks of four fixed sizes (1-4*256/1k/4k): “data_1/2/3”
  Stream files for longer content (i.e. >16 kB): “f_”
https://www.chromium.org/developers/design-documents/network-stack/disk-cache
IE-EXAMPLE: RECONSTRUCTING WEBMAIL
Cookies:
  www.gmx.net/de/
    Visits
      1
    moveinBrowser
      new%20MoveinData%28%29%2Eunpickle%28%7B%22viewed%22%3A%201%2C%20%22closed%22%3A%20false%2C%20%22latest%22%3A%20new%20Date%281192174225718%29%7D%29
      Decoded: new MoveinData().unpickle({"viewed": 1, "closed": false, "latest": new Date(1192174225718)})
      Decoded date (Unix): Fri, 12 October 2007 07:30:25 UTC
  gmx.net/
    GUD
      bMDEpJi1JPF9xN0JINkUyQkExJSIhJxweJBkeGyAvLjcsLDQpKzJCSzElIiEnHB8dGRwcIC83Ny8tNC0uMktBMSMtSzksIh0gGw==
      MIME (Base64) encoded, but is just a binary value
      Probably a unique ID for session handling
  logout.gmx.net/
    POPUPCHECK
      1192260804812
      Decoded date (Unix): Sat, 13 October 2007 07:33:24 UTC
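The decoding steps used for these cookie values are easy to reproduce. The sketch below percent-decodes the moveinBrowser value and converts the embedded JavaScript timestamps (milliseconds since the Unix epoch) to UTC. The raw value is the one from the slide with the line-wrap space removed; the helper name is made up for this illustration.

```python
import re
from datetime import datetime, timedelta, timezone
from urllib.parse import unquote

UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Cookie value from the slide (line-wrap whitespace removed).
raw_value = (
    "new%20MoveinData%28%29%2Eunpickle%28%7B%22viewed%22%3A%201%2C%20"
    "%22closed%22%3A%20false%2C%20%22latest%22%3A%20new%20Date%28"
    "1192174225718%29%7D%29"
)


def js_millis_to_utc(millis):
    """Convert a JavaScript Date value (ms since 1970-01-01 UTC) to a datetime."""
    return UNIX_EPOCH + timedelta(milliseconds=millis)


decoded = unquote(raw_value)
match = re.search(r"new Date\((\d+)\)", decoded)
latest = js_millis_to_utc(int(match.group(1)))
logout = js_millis_to_utc(1192260804812)  # POPUPCHECK value of logout.gmx.net/

print(decoded)
print("latest:", latest.isoformat())
print("logout:", logout.isoformat())
```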
IE-EXAMPLE: RECONSTRUCTING WEBMAIL
History (pasco; adjusts for local time zone!):
  Modified/access time: 10/12/2007 09:30 until 09:33 = 12.10.2007 7:30-7:33 UTC!
    Local time of event: Western European DST (=+2)
    But shown according to the time zone set at the moment of the analysis; physically stored as UTC time!
URLs (selection):
  sonntag@http://www.gmx.net/de
    User visited GMX homepage
  sonntag@http://service.gmx.net/de/cgi/login
    User logged in to GMX
  sonntag@http://service.gmx.net/de/cgi/g.fcgi/mail/index?CUSTOMERNO=10333901&t=de1690301692.1192174366.c35ea10d&FOLDER=inbox
    User visited his inbox
  sonntag@http://service.gmx.net/de/cgi/derefer?TYPE=2&DEST=http%3A%2F%2Fwww.gmxattachments.net%2Fde%2Fcgi%2Fg.fcgi%2Fmail%2Fprint%2Fattachment%3Fmid%3Dbabgehj.1192174412.25124.s9vnnjbfon.74%26uid%3DKxs5Dm8bQEVsw%252FqY9HVpw45KNTg2NcIR%26frame%3Ddownload
    User opened an attachment
  sonntag@http://www.gmxattachments.net/de/cgi/g.fcgi/mail/print/attachment:/filename/Lebenslauf.doc?mid=babgehj.1192174412.25124.s9vnnjbfon.74&uid=Kxs5Dm8bQEVsw%2FqY9HVpw45KNTg2NcIR&frame=attachment
    User downloaded an attachment called "Lebenslauf.doc"
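History URLs like the "derefer" entry carry a second URL inside a GET parameter, so they have to be decoded in two passes. A short sketch using only the standard library follows; the entry is the one from the slide with line-wrap spaces removed (an assumption about the original text), and the helper names are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# History entry from the slide, with line-wrap whitespace removed.
history_entry = (
    "sonntag@http://service.gmx.net/de/cgi/derefer?TYPE=2&DEST=http%3A%2F%2F"
    "www.gmxattachments.net%2Fde%2Fcgi%2Fg.fcgi%2Fmail%2Fprint%2Fattachment"
    "%3Fmid%3Dbabgehj.1192174412.25124.s9vnnjbfon.74%26uid%3DKxs5Dm8bQEVsw"
    "%252FqY9HVpw45KNTg2NcIR%26frame%3Ddownload"
)


def split_history_entry(entry):
    """IE history URLs are stored as 'user@url'; split at the first '@'."""
    user, _, url = entry.partition("@")
    return user, url


def query_params(url):
    """Return the GET parameters of a URL, percent-decoded once."""
    return {name: values[0] for name, values in parse_qs(urlsplit(url).query).items()}


user, outer_url = split_history_entry(history_entry)
outer = query_params(outer_url)        # first pass: parameters of derefer
inner = query_params(outer["DEST"])    # second pass: parameters of the target URL

print(user)
print(outer["DEST"])
print(inner)
```

The doubly encoded `%252F` in the uid only becomes a plain `/` after the second pass, which matters when the value is to be matched against other history entries.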
IE-EXAMPLE: RECONSTRUCTING WEBMAIL
Cache: 282 entries
  Images (GIF, JPG)
  Stylesheets (CSS)
  JavaScript (JS)
  HTML files (HTML)
    Only static files, login screen, etc.!
What is missing are the actual e-mails
  These are not cached on disk
  In previous versions they might have been cached
    Depending on the server, not the version of Internet Explorer!
  So webmail is not necessarily recoverable, but perhaps in some instances (swap/hibernation file, deleted files…)
Note: The cache only contains what is sent to the computer
  Locally drafted e-mail is "form input", which is never cached!
    Only chance: Sent back as preview
TYPED URLS
To be found in the registry under HKCU\Software\Microsoft\Internet Explorer\TypedURLs
  Number depends on OS version (later ones have more; 20/25/50)
Contains “url1” to “url20”
  url1 is the most recent, url20 the oldest
  All are moved when a new one is typed!
Since IE 10 there is a “parallel” key “TypedURLsTime”
  FILETIME timestamps for the entries above
  Most recent visit for the corresponding URL
OTHER INFORMATION
Form history and stored passwords
  For identifying visited sites and accessing them
  Often encrypted, but decryption programs exist
Search history: What was the person looking for?
Blocked sites: If the popup-host of a site was blocked, the site itself was probably visited!
  Manually unblocked sites are obviously interesting!
Compatibility settings (IE): Sites shown using old renderers
Certificate store: To identify secured sites visited often
  Might include client certificates, which act as a kind of key
Download history: What file(names) were downloaded
  And where they were stored locally (name; for searching)
Installed add-ons (browser controls)
Language preferences and all other configuration options
Careful interpretation necessary!

PRIVACY MODE: INTERNET EXPLORER
Allows browsing without leaving traces (but see below!)
Additional feature: Prevent sites from sending data to other sites (InPrivate Filtering)
  IE tracks third-party content; if it appears on more than 10 (can be modified from 3 to 30) sites visited, it is blocked in InPrivate Browsing mode (“ads” or similar)
    Must be activated manually each time (works per-session)!
    Can also be activated in non-private browsing mode
  Complete blocking (no third-party content) can be set manually; exceptions can be configured as well
InPrivate Browsing does not store:
  New cookies (existing ones can still be read!), history entries, form data, passwords, typed URLs, search queries, visited links
  Toolbars and extensions are disabled
Will keep: Bookmarks, downloaded files, Flash cookies
PRIVACY MODE: INTERNET EXPLORER
InPrivate Browsing previously stored files in the cache on the disk, but deleted them when closing the window
  This meant traces WOULD remain on the disk!
  But this is no longer the case
Reconstructing the history:
  Not available directly (not stored!)
    Article unclear about this; some parts might remain
  But possible through the cache, which contains the last access time of every stored element!
No advantage regarding:
  Eavesdropping: ISP, NSA… can still copy the connection
    Use good encryption
  Servers: May be able to identify that this mode is used
PRIVACY MODE: FIREFOX
Firefox does not store
  History entries (incl. intelligent address bar), search queries, download history, form data, cookies, cache, typed URLs, passwords, visited links
  Will keep: Bookmarks, downloaded files, Flash cookies
Same features as IE
  Except third-party elements
    Cookies can be filtered
    Images too, but not through the UI!
      about:config: permissions.default.image=3 (no third-party images)
    Scripts etc.: NoScript or other extensions
Extensions generally remain active!
  Configuration (e.g. third-party images) is the same
  Whether extensions work in privacy mode can be determined when installing an extension (and in the add-on configuration)
INCOGNITO MODE: CHROME
Chrome does not store
  History entries, search queries, download history, form data, cookies, cache, typed URLs, passwords, visited links
  Will keep: Bookmarks, downloaded files
Prevents sites from identifying that incognito mode is used
  But workarounds still exist…
Extensions are disabled, but can be individually enabled manually
Configuration (e.g. third-party images) is the same
Will not store anything on disk (except swap/hibernation!)
PRIVACY MODE: PAGEFILE
If investigated immediately/soon after the privacy mode was used, content may remain in the pagefile
  Records are not written to disk, but they were definitely in main memory!
May contain:
  Individual URLs: Easy to recognize
    Problem: They need not be from the browser, but could also stem from other programs
  Form content: Mostly unrecognizable (unless content known!)
  Page data: HTML or fragments of it
    Problem: How to assemble it into a complete webpage
Main problem: Conclusions
  We have fragments, but mostly no trace of how they were created
    URLs could be within loaded program code, strings etc.
    Page data: A file that was copied somewhere …
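Recognizing individual URLs in a pagefile is essentially a string search over binary data. The following is a minimal sketch of such carving: it only finds 8-bit ASCII URLs (UTF-16 strings, which are common in Windows memory, would need a second pattern), and the sample blob is synthetic. As noted above, a hit says nothing about which program put the string into memory.

```python
import re

# "http://" or "https://" followed by at least four characters allowed in URLs.
URL_PATTERN = re.compile(rb"https?://[A-Za-z0-9\-._~:/?#\[\]@!$&'()*+,;=%]{4,}")


def carve_urls(blob):
    """Return (byte offset, URL) pairs for every ASCII URL found in the blob."""
    return [(m.start(), m.group().decode("ascii")) for m in URL_PATTERN.finditer(blob)]


# Synthetic stand-in for a pagefile fragment.
sample = (
    b"\x00\x13junk\x00http://www.gmx.net/de\x00\xff\xfe"
    b"https://service.gmx.net/de/cgi/login\x01"
)

for offset, url in carve_urls(sample):
    print(offset, url)
```

Keeping the byte offset with each hit makes it possible to inspect the surrounding data later, which is often the only hint about the origin of a fragment.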
PRIVACY MODE: TRACKING
No cookies are taken from the “normal” into the “private” mode
  You are a “new” user
  BUT: If you log in somewhere during the browsing session, the (previously anonymous) new cookies can be attributed to you
Privacy mode is a “single mode”, i.e. it applies to all tabs/windows together
  Normal windows: Share cookies
  Private mode windows: Share cookies
  Close all private mode windows and open a new one: No cookies
Result: Private mode is way less private than you think
  But today it is relatively good against local forensics
ALTERNATIVE: NETWORK FORENSICS
Copying the network traffic allows reconstructing the page
  This requires live access on a router, intercept station etc. at the moment the user browses the web
  Wiretapping: Very difficult to do legally! Only very limited usability!
Result: Trace of the individual packets
Requires:
  Reassembly of the TCP connection (difficult; tool needed!)
  Splitting into the individual requests (HTTP 1.1 pipelining!)
  Manual reassembly to a “viewable” local page
    Inspection of the HTML code is quite simple
Following pages: Wireshark example
  Left out: IPv6 DNS query, redirect to actual homepage, detailed analysis of the individual packets (not interesting!)
Wireshark screenshots (images not included in this text):
  WWW.JKU.AT: DNS QUERY
  WWW.JKU.AT: DNS RESPONSE
  WWW.JKU.AT: HTTP QUERY
    DNT = Do Not Track (Mozilla version of “X-Do-Not-Track”)
  WWW.JKU.AT: HTTP RESPONSE (START)
  HTTP RESPONSE (TCP STREAM)
CONCLUSIONS
What a user did in a browser can typically be reconstructed quite well
  Especially Internet Explorer: Deleting the index.dat/ESE files is almost impossible
    Dedicated "cleaner" programs are needed
  Information may be stored multiple times
Main problem: Separating the “human & interesting” things from the huge number of other automatic & subordinate elements
Reconstructing the content of web-based e-mail is difficult
  That it was used, which address etc. can be discovered
  But content is almost never cached and therefore unavailable
A variety of programs exist to investigate these files
  Few of them are free
  File formats are often not at all/badly documented
THANK YOU FOR YOUR ATTENTION!
https://www.ins.jku.at
Michael Sonntag
[email protected]
+43 (732) 2468 - 4137
S3 235 (Science Park 3, 2nd floor)
JOHANNES KEPLER UNIVERSITÄT LINZ
Altenberger Straße 69
4040 Linz, Österreich
www.jku.at

Literature
Anderson, Keith: Firefox history exporter, https://bugzilla.mozilla.org/show_bug.cgi?id=241438 (Entry at 2006-03-17 09:10:47 PDT)
Malmström/Teveldal: Forensic analysis of the ESE database in Internet Explorer 10, http://hh.diva-portal.org/smash/get/diva2:635743/FULLTEXT02.pdf