reserved, 31–35 Common UNIX Printing System (CUPS), unreserved, 32 152–153 character sets Common Vulnerability Scoring System byte order marks and detection, 208 (CVSS), 6–7 detection for non-HTTP files, 210–211 Common Weakness Enumeration (CWE), 6 handling, 206–211 complex selectors, in CSS, 88 for headers, 49–51 computer proficiency of user, 14 inheritance and override, 209 conditionals, explicit and implicit, in markup-controlled, on subresources, HTML, 75–76 209–210 conflicting headers, resolution of, 47–48 sniffing, 264 CONNECT requests, 46, 54 in URLs, 33 Connolly, Dan, 9 @charset (CSS), 89 content directives, on subresources, 204 children objects in JavaScript, 108 Content-Disposition directive, 48, 84, 122 Chrome defensive uses, 203–204 autodetection of passive document NUL character and, 51 types, 205 plug-in-executed code and, 204 cached pages in, 37 user-controlled filenames in, 67 characters in URL scheme name content inclusion in HTML ignored by, 25 hyperlinking and, 79–84 deleting JavaScript function, 103 type-specific, 82–84 and file extensions in URLs, 130 Content-Length header, 43, 52, 147 local file access, 160 in keepalive sessions, 56–58 modal dialogs for prompts, 219 content recognition, 197–211 navigation timing, 259 content rendering, plug-ins for, 127–138 prerendering page, 258–259 Content Security Policy (CSP), 242–245, printable characters in, 32 250, 253 privileged JavaScript in, 161 criticisms of, 244–245 and realm string, 63 violations, 244 and RFC 2047 encoding, 50 content sniffing, 197–198, 205, 264 stored password retrieval, 228 Content-Type directive, 49, 71, 84 SWF file handling without application/binary, 212 Content-Type, 199 application/JavaScript, 118 time limits on continuously executing application/json, 118, 202 scripts, 215 application/mathml+xml, 119 WebKit parsing engine, 70n application/octet-stream, 200–201, 212 window.open() function and, 218 charset parameter, 206, 208 Windows Presentation Foundation image/jpeg, 118, 
202, 205 plug-ins, 136 image/svg+xml, 124 chunked data transfers, 57–58 logic to handle absence, 198–199 clickjacking, 179, 180–181, 263 plug-ins and, 128, 204 click() method, 218 slash-delimited alphanumeric client certificates, 64–66 tokens in, 199 client-server architecture, 17–18 special values, 200–201 client-side data, 165 text/css, 118 client-side databases, 258 text/html, 124 client-side errors (400–499), 55–56 text/plain, 118, 156, 200–201, 204, 212 client-side scripts, restricting privileges of unrecognized, 202–203 HTML generated by, 250–251 and XML document parsing, 120 cloud, 15 control characters, JavaScript shorthand Clover, Andrew, 184 notation, 112 command injection, 265 cookie-authenticated text, reading, 181 comments Cookie header. See cookies in CSS syntax, 89 cookie injection, 264 in XHTML and HTML, 72 The Tangled Web © 2011 by Michal Zalewski INDEX 285 tw_book.book Page 286 Tuesday, October 18, 2011 10:07 AM
cookies, 11, 257 D deleting, 62 and DNS hijacking, 153 daap: scheme, 36 forcing, 264 data: scheme, 37, 167–168 limitations on third-party, 192–194 data transfers, chunked, 57–58 and same-origin policy, 150–151 Date/If-Modified-Since header pair, 59 security policy for, 149–153 deceptive framing, 180 semantics, 60–62 dedicated workers, for background user data in, 67 processes, 258 CORS. See Cross-Origin Resource default policy, CSP directive for, 243 Sharing (CORS) default ports, for protocols, overriding, 27 CR characters, stripping from HTTP DELETE method (HTTP), 53 headers, 45 deleting credential-passing methods, 63 cookies, 62 credentials, in URLs, 26 JavaScript functions, 102–103 CRLF (newline), 45 delimiting characters, in URLs, 29 cross-browser interactions, 16–17 denial-of-service (DoS) attacks, 214–219, cross-document links, 8, 9 248, 264 cross-domain communications, and frame DeviceOrientation API, 258 descendant policy, 176–178 dialog use restrictions, 218–219 cross-domain content inclusion, 181–183 digest credential-passing method, 63 cross-domain policy files, 155–156 Digital Rights Management (DRM), 131 cross-domain requests, 236–239 directory traversal, 265 Cross-Origin Resource Sharing (CORS), disable-xss-protection, 242n 148, 236
tag (HTML), 73 current status, 239 DNS hijacking, and cookies, 153 non-simple requests and preflight, 238 DNS labels, security mechanisms request types, 236–237 based on, 142n security checks, 237–238 DNS names, in URLs, browser cross-origin subresources, 183 acceptance, 27 cross-site request forgery (XSRF, CSRF), DNS pinning, 142n, 190 84, 190, 262 DNS rebinding, 142n, 189 exploitation of flaws, 190 DNT request header, 193 login forms and, 145–146 directive, 71 cross-site
script inclusion (XSSI), 104n, 262 document.cookie API (JavaScript), 61 cross-site scripting (XSS), 71, 262 document.domain property (JavaScript), bugs, and password managers, 228 143–144 exploitation of flaws, 190 document-level scrollbar, 180 filtering, 251–252, 253 document namespace, mapping HTML crossdomain.xml file, 155, 162 elements to, 110 CSP (Content Security Policy), 242–245, document object (JavaScript), 108 250, 253
Document Object Model, 12, 108, CSRF (cross-site request forgery), 84, 109–111, 142–146 190, 262 document rendering helpers, 130–131 exploitation of flaws, 190 documents login forms and, 145–146 changing location of existing, 174–178 CSS. See Cascading Style Sheets (CSS) script access to other, 111–112 CUPS (Common UNIX Printing System), document type detection logic, 198–206 152–153 Domain parameter, for cookie, 61 currentStyle API, 184 domains CVSS (Common Vulnerability Scoring hardcoded, 227 System), 6–7 problems with restrictions, 151–152 CWE (Common Weakness Enumeration), 6 DOMService mechanism, 158 Cyrillic alphabet,
homoglyphs in, 35 DoS (denial-of-service) attacks, 214–219, 248, 264
downloaded files, 205–206 ExternalInterface.call() API, 133 drag-and-drop, 180 External XML Entity (XXE) attack, 76 DRM (Digital Rights Management), 131 duplicate headers, resolution of, 47–48 F Dutta, Sunava, 239 false positives, risk in XSS filtering, 251–252 E fault tolerance, 11 feeds, 123–124 E4X. See ECMAScript for XML (E4X) feed: scheme, 37 Earthlink, 153 Felten, Ed, 193 ECMA (European Computer Manufacturers Association), 11, 96 file extensions, browser response to, 205 file formats. See also plug-ins ECMAScript, 96 audio and video, 119 escape codes, 112 bitmap images, 118 strict mode, 104 HTML. See HTML ECMAScript for XML (E4X), 106–107 non-renderable, 124 Eich, Brendan, 95 plaintext, 64, 85, 117–118 Electronic Frontier Foundation, 109 XML. See XML Eloquent JavaScript (Haverbeke), 97 file inclusion, 265
Firefox (continued) getElementsByTagName() function, 109 time limits on continuously executing GET method (HTTP), 42, 52, 58, 80–81 scripts, 215 GetRight download utility, 137 UTF-8 text in, 50 getters, in JavaScript, 103 Windows Presentation Foundation getURL() function, 133 plug-ins, 136 GIFAR vulnerability, 129 Worker API, 258 GIF file format, 83, 129 firefoxurl: protocol, 17, 36 GML (Generalized Markup Language), Flash applets, 11 8–9 fonts Gontmakher, Alex, 35 CSP directive for, 243 Gonzalez, Albert, 5n Flash programs enumeration of, 132 gopher: scheme, 36 Forbidden status code (403), 56 Gosling, James, 134 forecasting, statistical, 6 GPS data, 226n format-string vulnerability, 266 Grossman, Jeremiah, 179 form-based password managers, 227–229 Guninski, Georgi, 176 form feed character, in HTML tag, 74 forms, 80–82 H Found status code (302), 55 fragment ID, in URLs, 28–29 Hansen, Robert, 179 frame-ancestors directive, 243 hardcoded domains, 227 framebusting, 264 Haverbeke, Marijn, Eloquent JavaScript, 97 frame descendant policy, and cross-domain HDP file format, 83 communications, 176–178 header injection, 45, 239, 262 frames, 82 headers disabling navigation descendant model, character set and encoding schemes, 230–231 49–51 hijacking risks, 175–176 Content Security Policy encoded in, 242 name attribute of, 175 in HTTP requests, 43 sandboxed, 245–247 resolution of duplicate or conflicting, unsolicited, 178–181 47–48 and window interactions, 174–181 semicolon-delimited values, 48–49 frame-src directive, 243 HEAD request (HTTP), 53 From-Origin header, 240 hexadecimal notation, 77, 112 FTP (File Transfer Protocol), 26n, 205–206 hierarchical file path, in URLs, 27–28 ftp: scheme, 36 history object (JavaScript), 108 full-screen mode, proposals for, 259 history.pushState() API, 256 fully qualified absolute URLs, 24 Hodges, Jeff, 248 fully restricted URL scheme, 188 homoglyphs, in Cyrillic alphabet, 35 functional notation, in CSS, 89 Host request header, 43 functions 
hostnames JavaScript, overriding, 102–103 extra periods, and cookie-setting resolution for JavaScript, 98–99 algorithms, 159 non-fully qualified, 159 G HTML (Hypertext Markup Language), 9, 69–86 Gabrilovich, Evgeniy, 35 basic concepts, 70–73 Gecko parsing engine, 70n case of tags, 72 Generalized Markup Language converting to plaintext, 85 (GML), 8–9 CSS interaction with, 90 geolocation data, 226 document misidentified as, 198 geolocation discovery, 258 document parsing modes, 71–72 geolocation-sharing prompts, 223 embedded in feed formats, 124 getComputedStyle API, 184 entity encoding, 76–78 getElementById() function, 109 explicit and implicit conditionals, 75–76
HTTP integration semantics, 78–79 images hyperlinking and content inclusion, bitmap, 118 79–84 in HTML, 83 in-browser sanitizers, 250–251 risk of content sniffing on, 202 mapping elements to document Scalable Vector Graphics (SVG), 83, namespace, 110 121–122 parser behavior, 73–76 image/svg+xml document type, 124 tag interactions, 74–75
tag (HTML), 83 type-specific content inclusion, 82–84 src parameter, 181 version 4, 12 for SVG images, 122 version 5, 70, 119, 131 implicit caching, 59 HTTP (HyperText Transfer Protocol), 9, implicit conditionals, in HTML, 75–76 41–67 @import, in CSS, 89–90 authentication, 62–63 IndexedDB design, 258 basic syntax, 42–51 indicator of hierarchical URLs, 25–26 binary, 257 information security, 1–8 caching behavior, 58–60 inheritance, for vbscript: scheme, 169–170 cookie semantics, 60–62 inline-script setting, 242n downgrade, 264 innerHTML property, 110–111 history, 41–42 innerStaticHTML API, 251 HTML integration semantics, 78–79 integer overflow, 266 newline handling, 45 Interactive Voice Response (IVR) proxy requests, 46–47 systems, 236 request types, 52–54 interconnected systems, losses in, 5 semantics battle, 72–73 internal networks, access to, 189–190 simultaneous connections, 216 Internal Revenue Service, 231 version 0.9, 42–43, 44 Internal Server Error (500), 56 version 1.0, 42, 43, 44, 48, 59 International Organization for Standardization (ISO), 11 version 1.1, 42–43, 45, 48, 57, 198 httponly flag, for cookie, 61, 150 Internationalized Domain Names in http: scheme, 36 Applications (IDNA), 34–35 HTTPS, 65 Internet Assigned Numbers Authority documents, 138n, 183 (IANA), 24, 152 downgrade risks, 248 Internet Engineering Task Force (IETF), 11 https: scheme, 36 Internet Explorer, 10, 11–12 hyperlinking, and content inclusion, 79–84 ActiveX and, 137 Hypertext Markup Language (HTML). and <% ... %> blocks, 75 See HTML (Hypertext Markup \ (backslash) in URLs, 29 Language) acceptance of backtick as quote, 74 HyperText Transfer Protocol (HTTP). 
See characters in URL scheme name HTTP (HyperText Transfer Protocol) ignored by, 25 clickjacking, 182 I content sniffing, 202 cookies, 149 IANA (Internet Assigned Numbers data: URLs in, 168 Authority), 24, 152 delete attempt of JavaScript function, 103 ICO file format, 83 extension matching, 202 IDNA (Internationalized Domain Names fallback display, 118 in Applications), 34–35 and file extensions in URLs, 130 IETF (Internet Engineering Task Force), 11 frames, 177 If-Modified-Since header, 59 JavaScript in, 96 If-None-Match header, 59 JSON.parse() function alternative, 104
Internet Explorer (continued) JavaScript, 10, 11n, 83, 95–107 and multiline headers, 45 character encoding in, 112–113 multiline string literals support, 91 code and object inspection capabilities, non-recognition of vertical tab, 112 101–102 NUL character and, 73, 74 code execution, 100 origin check and port number, 142 code inclusion modes and nesting risks, printable characters in, 32 113–114 proprietary security-restricted document.domain property, 143–144 parameter, 246 Document Object Model, 12, 108, redirects to about:blank, 166–167 109–111 and RFC 2047 encoding, 50 embedded in PDF documents, 130 same-origin policy and, 143n, 185 execution order control, 100–101 Silverlight and, 134 labeled statements support, 105n stored password retrieval, 228 MIME type, 118n SWF file handling without Netscape and, 95–96 Content-Type, 199 runtime environment for, 102–104 text/plain document type, 200–201 script processing model, 97–100 third-party cookies blocking, 193 setters and getters, 103 time limits on continuously executing standard object hierarchy, 107–112 scripts, 215 variable declaration, 99 Trident parsing engine, 70n and WML Script (WMLS), 123 VBScript, 96, 114 JavaScript Object Notation (JSON), window.open() function and, 218 104–106, 112 Windows Presentation Foundation javascript: scheme, 37, 169–170 plug-ins, 136 Jobs, Steve, 131 XDomainRequest approach to, 148 JPEG file format, 83 XSS-detection logic, 251 JScript, 11n Zone.Identifier metadata, 231 JScript.Encode, 113n zone model, 229–231 JSObject mechanism, 158 Internet Information Server, and Host JSON (JavaScript Object Notation), headers, 47 104–106, 112 Internet service providers, 153 JSONP (JSON with padding), 106n, 245 Internet zone, for Internet Explorer, 230 JSON.parse() function, alternatives, 104 interstitials, 218 intrusions K escalation of, 5 nonmonetary costs, 5 Kaminsky, Dan, 153 Invisible Gorilla experiment, 223 katakana, 33 IP addresses, and cookies, 158 keepalive sessions, 56–57, 216 ISO 
(International Organization for Standardization), 11 keystroke redirection, 180 Kinugawa, Masato, 210 ISO-8859-1 (Western European code page), 50 L itms: scheme, 36 itpc: scheme, 36 language parameter, for