F.3.2. XML External Entity (XXE) Attacks
Total Page:16
File Type:pdf, Size:1020Kb
Master Thesis Security Implications of DTD Attacks Against a Wide Range of XML Parsers Christopher Späth Date: 20.10.2015 Supervisor: Christian Mainka, Vladislav Mladenov, Jörg Schwenk Ruhr-University Bochum, Germany Chair for Network and Data Security Prof. Dr. Jörg Schwenk Homepage: www.nds.rub.de Abstract The Extensible Markup Language (XML) is extensively used today in applications, protocols and databases. It is platform independent and allows description and easy access to structured data. Technically, a parser processes an input XML document to translate a byte-stream into a structure that can be used by APIs in programming languages like Python, Java or PHP. DTDs (DTDs) enable the author of the XML document to describe the structure (grammar) of the XML document and therefore check the document for validity. Besides a validity check DTDs also introduce basics storage units, called entities. These are the cause of a series of vulnerabilities, which can be assigned to different classes. (1.) Denial-of-Service (DoS) attacks (billion laugh attack) force the parser to process an input XML document of several hundred bytes, which then expands its size during processing up to several tera- or petabytes of data. (2.) XML External En- tity (XXE) attacks aim to read arbitrary data on a server’s filesystem by only invoking the parsing process of an XML document. (3.) URL Invocation (URL Invocation) attacks use the parser to invoke a request on another system, which can be any system the server can access. This is highly problematic because an attacker may retrieve information about internal systems or misuse the server for Server Side Request Forgery Attacks. DoS and XXE were first discovered back in 2002, and so it is surprisingly that the at- tacks are still working more than a decade later, even for companies like Facebook [Sil], Google [det], SAP NetWeaver [erpa] and Mobile Platform [erpb], Apple Office Viewer [app], mediawiki [wik], Adobe [thr] and Samsung SmartTVs [NSMS15]. In this presentation we contribute a comprehensive security analysis of 27 XML parsers in six popular pro- gramming languages: Ruby, .NET, PHP, Java, Python, Perl. We identify each parser’s default behavior with regard to security relevant parsing features by using 16 core tests. These core tests address DoS (3 tests), XXE (5 tests), URL Invocation (6 tests), XML Inclusions (XInclude) (1 test) and Extensible Stylesheet Language Transformations (XSLT) (1 test). We compute a “base vulnerability score” from these test re- sults which is independent of the programming language and the parser. By extending these core tests with parser-specific tests, we investigate the implication of individual parser features and their interaction with one another on the overall security. This leads to an overall test set of more than 800 tests and on average 50 tests per parser. With this accumulated information we calculate another parser-dependant “extended vul- nerability score”. These two scores enable a party to quickly make a sound decision regarding the security of an XML parser. Using the results of individual tests, a party can identify problematic settings or appropriate countermeasures for specific parser configurations. Keywords: XML parsing, DoS, XXE, Parameter Entities, XInclude, URL Invocation Contents List of Figures............................................. xiii List of Tables............................................. xv Glossary xvii Acronyms 1 1. Introduction 2 1.1. Motivation............................................3 1.1.1. Topics Beyond the Scope................................4 2. Related Work 5 2.1. Existing Contributions.....................................5 2.2. Reported Attacks........................................6 2.3. Contribution and Motivation..................................6 3. XML Basics 7 3.1. XML Language Constructs...................................7 3.1.1. Documents.......................................8 3.1.2. Well-formedness....................................8 3.1.3. Validity.........................................9 3.1.4. Characters....................................... 10 3.1.5. Common Syntactic Constructs............................. 10 3.1.6. Character Data and Markup.............................. 11 3.1.7. Processing Instructions................................. 11 3.1.8. CDATA Sections.................................... 12 3.1.9. Prolog and Document Type Declaration........................ 12 3.1.10. Standalone Document Declaration........................... 14 3.1.11. Validating and Non-Validating Processors....................... 15 3.2. Logical Structure........................................ 16 3.2.1. Elements and Attributes................................ 16 3.2.2. Element Type Declarations............................... 18 3.2.3. Attribute-List Declarations............................... 19 Contents iv 3.3. Physical Structure........................................ 22 3.3.1. Character References.................................. 22 3.3.2. Internal General Entity................................. 23 3.3.3. External General Entity................................. 25 3.3.4. Internal Parameter Entity................................ 26 3.3.5. External Parameter Entity............................... 28 3.3.6. Unparsed Entity.................................... 30 3.4. Processing of Entities...................................... 32 3.4.1. Character Reference.................................. 33 3.4.2. Internal General Entity................................. 36 3.4.3. External General Entity................................. 39 3.4.4. Parameter Entities................................... 41 3.4.5. Unparsed Entities.................................... 46 3.5. XInclude............................................ 47 3.6. XSLT.............................................. 50 3.7. XML Parsing.......................................... 51 3.7.1. DOM.......................................... 51 3.7.2. SAX.......................................... 52 3.7.3. StAX Parser...................................... 56 4. Attacks 57 4.1. Denial-of-Service Attacks................................... 57 4.2. XML External Entity (XXE) Attacks.............................. 62 4.3. XXE Attacks Based on Parameter Entity............................ 64 4.3.1. Reading out arbitrary files............................... 64 4.3.2. Bypassing [WFC: No External Entity References].................. 67 4.3.3. XXE Out-of-Band Attacks............................... 70 4.3.4. XXE Out-of-Band Attacks - SchemaEntity...................... 74 4.4. URL Invocation Attacks.................................... 77 4.5. XInclude Attacks........................................ 79 4.6. XSLT Attacks.......................................... 81 5. Methodology 82 5.1. General............................................. 82 5.2. Denial-of-Service Methodology................................ 85 5.3. XML External Entity (XXE) Methodology........................... 87 5.4. XXE Based on Parameter Entities Methodology........................ 89 5.5. URL Invocation Methodology................................. 92 5.6. XInclude Methodology..................................... 97 Contents v 5.7. XSLT Methodology....................................... 98 6. Ruby 99 6.1. Installation and Setup...................................... 99 6.1.1. Installation Nokogiri.................................. 100 6.1.2. Network and Firewall.................................. 100 6.2. Overview of Parsers....................................... 100 6.3. REXML............................................. 101 6.3.1. Introduction to REXML................................ 101 6.3.2. Available Settings of REXML............................. 102 6.3.3. Impact......................................... 106 6.4. Nokogiri............................................. 109 6.4.1. Introduction to Nokogiri................................ 109 6.4.2. Available Settings of Nokogiri............................. 111 6.4.3. Impact......................................... 112 6.5. Impact of all Ruby Parsers................................... 116 7. Python 118 7.1. Installation and Setup...................................... 118 7.1.1. Installation Requests.................................. 118 7.1.2. Installation Defusedxml................................ 119 7.1.3. Installation lxml.................................... 119 7.1.4. Network and Firewall.................................. 120 7.2. Overview of Parsers....................................... 120 7.3. Etree............................................... 121 7.3.1. Introduction to etree.................................. 121 7.3.2. Available Settings of etree............................... 122 7.3.3. Impact......................................... 124 7.4. minidom............................................. 126 7.4.1. Introduction to minidom................................ 126 7.4.2. Available Settings of minidom............................. 127 7.4.3. Impact......................................... 128 7.5. xml.sax............................................. 130 7.5.1. Introduction to xml.sax................................. 130 7.5.2. Available Settings of xml.sax.............................. 131 7.5.3. Impact......................................... 132 7.6. pulldom............................................. 135 7.6.1. Introduction to pulldom................................ 135 7.6.2. Available Settings of pulldom............................. 136 Contents vi