Yarl Documentation Release 1.6.3
yarl Documentation, Release 1.6.3
Andrew Svetlov
Sep 15, 2021

CONTENTS

1 Introduction
2 Installation
3 Dependencies
4 API documentation
5 Comparison with other URL libraries
6 Why isn’t boolean supported by the URL query API?
7 Source code
8 Discussion list
9 Authors and License
  9.1 Public API
10 Indices and tables
Python Module Index
Index

The module provides a handy URL class for URL parsing and changing.

CHAPTER ONE
INTRODUCTION

A URL is constructed from a str:

    >>> from yarl import URL
    >>> url = URL('https://www.python.org/~guido?arg=1#frag')
    >>> url
    URL('https://www.python.org/~guido?arg=1#frag')

All URL parts (scheme, user, password, host, port, path, query and fragment) are accessible as properties:

    >>> url.scheme
    'https'
    >>> url.host
    'www.python.org'
    >>> url.path
    '/~guido'
    >>> url.query_string
    'arg=1'
    >>> url.query
    <MultiDictProxy('arg': '1')>
    >>> url.fragment
    'frag'

All URL manipulations produce a new URL object:

    >>> url.parent / 'downloads/source'
    URL('https://www.python.org/downloads/source')

New URLs can be derived from an existing one with the / and % operators:

    >>> url = URL('https://www.python.org')
    >>> url / 'foo' / 'bar'
    URL('https://www.python.org/foo/bar')
    >>> url / 'foo' % {'bar': 'baz'}
    URL('https://www.python.org/foo?bar=baz')

Strings passed to the constructor and to modification methods are automatically encoded, giving a canonical representation as the result:

    >>> url = URL('https://www.python.org/путь')
    >>> url
    URL('https://www.python.org/%D0%BF%D1%83%D1%82%D1%8C')

Regular properties are percent-decoded; use the raw_ versions to get encoded strings:

    >>> url.path
    '/путь'
    >>> url.raw_path
    '/%D0%BF%D1%83%D1%82%D1%8C'

A human-readable representation of the URL is available as human_repr():

    >>> url.human_repr()
    'https://www.python.org/путь'

For full documentation please read the Public API
section.

CHAPTER TWO
INSTALLATION

    $ pip install yarl

The library is Python 3 only!

PyPI contains binary wheels for Linux, Windows and macOS. If you want to install yarl on another operating system (for example Alpine Linux, which is not manylinux-compliant because it lacks glibc and therefore cannot use our wheels), the tarball will be used to compile the library from source. This requires a C compiler and the Python headers to be installed.

To skip the compilation you must explicitly opt in by setting the YARL_NO_EXTENSIONS environment variable to a non-empty value, e.g.:

    $ YARL_NO_EXTENSIONS=1 pip install yarl

Please note that the pure-Python (uncompiled) version is much slower. However, PyPy always uses a pure-Python implementation and, as such, is unaffected by this variable.

CHAPTER THREE
DEPENDENCIES

yarl requires the multidict library, which is installed automatically.

CHAPTER FOUR
API DOCUMENTATION

See the Public API section for the full list of available methods.

CHAPTER FIVE
COMPARISON WITH OTHER URL LIBRARIES

• furl (https://pypi.python.org/pypi/furl)
  The library has rich functionality, but the furl object is mutable. I am afraid to pass this object into foreign code: who knows if the code will modify my URL in a terrible way while I just want to send a URL with handy helpers for accessing URL properties. furl has other non-obvious tricky things, but the main objection is mutability.

• URLObject (https://pypi.python.org/pypi/URLObject)
  URLObject is immutable, that’s pretty good. Every URL change generates a new URL object. But the library doesn’t do any decode/encode transformations, leaving the end user to cope with these gory details.

CHAPTER SIX
WHY ISN’T BOOLEAN SUPPORTED BY THE URL QUERY API?

There is no standard representation for boolean values. Some systems prefer true/false, others like yes/no, on/off, Y/N, 1/0, etc.

yarl cannot make an unambiguous decision on how to serialize bool values because it is specific to how the end-user’s application is built and would be different for different apps. The library doesn’t accept booleans in the API; a user should convert bools into strings using their own preferred translation protocol.

CHAPTER SEVEN
SOURCE CODE

The project is hosted on GitHub.

Please file an issue on the bug tracker if you have found a bug or have a suggestion for improving the library.

The library uses Azure Pipelines for Continuous Integration.

CHAPTER EIGHT
DISCUSSION LIST

aio-libs google group: https://groups.google.com/forum/#!forum/aio-libs

Feel free to post your questions and ideas here.

CHAPTER NINE
AUTHORS AND LICENSE

The yarl package is written by Andrew Svetlov.

It’s Apache 2 licensed and freely available.

Contents:

9.1 Public API

The only public yarl class is URL:

    >>> from yarl import URL

class yarl.URL(arg, *, encoded=False)

Represents a URL as

    [scheme:]//[user[:password]@]host[:port][/path][?query][#fragment]

for absolute URLs and

    [/path][?query][#fragment]

for relative ones (see Absolute and relative URLs).

The URL structure is:

    http://user:pass@example.com:8042/over/there?name=ferret#nose
    \__/   \__/ \__/ \_________/ \__/\_________/ \_________/ \__/
     |      |    |        |       |       |           |       |
    scheme user password host    port   path        query  fragment

Internally all data are stored as percent-encoded strings for the user, path, query and fragment URL parts and IDNA-encoded (RFC 5891) for the host.
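As an illustration of the two encodings named above, here is a minimal sketch that uses only the Python standard library (urllib.parse.quote and the built-in idna codec). It is not yarl's internal implementation, and the sample strings, including the host пример.example, are assumptions chosen for the example.

```python
from urllib.parse import quote, unquote

# Percent-encoding: non-ASCII text is encoded as UTF-8, then every byte
# becomes a %XX token. "/" is left untouched by quote() by default.
path = "/путь"
encoded_path = quote(path)
print(encoded_path)                    # /%D0%BF%D1%83%D1%82%D1%8C
print(unquote(encoded_path) == path)   # True

# IDNA: each non-ASCII host label is converted to an ASCII "xn--" label.
# Assumption to keep in mind: the built-in codec follows the older
# IDNA 2003 rules, while the text cites RFC 5891 (IDNA 2008); the two
# can disagree for some labels.
host = "пример.example"
encoded_host = host.encode("idna").decode("ascii")
print(encoded_host.startswith("xn--"))                       # True
print(encoded_host.encode("ascii").decode("idna") == host)   # True
```

Both transformations are reversible, which is why a library can store the encoded form and still offer decoded properties.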
Constructor and modification operators perform encoding for all parts automatically. The library assumes all data uses UTF-8 for percent-encoded tokens.

    >>> URL('http://example.com/path/to/?arg1=a&arg2=b#fragment')
    URL('http://example.com/path/to/?arg1=a&arg2=b#fragment')

If the URL contains only ASCII characters, nothing changes. For non-ASCII characters, encoding is applied:

    >>> str(URL('http://εμπορικόσήμα.eu/путь/這裡'))
    'http://xn--jxagkqfkduily1i.eu/%D0%BF%D1%83%D1%82%D1%8C/%E9%80%99%E8%A3%A1'

The same is true for the user, password, query and fragment parts of the URL.

An already encoded URL is not changed:

    >>> URL('http://xn--jxagkqfkduily1i.eu')
    URL('http://xn--jxagkqfkduily1i.eu')

Use URL.human_repr() to get a human-readable representation:

    >>> url = URL('http://εμπορικόσήμα.eu/путь/這裡')
    >>> str(url)
    'http://xn--jxagkqfkduily1i.eu/%D0%BF%D1%83%D1%82%D1%8C/%E9%80%99%E8%A3%A1'
    >>> url.human_repr()
    'http://εμπορικόσήμα.eu/путь/這裡'

Note: Sometimes the encoding performed by yarl is not acceptable to a particular web server. Passing encoded=True prevents URL auto-encoding; the user is then responsible for URL correctness. Don’t use this option unless there is no other way to keep URL attributes untouched. URL manipulations don’t guarantee correct encoding: URL parts may be re-quoted even if the encoded parameter was explicitly set.

9.1.1 URL properties

There are two kinds of properties: decoded and encoded (with raw_ prefix):

URL.scheme
    Scheme for absolute URLs, empty string for relative URLs or URLs starting with ‘//’ (see Absolute and relative URLs).

    >>> URL('http://example.com').scheme
    'http'
    >>> URL('//example.com').scheme
    ''
    >>> URL('page.html').scheme
    ''

URL.user
    Decoded user part of URL, None if user is missing.

    >>> URL('http://john@example.com').user
    'john'
    >>> URL('http://андрей@example.com').user
    'андрей'
    >>> URL('http://example.com').user is None
    True

URL.raw_user
    Encoded user part of URL, None if user is missing.

    >>> URL('http://андрей@example.com').raw_user
    '%D0%B0%D0%BD%D0%B4%D1%80%D0%B5%D0%B9'
    >>> URL('http://example.com').raw_user is None
    True

URL.password
    Decoded password part of URL, None if password is missing.

    >>> URL('http://john:pass@example.com').password
    'pass'
    >>> URL('http://андрей:пароль@example.com').password
    'пароль'
    >>> URL('http://example.com').password is None
    True

URL.raw_password
    Encoded password part of URL, None if password is missing.

    >>> URL('http://user:пароль@example.com').raw_password
    '%D0%BF%D0%B0%D1%80%D0%BE%D0%BB%D1%8C'

URL.host
    IDNA-decoded host part of URL, None for relative URLs (see Absolute and relative URLs).

    Brackets are stripped for IPv6. The host is converted to lowercase; an IP address is validated and converted to compressed form.

    >>> URL('http://example.com').host
    'example.com'
    >>> URL('http://хост.домен').host
    'хост.домен'
    >>> URL('page.html').host is None
    True
    >>> URL('http://[::1]').host
    '::1'

URL.raw_host
    IDNA-encoded host part of URL, None for relative URLs (see Absolute and relative URLs).

    >>> URL('http://хост.домен').raw_host
    'xn--n1agdj.xn--d1acufc'

URL.port
    Port part of URL, with scheme-based fallback.

    None for relative URLs (see Absolute and relative URLs) or for URLs without an explicit port whose URL.scheme has no default port.

    >>> URL('http://example.com:8080').port
    8080
    >>> URL('http://example.com').port
    80
    >>> URL('page.html').port is None
    True

URL.explicit_port
    Explicit port part of URL, without scheme-based fallback.

    None for relative URLs (see Absolute and relative URLs) or for URLs without an explicit port.

    >>> URL('http://example.com:8080').explicit_port
    8080
    >>> URL('http://example.com').explicit_port is None
    True
    >>> URL('page.html').explicit_port is None
    True

    New in version 1.3.

URL.authority
    Decoded authority part of URL, a combination of user, password, host, and port.

    authority = [ user [ ":" password ] "@" ] host [ ":" port ].

    authority is an empty string if all parts are missing.
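
The authority grammar above can be read as a small assembly rule. The following sketch spells it out with a hypothetical helper, build_authority, which is not part of yarl and is not how the library computes URL.authority; it only mirrors the grammar, and it deliberately omits details such as bracketing IPv6 hosts.

```python
from typing import Optional


def build_authority(
    user: Optional[str] = None,
    password: Optional[str] = None,
    host: Optional[str] = None,
    port: Optional[int] = None,
) -> str:
    """Hypothetical helper that follows
    authority = [ user [ ":" password ] "@" ] host [ ":" port ]
    """
    parts = []
    if user is not None:
        parts.append(user)
        # Per the grammar, a password can only appear after a user.
        if password is not None:
            parts.append(":" + password)
        parts.append("@")
    if host is not None:
        parts.append(host)
    if port is not None:
        parts.append(f":{port}")
    # With every part missing, the result is the empty string.
    return "".join(parts)


print(build_authority("john", "pass", "example.com", 8080))
print(build_authority(host="example.com"))
print(repr(build_authority()))
```

The three calls print john:pass@example.com:8080, example.com and '' respectively, matching the rule that the authority is empty when all parts are missing.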