Scrapy Documentation Release 2.4.1

Scrapy developers
Apr 06, 2021

FIRST STEPS

1 Getting help
2 First steps
    2.1 Scrapy at a glance
    2.2 Installation guide
    2.3 Scrapy Tutorial
    2.4 Examples
3 Basic concepts
    3.1 Command line tool
    3.2 Spiders
    3.3 Selectors
    3.4 Items
    3.5 Item Loaders
    3.6 Scrapy shell
    3.7 Item Pipeline
    3.8 Feed exports
    3.9 Requests and Responses
    3.10 Link Extractors
    3.11 Settings
    3.12 Exceptions
4 Built-in services
    4.1 Logging
    4.2 Stats Collection
    4.3 Sending e-mail
    4.4 Telnet Console
    4.5 Web Service
5 Solving specific problems
    5.1 Frequently Asked Questions
    5.2 Debugging Spiders
    5.3 Spiders Contracts
    5.4 Common Practices
    5.5 Broad Crawls
    5.6 Using your browser’s Developer Tools for scraping
    5.7 Selecting dynamically-loaded content
    5.8 Debugging memory leaks
    5.9 Downloading and processing files and images
    5.10 Deploying Spiders
    5.11 AutoThrottle extension
    5.12 Benchmarking
    5.13 Jobs: pausing and resuming crawls
    5.14 Coroutines
    5.15 asyncio
6 Extending Scrapy
    6.1 Architecture overview
    6.2 Downloader Middleware
    6.3 Spider Middleware
    6.4 Extensions
    6.5 Core API
    6.6 Signals
    6.7 Item Exporters
7 All the rest
    7.1 Release notes
    7.2 Contributing to Scrapy
    7.3 Versioning and API stability
Python Module Index
Index

Scrapy is a fast high-level web crawling and web scraping framework, used to crawl websites and extract structured data from their pages. It can be used for a wide range of purposes, from data mining to monitoring and automated testing.

CHAPTER ONE
GETTING HELP

Having trouble? We’d like to help!

• Try the FAQ – it’s got answers to some common questions.
• Looking for specific information? Try the Index or the Python Module Index.
• Ask or search questions on Stack Overflow using the scrapy tag.
• Ask or search questions in the Scrapy subreddit.
• Search for questions in the archives of the scrapy-users mailing list.
• Ask a question in the #scrapy IRC channel.
• Report bugs with Scrapy in our issue tracker.

CHAPTER TWO
FIRST STEPS

2.1 Scrapy at a glance

Scrapy is an application framework for crawling web sites and extracting structured data, which can be used for a wide range of useful applications, like data mining, information processing or historical archival.

Even though Scrapy was originally designed for web scraping, it can also be used to extract data using APIs (such as Amazon Associates Web Services) or as a general-purpose web crawler.

2.1.1 Walk-through of an example spider

To show you what Scrapy brings to the table, we’ll walk you through an example of a Scrapy Spider using the simplest way to run a spider.
Here’s the code for a spider that scrapes famous quotes from the website http://quotes.toscrape.com, following the pagination:

    import scrapy


    class QuotesSpider(scrapy.Spider):
        name = 'quotes'
        start_urls = [
            'http://quotes.toscrape.com/tag/humor/',
        ]

        def parse(self, response):
            # Each quote on the page sits in its own <div class="quote">.
            for quote in response.css('div.quote'):
                yield {
                    'author': quote.xpath('span/small/text()').get(),
                    'text': quote.css('span.text::text').get(),
                }

            # Follow the "next" link, if any, and parse that page the same way.
            next_page = response.css('li.next a::attr("href")').get()
            if next_page is not None:
                yield response.follow(next_page, self.parse)

Put this in a text file, name it something like quotes_spider.py and run the spider using the runspider command:

    scrapy runspider quotes_spider.py -o quotes.jl

When this finishes you will have in the quotes.jl file a list of the quotes in JSON Lines format, containing text and author, looking like this:

    {"author": "Jane Austen", "text": "\u201cThe person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.\u201d"}
    {"author": "Steve Martin", "text": "\u201cA day without sunshine is like, you know, night.\u201d"}
    {"author": "Garrison Keillor", "text": "\u201cAnyone who thinks sitting in church can make you a Christian must also think that sitting in a garage can make you a car.\u201d"}
    ...

What just happened?

When you ran the command scrapy runspider quotes_spider.py, Scrapy looked for a Spider definition inside it and ran it through its crawler engine.

The crawl started by making requests to the URLs defined in the start_urls attribute (in this case, only the URL for quotes in the humor category) and called the default callback method parse, passing the response object as an argument. In the parse callback, we loop through the quote elements using a CSS Selector, yield a Python dict with the extracted quote text and author, look for a link to the next page and schedule another request using the same parse method as callback.
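The JSON Lines output described above holds one JSON object per line, so it can be read back with the standard library alone. The following is a minimal sketch, not part of Scrapy: the helper names read_json_lines and quotes_by_author are hypothetical, and the code assumes each line carries the author and text keys produced by the example spider.

```python
import json
from pathlib import Path


def read_json_lines(path):
    """Yield one dict per non-empty line of a JSON Lines file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)


def quotes_by_author(path):
    """Group quote texts by author (hypothetical helper for illustration)."""
    grouped = {}
    for item in read_json_lines(path):
        grouped.setdefault(item["author"], []).append(item["text"])
    return grouped
```

Calling quotes_by_author("quotes.jl") after the crawl would return a dict mapping each author to the list of their scraped quotes. Reading line by line keeps memory use flat even for large exports, which is one reason JSON Lines suits crawl output.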
Here you notice one of the main advantages of Scrapy: requests are scheduled and processed asynchronously. This means that Scrapy doesn’t need to wait for a request to be finished and processed; it can send another request or do other things in the meantime. This also means that other requests can keep going even if some request fails or an error happens while handling it.

While this enables you to do very fast crawls (sending multiple concurrent requests at the same time, in a fault-tolerant way), Scrapy also gives you control over the politeness of the crawl through a few settings. You can do things like setting a download delay between each request, limiting the number of concurrent requests per domain or per IP, and even using an auto-throttling extension that tries to figure these out automatically.

Note: This is using feed exports to generate the JSON file; you can easily change the export format (XML or CSV, for example) or the storage backend (FTP or Amazon S3, for example). You can also write an item pipeline to store the items in a database.

2.1.2 What else?

You’ve seen how to extract and store items from a website using Scrapy, but this is just the surface. Scrapy provides a lot of powerful features for making scraping easy and efficient, such as:

• Built-in support for selecting and extracting data from HTML/XML sources using extended CSS selectors and XPath expressions, with helper methods to extract using regular expressions.
• An interactive shell console (IPython aware) for trying out the CSS and XPath expressions to scrape data, very useful when writing or debugging your spiders.
• Built-in support for generating feed exports in multiple formats (JSON, CSV, XML) and storing them in multiple backends (FTP, S3, local filesystem).
• Robust encoding support and auto-detection, for dealing with foreign, non-standard and broken encoding declarations.
• Strong extensibility support, allowing you to plug in your own functionality using signals and a well-defined API (middlewares, extensions, and pipelines).
• Wide range of built-in extensions and middlewares for handling:
    – cookies and session handling
    – HTTP features like compression, authentication, caching
    – user-agent spoofing
    – robots.txt
    – crawl depth restriction
    – and more
• A Telnet console for hooking into a Python console running inside your Scrapy process, to introspect and debug your crawler
• Plus other goodies like reusable spiders to crawl sites from Sitemaps and XML/CSV feeds, a media pipeline for automatically downloading images (or any other media) associated with the scraped items,
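The politeness controls mentioned in the walk-through (a download delay, per-domain and per-IP concurrency limits, and auto-throttling) correspond to Scrapy settings. The sketch below collects them in a plain dict; the name POLITE_SETTINGS and all of the values are assumptions chosen for illustration, not recommendations from the documentation.

```python
# Illustrative values only: tune them for the site being crawled.
POLITE_SETTINGS = {
    # Seconds to wait between consecutive requests to the same site.
    "DOWNLOAD_DELAY": 1.0,
    # Upper bound on simultaneous requests to any single domain.
    "CONCURRENT_REQUESTS_PER_DOMAIN": 4,
    # 0 leaves the per-IP limit off, so the per-domain limit applies.
    "CONCURRENT_REQUESTS_PER_IP": 0,
    # Let the AutoThrottle extension adjust the delay automatically.
    "AUTOTHROTTLE_ENABLED": True,
    # Initial delay, and the ceiling AutoThrottle may back off to.
    "AUTOTHROTTLE_START_DELAY": 1.0,
    "AUTOTHROTTLE_MAX_DELAY": 30.0,
}
```

One way to apply such a dict is to assign it to the custom_settings class attribute of a spider, which scopes the values to that spider; the same keys can instead be set as module-level names in a project's settings.py.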