Web Information System Design No.5 Accessing Web Documents

Tatsuya Hagino

Web Basic Components

  • Prepare documents using HTML and CSS.
  • Address documents by URL.
  • Get documents from the server using HTTP.

(Figure: a Web browser retrieves HTML documents, addressed by URL, from a Web server over the Internet using HTTP.)

URI (Uniform Resource Identifier)

  • Identifier: a name or number that identifies a thing (object) or a concept.
  • Identifiers which we use:
      • identifier for students
      • identifier for books
      • identifier for phones
      • identifier for PCs
      • identifier in the Internet

(Figure: an identifier refers to an object or a concept.)

URL, URN and IRI

  • URL (Uniform Resource Locator)
      • A URI that points to the address of a resource
      • Examples: http, ftp, file, mailto
  • URN (Uniform Resource Name)
      • urn:<nid>:<nss>
      • The NID needs to be registered with IANA: http://www.iana.org/assignments/urn-namespaces/urn-namespaces.xml
      • 50 URN namespaces are registered (as of 2014-04-17)
      • Examples: 3gpp, cablelabs, cgi, clei, dgiwg, dvb, ebu, epc, epcglobal, eurosystem, example, fdc, fipa, geant, gsma, ietf, iptc, isan, isbn, iso, issn, ivis, liberty, mace, mpeg, nbn, nena, newsml, nfc, nzl, oasis, ogc, ogf, oid, oipf, oma, pin, publicid, s1000d, schac, service, smpte, swift, tva, uci, ucode, uuid, web3d, xmlorg, xmpp
  • IRI (Internationalized Resource Identifier)
      • Internationalized URI
      • A URI allows only ASCII characters
      • An IRI allows Unicode characters

URI Generic Syntax

    http://www.sfc.keio.ac.jp/teacher/hagino.html?title=web#lecture

In this example the scheme is http, the authority is www.sfc.keio.ac.jp, the path is /teacher/hagino.html, the query is title=web, and the fragment is lecture.

  • Scheme: type of URI; protocol
  • Authority: host name; server name
  • Path: address inside the authority; file name
  • Query: query words; parameters for interaction
  • Fragment: position inside the document

Axioms for URI

  • Universality: Every Web resource has a URI.
  • Global Scope: A URI always has the same meaning regardless of context. It is unique in the global scope.
  • Sameness: A URI always means the same thing. Representations may vary, but the meaning is the same.
  • Opacity: A URI itself does not show its resource type. The resource type and details need to be obtained from its representation.

Concept, Identifier and Representation

  • Concept (resource): Keio University
  • Identifier (URI, Uniform Resource Identifier): http://www.keio.ac.jp/index.html
  • Representation: HTML+CSS, XML, RDF, ...

An HTML representation of the resource:

    <!DOCTYPE html>
    <html>
    <head>
    <title>Keio University</title>
    </head>
    <body>
    <h1>Keio University</h1>
    ...
    </body>
    </html>

Use of URI

  • Web page address
      • URL: http://www.sfc.keio.ac.jp/about_sfc/video.html
  • Specification
      • DTD: http://www.w3.org/TR/html4/loose.dtd
      • RFC: urn:ietf:rfc:2141
  • Any Web resource
      • City: http://ja.dbpedia.org/page/東京都
      • Person: http://ja.dbpedia.org/page/安倍晋三

HTTP (Hypertext Transfer Protocol)

  • Protocol for manipulating Web resources
  • Five main methods:
      • HEAD: get information about the resource
      • GET: get a representation of the resource
      • PUT: create or update the resource
      • DELETE: delete the resource
      • POST: send data to the resource for processing

(Figure: the methods act on a Web resource: obtain (HEAD, GET), process (POST), update (PUT, DELETE).)

HTTP and FTP

  • FTP
      • For remote file manipulation
      • Used since the beginning of the Internet
      • The user needs to log in to the FTP server
      • Uses two TCP connections (control and data)
      • Supports only two types: text or binary
      • Anonymous FTP is used for software distribution
  • HTTP
      • For Web resource manipulation
      • No need for user authentication
      • Uses a single TCP connection
      • Multimedia support

HTTP Request and Response

The user agent (browser) sends an HTTP request to the Web server, and the server returns an HTTP response for the Web resource.

HTTP request:

    <method> <URI> HTTP/1.0
    <header1>: <value1>
    <header2>: <value2>
    ...

    <body>
    ...

HTTP response:

    HTTP/1.1 <status code> <reason>
    <header1>: <value1>
    <header2>: <value2>
    ...

    <body>
    ...

    status code   meaning
    200           OK
    301           Moved Permanently
    303           See Other
    401           Unauthorized
    403           Forbidden
    404           Not Found

GET and HEAD Methods

  • GET method
      • Obtain a representation of the resource (for example, an HTML document or a movie)
      • Content negotiation
      • Language negotiation (for example, Japanese or English)
  • HEAD method
      • Obtain information about the resource and its representation
      • Subset of GET
  • Properties of the GET method
      • GET is safe to use multiple times
      • GET is idempotent: GET × GET = GET
      • GET has no side effects

Content Negotiation

  • Document/image format
      • Request: GET keioPenMark with Accept: image/png
      • Available on the server: keioPenMark.jpg, keioPenMark.png, keioPenMark.gif
      • Returned to the user agent: keioPenMark.png
  • Language
      • Request: GET index.html with Accept-Language: ja, en-us;q=0.8, en;q=0.7
      • Available on the server: index.html.en, index.html.ja, index.html.kr
      • Returned to the user agent: index.html.ja

PUT and POST Methods

  • PUT method
      • Create or update the resource at the URI
      • Create or update a Web page
      • Inverse of GET
      • Browsers do not use the PUT method
  • POST method
      • Send data to the resource at the URI
      • The resource processes the data
      • Used for FORM interaction
  • GET vs POST
      • Choose GET or POST in the FORM method attribute
      • Use GET for operations without side effects
      • Use POST for updating a resource with side effects
      • POST is not idempotent: POST × POST ≠ POST

Other Features of HTTP

  • Transfer
      • Page move
  • Effective use of TCP/IP
      • Persistent connection (keep-alive)
      • Pipeline processing
  • Authentication
      • User name and password
      • Basic and Digest authentication
  • Proxy cache control
      • max-age
      • no-cache
      • public and private
  • Virtual Host
      • Serve multiple hosts with a single server
      • Use DNS aliases
  • Extension to WebDAV
      • COPY, MOVE, LOCK, UNLOCK

Summary

  • URI
      • Identifier for the Web
      • URL, URN, IRI
  • HTTP
      • Protocol for Web resource manipulation
      • Methods: HEAD, GET, PUT, DELETE, POST

Group Work

  • Group work
      • Three students in one group
  • Title: Proposal of HTML6
      • List problems of current HTML and propose a solution as HTML6.
      • Problems may concern the HTML document format, features, server environment, programming environment, and so on.
  • Presentation
      • 5-minute presentation
      • The presentation material must be uploaded to SFC-SFS before noon on the presentation day.
  • Date
      • Presentation on June 1st.
      • If there is no class because of the Keio-Waseda baseball match, the presentation is on June 8th.
  • Note
      • The title slide must include the list of members (those who contributed to the work).
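
Worked example for URI Generic Syntax. The five components named on that slide can be recovered with Python's standard urllib.parse module. This is a minimal sketch: the helper name split_uri is an assumption made for illustration and is not part of the lecture material.

```python
from urllib.parse import urlsplit, parse_qs


def split_uri(uri):
    """Split a URI into the five generic-syntax components.

    split_uri is a hypothetical helper name chosen for this sketch.
    """
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,          # type of URI / protocol
        "authority": parts.netloc,       # host name / server name
        "path": parts.path,              # address inside the authority
        "query": parse_qs(parts.query),  # parameters for interaction
        "fragment": parts.fragment,      # position inside the document
    }


# The example URI from the lecture slide.
components = split_uri(
    "http://www.sfc.keio.ac.jp/teacher/hagino.html?title=web#lecture"
)
for name, value in components.items():
    print(f"{name}: {value}")
```

Note that parse_qs returns each query value as a list, because the same parameter name may appear more than once in a query.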
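
Worked example for HTTP Request and Response. The message layout on that slide (a start line, header lines, an empty line, then the body) can be sketched with plain string handling. The function names build_request and parse_response, and the sample response text, are assumptions made for illustration; real clients would normally rely on a library such as http.client instead.

```python
def build_request(method, target, headers, body=""):
    """Assemble an HTTP/1.1 request message as text (illustrative helper)."""
    lines = [f"{method} {target} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # An empty line separates the headers from the optional body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_response(raw):
    """Split a response message into status code, reason, headers and body."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    fields = status_line.split(" ", 2)
    code = int(fields[1])
    reason = fields[2] if len(fields) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return code, reason, headers, body


request = build_request(
    "GET",
    "/teacher/hagino.html",
    {"Host": "www.sfc.keio.ac.jp", "Accept-Language": "ja, en-us;q=0.8, en;q=0.7"},
)

# A made-up response used only to exercise the parser.
sample_response = (
    "HTTP/1.1 404 Not Found\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
    "<h1>Not Found</h1>"
)
status, reason, response_headers, response_body = parse_response(sample_response)
print(status, reason)
```

Header names are lowercased in the parsed result because HTTP header names are case-insensitive.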
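
Worked example for Content Negotiation. One way a server might pick index.html.ja for the Accept-Language value shown on that slide is to order the language tags by their q values and return the first available variant. This is a simplified sketch and not how any particular server implements it. The function names and the VARIANTS table are assumptions; the key "kr" simply mirrors the file suffix on the slide.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language value, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        fields = [field.strip() for field in item.split(";")]
        tag = fields[0].lower()
        if not tag:
            continue
        quality = 1.0  # a tag without an explicit q value has weight 1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality <= 0:
            continue  # q=0 means "not acceptable"
        # Sort by descending quality, then by original position.
        weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def choose_variant(header, variants):
    """Pick the file for the most preferred available language, or None."""
    for tag in parse_accept_language(header):
        # Try the full tag first, then its primary subtag (en-us -> en).
        for candidate in (tag, tag.split("-")[0]):
            if candidate in variants:
                return variants[candidate]
    return None


# Hypothetical table of variants, named after the files on the slide.
VARIANTS = {"en": "index.html.en", "ja": "index.html.ja", "kr": "index.html.kr"}

chosen = choose_variant("ja, en-us;q=0.8, en;q=0.7", VARIANTS)
print(chosen)
```

With the header from the slide, ja has the implicit weight 1 and therefore wins over en-us (0.8) and en (0.7).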
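
Worked example for idempotency. The slides state that GET × GET = GET while POST × POST ≠ POST. The toy in-memory model below illustrates the difference under simplifying assumptions: ResourceStore and its methods are invented for this sketch, PUT replaces a stored representation, and POST is modeled as appending data to a collection, which is only one of many things a real POST handler might do.

```python
import copy


class ResourceStore:
    """Toy model of a server's resources (hypothetical, for illustration)."""

    def __init__(self):
        self.resources = {}

    def get(self, uri):
        # Reading has no side effects on the store.
        return self.resources.get(uri)

    def put(self, uri, representation):
        # PUT creates or replaces the resource at the URI.
        self.resources[uri] = representation

    def post(self, uri, data):
        # POST hands data to the resource; here it is appended to a list.
        self.resources.setdefault(uri, []).append(data)


store = ResourceStore()

# Repeating PUT with the same representation leaves the same state.
store.put("/page", "<h1>v1</h1>")
after_one_put = copy.deepcopy(store.resources)
store.put("/page", "<h1>v1</h1>")
after_two_puts = copy.deepcopy(store.resources)

# Repeating GET does not change the state either.
store.get("/page")
store.get("/page")
after_gets = copy.deepcopy(store.resources)

# Repeating POST changes the state each time.
store.post("/comments", "hello")
after_one_post = copy.deepcopy(store.resources)
store.post("/comments", "hello")
after_two_posts = copy.deepcopy(store.resources)

print(after_one_put == after_two_puts)
print(after_one_post == after_two_posts)
```

This is why the slides recommend GET for operations without side effects and POST for updates: a repeated GET can be retried or cached safely, while a repeated POST may apply the change twice.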
