IST 516: Web and Information Retrieval

<p>IST 516: Web and Information Retrieval
Fall 2013 / Dongwon Lee
Lab #1: XML Schema (TOTAL: 40 Points) (DUE: Sep. 22 SUN 11:55PM)
NOTE: This is an individual lab.</p>
<p>Task 1. Consider the following two XML files:</p>
<p>• http://pike.psu.edu/classes/ist516/latest/labs/xschema/letter.xml
• http://pike.psu.edu/classes/ist516/latest/labs/xschema/letter2.xml</p>
<p>letter.xml:
<?xml version="1.0"?>
<!DOCTYPE letter SYSTEM "letter.dtd">
<letter>
  <to>
    <first>Dongwon</first>
    <last>Lee</last>
  </to>
  <from><first>Sylvie</first></from>
  <title>Example</title>
  <msg>Can you infer a DTD ?</msg>
</letter></p>
<p>letter2.xml:
<?xml version="1.0"?>
<!DOCTYPE letter SYSTEM "letter.dtd">
<letter date="2005/1/1">
  <to>
    <first>Dongwon</first>
    <last>Lee</last>
  </to>
  <from>
    <first>Sylvie</first>
    <middle>S.</middle>
  </from>
  <title>Example</title>
  <msg>
    <paragraph> I have a question.
      <paragraph> Can you infer a DTD ? </paragraph>
    </paragraph>
  </msg>
</letter></p>
<p>Download both XML files to your local web folder (e.g., PASS). Then, using some XML editor software, write a reasonably tight schema in DTD (letter.dtd) such that both letter.xml and letter2.xml are “valid” according to letter.dtd. You may have to modify the location information in the XML files (i.e., <!DOCTYPE … >).</p>
<p>A tight schema accepts what is permitted in XML files but not too much “more” unnecessarily. For instance, suppose one needs to write a tight schema that accepts the following two XML snippets:</p>
<p><foo> <bar/> </foo>
<foo> <bar2/> </foo></p>
<p>Then, either <!ELEMENT foo (bar|bar2)> or <!ELEMENT foo (bar?,bar2?)> would be a correct and tight schema (i.e., content model). 
However, <!ELEMENT foo (bar*,bar2*,bar3?)> would still accept the above two XML snippets, but it would also accept many other XML snippets, such as:</p>
<p><foo> <bar/><bar/> <bar2/><bar2/> <bar3/> </foo></p>
<p>Therefore, <!ELEMENT foo (bar*,bar2*,bar3?)> would be a bit “too loose” for the two given snippets, and not the best schema.</p>
<p>The Pennsylvania State University / College of Information Sciences and Technology</p>
<p>Task 2. Consider the following XML file:</p>
<p>• http://pike.psu.edu/classes/ist516/latest/labs/xschema/letter3.xml</p>
<p>letter3.xml:
<?xml version="1.0"?>
<letter date="2005/1/1"
        xmlns="http://pike.psu.edu/classes/ist516/latest/labs/xschema"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://some-path-in-your-web-folder letter.xsd">
  <to>
    <first>Dongwon</first>
    <last>Lee</last>
  </to>
  <from>
    <first>Sylvie</first>
    <middle>S.</middle>
  </from>
  <title>Example</title>
  <msg>
    <paragraph> I have a question.
      <paragraph> Can you infer an XML Schema? </paragraph>
    </paragraph>
  </msg>
</letter></p>
<p>Again, download letter3.xml to your local web folder (e.g., PASS). Then, using some XML editor software, write a reasonably tight schema in XML Schema (letter.xsd) such that letter3.xml is “valid” according to letter.xsd. You may have to modify the namespace or location information in the XML file (i.e., xsi:schemaLocation="http://some-path-in-your-web-folder letter.xsd").</p>
<p>Task 3. Using any one of the XML schema validators available on the Web or as part of S/W (e.g., http://www.w3.org/2001/03/webdata/xsv, http://schneegans.de/sv/, XML Pad), verify one more time that the XML files are “valid” according to your letter.dtd or letter.xsd.</p>
<p>Grading Rubric (Total: 40 Points). Note that a very loose schema that accepts not only letter.xml, letter2.xml, and letter3.xml but also any other XML file is NOT a good answer. 
Therefore, design the content model of your schema, expressed as a regular expression (RE), to be as tight as possible. As long as your schema is reasonably tight, you will get full credit.</p>
<p>1. 20 Points: Passing the schema validity check using your letter.dtd
2. 15 Points: Passing the schema validity check using your letter.xsd
3. 5 Points: Schemas are reasonably tight</p>
<p>Turn-In: By the due date, upload the following information to ANGEL:
1. The URL of your public web folder where you uploaded both letter.dtd and letter.xsd. The TA will visit each folder to check the correctness of both schema files. Make sure all files are accessible by the TA.
2. A screenshot of the result message from W3C’s schema validator, showing that the XML files are VALID.</p>
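<p>The tight-versus-loose distinction from Task 1 can be explored mechanically, because a DTD content model is a regular expression over the sequence of child element names. The following is a minimal sketch, not part of the lab materials: it hand-translates the three foo content models from the handout into Python regular expressions and tests them against the handout's snippets. The names MODELS, SNIPPETS, child_sequence, and accepts are hypothetical helpers introduced for illustration, and the hand translation is an assumption; it does not replace a real DTD validator.</p>

```python
import re
import xml.etree.ElementTree as ET

# Content models from the handout, hand-translated into regular expressions
# over the child-name sequence. Every name is followed by one space so that
# "bar" cannot be confused with the prefix of "bar2".
MODELS = {
    "(bar|bar2)": r"(bar |bar2 )",
    "(bar?,bar2?)": r"(bar )?(bar2 )?",
    "(bar*,bar2*,bar3?)": r"(bar )*(bar2 )*(bar3 )?",
}

# The two snippets the schema must accept, plus the extra one from the text.
SNIPPETS = {
    "given 1": "<foo><bar/></foo>",
    "given 2": "<foo><bar2/></foo>",
    "extra": "<foo><bar/><bar/><bar2/><bar2/><bar3/></foo>",
}


def child_sequence(xml_text):
    """Return the root's child element names as a 'name name ' string."""
    root = ET.fromstring(xml_text)
    return "".join(child.tag + " " for child in root)


def accepts(model, xml_text):
    """True if the whole child sequence matches the model's regex."""
    return re.fullmatch(MODELS[model], child_sequence(xml_text)) is not None


if __name__ == "__main__":
    for model in MODELS:
        verdicts = {name: accepts(model, xml) for name, xml in SNIPPETS.items()}
        print(model, verdicts)
```

<p>All three models accept the two given snippets, but only (bar*,bar2*,bar3?) also accepts the extra snippet, which is what makes it too loose. Under this translation, (bar?,bar2?) additionally accepts an empty foo element and a foo containing both bar and bar2, so it is slightly looser than (bar|bar2) while still reasonably tight.</p>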
