COMP: Please Set Alphabetical Dividers. Also Use En Dashes for Numerical Sequences

INDEX accented characters, and text retrieval, 260 access control, 168 accessor methods, 219 Adept, 396, 397 Advogato, 421 aircraft company, shared authoring scenario in, 171 Ælfred parser, 142, 393 algebraic queries, 256-257 AltaVista, 251 anchor, regular expressions, 24 AnyDBM_File module, 241 Apache cache, 295 Apache mod_perl module, 410 search module, 317 Apache parser, 392 Apache Web Server, 392 online documentation, 421 application programmer's interface (API), 37-38 callbacks, 46 example, 38-44 apt-get program, 384 arbitrary field nesting, XML documents, 164-165 architectural forms, 213 archiving, 168 Ardent, 404 arithmetic encoding, 262, 263 arrays, 208 ASCII, 7 Astoria, 405-406 asynchronous networking, 45-47 example, 47-53 ATA 2100, 147, 151 atomic information, in tables, 60 attribute-list declaration, 26-28 attributes, 61, 72 and entities, 16 markup characters not allowed, 123 XML documents, 12-13, 26-28 authors.tbl, 79 AutoLinked glossary, 327-328 implementation, 332-336 use cases, 328-332 AutoLinker, 327, 337 for document conversion, 336 in memory, 275-284 with text retrieval, 284 for textual analysis, 336 automatic part-of-speech detection, 258 automatic text classification, 253 automatic thesaurus generation, 258 backup, 149-150 Balise programming environment, 116, 395 Barefoot License, 348, 374-375 binary mode, FTP, 382-383 binary packages, 377 installation, 383 binary snapshots, 377 binding, objects in object-oriented databases, 200 BitchX, 424 Bladerunner, 406 BLOB data type, 66 paragraphs as, 183-185 XML documents as, 180-183 BODY element (HTML), 302 book catalogue example, 117-120 CGI script for, 101-104, 105 ESIS output, 127-128 reading in Perl, 120-124 books, about XML and related topics, 413-420 book.tbl, 80-81 BookWeb example, 337-338 MySQL implementation, 76-84 representing tables, 87-91 SQL example, 73-75 Boolean ranking, presenting text retrieval results by, 265 /BriefCase, 408 browsers, see Web browsers brute force and ignorance,
classification method, 252 BSD (Berkeley Software Distribution) License, 347, 365-366 byte order mark, Unicode, 7 bzip2, 150 C entities, 14 example API, 38-44 functions, 38, 43 parsers, 390 size overhead, 209 using expat, 133-137 C++ Is-A relationship, 211 parsers, 390 Standard Template Library, 208 cache manager, ndbm as, 294-297 callback function, 46-47 installing, 135-136 cardinality, 61, 62 Cascading Style Sheets, 33, 154 categorization, of documents in text retrieval, 252-253 The Cathedral and the Bazaar (Eric Raymond), 342 cdb, 247 central access scenario, 169-170 CGI scripts, 56, 96-98 book catalogue example, 101-104, 105 debugging, 98-101 and Perl DBI, 101-104 testing, 98-100 character references, XML documents, 13-14 character search, 254-255 character set, XML documents, 7 CHAR data type, SQL, 66 Citech browsers, 394 class designs, 215-219 classes, 212 classification, of documents in text retrieval, 252-253 clients, 35-36 second-class citizens, 38 client/server systems, 35-36 APIs, 37-44 asynchronous networking, 45-53 cache manager, 296 networks, 36-37 clustering, 321 comments DTD, 21-22 XML declaration, 8-9 Common Gateway Interface, see CGI scripts communication, 168 compound document configuration, 172 Comprehensive Perl Archive Network (CPAN), 388 Concurrent Versions System (CVS), 288, 408 configuration management, 168 configure script, 386 console-apt program, 380, 381 Containment relationship, 206-208 and references, 211-212 contamination, of software distributed with licensed, 345 content models, 22-23, 26 overriding using parameter entities, 30-31 conversion, 175 encyclopedia to XML, 274 existing object-oriented databases to XML, 211-212 cookies, 57 cooperation, 168 Copyleft, 346 CPAN (Comprehensive Perl Archive Network), 388 cross-references, 148 implicit, 157 in ndbm database, 248-249 CVS (Concurrent Versions System), 288, 408 Cyberbolic Map, 273 database applications requirements, 145-149 XML architectures, 149-158 database implementation 
documents as BLOBs, 180-183 elements as fields, 185-187 elements as objects, 189-190 general issues, 178-180 hybrid approaches, 191-192 metadata only, 187-189 paragraphs as BLOBs, 183-185 text retrieval, 191-192 database integrity, 67 DataBase Interface module (Perl), see Perl DBI databases. See also object-oriented databases; relational databases; SQL; XML books about, 416 creation with ndbm, 225-229 external file support, 288 files with, 287-291 normal forms, 72-73 session tracking, 56 and text retrieval, 291-294 database scenarios, 167-169 central access, 169-170 distributed access and technology reuse, 174 information reuse, 172-174 revision control, 171-172 shared authoring, 170-171 database-to-database translations, 85, 86 Data Mirror, 149 data mobility, 85, 86 data transfer, 149-150 db, 221, 245-248 features, 248 hashing with, 409 hybrid applications, 296-297 performance, 236 dbm, 221, 243. See also ndbm hashing with, 409 dbm_close(), 240, 242 dbm_delete(), 231-233 dbm_fetch(), 229-231 dbm_insert(), 226 dbm_open(), 225, 240, 242 dbm_store(), 226-229 dbz, 247 deadlock, 202 Debian Free Software Guidelines, 342, 345 Debian Linux, 377, 380-382 binary packages on, 383 source packages on, 384 debugging CGI, 98-101 object-oriented databases, 195 DELETE statement, SQL, 71 delta coding, 263 derived works, 343 design agency, central access scenario in, 169-170 Desperate Perl Hacker, 123 dev-xml (mailing list), 423 Dewey decimal classification system, 252, 253 diff program (Unix), 180 discrimination, in licenses, 344 distributed access and technology reuse scenario, 174 distribution of licenses, 344 objects in object-oriented databases, 203 DocBook DTD, 151 DOCTYPE declaration, 9 document management, 297-298, 324 document management systems, 324, 405-406 Document Object Model (DOM), 217 documents, see XML documents document similarity, 259 presenting text retrieval results by, 265-266 document type declaration, 9-10 document type definition (DTD), 19-21 comments and 
spaces, 21-22 conditional sections, 31-32 element declaration, 22-28 parameter entities, 28-31 per-document schemata, 163 document type element, 10 Docuverse DOM SDK, 394 DOS environment, document type declaration, 9 dpkg program (Debian), 380 dselect program (Debian), 380 DSSSL, 395 DTD, see document type definition dtddoc, 393 DTD subset, 9-10 Dublin Core, 304-306 elements summary, 305-306 dynamic hashing, 224 dynamic hashing libraries, 221-222. See also ndbm dynamic linking, 37-38 dynamic storage, 208 editors, for XML, 396-400 effectivity, 175 elements. See also content models declaring, 22-28 as fields, 185-187 as objects, 189-190 representing sequences of, 209-210 XML documents, 10-12 Element Structure Information Set (ESIS), 125-129 embedded markup, 4-5 Empress, 403 EMPTY elements, 123 Encoded Archival Description (EAD), 312 encoding parameters, 134 encyclopedia conversion to XML, 274 publishing, information reuse scenario in, 173-174 end command (C), 209 end tags, 10 Enhydra, 406 Enlightenment window manager, 154, 155 entities, 60-61 external, 7, 15, 16, 217-218 general, 4-5, 14-17 internal, 218 parameter, 28-31 entity reference, 15 entity relationship diagrams, 60-61, 147 BookWeb example, 88 Entity system, 214 ENUM data type, SQL, 66 error handling, in hybrid systems, 298 escape sequences, 125, 126 ESIS, 125-128 escape sequences, 125, 126 format summary, 125 reading, 129 etext command (C), 209 event models asynchronous networking, 45-46 parsers, 46, 389 events, 45 eXcelon/Object Design, Inc., 404 Excite search engine, 412 expat, 390 building on Unix, 390-392 sample program, 137-142 using in C, 133-137 using in Java, 133 using in Perl, 130-133 export restrictions, on software, 344 Extended Backus-Naur Form, 6 extended links, 271 eXtensible Markup Language, see XML external asset management systems, 317 external DTD, 20 external entities, 7, 15, 16, 217-218 external parsing, 124-129 extranet, 146 fields elements as, 185-187 length, 163-164 nesting, 164-165
sequencing, 165 fields of endeavor, discrimination against prohibited in licenses, 344 find command (Unix), 410, 411 Finding Aids, 312 first normal form, 72 flat files, 287 flexible storage design, 208 FLOAT data type, SQL, 66 FOP, 396 formatting, 396 401 Authentication, 56-57 404 Forbidden error, 100-101 fragment identifier and CGI, 97 XPath, 269 FrameMaker, 398 FreeBSD, 377, 382 binary packages on, 383 source packages on, 383, 384 FreeCode, 421 free software, see open source projects; software Free Software Foundation, 346 Free XML Tools Page, 423 Freshmeat, 378, 379, 421, 422 FTP, downloading software with, 382-383 garbage collection (Perl), 94 gdbm, 221, 244-245 features, 248 hashing with, 409 GeekBoys, 421 Gemstone, 404 general entities, 4-5, 14-17 general text entities, 15-16 getParent(), 217 getReadyForPatterns(), 280 glossary, AutoLinker based, see AutoLinked glossary Gnome, 378-379, 392 gnome-apt program, 380 gnorpm program, 378-379, 380, 383 GNU General Public License (GPL), 345, 346, 348-354 GNU Lesser General Public (Library) License (LGPL), 346, 354-363 GNU Public Virus (GPV), 346 Goxml.com, 423 GPL (GNU General Public License), 345, 346, 348-354 GPV (GNU Public Virus), 346 grep command (Unix), 25 for information retrieval database searching, 410-411 for text retrieval, 254, 291 groups discrimination against prohibited in licenses, 344 of related documents, 321 gzip, 150 Harvest, 412 hashing with dbm, 409 dynamic, 221-222, 224 Has relationship, 206, 210 HEAD element (HTML), 302 heterogeneous clients, object-oriented database servers, 196 heterogeneous collections, text retrieval, 259-260 historical archiving, 168 hooks, 46 HTML. 
See also links Dublin Core elements with, 304 META and LINK tags, 302-303 page caching, 295 XML contrasted, 3-4, 5, 19 HTML-style links, 270-271 HTML Tidy, 396 HTTP, 53-54 HTTP-based cache, 295 httpd log file, 322 HTTP error logs, 324 hub format, 147, 150-151 hybrid approaches, 287-298 database implementation, 191-192 hypercase searching, 254 hypertext. See also links books about, 416-417 SQL-like languages for reasoning about, 321 HyperText Markup Language, see HTML HyperText Transfer Protocol (HTTP), see HTTP HyTime, 273 Iaijutsu, 406, 407 IBM DB2 Universal Database, 403 IBM XML tools for Java, 394 implementation strategies, see database implementation implicit cross-references, 157 InDelv browser, 394 indexes AutoLinked glossary, 328-329, 333-334, 336 creation for text retrieval, 260-264 inverted, 256, 261-264 object-oriented databases, 201-202 information retrieval, 251-252. See also text retrieval books about, 416-417 information retrieval databases searching with index, 411-412 searching without index, 410-411 information reuse scenario, 172-174 Informix, 403 Inktomi search engine, 412 inline links, 270-271 inner join, 70 Insight Foundation (Yuri Rubinsky), 146 installing binary packages, 383 Perl modules, 388 source packages, 383-384 source tarball, 385-387 InstallShield, 377 INT data type, SQL, 66 Integrated Development Environments (IDEs), with object-oriented databases, 195 Interactive PostgreSQL for Windows, 402 Interbase, 402 interchange between applications, 153 with other organizations, 151-152 interfaces, 215 Interleaf, 406 Interleaf Panorama, 394 internal cache, 295 internal document type definition subset, 9-10, 20 internal entities, 218 internal parsing, 129-142 Internet Explorer 5, 87 XML support, 394-395 Internet Relay Chat (IRC), 424 Internet Relay Chat (IRC) glossary, 274, 322, 333 visualizing relationships, 285 intranet, 146 cookies in, 57 XML-aware Web browsers, 154 inverted index, 256, 261-264 Is-A relationship, 206, 210-211 ISO 8859-1, 7 
ISO 8879:1988 SGML, 4 ISO Topic maps, 273-274, 312-313 Jade, 395, 396 Japanese language, text retrieval issues, 260 Java dynamic hashing libraries with, 221 entities, 14 exporting data with, 112-115 multithreaded connections, 55 object-oriented databases, 199 parsers, 393-394 using expat, 133 with XML-aware browsers, 147 Java API for XML (Sun), 393 Java Component Library, 400 Java DataBase Connection (JDBC), 112 Java servlets, 113-115 for link visualization, 320 JOIN statement, SQL, 70 minimizing number for improved speed, 294 journals, about XML and related topics, 420-421 JUMBO browser, 142, 394 keepalive mechanism, HTTP 1.1, 54 kpackage program, 380, 381 languages, text retrieval in multiple, 259-260 Lark, 394 Latin 1, 7 LGPL (GNU Lesser General Public (Library) License), 346, 354-363 licenses, see open source licenses ligatures, and text retrieval, 260 line breaks, 179-180 LINK element (HTML), 302, 303 links, 315-316. See also XML links in AutoLinked glossary, 328, 334-335 automatic, 274-284 checking, 324 extended, 271 between files, 249 incorrect, 274 management and analysis, 324 multiway, 317, 322 simple, 270-271 link visualization, 318-321 link visualization tools, 321 Linux.
See also Debian Linux; Mandrake Linux; Red Hat Linux books about, 417-419 Linux Chix, 423 literal, 309 little language, 256 loading behavior, 208, 209 local area network (LAN), 146 locales, text retrieval from multiple, 260 locate command (Unix), for information retrieval database searching, 410 location, of users, 146 locked programs, in synchronous networks, 45 LONGBLOB data type, SQL, 66 LONGTEXT data type, SQL, 66 LotusXSL, 396 lqaddfile, 333-334 lqkwic, 293, 335 lqsed, 335 lq-text, 347, 378, 379 hypercase searching, 254 for link insertion in AutoLinker glossary, 333-335 query result, 291-294 text retrieval using, 411-412 lqunindex, 293, 334 Lynx, 146 Macintosh environment Internet Explorer 5 with, 395 MIME types, 55 source repositories, 406 magazines, about XML and related topics, 420-421 mailing lists, 423-424 malloc(), 129 Mandrake Linux, 378-380 source packages on, 383-384 manifests, 307 man -k command, 421 man perl command, 421 many-to-many relationships, 62 MARC version 21, 312 markup, 4 Markup Technologies, 413 metadata, 287-288 database implementation using only, 187-189 defined, 301-302 Dublin Core, 304-306 links as, 315-316 and Resource Description Framework (RDF), 307-312 storing in databases, 316-324 META element (HTML), 302-303 Microsoft, see DOS environment; Internet Explorer; Windows environment Microsoft XML Notepad, 400 migration, of database objects, 203 MIME types, 317 described, 55 as preferable to namespaces for Web uses, 33 and XML declaration, 8 mIRC, 424 MIT License, 347, 366 mixed content, 148, 161-162, 209-210, 216 mkheader() (HTML), 281 Moby Project thesaurus, 258 modifiers, SQL, 64, 66-67 Mozilla, 85, 394 Enlightenment window manager, 154, 155 Public License, 347-348, 366-374 RDF application, 312 multilingual text retrieval, 259-260 Multipurpose Internet Mail Extension (MIME) types, see MIME types multithreaded connections, 55 MySQL, 63, 183, 401, 402 documentation, 421 field length, 164 mysqladmin command for database creation, 64
options, 65 mysql interpreter, 63 namespaces, 33, 214 RDF, 309 XLink, 270 name.tbl, 82 navigation, object-oriented databases, 196-200 navigational aids, 322, 323 ndbm, 221, 243-244 as cache manager, 294-297 database creation and saving a value, 225-229 deleting a value, 231-233 described, 222-225 features, 248 hashing with, 409 iterate over all keys, 223 performance, 236-240 reading a value, 229-231 reading all values, 233-236 retrieve any value by key, 223 store unordered set of key-value pairs, 222-223 and text retrieval, 262 using, 225-236 using in Perl, 240-242 versions, 242-248 when to avoid, 224 and XML, 248-249 nested objects, 5 nested transactions, 202 NetBSD, 382 source packages on, 383, 384 Netscape Mozilla, see Mozilla Network File System (NFS), 188, 288 Network Information System (NIS), 222 networks, 36-37 asynchronous networking, 45-53 normalization, BookWeb example, 74-75 notations, 33 Not Found error, 100-101 NOT NULL database reference, 324 nsgmls, 129, 390 NSGMLS.pm, 129 object-oriented databases, 195-196, 404-405 converting existing to XML, 211-212 data access, 196-200 indexing, 201-202 object binding, 200 object distribution, 203 object saving, 200 persistence, 195, 196 queries, 200-201 text retrieval with, 291, 297 transactions, 202 and XML relationships, 206-211 object-oriented programming, 205 Object Query Language (OQL), 190, 200 objects binding in object-oriented databases, 200 distribution in object-oriented databases, 203 elements as, 189-190 properties in relational databases, 60-61 relationships in relational databases, 61-62 saving in object-oriented databases, 200 XML documents as, 297 odbm, 243 features, 248 OLAP, 192 Omnimark, 116, 336, 396 one-to-many relationships, 62 one-to-one relationships, 62 online documentation, 412, 421-424 online newsfeed, information reuse scenario in, 173 OpenJade, 395, 396 open source browsers, 394-395 Open Source Definition (v 1.7), 342-344 open source licenses, 341-342 Barefoot, 348, 374-375 BSD, 347, 
365-366 GPL, 346, 348-354 LGPL, 346, 354-363 MIT, 347, 366 Mozilla, 347-348, 366-374 Perl Artistic, 347, 363-365 open source projects defined, 342-345 learning about, 413 relational databases, 401-402 Web sites, 421, 423 OpenSP, 390 Open Text, 291, 412 Oracle, 403 arbitrary field nesting, 164-165 OSI Certified software, 342, 345 Ovrimos SQL Server, 403 A Package Tool, 380-381 paper-based publishing, 152, 153 paragraphs, as BLOBs, 183-185 parameter entities, 28-31 overriding content models, 30-31 Parlance, 406 parsers, for XML, see XML parsers PartNo element, 5 patch files, 343 patterns, see regular expressions #PCDATA keyword, 26 per-document schemata, 163 Perl AutoLinker example, 275-284, 328 book catalogue example, 120-124 books about, 419-420 Desperate Perl Hacker, 123 documentation, 421 dynamic hashing libraries with, 221 entities, 14 garbage collection, 94 for link visualization, 318-321 module installation, 388 textual analysis application, 336 using expat, 130-133 using ndbm in, 240-242 Web server script, 48-53 Perl Artistic License, 347, 363-365 Perl AutoLinker, 275-284, 328 perl command (Unix), 25 Perl DataBase Interface module, see Perl DBI Perl DBI, 409-410 and CGI, 101-104 described, 92 generating XML with, 91-95 perldoc command, 421 pernicious mixed-content model, 162 per-paragraph similarity, in text retrieval, 259 perpetual intermediates, 264 persistence, in object-oriented databases, 195, 196 persons, discrimination against prohibited in licenses, 344 PHP architecture, 105 described, 104-106, 409 documentation, 421 example code, 106-112 phrase-aware systems, 254, 255-256 phrase sequences, in text retrieval, 259 Pinnacles DTD, 151 Platform for Internet Content Selection (PICS), 306-307 Poet Software, 404, 405 poll, 47 PostgreSQL, 183, 401, 402 printing, 396 processing instructions, in XML documents, 18-19, 179 programming conferences, 413 Project Gutenberg thesaurus, 258 Prolog, XML documents, 7-10 properties, in RDF, 308, 309 protocol layers, 37-38 
protocols and APIs, 37-44 networks, 36-37 public domain software, licenses, 345-346 publishers.tbl, 79-80 push technology, 174 PyPointers, 393 Python dynamic hashing libraries with, 221 parsers, 393, 396 queries metadata, 316-318 object-oriented databases, 200-201 text retrieval, 254-259 Raima, 403 RCS, see Revision Control System rcsdiff command, 290 RDF, see Resource Description Framework RDF Schema Specification, 308, 310, 311 RDF Specification, 307, 310 RDF Visualization tool, 310, 311 RDM, 406 recall, in text retrieval, 255 Red Hat Linux, 377, 378-380 binary packages on, 383 MySQL, 402 RDF application, 311 source packages on, 383-384 redistribution, of open source code, 343 references.tbl, 82-84 referential integrity, 288 checking, 71 Refers To relationship, 206, 211 regular expressions, 23, 24-25 text retrieval, 255 relational databases, 59. See also SQL; tables commercial, 402-404 creation in SQL, 63-64 deletion in SQL, 64 generating XML from, 85-116 minimizing JOINs for improved speed, 294 and nested objects, 5 object properties, 60-61 object relationships, 61-62 open source and free, 401-402 performance and number of tables, 148 representing tables, 87-91 Revision Control System with, 289-291 string manipulation weakness, 24 text retrieval with, 291 XML documents stored in, 166 relational tables, see tables relationship types, see roles repositories, 287-288, 405-406. 
See also databases application-specific with wired-in behavior, 214-215 generic XML with external application, 214 source, 406, 408 requirements, 145-149 research project, central access scenario in, 169 Resource Description Framework (RDF), 147, 273 described, 307-312 visualizing relationships, 284-285 resource discovery, 168 resources, in RDF, 308, 309 revision control scenario, 171-172 Revision Control System (RCS), 408 and relational databases, 289-291 revision control tools, 172 RFC 2731, 306 roles, 61 BookWeb example, 75 root, avoiding installing source distributions as, 385 root element, 10 rpm program, 378, 380, 384 rusage command (Unix), 209 RXP, 392 saving, objects in object-oriented databases, 200 SAX, 393 described, 142 saxlib, 393 SAXON XSL processor, 395-396 SCCS, 406, 408 sdbm, 221, 244 features, 248 hashing with, 409 performance, 236, 240 search and replace, 24-25 searches, metadata, 316-318 second-class citizen, 38 second normal form, 73 sed command (Unix), 25 select() function, BSD Networking API, 47-48 select() statement, SQL, 68-69 ORDER BY clause, 70 semantic nets, 258 Sequence relationship, 206, 208-210 serialization, 212 Server51, 423 servers, 35-36 SGML, and XML, 4, 5 SGML Open Catalog, 21 SGMLS.pl, 129 sgrep command (Unix) for information retrieval database searching, 411 for text retrieval, 255 shared authoring scenario, 170-171 signatures, 261 significant comments, 179 similarity algorithms, in text retrieval, 259 Simple API for XML (SAX), see SAX simple links, 270-271 size overhead, 208-209 slang dictionary, 156-158 SlashDot, 423 slocate command (Unix), for information retrieval database searching, 410 slurp mode (Perl), 122-123 SMART system, 266 software, 377-378 downloading with FTP, 382-383 finding packages, 378-383 installing binary packages, 383 installing Perl modules, 388 installing source packages, 383-384 installing source tarball, 385-387 software production, shared authoring scenario in, 170-171 Solaris environment, 377, 382
binary packages on, 383 Some2XML, 395 SORTBY, 208 source code, 343 choice of programmers for writing, 148-149 integrity of author's, 343-344 Source Code Control System (SCCS), 406, 408 SourceForge, 423 source packages, 377 installation, 383-384 source repositories, 406, 408 source tarballs, 378 installation, 385-387 SP, 390 building on Unix, 390-392 spell-check, context-sensitive, 259 spreadsheets, XML-aware, 87 SQL, 63. See also relational databases changing data with UPDATE and DELETE, 71 database creation, 63-64 database deletion, 64 inserting data into tables, 68 JOIN statement, 70 limiting queries with WHERE, 69 printing tables with SELECT, 68-69 returning multiple columns, 70-71 for returning text retrieval results to a program, 266 SELECT statement, 68-69 sorting, 70 table creation, 67 text retrieval, 337 WHERE statement, 69 SQL/92, 63 SQL commands, 63 SQL data types, 64-67 SQL expressions, 70 Squid, 295 STAIRS, 262 standard error, 101 Standard Generalized Markup Language, see SGML Standard Template Library, 208 start tags, 10 statements, RDF, 308-309 statistical ranking, presenting text retrieval results by, 265 stemming, 257-258, 259 streams, 4 strings manipulation weakness of relational databases, 24 parameter entities for reuse, 29 Structured Query Language, see SQL stub objects, 199 style sheets, 33 subset, 9-10 substitution, in search and replace, 25 surrogate fields, 210 sxml for emacs, 400 Sybase, 404 synchronous networking, 45 synchronous protocols, 40-41 tables, 60-61, 166 changing data with UPDATE and DELETE, 71 creation, 67 inserting data into, 68 number of, and performance of relational databases, 148 printing with SELECT, 68-69 representing, 87-91 returning multiple columns, 70-71 tags can't be omitted, 124 META and LINK, 302-303 XML documents, 10-12 tar archive, 383, 385-386 tcl, dynamic hashing libraries with, 221 telephone repair manuals, 174 term replacement, in text retrieval, 258-259 TeX macros for paper-based publishing, 152 XML parsers 
in, 394 text, 218-219. See also XML documents declaring content, 26 representing sequences of, 209-210 transforming non-XML into XML, 395-396 XML as text-based format, 4 TEXT data type, SQL, 66 Text Encoding Initiative, 147, 151 text retrieval, 336, 337. See also information retrieval AutoLinked glossary, 333-334 and database implementation, 191-192 and databases, 291-294 defined, 251-252 document categorization, 252-253 heterogeneous collections, 259-260 implementation issues, 260-266 index creation, 260-264 multiple languages, 259-260 multiple locales, 260 queries, 254-259 results presentation, 264-266 results returned to a program, 266 uncategorized information, 253-254 textual analysis, 336 thesaurus, in text retrieval, 254, 258-259 third normal form, 73 tie(), 241, 242 Topic maps, 273-274, 312-313 trade shows, 413 transactions, object-oriented databases, 202 tree model, parsers, 389 two-ended inline links, 270-271 types, 212 Ultraseek Server, 412 Unicode, 7, 260 Unified Modelling Language (UML), 205 Uniform Resource Identifiers (URIs), 307 Uniform Resource Locators (URLs), 55-56, 267, 307. See also CGI scripts Uniform Resource Names (URNs), 267 Unix environment books about, 417-419 building expat and SP on, 390-392 cache manager, 296 \(... 
\) idiom, 25 MIME types, 55 ndbm with, 221 nsgmls with, 390 public domain software, 345-346 size overhead, 209 software installation, 377 unrestricted field length, 163-164 untie(), 241 UPDATE statement, SQL, 71 updating, 168 Usenet News, 174 users, 146-147 values, in RDF, 309-310 VARBINARY data type, SQL, 66 VARCHAR data type, SQL, 66 vectors, 208 Verity search engine, 291, 412 Versant, 404 versioning, 168 vi command (Unix), 25 virtual folders, 321 Visual Markup, 400 visual schema designers, with object-oriented databases, 195 Visual XML, 116 Web-based validator, 393 Web browsers, 394-395 XML-aware, 154, 155 and XML generation, 86-87 Web clients, 53 Web Reports tool, 322, 323 Web servers, 53-57 IP port, 54 Perl script for, 48-53 WHERE statement, SQL, 69 white space, 13, 179 wildcards, 255 Windows environment cache manager, 296 document type declaration, 9 Internet Explorer 5 with, 395 local edit of downloaded files, 286 MIME types, 55 nsgmls with, 390 software installation, 377 source repositories, 406 WordNet package, 258 word processors, XML-aware, 87 word sequences, in text retrieval, 255-256 workflow, 175, 297-298 World Wide Web architecture, 53-57 401 Authentication, 56-57 automatic site mirroring, 174 cookies, 57 sites about XML and related topics, 421-423 World Wide Web Conference, 304 World Wide Web Consortium, 4, 423 Metadata Activity group, 307 Woven Goods for Linux, 423 writer's revisions, 172 wrote.tbl, 81-82 xargs command (Unix), 411 x-chat program, 424 XED, 396, 398, 399 Xerces-J parser, 393 xhost program, 378 XHTML, 4 XLink, 267, 315-316 overview, 270-272 XMetaL, 396, 397-398 XML. 
See also databases; SQL behavior, 212-215 book catalogue example, 117-124 books about, 414-416 class designs, 215-219 defined, 3-5 features reference, 6-32 generating with CGI, 96-104 generating with Java, 112-116 generating with Perl DBI, 91-95 generating with PHP, 104-112 HTML contrasted, 3-4, 5, 19 hybrid approaches, 287-298 namespaces, 33 and ndbm, 248-249 notations, 33 reading into a program, 117-142 reading specification, 6-7 reasons for generating, 85-87 and SGML, 4, 5 style sheets, 33 as text-based format, 4 tutorials, 396 Web sites about, 423 XML-APP (mailing list), 423-424 XML architectures for backup and data transfer, 149-150 as hub format, 147, 150-151 for interchange between applications, 153 for interchange with other organizations, 151-152 as intermediate format for advanced Web pages, 154, 156-158 for paper-based publishing, 152, 153 XML-aware Web browsers, 154, 155 XML Authority, 400 XML-aware tools, 87 XML-aware Web browsers, 154, 155, 394-395 XML declaration, 8 xml-dev (mailing list), 424 XML documents, 4. 
See also attributes; document type definition; elements; entities; tags arbitrary field nesting, 164-165 AutoLinker conversion, 336, 337 as BLOBs, 180-183 categorization, 252-253 character references, 13-14 character set, 7 comments, 8-9 document type declaration, 9-10 management, 297-298, 324 as objects, 297 per-document schemata, 163 processing instructions, 18-19 Prolog, 7-10 stored in relational database, 166 text content, 13-14 as trees, 166 XML declaration, 8 XML dot COM, 423 XML editors, 396-400 XML Extender, for IBM DB2 Universal Database, 403 XML files, 4 considered authoritative in databases, 317 including with parameter entities, 29-30 XML4C++, 393 XMLIO, 393 XML library for Gnome, 392 XML links and databases, 274-285 functionality, 321-324 XML-L (mailing list), 424 XML Parser for Java (Oracle), 393 XML parsers, 46, 389-394 creating in C, 133 external parsing, 124-129 history, 142 internal parsing, 129-142 with metadata, 317 XML Path Language, see XPath xmlproc, 393 XML Query Language, see XQuery XML relationships Containment, 206-208 Has, 206, 210 Is-A, 206, 210-211 Refers To, 206, 211 Sequence, 206, 208-210 XML Retrieval System (XRS), 412 XML Schema, 27, 32, 205-206 XML specification, 6-7 XML Spy, 399 XML standards, 267-274 XML Style Language (XSL), 33 XML Style Transformation Language (XSLT), see XSLT XMLWriter, 400 XP, 394 XPath, 267, 315-316 overview, 268-269 XPointer, 267, 268, 271 overview, 269-270 XPublish, 400 XQL, see XQuery XQL (mailing list), 424 XQuery, 267 overview, 272-273 XRS (XML Retrieval System), 412 XSL, 267 tutorials, 396 XSL-List (mailing list), 423 XSLT, 116, 151, 268, 395 and XML generation, 87 XT, 395 XTech, 413 X Window System License (MIT), 347, 366 Xyvision Parlance, 406 Yahoo!, 252 YEAR data type, SQL, 66 Yuri Rubinsky Insight Foundation, 146
