Sphider-Plus Manual

Content

1. Introduction
2. Version and legal info
3. Installation of version 3.0 – 3.2020d
    3.1 Preconditions
    3.2 New installation
    3.3 Updating from version 1 and 2
    3.4 Updating from 3.x to 3.y
4. Settings and customizing
5. Indexing
    5.1 Various options
    5.2 Allow other hosts in same domain
    5.3 Word stemming
    5.4 Periodical Re-indexing
    5.5 Preferred indexing
    5.6 Multi-threaded indexing
    5.7 Create thumbnails as a web shot during index procedure
    5.8 Follow sitemap file
    5.9 Use private sitemap instead of global sitemap
    5.10 Create sitemap file
6. Using the indexer from command line
    6.1 Overview and options
    6.2 Multi-threaded indexing
    6.3 Index only the new
    6.4 Reindex all
    6.5 Index erased sites
7. Keeping pages, words and files from being indexed
    7.1 robots.txt
    7.2 URL must include / must not include string list
    7.3 Ignoring links
    7.4 Canonical <link> tag
    7.5 Ignoring parts of a page by <!--sphider_noindex--> ... <!--/sphider_noindex-->
    7.6 Ignoring parts of a page defined by <div id='abc'> or <div class='abc'>
    7.7 Indexing only parts of a page defined by <div id='abc'> or <div class='abc'>
    7.8 Ignoring HTML elements defined by <tagname> ... </tagname>
    7.9 Indexing only HTML elements defined by <tagname> ... </tagname>
    7.10 Ignoring parts of a page defined by <ul class='abc'> ... </ul>
    7.11 Ignoring parts of a page defined by <pre class='abc'> ... </pre>
    7.12 Ignored words
    7.13 Use of Whitelist
    7.14 Use of Blacklist
    7.15 Ignored files (by suffix)
    7.16 Index only files and documents with defined suffix
8. UTF-8 support and 'Preferred charset'
9. Search modes
    9.1 Search with wildcards *
    9.2 Strict search !
    9.3 Tolerant search
    9.4 Link search site:
    9.5 Media search
    9.6 Search only in one domain
    9.7 Search in categories
    9.8 Greek language support
    9.9 Block queries
10. Chronological order for result listing
    10.1 Sorting text results
    10.2 Sorting media results
11. PDF converter for Linux/UNIX systems
12. Clean resources during index / re-index
13. Enable real-time output of logging data
14. Error messages and Debug mode
15. Delete secondary characters
16. Media search for images, audio streams and videos
    16.1 Media indexing
    16.2 Not supported media content
    16.3 Search for media content
    16.4 Statistics for media content
17. Feed support
    17.1 XML product feeds
    17.2 RDF, RSD, RSS and Atom feeds
18. Result cache for text and media queries

Details

  • File Type
    pdf
  • Content Languages
    English
  • File Pages
    166 pages
