PDF Version of This Documentation

Thunderstone Parametric Search Appliance
WWW Site Indexer
Version 23.1.0
Thunderstone Software
December 10, 2020

Contents

1 Overview
  1.1 Features
  1.2 Technical Support
2 Installation
  2.1 How to unpack and install the Parametric Search Appliance
    2.1.1 Console Menu
    2.1.2 Front Panel LCD
  2.2 Customizing the Parametric Search Appliance's Appearance
3 Operation
  3.1 Running the Administrative Interface
  3.2 First Time Run: Quick Start
  3.3 Administrative Interface Overview
  3.4 Basic Walk Settings
    3.4.1 Walk Summary
    3.4.2 Notes
    3.4.3 Base URL(s)
    3.4.4 Robots
    3.4.5 Robots Crawl-delay
    3.4.6 Allow Extensions
    3.4.7 Exclude Extensions
    3.4.8 Exclusions
    3.4.9 Walk Delay
    3.4.10 Parallelism
    3.4.11 Verbosity
    3.4.12 Disable Starting Walks
    3.4.13 Catalog Auto-Import
    3.4.14 Auto-Import Delay
    3.4.15 Rewalk Type
    3.4.16 Rewalk Schedule
    3.4.17 Action Buttons
  3.5 Advanced Walk Settings
    3.5.1 Watch URL
    3.5.2 End of Walk Email
    3.5.3 Attach Logs
    3.5.4 Categories
    3.5.5 Categories Type
    3.5.6 DB Walker
    3.5.7 URL File
    3.5.8 URL URL
    3.5.9 Single Page
    3.5.10 Page File
    3.5.11 Page URL
    3.5.12 Strip Queries
    3.5.13 Keep Query Vars
    3.5.14 Ignore Query Vars
    3.5.15 Sort Query Vars
    3.5.16 Lower Query Var Values
    3.5.17 Ignore Case
    3.5.18 Extra Domains
    3.5.19 Extra Networks
    3.5.20 Extra URLs REX
    3.5.21 Exclusion REX
    3.5.22 Exclusion Prefix
    3.5.23 RSS Feeds
    3.5.24 Exclude by Field
    3.5.25 Entity Source Fields
    3.5.26 Data from Field
    3.5.27 Required REX
    3.5.28 Required Prefix
    3.5.29 Max Page Size
    3.5.30 Max Pages
    3.5.31 Max Bytes
    3.5.32 Max Depth
    3.5.33 Max URL Size
    3.5.34 Max Requests
    3.5.35 Max Connection Lifetime
    3.5.36 Page Timeout
    3.5.37 Meta Tags
    3.5.38 Standard Meta
    3.5.39 All Meta
    3.5.40 Storage Charset
    3.5.41 Source Default Charset
    3.5.42 XML UTF-8
    3.5.43 Keep HTML
    3.5.44 Keep Links
    3.5.45 Remove Common
    3.5.46 Ignore Tags
    3.5.47 Keep Tags
    3.5.48 Ignore Characters
    3.5.49 Plugin Split
    3.5.50 Language Analysis
    3.5.51 CJK Mode
    3.5.52 Unknown File Formats
    3.5.53 PDF Title Action
    3.5.54 Word Definition
    3.5.55 Text Search Mode
    3.5.56 Attribute Compare Mode
    3.5.57 Index Fields
    3.5.58 Compound Index Fields
    3.5.59 Extra Indexes
    3.5.60 Spell-check Dictionaries
    3.5.61 Primer Type
    3.5.62 Primer URLs
    3.5.63 Unprimer URLs
    3.5.64 Login Info
    3.5.65 Proxy Auto-Config URL
    3.5.66 Proxy
    3.5.67 Proxy Login Info
    3.5.68 Client Certificate
    3.5.69 Cookie Source Path
    3.5.70 Cookie Jar
    3.5.71 Strict Cookie Paths
    3.5.72 Off-Site Pages
    3.5.73 Off-Site Components
    3.5.74 Stay Under
    3.5.75 Prevent Duplicates
    3.5.76 Respect Canonical URLs
    3.5.77 Duplicate Check Fields
    3.5.78 Store Refs
    3.5.79 Inline Iframes
    3.5.80 Max Components
    3.5.81 Execute JavaScript
    3.5.82 Fetch JavaScript
    3.5.83 JavaScript String Links
    3.5.84 Debug JavaScript
    3.5.85 JavaScript Memory
    3.5.86 JavaScript Timeout
    3.5.87 AJAX Crawlable URLs
    3.5.88 Walk Trace Settings
    3.5.89 Audit Log
    3.5.90 Performance Logging
    3.5.91 Batch Locks
    3.5.92 URL Protocols
    3.5.93 HTTP Version
    3.5.94 SSL Client Protocols
    3.5.95 SSL Client Ciphers
    3.5.96 SSL Use SNI
    3.5.97 Network Share Protocols
    3.5.98 Network Share Access Method
    3.5.99 Authentication Schemes
    3.5.100 Embedded Security
    3.5.101 Body Storage Method
    3.5.102 Multiple Fetches
    3.5.103 Follow Cross-Site Links
    3.5.104 Max Redirects
    3.5.105 Empty Form Redirects
    3.5.106 Execute Walked Dataload
    3.5.107 Index Name
    3.5.108 DNS Mode
    3.5.109 User Agent
    3.5.110 Robots.txt Agents
    3.5.111 Mime Types
    3.5.112 Custom Headers
    3.5.113 Respect Expires Header
    3.5.114 Cache Content
    3.5.115 Default Refresh Time
    3.5.116 Minimum Refresh Time
    3.5.117 Maximum Refresh Time
    3.5.118 Maximum Process Size
    3.5.119 Maximum Load Average
    3.5.120 Replication Settings
    3.5.121 Send Data
    3.5.122 Send Settings
    3.5.123 Batch Rows
    3.5.124 Batch Size
    3.5.125 Batch Idle
    3.5.126 Log Replication
  3.6 Search Settings
    3.6.1 Notes
    3.6.2 Query Logging
    3.6.3 Rotate Schedule
    3.6.4 Email
    3.6.5 Result Order
    3.6.6 Results Style
    3.6.7 Allow RSS
    3.6.8 Format XSL Output
    3.6.9 XSL Engine
    3.6.10 XSL File
    3.6.11 Abstract Style
    3.6.12 Abstract Length
    3.6.13 Max Title Length
    3.6.14 Max URL Display Length
    3.6.15 Results per Page
    3.6.16 Max User Results per Page
    3.6.17 Page Links Shown
    3.6.18 Results per Site
    3.6.19 Allow site: syntax
    3.6.20 Allow link: syntax
    3.6.21 Parametric Search Options
    3.6.22 Result URL Source
    3.6.23 Parametric Search Query
    3.6.24 Group By
    3.6.25 Max Group Bys
    3.6.26 Max Docs to Group By
    3.6.27 Results Width

Details

  • File Type
    pdf
  • Content Languages
    English
  • File Pages
    353 pages
