Search Tools Taporware Find Collocates (Plain) Plain-Text
Search Tools
TAPoRware Find Collocates (Plain)
PLAIN-TEXT
Collocation tool takes a word from the user and returns all of the words directly before and directly after it, based on the given context. Results are listed alphabetically, by frequency, or by Z-score (an indication of how far and in what direction that item deviates from its distribution's mean, expressed in units of its distribution's standard deviation). NOTE: If you select a context of "Words" with a long context length, or you select another context (Lines, Sentences, or Paragraph), it is very likely that the specified pattern appears more than once in each concordance text. If this occurs, the collocates will be counted more than once as well, so the collocate counts are not accurate. For the same reason, the Z-score values are not accurate. We will find a way to fix this later.
Detailed Info - TryIt - Website
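The idea behind collocate counting and Z-score ranking can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name collocates, the span parameter, and the Z-score formula (a common corpus-linguistics convention) are not taken from TAPoRware, whose exact computation is not documented here.

```python
import math
import re
from collections import Counter


def collocates(text, node, span=1):
    """Count words within `span` tokens of `node` and score them.

    Assumed z-score convention (not necessarily TAPoRware's):
    z = (observed - expected) / sqrt(expected * (1 - p)),
    where p is the collocate's relative frequency among non-node tokens.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    freq = Counter(tokens)
    node_count = freq[node]
    observed = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        lo, hi = max(0, i - span), min(len(tokens), i + span + 1)
        for j in range(lo, hi):
            # Skip the node itself so it is never its own collocate.
            if tokens[j] != node:
                observed[tokens[j]] += 1
    rest = len(tokens) - node_count
    results = []
    for word, obs in observed.items():
        p = freq[word] / rest
        expected = p * node_count * 2 * span
        denom = math.sqrt(expected * (1 - p))
        z = (obs - expected) / denom if denom > 0 else 0.0
        results.append((word, obs, round(z, 2)))
    return results


rows = collocates("the cat sat on the mat and the cat ran", "cat")
by_alphabet = sorted(rows)
by_frequency = sorted(rows, key=lambda r: -r[1])
by_zscore = sorted(rows, key=lambda r: -r[2])
```

The three sorted lists correspond to the three result orders the tool description mentions.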
TAPoRware Date Finder (Plain)
PLAIN-TEXT
This tool extracts dates from a plain-text document. Dates can be limited to all dates, years, months, weeks, seasons, North American holidays or user-defined dates (e.g. specific month(s), week(s), season(s), holiday(s)).
Detailed Info - TryIt - Website
TAPoRware Find Co-occurrence (Plain)
PLAIN-TEXT
Co-occurrence tool looks for two words a certain distance apart from one another. By entering a primary and secondary pattern, TAPoR will search the document for anywhere that the two patterns are within the user-specified limits of words, sentences, or lines.
Note: If you select a context of "Words" with a long context length, or you select another context (Lines, Sentences, or Paragraph), it is very likely that the specified pattern appears more than once in each concordance text. If this occurs, the collocates will be counted more than once as well, so the collocate counts are not accurate. For the same reason, the Z-score values are not accurate. We will find a way to fix this later.
Detailed Info - TryIt - Website
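A minimal sketch of the co-occurrence search described above, limited to a word-distance context. The function name cooccurrences and the max_distance parameter are hypothetical, and the tokenization is a simplification, not TAPoRware's.

```python
import re


def cooccurrences(text, primary, secondary, max_distance=5):
    """Return (primary_index, secondary_index) token positions where the
    two patterns match within `max_distance` words of each other."""
    tokens = re.findall(r"\w+", text.lower())
    first = [i for i, t in enumerate(tokens) if re.fullmatch(primary, t)]
    second = [i for i, t in enumerate(tokens) if re.fullmatch(secondary, t)]
    return [(i, j) for i in first for j in second
            if i != j and abs(i - j) <= max_distance]


sample = "The whale surfaced. The white whale dived near the ship."
pairs = cooccurrences(sample, "whale", "ship", max_distance=4)
```

Because every qualifying pair is reported, a pattern that occurs several times inside one context window contributes several hits, which is the kind of multiple counting the note above warns about.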
TAPoRware Synonym Finder
HTML, PLAIN-TEXT, XML
This tool uses the Roget's Interactive Thesaurus service to get the synonyms and antonyms of a given word.
Detailed Info - TryIt - Website
TAPoRware Date Finder (XML)
TEI, XML
This tool extracts dates from an XML document. Dates can be limited to all dates, years, months, weeks, seasons, North American holidays or user-defined dates (e.g. specific month(s), week(s), season(s), holiday(s)).
Detailed Info - TryIt - Website
TAPoRware Date Finder (HTML)
HTML
This tool extracts dates from an HTML document. Dates can be limited to all dates, years, months, weeks, seasons, North American holidays or user-defined dates (e.g. specific month(s), week(s), season(s), holiday(s)).
Detailed Info - TryIt - Website
TAPoRware Find Collocates (HTML)
HTML
The collocation tool takes a word from the user and returns all of the words directly before and directly after it, based on the given context. The results are listed alphabetically, by frequency, or by Z-score (an indication of how far and in what direction that item deviates from its distribution's mean, expressed in units of its distribution's standard deviation).
Detailed Info - TryIt - Website
TAPoRware Find Collocates (XML)
TEI, XML
Collocation tool takes a word from the user and returns all of the words directly before and directly after it based on the given context. The results are listed alphabetically, by frequency, or by Z-score (an indication of how far and in what direction that item deviates from its distribution's mean, expressed in units of its distribution's standard deviation).
Detailed Info - TryIt - Website
TAPoRware Find Co-occurrence (HTML)
HTML
Co-occurrence tool looks for two words a certain distance apart from one another. By entering a primary and secondary pattern, TAPoR will search the document for anywhere the two patterns are within the user-specified limits of words, sentences, or lines. If desired, the results can be narrowed to include words only found within certain tags.
Detailed Info - TryIt - Website
TAPoRware Find Co-occurrence (XML)
TEI, XML
Co-occurrence tool looks for two words a certain distance apart from one another. By entering a primary and secondary pattern, TAPoR will search the document for anywhere the two patterns are within the user-specified limits of words/sentences/lines or surrounding elements.
Detailed Info - TryIt - Website
TAPoRware Find Words - Concordance (HTML)
HTML
Find Concordance (HTML) tool can find text anywhere in an HTML document. The search can be narrowed to specified tags. All results are returned with a concordance of either words, sentences, or lines.
Detailed Info - TryIt - Website
TAPoRware Find Words - Concordance (Plain)
PLAIN-TEXT
The Concordance (Text) tool can find text anywhere in a text document. The search can also be used to view a concordance of either words, sentences, or lines surrounding the result.
Detailed Info - TryIt - Website
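A keyword-in-context (KWIC) concordance over a word window can be sketched as follows. The function name concordance and the width parameter are illustrative assumptions; sentence and line contexts are left out for brevity.

```python
import re


def concordance(text, pattern, width=3):
    """Return (left, match, right) tuples with `width` words of context."""
    tokens = text.split()
    rows = []
    for i, tok in enumerate(tokens):
        if re.search(pattern, tok, flags=re.IGNORECASE):
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows


for left, hit, right in concordance(
        "Call me Ishmael. Some years ago, never mind how long", "years", 2):
    print(f"{left:>20} [{hit}] {right}")
```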
TAPoRware Find Words - Concordance (XML)
TEI, XML
Find Concordance (XML) tool can find text anywhere in an XML document using the Find Text tool. The search can be narrowed to specified elements or attributes, and all results are returned with a concordance of either words/sentences/lines or surrounding elements.
Detailed Info - TryIt - Website
TAPoRware Acronym Finder
HTML, PLAIN-TEXT, XML
This tool tries its best to find all the possible acronyms and their original names in a submitted text. However, for some acronyms, such as IGO (Intergovernmental Organization), the original name cannot be identified.
Detailed Info - TryIt - Website
TAPoRware CAPs Finder
HTML, PLAIN-TEXT, XML
This tool tries to find all the capitalized words (CAPs) in the submitted text. It lists phrases of more than one CAP first, followed by all single CAPs except the first word of each sentence.
Detailed Info - TryIt - Website
TAPoRware Words Distribution -- Weighted Centroid
HTML, PLAIN-TEXT, XML
This tool displays a circular graph based on word distribution data or the tablet word distribution data. The text is divided into an arbitrary number of units, which are positioned around the circumference of the circle in a clockwise sequence. The more times a word appears in a particular text unit, the closer the word will be to that unit in the circle. If a word appears an equal number of times in all units, it will be located in the centre of the circle. Words are colour coded based on the number of times they appear in the text as a whole. Blue words have the highest word count. Rolling over a word will display lines representing its connections to the units. Clicking a word will keep its lines visible after you move the mouse off it. Click the word again to remove the lines. The darker the line, the more times the word was found in that unit. Additionally, all the words found in the graph are listed on the left side of the applet. There is a scroll bar for viewing the words, should they extend past the bottom of the applet. This list of words features the same rollover and clicking functionality as the words in the graph itself. This tool uses the Processing library. * This tool requires the JRE (v1.4.2 and up) in order to work properly.
Detailed Info - TryIt - Website
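The placement rule described above (a word sits closer to the units where it occurs more often, and in the centre when it is evenly spread) is what a count-weighted average of the unit positions produces. The sketch below assumes units start at the top of the circle and that coordinates use a y-up convention; both choices, and the function names, are illustrative rather than taken from the applet.

```python
import math


def unit_positions(n_units, radius=1.0):
    """Place n_units evenly around a circle, clockwise from the top."""
    positions = []
    for k in range(n_units):
        angle = math.pi / 2 - 2 * math.pi * k / n_units
        positions.append((radius * math.cos(angle), radius * math.sin(angle)))
    return positions


def weighted_centroid(counts, radius=1.0):
    """Average the unit positions, weighted by the word's count per unit."""
    total = sum(counts)
    if total == 0:
        return (0.0, 0.0)
    points = unit_positions(len(counts), radius)
    x = sum(c * px for c, (px, _) in zip(counts, points)) / total
    y = sum(c * py for c, (_, py) in zip(counts, points)) / total
    return (x, y)


even = weighted_centroid([2, 2, 2, 2])      # lands at the centre
skewed = weighted_centroid([5, 0, 0, 0])    # lands on the first unit
```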
TAPoRware Principal Component Analysis Tool (Plain)
PLAIN-TEXT
This tool uses principal component analysis (PCA) to analyze the relationships among words across the user-specified text units.
Detailed Info - TryIt - Website
Text Gathering Tools
TAPoRware Extract Text (XML)
TEI, XML
Extract Text from XML Documents tool can extract the full body of text from an XML Document. This tool can also pull the text from user-specified elements or attributes.
Detailed Info - TryIt - Website
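Extracting text from a whole XML document, from named elements, or from attributes can be sketched with Python's standard xml.etree.ElementTree module. The function name extract_text and its parameters are hypothetical; for namespaced documents such as TEI, the element name would need to include the namespace.

```python
import xml.etree.ElementTree as ET


def extract_text(xml_string, element=None, attribute=None):
    """Return a list of text strings (or attribute values) from the XML.

    element=None   -> one string holding the full text of the document
    element="p"    -> one string per <p> element
    attribute="n"  -> the values of attribute n on matching elements
    """
    root = ET.fromstring(xml_string)
    if attribute is not None:
        values = (el.get(attribute) for el in root.iter(element))
        return [v for v in values if v is not None]
    nodes = root.iter(element) if element else [root]
    # Join nested text and collapse runs of whitespace.
    return [" ".join("".join(node.itertext()).split()) for node in nodes]


doc = ('<text><head type="main">Title</head>'
       '<p>First <hi>bold</hi> line.</p><p>Second.</p></text>')
paragraphs = extract_text(doc, element="p")
head_types = extract_text(doc, element="head", attribute="type")
```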
TAPoRware Googlizer
HTML, PLAIN-TEXT, XML
This tool calls the Google search engine directly and returns its results. Google's rules for search terms can be applied here directly.
Detailed Info - TryIt - Website
TAPoRware Extract Text (HTML)
HTML
Retrieves HTML text based on user-specified HTML tags.
Detailed Info - TryIt - Website
TAPoRware Text Aggregator
HTML, PLAIN-TEXT, XML
This tool aggregates multiple texts from different locations and in different formats into a single text. The text sources can be valid URLs, local files, or text you type in.
Detailed Info - TryIt - Website
List and Statistical Tools
TAPoRware Summarizer (Plain)
PLAIN-TEXT
Extracts statistical information, a high-frequency word list, concordances, etc.
Detailed Info - TryIt - Website
TAPoRware Tokenizer (XML)
TEI, XML
This tool splits an XML document at specified points, or tokens. These tokens can be words, lines, sentences, paragraphs, characters, patterns, or tags. The results can be listed with the token removed, or preserved before or after the split.
Detailed Info - TryIt - Website
QMatrix
PLAIN-TEXT
This tool implements Raymond Queneau's matrix analysis of language with a given text. The results of the analysis include a breakdown of the text into formatives, signifiers, and bi-words. N.B. In order to work with texts with accents, load your document to myTexts first, then select the document when you use QMatrix. Accented characters will not display properly but the tool will work otherwise.
Detailed Info - TryIt - Website
TAPoRware List Tags (HTML)
HTML
This service lists all the HTML tags of an HTML document.
Detailed Info - TryIt - Website
TAPoRware Tokenizer (HTML)
HTML
This tool splits an HTML document at specified points, or tokens. These tokens can be words, lines, sentences, and paragraphs, as well as certain characters, patterns, or tags. The results can be listed with the token removed, before the split, or after the split.
Detailed Info - TryIt - Website
TAPoRware Tokenizer (Plain)
PLAIN-TEXT
Tokenize tool splits a text document at specified points, or tokens. These tokens can be words, lines, sentences, and paragraphs, as well as certain characters or patterns. The results can be listed with the token removed, before the split, or after the split.
Detailed Info - TryIt - Website
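The three ways of listing results (token removed, token kept before the split, token kept after the split) can be sketched as below. The reading of "before" and "after" used here is an interpretation of the description, and the function name split_on and the keep parameter are hypothetical.

```python
import re


def split_on(text, pattern, keep="remove"):
    """Split `text` at each match of the regular expression `pattern`.

    keep="remove": the matched token is dropped
    keep="before": the token stays at the end of the preceding piece
    keep="after":  the token stays at the start of the following piece
    """
    if keep not in ("remove", "before", "after"):
        raise ValueError(f"unknown keep mode: {keep}")
    pieces, start = [], 0
    for match in re.finditer(pattern, text):
        if keep == "before":
            pieces.append(text[start:match.end()])
            start = match.end()
        else:
            pieces.append(text[start:match.start()])
            start = match.end() if keep == "remove" else match.start()
    pieces.append(text[start:])
    return [piece for piece in pieces if piece]


removed = split_on("a;b;c", ";")
kept_before = split_on("a;b;c", ";", keep="before")
kept_after = split_on("a;b;c", ";", keep="after")
```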
Hyperpoet Frequencies
PLAIN-TEXT, TEI
Lists text frequencies for a user-specified URL.
Detailed Info - TryIt - Website
Test List Words Tool
PLAIN-TEXT
description
Detailed Info - TryIt - Website
TAPoRware List Word Pairs
HTML, PLAIN-TEXT, XML
This tool will list word pairs in a corpus based on different criteria.
Detailed Info - TryIt - Website
TAPoRware List Elements (XML)
TEI, XML
List XML Elements tool is used to display all of the elements contained in an XML document. This tool also allows the user to count all instances of an element and to view the structure/hierarchy of the document. It also provides a variety of tools for listing attributes and attribute values.
Detailed Info - TryIt - Website
TAPoRware List Words (HTML)
HTML
List Words (HTML) tool can be used to list all words, or user-specified words, found within a specified tag. The query results can be displayed alphabetically, by frequency, by order of appearance, or in reversed alphabetical order. If no tag is specified, the 'body' tag is used.
Detailed Info - TryIt - Website
TAPoRware List Words (Plain)
PLAIN-TEXT
List Words (Plain) tool can be used to list all of the words found within a given text document. The query results can be displayed alphabetically, by frequency, by order of appearance, or in reversed alphabetical order.
Detailed Info - TryIt - Website
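The four display orders can be illustrated with a short word-listing sketch. The function name list_words, the order keywords, and the simple letter-based tokenization are assumptions made for the example.

```python
import re
from collections import Counter


def list_words(text, order="alphabetical"):
    """Return (word, count) pairs in the requested order."""
    words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text.lower())
    counts = Counter(words)
    if order == "alphabetical":
        keys = sorted(counts)
    elif order == "reverse":
        keys = sorted(counts, reverse=True)
    elif order == "frequency":
        # Most frequent first; ties broken alphabetically.
        keys = sorted(counts, key=lambda w: (-counts[w], w))
    elif order == "appearance":
        # Counter keeps keys in first-seen order.
        keys = list(counts)
    else:
        raise ValueError(f"unknown order: {order}")
    return [(w, counts[w]) for w in keys]


by_frequency = list_words("the cat and the hat", order="frequency")
```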
TAPoRware List Words (XML)
TEI, XML
List Words (XML) tool can be used to list all words, or user-specified words, found within a specified element. The query results can be displayed alphabetically, by frequency, by order of appearance, or in reversed alphabetical order. If no element is specified, all words in the XML document will be returned.
Detailed Info - TryIt - Website
TAPoRware Texts Comparator
HTML, PLAIN-TEXT, XML
The tool compares two user submitted texts. It performs basic statistics on each text and lists the words and counts side by side.
Detailed Info - TryIt - Website
Visualization Tools
TAPoRware Pattern Distribution (XML)
TEI, XML
This tool displays the distribution of a user-specified pattern over different text units, in different formats.
Detailed Info - TryIt - Website
TAPoRware Pattern Distribution (HTML)
HTML
This tool visually displays the distribution of a pattern over user-selected text units.
Detailed Info - TryIt - Website
TAPoRware Pattern Distribution (Plain)
PLAIN-TEXT
This tool displays the distribution of a pattern over a plain-text string, in different formats.
Detailed Info - TryIt - Website
TAPoRware Visual Collocator
HTML, PLAIN-TEXT, XML
The Visual Collocator displays collocates of words using a graph layout. Words which share similar collocates will be drawn together in the graph, producing new insight into the text. Any word can be double-clicked to fetch its collocates. Any word can be removed from the graph, and new words can be added using the text field. Additionally, words can be made "sticky", then dragged around to new positions, creating a user defined layout. This tool uses the prefuse library. * This tool requires the JRE (v1.4.2 and up) in order to work properly.
Detailed Info - TryIt - Website
TAPoRware Word Cloud
HTML, PLAIN-TEXT, XML
This tool counts the words in the text and displays them in a font and colour based on their count. The order and the number of words are specified by the user.
Detailed Info - TryIt - Website
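One simple way to turn word counts into display sizes is linear scaling between a minimum and a maximum font size. This is an assumed mapping for illustration only; the description does not say how the tool derives font or colour from the counts, and the name cloud_sizes and its parameters are hypothetical.

```python
import re
from collections import Counter


def cloud_sizes(text, top_n=10, min_pt=10, max_pt=40):
    """Map the top_n most frequent words to font sizes in points."""
    counts = Counter(re.findall(r"[a-z']+", text.lower())).most_common(top_n)
    if not counts:
        return {}
    highest, lowest = counts[0][1], counts[-1][1]
    spread = highest - lowest
    sizes = {}
    for word, count in counts:
        if spread == 0:
            sizes[word] = max_pt  # all words equally frequent
        else:
            scale = (count - lowest) / spread
            sizes[word] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


sizes = cloud_sizes("a a a b b c", top_n=3)
```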
TAPoRware Word Brush
HTML, PLAIN-TEXT, XML
'Word Brush' allows the user to paint with words extracted from an online document. This tool uses Java to display results. The current applet was compiled using Java 1.4.2_08 so you will need at least this version of the Java plug-in for your browser if you wish to use it.
Detailed Info - TryIt - Website
TAPoRware Raining Words
HTML, PLAIN-TEXT, XML
'Raining Words' is designed to display high-frequency words such that they are rendered larger and move more slowly than words with lower frequencies. The source can be plain text, XML or HTML. If the source is XML or HTML, the processed text will be limited to the content of a user-specified element. It is then filtered using a stop-word list. The resulting text is then scanned for the top 20 high-frequency words. This tool uses Java to display results. The current applet was compiled using Java 1.4.2_08 so you will need at least this version of the Java plug-in for your browser if you wish to use it.
Detailed Info - TryIt - Website
TAPoRware Hypergraph (XML)
XML
Uses the hypergraph package to draw the XML document's structure.
Detailed Info - TryIt - Website
Humanist Trends Viewer (Voyeur)
PLAIN-TEXT
This tool is intended to help view word trends in the Humanist Discussion Group archive. The archives have been scraped from the web, segmented into yearly volumes, and stripped down to only the plain text from the bodies of the email messages (see the TADA Archives of the Humanist Discussion Group for more information on acquiring and processing the archives).
Detailed Info - TryIt - Website
Editing Tools
XSL Transformer
DOCBOOK, MEP, TARL, TEI, XML
This is a transformation tool that applies the specified XSL stylesheet to the selected XML file.
Detailed Info - TryIt
TEI Transformer
TEI
This is a transformation tool that uses a set of TEI stylesheets to convert a TEI document into HTML. For more information see http://www.tei-c.org/Stylesheets/teic/.
Detailed Info - TryIt
Neko Transformer
HTML
This is an HTML "tidying" tool based on the CyberNeko HTML Parser. Use Neko Transformer to balance tags and "fix up many common mistakes that human (and computer) authors make in writing HTML documents. NekoHTML adds missing parent elements; automatically closes elements with optional end tags; and can handle mismatched inline element tags." More information about Neko can be found at http://people.apache.org/~andyc/neko/
Detailed Info - TryIt
HTML Entity Transformer
HTML
This is a transformation tool that reads the specified HTML document and converts all HTML entities into their Unicode counterparts. The tool produces an HTML page once it completes.
Detailed Info - TryIt
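Converting entities to their Unicode counterparts can be sketched with Python's standard html module. The choice to leave the five markup-significant named entities untouched is an assumption made here so that the output stays valid markup; the tool's actual behaviour for those entities is not described. The function name entities_to_unicode is hypothetical.

```python
import html
import re

# Named entities that must stay escaped for the markup to remain valid.
_KEEP = {"&lt;", "&gt;", "&amp;", "&quot;", "&apos;"}

_ENTITY = re.compile(r"&(?:#\d+|#[xX][0-9a-fA-F]+|\w+);")


def entities_to_unicode(source):
    """Replace character entities with the Unicode characters they name."""
    def replace(match):
        entity = match.group(0)
        return entity if entity in _KEEP else html.unescape(entity)
    return _ENTITY.sub(replace, source)


converted = entities_to_unicode("<p>caf&eacute; &amp; cr&#232;me &lt;3</p>")
```

Numeric references to markup characters (for example the numeric form of the less-than sign) are still converted by this sketch, so a stricter version would need to check the decoded character as well.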
Raw Entity Transformer
XML
This is a transformation tool that reads the specified XML document and converts all entities in the document into their Unicode counterparts. The tool outputs the resulting XML in its raw form. As a result, most browsers will not be able to display the result properly. Please save the tool invocation results locally to see the XML output.
Detailed Info - TryIt
Pretty Entity Transformer
XML
This is a transformation tool that reads the specified XML document and converts all entities in the document into their Unicode counterparts. The tool output is then converted to a pretty-printed HTML with an XSL stylesheet. This tool is not suitable for processing large XML files due to the limitations imposed by the XSL transformation. Please use Raw Entity Transformer if it is necessary to transform large XML files.
Detailed Info - TryIt
MS Word Transformer
MSWORD
This is a converter tool that extracts plain text from Microsoft Word documents using the Jakarta POI library.
Detailed Info - TryIt
Highlighter
HTML, MSWORD, PDF
Highlighter tool uses the Apache Lucene library to provide a KWIC search. It highlights all occurrences of the specified query within a document.
Detailed Info - TryIt
Diff Transformer
DOCBOOK, HTML, MEP, PLAIN-TEXT, TARL, TEI, XML
This tool compares two files and highlights the differences in each of them.
Detailed Info - TryIt
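A line-by-line comparison of two texts can be sketched with Python's standard difflib module. This is a stand-in to show the idea; the Diff Transformer's own comparison algorithm and output format are not described here, and the function name diff_lines is hypothetical.

```python
import difflib


def diff_lines(old_text, new_text):
    """Return unified-diff lines describing how old_text became new_text."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="old",
        tofile="new",
        lineterm="",
    ))


for line in diff_lines("one\ntwo\nthree", "one\n2\nthree"):
    print(line)
```

Lines starting with "-" appear only in the first text and lines starting with "+" appear only in the second.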
PDF Transformer
This is a converter tool that extracts plain text from a PDF document. The tool uses PDFBox, an open source Java PDF library for working with PDF. More information about the library can be found at http://www.pdfbox.org.
Detailed Info - TryIt
Miscellaneous Tools
LiteMorph
PLAIN-TEXT
LiteMorph
Detailed Info - TryIt - Website
TAPoRware XML Transformer
XML
Provide your XML and XSL files, and this tool will transform the XML into the format specified in your XSL file. However, because the output is pre-configured as HTML, if your XSL's target format is XML, you need to view the source of the output to see the XML.
Detailed Info - TryIt - Website
TokenX
XML
A text visualization, analysis, and play tool
Detailed Info - TryIt - Website