source: src/tok.l/tok_cmdline.ggo

Last change on this file was e0cd003, checked in by Tomasz Obrebski <to@…>, 11 years ago

shared -e parameter removed
help texts polished

  • Property mode set to 100644
File size: 813 bytes
package "tok"
version "0.1"
usage   "tok [OPTIONS]"
purpose "tok transforms raw text into UTT format."

description "OPTIONS"

option "interactive"            i       "Interactive mode (no output buffering)." flag off

text "
DESCRIPTION

tok reads from standard input, identifies tokens on the basis of their orthographic form and writes a sequence of segments in UTT format to
the standard output.

OUTPUT FORMAT

UTT-file with four fields: START, LENGTH, TYPE, and FORM. In the TYPE field five types of tokens are distinguished:

  W (word) - continuous sequence of letters
  N (number) - continuous sequence of digits
  S (space) - continuous sequence of space characters
  P (punctuation) - single printable character other than W, N, S
  B (unprintable character) - single unprintable character

USAGE EXAMPLE

      tok
"
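
The five token types described in the help text can be illustrated with a small sketch. This is not the tok implementation (the file path suggests tok itself is generated from a lexer specification); it is a minimal Python approximation under stated assumptions: the names `Segment`, `classify`, and `tokenize` are hypothetical, character classes are taken from Python's Unicode-aware `str` methods, START and LENGTH are counted in characters rather than bytes, and newlines and tabs are treated as space characters. The real tool may differ on each of these points, and the exact textual layout of a UTT line is not reproduced here.

```python
from typing import Iterator, NamedTuple


class Segment(NamedTuple):
    """One segment with the four fields named in the help text (hypothetical type)."""
    start: int   # offset of the first character (assumption: characters, not bytes)
    length: int  # number of characters in the segment
    type: str    # one of "W", "N", "S", "P", "B"
    form: str    # the text of the token itself


def classify(ch: str) -> str:
    """Map a single character to one of the five token types."""
    if ch.isalpha():
        return "W"  # letter
    if ch.isdigit():
        return "N"  # digit
    if ch.isspace():
        return "S"  # space character (assumption: includes tabs and newlines)
    if ch.isprintable():
        return "P"  # any other printable character
    return "B"      # unprintable character


def tokenize(text: str) -> Iterator[Segment]:
    """Yield segments: W, N, S are maximal runs; P and B are single characters."""
    i = 0
    while i < len(text):
        kind = classify(text[i])
        j = i + 1
        if kind in ("W", "N", "S"):
            # Extend the run while the following characters have the same type.
            while j < len(text) and classify(text[j]) == kind:
                j += 1
        yield Segment(i, j - i, kind, text[i:j])
        i = j


if __name__ == "__main__":
    for seg in tokenize("Hi 42!"):
        print(seg.start, seg.length, seg.type, repr(seg.form))
```

Note that consecutive punctuation characters such as `!!` produce two separate P segments in this sketch, following the "single printable character" wording of the help text.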