Changeset 243d027 for src/tok.l/tok_cmdline.ggo
- Timestamp: 01/18/13 18:46:38
- Branches: master
- Children: e0cd003
- Parents: 18e1952
- git-author: Tomasz Obrebski <to@…> (01/18/13 18:46:38)
- git-committer: Tomasz Obrebski <to@…> (01/18/13 18:46:38)
- File: 1 edited
src/tok.l/tok_cmdline.ggo
--- r5f4d9c3
+++ r243d027
 package "tok"
 version "0.1"
+usage "tok [OPTIONS]"
+purpose "tok transforms raw text into UTT format."
 
-option "interactive" i "Interactive mode." flag off
+description "OPTIONS"
+
+option "interactive" i "Interactive mode (no output buffering)." flag off
+
+text "
+DESCRIPTION
+
+tok reads from standard input, identifies tokens on the basis of their orthographic form and writes a sequence of segments in UTT format to
+the standard output. The type of the token is printed as the type field.
+
+OUTPUT FORMAT
+
+UTT-file with four fields: start, length, type, and form. In the type field five types of tokens are distinguished:
+
+W (word) - continuous sequence of letters
+N (number) - continuous sequence of digits
+S (space) - continuous sequence of space characters
+P (punctuation) - single printable character other than W, N, S
+B (unprintable character) - single unprintable character
+
+USAGE EXAMPLE
+
+tok
+"
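The five token types described in the new help text can be illustrated with a short Python sketch. This is not the tok implementation (tok is a flex scanner); the names `char_class` and `tokenize` are hypothetical, the character classes rely on Python's own `str` predicates, offsets are counted in characters rather than bytes, and the segments are returned as plain tuples instead of being serialized in the real UTT file layout.

```python
def char_class(ch):
    """Map one character to a token type (assumed rules, see lead-in)."""
    if ch.isalpha():
        return "W"  # letter
    if ch.isdigit():
        return "N"  # digit
    if ch.isspace():
        return "S"  # space character
    if ch.isprintable():
        return "P"  # printable, but not a letter, digit or space
    return "B"      # unprintable


def tokenize(text):
    """Return a list of (start, length, type, form) tuples covering text."""
    segments = []
    i = 0
    while i < len(text):
        kind = char_class(text[i])
        j = i + 1
        # W, N and S are continuous runs; P and B are single characters.
        if kind in ("W", "N", "S"):
            while j < len(text) and char_class(text[j]) == kind:
                j += 1
        segments.append((i, j - i, kind, text[i:j]))
        i = j
    return segments


if __name__ == "__main__":
    for start, length, kind, form in tokenize("Ala ma 2 koty."):
        print(start, length, kind, repr(form))
```

Under these assumptions, `tokenize("tok 0.1")` yields five segments: a word, a space, a number, a punctuation mark and another number. The period splits `0.1` because P tokens are always single characters and do not merge with neighbouring digits.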