Index: p/doc/Makefile
===================================================================
--- app/doc/Makefile	(revision b2647ded51d1c3f8e4a856396ea16c7afd54b9e6)
+++ 	(revision )
@@ -1,36 +1,0 @@
-main: utt.info utt.pdf utt.html utt.ps
-
-utt.info: utt.texinfo
-	makeinfo utt.texinfo
-
-utt.pdf: utt.texinfo
-	texi2pdf utt.texinfo
-	rm utt.{aux,cp,fn,ky,log,pg,toc,tp,vr}
-
-utt.html: utt.texinfo
-	makeinfo --html --no-split utt.texinfo
-
-utt.dvi: utt.texinfo
-	texi2dvi utt.texinfo
-
-utt.ps:	utt.dvi
-	dvips -o utt.ps utt.dvi
-
-
-copy:
-ifdef UTT_SHARE_DIR
-	# create archives (required by the programs)
-	gzip --best utt.info
-	mv utt.info.gz ${UTT_SHARE_DIR}/info/
-
-	# create archives (required by the programs)
-	#gzip --best utt.man
-	#mv utt.man.gz ${UTT_SHARE_DIR}/man/man3/utt.gz.1
-
-	# remaining documents
-	mv utt.{ps,pdf,html} ${UTT_SHARE_DIR}/doc/utt/
-endif
-
-clean:
-	rm -f utt.{aux,cp,dvi,fn,fns,html,info,ky,log,pdf,pg,ps,toc,tp,vr}
-	rm -f *~
Index: p/doc/utt.texinfo
===================================================================
--- app/doc/utt.texinfo	(revision 2d89d4bc829d3ca6b96523646f3a340ab8bbcbd6)
+++ 	(revision )
@@ -1,2920 +1,0 @@
-
-\input texinfo   @c -*-texinfo-*-
-@c @documentencoding ISO-8859-2
-@documentencoding UTF-8
-@c @documentlanguage pl
-
-@c %**start of header
-@setfilename utt.info
-@settitle UAM Text Tools v0.90
-@c %**end of header
-
-@copying
-This manual is for UAM Text Tools (version 0.90, October, 2008)
-
-Copyright @copyright{}  2005, 2007  Tomasz Obrębski, Michał Stolarski, Justyna Walkowska, Paweł Konieczka.
-
-Permission is granted to copy, distribute and/or modify this document
-under the terms of the GNU Free Documentation License, Version 1.2 or
-any later version published by the Free Software Foundation; with no
-Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.  A
-copy of the license is included in the section entitled ``GNU Free
-Documentation License''.
-
-@c @quotation
-@c Permission is granted to ...
-@c No permission is granted until the document is completed.
-@c @end quotation
-@end copying
-
-
-@titlepage
-@title UAM Text Tools 0.90 - User Manual
-@subtitle edition 0.01, @today
-@subtitle status: prescript
-@author by Justyna Walkowska, Tomasz Obrębski and Michał Stolarski
-@page
-@vskip 0pt plus 1filll
-@insertcopying
-@end titlepage
-
-@contents
-
-@c @paragraphindent none
-
-@iftex
-@tex
-% \usepackage[T1]{fontenc}
-% \usepackage[utf8]{inputenc}
-% \usepackage{times}
-@end tex
-
-@parskip = 0.5@normalbaselineskip plus 3pt minus 1pt
-@end iftex
-@c @headings off
-@c @everyheading LEM(1) @| @| LEM(1)
-@everyfooting @today @c @| @thispage @|
-
-@ifnottex
-
-@node Top
-@top UTT - UAM Text Tools
-
-@insertcopying
-
-@menu
-* General information::                       
-* UTT file format::             
-* Configuration files::         
-* UTT components::
-* Auxiliary tools::
-* Usage examples::              
-* PMDBF dictionary::            
-@c * Examples::                    
-@c * Copyright::
-* GNU Free Documentation License:: 
-* Reporting bugs::                                    
-* Author::                      
-@end menu
-@end ifnottex
-
-
-@c ----------------------------------------------------------------------
-
-@node General information
-@chapter General information
-
-UAM Text Tools (UTT) is a package of language processing tools
-developed at Adam Mickiewicz University. Its functionality includes:
-
-@itemize @bullet
-
-@item
-tokenization
-@item
-dictionary-based morphological analysis
-@item
-heuristic morphological analysis of unknown words
-@item
-spelling correction
-@item
-pattern search
-@item
-sentence splitting
-@item
-generation of concordance tables
-@end itemize
-
-The toolkit is intended for processing raw (unannotated),
-unrestricted text for any purpose.
-
-The system is organized as a collection of command-line programs, each
-performing one operation, e.g. tokenization, lemmatization, spelling
-correction. The components are independent of one another; what
-unifies them is a common I/O file format.
-
-The components may be combined in various ways to provide different text
-processing services. New components supplied by the user can also be
-incorporated into the system easily, provided that they respect the I/O
-file format conventions.
-
-UTT component programs do not depend on any specific tagset or
-morphological description format.
-
-UTT is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by 
-the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
-
-The Polex/PMDBF dictionary is licensed under the Creative Commons by-nc-sa License which prohibits commercial use.  
-
-
-List of contributors:
-
-@itemize
-@item Paweł Konieczka
-@item Tomasz Obrębski
-@item Michał Stolarski
-@item Marcin Walas
-@item Justyna Walkowska
-@item Paweł Wereński
-@end itemize
-
-@c ----------------------------------------------------------------------
-@c ---------------------------------------------------------------------
-
-@node    UTT file format
-@chapter UTT file format
-
-A UTT file contains annotation of a text. It consists of a sequence of
-segments. Each segment explicitly refers to a continuous piece of the
-text and provides some information on it.
-
-@section Segment format
-
-A segment occupies one line of a UTT file and consists of
-space-separated fields:
-
-
-@quotation
-@sp 1
-[@var{start} [@var{length}]] @var{type} @var{form} [@var{annotation1} [@var{annotation2} ...]]
-@sp 1
-@end quotation
-
-@table @var
-
-@item @var{start} 
-Non-negative integer value indicating the position in the source text where the
-segment starts.
-
-@item @var{length}
-Non-negative integer value indicating the length of the segment.
-
-@item @var{type}
-A sequence of non-blank characters that does not consist solely of digits (an all-digit value could be misinterpreted as a @var{start} or @var{length} field).
-@var{type} reflects the main classification of segments
-into words, numbers, punctuation marks, and meta-text markers.
-@xref{tok output,,tok output}, for a description of the automatically recognized type markers.
-
-@item @var{form}
-This field contains the textual form of the segment or the special
-symbol @code{*} indicating that the form is not given (e.g. when the segment has been created artificially to mark something and is of length 0).
-
-The characters or character sequences that have special meaning in the
-@var{form} field are enumerated below.
-
-Characters with special meaning:
-
-@itemize
-@item @code{_} - space character
-@item @code{*} - undefined contents
-@end itemize
-
-Escape sequences:
-
-@itemize
-@item @code{\n} - new line
-@item @code{\t} - tabulation
-@item @code{\r} - carriage return  
-
-@item @code{\_} - the @code{_} character
-@item @code{\*} - the @code{*} character
-@item @code{\\} - the @code{\} character
-
-@c @item @code{\hh} - a character with hexadecimal code @code{hh} (used for non-printable characters)
-@end itemize
-
-@item @var{annotation1}
-@item @var{annotation2}
-@item ...
-Annotation fields have the following format:
-
-@var{longname} @code{:} @var{value}
-
-or
-
-@var{shortname} @var{value}
-
-where @var{longname} is a string of alphanumeric characters
-(isalnum() test), @var{shortname} - a single non-alphanumeric character
-(ispunct() test), and @var{value} is an arbitrary string of non-blank characters.
-
-@end table
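-
-As an illustration of the @var{form} conventions listed above, the
-following Python sketch decodes a @var{form} field back into plain
-text. It is not part of UTT: the name @code{decode_form} is
-hypothetical, and keeping the character after an unrecognized backslash
-sequence unchanged is an assumption.
-
-@example
-# Hypothetical helper, not part of UTT.
-ESCAPES = dict([("n", "\n"), ("t", "\t"), ("r", "\r"),
-                ("_", "_"), ("*", "*"), ("\\", "\\")])
-
-def decode_form(field):
-    """Return the text encoded by a form field, or None for '*'."""
-    if field == "*":
-        return None                      # form not given
-    out = []
-    i = 0
-    while i < len(field):
-        ch = field[i]
-        if ch == "\\" and i + 1 < len(field):
-            nxt = field[i + 1]
-            # Assumption: unknown escapes keep the escaped character.
-            out.append(ESCAPES.get(nxt, nxt))
-            i += 2
-        elif ch == "_":
-            out.append(" ")              # unescaped '_' stands for a space
-            i += 1
-        else:
-            out.append(ch)
-            i += 1
-    return "".join(out)
-@end example
-
-Under these assumptions a field consisting of a single @code{_} decodes
-to one space, while @code{\_} decodes to a literal underscore.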
-
-
-Only two fields are mandatory: @var{type} and @var{form}. All other fields
-may be absent. When only one number precedes the
-@var{type} field, it is interpreted as the @var{start} position.
-
-If the @var{length} field is omitted, the length of the segment is the
-length of the @var{form} field, except when the value of the
-@var{form} field is @code{*} -- in this case, the length is assumed to
-be 0.
-
-If the @var{start} field is also absent, the segment is assumed to directly
-follow the preceding one.
-
-@c Conventions:
-
-@c Annotation fields with predefined meaning:
-
-@c @itemize
-@c @item @code{!} - UTT components are allowed to modify the contents of
-@c the @var{form} field (e.g. spelling correction does this). If this happens the
-@c original form of the segment have to be placed in the @code{!}-field.
-@c @item @code{@@} - morphological description
-@c @item @code{=} - node identifier assignment (used in graph encoding)
-@c @item @code{<} - preceding/dominating node(s) (used in graph encoding)
-@c @item @code{>} - succeeding/subordinate node(s) (used in graph encoding)
-@c @end itemize
-
-Segments of length 0 may be used to mark file positions with some
-information. See e.g. BOS and EOS (beginning/end of sentence) markers
-in the example below.
-
-Example:
-
-sentence: @samp{Piszemy dobre progrumy.}
-
-@example
-0000 00 BOS *
-0000 07 W Piszemy lem:pisać,V
-0007 01 S _
-0008 05 W dobre lem:dobry,ADJ
-0013 01 S _
-0014 08 W progrumy cor:programy lem:program,N
-0022 01 P .
-0023 00 EOS *
-0023 01 S _
-0024 00 BOS *
-0024 11 W Warszawiacy lem:Warszawiak,N
-0035 01 S _
-0036 03 W też
-0039 01 P .
-0040 00 EOS *
-
-@end example
-
-@example
-0000 BOS *
-0000 W Piszemy lem:pisać,V
-0007 S _
-0008 W dobre lem:dobry,ADJ
-0013 S _
-0014 W progrumy cor:programy lem:program,N
-0022 P .
-0023 EOS *
-@end example
-
-Position information may be provided only for some types of segments:
-
-@example
-0000 BOS *
-W Piszemy lem:pisać,V
-S _
-W dobre lem:dobry,ADJ
-S _
-W progrumy cor:programy lem:program,N
-P .
-EOS *
-S _
-0024 BOS *
-W Warszawiacy lem:Warszawiak,N
-S _
-W też
-P .
-EOS *
-@end example
-
-Position/length information may be provided only when necessary:
-
-@example
-0000 04 N *
-0000 N 12
-P .
-N 5
-S _
-W km
-@end example
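-
-The defaulting rules for the @var{start} and @var{length} fields can be
-sketched in Python as follows. This is only an illustration, not code
-from UTT: the name @code{parse_segment} and the @code{prev_end}
-parameter (the position just past the preceding segment) are
-hypothetical, and counting each backslash escape as one character when
-the length is derived from the @var{form} field is an assumption.
-
-@example
-import re
-
-def parse_segment(line, prev_end=0):
-    """Split a segment line into (start, length, type, form, annotations)."""
-    fields = line.split()
-    numbers = []
-    # Up to two leading all-digit fields are start and length;
-    # type and form must always remain.
-    while len(numbers) < 2 and len(fields) > 2 and fields[0].isdigit():
-        numbers.append(int(fields.pop(0)))
-    seg_type, form = fields[0], fields[1]
-    annotations = fields[2:]
-    # A single number is the start; no number means "follows previous".
-    start = numbers[0] if numbers else prev_end
-    if len(numbers) == 2:
-        length = numbers[1]
-    elif form == "*":
-        length = 0
-    else:
-        # Assumption: an escape such as \n counts as one character.
-        length = len(re.sub(r"\\.", "x", form))
-    return start, length, seg_type, form, annotations
-@end example
-
-With this sketch, the line @code{W km} parsed with @code{prev_end=5}
-gives start 5 and length 2, and @code{0000 BOS *} gives start 0 and
-length 0.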
-
-@section UTT File
-
-A UTT file consists of a sequence of segments.  The same text position
-may be covered by multiple segments. In consequence, ambiguous text
-segmentation and ambiguous annotation may be represented.
-
-There are two structural requirements a valid UTT-formatted file
-has to meet:
-
-@itemize @bullet
-
-@item
-segments have to be sorted with respect to the @var{start} field,
-
-@item
-for each
-segment ending at position @var{n}, either there must be a segment starting at
-position @var{n+1}, or position @var{n+1} is not covered by any segment; similarly
-for each segment starting at position @var{n}, either there must be a segment
-ending at position @var{n-1}, or the position @var{n-1} must not be covered
-by any segment.
-
-@end itemize
-
-A valid annotation for the text fragment
-@example
-12.5 km
-@end example
-
-may be 
-
-@example
-0000 02 N 12
-0000 04 N 12.5
-0002 01 P .
-0003 01 N 5
-0004 01 S _
-0005 02 W km
-@end example
-
-but not
-
-@example
-0000 02 N 12
-0000 04 N 12.5
-0004 01 S _
-0005 02 W km
-@end example
-
-because in the latter example the first segment (starting at position
-0000, 2 characters long) ends at position @var{n}=0001, and position
-@var{n+1}=0002 is covered by the second segment, yet no segment starts
-at position 0002.
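-
-The two structural requirements can be checked mechanically. The
-Python sketch below is an illustration only, not part of UTT: the name
-@code{is_valid} is hypothetical, segments are given as
-(@var{start}, @var{length}) pairs, and ignoring zero-length segments in
-the adjacency test is an assumption.
-
-@example
-def is_valid(segments):
-    """segments: list of (start, length) pairs in file order."""
-    starts = [s for s, n in segments]
-    if starts != sorted(starts):
-        return False                       # requirement 1: sorted by start
-    # Inclusive (first, last) positions; zero-length markers are skipped.
-    spans = [(s, s + n - 1) for s, n in segments if n > 0]
-    first_positions = set(s for s, e in spans)
-    last_positions = set(e for s, e in spans)
-
-    def covered(pos):
-        return any(s <= pos <= e for s, e in spans)
-
-    for s, e in spans:
-        # requirement 2: a covered neighbour position must be a boundary
-        if covered(e + 1) and (e + 1) not in first_positions:
-            return False
-        if covered(s - 1) and (s - 1) not in last_positions:
-            return False
-    return True
-@end example
-
-For the two annotations of @samp{12.5 km} shown above, the sketch
-accepts the first list of segments and rejects the second.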
-
-
-@section Flattened UTT file
-
-A UTT file format has two variants: regular and flattened. The regular
-format was described above.  In the flattened format some of the
-end-of-line characters are replaced with line-feed characters.
-
-The flattened format is mainly used to represent whole sentences as
-single lines of the input file (all intrasentential end-of-line
-characters are replaced with line-feed characters).
-
-This trick makes it possible to perform certain text
-processing operations on entire sentences with tools such as
-@command{grep} (see the @command{grp} component) or @command{sed} (see the @command{mar} component).
-
-The conversion between the two formats is performed by the tools:
-@command{fla} and @command{unfla}.
-
-@section Character encoding
-
-The UTT component programs accept only single-byte character encodings, such
-as the ISO 8859, ANSI, and DOS code pages.
-
-
-@c @section Formats
-
-@c @unnumberedsubsubsec Basic format
-
-@c While processing large amounts of the overhead related with explicit
-@c ... of the start position and segment length becomes ... . Therefore,
-@c for efficiency reasons certain shortcuts are possible:
-
-@c @unnumberedsubsubsec Relative start position
-
-@c Start position may be given as relative distance from the last
-@c absolut position. 
-
-@c @unnumberedsubsubsec Absent length
-
-@c Segment length may by omitted. Normally it can be restored by counting
-@c the length of the @emph{form field}. For segments with the special value
-@c @code{*} in the @emph{form field} length 0 is assumed.
-
-@c @unnumberedsubsubsec Absent length and start position
-
-@c Both start position and segment length may be omitted. In this format
-@c each segment is assumed to follow the previous one. This format is,
-@c therefore, suitable only for unambiguously tagged text
-@c (0-length markers can be still used.)
-
-
-@c @table @code
-@c @item AL
-@c @code{1234 03 W kot}
-@c @item RL
-@c @code{+56 03 W kot}
-@c @item A
-@c @code{1234 W kot}
-@c @item R
-@c @code{+56 W kot}
-@c @item 0
-@c @code{W kot}
-@c @end table
-
-
-@c [HOW TO GET POLISH FONTS IN DVI???]
-
-@macro parhelp
-@item @b{@minus{}@minus{}help}, @b{@minus{}h}
-Print help.
-@end macro
-
-
-@macro parversion
-@item @b{@minus{}@minus{}version}, @b{@minus{}V}
-Print version information.
-@end macro
-
-@macro parinteractive
-@item @b{@minus{}@minus{}interactive, @minus{}i}
-This option toggles interactive mode, which is by default off. In the
-interactive mode the program does not buffer the output.
-@end macro
-
-
-@c @macro parfile
-@c @item @b{@minus{}@minus{}file=@var{filename}, @minus{}f @var{filename}}
-@c Input file name.
-@c If this option is absent or equal to '@minus{}', the program
-@c reads from the standard input.
-@c @end macro
-
-
-@c @macro paroutput
-@c @item @b{@minus{}@minus{}output=@var{filename}, @minus{}o @var{filename}}
-@c Regular output file name. To regular output the program sends segments
-@c which it successfully processed and copies those which were not
-@c subject to processing. If this option is absent or equal to
-@c '@minus{}', standard output is used.
-@c @end macro
-
-@c @macro parfail
-@c @item @b{@minus{}@minus{}fail=@var{filename}, @minus{}e @var{filename}}
-@c Fail output file name. To fail output the program copies the segments
-@c it failed to process.  If this option is absent or equal to
-@c '@minus{}', standard output is used.
-@c @end macro
-
-
-@c @macro parcopy
-@c @item @b{@minus{}@minus{}copy, @minus{}c}
-@c Copy succesfully processed segments to regular output also in their
-@c original input form.
-@c @end macro
-
-
-@macro parinputfield
-@item @b{@minus{}@minus{}input-field=@var{fieldname}, @minus{}I @var{fieldname}}
-The field containing the input to the program. The default is the
-@var{form} field. The fields @var{start}, @var{length}, @var{type},
-and @var{form} are referred to as @code{1}, @code{2}, @code{3},
-@code{4}, respectively.
-@end macro
-
-
-@macro paroutputfield
-@item @b{@minus{}@minus{}output-field=@var{fieldname}, @minus{}O @var{fieldname}}
-The name of the field added by the program. The default is the name of the program.
-@end macro
-
-
-@macro pardictionary
-@item @b{@minus{}@minus{}dictionary=@var{filename}, @minus{}d @var{filename}}
-Dictionary file name.
-@end macro
-
-
-@macro parprocess
-@item @b{@minus{}@minus{}process=@var{type}, @minus{}p @var{type}}
-Process segments with the specified value in the @var{type} field.
-Multiple occurrences of this option are allowed and are interpreted as
-a disjunction. If this option is absent, all segments are processed.
-@end macro
-
-
-@macro parselect
-@item @b{@minus{}@minus{}select=@var{fieldname}, @minus{}s @var{fieldname}}
-Select for processing only segments in which the field named
-@var{fieldname} is present. Multiple occurrences of this option are
-allowed and are interpreted as a conjunction of conditions. If this
-option is absent, all segments are processed.
-@end macro
-
-
-@macro parunselect
-@item @b{@minus{}@minus{}unselect=@var{fieldname}, @minus{}S @var{fieldname}}
-Select for processing only segments in which the field @var{fieldname}
-is absent.  Multiple occurrences of this option are allowed and are
-interpreted as a conjunction of conditions. If this option is absent,
-all segments are processed.
-@end macro
-
-
-@macro paroneline
-@item @b{@minus{}@minus{}one-line}
-This option makes the program print ambiguous annotation in one output
-line by generating multiple annotation fields. By default, when
-ambiguous annotation may be produced for a segment, the segment is
-duplicated and each of the annotations is added to a separate copy of
-the segment.
-@end macro
-
-
-@macro paronefield
-@item @b{@minus{}@minus{}one-field, @minus{}1}
-This option makes the program print ambiguous annotation in one
-annotation field. By default, when ambiguous annotation may be produced
-for a segment, the segment is duplicated and each of the
-annotations is added to a separate copy of the segment.
-
-This option is useful when working with @command{kot} or @command{con}.
-@end macro
-
-
-@c ---------------------------------------------------------------------
-@c CONFIGURATION FILES
-@c ---------------------------------------------------------------------
-
-@node    Configuration files
-@chapter Configuration files
-
-Values for all command line options accepted by a component
-may be set in configuration files. The default locations of the
-configuration files for a component named @command{@var{program}} are
-
-@example
-	@file{/usr/local/etc/utt/@var{program}.conf}
-@end example
-
-for the system-wide configuration file and
-
-@example
-	@file{~/.utt/@var{program}.conf}
-@end example
-
-for the user configuration file.
-
-@c The configuration file to load may be also specified with the
-@c @option{--config} option. Configuration file need not be provided.
-
-For each option, the value is set according to the following priority (highest first):
-
-@itemize
-@item command line
-@c @item configuration file indicated with @option{--config} option
-@item user configuration file (or configuration file indicated with the @option{--config} option)
-@item system-wide configuration file
-@end itemize
-
-Parameter values are specified in the following format:
-
-@var{parametername}=@var{value}
-
-where @var{parametername} is the short or long name of an option accepted by
-the program, or
-
-@var{parametername}
-
-if the option does not need arguments.
-
-You can introduce comments to configuration files using the # sign.
-
-If a program accepts multiple occurrences of an option (e.g. the @option{--select} option of @command{lem}), each occurrence can be given on a separate line of the program's configuration file.
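-
-The format and the priority order described above can be illustrated
-with a short Python sketch. It is not how UTT itself reads its
-configuration: the names @code{parse_config} and
-@code{effective_options} are hypothetical, and two behaviours are
-assumptions, namely that @code{#} starts a comment anywhere on a line
-and that a higher-priority source replaces all values of an option
-given by a lower-priority one.
-
-@example
-def parse_config(text):
-    """Return (name, value) pairs; value is None for argument-less options."""
-    options = []
-    for raw in text.splitlines():
-        # Assumption: '#' never occurs inside a value.
-        line = raw.split("#", 1)[0].strip()
-        if not line:
-            continue
-        name, sep, value = line.partition("=")
-        options.append((name.strip(), value.strip() if sep else None))
-    return options
-
-def effective_options(system, user, command_line):
-    """Merge option lists given from lowest to highest priority."""
-    merged = dict()
-    for source in (system, user, command_line):
-        for name in set(n for n, v in source):
-            # Repeated options are kept as a list, in file order.
-            merged[name] = [v for n, v in source if n == name]
-    return merged
-@end example
-
-In this sketch a @code{dictionary} value given on the command line
-replaces the ones read from either configuration file, while two
-@code{select} lines in the user file are both kept.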
-
-@c The equal sign may be omitted.
-
-
-@quotation Tip
-If you have two (or more) frequently used sets of options for the same
-program (e.g. lem with the PMDBF dictionary and lem with a user dictionary),
-a good solution is to create two soft links to lem, called
-e.g. lemg and lemu, and specify their configuration in the files lemg.conf
-and lemu.conf respectively.
-@end quotation
-
-@c ---------------------------------------------------------------------
-@c COMPONENTS
-@c ---------------------------------------------------------------------
-
-@node UTT components
-@chapter UTT components
-
-UTT components are of three types:
-
-@menu
-Sources: programs which read non-UTT data (e.g. raw text) and produce output
-in UTT format
-* tok::         a tokenizer
-
-Filters: programs which read and produce UTT-formatted data
-* lem::         a morphological analyzer
-* gue::         a morphological guesser
-* cor::         a simple spelling corrector
-* kor::         a more elaborate spelling corrector
-* sen::         a sentence splitter
-* ser::         a pattern search tool (marks matches)
-* mar::         a pattern search tool (introduces arbitrary markers into the text)
-* grp::         a pattern search tool (selects sentences containing a match)
-@c * gph::         a word-graph annotation tool::
-@c * dgp::         a dependency parser
-
-Sinks: programs which read UTT data and produce output in another format
-* kot::         an untokenizer
-* con::         a concordance table generator
-@end menu
-
-@c ---------------------------------------------------------------------
-@c TOK
-@c ---------------------------------------------------------------------
-
-@page
-@node tok
-@section tok - a tokenizer
-
-@c ----------------------------------------
-
-@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @strong{Authors:}                 @tab Tomasz Obrębski
-@item @strong{Component category:}      @tab source
-@item @strong{Input format:}            @tab raw text file
-@item @strong{Output format:}           @tab UTT regular
-@item @strong{Required annotation:}     @tab -
-@end multitable
-
-
-@menu
-* tok description::
-* tok input::
-* tok output::
-* tok command line options::
-* tok example::
-@end menu
-
-@node tok description
-@subsection Description
-
-@code{tok} is a simple program which reads a text file and identifies
-tokens on the basis of their orthographic form.  The type of the token
-is printed as the @var{type} field.
-
-@node tok input
-@subsection Input
-
-Raw text.
-
-@node tok output
-@subsection Output
-
-UTT-file with four fields: @var{start}, @var{length}, @var{type}, and @var{form}. In the @var{type} field five types of tokens are distinguished: 
-
-@itemize
-
-@item @code{W}
-(word)
-- continuous sequence of letters
-
-@item @code{N}
-(number)
-- continuous sequence of digits
-
-@item @code{S}
-(space)
-- continuous sequence of space characters
-
-@item @code{P}
-(punctuation mark)
-- single printable character not belonging to any of the other classes
-
-@item @code{B}
-(unprintable character)
-- single unprintable character
-
-@end itemize
-
-
-
-@node tok command line options
-@subsection Command line options
-
-@table @code
-
-@item @b{@minus{}@minus{}help}, @b{@minus{}h}
-Print help.
-
-@item @b{@minus{}@minus{}version}, @b{@minus{}V}
-Print version information.
-
-@item @b{@minus{}@minus{}interactive, @minus{}i}
-This option toggles interactive mode, which is by default off. In the
-interactive mode the program does not buffer the output.
-
-@end table
-
-@node tok example
-@subsection Example
-
-Input:
-
-@example
-Piszemy dobre programy.
-@end example
-
-Output:
-
-@example
-0000 07 W Piszemy
-0007 01 S _
-0008 05 W dobre
-0013 01 S _
-0014 08 W programy
-0022 01 P .
-0023 01 S \n
-@end example
-
-
-@c ---------------------------------------------------------------------
-@c SEN
-@c ---------------------------------------------------------------------
-
-@c @node sen - sentencizer
-@c @chapter sen - sentencizer
-
-@c Authors: Tomasz Obrębski
-
-@c ---------------------------------------------------------------------
-@c LEM
-@c ---------------------------------------------------------------------
-
-@page
-@node lem
-@section lem - morphological analyzer
-
-@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @strong{Authors:}                 @tab Tomasz Obrębski, Michał Stolarski
-@item @strong{Component category:}      @tab filter
-@item @strong{Input format:}            @tab UTT regular
-@item @strong{Output format:}           @tab UTT regular
-@item @strong{Required annotation:}     @tab tok
-@end multitable
-
-@menu
-* lem description::             
-* lem command line options::    
-* lem input::
-* lem output::
-* lem example::                 
-* lem dictionaries::            
-* lem hints::            
-@end menu
-
-@node lem description
-@subsection Description
-
-@command{lem} performs morphological analysis of a simple orthographic
-word, returning all its possible morphological annotations,
-disregarding the context.
-
-@c ----------------------------------------
-
-@node lem command line options
-@subsection Command line options
-
-@table @code
-@parhelp
-@parversion
-@parinteractive
-@c @parfile
-@c @paroutput
-@c @parfail
-@c @parcopy
-@parinputfield
-@paroutputfield
-@pardictionary
-@parprocess
-@parselect
-@parunselect
-@paroneline
-@paronefield
-@end table
-
-@c ----------------------------------------
-
-@node lem input
-@subsection Input
-
-@command{lem} reads a UTT file and processes the value of the @var{form} field
-(the input field may be changed with the @option{--input-field} option).
-
-@node lem output
-@subsection Output
-
-@command{lem} adds a new annotation field, whose default name is @code{lem}.  In
-case of ambiguity, either the segment is duplicated (default),
-multiple @code{lem} fields are added (@option{--one-line}), or the ambiguous
-annotation is produced as the value of a single @code{lem} field
-(@option{--one-field}, @option{-1}):
-
-@itemize @bullet
-
-@item
-unambiguous value format:
-
-@example
-   <lemma>,<descr>
-@end example
-
-@item
-ambiguous value format (@option{--one-field} option)
-
-
-@example
-   <lemma>,<descr>[,<descr>][;<lemma>,<descr>[,<descr>]]
-@end example
-
-(alternative descriptions for the same lemma are separated by commas,
-alternative lemmata are separated by semicolons.)
-
-@end itemize
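-
-A value in the @option{--one-field} format can be taken apart again
-with a few lines of Python. The sketch below is illustrative only: the
-name @code{parse_one_field} is hypothetical, and it assumes that
-neither lemmas nor descriptions contain commas or semicolons.
-
-@example
-def parse_one_field(value):
-    """Return a list of (lemma, [descr, ...]) pairs."""
-    readings = []
-    for alternative in value.split(";"):     # alternative lemmata
-        parts = alternative.split(",")       # lemma, then its descriptions
-        readings.append((parts[0], parts[1:]))
-    return readings
-@end example
-
-Applied to @code{program,N/GiNpCa,N/GiNpCn,N/GiNpCv}, the sketch
-returns one lemma, @code{program}, with three descriptions.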
-
-@node lem example
-@subsection Example
-
-Input: 
-
-@example
-0000 07 W Piszemy
-0007 01 S _
-0008 05 W dobre
-0013 01 S _
-0014 08 W programy
-0022 01 P .
-0023 01 B \n
-@end example
-
-Output (default):
-
-@example
-0000 07 W Piszemy lem:pisać,V/AiVpMdTrfNpP1
-0007 01 S _
-0008 05 W dobre lem:dobry,ADJ/DpNpCnavGaifn
-0008 05 W dobre lem:dobry,ADJ/DpNsCnavGn
-0013 01 S _
-0014 08 W programy lem:program,N/GiNpCa
-0014 08 W programy lem:program,N/GiNpCn
-0014 08 W programy lem:program,N/GiNpCv
-0022 01 P .
-0023 01 B \n
-@end example
-
-Output (@option{--one-line} option):
-
-@example
-0000 07 W Piszemy lem:pisać,V/AiVpMdTrfNpP1
-0007 01 S _
-0008 05 W dobre lem:dobry,ADJ/DpNpCnavGaifn lem:dobry,ADJ/DpNsCnavGn
-0013 01 S _
-0014 08 W programy lem:program,N/GiNpCa lem:program,N/GiNpCn lem:program,N/GiNpCv
-0022 01 P .
-0023 01 S \n
-@end example
-
-Output (@option{--one-field} option):
-
-@example
-0000 07 W Piszemy lem:pisać,V/AiVpMdTrfNpP1
-0007 01 S _
-0008 05 W dobre lem:dobry,ADJ/DpNpCnavGaifn,ADJ/DpNsCnavGn
-0013 01 S _
-0014 08 W programy lem:program,N/GiNpCa,N/GiNpCn,N/GiNpCv
-0022 01 P .
-0023 01 S \n
-@end example
-
-@c ----------------------------------------
-
-@node lem dictionaries
-@subsection Dictionaries
-
-@command{lem} requires a dictionary. The dictionary may be provided in
-one of two formats: in text (source) format or in binary (fsa) format.
-
-@subsubheading Text format
-
-Dictionary entries have the following structure:
-
-@example
-<form>;<lemma>,<descr>[;<lemma>,<descr>]
-@end example
-
-@var{lemma} may be given explicitly or in the cut-add format:
-
-@example
-@code{[<cut1><add1>-]<cut2><add2>}
-@end example
-
-meaning: replace the prefix of length @code{<cut1>} with the
-string @code{<add1>}, then replace the suffix of length @code{<cut2>} with the string
-@code{<add2>}. For example, @code{3t} transforms @samp{kocie} into
-@samp{kot}, and @code{3-4ały} transforms @samp{najbielsi} into @samp{biały}.
-
-Each dictionary entry must be written in one line and must not contain blank characters.
-
-Examples:
-@example
-kot;0,N/GaNsCn
-kota;1,N/GaNsCg;1,N/GaNsCa
-kotu;1,N/GaNsCd
-kotem;2,N/GaNsCi
-kocie;3t,N/GaNsCl;3t,N/GaNsCv
-najbielsi;3-4ały,ADJ/DsNpCnGp
-najbielsze;3-5ały,ADJ/DsNpCnGaifn
-najlepsi;dobry,ADJ/DsNpCnGp
-najlepsze;dobry,ADJ/DsNpCnGaifn
-@end example
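-
-The cut-add notation can be made concrete with a small Python sketch.
-It is not taken from the @command{lem} sources: the name
-@code{apply_cut_add} is hypothetical, and treating every code that does
-not start with a digit as an explicit lemma is an assumption.
-
-@example
-import re
-
-def apply_cut_add(form, code):
-    """Return the lemma of form; code is a lemma or a cut-add string."""
-    match = re.fullmatch(r"(?:(\d+)([^-\d]*)-)?(\d+)(.*)", code)
-    if match is None:
-        return code                          # explicit lemma
-    cut1, add1, cut2, add2 = match.groups()
-    stem = form
-    if cut1 is not None:
-        stem = add1 + stem[int(cut1):]       # replace the prefix
-    stem = stem[:len(stem) - int(cut2)]      # drop the suffix
-    return stem + add2                       # append the new suffix
-@end example
-
-With the entries above, @code{apply_cut_add("kocie", "3t")} gives
-@code{kot} and @code{apply_cut_add("najbielsi", "3-4ały")} gives
-@code{biały}.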
-
-
-The mandatory file name extension for a text dictionary is @code{dic}. For large
-dictionaries it is preferable, however, to compile them into binary
-(fsa) format.
-
-@subsubheading Binary format
-
-The mandatory file name extension for a binary dictionary is @code{bin}. To
-compile a text dictionary into binary format, write:
-
-@example
-compiledic <dictionaryname>.dic
-@end example
-
-@subsubheading Polex/PMDBF dictionary
-
-A large-coverage morphological dictionary for the Polish language, Polex/PMDBF, is included in
-the distribution as the default dictionary of @command{lem}. By default it is
-located in:
-
-@file{$HOME/.local/share/utt/pl_PL.ISO-8859-2/lem.bin}
-
-in local installation or in
-
-@file{/usr/local/share/utt/pl_PL.ISO-8859-2/lem.bin}
-
-in system installation.
-
-@node lem hints
-@subsection Hints
-
-@subsubheading Combining data from multiple dictionaries
-
-@itemize
-
-@item Apply <dict1>, then apply <dict2> to words which were not annotated.
-
-@example
-lem -d <dict1> | lem -S lem -d <dict2>
-@end example
-
-@item Add annotations from two dictionaries <dict1> and <dict2>.
-
-@example
-lem -c -d <dict1> | lem -S lem -d <dict2>
-@end example
-
-@end itemize
-
-
-@c ---------------------------------------------------------------------
-@c GUE
-@c ---------------------------------------------------------------------
-
-@page
-@node gue
-@section gue - morphological guesser
-
-@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-
-@item @strong{Authors:}                 @tab Michał Stolarski, Tomasz Obrębski
-@item @strong{Component category:}      @tab filter
-
-@end multitable
-
-@menu
-* gue description::    
-* gue command line options::    
-* gue example::                 
-* gue dictionaries::            
-@end menu
-
-
-@node gue description
-@subsection Description
-
-@command{gue} guesses morphological descriptions of the form contained
-in the @var{form} field.
-
-
-@node gue command line options
-@subsection Command line options
-
-@table @code
-
-@parhelp
-@parversion
-@parinteractive
-@c @parfile
-@c @paroutput
-@c @parfail
-@c @parcopy
-@parinputfield
-@paroutputfield
-@pardictionary
-@parprocess
-@parselect
-@parunselect
-@paroneline
-@paronefield
-
-@item @b{@minus{}@minus{}delta=@var{n}}
-Stop displaying answers after a drop in weight, that is, when the weight difference between two subsequent results is greater than the delta value (default=`0.2').
-
-
-@item @b{@minus{}@minus{}cut-off=@var{n}}
-Do not display answers with a weight lower than the cut-off value (default=`200').
-
-
-@item @b{@minus{}@minus{}guess_count=@var{n}, @minus{}n @var{n}}
-Guess up to @var{n} descriptions (default=`0', which means `display all results').
-
-
-
-@end table
-
-@node gue example
-@subsection Example
-
-@example
-command: gue -n 2 
-
-input:
-0000 07 W smerfny 
-
-output:
-0000 07 W smerfny gue:,ADJ/CaDpGiNs
-0000 07 W smerfny gue:,ADJ/CnvDpGaipNs
-@end example
-                                  
-
-@node gue dictionaries
-@subsection Dictionaries
-
-@command{gue} requires a dictionary. For now, the dictionary must be provided in binary (fsa) format.
-The fsa format is created by compiling text-format dictionaries.
-
-
-
-@subsubheading Text format
-
-Dictionary entries have the following structure:
-
-@example
-@var{prefix}@code{*}@var{suffix}@code{;}@var{lemma}@code{,}@var{description}@code{:}@var{weight}
-@end example
-
-@var{lemma} must be given in the cut-add format:
-
-@example
-@code{[<cut1><add1>-]<cut2><add2>}
-@end example
-(no spaces in between): replace the prefix of length @var{cut1} with
-the string @var{add1}, and replace the suffix of length @var{cut2} with
-the string @var{add2}.
-
-
-Example: @code{3-4ały} transforms @i{najbielsi} into @i{biały}
-
-
-@var{description} contains the part of speech and morphosyntactic information (@pxref{PMDBF dictionary}).
-
-@var{weight} is an integer value between 1 and 999 indicating the
-likelihood of the guess.
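-
-The cut-add notation can be illustrated with a short Python sketch. It
-is not part of UTT: the function name @code{apply_cut_add} and the rule
-grammar (decimal cut lengths, add strings containing no digits or
-hyphens) are assumptions made for this illustration only.
-
```python
import re

# Assumed grammar: an optional "<cut1><add1>-" prefix rule followed by a
# mandatory "<cut2><add2>" suffix rule; cut lengths are decimal numbers.
CUT_ADD = re.compile(r"^(?:(\d+)([^-\d]*)-)?(\d+)(\D*)$")


def apply_cut_add(form, rule):
    """Turn a word form into its lemma using a cut-add rule (hypothetical helper)."""
    match = CUT_ADD.match(rule)
    if match is None:
        raise ValueError("not a cut-add rule: " + rule)
    cut1, add1, cut2, add2 = match.groups()
    if cut1 is not None:
        # Replace the first cut1 characters with add1.
        form = add1 + form[int(cut1):]
    # Replace the last cut2 characters with add2.
    keep = max(len(form) - int(cut2), 0)
    return form[:keep] + add2


print(apply_cut_add("najbielsi", "3-4ały"))  # biały
print(apply_cut_add("całkę", "1a"))          # całka
```
-
-When the prefix part is omitted, as in @code{1a}, only the end of the
-form is rewritten.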
-
-@c @example
-@c *łkę;1a,N/GfNsCa
-@c naj*elszy;3-4ały,ADJ/...:...
-@c @end example
-
-
-@c ---------------------------------------------------------------------
-@c COR
-@c ---------------------------------------------------------------------
-
-@page
-@node cor
-@section cor - spelling corrector
-
-@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @strong{Authors:}                 @tab Tomasz Obrębski, Michał Stolarski
-@item @strong{Component category:}      @tab filter
-@item @strong{Input format:}            @tab UTT regular
-@item @strong{Output format:}           @tab UTT regular
-@item @strong{Required annotation:}     @tab tok
-@end multitable
-
-@menu
-* cor description::
-* cor command line options::    
-* cor dictionaries::            
-@end menu
-
-
-@node cor description
-@subsection Description
-
-The spelling corrector applies Kemal Oflazer's dynamic programming
-algorithm @cite{oflazer96} to the FSA representation of the set of
-word forms of the Polex/PMDBF dictionary. Given an incorrect
-word form it returns all word forms present in the dictionary whose
-edit distance is smaller than the threshold given as the parameter.
-
-
-@node cor command line options
-@subsection Command line options
-
-@table @code
-
-@parhelp
-@parversion
-@parinteractive
-@c @parfile
-@c @paroutput
-@c @parfail
-@c @parcopy
-@parinputfield
-@paroutputfield
-@pardictionary
-@parprocess
-@parselect
-@parunselect
-@paroneline
-@paronefield
-
-@item @b{@minus{}@minus{}distance=@var{int}, @minus{}n @var{int}}
-Maximum edit distance (default='1').
-
-@c @item @b{@minus{}@minus{}replace, @minus{}r}
-@c Replace original form with corrected form, place original form in the
-@c cor field. This option has no effect in @option{--one-*} modes (default=off)
-
-
-@end table
-
-@node cor dictionaries
-@subsection Dictionaries
-
-@command{cor} requires a dictionary. The dictionary has to be provided in binary (fsa) format. 
-The fsa format is created by compiling text-format dictionaries.
-
-@subsubheading Text format
-
-The @command{cor} dictionary is a list of words:
-@example
-odlot
-odlotowy
-odludek
-@end example
-
-@subsubheading Binary format
-
-The mandatory file name extension for a binary dictionary is @code{bin}. To
-compile a text dictionary into binary format, write:
-
-@example
-compiledic <dictionaryname>.dic
-@end example
-
-@c ---------------------------------------------------------------------
-@c KOR
-@c ---------------------------------------------------------------------
-
-@page
-@node kor
-@section kor - configurable spelling corrector
-
-@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @strong{Authors:}                 @tab Paweł Werenski, Tomasz Obrębski, Michał Stolarski
-@item @strong{Component category:}      @tab filter
-@item @strong{Input format:}            @tab UTT regular
-@item @strong{Output format:}           @tab UTT regular
-@item @strong{Required annotation:}     @tab tok
-@end multitable
-
-@menu
-* kor description::
-* kor command line options::
-* kor weights definition file::    
-* kor dictionaries::            
-@end menu
-
-
-@node kor description
-@subsection Description
-
-The spelling corrector applies Paweł Werenski's dynamic programming
-algorithm to the FSA representation of the set of word forms of the
-Polex/PMDBF dictionary. The algorithm is an extension of K. Oflazer's
-algorithm used by @command{cor}. In the extended version it is
-possible to assign weights to individual edit operations.
-
-Given an incorrect word form, it returns all word forms
-present in the dictionary whose edit distance is smaller than the
-threshold given as the parameter.
-
-
-@node kor command line options
-@subsection Command line options
-
-@table @code
-
-@parhelp
-@parversion
-@parinteractive
-@c @parfile
-@c @paroutput
-@c @parfail
-@c @parcopy
-@parinputfield
-@paroutputfield
-@pardictionary
-@parprocess
-@parselect
-@parunselect
-@paroneline
-@paronefield
-
-@item @b{@minus{}@minus{}distance=@var{int}, @minus{}n @var{int}}
-Maximum edit distance (default='1').
-
-@item @b{@minus{}@minus{}weights=@var{filename}, @minus{}w @var{filename}}
-File defining the weights of edit operations (@pxref{kor weights definition file}).
-
-@c @item @b{@minus{}@minus{}replace, @minus{}r}
-@c Replace original form with corrected form, place original form in the
-@c cor field. This option has no effect in @option{--one-*} modes (default=off)
-
-
-@end table
-
-
-@node kor weights definition file
-@subsection Weights definition file
-
-Example:
-
-@example
-
-%stdcor 1
-%xchg   1
-ż  rz 0.5
-ch h  0.5
-u  ó  0.5
-
-@end example
-
-
-In this example the default weight of an edit operation is set to 1
-(@code{%stdcor 1}), the weight of the exchange operation is set to 1
-(@code{%xchg 1}), and the three principal orthographic errors are
-assigned the weight 0.5.
-
-An edit operation weight declaration, such as
-
-@example
-ż  rz 0.5
-@end example
-
-applies in both directions, i.e. ż->rz and rz->ż.
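-
-How such weights influence the edit distance can be sketched in Python.
-The code below is only an illustration under stated assumptions: it
-compares two plain strings instead of searching an FSA, it treats the
-exchange operation as a swap of two adjacent characters, and the names
-@code{parse_weights} and @code{weighted_distance} are hypothetical. It
-is not the algorithm implemented in @command{kor}.
-
```python
SAMPLE = """
%stdcor 1
%xchg   1
ż  rz 0.5
ch h  0.5
u  ó  0.5
"""


def parse_weights(text):
    """Read a weights definition into (stdcor, xchg, pairs)."""
    stdcor, xchg, pairs = 1.0, 1.0, []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "%stdcor":
            stdcor = float(fields[1])
        elif fields[0] == "%xchg":
            xchg = float(fields[1])
        else:
            left, right, weight = fields
            # A declaration applies in both directions.
            pairs.append((left, right, float(weight)))
            pairs.append((right, left, float(weight)))
    return stdcor, xchg, pairs


def weighted_distance(source, target, stdcor, xchg, pairs):
    """Cheapest cost of rewriting source into target (dynamic programming)."""
    rows, cols = len(source) + 1, len(target) + 1
    cost = [[float("inf")] * cols for _ in range(rows)]
    cost[0][0] = 0.0
    for i in range(rows):
        for j in range(cols):
            best = cost[i][j]
            if i > 0:  # delete source[i-1]
                best = min(best, cost[i - 1][j] + stdcor)
            if j > 0:  # insert target[j-1]
                best = min(best, cost[i][j - 1] + stdcor)
            if i > 0 and j > 0:  # copy or replace a single character
                same = source[i - 1] == target[j - 1]
                best = min(best, cost[i - 1][j - 1] + (0.0 if same else stdcor))
            if (i > 1 and j > 1 and source[i - 1] == target[j - 2]
                    and source[i - 2] == target[j - 1]):
                # Assumed meaning of %xchg: swap two adjacent characters.
                best = min(best, cost[i - 2][j - 2] + xchg)
            for left, right, weight in pairs:
                a, b = i - len(left), j - len(right)
                if a >= 0 and b >= 0 and source[a:i] == left and target[b:j] == right:
                    best = min(best, cost[a][b] + weight)
            cost[i][j] = best
    return cost[-1][-1]


stdcor, xchg, pairs = parse_weights(SAMPLE)
print(weighted_distance("morze", "może", stdcor, xchg, pairs))  # 0.5
print(weighted_distance("kto", "kot", stdcor, xchg, pairs))     # 1.0
```
-
-With these weights a typical orthographic error costs less than an
-arbitrary replacement of a letter, so such corrections stay within a
-small distance threshold.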
-
-The default weights definition file for @code{kor} is:
-
-@example
-$HOME/.local/share/utt/weights.kor
-@end example
-
-or, if the above mentioned file is absent:
-
-@example
-/usr/local/share/utt/weights.kor
-@end example
-
-
-@node kor dictionaries
-@subsection Dictionaries
-
-See the description of @command{cor} dictionaries (@pxref{cor dictionaries}).
-
-@c ---------------------------------------------------------------------
-@c SEN
-@c ---------------------------------------------------------------------
-
-@page
-@node sen
-@section sen - a sentencizer
-
-@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-
-@item @strong{Authors:}                 @tab Tomasz Obrębski
-@item @strong{Component category:}      @tab filter
-@item @strong{Input format:}            @tab UTT regular
-@item @strong{Output format:}           @tab UTT regular
-@item @strong{Required annotation:}     @tab tok
-
-@end multitable
-
-
-@menu
-* sen description::
-@c * sen input::
-@c * sen output::
-* sen example::                 
-@end menu
-
-@node sen description
-@subsection Description
-
-@command{sen} detects sentence boundaries in UTT-formatted texts and marks them with special zero-length segments, in which the @var{type} field may contain the BOS (beginning of sentence) or EOS (end of sentence) annotation. 
-
-@node sen example
-@subsection Example
-
-@example
-command: sen
-
-input:
-0000 05 W Cześć
-0005 01 P !
-0006 01 S _
-0007 02 W To
-0009 01 S _
-0010 02 W ja
-0012 01 P .
-0013 01 S \n
-
-output:
-0000 00 BOS *
-0000 05 W Cześć
-0005 01 P !
-0006 00 EOS *
-0006 00 BOS *
-0006 01 S _
-0007 02 W To
-0009 01 S _
-0010 02 W ja
-0012 01 P .
-0013 01 S \n
-0014 00 EOS *
-@end example
-
-
-@c ---------------------------------------------------------------------
-@c GPH
-@c ---------------------------------------------------------------------
-
-@c @node gph - graphizer
-@c @chapter gph - graphizer
-
-@c Authors: Tomasz Obrębski
-
-
-
-@c ---------------------------------------------------------------------
-@c SER
-@c ---------------------------------------------------------------------
-
-@page
-@node ser
-@section ser - pattern search tool
-
-@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @strong{Authors:}                 @tab Tomasz Obrębski
-@item @strong{Component category:}      @tab filter
-@item @strong{Input format:}            @tab UTT regular
-@item @strong{Output format:}           @tab UTT regular
-@item @strong{Required annotation:}     @tab tok,  lem --one-field
-@end multitable
-
-@menu
-* ser description::
-* ser command line options::    
-* ser pattern::                 
-* ser how ser works::           
-* ser customization::           
-* ser limitations::             
-* ser requirements::            
-@end menu
-
-
-@node ser description
-@subsection Description
-
-@command{ser} looks for patterns in UTT-formatted texts.
-
-
-@c ---------------------------------------------------------------------
-@node ser command line options
-@subsection Command line options
-
-@table @code
-
-@parhelp
-@parversion
-@c @parfile
-@c @paroutput
-@c @parinputfield
-@c @paroutputfield
-@parprocess
-@parinteractive
-
-@item @b{@minus{}@minus{}pattern=@var{pattern}, @minus{}e @var{pattern}}
-The search pattern.
-
-@item @b{@minus{}@minus{}morph=@var{field}}
-The name of the annotation field containing the morphological
-description (default @code{lem}).
-
-@item @b{@minus{}@minus{}flex}
-Only print the generated flex source code.
-
-@item @b{@minus{}@minus{}macro=@var{filename}}
-Read macro definitions from file @var{filename} rather than from the
-default location. This option makes it possible to redefine the set of terms.
-
-@item @b{@minus{}@minus{}define=@var{filename}}
-Append macro definitions from file @var{filename}. This option
-makes it possible to extend the set of terms.
-
-@end table
-
-
-@c ---------------------------------------------------------------------
-@node ser pattern
-@subsection Pattern
-
-The @command{ser} pattern is a regular expression over terms corresponding
-to text segments or segment sequences. Predefined terms are:
-
-@table @code
-
-@item seg(@var{t},@var{f},@var{a})
-a segment of type @var{t}, containing form @var{f} and annotation
-@var{a}
-
-@item form(@var{f})
-a segment containing form @var{f}
-
-@item field(@var{f})
-a segment containing annotation field @var{f}
-
-@item space(@var{f})
-a space segment of form @var{f}
-
-@item word(@var{f})
-a word segment of form @var{f}
-
-@item punct(@var{f})
-a punct segment of form @var{f}
-
-@item number(@var{f})
-a number segment of form @var{f}
-
-@item lexeme(@var{f})
-a word segment with lemma @var{f}
-
-@item cat(@var{c})
-a word segment of category @var{c}
-
-@end table
-
-All arguments are optional. If an argument is omitted, an arbitrary
-string of non-blank characters is assumed as the argument value. Term
-arguments may be arbitrary character-level regular expressions. The
-following special symbols can be used:
-
-@multitable {aaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @code{[@dots{}]}            @tab a character class
-@item @code{[^@dots{}]}           @tab a negated character class
-@item @code{|}                    @tab alternative
-@item @code{*}                    @tab repetition, including zero times
-@item @code{+}                    @tab repetition, at least one time
-@item @code{?}                    @tab optionality
-@item @code{@{@var{m},@var{n}@}}  @tab repetition from @var{m} to @var{n} times
-@item @code{@{@var{m},@}}         @tab repetition @var{m} or more times
-@item @code{@{@var{m}@}}          @tab repetition @var{m} times
-@item @code{\@var{ddd}}           @tab the character with octal value @var{ddd}
-@item @code{\x@var{hh}}           @tab the character with hexadecimal value @var{hh}
-@item @code{( )}                  @tab parentheses, used to override precedence
-@c @end multitable
-
-@c @multitable {aaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @code{.}    @tab a non-blank character
-@item @code{\w}   @tab a letter
-@item @code{\W}   @tab a non-blank character other than a letter
-@item @code{\d}   @tab a digit
-@item @code{\D}   @tab a non-blank character other than a digit
-@item @code{\s}   @tab a space or tab character
-@item @code{\S}   @tab a non-blank character (the same as @code{.})
-@item @code{\l}   @tab a lowercase letter
-@item @code{\L}   @tab an uppercase letter
-@end multitable
-
-
-@noindent The following characters:
-@example
-@verb{%  [   ]   ^   |   *   +   ?   {   }   ,   .   <   >   \ %}
-@end example
-must be escaped with a backslash, i.e. written as:
-@example
-@verb{% \[  \]  \^  \|  \*  \+  \?  \{  \}  \,  \.  \<  \>  \\ %}
-@end example
-
-@quotation Note
-The special symbols are borrowed from Perl with minor modifications
-introduced for convenience, so the meaning of certain special
-characters and sequences differs slightly from their usual meaning.
-In particular, the meaning of the @code{.} special character is
-modified because of the special function of spaces in UTT files (they
-are field separators). Use @code{\s} to explicitly match a space or
-tab character.
-@end quotation
-
-In the argument of the @code{cat} term a special operator <...> may be
-used. A category specification enclosed in angle brackets matches all
-category descriptions which are consistent (non-contradictory) with the
-specification. For example, @code{<N>} matches all noun descriptions, and
-@code{<ADJ/Can>} matches all adjectives in the accusative or nominative case.
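-
-One possible reading of ``consistent'' can be sketched in Python. The
-sketch assumes that a description consists of a part of speech, a slash
-and a sequence of uppercase attribute letters, each followed by its
-lowercase values (as in @code{ADJ/CnvDpGaipNs}), and that an attribute
-missing from the description never contradicts the specification. The
-function names are hypothetical and the code is not taken from
-@command{ser}.
-
```python
import re


def parse_category(text):
    """Split e.g. 'ADJ/CaDpGiNs' into a part of speech and attribute value sets."""
    pos, _, attributes = text.partition("/")
    values = {}
    for name, chars in re.findall(r"([A-Z])([a-z0-9]*)", attributes):
        values[name] = set(chars)
    return pos, values


def is_consistent(spec, description):
    """True if the description does not contradict the specification."""
    spec_pos, spec_attrs = parse_category(spec)
    desc_pos, desc_attrs = parse_category(description)
    if spec_pos != desc_pos:
        return False
    for name, allowed in spec_attrs.items():
        # A shared attribute must have at least one value in common.
        if name in desc_attrs and not (allowed & desc_attrs[name]):
            return False
    return True


print(is_consistent("ADJ/Can", "ADJ/CnvDpGaipNs"))  # True
print(is_consistent("ADJ/Can", "ADJ/CgDpGiNs"))     # False
```
-
-Under this reading @code{<N>} accepts every noun description, because
-it places no constraint on any attribute.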
-
-
-@*
-@noindent @b{Examples of one-segment patterns:}
-
-@multitable {aaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @code{seg}            @tab any segment
-@item @code{word}           @tab any word-form
-@item @code{word(pomocy)}   @tab the word-form @samp{pomocy}
-@item @code{word(naj.+)}    @tab a word-form beginning with @samp{naj}
-@item @code{word(\L\l+)}    @tab a capitalized word-form
-@item @code{punct}          @tab a punctuation character
-@item @code{space(.*\\n.*)} @tab a space segment containing a newline character
-@item @code{lexeme(pomoc)}  @tab any form of the lexeme 'pomoc'
-@item @code{cat(N/.*)}      @tab a word whose category starts with @code{N/}
-@item @code{cat(<N/Ca>)}    @tab a word whose category matches @code{N/Ca}
-@end multitable
-
-@*
-@noindent @b{Examples of multi-segment patterns:}
-
-@table @code
-
-@item (word(\L) punct(\.) space?)+ word(\L\l+)
-a sequence of initials followed by a surname
-
-@item punct seg(W|S|N)* cat(<NPRO/Sr>) seg(W|S|N)* punct
-a text fragment between two punctuation characters, containing an
-occurrence of a relative pronoun
-
-@end table
-
-
-@node ser how ser works
-@subsection How ser works
-
-@node ser customization
-@subsection Customization
-
-@c All predefined terms correspond to single segments, 
-
-@example
-define(`verbseq', `(cat(<V>) (space cat(<V>)))')
-@end example
-
-
-the term @code{cat()} may not be used as a ... of 
-
-@c See @command{m4} manual for further details on macro definition format.
-
-@node ser limitations
-@subsection Limitations
-
-Do not use more than 3 attributes in <>.
-
-@node ser requirements
-@subsection Requirements
-
-In order to run @command{ser}, the following programs must be
-installed in the system:
-
-@itemize
-
-@item @command{m4}
-@item @command{grep}
-@item @command{flex}
-@item @command{gcc}
-
-@end itemize
-
-
-@c ---------------------------------------------------------------------
-@c GRP
-@c ---------------------------------------------------------------------
-
-@page
-@node grp
-@section grp - pattern search tool
-
-@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @strong{Authors:}                 @tab Tomasz Obrębski
-@item @strong{Component category:}      @tab filter
-@item @strong{Input format:}            @tab UTT flattened
-@item @strong{Output format:}           @tab UTT flattened
-@item @strong{Required annotation:}     @tab tok, sen, lem --one-field
-@end multitable
-
-
-@menu
-* grp description::
-* grp command line options::    
-* grp pattern::                 
-* grp hints::    
-@end menu
-
-
-@node grp description
-@subsection Description
-
-@command{grp} selects sentences containing an expression matching a
-pattern. The pattern format is exactly the same as that accepted by
-@command{ser}.
-
-@command{grp} is intended mainly for speeding up the corpus search process.
-It is extremely fast (processing speed is usually higher than the speed
-of reading the corpus file from disk).
-
-@node grp command line options
-@subsection Command line options
-
-@table @code
-
-@parhelp
-@parversion
-@parprocess
-@parinteractive
-
-@item @b{@minus{}@minus{}pattern=@var{pattern}, @minus{}e @var{pattern}}
-The search pattern.
-
-@item @b{@minus{}@minus{}morph=@var{field}}
-The name of the annotation field containing the morphological
-description (default @code{lem}).
-
-@item @b{@minus{}@minus{}command}
-Only print the generated flex source code.
-
-@item @b{@minus{}@minus{}macro=@var{filename}}
-Read macro definitions from file @var{filename} rather than from the
-default location. This option makes it possible to redefine the set of terms.
-
-@item @b{@minus{}@minus{}define=@var{filename}}
-Append macro definitions from file @var{filename}. This option
-makes it possible to extend the set of terms.
-
-@end table
-
-
-@node grp pattern
-@subsection Pattern
-
-@xref{ser pattern}.
-
-@node grp hints
-@subsection Hints
-
-The corpus search speed may be increased by combining @command{grp} with
-the @command{lzop} compression tool (@command{grp} usually processes data
-faster than it is read from a disk, especially for slow laptop drives).
-
-@example
-cat corpus | tok | sen | lem -1 | fla | lzop -7 > corpus.grp.lzo
-@end example
-
-@example
-lzop -cd corpus.grp.lzo | grp -e @var{EXPR} | unfla | ser -e @var{EXPR}
-@end example
-
-
-
-@c ---------------------------------------------------------------------
-@c MAR
-@c ---------------------------------------------------------------------
-
-@page
-@node mar
-@section mar
-
-@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @strong{Authors:}                 @tab Marcin Walas, Tomasz Obrębski
-@item @strong{Input format:}            @tab UTT flattened
-@item @strong{Output format:}           @tab UTT flattened
-@item @strong{Required annotation:}     @tab tok, sen, lem -1
-@end multitable
-
-@subsection Description
-@code{mar} is a Perl script that matches a given pattern against UTT-formatted text
-and marks the matching parts with any number of user-defined tags.
-
-@subsection Command line options
-@table @code
-@parhelp
-@parversion
-
-@item @b{@minus{}@minus{}pattern=@var{pattern}, @minus{}e @var{pattern}}
-The search pattern.
-@item @b{@minus{}@minus{}action=@var{action}, @minus{}a @var{action} [p] [s] [P]}
-Perform only the indicated actions, where:
-@multitable {aaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @code{p}   @tab preprocess
-@item @code{s}   @tab search
-@item @code{P}   @tab postprocess
-@end multitable
-default: psP
-
-@item @b{@minus{}@minus{}command}
-print generated sed command, then exit
-
-@item @b{@minus{}@minus{}help, @minus{}h}
-print help, then exit
-
-@item @b{@minus{}@minus{}version, @minus{}v}
-print version, then exit
-@end table
-@subsection Tokens in pattern
-@code{mar} patterns are based on @code{ser} patterns (@pxref{ser pattern}). A @code{mar} pattern is a @code{ser} pattern
-to which any number of matching tags may be added; each tag is printed in exactly the place where
-it was put in the pattern. A valid token starts with @@ followed by any number of alphanumeric
-characters. For example, valid match tokens are: @@STARTMATCH @@ENDMATCH
-
-Matching tokens can be placed between, before or after any of the @code{ser} pattern terms. They do not have
-to be paired. There can be any number of them in the pattern (zero or more). They do not have to be unique.
-They can be placed one after another. For example:
-
-@multitable {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @code{@@BOM lexeme(pomoc)}  @tab place tag @b{BOM} before any form of the lexeme 'pomoc'
-@item @code{@@MATCH lexeme(pomoc) @@MATCH}      @tab place tag @b{MATCH} before and after any form of the lexeme 'pomoc'
-@item @code{cat(<ADJ>) @@MATCH lexeme(pomoc) @@MATCH}      @tab place tag @b{MATCH} before and after any form of the lexeme 'pomoc' which is preceded by an adjective
-@item @code{cat(<ADJ>) @@TAG @@BOM lexeme(pomoc) @@EOM}      @tab place tags @b{TAG} and @b{BOM} before any form of the lexeme 'pomoc' which is preceded by an adjective, and tag @b{EOM} after it
-@end multitable
-
-See the help of @code{mar} (@code{mar -h}) for more information.
-
-@subsection How mar works
-@code{mar} uses the @code{m4} macro processor to translate the given @code{ser} pattern into a regular expression. The expression is then turned into a @code{sed} command script, which is executed.
-
-You can see the translated sed script by using the @code{@minus{}@minus{}command} option.
-@subsection Limitations
-The complexity of computations performed by @code{mar} increases linearly with the number of placed tokens, so it is highly recommended not to place too many tokens.
-@subsection Requirements
-In order to run @code{mar}, the following programs must be installed in the system:
-
-@itemize
-
-@item @command{m4}
-@item @command{grep}
-@item @command{sed}
-
-@end itemize
-
-
-
-@c ---------------------------------------------------------------------
-@c KOT
-@c ---------------------------------------------------------------------
-
-@page
-@node kot
-@section kot - untokenizer
-
-@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @strong{Authors:}                 @tab Tomasz Obrębski
-@item @strong{Component category:}      @tab filter
-@item @strong{Input format:}            @tab UTT regular
-@item @strong{Output format:}           @tab text
-@item @strong{Required annotation:}     @tab tok
-@end multitable
-
-
-@menu
-* kot description::
-* kot command line options::    
-* kot usage examples::    
-@end menu
-
-@node kot description
-@subsection Description
-
-@command{kot} transforms a UTT-formatted file back into raw text format.
-
-@node kot command line options
-@subsection Command line options
-
-@table @code
-
-@parhelp
-
-@c @item @b{@minus{}@minus{}version}, @b{@minus{}v}
-
-@c @item @b{@minus{}@minus{}file=@var{filename}, @minus{}f @var{filename}}
-
-@c @item @b{@minus{}@minus{}output=@var{filename}, @minus{}o @var{filename}}
-
-@c @item @b{@minus{}@minus{}interactive @minus{}i}
-
-@c @item @b{@minus{}@minus{}config=@var{filename}}
-
-@item
-
-@item @b{@minus{}@minus{}gap-fill=@var{string}, @minus{}g @var{string}}
-print @var{string} between nonadjacent segments of the input file
-
-@item @b{@minus{}@minus{}spaces, @minus{}r}
-retain the special characters @code{_}, @code{\t},
-@code{\n}, @code{\r}, @code{\f} unexpanded in the output
-
-@end table
-
-@node kot usage examples
-@subsection Usage examples
-
-@example
-cat legia.txt | tok | kot	
-@end example
-
-@example
-cat legia.txt | tok | lem -1 | kot
-@end example
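-
-What the untokenization step amounts to can be sketched in Python. The
-sketch assumes the segment layout seen in the examples of this manual
-(start position, length, type, form) and assumes that @code{_},
-@code{\t}, @code{\n}, @code{\r} and @code{\f} in the form field stand
-for the corresponding whitespace characters. The function
-@code{untokenize} and its parameters are hypothetical; they only mirror
-the @option{--gap-fill} and @option{--spaces} options and are not the
-actual implementation of @command{kot}.
-
```python
import re

# Assumed escapes used in the form field for whitespace characters.
SPECIAL = {"_": " ", "\\t": "\t", "\\n": "\n", "\\r": "\r", "\\f": "\f"}


def untokenize(lines, gap_fill="", expand=True):
    """Rebuild raw text from 'start length type form ...' segment lines."""
    pieces = []
    expected_start = None
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue
        start, length, form = int(fields[0]), int(fields[1]), fields[3]
        if length == 0:
            continue  # zero-length markers such as BOS/EOS carry no text
        if expected_start is not None and start != expected_start:
            pieces.append(gap_fill)  # the segments are not adjacent
        if expand:
            form = re.sub(r"\\[tnrf]|_", lambda m: SPECIAL[m.group(0)], form)
        pieces.append(form)
        expected_start = start + length
    return "".join(pieces)


segments = [
    "0000 00 BOS *",
    "0000 05 W Cześć",
    "0005 01 P !",
    "0006 00 EOS *",
    "0006 01 S _",
    "0007 02 W To",
]
print(untokenize(segments))  # Cześć! To
```
-
-Passing @code{expand=False} leaves the special characters as they are
-written in the form field.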
-
-@c ---------------------------------------------------------------
-@c CON
-@c ---------------------------------------------------------------
-
-
-@page
-@node con
-@section con - concordance table generator
-
-@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @strong{Authors:}                 @tab Justyna Walkowska
-@item @strong{Component category:}      @tab sink
-@item @strong{Input format:}            @tab UTT regular
-@item @strong{Output format:}           @tab text
-@item @strong{Required annotation:}     @tab ser or mar
-@end multitable
-@c
-
-@menu
-* con description::
-* con command line options::
-* con usage example::
-* con hints::    
-@end menu
-
-
-@node con description
-@subsection Description
-
-@command{con} generates a concordance table based on a pattern given to @command{ser}.
-
-
-@node con command line options
-@subsection Command line options
-
-@table @code
-
-@parhelp
-
-@c @item @b{@minus{}@minus{}help}, @b{@minus{}h}
-@c @item @b{@minus{}@minus{}version}, @b{@minus{}v}
-@c @item @b{@minus{}@minus{}file=@var{filename}, @minus{}f @var{filename}}
-@c @item @b{@minus{}@minus{}output=@var{filename}, @minus{}o @var{filename}}
-@c @item @b{@minus{}@minus{}fail=@var{filename}, @minus{}e @var{filename}} [???]
-@c @item @b{@minus{}@minus{}copy, @minus{}c} [???]
-@c @item @b{@minus{}@minus{}input-field=@var{fieldname}, @minus{}I @var{fieldname}}
-@c @item @b{@minus{}@minus{}output-field=@var{fieldname}, @minus{}O @var{fieldname}}
-@c @item @b{@minus{}@minus{}process=@var{class}, @minus{}p @var{class}}
-@c @item @b{@minus{}@minus{}interactive @minus{}i}
-@c @item @b{@minus{}@minus{}config=@var{filename}}
-@c @item
-@c @item @b{@minus{}@minus{}pattern=@var{pattern}, @minus{}e @var{pattern}}
-@c search pattern
-@c 
-@c @item @b{@minus{}@minus{}flex}
-@c only print the generated flex source code
-@c 
-@c @item @b{@minus{}@minus{}macro=@var{filename}}
-@c read macrodefinitions from file @var{filename} rather than from
-@c default location. This option allows to redefine the set of terms.
-@c 
-@c @item @b{@minus{}@minus{}define=@var{filename}}
-@c append macrodefinitions from file @var{filename}. This option
-@c allows to extend the set of terms.
-
-@item @b{@minus{}@minus{}left @minus{}l}
-Left context info (default='30c'). Examples:
-@example
-  -l=5c: left context is 5 characters
-  -l=5w: left context is 5 words
-  -l=5s: left context is 5 non-empty input lines
-  -l='\s*\S+\sr\S+BOS': left context starts with the given regex
-@end example
-
-@item @b{@minus{}@minus{}right @minus{}r}            
-	Right context info (default='30c').
-@item @b{@minus{}@minus{}trim @minus{}t}            
-	Clear incomplete words from output.
-@item @b{@minus{}@minus{}white @minus{}w}
-Do not change whitespace characters into spaces.
-@item @b{@minus{}@minus{}column @minus{}c}            
-	Left column minimal width in characters (default = 0).
-@item @b{@minus{}@minus{}ignore @minus{}i}            
-	Ignore segment inconsistency in the input.
-@item @b{@minus{}@minus{}bom}            
-	Beginning of selected segment (regex, default='[0-9]+ [0-9]+ BOM .*').
-@item @b{@minus{}@minus{}eom}            
-	End of selected segment (regex, default='[0-9]+ [0-9]+ EOM .*').
-@item @b{@minus{}@minus{}bod}            
-	Selected segment beginning display string (default='[').
-@item @b{@minus{}@minus{}eod}            
-	Selected segment end display string (default=']').
-
-
-
-@end table
-
-@node con usage example
-@subsection Usage example
-@example
-cat file.txt | tok | lem -1 | ser -e 'lexeme(dom)' | con  
-@end example
-
-
-@node con hints
-@subsection Hints
-
-@command{con} is a rather slow program. Do not pass large amounts of
-redundant text through this program. @command{con} works fine in the following
-sequence:
-
-@example
-... | grp -e EXPR | ser -e EXPR | con
-@end example
-
-
-@c ---------------------------------------------------------------------
-@c ---------------------------------------------------------------------
-
-@page
-@node Auxiliary tools
-@chapter Auxiliary tools
-
-@menu
-* compiledic::         dictionary compiler
-* fla::                UTT file flattener
-* unfla::              UTT file unflattener
-@end menu
-
-
-@page
-@node compiledic
-@section compiledic - the dictionary compiler
-
-@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @strong{Authors:}                 @tab Michał Stolarski, Tomasz Obrębski
-@item @strong{Component category:}      @tab additional tool
-@end multitable
-@c
-
-@command{compiledic} compiles dictionaries in text format (@code{.dic} extension) into binary
-(FSA) format (@code{.bin} extension).
-
-Automaton representation of a dictionary is built using the AT&T tools:
-@itemize
-@item AT&T FSM Library,
-@item AT&T Lextools.
-@end itemize
-
-For @command{compiledic} to work, the above-mentioned packages must be
-installed on your system.  They are freely available for
-non-commercial use.
-
-Usage:
-@example
-        compiledic <dictionaryname>.dic
-@end example
-
-The file @file{<dictionaryname>.bin} will be generated.
-
-Note: The program produces a lot of temporary files, which are
-stored in the current directory. They are deleted after successful
-termination of the program.
-
-@c @menu
-@c * con command line options::
-@c * con usage example::
-@c * con hints::    
-@c @end menu
-
-
-@c -------------------------------------------------------------------------------
-@c FLA
-@c -------------------------------------------------------------------------------
-
-@page
-@node fla
-@section fla - the UTT file flattener
-
-@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @strong{Authors:}                 @tab Tomasz Obrębski
-@item @strong{Input format:}            @tab UTT regular
-@item @strong{Output format:}           @tab UTT flattened
-@item @strong{Required annotation:}     @tab sen
-@end multitable
-@c
-
-@menu
-* fla description::
-@c * fla command line options::
-@c * fla usage example::
-@end menu
-
-
-@node fla description
-@subsection Description
-
-@command{fla} ``flattens'' a UTT file by merging segments belonging
-to one sentence into one line. Technically, end-of-line characters
-('\n', ASCII code 10) are replaced with form-feed characters ('\f',
-ASCII code 12).  The flattening makes it possible to process UTT files
-sentence by sentence with such tools as @command{grep} or
-@command{sed} (this is used in @command{grp} and @command{mar}).
-
-Flattened files should have the suffix @code{.fla}, e.g. @file{thetext.utt.fla}.
-
-Flattened files are still human-readable.
-
-Usage:
-
-@example
-        fla [<bosregex>]
-@end example
-
-The optional argument is a regular expression describing segments
-which should be treated as sentence beginnings (the test is: the
-segment contains a fragment matching the @code{<bosregex>}). By
-default, segments containing a @code{BOS} field are looked for.
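-
-The flattening and its inverse can be sketched in Python. This is an
-illustration under assumptions, not the code of @command{fla} or
-@command{unfla}: the functions @code{flatten} and @code{unflatten} are
-hypothetical, every segment matching the sentence-beginning expression
-is assumed to start a new output line, and the default expression is
-simplified to the plain string @code{BOS}.
-
```python
import re


def flatten(lines, bos_regex="BOS"):
    """Join the segments of each sentence with the '\\f' character."""
    sentences, current = [], []
    for line in lines:
        # A segment matching bos_regex opens a new sentence.
        if re.search(bos_regex, line) and current:
            sentences.append("\f".join(current))
            current = []
        current.append(line)
    if current:
        sentences.append("\f".join(current))
    return sentences


def unflatten(flat_lines):
    """Restore one segment per line."""
    return [segment for line in flat_lines for segment in line.split("\f")]


segments = [
    "0000 00 BOS *",
    "0000 05 W Cześć",
    "0005 01 P !",
    "0006 00 EOS *",
    "0006 00 BOS *",
    "0006 01 S _",
]
flat = flatten(segments)
print(len(flat))                    # 2
print(unflatten(flat) == segments)  # True
```
-
-Because each sentence occupies a single line, line-oriented tools can
-select or reject whole sentences at once.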
-
-@c -------------------------------------------------------------------------------
-@c UNFLA
-@c -------------------------------------------------------------------------------
-
-@page
-@node unfla
-@section unfla - the UTT file unflattener
-
-@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@item @strong{Authors:}                 @tab Tomasz Obrębski
-@item @strong{Input format:}            @tab UTT flattened
-@item @strong{Output format:}           @tab UTT regular
-@item @strong{Required annotation:}     @tab -
-@end multitable
-
-@menu
-* unfla description::
-@c * fla command line options::
-@c * fla usage example::
-@end menu
-
-@node unfla description
-@subsection Description
-@command{unfla} transforms a flattened UTT file, produced by
-@command{fla}, into the regular format by restoring end-of-line
-characters.
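
As a rough illustration of the inverse step, restoring end-of-line characters amounts to turning every form feed back into a newline. The function name `unflatten` is hypothetical, and the real @command{unfla} may do more than this one-line sketch.

```shell
# Hypothetical stand-in for unfla (illustration only):
# turn each form feed back into an end-of-line character.
unflatten() {
    tr '\f' '\n'
}

# Two flattened "sentences" become four separate lines again.
printf 'a\fb\nc\fd\n' | unflatten
```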
-
-
-
-
-@c ---------------------------------------------------------------------
-@c USAGE EXAMPLES
-@c ---------------------------------------------------------------------
-
-@node Usage examples
-@chapter Usage examples
-
-@subsubheading Simple pipelines
-
-@enumerate
-
-@item tokenization
-
-@example
-cat text | tok > output1
-@end example
-
-@item morphological annotation (1)
-
-simple dictionary-based lemmatization
-
-@example
-cat text | tok | lem > output1
-@end example
-
-@item morphological annotation (2)
-
-1) perform dictionary-based lemmatization
-2) guess descriptions for words which have no annotation
-
-@example
-cat text | tok | lem | gue -S lem > output2
-@end example
-
-@item morphological annotation (3)
-
-1) perform dictionary-based lemmatization
-2) try to correct words with no annotation
-3) perform dictionary-based lemmatization of corrected words
-4) guess descriptions for words which still have no annotation
-
-@example
-cat text | tok | lem | cor -p W -S lem | lem -I cor | gue -p W -S lem
-@end example
-@item spelling correction
-
-
-
-@example
-cat text | tok | egrep ' W ' | lem | egrep -v 'lem:' | cor -1
-@end example
-
-@item Expression extraction
-
-Extraction of all occurrences of a verb followed by a form of the noun 'rozmowa'.
-
-@example
-cat text | tok | lem -1 | ser -e 'cat(<V>) space lexeme(rozmowa)' -m | kot > output4
-@end example
-
-@item A word in context
-
-Extraction of text fragments containing a form of the lexeme 'rozmowa' in
-the context of the 5 preceding and 5 following corpus segments.
-
-@example
-cat text | tok | lem -1 | ser -e 'seg@{5@} lexeme(rozmowa) seg@{5@}' -m | kot > output
-@end example
-
-@item generation of concordance table (1)
-
-@example
-cat text | tok | lem -1 | ser -e 'cat(<V>) space lexeme(rozmowa)' | con
-@end example
-
-Execution time: 10 seconds.
-
-@item generation of concordance table (2)
-
-The same as above but much faster
-
-@example
-cat text | tok | lem -1 | \
-grp -e 'cat(<V>) space lexeme(rozmowa)' | \
-ser -e 'cat(<V>) space lexeme(rozmowa)' | \
-con
-@end example
-
-Execution time: 2 seconds.
-
-@item generation of concordance table (3)
-
-Usually, one searches the same corpus repeatedly. In that case it is
-advisable to first transform the corpus data into the format required
-by @command{grp}, and then search the preprocessed data.
-
-As @command{grp} (@command{grep}) processes data faster than it can be
-read from the disk drive, the search time can be shortened further by
-compressing the files.  We suggest using the
-@command{lzop} compressor/decompressor.
-
-@item the fastest way to search a large corpus
-
-step 1: corpus preprocessing
-
-@example
-cat corpus | tok | sen | lem -1 \
-| fla | lzop -7 > corpus.grp.lzo
-@end example
-
-step 2: search
-
-@example
-lzop -cd corpus.grp.lzo | unfla | \
-grp -e 'cat(<V>) space lexeme(rozmowa)' | \
-ser -e 'cat(<V>) space lexeme(rozmowa)' | con
-@end example
-
-@end enumerate
-
-@c @subsubheading More complicated configurations
-
-
-@c @example
-@c mknod fifo1 p
-@c mknod fifo2 p
-@c mknod fifo3 p
-@c mknod fifo4 p
-@c mknod fifo5 p
-
-@c tok | lem -p W -e fifo1 > fifo2 &
-@c cor -e fifo3 < fifo1 | lem > fifo4 &
-@c gue < fifo3 > fifo5 &
-@c sort -m fifo2 fifo4 fifo5
-
-@c rm fifo?
-@c @end example
-
-
-@c ---------------------------------------------------------------------
-@c ---------------------------------------------------------------------
-
-@c ---------------------------------------------------------------------
-@c PMDBF DICTIONARY
-@c ---------------------------------------------------------------------
-
-@node PMDBF dictionary
-@chapter PMDBF dictionary
-
-UTT components come with lexical data derived from the Polish
-Morphological Database (PMDB).
-
-@menu
-* PMDBF files::    
-* PMDBF tag structure::                 
-* PMDBF parts of speech::           
-* PMDBF morphosyntactic attributes::           
-@end menu
-
-@node PMDBF files
-@section Files
-
-@node PMDBF tag structure
-@section Tag structure
-
-@example
-pos   = [[:upper:]]+
-attr  = [[:upper:]]+
-val   = [[:lower:][:digit:]?!*+-] | <[^>\n]+>
-descr = pos ( / ( attr val + ) + ) ?
-@end example
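
The grammar above can be transcribed into a single POSIX extended regular expression, which is one way to check whether a string is a well-formed description. The names `DESCR_RE` and `is_valid_descr`, and the sample tags, are made up for illustration; the sketch only checks the shape of a tag, not whether its attributes and values are meaningful.

```shell
# ERE transcription of: descr = pos ( / ( attr val+ )+ )?
#   pos, attr : one or more upper-case letters
#   val       : one character from [[:lower:][:digit:]?!*+-], or <...>
DESCR_RE='^[[:upper:]]+(/([[:upper:]]+([[:lower:][:digit:]?!*+-]|<[^>]+>)+)+)?$'

# Succeeds (exit status 0) if the argument matches the grammar.
is_valid_descr() {
    printf '%s\n' "$1" | grep -Eq "$DESCR_RE"
}

# Illustrative tags only; they are not taken from the dictionary files.
for tag in 'V' 'N/GfNsCn' 'N/G' 'n/Gf'; do
    if is_valid_descr "$tag"; then
        echo "$tag: well-formed"
    else
        echo "$tag: malformed"
    fi
done
```

Here `N/G` is rejected because an attribute must be followed by at least one value, and `n/Gf` is rejected because the part of speech must be upper-case.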
-
-@node PMDBF parts of speech
-@section Parts of speech
-
-@multitable {ADJPRP} { adjectival-passive-participle }
-@item @code{N} @tab noun
-@item @code{NPRO} @tab nominal-pronoun
-@item @code{NV} @tab deverbal-noun
-@item @code{V} @tab verb
-@item @code{BYC} @tab byc
-@item @code{VNI} @tab non-inflected-verb
-@item @code{ADJ} @tab adjective
-@item @code{ADJPAP} @tab adjectival-passive-participle
-@item @code{ADJPRP} @tab adjectival-present-participle
-@item @code{ADJPP} @tab adjectival-past-participle
-@item @code{ADJPRO} @tab adjectival-pronoun
-@item @code{ADJNUM} @tab adjectival-numeral
-@item @code{ADV} @tab adverb
-@item @code{ADVANP} @tab adverbial-anterior-participle
-@item @code{ADVPRP} @tab adverbial-present-participle
-@item @code{ADVPRO} @tab adverbial-pronoun
-@item @code{ADVNUM} @tab  adverbial-numeral
-@item @code{P} @tab preposition
-@item @code{PPRO} @tab prep-noun-pronoun
-@item @code{CONJ} @tab conjunction
-@item @code{EXCL} @tab exclamation
-@item @code{APP} @tab call
-@item @code{ONO} @tab onomatopoeia
-@item @code{PART} @tab particle
-@item @code{NUMCRD} @tab cardinal-numeral
-@item @code{NUMCOL} @tab collective-numeral
-@item @code{NUMPAR} @tab partitive-numeral
-@item @code{NUMORD} @tab ordinal-numeral
-@end multitable
-
-@node PMDBF morphosyntactic attributes
-@section Morphosyntactic attributes
-
-@multitable {Attr} {Val} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
-@c @headitem Attr @tab Val @tab Description
-@item
-@code{A} @tab @tab Aspect
-@item
-@tab @code{p} @tab perfective,
-@item
-@tab @code{i} @tab imperfective.
-@item
-@item
-@code{V} @tab @tab Verb-Form
-@item
-@tab @code{b} @tab infinitive,
-@item
-@tab @code{p} @tab personal,
-@item
-@tab @code{i} @tab impersonal.
-@item
-@item
-@code{M} @tab @tab Mood
-@item
-@tab @code{d} @tab declarative,
-@item
-@tab @code{c} @tab conditional,
-@item
-@tab @code{i} @tab imperative.
-@item
-@item
-@code{T} @tab @tab Tense
-@item
-@tab @code{a} @tab past,
-@item
-@tab @code{r} @tab present,
-@item
-@tab @code{f} @tab future.
-@item
-@item
-@code{P} @tab @tab Person
-@item
-@tab @code{1} @tab 1,
-@item
-@tab @code{2} @tab 2,
-@item
-@tab @code{3} @tab 3.
-@item
-@item
-@code{D} @tab @tab Degree
-@item
-@tab @code{p} @tab positive,
-@item
-@tab @code{c} @tab comparative,
-@item
-@tab @code{s} @tab superlative.
-@item
-@item
-@code{N} @tab @tab Number
-@item
-@tab @code{s} @tab singular,
-@item
-@tab @code{p} @tab plural.
-@item
-@item
-@code{C} @tab @tab Case
-@item
-@tab @code{n} @tab nominative,
-@item
-@tab @code{g} @tab genitive,
-@item
-@tab @code{d} @tab dative,
-@item
-@tab @code{a} @tab accusative,
-@item
-@tab @code{i} @tab instrumental,
-@item
-@tab @code{l} @tab locative,
-@item
-@tab @code{v} @tab vocative.
-@item
-@code{G} @tab @tab Gender
-@item
-@tab @code{p} @tab masculine-personal,
-@item
-@tab @code{a} @tab masculine-animal,
-@item
-@tab @code{i} @tab masculine-inanimate,
-@item
-@tab @code{f} @tab feminine,
-@item
-@tab @code{n} @tab neuter.
-@end multitable
-
-
-@c ---------------------------------------------------------------------
-@c ---------------------------------------------------------------------
-@c 
-@c @node Examples
-@c @chapter Examples
-
-@c ----------------------------------------------------------------------
-@c ----------------------------------------------------------------------
-
-@node    GNU Free Documentation License
-@chapter GNU Free Documentation License
-
-@c The GNU Free Documentation License.
-@center Version 1.2, November 2002
-
-@c This file is intended to be included within another document,
-@c hence no sectioning command or @node.
-
-@display
-Copyright @copyright{} 2000,2001,2002 Free Software Foundation, Inc.
-51 Franklin St, Fifth Floor, Boston, MA  02110-1301, USA
-
-Everyone is permitted to copy and distribute verbatim copies
-of this license document, but changing it is not allowed.
-@end display
-
-@enumerate 0
-@item
-PREAMBLE
-
-The purpose of this License is to make a manual, textbook, or other
-functional and useful document @dfn{free} in the sense of freedom: to
-assure everyone the effective freedom to copy and redistribute it,
-with or without modifying it, either commercially or noncommercially.
-Secondarily, this License preserves for the author and publisher a way
-to get credit for their work, while not being considered responsible
-for modifications made by others.
-
-This License is a kind of ``copyleft'', which means that derivative
-works of the document must themselves be free in the same sense.  It
-complements the GNU General Public License, which is a copyleft
-license designed for free software.
-
-We have designed this License in order to use it for manuals for free
-software, because free software needs free documentation: a free
-program should come with manuals providing the same freedoms that the
-software does.  But this License is not limited to software manuals;
-it can be used for any textual work, regardless of subject matter or
-whether it is published as a printed book.  We recommend this License
-principally for works whose purpose is instruction or reference.
-
-@item
-APPLICABILITY AND DEFINITIONS
-
-This License applies to any manual or other work, in any medium, that
-contains a notice placed by the copyright holder saying it can be
-distributed under the terms of this License.  Such a notice grants a
-world-wide, royalty-free license, unlimited in duration, to use that
-work under the conditions stated herein.  The ``Document'', below,
-refers to any such manual or work.  Any member of the public is a
-licensee, and is addressed as ``you''.  You accept the license if you
-copy, modify or distribute the work in a way requiring permission
-under copyright law.
-
-A ``Modified Version'' of the Document means any work containing the
-Document or a portion of it, either copied verbatim, or with
-modifications and/or translated into another language.
-
-A ``Secondary Section'' is a named appendix or a front-matter section
-of the Document that deals exclusively with the relationship of the
-publishers or authors of the Document to the Document's overall
-subject (or to related matters) and contains nothing that could fall
-directly within that overall subject.  (Thus, if the Document is in
-part a textbook of mathematics, a Secondary Section may not explain
-any mathematics.)  The relationship could be a matter of historical
-connection with the subject or with related matters, or of legal,
-commercial, philosophical, ethical or political position regarding
-them.
-
-The ``Invariant Sections'' are certain Secondary Sections whose titles
-are designated, as being those of Invariant Sections, in the notice
-that says that the Document is released under this License.  If a
-section does not fit the above definition of Secondary then it is not
-allowed to be designated as Invariant.  The Document may contain zero
-Invariant Sections.  If the Document does not identify any Invariant
-Sections then there are none.
-
-The ``Cover Texts'' are certain short passages of text that are listed,
-as Front-Cover Texts or Back-Cover Texts, in the notice that says that
-the Document is released under this License.  A Front-Cover Text may
-be at most 5 words, and a Back-Cover Text may be at most 25 words.
-
-A ``Transparent'' copy of the Document means a machine-readable copy,
-represented in a format whose specification is available to the
-general public, that is suitable for revising the document
-straightforwardly with generic text editors or (for images composed of
-pixels) generic paint programs or (for drawings) some widely available
-drawing editor, and that is suitable for input to text formatters or
-for automatic translation to a variety of formats suitable for input
-to text formatters.  A copy made in an otherwise Transparent file
-format whose markup, or absence of markup, has been arranged to thwart
-or discourage subsequent modification by readers is not Transparent.
-An image format is not Transparent if used for any substantial amount
-of text.  A copy that is not ``Transparent'' is called ``Opaque''.
-
-Examples of suitable formats for Transparent copies include plain
-@sc{ascii} without markup, Texinfo input format, La@TeX{} input
-format, @acronym{SGML} or @acronym{XML} using a publicly available
-@acronym{DTD}, and standard-conforming simple @acronym{HTML},
-PostScript or @acronym{PDF} designed for human modification.  Examples
-of transparent image formats include @acronym{PNG}, @acronym{XCF} and
-@acronym{JPG}.  Opaque formats include proprietary formats that can be
-read and edited only by proprietary word processors, @acronym{SGML} or
-@acronym{XML} for which the @acronym{DTD} and/or processing tools are
-not generally available, and the machine-generated @acronym{HTML},
-PostScript or @acronym{PDF} produced by some word processors for
-output purposes only.
-
-The ``Title Page'' means, for a printed book, the title page itself,
-plus such following pages as are needed to hold, legibly, the material
-this License requires to appear in the title page.  For works in
-formats which do not have any title page as such, ``Title Page'' means
-the text near the most prominent appearance of the work's title,
-preceding the beginning of the body of the text.
-
-A section ``Entitled XYZ'' means a named subunit of the Document whose
-title either is precisely XYZ or contains XYZ in parentheses following
-text that translates XYZ in another language.  (Here XYZ stands for a
-specific section name mentioned below, such as ``Acknowledgements'',
-``Dedications'', ``Endorsements'', or ``History''.)  To ``Preserve the Title''
-of such a section when you modify the Document means that it remains a
-section ``Entitled XYZ'' according to this definition.
-
-The Document may include Warranty Disclaimers next to the notice which
-states that this License applies to the Document.  These Warranty
-Disclaimers are considered to be included by reference in this
-License, but only as regards disclaiming warranties: any other
-implication that these Warranty Disclaimers may have is void and has
-no effect on the meaning of this License.
-
-@item
-VERBATIM COPYING
-
-You may copy and distribute the Document in any medium, either
-commercially or noncommercially, provided that this License, the
-copyright notices, and the license notice saying this License applies
-to the Document are reproduced in all copies, and that you add no other
-conditions whatsoever to those of this License.  You may not use
-technical measures to obstruct or control the reading or further
-copying of the copies you make or distribute.  However, you may accept
-compensation in exchange for copies.  If you distribute a large enough
-number of copies you must also follow the conditions in section 3.
-
-You may also lend copies, under the same conditions stated above, and
-you may publicly display copies.
-
-@item
-COPYING IN QUANTITY
-
-If you publish printed copies (or copies in media that commonly have
-printed covers) of the Document, numbering more than 100, and the
-Document's license notice requires Cover Texts, you must enclose the
-copies in covers that carry, clearly and legibly, all these Cover
-Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on
-the back cover.  Both covers must also clearly and legibly identify
-you as the publisher of these copies.  The front cover must present
-the full title with all words of the title equally prominent and
-visible.  You may add other material on the covers in addition.
-Copying with changes limited to the covers, as long as they preserve
-the title of the Document and satisfy these conditions, can be treated
-as verbatim copying in other respects.
-
-If the required texts for either cover are too voluminous to fit
-legibly, you should put the first ones listed (as many as fit
-reasonably) on the actual cover, and continue the rest onto adjacent
-pages.
-
-If you publish or distribute Opaque copies of the Document numbering
-more than 100, you must either include a machine-readable Transparent
-copy along with each Opaque copy, or state in or with each Opaque copy
-a computer-network location from which the general network-using
-public has access to download using public-standard network protocols
-a complete Transparent copy of the Document, free of added material.
-If you use the latter option, you must take reasonably prudent steps,
-when you begin distribution of Opaque copies in quantity, to ensure
-that this Transparent copy will remain thus accessible at the stated
-location until at least one year after the last time you distribute an
-Opaque copy (directly or through your agents or retailers) of that
-edition to the public.
-
-It is requested, but not required, that you contact the authors of the
-Document well before redistributing any large number of copies, to give
-them a chance to provide you with an updated version of the Document.
-
-@item
-MODIFICATIONS
-
-You may copy and distribute a Modified Version of the Document under
-the conditions of sections 2 and 3 above, provided that you release
-the Modified Version under precisely this License, with the Modified
-Version filling the role of the Document, thus licensing distribution
-and modification of the Modified Version to whoever possesses a copy
-of it.  In addition, you must do these things in the Modified Version:
-
-@enumerate A
-@item
-Use in the Title Page (and on the covers, if any) a title distinct
-from that of the Document, and from those of previous versions
-(which should, if there were any, be listed in the History section
-of the Document).  You may use the same title as a previous version
-if the original publisher of that version gives permission.
-
-@item
-List on the Title Page, as authors, one or more persons or entities
-responsible for authorship of the modifications in the Modified
-Version, together with at least five of the principal authors of the
-Document (all of its principal authors, if it has fewer than five),
-unless they release you from this requirement.
-
-@item
-State on the Title page the name of the publisher of the
-Modified Version, as the publisher.
-
-@item
-Preserve all the copyright notices of the Document.
-
-@item
-Add an appropriate copyright notice for your modifications
-adjacent to the other copyright notices.
-
-@item
-Include, immediately after the copyright notices, a license notice
-giving the public permission to use the Modified Version under the
-terms of this License, in the form shown in the Addendum below.
-
-@item
-Preserve in that license notice the full lists of Invariant Sections
-and required Cover Texts given in the Document's license notice.
-
-@item
-Include an unaltered copy of this License.
-
-@item
-Preserve the section Entitled ``History'', Preserve its Title, and add
-to it an item stating at least the title, year, new authors, and
-publisher of the Modified Version as given on the Title Page.  If
-there is no section Entitled ``History'' in the Document, create one
-stating the title, year, authors, and publisher of the Document as
-given on its Title Page, then add an item describing the Modified
-Version as stated in the previous sentence.
-
-@item
-Preserve the network location, if any, given in the Document for
-public access to a Transparent copy of the Document, and likewise
-the network locations given in the Document for previous versions
-it was based on.  These may be placed in the ``History'' section.
-You may omit a network location for a work that was published at
-least four years before the Document itself, or if the original
-publisher of the version it refers to gives permission.
-
-@item
-For any section Entitled ``Acknowledgements'' or ``Dedications'', Preserve
-the Title of the section, and preserve in the section all the
-substance and tone of each of the contributor acknowledgements and/or
-dedications given therein.
-
-@item
-Preserve all the Invariant Sections of the Document,
-unaltered in their text and in their titles.  Section numbers
-or the equivalent are not considered part of the section titles.
-
-@item
-Delete any section Entitled ``Endorsements''.  Such a section
-may not be included in the Modified Version.
-
-@item
-Do not retitle any existing section to be Entitled ``Endorsements'' or
-to conflict in title with any Invariant Section.
-
-@item
-Preserve any Warranty Disclaimers.
-@end enumerate
-
-If the Modified Version includes new front-matter sections or
-appendices that qualify as Secondary Sections and contain no material
-copied from the Document, you may at your option designate some or all
-of these sections as invariant.  To do this, add their titles to the
-list of Invariant Sections in the Modified Version's license notice.
-These titles must be distinct from any other section titles.
-
-You may add a section Entitled ``Endorsements'', provided it contains
-nothing but endorsements of your Modified Version by various
-parties---for example, statements of peer review or that the text has
-been approved by an organization as the authoritative definition of a
-standard.
-
-You may add a passage of up to five words as a Front-Cover Text, and a
-passage of up to 25 words as a Back-Cover Text, to the end of the list
-of Cover Texts in the Modified Version.  Only one passage of
-Front-Cover Text and one of Back-Cover Text may be added by (or
-through arrangements made by) any one entity.  If the Document already
-includes a cover text for the same cover, previously added by you or
-by arrangement made by the same entity you are acting on behalf of,
-you may not add another; but you may replace the old one, on explicit
-permission from the previous publisher that added the old one.
-
-The author(s) and publisher(s) of the Document do not by this License
-give permission to use their names for publicity for or to assert or
-imply endorsement of any Modified Version.
-
-@item
-COMBINING DOCUMENTS
-
-You may combine the Document with other documents released under this
-License, under the terms defined in section 4 above for modified
-versions, provided that you include in the combination all of the
-Invariant Sections of all of the original documents, unmodified, and
-list them all as Invariant Sections of your combined work in its
-license notice, and that you preserve all their Warranty Disclaimers.
-
-The combined work need only contain one copy of this License, and
-multiple identical Invariant Sections may be replaced with a single
-copy.  If there are multiple Invariant Sections with the same name but
-different contents, make the title of each such section unique by
-adding at the end of it, in parentheses, the name of the original
-author or publisher of that section if known, or else a unique number.
-Make the same adjustment to the section titles in the list of
-Invariant Sections in the license notice of the combined work.
-
-In the combination, you must combine any sections Entitled ``History''
-in the various original documents, forming one section Entitled
-``History''; likewise combine any sections Entitled ``Acknowledgements'',
-and any sections Entitled ``Dedications''.  You must delete all
-sections Entitled ``Endorsements.''
-
-@item
-COLLECTIONS OF DOCUMENTS
-
-You may make a collection consisting of the Document and other documents
-released under this License, and replace the individual copies of this
-License in the various documents with a single copy that is included in
-the collection, provided that you follow the rules of this License for
-verbatim copying of each of the documents in all other respects.
-
-You may extract a single document from such a collection, and distribute
-it individually under this License, provided you insert a copy of this
-License into the extracted document, and follow this License in all
-other respects regarding verbatim copying of that document.
-
-@item
-AGGREGATION WITH INDEPENDENT WORKS
-
-A compilation of the Document or its derivatives with other separate
-and independent documents or works, in or on a volume of a storage or
-distribution medium, is called an ``aggregate'' if the copyright
-resulting from the compilation is not used to limit the legal rights
-of the compilation's users beyond what the individual works permit.
-When the Document is included in an aggregate, this License does not
-apply to the other works in the aggregate which are not themselves
-derivative works of the Document.
-
-If the Cover Text requirement of section 3 is applicable to these
-copies of the Document, then if the Document is less than one half of
-the entire aggregate, the Document's Cover Texts may be placed on
-covers that bracket the Document within the aggregate, or the
-electronic equivalent of covers if the Document is in electronic form.
-Otherwise they must appear on printed covers that bracket the whole
-aggregate.
-
-@item
-TRANSLATION
-
-Translation is considered a kind of modification, so you may
-distribute translations of the Document under the terms of section 4.
-Replacing Invariant Sections with translations requires special
-permission from their copyright holders, but you may include
-translations of some or all Invariant Sections in addition to the
-original versions of these Invariant Sections.  You may include a
-translation of this License, and all the license notices in the
-Document, and any Warranty Disclaimers, provided that you also include
-the original English version of this License and the original versions
-of those notices and disclaimers.  In case of a disagreement between
-the translation and the original version of this License or a notice
-or disclaimer, the original version will prevail.
-
-If a section in the Document is Entitled ``Acknowledgements'',
-``Dedications'', or ``History'', the requirement (section 4) to Preserve
-its Title (section 1) will typically require changing the actual
-title.
-
-@item
-TERMINATION
-
-You may not copy, modify, sublicense, or distribute the Document except
-as expressly provided for under this License.  Any other attempt to
-copy, modify, sublicense or distribute the Document is void, and will
-automatically terminate your rights under this License.  However,
-parties who have received copies, or rights, from you under this
-License will not have their licenses terminated so long as such
-parties remain in full compliance.
-
-@item
-FUTURE REVISIONS OF THIS LICENSE
-
-The Free Software Foundation may publish new, revised versions
-of the GNU Free Documentation License from time to time.  Such new
-versions will be similar in spirit to the present version, but may
-differ in detail to address new problems or concerns.  See
-@uref{http://www.gnu.org/copyleft/}.
-
-Each version of the License is given a distinguishing version number.
-If the Document specifies that a particular numbered version of this
-License ``or any later version'' applies to it, you have the option of
-following the terms and conditions either of that specified version or
-of any later version that has been published (not as a draft) by the
-Free Software Foundation.  If the Document does not specify a version
-number of this License, you may choose any version ever published (not
-as a draft) by the Free Software Foundation.
-@end enumerate
-
-@page
-@heading ADDENDUM: How to use this License for your documents
-
-To use this License in a document you have written, include a copy of
-the License in the document and put the following copyright and
-license notices just after the title page:
-
-@smallexample
-@group
-  Copyright (C)  @var{year}  @var{your name}.
-  Permission is granted to copy, distribute and/or modify this document
-  under the terms of the GNU Free Documentation License, Version 1.2
-  or any later version published by the Free Software Foundation;
-  with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
-  Texts.  A copy of the license is included in the section entitled ``GNU
-  Free Documentation License''.
-@end group
-@end smallexample
-
-If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts,
-replace the ``with@dots{}Texts.'' line with this:
-
-@smallexample
-@group
-    with the Invariant Sections being @var{list their titles}, with
-    the Front-Cover Texts being @var{list}, and with the Back-Cover Texts
-    being @var{list}.
-@end group
-@end smallexample
-
-If you have Invariant Sections without Cover Texts, or some other
-combination of the three, merge those two alternatives to suit the
-situation.
-
-If your document contains nontrivial examples of program code, we
-recommend releasing these examples in parallel under your choice of
-free software license, such as the GNU General Public License,
-to permit their use in free software.
-
-@c Local Variables:
-@c ispell-local-pdict: "ispell-dict"
-@c End:
-
-
-@c ---------------------------------------------------------------------
-@c ---------------------------------------------------------------------
-
-@node    Reporting bugs
-@chapter Reporting bugs
-
-Report bugs to <obrebski@@amu.edu.pl>.
-
-@c ---------------------------------------------------------------------
-@c ---------------------------------------------------------------------
-
-@c @node    Copyright
-@c @chapter Copyright
-@c 
-@c Copyright 2004 by Tomasz Obrębski
-@c This software is free for research and educational use.
-
-@c ---------------------------------------------------------------------
-@c ---------------------------------------------------------------------
-
-@node    Author
-@chapter Author
-
-
-@bye
Index: auto/defaults
===================================================================
--- auto/defaults	(revision e4cec26b9f4157c18cc9809b9570c3f2eb14cc03)
+++ auto/defaults	(revision 9a367616612b9e69c7e0bb155d5b543d50824483)
@@ -22,4 +22,8 @@
 DEFAULT_CP='/bin/cp'
 DEFAULT_CHMOD='/bin/chmod'
+DEFAULT_MAKEINFO='/usr/bin/makeinfo'
+DEFAULT_TEXI2DVI='/usr/bin/texi2dvi'
+DEFAULT_TEXI2PDF='/usr/bin/texi2pdf'
+DEFAULT_DVIPS='/usr/bin/dvips'
 
 DEFAULT_CFLAGS='-g -O2 -Wall'
Index: auto/options
===================================================================
--- auto/options	(revision e4cec26b9f4157c18cc9809b9570c3f2eb14cc03)
+++ auto/options	(revision 9a367616612b9e69c7e0bb155d5b543d50824483)
@@ -25,4 +25,8 @@
 if [ -z "$CP" ];                then CP=$DEFAULT_CP;                                fi
 if [ -z "$CHMOD" ];             then CHMOD=$DEFAULT_CHMOD;                          fi
+if [ -z "$MAKEINFO" ];          then MAKEINFO=$DEFAULT_MAKEINFO;                    fi
+if [ -z "$TEXI2DVI" ];          then TEXI2DVI=$DEFAULT_TEXI2DVI;                    fi
+if [ -z "$TEXI2PDF" ];          then TEXI2PDF=$DEFAULT_TEXI2PDF;                    fi
+if [ -z "$DVIPS" ];             then DVIPS=$DEFAULT_DVIPS;                          fi
 
 if [ -z "$CFLAGS" ];            then CFLAGS=$DEFAULT_CFLAGS;                        fi
@@ -80,4 +84,9 @@
     CP=*)               CP="$value"                               ;;
     CHMOD=*)            CHMOD="$value"                            ;;
+    MAKEINFO=*)         MAKEINFO="$value"                         ;;
+    TEXI2DVI=*)         TEXI2DVI="$value"                         ;;
+    TEXI2PDF=*)         TEXI2PDF="$value"                         ;;
+    DVIPS=*)            DVIPS="$value"                            ;;
+
 
     CFLAGS=*)           CFLAGS="$value"                           ;;
@@ -136,4 +145,9 @@
   CP                    cp command
   CHMOD                 chmod command
+  MAKEINFO              makeinfo command
+  TEXI2DVI              texi2dvi command
+  TEXI2PDF              texi2pdf command
+  DVIPS                 dvips command
+
 
   CFLAGS                C compiler flags
Index: auto/output/Makefile
===================================================================
--- auto/output/Makefile	(revision b5884b3fd633a968051fc3f1574733eea13b0230)
+++ auto/output/Makefile	(revision 9a367616612b9e69c7e0bb155d5b543d50824483)
@@ -19,4 +19,8 @@
 CP = $CP
 CHMOD = $CHMOD
+MAKEINFO = $MAKEINFO
+TEXI2DVI = $TEXI2DVI
+TEXI2PDF = $TEXI2PDF
+DVIPS = $DVIPS
 
 CFLAGS = $CFLAGS
@@ -44,7 +48,13 @@
 ALL_FFLAGS = -t \$(FFLAGS)
 
-VPATH = ./src
+vpath %.c       ./src
+vpath %.l       ./src
+vpath %.pl      ./src
+vpath %.sed     ./src
+vpath %.sh      ./src
+vpath %.texinfo ./doc
 
 PROGRAMS = tok sen fla gph kot unfla grp mar ser kon rm12 rs12
+DOC_FILES = utt.info utt.dvi utt.html utt.pdf utt.ps
 
 TOK_OBJ_FILES = tok.o tok_cmdline.o
@@ -102,13 +112,4 @@
 .SUFFIXES: .l .y .h .c .pl .o
 
-#.INTERMEDIATE: \$(patsubst %.l,%.c,\$(TOK_FLEX_FILES))
-#.INTERMEDIATE: \$(patsubst %.ggo,%.c,\$(TOK_GGO_FILES))
-#.INTERMEDIATE: \$(patsubst %.ggo,%.h,\$(TOK_GGO_FILES))
-#.INTERMEDIATE: \$(patsubst %.l,%.c,\$(SEN_FLEX_FILES))
-
-.PHONY: all
-all: \$(PROGRAMS)
-
-
 .PHONY: help
 help:
@@ -120,4 +121,45 @@
 	\$(PR) --omit-pagination --width=80 --columns=4
 
+.PHONY: all
+all: \$(PROGRAMS)
+
+.PHONY: install
+install: all
+
+.PHONY: install-strip
+install-strip:
+
+.PHONY: info
+info: utt.info
+
+.PHONY: install-info
+install-info:
+
+.PHONY: dvi
+dvi: utt.dvi
+
+.PHONY: install-dvi
+install-dvi:
+
+.PHONY: html
+html: utt.html
+
+.PHONY: install-html
+install-html:
+
+.PHONY: pdf
+pdf: utt.pdf
+
+.PHONY: install-pdf
+install-pdf:
+
+.PHONY: ps
+ps: utt.ps
+
+.PHONY: install-ps
+install-ps:
+	
+.PHONY: uninstall
+uninstall:
 
 .PHONY: clean
@@ -132,4 +174,5 @@
 	\$(RM) \$(FLA_OBJ_FILES)
 	\$(RM) \$(RS12_OBJ_FILES)
+	\$(RM) \$(DOC_FILES)
 
 .PHONY: distclean
@@ -137,12 +180,4 @@
 	\$(RM) \$(CONFIG_FILES)
 
-.PHONY: install
-install: all
-	echo TODO: make install
-	
-.PHONY: uninstall
-uninstall:
-	echo TODO: make uninstall
-
 %.o: %.c
 	\$(CC) -c \$< -o \$@ \$(ALL_CFLAGS)
@@ -172,4 +207,19 @@
 	\$(CHMOD) a+x \$@
 
+%.info: %.texinfo
+	\$(MAKEINFO) \$< -o \$@
+
+%.dvi: %.texinfo
+	\$(TEXI2DVI) --build=clean \$< -o \$@
+
+%.html: %.texinfo
+	\$(MAKEINFO) --html --no-split \$< -o \$@
+
+%.pdf: %.texinfo
+	\$(TEXI2PDF) --build=clean \$< -o \$@
+
+%.ps: %.dvi
+	\$(DVIPS) \$< -o \$@
+
 EOF
 
Index: auto/summary
===================================================================
--- auto/summary	(revision e4cec26b9f4157c18cc9809b9570c3f2eb14cc03)
+++ auto/summary	(revision 9a367616612b9e69c7e0bb155d5b543d50824483)
@@ -40,4 +40,8 @@
   CP              : $CP
   CHMOD           : $CHMOD
+  MAKEINFO        : $MAKEINFO
+  TEXI2DVI        : $TEXI2DVI
+  TEXI2PDF        : $TEXI2PDF
+  DVIPS           : $DVIPS
 
   CFLAGS          : $CFLAGS
Index: doc/utt.texinfo
===================================================================
--- doc/utt.texinfo	(revision 9a367616612b9e69c7e0bb155d5b543d50824483)
+++ doc/utt.texinfo	(revision 9a367616612b9e69c7e0bb155d5b543d50824483)
@@ -0,0 +1,2920 @@
+
+\input texinfo   @c -*-texinfo-*-
+@c @documentencoding ISO-8859-2
+@documentencoding UTF-8
+@c @documentlanguage pl
+
+@c %**start of header
+@setfilename utt.info
+@settitle UAM Text Tools v0.90
+@c %**end of header
+
+@copying
+This manual is for UAM Text Tools (version 0.90, October, 2008)
+
+Copyright @copyright{}  2005, 2007  Tomasz Obrębski, Michał Stolarski, Justyna Walkowska, Paweł Konieczka.
+
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.2 or
+any later version published by the Free Software Foundation; with no
+Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.  A
+copy of the license is included in the section entitled ``GNU Free
+Documentation License''.
+
+@c @quotation
+@c Permission is granted to ...
+@c No permission is granted until the document is completed.
+@c @end quotation
+@end copying
+
+
+@titlepage
+@title UAM Text Tools 0.90 - User Manual
+@subtitle edition 0.01, @today
+@subtitle status: prescript
+@author by Justyna Walkowska, Tomasz Obrębski and Michał Stolarski
+@page
+@vskip 0pt plus 1filll
+@insertcopying
+@end titlepage
+
+@contents
+
+@c @paragraphindent none
+
+@iftex
+@tex
+% \usepackage[T1]{fontenc}
+% \usepackage[utf8]{inputenc}
+% \usepackage{times}
+@end tex
+
+@parskip = 0.5@normalbaselineskip plus 3pt minus 1pt
+@end iftex
+@c @headings off
+@c @everyheading LEM(1) @| @| LEM(1)
+@everyfooting @today @c @| @thispage @|
+
+@ifnottex
+
+@node Top
+@top UTT - UAM Text Tools
+
+@insertcopying
+
+@menu
+* General information::                       
+* UTT file format::             
+* Configuration files::         
+* UTT components::
+* Auxiliary tools::
+* Usage examples::              
+* PMDBF dictionary::            
+@c * Examples::                    
+@c * Copyright::
+* GNU Free Documentation License:: 
+* Reporting bugs::                                    
+* Author::                      
+@end menu
+@end ifnottex
+
+
+@c ----------------------------------------------------------------------
+
+@node General information
+@chapter General information
+
+UAM Text Tools (UTT) is a package of language processing tools
+developed at Adam Mickiewicz University. Its functionality includes:
+
+@itemize @bullet
+
+@item
+tokenization
+@item
+dictionary-based morphological analysis
+@item
+heuristic morphological analysis of unknown words
+@item
+spelling correction
+@item
+pattern search
+@item
+sentence splitting
+@item
+generation of concordance tables
+@end itemize
+
+The toolkit is intended for processing raw (unannotated),
+unrestricted text for any conceivable purpose.
+
+The system is organized as a collection of command-line programs, each
+performing one operation, e.g. tokenization, lemmatization, or spelling
+correction. The components are independent of one another; the
+unifying element is the uniform i/o file format.
+
+The components may be combined in various ways to provide various text
+processing services. New components supplied by the user may also be
+easily incorporated into the system, provided that they respect the i/o
+file format conventions.
+
+UTT component programs do not depend on any specific tagset or
+morphological description format.
+
+UTT is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by 
+the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+
+The Polex/PMDBF dictionary is licensed under the Creative Commons by-nc-sa License which prohibits commercial use.  
+
+
+List of contributors:
+
+@itemize
+@item Paweł Konieczka
+@item Tomasz Obrębski
+@item Michał Stolarski
+@item Marcin Walas
+@item Justyna Walkowska
+@item Paweł Wereński
+@end itemize
+
+@c ----------------------------------------------------------------------
+@c ---------------------------------------------------------------------
+
+@node    UTT file format
+@chapter UTT file format
+
+A UTT file contains annotation of a text. It consists of a sequence of
+segments. Each segment explicitly refers to a continuous piece of the
+text and provides some information on it.
+
+@section Segment format
+
+A segment occupies one line of a UTT file and consists of
+space-separated fields:
+
+
+@quotation
+@sp 1
+[@var{start} [@var{length}]] @var{type} @var{form} [@var{annotation1} [@var{annotation2} ...]]
+@sp 1
+@end quotation
+
+@table @var
+
+@item @var{start} 
+Non-negative integer value indicating the position in the source text where the
+segment starts.
+
+@item @var{length}
+Non-negative integer value indicating the length of the segment.
+
+@item @var{type}
+A sequence of non-blank characters (without spaces or digits, which could lead to @var{type} being misinterpreted as a @var{start} or @var{length} field).
+@var{type} reflects the main classification of segments
+into words, numbers, punctuation marks, and meta-text markers.
+@xref{tok output,,tok output}, for a description of the automatically recognized type markers.
+
+@item @var{form}
+This field contains the textual form of the segment or the special
+symbol @code{*} indicating that the form is not given (e.g. when the segment has been created artificially to mark something and is of length 0).
+
+The characters or character sequences that have special meaning in the
+@var{form} field are enumerated below.
+
+Characters with special meaning:
+
+@itemize
+@item @code{_} - space character
+@item @code{*} - undefined contents
+@end itemize
+
+Escape sequences:
+
+@itemize
+@item @code{\n} - new line
+@item @code{\t} - tabulation
+@item @code{\r} - carriage return  
+
+@item @code{\_} - the @code{_} character
+@item @code{\*} - the @code{*} character
+@item @code{\\} - the @code{\} character
+
+@c @item @code{\hh} - a character with hexadecimal code @code{hh} (used for non-printable characters)
+@end itemize
+
+@item @var{annotation1}
+@item @var{annotation2}
+@item ...
+Annotation fields have the following format:
+
+@var{longname} @code{:} @var{value}
+
+or
+
+@var{shortname} @var{value}
+
+where @var{longname} is a string of alphanumeric characters
+(isalnum() test), @var{shortname} - a single non-alphanumeric character
+(ispunct() test), and @var{value} is an arbitrary string of non-blank characters.
+
+@end table
+
+
+Only two fields are mandatory: @var{type} and @var{form}. All other fields
+may be absent. When only one number precedes the
+@var{type} field, it is interpreted as the @var{start} position.
+
+If the @var{length} field is omitted, the length of the segment is the
+length of the @var{form} field, except when the value of the
+@var{form} field is @code{*}, in which case the length is assumed to
+be 0.
+
+If the @var{start} field is also absent, the segment is assumed to directly
+follow the preceding one.
+
+@c Conventions:
+
+@c Annotation fields with predefined meaning:
+
+@c @itemize
+@c @item @code{!} - UTT components are allowed to modify the contents of
+@c the @var{form} field (e.g. spelling correction does this). If this happens the
+@c original form of the segment have to be placed in the @code{!}-field.
+@c @item @code{@@} - morphological description
+@c @item @code{=} - node identifier assignment (used in graph encoding)
+@c @item @code{<} - preceding/dominating node(s) (used in graph encoding)
+@c @item @code{>} - succeeding/subordinate node(s) (used in graph encoding)
+@c @end itemize
+
+Segments of length 0 may be used to mark file positions with some
+information. See e.g. BOS and EOS (beginning/end of sentence) markers
+in the example below.
+
+Example:
+
+sentence: @samp{Piszemy dobre progrumy.}
+
+@example
+0000 00 BOS *
+0000 07 W Piszemy lem:pisać,V
+0007 01 S _
+0008 05 W dobre lem:dobry,ADJ
+0013 01 S _
+0014 08 W progrumy cor:programy lem:program,N
+0022 01 P .
+0023 00 EOS *
+0023 01 S _
+0024 00 BOS *
+0024 11 W Warszawiacy lem:Warszawiak,N
+0035 01 S _
+0036 03 W też
+0039 01 P .
+0040 00 EOS *
+
+@end example
+
+@example
+0000 BOS *
+0000 W Piszemy lem:pisać,V
+0007 S _
+0008 W dobre lem:dobry,ADJ
+0013 S _
+0014 W progrumy cor:programy lem:program,N
+0022 P .
+0023 EOS *
+@end example
+
+Position information may be provided only for some types of segments:
+
+@example
+0000 BOS *
+W Piszemy lem:pisać,V
+S _
+W dobre lem:dobry,ADJ
+S _
+W progrumy cor:programy lem:program,N
+P .
+EOS *
+S _
+0024 BOS *
+W Warszawiacy lem:Warszawiak,N
+S _
+W też
+P .
+EOS *
+@end example
+
+Position/length information may be provided only when necessary:
+
+@example
+0000 04 N *
+0000 N 12
+P .
+N 5
+S _
+W km
+@end example
+
+@section UTT File
+
+A UTT file consists of a sequence of segments.  The same text position
+may be covered by multiple segments. Consequently, ambiguous text
+segmentation and ambiguous annotation may be represented.
+
+There are two structural requirements a valid UTT-formatted file
+has to meet:
+
+@itemize @bullet
+
+@item
+segments have to be sorted with respect to the @var{start} field,
+
+@item
+for each
+segment ending at position @var{n}, either there must be a segment starting at
+position @var{n+1}, or position @var{n+1} is not covered by any segment; similarly
+for each segment starting at position @var{n}, either there must be a segment
+ending at position @var{n-1}, or the position @var{n-1} must not be covered
+by any segment.
+
+@end itemize
+
+A valid annotation for the text fragment
+@example
+12.5 km
+@end example
+
+may be 
+
+@example
+0000 02 N 12
+0000 04 N 12.5
+0002 01 P .
+0003 01 N 5
+0004 01 S _
+0005 02 W km
+@end example
+
+but not
+
+@example
+0000 02 N 12
+0000 04 N 12.5
+0004 01 S _
+0005 02 W km
+@end example
+
+because in the latter example the first segment (starting at position
+0000, 2 characters long) ends at position @var{n}=0001, and position
+@var{n+1}=0002 is covered by the second segment, yet no segment starts
+at position 0002.
+
+
+@section Flattened UTT file
+
+The UTT file format has two variants: regular and flattened. The regular
+format was described above.  In the flattened format some of the
+end-of-line characters are replaced with line-feed characters.
+
+The flattened format is basically used to represent whole sentences as
+single lines of the input file (all intrasentential end-of-line
+characters are replaced with line-feed characters).
+
+This technical trick makes it possible to perform certain text
+processing operations on entire sentences with tools such as
+@command{grep} (see the @command{grp} component) or @command{sed} (see the @command{mar} component).
+
+The conversion between the two formats is performed by the tools:
+@command{fla} and @command{unfla}.
+
+@section Character encoding
+
+The UTT component programs accept only 1-byte character encodings, such
+as ISO, ANSI, DOS.
+
+
+@c @section Formats
+
+@c @unnumberedsubsubsec Basic format
+
+@c While processing large amounts of the overhead related with explicit
+@c ... of the start position and segment length becomes ... . Therefore,
+@c for efficiency reasons certain shortcuts are possible:
+
+@c @unnumberedsubsubsec Relative start position
+
+@c Start position may be given as relative distance from the last
+@c absolut position. 
+
+@c @unnumberedsubsubsec Absent length
+
+@c Segment length may by omitted. Normally it can be restored by counting
+@c the length of the @emph{form field}. For segments with the special value
+@c @code{*} in the @emph{form field} length 0 is assumed.
+
+@c @unnumberedsubsubsec Absent length and start position
+
+@c Both start position and segment length may be omitted. In this format
+@c each segment is assumed to follow the previous one. This format is,
+@c therefore, suitable only for unambiguously tagged text
+@c (0-length markers can be still used.)
+
+
+@c @table @code
+@c @item AL
+@c @code{1234 03 W kot}
+@c @item RL
+@c @code{+56 03 W kot}
+@c @item A
+@c @code{1234 W kot}
+@c @item R
+@c @code{+56 W kot}
+@c @item 0
+@c @code{W kot}
+@c @end table
+
+
+@c [HOW TO GET POLISH FONTS IN DVI???]
+
+@macro parhelp
+@item @b{@minus{}@minus{}help}, @b{@minus{}h}
+Print help.
+@end macro
+
+
+@macro parversion
+@item @b{@minus{}@minus{}version}, @b{@minus{}V}
+Print version information.
+@end macro
+
+@macro parinteractive
+@item @b{@minus{}@minus{}interactive, @minus{}i}
+This option toggles interactive mode, which is by default off. In the
+interactive mode the program does not buffer the output.
+@end macro
+
+
+@c @macro parfile
+@c @item @b{@minus{}@minus{}file=@var{filename}, @minus{}f @var{filename}}
+@c Input file name.
+@c If this option is absent or equal to '@minus{}', the program
+@c reads from the standard input.
+@c @end macro
+
+
+@c @macro paroutput
+@c @item @b{@minus{}@minus{}output=@var{filename}, @minus{}o @var{filename}}
+@c Regular output file name. To regular output the program sends segments
+@c which it successfully processed and copies those which were not
+@c subject to processing. If this option is absent or equal to
+@c '@minus{}', standard output is used.
+@c @end macro
+
+@c @macro parfail
+@c @item @b{@minus{}@minus{}fail=@var{filename}, @minus{}e @var{filename}}
+@c Fail output file name. To fail output the program copies the segments
+@c it failed to process.  If this option is absent or equal to
+@c '@minus{}', standard output is used.
+@c @end macro
+
+
+@c @macro parcopy
+@c @item @b{@minus{}@minus{}copy, @minus{}c}
+@c Copy succesfully processed segments to regular output also in their
+@c original input form.
+@c @end macro
+
+
+@macro parinputfield
+@item @b{@minus{}@minus{}input-field=@var{fieldname}, @minus{}I @var{fieldname}}
+The field containing the input to the program. The default is the
+@var{form} field. The fields @var{start}, @var{length}, @var{type},
+and @var{form} are referred to as @code{1}, @code{2}, @code{3},
+@code{4}, respectively.
+@end macro
+
+
+@macro paroutputfield
+@item @b{@minus{}@minus{}output-field=@var{fieldname}, @minus{}O @var{fieldname}}
+The name of the field added by the program. The default is the name of the program.
+@end macro
+
+
+@macro pardictionary
+@item @b{@minus{}@minus{}dictionary=@var{filename}, @minus{}d @var{filename}}
+Dictionary file name.
+@end macro
+
+
+@macro parprocess
+@item @b{@minus{}@minus{}process=@var{type}, @minus{}p @var{type}}
+Process segments with the specified value in the @var{type} field.
+Multiple occurrences of this option are allowed and are interpreted as
+disjunction. If this option is absent, all segments are processed.
+@end macro
+
+
+@macro parselect
+@item @b{@minus{}@minus{}select=@var{fieldname}, @minus{}s @var{fieldname}}
+Select for processing only segments in which the field named
+@var{fieldname} is present. Multiple occurrences of this option are
+allowed and are interpreted as conjunction of conditions. If this
+option is absent, all segments are processed.
+@end macro
+
+
+@macro parunselect
+@item @b{@minus{}@minus{}unselect=@var{fieldname}, @minus{}S @var{fieldname}}
+Select for processing only segments in which the field @var{fieldname}
+is absent.  Multiple occurrences of this option are allowed and are
+interpreted as conjunction of conditions. If this option is absent,
+all segments are processed.
+@end macro
+
+
+@macro paroneline
+@item @b{@minus{}@minus{}one-line}
+This option makes the program print ambiguous annotation in one output
+line by generating multiple annotation fields. By default, when
+ambiguous annotation is produced for a segment, the segment is
+duplicated and each of the annotations is added to a separate copy of
+the segment.
+@end macro
+
+
+@macro paronefield
+@item @b{@minus{}@minus{}one-field, @minus{}1}
+This option makes the program print ambiguous annotation in one
+annotation field. By default, when ambiguous annotation is produced
+for a segment, the segment is duplicated and each of the
+annotations is added to a separate copy of the segment.
+
+This option is useful when working with @command{kot} or @command{con}.
+@end macro
+
+
+@c ---------------------------------------------------------------------
+@c CONFIGURATION FILES
+@c ---------------------------------------------------------------------
+
+@node    Configuration files
+@chapter Configuration files
+
+Values for all command line options accepted by a component
+may be set in configuration files. The default locations of the
+configuration files for a component named @command{@var{program}} are
+
+@example
+	@file{/usr/local/etc/utt/@var{program}.conf}
+@end example
+
+for the system-wide configuration file and
+
+@example
+	@file{~/.utt/@var{program}.conf}
+@end example
+
+for the user configuration file.
+
+@c The configuration file to load may be also specified with the
+@c @option{--config} option. Configuration file need not be provided.
+
+For each option, the value is set according to the following priority:
+
+@itemize
+@item command line
+@c @item configuration file indicated with @option{--config} option
+@item user configuration file (or configuration file indicated with the @option{--config} option)
+@item system-wide configuration file
+@end itemize
+
+Parameter values are specified in the following format:
+
+@var{parametername}=@var{value}
+
+where @var{parametername} is the short or long name of an option accepted by
+the program, or
+
+@var{parametername}
+
+if the option does not need arguments.
+
+You can introduce comments to configuration files using the # sign.
+
+If a program accepts multiple occurrences of an option (e.g. @command{lem}'s select option), you can specify them on separate lines of the program's configuration file.
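+
+As an illustration of the parameter format described above, a user
+configuration file for @command{lem} might look as follows. This is
+only a sketch: it assumes that the long option names documented for
+@command{lem} are used as parameter names, and the dictionary path is
+the default system-wide location, which may differ on your installation.
+
+@example
+# ~/.utt/lem.conf (illustrative)
+dictionary=/usr/local/share/utt/pl_PL.ISO-8859-2/lem.bin
+process=W
+one-field
+@end example
+
+With such a file in place, an option given on the command line (for
+instance a different @option{--dictionary}) still takes precedence over
+the value set in the file.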
+
+@c The equal sign may be omitted.
+
+
+@quotation Tip
+If you have two (or more) frequently used sets of options for the same
+program (e.g. lem with the PMDBF dictionary and lem with a user dictionary),
+a good solution is to create two soft links to lem, called
+e.g. lemg and lemu, and specify their configuration in the files lemg.conf
+and lemu.conf respectively.
+@end quotation
+
+@c ---------------------------------------------------------------------
+@c COMPONENTS
+@c ---------------------------------------------------------------------
+
+@node UTT components
+@chapter UTT components
+
+UTT components are of three types:
+
+@menu
+Sources: programs which read non-UTT data (e.g. raw text) and produce output
+in UTT format
+* tok::         a tokenizer
+
+Filters: programs which read and produce UTT-formatted data
+* lem::         a morphological analyzer
+* gue::         a morphological guesser
+* cor::         a simple spelling corrector
+* kor::         a more elaborate spelling corrector
+* sen::         a sentencizer
+* ser::         a pattern search tool (marks matches)
+* mar::         a pattern search tool (introduces arbitrary markers into the text)
+* grp::         a pattern search tool (selects sentences containing a match)
+@c * gph::         a word-graph annotation tool::
+@c * dgp::         a dependency parser
+
+Sinks: programs which read UTT data and produce output in another format
+* kot::         an untokenizer
+* con::         a concordance table generator
+@end menu
+
+@c ---------------------------------------------------------------------
+@c TOK
+@c ---------------------------------------------------------------------
+
+@page
+@node tok
+@section tok - a tokenizer
+
+@c ----------------------------------------
+
+@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @strong{Authors:}                 @tab Tomasz Obrębski
+@item @strong{Component category:}      @tab source
+@item @strong{Input format:}            @tab raw text file
+@item @strong{Output format:}           @tab UTT regular
+@item @strong{Required annotation:}     @tab -
+@end multitable
+
+
+@menu
+* tok description::
+* tok input::
+* tok output::
+* tok command line options::
+* tok example::
+@end menu
+
+@node tok description
+@subsection Description
+
+@code{tok} is a simple program which reads a text file and identifies
+tokens on the basis of their orthographic form.  The type of the token
+is printed as the @var{type} field.
+
+@node tok input
+@subsection Input
+
+Raw text.
+
+@node tok output
+@subsection Output
+
+UTT-file with four fields: @var{start}, @var{length}, @var{type}, and @var{form}. In the @var{type} field five types of tokens are distinguished: 
+
+@itemize
+
+@item @code{W}
+(word)
+- continuous sequence of letters
+
+@item @code{N}
+(number)
+- continuous sequence of digits
+
+@item @code{S}
+(space)
+- continuous sequence of space characters
+
+@item @code{P}
+(punctuation mark)
+- single printable character not belonging to any of the other classes
+
+@item @code{B}
+(unprintable character)
+- single unprintable character
+
+@end itemize
+
+
+
+@node tok command line options
+@subsection Command line options
+
+@table @code
+
+@item @b{@minus{}@minus{}help}, @b{@minus{}h}
+Print help.
+
+@item @b{@minus{}@minus{}version}, @b{@minus{}V}
+Print version information.
+
+@item @b{@minus{}@minus{}interactive, @minus{}i}
+This option toggles interactive mode, which is by default off. In the
+interactive mode the program does not buffer the output.
+
+@end table
+
+@node tok example
+@subsection Example
+
+Input:
+
+@example
+Piszemy dobre programy.
+@end example
+
+Output:
+
+@example
+0000 07 W Piszemy
+0007 01 S _
+0008 05 W dobre
+0013 01 S _
+0014 08 W programy
+0022 01 P .
+0023 01 S \n
+@end example
+
+
+@c ---------------------------------------------------------------------
+@c SEN
+@c ---------------------------------------------------------------------
+
+@c @node sen - sentencizer
+@c @chapter sen - sentencizer
+
+@c Authors: Tomasz Obrębski
+
+@c ---------------------------------------------------------------------
+@c LEM
+@c ---------------------------------------------------------------------
+
+@page
+@node lem
+@section lem - morphological analyzer
+
+@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @strong{Authors:}                 @tab Tomasz Obrębski, Michał Stolarski
+@item @strong{Component category:}      @tab filter
+@item @strong{Input format:}            @tab UTT regular
+@item @strong{Output format:}           @tab UTT regular
+@item @strong{Required annotation:}     @tab tok
+@end multitable
+
+@menu
+* lem description::             
+* lem command line options::    
+* lem input::
+* lem output::
+* lem example::                 
+* lem dictionaries::            
+* lem hints::            
+@end menu
+
+@node lem description
+@subsection Description
+
+@command{lem} performs morphological analysis of a simple orthographic
+word, returning all its possible morphological annotations,
+disregarding the context.
+
+@c ----------------------------------------
+
+@node lem command line options
+@subsection Command line options
+
+@table @code
+@parhelp
+@parversion
+@parinteractive
+@c @parfile
+@c @paroutput
+@c @parfail
+@c @parcopy
+@parinputfield
+@paroutputfield
+@pardictionary
+@parprocess
+@parselect
+@parunselect
+@paroneline
+@paronefield
+@end table
+
+@c ----------------------------------------
+
+@node lem input
+@subsection Input
+
+@command{lem} reads a UTT file and processes the value of the @var{form} field
+(the input field may be changed with the @option{--input-field} option).
+
+@node lem output
+@subsection Output
+
+@command{lem} adds a new annotation field, whose default name is @code{lem}.  In
+case of ambiguity, either the segment is duplicated (default),
+multiple @code{lem} fields are added (@option{--one-line}), or the ambiguous
+annotation is produced as the value of a single @code{lem} field (option
+@option{--one-field,-1}):
+
+@itemize @bullet
+
+@item
+unambiguous value format:
+
+@example
+   <lemma>,<descr>
+@end example
+
+@item
+ambiguous value format (@option{--one-field} option)
+
+
+@example
+   <lemma>,<descr>[,<descr>][;<lemma>,<descr>[,<descr>]]
+@end example
+
+(alternative descriptions for the same lemma are separated by commas,
+alternative lemmata are separated by semicolons.)
+
+@end itemize
+
+@node lem example
+@subsection Example
+
+Input: 
+
+@example
+0000 07 W Piszemy
+0007 01 S _
+0008 05 W dobre
+0013 01 S _
+0014 08 W programy
+0022 01 P .
+0023 01 S \n
+@end example
+
+Output (default):
+
+@example
+0000 07 W Piszemy lem:pisać,V/AiVpMdTrfNpP1
+0007 01 S _
+0008 05 W dobre lem:dobry,ADJ/DpNpCnavGaifn
+0008 05 W dobre lem:dobry,ADJ/DpNsCnavGn
+0013 01 S _
+0014 08 W programy lem:program,N/GiNpCa
+0014 08 W programy lem:program,N/GiNpCn
+0014 08 W programy lem:program,N/GiNpCv
+0022 01 P .
+0023 01 S \n
+@end example
+
+Output (@option{--one-line} option):
+
+@example
+0000 07 W Piszemy lem:pisać,V/AiVpMdTrfNpP1
+0007 01 S _
+0008 05 W dobre lem:dobry,ADJ/DpNpCnavGaifn lem:dobry,ADJ/DpNsCnavGn
+0013 01 S _
+0014 08 W programy lem:program,N/GiNpCa lem:program,N/GiNpCn lem:program,N/GiNpCv
+0022 01 P .
+0023 01 S \n
+@end example
+
+Output (@option{--one-field} option):
+
+@example
+0000 07 W Piszemy lem:pisać,V/AiVpMdTrfNpP1
+0007 01 S _
+0008 05 W dobre lem:dobry,ADJ/DpNpCnavGaifn,ADJ/DpNsCnavGn
+0013 01 S _
+0014 08 W programy lem:program,N/GiNpCa,N/GiNpCn,N/GiNpCv
+0022 01 P .
+0023 01 S \n
+@end example
+
+@c ----------------------------------------
+
+@node lem dictionaries
+@subsection Dictionaries
+
+@command{lem} requires a dictionary. The dictionary may be provided in
+one of two formats: in text (source) format or in binary (fsa) format.
+
+@subsubheading Text format
+
+Dictionary entries have the following structure:
+
+@example
+<form>;<lemma>,<descr>[;<lemma>,<descr>]
+@end example
+
+@var{lemma} may be given explicitly or in the cut-add format:
+
+@example
+@code{[<cut1><add1>-]<cut2><add2>}
+@end example
+
+meaning: replace the prefix of length @code{<cut1>} with the
+string @code{<add1>}, and replace the suffix of length @code{<cut2>} with the string
+@code{<add2>}. For example, @code{3t} transforms @samp{kocie} into
+@samp{kot}, and @code{3-4ały} transforms @samp{najbielsi} into @samp{biały}.
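+
+As a step-by-step reading of the @code{3-4ały} transformation (worked
+out here from the rule above, not taken from program output; the order
+of the two steps does not affect the result):
+
+@example
+najbielsi   cut prefix of length 3 (naj),  add nothing   ->  bielsi
+bielsi      cut suffix of length 4 (elsi), add ały       ->  biały
+@end example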
+
+Each dictionary entry must be written in one line and must not contain blank characters.
+
+Examples:
+@example
+kot;0,N/GaNsCn
+kota;1,N/GaNsCg;1,N/GaNsCa
+kotu;1,N/GaNsCd
+kotem;2,N/GaNsCi
+kocie;3t,N/GaNsCl;3t,N/GaNsCv
+najbielsi;3-4ały,ADJ/DsNpCnGp
+najbielsze;3-5ały,ADJ/DsNpCnGaifn
+najlepsi;dobry,ADJ/DsNpCnGp
+najlepsze;dobry,ADJ/DsNpCnGaifn
+@end example
+
+
+The mandatory file name extension for a text dictionary is @code{dic}. For large
+dictionaries it is preferable, however, to compile them into binary
+(fsa) format.
+
+@subsubheading Binary format
+
+The mandatory file name extension for a binary dictionary is @code{bin}. To
+compile a text dictionary into binary format, write:
+
+@example
+compiledic <dictionaryname>.dic
+@end example
+
+@subsubheading Polex/PMDBF dictionary
+
+A large-coverage morphological dictionary for the Polish language, Polex/PMDBF, is included in
+the distribution as @command{lem}'s default dictionary. It is
+located by default in:
+
+@file{$HOME/.local/share/utt/pl_PL.ISO-8859-2/lem.bin}
+
+in a local installation, or in
+
+@file{/usr/local/share/utt/pl_PL.ISO-8859-2/lem.bin}
+
+in a system-wide installation.
+
+@node lem hints
+@subsection Hints
+
+@subsubheading Combining data from multiple dictionaries
+
+@itemize
+
+@item Apply <dict1>, then apply <dict2> to the words which were not annotated.
+
+@example
+lem -d <dict1> | lem -S lem -d <dict2>
+@end example
+
+@item Add annotations from two dictionaries <dict1> and <dict2>.
+
+@example
+lem -c -d <dict1> | lem -S lem -d <dict2>
+@end example
+
+@end itemize
+
+
+@c ---------------------------------------------------------------------
+@c GUE
+@c ---------------------------------------------------------------------
+
+@page
+@node gue
+@section gue - morphological guesser
+
+@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+
+@item @strong{Authors:}                 @tab Michał Stolarski, Tomasz Obrębski
+@item @strong{Component category:}      @tab filter
+
+@end multitable
+
+@menu
+* gue description::    
+* gue command line options::    
+* gue example::                 
+* gue dictionaries::            
+@end menu
+
+
+@node gue description
+@subsection Description
+
+@command{gue} guesses morphological descriptions of the form contained
+in the @var{form} field.
+
+
+@node gue command line options
+@subsection Command line options
+
+@table @code
+
+@parhelp
+@parversion
+@parinteractive
+@c @parfile
+@c @paroutput
+@c @parfail
+@c @parcopy
+@parinputfield
+@paroutputfield
+@pardictionary
+@parprocess
+@parselect
+@parunselect
+@paroneline
+@paronefield
+
+@item @b{@minus{}@minus{}delta=@var{n}}
+Stop displaying answers after a drop in weight, that is, when the weight difference between two subsequent results exceeds the delta value (default=`0.2').
+
+
+@item @b{@minus{}@minus{}cut-off=@var{n}}
+Do not display answers whose weight is lower than the cut-off value (default=`200').
+
+
+@item @b{@minus{}@minus{}guess_count=@var{n}, @minus{}n @var{n}}
+Guess up to n descriptions  (default=`0', which means 'display all results').
+
+
+
+@end table
+
+@node gue example
+@subsection Example
+
+@example
+command: gue -n 2 
+
+input:
+0000 07 W smerfny 
+
+output:
+0000 07 W smerfny gue:,ADJ/CaDpGiNs
+0000 07 W smerfny gue:,ADJ/CnvDpGaipNs
+@end example
+                                  
+
+@node gue dictionaries
+@subsection Dictionaries
+
+@command{gue} requires a dictionary. For now, the dictionary must be provided in binary (fsa) format.
+The fsa format is created by compiling text-format dictionaries.
+
+
+
+@subsubheading Text format
+
+Dictionary entries have the following structure:
+
+@example
+@var{prefix}@code{*}@var{suffix}@code{;}@var{lemma}@code{,}@var{description}@code{:}@var{weight}
+@end example
+
+@var{lemma} must be given in the cut-add format:
+
+@example
+@code{[<cut1><add1>-]<cut2><add2>}
+@end example
+(with no spaces in between), meaning: replace the prefix of length @var{cut1} with the
+string @var{add1}, and replace the suffix of length @var{cut2} with the string
+@var{add2}.
+
+
+Example: @code{3-4ały} transforms @i{najbielsi} into @i{biały}.
+
+
+@var{description} contains the part of speech and morphosyntactic information (@pxref{PMDBF dictionary}).
+
+@var{weight} is an integer value between 1 and 999 indicating the
+likelihood of the guess.
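+
+As an illustration of the entry structure, the following Python sketch splits one text-format
+entry into its parts. It is not taken from the @command{gue} sources: the names
+@code{GueEntry} and @code{parse_gue_entry} are hypothetical, and the sample entry (including
+its weight of 500) is made up for the example.
+
```python
from collections import namedtuple

GueEntry = namedtuple("GueEntry", "prefix suffix lemma description weight")


def parse_gue_entry(line):
    """Split one text-format entry into its five parts."""
    pattern, rest = line.strip().split(";", 1)
    prefix, suffix = pattern.split("*", 1)
    lemma, tail = rest.split(",", 1)
    # The weight follows the last colon of the entry.
    description, weight = tail.rsplit(":", 1)
    return GueEntry(prefix, suffix, lemma, description, int(weight))


# Hypothetical entry; the weight 500 is invented for this example.
entry = parse_gue_entry("naj*elsi;3-4ały,ADJ/DsNpCnGp:500")
print(entry.prefix, entry.suffix, entry.lemma, entry.description, entry.weight)
```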
+
+@c @example
+@c *łkę;1a,N/GfNsCa
+@c naj*elszy;3-4ały,ADJ/...:...
+@c @end example
+
+
+@c ---------------------------------------------------------------------
+@c COR
+@c ---------------------------------------------------------------------
+
+@page
+@node cor
+@section cor - spelling corrector
+
+@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @strong{Authors:}                 @tab Tomasz Obrębski, Michał Stolarski
+@item @strong{Component category:}      @tab filter
+@item @strong{Input format:}            @tab UTT regular
+@item @strong{Output format:}           @tab UTT regular
+@item @strong{Required annotation:}     @tab tok
+@end multitable
+
+@menu
+* cor description::
+* cor command line options::    
+* cor dictionaries::            
+@end menu
+
+
+@node cor description
+@subsection Description
+
+The spelling corrector applies Kemal Oflazer's dynamic programming
+algorithm @cite{oflazer96} to the FSA representation of the set of
+word forms of the Polex/PMDBF dictionary. Given an incorrect
+word form, it returns all word forms present in the dictionary whose
+edit distance from that form does not exceed the threshold set with the
+@option{--distance} option.
+
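+The relation between the input form, the dictionary and the threshold can be sketched in
+Python as follows. This is only an illustration, not the method used by @command{cor}: it
+compares the form with every word of a plain list using the ordinary Levenshtein distance,
+whereas @command{cor} runs Oflazer's algorithm over an FSA. The function names are
+hypothetical, and the threshold is assumed to be inclusive, following the description of the
+distance option as the maximum edit distance.
+
```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute or match
            ))
        previous = current
    return previous[-1]


def corrections(form, words, max_distance=1):
    """Return the dictionary words within max_distance of form (assumed inclusive)."""
    return [w for w in words if edit_distance(form, w) <= max_distance]


words = ["odlot", "odlotowy", "odludek"]
print(corrections("odlut", words))    # ['odlot']
print(corrections("odlodek", words))  # ['odludek']
```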
+
+@node cor command line options
+@subsection Command line options
+
+@table @code
+
+@parhelp
+@parversion
+@parinteractive
+@c @parfile
+@c @paroutput
+@c @parfail
+@c @parcopy
+@parinputfield
+@paroutputfield
+@pardictionary
+@parprocess
+@parselect
+@parunselect
+@paroneline
+@paronefield
+
+@item @b{@minus{}@minus{}distance=@var{int}, @minus{}n @var{int}}
+Maximum edit distance (default='1').
+
+@c @item @b{@minus{}@minus{}replace, @minus{}r}
+@c Replace original form with corrected form, place original form in the
+@c cor field. This option has no effect in @option{--one-*} modes (default=off)
+
+
+@end table
+
+@node cor dictionaries
+@subsection Dictionaries
+
+@command{cor} requires a dictionary. The dictionary has to be provided in binary (fsa) format. 
+The fsa format is created by compiling text-format dictionaries.
+
+@subsubheading Text format
+
+The @command{cor} dictionary is a list of words:
+@example
+odlot
+odlotowy
+odludek
+@end example
+
+@subsubheading Binary format
+
+The mandatory file name extension for a binary dictionary is @code{bin}. To
+compile a text dictionary into binary format, write:
+
+@example
+compiledic <dictionaryname>.dic
+@end example
+
+@c ---------------------------------------------------------------------
+@c KOR
+@c ---------------------------------------------------------------------
+
+@page
+@node kor
+@section kor - configurable spelling corrector
+
+@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @strong{Authors:}                 @tab Paweł Werenski, Tomasz Obrębski, Michał Stolarski
+@item @strong{Component category:}      @tab filter
+@item @strong{Input format:}            @tab UTT regular
+@item @strong{Output format:}           @tab UTT regular
+@item @strong{Required annotation:}     @tab tok
+@end multitable
+
+@menu
+* kor description::
+* kor command line options::
+* kor weights definition file::    
+* kor dictionaries::            
+@end menu
+
+
+@node kor description
+@subsection Description
+
+The spelling corrector applies Paweł Werenski's dynamic programming
+algorithm to the FSA representation of the set of word forms of the
+Polex/PMDBF dictionary. The algorithm is an extension of K. Oflazer's
+algorithm used by @command{cor}. In the extended version it is
+possible to assign weights to individual edit operations.
+
+Given an incorrect word form, it returns all word forms
+present in the dictionary whose edit distance from that form does not
+exceed the threshold set with the @option{--distance} option.
+
+
+@node kor command line options
+@subsection Command line options
+
+@table @code
+
+@parhelp
+@parversion
+@parinteractive
+@c @parfile
+@c @paroutput
+@c @parfail
+@c @parcopy
+@parinputfield
+@paroutputfield
+@pardictionary
+@parprocess
+@parselect
+@parunselect
+@paroneline
+@paronefield
+
+@item @b{@minus{}@minus{}distance=@var{int}, @minus{}n @var{int}}
+Maximum edit distance (default='1').
+
+@item @b{@minus{}@minus{}weights=@var{filename}, @minus{}w @var{filename}}
+Edit operations' weights file.
+
+@c @item @b{@minus{}@minus{}replace, @minus{}r}
+@c Replace original form with corrected form, place original form in the
+@c cor field. This option has no effect in @option{--one-*} modes (default=off)
+
+
+@end table
+
+
+@node kor weights definition file
+@subsection Weights definition file
+
+Example:
+
+@example
+
+%stdcor 1
+%xchg   1
+ż  rz 0.5
+ch h  0.5
+u  ó  0.5
+
+@end example
+
+
+The default weight of an edit operation is set to 1 (@code{%stdcor 1}), the weight of the exchange
+operation is set to 1 (@code{%xchg 1}), and the three principal orthographic
+errors are assigned the weight 0.5.
+
+The edit operation weight declaration, such as
+
+@example
+ż  rz 0.5
+@end example
+
+works in both directions, i.e. ż->rz and rz->ż.
+
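+How such a weights file could influence the edit distance is sketched below in Python. This
+is a sketch under stated assumptions, not the algorithm implemented in @command{kor}: the
+function names are hypothetical, the exchange operation is assumed to mean swapping two
+neighbouring characters, and the distance is computed between two plain strings rather than
+over an FSA.
+
```python
WEIGHTS = """
%stdcor 1
%xchg   1
ż  rz 0.5
ch h  0.5
u  ó  0.5
"""


def parse_weights(text):
    """Read %stdcor, %xchg and symmetric replacement rules from text."""
    std, xchg, rules = 1.0, 1.0, []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "%stdcor":
            std = float(parts[1])
        elif parts[0] == "%xchg":
            xchg = float(parts[1])
        else:
            left, right, weight = parts[0], parts[1], float(parts[2])
            # A declaration works in both directions.
            rules.append((left, right, weight))
            rules.append((right, left, weight))
    return std, xchg, rules


def weighted_distance(a, b, std, xchg, rules):
    """Cheapest way to turn a into b with the given operation weights."""
    inf = float("inf")
    dist = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    dist[0][0] = 0.0
    for i in range(len(a) + 1):
        for j in range(len(b) + 1):
            best = dist[i][j]
            if i > 0:
                best = min(best, dist[i - 1][j] + std)       # deletion
            if j > 0:
                best = min(best, dist[i][j - 1] + std)       # insertion
            if i > 0 and j > 0:
                step = 0.0 if a[i - 1] == b[j - 1] else std  # match or substitution
                best = min(best, dist[i - 1][j - 1] + step)
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                best = min(best, dist[i - 2][j - 2] + xchg)  # assumed: swap of neighbours
            for left, right, weight in rules:
                if a[:i].endswith(left) and b[:j].endswith(right):
                    best = min(best, dist[i - len(left)][j - len(right)] + weight)
            dist[i][j] = best
    return dist[len(a)][len(b)]


std, xchg, rules = parse_weights(WEIGHTS)
print(weighted_distance("morze", "może", std, xchg, rules))  # 0.5
```
+
+With these weights, a typical orthographic error such as @samp{rz} written for @samp{ż}
+costs 0.5, while an unrelated substitution costs the default weight 1.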
+The default weights definition file for @code{kor} is:
+
+@example
+$HOME/.local/share/utt/weights.kor
+@end example
+
+or, if the above mentioned file is absent:
+
+@example
+/usr/local/share/utt/weights.kor
+@end example
+
+
+@node kor dictionaries
+@subsection Dictionaries
+
+@xref{cor dictionaries}.
+
+@c ---------------------------------------------------------------------
+@c SEN
+@c ---------------------------------------------------------------------
+
+@page
+@node sen
+@section sen - a sentensizer
+
+@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+
+@item @strong{Authors:}                 @tab Tomasz Obrębski
+@item @strong{Component category:}      @tab filter
+@item @strong{Input format:}            @tab UTT regular
+@item @strong{Output format:}           @tab UTT regular
+@item @strong{Required annotation:}     @tab tok
+
+@end multitable
+
+
+@menu
+* sen description::
+@c * sen input::
+@c * sen output::
+* sen example::                 
+@end menu
+
+@node sen description
+@subsection Description
+
+@command{sen} detects sentence boundaries in UTT-formatted texts and marks them with special zero-length segments, in which the @var{type} field may contain the BOS (beginning of sentence) or EOS (end of sentence) annotation. 
+
+@node sen example
+@subsection Example
+
+@example
+command: sen
+
+input:
+0000 05 W Cześć
+0005 01 P !
+0006 01 S _
+0007 02 W To
+0009 01 S _
+0010 02 W ja
+0012 01 P .
+0013 01 S \n
+
+output:
+0000 00 BOS *
+0000 05 W Cześć
+0005 01 P !
+0006 00 EOS *
+0006 00 BOS *
+0006 01 S _
+0007 02 W To
+0009 01 S _
+0010 02 W ja
+0012 01 P .
+0013 01 S \n
+0014 00 EOS *
+@end example
+
+
+@c ---------------------------------------------------------------------
+@c GPH
+@c ---------------------------------------------------------------------
+
+@c @node gph - graphizer
+@c @chapter gph - graphizer
+
+@c Authors: Tomasz Obrębski
+
+
+
+@c ---------------------------------------------------------------------
+@c SER
+@c ---------------------------------------------------------------------
+
+@page
+@node ser
+@section ser - pattern search tool
+
+@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @strong{Authors:}                 @tab Tomasz Obrębski
+@item @strong{Component category:}      @tab filter
+@item @strong{Input format:}            @tab UTT regular
+@item @strong{Output format:}           @tab UTT regular
+@item @strong{Required annotation:}     @tab tok,  lem --one-field
+@end multitable
+
+@menu
+* ser description::
+* ser command line options::    
+* ser pattern::                 
+* ser how ser works::           
+* ser customization::           
+* ser limitations::             
+* ser requirements::            
+@end menu
+
+
+@node ser description
+@subsection Description
+
+@command{ser} looks for patterns in UTT-formatted texts.
+
+
+@c ---------------------------------------------------------------------
+@node ser command line options
+@subsection Command line options
+
+@table @code
+
+@parhelp
+@parversion
+@c @parfile
+@c @paroutput
+@c @parinputfield
+@c @paroutputfield
+@parprocess
+@parinteractive
+
+@item @b{@minus{}@minus{}pattern=@var{pattern}, @minus{}e @var{pattern}}
+The search pattern.
+
+@item @b{@minus{}@minus{}morph=@var{field}}
+The name of the annotation field containing the morphological
+description (default @code{lem}).
+
+@item @b{@minus{}@minus{}flex}
+Only print the generated flex source code.
+
+@item @b{@minus{}@minus{}macro=@var{filename}}
+Read macro definitions from file @var{filename} rather than from the
+default location. This option makes it possible to redefine the set of terms.
+
+@item @b{@minus{}@minus{}define=@var{filename}}
+Append macro definitions from file @var{filename}. This option
+makes it possible to extend the set of terms.
+
+@end table
+
+
+@c ---------------------------------------------------------------------
+@node ser pattern
+@subsection Pattern
+
+The @command{ser} pattern is a regular expression over terms corresponding
+to text segments or segment sequences. Predefined terms are:
+
+@table @code
+
+@item seg(@var{t},@var{f},@var{a})
+a segment of type @var{t}, containing form @var{f} and annotation
+@var{a}
+
+@item form(@var{f})
+a segment containing form @var{f}
+
+@item field(@var{f})
+a segment containing annotation field @var{f}
+
+@item space(@var{f})
+a space segment of form @var{f}
+
+@item word(@var{f})
+a word segment of form @var{f}
+
+@item punct(@var{f})
+a punct segment of form @var{f}
+
+@item number(@var{f})
+a number segment of form @var{f}
+
+@item lexeme(@var{f})
+a word segment with lemma @var{f}
+
+@item cat(@var{c})
+a word segment of category @var{c}
+
+@end table
+
+All arguments are optional. If an argument is omitted, an arbitrary
+string of non-blank characters is assumed as the argument value. Term
+arguments may be arbitrary character-level regular expressions. The
+following special symbols can be used:
+
+@multitable {aaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @code{[@dots{}]}            @tab a character class
+@item @code{[^@dots{}]}           @tab a negated character class
+@item @code{|}                    @tab alternative
+@item @code{*}                    @tab repetition, including zero times
+@item @code{+}                    @tab repetition, at least one time
+@item @code{?}                    @tab optionality
+@item @code{@{@var{m},@var{n}@}}  @tab repetition from @var{m} to @var{n} times
+@item @code{@{@var{m},@}}         @tab repetition @var{m} or more times
+@item @code{@{@var{m}@}}          @tab repetition @var{m} times
+@item @code{\@var{ddd}}           @tab the character with octal value @var{ddd}
+@item @code{\x@var{hh}}           @tab the character with hexadecimal value @var{hh}
+@item @code{( )}                  @tab parentheses, used to override precedence
+@c @end multitable
+
+@c @multitable {aaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @code{.}    @tab a non-blank character
+@item @code{\w}   @tab a letter
+@item @code{\W}   @tab a non-blank character other than a letter
+@item @code{\d}   @tab a digit
+@item @code{\D}   @tab a non-blank character other than a digit
+@item @code{\s}   @tab a space or tab character
+@item @code{\S}   @tab a non-blank character (the same as @code{.})
+@item @code{\l}   @tab a lowercase letter
+@item @code{\L}   @tab an uppercase letter
+@end multitable
+
+
+@noindent The following characters:
+@example
+@verb{%  [   ]   ^   |   *   +   ?   {   }   ,   .   <   >   \ %}
+@end example
+must be escaped with a backslash, i.e. written as:
+@example
+@verb{% \[  \]  \^  \|  \*  \+  \?  \{  \}  \,  \.  \<  \>  \\ %}
+@end example
+
+@quotation Note
+The special symbols are borrowed from Perl with minor modifications
+made for convenience, so the meaning of certain special characters and
+sequences differs slightly from their common usage. In particular,
+the meaning of the @code{.} special character is modified because of
+the special function of spaces in UTT files (they are field
+separators). Use @code{\s} to explicitly match a space or tab character.
+@end quotation
+
+In the argument of the @code{cat} term, a special operator @code{<...>} may be
+used. A category specification enclosed in angle brackets matches all
+category descriptions which are consistent (non-contradictory) with the
+specification. For example, @code{<N>} matches all noun descriptions, and
+@code{<ADJ/Can>} matches all adjectives in the accusative or nominative case.
+
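+One possible reading of the consistency test performed by the angle-bracket operator is
+sketched below in Python. It is an assumption, not the actual @command{ser} implementation
+(which generates flex code): a description is taken to consist of a part of speech, a slash,
+and attributes written as an uppercase letter followed by lowercase values; an attribute
+missing from the description is treated as compatible with any specification. The function
+names are hypothetical.
+
```python
import re


def parse_category(description):
    """Split 'ADJ/CaDpGiNs' into ('ADJ', [('C', 'a'), ('D', 'p'), ...])."""
    pos, _, attributes = description.partition("/")
    # Each attribute is an uppercase letter followed by lowercase values.
    return pos, re.findall(r"([A-Z])([a-z]*)", attributes)


def consistent(specification, description):
    """True if description does not contradict specification."""
    spec_pos, spec_attrs = parse_category(specification)
    desc_pos, desc_attrs = parse_category(description)
    if spec_pos != desc_pos:
        return False
    desc_values = dict(desc_attrs)
    for name, values in spec_attrs:
        if name in desc_values and not set(values) & set(desc_values[name]):
            return False  # the two share no value for this attribute
    return True


print(consistent("ADJ/Can", "ADJ/CaDpGiNs"))  # True
print(consistent("N/Ca", "N/GaNsCg"))         # False
print(consistent("N", "N/GaNsCn"))            # True
```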
+
+@*
+@noindent @b{Examples of one-segment patterns:}
+
+@multitable {aaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @code{seg}            @tab any segment
+@item @code{word}           @tab any word-form
+@item @code{word(pomocy)}   @tab the word-form @samp{pomocy}
+@item @code{word(naj.+)}    @tab a word-form beginning with @samp{naj}
+@item @code{word(\L\l+)}    @tab a capitalized word-form
+@item @code{punct}          @tab a punctuation character
+@item @code{space(.*\\n.*)} @tab a space segment containing a newline character
+@item @code{lexeme(pomoc)}  @tab any form of the lexeme 'pomoc'
+@item @code{cat(N/.*)}      @tab a word whose category starts with @code{N/}
+@item @code{cat(<N/Ca>)}    @tab a word whose category matches @code{N/Ca}
+@end multitable
+
+@*
+@noindent @b{Examples of multi-segment patterns:}
+
+@table @code
+
+@item (word(\L) punct(\.) space?)+ word(\L\l+)
+a sequence of initials followed by a surname
+
+@item punct seg(W|S|N)* cat(<NPRO/Sr>) seg(W|S|N)* punct
+a text fragment between two punctuation characters, containing an
+occurrence of a relative pronoun
+
+@end table
+
+
+@node ser how ser works
+@subsection How ser works
+
+@node ser customization
+@subsection Customization
+
+@c All predefined terms correspond to single segments, 
+
+@example
+define(`verbseq', `(cat(<V>) (space cat(<V>)))')
+@end example
+
+
+the term @code{cat()} may not be used as a ... of 
+
+@c See @command{m4} manual for further details on macro definition format.
+
+@node ser limitations
+@subsection Limitations
+
+Do not use more than 3 attributes inside @code{<...>}.
+
+@node ser requirements
+@subsection Requirements
+
+In order to run @command{ser}, the following programs must be
+installed in the system:
+
+@itemize
+
+@item @command{m4}
+@item @command{grep}
+@item @command{flex}
+@item @command{gcc}
+
+@end itemize
+
+
+@c ---------------------------------------------------------------------
+@c GRP
+@c ---------------------------------------------------------------------
+
+@page
+@node grp
+@section grp - pattern search tool
+
+@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @strong{Authors:}                 @tab Tomasz Obrębski
+@item @strong{Component category:}      @tab filter
+@item @strong{Input format:}            @tab UTT flattened
+@item @strong{Output format:}           @tab UTT flattened
+@item @strong{Required annotation:}     @tab tok, sen, lem --one-field
+@end multitable
+
+
+@menu
+* grp description::
+* grp command line options::    
+* grp pattern::                 
+* grp hints::    
+@end menu
+
+
+@node grp description
+@subsection Description
+
+@command{grp} selects sentences containing an expression matching a
+pattern. The pattern format is exactly the same as that accepted by
+@command{ser}.
+
+@command{grp} is intended mainly for speeding up the corpus search process.
+It is extremely fast (the processing speed is usually higher than the speed
+of reading the corpus file from disk).
+
+@node grp command line options
+@subsection Command line options
+
+@table @code
+
+@parhelp
+@parversion
+@parprocess
+@parinteractive
+
+@item @b{@minus{}@minus{}pattern=@var{pattern}, @minus{}e @var{pattern}}
+The search pattern.
+
+@item @b{@minus{}@minus{}morph=@var{field}}
+The name of the annotation field containing the morphological
+description (default @code{lem}).
+
+@item @b{@minus{}@minus{}command}
+Only print the generated flex source code.
+
+@item @b{@minus{}@minus{}macro=@var{filename}}
+Read macro definitions from file @var{filename} rather than from the
+default location. This option makes it possible to redefine the set of terms.
+
+@item @b{@minus{}@minus{}define=@var{filename}}
+Append macro definitions from file @var{filename}. This option
+makes it possible to extend the set of terms.
+
+@end table
+
+
+@node grp pattern
+@subsection Pattern
+
+The pattern format is the same as for @command{ser} (@pxref{ser pattern}).
+
+@node grp hints
+@subsection Hints
+
+The corpus search speed may be increased by combining @command{grp} with the @command{lzop}
+compression tool (@command{grp} usually processes data faster than it is read from
+disk, especially on slow laptop drives).
+
+@example
+cat corpus | tok | sen | lem -1 | fla | lzop -7 > corpus.grp.lzo
+@end example
+
+@example
+lzop -cd corpus.grp.lzo | grp -e @var{EXPR} | unfla | ser -e @var{EXPR}
+@end example
+
+
+
+@c ---------------------------------------------------------------------
+@c MAR
+@c ---------------------------------------------------------------------
+
+@page
+@node mar
+@section mar
+
+@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @strong{Authors:}                 @tab Marcin Walas, Tomasz Obrębski
+@item @strong{Input format:}            @tab UTT flattened
+@item @strong{Output format:}           @tab UTT flattened
+@item @strong{Required annotation:}     @tab tok, sen, lem -1
+@end multitable
+
+@subsection Description
+@code{mar} is a Perl script that matches a given pattern against UTT-formatted text
+and marks the matching parts with any number of user-defined tags.
+
+@subsection Command line options
+@table @code
+@parhelp
+@parversion
+
+@item @b{@minus{}@minus{}pattern=@var{pattern}, @minus{}e @var{pattern}}
+The search pattern.
+@item @b{@minus{}@minus{}action=@var{action}, @minus{}a @var{action} [p] [s] [P]}
+Perform only the indicated actions, where:
+@multitable {aaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @code{p}   @tab preprocess
+@item @code{s}   @tab search
+@item @code{P}   @tab postprocess
+@end multitable
+default: psP
+
+@item @b{@minus{}@minus{}command}
+print generated sed command, then exit
+
+@item @b{@minus{}@minus{}help, @minus{}h}
+print help, then exit
+
+@item @b{@minus{}@minus{}version, @minus{}v}
+print version, then exit
+@end table
+@subsection Tokens in pattern
+The @code{mar} pattern is based on the @code{ser} pattern (@pxref{ser pattern}). It is a @code{ser} pattern
+to which you can add any number of matching tags, which will be printed in exactly the place where
+they were put in the pattern. A valid token starts with @@ followed by any number of alphanumeric
+characters. For example, valid match tokens are: @@STARTMATCH @@ENDMATCH
+
+Matching tokens can be placed between, before or after any of the @code{ser} pattern terms. They do not have
+to be paired. There can be any number of them in the pattern (zero or more). They do not have to be unique.
+They can be placed one after another. For example:
+
+@multitable {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @code{@@BOM lexeme(pomoc)}  @tab place tag @b{BOM} before any form of the lexeme 'pomoc'
+@item @code{@@MATCH lexeme(pomoc) @@MATCH}      @tab place tag @b{MATCH} before and after any form of the lexeme 'pomoc'
+@item @code{cat(<ADJ>) @@MATCH lexeme(pomoc) @@MATCH}      @tab place tag @b{MATCH} before and after any form of the lexeme 'pomoc' which is preceded by an adjective
+@item @code{cat(<ADJ>) @@TAG @@BOM lexeme(pomoc) @@EOM}      @tab place tags @b{TAG} and @b{BOM} before any form of the lexeme 'pomoc' which is preceded by an adjective, and tag @b{EOM} after it
+@end multitable
+
+(See the @code{mar} help, @code{mar -h}, for more information.)
+
+@subsection How mar works
+@code{mar} uses the @code{m4} macro processor to translate the given @code{ser} pattern into a regular expression, which it then turns into a @code{sed} command script and executes.
+
+You can see the translated @code{sed} script by using the @code{@minus{}@minus{}command} option.
+@subsection Limitations
+The complexity of the computations performed by @code{mar} increases linearly with the number of tokens placed in the pattern, so it is highly recommended not to use too many tokens.
+@subsection Requirements
+In order to run @code{mar}, the following programs must be installed in the system:
+
+@itemize
+
+@item @command{m4}
+@item @command{grep}
+@item @command{sed}
+
+@end itemize
+
+
+
+@c ---------------------------------------------------------------------
+@c KOT
+@c ---------------------------------------------------------------------
+
+@page
+@node kot
+@section kot - untokenizer
+
+@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @strong{Authors:}                 @tab Tomasz Obrębski
+@item @strong{Component category:}      @tab filter
+@item @strong{Input format:}            @tab UTT regular
+@item @strong{Output format:}           @tab text
+@item @strong{Required annotation:}     @tab tok
+@end multitable
+
+
+@menu
+* kot description::
+* kot command line options::    
+* kot usage examples::    
+@end menu
+
+@node kot description
+@subsection Description
+
+@command{kot} transforms a UTT formatted file back into raw text format.
+
+@node kot command line options
+@subsection Command line options
+
+@table @code
+
+@parhelp
+
+@c @item @b{@minus{}@minus{}version}, @b{@minus{}v}
+
+@c @item @b{@minus{}@minus{}file=@var{filename}, @minus{}f @var{filename}}
+
+@c @item @b{@minus{}@minus{}output=@var{filename}, @minus{}o @var{filename}}
+
+@c @item @b{@minus{}@minus{}interactive @minus{}i}
+
+@c @item @b{@minus{}@minus{}config=@var{filename}}
+
+
+@item @b{@minus{}@minus{}gap-fill=@var{string}, @minus{}g @var{string}}
+print @var{string} between nonadjacent segments of the input file
+
+@item @b{@minus{}@minus{}spaces, @minus{}r}
+retain the special characters @code{_}, @code{\t},
+@code{\n}, @code{\r}, @code{\f} unexpanded in the output
+
+@end table
+
+@node kot usage examples
+@subsection Usage examples
+
+@example
+cat legia.txt | tok | kot	
+@end example
+
+@example
+cat legia.txt | tok | lem -1 | kot
+@end example
+
+@c ---------------------------------------------------------------
+@c CON
+@c ---------------------------------------------------------------
+
+
+@page
+@node con
+@section con - concordance table generator
+
+@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @strong{Authors:}                 @tab Justyna Walkowska
+@item @strong{Component category:}      @tab sink
+@item @strong{Input format:}            @tab UTT regular
+@item @strong{Output format:}           @tab text
+@item @strong{Required annotation:}     @tab ser or mar
+@end multitable
+@c
+
+@menu
+* con description::
+* con command line options::
+* con usage example::
+* con hints::    
+@end menu
+
+
+@node con description
+@subsection Description
+
+@command{con} generates a concordance table based on a pattern given to @command{ser}.
+
+
+@node con command line options
+@subsection Command line options
+
+@table @code
+
+@parhelp
+
+@c @item @b{@minus{}@minus{}help}, @b{@minus{}h}
+@c @item @b{@minus{}@minus{}version}, @b{@minus{}v}
+@c @item @b{@minus{}@minus{}file=@var{filename}, @minus{}f @var{filename}}
+@c @item @b{@minus{}@minus{}output=@var{filename}, @minus{}o @var{filename}}
+@c @item @b{@minus{}@minus{}fail=@var{filename}, @minus{}e @var{filename}} [???]
+@c @item @b{@minus{}@minus{}copy, @minus{}c} [???]
+@c @item @b{@minus{}@minus{}input-field=@var{fieldname}, @minus{}I @var{fieldname}}
+@c @item @b{@minus{}@minus{}output-field=@var{fieldname}, @minus{}O @var{fieldname}}
+@c @item @b{@minus{}@minus{}process=@var{class}, @minus{}p @var{class}}
+@c @item @b{@minus{}@minus{}interactive @minus{}i}
+@c @item @b{@minus{}@minus{}config=@var{filename}}
+@c @item
+@c @item @b{@minus{}@minus{}pattern=@var{pattern}, @minus{}e @var{pattern}}
+@c search pattern
+@c 
+@c @item @b{@minus{}@minus{}flex}
+@c only print the generated flex source code
+@c 
+@c @item @b{@minus{}@minus{}macro=@var{filename}}
+@c read macrodefinitions from file @var{filename} rather than from
+@c default location. This option allows to redefine the set of terms.
+@c 
+@c @item @b{@minus{}@minus{}define=@var{filename}}
+@c append macrodefinitions from file @var{filename}. This option
+@c allows to extend the set of terms.
+
+@item @b{@minus{}@minus{}left @minus{}l}
+Left context info (default='30c'). Example:
+@example
+-l=5c: left context is 5 characters
+-l=5w: left context is 5 words
+-l=5s: left context is 5 non-empty input lines
+-l='\s*\S+\sr\S+BOS': left context starts with the given regex
+@end example
+
+@item @b{@minus{}@minus{}right @minus{}r}            
+	Right context info (default='30c').
+@item @b{@minus{}@minus{}trim @minus{}t}            
+	Clear incomplete words from output.
+@item @b{@minus{}@minus{}white @minus{}w}
+Do not change whitespace characters into spaces.
+@item @b{@minus{}@minus{}column @minus{}c}            
+	Left column minimal width in characters (default = 0).
+@item @b{@minus{}@minus{}ignore @minus{}i}            
+	Ignore segment inconsistency in the input.
+@item @b{@minus{}@minus{}bom}            
+	Beginning of selected segment (regex, default='[0-9]+ [0-9]+ BOM .*').
+@item @b{@minus{}@minus{}eom}            
+	End of selected segment (regex, default='[0-9]+ [0-9]+ EOM .*').
+@item @b{@minus{}@minus{}bod}            
+	Selected segment beginning display string (default='[').
+@item @b{@minus{}@minus{}eod}            
+	Selected segment end display string (default=']').
+
+
+
+@end table
+
+@node con usage example
+@subsection Usage example
+@example
+cat file.txt | tok | lem -1 | ser -e 'lexeme(dom)' | con  
+@end example
+
+
+@node con hints
+@subsection Hints
+
+@command{con} is a rather slow program. Do not pass large amounts of
+redundant text through this program. @command{con} works fine in the following
+sequence:
+
+@example
+... | grp -e EXPR | ser -e EXPR | con
+@end example
+
+
+@c ---------------------------------------------------------------------
+@c ---------------------------------------------------------------------
+
+@page
+@node Auxiliary tools
+@chapter Auxiliary tools
+
+@menu
+* compiledic::         dictionary compiler
+* fla::                UTT file flattener
+* unfla::              UTT file unflattener
+@end menu
+
+
+@page
+@node compiledic
+@section compiledic - the dictionary compiler
+
+@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @strong{Authors:}                 @tab Michał Stolarski, Tomasz Obrębski
+@item @strong{Component category:}      @tab additional tool
+@end multitable
+@c
+
+@command{compiledic} compiles dictionaries in text format (@code{.dic} extension) into binary
+(FSA) format (@code{.bin} extension).
+
+Automaton representation of a dictionary is built using the AT&T tools:
+@itemize
+@item AT&T FSM Library,
+@item AT&T Lextools.
+@end itemize
+
+In order for the compiledic program to work you have to install the
+above mentioned packages into your system.  They are freely available
+for non-commercial use.
+
+Usage:
+@example
+        compiledic <dictionaryname>.dic
+@end example
+
+The file <dictionaryname>.bin will be generated.
+
+Note: The program produces a lot of temporary files, which are
+stored in the current directory. They are deleted after successful
+termination of the program.
+
+@c @menu
+@c * con command line options::
+@c * con usage example::
+@c * con hints::    
+@c @end menu
+
+
+@c -------------------------------------------------------------------------------
+@c FLA
+@c -------------------------------------------------------------------------------
+
+@page
+@node fla
+@section fla - the UTT file flattener
+
+@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @strong{Authors:}                 @tab Tomasz Obrębski
+@item @strong{Input format:}            @tab UTT regular
+@item @strong{Output format:}           @tab UTT flattened
+@item @strong{Required annotation:}     @tab sen
+@end multitable
+@c
+
+@menu
+* fla description::
+@c * fla command line options::
+@c * fla usage example::
+@end menu
+
+
+@node fla description
+@subsection Description
+
+@command{fla} ``flattens'' a UTT file by merging the segments belonging
+to one sentence into one line. Technically, the end-of-line characters
+('\n', ASCII code 10) between the segments of a sentence are replaced
+with form-feed characters ('\f', ASCII code 12).  Flattening makes it
+possible to process UTT files sentence by sentence with such tools as
+@command{grep} or @command{sed} (this technique is used in
+@command{grp} and @command{mar}).
+
+Flattened files should have the suffix @code{.fla}, e.g. @file{thetext.utt.fla}.
+
+Flattened files are still human-readable.
+
+Usage:
+
+@example
+        fla [<bosregex>]
+@end example
+
+The optional argument is a regular expression describing the segments
+which should be treated as sentence beginnings: a segment qualifies if
+it contains a fragment matching @code{<bosregex>}. By default,
+segments containing a @code{BOS} field are looked for.
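+
+The flattening and unflattening steps described above can be sketched in
+Python as follows. This is only an illustration, not the actual
+@command{fla} or @command{unfla} implementation: the function names, the
+default pattern and the sample segment lines are assumptions made for
+the example.
+
```python
import re


def flatten(segments, bos_regex=r"\bBOS\b"):
    """Merge the segments of each sentence into one line.

    A new sentence starts at every segment containing a match for
    bos_regex (assumed default: a BOS field). The segments of one
    sentence are joined with the form-feed character.
    """
    bos = re.compile(bos_regex)
    sentences, current = [], []
    for segment in segments:
        if bos.search(segment) and current:
            sentences.append("\f".join(current))
            current = []
        current.append(segment)
    if current:
        sentences.append("\f".join(current))
    return sentences


def unflatten(sentences):
    """Restore one segment per line by splitting on form feeds."""
    return [segment for line in sentences for segment in line.split("\f")]


# Made-up segment lines; the real UTT fields may look different.
segments = [
    "0000 00 BOS *",
    "0000 03 W Ala",
    "0003 00 EOS *",
    "0003 00 BOS *",
    "0003 03 W Kot",
    "0006 00 EOS *",
]
flat = flatten(segments)
print(len(flat))                    # 2 (one line per sentence)
print(unflatten(flat) == segments)  # True (the round trip is lossless)
```
+
+Because the form-feed character does not normally occur in text, the
+original segment boundaries can be restored without ambiguity.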
+
+@c -------------------------------------------------------------------------------
+@c UNFLA
+@c -------------------------------------------------------------------------------
+
+@page
+@node unfla
+@section unfla - the UTT file unflattener
+
+@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@item @strong{Authors:}                 @tab Tomasz Obrębski
+@item @strong{Input format:}            @tab UTT flattened
+@item @strong{Output format:}           @tab UTT regular
+@item @strong{Required annotation:}     @tab -
+@end multitable
+
+@menu
+* unfla description::
+@c * fla command line options::
+@c * fla usage example::
+@end menu
+
+@node unfla description
+@subsection Description
+@command{unfla} transforms a flattened UTT file, produced by
+@command{fla}, into the regular format by restoring end-of-line
+characters.
+
+
+
+
+@c ---------------------------------------------------------------------
+@c USAGE EXAMPLES
+@c ---------------------------------------------------------------------
+
+@node Usage examples
+@chapter Usage examples
+
+@subsubheading Simple pipelines
+
+@enumerate
+
+@item tokenization
+
+@example
+cat text | tok > output1
+@end example
+
+@item morphological annotation (1)
+
+Simple dictionary-based lemmatization.
+
+@example
+cat text | tok | lem > output1
+@end example
+
+@item morphological annotation (2)
+
+1) perform dictionary-based lemmatization
+2) guess descriptions for words which have no annotation
+
+@example
+cat text | tok | lem | gue -S lem > output2
+@end example
+
+@item morphological annotation (3)
+
+1) perform dictionary-based lemmatization
+2) try to correct words with no annotation
+3) perform dictionary-based lemmatization of corrected words
+4) guess descriptions for words which still have no annotation
+
+@example
+cat text | tok | lem | cor -p W -S lem | lem -I cor | gue -p W -S lem
+@end example
+
+@item spelling correction
+
+Run the corrector on word tokens for which the dictionary provides no annotation.
+
+@example
+cat text | tok | egrep ' W ' | lem | egrep -v 'lem:' | cor -1
+@end example
+
+@item Expression extraction
+
+Extraction of all occurrences of a verb followed by a form of the noun 'rozmowa'.
+
+@example
+cat text | tok | lem -1 | ser -e 'cat(<V>) space lexeme(rozmowa)' -m | kot > output4
+@end example
+
+@item A word in context
+
+Extraction of text fragments containing a form of the lexeme 'rozmowa' in
+the context of the 5 preceding and 5 following corpus segments.
+
+@example
+cat text | tok | lem -1 | ser -e 'seg@{5@} lexeme(rozmowa) seg@{5@}' -m | kot > output
+@end example
+
+@item generation of concordance table (1)
+
+@example
+cat text | tok | lem -1 | ser -e 'cat(<V>) space lexeme(rozmowa)' | con
+@end example
+
+Execution time: 10 seconds.
+
+@item generation of concordance table (2)
+
+The same as above but much faster
+
+@example
+cat text | tok | lem -1 | \
+grp -e 'cat(<V>) space lexeme(rozmowa)' | \
+ser -e 'cat(<V>) space lexeme(rozmowa)' | \
+con
+@end example
+
+Execution time: 2 seconds.
+
+@item generation of concordance table (3)
+
+Usually, one searches the same corpus repeatedly. In such a case it is
+advisable to transform the corpus data into the format required by
+@command{grp} first, and then to use the preprocessed data.
+
+As @command{grp} (@command{grep}) processes data faster than it is
+read from the disk drive, the search time may be shortened further by
+using file compression.  We suggest using the @command{lzop}
+compressor/decompressor.
+
+@item the fastest way to search a large corpus
+
+step 1: corpus preprocessing
+
+@example
+cat corpus | tok | sen | lem -1 \
+| fla | lzop -7 > corpus.grp.lzo
+@end example
+
+step 2: search
+
+@example
+lzop -cd corpus.grp.lzo | unfla | \
+grp -e 'cat(<V>) space lexeme(rozmowa)' | \
+ser -e 'cat(<V>) space lexeme(rozmowa)' | con
+@end example
+
+@end enumerate
+
+@c @subsubheading More complicated configurations
+
+
+@c @example
+@c mknod fifo1 p
+@c mknod fifo2 p
+@c mknod fifo3 p
+@c mknod fifo4 p
+@c mknod fifo5 p
+
+@c tok | lem -p W -e fifo1 > fifo2 &
+@c cor -e fifo3 < fifo1 | lem > fifo4 &
+@c gue < fifo3 > fifo5 &
+@c sort -m fifo2 fifo4 fifo5
+
+@c rm fifo?
+@c @end example
+
+
+@c ---------------------------------------------------------------------
+@c ---------------------------------------------------------------------
+
+@c ---------------------------------------------------------------------
+@c PMDBF DICTIONARY
+@c ---------------------------------------------------------------------
+
+@node PMDBF dictionary
+@chapter PMDBF dictionary
+
+UTT components come with lexical data derived from the Polish
+Morphological Database (PMDB).
+
+@menu
+* PMDBF files::    
+* PMDBF tag structure::                 
+* PMDBF parts of speech::           
+* PMDBF morphosyntactic attributes::           
+@end menu
+
+@node PMDBF files
+@section Files
+
+@node PMDBF tag structure
+@section Tag structure
+
+@example
+pos   = [[:upper:]]+
+attr  = [[:upper:]]+
+val   = [[:lower:][:digit:]?!*+-] | <[^>\n]+>
+descr = pos ( / ( attr val + ) + ) ?
+@end example
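+
+The tag grammar above can be turned into a small parser. The Python
+sketch below is an illustration only, under stated assumptions: the
+POSIX classes are approximated with ASCII ranges, the function name
+@code{parse_descr} is hypothetical, and the sample tag @code{N/GfNsCn}
+was composed by hand from the tables in this chapter rather than taken
+from the dictionary.
+
```python
import re

# ASCII approximations of the POSIX classes used in the grammar (assumption).
POS = r"[A-Z]+"
ATTR = r"[A-Z]+"
VAL = r"(?:[a-z0-9?!*+\-]|<[^>\n]+>)"

# descr = pos ( / ( attr val + ) + ) ?
DESCR_RE = re.compile(rf"(?P<pos>{POS})(?:/(?P<attrs>(?:{ATTR}{VAL}+)+))?")
PAIR_RE = re.compile(rf"({ATTR})({VAL}+)")
VAL_RE = re.compile(VAL)


def parse_descr(descr):
    """Split a description into its part of speech and attribute values."""
    match = DESCR_RE.fullmatch(descr)
    if match is None:
        raise ValueError(f"not a valid description: {descr!r}")
    attrs = {}
    for attr, values in PAIR_RE.findall(match.group("attrs") or ""):
        # An attribute may carry several values, so keep them as a list.
        attrs[attr] = VAL_RE.findall(values)
    return match.group("pos"), attrs


print(parse_descr("N/GfNsCn"))  # ('N', {'G': ['f'], 'N': ['s'], 'C': ['n']})
print(parse_descr("ADV"))       # ('ADV', {})
```
+
+Values are kept as lists because the grammar allows more than one value
+after a single attribute name.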
+
+@node PMDBF parts of speech
+@section Parts of speech
+
+@multitable {ADJPRP} { adjectival-passive-participle }
+@item @code{N} @tab noun
+@item @code{NPRO} @tab nominal-pronoun
+@item @code{NV} @tab deverbal-noun
+@item @code{V} @tab verb
+@item @code{BYC} @tab byc (the Polish verb ``to be'')
+@item @code{VNI} @tab non-inflected-verb
+@item @code{ADJ} @tab adjective
+@item @code{ADJPAP} @tab adjectival-passive-participle
+@item @code{ADJPRP} @tab adjectival-present-participle
+@item @code{ADJPP} @tab adjectival-past-participle
+@item @code{ADJPRO} @tab adjectival-pronoun
+@item @code{ADJNUM} @tab adjectival-numeral
+@item @code{ADV} @tab adverb
+@item @code{ADVANP} @tab adverbial-anterior-participle
+@item @code{ADVPRP} @tab adverbial-present-participle
+@item @code{ADVPRO} @tab adverbial-pronoun
+@item @code{ADVNUM} @tab  adverbial-numeral
+@item @code{P} @tab preposition
+@item @code{PPRO} @tab prep-noun-pronoun
+@item @code{CONJ} @tab conjunction
+@item @code{EXCL} @tab exclamation
+@item @code{APP} @tab call
+@item @code{ONO} @tab onomatopoeia
+@item @code{PART} @tab particle
+@item @code{NUMCRD} @tab cardinal-numeral
+@item @code{NUMCOL} @tab collective-numeral
+@item @code{NUMPAR} @tab partitive-numeral
+@item @code{NUMORD} @tab ordinal-numeral
+@end multitable
+
+@node PMDBF morphosyntactic attributes
+@section Morphosyntactic attributes
+
+@multitable {Attr} {Val} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
+@c @headitem Attr @tab Val @tab Description
+@item
+@code{A} @tab @tab Aspect
+@item
+@tab @code{p} @tab perfective,
+@item
+@tab @code{i} @tab imperfective.
+@item
+@item
+@code{V} @tab @tab Verb-Form
+@item
+@tab @code{b} @tab infinitive,
+@item
+@tab @code{p} @tab personal,
+@item
+@tab @code{i} @tab impersonal.
+@item
+@item
+@code{M} @tab @tab Mood
+@item
+@tab @code{d} @tab declarative,
+@item
+@tab @code{c} @tab conditional,
+@item
+@tab @code{i} @tab imperative.
+@item
+@item
+@code{T} @tab @tab Tense
+@item
+@tab @code{a} @tab past,
+@item
+@tab @code{r} @tab present,
+@item
+@tab @code{f} @tab future.
+@item
+@item
+@code{P} @tab @tab Person
+@item
+@tab @code{1} @tab 1,
+@item
+@tab @code{2} @tab 2,
+@item
+@tab @code{3} @tab 3.
+@item
+@item
+@code{D} @tab @tab Degree
+@item
+@tab @code{p} @tab positive,
+@item
+@tab @code{c} @tab comparative,
+@item
+@tab @code{s} @tab superlative.
+@item
+@item
+@code{N} @tab @tab Number
+@item
+@tab @code{s} @tab singular,
+@item
+@tab @code{p} @tab plural.
+@item
+@item
+@code{C} @tab @tab Case
+@item
+@tab @code{n} @tab nominative,
+@item
+@tab @code{g} @tab genitive,
+@item
+@tab @code{d} @tab dative,
+@item
+@tab @code{a} @tab accusative,
+@item
+@tab @code{i} @tab instrumental,
+@item
+@tab @code{l} @tab locative,
+@item
+@tab @code{v} @tab vocative.
+@item
+@item
+@code{G} @tab @tab Gender
+@item
+@tab @code{p} @tab masculine-personal,
+@item
+@tab @code{a} @tab masculine-animal,
+@item
+@tab @code{i} @tab masculine-inanimate,
+@item
+@tab @code{f} @tab feminine,
+@item
+@tab @code{n} @tab neuter.
+@end multitable
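+
+The attribute table can be used as a lookup structure for decoding tags.
+The Python sketch below covers only three of the attributes listed above
+(Number, Case, Gender) to stay short. The names @code{ATTRIBUTES} and
+@code{describe}, and the shape of the input dictionary, are assumptions
+made for this example and are not part of UTT.
+
```python
# Attribute code -> (attribute name, value code -> value name),
# copied from the table above for three attributes only.
ATTRIBUTES = {
    "N": ("Number", {"s": "singular", "p": "plural"}),
    "C": ("Case", {
        "n": "nominative", "g": "genitive", "d": "dative",
        "a": "accusative", "i": "instrumental", "l": "locative",
        "v": "vocative",
    }),
    "G": ("Gender", {
        "p": "masculine-personal", "a": "masculine-animal",
        "i": "masculine-inanimate", "f": "feminine", "n": "neuter",
    }),
}


def describe(attrs):
    """Translate attribute and value codes into readable names.

    Codes missing from ATTRIBUTES are passed through unchanged.
    """
    readable = {}
    for code, values in attrs.items():
        name, legend = ATTRIBUTES.get(code, (code, {}))
        readable[name] = [legend.get(value, value) for value in values]
    return readable


print(describe({"G": ["f"], "N": ["s"], "C": ["n"]}))
# {'Gender': ['feminine'], 'Number': ['singular'], 'Case': ['nominative']}
```
+
+The same value letter means different things under different attributes
+(for example @code{p} is ``plural'' for Number but ``masculine-personal''
+for Gender), which is why the lookup is keyed by attribute first.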
+
+
+@c ---------------------------------------------------------------------
+@c ---------------------------------------------------------------------
+@c 
+@c @node Examples
+@c @chapter Examples
+
+@c ----------------------------------------------------------------------
+@c ----------------------------------------------------------------------
+
+@node    GNU Free Documentation License
+@chapter GNU Free Documentation License
+
+@c The GNU Free Documentation License.
+@center Version 1.2, November 2002
+
+@c This file is intended to be included within another document,
+@c hence no sectioning command or @node.
+
+@display
+Copyright @copyright{} 2000,2001,2002 Free Software Foundation, Inc.
+51 Franklin St, Fifth Floor, Boston, MA  02110-1301, USA
+
+Everyone is permitted to copy and distribute verbatim copies
+of this license document, but changing it is not allowed.
+@end display
+
+@enumerate 0
+@item
+PREAMBLE
+
+The purpose of this License is to make a manual, textbook, or other
+functional and useful document @dfn{free} in the sense of freedom: to
+assure everyone the effective freedom to copy and redistribute it,
+with or without modifying it, either commercially or noncommercially.
+Secondarily, this License preserves for the author and publisher a way
+to get credit for their work, while not being considered responsible
+for modifications made by others.
+
+This License is a kind of ``copyleft'', which means that derivative
+works of the document must themselves be free in the same sense.  It
+complements the GNU General Public License, which is a copyleft
+license designed for free software.
+
+We have designed this License in order to use it for manuals for free
+software, because free software needs free documentation: a free
+program should come with manuals providing the same freedoms that the
+software does.  But this License is not limited to software manuals;
+it can be used for any textual work, regardless of subject matter or
+whether it is published as a printed book.  We recommend this License
+principally for works whose purpose is instruction or reference.
+
+@item
+APPLICABILITY AND DEFINITIONS
+
+This License applies to any manual or other work, in any medium, that
+contains a notice placed by the copyright holder saying it can be
+distributed under the terms of this License.  Such a notice grants a
+world-wide, royalty-free license, unlimited in duration, to use that
+work under the conditions stated herein.  The ``Document'', below,
+refers to any such manual or work.  Any member of the public is a
+licensee, and is addressed as ``you''.  You accept the license if you
+copy, modify or distribute the work in a way requiring permission
+under copyright law.
+
+A ``Modified Version'' of the Document means any work containing the
+Document or a portion of it, either copied verbatim, or with
+modifications and/or translated into another language.
+
+A ``Secondary Section'' is a named appendix or a front-matter section
+of the Document that deals exclusively with the relationship of the
+publishers or authors of the Document to the Document's overall
+subject (or to related matters) and contains nothing that could fall
+directly within that overall subject.  (Thus, if the Document is in
+part a textbook of mathematics, a Secondary Section may not explain
+any mathematics.)  The relationship could be a matter of historical
+connection with the subject or with related matters, or of legal,
+commercial, philosophical, ethical or political position regarding
+them.
+
+The ``Invariant Sections'' are certain Secondary Sections whose titles
+are designated, as being those of Invariant Sections, in the notice
+that says that the Document is released under this License.  If a
+section does not fit the above definition of Secondary then it is not
+allowed to be designated as Invariant.  The Document may contain zero
+Invariant Sections.  If the Document does not identify any Invariant
+Sections then there are none.
+
+The ``Cover Texts'' are certain short passages of text that are listed,
+as Front-Cover Texts or Back-Cover Texts, in the notice that says that
+the Document is released under this License.  A Front-Cover Text may
+be at most 5 words, and a Back-Cover Text may be at most 25 words.
+
+A ``Transparent'' copy of the Document means a machine-readable copy,
+represented in a format whose specification is available to the
+general public, that is suitable for revising the document
+straightforwardly with generic text editors or (for images composed of
+pixels) generic paint programs or (for drawings) some widely available
+drawing editor, and that is suitable for input to text formatters or
+for automatic translation to a variety of formats suitable for input
+to text formatters.  A copy made in an otherwise Transparent file
+format whose markup, or absence of markup, has been arranged to thwart
+or discourage subsequent modification by readers is not Transparent.
+An image format is not Transparent if used for any substantial amount
+of text.  A copy that is not ``Transparent'' is called ``Opaque''.
+
+Examples of suitable formats for Transparent copies include plain
+@sc{ascii} without markup, Texinfo input format, La@TeX{} input
+format, @acronym{SGML} or @acronym{XML} using a publicly available
+@acronym{DTD}, and standard-conforming simple @acronym{HTML},
+PostScript or @acronym{PDF} designed for human modification.  Examples
+of transparent image formats include @acronym{PNG}, @acronym{XCF} and
+@acronym{JPG}.  Opaque formats include proprietary formats that can be
+read and edited only by proprietary word processors, @acronym{SGML} or
+@acronym{XML} for which the @acronym{DTD} and/or processing tools are
+not generally available, and the machine-generated @acronym{HTML},
+PostScript or @acronym{PDF} produced by some word processors for
+output purposes only.
+
+The ``Title Page'' means, for a printed book, the title page itself,
+plus such following pages as are needed to hold, legibly, the material
+this License requires to appear in the title page.  For works in
+formats which do not have any title page as such, ``Title Page'' means
+the text near the most prominent appearance of the work's title,
+preceding the beginning of the body of the text.
+
+A section ``Entitled XYZ'' means a named subunit of the Document whose
+title either is precisely XYZ or contains XYZ in parentheses following
+text that translates XYZ in another language.  (Here XYZ stands for a
+specific section name mentioned below, such as ``Acknowledgements'',
+``Dedications'', ``Endorsements'', or ``History''.)  To ``Preserve the Title''
+of such a section when you modify the Document means that it remains a
+section ``Entitled XYZ'' according to this definition.
+
+The Document may include Warranty Disclaimers next to the notice which
+states that this License applies to the Document.  These Warranty
+Disclaimers are considered to be included by reference in this
+License, but only as regards disclaiming warranties: any other
+implication that these Warranty Disclaimers may have is void and has
+no effect on the meaning of this License.
+
+@item
+VERBATIM COPYING
+
+You may copy and distribute the Document in any medium, either
+commercially or noncommercially, provided that this License, the
+copyright notices, and the license notice saying this License applies
+to the Document are reproduced in all copies, and that you add no other
+conditions whatsoever to those of this License.  You may not use
+technical measures to obstruct or control the reading or further
+copying of the copies you make or distribute.  However, you may accept
+compensation in exchange for copies.  If you distribute a large enough
+number of copies you must also follow the conditions in section 3.
+
+You may also lend copies, under the same conditions stated above, and
+you may publicly display copies.
+
+@item
+COPYING IN QUANTITY
+
+If you publish printed copies (or copies in media that commonly have
+printed covers) of the Document, numbering more than 100, and the
+Document's license notice requires Cover Texts, you must enclose the
+copies in covers that carry, clearly and legibly, all these Cover
+Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on
+the back cover.  Both covers must also clearly and legibly identify
+you as the publisher of these copies.  The front cover must present
+the full title with all words of the title equally prominent and
+visible.  You may add other material on the covers in addition.
+Copying with changes limited to the covers, as long as they preserve
+the title of the Document and satisfy these conditions, can be treated
+as verbatim copying in other respects.
+
+If the required texts for either cover are too voluminous to fit
+legibly, you should put the first ones listed (as many as fit
+reasonably) on the actual cover, and continue the rest onto adjacent
+pages.
+
+If you publish or distribute Opaque copies of the Document numbering
+more than 100, you must either include a machine-readable Transparent
+copy along with each Opaque copy, or state in or with each Opaque copy
+a computer-network location from which the general network-using
+public has access to download using public-standard network protocols
+a complete Transparent copy of the Document, free of added material.
+If you use the latter option, you must take reasonably prudent steps,
+when you begin distribution of Opaque copies in quantity, to ensure
+that this Transparent copy will remain thus accessible at the stated
+location until at least one year after the last time you distribute an
+Opaque copy (directly or through your agents or retailers) of that
+edition to the public.
+
+It is requested, but not required, that you contact the authors of the
+Document well before redistributing any large number of copies, to give
+them a chance to provide you with an updated version of the Document.
+
+@item
+MODIFICATIONS
+
+You may copy and distribute a Modified Version of the Document under
+the conditions of sections 2 and 3 above, provided that you release
+the Modified Version under precisely this License, with the Modified
+Version filling the role of the Document, thus licensing distribution
+and modification of the Modified Version to whoever possesses a copy
+of it.  In addition, you must do these things in the Modified Version:
+
+@enumerate A
+@item
+Use in the Title Page (and on the covers, if any) a title distinct
+from that of the Document, and from those of previous versions
+(which should, if there were any, be listed in the History section
+of the Document).  You may use the same title as a previous version
+if the original publisher of that version gives permission.
+
+@item
+List on the Title Page, as authors, one or more persons or entities
+responsible for authorship of the modifications in the Modified
+Version, together with at least five of the principal authors of the
+Document (all of its principal authors, if it has fewer than five),
+unless they release you from this requirement.
+
+@item
+State on the Title page the name of the publisher of the
+Modified Version, as the publisher.
+
+@item
+Preserve all the copyright notices of the Document.
+
+@item
+Add an appropriate copyright notice for your modifications
+adjacent to the other copyright notices.
+
+@item
+Include, immediately after the copyright notices, a license notice
+giving the public permission to use the Modified Version under the
+terms of this License, in the form shown in the Addendum below.
+
+@item
+Preserve in that license notice the full lists of Invariant Sections
+and required Cover Texts given in the Document's license notice.
+
+@item
+Include an unaltered copy of this License.
+
+@item
+Preserve the section Entitled ``History'', Preserve its Title, and add
+to it an item stating at least the title, year, new authors, and
+publisher of the Modified Version as given on the Title Page.  If
+there is no section Entitled ``History'' in the Document, create one
+stating the title, year, authors, and publisher of the Document as
+given on its Title Page, then add an item describing the Modified
+Version as stated in the previous sentence.
+
+@item
+Preserve the network location, if any, given in the Document for
+public access to a Transparent copy of the Document, and likewise
+the network locations given in the Document for previous versions
+it was based on.  These may be placed in the ``History'' section.
+You may omit a network location for a work that was published at
+least four years before the Document itself, or if the original
+publisher of the version it refers to gives permission.
+
+@item
+For any section Entitled ``Acknowledgements'' or ``Dedications'', Preserve
+the Title of the section, and preserve in the section all the
+substance and tone of each of the contributor acknowledgements and/or
+dedications given therein.
+
+@item
+Preserve all the Invariant Sections of the Document,
+unaltered in their text and in their titles.  Section numbers
+or the equivalent are not considered part of the section titles.
+
+@item
+Delete any section Entitled ``Endorsements''.  Such a section
+may not be included in the Modified Version.
+
+@item
+Do not retitle any existing section to be Entitled ``Endorsements'' or
+to conflict in title with any Invariant Section.
+
+@item
+Preserve any Warranty Disclaimers.
+@end enumerate
+
+If the Modified Version includes new front-matter sections or
+appendices that qualify as Secondary Sections and contain no material
+copied from the Document, you may at your option designate some or all
+of these sections as invariant.  To do this, add their titles to the
+list of Invariant Sections in the Modified Version's license notice.
+These titles must be distinct from any other section titles.
+
+You may add a section Entitled ``Endorsements'', provided it contains
+nothing but endorsements of your Modified Version by various
+parties---for example, statements of peer review or that the text has
+been approved by an organization as the authoritative definition of a
+standard.
+
+You may add a passage of up to five words as a Front-Cover Text, and a
+passage of up to 25 words as a Back-Cover Text, to the end of the list
+of Cover Texts in the Modified Version.  Only one passage of
+Front-Cover Text and one of Back-Cover Text may be added by (or
+through arrangements made by) any one entity.  If the Document already
+includes a cover text for the same cover, previously added by you or
+by arrangement made by the same entity you are acting on behalf of,
+you may not add another; but you may replace the old one, on explicit
+permission from the previous publisher that added the old one.
+
+The author(s) and publisher(s) of the Document do not by this License
+give permission to use their names for publicity for or to assert or
+imply endorsement of any Modified Version.
+
+@item
+COMBINING DOCUMENTS
+
+You may combine the Document with other documents released under this
+License, under the terms defined in section 4 above for modified
+versions, provided that you include in the combination all of the
+Invariant Sections of all of the original documents, unmodified, and
+list them all as Invariant Sections of your combined work in its
+license notice, and that you preserve all their Warranty Disclaimers.
+
+The combined work need only contain one copy of this License, and
+multiple identical Invariant Sections may be replaced with a single
+copy.  If there are multiple Invariant Sections with the same name but
+different contents, make the title of each such section unique by
+adding at the end of it, in parentheses, the name of the original
+author or publisher of that section if known, or else a unique number.
+Make the same adjustment to the section titles in the list of
+Invariant Sections in the license notice of the combined work.
+
+In the combination, you must combine any sections Entitled ``History''
+in the various original documents, forming one section Entitled
+``History''; likewise combine any sections Entitled ``Acknowledgements'',
+and any sections Entitled ``Dedications''.  You must delete all
+sections Entitled ``Endorsements.''
+
+@item
+COLLECTIONS OF DOCUMENTS
+
+You may make a collection consisting of the Document and other documents
+released under this License, and replace the individual copies of this
+License in the various documents with a single copy that is included in
+the collection, provided that you follow the rules of this License for
+verbatim copying of each of the documents in all other respects.
+
+You may extract a single document from such a collection, and distribute
+it individually under this License, provided you insert a copy of this
+License into the extracted document, and follow this License in all
+other respects regarding verbatim copying of that document.
+
+@item
+AGGREGATION WITH INDEPENDENT WORKS
+
+A compilation of the Document or its derivatives with other separate
+and independent documents or works, in or on a volume of a storage or
+distribution medium, is called an ``aggregate'' if the copyright
+resulting from the compilation is not used to limit the legal rights
+of the compilation's users beyond what the individual works permit.
+When the Document is included in an aggregate, this License does not
+apply to the other works in the aggregate which are not themselves
+derivative works of the Document.
+
+If the Cover Text requirement of section 3 is applicable to these
+copies of the Document, then if the Document is less than one half of
+the entire aggregate, the Document's Cover Texts may be placed on
+covers that bracket the Document within the aggregate, or the
+electronic equivalent of covers if the Document is in electronic form.
+Otherwise they must appear on printed covers that bracket the whole
+aggregate.
+
+@item
+TRANSLATION
+
+Translation is considered a kind of modification, so you may
+distribute translations of the Document under the terms of section 4.
+Replacing Invariant Sections with translations requires special
+permission from their copyright holders, but you may include
+translations of some or all Invariant Sections in addition to the
+original versions of these Invariant Sections.  You may include a
+translation of this License, and all the license notices in the
+Document, and any Warranty Disclaimers, provided that you also include
+the original English version of this License and the original versions
+of those notices and disclaimers.  In case of a disagreement between
+the translation and the original version of this License or a notice
+or disclaimer, the original version will prevail.
+
+If a section in the Document is Entitled ``Acknowledgements'',
+``Dedications'', or ``History'', the requirement (section 4) to Preserve
+its Title (section 1) will typically require changing the actual
+title.
+
+@item
+TERMINATION
+
+You may not copy, modify, sublicense, or distribute the Document except
+as expressly provided for under this License.  Any other attempt to
+copy, modify, sublicense or distribute the Document is void, and will
+automatically terminate your rights under this License.  However,
+parties who have received copies, or rights, from you under this
+License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+@item
+FUTURE REVISIONS OF THIS LICENSE
+
+The Free Software Foundation may publish new, revised versions
+of the GNU Free Documentation License from time to time.  Such new
+versions will be similar in spirit to the present version, but may
+differ in detail to address new problems or concerns.  See
+@uref{http://www.gnu.org/copyleft/}.
+
+Each version of the License is given a distinguishing version number.
+If the Document specifies that a particular numbered version of this
+License ``or any later version'' applies to it, you have the option of
+following the terms and conditions either of that specified version or
+of any later version that has been published (not as a draft) by the
+Free Software Foundation.  If the Document does not specify a version
+number of this License, you may choose any version ever published (not
+as a draft) by the Free Software Foundation.
+@end enumerate
+
+@page
+@heading ADDENDUM: How to use this License for your documents
+
+To use this License in a document you have written, include a copy of
+the License in the document and put the following copyright and
+license notices just after the title page:
+
+@smallexample
+@group
+  Copyright (C)  @var{year}  @var{your name}.
+  Permission is granted to copy, distribute and/or modify this document
+  under the terms of the GNU Free Documentation License, Version 1.2
+  or any later version published by the Free Software Foundation;
+  with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
+  Texts.  A copy of the license is included in the section entitled ``GNU
+  Free Documentation License''.
+@end group
+@end smallexample
+
+If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts,
+replace the ``with@dots{}Texts.'' line with this:
+
+@smallexample
+@group
+    with the Invariant Sections being @var{list their titles}, with
+    the Front-Cover Texts being @var{list}, and with the Back-Cover Texts
+    being @var{list}.
+@end group
+@end smallexample
+
+If you have Invariant Sections without Cover Texts, or some other
+combination of the three, merge those two alternatives to suit the
+situation.
+
+If your document contains nontrivial examples of program code, we
+recommend releasing these examples in parallel under your choice of
+free software license, such as the GNU General Public License,
+to permit their use in free software.
+
+@c Local Variables:
+@c ispell-local-pdict: "ispell-dict"
+@c End:
+
+
+@c ---------------------------------------------------------------------
+@c ---------------------------------------------------------------------
+
+@node    Reporting bugs
+@chapter Reporting bugs
+
+Report bugs to <obrebski@@amu.edu.pl>.
+
+@c ---------------------------------------------------------------------
+@c ---------------------------------------------------------------------
+
+@c @node    Copyright
+@c @chapter Copyright
+@c 
+@c Copyright 2004 by Tomasz Obrębski
+@c This software is free for research and educational use.
+
+@c ---------------------------------------------------------------------
+@c ---------------------------------------------------------------------
+
+@node    Author
+@chapter Author
+
+
+@bye