utt.texinfo @ 261bf62

Last change on this file since 261bf62 was 261bf62, checked in by obrebski <obrebski@…>, 18 years ago

w utt.texinfo

git-svn-id: svn://atos.wmid.amu.edu.pl/utt@60 e293616e-ec6a-49c2-aa92-f4a8b91c5d16

Property mode set to 100644

File size: 79.0 KB

Rev	Line
[25ae32e]	1	\input texinfo @c --texinfo--
	2	@documentencoding ISO-8859-2
	3	@c @documentlanguage pl
	4
	5	@c %**start of header
	6	@setfilename utt.info
	7	@settitle UAM Text Tools v0.90
	8	@c %**end of header
	9
	10	@copying
[261bf62]	11	This manual is for UAM Text Tools (version 0.90, October, 2008)
[25ae32e]	12
[19760ef]	13	Copyright @copyright{} 2005, 2007 Tomasz Obrębski, Michał Stolarski, Justyna Walkowska, Paweł Konieczka.
[25ae32e]	14
	15	Permission is granted to copy, distribute and/or modify this document
[261bf62]	16	under the terms of the GNU Free Documentation License, Version 1.2 or
	17	any later version published by the Free Software Foundation; with no
	18	Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A
	19	copy of the license is included in the section entitled ``GNU Free
	20	Documentation License''.
[25ae32e]	21
	22	@c @quotation
	23	@c Permission is granted to ...
	24	@c No permission is granted until the document is completed.
	25	@c @end quotation
	26	@end copying
	27
	28
	29	@titlepage
	30	@title UAM Text Tools 0.90 - User Manual
	31	@subtitle edition 0.01, @today
	32	@subtitle status: prescript
	33	@author by Justyna Walkowska, Tomasz Obr@,{}ebski and Micha@l{} Stolarski
	34	@page
	35	@vskip 0pt plus 1filll
	36	@insertcopying
	37	@end titlepage
	38
	39	@contents
	40
	41	@c @paragraphindent none
	42
	43	@iftex
	44	@parskip = 0.5@normalbaselineskip plus 3pt minus 1pt
	45	@end iftex
	46
	47	@c @headings off
	48	@c @everyheading LEM(1) @| @| LEM(1)
	49	@everyfooting @today @c @| @thispage @|
	50
	51	@ifnottex
	52
	53	@node Top
	54	@top UTT - UAM Text Tools
	55
	56	@insertcopying
	57
	58	@menu
	59	* General information::
	60	* UTT file format::
	61	* Configuration files::
	62	* UTT components::
	63	* Auxiliary tools::
	64	* Usage examples::
	65	* PMDBF dictionary::
	66	@c * Examples::
	67	@c * Copyright::
	68	* GNU Free Documentation License::
	69	* Reporting bugs::
	70	* Author::
	71	@end menu
	72	@end ifnottex
	73
	74
	75	@c ----------------------------------------------------------------------
	76
	77	@node General information
	78	@chapter General information
	79
	80	UAM Text Tools (UTT) is a package of language processing tools
	81	developed at Adam Mickiewicz University. Its functionality includes:
	82
	83	@itemize @bullet
	84
	85	@item
	86	tokenization
	87	@item
	88	dictionary-based morphological analysis
	89	@item
	90	heuristic morphological analysis of unknown words
	91	@item
	92	spelling correction
	93	@item
	94	pattern search
	95	@item
	96	sentence splitting
	97	@item
	98	generation of concordance tables
	99	@end itemize
	100
	101	The toolkit is intended for processing raw (unannotated),
	102	unrestricted text for any conceivable purpose.
	103
	104	The system is organized as a collection of command-line programs, each
	105	performing one operation, e.g. tokenization, lemmatization, spelling
	106	correction. The components are independent of one another; the
	107	unifying element is the uniform i/o file format.
	108
	109	The components may be combined in various ways to provide various text
	110	processing services. New components supplied by the user may also be
	111	easily incorporated into the system, provided that they respect the i/o
	112	file format conventions.
	113
	114	UTT component programs do not depend on any specific tagset or
	115	morphological description format.
	116
	117	UTT is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
	118	the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
	119
	120	The Polex/PMDBF dictionary is licensed under the Creative Commons by-nc-sa License which prohibits commercial use.
	121
	122
	123	List of contributors:
	124
	125	@itemize
	126	@item Pawel Konieczka
	127	@item Tomasz Obrebski
	128	@item Michal Stolarski
	129	@item Marcin Walas
	130	@item Justyna Walkowska
[04ae414]	131	@item Pawel Werenski
[25ae32e]	132	@end itemize
	133
	134	@c ----------------------------------------------------------------------
	135	@c ---------------------------------------------------------------------
	136
	137	@node UTT file format
	138	@chapter UTT file format
	139
	140	A UTT file contains annotation of a text. It consists of a sequence of
	141	segments. Each segment explicitly refers to a continuous piece of the
	142	text and provides some information on it.
	143
	144	@section Segment format
	145
	146	A segment occupies one line of a UTT file and consists of
	147	space-separated fields:
	148
	149
	150	@quotation
	151	@sp 1
	152	[@var{start} [@var{length}]] @var{type} @var{form} [@var{annotation1} [@var{annotation2} ...]]
	153	@sp 1
	154	@end quotation
	155
	156	@table @var
	157
	158	@item @var{start}
	159	Non-negative integer value indicating the position in the source text where the
	160	segment starts.
	161
	162	@item @var{length}
	163	Non-negative integer value indicating the length of the segment.
	164
	165	@item @var{type}
	166	A sequence of non-blank characters that is not a number (a numeric value would be misinterpreted as a @var{start} or @var{length} field).
	167	@var{type} reflects the main classification of segments
	168	into words, numbers, punctuation marks, and meta-text markers.
	169	@xref{tok output,,tok output}, for a description of the automatically recognized type markers.
	170
	171	@item @var{form}
	172	This field contains the textual form of the segment or the special
	173	symbol @code{*} indicating that the form is not given (e.g. when the segment has been created artificially to mark something and is of length 0).
	174
	175	The characters or character sequences that have special meaning in the
	176	@var{form} field are enumerated below.
	177
	178	Characters with special meaning:
	179
	180	@itemize
	181	@item @code{_} - space character
	182	@item @code{*} - undefined contents
	183	@end itemize
	184
	185	Escape sequences:
	186
	187	@itemize
	188	@item @code{\n} - new line
	189	@item @code{\t} - tabulation
	190	@item @code{\r} - carriage return
	191
	192	@item @code{\_} - the @code{_} character
	193	@item @code{\*} - the @code{*} character
	194	@item @code{\\} - the @code{\} character
	195
	196	@c @item @code{\hh} - a character with hexadecimal code @code{hh} (used for non-printable characters)
	197	@end itemize
	198
	199	@item @var{annotation1}
	200	@item @var{annotation2}
	201	@item ...
	202	Annotation fields have the following format:
	203
	204	@var{longname} @code{:} @var{value}
	205
	206	or
	207
	208	@var{shortname} @var{value}
	209
	210	where @var{longname} is a string of alphanumeric characters
	211	(isalnum() test), @var{shortname} is a single non-alphanumeric character
	212	(ispunct() test), and @var{value} is an arbitrary string of non-blank characters.
	213
	214	@end table
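As an illustration of the form field conventions listed above, the following Python sketch decodes a form value into the text it denotes. The name decode_form is hypothetical and not part of UTT; treating any escaped character without a listed meaning as the character itself is an assumption of this sketch.

```python
# Hypothetical helper, not part of UTT: decode the form field of a segment.
ESCAPES = {"n": "\n", "t": "\t", "r": "\r"}


def decode_form(form):
    """Return the text denoted by a form field, or None for the undefined form '*'."""
    if form == "*":
        return None
    out = []
    i = 0
    while i < len(form):
        ch = form[i]
        if ch == "\\" and i + 1 < len(form):
            nxt = form[i + 1]
            # Assumption: an escaped character with no listed meaning stands for itself.
            out.append(ESCAPES.get(nxt, nxt))
            i += 2
        elif ch == "_":
            out.append(" ")  # an unescaped underscore denotes a space
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

For instance, decode_form("a_b") yields the three-character text with a space in the middle, while decode_form("a\\_b") (a backslash followed by an underscore) yields a literal underscore.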
	215
	216
	217	Only two fields are mandatory: @var{type} and @var{form}. All other fields
	218	may be absent. In the case when only one number precedes the
	219	@var{type} field, it is interpreted as the @var{start} position.
	220
	221	If the @var{length} field is omitted, the length of the segment is the
	222	length of the @var{form} field, except when the value of the
	223	@var{form} field is @code{*} -- in this case, the length is assumed to
	224	be 0.
	225
	226	If the @var{start} field is also absent, the segment is assumed to directly
	227	follow the preceding one.
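The defaulting rules above can be sketched in Python as follows. The names Segment, form_length and parse_segments are illustrative and not part of UTT. The sketch assumes that the type field never consists of digits only and that an escape sequence in the form field counts as one character.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    start: int
    length: int
    type: str
    form: str
    annotations: List[str]


def form_length(form: str) -> int:
    """Length implied by a form field; the undefined form '*' has length 0."""
    if form == "*":
        return 0
    count = 0
    i = 0
    while i < len(form):
        # Assumption: a backslash escape such as \n counts as a single character.
        i += 2 if form[i] == "\\" and i + 1 < len(form) else 1
        count += 1
    return count


def parse_segments(lines):
    """Parse UTT lines, filling in omitted start and length fields."""
    segments = []
    next_start = 0  # where a segment without a start field begins
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        numbers = []
        # At most two leading numeric fields: start, then length.
        while len(numbers) < 2 and fields and fields[0].isdecimal():
            numbers.append(int(fields.pop(0)))
        if len(fields) < 2:
            raise ValueError("type and form are mandatory: %r" % line)
        seg_type, form, *annotations = fields
        start = numbers[0] if numbers else next_start
        length = numbers[1] if len(numbers) == 2 else form_length(form)
        segments.append(Segment(start, length, seg_type, form, annotations))
        next_start = start + length
    return segments
```

A line carrying a single number is read as a start position, and a line carrying none is placed directly after the preceding segment.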
	228
	229	@c Conventions:
	230
	231	@c Annotation fields with predefined meaning:
	232
	233	@c @itemize
	234	@c @item @code{!} - UTT components are allowed to modify the contents of
	235	@c the @var{form} field (e.g. spelling correction does this). If this happens the
	236	@c original form of the segment have to be placed in the @code{!}-field.
	237	@c @item @code{@@} - morphological description
	238	@c @item @code{=} - node identifier assignment (used in graph encoding)
	239	@c @item @code{<} - preceding/dominating node(s) (used in graph encoding)
	240	@c @item @code{>} - succeeding/subordinate node(s) (used in graph encoding)
	241	@c @end itemize
	242
	243	Segments of length 0 may be used to mark file positions with some
	244	information. See e.g. BOS and EOS (beginning/end of sentence) markers
	245	in the example below.
	246
	247	Example:
	248
	249	sentence: @samp{Piszemy dobre progrumy.}
	250
	251	@example
	252	0000 00 BOS *
[19760ef]	253	0000 07 W Piszemy lem:pisać,V
[25ae32e]	254	0007 01 S _
	255	0008 05 W dobre lem:dobry,ADJ
	256	0013 01 S _
	257	0014 08 W progrumy cor:programy lem:program,N
	258	0022 01 P .
	259	0023 00 EOS *
	260	0023 01 S _
	261	0024 00 BOS *
	262	0024 11 W Warszawiacy lem:Warszawiak,N
	263	0035 01 S _
[19760ef]	264	0036 03 W też
[25ae32e]	265	0039 01 P .
	266	0040 00 EOS *
	267
	268	@end example
	269
	270	@example
	271	0000 BOS *
[19760ef]	272	0000 W Piszemy lem:pisać,V
[25ae32e]	273	0007 S _
	274	0008 W dobre lem:dobry,ADJ
	275	0013 S _
	276	0014 W progrumy cor:programy lem:program,N
	277	0022 P .
	278	0023 EOS *
	279	@end example
	280
	281	Position information may be provided only for some types of segments:
	282
	283	@example
	284	0000 BOS *
[19760ef]	285	W Piszemy lem:pisać,V
[25ae32e]	286	S _
	287	W dobre lem:dobry,ADJ
	288	S _
	289	W progrumy cor:programy lem:program,N
	290	P .
	291	EOS *
	292	S _
	293	0024 BOS *
	294	W Warszawiacy lem:Warszawiak,N
	295	S _
[19760ef]	296	W też
[25ae32e]	297	P .
	298	EOS *
	299	@end example
	300
	301	Position/length information may be provided only when necessary:
	302
	303	@example
	304	0000 04 N *
	305	0000 N 12
	306	P .
	307	N 5
	308	S _
	309	W km
	310	@end example
	311
	312	@section UTT File
	313
	314	A UTT file consists of a sequence of segments. The same text position
	315	may be covered by multiple segments. In consequence, ambiguous text
	316	segmentation and ambiguous annotation may be represented.
	317
	318	There are two structural requirements a valid UTT-formatted file
	319	has to meet:
	320
	321	@itemize @bullet
	322
	323	@item
	324	segments have to be sorted with respect to the @var{start} field,
	325
	326	@item
	327	for each
	328	segment ending at position @var{n}, either there must be a segment starting at
	329	position @var{n+1}, or position @var{n+1} is not covered by any segment; similarly
	330	for each segment starting at position @var{n}, either there must be a segment
	331	ending at position @var{n-1}, or the position @var{n-1} must not be covered
	332	by any segment.
	333
	334	@end itemize
	335
	336	A valid annotation for the text fragment
	337	@example
	338	12.5 km
	339	@end example
	340
	341	may be
	342
	343	@example
	344	0000 02 N 12
	345	0000 04 N 12.5
	346	0002 01 P .
	347	0003 01 N 5
	348	0004 01 S _
	349	0005 02 W km
	350	@end example
	351
	352	but not
	353
	354	@example
	355	0000 02 N 12
	356	0000 04 N 12.5
	357	0004 01 S _
	358	0005 02 W km
	359	@end example
	360
[261bf62]	361	because in the latter example the first segment (starting at position
	362	0000, 2 characters long) ends at position @var{n}=0001, while position
	363	@var{n+1}=0002 is covered by the second segment and no segment starts
	364	there.
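The two structural requirements can be checked mechanically. The following Python sketch is one possible reading of them, not the validation performed by any UTT tool: the name is_valid is hypothetical, segments are given as (start, length) pairs in file order, and zero-length segments are assumed to be exempt from the boundary conditions.

```python
def is_valid(segments):
    """Check the two structural requirements on a list of (start, length) pairs."""
    starts_in_order = [start for start, _ in segments]
    if starts_in_order != sorted(starts_in_order):
        return False  # segments must be sorted by start position

    # Inclusive (first, last) positions of every non-empty segment.
    spans = [(s, s + n - 1) for s, n in segments if n > 0]
    starts = {first for first, _ in spans}
    ends = {last for _, last in spans}
    covered = set()
    for first, last in spans:
        covered.update(range(first, last + 1))

    for first, last in spans:
        # After a segment ends at n, position n+1 must start a segment or be uncovered.
        if last + 1 in covered and last + 1 not in starts:
            return False
        # Before a segment starts at n, position n-1 must end a segment or be uncovered.
        if first - 1 in covered and first - 1 not in ends:
            return False
    return True
```

Applied to the two annotations of the fragment 12.5 km shown above, the sketch accepts the first one and rejects the second.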
	365
	366
	367	@section Flattened UTT file
	368
	369	The UTT file format has two variants: regular and flattened. The regular
	370	format was described above. In the flattened format some of the
	371	end-of-line characters are replaced with line-feed characters.
	372
	373	The flattened format is mainly used to represent whole sentences as
	374	single lines of the input file (all intrasentential end-of-line
	375	characters are replaced with line-feed characters).
	376
	377	This technical trick makes it possible to perform certain text
	378	processing operations on entire sentences with the use of such tools as
	379	@command{grep} (see @command{grp} component) or @command{sed} (see @command{mar} component).
	380
	381	The conversion between the two formats is performed by the tools:
	382	@command{fla} and @command{unfla}.
[25ae32e]	383
	384	@section Character encoding
	385
	386	The UTT component programs accept only 1-byte character encodings, such
[261bf62]	387	as ISO 8859, ANSI (Windows) or DOS code pages.
[25ae32e]	388
	389
	390	@c @section Formats
	391
	392	@c @unnumberedsubsubsec Basic format
	393
	394	@c While processing large amounts of the overhead related with explicit
	395	@c ... of the start position and segment length becomes ... . Therefore,
	396	@c for efficiency reasons certain shortcuts are possible:
	397
	398	@c @unnumberedsubsubsec Relative start position
	399
	400	@c Start position may be given as relative distance from the last
	401	@c absolute position.
	402
	403	@c @unnumberedsubsubsec Absent length
	404
	405	@c Segment length may be omitted. Normally it can be restored by counting
	406	@c the length of the @emph{form field}. For segments with the special value
	407	@c @code{*} in the @emph{form field} length 0 is assumed.
	408
	409	@c @unnumberedsubsubsec Absent length and start position
	410
	411	@c Both start position and segment length may be omitted. In this format
	412	@c each segment is assumed to follow the previous one. This format is,
	413	@c therefore, suitable only for unambiguously tagged text
	414	@c (0-length markers can be still used.)
	415
	416
	417	@c @table @code
	418	@c @item AL
	419	@c @code{1234 03 W kot}
	420	@c @item RL
	421	@c @code{+56 03 W kot}
	422	@c @item A
	423	@c @code{1234 W kot}
	424	@c @item R
	425	@c @code{+56 W kot}
	426	@c @item 0
	427	@c @code{W kot}
	428	@c @end table
	429
	430
[19760ef]	431	@c [HOW TO GET POLISH FONTS IN DVI???]
[25ae32e]	432
	433	@macro parhelp
	434	@item @b{@minus{}@minus{}help}, @b{@minus{}h}
	435	Print help.
	436	@end macro
	437
	438
	439	@macro parversion
	440	@item @b{@minus{}@minus{}version}, @b{@minus{}V}
	441	Print version information.
	442	@end macro
	443
	444	@macro parinteractive
	445	@item @b{@minus{}@minus{}interactive, @minus{}i}
	446	This option toggles interactive mode, which is by default off. In the
	447	interactive mode the program does not buffer the output.
	448	@end macro
	449
	450
	451	@c @macro parfile
	452	@c @item @b{@minus{}@minus{}file=@var{filename}, @minus{}f @var{filename}}
	453	@c Input file name.
	454	@c If this option is absent or equal to '@minus{}', the program
	455	@c reads from the standard input.
	456	@c @end macro
	457
	458
	459	@c @macro paroutput
	460	@c @item @b{@minus{}@minus{}output=@var{filename}, @minus{}o @var{filename}}
	461	@c Regular output file name. To regular output the program sends segments
	462	@c which it successfully processed and copies those which were not
	463	@c subject to processing. If this option is absent or equal to
	464	@c '@minus{}', standard output is used.
	465	@c @end macro
	466
	467	@c @macro parfail
	468	@c @item @b{@minus{}@minus{}fail=@var{filename}, @minus{}e @var{filename}}
	469	@c Fail output file name. To fail output the program copies the segments
	470	@c it failed to process. If this option is absent or equal to
	471	@c '@minus{}', standard output is used.
	472	@c @end macro
	473
	474
	475	@c @macro parcopy
	476	@c @item @b{@minus{}@minus{}copy, @minus{}c}
	477	@c Copy successfully processed segments to regular output also in their
	478	@c original input form.
	479	@c @end macro
	480
	481
	482	@macro parinputfield
	483	@item @b{@minus{}@minus{}input-field=@var{fieldname}, @minus{}I @var{fieldname}}
	484	The field containing the input to the program. The default is the
	485	@var{form} field. The fields @var{position}, @var{length}, @var{type},
	486	and @var{form} are referred to as @code{1}, @code{2}, @code{3},
	487	@code{4}, respectively.
	488	@end macro
	489
	490
	491	@macro paroutputfield
	492	@item @b{@minus{}@minus{}output-field=@var{fieldname}, @minus{}O @var{fieldname}}
	493	The name of the field added by the program. The default is the name of the program.
	494	@end macro
	495
	496
	497	@macro pardictionary
	498	@item @b{@minus{}@minus{}dictionary=@var{filename}, @minus{}d @var{filename}}
	499	Dictionary file name.
	500	@end macro
	501
	502
	503	@macro parprocess
	504	@item @b{@minus{}@minus{}process=@var{type}, @minus{}p @var{type}}
	505	Process segments with the specified value in the @var{type} field.
	506	Multiple occurrences of this option are allowed and are interpreted as
	507	disjunction. If this option is absent, all segments are processed.
	508	@end macro
	509
	510
	511	@macro parselect
	512	@item @b{@minus{}@minus{}select=@var{fieldname}, @minus{}s @var{fieldname}}
	513	Select for processing only segments in which the field named
	514	@var{fieldname} is present. Multiple occurrences of this option are
	515	allowed and are interpreted as conjunction of conditions. If this
	516	option is absent, all segments are processed.
	517	@end macro
	518
	519
	520	@macro parunselect
	521	@item @b{@minus{}@minus{}unselect=@var{fieldname}, @minus{}S @var{fieldname}}
	522	Select for processing only segments in which the field @var{fieldname}
	523	is absent. Multiple occurrences of this option are allowed and are
	524	interpreted as conjunction of conditions. If this option is absent,
	525	all segments are processed.
	526	@end macro
	527
	528
	529	@macro paroneline
	530	@item @b{@minus{}@minus{}one-line}
	531	This option makes the program print ambiguous annotation in one output
	532	line by generating multiple annotation fields. By default, when
	533	ambiguous annotation is produced for a segment, the segment is
	534	repeated and each of the annotations is added to a separate copy of
	535	the segment.
	536	@end macro
	537
	538
	539	@macro paronefield
	540	@item @b{@minus{}@minus{}one-field, @minus{}1}
	541	This option makes the program print ambiguous annotation in one
	542	annotation field. By default, when ambiguous annotation is produced
	543	for a segment, the segment is repeated and each of the
	544	annotations is added to a separate copy of the segment.
	545
	546	This option is useful when working with @command{kot} or @command{con}.
	547	@end macro
	548
	549
	550	@c ---------------------------------------------------------------------
	551	@c CONFIGURATION FILES
	552	@c ---------------------------------------------------------------------
	553
	554	@node Configuration files
	555	@chapter Configuration files
	556
	557	Values for all command line options accepted by a component
	558	may be set in configuration files. The default locations of the
	559	configuration files for a component named @command{@var{program}} are
	560
	561	@example
[246900a]	562	@file{/usr/local/etc/utt/@var{program}.conf}
[25ae32e]	563	@end example
	564
	565	for system-wide configuration file and
	566
	567	@example
[246900a]	568	@file{~/.utt/@var{program}.conf}
[25ae32e]	569	@end example
	570
	571	for user configuration file.
	572
	573	@c The configuration file to load may be also specified with the
	574	@c @option{--config} option. Configuration file need not be provided.
	575
	576	For each option, the value is set according to the following priority:
	577
	578	@itemize
	579	@item command line
	580	@c @item configuration file indicated with @option{--config} option
	581	@item user configuration file (or configuration file indicated with the @option{--config} option)
	582	@item system-wide configuration file
	583	@end itemize
	584
	585	Parameter values are specified in the following format:
	586
	587	@var{parametername}=@var{value}
	588
	589	where @var{parametername} is the short or long name of an option accepted by
	590	the program, or
	591
	592	@var{parametername}
	593
	594	if the option does not need arguments.
	595
	596	You can introduce comments to configuration files using the # sign.
	597
	598	If a program accepts multiple occurrences of an option (e.g. the select option of @command{lem}), you can specify them on separate lines of the program's configuration file.
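The file syntax and the priority order described above can be sketched in Python as follows. This is an illustration, not the parser used by the UTT programs: the names parse_config and effective_options are hypothetical, a # sign is assumed to start a comment anywhere on a line, and a higher-priority source is assumed to replace all occurrences of an option coming from a lower-priority one.

```python
def parse_config(text):
    """Parse configuration text into a dict mapping option name to a list of values."""
    options = {}
    for raw in text.splitlines():
        # Assumption: everything from '#' to the end of the line is a comment.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, value = line.partition("=")
        # An option without '=' takes no argument; it is recorded as True.
        options.setdefault(name.strip(), []).append(value.strip() if sep else True)
    return options


def effective_options(command_line, user, system):
    """Merge option dicts: command line over user file over system-wide file."""
    merged = dict(system)
    merged.update(user)
    merged.update(command_line)
    return merged


user_conf = parse_config(
    "# hypothetical user settings\n"
    "dictionary=/tmp/user.dic\n"
    "select=cor\n"
    "select=gue\n"
    "one-line\n"
)
system_conf = parse_config("dictionary=/tmp/system.bin\ninteractive\n")
result = effective_options({"dictionary": ["/tmp/cli.bin"]}, user_conf, system_conf)
```

The file paths and option values in the usage lines are made up for the example; repeated options are kept as a list so that multiple occurrences survive parsing.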
	599
	600	@c The equal sign may be omitted.
	601
	602
	603	@quotation Tip
	604	If you have two (or more) frequently used sets of options for the same
	605	program (e.g. lem with PMDBF dictionary and lem with a user dictionary)
	606	a good solution is to create two soft links to lem, called
	607	e.g. lemg and lemu and specify their configuration in files lemg.conf
	608	and lemu.conf respectively.
	609	@end quotation
	610
	611	@c ---------------------------------------------------------------------
	612	@c COMPONENTS
	613	@c ---------------------------------------------------------------------
	614
	615	@node UTT components
	616	@chapter UTT components
	617
	618	UTT components are of three types:
	619
	620	@menu
	621	Sources: programs which read non-UTT data (e.g. raw text) and produce output
	622	in UTT format
	623	* tok:: a tokenizer
	624
	625	Filters: programs which read and produce UTT-formatted data
	626	* lem:: a morphological analyzer
	627	* gue:: a morphological guesser
[261bf62]	628	* cor:: a simple spelling corrector
	629	* kor:: a more elaborate spelling corrector
[25ae32e]	630	* sen:: a sentencizer
	631	* ser:: a pattern search tool (marks matches)
[261bf62]	632	* mar:: a pattern search tool (introduces arbitrary markers into the text)
[25ae32e]	633	* grp:: a pattern search tool (selects sentences containing a match)
[261bf62]	634	@c * gph:: a word-graph annotation tool::
	635	@c * dgp:: a dependency parser
[25ae32e]	636
	637	Sinks: programs which read UTT data and produce output in another format
	638	* kot:: an untokenizer
	639	* con:: a concordance table generator
	640	@end menu
	641
	642	@c ---------------------------------------------------------------------
	643	@c TOK
	644	@c ---------------------------------------------------------------------
	645
	646	@page
	647	@node tok
	648	@section tok - a tokenizer
	649
	650	@c ----------------------------------------
	651
	652	@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
[19760ef]	653	@item @strong{Authors:} @tab Tomasz Obrębski
[25ae32e]	654	@item @strong{Component category:} @tab source
[261bf62]	655	@item @strong{Input format:} @tab raw text file
	656	@item @strong{Output format:} @tab UTT regular
	657	@item @strong{Required annotation:} @tab -
[25ae32e]	658	@end multitable
	659
	660
	661	@menu
	662	* tok description::
	663	* tok input::
	664	* tok output::
	665	* tok command line options::
	666	* tok example::
	667	@end menu
	668
	669	@node tok description
	670	@subsection Description
	671
	672	@code{tok} is a simple program which reads a text file and identifies
	673	tokens on the basis of their orthographic form. The type of the token
	674	is printed as the @var{type} field.
	675
	676	@node tok input
	677	@subsection Input
	678
	679	Raw text.
	680
	681	@node tok output
	682	@subsection Output
	683
	684	UTT-file with four fields: @var{start}, @var{length}, @var{type}, and @var{form}. In the @var{type} field five types of tokens are distinguished:
	685
	686	@itemize
	687
	688	@item @code{W}
	689	(word)
	690	- continuous sequence of letters
	691
	692	@item @code{N}
	693	(number)
	694	- continuous sequence of digits
	695
	696	@item @code{S}
	697	(space)
	698	- continuous sequence of space characters
	699
	700	@item @code{P}
	701	(punctuation mark)
	702	- single printable characters not belonging to any of the other classes
	703
	704	@item @code{B}
	705	(unprintable character)
	706	- single unprintable character
	707
	708	@end itemize
	709
	710
	711
	712	@node tok command line options
	713	@subsection Command line options
	714
	715	@table @code
	716
	717	@item @b{@minus{}@minus{}help}, @b{@minus{}h}
	718	Print help.
	719
	720	@item @b{@minus{}@minus{}version}, @b{@minus{}V}
	721	Print version information.
	722
	723	@item @b{@minus{}@minus{}interactive, @minus{}i}
	724	This option toggles interactive mode, which is by default off. In the
	725	interactive mode the program does not buffer the output.
	726
	727	@end table
	728
	729	@node tok example
	730	@subsection Example
	731
	732	Input:
	733
	734	@example
	735	Piszemy dobre programy.
	736	@end example
	737
	738	Output:
	739
	740	@example
	741	0000 07 W Piszemy
	742	0007 01 S _
	743	0008 05 W dobre
	744	0013 01 S _
	745	0014 08 W programy
	746	0022 01 P .
	747	0023 01 S \n
	748	@end example
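The token classes and the output layout above can be imitated with a short Python sketch. It is not the implementation of tok: the names classify and tokenize are hypothetical, Python's own character tests stand in for whatever tests tok uses, positions are counted in characters, and escaping of the special character * is not handled.

```python
# Written forms of characters that cannot appear literally in the form field.
FORM_ESCAPES = {" ": "_", "\n": "\\n", "\t": "\\t", "\r": "\\r", "_": "\\_", "\\": "\\\\"}


def classify(ch):
    """Map a character to one of the five token types."""
    if ch.isalpha():
        return "W"
    if ch.isdigit():
        return "N"
    if ch.isspace():
        return "S"
    if ch.isprintable():
        return "P"
    return "B"


def tokenize(text):
    """Return UTT lines with start, length, type and form fields."""
    lines = []
    i = 0
    while i < len(text):
        kind = classify(text[i])
        j = i + 1
        # Words, numbers and spaces are runs; P and B tokens are single characters.
        if kind in ("W", "N", "S"):
            while j < len(text) and classify(text[j]) == kind:
                j += 1
        form = "".join(FORM_ESCAPES.get(c, c) for c in text[i:j])
        lines.append("%04d %02d %s %s" % (i, j - i, kind, form))
        i = j
    return lines


for line in tokenize("Piszemy dobre programy.\n"):
    print(line)
```

Run on the input of the example above, the loop prints the same seven lines as the tok output shown there.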
	749
	750
	751	@c ---------------------------------------------------------------------
	752	@c SEN
	753	@c ---------------------------------------------------------------------
	754
	755	@c @node sen - sentencizer
	756	@c @chapter sen - sentencizer
	757
[19760ef]	758	@c Authors: Tomasz Obrębski
[25ae32e]	759
	760	@c ---------------------------------------------------------------------
	761	@c LEM
	762	@c ---------------------------------------------------------------------
	763
	764	@page
	765	@node lem
	766	@section lem - morphological analyzer
	767
	768	@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
[19760ef]	769	@item @strong{Authors:} @tab Tomasz Obrębski, Michał Stolarski
[25ae32e]	770	@item @strong{Component category:} @tab filter
[261bf62]	771	@item @strong{Input format:} @tab UTT regular
	772	@item @strong{Output format:} @tab UTT regular
	773	@item @strong{Required annotation:} @tab tok
[25ae32e]	774	@end multitable
	775
	776	@menu
	777	* lem description::
	778	* lem command line options::
	779	* lem input::
	780	* lem output::
	781	* lem example::
	782	* lem dictionaries::
	783	* lem hints::
	784	@end menu
	785
	786	@node lem description
	787	@subsection Description
	788
	789	@command{lem} performs morphological analysis of a simple orthographic
	790	word, returning all its possible morphological annotations,
	791	disregarding the context.
	792
	793	@c ----------------------------------------
	794
	795	@node lem command line options
	796	@subsection Command line options
	797
	798	@table @code
	799	@parhelp
	800	@parversion
	801	@parinteractive
	802	@c @parfile
	803	@c @paroutput
	804	@c @parfail
	805	@c @parcopy
	806	@parinputfield
	807	@paroutputfield
	808	@pardictionary
	809	@parprocess
	810	@parselect
	811	@parunselect
	812	@paroneline
	813	@paronefield
	814	@end table
	815
	816	@c ----------------------------------------
	817
	818	@node lem input
	819	@subsection Input
	820
	821	@command{lem} reads a UTT file and processes the value of the @var{form} field
	822	(the input field may be changed with @option{--input-field} option).
	823
	824	@node lem output
	825	@subsection Output
	826
	827	@command{lem} adds a new annotation field, whose default name is @code{lem}. In
	828	case of ambiguity either the segment is repeated (default),
	829	multiple @code{lem} fields are added (@option{--one-line}) or ambiguous
	830	annotation is produced as the value of a single @code{lem} field (option
	831	@option{--one-field,-1}):
	832
	833	@itemize @bullet
	834
	835	@item
	836	unambiguous value format:
	837
	838	@example
	839	<lemma>,<descr>
	840	@end example
	841
	842	@item
	843	ambiguous value format (@option{--one-field} option)
	844
	845
	846	@example
	847	<lemma>,<descr>[,<descr>][;<lemma>,<descr>[,<descr>]]
	848	@end example
	849
	850	(alternative descriptions for the same lemma are separated by commas,
	851	alternative lemmata are separated by semicolons.)
	852
	853	@end itemize
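A consumer of the one-field format could split such a value as in the following Python sketch. The name parse_lem_value is hypothetical, and the sketch assumes that neither lemmata nor descriptions contain a comma or a semicolon.

```python
def parse_lem_value(value):
    """Split a lem annotation value into (lemma, [description, ...]) pairs."""
    result = []
    # Alternative lemmata are separated by semicolons.
    for alternative in value.split(";"):
        # The first comma-separated item is the lemma, the rest are descriptions.
        lemma, *descriptions = alternative.split(",")
        result.append((lemma, descriptions))
    return result
```

An unambiguous value is simply the special case with one lemma and one description.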
	854
	855	@node lem example
	856	@subsection Example
	857
	858	Input:
	859
	860	@example
	861	0000 07 W Piszemy
	862	0007 01 S _
	863	0008 05 W dobre
	864	0013 01 S _
	865	0014 08 W programy
	866	0022 01 P .
	867	0023 01 S \n
	868	@end example
	869
	870	Output (default):
	871
	872	@example
[19760ef]	873	0000 07 W Piszemy lem:pisać,V/AiVpMdTrfNpP1
[25ae32e]	874	0007 01 S _
	875	0008 05 W dobre lem:dobry,ADJ/DpNpCnavGaifn
	876	0008 05 W dobre lem:dobry,ADJ/DpNsCnavGn
	877	0013 01 S _
	878	0014 08 W programy lem:program,N/GiNpCa
	879	0014 08 W programy lem:program,N/GiNpCn
	880	0014 08 W programy lem:program,N/GiNpCv
	881	0022 01 P .
	882	0023 01 S \n
	883	@end example
	884
	885	Output (@option{--one-line} option):
	886
	887	@example
[19760ef]	888	0000 07 W Piszemy lem:pisać,V/AiVpMdTrfNpP1
[25ae32e]	889	0007 01 S _
	890	0008 05 W dobre lem:dobry,ADJ/DpNpCnavGaifn lem:dobry,ADJ/DpNsCnavGn
	891	0013 01 S _
	892	0014 08 W programy lem:program,N/GiNpCa lem:program,N/GiNpCn lem:program,N/GiNpCv
	893	0022 01 P .
	894	0023 01 S \n
	895	@end example
	896
	897	Output (@option{--one-field} option):
	898
	899	@example
[19760ef]	900	0000 07 W Piszemy lem:pisać,V/AiVpMdTrfNpP1
[25ae32e]	901	0007 01 S _
	902	0008 05 W dobre lem:dobry,ADJ/DpNpCnavGaifn,ADJ/DpNsCnavGn
	903	0013 01 S _
	904	0014 08 W programy lem:program,N/GiNpCa,N/GiNpCn,N/GiNpCv
	905	0022 01 P .
	906	0023 01 S \n
	907	@end example
	908
	909	@c ----------------------------------------
	910
	911	@node lem dictionaries
	912	@subsection Dictionaries
	913
	914	@command{lem} requires a dictionary. The dictionary may be provided in
	915	one of two formats: in text (source) format or in binary (fsa) format.
	916
	917	@subsubheading Text format
	918
	919	Dictionary entries have the following structure:
	920
	921	@example
	922	<form>;<lemma>,<descr>[;<lemma>,<descr>]
	923	@end example
	924
	925	@var{lemma} may be given explicitly or in the cut-add format:
	926
	927	@example
	928	@code{[<cut1><add1>-]<cut2><add2>}
	929	@end example
	930
	931	meaning: replace prefix of length @code{<cut1>} with
	932	string @code{<add1>}, replace suffix of length @code{<cut2>} with string
	933	@code{<add2>}. For example @code{3t} transforms @samp{kocie} into
[19760ef]	934	@samp{kot}, @code{3-4ały} transforms @samp{najbielsi} into @samp{biały}.
[25ae32e]	935
	936	Each dictionary entry must be written in one line and must not contain blank characters.
	937
	938	Examples:
	939	@example
	940	kot;0,N/GaNsCn
	941	kota;1,N/GaNsCg;1,N/GaNsCa
	942	kotu;1,N/GaNsCd
	943	kotem;2,N/GaNsCi
	944	kocie;3t,N/GaNsCl;3t,N/GaNsCv
[19760ef]	945	najbielsi;3-4ały,ADJ/DsNpCnGp
	946	najbielsze;3-5ały,ADJ/DsNpCnGaifn
[25ae32e]	947	najlepsi;dobry,ADJ/DsNpCnGp
	948	najlepsze;dobry,ADJ/DsNpCnGaifn
	949	@end example
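To make the cut-add notation concrete, the following Python sketch expands the lemma part of a text dictionary entry. It is an illustration, not code from lem or compiledic: the names apply_cut_add and expand_entry are hypothetical, and any lemma specification that does not match the cut-add pattern is assumed to be an explicit lemma.

```python
import re

# [<cut1><add1>-]<cut2><add2>; add1 is assumed not to contain a hyphen.
CUT_ADD = re.compile(r"^(?:(\d+)([^-]*)-)?(\d+)(.*)$")


def apply_cut_add(form, lemma_spec):
    """Return the lemma of form for an explicit or cut-add lemma specification."""
    match = CUT_ADD.match(lemma_spec)
    if match is None:
        return lemma_spec  # explicit lemma, e.g. "dobry"
    cut1, add1, cut2, add2 = match.groups()
    stem = form
    if cut1 is not None:
        # Replace a prefix of length cut1 with add1.
        stem = add1 + stem[int(cut1):]
    # Replace a suffix of length cut2 with add2.
    stem = stem[:max(len(stem) - int(cut2), 0)]
    return stem + add2


def expand_entry(entry):
    """Turn '<form>;<lemma>,<descr>[;...]' into a list of (lemma, descr) pairs."""
    form, *analyses = entry.split(";")
    pairs = []
    for analysis in analyses:
        lemma_spec, descr = analysis.split(",", 1)
        pairs.append((apply_cut_add(form, lemma_spec), descr))
    return pairs
```

With the entries listed above, every form of kot expands to the lemma kot, both najbielsi and najbielsze expand to biały, and najlepsi keeps its explicit lemma dobry.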
	950
	951
	952	The mandatory file name extension for a text dictionary is @code{dic}. For large
	953	dictionaries it is preferable, however, to compile them into binary
	954	(fsa) format.
	955
	956	@subsubheading Binary format
	957
	958	The mandatory file name extension for a binary dictionary is @code{bin}. To
	959	compile a text dictionary into binary format, write:
	960
	961	@example
	962	compiledic <dictionaryname>.dic
	963	@end example
	964
	965	@subsubheading Polex/PMDBF dictionary
	966
	967	A large-coverage morphological dictionary for Polish, Polex/PMDBF, is included in
	968	the distribution as the default dictionary of @command{lem}. It is
	969	located by default in:
	970
[261bf62]	971	@file{$HOME/.local/share/utt/pl_PL.ISO-8859-2/lem.bin}
	972
	973	in local installation or in
	974
	975	@file{/usr/local/share/utt/pl_PL.ISO-8859-2/lem.bin}
	976
	977	in system installation.
[25ae32e]	978
	979	@node lem hints
	980	@subsection Hints
	981
[261bf62]	982	@subsubheading Combining data from multiple dictionaries
[25ae32e]	983
[261bf62]	984	@itemize
[25ae32e]	985
[261bf62]	986	@item Apply <dict1>, then apply <dict2> to words which were not annotated.
[25ae32e]	987
[261bf62]	988	@example
	989	lem -d <dict1> | lem -S lem -d <dict2>
	990	@end example
[25ae32e]	991
[261bf62]	992	@item Add annotations from two dictionaries <dict1> and <dict2>.
[25ae32e]	993
[261bf62]	994	@example
	995	lem -c -d <dict1> | lem -S lem -d <dict2>
	996	@end example
[25ae32e]	997
[261bf62]	998	@end itemize
[25ae32e]	999
	1000
	1001	@c ---------------------------------------------------------------------
	1002	@c GUE
	1003	@c ---------------------------------------------------------------------
	1004
	1005	@page
	1006	@node gue
	1007	@section gue - morphological guesser
	1008
	1009	@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
	1010
[19760ef]	1011	@item @strong{Authors:} @tab Michał Stolarski, Tomasz Obrębski
[25ae32e]	1012	@item @strong{Component category:} @tab filter
	1013
	1014	@end multitable
	1015
	1016	@menu
[261bf62]	1017	* gue description::
[25ae32e]	1018	* gue command line options::
	1019	* gue example::
	1020	* gue dictionaries::
	1021	@end menu
	1022
[261bf62]	1023
	1024	@node gue description
	1025	@subsection Description
	1026
	1027	@command{gue} guesses morphological descriptions of the form contained
	1028	in the @var{form} field.
	1029
	1030
[25ae32e]	1031	@node gue command line options
	1032	@subsection Command line options
	1033
	1034	@table @code
	1035
	1036	@parhelp
	1037	@parversion
	1038	@parinteractive
	1039	@c @parfile
	1040	@c @paroutput
	1041	@c @parfail
	1042	@c @parcopy
	1043	@parinputfield
	1044	@paroutputfield
	1045	@pardictionary
	1046	@parprocess
	1047	@parselect
	1048	@parunselect
	1049	@paroneline
	1050	@paronefield
	1051
	1052	@item @b{@minus{}@minus{}delta=@var{n}}
	1053	Stop displaying answers after a drop in weight, that is, when the weight difference between two consecutive results exceeds the delta value (default=`0.2').
	1054
	1055
	1056	@item @b{@minus{}@minus{}cut-off=@var{n}}
	1057	Do not display answers whose weight is lower than the cut-off value (default=`200').
	1058
	1059
	1060	@item @b{@minus{}@minus{}guess_count=@var{n}, @minus{}n @var{n}}
	1061	Guess up to @var{n} descriptions (default=`0', which means `display all results').
	1062
	1063
	1064
	1065	@end table
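
How the three filtering options interact can be sketched in Python. This is only an illustration of the descriptions above, not the actual @command{gue} code: the function name @code{filter_guesses} is hypothetical, the order in which the checks are applied is an assumption, and the manual does not state on what scale the delta value is compared with the weights, so the sketch simply compares raw differences.

```python
def filter_guesses(guesses, delta=None, cut_off=None, guess_count=0):
    """Filter (description, weight) pairs as the three options are described.

    delta       -- stop once the weight drops by more than this between two
                   consecutive results (None disables the check)
    cut_off     -- discard results whose weight is below this value
    guess_count -- keep at most this many results (0 means keep all)
    """
    # Best guesses first; sorted() is stable, so ties keep their input order.
    ranked = sorted(guesses, key=lambda pair: pair[1], reverse=True)
    kept = []
    previous_weight = None
    for description, weight in ranked:
        if cut_off is not None and weight < cut_off:
            break  # the list is sorted, so all remaining weights are lower
        if (delta is not None and previous_weight is not None
                and previous_weight - weight > delta):
            break  # assumed meaning of "fall of weight"
        if guess_count and len(kept) >= guess_count:
            break
        kept.append((description, weight))
        previous_weight = weight
    return kept
```

For example, with the made-up weights 900, 850, 400 and 100, a delta of 100 keeps only the first two guesses, because the drop from 850 to 400 exceeds it.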
	1066
	1067	@node gue example
	1068	@subsection Example
	1069
	1070	@example
	1071	command: gue -n 2
	1072
	1073	input:
	1074	0000 07 W smerfny
	1075
	1076	output:
	1077	0000 07 W smerfny gue:,ADJ/CaDpGiNs
	1078	0000 07 W smerfny gue:,ADJ/CnvDpGaipNs
	1079	@end example
	1080
	1081
	1082	@node gue dictionaries
	1083	@subsection Dictionaries
	1084
	1085	@command{gue} requires a dictionary. For now, the dictionary must be provided in binary (fsa) format.
	1086	The fsa format is created by compiling text-format dictionaries.
	1087
	1088
	1089
	1090	@subsubheading Text format
	1091
	1092	Dictionary entries have the following structure:
	1093
	1094	@example
	1095	@var{prefix}@code{*}@var{suffix}@code{;}@var{lemma}@code{,}@var{description}@code{:}@var{weight}
	1096	@end example
	1097
	1098	@var{lemma} must be given in the cut-add format:
	1099
	1100	@example
	1101	@code{[<cut1><add1>-]<cut2><add2>}
	1102	@end example
	1103	(no spaces in between): replace the prefix of length @var{cut1} with the
	1104	string @var{add1}, and replace the suffix of length @var{cut2} with the string
	1105	@var{add2}.
	1106
	1107
[19760ef]	1108	Example: @code{3-4ały} transforms @i{najbielsi} into @i{biały}.
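
The cut-add transformation can be sketched in a few lines of Python. The function name @code{apply_cut_add} and the regular expression used to split the rule are assumptions made for this illustration (in particular, the added strings are assumed not to start with a digit); lengths are counted in characters.

```python
import re

# Optional "<cut1><add1>-" part, then the mandatory "<cut2><add2>" part.
CUT_ADD = re.compile(r"^(?:(\d+)([^\d-]*)-)?(\d+)(.*)$")


def apply_cut_add(form, rule):
    """Apply a cut-add rule such as '3-4ały' to a word form."""
    match = CUT_ADD.match(rule)
    if match is None:
        raise ValueError(f"not a cut-add rule: {rule!r}")
    cut1, add1, cut2, add2 = match.groups()
    if cut1 is not None:
        # Replace the prefix of length cut1 with add1.
        form = add1 + form[int(cut1):]
    # Replace the suffix of length cut2 with add2.
    stem = form[:len(form) - int(cut2)]
    return stem + add2


print(apply_cut_add("najbielsi", "3-4ały"))
```

The call at the end reproduces the example given above: the prefix @i{naj} and the suffix @i{elsi} are cut, and @i{ały} is appended to what remains.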
[25ae32e]	1109
	1110
	1111	@var{description} contains the part of speech and morphosyntactic information (@pxref{PMDBF dictionary}).
	1112
	1113	@var{weight} is an integer value between 1 and 999 indicating the
	1114	likelihood of the guess.
	1115
	1116	@example
[19760ef]	1117	*łkę;1a,N/GfNsCa
	1118	naj*elszy;3-4ały,ADJ/...:...
[25ae32e]	1119	@end example
	1120
	1121
	1122	@c ---------------------------------------------------------------------
	1123	@c COR
	1124	@c ---------------------------------------------------------------------
	1125
	1126	@page
	1127	@node cor
	1128	@section cor - spelling corrector
	1129
	1130	@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
[19760ef]	1131	@item @strong{Authors:} @tab Tomasz Obrębski, Michał Stolarski
[25ae32e]	1132	@item @strong{Component category:} @tab filter
[261bf62]	1133	@item @strong{Input format:} @tab UTT regular
	1134	@item @strong{Output format:} @tab UTT regular
	1135	@item @strong{Required annotation:} @tab tok
[25ae32e]	1136	@end multitable
	1137
[261bf62]	1138	@menu
	1139	* cor description::
	1140	* cor command line options::
	1141	* cor dictionaries::
	1142	@end menu
	1143
	1144
	1145	@node cor description
	1146	@subsection Description
	1147
[25ae32e]	1148	The spelling corrector applies Kemal Oflazer's dynamic programming
	1149	algorithm @cite{oflazer96} to the FSA representation of the set of
	1150	word forms of the Polex/PMDBF dictionary. Given an incorrect
	1151	word form, it returns all word forms present in the dictionary whose
	1152	edit distance from that form does not exceed the threshold given as a parameter.
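
The effect of the threshold can be illustrated with a brute-force Python sketch. It is not Oflazer's algorithm: it computes the plain Levenshtein distance to every dictionary word separately instead of traversing an automaton, and the exact set of edit operations used by @command{cor} is an assumption here. The threshold is treated as an inclusive maximum, which is how the @option{--distance} option is described; the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions cost 1."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def corrections(form, dictionary, max_distance=1):
    """Return all dictionary words within max_distance edits of form."""
    return sorted(word for word in dictionary
                  if edit_distance(form, word) <= max_distance)
```

Running the distance computation against every word is far slower than the FSA-based search, which is why the real tool works on a compiled dictionary.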
	1153
	1154
	1155	@node cor command line options
	1156	@subsection Command line options
	1157
	1158	@table @code
	1159
	1160	@parhelp
	1161	@parversion
	1162	@parinteractive
	1163	@c @parfile
	1164	@c @paroutput
	1165	@c @parfail
	1166	@c @parcopy
	1167	@parinputfield
	1168	@paroutputfield
	1169	@pardictionary
	1170	@parprocess
	1171	@parselect
	1172	@parunselect
	1173	@paroneline
	1174	@paronefield
	1175
	1176	@item @b{@minus{}@minus{}distance=@var{int}, @minus{}n @var{int}}
	1177	Maximum edit distance (default='1').
	1178
[261bf62]	1179	@c @item @b{@minus{}@minus{}replace, @minus{}r}
	1180	@c Replace original form with corrected form, place original form in the
	1181	@c cor field. This option has no effect in @option{--one-*} modes (default=off)
	1182
[25ae32e]	1183
	1184	@end table
	1185
	1186	@node cor dictionaries
	1187	@subsection Dictionaries
	1188
	1189	@command{cor} requires a dictionary. The dictionary has to be provided in binary (fsa) format.
	1190	The fsa format is created by compiling text-format dictionaries.
	1191
	1192	@subsubheading Text format
	1193
	1194	The @command{cor} dictionary is a list of words:
	1195	@example
	1196	odlot
	1197	odlotowy
	1198	odludek
	1199	@end example
	1200
[261bf62]	1201	@subsubheading Binary format
	1202
	1203	The mandatory file name extension for a binary dictionary is @code{bin}. To
	1204	compile a text dictionary into binary format, write:
	1205
	1206	@example
	1207	compiledic <dictionaryname>.dic
	1208	@end example
	1209
	1210	@c ---------------------------------------------------------------------
	1211	@c KOR
	1212	@c ---------------------------------------------------------------------
	1213
	1214	@page
	1215	@node kor
	1216	@section kor - configurable spelling corrector
	1217
	1218	[TODO]
	1219
	1220	@c ---------------------------------------------------------------------
	1221	@c SEN
	1222	@c ---------------------------------------------------------------------
	1223
[25ae32e]	1224	@page
	1225	@node sen
	1226	@section sen - a sentensizer
	1227
	1228	@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
	1229
[19760ef]	1230	@item @strong{Authors:} @tab Tomasz Obrębski
[25ae32e]	1231	@item @strong{Component category:} @tab filter
[261bf62]	1232	@item @strong{Input format:} @tab UTT regular
	1233	@item @strong{Output format:} @tab UTT regular
	1234	@item @strong{Required annotation:} @tab tok
[25ae32e]	1235
	1236	@end multitable
	1237
	1238
	1239	@menu
[261bf62]	1240	* sen description::
[25ae32e]	1241	@c * sen input::
	1242	@c * sen output::
	1243	* sen example::
	1244	@end menu
	1245
[261bf62]	1246	@node sen description
	1247	@subsection Description
	1248
	1249	@command{sen} detects sentence boundaries in UTT-formatted texts and marks them with special zero-length segments, in which the @var{type} field may contain the BOS (beginning of sentence) or EOS (end of sentence) annotation.
	1250
[25ae32e]	1251	@node sen example
	1252	@subsection Example
	1253
	1254	@example
	1255	command: sen
	1256
	1257	input:
[19760ef]	1258	0000 05 W Cześć
[25ae32e]	1259	0005 01 P !
	1260	0006 01 S _
	1261	0007 02 W To
	1262	0009 01 S _
	1263	0010 02 W ja
	1264	0012 01 P .
	1265	0013 01 S \n
	1266
	1267	output:
	1268	0000 00 BOS *
[19760ef]	1269	0000 05 W Cześć
[25ae32e]	1270	0005 01 P !
	1271	0006 00 EOS *
	1272	0006 00 BOS *
	1273	0006 01 S _
	1274	0007 02 W To
	1275	0009 01 S _
	1276	0010 02 W ja
	1277	0012 01 P .
	1278	0013 01 S \n
	1279	0014 00 EOS *
	1280	@end example
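
The shape of the output can be reproduced with a small Python sketch. The splitting rule used here is purely an assumption chosen so that the sketch yields the output shown in the example: a sentence ends right after a @code{.}, @code{!} or @code{?} punctuation segment if a word segment still follows, and the last sentence is closed at the end of the input. The rules actually applied by @command{sen} are not described in this section; the function names are hypothetical.

```python
SENTENCE_FINAL = {".", "!", "?"}   # assumed set of sentence-final forms


def marker(position, kind):
    """Build a zero-length BOS/EOS segment line at the given position."""
    return f"{position:04d} 00 {kind} *"


def mark_sentences(lines):
    """Insert BOS/EOS marker segments into a list of UTT segment lines."""
    segments = []
    for line in lines:
        start, length, seg_type, form = line.split()[:4]
        segments.append((int(start), int(length), seg_type, form))
    if not segments:
        return []
    # Index of the last word segment: no sentence break is placed after it.
    last_word = max((i for i, seg in enumerate(segments) if seg[2] == "W"),
                    default=-1)
    position = segments[0][0]
    output = [marker(position, "BOS")]
    for i, (start, length, seg_type, form) in enumerate(segments):
        output.append(lines[i])
        position = start + length
        if seg_type == "P" and form in SENTENCE_FINAL and i < last_word:
            output.append(marker(position, "EOS"))
            output.append(marker(position, "BOS"))
    output.append(marker(position, "EOS"))
    return output
```

Note that the markers have length 00 and therefore do not shift the positions of the original segments.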
	1281
	1282
	1283	@c ---------------------------------------------------------------------
	1284	@c GPH
	1285	@c ---------------------------------------------------------------------
	1286
	1287	@c @node gph - graphizer
	1288	@c @chapter gph - graphizer
	1289
[19760ef]	1290	@c Authors: Tomasz Obrębski
[25ae32e]	1291
	1292
	1293
	1294	@c ---------------------------------------------------------------------
[261bf62]	1295	@c SER
[25ae32e]	1296	@c ---------------------------------------------------------------------
	1297
	1298	@page
	1299	@node ser
	1300	@section ser - pattern search tool
	1301
	1302	@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
[19760ef]	1303	@item @strong{Authors:} @tab Tomasz Obrębski
[25ae32e]	1304	@item @strong{Component category:} @tab filter
[261bf62]	1305	@item @strong{Input format:} @tab UTT regular
	1306	@item @strong{Output format:} @tab UTT regular
	1307	@item @strong{Required annotation:} @tab tok, lem --one-field
[25ae32e]	1308	@end multitable
	1309
	1310	@menu
[261bf62]	1311	* ser description::
[25ae32e]	1312	* ser command line options::
	1313	* ser pattern::
	1314	* ser how ser works::
	1315	* ser customization::
	1316	* ser limitations::
	1317	* ser requirements::
	1318	@end menu
	1319
	1320
[261bf62]	1321	@node ser description
	1322	@subsection Description
	1323
	1324	@command{ser} looks for patterns in UTT-formatted texts.
	1325
	1326
[25ae32e]	1327	@c ---------------------------------------------------------------------
	1328	@node ser command line options
	1329	@subsection Command line options
	1330
	1331	@table @code
	1332
	1333	@parhelp
	1334	@parversion
	1335	@c @parfile
	1336	@c @paroutput
	1337	@c @parinputfield
	1338	@c @paroutputfield
	1339	@parprocess
	1340	@parinteractive
	1341
	1342	@item @b{@minus{}@minus{}pattern=@var{pattern}, @minus{}e @var{pattern}}
	1343	The search pattern.
	1344
	1345	@item @b{@minus{}@minus{}morph=@var{field}}
	1346	The name of the annotation field containing the morphological
	1347	description (default @code{lem}).
	1348
	1349	@item @b{@minus{}@minus{}flex}
	1350	Only print the generated flex source code.
	1351
	1352	@item @b{@minus{}@minus{}macro=@var{filename}}
	1353	Read macro definitions from file @var{filename} rather than from the
	1354	default location. This option makes it possible to redefine the set of terms.
	1355
	1356	@item @b{@minus{}@minus{}define=@var{filename}}
	1357	Append macro definitions from file @var{filename}. This option
	1358	makes it possible to extend the set of terms.
	1359
	1360	@end table
	1361
	1362
	1363	@c ---------------------------------------------------------------------
	1364	@node ser pattern
	1365	@subsection Pattern
	1366
	1367	The @command{ser} pattern is a regular expression over terms corresponding
	1368	to text segments or segment sequences. Predefined terms are:
	1369
	1370	@table @code
	1371
	1372	@item seg(@var{t},@var{f},@var{a})
	1373	a segment of type @var{t}, containing form @var{f} and annotation
	1374	@var{a}
	1375
	1376	@item form(@var{f})
	1377	a segment containing form @var{f}
	1378
	1379	@item field(@var{f})
	1380	a segment containing annotation field @var{f}
	1381
	1382	@item space(@var{f})
	1383	a space segment of form @var{f}
	1384
	1385	@item word(@var{f})
	1386	a word segment of form @var{f}
	1387
	1388	@item punct(@var{f})
	1389	a punct segment of form @var{f}
	1390
	1391	@item number(@var{f})
	1392	a number segment of form @var{f}
	1393
	1394	@item lexeme(@var{f})
	1395	a word segment with lemma @var{f}
	1396
	1397	@item cat(@var{c})
	1398	a word segment of category @var{c}
	1399
	1400	@end table
	1401
	1402	All arguments are optional. If an argument is omitted, an arbitrary
	1403	string of non-blank characters is assumed as the argument value. Term
	1404	arguments may be arbitrary character-level regular expressions. The
	1405	following special symbols can be used:
	1406
	1407	@multitable {aaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
	1408	@item @code{[@dots{}]} @tab a character class
	1409	@item @code{[^@dots{}]} @tab a negated character class
	1410	@item @code{|} @tab alternative
	1411	@item @code{*} @tab repetition, including zero times
	1412	@item @code{+} @tab repetition, at least one time
	1413	@item @code{?} @tab optionality
	1414	@item @code{@{@var{m},@var{n}@}} @tab repetition from @var{m} to @var{n} times
	1415	@item @code{@{@var{m},@}} @tab repetition @var{m} or more times
	1416	@item @code{@{@var{m}@}} @tab repetition @var{m} times
	1417	@item @code{\@var{ddd}} @tab the character with octal value @var{ddd}
	1418	@item @code{\x@var{hh}} @tab the character with hexadecimal value @var{hh}
	1419	@item @code{( )} @tab parentheses, used to override precedence
	1420	@c @end multitable
	1421
	1422	@c @multitable {aaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
	1423	@item @code{.} @tab a non-blank character
	1424	@item @code{\w} @tab a letter
	1425	@item @code{\W} @tab a non-blank character other than a letter
	1426	@item @code{\d} @tab a digit
	1427	@item @code{\D} @tab a non-blank character other than a digit
	1428	@item @code{\s} @tab a space or tab character
	1429	@item @code{\S} @tab a non-blank character (the same as @code{.})
	1430	@item @code{\l} @tab a lowercase letter
	1431	@item @code{\L} @tab an uppercase letter
	1432	@end multitable
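
To make the non-standard meaning of some symbols concrete, the following Python sketch translates a term argument into an ordinary Python regular expression. It is an illustration only: the function name @code{to_python_regex} is hypothetical, the letter classes are assumed to cover the basic Latin and Polish letters, and special symbols occurring inside character classes are not handled.

```python
import re

LOWER = "a-ząćęłńóśźż"   # assumed set of lowercase letters
UPPER = "A-ZĄĆĘŁŃÓŚŹŻ"   # assumed set of uppercase letters

ESCAPES = {
    "w": f"[{LOWER}{UPPER}]",      # a letter
    "W": f"[^\\s{LOWER}{UPPER}]",  # non-blank, not a letter
    "d": "[0-9]",                  # a digit
    "D": "[^\\s0-9]",              # non-blank, not a digit
    "s": "[ \\t]",                 # a space or tab character
    "S": "\\S",                    # a non-blank character
    "l": f"[{LOWER}]",             # a lowercase letter
    "L": f"[{UPPER}]",             # an uppercase letter
}


def to_python_regex(argument):
    """Translate the special symbols of a term argument for Python's re."""
    out = []
    i = 0
    while i < len(argument):
        char = argument[i]
        if char == "\\" and i + 1 < len(argument):
            following = argument[i + 1]
            # Known symbols are rewritten, other escapes are kept as they are.
            out.append(ESCAPES.get(following, "\\" + following))
            i += 2
        elif char == ".":
            out.append("\\S")      # '.' never matches a blank
            i += 1
        else:
            out.append(char)
            i += 1
    return "".join(out)


print(bool(re.fullmatch(to_python_regex(r"\L\l+"), "Kowalski")))
```

The main point is that @code{.} is narrowed to non-blank characters, so a term argument can never match across a field separator.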
	1433
	1434
	1435	@noindent The following characters:
	1436	@example
	1437	@verb{% [ ] ^ | * + ? { } , . < > \ %}
	1438	@end example
	1439	must be escaped with a backslash, i.e. written as:
	1440	@example
	1441	@verb{% \[ \] \^ \| \* \+ \? \{ \} \, \. \< \> \\ %}
	1442	@end example
	1443
	1444	@quotation Note
	1445	The special symbols are borrowed from Perl, with minor
	1446	modifications made for convenience.
	1447	The meaning of certain special characters and sequences therefore differs
	1448	slightly from their usual meaning.
	1449	The meaning of the @code{.} special character is modified due to
	1450	the special function of spaces in UTT files (they are field
	1451	separators). Use @code{\s} to explicitly match a space or tab character.
	1452	@end quotation
	1453
	1454	In the argument of the @code{cat} term a special operator <...> may be
	1455	used. A category specification enclosed in angle brackets matches all
	1456	category descriptions which are consistent (non-contradictory) with the
	1457	specification. For example @code{<N>} matches all noun descriptions,
	1458	@code{<ADJ/Can>} matches all adjectives in the accusative or nominative case.
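
The notion of consistency can be sketched in Python as follows. The sketch rests on assumptions about the description format that are inferred from the examples in this manual only: an attribute is written as one uppercase letter followed by one or more one-character values, two descriptions contradict each other when they share an attribute but no value for it, and an attribute absent from the description never causes a contradiction. The function names are hypothetical.

```python
import re

# One uppercase attribute name followed by its (lowercase) values.
ATTRIBUTE = re.compile(r"([A-Z])([a-z0-9]*)")


def parse_category(category):
    """Split 'ADJ/CnGaifn' into ('ADJ', {'C': {'n'}, 'G': {'a','i','f','n'}})."""
    pos, _, attributes = category.partition("/")
    parsed = {name: set(values) for name, values in ATTRIBUTE.findall(attributes)}
    return pos, parsed


def is_consistent(specification, description):
    """Check whether description is non-contradictory with specification."""
    spec_pos, spec_attrs = parse_category(specification)
    desc_pos, desc_attrs = parse_category(description)
    if spec_pos != desc_pos:
        return False
    for name, wanted in spec_attrs.items():
        if name in desc_attrs and not (wanted & desc_attrs[name]):
            return False   # shared attribute without a common value
    return True


print(is_consistent("ADJ/Can", "ADJ/DsNpCnGaifn"))
```

The specification is passed without the angle brackets, so the last line checks what @code{<ADJ/Can>} would match.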
	1459
	1460
	1461	@*
	1462	@noindent @b{Examples of one-segment patterns:}
	1463
	1464	@multitable {aaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
	1465	@item @code{seg} @tab any segment
	1466	@item @code{word} @tab any word-form
	1467	@item @code{word(pomocy)} @tab the word-form @samp{pomocy}
	1468	@item @code{word(naj.+)} @tab a word-form beginning with @samp{naj}
	1469	@item @code{word(\L\l+)} @tab a capitalized word-form
	1470	@item @code{punct} @tab a punctuation character
	1471	@item @code{space(.\\n.)} @tab a space segment containing a newline character
	1472	@item @code{lexeme(pomoc)} @tab any form of the lexeme 'pomoc'
	1473	@item @code{cat(N/.*)} @tab a word whose category starts with @code{N/}
	1474	@item @code{cat(<N/Ca>)} @tab a word whose category matches @code{N/Ca}
	1475	@end multitable
	1476
	1477	@*
	1478	@noindent @b{Examples of multi-segment patterns:}
	1479
	1480	@table @code
	1481
	1482	@item (word(\L) punct(\.) space?)+ word(\L\l+)
	1483	a sequence of initials followed by a surname
	1484
	1485	@item punct seg(W|S|N)* cat(<NPRO/Sr>) seg(W|S|N)* punct
	1486	a text fragment between two punctuation characters, containing an
	1487	occurrence of a relative pronoun
	1488
	1489	@end table
	1490
	1491
	1492	@node ser how ser works
	1493	@subsection How ser works
	1494
	1495	@node ser customization
	1496	@subsection Customization
	1497
	1498	@c All predefined terms correspond to single segments,
	1499
	1500	@example
[261bf62]	1501	define(`verbseq', `(cat(<V>) (space cat(<V>)))')
[25ae32e]	1502	@end example
	1503
	1504
	1505	the term @code{cat()} may not be used as a ... of
	1506
	1507	@c See @command{m4} manual for further details on macro definition format.
	1508
	1509	@node ser limitations
	1510	@subsection Limitations
	1511
[261bf62]	1512	Do not use more than 3 attributes inside the angle brackets (@code{<...>}) of a category specification.
[25ae32e]	1513
	1514	@node ser requirements
	1515	@subsection Requirements
	1516
	1517	In order to run @command{ser}, the following programs must be
	1518	installed in the system:
	1519
	1520	@itemize
	1521
	1522	@item @command{m4}
	1523	@item @command{grep}
	1524	@item @command{flex}
	1525	@item @command{gcc}
	1526
	1527	@end itemize
	1528
	1529
	1530	@c ---------------------------------------------------------------------
[261bf62]	1531	@c GRP
[25ae32e]	1532	@c ---------------------------------------------------------------------
	1533
	1534	@page
	1535	@node grp
	1536	@section grp - pattern search tool
	1537
	1538	@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
[19760ef]	1539	@item @strong{Authors:} @tab Tomasz Obrębski
[25ae32e]	1540	@item @strong{Component category:} @tab filter
[261bf62]	1541	@item @strong{Input format:} @tab UTT flattened
	1542	@item @strong{Output format:} @tab UTT flattened
	1543	@item @strong{Required annotation:} @tab tok, sen, lem --one-field
[25ae32e]	1544	@end multitable
	1545
	1546
[261bf62]	1547	@menu
	1548	* grp description::
	1549	* grp command line options::
	1550	* grp pattern::
	1551	* grp hints::
	1552	@end menu
	1553
	1554
	1555	@node grp description
	1556	@subsection Description
	1557
[25ae32e]	1558	@command{grp} selects sentences containing an expression matching a
	1559	pattern. The pattern format is exactly the same as that accepted by
	1560	@command{ser}.
	1561
	1562	@command{grp} is intended mainly for speeding up the corpus search process.
	1563	It is extremely fast (processing speed is usually higher than the speed
	1564	of reading the corpus file from disk).
	1565
	1566	@node grp command line options
	1567	@subsection Command line options
	1568
	1569	@table @code
	1570
	1571	@parhelp
	1572	@parversion
	1573	@parprocess
	1574	@parinteractive
	1575
	1576	@item @b{@minus{}@minus{}pattern=@var{pattern}, @minus{}e @var{pattern}}
	1577	The search pattern.
	1578
	1579	@item @b{@minus{}@minus{}morph=@var{field}}
	1580	The name of the annotation field containing the morphological
	1581	description (default @code{lem}).
	1582
	1583	@item @b{@minus{}@minus{}command}
	1584	Only print the generated flex source code.
	1585
	1586	@item @b{@minus{}@minus{}macro=@var{filename}}
	1587	Read macro definitions from file @var{filename} rather than from the
	1588	default location. This option makes it possible to redefine the set of terms.
	1589
	1590	@item @b{@minus{}@minus{}define=@var{filename}}
	1591	Append macro definitions from file @var{filename}. This option
	1592	makes it possible to extend the set of terms.
	1593
	1594	@end table
	1595
	1596
	1597	@node grp pattern
	1598	@subsection Pattern
	1599
	1600	The pattern format is the same as for @command{ser} (@pxref{ser pattern}).
	1601
	1602	@node grp hints
	1603	@subsection Hints
	1604
	1605	The corpus search speed may be increased by combining @command{grp} with the @command{lzop}
	1606	compression tool (@command{grp} usually processes data faster than it can be read from
	1607	disk, especially on slow laptop drives).
	1608
	1609	@example
	1610	cat corpus | tok | sen | lem | grp -a p | lzop -7 > corpus.grp.lzo
	1611	@end example
	1612
	1613	@example
	1614	lzop -cd corpus.grp.lzo | grp -a gP -e @var{EXPR} | ser -e @var{EXPR}
	1615	@end example
	1616
	1617
[261bf62]	1618
[25ae32e]	1619	@c ---------------------------------------------------------------------
[261bf62]	1620	@c MAR
[25ae32e]	1621	@c ---------------------------------------------------------------------
[261bf62]	1622
	1623	@page
	1624	@node mar
	1625	@section mar
	1626
	1627	@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
	1628	@item @strong{Authors:} @tab Marcin Walas, Tomasz Obrębski
	1629	@item @strong{Component category:} @tab filter
	1630	@end multitable
	1631
	1632	[TODO]
	1633
	1634	@c ---------------------------------------------------------------------
	1635	@c KOT
[25ae32e]	1636	@c ---------------------------------------------------------------------
	1637
[261bf62]	1638
[25ae32e]	1639	@page
	1640	@node kot
	1641	@section kot - untokenizer
	1642
[261bf62]	1643	@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
	1644	@item @strong{Authors:} @tab Tomasz Obrębski
	1645	@item @strong{Component category:} @tab filter
	1646	@item @strong{Input format:} @tab UTT regular
	1647	@item @strong{Output format:} @tab text
	1648	@item @strong{Required annotation:} @tab tok
	1649	@end multitable
[25ae32e]	1650
	1651
	1652	@menu
[261bf62]	1653	* kot description::
[25ae32e]	1654	* kot command line options::
	1655	* kot usage examples::
	1656	@end menu
	1657
[261bf62]	1658	@node kot description
	1659	@subsection Description
	1660
	1661	@command{kot} transforms a UTT-formatted file back into raw text format.
	1662
[25ae32e]	1663	@node kot command line options
	1664	@subsection Command line options
	1665
	1666	@table @code
	1667
	1668	@parhelp
	1669
	1670	@c @item @b{@minus{}@minus{}version}, @b{@minus{}v}
	1671
	1672	@c @item @b{@minus{}@minus{}file=@var{filename}, @minus{}f @var{filename}}
	1673
	1674	@c @item @b{@minus{}@minus{}output=@var{filename}, @minus{}o @var{filename}}
	1675
	1676	@c @item @b{@minus{}@minus{}interactive @minus{}i}
	1677
	1678	@c @item @b{@minus{}@minus{}config=@var{filename}}
	1679
	1680	@c @item
	1681
	1682	@item @b{@minus{}@minus{}gap-fill=@var{string}, @minus{}g @var{string}}
	1683	Print @var{string} between nonadjacent segments of the input file.
	1684
	1685	@item @b{@minus{}@minus{}spaces, @minus{}r}
	1686	Retain the special characters @code{_}, @code{\t},
	1687	@code{\n}, @code{\r}, @code{\f} unexpanded in the output.
	1688
	1689	@end table
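
What the two options do can be sketched in Python. The sketch is not the actual @command{kot} implementation: the function name @code{untokenize} is hypothetical, and it assumes that the special characters are expanded only in space segments (type @code{S}) and that zero-length segments contribute no text.

```python
# Written forms of whitespace in UTT files and the characters they stand for.
EXPANSIONS = {"_": " ", "\\t": "\t", "\\n": "\n", "\\r": "\r", "\\f": "\f"}


def untokenize(lines, gap_fill="", keep_spaces=False):
    """Rebuild raw text from UTT segment lines (start length type form ...)."""
    pieces = []
    expected_start = None
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue                      # not a segment line
        start, length = int(fields[0]), int(fields[1])
        seg_type, form = fields[2], fields[3]
        if length == 0:
            continue                      # assumed: markers carry no text
        if expected_start is not None and start > expected_start:
            pieces.append(gap_fill)       # segments are not adjacent
        if seg_type == "S" and not keep_spaces:
            for symbol, char in EXPANSIONS.items():
                form = form.replace(symbol, char)
        pieces.append(form)
        expected_start = start + length
    return "".join(pieces)
```

Two segments are treated as adjacent when the start of the second equals the start of the first plus its length; any larger start position is reported with the gap-fill string.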
	1690
	1691	@node kot usage examples
	1692	@subsection Usage examples
	1693
	1694	@example
	1695	cat legia.txt | tok | kot
	1696	@end example
	1697
	1698	@example
	1699	cat legia.txt | tok | lem -1 | kot
	1700	@end example
	1701
[261bf62]	1702	@c ---------------------------------------------------------------
	1703	@c CON
	1704	@c ---------------------------------------------------------------
	1705
[25ae32e]	1706
	1707	@page
	1708	@node con
	1709	@section con - concordance table generator
	1710
	1711	@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
	1712	@item @strong{Authors:} @tab Justyna Walkowska
	1713	@item @strong{Component category:} @tab sink
[261bf62]	1714	@item @strong{Input format:} @tab UTT regular
	1715	@item @strong{Output format:} @tab text
	1716	@item @strong{Required annotation:} @tab ser or mar
[25ae32e]	1717	@end multitable
	1718	@c
	1719
	1720	@menu
[261bf62]	1721	* con description::
[25ae32e]	1722	* con command line options::
	1723	* con usage example::
	1724	* con hints::
	1725	@end menu
	1726
[261bf62]	1727
	1728	@node con description
	1729	@subsection Description
	1730
	1731	@command{con} generates a concordance table based on a pattern given to @command{ser}.
	1732
	1733
[25ae32e]	1734	@node con command line options
	1735	@subsection Command line options
	1736
	1737	@table @code
	1738
	1739	@parhelp
	1740
	1741	@c @item @b{@minus{}@minus{}help}, @b{@minus{}h}
	1742	@c @item @b{@minus{}@minus{}version}, @b{@minus{}v}
	1743	@c @item @b{@minus{}@minus{}file=@var{filename}, @minus{}f @var{filename}}
	1744	@c @item @b{@minus{}@minus{}output=@var{filename}, @minus{}o @var{filename}}
	1745	@c @item @b{@minus{}@minus{}fail=@var{filename}, @minus{}e @var{filename}} [???]
	1746	@c @item @b{@minus{}@minus{}copy, @minus{}c} [???]
	1747	@c @item @b{@minus{}@minus{}input-field=@var{fieldname}, @minus{}I @var{fieldname}}
	1748	@c @item @b{@minus{}@minus{}output-field=@var{fieldname}, @minus{}O @var{fieldname}}
	1749	@c @item @b{@minus{}@minus{}process=@var{class}, @minus{}p @var{class}}
	1750	@c @item @b{@minus{}@minus{}interactive @minus{}i}
	1751	@c @item @b{@minus{}@minus{}config=@var{filename}}
	1752	@c @item
	1753	@c @item @b{@minus{}@minus{}pattern=@var{pattern}, @minus{}e @var{pattern}}
	1754	@c search pattern
	1755	@c
	1756	@c @item @b{@minus{}@minus{}flex}
	1757	@c only print the generated flex source code
	1758	@c
	1759	@c @item @b{@minus{}@minus{}macro=@var{filename}}
	1760	@c read macrodefinitions from file @var{filename} rather than from
	1761	@c default location. This option allows to redefine the set of terms.
	1762	@c
	1763	@c @item @b{@minus{}@minus{}define=@var{filename}}
	1764	@c append macrodefinitions from file @var{filename}. This option
	1765	@c allows to extend the set of terms.
	1766
	1767	@item @b{@minus{}@minus{}left @minus{}l}
	1768	Left context info (default='30c'). Example:
	1769	@example
	1770	-l=5c: left context is 5 characters
	1771	-l=5w: left context is 5 words
	1772	-l=5s: left context is 5 non-empty input lines
	1773	-l='\s*\S+\sr\S+BOS': left context starts with the given regex
	1774	@end example
	1775
	1776	@item @b{@minus{}@minus{}right @minus{}r}
	1777	Right context info (default='30c').
	1778	@item @b{@minus{}@minus{}trim @minus{}t}
	1779	Clear incomplete words from output.
	1780	@item @b{@minus{}@minus{}white @minus{}w}
	1781	Do not convert whitespace characters into spaces.
	1782	@item @b{@minus{}@minus{}column @minus{}c}
	1783	Minimum width of the left column, in characters (default = 0).
	1784	@item @b{@minus{}@minus{}ignore @minus{}i}
	1785	Ignore segment inconsistency in the input.
[261bf62]	1786	@item @b{@minus{}@minus{}bom}
[25ae32e]	1787	Beginning of selected segment (regex, default='[0-9]+ [0-9]+ BOM .*').
[261bf62]	1788	@item @b{@minus{}@minus{}eom}
[25ae32e]	1789	End of selected segment (regex, default='[0-9]+ [0-9]+ EOM .*').
	1790	@item @b{@minus{}@minus{}bod}
	1791	Selected segment beginning display string (default='[').
	1792	@item @b{@minus{}@minus{}eod}
	1793	Selected segment end display string (default=']').
	1794
	1795
	1796
	1797	@end table
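
The way a concordance line is assembled from marked input can be sketched in Python. The sketch covers only character-based contexts (the @code{c} unit) and the default display strings; it assumes that the selected fragment is delimited by zero-length marker segments matching the default @option{--bom} and @option{--eom} expressions, for example @code{0007 00 BOM *}. The function name @code{concordance} is hypothetical.

```python
import re

BOM = re.compile(r"[0-9]+ [0-9]+ BOM .*")   # default --bom expression
EOM = re.compile(r"[0-9]+ [0-9]+ EOM .*")   # default --eom expression


def concordance(lines, left=30, right=30, bod="[", eod="]"):
    """Return one concordance row per BOM/EOM-delimited fragment."""
    text = []        # forms seen so far, whitespace turned into spaces
    matches = []     # (begin, end) character offsets into the joined text
    offset = 0
    begin = None
    for line in lines:
        if BOM.fullmatch(line):
            begin = offset
        elif EOM.fullmatch(line):
            if begin is not None:
                matches.append((begin, offset))
                begin = None
        else:
            fields = line.split()
            if len(fields) < 4 or int(fields[1]) == 0:
                continue
            # Every whitespace character is shown as a plain space.
            form = " " * int(fields[1]) if fields[2] == "S" else fields[3]
            text.append(form)
            offset += len(form)
    joined = "".join(text)
    rows = []
    for begin, end in matches:
        left_context = joined[max(0, begin - left):begin]
        right_context = joined[end:end + right]
        # Right-align the left context so that the fragments line up.
        rows.append(f"{left_context:>{left}}{bod}{joined[begin:end]}{eod}{right_context}")
    return rows
```

Padding the left context to a fixed width is what makes the selected fragments appear in one column.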
	1798
	1799	@node con usage example
	1800	@subsection Usage example
	1801	@example
[261bf62]	1802	cat file.txt | tok | lem -1 | ser -e 'lexeme(dom)' | con
[25ae32e]	1803	@end example
	1804
	1805
	1806	@node con hints
	1807	@subsection Hints
	1808
	1809	@command{con} is a rather slow program, so avoid passing large amounts of
	1810	irrelevant text through it. @command{con} works well at the end of the following
	1811	pipeline:
	1812
	1813	@example
	1814	... | grp -e EXPR | ser -e EXPR | con
	1815	@end example
	1816
	1817
	1818	@c ---------------------------------------------------------------------
	1819	@c ---------------------------------------------------------------------
	1820
	1821	@page
	1822	@node Auxiliary tools
	1823	@chapter Auxiliary tools
	1824
	1825	@menu
	1826	* compiledic:: dictionary compiler
	1827	* fla:: UTT file flattener
	1828	* unfla:: UTT file unflattener
	1829	@end menu
	1830
	1831
	1832	@page
	1833	@node compiledic
	1834	@section compiledic - the dictionary compiler
	1835
	1836	@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
	1837	@item @strong{Authors:} @tab Michał Stolarski, Tomasz Obrębski
	1838	@item @strong{Component category:} @tab additional tool
	1839	@end multitable
	1840	@c
	1841
	1842	@command{compiledic} compiles dictionaries in text format (@code{.dic} extension) into binary
	1843	(FSA) format (@code{.bin} extension).
	1844
	1845	Automaton representation of a dictionary is built using the AT&T tools:
	1846	@itemize
	1847	@item AT&T FSM Library,
	1848	@item AT&T Lextools.
	1849	@end itemize
	1850
	1851	For @command{compiledic} to work, the packages listed above must be
	1852	installed on your system. They are freely available
	1853	for non-commercial use.
	1854
	1855	Usage:
	1856	@example
	1857	compiledic <dictionaryname>.dic
	1858	@end example
	1859
	1860	The file <dictionaryname>.bin will be generated.
	1861
	1862	Note: The program produces many temporary files, which are
	1863	stored in the current directory. They are deleted after successful
	1864	termination of the program.
	1865
	1866	@c @menu
	1867	@c * con command line options::
	1868	@c * con usage example::
	1869	@c * con hints::
	1870	@c @end menu
	1871
	1872
	1873	@page
	1874	@node fla
	1875	@section fla - the UTT file flattener
	1876
	1877	@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
[19760ef]	1878	@item @strong{Authors:} @tab Tomasz Obrębski
[25ae32e]	1879	@item @strong{Component category:} @tab filter
	1880	@end multitable
	1881	@c
	1882
	1883	@command{fla} ``flattens'' a UTT file by merging the segments belonging
	1884	to one sentence into one line. Technically, end-of-line characters
	1885	('\n', ASCII code 10) are replaced with form-feed characters ('\f',
	1886	ASCII code 12). Flattening makes it possible to process UTT files
	1887	sentence by sentence with tools such as @command{grep} or @command{sed}
	1888	(this is used in @command{grp} and @command{mar}).
	1889
	1890	Flattened files should have the suffix @code{.fla}, e.g. @file{thetext.utt.fla}.
	1891
	1892	Flattened files are still human-readable.
	1893
	1894	Usage:
	1895
	1896	@example
	1897	fla [<bosregex>]
	1898	@end example
	1899
	1900	The optional argument is a regular expression describing segments
	1901	which should be treated as sentence beginnings (a segment qualifies
	1902	if it contains a fragment matching @code{<bosregex>}). By
	1903	default, segments containing a @code{BOS} field are sought.
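
The transformation and its inverse can be sketched in Python. This is an illustration under stated assumptions, not the actual implementation: the function names are hypothetical, and the default expression used here to recognise a @code{BOS} field (the string BOS surrounded by whitespace) is a guess.

```python
import re


def flatten(text, bos_regex=r"\sBOS\s"):
    """Put each sentence on one line, joining its segments with form feeds."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()                        # ignore the final line terminator
    sentences = []
    for line in lines:
        if re.search(bos_regex, line) or not sentences:
            sentences.append([line])       # a sentence beginning: new line
        else:
            sentences[-1].append(line)     # same sentence: same line
    return "".join("\f".join(parts) + "\n" for parts in sentences)


def unflatten(text):
    """Restore the regular format by turning form feeds back into newlines."""
    return text.replace("\f", "\n")
```

Because only the separator character changes, unflattening the flattened text gives back the original file.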
	1904	@c @menu
	1905	@c * con command line options::
	1906	@c * con usage example::
	1907	@c * con hints::
	1908	@c @end menu
	1909
	1910
	1911
	1912	@page
	1913	@node unfla
	1914	@section unfla - the UTT file unflattener
	1915
	1916	@multitable {aaaaaaaaaaaaaaaaaaaaaaaaa} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
[19760ef]	1917	@item @strong{Authors:} @tab Tomasz Obrębski
[25ae32e]	1918	@item @strong{Component category:} @tab filter
	1919	@end multitable
	1920
	1921	@command{unfla} transforms a flattened UTT file, produced by
	1922	@command{fla}, into the regular format by restoring end-of-line
	1923	characters.
	1924
	1925
	1926
	1927
	1928	@c ---------------------------------------------------------------------
	1929	@c USAGE EXAMPLES
	1930	@c ---------------------------------------------------------------------
	1931
	1932	@node Usage examples
	1933	@chapter Usage examples
	1934
	1935	@subsubheading Simple pipelines
	1936
	1937	@enumerate
	1938
	1939	@item tokenization
	1940
	1941	cat text | tok > output1
	1942
	1943	@item morphological annotation (1)
	1944
	1945	simple dictionary based lemmatization
	1946
	1947	cat text | tok | lem > output1
	1948
	1949	@item morphological annotation (2)
	1950
	1951	1) perform dictionary-based lemmatization
	1952	2) guess descriptions for words which have no annotation
	1953
	1954	@example
	1955	cat text | tok | lem | gue -S lem > output2
	1956	@end example
	1957
	1958	@item morphological annotation (3)
	1959
	1960	1) perform dictionary-based lemmatization
	1961	2) try to correct words with no annotation
	1962	3) perform dictionary-based lemmatization of corrected words
	1963	4) guess descriptions for words which still have no annotation
	1964
	1965	@example
	1966	cat text | tok | lem | cor -p W -S lem | lem -I cor | gue -p W -S lem
	1967	@end example
	1968	@item spelling correction
	1969
	1970
	1971
	1972	@example
	1973	cat text | tok | lem --only-fail | cor -1 > output3
	1974	@end example
	1975
	1976	@item Expression extraction
	1977
	1978	Extraction of all occurrences of a verb followed by a form of the noun 'rozmowa'.
	1979
	1980	@example
	1981	cat text | tok | lem -1 | ser -e 'cat(<V>) space lexeme(rozmowa)' -m | kot > output4
	1982	@end example
	1983
	1984	@item A word in context
	1985
	1986	Extraction of text fragments containing a form of the lexeme 'rozmowa' in
	1987	the context of 5 preceding and 5 following corpus segments.
	1988
	1989	@example
	1990	cat text | tok | lem -1 | ser -e 'seg@{5@} lexeme(rozmowa) seg@{5@}' -m | kot > output
	1991	@end example
	1992
	1993	@item generation of concordance table (1)
	1994
	1995	@example
cat text | tok | lem -1 | ser -e 'cat(<V>) space lexeme(rozmowa)' | con
	1997	@end example
	1998
	1999	10"
	2000
	2001	@item generation of concordance table (2)
	2002
	2003	The same as above but much faster
	2004
	2005	@example
cat text | tok | lem -1 | \
grp -e 'cat(<V>) space lexeme(rozmowa)' | \
ser -e 'cat(<V>) space lexeme(rozmowa)' | \
con
	2010	@end example
	2011
	2012	2"
	2013
	2014	@item generation of concordance table (3)
	2015
Usually, one searches the same corpus repeatedly. In that case it is
advisable to first transform the corpus data into the format required
by @command{grp}, and then run the searches on the preprocessed data.

Because @command{grp} (@command{grep}) processes data faster than it
can be read from the disk drive, the search time may be shortened
further by compressing the files. We suggest using @command{lzop}.
	2023
	2024	@item the fastest way to search a large corpus
	2025
	2026	step 1: preprocessing
	2027
	2028	@example
cat corpus | tok | sen | lem -1 \
| grp -a p | lzop -7 > corpus.grp.lzo
	2031	@end example
	2032
	2033	step 2: search
	2034
	2035	@example
lzop -cd corpus.grp.lzo | grp -a gP -e 'cat(<V>) space lexeme(rozmowa)' \
| ser -e 'cat(<V>) space lexeme(rozmowa)' | con
	2038	@end example
	2039
	2040	@end enumerate
	2041
	2042	@subsubheading More complicated configurations
	2043
	2044
	2045	@example
	2046	mknod fifo1 p
	2047	mknod fifo2 p
	2048	mknod fifo3 p
	2049	mknod fifo4 p
	2050	mknod fifo5 p
	2051
tok | lem -p W -e fifo1 > fifo2 &
cor -e fifo3 < fifo1 | lem > fifo4 &
	2054	gue < fifo3 > fifo5 &
	2055	sort -m fifo2 fifo4 fifo5
	2056
	2057	rm fifo?
	2058	@end example
	2059
	2060
	2061	@c ---------------------------------------------------------------------
	2062	@c ---------------------------------------------------------------------
	2063
	2064	@c ---------------------------------------------------------------------
	2065	@c PMDBF DICTIONARY
	2066	@c ---------------------------------------------------------------------
	2067
	2068	@node PMDBF dictionary
	2069	@chapter PMDBF dictionary
	2070
	2071	UTT components come with lexical data derived from Polish
	2072	Morphological Database (PMDB).
	2073
	2074	@menu
	2075	* PMDBF files::
	2076	* PMDBF tag structure::
	2077	* PMDBF parts of speech::
	2078	* PMDBF morphosyntactic attributes::
	2079	@end menu
	2080
	2081	@node PMDBF files
	2082	@section Files
	2083
	2084	@node PMDBF tag structure
	2085	@section Tag structure
	2086
@example
pos   = [[:upper:]]+
attr  = [[:upper:]]+
val   = [[:lower:][:digit:]?!*+-] | <[^>\n]+>
descr = pos ( / ( attr val + ) + ) ?
@end example
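
As a rough illustration, the grammar above can be collapsed into a single POSIX extended regular expression and checked with standard @command{grep}. This is only a sketch: @code{is_descr}, @code{descr_pos} and @code{descr_attrs} are hypothetical helper names, not part of UTT, and the sample tag @code{N/GfNsCn} is constructed from the grammar and the attribute tables in this chapter rather than taken from the dictionary files.

```shell
# Sketch only: validate and split PMDBF-style descriptions with POSIX tools.
# The helper names below are hypothetical and not part of UTT.
LC_ALL=C; export LC_ALL   # make [[:upper:]] and [[:lower:]] ASCII-only

# descr = pos ( / ( attr val + ) + ) ?
descr_re='^[[:upper:]]+(/([[:upper:]]+([[:lower:][:digit:]?!*+-]|<[^>]+>)+)+)?$'

# Succeeds (exit status 0) if the argument is a well-formed description.
is_descr() {
    printf '%s\n' "$1" | grep -Eq "$descr_re"
}

# Print the part-of-speech code (everything before the first slash).
descr_pos() {
    printf '%s\n' "${1%%/*}"
}

# Print the attribute string (everything after the first slash, if any).
descr_attrs() {
    case "$1" in
        */*) printf '%s\n' "${1#*/}" ;;
        *)   printf '\n' ;;
    esac
}

# Try a few constructed inputs, two well-formed and two malformed.
for tag in 'N/GfNsCn' 'ADJ' 'n/Gf' 'N/'; do
    if is_descr "$tag"; then
        echo "$tag: ok, pos=$(descr_pos "$tag") attrs=$(descr_attrs "$tag")"
    else
        echo "$tag: malformed"
    fi
done
```

The newline exclusion in the @code{val} rule is dropped here because @command{grep} already works line by line.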
	2094
	2095	@node PMDBF parts of speech
	2096	@section Parts of speech
	2097
	2098	@multitable {ADJPRP} { adjectival-passive-participle }
	2099	@item @code{N} @tab noun
	2100	@item @code{NPRO} @tab nominal-pronoun
	2101	@item @code{NV} @tab deverbal-noun
	2102	@item @code{V} @tab verb
	2103	@item @code{BYC} @tab byc
	2104	@item @code{VNI} @tab non-inflected-verb
	2105	@item @code{ADJ} @tab adjective
	2106	@item @code{ADJPAP} @tab adjectival-passive-participle
	2107	@item @code{ADJPRP} @tab adjectival-present-participle
	2108	@item @code{ADJPP} @tab adjectival-past-participle
	2109	@item @code{ADJPRO} @tab adjectival-pronoun
	2110	@item @code{ADJNUM} @tab adjectival-numeral
	2111	@item @code{ADV} @tab adverb
	2112	@item @code{ADVANP} @tab adverbial-anterior-participle
	2113	@item @code{ADVPRP} @tab adverbial-present-participle
	2114	@item @code{ADVPRO} @tab adverbial-pronoun
	2115	@item @code{ADVNUM} @tab adverbial-numeral
	2116	@item @code{P} @tab preposition
	2117	@item @code{PPRO} @tab prep-noun-pronoun
	2118	@item @code{CONJ} @tab conjunction
	2119	@item @code{EXCL} @tab exclamation
	2120	@item @code{APP} @tab call
	2121	@item @code{ONO} @tab onomatopoeia
	2122	@item @code{PART} @tab particle
	2123	@item @code{NUMCRD} @tab cardinal-numeral
	2124	@item @code{NUMCOL} @tab collective-numeral
	2125	@item @code{NUMPAR} @tab partitive-numeral
	2126	@item @code{NUMORD} @tab ordinal-numeral
	2127	@end multitable
	2128
	2129	@node PMDBF morphosyntactic attributes
	2130	@section Morphosyntactic attributes
	2131
	2132	@multitable {Attr} {Val} {aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa}
	2133	@c @headitem Attr @tab Val @tab Description
	2134	@item
	2135	@code{A} @tab @tab Aspect
	2136	@item
	2137	@tab @code{p} @tab perfect
	2138	@item
	2139	@tab @code{i} @tab imperfect.
	2140	@item
	2141	@item
	2142	@code{V} @tab @tab Verb-Form
	2143	@item
	2144	@tab @code{b} @tab infinitive,
	2145	@item
	2146	@tab @code{p} @tab personal,
	2147	@item
	2148	@tab @code{i} @tab impersonal.
	2149	@item
	2150	@item
	2151	@code{M} @tab @tab Mood
	2152	@item
	2153	@tab @code{d} @tab declarative,
	2154	@item
	2155	@tab @code{c} @tab conditional,
	2156	@item
	2157	@tab @code{i} @tab imperative.
	2158	@item
	2159	@item
	2160	@code{T} @tab @tab Tense
	2161	@item
	2162	@tab @code{a} @tab past,
	2163	@item
	2164	@tab @code{r} @tab present,
	2165	@item
	2166	@tab @code{f} @tab future.
	2167	@item
	2168	@item
	2169	@code{P} @tab @tab Person
	2170	@item
	2171	@tab @code{1} @tab 1,
	2172	@item
	2173	@tab @code{2} @tab 2,
	2174	@item
	2175	@tab @code{3} @tab 3.
	2176	@item
	2177	@item
	2178	@code{D} @tab @tab Degree
	2179	@item
	2180	@tab @code{p} @tab positive,
	2181	@item
	2182	@tab @code{c} @tab comparative,
	2183	@item
	2184	@tab @code{s} @tab superlative.
	2185	@item
	2186	@item
	2187	@code{N} @tab @tab Number
	2188	@item
	2189	@tab @code{s} @tab singular,
	2190	@item
	2191	@tab @code{p} @tab plural.
	2192	@item
	2193	@item
	2194	@code{C} @tab @tab Case
	2195	@item
	2196	@tab @code{n} @tab nominative,
	2197	@item
	2198	@tab @code{g} @tab genitive,
	2199	@item
	2200	@tab @code{d} @tab dative,
	2201	@item
	2202	@tab @code{a} @tab accusative,
	2203	@item
@tab @code{i} @tab instrumental,
	2205	@item
	2206	@tab @code{l} @tab locative,
	2207	@item
	2208	@tab @code{v} @tab vocative.
	2209	@item
	2210	@item
	2211	@code{G} @tab @tab Gender
	2212	@item
	2213	@tab @code{p} @tab masculine-personal,
	2214	@item
	2215	@tab @code{a} @tab masculine-animal,
	2216	@item
	2217	@tab @code{i} @tab masculine-inanimate,
	2218	@item
	2219	@tab @code{f} @tab feminine,
	2220	@item
	2221	@tab @code{n} @tab neuter.
	2222	@end multitable
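
To make the tables above concrete, the following sketch expands an attribute string into readable labels. It is an illustration only, not a UTT component: the function names @code{label} and @code{decode_attrs} are hypothetical, the input @code{GfNsCn} is a constructed example, and values written in angle brackets are not handled. Several values after one attribute letter are each labelled separately.

```shell
# Sketch only: expand PMDBF attribute/value codes into readable labels.
# 'label' and 'decode_attrs' are hypothetical helpers, not UTT commands.

# Map one attribute letter plus one value character to "Name=value".
label() {
    case "$1" in
        Ap) echo 'Aspect=perfect' ;;
        Ai) echo 'Aspect=imperfect' ;;
        Vb) echo 'Verb-Form=infinitive' ;;
        Vp) echo 'Verb-Form=personal' ;;
        Vi) echo 'Verb-Form=impersonal' ;;
        Md) echo 'Mood=declarative' ;;
        Mc) echo 'Mood=conditional' ;;
        Mi) echo 'Mood=imperative' ;;
        Ta) echo 'Tense=past' ;;
        Tr) echo 'Tense=present' ;;
        Tf) echo 'Tense=future' ;;
        P1) echo 'Person=1' ;;
        P2) echo 'Person=2' ;;
        P3) echo 'Person=3' ;;
        Dp) echo 'Degree=positive' ;;
        Dc) echo 'Degree=comparative' ;;
        Ds) echo 'Degree=superlative' ;;
        Ns) echo 'Number=singular' ;;
        Np) echo 'Number=plural' ;;
        Cn) echo 'Case=nominative' ;;
        Cg) echo 'Case=genitive' ;;
        Cd) echo 'Case=dative' ;;
        Ca) echo 'Case=accusative' ;;
        Ci) echo 'Case=instrumental' ;;
        Cl) echo 'Case=locative' ;;
        Cv) echo 'Case=vocative' ;;
        Gp) echo 'Gender=masculine-personal' ;;
        Ga) echo 'Gender=masculine-animal' ;;
        Gi) echo 'Gender=masculine-inanimate' ;;
        Gf) echo 'Gender=feminine' ;;
        Gn) echo 'Gender=neuter' ;;
        *)  echo "unknown($1)" ;;
    esac
}

# Walk the string one character at a time: an attribute letter sets the
# current attribute, any other character is read as one of its values.
decode_attrs() {
    rest=$1 attr= out=
    while [ -n "$rest" ]; do
        ch=${rest%"${rest#?}"}   # first character of the remaining input
        rest=${rest#?}           # drop that character
        case "$ch" in
            A|V|M|T|P|D|N|C|G) attr=$ch ;;
            *) out="$out${out:+ }$(label "$attr$ch")" ;;
        esac
    done
    printf '%s\n' "$out"
}

decode_attrs 'GfNsCn'
```

For the constructed input above, the call prints @code{Gender=feminine Number=singular Case=nominative}.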
	2223
	2224
	2225	@c ---------------------------------------------------------------------
	2226	@c ---------------------------------------------------------------------
	2227	@c
	2228	@c @node Examples
	2229	@c @chapter Examples
	2230
	2231	@c ----------------------------------------------------------------------
	2232	@c ----------------------------------------------------------------------
	2233
	2234	@node GNU Free Documentation License
	2235	@chapter GNU Free Documentation License
	2236
	2237	@c The GNU Free Documentation License.
	2238	@center Version 1.2, November 2002
	2239
	2240	@c This file is intended to be included within another document,
	2241	@c hence no sectioning command or @node.
	2242
	2243	@display
	2244	Copyright @copyright{} 2000,2001,2002 Free Software Foundation, Inc.
	2245	51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA
	2246
	2247	Everyone is permitted to copy and distribute verbatim copies
	2248	of this license document, but changing it is not allowed.
	2249	@end display
	2250
	2251	@enumerate 0
	2252	@item
	2253	PREAMBLE
	2254
	2255	The purpose of this License is to make a manual, textbook, or other
	2256	functional and useful document @dfn{free} in the sense of freedom: to
	2257	assure everyone the effective freedom to copy and redistribute it,
	2258	with or without modifying it, either commercially or noncommercially.
	2259	Secondarily, this License preserves for the author and publisher a way
	2260	to get credit for their work, while not being considered responsible
	2261	for modifications made by others.
	2262
	2263	This License is a kind of ``copyleft'', which means that derivative
	2264	works of the document must themselves be free in the same sense. It
	2265	complements the GNU General Public License, which is a copyleft
	2266	license designed for free software.
	2267
	2268	We have designed this License in order to use it for manuals for free
	2269	software, because free software needs free documentation: a free
	2270	program should come with manuals providing the same freedoms that the
	2271	software does. But this License is not limited to software manuals;
	2272	it can be used for any textual work, regardless of subject matter or
	2273	whether it is published as a printed book. We recommend this License
	2274	principally for works whose purpose is instruction or reference.
	2275
	2276	@item
	2277	APPLICABILITY AND DEFINITIONS
	2278
	2279	This License applies to any manual or other work, in any medium, that
	2280	contains a notice placed by the copyright holder saying it can be
	2281	distributed under the terms of this License. Such a notice grants a
	2282	world-wide, royalty-free license, unlimited in duration, to use that
	2283	work under the conditions stated herein. The ``Document'', below,
	2284	refers to any such manual or work. Any member of the public is a
	2285	licensee, and is addressed as ``you''. You accept the license if you
	2286	copy, modify or distribute the work in a way requiring permission
	2287	under copyright law.
	2288
	2289	A ``Modified Version'' of the Document means any work containing the
	2290	Document or a portion of it, either copied verbatim, or with
	2291	modifications and/or translated into another language.
	2292
	2293	A ``Secondary Section'' is a named appendix or a front-matter section
	2294	of the Document that deals exclusively with the relationship of the
	2295	publishers or authors of the Document to the Document's overall
	2296	subject (or to related matters) and contains nothing that could fall
	2297	directly within that overall subject. (Thus, if the Document is in
	2298	part a textbook of mathematics, a Secondary Section may not explain
	2299	any mathematics.) The relationship could be a matter of historical
	2300	connection with the subject or with related matters, or of legal,
	2301	commercial, philosophical, ethical or political position regarding
	2302	them.
	2303
	2304	The ``Invariant Sections'' are certain Secondary Sections whose titles
	2305	are designated, as being those of Invariant Sections, in the notice
	2306	that says that the Document is released under this License. If a
	2307	section does not fit the above definition of Secondary then it is not
	2308	allowed to be designated as Invariant. The Document may contain zero
	2309	Invariant Sections. If the Document does not identify any Invariant
	2310	Sections then there are none.
	2311
	2312	The ``Cover Texts'' are certain short passages of text that are listed,
	2313	as Front-Cover Texts or Back-Cover Texts, in the notice that says that
	2314	the Document is released under this License. A Front-Cover Text may
	2315	be at most 5 words, and a Back-Cover Text may be at most 25 words.
	2316
	2317	A ``Transparent'' copy of the Document means a machine-readable copy,
	2318	represented in a format whose specification is available to the
	2319	general public, that is suitable for revising the document
	2320	straightforwardly with generic text editors or (for images composed of
	2321	pixels) generic paint programs or (for drawings) some widely available
	2322	drawing editor, and that is suitable for input to text formatters or
	2323	for automatic translation to a variety of formats suitable for input
	2324	to text formatters. A copy made in an otherwise Transparent file
	2325	format whose markup, or absence of markup, has been arranged to thwart
	2326	or discourage subsequent modification by readers is not Transparent.
	2327	An image format is not Transparent if used for any substantial amount
	2328	of text. A copy that is not ``Transparent'' is called ``Opaque''.
	2329
	2330	Examples of suitable formats for Transparent copies include plain
	2331	@sc{ascii} without markup, Texinfo input format, La@TeX{} input
	2332	format, @acronym{SGML} or @acronym{XML} using a publicly available
	2333	@acronym{DTD}, and standard-conforming simple @acronym{HTML},
	2334	PostScript or @acronym{PDF} designed for human modification. Examples
	2335	of transparent image formats include @acronym{PNG}, @acronym{XCF} and
	2336	@acronym{JPG}. Opaque formats include proprietary formats that can be
	2337	read and edited only by proprietary word processors, @acronym{SGML} or
	2338	@acronym{XML} for which the @acronym{DTD} and/or processing tools are
	2339	not generally available, and the machine-generated @acronym{HTML},
	2340	PostScript or @acronym{PDF} produced by some word processors for
	2341	output purposes only.
	2342
	2343	The ``Title Page'' means, for a printed book, the title page itself,
	2344	plus such following pages as are needed to hold, legibly, the material
	2345	this License requires to appear in the title page. For works in
	2346	formats which do not have any title page as such, ``Title Page'' means
	2347	the text near the most prominent appearance of the work's title,
	2348	preceding the beginning of the body of the text.
	2349
	2350	A section ``Entitled XYZ'' means a named subunit of the Document whose
	2351	title either is precisely XYZ or contains XYZ in parentheses following
	2352	text that translates XYZ in another language. (Here XYZ stands for a
	2353	specific section name mentioned below, such as ``Acknowledgements'',
	2354	``Dedications'', ``Endorsements'', or ``History''.) To ``Preserve the Title''
	2355	of such a section when you modify the Document means that it remains a
	2356	section ``Entitled XYZ'' according to this definition.
	2357
	2358	The Document may include Warranty Disclaimers next to the notice which
	2359	states that this License applies to the Document. These Warranty
	2360	Disclaimers are considered to be included by reference in this
	2361	License, but only as regards disclaiming warranties: any other
	2362	implication that these Warranty Disclaimers may have is void and has
	2363	no effect on the meaning of this License.
	2364
	2365	@item
	2366	VERBATIM COPYING
	2367
	2368	You may copy and distribute the Document in any medium, either
	2369	commercially or noncommercially, provided that this License, the
	2370	copyright notices, and the license notice saying this License applies
	2371	to the Document are reproduced in all copies, and that you add no other
	2372	conditions whatsoever to those of this License. You may not use
	2373	technical measures to obstruct or control the reading or further
	2374	copying of the copies you make or distribute. However, you may accept
	2375	compensation in exchange for copies. If you distribute a large enough
	2376	number of copies you must also follow the conditions in section 3.
	2377
	2378	You may also lend copies, under the same conditions stated above, and
	2379	you may publicly display copies.
	2380
	2381	@item
	2382	COPYING IN QUANTITY
	2383
	2384	If you publish printed copies (or copies in media that commonly have
	2385	printed covers) of the Document, numbering more than 100, and the
	2386	Document's license notice requires Cover Texts, you must enclose the
	2387	copies in covers that carry, clearly and legibly, all these Cover
	2388	Texts: Front-Cover Texts on the front cover, and Back-Cover Texts on
	2389	the back cover. Both covers must also clearly and legibly identify
	2390	you as the publisher of these copies. The front cover must present
	2391	the full title with all words of the title equally prominent and
	2392	visible. You may add other material on the covers in addition.
	2393	Copying with changes limited to the covers, as long as they preserve
	2394	the title of the Document and satisfy these conditions, can be treated
	2395	as verbatim copying in other respects.
	2396
	2397	If the required texts for either cover are too voluminous to fit
	2398	legibly, you should put the first ones listed (as many as fit
	2399	reasonably) on the actual cover, and continue the rest onto adjacent
	2400	pages.
	2401
	2402	If you publish or distribute Opaque copies of the Document numbering
	2403	more than 100, you must either include a machine-readable Transparent
	2404	copy along with each Opaque copy, or state in or with each Opaque copy
	2405	a computer-network location from which the general network-using
	2406	public has access to download using public-standard network protocols
	2407	a complete Transparent copy of the Document, free of added material.
	2408	If you use the latter option, you must take reasonably prudent steps,
	2409	when you begin distribution of Opaque copies in quantity, to ensure
	2410	that this Transparent copy will remain thus accessible at the stated
	2411	location until at least one year after the last time you distribute an
	2412	Opaque copy (directly or through your agents or retailers) of that
	2413	edition to the public.
	2414
	2415	It is requested, but not required, that you contact the authors of the
	2416	Document well before redistributing any large number of copies, to give
	2417	them a chance to provide you with an updated version of the Document.
	2418
	2419	@item
	2420	MODIFICATIONS
	2421
	2422	You may copy and distribute a Modified Version of the Document under
	2423	the conditions of sections 2 and 3 above, provided that you release
	2424	the Modified Version under precisely this License, with the Modified
	2425	Version filling the role of the Document, thus licensing distribution
	2426	and modification of the Modified Version to whoever possesses a copy
	2427	of it. In addition, you must do these things in the Modified Version:
	2428
	2429	@enumerate A
	2430	@item
	2431	Use in the Title Page (and on the covers, if any) a title distinct
	2432	from that of the Document, and from those of previous versions
	2433	(which should, if there were any, be listed in the History section
	2434	of the Document). You may use the same title as a previous version
	2435	if the original publisher of that version gives permission.
	2436
	2437	@item
	2438	List on the Title Page, as authors, one or more persons or entities
	2439	responsible for authorship of the modifications in the Modified
	2440	Version, together with at least five of the principal authors of the
	2441	Document (all of its principal authors, if it has fewer than five),
	2442	unless they release you from this requirement.
	2443
	2444	@item
	2445	State on the Title page the name of the publisher of the
	2446	Modified Version, as the publisher.
	2447
	2448	@item
	2449	Preserve all the copyright notices of the Document.
	2450
	2451	@item
	2452	Add an appropriate copyright notice for your modifications
	2453	adjacent to the other copyright notices.
	2454
	2455	@item
	2456	Include, immediately after the copyright notices, a license notice
	2457	giving the public permission to use the Modified Version under the
	2458	terms of this License, in the form shown in the Addendum below.
	2459
	2460	@item
	2461	Preserve in that license notice the full lists of Invariant Sections
	2462	and required Cover Texts given in the Document's license notice.
	2463
	2464	@item
	2465	Include an unaltered copy of this License.
	2466
	2467	@item
	2468	Preserve the section Entitled ``History'', Preserve its Title, and add
	2469	to it an item stating at least the title, year, new authors, and
	2470	publisher of the Modified Version as given on the Title Page. If
	2471	there is no section Entitled ``History'' in the Document, create one
	2472	stating the title, year, authors, and publisher of the Document as
	2473	given on its Title Page, then add an item describing the Modified
	2474	Version as stated in the previous sentence.
	2475
	2476	@item
	2477	Preserve the network location, if any, given in the Document for
	2478	public access to a Transparent copy of the Document, and likewise
	2479	the network locations given in the Document for previous versions
	2480	it was based on. These may be placed in the ``History'' section.
	2481	You may omit a network location for a work that was published at
	2482	least four years before the Document itself, or if the original
	2483	publisher of the version it refers to gives permission.
	2484
	2485	@item
	2486	For any section Entitled ``Acknowledgements'' or ``Dedications'', Preserve
	2487	the Title of the section, and preserve in the section all the
	2488	substance and tone of each of the contributor acknowledgements and/or
	2489	dedications given therein.
	2490
	2491	@item
	2492	Preserve all the Invariant Sections of the Document,
	2493	unaltered in their text and in their titles. Section numbers
	2494	or the equivalent are not considered part of the section titles.
	2495
	2496	@item
	2497	Delete any section Entitled ``Endorsements''. Such a section
	2498	may not be included in the Modified Version.
	2499
	2500	@item
	2501	Do not retitle any existing section to be Entitled ``Endorsements'' or
	2502	to conflict in title with any Invariant Section.
	2503
	2504	@item
	2505	Preserve any Warranty Disclaimers.
	2506	@end enumerate
	2507
	2508	If the Modified Version includes new front-matter sections or
	2509	appendices that qualify as Secondary Sections and contain no material
	2510	copied from the Document, you may at your option designate some or all
	2511	of these sections as invariant. To do this, add their titles to the
	2512	list of Invariant Sections in the Modified Version's license notice.
	2513	These titles must be distinct from any other section titles.
	2514
	2515	You may add a section Entitled ``Endorsements'', provided it contains
	2516	nothing but endorsements of your Modified Version by various
	2517	parties---for example, statements of peer review or that the text has
	2518	been approved by an organization as the authoritative definition of a
	2519	standard.
	2520
	2521	You may add a passage of up to five words as a Front-Cover Text, and a
	2522	passage of up to 25 words as a Back-Cover Text, to the end of the list
	2523	of Cover Texts in the Modified Version. Only one passage of
	2524	Front-Cover Text and one of Back-Cover Text may be added by (or
	2525	through arrangements made by) any one entity. If the Document already
	2526	includes a cover text for the same cover, previously added by you or
	2527	by arrangement made by the same entity you are acting on behalf of,
	2528	you may not add another; but you may replace the old one, on explicit
	2529	permission from the previous publisher that added the old one.
	2530
	2531	The author(s) and publisher(s) of the Document do not by this License
	2532	give permission to use their names for publicity for or to assert or
	2533	imply endorsement of any Modified Version.
	2534
	2535	@item
	2536	COMBINING DOCUMENTS
	2537
	2538	You may combine the Document with other documents released under this
	2539	License, under the terms defined in section 4 above for modified
	2540	versions, provided that you include in the combination all of the
	2541	Invariant Sections of all of the original documents, unmodified, and
	2542	list them all as Invariant Sections of your combined work in its
	2543	license notice, and that you preserve all their Warranty Disclaimers.
	2544
	2545	The combined work need only contain one copy of this License, and
	2546	multiple identical Invariant Sections may be replaced with a single
	2547	copy. If there are multiple Invariant Sections with the same name but
	2548	different contents, make the title of each such section unique by
	2549	adding at the end of it, in parentheses, the name of the original
	2550	author or publisher of that section if known, or else a unique number.
	2551	Make the same adjustment to the section titles in the list of
	2552	Invariant Sections in the license notice of the combined work.
	2553
	2554	In the combination, you must combine any sections Entitled ``History''
	2555	in the various original documents, forming one section Entitled
	2556	``History''; likewise combine any sections Entitled ``Acknowledgements'',
	2557	and any sections Entitled ``Dedications''. You must delete all
	2558	sections Entitled ``Endorsements.''
	2559
	2560	@item
	2561	COLLECTIONS OF DOCUMENTS
	2562
	2563	You may make a collection consisting of the Document and other documents
	2564	released under this License, and replace the individual copies of this
	2565	License in the various documents with a single copy that is included in
	2566	the collection, provided that you follow the rules of this License for
	2567	verbatim copying of each of the documents in all other respects.
	2568
	2569	You may extract a single document from such a collection, and distribute
	2570	it individually under this License, provided you insert a copy of this
	2571	License into the extracted document, and follow this License in all
	2572	other respects regarding verbatim copying of that document.
	2573
	2574	@item
	2575	AGGREGATION WITH INDEPENDENT WORKS
	2576
	2577	A compilation of the Document or its derivatives with other separate
	2578	and independent documents or works, in or on a volume of a storage or
	2579	distribution medium, is called an ``aggregate'' if the copyright
	2580	resulting from the compilation is not used to limit the legal rights
	2581	of the compilation's users beyond what the individual works permit.
	2582	When the Document is included in an aggregate, this License does not
	2583	apply to the other works in the aggregate which are not themselves
	2584	derivative works of the Document.
	2585
	2586	If the Cover Text requirement of section 3 is applicable to these
	2587	copies of the Document, then if the Document is less than one half of
	2588	the entire aggregate, the Document's Cover Texts may be placed on
	2589	covers that bracket the Document within the aggregate, or the
	2590	electronic equivalent of covers if the Document is in electronic form.
	2591	Otherwise they must appear on printed covers that bracket the whole
	2592	aggregate.
	2593
	2594	@item
	2595	TRANSLATION
	2596
	2597	Translation is considered a kind of modification, so you may
	2598	distribute translations of the Document under the terms of section 4.
	2599	Replacing Invariant Sections with translations requires special
	2600	permission from their copyright holders, but you may include
	2601	translations of some or all Invariant Sections in addition to the
	2602	original versions of these Invariant Sections. You may include a
	2603	translation of this License, and all the license notices in the
	2604	Document, and any Warranty Disclaimers, provided that you also include
	2605	the original English version of this License and the original versions
	2606	of those notices and disclaimers. In case of a disagreement between
	2607	the translation and the original version of this License or a notice
	2608	or disclaimer, the original version will prevail.
	2609
	2610	If a section in the Document is Entitled ``Acknowledgements'',
	2611	``Dedications'', or ``History'', the requirement (section 4) to Preserve
	2612	its Title (section 1) will typically require changing the actual
	2613	title.
	2614
	2615	@item
	2616	TERMINATION
	2617
	2618	You may not copy, modify, sublicense, or distribute the Document except
	2619	as expressly provided for under this License. Any other attempt to
	2620	copy, modify, sublicense or distribute the Document is void, and will
	2621	automatically terminate your rights under this License. However,
	2622	parties who have received copies, or rights, from you under this
	2623	License will not have their licenses terminated so long as such
	2624	parties remain in full compliance.
	2625
	2626	@item
	2627	FUTURE REVISIONS OF THIS LICENSE
	2628
	2629	The Free Software Foundation may publish new, revised versions
	2630	of the GNU Free Documentation License from time to time. Such new
	2631	versions will be similar in spirit to the present version, but may
	2632	differ in detail to address new problems or concerns. See
	2633	@uref{http://www.gnu.org/copyleft/}.
	2634
	2635	Each version of the License is given a distinguishing version number.
	2636	If the Document specifies that a particular numbered version of this
	2637	License ``or any later version'' applies to it, you have the option of
	2638	following the terms and conditions either of that specified version or
	2639	of any later version that has been published (not as a draft) by the
	2640	Free Software Foundation. If the Document does not specify a version
	2641	number of this License, you may choose any version ever published (not
	2642	as a draft) by the Free Software Foundation.
	2643	@end enumerate
	2644
	2645	@page
	2646	@heading ADDENDUM: How to use this License for your documents
	2647
	2648	To use this License in a document you have written, include a copy of
	2649	the License in the document and put the following copyright and
	2650	license notices just after the title page:
	2651
	2652	@smallexample
	2653	@group
	2654	Copyright (C) @var{year} @var{your name}.
	2655	Permission is granted to copy, distribute and/or modify this document
	2656	under the terms of the GNU Free Documentation License, Version 1.2
	2657	or any later version published by the Free Software Foundation;
	2658	with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
	2659	Texts. A copy of the license is included in the section entitled ``GNU
	2660	Free Documentation License''.
	2661	@end group
	2662	@end smallexample
	2663
	2664	If you have Invariant Sections, Front-Cover Texts and Back-Cover Texts,
	2665	replace the ``with@dots{}Texts.'' line with this:
	2666
	2667	@smallexample
	2668	@group
	2669	with the Invariant Sections being @var{list their titles}, with
	2670	the Front-Cover Texts being @var{list}, and with the Back-Cover Texts
	2671	being @var{list}.
	2672	@end group
	2673	@end smallexample
	2674
	2675	If you have Invariant Sections without Cover Texts, or some other
	2676	combination of the three, merge those two alternatives to suit the
	2677	situation.
	2678
	2679	If your document contains nontrivial examples of program code, we
	2680	recommend releasing these examples in parallel under your choice of
	2681	free software license, such as the GNU General Public License,
	2682	to permit their use in free software.
	2683
	2684	@c Local Variables:
	2685	@c ispell-local-pdict: "ispell-dict"
	2686	@c End:
	2687
	2688
	2689	@c ---------------------------------------------------------------------
	2690	@c ---------------------------------------------------------------------
	2691
	2692	@node Reporting bugs
	2693	@chapter Reporting bugs
	2694
	2695	Report bugs to <obrebski@@amu.edu.pl>.
	2696
	2697	@c ---------------------------------------------------------------------
	2698	@c ---------------------------------------------------------------------
	2699
	2700	@c @node Copyright
	2701	@c @chapter Copyright
	2702	@c
	2703	@c Copyright 2004 by Tomasz Obrebski
	2704	@c This software is free for research and educational use.
	2705
	2706	@c ---------------------------------------------------------------------
	2707	@c ---------------------------------------------------------------------
	2708
	2709	@node Author
	2710	@chapter Author
	2711
	2712
	2713	@bye
