source: dist/files/README @ 555c7f8

Last change on this file since 555c7f8 was 5f4d9c3, checked in by Maciej Prill <mprill@…>, 13 years ago

Rewritten the build system, added lem UTF-8 version.

General information
*********************

UAM Text Tools (UTT) is a package of language processing tools
developed at Adam Mickiewicz University. Its functionality includes:
* tokenization
* dictionary-based morphological analysis
* heuristic morphological analysis of unknown words
* spelling correction
* pattern search
* sentence splitting
* generation of concordance tables

The toolkit is intended for processing raw (unannotated),
unrestricted text for any purpose.


Installation
**************

1) unpack the UTT tar archive
2) in the same directory, unpack the tar archives of all the UTT dictionary modules you have
3) run
        make install
   in the root directory of the installation
4) add the bin directory to the PATH environment variable


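The installation steps above can be sketched as a short shell script.
This is only an illustration: the archive names (utt.tar.gz,
utt-dict-pl.tar.gz) and the directory name (utt) are placeholders, and
the sketch assumes that the bin directory from step 4 lives under the
installation root. Substitute the names of the files you actually have.

```shell
# Placeholder names, NOT the real archive names: adjust before use.
UTT_ARCHIVE="utt.tar.gz"             # hypothetical UTT archive
DICT_ARCHIVES="utt-dict-pl.tar.gz"   # hypothetical; list every dictionary module you have
UTT_ROOT="$PWD/utt"                  # assumed directory created by unpacking

install_utt() {
    # Step 1: unpack the UTT archive (assumes a tar that auto-detects compression).
    tar -xf "$UTT_ARCHIVE" || return 1
    # Step 2: unpack each dictionary module in the same directory.
    for archive in $DICT_ARCHIVES; do
        tar -xf "$archive" || return 1
    done
    # Step 3: run make install in the root directory of the installation.
    (cd "$UTT_ROOT" && make install) || return 1
}

# Step 4: prepend a directory to PATH unless it is already listed.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                       # already present, nothing to do
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# Uncomment once the placeholder names above have been adjusted:
# install_utt && add_to_path "$UTT_ROOT/bin"
```

To keep the PATH change across sessions, the add_to_path call would
normally go into a shell startup file such as ~/.profile.
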
Requirements
*************

* File::HomeDir

  the Perl package File::HomeDir must be installed
  (to install the package, run 'perl -MCPAN -e shell' and type
   'install File::HomeDir' after the 'cpan>' prompt appears)

* flex

  to run the ser component, flex must be installed on your system

* ruby

  to run the tre component, ruby must be installed on your system

* locale pl_PL.iso-8859-2

  the locale pl_PL.iso-8859-2 (pl_PL for short) must be installed
  and set when using UTT with the Polish module. The text you
  process with UTT must be encoded in iso-8859-2.
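
The requirements listed above can be checked before installing. The
following is a minimal sketch and not part of UTT: the helper names
check_command and check_perl_module are made up for this example.
Because systems spell locale names differently, the last line only
lists the installed Polish locales for manual inspection.

```shell
# Report whether a command is available on PATH.
check_command() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: MISSING"
    fi
}

# Report whether a Perl module can be loaded.
check_perl_module() {
    if perl -M"$1" -e 1 >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: MISSING"
    fi
}

check_perl_module File::HomeDir   # needed by UTT in general
check_command flex                # needed by the ser component
check_command ruby                # needed by the tre component

# Locale names vary between systems (for example pl_PL.iso88592),
# so print every installed Polish locale instead of testing one spelling.
locale -a 2>/dev/null | grep -i '^pl_PL' || echo "pl_PL locale: MISSING"
```
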
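
If your text is not already in iso-8859-2, it has to be converted
before processing with the Polish module. A possible way to do this,
assuming the input is UTF-8 and the iconv utility is available, is
sketched below. The function names and file names are illustrative
only, and the locale name must match what 'locale -a' reports on your
system.

```shell
# Convert a UTF-8 file ($1) to ISO-8859-2 and write the result to $2.
to_latin2() {
    iconv -f UTF-8 -t ISO-8859-2 "$1" > "$2"
}

# Select the Polish locale for the current shell session.
# The spelling below follows this README; adjust it to match `locale -a`.
use_polish_locale() {
    LC_ALL=pl_PL.iso-8859-2
    export LC_ALL
}

# Example with placeholder file names:
# to_latin2 input.utf8.txt input.latin2.txt && use_polish_locale
```

iconv stops with an error when it meets a character that has no
iso-8859-2 equivalent, so a failed conversion usually means the text
contains characters outside that character set.
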