General information
*********************

UAM Text Tools (UTT) is a package of language processing tools developed at
Adam Mickiewicz University. Its functionality includes:

* tokenization
* dictionary-based morphological analysis
* heuristic morphological analysis of unknown words
* spelling correction
* pattern search
* sentence splitting
* generation of concordance tables

The toolkit is intended for processing raw (unannotated), unrestricted text
for any purpose.

Installation
**************

1) unpack the UTT tar archive
2) in the same directory, unpack the tar archives of all UTT dictionary
   modules you have
3) run 'make install' in the root directory of the installation
4) add the bin directory to the PATH variable

Requirements
*************

* File::HomeDir

  The Perl package File::HomeDir must be installed. To install it, run
  'perl -MCPAN -e shell' and type 'install File::HomeDir' at the 'cpan>'
  prompt.

* flex

  To run the ser component, flex must be installed on your system.

* ruby

  To run the tre component, ruby must be installed on your system.

* locale pl_PL.iso-8859-2

  The locale pl_PL.iso-8859-2 (pl_PL for short) must be installed and set
  when using UTT with the Polish module. The text you process with UTT must
  be encoded in iso-8859-2.
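
The requirements listed above can be checked before installation with a short
shell script. The following is only a sketch and not part of UTT: the function
names (check_cmd, check_perl_module, check_locale, report) are made up for
illustration, and the locale pattern assumes that 'locale -a' lists the Polish
locale under a name such as pl_PL, pl_PL.iso88592 or pl_PL.ISO-8859-2, which
may differ on your system.

```shell
#!/bin/sh
# Sketch: check the prerequisites listed under Requirements.
# All function names below are hypothetical helpers, not UTT commands.

# check_cmd NAME: succeed if NAME is found on PATH.
check_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# check_perl_module MODULE: succeed if Perl can load MODULE.
check_perl_module() {
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# check_locale: succeed if 'locale -a' lists a pl_PL locale, either bare
# or with an ISO-8859-2 suffix (assumed spellings; adjust if needed).
check_locale() {
    locale -a 2>/dev/null | grep -i -E -q '^pl_PL(\.iso-?8859-?2)?$'
}

# report LABEL COMMAND [ARGS...]: run COMMAND and print one status line.
report() {
    label=$1
    shift
    if "$@"; then
        echo "ok       $label"
    else
        echo "MISSING  $label"
    fi
}

report "File::HomeDir (Perl module)" check_perl_module File::HomeDir
report "flex (needed by ser)" check_cmd flex
report "ruby (needed by tre)" check_cmd ruby
report "pl_PL iso-8859-2 locale" check_locale
```

Each line of output starts with 'ok' or 'MISSING', so missing prerequisites
are easy to spot. The script only tests whether the locale is installed; it
does not set it. If your text is stored in another encoding such as UTF-8, a
tool like iconv (for example 'iconv -f UTF-8 -t ISO-8859-2') can usually
convert it before processing.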