Last change on this file since c21bdd6 was 5f4d9c3, checked in by Maciej Prill <mprill@…>, 14 years ago:
Rewritten the build system, added lem UTF-8 version.

General information
*********************

UAM Text Tools (UTT) is a package of language processing tools
developed at Adam Mickiewicz University. Its functionality includes:
* tokenization
* dictionary-based morphological analysis
* heuristic morphological analysis of unknown words
* spelling correction
* pattern search
* sentence splitting
* generation of concordance tables

The toolkit is intended for processing raw (unannotated),
unrestricted text for any purpose.


Installation
**************

1) unpack the UTT tar archive
2) in the same directory, unpack the tar archives of all the UTT
   dictionary modules you have
3) in the root directory of the installation, run
     make install
4) add the bin directory to the PATH variable


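The installation steps above could be scripted roughly as follows. This is only a sketch: the function names utt_install and utt_add_to_path, and every archive and directory name in the usage comment, are hypothetical placeholders and are not taken from the UTT distribution. It also assumes that the bin directory meant in step 4 is the one under the installation root.

```shell
#!/bin/sh
# Sketch of the four installation steps. Names are illustrative assumptions.

# Steps 1-3: unpack the main archive and any dictionary archives into the
# current directory, then run "make install" in the installation root.
#   $1 = UTT tar archive, $2 = directory it unpacks to (assumed),
#   remaining arguments = dictionary module archives (may be none)
utt_install() {
    main_archive=$1
    root_dir=$2
    shift 2
    tar -xf "$main_archive" || return 1
    for dict_archive in "$@"; do
        tar -xf "$dict_archive" || return 1
    done
    (cd "$root_dir" && make install)
}

# Step 4: prepend a directory to PATH unless it is already listed.
utt_add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                          # already present, do nothing
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# Hypothetical usage (all names are placeholders):
#   utt_install utt.tar.gz utt utt-dict-pl.tar.gz
#   utt_add_to_path "$PWD/utt/bin"
```

To make the PATH change permanent, the utt_add_to_path call (or an equivalent export line) would normally go into the shell start-up file.
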
Requirements
*************

* File::HomeDir

  The Perl package File::HomeDir must be installed
  (to install the package, run 'perl -MCPAN -e shell' and type
  'install File::HomeDir' after the 'cpan>' prompt appears).

* flex

  To run the ser component, flex must be installed on your system.

* ruby

  To run the tre component, ruby must be installed on your system.

* locale pl_PL.iso-8859-2

  The locale pl_PL.iso-8859-2 (pl_PL for short) must be installed
  and set when using UTT with the Polish module. The text you
  process with UTT must be encoded in ISO-8859-2.

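Before installing, it may help to check which of the requirements listed above are already present. The following is a minimal sketch, not part of UTT: the function names are hypothetical, and the locale test assumes that 'locale -a' lists the Polish locale either as plain pl_PL or with an ISO-8859-2 suffix, which can be spelled differently from system to system. The script only probes the environment and installs nothing.

```shell
#!/bin/sh
# Report whether each UTT requirement appears to be available.

# $1 = label to print, remaining arguments = command used as the probe
utt_report() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "ok: $label"
    else
        echo "missing: $label"
    fi
}

# Succeeds if a Polish ISO-8859-2 locale seems to be installed
# (assumption: it is listed as pl_PL or pl_PL.<some spelling of iso-8859-2>).
utt_has_polish_locale() {
    locale -a 2>/dev/null | grep -Eiq '^pl_PL(\.iso-?8859-?2)?$'
}

utt_check_requirements() {
    utt_report "File::HomeDir" perl -MFile::HomeDir -e 1
    utt_report "flex" command -v flex
    utt_report "ruby" command -v ruby
    utt_report "locale pl_PL (ISO-8859-2)" utt_has_polish_locale
}

# Hypothetical usage:
#   utt_check_requirements
```

Each requirement is printed on its own line with an "ok:" or "missing:" prefix, so the output can be scanned or filtered with grep.
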