apertium-translator

Language: en

Version: 312474 (ubuntu - 07/07/09)

Section: 1 (User commands)

NAME

apertium-translator - This application is part of apertium.

This tool is part of the apertium machine translation architecture: http://apertium.sf.net.

SYNOPSIS

apertium-translator {datadir} {language-pair} [format [infile [outfile]]]

DESCRIPTION

apertium-translator is the application responsible for the translation procedure in Apertium 1-based systems. For Apertium 2, the preferred way to translate is to use the apertium application.

This tool aims to ease the use of lt-toolbox (which contains all the lexical processing modules and tools) and apertium (which contains the rest of the engine) by giving the end user a single front end.

The modules behind the apertium machine translation architecture are, in order:

de-formatter: Separates the text to be translated from the format information.
morphological-analyser: Tokenizes the text into surface forms.
part-of-speech tagger: Chooses one surface form among homographs.
lexical transfer module: Reads each source-language lexical form and delivers a corresponding target-language lexical form.
structural transfer module: Detects fixed-length patterns of lexical forms (chunks or phrases) needing special processing due to grammatical divergences between the two languages and performs the corresponding transformations.
morphological generator: Delivers a target-language surface form for each target-language lexical form, by suitably inflecting it.
post-generator: Performs orthographical operations such as contractions and apostrophations.
re-formatter: Restores the format information encapsulated by the de-formatter into the translated text and removes the encapsulation sequences used to protect certain characters in the source text.
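
The module order above can be pictured as a chain of functions, each consuming the output of the previous one. The following is a minimal Python sketch under stated assumptions: the stage names come from the list above, but every stage is a hypothetical placeholder that only records its name and passes the text through unchanged. It illustrates the ordering only, not how Apertium implements any module.

```python
from functools import reduce

# Stage names, in the order given by the module list above.
STAGE_ORDER = [
    "de-formatter",
    "morphological-analyser",
    "part-of-speech tagger",
    "lexical transfer module",
    "structural transfer module",
    "morphological generator",
    "post-generator",
    "re-formatter",
]


def make_stage(name):
    """Return a placeholder stage (hypothetical): it logs its name and leaves the text as-is."""
    def stage(state):
        text, trace = state
        return text, trace + [name]
    return stage


def run_pipeline(text, stage_names=STAGE_ORDER):
    """Apply the stages left to right, threading (text, trace) through each one."""
    stages = [make_stage(name) for name in stage_names]
    return reduce(lambda state, stage: stage(state), stages, (text, []))


if __name__ == "__main__":
    text, trace = run_pipeline("hola")
    print(" -> ".join(trace))
```

A real stage would transform the text rather than pass it through; the point here is only that the de-formatter runs first and the re-formatter runs last, so format information is set aside before linguistic processing and restored afterwards.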

OPTIONS

datadir The directory holding the linguistic data.

language-pair The language pair: LANG1-LANG2 (for instance es-ca or ca-es).

format Specifies the format of the input and output files. It can take one of these values:

txt (default value) Input and output files are in text format.
txtu Input and output files are in text format and unknown words are not prepended with an asterisk (*).
html Input and output files are in "html" format. This "html" is the one accepted by the vast majority of web browsers.
htmlu Input and output files are in "html" format. This "html" is the one accepted by the vast majority of web browsers. Unknown words are not prepended with an asterisk (*).
rtf Input and output files are in "rtf" format. The accepted "rtf" is the one generated by Microsoft WordPad (C) and Microsoft Office (C) up to and including Office-97.
rtfu Input and output files are in "rtf" format. The accepted "rtf" is the one generated by Microsoft WordPad (C) and Microsoft Office (C) up to and including Office-97. Unknown words are not prepended with an asterisk (*).
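
Each format and its "u" variant differ only in whether unknown words are flagged in the output. The toy Python sketch below shows that difference. It assumes a hypothetical in-memory dictionary and plain whitespace tokenization, neither of which reflects how Apertium stores or looks up words.

```python
# Hypothetical toy dictionary (es -> ca); not Apertium's linguistic data.
TOY_DICTIONARY = {"casa": "casa", "gato": "gat", "blanco": "blanc"}


def translate_words(text, mark_unknown=True):
    """Translate word by word; optionally prepend '*' to words missing from the dictionary."""
    output = []
    for word in text.split():
        if word in TOY_DICTIONARY:
            output.append(TOY_DICTIONARY[word])
        elif mark_unknown:
            output.append("*" + word)  # as with txt, html, rtf
        else:
            output.append(word)        # as with txtu, htmlu, rtfu
    return " ".join(output)


if __name__ == "__main__":
    print(translate_words("gato xyzzy", mark_unknown=True))   # gat *xyzzy
    print(translate_words("gato xyzzy", mark_unknown=False))  # gat xyzzy
```

The marked form is convenient when post-editing, since untranslated words stand out; the unmarked form suits output that is shown to readers directly.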

FILES

These are the two files that can be used with this command:

infile Input file (stdin by default).

outfile Output file (stdout by default).
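
Because the optional arguments in the synopsis are nested, an outfile can only be given together with an infile, and an infile only together with a format. The Python sketch below builds an argument list that respects this nesting. The helper names build_command and translate are hypothetical, and translate assumes that apertium-translator is on the PATH and that the data directory passed in actually exists.

```python
import subprocess

# Format values listed under OPTIONS.
FORMATS = ("txt", "txtu", "html", "htmlu", "rtf", "rtfu")


def build_command(datadir, language_pair, fmt=None, infile=None, outfile=None):
    """Build an argv list following the synopsis; the optional arguments nest."""
    if outfile is not None and infile is None:
        raise ValueError("outfile requires infile")
    if infile is not None and fmt is None:
        raise ValueError("infile requires format")
    if fmt is not None and fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")
    argv = ["apertium-translator", datadir, language_pair]
    for optional in (fmt, infile, outfile):
        if optional is None:
            break  # later positional arguments cannot follow a missing one
        argv.append(optional)
    return argv


def translate(datadir, language_pair, text, fmt="txt"):
    """Send text on stdin and return stdout (needs apertium-translator installed)."""
    result = subprocess.run(
        build_command(datadir, language_pair, fmt),
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For example, build_command("/path/to/data", "es-ca", "html", "in.html") yields a command that reads in.html and writes to stdout; the data path shown is a placeholder.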

SEE ALSO

lt-proc(1), lt-comp(1), lt-expand(1), apertium-tagger(1), apertium(1).

BUGS

Lots of...lurking in the dark and waiting for you!

AUTHOR

(c) 2005,2006 Universitat d'Alacant / Universidad de Alicante. All rights reserved.