xml2brl

Language: en

Version: 02/09/2009 (debian - 07/07/09)

Section: 1 (User commands)

NAME

xml2brl - translate an XML or text file into an embosser-ready braille file

SYNOPSIS

xml2brl [-b|-p|-r|-t] [-h] [-l] [-f configfile] [-Csetting=value] [INFILE] [OUTFILE]

DESCRIPTION

xml2brl translates an XML or text file into an embosser-ready braille file. This includes translation into grade two braille if desired, mathematical codes, and so on. It also includes formatting according to a built-in style sheet, which can be modified by the user.

It is not necessary to know XML, because MSWord and other word processors can export files in this format. If the word processor has been used correctly, xml2brl will produce an excellent braille file.

OPTIONS

-b

Back-translate. The input file must be a braille file, such as .brf. The output file is a back-translation of this file. It may be in either plain text or XHTML (HTML), according to the setting of backFormat in the outputFormat section of the configuration file. HTML files will contain page numbers and emphasis. To get good HTML, the liblouis table must have the entry "space \e 1b" so that it will pass through escape characters. The html.sem file must also contain the line "pagenum pagenum". Text output files simply have a blank line between paragraphs. Encoding of text files is controlled by the outputEncoding setting. HTML files are always in UTF-8.

-Csetting=value

This option enables you to specify configuration settings on the command line instead of changing the configuration file. You can use as many -C options as you wish. Any settings can be specified except those having to do with styles. The settings may be in any order. They override any settings in canonical.cfg or in the configuration file used by xml2brl.
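
As a minimal sketch of passing several settings at once, the script below turns a list of setting=value pairs into -C options. The names cellsPerLine and linesPerPage are assumed to be the settings for cells per line and lines per page; input.xml and output.brf are placeholder file names.

```shell
# Turn a list of setting=value pairs into -C options.
# cellsPerLine and linesPerPage are assumed setting names;
# input.xml and output.brf are placeholders.
set --
for setting in cellsPerLine=32 linesPerPage=25; do
    set -- "$@" "-C$setting"
done
set -- "$@" input.xml output.brf

# Show the command line that would be run.
echo "xml2brl $*"

# Run it only when xml2brl is installed and the input file exists.
if command -v xml2brl >/dev/null 2>&1 && [ -f input.xml ]; then
    xml2brl "$@"
fi
```

Because the settings may be in any order, the loop does not need to sort them.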

-f configfile

This specifies the configuration file which tells xml2brl how to do the transcription. (It may be a list of file names separated by commas.) This file specifies such things as the number of cells per line, the number of lines per page, the translation tables to be used, how paragraphs and headings are to be formatted, etc. If this part of the command line is omitted, xml2brl assumes that the configuration file is named default.cfg and is in the current directory. If the configuration file name contains a pathname, xml2brl will treat this as a path on which to look for files that it needs. (See Files And Paths.)
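
The comma-separated form can be built in a script as sketched below. The helper join_configs and the file name mystyles.cfg are hypothetical. How xml2brl resolves conflicts between the listed files is not covered here, so the example only shows how the list is written.

```shell
# Join several configuration file names with commas for -f.
# join_configs and mystyles.cfg are hypothetical names.
join_configs() {
    list=""
    for cfg in "$@"; do
        if [ -z "$list" ]; then
            list="$cfg"
        else
            list="$list,$cfg"
        fi
    done
    printf '%s\n' "$list"
}

configs=$(join_configs default.cfg mystyles.cfg)

# Show the resulting command line (placeholder input and output names).
echo "xml2brl -f $configs input.xml output.brf"
```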

-h

This option causes xml2brl to print a help message describing usage and exit.

-l

This option will cause xml2brl and liblouisxml to print error messages to xml2brl.log instead of stderr. The file will be in the current directory. This option is particularly useful if xml2brl is called by a GUI script or Web application.
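
A calling script might use -l roughly as follows. This is only a sketch: check_log is a hypothetical helper, the file names are placeholders, and the script assumes nothing about whether xml2brl creates the log file when there are no messages.

```shell
# Call xml2brl from a script and inspect xml2brl.log afterwards.
# check_log is a hypothetical helper; file names are placeholders.
check_log() {
    # -s is true only if the file exists and is not empty.
    if [ -s "$1" ]; then
        echo "messages logged in $1"
    else
        echo "no messages logged"
    fi
}

# Run only when xml2brl is installed and the input file exists.
if command -v xml2brl >/dev/null 2>&1 && [ -f input.xml ]; then
    rm -f xml2brl.log          # discard any log left by an earlier run
    xml2brl -l input.xml output.brf
    check_log xml2brl.log
fi
```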

-p

Poorly formatted input translation. The input file is any text file, such as one obtained by extracting the text from a PDF file. The input file may also be an XML or HTML file which is so poorly formatted that better braille can be obtained by ignoring the formatting. xml2brl tries to guess paragraph breaks. The output is generally reasonably formatted, that is, with reasonable paragraph breaks.

-r

Reformat. The input file must be a braille file, such as .brf. The output is a braille file formatted according to the configuration file. It is advisable to set backFormat to HTML, since this will preserve print page numbers and emphasis. This option can be useful for changing the line length and page length of a braille file, for example, from 40 to 32 cells. It is also an excellent way to check the accuracy of liblouis tables. The original page numbers at the tops and bottoms of pages are discarded, and new ones are generated.
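
The 40-to-32-cell case could be scripted along these lines. This is a sketch under assumptions: cellsPerLine is assumed to be the setting name for cells per line, the value html for backFormat is an assumed spelling, and reformat_brf and the file names are hypothetical.

```shell
# Reformat a braille file to a different line length.
# Assumptions: cellsPerLine is the setting name, "html" is an
# accepted backFormat value; reformat_brf is a hypothetical helper.
reformat_brf() {
    # $1 = cells per line, $2 = input braille file, $3 = output braille file
    xml2brl -r "-CcellsPerLine=$1" -CbackFormat=html "$2" "$3"
}

# Run only when xml2brl is installed and the input file exists.
if command -v xml2brl >/dev/null 2>&1 && [ -f wide.brf ]; then
    reformat_brf 32 wide.brf narrow.brf
fi
```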

-t

The document is an HTML file, not XHTML. This option is useful with files downloaded from the Web in source form. Without it, the program will first try to parse the file as an XML document, producing lots of error messages. It will then try the HTML parser. With this option, it goes directly to the HTML parser. See also the formatFor configuration file setting, which enables you to format the braille output for viewing in a browser.

[infile]

This is the name of the input file containing the material to be transcribed. The file may be either an XML file or a text file. The -b, -r and -p options discussed above provide for other types of files and processing. Typical XML files are those provided by www.bookshare.org or those derived from a word processor by saving in XML format. If a text file is used, paragraphs and headings should be separated by blank lines. In such a file there is no way to distinguish between paragraphs and headings, so they will all be formatted as paragraphs, as specified by the configuration file. If you want a blank line in the braille transcription, use two consecutive blank lines in the text file.
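
The text-file layout described above can be illustrated with a small sample. The file names sample.txt and sample.brf are placeholders. The single blank lines separate paragraphs, and the pair of blank lines requests a blank line in the braille output.

```shell
# Write a small text file in the layout described above.
# sample.txt and sample.brf are placeholder names.
printf '%s\n' \
    'A heading, which will be formatted as a paragraph' \
    '' \
    'The first paragraph.' \
    '' \
    '' \
    'This paragraph should follow a blank braille line.' > sample.txt

# Transcribe it only when xml2brl is installed.
if command -v xml2brl >/dev/null 2>&1; then
    xml2brl sample.txt sample.brf
fi
```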

[outfile]

This is the name of the output file. It will be transcribed as specified by the configuration file and the configuration settings. The following paragraphs provide more information on both the input and output files.

xml2brl is set up so that it can be used in a "pipe". To do this, omit both infile and outfile. Input is then taken from standard input and output is written to standard output.

The first file name encountered (a word not preceded by a minus sign) is taken to be the input file and the second to be the output file. If you wish input to be taken from stdin and still want to specify an output file, use two minus signs (--) in place of the input file.

If only the program name is typed, xml2brl assumes that the configuration file is default.cfg, input is from standard input, and output is to standard output.
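
The two-minus-sign form can be sketched as follows. The helper brl_from_stdin and the file name hello.brf are hypothetical.

```shell
# Read the document from standard input but write to a named file.
# The two minus signs take the place of the input file name.
# brl_from_stdin and hello.brf are hypothetical names.
brl_from_stdin() {
    xml2brl -- "$1"
}

# Run only when xml2brl is installed.
if command -v xml2brl >/dev/null 2>&1; then
    echo "Hello louis" | brl_from_stdin hello.brf
fi
```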

EXAMPLES

xml2brl input.xml output.brf

Translate input.xml to the braille file output.brf

echo "Hello louis" | xml2brl | lpr

Use xml2brl inside a pipe.

xml2brl -b input.brf output.txt

Back-translate input.brf to text.

BUGS

Probably some. Please report bugs to the mailing list.

AUTHOR

Written by John J. Boyer <john.boyer@jjb-software.com>

SEE ALSO

You will also find it advantageous to be acquainted with the companion library liblouis, which is a braille translator and back-translator.

The full documentation for xml2brl is maintained as a Texinfo manual. If the info and xml2brl programs are properly installed at your site, the command

 info liblouisxml
 
should give you access to the complete manual.

REQUISITES

This script requires the following programs:

liblouis (for braille translation and back-translation)

http://code.google.com/p/liblouis/

RESOURCES

Main web site: http://code.google.com/p/liblouisxml/

Mailing lists: http://www.freelists.org/list/liblouis-liblouisxml

Liblouis web site: http://code.google.com/p/liblouis/

COPYING

Copyright (C) 2002-2008 John J. Boyer and Copyright (C) 2004-2007 ViewPlus Technologies, Inc. Free use of this software is granted under the terms of the GNU Lesser General Public License (LGPL).
