LaTeX output filter

This is a rather simple filter: it doesn't even support mathematical formulas, and probably never will unless someone helps me implement that functionality. So if you need a more powerful converter from OpenOffice.org Writer to LaTeX, consider using Henrik Just's Writer2LaTeX (http://www.hj-gym.dk/~hj/writer2latex/).

The only reason I released my own LaTeX filter is that I needed a conversion tool that produces the cleanest possible LaTeX output. Unlike Writer2LaTeX, OfficeFMT doesn't claim to reproduce the layout of the original OpenOffice.org document. Instead, it tries to generate output similar to a hand-written LaTeX document, preserving only a limited set of the most commonly used character and paragraph formatting properties.

Filter options

To save your document in LaTeX format, select the "File -> Export..." menu item in OpenOffice.org (since this filter only exports files, it is not available in the "File -> Save As..." dialog). In the "Export" dialog box, select "OfficeFMT - LaTeX document (.tex)", specify the desired output file name and click "Export". A filter options dialog box will then appear, where you can specify the following parameters:

The LaTeX output filter options dialog

Multilingual documents

To render multilingual documents, OfficeFMT uses the standard babel/fontenc/inputenc combination. As mentioned in the previous section, the inputenc option is selected according to the actual code page of the LaTeX document. There is no special option for selecting a specific font encoding, since the fontenc package options are constructed automatically from the list of languages used in the document. The algorithm is rather simple: the T1 encoding is always used, while T2A and LGR may be added for the Cyrillic and Greek scripts respectively.
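The rule for choosing font encodings can be illustrated with a minimal Python sketch. It is not the filter's actual code: the EXTRA_ENCODING table and both function names are hypothetical, and placing T1 last in the option list (which makes it the default encoding) is an assumption, since the order used by the filter is not stated here.

```python
# Hypothetical mapping from babel language names to the extra font encoding
# their script needs. The filter's real table is not shown in this manual.
EXTRA_ENCODING = {
    "russian": "T2A",
    "ukrainian": "T2A",
    "bulgarian": "T2A",
    "greek": "LGR",
}


def fontenc_options(languages):
    """Return the fontenc options for a list of babel language names."""
    options = []
    for lang in languages:
        enc = EXTRA_ENCODING.get(lang)
        if enc and enc not in options:
            options.append(enc)  # add T2A / LGR only once each
    options.append("T1")  # T1 is always used (placed last by assumption)
    return options


def fontenc_line(languages):
    """Format the options as a preamble line."""
    return "\\usepackage[" + ",".join(fontenc_options(languages)) + "]{fontenc}"
```

Under these assumptions, a document containing only Russian and English text would get T2A and T1, while a purely English document would get T1 alone.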

The list of languages itself (needed both for the babel package and for constructing the fontenc package options) is built by iterating through the paragraph and character styles defined in the document and examining their language attribute. The ISO language codes are converted to human-readable names, which are then lowercased (since Babel uses the lowercase form). The language specified in the default style is treated as the main document language and placed last in the babel package options.
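The following Python sketch shows one way such a language list could be assembled. The ISO_TO_NAME table is a hypothetical excerpt, the function name is invented for illustration, and silently skipping unknown codes is an assumption rather than documented behaviour.

```python
# Hypothetical excerpt of a table mapping ISO 639-1 codes to language names.
ISO_TO_NAME = {"en": "English", "ru": "Russian", "el": "Greek", "de": "German"}


def babel_languages(style_languages, default_language):
    """Build the babel option list from the language codes found in styles.

    style_languages  -- ISO codes collected from paragraph/character styles
    default_language -- ISO code taken from the default style
    """
    names = []
    for code in style_languages:
        if code == default_language or code not in ISO_TO_NAME:
            continue  # main language is appended below; unknown codes skipped
        name = ISO_TO_NAME[code].lower()  # babel expects lowercase names
        if name not in names:
            names.append(name)
    # The main document language goes into the last position.
    names.append(ISO_TO_NAME[default_language].lower())
    return names
```

For a document whose default style is English and whose other styles use Russian and Greek, this would produce the options russian, greek, english, with english acting as the main language.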

Rendering Unicode characters

The OfficeFMT LaTeX filter converts Unicode characters to LaTeX commands and ligatures according to its Unicode character database, stored in the latex-symbol.xml file. Some characters are always converted: those which have a special meaning in TeX (e.g. the percent sign or the backslash) and so need to be escaped, and also some characters which are rarely used directly in LaTeX files, although such usage is not prohibited. For example, guillemets are always represented as << >>. All other characters are converted only if they are not present in the selected output encoding.
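The decision described above can be sketched in Python as follows. The two small tables merely stand in for latex-symbol.xml (whose real contents and format are not shown here), the function name is hypothetical, and Python codec names are used in place of inputenc options. Leaving an unmapped character unchanged is also an assumption.

```python
# Hypothetical stand-ins for entries of the character database.
ALWAYS_CONVERT = {
    "%": "\\%",                  # special meaning in TeX, must be escaped
    "\\": "\\textbackslash{}",   # likewise
    "\u00ab": "<<",              # left guillemet, written as a ligature
    "\u00bb": ">>",              # right guillemet
}
ON_DEMAND = {
    "\u2116": "\\textnumero{}",  # numero sign
    "\u20ac": "\\texteuro{}",    # euro sign
}


def convert_char(ch, encoding):
    """Return the LaTeX representation of a single character."""
    if ch in ALWAYS_CONVERT:
        return ALWAYS_CONVERT[ch]
    try:
        ch.encode(encoding)  # representable in the output encoding?
        return ch            # yes: keep the character as it is
    except UnicodeEncodeError:
        # Not representable: fall back to a command if one is known.
        return ON_DEMAND.get(ch, ch)
```

With these tables, the numero sign is kept as a literal character when the output encoding is cp1251, which contains it, but is replaced by a command when the output encoding is latin-1.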

Some of the mappings between LaTeX commands and Unicode characters defined in the database are valid only for a specific script. For example, all support files for Cyrillic languages in the Babel package define the \No command for typing the numero sign (U+2116). For all other languages, however, this command is not valid and should be replaced with \textnumero. Another example is the `~' symbol, which is commonly used for a non-breaking space in all languages except polytonic Greek.
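A script-dependent lookup of this kind might look like the Python sketch below. The table layout, the SCRIPT_OF mapping and the function name are hypothetical and do not reflect the actual structure of latex-symbol.xml. Only the numero sign example from the text is modelled.

```python
# Hypothetical table: per-character variants keyed by script, with None
# holding the command used for every script that has no specific entry.
COMMANDS = {
    "\u2116": {"cyrillic": "\\No", None: "\\textnumero"},
}

# Hypothetical mapping from babel language names to scripts.
SCRIPT_OF = {"russian": "cyrillic", "ukrainian": "cyrillic", "greek": "greek"}


def command_for(ch, language):
    """Pick the LaTeX command for ch that is valid in the given language."""
    variants = COMMANDS[ch]
    script = SCRIPT_OF.get(language)  # None for languages not listed
    return variants.get(script, variants[None])
```

Here the numero sign becomes \No inside Russian text and \textnumero everywhere else, which is why the language attached to each text fragment matters.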

Of course, in order to select a LaTeX command valid for a specific script, the converter needs to know the language of the text fragment it is processing. Note, however, that the language is simply taken from the original OpenOffice.org document and is not validated. So it is your responsibility either to specify the correct language for each document fragment, or to instruct the filter to ignore the language markup altogether (as described above). This means that you will very probably not get valid LaTeX output immediately (especially for multilingual documents). You may need to insert or remove some language switching commands to prevent compiler errors like the following:

! LaTeX Error: Command \cyrd unavailable in encoding T1.

Automatically loaded packages

The following LaTeX packages are referenced in the preamble automatically whenever the filter considers them necessary; you cannot control whether they are loaded:

