Package translate :: Package storage :: Module utx

Module utx

Manage the Universal Terminology eXchange (UTX) format

UTX is a format for terminology exchange, apparently designed with Machine
Translation (MT) as its primary consumer.  The format was created by
the Asia-Pacific Association for Machine Translation (AAMT).

It is a bilingual format derived from the base classes, with L{UtxFile}
and L{UtxUnit} providing file-level and unit-level access.

The format can also represent monolingual dictionaries, but these classes
do not implement that.

Specification
=============
The format is implemented according to the v1.0 UTX
L{specification<http://www.aamt.info/english/utx/utx-simple-1.00-specification-e.pdf>}.

Format Implementation
=====================
The UTX format is a Tab-Separated Values (TSV) file in UTF-8.  The
first two lines are headers; each subsequent line contains a
single source-target definition.
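
The layout described above can be illustrated with a minimal sketch that
uses only the standard library C{csv} module, not the classes in this
module.  The sample text, the column names (C{src}, C{tgt}, C{src:pos})
and the helper C{read_utx_like} are assumptions made for illustration;
the real header contents are defined by the specification.

```python
import csv
import io

# Hypothetical sample: two header lines followed by tab-separated entries.
SAMPLE = (
    "#UTX-S 1.00; en-US/ja-JP\r\n"
    "#src\ttgt\tsrc:pos\r\n"
    "file\tファイル\tnoun\r\n"
    "print\t印刷\tverb\r\n"
)


def read_utx_like(text):
    """Split UTX-like text into its two header lines and a list of row dicts."""
    # newline="" leaves line endings untouched so the csv module handles them.
    stream = io.StringIO(text, newline="")
    version_line = stream.readline().rstrip("\r\n")
    column_line = stream.readline().rstrip("\r\n")
    # Assumption: the second header line names the columns after a leading "#".
    fieldnames = column_line.lstrip("#").split("\t")
    reader = csv.DictReader(
        stream,
        fieldnames=fieldnames,
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters as ordinary text
    )
    return version_line, fieldnames, list(reader)


version, fields, rows = read_utx_like(SAMPLE)
for row in rows:
    print(row["src"], "->", row["tgt"])
```

Reading the two header lines by hand before handing the stream to
C{csv.DictReader} keeps the header handling separate from the entry rows.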

Encoding
--------
The files are UTF-8 encoded with no BOM and CR+LF line terminators.
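
As a rough sketch of these encoding rules, the following standard-library
example produces bytes that are UTF-8 without a BOM and use CR+LF line
terminators.  The helper name C{encode_utx_like} and the sample header and
row values are hypothetical; this is not how L{UtxFile} serialises its
content.

```python
import codecs
import csv
import io


def encode_utx_like(header_lines, rows):
    """Return UTF-8 bytes (no BOM) with CR+LF after every line."""
    buffer = io.StringIO(newline="")
    for line in header_lines:
        buffer.write(line + "\r\n")
    writer = csv.writer(
        buffer,
        delimiter="\t",
        lineterminator="\r\n",
        # No quoting: a field containing a tab or line break raises csv.Error.
        quoting=csv.QUOTE_NONE,
    )
    writer.writerows(rows)
    # The "utf-8" codec never writes a BOM; "utf-8-sig" would add one.
    return buffer.getvalue().encode("utf-8")


# Hypothetical header lines and a single entry.
payload = encode_utx_like(
    ["#UTX-S 1.00; en-US/ja-JP", "#src\ttgt\tsrc:pos"],
    [["file", "ファイル", "noun"]],
)
has_bom = payload.startswith(codecs.BOM_UTF8)
```

When writing such bytes to disk, opening the file in binary mode avoids
any platform-specific newline translation.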

Classes
  UtxDialect
Describe the properties of a UTX-generated TAB-delimited dictionary file.
  UtxHeader
A UTX header entry
  UtxUnit
A UTX dictionary unit
  UtxFile
A UTX dictionary file

Imports: csv, sys, time, base