TranslateOnLinux

It is entirely possible to work as a professional translator while running a GNU/Linux distribution as your chosen operating system.

The available information on compatible translation-related tools being sparse or out of date, this wiki aims to provide a list of offline (desktop) and online (cloud) solutions for linguists who wish to work on a GNU/Linux computer. The listed software should provide a comprehensive toolbox to fit most translation needs.

Inspiration for this list: the website LinuxForTranslators.com by Marc Prior, and tuxtrans, a GNU/Linux distribution specifically targeted at translators, by Peter Sandrini, along with its list of installed software.

Curated by Jean Dimitriadis (EN-FR/EL-FR translator).

CAT TOOLS

Some popular computer-aided translation (CAT) tools, such as SDL Trados, memoQ or Deja Vu are only Windows-compatible, but they are far from being the only options. There are indeed some powerful translation tools that can run on GNU/Linux. It is worth noting almost all of them can handle (albeit with some limitations) Trados SDLXLIFF files and/or SDLPPX packages, which are often sent out by translation agencies.

These include the following:

OFFLINE CAT TOOLS

OmegaT

OmegaT is a free-libre/open source translation memory application written in Java.

Don’t let its simple interface fool you: OmegaT boasts excellent features, some of which (like team projects) are mostly found in expensive proprietary CAT tools.

Some highlights include:

Platforms: GNU/Linux, Windows, OS X. Supported formats: More than 50 formats (with the help of the Okapi plugin), including Microsoft Word, Excel and PowerPoint, LibreOffice, HTML, TTX and SDLXLIFF (Trados), TXML (Wordfast Pro), IDML (InDesign) and PDF (text and Iceni Infix export). Support and manual: Documentation and manual. License: Free-Libre/Open Source Software. Cost: Gratis (Donations welcome). Reviews: OmegaT, ProZ, G2Crowd.

CafeTran Espresso

CafeTran Espresso is a feature-rich CAT tool that is fun to use, built from the ground up by a person who is both a developer and a translator himself, which shows in many ways.

Here are some highlights:

Platforms: GNU/Linux, Windows, OS X. Supported formats: file formats. Support: Solutions (Knowledge Base), Official forum and support, Reference documents. License: Closed source. Demo version: no time limit, TM files up to 1000 TUs in total and glossaries no larger than 500 terms. No limits for TMX editing. Cost: Licensing options // Also, full access for ProZ.com Plus members. Reviews: ProZ.

Memsource Editor/Cloud

Memsource is an easy and simple online CAT tool, which also offers a desktop editor (compatible with Linux), hence its inclusion in this section. It provides a complete translation environment that allows you to set up your own workflow, and share projects with your vendors.

Among other things, it offers extensive file format support with interesting file filter options, and features a thorough support website. It also sports a useful document preview feature and can handle large remote TMs and termbases. It is used by some agencies that offer project-based licenses. The web version is reportedly noticeably slower than the desktop version.

Online/Offline tool Platforms: GNU/Linux, Windows, OS X. Supported formats: file formats. Support and manual: Support, Memsource Cloud user manual, Desktop Editor user manual. License: Closed source. Demo: 30-day trial / Free Personal edition (maximum two files at a time). Cost: editions and pricing. Reviews: ProZ, G2Crowd.

WordFast Pro

WordFast Pro is the only major commercial TM tool that is truly cross-platform. It is available as a standalone application (in contrast to WordFast Classic, which is Word-based).

WordFast Pro 3 and the newer WordFast Pro 5 have different translation interfaces, but both share a number of features, such as multiple supported file formats (and file filter options), unlimited TM and glossary access (as well as remote access, via WordFast Server), machine translation integration, advanced time-saving features and real-time quality assurance (Transcheck).

In addition, WFP 5 offers a WYSIWYG interface for formatting tags, Target-only Live Preview, Segment Filtering, Multilingual Translation Projects, Export and Import Translation Packages (including SDL Trados files and packages) and the ability to Chain (virtually merge) files for translation consistency.

The demo version has some limitations, but it is not time-based, making it useful to keep around, especially for round trip scenarios, since it supports many file formats and offers nice filter options.

Platforms: GNU/Linux, Windows, OS X. Supported formats: WFP 3, WFP 5 (check specifications). Support and manual: WFP 3 support wiki and manual, WFP 5 support wiki and manual, videos and training courses available as well. License: Closed source. Demo: Wordfast offers a demo version that runs without a paid license for translation memories (TM) of up to 500 translation units, making it possible to use Wordfast on actual translation projects before you decide to purchase. You can also register for a 30 day, fully functional trial license in case you have TMs that exceed this limit. Cost: pricing. Reviews: ProZ.

Swordfish Translation Editor

Swordfish III is an advanced CAT (Computer Aided Translation) tool based on XLIFF 1.2 open standard, designed for demanding professional translators. It supports exchanging TMX (Translation Memory eXchange) and uses the GlossML glossary format. It includes a super fast Internal database server and integrated support for RemoteTM Web Server. You can also use third-party database engines like Oracle 10g or MySQL 5.x for storing TM and terminology data.

Swordfish is compatible with other CAT tools, since it supports XLIFF (including SDLXLIFF), Uncleaned RTF, TTX, TTX Exchange and TXML, along with various native file formats.

Other features include In-Context Exact Matches, full interface customization, segment filtering, comfortable proofreading and TM/MT engines.

The developer, Maxprograms, also offers a range of localization software and utilities, including Stingray (a document aligner) and a DITA translation manager and publisher.

Free utilities include Anchovy (Glossary manager and term extractor), SRXEditor (Segmentation Rules editor), XLIFFChecker and TMXValidator.

Platforms: GNU/Linux, Windows, OS X. Supported formats: General documentation, XML formats and Software development. Support and manual: User guide, Getting started, Yahoo support group. License: Closed source. Demo version: 30-day trial. Cost: Online Store. Reviews: ProZ.

Fluency Now

Fluency Now (Professional or Enterprise edition) is an easy-to-use, full-featured CAT tool suite that’s affordable for freelancers and organizations alike.

Some of its product features include:

Platforms: GNU/Linux, Windows, OS X. Supported formats: Benefits & Features. Support and manual: Fluency 101, video tutorials. License: Closed source. Demo version: 15-day trial. Cost: Monthly and yearly subscription. Reviews: ProZ.

Heartsome Translation Studio

Heartsome is a discontinued commercial CAT tool that has since been open sourced and is offered gratis.

The software is getting old (it has not been developed for more than four years), but it sports some nice capabilities that make it useful to keep around.

It also offers a separate Heartsome TMX Editor for editing TMX memories.

Platforms: GNU/Linux, Windows, OS X. Supported formats: see GitHub page. Support: GitHub page has links for user manual, quick start guide and file conversion guide. License: Open sourced. Cost: Gratis. Reviews: ProZ.

WordFast Classic

WordFast Classic is a CAT tool that operates entirely inside of Microsoft Word. Since some versions of MS Word can be installed and run via the Wine compatibility layer (see related section below), WordFast Classic can be used in GNU/Linux.

Platforms: GNU/Linux, Windows, OS X. Supported formats: Word. Support: support wiki, user manual. License: Closed source. Demo: Demo version with no time limit, up to 1000 translation units, limited MT. Cost: Price. Reviews: ProZ.

Other

Virtaal: Virtaal is an easy and lightweight tool that opens Gettext (.po, .mo), XLIFF, TMX and TBX files, among other formats. Even if its nice feature set may not fully appeal to professional translators, it represents a very good utility to use as a quick viewer for the above-mentioned bilingual file types.

Lokalize: Lokalize is a computer-aided translation (CAT) tool, a full-featured GUI application for translators, written from scratch using the KDE4 framework. Aside from basic editing of PO files with nifty auxiliary details, it integrates support for glossary, translation memory, diff-modes for QA, project managing, etc. Mostly interesting for free software localization.

Anaphraseus (WordFast Classic-like OpenOffice/LibreOffice extension)

Esperantilo

ONLINE CAT TOOLS

While problematic from a privacy and freedom point of view, the advent of Cloud/browser-based CAT tools has added several GNU/Linux compatible solutions to the existing arsenal.

Memsource Cloud

Mentioned along with Memsource Editor in the above section.

MateCat

MateCat is a free, simple, enterprise-level online tool, designed to make translation, post-editing and outsourcing easy and to provide a complete set of features to manage and monitor translation projects. It is released as open source software.

Since MateCat supports a host of file formats (as well as MT engines), it can also be used in a round trip scenario (as an additional filter to handle file types unsupported by other CAT tools). Read more about this here.

Supported formats: 70 formats. Support and manual: Documentation and support. License: Software based on mostly open source components. Cost: Free (registration recommended). Reviews: ProZ, G2Crowd.

The completely open source version can also be installed offline or in a VM, and the open sourced MateCat filters are freely available too.

Warning: By default, MateCat stores your translated segments in the public MyMemory TM. To make sure this does not happen unintentionally, create a private TM resource: on the project creation page, click Settings (alternatively, in the TM and glossary field, expand the drop-down menu and select Create resource). Click the + New resource button in the dialog that opens. Optionally give the TM a name, then hit Confirm. You will see that the “MyMemory: Collaborative translation memory” resource is still Enabled for Lookup, but no longer set to be Updated. That way, translated segments will only be stored in your private resources.

Smartcat

Smartcat is an intuitive, feature-rich computer-assisted translation web app. It provides a full set of translation automation technologies for companies and translators and makes it easy for them to connect and collaborate.

Supported formats: file formats. Support and manual: Documentation and support. License: Closed source. Cost: Mostly free (registration needed), billable OCR service. Reviews: ProZ, G2Crowd.

Wordfast Anywhere

WordFast Anywhere is a free and complete online CAT tool, one of the oldest of its kind, allowing translators to work from anywhere, provided they have an Internet connection and a web browser.

Supported formats: txml, txt, doc, docx, xls, xlsx, htm, html, inx, indml, pdf, jsp, asp, odt, tif, rtf, ppt, pptx, mif, ttx, txlf, xlf, xliff, sdlxliff. Support and manual: Wiki, manual. License: Closed source. Cost: Free (registration needed). Reviews: ProZ.

XTM Cloud

XTM International develops XTM, a complete online Translation Management System (TMS) with an integrated Computer Aided Translation (CAT) tool, targeted at enterprises, LSPs and freelance translators. The centrally shared TM, terminology, workflow and translator workbench are all accessed via a browser. While XTM is cost effective and easy to use, it is also a scalable system that is built for collaboration and incorporates a comprehensive API.

Supported formats: file formats. Support and manual: XTM Academy provides access to knowledge base articles, webinars, tutorials, online documentation, release notes, and support. License: Closed source. Demo: 30-day free trial. Cost: Monthly, quarterly or annual subscription pricing depending on the Account type (Freelance, Group, Enterprise), Duration, Number of users and Words/month. Reviews: ProZ, Capterra.

Google Translator Toolkit

Google Translator Toolkit is a web application that integrates Google Translate suggestions. With the Google Translator Toolkit, translators can organize their work and use shared translations, glossaries and translation memories.

Be sure to add the Chrome extension Google Translator Toolkit Booster. GTT Booster improves on Google Translator Toolkit by transforming the experience into something more similar to an industry CAT tool.

Supported formats: Supported file formats & size limits. Support and manual: Translator Toolkit Help. License: Closed source. Cost: Free (Google account needed).

Wordbee

Wordbee is an online translation project management platform and translation editor. It combines a complete CAT tool with price management, customer management, invoicing and linguistic features.

Supported formats: Supported file types. Support and webinars: Documentation, Support, Webinars. License: Closed source. Demo: 14-day free trial. Cost: Monthly or annual subscription.

Other

In addition, various browser-based localization platforms can be used on GNU/Linux: Crowdin, Transifex, Weblate, Pontoon, Zanata, Pootle, POEditor, Webtranslateit, etc.

Alignment

Document alignment is the process of matching source language segments with target language segments and creating a reusable bilingual document (commonly a translation memory). It is a way of making use of existing translation materials, especially those not translated via a CAT tool.

LF Aligner (free, offline) is an excellent alignment tool which relies on Hunalign for automatic sentence pairing. Input: txt, doc, docx, rtf, pdf, html. Output: tab-delimited txt, tmx, and xls. With web features. The Linux version opens an interactive, easy-to-use terminal interface. The alignment review step requires editing the produced file in a spreadsheet program. The Windows version does not need installation and can run on Wine. It offers a GUI and, at the alignment review/editing step, opens an easy interface for splitting and merging source/target segments and fixing any alignment errors.

WordFast Aligner (free, online)

PlusTools is a free MS Word add-in that can handle multiple tasks, including aligning documents.

YouAlign.com (free, online, registration needed, 1MB max file size) is an alignment tool based on AlignFactory, a powerful automated document alignment tool with a web crawler.

TM-Town (online, registration needed) has an alignment tool that can help you convert a source and target document into a translation memory (TM) file.

WordFast Anywhere (free, online, registration needed) also integrates an autoaligner.

Many other CAT tools, such as WordFast Pro, OmegaT, CafeTran Espresso, SmartCat and Memsource offer an alignment feature.

Stingray Document Aligner (paid) is a cross-platform document aligner designed to assist professional translators in the production of translation memories from existing translated material.
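To make the end product of an alignment more concrete, here is a minimal Python sketch that turns already-aligned, tab-delimited segment pairs (the kind of file an aligner can output) into a bare-bones TMX document. The function names, the header values and the language codes are assumptions chosen for illustration; real aligners write richer TMX files, and this is not how any of the tools above work internally.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tab_delimited(text):
    """Return (source, target) pairs from tab-delimited text, skipping incomplete lines."""
    pairs = []
    for line in text.splitlines():
        cols = line.split("\t")
        if len(cols) >= 2 and cols[0].strip() and cols[1].strip():
            pairs.append((cols[0].strip(), cols[1].strip()))
    return pairs


def pairs_to_tmx(pairs, src_lang, tgt_lang):
    """Build a minimal TMX 1.4 document (as a string) from aligned pairs."""
    tmx = ET.Element("tmx", version="1.4")
    # Header values below are placeholders for this sketch, not tool defaults.
    ET.SubElement(tmx, "header", {
        "creationtool": "alignment-sketch",
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "tab-delimited",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src, tgt in pairs:
        tu = ET.SubElement(body, "tu")  # one translation unit per aligned pair
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


aligned = read_tab_delimited("Hello\tBonjour\nGood night\tBonne nuit\n")
print(pairs_to_tmx(aligned, "en", "fr"))
```

The alignment itself (deciding which source sentence belongs to which target sentence) is the hard part and is left to the tools listed above; the sketch only covers the final packaging step.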

Handling tags

Word documents (especially those coming from OCR’d files or PDF converted files) are often strewn with codes which produce unnecessary tags when imported in a CAT tool. This tagged information shows up in the translation grid as spurious codes{1}around{2}, or even in the mid{3}dle of, words, making sentences difficult to read and translate and generally negating many of the productivity benefits of the program. There are ways to get around this.

CodeZapper (€20 licence) is a set of Word VBA macros designed to “clean up” Word files before being imported into a standalone translation environment so that the files have fewer tags.

TransTools - Document Cleaner (freeware) Part of the TransTools set of MS Office add-ins, Document Cleaner is a collection of tools for the preparation of badly formatted documents for translation. It features a Tag cleaner tool, which attempts to strip unnecessary tags.

CafeTran Espresso CAT tool offers a special filter to handle MS Word documents after OCR, which clears the source text of unnecessary formatting tags. It can prove useful anytime an MS Word document produces too many unnecessary tags.

OmegaT CAT tool has a Remove tags option, which strips ALL tags from the imported document(s). The latest version now includes the TagWipe utility/Groovy script for clearing excessive tags on Word documents. It offers a GUI that lets the user select different options with which it should be run.
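As a rough illustration of what blunt tag removal amounts to, the following Python sketch strips numbered placeholders written as {1}, {2}, etc. from a segment. The {n} notation is only the one used in the example above; each CAT tool has its own tag representation, so the pattern is an assumption and would need adapting.

```python
import re

# Assumption: tags appear as {1}, {2}, ... as in the example in this section.
TAG = re.compile(r"\{\d+\}")


def tag_count(segment):
    """Count the numbered placeholders in a segment."""
    return len(TAG.findall(segment))


def strip_numbered_tags(segment):
    """Remove every {n} placeholder and collapse any doubled whitespace left behind."""
    without_tags = TAG.sub("", segment)
    return re.sub(r"\s{2,}", " ", without_tags).strip()


segment = "The mid{1}dle of a {2}word{3}."
print(tag_count(segment))            # number of placeholders found
print(strip_numbered_tags(segment))  # same text without placeholders
```

Like any option that strips all tags, this discards legitimate formatting (bold, links, footnote markers) together with the spurious codes, which is why the dedicated cleaners above work on the Word file itself instead.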

Translation Memory (TMX) Editing/Maintenance

CafeTran Espresso (paid software // free to use TMX editing feature)

CafeTran Espresso CAT tool offers a powerful workflow for performing various editing and maintenance tasks on TMX Translation Memories. The demo version’s TM limits do not apply to the “Edit translation memory” workflow.

Heartsome TMX Editor (free)

A powerful TM maintenance tool for all CAT software. It provides many useful and practical functions besides common editing features, allowing you to perform TM maintenance tasks easily, simply and all with one tool.

TMXEditor (paid, 30-day free trial)

Maxprograms TMXEditor is a cross-platform desktop application designed for editing TMX. It is able to handle very large files with millions of segments.

SuperTMXMerge

Diff tool for comparing and merging TMX translation memories.

Goldpan TMX/TBX Editor (free, Windows VM only)

An intuitive and multifunctional TMX/TBX file editor from Logrus Global Software Development Team.

Segmentation/SRX editors

The SRX (Segmentation Rules eXchange) format is an open standard to save segmentation rules in a file so they can be used between different tools.

Examples of tools that support SRX files: OmegaT, CafeTran Espresso, Swordfish, WordFast Pro, and Memsource.

Okapi Ratel is a free-libre/open source cross-platform application that helps create and maintain segmentation rules in the SRX format. Such rules are used to break down translatable text into more meaningful parts. Ratel is part of the Okapi Framework, and can be used either as a standalone utility or from within Okapi Rainbow.
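The idea behind such rules can be sketched in a few lines of Python. Roughly speaking, an SRX rule pairs a "before break" pattern with an "after break" pattern and says whether a match means "break here" or "do not break here"; rules are tried in order and the first match wins. The two rules below are hypothetical examples written for this sketch, not the default rules of any tool, and the code does not read actual SRX files.

```python
import re

# (is_break, before_pattern, after_pattern). Order matters: first matching rule wins.
# Both rules are illustrative assumptions.
RULES = [
    (False, r"\b(?:Mr|Mrs|Dr|etc)\.", r"\s"),  # exception: no break after these abbreviations
    (True, r"[.?!]", r"\s+"),                  # break after sentence-final punctuation
]


def segment(text, rules=RULES):
    """Split text into segments using ordered break / no-break rules."""
    compiled = [
        (is_break, re.compile(rf"(?:{before})\Z"), re.compile(after))
        for is_break, before, after in rules
    ]
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in compiled:
            # A rule applies when the text up to pos ends with the "before"
            # pattern and the text from pos onwards starts with the "after" one.
            if before.search(text[:pos]) and after.match(text, pos):
                if is_break:
                    segments.append(text[start:pos].strip())
                    start = pos
                break  # first matching rule decides; ignore the rest
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


print(segment("Dr. Smith arrived. He was late! Why?"))
```

Putting the exception first is what keeps "Dr." from ending a segment. The slicing at every position makes this quadratic in the text length, which is fine for a demonstration but not for large files.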

Maxprograms’ SRXEditor is included in Stingray and Swordfish III installers. While these programs require a license to be used, the SRXEditor can be launched freely.

Downloadable SRX rules: OmegaT’s default rules, LanguageTool’s segmentation rules.

Term extraction

Monolingual term extraction attempts to analyze a text or corpus in order to identify candidate terms, while bilingual term extraction analyses existing source texts along with their translations in an attempt to identify potential terms and their equivalents.
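For a feel of what purely statistical monolingual extraction does, here is a small Python sketch that counts word sequences of up to three words and keeps those that recur and do not start or end with a stopword. The stopword list, the thresholds and the function name are assumptions made for the example; the tools listed below add linguistic analysis and reference corpora, which this sketch does not attempt.

```python
import re
from collections import Counter

# Tiny English stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "in", "to", "and", "is", "for", "on", "with", "by"}


def candidate_terms(text, max_len=3, min_freq=2):
    """Return (candidate, frequency) pairs for recurring 1- to max_len-word sequences."""
    # Words are runs of letters, optionally hyphenated; digits are ignored.
    words = re.findall(r"[^\W\d_]+(?:-[^\W\d_]+)*", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            # Stopwords may appear inside a candidate, but not at its edges.
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            counts[" ".join(gram)] += 1
    return [(term, freq) for term, freq in counts.most_common() if freq >= min_freq]


sample = (
    "The translation memory stores segments. "
    "A translation memory is reusable. The memory grows."
)
for term, freq in candidate_terms(sample):
    print(freq, term)
```

The sketch ignores sentence boundaries and inflected forms, so the candidates still need the human review that any term extraction output requires.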

Monolingual term extraction

Prospector

Free, online, English only. If you work on English source texts, try it. Based on the world’s largest corpus of the English language supported by a Brigham Young University professor, and a unique algorithm developed by a team of in-house experts.

TermoStat Web

Free, online. Languages: English, French, Italian, Portuguese, Spanish. The results are really good. Uses linguistic and statistical methods while taking the potential terms’ structures and relative frequencies into account in the analysis corpus. TermoStat is free, but users must register. Outputs a Tab delimited TXT (which can be renamed to and opened as a CSV file in LibreOffice).

Tilde terminology (powered by Taas)

Free/Premium, online, 25 languages. Various options to choose from. By default, it uses TWSC (Tilde wrapper system for CollTerm) based on linguistic analysis enriched by statistical features. Extraction time somewhat long, but results quite accurate. By specifying a subject domain, some term candidates are matched with translations from a variety of sources for target translation lookup.

The Sketch Engine

Subscription-based, online. Several supported languages. Monolingual and bilingual term extraction supported. Also offers a simpler OneClick term extractor for single word and multiword candidates.

Okapi Framework - Rainbow

Free, offline. Its term extraction utility offers flexible, configurable statistical analysis. Supports all languages which use scripts with space-delimited words.

Bilingual term extraction

The Sketch Engine

Subscription-based, online. Several supported languages. Monolingual and bilingual term extraction supported.

Anchovy

Anchovy is an offline multilingual cross-platform glossary editor and bilingual term extraction tool based on the open Glossary Markup Language (GlossML) format.

Anchovy is included in Swordfish III installers as a free plugin and handles various import and export file formats.

Terminology management

File formats

Excel / CSV/TSV / Tab-delimited TXT

Some simple glossary formats such as Excel files, CSV/TSV (Comma-separated values/Tab-separated values) files and tab-delimited text files may be all you need for your terminology management and glossary exchange needs.

Excel and CSV/TSV files can simply be edited in office applications such as Microsoft Excel or LibreOffice Calc (which handles CSV/TSV better than Excel). Ron’s Editor, an excellent dedicated CSV editor with professional features (Free/Lite and Pro versions available, with a 30-day trial), is Windows-only, but can be used with Wine after installing .NET Framework 4.5.2.

Tab-delimited TXT files can be edited in any text editor, although it is recommended to simply rename them to .csv or .tsv, no other conversion needed.

Most CAT tools support such files (and converting from one of these file formats to another is mostly trivial): OmegaT, CafeTran Espresso, Matecat, Memsource (Import/Export Excel), WordFast Pro (Tab delimited TXT), Fluency Now, Google Translator Toolkit, etc.
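When a tool insists on comma-separated input, the conversion can be done with a short script. The Python sketch below uses the standard csv module to turn tab-delimited glossary text into CSV; the function name is made up for the example, and the script assumes a plain two-or-more-column glossary without embedded line breaks.

```python
import csv
import io


def tsv_to_csv(tsv_text):
    """Convert tab-delimited glossary text to comma-separated text."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # Skip empty lines; trim stray spaces around each cell.
        if any(cell.strip() for cell in row):
            writer.writerow([cell.strip() for cell in row])
    return out.getvalue()


print(tsv_to_csv("source\ttarget\nhello, world\tbonjour le monde\n"))
```

Letting the csv module do the writing matters as soon as a term contains a comma: the module quotes that cell, whereas a naive search-and-replace of tabs by commas would shift the columns.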

Swordfish uses the open Glossary Markup Language (GlossML) format. It offers a free glossary editor, Anchovy, which can import TMX, CSV and tab-delimited glossaries and export to GlossML, CSV, HTML, TMX, TBX and XML.

For conducting CAT-tool independent TM and glossary searches, TMLookup is an open-source tool designed to search (massive) bilingual and multilingual text databases (translation memories) and glossaries. For glossaries, it can import TXT and XLS files. It runs fine on Wine.

TBX

TBX, short for TermBase eXchange, is the international standard for representing and exchanging information about terms, words, and other lexical data. Here’s a list of tools with some TBX support: http://www.tbxinfo.net/tbx-support/

Several CAT tools can import/read TBX files: OmegaT, CafeTran Espresso, Memsource, WordFast Pro, Swordfish, etc.

Heartsome TMX editor can import and export to TBX (but not directly edit), among other formats.

Anchovy can export to TBX: Anchovy (free).

Virtaal can edit/read TBX files

XBench (see below) can import TBX files and convert them to tab-delimited TXT files.

Goldpan TMX/TBX Editor (installs but does not run correctly in Wine, needs a Windows VM) can edit and export/save TBX (and TMX) files
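If none of these tools is at hand, a simple TBX file can also be flattened to a tab-delimited glossary with a few lines of Python. The sketch below assumes the older TBX layout (termEntry elements containing langSet elements with an xml:lang attribute and term elements somewhere below them) and keeps only the first term per language; files using a different TBX dialect or element names would need the names adapted. The function name and sample data are made up for the example.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tbx_to_rows(tbx_text, src_lang, tgt_lang):
    """Return 'source<TAB>target' rows for entries that have a term in both languages."""
    root = ET.fromstring(tbx_text)
    rows = []
    for entry in root.iter("termEntry"):      # assumed element name
        terms = {}
        for lang_set in entry.iter("langSet"):  # assumed element name
            lang = lang_set.get(XML_LANG)
            first_term = next(lang_set.iter("term"), None)
            if lang and first_term is not None and first_term.text:
                terms[lang] = first_term.text.strip()
        if src_lang in terms and tgt_lang in terms:
            rows.append(f"{terms[src_lang]}\t{terms[tgt_lang]}")
    return rows


sample = """<martif type="TBX" xml:lang="en"><text><body>
<termEntry id="c1">
<langSet xml:lang="en"><tig><term>translation memory</term></tig></langSet>
<langSet xml:lang="fr"><tig><term>mémoire de traduction</term></tig></langSet>
</termEntry>
</body></text></martif>"""
print("\n".join(tbx_to_rows(sample, "en", "fr")))
```

Synonyms, definitions and other descriptive fields are dropped by this sketch, so it is only suitable when a flat two-column glossary is all that is needed.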

SDLTB

Trados Studio termbases (SDLTB) are written in a proprietary format, created by SDL MultiTerm. Since SDL Trados/MultiTerm are quite common, you might come across these files fairly often.

Some CAT tools support importing SDLTB files: Fluency Now and Memsource (partly)

Tools for converting SDLTB files: Trados Studio Resource Converter and WfConverter

Tools for converting to SDLTB (among other conversions): Glossary Converter (Windows VM, or Wine with MS Office)

Tools

XBench is a Terminology and QA tool.

It features an older freeware non-unicode version (2.9) and a yearly subscription-based version 3.x in active development. The program is Windows-only but runs well on Wine.

XBench supports a host of glossary, TM and bilingual formats. For Terminology, XBench can import/read Tab-delimited Text files, TBX, MultiTerm XML Glossaries, Wordfast Glossaries, etc.

It can also be used to convert to TMX and Tab-delimited text files.

Office software

An Office suite represents an important part of a translator’s or reviser’s toolkit and is often a critical step in the translation workflow.

Microsoft Office

Various versions of Microsoft Office are quite well supported via Wine and are installable using PlayOnLinux (free) or CrossOver (paid). See dedicated section below.

If needed, Microsoft Office can also be installed in a Windows VM (see related section).

Useful Microsoft Office add-ins:

LibreOffice

LibreOffice is an excellent Office suite in its own right. It can open and save MS Office documents, but compatibility is not always perfect, which can be critical when it comes to delivering final documents and meeting client expectations. You may need to consider other options as well.

Feature/compatibility comparison with Microsoft Office.

Useful LibreOffice extensions:

Apache OpenOffice

Less actively developed and less feature-rich than LibreOffice.

WPS Office

Free for Linux users. Boasts high MS Office compatibility.

SoftMaker Office 2018

Paid software. 30-day free trial. Boasts excellent MS Office compatibility.

Tip: Trial can be reset by removing the SoftMaker folder.

SoftMaker FreeOffice

The free version of SoftMaker Office.

OnlyOffice

Has good looks and potential.

Google Docs/Google Drive (online)

Mostly for collaboration or own documents, not necessarily meant to replace a complete Office Suite.

Microsoft Office Online (online)

Same as above.

A word on Microsoft fonts:

You’ll probably need a way to install Microsoft/Windows fonts on your Linux distribution to be able to work with the same fonts as other Microsoft Office users. These are mostly proprietary and cannot be distributed along with your GNU/Linux distribution.

Copying the fonts folder from a Windows machine and placing it in the hidden .fonts folder of your home folder (or creating it, if it does not exist) is a distribution-independent solution. Other methods are available per GNU/Linux distribution.
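The copy step can be scripted, for instance with the following Python sketch. The function name, the list of font file extensions and the example source path are assumptions; adjust the source folder to wherever the Windows Fonts folder was copied or mounted on your machine.

```python
import shutil
from pathlib import Path

# Font file types to copy; an assumption that covers the common cases.
FONT_SUFFIXES = {".ttf", ".ttc", ".otf"}


def copy_fonts(source_dir, target_dir=None):
    """Copy font files from source_dir to target_dir (default: ~/.fonts)."""
    source = Path(source_dir)
    target = Path(target_dir) if target_dir else Path.home() / ".fonts"
    target.mkdir(parents=True, exist_ok=True)  # create the folder if it does not exist
    copied = []
    for font in sorted(source.iterdir()):
        if font.is_file() and font.suffix.lower() in FONT_SUFFIXES:
            shutil.copy2(font, target / font.name)
            copied.append(font.name)
    return copied


# Hypothetical location of a copied Windows Fonts folder:
# copy_fonts("/media/windows/Windows/Fonts")
```

After copying, running `fc-cache -f` in a terminal should refresh the font cache so that applications pick up the new fonts without a restart of the session.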

If fonts don’t look good, you might need to learn how to improve anti-aliasing.

Language/Grammar checkers/Writing aids

Note: PerfectIt (EN) (paid) cannot be installed with Wine (as tested); it needs to be used in a Windows VM.

QA tools

Each CAT tool offers a built-in QA:

On top of the QA checks integrated into your chosen CAT tool, you may want or need to use a dedicated QA tool.

Proofreading/Revising (Track changes, compare documents)

Reviewing translations and proofreading can either be done within a CAT tool, especially with the Track Changes feature, via an exported Word or RTF bilingual table or on the exported target document itself.

Let’s take each scenario separately:

Track changes (in a CAT tool):

In SDL Trados and memoQ, reviewers often make use of the included Track changes feature.

While each CAT software offers different methods for reviewing translations, if you are working outside of the above-mentioned tools, you cannot use Track changes in the bilingual (SDLXLIFF, MQXLIFF) files themselves.

You can, however, suggest the use of TQAuditor’s Quick Compare feature, to produce an Excel report highlighting any changes made during revision, by comparing the translated and the reviewed bilingual file. If you register with TQAuditor light version for freelancers (free), you can keep track of these jobs and access a few more actions. Since TQAuditor supports bilingual files from various CAT tools, it represents a universal, “CAT-tool-agnostic” solution for producing translation reviews. This means you can open the SDLXLIFF files to review, make the necessary changes in your preferred CAT tool, and then create a comparison report in TQAuditor.
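The principle of such a comparison report can be illustrated with Python’s standard difflib module. This is only a sketch of the general idea and not how TQAuditor works: it assumes the translated and reviewed target segments have already been extracted from the bilingual files into two lists of equal length, and the function name is made up.

```python
import difflib


def changed_segments(translated, reviewed):
    """List segments the reviewer changed, with a rough similarity score (0 to 1)."""
    report = []
    for number, (old, new) in enumerate(zip(translated, reviewed), start=1):
        if old != new:
            ratio = difflib.SequenceMatcher(None, old, new).ratio()
            report.append((number, old, new, round(ratio, 2)))
    return report


translated = ["Bonjour le monde", "Bonne nuit"]
reviewed = ["Bonjour tout le monde", "Bonne nuit"]
for number, old, new, ratio in changed_segments(translated, reviewed):
    print(f"Segment {number}: {old!r} -> {new!r} (similarity {ratio})")
```

A real report would also show which words were inserted or deleted within each segment; difflib can provide that too, but the segment-level list is usually the starting point.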

(Track) changes (in an exported bilingual Word/RTF)

Working on exported bilingual Word or RTF files offers the advantage that the reviewer does not need to own the original CAT tool, or any CAT tool, for that matter. The disadvantage is that the reviewer is left without the convenience (and the TM and other resources) of a CAT tool. If you receive a bilingual Word or RTF file for external review, you can just use the word processor of your Office suite (MS Word, LibreOffice Writer, WPS Writer, etc.) to complete the review. Provided that the bilingual review files can be opened in LibreOffice (SDL Trados files may not be compatible, it seems), it’s good to know that its track changes/compare document feature is compatible with MS Office, at least with some precautions (like ensuring you use the same author name and initials in both programs, see archived link here). If you don’t use Track changes, the Compare document feature can be used to create a document showing the modifications made.

(Track) changes (in the target document)

Here, it’s about working in the software program that handles the original file format. If it’s a Word document, you can use Track changes (or Compare Document) as usual. When delivering the final file, you are usually expected to also save a separate copy with no tracked changes (accept all).

Okapi Framework

The Okapi Framework is a cross-platform and free-libre/open source set of components and applications that offer extensive support for localizing and translating documentation and software.

Rainbow is an Okapi application which allows you to launch different utilities to help you perform various localization-related tasks. It includes CheckMate and Ratel, which can also be launched as standalone utilities.

CheckMate is a GUI application that performs various QA checks on bilingual translation files such as XLIFF, TMX, TTX, PO, TS, Trados-Tagged RTF, and any other bilingual format supported by the framework.

Ratel is a GUI application to create and maintain segmentation rules. Such rules are used to break down translatable text into more meaningful parts. Ratel uses Okapi’s SRX-based segmentation engine. SRX is the Segmentation Rules eXchange format.

BootCaT

Bootstrap Corpora And Terms from the Web. BootCaT is a free/libre software cross-platform Java application, which you can use to create specialized monolingual corpora on the fly. You can use a concordancer, such as AntConc, on the resulting corpus. Note: The Sketch Engine uses WebBootCaT among other features.

AntConc

AntConc is a freeware corpus analysis toolkit for monolingual concordancing and text analysis.

AntFileConverter

A freeware tool to convert PDF and Word (DOCX) files into plain text for use in corpus tools like AntConc.

AntFileSplitter

A freeware text file splitting tool.

AntPConc

A freeware parallel corpus analysis toolkit for concordance search and text analysis using UTF-8 encoded text files.

EncodeAnt

A freeware tool for detecting and converting character encodings.

Any2UTF8

Any2UTF8 is a simple program to convert a plain text file in any character encoding to UTF-8.
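For a one-off conversion, the same result can be obtained with a few lines of Python, as in the sketch below. It is unrelated to how Any2UTF8 or EncodeAnt work: the function names and the list of candidate encodings are assumptions, and the fallback function merely tries encodings in order rather than truly detecting them.

```python
def to_utf8(raw_bytes, source_encoding):
    """Re-encode bytes from a known source encoding to UTF-8."""
    return raw_bytes.decode(source_encoding).encode("utf-8")


def decode_with_fallbacks(raw_bytes, candidates=("utf-8", "cp1252", "latin-1")):
    """Try candidate encodings in order; return (text, encoding_used).

    This is guessing, not detection: an encoding that decodes without
    error is not necessarily the right one.
    """
    for encoding in candidates:
        try:
            return raw_bytes.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings fit")


legacy = "café".encode("cp1252")      # bytes as a legacy Windows tool might save them
text, used = decode_with_fallbacks(legacy)
print(used, text, to_utf8(legacy, used))
```

Always check the converted text by eye, especially accented characters and typographic quotes, before importing it into a CAT or corpus tool.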

HTTRack/WebHTTRack

A free (GPL, libre/free software) and easy-to-use offline browser utility allowing you to copy a World Wide Web site from the Internet to a local directory, building recursively all directories, getting HTML, images, and other files from the server to your computer. HTTrack arranges the original site’s relative link structure. Simply open a page of the “mirrored” website in your browser, and you can browse the site from link to link as if you were viewing it online. Useful for some website translation scenarios.

Translate Toolkit

A toolkit with various terminal utilities for localization engineers, offering format conversions, and quality assurance tasks.

TMXValidator

Check the validity of your TMX documents on any platform.

XLIFFChecker

Verify the validity of your XLIFF (XML Localisation Interchange File Format) documents on any platform.

Subtitling

Professional subtitling software is mostly Windows only, I’m afraid. Here are some free alternatives for more occasional subtitling (although Aegisub and, to a lesser extent, Subtitle Edit, are being used by professional subtitlers as well).

Aegisub

Aegisub is a free, cross-platform open-source tool for creating and modifying subtitles. Aegisub makes it quick and easy to time subtitles to audio and features many powerful tools for styling them, including a built-in real-time video preview.

Probably the most advanced and powerful free subtitling tool out there. Excellent for time-spotting.

Aegisub’s user manual.

Subtitle Edit

Subtitle Edit is a free (open source) subtitle editor. It can read, write, and convert between more than 200 subtitle formats. Useful for file conversions. Help page. Tutorial videos.

Gaupol

Editor for text-based subtitles. Supports translating subtitles side by side with the original.

Amara offers a very good online subtitling platform.

dotsub is another web-based system for creating and viewing subtitles for videos in multiple languages across all platforms.

Transcription

Foot pedal not included.

Easytranscript

Easy-to-use transcription software with a variety of features. Free.

Express Scribe

There are two options for running Express Scribe on the Linux operating system.

Transcribe!

Download for Linux. There is a 30-day evaluation period, after which you decide if you wish to buy the software.

Dictionary lookup

GoldenDict is a feature-rich cross-platform dictionary lookup program.

The program has the following features:

With a bit of searching online, you can find various (mostly monolingual) Babylon, StarDict, Dictd and Lingvo dictionaries for some languages to use in GoldenDict.

Other dictionary applications exist on GNU/Linux, but they are nowhere near GoldenDict in functionality.

For dictionary file conversion, make sure to check out PyGlossary, a tool that converts dictionary files (also known as glossaries) between the formats used by different dictionary applications.

Depending on your language pairs and subject fields, there are multiple dictionaries, glossaries, corpora, terminology banks, concordancers, etc. available online. Listing them is beyond the scope of this document.

Here’s just one: MagicSearch.com is a multilingual metasearch engine which allows you to search multiple sources (dictionaries, corpora, machine translation engines, concordancers, search engines) with a single click. Select a language pair and submit a search (select the same source and target language for monolingual searches). MagicSearch will display a single scrollable page with multiple sources. You can click on each source button as it changes its color (=loads) to move to the respective source. MagicSearch remembers the language pair you selected the next time you visit the site (using a cookie).

Tip: Set “One-page results” to OFF, to speed up queries. Click on the gear to select which resources to use (and hide the others), and reorder them as you see fit. MagicSearch is also available as a browser extension.

Project management & Invoicing

Project management tools don’t have to be translation-specific, although it may help. Also, depending on your country of residence and tax requirements, you can use various general-purpose invoicing tools.

Protemos

Free for freelancers. Track translation orders, monitor deadlines, organize files, send invoices and control payments.

LSP.expert

Project management and invoicing tool for freelance translators, teams, and agencies. Online, subscription-based, 30-day trial.

BaccS

Translation project management and invoicing tool. Offers an offline Windows-only version as well as an online version. Paid Freelance edition with permanent licence. Free for ProZ Plus members.

If you are a ProZ.com member, you can also use the online ProZ.com invoicing tool.

Flantie

Flantie is an easy-to-use online task and invoice management system for translators and interpreters. It offers both a Free and a Pro plan.

Project Libre

An open source alternative to Microsoft Project.

DTP - Image localization

There is no QuarkXPress, Adobe InDesign, Photoshop or Illustrator for GNU/Linux.

Those offering DTP services may need to use a Windows VM or simply reconsider.

GNU/Linux is not completely lacking in the DTP department: Scribus can produce professional PDFs, just like InDesign or QuarkXPress; Inkscape is probably just as good as Illustrator; and GIMP can tackle many of Photoshop’s capabilities.

It’s just that these tools cannot provide the required compatibility to flawlessly integrate into a client’s proprietary DTP workflow.

Along with some other software and utilities, they can, however, prove useful for occasional image localization and editing.

Speech recognition (STT)

Speech recognition is quite lacking on GNU/Linux. At present, there are few options comparable to Nuance’s Dragon NaturallySpeaking, which does not run natively.

Some solutions, such as Voice Notebook, use the Google Speech API, which is also used in Google Docs (CafeTran offers a way to take advantage of that).

Text-to-speech (TTS)

Text-to-speech can be used in a number of scenarios for revision or self-revision purposes. Alas, quality solutions do not abound.

Interested users might want to investigate Balabolka, which runs in Wine (not tested).

Another possible solution would be to use Amazon Polly text-to-speech service. Drop me a line if interested.

PDFs

PDF Editors

Sejda PDF Editor (Free, with some premium features)

Foxit Reader (Free) PDF reader (not editor) with some lite editing features such as PDF annotation and PDF sign.

PDF-XChange Editor (Free)

Qoppa PDF Studio (Paid)

Master PDF Editor (Paid)

Infix PDF Editor/TransPDF (Paid, free for TransPDF editing)

Iceni’s Infix is a PDF editor for Windows which works well under Wine (see related section below). Just deselect the PDF printer option during installation.

This software is of special interest to translators because it can be used along with TransPDF, a free/pay-as-you-go online service for translating PDF files. TransPDF converts the PDF to the XLIFF format, which can be translated in a CAT tool and then converted back into a translated PDF that closely follows the original layout.

PDFs translated via TransPDF can be edited using the free demo version of Infix PDF Editor. This is very handy for making final adjustments to spacing and layout ready for clients.

Users of the paid Infix version can use TransPDF free of charge. ProZ Plus members receive 10 free credits (equivalent to 10 PDF pages) per month, along with a discount on credit purchases.

TransPDF also offers a paid OCR feature for non-editable PDFs.

For a thorough review of solutions for handling PDFs in translation, see here.

PDF Readers

Evince: Document viewer for multiple document formats. Supports PDF, PostScript, DjVu, TIFF, and DVI.

Foxit Reader: PDF reader (not editor) with some lite editing features such as PDF annotation and PDF sign/protect.

Acroread: Acrobat Reader 9 for Linux. You can install the abracadabraCompteur 2016 plugin for PDF word counts.

Okular: Universal document viewer, supporting different kinds of documents, like PDF, Postscript, DjVu, CHM, XPS, ePub, and others.

PDFSam: Split, merge, extract pages, rotate and mix PDF files.

PDF Chain: A graphical interface for manipulating PDF documents (concatenate, burst, watermark, attach files…)

PDF Mod: Modify PDF documents: Reorder, rotate, and remove pages, export images from a document, edit the title, subject, author, and keywords, and combine documents via drag and drop.

Tabula: Tabula is a tool for liberating data tables locked inside (editable) PDF files.

PDF TO TEXT

For converting editable PDFs to DOCX, most PDF conversion utilities require a Windows VM.

Able2Extract Professional is closed-source, paid software that offers a Linux desktop version. It lets you convert, create and edit PDF documents. Conversion works from PDF to Word, Excel, PowerPoint, AutoCAD, images, Publisher and LibreOffice documents (ODT, ODS and ODP). It has an OCR feature as well, including extracting scanned PDF tables into Excel.

Foxit’s online PDF-to-Word converter produces excellent results. The CloudConvert online converter does a very nice job as well, with the added bonus that it handles a host of other file conversion types and offers an API (so you can script it).

AntFileConverter is a freeware tool to convert PDF and Word (DOCX) files into plain text for use in corpus tools like AntConc.

For command-line enthusiasts, programs such as pdftotext, pdfreflow (pdftohtml), Calibre’s ebook-convert, pdf2htmlEX and PDFBox can be added to the mix for PDF text extraction.
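As an example of scripting one of these tools, the Python sketch below wraps pdftotext (part of Poppler, usually packaged as poppler-utils). The helper names `pdftotext_command` and `extract_text` are hypothetical; the `-layout` option is a real pdftotext flag that tries to preserve the physical layout of the page.

```python
import shutil
import subprocess


def pdftotext_command(pdf_path, txt_path, keep_layout=True):
    """Build the argument list for Poppler's pdftotext."""
    command = ["pdftotext"]
    if keep_layout:
        command.append("-layout")  # keep columns and indentation where possible
    command += [str(pdf_path), str(txt_path)]
    return command


def extract_text(pdf_path, txt_path, keep_layout=True):
    """Run pdftotext; raises CalledProcessError if the conversion fails."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; it is usually packaged as poppler-utils")
    subprocess.run(pdftotext_command(pdf_path, txt_path, keep_layout), check=True)
```

Keeping the command builder separate from the runner makes it easy to loop over a folder of PDFs or to swap in another extractor later.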

For a thorough review of solutions for handling PDFs in translation, see here.

OCR

Native OCR solutions usually lag behind ABBYY FineReader and other proprietary software.

Able2Extract Professional is paid software that offers PDF conversion, creation and editing with OCR capabilities, including extracting scanned PDF tables into Excel (7-day free trial available).

ABBYY FineReader and other programs (such as Adobe Acrobat Pro, Nuance OmniPage, Readiris and Wondershare PDFelement) can be used in a Windows VM, see related section. Some earlier versions of ABBYY FineReader (up to version 10) are reportedly working via Wine, see related section.

Free/Open Source OCR engines include Tesseract, GOCR, and Cuneiform. The first two provide decent results but struggle with complex layouts.

Various GUI software tools make use of these engines.

One of the easiest and most feature-rich is gImageReader. It scans images and PDFs, with manual and automatic recognition in multiple languages. Post-processing of the recognized text includes a spellchecker, an excellent find-and-replace feature and a useful “remove line breaks” action that is handy outside of OCR tasks as well.
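For text that has already left the OCR tool, a similar line-break cleanup can be approximated with a short Python sketch. This is an assumption about what such an action typically does, not gImageReader's actual implementation, and the function name `remove_line_breaks` is made up. Note that it also rejoins words hyphenated at a line end, which will wrongly merge genuine compounds such as "well-known" if they happen to break there.

```python
import re


def remove_line_breaks(text):
    """Join hard-wrapped lines while keeping blank-line paragraph breaks."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    joined = []
    for paragraph in paragraphs:
        # Rejoin words hyphenated across a line break: "transla-\ntion".
        paragraph = re.sub(r"(\w)-\n(\w)", r"\1\2", paragraph)
        # Collapse remaining line breaks (and surrounding spaces) into one space.
        paragraph = re.sub(r"\s*\n\s*", " ", paragraph)
        joined.append(paragraph.strip())
    return "\n\n".join(joined)


wrapped = "The transla-\ntion was\nfinished.\n\nNext para-\ngraph."
print(remove_line_breaks(wrapped))
```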

When it comes to image scan post-processing, or pre-processing for use in a program like ABBYY FineReader (to improve OCR, but also for creating scanned ebooks), there is nothing like Scantailor. Or maybe there is: Scantailor Advanced. The wiki is here.

E-book management & conversion

Calibre is an excellent e-book management application for organizing, viewing, converting and editing e-books. Since linguists tend to keep many (e)books around, it’s definitely worth looking into it.

Sigil is a very nice epub editor.

Interested in translating e-books? A solution is described here.

File/Folder comparison (DIFF) tools

Meld: Visual diff and merge tool for files, directories, and version controlled projects.

Beyond Compare: Compare files and folders, merge changes, synchronize files and generate reports. Commercial software.

DiffPDF: Compare PDFs

diff-pdf: Compare PDFs (terminal)

SuperTMXMerge: Diff tool for comparing and merging TMX translation memories.
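For a quick comparison of two versions of a plain text translation without leaving a script, Python's standard `difflib` module can produce the same kind of unified diff these tools display. The sketch below is only an illustration; the function name `text_diff` and the labels "draft" and "revised" are invented.

```python
import difflib


def text_diff(old_text, new_text, old_name="draft", new_name="revised"):
    """Return a unified diff of two texts as a list of lines."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",  # input lines carry no newline, so add none to headers
        )
    )


for line in text_diff("Bonjour le monde\nMerci", "Bonjour tout le monde\nMerci"):
    print(line)
```

Lines starting with `-` come from the first text and lines starting with `+` from the second; identical texts yield an empty list.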

Desktop search/Full text index

DocFetcher

Allows you to index select directories and search the contents of files on your computer. Great for ad hoc indexing your documentary research, your ebook library, etc.

FSearch

A fast and lightweight file search utility based on GTK+3. Similar to the (Windows only) Everything Search Engine, it provides instant (as you type) results, with RegEx and filter support among other features.

AngrySearch

A fast file search utility based on Qt5 that attempts to provide a Linux version of the Everything Search Engine available for Windows.

Recoll

Recoll finds keywords inside documents as well as file names (uses indexing).

Searchmonkey

Allows users to search for file names and contents using powerful regular expressions.

Regexxer is a GUI search/replace tool featuring Perl-style regular expressions.

Gnome Shell Desktop Environment comes with a configurable built-in Tracker (Search), which allows searching various types of content, including “full-text search”. If you don’t use it, you might as well disable it, to prevent full indexing.

For simple, lightweight desktop search, you can use Gnome search tool or Catfish.

Of course, there are also powerful command-line utilities for searching files (find), text searching for lines matching a regular expression (grep), etc.
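The combination of find and grep (walk a directory tree, then keep the lines that match a regular expression) can also be sketched in Python, which is handy when the results need further processing. The function name `grep_tree` and its default `*.txt` filter are hypothetical choices for this example.

```python
import re
from pathlib import Path


def grep_tree(root, pattern, glob="*.txt"):
    """Yield (path, line_number, line) for lines under root matching pattern."""
    regex = re.compile(pattern)
    for path in sorted(Path(root).rglob(glob)):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going on files that are not valid UTF-8.
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path, number, line.rstrip("\n")
```

For instance, `grep_tree("projects", r"\b\d{4}\b")` (folder name made up) would list every line containing a standalone four-digit number, such as a year.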

File rename utilities

Nautilus (Files) file manager comes bundled with a bulk renamer (just select multiple files and press F2).

Thunar file manager also includes an excellent Bulk Rename utility.
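When a rename pattern needs to be repeated regularly, a small script can do the same job as these graphical renamers. The Python sketch below first builds a plan so it can be reviewed before anything is touched; the function names and the `_EN`/`_FR` example are invented for illustration.

```python
from pathlib import Path


def plan_renames(folder, old, new):
    """Return (source, target) pairs replacing `old` with `new` in file names."""
    plan = []
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and old in path.name:
            plan.append((path, path.with_name(path.name.replace(old, new))))
    return plan


def apply_renames(plan):
    """Carry out a plan, refusing to overwrite files that already exist."""
    for source, target in plan:
        if target.exists():
            raise FileExistsError(f"refusing to overwrite {target}")
        source.rename(target)
```

Printing the result of `plan_renames("delivery", "_EN", "_FR")` (folder name made up) before calling `apply_renames` gives a dry run of the changes.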

Text editors

Gedit and Kate are two of the nice default text editors available.

Geany is a great lightweight text editor and Integrated Development Environment (IDE).

Text editors with even richer functions include GitHub’s Atom, Sublime Text (free/paid) and Brackets.

For historical reasons, I’ll also include two of the oldest editors, with some hardcore fans: Vim and GNU Emacs (see also Editor wars).

Productivity tools selection

AutoKey (Py3)

A Python 3 port of AutoKey, the desktop automation utility for Linux and X11. It allows you to manage a collection of scripts, and assign abbreviations and hotkeys to these scripts allowing you to execute them on demand in whatever program you are using. It can also be used as a text expander, where you store phrases (snippets of text) to be reused across various applications by typing an abbreviation or a keyboard shortcut.

Wiki, OldAutokey page and user group.
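The text-expander idea described above boils down to a lookup table from abbreviations to phrases. The Python sketch below shows that logic on a plain string; it does not use AutoKey's scripting API, and the abbreviations and phrases are hypothetical.

```python
import re

# Hypothetical abbreviations, for illustration only.
SNIPPETS = {
    ";sig": "Best regards,\nA. Translator",
    ";tm": "translation memory",
}


def expand(text, snippets=SNIPPETS):
    """Replace each abbreviation found in text with its stored phrase."""
    if not snippets:
        return text
    # Try longer abbreviations first so ";tmx" would win over ";tm".
    keys = sorted(snippets, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    return pattern.sub(lambda match: snippets[match.group(0)], text)


print(expand("Please update the ;tm."))
```

Starting abbreviations with an unusual character such as `;` reduces the risk of expanding ordinary words by accident.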

Reduce eye-strain

Utilities that adjust the color temperature of your display(s) according to the position of the sun.

Redshift

Also available as a Gnome shell extension.

f.lux

Note: The GNOME desktop environment also includes an integrated Night Light feature under Settings > Displays.

Clipboard managers

GPaste

Also available as a Gnome shell extension.

Parcellite

CopyQ

Klipper

Screenshots

Beyond the standard screenshot capabilities of the desktop environment (Gnome, KDE, etc.), there are separate apps that take it one step further:

Shutter

Lightscreen

Flameshot

Screen recording/Screencasting

SimpleScreenRecorder Excellent and easy-to-use screen recorder

Open Broadcaster Software Full-featured cross-platform screen recording and live streaming software.

recordMyDesktop

Peek animated GIF Screen Recorder

Time/Project tracking

Project Hamster

TimeCamp

RescueTime

Toggl

Pomodoro timers

Gnome Pomodoro

PomoDoneApp

Pomello

Unit conversion

ConvertAll

On-screen keyboard

Onboard is an excellent on-screen (virtual) keyboard. Handy if you wish to remember how to type less frequent characters for your language.

Running Windows applications

Natively on WINE

Wine (an acronym for “Wine Is Not an Emulator”) is a free and open-source compatibility layer that aims to allow computer programs developed for Microsoft Windows to run on Unix-like operating systems. Instead of simulating internal Windows logic like a virtual machine or emulator, Wine translates Windows API calls into POSIX calls on-the-fly, eliminating the performance and memory penalties of other methods and allowing you to cleanly integrate Windows applications into your desktop.

Running Windows software on Linux can sometimes be a complex endeavor. However, many Windows programs can be installed and run via Wine.

To see the support status of a specific application and version, you can consult Wine’s official website. Not all software is represented.

Since each Windows app has different requirements, it is usually recommended to create different Wine “prefixes” (separate directories, each holding its own Windows environment and settings).

Two applications aim to make it easy to install Windows programs in such separate locations:

Some examples of useful translation-related applications that run well on Linux via Wine (mostly PlayOnLinux and Crossover) are the following:

Many Windows programs that require specific versions of the .NET framework can be installed successfully after installing these first. Tip: Trial versions can be easily reinstalled since you can create separate machines with a few clicks.
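Wine prefixes can also be managed by hand: Wine reads the `WINEPREFIX` environment variable to decide which prefix directory to use. The Python sketch below builds such an environment for one application; the base folder `~/wine-prefixes`, the application name and the function name are assumptions made for the example.

```python
import os
from pathlib import Path


def wine_environment(app_name, base_dir="~/wine-prefixes"):
    """Return a copy of the environment with WINEPREFIX set for one application."""
    prefix = Path(base_dir).expanduser() / app_name
    env = dict(os.environ)  # copy, so the current process is left untouched
    env["WINEPREFIX"] = str(prefix)
    return env


# Example (not executed here): open Wine's configuration tool for a prefix
# dedicated to a single program.
# import subprocess
# subprocess.run(["winecfg"], env=wine_environment("infix"), check=True)
```

Keeping one prefix per program means that a setting or library needed by one application cannot break another.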

Through a Windows Virtual Machine (VM)

The recommended free and easy way to install a Windows virtual machine and run it inside GNU/Linux is through a cross-platform application called VirtualBox.

With VirtualBox (provided you also install the Extension Pack and Guest Additions), you can run Windows in fullscreen mode, use USB and other devices, share folders or the clipboard between the two systems (the Linux host and the Windows guest), take advantage of the Internet connection, and more. There’s no need to shut down and restart the Windows VM: you can simply save the VM state and resume it quickly at any time.
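Saving and resuming a VM can also be done from a script through VirtualBox's `VBoxManage` command-line tool. The Python sketch below only builds the argument lists for `VBoxManage controlvm … savestate` and `VBoxManage startvm …`; the helper names and the VM name "Windows 10" are hypothetical.

```python
def savestate_command(vm_name):
    """Arguments that save a running VM's state to disk and stop it."""
    return ["VBoxManage", "controlvm", vm_name, "savestate"]


def start_command(vm_name, headless=False):
    """Arguments that start a VM, resuming from a saved state if one exists."""
    vm_type = "headless" if headless else "gui"
    return ["VBoxManage", "startvm", vm_name, "--type", vm_type]


# Example (not executed here), assuming a VM registered as "Windows 10":
# import subprocess
# subprocess.run(savestate_command("Windows 10"), check=True)
# subprocess.run(start_command("Windows 10"), check=True)
```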

This means you can run (almost) any Windows software while staying on Linux, including SDL Trados, memoQ, Deja Vu, Transit, ABBYY FineReader, Adobe Photoshop, InDesign, Illustrator, Premiere, etc. Working within a VM can be slightly inconvenient in the long run, so running such Windows applications is better left for occasional use only. Other CAT tools should be favored for primary use, but it is nice to know that you can still launch such Windows apps and utilities should your workflow require it.

Running a VM requires assigning some of your computer’s RAM to Windows, so it helps to have plenty to begin with: 8 GB is a sensible minimum, and more is better.

Other virtualization applications worth mentioning:

Paid: Parallels Free/Paid: VMware Free: QEMU

Via Dual Boot/On a separate machine

Of course, you can install GNU/Linux alongside Windows on your PC or on a separate machine, but that defeats the purpose of using GNU/Linux as your main OS for professional translation, without the need of rebooting and using Windows.

And that concludes this wiki.

Happy translating!

Updates

Feedback

You can send me feedback via a ProZ message. Please start your subject with the name of the document: “TranslateOnLinux”.