TranslateOnLinux

It is entirely possible to work as a professional translator while running a GNU/Linux distribution as your chosen operating system.

This wiki aims to provide a list of desktop and web solutions for linguists who use a GNU/Linux OS for their work. The listed software should provide a comprehensive toolbox to fit most translation needs.

Inspiration for this list: the website LinuxForTranslators.com by Marc Prior, and the now defunct tuxtrans, a GNU/Linux distribution specifically targeted at translators by Peter Sandrini, along with its list of installed software.

For a more general overview, head over to LinuxForTranslators.com. For discussing any GNU/Linux-related topics in a professional translation/localization context, consider subscribing to the LinuxForTranslators@groups.io mailing list.

Curated by Jean Dimitriadis (EN-FR/EL-FR translator).

CAT TOOLS

Some popular computer-aided translation (CAT) tools, such as Trados, memoQ or Déjà Vu, are only Windows-compatible, but they are far from being the only options. There are some powerful translation tools that can run on GNU/Linux. It is worth noting that almost all of them can handle (albeit with some limitations) Trados SDLXLIFF files and/or SDLPPX packages, which are often sent out by translation agencies.

These include the following:

OFFLINE CAT TOOLS

OmegaT

OmegaT is a free-libre/open source translation memory application written in Java.

Don’t let its simple interface fool you: OmegaT boasts excellent features, some of which (like team projects) are mostly found in expensive proprietary CAT tools.

Some highlights include:

Platforms: GNU/Linux, Windows, macOS.
Supported formats: More than 50 formats (with the Okapi plugin), including Microsoft Word, Excel and PowerPoint, LibreOffice, HTML, TTX and SDLXLIFF (Trados), TXML (Wordfast Pro), IDML (InDesign) and PDF (plain text and Iceni Infix export).
Support and manual: Documentation and manual, support page.
License: Free-Libre/Open Source Software (GPLv2).
Cost: Gratis (donations welcome).
Reviews: OmegaT, ProZ, G2.

CafeTran Espresso

CafeTran Espresso is a feature-rich CAT tool that is fun to use, built from the ground up by a developer cum translator, which shows in many ways.

Here are some highlights:

Platforms: GNU/Linux, Windows, macOS.
Supported formats: file formats.
Support: Solutions (Knowledge Base), official forum and support, reference documents.
License: Closed source.
Demo version: no time limit, TM files up to 1000 TUs in total and glossaries no larger than 500 terms, up to 50 segments of MT (other than MyMemory).
Cost: Licensing options. Also, full access for ProZ.com Plus members (currently extended to paying ProZ members). Also available through the ProZ Translator Group Buying (TGB) campaign.
Reviews: ProZ.

Memsource Editor for Desktop

Memsource is an easy, streamlined and powerful online CAT tool, which also offers a desktop editor (compatible with Linux and written in Qt), hence its inclusion in this section. It provides a complete translation environment that allows you to set up your own workflow, and share projects with your vendors.

Among other things, it offers extensive file format support, with interesting file filter and pre-translation options, a host of MT engines and MT-related functions (it is one of the tools of choice for MT post-editing), and a thorough support website. It also sports a useful in-context preview feature and can handle large remote TMs and termbases. It is used by some agencies that offer project-based licenses, as it allows for remote management and distribution of translation jobs while keeping track of translators’ progress in real time. The web version is reportedly considerably slower than the desktop version. There is a port for Android and iOS.

Online/Offline tool.
Platforms: GNU/Linux, Windows, macOS, Android, iOS.
Supported formats: file formats.
Support and manual: Support, Memsource Editor for Desktop and for Web documentation.
License: Closed source.
Demo: 30-day trial / Free Personal edition (maximum two files at a time).
Cost: editions and pricing.
Reviews: ProZ, G2Crowd, Capterra.

WordFast Pro

Among the major commercial TM tools, WordFast Pro is the only one that is truly cross-platform. It is available as a standalone application (in contrast to WordFast Classic, which is Word-based).

WordFast Pro, whose current interface and underlying technology were first introduced in version 4.0, is now at version 6.6.x.

The legacy WordFast Pro 3 was last updated to version 3.4.14 in August 2018 and is no longer in development. It still works in many systems and some users continue to use it.

While these tools have different translation interfaces, both share a number of features, such as multiple supported file formats (and file filter options), unlimited TM and Glossary access (as well as remote access, via WordFast Server), Machine Translation integration, advanced time-saving features and real-time quality assurance (Transcheck).

In addition, the current WFP offers a WYSIWYG interface for formatting tags, Target-only Live Preview, Segment Filtering, Multilingual Translation Projects, Export and Import Translation Packages (including Trados files and packages) and the ability to Chain (virtually merge) files for translation consistency.

The demo version has some limitations, but it is not time-based, making it useful to keep around, especially for round trip scenarios, since it supports many file formats and offers nice filter options.

Platforms: GNU/Linux, Windows, macOS.
Supported formats: WFP 3, WFP 6 (check specifications).
Support and manual: Wordfast support page, support wiki and manual; videos and training courses are available as well.
License: Closed source.
Demo: Wordfast offers a demo version that runs without a paid license for translation memories (TM) of up to 500 translation units, making it possible to use Wordfast on actual translation projects before you decide to purchase. You can also register for a 30-day, fully functional trial license.
Cost: pricing. Also available through the ProZ Translator Group Buying (TGB) campaign.
Reviews: ProZ, Capterra.

Swordfish Translation Editor

Swordfish IV is an advanced cross-platform CAT tool based on open standards, designed for demanding professional translators.

It supports exchanging TMX (Translation Memory eXchange) files and uses the GlossML glossary format. It includes a very fast internal database server and integrated support for RemoteTM Web Server. You can also use third-party database engines like Oracle 10g or MySQL 5.x for storing TM and terminology data.

Swordfish supports a host of file formats. It is also compatible with other CAT tools, since it supports XLIFF (including Trados Studio SDLXLIFF files and SDLPPX packages, Wordfast Pro TXLF files and memoQ MQXLIFF files), Uncleaned RTF, TTX, TTX Exchange and TXML (Wordfast Pro and GlobalLink).

Other features include In-Context Exact Matches, full interface customization, segment filtering, comfortable proofreading and advanced TM/MT engines (including Azure Translator Text (Microsoft), DeepL, Google Cloud Translation, MyMemory and Yandex).

Strongly committed to open source software, the developer, Maxprograms, publishes all its tools as open source and free to use. The various localization software and utilities include Stingray (document aligner), TMXEditor, Fluenta DITA Translation Manager, as well as Anchovy (Glossary manager and term extractor), SRXEditor (Segmentation Rules editor), OpenXLIFF filters (which can be used with other CAT tools), XLIFF Manager and TMXValidator.

Platforms: GNU/Linux, Windows, macOS.
Supported formats: General documentation, XML formats and Software development.
Support and manual: User guide (PDF and web), Getting started, Groups.io support group (the group is intended for supporting all tools published by Maxprograms, not only Swordfish).
License: Open source (Eclipse Public License v1.0).
Demo version: 30-day trial.
Cost: Free to use (requires building the binaries from source code; see this ProZ thread for additional compilation instructions, also a video), with subscriptions for support and binaries available on the online store.
Reviews: ProZ.

Fluency Now

Fluency Now is an easy-to-use, full-featured CAT tool suite that’s affordable for freelancers and organizations alike.

Some of its product features include:

Platforms: GNU/Linux, Windows, macOS.
Supported formats: Benefits & Features.
Support and manual: Fluency 101, video tutorials.
License: Closed source.
Demo version: 15-day trial.
Cost: Monthly and yearly subscription.
Reviews: ProZ, Capterra.

BasicCAT

BasicCAT is an open source and free computer-aided translation tool, which aims at providing a simple and useful tool for translators. The name is BasicCAT, because of its simplicity and its programming language, Basic.

It has the following features:

Note: Since it relies on JavaFX, which has been removed from more recent Java runtimes, you can use a Java 8 version to make it run without further configuration (java -jar BasicCAT.jar).

Platforms: GNU/Linux, Windows, macOS.
Supported formats: TXT, IDML, XLIFF, Gettext PO (other file formats need to be converted to XLIFF or PO with Okapi or other tools).
Support and documentation: Documentation. Support page (includes links to GitHub to open a public support ticket).
License: Open source.
Cost: Gratis (donations welcome).

Heartsome Translation Studio

Heartsome is a discontinued commercial CAT tool that has since been open-sourced and is offered gratis.

The software itself is getting old (it has not been developed for more than five years), but it sports some nice capabilities that make it useful to keep around.

It also offers a separate Heartsome TMX Editor for editing TMX memories.

Platforms: GNU/Linux, Windows, macOS.
Supported formats: see GitHub page.
Support: GitHub page has links for user manual, quick start guide and file conversion guide.
License: Open source (GPLv2).
Cost: Gratis.
Reviews: ProZ.

WordFast Classic

WordFast Classic is a CAT tool that operates entirely inside of Microsoft Word. Since some versions of MS Word can be installed and run via the Wine compatibility layer (see related section below), WordFast Classic can be used in GNU/Linux.

Features include an intuitive interface, user-defined macros, integration with machine translation and external dictionaries and real-time QA.

Platforms: GNU/Linux, Windows, macOS.
Supported formats: Word.
Support: support wiki, user manual.
License: Closed source.
Demo: Demo version with no time limit, up to 1000 translation units, limited MT.
Cost: Price.
Reviews: ProZ.

Other

Virtaal: Virtaal is an easy and lightweight tool that opens Gettext (.po, .mo), XLIFF, TMX and TBX files, among other formats. Even if its nice feature set may not fully appeal to professional translators, it represents a very good utility to use as a quick viewer for the above-mentioned bilingual file types. It should be noted that it hasn’t seen active development since 2017 and is in maintenance mode; some distros (like Fedora) do not ship it.

Lokalize: Lokalize is a computer-aided translation (CAT) tool, a full-featured GUI application for translators written from scratch for the KDE 4 framework and later ported to KDE Frameworks 5. Aside from basic editing of PO files with nifty auxiliary details, it integrates support for glossaries, translation memory, diff modes for QA, project management, pology/posieve verification, etc. It is mostly interesting for free software localization. It is tightly integrated with KDE localization together with kdesvn, which makes it ideal for localizing KDE software, while also serving as a full-blown localization tool.

QtLinguist: Developed by The Qt Company primarily to localize Qt applications, this tool is able to translate TS, PO and XLIFF files. Although it comes from a commercial vendor, it is open source and generally available on Linux via the package qttools5/qt5-qttools/qt5tools (the name depends on the distribution) and via website download. Installing it from your distribution is recommended.

DGT-OmegaT is an active fork of OmegaT (currently, the 3.4. branch), developed by the Directorate-General for Translation of the European Commission, adding some specific features built for their own needs. Some features, like Tagwipe, have been integrated into the original OmegaT software. Documentation. Download.

Anaphraseus (WordFast Classic-like CAT tool available as an OpenOffice/LibreOffice extension)

Esperantilo (no new versions since 2012)

OmegaT+ (OmegaT fork, no new version since 2012)

ONLINE CAT TOOLS

While problematic from a privacy and freedom point of view, the advent of Cloud/web-based CAT tools has added several GNU/Linux compatible solutions to the existing arsenal.

Memsource Editor for Web

Mentioned along with Memsource Editor for Desktop in the above section. Please note the desktop version still requires some steps to be completed in the online version.

MateCat

MateCat is a free, easy-to-use, enterprise-level online tool developed by Translated, designed to make translation, post-editing and outsourcing easy and to provide a complete set of features to manage and monitor translation projects.

MateCat supports a host of file formats and various MT engines (including Modern MT and free access to MyMemory MT service/Public TM). It also sports a nifty Aligner.

Given its feature-set, it can also be used in a round trip scenario (as an additional filter to handle file types unsupported by other CAT tools). Read more about this here.

Supported formats: 70 formats.
Support and manual: Documentation and support.
License: Software based on mostly open source components.
Cost: Free (registration recommended).
Reviews: ProZ, G2.

The completely open source version of the webserver can also be installed offline.

Warning: By default, MateCat stores your translated segments in the public MyMemory TM. To make sure this does not happen unintentionally, create a private TM resource:
1. On the project creation page, click Settings (alternatively, in the TM and glossary field, expand the drop-down menu and select Create resource).
2. In the dialog that opens, click the + New resource button.
3. Optionally, give the TM a name.
4. Hit Confirm.
You will see that the “MyMemory: Collaborative translation memory” resource is still enabled for Lookup, but is no longer set to be updated. That way, translated segments will only be stored in your private resources.

Warning: Matecat only officially supports Chrome/Chromium and will go out of its way to block your access if you’re using a different browser. It is possible to use it with other browsers such as Firefox by switching your user agent, but speed/performance might be slightly worse.

Smartcat

Smartcat is an intuitive, feature-rich computer-assisted translation web app. It provides a full set of translation automation technologies for companies and translators and makes it easy for them to connect and collaborate.

It integrates directly with PayPal and Payoneer, as well as local and international wire transfer. A Marketplace section is available for seeking translation jobs (though often low-paid), which can then lead to direct referrals.

Supported formats: file formats.
Support and manual: Documentation and support.
License: Closed source.
Cost: Mostly free (registration needed), billable OCR and MT services.
Reviews: ProZ, G2, Capterra.

Wordfast Anywhere

WordFast Anywhere is a free and complete online CAT tool, one of the oldest of its kind, allowing translators to work from anywhere, provided they have an Internet connection and a web browser.

Supported formats: txml, txt, doc, docx, xls, xlsx, htm, html, inx, indml, pdf, jsp, asp, odt, tif, rtf, ppt, pptx, mif, ttx, txlf, xlf, xliff, sdlxliff.
Support and manual: Wiki, manual.
License: Closed source.
Cost: Free (registration needed).
Reviews: ProZ.

XTM Cloud

XTM International develops XTM, a complete online Translation Management System (TMS) with an integrated Computer Aided Translation (CAT) tool, targeted at enterprises, LSPs and freelance translators. The centrally shared TM, terminology, workflow and translator workbench are all accessed via a browser. XTM is cost-effective and easy to use, yet it is also a scalable system built for collaboration, and it incorporates a comprehensive API.

Supported formats: file formats.
Support and manual: XTM Academy provides access to knowledge base articles, webinars, tutorials, online documentation, release notes, and support.
License: Closed source.
Demo: 30-day free trial.
Cost: Monthly, quarterly or annual subscription pricing depending on the Account type (Freelance, Group, Enterprise), Duration, Number of users and Words/month.
Reviews: ProZ, Capterra, G2.

Wordbee

Wordbee is an online Translation Project Management platform and Translation Editor. It combines a complete CAT tool with price management, customer management, invoicing and linguistic features.

Supported formats: Supported file types.
Support and webinars: Documentation, Support, Webinars.
License: Closed source.
Demo: 14-day free trial.
Cost: Monthly or annual subscription.
Reviews: Capterra, G2.

Termsoup

Termsoup is a cloud-based computer-assisted translation (CAT) tool designed to boost productivity. This no-frills, user-friendly platform is ideal for literary translators (among others) who need streamlined features for long-form projects. The platform makes collaboration simple with real-time editing and auto-save features so translators and colleagues can work together.

It sports various original features and an interesting UI/UX.

Supported formats: doc, docx, xls, xlsx, ppt, pptx, txt, html, xml, dtd, json, csv, yaml, srt, wix, yml, odt, ods, odp, po, xlf, xliff, sdlxliff, ttx, mif, idml, icml, and dita.
Support and manual: User guide.
License: Closed source.
Cost: Subscription-based, free 10-day trial.
Reviews: Capterra.

Cattitude

Cattitude is a new kid on the block, with an attitude and a vision, geared towards leveraging adaptive Machine Translation (along with intelligent matching and mining of databases) in a productive web-based translation environment. Made by translators for translators, it offers features such as the following:

Supported formats: Currently direct support for DOCX, XLSX, MQXLIFF (memoQ), MXLIFF (Memsource), PO Edit, SDLPPX, SDLXLIFF, SRT, TXT, XLF (Wordbee) and JSON.
Support and manual: User guide provided after registration.
License: Closed source.
Cost: Subscription-based, with a free 30-day trial. It is offered in three versions: (a) with no MT engine, (b) with an adaptive ModernMT base engine and/or DeepL (requires a separate subscription to these MT services), (c) with a fully customized MT engine.

Other

Not to mention the different online browser-based continuous localization platforms which can be used on GNU/Linux:

Translation related tasks and tools

Alignment

Document alignment is the process of matching source language segments with target language segments and creating a reusable bilingual document (commonly a translation memory). It is a way of making use of existing translation materials, especially those not translated via a CAT tool.

Stingray Document Aligner (offline, open source and free to use, easy to build, with paid subscription for support and binaries) is a cross-platform document aligner designed to assist professional translators in the production of translation memories from existing translated material. It supports all file formats handled by the OpenXLIFF filter.

BasicCAT Aligner (free, offline) is a GUI program that integrates the automatic alignment tools LF Aligner and Bleualign and lets you work interactively on aligning documents. It is simple and efficient to use (press Enter to split segments, Delete to merge segments, and the arrow keys to navigate between segments) and offers actions such as removing tags and exporting to several file formats. It can also import TMX files.

Note: Since it relies on JavaFX, which has been removed from more recent Java runtimes, you can use a Java 8 version to make it run without further configuration (java -jar Aligner.jar).

LF Aligner (free, offline) is an excellent alignment tool which relies on Hunalign for automatic sentence pairing. Input: txt, doc, docx, rtf, pdf, html. Output: tab delimited txt, tmx, and xls. With web features. The Linux version opens an interactive, easy-to-use terminal interface. The alignment review step requires editing the produced file in a spreadsheet program. The Windows version does not need installation and can run on Wine. It offers a GUI and, at the alignment review/editing step, opens an easy interface for splitting and merging source/target segments and fixing any alignment errors.

WordFast Anywhere (free, online, registration needed) also integrates an autoaligner. Upload the source and target files, launch the alignment, and get a ZIP file with a TMX, XLS and TXT of automatically aligned bitext.

MateCat aligner (free, online) is a TMX creator whose design is very similar to MateCat translation editor and can be accessed via the Aligner tab from the MateCat homepage. It supports 69 file formats. Currently in beta.

WordFast Aligner (free, online). Paste the source and target text and align.

YouAlign.com (free, online, registration needed, 1MB max file size) is an alignment tool based on AlignFactory, a powerful automated document alignment tool with a web crawler.

PlusTools is a free MS Word add-in that can handle multiple tasks, including aligning documents.

TM-Town (online, registration needed) has an alignment tool that can help you convert a source and target document into a translation memory (TM) file.

In addition, many other CAT tools, such as WordFast Pro, OmegaT, CafeTran Espresso, SmartCat and Memsource offer an alignment feature.
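To make the end product of alignment more concrete, here is a minimal Python sketch of how already-aligned segment pairs could be written out as a TMX translation memory. It does not perform the alignment itself (tools such as those listed above do that); the function name `build_tmx`, the `creationtool` value and the sample segments are assumptions made up for this illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the standard xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang, tgt_lang):
    """Return a TMX 1.4 document (as a string) for aligned (source, target) pairs.

    Hypothetical helper for illustration only; it assumes the pairs have
    already been aligned and reviewed.
    """
    tmx = ET.Element("tmx", version="1.4")
    # Header attributes required by TMX 1.4; the values are placeholders.
    ET.SubElement(tmx, "header", {
        "creationtool": "alignment-sketch",
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain text",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        # One translation unit (tu) per aligned pair, one variant (tuv) per language.
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


if __name__ == "__main__":
    aligned = [("Hello", "Bonjour"), ("Thank you", "Merci")]
    print(build_tmx(aligned, "en", "fr"))
```

The resulting string can be saved with a .tmx extension and should be importable as a translation memory in most CAT tools, although individual tools may expect additional header details.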

Handling tags

Word documents (especially those coming from OCR’d files or PDF converted files) are often strewn with codes which produce unnecessary tags when imported in a CAT tool. This tagged information shows up in the translation grid as spurious codes{1}around{2}, or even in the mid{3}dle of, words, making sentences difficult to read and translate and generally negating many of the productivity benefits of the program. There are ways to get around this.

CodeZapper (€20 licence) is a set of Word VBA macros designed to “clean up” Word files before being imported into a standalone translation environment so that the files have fewer tags.

TransTools - Document Cleaner (freeware): part of the TransTools set of MS Office add-ins, Document Cleaner is a collection of tools for the preparation of badly formatted documents for translation. It features a Tag cleaner tool, which attempts to strip unnecessary tags.

CafeTran Espresso CAT tool offers a special filter to handle MS Word documents after OCR, which clears the source text of unnecessary formatting tags. It can prove useful anytime an MS Word document produces too many unnecessary tags.

OmegaT CAT tool has a Remove tags option, which strips ALL tags from the imported document(s). The latest version now includes the TagWipe utility/Groovy script for clearing excessive tags on Word documents. It offers a GUI that lets the user select different options with which it should be run.
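As a rough illustration of what stripping all tags from a segment amounts to, the Python sketch below removes numbered placeholder tags written as {1}, {2}, and so on, following the notation used in the example at the top of this section. This is an assumption-laden simplification: the tools above work on the document's underlying formatting rather than on placeholder text, and the function name `strip_tags` is hypothetical.

```python
import re

# Numbered placeholder tags such as {1} or {12}; the notation is an assumption
# borrowed from the example above, as each CAT tool has its own tag syntax.
TAG_PATTERN = re.compile(r"\{\d+\}")


def strip_tags(segment):
    """Remove every numbered placeholder tag from a segment (hypothetical helper)."""
    return TAG_PATTERN.sub("", segment)


if __name__ == "__main__":
    tagged = "The quick {1}brown{2} fox jumps over the la{3}zy dog."
    print(strip_tags(tagged))
```

Removing every tag also discards meaningful formatting (bold, links, footnote references), which is why dedicated cleaners try to remove only the redundant ones.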

Translation Memory (TMX) Editing - Maintenance

Heartsome TMX Editor (free, open source)

A powerful TM maintenance tool for all CAT software. It provides many useful and practical functions besides common editing features, allowing you to perform TM maintenance tasks easily, simply and all with one tool.

TMXEditor (free, open source)

Maxprograms TMXEditor is a cross-platform open source desktop application designed for editing TMX. It is able to handle very large files with millions of segments.

Okapi Olifant (free, open source)

An excellent free Translation Memory Editor. Caveat: its current version is based on the old (2009) Windows-only .NET version of Okapi framework (a new Java-based version is in the works). Upside: It can be installed via Wine (or PlayOnLinux/CrossOver), provided that you have .NET framework 2.0 or later installed (on your Wine bottle/prefix). Online documentation.

CafeTran Espresso (paid)

CafeTran Espresso CAT tool offers a powerful workflow for performing various editing and maintenance tasks on TMX Translation Memories. The Demo limits TM editing to 1,000 TUs.

SuperTMXMerge (free, open source)

Diff tool for comparing and merging TMX translation memories.

Goldpan TMX/TBX Editor (free, Windows VM only)

An intuitive and multifunctional TMX/TBX file editor from Logrus Global Software Development Team.

Segmentation - SRX editors

The SRX (Segmentation Rules eXchange) format is an open standard to save segmentation rules in a file so they can be used between different tools.

Examples of tools that support SRX files: OmegaT, CafeTran Espresso, Swordfish, WordFast Pro, and Memsource.

Maxprograms’ SRXEditor is an open source cross-platform editor of segmentation rules, designed to use Segmentation Rules eXchange (SRX) 2.0. You can use SRXEditor to create new SRX files and edit existing ones. It also can be used for testing segmentation rules to ensure that they break text as expected. It is also currently included in Swordfish III installers.

Okapi Ratel is a free-libre/open source cross-platform application that helps create and maintain segmentation rules in the SRX format. Such rules are used to break down translatable text into more meaningful parts. Ratel is part of the Okapi Framework, and can be used either as a standalone utility or from within Okapi Rainbow.

Downloadable SRX rules: OmegaT’s default rules, LanguageTool’s segmentation rules.
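For readers unfamiliar with how segmentation rules behave, the following Python sketch mimics the general SRX idea: each rule has a "before break" and an "after break" pattern plus a flag saying whether it forces or forbids a break, and the first rule that matches at a given position decides. The two rules and the function name `segment` are invented for this illustration and are not taken from any tool's actual rule set; the sketch also works on plain regular expressions rather than on SRX files.

```python
import re

# Hypothetical rules: (is_break, before_break_pattern, after_break_pattern).
# Exceptions (no-break rules) come first so that they take precedence.
RULES = [
    (False, r"\b(?:Mr|Mrs|Dr|etc)\.", r"\s"),  # no break after common abbreviations
    (True, r"[.!?]", r"\s+"),                  # break after sentence-final punctuation
]


def segment(text, rules=RULES):
    """Split text into segments using ordered break/no-break rules."""
    compiled = [
        (is_break, re.compile("(?:" + before + r")\Z"), re.compile(after))
        for is_break, before, after in rules
    ]
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in compiled:
            # A rule applies when the text ends with `before` right at this
            # position and continues with `after`.
            if before.search(text[:pos]) and after.match(text, pos):
                if is_break:
                    segments.append(text[start:pos])
                    start = pos
                break  # the first matching rule decides; skip the rest
    segments.append(text[start:])
    return [s.strip() for s in segments if s.strip()]


if __name__ == "__main__":
    for seg in segment("Dr. Smith arrived. He was late! Was he?"):
        print(seg)
```

Without the first rule, the text would be wrongly split after "Dr.", which is the kind of problem that custom segmentation rules are meant to solve.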

Term extraction

Monolingual term extraction attempts to analyze a text or corpus in order to identify candidate terms, while bilingual term extraction analyses existing source texts along with their translations in an attempt to identify potential terms and their equivalents.

Monolingual term extraction

TermoStat Web

Free, online. Languages: English, French, Italian, Portuguese, Spanish. The results are really good. Uses linguistic and statistical methods while taking the potential terms’ structures and relative frequencies into account in the analysis corpus. TermoStat is free, but users must register. Outputs a Tab delimited TXT (which can be renamed to and opened as a CSV file in LibreOffice).

Tilde terminology (powered by Taas)

Free/Premium, online, 25 languages. Various options to choose from. By default, it uses TWSC (Tilde wrapper system for CollTerm) based on linguistic analysis enriched by statistical features. Extraction time somewhat long, but results quite accurate. By specifying a subject domain, some term candidates are matched with translations from a variety of sources for target translation lookup.

The Sketch Engine

Subscription-based, online. Several supported languages. Monolingual and bilingual term extraction supported. Also offers a simpler OneClick term extractor for single word and multiword candidates.

Okapi Framework - Rainbow

Free, offline. Its term extraction utility offers flexible, configurable statistical analysis. Supports all languages which use scripts with space-delimited words.
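To give an idea of what a purely statistical approach to monolingual term extraction involves, here is a deliberately naive Python sketch that counts word sequences (n-grams) and keeps the frequent ones that do not start or end with a stop word. It is not how any of the tools above work internally; the stop-word list, the thresholds and the function name `candidate_terms` are assumptions for illustration, and like Rainbow's utility it only makes sense for space-delimited scripts.

```python
import re
from collections import Counter

# Tiny English stop-word list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "and", "or", "in", "on", "to",
             "for", "is", "are", "with", "by", "it"}


def candidate_terms(text, max_len=3, min_freq=2):
    """Return (term, frequency) pairs for frequent n-grams of up to max_len words."""
    # Punctuation is ignored, so n-grams may span sentence boundaries.
    words = re.findall(r"\w+", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            # Skip candidates that begin or end with a stop word.
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            counts[" ".join(gram)] += 1
    return [(term, freq) for term, freq in counts.most_common() if freq >= min_freq]


if __name__ == "__main__":
    sample = ("The translation memory stores segments. "
              "A translation memory is reused by the CAT tool. "
              "The CAT tool queries the translation memory.")
    for term, freq in candidate_terms(sample):
        print(freq, term)
```

Real extractors add linguistic filtering (part-of-speech patterns, lemmatization) and compare frequencies against a reference corpus, which is why their results tend to be far cleaner than raw counts.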

Bilingual term extraction

The Sketch Engine

Subscription-based, online. Several supported languages. Monolingual and bilingual term extraction supported.

Anchovy

Anchovy is an offline multilingual cross-platform glossary editor and bilingual term extraction tool based on the open Glossary Markup Language (GlossML) format. It handles various import and export file formats.

Terminology management

File formats

Excel / CSV/TSV / Tab-delimited TXT

Some simple glossary formats such as Excel files, CSV/TSV (Comma-separated values/Tab-separated values) files and tab-delimited text files may be all you need for your terminology management and glossary exchange needs.

Excel and CSV/TSV files can simply be edited in office applications such as Microsoft Excel or LibreOffice Calc (which offers superior CSV/TSV handling compared to Excel). Ron’s Editor, an excellent dedicated CSV editor with professional features (Free/Lite and Pro versions available, with a 30-day trial), is Windows-only, but can be used with Wine after installing .NET Framework 4.8.

Tab-delimited TXT files can be edited in any text editor, although it is recommended to simply rename them to .csv or .tsv, no other conversion needed.
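When a tool insists on genuinely comma-separated input rather than a renamed tab-delimited file, the conversion can be scripted. The Python sketch below uses the standard csv module to turn tab-delimited glossary text into comma-separated text, quoting any field that itself contains a comma. The function name `tsv_to_csv` and the sample entries are assumptions for illustration.

```python
import csv
import io


def tsv_to_csv(tsv_text):
    """Convert tab-delimited glossary text to comma-separated text (hypothetical helper)."""
    # QUOTE_NONE makes the reader treat quotation marks as ordinary characters,
    # since tab-delimited glossaries usually do not quote their fields.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    # Skip empty lines; the writer quotes fields that contain commas.
    writer.writerows(row for row in reader if row)
    return output.getvalue()


if __name__ == "__main__":
    glossary = "translation memory\tmémoire de traduction\nfile, bilingual\tfichier bilingue\n"
    print(tsv_to_csv(glossary), end="")
```

For files on disk, the same reader and writer can be used with open(..., newline="", encoding="utf-8") instead of in-memory strings.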

Most CAT tools support such files (and converting from one of these file formats to another is mostly trivial): OmegaT, CafeTran Espresso, Matecat, Memsource (Import/Export Excel), WordFast Pro (Tab delimited TXT), Fluency Now, etc.

The free Anchovy utility can import TMX, CSV and tab-delimited glossaries and export to GlossML (the format used by Swordfish), CSV, HTML, TMX, TBX and XML.

For conducting CAT-tool independent TM and glossary searches, TMLookup is an open-source tool designed to search (massive) bilingual and multilingual text databases (translation memories) and glossaries. For glossaries, it can import TXT and XLS files. It runs fine on Wine.

TBX

TBX, short for TermBase eXchange, is the international standard for representing and exchanging information about terms, words, and other lexical data. Here’s a list of tools with some TBX support: https://www.tbxinfo.net/tbx-support/

Several CAT tools can import/read TBX files: OmegaT, CafeTran Espresso, Memsource, WordFast Pro, Swordfish, etc.

Heartsome TMX editor can import and export to TBX (but not directly edit), among other formats.

Anchovy (free) can export to TBX.

Virtaal can edit/read TBX files.

XBench (see below) can import TBX files and convert them to tab-delimited TXT files.

Goldpan TMX/TBX Editor (installs but does not run correctly in Wine, needs a Windows VM) can edit and export/save TBX (and TMX) files.

Translate-Toolkit is a set of libraries and command line tools made by the creators of Pootle and Virtaal and used by many free software projects, like Lokalize. It comes with several file conversion utilities, including xls2csv and csv2tbx. It is useful for managing term bases in .xlsx, .csv and .tbx when no GUI tool is available (or for general scripting).

SDLTB

Trados Studio termbases (SDLTB) are written in a proprietary format, created by MultiTerm. Since Trados/MultiTerm are quite common, you might come across these files fairly often.

Some CAT tools support importing SDLTB files: CafeTran Espresso, Fluency Now and Memsource (partly).

Tools for converting SDLTB files: Trados Studio Resource Converter and WfConverter

Tools for converting to SDLTB (among other conversions): Glossary Converter (Windows VM, or Wine with MS Office)

Tools

XBench is a Terminology and QA tool.

It features an older freeware non-unicode version (2.9) and a yearly subscription-based version 3.x in active development. The program is Windows-only but runs well on Wine.

XBench supports a host of glossary, TM and bilingual formats. For Terminology, XBench can import/read Tab-delimited Text files, TBX, MultiTerm XML Glossaries, Wordfast Glossaries, etc.

It can also be used to convert to TMX and Tab-delimited text files.

Office software

An Office suite represents an important part of a translator’s or reviser’s toolkit and is often a critical step in the translation workflow.

Microsoft Office

Various versions of Microsoft Office are quite well supported via Wine and are installable using PlayOnLinux (free) or CrossOver (paid). See dedicated section below.

If needed, Microsoft Office can also be installed in a Windows VM (see related section).

Useful Microsoft Office add-ins:

LibreOffice

LibreOffice is an excellent Office suite in its own right. It can open and save MS Office documents, but the improved compatibility is not always perfect, which can be critical when it comes to delivering final documents and meeting client expectations. You may need to consider other options as well.

One neat feature is the ability to export hybrid PDFs, which are PDFs with the original .odt file embedded. By opening a hybrid PDF in LibreOffice Writer, the translator/proofreader can edit the original file while the PDF formatting is maintained. This feature is exclusive to LibreOffice.

Pair it with hunspell for spelling support for your language and the hyphen/hyphenation package for optimal use.

Feature/compatibility comparison with Microsoft Office.

Useful LibreOffice extensions:

Apache OpenOffice

Less developed and feature-rich than LibreOffice. Use LibreOffice instead.

WPS Office

Free for Linux users. It comes with a ribbon/tabbed bar that cannot be changed, so it is not very customizable, but it is still featureful. It boasts high MS Office compatibility, but it is unable to handle free formats like ODT. Its support for balloons is arguably better than that of Microsoft Word, as the tracked changes section can be scrolled. It lets you set your own keyboard shortcuts for specific Unicode characters, which is useful depending on your keyboard layout. If you run into any issues (e.g. theming), check the general troubleshooting page on the Arch Linux wiki.

SoftMaker Office

Paid software; you can subscribe to its Software-as-a-Service model (similar to Office 365) or purchase it once. 30-day free trial with FreeOffice. It boasts high MS Office compatibility, and the interface is quite feature-complete with a traditional look, although it lacks balloons.

Tip: Trial can be reset by removing the SoftMaker folder.

SoftMaker FreeOffice

The free version of SoftMaker Office.

OnlyOffice

It has good looks and potential, but it is far less featureful than other desktop suites like LibreOffice, SoftMaker Office and WPS Office. It works both offline and in the cloud, and can be integrated directly into a Nextcloud instance. It boasts 100% compatibility with MS Office formats (as it saves natively into these formats). It does not currently include a dynamic word count. See here.

Google Docs/Google Drive (online)

You can now directly edit, comment, and collaborate on Office files using Google Docs, Sheets, and Slides. Changes will be auto-saved to the file in Office format (help article). Google Docs can be set up for offline access.

With the Office Editing for Docs, Sheets & Slides Chrome extension, Microsoft Office files that you drag into Chrome, open in Gmail, Google Drive, and more, will be opened in Docs, Sheets, and Slides for viewing and editing.

Microsoft Office Online (online)

The online version of MS Office 365 is a stripped-down version of the original desktop suite and free to use.

Office file conversion

For quick and easy office file conversions, you can use command line utilities (directly in the terminal or integrated in scripts).

Abiword is a word processing program that can be used for various tasks, including CLI-based file conversions. For command line options, refer to its man page.

unoconv is a CLI utility for converting any document from and to any LibreOffice supported format. For command line options, refer to its man page.

soffice is the CLI version of the LibreOffice office suite. It can be used for launching LibreOffice GUI with specific options or in headless mode, as well as converting to various formats. For command line options, refer to its man page.
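As a minimal sketch of such a conversion, the function below wraps soffice in headless mode to batch-convert a folder of .docx files. The function name, the folder arguments and the default target format (pdf) are assumptions made for illustration; `--headless`, `--convert-to` and `--outdir` are LibreOffice's own options.

```bash
#!/usr/bin/env bash
# Usage: convert_docx_folder <input-dir> <output-dir> [target-format]
# Converts every .docx file in <input-dir> with LibreOffice in headless mode.
# The function name and arguments are illustrative, not part of LibreOffice.
convert_docx_folder() {
    local in_dir="$1" out_dir="$2" format="${3:-pdf}"
    local file
    mkdir -p "$out_dir"
    for file in "$in_dir"/*.docx; do
        # Skip the literal pattern when the folder holds no .docx files.
        [ -e "$file" ] || continue
        soffice --headless --convert-to "$format" --outdir "$out_dir" "$file"
    done
}
```

For example, `convert_docx_folder ~/jobs/source ~/jobs/pdf` would write one PDF per Word file, while passing `odt` as the third argument would produce OpenDocument files instead.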

A word on Microsoft fonts:

You will probably need to install Microsoft/Windows fonts on your Linux distribution to be able to work with the same fonts as other Microsoft Office users. These are mostly proprietary. You can install them on your GNU/Linux distribution via a package typically called ttf-mscorefonts-installer or msttcorefonts, which requires accepting an EULA. The package is the result of the now discontinued Core fonts for the web project and is available here. However, newer versions of the Microsoft core fonts have not been available for redistribution or commercial use since the project was discontinued, and ttf-mscorefonts-installer packages very old versions of these fonts (2012), so this might cause compatibility issues. Emphasis on might. It may therefore be a good idea to keep a copy of the original Microsoft fonts for personal use.

Copying the fonts folder from a Windows machine into the hidden .local/share/fonts folder of your home folder (creating it if it does not exist) is a distribution-independent solution. On legacy systems, the ~/.fonts folder can be used instead.
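The copy step can be sketched as a small function. This is only an illustration: the function name and its arguments are assumptions, and it relies on fc-cache (part of fontconfig) being present to refresh the font cache afterwards.

```bash
#!/usr/bin/env bash
# Usage: install_fonts <source-dir> [target-dir]
# Copies font files from a folder (e.g. a copy of the Windows Fonts folder)
# into the per-user font directory. The function name is illustrative.
install_fonts() {
    local src="$1"
    local dest="${2:-$HOME/.local/share/fonts}"
    mkdir -p "$dest"
    # Match extensions case-insensitively, since Windows often uses .TTF.
    find "$src" -maxdepth 1 -type f \
        \( -iname '*.ttf' -o -iname '*.ttc' -o -iname '*.otf' \) \
        -exec cp {} "$dest"/ \;
    # Refresh the fontconfig cache when fc-cache is installed.
    if command -v fc-cache >/dev/null 2>&1; then
        fc-cache -f "$dest"
    fi
}
```

Calling `install_fonts /path/to/copied/Fonts` uses the default per-user folder; on a legacy system you could pass `~/.fonts` as the second argument.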

If fonts don’t look good, you might need to learn how to improve anti-aliasing.

Grammar checkers - Writing aids

Note: PerfectIt (EN) (paid) cannot be installed with Wine (as tested), it needs to be used in a Windows VM.

QA tools

Each CAT tool offers a built-in QA:

On top of the QA checks integrated into your chosen CAT tool, you may want or need to use a dedicated QA tool.

Proofreading and Revising (Track changes, compare documents)

Reviewing translations and proofreading can either be done within a CAT tool, especially with the Track Changes feature, via an exported Word or RTF bilingual table or on the exported target document itself.

Let’s take each scenario separately:

Track changes (in a CAT tool):

In Trados and memoQ, reviewers often make use of the included Track changes feature.

While each CAT software offers different methods for reviewing translations, if you are working outside of the above-mentioned tools (with the exception of CafeTran Espresso, see below), you cannot use Track changes in the bilingual (SDLXLIFF, MQXLIFF) files themselves.

You can, however, suggest the use of TQAuditor’s Quick Compare feature, to produce an Excel report highlighting any changes made during revision, by comparing the translated and the reviewed bilingual file. If you register with TQAuditor (free), you can keep track of these jobs and access a few more actions. Since TQAuditor supports bilingual files from various CAT tools, it represents a universal, “CAT-tool-agnostic” solution for producing translation reviews. This means you can open the SDLXLIFF files to review, make the necessary changes in your preferred CAT tool, and then create a comparison report in TQAuditor.

Please note CafeTran Espresso has recently added support for the tracking changes feature, including for Trados review tasks. Since it also supports Trados files and packages, SDLTM translation memories and SDLTB termbases, it can be used as a complete Trados replacement in most scenarios.

(Track) changes (in an exported bilingual Word/RTF)

Working on exported bilingual Word or RTF files offers the advantage that the reviewer does not need to own the original CAT tool, or any CAT tool, for that matter. The disadvantage is that the reviewer is left without the convenience (plus TM and other resources) of a CAT tool. If you receive a bilingual Word or RTF file for external review, you can just use the word processor of your office suite (MS Word, LibreOffice Writer, WPS Writer, etc.) to complete the review.

Provided that the bilingual review files can be opened in LibreOffice (Trados files may not be compatible, it seems), it’s good to know that its track changes/compare document feature is compatible with MS Office, at least with some precautions (like ensuring you use the same author name and initials in both programs, see archived link here). If you don’t use Track changes, the Compare document feature can be applied to create a document showing the modifications made.

(Track) changes (in the target document)

Here, the review takes place in the program that handles the original file format. If it’s a Word document, you can use Track changes (or Compare Document) as usual. When delivering the final file, you are usually expected to also save a separate copy with no tracked changes (accept all).

Various localization-related utilities

Okapi Framework

The Okapi Framework is a cross-platform and free-libre/open source set of components and applications that offer extensive support for localizing and translating documentation and software.

Rainbow is an Okapi application which allows you to launch different utilities to help you perform various localization-related tasks. It includes CheckMate and Ratel, which can also be launched as standalone utilities.

CheckMate is a GUI application that performs various QA checks on bilingual translation files such as XLIFF, TMX, TTX, PO, TS, Trados-Tagged RTF, and any other bilingual format supported by the framework.

Ratel is a GUI application to create and maintain segmentation rules. Such rules are used to break down translatable text into more meaningful parts. Ratel uses Okapi’s SRX-based segmentation engine. SRX is the Segmentation Rules eXchange format.

XLIFF Manager

XLIFF Manager is a cross-platform open source graphical user interface for OpenXLIFF Filters (an open source set of filters for creating, merging and validating XLIFF 1.2 and 2.0 files) written in JavaScript.

With XLIFF Manager you can:

TMLookup

TMLookup is an open source tool designed to search bilingual and multilingual text databases (translation memories) and glossaries.

BootCaT

Bootstrap Corpora And Terms from the Web. BootCaT is a free/libre software cross-platform Java application, which you can use to create specialized monolingual corpora on the fly. You can use a concordancer, such as AntConc, on the resulting corpus. Note: The Sketch Engine uses WebBootCaT among other features.

AntConc

AntConc is a freeware corpus analysis toolkit for monolingual concordancing and text analysis.

AntFileConverter

A freeware tool to convert PDF and Word (DOCX) files into plain text for use in corpus tools like AntConc.

AntFileSplitter

A freeware text file splitting tool.

AntPConc

A freeware parallel corpus analysis toolkit for concordance search and text analysis using UTF-8 encoded text files.

EncodeAnt

A freeware tool for detecting and converting character encodings.

Any2UTF8

Any2UTF8 is a simple program that converts plain text files in any character encoding to UTF-8.

ODA File Converter

For converting between different versions of .dwg and .dxf files.

Scribus

Scribus is free and open-source desktop publishing software that can be used, among other things, for previewing Adobe InDesign IDML files.

HTTrack/WebHTTrack

A free (GPL, libre/free software) and easy-to-use offline browser utility allowing you to copy a World Wide Web site from the Internet to a local directory, building recursively all directories, getting HTML, images, and other files from the server to your computer. HTTrack arranges the original site’s relative link structure. Simply open a page of the “mirrored” website in your browser, and you can browse the site from link to link as if you were viewing it online. Useful for some website translation scenarios.

Translate Toolkit

A toolkit with various terminal utilities for localization engineers, offering format conversions, and quality assurance tasks. Most notably it includes command line tools for converting XLIFF, PO, TS, DOCX/XLS, TBX, TMX, TXT, WORDFAST, SRT, XML, etc.

TMXValidator

Check the validity of your TMX documents on any platform.

FileOpen

Plug-in and viewer to access documents encrypted with the FileOpen software. While the Linux version is old, the Windows versions can be run via CrossOver Linux or PlayOnLinux (see section on Wine) in combination with a compatible Adobe Acrobat/Adobe Reader version.

Subtitling

Most professional subtitling software (such as EZTitles, Spot and WinCaps) is primarily available for Windows (and can only be used via a Windows VM on Linux). However, subtitlers working on Linux still have a few tricks up their sleeves.

Supported commercial-grade tools include OOONA (cloud) and SubtitleNEXT (desktop).

Desktop subtitling editors

Free tools

Subtitle Edit

Subtitle Edit is an open source, actively developed subtitle editor, and currently the recommended free tool for professional work. It can read, write, and convert between more than 300 subtitle formats. Help page. Tutorial videos.

If the Linux-ready portable version using Mono is giving you issues, try the Windows version via Wine (see related section).

Aegisub

Aegisub is a free, powerful cross-platform open source tool for creating and editing subtitles, especially when it comes to the versatile SSA/ASS subtitle format. It makes it quick and easy to time subtitles to audio (time-spotting), and styling them, among other features.

Unfortunately, this excellent program stopped being actively developed a few years ago, although some still swear by it.

Aegisub can be further enhanced by automation scripts, such as those by unanimated, lyger and petzku.

Aegisub’s user manual.

Subtitle Composer is a subtitle editor made available by KDE that allows for both mouse- and keyboard-driven workflows. Most notably, it offers detachable panes for customizing your environment, side-by-side subtitle translation, speech recognition via PocketSphinx, automatic error detection and bulk time management.

Many other free desktop subtitling tools exist, some cross-platform, others Linux-specific, but we’re focusing on those that may cater to professional subtitlers.

Commercial tools

SubtitleNEXT

SubtitleNEXT is a professional tool for video captioning and subtitling, which claims to be fully functional on Linux through Wine (you can download the Demo version and see for yourself). It requires the installation of LAVFilters.

Online subtitling editors

OOONA offers a collection of professional online video localization tools with weekly, monthly, biannual and annual plans (and a free four-week trial with limited subtitle export), making it a recommended solution for regular or occasional use, especially given its range of supported formats.

Amara features various versions of its online subtitling editor: Amara Public (a free, public workspace-only edition), the advanced, subscription-based Amara Plus, and Amara Enterprise for bigger teams. Check out the feature comparison.

Matesub is a new subtitling tool, developed by Translated.net, the company behind MateCat, MyMemory public TM and ModernMT. Currently on free beta access, it will be offered on a monthly-subscription basis later on.

In supported languages, if no SRT file is available/used, subtitles will be automatically generated and synced, and optionally pre-translated via MT (although you can safely skip this step). This can help speed up time spotting and transcription, and the subtitling process.

List of features: Auto-Transcription, Auto-Spotting, Auto-Translation, Easy collaboration, WYSIWYG editor, Real time QA check, Magic timeline, User-friendly UI, Top security.

Subtitling with CAT tools

Video localization has seen exponential growth in recent years, and translators are sometimes involved in “translating” video templates (i.e. already timed subtitles in the original language) for various video content types, especially corporate videos.

Some of this subtitling work occurs on CAT tools and general localization platforms, instead of specialized subtitling tools. To wit, many CAT tools (such as OmegaT, Wordfast Pro, MateCat, Memsource, CafeTran and Swordfish) support the SRT file format.

However, it is worth stating that most CAT tools are ill-equipped, if not unfit for serious subtitling, which is a complex and distinct service in its own right. While CAT tools can help ensure terminology consistency and conduct concordance search, they are lacking many important features and their MT suggestions and fuzzy matches are often only marginally useful.

The following CAT tools at least offer a video preview function for subtitles, which is crucial for translating in context (as is support for characters per second [CPS]/words per minute and characters per line): Smartcat, XTM, Wordbee and Smartling.

The only CAT tool that currently offers a more complete set of features for translating subtitles is Trados (via an RWS plugin), and partly, memoQ, but these are only available on Windows.
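When a CAT tool shows no reading-speed figures, the characters-per-second value mentioned above can be checked on the translated SRT file afterwards. The function below is a rough sketch under stated assumptions: the function name is made up, cues are expected to be separated by blank lines, spaces count as characters, and formatting tags are not stripped. Some awk implementations count bytes rather than characters, which inflates the figures for non-ASCII text.

```bash
#!/usr/bin/env bash
# Usage: srt_cps <file.srt>
# Prints one line per cue: cue number, character count, characters per second.
# Illustrative helper only; not part of any subtitling tool.
srt_cps() {
    awk '
    # Convert an SRT timestamp (HH:MM:SS,mmm) to seconds.
    function to_seconds(ts,    parts) {
        sub(/,/, ".", ts)
        split(ts, parts, ":")
        return parts[1] * 3600 + parts[2] * 60 + parts[3]
    }
    # Print the figures for the cue that has just ended.
    function report(    duration) {
        duration = stop - start
        if (duration > 0)
            printf "%d\t%d\t%.1f\n", cue, chars, chars / duration
    }
    { sub(/\r$/, "") }                      # tolerate Windows line endings
    / --> / {                               # a timing line starts a new cue
        if (in_cue) report()
        cue++; chars = 0; in_cue = 1
        start = to_seconds($1); stop = to_seconds($3)
        next
    }
    in_cue && $0 == "" { report(); in_cue = 0; next }
    in_cue { chars += length($0) }          # text line: add its length
    END { if (in_cue) report() }
    ' "$1"
}
```

Running `srt_cps translated.srt` lists every cue, so lines whose third column exceeds the client's CPS limit can be spotted and shortened.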

Burning-in (embedding) subtitles

HandBrake is one of the best FLOSS tools to embed .srt and .ass subtitles to a video. For image-based subtitles, the free version of DaVinci Resolve video editor supports importing and embedding Final Cut Pro XML subtitle files.
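HandBrake also ships a command-line program, HandBrakeCLI, which can be handy for burning in subtitles on several videos in a row. The wrapper below is a sketch: the function name is hypothetical, and the `--srt-file` and `--srt-burn` options should be checked against `HandBrakeCLI --help` for your installed version.

```bash
#!/usr/bin/env bash
# Usage: burn_srt <input-video> <subtitles.srt> <output-video>
# Burns an SRT file into the picture with HandBrakeCLI.
# The function name is illustrative; verify the options with --help.
burn_srt() {
    local video="$1" srt="$2" output="$3"
    HandBrakeCLI -i "$video" -o "$output" --srt-file "$srt" --srt-burn
}
```

For example: `burn_srt promo.mp4 promo.fr.srt promo_fr.mp4`.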

Transcription

Foot pedal not included.

Easytranscript

An easy to use transcription software with a variety of features. Free.

F4Transkript

Paid transcription software, with a demo that lets you transcribe the first 5 minutes of an audio file.

Express Scribe

There are two options for running Express Scribe on the Linux operating system.

Transcribe!

Download for Linux. There is a 30-day evaluation period, after which you decide if you wish to buy the software.

oTranscribe is a free web app for audio/video transcription. You can use it in combination with the Voice in Voice Typing Chrome extension.

OOONA Transcribe is an online subscription-based software aimed at creating script files and dialog lists.

Transcribe by Wreally is another online solution, which blends automatic and manual transcription, along with dictation.

Dictionary lookup

GoldenDict is a feature-rich cross-platform dictionary lookup program.

The program has the following features:

With a bit of searching online, you can find different (monolingual and sometimes bilingual) Babylon, Stardict, Dictd and Lingvo dictionaries to use in GoldenDict in some languages.

As described above, GoldenDict can be used to query online dictionaries and bilingual concordancers (such as Linguee), replacing IntelliWeb Search on Linux.

Other Dictionary applications exist on GNU/Linux, but they are nowhere near GoldenDict in functionality.

For dictionary file conversion, make sure to check out PyGlossary, a tool for converting dictionary files aka glossaries with various formats for different dictionary applications.

Depending on your language pairs and subject fields, there are multiple dictionaries, glossaries, corpora, terminology banks, concordancers, etc. available online. Listing them is beyond the scope of this document.

Here’s just one: MagicSearch.com is a multilingual metasearch engine which allows you to search multiple sources (dictionaries, corpora, machine translation engines, concordancers, search engines) with a single click. Select a language pair and submit a search (select the same source and target language for monolingual searches). MagicSearch will display a single scrollable page with multiple sources. You can click on each source button as it changes its color (=loads) to move to the respective source. MagicSearch remembers the language pair you selected the next time you visit the site (using a cookie).

Tip: Set “One-page results” to OFF, to speed up queries. Click on the gear to select which resources to use (and hide the others), and reorder them as you see fit. MagicSearch is also available as a browser extension.

Project management & Invoicing

Project management tools don’t have to be translation-specific, although it may help. Also, depending on your country of residence and tax requirements, you can use different more general invoicing tools.

Protemos

Free for freelancers. Track translation orders, monitor deadlines, organize files, send invoices and control payments

LSP.expert

Project management and invoicing tool for Freelance translators, teams, and agencies. Online, subscription-based, 30-day trial.

Trados Business Manager (previously BaccS)

Translation project management and invoicing tool. Offers an offline Windows-only version as well as an online version. Paid Freelance edition with permanent licence. The offline version is free for ProZ Plus members.

If you are a paying ProZ.com member, you can also use the online ProZ.com invoicing tool.

rulingo

Manage projects, customers and finances in a free cloud translation business management platform.

Breeze

Invoicing and project management web tool for translators

Flantie

Flantie is an easy-to-use online task and invoice management system for translators and interpreters. It offers both a Free and a Pro plan.

Project Libre

An open source desktop alternative to Microsoft Project.

DTP - Image localization

There is no QuarkXPress, Adobe InDesign, Photoshop or Illustrator for GNU/Linux.

Those offering DTP services may need to use a Windows VM or simply reconsider.

GNU/Linux is not completely lacking in the DTP department. Scribus can produce professional PDFs, just like InDesign or QuarkXPress; Inkscape is probably just as good as Illustrator; and GIMP can tackle many of Photoshop’s capabilities.

It’s just that these tools cannot necessarily provide the required compatibility to flawlessly integrate into a client’s proprietary DTP workflow.

Along with some other software and utilities, they can, however, prove useful for image localization and editing.

Scribus can be used for previewing Adobe InDesign IDML files.

ImageTranslate is an online paid service for translating images.

ImageTrans is a computer-aided image and comic translation tool, with features such as automatic text area detection, OCR services, MT engines, and TM. It can use scripts to save the results as Photoshop’s PSD files, export data to Excel, Word, XLIFF files, or import data from these files. With the Chrome Extension, it is also possible to translate pictures on web pages directly. Finally, ImageTrans can be used not only as an image translator, but also an image reader, an image transcriber and a deep learning annotator.

Notes: ImageTrans depends on JRE 1.8 (which includes JavaFX) and OpenCV (see documentation).

Here is the content of the .sh script I use to launch it:

```
#!/usr/bin/env bash

/usr/lib/jvm/java-8-oracle/jre/bin/java -jar -Djava.library.path=[absolute-path-to]ImageTrans.jar
```

Note: Documentation, tutorials and videos, FAQ, Support.

Speech recognition (STT)

Speech recognition/dictation is quite lacking in GNU/Linux. At present, few options compare to Nuance’s Dragon NaturallySpeaking, which does not run natively (although some editions are reported to be usable under Wine).

Online solutions, such as Voice notebook, often make use of the Google Speech API, which is also built into Google Docs (CafeTran Espresso offers a solution to take advantage of that).

Ibus-typing-booster, an auto-completion input method to speed up typing, also supports speech recognition using the Google Cloud Speech-to-Text service, which supports 120 languages.

Dictanote is a note-taking dictation app that can be installed as a standalone app on Linux, and Voice in Voice Typing is a Chrome extension to voice type in any textbox on many websites, including with punctuation. Other such web browser extensions can be found.

Currently, one way to integrate speech recognition into your GNU/Linux workflow across the desktop is to use an Android (or Apple?) phone. Indeed, the smartphones’ mic is optimized for recording voice and for noise suppression. Plus, you can apply voice typing across the desktop.

You can install an app such as Unified Remote, KDEConnect/GSConnect or WiFi Mouse on your Linux PC and your smartphone, with both connected to the same WiFi network (or via Bluetooth), and combine it with Google Voice or the GBoard keyboard, which includes a microphone button.

Text-to-speech (TTS)

Text-to-speech can be used in a number of scenarios for revision or self-revision purposes. Alas, quality solutions do not abound.

Interested users might want to investigate Balabolka, which runs in Wine (not tested).

Another solution would be to use the Amazon Polly text-to-speech service in combination with a bash script.
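A minimal sketch of such a script is shown below. It assumes the AWS CLI is installed and configured with credentials that may call Polly; the function name, the default voice (Joanna) and the file names are illustrative. Polly limits the amount of text per request, so longer documents would need to be split into chunks first.

```bash
#!/usr/bin/env bash
# Usage: speak_file <text-file> <output.mp3> [voice-id]
# Sends the content of a plain text file to Amazon Polly and saves the audio.
# Illustrative helper; requires a configured AWS CLI.
speak_file() {
    local text_file="$1" out_file="$2" voice="${3:-Joanna}"
    aws polly synthesize-speech \
        --output-format mp3 \
        --voice-id "$voice" \
        --text "$(cat "$text_file")" \
        "$out_file"
}
```

For example, `speak_file paragraph.txt paragraph.mp3` produces an MP3 you can listen to while rereading your translation.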

CafeTran Espresso CAT tool actually implements TTS with the Amazon Polly API.

For web browser based scenarios, Read Aloud for Chrome or Firefox is an excellent text to speech voice reader, with many voice options and engines (Google Wavenet, Amazon Polly, IBM Watson, and Microsoft), including 100,000 free characters per month for access to the premium voices. It can also read PDFs (opened in your browser) and ePubs (via the EPUBReader extension).

PDFs

PDF Editors

Sejda PDF Editor (Freemium)

Foxit Reader (Free) PDF reader (not editor) with some lite editing features such as PDF annotation and PDF sign.

PDF-XChange Editor (Free)

Able2Extract (Paid) lets you convert, OCR, create and edit PDF documents.

Qoppa PDF Studio (Paid)

Master PDF Editor (Paid)

Infix PDF Editor/TransPDF (Paid, free for TransPDF editing)

Iceni’s Infix is a PDF Editor for Windows which works well under Wine (see related section below). Just deselect the PDF printer option during installation.

This software is of special interest to translators because it can be used along with TransPDF, a free/pay as you go online service which handles PDF files translation: it offers to convert the PDF to the XLIFF format which can then be translated in a CAT tool, in order to produce a translated PDF file that closely follows the original layout.

PDFs translated via TransPDF can be edited using the free demo version of Infix PDF Editor. This is very handy for making final adjustments to spacing and layout ready for clients.

Users of the paid Infix version can use TransPDF free of charge. ProZ Plus members enjoy free 10 credits (equivalent to 10 PDF pages) per month, along with discounted credits purchase.

TransPDF also offers a paid OCR feature for non-editable PDFs.

FlexiPDF (Paid)

FlexiPDF is a PDF Editor for Windows which also works well under Wine (only lacks specific functionality such as print to PDF on Linux) and is sold both under a subscription model (FlexiPDF NX) and as a product (FlexiPDF).

The professional version of FlexiPDF includes the export/import of text from a PDF for translation with a CAT tool, similarly to Infix/TransPDF. Discounts are offered on occasion and depending on the country you are in, which can make it affordable.

For a thorough review of solutions for handling PDFs in translation, see here.

PDF Readers

Evince: Document viewer for multiple document formats. Supports PDF, PostScript, DjVu, TIFF, and DVI.

Foxit Reader: PDF reader (not editor) with some lite editing features such as PDF annotation and PDF sign/protect.

Okular: Universal document viewer, supporting different kinds of documents, like PDF, Postscript, DjVu, CHM, XPS, ePub, and others. Tip: Since Okular refreshes automatically when the file is updated, it can be used for target file preview purposes with a CAT tool such as OmegaT.

PDFSam: Split, merge, extract pages, rotate and mix PDF files.

PDF Chain: A graphical interface for manipulating PDF documents (concatenate, burst, watermark, attach files…)

PDF Mod: Modify PDF documents: Reorder, rotate, and remove pages, export images from a document, edit the title, subject, author, and keywords, and combine documents via drag and drop.

Tabula: Tabula is a tool for liberating data tables locked inside (editable) PDF files.

PDF TO TEXT

For converting editable PDFs to Docx, most PDF conversion utilities require using a Windows VM.

Able2extract Professional is closed source paid software that offers a Linux desktop version. It lets you convert, create and edit PDF documents. Conversion works for PDF to Word, Excel, PowerPoint, AutoCAD, Images, Publisher and LibreOffice documents (ODT, ODS and ODP). It sports an OCR feature as well, including extracting scanned PDF tables into Excel.

Foxit PDF to word online converter produces excellent results.

CloudConvert online converter does a very nice job as well, with the added bonus that it also handles a host of different file conversion types and offers an API (you can create a script).

AntFileConverter is a freeware tool to convert PDF and Word (DOCX) files into plain text for use in corpus tools like AntConc.

For command-line enthusiasts, programs such as pdftotext, pdfreflow (pdftohtml), Calibre’s ebook-convert, pdf2htmlEX and pdfbox can be added to the mix for PDF text extraction.
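As an illustration, pdftotext (from poppler-utils) can be wrapped in a small function to extract the text of every editable PDF in a folder, for instance before a word count. The function name and folder arguments are assumptions; `-layout` is a pdftotext option that tries to preserve the physical layout of the page.

```bash
#!/usr/bin/env bash
# Usage: pdf_folder_to_text <input-dir> <output-dir>
# Extracts the text of every PDF in <input-dir> into a .txt file of the
# same name in <output-dir>. Illustrative helper around pdftotext.
pdf_folder_to_text() {
    local in_dir="$1" out_dir="$2"
    local pdf name
    mkdir -p "$out_dir"
    for pdf in "$in_dir"/*.pdf; do
        # Skip the literal pattern when the folder holds no PDFs.
        [ -e "$pdf" ] || continue
        name=$(basename "$pdf" .pdf)
        # -layout keeps columns and tables aligned as far as possible.
        pdftotext -layout "$pdf" "$out_dir/$name.txt"
    done
}
```

Scanned PDFs contain no text layer, so they would come out empty and need OCR first.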

For a thorough review of solutions for handling PDFs in translation, see here.

PDF TO XLIFF TO PDF - ICENI INFIX & TRANSPDF

If you need to translate a PDF document with tricky layout that you need to replicate and deliver back into PDF (not DOCX) format, the following solution can be very efficient.

Iceni Infix is a PDF editor (with a monthly subscription or one-off payment) that offers the possibility to export text strings from editable PDFs as XLIFF files, then import them back and perform final adjustments. Infix 7 now uses the online TransPDF service or an offline legacy solution (with some drawbacks).

Related links: Infix 7 - Translating PDF Infix - Pricing

TransPDF is an online (free/pay as you go) service that aims to end the frustration of translating PDFs by converting them to good quality XLIFF for use with your own translation tools.

Simply upload your PDF to TransPDF and translate the XLIFF you get back using your own CAT tool (connectors exist for Trados, Memsource and memoQ). Upload your translated XLIFF and you will get a fully-formatted, translated PDF. Any post-editing can be done for free using Infix 7 PDF editor (a Windows application that runs well on Wine). This solution can also perform OCR (paid through credits) on scanned PDFs.

Note: By default, text is segmented per paragraph, although some tools, such as Trados and Memsource, apply additional segmentation, which makes the resulting XLIFF far more usable. I would recommend performing a round trip via one of these two tools to take advantage of the improved segmentation. You can also try to resegment the XLIFF file from paragraph to sentence via the Okapi Framework.

On the plus side, the original layout is preserved and should only require limited PDF editing (although in language pairs with high expansion rate, keeping the same layout can be troublesome), greatly reducing the time needed for layout finalization/DTP work.

On the downside, the service comes at a price which, although quite low, represents an additional cost.

Still, TransPDF represents an interesting option you should try.

Note: ProZ.com members get a 25% discount when buying credits. ProZ.com Plus members also receive 10 credits per month, as part of their Plus package.

Related links: TransPDF - Home page, TransPDF - Pricing, TransPDF - Help and support, TransPDF - PDF Translation, Step-by-step, TransPDF - Using MemoQ for PDF translation, TransPDF - Using Memsource for PDF translation

OCR

While there is no GUI version of ABBYY FineReader for GNU/Linux, ABBYY offers the ABBYY FineReader Engine for Linux as a command line interface (CLI), at an equivalent price.

Other native OCR solutions usually lag behind ABBYY FineReader and other proprietary software.

Able2extract Professional is a paid software that offers PDF conversion, creation and editing with OCR capabilities, including extracting scanned PDF tables into Excel (7-day free trial available).

A Windows VM (see related section) can be used to run ABBYY FineReader and other programs (such as Adobe Acrobat Pro, Nuance OmniPage, Readiris and Wondershare PDFelement). Some earlier versions of ABBYY FineReader (up to version 10) are also reportedly working via Wine (see related section).

Free/Open Source OCR engines include Tesseract, GOCR, and Cuneiform. The first two provide decent results, but struggle with complex layouts.
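Tesseract can also be called directly from the terminal. The two helpers below are a sketch: the function names are made up, the language code defaults to eng, and the matching Tesseract language data (for example fra for French) must be installed. The second helper passes Tesseract's pdf output option to produce a searchable PDF instead of plain text.

```bash
#!/usr/bin/env bash
# Usage: ocr_to_text <image> <output-base> [language]
# Writes the recognized text to <output-base>.txt.
ocr_to_text() {
    tesseract "$1" "$2" -l "${3:-eng}"
}

# Usage: ocr_to_pdf <image> <output-base> [language]
# Writes a searchable PDF to <output-base>.pdf.
ocr_to_pdf() {
    tesseract "$1" "$2" -l "${3:-eng}" pdf
}
```

For example, `ocr_to_text scan.png scan fra` creates scan.txt from a French scan.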

Various GUI software tools make use of these engines.

One of the easiest and most feature-rich is gImageReader. It scans images and PDFs, with manual and automatic recognition in multiple languages. Post-processing the recognized text includes a spellchecker, an excellent find and replace feature and a useful “remove line breaks” action that can have its use outside of OCR tasks as well.

When it comes to image scan post-processing, or pre-processing for use in a program like ABBYY FineReader (to improve OCR, but also for creating scanned ebooks), there is nothing like Scantailor. Or rather, was, because it has stopped being developed. Scantailor Advanced is still active though. The wiki is here.

For light OCR needs, online solutions might be of help: Convertio (10 page free trial, API available, prepaid and API packages available, multiple languages supported, excellent results), PDFCandy, Free Online OCR (max. size 15 MB, good OCR recognition for basic layouts, uses Tesseract), TM-Town (online, uses Tesseract).

On the CAT tools front, MateCat (online) and Wordfast Anywhere (online) offer PDF OCR conversion. Smartcat (a CAT tool developed by ABBYY) also offers paid OCR packages.

Iceni’s TransPDF (free or pay as you go) online service tackles the PDF problem in a different way, via conversion to XLIFF and back, offering an OCR feature paid with credits along the way (see above for more details).

Note: Many PDF editors, OCR solutions and even full-text search tools such as Recoll also offer the possibility to run OCR on the PDF itself, in order to make it editable and/or searchable. Making the PDF editable allows you to further edit or convert the document. Making it searchable is interesting if you wish to use the scanned PDF for reference purposes.
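For the "searchable PDF" case, the Tesseract command line can produce a PDF with a text layer from a scanned image. The sketch below wraps that call in Python. It assumes Tesseract and the relevant language data are installed, that the input is an image file (such as PNG or TIFF) rather than a PDF, and the file names are hypothetical.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image: Path, output_base: Path, lang: str = "eng") -> list:
    """Build the command line; the trailing "pdf" selects Tesseract's PDF output."""
    return ["tesseract", str(image), str(output_base), "-l", lang, "pdf"]


def make_searchable_pdf(image: Path, output_base: Path, lang: str = "eng") -> Path:
    """Run Tesseract and return the path of the PDF it writes (<output_base>.pdf)."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract was not found on PATH")
    subprocess.run(build_tesseract_command(image, output_base, lang), check=True)
    return Path(str(output_base) + ".pdf")
```

For example, `make_searchable_pdf(Path("scan.png"), Path("scan_ocr"), lang="fra")` would be expected to write `scan_ocr.pdf` next to the script, provided French language data is available.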

E-book management & conversion

Calibre is an excellent e-book management application for organizing, viewing, converting and editing e-books. Since linguists tend to keep many (e)books around, it’s definitely worth looking into.

Sigil is a very nice epub editor.

Interested in translating e-books? A solution is described here.

File and Folder comparison (DIFF) tools

Meld: Visual diff and merge tool for files, directories, and version controlled projects.

Beyond Compare: Compare files and folders, merge changes, synchronize files and generate reports. Commercial software.

KDiff3: File and directory diff and merge tool

Kompare: File and directory diff and merge tool with mirrored scrolling

Diffuse: Graphical tool for merging and comparing text files

Diffoscope: Terminal utility for diff with colors for better visualization

DiffPDF: Compare PDFs

diff-pdf: Compare PDFs (terminal)

SuperTMXMerge: Diff tool for comparing and merging TMX translation memories.
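For quick scripted comparisons, for instance between two revisions of a translated text, a line-based diff can also be produced with Python's standard `difflib` module. This is only a sketch of the idea behind the tools listed above, not a replacement for them; the function name and labels are assumptions.

```python
import difflib


def unified_text_diff(old: str, new: str, old_name: str = "old", new_name: str = "new") -> str:
    """Return a unified diff of two texts, or an empty string if they are identical."""
    diff = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",  # lines carry no trailing newline, so do not add one to headers
    )
    return "\n".join(diff)
```

Lines prefixed with `-` exist only in the old text, lines prefixed with `+` only in the new one.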

Desktop search - Full text index

Recoll

Recoll is a full-text search tool that can find keywords within documents and file names. It supports a host of document formats and can be used to index your home folder or specific directories for ad hoc documentary research. Recommended.
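To give an idea of why indexed search is fast, here is a toy inverted index in Python: each word maps to the set of documents that contain it, so a query only needs set lookups instead of re-reading every file. This is a conceptual sketch only, not how Recoll is implemented; the document names and helper functions are hypothetical.

```python
import re
from collections import defaultdict


def build_index(documents: dict) -> dict:
    """Map each lowercased word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(name)
    return index


def search(index: dict, query: str) -> set:
    """Return the documents that contain every word of the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```

Real tools add text extraction from many file formats, stemming, ranking and incremental updates on top of this basic structure.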

DocFetcher

Allows you to index selected directories and search the contents of files on your computer. Great for ad hoc indexing of your documentary research, your ebook library, etc. Sadly, it is no longer being developed.

FSearch

A fast and lightweight file search utility based on GTK+3. Similar to the (Windows only) Everything Search Engine, it provides instant (as you type) results, with RegEx and filter support among other features.

AngrySearch

A fast file search utility based on Qt5 that attempts to provide a Linux version of the Everything Search Engine available for Windows.

Searchmonkey

Allows users to search for file names and contents using powerful regular expressions.

Regexxer is a GUI search/replace tool featuring Perl-style regular expressions.

Gnome Shell Desktop Environment comes with a configurable built-in Tracker (Search), which allows for searching various types of content, including “full-text search”. If you don’t use it, you might as well disable it, to prevent full indexing.

KDE Plasma 5 Desktop Environment comes with baloo search, which indexes your home folder and can be configured to search only specific folders, to perform full content indexing (useful to find text in office files), and to allow searching hidden files and folders. Two setups are optimal for it: just leaving the defaults but with full content indexing disabled (openSUSE does this), or making it not search the home folder and only search (perhaps with full content indexing) in the Documents, Downloads, Pictures, Music and Videos folders (Fedora does this).

For simple, lightweight desktop search, you can use Tracker on GNOME, Baloo without full content indexing on Plasma, or Catfish.

Of course, there are also powerful command-line utilities for searching files (find), text searching for lines matching a regular expression (grep), etc.
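When such searches need to be scripted or combined with other processing, the same two ideas (find files by name, then match lines against a regular expression) can be sketched in Python with the standard library. The function names below are assumptions made for this example, and the behavior is only a rough analogue of `find` and `grep`.

```python
import re
from pathlib import Path


def find_files(root, name_pattern: str = "*") -> list:
    """Recursively list regular files under root whose name matches a glob pattern."""
    return sorted(p for p in Path(root).rglob(name_pattern) if p.is_file())


def grep_files(paths, regex: str, flags: int = 0) -> list:
    """Return (path, line number, line) for every line matching the regex."""
    pattern = re.compile(regex, flags)
    hits = []
    for path in paths:
        try:
            # Undecodable bytes are replaced so binary-ish files do not abort the search.
            text = Path(path).read_text(encoding="utf-8", errors="replace")
        except OSError:
            continue  # unreadable file: skip it
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((Path(path), number, line))
    return hits
```

For example, `grep_files(find_files("projects", "*.txt"), r"translat\w+", re.IGNORECASE)` would list every matching line in the text files under a hypothetical `projects` folder.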

File rename utilities

Nautilus (Files) file manager comes bundled with a bulk renamer (just select multiple files and press F2).

Thunar file manager also includes an excellent Bulk Rename utility.

Dolphin file manager also includes numeric-based bulk renaming (file1.txt, file2.txt, etc.).

KRename is a flexible program for bulk renaming based on text patterns.
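Numeric bulk renaming of the file1.txt, file2.txt kind is also easy to script. The following Python sketch first computes a rename plan without touching the disk, then applies it only if nothing would be overwritten. The function names and the naming scheme are assumptions for this example, not the behavior of any of the tools above.

```python
from pathlib import Path


def plan_numbered_rename(files, stem: str, start: int = 1) -> list:
    """Pair each file (in sorted order) with '<stem><n><original suffix>'."""
    plan = []
    for offset, path in enumerate(sorted(Path(f) for f in files)):
        target = path.with_name(f"{stem}{start + offset}{path.suffix}")
        plan.append((path, target))
    return plan


def apply_rename(plan) -> None:
    """Apply a plan, refusing to overwrite any file that already exists."""
    targets = [target for _, target in plan]
    if len(set(targets)) != len(targets):
        raise ValueError("two files would get the same name")
    for source, target in plan:
        if target.exists() and target != source:
            raise FileExistsError(str(target))
    for source, target in plan:
        source.rename(target)
```

Printing the plan before calling `apply_rename` gives a preview similar to what the graphical renamers show.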

Text editors

Gedit and Kate are two of the nice default text editors available.

Geany is a great lightweight text editor and Integrated Development Environment (IDE).

Text editors with even richer functions (also meant for coding) include GitHub’s Atom, Visual Studio Code, Sublime Text (free/paid) and Brackets.

For historical reasons, I’ll also include two of the oldest editors, with some hardcore fans: Vim and GNU Emacs (see also Editor wars).

Markdown editors

Atom, Visual Studio Code, Sublime Text, Brackets and even Geany all support Markdown syntax, with or without add-ons.

Three of the best dedicated Markdown editors/viewers are Typora, Zettlr and Obsidian.

Productivity tools selection

AutoKey

AutoKey is a desktop automation utility for Linux and X11, formerly hosted at OldAutoKey, and updated to run on Python 3. It allows you to manage a collection of scripts, and assign abbreviations and hotkeys to these scripts allowing you to execute them on demand in whatever program you are using. It can also be used as a text expander, where you store phrases (snippets of text) to be reused across various applications by typing an abbreviation or a keyboard shortcut.

Wiki, OldAutokey page and user group.
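The text expander idea (type a short abbreviation, get a stored phrase) can be illustrated with a few lines of plain Python. This is a conceptual sketch only and does not use AutoKey's scripting API; the abbreviations and the function name are hypothetical.

```python
import re

# Hypothetical abbreviations mapped to the phrases they expand to.
SNIPPETS = {
    "brgds": "Best regards,",
    "tmx": "Translation Memory eXchange",
}


def expand_abbreviations(text: str, snippets: dict = SNIPPETS) -> str:
    """Replace whole-word abbreviations in text with their stored phrases."""
    if not snippets:
        return text
    # Longest keys first, so a longer abbreviation wins over a shorter prefix.
    keys = sorted(snippets, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    return pattern.sub(lambda match: snippets[match.group(1)], text)
```

A desktop tool like AutoKey does the same substitution live, as you type, in whatever application has focus.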

ibus-typing-booster

Ibus-typing-booster is an intelligent, context-sensitive completion input method to speed up typing. It supports most languages (except Chinese and Japanese).

Note: CafeTran Espresso, OmegaT and other CAT tools already implement predictive typing/auto-completion features, but ibus-typing-booster brings this to the global level. Recommended.

Reduce eye-strain

Utilities that adjust the color temperature of your display(s) according to the position of the sun.

f.lux

Redshift. Also available as a Gnome shell extension.

Note: the GNOME and Plasma Desktop environments also sport integrated Night mode. In GNOME it can be found under Settings > Displays, in Plasma it can be found under System Settings > Display and Monitor > Night Color.

Clipboard managers

GPaste

Parcellite

CopyQ

Klipper

Screenshots

Beyond the standard screenshot capabilities of the desktop environment (Gnome, KDE, etc.), there are separate apps that take it one step further:

Shutter

Lightscreen

Flameshot

Screen recording - Screencasting

SimpleScreenRecorder: Excellent and easy-to-use screen recorder.

Peek: Animated GIF/video screen recorder.

Open Broadcaster Software: Full-featured cross-platform screen recording and live streaming software.

recordMyDesktop

Time and Project tracking

Project Hamster

TMetric

TimeCamp

RescueTime

Toggl

Super Productivity

Pomodoro timers

Gnome Pomodoro

PomoDoneApp

Pomello

Unit conversion

ConvertAll (offline)

On-screen keyboard

Onboard is an excellent on-screen (virtual) keyboard. Handy if you wish to remember how to type less frequent characters for your language.

Running GNU/Linux on Windows via WSL

The Windows Subsystem for Linux (WSL) lets you run a GNU/Linux environment – including most command-line tools, utilities, and applications – directly on Windows, unmodified, without the overhead of a traditional virtual machine or dual-boot setup.

This means you are able to run Linux distributions and applications side-by-side with Windows. WSL requires Windows 10 (Build 19041 or higher) or Windows 11.

Running Windows applications on Linux

Natively on WINE

Wine (an acronym for “Wine Is Not an Emulator”) is a free and open-source compatibility layer that aims to allow computer programs developed for Microsoft Windows to run on Unix-like operating systems. Instead of simulating internal Windows logic like a virtual machine or emulator, Wine translates Windows API calls into POSIX calls on-the-fly, eliminating the performance and memory penalties of other methods and allowing you to cleanly integrate Windows applications into your desktop.

Running Windows software on Linux can sometimes be a complex endeavor. However, many Windows programs can be installed and run via Wine (and its helper script, Winetricks).

To see the support status of a specific application and version, you can check Wine’s official website. Not all software is represented.

Since each Windows app has different requirements, it is usually recommended to create different Wine “prefixes”.
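Wine selects the prefix to use through the `WINEPREFIX` environment variable. The Python sketch below builds a per-application environment that can be passed to a `wine` call. The base directory and the application name are assumptions chosen for this example, and the helper function is hypothetical.

```python
import os
from pathlib import Path


def wine_environment(app_name: str, base_dir=None, environ=None):
    """Return (environment, prefix path) for running one app in its own Wine prefix."""
    # Assumed location for prefixes; any writable directory works.
    base = Path(base_dir) if base_dir else Path.home() / ".local" / "share" / "wineprefixes"
    prefix = base / app_name
    env = dict(os.environ if environ is None else environ)
    env["WINEPREFIX"] = str(prefix)  # Wine reads this variable to pick the prefix
    return env, prefix
```

With Wine installed, something like `subprocess.run(["wine", "setup.exe"], env=env)` would then run a hypothetical installer inside that prefix, leaving other prefixes untouched.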

Two applications aim to make it easy to install Windows programs in such separate locations:

Some examples of useful translation-related applications that run well on Linux via Wine (mostly PlayOnLinux and Crossover) are the following:

Many Windows programs that require specific versions of the .NET framework can be installed successfully after installing these first. Tip: Trial versions can be easily reinstalled since you can create separate machines with a few clicks.

Through a Windows Virtual Machine (VM)

The recommended free and easy way to install a Windows virtual machine and run it inside GNU/Linux is through a cross-platform application called VirtualBox.

With VirtualBox (provided you also install the extension pack and guest additions), you can run Windows in fullscreen mode, use USB and other devices, share folders or the clipboard between the two systems (the Linux host and the Windows guest), take advantage of the Internet connection, and more. There’s no need to shut down and restart the Windows VM: you can simply save the VM state and resume it quickly at any time.

This means you can run (almost) any Windows software while staying on Linux, including Trados, memoQ, Déjà Vu, Transit, ABBYY FineReader, Adobe Photoshop, InDesign, Illustrator, Premiere, etc. Working within a VM can be slightly inconvenient in the long run, so running such Windows applications is better left for occasional use only. Other CAT tools should be favored for primary use, but it is nice to know that you can still launch such Windows apps and utilities should your workflow require it.

Running a VM requires assigning some of your computer’s RAM to Windows, so it is preferable to have enough RAM to begin with. The more, the better, starting from 8 GB.

Other virtualization applications worth mentioning:

Paid: Parallels

Free/Paid: VMware

Free: QEMU, KVM

WinApps

This software lets you run Windows apps such as Microsoft Office/Adobe in Linux and GNOME/KDE as if they were a part of the native OS.

It connects your local system to a virtual machine (running with QEMU/KVM and virt-manager or virsh) and integrates Windows software by creating .desktop files under ~/.local/share/applications, which adds entries to your menu, and by making your home folder available to the VM, which means Windows applications can open and save files in your local folders.

With these set up, it uses freerdp to show that application (and only that instead of the full VM) on your desktop.
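To show what such a menu entry looks like, here is a Python sketch that renders and installs a minimal .desktop file. The keys follow the freedesktop.org Desktop Entry format, but the entry name and the `Exec` command are placeholders invented for this example; they are not the commands WinApps actually generates.

```python
from pathlib import Path


def render_desktop_entry(name: str, exec_command: str, icon: str = "", comment: str = "") -> str:
    """Return the text of a minimal application launcher entry."""
    lines = [
        "[Desktop Entry]",
        "Type=Application",
        f"Name={name}",
        f"Exec={exec_command}",
        "Terminal=false",
    ]
    if icon:
        lines.append(f"Icon={icon}")
    if comment:
        lines.append(f"Comment={comment}")
    return "\n".join(lines) + "\n"


def install_desktop_entry(file_name: str, content: str, applications_dir=None) -> Path:
    """Write <file_name>.desktop into the user's applications folder and return its path."""
    target_dir = Path(applications_dir) if applications_dir else Path.home() / ".local" / "share" / "applications"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{file_name}.desktop"
    target.write_text(content, encoding="utf-8")
    return target
```

For instance, `install_desktop_entry("word-vm", render_desktop_entry("Word (VM)", "/path/to/launcher word"))` would add a hypothetical "Word (VM)" entry to the menu of most desktop environments.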

See also Thiago Masato Costa Sueto’s ProZ forum report.

Via Dual Boot - On a separate machine

You can also install GNU/Linux alongside Windows on your PC (which is often recommended as a first step for newcomers, although interested users with Windows 10/11 should also check out WSL) or on a separate machine.

Happy translating!

Updates

Feedback

You can send me feedback via a ProZ message. Please start your subject with the name of the document: “TranslateOnLinux”.