Proactively Building in Efficiency in Machine Translation Engines to Reduce Our Carbon Footprint

Dan Milczarski, Executive Vice President, Process & Technology, CQ fluency

What To Know

  • Tracking carbon emissions during training alongside the improvement in BLEU scores, a recent study found that the French>German, English>German, and German>French language pairs took the longest to train and were therefore the most carbon-intensive, compared with English>French, German>English, and French>English.
  • From the way we operate ML hardware (including hosting engines in regions that use renewable energy sources), to the process by which we train natural language processing (NLP) models, to the way we factor in how specific language pairs perform, we are continually building on our best practices to further reduce total energy use.

Recent estimates suggest that training a single AI model can emit as much as 284 tons of carbon dioxide equivalent, roughly five times the lifetime emissions of an average car. As responsible global citizens, organizations and corporations should feel ethically compelled to do their part to improve our earth’s prognosis in this time of climate change.

Language service provider (LSP) organizations, like CQ fluency, are heavily invested in training AI engines to fit the needs of their global clients. While machine translation (MT) isn’t a major contributor to our deteriorating climate, the processing power needed to train these engines is significant enough to warrant reevaluating our MT processes. Luckily, the evolution of language technology has enabled companies to train MT engines using significantly less processing power, reducing the environmental impact.

As with all machine learning activities, the best outcome comes when you find harmony between artificial intelligence and human intelligence. When implementing a new MT model, the human effort, knowledge, and experience that go into preparing the dataset are extremely important to ensure you train the engine right the first time.

One way to create efficiencies is to employ pre-training assessment scorecards and proprietary scripts to optimize the datasets before training. This allows the engineer to get first-time trained models to reach BLEU scores and edit distance rankings high enough to avoid the need for re-training with a larger dataset.
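To make the edit distance measure concrete, here is a minimal sketch of a word-level edit distance between an MT output and its post-edited version. It is a generic illustration, not CQ fluency's proprietary scripts or scorecards; the function names and the whitespace tokenization are assumptions. In this sketch a lower normalized distance means less post-editing was needed.

```python
def edit_distance(hyp_tokens, ref_tokens):
    """Word-level Levenshtein distance, computed one row at a time."""
    prev = list(range(len(ref_tokens) + 1))
    for i, hyp_word in enumerate(hyp_tokens, start=1):
        curr = [i]
        for j, ref_word in enumerate(ref_tokens, start=1):
            cost = 0 if hyp_word == ref_word else 1
            curr.append(min(
                prev[j] + 1,         # drop a word from the hypothesis
                curr[j - 1] + 1,     # add a word from the reference
                prev[j - 1] + cost,  # keep or substitute a word
            ))
        prev = curr
    return prev[-1]


def normalized_edit_distance(hypothesis, reference):
    """Edit distance divided by the longer segment length (0.0 = identical)."""
    hyp, ref = hypothesis.split(), reference.split()  # assumed tokenization
    if not hyp and not ref:
        return 0.0
    return edit_distance(hyp, ref) / max(len(hyp), len(ref))


# Hypothetical MT output and post-edited segment, for illustration only.
mt_output = "the patient takes one tablet daily"
post_edit = "the patient takes two tablets daily"
print(normalized_edit_distance(mt_output, post_edit))
```

A team could average this value over a held-out sample and compare it against its own acceptance threshold before deciding whether re-training is warranted.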

When we refer to implementation, it isn’t only how we train the engines, but also how we manage the environmental impact. Teaching machines the complexities of human language requires heavy computation that is energy intensive.

In a recent study, researchers selected “six language pairs to assess the computational power required for training; that is, which pairs were more power-hungry and, hence, carbon-emitting.” The study focused on pairs of English, French and German, using a dataset with 30,000 samples for each language. Tracking carbon emissions during training alongside the improvement in BLEU scores for reference and comparison, the study found that the French>German, English>German, and German>French language pairs took the longest to train and were therefore the most carbon-intensive, compared with English>French, German>English, and French>English. This example demonstrates how differences in lexicon, with German being the most diverse, can lead to disparities in results and higher emissions.
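The link between training time and emissions can be sketched with a simple estimate: energy is hardware power times hours times the data center's power usage effectiveness (PUE), and emissions are that energy times the grid's carbon intensity. The example below is only an illustration under stated assumptions. The function name, the power draw, the PUE, the grid intensity, and the training durations are all hypothetical placeholders, not figures from the study.

```python
def training_emissions_kg(avg_power_watts, hours, pue, grid_kg_co2e_per_kwh):
    """Estimate CO2e (kg) for one training run.

    energy (kWh) = average hardware power (kW) * hours * data center PUE
    emissions    = energy * grid carbon intensity (kg CO2e per kWh)
    """
    energy_kwh = (avg_power_watts / 1000.0) * hours * pue
    return energy_kwh * grid_kg_co2e_per_kwh


# Hypothetical training durations in hours; NOT figures from the study.
hypothetical_runs = {"English>French": 2.0, "French>German": 3.0}

for pair, hours in hypothetical_runs.items():
    kg = training_emissions_kg(
        avg_power_watts=300,       # assumed average accelerator draw
        hours=hours,
        pue=1.5,                   # assumed power usage effectiveness
        grid_kg_co2e_per_kwh=0.4,  # assumed grid carbon intensity
    )
    print(f"{pair}: {kg:.2f} kg CO2e")
```

With hardware and location held fixed, the estimate scales linearly with training time, which is why the slower language pairs come out as the more carbon-intensive ones.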

This matters in the global healthcare space, where pharmaceutical, biotechnology and medical device companies are simultaneously researching and publishing data in multiple languages, marketing products across different countries with differing regulations, and tracking adverse events or post-marketing results as part of a pharmacovigilance program. In that setting, the importance of accurate translation at speed is evident.

It’s an investment to research, hone, and implement processes that provide accurate, quick, and comprehensive results for global clients working in multiple languages. At CQ fluency, we’ve prioritized development of AI, machine translation, process automation and other innovative translation management solutions for our health-focused clients. The use of AI does not eliminate the need for human expertise and guidance. With the evolution of language technology, we have strategically built nimble teams to help best integrate our evolving platforms to achieve cost, security, speed, scale and quality goals. Our technology solutions work hand-in-hand as part of a larger ecosystem with efficient ML architectures at the heart of it. From the way we operate ML hardware (including hosting engines in regions that use renewable energy sources), to the process by which we train natural language processing (NLP) models, to the way we factor in how specific language pairs perform, we are continually building on our best practices to further reduce total energy use.

Companies like Google have also pledged to offset their carbon footprint as it pertains to machine learning through a set of practices titled the “4Ms,” which are available to anyone using Google Cloud services. Together, these four practices can reduce energy use by up to 100x and emissions by up to 1000x.

  1. Model. Selecting efficient ML model architectures, such as sparse models, can advance ML quality while reducing computation by 3x–10x.
  2. Machine. Using processors and systems optimized for ML training, versus general-purpose processors, can improve performance and energy efficiency by 2x–5x.
  3. Mechanization. Computing in the cloud rather than on premises reduces energy usage and therefore emissions by 1.4x–2x. Cloud-based data centers are new, custom-designed warehouses equipped for energy efficiency for 50,000 servers, resulting in very good power usage effectiveness (PUE). On-premises data centers are often older and smaller and thus cannot amortize the cost of new energy-efficient cooling and power distribution systems.
  4. Map Optimization. The cloud also lets customers pick the location with the cleanest energy, further reducing the gross carbon footprint by 5x–10x. While one might worry that map optimization could lead to the greenest locations quickly reaching maximum capacity, user demand for efficient data centers will result in continued advancement in green data center design and deployment.
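As a rough check on how the four ranges relate to the headline figures, the sketch below multiplies them together. It assumes the factors are independent and combine multiplicatively, and that Map Optimization affects emissions only; the variable and function names are illustrative.

```python
from math import prod

# (low, high) reduction factors taken from the list above.
ENERGY_FACTORS = {
    "model": (3, 10),
    "machine": (2, 5),
    "mechanization": (1.4, 2),
}
MAP_FACTOR = (5, 10)  # cleaner-energy location: assumed to affect emissions only


def combined_range(factors):
    """Multiply the low ends and the high ends of (low, high) factor pairs."""
    lows, highs = zip(*factors)
    return prod(lows), prod(highs)


energy_low, energy_high = combined_range(ENERGY_FACTORS.values())
emissions_low, emissions_high = combined_range([*ENERGY_FACTORS.values(), MAP_FACTOR])

print(f"Energy reduction: {energy_low:.1f}x to {energy_high:.0f}x")
print(f"Emissions reduction: {emissions_low:.1f}x to {emissions_high:.0f}x")
```

Under these assumptions, the upper ends of the ranges multiply out to the 100x energy and 1000x emissions figures, while the lower ends give roughly 8.4x and 42x.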

Every language service provider (LSP) can still do its part by training engines only when needed, by analyzing previously trained models that required re-training to understand how to avoid similar re-training, and by donating to a service that helps offset carbon emissions from the MT training process.

Recently, CQ fluency launched the CQtrees initiative, planting a tree for every engine trained to help offset carbon emissions. Trees are natural carbon absorbers that clean the air: one tree can absorb up to 22 lbs per year during its first 20 years of growth. Trees of course have many other benefits beyond storing carbon. They give us oxygen, stabilize soil, provide shelter and food for wildlife, regulate temperatures, slow the flow of water through landscapes and much more. Our employees, vendors and clients help us plant trees as a volunteer initiative and support programs that plant trees in the communities we serve (North America, South America, Europe and more).

We all must do our part to mitigate carbon emissions in ML.