Proactively Building in Efficiency in Machine Translation Engines to Reduce Our Carbon Footprint

By Dan Milczarski, Executive Vice President, Process & Technology, CQ fluency

Recent estimates suggest that training a single AI model can emit as much as 284 tons of carbon dioxide equivalent, five times the lifetime emissions of an average car. As responsible global citizens, organizations and corporations should feel ethically compelled to do their part to improve our earth’s prognosis in this time of climate change.

Language service provider (LSP) organizations, like CQ fluency, are heavily invested in training AI engines to fit the needs of their global clients. While machine translation (MT) isn’t a major contributor to our deteriorating climate, the processing power needed to train these engines is significant enough to warrant reevaluating our MT processes. Luckily, the evolution of language technology has enabled companies to train MT engines using significantly less processing power, which reduces the environmental impact.

As with all machine learning activities, the best outcomes come from finding harmony between artificial intelligence and human intelligence. When implementing a new MT model, the human effort, knowledge, and experience that go into preparing the dataset are extremely important to ensure you train the engine right the first time.

One way to create efficiencies is to employ pre-training assessment scorecards and proprietary scripts to optimize the datasets before training. This allows the engineer to get first-time trained models to reach BLEU scores and edit distance rankings high enough to avoid re-training with a larger dataset.
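To make the idea of a pre-training scorecard concrete, here is a minimal sketch in Python. It is not CQ fluency's proprietary tooling; the function name `scorecard`, the three checks (empty segments, exact duplicates, mismatched lengths), and the `max_length_ratio` threshold are all assumptions chosen for illustration.

```python
from collections import Counter


def scorecard(pairs, max_length_ratio=3.0):
    """Summarize simple quality signals for (source, target) segment pairs.

    Hypothetical checks for illustration only: empty segments, exact
    duplicates, and pairs whose word counts differ by more than
    max_length_ratio. Returns (report, clean_pairs).
    """
    seen = set()
    report = Counter()
    clean = []
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        if not source or not target:
            report["empty"] += 1
            continue
        if (source, target) in seen:
            report["duplicate"] += 1
            continue
        seen.add((source, target))
        src_len, tgt_len = len(source.split()), len(target.split())
        # Both lengths are at least 1 here, so the division is safe.
        if max(src_len, tgt_len) / min(src_len, tgt_len) > max_length_ratio:
            report["length_mismatch"] += 1
            continue
        clean.append((source, target))
    report["kept"] = len(clean)
    return dict(report), clean


sample = [
    ("Take one tablet daily.", "Prenez un comprimé par jour."),
    ("Take one tablet daily.", "Prenez un comprimé par jour."),
    ("", "Bonjour"),
    ("Warning", "Ceci est un très long segment qui ne correspond pas"),
]
report, clean = scorecard(sample)
print(report)
```

A report like this gives the engineer a quick read on whether the dataset is ready, so problems are caught before any compute is spent on training rather than after a disappointing first run.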

When we refer to implementation, we mean not only how we train the engines, but also how we manage the environmental impact. Teaching machines the complexities of human language requires heavy computation that is energy intensive.

In a recent study, researchers selected “six language pairs to assess the computational power required for training; that is, which pairs were more power-hungry and, hence, carbon-emitting.” The study focused on language pairs among English, French and German, using a dataset with 30,000 samples for each language. Tracking carbon emissions during training, along with the improvement in BLEU scores for reference and comparison, the study found that the French>German, English>German, and German>French pairs took the longest to train and, as a result, were more carbon-intensive than English>French, German>English, and French>English. This example demonstrates how differences in lexicon, with German being the most diverse, can lead to disparities in results and higher emissions.
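The link between training time and emissions can be sketched with a common back-of-the-envelope formula: energy used (average power × hours × data center PUE) multiplied by the carbon intensity of the local grid. The Python below is an illustration under stated assumptions, not the study's method; the function name and every number in the example (300 W, 2 and 3 hours, PUE of 1.5, 0.4 kg CO2e per kWh) are hypothetical.

```python
def training_emissions_kg(avg_power_watts, hours, pue, grid_kg_co2e_per_kwh):
    """Estimate training emissions in kg of CO2 equivalent.

    energy (kWh) = average power (kW) * hours * PUE
    emissions    = energy * grid carbon intensity (kg CO2e per kWh)
    """
    energy_kwh = avg_power_watts / 1000 * hours * pue
    return energy_kwh * grid_kg_co2e_per_kwh


# Hypothetical comparison: the same hardware and data center, but one
# language pair needs 3 hours to train while another needs 2.
faster_pair = training_emissions_kg(300, 2, pue=1.5, grid_kg_co2e_per_kwh=0.4)
slower_pair = training_emissions_kg(300, 3, pue=1.5, grid_kg_co2e_per_kwh=0.4)
print(f"Faster pair: {faster_pair:.2f} kg CO2e")
print(f"Slower pair: {slower_pair:.2f} kg CO2e")
```

With everything else held equal, emissions scale directly with training time, which is why language pairs that take longer to train come out as more carbon-intensive.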

This matters in the global healthcare space, where pharmaceutical, biotechnology and medical device companies are simultaneously researching and publishing data in multiple languages, marketing products across different countries with differing regulations, and tracking adverse events or post-marketing results as part of a pharmacovigilance program. In that setting, the importance of accurate translation at speed is evident.

It’s an investment to research, hone, and implement processes that provide accurate, quick, and comprehensive results for global clients working in multiple languages. At CQ fluency, we’ve prioritized development of AI, machine translation, process automation and other innovative translation management solutions for our health-focused clients. The use of AI does not eliminate the need for human expertise and guidance. With the evolution of language technology, we have strategically built nimble teams to help best integrate our evolving platforms to achieve cost, security, speed, scale and quality goals. Our technology solutions work hand-in-hand as part of a larger ecosystem with efficient ML architectures at the heart of it. From the way we operate ML hardware (including hosting engines in regions that use renewable energy sources), to the process by which we train natural language processing (NLP) models, to the way we factor in how specific language pairs perform, we are continually building on our best practices to further reduce total energy use.

Companies like Google have also pledged to offset their carbon footprint as it pertains to machine learning through their model titled “4Ms,” which is available to anyone using Google Cloud services. Together, these four practices can reduce energy use by up to 100x and emissions by up to 1000x.

  1. Model. Selecting efficient ML model architectures, such as sparse models, can advance ML quality while reducing computation by 3x–10x.
  2. Machine. Using processors and systems optimized for ML training, versus general-purpose processors, can improve performance and energy efficiency by 2x–5x.
  3. Mechanization. Computing in the cloud rather than on premises reduces energy usage and therefore emissions by 1.4x–2x. Cloud-based data centers are new, custom-designed warehouses equipped for energy efficiency for 50,000 servers, resulting in very good power usage effectiveness (PUE). On-premises data centers are often older and smaller and thus cannot amortize the cost of new energy-efficient cooling and power distribution systems.
  4. Map Optimization. The cloud lets customers pick the location with the cleanest energy, further reducing the gross carbon footprint by 5x–10x. While one might worry that map optimization could lead to the greenest locations quickly reaching maximum capacity, user demand for efficient data centers will result in continued advancement in green data center design and deployment.
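The headline figures follow from multiplying the individual ranges in the list above, assuming the four practices compound independently (an assumption made here for the arithmetic). The first three affect energy use; map optimization affects emissions on top of that. A short Python sketch, with hypothetical names, shows the calculation:

```python
from math import prod

# Reduction ranges (low, high) as quoted in the list above.
FOUR_MS = {
    "model": (3.0, 10.0),
    "machine": (2.0, 5.0),
    "mechanization": (1.4, 2.0),
    "map_optimization": (5.0, 10.0),
}


def combined_range(practices):
    """Multiply the low and high factors of the given practices.

    Assumes the reductions are independent and therefore compound.
    """
    lows = [FOUR_MS[name][0] for name in practices]
    highs = [FOUR_MS[name][1] for name in practices]
    return prod(lows), prod(highs)


# Energy is affected by the first three practices only.
energy_low, energy_high = combined_range(["model", "machine", "mechanization"])
# Emissions benefit from all four, including the choice of location.
emissions_low, emissions_high = combined_range(list(FOUR_MS))

print(f"Energy reduction: {energy_low:.1f}x to {energy_high:.0f}x")
print(f"Emissions reduction: {emissions_low:.0f}x to {emissions_high:.0f}x")
```

The upper bounds come out to 100x for energy and 1000x for emissions, matching the figures quoted. The lower bounds under the same assumption are roughly 8.4x and 42x, so actual savings depend on how far each practice can be pushed.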

Every language service provider (LSP) can still do its part by training engines only when needed, by analyzing previously trained models that required re-training to understand how to avoid similar re-training, and by donating to a service that helps offset carbon emissions from the MT training process.

Recently, CQ fluency launched the CQtrees initiative, planting a tree for every engine trained to help offset carbon emissions. Trees are natural carbon absorbers that clean the air: one tree can absorb up to 22 lbs of carbon dioxide per year during its first 20 years of growth. Trees of course have many other benefits beyond storing carbon. They give us oxygen, stabilize soil, provide shelter and food for wildlife, regulate temperatures, slow the flow of water through landscapes and much more. Our employees, vendors and clients help us plant trees as a volunteer initiative and support programs that plant trees in the communities we serve (North America, South America, Europe and more).

We all must do our part to mitigate carbon emissions in ML.

Medical Device News Magazine (https://infomeddnews.com)
Medical Device News Magazine provides breaking medical device / biotechnology news. Our subscribers include medical specialists, device industry executives, investors, and other allied health professionals, as well as patients who are interested in researching various medical devices. We hope you find value in our easy-to-read publication and its overall objectives! Medical Device News Magazine is a division of PTM Healthcare Marketing, Inc. Pauline T. Mayer is the managing editor.
