Mingis on Tech: The language of malware

Sometimes, how you say something can be as important as what you say -- especially when there's been a cyberattack and law enforcement officials are trying to figure out who you are.

That's what CSO senior writer Fahmida Rashid found when she looked into how cybersecurity firms go about tracking down the bad actors behind malware campaigns. While linguistics may not be the first thing companies worry about when trying to protect -- or retrieve access to -- their data, it can help pinpoint an attack's origin, Rashid told Computerworld Executive Editor Ken Mingis.

Linguistic analysis has been used to investigate various attacks, including the 2014 Sony breach, ShadowBrokers and Guccifer 2.0 -- and it seems to be gaining traction because it can help identify the shadowy figures behind ransomware attacks, Rashid said. For example, Flashpoint analysts examined every language version of the ransom notes that accompanied WannaCry and determined that the notes written in Bulgarian, French, German, Italian, Japanese, Korean, Russian, Spanish and Vietnamese had been translated from a note originally written in English. (In the CoinVault ransomware attack, investigators found several phrases in “perfect Dutch,” indicating a Dutch connection.)

Ransomware lends itself well to linguistic analysis because when attackers write the ransom notes, their speech patterns show up in the text. Ransom notes also tend to offer more text to analyze, and unlike spam and phishing messages, where attackers have to mimic legitimate entities, they can carry clues about how comfortable the writer is in that language.

The fascinating part, according to Rashid, is that linguists can learn about attackers by the way they phrase certain words, or even by the words themselves. That's particularly true of ransomware like WannaCry, where victims get a message from the attackers -- and that message can contain hidden clues. Linguists like Shlomo Argamon, professor of computer science at the Illinois Institute of Technology, say it’s important to have as much text as possible to analyze. The more there is, the more likely the “true” attributes can be surfaced.

It's not foolproof, Rashid noted. People can speak multiple languages with differing degrees of proficiency, which sometimes obscures an attack's origin. Attackers regularly employ red herrings and false flags to throw investigators off: they manipulate when they launch attacks, change timestamps, and even intentionally insert cultural references and phrases to misdirect investigators. Even so, it is hard to consistently plant fake clues in speech.

For an audio podcast only, click play (or catch up on all episodes) below. Or you can now find us on iTunes, where you can download each episode and listen at your leisure.

Happy listening, and please, send feedback or suggestions for future topics to us. We'd love to hear from you.

IDG News Service

The IDG News Service is the world's leading daily source of global IT news, commentary and editorial resources. The News Service distributes content to IDG's more than 300 IT publications in more than 60 countries.
