Spam Negatively Affecting Linguistic Research

Spam. eBay has policies against it, search engines and email clients are constantly upgraded to filter it out, and most people just hate it. Now it turns out that it might be hampering the work of linguists who have turned to Google to directly observe the evolution of language.

One popular technique used by spammers is “keyword spamming.” If everyone is currently talking about the exploits of a popular celebrity, why not add her name to your auction listing, website, or email message so that you get more visitors? Spammers have been known to load up their material with what seems to be a dictionary’s worth of words. In response, email clients and search engines have added filters that ignore lists of words that have no relationship to the true content of the email message or website. However, spammers are now getting around these filters by working their lists of words into grammatically correct but nonsensical sentences.

How are linguists being affected? Traditional data sources such as controlled studies, eavesdropping, and collections of written or spoken words all have their disadvantages, so some linguists are using Google to monitor the evolution of language. However, other linguists warn that the Internet might not be the best source either, because keyword spamming practices might be contaminating the data.

Published by

Richard Leis