Scientists are studying your Twitter slang to help AI

Mathematicians studied 100 billion tweets to help computer algorithms better understand our colloquial digital communication.

  • A group of mathematicians from the University of Vermont used Twitter to examine how young people intentionally stretch out words in text for digital communication.
  • Analyzing the language in roughly 100 billion tweets generated over eight years, the team developed two measurements to assess patterns in the tweets: balance and stretch.
  • The words people stretch are not arbitrary; they follow patterns in which part of the word is stretched and by how much.

What? whaat. WHAT? Whaaaattt?

While all of the above express confusion, you understand them to mean slightly different things. That's because of how you imagine each one sounding, based on which letters are repeated or emphasized. The meaning embedded in our vernacular, slang, and deliberately misspelled words is how we lace our digital communication with human emotion.

That nuance has also proved to be one of the major challenges for language-processing artificial intelligence. But scientists are trying, and they're studying our Twitter lingo to bring computers up to speed on how humans really communicate.

Balance and Stretch


Over the last two decades, social media has provided scientists with a trove of free information about human behavior and language. A group of mathematicians from the University of Vermont used Twitter to examine how young people intentionally stretch out words in text for digital communication. They created a method to essentially quantify the semantic nuances between stretched words, like "right" vs. "riiiiiight," with the aim of teaching future AI algorithms human digital colloquialisms.

"Written communication has recently begun encoding new forms of expression, including the emotional emphasis delivered by stretching words out," said Chris Danforth, professor of Mathematics & Statistics in the Vermont Complex Systems Center and member of the research team behind the study.

In their study, published last week in the journal PLOS One, the team analyzed the language in roughly 100 billion tweets generated from 2008 to 2016. They developed two measurements to assess patterns in the tweets: balance and stretch. For example, "hahahaha" would be considered a stretched word high on balance, while a term like "wtffffff" has stretch but little balance, as only one letter, f, contributes to the stretchiness. That puts the emphasis on the word abbreviated by the letter "f".
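To make the two ideas concrete, here is a rough Python sketch. It is not the researchers' actual method: the function names, the requirement that you supply the unstretched base word, and the choice of normalized entropy as a stand-in for balance are all assumptions made for illustration. Stretch is taken to be the number of extra letters, and balance is how evenly those extra letters are spread across the letters of the base word.

```python
import math
from collections import Counter


def stretch_profile(stretched: str, base: str) -> dict:
    """Extra occurrences of each base-word letter in the stretched form.

    Hypothetical helper: assumes the caller already knows the base word.
    """
    stretched_counts = Counter(stretched.lower())
    base_counts = Counter(base.lower())
    return {ch: max(stretched_counts[ch] - base_counts[ch], 0)
            for ch in base_counts}


def stretch_score(stretched: str, base: str) -> int:
    """Illustrative 'stretch': total number of extra letters."""
    return sum(stretch_profile(stretched, base).values())


def balance_score(stretched: str, base: str) -> float:
    """Illustrative 'balance': normalized entropy of the extra letters.

    Close to 1 when every letter is stretched equally,
    0 when a single letter carries all of the stretch.
    """
    extras = list(stretch_profile(stretched, base).values())
    total = sum(extras)
    if total == 0 or len(extras) < 2:
        return 0.0
    probs = [e / total for e in extras if e > 0]
    entropy = sum(-p * math.log(p) for p in probs)
    return entropy / math.log(len(extras))


if __name__ == "__main__":
    for word, base in [("hahahaha", "ha"), ("wtffffff", "wtf"),
                       ("gooooooaaaalll", "goal")]:
        print(word, stretch_score(word, base), balance_score(word, base))
```

Under these assumptions, "hahahaha" scores at the top of the balance scale because "h" and "a" are repeated equally, while "wtffffff" scores zero because every extra letter is an "f". A real system would also need to infer the base word from the data rather than being handed it.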

"With so much communication happening electronically these days, we're all trying to find ways to convey emotion through text. Emojis are helping, but the visual effect of 30 consecutive vowels in a curse word turns a bland profanity into a form of art," Danforth said.

Interestingly, the use of elongated words was found across languages. For example, "kkkkkkk" signifies laughter in Brazilian Portuguese while "wkwkwkwkwkwk" expresses it in Indonesian, according to the researchers.

Beyond the dictionary 

Ultimately, this project could help artificial intelligence algorithms understand the critical meaning carried by the idiosyncratic variations in our text and by other linguistic symbols, such as punctuation and emojis.

Dictionary definitions hardly reflect the way that we actually communicate with one another digitally. What the researchers found, though, is that the words people stretch out aren't arbitrary. Rather, they follow patterns in which part of the word is stretched and by how much. Colloquial digital language is, after all, a system of symbols, and for it to transfer meaning we must all be "in" on the patterns.

This research suggests that understanding the stretched words used on social media opens more doors to helping AI better understand our slang. The team also developed tools and methods that could be useful in future studies, such as investigations of intentional mistypings and misspellings.

What benefits come from AI algorithms better understanding our digital lingo? For one, it's possible that new tools could be applied to improve natural language processing, search engines, and spam filters.

"We were able to comprehensively collect and count stretched words like 'gooooooaaaalll' and 'hahahaha'," the researchers said in a press release, "and map them across the two dimensions of overall stretchiness and balance of stretch, while developing new tools that will also aid in their continued linguistic study, and in other areas, such as language processing, augmenting dictionaries, improving search engines, analyzing the construction of sequences, and more."
