Scientists are studying your Twitter slang to help AI

Mathematicians studied 100 billion tweets to help computer algorithms better understand our colloquial digital communication.

  • A group of mathematicians from the University of Vermont used Twitter to examine how young people intentionally stretch out words in text for digital communication.
  • Analyzing the language in roughly 100 billion tweets generated over eight years, the team developed two measurements to assess patterns in the tweets: balance and stretch.
  • The way people stretch words is not arbitrary: it follows patterns in which part of the word is stretched and how far it is stretched.

What? whaat. WHAT? Whaaaattt?

While all of the above are expressions of confusion, you understand them to mean slightly different things. That's because you imagine how each word would sound, based on which letters are repeated or emphasized. The meaning embedded in our vernacular, slang, and deliberately misspelled words is how we lace our digital communication with human emotion.

That emotional nuance has proved to be one of the major challenges for language-processing artificial intelligence. But scientists are trying, and they're studying our Twitter lingo to bring computers up to speed on how humans really communicate.

Balance and Stretch

Over the last two decades, social media has provided scientists with a trove of free information about human behavior and language. A group of mathematicians from the University of Vermont used Twitter to examine how young people intentionally stretch out words in text for digital communication. They created a method to quantify the semantic nuances between a word and its stretched versions, like "right" vs. "riiiiiight," with the aim of teaching future AI algorithms human digital colloquialisms.

"Written communication has recently begun encoding new forms of expression, including the emotional emphasis delivered by stretching words out," said Chris Danforth, professor of Mathematics & Statistics in the Vermont Complex Systems Center and member of the research team behind the study.

In their study, published last week in the journal PLOS One, the team analyzed the language in roughly 100 billion tweets generated from 2008 to 2016. They developed two measurements to assess patterns in the tweets: balance and stretch. For example, "hahahaha" would be considered a stretched word high on balance, while a term like "wtffffff" has stretch but little balance, as only one letter, "f", contributes to the stretchiness. The effect is to put emphasis on the word abbreviated by the letter "f".
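
To make the two measurements more concrete, here is a minimal Python sketch. It is an illustration only, not the researchers' actual formulas, which this article does not spell out. The function name `stretch_and_balance` and both definitions are assumptions made for the example: stretch is counted as the number of extra letters beyond a base spelling, and balance is the normalized Shannon entropy of how those extra letters are spread across the base word's distinct letters.

```python
from collections import Counter
from math import log


def stretch_and_balance(base: str, stretched: str) -> tuple[int, float]:
    """Return (stretch, balance) for a stretched spelling of `base`.

    Illustrative definitions only (assumptions, not the study's formulas):
      stretch: number of extra occurrences of the base word's letters.
      balance: normalized Shannon entropy (0 to 1) of how those extras
               are spread across the base word's distinct letters.
    """
    base_counts = Counter(base.lower())
    word_counts = Counter(stretched.lower())
    # Extra occurrences of each base letter; letters not in `base` are ignored.
    extras = [max(word_counts[ch] - n, 0) for ch, n in base_counts.items()]
    stretch = sum(extras)
    # No stretching, or only one distinct letter: balance is set to 0 here.
    if stretch == 0 or len(extras) < 2:
        return stretch, 0.0
    probs = [e / stretch for e in extras if e > 0]
    entropy = sum(p * log(1 / p) for p in probs)
    # Divide by the maximum possible entropy so the result lies in [0, 1].
    return stretch, entropy / log(len(extras))


for base, word in [("ha", "hahahaha"), ("wtf", "wtffffff"), ("goal", "gooooooaaaalll")]:
    print(word, stretch_and_balance(base, word))
```

Under these assumed definitions, "hahahaha" gets a balance of 1.0 because the extra letters are split evenly between "h" and "a", while "wtffffff" gets a balance of 0.0 because every extra letter is an "f". A word like "gooooooaaaalll" lands in between.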

"With so much communication happening electronically these days, we're all trying to find ways to convey emotion through text. Emojis are helping, but the visual effect of 30 consecutive vowels in a curse word turns a bland profanity into a form of art," Danforth said.

Interestingly, the use of elongated words was found across languages. For example, "kkkkkkk" signifies laughter in Brazilian Portuguese while "wkwkwkwkwkwk" expresses it in Indonesian, according to the researchers.

Beyond the dictionary 

Ultimately, this project could help artificial intelligence algorithms grasp the meaning carried by the idiosyncratic variations in our text and by other linguistic symbols, such as punctuation and emojis.

Dictionary definitions hardly reflect the way that we actually communicate with one another digitally. What the researchers found, though, is that the way people stretch words isn't arbitrary. Rather, it follows patterns in which part of the word is stretched and how far it is stretched. Colloquial digital language is, after all, a system of symbols, and for it to transfer meaning we must all be "in" on the patterns.

This research suggests that understanding the stretched words used on social media opens more doors to helping AI better understand our slang. The team also developed tools and methods that could be useful in future studies, for example investigations of intentional mistypings and misspellings.

What benefits come from AI algorithms better understanding our digital lingo? For one, it's possible that new tools could be applied to improve natural language processing, search engines, and spam filters.

"We were able to comprehensively collect and count stretched words like 'gooooooaaaalll' and 'hahahaha'," the researchers said in a press release, "and map them across the two dimensions of overall stretchiness and balance of stretch, while developing new tools that will also aid in their continued linguistic study, and in other areas, such as language processing, augmenting dictionaries, improving search engines, analyzing the construction of sequences, and more."
