Security experts from Duo Security have developed a collection of open source tools and disclosed techniques that can be useful in identifying large Twitter botnets.
The experts developed the tools starting from the analysis of 88 million Twitter accounts and over half-a-billion tweets, one of the largest random datasets of Twitter accounts analyzed to date.
“This paper details the techniques and tools we created to both build a large dataset containing millions of public Twitter profiles and content, as well as to analyze the dataset looking for automated accounts,” reads the research paper published by Duo Security.
“By applying a methodical data science approach to analyzing our dataset, we were able to build a classifier that effectively finds bots at a large scale.”
The dataset was built using Twitter’s API; the collected records include profile name, tweet and follower counts, avatar, bio, the content of tweets, and social network connections.
Practical data science techniques can be used to create a classifier that helps researchers find automated Twitter accounts.
The experts defined 20 unique account heuristics to discover the bots. They include the number of digits in a screen name, the entropy of the screen name, the followers/following ratio, the number of tweets and likes relative to the account’s age, the number of users mentioned in a tweet, the number of tweets with the same content, the percentage of tweets with URLs, the time between tweets, the average hours tweeted per day, and the average “distance” of account age in retweets/replies.
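To make a few of these account-level heuristics concrete, here is a minimal Python sketch of three of them: digits in the screen name, entropy of the screen name, and the followers/following ratio. It is not Duo Security's code. The function names are hypothetical, and Shannon entropy over characters is an assumption about how the entropy heuristic might be computed.

```python
import math
from collections import Counter


def digit_count(screen_name: str) -> int:
    """Count the digits in a screen name."""
    return sum(ch.isdigit() for ch in screen_name)


def shannon_entropy(screen_name: str) -> float:
    """Shannon entropy (bits per character) of the screen name.

    Assumption: character-level Shannon entropy; the article only
    says "entropy of the screen name".
    """
    if not screen_name:
        return 0.0
    counts = Counter(screen_name)
    total = len(screen_name)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def follower_ratio(followers: int, following: int) -> float:
    """Followers divided by following, guarding against division by zero."""
    return followers / max(following, 1)
```

A name with many digits and high entropy, combined with a very low followers/following ratio, would only be a weak signal on its own; the article describes these values as inputs to a classifier, not as a verdict.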
The above heuristics are organized into three categories: “Account attributes,” “Content,” and “Content Metadata.”
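The “Content” and “Content Metadata” heuristics can be illustrated in the same spirit. The sketch below computes the fraction of tweets containing a URL, the number of repeated tweets, and the mean time between tweets. The helper names, the simple substring test for URLs, and the case-insensitive duplicate matching are all illustrative assumptions, not details taken from the paper.

```python
from collections import Counter
from datetime import datetime
from statistics import mean


def url_fraction(tweets: list[str]) -> float:
    """Fraction of tweets whose text contains an http(s) link."""
    if not tweets:
        return 0.0
    with_url = sum("http://" in t or "https://" in t for t in tweets)
    return with_url / len(tweets)


def duplicate_count(tweets: list[str]) -> int:
    """Number of tweets that repeat an earlier tweet's text.

    Assumption: texts are compared after trimming and lowercasing.
    """
    counts = Counter(t.strip().lower() for t in tweets)
    return sum(n - 1 for n in counts.values() if n > 1)


def mean_gap_seconds(timestamps: list[datetime]) -> float:
    """Average number of seconds between consecutive tweets."""
    if len(timestamps) < 2:
        return 0.0
    ordered = sorted(timestamps)
    return mean((b - a).total_seconds() for a, b in zip(ordered, ordered[1:]))
```

Very regular gaps, a high share of links, and many repeated texts are the kind of patterns these content features are meant to surface.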
The tools and techniques devised by the researchers could be very useful in investigating fraudulent activities associated with Twitter botnets. The experts first identify the automated bots, then use the tools to monitor the evolution of the botnets they belong to.
The experts shared a case study related to the discovery of a sophisticated botnet of at least 15,000 bots involved in a cryptocurrency scam. The analysis of the botnet and the monitoring of the malicious infrastructure over time allowed the experts to discover how bots evolve to evade detection.
The experts reported their findings to Twitter, which confirmed that it is aware of the problem and is currently working on implementing new security measures to detect problematic accounts.
“Twitter is aware of this form of manipulation and is proactively implementing a number of detections to prevent these types of accounts from engaging with others in a deceptive manner. Spam and certain forms of automation are against Twitter’s rules. In many cases, spammy content is hidden on Twitter on the basis of automated detections,” replied Twitter.
“When spammy content is hidden on Twitter from areas like search and conversations, that may not affect its availability via the API. This means certain types of spam may be visible via Twitter’s API even if it is not visible on Twitter itself. Less than 5% of Twitter accounts are spam-related.”
Duo Security will release its tools as open source on August 8 during the Black Hat conference in Las Vegas.
“Malicious bot detection and prevention is a cat-and-mouse game,” concluded Duo Principal R&D Engineer Jordan Wright. “We anticipate that enlisting the help of the research community will enable discovery of new and improving techniques for tracking bots. However, this is a more complex problem than many realize, and as our paper shows, there is still work to be done.”