In case you have been living under a rock, Elon Musk has paused his acquisition of Twitter because he believes a majority of accounts are bots. He challenged the Twitter community to help him figure out the percentage of bots on Twitter. This opportunity was too exciting for the BotNot team to pass up.
tl;dr: we estimate that 24–37% of daily active Twitter users (those that tweet) are bot accounts as of May 17, 2022. We arrived at this number by running a 24-hour “flash experiment” hackathon to bring some rigor and evidence to this viral conversation.
Elon Musk’s Approach:
Musk has his own ideas about quantifying the number of fake, spam, and duplicate accounts on Twitter. Last week, Musk said in a tweet that he would review “a random sample of 100 followers of @twitter.” He added later: “Ignore the first 1000 followers, then pick every 10th. I’m open to better ideas.” So, Elon, here’s a better idea: our Flash Experiment.
BotNot’s Flash Experiment aka Better Idea for Elon:
We carried out this experiment within 24 hours, calling it a “flash experiment.” We are happy to do a deep dive if you want to cover the resource expense (you know where to find us).
Our Tweet Sample
We took a sample of 500 tweets that were posted between 15:45:29 and 15:45:41 GMT on May 17, 2022. We included tweets in every language. This is a biased sample, but it is useful because it is recent and complete (all tweets published in those 12 seconds). More discussion on the bias later.
How We Classify
We used a tool called Botometer to classify the author accounts as bots. The classifier provides a score from 1–5, and we took all accounts with a score greater than 3 to be bots.
The classifier flagged 38% of the tweets in our sample as published from bot accounts.
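The thresholding step above can be sketched in a few lines. This is a minimal illustration, not our actual pipeline: the account ids and scores are made up, and the function names are hypothetical. It also shows why the share of flagged tweets and the share of flagged users can differ, since one account can post several tweets in the sample.

```python
BOT_THRESHOLD = 3.0  # accounts scoring strictly above this are treated as bots


def flagged_tweet_share(tweet_authors, author_scores, threshold=BOT_THRESHOLD):
    """Fraction of sampled tweets whose author scores above the threshold."""
    if not tweet_authors:
        raise ValueError("need at least one tweet")
    flagged = sum(1 for a in tweet_authors if author_scores[a] > threshold)
    return flagged / len(tweet_authors)


def flagged_user_share(tweet_authors, author_scores, threshold=BOT_THRESHOLD):
    """Fraction of distinct authors in the sample scoring above the threshold."""
    authors = set(tweet_authors)
    if not authors:
        raise ValueError("need at least one author")
    flagged = sum(1 for a in authors if author_scores[a] > threshold)
    return flagged / len(authors)


# Hypothetical data: one author id per sampled tweet, plus a score per author.
tweet_authors = ["a", "b", "b", "c", "d"]
author_scores = {"a": 1.2, "b": 4.1, "c": 3.0, "d": 3.6}

tweet_share = flagged_tweet_share(tweet_authors, author_scores)
user_share = flagged_user_share(tweet_authors, author_scores)
print(f"flagged tweets: {tweet_share:.0%}, flagged users: {user_share:.0%}")
```

In this toy sample, account "b" tweets twice, so it counts twice toward the tweet share but only once toward the user share.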
Botometer also classifies bot types through its Ensemble of Specialized Classifiers (ESC) algorithm (reference).
How Reliable Are These Predictions?
The Ensemble of Specialized Classifiers research paper reports 84% recall and 64% precision. Worst-case misclassification errors give a range of 24–44% bot tweets in our sample.
Ignoring sampling biases (treated below) and accepting a number of simplifying assumptions, this finding suggests that 24–44% of Twitter content is generated by bots.
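The post does not spell out the arithmetic behind this range, so the sketch below is one plausible reading rather than the exact calculation. It assumes the lower bound keeps only the flagged tweets that are true positives (flagged share times precision), and the upper bound assumes every flag is correct and scales up for missed bots (flagged share divided by recall). The function name is hypothetical.

```python
def bot_share_bounds(flagged_share, precision, recall):
    """Rough worst-case bounds on the true bot share given classifier error.

    Assumption (not from the original write-up):
      lower = only the true positives among flagged items
      upper = all flags correct, plus the bots the classifier missed
    """
    if not (0 < precision <= 1 and 0 < recall <= 1):
        raise ValueError("precision and recall must be in (0, 1]")
    lower = flagged_share * precision
    upper = min(1.0, flagged_share / recall)
    return lower, upper


lower, upper = bot_share_bounds(0.38, precision=0.64, recall=0.84)
print(f"bot tweet share between {lower:.1%} and {upper:.1%}")
```

Under these assumptions, the lower bound comes out at roughly 24%, matching the figure above, while the upper bound lands near 45% rather than 44%, so the original calculation may have differed slightly (for example, by starting from a slightly different flagged share).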
What about Time Bias?
This sample was taken at a specific time of day, which introduces a time-dependent bias to our sample. We ignore this bias and assume bot activity is proportional to human activity over time.
What about Selection Bias?
Because we wanted a complete sample of tweets, we chose to scrape all tweets in a short period of time. This sampling technique favors high-frequency tweeters. Research (Inuwa-Dutse et al, 2020) has shown that high-frequency tweeters are more likely to be bots — tweet frequency was the single most important predictor for bot activity.
We were more likely to sample tweets from bot users because they are more prolific. However, this also means they generate more content than genuine human users, so our estimate of the percentage of bot-generated content on the platform still holds.
Additional statistical work is needed to correct for the effect of this bias on our analysis.
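To make the selection effect concrete, here is a small sketch of how tweet-level sampling over-represents prolific accounts, and how one could back out an account-level share if posting rates were known. All numbers are hypothetical (we did not measure posting rates), and the function names are ours for illustration only.

```python
def bot_tweet_share(bot_account_share, bot_rate, human_rate):
    """Share of tweets written by bots, given the share of accounts that are
    bots and the average tweets per day for bots and humans."""
    bot_tweets = bot_account_share * bot_rate
    human_tweets = (1 - bot_account_share) * human_rate
    return bot_tweets / (bot_tweets + human_tweets)


def bot_account_share(tweet_share, bot_rate, human_rate):
    """Inverse of bot_tweet_share: recover the account-level share by
    down-weighting each tweet by its author group's posting rate."""
    bots = tweet_share / bot_rate
    humans = (1 - tweet_share) / human_rate
    return bots / (bots + humans)


# Hypothetical inputs: 15% of accounts are bots, and bots tweet 5x as often.
tweets = bot_tweet_share(0.15, bot_rate=10, human_rate=2)
accounts = bot_account_share(tweets, bot_rate=10, human_rate=2)
print(f"bot share of tweets: {tweets:.1%}, of accounts: {accounts:.1%}")
```

With these made-up rates, a 15% bot share among accounts turns into nearly half of all tweets, which is why a tweet-level sample says more about content than about users.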
Our General Thoughts about Bot Users on Twitter
A 2017 research paper (Varol et al) estimated that 9–15% of Twitter accounts exhibited “social bot behaviors.” That number has probably grown in the 5 years since. But how much?
Of the daily active users in our sample, 37% were classified as bots. With a lot of assumptions, we can extend our worst-case classification error range to daily active users and estimate that 24–37% of daily active Twitter users that tweet are bot accounts. We keep only the lower part of this range to hedge against the selection bias issue.
It’s unclear from our experiment how many daily active users who do not tweet are bot accounts.
A statistical analysis of our sample will help us refine this number further, but this is a first-order “back-of-envelope” estimate.
User monetization is why daily active users are so important to Twitter’s bottom line.
In the process of collecting data for this flash study, we hit a lot of API limits and prohibitive costs for research like ours.
We realized that Twitter’s premium APIs must be extremely lucrative. Bot users almost certainly bring more revenue to the platform than ordinary humans by paying for these APIs.
Even the most conservative extrapolations from our findings suggest this is a huge revenue stream for the platform.
“If you’re not paying for the product, then you are the product.” — Tristan Harris, Center for Humane Technology
Want to Add On? Here Are a Few Techniques for Bot Detection
The BotNot team did a lot of research in the last 24 hours. This section summarizes some of our findings for any readers who are curious about how bot classification works on Twitter and other social media networks.
Network Analysis / Graph Machine Learning
We use network science in our flagship product, so we wanted to know how network analysis applies to social bot detection. It turns out that there’s been a lot of research here!
WICO is a 2021 open dataset and study on misinformation on Twitter (Pogorelov, et al.) that focuses on the spread of COVID-19 and 5G conspiracy theories. Humans and bots work in concert to spread misinformation, so the data isn’t as useful for strict bot classification. However, the networks in this dataset are a rich data source for understanding the Twitter network.
More advanced techniques like graph neural networks offer even more powerful ways to take advantage of network structure.
Typically, genuine users and communities form more organic graph structures, whereas bots form more artificial networks with master nodes and/or “rings.” For example, a group of 200 bot accounts may all follow each other to boost each bot’s follower count. Or thousands of bot accounts may follow a single influencer and nobody else (to boost that influencer’s follower count). These signatures are very hard to see on a single bot user’s page but become clear when we trace the follower network out from that user’s account.
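The two signatures described above can be illustrated with a toy follow graph. This is a deliberately simple heuristic sketch in plain Python, not graph machine learning and not a production detector. The graph, the account names, and the thresholds (`min_following`, `min_reciprocity`) are all hypothetical.

```python
def reciprocity(follows, user):
    """Fraction of the accounts `user` follows that follow `user` back."""
    following = follows.get(user, set())
    if not following:
        return 0.0
    mutual = sum(1 for other in following if user in follows.get(other, set()))
    return mutual / len(following)


def ring_candidates(follows, min_following=2, min_reciprocity=1.0):
    """Accounts whose follows are (almost) all reciprocated: a 'ring' signal."""
    return {
        user
        for user, following in follows.items()
        if len(following) >= min_following
        and reciprocity(follows, user) >= min_reciprocity
    }


def single_target_followers(follows, target):
    """Accounts that follow `target` and nobody else: a 'booster' signal."""
    return {user for user, following in follows.items() if following == {target}}


# Toy graph: user -> set of accounts that user follows (all names made up).
follows = {
    "bot1": {"bot2", "bot3"},
    "bot2": {"bot1", "bot3"},
    "bot3": {"bot1", "bot2"},
    "fan1": {"influencer"},
    "fan2": {"influencer"},
    "alice": {"bob", "influencer"},
    "bob": {"alice"},
    "influencer": set(),
}

print(sorted(ring_candidates(follows)))
print(sorted(single_target_followers(follows, "influencer")))
```

Real accounts also have mutual follows, so signals like these would be inputs to a classifier rather than verdicts on their own.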
This approach is by far the most effective way to classify bots. Twibot-20 (Feng, et al.) uses graph ML to vastly improve the classification performance of bot detection systems.
Getting this graph data is quite expensive due to Twitter’s API, although the data itself is sparse (just lists of user_ids). It would be very cheap to export at scale for public study and the public good. Twitter has decided to hide this data behind an expensive paywall that keeps curious researchers like us away.
Language & Lexical Approaches
A more traditional and common approach to bot classification uses language features.
This is typically easier to apply because a system only needs to pull a few tweets for a user to make a judgment (compared to scanning thousands of follower lists for the network approach).
Lexical Analysis of Automated Accounts on Twitter (Inuwa-Dutse et al, 2020) is a great modern benchmark and review of this type of approach.
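As a rough illustration of what language features can look like, the sketch below computes three simple statistics from a handful of tweets per account. The specific features (type-token ratio, URL fraction, duplicate fraction) and the example tweets are our own illustrative choices, not the feature set of the paper cited above.

```python
import re

URL_RE = re.compile(r"https?://\S+")


def lexical_features(tweets):
    """Compute simple lexical statistics over one account's tweets."""
    if not tweets:
        raise ValueError("need at least one tweet")
    # Strip URLs and normalize case so near-identical spam collapses together.
    texts = [URL_RE.sub(" ", t).lower().strip() for t in tweets]
    words = [w for t in texts for w in re.findall(r"\w+", t)]
    return {
        # Vocabulary richness: unique words divided by total words.
        "type_token_ratio": len(set(words)) / len(words) if words else 0.0,
        # Share of tweets that contain at least one link.
        "url_fraction": sum(1 for t in tweets if URL_RE.search(t)) / len(tweets),
        # Share of tweets that repeat an earlier tweet once URLs are removed.
        "duplicate_fraction": 1 - len(set(texts)) / len(texts),
    }


# Made-up examples of a repetitive, link-heavy account and a varied one.
spammy = lexical_features([
    "Win a free phone http://x.co/1",
    "Win a free phone http://x.co/2",
    "Win a free phone http://x.co/3",
])
varied = lexical_features([
    "Coffee first, then the meeting",
    "That game last night was wild",
    "Anyone have a good book recommendation?",
])
print(spammy)
print(varied)
```

A classifier would take features like these as inputs alongside many others; no single one of them separates bots from humans.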
The best bot detection systems use a combination of network and language features to separate genuine users from bot accounts.
We are all frustrated by social media bots, and there’s evidence that suggests they are leading to more polarization and erosion of trust in our society. Social media platforms like Twitter do not have an incentive to remove them, as bots are designed to drive engagement and pay handsomely to use the platform through premium APIs as a backdoor advertising channel.
At BotNot, we focus on understanding bots and their negative impacts on our lives, business metrics, our supply chain, KYC, and so much more. Huge thanks to the team at Botometer for years of work in building the product that powered this flash experiment!
A deeper look at the tweets in our sample.