1 in 10 Twitter accounts posting spam, claims research

A mathematical model designed by GlobalData has estimated that around 10% of Twitter’s active accounts are posting spam content.

The data and analytics company notes that this is double Twitter’s reported figure, likely because of differing criteria for what counts as ‘spam’.

Sidharth Kumar, Senior Data Scientist at GlobalData, comments:

“What is or is not spam is suddenly an important discussion point for the social media platform, given that Elon Musk’s bid to take over Twitter is now on hold due to a disagreement on the proportion of spam accounts on the platform. Twitter claims that bot/spam accounts on Twitter represent less than 5% of accounts while Elon Musk’s team thinks otherwise.

“The precise proportion of spam accounts is difficult to compute, as it is almost impossible to confirm the identity of the entity behind a tweet handle. Additionally, the definition of a spam account may differ for everyone. Incessant tweeting of non-original content can be considered spam, but some may choose to see it as a very active user sharing articles/opinions.”

Keeping all this in mind, GlobalData’s mathematical model estimated the number of spam accounts using multiple parameters to provide a weighted score, which was then used to determine the classification of ‘spam’ or ‘non-spam’.

GlobalData chose these parameters by focusing on the differences in activity between typical spam accounts and the average Twitter user. Accounts performing poorly on many parameters received a higher score, indicating a higher probability of being spam. GlobalData analysts then independently reviewed handles at different score levels and agreed the cutoff for the classification (‘spam’ or ‘non-spam’) by consensus. The parameters used in the model were as follows:

Is the tweet handle verified? Verified handles are unlikely to indulge in spam.
Is the tweet coming from a third-party application? Tweets posted through third-party applications are more likely to be spam. Private Twitter API-based apps are often used for posting spam content.
How many tweets has the handle posted in total, divided by the number of days since its creation? Spam accounts typically post a very high number of tweets per day over their lifetime.
How frequent were the last 200 tweets? A very high number of tweets published over a short span of time is more likely to be spam.
What is the proportion of retweets in the last 200 tweets? Some spam accounts only retweet certain target accounts or topics on a regular basis.
Of the last 200 tweets, how many did not contain any hashtags or links? Spam accounts are unlikely to post plain-text content. They typically promote a certain link, tweet or hashtag.
What is the standard deviation of tweet length? Some spam accounts keep posting similar messages at high frequency, with little variation in the content or its length.
What is the median time between two tweets? Non-bot accounts typically have a higher median time between tweets.
What is the length of the description in the profile? Non-bot active accounts typically have more detailed bios.
Of the last 200 tweets, what proportion contained links? Spam accounts tend to share a lot of links on Twitter.
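
To make the scoring approach concrete, the parameters above can be sketched as a set of yes/no signals combined into a weighted score. This is an illustration only: GlobalData has not published its weights, thresholds or cutoff, so every number below, along with the names Tweet, Account, WEIGHTS, CUTOFF, spam_signals, spam_score and is_spam, is a hypothetical placeholder rather than the company's actual model.

```python
from dataclasses import dataclass
from statistics import median, pstdev
from typing import Dict, List


@dataclass
class Tweet:
    text: str
    timestamp: float        # seconds since epoch
    is_retweet: bool
    has_link: bool
    has_hashtag: bool
    via_third_party: bool   # posted through a third-party app


@dataclass
class Account:
    verified: bool
    bio: str
    total_tweets: int       # lifetime tweet count
    age_days: float         # days since the account was created
    recent: List[Tweet]     # up to the last 200 tweets, oldest first


# Hypothetical weights, one per parameter in the article's list.
WEIGHTS: Dict[str, float] = {
    "unverified": 0.5, "third_party": 1.0, "lifetime_rate": 1.5,
    "burst_rate": 1.5, "mostly_retweets": 1.0, "little_plain_text": 1.0,
    "uniform_length": 1.0, "short_gaps": 1.0, "thin_bio": 0.5,
    "mostly_links": 1.0,
}
CUTOFF = 5.0  # hypothetical; the article says analysts set theirs by consensus


def spam_signals(acct: Account) -> Dict[str, bool]:
    """Evaluate each parameter as a flag; all thresholds are assumptions."""
    tweets = acct.recent
    n = len(tweets)

    def share(flags) -> float:
        # Fraction of recent tweets for which the flag is true.
        return sum(flags) / n if n else 0.0

    gaps = [b.timestamp - a.timestamp for a, b in zip(tweets, tweets[1:])]
    span_hours = (tweets[-1].timestamp - tweets[0].timestamp) / 3600 if n > 1 else 0.0
    lengths = [len(t.text) for t in tweets]
    plain = share(not (t.has_link or t.has_hashtag) for t in tweets)

    return {
        "unverified": not acct.verified,
        "third_party": share(t.via_third_party for t in tweets) > 0.5,
        "lifetime_rate": acct.total_tweets / max(acct.age_days, 1.0) > 50,
        "burst_rate": span_hours > 0 and n / span_hours > 20,
        "mostly_retweets": share(t.is_retweet for t in tweets) > 0.9,
        "little_plain_text": n > 0 and plain < 0.1,
        "uniform_length": n > 1 and pstdev(lengths) < 10,
        "short_gaps": bool(gaps) and median(gaps) < 60,
        "thin_bio": len(acct.bio) < 20,
        "mostly_links": share(t.has_link for t in tweets) > 0.8,
    }


def spam_score(acct: Account) -> float:
    """Sum the weights of every signal the account triggers."""
    return sum(WEIGHTS[name] for name, hit in spam_signals(acct).items() if hit)


def is_spam(acct: Account) -> bool:
    return spam_score(acct) >= CUTOFF
```

In this sketch, an unverified account with an empty bio that retweets near-identical link posts every few seconds through a third-party app triggers most of the signals and lands above the cutoff, while a verified account posting varied plain-text tweets about once a day triggers none. Raising CUTOFF makes the classification more conservative, in the spirit of the estimate described in the article.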

Kumar continues:

“There were a few research pieces published earlier in the media looking at the followers of certain handles to estimate spam or bot proportions. We felt that the correct approach would be to analyze samples of live streams, as that is more indicative of Twitter activity.

“Our estimate is conservative, as we wanted to be sure that we were correctly identifying accounts as spam. It is important to note that this is still an estimation. There is no conclusive way of knowing if a certain account is a bot or spam.”

GlobalData’s research was performed as part of its Social Media Analytics Platform, which tracks the most relevant activity of industry influencers on Twitter and Reddit.

Jun 8, 2022 Chris Price

For the latest tech stories, go to TechDigest.tv

