The social network follows a three-stage process to select the tweets it displays in the 'For You' section.
Twitter has shared some of its source code on GitHub, which now hosts two repositories containing part of the company's recommendation algorithm, with details on how the 'For You' section works.
Meanwhile, a leak of some of the social network's source code, disclosed by a user going by the name "FreeSpeechEnthusiast," is currently being investigated. After Twitter filed a copyright complaint, GitHub removed that content from its platform. Twitter then asked the courts to compel GitHub to reveal the identities of both the person who leaked the code and those who downloaded it.
This came just days after Twitter's owner and CEO, Elon Musk, said he would make the platform's algorithm public at the end of March, which officially happened this past Friday, March 31.
According to a statement posted on its blog, the platform released this content to "take the first step in a new era of transparency," after removing "any code that could undermine the security and privacy of the user."
The source code for parts of Twitter's operations, including the recommendation algorithm that chooses which tweets the social network presents to users in the "For You" section, is now available on GitHub in two new repositories, "main" and "ml".
On its blog, Twitter has explained how it builds recommendations and how it filters out the posts it deems relevant to suggest. The platform has also noted that it has provided further details about its recommendation algorithm there.
It is worth highlighting that, while opening this portion of its source code to the public, Twitter has chosen to exclude the part devoted to ad recommendations.
Twitter Recommendation Algorithm
The platform has explained how it selects the tweets that appear in each user's "For You" section.
Twitter clarified that its recommendation system "is made up of many interconnected services and jobs" and revealed the three-stage method it uses to filter this information.
It first gathers "the best tweets" from multiple recommendation sources, a process known as "candidate sourcing," and then ranks each one with a machine learning model.
Finally, it applies heuristics and filters to remove, among other things, posts the user has already seen and content marked as not safe for work (NSFW).
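The three stages described above can be sketched roughly as follows. This is a minimal illustration, not Twitter's actual code: every function and field name here is hypothetical.

```python
# Hypothetical sketch of the three-stage recommendation flow:
# candidate sourcing -> ML ranking -> heuristic filtering.

def recommend(candidate_sources, rank, filters, limit=1500):
    # Stage 1: candidate sourcing - gather "the best tweets" from many sources.
    candidates = []
    for source in candidate_sources:
        candidates.extend(source())

    # Stage 2: rank each candidate with a scoring model (here a stub).
    ranked = sorted(candidates, key=rank, reverse=True)[:limit]

    # Stage 3: heuristics and filters (already-seen tweets, NSFW, etc.).
    return [t for t in ranked if all(f(t) for f in filters)]

# Toy usage with stand-in data.
tweets = [{"id": 1, "seen": False, "nsfw": False, "score": 0.9},
          {"id": 2, "seen": True,  "nsfw": False, "score": 0.8},
          {"id": 3, "seen": False, "nsfw": True,  "score": 0.7}]
feed = recommend(
    candidate_sources=[lambda: tweets],
    rank=lambda t: t["score"],
    filters=[lambda t: not t["seen"], lambda t: not t["nsfw"]],
)
print([t["id"] for t in feed])  # only tweet 1 survives the filters
```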
Home Mixer, the service that builds the "For You" section, is based on Product Mixer, a framework for building content feeds that "acts as the software backbone" connecting the posts that may be candidates for this section with the scoring functions that rank them.
In the first stage of the process, the system pulls the best roughly 1,500 tweets from a pool of hundreds of millions, drawing on a variety of sources. To do so, it takes advantage of both the profiles a user follows and those they do not.
According to the social network, "the 'For You' timeline today consists on average of 50% in-network tweets and 50% out-of-network tweets, although this may vary from user to user."
Twitter has stated that the accounts a user follows are "the biggest candidate source" and that its goal is to display the most recent and relevant tweets from the people that user follows.
It then uses Real Graph, a machine learning model that estimates the likelihood that two users will interact: the higher the score between a user and a tweet's author, the more of that author's tweets are included.
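Real Graph's output can be thought of as an interaction probability. A minimal sketch of such a model, assuming a simple logistic regression over hypothetical engagement features (the article does not describe the real feature set or weights):

```python
import math

def interaction_probability(features, weights, bias=0.0):
    """Logistic model: probability that user A will interact with user B.

    `features` are hypothetical signals (e.g. past likes, replies, profile
    visits between the two users); the real Real Graph features differ.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Toy example: two users with some past engagement history.
p = interaction_probability(features=[3.0, 1.0, 0.5],   # likes, replies, visits
                            weights=[0.4, 0.8, 0.2],    # made-up weights
                            bias=-1.5)
print(round(p, 3))  # a probability between 0 and 1
```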
The platform has noted that it recently improved this part of the pipeline: it no longer uses the Fanout Service, a tool formerly used to pull in posts from a cache maintained for each user.
Users Not Being Followed
The social network has also revealed that it takes two approaches to incorporating posts from accounts the user does not follow into the "For You" suggestions.
The first takes into account the so-called social graph, answering questions such as which tweets the people a user follows have recently engaged with, and who likes posts similar to the ones the user likes.
Based on the answers to these questions, it generates candidate tweets and ranks them using a logistic regression model. To traverse this engagement data, Twitter built GraphJet, a graph processing engine that maintains a real-time interaction graph between users and tweets.
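The social-graph idea can be illustrated with a small sketch: surface the tweets that the accounts a user follows recently engaged with, ranked by how many of those accounts engaged. The data structures and names here are hypothetical; the real system traverses a real-time graph (GraphJet) rather than in-memory dicts.

```python
from collections import Counter

def social_graph_candidates(following, engagements, user, limit=10):
    """Sketch of social-graph candidate sourcing.

    `following` maps a user to the accounts they follow; `engagements`
    maps each account to the tweet ids it recently liked or replied to.
    Tweets engaged with by more followees rank higher.
    """
    counts = Counter()
    for followee in following.get(user, []):
        for tweet_id in engagements.get(followee, []):
            counts[tweet_id] += 1
    return [tweet for tweet, _ in counts.most_common(limit)]

# Toy example: alice follows bob and carol; both engaged with tweet 102.
following = {"alice": ["bob", "carol"]}
engagements = {"bob": [101, 102], "carol": [102, 103]}
print(social_graph_candidates(following, engagements, "alice"))  # 102 first
```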
The second approach through which Twitter proposes posts from unfollowed accounts is known as "embedding spaces," and it seeks to answer the question "Which tweets and users are close to my interests?"
These embeddings are numerical representations of users' interests and tweets' content; the system then computes how similar any two users, or any two tweets, are within this space.
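Similarity in an embedding space is typically measured with cosine similarity. A minimal sketch, using toy 3-dimensional vectors (real embeddings have far more dimensions, and the article does not specify which metric Twitter uses):

```python
import math

def cosine_similarity(a, b):
    """Similarity between two embeddings: close to 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy embeddings: a user's interests and two candidate tweets.
user_interests = [0.9, 0.1, 0.0]   # e.g. mostly "tech"
tweet_a = [0.8, 0.2, 0.1]          # a tech-leaning tweet
tweet_b = [0.0, 0.1, 0.9]          # an unrelated tweet
print(cosine_similarity(user_interests, tweet_a) >
      cosine_similarity(user_interests, tweet_b))  # the tech tweet is closer
```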
The process then moves on to the ranking stage, where the roughly 1,500 candidates that "may be relevant" are each given a score that predicts how relevant that post is to the user.
The ranking is done by a neural network with roughly 48 million parameters that is continuously trained to optimize for positive engagement on the platform. The system scores each tweet against ten labels, each reflecting the probability of a different kind of engagement with the post, and combines them into a single score.
Following this ranking, the platform applies several heuristics and filters that let it recommend more precisely and produce a more varied feed. Among other things, Twitter limits the number of consecutive tweets from a single account and removes suggestions from blocked accounts.
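The consecutive-author limit can be sketched as follows. This is one plausible implementation, not Twitter's: here, tweets beyond the cap are pushed to the end of the feed rather than dropped.

```python
def limit_consecutive_authors(tweets, max_consecutive=2):
    """Heuristic sketch: cap consecutive tweets from the same author.

    Tweets beyond the cap are deferred to the end of the feed, keeping
    the timeline varied without discarding candidates outright.
    """
    result, deferred = [], []
    streak_author, streak = None, 0
    for tweet in tweets:
        author = tweet["author"]
        streak = streak + 1 if author == streak_author else 1
        streak_author = author
        if streak > max_consecutive:
            deferred.append(tweet)
        else:
            result.append(tweet)
    return result + deferred

# Toy feed: three tweets in a row from author "a".
feed = [{"id": i, "author": a} for i, a in
        enumerate(["a", "a", "a", "b", "c"])]
print([t["id"] for t in limit_consecutive_authors(feed)])  # [0, 1, 3, 4, 2]
```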
In the final step, once the suggested posts have been selected, Home Mixer blends them with other content, such as ads and recommendations to follow other accounts, and delivers the result to the user's device.
After revealing this portion of the recommendation algorithm, Musk stated on his Twitter profile that "in the coming weeks" they will open up "everything that contributes to showing a tweet."
The company has also stated that it plans to develop new real-time capabilities, embeddings, and user representations as it expands its recommendation system.