An introduction to ShareChat: Personalising content for Next Billion Users

ShareChat is an Indian social media startup that presents its interface and recommended content in 14 Indian languages. Here is a look at what makes us work.


  • ShareChat is a social medium made available in 14 Indian languages.
  • It is a platform for everyone to share content as text, images or videos.
  • The ShareChat data science team uses advanced machine learning technologies to learn about usage trends, in order to sharpen recommendations and build a more wholesome platform.
  • This article provides an introduction to ShareChat and to how it is using key technologies to shape its platform.

ShareChat, as many Indian users have already noticed, is a social network built to enable the next generation of India’s internet users. The application provides a platform for active discourse in vernacular languages. With the advent of highly affordable internet connectivity and wider network coverage, the Next Billion of the world are slowly coming online, and this creates the need to provide them with a platform to talk and share, crucially in their native language. Naturally, with such a task at hand, our data science team faces innumerable challenges in streamlining recommendations and continuously making the platform better.

The importance of data

At ShareChat, our data is fresh. With most users coming online for the first time, and with no precedent for their usage or their language, the data and usage patterns being formed are varied, diverse and immense. This leaves us with a huge amount of data to process in order to identify how internet behaviour varies across different ethnic groups, and how first-time users differ from long-time users. Our data science team also has the task of identifying finer elements, such as the tone of a statement in any vernacular language, and of processing data from video consumption.

In essence, we at ShareChat are witnessing many firsts, in terms of technology, usage and users as a whole. Every day, we learn along with our algorithms, and it is this that excites us to build stronger technologies to better develop our platform.

So what do we do?

The idea is for individuals to follow each other, along with personalities on our platform. This not only gives these personalities greater context with their local followers, but also gives them a ground-level check on local sentiments. It also allows individuals who have moved away from their homeland for work or other opportunities to stay connected to updates from their own social circle(s).

At present, over 70 million people access the ShareChat platform every month, which is no mean feat considering how short our span of operations has been so far. We believe that communication does not come in one specific form, and as a result, our platform today supports text updates, still photographs, GIFs and videos. Every day, we see over 1 million new posts across all languages and regions. When you add up all this information, you sit back and realise one thing: this is a staggering amount of data that we have in our hands. This is where our data science team comes in.

Data Science at ShareChat

We are building a social medium for a new generation of users, so our data science team goes through a massive, ever-changing amount of data. Users on our platform have diverse natures and opinions, and often there is no linearity or rhythm to the content that is shared or newly generated. The key for us is to train our AI and ML algorithms efficiently so that they can power several features. The first of these is our Trending Feed, where the algorithms constantly process data to dynamically decide which content to recommend.

This is closely linked to the content processing pipeline, which in turn draws on several tasks, such as judging the tonality of content, filtering sensitive content, assessing usage patterns and summarising region-based patterns of content consumption, to improve recommendations and build a more wholesome platform. The pipeline comprises several algorithms in the domains of Natural Language Processing, Computer Vision and Recommendation using Deep Neural Networks, working in tandem to serve the right content to the right users at the right time. These are broad topics in themselves, and we shall discuss them in more detail going forward.

Tying the knots

Originally published on January 4, 2019.
