With the growing amount of data on the Internet, the need for information retrieval systems, i.e. search engines, is undeniable. To search such huge volumes of data efficiently, these search engines use numerous techniques and algorithms. Perhaps the most famous example is the PageRank algorithm, which quantifies the importance of a webpage. However, it was soon realized that determining the importance of a webpage alone was not enough. It was equally important to determine its relevance (what exactly the user wants). Google pioneered a technique for this: personalization.
Personalization in the context of search engines means filtering and re-ranking search results based on a user’s interests and past behavior. For example, a programmer who searches for “C interview questions” most likely means the programming language C rather than the letter C. Personalization helps deliver relevant results by either interpreting the search query or re-ranking the search results according to a “personal profile”. Social media sites, online news portals and search engines build this profile from the searches a user carries out, the results he selects (i.e. which links he clicks on) and various other features or signals.
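The re-ranking idea can be illustrated with a minimal sketch. It is not how any real search engine works: the names `build_profile` and `rerank`, the word-count profile, the `weight` parameter and the base scores are all assumptions made up for illustration.

```python
from collections import Counter

def build_profile(clicked_titles):
    """Count how often each word appears in titles the user clicked on."""
    profile = Counter()
    for title in clicked_titles:
        profile.update(title.lower().split())
    return profile

def rerank(results, profile, weight=1.0):
    """Sort (title, base_score) pairs by base score plus a profile-overlap bonus."""
    total = sum(profile.values()) or 1  # avoid dividing by zero for an empty profile

    def personalized_score(item):
        title, base_score = item
        # Sum the profile counts of the distinct words in this title.
        overlap = sum(profile[word] for word in set(title.lower().split()))
        return base_score + weight * overlap / total

    return sorted(results, key=personalized_score, reverse=True)

# Hypothetical click history and base scores for the query "C interview questions".
history = ["C pointers tutorial", "C programming interview tips"]
results = [
    ("Words that start with the letter C", 0.9),
    ("C programming interview questions", 0.8),
]
profile = build_profile(history)
reranked = rerank(results, profile)
print([title for title, _ in reranked])
```

With this hypothetical history, the programming result moves ahead of the result about the letter C even though its base score is lower. With an empty profile, the original order is kept.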
Personalization has helped search engines cope with the challenges posed by exponentially growing data. According to TechCrunch, Facebook processes about 2.5 billion pieces of content and 500+ terabytes of data each day, and the situation is similar for Google, Amazon, Twitter and other leading corporations. Another similarity among these tech giants is that they use personalization techniques as a primary mechanism. However, personalization introduces a new problem of its own.
Author Eli Pariser aptly termed this problem the “Filter Bubble” in his book ‘The Filter Bubble: What the Internet Is Hiding from You’. It is a phenomenon in which a user always ends up getting results that resonate with his personal profile, thereby getting “trapped” inside an information bubble of his own making. Pariser argues that these filter bubbles show a user only what he wants to see, separating him from content that contrasts with his interests. In a TED Talk, Pariser gives an example of how these bubbles affect search results: he asked two of his friends to search for Egypt and send him the results. The outcome was surprising. One set of results focused on the political affairs of the country, while the other, in complete contrast, portrayed Egypt as a holiday destination.
Complex unsupervised algorithms form the framework for personalization. Chief among them is clustering, where the search engine uses various features (from a user’s past queries to the computer he uses) as input to group users into different clusters. This segregation of users creates different bubbles and hampers the exchange of opinions among them, thereby inhibiting “civic discourse”. How much distortion personalization brings to search results is often debated. Some argue that it affects only the lower-ranked results, while others argue that it also changes the ranking of the top results.
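To make the clustering step concrete, here is a minimal sketch using k-means. The choice of k-means is an assumption made for illustration (the text above does not say which clustering algorithm search engines use), and the user names, the two features and the naive initialization are all hypothetical.

```python
def kmeans(points, k, iterations=10):
    """Plain k-means on equal-length numeric tuples; returns one cluster label per point."""
    centroids = [points[i] for i in range(k)]  # naive init: the first k points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            labels[i] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels

# Hypothetical features per user:
# (share of programming queries, share of travel queries)
users = {
    "alice": (0.9, 0.1),
    "bob": (0.8, 0.2),
    "carol": (0.1, 0.9),
    "dave": (0.2, 0.7),
}
labels = kmeans(list(users.values()), k=2)
clusters = dict(zip(users, labels))
print(clusters)
```

In this toy data the two users who mostly search for programming topics end up in one cluster and the two who mostly search for travel in the other, which is the kind of segregation that gives each group its own bubble.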
Another troubling problem raised by the filter bubble is how users perceive the quality of personalized search. Because news sites, online vendors, social media and search engines maintain a user’s personalized profile and tailor results to it, users perceive the search results as being of high quality. At the same time, they are completely unaware of what these programmed gatekeepers hide from them.
Undeniably, personalization is necessary for efficient and relevant search results, but it promotes these filter bubbles. Some search engines, such as DuckDuckGo, market themselves as free from filter bubbles but cannot outperform search giants such as Google when it comes to the relevance of results. And even though the most popular search engine, Google, has an option to turn off personalized search results, not many users are aware of this feature. In the end, it’s a tradeoff between getting relevant search results and living inside an information bubble.
In today’s world of growing data, personalization is indispensable for efficient and relevant search. It has succeeded in bringing structure to web content and has unequivocally brought significant benefits to users. But at the same time, it seems to violate the ultimate goal of the Internet, i.e. freedom in the quest for knowledge and information.
Facebook’s CEO, Mark Zuckerberg, says:
“A squirrel dying in front of your house may be more relevant to your interests right now than people dying in Africa.” This is probably one of the guiding principles behind Facebook’s tailored news feed.
In a narrow sense, this may well be true, and it may in fact be what we want to hear, but at the same time we are missing out on what we need to hear.
The choice is ours, but there is a vast difference between what we need to hear and what we want to hear.