How Bluesky, which has grown to 5.5 million users in the two years since its launch, was built



Bluesky has gained popularity as an alternative to Twitter (now X), reaching 5.5 million users in the two years since its release. It is built by a team of 14 to 15 engineers, and The Pragmatic Engineer, a newsletter for engineers, has published a report on how it has been built so far.

Building Bluesky: a Distributed Social Network (Real-World Engineering Challenges)

https://newsletter.pragmaticengineer.com/p/bluesky



Many social networking services, such as Mastodon and Threads, have emerged as alternatives to Twitter. Bluesky stands out in several ways: it is a decentralized social network where anyone can run their own server; its code is open source and published on GitHub; it gained 5 million users in the 12 months after its invitation-only beta was announced; and it started with a team of just three engineers, which had grown to only 14 to 15 people at the time of writing.

Development of Bluesky began in 2022. The first 10 months of the project, from January to October 2022, were spent mainly on research. Another major event also took place in 2022: Elon Musk's acquisition of Twitter.



A few days after Elon Musk's acquisition was announced, Bluesky began accepting sign-ups for the app's waiting list. Bluesky's mobile app was originally a prototype meant to verify that the protocol worked properly, and it was scheduled to be discarded once it had served that purpose. However, the news of Twitter's acquisition generated unexpected interest, and the mobile app, built by a single person, ended up becoming the production app.

The history of Bluesky can be divided into three phases as shown below. In 2022, various experiments were conducted, and in 2023, an invitation-only launch took place. In 2024, preparations were made for a public launch, and the architecture was updated to enable federation of decentralized networks.



In addition, the logo was redesigned for the official release. Unlike existing social media platforms such as Twitter, Instagram, TikTok, and YouTube, Bluesky does not lock users into a single website or app, but lets people freely choose which apps and websites to use. The butterfly logo was adopted as a symbol of that freedom and change.



In a presentation, Bluesky CEO Jay Graber talked about her vision for the company, and included the slide below.



The development of Bluesky was guided by the following principles: 'Excellent ease of use, scalability, and developer experience compared to existing SNS,' 'Integrate app development and protocol development,' and 'If an idea or design doesn't work, discard it immediately.'

In the early stages of development, the team decided to use AWS as the infrastructure and PostgreSQL as the database. Initially, the Bluesky architecture was consolidated into a single Personal Data Server (PDS).



The Bluesky development team split the services that were aggregated in the PDS into smaller parts and worked on modularizing the architecture for an open network. First, the feed generator service was split off, allowing third parties to generate feeds using their own algorithms.
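As a rough illustration of what "generating feeds using their own algorithms" can mean, the sketch below treats a feed algorithm as an interchangeable scoring function. It is not Bluesky's actual feed generator API: the `Post` type, `generate_feed`, the two sample algorithms, and the post identifiers are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Post:
    # Hypothetical post record; the fields are assumptions for illustration.
    uri: str
    text: str
    likes: int


# A feed algorithm is modeled here as a function that scores a post.
FeedAlgorithm = Callable[[Post], float]


def generate_feed(posts: List[Post], algorithm: FeedAlgorithm, limit: int = 10) -> List[str]:
    """Rank posts with the given algorithm and return the top post identifiers."""
    ranked = sorted(posts, key=algorithm, reverse=True)
    return [post.uri for post in ranked[:limit]]


def most_liked(post: Post) -> float:
    # One possible third-party algorithm: rank by like count.
    return float(post.likes)


def mentions_cats(post: Post) -> float:
    # Another possible algorithm: put posts that mention cats first.
    return 1.0 if "cat" in post.text.lower() else 0.0


POSTS = [
    Post("post/1", "Hello world", 3),
    Post("post/2", "My cat is asleep", 1),
    Post("post/3", "Shipping a release", 7),
]

if __name__ == "__main__":
    print(generate_feed(POSTS, most_liked, limit=2))
    print(generate_feed(POSTS, mentions_cats, limit=1))
```

Because the ranking logic is passed in as a function, swapping in a different algorithm does not require changing the service that stores the posts, which is the kind of separation the split described above is aiming for.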



Next, the team moved everything view-related to the 'Appview' service to avoid having to rely on other systems to send data to the web and mobile apps.



In addition, relays were introduced to prepare for the exchange of data in the event that hundreds or thousands of PDSs join the Bluesky network in the future.
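One way to picture the role of a relay is as a component that combines the event streams of many PDSs into a single ordered stream for downstream services to consume. The sketch below illustrates only that aggregation idea and is not how Bluesky's relay is implemented: the `Event` tuple layout, the `relay` function, and the PDS names are assumptions made for this example, and each input stream is assumed to be already sorted by timestamp.

```python
import heapq
from typing import Iterable, Iterator, Tuple

# Hypothetical event shape: (timestamp, pds_name, payload).
Event = Tuple[int, str, str]


def relay(*pds_streams: Iterable[Event]) -> Iterator[Event]:
    """Merge several per-PDS event streams into one stream ordered by timestamp.

    heapq.merge expects every input to be sorted already, and it merges lazily,
    so it does not need to hold all events in memory at once.
    """
    return heapq.merge(*pds_streams)


# Two made-up PDSs, each emitting events in timestamp order.
pds_a = [(1, "pds-a", "post"), (4, "pds-a", "like")]
pds_b = [(2, "pds-b", "post"), (3, "pds-b", "follow")]

if __name__ == "__main__":
    for event in relay(pds_a, pds_b):
        print(event)
```

With this shape, a consumer reads from one merged stream instead of connecting to every PDS individually, which matters more as the number of PDSs grows into the hundreds or thousands.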



The team also updated the architecture to v2 to achieve complete federation, set up federation internally, and confirmed that it was working properly.



Then, in February 2024, the team started crawling external federated servers.



The steps for setting up a PDS on your own are described in the article below.

I tried using 'PDS (Personal Data Server)', a mechanism to host Bluesky data on a proprietary server - GIGAZINE



Then, in March 2024, a moderation service called Ozone was added, allowing users to run their own moderation.



The overall architecture of Bluesky is as shown in the diagram below.



The team also ran into issues with scaling the database as the number of users increased, so they migrated the database from PostgreSQL to ScyllaDB and SQLite. The team said, 'We were glad to have used PostgreSQL in the early stages because we didn't know exactly how to query the data. Now that we understand the data and queries we need, we can create indexes in Scylla and provide an API.'



In terms of infrastructure, the company initially used AWS but switched to on-premises servers by June 2023, reducing costs to a fraction of what they had been.

According to Why, a technical advisor to the Bluesky development team, Bluesky has two data centers, one in the San Francisco Bay Area and the other in Brasilia. Each data center has 30 large servers, each running EPYC Genoa (the fourth generation of AMD's server CPU 'EPYC'). The team is also considering setting up a third data center in Japan. The following article summarizes what I heard from Why about Bluesky.

I asked the people at Bluesky everything I want to know right now, including 'Bluesky's ambitions,' 'Bluesky's monetization plan,' 'Bluesky's official server specifications,' 'Will the API ever become unusable?' and 'The relationship between Jack Dorsey and Bluesky' - GIGAZINE

in Web Service, Posted by log1d_ts