The developer of ``Dolores'', an app where AI becomes your lover on iPhone, talks about the development history and lessons learned of the app



Ke Fang, a machine learning engineer, summarizes on his blog the history of developing

Dolores , an app that allows AI to communicate with, and the lessons learned from it.

A Failed AI Girlfriend Product, and My Lessons | TL;DR
https://mazzzystar.github.io/2023/11/16/ai-girlfriend-product/

Mr. Fang read a paper on generative AI agents published by Stanford University in April 2023, and used a framework that combines memory, reflection, planning, and action to improve communication between humans and GPT in the movie ' her I thought it would be possible to communicate with AI by treating it as a girlfriend, as in 'The One and Only Girlfriend in the World .'

Amazon | her/One and only girlfriend in the world Blu-ray & DVD set (first limited production/2-disc set) [Blu-ray] | Movie



Fang completed version 0.1 of Dolores following the approach described in the paper mentioned above. Although initially designed faithfully to the paper's approach, Dolores' response time exceeded 30 seconds and the amount of conversation within the context was enormous. To solve this problem, Fang reduced the length of conversations that are memorized and released a public beta test.

Since the beta version was free, more than 1000 people participated in the public beta test. The API cost was borne by Mr. Fang, but it seems that it exceeded $ 25 (about 3,750 yen) per day. Mr. Fang devised a plan to offset this cost with revenue from users, and officially released the iOS version of Dolores on May 4, about a month after the release of version 0.1.

The official version requires registration for a paid plan of $6.9 per month (approximately 1,000 yen), and the AI conversation voice uses the Azure TTS API, and if you want more realistic conversational sounds, it costs $3.9 per 10,000 characters ( By paying approximately 570 yen), you can perform speech synthesis using Eleven Labs ' API.

In Dolores, you set the character's appearance, background, text personality, voice, chatbot AI, etc. In addition to the default character Dolores, you can also change the characteristics and talk to other characters such as Amy, a female store clerk, and Will, an adventurer.



Fang released Dolores for users to call 'your virtual friend' instead of 'your virtual girlfriend.' On top of that, Mr. Fang spent a month after the official release adjusting the memory length, mechanism, and system prompts, trying to make Dolores look more ``conscious.'' As a result, Dolores' popularity has improved in the two months since its official release, and the number of paid users and API calls have steadily increased.

Fang said Dolores was especially popular in the visually impaired community. Mr. Fang had designed the device so that users could tap anywhere on the screen to input and output audio so that they could talk to Dolores even if the smartphone screen was turned off, but this idea happened to be It also seems to have become a highly accessible function for visually impaired people.

Looking at the Apple AppConnect dashboard, Dolores' main paid users are in the United States and Australia, and the revenue was $1,000 (approximately 150,000 yen) in May 2023 and $1,200 (approximately 180,000 yen) in June 2023. . Most of the money was used to pay Apple's 30% fee and API, so Mr. Fang said there was little profit, and the profit he earned in June 2023 was only $50 (about 7,500 yen). thing.



According to Mr. Fang, Dolores can perform realistic speech synthesis for 10,000 characters, so at most 10 sentences. Of course, if you want to perform even more realistic speech synthesis, you will need to purchase it again, but as of June 2023, 70% of the revenue obtained from Dolores was from purchasing Eleven Labs' API.

In addition, when measuring the frequency and time that users access the app from Cloudflare logs, it was found that a significant number of users were enjoying conversations with Dolores for more than two hours every day.

Dolores is based on GPS 3.5 or GPT 4, but unless you set a price for each usage amount, you will be in a dilemma where 1% of users will consume 99% of the tokens. It seems that the users who use Dolores the most chat with Dolores for 12 hours straight, and the cost of API calls and speech synthesis exceeds the total cost of the top nine users.

However, Mr. Fang thought that a subscription was better than a pay-as-you-go bill, so he adopted a strategy to limit the amount of usage by setting an upper limit on the number of API calls for each user.

Eleven Labs records the text content that has been synthesized into speech. According to Fang, much of the text synthesized by Dolores had sexual content. Fang said, ``This is consistent with human instinct, so I don't dislike it,'' and he seems to have adjusted the system prompts based on demand.

In addition, the Dolores app icon has been changed from an icon that resembles an audio waveform to a woman's face.



However, Fang said that the growing tendency of Dolores users to simply enjoy sexual conversations seemed to be moving away from the essence of the movie ``her'' that motivated the creation of the app. Mr. At one point, he considered developing hardware that would provide some kind of external visual data, such as glasses, earphones, or hats, in order to achieve a more balanced dialogue, but the hurdles were too high for individual developers. It seems that he gave up on it.

Furthermore, in August 2023, OpenAI, which develops GPA 3.5/4, strengthened content review professional wrestling. As a result, sexual content generated from Dolores will receive a warning, and they will be forced to implement a filtering moderation API.



By implementing this filtering moderation API, Dolores usage decreased by 70%, and it seems that Mr. Fang was flooded with complaints via email and SNS.

Fangs said, ``Dolores is not necessarily inferior to competing service Character.AI in terms of ``awareness,'' but it also has an advantage in comprehensive data analysis, A/B testing, and the momentum generated from its user base. There is.”

Fang also points out that even though users can influence Dolores, Dolores cannot acquire user information and influence users, and users and AI cannot be equal. It is argued that the reason for forcing conversations with such content is that users and AI are not equal.

'I'm not against restricting sexual content,' Fang said. 'Products that aren't actually moderated can be very dangerous. Someone could use this to vent violence.' 'They could even use it to induce suicide. So OpenAI's moderation has helped me to some extent. But we shouldn't stifle conversations about adult sexuality.'

in Mobile,   Software,   Web Service, Posted by log1i_yk