It has emerged that TikTok's parent company ByteDance secretly used OpenAI's technology to build a ChatGPT competitor, and OpenAI swiftly suspended ByteDance's account.


Solen Feyissa

ByteDance, a major Chinese IT company, once became a global AI leader with the TikTok algorithm that finds videos matching users' tastes, but with the rise of OpenAI it has fallen behind in the field of generative AI. ByteDance, which is trying to make a comeback with its own generative AI 'Doubao', had been developing a large language model (LLM) under the name 'Project Seed' using OpenAI's API in violation of the terms of service, the IT news site The Verge reported based on internal documents.

ByteDance is secretly using OpenAI's tech to build a competitor - The Verge

According to internal ByteDance documents reviewed by The Verge, ByteDance relied on OpenAI's API at nearly every stage of developing its foundational LLM, codenamed 'Project Seed,' including model training and evaluation.

The Verge pointed out that such behavior is considered off-limits in the AI industry and clearly violates the terms of service of both OpenAI and Microsoft, through which ByteDance had purchased its access.

The terms of use in question are the 'Business Terms,' guidelines for corporate customers published by OpenAI. In them, OpenAI specifies that its API 'may not be used to develop artificial intelligence models that compete with our products or services.'

The Verge confirmed that employees were discussing how to 'whitewash' evidence through 'data desensitization' on ByteDance's internal tool, Lark. In addition, the misuse was so rampant that employees working on Project Seed regularly hit the API access limit.

ByteDance used the API freely in the early stages of Project Seed's development, but it is also known that in late 2023 the company strictly ordered the development team 'not to use GPT-generated text at any stage of model development.' It was also around this time that the company received regulatory approval to release Project Seed in China through its own chatbot, Doubao.

However, ByteDance continues to use OpenAI's API in ways that violate its terms of service, including to evaluate the performance of the models behind Doubao. A source with direct knowledge of ByteDance's internal affairs told The Verge, 'They say they want to make sure everything is legal, but they just don't want to get caught.'

For its part, ByteDance, when confronted with the internal documents by The Verge, explained that 'the data generated by GPT was used to annotate the Project Seed model in the early stages of development, and was removed from ByteDance's training data around mid-2023.' 'ByteDance has obtained a license from Microsoft to use GPT's API,' company spokesperson Jody Seth said in a statement. 'However, we use an in-house developed model for Doubao, which can only be used in China.'

Shortly after The Verge published this article, OpenAI notified The Verge that it had suspended ByteDance's account. OpenAI says that ByteDance's use of the API was minimal, but the account has been suspended pending an investigation. If the investigation finds that ByteDance is not following OpenAI's policies, OpenAI will either require ByteDance to change its usage practices or terminate the account.

According to The Verge, although it is not openly discussed, it is common practice for small and medium-sized companies to use proprietary AI models, especially OpenAI's, to help build competing products. Such use of APIs is considered a 'legal gray area' among small and medium-sized businesses, as OpenAI and Microsoft have so far not cracked down on violators.

However, all the people interviewed agreed that it is extremely unusual for a large company like ByteDance to take such a step. The Verge pointed out that this reflects how much pressure Project Seed's development team is under to 'deliver quickly.'

in Software, Posted by log1l_ks