Shakespeare or ChatGPT? Study finds humans prefer AI poetry over the real thing


by

Ungry Young Man

Large language models can generate natural-sounding text that reads as if a human wrote it, and in some cases it is hard to tell that an AI wrote it unless you are told. When a research team from the University of Pittsburgh studied how well people can distinguish AI-generated poetry from human-written poetry, and how they rate each, they found that people tended to prefer the AI-written poems.

AI-generated poetry is indistinguishable from human-written poetry and is rated more favorably | Scientific Reports
https://www.nature.com/articles/s41598-024-76900-1



Shakespeare or ChatGPT? Study finds people prefer AI over real classic poetry
https://phys.org/news/2024-11-shakespeare-chatgpt-people-ai-real.html

Two experiments were conducted in this study.

The first experiment involved 1,634 participants and used poems by 10 famous English-language poets: Chaucer, Shakespeare, Butler, Byron, Whitman, Dickinson, T.S. Eliot, Ginsberg, Plath, and Lasky. The research team prepared five poems by each poet and five poems generated by ChatGPT 3.5 in imitation of that poet's style. Participants read five real poems and five AI-generated poems in random order and judged whether each poem was written by a human or generated by an AI.

The participants' correct answer rate was 46.6%, below the 50% expected from random guessing. Even more interestingly, the AI-generated poems were more likely to be judged as 'written by a human' than the human poems. In particular, when the poems were ranked by the percentage of participants who judged them as 'written by a human,' the bottom five were all works by real poets.
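
As a rough check on how far 46.6% sits from chance, the figures above can be plugged into a simple normal approximation. The sketch below assumes that every participant judged all 10 poems and that the judgments are independent of one another, neither of which the article confirms; the function name is illustrative and is not taken from the study.

```python
import math


def z_score_vs_chance(accuracy: float, n_judgments: int, chance: float = 0.5) -> float:
    """Normal-approximation z-score of an observed accuracy against a chance rate."""
    # Standard error of a proportion under the null hypothesis (pure guessing).
    standard_error = math.sqrt(chance * (1 - chance) / n_judgments)
    return (accuracy - chance) / standard_error


# Figures reported in the article.
participants = 1634
poems_per_participant = 10  # five real poems + five AI-generated poems
observed_accuracy = 0.466

# ASSUMPTION: every participant completed all 10 judgments.
n_judgments = participants * poems_per_participant

z = z_score_vs_chance(observed_accuracy, n_judgments)
print(f"judgments: {n_judgments}, z-score vs. chance: {z:.2f}")
```

A strongly negative z-score under these assumptions would suggest that the below-chance accuracy is unlikely to be sampling noise. Because each participant contributed several judgments, the independence assumption is optimistic, so this is an illustration of the arithmetic and not a reproduction of the paper's analysis.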



In the second experiment, 696 participants were split into three groups: one told that the poems were all written by humans, one told that they were all generated by an AI, and one told nothing about the author. They were asked to rate some of the poems used in the first experiment on a 7-point scale for 14 characteristics, including quality, rhythm, imagery, sound, beauty, inspiration, and lyricism.

The AI-generated poems were rated higher than the human poems on 13 of the 14 characteristics, with originality the only exception. The researchers found that rhythm was particularly important. They also found that participants who were told that a poem was 'generated by an AI' tended to rate it lower, regardless of the actual author.



The research team speculates that these results may arise because the AI-generated poems are relatively easy to understand.

For example, among the AI-generated poems, the works that imitated Plath's style were about 'sadness,' the works that imitated Whitman's style were about 'natural beauty,' and the works that imitated Byron's style were about 'beautiful, sad women.' The works of real poets, on the other hand, require more complex and deeper interpretation. According to the research team's analysis, general readers who have not studied poetry professionally tend to prefer poems that are easy to understand, and because they do not expect an AI to be able to write poems they like this much, they mistakenly judged the poems they found favorable to be human-written.

The research team points out that as AI capabilities evolve rapidly, assumptions that held for earlier AI, such as 'AI should not be able to write poetry of this quality,' may no longer be valid for newer AI, and calls for regulations that require transparency regarding the use of AI.

in Software, Science, Posted by log1i_yk