LongWriter AI breaks 10,000-word barrier, challenging human authors

August 15, 2024 6:00 AM

Credit: VentureBeat made with Midjourney

Researchers at Tsinghua University in Beijing have created a new artificial intelligence system that can produce coherent texts of more than 10,000 words, a significant advance that could transform how long-form writing is approached across various fields.

The system, described in a paper titled “LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs,” tackles a persistent challenge in AI: generating lengthy, high-quality written content. The development could have far-reaching implications for tasks ranging from academic writing to fiction, potentially changing how long-form content is produced.

The research team, led by Yushi Bai, discovered that an AI model’s output length directly correlates with the length of texts it encounters during training. “We find that the model’s effective generation length is inherently bounded by the sample it has seen during supervised fine-tuning,” the researchers explain. This insight led them to create “LongWriter-6k,” a dataset of 6,000 writing samples ranging from 2,000 to 32,000 words.

By training their model on this long-output data, the team scaled its maximum output length from around 2,000 words to over 10,000 words. Their 9-billion-parameter model outperformed even larger proprietary models in long-form text generation tasks.
