A study released by AWS this week indicates that 57% of the text content on the internet has already been generated by AI. According to the research, most of what is published on the web consists of translations produced by machine translation (MT), a branch of artificial intelligence focused on translating text. This is not only bad for users and creators; it also hinders the training of generative AIs.
Since large language models (LLMs) depend on human-written, expert content to deliver accurate information, a web filled with AI-replicated texts and machine translations degrades the performance of generative AIs.
The AWS study points out that many of these translations are flawed because they start from poorly written source texts. As a result, the translated material delivers incorrect or low-quality information to users. On top of that, there is the problem of LLMs "recycling" this content during training: it is AI training AI, almost a pyramid scheme.
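To see why this recycling is a problem, consider a toy simulation (a hedged sketch, not something from the AWS study): a simple "model" is fit to data, generates synthetic data, and is then refit on its own output. The numpy-based example below, whose sample sizes and generation counts are made-up illustrative values, shows how statistical diversity tends to decay when each generation learns only from the previous generation's output, the same feedback loop described above.

```python
# Toy "AI training on AI" loop (illustrative assumptions throughout):
# fit a simple model to data, sample synthetic data from the fit, refit
# on that synthetic data, and repeat. Finite samples and estimator bias
# make the spread of the data shrink on average, so each generation
# preserves less of the original diversity.
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(loc=0.0, scale=1.0, size=50)  # stand-in for human-written text

for generation in range(20):
    mu, sigma = data.mean(), data.std()
    print(f"generation {generation:2d}: mean={mu:+.3f}, spread={sigma:.3f}")
    # The next "model" never sees the original data, only synthetic
    # samples drawn from the previous generation's fit.
    data = rng.normal(loc=mu, scale=sigma, size=50)
```

The spread tends to wander away from the original value of 1.0 and, on average, shrinks. Real LLMs are vastly more complex, but the intuition carries over: each generation inherits the errors and blind spots of the one before it.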
Response quality drops over time
The research indicates that the quality and accuracy of responses generated by LLMs decline over time as this feedback loop compounds. If you think Google's AI suggesting that you put glue on your pizza is bad, just wait a few more years.
For those who use ChatGPT, Gemini, Copilot, or another AI for simpler tasks, this drop in quality may go unnoticed. Even so, in late 2023 and early 2024 there was the case of ChatGPT's "laziness," and some readers have already complained about a noticeable drop in the quality of generative AIs.
The AWS study suggests a way to mitigate the problem: technologies that detect material generated by machine translation (MT). Unlike basic translators, which translate practically word for word, MT systems use AI to evaluate the context of the text.
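To make the idea concrete, here is a minimal sketch of one plausible detection signal: "multi-way parallelism," where the same sentence appears translated into several languages at once, a pattern associated with mass machine translation rather than human authorship. The sketch uses the open-source sentence-transformers library and its multilingual model paraphrase-multilingual-MiniLM-L12-v2 (both real); the similarity threshold, the three-language cutoff, and the likely_machine_translated helper are illustrative assumptions, not the study's actual method.

```python
# Hypothetical sketch: flag "multi-way parallel" content, i.e. the same
# sentence published in many languages at once. The model and library are
# real; the threshold and the language-count rule are assumptions.
from sentence_transformers import SentenceTransformer, util

# Multilingual model: translations of the same sentence land close together
# in embedding space regardless of language.
model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def likely_machine_translated(candidate: str,
                              other_language_versions: list[str],
                              sim_threshold: float = 0.85,
                              min_parallel_languages: int = 3) -> bool:
    """Heuristic: if `candidate` has near-identical counterparts in many
    languages, it is probably part of a mass machine-translation pipeline."""
    embeddings = model.encode([candidate] + other_language_versions)
    # Cosine similarity of the candidate against every other-language version.
    sims = util.cos_sim(embeddings[0:1], embeddings[1:])[0]
    parallel_count = int((sims > sim_threshold).sum())
    return parallel_count >= min_parallel_languages

# Example: one English sentence plus versions scraped from "the same" page
# in other languages (all strings here are made up for illustration).
english = "The new update improves battery life on most devices."
others = [
    "La nueva actualización mejora la duración de la batería en la mayoría de los dispositivos.",
    "La nouvelle mise à jour améliore l'autonomie de la batterie sur la plupart des appareils.",
    "Das neue Update verbessert die Akkulaufzeit auf den meisten Geräten.",
]
print(likely_machine_translated(english, others))
```

A production detector would weigh many more signals, such as translation quality and page-level metadata, but the embedding-similarity check captures the core intuition: human authors rarely publish the exact same sentence in many languages simultaneously.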
With information from Windows Central.