Can search engines detect AI content? - Search Engine Land

The AI tool explosion of the past year has dramatically impacted digital marketers, especially those in SEO. Given content creation’s time-consuming and costly nature, marketers have turned to AI for assistance, with mixed results. Ethical issues notwithstanding, one question that repeatedly surfaces is, “Can search engines detect my AI content?” The question is deemed particularly important because if the answer is “no,” it invalidates many other questions about whether and how AI should be used.

A long history of machine-generated content

While the scale of machine-generated or -assisted content creation is unprecedented, it’s not entirely new, and it’s not always negative. Breaking stories first is imperative for news websites, and they have long used data from sources such as stock markets and seismometers to speed up content creation.

For instance, it’s factually correct to publish a robot-written article that says: “A [magnitude] earthquake was detected in [location, city] at [time]/[date] this morning, the first earthquake since [date of last event]. More news to follow.”

Updates like this are also helpful to end readers, who need this information as quickly as possible.

At the other end of the spectrum, we’ve seen many “blackhat” implementations of machine-generated content. For many years, Google has condemned everything from using Markov chains to generate text to low-effort content spinning, under the banner of “automatically generated pages that provide no added value.”

What is particularly interesting, and mostly a point of confusion or a gray area for some, is the meaning of “no added value.”

How can LLMs add value?

The popularity of AI content soared thanks to the attention garnered by GPTx large language models (LLMs) and the fine-tuned AI chatbot ChatGPT, which made conversational interaction far more accessible. Without delving into technical details, there are a couple of important points to consider about these tools:

The generated text is based on a probability distribution

For instance, if you write, “Being an SEO is fun because…,” the LLM looks at all of the tokens and tries to calculate the next most likely word based on its training set. At a stretch, you can think of it as a really advanced version of your phone’s predictive text.

ChatGPT is a type of generative artificial intelligence

This means the output is not predictable. There is a randomized element, and it may respond differently to the same prompt.

When you appreciate these two points, it becomes clear that tools like ChatGPT do not have any traditional knowledge and don’t “know” anything. This shortcoming is the basis for the errors, or “hallucinations,” as they are called. Numerous documented outputs demonstrate how this approach can generate incorrect results and cause ChatGPT to contradict itself repeatedly.

This raises serious doubts about whether AI-written text can consistently “add value,” given the possibility of frequent hallucinations. The root cause lies in how LLMs generate text, and it won’t be easily resolved without a new approach.
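To make the “probability distribution plus randomness” point concrete, here is a minimal, hypothetical Python sketch. The hand-coded next_word_probs table and the sample_next_word function are illustrative inventions, not part of any real LLM library; real models compute a distribution over tens of thousands of tokens at every step, but the sampling idea is the same.

```python
import random

# Toy illustration only: a hand-made "model" mapping a prompt to a
# next-word probability distribution. Real LLMs learn distributions
# like this over tokens from their training data.
next_word_probs = {
    "Being an SEO is fun because": {
        "you": 0.35,
        "every": 0.25,
        "the": 0.20,
        "Google": 0.12,
        "rankings": 0.08,
    }
}

def sample_next_word(prompt: str, temperature: float = 1.0) -> str:
    """Pick the next word by sampling from the probability distribution.

    temperature < 1 sharpens the distribution (the likeliest word wins
    more often); temperature > 1 flattens it, adding more randomness.
    """
    dist = next_word_probs[prompt]
    words = list(dist)
    # Re-weight probabilities by temperature (equivalent to scaling
    # log-probabilities before a softmax).
    weights = [p ** (1.0 / temperature) for p in dist.values()]
    return random.choices(words, weights=weights, k=1)[0]

prompt = "Being an SEO is fun because"
# The same prompt can yield a different continuation on each run,
# which is why generative output is not fully predictable.
for _ in range(3):
    print(prompt, sample_next_word(prompt, temperature=1.2), "...")
```

Run this a few times and the continuation changes from run to run, which is the same reason ChatGPT can answer an identical prompt differently, and why none of this involves the model “knowing” whether the continuation is true.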
This is a vital consideration, especially for Your Money, Your Life (YMYL) topics, where inaccurate content can materially harm people’s finances or lives. Major publications like Men’s Health and CNET were caught publishing factually incorrect AI-generated information this year, highlighting the concern.

Publishers are not alone with this issue, as Google has had difficulty reining in its Search Generative Experience (SGE) output on YMYL topics. Despite Google stating it would be careful with generated answers, going as far as to give the specific example that it “won’t show an answer to a question about giving a child Tylenol because it is in the medical space,” the SGE would demonstrably do exactly that when simply asked the question.

Google’s SGE and MUM

It’s clear Google believes there is a place for machine-generated content to answer users’ queries. Google has hinted at this since May 2021, when it announced MUM, its Multitask Unified Model. One challenge MUM set out to tackle was based on the data that people issue eight queries on average for complex tasks. In an initial query, the searcher will learn some additional information, prompting related searches and surfac...
