“Google won’t rank your content if it’s written by AI! You need to make it human.” This is the advice that so many marketers were giving just a few months ago. After I shared the results of my recent blog post, which proudly sits at the top of Google, one fellow marketer confidently told me “It might rank now, but wait till Google tags it as AI. It will be penalized.”
But that consensus was broken less than a week ago when Google announced their most up-to-date guidelines for AI content. These guidelines clearly stated that Google does not take into account the method of content creation. All it cares about is how useful the content is, especially in relation to its E-E-A-T factors. If you want to know more about this information (and if you want to get on the first page of Google with AI, then you definitely do), check out my blog post on this subject and Google’s guidelines.
This led me to ask a few questions that, interestingly enough, neither Google nor ChatGPT were able to answer. Firstly, how do the most searched questions on Google score on an AI detector? And secondly, what do high-ranking sites that do score as likely or possibly AI generated have in common?
Since no one could answer these questions, I thought it was time that I went to find out!
Rules of The AI Detector Experiment!
Since it isn’t possible to check every high-ranking site across the internet, it was important to get a useful subset of the kind of things people search for. I decided against using keywords, as the most popular keywords are brand names and adult websites. Not exactly the kind of websites that would usually make use of AI text generation tools. Instead, I decided to use the most asked questions in the United States over the last 12 months.
This AI detector experiment covered the top 3 search results for each of the top 50 questions asked over the last year. Each site would be run through OpenAI’s AI detector to see whether it was likely to be AI generated. I decided that I would take as much text as possible from each site, cleaning up the text to remove any image captions, advertisements, etc.
If the page which ranked on Google did not have enough text, the first link on the page would be clicked, and this process repeated until a page with enough content was found. This method was chosen because most of the pages which ranked but did not have enough content were content archives. If no page could be found with enough text, the result was marked as N/A.
If there was a featured snippet or a news article at the top of the search results, it would be chosen as the first result. This decision was taken because a lot of AI content is used to answer questions, and a likely use case of ChatGPT and Bard will be answering exactly this kind of question.
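For anyone who wants to replicate the collection step, the process above can be sketched in a few lines of Python. This is a minimal sketch, not the exact script I used: the cleanup regex, the minimum-length threshold, and the stubbed `classify` function are all assumptions on my part (OpenAI’s detector was a web tool, so each page was checked by pasting the text in manually).

```python
import re

MIN_CHARS = 1000  # assumed threshold; OpenAI's detector needed a fair amount of text


def clean_text(raw: str) -> str:
    """Strip bracketed image captions / ad markers and collapse whitespace."""
    text = re.sub(r"\[(?:image|caption|ad|advertisement)[^\]]*\]", " ", raw, flags=re.I)
    return re.sub(r"\s+", " ", text).strip()


def classify(text: str) -> str:
    """Stub for the detector step: OpenAI's tool was web-only, so this just
    decides whether a page has enough text to be worth pasting in."""
    if len(text) < MIN_CHARS:
        return "N/A"  # not enough content: follow the first link and retry
    return "run through detector"


sample = "[Image: pancakes] How do you   make pancakes? [Advertisement] Mix the flour..."
print(clean_text(sample))            # captions and ads stripped, spacing collapsed
print(classify(clean_text(sample)))  # short sample, so "N/A"
```

The same two helpers handle both rules of the experiment: cleaning out page furniture before checking, and marking pages without enough text so the first-link fallback can kick in.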
What Was I Expecting to See?
Many of these questions are the kind used for evergreen content that a lot of websites target. I especially expected to see longstanding websites and content that predates the release of quality content generation AIs. For this reason, I did not expect to see any websites ranking that were marked as likely AI content. I also expected very few to be marked as possibly AI content.
What Are the Limitations of this Experiment?
It is important to note that the OpenAI AI detector, whilst probably the most reliable, is still not 100% reliable at detecting AI. If a website is marked as AI, it doesn’t mean that it definitely uses AI; if a website is marked human, it doesn’t mean that it is definitely human. However, if there is a substantial number of unclear, possible and likely results, this would indicate that Google is genuinely not penalizing AI content.
It is also worth noting that many people who write with AI tools work hard to ensure that they are not detected as AI. Whilst this may not be necessary based on Google’s guidelines, it is likely that some AI content will have been manually or automatically rewritten to bypass an AI detector.
These results should not be extrapolated to a larger number of results. The sample set is far too small to be representative of every Google search result. It is likely that a larger sample would have a different set of results.
The Results!
Unsurprisingly, the vast majority of the websites ranked at the top of the search results were marked as Very Unlikely AI generated by OpenAI’s AI detector. 67% of the sites were marked as such, with a further 14% being marked as Unlikely. This is potentially explained by the age of many of the sites and the age of the content being ranked in those positions.
However, 15% of the top search results in Google’s top 50 questions asked in the US being either unclear, possible or likely AI does give a lot of hope to people who want to use AI in their content generation workflow. Clearly, Google is not penalizing AI content, as these are competitive keywords (many getting hundreds of thousands of searches) and there will be a lot of human written content which has not reached the top positions.
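To make the percentage split concrete, here is how a tally like this can be computed. The label counts below are hypothetical placeholders for illustration, not the real dataset (that is in the downloadable CSV); only the percentage arithmetic is the point.

```python
from collections import Counter

# Hypothetical detector labels for illustration; the real data is in the CSV
labels = (["Very unlikely"] * 10 + ["Unlikely"] * 2
          + ["Unclear"] * 1 + ["Possibly"] * 1 + ["Likely"] * 1)

counts = Counter(labels)
total = len(labels)
for label, n in counts.most_common():
    print(f"{label}: {n}/{total} ({100 * n / total:.0f}%)")

# Combined share of results that were unclear, possible or likely AI
flagged = sum(counts[k] for k in ("Unclear", "Possibly", "Likely"))
print(f"Flagged in any way: {100 * flagged / total:.0f}%")
```

Swapping in the real labels from the CSV reproduces the figures discussed here: the Very Unlikely and Unlikely shares, plus the combined unclear/possible/likely share.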
There were two results which were marked as likely AI content, both from the same website, veggiedesserts.com. This website does have a Domain Authority (DA) rating of 65, which demonstrates the value of off-page SEO work in getting your site onto the first page of Google. Remember, this site may not use AI-generated content, but if OpenAI sees it as AI-generated then it’s likely that an AI detector tool used by Google would also see it as AI content. The fact that it still ranks indicates that AI content (as long as it is useful) will rank on the search engines.
The possibly AI-generated content websites were an interesting group of articles which did not seem to (at first glance) have much in common with each other. Yorkshire Post, Microsoft, Where Am I and 3 metric conversion pages were all flagged as possibly AI-generated. It is known that many newspapers have started using GPT-3 and other models for content generation, so it is not a surprise to see a newspaper in there.
The unclear section included some news websites and, surprisingly, many health websites including multiple NHS pages. This could have something to do with GPT-3’s training data as many health websites will use very similar language when explaining symptoms, and it may be a false positive. Or the NHS may already be using GPT-3 for its websites, who knows!
What Did We Learn From the AI Detector Experiment?
Unsurprisingly, content detected as human generated still reigns supreme. The vast majority of the pieces of content that appear in the first three places on Google were found unlikely to be AI. This does not mean that all the content was human written; many AI-generated pieces of content can pass an AI detector. But it does indicate that there may be some elements of human-written content that the search engines do like. It may still be worthwhile checking your content against an AI detector before posting it.
But the good news is that Google does not appear to be penalizing content flagged as likely AI generated. This is something that they have already advised in their guidance; however, as many people (including myself) strive for AI-generated content which scores as human, many of us have not posted content which is detectable as AI to see whether it ranks. It is good to see evidence that content detected as AI does rank on Google.
It is also important to note the value of technical and off-page SEO as well as valuable content. The only likely AI website (according to OpenAI’s detector) to appear in the top 3 listings had very high quality off-page SEO and appeared to be following Google’s best practices. It’s clear that this website works hard on their SEO and this is paying off – if they are using AI (they may not be) then they clearly aren’t using it as a shortcut!
The final thing that we have learned is that, as far as Google is concerned, the end justifies the means when it comes to useful content. Even fully auto-generated blog posts may be fine if you have curated them and ensured they are useful to readers before they are posted. On the other hand, a painstakingly unique blog post, written by hand, which provides no use to the readers will not rank on the search engines. If you generate a blog post with AI and you would genuinely get value and use from it, you might not need to put in the effort to make it read as human.
Basically, you can choose to generate content however you want to do it – you can even use the free GPT-3 option on this website if you like. Google doesn’t mind whether an AI detector will flag it or not. Just make sure that you are providing value to your readers and the search engines will reward you!
If you do want to use an AI detector on your content, you can use OpenAI’s AI detector free of charge at this link. You can also use Originality.ai’s AI detector, which is a commercial product that can detect AI content and check for possible plagiarism.
You can download a CSV file of the data used here.