Pages with identical content

Duplicate content usually refers to large blocks of content that are identical or very similar. They can appear within one domain or across several domains. Typically, such content is published without any intention of deceiving users. Examples of duplicate content that is not malicious:

• forum pages that exist in both the regular version of a site and the version for mobile devices;
• products in an online store that are available at several different URLs;
• printable versions of documents that duplicate the content of web pages.

If your site has multiple pages with almost identical content, you can tell Google which URL you prefer. There are several ways to do this. The procedure is called "canonicalization."

In some cases, content is deliberately duplicated across different domains in order to manipulate a site's search engine ranking or to attract more traffic. Such deceptive practices inconvenience users, because the search results show them essentially the same content over and over again.

Google does its best to index and display pages with unique information. For example, if your site has “standard” and “print” versions of each article and neither is marked with the noindex meta tag, only one of them will be displayed in the search results. In the rare cases where Google believes that duplicate content is being shown in order to manipulate rankings or mislead users, it will adjust the indexing and ranking of the sites involved. As a result, the site's ranking may drop, or the site may be removed from the Google index entirely, in which case it will no longer appear in search results.
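To make the noindex meta tag mentioned above more concrete, here is a minimal Python sketch that checks whether a page carries it. The class name, the function name, and the two sample pages are hypothetical and chosen only for illustration; the sketch looks only at meta tags named "robots" and ignores other ways of signalling noindex, such as HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() != "robots":
            return
        for directive in attr_map.get("content", "").split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html_text):
    """Return True if the page has a robots meta tag containing noindex."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noindex" in parser.directives


# Hypothetical sample pages: a print version marked noindex and a standard one.
PRINT_PAGE = (
    '<html><head><meta name="robots" content="noindex, follow"></head>'
    "<body>Print version</body></html>"
)
STANDARD_PAGE = (
    "<html><head><title>Article</title></head>"
    "<body>Standard version</body></html>"
)
```

Running has_noindex over both versions of an article is a quick way to see whether the print version has been excluded from indexing or whether the choice of version is being left to the search engine.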

The recommendations below will help you avoid problems with duplicate content and ensure that visitors to your site are offered only the content that is of interest to them.

• Use 301 redirects. If you have restructured your site, set up 301 ("permanent") redirects, for example in the .htaccess file on an Apache server, to send users, Googlebot, and other crawlers to the correct pages.

• Be consistent. Keep your internal linking uniform: always link to a given page using the same URL.

• Use top-level domains. To help Google show the most appropriate version of a document in the search results, use top-level domains whenever possible for content aimed at a specific country.

• Be careful with syndication. If you provide your content to other sites, Google will show, for each search query, the version that it considers most appropriate for users. This version is not necessarily the one you would choose. You should therefore make sure that every site that republishes your content links back to the original article.

• Avoid repeating boilerplate text. For example, do not place a lengthy copyright notice at the bottom of every page. A short version that links to a page with the full details is enough. You can also use the URL Parameters tool to specify how Googlebot should handle URL parameters.

• Do not publish stubs. Users are not interested in empty pages, so do not publish pages whose content is not yet ready. If you cannot do without placeholder pages, block them from being indexed with the noindex meta tag.

• Understand your content management system. Make sure you know how content is displayed on your site. Blogs, forums, and similar systems often present the same content in several formats.

• Strive for varied content. If you have many similar pages, either expand each of them with unique material or consolidate them into a single page. For example, if a travel site has separate pages for two cities but the information on both is the same, you can add unique content about each city or merge the two pages into one.
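The first two recommendations in the list above, 301 redirects and uniform internal links, can be illustrated together. The Python sketch below maps several URL variants of the same page to one preferred form and reports when a permanent redirect would be needed. The policy it encodes (HTTPS, the www host, no trailing slash, no index file name) and every name in it are assumptions made for the example, not rules stated in this article; on an Apache server the same effect is normally achieved with rules in the .htaccess file.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical site policy, chosen only for illustration.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"
SITE_HOSTS = {"example.com", "www.example.com"}
INDEX_FILES = ("index.htm", "index.html")


def preferred_url(url):
    """Return the single form of `url` that internal links should use."""
    parts = urlsplit(url)
    # Leave relative URLs and URLs on other sites untouched.
    if parts.hostname not in SITE_HOSTS:
        return url
    path = parts.path or "/"
    # Treat /page/index.htm and /page/ as the same document.
    for index_file in INDEX_FILES:
        if path.endswith("/" + index_file):
            path = path[: -len(index_file)]
            break
    # Treat /page/ and /page as the same document; the root stays "/".
    path = path.rstrip("/") or "/"
    # The fragment is dropped because it is never sent to the server.
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, parts.query, ""))


def redirect_for(requested_url):
    """Return (301, target) if a permanent redirect is needed, else None."""
    target = preferred_url(requested_url)
    if target != requested_url:
        return 301, target
    return None
```

Under this assumed policy, http://example.com/page/index.htm and https://www.example.com/page/ both resolve to https://www.example.com/page, so internal links can be written through preferred_url while redirect_for covers requests that still arrive at the old variants.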

Google does not recommend blocking crawlers from accessing duplicate content, whether with the robots.txt file or by other means. If search engines cannot crawl pages with duplicate content, they cannot automatically detect that different URLs lead to the same content, and they will treat those URLs as separate, unique pages. It is better to allow these URLs to be crawled but mark them as duplicates using the rel="canonical" link element, the URL parameter handling tool, or 301 redirects.
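As an illustration of the rel="canonical" approach, the following Python sketch reads the canonical link from each page and groups URLs that declare the same preferred address. It is a simplified example under stated assumptions: the class, the functions, and the sample pages are hypothetical, only the first canonical link on a page is used, and a page without one is treated as its own canonical.

```python
from html.parser import HTMLParser


class CanonicalLinkParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> on a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = {name: (value or "") for name, value in attrs}
        # rel may hold several space-separated values.
        rel_values = attr_map.get("rel", "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonical = attr_map["href"]


def canonical_url(html_text):
    """Return the declared canonical URL, or None if the page has none."""
    parser = CanonicalLinkParser()
    parser.feed(html_text)
    return parser.canonical


def group_by_canonical(pages):
    """Group page URLs by canonical URL; `pages` maps URL to HTML text."""
    groups = {}
    for url, html_text in pages.items():
        target = canonical_url(html_text) or url
        groups.setdefault(target, []).append(url)
    return groups


# Hypothetical pages: two URLs for the same product and one unrelated page.
PAGES = {
    "https://www.example.com/shoes?color=red":
        '<head><link rel="canonical" href="https://www.example.com/shoes"></head>',
    "https://www.example.com/shoes":
        '<head><link rel="canonical" href="https://www.example.com/shoes"></head>',
    "https://www.example.com/about":
        "<head><title>About</title></head>",
}
```

Calling group_by_canonical(PAGES) places both shoe URLs under https://www.example.com/shoes, which is the kind of grouping a crawler can only arrive at if it is allowed to fetch the duplicate URLs in the first place.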