How to remove a site or parts of it from the Google index

Under Google's approach to indexing, search results are ranked by how complete and objective the information is and how well it matches the search query. If a site with illegal content enters the index, or the resource exists only to spread spam, its pages will not be listed in the search engine's general database. Site owners also need to know how to remove their own site from the search results.

Options for preventing indexing by Google

Once the crawler (the program that gathers information about new resources) has scanned the site page by page, the site will be indexed, provided it meets the requirements of Google's crawling policy. Below we describe how to hide your whole site, or individual parts of it, from search engines using robots.txt, a file that both guides the crawler and tells it where to stop.

To exclude the entire resource from search results, a text file is created in the root folder of the server that hosts the site: the robots.txt mentioned above. Search engines read this file and act on the instructions it contains.

Keep in mind that Google can still list a page in its index even when visitors are not allowed to view it. A 401 or 403 ("access denied") response from the server restricts visitors; it is not an instruction to the search engine's crawlers, so do not rely on it to keep a page out of the index.

To remove a site from search indexing, enter the following lines in robots.txt:

User-agent: Googlebot
Disallow: /

This tells the search robot that it is forbidden to index any of the site's content. With these lines in place, Google will not cache the resource or keep it in its list of discovered sites.
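To check what such a rule does before uploading it, the file can be tested locally. The sketch below uses Python's standard `urllib.robotparser`, which only approximates how Googlebot itself reads the file; the helper name `allowed`, the agent name `OtherBot`, and the page URL are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# The same two lines as in the robots.txt example.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""

def allowed(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# Googlebot is shut out of the whole site; a crawler with no matching
# User-agent group is not affected by this file.
print(allowed("Googlebot", "http://yourserver.com/index.html"))  # False
print(allowed("OtherBot", "http://yourserver.com/index.html"))   # True
```

Because the group names only Googlebot, other crawlers remain unrestricted; replacing the agent name with `*` would apply the rule to all of them.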

Crawl rules for different protocols

If you want Google to apply different indexing rules to different protocols, for example separate rules for http and https, write them in the robots.txt served over each protocol, as in the following example.

For http (http://yourserver.com/robots.txt, where yourserver.com stands for your own domain name):

User-agent: *    # any search engine
Allow: /         # allow full indexing

To remove the site from search results completely for the https protocol (https://yourserver.com/robots.txt):

User-agent: *
Disallow: /      # full prohibition on indexing
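The two files above can coexist because a crawler looks for robots.txt separately for each protocol and host. A minimal sketch of that lookup, assuming a hypothetical helper named `robots_url` and example page addresses that are not part of the original article:

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Build the robots.txt address that governs `page_url`.

    Only the scheme and host are kept; the path, query, and fragment
    of the page are dropped.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

# The same page reached over two protocols is governed by two different files.
print(robots_url("http://yourserver.com/shop/cart?id=1"))
print(robots_url("https://yourserver.com/shop/cart?id=1"))
```

The first call yields the http file and the second the https file, which is why each can carry its own rules.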

Urgent removal of a URL from Google search

If you do not want to wait for re-indexing and the site must be hidden as soon as possible, I recommend using the service at http://services.google.com/urlconsole/controller. Before you do, robots.txt must already be in the root directory of the site's server, with the necessary instructions written in it.

If for some reason you cannot edit robots.txt in the root directory, it is enough to create it in the folder containing the objects you want to hide from search engines. Once you have done this and submitted the request to the automatic URL removal service, Google will not scan the folders listed in robots.txt.

This invisibility lasts for a fixed period of 3 months. After that, Google's crawler will process the removed directory again.

How to exclude part of a site from scanning

When the search bot reads robots.txt, it decides what to crawl based on the file's contents. For example, suppose you need to exclude the entire directory named anatom from search results. These instructions are enough:

User-agent: Googlebot
Disallow: /anatom
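A Disallow value is matched as a prefix of the URL path, so it is worth checking which addresses a rule really covers. The sketch below again uses Python's standard `urllib.robotparser` as a stand-in for the crawler; the helper name `googlebot_allowed` and the page names are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Rule from the example: block the directory named anatom.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /anatom
"""

def googlebot_allowed(url, robots_txt=ROBOTS_TXT):
    """Return True if Googlebot may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)

print(googlebot_allowed("http://yourserver.com/anatom/heart.html"))  # False
print(googlebot_allowed("http://yourserver.com/index.html"))         # True
# Prefix matching also catches paths that merely start with the same letters.
print(googlebot_allowed("http://yourserver.com/anatomy.html"))       # False
```

If only the directory should be blocked, writing the value with a trailing slash (`/anatom/`) leaves a page such as `/anatomy.html` crawlable.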

Or suppose you do not want any .gif images to be indexed. To do this, add the following lines:

User-agent: Googlebot
Disallow: /*.gif$

Here is another example. To keep dynamically generated pages out of the crawl, add the following entry to robots.txt:

User-agent: Googlebot
Disallow: /*?
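In these patterns `*` stands for any run of characters and a trailing `$` marks the end of the URL. Python's standard robots.txt parser does not interpret these wildcards, so the sketch below translates a pattern into a regular expression by hand. It is a simplified illustration rather than Google's actual matching code, and the names `rule_to_regex` and `is_blocked` as well as the sample paths are assumptions.

```python
import re

def rule_to_regex(pattern):
    """Translate a robots.txt path pattern with '*' and a trailing '$' into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with "any characters".
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_blocked(path, disallow_patterns):
    """Return True if `path` matches at least one Disallow pattern."""
    return any(rule_to_regex(p).match(path) for p in disallow_patterns)

print(is_blocked("/images/logo.gif", ["/*.gif$"]))         # True
print(is_blocked("/images/logo.gif?size=2", ["/*.gif$"]))  # False: '$' requires the URL to end in .gif
print(is_blocked("/catalog.php?id=7", ["/*?"]))            # True: the path contains a '?'
print(is_blocked("/catalog.php", ["/*?"]))                 # False
```

A pattern without wildcards, such as `/anatom`, falls back to plain prefix matching under the same function.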

That is roughly how rules for search engines are written. It is often more convenient, though, to use the robots META tag for all of this, and webmasters frequently rely on that standard to control how search engines behave. We will cover it in upcoming articles.

Copyright © 2018 en.atomiyme.com. Theme powered by WordPress.