Hiding a WordPress Development Site from Google's Search Engine Index

Hiding a WordPress development site from Google's search index is something you should do, and there are a couple of ways to do it.

The reason is that once pages are indexed they take a while to remove, and a site in development is likely to get surplus, duplicate and incorrect content indexed, which is something a client won’t want.

This guide looks at hiding WordPress content using WordPress’s own setting and using the more robust HTTP basic authentication method.

Discourage Search Engines

You can tick the ‘Discourage search engines from indexing this site’ checkbox under Dashboard > Settings > Reading. This is supposed to prevent the Googlebot crawler from indexing the site.

What the setting changes is the robots.txt file that WordPress serves. Normally it looks like this…

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

With ‘Discourage search engines from indexing this site’ checked it is…

User-agent: *
Disallow: /

So when checked it asks search engines not to crawl, and so not to index, the pages, but there is no guarantee that every search engine will honour the request. A bigger issue is that you may forget to reverse the setting when the site goes live.
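
To see what the two rule sets mean for a crawler that does respect robots.txt, here is a minimal sketch using Python's standard urllib.robotparser module. The path /sample-page/ is a made-up example, and the parser only models a well-behaved crawler; it says nothing about crawlers that ignore the file.

```python
from urllib.robotparser import RobotFileParser

# The two robots.txt variants quoted above.
DEFAULT_RULES = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
"""

DISCOURAGED_RULES = """\
User-agent: *
Disallow: /
"""


def may_crawl(robots_txt: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if a compliant crawler would be allowed to fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # /sample-page/ is a hypothetical front-end URL used only for illustration.
    for label, rules in (("default", DEFAULT_RULES), ("discouraged", DISCOURAGED_RULES)):
        print(label, may_crawl(rules, "/sample-page/"))
```

With the default rules an ordinary page may be crawled; with the discouraged rules nothing may, provided the crawler chooses to obey.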

Use HTTP Basic Authentication

HTTP Basic authentication is a simple username/password challenge: instead of the site, visitors see an operating-system-style dialog box and must authenticate before the site is shown. The browser caches the username/password combination for a period of time.


What’s good about this is that search engines can’t reach your pages to index them, and it is very difficult to forget to disable it when you take the website live.
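
At the HTTP level the challenge is simple: the server answers an unauthenticated request with status 401, the browser shows the dialog, and then resends the request with an Authorization header. The sketch below builds that header value in Python; the username and password in the usage line are hypothetical. Note that the credentials are only Base64-encoded, not encrypted, so serving the site over HTTPS is advisable.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the Authorization header value a browser sends for Basic auth."""
    # The credentials are joined with a colon and Base64-encoded.
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


if __name__ == "__main__":
    # "client" / "preview" are made-up credentials for illustration only.
    print("Authorization:", basic_auth_header("client", "preview"))
```

A crawler never supplies this header, so all it ever receives is the 401 response.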

Doing it in cPanel

cPanel has this authentication built in, in the form of Directory Protection.


This works fine, but cPanel requires a password with a strength rating over 70. For this kind of protection I just want a simple password, especially when the client is proofing the site, so you can also set it up manually.

Manually Using htpasswd and htaccess

On non-cPanel hosting the manual way achieves the same result as the cPanel way; cPanel simply does it all for you.

You need to work with two files: .htaccess and .htpasswd

htaccess

In .htaccess at the top of the file all you need is this…

AuthName "Protected Area"
AuthUserFile "path-to-this-file/.htpasswd"
AuthType Basic
require valid-user

The key thing is to set the path to the .htpasswd file correctly, using the full filesystem path on the server rather than a URL. This file will contain the username and password, and you can actually call it anything you like.
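
Because a wrong path is the most common stumbling block, here is a small, hedged Python helper that renders the same four directives and refuses a relative path. The function name and the /home/example path are hypothetical; substitute the real location on your server.

```python
from pathlib import PurePosixPath


def htaccess_auth_block(htpasswd_path: str, realm: str = "Protected Area") -> str:
    """Render the four directives shown above for a given password file."""
    if not PurePosixPath(htpasswd_path).is_absolute():
        # Apache resolves a relative AuthUserFile against ServerRoot,
        # which is rarely where the file actually lives.
        raise ValueError("use the full filesystem path to the password file")
    lines = [
        f'AuthName "{realm}"',
        f'AuthUserFile "{htpasswd_path}"',
        "AuthType Basic",
        "Require valid-user",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # /home/example/.htpasswd is a made-up path for illustration only.
    print(htaccess_auth_block("/home/example/.htpasswd"))
```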

 

htpasswd

The .htpasswd file contains the username and the hashed password and looks like this…

password:$apr1$7qAwQvuT$cf/GEkKrZ8U5qZCxYIBFK0

The above is the entry for the user/password combination password/password, which is nice and easy for us but still stops Google indexing the site.

You can generate the entry with an online generator: just plug in the username and password and copy the resulting line.
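
If you would rather not type a password into a third-party site, an entry can be produced locally. The sketch below is an assumption-laden alternative, not the method used above: it emits Apache's {SHA} format (Base64 of the SHA-1 digest) because that needs only the Python standard library, whereas the $apr1$ format shown earlier is what Apache's own htpasswd command-line tool produces. {SHA} is weak by modern standards, so treat this purely as an illustration. The function name and credentials are hypothetical.

```python
import base64
import hashlib


def htpasswd_sha1_entry(username: str, password: str) -> str:
    """Return a 'user:{SHA}digest' line for a .htpasswd file.

    Uses Apache's {SHA} format, NOT the $apr1$ format shown above.
    """
    if ":" in username:
        # The colon separates the username from the hash in the file.
        raise ValueError("username must not contain ':'")
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    return f"{username}:{{SHA}}{base64.b64encode(digest).decode('ascii')}"


if __name__ == "__main__":
    # "client" / "preview" are made-up credentials for illustration only.
    print(htpasswd_sha1_entry("client", "preview"))
```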

That’s it. It’s a clean solution for keeping pages out of the index whilst the site is in development, and it also protects a client’s data before the site goes live on the web.
