Development URL indexed on Google
Hello!
In our WHM/cPanel environments we work as follows:
1. Set up a development account on WHM.
2. Build the site using cPanel and WordPress.
3. When the website is done, we use "Park Domain" in WHM to publish it with the right domain.
This all works great, but we have noticed one problem.
All the development URLs are indexed by Google. We do not want this.
They are not supposed to be shown anywhere outside of the organisation.
I need some advice on how to fix this. :)
You realistically only have a couple of options:
- Password protect the root folder of your development installation (cPanel > Files > Directory Privacy)
- Add a 'robots.txt' file (will work for Google, but not all robots/crawlers respect it)
- Use an .htaccess file entry to only allow access from certain IPs (will only work if the authorised IPs are static)
A robots.txt that disallows all robots from the whole site:
    User-agent: *
    Disallow: /
or to disallow them from a specific folder:
    User-agent: *
    Disallow: /yourfolder/
To learn more about robots.txt - just Google for it :-D
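If you want to sanity-check the rules before relying on them, here is a minimal sketch using Python's standard `urllib.robotparser` module. The host name `dev.example.com` and the helper name `is_allowed` are placeholders made up for illustration; they are not part of cPanel or WHM. Note that this only shows how a well-behaved crawler should interpret the file, not what any particular crawler actually does.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The two rule sets from the reply above
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_FOLDER = "User-agent: *\nDisallow: /yourfolder/\n"

if __name__ == "__main__":
    # dev.example.com is a hypothetical development host
    print(is_allowed(BLOCK_ALL, "http://dev.example.com/"))
    print(is_allowed(BLOCK_FOLDER, "http://dev.example.com/yourfolder/page.html"))
    print(is_allowed(BLOCK_FOLDER, "http://dev.example.com/about/"))
```

The first two checks should report that crawling is not allowed, while the last URL sits outside the blocked folder and stays crawlable.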
Hello! Thank you for the information. Correct me if I am wrong, but our published websites, the ones that have been parked, use the same files as the development websites. So if I go in and add robots.txt, won't it affect both the live website and the development site? Same thing with adding password protection: won't it also affect the live website, or is there some kind of workaround?
When you have finished development of the site and it is ready to go 'live', just remove whatever protection you used.
OK, so Google won't index it while it has protection, and when it is live, it won't be seen by Google? Also, about the robots.txt, did I get it right or am I thinking wrong here?
When you have finished the development, change the content of the robots.txt file to read:
    User-agent: *
    Disallow:
Note that the Disallow line is empty, which means that all URLs can be indexed. See www.robotstxt.org/orig.html#format for full details and syntax.
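As a rough illustration of that go-live switch, the sketch below builds the robots.txt text for either state and confirms how the standard-library parser reads an empty Disallow line. The function names `robots_txt_for` and `can_crawl`, and the host `www.example.com`, are hypothetical and only serve the example.

```python
from urllib.robotparser import RobotFileParser


def robots_txt_for(live: bool) -> str:
    """Build robots.txt text: allow everything when live, block everything otherwise."""
    # An empty Disallow value means nothing is disallowed
    disallow_line = "Disallow:" if live else "Disallow: /"
    return f"User-agent: *\n{disallow_line}\n"


def can_crawl(robots_txt: str, url: str) -> bool:
    """Return True if a generic crawler ("*") may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("*", url)


if __name__ == "__main__":
    # www.example.com is a placeholder host
    for live in (False, True):
        text = robots_txt_for(live)
        print(f"live={live}: crawlable={can_crawl(text, 'http://www.example.com/')}")
```

Since the parked domain and the development URL serve the same files, whichever version of the file is in place applies to both, which is why the switch happens at go-live.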
I agree with @rpvw on this one: the robots.txt would be the best solution, disallowing any bots from indexing until you're ready to put the site live. Password protecting isn't a bad idea either, and neither would affect the live site. Another solution might be to use a domain that isn't live and modify your local hosts file.
Thanks for explaining, and thanks for the help. :) Will take this answer as the solution.