Disadvantages of search engine indexed pages
02/05/2009
Can there be too many pages indexed in the search engines?
There are many instances when it is actively harmful for search engines such as Google to have indexed pages on your website. Preventing inappropriate pages from being indexed by Google or other search engines is an important part of good search engine optimisation, and one which is often overlooked.
Many search engine optimisation providers seek to maximise the number of indexed pages, as this is sometimes given as a measure of success. Unfortunately this can lead to too many pages being indexed, and the rankings of the whole website being harmed.
Cornish WebServices do measure and record the number of indexed pages, but we exclude from this measurement all pages which should never have been indexed, and we seek to remove these from the search engines as part of our search engine optimisation work.
What pages should not be indexed by search engines?
We are not referring here to pages which an organisation wishes to keep private: using robots.txt exclusion rules to hide these is an inappropriate use of the file, not least because robots.txt is itself publicly readable and so advertises the very URLs it is meant to hide. We are referring to perfectly valid pages, or page URLs, which can be viewed as web pages but which will harm the natural search rankings of the whole website if included within the indexed pages.
Examples of pages which should not be indexed by search engines include:
· Development or test versions of the website. Allowing these to be crawled by search engines can seriously harm the rankings of the main website, because every page on the development copy duplicates a page on the live site. This typically happens when the web design agency takes a copy of the website for development and hosts it without blocking the search engine robots (see the robots.txt sketch after this list).
· WWW and non-www versions of the web pages. It is likely these point to the same website, and if so the pages will appear as duplicated pages with the same content. Just one version should be visible within the search engine cache; the other should permanently (301) redirect to it (see the redirect sketch after this list).
· Secure server (https) as well as standard (http) versions of the website. Exactly the same argument applies as for the www and non-www versions: only one version should be cached within the search engine index.
· Pages from affiliate networks. Cornish WebServices advise strongly against using many of the affiliate-provided websites, as their page content is all too similar from one affiliate to the next, and many of these websites have no realistic chance of gaining high rankings in the search engines. Attempting to ‘optimise’ these ‘free’ websites is more expensive than writing a new search engine optimised website. But if search engine optimisation work is carried out, it is likely that all pages very similar to those of other websites in the same affiliate scheme should be excluded from the search engine index.
· Registration and logged-in pages. These are often short on content, in which case including them within the search engine index dilutes the quality of all your web pages.
· Session IDs can cause multiple versions of pages when visitors are logged into your website. If these multiple versions of web pages are indexed by the search engines they will show up as duplicate content pages and will harm natural search engine rankings.
· Agency software. The worst culprits here tend to be PPC agencies and affiliate agencies who use PPC or affiliate software to manage the campaign. In fact these agencies typically set up the campaign and then leave the software to manage it automatically. This software or management system will be sold to the client as helping with PPC bid management (which may be true), but the disadvantages in terms of SEO are not mentioned. A good PPC agency which also understands SEO will work to ensure any bid management process it uses does not harm natural search rankings; unfortunately many PPC managers are not aware of the impact of their actions.
· Tracking code. When inserting tracking code into your website, do you ever stop to think of its impact? Better measurement of website visitors, yes. But does the tracking code affect visitors? Yes it can, and it can adversely affect natural search engine rankings if not implemented correctly. Particularly problematic from late 2008 are the utm_source and utm_medium query parameters added by Google Analytics campaign tracking, which create extra URLs for the same content (see the parameter-stripping sketch after this list).
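To make the exclusions above concrete, the sketch below (in Python, simply as a convenient way to write out the file) shows the kind of robots.txt directives involved. It is illustrative only: the paths /register/ and /account/ and the sessionid parameter name are hypothetical placeholders, the wildcard (*) syntax is a Googlebot/Yahoo extension rather than part of the original robots.txt standard, and a development copy of a website should carry its own robots.txt disallowing everything.

```python
# A sketch of the robots.txt rules discussed above, written out from
# Python for illustration. All paths and parameter names are examples.

LIVE_RULES = [
    "User-agent: *",
    "Disallow: /register/",    # thin registration pages
    "Disallow: /account/",     # logged-in pages
    "Disallow: /*sessionid=",  # session-id duplicates (* is a Googlebot/Yahoo extension)
    "Disallow: /*utm_source=", # Google Analytics campaign-tagged URLs
]

DEV_RULES = [
    "User-agent: *",
    "Disallow: /",             # block the entire development copy
]

def write_rules(path, rules):
    """Write a list of robots.txt directives to a file."""
    with open(path, "w") as f:
        f.write("\n".join(rules) + "\n")

write_rules("robots.txt", LIVE_RULES)     # for the live website root
write_rules("robots-dev.txt", DEV_RULES)  # for the development copy's root
```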
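For the www/non-www and http/https duplicates, the standard remedy is a permanent (301) redirect, so that only one version of each URL can ever be crawled. Below is a minimal sketch using plain WSGI from the Python standard library; the canonical host www.example.co.uk is a placeholder, and in practice the same redirect is usually configured directly in the web server instead.

```python
from wsgiref.simple_server import make_server

CANONICAL_HOST = "www.example.co.uk"  # placeholder canonical host
CANONICAL_SCHEME = "http"

def canonical_redirect(app):
    """Wrap a WSGI app so any non-canonical host or scheme 301-redirects."""
    def middleware(environ, start_response):
        host = environ.get("HTTP_HOST", "")
        scheme = environ.get("wsgi.url_scheme", "http")
        if host != CANONICAL_HOST or scheme != CANONICAL_SCHEME:
            location = "%s://%s%s" % (CANONICAL_SCHEME, CANONICAL_HOST,
                                      environ.get("PATH_INFO", "/"))
            if environ.get("QUERY_STRING"):
                location += "?" + environ["QUERY_STRING"]
            # A permanent redirect tells the search engines to keep only
            # the canonical version in their index.
            start_response("301 Moved Permanently", [("Location", location)])
            return [b""]
        return app(environ, start_response)
    return middleware

def app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"canonical page\n"]

if __name__ == "__main__":
    make_server("", 8000, canonical_redirect(app)).serve_forever()
```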
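Finally, where Google Analytics campaign parameters cannot be kept out of crawled links, one defensive approach is to compute a canonical URL with the utm_ parameters stripped, and to use that URL wherever the page links to itself (or, since early 2009, in a rel="canonical" link element). A sketch:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Google Analytics campaign-tracking parameters: they label the visit
# but do not change the page content.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign",
                   "utm_term", "utm_content"}

def canonical_url(url):
    """Return the URL with any Google Analytics utm_ parameters removed."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query)
             if k not in TRACKING_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), ""))

print(canonical_url("http://www.example.co.uk/page?utm_source=news&id=3"))
# -> http://www.example.co.uk/page?id=3
```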
When providing search engine optimisation, Cornish WebServices take action to prevent the search engine robots from indexing inappropriate web pages.