SEO Knowledge Base

Advice, tips, tricks and general information about search engine optimisation (SEO) and much more.

What factors will cause search engines difficulty when spidering my site?

Summary: Ensure you don’t use any techniques that make it difficult for search engines to spider your website.  Use clear, SEO-friendly navigation, and follow the tips below so you don’t jeopardise the chances of your pages being indexed.

Naturally, you want the search engines to spider as many of your pages as possible, to give you the best possible chance of being ranked for all your lovely content.  But some practices can cause a spider to give up and go and look elsewhere.  What are they?

Some obvious ones include a complicated link structure that the search engine can’t follow: thousands of pages with every page linked to every other page, for example.  Couple this with only a small amount of content and you have an absolute disaster.  But, I hear you say, I’ve come across spammy sites like this in Google! Yes, it’s unfortunate that sites with thousands of pages of auto-generated content do sometimes get indexed by Google.  The results are generally short-lived though.

Read the webmaster guidelines: http://www.google.com/support/webmasters/bin/answer.py?hl=en&answer=35769.

If you think for a second that a site is abusing them, do us all a favour and report that site – instructions are on that page.  Knocking these sites out of the search engines makes it easier for quality sites like ours to rank well, and improves the quality of information served to search engine users.

Back to your site – what should you avoid?

  • Building your site so that some or all pages require a session ID or cookie for navigation.  Spiders usually can’t store cookies like a regular user so the navigation will fail.
  • Using URLs with excessive dynamic parameters.  Eh? Like this: www.mysite.com?love=lots&you=justyou&page=mypage etc.  Search engine spiders don’t like crawling complicated URLs like this as they often end in errors.  Although Google has got better at crawling sites with dynamic parameters, it’s widely accepted that fewer of your pages will be indexed if you use them.
  • Creating sites with frames – a huge no-no in the SEO world because spiders find them difficult to crawl and it causes confusion as to which page ranks highest.  There’s really no reason to use frames nowadays – check this article for more info on the subject: http://searchenginewatch.com/showPage.html?page=2167901
  • Creating pages that are more than 3 clicks away from your home page, unless they have loads of inbound links.  Spiders have a nasty habit of ignoring ‘deep pages’.
  • Creating pages that have 100+ links.  It’s unlikely spiders would get past 50 on any page (that’s widely accepted as a max number).  So if you’re creating a contents page for a section of your site, it probably won’t be fully indexed if it has too many links.  Just break it down into sub contents pages.

The above will cause spiders to be confused and possibly stop crawling your site a lot earlier than you’d like!
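If you fancy checking your own pages against a few of the points above, here is a rough Python sketch using only the standard library. It lists the dynamic parameters in a URL, counts the links on a page, and works out how many clicks each page is from the home page. The function names, the `MAX_LINKS` and `MAX_CLICKS` constants and the little example site are all made up for illustration; the thresholds are just the rules of thumb quoted above, not official search engine limits.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import parse_qs, urlsplit

# Rules of thumb from this article, not official search engine limits.
MAX_LINKS = 50
MAX_CLICKS = 3


def query_param_names(url):
    """Return the sorted names of the dynamic (query string) parameters in a URL."""
    return sorted(parse_qs(urlsplit(url).query, keep_blank_values=True))


class LinkCounter(HTMLParser):
    """Counts the <a> tags that carry an href attribute."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.count += 1


def count_links(html):
    """Return the number of links found in a page's HTML."""
    counter = LinkCounter()
    counter.feed(html)
    return counter.count


def click_depths(links, home):
    """Breadth-first search: fewest clicks needed to reach each page from home.

    `links` maps each page to the list of pages it links to.
    """
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


if __name__ == "__main__":
    # Hypothetical site structure, purely for illustration.
    site = {
        "/": ["/products", "/about"],
        "/products": ["/products/widgets"],
        "/products/widgets": ["/products/widgets/blue"],
        "/products/widgets/blue": ["/products/widgets/blue/large"],
    }
    depths = click_depths(site, "/")
    print([page for page, clicks in depths.items() if clicks > MAX_CLICKS])
    print(query_param_names("http://www.mysite.com/?love=lots&you=justyou&page=mypage"))
```

A real audit would need to fetch your pages and build the `links` map from them; this sketch assumes you already have that data to hand.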

The following are practices that don’t only cause a little difficulty – they can result in no spidering, full stop!

  • Making your pages accessible via a form where you have to select an option and then click submit (do you think a spider can do that?)
  • Making your pages accessible only via search (again, spiders can’t do it)
  • Making your pages accessible only if you are logged in
  • Blocking your pages with robots meta tags or a robots.txt file (logically enough)
  • Requiring a redirect before displaying the page – when this is used to show visitors something different from what the spider sees, it’s called ‘cloaking’ or ‘bait and switch’, and sites can get banned for using this practice.
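Accidental blocking via robots.txt is one of the easier items on this list to test for. The sketch below uses Python’s standard `urllib.robotparser` module to report which URLs a set of robots.txt rules would block. The robots.txt content, the `ExampleBot` user agent and the `blocked_urls` helper are invented for this example, and real spiders may interpret rules slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def blocked_urls(robots_txt, urls, user_agent="ExampleBot"):
    """Return the URLs that the given robots.txt rules tell this user agent not to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    pages = [
        "http://www.mysite.com/about.html",
        "http://www.mysite.com/private/report.html",
    ]
    print(blocked_urls(ROBOTS_TXT, pages))
```

Bear in mind this only covers robots.txt; robots meta tags live in each page’s HTML and would need a separate check.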

Having a clear navigational structure, using SEO-friendly menus and providing sitemaps for both users and search engines will all help ensure that your pages are indexed.
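For the search engine side of things, a sitemap is usually an XML file listing your URLs in the sitemaps.org format. As a minimal sketch, assuming you already have a list of your page URLs, something like this would build one with Python’s standard library. The `build_sitemap` function and the example URLs are hypothetical, and optional fields such as last-modified dates are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap (as a string) with one <url><loc> entry per URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    print(build_sitemap([
        "http://www.mysite.com/",
        "http://www.mysite.com/about.html",
    ]))
```

You would then save the result as a file on your site and tell the search engines where to find it.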

About Angel SEO

Angel SEO has written 190 articles.


© 2010 Angel SEO. Company No: 07344835, Angel Business Ltd
Angel SEO in Nottingham provides search engine optimisation aka SEO in the UK and SEO Nottingham