SEO Friendly ExpressionEngine Templates Part 2

In Part 1 I went over the different ways of setting up your meta descriptions and title tags in ExpressionEngine. In Part 2 I go over cleaning up the problems you might have with 404 Not Found pages and redirecting incorrect URIs to the correct page using your .htaccess file. I'll also go over the basics of caching your media in your .htaccess file.

If you haven’t read Part 1 where I go over setting up Titles and Meta Descriptions for all your templates, you can read that here.

Checking For Broken Links

If you are working on a live site or have just published a new one, there is a good chance that Google has crawled some broken links. And with the way ExpressionEngine URLs are set up, a link that is not correct can still display content, which means these incorrect URLs will be indexed by Google. To fix this problem you will have to redirect these links before Google will remove them.

Even if you are not working actively on a live site, it is good to check Google Webmaster Tools somewhat frequently. Your templates might not have any problems but there may be broken links in your entries that you are not aware of.

[Image: Fixing 404 errors in ExpressionEngine with Google Webmaster Tools]

Go to your Webmaster Tools and check under: Diagnostics > Crawl Errors

This list includes all the 404 and 500 errors on your site. If the Linked From number is one or two, then it is probably a broken link in one of your entries. Hopefully you don’t, but if you have errors with Linked From numbers of 5+ pages, then there is probably an incorrect link in one of your templates. If that is the case you need to fix it as soon as possible, because the error list will continue to get longer as you add content to your website.

Redirect URLs With Your .htaccess File

Now that you have your list of errors it is time to fix them. If the link is inside an entry then go ahead and make the fix. If the error is appearing on multiple pages, then it is probably an error inside one of your templates. In this case open the template and make the fix.

Once you have fixed all the broken links in your templates and entries, it is time to get into your .htaccess file and redirect those broken links. You will want to make a redirect for any broken link to an internal page. Google will continue to crawl the broken link until you make the redirect, so make sure to do so.

redirect 301 /the/broken/link-goes-here http://yoursite.com/the/correct/url-here/

It is really that simple. On one line type “redirect 301”, a space, and then the broken link. Note: don’t include the domain, just the URI segments. After the broken link add a space and then the full URL of the correct page. Do this for all of your 404 and 500 errors.

Continue to check Webmaster Tools to make sure that neither new broken links nor the same ones appear again. When your site gets crawled again, no new errors should show up on the list.

Broken Links In Search Results

If you had problems in your templates, there is a good chance Google has indexed some incorrect URLs. Because ExpressionEngine will display a page even if the URL is incorrect, Google will index it if a link is pointing at it.

On this site I had a problem in one of my templates where I wrote the URL as /blog/posts instead of /blog/post. Needless to say, Google indexed a bunch of incorrect URLs. It is easy to see these problems when you look at your HTML Suggestions, where there will be a number of duplicate meta descriptions and title tags. It is essential that you fix these problems right away or else you can end up with tens or potentially hundreds of broken links being indexed.
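
When the same mistake repeats across many URLs, as with /blog/posts here, one pattern-based rule may be easier to maintain than a separate redirect line per URL. The following is a minimal sketch rather than the rule this site actually uses. It assumes Apache's mod_alias is available (the same module that provides the plain redirect directive), and yoursite.com is a placeholder domain.

```apache
# Assumption: mod_alias is enabled on the server.
# Sends /blog/posts, and anything beneath it, to the matching /blog/post URL.
# The optional group (/.*)? captures the rest of the path, if there is any,
# and $1 appends it to the corrected URL.
# yoursite.com is a placeholder; replace it with your own domain.
RedirectMatch 301 ^/blog/posts(/.*)?$ http://yoursite.com/blog/post$1
```

Because the pattern requires “posts” followed by either a slash or the end of the path, the correct /blog/post URLs are left alone. It is worth testing a rule like this against one or two known bad URLs before relying on it.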

Check Your Search Results

The easiest way to check your site’s links in Google is to search for “site:yourdomain.com”. Most likely any broken links will be at the end of your results, so go to your last page of search results and check those links first.

Removing Pages From Google

The redirects in your .htaccess file will keep people from visiting those incorrect links, but to remove them from Google permanently go to Site Configuration > Crawler Access and click on the Remove URL tab. Enter all the search results you want to get rid of. Broken links are an obvious one, but you may also want to get rid of all pagination pages because they hold no real value to searchers and contain duplicate content.

[Image: Removing pagination pages and other results from Google search results]

Keep Pagination Pages From Being Indexed

To keep pagination pages from being indexed you need to create a NoIndex Meta Tag which looks like:

<meta name="robots" content="noindex" /> 

You need to be very careful when using these in your templates because it is easy to accidentally keep Google from indexing many more pages than the ones you intended. Here is how I use it in my homepage template.

{if segment_1 != ""}<meta name="robots" content="noindex" />{/if} 

The conditional hides the meta tag on the homepage itself. If there is something in segment_1 (a pagination page), the meta tag is output and prevents Google from indexing that page.

Here is a good description of why you want to remove pagination pages from search results http://www.seomoz.org/blog/how-to-deal-with-pagination-duplicate-content-issues

Robots.txt File

A robots.txt file is used to hide entire directories from search engine crawlers. I don’t do a lot with robots.txt, but I do use it to hide miscellaneous directories like my JavaScript and CSS directories. To do so, simply add:

User-agent: Googlebot
Disallow: /directory-to-hide

Caching With Your .htaccess File

This is more of a speed improvement than an SEO improvement, but those kind of go hand in hand, so I’ll go over it quickly. I basically use the .htaccess file from the HTML5 Boilerplate, just with the time lengths changed.

# Perhaps better to whitelist expires rules? Perhaps.
 
ExpiresDefault      "access plus 1 month"

# cache.appcache needs re-requests 
# in FF 3.6 (thx Remy ~Introducing HTML5)
 
ExpiresByType text/cache-manifest "access plus 0 seconds"

# Your document html
 
ExpiresByType text/html "access plus 1 hour"

# Data
 
ExpiresByType text/xml "access plus 0 seconds"
 
ExpiresByType application/xml "access plus 0 seconds"
 
ExpiresByType application/json "access plus 0 seconds"

# RSS feed
 
ExpiresByType application/rss+xml "access plus 1 hour"

# Favicon (cannot be renamed)
 
ExpiresByType image/x-icon "access plus 1 week"

# Media: images, video, audio
 
ExpiresByType image/gif "access plus 1 year"
 
ExpiresByType image/png "access plus 1 year"
 
ExpiresByType image/jpg "access plus 1 year"
 
ExpiresByType image/jpeg "access plus 1 month"
 
ExpiresByType video/ogg "access plus 1 month"
 
ExpiresByType audio/ogg "access plus 1 month"
 
ExpiresByType video/mp4 "access plus 1 month"
 
ExpiresByType video/webm "access plus 1 month"

# HTC files  (css3pie)
 
ExpiresByType text/x-component "access plus 1 month"

# Webfonts
 
ExpiresByType font/truetype "access plus 1 month"
 
ExpiresByType font/opentype "access plus 1 month"
 
ExpiresByType application/x-font-woff   "access plus 1 month"
 
ExpiresByType image/svg+xml "access plus 1 month"
 
ExpiresByType application/vnd.ms-fontobject "access plus 1 month"

# CSS and JavaScript
 
ExpiresByType text/css "access plus 1 year"
 
ExpiresByType application/javascript "access plus 1 year"
 
ExpiresByType text/javascript "access plus 1 year" 
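
Expires rules like these are only applied when Apache's mod_expires module is loaded and switched on. A minimal sketch of the wrapper they would normally sit in, assuming the rules above are pasted where the placeholder comment is:

```apache
# Only process the block if mod_expires is loaded, so the site does not
# return a 500 error on servers that lack the module.
<IfModule mod_expires.c>
  # Turn on generation of the Expires and Cache-Control headers.
  ExpiresActive On

  # ... the ExpiresDefault and ExpiresByType rules from above go here ...
</IfModule>
```

If the module is missing, the rules are simply skipped, so it is worth checking the response headers of an image or stylesheet to confirm the caching is actually in effect.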

In Part 3 I will go over the 404 template and how to create tighter controls for your URLs.


Couchable is a web design blog created by Tyler Herman. It is not really updated anymore because I'm busy doing freelance design work and launching my little WordPress theme shop, Real Theme Co. You can read a little more about me at my personal site, Tylerherman.com.