
by Matthew Callis in Development on Sep 22, 2009
Pagination can create search engine optimization (SEO) problems for any content management system (CMS), and ExpressionEngine is no exception. Getting into ExpressionEngine can be quite a task: some of the hardest tasks become easy, but some of the easiest can become a headache.
It is important to get duplicate content out of the index, but the solution for EE is a little sloppy. Optimizing ExpressionEngine pagination for search isn't easy, but fortunately we've fought through the technical complexities and are here to share our findings with you!
To get duplicate content out of the search engine's index, place <meta name="robots" content="noindex,follow"/> in the header. This tells search bots not to index the page while still following the links on it. For our blog, we wanted to apply noindex,follow to the duplicate content pages we have for categories, authors, and paginated listings. The header layout below is working well for SEO on small-to-medium sites.
{assign_variable:section="blog"}{!-- Use for main section on page --}
{embed="inc/.header" nofollow="{exp:md_detect_page_type url_segment="{segment_2}"}{if (pagination_page && segment_2 != '') || category_page || yearly_archive_page || segment_2 == 'author'}true{/if}{/exp:md_detect_page_type}"}
Then, in the header template:
{if embed:nofollow == 'true'}
<meta name="robots" content="noindex,follow"/>
{/if}
{!-- Don't let Google index other duplicate content. --}
{if (segment_1 == 'work' && segment_2 == 'c') }
<meta name="robots" content="noindex,follow"/>
{/if}
For ExpressionEngine, that's all it takes to keep paginated pages out of the index and keep bots focused on your posts.
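If the nested conditionals in the template are hard to read, the same decision can be written out as a small Python sketch. This is not ExpressionEngine code and not part of our templates; it only illustrates the rule. The `PageInfo` class and the `robots_meta` function are hypothetical names, the boolean flags stand in for the variables the md_detect_page_type plugin exposes, and the `P10` segment in the usage lines is just an example value. For brevity it also folds the blog rule and the `work/c` rule into one function, whereas in the templates they live in separate conditionals.

```python
from dataclasses import dataclass

# The tag we want on duplicate listing pages.
ROBOTS_TAG = '<meta name="robots" content="noindex,follow"/>'


@dataclass
class PageInfo:
    """Hypothetical stand-in for the URL segments and plugin variables."""
    segment_1: str = ""
    segment_2: str = ""
    pagination_page: bool = False
    category_page: bool = False
    yearly_archive_page: bool = False


def is_duplicate_listing(page: PageInfo) -> bool:
    """Mirror the template conditionals: True when the page repeats content."""
    # Blog rule, same shape as the condition passed to the embed.
    blog_rule = (
        (page.pagination_page and page.segment_2 != "")
        or page.category_page
        or page.yearly_archive_page
        or page.segment_2 == "author"
    )
    # Extra rule from the header template for the work category listing.
    work_rule = page.segment_1 == "work" and page.segment_2 == "c"
    return blog_rule or work_rule


def robots_meta(page: PageInfo) -> str:
    """Return the robots tag for duplicate listings, otherwise nothing."""
    return ROBOTS_TAG if is_duplicate_listing(page) else ""


# A paginated blog listing gets the tag; a single post does not.
print(robots_meta(PageInfo(segment_1="blog", segment_2="P10", pagination_page=True)))
print(repr(robots_meta(PageInfo(segment_1="blog", segment_2="my-post"))))
```

Writing the rule this way makes it easier to see which page types are excluded from the index before translating any changes back into template conditionals.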
Wow - Amazing! That has always bugged me but I never really knew how to go about fixing it. Thanks!
Couldn’t you use a plugin like this to reduce all those conditionals? http://expressionengine.com/forums/viewthread/92307/
{exp:md_detect_page_type}
{if pagination_page}<meta name="robots" content="noindex,follow"/>{/if}
{/exp:md_detect_page_type}
Brian, Thanks sir.
Something I love about this biz, there is always something new to learn.
Indeed, thanks Brian!
I’ve updated the code to use your plug-in, which I had never heard of until now:
{assign_variable:section="blog"}{!-- Use for main section on page --}
{embed="inc/.header" nofollow="{exp:md_detect_page_type url_segment="{segment_2}"}{if (pagination_page && segment_2 != '') || category_page || yearly_archive_page || segment_2 == 'author'}true{/if}{/exp:md_detect_page_type}"}
Note: the limit should match the limit you have set for your actual content display.
...and place this in the header template
{if embed:nofollow == 'true'}
<meta name="robots" content="noindex,follow"/>
{/if}
{!-- Don't let Google index other duplicate content. --}
{if (segment_1 == 'work' && segment_2 == 'c') }
<meta name="robots" content="noindex,follow"/>
{/if}
I second what James said—always something new to learn. I especially appreciate the willingness of the people in our industry to share insights like these!
I noticed that you said it is good for small to medium size sites. I would also add that being careful with duplicate content is even more important for larger sites—especially when links repeat on nearly every page.
Just in the past couple of months, I’ve seen sites inadvertently penalized for innocent mistakes involving too many duplicate links caused by duplicate content pages.
@Freeman, yeah duplicate content can become a big problem on large sites.
I believe this code scales well for small to medium sites in EE, but not on massive sites. That would require a custom solution.
In that last paragraph, do you mean too many duplicate links internally or a penalty from too many duplicate links off-site?