I often see forum posts and wishlist items where designers worry about how Webflow handles pagination from an SEO perspective.
This is a companion discussion topic for the original entry at https://www.sygnal.com/blog/is-webflows-pagination-an-seo-problem
Thank you for the insightful article. I agree that pages with pagination query parameters in the URL don’t have SEO value. My question is what actions should be taken once they are indexed by Google (and they do get indexed).
You can tell if Google’s indexing them by checking SERPs directly using site:yourdomain.com, or through search console.
With Webflow’s canonical change, they are more likely to be indexed, but as usual that’s ultimately up to Google.
If they are, it shouldn’t cause much harm; it just doesn’t add value either.
I wouldn’t worry unless you’re seeing significant “pollution” in the SERPs that’s creating problems for you.
If that did happen, I’d look at script-adding a META noindex only to those paginated pages, while keeping follow enabled so that Google still crawls the links and recognizes that your collection pages are not orphaned.
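That script-added noindex might look something like this. This is a hedged sketch, not Webflow-endorsed code: the `isPaginatedUrl` helper and the query-parameter pattern are my assumptions (Webflow’s pagination params typically end in `_page`, e.g. `?8ca2c5f7_page=2`, so adjust the regex to match your site).

```javascript
// Sketch: inject <meta name="robots" content="noindex, follow"> on paginated
// pages only. Assumes Webflow-style pagination query params ending in "_page"
// (or a bare "page" param) — verify against your own URLs before using.

// Pure helper so the detection logic can be tested outside the browser.
function isPaginatedUrl(search) {
  // Matches "?page=2", "&page=3", or Webflow-style "?8ca2c5f7_page=2"
  return /(^|[?&])([a-f0-9]+_)?page=\d+/i.test(search);
}

// In Webflow this would go in the site-wide custom <head> code.
if (typeof document !== 'undefined' && isPaginatedUrl(window.location.search)) {
  const meta = document.createElement('meta');
  meta.name = 'robots';
  // "follow" keeps the links crawlable so collection items aren't orphaned.
  meta.content = 'noindex, follow';
  document.head.appendChild(meta);
}
```

One caveat worth knowing: Google only sees a JS-injected noindex after it renders the page, so it’s slower to take effect than a server-side meta tag.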
Hey, thank you for your reply. Google did index such pages. On my personal small website it doesn’t cause big issues, but on one of my clients’ websites, which has a massive filtering system, it creates a lot of pollution. After I left my comment here, I continued researching and decided to implement a Disallow rule for URLs containing “?”. So far it looks like it worked; I’m monitoring what the long-term effect will be.
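For reference, the Disallow rule described above would look something like this in robots.txt (assuming the goal is to block every URL with a query string; note the `*` wildcard is honored by Google but not by every crawler):

```
User-agent: *
# Block crawling of any URL containing a query string
Disallow: /*?
```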
I decided not to use noindex because I read somewhere that it’s not good to have a canonical tag and a noindex tag on the same page, and my canonical is added globally.
What do you think of my approach?
The only case I can think of where that would be problematic is if your canonical points to another URL which is indexable, e.g.:

- /test1 — canonical points to /test2, and has a noindex meta.
- /test2 — allows indexing.

That’s problematic since you’re sending ambiguous signals to Google: does the noindex refer to /test1 or /test2?

Any other normal scenario should be fine.
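To make the ambiguous case above concrete, the two pages’ `<head>` sections would contain something like this (example.com is a placeholder domain):

```html
<!-- /test1: says "I'm a copy of /test2" AND "don't index me" -->
<link rel="canonical" href="https://example.com/test2">
<meta name="robots" content="noindex">

<!-- /test2: indexable, canonical to itself -->
<link rel="canonical" href="https://example.com/test2">
```

Google has to guess whether the noindex on the duplicate also applies to the canonical target, which is why this combination is best avoided.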
That’s a solid solution; however, in general robots.txt defines where robots can crawl, not what they should index.
There’s a disconnect there… if your page is already indexed, and then you block it in robots.txt, the typical Google response is to stop crawling that page for changes, but it will not remove the page from the index.
A mistake people often make is to try to remove a page quickly by using both noindex and robots.txt. Robots.txt blocks the crawler from seeing the noindex, so the page is stuck in SERPs indefinitely.