North American Network Operators Group
Sorry! Here's the URL content (re. Paging Google...)
Doh! I had no idea my thread would require a login or be hidden from general view! (A robots.txt info site had directed me there...) It seems I fell for an SEO scam... how ironic. I guess that's why I haven't heard from Google...
Anyway, here's the page content (with some editing and paraphrasing):
Subject: paging google! robots.txt being ignored!
Hi. My robots.txt was put in place in August!
But google still has tons of results that violate the file.
doesn't complain (other than about the use of google's nonstandard extensions described at
The above page says that it's OK that
is last (after User-agent: *)
and seems to suggest that the syntax is OK.
I also tried
but it hasn't helped.
I asked Google to review it via the automatic URL removal system (http://services.google.com/urlconsole/controller). It responded:
URLs cannot have wild cards in them (e.g. "*"). The following line contains a wild card:
How insane is that?
Oh, and while /*?* wasn't taken from their example, it was legal per their syntax, same as /*? is!
The site has around 35,000 pages, and I don't think a small robots.txt to do what I want is possible without using the wildcard extension.
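
For anyone who wants to check what those two wildcard patterns would cover, here is a rough Python sketch. It assumes the extension works the way Google describes it: "*" matches any run of characters, a trailing "$" anchors the end of the URL, and a rule otherwise matches as a prefix of the path. The function names and the sample paths are made up for illustration; this is not Google's matcher, just my reading of the syntax.

```python
import re

def wildcard_rule_to_regex(rule):
    """Translate a Disallow pattern using the '*' and '$' extensions into a regex.

    Assumption: '*' means "any run of characters" and a trailing '$' means
    "end of URL"; everything else is matched literally.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces, then join them with ".*" where each '*' was.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_blocked(rule, path):
    # Rules match from the start of the path (prefix match), hence re.match.
    return wildcard_rule_to_regex(rule).match(path) is not None

# Hypothetical paths, only to compare the two patterns side by side.
paths = ["/index.html", "/page?id=3", "/dir/list.php?sort=asc", "/plain/"]
for p in paths:
    print(p, is_blocked("/*?", p), is_blocked("/*?*", p))
```

Under that reading, /*? and /*?* block exactly the same paths (anything containing a "?"), because a prefix match already allows anything after the "?". That is also why a two-line rule like this can cover a large site, where listing every query-string URL by hand would not be practical.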