View Full Version : googlebot


paradox
01-18-2006, 03:52 PM
If I understand it correctly, phpLD has a feature where, if a requested URL does not exist, the visitor or spider is redirected to the index page instead of being shown a "page not found" error.

Where do I go to turn this feature off?

Is this a good thing to have as far as Googlebot and SEO are concerned?

Would Google interpret these redirections as cloaking, given that Google bans sites that use redirects on the original URLs?

TIA for your help.

Bill
01-18-2006, 07:15 PM
Sounds like you are talking about mod_rewrite. Mod_rewrite is better for SEO; it will redirect to the index page. It also lets your categories have the title name in the URL. To turn it off, go into your admin panel, then System > Edit Settings > Directory, and change Enable Mod-Rewrite to Off. If you want your directory to be more SEO friendly, keep it enabled.

-Bill

djhomeless
01-21-2006, 07:39 PM
Mod_rewrite and clean URLs have nothing to do with returning 404 errors properly. What you need to do is modify the way your .htaccess file handles URL rewrites.

Your standard .htaccess file from phplinks has an entry like this:

Content visible to registered users only.

The * is a catch-all, meaning every request, good or bad, is passed to index.php. This is fine in normal cases, because 404's are captured and redirected to the normal index page. So great for web surfers, bad for Google.

I am still very new to phplinks, but what I did with my Joomla/Mambo site was trace every rewrite that was needed and write an explicit rule for it into the .htaccess file. What that meant was I could still use clean URLs, and every link worked as expected. It also meant that my 404 errors were returned as 404s and not redirected to the index page.

Google, for reasons only they understand, penalizes you for having a missing page that returns a 200 instead of a 404. In Google-land, they treat the index page served in its place as duplicate content, and they remove it. What that means is the quality rating of your site goes down. Does this lead to blacklisting/sandboxing/reduced PR? I doubt it. But most people choose not to annoy Google.
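
If you want to see what your own server does with a missing page, here is a minimal Python sketch, not anything that ships with phpLD. It requests a random path that almost certainly does not exist and labels the status code that comes back. The function names `classify` and `probe` and the `.html` suffix are my own choices for illustration.

```python
import urllib.error
import urllib.request
import uuid


def classify(status):
    """Label the status code a server returned for a page that should not exist."""
    if status in (404, 410):
        return "ok"        # the missing page is reported as missing
    if 200 <= status < 300:
        return "soft-404"  # the missing page looks like a real page
    return "other"


def probe(base_url):
    """Request a random, almost certainly missing path and return the final status."""
    url = base_url.rstrip("/") + "/" + uuid.uuid4().hex + ".html"
    try:
        # urlopen follows redirects, so a redirect to the index page ends as 200
        with urllib.request.urlopen(url, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError; err.code is the status
        return err.code
```

Running `print(classify(probe("http://www.example.com")))` against your own domain should print `ok` once 404s are passed through properly, and `soft-404` while the catch-all is still in place.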

I'm assuming this was the point the poster had in mind! If not, ignore my rant.

Geoffrey

David
01-21-2006, 11:44 PM
I would love to know a good answer to this too.
So if anyone can research and find the answer, I'm sure there will be some people that will shower you with compliments! :D

djhomeless
01-22-2006, 12:10 AM
Content visible to registered users only.

Huh? I just did!

The answer is you remove the * and write rewrite rules specific to your URL structure. Then you have the best of both worlds.

David
01-22-2006, 12:17 AM
Oops! I should learn to read slower. :D

What you found is very important. Thank you very much for posting this.

To clarify, are you saying then that it should be changed to look like this?
Content visible to registered users only.
or this?
Content visible to registered users only.

djhomeless
01-22-2006, 07:51 AM
Well, I haven't yet worked out the structure on my phplinks site as I'm still importing my data and mapping the structure.

However, once I do, I will probably do something like I did with my Joomla site. There, I simply took the top level content structure and mapped it to a rule. Like so:

Content visible to registered users only.

So basically, the wildcard is removed so Google can see that you throw 404's properly. But also, you get to have nice and clean URL's that SE's want as well.
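
To make the difference concrete, here is a rough Python sketch that only simulates which request paths a rule would pick up. Both patterns and the section names (`news`, `articles`, `links`) are hypothetical, not the actual phpLD or Joomla rules. Python's `re` behaves the same as Apache's regex engine for patterns this simple.

```python
import re

# Hypothetical patterns, written the way a per-directory RewriteRule sees
# the request path (no leading slash). Neither is taken from phpLD itself.
CATCH_ALL = re.compile(r"^(.*)$")
EXPLICIT = re.compile(r"^(news|articles|links)(/.*)?$")


def is_rewritten(pattern, path):
    """True if a RewriteRule with this pattern would fire for the path."""
    return pattern.match(path) is not None


if __name__ == "__main__":
    for path in ["news", "links/page2.html", "no-such-page"]:
        print(path, is_rewritten(CATCH_ALL, path), is_rewritten(EXPLICIT, path))
```

The catch-all fires for every path, including `no-such-page`, so index.php answers and the visitor never sees a 404. The explicit pattern leaves unknown paths alone, so Apache can return its normal 404 for them.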

I believe there is a way to capture 404's by sending them to a static page not found that could at least have your site's look & feel. But you'll need to google for it.

VSDan
01-22-2006, 09:20 AM
I have quickly written a small script that will compile a list of all categories and subcategories, and then assemble a list of category/subcategory-specific RewriteRules along these lines:

Content visible to registered users only.

All other requests, for categories/subcategories not in the database, will return a 404, which you can then handle with a custom page using ErrorDocument 404.

You just have to copy and paste the script output (Rewrite lines) to your .htaccess file.
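
For anyone who cannot see the script, the idea can be sketched roughly like this in Python. This is not VSDan's script. The category list is passed in by hand rather than read from the phpLD database, and the rewrite target `index.php`, the `[QSA,L]` flags, and the error page `/404.html` are all assumptions you would adjust to your own install.

```python
import re


def category_rewrite_rules(category_paths, target="index.php", error_page="/404.html"):
    """Return .htaccess lines with one RewriteRule per known category path."""
    lines = ["RewriteEngine On"]
    # Normalise, de-duplicate and sort so the output is stable between runs
    for path in sorted(set(p.strip("/") for p in category_paths)):
        if not path:
            continue
        # re.escape keeps characters such as '.' or '+' literal in the pattern
        pattern = "/".join(re.escape(part) for part in path.split("/"))
        lines.append(f"RewriteRule ^{pattern}/?$ {target} [QSA,L]")
    # Anything that matched no rule falls through to Apache's own 404 handling
    lines.append(f"ErrorDocument 404 {error_page}")
    return lines


if __name__ == "__main__":
    # Hypothetical categories; in practice these would come from the database
    print("\n".join(category_rewrite_rules(["Arts", "Arts/Music", "Business"])))
```

Keep the ErrorDocument target a local path like the one above. If you give Apache a full URL there, it sends a redirect to the client instead of a 404 status, which brings back the original problem.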

djhomeless
01-22-2006, 09:30 AM
A script like that should really be built into the core (if the devs have time). For example, the procedure could be added to the 'Add Category' component: if the category being added is a top-level category, write its name into .htaccess as a rewrite rule.

VSDan
01-22-2006, 09:58 AM
I agree - I have posted the script in the Mods forum:

{link was to a forum link when it was powered by phpBB}

paradox
01-22-2006, 03:48 PM
Thanks all, and a particularly big thanks to VSDan for that great mod.

Now none of us will be angering the google gods :)