19 Comments So Far

ameo Said,
June 4th, 2008 @7:41 pm  
Rate:
2.8

Nice, I love that the files are ready to copy and paste :)
I’ll see whether my blog supports blocking robots or not.

ameos last blog post..manage passwords / arting ads [ firefox ]

Nihar Said,
June 4th, 2008 @7:52 pm  
Rate:
2.9

Very good post. My permalink structure is year/month/articlename.
I think you are using a sitemap plugin; I am using the same one. I don’t have any special instructions in robots.txt.
Let me know if I can set mine up like yours with this permalink structure?

Nihars last blog post..Get FREE Kaspersky Internet Security license key

June 4th, 2008 @11:27 pm  
Rate:
1.6

Nice post. I always knew they existed but never had the time to verify the file. With your post, I’ll be sure to give it a look today.
I’m in the same situation as Nihar with the permalink structure, so I will have to read up on how to configure my robots.txt file.

Thanks!

Chessmasters last blog post..Too Many Money Blogs

June 4th, 2008 @11:40 pm  
Rate:
2.5

Hmm, I guess having a year/ in the permalink is a bit tricky for the robots.txt. Worst case, don’t put in the Disallow: /YEAR/ parts…

You’ll still get duplicate content, though, because I can go to http://www.YOURSITE.com/2007 to see all of your 2007 archive posts.
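
A minimal sketch of why the year prefix is tricky, using Python's standard-library `urllib.robotparser`. The rules and the post slug are made-up examples, and `www.YOURSITE.com` is the placeholder from the comment above. Because robots.txt rules match by path prefix, a rule aimed at the 2007 archive also covers every post whose permalink starts with that year:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for a blog with year/month/articlename permalinks.
rules = [
    "User-agent: *",
    "Disallow: /2007/",
]

parser = RobotFileParser()
parser.parse(rules)

def is_allowed(path):
    """Return True if a generic crawler may fetch the given path."""
    return parser.can_fetch("*", "http://www.YOURSITE.com" + path)

archive_allowed = is_allowed("/2007/")            # the yearly archive
post_allowed = is_allowed("/2007/06/some-post/")  # hypothetical post under that year
about_allowed = is_allowed("/about/")             # unrelated page

print(archive_allowed, post_allowed, about_allowed)
```

Under these rules the archive and the post are both blocked, which is why blogs with the year in their permalinks usually cannot simply disallow the yearly archive path.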

iCalvyn Said,
June 5th, 2008 @2:03 am  
Rate:
2.8

I did not disallow the admin content either. You are right, I should follow your approach too.

But I did not edit the robots.txt at my root; I only edited the subdomain’s robots.txt.

Steve Yu Said,
June 5th, 2008 @4:12 am  
Rate:
2.8

I just realized that my blog doesn’t have a robots.txt file, so I’ve got to create one now.

Steve Yus last blog post..Quickly Adjust the Volume of Your Speaker with just a Mouse Scroll

June 5th, 2008 @2:30 pm  
Rate:
2.5

Great tip - I would never have even known about this if it wasn’t for your post.

Regretful Mornings last blog post..Wingman of the Year

June 6th, 2008 @12:20 am  
Rate:
2.5

Wow, I didn’t know that most bloggers don’t have a robots.txt yet. Glad to help out. Now hopefully more search engine visitors will come to your site!

@iCalvyn: I’m not sure how web crawlers handle subdomains, but if they grab the robots.txt under the subdomain, then I guess you don’t need the root one anymore.
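
On the subdomain question: crawlers request robots.txt separately for each host, so a subdomain is governed by its own file and the root domain's file does not apply to it (the root file still matters for pages served from the root domain itself). A small sketch of that lookup, where `robots_txt_url` is a made-up helper name and the `example.com` hosts are placeholders:

```python
from urllib.parse import urlsplit, urlunsplit

def robots_txt_url(page_url):
    """Return the robots.txt URL that governs the given page URL."""
    parts = urlsplit(page_url)
    # Keep the scheme and host (including any subdomain and port);
    # drop the original path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_txt_url("http://blog.example.com/2008/06/some-post/"))
print(robots_txt_url("http://www.example.com/about/"))
```

The first call points at the subdomain's own robots.txt, the second at the file on the `www` host.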

June 6th, 2008 @2:28 am  
Rate:
2.8

When I looked at my robots.txt file again, I realized that I allow robots to access my yearly archive.

How important is it to disallow the yearly archive? If I leave it as it is now, are you suggesting that there will be a duplicate content issue?

Yan@Blog for Beginnerss last blog post..Optimize Your URL For Search Engines

Squeaky Said,
June 6th, 2008 @11:16 am  
Rate:
2.5

Michael,

Having a good robots.txt file really helps with SEO and search engine traffic. Mine has improved a lot since I started cleaning up my robots.txt file.

You may want to validate your robots.txt file because there are some errors in it. I use this free online robots.txt validator for my site and it works very well. http://tool.motoricerca.info/robots-checker.phtml

I am working on my robots.txt file and still haven’t quite figured it all out, but for the most part it is better. If you get a chance, would you look at mine and see what you think? If you need some ec credits, let me know.

Thanks……

June 7th, 2008 @11:20 am  
Rate:
2.5

@Yan: Yeah, it is. If you type http://thoushallblog.com/2008, you’ll see all of your posts in 2008. It’s kind of duplicate, don’t you think?

@Squeaky: Thanks Squeaky! My goodness, there are so many errors on mine :| It’s weird because I got some of the configurations from some blogs on the web (I can’t remember where now, planning to give them some link love :( )

Squeaky Said,
June 7th, 2008 @2:23 pm  
Rate:
2.5

I have been working on Madmouse’s robots.txt for a few days now, and the Google crawl cycle is getting better. I have used the robots checker tool on many of the big bloggers’ sites and found lots of errors.

I am error free now, but I am sure that I have some items to address yet. But, for the most part it is better than what I had.

Once you get things to validate, it will be interesting to see if you notice any results as far as SEO, etc.

Squeakys last blog post..Stop! Blog Scrappers with the RSS Footer, WordPress Plugin

June 8th, 2008 @4:02 am  
Rate:
2.8

@Michael: Yup, you have a point. It’s time for an update. Anyway, I don’t understand why Disallow: /*?* is an error. I had that in my robots.txt file too after some advice from I-can’t-remember-who.

Yan@Blog for Beginnerss last blog post..If You Have Adsense, Use Section Targeting
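
On the `Disallow: /*?*` question: the original robots exclusion convention only defines plain path prefixes, so a strict checker may flag `*` as an error, while Google documents `*` (any run of characters) and a trailing `$` (end of URL) as extensions. Whether that is the reason this particular checker complains is an assumption. A rough sketch of how that wildcard matching can be emulated, with `wildcard_rule_matches` as a made-up helper name and the paths as made-up examples:

```python
import re

def wildcard_rule_matches(rule_path, url_path):
    """Emulate wildcard matching: '*' is any run of characters,
    and a trailing '$' anchors the rule to the end of the URL."""
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    pattern = ".*".join(re.escape(piece) for piece in rule_path.split("*"))
    if anchored:
        pattern += "$"
    # re.match anchors at the start, so the rule acts as a prefix match.
    return re.match(pattern, url_path) is not None

print(wildcard_rule_matches("/*?*", "/?p=123"))
print(wildcard_rule_matches("/*?*", "/2008/06/some-post/"))
print(wildcard_rule_matches("/*?", "/index.php?s=robots"))
```

In this sketch `/*?*` and `/*?` behave the same, since a prefix match makes the trailing `*` redundant. Crawlers that do not implement the extension may treat the `*` as a literal character instead.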

June 8th, 2008 @9:24 am  
Rate:
2.5

Is there another tool that achieves the same thing? It’d be good to check whether the tool/checker itself has any bugs :)

You can never trust an application 100% these days.
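
One way to cross-check a validator is a quick syntax pass of your own. The sketch below is not a replacement for a full checker: it only confirms that each non-comment line has a `field: value` shape with a recognized field name. The `KNOWN_FIELDS` list is an assumption (it includes common extensions such as `Allow`, `Sitemap`, and `Crawl-delay`), and `lint_robots_txt` and the sample file are made up for illustration:

```python
# Assumed set of accepted field names; adjust to taste.
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}

def lint_robots_txt(text):
    """Return a list of (line_number, message) for lines that look malformed."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Strip comments and surrounding whitespace; skip empty lines.
        content = line.split("#", 1)[0].strip()
        if not content:
            continue
        if ":" not in content:
            problems.append((number, "missing ':' separator"))
            continue
        field = content.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS:
            problems.append((number, "unknown field '%s'" % field))
    return problems

sample = """User-agent: *
Disallow /2007/
Disalow: /wp-admin/
Disallow: /wp-admin/  # admin pages
"""

for number, message in lint_robots_txt(sample):
    print(number, message)
```

Here the second line is reported for the missing colon and the third for the misspelled field name.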

June 8th, 2008 @5:03 pm  
Rate:
2.8

Since we create the robots.txt file mainly for Google, I would place my trust in the big G to analyze it using Webmaster Tools. You have used that too, haven’t you, Michael?

Yan@Blog for Beginnerss last blog post..If You Have Adsense, Use Section Targeting

June 10th, 2008 @12:58 am  
Rate:
2.5

@Yan: Yeah, but honestly Webmaster Tools doesn’t really analyze your robots.txt file in detail.

It’s probably worth researching again if you got errors, and see what other SEO experts say about the error, though.

June 10th, 2008 @1:51 am  
Rate:
2.8

Thanks for the advice. If you do find any useful online tool to analyze robots.txt, do let us know.

July 21st, 2008 @11:23 pm  
Rate:
3.5

Michael, help me write my robots.txt file :)

July 22nd, 2008 @12:01 am  
Rate:
3.2

I’ve just updated this post with my latest robots.txt after running it through the web checker posted by Squeaky earlier.

I think it’s a very good tool to analyze your robots.txt file. I’ll probably post something about it soon.

@Arnold: You can copy and paste my robots.txt and change the paths to match your blog :D
