Submit Blogger sitemap to Google, Yahoo and Bing

Sitemaps let search engines know more about the structure of your website. Blogs, like any other websites, can improve their visibility on the Internet by adding their XML sitemaps to the major search engines (Google, Yahoo and Bing).

The problem with Blogger is that you can’t upload your own sitemap (or any other file) to your blogspot sub-domain root (e.g. http://sub-domain.blogspot.com/) or custom domain root (e.g. http://www.yanniel.info/).

Don’t panic! There’s a workaround for this: luckily for bloggers, sitemaps can be generated as feeds, meaning that you can actually submit an RSS or Atom feed as a valid sitemap.

Blogger supports both RSS and Atom formats. Anyway, I advise you to use the Atom feed URL, because I have had problems when submitting the RSS URL. What problems? Well, I don’t recall now, but I’m pretty sure I had problems ;-)

Here is the Atom sitemap URL for Blogger (it works for both blogspot sub-domains and custom domains):

http://sub-domain.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500

http://www.customdomain.com/atom.xml?redirect=false&start-index=1&max-results=500

Basically, the important part is:

atom.xml?redirect=false&start-index=1&max-results=500

A brief explanation:
  • atom.xml indicates that you are requesting an XML Atom feed.
  • redirect=false prevents Blogger from redirecting your sitemap to a third-party feed service. This is very useful if you are using FeedBurner, because FeedBurner feeds are not recognized as valid sitemaps in most cases.
  • start-index=1 indicates that you want to syndicate starting from your first post. If you choose 10, for instance, your sitemap will start at post number 10.
  • max-results=500 tells Blogger to include 500 posts in your sitemap. You can change this number as well, but the count of posts syndicated in your feed will never exceed 500.
So, what happens if my blog has more than 500 posts? Well, you simply add a second sitemap, a third, and so on. See the URLs:

http://sub-domain.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500

http://sub-domain.blogspot.com/atom.xml?redirect=false&start-index=501&max-results=500

http://sub-domain.blogspot.com/atom.xml?redirect=false&start-index=1001&max-results=500
................................................
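By the way, if your blog has many posts and you don’t want to build these URLs by hand, here is a minimal Delphi sketch that prints them. PrintSitemapURLs and its parameters are hypothetical names of mine, not anything Blogger provides:

.................
uses
  SysUtils; // for Format

procedure PrintSitemapURLs(const aBlogRoot: string; aPostCount: Integer);
var
  StartIndex: Integer;
begin
  StartIndex := 1;
  while StartIndex <= aPostCount do
  begin
    // Each sitemap covers at most 500 posts, starting at StartIndex.
    Writeln(Format('%satom.xml?redirect=false&start-index=%d&max-results=500',
      [aBlogRoot, StartIndex]));
    Inc(StartIndex, 500); // next batch starts at 1, 501, 1001, ...
  end;
end;
.................

For instance, PrintSitemapURLs('http://sub-domain.blogspot.com/', 1200) prints exactly the three URLs above.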

There’s only one thing pending: you need to add your sitemaps to the major search engines. Submitting your sitemaps to each search engine is a slightly different process; I might cover those SEO topics in future posts.

Fetching a web page with Delphi

This function fetches the HTML content of a given web page. It takes the page's URL as a parameter and returns the corresponding HTML text. The name Curl comes from PHP's Client URL Library (cURL), which can be used (among other things) for the same purpose.
.................
implementation

uses
  IdHTTP;

function Curl(aURL: string): string;
const
  cUSER_AGENT = 'Mozilla/4.0 (MSIE 6.0; Windows NT 5.1)';
var
  IdHTTP: TIdHTTP;
  Stream: TStringStream;
begin
  Result := '';
  IdHTTP := TIdHTTP.Create(nil);
  Stream := TStringStream.Create(''); // pass '' so it also compiles on older Delphi versions
  try
    // Identify ourselves; some servers reject requests without a user agent.
    IdHTTP.Request.UserAgent := cUSER_AGENT;
    try
      // Download the page into the stream and hand back its text.
      IdHTTP.Get(aURL, Stream);
      Result := Stream.DataString;
    except
      // Swallow any HTTP/socket error and return an empty string instead.
      Result := '';
    end;
  finally
    Stream.Free;
    IdHTTP.Free;
  end;
end;
.................
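A quick (hypothetical) usage example, fetching this blog's home page:

.................
var
  HTML: string;
begin
  HTML := Curl('http://www.yanniel.info/'); // an empty result means the request failed
end;
.................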

You can modify this routine to have the web page saved to a file instead. For that, you only need to call the TStringStream.SaveToFile method instead of reading TStringStream.DataString. A sketch of that variant follows.
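Here is a minimal sketch of that variant; CurlToFile is just a name I made up, and I'm assuming a Delphi version where TStringStream inherits SaveToFile:

.................
procedure CurlToFile(const aURL, aFileName: string);
var
  IdHTTP: TIdHTTP;
  Stream: TStringStream;
begin
  IdHTTP := TIdHTTP.Create(nil);
  Stream := TStringStream.Create('');
  try
    IdHTTP.Get(aURL, Stream);       // download the page into the stream
    Stream.SaveToFile(aFileName);   // write the fetched HTML to disk
  finally
    Stream.Free;
    IdHTTP.Free;
  end;
end;
.................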

One final observation: you may change the cUSER_AGENT constant to whatever value you prefer. If you don’t specify a user agent, Indy will provide a default one.

Ah! Don’t forget to add IdHTTP to the uses clause!