Webmastering


Millions of people use Yahoo! to find information, and having your site in Yahoo! Search or the Yahoo! Directory can mean more sales, more conversations with people you wouldn't have met otherwise, and more hits for your web site. However, letting Yahoo! know that your site exists can be a bit confusing. There's a distinction between Yahoo! Search (http://search.yahoo.com) and the Yahoo! Directory (http://dir.yahoo.com), and the process for submitting your site to each is different.

If other sites on the Web link to your site, chances are good that Yahoo! has already added your site to its index. An index is simply another name for the total list of sites that Yahoo! is watching. Yahoo! Search relies on a crawler to find new sites and keep current sites up-to-date. If a site that's currently in Yahoo!'s index has linked to your site, the crawler has probably already visited your site and automatically added it to Yahoo!'s index.
You can see if Yahoo! is already indexing your site by searching for it with the url: meta keyword. Browse to http://search.yahoo.com and enter a query like this:
 url:http://insert your site 
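
If you check several sites regularly, the same query can be turned into a search URL programmatically. The Python sketch below is only an illustration: it assumes that Yahoo! Search accepts the query in a `p` parameter at `/search`, which the text above does not state, and `build_index_check_url` and `example.com` are hypothetical names.

```python
from urllib.parse import urlencode

# Assumption: the search form submits the query as the "p" parameter.
SEARCH_ENDPOINT = "http://search.yahoo.com/search"


def build_index_check_url(site_url):
    """Return a search URL that runs a url: query for site_url."""
    query = "url:" + site_url
    # urlencode percent-encodes the colon and slashes inside the query.
    return SEARCH_ENDPOINT + "?" + urlencode({"p": query})


if __name__ == "__main__":
    # example.com stands in for your own site.
    print(build_index_check_url("http://example.com/"))
```

Opening the printed URL in a browser should be equivalent to typing the url: query into the search box, provided the parameter assumption holds.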
 
While Yahoo! Search tries to include as many sites as possible in its index, the Yahoo! Directory is more like an exclusive club, where sites have to be approved by Yahoo! Editors. Because Yahoo! wants to maintain a highly useful directory, the steps for inclusion are a bit more involved.
To see if your site is already listed in the Yahoo! Directory, browse to http://dir.yahoo.com and search for the title of your site. If you don't see your site among the results, you can suggest your site to the Yahoo! Directory.

The first thing you need to determine about your site is whether it's commercial or noncommercial, because you'll need to pay $299 to submit a commercial site. According to Yahoo!, "If your site sells something, promote[s] goods and services, or represents a company that sells products and/or services," your site is commercial and should be listed somewhere in the Business and Economy category within the directory. If your site is purely personal, informational, or not-for-profit, your site is noncommercial. A banner ad or text ad on your site doesn't necessarily make your site commercial; if you have such an ad, it'll be up to the Yahoo! Editors to decide whether your site is commercial.

Yahoo! RSS

 
The Publisher's Guide contains a wealth of information about RSS, tools for generating "Add to My Yahoo!" buttons, and a form for submitting your RSS feed for indexing by Yahoo!.
As you update your RSS feed, you can notify My Yahoo! that you've done so by pinging the service at this URL:
http://api.my.yahoo.com/rss/ping?u=insert your feed's URL
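
Because the feed address is itself a URL placed inside another URL's query string, it is safest to percent-encode it first. The Python sketch below builds the ping URL that way; the encoding step is an assumption on my part rather than something the ping service is documented here to require, and `build_ping_url` and the sample feed address are hypothetical.

```python
from urllib.parse import quote

# Ping endpoint as given in the text above.
PING_ENDPOINT = "http://api.my.yahoo.com/rss/ping"


def build_ping_url(feed_url):
    """Return the ping URL for feed_url, with the feed address percent-encoded."""
    # safe="" makes quote() encode "/" and ":" as well, so the feed URL
    # cannot be confused with the structure of the ping URL itself.
    return PING_ENDPOINT + "?u=" + quote(feed_url, safe="")


if __name__ == "__main__":
    # Hypothetical feed address; replace it with your own.
    print(build_ping_url("http://example.com/index.xml"))
```

Requesting the resulting URL with any HTTP client after each feed update would send the ping.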

 
Imagine you have a directory on your server called /private and you'd like to keep its pages and files out of Yahoo! Search results. Apache offers many ways to set up authentication, but a straightforward method is to use a .htaccess file. The .htaccess file tells Apache how to configure a particular directory, and you can add a .htaccess file to the /private directory with the following information:

   AuthName "Please enter your login info."
   AuthType Basic
   AuthUserFile /your/path/to/.htpasswd
   AuthGroupFile /dev/null
   require user insert user name


Note that AuthUserFile points to a file that contains the username and password of the authenticated user, and you'll need to change /your/path/to/ to a real directory on your server that's not accessible via the Web. The next step is to create that password file with the htpasswd tool. Enter the following command from a command prompt:
htpasswd -c /your/path/to/.htpasswd insert user name
This creates the proper .htpasswd file for that user and puts in place all of the pieces for basic HTTP authentication.
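
Once these pieces are in place, every request to /private must carry the username and password. As a rough illustration of what basic HTTP authentication sends, the Python sketch below builds the Authorization header value a client would use. The function name and the sample credentials are hypothetical and are not part of Apache or of the setup above.

```python
import base64


def basic_auth_header(username, password):
    """Return the Authorization header value for HTTP Basic authentication."""
    # Basic authentication joins the two values with a colon and
    # base64-encodes the result. This is an encoding, not encryption.
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


if __name__ == "__main__":
    # Hypothetical credentials, for illustration only.
    print("Authorization:", basic_auth_header("someuser", "somepassword"))
```

A crawler such as Yahoo!'s has no credentials to send, so the server refuses its requests and the protected pages stay out of the index.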
 
 

robots.txt Exclusions

If server authentication seems like overkill and you'd rather make your directory or files available to everyone except Slurp, you can do so with a robots.txt file, which indicates how you'd like robots to behave at your site. Well-behaved bots (such as Slurp) check for robots.txt before indexing anything, to make sure they're acting as the site owner wants them to.
With robots.txt, you can tell Slurp that you'd like it to exclude certain directories or files from its crawl. For example, if you'd like Slurp to skip a directory called /private, save the following lines to a file called robots.txt:

   User-agent: Slurp
   Disallow: /private/

You can also tell Slurp to skip specific files:

   User-agent: Slurp
   Disallow: /Private.doc
   Disallow: /Private.html


Once you've listed all of the files and directories you'd like to hide, add robots.txt to the root directory of your web site, so it has a URL like this:
http://example.com/robots.txt
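
Before uploading the file, you may want to confirm that the rules say what you intend. The Python sketch below checks a set of rules with the standard library's `urllib.robotparser` module. It shows how that module interprets the rules, which may differ in detail from how Slurp itself reads them; the `is_allowed` helper, the `OtherBot` agent name, and the sample URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Rules combining the directory and file examples from the text.
ROBOTS_TXT = """\
User-agent: Slurp
Disallow: /private/
Disallow: /Private.doc
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    checks = [
        ("Slurp", "http://example.com/private/notes.html"),
        ("Slurp", "http://example.com/index.html"),
        ("OtherBot", "http://example.com/private/notes.html"),
    ]
    for agent, url in checks:
        print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

Because the rules name only Slurp, this parser treats any other agent as unrestricted, which matches the goal of keeping the directory available to everyone except Slurp.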