Find Links to Any Web Site

When you browse to a web site you've never seen before, you don't have very much advance knowledge about the site. You might know that you've followed a link from a particular site that you read frequently, or you might have found the site in some search results for a certain search term. Of course, the site itself can tell you quite a bit, but that still doesn't give you any clues about where the site fits into the larger Web. With some searching at Yahoo!, you can get extra info about a site by using the special link: syntax.
If you want to find which sites link to a particular site, you can browse to http://search.yahoo.com and enter this query: link:insert URL.

 Instead of standard search results, Yahoo! will display a list of the sites that link to the URL you've specified in the query. For example, if you'd like to find out where the O'Reilly Hacks site fits into the Web, you could search for link:http://hacks.oreilly.com.
In the results, you immediately get a sense of how many pages link to the site and what kinds of sites are linking there. If you're browsing the Web, leaving a site to do a quick Yahoo! link: search can be annoying if you'd just like to get this sense about the current site you're visiting. To find the sites, you need to copy the current URL from your browser address bar, open a new window or tab, browse to Yahoo!, and then assemble the proper query. It's a quick process, but you can speed it up considerably with a bit of classic ASP and a JavaScript bookmarklet.

This hack uses JavaScript to get the URL of the current page you're viewing in your browser. From there, it passes the URL to a server-side script that assembles the proper Yahoo! query and fetches the top 10 results with Yahoo! Search Web Services. A new pop-up window will give a quick look at which sites are linking to the current page, without leaving your place.
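As a rough illustration of the query-assembly step, the sketch below builds a search URL for a link: query from a page URL. It is not the hack's own code: the function name buildLinkQueryUrl is hypothetical, and the /search path and p query parameter are assumptions about the Yahoo! search form. It targets the ordinary search page rather than Yahoo! Search Web Services.

```javascript
// Hypothetical helper: turns a page URL into a Yahoo! search URL
// for the corresponding link: query.
// Assumption: the search form lives at /search and reads the query from "p".
function buildLinkQueryUrl(pageUrl) {
  var query = "link:" + pageUrl;
  // encodeURIComponent escapes ":" and "/" so the query survives as one parameter.
  return "http://search.yahoo.com/search?p=" + encodeURIComponent(query);
}

// In a bookmarklet, pageUrl would typically be window.location.href,
// and the resulting URL could be handed to window.open().
```

Keeping the URL construction in one small function makes it easy to point the same bookmarklet at a server-side script instead of the search page.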

Spider the Yahoo Catalog

Writing a spider to spider an existing spider's site may seem convoluted, but it can prove useful when you're looking for location-based services. This hack walks through creating a framework for full-site spidering, including additional filters to lessen your load.
In this hack, you'll learn how to write a spider that crawls the Yahoo! group of portals. The choice of Yahoo! was obvious; because it is one of the largest Internet portals in existence, it can serve as an ideal example of how one goes about writing a portal spider.
But before we get to the gory details of code, let's define what exactly a portal spider is. While many may argue with such a classification, I maintain that a portal spider is a script that automatically downloads all documents from a preselected range of URLs found on the portal's site or a group of sites, as is the case with Yahoo!. A portal spider's main job is to walk from one document to another, extract URLs from downloaded HTML, process said URLs, and go to another document, repeating the cycle until it runs out of URLs to visit. Once you create code that describes such basic behavior, you can add additional functionality, turning your general portal spider into a specialized one.
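The cycle just described (download a document, extract its URLs, process them, move on) can be sketched in a few lines. This is a minimal JavaScript illustration, not the Perl spider from this hack: fetchPage, isAllowed, and maxPages are hypothetical parameters supplied by the caller, and fetchPage is assumed to return the page's HTML synchronously (or null on failure) to keep the loop easy to follow.

```javascript
// Pull href values out of raw HTML and resolve them against the page URL.
function extractLinks(html, baseUrl) {
  var links = [];
  var re = /href\s*=\s*["']([^"']+)["']/gi;
  var match;
  while ((match = re.exec(html)) !== null) {
    try {
      var resolved = new URL(match[1], baseUrl);
      resolved.hash = ""; // the same page with a different fragment is one document
      links.push(resolved.href);
    } catch (e) {
      // Skip anything that cannot be resolved to an absolute URL.
    }
  }
  return links;
}

// Walk from document to document until the queue is empty
// or maxPages documents have been visited.
function crawl(startUrl, fetchPage, isAllowed, maxPages) {
  var queue = [startUrl];
  var visited = new Set();
  while (queue.length > 0 && visited.size < maxPages) {
    var url = queue.shift();
    if (visited.has(url)) continue;
    visited.add(url);
    var html = fetchPage(url);
    if (!html) continue;
    extractLinks(html, url).forEach(function (link) {
      // isAllowed is the "preselected range of URLs" filter.
      if (isAllowed(link) && !visited.has(link)) queue.push(link);
    });
  }
  return Array.from(visited);
}
```

Passing the fetcher and the URL filter in as functions keeps the general behavior separate from any portal-specific rules, which is the separation the text recommends.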
Although writing a script that walks from one Yahoo! page to another sounds simple, it isn't, because there is no general pattern followed by all Yahoo! sites or sections within those sites. Furthermore, Yahoo! is not a single site with a nice link layout that can be described using a simple algorithm and a classic data structure. Instead, it is a collection of well over 30 thematic sites, each with its own document layout, naming conventions, and peculiarities in page design and URL patterns. For example, if you check links to the same directory section on different Yahoo! sites, you will find that some of them begin with http://www.yahoo.com/r, some begin with http://uk.yahoo.com/r/hp/dr, and others begin with http://kr.yahoo.com.

If you try to look for patterns, you will soon find yourself writing long if/elsif/else sections that are hard to maintain and need to be rewritten every time Yahoo! makes a small change to one of its sites. If you follow that route, you will soon discover that you need to write hundreds of lines of code to describe every kind of behavior you want to build into your spider.

This is particularly frustrating to programmers who expect to write code that uses elegant algorithms and nicely structured data. The hard truth about portals is that you cannot expect elegance and ease of spidering. Instead, prepare yourself for a lot of detective work and writing (and throwing away) chunks of code in a hit-and-miss fashion. Portal spiders are written in an organic, unstructured way, and the only rule you should follow is to keep things simple and add specific functionality only once you have the general behavior working.

Okay, with taxonomy and general advice behind us, we can get to the gist of the matter. The spider in this hack is a relatively simple tool for crawling Yahoo! sites. It makes no assumptions about the layout of the sites; in fact, it makes almost no assumptions whatsoever and can easily be adapted to other portals or even groups of portals. You can use it as a framework for writing specialized spiders.

Save the following code to a file called yspider.pl:


iPhone 5c 8GB - Launch

Apple is bringing its 8GB iPhone 5c to India. The 8GB variant was originally released back in March as the lowest cost current-generation iPhone option available.


The 16GB 5c currently retails in the region for 41,900 Rs., and the 8GB version in Europe is presently priced at 50 cheaper than the 16GB, which is the equivalent of 4,100 Rs. This means that the 8GB iPhone 5c could cost less than 35,000 Rs., which would be a huge selling point for Apple in India. Because of Apple's strong worldwide brand name, it is easy for them to sell their previous generation phones in regions like India and still turn a healthy profit.


CEO Tim Cook even discussed the growth Apple has enjoyed in the Indian market: "iPhone sales grew by strong double-digits year-over-year, and in India and Vietnam sales more than doubled."

 The 8GB iPhone 5c should drop in India in the next few weeks, and expect the low-cost device to start popping up

Detecting JavaScript and Cookie Compatibility

You may be expecting a huge dump of code to see if JavaScript and cookies are enabled. There's no way
that you'd want to go through with something like that at this point in the project, so the following
minimalist code is offered as a decent check for JavaScript compatibility:
<noscript>
You will not be able to view this site if JavaScript is not enabled.
Please turn on JavaScript to use this site.
</noscript>

That's all you need to put in your template view, and you're 100 percent covered. If they don't
have JavaScript turned on, they get this message. There really is no way to test to see if JavaScript is
turned on; after all, if it is off, you can't run a test to see if it is on:
if (true){
//do something here, we must be on
}else{
//well shucks, JavaScript turned off, there’s no way to send an error message!
}

The <noscript> option is very straightforward and displays just the error message. You might want to
add some branding to it, like the Claudia's Kids logo, maybe a phone number or other information, but
that's about as good as it gets. You could also contemplate removing the AJAX handlers from the
shopping carts, but that seems a bit much.

The same thing goes when checking for cookie support. You'll need just a small bit of code that will try
to set a test cookie with a known value (the example below uses the current timestamp). If the site can
write the cookie and retrieve it OK, then cookies are supported. If not, display an error message.

<script>
// Use the current timestamp as a test value that is unlikely to exist already.
var check_cookie = String(new Date().getTime());
document.cookie = "check_cookie=" + check_cookie + "; path=/";
// If the value cannot be read back, cookies are disabled.
if (document.cookie.indexOf(check_cookie) < 0) {
  alert("You will not be able to view this site if cookies are not enabled. " +
        "Please enable them now.");
}
</script>

CodeIgniter system Folder

The system/ folder is where all the action happens. This folder contains all the CodeIgniter code of
consequence, organized into various folders:

application — The application folder contains the application you're building. Basically, this
folder contains your models, views, controllers, and other code (like helpers and class
extensions). In other words, this folder is where you'll do 99 percent of your work.

cache — The cache folder contains all cached pages for your application. In Chapter 9, you learn
more about caching and how to turn your super-speedy development application into a
blazingly fast live application.

codeigniter — The codeigniter folder is where CodeIgniter's core classes live. You have almost no
reason to go in here. All of your work will occur in the application folder. Even if your intent is
to extend the CodeIgniter core, you would do it with hooks, and hooks live in the application
folder.

database — The database folder contains core database drivers and other database utilities. Again,
there's no good reason for you to be in this folder.

fonts — The fonts folder contains font-related information and utilities. Again, there's no reason
to spend any time here.

helpers — The helpers folder contains standard CodeIgniter helpers (such as date, cookie, and
URL helpers). You'll make frequent use of helpers in your CodeIgniter career and can even
extend helpers thanks to improvements introduced in CodeIgniter version 1.6.

language — The language folder contains language files. You can ignore it for now.

libraries — The libraries folder contains standard CodeIgniter libraries (to help you with e-mail,
calendars, file uploads, and more). You can create your own libraries or extend (and even
replace) standard ones, but those will be saved in the application/libraries directory to keep
them separate from the standard CodeIgniter libraries saved in this particular folder.

logs — The logs folder is the folder CodeIgniter uses to write error and other logs to.

plugins — The plugins folder contains plugins. Plugins and helpers are very similar, in that they
both allow developers to quickly address an issue or create content like forms, links, etc.
However, the main difference between them is that plugins usually consist of one function,
while helpers often have many functions bundled inside them.

CodeIgniter config.php

The config.php file contains a series of configuration options (all of them stored in a PHP array called,
appropriately enough, $config) that CodeIgniter uses to keep track of your application's information
and settings.

The first configuration option you need to set inside config.php is the base URL of your application. You
do that by setting the absolute URL (including the http:// part) for $config['base_url'], like so:
$config['base_url'] = "http://www.example.com/test/";

Once you've set this configuration option, you can recall it whenever you want using the CodeIgniter
base_url() function, which can be a very handy thing to know. This one feature keeps you from
having to rewrite hard-coded URLs in your application when you migrate from development to test or
from test to production.

The second thing you need to do is set a value for your home page by editing the
$config['index_page'] configuration option. CodeIgniter ships with a value of "index.php" for this
option, which means that index.php will appear in all of your URLs. Many CodeIgniter developers
prefer to keep this value blank, like so:
$config['index_page'] = '';
To make this work, you need to add an .htaccess file to the CodeIgniter root directory. After you've
set this option value, there's very little to do. For now, leave all the other values at their
default settings:
$config['uri_protocol'] = "AUTO";
$config['url_suffix'] = "";
$config['language'] = "english";
$config['charset'] = "UTF-8";
$config['enable_hooks'] = FALSE;
$config['subclass_prefix'] = 'MY_';
$config['permitted_uri_chars'] = 'a-z 0-9~%.:_-';
$config['enable_query_strings'] = FALSE;
$config['controller_trigger'] = 'c';
$config['function_trigger'] = 'm';
$config['log_threshold'] = 0;
$config['log_path'] = '';
$config['log_date_format'] = 'Y-m-d H:i:s';
$config['cache_path'] = '';
$config['encryption_key'] = "enter_a_32_character_string_here";
$config['sess_cookie_name'] = 'ci_session';
$config['sess_expiration'] = 7200;
$config['sess_encrypt_cookie'] = TRUE;
$config['sess_use_database'] = FALSE;
$config['sess_table_name'] = 'ci_sessions';
$config['sess_match_ip'] = FALSE;
$config['sess_match_useragent'] = TRUE;
$config['cookie_prefix'] = "";
$config['cookie_domain'] = "";
$config['cookie_path'] = "/";
$config['global_xss_filtering'] = TRUE;
$config['compress_output'] = FALSE;
$config['time_reference'] = 'local';
$config['rewrite_short_tags'] = FALSE;

For more details on each of these configuration options, simply read the comments embedded in
/system/application/config/config.php. You will also get more detail on certain settings as you work
through the sections of the book and tweak the configuration as needed. For example, at some point, you
will want to use encryption for security purposes or set your logging threshold for debugging, and they
both require making changes to this file.

CodeIgniter's Global XSS Filtering option is set to FALSE by default. The online User Guide suggests
that setting this to TRUE adds a lot of performance overhead to the system. However, at this point, it is
better to have some global protection put in place. That way you can be assured of some security
precautions while you're in development. Chapter 9 discusses security issues in more depth, but for
now, it's good to have something in place while you're developing.

In the same security vein, notice that sess_encrypt_cookie has been set to TRUE, and that you are to
enter a 32-character encryption salt in encryption_key. Doing these two things will encrypt any
sessions and provide a salt for any hashing methods you use. Be sure to use a random string of upper-
and lowercase letters and numbers.

One final note before moving on: Make sure that you write down your encryption key and keep it safe
somewhere, or, at least, maintain good backups. You'll need the key to retrieve other information, so if
your site is compromised or erased or if you lose your key any other way, you'll be glad you have a
record of it.

Yahoo! Directory

While Yahoo! Search tries to include as many sites as possible in its index, the Yahoo! Directory is more like an exclusive club, where sites have to be approved by Yahoo! Editors. Because Yahoo! wants to maintain a highly useful directory, the steps for inclusion are a bit more involved.
To see if your site is already listed in the Yahoo! Directory, browse to http://dir.yahoo.com and search for the title of your site. If you don't see your site among the results, you can suggest your site to the Yahoo! Directory.

The first thing you need to determine about your site is whether it's commercial or noncommercial, because you'll need to pay $299 to submit a commercial site. According to Yahoo!, "If your site sells something, promote[s] goods and services, or represents a company that sells products and/or services," your site is commercial and should be listed somewhere in the Business and Economy category within the directory. If your site is purely personal, informational, or not-for-profit, your site is noncommercial. A banner ad or text ad on your site doesn't necessarily make your site commercial; if you have such an ad, it'll be up to the Yahoo! Editors to decide whether your site is commercial.


Adding a noncommercial site.
The first step to adding a noncommercial site is to find the appropriate category for your site. If you know of some sites that are similar to yours, you might try searching for the titles of those sites within the directory to see how they're categorized. Otherwise, start browsing through the directory at http://dir.yahoo.com for the most appropriate place for your site. If your site is a personal home page, browse to "Society and Culture" > People > Personal Home Pages. If your site is a weblog, you'll want to browse to "Computers and Internet" > Internet > World Wide Web > Weblogs.
Once you've found the appropriate category, click the "Suggest a Site" link at the top of the page. Choose Standard Consideration and follow the instructions for adding a site. You'll have the option to include a site title, URL, geographic location, and description. If you have suggestions about other categories that your site might be appropriate for, you can include those suggestions in notes to Yahoo! Editors.
Once you've made your submission, the waiting game begins. Yahoo! doesn't guarantee that all sites submitted will be reviewed, and many sites are not included in the directory. If your site doesn't show up in the directory within two or three weeks, you can resubmit your site using the same process. Multiple submissions in a short period of time could exclude your site from consideration altogether. To be guaranteed a response about your site's placement within the directory, you can submit your site as if it were a commercial site, paying the commercial fee.


What is a Web browser?

Software that lets a user view HTML documents and access files and software related to those documents. Originally developed to allow users to view or browse documents on the World Wide Web, Web browsers can blur the distinction between local and remote resources for the user by also providing access to documents on a network, an intranet, or the local hard drive. Web browser software is built on the concept of hyperlinks, which allow users to point and click with a mouse in order to jump from document to document in whatever order they desire. Most Web browsers are also capable of downloading and transferring files, providing access to newsgroups, displaying graphics embedded in the document, playing audio and video files associated with the document, and executing small programs, such as Java applets or ActiveX controls included by programmers in the documents. Helper applications or plug-ins are required by some Web browsers to accomplish one or more of these tasks. Also called: browser.

What is a timing attack?

An attack on a cryptographic system that exploits the fact that different cryptographic operations take slightly different amounts of time to process. The attacker exploits these slight time differences by carefully measuring the amount of time required to perform private key operations. Taking these measurements from a vulnerable system can reveal the entire secret key. Cryptographic tokens, network-based cryptosystems, and other applications where attackers can make reasonably accurate timing measurements are potentially at risk from this form of attack.
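To make the idea of a time difference concrete, the sketch below contrasts two ways of comparing a secret string with a guess. It is an illustration only and uses a simpler operation than the private key operations mentioned above: the function names leakyEqual and constantTimeEqual are hypothetical, and JavaScript engines give no guarantee about actual running time, so the second version shows the intent rather than a hardened implementation.

```javascript
// Early-exit comparison: it returns at the first mismatch, so the time taken
// depends on how many leading characters of the guess are correct.
function leakyEqual(secret, guess) {
  if (secret.length !== guess.length) return false;
  for (var i = 0; i < secret.length; i++) {
    if (secret.charCodeAt(i) !== guess.charCodeAt(i)) return false;
  }
  return true;
}

// Examines every character and accumulates differences, so the amount of
// work does not depend on where the first mismatch occurs.
function constantTimeEqual(secret, guess) {
  if (secret.length !== guess.length) return false;
  var diff = 0;
  for (var i = 0; i < secret.length; i++) {
    diff |= secret.charCodeAt(i) ^ guess.charCodeAt(i);
  }
  return diff === 0;
}
```

Both functions return the same answers; the difference lies only in how much the elapsed time can tell an observer about the secret.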

What is a macro assembler?

An assembler that can perform macro substitution and expansion. The programmer can define a macro that consists of several statements and then use the macro name later in the program, thus avoiding having to rewrite the statements. For example, a macro called swap might exchange the values of two variables. After defining swap, the programmer can then insert an instruction such as "swap a, b" in the assembly language program. While assembling, the assembler replaces the instruction with the statements within the macro that swap the values of the variables a and b.
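The substitution step can be sketched as plain text replacement. The following JavaScript is a toy model rather than a real assembler: the macros table, the expandLine function, and the mov pseudo-instructions with a tmp scratch location are all hypothetical and stand in for whatever statements an actual swap macro would contain.

```javascript
// Hypothetical macro table: each macro has parameter names and a body
// of statements written in terms of those parameters.
var macros = {
  swap: {
    params: ["x", "y"],
    body: ["mov tmp, x", "mov x, y", "mov y, tmp"]
  }
};

// Expand one source line. Lines that do not start with a known macro name
// are returned unchanged.
function expandLine(line) {
  var m = /^\s*(\w+)\s+(.*)$/.exec(line);
  if (!m || !Object.prototype.hasOwnProperty.call(macros, m[1])) return [line];
  var macro = macros[m[1]];
  var args = m[2].split(",").map(function (s) { return s.trim(); });
  return macro.body.map(function (stmt) {
    // Replace each parameter name with the matching argument in a single pass.
    return stmt.replace(/\b\w+\b/g, function (word) {
      var i = macro.params.indexOf(word);
      return i >= 0 ? args[i] : word;
    });
  });
}
```

With this table, expandLine("swap a, b") yields the three statements of the macro body with x and y replaced by a and b.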