Robots.txt, Robots Meta Tag, .htaccess mod_rewrite

There are three commonly supported methods for telling internet indexing spiders, bots, and robots what to scan and what to skip. The methods complement each other, but they are not equal in effect.
Summary:
To actually enforce rules against a specific user agent visiting your website, you have to analyze traffic analytics, bandwidth reports, visiting IP addresses and their geographic locations, and known public or private proxy servers on an ongoing basis. You also need to learn the methods and tactics of every unwanted program and visitor, and be able to implement new countermeasures regularly as those methods change.
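As a rough illustration of the traffic analysis described above, here is a minimal Python sketch that counts requests per IP address and per user agent. It assumes the server writes its access log in the common "combined" log format; the sample lines, the `ExampleBot/1.0` agent name, and the `count_visitors` helper are made up for this example and do not come from any real site.

```python
import re
from collections import Counter

# Assumed layout (combined log format):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def count_visitors(lines):
    """Return (requests per IP, requests per user agent) for parseable lines."""
    by_ip = Counter()
    by_agent = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        by_ip[match.group("ip")] += 1
        by_agent[match.group("agent")] += 1
    return by_ip, by_agent

# Hypothetical sample data using documentation-only IP ranges.
SAMPLE_LOG = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [10/Oct/2023:13:55:37 +0000] "GET /tmp/ HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /page.php HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]

by_ip, by_agent = count_visitors(SAMPLE_LOG)
for agent, hits in by_agent.most_common():
    print(hits, agent)
```

A count like this only shows who is visiting and how often. Deciding which visitors are unwanted, and blocking them, is still a separate step.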
Robots Meta Tag

Use a meta tag embedded in the head of a specific page to tell search engine spiders and robots whether to index that page and whether to follow its links:
- Pages including “noindex, nofollow” indicate that they are NOT to be indexed, NOT to be included in listings, and that the links on them are NOT to be followed.
- Pages including “index, nofollow” indicate that they are to be indexed and listed, but that the links on them are not to be followed.
- Pages including “index, follow” indicate that they are to be fully indexed, included in all applicable listings, and that all links on them may be followed.
DO NOT index, DO NOT include in listings, and DO NOT follow links
<meta name="robots" content="noindex, nofollow" />
Index, include in listings, but DO NOT follow links
<meta name="robots" content="index, nofollow" />
Index, include in listings, and follow links
<meta name="robots" content="index, follow" />
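To see how a compliant robot might read these directives, here is a minimal Python sketch that extracts the robots meta tag from a page using the standard library's `html.parser`. The `RobotsMetaParser` class, the `robots_directives` helper, and the sample page are hypothetical names made up for this example; real crawlers may interpret the tag differently.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            # Directives are comma separated; normalize case and whitespace.
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )

def robots_directives(html):
    """Return the set of robots directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives

# Hypothetical page used only for illustration.
page = '<html><head><meta name="robots" content="noindex, nofollow" /></head><body></body></html>'
directives = robots_directives(page)
may_index = "noindex" not in directives
may_follow = "nofollow" not in directives
print(sorted(directives), may_index, may_follow)
```

The sketch treats a missing directive as permission, so a page without a robots meta tag counts as indexable and followable.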
Robots.txt
Robots.txt is a plain text file placed in the root directory of a website. Some search engine spiders and internet robot programs read it as a configuration file that tells them what you want indexed and what you don’t. Although many robots will read and follow the instructions in the “/robots.txt” file, less compliant programs may ignore the file completely.
Here are a few examples of a robots.txt file (plain text):
Ask all search engines to NOT index or follow links on the entire website:
# asks all search engines to NOT index and NOT follow any pages or links on the entire website
User-agent: *
Disallow: /
Allows all search engines to index and follow links on the entire website by Disallowing nothing:
# allows search engines to index and follow all pages and links on the entire website by Disallowing nothing
User-agent: *
Disallow:
Disallows specific folders and files from indexing and following:
User-agent: *
# since this folder may contain secure, private, cached or temporary files, we should disallow this entire folder from being indexed
Disallow: /uploads/
# since this folder may contain cached or temporary files, we should disallow this entire folder from being indexed
Disallow: /tmp/
Disallow: /page.php
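One way to check how a compliant robot would read rules like these is Python's standard `urllib.robotparser` module. The sketch below feeds it the third example directly as text. The `ExampleBot` agent name, the `www.example.com` host, and the `allowed` helper are placeholders made up for this example.

```python
from urllib.robotparser import RobotFileParser

# The "specific folders and files" example, one directive per line.
RULES = """\
User-agent: *
Disallow: /uploads/
Disallow: /tmp/
Disallow: /page.php
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

def allowed(path, agent="ExampleBot"):
    """Return True if the rules permit the (hypothetical) agent to fetch path."""
    return parser.can_fetch(agent, "https://www.example.com" + path)

for path in ("/index.html", "/uploads/file.txt", "/tmp/", "/page.php"):
    print(path, allowed(path))
```

This only shows what a well-behaved robot would do. A program that ignores robots.txt is not affected by any of these rules.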
HTML Character Codes – ASCII Special Characters
HTML codes to put ASCII special characters on your Web page
The following list includes the HTML codes for many of the ASCII symbols used on Web pages. The first section includes the first 255 character codes and their related HTML codes. Then, at the bottom you’ll find some other symbols and the HTML codes to create them. Not all browsers support all the codes, so be sure to test your HTML codes before you use them.
View Complete Set of HTML Character Codes for Special Characters & Symbols
jQuery UI
jQuery UI provides abstractions for low-level interaction and animation, advanced effects and high-level, themeable widgets, built on top of the jQuery JavaScript Library, that you can use to build highly interactive web applications.
jQuery UI is an open-source library of interface components (interactions, full-featured widgets, and animation effects) based on the jQuery JavaScript library. Each component is built according to jQuery’s event-driven architecture (find something, manipulate it) and is themeable, making it easy for developers of any skill level to integrate it into their own code and extend it.













