Robots.txt, Robots Meta Tag, .htaccess mod_rewrite

There are three commonly supported methods for telling internet indexing spiders, bots and robots what to scan and what to skip. The methods complement each other, but they are not equal in effect.

  1. robots.txt
  2. Robots Meta Tag
  3. .htaccess and mod_rewrite
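
As a rough illustration of the first method, here is a minimal Python sketch of how a well-behaved crawler reads robots.txt rules before fetching a page. The robots.txt content, the /private/ path, the example.com URLs and the bot name SomeBot are all assumptions made up for the example, and only bots that choose to honor robots.txt behave this way.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot-Image out entirely,
# and keep every other robot out of /private/ only.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent, url, robots_txt=ROBOTS_TXT):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(is_allowed("Googlebot-Image", "http://example.com/images/photo.png"))
    print(is_allowed("SomeBot", "http://example.com/private/report.html"))
    print(is_allowed("SomeBot", "http://example.com/index.html"))
```

The check happens on the robot's side, which is why robots.txt is a request and not an enforcement mechanism.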

Summary:

To really enforce rules against a specific user agent that is visiting your website, you have to keep analyzing website traffic analytics, bandwidth reports, visiting IP addresses and their geographic locations, and known public or private proxy servers. You also have to learn the specific methods and tactics of EVERY unwanted program and visitor, and keep implementing new means to thwart them as their methods change.

Block Unwanted Visitors by IP Address or UserAgent in Apache using mod_rewrite

Use .htaccess rules to block unwanted bots, spiders and other UserAgents that never fetch robots.txt, or that fetch it and then ignore it.

Blocking visitors by IP address filtering in .htaccess file:

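# deny specific IP addresses, and allow all others
# (Apache 2.2 syntax; Apache 2.4 needs mod_access_compat for these directives)
# note: no space is allowed after the comma in "allow,deny"
order allow,deny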
deny from 123.45.6.7
deny from 123.45.6.8
deny from 123.45.6.9
allow from all
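
To make the logic of the rules above concrete, here is a small Python sketch of the same decision: deny a request when its address is on the list, allow it otherwise. It is only an illustration of the idea, not how Apache implements it, and the function name is_denied is an assumption for this example.

```python
from ipaddress import ip_address

# The same three addresses as in the .htaccess rules above.
DENIED_ADDRESSES = {
    ip_address("123.45.6.7"),
    ip_address("123.45.6.8"),
    ip_address("123.45.6.9"),
}


def is_denied(remote_addr):
    """Return True if the visitor's IP address is on the deny list."""
    return ip_address(remote_addr) in DENIED_ADDRESSES


if __name__ == "__main__":
    for addr in ("123.45.6.8", "123.45.6.10"):
        print(addr, "denied" if is_denied(addr) else "allowed")
```

A visitor who moves to a different address or goes through a proxy is no longer on the list, which is one reason the deny list needs regular upkeep.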


Block specific UserAgent using mod_rewrite

   # Block Google Images Bot from Indexing your Copyrighted Images
   # Googlebot-Image can also be disallowed by name in robots.txt;
   # this rule enforces the block at the server instead.
   RewriteEngine on
   # [NC] makes the user agent match case-insensitive
   RewriteCond %{HTTP_USER_AGENT} ^Googlebot-Image [NC]
   # Redirect the bot away; use "RewriteRule ^ - [F,L]" to answer 403 Forbidden instead
   RewriteRule ^(.*)$ http://images.google.com/ [R,L]


The catch with this method is that “sneaky” program developers can simply masquerade as “normal” visitors by sending a common web browser user agent string. This reinforces the point that all three of these methods are USEFUL, but even the precise use of all three is in no way a complete or secure solution.
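
The weakness is easy to see in a short Python sketch that applies the same pattern as the RewriteCond above to two user agent strings. The browser-style string is a made-up example of what a masquerading bot might send, and is_blocked is a hypothetical helper name.

```python
import re

# Same pattern as the RewriteCond: user agent must start with "Googlebot-Image".
BLOCKED_AGENT = re.compile(r"^Googlebot-Image")


def is_blocked(user_agent):
    """Return True if the user agent string matches the block pattern."""
    return BLOCKED_AGENT.search(user_agent) is not None


if __name__ == "__main__":
    honest_bot = "Googlebot-Image/1.0"
    # A bot can send any string it likes, including one copied from a browser.
    disguised_bot = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"
    print(is_blocked(honest_bot))
    print(is_blocked(disguised_bot))
```

The rule only ever sees the string the client chooses to send, so it stops honest bots and nothing else.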


Also see:

  1. robots.txt
  2. Robots Meta Tag

Robots Meta Tag

Use a meta tag embedded in the head of a specific page to tell search engine spiders and robots whether to index that page and whether to follow its links:

  1. Pages including “noindex, nofollow” indicate that they are NOT to be indexed, NOT to be included in listings, and that the links on them are NOT to be followed.
  2. Pages including “index, nofollow” indicate that they are to be indexed and listed, but that the links on them are not to be followed.
  3. Pages including “index, follow” indicate that they are to be fully indexed, included in all applicable listings, and that all links on them are to be followed.

DO NOT index, DO NOT include in listings, and DO NOT follow links

<meta name="robots" content="noindex, nofollow" />

Index, include in listings, but DO NOT follow links

<meta name="robots" content="index, nofollow" />

Index, include in listings, and follow links

<meta name="robots" content="index, follow" />
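
To check what a page is actually telling robots, a small Python sketch like the one below can pull the directives out of its robots meta tag. It uses only the standard library; the class name RobotsMetaParser, the helper robots_directives and the sample page are assumptions for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() != "robots":
            return
        content = attributes.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, nofollow".
        for part in content.split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(page_html):
    """Return the set of robots directives declared in page_html."""
    parser = RobotsMetaParser()
    parser.feed(page_html)
    return parser.directives


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="noindex, nofollow" /></head></html>'
    print(sorted(robots_directives(sample)))
```

An empty result means the page declares nothing, in which case robots typically fall back to indexing the page and following its links.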

Also see:

  1. Robots.txt
  2. .htaccess and mod_rewrite

Hyperlink Protocols: Create Hyperlinks to Email, Instant Messengers, VOIP programs & Phone Numbers

Google Talk Chat Hyperlinks

It’s occurred to me that hyperlinks are a part of everyday life for pretty much everyone these days. Whether you use them in basic internet browsing, copying and pasting YouTube and Facebook links to your friends, or making websites, knowing how to create hyperlinks is important!

Some people don’t realize there are many more kinds of hyperlinks than the ones that use the HTTP or HTTPS (HTTP over SSL/TLS encryption) protocols. Many of them are very useful to today’s socially-networked webmasters and internet gurus.

Here are some useful examples:

Email HyperLink:
Example Email Link

<a href="mailto:youremail@yourdomain.com?subject=Email%20Subject">Example Email Link</a>
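
If you generate email links from a script, the subject needs to be percent-encoded so that spaces become %20 (many mail programs show a + in a mailto link literally). Here is a minimal Python sketch; the function name mailto_link is an assumption, and it presumes the address itself contains no characters that need encoding.

```python
from html import escape
from urllib.parse import quote


def mailto_link(address, subject, link_text):
    """Build an HTML email hyperlink with a percent-encoded subject line."""
    href = f"mailto:{address}?subject={quote(subject)}"
    # escape() keeps the attribute value and the link text valid HTML.
    return f'<a href="{escape(href)}">{escape(link_text)}</a>'


if __name__ == "__main__":
    print(mailto_link("youremail@yourdomain.com", "Email Subject", "Example Email Link"))
```

The same approach extends to other mailto fields such as body or cc by appending them to the query string with an ampersand.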

Phone Number HyperLink:
Example Phone Link

<a title="Call 503-555-1212" href="tel:+1-503-555-1212">Example Phone Link</a>

Skype Call HyperLink:
Example Skype Call Hyperlink

<a title="Call YourScreenName on Skype" href="skype:yourscreenname?call">Example Skype Call Hyperlink</a>

Skype Chat HyperLink:
Example Skype Chat Hyperlink

<a title="Chat with YourScreenName on Skype" href="skype:yourscreenname?chat">Example Skype Chat Hyperlink</a>

Google Talk Instant Message HyperLink:
Example Google Talk IM Hyperlink

<a title="Instant Message YourScreenName" href="gtalk:chat?jid=yourscreenname">Example Google Talk IM Hyperlink</a>

Google Talk Call Hyperlink:
Example Google Talk Call Hyperlink

<a title="Call YourScreenName on Google Talk" href="gtalk:call?jid=yourscreenname">Example Google Talk Call Hyperlink</a>

MSN/Windows Live HyperLink:
Example MSN/Windows Live IM Hyperlink

<a title="Instant Message YourScreenName on MSN/Windows Live Messenger" href="msnim:chat?contact=yourscreenname">Example MSN/Windows Live IM Hyperlink</a>

Yahoo! Instant Messenger HyperLink:
Example Yahoo! Instant Messenger IM Hyperlink

<a title="Chat with YourScreenName on Yahoo! Instant Messenger" href="ymsgr:sendim?yourscreenname">Example Yahoo! Instant Messenger IM Hyperlink</a>

AOL Instant Messenger HyperLink:
Example AOL Instant Messenger IM Hyperlink

<a title="Chat with YourScreenName on AOL Instant Messenger" href="aim:goim?screenname=yourscreenname">Example AOL Instant Messenger IM Hyperlink</a>

I’ll update this post and add more useful examples as I think of them.

HTML Character Codes – ASCII Special Characters

HTML Character Codes for Special Characters & Symbols

HTML codes to put ASCII special characters on your Web page

The following list includes the HTML codes for many of the ASCII symbols used on Web pages. The first section includes the first 255 character codes and their related HTML codes. Then, at the bottom you’ll find some other symbols and the HTML codes to create them. Not all browsers support all the codes, so be sure to test your HTML codes before you use them.
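
For a quick way to look up or test a code yourself, here is a small Python sketch using the standard html module. The helper name numeric_code is an assumption for this example; escape and unescape are part of Python's standard library.

```python
from html import escape, unescape


def numeric_code(char):
    """Return the decimal HTML character code for a single character."""
    return f"&#{ord(char)};"


if __name__ == "__main__":
    # Decimal code for the copyright sign.
    print(numeric_code("©"))
    # Named and numeric codes decode to the same character.
    print(unescape("&copy; &#169;"))
    # Reserved characters are replaced with their named codes.
    print(escape("AT&T <b>"))
```

Running the codes through unescape before publishing is an easy way to confirm that a code stands for the character you expect.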

View Complete Set of HTML Character Codes for Special Characters & Symbols
