
2021 Complete Guide To .HTACCESS – From The Basics To Advanced Learning

With the .htaccess file, you can control many aspects of the Apache webserver (and its many variants). Below, you will learn everything you need to know to set up special error pages, password-protect directories, redirect URLs, and much more.

How to Use This Guide

This guide was built to serve as a comprehensive resource for using .htaccess. If you are completely new to using .htaccess — you might want to start with the first chapter “.htaccess Basics” below.

If you are searching for specific code samples or tutorials, look at the navigation on the right-hand side of this page to jump directly to sub-sections within this page.

.htaccess Basics

Let’s get acquainted with some .htaccess fundamentals before diving into commands.

What Is .htaccess?

The .htaccess file is a configuration file that controls how a webserver responds to various requests. It is supported by several webservers, including the popular Apache webserver used by most commercial web hosting providers.

.htaccess files operate at the level of a directory, allowing them to override global configuration settings and .htaccess directives higher in the directory tree.

How Is .htaccess Used?

Some common uses for .htaccess include redirecting URLs, enabling password protection for websites (or individual pages), displaying custom error pages (such as 404 pages), and boosting SEO through a consistent trailing-slash policy.

In the last case, the webmaster chooses whether every URL on the site must end with a trailing slash or must not.

Why Is It Called .htaccess?

.htaccess stands for “hypertext access.” The name is derived from the tool’s original use which was to control user access to certain files on a per-directory basis.

Using a subset of Apache’s httpd.conf settings directives, .htaccess allowed a system administrator to restrict access to individual directories to users with a name and password specified in an accompanying .htpasswd file.

While .htaccess files are still used for this, they are also used for a number of other things which we’ll cover in this guide.

Where Is the .htaccess File?

In theory, every folder (directory) on your server could have one. Generally, though, there is one in your web root folder — that’s the folder that holds all the content of your website, and is usually labeled something like public_html or www.

If you have a single directory that contains multiple website subdirectories, there will usually be an .htaccess file in the main root (public_html) directory and also one in each subdirectory (/sitename).

Why Can’t I Find My .htaccess File?

On most file systems, file names that begin with a dot ( . ) are hidden files. This means they are not typically visible by default.

But they aren’t hard to get to. Your FTP client or File Manager should have a setting for “show hidden files.” This will be in different places in different programs, but is usually in “Preferences”, “Settings”, or “Folder Options.” Sometimes you’ll find it in the “View” menu.

What If I Don’t Have an .htaccess File?

First of all, make sure that you have turned on “show hidden files” (or its equivalent), so that you can be sure you actually don’t have one. Often, .htaccess files are created automatically, so you will usually have one. But this isn’t always the case.

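If you really don’t have one, you can easily create one: make a new plain-text file in a text editor, name it .htaccess (with the leading dot and no file extension), and upload it to the directory you want it to control.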

Error Handling

Specifying error documents is one of the simplest things you can do with .htaccess files.

What Is an Error Code?

When a request is made to a web server, it tries to respond to that request, usually by delivering a document (in the case of HTML pages), or by accessing an application and returning the output (in the case of Content Management Systems and other web apps).

If something goes wrong with this, an error is generated. Different types of errors have different error codes. You are probably familiar with the 404 error, which is returned if the document cannot be found on the server.

There are many other error codes that a server can respond with.

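Client Request Errors

400 Bad Request
401 Unauthorized
403 Forbidden
404 Not Found
410 Gone

Server Errors

500 Internal Server Error
502 Bad Gateway
503 Service Unavailable
504 Gateway Timeout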

Default Error Handling

If you don’t specify any type of error handling, the server will simply return the message to the browser, and the browser will display a generic error message to the user. This is usually not ideal.

Specifying Error Documents

Create an HTML document for each error code you want to handle. You can name these whatever you like, but it’s helpful to name them something that will help you remember what they’re for, like not-found.html or simply 404.html.

Then, in the .htaccess file, specify which document to use with each type of error.

ErrorDocument 400 /errors/bad-request.html
ErrorDocument 401 /errors/auth-reqd.html
ErrorDocument 403 /errors/forbid.html
ErrorDocument 404 /errors/not-found.html
ErrorDocument 500 /errors/server-err.html

 

Notice that each directive is placed on its own line.

And that’s it. Very simple.
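As a rough sketch of the other forms the directive accepts, the second argument can also be a short quoted message or a full external URL. The message text and the example.com address below are placeholders, not values from this guide.

```apache
# Local document: the path starts with a slash and is measured from the site root
ErrorDocument 404 /errors/not-found.html

# Short inline message instead of a separate file
ErrorDocument 403 "Sorry, you are not allowed to view this page."

# External URL: the server answers with a redirect to this address
ErrorDocument 500 http://example.com/errors/server-err.html
```

With an external URL the browser receives a redirect instead of the original error status, so a local document is usually the safer choice when the status code matters.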

Alternatives to .htaccess For Error Handling

Most Content Management Systems (CMS) like WordPress and Drupal, and most web apps, will have their own way of handling most of these error codes.

Password Protection With .htaccess

The original purpose of .htaccess files was to restrict access to certain directories on a per-user basis (hence the name, hypertext access). So we’ll look at that first.

.htpasswd

Usernames and passwords for the .htaccess system are stored in a file named .htpasswd.

These are stored each on a single line, in the form:

username:encryptedpassword

 

for example:

johnsmith:F418zSM0k6tGI

 

It’s important to realize that the password stored in the file isn’t the actual password used to log in. Rather it is a cryptographic hash of the password.

This means that the password has been run through a hashing algorithm, and the result is stored. When a user logs in, the plain-text password they enter is run through the same algorithm. If the result matches the stored hash, the passwords match and the user is granted access.

Storing passwords this way makes them more secure: if someone were to gain access to your .htpasswd file, they would see only the hashed passwords, not the originals. A hash is one-way, so the original cannot be read back out of it directly, although weak passwords can still be found by guessing candidates and hashing them.

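Several different hashing algorithms can be used, depending on your server version and platform: bcrypt (generally considered the most secure of these), Apache’s own MD5-based scheme, SHA-1, and the traditional Unix crypt().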

Creating Usernames and Passwords on the Command Line

You can create an .htpasswd file, and add username-password pairs to it, directly from the command line or SSH terminal.

The command for dealing with the .htpasswd file is simply htpasswd.

To create a new .htpasswd file, use the command with the -c option (for create), then type the path to the file (not the URL, the actual path on the server), followed by the name of the user you want to add.

> htpasswd -c /usr/local/etc/.htpasswd johnsmith

 

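This creates a new .htpasswd file in the /usr/local/etc/ directory, and adds a record for a user named johnsmith. You will be prompted for a password, which is stored as a hash. On most installations the default is Apache’s MD5-based scheme.

Be careful with -c: if there is already an .htpasswd file at the specified location, it is overwritten and its existing users are lost. To add a user to an existing file, run the same command without -c.

If you’d prefer to use the bcrypt hashing algorithm, use the -B option (capital B). The lowercase -b option does something different: it takes the password from the command line instead of prompting for it.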

Password Hashing Without the Command Line

If you don’t feel comfortable using the command line or SSH terminal (or if you don’t have access to it for some reason), you can simply create an .htpasswd file and populate it using a plain text editor, and upload it via FTP or file manager.

But then you’ll need to hash your passwords somehow, since the htpasswd command was taking care of that for you.

There are many .htpasswd hashing utilities available online. The best one is probably the htpasswd generator at Aspirine.org.

This gives you several options for hashing algorithm and password strength. You can simply copy-and-paste the output from there into your .htpasswd file.

Where to Keep Your .htpasswd File

You don’t need to have a separate .htpasswd file for every .htaccess file. In fact, you shouldn’t. Under most normal circumstances, you should have one for your entire web hosting account or main server directory.

The .htpasswd file should not be in a publicly accessible directory — not public_html or www or any subdirectory. It should be above those, in a folder that is only accessible from the server itself.

How to Use .htpasswd With .htaccess

Each directory can have its own .htaccess file, with its own set of users which are allowed to access it.

If you want anyone (including non-logged-in users) to be able to access the directory and its files, simply do nothing. That is the default.

To restrict access you need to add the following to the .htaccess file:

AuthUserFile /usr/local/etc/.htpasswd
AuthName "Name of Secure Area"
AuthType Basic
<Limit GET POST>
require valid-user
</Limit>

 

The first line specifies the path and file name to your list of usernames and passwords. The second line specifies a name for the secured area. This can be anything you like. The third line specifies “Basic” authentication, which is what you usually need.

The <Limit> tag specifies what is being limited (in this case, the ability to GET or POST to any file in the directory). Within the pair of <Limit> tags is a list of who is allowed to access files.

In the above example, any valid user can access files. If you want to restrict access to a specific user or few users, you can name them.

AuthUserFile /usr/local/etc/.htpasswd
AuthName "Name of Secure Area"
AuthType Basic
<Limit GET POST>
require user johnsmith
require user janedoe
</Limit>

 

You can also put users into groups and allow access based on group. This is done by adding another file which specifies the groups.

The group file, which could be named (for example) .htgroup, looks like this:

admin: johnsmith janedoe
staff: jackdoe cindysmith

 

Then you can specify it in your .htaccess file:

AuthUserFile /usr/local/etc/.htpasswd
AuthGroupFile /usr/local/etc/.htgroup
AuthName "Admin Area"
AuthType Basic
<Limit GET POST>
require group admin
</Limit>
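One caveat about the examples above: wrapping the requirement in <Limit GET POST> protects only those two methods, so requests made with other HTTP methods are not covered. A minimal sketch of the same protection without the wrapper, reusing the file path and area name from the first example, might look like this:

```apache
AuthType Basic
AuthName "Name of Secure Area"
AuthUserFile /usr/local/etc/.htpasswd
# No <Limit> wrapper: the requirement applies to every HTTP method
Require valid-user
```

Unless you have a specific reason to treat methods differently, leaving <Limit> out is usually the safer default.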

Alternatives to .htpasswd

Using .htaccess and .htpasswd to restrict access to certain files on your server only really makes sense if you have a lot of static files. The feature was developed when web sites were usually a collection of HTML documents and related resources.

If you are using a content management system (CMS) like WordPress or Drupal, you can use the built-in user management features to restrict or grant access to content.

Enabling Server Side Includes (SSI)

Now let’s learn what Server Side Includes are and how you can use them.

What Are Server Side Includes?

SSI, or Server Side Includes, is a light-weight scripting language used primarily to embed HTML documents into other HTML documents. This makes it easy to re-use common elements, such as headers, footers, sidebars, and menus. You can think of it as a precursor to today’s templating and content management systems.

<!--#include virtual="header.shtml" -->

 

SSI also has conditional directives (if, else, etc.) and variables, making it a complete, if somewhat hard to use, scripting language. (Typically, any project more complicated than a handful of includes will cause a developer to choose a more robust language like PHP or Perl.)

Enabling SSI

Some web hosting servers will have Server Side Includes enabled by default. If not, you can enable it with your .htaccess file, like so:

AddType text/html .shtml
AddHandler server-parsed .shtml
Options Indexes FollowSymLinks Includes

 

This should enable SSI for all files that have the .shtml extension.
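On Apache 2.x, the same result is more commonly achieved with an output filter. The sketch below assumes mod_include is loaded and that your host allows Options to be changed from .htaccess. It uses +Includes so that options already in effect are kept, whereas the Options line above replaces them and also switches on directory listings (Indexes).

```apache
# Add Includes to whatever options are already in effect
Options +Includes
AddType text/html .shtml
# Run .shtml files through the SSI filter before sending them
AddOutputFilter INCLUDES .shtml
```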

SSI on .html files

If you want to enable SSI parsing on .html files, you can add a directive to accomplish that:

AddHandler server-parsed .html

 

The benefit of doing this is that you can use SSI without letting the world know you are using it. Also, if you change implementations in the future, you can keep your .html file extensions.

The downside of this is that every .html file will be parsed with SSI. If you have a lot of .html files that don’t actually need any SSI parsing, this can introduce a lot of unneeded server overhead, slowing down your page load times and using up CPU resources.

SSI on Your Index Page

If you don’t want to parse all .html files, but you do want to use SSI on your index (home) page, you’ll need to specify that in your .htaccess file.

That’s because when the web server is looking for a directory’s index page, it looks for index.html, unless you tell it otherwise.

If you aren’t parsing .html files, you’ll need your index page to be named index.shtml for SSI to work, and your server doesn’t know to look for that by default.

To enable that, simply add:

DirectoryIndex index.shtml index.html

 

This tells the web server to use index.shtml as the main index file for the directory. The second parameter, index.html, is a backup in case index.shtml can’t be found.

IP Blacklisting and IP Whitelisting

You can use .htaccess to block users from a specific IP address (blacklisting). This is useful if you have identified individual users from specific IP addresses which have caused problems.

You can also do the reverse, blocking everyone except visitors from a specific IP address (whitelisting). This is useful if you need to restrict access to only approved users.

Blacklisting by IP

To block specific IP addresses, simply use the following directive, with the appropriate IP addresses:

order allow,deny
deny from 111.22.3.4
deny from 198.51.100.
allow from all

The first line states that the allow directives will be evaluated first, before the deny directives. This means that allow from all will be the default state, and then only those matching the deny directives will be denied.

If this were reversed to order deny,allow, then the last thing evaluated would be the allow from all directive, which would allow everybody, overriding the deny statements.

Notice the third line, which has deny from 198.51.100. and is not a complete IP address. This will deny all IP addresses within that block (any that begin with 198.51.100).

You can include as many IP addresses as you like, one on each line, with a deny from directive.
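The order, allow, and deny directives come from Apache 2.2. On Apache 2.4 they still work through a compatibility module, but the Require directive is the preferred form. A rough equivalent of the blacklist, assuming Apache 2.4 and using placeholder addresses, could be written as:

```apache
<RequireAll>
    # Start by letting everyone in...
    Require all granted
    # ...then refuse one address and one partial block
    Require not ip 111.22.3.4
    Require not ip 198.51.100
</RequireAll>
```

It is generally best not to mix the old and new styles in the same file, since the combined behavior is hard to predict.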

Whitelisting by IP

The reverse of blacklisting is whitelisting — restricting everyone except those you specify.

As you may guess, the order directive has to be reversed, so that everyone is first denied, but then certain addresses are allowed.

order deny,allow
deny from all
allow from 111.22.3.4
allow from 198.51.100.
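On Apache 2.4 the whitelist is shorter, because anything that does not satisfy a Require line is refused. A minimal sketch, again with placeholder addresses:

```apache
# Only these addresses may access the directory; everyone else gets 403 Forbidden
Require ip 111.22.3.4
Require ip 198.51.100
```

When several Require lines appear outside a container like this, a visitor needs to match only one of them.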

Blocking Actions

.htaccess can be used to block users by domain or referrer. And you can use it to block bots and scrapers. Let’s find out how.

How to Block Users By Domain

You can also block or allow users based on a domain name. This can help block people even as they move from IP address to IP address.

However, this will not work against people who can control their reverse-DNS IP address mapping.

order allow,deny
deny from example.com
allow from all

 

This works for subdomains, as well — in the previous example, visitors from xyz.example.com will also be blocked.
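For servers running Apache 2.4, a sketch of the same block using the newer Require syntax follows. Like the version above, it relies on reverse DNS lookups of the visitor's address, which can slow requests down.

```apache
<RequireAll>
    Require all granted
    # Refuse visitors whose host name is example.com or ends in .example.com
    Require not host example.com
</RequireAll>
```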

How to Block Users by Referrer

A referrer is the website that contains a link to your site. When someone follows a link to a page on your site, the site they came from is the referrer.

This doesn’t just work for clickable hyperlinks to your website, though.

Pages anywhere on the internet can link directly to your images (“hotlinking”) — using your bandwidth, and possibly infringing on your copyright, without providing any benefit to you in terms of traffic. They can also hotlink to your CSS files, JS scripts, or other resources.

Most website owners are okay with this when it happens just a little bit, but sometimes this sort of thing can turn into abuse.

Additionally, sometimes actual in-text clickable hyperlinks are problematic, such as when they come from hostile websites.

For any of these reasons, you might want to block requests that come from specific referrers.

To do this, you need the mod_rewrite module enabled. This is enabled by default for most web hosts, but if it isn’t (or you aren’t sure) you can usually just ask your hosting company. (If they can’t or won’t enable it, you might want to think about a new host.)

The .htaccess directives that accomplish referrer-based blocking rely on the mod_rewrite engine.

The code to block by referrer looks like this:

RewriteEngine on
RewriteCond %{HTTP_REFERER} ^https?://.*example\.com [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://.*anotherexample\.com [NC,OR]
RewriteCond %{HTTP_REFERER} ^https?://.*onemoreexample\.com [NC]
RewriteRule .* - [F]

This is a little tricky, so let’s walk through it.

The first line, RewriteEngine on, turns on the rewrite engine so that the rewrite directives that follow are processed.

The next three lines each block one referring domain. %{HTTP_REFERER} is the server variable that holds the referrer sent by the browser, and the s? after http lets the pattern match both http and https referrers. The part you would need to change for your own use is the domain name (example) and extension (.com).

The backslash before the .com is an escape character. The pattern matching used in the domain name is a regular expression, and the dot has a special meaning in regular expressions, so it has to be “escaped” using the backslash.

The NC in the brackets specifies that the match should not be case sensitive. The OR is a literal “or”, and means that there are other conditions coming. (That is, if the referrer is this one or this one or this one, follow this rewrite rule.)

The last line is the actual rewrite rule. The [F] means “Forbidden.” Any requests with a referrer matching the ones in the list will fail, and deliver a 403 Forbidden error.

Blocking Bots and Web Scrapers

One of the more annoying aspects of managing a website is discovering that your bandwidth is being eaten up by non-human visitors — bots, crawlers, web scrapers.

These are programs that are designed to pull information out of your site, usually for the purpose of republishing it as part of some low-grade SEO operation.

There are, of course, legitimate bots, like those from major search engines. But the rest are like pests that just eat away at your resources and deliver no value to you whatsoever.

There are several hundred bots identified. You will never be able to block all of them, but you can keep the activity down to a dull roar by blocking as many as you can.

There is a useful set of rewrite rules which blocks over 400 known bots compiled by AskApache.
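Lists like that generally work by matching the User-Agent header each bot sends. The sketch below shows the basic pattern with two made-up bot names (BadBot and EvilScraper are illustrative only). It assumes mod_rewrite is enabled.

```apache
RewriteEngine On
# Hypothetical names: replace them with fragments of real User-Agent strings
RewriteCond %{HTTP_USER_AGENT} (BadBot|EvilScraper) [NC]
# Matching requests get a 403 Forbidden response
RewriteRule ^ - [F]
```

Keep in mind that a bot can send any User-Agent it likes, so this stops only the ones that identify themselves honestly.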

Specifying a Default File for a Directory

When a request is made to a web server for a URL which does not specify a file name, the assumption built into most web servers is that the URL refers to a directory.

So, if you request http://example.com, Apache (and most other web servers) is going to look in the root directory for the domain (usually /public_html or something similar, but perhaps /example-com) for the default file.

The default file, by default, is called index.html. This goes way back to the beginning of the internet when a website was just a collection of documents, and the “home” page was usually an index of those documents.

But you might not want index.html to be the default page. For example, you might need a different file type, like index.shtml, index.xml, or index.php.

Or you might not think of your home page as an “index,” and want to call it something different, like home.html or main.html.

Setting the Default Directory Page

.htaccess allows you to set the default page for a directory easily:

DirectoryIndex [filename here]

 

If you want your default to be home.html it’s as easy as:

DirectoryIndex home.html

Setting Multiple Default Pages

You can also specify more than one file in a DirectoryIndex directive:

DirectoryIndex index.php index.shtml index.html

 

The way this works is that the web server looks for the first one first. If it can’t find that, it looks for the second one, and so on.

Why would you want to do this? Surely you know which file you want to use as your default page, right?

Remember that .htaccess affects its own directory, and every subdirectory until it is overridden by a more local file. This means that an .htaccess file in your root directory can provide instructions for many subdirectories, and each one might have its own default page name.

Being able to place those rules in a single .htaccess file in the root means that you don’t have to duplicate all the other directives in the file at every directory level.

URL Redirects and URL Rewriting

One of the most common uses of .htaccess files is URL redirects.

URL redirects should be used when the URL for a document or resource has changed. This is especially helpful if you have reorganized your website or changed domain names.

301 vs. 302 Redirects

From a browser standpoint, there are two main types of redirects, 301 and 302. (These numbers refer to the HTTP status code returned by the web server.)

301 means “Moved Permanently,” while 302 means “Moved Temporarily.” In most cases, you want to use 301. This preserves any SEO equity the original URL had, passing it on to the new page.

Most browsers will also cache the old-to-new mapping of a 301, so they will simply request the new URL when a link or user attempts to access the original. Some clients may also update stored links such as bookmarks. If the URL has changed permanently, these are all desirable results.

There’s very little reason to use 302 redirects, since there’s usually very little reason to temporarily change a URL. Changing a URL ever is undesirable, but is sometimes necessary. Changing it temporarily, with the plan to change it back later, is a bad idea and is almost always avoidable.

All the examples in this section will use the 301 redirect.

Redirect vs. Rewrite

There are two different ways to “change” a URL with .htaccess directives — the Redirect command and the mod_rewrite engine.

The Redirect command actually sends a redirect message to the browser, telling it what other URL to look for.

Typically, the mod_rewrite tool “translates” one URL (the one provided in a request) into something that the file system or CMS will understand, and then handles the request as if the translated URL was the requested URL.

When used this way, the web browser doesn’t notice that anything happened — it just receives the content it asked for.

The mod_rewrite tool can also be used to produce 301 redirects that work the same way as the Redirect command, but with more options for rules — mod_rewrite can have complex pattern matching and rewriting instructions, which Redirect cannot take advantage of.
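To make the difference concrete, here is a small sketch of both uses of mod_rewrite in an .htaccess file placed in the web root. The file names (product.php, old-page.html, new-page.html) are hypothetical and stand in for whatever your site actually uses.

```apache
RewriteEngine On

# Internal rewrite: the browser keeps showing /products/42,
# while the server quietly runs /product.php?id=42
RewriteRule ^products/([0-9]+)$ /product.php?id=$1 [L]

# External redirect: the R=301 flag sends the browser a 301 response,
# and the browser then requests the new URL itself
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]
```

The only thing that turns a rewrite into a redirect here is the R flag. Without it, the change stays inside the server.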

Basic Page Redirect

To redirect one page to another URL, the code is:

Redirect 301 /relative-url.html http://example.com/full-url.html

 

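This single-line command has four parts, each separated with a single space: the Redirect command, the type of redirect (301), the path of the original page, and the full URL of the new page.

The path of the original page must begin with a slash and is measured from the root of the domain, not from the directory that contains the .htaccess file.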

So if http://example.com/blog.php had been moved to http://blog.example.com, the code would be:

Redirect 301 /blog.php http://blog.example.com

Redirecting a Large Section of Your Website

If you have moved your directory structure around, but kept your page names the same, you might want to redirect all requests for a certain directory to the new one.

Redirect 301 /old-directory http://example.com/new-directory
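Redirect matches by prefix, so a request for /old-directory/page.html is sent to http://example.com/new-directory/page.html with the rest of the path carried over. If you need finer control, the related RedirectMatch directive takes a regular expression instead. The sketch below is one illustrative case, moving only the .html pages:

```apache
# Regex variant of the directory redirect: only .html pages are moved,
# and $1 carries the matched file path over to the new location
RedirectMatch 301 ^/old-directory/(.+\.html)$ http://example.com/new-directory/$1
```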

 

Redirecting an Entire Site

What if your entire site has moved to a new URL? Easy.

Redirect 301 / http://newurl.com

Redirecting www to non-www

Increasingly, websites are moving away from the www subdomain.

It’s never really been necessary, but it was a holdover from the days when most people who operated a website were using a server to store lots of their own documents, and the www or “world wide web” directory was used for content they wanted to share with others.

These days, some people use it, and some people don’t. Unfortunately, some users still automatically type www. in front of every URL out of habit. If you’re not using www, you want to make sure that these requests land in the right place.

To do this, you’ll need to use the mod_rewrite module, which is probably already installed on your web host.

Options +FollowSymlinks
RewriteEngine on
RewriteCond %{HTTP_HOST} ^www\.example\.com [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

 

Be careful!

A lot of other .htaccess and mod_rewrite guides offer some variation of the following code to accomplish this:

Options +FollowSymlinks
RewriteEngine on
RewriteCond %{HTTP_HOST} !^example\.com [NC]
RewriteRule ^(.*)$ http://example.com/$1 [R=301,L]

 

Do you see the problem with that?

It redirects all subdomains to the primary domain. So not just www.example.com, but also blog.example.com and admin.example.com and anything else. This is probably not the behavior you want.
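If you would rather not hard-code the domain, one possible variation captures whatever follows www. in the host name and reuses it. This is a sketch assuming mod_rewrite is enabled and that the site is served over plain http, as in the examples above.

```apache
RewriteEngine On
# Match only hosts that start with "www."; %1 holds the rest of the host name
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
# Redirect to the same path on the host without the www prefix
RewriteRule ^ http://%1%{REQUEST_URI} [R=301,L]
```

Because the condition matches only hosts that begin with www., requests for blog.example.com or admin.example.com pass through untouched.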

Redirecting to www

But what if you are using the www subdomain?

You should probably set up a redirect to make sure people get to where they’re trying to go, especially now that fewer people are likely to automatically add that www to the beginning of URLs.

You just reverse the above code.

RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]

Should I Redirect 404 Errors to the Homepage?

Several guides on .htaccess redirects include instructions on how to make 404 errors redirect to the home page.

This is a good example of how just because you can do something, it doesn’t mean you should do something.

Redirecting 404 errors to the site’s homepage is a terrible idea. It confuses visitors, who can’t figure out why they are seeing the front page of a site instead of a proper 404 error page.

All websites should have a custom 404 page which clearly explains to the user that the content couldn’t be found and, ideally, offers some search features to help the user find what they were looking for.

Why Use .htaccess Instead of Alternatives?

You can set up redirects in PHP files, or with any other type of server-side scripting. You can also set them up within your Content Management System (which is basically the same thing).

But using .htaccess is usually the fastest type of redirect. With PHP-based redirects, or other server-side scripting languages, the entire request must be completed, and the script actually interpreted before a redirect message is sent to the browser.

With .htaccess redirects, the server responds directly to the request with the redirect message. This is much faster.

You should note, though, that some content management systems actually manage redirects by updating the .htaccess file programmatically. WordPress, for example, has redirect plugins that work this way. (And WP’s pretty URL system does this as well.)

This gives you the performance of using .htaccess directly, while also giving you the convenience of management from within your application.

Hiding Your .htaccess File: Security Considerations

There is no reason that someone should be able to view your .htaccess file from the web.

Moreover, there are some big reasons you should definitely not want people to see your .htaccess file.

The biggest issue is that if you are using an .htpasswd file, its location is spelled out in the .htaccess file. Knowing where the password file lives makes it an easier target.

Moreover, as a general rule, you don’t want to provide the public with details about your implementation.

Rewrite rules, directory settings, and security rules are all things you use .htaccess for, and it is good security practice to keep them behind the scenes on your web server. The more a hacker can learn about your system, the easier it is to compromise.

It is very easy to hide your .htaccess file from public view. Just add the following code:

<Files .htaccess>
order allow,deny
deny from all
</Files>
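On Apache 2.4, the same protection can be sketched with the newer Require directive. Many default Apache configurations already include a similar rule for all files beginning with .ht, so this may only be restating what your server does anyway.

```apache
<Files ".htaccess">
    # Refuse every web request for this file
    Require all denied
</Files>
```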

Enabling MIME types

MIME types are file types. They’re called MIME types because of their original association with email (MIME stands for “Multipurpose Internet Mail Extensions”). They aren’t just called “file types” because MIME implies a specific format for specifying the file type.

If you’ve ever authored an HTML document, you’ve likely specified a MIME type, even if you didn’t know it:

<link rel="stylesheet" type="text/css" href="/style.css" />

 

The type attribute refers to a specific MIME type.

MIME types on Your Server

Sometimes you’ll find that your web server isn’t configured to deliver a particular type of file. It just doesn’t work — requests for the file simply fail.

In most cases, you can fix this problem by adding the MIME type to your .htaccess file.

AddType text/richtext rtx 

 

This directive has three parts, each separated by a space: the AddType command, the MIME type, and the file extension.

If you want to associate several different file extensions with the same MIME type, you can do that on a single line.

AddType image/jpeg jpeg jpg jpe JPG 

Force Download by MIME Type

If you want all links to specific file types to launch as downloads, instead of being opened in the browser, you do that with the MIME type application/octet-stream, like this:

AddType application/octet-stream pdf

 

Again, you can specify multiple file extensions with a single type:

AddType application/octet-stream pdf doc docx rtf
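Changing the MIME type is a blunt tool, because it also hides the real file type from the browser. An alternative sketch, assuming the mod_headers module is available on your host, keeps the real type and instead sends a header asking the browser to save the file:

```apache
<IfModule mod_headers.c>
    # Apply only to the listed extensions
    <FilesMatch "\.(pdf|doc|docx|rtf)$">
        # Ask the browser to download the file instead of displaying it
        Header set Content-Disposition "attachment"
    </FilesMatch>
</IfModule>
```

The IfModule wrapper means the rule is simply skipped, instead of causing a server error, if mod_headers is not loaded.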

List of File Extensions and MIME Types

Here is a not-quite-complete list of file formats and associated MIME types.

If you are managing your own website, and you know what file types you publish resources in, then there is no need to paste this entire list into your .htaccess file.

However, if you run a site that many other people are contributing and publishing content to, you may want to simply allow a large number of file types this way to make sure no one has a bad experience.

This is especially the case if you run a site where people might be specifically sharing a lot of files, for example a file sharing site, a project management application (where many files will often be attached to a project), or a web app that handles email.

AddType application/macbinhex-40 hqx
AddType application/netalive net
AddType application/netalivelink nel
AddType application/octet-stream bin exe
AddType application/oda oda
AddType application/pdf pdf
AddType application/postscript ai eps ps
AddType application/rtf rtf
AddType application/x-bcpio bcpio
AddType application/x-cpio cpio
AddType application/x-csh csh
AddType application/x-director dcr
AddType application/x-director dir
AddType application/x-director dxr
AddType application/x-dvi dvi
AddType application/x-gtar gtar
AddType application/x-hdf hdf
AddType application/x-httpd-cgi cgi
AddType application/x-latex latex
AddType application/x-mif mif
AddType application/x-netcdf nc cdf
AddType application/x-onlive sds
AddType application/x-sh sh
AddType application/x-shar shar
AddType application/x-sv4cpio sv4cpio
AddType application/x-sv4crc sv4crc
AddType application/x-tar tar
AddType application/x-tcl tcl
AddType application/x-tex tex
AddType application/x-texinfo texinfo texi
AddType application/x-troff t tr roff
AddType application/x-troff-man man
AddType application/x-troff-me me
AddType application/x-troff-ms ms
AddType application/x-ustar ustar
AddType application/x-wais-source src
AddType application/zip zip
AddType audio/basic au snd
AddType audio/x-aiff aif aiff aifc
AddType audio/x-midi mid
AddType audio/x-pn-realaudio ram
AddType audio/x-wav wav
AddType image/gif gif GIF
AddType image/ief ief
AddType image/jpeg jpeg jpg jpe JPG
AddType image/tiff tiff tif
AddType image/x-cmu-raster ras
AddType image/x-portable-anymap pnm
AddType image/x-portable-bitmap pbm
AddType image/x-portable-graymap pgm
AddType image/x-portable-pixmap ppm
AddType image/x-rgb rgb
AddType image/x-xbitmap xbm
AddType image/x-xpixmap xpm
AddType image/x-xwindowdump xwd
AddType text/html html htm
AddType text/plain txt
AddType text/richtext rtx
AddType text/tab-separated-values tsv
AddType text/x-server-parsed-html shtml sht
AddType text/x-setext etx
AddType video/mpeg mpeg mpg mpe
AddType video/quicktime qt mov
AddType video/x-msvideo avi
AddType video/x-sgi-movie movie
AddType x-world/x-vrml wrl 

Block Hotlinking

Hotlinking is the practice of linking to resources from other domains instead of uploading the content to your own server and serving it yourself.

Say you find an image on a website that you really like, and you want to use it on your site. Ignoring copyright issues for the moment — you could download the image, upload it to your website, and embed it on your page like normal.

<img src="https://yourdomain.com/image.jpg">

 

But if you were lazy, or trying to save bandwidth, or didn’t know how to upload a file, you could just embed it directly from the original site.

<img src="https://originaldomain.com/image.jpg">

 

That’s hotlinking. It also happens with CSS and JS files, but images are the most common.

Some websites/hosts don’t mind at all if you do this — you can hotlink images from Wikipedia without anyone being upset. And some websites encourage it in one form or another.

For example, jQuery provides its JS libraries via a CDN (Content Delivery Network), so you can hotlink directly to them without having to upload them and serve them from your own server.

But many web hosts consider hotlinking to be a form of bandwidth and resource theft.

To be sure, if you are running a relatively small site, you can’t afford to have thousands, or tens of thousands, of requests being made every day for resources that have nothing to do with actual visitors to your site.

If you are having a problem with hotlinking, you can disable it with some mod_rewrite rules added to your .htaccess file.

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/.*$ [NC]
RewriteRule \.(gif|jpg|jpeg|png|js|css)$ - [F]

 

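Be sure to change example.com in the third line to your actual domain name. The second line lets through requests that send no referrer at all (direct visits, for example), and the third catches any request whose referrer is not your domain. The fourth line then checks whether the requested file has one of the listed extensions. If there is a match, the request fails with a 403 Forbidden response.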

If you want to add other file extensions, you can simply edit the last line.
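
If some outside sites should still be allowed to embed your files, one option is an extra condition per trusted referrer. The following is a minimal sketch rather than a definitive rule set; partner-site.example is a hypothetical domain used only for illustration, and the https? pattern is there to accept both HTTP and HTTPS referrers.

RewriteEngine on
# Let requests with no referrer through (direct visits, some privacy tools)
RewriteCond %{HTTP_REFERER} !^$
# Your own domain
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
# A trusted outside site (hypothetical name)
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?partner-site\.example/ [NC]
RewriteRule \.(gif|jpg|jpeg|png)$ - [F]

Because RewriteCond lines are ANDed by default, a request is only blocked when its referrer matches none of the allowed domains.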

Serving up Alternative Content

If you want to let the world know why their hotlinking has suddenly stopped working, you can replace hotlinked images with a special image with a message like, “We hate hotlinking!” or “Original Content Available at http://example.com”.

Instead of failing the request, you simply redirect it to the “special” image:

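RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/.*$ [NC]
# Skip the replacement image itself so the redirect cannot loop
RewriteCond %{REQUEST_URI} !/no-hotlinking\.jpg$
RewriteRule \.(gif|jpg)$ http://www.example.com/no-hotlinking.jpg [R,L]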

 

If you really want to mess with people, you can redirect JavaScript or CSS files to special alternatives that may have unfortunate effects for the hotlinker. This is not recommended, however.

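RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/.*$ [NC]
# Exclude the replacement file so the redirect cannot loop
RewriteCond %{REQUEST_URI} !/break-everything\.js$
RewriteRule \.(js)$ http://www.example.com/break-everything.js [R,L]

RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/.*$ [NC]
RewriteCond %{REQUEST_URI} !/super-ugly\.css$
RewriteRule \.(css)$ http://www.example.com/super-ugly.css [R,L]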

Disable or Enable Index

What happens if you have a directory full of documents or other resources, no index.html file, and no default directory page specified in the .htaccess file?

In many cases, the result will be a generic directory listing of all the files in the directory.

That’s right. If you have a folder in your hosting directory labeled /images, and it has no index.html page, when someone navigates to http://yoursite.com/images, they will be able to see a list of all the images on your site.

That’s the default behavior of most web servers, and it makes sense from the standpoint of the original conception of a website as simply a place to keep and share documents. But this is not the desired behavior for most sites.

Disabling Indexes

Many web hosting accounts will have disabled this already as part of their global configuration. But not all do so.

If you need to disable automatically generated directory listings, doing so is easy:

Options -Indexes

Enabling Indexes

If your web server has disabled indexes as part of global configuration, but you do want them, you can enable them with the reverse of the above command.

Options +Indexes

Hiding some files from the Index

If you want to show directory listings, but you want to hide certain file types from the list, you can do that too.

IndexIgnore *.gif *.jpg

 

The * is a wildcard character. The above directive will hide all files that have a .gif or .jpg extension. If you wanted to be more specific, you could name a single file:

IndexIgnore secret-image.jpg
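
Putting the pieces together, a directory that is meant to be browsable might use something like the following. This is only a sketch: the /downloads folder and the file patterns are assumptions chosen for illustration.

# .htaccess inside a hypothetical /downloads directory
Options +Indexes
# Hide the config file itself, backup copies, and one specific image
IndexIgnore .htaccess *.bak secret-image.jpg

Note that IndexIgnore only hides files from the generated listing. Anyone who knows the URL of a hidden file can still request it directly.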

Enabling CGI Everywhere

CGI, or Common Gateway Interface, is a standard way for a web server to run external programs (such as Perl scripts) and return their output as web pages.

Typically, CGI scripts are stored in a folder labeled /cgi-bin. The webserver is configured to treat any resource in that directory as a script, rather than a page.

The problem with that is twofold. First, URLs referencing CGI resources need to have /cgi-bin/ in them, which places implementation details into your URL, an anti-pattern to be avoided for a number of reasons.

Second, a complex website may need a better organization structure than simply having a ton of scripts jammed into a single /cgi-bin folder.

If you want your web server to parse CGI scripts no matter where they are found in your directory structure, just add the following to your .htaccess file:

AddHandler cgi-script .cgi
Options +ExecCGI 

 

If you have other file extensions you want processed as CGI scripts, you can add them in the first line.
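
For instance, to also run Perl and Python files as CGI scripts, the first line could be extended as in this sketch. The choice of .pl and .py is an assumption for illustration; the scripts still need to be executable and start with a valid interpreter line.

AddHandler cgi-script .cgi .pl .py
Options +ExecCGI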

Scripts as Source Code

Most of the time, you put scripts in your web directory because, well, you want them to be run as scripts.

But sometimes that isn’t what you want. Sometimes you want to display the source code to public visitors, instead of running the script.

This might be the case if you run a file sharing service or a code repository site, and you want people to see the source code and be able to download it, but the scripts are actually part of your site’s functionality.

This can be done in your .htaccess file by removing the script handler for certain file types and replacing it with a handler for text.

# RemoveHandler takes only file extensions, not a handler name
RemoveHandler .pl .cgi .php .py
AddType text/plain .pl .cgi .php .py

Alternatively, as mentioned previously, you could force files with these extensions to be downloaded automatically, rather than displayed.

RemoveHandler .pl .cgi .php .py
AddType application/octet-stream .pl .cgi .php .py

 

Be careful with either of these, though. If you only want some files to be displayed this way, but are still using these scripts for the rest of your website, you’re going to have a bad time if you put that directive into your web root’s .htaccess file.

A better practice would be to place all such “display only” scripts into a single directory, and then place the directive into an .htaccess file there in that folder.
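
As a sketch of that layout, assume a hypothetical /source-samples directory that holds only the scripts meant for display. Its own .htaccess file would then carry the directives, leaving the rest of the site untouched:

# .htaccess inside a hypothetical /source-samples directory
RemoveHandler .pl .cgi .php .py
AddType text/plain .pl .cgi .php .py

Whether this is enough depends on how the server maps these extensions to handlers, so it is worth testing with a harmless script before publishing anything sensitive.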

Configuring PHP Settings

Sometimes you need to tweak PHP’s settings. The right way to do this is in a file called php.ini.

Unfortunately, not all web hosting companies allow their customers to edit the php.ini file. This is especially true of shared hosting providers, where a single installation of PHP may be running hundreds of web sites.

Fortunately, there’s a workaround — you can embed php.ini rules into your .htaccess file.

The syntax looks like:

php_value [setting name] [value]

 

So, for example, if you need to increase the max file upload size (a common issue), it is as easy as:

php_value upload_max_filesize  10M

 

Not all PHP settings can be specified in .htaccess files. For example, you cannot set disable_classes this way.

For a complete list of all php.ini settings, see the official php.ini directives guide.
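
As a slightly larger sketch, the lines below raise the upload limits together and turn off on-screen error output. The values are arbitrary examples. Two assumptions to be aware of: these directives are only understood when PHP runs as an Apache module (on other setups they typically cause a 500 error), and boolean settings are set with php_flag rather than php_value.

# Example values only; adjust to your needs
php_value upload_max_filesize 10M
# post_max_size should be at least as large as upload_max_filesize
php_value post_max_size 12M
php_value memory_limit 128M
# Boolean settings use php_flag
php_flag display_errors off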

How to Prevent Access to Your PHP include Files

There are several ways to prevent unauthorized access to your PHP include files.

First, you can put them into a directory and set your .htaccess file to deny all access to that directory (i.e., Deny from all if you’re using the Apache HTTP Server). If someone does try to access the file, they will receive an HTTP 403 Forbidden response.
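
A sketch of such a deny-all file is shown below. The directory name /includes is hypothetical. Deny from all is the Apache 2.2 syntax; Apache 2.4 uses Require all denied instead, so the example checks which authorization module is loaded.

# .htaccess inside a hypothetical /includes directory
<IfModule mod_authz_core.c>
    # Apache 2.4 and later
    Require all denied
</IfModule>
<IfModule !mod_authz_core.c>
    # Apache 2.2
    Order deny,allow
    Deny from all
</IfModule>

PHP's include and require read files from disk rather than over HTTP, so your own scripts can still load files from this directory.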

Alternatively, you can store these files outside the directory from which your website files are served. That is, if your web server is serving files located in /srv/home, you can put your include files in a separate directory such as /srv/includes. This makes the files inaccessible via URLs, though your scripts can still load them as follows: include 'PATH_TO_YOUR_FILE';

Finally, you can define a constant in the files that visitors are meant to request directly:

define('WEBSITE_URL', 'http://example.com');

 

Then, for the files that you don’t want accessed, include the following check:

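// WEBSITE_URL is only defined when the request came through an allowed entry file
if (!defined('WEBSITE_URL')) {
    // Note the leading space: the status line must read e.g. "HTTP/1.1 403 Forbidden"
    header($_SERVER["SERVER_PROTOCOL"] . " 403 Forbidden");
    exit;
}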

How to Prevent Access to Your PHP ini Files

The way to prevent unauthorized access to your ini files is to edit your .htaccess file to deny access to the ini files (i.e., Deny from all if using Apache).
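
One way to do this for every file ending in .ini is sketched below, using the Apache 2.4 syntax (on Apache 2.2, Deny from all would take the place of the Require line). It assumes your host allows these directives in .htaccess files.

<FilesMatch "\.ini$">
    Require all denied
</FilesMatch>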

How to Set Your Server’s Time Zone

You can set your server’s time zone by specifying it in your .htaccess file. To do so, you will need to add the following line:

php_value date.timezone 'Region/Zone'

 

Make sure to replace Region/Zone with the time zone you’d prefer.
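
For example, assuming the site should run on London time:

php_value date.timezone 'Europe/London'

Valid names are listed in PHP's documentation of supported time zones.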

Save your file. You can test your changes by creating a PHP test file containing the following in the same directory as the .htaccess file:

<?php phpinfo(); ?>

 

Load the file in your browser, and search for the name of the directive – its Local Value column should display your new time zone setting.

When Not to Use .htaccess

Editing your .htaccess file for the first time can give you a sudden feeling of immense power over your web hosting environment. You suddenly feel like a sysadmin.

Unfortunately, this power can go to your head, and you may find yourself using the .htaccess file in ways that aren’t really the best.

When you need to do something that seems like an .htaccess sort of job, there are basically two situations where you should put that directive somewhere else.

Further Upstream

Whenever possible, the types of directives you can place in an .htaccess file are better off being placed in the httpd.conf file, which is a configuration settings file for the entire server.

Similarly, PHP settings more properly belong in the php.ini file, and most other languages have similar configuration setting files.

Placing directives further upstream, in the httpd.conf, php.ini, or other language-specific configuration file allows those settings to be “baked-in” to the web server’s parsing engine. With .htaccess, the directives have to be checked and interpreted with every single request.

If you have a low traffic site with only a handful of .htaccess directives, this isn’t a big deal. But if you have a lot of traffic, and a lot of directives, the performance lag can really add up.

Unfortunately, many shared hosting providers do not allow customers to access the httpd.conf or php.ini files, forcing users to rely on the slower .htaccess file.

This provides a double penalty when compared to custom VPS configurations because shared hosting is also generally low-powered. This is one of the reasons that a site with respectable traffic should probably be on a VPS plan instead of a shared hosting plan.
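
If you do have access to the main configuration, the same directives can be moved into a <Directory> block there. The following is a rough sketch, with /var/www/example standing in as a hypothetical document root and the two directives chosen only as examples:

# In httpd.conf or a virtual host file, not in .htaccess
<Directory "/var/www/example">
    # Stop Apache from looking for .htaccess files in this tree
    AllowOverride None
    # Directives that would otherwise live in .htaccess
    Options -Indexes
    ErrorDocument 404 /errors/404.html
</Directory>

Changes made here take effect only after Apache is reloaded or restarted.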

Further Downstream

If you are using a good Content Management System (CMS) such as WordPress or Drupal, some of the things you might do in an .htaccess file — such as redirect URLs or block IP addresses — can be done from inside the application.

Often, this works in conjunction with the .htaccess file, with the application programmatically adding directives.

When this is available, it is usually best to accomplish these tasks from inside the application, rather than editing the .htaccess file yourself. You are less likely to introduce bugs and incompatible directives if you use a well-tested, open-source plugin.

Troubleshooting

Messing around with your .htaccess file can be great — but it can also cause your server to seize up and start delivering 500 Internal Server Error messages.

Here are a few ideas to help you through that.

Do One Thing At a Time

This should go without saying, but — sadly — it’s a lesson many of us have to learn over and over again.

Do one thing. Then test it. Then do another thing. Test that.

If you do several things all at once, and then something fails, you won’t know which directive is causing the problem.

Back Up Your File Before Each Action

Along with doing only one thing at a time, you should save your file between each thing you are trying. Your saved archive needs to be restorable. This isn’t Microsoft Word where you can just Undo — you need a saved copy of your file.

You should always have the latest working version available in case you mess something up. Always, always, always have the ability to restore to a working version.

This is easiest if you use some kind of source management system like git. You can commit after each change, and roll back if you run into any problems.

Check the Error Logs

If you do run into a problem, and you’re having a hard time figuring out why, check your Apache error logs. These often provide valuable information about where to look.
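
If rewrite rules are the suspect and you can edit the main server configuration, Apache 2.4 can log how each rule is evaluated. This is a sketch for temporary debugging only; the directive goes in httpd.conf or a virtual host file, not in .htaccess, and the trace level shown is just an example.

# Keep the global level at warn, but trace mod_rewrite decisions
LogLevel warn rewrite:trace3

Remember to remove the trace setting afterwards, since it makes the error log grow quickly and slows the server down.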

Use Developer Forums to Get Help

Developer forums and Q&A sites like Stack Overflow are invaluable tools for even the most seasoned developers and sysadmins. And don’t forget Google. Often, the difference between a bad webmaster and a great one isn’t knowing the answer; it’s knowing where to find the answer.

Common .htaccess Problems

Sometimes you made a typo. Sometimes you have an esoteric and confusing problem caused by a confluence of unpredictable factors.

Most problems, and the really frustrating ones, are the ones in the middle — the simple, everyday problems that are easy to fix if you just knew about them.

Here are a few of those.

Bad Filename

There is only one way to spell .htaccess — it has to begin with the dot, and it must be in all lowercase letters.

It seems dumb, but if your .htaccess file isn’t doing what you expect, that should be the first thing you check.

.htaccess Disabled or Partly Disabled

Some shared hosting providers disable .htaccess altogether. Others allow it, but restrict certain directives from being used — they’re just ignored if included.

Similarly, even on VPS plans or your own dedicated servers, .htaccess might be disabled.

If you have access to the httpd.conf file, or other server settings, you can check this yourself. If you find the directive AllowOverride None, you found the culprit. Replace it with AllowOverride All.
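
A sketch of what the corrected section might look like is below. The path /var/www/html is an assumption; use the document root your server is actually configured with.

# httpd.conf or virtual host file
<Directory "/var/www/html">
    # Allow .htaccess files to override any directive they are permitted to set
    AllowOverride All
</Directory>

Apache has to be reloaded or restarted before the change takes effect. If you only need some directive groups, AllowOverride also accepts narrower values such as FileInfo, AuthConfig, Indexes, Options, and Limit.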

If you don’t have access to your httpd.conf file (because you’re on shared hosting, for example), you may need to contact your hosting company’s tech support and see if they can enable it for you, or offer you suggestions on how to accomplish what you’re trying to do in a different way.

Conflicting or Overridden Directives

If you have multiple nested directories, it’s possible for each one to have its own .htaccess file. Every .htaccess file from the root, through each nested directory, applies — they are read in order, descending down the directory tree.

If you set something in your root directory, and then something in a subdirectory overrides it, the directive in the .htaccess file closest to the requested file will take precedence.
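
As a small illustration, assume a site whose root .htaccess turns directory listings off while a hypothetical /downloads subdirectory turns them back on:

# /.htaccess (web root)
Options -Indexes

# /downloads/.htaccess
Options +Indexes

A request for /downloads/ gets a listing, because the file closest to the requested directory is applied last. Every other directory still inherits the root setting.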

Also see our mod-rewrite cheat sheet!

.htaccess Frequently Asked Questions

  • What is .htaccess file in SEO?

    The .htaccess file can be used to execute SEO-related tasks like redirects. Redirects can be used to avoid 404 error messages and to let search engine crawlers know which pages they should index. You can also set HTTP headers to improve page load speeds, which may boost your search engine ranking.

    In addition, you can use .htaccess to enact a consistent trailing slash policy. This, combined with www and HTTPS rules, can help you avoid duplicate content, which can be penalized by Google.

  • How do I create a .htaccess file in WordPress?

    To create an .htaccess file in WordPress, use this code:

    # BEGIN WordPress
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteRule ^index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>
    # END WordPress
    

    Note that when you install WordPress, the .htaccess file is automatically created. However, a faulty plugin can corrupt an .htaccess file, resulting in a need to re-create the file. 

  • Why can’t I see my .htaccess file?

    If you can’t see your .htaccess file, it’s because it doesn’t exist or it’s hidden. To force your FTP client to show hidden files, you’ll need to change your client settings (e.g., in FileZilla, go to Server > Force showing hidden files). If you’ve made this change and you still don’t see .htaccess, you will need to re-create it.

  • How many .htaccess files should I have?

    Most websites do not need more than one .htaccess file. That’s because an .htaccess file in the web root applies to that directory and every directory beneath it. However, when hosting multiple sites or complex applications, some webmasters may use more than one file per site in order to apply different settings to different directories.

  • Where is .htaccess in the cPanel?

    To see the .htaccess file, log in to your cPanel account. Then go to Files > File Manager. When asked to choose the directory, select Web Root and make sure that Show Hidden Files is checked. You should now be able to view your .htaccess file in cPanel.

  • What is the use of .htaccess file in CodeIgniter?

    The .htaccess file can be used in conjunction with CodeIgniter to create search engine friendly URLs. By default, CodeIgniter URLs include the index.php file. By using .htaccess, you can rewrite requests so that index.php no longer has to appear in your application’s URLs.
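
    A commonly used rule set for this is sketched below. It assumes the application sits in the web root and that mod_rewrite is enabled; the exact rules can vary between CodeIgniter versions and installations.

    RewriteEngine On
    # Leave requests for real files and directories alone
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    # Route everything else through index.php
    RewriteRule ^(.*)$ index.php/$1 [L]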
