HOWTO: Hidden Web Pages
by John Chambers

There is a lot of information on the World Wide Web that is not reachable by following hyperlinks, and which can only be found if you know the exact spelling of its URL. A number of studies have produced estimates that between 40% and 50% of the information on the Web is in such "hidden" pages.

I'm not talking about hiding things using any sort of encryption or inside databases that require invoking sophisticated CGI scripts. I'm talking about simple web pages that are "on the web" in the sense that anyone who knows the exact URL can type it into a browser and see your page, but the pages can't be found using any sort of search or by following hyperlinks. Hiding information this way is very easy, and requires no sophistication at all.

There's nothing tricky about this. The ability to hide information "in plain sight" has been part of the Web from the early days of the original NCSA and CERN web servers. It was part of their design. If you hide web pages this way, you aren't breaking any rules. You are using the Web the way it was intended to be used.

One very common reason for doing this is that you're working on a new part of your web site. You don't want to be bothered with email asking you dumb questions about the parts that aren't finished yet. This will just waste your time. So you "hide" it all from your public until it's finished. You can get at it from browsers, because you know the URL. You can do all the testing you need. Then, when you're ready, you can "release" it all instantaneously by adding a hyperlink from a public web page.

The basic technique is based on a simple concept: If you use a URL that names a directory, a web server will send you a listing of the directory's contents. You can click on any of the links in the directory listing to see the files or go down into subdirectories. However, if the directory contains a file called "index.html", the server doesn't send a listing, but sends the index.html file instead.
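
The rule in the previous paragraph can be sketched in a few lines of Python. This is only an illustration of the decision, not how any real server is written; the function names are made up for this example, and "index.html" is assumed to be the index name your server is configured to look for.

```python
import os

INDEX_NAME = "index.html"  # assumption: the index file name this server looks for


def choose_response(entries, index_name=INDEX_NAME):
    """Decide what to send for a directory whose contents are `entries`.

    If the index file is present, it is sent in place of the listing;
    otherwise the visitor gets the (sorted) list of names.
    """
    if index_name in entries:
        return ("file", index_name)
    return ("listing", sorted(entries))


def respond_to_directory(dir_path, index_name=INDEX_NAME):
    """Apply the same decision to a real directory on disk."""
    return choose_response(os.listdir(dir_path), index_name)
```

Calling choose_response(["b.txt", "a.txt"]) gives a listing of both names, while adding "index.html" to the list makes the function return that one file and nothing else. Everything that follows depends on that second case.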

For the computer programmers amongst us, I shouldn't need to say anything more. You know how it works; now go off and do it. But the previous paragraph may need a bit more explanation for some readers. So I'll walk you through how to set up hidden files on your web site.

This is all easiest if you are on a Unix-based server running Apache or any of its clones. These account for roughly two-thirds of all web servers. If your web site uses some other kind of server, the ideas are similar, but the names may have been changed.


First, make sure that you have a collection of files in your web directory. Maybe a few subdirectories with some little test files. Use a browser to make sure that they are all accessible. End with your browser showing a listing of your main web directory, or a subdirectory where you'd like to hide things.

Next, most if not all browsers have a "Save" command in some menu that lets you store the current web page into a file. On Netscape and Internet Explorer, it's in the File menu, labelled "Save As ..." or maybe "Save Page As ...". Use this, and save your directory listing in your web directory with the name "index.html".

Next, verify that this works. If your web server is set up in the usual fashion, when you tell your browser to reload your directory, you will see exactly the same thing as before. If you think about this, there's something curious about it. You just added "index.html" to your directory, but the listing doesn't show it. This is because the web server did not send you a listing of the directory. It sent you the index.html file, and the listing saved inside that file was produced before index.html itself existed. You already have one hidden web page, the "index.html" file itself. A visitor will think they're seeing a listing of your directory, but they're not.

(If the reload shows "index.html" in the listing, then your web server has been configured to use some other name for its index file. Look in the server's config files (on Apache, the DirectoryIndex directive sets this), or ask your web master, and rename your file to the name that your server uses.)

To further verify this, edit your index.html file. The html for the directory listing will be pretty obvious. Pick one of your files, and delete it from the list. Go back to your browser and tell it to reload the page. That file should no longer be in the list.

Voila! You now have a "hidden" web page. If you deleted the entry for a directory, the entire contents of that directory are hidden. The only way anyone can find this web page is if they know (or can guess) the exact spelling of its name.
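
If you'd rather not edit the file by hand, the deletion step can be scripted. Here is a minimal sketch in Python. It assumes that your saved listing has one entry per line and that each entry links to the file by its plain name (some browsers rewrite links when saving a page, in which case this won't match). The function name hide_entry is invented for this example.

```python
import re


def hide_entry(listing_html, name):
    """Return the listing with every line that links to `name` removed.

    Assumption: one entry per line, linked as href="name" for a file
    or href="name/" for a directory.
    """
    # Match HREF in any letter case, but keep the name itself case-sensitive.
    pattern = re.compile(r'(?i:href)="' + re.escape(name) + r'/?"')
    kept = [line for line in listing_html.splitlines()
            if not pattern.search(line)]
    return "\n".join(kept)
```

To use it, read index.html into a string, pass it through hide_entry along with the name you want to hide, and write the result back. Reload the page in your browser to check that the entry is gone.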

Yes, it really is that easy.

While you're at it, you might try some more experiments. Add a bit of text to the top of the index.html file. Change the contents of the <title>...</title> portion to something more descriptive than just "Index of ...". Tell your browser to reload the page, and verify that your changes took effect.

Next, create a new subdirectory. Give it a bizarre name that's difficult to spell. Something like "qv37Jbx". Move a few of your test files into it. Ask your browser to reload the listing, and verify that your new directory "doesn't exist".
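
If you can't think of a bizarre name, you can let the computer make one up. The following Python sketch picks random letters and digits; the length of 12 and the function name bizarre_name are arbitrary choices for this example, not anything the technique requires.

```python
import secrets
import string

# Letters and digits only, so the name is safe to use in a URL as-is.
ALPHABET = string.ascii_letters + string.digits


def bizarre_name(length=12):
    """Return a random, hard-to-guess directory name of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

Each call returns a different name, so write it down or bookmark the resulting URL right away.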

Got the idea? As long as that index.html file exists, you can put anything you want into your web directory, and nobody can find it. As long as there are no links to it, the only way to see it is to know the exact spelling of its name. If its name isn't obvious, it's as secure as any password. In particular, the search sites won't know about it and won't tell anyone how to find your pages.

If you like, you can leave your index.html file looking like a normal directory listing. This will fool people into thinking they're seeing all your web pages. But it's not very pretty, is it? So start learning some html, and edit index.html into something that looks nicer.

One of the common uses of this technique is to share information among a small crowd without it getting out to the general public. To do this, you just need an agreement among the members of your crowd that you will all be careful about linking to each other's hidden web pages. Links from one hidden page to another are ok, and bookmarks in browsers are ok. But if a public page links to a hidden page, the search engines will follow the link and discover the hidden page (and everything it links to).

You might want to verify this occasionally. Give your pages good titles, and ask a few search sites to find your titles. If they succeed, then some friend has linked to your hidden pages from a public page. Do a bit of name changing. Ask the search site to find the link(s) for you. Give your friend a good chewing out.
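
A local check can complement the search-site test, at least for public pages whose html you can get at. Here is a minimal Python sketch that reads the html of one public page and reports any links that mention one of your hidden names. The class and function names are invented for this example, and it only looks at ordinary <a href="..."> links.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us tag and attribute names in lower case.
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.links.append(value)


def leaked_links(public_html, hidden_names):
    """Return the links in a public page that contain any hidden name."""
    collector = LinkCollector()
    collector.feed(public_html)
    return [link for link in collector.links
            if any(name in link for name in hidden_names)]
```

For example, leaked_links(page_text, ["qv37Jbx"]) returns an empty list if the page is clean, and the offending links otherwise.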

How effective is this? Well, back in 1996 I created a directory on my site that contains a couple of "dirty pictures". They're not really porn; they're more like artistic photos. But they are naked women, so they'd offend some people. I put a description of them on my home page. Periodically I check the server logs, and use "ls -ul", to see if anyone has found them. So far, the server and ls both say that the files are unread. All I did was give my "porn" directory a bizarre name that's not a word in any language that I know. It's effective.
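
The "ls -ul" check can also be done from a script. The Python sketch below walks a directory and reports each file's last access time, which is the same timestamp that "ls -ul" shows. Be aware that some systems are set up to update access times lazily or not at all (the "relatime" and "noatime" mount options), so treat the result as a hint and compare it with your server logs. The function names are invented for this example.

```python
import os


def last_access_times(dir_path):
    """Map every file under dir_path to its last access time (seconds since the epoch)."""
    times = {}
    for root, _dirs, files in os.walk(dir_path):
        for name in files:
            path = os.path.join(root, name)
            times[path] = os.stat(path).st_atime
    return times


def accessed_since(times, cutoff):
    """Return the paths whose access time is later than `cutoff`, sorted by name."""
    return sorted(path for path, atime in times.items() if atime > cutoff)
```

Record the time when you finish setting up the hidden directory, and later pass it as the cutoff. An empty result suggests that nobody has read the files since then.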


By the way, there's a related naming trick that you might find useful. Try renaming "index.html" to "HEADER.html", and tell your browser to reload. If your web server is set up in the usual way, this will do something interesting: You'll see the same thing you saw before, but at the bottom will be a full directory listing. This doesn't hide anything. But you can use it to tell visitors whatever you think is important about your directory.

I've done this with my demo and tools directories. These are where I put a lot of things that I'd like to be able to get to from anywhere. I got a lot of email asking for more information, so I took the ones that I thought would be most interesting to others, and made a HEADER.html file that describes them. At the bottom is the full listing, which I use when I want to grab something. I'm not hiding anything, and I don't mind if people borrow my tools. But this is based on the same sort of server behavior as the "index.html" hiding trick.