Adding cookies to your site

Cookies allow Web developers to overcome the anonymous nature of the Web. CNET tells you how to implement cookies on your website.

You'd think that anything called a cookie couldn't be controversial, but when it comes to Web cookies, you'd better think again. On the one hand, cookies are an incredibly useful tool for Web site builders. But on the other hand, many users are extremely upset about them.

Cookies were developed to help site builders overcome the anonymous nature of the Web. The technology enables developers to stash a user ID, a session ID, or some other bit of identifying data on the user's machine. That makes it possible for developers to get a sense of whom they're dealing with and what path the user is taking through the site.

So what's the big deal? After all, a client-side cookie is just a calling card, like a "made especially for John Jacob Hammerschmidt" label sewn into the lining of a customer's jacket. A cookie is no more threatening than a bartender who calls out a customer's name as they walk through the door.

Giving a user a cookie requires only some simple JavaScript or CGI script. But there's nothing simple about quelling the security and privacy concerns that cookies can raise among your users.

Paul Bonner is chief technical officer of Mediatruck in Austin, Texas. He wrote "LANs come home" for CNET.COM, an article that won a Computer Press Award for the best individual online how-to article of 1996.

What's a cookie?

A cookie is an HTTP header containing a string that a browser stores in a small text file on the user's hard drive. The file is saved in the Windows/Cookies directory (for Microsoft Internet Explorer) or in the Users folder (for Netscape Navigator).

The process is essentially benign. Cookies store information supplied by the user and read it back later. Cookies can't extract information from cookies belonging to other sites, nor can they interact with other data on the user's hard drive.

Cookies can't actually capture anything; they can only save or recall information. They're scratch-pad memory for the Web site, nothing more. Before any data can be stored in a cookie, a site must first gather that information--by asking the user to fill out a form, through a Webmaster's analysis of the user's actions to infer likely buying patterns, and so on. Cookies can be as large as 4K, but in practice, few exceed a couple hundred bytes.

Where did the term cookies come from? Lou Montulli, currently the protocols manager in Netscape's client product division, wrote the cookies specification for Navigator 1.0, the first browser to use the technology. Montulli says there's nothing particularly amusing about the origin of the name: "A cookie is a well-known computer science term that is used when describing an opaque piece of data held by an intermediary. The term fits the usage precisely; it's just not a well-known term outside of computer science circles."

Though companies have been writing cookie specifications for quite a while, standards groups are just now considering cookies. The Internet Engineering Task Force (IETF) is studying the technology as part of its investigation of state management.

Why cookies?

The purpose of cookies is to help sites overcome the fact that HTTP, the transfer protocol that drives the Web, is fundamentally stateless, with absolutely no concept of sessions. In other words, users are strangers to your Web site every time they access a page. No matter how much information the user supplied on a previous page or in a prior visit, without cookies, your site can't distinguish someone who's been online for 12 hours from a person who just breezed in from Yahoo.

HTTP can't tell to whom it's sending a page, or which other pages it has delivered to that user. This makes for efficient server operations, since HTTP doesn't have the burden of tracking users. But this setup also means considerably more work for site builders wanting to know who's at the other end of the line. Cookies provide a solution, tracing users from page to page on your site (that's why cookies' full proper name is persistent cookies).

That kind of information is important, because unless you know something about your users, you can't build a Web site that's responsive to their preferences, habits, or history. For example, if you create cookies to follow a user's path around your site, you can greet their arrival at a new page with custom elements reflecting that history. At CNET's DOWNLOAD.COM, cookies determine whether an incoming user should be served the PC or Macintosh page for software downloads. The site serves whichever platform the user received the last time they visited, not necessarily the one they're running at the moment.

On an informational Web site, a lack of user data is definitely annoying, but on a commerce-oriented site, it's downright crippling. Salespeople who can't remember the name of their best customers, or how those shoppers like to pay for merchandise, or whether or not they've already paid, aren't likely to be stars. It simply isn't possible to build a successful e-commerce site without some mechanism for recognizing users as they move from one page to another.

Cookies let you tag your visitors with just enough information to identify who they are and what they're up to. Depending on your server-side database facilities, that information could be quite minimal. For instance, if you use a database table on your server to record a customer's login information, preferred form of payment, delivery address, and the contents of their shopping cart, then the only thing you might need to store in the cookie is a pointer to the customer's record in that table. While you could use a cookie to save all of a customer's registration form data, it's far more efficient and secure to simply store an ID number in the cookie, and then use that number to retrieve the remainder of the information from the server-side database.
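The pointer-in-a-cookie approach can be sketched roughly as follows. This is only an illustration under stated assumptions: the in-memory customers table, the createCustomer and lookupCustomer helpers, and the custid cookie name are all hypothetical, and a real site would use a proper database and IDs that can't be guessed.

```javascript
// Hypothetical server-side table; a real site would use a database.
const customers = new Map();
let nextId = 1000;

// Keep the full record on the server and return only a short cookie string.
function createCustomer(record) {
  const id = String(nextId++);
  customers.set(id, record);
  return "custid=" + id; // the only data that travels to the browser
}

// Given the cookie string sent back by the browser, find the stored record.
function lookupCustomer(cookieHeader) {
  const pairs = cookieHeader.split(";");
  for (const pair of pairs) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue;
    const name = pair.substring(0, eq).trim();
    const value = pair.substring(eq + 1).trim();
    if (name === "custid") return customers.get(value) || null;
  }
  return null;
}

const cookie = createCustomer({ name: "John", payment: "invoice", cart: ["jacket"] });
console.log(cookie);                      // custid=1000
console.log(lookupCustomer(cookie).name); // John
```

The payment method and cart contents never leave the server; the browser holds nothing but the ID.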

Cookie alternatives

Cookies aren't the only way to maintain state variables. At least two other methods were in use long before anyone had ever heard of a cookie: transmitting state data via hidden fields in a form, and appending state data to the end of a URL.

As an example of the latter approach, consider a URL that ends with the designation ?cnet.tkr. The ?cnet.tkr suffix tells the server where the request originated, information the server wouldn't get with a standard request for that page.

Both hidden fields and URL state data have problems, however. To use hidden fields, you must process every page request as a "form Submit"--a method that looks increasingly anachronistic in the age of dynamic HTML. The URL suffix method, meanwhile, is rife with security concerns. For one thing, anyone looking over the site client's shoulder can view the identifying information attached to the URL. In our example above, the information just indicates where the request came from, but in other cases, a site might transmit a user's login name and password in the URL, exposing them to prying eyes.
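To make the URL suffix method concrete, here is a minimal sketch of carrying state in a URL. The appendState and readState helpers, the example.com address, and the session parameter are assumptions made up for illustration; they are not taken from any particular site.

```javascript
// Append a state value to a URL as a query parameter (hypothetical helper).
function appendState(url, key, value) {
  const separator = url.indexOf("?") === -1 ? "?" : "&";
  return url + separator + encodeURIComponent(key) + "=" + encodeURIComponent(value);
}

// Read a state value back out of a URL, or return null if it is absent.
function readState(url, key) {
  const q = url.indexOf("?");
  if (q === -1) return null;
  const pairs = url.substring(q + 1).split("&");
  for (const pair of pairs) {
    const [k, v = ""] = pair.split("=");
    if (decodeURIComponent(k) === key) return decodeURIComponent(v);
  }
  return null;
}

const link = appendState("http://www.example.com/catalog.html", "session", "a1b2c3");
console.log(link);                       // http://www.example.com/catalog.html?session=a1b2c3
console.log(readState(link, "session")); // a1b2c3
```

Note that the state value sits in plain view in the address, which is exactly the exposure described above.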

Cookies would seem to have it all over these methods in terms of ease of use, performance, flexibility, and reliability. So why are many Web users--and even some Web builders--scared to death of cookies?

The dark side of cookies

The ability of Web builders to employ cookies has been severely hampered by a torrent of bad press, wild rumors, and half-truths.

The technology got a bad rap from the beginning. At heart, cookies are nothing more than electronic Post-It notes for Web servers, but they attracted the attention of conspiracy theorists and neosurvivalists who concocted a barrage of stories about all the horrible things cookies could allegedly do. People claimed that any random site could read the cookies on a user's hard disk, that cookies could be used to steal information from hard disks, and that cookies posed various other threats to users' privacy and security.

Most--if not all--of these claims were patently false. For instance, a cookie can be read only by the site domain that created it, and can store only information supplied by the site or by the user. There is no way a cookie can rifle through the contents of the user's hard drive, nor can it haphazardly broadcast the client's private data across the Internet.
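The rule that a cookie goes back only to the domain that created it can be approximated in a few lines. This is a simplified sketch, not the exact algorithm any browser uses; real browsers also check the path, the secure flag, and other restrictions. The domainMatches name and the example.com hosts are hypothetical.

```javascript
// Simplified check: would a cookie scoped to cookieDomain be sent to host?
// Assumption: the host must equal the domain or be a subdomain of it.
function domainMatches(host, cookieDomain) {
  const h = host.toLowerCase();
  const d = cookieDomain.toLowerCase().replace(/^\./, ""); // ignore a leading dot
  return h === d || h.endsWith("." + d);
}

console.log(domainMatches("www.example.com", "example.com")); // true
console.log(domainMatches("www.other.com", "example.com"));   // false
console.log(domainMatches("evilexample.com", "example.com")); // false
```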

Word is slowly spreading that cookies aren't the poison pellets they've been made out to be. In fact, the cookie controversy might have faded by now, except that at least one banner advertising network has used cookies to track users' Web activities in a manner that many people find objectionable. The DoubleClick Network attempts to develop customer profiles and present those users with banner ads targeted toward their interests. Each time a visitor connects to a DoubleClick site, the DoubleClick server reads and/or writes a cookie to their hard disk, in the process compiling extensive data about the user's activities on those sites.

DoubleClick is quick to point out that it does not gather or store usernames, email addresses, or telephone numbers; nor does it sell or rent the information it collects. Rather, the company uses the information solely to deliver customized advertising on DoubleClick Network sites. Nevertheless, enough users have found this particular trail of bread crumbs so unacceptable that DoubleClick was forced to provide a free method to disable its tracking mechanism. Ironically, this function--the key that locks up what many perceive as DoubleClick's evil cookie jar--is itself a cookie.

If anything, the furor over DoubleClick demonstrates the fundamentally benign nature of cookies. The technology does not open the floodgates for rampant abuse of privacy, nor does it blast big security holes in users' systems. You are in far more peril when you hand your credit card to a waiter than when you accept a cookie from a respectable Web site, or for that matter give your credit card number to the site.

As in any transaction, however, smart buyers should take every step to ensure that they're dealing with reputable sites. Many factors go into establishing trustworthiness, beginning with Web builders prominently displaying a policy statement outlining exactly how they use cookies in their sites.

The basic cookie recipe

Web builders can create cookies by using a CGI program or JavaScript. JavaScript is simpler and doesn't require server-side programming. However, with CGI or any other server-side tool, the steps for implementing cookies are nearly the same, since all cookie processing is performed by the browser's Document Object. So whether you're using CGI, Microsoft's Active Server Pages (supported by IIS 3.0 or later), or JavaScript, you use the same code to tell the browser to read, write, or delete a cookie.

If you search the Web for the JavaScript code to create cookies, you'll find a thousand examples. Nearly all of them are taken directly from the classic code developed and placed in the public domain by Bill Dortch, of hIdaho Design.

Dortch's SetCookie function provides an efficient mechanism for setting the full range of parameters associated with a cookie: its name, value, expiration date and time, and path and domain; as well as whether or not it requires a secure page:

function SetCookie(name,value,expires,path,domain,secure) {
  // Only name and value are required; each optional attribute is
  // appended only if the caller supplied it.
  document.cookie = name + "=" + escape(value) +
    ((expires) ? "; expires=" + expires.toGMTString() : "") +
    ((path) ? "; path=" + path : "") +
    ((domain) ? "; domain=" + domain : "") +
    ((secure) ? "; secure" : "");
}

Note: the expiration date parameter is especially useful in e-commerce sites and other places where you may want the user to log in again after a period of inactivity. That helps ensure that you're still dealing with the same customer. For instance, every time a registered user does something on your site, you might write a cookie with an expiration time set to 90 minutes, and then read the cookie back before accepting the user's next action. If the cookie had expired, you'd know that more than 90 minutes had passed since the user last clicked a link or otherwise interacted with your site, and so you could request that they log in again before proceeding.
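The 90-minute inactivity window might be sketched like this. The sessionExpiry and sessionCookieString helpers and the session cookie name are hypothetical, and the sketch uses toUTCString, the current name for the toGMTString call in Dortch's code. The functions only build strings, so in a browser you would assign the result to document.cookie, or pass the Date from sessionExpiry as the expires argument of SetCookie.

```javascript
const SESSION_MINUTES = 90;

// Compute the expiration Date for a cookie written at time `now`.
function sessionExpiry(now) {
  return new Date(now.getTime() + SESSION_MINUTES * 60 * 1000);
}

// Build the cookie string to rewrite after each user action.
function sessionCookieString(userId, now) {
  return "session=" + encodeURIComponent(userId) +
    "; expires=" + sessionExpiry(now).toUTCString() +
    "; path=/";
}

// A fixed timestamp keeps the example reproducible; a live page would use new Date().
const now = new Date(Date.UTC(1997, 0, 1, 12, 0, 0));
console.log(sessionCookieString("u42", now));
// session=u42; expires=Wed, 01 Jan 1997 13:30:00 GMT; path=/
```

Because the cookie is rewritten on every action, the clock restarts each time; if the cookie is missing on the next request, the window has lapsed and the site can ask for a fresh login.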

The JavaScript for reading a cookie is also a fairly straightforward function (this too is Dortch's code):

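function GetCookie(name) {
  var arg = name + "=";
  var alen = arg.length;
  var clen = document.cookie.length;
  var i = 0;
  // Walk the cookie string one name=value pair at a time.
  while (i < clen) {
    var j = i + alen;
    if (document.cookie.substring(i, j) == arg)
      return getCookieVal(j);
    // Jump to the character after the next space, where the next pair starts.
    i = document.cookie.indexOf(" ", i) + 1;
    if (i == 0) break; // no further pairs
  }
  return null;
}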

Another function, getCookieVal, extracts the cookie's value from the string that's returned by reading the document cookie:

function getCookieVal(offset) {
  // The value runs from offset to the next semicolon, or to the end of the string.
  var endstr = document.cookie.indexOf(";", offset);
  if (endstr == -1) endstr = document.cookie.length;
  return unescape(document.cookie.substring(offset, endstr));
}

Deleting an existing cookie is even simpler--merely set its expiration attribute to a date that has already occurred, like so:

document.cookie =
  "FirstName=John; expires=Fri, 01-Apr-1994 07:00:00 GMT";

To see more sample JavaScript for manipulating cookies, check out Bill Dortch's Cookie Functions or the Cookie Demos page on Cookie Central, where you'll also find code for CGI-generated cookies.
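The deletion technique described above can also be wrapped in a small helper. This is a sketch rather than part of Dortch's library: the deleteCookieString name is hypothetical, and the fixed 1970 date is just a convenient moment that has already passed.

```javascript
// Build a string that expires the named cookie. The path (and the domain, if
// one was used) should match the values given when the cookie was set.
function deleteCookieString(name, path) {
  return name + "=; expires=Thu, 01 Jan 1970 00:00:00 GMT" +
    (path ? "; path=" + path : "");
}

console.log(deleteCookieString("FirstName", "/"));
// FirstName=; expires=Thu, 01 Jan 1970 00:00:00 GMT; path=/

// In a browser: document.cookie = deleteCookieString("FirstName", "/");
```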

Do state your cookie policy

Every site that gathers data from its users should include a disclosure statement with specific information about how the site uses cookies. Digital Voodoo (a Web development firm based in Austin, Texas) uses the following statement: "Our Web site utilizes cookies in order to provide a better experience. We do not collect personal information without your knowledge and permission, nor do we resell or distribute any site visitor data, including information that may be collected as a natural by-product of your visit. If you have any questions about our policy, please contact our Webmaster."

The widespread concern about cookies has very little to do with the technology and very much to do with the fear that unscrupulous site operators might use cookies for unacceptable practices. There's no reason to be afraid of or antagonistic toward the store clerk who remembers that you bought a tweed jacket last month, but nobody wants to deal with the salesperson who pockets credit card carbons. A clear policy statement can show that in the hands of honest operators, cookies are anything but ominous.

Do use cookies to improve the user experience

Cookies help a lot when you want to improve the quality of a user's visit. You can store information in a cookie that will let you deliver a better, more personalized experience the next time the visitor comes to your site.

For example, if you offer two versions of your site, one with frames and one without, there's no need to make the user choose each time they visit. Instead, you can save their selection in a cookie, and then the next time they come to your door, your server can read the cookie and automatically deliver the preferred site type.
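A frames preference of this kind might be handled as in the sketch below. The layout cookie name, the page file names, and the readCookie and choosePage helpers are all hypothetical; the functions take the cookie string as an argument, so the same logic could run on the server or, given document.cookie, in the browser.

```javascript
// Return the value of a named cookie from a cookie string, or null if absent.
function readCookie(cookieHeader, name) {
  const pairs = cookieHeader.split(";");
  for (const pair of pairs) {
    const eq = pair.indexOf("=");
    if (eq === -1) continue;
    if (pair.substring(0, eq).trim() === name) {
      return decodeURIComponent(pair.substring(eq + 1).trim());
    }
  }
  return null;
}

// Choose which version of the site to serve (page names are made up).
function choosePage(cookieHeader) {
  const layout = readCookie(cookieHeader, "layout");
  if (layout === "frames") return "index_frames.html";
  if (layout === "noframes") return "index_noframes.html";
  return "choose.html"; // no saved preference yet: ask the visitor once
}

console.log(choosePage("user=7; layout=noframes")); // index_noframes.html
console.log(choosePage(""));                        // choose.html
```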

Cookies are especially useful in conjunction with dynamically generated pages. If your site delivers information on a variety of subjects, you could use a cookie to store data about which topics the visitor has shown an interest in. Then, when the user returns to your site, you can generate dynamic pages that highlight those topics.

For instance, a newspaper site might notice that a particular user invariably searches for basketball scores. The site could use that information to dynamically generate a custom page with a link to those scores right at the top.

Don't store sensitive information

Sometimes the appearance of security is as important as the fact. You might know that the information you store in a cookie is safe from prying by other Web sites, but users don't necessarily understand that. So it's essential to avoid storing any data that's even potentially embarrassing or costly to the user.

Cookies aren't the place to track passwords, for example, or credit card numbers, or purchase authorization numbers. The first time that users find their credit card numbers in a cookie is the last time they'll visit your site.

It's a much better idea to save sensitive information in a server-side database. Assign your customer an ID number, and use a cookie to store that number on their PC. Then, whenever the customer visits, use their ID number to look up other information in the database.

The real issue is trust. If you say that your site is storing information in cookies only to improve the user's experience, use cookies solely for that purpose. Don't sell the customer's name and email address to the fly-by-night sales outfit down the road, and don't inundate your users with junk email when you've promised not to.