LESSON THREE

KEEPING TRACK OF THINGS & USING E-MAIL, TOO

R.I.Feigenblatt - July 2005


1. How to always win at "Twenty Questions"

How many people have played "Twenty Questions"? This is a simple game for two (or more) people. In a common variant one player thinks of a noun - a word which names something - and one other player tries to discover which noun it is by asking questions, which the first player is obliged to answer truthfully. The second player wins if he needs no more than 20 questions to do this.

The game is harder if the only answers allowed are "yes" and "no". Yet despite all the many things in this world, the player asking the questions can always win - at least if the noun being sought is a "common" (if rarely used) noun like "antidisestablishmentarianism" rather than a special noun like "William Jefferson Clinton". All common nouns are defined in a sufficiently comprehensive dictionary.

This is what the player asking the questions does to win. An amazing thing about this method is that his successive questions are all but repetitive. He takes out a dictionary and asks the other player if the noun is defined among the second half of the pages in the dictionary. Whatever the answer, he then asks if it is defined among the second half of the remaining subsection of pages just indicated. Doing that again and again, he narrows it down to a single page. With 1024 or fewer pages in the book, it takes no more than half of his quota of 20 questions to accomplish this.

You might correctly guess what he does then. He asks if the noun is among the second half of the nouns defined on that page. And then if it is among the second half of those remaining in play. And so on. With no more than 128 nouns defined on any page, it takes no more than 7 additional questions to find out which word on the page is the noun sought. The player posing the questions wins, with at least 20-10-7=3 questions to spare at that.
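Here is a minimal sketch of the questioner's method in Python. It simplifies things by treating the whole dictionary as one sorted list instead of pages of nouns, and the word list itself is made up purely for illustration: 1024 "pages" of 128 invented nouns each.

```python
def find_noun(nouns, secret):
    """Locate `secret` in the sorted list `nouns` using only yes/no questions.

    Returns (index, answers). Each entry of `answers` is True for
    "yes, it is in the second half" and False for "no".
    """
    low, high = 0, len(nouns)            # the noun lies somewhere in nouns[low:high]
    answers = []
    while high - low > 1:
        middle = (low + high) // 2       # first entry of the second half
        in_second_half = secret >= nouns[middle]   # the other player's truthful reply
        answers.append(in_second_half)
        if in_second_half:
            low = middle
        else:
            high = middle
    return low, answers


# An invented "dictionary": 1024 pages x 128 nouns, already in sorted order.
nouns = [f"noun{i:06d}" for i in range(1024 * 128)]
index, answers = find_noun(nouns, "noun098765")
print(nouns[index], "found after", len(answers), "questions")
```

With exactly 1024 x 128 entries the loop always asks 10 + 7 = 17 questions, matching the count in the text.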

I have two pedagogical reasons for relating this algorithm.

First, I wanted to teach you about the binary number system in a way which complements the instruction I offered in the first lecture of this series.

You see, we might choose to record the series of answers to our boring all-but-identical questions, by writing down Y when the answer is "yes", and N when it is "no". We would then generate a unique code for every noun in the dictionary, consisting of no more than 17 symbols under the assumptions we posited above. This code would single out a noun of any length of letters just as reliably as would spelling it with those letters. If it turned out all the nouns in English had as many letters as "antidisestablishmentarianism", we'd even save on the number of symbols we'd have to scribble to identify one of the nouns.

We wouldn't have to use "Y" and "N" to record the answers to our questions. After all, the single letter "Y" is not the word "YES" any more than the single letter "N" is the word "NO". We could use any pair of distinguishable symbols to record the code for the noun in question - like the numerals "0" and "1", in place of the letters "N" and "Y", respectively. The code for any of the nouns would then be a series of 0's and 1's.

In fact, it would be a way of writing the ORDINAL POSITION of the noun in the compendium defined in the dictionary - written with BINARY numerals, rather than decimal ones. (But I lie.)

What I claim is only true if the number of nouns defined on every page is exactly the same power of 2, i.e. 2 multiplied by itself several times. And it must also be the case the number of pages is exactly a power of 2 as well.

Actually, by making my algorithm a bit more arcane, and hence confusing, I could have dealt with the inconveniently irregular density of defined nouns on the pages of a typical dictionary. (This detour comprises the next paragraph.)

Before we used books in the form of a codex - a series of leaves, or pages, bound to a common spine, our books consisted of one very, very, long page, which we called a scroll. I could have told you we'd use a dictionary written in the form of a scroll - and that we'd pad the end of the scroll with definitions of enough mythical common nouns to bring the total number of nouns - actual plus mythical - to a power of two. In that case, we could immediately use the form of the question we posed when we had originally narrowed things down to a particular page of a modern multi-page dictionary. And then the statement I made a while ago wouldn't be a lie - encoding the question answers with 1's and 0's would INDEED be the numerical position of the noun in the dictionary, written in binary, rather than the decimal notation we use in everyday life. It would only take 17 binary numerals - or bits - of information to distinguish any English common noun from all the others.
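The scroll version can be checked with a short Python sketch. It assumes a scroll of exactly 2 to the 17th power entries and counts positions starting from zero (so the first noun is number 0); the position 98765 is an arbitrary choice for illustration.

```python
def answers_for(position, total):
    """Record the yes/no answers that single out entry `position`
    among `total` entries, writing "1" for yes and "0" for no.
    `total` is assumed to be a power of two."""
    low, high = 0, total
    bits = []
    while high - low > 1:
        middle = (low + high) // 2       # first entry of the second half
        if position >= middle:           # "Is it in the second half?" - yes
            bits.append("1")
            low = middle
        else:                            # no
            bits.append("0")
            high = middle
    return "".join(bits)


TOTAL = 2 ** 17                          # actual nouns plus "mythical" padding
code = answers_for(98765, TOTAL)
print(code)                              # a string of 17 ones and zeros
print(int(code, 2))                      # reading it as a binary number gives the position back
```

The string of recorded answers and the binary numeral for the position turn out to be the very same 17 symbols.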

But I have a second pedagogical reason for teaching you how to always win at Twenty Questions. You see, even when the number of subdivisions of something wasn't regular, it was useful to deal with something of enormous complexity - like the aggregate of all the common nouns in English - maybe over 100,000 words - by successive division into a few things at each step. It's very hard for us to keep more than about 7 things in our active (or "short-term") memory at any given time. That's why we use this type of strategy, which is commonly called chunking. By using a cascade of subdivisions, creating a hierarchy, we reorganize something very complicated into chunks which are easy to handle.

We see this chunking strategy used all the time. For example, a large army may be segmented into several corps. Each corps may have several divisions. The divisions might each be made up of several regiments. And so on down the line to battalions, companies and squads.

Nature often follows this pattern, too. It might be very hard to organize thousands of leaves on a tree. But there's a trick. The trunk is divided into branches. The branches are divided into sub-branches. And so on down the line until we have a few leaves on each twig. At each step in its growth, the tree only has to know how to divide something up a few ways. And then it "recursively" repeats this strategy again and again - to architect a very complicated structure from some very simple rules.
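To see how little the "tree" needs to know, here is a small Python sketch of the recursive rule. The branching factor of 3 and the 7 levels are assumptions chosen only for illustration, not botany.

```python
def grow(levels, ways=3):
    """Grow a tree as nested lists: a branch is a list of `ways`
    sub-branches, and the string "leaf" marks the end of a twig."""
    if levels == 0:
        return "leaf"
    # The one simple rule, applied again at every level.
    return [grow(levels - 1, ways) for _ in range(ways)]


def count_leaves(branch):
    """Count the leaves by following every branching downward."""
    if branch == "leaf":
        return 1
    return sum(count_leaves(sub_branch) for sub_branch in branch)


tree = grow(7)
print(count_leaves(tree))    # 3 multiplied by itself 7 times
```

Dividing just three ways at each of seven steps already yields over two thousand leaves.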

You may have even seen how people like to draw on this natural example to illustrate hierarchical organizations like armies, large civilian businesses, or even political districts. They might draw something that looks like a tree they call an organization chart. But usually their tree will be upside down, with the trunk toward the top and the leaves at the bottom, to reflect animal psychology in which something higher up is more dominant.

What does all this have to do with computers? Well, we will soon look at two organization problems. First, how a single computer manages the space within its persistent secondary memory - the magnetic hard drive you'll find on most PCs, or personal computers. And second, we'll look at how we can sort out a large network of computers - like the millions on the Internet - which can communicate with one another. In both cases, we will use the principle of a cascade of subdivisions to get our work done. And we may even draw a tree-like chart or two. But any of our trees will grow sideways - with its root at the left and its leaves at the right. This will be a reflection of the way we write English - from left to right, as we work our way down successive subdivisions of an all-encompassing space.

2. Organizing space in computer memories - bits, bytes and files

As you learned in our first lesson, the computer stores both facts which we call data and methods we call programs in its memory. It uses the programs to deduce and record new facts from old facts - like the total revenue a chain of stores accrues by adding up the receipts of every store in the chain.

It uses its primary memory, nowadays made from silicon chips, like an office worker uses his desktop, while its secondary memory, typically magnetic hard disk storage, plays the role of something like the office worker's filing cabinet.

Of course, unlike the workspace of an old-fashioned office worker manipulating paper, the two computer memories are mechanical - or perhaps we should say electronic - and can be rewritten again and again pretty much without limit. They consist of massive quantities of identical storage cells, each of which can record either a zero or a one. To keep track of them, the computer numbers them, just like we might number the houses along a street or the boxes at a post office. In fact, we call this indexing quantity the ADDRESS of the cell.

Actually, things are a tad more complicated - but not much.

As you recall from our first lesson, computers always represent any symbol they store as a number - consisting of a group of bits. In this way, they can store actual numbers, or letters, or even sounds or pictures. They use multiple such groups to represent a collection of such numbers, one group per number.

A practical question that may come to mind is this: what is the biggest number the computer ever has to store? This will tell us how many bits we will have to group together as a unit to record such a number.

It turns out this decision is less critical than it might appear at first glance. That's because whatever convention we decide upon - say, 8 bits - we can always aggregate an ordered sequence of such groups to record bigger and bigger numbers. People can write down a number of arbitrary size, even though they only have a mere 10 decimal digits to use - because they can string together multiple columns, or places, of such digits.
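A short Python sketch of this aggregation, assuming groups of 8 bits. The number one million and the choice of three groups are arbitrary examples.

```python
number = 1_000_000                     # far too big for one 8-bit group (maximum 255)

# Spread the number over three 8-bit groups, most significant group first,
# just as we write the most significant decimal digit first.
groups = number.to_bytes(3, "big")
print(list(groups))

# Rebuild the number column by column; each place is worth 256 times the next.
rebuilt = 0
for group in groups:
    rebuilt = rebuilt * 256 + group
print(rebuilt)
```

Each group plays the role of one "digit" in a number system with 256 possible digits instead of 10.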

In decades past, people chose to group bits 4, 5, 6, 8 or more at a time. But today, virtually all computer memories - whether they are chip-based primary memories, magnetic-disk-based secondary memories, or something else - group bits 8 at a time into what we call a byte. The practical meaning of this is that each byte is given a unique sequential address.

Recall that a collection of 8 bits lets us distinguish, or encode, 2x2x2x2x2x2x2x2=256 different symbols, such as all the letters and numerals we might find on a typewriter keyboard, both upper and lower case - with some numerical codes left over to boot.

If we were to use computer memory to record the letters which appear sequentially in a sentence, we could put one letter each into its own byte, using a series of bytes with consecutive addresses to preserve that order. Of course, in each byte we'd actually put the NUMERAL CODE corresponding to each letter! And the space between words would be like a "blank letter" character recorded in a byte of its own, and we'd use a numerical code to represent it as well.

	
            Figure: Recording a sentence in digital memory

Sentence: =======Dog bites man.=======

Coding scheme: D--->068, (space)--->032, et cetera

        +---+---+---+---+---+---+---+---+---+---+---+---+---+---+
Memory: |068|111|103|032|098|105|116|101|115|032|109|097|110|046|
        +---+---+---+---+---+---+---+---+---+---+---+---+---+---+
                                /   \
                               /     \
Actual bit pattern recorded:   01110100
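The codes in the figure match the widely used ASCII scheme, so, assuming that is the scheme intended, a few lines of Python can reproduce the figure.

```python
sentence = "Dog bites man."

# One numeric code per character, in order, one byte each.
codes = list(sentence.encode("ascii"))
print(codes)

# The letter "t" has code 116; here are the eight bits actually stored for it.
t_bits = format(ord("t"), "08b")
print(t_bits)
```

Note that the two spaces and the final period each occupy a byte of their own, just like the letters.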

What if we wanted to insert a word in that sentence? How could a program be ordered to do that? How could we cut the "string" or "chain" and jam in the extra word? We couldn't! The cells of the memory are INDIVISIBLE. They are actually made so very small to begin with that they barely record alternative symbols in a highly reliable way. We have already squeezed all we can ahead of time - we were not wasteful when we started.

So instead, we do the following. We take a new UNUSED sequence of memory cells bearing consecutive addresses. We start copying over the letters before the break into these successive new locations. Then we copy the word we want to insert (with any space characters we need) into the following locations. And finally we copy over the portion of the original sentence after the break in the sentence. We now have a record of the amended sentence in the computer memory. At this point we might elect to return the original string of memory cells to a pool of unused cells we can call upon when we have to do something else.
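The copying procedure just described might be sketched in Python as follows. The function name insert_word and the inserted word "never" are my own inventions for illustration; a real program would normally let the language do this copying for it.

```python
def insert_word(original, break_at, word):
    """Build an amended copy of `original` (bytes) with `word`
    inserted at index `break_at`. The original is left untouched."""
    amended = bytearray(len(original) + len(word))   # a fresh run of unused cells
    target = 0
    for byte in original[:break_at]:     # 1. copy the letters before the break
        amended[target] = byte
        target += 1
    for byte in word:                    # 2. copy the word being inserted
        amended[target] = byte
        target += 1
    for byte in original[break_at:]:     # 3. copy the letters after the break
        amended[target] = byte
        target += 1
    return bytes(amended)


sentence = b"Dog bites man."
amended = insert_word(sentence, 4, b"never ")
print(amended)
```

Once the amended copy exists, the cells holding the original sentence can go back to the pool of unused memory.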

Because primary memory is so much faster than secondary memory, we would have used primary memory to amend the sentence in the way just described whenever possible. In general, a program will copy sequences of bytes representing data from secondary to primary memory, do LOTS of manipulations on that data, and then store the results back into secondary memory again. Those manipulations could number in the millions or billions, so that even though reading from and writing to the secondary memory is slow, the computer might still spend most of its time doing stuff with data in primary memory.

We will now depart the small but fast primary memory and concentrate on the huge albeit slow secondary memory. As you recall, the secondary memory - typically a magnetic disk - will not forget what has been stored in it when power is turned off, just like the songs on a cassette player tape will stay recorded even if the player is powered off. The persistence of secondary computer memory is one good reason why regular conventions have been established on how it is organized beyond the clumping of bits 8 at a time into bytes.

These conventions did not always exist, and they have evolved over time. In fact, to this day, some part of the programs which run your computer ultimately just refer to the successive bytes on your magnetic hard disk by their numerical addresses. It's just that now we erect edifices which hide such tedious details from you. After all, rapidly automating tedious details without error is why you want to use a computer in the first place!

In the world of paper records, we use letters to write out sentences, assemble them into paragraphs and then assemble those into entire documents. With the advent of the codex, we don't use scrolls anymore - but we could. That would save a lot of stapling, anyway.

Just like a filing cabinet is used to store paper documents, for example a collection of paper scrolls, the secondary memory of a computer is used to store electronic documents. But instead of calling them documents on the computer, we use another, sadly confusing, term instead. We call them FILES. In the paper world of the filing cabinet, a file consists of a collection of documents. But in the electronic world of the computer we use the word "file" to refer to JUST ONE electronic document. Sorry about that, chief.

Just as a paper document can consist of any number of ordered alphabetic letters, an electronic file can consist of any number of ordered bytes. In fact, it is common practice for computers to record not only the contents of a file, but some auxiliary information for each file, one of which is the number of bytes in the file. This helps the computer figure out when one file ends and another one begins. This is useful because, in effect, the electronic memory is just one big scroll completely filled with empty crossword-puzzle boxes to record individual letters. Another piece of auxiliary information computers will record is the name of each file, so that you can refer to it by that name, and find it among all the many others that may be stored on your hard disk.

You might think that each file is always recorded in the bytes of successively numbered addresses on your disk. That is not entirely true. To manage the limited space on your hard disk, a computer will often split any given file into pieces of a standard size, called clusters, so as to manage its storage better as it runs out of unused space. But the details needn't concern you, and modern computers take pains to hide implementation details like that. From a logical or abstract point of view, you can always think of the string or chain of bytes making up a disk file as occupying successive locations on the disk, with something called a "seek index" to indicate any given byte for copying to or from the primary memory of the computer.

Programs can replace the individual bytes of a file, add more bytes to the end of the file, or truncate the end of a file, making it shorter. Programs can even order an entire file to be completely erased, or deleted. When a file is erased, a new file of the same name can be created and filled with data, too.
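These file operations can be tried out with a short Python sketch. It works in a temporary folder so nothing of yours is disturbed; the file name "budget" and its contents are made up for illustration.

```python
import os
import tempfile

folder = tempfile.mkdtemp()              # a scratch folder just for this experiment
path = os.path.join(folder, "budget")

with open(path, "wb") as f:              # create the file and fill it with bytes
    f.write(b"Dog bites man.")

with open(path, "r+b") as f:             # reopen it for reading and writing
    f.seek(0)                            # point the seek index at byte 0
    f.write(b"H")                        # replace one byte: "Hog bites man."
    f.seek(0, os.SEEK_END)               # point the seek index past the last byte
    f.write(b" Ouch")                    # add bytes to the end of the file
    f.truncate(9)                        # cut the file down to its first 9 bytes

with open(path, "rb") as f:
    contents = f.read()
size = os.path.getsize(path)             # the recorded byte count of the file
print(contents, size)

os.remove(path)                          # erase the file entirely
os.rmdir(folder)
```

The byte count reported by the operating system is the auxiliary length information mentioned earlier.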

Hard disk files are not just used to store electronic documents, which people are interested in reading and writing. They are also used to store programs, one to a file. You might like to think of a program as an electronic document only the computer reads. Like document files, program files also have names to disambiguate them from one another - as well as from electronic documents consisting only of data. On PCs based on the Microsoft Windows operating system, a file containing a program often has a name ending with a suffix consisting of a period and three letters, most commonly ".exe" (short for "executable") or, for some older programs, ".com"; the letters "com" are short for "command", reminding us that programs are sequences of commands the computer follows to carry out algorithms.

3. Organizing space in computer memories - directories (or folders)

Magnetic disk drives for computers are over a half century old: The first had only 5 megabytes - about 5 million bytes - of capacity. Originally, since storing each byte was so expensive, they weren't big enough to hold very many files. But then they followed a seemingly endless path of improvement that made them ever cheaper and capable of storing ever more bytes of information. It is very hard to keep track of things when you can store hundreds or even thousands of files.

One way to keep track is to sort the files alphabetically by their names. We might even elect to write out such a listing as an index or directory we can use to remind ourselves of what we have stored on the computer when it is powered off. And when it is powered on, it might even be nice to have a program that displays the directory of files on a screen or even prints it out. In fact, this was one of the first "housekeeping" or "utility" programs written when computers started acquiring large magnetic disk drives as secondary memory.

But what if we use the computer to work on many projects? Or if many people share use of the computer? How many files can you collect before you get confused? Or want to use the SAME name, like "budget", for a NEW file when you have already used it for another file you created long ago for a different purpose?

The solution to this problem is to chunk groups of files together if they are related in some way. Traditionally, we call such a group a DIRECTORY, and in more recent years, a FOLDER, in analogy with what we use for a collection of paper documents. Files of the same name are then disambiguated by the folder within which they lie, so we can have a file named "budget" in the "school" folder and ANOTHER file ALSO named "budget" in the "church" folder. The two files live independent existences. Their contents can be completely different and modifying one file will do nothing to modify the other - even though they share the same name. After all, Bill Gates is not Bill Clinton, even though they are both Bill.
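A small Python sketch can confirm that the two "budget" files really do lead independent lives. It builds the "school" and "church" folders inside a temporary scratch folder; the file contents are made up for illustration.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as top:
    school = Path(top) / "school"
    church = Path(top) / "church"
    school.mkdir()
    church.mkdir()

    # Two files with the SAME name, told apart by the folder holding them.
    (school / "budget").write_text("tuition")
    (church / "budget").write_text("organ repair")

    # Modify one of them...
    (school / "budget").write_text("tuition and books")

    # ...and the other is unaffected.
    school_budget = (school / "budget").read_text()
    church_budget = (church / "budget").read_text()

print(school_budget)
print(church_budget)
```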

When the operating system of the computer allows the disk file system to support multiple directories, each of which can bear a unique name, we have a solution to our problem. At least for a while.

Then, we collect more and more directories or folders, and it's "deja vu all over again", as the late Yogi Berra is quoted as saying. That's when our friend the tree comes to our rescue! Just as the trunk of a tree can branch, and those branches can branch in turn, and so on, down to the level of individual leaves, we can support a hierarchy of folders within folders, culminating in individual files.

On computers running Microsoft Windows, there is a program called "Windows Explorer" which draws a graphical representation of the file system hierarchy. Typical output looks as shown below. To keep the clutter down, the program lets you dynamically expand or collapse any level of the tree structure, by using the mouse to indicate the branch whose level of representation detail you wish to modify. This tool also helps you do things like delete folders, create new ones, or even move or copy folders from one place in the hierarchy to another, among other things.


Windows Explorer is how Microsoft Windows illustrates your secondary storage.
Here we see three files (right) at the bottom of a folder hierarchy (left).
The folder in which they sit is highlighted in blue.
Note that higher level folders can hold files as well as folders within themselves.
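For readers without Windows Explorer at hand, here is a rough Python sketch that draws a folder hierarchy as indented text. It is not how Windows Explorer itself works; the function name draw_tree and the tiny sample hierarchy ("home", "letters" and so on) are invented for illustration and built in a temporary scratch folder.

```python
import os
import tempfile


def draw_tree(folder, depth=0):
    """Return text lines showing `folder`, its files and its sub-folders,
    indenting four spaces for each level below the starting folder."""
    lines = ["    " * depth + os.path.basename(folder) + "/"]
    for name in sorted(os.listdir(folder)):
        child = os.path.join(folder, name)
        if os.path.isdir(child):
            lines.extend(draw_tree(child, depth + 1))    # a folder: repeat the same rule
        else:
            lines.append("    " * (depth + 1) + name)    # a file: a leaf of the tree
    return lines


with tempfile.TemporaryDirectory() as top:
    home = os.path.join(top, "home")
    os.makedirs(os.path.join(home, "letters", "2005"))
    for parts in [("readme.txt",),
                  ("letters", "aunt.txt"),
                  ("letters", "2005", "july.txt")]:
        with open(os.path.join(home, *parts), "w") as f:
            f.write("placeholder")
    lines = draw_tree(home)

print("\n".join(lines))
```

Notice that the "letters" folder holds both a file and another folder, just as the caption above points out.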

A particular folder down in the file tree hierarchy can be named using alphabetic letters. Basically, the name is the sequence of folder names starting closest to the tree root and working down branchings one at a time, using the "backward" slash character to "delimit" or separate the various folder names. An example is "C:\WINDOWS\SYSTEM32\DRIVERS". This refers to the "DRIVERS" folder held within the "SYSTEM32" folder, which in turn is held within the "WINDOWS" folder. The notation "C:" is used to designate the disk drive, which lets you distinguish among multiple disk drives which may be on your computer. The first secondary storage unit on your PC which is not a floppy disk drive ALWAYS uses the letter "C:" ("A:" and "B:" are reserved for any floppy disk drives!). By the way, the highest level directory is not called the "trunk" as you might guess, but is instead called the "root directory". Unlike a natural tree, a computer directory tree has only a single root and no trunk at all. (Maybe we should have called it a bush instead of a tree!)

Not all computers run Microsoft Windows. Another operating system called "UNIX" is very popular, particularly a flavor called "Linux". UNIX computers also organize their disk space into folders and sub-folders, traditionally called directories and subdirectories. In fact, UNIX supported a file tree long before the predecessor of Windows (called DOS) did. But notations on various operating systems differ in detail. UNIX uses the "forward" slash character, not the "backwards" slash (or "backslash") character employed by Microsoft Windows. And no matter how many hard disk drives a UNIX computer has, the root directory is indicated simply by a forward slash, without any preceding alphabetic characters. UNIX has an auxiliary mechanism to hang the contents of various physical disk drives on their own branches within the file system tree.
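Python's standard pathlib module understands both notations, so a brief sketch can take each kind of name apart. The Windows path is the example from the text; the UNIX path "/usr/local/bin" is my own choice of a typical example. These "pure" path objects only analyze the text of the name and never touch a disk.

```python
from pathlib import PureWindowsPath, PurePosixPath

win = PureWindowsPath(r"C:\WINDOWS\SYSTEM32\DRIVERS")
print(win.drive)          # the disk drive designation
print(win.parts)          # root first, then each folder on the way down
print(win.parent)         # the folder holding DRIVERS

unix = PurePosixPath("/usr/local/bin")
print(repr(unix.drive))   # empty: UNIX names carry no drive letter
print(unix.root)          # the root directory is just a forward slash
print(unix.parts)
```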

The tree structure is also used in the computer world to create an abstract model of things other than hard disk file systems. We will soon see that World Wide Web sites consist of a cascade of directories which are named like the file tree of a UNIX computer - using forward slashes to separate levels of the hierarchy. And when we want to point to a particular computer on the Internet, we'll soon learn that we will again use one of two alternate indexing schemes - both based on tree structures, albeit independent of one another. But both schemes will use the same special symbol to separate levels of that hierarchy - not the slash symbol, but the period symbol.

Tree structures appear again and again in the computer world - and beyond - because they are so useful. You may have written a post card when you were visiting "New York, New York", i.e. New York City within New York State. You, too, used a special symbol to distinguish levels of the hierarchy - in this case, the comma. And because of this distinction, you could reuse the name "New York" without confusion. Likewise, there is no problem in the computer world when you want to have a folder (or subdirectory) named "exceptions" within another one named "exceptions" within yet another named "exceptions" as well and so on as many times as you desire. (The Mayor and the Governor won't be confused with one another if you do!)

4. More about labels for secondary storage media

One last detour before we leave the world of secondary storage devices. In the Microsoft Windows world, you saw we used the successive letters of our alphabet, followed by the colon symbol, to designate various storage units. A: is reserved for the first floppy disk drive (if installed). And B: is reserved for the second floppy disk drive (if installed). No more than two floppy disk drives can be installed. C: and successive letters all the way to Z: (if need be) are reserved for secondary storage devices which are not floppy disks. Until the last decade or two, these were always magnetic hard disks. Now a richer variety of storage media are supported, such as optical disk drives and flash memory.

One complication with magnetic hard disk drives is that more than one letter may be used to designate a single physical unit. The use of multiple letters per unit is in part a legacy of the fact that the capacity of magnetic hard disk drives grew faster than Microsoft Windows could accommodate that growth without gimmicks. (It has NOTHING to do with the fact that very many platters may be used to implement a single unit, by the way.) Adding to the confusion is that the collection of letters used to refer to one physical drive may not even be contiguous! We don't have space to explain in detail here; we just wanted to make you aware of this lest you be confused later.

As computer operator, you have considerable (if not unlimited) administrative freedom to change things to suit yourself. For example, you might have a single huge modern hard disk drive in your PC whose entire space you call C: - or you might prefer to divide it into mythical (logically independent) C: and D: drives. The latter choice can be of great practical value in implementing a backup scheme.

By putting all your data on the D: drive, you can protect all your data against loss from disk failure, etc. merely by making a regular "backup image" of the D: drive with a program like "Ghost" or "Partimage" onto a backup medium (like an optical disk) you keep in a safe place elsewhere.

By putting all your programs (including, of course, the operating system) onto the C: drive, making a similar "backup image" of it protects your programs as installed and configured from loss as well. Far more common than loss from failure of the physical hardware will be things like inadvertent conflict between a newly installed program and a previously installed one you do not discover until many months have passed and things have gotten more tangled yet by the addition of yet more programs. And if the PC in question is attached to the Internet, the sad truth is that defects in the design of your software might be exploited by other computers on the Internet to install malicious programs on your computer to do the bidding of the invader. The simple way to (temporarily) get rid of such a problem (when discovered) is to restore the programs on your C: drive from your backup image, a process that only takes a few minutes and requires no expert diagnosis of quite what went wrong. The long term solution involves applying a "defect patch" from the vendor of the defective software and making a backup image of the patched C: drive.

Generally speaking, magnetic hard disks are permanently attached to your computer's support frame and you must turn screws to extract or replace them - not an impossible task by any means. But now you can also make hard disk swapping fast by installing a dock into the drive bay into which a drawer holding the actual drive sits. Just turn a key lock and the drive is locked into place and connected - or freed for sliding out for replacement by another drawer-mounted drive.

Why would you want to extract a hard disk drive? Well, they are now so cheap, you might buy an extra drive for the "backup images" of the drive you regularly use. Data can be copied between a pair of magnetic disks faster than many other types of secondary media. This makes it easy to update backups and then transport the backup drive to another building in the event the first building is destroyed or the computer is stolen.

You also now can use optical disk drives and chip-based flash drives to supplement - or sometimes even replace - magnetic disk drives. Unlike magnetic drives, each physical unit always uses a SINGLE alphabetic letter to name itself. Optical drives use single removable platters which the drive can read. Some are read-only, some can be written once and some can be rewritten many times. Compatibility and stability issues are rather more complicated than for floppy disks, so study more or consult an expert if you want to begin writing optical disks.

5. Number, please?! - getting in contact with others on the Internet

Millions of computers are attached to the Internet. If one computer wants to talk to another computer, how does it make the connection? And with millions of computers all talking at once, how do the conversations keep from getting confused?

This is not the first time a problem of this sort has emerged. After all, we had a global telephone system decades before we had a global computer network like the Internet. And for centuries now, we've had postal communications systems based on the physical transportation of paper. You know the general answer to the puzzle we are posing about the Internet. We have to assign addresses to all the potential communicants. Analogy with the telephone system will be especially illuminating at times.

But the Internet is not a centrally administered communications system like the old pre-breakup Ma Bell telephone system. It consists of voluntary cooperation between thousands of independent parties, some big, some small. Now that it is important, bigger players base things on legal contracts. But in the old days, it was just based on personal relationships - handshakes and continuing cooperation based on reciprocal good will and good behavior.

The only technology of the Internet per se is MATHEMATICS. The particular physical infrastructures are just implementations of the rules which parties participating in Internet communication use to carry out the protocol, or language and procedures which make the Internet work.

The Internet is something like the French language. It does not matter if French is spoken, written with a pen, telegraphed with Morse code, carved on stone tablets which are sent by tramp steamer, or written in the snow by whatever means. It does not even matter what the sex or age or nationality of the communicants are. The French language is an abstraction, whose representations are given meaning by convention.

In the same way, virtually any physical medium might form a communications path for the Internet - a radio signal between a satellite in space and a station on the earth, a copper telephone wire, a laser beam shot between the tops of tall buildings in a city or even paper messages written by hand and tied to the legs of carrier pigeons. The only requirement is that the abstract form of the messages fits a certain pattern - which we call "IP" - or "Internet Protocol". IP messages per se aren't even mandated to travel at least a certain minimum speed - or even arrive successfully at their intended destination before the universe comes to an end! Things like speed, reliability, secrecy and other vitally important features of a sophisticated communications system are provided by messaging systems built ON TOP of IP - just like a fast food restaurant might be able to build its business around unreliable employees by hiring enough of them and then using managers who can make last minute decisions to keep the system working. There actually are at least two systems built on top of IP - TCP and UDP. And reliable delivery of messages in the right order is only a feature of TCP, not UDP.

Computers and any other digital gadget attached to the Internet use IP to communicate. They identify one another by their respective IP addresses. Actually, computers don't have IP addresses. A piece of hardware attached to the computer, called a network adapter, gets an IP address unique in all the world. (In fact, it may even be "multi-homed," meaning it uses multiple IP addresses as well.) You have probably heard of the type of network adapter called a modem, which whistles actual sounds to talk to other modems over conventional telephone lines.

A computer may have more than one network adapter, and so may be associated with multiple IP addresses that way. Special computers on the Internet, called routers, fall into this category. They are like telephone offices that coordinate Internet traffic, because each knows the topology, or layout, with which at least some part of the Internet connects various computers. The routers work together as a team to interconnect any device with some particular IP address to another with some other IP address, by relaying traffic between them bucket-brigade style.

You see, even when two computers hold an extended conversation with one another, their talk is divided into discrete segments called packets. Keeping packets below a certain number of bytes in size lets multiple computers share the forwarding capabilities of a single router in a timely manner. It's like an all-you-can-eat salad bar in a restaurant. You can come back as many times as you want for more, but you can only take one plate of food at a time.
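The salad-bar idea can be sketched in a few lines of Python. This is only an illustration: the function name and the 16-byte limit are assumptions chosen to keep the example small, and real IP packets are larger and also carry addressing information alongside each chunk.

```python
def split_into_packets(message, max_size):
    """Cut a message into consecutive chunks of at most max_size bytes."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    return [message[i:i + max_size] for i in range(0, len(message), max_size)]


# Hypothetical message and an artificially small packet size.
message = b"An extended conversation between two computers"
packets = split_into_packets(message, 16)

for number, packet in enumerate(packets, start=1):
    print(number, len(packet), packet)

# The receiver glues the chunks back together in order.
reassembled = b"".join(packets)
```

Every chunk fits on one "plate", and joining the chunks in order gives back the original message.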

I won't describe what an IP packet looks like in detail - we don't have time. But like a postcard, besides its short message, or content, it has two important addresses on it - a destination address and an origination address. Both addresses consist of 4 ordered bytes of information. Since each byte can represent 256 possibilities, 4 bytes can represent any of 256x256x256x256 = over 4 BILLION IP addresses. We are slowly moving to a system using 16 bytes for addresses, but for now we're making do with merely 4 billion possibilities. (If you have been paying attention, you will know the four bytes of an IP address consist of 32 bits, or binary digits, because there are 8 bits to every byte.)
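The arithmetic in the paragraph above can be checked directly. The variable names below are mine, chosen only for this illustration.

```python
BITS_PER_BYTE = 8
ADDRESS_BYTES = 4

values_per_byte = 2 ** BITS_PER_BYTE              # possibilities for one byte
address_count = values_per_byte ** ADDRESS_BYTES  # 256 x 256 x 256 x 256
address_bits = BITS_PER_BYTE * ADDRESS_BYTES      # bits in a 4-byte address

print(values_per_byte)  # 256
print(address_count)    # 4294967296
print(address_bits)     # 32
```

The middle figure is the "over 4 billion" quoted above.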

It turns out there are ways to lie about, or "spoof," the originating IP address - just like you can lie on a USPS postcard! That opens the door to all sorts of potential mischief we won't have time to discuss - but at least you can't send anthrax via Internet Protocol!

It is customary to write IP addresses using decimal digits to represent each byte, and separate the numbers for each byte by a period. An IP address might look like this:

211.154.010.122

By convention, the most significant byte is always on the left. You might LOOSELY like to think of an IP address as the Internet's version of a telephone number. But no traditional telephony per se might be involved. For example, two computers on the Internet might communicate using only the cables of a cable TV firm.
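To see what "most significant byte on the left" means, here is a minimal sketch that turns the dotted notation into the single 32-bit number it stands for, and back again. The function names are assumptions made for this example, and the sample address is the one shown above, written without the padding zero in its third number.

```python
def dotted_to_int(address):
    """Combine four decimal byte values into one 32-bit number."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("expected four numbers separated by periods")
    value = 0
    for part in parts:
        byte = int(part, 10)
        if not 0 <= byte <= 255:
            raise ValueError("each number must fit in one byte (0-255)")
        # Shift what we have one byte to the left, then add the new byte.
        value = value * 256 + byte
    return value


def int_to_dotted(value):
    """Split a 32-bit number back into four decimal byte values."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


number = dotted_to_int("211.154.10.122")
print(number)
print(int_to_dotted(number))
```

Changing the leftmost number by one moves the result by more than 16 million, while changing the rightmost number by one moves it by just one.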

Protocols built on top of IP, sort of the way English is built on top of the 26-letter alphabet, make Internet addressing a bit more subtle. Both TCP and UDP, sometimes called TCP/IP and UDP/IP, add the concept of a port number. TCP and UDP ports are independent of one another, but both are numbers designated by two ordered bytes, allowing for 65,536 different ports. In an IP packet used by TCP, bytes within the message "payload" of the packet are actually used to indicate TCP port numbers: two bytes for the port of the sender and two more for the port of the recipient.

Port numbers are used to MULTIPLEX a single network adapter, much in the way a telephone extension number is used to multiplex a single telephone number at a large business location. In this way a single network adapter can better hold a conversation with many other network adapters at the same time.
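As a rough sketch of where these "extension numbers" live, the first four bytes of the TCP portion of a packet hold the sender's port and then the recipient's port, each as two bytes with the most significant byte first. The example bytes below are made up for illustration, and 49152 is a hypothetical port number picked for the sending program.

```python
import struct


def read_ports(tcp_bytes):
    """Return (source_port, destination_port) from the start of a TCP segment."""
    # "!HH" means: network byte order, two unsigned 2-byte numbers.
    source_port, destination_port = struct.unpack("!HH", tcp_bytes[:4])
    return source_port, destination_port


# Build a made-up segment: a sender at port 49152 addressing port 80.
example_segment = struct.pack("!HH", 49152, 80) + b"...rest of the segment..."

print(read_ports(example_segment))
print(2 ** 16)  # how many different values two bytes can hold
```

Because each port is stored in exactly two bytes, there is room for 65,536 different port numbers and no more.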

As you might expect, it's not so much that COMPUTERS talk with one another through their network adapters, as that PROGRAMS running on computers talk to OTHER PROGRAMS running on other computers. To minimize confusion, a program of a given type may follow the convention of holding its conversation with another program using a pre-agreed-upon TCP or UDP port. Let's make this clear with an example.

It turns out that when we were retrieving World Wide Web pages in our previous lesson, two programs were at work. One ran on a distant computer which held copies of the Web pages it could serve up when requested. These were actually files on its secondary memory, or magnetic hard disk, copies of which it would transmit when asked. The program doing this is called a "Web server", appropriately enough. Meanwhile, we were running another program on our PC called a "Web browser". By typing page addresses to our Web browser, we ordered it to contact the Web server program running on the distant computer, which sent copies back to the Web browser running on our PC, which then painted our screen with their images.

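By convention, a Web server listens for Web page requests at TCP port 80 of its network adapter. When it transmits a requested page, however, it does not send the page to port 80 of the network adapter which made the request. Each request carries the port number at which the requesting Web browser awaits the reply - typically a temporary port picked by its operating system - and the server sends the page there. Other ports might have been agreed upon for listening, and the two directions of the traffic do in fact use different ports.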

Numbered addresses like IP addresses are compact. You'll note that we also use numerical addresses - called telephone numbers - to attach two telephones together to hold audio communications between people. But numbers are hard for people to remember. That's why we use books called telephone directories, too. Telephone directories associate the names of the parties we may want to reach with their respective phone numbers.

The Internet has its own analog to the telephone directory, through which we can look up IP addresses, although it isn't a printed book. It is called the Domain Name System, or DNS for short. DNS names or addresses are actually organized into a hierarchy. The notation we use for a DNS name separates the successive branches of the hierarchy by using periods.

Microsoft has a computer at the DNS address microsoft.com. (Actually, this is a bank of computers which handle the enormous traffic they must endure, but we'll ignore this.) Notice that the name of Microsoft's computer is similar to the name of Internet retailer Amazon's computer, amazon.com. Both end in ".com", pronounced "dot com", which is the commercial domain of the Domain Name System. microsoft.com is actually a subdomain of the commercial domain, as is amazon.com. Subdomains can be divided further. One example is support.microsoft.com.

Note that like IP addresses, the successive branches of the DNS hierarchy are written with periods to separate different levels. But no transparent correspondence exists between IP and DNS addresses for name SEGMENTS separated by successive periods. For one thing, IP addresses always use 4 bytes, and hence 3 periods. But DNS addresses use one or more periods, as required. Further, IP addresses put more significant name segments at the left, while DNS addresses put more significant name segments on the right! All the same, any DNS address is translated by the domain name system into a corresponding IP address (and in some contexts a plurality of addresses in a range). Most programs make their use of DNS directory computers so implicit, we don't even remember this "telephone directory" lookup is going on.
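The right-to-left ordering of DNS names can be made concrete with a small sketch. The function name is an assumption made for this example; it only rearranges text and does not contact any DNS directory computer.

```python
def enclosing_domains(name):
    """List the domains containing a DNS name, most significant first."""
    labels = name.rstrip(".").split(".")
    # Walk from the rightmost label leftward, growing the name each step.
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


for domain in enclosing_domains("support.microsoft.com"):
    print(domain)
# com
# microsoft.com
# support.microsoft.com
```

An actual lookup of the IP address behind a name could be attempted with `socket.gethostbyname` from Python's standard library, but that requires a working network connection, so it is left out of the sketch.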

Whether we refer to fundamental numerical IP addresses or more human-friendly DNS addresses, we use computer addresses with many different types of programs - not just Web browsers. For example, one can conduct voice or video communications via the Internet. Addresses are needed for the programs which implement these functions. And then there's the type of electronic telegram we call e-mail, which we will soon discuss at length. E-mail programs also depend on computer addresses to connect sending and receiving pairs of computers with one another.

6. More about Web access, reading files and starting programs

But before we move on to e-mail, I'd like to offer a clarification involving Web browsers, given the enormous value getting at Web pages presents.

You may recall from last time that the names of all Web pages begin with this curious prefix: "http://". These four letters are an acronym for "Hyper-Text Transfer Protocol", which is the language spoken between Web browser programs and Web server programs running on different computers. HTTP is actually built upon TCP, as TCP is built upon IP, but we will ignore such details for now.

Instead, I will tell you that Web browsers can speak multiple languages, or protocols. One example is HTTP. Another is an older language called FTP, an acronym for file transfer protocol, which uses the prefix "ftp://" . As the name suggests, this was (and remains) another method for moving files from one computer to another. But FTP is unlike HTTP in some ways. Most significant is that FTP originally did not involve displaying the file one requested and received. Web browsers tend to add this display step - although they can be directed to instead save a received file in an arbitrary folder on one's magnetic hard disk.

So use of the prefix "http://" is what tells the Web browser to speak the "Web language" to the remote computer we designate. It has nothing to do with any letters in the address name, like "www". Many people think all Web page names must have "www" in the address name of the computer running its Web server, but this is just plain false. For example, the home page of the Web server for the Haralson County Historical Society sits at http://hchistory.com . There is no www anywhere in sight!

Actually, the proper notation for the Society's home page on the Web is http://hchistory.com/ - but Web browsers are clever enough to add the slash when they get into trouble without it, and it saves people a bit of typing.

We can save some more typing now that you know that a Web page displayed on your screen is actually produced by the Web browser program running on your PC. We saw that we could open multiple Web pages at a time using a simple "three-step dance" involving the Windows key. If you were to examine in detail the structure of any Web page opened in this way (cf. below), you'd notice all sorts of writing at the top of the page - as well as what looks like a text entry field to the immediate right of a label that says "Address." You can enter the name of any page on the Web there, and make the Web browser replace the current page with that new page in the same rectangular display space, (called a "window").


Microsoft Windows' default Web browser (called Internet Explorer)
displaying the home page of the Hephaestus Project Web site

You can select this text input area for typing into, without use of the mouse, just by using our old friend the Tab key as many times as needed to activate the vertical blinking line, called the CARET, in this area. When it appears, we say the text input area has "keyboard focus."

You can type properly formed Web page addresses into this space, just as you have been taught. They all start with "http://", as you recall. But implicitly, the Web browser ASSUMES you want to fetch a Web page and talk the Web language when you type into this space. If you type neither "ftp://" (to use File Transfer Protocol) nor "http://" (to use Hyper-Text Transfer Protocol), it will assume the latter and silently prefix it before anything you do in fact type. The consequence is that you can type "hchistory.com" there (without the quotation marks, of course!) and it will be as if you typed "http://hchistory.com/". The Web browser will request the home page of the Web server running on the computer at hchistory.com by contacting it at TCP port 80, using the HTTP language. That's why the Haralson County Historical Society likes to write just "hchistory.com" when advertising its Web page - experienced Web browser users know they need type nothing more than that once they have a Web browser window open.
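The silent completion just described might be sketched as follows. This is a simplified guess at the behavior, not the actual logic inside any Web browser, and the function name is an assumption made for the example.

```python
def complete_address(typed):
    """Fill in the prefix and trailing slash a browser would assume."""
    typed = typed.strip()
    lowered = typed.lower()
    # With neither prefix present, assume the Web language (HTTP).
    if not lowered.startswith(("http://", "ftp://")):
        typed = "http://" + typed
    # If only a computer name follows the prefix, add the closing slash.
    rest = typed.partition("://")[2]
    if "/" not in rest:
        typed += "/"
    return typed


print(complete_address("hchistory.com"))          # http://hchistory.com/
print(complete_address("http://hchistory.com/"))  # http://hchistory.com/
```

Both ways of typing the address end up as the same request to the Web server.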

You may then wonder why you had to type the "http://" prefix when I first taught you to get Web pages using the Windows key. The reason is that after we hit the Windows key, we hit the "R" key, which means "run". If we had then typed the name of a file containing a program, rather than the name of a Web page, that program would have been copied from secondary memory (the magnetic hard disk) into primary memory and started up by the operating system. The "http://" prefix made it clear to the operating system that we meant a Web page. The operating system is actually smart enough to know you need to run a Web browser program to look at a Web page, so it would start that program up and type the Web page address into its (Web address) text input field for you! That's why a window with the Web page eventually popped up on your screen.

Now, we can see what happens when we type the name of a file after hitting the Windows key and "R" key. Let's try the file "C:\WINDOWS\COMMAND\EDIT.COM". EDIT.COM is a file containing a program called a "text editor". Windows knows it is a program and so loads it from secondary memory into primary memory and starts it up. A text editor is a program which can modify or create files on your hard disk. For example, you might use it to compose a grocery list, and then print the list out to take to the store with you. Text editors are better than using an old-fashioned typewriter, because they make it much easier and quicker to rearrange text. Fancier versions of text editors called "word processors" also do things like correct spelling errors and may even let you embed graphical images, too.

We will try one more thing using our old three-step dance. Hit the Windows key, then the "R" key and then the name of a disk file which is NOT a program, for example:

C:\WINDOWS\SYSTEM.INI

Functionally, it happens that this file is used to configure the way Windows starts up. But Windows also knows it is a simple text file, using displayable characters, and so starts a program that can view such a file, typically a program called Notepad. Notice that Notepad displays the contents of the file for you to review. In fact, Notepad is another text editor program, as was EDIT.COM. But it is more modern in that it uses the style common to programs written specifically for Microsoft Windows.

You should be aware that even a single software publisher might provide you with multiple alternate programs that perform similar functions. And of course anyone can write new programs which compete with these, too. Think of them as tools in the toolbox you call the magnetic hard disk drive.

Let's review. When we do our three-step dance, Windows will do one of three alternate things depending on what we type into the text entry field in the third step. If we type a Web page address, a Web browser will display that page. If we type the name of a file holding a program, Windows will start that program up. And if we type the name of a file not holding a program, Windows will try to figure out what sort of program can be used to view such a file and start it up to do so. It won't always succeed. But it can be interesting when it does.
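The three-way decision in this review can be sketched as a small function. This is only a model of the behavior described in the lesson: the function name and the list of program file endings are assumptions, and Windows itself decides using its own internal tables rather than anything this simple.

```python
import ntpath  # handles Windows-style file names on any computer

# Assumed file endings that mark a file as holding a program.
PROGRAM_ENDINGS = {".com", ".exe", ".bat"}


def classify_run_input(typed):
    """Guess which of the three actions the typed text would trigger."""
    typed = typed.strip()
    if typed.lower().startswith(("http://", "ftp://")):
        return "display in Web browser"
    ending = ntpath.splitext(typed)[1].lower()
    if ending in PROGRAM_ENDINGS:
        return "start the program"
    return "look for a viewer program"


print(classify_run_input("http://hchistory.com/"))
print(classify_run_input(r"C:\WINDOWS\COMMAND\EDIT.COM"))
print(classify_run_input(r"C:\WINDOWS\SYSTEM.INI"))
```

The three sample inputs are the ones used in this lesson, and each falls into a different branch.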

For example, it turns out we can use the Web browser program to save any Web page it displays into a file on our own hard disk. Let's do this, making sure to choose a name ending in ".htm" to indicate it is a Web page, for example, "MyOwnCopy.htm". If we later specify that same file in our usual three-step dance, our computer will re-open the copy we have saved and display it again using the Web browser. We won't have to connect to the Web server from which we got the page in the first place to look at it again. That might be very useful if that Web page is subject to change (e.g. it is the front page of a newspaper) and we want a record of what it looked like the last time we asked for a copy from the Web server.

Before we leave the subject of Web pages, I'd like to explore Web page addresses in a bit more detail. We showed you what the address of the home page on a computer running a Web server looked like. It begins with the usual prefix, "http://", is followed by the DNS address (or name) of the computer running the Web server, and then ends in a slash. For example, http://hchistory.com/ for the home page of the Haralson County Historical Society.

What is the structure of the addresses of other Web pages served up by that same server? They all start out the SAME WAY, and then append text which looks very much like a file tree hierarchy! In fact, this hierarchy was inspired in just such a way. So if you want to look at the list of the officers of the Historical Society, you could type:

http://hchistory.com/misc/officials.htm

That looks very much like a file named "officials.htm", in a folder named "misc", sitting under a file system rooted at the home page. But, we observe, the slashes aren't the "backward" slashes we use to designate files on Microsoft Windows computers. Instead, Web page addresses use the "forward" slashes you find in designating files on computers running the UNIX operating system.
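Python's standard library can take such an address apart, which makes the pieces easy to see. The variable names are mine. The last line merely shows how the same path would look with Windows-style slashes, and does not refer to any real file.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://hchistory.com/misc/officials.htm")

print(parts.scheme)    # http
print(parts.hostname)  # hchistory.com
print(parts.path)      # /misc/officials.htm

# The path reads like folders followed by a file name.
folders_and_file = parts.path.strip("/").split("/")
print(folders_and_file)  # ['misc', 'officials.htm']

# The same hierarchy written with "backward" slashes, for comparison only.
windows_style = parts.path.replace("/", "\\")
print(windows_style)
```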

7. E-mail - yet another new means of several to communicate

On to electronic mail, or as we say for short, e-mail. If the World Wide Web is like a giant (practically free) library, then e-mail is like having your own Western Union office, except that it is practically free, unlike the large prices we old-timers once associated with sending a telegram. As a matter of fact, as we shall see later on, one of the biggest problems associated with e-mail is how very microscopic the cost of sending it is!

As we have developed various methods of communications over the centuries, some methods have been totally supplanted, but in many cases we have simply added another tool to our toolbox, to be used when the balance of its advantages and disadvantages outweighed those of other media for a particular realm of competence.

Electrical telegraphy was a great breakthrough when it emerged in the early 19th century. For the first time, at least in principle, messages could be flashed around the world at the ultimate speed limit - the speed of light, which can travel a distance equal to 7 circumferences of the earth in under a second. Previously, telegraphy based on telescopic observation of visual signals limited communications to something like 120 miles an hour - only many millions of times slower! One book about telegraphy written in the late 1990's used the wry title "The Victorian Internet." Once the planet was wired with telegraph lines, people in London could learn about disasters happening in coastal China within minutes of their observation near a telegraphy office. The economic consequences could be discounted in financial markets and palliatives set into motion - long before the many weeks or even months it had previously taken for news to reach distant shores by ship. Needless to say, telegraphy completely changed the nature of military conflict as much as it changed economic competition.

So we should not make TOO MUCH out of subsequent electrical developments. But we shouldn't minimize them either. For example, the COST of sending information across oceans would come down many million-fold between the time the first telegraph oceanic cables were laid and the present day. That would open up the use of electrical communications for things scarcely imaginable at the dawn of the electrical age.

When the harmonic telegraph - better known to us as the telephone - was invented, amazing new possibilities emerged. But even the great could not see its potential. Supposedly, the first US president to witness a demonstration of the telephone, Rutherford B. Hayes, said:

"It's an amazing invention but who would ever use it?"

Perhaps Hayes thought the telegraph was just fine. But early telephone proponents offered three advantages of the telephone over the telegraph.

First, unlike the telegraph, the telephone operated without any power source save the voice of its users. The microphone was so sensitive an instrument that the breath provided enough vigor to make the microphone a big enough dynamo for what electricity the distant loudspeaker required to be audible by human ears. (Imagine how impressed you might be if you could operate your automobile without gasoline, just by using your steering wheel.) Eventually, electrically powered amplification was required - to do things like span the distance between coasts - but early telephones worked for hundreds of miles without need of batteries, much less an electrical power grid which was non-existent! But maybe President Hayes thought batteries were not that expensive, especially if you just use telegraphy for the important stuff that only national presidents and such like did!

Two other advantages were presented for the telephone. Unlike the telegraph, it required no skilled operator who could "hammer out" and understand Morse code - you merely spoke and listened, just as you did in normal conversation. Finally, since untrained people could speak much faster than even trained operators could send Morse code, communication was much faster. Perhaps President Hayes was not impressed by the cost saved by eliminating a technical specialist like a Morse code operator, at least for a US President, and thought that since you were communicating at the speed of light anyway, sending a message in 1 minute rather than 3 minutes wasn't that much an improvement. He probably couldn't envision the use of the "improved telegraph" for a free-wheeling two-way brainstorming session. After all, it was only for sending messages on which the fate of nations would hang, and so you wouldn't want to say anything with it anyway you hadn't thought about a long time in advance!

Of course, now we laugh at the idea that no one would want to use the telephone, although perhaps we might agree the telephone never learned to speak clearly enough to easily disambiguate similar sounds, and has probably gotten much worse now that we use wireless phones in areas with marginal cell-tower coverage! Of course even that didn't stop our fellow Americans from buying more cell phones than there are households, and people in Europe doing so long before we did.

Telephone systems have gotten ever more capable now that we have answering machines and voice mail. You don't have to rendezvous with your recipient just to leave him a message. He can pick it up when he finds it convenient - and you two can still enjoy much reduced latency compared to paper-based mail.

One thing both the (traditional-style) telephone and telegraph are equally poor at is conveying graphical information - or even large amounts of text using arcane words like foreign language or technical jargon unfamiliar to one party or the other. Paper mail is great for that stuff, but mail takes hours if not days to be delivered - unless, of course, you use a facsimile, or "fax" machine.

Primitive experiments with fax machines date back to the dawn of electrical telegraphy in the early 19th century. But they didn't become practical tools until news services started using them in the early 20th century to wire photographic images. They became common business tools by the 1980s, when they were extensively used in Japan, in part due to the pictographic nature of the Kanji script. Late generation fax machines are digital devices, making them very immune to error. They are dead simple to use, and very cheap to boot, just a fraction of $100. If you can feed a sheet of paper into a slot and dial a phone number - or maybe just punch a speed dial button, you are a qualified fax operator. People who don't want to invest time or money using computers might be well advised to get themselves a fax machine - saving themselves lots of driving without the delays, inconvenience, and even expense of using paper-based mail.

So, you ask, in an age when most people can afford a fax machine, why bother with something like e-mail - a glorified telegram machine?

Well, if you care to save your fax messages for future reference - perhaps because they have tax or legal import, you might appreciate the ability to store them compactly and search them for content thousands of times faster than any human can. Paper isn't good for that, but digital storage is. E-mail consists of a series of characters your computer can store and read directly. I have never thrown away any e-mail, because it is so easy to store, so quick to search, and so easy to copy to provide backup or communication to third parties. It might be easy to copy a few old fax pages to a third party; but what if you wanted to send a portfolio of hundreds of pages relating to an important matter in one shot?

Computers are getting better at automatically recognizing text so that it can be read as easily as e-mail. If fax machines don't disappear - their sales peaked at the turn of the century - perhaps they will ultimately bundle in a computer-based feature like this to help close the utility gap with e-mail.

E-mail usually includes the ability to enclose or "attach" arbitrary computer files along with a cover letter of text. This means it is good at transmitting multimedia content like color photos, audio messages and other things - something fax machines can't do. Of course, a QUARTER BILLION wireless camera phones were sold around the world last year, so phones are getting more capable - more like full blown PCs in fact. But for now, they don't have quite the flexibility of a real PC - for example, right now they can only send photos they capture, not digital photos held in files, much less things like multidimensional data files of interest to some experts.

Still, one must grant that wireless phones are now much better at things like sending text - at least if you don't mind typing on a teeny keyboard. Many cellular services now support so-called "text-messaging." It is wildly popular outside the United States, especially in poor countries, for the economy it affords over the use of voice calls. In fact, it is a sort of e-mail, save that the party who is sent the message has the additional option to let his phone signal when a new message arrives, something people with PCs COULD do with e-mail, but usually choose not to do. Another thing that is very nice about wireless phones is their compact size, which lets you carry them anywhere, so you can always count on having one around. You don't have to wait to reach your home or office to access a desktop PC or find a WiFi hotspot for the laptop PC you carry. And even if your laptop can use a cellular telephony modem, a handheld gadget like a cellphone will be taken very many places you'd never consider dragging even the lightest laptop PC.

In the long run, increasingly capable wireless phones may prove to be our primary tools for all sorts of communication, if not our only one. Auxiliary eyepiece displays may make imagery more detailed at very little energy cost and full-sized fold-up keyboards have actually been sold for some years now. One cell phone now even bundles in limited voice recognition using a microprocessor based in the phone itself!

8. A brief history of transoceanic electrical communication

It is interesting to look back how far we have come. The first telegraph cable under the Atlantic Ocean was used to exchange greeting messages between Queen Victoria and an American president whose name you know - James Buchanan. (It just so happens I lived for many years in a village adjacent to a huge tract of land that was the summer home of the man behind this cable - Cyrus West Field.)

Sadly, the early cable broke almost immediately and continuous telegraph service across the Atlantic Ocean only began after the US Civil War. It first cost about $1800 in today's money to send 20 words - almost $100 a word! And the initial speed was only 2 words per MINUTE.

But we made lots of progress in the 20th century. I like to think of the history of telephony in the twentieth century in terms of the story of my family.

There was no regular telephone service across the Atlantic Ocean when my parents were born in Europe. But before they started school, a radio link which could support a SINGLE telephone call at a time went up. There was less of a queue to use it than you might think, because a three-minute call cost over $750 in today's money. Still, if you could speak 125 words per minute, you "only" paid about $2 per word, figuring in currency inflation.

When my folks came to America after World War II, someone would still have to use the same radio link to call across the ocean. That was even true the year I was born, in the middle of the Baby Boom. But the year I turned one - close to exactly a century after those telegrams between Victoria and Buchanan - they laid the first trans-Atlantic telephone cable, which supported up to three dozen simultaneous calls. It was now much cheaper to talk, too - a three-minute call "only" cost a little over $80 in today's money.

Now, it turns out that 18 years after that, I was a young engineer at Bell Labs working on the prospective sixth trans-Atlantic telephone cable. Do you remember the tragic demise of the cable President Buchanan used? Then maybe you are not surprised to hear that years before we would lay a new ocean cable, we'd figure out how to fix them if they went bad. My work was a theoretical computer study that estimated how accurately we could locate a fault in the cable if it occurred.

A couple years after I finished my study, our copper cable was deployed and could support some four thousand telephone calls between Europe and America at one time. But even as late as 1980 you still paid about $11 for a three-minute call New York-London in today's money. In those days I helped teach electrodynamics to electrical engineering undergraduates at MIT, and our course also covered optical fiber cable communications using lasers. But it would be some years until cables like that were used for intercontinental telephone service.

Finally, in 1988, the first optical fiber cable spanned the Atlantic, meaning that enormously more telephone calls might be supported than over old-fashioned copper cables. And by the turn of the century the latest generation of optical fiber cable, using multiple laser beams, could support MILLIONS of simultaneous phone calls. So nowadays you can use the Internet to do intercontinental voice chat, basically for FREE. And it only costs a penny or three a minute if you want to call out to a conventional telephone system on the other end. I hope to demo something like this during a future lesson.

9. E-mail - an old favorite by now, especially via the Web

In later lessons, we will examine PC-based alternatives to e-mail like live text chat software and electronic bulletin boards, as well as the use of audio and video communications using the Internet. But for all the fancier alternatives, old-fashioned PC-based e-mail is not without its charms. By using it, I many years ago essentially eliminated nearly all of my long-distance telephone calls - although that's not important now that we live in an age where long distance calls are almost free. And for people without fancy cellphones or PCs at home, the PCs one finds in any library make handy tools for communicating with the vast number of people like me who make so-called traditional Internet e-mail an everyday tool.

One particularly handy type of e-mail does not rely on using one's computer to run an e-mail program or archive copies of old e-mails. This is so-called Web-based e-mail. The idea is that a third-party computer on the Internet supports a Web site which provides an interface, or access scheme, to an e-mail system it also supports, by means of dynamically created Web pages and archival files on its machines for your e-mails.

If you can figure out how to retrieve a Web page, and you can type letters into text entry fields on Web pages, you have all the basic skills you need to use Web-based e-mail. Nowadays third parties provide e-mail services, including vast archives (100 megabytes and up) for copies of the e-mails you send and receive - all for free, which is to say, often through the financing provided by ads.

E-mail services provided through the Web allow you to access your e-mail anywhere you find a computer which can access the World Wide Web! While not quite as portable as a cell phone, you can always find the access you need at most any library in the cities of the developed world. You don't have to wait to get home!

Parties who let you "outsource" the support of e-mail through an Internet connection are an example of what we call an "ASP" or Application Service Provider. Some people think that in the future, rather than buying as many programs to install on our own computers, we might instead more often lease access to programs running on distant computers maintained by experts, with which we communicate using a machine of only enough smarts to do the communications - something called a "Thin Client". This would actually be something of a return to the past, a pattern which held for many computer professionals who accessed large remote computers from home using simple terminals, in the age before PCs came into existence. The difference now would be the vastly faster communications lines we'd use, which would allow us a very rich graphical and auditory experience compared to what those who dwelt in the slow text ghetto knew all those decades ago.

10. Signing up for and using Web-based e-mail

There are many providers of totally free or ad-supported e-mail services which have you use the Web to access them. Among those with the largest clienteles are Yahoo and Hotmail. The Haralson County Historical Society switched over to the former when trouble emerged using the latter.

Sadly, I could not find a provider that offered both a simple sign-up procedure that asked few questions and Web pages which could be reliably manipulated without depending upon a mouse.

I will assume you have now learned to use the mouse on your own or will learn soon, from the several Web-based interactive lessons to which I have referred you in previous lessons. Ask me if you have trouble locating them!

A service called My Way provides free Web-based e-mail service, and has a relatively easy sign-up procedure. Of course all services have you agree to a terms of service document, which in effect is a contract. However, the enforceability of such contracts is open to question since your agreement is never witnessed by a human being and the authentication of your identity varies from very weak to none.

This week I will print out the agreement for you to read in your spare time, and then you can sign up for the service if you approve. I think most people will. Next week let me know if you need help signing up.

The Web site for My Way is found at myway.com, or, more formally, http://myway.com/ . Entering the abbreviated address I gave you will actually get you "automagically" scooted over to the Web page at: http://www.myway.com/index1.html . From there you can click on the link labelled "My Email" and start the sign-up procedure. After you have signed up, you will continue to use this path to access your e-mail account - whether you want to send e-mail to other parties, or check if others have sent you e-mail.

Just like you need a street address or post office box to receive mail, you will need an "e-mail address" so that people have a means to direct e-mail to you. The address for the Haralson County Historical Society is:

HaralsonHistory@yahoo.com

An e-mail address is made up of two parts, separated by a special symbol, the "at" sign ( @ ). All conventional Internet e-mail uses this format. The letters to the left of the sign are your user identity or "userid". Those to the right of the sign are the address of the COMPUTER which will receive incoming e-mail. Either a numerical IP address or a more user-friendly DNS address might be used. The latter is by far the alternative most often encountered.
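
For those curious how a program sees this two-part format, here is a minimal sketch in Python. The function name split_address is my own invention for illustration, and the check it performs is far looser than what real mail systems apply.

```python
def split_address(address):
    """Split an e-mail address into (userid, host) at the "at" sign."""
    # rpartition splits at the LAST "@", so everything after it is the host.
    userid, at_sign, host = address.rpartition("@")
    if not at_sign or not userid or not host:
        raise ValueError("not of the form userid@host: " + repr(address))
    return userid, host


userid, host = split_address("HaralsonHistory@yahoo.com")
print(userid)  # HaralsonHistory
print(host)    # yahoo.com
```

The part printed first is the userid; the part printed second is the DNS address of the computer which receives the mail.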

After you sign up with My Way, your e-mail address will be of the form:

myuserid@myway.com

where "myuserid" will be the userid you choose to identify yourself. You will probably have a limited choice for the number and types of symbols in your userid, and an attempt to elect the userid which someone else using the service has already reserved will naturally result in your request being declined.

You might like to think about the userid you choose. You will probably want it to be easy for people to type, although the use of electronic address books can make this issue basically unimportant. But not everyone carries such a book with him, and it might be nice if they could remember your e-mail address all the same when they want to write you from odd places. One clever thing some people do is create a Web page with the e-mail addresses and other things they want to carry around with them. If they can remember that one Web page address, they can access the information it stores from anywhere in the world.

11. The culture of the e-mail world

When you want to send somebody else e-mail, you have to know their e-mail address. You can ask for it, or you might just ask them to send you an e-mail by giving them your address. The mail you get will almost always show their return address. But doing this means you are relying on them not to forget to write you!

Web-based e-mail gives you templates to fill in to generate e-mail you want to send. This includes not only any text you want to communicate, but other things as well. For example, you need to specify the recipient or recipients of your message. In general, you can send e-mail to a great many people just as easily as you can to a single individual. But if you find you are sending lots of stuff to the same large group over and over again, you should think about setting up a Web-based electronic bulletin board instead, which will also automate the means of archiving communications. This is especially true if the group is comprised of the members of an organization that may outlive their membership in it. Future members will appreciate the legacy of electronic archives their forerunners will have left behind.

Another important thing to consider when drafting an e-mail is a single line summarizing the message, which you will put in the "subject" line. This helps your recipient make better use of his time by perusing the subject line before diving deeper into the message - if at all! Sending e-mail is cheap and easy, so people tend to send a lot when they get good at it. You only have so much time in your life, so you may not want to read everything other folks send you - especially when it is something they clipped for you with minimal effort, in which you have little interest.
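
The pieces of a message described above (recipients, subject line, text) can be sketched with Python's standard email library. This is only an illustration of the parts of a message, not how any Web-based service works inside. The address friend@example.com, the subject and the message text are made up for the example, and nothing is actually sent.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "myuserid@myway.com"
# Several recipients are listed on one line, separated by commas.
msg["To"] = "HaralsonHistory@yahoo.com, friend@example.com"
# A one-line summary the recipient sees before opening the message.
msg["Subject"] = "Question about Saturday visiting hours"
msg.set_content("Hello,\n\nWhat hours is the Society open on Saturdays?\n")

# Printing shows the header lines first, then a blank line, then the text.
print(msg)
```

Notice that adding a second recipient took no more work than adding a comma and another address.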

Because it is so easy to read and write e-mail, an original message may start an entire conversation that extends over many individual messages. There is no generally accepted etiquette on how you express a desire to cut the chat off. Use good manners. But many people understand an open-ended message that goes long unanswered may mean something other than preoccupation.

And the first time you send someone an e-mail, you may be surprised how long it takes to get a reply. I once sent a small government an e-mail that was only picked up and answered about 18 months later! Most organizations accept e-mail these days, and they usually make an effort to indicate how long a real reply might take by use of a very prompt automated boilerplate reply.

Many people check e-mail once a day, others once a week or even just once a month. Just because e-mail can show up at once doesn't mean people feel obliged to answer quickly, if they have busy lives - or maybe just a very great many e-mail accounts to attend! When someone gives me their e-mail address, I usually send a quick hello to gauge how long it will take them to answer a typical e-mail: it is a better indicator than what they might tell you explicitly if asked! If they can't be reached by alternative electronic means like the telephone, e.g. on account of taste, and one day you need to reach them in a hurry, it is good to know in advance how long an e-mail reply might take!

Also, appreciate that sometimes people have a good reason to abandon one e-mail account for another, just as they may switch from using one post office box to another in a big city - perhaps without using a change of address form. Maybe they just want to escape the stream of junk mail now clogging up their old box! We use a special name for junk e-mail. We call it "spam", after a certain classic routine of the old British comedy ensemble Monty Python's Flying Circus. If you change e-mail accounts, whether to dodge spam or for another reason, do remember to write your e-mail penpals about the change if you want to hear from them again!

Spam has reached epidemic levels. There are effective strategies for dealing with it, which we can address if you are interested. Global solutions are possible, but the "stickiness" of legacy customs makes it hard to effect change. One simple method, with limited effectiveness, is not to give out your e-mail address with wild abandon, and to ask those to whom you give it to request your explicit permission before passing it on to third parties. Also, realize that if you post your e-mail on a Web page, mailing list aggregators can use programs to automatically scour the Web for just that sort of thing! They can then use any of millions of easily hijacked computers on the Internet to do mass-mailings for free, while covering their tracks.
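
To see why posting your address on a Web page is risky, consider how little programming it takes to pick addresses out of a page's text. The sketch below is a rough illustration only: the pattern is a simplification I chose for the example (real addresses can take more forms), and the page text and addresses in it are made up.

```python
import re

# A deliberately simple pattern: some characters, an "at" sign, a host name
# containing at least one dot.
ADDRESS_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

page_text = """
Welcome to our club page! Write the secretary at secretary@example.com
for details. Our webmaster is webmaster@example.org.
"""

found = ADDRESS_PATTERN.findall(page_text)
print(found)  # ['secretary@example.com', 'webmaster@example.org']
```

A program like this, pointed at millions of Web pages instead of one short string, is all an aggregator needs to build a mailing list.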

You should be aware that unlike paper mail sealed in an envelope, e-mail is not private at all - it's like the text written on a postcard. Everyone - which is to say, every computer - who handles it can read it whole. That includes the human administrators of said computers, of course. It is possible to encrypt e-mail, so that only the intended recipient can read it, assuming he keeps a secure computer regimen - a neat trick these days, to say the least. And you can also authenticate the origin of e-mail with technology related to encryption. Free tools are readily available, but are rarely integrated with e-mail programs. But they provide the maximum in privacy you will find in this world, short of the use of steganography to hide the fact you are sending messages in the first place! Even without steganography, free encryption tools provide theoretical security regimens even Big Brother can't crack without the use of "black bag" techniques.

If you use an e-mail program which runs on your PC, like Outlook or Eudora, rather than Web-based e-mail, you will save copies of e-mail you send and receive on your hard disk.

But if you use Web-based e-mail, your provider will almost certainly give you the ability to save e-mails you send and receive on HIS hard disk. This will take the form of simulated folders you can create and use to group your individual messages. But you may want to back up such records (e.g. on your own PC) in the event your provider closes shop and leaves you in the lurch.

E-mail is a part of modern life. You're not an experienced computer user if you haven't tried it in one form or the other. It provides a great way to archive the episodes of your life, profound and trivial, for later recall by you - or your heirs. It is super-cheap and compact to store, search and copy compared to paper-based archives - just like all electronic records. Even people who deal very little with other types of written records write notes to other people now and then. E-mail is a great way for them to taste what the digital age is like.


A P P E N D I C E S


An interesting study of the historic costs and usages of various communications systems is: Internet pricing and the history of communications, A. M. Odlyzko. This paper is a subset of a larger work called: The history of communications and its implications for the Internet, A. M. Odlyzko. All of Professor Odlyzko's papers are collected here.


A partial history of communication price and volume by type


(To convert 1972 dollars above to 2005 dollars, multiply by 4.5)

Today (2005), at a wholesale price of roughly $1 per gigabyte,
Internet traffic costs for 1000 words are as follows:
1. 0.0005 cents for text (5 bytes per word)
2. 0.0500 cents for 8 kbps (minimal) VoIP audio conveying 120 WPM speech
3. 1.6000 cents for 256 kbps (minimal) point-to-point video conveying 120 WPM speech
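
The three figures above can be checked with a few lines of arithmetic. The sketch below assumes a gigabyte of 1,000,000,000 bytes and a kilobit of 1000 bits, neither of which is stated above. The function and variable names are mine.

```python
PRICE_DOLLARS_PER_GIGABYTE = 1.0
BYTES_PER_GIGABYTE = 1_000_000_000   # assumption: decimal gigabyte
WORDS = 1000
WORDS_PER_MINUTE = 120


def cost_in_cents(num_bytes):
    """Cost of moving num_bytes at the wholesale price, in cents."""
    dollars = num_bytes / BYTES_PER_GIGABYTE * PRICE_DOLLARS_PER_GIGABYTE
    return dollars * 100


def stream_bytes(kilobits_per_second):
    """Bytes sent by a steady stream lasting as long as 1000 spoken words."""
    seconds = WORDS / WORDS_PER_MINUTE * 60        # about 500 seconds
    return kilobits_per_second * 1000 * seconds / 8  # 8 bits per byte


text_cents = cost_in_cents(WORDS * 5)            # 5 bytes per word
audio_cents = cost_in_cents(stream_bytes(8))     # 8 kbps VoIP
video_cents = cost_in_cents(stream_bytes(256))   # 256 kbps video

print(f"text:  {text_cents:.4f} cents")
print(f"audio: {audio_cents:.4f} cents")
print(f"video: {video_cents:.4f} cents")
```

Speaking 1000 words at 120 words per minute takes about 500 seconds, which is why the audio and video costs scale directly with the bit rate.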


A large collection of statistics pertaining to information technology is available at IT Facts