Wednesday, April 1, 2015

Teaching the Internet: One of the Big Ideas of Computer Science

The Internet is not only a ubiquitous tool in our society, but it is an amazing intellectual and engineering achievement.  Just as every person in our society should learn the basics of how electricity works, every person in our society should learn the basics of how the Internet works -- both because most people use it almost every day, and because there are some enchanting ideas behind the Internet.  I believe its basic elements could be taught to middle school students in several days using the following approach.

What does it even mean to connect computers together using a network? By analogy, the telephone network allows us to connect any two phones in the world so that they can exchange audio streams.  Despite the huge size and complexity of the telephone network, it's quite simple for users to establish a phone call without any knowledge of the underlying network.  For example, to place a call, they don't need to know that the call must pass through a switching station in (for example) New Jersey.  Similarly, users of any two computers on the Internet can establish a connection between their computers in which the computers exchange streams of bits (ones and zeros) that are represented as different voltage levels on the electrical cables connecting the computers.  Again, the users can open a connection without specifying that their data must flow through a certain trans-Pacific cable.

But how does the fact that computers can exchange ones and zeros allow them to exchange web pages or cat videos?  There are really three key aspects of understanding how computers represent data: the first is that ones and zeros are the basic currency of computers, the primitives with which all digital data are encoded; the second is that any integer can be represented as a sequence of ones and zeros by encoding it in base two (binary); the third is that digital data such as text, images, and audio are encoded within computers and computer networks by translating them into numbers (and then, into a stream of ones and zeros).

In sixth grade, my son learned how to view decimal numbers as numbers encoded using base ten: that is, to treat every digit in a decimal number as a multiplier for a power of ten (the "ones place", the "tens place", the "hundreds place", etc.).  Once you grasp that notion for base ten, it's easy to change the base to another number, and so to teach how to encode numbers in base two.  When a number is encoded in binary, it becomes a sequence of ones and zeros.
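As a minimal sketch of the place-value idea, here is one way to convert between decimal and binary in Python.  The function names to_binary and from_binary are made up for this illustration; the method is the repeated division by two that might be taught in a classroom, not anything specific to how network hardware works.

```python
def to_binary(n: int) -> str:
    """Encode a non-negative integer in base two by repeated division by 2."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # remainder is the next binary digit, lowest place first
        n //= 2
    return "".join(reversed(digits))  # highest place goes on the left


def from_binary(bits: str) -> int:
    """Decode a string of ones and zeros: each digit multiplies a power of two."""
    value = 0
    for bit in bits:
        value = value * 2 + int(bit)
    return value


print(to_binary(13))        # 1101  (8 + 4 + 0 + 1)
print(from_binary("1101"))  # 13
```

Reading 1101 from the left, the places are the "eights place", the "fours place", the "twos place", and the "ones place", just as decimal has hundreds, tens, and ones.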

The next step is to describe how to encode various types of data as numbers.  Encoding text as numbers is simple: we assign a different number to every character.  So we can encode the sequence of characters within a book or a web page by translating each character to its corresponding number.  Images can be encoded as numbers by assigning a different number to every pixel in the image according to the brightness of the pixel (color images can be encoded by using three numbers for each pixel, one for the brightness of each primary color).  Thus, an image becomes a long series of numbers (which are translated to an even longer series of bits).  A video can be encoded as a sequence of images.  Then it becomes possible to explain how many images and videos might fit into one gigabyte, and how the speed of an Internet connection affects the time required to download a cat video.
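The following sketch illustrates both steps: turning text into bits, and estimating storage and download time.  It rests on some assumptions that are not part of the discussion above: characters are encoded with UTF-8, images are stored uncompressed at one byte per color per pixel, and a gigabyte is one billion bytes.  Real image and video formats use compression, so actual files are much smaller than these figures.

```python
def text_to_bits(text: str) -> str:
    """Translate each character to a number (UTF-8), then each number to 8 bits."""
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))


def image_size_bytes(width: int, height: int, colors: int = 3) -> int:
    """Uncompressed size, assuming one byte per color per pixel."""
    return width * height * colors


def download_seconds(size_bytes: int, megabits_per_second: float) -> float:
    """Time to transfer size_bytes over a link of the given speed (8 bits per byte)."""
    return size_bytes * 8 / (megabits_per_second * 1_000_000)


GIGABYTE = 1_000_000_000  # assumption: decimal gigabyte

print(text_to_bits("Cat"))                         # 01000011 01100001 01110100
frame = image_size_bytes(1920, 1080)               # one uncompressed full-HD color image
print(frame)                                       # 6220800
print(GIGABYTE // frame)                           # 160 such images fit in one gigabyte
print(download_seconds(GIGABYTE, 100))             # 80.0 seconds at 100 megabits per second
```

Working through numbers like these makes it clear why an uncompressed video, at many images per second, would be impractically large, and why connection speed matters so much.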

Once we can explain how a cat video is encoded as a sequence of bits, we can explain how that sequence of bits can be segmented into packets, what packet switching is, and why packet switching facilitates sharing of network bandwidth, failure recovery, and management of network congestion.  We can explain that links in the network can be constructed using a range of physical implementations, from Ethernet to optical fiber to Wi-Fi.  We can explain how every computer has an address within the network, and the importance of routing algorithms in directing packets along the most efficient path through the network.  We can explain that packets have headers that specify their destination address and that carry a hop limit, which prevents them from circulating in the network indefinitely.
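A toy model of these ideas might look like the sketch below.  It is not how real Internet packets are laid out: the Packet fields, the tiny four-byte payload size, the sequence number used for reassembly, and the address "10.0.0.7" are all assumptions chosen for illustration.  The hops_left field plays the role of the header value that keeps a packet from circulating forever: each router decrements it and drops the packet when it runs out.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Packet:
    destination: str  # header: where the packet is going
    sequence: int     # header: position of this piece in the original data
    hops_left: int    # header: how many more routers may forward it
    payload: bytes    # the slice of data being carried


def segment(data: bytes, destination: str, payload_size: int = 4, hops: int = 8) -> List[Packet]:
    """Cut data into fixed-size pieces and put a header on each one."""
    return [
        Packet(destination, seq, hops, data[start:start + payload_size])
        for seq, start in enumerate(range(0, len(data), payload_size))
    ]


def forward(packet: Packet) -> Optional[Packet]:
    """What a router does in this model: use up one hop, or drop the packet."""
    if packet.hops_left <= 1:
        return None  # dropped, so it cannot loop indefinitely
    return Packet(packet.destination, packet.sequence, packet.hops_left - 1, packet.payload)


def reassemble(packets: Iterable[Packet]) -> bytes:
    """Packets may arrive out of order; sort by sequence number before joining."""
    return b"".join(p.payload for p in sorted(packets, key=lambda p: p.sequence))


packets = segment(b"cat video", "10.0.0.7")
print(len(packets))                   # 3 packets: 4 + 4 + 1 bytes
print(reassemble(reversed(packets)))  # b'cat video'
```

Because each packet carries its own header, different packets from the same video can take different paths, share links with other people's traffic, and still be put back together at the destination.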

We can also explain the difference between the Internet and the World Wide Web: the Internet speaks multiple languages, or protocols, to accomplish different tasks, and the Web is built on just one of them.  Consider this analogy to a possible extension to the telephone system: imagine that we added one additional digit to every telephone number, where that digit specifies which of ten possible human languages the answerer should speak when they answer the phone.  Similarly, every Internet connection specifies a port number that designates whether the computer that answers the connection should be prepared to receive an email message, or provide a web page, or initiate a Skype conversation, or provide data for an interactive multi-user aerial combat game.
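The "extra digit" in the analogy can be sketched as a simple lookup table.  The three port numbers below are standard, well-known assignments; the table and the service_for function are just an illustration, and a real computer decides which program answers a connection inside its operating system rather than with a dictionary like this.

```python
# A few well-known port numbers and the kind of conversation each one implies.
WELL_KNOWN_PORTS = {
    25: "email delivery (SMTP)",
    80: "web pages (HTTP)",
    443: "secure web pages (HTTPS)",
}


def service_for(port: int) -> str:
    """Say which 'language' the answering computer is expected to speak."""
    return WELL_KNOWN_PORTS.get(port, "unknown or application-specific service")


print(service_for(80))     # web pages (HTTP)
print(service_for(25))     # email delivery (SMTP)
print(service_for(54321))  # unknown or application-specific service
```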

The Internet is extremely complex, and I don't mean to imply that every aspect of its workings can be explained in a few days.  But I believe that its essential aspects can be explained in a few days at the middle school or high school level.  You don't have to be a computer scientist or a programmer to understand basic aspects of how the Internet works.