|Obtain unprecedented power, availability and capacity by getting many machines to work together on a problem.|
How do you maintain availability when your server room is on fire? Don't rely on RAID or Clustering -- neither effectively protects against the failure of an entire network segment. RAID can only save you from basic drive failure; any other problem and you're sunk. Clustering doesn't help either: because of the enormous amount of data being transferred, most clustered computers need to be on the same network segment or physically connected by special cables. This means that any physical disaster that destroys one cluster machine will probably take the others down with it.
What we really want is something with the power of clustering, but using low-cost computers that can be located anywhere on the network. That something is a Distributed Computing network. In fact, most of our organizations already have a system like this in place, but are either unaware of it or think of it as a security threat. It's called Peer-to-Peer filesharing (xref ch10). That's right -- all those employees running Napster and its ilk are really linking your organization to a global Distributed Computing network. While the most prominent applications may be a bit misguided, the fundamental technology is ideal for creating an ultra-highly available computing environment.
Distributed Computing is a broad topic. We're going to break it into three functional segments: messaging, processing and storage. Let's look at each segment in more detail.
Messaging: The ability to move information between computers in a distributed environment is a prerequisite for distributed storage or processing. Thus, distributed messaging is probably the most explored, but least heralded, aspect of distributed computing. The basic internet routing protocols are distributed messaging applications. Email is also a distributed messaging application. CORBA, IIOP and RMI are examples of standardized systems for distributed messaging. New distributed messaging applications are constantly emerging, such as peer-to-peer instant messaging, anonymity/anti-censorship networks (FreeNet, FreeBird), Web services (SOAP), and Microsoft's .NET framework.
Processing: In the early nineties scientists began to theorize that you could get supercomputer-type power by combining a huge number of weaker computers. But it was SETI@home that showed everyone just how much power was really available. The non-profit project's distributed network surpassed the computing power of the world's most advanced supercomputers in a matter of months. A number of companies have since tried to commercialize similar technology, offering an inexpensive and more robust alternative to supercomputers. This field of study is now known as Grid Computing. But the reality is that most organizations don't need supercomputing power, so growth has been slow.
Storage: To the vast majority of organizations, distributed storage should be particularly tantalizing. It can offer high security, scalability, availability, and performance -- if implemented correctly. With distributed storage, the concept of a central fileserver vanishes. Instead, the unused space on workstations and servers throughout the network is pooled together, creating a massive virtual RAID array. No one PC is a weak point -- in fact, a significant number of machines have to fail before data is lost. Backup tapes are only needed in the most cataclysmic of situations. With multiple worksites, the system becomes even more robust, allowing the data from an entire site to be recovered from any other site.
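To make the pooled-storage idea concrete, here's a toy Python sketch of chunk replication across workstations. Everything in it -- the class name, the replication factor of three, the chunk layout -- is an illustrative assumption, not a description of any real product:

```python
import random

REPLICAS = 3  # hypothetical replication factor


class DistributedStore:
    """Toy model: each file chunk is copied onto several machines' spare space."""

    def __init__(self, node_count):
        self.nodes = {i: {} for i in range(node_count)}
        self.lengths = {}  # file name -> number of chunks

    def put(self, name, chunks):
        self.lengths[name] = len(chunks)
        for idx, chunk in enumerate(chunks):
            # copy each chunk onto REPLICAS randomly chosen machines
            for node in random.sample(list(self.nodes), REPLICAS):
                self.nodes[node][(name, idx)] = chunk

    def fail(self, node):
        self.nodes.pop(node, None)  # simulate a dead workstation

    def get(self, name):
        out = []
        for idx in range(self.lengths[name]):
            # any surviving replica of the chunk will do
            copy = next((store[(name, idx)] for store in self.nodes.values()
                         if (name, idx) in store), None)
            if copy is None:
                raise IOError("chunk %d lost" % idx)
            out.append(copy)
        return b"".join(out)


store = DistributedStore(node_count=10)
store.put("report", [b"alpha ", b"beta ", b"gamma"])
store.fail(0)
store.fail(1)  # two machines die; the remaining replicas keep the file readable
print(store.get("report"))  # -> b'alpha beta gamma'
```

Because each chunk lives on three distinct machines, losing any two of the ten leaves at least one copy of every chunk -- the "significant number of failures before data loss" property described above.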
Now for the reality check. Obviously something has to be wrong, because RAID and Clustering still exist, and are still in use everywhere. The first problem is that the field is relatively new, and reliable commercial systems have not had time to mature. The mature systems that do exist are variants on Clustering technology, using high performance computers or specialized storage devices. For example, in the Enterprise Storage world, Distributed Storage means treating all of your high-end file and backup servers as a single massive storage device.
The second problem is that network lag and latency make running real-time applications over a distributed network rather difficult. For example, a database call that needed to contact 30 different PCs across the organization would not be nearly as fast as a lookup to a single, high-performance server or database cluster. There are ways around this, such as caching frequently used information in a few nearby systems to minimize lookup time. But current applications are not yet that advanced.
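The caching idea can be sketched in a few lines of Python. The `remote_lookup` function below is a hypothetical stand-in for a query that must cross the network; the point is simply that only the first request for a given key pays the round-trip cost:

```python
import functools
import time


def remote_lookup(key):
    """Stand-in for a query that must contact many PCs across the network."""
    time.sleep(0.01)  # simulated network latency
    return key.upper()


@functools.lru_cache(maxsize=128)
def cached_lookup(key):
    # only the first request for a key pays the network round-trip
    return remote_lookup(key)


cached_lookup("alice")  # slow: goes out over the network
cached_lookup("alice")  # fast: served from the local cache
print(cached_lookup.cache_info().hits)  # -> 1
```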
The truth is, only certain applications of distributed network computing make sense right now. While there are certain types of computational problems that work well over a distributed network, few organizations need this sort of power. At the moment, distributed storage is the most interesting aspect of distributed computing.
Napster and its descendants showed the potential of distributed storage by creating an incredibly large and reliable data warehouse from thousands of inexpensive PCs across the Internet. The most popular files were very highly available -- at any given time hundreds of computers around the world would have these files online. Even if fifty of these computers went offline, there were at least as many more to continue providing the information. Without even meaning to, the Peer-to-peer networks achieved levels of high availability that were previously considered impossible.
Unfortunately, the fact that Peer-to-Peer technology also facilitated the illegal distribution of copyrighted material overshadowed its high-availability benefits. The users failed to notice (they were too busy listening to their new tunes or watching the latest flicks), and the business world was too busy trying to suffocate the technology under mounds of legal paper.
Security in a distributed environment is very difficult. It's one thing to secure a relatively small number of centralized servers. It's another thing entirely to secure all of the desktops and workstations throughout an organization. In general, most of the machines in a grid are not going to be trustworthy. Furthermore, the network connecting the grid poses its own security risks. An intruder can gain a wealth of knowledge from basic and advanced analysis of the constant flow of information between grid nodes.
A secure distributed computing application needs to maintain data security even if a large number of machines are compromised. In some parallel computing environments, the algorithm used might be as valuable as the data being computed. In such situations the program code should also be protected. Obtaining all of the information on any participating machine should not be enough to piece together the protected data. The only single point of failure should be the machine that is controlling the application. Protecting this single machine becomes the only critical security task.
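One way to meet the requirement that no single machine's data is enough to reconstruct anything is secret splitting. Here's a minimal sketch using XOR-based splitting (an illustrative technique, not one the book prescribes): each node holds one share, every share is needed to rebuild the data, and any single share on its own is indistinguishable from random noise:

```python
import os


def split(secret, n):
    """Split `secret` into n shares; all n are needed to reconstruct it."""
    parts = [os.urandom(len(secret)) for _ in range(n - 1)]
    final = secret
    for p in parts:
        # XOR the secret with every random share to form the last share
        final = bytes(a ^ b for a, b in zip(final, p))
    return parts + [final]


def combine(shares):
    """XOR all shares back together to recover the secret."""
    out = bytes(len(shares[0]))
    for share in shares:
        out = bytes(a ^ b for a, b in zip(out, share))
    return out


shares = split(b"grid secret", 4)  # one share per participating machine
print(combine(shares) == b"grid secret")  # -> True
```

A compromised node that holds one share learns nothing, because each share by itself is uniformly random; only the controlling machine, which knows where all the shares live, remains the single point of failure.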
How it Works
In the early 1990s, computer networks went mainstream. Individual computers became faster and less expensive. The office or university desk had an actual computer on it, instead of the "dumb terminal" of a decade prior. Computer centers had multiple servers, each providing a different function. Processing and storage resources became decentralized throughout the organization.
Although decentralization gave the end-user more immediate power, it also created some serious managerial problems: inefficient resource utilization, inaccessible data, bottlenecks, and single points of failure. To this day, many organizations find themselves plagued by these problems. They find themselves constantly fighting to centralize resources, in what is an innately decentralized system.
Distributed Computing has evolved as an elegant solution to all of these problems. Its beginnings can be traced to the "timesharing" technologies used by mainframes and the early protocols for routing data across a network. Standards for program intercommunication (CORBA, IIOP, RMI) arose, allowing different programs on different machines to work together. These standards enabled one machine to directly exchange information, services and processing power with another machine. This meant that if one machine broke, the necessary resources could be obtained from another. Two or more machines could become a single, more powerful entity.
Two other related technologies also came of age in the late nineties -- parallel computing and peer-to-peer computing (xref w/ch10). In parallel computing a single task is split up such that multiple computers can work on the problem simultaneously. This is actually a very difficult task -- often figuring out how to split up a problem is just as hard as solving the problem itself! See the sidebar for an example. When used in a clustered environment, parallel processing allows maximal use of the processing resources for solving a problem. Otherwise, the problem is handled by a single processor while the others remain idle. In most cases, an application has to be specifically written to take proper advantage of multiple processors.
Take the example of modeling the flow of air over the wing of a plane. In traditional computing you'd use fluid dynamics equations to represent the flow of air and the shape of the wing. The computer would crunch away at these equations. But how do you get two computers to crunch away together? It turns out that if you have enough processors, you can have each processor represent a particle of air. Each computer figures out what its particle will do when it encounters the wing. It also talks to its neighbors to figure out how it will interact with them.
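The wing example can be caricatured in a few lines of Python. The "physics" below is an invented toy rule, and threads stand in for separate computers -- the point is only the structure of parallel computing: each worker advances its own particle independently, then the results are exchanged so neighbors can interact on the next step:

```python
from concurrent.futures import ThreadPoolExecutor


def step(state):
    """Advance one particle; `state` is (position, velocity, neighbor_position)."""
    pos, vel, neighbor = state
    vel += 0.1 * (neighbor - pos)  # hypothetical interaction rule, not real fluid dynamics
    return pos + vel, vel


# one (position, velocity) pair per "processor"
particles = [(float(i), 0.0) for i in range(8)]

with ThreadPoolExecutor(max_workers=8) as pool:
    for _ in range(5):
        # each particle needs to know where its neighbor is...
        neighbors = [particles[(i + 1) % len(particles)][0]
                     for i in range(len(particles))]
        states = [(p, v, n) for (p, v), n in zip(particles, neighbors)]
        # ...then all particles are advanced in parallel
        particles = list(pool.map(step, states))

print(particles[0])
```

Notice that the hard part -- deciding what each worker needs to know about its neighbors, and when to exchange that information -- is exactly the "splitting up the problem" difficulty mentioned above.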
Peer-to-peer computing allows computers to exchange information without the need of a central server. Every peer is both a client and a server. For example, in traditional file exchange one computer posts information to a central repository. When finished, other computers can get the file from the repository. In peer-to-peer, the repository isn't necessary -- other computers simply contact the original file provider directly. Peer-to-Peer file sharing is one of the few true distributed computing applications. While it's primarily a distributed storage application, distributed messaging technology is also used to help organize the network. Furthermore, the search feature built into most file sharing clients is actually a distributed processing application.
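The dual client/server role can be sketched with a toy `Peer` class (the names and file contents below are illustrative). Note how a fetched file automatically becomes servable, so popular files spread across many peers -- the source of the high availability described earlier:

```python
class Peer:
    """Toy peer: a server for its own files and a client for everyone else's."""

    def __init__(self, name, files=None):
        self.name = name
        self.files = dict(files or {})

    def serve(self, filename):
        # server role: answer requests from other peers
        return self.files.get(filename)

    def fetch(self, filename, peers):
        # client role: ask other peers directly, no central repository
        for peer in peers:
            data = peer.serve(filename)
            if data is not None:
                self.files[filename] = data  # fetched copies become servable
                return data
        return None


alice = Peer("alice", {"song.mp3": b"audio bytes"})
bob = Peer("bob")
carol = Peer("carol")

bob.fetch("song.mp3", [alice])  # bob gets the file straight from alice
carol.fetch("song.mp3", [bob])  # carol can now get it from bob instead
print(sorted(carol.files))  # -> ['song.mp3']
```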
The next big distributed technologies are Grid Computing and Distributed Storage. In Grid Computing the resources of thousands of machines are pooled together into a utility grid. When your machine needs more power, it taps into the grid. When it doesn't, its spare processing power is available to the rest of the grid. With Distributed Storage the same concept applies, except the utility resource is storage space rather than processing power.
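The utility-grid idea boils down to a shared work queue that idle machines pull from. Here's a minimal Python sketch, with threads standing in for the participating machines and squaring a number standing in for real work:

```python
import queue
import threading

# minimal sketch of a utility grid: idle machines pull work from a shared queue
tasks = queue.Queue()
results = []
lock = threading.Lock()


def idle_machine():
    """Keep pulling tasks until the grid's queue is empty."""
    while True:
        try:
            n = tasks.get_nowait()
        except queue.Empty:
            return
        with lock:
            results.append(n * n)  # stand-in for a real computation


for n in range(20):
    tasks.put(n)

# four "machines" donate their spare cycles to the grid
workers = [threading.Thread(target=idle_machine) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(sorted(results))
```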
The above information is the start of a chapter in "Network Security Illustrated," published by McGraw-Hill and available from amazon.com as well as your local bookstore. The book goes into much greater depth on this topic.