Gnutella
Gnutella (pronounced with a silent 'g') is a distributed software project to create a true peer-to-peer file sharing network, without a central server.
The first client was developed by Justin Frankel and Tom Pepper of Nullsoft, a division of AOL, in early 2000. On March 14, the program was made available for download on Nullsoft's servers. The event was prematurely announced on Slashdot, and thousands downloaded the program that day. The source code was to be released later, supposedly under the GNU General Public License (GPL).
The next day, AOL withdrew the program over legal concerns and restrained Nullsoft from doing any further work on the project. This did not stop Gnutella; after a few days the protocol had been reverse engineered, and compatible open source clones started showing up. This parallel development of different clients by different groups remains the modus operandi of Gnutella development today.
The Gnutella network was intended as a fully distributed alternative to semi-centralized systems like Napster. The network's initial popularity was spurred by Napster's threatened legal demise in early 2001. This surge in popularity revealed the limits of the initial protocol's scalability. In early 2001, variations of the protocol (implemented first in closed source clients) improved scalability somewhat. Instead of treating every user as both client and server, some users were now treated as "ultrapeers", routing search requests and responses for the users connected to them.
This allowed the network to grow in popularity. In late 2001, the Gnutella client LimeWire, which had driven much of the protocol's development, was released as open source. In February 2002, Morpheus, a commercial file sharing group, abandoned its FastTrack-based peer-to-peer software and released a new client based on the open source Gnutella client Gnucleus.
Sometimes the word 'Gnutella' refers not to a particular project or particular piece of software, but to the open protocol used by various clients. Since new clients are under development in various locations, and since a new protocol is apparently on the way too, it is hard to say what the word 'Gnutella' will mostly stand for in the future.
The name is a play on GNU and Nutella. Supposedly, Frankel and Pepper ate a lot of Nutella while working on the original project, and they were going to release the finished program under the GNU GPL. Gnutella is not associated with the GNU project; see GNUnet for the GNU project's equivalent.
To envision how Gnutella works, imagine a large circle of users (called nodes), each running Gnutella client software. On first use, the client software of a new node (call it A) must bootstrap by finding at least one other node. Different methods have been used for this, including a pre-existing list of possibly working node addresses shipped with the software, GWebCache sites on the web, and IRC. Chances are at least one node (call it B) will work. Once connected, node B sends node A its own list of working nodes. Node A tries to connect to the nodes it was shipped with, as well as to nodes it learns about from other nodes, until it reaches a certain quota, which is usually user-specifiable. It connects to only that many nodes, but it keeps the addresses it has not yet tried and discards those it tried without success.
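The bootstrapping procedure described above can be sketched in Python. This is a minimal illustration, not code from any Gnutella client: the function name `bootstrap`, the `try_connect` callback (which returns a node's list of known peers, or `None` if the node does not answer), and the toy `network` dictionary are all assumptions made for the example.

```python
from collections import deque


def bootstrap(seed_addresses, try_connect, quota=5):
    """Connect to peers until `quota` is reached; return (connected, untried)."""
    candidates = deque(seed_addresses)   # addresses not yet tried
    seen = set(seed_addresses)           # every address ever queued
    connected = []
    while candidates and len(connected) < quota:
        address = candidates.popleft()
        peer_list = try_connect(address)  # None means the node did not answer
        if peer_list is None:
            continue                      # discard nodes that were tried and failed
        connected.append(address)
        for peer in peer_list:            # learn new addresses from the peer
            if peer not in seen:
                seen.add(peer)
                candidates.append(peer)
    return connected, list(candidates)


# Toy network: address -> list of peers it reports, or None if unreachable.
network = {
    "B": ["C", "D"],
    "C": ["B", "E"],
    "D": None,
    "E": ["B"],
    "X": None,
}
connected, backups = bootstrap(["X", "B"], network.get, quota=2)
print(connected, backups)
```

Here the shipped list contains one dead address (`X`) and one working node (`B`). The untried addresses returned as `backups` correspond to the nodes a client keeps in reserve.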
Now, when user A wants to do a search, A's client sends the request to each node it is actively connected to. It is possible that some of them will no longer work, in which case the client tries to connect to the nodes it has saved as backups. The number of actively connected nodes for user A is usually quite small (around 5), so each node then forwards the request to all the nodes it is connected to, and they in turn forward the request, and so on. In theory, the request will eventually find its way to every user on the Gnutella network.
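The forwarding scheme can be illustrated with a small simulation. The sketch below is an assumption-laden model rather than the protocol itself: the network is a plain adjacency dictionary, each node forwards a query only the first time it sees it, and the `ttl` hop limit is an illustrative parameter that the description above does not mention.

```python
from collections import deque


def flood_query(graph, origin, ttl=7):
    """Return the set of nodes a query reaches when flooded from `origin`."""
    reached = {origin}                 # nodes that have already seen the query
    frontier = deque([(origin, ttl)])  # (node, remaining hops)
    while frontier:
        node, hops_left = frontier.popleft()
        if hops_left == 0:
            continue                   # hop limit exhausted, do not forward
        for neighbour in graph[node]:
            if neighbour not in reached:   # drop duplicates of the same query
                reached.add(neighbour)
                frontier.append((neighbour, hops_left - 1))
    return reached


# Hypothetical five-node network used only for this example.
graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
nearby = flood_query(graph, "A", ttl=2)
everyone = flood_query(graph, "A")
print(sorted(nearby), sorted(everyone))
```

With a hop limit of 2 the query from `A` stops short of `E`; with the larger default it reaches the whole toy network.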
If a search request turns up a result, the node that had the result contacts the searcher (whose IP address was included with the search request) directly. They negotiate the file transfer and the transfer proceeds. If more than one copy of the same file is found, the searcher can perform a "swarm" download, that is, download pieces of the file from different nodes. This results in increased download rates.
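One simple way to divide a swarm download is to cut the file into fixed-size pieces and hand them to the available sources in turn. The sketch below shows only that planning step; the function name `plan_swarm`, the round-robin assignment, and the node names are assumptions for illustration, and real clients choose pieces and sources in more elaborate ways.

```python
def plan_swarm(file_size, sources, piece_size):
    """Assign consecutive byte ranges of a file to sources in round-robin order."""
    if file_size < 0 or piece_size <= 0 or not sources:
        raise ValueError("need a non-negative size, a positive piece size and a source")
    plan = []
    for index, start in enumerate(range(0, file_size, piece_size)):
        end = min(start + piece_size, file_size) - 1   # inclusive end offset
        plan.append((sources[index % len(sources)], start, end))
    return plan


# A 1000-byte file found on three hypothetical nodes, fetched in 300-byte pieces.
plan = plan_swarm(1000, ["node1", "node2", "node3"], 300)
for source, start, end in plan:
    print(f"{source}: bytes {start}-{end}")
```

Each tuple gives a source and an inclusive byte range, so the pieces can be requested in parallel and reassembled by offset.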
Finally, when user A disconnects, the client software saves both the list of nodes it was actively connected to and the list it was keeping as backups, for use the next time it connects.
In practice, searching on the Gnutella network is often slow and unreliable. Each node is run by a regular computer user; such users are constantly connecting and disconnecting, so the network is never completely stable. Since individual users' connections are likely to be slow, it can take a very long time for a search request to traverse the entire network (which averages around 100,000 nodes at any time).
The real benefit of having Gnutella so decentralized is that it is very difficult to shut the network down. Unlike Napster, where the entire network relied on a central server, Gnutella cannot be shut down by shutting down any one node. As long as there are at least two users, Gnutella will continue to exist.
The development of the Gnutella protocol is now led by the GDF (Gnutella Developer Forum). Many protocol extensions have been and are being developed by the software vendors and free Gnutella developers of the GDF. These extensions include intelligent query routing, SHA checksums, and parallel downloading in slices (swarming).
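A checksum lets a client confirm that a downloaded file, or several copies offered by different nodes, has the expected content. The sketch below assumes SHA-1 and a hexadecimal digest for simplicity; the helper names are hypothetical and the encoding actually used on the network may differ.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of `data` as a lowercase hex string."""
    return hashlib.sha1(data).hexdigest()


def verify_download(data: bytes, expected_hex: str) -> bool:
    """Check downloaded bytes against the checksum advertised for the file."""
    return sha1_hex(data) == expected_hex.lower()


advertised = sha1_hex(b"abc")               # checksum a source would advertise
print(verify_download(b"abc", advertised))  # an intact copy matches
print(verify_download(b"abd", advertised))  # a corrupted copy does not
```

Because identical files yield identical digests, the same comparison also identifies copies that can safely be combined in a swarmed download.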
There are efforts to finalize these protocol extensions in the Gnutella 0.6 specification at the Gnutella protocol development website. Although the Gnutella 0.4 standard is still the latest protocol specification, since all extensions exist only as proposals so far, it is outdated. In fact, connecting with the 0.4 handshake is difficult or even impossible.
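The difference between the two handshakes can be illustrated by building the opening message of each. The strings below follow the commonly documented forms of the 0.4 and 0.6 handshakes and should be checked against the specifications before being relied on; the helper names and the `ExampleClient/1.0` user agent are assumptions made for this sketch.

```python
def build_connect(version, headers=None):
    """Build the first handshake message for protocol version "0.4" or "0.6"."""
    if version == "0.4":
        return b"GNUTELLA CONNECT/0.4\n\n"          # fixed string, no headers
    if version != "0.6":
        raise ValueError("unsupported version: " + version)
    lines = ["GNUTELLA CONNECT/0.6"]
    lines += [f"{name}: {value}" for name, value in (headers or {}).items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def accepted(reply: bytes) -> bool:
    """Return True if the first line of a reply signals acceptance."""
    first_line = reply.split(b"\n", 1)[0].strip()
    return first_line == b"GNUTELLA OK" or first_line.startswith(b"GNUTELLA/0.6 200")


old_style = build_connect("0.4")
new_style = build_connect("0.6", {"User-Agent": "ExampleClient/1.0"})
print(old_style)
print(new_style)
```

The 0.6 form carries HTTP-style headers, which gives clients a place to announce optional extensions, whereas the 0.4 form has no such room.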
In January 2003, Shareaza announced the Gnutella2 protocol, which uses the UDP network protocol for searches rather than TCP, has an extensible binary XML-like packet format, and includes many of the aforementioned extensions. Shareaza released the draft specification on March 26, 2003. Gnutella2 (G2) is not supported by the rest of the "old" Gnutella network – except for Gnucleus – and the developers of Gnutella refer to it as "MP" (Mike's Protocol) because it was not approved by the GDF.
Some popular Gnutella clients are:
History
How it works
Protocol features and extensions
Gnutella operates on a query flooding protocol. The outdated Gnutella version 0.4 network protocol employs five different packet types, namely ping, pong, query, query hit, and push.
These are mainly concerned with searching the Gnutella network. File transfers are handled using HTTP.
Clients
See also
External links
Papers on Gnutella and Filesharing