[home: http://monkeyfist.com]
essays · argument · politics · technology · culture

P2P and the Web

Monday, 14 May 2001



Introduction

Some say that P2P is the future of the Internet, others that it's just a passing fad. In this article I will examine the relationship between P2P and the Web, and how both technologies might benefit each other.

What is P2P?

Before going further, it'll help to define what I'm talking about when I use the phrase "P2P". Several attempts have been made at drawing the boundary lines as to what constitutes a P2P application. One thing that has become immediately clear is that a strict technical definition is not appropriate -- most P2P applications currently make use of a central server at some point.

Having earnestly studied the marketing materials for many applications, I've come to the opinion that the best definition of P2P is: "P2P is whoever says they're P2P." And it's quite likely that before that they were B2C, B2B and A2A!

Seriously though, there's no doubt that there is a phenomenon known as P2P, driven by the need to communicate and collaborate, and with a strong element of decentralized control. It's this that I take as my starting point for P2P, and I doubt whether at this stage any stronger definition is viable.

The purpose of this article is to examine this brave new world of user-level network applications, and compare it with the most popular user-level network application to date, the Web.

The Early Web

Some of the defining characteristics of the early Web shed light on both why it was popular, and also on why P2P applications are achieving popularity. The early Web was easy: for the initial audience it was easy to read, easy to write and easy to implement. The ready availability of clients for multiple platforms, along with the "View Source" ethic, encouraged the rapid take-up of the new technology. Furthermore, the Web was a lot of fun! People revelled in the extra ability to express themselves and share information.

Think here about the ease-of-use of instant messaging, or the instant satisfaction factor of Napster. There's a short and direct feedback loop, based on interaction with our fellow network users.

As the Web Grew

As the Web grew in popularity and adoption, its characteristics necessarily changed. It ran into several problems connected with the increase in the number of users and the level of expertise of those new users. Web-based collaboration became less usable, network congestion became an issue, and media outlets talked of the "World Wide Wait".

Whereas in the first instance you might well have classified the Web as peer-to-peer in the sense that nearly everyone who wielded a browser also published their own web pages, that ratio changed too. Less expert users, on sporadically-connected client-only platforms (mostly personal computers), grew to be the majority. For those users there was less control, less immediacy and little of the positive feedback that Web early adopters drew. Sure, the Web was still great with its large and diverse content, but the excitement just wasn't the same.

One major source of disappointment for web users is the ineffectiveness of search. The web's simple-minded and free-spirited approach to markup was simultaneously the key to widespread adoption and the death knell for structured searching. Even the most advanced search engine technology of today is remarkably basic, and the user still has to do a lot of the walking. Two factors contribute to this information noise: poor information quality, due to the lack of any reputation management, and poor page classification.

So, with all these faults, is the Web in decline? No way. The Web has established certain baselines that benefit everyone and that ensure its place as a foundation of Internet information systems. These can be divided into technical factors and human factors. Technical factors include:

  • Universality. The ubiquity of HTML and HTTP in implementation ensures that Web technology is a basic level of communication that practically all platforms support -- even some microwave ovens!
  • Internationalization. Although still immature, the Web infrastructure is designed to support international content on a platform-independent basis. Standards like XML and Unicode are encouraging this, and browser technology has increasingly good internationalization support.
  • Openness. The Web is based on open standards, with an increasing commitment to those standards from vendors. This promotes participation from software creators as well as the benefits of open scrutiny for the protocols and technologies.
  • Integration. Largely due to the ubiquity of Web software components, we're now seeing the Web integrated into many other applications. A basic example is the delivery of software updates over the web. Further on, most if not all modern desktop environments feature Web integration.


More human factors include:

  • Content. The breadth, depth and extraordinary diversity of content on the Web means it is an important cornerstone of information systems. Anything that might replace it must exceed the Web's usefulness and challenge for the accessibility crown. Notwithstanding my comments about web authors being proportionally on the decline, it's still sufficiently easy to publish on the Web that its primacy for content is not yet challenged.
  • Understanding. It's important that users of a system understand how it works, and what its limitations are. Awareness of what the Web is, and what it is not, is increasing, albeit slowly. URLs are sufficiently commonplace that most people have a basic understanding of the technology.


So What is the Web Anyway?

When defining the web, most people would probably still say "Well, it's HTTP and HTML." This may have been true originally, but the philosophy of the Web is a lot broader. As seen by Tim Berners-Lee, the Web is in fact constituted of anything that can be named with a uniform resource identifier, URI. URIs include URLs as a subset, but also include those things which have names but cannot be retrieved directly via a protocol that forms part of that name. The other foundation stone of the Web is of course internationalization -- whose end, at least in the W3C's eyes, is Unicode everywhere.

What does this expansion of the concept of the Web mean? Anything with a URI is part of the Web. If we look at things directly accessible, this might include:

  • USENET (news:)
  • Pages on a web server (http:)
  • Telephones (tel:)
  • Email (mailto:)
  • Freenet (freenet:)


What you can notice here immediately is that some P2P applications, given a URI naming scheme, are automatically part of the Web's structure. It would also be quite simple to create URIs for instant messaging, for instance. Ideally, we want P2P protocols to have some of the Web's other characteristics as mentioned above, as well as just a URI scheme.
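
As a rough illustration of how little it takes for a service to be nameable in this way, the following Python sketch groups a handful of identifiers by their scheme. The addresses are invented for the example, and the `im:` scheme is a hypothetical stand-in for an instant-messaging scheme, not something taken from any of the applications discussed here.

```python
from urllib.parse import urlsplit

# Example identifiers; the addresses themselves are made up for illustration.
EXAMPLES = [
    "http://example.org/essays/p2p.html",
    "mailto:someone@example.org",
    "news:comp.infosystems.www.misc",
    "tel:+1-555-0100",
    "im:someone@example.org",  # hypothetical instant-messaging scheme
]


def scheme_of(uri):
    """Return the URI scheme, the part before the first colon."""
    return urlsplit(uri).scheme


def group_by_scheme(uris):
    """Map each scheme to the list of URIs that use it."""
    groups = {}
    for uri in uris:
        groups.setdefault(scheme_of(uri), []).append(uri)
    return groups


if __name__ == "__main__":
    for scheme, uris in group_by_scheme(EXAMPLES).items():
        print(scheme, uris)
```

A generic parser needs no knowledge of the service behind a scheme in order to name and sort its resources; only retrieval requires a handler for that scheme.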

P2P Applications Scratch Itches

Now we've established where the Web is, let's look again at some of its deficiencies. It's my opinion that several of the popular P2P trends address direct lacks in current web technology. These attributes of P2P apps include:

  • Ease of "publication". With file sharing clients, publication is as simple as dragging and dropping. While some web programs offer this, it either costs money or results in unfortunate lock-in to certain systems. The fact is that web publication is still harder than it should be, perhaps primarily due to configuration issues.
  • Everyone's a server. Supported by the increasing availability of always-on connectivity, P2P's everyone-a-server mentality gets back to how things were before the dialup explosion. It means, in effect, that sharing and publication involve less configuration and fewer inconvenient upload/synchronization issues.
  • Better user interfaces. Perhaps this one's the killer point. Though there is definitely room for improvement in P2P app UIs, dedicated interfaces that are suited for the task make life a lot easier than contorted web-forms interfaces. Many web-based communication and collaboration solutions lose heavily because of user interface issues.
  • Controlled communities. Dedicated P2P apps create communities with a certain amount of control, which is largely due to the use of a custom UI. This makes it a lot easier to introduce more structure either transparently, or with minimum fuss. For example, benefits can be drawn in the metadata (and thus searching) arena, and cryptographic transport (e.g. Groove). These are areas where on the Web it's difficult to persuade users their extra effort is worth it.


P2P Needs to be Plugged in to the Web

The itches that P2P scratches are all well and good, but what's the use of systems that aren't integrated into the ubiquitous protocols and software systems that we all use today? In today's environment, no technology can afford to stay an island. Remember "push" technology? Rather than reinventing from the ground up, P2P applications need to open up and embrace the open platform of the Web.

One useful integration technique is proxying. An HTTP gateway to another service can be run locally on a machine, providing an HTML/HTTP-based user interface to that service. This technique has been used for Freenet, and for a telephony interface. There's even a project for Mozilla, called Protozilla, to enable Mozilla to natively understand new URI schemes.
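
To make the proxying idea concrete, here is a minimal Python sketch of a local gateway. It is not the Freenet gateway or any real product: `FAKE_NETWORK` and `fetch_from_network` are hypothetical stand-ins for whatever protocol the other service speaks, and the port number is an arbitrary assumption.

```python
from html import escape
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import unquote

# Stand-in for the non-HTTP service. A real gateway would speak that
# service's own protocol here. Keys and contents are invented.
FAKE_NETWORK = {"essays/p2p": "P2P and the Web"}


def fetch_from_network(key):
    """Hypothetical lookup in the other service; returns None if not found."""
    return FAKE_NETWORK.get(key)


def render_page(key):
    """Return (status, html) for a key requested through the gateway."""
    content = fetch_from_network(key)
    if content is None:
        return 404, "<html><body><p>Not found: %s</p></body></html>" % escape(key)
    return 200, "<html><body><p>%s</p></body></html>" % escape(content)


class GatewayHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Map the URL path (minus the leading slash) to a key in the other
        # service. Query strings are not handled in this sketch.
        status, html = render_page(unquote(self.path.lstrip("/")))
        body = html.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8888):
    """Run the gateway on the loopback interface until interrupted."""
    HTTPServer(("127.0.0.1", port), GatewayHandler).serve_forever()
```

With `serve()` running, an ordinary browser pointed at `http://127.0.0.1:8888/essays/p2p` would see the other service's content as a plain web page. Binding to the loopback address keeps the gateway private to the machine it runs on.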

Many P2P apps already use the Web as a bootstrapping point, underlining that integration is desirable. The alternative -- fragmentation -- is in the long term in neither the vendors' nor the users' interest.

Scratching the Web's Itches

Proxying is just the first step. Web technologies now in rapid development will help bring the missing pieces from P2P to the Web. Let's examine some of these.

  • User Interface. It's long been recognized that the Web's UI features are woeful. The W3C's XForms activity is working on creating a new set of widgets and a new browser/server interaction model. These should go a long way to making the Web a more comfortable interface for performing interactive tasks.
  • Ease of publishing. Although it's been around a while, WebDAV (HTTP extensions to support content authoring and management) is still in the startup stage as a technology, though it's great to see it in the latest versions of Microsoft operating systems and in GNOME's Nautilus. Where it's really lacking is on the ISP side, though folk like FreeDrive, DriveSpace and XDrive are providing WebDAV access. Such access needs to get nearer to the user. When users sign up with an ISP they need to get a folder on their desktop, which is their web site via WebDAV.
  • Shared computation. This is an area in which web technologies are making leaps and bounds of late. Both Sun's and Microsoft's future strategies appear to depend on distributed computing over the Web, using HTTP and XML as the foundation. RPC/messaging focused technologies such as SOAP, WSDL, UDDI and friends all provide an underpinning for web-distributed computation. Other, more minority interest, techniques like shared data spaces can also underpin web-distributed computation. Technologies such as RDF, essentially an interchange format for knowledge, are important in this area.
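
On the ease-of-publishing point, the "folder on the desktop" boils down to a small number of HTTP requests. The sketch below, in Python, shows the two most basic ones under stated assumptions: the function names are mine, the host passed to `open_site` is hypothetical, and authentication, locking and error handling are all left out.

```python
import http.client


def open_site(host):
    """Prepare a connection to a (hypothetical) WebDAV-enabled host."""
    return http.client.HTTPSConnection(host)


def publish_file(conn, remote_path, data, content_type="text/html"):
    """Upload data to remote_path with an HTTP PUT; return the status code."""
    conn.request("PUT", remote_path, body=data,
                 headers={"Content-Type": content_type})
    response = conn.getresponse()
    response.read()  # drain the body so the connection can be reused
    return response.status


def make_folder(conn, remote_path):
    """Create a collection (folder) with WebDAV's MKCOL; return the status code."""
    conn.request("MKCOL", remote_path)
    response = conn.getresponse()
    response.read()
    return response.status
```

A desktop folder backed by WebDAV would issue requests of this kind whenever the user saves a file or creates a subfolder, which is what would make publishing feel like ordinary file management.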


To me, one of the most important things about P2P that I want to see brought back into the Web is giving the user back decent access to content creation facilities. Browser vendors have consistently failed to fulfill the original vision of the writeable web, and the flow of hypertext creation has become increasingly one-way. We've seen web applications like Blogger and EditThisPage.com become successful in filling this need, but there's no obvious business model there, and many of these companies are struggling. The authoring, publishing and interaction capability simply has to make its way back into the user agent. (There's an argument to be made that the Internet Service Provider should provide these types of services.) Those of us who determinedly use the Web as a research notebook, as a means of keeping in touch with family and friends, and as a business tool know the value of being able to create hypertext -- but we know how hard it is too.

While the technologies I've mentioned above are to a certain extent works-in-progress, there are also aspects of the Web that are more established, which P2P itself will benefit from building on.

  • Data interoperability. The rise of interoperable data formats, due to the influence of XML, enables applications and developers to "do more" with data. We've seen some creative uses of other applications' data due to open file formats: these things often "just happen." Projects like Jabber illustrate the potential of opening up applications and protocols like this.
  • Decentralization. Decentralization is picked out as one of the strong features of P2P, and the Web has always been a decentralized information resource. So there's already a philosophical compatibility between the Web and P2P, meaning that such things as the adoption of URI addressing schemes should be natural. It's hard to think of a good reason not to try and integrate P2P architectures with the Web.


Some P2P-ish Ideas for the Web

One advantage P2P applications have had is the opportunity to make a fresh start with several ideas unencumbered by web history. While this is both good and bad, there are some ideas from P2P apps that I think the Web would definitely benefit from.

The first of these that attracted me was search by example, as employed by OpenCOLA. The idea here is that a simple way of searching for quite a complex set of metadata matches is to throw out an example. This is a lot simpler for the user than defining separately each metadata facet and the relationships between them. As the Web heads towards exposing a larger amount of metadata, useful ways will be required to sift through this extra information. We already know that users of search engines only want one text-box and that's it -- how are they going to cope with the many dimensions of investigation created by decent web metadata? The search by example technique therefore seems a promising way to progress, although it will require some thought in order to implement.
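
A toy version of search by example can be sketched in a few lines of Python. This is not OpenCOLA's method, which is not described here; it is a deliberately naive stand-in that scores each document by how many metadata facets it shares with the example. The document ids and facet names are invented.

```python
def similarity(example, candidate):
    """Count the metadata facets on which candidate agrees with example."""
    return sum(1 for facet, value in example.items()
               if candidate.get(facet) == value)


def search_by_example(example, documents, limit=3):
    """Return up to `limit` document ids, best match first, skipping zero scores."""
    scored = [(similarity(example, meta), doc_id)
              for doc_id, meta in documents.items()]
    scored = [(score, doc_id) for score, doc_id in scored if score > 0]
    # Highest score first; ties are broken alphabetically by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:limit]]


# Invented metadata for four documents.
DOCUMENTS = {
    "doc-a": {"topic": "p2p", "format": "essay", "language": "en"},
    "doc-b": {"topic": "p2p", "format": "slides", "language": "en"},
    "doc-c": {"topic": "cooking", "format": "essay", "language": "fr"},
    "doc-d": {"topic": "gardening", "format": "video", "language": "de"},
}

if __name__ == "__main__":
    liked = {"topic": "p2p", "format": "essay", "language": "en"}
    print(search_by_example(liked, DOCUMENTS))
```

The user never names a facet or a relationship between facets; handing over one document they like is the whole query, which keeps the interface as simple as the single text box.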

The second aspect of P2P applications that the Web could benefit from is a notification API. One of the most awkward things about the web is the need to continually check pages to see if they've changed. There's no general publish-subscribe system to notify me of updates, or of the meeting of particular search criteria. Particular applications do implement this in a proprietary way -- but there is no standard piece of web infrastructure to implement this. Perhaps this is another area in which it's hard to find a business model. This is a problem that needs to be solved by both web protocol engineers and user agent creators. WebDAV may be one way in which this problem can be attacked, especially if client-side servers were in use. The ICE protocol has already implemented notifications, but it has a specialized application area.
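
The shape of such a publish-subscribe facility can be sketched in Python as follows. It is an in-memory illustration only: `NotificationHub` and its methods are hypothetical names, no network transport is involved, and it does not reflect how ICE or any WebDAV extension works.

```python
from collections import defaultdict


class NotificationHub:
    """In-memory publish-subscribe registry keyed by URI."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, uri, callback):
        """Ask for callback(uri, summary) whenever uri is reported changed."""
        self._subscribers[uri].append(callback)

    def publish(self, uri, summary):
        """Report a change; return how many subscribers were notified."""
        callbacks = list(self._subscribers.get(uri, []))
        for callback in callbacks:
            callback(uri, summary)
        return len(callbacks)


if __name__ == "__main__":
    hub = NotificationHub()
    hub.subscribe("http://example.org/news.html",
                  lambda uri, summary: print("changed:", uri, summary))
    hub.publish("http://example.org/news.html", "new article posted")
```

The point of the sketch is the inversion of control: the reader registers interest once and is told about changes, instead of repeatedly fetching the page to find out whether anything happened.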

Conclusions

Peer-to-peer applications, free from legacy issues, have started to fill needs where the web browser and web-based applications are failing. In particular, communication and collaboration are needs which the Web isn't meeting properly. Ease of publication, configuration and search are advantages of P2P applications. However, the Web's ubiquity and openness contrast with the fragmented, control-seeking nature of many P2P networks. If P2P-type apps aren't to die from fragmentation, or to collapse into corporation-controlled monopoly, then they need to embrace the technologies of the Web. The Web already has many technologies that are a suitable substrate for P2P applications, so there's no need to start from scratch. On the other hand, Web software vendors -- and Internet service providers -- need to pay more attention to the desires of the user, and enable content creation, collaboration and searching to reach new levels of usability. "Just good enough" won't always be good enough.


· More by Edd Dumbill