The client software of the early volunteer computing projects consisted of a single program that combined the scientific computation and the distributed computing infrastructure. This monolithic architecture was inflexible; for example, it was difficult to deploy new application versions.
More recently, volunteer computing has moved to middleware systems that provide a distributed computing infrastructure independently of the scientific computation. Examples include:
* The Berkeley Open Infrastructure for Network Computing (BOINC). BOINC is the most widely used middleware system. It is open source (LGPL) and is developed by an NSF-funded research project located at the UC Berkeley Space Sciences Laboratory. It offers client software for Windows, Mac OS X, Linux, and other Unix variants.
* XtremWeb is used primarily as a research tool. It is developed by a group based at the University of Paris-South.
* Xgrid is developed by Apple. Its client and server components run only on Mac OS X.
* Grid MP is a commercial middleware platform developed by United Devices and has been used in volunteer computing projects including grid.org, World Community Grid, Cell Computing, and Hikari Grid.
Most of these systems have the same basic structure: a client program runs on the volunteer's computer. It periodically contacts project-operated servers over the Internet, requesting jobs and reporting the results of completed jobs. This "pull" model is necessary because many volunteer computers are behind firewalls that don't allow incoming connections. The system keeps track of each user's "credit", a numerical measure of how much work that user's computers have done for the project.
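The pull model described above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions, not the protocol of any particular middleware: `ProjectServer`, `client_poll_once`, the job format, and the credit rule (one unit per input number) are all hypothetical names and choices made for illustration, and a real client would talk to the server over the network and wait between contacts.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ProjectServer:
    """In-memory stand-in for a project-operated server (hypothetical)."""
    pending: deque = field(default_factory=deque)  # jobs waiting to be handed out
    results: dict = field(default_factory=dict)    # job id -> reported result
    credit: dict = field(default_factory=dict)     # user -> accumulated credit

    def request_job(self):
        # The server only answers requests; it never opens a connection to a client.
        return self.pending.popleft() if self.pending else None

    def report_result(self, user, job_id, result, work_units):
        self.results[job_id] = result
        # Assumed credit rule: credit grows with the amount of work reported.
        self.credit[user] = self.credit.get(user, 0) + work_units


def client_poll_once(server, user):
    """One client-initiated contact: fetch a job, compute it, report the result."""
    job = server.request_job()
    if job is None:
        return False  # nothing to do; a real client would sleep and retry later
    job_id, numbers = job
    result = sum(n * n for n in numbers)  # placeholder for the scientific computation
    server.report_result(user, job_id, result, work_units=len(numbers))
    return True


server = ProjectServer(pending=deque([(1, [1, 2, 3]), (2, [4, 5])]))
while client_poll_once(server, "alice"):
    pass
```

Every exchange here starts on the client side, which is why the scheme still works when the volunteer's firewall blocks incoming connections.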
Volunteer computing systems must deal with several problematic aspects of the volunteered computers: their heterogeneity, their churn (that is, the arrival and departure of hosts), their sporadic availability, and the need to not interfere with their performance during regular use.
In addition, volunteer computing systems must deal with several related correctness problems:
* Volunteers are unaccountable and essentially anonymous.
* Some volunteer computers (especially those that are overclocked) occasionally malfunction and return incorrect results.
* Some volunteers intentionally return incorrect results or claim excessive credit for results.
One common approach to these problems is "replicated computing", in which each job is performed on at least two computers. The results are accepted, and the corresponding credit granted, only if they agree sufficiently.
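A minimal sketch of how such a replication check might look is given below. It makes several assumptions that the text does not specify: results are single floating-point numbers, "agree sufficiently" means equal within a relative tolerance, and every agreeing host is granted the smallest credit claimed among them (one plausible way to blunt excessive claims). The names `validate_replicas` and `grant_credit` are hypothetical.

```python
import math


def validate_replicas(results, min_quorum=2, rel_tol=1e-6):
    """Return (canonical_result, agreeing_hosts), or (None, []) if no quorum agrees.

    results: dict mapping host id -> numeric result reported for one job.
    """
    hosts = list(results)
    for candidate in hosts:
        # Collect every host whose result matches this candidate within tolerance.
        agreeing = [
            h for h in hosts
            if math.isclose(results[h], results[candidate], rel_tol=rel_tol)
        ]
        if len(agreeing) >= min_quorum:
            return results[candidate], agreeing
    return None, []


def grant_credit(credit, agreeing_hosts, claimed):
    """Assumed policy: each agreeing host gets the smallest credit claimed among them."""
    if not agreeing_hosts:
        return
    granted = min(claimed[h] for h in agreeing_hosts)
    for h in agreeing_hosts:
        credit[h] = credit.get(h, 0) + granted


# Two hosts agree; a third returns a wrong result and claims inflated credit.
results = {"host_a": 3.14159265, "host_b": 3.14159266, "host_c": 2.71828}
claimed = {"host_a": 10.0, "host_b": 12.0, "host_c": 500.0}

canonical, agreeing = validate_replicas(results)
credit = {}
grant_credit(credit, agreeing, claimed)
```

In this example the outlier neither influences the accepted result nor receives any credit, and its inflated claim is ignored because it is not part of the agreeing set.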