edits

2024-01-11 17:25:36 -08:00 · 2024-01-11 17:25:36 -08:00 · 3a51b1e583
parent 48e7f33583
commit 3a51b1e583
6 changed files with 193 additions and 186 deletions
--- a/BOINC-overview.md
+++ b/BOINC-overview.md
@ -0,0 +1,128 @@
+BOINC is a platform for distributed computing.
+It is designed to support 'high throughput computing':
+with large numbers of independent compute-intensive jobs,
+and the performance goal of a high rate of job completion
+rather than low turnaround time of individual jobs.
+It also has features to support [distributed data storage](VolunteerStorage)
+and [distributed parallel computing](Sporadic-applications).
+
+BOINC has a client/server architecture:
+the 'server' distributes jobs,
+while the 'client' runs on worker nodes and executes jobs.
+BOINC can use worker nodes that are:
+
+* Heterogeneous: they have different processor and GPU types
+and different operating systems (Windows, Mac OS, Linux, Android).
+
+* Sporadically available.
+
+* Untrusted: they may return incorrect computational results.
+
+* Large scale: millions or more worker nodes.
+
+Hence BOINC is well-suited to [volunteer computing](VolunteerComputing)
+in which the computing resources are consumer devices
+(desktop and laptop computers, tablets, phones,
+game consoles, appliances) volunteered by their owners.
+
+It can also be used with
+[organizational desktop resources](DesktopGrid)
+(the PCs in a company or university)
+or with data-center resources (clusters or clouds),
+or with any combination of resources.
+
+BOINC can run most existing HTC applications with minor modifications,
+including those that use GPUs and/or multiple CPU cores.
+It can use virtual machines to run unmodified Linux applications
+on Windows and Mac worker nodes.
+It efficiently supports applications that use large data files,
+or that require large amounts of memory.
+
+BOINC can be used as a back end for existing
+job-submission systems such as HTCondor; details are [here](GridIntegration).
+
+BOINC is distributed under the LGPL v3 open-source license.
+It can be used for any purpose (academic, commercial, or private)
+and can be used with applications that are not open-source.
+
+## Cost comparison
+
+BOINC was created to provide scientists with large computing power
+at a small cost.
+One study found the following costs for a particular workload:
+
+### **Use Amazon's Elastic Computing Cloud: $175 Million**
+
+### **Build a cluster: $12.4 Million**
+This includes power and air-conditioning infrastructure, network hardware, computing hardware, storage, electricity, and sysadmin personnel.
+
+### **Use BOINC: $125,000**
+Based on the average throughput and budget of several
+volunteer computing projects.
+
+It takes (very roughly) three man-months to create a volunteer computing
+project using BOINC:
+one month of an experienced sys admin, one month of a programmer,
+and one month of a web developer.
+Once the project is running,
+budget a 50% FTE (mostly system admin) to maintain it.
+In terms of hardware, you'll need a mid-range server computer and a fast connection to the commercial Internet.
+
+## Organizational options
+
+The volunteer computing projects using BOINC vary in terms of their
+organizational structure and the set of scientists they serve.
+Examples include:
+
+* Research group.
+ The project is operated by a single research group,
+ and serves the members of that group.
+ Examples include SETI@home, Rosetta@home, and Einstein@home.
+
+* Application-centered research community.
+ The project is operated by a single research group,
+ but serves a broader community in that science area.
+ Example: Climateprediction.net,
+ which is based at Oxford but provides computing to
+ researchers at other institutions.
+
+* Science Gateway.
+ The project is operated by a **science gateway**,
+ i.e. a web site that serves a particular scientific community,
+ and that provides HTC as well as other functions.
+ An example is nanoHUB.
+
+* Institutional umbrella project.
+ The project is operated by an organization (university or research lab),
+ and serves the researchers in that organization.
+ For example, LHC@home serves multiple groups at CERN.
+ An academic example (no longer operating)
+ is the University of Westminster in London.
+ This idea is elaborated on [here](VirtualCampusSupercomputerCenter).
+
+* HPC provider.
+ The project is operated by an HPC provider such as a supercomputing center.
+ It processes the provider's HTC jobs
+ (i.e. the jobs that don't actually need a supercomputer),
+ and serves the provider's clients that have HTC workloads.
+ An example is Texas Advanced Computing Center (TACC).
+
+There are advantages in having BOINC projects that are high
+in the organizational hierarchy, and that serve many scientists:
+
+* The cost of maintaining a BOINC project is roughly constant,
+  regardless of its size.
+  For large projects, the cost per scientist is lower.
+
+* Publicity options: high-level organizational entities typically have
+  existing publicity mechanisms (e.g. alumni magazines, newsletters, etc.)
+  that can be leveraged to recruit volunteers.
+
+* Longevity: the duration of one scientist's need for HTC is generally shorter
+  than that of a group of scientists.
+  There are benefits in having a project last a long time
+  (e.g. amortizing the startup cost).
+
+* Continuity: similarly, one scientist's computing workload may
+  be sporadic, while that of a group of scientists is more continuous.
+  Some volunteers prefer projects with continuous workloads.
--- a/BOINC-projects.md
+++ b/BOINC-projects.md
@ -1,10 +1,9 @@
-# BOINC projects
-
-A **BOINC project** is a server that distributes jobs.
-Each project has a [master URL](ServerComponents#ThemasterURL), which
-
-* identifies a web site that describes the project and shows its status.
-* identifies servers that distribute jobs and collect results.
+A 'BOINC project' is essentially a server that distributes jobs.
+Each project has a [master URL](ServerComponents#ThemasterURL),
+which exports RPCs directing the BOINC client
+to servers that distribute jobs and files.
+The master URL can also provide
+a public web site that describes the project and shows its status.

 Volunteers can create "accounts" on projects.
 The BOINC client (which runs on worker nodes)
@ -12,16 +11,10 @@ can be "attached" to accounts on any number of projects.

 <img src=https://github.com/BOINC/boinc/blob/master/doc/attach.jpg width=600>

-Projects are independent; each one has its own applications,
-databases and servers,
-and is not affected by other projects.
+Projects are independent; each one has its own applications, accounts,
+databases and servers.

-The BOINC project itself operates a web site, https://boinc.berkeley.edu.
-The BOINC client periodically contacts this server to obtain
-* A list of approved projects
-* News of updates to the client software.
-
-Creating projects is relatively easy.
+Creating a project is relatively easy.
 An organization can create multiple projects,
 e.g. for testing new applications.
 A project can run entirely on a single computer
@ -29,9 +22,27 @@ A project can run entirely on a single computer
 A project can also be spread across multiple computers,
 so that it can handle large numbers of attached clients.

+## The role of UC Berkeley
+
+BOINC itself is based at UC Berkeley.
+Projects can ask to be 'vetted' by BOINC.
+BOINC operates a server at https://boinc.berkeley.edu.
+This has several functions:
+
+* It provides a web site explaining what BOINC is,
+and showing a list of vetted projects.
+* It provides downloads of the BOINC client for all supported platforms.
+These installers are 'signed' by UC Berkeley.
+* It exports the list of vetted projects.
+The BOINC client periodically fetches this list
+and uses it in the 'add projects' GUI dialog.
+* It exports a list of the current client versions.
+This is used by the BOINC client to notify volunteers
+when a new version is available.
+
 ## Account managers and Science United

-The original thinking was that there would be many projects,
+The original assumption was that there would be many projects,
 competing for computing power (i.e. volunteers) by generating
 mass-media publicity and creating compelling web sites.
 In practice, the need to attract volunteers has been a major
--- a/BoincOverview.md
+++ b/BoincOverview.md
@ -1,157 +0,0 @@
-# BOINC Overview
-
-BOINC is a platform for distributed **high throughput computing**,
-i.e. large numbers of independent compute-intensive jobs,
-where there performance goal is high rate of job completion
-rather than low turnaround time of individual jobs.
-It also offers mechanisms for distributed data storage.
-
-BOINC has a client/server architecture:
-the **server** distributes jobs,
-while the **client** runs on worker nodes, which execute jobs.
-
-BOINC can be used in two ways:
-
-* In [volunteer computing](VolunteerComputing),
-  the worker nodes are consumer devices (desktop and laptop computers,
-  tablets, smartphones) volunteered by their owners.
-  BOINC [addresses the various challenges](BoincIntro) inherent in this environment
-  (heterogeneity, host churn and unreliability, scale, security, and so on).
- There are a number of volunteer-computing **BOINC projects**
- such as Einstein@home, LHC@home, World Community Grid, and so on.
- The BOINC client can be "attached" to one or many of these;
- it processes jobs for the projects to which it is attached.
-
-* BOINC can also be used for [in-house computing](DesktopGrid) within an organization (e.g. a company).
-  In this case case the worker nodes are
-  cluster nodes or other organizational computers,
-  and they are attached only to the organization's BOINC server.
-
-BOINC can run all existing HTC applications,
-including those that use GPUs and/or multiple CPU cores.
-It can use virtual machines to run existing Linux applications on Windows and Mac worker nodes.
-
-BOINC provides mechanisms for job submission and control, designed for performance at scale.
-However, it can also be used as a back end for existing
-job-submission systems such as HTCondor; details are [here](GridIntegration).
-
-BOINC is distributed under the LGPL v3 open-source license.
-It can be used for any purpose (academic, commercial, or private)
-and can be used with applications that are not open-source.
-
-## Cost comparison
-
-BOINC was created to provide scientists with large computing power at a small cost.
-Suppose you need, say, 100 TeraFLOPS for 1 year.
-Here are some ways you can get it:
-
-### **Use Amazon's Elastic Computing Cloud: $175 Million**
-Based on $0.10 per node/hour.
-### **Build a cluster: $12.4 Million**
-This includes power and air-conditioning infrastructure, network hardware, computing hardware, storage, electricity, and sysadmin personnel.
-### **Use BOINC: $125,000**
-Based on the average throughput and budget of the 6 largest volunteer computing projects.
-
-It takes (very roughly) three man-months to create a BOINC project:
-one month of an experienced sys admin, one month of a programmer, and one month of a web developer.
-Once the project is running, budget a 50% FTE (mostly system admin) to maintain it.
-In terms of hardware, you'll need a mid-range server computer and a fast connection to the commercial Internet.
-
-## Getting started
-
-To compute using BOINC, you'll need to set up a BOINC server
-and configure your applications to run under BOINC.
-Technical documentation is [here](Home).
-
-If you're doing in-house computing,
-install the BOINC client on your worker nodes, and you're done.
-This is detailed [here](DesktopGrid).
-
-In the volunteer computing case, you'll need to get clients to attach to your server.
-There are several ways to do this:
-
-* Create a public-facing web site for your project.
- Announce it and publicize it using whatever channels are available to you:
- mass media, social media, newletters, paid advertising, etc.
-
-* [Contact us](ProjectPeople) and ask to have your project listed by BOINC.
- You'll be asked to demonstrate that a) your project is doing
- what you claim it is, and b) you're following a set of security practices.
- Your project will then a) be announced on the BOINC web site news column,
- b) be listed on the BOINC web site, and
- c) appear in the list of projects shown in the BOINC client GUI.
-
-* [Contact us](ProjectPeople) and ask to have your project
- included in [Science United](https://scienceunited.org),
- a framework in which volunteers sign up for science areas instead of projects.
- You'll need to tell us what types of research your project is doing,
- and then you'll automatically get computing power from volunteers
- who have registered an interest in those areas.
- This has the advantage that you don't have to create a public-facing web site or do any publicity.
- In addition, you can ask to be included in Science United even before you've created your project.
- At that point we can tell you roughly how much computer power you'll get,
- and you can decide whether this justifies the investment in creating a project.
-
-These approaches are not mutually exclusive; you can do any or all of them.
-
-## Organizational options
-
-The volunteer computing projects using BOINC vary in terms of their
-organizational structure and the set of scientists they serve.
-Examples include:
-
-* Research group.
- The project is operated by a single research group,
- and serves the members of that group.
- Examples include SETI@home, Rosetta@home, and Einstein@home.
-
-* Application-centered research community.
- The project is operated by a single research group,
- but serves a broader community in that science area.
- Examples: Climateprediction.net,
- which is based at Oxford but collaborates with
- projects around the world.
- Mindmodeling.org serves researchers from about 20 universities who use the same application (the ACT-R cognitive modeling system).
-
-* Science Gateway.
- The project is operated by a **science gateway**,
- i.e. a web site that serves a particular scientific community,
- and that provides HTC as well as other functions.
- An example is nanoHUB.
-
-* Institutional umbrella project.
- The project is operated by an organization (university or research lab),
- and serves the researchers in that organization.
- For example, LHC@home servers multiple groups at CERN.
- An academic example (no longer operating) is the University of Westminster in London.
- This idea is elaborated on [here](VirtualCampusSupercomputerCenter).
-
-* HPC provider.
- The project is operated by an HPC provider such as a supercomputing center.
- It processes the provider's HTC jobs
- (i.e. the jobs that don't actually need a supercomputer),
- and serves the provider's clients that have HTC workloads.
- An example is Texas Advanced Computing Center (TACC).
-
-There are several advantages in having BOINC projects that are high
-in the organizational hierarchy, and that serve many scientists:
-
-* The cost of maintaining a BOINC project is roughly constant,
-  regardless of its size.
-  For large projects, the cost per scientist is lower.
-
-* Publicity options: high-level organizational entities typically have
-  existing publicity mechanisms (e.g. alumni magazines, newsletters, etc.)
-  that can be leveraged to recruit volunteers.
-
-* Longevity: the duration of one scientist's need for HTC is generally shorter
-  than that of a group of scientists.
-  There are benefits in having a project last a long time
-  (e.g. amortizing the startup cost).
-
-* Continuity: similarly, one scientist's computing workload may
-  be sporadic, while that of a group of scientists is more continuous.
-  Some volunteers prefer projects with continuous workloads.
-
-So if you're thinking about using BOINC,
-consider the possible scope of your project.
--- a/Computing-with-BOINC.md
+++ b/Computing-with-BOINC.md
@ -8,9 +8,8 @@ For help with BOINC, post to the

 ## Introductory docs

-* [BOINC overview](BoincOverview)
-* [BOINC projects](ProjectsApps)
-* [Features of BOINC](BoincIntro)
+* [BOINC overview](BOINC-overview)
+* [BOINC projects](BOINC-projects)
 * [Create a BOINC server (cookbook)](Create-a-BOINC-server-(cookbook))
 * [BOINC apps (introduction)](Boinc-apps-(introduction))
 * [Deploy Linux apps using VirtualBox (cookbook)](Deploy-Linux-apps-using-VirtualBox-(cookbook))
--- a/Going-public.md
+++ b/Going-public.md
@ -4,8 +4,41 @@
 ## Server software upgrades
 ## Log files
 ## Backups
+## Get worker nodes
+
+in-house:
+
+If you're doing in-house computing,
+install the BOINC client on your worker nodes, and you're done.
+This is detailed [here](DesktopGrid).
+
+
+volunteer: 
+Create a public-facing web site for your project.
+Announce it and publicize it using whatever channels are available to you:
+mass media, social media, newsletters, paid advertising, etc.
+
+
 ## Get vetted
+* [Contact us](ProjectPeople) and ask to have your project listed by BOINC.
+You'll be asked to demonstrate that a) your project is doing
+what you claim it is, and b) you're following a set of security practices.
+Your project will then a) be announced on the BOINC web site news column,
+b) be listed on the BOINC web site, and
+c) appear in the list of projects shown in the BOINC client GUI.
+
 ## Science United
+* [Contact us](ProjectPeople) and ask to have your project
+ included in [Science United](https://scienceunited.org),
+ a framework in which volunteers sign up for science areas instead of projects.
+ You'll need to tell us what types of research your project is doing,
+ and then you'll automatically get computing power from volunteers
+ who have registered an interest in those areas.
+ This has the advantage that you don't have to create a public-facing web site or do any publicity.
+ In addition, you can ask to be included in Science United even before you've created your project.
+ At that point we can tell you roughly how much computer power you'll get,
+ and you can decide whether this justifies the investment in creating a project.
+
 ## Web site

 content
--- a/11
+++ b/11
@ -47,12 +47,12 @@ doc big picture
 Intro docs
    (base-case examples w/ cookbooks; pictures; videos where feasible)

-    Introduction to BOINC
+    Overview
        What BOINC is; why use it; volunteer computing
        client/server: projects, worker nodes
        role of UCB
        features
-    The structure of BOINC
+    Projects
        projects, attachments, account managers
        Science United, BOINC Central
        role of UCB
@ -114,10 +114,3 @@ Intro docs
            content
            forums
            spam control
-----------------------------
-detailed docs
-Handling completed jobs (assimilation)
-    Standard assimilators
-    Assimilators in scripting languages (Python, PHP, etc.)
-    Assimilators in C++
-