*** empty log message ***

svn path=/trunk/boinc/; revision=4712
David Anderson 2004-12-02 22:56:49 +00:00
parent f1656f4379
commit 5664a85754
10 changed files with 499 additions and 68 deletions

103
doc/acct_mgt.php Normal file
@ -0,0 +1,103 @@
<?php
require_once("docutil.php");
page_head("External account management <br>(work in progress)");
echo "
<p>
Currently BOINC provides only a web-based interface
for creating and managing accounts.
Participants must locate the web sites of BOINC projects,
read them, decide which to join,
and fill out a separate registration form at each site.
This may deter some potential participants.
<p>
We wish to enable new ways for people to find and join BOINC projects.
For example, one could have a <b>sharing control panel</b> that shows:
<ul>
<li> a list of BOINC projects, with short descriptions and 'join' checkboxes;
<li> a simple form for the user's email address and basic preferences.
</ul>
The user can join projects simply by checking boxes, clicking OK,
and responding to account-verification emails.
<p>
The sharing control panel is an example of what we will call
'account management applications'.
<p>
This document describes a mechanism that allows account
management applications to interact with BOINC projects.
<h2>RPCs for account management</h2>
<p>
We propose having BOINC projects provide
an XML-RPC interface for account management.
RPCs will use HTTP on port 80,
so it will be easy to implement the client side in any language
(C++, Visual Basic, etc.),
and the mechanism will work through firewalls that allow outgoing web requests.
<p>
The proposed RPC functions are as follows:
<h3>Create tentative account</h3>
";
list_start();
list_item(
"input", "email address
<br>host name
<br>client nonce ID (crypto-random)"
);
list_item(
"output", "tentative account ID"
);
list_item(
"action",
"The server creates a 'tentative account' database record.
The server sends email to the given address, of the form:
<pre>
Someone (hopefully you) joined [project name] with this email address.
To confirm your participation in [project name] please visit the following URL:
xxx
If you do not want to participate in [project name], just ignore this message.
</pre>
When the user visits xxx, they see a release form and OK button.
The OK button validates the tentative account,
creating a new user record if needed.
");
list_end();
echo "
<h3>Query account status</h3>
";
list_start();
list_item("input",
"tentative account ID
<br>client nonce ID
"
);
list_item("output",
"bool validated
<br>account key
"
);
list_item("action",
"If the account has been validated, return true and the account key.
The account management application
can then do <a href=gui_rpc.php>BOINC GUI RPC</a> to the BOINC core client
to attach to the project."
);
list_end();
echo "
<p>
Possible additions:
RPCs to get and set preferences.
";
page_tail();
?>
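The two proposed RPCs can be sketched as a small in-memory model. This is an illustration only, not the project's implementation: the class and method names (`AccountServer`, `confirm`, and so on) are hypothetical, the XML-RPC transport and the confirmation email are left out, and the account key format is an assumption.

```python
import secrets
from dataclasses import dataclass


@dataclass
class TentativeAccount:
    email: str
    host_name: str
    nonce: str
    validated: bool = False
    account_key: str = ""


class AccountServer:
    """Hypothetical in-memory stand-in for a project's account RPC handler."""

    def __init__(self):
        self._accounts = {}
        self._next_id = 1

    def create_tentative_account(self, email, host_name, nonce):
        """RPC 1: record the request and return a tentative account ID.
        A real server would also send the confirmation email here."""
        account_id = self._next_id
        self._next_id += 1
        self._accounts[account_id] = TentativeAccount(email, host_name, nonce)
        return account_id

    def confirm(self, account_id):
        """Stands in for the user visiting the emailed URL and clicking OK."""
        account = self._accounts[account_id]
        account.validated = True
        account.account_key = secrets.token_hex(16)  # assumed key format

    def query_account_status(self, account_id, nonce):
        """RPC 2: return (validated, account_key) if the nonce matches."""
        account = self._accounts.get(account_id)
        if account is None or not secrets.compare_digest(account.nonce, nonce):
            return (False, "")
        return (account.validated, account.account_key)
```

An account management application would call `create_tentative_account` once, then poll `query_account_status` with the same nonce until it returns true, and finally pass the account key to the core client via GUI RPC.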

@ -62,6 +62,7 @@ before getting into the source code.
<ul>
<li> <a href=backend_state.php>Backend state transitions</a>
<li> <a href=backend_logic.php>The logic of backend programs</a>
<li> <a href=server_debug.php>Debugging server components</a>
</ul>
<h2>Protocols</h2>

@ -26,6 +26,8 @@ htmlspecialchars("
<shmem_key> shared_memory_key </shmem_key>
<download_url> http://A/URL </download_url>
<download_dir> /path/to/directory </download_dir>
<download_dir_alt> /path/to/directory </download_dir_alt>
<uldl_dir_fanout> N </uldl_dir_fanout>
<upload_url> http://A/URL </upload_url>
<upload_dir> /path/to/directory </upload_dir>
<cgi_url> http://A/URL </cgi_url>
@ -91,7 +93,14 @@ list_item("db_passwd", "Database password");
list_item("shmem_key", "ID of scheduler shared memory. Must be unique on host.");
list_item("download_url", "URL of data server for download");
list_item("download_dir", "absolute path of download directory");
list_item("download_dir_alt",
"absolute path of old download directory
(see <a href=hier_dir.php>Hierarchical upload/download directories</a>)"
);
list_item("upload_url", "URL of file upload handler");
list_item("uldl_dir_fanout", "fan-out factor of upload and download directories
(see <a href=hier_dir.php>Hierarchical upload/download directories</a>)"
);
list_item("upload_dir", "absolute path of upload directory");
list_item("cgi_url", "URL of scheduling server");
list_item("stripchart_cgi_url", "URL of stripchart server");
@ -104,39 +113,60 @@ take a long time. If you enable this feature, be sure to rotate the
logs so that they are not too big.");
list_end();
echo "<b>The following control features that you may or may not want available to users.</b>";
echo "
<b>The following options control features that you may or may not want
available to users.</b>
";
list_start();
list_item("disable_account_creation", "If present, disallow account creation");
list_item("show_results", "Enable web site features that show results (per user, host, etc.)");
list_item("disable_account_creation",
"If present, disallow account creation"
);
list_item("show_results",
"Enable web site features that show results (per user, host, etc.)"
);
list_end();
echo "<b>The following control the way in which results are scheduled, sent, and assigned to users and hosts.</b>";
echo "
<b>The following control the way in which results are scheduled, sent,
and assigned to users and hosts.</b>
";
list_start();
list_item("one_result_per_user_per_wu", "If present, send at most one result of a given workunit to a given
user. This is useful for checking accuracy/validity of results. It
ensures that the results for a given workunit are generated by
<b>different</b> users. If you have a validator that compares
different results for a given workunit to ensure that they are
equivalent, you should probably enable this. Otherwise you may end up
validating results from a given user with results from the <b>same</b>
user.");
list_item("max_wus_to_send", "Maximum results sent per scheduler RPC. Helps prevent hosts with
trouble from getting too many results and trashing them. But you
should set this large enough so that a host which is only connected to
the net at intervals has enough work to keep it occupied in between
connections.");
list_item("min_sendwork_interval", "Minimum number of seconds to wait after sending results to a given
host, before new results are sent to the same host. Helps prevent
hosts with download or application problems from trashing lots of
results by returning lots of error results. But don't set it to be so
long that a host goes idle after completing its work, before getting
new work.");
list_item("daily_result_quota", "Maximum number of results sent to a given host in a 24-hour
period. Helps prevent hosts with download or application problems from
returning lots of error results. Be sure to set it large enough that
a host does not go idle in a 24-hour period, and can download enough
work to keep it busy if disconnected from the net for a few days.");
list_item("enforce_delay_bound", "Don't send results to hosts too slow to complete them within delay bound");
list_item("one_result_per_user_per_wu",
"If present, send at most one result of a given workunit to a given user.
This is useful for checking accuracy/validity of results.
It ensures that the results for a given workunit are generated by
<b>different</b> users.
If you have a validator that compares different results
for a given workunit to ensure that they are equivalent,
you should probably enable this.
Otherwise you may end up validating results from a given user
with results from the <b>same</b> user."
);
list_item("max_wus_to_send",
"Maximum results sent per scheduler RPC. Helps prevent hosts with
trouble from getting too many results and trashing them. But you
should set this large enough so that a host which is only connected to
the net at intervals has enough work to keep it occupied in between
connections."
);
list_item("min_sendwork_interval",
"Minimum number of seconds to wait after sending results to a given
host, before new results are sent to the same host. Helps prevent
hosts with download or application problems from trashing lots of
results by returning lots of error results. But don't set it to be so
long that a host goes idle after completing its work, before getting
new work."
);
list_item("daily_result_quota",
"Maximum number of results sent to a given host in a 24-hour
period. Helps prevent hosts with download or application problems from
returning lots of error results. Be sure to set it large enough that
a host does not go idle in a 24-hour period, and can download enough
work to keep it busy if disconnected from the net for a few days."
);
list_item("enforce_delay_bound",
"Don't send results to hosts too slow to complete them within delay bound"
);
list_end();
// THE INFORMATION BELOW NEEDS TO BE ORGANIZED AND PUT INTO TABLES OR SOME OTHER LESS CRAMPED FORM
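One way to read how max_wus_to_send, min_sendwork_interval, and daily_result_quota interact is sketched below. This is a guess at the combined effect, not the scheduler's actual code: the types `SchedulerLimits` and `HostState` and the function `results_allowed` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class SchedulerLimits:
    max_wus_to_send: int          # cap on results per scheduler RPC
    min_sendwork_interval: float  # seconds to wait between sends to one host
    daily_result_quota: int       # cap on results per host per 24 hours


@dataclass
class HostState:
    last_send_time: float    # when results were last sent to the host (seconds)
    results_sent_today: int  # results sent in the current 24-hour period


def results_allowed(limits, host, now, requested):
    """Return how many results may be sent to this host right now."""
    # Too soon after the previous send: send nothing.
    if now - host.last_send_time < limits.min_sendwork_interval:
        return 0
    # Never exceed what is left of the daily quota.
    remaining_quota = max(0, limits.daily_result_quota - host.results_sent_today)
    return min(requested, limits.max_wus_to_send, remaining_quota)
```

Under this reading the three options are independent ceilings, so the smallest one in effect at request time decides how much work a host receives.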

@ -28,6 +28,7 @@ A file is described by an XML element of the form
[ <sticky/> ]
[ <signature_required/> ]
[ <no_delete/> ]
[ <report_on_rpc/> ]
</file_info>
")."
The elements are as follows:
@ -36,38 +37,61 @@ list_start();
list_item(
"name", "The file's name, which must be unique within the project."
);
list_item("url", "a URL where the file is
(or will be) located on a data server.");
list_item("md5_cksum", "The MD5 checksum of the file.");
list_item("nbytes", "the size of the file in
bytes (may be greater than 2^32).");
list_item("max_nbytes", "The maximum allowable
size of the file in bytes (may be greater than 2^32).
This is used to prevent flooding data servers with bogus data.");
list_item("status", "0 if the file is not present,
1 if the file is present, or a negative error code if there was a
problem in downloading or generating the file.");
list_item("generated_locally", "If present,
indicates that the file will be generated by an application on
the client, as opposed to being downloaded.");
list_item("executable", "If present, indicates
that the file protections should be set to allow execution.");
list_item("upload_when_present", "If present,
indicates that the file should be uploaded after it is created.");
list_item("sticky", "If present, indicates that
the file should be retained on the client after its initial use.");
list_item("signature_required", "If present,
indicates that the file should be verified with an RSA signature.
This generally only applies to executable files.");
list_item("no_delete", "If present for an input (workunit) file,
indicates that the file should NOT be removed from the download/
directory when the workunit is completed. You should use this
if a particular input file or files are used by more than one
workunit, or will be used by future, unqueued workunits.");
list_item("no_delete", "If present for an output (result) file,
indicates that the file should NOT be removed from the upload/
directory when the corresponding workunit is completed.
Use with caution - this may cause your upload/ directory to overflow.");
list_item("url",
"a URL where the file is (or will be) located on a data server."
);
list_item("md5_cksum", "The MD5 checksum of the file."
);
list_item("nbytes",
"the size of the file in bytes (may be greater than 2^32)."
);
list_item("max_nbytes",
"The maximum allowable size of the file in bytes (may be greater than 2^32).
This is used to prevent flooding data servers with bogus data."
);
list_item("status",
"0 if the file is not present,
1 if the file is present, or a negative error code if there was a
problem in downloading or generating the file."
);
list_item("generated_locally",
"If present, indicates that the file will be generated by an application on
the client, as opposed to being downloaded."
);
list_item("executable",
"If present, indicates that the file protections should be set to allow
execution."
);
list_item("upload_when_present",
"If present, indicates that the file should be uploaded after it is created.
");
list_item("sticky",
"If present, indicates that the file should be retained
on the client after its initial use."
);
list_item("signature_required",
"If present, indicates that the file should be verified with an
RSA signature.
This generally only applies to executable files."
);
list_item("no_delete",
"If present for an input (workunit) file,
indicates that the file should NOT be removed from the download/
directory when the workunit is completed. You should use this
if a particular input file or files are used by more than one
workunit, or will be used by future, unqueued workunits."
);
list_item("no_delete",
"If present for an output (result) file,
indicates that the file should NOT be removed from the upload/
directory when the corresponding workunit is completed.
Use with caution - this may cause your upload/ directory to overflow."
);
list_item("report_on_rpc",
"Include a description of this file in scheduler RPC requests,
so that the scheduler may send appropriate work
using <a href=sched_locality.php>locality scheduling</a>."
);
list_end();
echo "
These attributes allow the specification of various types of files: for

94
doc/hier_dir.php Normal file
@ -0,0 +1,94 @@
<?php
require_once("docutil.php");
page_head("Hierarchical upload/download directories");
echo "
The data server for a large project
may store hundreds of thousands or millions of files at any given point.
If these files are stored in 'flat' directories
(project/download and project/upload)
the data server may spend a lot of CPU time searching directories.
If you see a high CPU load average,
with a lot of time in kernel mode,
this is probably what's happening.
<p>
The solution is to use
<b>hierarchical upload/download directories</b>.
To do this, include the line
".html_text("
<uldl_dir_fanout>1024</uldl_dir_fanout>
")."
in your <a href=configuration.php>config.xml file</a>
(this is the default for new projects).
<p>
This causes BOINC to use hierarchical upload/download directories.
Each directory will have a set of 1024 subdirectories, named 0 to 3ff (hexadecimal).
Files are hashed (based on their filename) into these directories.
<p>
The hierarchy is used for input and output files only.
Executables and other application version files are
in the top level of the download directory.
<p>
This affects your project-specific code in a couple of places.
First, your work generator must put input files in
the right directory before calling <a href=tools_work.php>create_work()</a>.
To do this, it can use the function
".html_text("
int dir_hier_path(
const char* filename, const char* root, int fanout, char* result
);
")."
This takes the name of the input file,
the absolute path of the root of the download hierarchy
(typically the download_dir element from config.xml),
and the fanout,
and stores the absolute path of the file in the hierarchy in result.
<p>
Secondly, your validator and assimilator should call
".html_text("
int get_output_file_path(RESULT const& result, string& path);
")."
to get the paths of output files in the hierarchy.
If your application has multiple output files,
you'll need to generalize this function.
<p>
A couple of utility programs are available:
".html_text("
dir_hier_move src_dir dst_dir fanout
dir_hier_path filename
")."
<code>dir_hier_move</code> moves all files from src_dir (flat)
into dst_dir (hierarchical with the given fanout).
<code>dir_hier_path</code>, given a filename,
prints the full pathname of that file in the hierarchy.
<h2>Transitioning from flat to hierarchical directories</h2>
<p>
If you are operating a project with flat directories,
you can transition to a hierarchy as follows:
<ul>
<li> Stop the project and add &lt;uldl_dir_fanout> to config.xml.
You may want to locate the hierarchy root at a new place
(e.g. download/fanout); in this case update the
&lt;download_dir> element of config.xml,
and add the element
".html_text("
<download_dir_alt>old download dir</download_dir_alt>
")."
This causes the file deleter to check both old and new locations.
<li> Use dir_hier_move to move existing upload files to a hierarchy.
<li> Start the project, and monitor everything closely for a while.
</ul>
";
page_tail();
?>
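The hashing scheme and the flat-to-hierarchy move described in hier_dir.php can be sketched as follows. The page does not say which hash function BOINC uses, so the MD5-based hash here is an assumption, and the Python functions are illustrative stand-ins rather than the real `dir_hier_path` function or `dir_hier_move` utility.

```python
import hashlib
import os
import shutil


def dir_hier_subdir(filename, fanout):
    """Map a filename to a subdirectory name in hex (0 .. fanout-1).
    The MD5-based hash is an assumption made for this sketch."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return format(int(digest[:8], 16) % fanout, "x")


def dir_hier_path(filename, root, fanout):
    """Return the path of filename inside the hierarchy rooted at root."""
    return os.path.join(root, dir_hier_subdir(filename, fanout), filename)


def move_flat_to_hier(src_dir, dst_dir, fanout):
    """Move every regular file in flat src_dir into the hierarchy at dst_dir.
    Returns the number of files moved."""
    moved = 0
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if not os.path.isfile(src):
            continue  # skip subdirectories and other non-file entries
        dst = dir_hier_path(name, dst_dir, fanout)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.move(src, dst)
        moved += 1
    return moved
```

Because the subdirectory depends only on the filename and the fanout, any program that knows both can locate a file without searching, which is why the work generator, validator, and assimilator must all use the same fanout value.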

@ -11,6 +11,13 @@ A paper about BOINC's design goals is here:
<a href=boinc2.pdf>PDF</a> |
<a href=http://boinc.de/madrid_de.htm>HTML/German</a> |
<a href=http://www.seti-nl.org/content.php?c=boinc_berkeley_madrid>HTML/Dutch</a>
<p>
A technical paper about BOINC is
<a href=grid_paper_04.pdf>here</a>.
This paper appeared in the
5th IEEE/ACM International Workshop on Grid Computing,
November 8, 2004, Pittsburgh, USA.
<p>
BOINC's features fall into several areas:

@ -88,11 +88,11 @@ Troubleshooting: check the log files of all daemon processes.
<h2>Develop back end components</h2>
<ul>
<li> Write a work generator.
<li> Write a validator.
<li> Write an assimilator.
<li> Edit the configuration file to use these programs
instead of the place-holder programs.
<li> Write a <a href=tools_work.php>work generator</a>.
<li> Write a <a href=validate.php>validator</a>.
<li> Write an <a href=assimilate.php>assimilator</a>.
<li> Edit the <a href=configuration.php>configuration file</a>
to use these programs instead of the place-holder programs.
<li> Make sure everything works correctly.
</ul>
@ -100,7 +100,6 @@ instead of the place-holder programs.
<h2>Extras</h2>
<ul>
<li> Make the core client available from your site
<li> Add message board categories: see html/ops/create_forums.php
</ul>

50
doc/server_debug.php Normal file
@ -0,0 +1,50 @@
<?php
require_once("docutil.php");
page_head("Debugging server components");
echo "
<p>
A grab-bag of techniques for debugging BOINC server software:
<h2>Log files</h2>
Most error conditions are reported in the log files.
Make sure you know where these are.
If you're interested in the history of a particular WU or result,
grep for WU#12345 or RESULT#12345 (12345 represents the ID)
in the log files.
The html/ops pages also provide an interface for this.
<h2>Database query tracing</h2>
If you uncomment the symbol SHOW_QUERIES in db/db_base.C,
and recompile everything,
all database queries will be written to stderr
(for daemons, this goes to log files;
for command-line apps it's written to your terminal).
This is verbose but extremely useful for tracking down
database-level problems.
<h2>Scheduler single-stepping</h2>
The scheduler is a CGI program.
It reads from stdin and writes to stdout,
so you can also run it with a command-line debugger like gdb.
Direct a scheduler request file
(which you can copy from a client;
they're saved in files called sched_request_PROJECT.xml)
to stdin, set breakpoints, and start stepping through the code.
<p>
This is useful for figuring out why your project is generating
'no work available' messages.
<h2>MySQL interfaces</h2>
You should become familiar with MySQL tools such as
<ul>
<li> mytop: like 'top' for MySQL
<li> the mysql interpreter ('mysql') and in particular
the 'show processlist;' query.
<li> phpMyAdmin: general-purpose web interface to MySQL
</ul>
";
page_tail();

@ -10,7 +10,7 @@ PROJECT/
bin/
cgi-bin/
log_HOSTNAME/
pid/
pid_HOSTNAME/
download/
html/
inc/
@ -37,6 +37,11 @@ Each project directory contains:
<li> upload: storage for data server uploads.
</ul>
<p>
The upload and download directories
may contain large numbers (millions) of files.
For efficiency they are normally organized as
a <a href=hier_dir.php>hierarchy</a> of subdirectories.
";
page_tail();
?>

118
doc/win_install.php Normal file
@ -0,0 +1,118 @@
<?php
require_once("docutil.php");
page_head("Windows installation options");
echo "
<h2>Components</h2>
<ul>
<li> <b>core client</b>: the program that manages file transfers
and execution of applications.
<li> <b>applications</b>: project-specific programs run by the core client.
<li> <b>manager</b>: the GUI to the core client.
<li> <b>screensaver</b>: a program that runs when the machine is idle.
Typically it sends a message to the core client,
telling it to do screensaver graphics.
</ul>
<h2>Single-user mode</h2>
<p>
This is the default.
The goals are simplicity and nice graphics.
<p>
Say the install is done by user X.
The manager runs automatically when X logs in.
The manager starts up the core client.
The core client runs as a regular process, not a service.
If the manager crashes the core client continues to run.
The user can re-run the manager.
When the user logs out, the manager, the core client,
and any running applications exit.
<p>
Files (in the BOINC directory) are owned by user X.
<p>
Detection of mouse/keyboard is done by the core client.
<p>
The screensaver works as it currently does,
except that we'll pass window-station/desktop info
so that the password-protected screensaver mechanism will work.
<p>
Other users can't run the BOINC manager.
<h2>Run-while-logged-in mode</h2>
<p>
This is the same as single-user mode except
that the BOINC manager (and core client)
run whenever any user is logged in.
Processes run as whoever is logged in.
<p>
If someone logs in while BOINC is already running,
it will not start a new instance of BOINC.
<h2>Run-always nongraphical mode</h2>
<p>
This is for situations, such as a PC lab in a school,
where the administrator wants BOINC to run on the machine
all the time (even when no one is logged in)
but doesn't want any other users to be able to see or control BOINC.
<p>
The core client runs as a service, started at boot time.
On Windows 2003 and greater it runs under the 'network service' account.
Otherwise it runs as the installing user.
<p>
There is no mouse/keyboard checking,
so run-when-idle is not supported.
There is no screensaver capability.
Only the installing user can run the BOINC manager.
Files are accessible only to the installing user.
<h2>Run-always graphical mode</h2>
<p>
This is for PCs that have multiple users,
all of whom want to see graphics and have control over BOINC.
BOINC should run when no one is logged in.
<p>
The core client runs as a service, started at boot time.
It runs under the 'local system' account
(and hence so do all applications).
The manager starts at login for all users.
The manager checks mouse/keyboard input
and conveys idle state to the core client.
<p>
The screensaver either does graphics itself
(based on info obtained from the core client via RPC)
or (via the core client) has an application do the graphics.
In this case the application must switch to the same
window station and desktop as the screensaver.
<p>
<b>NOTE: this is not implemented and may never be,
because of technical difficulties
and the undesirability of running BOINC as 'local system'.</b>
<h2>Customizing the installer</h2>
<p>
The new BOINC installer is an MSI package.
Suppose you want to modify it so that you can
deploy BOINC across a Windows network using Active Directory,
and have all the PCs attached to a particular account.
Here's how to do this:
<ul>
<li> Using ORCA, edit the installer to set the installation
parameters to what you want.
<li> The global property ACCOUNTS_LOCATION specifies
(either in UNC or drive:path format)
a directory containing initial account files (normally null).
You can edit this to point to the account file you want.
For large-scale deployments it is probably safer
to use UNC paths.
</ul>
";
?>