boinc/doc/bashford_cookbook.txt

How I, a BOINC neophyte, set up a trivial test.
I won't cover building the software except to say that I found it
necessary to use CVS versions rather than the tarballs. I used the
version with the CVS tag, boinc_core_release_4_56, which was the
highest numbered tag at the time (mid December 2004).
This was a strictly internal test on a machine behind a firewall and
not visible to the outside world. This is important, because I don't
follow proper procedures with keys. Doing this on a publicly visible
project would be a bad thing (see
http://boinc.berkeley.edu/code_signing.php).
The machine was an i686-pc running Red Hat version 9. It had mySQL
version 3.23, Apache 2.0 already running, and python 2.2, as well as
the necessary build software for a non-graphical client and API.
The objective is to host a project on this machine using the upper_case
test app that comes with boinc, and then volunteer the same machine
to do computations as a client. I did this as an ordinary user (mostly).
On this machine, ordinary users can run crontab jobs.
./make_project --delete_prev_inst --drop_db_first --base TESTDIR SHORTNAME LONGNAME
where TESTDIR is the absolute pathname of a directory in which the
project's files will be set up,
SHORTNAME is a short, white-space-free name for the project, and
LONGNAME is a user-friendly name, which can contain whitespace.
This creates the directory TESTDIR/projects/SHORTNAME, which I'll refer
to as the "project directory" or PROJDIR. It has several subdirs,
including bin, containing utility programs and daemons for this
particular project; html, containing the files for the project's web
site, including PHP scripts that run on the web server (in my case,
the same machine as the project host) in response to user/manager
actions through a browser; and cgi-bin, which contains binary executables
that run on the web server side in response to http communications from
the core-client running on volunteer machines.
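To make the naming concrete, here is a small shell sketch that only
prints the command line and the resulting paths; it does not run
make_project. The values /home/me/boinctest, uctest and "Upper Case Test"
are made up for illustration.

```shell
# Hypothetical values, chosen only for illustration.
TESTDIR=/home/me/boinctest      # must be an absolute path
SHORTNAME=uctest                # no white space
LONGNAME="Upper Case Test"      # may contain white space, so quote it

# The project directory that make_project would create.
PROJDIR="$TESTDIR/projects/$SHORTNAME"

# Print (do not run) the make_project command line.
echo ./make_project --delete_prev_inst --drop_db_first \
    --base "$TESTDIR" "$SHORTNAME" "\"$LONGNAME\""

# The subdirectories described in the text.
for d in bin html cgi-bin; do
    echo "$PROJDIR/$d"
done
```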
It also creates TESTDIR/keys, a world readable (!) directory
containing the project's encryption keys. In a real, publicly visible
project, these should be removed from here and kept in a secret place,
but I didn't do this.
It also creates a file, PROJDIR/SHORTNAME.httpd.conf, and tells you
you should append this to your httpd config file. Go ahead and do this.
For me it was, from PROJDIR,
su
cd PROJDIR
cat SHORTNAME.httpd.conf >> /etc/httpd/conf/httpd.conf
exit
[It might also be necessary to restart the web server; I don't think
it was in my case.]
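If you rerun make_project several times, a plain "cat >>" will append
the same fragment repeatedly. A minimal sketch of a guard against that,
assuming the first line of the fragment is distinctive enough to
recognize it (the helper name append_once is made up):

```shell
# append_once FRAGMENT TARGET  (hypothetical helper, not part of boinc)
# Append FRAGMENT to TARGET unless TARGET already contains a line that is
# identical to FRAGMENT's first line.
append_once() {
    fragment=$1
    target=$2
    first_line=$(head -n 1 "$fragment")
    if grep -qxF -- "$first_line" "$target" 2>/dev/null; then
        echo "already present in $target"
    else
        cat "$fragment" >> "$target"
        echo "appended to $target"
    fi
}
```

As root, usage would then be something like
append_once SHORTNAME.httpd.conf /etc/httpd/conf/httpd.conf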
Edit the defines at the top of PROJDIR/html/project/project.inc.
PROJECT will be used in title banners on the web site.
MASTER_URL should be http://HOSTNAME/SHORTNAME and URL_BASE
should be http://HOSTNAME/SHORTNAME/ (note the trailing slash).
[There must also be some requirements on IMAGE_* and PROFILE_*
values, but I haven't played around with this feature. Anyone?]
[The project_cookbook page says to protect the html/ops directory
by putting .htaccess and .htpasswd files there, but doesn't tell
what to PUT in those files. I don't know, so I didn't do it.
Dave?]
The make_project script also emits a line something like,
0,5,10,15,20,25,30,35,40,45,50,55 * * * * PROJDIR/bin/start --cron
and tells you to put it in a crontab file. For now, just store this line
in a file called, say, CRONENTRY, but don't run crontab on it yet.
The start --cron command is supposed to start various other daemons,
but it depends on information in config.xml to do this, and we
haven't properly set that file up yet. [Note, this contradicts
project_cookbook.]
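A sketch of saving the line for later, assuming a made-up PROJDIR value.
It only writes the file CRONENTRY in the current directory; it does not
install anything in cron.

```shell
# Hypothetical project directory; substitute your own PROJDIR.
PROJDIR=/home/me/boinctest/projects/uctest

# Save the crontab line emitted by make_project. This does NOT install it.
printf '%s\n' \
    "0,5,10,15,20,25,30,35,40,45,50,55 * * * * $PROJDIR/bin/start --cron" \
    > CRONENTRY

# Show what was saved.
cat CRONENTRY
```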
Now the Project's public web site should be visible through a browser
as http://HOSTNAME/SHORTNAME, and the project management site should be
http://HOSTNAME/SHORTNAME_ops. Go ahead and look around, but don't
try taking any actions yet.
The next steps prepare information to be put in the mySQL database by
the programs bin/xadd and bin/update_versions, and by the daemons.
Take a look at PROJDIR/config.xml, which make_project has generated.
You'll see that it has some info about the location of the site, and so
on. It also tells how to run some daemons, feeder, transitioner
and file_deleter. This isn't a complete set of daemons, and we'll
need to come back to this file; but it's good enough for now.
Copy project.xml from the tools dir of the source tree to PROJDIR and
edit it to tell about the apps to run and the platforms on which they
run. Here's my version:
<boinc>
    <platform>
        <name>i686-pc-linux-gnu</name>
        <user_friendly_name>Linux/x86</user_friendly_name>
    </platform>
    <platform>
        <name>windows-intelx86</name>
        <user_friendly_name>Windows/x86</user_friendly_name>
    </platform>
    <platform>
        <name>powerpc-apple-darwin</name>
        <user_friendly_name>Mac OS X</user_friendly_name>
    </platform>
    <platform>
        <name>anonymous</name>
        <user_friendly_name>anonymous</user_friendly_name>
    </platform>
    <app>
        <name>upper_case</name>
        <user_friendly_name>Convert to Upper Case</user_friendly_name>
    </app>
</boinc>
The only thing I changed is the app part.
The app part tells about the application(s) to be run on the client
machines and the platform part tells what kind of client architectures
we might have application versions for. The exact text of the "name"
elements is important (see below). Now cd to PROJDIR/bin and do
./xadd
This puts information from project.xml into the mySQL database. You should
now be able to see some entries under "Platforms" and "Applications"
on the project management page.
[BUGLET! Trying to view platforms in the project management site
gives: "Fatal error: Call to undefined function:
mysql_real_escape_string() in
/home/bashford/boinctest/projects/cplan/html/inc/util_ops.inc" I find
that if I change "if(1)" to "if(0)" this goes away. Probably
something to do with PHP version dependence.]
Now copy the upper_case executable from the apps dir of the build
directory to the proper place in PROJDIR/apps. In this case, "proper place"
means:
1) It must be in a subdir under apps with the same name as given
for the "name" of the "app" in project.xml.
2) The name of the copied executable must be of the form
APPNAME_VERSION_PLATFORM
where
APPNAME matches the "name" of the "app", as with the subdir name
VERSION must be of the form MAJOR.MINOR where MAJOR must match
the major version number of the boinc software! (In effect, MAJOR
must be 4 these days.)
PLATFORM must match the "name" of the "platform" in project.xml
for which this is an executable.
In our case, the executable's name must therefore be,
upper_case_4.0_i686-pc-linux-gnu
the "0" being the only part that's a free choice.
Clients will download this and run it. If you want, you can gzip it
and leave the .gz extension on.
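The naming rule can be written down as a pair of tiny shell helpers.
This is only a sketch of the rule as described above; the function
names are made up, and PROJDIR is passed in as plain text.

```shell
# app_exe_name APPNAME MAJOR MINOR PLATFORM  (hypothetical helper)
# Build the file name APPNAME_MAJOR.MINOR_PLATFORM.
app_exe_name() {
    printf '%s_%s.%s_%s\n' "$1" "$2" "$3" "$4"
}

# app_exe_path PROJDIR APPNAME MAJOR MINOR PLATFORM  (hypothetical helper)
# Build the full destination: PROJDIR/apps/APPNAME/<file name>.
app_exe_path() {
    printf '%s/apps/%s/%s\n' "$1" "$2" "$(app_exe_name "$2" "$3" "$4" "$5")"
}

app_exe_name upper_case 4 0 i686-pc-linux-gnu
app_exe_path PROJDIR upper_case 4 0 i686-pc-linux-gnu
```

The first call prints upper_case_4.0_i686-pc-linux-gnu and the second
prints PROJDIR/apps/upper_case/upper_case_4.0_i686-pc-linux-gnu.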
Now cd to PROJDIR/bin and run
./update_versions
This scans the app dir and creates AppVersions (info about which Apps
have really-existing versions for which platforms), and puts them in
the mySQL database. You'll get a dialog about signing apps with keys
to authenticate them. I just said yes, and went on, ignoring the
warning that this is a security hole. Now on the project management
site, you should see some entries under Application Versions. (You
need to push the OK button.) Also, a copy of the app will be put
in PROJDIR/download.
Now let's start those daemons and the project itself, although it
won't really do anything much. Go ahead and add and start your
cron job by doing
crontab CRONENTRY
and cd to PROJDIR/bin and do
./start
You might get an email from cron telling you it's starting the
daemons. Using ps you should be able to see the feeder, transitioner
and file_deleter daemons running. Look under PROJDIR/log_* for these
daemons' log files to see that they aren't giving error messages, etc.
After 5 minutes or so, do
./stop
to shut down the project. This stops the daemons and sets up files
under PROJDIR that prevent the daemons from getting started again by
cron. Make sure the log files reflect that they have stopped, and that
they stay stopped.
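Eyeballing ps output gets tedious, so here is a sketch of a filter that
reports which expected daemons are absent from a process listing. The
helper name is made up, and it assumes the listing has one bare command
name per line, as "ps -e -o comm" prints on Linux.

```shell
# missing_daemons NAME...  (hypothetical helper)
# Reads command names, one per line, on stdin and prints each NAME that
# does not appear in that listing.
missing_daemons() {
    listing=$(cat)
    for d in "$@"; do
        printf '%s\n' "$listing" | grep -qxF -- "$d" || printf '%s\n' "$d"
    done
}

# Example use (commented out so the sketch has no side effects):
#   ps -e -o comm | missing_daemons feeder transitioner file_deleter
```

While the project is running this should print nothing; after ./stop it
should print all three names.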
[MAYBE-BUG? At this point the feeder log not only contains the
"No results added; sleeping 1 sec" type lines but between these,
"restarting enumeration
enumeration restart returned nothing"
and then 99 repetitions of
"already restarted enum on this array scan"
Is it a mistake to start the feeder daemon before any work units or
results are created? Maybe this cookbook should say to defer the
crontab command and the first run of ./start to a later step?]
At this point, you should take care of some permission/ownership
problems that would bite you later. Under PROJDIR, the directories,
cgi-bin, log_*, upload, and html/cache must be writable by the httpd
daemon. I did this by chmoding them to be world writable. I suppose
it could be done more safely with ownership and group changes, but the
wrinkle is that the log directory needs to be writable by both the
httpd daemon and the user under which the project's cron daemons run
(in my case apache and bashford, respectively). Since files will be
created in these places that will be owned by apache (or whatever the
httpd user id is), it may be awkward deleting them once the project
has been up and running. In particular, a future run of
make_project with the --delete_prev_inst flag may fail. [Is that
right?]
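A sketch of the world-writable approach described above, wrapped in a
made-up helper. It is only suitable for a closed test machine like this
one; it does nothing about the ownership issue just mentioned.

```shell
# open_for_httpd PROJDIR  (hypothetical helper)
# Make the directories the httpd daemon writes to world writable.
open_for_httpd() {
    projdir=$1
    for d in "$projdir"/cgi-bin "$projdir"/log_* \
             "$projdir"/upload "$projdir"/html/cache; do
        if [ -d "$d" ]; then
            chmod a+w "$d"
            echo "opened $d"
        else
            # Covers a missing directory and an unmatched log_* pattern.
            echo "skipped $d (not a directory)" >&2
        fi
    done
}
```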
Next we need to create "Work Units" and "Results" (see
http://boinc.berkeley.edu/work.php and
http://boinc.berkeley.edu/result.php). The upper_case app takes a
file called "in" (filenames are hardwired in this particular app),
converts it all to upper case, and writes the result to "out". The
system can be told about this using the utility program create_work
(see http://boinc.berkeley.edu/tools_work.php). It needs template
files for the work units and results. Check the above-cited
docs for an explanation of the content; I'll just give what I used:
in PROJDIR/templates/work_unit_template:
<file_info>
    <number>0</number>
</file_info>
<workunit>
    <file_ref>
        <file_number>0</file_number>
        <open_name>in</open_name>
    </file_ref>
</workunit>
in PROJDIR/templates/result_template:
<file_info>
    <name><OUTFILE_0/></name>
    <generated_locally/>
    <upload_when_present/>
    <max_nbytes>100000</max_nbytes>
    <url><UPLOAD_URL/></url>
</file_info>
<result>
    <file_ref>
        <file_name><OUTFILE_0/></file_name>
        <open_name>out</open_name>
    </file_ref>
</result>
Copy the file test/input from the source directory to PROJDIR/download/in
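To know in advance what a correct "out" should look like, you can
approximate the app with tr. This is a stand-in written for this note,
not the real upper_case program, and it assumes the app does nothing
beyond upper-casing letters.

```shell
# expected_output FILE  (hypothetical helper)
# Print FILE with lower-case letters converted to upper case, as a rough
# reference for what the upper_case app should write to "out".
expected_output() {
    tr '[:lower:]' '[:upper:]' < "$1"
}

# Example use (commented out; the path is the one from the text):
#   expected_output PROJDIR/download/in > expected_out
```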
Now from PROJDIR do
./bin/create_work -appname upper_case -wu_name ucwu -wu_template templates/work_unit_template -result_template templates/result_template in
This adds one workunit, called ucwu, and five results to the
database. In other words, the project wants five different clients
to try to calculate a result, and it will compare them for validation
before declaring this work unit done.
For testing's sake, make another one a bit differently.
Keeping the same template files, do
./bin/create_work -appname upper_case -wu_name easywu -wu_template templates/work_unit_template -result_template templates/result_template -min_quorum 1 -target_nresults 1 in
This creates a workunit, easywu, that only wants one result (i.e. no
validation via redundancy). This makes it easier to test with one
user and machine volunteering as a client.
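Since the two create_work commands differ only in the workunit name and
the extra options, a small wrapper keeps them in step. This sketch only
prints the command lines (a dry run); the helper name is made up, and
every option shown is one already used above.

```shell
# create_work_cmd WU_NAME [EXTRA_OPTIONS...]  (hypothetical helper)
# Print the create_work command line for the upper_case test app.
create_work_cmd() {
    wu_name=$1
    shift
    echo ./bin/create_work -appname upper_case -wu_name "$wu_name" \
        -wu_template templates/work_unit_template \
        -result_template templates/result_template "$@" in
}

create_work_cmd ucwu
create_work_cmd easywu -min_quorum 1 -target_nresults 1
```

To actually create the work, run the printed lines from PROJDIR.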
More daemons are needed to create a continuing stream of work,
carry out the validation and assimilate completed work units.
Obviously, these depend on the nature of the project and its apps,
but for testing purposes, boinc supplies some we can use. The executables
are in PROJDIR/bin.
The utility make_work just makes endless copies of some existing
work unit, so put this into the daemons section of config.xml:
<daemon>
    <cmd>
        make_work -wu_name easywu -cushion 5
    </cmd>
</daemon>
Note that "easywu" is what we called the work unit made before. The option
"-cushion 5" says make sure there are at least 5 unsent results at any time.
Now arrange to run the boinc supplied, sample_trivial_validator and
sample_dummy_assimilator. Let's see if these daemon entries in
config.xml will work: [Help, I don't really understand this, and I'm
not sure it's working for me!]
<daemon>
    <cmd>
        sample_trivial_validator -d 3 -app upper_case
    </cmd>
</daemon>
<daemon>
    <cmd>
        sample_dummy_assimilator -d 3 -app upper_case
    </cmd>
</daemon>
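If you end up adding several such entries, a shell function can print
them for pasting into the daemons section of config.xml. This is just a
text generator with a made-up name; it does not edit config.xml.

```shell
# daemon_entry COMMAND [ARGS...]  (hypothetical helper)
# Print a <daemon> element whose <cmd> is the given command line.
daemon_entry() {
    cat <<EOF
<daemon>
    <cmd>
        $*
    </cmd>
</daemon>
EOF
}

daemon_entry sample_trivial_validator -d 3 -app upper_case
daemon_entry sample_dummy_assimilator -d 3 -app upper_case
```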
Now start up the project [again?]. From PROJDIR/bin,
./start
The project site should be fully functional now. If you look at
the workunits and pending results in the project management site, you
should see that more workunits and (pending) results have been created.
Check to see that all the daemons listed in config.xml are really running
and that their log files look okay.
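A quick way to spot trouble is to list the log files that mention the
word "error". This sketch assumes the logs are plain files directly
under PROJDIR/log_* and that problems are reported with that word; the
helper name is made up.

```shell
# logs_with_errors PROJDIR  (hypothetical helper)
# Print the name of every file under PROJDIR/log_* that contains the
# word "error" in any letter case.
logs_with_errors() {
    for f in "$1"/log_*/*; do
        [ -f "$f" ] || continue
        if grep -qi 'error' "$f"; then
            printf '%s\n' "$f"
        fi
    done
}
```

No output means no file matched, which is not proof that all is well,
only that nothing obvious was logged.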
[Re the above MAYBE-BUG, the feeder log looks more sensible now.]
The project is ready for volunteers. To become one, go to the project's
public website, http://HOSTNAME/SHORTNAME, and follow Create account
directions. If you encounter an offer to download the boinc_client software,
don't do it if it's coming from the project site. Only do it if it's coming
from BOINC's site, http://boinc.berkeley.edu/download.php.
You should get an email with your account key. Then
follow the "Getting Started" link and download and install BOINC
from the Berkeley site. For a linux client, you make a clean directory
for the client to make files in, and download into it a file with a name
like boinc_client_4.56_i686-pc-linux-gnu.gz. Then gunzip it and run
it from the command line. It will prompt for the project website
and your account id code. It should produce output indicating that it is
contacting the project's site, validating your account id, downloading
the app, asking for results to calculate, doing the calculation and
sending results back. This all goes to standard output. You'll probably
want to kill it with CTRL-C [this is harmless, no?] and restart it as a
background job using something like (for bash shell),
./boinc_4.13_i686-pc-linux-gnu >> boinc.log 2>> error.log &
or for csh shell,
(./boinc_4.13_i686-pc-linux-gnu >> boinc.log) >>& error.log &
There is more information about running the boinc client software
at http://noether.vassar.edu/~myers/help/boinc/unix.html
and http://boinc.berkeley.edu/client_unix.php.
[NOTE to web doc maintainers: There should be a link to one of
these (and equivs for other platforms) from the download page.
The documentation in http://noether.vassar.edu/~myers/help/boinc/unix.html
on putting a client invocation in the .profile or .login seems odd.
It then would run when you're logged out! How about a nohup?]
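The nohup idea might look like the following sketch (sh/bash syntax).
The helper name is made up; it appends standard output and standard
error to separate logs, as in the bash example above, and prints the
process id of the background job.

```shell
# run_detached OUT_LOG ERR_LOG COMMAND [ARGS...]  (hypothetical helper)
# Start COMMAND in the background under nohup so that it keeps running
# after logout, appending stdout and stderr to the two log files.
run_detached() {
    out_log=$1
    err_log=$2
    shift 2
    nohup "$@" >> "$out_log" 2>> "$err_log" < /dev/null &
    echo $!
}

# Example use (commented out; the client name is the one from the text):
#   run_detached boinc.log error.log ./boinc_4.13_i686-pc-linux-gnu
```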
The output produced on the client should show work getting done,
and the project management web site should show this user getting "credit"
for results.
[Problem? I see evidence of completed results only on the project
management site, not on the public site. On the Results page of
the management site, it looks like I'm getting successful output, but
0.0 credit is claimed and "---" is granted. Can I set up this demo
differently so that credit is given and the public site shows it?
Some features of the public site show up as "temporarily disabled".]
[BIGGER PROBLEM: result uploads are all failing! The log info of the
boinc_client (version 4.13, linux executable from berkeley website)
says "Error on file upload: invalid signature", and
file_upload_handler.log shows
"Returning error to client 199.76.2.44: invalid signature (permanent)".
I didn't do much with keys but take the defaults, and I did not have
this problem on a previous attempt when I used the 4.56 client I had
compiled myself. Is there a version skew problem?]