boinc/doc/db_purge.php

<?php
require_once("docutil.php");
page_head("Database purging utility");
echo "
As a BOINC project operates, the size of its
workunit and result tables increases.
Eventually they become so large
that adding a field or building an index may take hours or days.

<p>
To address this problem, BOINC provides a utility <b>db_purge</b>
that writes result and WU records to XML-format archive files,
then deletes them from the database.

<p>
Workunits are purged only when their input files have been deleted.
Because of BOINC's file-deletion policy,
this implies that all results are completed.
So when a workunit is purged, all its results are purged too.

<p>
Run db_purge from the project's bin/ directory.
It will create an archive/ directory and store archive files there.

<p>
db_purge is normally run as a daemon,
specified in the <a href=configuration.php>config.xml</a> file.
It has the following command-line options:
";
list_start();
list_item("-min_age_days N",
    "Purge only WUs with mod_time at least N days in the past.
    Recommended value: 7 or so.
    This lets users examine their recent results."
);
list_item("-max N", "Purge at most N WUs, then exit");
list_item("-max_wu_per_file N",
    "Write at most N WUs to each archive file.
    Recommended value: 10,000 or so."
);
list_item("-zip", "Compress archive files using zip");
list_item("-gzip", "Compress archive files using gzip");
list_item("-d N", "Set logging verbosity to N (1,2,3)");
list_end();

echo "
<h3>Archive file format</h3>
<p>
The archive files have names of the form
wu_archive_TIME and result_archive_TIME
where TIME is the Unix time the file was created.
In addition, db_purge generates index files
'wu_index' and 'result_index'
associating each WU and result ID with the timestamp of its archive file.
<p>
The format of both type of index files is a number of rows each containing:
<pre>
ID     TIME
</pre>
The ID field of the WU or result, 5 spaces,
and the timestamp part of the archive filename where the record with that
ID can be found.
<p>
The format of a record in the result archive file is:
".html_text("
<result_archive>
    <id>%d</id>
  <create_time>%d</create_time>
  <workunitid>%d</workunitid>
  <server_state>%d</server_state>
  <outcome>%d</outcome>
  <client_state>%d</client_state>
  <hostid>%d</hostid>
  <userid>%d</userid>
  <report_deadline>%d</report_deadline>
  <sent_time>%d</sent_time>
  <received_time>%d</received_time>
  <name>%s</name>
  <cpu_time>%.15e</cpu_time>
  <xml_doc_in>%s</xml_doc_in>
  <xml_doc_out>%s</xml_doc_out>
  <stderr_out>%s</stderr_out>
  <batch>%d</batch>
  <file_delete_state>%d</file_delete_state>
  <validate_state>%d</validate_state>
  <claimed_credit>%.15e</claimed_credit>
  <granted_credit>%.15e</granted_credit>
  <opaque>%f</opaque>
  <random>%d</random>
  <app_version_num>%d</app_version_num>
  <appid>%d</appid>
  <exit_status>%d</exit_status>
  <teamid>%d</teamid>
  <priority>%d</priority>
  <mod_time>%s</mod_time>
</result_archive>
")."
The format of a record in the WU archive file is:
".html_text("
<workunit_archive>
    <id>%d</id>
  <create_time>%d</create_time>
  <appid>%d</appid>
  <name>%s</name>
  <xml_doc>%s</xml_doc>
  <batch>%d</batch>
  <rsc_fpops_est>%.15e</rsc_fpops_est>
  <rsc_fpops_bound>%.15e</rsc_fpops_bound>
  <rsc_memory_bound>%.15e</rsc_memory_bound>
  <rsc_disk_bound>%.15e</rsc_disk_bound>
  <need_validate>%d</need_validate>
  <canonical_resultid>%d</canonical_resultid>
  <canonical_credit>%.15e</canonical_credit>
  <transition_time>%d</transition_time>
  <delay_bound>%d</delay_bound>
  <error_mask>%d</error_mask>
  <file_delete_state>%d</file_delete_state>
  <assimilate_state>%d</assimilate_state>
  <hr_class>%d</hr_class>
  <opaque>%f</opaque>
  <min_quorum>%d</min_quorum>
  <target_nresults>%d</target_nresults>
  <max_error_results>%d</max_error_results>
  <max_total_results>%d</max_total_results>
  <max_success_results>%d</max_success_results>
  <result_template_file>%s</result_template_file>
  <priority>%d</priority>
  <mod_time>%s</mod_time>
</workunit_archive>
")."
";
page_tail();
?>
* empty log message * svn path=/trunk/boinc/; revision=4948 2004-12-27 22:52:55 +00:00			`<?php`
			`require_once("docutil.php");`
			`page_head("Database purging utility");`
			`echo "`
			`As a BOINC project operates, the size of its`
replace wait3(), wait4() with waitpid() svn path=/trunk/boinc/; revision=7170 2005-08-04 00:12:50 +00:00			`workunit and result tables increases.`
			`Eventually they become so large`
			`that adding a field or building an index may take hours or days.`
* empty log message * svn path=/trunk/boinc/; revision=4948 2004-12-27 22:52:55 +00:00
			`<p>`
			`To address this problem, BOINC provides a utility <b>db_purge</b>`
replace wait3(), wait4() with waitpid() svn path=/trunk/boinc/; revision=7170 2005-08-04 00:12:50 +00:00			`that writes result and WU records to XML-format archive files,`
			`then deletes them from the database.`
* empty log message * svn path=/trunk/boinc/; revision=4948 2004-12-27 22:52:55 +00:00
			`<p>`
			`Workunits are purged only when their input files have been deleted.`
			`Because of BOINC's file-deletion policy,`
			`this implies that all results are completed.`
			`So when a workunit is purged, all its results are purged too.`

			`<p>`
			`Run db_purge from the project's bin/ directory.`
			`It will create an archive/ directory and store archive files there.`

			`<p>`
			`db_purge is normally run as a daemon,`
			`specified in the <a href=configuration.php>config.xml</a> file.`
			`It has the following command-line options:`
			`";`
			`list_start();`
			`list_item("-min_age_days N",`
			`"Purge only WUs with mod_time at least N days in the past.`
			`Recommended value: 7 or so.`
			`This lets users examine their recent results."`
			`);`
			`list_item("-max N", "Purge at most N WUs, then exit");`
			`list_item("-max_wu_per_file N",`
			`"Write at most N WUs to each archive file.`
			`Recommended value: 10,000 or so."`
			`);`
			`list_item("-zip", "Compress archive files using zip");`
			`list_item("-gzip", "Compress archive files using gzip");`
			`list_item("-d N", "Set logging verbosity to N (1,2,3)");`
			`list_end();`

			`echo "`
replace wait3(), wait4() with waitpid() svn path=/trunk/boinc/; revision=7170 2005-08-04 00:12:50 +00:00			`<h3>Archive file format</h3>`
			`<p>`
			`The archive files have names of the form`
			`wu_archive_TIME and result_archive_TIME`
			`where TIME is the Unix time the file was created.`
			`In addition, db_purge generates index files`
			`'wu_index' and 'result_index'`
			`associating each WU and result ID with the timestamp of its archive file.`
			`<p>`
			`The format of both type of index files is a number of rows each containing:`
			`<pre>`
			`ID TIME`
			`</pre>`
			`The ID field of the WU or result, 5 spaces,`
			`and the timestamp part of the archive filename where the record with that`
			`ID can be found.`
			`<p>`
			`The format of a record in the result archive file is:`
			`".html_text("`
			`<result_archive>`
			`<id>%d</id>`
			`<create_time>%d</create_time>`
			`<workunitid>%d</workunitid>`
			`<server_state>%d</server_state>`
			`<outcome>%d</outcome>`
			`<client_state>%d</client_state>`
			`<hostid>%d</hostid>`
			`<userid>%d</userid>`
			`<report_deadline>%d</report_deadline>`
			`<sent_time>%d</sent_time>`
			`<received_time>%d</received_time>`
			`<name>%s</name>`
			`<cpu_time>%.15e</cpu_time>`
			`<xml_doc_in>%s</xml_doc_in>`
			`<xml_doc_out>%s</xml_doc_out>`
			`<stderr_out>%s</stderr_out>`
			`<batch>%d</batch>`
			`<file_delete_state>%d</file_delete_state>`
			`<validate_state>%d</validate_state>`
			`<claimed_credit>%.15e</claimed_credit>`
			`<granted_credit>%.15e</granted_credit>`
			`<opaque>%f</opaque>`
			`<random>%d</random>`
			`<app_version_num>%d</app_version_num>`
			`<appid>%d</appid>`
			`<exit_status>%d</exit_status>`
			`<teamid>%d</teamid>`
			`<priority>%d</priority>`
			`<mod_time>%s</mod_time>`
			`</result_archive>`
			`")."`
			`The format of a record in the WU archive file is:`
			`".html_text("`
			`<workunit_archive>`
			`<id>%d</id>`
			`<create_time>%d</create_time>`
			`<appid>%d</appid>`
			`<name>%s</name>`
			`<xml_doc>%s</xml_doc>`
			`<batch>%d</batch>`
			`<rsc_fpops_est>%.15e</rsc_fpops_est>`
			`<rsc_fpops_bound>%.15e</rsc_fpops_bound>`
			`<rsc_memory_bound>%.15e</rsc_memory_bound>`
			`<rsc_disk_bound>%.15e</rsc_disk_bound>`
			`<need_validate>%d</need_validate>`
			`<canonical_resultid>%d</canonical_resultid>`
			`<canonical_credit>%.15e</canonical_credit>`
			`<transition_time>%d</transition_time>`
			`<delay_bound>%d</delay_bound>`
			`<error_mask>%d</error_mask>`
			`<file_delete_state>%d</file_delete_state>`
			`<assimilate_state>%d</assimilate_state>`
			`<hr_class>%d</hr_class>`
			`<opaque>%f</opaque>`
			`<min_quorum>%d</min_quorum>`
			`<target_nresults>%d</target_nresults>`
			`<max_error_results>%d</max_error_results>`
			`<max_total_results>%d</max_total_results>`
			`<max_success_results>%d</max_success_results>`
			`<result_template_file>%s</result_template_file>`
			`<priority>%d</priority>`
			`<mod_time>%s</mod_time>`
			`</workunit_archive>`
			`")."`
* empty log message * svn path=/trunk/boinc/; revision=4948 2004-12-27 22:52:55 +00:00			`";`
			`page_tail();`
			`?>`