THIS FILE IS DEPRECATED
This document describes the core client's policies for managing disk space. The goals are the following (highest to lowest priority):
Total_Limit: disk usage limit as determined by prefs, disk size, and non-BOINC usage
Core_Usage: space currently being used by the core client
all_Projects_limit [aPL] = Total_Limit - Core_Usage; this is the space that projects can use
Project_usage(p) = the size of all files associated with a project p; returns project->size in most cases
all_Project_usage [aPu] = the sum of Project_usage(p) over all projects p
Project_limit(p) = aPL * resource_share(p)
The free space on a client is determined by
free_space = all_Projects_limit - all_Project_usage
A project is an offender if
Project_usage(p) - Project_limit(p) > 0
The greatest offender is the project with
max {Project_usage(p) - Project_limit(p)} for all p.
The client will always try to remove files from the greatest offender first before querying other projects.
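The definitions above can be sketched in C++ as follows. This is only an illustration, not the client's actual code: the `Project` struct and the function names are hypothetical, and `resource_share` is assumed to be a fraction between 0 and 1 that sums to 1 across projects.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the client's PROJECT record.
struct Project {
    double size;            // Project_usage(p), in bytes
    double resource_share;  // assumed to be a fraction of the whole (0..1)
};

// Project_limit(p) = aPL * resource_share(p)
double project_limit(const Project& p, double all_projects_limit) {
    return all_projects_limit * p.resource_share;
}

// free_space = all_Projects_limit - all_Project_usage
double free_space(const std::vector<Project>& projects, double all_projects_limit) {
    double usage = 0.0;
    for (const Project& p : projects) usage += p.size;
    return all_projects_limit - usage;
}

// Returns the index of the project with the largest positive overage
// (usage minus limit), or -1 if no project is an offender.
int greatest_offender(const std::vector<Project>& projects, double all_projects_limit) {
    int worst = -1;
    double worst_overage = 0.0;
    for (std::size_t i = 0; i < projects.size(); ++i) {
        double overage = projects[i].size - project_limit(projects[i], all_projects_limit);
        if (overage > worst_overage) {
            worst_overage = overage;
            worst = static_cast<int>(i);
        }
    }
    return worst;
}
```

With an aPL of 1000 bytes, a project using 400 bytes on a 25% share exceeds its 250-byte limit and is the greatest offender, while a project using 600 bytes on a 75% share is within its 750-byte limit.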
Consider a system participating in two projects, A and B, with resource shares of 75% and 25%, respectively. Once the available aPL has been computed, A receives 75% of the aPL as its project limit and B receives 25%.
If Project_usage(A) < Project_limit(A), Project A is not using all of Project_limit(A). B should therefore be able to use the difference, Project_limit(A) - Project_usage(A), if it needs to. This applies in any situation where a project is not using all of its Project_limit(p). The unused space will show up as free_space in most cases.
When A wants to add a new file, if adding the file would cause all_Project_usage >= aPL, a project must delete a file first. If A is not an offender, then files are deleted from an offender, in this case, Project B. Files will be deleted from B as described in Adding a File to a Project below.
When the BOINC client requests work, it tells the scheduling server how much free space it has, as well as how much space it could potentially free. It is then the project's decision how much work to award the client based on these values. When the files are downloaded, others are deleted as described below.
Because of the way BOINC calculates the program size, this kind of error is unavoidable. The client will notice that BOINC has violated its disk share and will take the necessary steps described above (see first priority in Maintaining Project Disk Share). The user will have to find the problematic file and remove it in order for BOINC to reclaim that space. Deleted files will not be restored.
The client will try to create as much space for these temporary files as it can, to the point where all files except those used by a running computation are deleted. Because BOINC currently gives computation priority over storage, this is a very bad situation: the client will delete all files in all other projects to ensure the work continues. If the total space used by the BOINC client is still larger than the preferences allow, the client will suspend all activities and notify the user of the issue. The user can either drop the offending project or increase the disk space allotted to BOINC in their preferences.
The algorithm is run whenever:
The client will first attempt to use all the PDS that is free. When all the space for BOINC projects is used by some combination of projects, files must be deleted to make space.
The client maintains the project share sizes and workunit queues by:
The client's method of creating free_space ensures the above:
PROJECT:
    double size
    double share_size
    double resource_share

FILE_INFO:
    double nbytes

get_more_disk_space(): for some PROJECT p and space_needed = number of bytes required
    init total_space = 0
    total_space = free space in the project disk size
    if (total_space > space_needed):
        return true
    mark all projects as unchecked
    while (total_space < space_needed):
        g_of = the greatest_offender
        if (couldn't find one or g_of == p):
            increase priority to delete from
            if can't increase anymore, return false
        mark g_of as checked
        only delete low priority files up to the point when it isn't an offender
    return true

associate_file(): for some FILE_INFO fip and a PROJECT p
    init space_made = 0
    // check offenders
    if (get_more_disk_space(p, fip->nbytes)):
        p.size += fip->nbytes
        return true
    // check self
    is there any free space?
    try and delete expendable files from p
    if (space_made > fip->nbytes):
        return true
    // if it hasn't returned true yet, failure
    return false
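The get_more_disk_space() pseudocode above can be made concrete along the following lines. This is a simplified sketch under stated assumptions rather than the client's implementation: the `FileInfo` and `Project` structs, the `priority` field, and the helper `trim_offender` are hypothetical, priorities are assumed to run from 1 to 5, and the requesting project is simply skipped when looking for offenders.

```cpp
#include <cstddef>
#include <vector>

struct FileInfo {
    double nbytes;
    int priority;  // assumed range: 1 = lowest, 5 = highest
};

struct Project {
    double share_size;  // Project_limit(p), in bytes
    std::vector<FileInfo> files;

    double size() const {
        double total = 0.0;
        for (const FileInfo& f : files) total += f.nbytes;
        return total;
    }
    double overage() const { return size() - share_size; }
};

// Deletes files of priority <= max_priority from p until p stops being an
// offender or `needed` bytes have been freed. Returns the bytes freed.
double trim_offender(Project& p, int max_priority, double needed) {
    double freed = 0.0;
    std::size_t i = 0;
    while (i < p.files.size() && p.overage() > 0.0 && freed < needed) {
        if (p.files[i].priority <= max_priority) {
            freed += p.files[i].nbytes;
            p.files.erase(p.files.begin() + static_cast<std::ptrdiff_t>(i));
        } else {
            ++i;
        }
    }
    return freed;
}

bool get_more_disk_space(std::vector<Project>& projects, std::size_t requester,
                         double all_projects_limit, double space_needed) {
    double used = 0.0;
    for (const Project& p : projects) used += p.size();
    double total_space = all_projects_limit - used;

    // Raise the deletable priority one step at a time until enough is freed.
    for (int max_priority = 1; max_priority <= 5 && total_space < space_needed; ++max_priority) {
        std::vector<bool> checked(projects.size(), false);
        while (total_space < space_needed) {
            // Find the greatest unchecked offender other than the requester.
            int g_of = -1;
            for (std::size_t i = 0; i < projects.size(); ++i) {
                if (i == requester || checked[i] || projects[i].overage() <= 0.0) continue;
                if (g_of < 0 || projects[i].overage() > projects[static_cast<std::size_t>(g_of)].overage())
                    g_of = static_cast<int>(i);
            }
            if (g_of < 0) break;  // nothing left at this priority level
            checked[static_cast<std::size_t>(g_of)] = true;
            total_space += trim_offender(projects[static_cast<std::size_t>(g_of)],
                                         max_priority, space_needed - total_space);
        }
    }
    return total_space >= space_needed;
}
```

Because an offender is only trimmed down to its own limit, a project that stays within its share never loses files to another project's request in this sketch.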
This checking is done in the data_manager_poll() which is placed at the top of the client's FSM hierarchy. If there is no space violation, it takes no action and returns false. If BOINC is larger than the Total_Limit, the client will reduce Project_usages using the following method:
If all three conditions fail, all computation is suspended, a message is sent to the user explaining the problem, and the function returns true. If at any time during the function the total used space falls below the allowed disk usage set by the user preferences, the function returns false.
In conjunction with the CPU Scheduler's work fetch policy, the data manager's work fetch policy's goals are to:
When an RPC request is made, the client communicates to the server the values described above and the server can make a decision on how much work to send to the client.
anything_free():
    init total_size = 0
    foreach p in projects:
        total_size += p.size
    get project disk size
    free space = project disk size - total_size
    return free space

// get the total number of bytes that would be free
// if files with priority < pr were deleted from all other projects
// and low priority files were deleted from this project
//
total_potentially_free(): for some project my_p and some priority pr
    init tps = anything_free()
    ref_count all files in file_infos
    foreach p in projects:
        if (p != my_p):
            tps += potentially free space from p with priority less than pr
    foreach fip in file_infos:
        if (fip.project == my_p, is permanent, and not part of a computation):
            and if (fip has lowest priority or is expired):
                tps += fip->nbytes
    return tps

potentially_free(): for a project p and some priority pr
    init tps = 0
    if it is not an offender:
        return 0
    foreach fip in file_infos:
        if (fip.project == p, is permanent, and not part of a computation):
            and if (fip.priority <= pr or is expired):
                tps += fip->nbytes
    return tps
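As a rough illustration of the pseudocode above, the two potentially-free calculations might look like this in C++. The `FileInfo` fields, the `offender` flags passed in by the caller, and the use of project indices are all assumptions made for the sketch; the lowest priority is assumed to be 1.

```cpp
#include <cstddef>
#include <vector>

struct FileInfo {
    int project;          // index of the owning project (hypothetical)
    double nbytes;
    int priority;         // assumed: 1 is the lowest priority
    bool permanent;       // a stored file rather than a temporary one
    bool in_computation;  // currently needed by a running computation
    bool expired;
};

// Bytes that could be reclaimed from one project by deleting permanent,
// idle files that are expired or have priority <= pr. Only offenders give
// up space.
double potentially_free(const std::vector<FileInfo>& file_infos, int project,
                        int pr, bool is_offender) {
    if (!is_offender) return 0.0;
    double tps = 0.0;
    for (const FileInfo& f : file_infos) {
        if (f.project == project && f.permanent && !f.in_computation &&
            (f.priority <= pr || f.expired)) {
            tps += f.nbytes;
        }
    }
    return tps;
}

// Free space plus what other projects could give up at priority < pr,
// plus lowest-priority or expired files of the requesting project itself.
double total_potentially_free(const std::vector<FileInfo>& file_infos,
                              const std::vector<bool>& offender, int my_p,
                              int pr, double anything_free) {
    double tps = anything_free;
    for (std::size_t p = 0; p < offender.size(); ++p) {
        if (static_cast<int>(p) != my_p) {
            tps += potentially_free(file_infos, static_cast<int>(p), pr - 1, offender[p]);
        }
    }
    for (const FileInfo& f : file_infos) {
        if (f.project == my_p && f.permanent && !f.in_computation &&
            (f.priority == 1 || f.expired)) {
            tps += f.nbytes;
        }
    }
    return tps;
}
```

Files that are part of a running computation are never counted, which matches the document's rule that computation takes priority over storage.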
There are three types of deletion policy that a project can specify in its config.xml
The policies are invoked by including the following in the config.xml file
<deletion_policy_priority/>
<deletion_policy_expire/>
- the LRU policy is embedded in the core client as the default
If any of these flags are present in the config.xml, a similar tag will be included in a successful RPC request.
A FILE_INFO, when created, has the following default values related to a project's deletion policy. These values are created for every file.
priority = P_LOW;
time_last_used = time now
exp_date = 60 days from now
where P_LOW is defined in client_types.h as the following
#define P_LOW 1
#define P_MEDIUM 3
#define P_HIGH 5
If the defaults are used, files are not guaranteed to survive more than sixty days when <deletion_policy_expire/> is set.
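The defaults and the sixty-day expiry can be illustrated with a small C++ sketch. The struct layout and the helper names `make_file_info` and `is_expired` are hypothetical, and the sketch assumes `std::time_t` counts seconds, as it does on POSIX systems.

```cpp
#include <ctime>

// Same values as the P_LOW / P_MEDIUM / P_HIGH macros quoted above.
constexpr int kPriorityLow = 1;
constexpr int kPriorityMedium = 3;
constexpr int kPriorityHigh = 5;

constexpr long kSecondsPerDay = 24L * 60L * 60L;
constexpr long kDefaultExpDays = 60;

// Hypothetical subset of FILE_INFO holding only the deletion-policy fields.
struct FileInfo {
    int priority;
    std::time_t time_last_used;
    std::time_t exp_date;
};

// Builds a FileInfo with the default deletion-policy values.
FileInfo make_file_info(std::time_t now) {
    FileInfo f;
    f.priority = kPriorityLow;
    f.time_last_used = now;
    f.exp_date = now + static_cast<std::time_t>(kDefaultExpDays * kSecondsPerDay);
    return f;
}

// A file becomes eligible for expiry-based deletion once exp_date is reached.
bool is_expired(const FileInfo& f, std::time_t now) {
    return now >= f.exp_date;
}
```

A caller would typically pass `std::time(nullptr)` as `now`; taking the time as a parameter keeps the sketch easy to test.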
If a priority or exp_date other than the default is required, it must be set when the workunit is created. Including the following tags in a workunit or result template replaces the default information.
<priority>(int; 1-5)</priority>
<exp_days>(int; # of days to keep)</exp_days>
The client communicates three values of disk usage to the server.
The server will assign workunits normally using the first amount. If no workunits were assigned, a second pass of the database is made using the second amount. If no workunits were assigned and the following is in the config.xml:
<delete_from_self/>
the third amount of free_space is used and a third pass of the database is made. The server then returns whatever workunits were deemed acceptable for the host.
Under most circumstances, the amount of free_space will be enough to get workunits for a project. If a project has larger workunits (> 1 GB) or the host is storing many files for a project, the second and third amounts become more important. The amount of free_space if files are deleted is found by:
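The three-pass selection can be sketched as follows. This is an assumption-laden illustration, not the scheduler's code: `select_pass` stands in for a database pass with a simple greedy fit, `amounts` holds the three disk values reported by the client in the order the server tries them, and `delete_from_self` mirrors the presence of the config.xml flag.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// One pass: pick workunits (by index) in order while they fit in `budget`.
std::vector<std::size_t> select_pass(const std::vector<double>& wu_bytes, double budget) {
    std::vector<std::size_t> chosen;
    for (std::size_t i = 0; i < wu_bytes.size(); ++i) {
        if (wu_bytes[i] <= budget) {
            chosen.push_back(i);
            budget -= wu_bytes[i];
        }
    }
    return chosen;
}

// Tries the first amount, then the second, and the third only if the
// project has enabled delete_from_self. Later passes run only when the
// earlier ones assigned nothing.
std::vector<std::size_t> assign_work(const std::vector<double>& wu_bytes,
                                     const std::array<double, 3>& amounts,
                                     bool delete_from_self) {
    std::vector<std::size_t> chosen = select_pass(wu_bytes, amounts[0]);
    if (chosen.empty()) chosen = select_pass(wu_bytes, amounts[1]);
    if (chosen.empty() && delete_from_self) chosen = select_pass(wu_bytes, amounts[2]);
    return chosen;
}
```

A later pass is only attempted when the previous one produced no workunits, so a host with enough plain free space never triggers deletions on the client.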
There is currently a method for requesting a list of files from the project. There needs to be a way to communicate the information back to the project, such as an xml doc that can be parsed by the project.
There also needs to be a database, separate from the scheduling database, which keeps track of the files on host's clients.