Locality scheduling is intended for projects for which:

The goal of locality scheduling is to minimize the amount of data transferred to hosts. When sending work to a given host, the scheduler tries to send results that use input files already on the host.
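The preference described above can be illustrated with a small sketch. This is not the BOINC scheduler's code: the dictionary layout, the `input_file` key, and the function name `order_by_locality` are assumptions made for illustration, and the sketch only orders candidates rather than deciding what is finally sent.

```python
def order_by_locality(unsent_results, host_files):
    """Order candidate results so those whose input file is on the host come first.

    unsent_results: list of dicts with a hypothetical "input_file" key.
    host_files: set of file names the host reports as already present.
    """
    # The key is False (sorts first) for files already on the host.
    # sorted() is stable, so the original order is kept within each group.
    return sorted(unsent_results, key=lambda r: r["input_file"] not in host_files)


if __name__ == "__main__":
    unsent = [
        {"name": "r1", "input_file": "A"},
        {"name": "r2", "input_file": "B"},
        {"name": "r3", "input_file": "A"},
    ]
    for result in order_by_locality(unsent, {"B"}):
        print(result["name"], result["input_file"])
```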

To use locality scheduling, projects must do the following:

Locality scheduling works as follows:

On-demand work generation

This mechanism, which is used in conjunction with locality scheduling, lets a project create work in response to scheduler requests rather than creating all work ahead of time. It is controlled by an element in config.xml whose value N is a number of seconds.

When a host storing file X requests work and no results using X are available, the scheduler touches a 'trigger file':

PROJECT_ROOT/locality_scheduling/need_work/X

The scheduler then sleeps for N seconds and makes one additional attempt to find suitable unsent results.
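The 'notify, sleep, query again' sequence can be sketched as follows. This is an illustration under stated assumptions, not the scheduler's actual implementation: `find_unsent` is a hypothetical callback standing in for the database query, and the function name and parameters are invented here. Only the trigger file path comes from the text above.

```python
import time
from pathlib import Path


def find_results_with_trigger(project_root, file_name, wait_seconds,
                              find_unsent, sleep=time.sleep):
    """Query for unsent results using file_name; if none, notify and retry once.

    find_unsent: hypothetical callable taking a file name and returning a list.
    sleep: injectable so the wait can be skipped in tests.
    """
    results = find_unsent(file_name)
    if results:
        return results

    # No work for this file: touch the trigger file for the work generator.
    trigger = Path(project_root) / "locality_scheduling" / "need_work" / file_name
    trigger.parent.mkdir(parents=True, exist_ok=True)
    trigger.touch()

    # Wait N seconds, then make exactly one additional attempt.
    sleep(wait_seconds)
    return find_unsent(file_name)
```

Because only one retry is made, a request still returns without work for X if the generator and transitioner have not finished within the wait period.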

The project must supply an 'on-demand work generator' daemon program that scans the need_work directory. If it finds an entry, it creates additional workunits for the file, and the transitioner then generates results for these workunits. N should be large enough that both steps usually complete within N seconds (10 seconds is a good estimate).

The work generator should delete the trigger file after creating work.
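A minimal sketch of one scan of the need_work directory is shown below, assuming a hypothetical `create_workunits` callback that does the project-specific work creation. The function name and return value are illustrative; a real daemon would call something like this in a loop with a short pause between scans.

```python
from pathlib import Path


def process_triggers(project_root, create_workunits):
    """Scan need_work once; create work for each trigger file, then delete it.

    create_workunits: hypothetical callable taking the data file name.
    Returns the list of file names that were handled.
    """
    need_work = Path(project_root) / "locality_scheduling" / "need_work"
    handled = []
    if not need_work.is_dir():
        return handled

    for trigger in sorted(need_work.iterdir()):
        if not trigger.is_file():
            continue
        # The trigger file's name is the name of the data file needing work.
        create_workunits(trigger.name)
        # Delete the trigger only after work has been created (Python 3.8+).
        trigger.unlink(missing_ok=True)
        handled.append(trigger.name)
    return handled
```

Deleting the trigger after creating work, rather than before, means a crash in between leaves the trigger in place for the next scan.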

In addition, if the work generator (or some other project daemon) determines that no further workunits can be made for a file X, it can touch a trigger file:

PROJECT_ROOT/locality_scheduling/no_work_available/X

If the scheduler finds this trigger file, it assumes that the project cannot create additional work for this data file and skips the 'notify, sleep, query again' sequence above. It still performs the initial query, so any new results that the transitioner has made for an existing (old) workunit will get picked up.
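Both sides of the no_work_available convention can be sketched as below. The function names are hypothetical and this is not the scheduler's code; only the directory layout is taken from the text.

```python
from pathlib import Path


def _no_work_marker(project_root, file_name):
    # Path of the marker file for one data file.
    return Path(project_root) / "locality_scheduling" / "no_work_available" / file_name


def mark_no_work_available(project_root, file_name):
    """Daemon side: record that no further workunits can be made for file_name."""
    marker = _no_work_marker(project_root, file_name)
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()


def should_wait_for_new_work(project_root, file_name):
    """Scheduler side: decide whether the notify-and-sleep step is worthwhile.

    Returns False when the marker exists, so the caller can skip the wait
    after its initial query finds nothing.
    """
    return not _no_work_marker(project_root, file_name).exists()
```

The check is per file, so marking X as exhausted does not affect requests from hosts holding other files.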