Notes:
1) We still need to read all the data from the socket (copy_socket_to_null());
otherwise the client will block on the send and never get the success reply.
The previous approach (read-only file) didn't do this.
All we're saving is disk I/O on the server.
2) The client reports a result only after it knows that
all its output files have been uploaded successfully.
It won't re-upload anything if that's the case.
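
A minimal Python sketch of the draining step in note 1, for illustration only. The name copy_socket_to_null() comes from the note; the signature, the chunk_size parameter and the return value are assumptions, and the server's own implementation is not shown here.

```python
import socket


def copy_socket_to_null(sock, nbytes, chunk_size=64 * 1024):
    """Read and discard up to nbytes from sock.

    Returns the number of bytes actually consumed, which is less than
    nbytes only if the peer closed the connection early.
    """
    remaining = nbytes
    while remaining > 0:
        data = sock.recv(min(chunk_size, remaining))
        if not data:  # peer closed the connection before sending everything
            break
        remaining -= len(data)
    return nbytes - remaining
```

Because the data is consumed rather than left in the socket, the sender's write completes and it can go on to read the reply; nothing is written to disk.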
Except for very specific cases, strncpy() should never be used.
It can result in a non-terminated string.
Also replace strncat() with strlcat(); the latter is simpler
because you don't have to calculate remaining buffer space.
This broke other things (e.g. get_file_size on that file).
We could accomplish the same thing in a cleaner way,
i.e. notice the file is already there and of the right size.
The file upload handler checked for ".." in the filename.
Also check for control chars and for starting with /.
Put this into a separate function, is_valid_filename().
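
The three checks can be sketched as follows. This is a Python illustration of the rules listed above, not the upload handler's actual code; treating DEL (127) as a control character is an assumption.

```python
def is_valid_filename(name):
    """Return True if name passes the three checks described above."""
    if name.startswith("/"):
        return False  # absolute paths are rejected
    if ".." in name:
        return False  # any ".." is rejected, as in the original check
    # Control characters: codes below 32, plus DEL (an assumption here).
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in name):
        return False
    return True
```

Note that the ".." test is a plain substring match, so a name like "a..b" is rejected as well.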
Otherwise, result file names can be inferred from result names.
An attacker with task A could find the name of the "wingman" task B,
upload fake files as B's output files,
upload the same files as A's output files,
report A as completed, and get unearned credit.
An issue with unicode strings in Python 2.4 and 2.6 (and possibly 2.5) prevents shlex from splitting the command, which leads to the daemon or task not starting. The unicode issue seems to be fixed in Python 2.7. The exact error message is: "TypeError: execv() argument 1 must be (encoded string without NULL bytes), not str".
See: https://github.com/vinodc/gitlab-webhook-branch-deployer/issues/1
This opens the validator up to a result name spoofing attack where a bogus client can claim it processed the result reported by a different client for the same workunit.
If the printf() or close() calls change errno, the original lseek() error is lost. The logged error would then differ from the message sent to the client. This amends 005957a.
Suggested by Juha Sointusalo
- If a daemon or task should run in a shell, add <use_shell>1</use_shell> to the task entry in config.xml.
  This spawns a "sh -c cmd" process that propagates signals to the child process (see 881863d).
- If a daemon or task has to use a shell (pipe or redirection present in cmd) and <use_shell> is not enabled,
  the cmd is not executed and an error message is printed (other daemons and tasks are still started).
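
The two rules above can be sketched in Python roughly as follows. The names needs_shell() and plan_launch() are hypothetical, the quote scanner is a simplification, and the plain "sh -c" argv omits the signal propagation added in 881863d; this is not the start script's actual code.

```python
import shlex

SHELL_CHARS = "|<>"


def needs_shell(cmd):
    """Return True if cmd contains an unquoted, unescaped |, < or >."""
    quote = None     # the open quote character, or None outside quotes
    escaped = False  # True if the previous character was a backslash
    for ch in cmd:
        if escaped:
            escaped = False          # this character is escaped; skip it
        elif ch == "\\" and quote != "'":
            escaped = True           # no escaping inside single quotes
        elif quote:
            if ch == quote:
                quote = None         # closing quote
        elif ch in "'\"":
            quote = ch               # opening quote
        elif ch in SHELL_CHARS:
            return True
    return False


def plan_launch(cmd, use_shell):
    """Return (argv, error); exactly one of the two is None."""
    if use_shell:
        return ["sh", "-c", cmd], None
    if needs_shell(cmd):
        return None, "needs a shell, add <use_shell>1</use_shell>: " + cmd
    return shlex.split(cmd), None
```

A caller would log the error and continue with the next entry, so one misconfigured command does not stop the other daemons and tasks from starting.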
The for loop copies newly created objects into the vector and destroys the original objects. The resize() instantiates the objects directly in the vector. Suggested by Nicolás Alvarez.
If the command of a task or daemon uses shell features like |, > or <, the start script wraps it in a shell (sh -c) to start the process.
This had two problems:
1. It also started a shell if the command contained ' or " and didn't check whether |, > or < were escaped or used within quotes (e.g. as part of a regular expression). The new mechanism uses the Python module shlex to prepare the arguments for the execvp() call. It also detects whether a shell encapsulation is needed and informs the user about it.
2. The actual daemon or task is a subprocess of the shell and was not terminated with the parent. The new signal propagation mechanism properly kills the daemon or task if the shell receives a signal to do so (e.g. by stop).
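
One way to picture the signal propagation in item 2 is a parent that relays termination signals to its child. The sketch below is an assumption about the general technique, written in Python for POSIX systems; the names make_forwarder() and run_with_propagation() are hypothetical, and the mechanism actually used by the start script may differ.

```python
import signal
import subprocess


def make_forwarder(child):
    """Build a signal handler that relays the received signal to child."""
    def handler(signum, frame):
        if child.poll() is None:  # only signal a child that is still running
            child.send_signal(signum)
    return handler


def run_with_propagation(argv):
    """Start argv, relay SIGTERM and SIGINT to it, return its exit status."""
    child = subprocess.Popen(argv)
    handler = make_forwarder(child)
    for signum in (signal.SIGTERM, signal.SIGINT):
        signal.signal(signum, handler)  # must be called from the main thread
    return child.wait()
```

With this in place, stopping the parent also stops the daemon or task instead of leaving it running as an orphan.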
Coverity flagged this as suspicious because sizeof(buf) is always 8 in the strlcpy() call. Since the string being copied is fixed, we can assume that buf has enough space and use strcpy().
fixes CID 27756 found by Coverity