ompi-restart

Language: en

Version: Jan 14, 2010 (ubuntu - 24/10/10)

Section: 1 (User commands)

NAME

ompi-restart, orte-restart - Restart a previously checkpointed parallel job using the Open PAL Checkpoint/Restart Service (CRS)

NOTE: ompi-restart and orte-restart are exact synonyms for each other. Using either name results in identical behavior.

SYNOPSIS

ompi-restart [ options ] <GLOBAL SNAPSHOT HANDLE>

OPTIONS

ompi-restart attempts to restart a previously checkpointed parallel job from the global snapshot handle reference returned by ompi-checkpoint.
<GLOBAL SNAPSHOT HANDLE>
The global snapshot handle reference returned by ompi-checkpoint, used to restart the job. It must be the last argument to this command.
-h | --help
Display help for this command
-p | --preload
Preload the checkpoint files on the remote systems before restarting the application. Disabled by default.
--fork
Fork a new process to run the restarted job. By default, the restarted process replaces ompi-restart.
-s | --seq
The sequence number of the checkpoint to restart from. By default (a value of -1), the most recent sequence number is used.
-hostfile | --hostfile
The hostfile from which to restart the application. Useful in unscheduled environments. (Same behavior as --machinefile option)
-machinefile | --machinefile
The machinefile from which to restart the application. Useful in unscheduled environments. (Same behavior as --hostfile option)
-v | --verbose
Enable verbose output for debugging.
-gmca | --gmca <key> <value>
Pass global MCA parameters that are applicable to all contexts. <key> is the parameter name; <value> is the parameter value.
-mca | --mca <key> <value>
Send arguments to various MCA modules.
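
As an illustration of how the options above combine, the following is a minimal sketch of a shell helper that assembles an ompi-restart command line and prints it instead of running it. The function name build_restart_cmd, the handle ompi_global_snapshot_1234.ckpt, and the file hosts.txt are hypothetical placeholders, not names taken from this page. The only documented behavior it relies on is that the global snapshot handle must come last.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): builds an ompi-restart
# command line from the options documented above and prints it (dry run).
build_restart_cmd() {
    handle="$1"      # global snapshot handle; must be the last argument
    seq="$2"         # checkpoint sequence number, or empty for the default
    hostfile="$3"    # hostfile path, or empty to omit --hostfile

    cmd="ompi-restart"
    if [ -n "$seq" ]; then
        cmd="$cmd --seq $seq"
    fi
    if [ -n "$hostfile" ]; then
        cmd="$cmd --hostfile $hostfile"
    fi
    # The handle is appended after all options.
    echo "$cmd $handle"
}

# Placeholder values for illustration only.
build_restart_cmd ompi_global_snapshot_1234.ckpt 2 hosts.txt
```

Printing the command rather than executing it makes it easy to check the argument order before restarting a real job.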

DESCRIPTION

ompi-restart can be invoked multiple times, as long as the invocations do not overlap. This allows the user to restart a previously running parallel job.
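
One way to keep repeated invocations from overlapping is to run them one after another in the foreground. The sketch below assumes exactly that. The function restart_sequentially and the RESTART_CMD override variable are hypothetical and exist only so the loop can be tried as a dry run (for example with RESTART_CMD=echo).

```shell
#!/bin/sh
# Assumption: RESTART_CMD may be overridden for a dry run; it defaults to
# the real command documented on this page.
RESTART_CMD="${RESTART_CMD:-ompi-restart}"

# Hypothetical helper: restart the same snapshot a given number of times,
# strictly one invocation after another.
restart_sequentially() {
    handle="$1"      # global snapshot handle
    attempts="$2"    # number of sequential restarts
    i=1
    while [ "$i" -le "$attempts" ]; do
        # Runs in the foreground, so the next invocation starts only
        # after this one has exited. Stop early if a restart fails.
        $RESTART_CMD "$handle" || return 1
        i=$((i + 1))
    done
}
```

For example, `restart_sequentially ompi_global_snapshot_1234.ckpt 2` would issue two back-to-back restarts of that (placeholder) snapshot handle.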

SEE ALSO


  orte-ps(1), orte-clean(1), ompi-checkpoint(1), opal-checkpoint(1), opal-restart(1), opal_crs(7)