In mainframe projects, a batch job will sometimes fail. The cause can be anything, from a logic error in the program to a system problem.
This post explains how a job or job step can be restarted, the logic behind it, and how to code it in JCL. It also covers what happens when a failed batch job is not restarted.
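RESTART Parameter Can Be Used Only on the JOB Statement
- The RESTART parameter can be coded only on the JOB statement
- It therefore handles restarts at the job level: you resubmit the failed job and name the step at which execution should resume
- It cannot be coded on an EXEC statement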
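As a minimal sketch of the points above, the job below is resubmitted after a failure so that execution resumes at its second step. The job name, accounting field, step names, and program names are all hypothetical and only illustrate where the parameter goes.

```jcl
//* Hypothetical job: every job, step, and program name is made up.
//* RESTART is coded on the JOB statement only, never on EXEC.
//PAYJOB   JOB (ACCT01),'DAILY PAYROLL',CLASS=A,MSGCLASS=X,
//             RESTART=STEP020
//* STEP010 completed in the failed run, so it is skipped this time.
//STEP010  EXEC PGM=PAYEXTR
//* Execution resumes here.
//STEP020  EXEC PGM=PAYCALC
//STEP030  EXEC PGM=PAYRPT
```

Remove the RESTART parameter again before the next normal run, otherwise the job will keep skipping the earlier steps.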
How to handle job-step failure restarts
- The RD (restart definition) parameter can be coded on both the JOB and EXEC statements
- RD can handle both job failures and system failures
- When RD=R or RD=RNC is coded, the system performs an automatic restart of the job or job step
| RD value | Meaning |
|---|---|
| R | Restart, checkpoints allowed. This option allows the system to automatically restart the execution of a job or job step from the beginning or from the last checkpoint. |
| RNC | Restart, no checkpoints. This option allows the system to perform an automatic step restart when the job or job step fails. Automatic and deferred checkpoint restarts aren’t allowed. |
| NR | No automatic restart, checkpoints allowed. This option suppresses automatic restarts but permits deferred checkpoint restarts. |
| NC | No automatic restart, no checkpoints. This option indicates that the system can’t perform an automatic step restart if the job or job step fails and that checkpoint restarts aren’t allowed. |
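The RD values above might be coded at step level as in the following sketch. The job, step, and program names are hypothetical. RD is deliberately left off the JOB statement here, because an RD coded on the JOB statement applies to every step and overrides any RD coded on the EXEC statements.

```jcl
//* Hypothetical job: every job, step, and program name is made up.
//RDJOB    JOB (ACCT01),'RD DEMO',CLASS=A,MSGCLASS=X
//* Automatic restart allowed, from step start or last checkpoint.
//STEP010  EXEC PGM=LOADDATA,RD=R
//* Automatic step restart allowed, but no checkpoints are taken.
//STEP020  EXEC PGM=UPDTMAST,RD=RNC
//* No automatic restart; deferred checkpoint restart still possible.
//STEP030  EXEC PGM=RPTBUILD,RD=NR
```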
Checkpoint Handling in JCL along with COBOL/PL/I
- The CHKPT parameter on the DD statement asks the system to write checkpoints while the data set is processed
- With CHKPT=EOV, a checkpoint is written each time the end of a volume is reached during file processing. After a restart, the system reads the entry in the checkpoint data set and continues from that point.
- In COBOL, code the RERUN clause in the I-O-CONTROL paragraph and assign it a DD name. Checkpoints are then written to the data set defined by that DD name.
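A step that reads a multivolume tape data set with end-of-volume checkpoints could look roughly like this. The data set names, volume serials, the UNIT name TAPE, and the program name are assumptions for illustration. The sketch also assumes the checkpoint data set is supplied through a SYSCKEOV DD statement with DISP=MOD, which is how the system expects it for CHKPT=EOV; check your installation's standards before copying it.

```jcl
//* Hypothetical step: data set, volume, and program names are made up.
//STEP010  EXEC PGM=BIGREAD
//* Checkpoint data set that receives the end-of-volume checkpoints.
//SYSCKEOV DD  DSN=PROD.CHKPT.EOV,DISP=MOD
//* A checkpoint is written each time a volume of INFILE ends.
//INFILE   DD  DSN=PROD.HISTORY.TAPE,DISP=OLD,
//             UNIT=TAPE,VOL=SER=(VOL001,VOL002,VOL003),
//             CHKPT=EOV
```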
How a Job Can Restart Automatically in JES2 and JES3
If the system fails, you can have a job restarted automatically by coding a JES control statement.
In a JES2 environment, you use the RESTART parameter on a /*JOBPARM statement to queue the job for re-execution after a system IPL. If a RESTART parameter isn’t coded, N is assumed unless the installation overrides the default during JES2 initialization.
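For JES2, a minimal sketch of the statement described above follows. The job, step, and program names are hypothetical; the /*JOBPARM statement goes after the JOB statement.

```jcl
//* Hypothetical job: every job, step, and program name is made up.
//NIGHTLY  JOB (ACCT01),'NIGHTLY LOAD',CLASS=A,MSGCLASS=X
//* RESTART=Y queues the job for re-execution after a system IPL.
/*JOBPARM  RESTART=Y
//STEP010  EXEC PGM=LOADDATA
```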
In a JES3 environment, you use the FAILURE parameter on a //*MAIN statement to indicate the recovery option to use if the system fails. If the FAILURE parameter isn’t coded, a default failure option will be assigned based on job class.
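For JES3, the equivalent request might be sketched as follows, again with hypothetical job, step, and program names. FAILURE=RESTART is one of the recovery options; CANCEL, HOLD, and PRINT are the others.

```jcl
//* Hypothetical job: every job, step, and program name is made up.
//NIGHTLY  JOB (ACCT01),'NIGHTLY LOAD',CLASS=A,MSGCLASS=X
//* Ask JES3 to restart the job if the system fails.
//*MAIN FAILURE=RESTART
//STEP010  EXEC PGM=LOADDATA
```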
What happens when you do not restart
- All downstream systems are impacted
- The failure is escalated as a production issue
- Clients and users face a significant business impact