Spring Boot provides a simple way to set up cron jobs in an application using Spring Scheduler. The application can be easily packaged into a container and deployed on Kubernetes, and it works like a charm.
All our production applications use Spring Scheduler to run their cron jobs. One caveat is that the application can run on only one pod, because Spring Scheduler cannot synchronize scheduling across multiple instances. This reduces the resiliency, and in turn the reliability, of the cron jobs.
How do we get around this?
Run multiple pod replicas
ShedLock comes in handy here: it synchronizes jobs across multiple instances, ensuring that a given scheduled job runs on only one instance at a time.
I found this article by Baeldung very useful for getting started with ShedLock in a Spring Boot project.
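To make the idea concrete, here is a minimal, in-memory sketch of the kind of lock that lets only one instance run a job per tick. It is not ShedLock's API: the class `LockSketch` and its methods are hypothetical, the map stands in for a shared store such as a database table, and the lock is simply held until it expires (early release is not modelled).

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified illustration of a scheduler lock shared by several instances.
// Hypothetical names; a real setup would keep this state in a shared store.
class LockSketch {
    // lock name -> instant until which the lock is considered held
    private final Map<String, Instant> lockedUntil = new ConcurrentHashMap<>();

    // Atomically takes the lock if it is free or already expired.
    boolean tryLock(String name, Instant now, Duration lockAtMostFor) {
        boolean[] acquired = {false};
        lockedUntil.compute(name, (key, until) -> {
            if (until == null || !until.isAfter(now)) {
                acquired[0] = true;
                return now.plus(lockAtMostFor); // hold the lock until this instant
            }
            return until; // someone else still holds it
        });
        return acquired[0];
    }

    // Runs the task only when the lock was acquired; returns whether it ran.
    boolean runIfLockFree(String name, Instant now, Duration lockAtMostFor, Runnable task) {
        if (!tryLock(name, now, lockAtMostFor)) {
            return false; // another instance is already running this job
        }
        task.run();
        return true;
    }
}
```

If two pods call `runIfLockFree` for the same job name on the same tick, only the first call runs the task; the second is skipped until the lock expires.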
It’s not perfect yet…
Now we are able to run multiple pod replicas on K8s, and all looks good. Alas, one fine day we noticed an issue during a planned Kubernetes upgrade, in which the admins drained the cluster nodes one after another. Unfortunately, both of our cron service pods were scheduled on the same node, so they went down at the same time and the scheduled jobs didn’t run.
To mitigate this, we tried out a Kubernetes feature called [Pod Disruption budget](kubernetes.io/docs/tasks/run-application/co.. "kubernetes.io/docs/tasks/run-application/co..") (PDB), which limits voluntary disruptions such as admin-initiated node drains and ensures that a minimum number of pods keeps running while they happen.
With this PDB in place, at least one pod is guaranteed to stay running in the K8s cluster during voluntary disruptions such as node drains.
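The check behind that guarantee is simple arithmetic, sketched below. This is only an illustration of the rule, assuming a budget expressed as a minimum number of available pods; `PdbSketch` and its methods are hypothetical names, not a Kubernetes API.

```java
// Illustrative arithmetic only; hypothetical helper, not a Kubernetes client call.
class PdbSketch {
    // How many pods may be voluntarily evicted right now.
    static int disruptionsAllowed(int healthyPods, int minAvailable) {
        return Math.max(0, healthyPods - minAvailable);
    }

    // An eviction goes ahead only if the budget still has room for it.
    static boolean evictionAllowed(int healthyPods, int minAvailable) {
        return disruptionsAllowed(healthyPods, minAvailable) > 0;
    }
}
```

With two healthy pods and a minimum of one, the first eviction during a drain is allowed. The second is refused while only one pod is healthy, so the drain waits until a replacement pod is ready on another node.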
All the jobs in our production environment are now running in a resilient way :)