Cronus the Titan

Cron Jobs for Lazy Developers

I am a lazy developer who always prefers not to build things if I can avoid it. Recently I faced an interesting challenge (you can read more about it on the Snyk blog) that required running a set of cron jobs that depend on each other (e.g. job B depends on the output of job A). Let’s see what the laziest solution we can find for this problem looks like!

First iteration: Kubernetes Jobs

The first intuitive solution was to use a Kubernetes CronJob. Assuming you have a Kubernetes cluster, and someone is maintaining it for you, this is a very good solution for lazy developers. Just package your code into a container image, write the manifest (a few lines of YAML) and you’re done! Well, almost… As I said, I needed to run multiple cron jobs that depend on each other. While it is possible to do that with pure Kubernetes objects (for example, running multiple containers in the same job, or triggering another job from the first cron job), those solutions are not easy and require writing a lot of synchronization logic. Projects like Argo Workflows make this problem a lot easier, but (a) we didn’t have it installed (remember: I’m a lazy developer!) and (b) it is still a very complex set of YAML files.
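For reference, the happy path of a single scheduled job really is just a few lines of YAML. Here is a minimal sketch of a CronJob manifest; the image name, schedule and arguments are made up, and on clusters older than 1.21 the apiVersion would be batch/v1beta1 rather than batch/v1:

```yaml
# Minimal CronJob sketch -- image, schedule and args are placeholders.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: job-a
spec:
  schedule: "0 3 * * *"            # every day at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: job-a
              image: example.com/job-a:latest   # placeholder image
              args: ["--run"]
          restartPolicy: OnFailure
```

The catch, as described above, is that nothing in this manifest expresses “run job B only after job A finished successfully” — that part you have to build yourself.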

Second iteration: Serverless?

Using AWS Lambda could be an interesting solution, especially when combined with Step Functions to orchestrate the dependencies between multiple functions. However, I had never worked with it in the past, and the learning curve seemed too high for me.
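To give a sense of what that orchestration looks like, here is a rough sketch of a Step Functions state machine (Amazon States Language written as YAML, the form accepted by tools such as AWS SAM or the serverless-step-functions plugin). The function ARNs are placeholders, and in practice you would still need something like an EventBridge scheduled rule to trigger the state machine on a cron schedule:

```yaml
# Hypothetical state machine: run job B only after job A succeeds.
Comment: Job B depends on the output of job A
StartAt: JobA
States:
  JobA:
    Type: Task
    Resource: arn:aws:lambda:us-east-1:123456789012:function:job-a   # placeholder ARN
    Next: JobB
  JobB:
    Type: Task
    Resource: arn:aws:lambda:us-east-1:123456789012:function:job-b   # placeholder ARN
    End: true
```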

Third iteration: CircleCI

At Snyk, we mainly use CircleCI for a lot of our needs. Like most modern CI services, it supports triggering builds at specific times and defining dependencies between jobs. I ended up with a pretty big CircleCI YAML file (more than 400 lines!), but it was really easy to set up – and relatively readable. Using CircleCI’s configuration reuse, I could define common tasks once and trigger them with different parameters.
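As a rough sketch of what that looks like (the job names, Docker image and npm scripts here are made up), a parameterized job plus a scheduled workflow with requires covers both the cron trigger and the dependencies:

```yaml
version: 2.1

jobs:
  # One reusable job, parameterized by the task it runs.
  run-task:
    parameters:
      task:
        type: string
    docker:
      - image: cimg/node:16.20     # placeholder image
    steps:
      - checkout
      - run: npm run << parameters.task >>   # placeholder command

workflows:
  nightly:
    triggers:
      - schedule:
          cron: "0 3 * * *"        # every day at 03:00 UTC
          filters:
            branches:
              only: main
    jobs:
      - run-task:
          name: job-a
          task: collect-data
      - run-task:
          name: job-b
          task: process-data
          requires:
            - job-a                # job B runs only after job A succeeds
```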
[Image: An example pipeline run on CircleCI]
We ended up with a pretty long pipeline (17 jobs!), but one that is easy to understand, maintain and extend. But nothing is perfect. Monitoring, for example, is a challenge. CircleCI can send notifications to Slack on failure (or success, based on your configuration), which gives you basic monitoring. Sending metrics requires additional code (ideally with something that supports push, like StatsD or Graphite; a rough sketch of what that could look like is at the end of this post). In either case, it requires additional work and setup and does not come out of the box (and, to be honest, the same problem exists for Kubernetes cron jobs).

Despite all that, I am pretty happy with the choice to go with a managed CI tool like CircleCI. Yes, it sounds strange at first, but it actually does the job pretty well (especially when you consider the alternatives). What are your thoughts about it? Did you face a similar problem? Which solution did you choose?
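For completeness, here is roughly what that “additional code” for metrics could look like: a hypothetical fragment of a job’s steps that pushes a StatsD counter when the task finishes. The host, port and metric name are made up.

```yaml
# Hypothetical steps fragment: report a success counter to a StatsD host.
steps:
  - run:
      name: Report success metric
      command: |
        echo "nightly.job_a.success:1|c" | nc -u -w1 statsd.example.internal 8125
```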
