I'll start this post by saying that it's been a while since we got the first version of CRON, which became the de facto default task scheduling tool for developers. What's more, cron jobs are older than me, and I'm not that young.

When I first got into software development, we deployed our code on EC2 instances and had a minimal continuous delivery setup: webhooks that triggered a git pull and restarted nginx. We also had a bunch of recurring tasks that had to be invoked at midnight (the classic example), and some that had to run every couple of minutes.

I remember learning the cron syntax at that time. It felt almost like RegExp, only 1000 times easier. However, this is not what I want to talk about.

Cron jobs are not evolving

And they should not, but the way we approach them should. There is nothing wrong with CRON itself as a tool. It is what it is, and it has been helping developers for ages. The problem is not the tool; the problem is the way we use it.

Over time, a lot of the tools I used for software development have changed and been updated; some of them died (RIP NetBeans IDE).
These updates always bring something to your routine. For example, Docker lets you deploy your code without forgetting to install that tiny package from 2005, pinned to a fixed version, that keeps legacy PHP projects running. Or Node.js lets you believe in the fairy tale that your JS code will run fine on the server (spoiler alert: it won't). The only thing that remains the same on every single project is CRON.

At some point, almost every project I worked on (I think every one, I just can't recall all the tasks) needed a certain number of scheduled tasks. I can't really explain why, but I always tried to avoid cron jobs at all costs. It felt like taking a wrong turn, almost as if the architecture were incorrect and you were now patching it with tasks that fix your mistakes every couple of minutes. Even though that's not really the case, and cron jobs are powerful tools for solving various tasks, sometimes I still feel this way.

Trying to explain this feeling, I came up with one answer. When you write your backend code, you, as a developer, have a deep understanding of the context of your program (I hope so): its runtime, a ton of dependencies, and how they are injected using that fancy DI lib. But what about CRON jobs? Do they exist inside the context of your app? The answer is no.

The reality

CRON jobs are defined in crontab files on Linux and triggered by the cron scheduler at the scheduled time (unless you use some other scheduler). By their nature, they simply trigger a script.
If you ask yourself what's wrong with that, you can probably say "nothing", and I'd agree with you, but there's a catch.

The reality of modern software development is different. Now people tend to scale horizontally rather than vertically, even though vertical scaling is way more affordable, in most cases (especially in the early stages) easier, and in my opinion preferable. We might consider this horizontal scaling a premature optimization, though I'm not entirely convinced that's accurate. Sometimes it's not only about being ready to scale your app for hundreds of thousands of users; it's about the way we deploy our projects nowadays.

I think this is dictated by the fact that user demand for availability and resilience is way higher than it used to be, and, of course, by some belief in your next idea. So what do you do? Correct, you go to a cloud provider and enable blue-green deployments and a minimum number of instances, or you go even further and enable cross-regional deployment to place your app closer to the customer. (No one cares that the DB is a single node in us-east-2 while the app instance is in Australia.)

And now it's time to add a few scheduled tasks. The problem here is Docker. You heard me right, and don't get me wrong, I love Docker, but you can't just place your crontab file in a Docker image and call it a day. Every running container gets its own copy of the crontab, so with two instances of your app each cron job executes twice, at the same time. I suspect many developers face this same challenge. Let's see what cloud providers offer us. Or even better, instead of researching on our own, let's follow the modern approach to building projects and consult ChatGPT/Claude or other LLMs.

Here are a couple of suggestions:

  1. Fly.io > Fly.io provides a mechanism for scheduled tasks through their "Fly Machines" functionality. You can deploy a separate machine dedicated to running your scheduled jobs.
  2. Render > Render offers built-in cron job support through their "Cron Jobs" feature, which allows you to set up scheduled tasks directly in your Render dashboard. These jobs run on separate infrastructure from your web services.
  3. AWS > AWS EventBridge (formerly CloudWatch Events) allows you to create rules that run on schedules; AWS Lambda can be triggered on a schedule; AWS Batch is for more resource-intensive scheduled jobs.

I'm not going to start talking about k8s jobs that must spin up a container on every invocation, spending a ton of time on that, or the fact that you have to pay 5 USD/month for each cron job on Render.com to run a script inside a Docker container.

Even with all these solutions, you still have to build something that runs outside the scope of your project and either calls an endpoint on your backend or pushes a message into a queue that only one consumer is allowed to read from. It gets worse when you realize that you need to monitor executions, react to failed ones, and prevent long-running tasks from overlapping.
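To show what "allow only one consumer" means in practice, here is a minimal sketch of the locking you would end up writing yourself. Everything in it is an assumption made for illustration: `LockStore` is an in-memory stand-in for a store that all instances share, `onSchedule` is a hypothetical hook called by each instance's own scheduler, and time is passed in as a plain number so the example stays deterministic.

```javascript
// In-memory stand-in for a lock store that every instance can reach
// (an assumption of this sketch; in production it would be a shared database or cache).
class LockStore {
  constructor() {
    this.locks = new Map(); // jobName -> { owner, expiresAt }
  }

  // Returns true only for the first caller until the lock is released or its
  // TTL has passed. A real shared store must do this check-and-set atomically.
  acquire(jobName, owner, now, ttlMs) {
    const current = this.locks.get(jobName);
    if (current && current.expiresAt > now) {
      return false;
    }
    this.locks.set(jobName, { owner, expiresAt: now + ttlMs });
    return true;
  }

  // Only the current owner may release the lock.
  release(jobName, owner) {
    const current = this.locks.get(jobName);
    if (current && current.owner === owner) {
      this.locks.delete(jobName);
    }
  }
}

// Hypothetical hook: what each instance's own scheduler calls when the schedule fires.
function onSchedule(store, instance, jobName, now, ran) {
  if (!store.acquire(jobName, instance, now, 60000)) {
    return false; // another instance already owns this run
  }
  ran.push(instance);
  return true;
}

const lockStore = new LockStore();
const ran = [];

// Three instances fire the same job at the same moment (time 0).
for (const instance of ['app-1', 'app-2', 'app-3']) {
  onSchedule(lockStore, instance, 'charge-subscriptions', 0, ran);
}
console.log(ran); // [ 'app-1' ]

// While the job is still running, an overlapping trigger is rejected...
console.log(onSchedule(lockStore, 'app-2', 'charge-subscriptions', 30000, ran)); // false

// ...until the owner releases the lock.
lockStore.release('charge-subscriptions', 'app-1');
console.log(onSchedule(lockStore, 'app-2', 'charge-subscriptions', 30001, ran)); // true
```

Even this toy version needs a TTL for crashed workers and an ownership check on release, and it still says nothing about monitoring or retries.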

What I'm trying to say here is that the modern way of software engineering is already quite complex and broad, and needing to deploy and maintain one more system just to invoke a couple of functions in your backend is kind of absurd, in my opinion.

There are some tools and libs that try to solve this, like the Node.js Bull lib, which uses Redis as an orchestrator for task executions. But do you really want this?

Solution?

I guess at this stage you have every right to say that there has been a lot of criticism and no solutions offered, so let's talk about it.

All I can say is that I believe scheduled tasks should exist and be executed from within your code, while the orchestration must be done by an external system. They should be scoped by type or name and be able to prevent overlapping. As a developer, you shouldn't need to worry about task synchronization or building solutions to monitor them properly.

The gap between modern application architecture and outdated scheduling tools presented an opportunity to create something better. And if that feels like it's approaching the advertisement section in a YouTube video from your favorite creator, you are not that far from the truth.
This article is a reflection of my thoughts that resulted in a project called schedo.dev that aims to solve these problems.

It lets developers define jobs as functions inside the same runtime that executes them. This prevents accidents such as two prod environments running simultaneously and processing duplicate money withdrawals (a real story I heard).

When building it, we went through different solutions, so you don't have to. Schedo handles synchronization and delivers the job to available consumers once it is ready to be executed.

Compared to standard cron jobs, you can trigger them immediately when needed, monitor execution times, and read the logs. But even more important: you don't build a thing yourself. Cron jobs must be easy, and you should be thinking about what actually happens inside the job, not about how to trigger it or how to make sure it doesn't overlap.

Define a job and run your code. This is how it's supposed to be.

schedo.defineJob(
  'send-weekly-report',   // Identifier
  '0 9 * * 1',            // Schedule (every Monday at 9 AM)
  async (ctx) => {        // Handler
    await sendReport(ctx.userId);
    return 'Report sent';
  }
);

This article is already long enough, but I'd like to emphasize a few points about the way it works. If you already gave up on this semi-promotional article, I don't blame you. But if you are still here - let's dive in.

  1. Job definition. The snippet above is an example of a job being defined using the Schedo.dev SDK. Whenever your app starts, it connects to the remote server and checks whether this job exists for the environment and matches the name and schedule. The job is defined once, no matter how many instances of the app you have.
  2. Job scheduler. After the job is defined in Schedo, it's registered in our cron scheduler, which takes care of the invocation when needed. So, your job is stored and scheduled on Schedo's side.
  3. Job execution. Since your app is connected to Schedo's API, when the time comes a signal is sent to one of the connected instances of the app, ensuring there are no simultaneous executions. Once the job execution is picked up, it's locked for that worker.
  4. Timeouts. You can define the job with two different types of timeouts. Pickup timeout: a worker must pick up the job within the defined period, otherwise the job expires. Execution timeout: simply the time given for a job to execute.
  5. Blocking jobs. By default, Schedo tries to behave like standard crontab and does not prevent jobs from overlapping. But you can, and in a lot of cases should, define the job as blocking if you want the next execution to be skipped while the previous one is still running. In that case the execution is marked as skipped.
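The delivery and blocking behaviour described in the list can be modelled in a few lines. This is only a toy model of the semantics, not Schedo's code or API: the `Dispatcher` class, its method names, and the round-robin choice of worker are all assumptions made for the sketch (the description above only says the signal goes to one of the connected instances).

```javascript
// Hypothetical model of the behaviour described above; not Schedo's real code or API.
class Dispatcher {
  constructor({ blocking }) {
    this.blocking = blocking; // skip a trigger while a previous run is in flight?
    this.workers = [];        // connected app instances
    this.running = 0;         // executions currently in flight
    this.next = 0;            // round-robin cursor (an assumption of this sketch)
    this.log = [];
  }

  // An app instance connects on startup.
  connect(workerName) {
    this.workers.push(workerName);
  }

  // Called when the schedule fires.
  trigger() {
    if (this.blocking && this.running > 0) {
      this.log.push('skipped');
      return null;
    }
    if (this.workers.length === 0) {
      this.log.push('no worker available');
      return null;
    }
    // Exactly one connected instance receives the job.
    const worker = this.workers[this.next % this.workers.length];
    this.next += 1;
    this.running += 1;
    this.log.push(`picked up by ${worker}`);
    return worker;
  }

  // Called when a worker reports that it finished.
  complete() {
    this.running -= 1;
  }
}

const dispatcher = new Dispatcher({ blocking: true });
dispatcher.connect('app-1');
dispatcher.connect('app-2');

dispatcher.trigger();  // delivered to exactly one instance
dispatcher.trigger();  // previous run still in flight, so this one is skipped
dispatcher.complete();
dispatcher.trigger();  // runs again once the previous execution has finished

console.log(dispatcher.log);
// [ 'picked up by app-1', 'skipped', 'picked up by app-2' ]
```

With `blocking: false` the second trigger would be delivered as well, which matches the default crontab-like behaviour from point 5.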

Someone asked my friend, who developed this project with me, "What happened to cron jobs over the last couple of years that made you decide this project must exist?" That on-the-spot question didn't get the response we wanted at that moment, but now I'd say: "Nothing. And that's exactly why we believe this must exist."
