Show HN: Hatchet v1 – a task orchestration platform built on Postgres

42 points | 8 hours ago | github.com

Hey HN - this is Alexander from Hatchet. We’re building an open-source platform for managing background tasks, using Postgres as the underlying database.

Just over a year ago, we launched Hatchet as a distributed task queue built on top of Postgres with a 100% MIT license (https://news.ycombinator.com/item?id=39643136). The feedback and response we got from the HN community were overwhelming. In the first month after launching, we processed about 20k tasks on the platform — today, we’re processing over 20k tasks per minute (>1 billion per month).

Scaling up this quickly was difficult — every task in Hatchet corresponds to at least 5 Postgres transactions, and we would see bursts on Hatchet Cloud instances of over 5k tasks/second, which corresponds to roughly 25k transactions/second. As it turns out, a simple Postgres queue using FOR UPDATE SKIP LOCKED doesn’t cut it at this scale. After provisioning the largest instance type that Cloud SQL offers, we even discussed moving some load off of Postgres in favor of something trendy like ClickHouse + Kafka.
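(For context, the kind of simple queue we mean looks roughly like the sketch below: an illustrative Python/psycopg example with a made-up tasks table, not our actual schema. Each worker claims the oldest unlocked task, skipping rows locked by competing workers.)

    # Illustrative SKIP LOCKED dequeue (hypothetical "tasks" table, not Hatchet's).
    import psycopg

    DEQUEUE_SQL = """
    WITH next_task AS (
        SELECT id
        FROM tasks
        WHERE status = 'queued'
        ORDER BY created_at
        LIMIT 1
        FOR UPDATE SKIP LOCKED  -- skip rows already claimed by other workers
    )
    UPDATE tasks
    SET status = 'running'
    FROM next_task
    WHERE tasks.id = next_task.id
    RETURNING tasks.id, tasks.payload;
    """

    def dequeue(conn: psycopg.Connection):
        # One transaction per claim; returns None when nothing is claimable.
        with conn.transaction():
            return conn.execute(DEQUEUE_SQL).fetchone()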

But we doubled down on Postgres and spent about 6 months learning how to operate it at scale, reading the Postgres manual [0] and several other resources during commutes and at night. We stuck with Postgres for two reasons:

1. We wanted to make Hatchet as portable and easy to administer as possible, and felt that implementing our own storage engine specifically for Hatchet Cloud would be disingenuous at best and, in the worst case, would take our focus away from the open-source community.

2. More importantly, Postgres is general-purpose, which is what makes it both great and hard to scale for some types of workloads. This is also what allows us to offer a general-purpose orchestration platform — we make heavy use of Postgres features like transactions, SKIP LOCKED, recursive queries, triggers, COPY FROM, and much more.

Which brings us to today. We’re announcing a full rewrite of the Hatchet engine — still built on Postgres — together with our task orchestration layer, which sits on top of that underlying queue. To be more specific, we’re launching:

1. DAG-based workflows that support a much wider array of conditions, including sleep conditions, event-based triggering, and conditional execution based on parent output data [1].

2. Durable execution — durable execution refers to a function’s ability to recover from failure by caching intermediate results and automatically replaying them on a retry. We call a function with this ability a durable task (a toy sketch of the replay idea follows this list). We also support durable sleep and durable events, which you can read more about here [2].

3. Queue features such as key-based concurrency queues (for implementing fair queueing), rate limiting, sticky assignment, and worker affinity.

4. Improved performance across every dimension we’ve tested, which we attribute to six improvements to the Hatchet architecture: range-based partitioning of time series tables, hash-based partitioning of task events (for updating task statuses), separating our monitoring tables from our queue, buffered reads and writes, switching all high-volume tables to use identity columns, and aggressive use of Postgres triggers.
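To make (2) concrete, here's a toy sketch of the replay idea (not our actual engine or SDK): completed steps are cached by key, and a retry replays cached results instead of re-executing them.

    # Toy durable execution; a real engine persists the cache to Postgres.
    results: dict[str, object] = {}

    def step(key: str, fn):
        if key in results:        # this step finished on a prior attempt
            return results[key]   # replay the cached result
        value = fn()              # first attempt: actually execute
        results[key] = value      # checkpoint before moving on
        return value

    def durable_task(order_id: int):
        payment = step("charge", lambda: {"charged": order_id})
        receipt = step("email", lambda: f"receipt for order {order_id}")
        return payment, receipt

If the process dies between "charge" and "email", a retry of durable_task skips straight past the already-completed "charge" step.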

We've also removed RabbitMQ as a required dependency for self-hosting.

We'd greatly appreciate any feedback you have and hope you get the chance to try out Hatchet.

[0] https://www.postgresql.org/docs/

[1] https://docs.hatchet.run/home/conditional-workflows

[2] https://docs.hatchet.run/home/durable-execution

diarrhea (4 hours ago)

This is very exciting stuff.

I’m curious: When you say FOR UPDATE SKIP LOCKED does not scale to 25k queries/s, did you observe a threshold at which it became untenable for you?

I’m also curious about the two points of:

- buffered reads and writes

- switching all high-volume tables to use identity columns

What do you mean by these? Were those (part of) the solution to scale FOR UPDATE SKIP LOCKED up to your needs?

abelanger (4 hours ago)

I'm not sure of the exact threshold, but the pathological case seemed to be (1) many tasks in the backlog, (2) many workers, (3) workers long-polling the task tables at approximately the same time. This would consistently lead to very high spikes in CPU and result in runaway deterioration of the database, since high CPU leads to slower queries and more contention, which leads to higher connection overhead, which leads to higher CPU, and so on. There are a few threads online that document very similar behavior, for example: https://postgrespro.com/list/thread-id/2505440.

Those other points are mostly unrelated to the core queue, and more related to helper tables for monitoring, tracking task statuses, etc. But it was important to optimize these tables because unrelated spikes on other tables in the database could start getting us into a deteriorated state as well.

To be more specific about the solutions here:

> buffered reads and writes

To run a task through the system, we need to write the task itself, write each retry instance of the task to the queue, write an event each time the task is queued, started, completed | failed, etc. Generally one task will correspond to many writes along the way, not all of which need to be extremely latency-sensitive. So we started buffering items coming from our internal queues and flushing them once every 10ms, which helped considerably.
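Conceptually, the flush loop looks something like this toy sketch (illustrative table and psycopg driver, not our actual code):

    import queue
    import time

    import psycopg

    events: "queue.Queue[tuple[int, str]]" = queue.Queue()

    def flush_loop(conn: psycopg.Connection) -> None:
        while True:
            time.sleep(0.01)                  # flush once every 10ms
            batch = []
            while not events.empty():
                batch.append(events.get_nowait())
            if not batch:
                continue
            with conn.cursor() as cur:        # one round trip for many events
                cur.executemany(
                    "INSERT INTO task_events (task_id, kind) VALUES (%s, %s)",
                    batch,
                )
            conn.commit()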

> switching all high-volume tables to use identity columns

We had originally combined some of our workflow tables with our monitoring tables -- this table was called `WorkflowRun`, and it was used both for concurrency queues and for serving the API. It used a UUID as the primary key, because we wanted to expose UUIDs over the API instead of auto-incrementing IDs. The UUIDs caused some headaches down the line when we tried to delete batches of data and prevent index bloat.
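The shape of the fix, as hypothetical DDL (not our actual schema): a bigint identity primary key keeps index writes cheap and sequential, while a separate unique UUID column still serves external/API references.

    import psycopg

    DDL = """
    CREATE TABLE task_runs (
        id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        external_id uuid NOT NULL DEFAULT gen_random_uuid() UNIQUE,
        created_at  timestamptz NOT NULL DEFAULT now()
    );
    """

    with psycopg.connect("dbname=example") as conn:
        conn.execute(DDL)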

themanmaran (3 hours ago)

How does queue observability work in Hatchet? I've used pg as a queueing system before, and that was one of my favorite aspects: just run a few SQL queries to get a dashboard for latency/throughput/etc.

But that requires you to keep the job history around, which at scale starts to impact performance.
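The kind of query I have in mind, against a hypothetical task_runs table:

    # Per-minute throughput and latency for the last hour (illustrative schema).
    DASHBOARD_SQL = """
    SELECT date_trunc('minute', finished_at) AS minute,
           count(*)                          AS completed,
           avg(finished_at - started_at)     AS avg_latency
    FROM task_runs
    WHERE finished_at > now() - interval '1 hour'
    GROUP BY 1
    ORDER BY 1;
    """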

abelanger (3 hours ago)

Yeah, part of this rewrite was separating our monitoring tables from all of our queue tables to avoid problems like table bloat.

At one point we considered partitioning on the status of a queue item (basically active | inactive) and aggressively running autovac on the active queue items. Then all indexes for monitoring can be on the inactive partitioned tables.

But there were two reasons we ended up going with separate tables:

1. We started to become concerned about partitioning _both_ by time range and by status: time range partitioning is incredibly useful for discarding data after a certain retention period, and we didn't want to layer status sub-partitions on top of it (a toy illustration of the time-range side is at the end of this comment)

2. If necessary, we wanted our monitoring tables to be able to run on a completely separate database from our queue tables, so we store them as completely independent schemas to make that possible (https://github.com/hatchet-dev/hatchet/blob/main/sql/schema/... vs https://github.com/hatchet-dev/hatchet/blob/main/sql/schema/...)

So to answer the question -- you can query both active queues and a full history of queued tasks up to your retention period, and we've optimized the separate tables for the two different query patterns.
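A toy illustration of the time-range side (hypothetical DDL, not our actual schema): retention becomes "drop the old partition", which is effectively instant and sidesteps the DELETE-then-autovacuum index-bloat cycle.

    SETUP_SQL = """
    CREATE TABLE task_events (
        task_id    bigint NOT NULL,
        kind       text NOT NULL,
        created_at timestamptz NOT NULL
    ) PARTITION BY RANGE (created_at);

    CREATE TABLE task_events_2025_04 PARTITION OF task_events
        FOR VALUES FROM ('2025-04-01') TO ('2025-05-01');
    """

    # Discard a month of history without touching the live queue tables:
    RETENTION_SQL = "DROP TABLE task_events_2025_04;"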

hyuuu (2 hours ago)

I have been looking for something like this. The closest I could find by googling was Celery workflows. I think you should do better marketing; I didn't even realize that Hatchet existed!

digdugdirk (5 hours ago)

Interesting! How does it compare with DBOS? I noticed it's not in the readme comparisons, and they seem to be trying to solve a similar problem.

abelanger (4 hours ago)

Yep, durable execution-wise we're targeting a very similar use-case with a very different philosophy on whether the orchestrator (the part of the durable execution engine which invokes tasks) should run in-process or as a separate service.

There's a lot to go into here, but generally speaking, running an orchestrator as a separate service is easier from a Postgres scaling perspective: it's easier to buffer writes to the database, manage connection overhead, export aggregate metrics, and horizontally scale the different components of the orchestrator. Our original v0 engine was architected in a very similar way to an in-process task queue, where each worker polls a tasks table in Postgres. This broke down for us as volume increased.

Outside of durable execution, we're more of a general-purpose orchestration platform -- lots of our features target use-cases where you either want to run a single task or define your tasks as a DAG (directed acyclic graph) instead of using durable execution. Durable execution has a lot of footguns if used incorrectly, and DAGs are executed in a durable way by default, so for many use-cases it's a better option.

darkteflon (3 hours ago)

Hatchet looks very cool! As an interested dilettante in this space, I’d love to read a comparison with Dagster.

Re DBOS: I understood that part of the value proposition there is bundling transactions into logical units that can all be undone if a critical step in the workflow fails - the example given in their docs being a failed payment flow. Does Hatchet have a solution for those scenarios?

abelanger (2 hours ago)

Re DBOS - yep, this is exactly what the child spawning feature is meant for: https://docs.hatchet.run/home/child-spawning

The core idea being that you write the "parent" task as a durable task, and you invoke subtasks which represent logical units of work. If any given subtask fails, you can wrap it in a `try...catch` and gracefully recover.

I'm not as familiar with DBOS, but in Hatchet a durable parent task and its child tasks map directly to Temporal workflows and activities. Admittedly this pattern should be documented in the "Durable execution" section of our docs as well.
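As a hypothetical sketch of that pattern ("spawn" and ChildTaskFailed are illustrative stand-ins, not our real SDK API):

    class ChildTaskFailed(Exception):
        pass

    def parent_task(spawn):
        # Each child is a logical unit of work with its own retries.
        order = spawn("validate_order", {"order_id": 123})
        try:
            return spawn("charge_card", order)
        except ChildTaskFailed:
            spawn("issue_refund", order)   # compensating child task
            raise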

Re Dagster - Dagster is much more oriented towards data engineering, while Hatchet is oriented more towards application engineers. As a result, tools like Dagster/Airflow/Prefect are more focused on data integrations, whereas we focus more on throughput/latency and primitives that work well with your application. Perhaps there's more overlap now that AI applications are more ubiquitous (with more data pipelines making their way into the application layer)?

darkteflon (2 hours ago)

Perfect - great answer and very helpful, thanks.

bomewish (2 hours ago)

Why not fix all the broken doc links and make sure you have the full SDK spec down first, ready to go? Then drop it all at once, when it's actually ready. That would be better and more respectful of users. I love the product and want y'all to succeed, but this came off as extremely unprofessional.

abelanger (2 hours ago)

Really appreciate the candid feedback, and glad to hear you like the product. We ran a broken links checker against our docs, but it's possible we missed something. Is there anywhere you're seeing a broken link?

Re SDK specs -- I assume you mean full SDK API references? We're nearly at the point where those will be published, and I agree that they would be incredibly useful.

wilted-iris (3 hours ago)

This looks very cool! I see a lot of Python in the docs; is it usable in other languages?

abelanger (3 hours ago)

Thanks! There are SDKs for Python, TypeScript, and Go. We've gotten a lot of requests for other SDKs, which we're tracking here: https://github.com/hatchet-dev/hatchet/discussions/436

throwaway9w4 (2 hours ago)

Is there any documentation of the API, so that someone can call it directly without going through the SDK?

abelanger (2 hours ago)

All of the API specs can be found here: https://github.com/hatchet-dev/hatchet/tree/main/api-contrac...

However, the SDKs are very tightly integrated with the runtime in each language, and we use gRPC on the workers, which makes it more difficult to call the APIs directly.
