Scheduler / Working Queue

wakabayashi · December 12, 2024

I am working a lot with cronjobs, so that heavy processes don't slow down my website. While this works reliably, it's also getting a bit messy, and I see that the core offers an ultra professional way of handling jobs. 😲 But there is not much documentation. @datakick Can you give me a short overview of how this works and what would be the simplest implementation to start with?

This is what I believe at the moment:

I need to implement WorkQueueTaskCallable with one function execute().
When such a task is executed, an entry is saved to table 'ps_workqueue_task'.
When I want such a task to be executed, I need to call createTask().
In addition, there are scheduled tasks that trigger the WorkQueueTaskCallable at a certain frequency. Probably I have to implement InitializationCallback for that.
If a scheduled task was started, it's saved to the 'ps_scheduled_task_execution' table.

Not sure if this is correct, and probably there is much more. What I wonder:

What actually executes the whole process? Do I need only one cronjob that somehow fires the scheduled tasks? Is the TriggerController doing this from time to time as well? But why?
Multiple cronjobs can do things in parallel (at least, this is what I believe). Does this work queue behave the same way, or will one job be executed after the other? Let's say the pack quantity task is updating some products; can I update currencies with another task at the same time (random example)?
My goal is to cache more resources. For that I need a clever way to rebuild the cache: do only the minimum on the fly, and handle the rest in the background. For example, I want to cache the reviews of product/category pages. When someone posts a new product review, I need to rebuild multiple caches. This work queue system should be ideal for jobs like this, right?

datakick · December 13, 2024

This is not 100% completed functionality -- it can be used perfectly fine, and it is being used by core, but there is still a piece missing that can make this very useful.

There are a couple of main use cases for this workqueue system:

execute some task in isolation, and potentially in different thread/process
defer execution of task for some time
execute some task on regular basis (cronjob replacement)

There are a lot of tasks in the core/modules that we want to be executed, but we don't really care about the result, or even when exactly the task finishes. A few examples:

generate image thumbnails for a product
sending email

Currently, when we call the Mail::send() method, the system tries to synchronously deliver the email to the SMTP server. This can take a lot of time and delays the page response. The order confirmation page can take a few seconds to display because of this. The work queue system can help here. We can describe a 'Send email' task and pass it to the work queue system to execute later, so the original process can continue its work. The email will eventually be sent; when exactly depends on the workqueue executor implementation.

Unfortunately, only one Work Queue Executor implementation exists at the moment -- the immediate executor. This implementation executes the task immediately and synchronously, so there is no real benefit.

We plan to implement different executors:

stand-alone executor -- there would be dedicated worker process(es) running as a CLI PHP application (maybe even on different servers). Tasks would be serialized and saved in some queue (db, redis, ...), and workers would pick them up and execute them. This is the most advanced and most complicated option, but it also provides the most benefits
cron-based executor -- tasks would be serialized and saved somewhere (db, redis), and executed on periodic cron events (every minute or so, for example)
end-of-process executor -- the task would be executed in the same process that created it, but at a different time. Instead of executing the task immediately, it would be executed after the response has been sent to the browser, just before the script ends
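
To make the difference between "immediate" and "deferred" concrete, here is a minimal, self-contained sketch of a deferred executor. All names in it (SketchTask, DeferredExecutor, RecordingTask) are hypothetical and are not part of the core API; it only illustrates the idea that enqueue() remembers the task and something else runs it later.

```php
<?php
// Hypothetical names throughout; this is an illustration, not the core API.
interface SketchTask
{
    public function execute(array $parameters): void;
}

final class DeferredExecutor
{
    /** @var array<int, array{0: SketchTask, 1: array}> */
    private array $pending = [];

    public function enqueue(SketchTask $task, array $parameters): void
    {
        // Only remember the task; nothing is executed at this point.
        $this->pending[] = [$task, $parameters];
    }

    public function drain(): int
    {
        // Run the pending tasks in the order they were enqueued.
        $executed = 0;
        while ($this->pending) {
            [$task, $parameters] = array_shift($this->pending);
            $task->execute($parameters);
            $executed++;
        }
        return $executed;
    }
}

// Test task that just records which image ids it was asked to process.
final class RecordingTask implements SketchTask
{
    public array $log = [];

    public function execute(array $parameters): void
    {
        $this->log[] = $parameters['imageId'];
    }
}

$executor = new DeferredExecutor();
$task = new RecordingTask();
$executor->enqueue($task, ['imageId' => 12]);
$executor->enqueue($task, ['imageId' => 34]);

$ranBeforeDrain = count($task->log); // nothing has run yet
$executedCount = $executor->drain(); // now both tasks run, in order
```

In an end-of-process variant, drain() could be registered with register_shutdown_function(), and under PHP-FPM fastcgi_finish_request() could be called first so the browser is not kept waiting. A cron-based or stand-alone variant would persist the pending list (db, redis, ...) instead of keeping it in memory, and call something like drain() from the cron script or worker.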

It would be up to store owner to decide, and set up, which executor they want to use.

I'm currently running a variant of stand-alone executor on my site. It's not a complete solution, just a proof of concept, but it works quite nicely.

But back to your question.

To use this system, you need to

Implement the WorkQueueTaskCallable interface. The execute method accepts $parameters. It can be anything, but it's a good idea to somehow encapsulate the shape of those parameters. An example task that generates image thumbnails/types can look like this:

class GenerateImageTypesForImageTask implements WorkQueueTaskCallable
{

    // Builds a work queue task that carries only the image id as its parameters
    public static function createTask(Image $image): WorkQueueTask
    {
        $parameters = [
            'imageId' => (int)$image->id
        ];
        return WorkQueueTask::createTask(
            // must name this class, so that its execute() method is the one called
            self::class,
            $parameters,
            WorkQueueContext::fromContext(Context::getContext())
        );
    }

    // Called by the work queue executor with the parameters built in createTask()
    public function execute(WorkQueueContext $context, array $parameters)
    {
        $image = new Image((int)$parameters['imageId']);
        // ... todo - generate all image types for this image
    }
}

To execute this task, somewhere in core/module you need to get an instance of WorkQueueClient and enqueue the task:

    //...
    $image = new Image((int)$parameters['imageId']);
    $workQueueClient = ServiceLocator::getInstance()->getWorkQueueClient();
    $workQueueClient->enqueue(GenerateImageTypesForImageTask::createTask($image));

And that's it.
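
The same pattern could be applied to the review cache from the question. This is only a sketch under assumptions: RebuildReviewCacheTask, its 'productId' parameter and the in-memory $cache are hypothetical, and the small stand-in declarations at the top exist only so the snippet runs outside the shop. The execute() signature is copied from the example above.

```php
<?php
// Stand-ins so the sketch runs outside the shop. Inside the shop the core
// already provides both names and this block is skipped.
if (!interface_exists('WorkQueueTaskCallable')) {
    class WorkQueueContext
    {
    }

    interface WorkQueueTaskCallable
    {
        public function execute(WorkQueueContext $context, array $parameters);
    }
}

// Hypothetical task: rebuilds the cached reviews of one product.
class RebuildReviewCacheTask implements WorkQueueTaskCallable
{
    // Hypothetical in-memory cache, only here to make the effect visible.
    // A real module would write to its own cache storage instead.
    public static array $cache = [];

    public function execute(WorkQueueContext $context, array $parameters)
    {
        $productId = (int)$parameters['productId'];
        // Placeholder for the expensive work (loading and rendering reviews).
        self::$cache[$productId] = 'reviews-html-for-' . $productId;
    }
}

// Demo only: call the task directly, the way an executor eventually would.
$task = new RebuildReviewCacheTask();
$task->execute(new WorkQueueContext(), ['productId' => 7]);
```

In a module, the direct call at the end would be replaced by a createTask() factory and an enqueue() call, as in the image example, triggered from wherever a new review is saved.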

If you want to schedule execution of the task, you can create a Scheduled Task and save it to the database. The $payload property of the scheduled task is passed as $parameters to the callable.
