python - Can two celery applications be interdependent? Or two tasks of one application be interdependent?
My workflow is as follows, using Celery with RabbitMQ:

Step 1. A large file is broken into multiple parts (let's say 4) and put on the message queue.

Step 2. Workers (let's say 2) process the parts and store them somewhere.

Now the question: I have one task left to complete, joining the files. This is of course a synchronizing task, i.e. all parts of the file must have been processed first. So, can I do this through Celery by making the joining task dependent on step 2?

Should I create a separate application to join the files, which somehow receives the status of these workers, i.e. whether they have finished processing the parts?

Or should I put the joining task on the message queue as well and have it block (wait) until it is assured that all parts are processed, and then join the files? (This could again be done by a worker.)

Which approach is achievable? How do I make these two tasks interdependent?
Yes, two Celery applications/tasks can be interdependent.

To achieve your goal, use Celery canvas: http://celery.readthedocs.org/en/latest/userguide/canvas.html, more precisely 'chords'.

A chord is a task that executes only after all of the tasks in a group have finished executing.
from celery import Celery, chord

app = Celery('tasks', broker='amqp://')

@app.task
def process_part(part):
    # process a single file part and store it somewhere
    pass

@app.task
def join_parts(parts):
    # join the processed parts into the final file
    pass

def split_file(f):
    # split the large file; returns a list of parts (placeholder)
    return file_parts_array

def process_file(f):
    part_sigs = [process_part.s(x) for x in split_file(f)]
    result = chord(part_sigs)(join_parts.s())
    return result
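A minimal sketch of invoking this from the producer side; the file path here is an assumption, and note that the chord returns an AsyncResult for the callback task:

    # hypothetical invocation; '/data/large_file' is an assumed path
    result = process_file('/data/large_file')
    joined = result.get()   # blocks until all parts are processed and join_parts has run
    # or poll without blocking:
    if result.ready():
        joined = result.result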
You can then route the join_parts task to a specific queue, so that only workers on the storage machine consume the join tasks.
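As a sketch, that routing could look like the following; the queue name 'storage' and the module name 'tasks' are assumptions (in Celery 3.x the setting is spelled CELERY_ROUTES, newer versions use task_routes):

    # send join_parts to a dedicated queue; 'storage' is an assumed queue name
    app.conf.CELERY_ROUTES = {'tasks.join_parts': {'queue': 'storage'}}

The worker on the storage machine would then be started with something like:

    celery worker -A tasks -Q storage

so it consumes only from that queue, while the other workers keep consuming the default queue with the process_part tasks.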