Spring Batch - Need help in design approach
We need to merge data from a set of tables in one data source, based on the last run config date. I have implemented this with Spring Batch and it works fine, but performance is slow: it has taken around 18 hours to process around 5 million records. I haven't used multi-threading or partitioning yet. I need help finding the right design approach to increase performance. The same task done through SQL*Loader completed in 3 hours. We have around 8 tables to be merged into the data source. Please let me know if any more info is needed. Thanks in advance.
Spring Batch is designed to allow incremental enhancement of batch jobs, from basic single-threaded processing to full-blown multi-JVM scaled solutions, with minimal configuration changes at each step. Without knowing your use case in detail, the approach to take will depend on your requirements:
- Do you need restartability? If so, that eliminates basic multi-threaded steps, since the readers do not support multi-threaded processing together with restartability.
- Is your process I/O bound? I'm assuming it is, which would eliminate the remote chunking option.
If the above assumptions are correct, that leaves partitioning. You can read more about partitioning vs. chunking here: Difference between Spring Batch remote chunking and remote partitioning.
Once you've chosen a partitioning model, the other questions you'll need to answer are:
- What is your partitioning strategy? Partitioning sends descriptions of the data to be processed from the master to each slave. You'll need to determine what that description consists of (in a DB, ID ranges are a common option).
- Local or remote? Can you get the throughput you need in a single JVM using threads to execute the slaves, or do you need more horsepower? If so, you'll want to look at remote partitioning.
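To make the ID-range idea a bit more concrete, here is a minimal sketch in plain Java with no Spring dependency. It only shows the range arithmetic; the class name `IdRangePartitions`, the method `split`, and the numbers in `main` are made up for illustration and are not from the question. In a real job you would presumably do this inside a Spring Batch `Partitioner` and put each range into that partition's `ExecutionContext`, so each slave's reader can restrict its query to its own range.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IdRangePartitions {

    /**
     * Splits the inclusive ID range [minId, maxId] into at most gridSize
     * contiguous, non-overlapping ranges of roughly equal width.
     * Each value is a two-element array {fromId, toId}, both inclusive.
     */
    public static Map<String, long[]> split(long minId, long maxId, int gridSize) {
        if (gridSize < 1 || maxId < minId) {
            throw new IllegalArgumentException("need gridSize >= 1 and minId <= maxId");
        }
        long total = maxId - minId + 1;
        // Ceiling division so that the ranges together cover every ID.
        long width = (total + gridSize - 1) / gridSize;

        Map<String, long[]> partitions = new LinkedHashMap<>();
        long from = minId;
        for (int i = 0; from <= maxId; i++) {
            long to = Math.min(from + width - 1, maxId);
            partitions.put("partition" + i, new long[] {from, to});
            from = to + 1;
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Hypothetical numbers: IDs 1..5,000,000 split across 8 workers.
        split(1, 5_000_000, 8).forEach((name, range) ->
                System.out.println(name + ": " + range[0] + " - " + range[1]));
    }
}
```

Note that equal-width ID ranges only give equal work if the IDs are fairly dense. If there are large gaps, some partitions will hold far fewer rows than others, and a different description of the data (for example, ranges derived from row counts) may balance better.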