Skip to content

rewrite/optimize JdbcStepExecutionDao GET_LAST_STEP_EXECUTION and COUNT_STEP_EXECUTIONS  #4017

Closed
@adil-belmehdi

Description

@adil-belmehdi

Hello,
We are investigating some massive index scan on a spring-batch database and we noticed that these queries COUNT_STEP_EXECUTIONS and COUNT_STEP_EXECUTIONS in JdbcStepExecutionDao can be optimized.
In fact, these queries seems to use a useless sub-query, and so, will scan the BATCH_JOB_EXECUTION table twice.
These queries can be re-written like below without any impact on the current behavior.

COUNT_STEP_EXECUTIONS :

  • current query :
SELECT COUNT(*) 
from %PREFIX%JOB_EXECUTION JE, %PREFIX%STEP_EXECUTION SE
where SE.JOB_EXECUTION_ID in (SELECT JOB_EXECUTION_ID from %PREFIX%JOB_EXECUTION where JE.JOB_INSTANCE_ID = ?)
and SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID
and SE.STEP_NAME = ?
  • enhanced query :
SELECT COUNT(*) 
from %PREFIX%JOB_EXECUTION JE, %PREFIX%STEP_EXECUTION SE
where SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID
and JE.JOB_INSTANCE_ID = ?
and SE.STEP_NAME = ?

GET_LAST_STEP_EXECUTION :

  • current query :
SELECT 
SE.STEP_EXECUTION_ID, SE.STEP_NAME, SE.START_TIME, SE.END_TIME, SE.STATUS, SE.COMMIT_COUNT,
SE.READ_COUNT, SE.FILTER_COUNT, SE.WRITE_COUNT, SE.EXIT_CODE, SE.EXIT_MESSAGE, SE.READ_SKIP_COUNT, 
SE.WRITE_SKIP_COUNT, SE.PROCESS_SKIP_COUNT, SE.ROLLBACK_COUNT, SE.LAST_UPDATED, SE.VERSION,
JE.JOB_EXECUTION_ID, JE.START_TIME, JE.END_TIME, JE.STATUS, JE.EXIT_CODE, JE.EXIT_MESSAGE, 
JE.CREATE_TIME, JE.LAST_UPDATED, JE.VERSION
from %PREFIX%JOB_EXECUTION JE, %PREFIX%STEP_EXECUTION SE
where  SE.JOB_EXECUTION_ID in (SELECT JOB_EXECUTION_ID from %PREFIX%JOB_EXECUTION where JE.JOB_INSTANCE_ID = ?)
and SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID
and SE.STEP_NAME = ?
order by SE.START_TIME desc, SE.STEP_EXECUTION_ID desc
  • enhanced query :
SELECT 
SE.STEP_EXECUTION_ID, SE.STEP_NAME, SE.START_TIME, SE.END_TIME, SE.STATUS, SE.COMMIT_COUNT,
SE.READ_COUNT, SE.FILTER_COUNT, SE.WRITE_COUNT, SE.EXIT_CODE, SE.EXIT_MESSAGE, SE.READ_SKIP_COUNT, 
SE.WRITE_SKIP_COUNT, SE.PROCESS_SKIP_COUNT, SE.ROLLBACK_COUNT, SE.LAST_UPDATED, SE.VERSION,
JE.JOB_EXECUTION_ID, JE.START_TIME, JE.END_TIME, JE.STATUS, JE.EXIT_CODE, JE.EXIT_MESSAGE, 
JE.CREATE_TIME, JE.LAST_UPDATED, JE.VERSION
from %PREFIX%JOB_EXECUTION JE, %PREFIX%STEP_EXECUTION SE
where  SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID
and JE.JOB_INSTANCE_ID = ?
and SE.STEP_NAME = ?
order by SE.START_TIME desc, SE.STEP_EXECUTION_ID desc

Do you see any reason to prefer the current query instead of the second one ?

Regards,
Adil

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions