Closed
Description
Hello,
We are investigating some massive index scan on a spring-batch database and we noticed that these queries COUNT_STEP_EXECUTIONS
and COUNT_STEP_EXECUTIONS
in JdbcStepExecutionDao
can be optimized.
In fact, these queries seems to use a useless sub-query, and so, will scan the BATCH_JOB_EXECUTION
table twice.
These queries can be re-written like below without any impact on the current behavior.
COUNT_STEP_EXECUTIONS :
- current query :
SELECT COUNT(*)
from %PREFIX%JOB_EXECUTION JE, %PREFIX%STEP_EXECUTION SE
where SE.JOB_EXECUTION_ID in (SELECT JOB_EXECUTION_ID from %PREFIX%JOB_EXECUTION where JE.JOB_INSTANCE_ID = ?)
and SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID
and SE.STEP_NAME = ?
- enhanced query :
SELECT COUNT(*)
from %PREFIX%JOB_EXECUTION JE, %PREFIX%STEP_EXECUTION SE
where SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID
and JE.JOB_INSTANCE_ID = ?
and SE.STEP_NAME = ?
GET_LAST_STEP_EXECUTION :
- current query :
SELECT
SE.STEP_EXECUTION_ID, SE.STEP_NAME, SE.START_TIME, SE.END_TIME, SE.STATUS, SE.COMMIT_COUNT,
SE.READ_COUNT, SE.FILTER_COUNT, SE.WRITE_COUNT, SE.EXIT_CODE, SE.EXIT_MESSAGE, SE.READ_SKIP_COUNT,
SE.WRITE_SKIP_COUNT, SE.PROCESS_SKIP_COUNT, SE.ROLLBACK_COUNT, SE.LAST_UPDATED, SE.VERSION,
JE.JOB_EXECUTION_ID, JE.START_TIME, JE.END_TIME, JE.STATUS, JE.EXIT_CODE, JE.EXIT_MESSAGE,
JE.CREATE_TIME, JE.LAST_UPDATED, JE.VERSION
from %PREFIX%JOB_EXECUTION JE, %PREFIX%STEP_EXECUTION SE
where SE.JOB_EXECUTION_ID in (SELECT JOB_EXECUTION_ID from %PREFIX%JOB_EXECUTION where JE.JOB_INSTANCE_ID = ?)
and SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID
and SE.STEP_NAME = ?
order by SE.START_TIME desc, SE.STEP_EXECUTION_ID desc
- enhanced query :
SELECT
SE.STEP_EXECUTION_ID, SE.STEP_NAME, SE.START_TIME, SE.END_TIME, SE.STATUS, SE.COMMIT_COUNT,
SE.READ_COUNT, SE.FILTER_COUNT, SE.WRITE_COUNT, SE.EXIT_CODE, SE.EXIT_MESSAGE, SE.READ_SKIP_COUNT,
SE.WRITE_SKIP_COUNT, SE.PROCESS_SKIP_COUNT, SE.ROLLBACK_COUNT, SE.LAST_UPDATED, SE.VERSION,
JE.JOB_EXECUTION_ID, JE.START_TIME, JE.END_TIME, JE.STATUS, JE.EXIT_CODE, JE.EXIT_MESSAGE,
JE.CREATE_TIME, JE.LAST_UPDATED, JE.VERSION
from %PREFIX%JOB_EXECUTION JE, %PREFIX%STEP_EXECUTION SE
where SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID
and JE.JOB_INSTANCE_ID = ?
and SE.STEP_NAME = ?
order by SE.START_TIME desc, SE.STEP_EXECUTION_ID desc
Do you see any reason to prefer the current query instead of the second one ?
Regards,
Adil