Revisit the configuration of infrastructure beans with @EnableBatchProcessing

Compared to the XML configuration style, where infrastructure beans (`JobRepository`, `JobLauncher`, etc.) must be defined manually, `@EnableBatchProcessing` does a good job of configuring those beans automatically and making them available for autowiring in users' configuration classes. However, several issues have been reported regarding the default behaviour of this annotation as well as the customization of that behaviour. Here is a non-exhaustive list:

### 1. Customization of infrastructure beans is not straightforward

For example, as reported in #3765, in order to use a custom serializer, one needs to provide a custom `JobRepository`. And in order to provide a custom `JobRepository`, one needs to provide a custom `BatchConfigurer` (either by implementing the interface or by extending the default one and overriding a method), something like:

```java
@Configuration
@EnableBatchProcessing
public class MyJobConfigWithCustomSerializer {

    @Bean
    public BatchConfigurer batchConfigurer() {
        return new DefaultBatchConfigurer() {
            @Override
            public JobRepository getJobRepository() {
                ExecutionContextSerializer serializer = new Jackson2ExecutionContextStringSerializer();
                // customize serializer
                JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
                factory.setSerializer(serializer);
                // set other properties on the factory bean
                try {
                    factory.afterPropertiesSet();
                    return factory.getObject();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }
        };
    }

}
```

Moreover, in this case, the custom serializer should also be set on the `JobExplorer` in order to correctly deserialize the execution context while exploring meta-data that was persisted with the `JobRepository`. So one needs to do the following as well:

```java
@Configuration
@EnableBatchProcessing
public class MyJobConfigWithCustomSerializer {

    @Bean
    public BatchConfigurer batchConfigurer() {
        return new DefaultBatchConfigurer() {
            @Override
            public JobRepository getJobRepository() {
                JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
                factory.setSerializer(createCustomSerializer());
                // set other properties on the factory bean
                try {
                    factory.afterPropertiesSet();
                    return factory.getObject();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }

            @Override
            public JobExplorer getJobExplorer() {
                JobExplorerFactoryBean factoryBean = new JobExplorerFactoryBean();
                factoryBean.setSerializer(createCustomSerializer());
                // set other properties on the factory bean
                try {
                    factoryBean.afterPropertiesSet();
                    return factoryBean.getObject();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }

            private ExecutionContextSerializer createCustomSerializer() {
                Jackson2ExecutionContextStringSerializer serializer = new Jackson2ExecutionContextStringSerializer();
                // customize serializer
                return serializer;
            }
        };
    }
    
}
```

The process is the same for other properties of the job repository/explorer, like the table prefix, the `LobHandler`, etc.

### 2. Unconditional exposure of some beans

While batch-specific beans like `JobRepository`, `JobLauncher`, etc. can be exposed in the application context "safely", some beans like the transaction manager could be used in other parts of the application, and exposing them unconditionally can be problematic (see #816). This is especially true when using Spring Boot, where it requires bean overriding, which [is not](https://stackoverflow.com/questions/67919939/spring-batch-infrastructure-defined-with-java-config-beans-not-with-enablebatch) always [wanted](https://stackoverflow.com/questions/56233013/overriding-bean-issue-in-spring-batch/56233436) by users.

### 3. Extending infrastructure beans is not straightforward / possible

Since infrastructure beans are systematically defined and exposed by `@EnableBatchProcessing` and not looked up from the application context first, it is not easy/possible to extend those beans to add custom behaviour (like adding tracing to the `JobRepository` for instance, as reported in #3899) and use the extensions in place of default beans.
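To make the tracing use case concrete, here is a minimal sketch of the kind of extension users would like to plug in. It is plain Java with no Spring dependency: `MiniJobRepository`, `DefaultMiniJobRepository` and `TracingJobRepository` are hypothetical, heavily simplified stand-ins invented for this illustration, not Spring Batch types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, heavily simplified stand-in for Spring Batch's JobRepository.
interface MiniJobRepository {
    long createJobExecution(String jobName);
}

// Plays the role of the default repository that gets created automatically.
class DefaultMiniJobRepository implements MiniJobRepository {

    private long nextId = 1;

    @Override
    public long createJobExecution(String jobName) {
        return nextId++;
    }
}

// Decorator: records a trace entry around each call and delegates the real work.
class TracingJobRepository implements MiniJobRepository {

    private final MiniJobRepository delegate;
    private final List<String> trace = new ArrayList<>();

    TracingJobRepository(MiniJobRepository delegate) {
        this.delegate = delegate;
    }

    @Override
    public long createJobExecution(String jobName) {
        trace.add("before createJobExecution(" + jobName + ")");
        long id = delegate.createJobExecution(jobName);
        trace.add("after createJobExecution(" + jobName + ") -> " + id);
        return id;
    }

    List<String> trace() {
        return trace;
    }
}
```

The decorator itself is easy to write. The difficulty described in this section is getting the framework to use it in place of the default bean.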

### 4. `BatchConfigurer` is eventually an unnecessary level of indirection

Most people tend to declare infrastructure beans in the application context and expect them to be picked up by Spring Batch (this is not a wrong expectation). Here are some examples:

* https://stackoverflow.com/questions/64205418/spring-batch-isolation-level/64206508#64206508
* https://stackoverflow.com/questions/68124140/spring-boot-spring-batch-hsqldb-configure-hsqldb-for-jobrepository
* https://stackoverflow.com/questions/56567514/hibernateitemwriter-in-spring-batch-program-tries-to-run-without-beginning-trans/56575830#56575830
* https://stackoverflow.com/questions/55339675/cant-serialize-access-for-this-transaction-when-running-single-job-serialized/55341011#55341011
* https://stackoverflow.com/questions/68938000

If `@EnableBatchProcessing` is changed to look for beans in the application context first, providing a custom `BatchConfigurer` would no longer be mandatory. For example, the following way of configuring a custom `JobRepository`:

```java
@Configuration
@EnableBatchProcessing
public class MyJobConfigWithCustomJobRepository {

    @Bean
    public BatchConfigurer batchConfigurer() {
        return new DefaultBatchConfigurer() {
            @Override
            public JobRepository getJobRepository() {
                JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
                // set properties on the factory bean
                try {
                    factory.afterPropertiesSet();
                    return factory.getObject();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            }
        };
    }

   // job bean definition
}
```

could become:

```java
@Configuration
@EnableBatchProcessing
public class MyJobConfigWithCustomJobRepository {

    @Bean
    public JobRepository jobRepository() throws Exception {
        JobRepositoryFactoryBean factory = new JobRepositoryFactoryBean();
        // set properties on the factory bean
        factory.afterPropertiesSet();
        return factory.getObject();
    }

   // job bean definition
   
}
```

### 5. Confusing configuration when batch meta-data is not required

The configuration of the `JobRepository`/`JobExplorer` in `@EnableBatchProcessing` is based on the presence of a `DataSource` bean (if no data source is provided, a Map-based job repository/explorer is configured, which has been deprecated anyway, see #3780). If the application context contains one or more data sources that should *not* be used by Spring Batch for its meta-data, things seem to become complicated and confusing to many people. Here are some examples:

* [Problem is I already have 3 another dataSources defined, but I don't want to use any of them in springBatch](https://stackoverflow.com/questions/39913918/spring-boot-spring-batch-without-datasource)
* [Spring-Batch without persisting metadata to database?](https://stackoverflow.com/questions/25077549/spring-batch-without-persisting-metadata-to-database)
* [Define an in-memory JobRepository](https://stackoverflow.com/questions/44238232/define-an-in-memory-jobrepository)
* [In-memory repository with Spring Boot](https://github.com/spring-projects/spring-batch/issues/905)
* And so on.

It is concerning that people end up with an empty setter for the datasource:

- [Example 1](https://stackoverflow.com/a/42721313/5019386)
- [Example 2](https://stackoverflow.com/a/52590772/5019386)
- [Example 3](https://stackoverflow.com/a/52643365/5019386)

The data source is actually an implementation detail of a particular `JobRepository` implementation, namely the JDBC-based `JobRepository`. Other implementations of `JobRepository` might not need a data source at all (like a [MongoDB based job repository](https://github.com/spring-projects/spring-batch/issues/877) for example). The point here is that the data source should not be a first-order concern in terms of configuration, but rather a second-order concern. In other words, `@EnableBatchProcessing` should first make sure the user wants to use the JDBC-based job repository, and only then check for a data source bean in the application context.
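The ordering of those two decisions can be sketched in plain Java. This is only an illustration of the decision flow, not a proposed API: `RepositoryKind`, `RepositoryResolver` and the returned description strings are hypothetical names made up for this example.

```java
import java.util.Optional;

// Hypothetical repository kinds; in this sketch only JDBC needs a data source.
enum RepositoryKind { JDBC, MONGODB }

final class RepositoryResolver {

    private RepositoryResolver() {
    }

    // First-order concern: which kind of repository is wanted.
    // Second-order concern: only for JDBC, look for a data source.
    static String describe(RepositoryKind kind, Optional<String> dataSourceBeanName) {
        if (kind != RepositoryKind.JDBC) {
            // Any data source in the context is irrelevant here.
            return kind + " repository, data source ignored";
        }
        String name = dataSourceBeanName.orElseThrow(
                () -> new IllegalStateException("JDBC job repository requires a data source bean"));
        return "JDBC repository using data source '" + name + "'";
    }
}
```

With this ordering, an application that happens to contain data sources for business data is never forced to hand one of them to Spring Batch, and a missing data source is only an error when the JDBC repository was actually requested.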

---

## Possible solutions

I see a couple of options here, but I'm open to other suggestions as well.

### 1. Use annotation attributes to customize infrastructure beans

The idea here is to make `@EnableBatchProcessing` first look for infrastructure beans in the application context (this is similar to and consistent with the way other projects from the portfolio configure apps, like [Spring Security](https://docs.spring.io/spring-security/site/docs/current/reference/html5/#oauth2login-javaconfig-wo-boot) for instance). If those beans are not defined, then it creates them and registers them in the application context. The same naming conventions used with the XML configuration style should be used for consistency. For example:

```java
@Configuration
@EnableBatchProcessing(dataSource = "myDataSource",  // could be omitted if named "dataSource"
                       transactionManager = "myTransactionManager", // could be omitted if named "transactionManager"
                       serializer = "mySerializer")
public class MyJobConfiguration {

    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory) {
        return jobBuilderFactory.get("myJob")
                // define job flow
                .build();
    }

    @Bean // could be the one auto-configured by Spring Boot
    public ExecutionContextSerializer mySerializer() {
        ExecutionContextSerializer serializer = new Jackson2ExecutionContextStringSerializer();
        // customize serializer
        return serializer;
    }

}
```

In this example, `@EnableBatchProcessing` would first look for a `JobRepository` bean named `jobRepository` (same naming convention as XML) in the application context. If no such bean is defined, then it should create one by setting collaborators as defined in the annotation attributes.
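The "look up first, create only if absent" rule can be sketched without any Spring dependency. In the following illustration, `MiniContext` is a hypothetical stand-in for the application context (a simple map from bean name to instance), and the bean values are plain strings for brevity. How this would actually be implemented inside `@EnableBatchProcessing` is left open.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for an application context: bean name -> bean instance.
class MiniContext {

    private final Map<String, Object> beans = new HashMap<>();

    // Plays the role of a user-declared @Bean.
    void register(String name, Object bean) {
        beans.put(name, bean);
    }

    boolean contains(String name) {
        return beans.containsKey(name);
    }

    // Returns the bean already registered under the name, if any. Otherwise
    // creates the default, registers it under that name, and returns it.
    <T> T getOrCreate(String name, Class<T> type, Supplier<? extends T> defaultFactory) {
        Object bean = beans.computeIfAbsent(name, key -> defaultFactory.get());
        return type.cast(bean);
    }
}
```

A user-registered `jobRepository` would win over the default, while a bean that the user did not declare (say `jobLauncher`) would be created and registered under its conventional name.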

### 2. Provide a base configuration class with infrastructure beans

Similar to the [base XML application context that defines infrastructure beans](https://docs.spring.io/spring-batch/docs/4.3.x/reference/html/jsr-352.html#jsrSetupContexts) with XML configuration, the idea is to provide a similar mechanism but for Java configuration. This means providing a base configuration class (which could be something like the current `AbstractBatchConfiguration` but with ready-to-use bean definitions in it) that users can extend to define their batch jobs:

```java
@Configuration
public class MyBatchApplication extends BatchConfiguration {

    @Bean
    public Job job(JobBuilderFactory jobBuilderFactory) {
        return jobBuilderFactory.get("job")
                // define job
                .build();
    }

}
```

Any bean that needs customization can be declared in the user's class.
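As a rough, framework-free illustration of that customization model, the sketch below uses overridable methods in a base class. `MiniBatchConfiguration` and `MyMiniBatchApplication` are hypothetical names, and strings stand in for real beans. In a real Spring configuration these would be `@Bean` methods, and the exact override mechanism is an open design question.

```java
// Hypothetical stand-in for the proposed base configuration class.
abstract class MiniBatchConfiguration {

    // Overridable "bean definitions" with ready-to-use defaults.
    protected String serializer() {
        return "default-serializer";
    }

    protected String tablePrefix() {
        return "BATCH_";
    }

    // Assembled from the (possibly overridden) collaborators above.
    public final String jobRepository() {
        return "JobRepository[serializer=" + serializer() + ", tablePrefix=" + tablePrefix() + "]";
    }
}

// The user's class overrides only what needs customization.
class MyMiniBatchApplication extends MiniBatchConfiguration {

    @Override
    protected String serializer() {
        return "custom-serializer";
    }
}
```

Here the custom serializer is picked up by the repository without the user having to redefine the repository itself, and the table prefix keeps its default.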
