Spring Batch Example

Before going through the Spring Batch example program, let’s get familiar with some Spring Batch terminology.

  • A job can consist of any number of steps. Each step is either a Read-Process-Write task or a single operation, which is called a tasklet.
  • Read-Process-Write reads data from a source such as a database or a CSV file, processes it, and writes it to a destination such as a database, a CSV file or an XML file.
  • A tasklet performs a single task or operation, such as closing connections or freeing up resources after processing is done.
  • Read-Process-Write steps and tasklets can be chained together to run a job.
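
To make the terminology concrete, here is a minimal plain-Java sketch of a job as an ordered chain of steps. It does not use the Spring Batch API; the names SketchJob, SketchStep, ReadProcessWriteStep and CleanupTaskletStep are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// A job is an ordered chain of steps (hypothetical class, not Spring Batch).
class SketchJob {
    private final List<SketchStep> steps;

    SketchJob(SketchStep... steps) {
        this.steps = Arrays.asList(steps);
    }

    // Runs every step in order and returns what the steps produced.
    List<String> run() {
        List<String> output = new ArrayList<>();
        for (SketchStep step : steps) {
            step.execute(output);
        }
        return output;
    }

    public static void main(String[] args) {
        SketchJob job = new SketchJob(
                new ReadProcessWriteStep(Arrays.asList("a", "b")),
                new CleanupTaskletStep());
        System.out.println(job.run()); // [written:A, written:B, cleanup done]
    }
}

// Hypothetical stand-in for a step; this is NOT the Spring Batch Step interface.
interface SketchStep {
    void execute(List<String> output);
}

// Read-Process-Write style step: reads items, transforms them, writes them out.
class ReadProcessWriteStep implements SketchStep {
    private final List<String> source;

    ReadProcessWriteStep(List<String> source) {
        this.source = source;
    }

    @Override
    public void execute(List<String> output) {
        for (String item : source) {                           // read
            String processed = item.toUpperCase(Locale.ROOT);  // process
            output.add("written:" + processed);                // write
        }
    }
}

// Tasklet style step: one single operation, here a pretend cleanup.
class CleanupTaskletStep implements SketchStep {
    @Override
    public void execute(List<String> output) {
        output.add("cleanup done");
    }
}
```

In the real framework the chaining is declared in the job configuration rather than in a constructor, but the idea of running steps one after another is the same.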

Let us consider a working example of a Spring Batch implementation. We will use the following scenario.

A CSV file needs to be converted to an XML file, with each XML tag named after the corresponding CSV column.
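
The transformation itself can be sketched in plain Java, without any Spring Batch classes. The sketch below assumes the column names id, firstname, lastname and dob used later in this example; the sample row, the record wrapper element and the class name CsvToXmlSketch are made up for illustration. The naive split on commas does not handle quoted fields.

```java
// Plain-Java illustration of the scenario; not how Spring Batch does it internally.
class CsvToXmlSketch {

    // Wraps each CSV value in a tag named after its column.
    static String toXml(String[] columns, String csvLine) {
        String[] values = csvLine.split(",", -1); // naive: no quoted-field support
        StringBuilder xml = new StringBuilder("<record>"); // wrapper name is an assumption
        for (int i = 0; i < columns.length; i++) {
            String value = i < values.length ? values[i].trim() : "";
            xml.append('<').append(columns[i]).append('>')
               .append(escape(value))
               .append("</").append(columns[i]).append('>');
        }
        return xml.append("</record>").toString();
    }

    // Escapes the characters that would break element content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        String[] columns = {"id", "firstname", "lastname", "dob"};
        // Hypothetical sample row, not taken from the project's CSV file.
        System.out.println(toXml(columns, "1001,Tom,Moody,29/7/2013"));
    }
}
```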

Below are the important tools and libraries used for the Spring Batch example.

  1. Apache Maven 3.5.0 – for project build and dependency management.
  2. Eclipse Oxygen Release 4.7.0 – IDE for creating the Spring Batch Maven application.
  3. Java 1.8
  4. Spring Core 4.3.12.RELEASE
  5. Spring OXM 4.3.12.RELEASE
  6. Spring JDBC 4.3.12.RELEASE
  7. Spring Batch 3.0.8.RELEASE
  8. MySQL Java Driver 5.1.25 – use based on your MySQL installation. This is required for Spring Batch metadata tables.

Spring Batch Example Directory Structure

The image below illustrates all the components in our Spring Batch example project.


Spring Batch Maven Dependencies

Below is the content of the pom.xml file with all the required dependencies for our Spring Batch example project.

Spring Batch Processing CSV Input File

Here is the content of our sample CSV file for Spring Batch processing.

Spring Batch Job Configuration

We have to define the Spring beans and the Spring Batch job in a configuration file. Below is the content of the job-batch-demo.xml file, which is the most important part of the Spring Batch project.

  1. We use FlatFileItemReader to read the CSV file, CustomItemProcessor to process the data, and StaxEventItemWriter to write the XML file.
  2. batch:job – This tag defines the job that we want to create. The id attribute specifies the ID of the job. We can define multiple jobs in a single XML file.
  3. batch:step – This tag defines the individual steps of a Spring Batch job.
  4. Spring Batch offers two processing styles: “TaskletStep Oriented” and “Chunk Oriented”. This example uses the chunk-oriented style, which reads the data one item at a time and builds ‘chunks’ that are written out within a transaction boundary.
  5. reader: the Spring bean used for reading the data. This example uses the csvFileItemReader bean, which is an instance of FlatFileItemReader.
  6. processor: the bean used for processing the data. This example uses CustomItemProcessor.
  7. writer: the bean used to write the data to the XML file.
  8. commit-interval: This attribute defines the size of the chunk that is committed once processing is done. The ItemReader reads the data one item at a time and the ItemProcessor processes it the same way, but the ItemWriter writes the items only when their count reaches the commit-interval.
  9. Three important interfaces used in this project are ItemReader, ItemProcessor and ItemWriter from the org.springframework.batch.item package.
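
The effect of commit-interval can be illustrated with a small plain-Java sketch. It imitates the behaviour described above using an in-memory list; it is not the framework's implementation, and the exact ordering of reads and processing inside a chunk may differ in Spring Batch itself. The class name ChunkSketch and the upper-casing "processor" are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Plain-Java imitation of chunk-oriented processing; not Spring Batch code.
class ChunkSketch {

    // Returns the chunks in the order they would be handed to the writer.
    static List<List<String>> run(List<String> source, int commitInterval) {
        if (commitInterval < 1) {
            throw new IllegalArgumentException("commitInterval must be at least 1");
        }
        List<List<String>> committedChunks = new ArrayList<>();
        Iterator<String> reader = source.iterator();
        while (reader.hasNext()) {
            List<String> chunk = new ArrayList<>();
            // Read and process items one by one until the chunk is full
            // or the input is exhausted.
            while (chunk.size() < commitInterval && reader.hasNext()) {
                String item = reader.next();
                chunk.add(item.toUpperCase(Locale.ROOT));
            }
            // Write the whole chunk in one go; this is where the
            // transaction would be committed.
            committedChunks.add(chunk);
        }
        return committedChunks;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(run(source, 2)); // [[A, B], [C, D], [E]]
    }
}
```

With five items and a commit-interval of 2, the writer is called three times, and the last chunk holds the single leftover item.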

Spring Batch Model Class

First we read the CSV file into Java objects, and then we use JAXB to write them to the XML file. Below is our model class with the required JAXB annotations.

Note that the model class field names must match those defined in the Spring Batch mapper configuration, i.e. property name="names" value="id,firstname,lastname,dob" in our case.

Spring Batch FieldSetMapper

A custom FieldSetMapper is needed to convert a Date. If no data type conversion is required, BeanWrapperFieldSetMapper alone is enough to map the values by name automatically.

The Java class that implements FieldSetMapper is ReportFieldSetMapper.
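
The kind of conversion such a mapper performs can be sketched in plain Java. This is not the ReportFieldSetMapper from the project and it does not use the Spring Batch FieldSet type; the class names ReportMappingSketch and ReportSketch, the d/M/yyyy date pattern and the sample row are all assumptions made for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Plain-Java illustration of mapping one CSV line to a typed object.
class ReportMappingSketch {

    // Assumed date pattern; the real CSV file may use a different one.
    private static final DateTimeFormatter DOB_FORMAT =
            DateTimeFormatter.ofPattern("d/M/yyyy");

    // Maps the columns id,firstname,lastname,dob to typed fields.
    static ReportSketch map(String csvLine) {
        String[] fields = csvLine.split(",", -1);
        if (fields.length != 4) {
            throw new IllegalArgumentException(
                    "Expected 4 fields but got " + fields.length);
        }
        ReportSketch report = new ReportSketch();
        report.id = Integer.parseInt(fields[0].trim());
        report.firstName = fields[1].trim();
        report.lastName = fields[2].trim();
        // This is the step that needs custom code: text to date.
        report.dob = LocalDate.parse(fields[3].trim(), DOB_FORMAT);
        return report;
    }

    public static void main(String[] args) {
        ReportSketch report = map("1001,Tom,Moody,29/7/2013"); // hypothetical row
        System.out.println(report.id + " " + report.firstName + " " + report.dob);
    }
}

// Hypothetical model holder; the project's model class also carries JAXB annotations.
class ReportSketch {
    int id;
    String firstName;
    String lastName;
    LocalDate dob;
}
```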

Spring Batch Item Processor

As defined in the job configuration, the itemProcessor is invoked before the itemWriter. We have created the CustomItemProcessor.java class for this purpose.

We can manipulate the data in the ItemProcessor implementation; here, I am converting the first name and last name values to upper case.
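
The upper-casing described above can be sketched as follows. To keep the example self-contained, it declares its own SketchItemProcessor interface, which mirrors the single process method of Spring Batch's ItemProcessor but is not the framework type; PersonSketch and UpperCaseProcessorSketch are likewise hypothetical names, not the project's classes.

```java
import java.util.Locale;

// Illustrative processor: returns a copy with both names upper-cased.
class UpperCaseProcessorSketch implements SketchItemProcessor<PersonSketch, PersonSketch> {

    @Override
    public PersonSketch process(PersonSketch item) {
        return new PersonSketch(
                item.firstName.toUpperCase(Locale.ROOT),
                item.lastName.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        PersonSketch result =
                new UpperCaseProcessorSketch().process(new PersonSketch("Tom", "Moody"));
        System.out.println(result.firstName + " " + result.lastName); // TOM MOODY
    }
}

// Local stand-in with the same shape as an item processor: one item in, one item out.
interface SketchItemProcessor<I, O> {
    O process(I item) throws Exception;
}

// Hypothetical item type holding only the two fields being transformed.
class PersonSketch {
    final String firstName;
    final String lastName;

    PersonSketch(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}
```

Returning a new object instead of mutating the input is a design choice of this sketch; the project's processor may modify the item in place.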

Spring Configuration Files

In our Spring Batch configuration file, we have imported two additional configuration files – context.xml and database.xml.

  • jobRepository – The JobRepository is responsible for storing each Spring Batch Java object in its correct metadata table.
  • transactionManager – This is responsible for committing the transaction once the number of processed items equals the commit-interval.
  • jobLauncher – This is the heart of Spring Batch. The JobLauncher interface contains the run method, which is used to trigger the job.

Spring Batch uses some metadata tables to store batch job information. We can have them created from the Spring Batch configuration, but it’s advisable to create them manually by executing the SQL files, as you can see in the commented code above. From a security point of view, it’s better not to give DDL execution access to the Spring Batch database user.

Spring Batch Tables

Spring Batch tables very closely match the domain objects that represent them in Java. For example, JobInstance, JobExecution, JobParameters and StepExecution map to BATCH_JOB_INSTANCE, BATCH_JOB_EXECUTION, BATCH_JOB_EXECUTION_PARAMS and BATCH_STEP_EXECUTION respectively.


The JobRepository is responsible for saving each Java object into its correct table.


Below are the details of each metadata table.

  1. Batch_job_instance: The BATCH_JOB_INSTANCE table holds all information relevant to a JobInstance.
  2. Batch_job_execution_params: The BATCH_JOB_EXECUTION_PARAMS table holds all information relevant to the JobParameters object.
  3. Batch_job_execution: The BATCH_JOB_EXECUTION table holds data relevant to the JobExecution object. A new row gets added every time a Job is run.
  4. Batch_step_execution: The BATCH_STEP_EXECUTION table holds all information relevant to the StepExecution object.
  5. Batch_job_execution_context: The BATCH_JOB_EXECUTION_CONTEXT table holds data relevant to a Job’s ExecutionContext. There is exactly one Job ExecutionContext for every JobExecution, and it contains all of the job-level data that is needed for that particular job execution. This data typically represents the state that must be retrieved after a failure so that a JobInstance can restart from where it failed.
  6. Batch_step_execution_context: The BATCH_STEP_EXECUTION_CONTEXT table holds data relevant to a Step’s ExecutionContext. There is exactly one ExecutionContext for every StepExecution, and it contains all of the data that needs to be persisted for a particular step execution. This data typically represents the state that must be retrieved after a failure so that a JobInstance can restart from where it failed.
  7. Batch_job_execution_seq: This table holds the sequence used to generate IDs for job executions.
  8. Batch_step_execution_seq: This table holds the sequence used to generate IDs for step executions.
  9. Batch_job_seq: This table holds the sequence used to generate IDs for job instances.

Spring Batch Test Program

Our Spring Batch example project is ready; the final step is to write a test class to execute it as a Java program.

Just run the above program and you will get output XML like below.

That’s all for the Spring Batch example.

Reference: Official Guide

