My name is Batch, Spring Batch

Enterprise applications very often need to import or export large amounts of data from or to external systems. These operations may be required at regular intervals; to implement them you can choose among several solutions provided by frameworks (batch frameworks), APIs (Quartz) and tools (ETL). One of the best solutions is a scheduled batch process, which fits this kind of technical requirement well. In this post I’d like to provide an easy introduction to Spring Batch; there are tons of how-tos and examples on the internet, but when I tried a simple solution at home in a new workspace I ran into some small issues. My goal, then, is to give a quick overview of the main features and to show how to set up a Java project for a Hello World Spring Batch example.

What is Spring Batch?

Spring Batch is a framework that provides functions to process large volumes of records, manage transactions, start job operations and store their statistics. It is based on Spring Core, so you can use dependency injection and all the Spring libraries. It is built on a simple model, shown in this figure: [batch-model]

As you can see, the main concept is the job; a job is composed of one or more steps. It is executed through a JobLauncher, and all data related to the execution is persisted in a JobRepository. In other words, when you design and implement a Spring Batch job, you have to figure out how many steps your job needs, which JobLauncher to use and where to store the job execution data.
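To make the relationship between these pieces more concrete, here is a minimal sketch of the model in plain Java. It is only a toy: the types Step, Job, InMemoryJobRepository and Launcher are hypothetical stand-ins written for this illustration, not Spring Batch's real classes, and the real framework does much more (transactions, restarts, statuses).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model of the concepts above; these are NOT Spring Batch's real classes.
public class BatchModelSketch {

    /** A step is a single unit of work inside a job. */
    interface Step {
        String name();
        void execute();
    }

    /** A job is an ordered list of steps. */
    static class Job {
        final String name;
        final List<Step> steps;
        Job(String name, List<Step> steps) { this.name = name; this.steps = steps; }
    }

    /** The repository records what happened during an execution (here: in memory). */
    static class InMemoryJobRepository {
        final List<String> log = new ArrayList<String>();
        void record(String entry) { log.add(entry); }
    }

    /** The launcher runs every step in order and records the outcome. */
    static class Launcher {
        private final InMemoryJobRepository repository;
        Launcher(InMemoryJobRepository repository) { this.repository = repository; }

        void run(Job job) {
            repository.record(job.name + " STARTED");
            for (Step step : job.steps) {
                step.execute();
                repository.record(job.name + "/" + step.name() + " COMPLETED");
            }
            repository.record(job.name + " COMPLETED");
        }
    }

    /** Creates a step that prints a message when executed. */
    static Step printingStep(final String name, final String message) {
        return new Step() {
            public String name() { return name; }
            public void execute() { System.out.println(message); }
        };
    }

    /** Builds a one-step job, runs it, and returns the repository log. */
    static List<String> runDemo() {
        InMemoryJobRepository repository = new InMemoryJobRepository();
        Job job = new Job("helloJob", Arrays.asList(printingStep("helloStep", "Hello!")));
        new Launcher(repository).run(job);
        return repository.log;
    }

    public static void main(String[] args) {
        for (String entry : runDemo()) {
            System.out.println(entry);
        }
    }
}
```

The point of the sketch is the division of responsibilities: the job only describes the steps, the launcher drives the execution, and the repository keeps the trace of what happened.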

Environment setup

I assume that today every Java programmer has Maven installed and enabled on their system and/or has the Maven plugin in their preferred IDE; so the first step to get Spring Batch ready is to run this command on your command line:

mvn archetype:generate -DgroupId=com.hellospringbatchworld.app -DartifactId=HelloSpringBatchWorld -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

It will create a standard Maven structure with a pom.xml file in its root directory; at this point you can replace its content with this:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.hellospringbatchworld.app</groupId>
  <artifactId>HelloSpringBatchWorld</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>HelloSpringBatchWorld</name>
  <url>http://maven.apache.org</url>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>org.springframework.batch</groupId>
        <artifactId>spring-batch-core</artifactId>
        <version>3.0.1.RELEASE</version>
    </dependency>
  </dependencies>
</project>

The environment isn’t ready yet: Maven has to download all the artifacts related to Spring Batch so that you can use their jar files. To do that, run this Maven command:

mvn clean dependency:copy-dependencies

This command will download all the jar files and let you start coding the example from your preferred IDE; I’m going to show how to do it with Eclipse. Open Eclipse and, from an existing workspace, choose “File” -> “Import…” -> “Maven” -> “Existing Maven Projects”; then navigate to the directory that contains the pom.xml file and complete the import.

Let’s code!

As you can see, you have a main class named App in the com.hellospringbatchworld.app package. In a few minutes we will use this class to run the job, but something is still missing: we need to create two XML files. Both are loaded from the META-INF directory on the classpath (in a standard Maven layout that would be, for example, src/main/resources/META-INF). The first one (app1-context.xml) is where we define the structure of the job; in this case it’s very simple, since we just want to show a Hello World message on the console.

<?xml version="1.0" encoding="UTF-8"?>
  <beans xmlns="http://www.springframework.org/schema/beans"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xmlns:batch="http://www.springframework.org/schema/batch"
         xsi:schemaLocation="http://www.springframework.org/schema/beans
         http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
         http://www.springframework.org/schema/batch
         http://www.springframework.org/schema/batch/spring-batch-3.0.xsd">

    <import resource="classpath:/META-INF/simple-springbatch-conf.xml" />
    <!-- JOB DECLARATION START-->
    <batch:job id="HelloSpringBatchWorld">
       <batch:step id="HSBWStep">
         <batch:tasklet ref="HSBWTasklet"/>
       </batch:step>
    </batch:job>
    <bean id="HSBWTasklet" class="com.hellospringbatchworld.app.HelloSpringBatchWorldTasklet" />
     <!-- JOB DECLARATION END-->
</beans>

Let’s focus on the “JOB DECLARATION” section: it defines a batch job whose id is “HelloSpringBatchWorld”; it has just one step, and the step’s implementation is the bean with id “HSBWTasklet”. Let’s take a look at this class:

package com.hellospringbatchworld.app;

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

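public class HelloSpringBatchWorldTasklet implements Tasklet{

	// Called by Spring Batch when the step runs; prints the greeting.
	public RepeatStatus execute(StepContribution contribution, ChunkContext chunkContext) throws Exception {
		System.out.println("Hello SpringBatch World!!!");
		// FINISHED tells Spring Batch that this tasklet has no more work to do.
		return RepeatStatus.FINISHED;
	}

}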
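The return value matters more than it may seem: a tasklet that returns RepeatStatus.CONTINUABLE instead of RepeatStatus.FINISHED is called again by the framework. The following plain-Java sketch illustrates that contract; the names Status, Work, runUntilFinished and countdown are hypothetical and only mimic the idea, they are not part of Spring Batch.

```java
// Toy illustration of the "call again until FINISHED" contract.
// Status and Work are hypothetical stand-ins, NOT Spring Batch types.
public class RepeatStatusSketch {

    /** The two outcomes a unit of work can report. */
    enum Status { CONTINUABLE, FINISHED }

    /** One call performs one slice of work and reports whether more is left. */
    interface Work {
        Status execute();
    }

    /** Calls the work repeatedly until it reports FINISHED; returns the number of calls. */
    static int runUntilFinished(Work work) {
        int calls = 0;
        Status status;
        do {
            status = work.execute();
            calls++;
        } while (status == Status.CONTINUABLE);
        return calls;
    }

    /** Work that needs the given number of calls before it is done. */
    static Work countdown(final int slices) {
        return new Work() {
            private int remaining = slices;
            public Status execute() {
                remaining--;
                System.out.println("Slice done, remaining: " + remaining);
                return remaining > 0 ? Status.CONTINUABLE : Status.FINISHED;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println("Calls: " + runUntilFinished(countdown(3)));
    }
}
```

The real HelloSpringBatchWorldTasklet returns FINISHED on its first call, so it runs exactly once.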

HelloSpringBatchWorldTasklet is a basic Tasklet implementation; it has only one method, which in our example simply prints to System.out and returns the “finished” status to Spring Batch. There is no need to spend more time on this class, but remember that this is just a Hello World example; in many cases you will use this class structure to implement very complex logic. If we go back to app1-context.xml, we can see that it imports another file: simple-springbatch-conf.xml.

<?xml version="1.0" encoding="UTF-8"?>
  <beans xmlns="http://www.springframework.org/schema/beans"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://www.springframework.org/schema/beans
         http://www.springframework.org/schema/beans/spring-beans-3.0.xsd">

  <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <property name="jobRepository" ref="jobRepository"/>
  </bean>
  <bean id="jobRepository" class="org.springframework.batch.core.repository.support.SimpleJobRepository">
     <constructor-arg>
	  <bean class="org.springframework.batch.core.repository.dao.MapJobInstanceDao"/>
     </constructor-arg>
     <constructor-arg>
	<bean class="org.springframework.batch.core.repository.dao.MapJobExecutionDao" />
     </constructor-arg>
     <constructor-arg>
	<bean class="org.springframework.batch.core.repository.dao.MapStepExecutionDao"/>
     </constructor-arg>
     <constructor-arg>
	<bean class="org.springframework.batch.core.repository.dao.MapExecutionContextDao"/>
     </constructor-arg>
  </bean>

  <bean id="transactionManager"  class="org.springframework.batch.support.transaction.ResourcelessTransactionManager"/>
</beans>

This file is very important for non-production environments: it defines the jobRepository, jobLauncher and transactionManager beans (the batch namespace looks up jobRepository and transactionManager under these default names) so that the job uses an in-memory repository and does not need a database. The JobRepository is where Spring Batch stores the metadata about the execution of the job and of its steps. The in-memory approach makes the job lighter, but all the data useful to trace the job execution is lost when the JVM exits.

Let’s run the Job!

We are at the end of our Hello World example; everything is ready to see our job working. The code below is the main class we need to run it.

package com.hellospringbatchworld.app;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.JobParameters;
import org.springframework.batch.core.launch.JobLauncher;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class App{
	public static void main(String[] args) throws Exception {

		String[] springConfig = { "META-INF/app1-context.xml" };
		ApplicationContext context  = new ClassPathXmlApplicationContext(springConfig);
		JobLauncher jobLauncher = (JobLauncher) context.getBean("jobLauncher");
		Job job = (Job) context.getBean("HelloSpringBatchWorld");

		JobExecution execution = jobLauncher.run(job, new JobParameters());
		System.out.println("Job status: " + execution.getStatus());

	}
}

It is very simple: it creates a String array that references the configuration file where all the beans are defined, then creates a Spring context from which it retrieves the JobLauncher and the Job. Now it is ready to execute the job with an empty JobParameters. That’s all!

Conclusion

As I said before, this is just a post to show how easy it is to build a software infrastructure able to execute a Spring Batch job. You can start from this post and add more to meet your needs.

About amicidiroberto

Curious, honest, argumentative and in love with programming and the IT world, but above all with my wife and my daughters
This entry was posted in Java, Spring, Spring Batch.
