Spring Batch Deployment Example

Deployment has always been a tricky part of batch processing. Unlike a web application, a batch application has no standardized deployment, and there are many environments it could be deployed to. You could deploy a batch application within a web container, or as a standalone Java app started by one of the many available schedulers. Because everyone’s environment is different, no single example can serve as a universal starting point. However, I have done enough standalone deployments on Linux using Bash to share a simple example.

The job itself matters little for the purposes of this post. A simple one-tasklet job will suffice:

<job id="sampleJob" job-repository="jobRepository">
    <step id="simpleStep">
        <tasklet ref="tasklet" />
    </step>
</job>

<beans:bean id="jobRepository"
    class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean" />

<beans:bean id="transactionManager"
    class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />

<beans:bean id="jobLauncher"
    class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <beans:property name="jobRepository" ref="jobRepository" />
</beans:bean>

<beans:bean id="tasklet" class="net.lucasward.sample.SampleTasklet" />


(namespace removed for readability)

The Tasklet simply prints ‘Hello World’:

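package net.lucasward.sample;

import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

public class SampleTasklet implements Tasklet {

    // Prints a greeting once, then tells the step there is nothing left to do.
    public RepeatStatus execute(StepContribution contribution,
            ChunkContext chunkContext) throws Exception {
        System.out.println("Hello World");
        return RepeatStatus.FINISHED;
    }
}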


Simple enough, right? Now for the hard part. If we want a scheduler to be able to launch this from the command line, what do we do? The first problem to solve is what our deployment looks like. In my mind, there are three necessary components:


  1. The jars themselves, which need to be on the classpath
  2. A script that can be called
  3. The xml and/or .properties files that will be used for configuration


Personally, I prefer to separate the XML files from the jars, to allow for tweaks to the job. This is especially useful for Spring Batch job definitions, although it may be less so for normal Spring configuration files. My preferred layout has three directories: bin, lib, and resources. The scripts go in bin, the jars in lib, and the XML and properties files in resources. It doesn’t really matter how you break yours up, but this is the layout I’ll be using. To create it, I’ll use Maven and the assembly plugin:


<plugin>
    <artifactId>maven-assembly-plugin</artifactId>
    <version>2.2-beta-2</version>
    <configuration>
        <descriptors>
            <descriptor>src/main/assembly/descriptor.xml</descriptor>
        </descriptors>
    </configuration>
    <executions>
        <execution>
            <id>make-distribution</id>
            <phase>package</phase>
            <goals>
                <goal>single</goal>
            </goals>
        </execution>
    </executions>
</plugin>


And the descriptor:

<assembly>
    <id>distribution</id>
    <formats>
        <format>tar.gz</format>
    </formats>
    <includeBaseDirectory>false</includeBaseDirectory>
    <fileSets>
        <fileSet>
            <directory>src/main/scripts</directory>
            <outputDirectory>bin</outputDirectory>
            <useDefaultExcludes>true</useDefaultExcludes>
        </fileSet>
        <fileSet>
            <directory>src/main/resources</directory>
            <outputDirectory>resources</outputDirectory>
            <useDefaultExcludes>true</useDefaultExcludes>
            <filtered>true</filtered>
        </fileSet>
    </fileSets>

    <dependencySets>
        <dependencySet>
            <outputDirectory>lib</outputDirectory>
        </dependencySet>
    </dependencySets>
</assembly>


The format I’m using is tar.gz, since I’m targeting Linux, but there are many more available.

The two fileSet entries describe the bin and resources directories. (I haven’t talked about the script yet, but I will below.) The dependencySet entry copies all the dependencies that Maven manages into the lib directory, including the project’s own jar. When you run ‘mvn install’, two artifacts are created: the normal jar, and another file with the same name ending in -distribution.tar.gz. In my example those are batch-deploy-sample-1.0-SNAPSHOT.jar and batch-deploy-sample-1.0-SNAPSHOT-distribution.tar.gz. Extracting the archive gave me the following:

./bin:
sampleJob.sh

./lib:
aopalliance-1.0.jar spring-batch-core-2.1.1.RELEASE.jar spring-tx-2.5.6.jar
batch-deploy-sample-1.0-SNAPSHOT.jar spring-batch-infrastructure-2.1.1.RELEASE.jar stax-1.2.0.jar
commons-logging-1.1.1.jar spring-beans-2.5.6.jar stax-api-1.0.1.jar
jettison-1.1.jar spring-context-2.5.6.jar xpp3_min-1.1.4c.jar
spring-aop-2.5.6.jar spring-core-2.5.6.jar xstream-1.3.jar

./resources:
jobs

./resources/jobs:
sampleJob.xml
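
Because the descriptor sets includeBaseDirectory to false, the archive unpacks bin, lib, and resources straight into whatever directory it is extracted in. Here is a minimal sketch of unpacking it into a directory of its own. The function name unpack_distribution and the target path in the usage note are illustrative assumptions, not part of the sample project.

```shell
#!/bin/bash

# Hypothetical helper: unpack a distribution tarball into its own directory.
# Usage: unpack_distribution <archive.tar.gz> <target-dir>
unpack_distribution() {
  local archive="$1"
  local target="$2"

  # Refuse to continue if the archive is missing.
  if [ ! -f "$archive" ]; then
    echo "archive not found: $archive" >&2
    return 1
  fi

  # The tarball has no base directory, so create one before extracting.
  mkdir -p "$target" || return 1
  tar -xzf "$archive" -C "$target"
}
```

Assuming the archive is in Maven's default target directory, it could be called as: unpack_distribution target/batch-deploy-sample-1.0-SNAPSHOT-distribution.tar.gz /opt/batch/sample (the /opt/batch/sample path is only an example).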


All we need now is a simple script to actually run the job:

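#!/bin/bash

# Run from the root of the unpacked distribution: the paths below are relative.
# Start the classpath with resources/ so jobs/sampleJob.xml can be found on it.
CP=resources/

# Append every file in lib/ to the classpath, separated by colons.
for f in lib/*
do
  CP="$CP:$f"
done

java -cp "$CP" org.springframework.batch.core.launch.support.CommandLineJobRunner \
  jobs/sampleJob.xml sampleJob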


I’m probably not going to win any awards for my Bash scripting skills anytime soon, but it gets the job done and isn’t quite as arcane as a more concise version would be. Essentially, I’m building a classpath that starts with the resources directory and then looping over the files in lib, appending each one with a colon separator. Once that is done, I can start a Java process using CommandLineJobRunner as the main class. As described in the documentation, all I need to pass to the job runner is the XML file that defines the job and the job name. (It’s worth noting that normally you would also need JobParameters, but since I’m using the map-based job repository, which starts out empty on every run, they aren’t necessary.)
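
The classpath loop described above can also be pulled into a small function that takes the distribution root as an argument, so it no longer depends on the current directory. This is only a sketch: the name build_classpath is a hypothetical helper of mine, and unlike the script above it picks up only files ending in .jar.

```shell
#!/bin/bash

# Hypothetical helper: print a classpath for a distribution rooted at $1.
# The resources directory comes first, followed by every jar in lib/.
build_classpath() {
  local root="$1"
  local cp="$root/resources/"
  local jar

  for jar in "$root"/lib/*.jar
  do
    # With no jars present the glob stays unexpanded, so skip non-files.
    [ -f "$jar" ] || continue
    cp="$cp:$jar"
  done

  printf '%s\n' "$cp"
}
```

The result would then be passed to the JVM with something like java -cp "$(build_classpath /path/to/distribution)" followed by the same CommandLineJobRunner arguments as before. On Java 6 and later, a quoted wildcard entry such as "resources/:lib/*" should achieve much the same thing without a loop.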

You can download the running example from my GitHub account: batch-deploy-sample.

Comments

Lucas Ward
Interesting. I'll definitely add your code for the script directory. I'm not a big fan of using the CLASSPATH variable, so I think I'll stick to just passing in the classpath to the command line via -cp
Philippe
I slightly corrected the shell script so that it can be launched from anywhere, rather than having to cd to the bin directory.
Here is what I did:

# Set the classpath, resolving paths relative to this script's directory
SCRIPTDIR="$( cd "$( dirname "$0" )" && pwd )"
CLASSPATH=.

# Append every file in lib/ (globbing avoids parsing the output of ls)
for file in "$SCRIPTDIR"/../lib/*
do
CLASSPATH="$CLASSPATH:$file"
done
export CLASSPATH="$CLASSPATH:$SCRIPTDIR:$SCRIPTDIR/../resources"

# Launch the conversion job
java org.springframework.batch.core.launch.support.CommandLineJobRunner …
Philippe
Thanks a lot ! It's exactly what I was looking for and it works like a charm…
Dil………………………………………:)
I want to stop a running Spring Batch job. Please let me know how I can do that. I will be following your blog for an answer.
Lucas Ward
I'm sorry for the crazy late response, I've been extremely busy with client work for the last few weeks. Are you still having this issue?
David Tam
Thanks, this is very useful for someone totally new to Spring Batch, but when I run the job, I get this:

ClassPathXmlApplicationContext [INFO] Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@4ca31e1b: display name [org.springframework.context.support.ClassPathXmlApplicationContext@4ca31e1b]; startup date [Fri Oct 22 10:28:52 CDT 2010]; root of context hierarchy
2010-10-22 10:28:52 XmlBeanDefinitionReader [INFO] Loading XML bean definitions from class path resource [jobs/sampleJob.xml]
2010-10-22 10:28:53 CommandLineJobRunner [ERROR] Job Terminated in error:
org.springframework.beans.factory.BeanDefinitionStoreException: Unexpected exception parsing XML document from class path resource [jobs/sampleJob.xml]; nested exception is java.lang.IllegalArgumentException: 'beanName' must not be empty

It's probably something simple I'm missing.