Deployment has always been a tricky part of batch processing. Unlike a web application, a batch application has no standardized deployment, and there are a variety of environments it could be deployed to. You could deploy a batch application within a web container, or as a standalone Java app started by one of many available schedulers. Because everyone’s environment is different, no single example can serve as a starting point for all of them. However, I have done enough standalone deployments for Linux using Bash to share a simple example.
The job itself matters little for the purposes of this post. A simple one-tasklet job will suffice:
<job id="sampleJob" job-repository="jobRepository">
    <step id="simpleStep">
        <tasklet ref="tasklet" />
    </step>
</job>
<beans:bean id="jobRepository"
    class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean" />
<beans:bean id="transactionManager"
    class="org.springframework.batch.support.transaction.ResourcelessTransactionManager" />
<beans:bean id="jobLauncher"
    class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
    <beans:property name="jobRepository" ref="jobRepository" />
</beans:bean>
<beans:bean id="tasklet" class="net.lucasward.sample.SampleTasklet" />
(namespace removed for readability)
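The Tasklet simply prints ‘Hello World’:
import org.springframework.batch.core.StepContribution;
import org.springframework.batch.core.scope.context.ChunkContext;
import org.springframework.batch.core.step.tasklet.Tasklet;
import org.springframework.batch.repeat.RepeatStatus;

public class SampleTasklet implements Tasklet {
    // Called once by the step; FINISHED tells Spring Batch not to invoke it again.
    public RepeatStatus execute(StepContribution contribution,
            ChunkContext chunkContext) throws Exception {
        System.out.println("Hello World");
        return RepeatStatus.FINISHED;
    }
}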
Simple enough, right? Now for the hard part. If we want a scheduler to be able to launch this from the command line, what do we do? The first problem to solve is what our deployment looks like. In my mind, there are three necessary components:
- The jars themselves, which need to be on the classpath
- A script that the scheduler can call
- The XML and/or .properties files that will be used for configuration
Personally, I prefer to separate the XML files from the jars, to allow for tweaks to the job without repackaging the jar. This is especially useful for Spring Batch job definitions, although it may be less so for normal Spring configuration files. My preferred layout has three directories: bin, lib, and resources. Obviously, the scripts go in /bin, the jars in /lib, and the XML and properties files in /resources. It doesn’t really matter how you break yours up, but this is the layout I’ll be using. To create it, I’ll use Maven and the assembly plugin:
<plugin>
    <artifactId>maven-assembly-plugin</artifactId>
    <version>2.2-beta-2</version>
    <configuration>
        <descriptors>
            <descriptor>src/main/assembly/descriptor.xml</descriptor>
        </descriptors>
    </configuration>
    <executions>
        <execution>
            <id>make-distribution</id>
            <phase>package</phase>
            <goals>
                <goal>single</goal>
            </goals>
        </execution>
    </executions>
</plugin>
And the descriptor:
<assembly>
    <id>distribution</id>
    <formats>
        <format>tar.gz</format>
    </formats>
    <includeBaseDirectory>false</includeBaseDirectory>
    <fileSets>
        <fileSet>
            <directory>src/main/scripts</directory>
            <outputDirectory>bin</outputDirectory>
            <useDefaultExcludes>true</useDefaultExcludes>
        </fileSet>
        <fileSet>
            <directory>src/main/resources</directory>
            <outputDirectory>resources</outputDirectory>
            <useDefaultExcludes>true</useDefaultExcludes>
            <filtered>true</filtered>
        </fileSet>
    </fileSets>
    <dependencySets>
        <dependencySet>
            <outputDirectory>lib</outputDirectory>
        </dependencySet>
    </dependencySets>
</assembly>
The format I’m using is tar.gz, since I’m targeting Linux, but the assembly plugin supports others, such as zip.
The two fileSet entries describe the bin and resources directories. (I haven’t talked about the script yet, but I will below.) The dependencySet entry copies all the dependencies that Maven is managing into the lib directory, including the project’s own jar. When you run ‘mvn install’, two artifacts are created: the normal jar, and another file with the same base name ending in -distribution.tar.gz. In my example those are batch-deploy-sample-1.0-SNAPSHOT.jar and batch-deploy-sample-1.0-SNAPSHOT-distribution.tar.gz. Extracting the archive gave me the following:
./bin:
sampleJob.sh
./lib:
aopalliance-1.0.jar spring-batch-core-2.1.1.RELEASE.jar spring-tx-2.5.6.jar
batch-deploy-sample-1.0-SNAPSHOT.jar spring-batch-infrastructure-2.1.1.RELEASE.jar stax-1.2.0.jar
commons-logging-1.1.1.jar spring-beans-2.5.6.jar stax-api-1.0.1.jar
jettison-1.1.jar spring-context-2.5.6.jar xpp3_min-1.1.4c.jar
spring-aop-2.5.6.jar spring-core-2.5.6.jar xstream-1.3.jar
./resources:
jobs
./resources/jobs:
sampleJob.xml
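To make the unpacking step concrete, here is a rough sketch of how the tarball could be extracted and sanity-checked on the target machine. It is not part of the sample project: the function name unpack_distribution and the /opt path in the commented call are my own placeholders. The chmod is there on the assumption that the archive may not preserve the executable bit on the script.

```shell
#!/bin/bash
# Sketch only: unpack_distribution is a hypothetical helper name.

# Extract a distribution tarball into a target directory and check the layout.
unpack_distribution() {
  local tarball="$1"
  local target="$2"
  mkdir -p "$target" || return 1
  # The descriptor sets includeBaseDirectory to false, so bin/, lib/, and
  # resources/ sit at the top of the archive; extract into a dedicated directory.
  tar -xzf "$tarball" -C "$target" || return 1
  local dir
  for dir in bin lib resources; do
    if [ ! -d "$target/$dir" ]; then
      echo "missing directory: $target/$dir" >&2
      return 1
    fi
  done
  # Make the launcher scripts executable in case the archive did not keep that bit.
  chmod +x "$target"/bin/*.sh
}

# Example call (placeholder target path), commented out to avoid side effects:
# unpack_distribution target/batch-deploy-sample-1.0-SNAPSHOT-distribution.tar.gz /opt/batch-deploy-sample
```

Extracting into its own directory matters here because the archive has no base directory of its own; unpacking it in place would scatter bin, lib, and resources into whatever directory you happen to be in.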
All we need now is a simple script to actually run the job:
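#!/bin/bash
# Run from the root of the unpacked distribution; all paths below are relative.
# resources/ comes first so jobs/sampleJob.xml resolves as a classpath resource.
CP=resources/
# Append every file in lib/ to the classpath, separated by colons.
for f in lib/*
do
  CP="$CP:$f"
done
java -cp "$CP" org.springframework.batch.core.launch.support.CommandLineJobRunner \
  jobs/sampleJob.xml sampleJob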
I’m probably not going to win any awards for my bash scripting skills anytime soon, but it gets the job done and isn’t quite as arcane as a more concise version would be. Essentially, I’m creating a classpath from all the jar files in /lib by looping through the files and separating them with a colon. Because the paths are relative, the script has to be run from the root of the unpacked distribution. Once the classpath is built, I can start a java process with CommandLineJobRunner as the main class. As described in the documentation, all I need to pass to the job runner is the XML file that defines the job and the job name. (It’s worth noting that normally you would also need JobParameters to distinguish one run from the next, but since I’m using the in-memory MapJobRepositoryFactoryBean, which starts empty on every launch, they aren’t necessary.)
You can download the running example from my GitHub account: batch-deploy-sample.
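Since a scheduler rarely starts a script from a predictable working directory, one possible variation is to resolve the distribution root from the script’s own location. The following is only a sketch under that assumption; build_classpath and run_job are hypothetical helper names of mine, not something in the sample project.

```shell
#!/bin/bash
# Sketch only: build_classpath and run_job are hypothetical helper names.

# Print a colon-separated classpath: <root>/resources, then every jar in <root>/lib.
build_classpath() {
  local root="$1"
  local cp="$root/resources"
  local jar
  for jar in "$root"/lib/*.jar; do
    # Skip the unexpanded pattern when lib/ contains no jars.
    [ -e "$jar" ] || continue
    cp="$cp:$jar"
  done
  printf '%s\n' "$cp"
}

# Launch a job; arguments are the job XML (relative to resources/) and the job name.
run_job() {
  local script_dir root
  # Resolve the directory holding this script, then step up from bin/ to the root.
  script_dir="$(cd "$(dirname "$0")" && pwd)"
  root="$(cd "$script_dir/.." && pwd)"
  java -cp "$(build_classpath "$root")" \
    org.springframework.batch.core.launch.support.CommandLineJobRunner "$@"
}

# Example call, commented out so the sketch has no side effects:
# run_job jobs/sampleJob.xml sampleJob
```

Because run_job forwards all of its arguments, anything after the job name (for example key=value job parameters, should you switch to a persistent repository) would be handed to CommandLineJobRunner unchanged.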
Here is what I did:
# Set the classpath
SCRIPTDIR="$( cd "$( dirname "$0" )" && pwd )"
CLASSPATH=.
for file in `ls $SCRIPTDIR/../lib`
do
export CLASSPATH=$CLASSPATH:$SCRIPTDIR/../lib/$file
done
export CLASSPATH=$CLASSPATH:$SCRIPTDIR:$SCRIPTDIR/../resources
# Launch the conversion job
java org.springframework.batch.core.launch.support.CommandLineJobRunner …
ClassPathXmlApplicationContext [INFO] Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@4ca31e1b: display name [org.springframework.context.support.ClassPathXmlApplicationContext@4ca31e1b]; startup date [Fri Oct 22 10:28:52 CDT 2010]; root of context hierarchy
2010-10-22 10:28:52 XmlBeanDefinitionReader [INFO] Loading XML bean definitions from class path resource [jobs/sampleJob.xml]
2010-10-22 10:28:53 CommandLineJobRunner [ERROR] Job Terminated in error:
org.springframework.beans.factory.BeanDefinitionStoreException: Unexpected exception parsing XML document from class path resource [jobs/sampleJob.xml]; nested exception is java.lang.IllegalArgumentException: 'beanName' must not be empty
It's probably something simple I'm missing.