Using Custom S3 Resource in Spring Batch Application

In the last article, Simple S3 ItemReader for Spring Batch Application, I explained how to write a custom FileItemReader for S3. As you might have noticed, that solution works well, but you have to restart your application/server for any change to the S3 file to take effect.

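Here, we will try another approach: we will extend the org.springframework.core.io.AbstractResource class and implement an S3 resource provider. Only getDescription() and getInputStream() are abstract in AbstractResource, but its default implementations of the other methods do not suit an S3 object, so we will implement or override all of the methods below: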

public String getDescription()
public InputStream getInputStream()
public boolean exists()
public long contentLength()
public long lastModified()
public String getFilename()
public URL getURL()

Let's implement these methods.

Method: getDescription()

This should return a short description of the AWS resource, including the bucket name and the object key.

    StringBuilder builder = new StringBuilder("S3 resource [bucket='");
    builder.append(this.bucketName);
    builder.append("' and key='");
    builder.append(this.key);
    builder.append("']");
    return builder.toString();
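
To make the resulting text concrete, here is a small stand-alone sketch of the same string building. The class name S3DescriptionDemo, the static describe() helper, and the bucket and key values are made up for illustration; in the real resource the values come from the bucketName and key fields.

```java
// Stand-alone illustration; S3DescriptionDemo and describe() are hypothetical names.
final class S3DescriptionDemo {

    // Builds the same text as getDescription() from a bucket name and an object key.
    static String describe(String bucketName, String key) {
        StringBuilder builder = new StringBuilder("S3 resource [bucket='");
        builder.append(bucketName);
        builder.append("' and key='");
        builder.append(key);
        builder.append("']");
        return builder.toString();
    }

    public static void main(String[] args) {
        // prints: S3 resource [bucket='my-bucket' and key='input/customers.csv']
        System.out.println(describe("my-bucket", "input/customers.csv"));
    }
}
```

Spring tends to include this description in log and exception messages, so having both the bucket and the key in it makes a failing job easier to diagnose.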

Method: getInputStream()

This method should return an S3ObjectInputStream, as shown below.

    GetObjectRequest getObjectRequest = new GetObjectRequest(this.bucketName, this.key);
    return this.amazonS3.getObject(getObjectRequest).getObjectContent();
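
Because the object is requested inside getInputStream() itself, every call returns the content as it is at that moment, which is presumably why this approach picks up file changes without a restart. The sketch below illustrates that idea only: an in-memory map stands in for the S3 bucket, and FreshStreamDemo, BUCKET and readAll() are hypothetical names, not part of Spring or the AWS SDK.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in: an in-memory map plays the role of the S3 bucket.
final class FreshStreamDemo {

    static final Map<String, byte[]> BUCKET = new ConcurrentHashMap<>();

    // Mirrors getInputStream(): the object is looked up at call time, never earlier.
    static InputStream getInputStream(String key) throws IOException {
        byte[] content = BUCKET.get(key);
        if (content == null) {
            throw new IOException("No object stored under key '" + key + "'");
        }
        return new ByteArrayInputStream(content);
    }

    // Reads the whole stream as UTF-8 text and closes it afterwards.
    static String readAll(String key) {
        try (InputStream in = getInputStream(key)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        BUCKET.put("input.csv", "id,name\n1,Ann\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll("input.csv"));
        BUCKET.put("input.csv", "id,name\n2,Bob\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAll("input.csv")); // now shows the updated content
    }
}
```

With the real S3ObjectInputStream, remember that the stream holds an HTTP connection, so whoever reads it should close it when done, as readAll() does here with try-with-resources.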

Method: exists()

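This checks whether the specified resource exists. We can use AmazonS3.getObjectMetadata() for that: the call returns the metadata when the object is there and throws an AmazonS3Exception with status code 404 when it is not.

    try {
        return this.amazonS3.getObjectMetadata(this.bucketName, this.key) != null;
    } catch (AmazonS3Exception e) {
        if (e.getStatusCode() == 404) return false; // no such object
        throw e; // any other failure is a real error
    }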
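
Since exists() must return a boolean while getObjectMetadata() signals a missing object by throwing, the method has to translate "not found" into false and let other failures through. Here is a minimal, SDK-free sketch of that translation. StatusException and the Supplier argument are hypothetical stand-ins for AmazonS3Exception and the metadata call, used only so the example runs on its own.

```java
import java.util.function.Supplier;

// Hypothetical stand-ins: StatusException mimics an SDK exception carrying an
// HTTP status code, and the Supplier mimics the metadata lookup.
final class ExistsDemo {

    static final class StatusException extends RuntimeException {
        final int statusCode;

        StatusException(int statusCode) {
            super("HTTP " + statusCode);
            this.statusCode = statusCode;
        }
    }

    // Returns true when the lookup succeeds, false on 404, and rethrows anything else.
    static boolean exists(Supplier<?> metadataLookup) {
        try {
            metadataLookup.get();
            return true;
        } catch (StatusException e) {
            if (e.statusCode == 404) {
                return false;
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        System.out.println(exists(() -> "metadata"));                          // true
        System.out.println(exists(() -> { throw new StatusException(404); })); // false
    }
}
```

Rethrowing non-404 errors is a design choice: a permissions or network problem is then reported as a failure instead of being mistaken for a missing file.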

Method: contentLength()

We can derive this from the ObjectMetadata returned by getObjectMetadata().

    return this.amazonS3.getObjectMetadata(this.bucketName, this.key).getContentLength();

Method: lastModified()

This works the same way, using the last-modified date from the ObjectMetadata.

    return this.amazonS3.getObjectMetadata(this.bucketName, this.key).getLastModified().getTime();

Method: getFilename()

This is simply the object key.

return this.key;

Method: getURL()

The URL can be constructed from the S3 endpoint of the client's region, the bucket name, and the object key.

    Region region = this.amazonS3.getRegion().toAWSRegion();
    return new URL("https", region.getServiceEndpoint(AmazonS3Client.S3_SERVICE_NAME),
            "/" + this.bucketName + "/" + this.key);

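To show what the three-argument URL constructor produces, here is a small sketch that takes the endpoint as a plain string instead of resolving it through the SDK. S3UrlDemo and buildUrl() are hypothetical names, and the endpoint, bucket and key values are examples only.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Stand-alone illustration; the endpoint is passed in rather than looked up via the SDK.
final class S3UrlDemo {

    // Builds a path-style URL: https://<endpoint>/<bucket>/<key>
    static URL buildUrl(String endpoint, String bucketName, String key) {
        try {
            return new URL("https", endpoint, "/" + bucketName + "/" + key);
        } catch (MalformedURLException e) {
            // Not expected for the built-in "https" protocol.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints: https://s3.eu-west-1.amazonaws.com/my-bucket/input/customers.csv
        System.out.println(buildUrl("s3.eu-west-1.amazonaws.com", "my-bucket", "input/customers.csv"));
    }
}
```

Note that this constructor does not encode anything, so a key containing spaces or other special characters would need to be encoded before it goes into the path.
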
Integrating S3 Resource with FlatFileItemReader

We call the setResource() method of FlatFileItemReader and set the S3 resource instance.

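public ItemReader<FieldSet> reader() throws IOException {
    // Example choices only: swap in the tokenizer and field set mapper your file needs.
    DefaultLineMapper<FieldSet> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(new DelimitedLineTokenizer());
    lineMapper.setFieldSetMapper(new PassThroughFieldSetMapper());

    FlatFileItemReader<FieldSet> reader = new FlatFileItemReader<>();
    reader.setResource(new S3Resource(s3Client(), "bucketName", "fileName"));
    reader.setLineMapper(lineMapper);
    return reader;
}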

Conclusion

So, we built a resource class for AWS S3 and set it as the resource of a FlatFileItemReader. You can implement an S3 writer in a similar way; let me know if you need any help with that.

Note: Spring Cloud has built-in support for reading and writing files on AWS S3. Refer to its documentation for details.
