Monday, October 27, 2014

Springing into Mongo

Much has been made in recent years of document/NoSQL databases, and for very good reason. As the problems we face as engineers evolve, so do our storage needs -- or lack of storage needs.  The weight and unwieldy nature of relational databases simply aren't always the best answer.  Mongo pairs replication capabilities that work almost flawlessly out of the box with being completely free and open source, which makes it a pretty amazing alternative to "traditional" solutions like Oracle and SQL Server.  It's even more appetizing when you consider that the longstanding independent stalwart, MySQL, now resides in the grip of Oracle.

So how can you reap the benefits of Mongo on your project?  Per the theme of my previous blog, we're going to stay in the comfortable confines of Spring and Spring Boot.  Spring Boot has a "starter" for Mongo that is incredibly instructive.

"Springing" off from this beginning, it's incredibly easy to apply the technology to the solution of your choice.  There's one minor shortcoming that needs to be patched up, however, and we will look at that a bit later on. 

For starters, let's look at how Mongo differs from standard JPA in Spring.

My favorite Mongo visualizer is Robomongo, and I highly recommend having it around if you plan on working with Mongo extensively. 


Entities  

Much like JPA entities are annotated with @Entity, any item you want to store as a Mongo doc should be annotated as @Document.  You don't need to annotate every field that you'd like to persist; fields are persisted by default.  To apply an index to any field, annotate it with @Indexed.  This is recommended for fields that are queried and accessed frequently.  Here's an example of a simple Mongo object:

import org.bson.types.ObjectId
import org.springframework.data.annotation.Id
import org.springframework.data.mongodb.core.index.Indexed
import org.springframework.data.mongodb.core.mapping.Document

@Document
class Item {

  @Id
  ObjectId id // ObjectId is a GUID-like unique ID that works well with Mongo

  @Indexed
  String itemIdentifier

  Double price
}

When you boot up your Spring context for the first time and start trying to put some data in, a collection of documents will be created.  I won't belabor the differences between Mongo and a relational DB too much more, but what's worth keeping in mind is that you don't have a uniform "table structure" as such.  When you're working with object models, you will find out relatively quickly if you're persisting things right, but you often won't get the sort of show-stopping ugly errors and exceptions that a SQL-based driver would give you if you tried to do something untoward.  That being said, let's take a brief look at how you configure Spring to connect to a Mongo server and database, as I didn't find this to be well-documented on Spring's site:

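import com.mongodb.Mongo
import org.springframework.context.annotation.Bean
import org.springframework.context.annotation.ComponentScan
import org.springframework.context.annotation.Configuration
import org.springframework.data.mongodb.config.AbstractMongoConfiguration
import org.springframework.data.mongodb.repository.config.EnableMongoRepositories

@Configuration
@ComponentScan
@EnableMongoRepositories
class Config extends AbstractMongoConfiguration {

  // Name of the Mongo database to use
  protected String getDatabaseName() {
    return "myDb"
  }

  // Driver connection to the Mongo server (host:port)
  @Bean
  Mongo mongo() {
    return new Mongo('127.0.0.1:27017')
  }
}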

When you combine this configuration with Spring Boot, you're pretty much good to go as far as connecting to your Mongo DB and doing the basic things.  But we all know you want to do more than the basics.  Thankfully, there are some very easy ways to pep things up.  As I mentioned in my prior post, it's extremely easy to enable repositories in Spring.  It's pretty much the same thing with Mongo, though you need to add an @EnableMongoRepositories annotation to your configuration. From here, it's exactly the same exercise as it is with Spring JPA.


Repositories 

With Mongo, your repository will look identical to its JPA equivalent:

import org.bson.types.ObjectId
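import org.springframework.data.domain.Page
import org.springframework.data.domain.Pageable
import org.springframework.data.repository.PagingAndSortingRepository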

interface ItemRepository extends PagingAndSortingRepository<Item, ObjectId> {
  Item save(Item item)

  Item findOne(ObjectId id)

  Page<Item> findAll(Pageable options)
}

I'm also pleased to report that you can use Spring's excellent @RepositoryRestResource annotation to effortlessly ReST-enable your data layer.  I probably gushed about this enough in my last post, but this really eliminates the need to write a service layer in large part.  Like so many other things in Spring, this really saves you time when you're getting a project rolling. 


Gotchas: Parent-Child relationships

In my experience with Mongo, the one hang-up I bumped into was Mongo's (or rather Spring Mongo's) implementation of parent-child relationships.  Strictly speaking, they don't exist.  Mind you, there IS a @DBRef annotation that tells the framework to establish a connection between collections, but in my experience, it was a bit of a poor substitute for something like @OneToMany and @ManyToOne.  This being said, I stumbled upon a functional gap that most people seem to circumvent in this scenario.  You see, the commonly accepted way to persist children in Mongo is to simply ship the entire collection of children in with the parent object.  Mongo supports arrays -- so this can be reasonably efficient if your document is reasonably sized.

My use case involved a small parent document with vast numbers of children.  So I needed to link one collection to another, but I also needed to be able to get all of the children in one shot, and more importantly, I wanted to be able to add a child without pulling down the entire parent document first.  Pulling the parent down first would have been even more difficult in a concurrent or asynchronous setting.  Thankfully, with a little bit of code and -- admittedly -- some trial and error, I was able to come up with a workable solution.  And I'll admit, it took me on a trip down memory lane.  As you've seen here, my examples are all in Groovy.  I rarely code in Java anymore if I can avoid it, and in putting this solution together, I was required to tap into some old friends from the Java/Spring way.  Without further ado, here's the code:

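import java.lang.annotation.*

// Marks the field on a child document that points back at its parent
@Retention(RetentionPolicy.RUNTIME)
@Target([ ElementType.FIELD ])
public @interface Parent {
}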

------
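import com.mongodb.DBObject
import com.rocksoft.example.domain.Parent // see above
import org.springframework.beans.factory.annotation.Autowired
import org.springframework.data.mongodb.core.MongoOperations
import org.springframework.data.mongodb.core.mapping.DBRef
import org.springframework.data.mongodb.core.mapping.event.AbstractMongoEventListener
import org.springframework.stereotype.Component
import org.springframework.util.ReflectionUtils
import java.lang.reflect.Field


@Component
class MongoListener extends AbstractMongoEventListener<Object> {

  @Autowired
  MongoOperations mongoOperations

  // Runs after every save; links the saved object into its parent, if it has one
  public void onAfterSave(final Object source, DBObject dbo) {
    ReflectionUtils.doWithFields(source.class,
      new ReflectionUtils.FieldCallback() {
      void doWith(Field field) {
        // Only fields tagged with both @DBRef and @Parent point at a parent
        if (field.isAnnotationPresent(DBRef) &&
            field.isAnnotationPresent(Parent)) {
          ReflectionUtils.makeAccessible(field)
          def fieldValue = field.get(source)
          if (fieldValue == null) {
            return // no parent set on this object
          }

          // Find the parent's field that holds this type: either a generic
          // type whose first type argument is the child class (one-to-many)
          // or a field of the child class itself (one-to-one)
          Field parentField = fieldValue.class.declaredFields.find {
            (it.genericType?.hasProperty('actualTypeArguments') &&
             it.genericType?.actualTypeArguments?.first() ==
             source.class) || it.type == source.class
          }
          if (parentField == null) {
            return // parent has no field for this child type
          }

          ReflectionUtils.makeAccessible(parentField)
          if (Collection.isAssignableFrom(parentField.type)) {
            Collection value = parentField.get(fieldValue)
            if (!value) {
              value = []
            }

            value << source
            fieldValue."$parentField.name" = value
          } else {
            fieldValue."$parentField.name" = source
          }

          mongoOperations.save(fieldValue)
        }
      }
    })
  }
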
}

There is a bit of hullabaloo here, but I'll try to summarize what you're seeing.  First off, we owe the ease of doing this to the Mongo listening capabilities in Spring. What we're doing, in short, is watching the objects that get saved and checking whether each one is called out as having a parent that needs to know about it.  To "tag" something as part of a parent-child relationship, we put the @Parent annotation where we would normally have a @ManyToOne, along with a @DBRef annotation, so Spring knows that it needs to link things up.  A lot of the code you're seeing simply finds the target field in the parent object.  This can be done via one-to-one or one-to-many.  This would need to be enhanced a bit for true production use, as a non-collection genericized type would cause the code to crash and burn.  Finally, we set the value back on the parent, and we use the framework-provided MongoOperations class from our context to plug the modified object back in.
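
The field lookup is the fiddly part, so here is a minimal sketch of one stricter way to do it, pulled out of the listener so it can run on its own with nothing but Groovy.  Invoice, InvoiceLine and findParentField are hypothetical names I made up for illustration; this is not the code from the listener, just one possible way to keep a non-collection generic field (like the Optional below) from being picked up by mistake.

```groovy
import groovy.transform.CompileStatic
import java.lang.reflect.Field
import java.lang.reflect.ParameterizedType

// Hypothetical stand-ins for a parent and child; no Spring or Mongo involved.
class InvoiceLine {
  Invoice invoice
}

class Invoice {
  Optional<InvoiceLine> featured   // generic, but not a collection
  List<InvoiceLine> lines          // the field we want to find
}

// Returns the field on parentType that can hold childType, or null if none.
// Statically compiled so the reflection calls go through the public
// java.lang.reflect interfaces.
@CompileStatic
Field findParentField(Class parentType, Class childType) {
  parentType.declaredFields.find { Field f ->
    if (f.type == childType) {
      return true                  // one-to-one: field of the child type
    }
    def generic = f.genericType
    // one-to-many: a Collection whose first type argument is the child type
    return Collection.class.isAssignableFrom(f.type) &&
        generic instanceof ParameterizedType &&
        ((ParameterizedType) generic).actualTypeArguments[0] == childType
  }
}

assert findParentField(Invoice, InvoiceLine).name == 'lines'
```

The only real difference from the lookup in the listener is that the generic check applies to collections alone, so a field like featured is skipped instead of being treated as a collection later on.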

There's one final gotcha that had me wrapped around the axle.  In your parent @DBRef annotation, you will need to add a lazy = true attribute. Failure to do so resulted in a StackOverflowError that I didn't spend too much time chasing, as my use case essentially screamed for lazy collections.
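
To make the tagging concrete, here is a rough sketch of how a parent and child might be annotated so the listener can pick them up.  The Invoice and InvoiceLine classes and their fields are hypothetical; only @Document, @Id, @DBRef (with its lazy attribute) and the @Parent annotation defined earlier are taken as given.

```groovy
import com.rocksoft.example.domain.Parent // the custom annotation from above
import org.bson.types.ObjectId
import org.springframework.data.annotation.Id
import org.springframework.data.mongodb.core.mapping.DBRef
import org.springframework.data.mongodb.core.mapping.Document

// Hypothetical parent document
@Document
class Invoice {
  @Id
  ObjectId id

  // Parent side of the relationship; lazy = true is the attribute
  // described in the gotcha above
  @DBRef(lazy = true)
  List<InvoiceLine> lines
}

// Hypothetical child document, stored in its own collection
@Document
class InvoiceLine {
  @Id
  ObjectId id

  // Child side: @DBRef plus @Parent is what the listener looks for
  @DBRef
  @Parent
  Invoice invoice

  String description
}
```

With a mapping along these lines, saving a new InvoiceLine that points at an already-saved Invoice should be enough for the listener to append the line to the parent's collection and save the parent again.  The parent needs to be saved first so that it has an id for the reference to point at.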


In Closing

Hopefully this can save some time for those of you that are looking to make the jump to Mongo for persistence.  The great news is that once you've laid the foundation down with Spring, it's really pretty easy to swap out your database underneath.  Mongo puts reliable replication and extremely flexible storage options at your fingertips.  But as with everything else, I strongly urge you to look at the costs and benefits. For the parent/child requirement, I think a relational database is probably a better option in the end, simply because it doesn't require the "duct tape" I shared above.  I ended up settling on PostgreSQL, another open source and otherwise-unaffiliated alternative that is quite easy to install and maintain.

Wednesday, October 22, 2014

The Power Of Spring Boot

Spring Boot is described by its creators as an "opinionated" way to go about building production-ready Spring applications.  As a long-time advocate of this ever-evolving, incredibly handy framework, I was wielding a hammer and searching for nails.  When I finally started a suitable project, I was eager to dig in to see what kind of savings in time and sanity I could reap.

Where Spring has always come in incredibly handy is by simplifying the things we do with Java and taking some of the guesswork out of these common operations.  A lot of us in the Java community have taken it a step further by adopting Groovy (or another JVM language) as our main coding mechanism.  The deal gets even more sweet when you mix in Spring's dependency injection capabilities as well as its myriad framework features. 

Let's face it: if you're an engineer building "business" apps, most of what you're going to do is take data from one place and put it into another.  There are a lot of creative ways to do this.  I'll admit that I've occasionally gone overly elaborate when it wasn't necessary, but boredom and repetition force us to do some strange things sometimes.  What I've seen with Spring Boot so far is really changing the entire discourse as it concerns these mundane activities.  It frees us up to do things that are less repetitive. And potentially more creative!


Step 0: Bootstrapping

Using the build automation tool of your choice (I like Gradle), follow the guidelines on Spring's site to make a simple build script.  Also, create a simple bean annotated with @Configuration and @ComponentScan.
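
For reference, a minimal sketch of such a bean might look like the following.  The class name Application, the main method and the @EnableAutoConfiguration annotation are my assumptions about a typical Spring Boot 1.x setup rather than anything this post depends on; adjust to taste.

```groovy
import org.springframework.boot.SpringApplication
import org.springframework.boot.autoconfigure.EnableAutoConfiguration
import org.springframework.context.annotation.ComponentScan
import org.springframework.context.annotation.Configuration

// Hypothetical entry point: scans this package for components and lets
// Spring Boot configure whatever it finds on the classpath
@Configuration
@ComponentScan
@EnableAutoConfiguration
class Application {

  static void main(String[] args) {
    SpringApplication.run(Application, args)
  }
}
```

Anything annotated as a component in the same package (or below it) gets picked up by the scan, which is what lets the repositories in the next steps be wired in without extra configuration.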


Step 1: Persistence

We usually start with some sort of a data model.  This has been exceedingly easy with Spring.  You have some tables that you need to represent as objects in your code?  No problem.

import javax.persistence.*

@Entity
@Table(name = "item")
class Item {
 @Id
 @Column(name = "item_id")
 Long id

 @Column(name = "item_name")
 String name

}


Step 2: Make a repository

This is where it starts to get pretty cool.  The @Repository stereotype has really evolved over time.  To make a fully functional repository that can be wired in for data access, just make an interface that looks something like this: 

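import org.springframework.data.domain.Page
import org.springframework.data.domain.Pageable
import org.springframework.data.repository.PagingAndSortingRepository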

interface ItemRepository extends PagingAndSortingRepository<Item, Long> {

 Item save(Item item)
 Item findOne(Long id)
 Page<Item> findAll(Pageable options)

}

You see we used the souped-up PagingAndSortingRepository, but there are other options that are slightly more generic.  It all depends upon what you need.  At this point, you could make a rudimentary Spring app that accessed your database, just by wiring in your repo. 

.
.
.
@Autowired
ItemRepository repo

void foo() {
  repo.save(new Item(id: 1L, name: 'foobar'))
  assert repo.findOne(1L).name == 'foobar'
}
.
.
.

Step 3: Now it gets really cool

This is all fine and dandy, but not super practical.  What we'd normally do at this point is whip up some kind of a ReST service that would broker the situation for us.  The fine folks behind Spring realize that we're sick of doing this repeatedly, and they've made it insanely easy.  Enter Spring Data REST.  We can modify what we have above incredibly easily.  If you're using Gradle, you just have to make sure you have the right dependencies:

    compile("org.springframework.boot:spring-boot-starter-data-rest")
    compile("org.springframework.boot:spring-boot-starter-data-jpa")

Then add another annotation to your config: 

@Import(RepositoryRestMvcConfiguration)

Then make a few minor alterations to your repo: 

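import org.springframework.data.domain.Page
import org.springframework.data.domain.Pageable
import org.springframework.data.repository.PagingAndSortingRepository
import org.springframework.data.repository.query.Param
import org.springframework.data.rest.core.annotation.RepositoryRestResource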

@RepositoryRestResource(path = "item")
interface ItemRepository extends PagingAndSortingRepository<Item, Long> {

 Item save(@Param("item") Item item)
 Item findOne(Long id)
 Page<Item> findAll(Pageable options)

}

Start up your application, and you'll have an endpoint hanging off of your context root at "item".  If you access it, you'll see that you get a JSON response.  What's pretty nifty here is that it will only support the ReST equivalent of the operations you've called out.  See the Spring site for a full list.  Given the code we've sketched out, you can do a GET, one with an ID parameter and one without, and a POST.  If you POST directly to the "item" endpoint, you'll see some data saved.  Your body would look something like this:

{
"id": 2,
"name": "foobar"
}
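
If you'd rather try that POST from a Groovy script than from a ReST client, a rough sketch follows.  The helper names itemBody and postJson are made up for illustration, and the URL in the usage comment assumes the app is running locally on port 8080 with an empty context root, which may not match your setup.

```groovy
import groovy.json.JsonOutput

// Builds the JSON request body for a new item.
String itemBody(Long id, String name) {
  JsonOutput.toJson([id: id, name: name])
}

// POSTs a JSON body to the given URL and returns the HTTP status code.
int postJson(String url, String body) {
  HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection()
  conn.requestMethod = 'POST'
  conn.doOutput = true
  conn.setRequestProperty('Content-Type', 'application/json')
  conn.outputStream.withWriter('UTF-8') { it << body }
  conn.responseCode
}

// Usage, assuming the application is running locally:
// postJson('http://localhost:8080/item', itemBody(2L, 'foobar'))
```

Nothing here is specific to Spring; it's plain JDK networking plus Groovy's JSON support, so the same helper works against any of the endpoints your repositories expose.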

This is really what makes this framework cool.  For a large percentage of the apps, there's simply no need to even write a services layer.  Model up your objects, spell out your data operations in an interface, and you're set.  No more boilerplate madness. 


Things to consider

In closing, if you're thinking about making the data situation a bit more robust, bear in mind that Spring Data REST deals in "links" when modeling relationships.  In referring to an object, you have to provide its URL.  It appears to me that Spring just peels the meaningful part off of the URL you provide, but I haven't dug deep enough to guarantee it. 

A final thing that I hope can save some of you a bit of time: you can't persist child objects through the parent object's REST endpoint.  Say "Item" has "ChildItem" objects.  You will need to expose a ChildItem repository with its respective @RepositoryRestResource. This may seem unwieldy at first, but I think it's fair to have to call out the ReST operations you'd like the child object to support.  Say you want to create a child object: 

{
"id": 4,
"parent": "http://my.site.com/item/2",
"foo": "bar"
}

This was another thing that had me running all over the place and didn't seem to be well-documented.  

Happy Springing!