Declaring Transaction Attributes on Spring's TransactionProxyFactoryBean

Posted on January 23, 2006 by Scott Leberknight

One of the (many) benefits of using Spring for web application development is that you can declaratively define transaction boundaries on POJOs, e.g. business service layer classes that delegate to one or more DAOs underneath. One of the simplest ways to do this is to use Spring's TransactionProxyFactoryBean to specify how to apply transactions to methods on a proxied object. Without going into all the gory details of how Spring AOP works, using dynamic proxies and such, there is something to watch out for when using this useful class. Typically when using TransactionProxyFactoryBean you'll create a parent bean definition in your Spring ApplicationContext which defines the transaction manager and the transaction attributes. You then create child beans which inherit the transaction manager and transaction attributes. For example:

<bean id="abstractTransactionProxy"
    abstract="true"
    class="org.springframework.transaction.interceptor.TransactionProxyFactoryBean">
    <property name="transactionManager" ref="hibernateTransactionManager"/>
    <property name="transactionAttributes">
        <props>
            <prop key="get*">PROPAGATION_SUPPORTS,readOnly</prop>
            <prop key="find*">PROPAGATION_SUPPORTS,readOnly</prop>
            <prop key="is*">PROPAGATION_SUPPORTS,readOnly</prop>
            <prop key="*">PROPAGATION_REQUIRED</prop>
        </props>
    </property>
</bean>

You would then create child beans which inherit from abstractTransactionProxy and which are your actual service POJOs. For example:

<bean id="customerService" parent="abstractTransactionProxy">
    <property name="target" ref="customerTarget"/>
</bean>

<bean id="productService" parent="abstractTransactionProxy">
    <property name="target" ref="productTarget"/>
</bean>

Easy enough, right? The child beans will now inherit the transaction manager and all the transaction attributes. The transaction attributes defined in abstractTransactionProxy specify several things. First, any method whose name begins with "get", "find", or "is" will support execution within a transaction, and the "readOnly" marker indicates that it can be executed in a read-only mode. So, if one of these methods is called when there is no existing transaction, no transaction will be created and the method executes without the overhead of a database transaction. If one of these methods is called when there is an existing transaction, it simply inherits the transactional context and executes within that transaction. So far so good. The last part of the transaction attributes definition specifies "*" as the method name, meaning any other method whose name doesn't start with "get", "find", or "is". Any such method is required to execute within a transaction: if a transaction already exists the method inherits its transactional context; if not, a new transaction is automatically created. The "*" acts as a catch-all for a developer who adds a new method that must execute transactionally but forgets to add the method to the transaction attributes definition. Take the following example:

<bean id="abstractTransactionProxy"
    abstract="true"
    class="org.springframework.transaction.interceptor.TransactionProxyFactoryBean">
    <property name="transactionManager" ref="hibernateTransactionManager"/>
    <property name="transactionAttributes">
        <props>
            <prop key="create*">PROPAGATION_REQUIRED</prop>
            <prop key="update*">PROPAGATION_REQUIRED</prop>
            <prop key="delete*">PROPAGATION_REQUIRED</prop>
            <prop key="*">PROPAGATION_SUPPORTS,readOnly</prop>
        </props>
    </property>
</bean>

What's the difference here? In this case, if a developer doesn't follow the naming conventions for "bang" methods - methods that insert, update, or delete database records - then no transactions are applied and their changes will not be committed to the database. For example, suppose a developer creates a method named "saveProduct" in a class which has the above transaction attributes applied to it. What happens? It supports an existing transaction, but if called without an existing transaction it will execute non-transactionally and in a read-only fashion, since the "*" transaction attribute is the rule that applies. Thus, when called outside an existing transaction, the database changes would never take effect! In fact, if the method is sometimes called when there is an existing transaction and sometimes not, a subtle bug would arise since sometimes the database changes would be committed and sometimes not! To allow developers to add methods whose name begins with "save", you would have to add another transaction attribute definition. Even worse, what happens if no rule matches a given method? Take the following:

<bean id="abstractTransactionProxy"
    abstract="true"
    class="org.springframework.transaction.interceptor.TransactionProxyFactoryBean">
    <property name="transactionManager" ref="hibernateTransactionManager"/>
    <property name="transactionAttributes">
        <props>
            <prop key="create*">PROPAGATION_REQUIRED</prop>
            <prop key="update*">PROPAGATION_REQUIRED</prop>
            <prop key="delete*">PROPAGATION_REQUIRED</prop>
        </props>
    </property>
</bean>

For this example, if a method named "saveProduct" is executed, none of the above transaction attributes apply, and the method is assumed to be non-transactional. So what is the point of all this? There are several takeaways. First, define your transaction attributes explicitly for read-only methods like "get*" and "find*". Second, have the read-only method definitions support an existing transaction using PROPAGATION_SUPPORTS and add the readOnly marker. This ensures that when there is no existing transaction, the method executes non-transactionally and may be optimized by the JDBC driver and database to be read-only. Third, add a "*" rule to the transaction attributes definition which specifies the default transaction rule. Fourth, make this default catch-all rule as conservative as possible. Usually you will want to use PROPAGATION_REQUIRED, as this will require the method to execute within an existing transaction or create a new one. It is much better to have a read-only method execute transactionally than to have a bang method which changes database state execute non-transactionally, even though it will have additional overhead. Finally, perform code reviews, write a script, or find some other way to automate checking that your naming conventions are being followed so that methods execute with the correct transactional context.
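
On that last point, here is a minimal sketch of one way to automate such a check: a JUnit test that uses reflection to verify that every public method on a service interface starts with one of the prefixes covered by the transaction attributes. The CustomerService interface and the prefix list are hypothetical and would need to be adapted to your own services and attribute definitions.

package com.sleberknight.tests;

import java.lang.reflect.Method;

import junit.framework.TestCase;

/**
 * Sketch of an automated naming convention check. The prefixes and the
 * CustomerService interface are placeholders; adjust them to match your
 * own transaction attribute definitions and services.
 */
public class TransactionNamingConventionTest extends TestCase {

    // Prefixes that have an explicit transaction attribute defined
    private static final String[] KNOWN_PREFIXES =
            { "get", "find", "is", "create", "update", "delete", "save" };

    public void testServiceMethodNamesMatchTransactionAttributes() {
        checkMethodNames(CustomerService.class);  // hypothetical service interface
    }

    private void checkMethodNames(Class serviceInterface) {
        Method[] methods = serviceInterface.getMethods();
        for (int i = 0; i < methods.length; i++) {
            String name = methods[i].getName();
            assertTrue("Method " + name + " on " + serviceInterface.getName() +
                    " does not match any known transaction attribute prefix",
                    startsWithKnownPrefix(name));
        }
    }

    private boolean startsWithKnownPrefix(String methodName) {
        for (int i = 0; i < KNOWN_PREFIXES.length; i++) {
            if (methodName.startsWith(KNOWN_PREFIXES[i])) {
                return true;
            }
        }
        return false;
    }
}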

No One "Caught" It?

Posted on January 14, 2006 by Scott Leberknight

Apparently none of the <sarcasm>tons of people who read my blog</sarcasm> caught the fact that in my test failings part ii entry the test code is wrong. (I only caught it because I read that entry as I was writing about validation in Rails.) If the line of code dao.findVisit(-9999L) executes without throwing any exception, then the test still passes! Just because the test method throws Exception and the try/catch only deals with HibernateObjectRetrievalFailureException does not mean the test fails when the findVisit method throws no exception at all. You still need to remember to call fail() if no exception is thrown, as the following shows.

public void testFindNonExistentVisit() throws Exception {
    try {
        dao.findVisit(-9999L);
        fail("findVisit should have thrown an exception!");  // you do need this line!
    }
    catch (HibernateObjectRetrievalFailureException ex) {}
}

So even though all the ExceptionAssertionTemplate code is more verbose (hey, it's what you expect from Java developers, right?) it still ensures that you don't forget to add the call to fail() in your test methods. I don't think it's that much more verbose, and since I apparently cannot even get my code correct in a blog entry, it's probably a good idea to use it, or else I'll keep forgetting to call fail() and have my tests passing when they should fail.

Suppressing Validation in Rails On Fields That Are Not Required

Posted on January 14, 2006 by Scott Leberknight

I've been really busy since my last post, which was back in October of last year. Time goes by too damn fast! Anyway, I've been mucking around with Ruby on Rails lately and was adding some validation to a model object. I had a bunch of fields that are not required, but if a value is entered they should be validated against a specific format. In my case, these were email addresses and phone numbers. Normally to validate the format of a field in Rails you can simply write the following. (Note that EMAIL_FORMAT is a Regexp object created as a constant in the class, in case you wonder where it came from.)

validates_format_of :email,
                    :with => EMAIL_FORMAT,
                    :message => 'must be a valid email address, e.g. a@b.com'

That is all well and good except for the minor problem that this will cause Rails to run validation on the email field regardless of whether it is empty or not. So even though the field isn't required, Rails will validate it and report an error back to the user. Not exactly what I had in mind.

So after some extensive Googling, I found some documentation and examples that use the :if option along with a Proc object, which I have heard of but don't really understand. I'll learn that later; for now I simply want to make the validation work properly. So the following code will validate the email address, but only if the user actually entered a value.

validates_format_of :email,
                    :allow_nil => true,
                    :with => EMAIL_FORMAT,
                    :if => Proc.new {|c| not c.email.blank?},
                    :message => 'must be a valid email address, e.g. a@b.com'

So that's how you do it. You create a new Proc object with a block that checks whether the field you are validating is blank. Rails will only run validation on the email field if it's not blank. Note that I also used the :allow_nil option to tell Rails that a nil value is OK. If the value is nil then validation is suppressed. But won't that take care of blank values sent in from a web form? Unfortunately Rails (at least by default) will not convert empty strings received from the HTTP request parameters to nil automatically. If it did, then the :if option would not be required in the validation. So if, for example, you sent a GET request with the query string "first_name=Scott&last_name=Leberknight&email=&cell_phone=&home_phone=", Rails would assign empty strings to the email, cell_phone, and home_phone attributes. Or perhaps I should say it simply takes the request parameters as the browser sends them and assigns those values as-is. This behavior is the same for POST requests when you don't fill out a value for things like text fields, text areas, select lists, etc. in forms. The crappy thing about empty strings is that they'll be stored in your database as empty strings rather than as null values. That's something to research later, as I want null values instead of empty strings in my database.

Test failings, Part II

Posted on October 27, 2005 by Scott Leberknight

A co-worker, when I showed him the ExceptionAssertionTemplate and ExceptionAssertionCallback, took one look and said, "Why don't you just catch the exception you are expecting, have your test throw Exception, and be done with it?" For example,

public void testFindNonExistentVisit() throws Exception {
    try {
        dao.findVisit(-9999L);
    }
    catch (HibernateObjectRetrievalFailureException ex) {}
}

Um, yeah, why wasn't I just doing that? It is simpler and less code. Thanks Andrew!

Ensuring Your Tests fail() When They Should

Posted on October 24, 2005 by Scott Leberknight

A few weeks ago I posted about how I use the fail() method immediately after an Exception should have been thrown, in order to ensure that invalid input or use of the method results in the exception that is expected. Since manually remembering to call fail() in every one of these situations is simply not going to work (at least not for me), I wrote a couple of quick and dirty utility classes to help out. They are based on a Spring-like Template/Callback approach, and embed the code I don't want to forget in the template class so that it is impossible to forget to call fail().

The template class is currently called ExceptionAssertionTemplate and contains two overloaded execute() methods. Here it is, with all my verbose JavaDocs and such.

package com.sleberknight.tests;

import junit.framework.Assert;

/**
 * Template class for use by unit tests that want to perform actions
 * that cause expected exceptions and then  assert the exception
 * was thrown and is of the correct type.
 *
 * @author Scott Leberknight
 */
public class ExceptionAssertionTemplate {

    /**
     * Executes the specified ExceptionAssertionCallback and
     * fails the test if no exception was actually thrown. Does not check
     * the type of exception that was thrown. Any thrown exception is
     * considered a successful test.
     *
     * @param callback
     */
    public void execute(ExceptionAssertionCallback callback) {
        try {
            callback.doActionCausingException();
            Assert.fail("Should have thrown an exception ");
        } catch (Exception e) {
        }
    }

    /**
     * Executes the specified ExceptionAssertionCallback
     * and either fails the test if no exception was actually thrown or
     * asserts that the exception thrown is of the expected type.
     *
     * @param callback
     * @param expectedExceptionType
     */
    public void execute(ExceptionAssertionCallback callback, Class expectedExceptionType) {
        try {
            callback.doActionCausingException();
            Assert.fail("Should have thrown a " + expectedExceptionType.getName());
        } catch (Exception e) {
            Assert.assertEquals(expectedExceptionType, e.getClass());
        }
    }

}

So to use it you need an ExceptionAssertionCallback, which is where you put the code that should throw an exception. The no-arg execute() method executes the callback and fails the test if no exception was thrown. The execute() method that takes the expectedExceptionType argument additionally asserts that the thrown exception is of the expected type. So here is the callback class implementation.

package com.sleberknight.tests;

/**
 * Callback class that unit tests can use to perform actions that should throw exceptions.
 *
 * @author Scott Leberknight
 */
public abstract class ExceptionAssertionCallback {

    /**
     * Implement this method to throw the expected exception.
     *
     * @throws Exception
     */
    public abstract void doActionCausingException() throws Exception;

}

That's it. The ExceptionAssertionCallback is an abstract class with one method, doActionCausingException(), which must be implemented. So how do you use these classes? Here is a simple example, in which I am testing a DAO's findVisit() method with a non-existent record in the database. This method should throw a HibernateObjectRetrievalFailureException when called with an invalid identifier. So basically you create the template, and then call the execute() method, passing in the callback, defined here as an anonymous inner class. Note that dao is defined as a private instance variable in the test case, and is thus available to use in the callback. The template then executes the callback's doActionCausingException() method and verifies that it throws an exception of type HibernateObjectRetrievalFailureException. Assuming it does, the test passes. Otherwise, the test fails.

public void testFindNonExistentVisit() {
    ExceptionAssertionTemplate template = new ExceptionAssertionTemplate();
    template.execute(new ExceptionAssertionCallback() {
        public void doActionCausingException() throws Exception {
            dao.findVisit(-9999L);
        }
    }, HibernateObjectRetrievalFailureException.class);
}

You could of course create a base test case that creates the ExceptionAssertionTemplate (which I actually did); then there is even less code in your tests, since you can just use the template that was created for you. But the main point here is that by using this approach, you can never again forget to fail() the test.
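
For instance, a base test case might look something like this minimal sketch (the class name and the assertThrows() helper are hypothetical; your real base class might instead extend one of Spring's test classes):

package com.sleberknight.tests;

import junit.framework.TestCase;

/**
 * Hypothetical base test case that creates the template once so that
 * subclasses never need to instantiate it themselves.
 */
public abstract class BaseExceptionTestCase extends TestCase {

    protected ExceptionAssertionTemplate exceptionTemplate = new ExceptionAssertionTemplate();

    /** Convenience method so subclasses can make the assertion in a single call. */
    protected void assertThrows(ExceptionAssertionCallback callback, Class expectedExceptionType) {
        exceptionTemplate.execute(callback, expectedExceptionType);
    }
}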

Using Spring with Hibernate3's Lazy-Load Everything Approach

Posted on October 22, 2005 by Scott Leberknight

I am writing this so that I never, ever, ever forget how the new default lazy-load behavior in Hibernate3 can completely mess with your head for hours and hours of debugging when you are expecting Spring's HibernateTemplate's load method to throw a HibernateObjectRetrievalFailureException in a unit test, only to find the reason is quite simple but subtle! Oh, and of course if anyone else is reading, or more importantly, Googling, then hopefully this will help you too.

I have a DAO implementation that extends Spring's HibernateDaoSupport class and has a finder method that looks up an entity given its unique identifier, which is of type Long. That finder method basically does this:

public Entity findEntity(Long id) {
    return (Entity)getHibernateTemplate().load(Entity.class, id);
}

Note specifically that I am using the load() method which is defined to "Return the persistent instance of the given entity class with the given identifier, throwing an exception if not found." In particular, this method should catch any HibernateException subclass thrown by the internal HibernateCallback and convert it into the appropriate class in the Spring DataAccessException hierarchy, e.g. in this case it should be converted to a HibernateObjectRetrievalFailureException if I pass in an identifier for which there is no persistent entity. So far, so good.

Next I have a simple unit test that calls my finder method using an invalid identifier, and then it asserts that the appropriate exception was thrown. Basically it looks something like this:

public void testFindEntityUsingInvalidIdentifier() {
    final Class expectedExceptionType = HibernateObjectRetrievalFailureException.class;
    try {
        dao.findEntity(-9999L);
        fail("Should have thrown an " + expectedExceptionType.getName());
    } catch (Exception e) {
        assertEquals(expectedExceptionType, e.getClass());
    }
}

So I ran this test thinking it was a no-brainer. It failed. In fact, it failed with the message "junit.framework.AssertionFailedError: Should have thrown an org.springframework.orm.hibernate3.HibernateObjectRetrievalFailureException." Um, what? I was about 100% sure there was no such row in the database, since I was using Spring's AbstractTransactionalDataSourceSpringContextTests test class which ensures transactions are rolled back after each test, and because I knew I didn't put any data in the database with that identifier. So that meant the findEntity method did not throw any exception. I started adding the old fallback System.out.println() statements all over the place to see what exactly was going on. The finder method actually was returning a non-null object, but when I tried to call any method on it, like toString(), it then threw a raw Hibernate ObjectNotFoundException, which as of Hibernate3 is unchecked rather than checked. Hmmm. I performed some initial debugging (using a real debugger, no less) and found that proxy objects were being returned, then looked in the stack traces and saw some CGLIB stuff in there, meaning the entity object was in fact proxied.

Since the object was actually a proxy, that explained why no exception was thrown by HibernateTemplate.load() - since no methods had been called on the proxy yet, Hibernate3 was happily returning a proxy object for an identifier which does not really exist. The second you call any method on that proxy, you get the Hibernate ObjectNotFoundException. I did a bunch of research and finally found that the Hibernate3 reference guide, in Section 11.3 (Loading an object), mentions that "load() will throw an unrecoverable exception if there is no matching database row. If the class is mapped with a proxy, load() just returns an uninitialized proxy and does not actually hit the database until you invoke a method of the proxy." I then did more research and arrived at a resource I should probably have looked at much sooner, as this is my first time using Hibernate3 (I've been using Hibernate2.x for a while now). That resource is the Hibernate3 migration guide, which mentions that Hibernate3 now defaults to lazy="true" for everything, both classes and collections. I even remember reading this a while back, but it didn't occur to me that the load() method in the Hibernate Session would behave like that.

In any case, with lazy loading set to "true" because of the changed default value in Hibernate3, the Spring HibernateTemplate.load() method is basically useless for lazily initialized objects, since no Hibernate exception is thrown at load time and there is therefore nothing to convert into one of the DataAccessException subclasses. There are a few solutions or workarounds or hacks or whatever you want to call them. First, you could set default-lazy="false" in your root hibernate-mapping element, which will switch the behavior back to what it was in Hibernate2.x. That is what I did to verify that my test would work properly when there was no proxying involved. Second, you can use the HibernateTemplate.get() method instead of load() (which delegates to the Hibernate Session.get() method) because get() "hits the database immediately and returns null if there is no matching row" and then, if the result is null, throw the appropriate exception yourself. Third, you can leave Hibernate3's default behavior alone and override it for specific classes and/or collections. For collections I almost always use lazy loading, but I had never done it for classes. So you could set lazy="false" for specific mapped classes in your Hibernate mappings.
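
To illustrate the second workaround, here is a minimal sketch of the finder rewritten to use get() and throw the exception itself when nothing is found. Entity is a placeholder for your mapped class, and the exact constructor of the wrapped Hibernate ObjectNotFoundException is an assumption worth verifying against your particular Spring and Hibernate versions.

package com.sleberknight.dao;  // hypothetical package

import org.hibernate.ObjectNotFoundException;
import org.springframework.orm.hibernate3.HibernateObjectRetrievalFailureException;
import org.springframework.orm.hibernate3.support.HibernateDaoSupport;

public class EntityDao extends HibernateDaoSupport {

    public Entity findEntity(Long id) {
        // get() hits the database immediately and returns null if there is no matching row
        Entity entity = (Entity) getHibernateTemplate().get(Entity.class, id);
        if (entity == null) {
            // translate the null result into the same exception the test expects load() to produce
            throw new HibernateObjectRetrievalFailureException(
                    new ObjectNotFoundException(id, Entity.class.getName()));
        }
        return entity;
    }
}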

I fully understand why you want to lazily initialize your collections, but I am not sure why I would want the default behavior for normal classes to be lazy loading. When I perform a find for a specific object, I pretty much know I want that object since I am going to do something with it, like display it in a web page or something. I suppose one use case for lazily initializing classes would be if you perform a query and get back a list of objects, and you don't want to initialize all of them until the user actually needs them. But even in that case I generally am going to display the search results and I will be using those objects. So I am still somewhat at a loss as to when I would realistically want this behavior; even for search results I can retrieve only the results I need by using Hibernate's ability to set the start row and number of rows returned for a query.

So, the main point of all this is that the default lazy loading of everything in Hibernate3 may cause some unexpected problems, and if you start seeing CGLIB in stack traces and weird things are happening, chances are you've got an object proxy rather than the actual object you thought you were getting, and the proxy is causing the strange behavior. Whew!

Don't Forget To fail() Your Tests

Posted on October 05, 2005 by Scott Leberknight

Often when writing JUnit tests, I use the fail() method immediately after an Exception should have been thrown, in order to ensure that invalid input or use of the method results in the exception you expect. However, earlier this week I was updating a test case and noticed one test method that was expecting an exception to occur did not have a call to fail(). So this is the code that should have been in the test:

public void testSomethingFailsThatYouExpectToFail() {
    try {
        SomeObject someObject = new SomeObject();
        someObject.someMethodThatShouldFail("invalid input");
        fail("Should have thrown IllegalArgumentException");  // need this line!
    }
    catch (Exception ex) {
        assertEquals(IllegalArgumentException.class, ex.getClass());
    }
}

But the code in the test did not have the fail() method call. Without fail(), this test works as expected only when the method actually throws the exception and the assertion validates the type of exception thrown. If, on the other hand, the exception is not thrown then the test passes but is not actually a valid test!

So as a result I did a quick search through all the tests and found a significant number where I or one of the other developers had forgotten to call fail() immediately following the method call where we expect an exception to be thrown, which I then had to fix by adding the call to fail() back in. After doing that I found there were actually three tests that then failed, so even though these tests covered exceptional cases I still had to go fix the offending code. Maybe I can write a Checkstyle or PMD rule that would detect this type of error in unit tests, because it is so easy to forget to call fail(), have the test pass, and move on to the next task.

Standard UML Is Preferable...NOT!

Posted on August 04, 2005 by Scott Leberknight

One of the projects I am involved with at work recently had its design review with major stakeholders - the typical dog and pony show nonsense where people who have no business making technical decisions try to impact your project - and one of the outcomes was that an "architect" was concerned that some diagrams were not "standard UML." This person basically stated in an email after the review that we should "prefer" standard UML diagrams, apparently because standard UML is, I suppose, standard. There was no other reason given. He then attached a sample formal UML state diagram from some other project. I looked at it and could not even understand it due to the level of complexity.

I sent out an email to my team telling them to ignore what that "architect" said and to continue producing documentation that fits the intended target audience. In this case that meant not only technical personnel but also functional and business domain experts who do not know UML, have probably never heard of it, and do not care a lick about whether a diagram is UML. The only thing they care about is whether they can understand it in the context of their domain expertise. Period. The diagrams in question were simple screen flow diagrams done using PowerPoint. They consisted of squares with text inside (the screen name) and arrows going between the squares to show the navigation between screens. Nontechnical people can easily glance at this type of diagram and grasp it. Now try shoving a formal UML model in front of them instead of a simple diagram like that. I am guessing most will immediately tune out as you start to explain how the little solid circle is the starting point and the circle with the solid circle inside is the end point. Or is the solid circle the end point? Well, before we can build our system we better make sure we get those correct, or else the code monkeys might just code everything in reverse, right?

The point is this: just as you should prepare a briefing according to your intended audience, you should do the same for software modeling. And you should only model as much as you need, using whatever notation is comfortable for your team, to understand your problem and begin implementation. No more and no less. If you need to show a flow diagram to a business domain expert, please stop trying to educate them on formal UML notations and just use simple notations so they can concentrate on whether your model is accurate, not on trying to remember what symbol means what. And even between developers, how many times do people get bent around the axle when whiteboarding some classes and their relationships? Does anyone really remember which diamond (solid or empty) is aggregation and which is composition? Does it even matter, as long as your colleague understands what you mean? I vote "No." So, resist the formal modeling and resultant analysis paralysis and vote "No" on the formal UML referendum by your local PowerPoint Architect.

Traceability from Requirements to Code

Posted on August 04, 2005 by Scott Leberknight

My question is this: who cares and why should we care if we can trace from a functional use case down to one or more classes, or even methods? Recently I have been discussing this issue, as I might be required to worry about this type of traceability on an upcoming project. My main problem with this whole idea stems from the fact that well-designed and abstracted OO code typically will not map in any sort of one-to-one fashion from a use case to code. This is mainly because use cases are a functional breakdown of system requirements, while the OO (and nowadays AOP) code that implements those requirements does not follow a structured, top-down breakdown. Thus there is not a one-to-one mapping of requirements to code. So that being the case, in my experience at least, what good does it do you to be able to show that a specific use case links to a whole mess of classes? Seriously, if someone can show that use case "Edit FooBlah" links to classes as well as other artifacts such as XML files, JSPs, properties files, etc. how does that truly help them? What is the purpose other than simply being able to do the linkage? The "Edit FooBlah" use case in, say, a typical Struts-Spring-Hibernate web application crosses all tiers and touches a whole mess of artifacts ranging from Java classes to HTML, JavaScript, CSS, JSPs, Struts configuration files, resource bundles, Spring configuration files, Hibernate mapping files, utility classes, etc. etc. etc. If I've properly architected the application into well-defined layers, however, I already know exactly what artifacts make up a particular use case since almost all the typical use cases will be built in the same, consistent way.

I can clearly see the purpose for linking bug fixes, issues, etc. to the code that was altered to perform a fix. For example, if a bug is found, someone enters it into a tool like Atlassian JIRA. It then is reviewed, worked on, and code is checked in against that bug. The affected code tends to be concentrated (hopefully) into a small number of classes and it is easy to see how a particular issue was resolved. JIRA and similar tools are also useful to document new feature requests, improvements, etc. once a system is released into at least alpha or beta. In addition, tracking issues like this allows release notes to be easily assembled from reports run for a particular release number and to inform users about fixes, changes, and other things that might directly affect them. People can also search and find whether there is a workaround to a specific issue or whether someone else already reported the bug they found. There are lots of other good reasons why we use issue tracking systems like this.

But as for linking requirements, in whatever form they might be, directly to code, I do not see the tangible benefits. What if your requirements are on index cards in an XP-style development shop? Or if they are in textual use cases in a wiki? Is anyone really going to gain tremendous benefits from being able to tell you that a use case links to the following 27 different files? And the management of this information becomes much more complicated over time, as requirements change throughout the initial development as well as product maintenance. If a feature a product contains is fundamentally changed or improved, do I care that I can go back and trace the old requirements to code that has been removed from the system long ago? To me the most important thing is to have a set of automated tests that define whether a system does the tasks it is intended to perform. This can be unit tests, functional test suites, user acceptance tests, performance and load tests, and whatever other types of tests you want. But in the end they determine whether the product meets its users' needs. Period. Users do not care to look at some gigantic stack of paper containing system requirements. They simply want a system they can easily use to perform their job. That's it. It's that simple. But for whatever reason many people in our young and immature industry still continue to believe that documentation such as requirements traceability, test plans, system requirements specifications, system design documents, system security plans, data migration plans, and umpteen other documents are the most important facet in developing software.

At JavaOne in 2004 I saw a demo of some products that claimed to easily and seamlessly (of course) provide linkage between requirements and code. So here's how it worked. You go into the requirements tool, select the requirement your class or (as the demo showed) method satisfies, and then drag and drop the requirement directly into your source code. The tool suite, which of course required you to have all the various parts of the suite, then created a code comment "linked" to the requirement and apparently was then able to track exactly what code linked to each requirement. Notwithstanding how brittle this solution was - just go into the source and delete the comment - it was also invasive and ugly. Assuming you had a sizable system, soon your developers would be spending all their time linking code to requirements, and trying to figure out which requirements some little utility class that everyone is using (e.g. StringUtils) maps to, which of course is an exercise in futility.

Limiting the Number of Query Results With ORDER BY Clauses

Posted on July 14, 2005 by Scott Leberknight

Many times you need to limit the number of results returned by a query, for example if you have the potential for a user-created query to return a very large number of results. Since this pretty much always happens in applications, you have to deal with it somehow. But most books and articles don't really talk about the gory details, since many times the mechanisms you must use are database vendor-specific. The two databases I've worked most with are MySql and Oracle. Limiting results in MySql is ridiculously simple, and I really wish their syntax were part of standard SQL. For example, suppose you have a person table and you want to search on first and last name, and you need to order by last name then first name. In addition, you need to limit the results to 25 rows. In MySql you can issue the following query:

select * from person where last_name = 'Smith' order by last_name, first_name limit 25

The key in the above is the limit clause which MySql provides. MySql applies the limit after it has retrieved the results and applied the order by clause as well. Trivial. Now let's consider how to do this in Oracle.

Oracle provides a pseudo-column called rownum which seems like a promising way to specify limits on result sets. Suppose we issue the following query in Oracle:

select * from person where last_name = 'Smith' and rownum <= 25 order by last_name, first_name

When you run that query, you get much different results than you expect. This is because Oracle performs the query and applies the rownum to each row in the results before applying the order by clause. So the question is: how in the world do you use Oracle's rownum in concert with an order by clause and get the correct results? I cheated. I have been using Hibernate for a while on an Oracle database and know it is able to handle this exact situation - that is, limiting the number of results and applying an order by clause to a query. So I set the hibernate.show_sql property to true and looked at what it generated for a query. It turns out to be very simple and makes use of a subselect:

select * from (
  select * from person where last_name = 'Smith' order by last_name, first_name
)
where rownum <= 25

So Hibernate is simply wrapping the original query in another query which limits the results. Since the subselect is executed first, the rownum pseudo-column of the outer select is applied to the already-ordered rows and works properly. Pretty slick the way Hibernate does that, but it would be much easier if Oracle had a syntax more like the elegant MySql limit clause.

Unfortunately, limiting the number of results for a query is only part of the equation, since you normally also need to provide some type of paging functionality. There are tools, such as the very popular DisplayTag custom tag written by Fabrizio Giustina, to help with the display of data in a nice table that provides paging and sorting. Unfortunately, most of them want to have all the results completely in memory for the paging to work properly. I hate doing this since you are just taking up memory by putting data in the user session that the user will probably never even access; you are simply wasting memory by storing the complete result set. I have always tried to implement paging so that you go back to the database for each page of information and not store anything in the session. You think Google stores a couple million results in memory for each search? Doubt it.

So how can you accomplish paging in the database? Again you normally have to use vendor-specific syntax. For MySql you simply use the limit clause, but you specify two parameters: the start row and the number of rows to retrieve. For example, suppose we want the first page from the person search assuming a page size of 25:

select * from person where last_name = 'Smith' order by last_name, first_name limit 0, 25

Simple. Suppose you want the second and then third pages. The queries are:

select * from person where last_name = 'Smith' order by last_name, first_name limit 25, 25
select * from person where last_name = 'Smith' order by last_name, first_name limit 50, 25

Again, in MySql this is really simple to accomplish. In Oracle it is not as simple, since you now have to get additional operators involved, specifically the row_number() analytic function and its over() clause. More on that later...for now I suggest you migrate all Oracle databases to MySql. Wouldn't that be nice? :-)

One last thing. The really nice thing about Hibernate is that it understands how to limit result sets and perform paging operations in all the databases it supports. Your code does not need to care. You simply do something like:

Criteria criteria = session.createCriteria(Person.class);
criteria.addOrder(Order.asc("lastName"));
criteria.addOrder(Order.asc("firstName"));
criteria.setFirstResult(50);
criteria.setMaxResults(25);
List results = criteria.list();

Then, Hibernate will perform the appropriate vendor-specific query based on the dialect you configure Hibernate with. Thus, using a tool like Hibernate allows you to write queries at a higher level and completely abstract the actual query that is performed on your specific database at runtime.
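
For example, here is a minimal sketch of setting the dialect programmatically (the dialect class and property value are just illustrative; most applications configure the dialect in hibernate.cfg.xml or via Spring's LocalSessionFactoryBean instead):

// Reads hibernate.cfg.xml and the mapping files, then overrides the dialect;
// the dialect tells Hibernate which vendor-specific SQL to generate (e.g. Oracle
// rownum wrapping vs. the MySql limit clause) for setFirstResult/setMaxResults.
Configuration cfg = new Configuration()
        .configure()
        .setProperty("hibernate.dialect", "org.hibernate.dialect.Oracle9Dialect");
SessionFactory sessionFactory = cfg.buildSessionFactory();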