Feb 29 2012

A Semantic Versioning convention of Corporate or Parent POMs

Category: Refactoring,Small businesses,Software engineeringGabriel Belingueres @ 3:47 pm

Maven allows us to inherit from parent POMs to allow reuse of project or corporate standards, like:

  • Forcing a given minimum maven version.
  • Using a specific library version.
  • Defining an exact plugin version (for maximum build reproducibility).

If the IT organization share some basic configurations across all their project, it usually represent it as a corporate POM, that is a top-level pom.xml file that is used as the parent of all the projects in their project portfolio.

For some reason, commonly this corporate POMs are versioned with a single integer number. I don’t know if it is either just a convention, or it had some origins on some specific constraint on older Maven versions which are no longer present, or if there is some logical rationale behind it.

On the other hand, I like the Semantic Versioning reasoning, and I try to comply with it in my own projects. So the simple question is: Why would we not follow the same versioning technique for corporate POMs too?

We can follow the Semantic Versioning for corporate POMs, using a Major.Minor.Patch versioning format, as any of the applications and libraries we daily use.

The key to define a semantic versioning, is that POMs are not just configuration files: They define a public API, just like any library have classes with public methods (which comprises its public API).

The difference in the interpretation is in the point of view of the person using the API: In the case of the class library, it is a developer trying to use the API to build a bigger functionality; in the case of the build tool API (pom.xml file), it’s a developer trying to use the API to generate that previously written piece of functionality. And as happens with a class library, the magnitude of the change from version to version also can be measured.

Which part of the X.Y.Z version format to change?

Let’s see some basic rules for the case when you need to change the corporate POM, being X the Major part or the version, Y the Minor part, and Z the Patch part, as defined in the Semantic Versioning page:

Patch

7. Patch version Z (x.y.Z | x > 0) MUST be incremented if only backwards compatible bug fixes are introduced. A bug fix is defined as an internal change that fixes incorrect behavior.

When applied to a corporate POM, we can interpreted it as:

Patch version Z MUST be incremented if the generated artifacts built are functionally exact the same as the generated by the previous version.

This means that a rebuild with the new POM will generate the artifacts exactly as the previous one, so the result will be a functionally equivalent version of it.

For example, if you are building a web application, where the final artifact is a .WAR file, then adding a new <developer> to the <developers> section of the corporate/parent POM will produce exactly the same web application with its original functionality untouched (except maybe some resource filter that add the new developer to some file).

Minor

8. Minor version Y (x.Y.z | x > 0) MUST be incremented if new, backwards compatible functionality is introduced to the public API. It MUST be incremented if any public API functionality is marked as deprecated. It MAY be incremented if substantial new functionality or improvements are introduced within the private code. It MAY include patch level changes. Patch version MUST be reset to 0 when minor version is incremented.

We can interpret it as:

Minor version Y MUST be incremented if the POM is modified in a way that it generates artifacts that are functionally equivalent, but essentially it produce a new different artifact. It MAY be incremented if substantial new functionality or improvements are introduced within the way the build is done, and may affect the final artifact. In both cases the build MUST work. Patch version MUST be reset to 0 when minor version is incremented.

For example, if your project use the Apache commons-lang library, defined in the section of its parent POM, and you want to update it from version 2.4 to 2.6, then you MUST change the minor version, because the generated artifact (.war file) will be different (a dependency has changed), though probably at runtime the application works the same.

It MAY be incremented if new plugin versions (commonly defined in a <pluginManagement> section) want to be used. It can happen that the plugin execution produces a different result, so is probable that the final artifact is modified too.

One thing is for sure: If you are given a new corporate/parent POM with a higher Minor number, then it is very likely the build will work out of the box, but you MUST RE-TEST the generated artifact.

Major

9. Major version X (X.y.z | X > 0) MUST be incremented if any backwards incompatible changes are introduced to the public API. It MAY include minor and patch level changes. Patch and minor version MUST be reset to 0 when major version is incremented.

Interpretation for a POM file:

Major version X MUST be incremented if the build (probably) will break, or the generated artifacts will break. It MAY include minor and patch level changes. Patch and minor version MUST be reset to 0 when major version is incremented.

For example, you may introduce the use of a plugin which requires Maven 3+ to work, then breaking the current build (possibly done with Maven 2.2+). Another example is the update from an old version of a library to a new one, where the public API changed, and this will obviously break the build.

Sometimes it will not break the build but the generated artifacts: consider the case where you upgrade your application presentation layer code from JSTL 1.0 to JSTL 1.1 in a parent POM, the build will not break but the web application will not work properly since the JSP pages will declare <%@ taglib uri=”http://java.sun.com/jstl/core” prefix=”c” %> instead of <%@ taglib uri=”http://java.sun.com/jsp/jstl/core” prefix=”c” %>.

Of course, when in doubt, just go back to the old convention: change the Major version number and you are ready to go.

Conclusion

There is no need to use the current convention of “a single major version number” in your corporate/parent POMs. You just need the discipline to use the X.Y.Z format right, which have the added benefit of implicitly show us the magnitude of the required change to upgrade to a newer parent version: this information is not conveyed in the current convention.

In teams where their members are not very experimented with Maven, there may be a big initial number of parent POMs before stabilizing, so the X.Y.Z format may help greatly (specially if the organization has high people turnover).

As a rule of thumb consider this guidelines, regarding changes in your parent POMs:

  • A Patch version change: A small change is required to upgrade. Artifacts should work out of the box.
  • A Minor version change: A moderate size change is required to upgrade. Artifacts need to be re tested in their original runtime environment.
  • A Major version change: A bigger effort is required to upgrade. Build probably break, and code may need to be upgraded too.


Nov 24 2011

Struts2-JUnit 0.2 Released!

Category: Java,Software engineeringGabriel Belingueres @ 2:24 pm

Wow!

It’s been a while since my last post, and luckily I managed to have some free time to work on some interesting projects. Finally I can write about this handy, little utility library I developed for testing our projects, hosted on Google Code under the name struts2-junit.

Based on JUnit 4, it’s very easy to use.

Please, give it a try and tell us what do you think.

Tags: , , ,


Oct 16 2010

Things to consider before doing big refactorings in small businesses

Category: Refactoring,Small businesses,Software engineeringGabriel Belingueres @ 5:32 pm

When someone in the team advice to carry out a big refactoring, seems like a revelation made present into the room. There are the ones that look really surprised and find the proposition somewhat bold and there are others that may think “finally! someone brings this subject to the table!”

If you are lucky, the refactoring was detected on time and everybody is happy that it was discovered. Sometimes it come too late to the discussion and no one want to dig any further into this problem, and sometimes the refactoring is just a different way of doing things but no a necessarily better.

The question is which indicators we can watch to make a sound decision on putting the whole team to work on a refactoring or not. Here are some issues to have into account when that moment finally arrive.

Here I present some questions you can ask yourself and to your team to help you, which basically are based on pure common sense. You will find it ordered looked from the most technical view-point to the most business related:

Make sure the refactoring make sense

Your first task should be to dig into the technical issues surrounding the refactoring:

It is a necessary change to do?

Think (and ask to the team if they agree) if it would be better to refactor or would it better off leaving the code base as it is now. Having some smart people to throw an idea on the table doesn’t mean everybody leave the room running to implement it. Sometimes the refactor comes into play too late in the schedule and you may want to release as it is now, and schedule the refactoring on a future iteration.

Decide if the refactoring it is indeed an improvement that would unleash some new functional or non-functional requirement? or it is just a different way of doing the same? For example a big and important relational table where one of their attributes is a blob. You may want to split that table in two for better performance and improve memory consumption of server resources, but are really those non-functional requirements what need to be optimized? or do you want to do it because the software “would just work better that way”?

Will the refactoring leave the project in a pivot position?

A pivot position is a state of your project in which it is situated on a crossroads where all the paths leads you to a success scenario. So you must always prefer the project to be in a pivotal position for maximizing maneuverability, because you will never reach a dead-end, or be forced to follow a specific path to success. You must always have your options open in the presence of problems.

To illustrate the concept, some refactorings are needed because of unfilled holes into the functional specifications. To present an example in a much minor scale, suppose there is an algorithm to perform some financial calculation that you are not sure if it apply or not to a use case, then you can model it using an Strategy pattern. Here the pivotal position is having implemented the Strategy, so change will be easier when the unfilled specification hole gets actually filled. However, just proposing to change that every algorithm in your project has to be modeled as a Strategy pattern is an overkill: it is just a speculation to think that every algorithm is subject to change, even plain if-then-else logic, so a big refactoring like this is not appealing at all.

Remember the 3Ps of software engineering

The 3Ps of software engineering are:

(People, Process, Product)

Sometimes called the 4Ps if you add Project to the tern. It basically states that Project depends on how well you can control three variables: they are done by People which uses a Process to develop a Product. So for the project to be successful, you must keep healthy the people, the process used by people, and the product you are developing.

Suppose that the three variables are in harmony when no changes are made to the project, I like to think the 3Ps as a perfectly balanced 3 plate scales hanging from a triangle. When a new requirement or functionality arrives to implement, a “disturbance in the force” is generated by the environment that the team must “sense” and start working to balance it again.
Considering that you realize you need to refactor a big piece of the code base, and this will impact the whole team, you must plan wisely to avoid missing the target. First of all start with the most important, unstable and unpredictable of the three variables: the People.

People

  • Is there consensus that the refactoring must be carried over?
  • Will the team be more confident on making non trivial changes to the code base after that?
  • Will they deliver more quickly?
  • Will they deliver with fewer bugs?
  • Will people have longer working days?

Product

  • Will the product improve in some aspect visible to the users (like performance or functionality) or on any other way (like extensibility or maintainability)?
  • Will the actual product or service degrade its quality of service until the refactoring is completed? For how long?
  • Will the refactoring be completed by stages, progressively? or will the product or service be modified in one shot?

Process

  • Will the process allow the team to deviate from the usual plan to perform the refactoring?
  • Is the structure of the team or business adequate to quickly communicate the needs and changes between different roles like developers, DBAs, sysadmins, testers, users?

Supply and demand

Remember that in the market economy, the fundamental truth are given by the laws of Supply and Demand, and for any business they are as inescapable as the laws of gravity in nature. Specially if you work on a small business, chances are that you must think from time to time in the business case of proposed changes. You must asses how the refactoring will impact your business in the medium and long-term future:

  • Will the refactoring improve your capacity to supply new functionality/services/products?
  • Will the refactoring impact positively on the demand of your services/products?

Stay aligned with management in this. Ask yourself “What would I do if it were my only decision? What would I do if I’d have to bet $100,000 of my own pockets on this change? You would be surprised what you would be your answers when you carry out yourself that exercise!

Conclusions

When faced to do big refactorings many things must be evaluated before allowing the team to reorganise the code base of the project. IMO, the best thing you can do is: search for consensus with the team to decide if the change is appealing or not, jointly plan the effort, and analyse the business case of it.

Tags: , , , , , ,


Sep 19 2010

Some confusing cases of method overloading – Part 3

Category: JavaGabriel Belingueres @ 3:28 am

Following this series of posts related to method overloading in Java, now we will see how will the compiler behave when calling an overloaded method which actual parameter is a result of the conditional ? operator.

First of all, in the case you missed the previous posts, here you have the links to the first part, and the second part.

Let’s consider the following example program:

import java.util.Random;

public class OverloadPrimitiveTest {

  public void method(int a) {
    System.out.println("method(int)");
  }

  public void method(Integer b) {
    System.out.println("method(Integer)");
  }

  public static void main(String[] args) {
    OverloadPrimitiveTest o = new OverloadPrimitiveTest();

    o.method(1); // call method(int)
    o.method(new Integer(1)); // call method(Integer)
    o.method((short) 1); // call method(int)
    o.method(null); // call method(Integer)

    Random random = new Random();
    o.method((random.nextBoolean()) ? 1 : new Integer(2)); // call method(int), ambiguous
  }

}

The first two calls are straightforward to digest, they call the intuitively correct overloaded method since their actual parameters in the method call expression are those specific ones.

The third call seems strange at first sight but it is OK, since Java can safely widen a primitive short to an int (no overflow can happen), so this resolve to the overloaded method which takes a primitive int parameter. The fourth call is very intuitive also, because a null value can be assigned to reference types, not a primitive type, so using a null as the actual parameter resolves intuitively to the method that takes an Integer object.

The fifth call is what I consider ambiguous.

Why is it ambiguous?

The last call does a couple of things with its actual parameter. First of all, the actual parameter is taken after the evaluation of the conditional ? operator, and the type returned by the conditional operator is the primitive int, because the specification says that it will unbox its second and third operand as necessary (JLS 15.25.)

However, comparing the fifth invocation of the overloaded method with the first and second ones, it is confusing at first sight which one of the methods will be selected at compile-time.

An example using references

The same confusing case may happen with references to objects too, since it is not necessary the presence of a boxing or unboxing operation for the anomalous case to appear. Just like the primitive int type can be widened to long or float, a reference type can be widened to some superclass in the class herarchy, or some implemented interface. Here is an example:

import java.applet.Applet;
import java.awt.Panel;
import java.util.Random;

import javax.swing.JApplet;

public class OverloadPrimitiveTest2 {

 public void method(Panel p) {
   System.out.println("Panel");
 }

 public void method(JApplet a) {
   System.out.println("JApplet");
 }

 public static void main(String[] args) {
   OverloadPrimitiveTest2 o = new OverloadPrimitiveTest2();

   o.method(new Panel()); // prints Panel
   o.method(new JApplet()); // prints JApplet

   o.method((new Random().nextBoolean()) ? new Applet() : new JApplet()); // prints Panel
 }

}

Here, JApplet is a subclass of Applet, which in turn is a subclass of Panel. In the third method invocation the conditional ? operator decides that the type of the expression is Applet, since JApplet is widened to Applet without any problem. In turn, the best fit for the method invocation is calling method(Panel) since an expression of type Applet can be widened to type Panel. Again it is confusing the intention of the programmer since both methods may have been selected (if not for the conditional ? operator.)

A note on code checking tools

The usual code analysis tools, like FindBugs, PMD, and (now free) Google’s CodePro AnalytiX are happy to detect and mark the use of the conditional ? operator. I find that informative but more often than not, it marks a false positive, that is to say that not necessarily I believe that using the conditional ? operator is wrong: it has its use to make the program somewhat easier to read. However, I do think that the conditional ? operator is too smart in trying to reduce the two evaluated expressions into a single type expression: sometimes it widens too much and useful type information is lost in the oblivion.

As a side note, CodePro AnalytiX has a rule indicating that there is an overloaded method and recommends that either rename the method (to avoid the overloading) or change the number of parameters. I actually didn’t find this recommendation very useful, as the intention of an overloaded method is actually that by design it should have the same name (because they perform the same process for a different type) and it would by unnatural to add an extra parameter just for disambiguate the method invocation at compile-time.

Conclusions

The conditional ? operator is tricky, because it is too smart and infer things that most of the time we will not be aware of. In addition, combine method overloading with this operator and we will certainly have some fun time hunting bugs. However, I don’t personally think nor that the conditional ? operator should be avoided (as suggested by code analysis tools) nor method overloading, though I do believe you must not mix them.

You can distill a rule from this. Note that in the conditional ? operator, the types of the second and third operands are different and when considered in isolation, each one of them would make the compiler choose a different overloaded method to invoke. So a simple rule of thumb to remember would be:

Don’t use the conditional ? operator as an actual parameter of an overloaded method if the second and third operands have different types.

I wish some day I could see a rule implemented on those code analysis tools to detect this tricky cases.

Tags: ,


Sep 14 2010

Some confusing cases of method overloading – Part 2

Category: JavaGabriel Belingueres @ 11:30 pm

Some days ago I posted about an example program which called an overloaded method where its parameters are final variables. This time I’ll call an overloaded constructor where its actual parameter is a null literal. As per the Java Language specification, overloaded constructor resolution is identical to overloaded method resolution, so for methods these same concepts apply.

Consider this example, extracted from the (very good and entertaining) book Java Puzzlers, which is a perfectly legal Java program:

public class Confusing {

  private Confusing(Object o) {
    System.out.println("Object");
  }

  private Confusing(double[] dArray) {
    System.out.println("double array");
  }

  public static void main(String[] args) {
    new Confusing(null); // prints double array
  }

}

Here, the class Confusing provides an overloaded constructor: one of them take an Object as a parameter and the other an array. Now, as the Java Language Specification says in its Chapter 10: arrays are objects, are dynamically created, and can be assigned to variables of type Object. All methods of the class Object can be invoked on an array.

So we have that double[] is a subclass of Object, and we try to create an instance of the class Confusing with its actual parameter being a null literal. Because null can be assigned to a double[] array, and to an Object instance, then both methods are applicable, however as per the Java Language Specification, the overloaded method to be called is the most specific one, and because an array is a subclass of Object, then the most specific is the Confusing(double[]) constructor, so the compiler chooses this constructor and then the program compiles without error on Java 6 update 20. Tools like FindBugs and PMD didn’t find this weird, as no warning was emitted from them.

Is it ambiguous or not?

Clearly the above program is perfectly legal…for the compiler. For me, and I believe many others average Joe programmers it is confusing. Why it is legal when at first glance it seems it is ambiguous? My rule of thumb is that code must be easy to read, because code is read more times than written. Also, for productivity purposes the compiler doesn’t count as somebody that read code, so why write perfectly well-formed code that is difficult to grasp at first sight?

The code is ambiguous because a null literal could be used as actual parameter for both constructors, it is not clear which one it will choose to call. Whether or not the Java compiler actually has very deterministic and definite rules to disambiguate the call, it is irrelevant to the human (and error prone) reader, and therefore it is sound to emit a Warning message here.

How to disambiguate

Casting the null parameter to the actual type you intent to refer to is the simplest way to disambiguate the call. No question which constructor you are referring to now:

new Confusing((Object)null);
new Confusing((String[])null);

or you can define a variable with the right type:

Object x = null;
new Confusing((x);
double[] y = null;
new Confusion(y);

Conclusions

The rules of overloaded method resolution are complicated to understand, and “smart” code that depends on those rules makes the program harder to read. I would personally like to see FindBugs or PMD add some rule to detect this kind of situation. As said in the first part of this series of posts, why don’t we use the Java type system to the fullest?

Stay tuned for more interesting overloaded examples!

Tags: ,


Sep 12 2010

Some confusing cases of method overloading

Category: JavaGabriel Belingueres @ 12:38 am

Some time ago I was starting to wonder how the Java compiler actually worked, because some programs I was reading at that time compiled without errors but looked like something was out-of-place, particularly some example programs which included overloaded methods.

Method overloading is basically a feature offered by a programming language allowing to create several methods with the same name, but different parameters type or output type. Java is one of such languages.

Java defines at compile time which method will be actually called when encountered with a method invocation. The rules that are checked by the compiler to take such a decision are rather complex and can be found in the Java Language Specification for the intrigued reader. Those rules may or may not be perfect but are the ones that apply to every Java program, so proposing a change on them produces the very undesirable effect of making every Java program to potentially fail to compile. However, the compiler may be modified to emit a Warning when something looks suspicious, even if the program compiles correctly. In this post I’ll show you some of those cases:

Calling an overloaded method where the actual parameter is a final variable

class A {
}

class B extends A {
}

public class OverloadingTest {

  public void method(A a) {
    System.out.println("method(A)");
  }

  public void method(B b) {
    System.out.println("method(B)");
  }

  public static void main(String[] args) {
    OverloadingTest o = new OverloadingTest();

    A p = new B();
    o.method(p); // call method(A)

    o.method(new B()); // call method(B)

    final A p2 = new B();
    o.method(p2); // call method(A)
  }
}

In this example, the first and second calls are intuitively resolved to what the programmer expect, however the third call is open to some controversy. Since the actual method called by the program is resolved at compile time, the Java compiler checks the actual parameters passed to the method call against its static type, not its dynamic type.

This makes sense since at compile time, the compiler can not determine actually which will be the dynamic type of a variable when the program is running. In the third call, the p2′s static type is A and its dynamic type is B. However the p2 variable is marked as final, meaning it can not change its type during its lifetime. In other words, p2 is instantiated as having the B type, and its dynamic type can not change…ever.

IMO, this seems to me like a programmer error on the intent of the variable p2. It is a B, and it will never by any other type? or it is an A? Because based on that decision the compiler will call one or the other overloaded method. I believe the compiler can check on this and show a Warning. I checked this code compiling against Java 6 Update 20 and no warning, also FindBugs and PMD didn’t detect this as a code smell either.

There are some ways to disambiguate the conflicting line of code:

  • Cast the variable to the more specific type:
    final A p2 = new B();
    o.method((B)p2);
  • Remove the final modifier (meaning that its dynamic type could change):
    A p2 = new B();
    o.method((B)p2);
  • Keep the final modifier, but changing the static type to one that indeed lead the compiler to resolve to one (and only one) matching method:
    final B p2 = new B();
    o.method(p2);
    // or
    final A p2 = new A();
    o.method(p2);

Dealing with blank final declarations

Consider the following code excerpt:

final A a;
if (someCondition) {
  a = new A();
}
else {
  a = new B();
}
o.method(a);

There is no obvious way of detecting the issue here (at least not without a flow analysis) so a conservative thing to do is to raise the warning anyway, so the programmer may take out either the final keyword, or refactor the code like this:

if (someCondition) {
  final A a = new A();
  o.method(a);
}
else {
  final B b = new B();
  o.method(b);
}

This time, PMD somewhat points to a problem here, since it warns about the use of final (blank) local variables (AvoidFinalLocalVariable Rule.)

The conditional ? operator

Seems counter intuitive that if

o.method(new B());

call method(B), the following two actually resolve to method(A):

o.method((false) ? new A() : new B());
o.method((false) ? (A)null : new B());

and the following will call method(B):

o.method((false) ? null : new B());

This is so because of the resulting expression type produced by the ? operator.

As both the second and third operand of the ? operator are by all means constant expressions, IMO they should be treated just with the same semantics as if a blank final variable had been used instead:

final A a;
if (condition)
  a = expressionReturningA;
else
  a = expressionReturningB;
o.method(a);

Then, the same reasoning as before applies.

IMO, using the ? operator to resolve an expression for use as an actual parameter in an overloaded method call should be avoided if possible, it’s a code smell, and as such, tools must warn the programmer. Since the ? operator has only one type (which depends on the second and third operand), the only case when it may not emit a warning is if both operands reduce to a type that uniquely resolve to only one (and the same) method.

Conclusions

The interaction between overloaded methods and different kind of expressions in Java is tricky. On the one hand, method resolution rules are very complicated, but reasoning about expressions that represent constants or non changing values (such as final variables) should allow the programmer to think about the code in abstract terms, just as any mathematical formula would behave. Why bother to program in a strongly typed language if we can not reason about the program in an intuitive fashion? Why not make the tools to emit a warning when we write somewhat ambiguous code? Why can’t we use the Java type system to the fullest?

In the upcoming weeks I’ll post some more obscure examples.

Calling an overloaded method where the actual parameter is a final variable

Tags: ,


Sep 03 2010

Hello world!

Category: UncategorizedGabriel Belingueres @ 11:15 pm

Hello!! This is my first post. I’ll be writing about things that usually are related to the software development area.  I must state clear that this blog will have many personal opinions, and frecuently some source code to read.

By now I leave you with this:

public class Test {
  public static final void main(String[] argv) {
    System.out.println("Hello world!");
  }
}