Things to consider before doing big refactorings in small businesses

When someone in the team advice to carry out a big refactoring, seems like a revelation made present into the room. There are the ones that look really surprised and find the proposition somewhat bold and there are others that may think “finally! someone brings this subject to the table!”

If you are lucky, the refactoring was detected on time and everybody is happy that it was discovered. Sometimes it come too late to the discussion and no one want to dig any further into this problem, and sometimes the refactoring is just a different way of doing things but no a necessarily better.

The question is which indicators we can watch to make a sound decision on putting the whole team to work on a refactoring or not. Here are some issues to have into account when that moment finally arrive.

Here I present some questions you can ask yourself and to your team to help you, which basically are based on pure common sense. You will find it ordered looked from the most technical view-point to the most business related:

Make sure the refactoring make sense

Your first task should be to dig into the technical issues surrounding the refactoring:

It is a necessary change to do?

Think (and ask to the team if they agree) if it would be better to refactor or would it better off leaving the code base as it is now. Having some smart people to throw an idea on the table doesn’t mean everybody leave the room running to implement it. Sometimes the refactor comes into play too late in the schedule and you may want to release as it is now, and schedule the refactoring on a future iteration.

Decide if the refactoring it is indeed an improvement that would unleash some new functional or non-functional requirement? or it is just a different way of doing the same? For example a big and important relational table where one of their attributes is a blob. You may want to split that table in two for better performance and improve memory consumption of server resources, but are really those non-functional requirements what need to be optimized? or do you want to do it because the software “would just work better that way”?

Will the refactoring leave the project in a pivot position?

A pivot position is a state of your project in which it is situated on a crossroads where all the paths leads you to a success scenario. So you must always prefer the project to be in a pivotal position for maximizing maneuverability, because you will never reach a dead-end, or be forced to follow a specific path to success. You must always have your options open in the presence of problems.

To illustrate the concept, some refactorings are needed because of unfilled holes into the functional specifications. To present an example in a much minor scale, suppose there is an algorithm to perform some financial calculation that you are not sure if it apply or not to a use case, then you can model it using an Strategy pattern. Here the pivotal position is having implemented the Strategy, so change will be easier when the unfilled specification hole gets actually filled. However, just proposing to change that every algorithm in your project has to be modeled as a Strategy pattern is an overkill: it is just a speculation to think that every algorithm is subject to change, even plain if-then-else logic, so a big refactoring like this is not appealing at all.

Remember the 3Ps of software engineering

The 3Ps of software engineering are:

(People, Process, Product)

Sometimes called the 4Ps if you add Project to the tern. It basically states that Project depends on how well you can control three variables: they are done by People which uses a Process to develop a Product. So for the project to be successful, you must keep healthy the people, the process used by people, and the product you are developing.

Suppose that the three variables are in harmony when no changes are made to the project, I like to think the 3Ps as a perfectly balanced 3 plate scales hanging from a triangle. When a new requirement or functionality arrives to implement, a “disturbance in the force” is generated by the environment that the team must “sense” and start working to balance it again.
Considering that you realize you need to refactor a big piece of the code base, and this will impact the whole team, you must plan wisely to avoid missing the target. First of all start with the most important, unstable and unpredictable of the three variables: the People.

People

  • Is there consensus that the refactoring must be carried over?
  • Will the team be more confident on making non trivial changes to the code base after that?
  • Will they deliver more quickly?
  • Will they deliver with fewer bugs?
  • Will people have longer working days?

Product

  • Will the product improve in some aspect visible to the users (like performance or functionality) or on any other way (like extensibility or maintainability)?
  • Will the actual product or service degrade its quality of service until the refactoring is completed? For how long?
  • Will the refactoring be completed by stages, progressively? or will the product or service be modified in one shot?

Process

  • Will the process allow the team to deviate from the usual plan to perform the refactoring?
  • Is the structure of the team or business adequate to quickly communicate the needs and changes between different roles like developers, DBAs, sysadmins, testers, users?

Supply and demand

Remember that in the market economy, the fundamental truth are given by the laws of Supply and Demand, and for any business they are as inescapable as the laws of gravity in nature. Specially if you work on a small business, chances are that you must think from time to time in the business case of proposed changes. You must asses how the refactoring will impact your business in the medium and long-term future:

  • Will the refactoring improve your capacity to supply new functionality/services/products?
  • Will the refactoring impact positively on the demand of your services/products?

Stay aligned with management in this. Ask yourself “What would I do if it were my only decision? What would I do if I’d have to bet $100,000 of my own pockets on this change? You would be surprised what you would be your answers when you carry out yourself that exercise!

Conclusions

When faced to do big refactorings many things must be evaluated before allowing the team to reorganise the code base of the project. IMO, the best thing you can do is: search for consensus with the team to decide if the change is appealing or not, jointly plan the effort, and analyse the business case of it.

Some confusing cases of method overloading – Part 3

Following this series of posts related to method overloading in Java, now we will see how will the compiler behave when calling an overloaded method which actual parameter is a result of the conditional ? operator.

First of all, in the case you missed the previous posts, here you have the links to the first part, and the second part.

Let’s consider the following example program:

import java.util.Random;

public class OverloadPrimitiveTest {

  public void method(int a) {
    System.out.println("method(int)");
  }

  public void method(Integer b) {
    System.out.println("method(Integer)");
  }

  public static void main(String[] args) {
    OverloadPrimitiveTest o = new OverloadPrimitiveTest();

    o.method(1); // call method(int)
    o.method(new Integer(1)); // call method(Integer)
    o.method((short) 1); // call method(int)
    o.method(null); // call method(Integer)

    Random random = new Random();
    o.method((random.nextBoolean()) ? 1 : new Integer(2)); // call method(int), ambiguous
  }

}

The first two calls are straightforward to digest, they call the intuitively correct overloaded method since their actual parameters in the method call expression are those specific ones.

The third call seems strange at first sight but it is OK, since Java can safely widen a primitive short to an int (no overflow can happen), so this resolve to the overloaded method which takes a primitive int parameter. The fourth call is very intuitive also, because a null value can be assigned to reference types, not a primitive type, so using a null as the actual parameter resolves intuitively to the method that takes an Integer object.

The fifth call is what I consider ambiguous.

Why is it ambiguous?

The last call does a couple of things with its actual parameter. First of all, the actual parameter is taken after the evaluation of the conditional ? operator, and the type returned by the conditional operator is the primitive int, because the specification says that it will unbox its second and third operand as necessary (JLS 15.25.)

However, comparing the fifth invocation of the overloaded method with the first and second ones, it is confusing at first sight which one of the methods will be selected at compile-time.

An example using references

The same confusing case may happen with references to objects too, since it is not necessary the presence of a boxing or unboxing operation for the anomalous case to appear. Just like the primitive int type can be widened to long or float, a reference type can be widened to some superclass in the class herarchy, or some implemented interface. Here is an example:

import java.applet.Applet;
import java.awt.Panel;
import java.util.Random;

import javax.swing.JApplet;

public class OverloadPrimitiveTest2 {

 public void method(Panel p) {
   System.out.println("Panel");
 }

 public void method(JApplet a) {
   System.out.println("JApplet");
 }

 public static void main(String[] args) {
   OverloadPrimitiveTest2 o = new OverloadPrimitiveTest2();

   o.method(new Panel()); // prints Panel
   o.method(new JApplet()); // prints JApplet

   o.method((new Random().nextBoolean()) ? new Applet() : new JApplet()); // prints Panel
 }

}

Here, JApplet is a subclass of Applet, which in turn is a subclass of Panel. In the third method invocation the conditional ? operator decides that the type of the expression is Applet, since JApplet is widened to Applet without any problem. In turn, the best fit for the method invocation is calling method(Panel) since an expression of type Applet can be widened to type Panel. Again it is confusing the intention of the programmer since both methods may have been selected (if not for the conditional ? operator.)

A note on code checking tools

The usual code analysis tools, like FindBugs, PMD, and (now free) Google’s CodePro AnalytiX are happy to detect and mark the use of the conditional ? operator. I find that informative but more often than not, it marks a false positive, that is to say that not necessarily I believe that using the conditional ? operator is wrong: it has its use to make the program somewhat easier to read. However, I do think that the conditional ? operator is too smart in trying to reduce the two evaluated expressions into a single type expression: sometimes it widens too much and useful type information is lost in the oblivion.

As a side note, CodePro AnalytiX has a rule indicating that there is an overloaded method and recommends that either rename the method (to avoid the overloading) or change the number of parameters. I actually didn’t find this recommendation very useful, as the intention of an overloaded method is actually that by design it should have the same name (because they perform the same process for a different type) and it would by unnatural to add an extra parameter just for disambiguate the method invocation at compile-time.

Conclusions

The conditional ? operator is tricky, because it is too smart and infer things that most of the time we will not be aware of. In addition, combine method overloading with this operator and we will certainly have some fun time hunting bugs. However, I don’t personally think nor that the conditional ? operator should be avoided (as suggested by code analysis tools) nor method overloading, though I do believe you must not mix them.

You can distill a rule from this. Note that in the conditional ? operator, the types of the second and third operands are different and when considered in isolation, each one of them would make the compiler choose a different overloaded method to invoke. So a simple rule of thumb to remember would be:

Don’t use the conditional ? operator as an actual parameter of an overloaded method if the second and third operands have different types.

I wish some day I could see a rule implemented on those code analysis tools to detect this tricky cases.

Some confusing cases of method overloading – Part 2

Some days ago I posted about an example program which called an overloaded method where its parameters are final variables. This time I’ll call an overloaded constructor where its actual parameter is a null literal. As per the Java Language specification, overloaded constructor resolution is identical to overloaded method resolution, so for methods these same concepts apply.

Consider this example, extracted from the (very good and entertaining) book Java Puzzlers, which is a perfectly legal Java program:

public class Confusing {

  private Confusing(Object o) {
    System.out.println("Object");
  }

  private Confusing(double[] dArray) {
    System.out.println("double array");
  }

  public static void main(String[] args) {
    new Confusing(null); // prints double array
  }

}

Here, the class Confusing provides an overloaded constructor: one of them take an Object as a parameter and the other an array. Now, as the Java Language Specification says in its Chapter 10: arrays are objects, are dynamically created, and can be assigned to variables of type Object. All methods of the class Object can be invoked on an array.

So we have that double[] is a subclass of Object, and we try to create an instance of the class Confusing with its actual parameter being a null literal. Because null can be assigned to a double[] array, and to an Object instance, then both methods are applicable, however as per the Java Language Specification, the overloaded method to be called is the most specific one, and because an array is a subclass of Object, then the most specific is the Confusing(double[]) constructor, so the compiler chooses this constructor and then the program compiles without error on Java 6 update 20. Tools like FindBugs and PMD didn’t find this weird, as no warning was emitted from them.

Is it ambiguous or not?

Clearly the above program is perfectly legal…for the compiler. For me, and I believe many others average Joe programmers it is confusing. Why it is legal when at first glance it seems it is ambiguous? My rule of thumb is that code must be easy to read, because code is read more times than written. Also, for productivity purposes the compiler doesn’t count as somebody that read code, so why write perfectly well-formed code that is difficult to grasp at first sight?

The code is ambiguous because a null literal could be used as actual parameter for both constructors, it is not clear which one it will choose to call. Whether or not the Java compiler actually has very deterministic and definite rules to disambiguate the call, it is irrelevant to the human (and error prone) reader, and therefore it is sound to emit a Warning message here.

How to disambiguate

Casting the null parameter to the actual type you intent to refer to is the simplest way to disambiguate the call. No question which constructor you are referring to now:

new Confusing((Object)null);
new Confusing((String[])null);

or you can define a variable with the right type:

Object x = null;
new Confusing((x);
double[] y = null;
new Confusion(y);

Conclusions

The rules of overloaded method resolution are complicated to understand, and “smart” code that depends on those rules makes the program harder to read. I would personally like to see FindBugs or PMD add some rule to detect this kind of situation. As said in the first part of this series of posts, why don’t we use the Java type system to the fullest?

Stay tuned for more interesting overloaded examples!

Some confusing cases of method overloading

Some time ago I was starting to wonder how the Java compiler actually worked, because some programs I was reading at that time compiled without errors but looked like something was out-of-place, particularly some example programs which included overloaded methods.

Method overloading is basically a feature offered by a programming language allowing to create several methods with the same name, but different parameters type or output type. Java is one of such languages.

Java defines at compile time which method will be actually called when encountered with a method invocation. The rules that are checked by the compiler to take such a decision are rather complex and can be found in the Java Language Specification for the intrigued reader. Those rules may or may not be perfect but are the ones that apply to every Java program, so proposing a change on them produces the very undesirable effect of making every Java program to potentially fail to compile. However, the compiler may be modified to emit a Warning when something looks suspicious, even if the program compiles correctly. In this post I’ll show you some of those cases:

Calling an overloaded method where the actual parameter is a final variable

class A {
}

class B extends A {
}

public class OverloadingTest {

  public void method(A a) {
    System.out.println("method(A)");
  }

  public void method(B b) {
    System.out.println("method(B)");
  }

  public static void main(String[] args) {
    OverloadingTest o = new OverloadingTest();

    A p = new B();
    o.method(p); // call method(A)

    o.method(new B()); // call method(B)

    final A p2 = new B();
    o.method(p2); // call method(A)
  }
}

In this example, the first and second calls are intuitively resolved to what the programmer expect, however the third call is open to some controversy. Since the actual method called by the program is resolved at compile time, the Java compiler checks the actual parameters passed to the method call against its static type, not its dynamic type.

This makes sense since at compile time, the compiler can not determine actually which will be the dynamic type of a variable when the program is running. In the third call, the p2’s static type is A and its dynamic type is B. However the p2 variable is marked as final, meaning it can not change its type during its lifetime. In other words, p2 is instantiated as having the B type, and its dynamic type can not change…ever.

IMO, this seems to me like a programmer error on the intent of the variable p2. It is a B, and it will never by any other type? or it is an A? Because based on that decision the compiler will call one or the other overloaded method. I believe the compiler can check on this and show a Warning. I checked this code compiling against Java 6 Update 20 and no warning, also FindBugs and PMD didn’t detect this as a code smell either.

There are some ways to disambiguate the conflicting line of code:

  • Cast the variable to the more specific type:
    final A p2 = new B();
    o.method((B)p2);
  • Remove the final modifier (meaning that its dynamic type could change):
    A p2 = new B();
    o.method((B)p2);
  • Keep the final modifier, but changing the static type to one that indeed lead the compiler to resolve to one (and only one) matching method:
    final B p2 = new B();
    o.method(p2);
    // or
    final A p2 = new A();
    o.method(p2);

Dealing with blank final declarations

Consider the following code excerpt:

final A a;
if (someCondition) {
  a = new A();
}
else {
  a = new B();
}
o.method(a);

There is no obvious way of detecting the issue here (at least not without a flow analysis) so a conservative thing to do is to raise the warning anyway, so the programmer may take out either the final keyword, or refactor the code like this:

if (someCondition) {
  final A a = new A();
  o.method(a);
}
else {
  final B b = new B();
  o.method(b);
}

This time, PMD somewhat points to a problem here, since it warns about the use of final (blank) local variables (AvoidFinalLocalVariable Rule.)

The conditional ? operator

Seems counter intuitive that if

o.method(new B());

call method(B), the following two actually resolve to method(A):

o.method((false) ? new A() : new B());
o.method((false) ? (A)null : new B());

and the following will call method(B):

o.method((false) ? null : new B());

This is so because of the resulting expression type produced by the ? operator.

As both the second and third operand of the ? operator are by all means constant expressions, IMO they should be treated just with the same semantics as if a blank final variable had been used instead:

final A a;
if (condition)
  a = expressionReturningA;
else
  a = expressionReturningB;
o.method(a);

Then, the same reasoning as before applies.

IMO, using the ? operator to resolve an expression for use as an actual parameter in an overloaded method call should be avoided if possible, it’s a code smell, and as such, tools must warn the programmer. Since the ? operator has only one type (which depends on the second and third operand), the only case when it may not emit a warning is if both operands reduce to a type that uniquely resolve to only one (and the same) method.

Conclusions

The interaction between overloaded methods and different kind of expressions in Java is tricky. On the one hand, method resolution rules are very complicated, but reasoning about expressions that represent constants or non changing values (such as final variables) should allow the programmer to think about the code in abstract terms, just as any mathematical formula would behave. Why bother to program in a strongly typed language if we can not reason about the program in an intuitive fashion? Why not make the tools to emit a warning when we write somewhat ambiguous code? Why can’t we use the Java type system to the fullest?

In the upcoming weeks I’ll post some more obscure examples.

Calling an overloaded method where the actual parameter is a final variable