The Clean Code Fundamental Series - Episode II - Names

Disclaimer: All credits for this series go to Robert C. Martin, who gave me permission to write the content.

Reveal your intent

Every time we name a software element, be it a class, a method or a variable, the name should reveal the intent of the thing we are naming.

Let’s consider variables for a moment. If one needs a comment to explain the purpose of a variable, then the chosen name does not sufficiently reveal the variable’s intent.

Consider the following:

int d; //elapsed time in days

That looks just silly. It’s far better to write the following:

int elapsedTimeInDays;

Now the variable declares its intent and there is no need for comments.

Describe the problem

Whenever a reader has to dig through the surrounding code to understand what a variable means, the naming strategy has pretty much failed. Take this code as an example:

/** Useful range constant */
public static final int INCLUDE_NONE = 0;

/** Useful range constant */
public static final int INCLUDE_FIRST = 1;

/** Useful range constant */
public static final int INCLUDE_SECOND = 2;

/** Useful range constant */
public static final int INCLUDE_BOTH = 3;

Look at the comments.

Useful. Really? Thank you for sharing your perception, but it doesn’t communicate anything useful to me. I hope the code is useful; otherwise, why would it be there?
Range. Do you see a range in there? A range normally lies between two values; these are single-digit constants. Additionally, a range of what?
Constant. Thank you. I know Java, I know that’s a constant, no need to explain that to me.

The truth is that by reading these names we can’t understand their intent or the problem they’re trying to solve. We would need to go into the code where these constants are used and try to figure out their intent. We can do much better than that.

The code where these constants are used checks whether a date lies within a range delimited by two dates. The code is not shown here for brevity, but here is a brief explanation of how the constants were used:

The FIRST, SECOND and BOTH parts of the names tell the algorithm whether to include the first date, the second date or both in the range. That is where the INCLUDE and the FIRST, SECOND and BOTH parts of the constant names come from. FIRST, SECOND and BOTH are simply not good names for declaring the intent of including the lower bound, the upper bound or both bounds of a range.

INCLUDE_LOWER_BOUND could have been a better name. However, this kind of problem is already described by a well-known mathematical concept: intervals. Intervals can be:

(a,b) open, they don’t include lower or upper bounds

[a,b] closed, they include lower and upper bounds

(a,b] open left, they include the upper bound

[a,b) open right, they include the lower bound

So instead of the constants above we could use an enum:

public enum DateInterval {
  OPEN, CLOSED, OPEN_LEFT, OPEN_RIGHT
}
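
To illustrate how such an enum might replace the integer flags, here is a minimal sketch. Only DateInterval and its four values come from the text above; the wrapper class DateIntervalDemo, the two boolean fields and the contains method are hypothetical names introduced for illustration, since the original range-checking code is not shown.

```java
import java.time.LocalDate;

// Hypothetical sketch: everything except DateInterval and its four values is an assumption.
class DateIntervalDemo {

    enum DateInterval {
        OPEN(false, false),       // (a,b)
        CLOSED(true, true),       // [a,b]
        OPEN_LEFT(false, true),   // (a,b]
        OPEN_RIGHT(true, false);  // [a,b)

        private final boolean includesLowerBound;
        private final boolean includesUpperBound;

        DateInterval(boolean includesLowerBound, boolean includesUpperBound) {
            this.includesLowerBound = includesLowerBound;
            this.includesUpperBound = includesUpperBound;
        }

        // True when date lies between lower and upper, honouring this interval's bound rules.
        boolean contains(LocalDate lower, LocalDate upper, LocalDate date) {
            boolean aboveLower = includesLowerBound ? !date.isBefore(lower) : date.isAfter(lower);
            boolean belowUpper = includesUpperBound ? !date.isAfter(upper) : date.isBefore(upper);
            return aboveLower && belowUpper;
        }
    }

    public static void main(String[] args) {
        LocalDate start = LocalDate.of(2024, 1, 1);
        LocalDate end = LocalDate.of(2024, 1, 31);

        // The lower bound itself is inside a closed interval but outside a left-open one.
        System.out.println(DateInterval.CLOSED.contains(start, end, start));
        System.out.println(DateInterval.OPEN_LEFT.contains(start, end, start));
    }
}
```

With this shape, a call site names the kind of interval it wants instead of passing a bare integer whose meaning has to be looked up.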

Names are not for the benefit of the programmer who writes them. They are meant to communicate intent, which should always be the programmer’s first priority.

Avoid disinformation

When a name in the code doesn’t mean what it says, that’s disinformation. Disinformation is one of the worst sins programmers can commit.

Take this code as an example:

/**
 * Returns an array of month names
 * @param shortened a flag indicating that shortened month names should be returned.
 *
 * @return an array of month names
 */
public static String[] getMonths(final boolean shortened) {
  if (shortened) {
    return DATE_FORMAT_SYMBOLS.getShortMonths();
  } 
  else {
    return DATE_FORMAT_SYMBOLS.getMonths();
  }
}

The method name getMonths is misleading: the method returns month names, not months. A better name would be getMonthNames(). The argument should not be called shortened but something like shortNames. Later in this series we will see that this method should probably be two methods, because in Clean Code we generally avoid boolean parameters and if/else branching. Two methods would remove both issues and communicate the intent much better:

getShortMonthNames
getMonthNames
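
A minimal sketch of what the split could look like follows. It assumes that DATE_FORMAT_SYMBOLS in the original code is a java.text.DateFormatSymbols instance; the holder class MonthNames and the choice of an English locale are assumptions made only to keep the example self-contained.

```java
import java.text.DateFormatSymbols;
import java.util.Locale;

// Sketch only: MonthNames is a hypothetical holder class, and the English locale is an assumption.
class MonthNames {
    private static final DateFormatSymbols DATE_FORMAT_SYMBOLS = new DateFormatSymbols(Locale.ENGLISH);

    // Full month names; no boolean flag is needed to select them.
    static String[] getMonthNames() {
        return DATE_FORMAT_SYMBOLS.getMonths();
    }

    // Abbreviated month names, exposed through their own intention-revealing method.
    static String[] getShortMonthNames() {
        return DATE_FORMAT_SYMBOLS.getShortMonths();
    }

    public static void main(String[] args) {
        System.out.println(getMonthNames()[0]);
        System.out.println(getShortMonthNames()[0]);
    }
}
```

Each method now does one thing, and a caller no longer has to remember what true or false means at the call site.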

Pronounceable Names

Names should be easy to say out loud. Consider this example:

public int getYYYY() {
  return this.year;
}

What does YYYY mean? Of course one could assume it means Year, but how would one pronounce it? A far better name would be getYear().

As mentioned above, names should be chosen for the convenience of the readers, not the authors!

Avoid encodings

Today our IDEs are powerful, so all it takes is hovering over a variable name to see its type. We don’t need to prefix our variable names with some encoding scheme denoting their type.

Besides, type errors are going to be caught by our compiler and our unit tests.

This also applies to prefixes such as C for classes, I for interfaces, A for abstract classes and so on.
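
As a small illustration, the sketch below shows a plain-named interface and class; the comment lists the encoded names they replace. All names here (EncodingDemo, Shape, Circle, radius, and the encoded variants IShape, CCircle, m_dRadius) are hypothetical and exist only for this example.

```java
// Hypothetical names throughout; the encoded variants appear only in comments.
class EncodingDemo {

    // Encoded style would be: interface IShape, class CCircle, field m_dRadius.
    // The plain names below carry the same information; the compiler and IDE supply the rest.
    interface Shape {
        double area();
    }

    static class Circle implements Shape {
        private final double radius;

        Circle(double radius) {
            this.radius = radius;
        }

        @Override
        public double area() {
            return Math.PI * radius * radius;
        }
    }

    public static void main(String[] args) {
        Shape shape = new Circle(1.0);
        System.out.println(shape.area());
    }
}
```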

Parts of speech

Classes and variables are nouns, methods are verbs.

The name of a class or a variable should always be a noun or a noun phrase, like Account or MessageParser. Avoid noise words, like Manager or Processor or Data or Info. They don’t really mean much and are just the programmer’s way of saying “I don’t know what to call this”.

Boolean variables should be named more like predicates, e.g.:

boolean isEmpty;
boolean isTerminated;

Then the code using them would be more expressive and read more like, as Grady Booch said, well-written prose, e.g.:

if (isEmpty) {
  // do something for the empty case
}

Methods should be verbs, like postPayment or getPrice. However, if a method returns a boolean, it should be named as a predicate, e.g. Payment.isPostable().
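
The following sketch shows how a verb and a predicate might sit side by side. The Payment class, its fields and the rule deciding whether a payment is postable are all assumptions made for illustration.

```java
// Hypothetical Payment class: the fields and the posting rule are assumptions for illustration.
class Payment {
    private final long amountInCents;
    private boolean posted;

    Payment(long amountInCents) {
        this.amountInCents = amountInCents;
    }

    // Predicate name: reads naturally inside an if statement.
    boolean isPostable() {
        return !posted && amountInCents > 0;
    }

    // Verb name: the method performs an action.
    void post() {
        if (!isPostable()) {
            throw new IllegalStateException("Payment cannot be posted");
        }
        posted = true;
    }

    public static void main(String[] args) {
        Payment payment = new Payment(1_999);
        if (payment.isPostable()) {
            payment.post();
        }
        // After posting, the payment is no longer postable.
        System.out.println(payment.isPostable());
    }
}
```

The condition if (payment.isPostable()) reads almost like an English sentence, which is the point of choosing a predicate.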

Enums tend to be states or object descriptors, therefore their values are often adjectives, e.g.:

enum Color {RED, GREEN, BLUE};
enum Status {PENDING, CLOSED, CANCELLED};
enum Size {SMALL, MEDIUM, LARGE};

One of the ways we get code to read like well-written prose is to use the appropriate parts of speech for classes, methods and variables.

The Scope Length Rule

Look at this code:

for (ITestResult tr : m_configIssues) {
  Element element = createElement(d, tr);
  rootElement.appendChild(element);
}

ITestResult has that ugly I in front of it, violating the “Avoid encodings” rule we talked about earlier. It probably means this is an interface, but we can’t be sure of that. If it isn’t, the name would also violate the “Avoid disinformation” rule.
Similarly, m_configIssues has that ugly m_ prefix, which also violates the “Avoid encodings” rule.
Element element seems OK.
What is d? It comes out of nowhere. It is probably an instance variable of some kind.
tr is fine because it is defined and used within the narrow scope of the for loop, so it is clear what it is and what it does.

This brings us to one of the most interesting concepts about naming: the relationship between scope length and name length.

Variable scope and name length

The length of a variable name should be directly proportional to the length of its scope.

If the scope is short, as with tr in the example above, then the variable name should also be short.
If the scope is wider, e.g. in the case of instance variables, the name ought to be longer.
For global variables we probably want the longest names.

The variable d in the example above violates this rule. It’s at least an instance variable, so its scope extends at least to the whole class, and its name should therefore be a meaningful noun. In this case d probably denotes a Document in a DOM structure, so a better name for it would be document.

The variable element, on the other hand, is scoped to the for loop, so it could easily be named e and the programmer would still know what it refers to.
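
Putting these observations together, the loop might be rewritten along the following lines. This is only a sketch: the surrounding class is not shown in the original, so ConfigIssueReport, TestResult, buildRootElement and the body of createElement are hypothetical stand-ins, and the DOM types come from the JDK's org.w3c.dom package.

```java
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Sketch with assumed types: ConfigIssueReport and TestResult are hypothetical stand-ins
// for the classes surrounding the original loop.
class ConfigIssueReport {

    static class TestResult {
        final String name;

        TestResult(String name) {
            this.name = name;
        }
    }

    private final Document document;             // wide scope: descriptive name instead of d
    private final List<TestResult> configIssues; // no m_ prefix

    ConfigIssueReport(List<TestResult> configIssues) throws ParserConfigurationException {
        this.document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        this.configIssues = configIssues;
    }

    Element buildRootElement() {
        Element rootElement = document.createElement("issues");
        for (TestResult tr : configIssues) {     // narrow scope: short names are enough
            Element e = createElement(document, tr);
            rootElement.appendChild(e);
        }
        return rootElement;
    }

    private static Element createElement(Document document, TestResult testResult) {
        Element element = document.createElement("issue");
        element.setAttribute("name", testResult.name);
        return element;
    }

    public static void main(String[] args) throws ParserConfigurationException {
        ConfigIssueReport report = new ConfigIssueReport(
                List.of(new TestResult("setUp"), new TestResult("tearDown")));
        System.out.println(report.buildRootElement().getChildNodes().getLength());
    }
}
```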

Function and class name length

For functions and classes, a good naming strategy is the opposite of that for variables.

The length of a function or class name should be inversely proportional to the length of its scope.

For public APIs (in the general sense of Application Programming Interfaces), we want short and meaningful names, because the users of our public APIs will find them easier to use.

However, as the scope gets narrower (protected, default, private), we want these names to be longer and fully descriptive. This becomes clearer when applying refactoring as directed in Clean Code. The public function should read like well-written prose, with a short and meaningful name. The logic in the code will be delegated to private functions, only accessible from within the class, therefore the longer and more descriptive of the intent these names are, the better.

Consider this example on a File class.

public void openFileAndThrowIfNotFound(){}

As a consumer of the File API, you would find such a method inconvenient to use. A much better alternative would be to call this method open():

public void open(){}
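
One way to keep the detail without burdening callers is to move it into a private helper with a long, descriptive name. The sketch below assumes a hypothetical TextFile class (it is not java.io.File), and the helper name throwIfFileDoesNotExist is likewise invented for illustration.

```java
import java.io.FileNotFoundException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical TextFile class (not java.io.File); names and behaviour are assumptions.
class TextFile {
    private final Path path;
    private boolean opened;

    TextFile(Path path) {
        this.path = path;
    }

    // Public API: wide scope, so the name is short.
    public void open() throws FileNotFoundException {
        throwIfFileDoesNotExist();
        opened = true;
    }

    public boolean isOpen() {
        return opened;
    }

    // Private helper: narrow scope, so a long descriptive name costs callers nothing.
    private void throwIfFileDoesNotExist() throws FileNotFoundException {
        if (!Files.exists(path)) {
            throw new FileNotFoundException(path.toString());
        }
    }

    public static void main(String[] args) throws Exception {
        Path existing = Files.createTempFile("clean-code", ".txt");
        TextFile file = new TextFile(existing);
        file.open();
        System.out.println(file.isOpen());
        Files.delete(existing);
    }
}
```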

In another example, consider this code:

public void serve(Socket s) {
  try {
    tryProcessingInstructions(s);
  } catch (Throwable e) {
  } finally {
    slimFactory.stop();
    close();
    closeEnclosingServiceInSeparateThread();
  }
}

Here, tryProcessingInstructions() and closeEnclosingServiceInSeparateThread() are private functions, only accessible from within the class, and they have nice, meaningful and descriptive names that help the reader understand what these functions do. Even without looking at the pri