The philosophy of nothing

Nothing, the opposite of something, does not exist. Representing something is trivial, but how to represent nothing is a great philosophical problem.

What is nothing?

Even in the context of a concrete application, the concept of nothing may not be clear. For example, consider a human resources system in which each staff member is assigned a number, and which is distributed around the world so that no single machine has complete knowledge of the whole. Imagine we ask, “What’s the number of Paul?” If the system answers 3, we know that his number is 3. However, what if the system wants to say “I don’t know.” or “He does not exist as a staff member.”? Some badly designed systems cannot handle these situations and return a value representing nothing, which, as an integer, is 0, and which is then easily confused with “His number is 0.” (Of course, modern systems throw exceptions in these scenarios.)
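As a minimal sketch of how the three answers could be kept apart, the following C++17 code returns a status together with an optional number. The names LookupStatus, LookupResult, LocalView and staff_number are hypothetical, invented for this illustration, and the map stands in for whatever partial knowledge one node happens to hold.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical outcome of asking one node for a staff number.
enum class LookupStatus { Found, NotStaff, Unknown };

struct LookupResult {
    LookupStatus status;
    std::optional<int> number;  // holds a value only when status == Found
};

// Hypothetical local view of one node: a name maps to a number when the
// person is staff, or to std::nullopt when the node knows the person is
// not staff. Names missing from the map are simply unknown to this node.
using LocalView = std::map<std::string, std::optional<int>>;

LookupResult staff_number(const LocalView& view, const std::string& name) {
    auto it = view.find(name);
    if (it == view.end()) {
        return {LookupStatus::Unknown, std::nullopt};   // "I don't know."
    }
    if (!it->second.has_value()) {
        return {LookupStatus::NotStaff, std::nullopt};  // "Not a staff member."
    }
    return {LookupStatus::Found, it->second};           // may legitimately be 0
}
```

With this shape, a staff member whose number really is 0 is reported as Found with the value 0, which can no longer be mistaken for either of the two "nothing" answers.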

Null, void and empty

“Null and void”, in legal terms, means “invalid” and “of no effect”. However, the word “void” alone already conveys the same meaning, so the phrase is actually redundant. In programming literature, however, the words null and void have related but slightly different meanings.

Null

In programming, null (or nil) is usually used to represent “nothing”. The null reference was invented by C.A.R. Hoare in 1965 as part of the Algol W language. In 2009 he called it his billion-dollar mistake, one that has caused innumerable errors, vulnerabilities and crashes over the past forty years.

Null usually appears in one of the following forms:

  • In programming languages with pointers, where a null pointer points to absolutely nothing (for example, C, C++, Java).
  • In dynamically-typed programming languages, where null is a unary type whose only value is null itself, commonly used as a placeholder for nothing (for example, PHP, JavaScript).

Some programming languages go even further and have more than one kind of null:

  • JavaScript: undefined, null (which are different)
  • Objective-C++: 0, NULL, nullptr, nil (which are actually the same, all representing the null pointer) and NSNull (which is actually an empty object, not a kind of null in the normal sense)

In a strongly-typed programming language, allowing null in place of an object pointer basically undermines the type system, as demonstrated by the following C++ code:

int f(const int *x) {
    return *x;
}

f is guaranteed to work on any valid value of x, with one exception: NULL. If x is NULL, dereferencing it is undefined behaviour and the function typically crashes. In C++, such mistakes can be prevented by a coding standard that always uses references where NULL is not intended and pointers only where NULL is expected. In some other programming languages you have no choice, because they mandate the use of pointers and the type system accepts null as an object pointer; Java is one example. In those situations you can never know whether a function actually works with null arguments or whether it can return null, and you can only pray to God that a NullPointerException doesn’t fly out from somewhere deep in the library.
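The coding standard described above might look like the following sketch. The function names and the fallback parameter are hypothetical, chosen only for illustration.

```cpp
// Reference parameter: the caller must supply a real int, so there is no
// null case for the function to handle.
int read_required(const int& x) {
    return x;
}

// Pointer parameter: null is an expected input here, so it is checked
// before the pointer is dereferenced. The fallback is a hypothetical
// choice made by the caller for this sketch.
int read_optional(const int* x, int fallback) {
    return x != nullptr ? *x : fallback;
}
```

The signature itself now tells the reader whether null is acceptable, instead of leaving it to documentation or luck.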

Void

Void, on the other hand, is used in the C family of languages to denote that a value does not exist. In B, the predecessor of C, there were no types at all. Everything worked in terms of machine words. There was also no concept of “procedures”, only “functions”. A function was assumed to return a value; if the value was used but the function did not actually return one, the behaviour was undefined. Early versions of C inherited this feature, assuming int if the return type was missing from the declaration:

f()
{
    puts("Hello");
}

In the type system, f accepts undetermined parameters and returns int. (Before the era of function prototypes, the arguments were not checked at all, causing undefined behaviour if they did not match the function definition.) If f returns some other type, the type is added to the function definition. However, there was no way to indicate that the function really does not return anything at all. The practice at the time was to omit the type to indicate that the function does not return a value, and to write int explicitly (which is redundant) to indicate that the function returns an integer. However, this was only a convention, not a language feature, and both forms are considered to return an int in the type system.

Later, void was introduced as a placeholder for the return type to indicate that the function does not return a value at all. The compiler reports an error if the code returns a value or if the caller uses the return value. void is not an ordinary data type: it cannot be used to declare variables.

void has two additional uses in C: in function prototypes and in pointer types. Because early versions of C did not type-check function arguments, functions looked like the following:

/* header */
int g();

/* implementation */
int g(a, b)
    int a, b;
{
    return a + b;
}

Because an empty parameter list does not mean “no parameters” in early versions of C, void is used in place of the parameter list to denote that the function receives no parameters at all, which retains compatibility with legacy code. C++ does not support old-style function declarations, so an empty parameter list is enough to denote no parameters (C23 later adopted the C++ meaning):

void h(); /* unspecified parameters in C, no parameters in C++ */
void i(void); /* no parameters in C and C++ */

void can also be used as a placeholder in pointer types. Before the introduction of the void keyword, char * was used as a general-purpose pointer type in addition to its literal meaning of a pointer to some char object. With void *, it is clear that the type the pointer points to is unknown, and that the pointer has to be converted to some valid type before dereferencing. Pointer arithmetic cannot be done with void * pointers because the underlying type does not exist. In C, any object pointer can be implicitly converted to and from void *, but in C++ only the one-way implicit conversion from object pointer to void * is allowed, because the reverse cannot be checked for safety. In addition, void *p = NULL; means that p points to a piece of memory which does not exist.
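The one-way conversion rule in C++ can be illustrated with a short sketch. The function names are hypothetical, and the code assumes the caller really did pass a pointer to an int, which the compiler cannot verify.

```cpp
// Recover the int behind a type-erased pointer. A void * cannot be
// dereferenced directly; C++ requires an explicit cast back to the real
// type, and the programmer is responsible for the cast being correct.
int as_int(void* erased) {
    return *static_cast<int*>(erased);
}

int roundtrip(int value) {
    void* erased = &value;  // implicit: object pointer to void *
    // int* p = erased;     // would not compile in C++ (it would in C)
    return as_int(erased);
}
```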

Undefined

An undefined variable normally means that the variable does not even exist (it does not even hold a null value). In most static languages, using an undefined identifier is an error. In some circumstances, it is useful to distinguish between undefined and null. For example, a phone book entry stored with null may mean that the person has no phone number, while a non-existent entry may mean “I don’t know.”

However, the terminology of JavaScript is confusing:

var a;
var b = undefined;
typeof a; // "undefined"
typeof b; // "undefined"
typeof nonexistent; // "undefined"
nonexistent; // throws a ReferenceError

You are allowed to define a variable and set it to undefined! In fact, in JavaScript, undefined and null are both unary types used to represent nothing. undefined is the default value of uninitialised variables, missing arguments and missing return values, whereas null is intended to be assigned by the user in place of a missing object and is never generated automatically.

Empty

Empty, on the other hand, is not a well-defined concept. It basically refers to values which generally mean the absence of something and which normally evaluate to false in Boolean contexts, where all other values of the same type evaluate to true. For example, false (Boolean false), 0 (integer zero), 0.0 (floating-point zero), '\0' (null character), "" (empty string), null (null pointer) and [] (empty array) are generally considered empty values in programming languages, and they act as defaults for variables whose initialiser is not explicitly given.
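In C++, one way to see this family of empty values is value-initialisation, written T{}, which yields the default for many built-in and standard library types. The helper is_empty_value below is a hypothetical name for this sketch; it only works for types that are default-constructible and comparable with ==.

```cpp
#include <string>
#include <vector>

// Returns true when v equals the value-initialised T, which serves here
// as the "empty" value of the type.
template <typename T>
bool is_empty_value(const T& v) {
    return v == T{};
}
```

Under this definition, false, 0, 0.0, '\0', an empty std::string, an empty std::vector and a null pointer all count as empty, while any other value of the same type does not.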

Not all data types have a sensible empty value. For example, a storage engine type that is required to keep data persistent does not have a sensible default, because the object must do something to store the data. In contrast, if the persistence requirement does not exist, an empty storage engine may simply ignore everything written into it and return nothing when read from, which is the case for /dev/null on Unix systems, acting as an empty character device.
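An empty storage engine of this kind could be sketched as follows. The Storage interface and the NullStorage class are hypothetical, invented for this illustration; they are not taken from any real library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical storage interface for this sketch.
class Storage {
public:
    virtual ~Storage() = default;
    virtual void write(const std::string& data) = 0;
    virtual std::string read(std::size_t max_bytes) = 0;
};

// An "empty" engine in the spirit of /dev/null: writes are discarded and
// reads always come back with no data.
class NullStorage : public Storage {
public:
    void write(const std::string&) override {}
    std::string read(std::size_t) override { return {}; }
};
```

Such an object is a perfectly real Storage that can be passed anywhere one is expected; it just does only trivial work.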

Some programming languages, like Objective-C, treat dereferencing a null pointer (sending a message to nil) as a valid operation, which behaves like an empty object whose operations all return empty values. This may or may not be desirable behaviour: although it reduces crashes, it may hide logic errors where a null value is unexpectedly used in a context where null is not expected. Also, the empty object itself may not satisfy class invariants, causing surprises in client code if the empty object produced by dereferencing a null pointer is used instead of a real object.

Unary type

A unary type (more commonly called a unit type) is a type whose set of values contains only a single value, and which therefore holds no information. For example, an empty class in C++ and the null type in PHP are unary types. These types are commonly used when a meaningless value is needed, but when used deliberately such a value is absolutely not the same as nothing. For example, we can deliberately pass an object of an empty class, which acts as the root of a class hierarchy, into a container. The value of a unary type is generally considered an empty value because it cannot do anything useful by itself.
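A small C++17 sketch of the idea: an empty class carries no data, so all of its instances are interchangeable. The struct Unit and its equality operator are hypothetical names for this illustration; std::monostate is the standard library's ready-made type of this kind.

```cpp
#include <type_traits>
#include <variant>

// An empty class has exactly one value: every instance is
// indistinguishable from every other.
struct Unit {};

// Hypothetical equality for Unit; with only one value it is always true.
constexpr bool operator==(Unit, Unit) { return true; }

static_assert(std::is_empty_v<Unit>);            // no data members
static_assert(std::is_empty_v<std::monostate>);  // standard single-valued type
```

A Unit object is still something: it can be stored in a container or passed to a function, even though it tells the receiver nothing.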

Option type

An option type is a type which, by design, indicates whether a real value exists and, if it exists, what it is. For example, the Maybe type in Haskell is an option type. The use of an option type makes it explicit that a real value may not exist and that this case has to be handled specifically. In general, the wrapped type of an option type may itself be an option type, which means that any number of levels of emptiness can be expressed in the value. For example, the query method of a phone book may return a value of type Maybe (Maybe String), where existence at the outer level means “I have an entry for that person”, and existence at the inner level means the person has a phone number. An option type is comparable to a collection type whose size is limited to 1.
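The nested option type from the phone book example can be approximated in C++17 with std::optional. This is only a sketch: PhoneBook and query are hypothetical names, and a std::map stands in for the real phone book.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical phone book: a missing key means "no entry"; a present key
// holding std::nullopt means "this person has no phone number".
using PhoneBook = std::map<std::string, std::optional<std::string>>;

// Rough C++ counterpart of a query returning Maybe (Maybe String).
std::optional<std::optional<std::string>>
query(const PhoneBook& book, const std::string& name) {
    auto it = book.find(name);
    if (it == book.end()) {
        return std::nullopt;  // outer level empty: no entry for this person
    }
    // Outer level present; the inner level may still be empty.
    return std::make_optional(it->second);
}
```

The caller has to unwrap two levels, so "no entry" and "entry without a number" cannot be confused with each other or with a real number.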

Design considerations

Treating nothing as something is a disaster. If something is required, you have to make sure that there is really something, not nothing. If nothing is acceptable, then make sure you can accept nothing in addition to something, for example through the use of an option type. Also, please don’t confuse null with empty. Null is a kind of empty, but there are many kinds of empty, like empty strings and empty collections, and they are not the same as null. They are actually something which can be used as a real object, although they do only trivial work.
