Another thing about Java binary search routines

Java (J2SE) has binary search routines in the java.util package, as Collections.binarySearch (for searching in List implementations), and Arrays.binarySearch (for searching in arrays).

The JavaDoc for both of these methods states:

Returns:
index of the search key, if it is contained in the list; otherwise, (-(insertion point) - 1). The insertion point is defined as the point at which the key would be inserted into the list: the index of the first element greater than the key, or list.size(), if all elements in the list are less than the specified key. Note that this guarantees that the return value will be >= 0 if and only if the key is found.

Now, Java specifies two's complement arithmetic for its primitive integer types. With two's complement (and using “~” to stand for the bitwise complement operator, as in Java), the identity

-x = ~x + 1

holds, so

-x - 1 = (~x + 1) - 1 = ~x.

So (-(insertion point) - 1) in the quote above could equally well be given as ~(insertion point), which to my eyes at least is a more straightforward expression. And when reading up on the equivalent facility in .NET - System.Collections.ArrayList.BinarySearch - I was interested to find:

Return Value

The zero-based index of value in the sorted ArrayList, if value is found; otherwise, a negative number, which is the bitwise complement of the index of the next element that is larger than value or, if there is no larger element, the bitwise complement of Count.

It occurred to me that maybe Sun thinks that typical Java programmers cannot handle bitwise operators. But the results of javap -c java.util.Arrays shows that, in the relevant cases, the Sun J2SE implementation actually does -(low + 1), just as shown in Josh Bloch's recent entry. So I suppose this was simply copied from Bentley's Programming Pearls, with the trivial expansion from -(low + 1) to (-(insertion point) - 1) by whoever wrote the JavaDoc for those methods.

Something else for the VM trivia fans: The Java Virtual Machine does not actually have a bitwise complement instruction corresponding to the ~ operator in the Java language. Instead, the compiler generates an iconst_m1, ixor sequence (i.e. x ^ -1). In contrast, the CIL instruction set used by .NET (as described by ECMA-335) does have a not instruction which performs a bitwise complement. (Neither VM has the specification of its instructions online in a convenient form for linking!)

4 comments