Boundary value analysis and integer sequences for fuzzing

By on .

Equivalence partition testing is a test case design technique in which test cases are designed using representatives from equivalence classes. An equivalence class (or partition) is a portion of the component input or output domains for which the component behavior is assumed to be the same from the component specification. In equivalence partition testing, test cases are designed to cover each class at least once, thereby reducing the total number of test cases that must be developed and executed.

Boundary value analysis is a test case design technique in which test cases are designed using representatives of boundary values. A boundary value is an input value or output value that is on the boundary between equivalence classes, or an incremental distance either side of the boundary.

At the most primitive level, access to memory in most modern processor architectures is made in 8-bit boundaries, or aligned to 2-byte, 4-byte, and 8-byte boundaries for 16-bit, 32-bit, and 64-bit values respectively. Therefore, compilers attempt to allocate memory in a way that prevents data misalignment, and pads structures in a way that aligns each member of the structure within these boundaries—since accessing unaligned memory may be expensive for processors that supports this type of operation. The same occurs when allocating memory buffers, whereas their sizes are allocated as multiples of these boundaries. More importantly, these boundaries also define the most primitive equivalence classes of any component that relies upon a processor architecture.

These sizes, when represented in a network protocol or file format as a length parameter, uses the two’s complement representation, which is the most common method of representing signed integers on present computers. A -1 two’s complement cast to an unsigned data type is equal to the largest binary repunit1 (a Mersenne number2) in the word length, namely 255, 65535, 4294967295, 18446744073709551615 in bytes, words, double words and quadruple words (Sloane’s A051179), and defines the maximum value of each of these equivalence classes.

Conversely, the minimum value of each of these equivalence classes are given by the binomial number of the form (2^n)+1 (a Fermat number3), namely 257, 65537, 4294967297, 18446744073709551617 (Sloane’s A000215.)

Therefore, whether if the component specification is unknown, or if testing the implementation and enforcement of the component specification, integer sequences, such as Mersenne numbers (Sloane’s A000225) and a less common form of Fermat numbers (Sloane’s A000051), when used as representatives of boundary values either side of these boundaries, are heuristically optimal for identifying weaknesses such as “Integer Overflow or Wraparound (CWE-190)” and “Incorrect Calculation of Buffer Size (CWE-131).”

In fuzzing (or fuzz testing, or robustness testing), these equivalence classes, boundaries, and integer sequences, may be used for effectively reducing the number of trials and significantly increasing the number of outcomes in a shorter period of time.

1. “A repunit is a number consisting of copies of the single digit 1. The term “repunit” was coined by Beiler (1966), who also gave the first tabulation of known factors.” Weisstein, Eric W. “Repunit.” From MathWorld—A Wolfram Web Resource.

2. “A Mersenne number is a number of the form (2^n) -1 (Sloane’s A000225), where n is an integer. The Mersenne numbers consist of all 1s in base-2, and are therefore binary repunits.” Weisstein, Eric W. “Mersenne Number.” From MathWorld—A Wolfram Web Resource.

3. “There are two definitions of the Fermat number. The less common is a number of the form (2^n) + 1 (Sloane’s A000051), obtained by setting x = 1 in a Fermat polynomial. The other much more commonly encountered Fermat numbers are a special case, given by the binomial number of the form (2^(2^n)) + 1 (Sloane’s A000215).” Weisstein, Eric W. “Fermat Number.” From MathWorld—A Wolfram Web Resource.