Recursion in computer science is a method where the solution to
a problem depends on solutions to smaller instances of the same problem.[1]
The approach can be applied to many types of problems, and recursion is
one of the central ideas of computer science.[2]
"The power of recursion evidently lies in the possibility of defining
an infinite set of objects by a finite statement. In the same manner, an
infinite number of computations can be described by a finite recursive program,
even if this program contains no explicit repetitions." [3]
Most computer programming languages support recursion by allowing a function to call itself within the
program text. Some functional programming languages do not define
any looping constructs but rely solely on recursion to repeatedly call code. Computability theory has
proven[attribution needed]
that these recursive-only languages are Turing complete; they are as computationally
powerful as Turing complete imperative languages, meaning they can solve the
same kinds of problems as imperative languages even without iterative control
structures such as “while” and “for”.
Recursive
functions and algorithms
A common computer programming tactic is to divide a problem into sub-problems of the same
type as the original, solve those sub-problems, and combine the results. This
is often referred to as the divide-and-conquer method; when combined with a lookup table
that stores the results of solving sub-problems (to avoid solving them
repeatedly and incurring extra computation time), it can be referred to as dynamic programming or memoization.
A recursive function definition has
one or more base cases, meaning input(s) for which the function produces
a result trivially (without recurring), and one or more recursive cases,
meaning input(s) for which the program recurs (calls itself). For example, the factorial
function can be defined recursively by the equations
and, for all
,
. Neither
equation by itself constitutes a complete definition; the first is the base
case, and the second is the recursive case. Because the base cases break the
chain of recursion, they are sometimes also called "terminating
cases".
The job of the recursive cases can
be seen as breaking down complex inputs into simpler ones. In a properly
designed recursive function, with each recursive call, the input problem must
be simplified in such a way that eventually the base case must be reached.
(Functions that are not intended to terminate under normal circumstances—for
example, some system and
server processes—are an exception to this.)
Neglecting to write a base case, or testing for it incorrectly, can cause an infinite loop.
For some functions (such as one that
computes the series for e = 1+1/1!+1/2!+1/3!...) there is not an obvious base
case implied by the input data; for these one may add a parameter
(such as the number of terms to be added, in our series example) to provide a
'stopping criterion' that establishes the base case. Such an example is more
naturally treated by corecursion, where successive terms in the output are the
partial sums; this can be converted to a recursion by using the indexing
parameter to say "compute the nth term (nth partial
sum)".
Recursive
data types
Many computer programs
must process or generate an arbitrarily large quantity of data. Recursion is one technique for representing data whose
exact size the programmer does not know: the programmer can specify this data with a self-referential
definition. There are two types of self-referential definitions: inductive and coinductive
definitions.
Inductively
defined data
An inductively defined recursive
data definition is one that specifies how to construct instances of the data.
For example, linked lists can be defined inductively (here, using Haskell syntax):
dataListOfStrings
= EmptyList | Cons String ListOfStrings
The code above specifies a list of
strings to be either empty, or a structure that contains a string and a list of
strings. The self-reference in the definition permits the construction of lists
of any (finite) number of strings.
A
natural number is either 1 or n+1, where n is a natural number.
Similarly recursive definitions
are often used to model the structure of expressions and statements in programming languages. Language designers often express
grammars in a syntax such as Backus-Naur form;
here is such a grammar, for a simple language of arithmetic expressions with
multiplication and addition:
<expr>
::= <number>
| (<expr> * <expr>)
| (<expr> + <expr>)
This says that an expression is
either a number, a product of two expressions, or a sum of two expressions. By
recursively referring to expressions in the second and third lines, the grammar
permits arbitrarily complex arithmetic expressions such as (5
* ((3 * 6) + 8)), with more than one product or sum
operation in a single expression.
Coinductively
defined data and corecursion
A coinductive data definition is one
that specifies the operations that may be performed on a piece of data;
typically, self-referential coinductive definitions are used for data
structures of infinite size.
A coinductive definition of infinite
streams of strings, given informally, might look like this:
A
stream of strings is an object s such that
head(s)
is a string, and
tail(s)
is a stream of strings.
This is very similar to an inductive
definition of lists of strings; the difference is that this definition
specifies how to access the contents of the data structure—namely, via the accessor functions
head and tail—and what those contents may be, whereas the inductive
definition specifies how to create the structure and what it may be created
from.
Corecursion is related to coinduction, and can be used to compute
particular instances of (possibly) infinite objects. As a programming
technique, it is used most often in the context of lazy
programming languages, and can be preferable to recursion when the desired size
or precision of a program's output is unknown. In such cases the program
requires both a definition for an infinitely large (or infinitely precise)
result, and a mechanism for taking a finite portion of that result. The problem
of computing the first n prime numbers
is one that can be solved with a corecursive program (e.g. here).
Types
of recursion
Single
recursion and multiple recursion
Recursion that only contains a
single self-reference is known as single recursion, while recursion that
contains multiple self-references is known as multiple recursion.
Standard examples of single recursion include list traversal, such as in a
linear search, or computing the factorial function, while standard examples of
multiple recursion include tree traversal,
such as in a depth-first search, or computing the Fibonacci sequence.
Single recursion is often much more
efficient than multiple recursion, and can generally be replaced by an
iterative computation, running in linear time and requiring constant space.
Multiple recursion, by contrast, may require exponential time and space, and is
more fundamentally recursive, not being able to be replaced by iteration
without an explicit stack.
Multiple recursion can sometimes be
converted to single recursion (and, if desired, thence to iteration). For
example, while computing the Fibonacci sequence naively is multiple iteration,
as each value requires two previous values, it can be computed by single
recursion by passing two successive values as parameters. This is more
naturally framed as corecursion, building up from the initial values, at each
step track two successive values – see corecursion: examples.
A more sophisticated example is using a threaded binary tree, which allows iterative tree traversal, rather than
multiple recursion.
Indirect
recursion
Most basic examples of recursion,
and most of the examples presented here, demonstrate direct recursion,
in which a function calls itself. Indirect recursion occurs when a
function is called not by itself but by another function that it called (either
directly or indirectly). For example, if f calls f, that is
direct recursion, but if f calls g which calls f, then
that is indirect recursion of f. Chains of three or more functions are
possible; for example, function 1 calls function 2, function 2 calls function
3, and function 3 calls function 1 again.
Indirect recursion is also called mutual recursion,
which is a more symmetric term, though this is simply a different of emphasis,
not a different notion. That is, if f calls g and then g
calls f, which in turn calls g again, from the point of view of f
alone, f is indirectly recursing, while from the point of view of g
alone, it is indirectly recursing, while from the point of view of both, f
and g are mutually recursing on each other. Similarly a set of three or
more functions that call each other can be called a set of mutually recursive
functions.
Anonymous
recursion
Recursion is usually done by
explicitly calling a function by name. However, recursion can also be done via
implicitly calling a function based on the current context, which is
particularly useful for anonymous functions, and is known as anonymous recursion.
Structural
versus generative recursion
Some authors classify recursion as
either "structural" or "generative". The distinction is
related to where a recursive procedure gets the data that it works on, and how
it processes that data:
[Functions that consume structured
data] typically decompose their arguments into their immediate structural
components and then process those components. If one of the immediate
components belongs to the same class of data as the input, the function is
recursive. For that reason, we refer to these functions as (STRUCTURALLY)
RECURSIVE FUNCTIONS.[4]
Thus, the defining characteristic of
a structurally recursive function is that the argument to each recursive call
is the content of a field of the original input. Structural recursion includes
nearly all tree traversals, including XML processing, binary tree creation and
search, et cetera. By considering the algebraic structure of the natural
numbers (that is, a natural number is either zero or the successor of a natural
number), functions such as factorial may also be regarded as structural
recursion.
Generative recursion is the alternative:
This distinction is important in proving termination of a function. All structurally recursive functions on
finite (inductively defined) data structures can easily be shown to terminate, via structural induction: intuitively, each recursive call receives a smaller piece
of input data, until a base case is reached. Generatively recursive functions,
in contrast, do not necessarily feed smaller input to their recursive calls, so
proof of their termination is not necessarily as simple, and avoiding infinite loops
requires greater care. These generatively recursive functions can often be
interpreted as corecursive functions – each step generates the new data, such
as successive approximation in Newton's method – and terminating this
corecursion requires that the data eventually satisfy some condition, which is
not necessarily guaranteed.
In terms of loop variants,
structural recursion is when there is an obvious loop variant, namely size or
complexity, which starts off finite and decreases at each recursive step. By
contrast, generative recursion is when there is not such an obvious loop
variant, and termination depends on a function, such as "error of
approximation" that does not necessarily decrease to zero, and thus
termination is not guaranteed without further analysis.
Recursive
programs
Recursive
procedures
Factorial
|
function
factorial is:
input: integer n such that n>= 0
output: [n × (n-1) × (n-2) × … × 1]
1. if n is 0, return
1
2. otherwise, return
[ n × factorial(n-1) ]
end factorial
|
This evaluation of the recurrence
relation demonstrates the computation that would be performed in evaluating the
pseudocode above:
Computing
the recurrence relation for n = 4:
|
b4
= 4 * b3
= 4 * 3 * b2
= 4 * 3
* 2 * b1
= 4 * 3
* 2 * 1 * b0
= 4 * 3
* 2 * 1 * 1
= 4 * 3
* 2 * 1
= 4 * 3
* 2
= 4 * 6
= 24
|
This factorial function can also be
described without using recursion by making use of the typical looping
constructs found in imperative programming languages:
Pseudocode
(iterative):
|
function
factorial is:
input: integer n such that n>= 0
output: [n × (n-1) × (n-2) × … × 1]
1. create new variable
called running_total with a value of 1
2. begin loop
1. if n
is 0, exit loop
2. setrunning_total
to (running_total × n)
3. decrementn
4. repeat
loop
3. returnrunning_total
end factorial
|
The imperative code above is
equivalent to this mathematical definition using an accumulator variable t:
Greatest
common divisor
Function definition:
|
functiongcd is:
input: integer x,
integer y such that x>= y and y>= 0
1. if y is 0, returnx
2. otherwise, return
[ gcd( y, (remainder of x/y) ) ]
endgcd
|
Recurrence relation for greatest
common divisor, where
expresses the remainderof
:
Computing
the recurrence relation for x = 27 and y = 9:
|
gcd(27, 9) =
gcd(9, 27% 9)
=
gcd(9, 0)
= 9
|
Computing
the recurrence relation for x = 259 and y = 111:
|
gcd(259, 111) =
gcd(111, 259% 111)
=
gcd(111, 37)
=
gcd(37, 0)
= 37
|
The recursive program above is tail-recursive;
it is equivalent to an iterative algorithm, and the computation shown above
shows the steps of evaluation that would be performed by a language that
eliminates tail calls. Below is a version of the same algorithm using explicit
iteration, suitable for a language that does not eliminate tail calls. By
maintaining its state entirely in the variables x and y and using
a looping construct, the program avoids making recursive calls and growing the
call stack.
Pseudocode
(iterative):
|
functiongcd is:
input: integer x, integer y such that x>= y
and y>= 0
1. create new variable
called remainder
2. begin loop
1. if y
is zero, exit loop
2. setremainder
to the remainder of x/y
3. set
x to y
4. set
y to remainder
5. repeat
loop
3. returnx
endgcd
|
The iterative algorithm requires a
temporary variable, and even given knowledge of the Euclidean algorithm it is
more difficult to understand the process by simple inspection, although the two
algorithms are very similar in their steps.
Towers
of Hanoi
For a full discussion of this
problem's description, history and solution see the main article or one of the
many references.[6][7]
Simply put the problem is this: given three pegs, one with a set of N disks of
increasing size, determine the minimum (optimal) number of steps it takes to
move all the disks from their initial position to a single stack on another peg
without placing a larger disk on top of a smaller one.
Function definition:
Recurrence relation for hanoi:
Computing
the recurrence relation for n = 4:
|
hanoi(4) =
2*hanoi(3) + 1
=
2*(2*hanoi(2) + 1) + 1
=
2*(2*(2*hanoi(1) + 1) + 1) + 1
=
2*(2*(2*1 + 1) + 1) + 1
=
2*(2*(3) + 1) + 1
= 2*(7)
+ 1
= 15
|
Example Implementations:
|
functionhanoi
is:
input: integer n, such that n>= 1
1. if n is 1 then return
1
2. return [2 * [callhanoi(n-1)]
+ 1]
endhanoi
|
Although not all recursive functions have an explicit solution, the Tower of
Hanoi sequence can be reduced to an explicit formula.[8]
An
explicit formula for Towers of Hanoi:
|
h1 = 1
= 21 - 1
h2 = 3
= 22 - 1
h3 = 7
= 23 - 1
h4 = 15
= 24 - 1
h5 = 31
= 25 - 1
h6 = 63
= 26 - 1
h7 = 127 = 27 - 1
In general:
hn = 2n - 1, for all n >= 1
|
Binary
search
The binary search
algorithm is a method of searching a sorted array
for a single element by cutting the array in half with each recursive pass. The
trick is to pick a midpoint near the center of the array, compare the data at
that point with the data being searched and then responding to one of three
possible conditions: the data is found at the midpoint, the data at the
midpoint is greater than the data being searched for, or the data at the
midpoint is less than the data being searched for.
Recursion is used in this algorithm
because with each pass a new array is created by cutting the old one in half.
The binary search procedure is then called recursively, this time on the new
(and smaller) array. Typically the array's size is adjusted by manipulating a
beginning and ending index. The algorithm exhibits a logarithmic order of
growth because it essentially divides the problem domain in half with each
pass.
Example implementation of binary
search in C:
/*
Call binary_search with proper initial
conditions.
INPUT:
data
is an array of integers SORTED in ASCENDING order,
toFind
is the integer to search for,
count
is the total number of elements in the array
OUTPUT:
result
of binary_search
*/
int
search(int *data, inttoFind, int count)
{
//
Start = 0 (beginning index)
//
End = count - 1 (top index)
returnbinary_search(data,
toFind, 0, count-1);
}
/*
Binary
Search Algorithm.
INPUT:
data
is a array of integers SORTED in ASCENDING order,
toFind
is the integer to search for,
start
is the minimum array index,
end
is the maximum array index
OUTPUT:
position
of the integer toFind within array data,
-1 if not found
*/
intbinary_search(int
*data, inttoFind, int start, int end)
{
//Get the midpoint.
int
mid = start + (end - start)/2;
//Integer division
//Stop
condition.
if
(start > end)
return
-1;
else
if (data[mid] == toFind) //Found?
return
mid;
else
if (data[mid] >toFind) //Data
is greater than toFind, search lower half
returnbinary_search(data,
toFind, start, mid-1);
else //Data is less
than toFind, search upper half
returnbinary_search(data,
toFind, mid+1, end);
}
Recursive
data structures (structural recursion)
An important application of
recursion in computer science is in defining dynamic data structures such as lists and trees. Recursive data structures can dynamically grow to a
theoretically infinite size in response to runtime requirements; in contrast,
the size of a static array must be set at compile time.
"Recursive algorithms are
particularly appropriate when the underlying problem or the data to be treated
are defined in recursive terms." [9]
The examples in this section
illustrate what is known as "structural recursion". This term refers
to the fact that the recursive procedures are acting on data that is defined recursively.
As long as a programmer derives the
template from a data definition, functions employ structural recursion. That
is, the recursions in a function's body consume some immediate piece of a given
compound value.[5]
Linked
lists
Below is a C definition of a linked
list node structure. Notice especially how the node is defined in terms of
itself. The "next" element of struct node is a pointer to
another struct node, effectively creating a list type.
struct
node
{
int
data; // some integer data
struct
node *next; // pointer to another struct
node
};
Because the struct node data
structure is defined recursively, procedures that operate on it can be
implemented naturally as recursive procedures. The list_print procedure
defined below walks down the list until the list is empty (i.e., the list
pointer has a value of NULL). For each node it prints the data element (an
integer). In the C implementation, the list remains unchanged by the list_print
procedure.
voidlist_print(struct
node *list)
{
if
(list != NULL) // base case
{
printf
("%d ", list->data); //
print integer data followed by a space
list_print
(list->next); // recursive call on
the next node
}
}
Binary
trees
Below is a simple definition for a
binary tree node. Like the node for linked lists, it is defined in terms of
itself, recursively. There are two self-referential pointers: left (pointing to
the left sub-tree) and right (pointing to the right sub-tree).
struct
node
{
int
data; // some integer data
struct
node *left; // pointer to the left
subtree
struct
node *right; // point to the right
subtree
};
Operations on the tree can be
implemented using recursion. Note that because there are two self-referencing
pointers (left and right), tree operations may require two recursive calls:
//
Test if tree_node contains i; return 1 if so, 0 if not.
inttree_contains(struct
node *tree_node, int i) {
if
(tree_node == NULL)
return
0; // base case
else
if (tree_node->data == i)
return
1;
else
returntree_contains(tree_node->left,
i) || tree_contains(tree_node->right, i);
}
At most two recursive calls will be
made for any given call to tree_contains as defined above.
//
Inorder traversal:
voidtree_print(struct
node *tree_node) {
if
(tree_node != NULL) { //
base case
tree_print(tree_node->left); // go left
printf("%d
", tree_node->data); // print
the integer followed by a space
tree_print(tree_node->right); // go right
}
}
The above example illustrates an in-order traversal
of the binary tree. A Binary search tree is a special case of the binary tree where the data
elements of each node are in order.
Filesystem
traversal
Since the number of files in a filesystem
may vary, recursion is the only practical way to traverse and thus enumerate
its contents. Traversing a filesystem is very similar to that of tree traversal,
therefore the concepts behind tree traversal are applicable to traversing a
filesystem. More specifically, the code below would be an example of a preorder traversal of a filesystem.
import
java.io.*;
public
class FileSystem {
public
static void main (String [] args) {
traverse
();
}
/**
* Obtains the filesystem roots
* Proceeds with the
recurisvefilesystem traversal
*/
private
static void traverse () {
File [] fs = File.listRoots ();
for
(int i = 0; i <fs.length; i++) {
if
(fs[i].isDirectory () &&fs[i].canRead ()) {
rtraverse
(fs[i]);
}
}
}
/**
* Recursively traverse a given
directory
*
* @paramfd indicates the starting
point of traversal
*/
private
static void rtraverse (File fd) {
File [] fss = fd.listFiles ();
for
(int i = 0; i <fss.length; i++) {
System.out.println
(fss[i]);
if
(fss[i].isDirectory () &&fss[i].canRead ()) {
rtraverse
(fss[i]);
}
}
}
}
This code blends the lines, at least somewhat, between recursion and iteration.
It is, essentially, a recursive implementation, which is the best way to
traverse a filesystem. It is also an example of direct and indirect recursion.
The method "rtraverse" is purely a direct example; the method
"traverse" is the indirect, which calls "rtraverse." This
example needs no "base case" scenario due to the fact that there will
always be some fixed number of files and/or directories in a given filesystem.
Implementation
issues
In actual implementation, rather
than a pure recursive function (single check for base case, otherwise recursive
step), a number of modifications may be made, for purposes of clarity or
efficiency. These include:
- Wrapper function (at top)
- Short-circuiting the base case, aka "Arm's-length
recursion" (at bottom)
- Hybrid algorithm (at bottom) – switching to a different
algorithm once data is small enough
On the basis of elegance, wrapper
functions are generally approved, while short-circuiting the base case is
frowned on, particularly in academia. Hybrid algorithms are often used for
efficiency, to reduce the overhead of recursion in small cases, and
arm's-length recursion is a special case of this.
0 komentar:
Posting Komentar