Citizendia
Your Ad Here

In computer science, a subroutine (function, method, procedure, or subprogram) is a portion of code within a larger program, which performs a specific task and can be relatively independent of the remaining code. Computer science (or computing science) is the study and the Science of the theoretical foundations of Information and Computation and their In Computer science, source code (commonly just source or code) is any sequence of statements or declarations written in some Human-readable Computer programs (also software programs, or just programs) are instructions for a Computer. A task is "an execution path through address space" In other words a set of program instructions that are loaded in memory. The syntax of many programming languages includes support for creating self contained subroutines, and for calling and returning from them. In Linguistics, syntax (from Ancient Greek grc συν- syn-, "together" and grc τάξις táxis, "arrangement" is the In Computer programming, a return statement causes execution to leave the current Subroutine and resume at the point in the code immediately after where the subroutine

They are in many ways similar to functions, but can have side-effects outside of the simple "return value" that functions return (result in). The Mathematical concept of a function expresses dependence between two quantities one of which is given (the independent variable, argument of the function Some programming languages make very little syntactic distinction between functions and subroutines.

There are many advantages to breaking a program up into subroutines, including:

The components of a subroutine may include:

Many programming languages, such as Pascal , Fortran, Ada, distinguish between functions or function subprograms, which return values (via a return statement), and subroutines or procedures, which do not. A programming language is an Artificial language that can be used to write programs which control the behavior of a machine particularly a Computer. Pascal is an influential imperative and procedural Programming language, designed in 1968/9 and published in 1970 by Niklaus Wirth as a small Fortran (previously FORTRAN) is a general-purpose, procedural, imperative Programming language that is especially suited to Ada is a structured, Statically typed, imperative, and object-oriented high-level computer Programming language In Computer programming, a return statement causes execution to leave the current Subroutine and resume at the point in the code immediately after where the subroutine Some languages, such as C and Lisp, do not make this distinction, and treat those terms as synonymous. tags please moot on the talk page first! --> In Computing, C is a general-purpose cross-platform block structured Lisp (or LISP) is a family of Computer Programming languages with a long history and a distinctive fully parenthesized syntax The name method is commonly used in connection with object-oriented programming, specifically for subroutines that are part of objects; it is also used in conjunction with type classes. In Object-oriented programming, the term method refers to a Subroutine that is exclusively associated either with a class (called class methods Object-oriented programming (OOP is a Programming paradigm that uses " objects " and their interactions to design applications and computer programs In its simplest embodiment an object is an allocated region of storage In Computer science, a type class is a Type system construct that supports ad-hoc polymorphism.

Maurice Wilkes, David Wheeler, and Stanley Gill are credited with the invention of the subroutine (which they referred to as the closed subroutine). David John Wheeler FRS ( 9 February 1927 &ndash 13 December 2004) was a Computer scientist. Professor Stanley Gill (1926 - 1975 was a British Computer scientist credited along with Maurice Wilkes and David Wheeler, with the invention [1]

Contents

Early history

The first use of subprograms was on early computers that were programmed in machine code or assembly language, and did not support a call instruction. Machine code or machine language is a system of instructions and data executed directly by a Computer 's Central processing unit. See the terminology section below for information regarding inconsistent use of the terms assembly and assembler On these computers, subroutines had to be called by a sequence of lower level machine instructions, possibly implemented as a macro. A macro (from the Greek 'μάκρο' for long or far in Computer science is a rule or Pattern that specifies how a certain input sequence (often a sequence These instructions typically modified the program code, modifying the address of a branch at a standard location so that it behaved like an explicit return instruction. Even with this cumbersome approach subroutines proved very useful. The available memory on early computers was many orders of magnitude smaller than that available on today's computers, and non-trivial subroutines saved memory by reducing redundancy. An order of magnitude is the class of scale or magnitude of any amount where each class contains values of a fixed ratio to the class preceding it Soon, most architectures provided instructions to help with subroutine calls, leading to explicit call instructions.

Technical overview

A subprogram, as its name suggests, behaves in much the same way as a complete computer program, but on a smaller scale. Typically, the caller waits for subprograms to finish and continues execution only after a subprogram "returns". In Computer programming, a return statement causes execution to leave the current Subroutine and resume at the point in the code immediately after where the subroutine Subroutines are often given parameters to refine their behavior or to perform a certain computation with given values. In Computer programming, a parameter is a variable which takes on the meaning of a corresponding Argument (computer science is same article--> argument

No Stack

Early FORTRAN compilers were written for machines like the HP 2100 which did not support stacks (or recursion) with hardware stack registers. The HP 2100 was a series of Minicomputers produced by Hewlett-Packard from the mid 1960s to early 1990s A stack register is a computer central Processor register whose purpose is to keep track of a Call stack. The Jump to subroutine instruction had the following format:

------+-----+---------+--------
label   JSB   m[,I]     comments

The address for label is placed into the location represented by m and control transfers to the NEXT location, m+1. On completion of the subroutine, control may be returned to the normal sequence by performing a JMP m,I. This reserves a location at or before the start of a subroutine to save the return location. This did not require a separate stack, but did not support recursion since there is only one return storage location per subroutine. A similar technique was used by Lotus 1-2-3 to support a tree walk to compute recalculation dependencies, as a location was reserved in each cell to store the "return" address. Lotus 1-2-3 is a Spreadsheet program from Lotus Software (now part of IBM) Since circular references are not allowed for natural recalculation order, this allows a tree walk without reserving space for a stack in memory which was very limited on small computers such as the IBM PC. A circular reference, sometimes referred to as a run-around is a series of References where the last object references the first thus causing the whole series of references

Stack

Most implementations use a call stack to implement subroutine calls and returns. In Computer science, a call stack is a dynamic stack data structure which stores information about the active Subroutines of a Computer program

When an assembly language program executes a call, program flow jumps to another location, but the address of the next instruction (that is, the instruction that follows the call instruction in memory) is kept somewhere to use when returning. The IBM System/360 saved this address in a processor register, relying on convention to save and restore registers and return addresses in memory associated with individual subroutines, then using branches to the address specified in the register to accomplish a subroutine return. The IBM System/360 ( S/360) is a Mainframe computer system family announced by IBM on April 7, 1964. In Computer architecture, a processor register is a small amount of storage available on the CPU whose contents can be accessed more quickly than storage In postal Mail, a return address is an explicit inclusion of the address of the person sending the message In Computer programming, a return statement causes execution to leave the current Subroutine and resume at the point in the code immediately after where the subroutine

Compilers for most languages use a push-down stack and support recursive subroutine calls (each call is given a fresh new location to store the return address). In Computer science, a stack is an Abstract data type and Data structure based on the principle of Last In First Out (LIFO Recursion, in Mathematics and Computer science, is a method of defining functions in which the function being defined is applied within its own definition In a stack based architecture, the return address is 'pushed' as a point of return on the stack. The subroutine 'returns' by 'popping' a return value from the top of the stack, which reads the previously pushed return address and jumps to it, so that program flow continues immediately after the call instruction. Most RISC and VLIW architectures save the return address in a link register (as the IBM 360 did), but simulate a stack with load and store instructions rather than with push and pop instructions. Very Long Instruction Word or VLIW refers to a CPU architecture designed to take advantage of Instruction level parallelism (ILP A link register, in many CPU architectures such as the PowerPC, ARM, and the PA-RISC, is a special purpose register which holds The disadvantage of such a scheme is that the stack can overflow if recursion takes place at too many levels, or if the variables on each stack frame are too large. If there is not sufficient stack space, and there is no recursion, a tree-walk can be simulated with an iterative algorithm, storing return locations at each tree node, as was done with Lotus 1-2-3 and work-alike clone, The Twin which were based on PC's with very limited stack space. Lotus 1-2-3 is a Spreadsheet program from Lotus Software (now part of IBM)

This section deals with the modern implementation of having subroutine data stored on one or more stacks.

Due to usage of a stack, a subroutine can call itself (see recursion) or other subroutines (nested calls), and of course it can call the same subroutine from several distinct places. Recursion, in Mathematics and Computer science, is a method of defining functions in which the function being defined is applied within its own definition Assembly languages generally do not provide programmers with such conveniences as local variables or subroutine parameters. They get to be implemented by passing values in registers or pushing them onto the stack (or another stack, if there is more than one).

When there is just one stack, the return addresses must be placed in the same space as the parameters and local variables. In Computer programming, a parameter is a variable which takes on the meaning of a corresponding Argument (computer science is same article--> argument In Computer science, a local variable is a Variable that is given local scope. Hence, a typical stack may look like this (for a case where function1 calls function2):

This is with a forwards-growing stack — on many architectures the stack grows backwards in memory. Forward and backwards-growing stacks are useful because it is quite practical to have two stacks growing towards each other in a common scratch space, using one mainly for control information like return addresses and loop counters and the other for data. (This is what Forth does. Forth is a structured, imperative, stack-based, computer Programming language and programming environment )

The parts of the program which are responsible for the entry into and exit out of the subroutine (and hence, the setting up and removal of each stack frame) are called the function prologue and epilogue. In Assembly language programming, the function prologue is a few lines of code which appear at the beginning of a function which prepare the stack and

If the procedure or function itself uses stack handling commands, outside of the prologue and epilogue, e. g. to store intermediate calculation values, the programmer needs to keep track of the number of 'push' and 'pop' instructions so as not to corrupt the original return address.

Side-effects

In most imperative programming languages, subprograms may have so-called side-effects; that is, they may cause changes that remain after the subprogram has returned. In Computer science, imperative programming is a Programming paradigm that describes computation in terms of statements that change a program state In Computer science, a function or expression is said to produce a side effect if it modifies some state in addition to returning a value It can be technically very difficult to predict whether a subprogram has a side-effect or not. In imperative programming, compilers usually assume every subprogram has a side-effect to avoid complex analysis of execution paths. A compiler is a Computer program (or set of programs that translates text written in a computer language (the source language) into another Because of its side-effects, a subprogram may return different results each time it is called, even if it is called with the same arguments. In Computer programming, a parameter is a variable which takes on the meaning of a corresponding Argument (computer science is same article--> argument A simple example is a subprogram that implements a pseudorandom number generator; that is, a subprogram that returns a random number each time it is called. A pseudorandom number generator ( PRNG) is an Algorithm for generating a sequence of numbers that approximates the properties of random numbers

In pure functional programming languages such as Haskell, subprograms can have no side effects, and will always return the same result if repeatedly called with the same arguments. In Computer science, functional programming is a Programming paradigm that treats Computation as the evaluation of mathematical functions and Haskell is a standardized Purely functional Programming language with non-strict semantics, named after the Logician Haskell Curry Such languages typically only support functions, since subroutines that do not return a value have no use unless they can cause a side effect. In functional programming writing to a file is a side effect.

C and C++ examples

In the C and C++ programming languages, subprograms are referred to as "functions" (or "methods" when associated with a class). tags please moot on the talk page first! --> In Computing, C is a general-purpose cross-platform block structured C++ (" C Plus Plus " ˌsiːˌplʌsˈplʌs is a general-purpose Programming language. In Object-oriented programming, a class is a Programming language construct that is used as a blueprint to create objects This blueprint includes attributes Note that these languages use the special keyword void to indicate that a function takes no parameters (especially in C) and/or does not return any value. Note that C/C++ functions can have side-effects, including modifying any variables whose addresses are passed as parameters (i. e. "passed by reference"). Examples:

void function1(void) { /* some code */ }

The function does not return a value and has to be called as a stand-alone function, e. g. , function1();

int function2(void)
   { return 5; }

This function returns a result (the number 5), and the call can be part of an expression, e. g. , x + function2()

char function3 (int number)
   { char selection[] = {'S','M','T','W','T','F','S'}; 
     return selection[number];
   }

This function converts a number between 0 to 6 into the initial letter of the corresponding day of the week, namely 0 → 'S', 1 → 'M', . . . , 6 → 'S'. The result of calling it might be assigned to a variable, e. g. , num_day = function3(number);.

void function4 (int* pointer_to_var)
   { (*pointer_to_var)++; }

This function does not return a value but modifies the variable whose address is passed as the parameter; it would be called with "function4(&variable_to_increment);".

Local variables, recursion and re-entrancy

A subprogram may find it useful to make use of a certain amount of "scratch" space; that is, memory used during the execution of that subprogram to hold intermediate results. Variables stored in this scratch space are referred to as local variables, and the scratch space itself is referred to as an activation record. An activation record typically has a return address that tells it where to pass control back to when the subprogram finishes. In postal Mail, a return address is an explicit inclusion of the address of the person sending the message

A subprogram may have any number and nature of call sites. If recursion is supported, a subprogram may even call itself, causing its execution to suspend while another nested execution of the same subprogram occurs. Recursion is a useful technique for simplifying some complex algorithms, and breaking down complex problems. Recursion, in Mathematics and Computer science, is a method of defining functions in which the function being defined is applied within its own definition Recursive languages generally provide a new copy of local variables on each call. If the programmer desires the value of local variables to stay the same between calls, they can be declared "static" in some languages, or global values or common areas can be used.

Early languages like Fortran did not initially support recursion because variables were statically allocated, as well as the location for the return address. Fortran (previously FORTRAN) is a general-purpose, procedural, imperative Programming language that is especially suited to Most computers before the late 1960s such as the PDP-8 did not have support for hardware stack registers. The PDP-8 was the first successful commercial Minicomputer, produced by Digital Equipment Corporation (DEC in the 1960s

Modern languages after ALGOL such as Pl/1 and C almost invariably use a stack, usually supported most modern computer instruction sets to provide a fresh activation record for every execution of a subprogram. That way, the nested execution is free to modify its local variables without concern for the effect on other suspended executions in progress. As nested calls accumulate, a call stack structure is formed, consisting of one activation record for each suspended subprogram. In Computer science, a call stack is a dynamic stack data structure which stores information about the active Subroutines of a Computer program In fact, this stack structure is virtually ubiquitous, and so activation records are commonly referred to as stack frames. In Computer science, a call stack is a dynamic stack data structure which stores information about the active Subroutines of a Computer program

Some languages such as Pascal and Ada also support nested subroutines, which are subroutines callable only within the scope of an outer (parent) subroutine. In Computer programming, a nested function (or nested procedure / subroutine) is a function which is lexically (textually encapsulated within In Computer programming, scope is an enclosing context where values and expressions are associated Inner subroutines have access to the local variables of the outer subroutine which called them. This is accomplished by storing extra context information within the activation record, also known as a display.

If a subprogram can function properly even when called while another execution is already in progress, that subprogram is said to be re-entrant. A recursive subprogram must be re-entrant. Re-entrant subprograms are also useful in multi-threaded situations, since multiple threads can call the same subprogram without fear of interfering with each other. A thread in Computer science is short for a thread of execution.

In a multi-threaded environment, there is generally more than one stack. A thread in Computer science is short for a thread of execution. An environment which fully supports coroutines or lazy evaluation may use data structures other than stacks to store their activation records. In Computer science, coroutines are program components that generalize Subroutines to allow multiple entry points and suspending and resuming of execution at certain In Computer programming, lazy evaluation (or delayed evaluation) is the technique of delaying a computation until such time as the result of the computation is

Overloading

It is sometimes desirable to have a number of functions with the same name, but operating on different types of data, or with different parameter profiles. For example, a square root function might be defined to operate on reals, complex values or matrices. The algorithm to be used in each case is different, and the return result may be different. By writing three separate functions with the same name, the programmer has the convenience of not having to remember different names for each type of data. Further if a subtype can be defined for the reals, to separate positive and negative reals, two functions can be written for the reals, one to return a real when the parameter is positive, and another to return a complex value when the parameter is negative.

When a series of functions with the same name can accept different parameter profiles or parameters of different types, each of the functions is said to be overloaded. Method overloading is a feature found in various Programming languages such as Ada, C#, C++ and Java that allows the creation

As another example, a subroutine might construct an object that will accept directions, and trace its path to these points on screen. There are a plethora of parameters that could be passed in to the constructor (colour of the trace, starting x and y co-ordinates, trace speed). If the programmer wanted the constructor to be able to accept only the color parameter, then he could call another constructor that accepts only color, which in turn calls the constructor with all the parameters passing in a set of "default values" for all the other parameters (X and Y would generally be centered on screen or placed at the origin, and the speed would be set to another value of the coder's choosing).

Conventions

A number of conventions for the coding of subprograms have been developed. It has been commonly preferable that the name of a subprogram should be a verb when it does a certain task, an adjective when it makes some inquiry, and a noun when it is used to substitute variables and such.

Experienced programmers recommend that a subprogram perform only one task. If a subprogram performs more than one task, it should be split up into more subprograms. They argue that subprograms are key components in maintaining code and their roles in the program must be distinct.

Some advocate that each subprogram should have minimal dependency on other pieces of code. For example, they see the use of global variables as unwise because it adds tight-coupling between subprograms and global variables. In Computer programming, a global variable is a variable that is accessible in every scope. If such coupling is not necessary at all, they advise to refactor subprograms to take parameters instead. This practice is controversial because it tends to increase the number of passed parameters to subprograms.

Efficiency and inlining

There is a runtime overhead associated with passing parameters, calling the subprogram, and returning. The actual overhead for each invocation depends on the local context at the point of call and the requirements specified in the architecture's application binary interface. In Computer software, an application binary interface ( ABI) describes the low-level interface between an application program and the Operating system

One technique used to minimise this overhead is inline expansion of the subprogram call site. In Computing, inline expansion, or inlining, is a Compiler optimization that replaces a function Call site with the body of the Callee In programming a call site of a function is a line in the code which calls (or may call through Dynamic dispatch) a function However, inlining often increases code size and can introduce cache misses into a previously optimised block of code.

Dynamic dispatch can introduce further overheads - although the performance difference between indirect and direct calls on commodity CPUs has narrowed since the 1980s, because of research and work done by CPU designers (driven by the increasing popularity of object-oriented programming, which uses dynamic dispatch extensively). In Computer science, dynamic dispatch is the process of mapping a message to a specific sequence of code ( method) at Runtime. Year 1980 ( MCMLXXX) was a Leap year starting on Tuesday (link displays the 1980 Gregorian calendar) Also, software techniques have been developed to make dynamic dispatch more efficient.

Related terms and clarification

Different programming languages and methodologies possess notions and mechanisms related to subprograms:

People who write compilers, and people who write in assembly language, deal with the low-level details involved in implementing subroutines:

Most subprograms typically implement the idea of an algorithm -- they are given a finite amount of information when they start, they are given no more information while they run, and they give out no information until they end. In Mathematics, Computing, Linguistics and related subjects an algorithm is a sequence of finite instructions often used for Calculation However, each process, job, task, or thread is a program or subprogram that typically sends and receives information many times before it ends. In computing a process is an instance of a Computer program that is being sequentially executed by a computer system that has the ability to run several computer In computing a job (or process) is a term used to refer to a single instance of a program. A task is "an execution path through address space" In other words a set of program instructions that are loaded in memory. A thread in Computer science is short for a thread of execution.

See also

References

  1. ^ Wilkes, M. The Mathematical concept of a function expresses dependence between two quantities one of which is given (the independent variable, argument of the function In Object-oriented programming, the term method refers to a Subroutine that is exclusively associated either with a class (called class methods Modular programming is a software design technique that increases the extent to which software is composed from separate parts called modules In Computer science, transclusion is the inclusion of part of a document into another document by reference In Computer programming, operator overloading (less commonly known as operator Ad-hoc polymorphism) is a specific case of polymorphism in V. ; Wheeler, D. J. , Gill, S. (1951). Preparation of Programs for an Electronic Digital Computer. Addison-Wesley.  

© 2009 citizendia.org; parts available under the terms of GNU Free Documentation License, from http://en.wikipedia.org
Dapyx Software network: MP3 Explorer | Ebook Manager | Zenithic