
A pipeline of three programs run on a text terminal

In Unix-like computer operating systems, a pipeline is the original software pipeline: a set of processes chained by their standard streams, so that the output of each process (stdout) feeds directly as input (stdin) to the next one. Each connection is implemented by an anonymous pipe. Filter programs are often used in this configuration. The concept was invented by Douglas McIlroy for Unix shells, and it was named by analogy to a physical pipeline.

Examples

Simple example

ls -l | less

In this example, ls is the Unix directory lister, and less is an interactive text pager with searching capabilities. The pipeline lets the user scroll up and down a directory listing that may not fit on the screen.

Pipelines ending in less (or more, a similar text pager) are among the most commonly used. They let the user navigate potentially large (or infinite) amounts of text, which otherwise would have scrolled past the top of the terminal and been lost. Put differently, they relieve programmers from the burden of implementing text pagers in their applications: they can pipe output through less, or assume that the user will do so when needed.

Complex example

Below is an example of a pipeline that implements a kind of spell checker for the web resource indicated by a URL. It fetches the page, reduces it to a sorted list of unique lowercase words, and prints the words that do not appear in the system word list. The word list is read from /usr/dict/words; some machines have /usr/share/dict/words instead.

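curl "http://en.wikipedia.org/wiki/Pipeline_(Unix)" |   # fetch the HTML of the page
sed 's/[^a-zA-Z ]/ /g' |      # replace everything except letters and spaces with spaces
tr 'A-Z ' 'a-z\n' |           # lowercase the letters, turn spaces into newlines
grep '[a-z]' |                # drop the empty lines
sort -u |                     # sort the words and remove duplicates
comm -23 - /usr/dict/words    # print the words that are not in the word list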
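
The same chain of filters can be tried without network access. The following is a minimal sketch, not part of the original example: the function name misspelled, the sample sentence, and the four-word dictionary are made up for illustration, and mktemp is assumed to be available.

```shell
# misspelled: hypothetical helper. Reads text on stdin and prints the words
# that are absent from the sorted word list named by $1.
misspelled() {
    sed 's/[^a-zA-Z ]/ /g' |
    tr 'A-Z ' 'a-z\n' |
    grep '[a-z]' |
    sort -u |
    comm -23 - "$1"
}

dict=$(mktemp)                                 # made-up word list for the demo
printf '%s\n' a is pipeline this > "$dict"     # comm expects sorted input
echo 'This is a pipline, a pipeline!' | misspelled "$dict"   # prints: pipline
rm -f "$dict"
```

Because comm compares two sorted inputs line by line, both the word list and the output of sort must use the same collation order.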

Pipelines in command line interfaces

Most Unix shells have a special syntax construct for the creation of pipelines. Typically, one simply writes the filter commands in sequence, separated by the ASCII vertical bar character "|" (which, for this reason, is often called "pipe character" by Unix users). The shell starts the processes and arranges for the necessary connections between their standard streams (including some amount of buffer storage).

Error stream

By default, the standard error streams ("stderr") of the processes in a pipeline are not passed on through the pipe; instead, they are merged and directed to the console. However, many shells have additional syntax for changing this behaviour. In the csh shell, for instance, using "|&" instead of "|" signifies that the standard error stream too should be merged with the standard output and fed to the next process. The Bourne shell can also merge standard error, using 2>&1, as well as redirect it to a different file.
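
The difference can be observed with a small experiment in a Bourne-style shell. This is only a sketch; emit is a hypothetical function that writes one line to each stream.

```shell
# emit: hypothetical helper that writes one line to stdout and one to stderr.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Default: only stdout goes through the pipe, so wc counts 1 line and
# "to stderr" appears directly on the terminal.
emit | wc -l

# 2>&1 makes stderr a duplicate of stdout (the pipe), so wc counts 2 lines.
emit 2>&1 | wc -l
```

Writing 2>somefile in place of 2>&1 would instead send the error stream to a file and leave the pipe carrying standard output only.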

Pipemill

In the most commonly used simple pipelines the shell connects a series of sub-processes via pipes, and executes external commands within each sub-process. Thus the shell itself is doing no direct processing of the data flowing through the pipeline.

However, it's possible for the shell to perform processing directly. This construct generally looks something like:

command | while read var1 var2 ...; do
   # process each line, using variables as parsed into $var1, $var2, etc.
done

This construct is referred to as a "pipemill" (since the while loop is "milling" over the results from the initial command).
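
A concrete, runnable instance of the pattern might look like the sketch below. The names list_people and describe_people and the sample records are invented for illustration; list_people merely stands in for "command".

```shell
# list_people: hypothetical stand-in for the command that feeds the pipe.
list_people() {
    printf '%s\n' 'alice 30' 'bob 25'
}

# describe_people: the pipemill itself. read splits each incoming line
# into $name and $age; -r keeps backslashes in the input literal.
describe_people() {
    list_people | while read -r name age; do
        echo "$name is $age years old"
    done
}

describe_people
# prints:
#   alice is 30 years old
#   bob is 25 years old
```

In many shells (bash by default, for instance) the loop on the right-hand side of the pipe runs in a subshell, so variables assigned inside it are not visible after the loop ends. Results are therefore best printed from within the loop rather than collected in a variable.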

Example of Pipemill

find / /usr /var -mount -user foo -printf "%m %p\n" | while read -r mode filename; do
    chown "$NEWOWNER" "$filename"   # some chown versions strip permission bits here
    chmod "$mode" "$filename"       # restore the mode that find recorded
done

(This example traverses the directory trees, changing the ownership of all files belonging to the user "foo" while preserving all permissions, including those that are often stripped off by many versions of the chown command.)

There are a number of variations of the pipemill construct including:

ps lax | { read -r x; while read -r x owner pid parent x x x x x stat x; do
   # the spaces around "=" are required; without them each test is always true
   [ "$owner" = "foo" ] && [ "$stat" = "Z" ] && kill "$parent"
done
}

(This example kills the parent processes for zombies owned/created by the user "foo").

Here the while loop is enclosed in a command group (the braces) and preceded by a read command, which effectively "throws away" the first line from the ps command. (Of course, in this particular example it would be harmless to process the header line, as it would not match the "$owner" test.) Note that the other references to the "x" variable are simply placeholders for "throwing away" irrelevant fields from each line.
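
The header-skipping variation can be tried on fixed input, without touching any real processes. In this sketch the function name sum_sizes and the sample table are assumptions made for illustration.

```shell
# sum_sizes: hypothetical helper. Expects a header line followed by
# "name size" records on stdin and prints the total of the size column.
sum_sizes() {
    {
        read -r x                     # throw away the header line
        total=0
        while read -r x size; do      # x is a placeholder for the name field
            total=$((total + size))
        done
        echo "$total"                 # print inside the group (it may be a subshell)
    }
}

printf '%s\n' 'NAME SIZE' 'a.txt 10' 'b.txt 32' | sum_sizes   # prints: 42
```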

The defining characteristics of a "pipemill" are: some command or series of commands feeds data into a pipe from which a shell while loop reads and processes it.

Creating pipelines programmatically

Pipelines can be created under program control. The pipe() system call asks the operating system to construct a new anonymous pipe object. This results in two new, opened file descriptors in the process: the read-only end of the pipe, and the write-only end. The pipe ends appear to be normal, anonymous file descriptors, except that they have no ability to seek.

To avoid deadlock and exploit parallelism, the process with one or more new pipes will then, generally, call fork() to create new processes. Each process will then close the end(s) of the pipe that it will not be using before producing or consuming any data. Alternatively, a process might create a new thread and use the pipe to communicate between them.

Named pipes may also be created using mkfifo() or mknod() and then presented as the input or output file to programs as they are invoked. They allow multi-path pipes to be created, and are especially effective when combined with standard error redirection, or with tee.
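
From the shell, the mkfifo command gives access to the same facility. The sketch below builds a two-path pipeline with tee: one copy of the data is sorted, the other is counted. The function name count_and_sort is hypothetical, and mktemp -d is assumed to be available for creating a scratch directory.

```shell
# count_and_sort: hypothetical helper. Reads lines on stdin, prints them
# sorted, then prints how many lines there were.
count_and_sort() {
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/fifo"

    # Second path: a background reader counts the lines arriving on the FIFO.
    wc -l < "$dir/fifo" | tr -d ' ' > "$dir/count" &

    # First path: tee copies stdin into the FIFO and also passes it to sort.
    tee "$dir/fifo" | sort

    wait "$!"                 # let the background counter finish
    cat "$dir/count"
    rm -r "$dir"
}

printf '%s\n' pear apple fig | count_and_sort
# prints apple, fig, pear and then 3, one per line
```

Opening a FIFO blocks until both a reader and a writer are present, which is why the reader is started in the background before tee opens the FIFO for writing.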

Implementation

In most Unix-like systems, all processes of a pipeline are started at the same time, with their streams appropriately connected, and managed by the scheduler together with all other processes running on the machine. An important aspect of this, setting Unix pipes apart from other pipe implementations, is the concept of buffering: a sending program may produce 5000 bytes per second, and a receiving program may only be able to accept 100 bytes per second, but no data are lost. Instead, the output of the sending program is held in a buffer, or queue. When the receiving program is ready to read data, the operating system sends it data from the buffer, then removes that data from the buffer. If the buffer fills up, the sending program is suspended (blocked) until the receiving program has had a chance to read some data and make room in the buffer.
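
The blocking behaviour can be illustrated with a fast writer and a deliberately slow reader. This is a sketch under assumptions: produce and consume_slowly are hypothetical names, and the output (several hundred kilobytes) is assumed to be larger than the pipe buffer, whose size varies between systems.

```shell
# produce: hypothetical fast writer that emits 100000 numbered lines.
produce() {
    awk 'BEGIN { for (i = 1; i <= 100000; i++) print i }'
}

# consume_slowly: hypothetical reader that waits one second before reading
# anything, then counts the lines it receives.
consume_slowly() {
    sleep 1
    wc -l
}

# The writer fills the buffer and is suspended until the reader starts
# draining it; wc still reports 100000 lines, so nothing was dropped.
produce | consume_slowly
```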

Network pipes

Tools like netcat and socat can connect pipes to TCP/IP sockets, following the Unix philosophy of "everything is a file".

History

The pipeline concept and the vertical-bar notation were invented by Douglas McIlroy, one of the authors of the early command shells, after he noticed that much of the time they were processing the output of one program as the input to another. His ideas were implemented in 1973 when Ken Thompson added pipes to the UNIX operating system. [1] The idea was eventually ported to other operating systems, such as DOS, OS/2, Microsoft Windows, and BeOS, often with the same notation.

The robot in the icon for Apple's Automator, which also uses a pipeline concept to chain repetitive commands together, holds a pipe.

Other operating systems

This feature of Unix was borrowed by other operating systems, such as Taos and MS-DOS, and eventually became the pipes and filters design pattern of software engineering.

External links

pipe: create an interprocess channel – System Interfaces Reference, The Single UNIX® Specification, Issue 6 from The Open Group

References

  1. Pipes: A Brief Introduction, The Linux Information Project (LINFO). http://www.linfo.org/pipe.html
© 2009 citizendia.org; parts available under the terms of GNU Free Documentation License, from http://en.wikipedia.org