Traditionally, the term neural network had been used to refer to a network or circuit of biological neurons. The modern usage of the term often refers to artificial neural networks, which are composed of artificial neurons or nodes. Thus the term has two distinct usages: biological neural networks, made up of biological neurons, and artificial neural networks, made up of artificial neurons.
This article focuses on the relationship between the two concepts; the other two articles cover details of the two different concepts.
In general a biological neural network is composed of a group or groups of chemically connected or functionally associated neurons. A single neuron may be connected to many other neurons and the total number of neurons and connections in a network may be extensive. Connections, called synapses, are usually formed from axons to dendrites, though dendrodendritic microcircuits[1] and other connections are possible. Apart from the electrical signaling, there are other forms of signaling that arise from neurotransmitter diffusion, which have an effect on electrical signaling. As such, neural networks are extremely complex. Whilst a detailed description of neural systems is nebulous, progress is being charted towards a better understanding of basic mechanisms.
Artificial intelligence and cognitive modeling try to simulate some properties of neural networks. While similar in their techniques, the former has the aim of solving particular tasks, while the latter aims to build mathematical models of biological neural systems.
In the artificial intelligence field, artificial neural networks have been applied successfully to speech recognition, image analysis and adaptive control, in order to construct software agents (in computer and video games) or autonomous robots. Most of the currently employed artificial neural networks for artificial intelligence are based on statistical estimation, optimization and control theory.
The cognitive modelling field involves the physical or mathematical modelling of the behaviour of neural systems, ranging from the individual neural level (e.g. modelling the spike response curves of neurons to a stimulus), through the neural cluster level (e.g. modelling the release and effects of dopamine in the basal ganglia) to the complete organism (e.g. behavioural modelling of the organism's response to stimuli).
Neural networks, as used in artificial intelligence, have traditionally been viewed as simplified models of neural processing in the brain, even though the relation between this model and brain biological architecture is debated.
A subject of current research in theoretical neuroscience is the question surrounding the degree of complexity and the properties that individual neural elements should have to reproduce something resembling animal intelligence.
Historically, computers evolved from the von Neumann architecture, which is based on sequential processing and execution of explicit instructions. On the other hand, the origins of neural networks are based on efforts to model information processing in biological systems, which may rely largely on parallel processing as well as implicit instructions based on recognition of patterns of 'sensory' input from external sources. In other words, at its very heart a neural network is a complex statistical processor (as opposed to being tasked to sequentially process and execute).
An artificial neural network (ANN), also called a simulated neural network (SNN) or commonly just neural network (NN), is an interconnected group of artificial neurons that uses a mathematical or computational model for information processing based on a connectionistic approach to computation. In most cases an ANN is an adaptive system that changes its structure based on external or internal information that flows through the network.
In more practical terms neural networks are non-linear statistical data modeling or decision making tools. They can be used to model complex relationships between inputs and outputs or to find patterns in data.
An artificial neural network involves a network of simple processing elements (artificial neurons) which can exhibit complex global behaviour, determined by the connections between the processing elements and element parameters. One classical type of artificial neural network is the Hopfield net.
In a neural network model, simple nodes, which can be called variously "neurons", "neurodes", "Processing Elements" (PE) or "units", are connected together to form a network of nodes, hence the term "neural network". While a neural network does not have to be adaptive per se, its practical use comes with algorithms designed to alter the strength (weights) of the connections in the network to produce a desired signal flow.
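As a minimal sketch of a single node with weighted connections, the snippet below computes a weighted sum of its inputs and squashes it with a logistic function. The name `neuron_output`, the logistic activation and the example weights are illustrative assumptions; the text does not prescribe a particular node function.

```python
import math

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs, squashed into (0, 1) by a logistic function."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# Hypothetical connection strengths; a learning algorithm would adjust these.
weights = [0.5, -0.5]
bias = 0.0

# The two weighted inputs cancel, so the weighted sum is 0 and the output is 0.5.
print(neuron_output([1.0, 1.0], weights, bias))
```

Changing a weight changes how strongly the corresponding input influences the output, which is the sense in which learning algorithms "alter the strength of the connections".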
In modern software implementations of artificial neural networks the approach inspired by biology has more or less been abandoned for a more practical approach based on statistics and signal processing. In some of these systems neural networks, or parts of neural networks (such as artificial neurons), are used as components in larger systems that combine both adaptive and non-adaptive elements.
The concept of a neural network appears to have first been proposed by Alan Turing in his 1948 paper "Intelligent Machinery".
The utility of artificial neural network models lies in the fact that they can be used to infer a function from observations and also to use it. This is particularly useful in applications where the complexity of the data or task makes the design of such a function by hand impractical.
The tasks to which artificial neural networks are applied tend to fall within the following broad categories:
Application areas include system identification and control (vehicle control, process control), game-playing and decision making (backgammon, chess, racing), pattern recognition (radar systems, face identification, object recognition, etc.), sequence recognition (gesture, speech, handwritten text recognition), medical diagnosis, financial applications, data mining (or knowledge discovery in databases, "KDD"), visualization and e-mail spam filtering.
Main article: Neural network software
Neural network software is used to simulate, research, develop and apply artificial neural networks, biological neural networks and in some cases a wider array of adaptive systems.
There are three major learning paradigms, each corresponding to a particular abstract learning task. These are supervised learning, unsupervised learning and reinforcement learning. Usually any given type of network architecture can be employed in any of those tasks.
In supervised learning, we are given a set of example pairs (x, y), where x is an input and y the corresponding target, and the aim is to find a function f in the allowed class of functions that matches the examples. In other words, we wish to infer the mapping implied by the data; the cost function is related to the mismatch between our mapping and the data.
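The idea of searching a class of functions for the one that best matches the examples can be sketched as follows. This is an illustrative sketch, not the method of any particular network: the allowed class is assumed to be straight lines f(x) = a*x + c, the cost is assumed to be mean squared error, and the example pairs are made up.

```python
def mean_squared_error(a, c, pairs):
    """Cost: average squared mismatch between f(x) = a*x + c and the targets."""
    return sum((a * x + c - y) ** 2 for x, y in pairs) / len(pairs)

def fit_line(pairs, learning_rate=0.05, steps=2000):
    """Search the class of lines f(x) = a*x + c by gradient descent on the cost."""
    a, c = 0.0, 0.0
    n = len(pairs)
    for _ in range(steps):
        # Partial derivatives of the mean squared error with respect to a and c.
        grad_a = sum(2 * (a * x + c - y) * x for x, y in pairs) / n
        grad_c = sum(2 * (a * x + c - y) for x, y in pairs) / n
        a -= learning_rate * grad_a
        c -= learning_rate * grad_c
    return a, c

# Hypothetical example pairs (x, y) generated by y = 2x + 1.
pairs = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
a, c = fit_line(pairs)
print(round(a, 3), round(c, 3), mean_squared_error(a, c, pairs))
```

A neural network plays the same role as the line here, only with a much richer class of functions and many more parameters.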
In unsupervised learning we are given some data x and a cost function to be minimized, which can be any function of x and the network's output, f. The cost function is determined by the task formulation. Most applications fall within the domain of estimation problems such as statistical modeling, compression, filtering, blind source separation and clustering.
In reinforcement learning, data x is usually not given, but generated by an agent's interactions with the environment. At each point in time t, the agent performs an action y_t and the environment generates an observation x_t and an instantaneous cost c_t, according to some (usually unknown) dynamics. The aim is to discover a policy for selecting actions that minimises some measure of a long-term cost, i.e. the expected cumulative cost. The environment's dynamics and the long-term cost for each policy are usually unknown, but can be estimated. ANNs are frequently used in reinforcement learning as part of the overall algorithm. Tasks that fall within the paradigm of reinforcement learning are control problems, games and other sequential decision making tasks.
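The interaction loop described above can be sketched as follows. This sketch only evaluates two fixed policies by their cumulative cost; it does not learn a policy. The environment is a made-up, deterministic one-dimensional world (real dynamics are usually unknown and often stochastic), and all function names are illustrative assumptions.

```python
def environment_step(state, action):
    """Hypothetical dynamics: the action shifts a position; cost is distance from 0."""
    next_state = state + action
    observation = next_state   # x_t: the agent observes the new position
    cost = abs(next_state)     # c_t: instantaneous cost
    return next_state, observation, cost

def toward_zero_policy(observation):
    """Choose the action y_t that moves the position toward the origin."""
    if observation > 0:
        return -1
    if observation < 0:
        return 1
    return 0

def always_right_policy(observation):
    """A deliberately poor policy that ignores the observation."""
    return 1

def cumulative_cost(policy, start_state, horizon):
    """Run the agent-environment loop and add up the instantaneous costs."""
    state, observation, total = start_state, start_state, 0
    for _ in range(horizon):
        action = policy(observation)
        state, observation, cost = environment_step(state, action)
        total += cost
    return total

print(cumulative_cost(toward_zero_policy, 3, 10))   # prints 3
print(cumulative_cost(always_right_policy, 3, 10))  # prints 85
```

A reinforcement learning algorithm would search for a policy with low cumulative cost rather than being handed one, typically while estimating the unknown dynamics or long-term costs from experience.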
There are many algorithms for training neural networks; most of them can be viewed as a straightforward application of optimization theory and statistical estimation.
Evolutionary computation methods, simulated annealing, expectation maximization and non-parametric methods are among other commonly used methods for training neural networks. See also machine learning.
Recent developments in this field have also seen particle swarm optimization and other swarm intelligence techniques used in the training of neural networks.
Theoretical and computational neuroscience is the field concerned with the theoretical analysis and computational modeling of biological neural systems. Since neural systems are intimately related to cognitive processes and behaviour, the field is closely related to cognitive and behavioural modeling.
The aim of the field is to create models of biological neural systems in order to understand how biological systems work. To gain this understanding, neuroscientists strive to make a link between observed biological processes (data), biologically plausible mechanisms for neural processing and learning (biological neural network models) and theory (statistical learning theory and information theory).
Many models are used in the field, each defined at a different level of abstraction and trying to model different aspects of neural systems. They range from models of the short-term behaviour of individual neurons, through models of how the dynamics of neural circuitry arise from interactions between individual neurons, to models of how behaviour can arise from abstract neural modules that represent complete subsystems. These include models of the long-term and short-term plasticity of neural systems and its relation to learning and memory, from the individual neuron to the system level.
While initially research had been concerned mostly with the electrical characteristics of neurons, a particularly important part of the investigation in recent years has been the exploration of the role of neuromodulators such as dopamine, acetylcholine, and serotonin on behaviour and learning.
Biophysical models, such as BCM theory, have been important in understanding mechanisms for synaptic plasticity, and have had applications in both computer science and neuroscience. Research is ongoing in understanding the computational algorithms used in the brain, with some recent biological evidence for radial basis networks and neural backpropagation as mechanisms for processing data.
The concept of neural networks started in the late 1800s as an effort to describe how the human mind performed. These ideas started being applied to computational models with the Perceptron.
In the early 1950s Friedrich Hayek was one of the first to posit the idea of spontaneous order in the brain arising out of decentralized networks of simple units (neurons). In the late 1940s, Donald Hebb made one of the first hypotheses for a mechanism of neural plasticity (i.e. learning), Hebbian learning. Hebbian learning is considered to be a 'typical' unsupervised learning rule and it (and variants of it) was an early model for long term potentiation.
The Perceptron is essentially a linear classifier for classifying data x, specified by parameters w and b and an output function f = w'x + b. Its parameters are adapted with an ad-hoc rule similar to stochastic steepest gradient descent. Because the inner product is a linear operator in the input space, the Perceptron can only perfectly classify a set of data for which different classes are linearly separable in the input space, while it often fails completely for non-separable data. While the development of the algorithm initially generated some enthusiasm, partly because of its apparent relation to biological mechanisms, the later discovery of this inadequacy caused such models to be abandoned until the introduction of non-linear models into the field.
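The linear-separability limitation can be illustrated with a short sketch. It assumes the common mistake-driven form of the perceptron update and a threshold at zero on f = w'x + b; the function names, the 0/1 class labels and the two toy data sets (logical AND and XOR) are illustrative choices rather than details taken from the text.

```python
def predict(w, b, x):
    """Classify x by thresholding the linear output f = w'x + b at zero."""
    f = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if f > 0 else 0

def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Mistake-driven update: w and b change only on misclassified points."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(w, b, x)  # 0 when the point is classified correctly
            w = [wi + learning_rate * error * xi for wi, xi in zip(w, x)]
            b += learning_rate * error
    return w, b

def accuracy(w, b, samples):
    """Fraction of samples classified correctly."""
    return sum(predict(w, b, x) == t for x, t in samples) / len(samples)

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_data = list(zip(inputs, [0, 0, 0, 1]))  # linearly separable
xor_data = list(zip(inputs, [0, 1, 1, 0]))  # not linearly separable

print(accuracy(*train_perceptron(and_data), and_data))  # prints 1.0
print(accuracy(*train_perceptron(xor_data), xor_data))  # always below 1.0
```

No choice of w and b separates the XOR classes with a single line, so the second accuracy stays below 1.0 however long training runs.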
The Cognitron (1975) was an early multilayered neural network with a training algorithm. The actual structure of the network and the methods used to set the interconnection weights change from one neural strategy to another, each with its advantages and disadvantages. Networks can propagate information in one direction only, or they can bounce back and forth until self-activation at a node occurs and the network settles on a final state. The ability for bi-directional flow of inputs between neurons/nodes was introduced with Hopfield's network (1982), and specialization of these node layers for specific purposes was introduced through the first hybrid network.
The parallel distributed processing of the mid-1980s became popular under the name connectionism.
The rediscovery of the backpropagation algorithm was probably the main reason behind the repopularisation of neural networks after the publication of "Learning Internal Representations by Error Propagation" in 1986 (though backpropagation itself dates from 1974). The original network utilised multiple layers of weight-sum units of the type f = g(w'x + b), where g was a sigmoid function or logistic function such as used in logistic regression. Training was done by a form of stochastic steepest gradient descent. The employment of the chain rule of differentiation in deriving the appropriate parameter updates results in an algorithm that seems to 'backpropagate errors', hence the nomenclature. However, it is essentially a form of gradient descent. Determining the optimal parameters in a model of this type is not trivial, and steepest gradient descent methods cannot be relied upon to give the solution without a good starting point. In recent times, networks with the same architecture as the backpropagation network are referred to as Multi-Layer Perceptrons. This name does not impose any limitations on the type of algorithm used for learning.
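How the chain rule produces 'backpropagated' error signals can be sketched for a tiny network of units f = g(w'x + b) with a logistic g. This is an illustrative sketch, not the 1986 implementation: the network size (two hidden units, one output), the squared-error loss, the starting weights and the single training example are all assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    """Two-layer network: every unit computes g(w'x + b) with g = sigmoid."""
    W1, b1, w2, b2 = params
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    output = sigmoid(sum(w * h for w, h in zip(w2, hidden)) + b2)
    return hidden, output

def loss(params, x, target):
    """Squared-error mismatch between the network output and the target."""
    _, output = forward(params, x)
    return 0.5 * (output - target) ** 2

def backprop(params, x, target):
    """Apply the chain rule layer by layer, from the output back to the input."""
    W1, b1, w2, b2 = params
    hidden, output = forward(params, x)
    # Error signal at the output unit: derivative of the loss w.r.t. its weighted sum.
    delta_out = (output - target) * output * (1.0 - output)
    grad_w2 = [delta_out * h for h in hidden]
    grad_b2 = delta_out
    # The output error is passed back through w2 to give each hidden unit's error.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w2, hidden)]
    grad_W1 = [[d * xi for xi in x] for d in delta_hidden]
    grad_b1 = delta_hidden
    return grad_W1, grad_b1, grad_w2, grad_b2

def gradient_step(params, grads, learning_rate):
    """One steepest-descent update of every weight and bias."""
    W1, b1, w2, b2 = params
    gW1, gb1, gw2, gb2 = grads
    return (
        [[w - learning_rate * g for w, g in zip(row, grow)]
         for row, grow in zip(W1, gW1)],
        [b - learning_rate * g for b, g in zip(b1, gb1)],
        [w - learning_rate * g for w, g in zip(w2, gw2)],
        b2 - learning_rate * gb2,
    )

# Hypothetical starting weights and a single training example.
params = ([[0.1, -0.2], [0.4, 0.3]], [0.0, 0.1], [0.5, -0.6], 0.2)
x, target = [1.0, 0.0], 1.0

before = loss(params, x, target)
updated = gradient_step(params, backprop(params, x, target), learning_rate=0.1)
after = loss(updated, x, target)
print(before, after)  # the loss after the step is smaller than before
```

Updating on one example at a time, as here, corresponds to the stochastic form of gradient descent; repeating the step over many examples is what trains the network.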
The backpropagation network generated much enthusiasm at the time and there was much controversy about whether such learning could be implemented in the brain or not, partly because a mechanism for reverse signalling was not obvious at the time, but most importantly because there was no plausible source for the 'teaching' or 'target' signal.
A. K. Dewdney, a former Scientific American columnist, wrote in 1997, “Although neural nets do solve a few toy problems, their powers of computation are so limited that I am surprised anyone takes them seriously as a general problem-solving tool.” (Dewdney, p. 82)
Arguments against Dewdney's position are that neural nets have been successfully used to solve many complex and diverse tasks, ranging from autonomously flying aircraft[1] to detecting credit card fraud[2].
Technology writer Roger Bridgman commented on Dewdney's statements about neural nets:
Neural networks, for instance, are in the dock not only because they have been hyped to high heaven, (what hasn't?) but also because you could create a successful net without understanding how it worked: the bunch of numbers that captures its behaviour would in all probability be "an opaque, unreadable table... valueless as a scientific resource". In spite of his emphatic declaration that science is not technology, Dewdney seems here to pillory neural nets as bad science when most of those devising them are just trying to be good engineers. An unreadable table that a useful machine could read would still be well worth having.[2]