Papadopoulos T*, Kosmas IJ and Michalakelis C
Department of Telematics and Informatics, Harokopio University of Athens, Greece
*Corresponding author: Papadopoulos T, Department of Telematics and Informatics, Harokopio University of Athens, Greece
Submission: March 08, 2021; Published: March 22, 2021
Volume 6 Issue 5
March, 2021
Artificial Neural Network models have helped researchers to foresee the outcome of problems in many scientific fields. These models are regarded as a computer-based simulation of the human neural system that is capable of benchmarking values. They apply functions to a large variety of data examples in order to extract knowledge, identify relationships or errors and finally provide a possible outcome. From a decision-making perspective the Neural Network approach is very helpful, even if it is difficult to implement in real cases. Recently they have been increasingly applied in many fields such as finance, strategic and project management and business policy, implementing forecasts, pattern recognition and predictions using historical data. This study is an overview of the basic models of Artificial Neural Networks, as well as a literature review of their use in specific fields of science. The goal of this study is to serve as a roadmap for an early researcher to understand the principles of Artificial Neural Network models and the underlying mathematics, to become familiar with their applications in certain areas and to guide further exploration of this area.
Keywords: Neural networks; Decision making; Finance; Science; Humans; Nucleus; Nerve
The concept of simulating a thinking machine was introduced by Alan Turing in 1950,
who proposed that if a computer-human interaction seems like human-human interaction
then we should consider the computer in question to be intelligent. Generally, Artificial
Intelligence systems act like humans, meaning that they need to access and use knowledge in
order to solve problems by applying conditions, logic, hunches and intuition in order to make
a decision [1]. Neural network methods are inspired by biology with analogous components
of living organisms. In biological Neural Networks, dendrites collect signals and the neuron sends a
signal through an axon, which in turn connects it with other neurons that are excited or inhibited as
a result. In Artificial Neural Networks, input data are sent to a processing entity (simulating a
neuron) which in turn sends output signals to other entities [2]. Such signals mainly follow the
form of an “if-then” rule, where Neural Networks analyze a large number of variables, identifying
interactions and patterns [3], discovering rules and learning. Compared to other statistical
models, Artificial Neural Networks constitute a better approach to deduce assumptions based
on independent data [4,5]. Neural networks help to identify patterns among computer files,
requiring assumptions, and to organize and classify data, achieving satisfactory predictions.
These characteristics can make them a useful tool in finance [6], management [7], decision-making
and strategic planning [8]. As demonstrated in (Figure 1):
a. The nerve cell consists of the body.
b. The body of the nerve cell also contains the nucleus.
c. Dendrites are many and short.
d. The axon is one and long.
e. The main functional feature of the nerve cell is that when it receives a stimulus it is
activated.
The human brain, an extremely large, interconnected network of neurons, models the
world around us, as depicted in (Figure 1). A neuron collects input from other neurons using dendrites. It then sums the inputs and, when the resulting
value is higher than a threshold, it produces an appropriate signal.
This signal is finally sent to other connected neurons through
the axon. Neural network algorithms are considered among the
most powerful and widely used ones. From a macroscopic point
of view, they could appear as a black box: there is an input layer
which transmits information to the “hidden layer(s)”, which subsequently
perform calculations and send results to the output layer. In
order to implement and optimize a neural network it is crucial to
understand how hidden layers operate. When a nerve cell is affected
by a stimulus (e.g., electrical stimulation), the membrane
potential is inverted. In other words, when the nerve cell is activated
under the influence of a stimulus, the negative membrane
charges move outwards and the positive ones inwards. But what was the
problem-solving process before the development of Neural
Networks?
Figure 1: Human brain neural network.
a. Before Neural Networks, computers solved problems by
following specific commands provided by programmers.
The capability to solve problems was limited to those for which it is
possible to provide a list of commands.
b. Unfortunately, many real-world problems are very
complex, have no analytical solution and require alternative
approaches, such as heuristics. For instance, imagine
that someone wants to recognize a handwritten letter.
c. It is practically impossible to describe (in a programming
language) what a letter may look like, as different people write
letters with different handwriting.
On the other hand, in order to digitally read handwritten letters, it is more helpful to use algorithms that recognize patterns through examples. The idea is to create a machine that can recognize patterns even though its programmer may never understand the pattern itself. This means that a great number of different handwritten letters could be recognized by discerning their similarities and differences. Therefore, the aim of this work is to shed light on some applications of Artificial Neural Networks to forecast indicators used in specific fields, mainly finance, management and decision-making. Researchers as well as practitioners could use this study as an example of methodology for implementing forecasting with artificial neural networks in domains like the ones described above. Future researchers will find in the present article the basic introductory elements for understanding Neural Networks, and may also be inspired to engage in further research in these or other areas. The rest of the paper consists of a literature review and history of artificial neural networks, including some significant milestones (section 2). This review is followed by a short introduction to the basic Artificial Neural Network models (section 3) and their applications to selected areas (section 4). Finally, section 5 includes an analysis of the models’ performance and section 6 concludes.
Kitchenham & Brereton [9] list three important reasons for
conducting a literature review on information engineering and IT:
1. to summarize empirical facts, the advantages and disadvantages
of a given technology; 2. to recognize any existing gaps and
recommend further research; and 3. to provide a basis for the correct
deployment of future study activities. This study is consistent with
the second purpose, with the goal of shedding light on some of
the applications of Artificial Neural Networks in different fields,
especially economics, management and decision-making, and also
recognizing gaps in these areas. The authors started with an automated
search in several databases (Google Scholar, etc.), reviewing all
relevant studies (journals, conferences, books, etc.). At the same time
the authors applied inclusion and exclusion criteria in order to reach
the final insight. The keywords that have been used for the present
paper are: “Neural Networks application in management”, “Neural
Networks application in decision making”, “Neural Networks
application project management”, “Neural Networks application
in finance”. The development of artificial neural networks has a
noteworthy history. This study presents some major events and
achievements during their evolution, in order to underline how
research has developed until now. In 1943, for the first time, McCulloch
& Pitts [10] proposed the idea of data processing through binary
input elements, called “neurons”. These networks differ from those
in conventional computer science, in the sense that the steps
of the programming process are not executed sequentially, but in
parallel with “neurons” [10]. The model became the
inspiration for deeper research.
In 1949, Donald Hebb [11] proposed algorithms that achieve
learning by updating neurons’ connections [11]. McCulloch and
Hebb contributed significantly to the development of neural
science studies [10,11]. In 1954, Minsky [12] built and tested
the first neurocomputer [12], and in 1958 Frank Rosenblatt
[13] developed the Perceptron as an element instead of a neuron
[13]. This was a trainable computer that could learn by changing
the connections to the edge components and therefore identify
patterns. The theory motivated scientists and became the basis for
the essential algorithms of machine learning. This innovation led to
the establishment of a public company for neural networks in 1960
[14]. Nils Nilsson’s monograph on learning algorithms [15] outlined
the research developments up to that point and the drawbacks found in learning algorithms. Shun-ichi Amari [16,17], Fukushima [18] and
Miyake (1980) further discussed the threshold components and the
statistical principles of neural networks. Associative memory has
been further reviewed by Teuvo Kohonen [19-25]. Later surveys
in neural networking took into consideration Hopfield’s model
as most appropriate for very large-scale integration designs [26]
and improved layered models [27]. The variety of applications where
neural networks can be used has increased radically, from
tiny examples to broad cognitive activities. Nowadays, scientists
and practitioners develop complicated, large-scale interconnected neural
network chips. Much research remains to be done in
order to understand neural networks deeply and therefore achieve
better results, but there is already a satisfactory number of models
ready to be used to solve many problems.
Neural Network models are suitable for prediction and forecasting problems that have the following characteristics: i. the input is well understood, ii. the output is well understood and iii. experience is available [28]. These models consist of an input layer of neurons that sends signals to a hidden middle layer. The hidden layer of neurons applies weights to the signals it receives and sends the results to the output layer, which aggregates the data and generates the final output [2]. These models can classify outcomes [29] and identify patterns [30]. In this procedure, data are fed to the network and then processed within the layers. The structure described above is illustrated in (Figure 2). Behaving similarly to biological neurons, nodes combine inputs through an activation function to produce a signal for the output layer. Each input has a different weight, which is acquired by training, taking into consideration the contribution or importance of each variable [4]. This function consists of two basic parts: i. a combination function that merges all input values into one and ii. a transfer function which produces the final outcome. A linear transfer function is not adequate in most cases, while sigmoid and hyperbolic tangent functions work better, mainly because they bound the result within a fixed range: between 0 and 1 for the sigmoid and between -1 and 1 for the hyperbolic tangent (Figure 3).
Figure 2: The structure of a neural network.
Figure 3: Activation and transfer function.
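The two-part activation function described above can be illustrated with a minimal Python sketch. The function names (`combine`, `sigmoid`, `node_output`) and the example inputs and weights are hypothetical and chosen only for illustration; the weighted sum is assumed as the combination function, which is the most common choice but is not stated explicitly in the text.

```python
import math

def combine(inputs, weights):
    """Combination function: merge all weighted inputs into a single value."""
    return sum(w * x for w, x in zip(weights, inputs))

def sigmoid(z):
    """Sigmoid transfer function: bounds the result between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def node_output(inputs, weights, transfer=sigmoid):
    """Output of one node: transfer function applied to the combined input."""
    return transfer(combine(inputs, weights))

# Hypothetical inputs and weights, for illustration only.
inputs = [1.0, 0.5]
weights = [0.4, -0.2]
out_sigmoid = node_output(inputs, weights)                   # value in (0, 1)
out_tanh = node_output(inputs, weights, transfer=math.tanh)  # value in (-1, 1)
print(out_sigmoid, out_tanh)
```

Swapping the `transfer` argument is enough to compare transfer functions on the same combined input.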
Simple feedforward neural networks
Feedforward Neural Networks use the supervised learning method [31], which implements the learning process by specifying the desired output for every pattern encountered during training and by treating the differences between the network output and the training target as errors to be minimized [32]. The Perceptron is a simple topology of feedforward Neural Network without hidden layers. It is the first such network, introduced by Rosenblatt [13] in 1958 as a mechanism that can be trained to classify patterns [33]. In the perceptron model, learning is an error-driven process that adjusts the weights so that, given a binary input vector, the network produces the desired output. The Perceptron learning algorithm has the following steps, which are repeated until the weights stop changing significantly:
i. For each input vector $\vec{x}$ and desired output $\vec{t}$, calculate the output $\vec{y}$.
ii. If $\vec{y} = \vec{t}$, leave the weights unchanged.
iii. If $\vec{y} \neq \vec{t}$, add to every weight whose input $x_{i} \neq 0$ the difference $\Delta w = d\,(t - y)$, where $d$ is the learning rate.
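The steps above can be sketched in Python as follows. This is a minimal illustration, not the original authors' implementation: the threshold activation (`step`), the extra bias weight fed by a constant input of 1, the AND-gate training data and the value `d=0.5` are all assumptions introduced for the example.

```python
def step(z):
    """Threshold activation: output 1 only when the weighted sum is positive."""
    return 1 if z > 0 else 0

def predict(weights, x):
    """Compute the output y for input x (a constant bias input of 1 is appended)."""
    xb = list(x) + [1]
    return step(sum(w * xi for w, xi in zip(weights, xb)))

def train_perceptron(samples, d=0.5, max_epochs=100):
    """Repeat the learning steps until no weight changes during a full pass."""
    weights = [0.0] * (len(samples[0][0]) + 1)
    for _ in range(max_epochs):
        changed = False
        for x, t in samples:
            y = predict(weights, x)
            if y != t:
                xb = list(x) + [1]
                for i, xi in enumerate(xb):
                    if xi != 0:                    # only weights with a non-zero input
                        weights[i] += d * (t - y)  # delta w = d (t - y)
                changed = True
        if not changed:
            break
    return weights

# Hypothetical training set: the logical AND of two binary inputs.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights = train_perceptron(and_gate)
predictions = [predict(weights, x) for x, _ in and_gate]
print(weights, predictions)
```

AND is linearly separable, which is why a network without hidden layers can learn it; a problem such as XOR would not converge with this procedure.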
A generalization of the Perceptron model is the Delta Rule learning algorithm that was introduced by Widrow & Hoff [14]. Like the Perceptron, it is an error-driven learning algorithm, with the difference that the algorithm stops when the mean square error over the input vectors is minimized. Given a learning vector $\vec{p}$, the mean square error is calculated by the function:
In the Delta Rule, the weights change by the following quantity:
Although the Delta Rule learning algorithm is an improved version of the Perceptron model, it cannot be applied to Neural Networks with hidden layers because it is difficult to calculate the target output value t_{k} of each hidden node [33].
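A minimal Python sketch of the Delta Rule for a single linear output node is given below. Because the formulas are not reproduced in this text, the sketch assumes the commonly used Widrow-Hoff form, with error $E_p = \frac{1}{2}(t - y)^2$ per learning vector and weight change $\Delta w_i = d\,(t - y)\,x_i$; the function names and the training data (generated from the hypothetical target $t = 2x_1 - x_2$) are illustrative only.

```python
def predict(weights, x):
    """Linear output node: weighted sum of the inputs."""
    return sum(w * xi for w, xi in zip(weights, x))

def mean_square_error(weights, samples):
    """Average of 0.5 * (t - y)^2 over all learning vectors (assumed form)."""
    return sum(0.5 * (t - predict(weights, x)) ** 2 for x, t in samples) / len(samples)

def train_delta_rule(samples, d=0.1, epochs=200):
    """Apply the assumed update w_i += d * (t - y) * x_i for every sample."""
    weights = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for x, t in samples:
            y = predict(weights, x)
            for i, xi in enumerate(x):
                weights[i] += d * (t - y) * xi
    return weights

# Hypothetical data generated from t = 2*x1 - x2.
samples = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0), ((0.5, 0.5), 0.5)]
error_before = mean_square_error([0.0, 0.0], samples)
weights = train_delta_rule(samples)
error_after = mean_square_error(weights, samples)
print(weights, error_before, error_after)
```

Unlike the Perceptron step, the update here is proportional to the size of the error, so the weights approach the values that minimize the mean square error gradually.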
Backpropagation model
A highly successful and widely used [34] feedforward training model is the backpropagation model (Figure 2), which identifies relationships between variables (inputs and outputs) [35]. The backpropagation model applies feedforward functions to a large variety of data examples in order to extract knowledge [36]; it identifies relationships and errors and finally provides a possible outcome. Its algorithm drives the model through a learning procedure of trial and error in order to determine the correct answers [37]. It consists of two propagations: the forward pass, in which data are fed to the input layer, and the backward pass, in which the network receives feedback and responds through error-correction learning [38]. The backpropagation algorithm begins the learning process using input vectors from past datasets. At first the weights of the model are initialized to small random values [39]. Each node of the input layer produces, through the transfer function, its own output, which is used as an input value by the nodes of the next hidden layer. This continues through every hidden layer until the output layer is reached, which produces the final output value of the network. The input value of node j of a hidden or output layer is calculated by the function:
The output value of this node is calculated from the transfer function:
During the learning process, the desired outcome is known for each input vector from past data. Applying the Delta Rule to output layer node k, the weights change by:
Where d is the learning rate. When the calculation of the new weights of the output layer is complete, the resulting values are treated as desired output values for the last hidden layer, and all weights are then calculated with the same algorithm, moving backwards, until all weights have changed. This is a gradient descent optimization procedure which minimizes the mean square error E between output and desired output [40]:
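The forward and backward passes can be sketched in Python for a network with one hidden layer. This is an illustrative sketch under stated assumptions rather than the formulation of the cited works: it assumes sigmoid transfer functions, the error $E = \frac{1}{2}\sum_k (t_k - y_k)^2$, a constant bias unit feeding the output layer, and XOR-style toy data; all function and variable names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Forward pass: hidden outputs (plus a bias unit) and network outputs."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    hidden.append(1.0)  # constant bias unit for the output layer (assumption)
    out = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w_out]
    return hidden, out

def error(x, t, w_hidden, w_out):
    """Assumed error E = 0.5 * sum_k (t_k - y_k)^2 for one input vector."""
    _, out = forward(x, w_hidden, w_out)
    return 0.5 * sum((tk - yk) ** 2 for tk, yk in zip(t, out))

def backward(x, t, w_hidden, w_out):
    """Backward pass: gradients of E with respect to both weight layers."""
    hidden, out = forward(x, w_hidden, w_out)
    delta_out = [(yk - tk) * yk * (1.0 - yk) for tk, yk in zip(t, out)]
    delta_hidden = [
        hidden[j] * (1.0 - hidden[j])
        * sum(delta_out[k] * w_out[k][j] for k in range(len(w_out)))
        for j in range(len(w_hidden))
    ]
    grad_out = [[dk * h for h in hidden] for dk in delta_out]
    grad_hidden = [[dj * xi for xi in x] for dj in delta_hidden]
    return grad_hidden, grad_out

def train_step(x, t, w_hidden, w_out, d=0.5):
    """One gradient descent step with learning rate d."""
    grad_hidden, grad_out = backward(x, t, w_hidden, w_out)
    for row, grow in zip(w_hidden, grad_hidden):
        for i, g in enumerate(grow):
            row[i] -= d * g
    for row, grow in zip(w_out, grad_out):
        for j, g in enumerate(grow):
            row[j] -= d * g

random.seed(0)
# Two inputs plus a constant bias input, three hidden nodes, one output node.
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]
w_out = [[random.uniform(-0.5, 0.5) for _ in range(4)]]
data = [([0, 0, 1], [0]), ([0, 1, 1], [1]), ([1, 0, 1], [1]), ([1, 1, 1], [0])]
for _ in range(2000):
    for x, t in data:
        train_step(x, t, w_hidden, w_out)
total_error = sum(error(x, t, w_hidden, w_out) for x, t in data)
print(total_error)
```

Whether the toy problem is learned well depends on the random initialization and the learning rate, so the sketch prints the remaining error instead of assuming a particular result.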
Hopfield model and Bidirectional Associative Memories (BAM)
The innovation that Hopfield Networks [41] brought in 1982 was the introduction of the energy function, which describes the state of the network, as in thermodynamics, where a system tends to minimize its energy. Its topology consists of one layer in which every node is connected with every other node through two-way connections with symmetric weights. In the Hopfield model every node j takes as input value the initial input j from the input vector as well as the sum of the output values of every other node i≠j.
where x_{j} is the input to node j, y_{i} is the output of the other nodes i
and s_{j} is the input value.
In the Hopfield model the weight matrix is
easily calculated during the learning process:
Where S_{k} = (S_{k1}, S_{k2} ,..., S_{kn} ) and w_{ij} = 0
when i = j
Nodes are simple Perceptrons with the following activation
function:
Where y_{j}’ is the output from the previous epoch of the
learning algorithm.
The Hopfield algorithm ends when the energy function converges
to a steady state:
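A small Hopfield network can be sketched in Python as follows. Since the formulas are not reproduced in this text, the sketch assumes the standard forms: bipolar (+1/-1) patterns, weights $w_{ij} = \sum_k S_{ki} S_{kj}$ with $w_{ii} = 0$, node input $x_j = s_j + \sum_{i \neq j} w_{ij} y_i$, an output that keeps its previous value when $x_j = 0$, and energy $E = -\frac{1}{2}\sum_i \sum_j w_{ij} y_i y_j - \sum_j s_j y_j$. The stored pattern and the noisy probe are hypothetical.

```python
def train_hopfield(patterns):
    """Weight matrix: w_ij = sum_k S_ki * S_kj, with a zero diagonal."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for s in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += s[i] * s[j]
    return w

def energy(w, y, s):
    """Assumed energy function of the network state y for input vector s."""
    n = len(y)
    pair = sum(w[i][j] * y[i] * y[j] for i in range(n) for j in range(n))
    return -0.5 * pair - sum(sj * yj for sj, yj in zip(s, y))

def recall(w, s, max_sweeps=10):
    """Update nodes one at a time until no output changes (steady state)."""
    y = list(s)
    for _ in range(max_sweeps):
        changed = False
        for j in range(len(y)):
            x_j = s[j] + sum(w[i][j] * y[i] for i in range(len(y)) if i != j)
            new = 1 if x_j > 0 else -1 if x_j < 0 else y[j]
            if new != y[j]:
                y[j] = new
                changed = True
        if not changed:
            break
    return y

stored = [1, -1, 1, -1, 1, -1]    # hypothetical stored pattern
probe = [1, 1, 1, -1, -1, -1]     # the same pattern with two entries flipped
w = train_hopfield([stored])
recalled = recall(w, probe)
print(recalled, energy(w, probe, probe), energy(w, recalled, probe))
```

In this example the recalled state equals the stored pattern and has lower energy than the noisy starting state, which is the behavior the energy function is meant to capture.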
In 1988 Kosko introduced an extension to the Hopfield model, the Bidirectional Associative Memory (BAM), which has two layers instead of one. The weight matrix during the learning process in this model is:
The energy function for the BAM model is simpler than Hopfield’s:
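The following Python sketch illustrates a BAM under the standard formulation, which is assumed here because the formulas are not reproduced in this text: bipolar pattern pairs $(A_k, B_k)$, weight matrix $W = \sum_k A_k^{T} B_k$, recall by passing signals back and forth between the two layers, and energy $E = -A W B^{T}$. The pattern pairs, the noisy probe and the rule of keeping the previous value on a zero input are illustrative assumptions.

```python
def train_bam(pairs):
    """Weight matrix: w_ij = sum_k A_k[i] * B_k[j]."""
    n, m = len(pairs[0][0]), len(pairs[0][1])
    w = [[0] * m for _ in range(n)]
    for a, b in pairs:
        for i in range(n):
            for j in range(m):
                w[i][j] += a[i] * b[j]
    return w

def sign_or_keep(values, previous):
    """Bipolar threshold; a zero input keeps the previous output."""
    return [1 if v > 0 else -1 if v < 0 else p for v, p in zip(values, previous)]

def bam_energy(w, a, b):
    """Assumed BAM energy: E = -sum_i sum_j a_i * w_ij * b_j."""
    return -sum(a[i] * w[i][j] * b[j] for i in range(len(a)) for j in range(len(b)))

def bam_recall(w, a, max_iters=10):
    """Alternate between the two layers until both stop changing."""
    a = list(a)
    b = [1] * len(w[0])  # arbitrary starting state for the second layer
    for _ in range(max_iters):
        new_b = sign_or_keep(
            [sum(a[i] * w[i][j] for i in range(len(a))) for j in range(len(b))], b)
        new_a = sign_or_keep(
            [sum(w[i][j] * new_b[j] for j in range(len(b))) for i in range(len(a))], a)
        if new_a == a and new_b == b:
            break
        a, b = new_a, new_b
    return a, b

# Hypothetical associations between 6-element and 3-element bipolar patterns.
a1, b1 = [1, 1, 1, -1, -1, -1], [1, -1, 1]
a2, b2 = [1, -1, -1, -1, -1, 1], [-1, -1, 1]
w = train_bam([(a1, b1), (a2, b2)])
probe = [-1, 1, 1, -1, -1, -1]   # a1 with its first entry flipped
recalled_a, recalled_b = bam_recall(w, probe)
print(recalled_a, recalled_b)
```

In this example the noisy probe settles on the stored pair `(a1, b1)`, and the energy of the recalled pair is lower than that of the probe paired with `b1`.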
Kohonen network model
The Kohonen model differs from the others in that it consists of many neurons placed in a geometric topology such as a straight line, a plane or a sphere. Each neuron is connected with the input layer and receives every input vector. During the learning process the network separates neurons into categories by matching input values to a neuron, which in turn affects its nearby neurons. Neurons in the neighborhood strengthen their output values while the others weaken them. Given an input vector S = (S_{1}, S_{2} ,..., S_{n} ), the learning process calculates the distance between the vectors S and w_{i} , where w_{i} = (w_{i1}, w_{i2} ,..., w_{ik} )
where n is the learning rate.
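One learning step of a Kohonen network can be sketched in Python as follows. The sketch assumes the Euclidean distance, neurons arranged on a straight line, and the commonly used update $w_i \leftarrow w_i + \eta\,(S - w_i)$ applied to the winning neuron and its immediate neighbors; the learning rate is called `eta` in the code, and the initial weights, the input vector and the neighborhood radius are hypothetical values.

```python
import math

def distance(s, w):
    """Euclidean distance between input vector s and weight vector w."""
    return math.sqrt(sum((sk - wk) ** 2 for sk, wk in zip(s, w)))

def winner(s, weights):
    """Index of the neuron whose weight vector is closest to the input."""
    return min(range(len(weights)), key=lambda i: distance(s, weights[i]))

def som_step(s, weights, eta=0.5, radius=1):
    """Move the winner and its neighbors on the line towards the input."""
    c = winner(s, weights)
    for i, w in enumerate(weights):
        if abs(i - c) <= radius:
            weights[i] = [wk + eta * (sk - wk) for sk, wk in zip(s, w)]
    return c

# Three neurons on a straight line, each with a 2-element weight vector.
weights = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
s = [0.9, 0.8]
best = som_step(s, weights)
print(best, weights)
```

Here the third neuron wins, so it and its neighbor move towards the input while the first neuron is left unchanged. In practice the learning rate and the radius are usually reduced over time, which this sketch omits.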
Neural Network models can be utilized in several domains. Potential applications of Neural Networks in finance include the estimation of stock market and currency values (i.e., forecasting) or the analysis of the strength of historical (or pro forma) financial statements (i.e., classification). Chase Manhattan Bank, Peat Marwick and American Express are among the most well-known companies that efficiently apply Neural Networks to solve problems in finance and portfolio management [42]. In 1998, O’Leary [43] analyzed studies showing that artificial neural networks could be used to foresee the failure or bankruptcy of companies. He provided extensive data details for each analysis in order to substantiate his claims. Key factors that may affect the outcome of these predictions are the type of the model and the software that is used (means of development), as well as the complete structure of nodes (input, hidden and output layers). Zhang et al. [44] reviewed 21 papers addressing modeling issues of neural networks used to forecast values and 11 more studies comparing neural models’ performance with conventional statistical approaches. According to them, the modeling issues are the kind of knowledge, the amount of data available for the learning procedure, the model design (how many nodes and layers form the neural model and what transfer function is used), the specific algorithm and the normalization process. Financial applications of artificial neural networks appear in many studies [45-47]. Foreign exchange rates, stock values, credit risk and business failures are widely estimated by neural network models [48-56]. Considering that a manager is required to make many decisions on a daily basis, it is of great importance to ensure that the right ones are made [57].
For example, when deciding whether or not to introduce a new product in an industry, a neural network model may provide real-time estimations with a high level of accuracy. The findings are tested by utilizing data on the life cycle of the item. For example, the engine rpm (revolutions per minute), the winding temperature and the strength are sometimes used to predict the remaining life of a component [58].
In decision-making, researchers argue that political decisions can be assisted by Neural Network models as well. Politicians have to make thousands of decisions during their time in office. Some of these decisions can raise them to the height of popularity, while others can drive them straight to the bottom and, ultimately, to losing the elections. A hypothetical application that could process all available data throughout the world could potentially foresee the outcome of future elections by considering how voters may respond to international or national matters [59]. Regarding investment funding, scarce resources force decision makers to decide which projects will get funding and which will not. Unfortunately, it is very hard to develop a reliable framework for such an evaluation [60], especially for projects that may be affected by complex external conditions like migration and international protection management [61]. This decision is difficult because there are many uncertain economic, financial and social indicators that can simultaneously affect the cost-benefit analysis of the project [62]. This analysis determines the Net Present Value of an investment, which can help decision makers [63] to choose which project to finance among the alternatives. This method assigns monetary values to all positive and negative effects, focusing on measuring and quantifying indirect effects [64].
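As a brief illustration of the Net Present Value mentioned above, the following Python sketch uses the standard definition $NPV = \sum_{t=0}^{T} CF_t / (1 + r)^t$, where $CF_t$ is the cash flow in period $t$ and $r$ is the discount rate. The two projects, their cash flows and the 10% rate are hypothetical figures chosen for the example; they do not come from the studies cited here.

```python
def net_present_value(rate, cash_flows):
    """Discount each period's cash flow back to period 0 and sum them."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))

# Hypothetical alternatives: each starts with an outlay of 1000 in period 0.
project_a = [-1000.0, 500.0, 500.0, 500.0]   # steady returns
project_b = [-1000.0, 0.0, 0.0, 1500.0]      # one late return
rate = 0.10

npv_a = net_present_value(rate, project_a)
npv_b = net_present_value(rate, project_b)
preferred = "A" if npv_a > npv_b else "B"
print(round(npv_a, 2), round(npv_b, 2), preferred)
```

In a forecasting setting, the uncertain cash flows or indicators fed into such a calculation are the quantities a neural network model could be used to estimate.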
Given the complexity of the indicators, a powerful branch of artificial intelligence such as Artificial Neural Networks (Mavaahebi and Nagasaka, 2013) can be utilized to understand patterns [65,66] and forecast indicators in project management. In addition, neural models may be applicable to measuring specific critical success factors that serve as fundamental criteria for preventing failures [67]. In recent decades, artificial neural network models have helped researchers to foresee the outcome of such indicators [68], make better decisions and therefore achieve higher welfare for the community and its citizens [69]. This analysis takes into consideration a great deal of uncertain information, which increases the risk of making false predictions.
Artificial Neural Network models have helped researchers to foresee outcomes in many fields of science. These models are regarded as computer-based processors that learn from experimental data and make analogous decisions [68], simulating the human neural system. From a decision-maker’s perspective the Neural Network approach is very helpful, even if it is difficult to implement in real cases [69-74]. Focusing on the characteristics of Artificial Neural Networks, this study presented the basic models for Machine Learning and some of their applications in finance, management, project management and decision-making [75-79]. Most models have common characteristics and prerequisites, such as the requirement that their inputs and outputs be well understood and that sufficient experience be available. There are plenty of available studies and reviews describing how every field of science can utilize each model [80-83]. The scope of this study was not to present each separate model in detail but to provide directions and principles so that an early researcher can find a roadmap for deeper study.
© 2021 Papadopoulos T. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and build upon your work non-commercially.