Neural Network Applications Using an Improved Performance Training Algorithm

Annamária R. Várkonyi-Kóczy 1,2, Balázs Tusor 2
1 Institute of Mechatronics and Vehicle Engineering, Óbuda University
2 Integrated Intelligent Space Japanese-Hungarian Laboratory
e-mail: varkonyi-koczy@uni-obuda.hu

Outline
  • Introduction, Motivation for using SC Techniques
  • Neural Networks, Fuzzy Neural Networks, Circular Fuzzy Neural Networks
  • The place and success of NNs
  • New training and clustering algorithms
  • Classification examples
  • A real-world application: fuzzy hand posture and gesture detection system
  • Inputs of the system
  • Fuzzy hand posture models
  • The NN based hand posture identification system
  • Results
  • Conclusions
Motivation for using SC Techniques
We need something "non-classical": Problems
  • Nonlinearity, never-before-seen spatial and temporal complexity of systems and tasks
  • Imprecise, uncertain, insufficient, ambiguous, contradictory information, lack of knowledge
  • Finite resources
  • Strict time requirements (real-time processing)
  • Need for optimization
  • +
  • Need for user’s comfort
  • New challenges/more complex tasks to be solved → more sophisticated solutions needed

Motivation for using SC Techniques
We need something "non-classical": Intentions
  • We would like to build MACHINES to be able to do the same as humans do (e.g. autonomous cars driving in heavy traffic).
  • We would always like to find an algorithm leading to an OPTIMUM solution (even when facing too much uncertainty and lack of knowledge)
  • We would like to ensure MAXIMUM performance (usually impossible from every point of view, i.e. some kind of trade-off is needed, e.g. between performance and costs)
  • We prefer environmental COMFORT (user friendly machines)
  • Need for optimization
  • Traditionally:
  • optimization = precision
  • New definition (L.A. Zadeh):
  • optimization = cost optimization
  • But what is cost!?
  • precision and certainty also carry a cost

User's comfort
  • Human language
  • Modularity, simplicity, hierarchical structures

Aims of the processing
  • preprocessing → processing
  • improving the performance of the algorithms
  • giving more support to the processing (new)

Aims of preprocessing
  • image processing / computer vision: noise smoothing, feature extraction (edge, corner detection), pattern recognition, etc.
  • 3D modeling, medical diagnostics, etc.
  • automatic 3D modeling, automatic ...

Motivation for using SC Techniques
We need something "non-classical": Elements of the Solution
  • Low complexity, approximate modeling
  • Application of adaptive and robust techniques
  • Definition and application of the proper cost function including the hierarchy and measure of importance of the elements
  • Trade-off between accuracy (granularity) and complexity (computational time and resource need)
  • Giving support for the further processing
  • These cannot be fulfilled by traditional and AI methods, only by Soft Computing Techniques and Computational Intelligence

What is Computational Intelligence?
Computer + Intelligence
  • increased computer facilities
  • added by the new methods

L.A. Zadeh, Fuzzy Sets [1965]: "In traditional – hard – computing, the prime desiderata are precision, certainty, and rigor. By contrast, the point of departure of soft computing is the thesis that precision and certainty carry a cost and that computation, reasoning, and decision making should exploit – whenever possible – the tolerance for imprecision and uncertainty."

What is Computational Intelligence?
  • CI can be viewed as a consortium of methodologies which play an important role in the conception, design, and utilization of information/intelligent systems.
  • The principal members of the consortium are: fuzzy logic (FL), neuro computing (NC), evolutionary computing (EC), anytime computing (AC), probabilistic computing (PC), chaotic computing (CC), and (parts of) machine learning (ML).
  • The methodologies are complementary and synergistic, rather than competitive.
  • What is common: Exploit the tolerance for imprecision, uncertainty, and partial truth to achieve tractability, robustness, low solution cost and better rapport with reality.
  • Soft Computing methods (Computational Intelligence) fulfill all five requirements:
  • low complexity, approximate modeling
  • application of adaptive and robust techniques
  • definition and application of the proper cost function, including the hierarchy and measure of importance of the elements
  • trade-off between accuracy (granularity) and complexity (computational time and resource need)
  • giving support for the further processing

Methods of Computational Intelligence
  • fuzzy logic – low complexity, easy incorporation of a priori knowledge into computers, tolerance for imprecision, interpretability
  • neuro computing - learning ability
  • evolutionary computing – optimization, optimum learning
  • anytime computing – robustness, flexibility, adaptivity, coping with the temporal circumstances
  • probabilistic reasoning – uncertainty, logic
  • chaotic computing – open mind
  • machine learning - intelligence
Neural Networks
  • mimic the human brain
  • McCulloch & Pitts, 1943; Hebb, 1949
  • Rosenblatt, 1958 (Perceptron)
  • Widrow & Hoff, 1960 (Adaline)
Neural Networks
  • Neural nets are parallel, distributed information processing tools which are:
  • highly connected systems composed of identical or similar operational units performing local processing (processing elements, neurons), usually in a well-ordered topology
  • possessing some kind of learning algorithm, which usually means learning by patterns and also determines the mode of the information processing
  • they also possess an information recall algorithm, making it possible to use the previously learned information
  • Application areas where NNs are successfully used
  • One and multi-dimensional signal processing (image processing, speech processing, etc.)
  • System identification and control
  • Robotics
  • Medical diagnostics
  • Economic feature estimation
  • Associative memory = content addressable memory
  • Application areas where NNs are successfully used
  • Classification systems (e.g. pattern recognition, character recognition)
  • Optimization systems (the usually feedback NN approximates the cost function) (e.g. radio frequency distribution, A/D converters, traveling salesman problem)
  • Approximation systems (any input-output mapping)
  • Nonlinear dynamic system models (e.g. solution of partial differential equation systems, prediction, rule learning)
  • Main features
  • Complex, non-linear input-output mapping
  • Adaptivity, learning ability
  • Distributed architecture
  • Fault tolerant property
  • Possibility of parallel analog or digital VLSI implementations
  • Analogy with neurobiology
  • Classical neural nets
  • Static nets (without memory, feedforward networks)
  • One layer
  • Multi layer
  • MLP (Multi Layer Perceptron)
  • RBF (Radial Basis Function)
  • CMAC (Cerebellar Model Articulation Controller)
  • Dynamic nets (with memory or feedback, recall networks)
  • Feedforward (with memory elements)
  • Feedback
  • Local feedback
  • Global feedback
Feedforward architectures
  • One-layer architectures: Rosenblatt perceptron
  • [Figure: one-layer architecture with input, output, and tunable parameters (weighting factors)]
  • Multilayer network (static MLP net)

Approximation property
  • universal approximation property for some kinds of NNs
  • Kolmogorov: any continuous real-valued N-variable function defined over the compact interval [0,1]^N can be represented with the help of appropriately chosen one-variable functions and the sum operation.
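In symbols, this is the Kolmogorov–Arnold representation theorem; the slide gives only the prose statement, so the notation below is the standard one rather than the deck's own:

```latex
f(x_1,\dots,x_N) \;=\; \sum_{q=0}^{2N} \Phi_q\!\Bigl(\sum_{p=1}^{N} \psi_{q,p}(x_p)\Bigr),
\qquad f \in C\bigl([0,1]^N\bigr)
```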
Learning
  • Learning = structure + parameter estimation
  • supervised learning
  • unsupervised learning
  • analytic learning
  • Convergence??
  • Complexity??
Supervised learning
  • System: d = f(x, n), where n is noise
  • NN model: y = f_M(x, w)
  • Criterion: C = C(ε) = C(d, y)
  • Supervised learning = estimation of the model parameters (parameter tuning) from x, y, and d
  • Criteria function
  • Quadratic:
  • ...
  • Minimization of the criteria
  • Analytic solution (only if it is very simple)
  • Iterative techniques
  • Gradient methods
  • Searching methods
  • Exhaustive
  • Random
  • Genetic search
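The quadratic criterion left implicit above, together with the gradient-descent update used by the iterative techniques; this is the textbook form, not copied from the slide:

```latex
C(w) \;=\; \frac{1}{2}\sum_{k}\bigl(d_k - y_k(w)\bigr)^2,
\qquad w^{(t+1)} \;=\; w^{(t)} - \mu\,\nabla_w C\bigl(w^{(t)}\bigr)
```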
  • Parameter correction
  • Perceptron
  • Gradient methods
  • LMS (least mean squares algorithm)
  • ...
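A minimal LMS sketch; only the algorithm name comes from the slides, so the step size mu, the epoch count, and the zero initialization are assumptions:

```python
import numpy as np

def lms_train(X, d, mu=0.01, epochs=50):
    """Least mean squares (Widrow-Hoff) rule: follow the negative gradient
    of the instantaneous squared error so that w @ x tracks the desired d."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, dk in zip(X, d):
            err = dk - w @ x       # instantaneous error
            w = w + mu * err * x   # gradient step on (1/2) * err**2
    return w
```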
  • Fuzzy Neural Networks
  • Fuzzy Neural Networks (FNNs)
  • based on the concept of NNs
  • numerical inputs
  • weights, biases, outputs: fuzzy numbers
  • Circular Fuzzy Neural Networks (CFNNs)
  • based on the concept of FNNs
  • topology realigned to a circular shape
  • connection between the hidden and input layers trimmed
  • the trimming done depends on the input data
  • e.g., for 3D coordinates, each coordinate can be connected to only 3 neighboring hidden layer neurons
  • dramatic decrease in the required training time

Classification
  • Clustering = the most important unsupervised learning problem: it deals with finding structure in a collection of unlabeled data
  • Clustering = assigning a set of objects into groups whose members are similar in some way and are “dissimilar” to the objects belonging to other groups (clusters)
  • (usually iterative) multi-objective optimization problem
  • Clustering is a main task of exploratory data mining and statistical data analysis, used in machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, etc.
  • Difficult problem: multi-dimensional spaces, time/data complexity, finding an adequate distance measure, ambiguous interpretation of the results, overlapping of the clusters, etc.
  • The Training and Clustering Algorithms
  • Goal:
  • To further increase the speed of the training of the ANNs used for classification
  • Idea:
  • During the learning phase, instead of directly using the training data, the data are clustered and the ANNs are trained using the centers of the obtained clusters
  • u – input
  • u' – centers of the appointed clusters
  • y – output of the model
  • d – desired output
  • c – value determined by the criteria function

The Algorithm of the Clustering Step (modified k-means algorithm)
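The slides present the clustering step as a flowchart; the sketch below reconstructs it from the description in the Conclusions (each sample is assigned to the first cluster that is "near enough", otherwise it seeds a new cluster). The Euclidean distance and the iteration cap are assumptions:

```python
import numpy as np

def cluster_training_set(samples, d, max_iter=10):
    """Modified k-means: assign each sample to the FIRST cluster whose
    center is within distance d; a sample near no center seeds a new
    cluster. The returned centers replace the raw samples as the
    (much smaller) ANN training set."""
    centers = [np.asarray(samples[0], dtype=float)]
    for _ in range(max_iter):
        members = [[] for _ in centers]
        for x in samples:
            x = np.asarray(x, dtype=float)
            for i, c in enumerate(centers):
                if np.linalg.norm(x - c) <= d:   # "near enough"
                    members[i].append(x)
                    break
            else:                                # no cluster close enough
                centers.append(x)
                members.append([x])
        new_centers = [np.mean(m, axis=0) if m else c
                       for m, c in zip(members, centers)]
        converged = all(np.allclose(a, b)
                        for a, b in zip(new_centers, centers))
        centers = new_centers
        if converged:
            break
    return np.array(centers)
```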
The ANNs
  • Feedforward MLP, BP algorithm
  • Number of neurons: 2-10-2
  • learning rate: 0.8
  • momentum factor: 0.1
  • Teaching set: 500 samples, randomly chosen from the clusters
  • Test set: 1000 samples, separately generated
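A numpy sketch of the 2-10-2 MLP with the listed learning rate (0.8) and momentum factor (0.1); the sigmoid activations, weight initialization, and per-sample updates are assumptions, since the slides fix only the layer sizes and the two coefficients:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

W1 = rng.uniform(-0.5, 0.5, (10, 3))   # hidden layer (2 inputs + bias)
W2 = rng.uniform(-0.5, 0.5, (2, 11))   # output layer (10 hidden + bias)
lr, mom = 0.8, 0.1                     # hyperparameters from the slides
dW1, dW2 = np.zeros_like(W1), np.zeros_like(W2)

def train_step(x, d):
    """One backpropagation step with momentum on a sample (x, d)."""
    global W1, W2, dW1, dW2
    x1 = np.append(x, 1.0)             # input + bias
    h = sigmoid(W1 @ x1)
    h1 = np.append(h, 1.0)             # hidden output + bias
    y = sigmoid(W2 @ h1)
    delta2 = (d - y) * y * (1.0 - y)   # output-layer error term
    delta1 = (W2[:, :-1].T @ delta2) * h * (1.0 - h)
    dW2 = lr * np.outer(delta2, h1) + mom * dW2
    dW1 = lr * np.outer(delta1, x1) + mom * dW1
    W2, W1 = W2 + dW2, W1 + dW1
    return np.sum((d - y) ** 2)        # quadratic error on this sample
```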
  • Examples: Problem #1
  • Easily solvable problem
  • 4 classes, no overlapping
The Resulting Clusters and Required Training Time in the First Experiment, with Clustering Distances A: 0.05, B: 0.1, and C: 0.25
Comparison between the Results of the Training Using the Clustered and the Cropped Datasets of the 1st Experiment

Examples: Problem #2
  • Moderately hard problem
  • 4 classes, slight overlapping
The Resulting Clusters and Required Training Time in the Second Experiment, with Clustering Distances A: 0.05, B: 0.1, and C: 0.25
Comparison between the Results of the Training Using the Clustered and Cropped Datasets of the 2nd Experiment
Comparison of the Accuracy and Training Time Results of the Clustered and Cropped Cases of the 2nd Experiment

Examples: Problem #3
  • Hard problem
  • 4 classes, significant overlapping
The Resulting Clusters and Required Training Time in the Third Experiment, with Clustering Distances A: 0.05, B: 0.1, and C: 0.2
Comparison between the Results of the Training Using the Clustered and Cropped Datasets of the 3rd Experiment
Comparison of the Accuracy Results of the Clustered and Cropped Cases of the 3rd Experiment

Examples: Problem #4
  • Easy problem
  • 4 classes, no overlapping
  • d = 0.2, 0.1, 0.05
  • [Figures: the original dataset; the trained network's classifying ability; accuracy / training time]

Examples: Problem #5
  • Moderately complex problem
  • 3 classes, with some overlapping
  • The network could not learn the original training data with the same options
  • d = 0.2, 0.1, 0.05
  • [Figure: the original dataset]

A Real-World Application: Man-Machine Cooperation in ISpace
  • Man-machine cooperation in ISpace using visual (hand posture and gesture based) communication
  • Stereo-camera system
  • Recognition of hand gestures/ hand tracking and classification of hand movements
  • 3D computation of feature points /3D model building
  • Hand model identification
  • Interpretation and execution of instructions
  • The method uses two cameras, from two different viewpoints. It works in the following way:
  • it locates the areas in the pictures of the two cameras where visible human skin can be detected, using histogram back projection (sketched below)
  • then it extracts the feature points in the back-projected picture, considering curvature extrema: peaks and valleys
  • finally, the selected feature points are matched in a stereo image pair
  • The inputs: the 3D coordinate model of the detected hand
  • The results: the 3D coordinate model of the hand, 15 spatial points
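A minimal OpenCV sketch of the skin-locating step via histogram back projection; the HSV channels, bin counts, and threshold are assumptions, as the slides name only the technique:

```python
import cv2

def skin_mask(frame_bgr, skin_patch_bgr):
    """Back-project a skin-color histogram onto a camera frame."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    patch = cv2.cvtColor(skin_patch_bgr, cv2.COLOR_BGR2HSV)
    # 2D hue-saturation histogram of a known skin sample
    hist = cv2.calcHist([patch], [0, 1], None, [30, 32], [0, 180, 0, 256])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    # probability map: how "skin-like" each pixel of the frame is
    prob = cv2.calcBackProject([hsv], [0, 1], hist, [0, 180, 0, 256], 1)
    _, mask = cv2.threshold(prob, 50, 255, cv2.THRESH_BINARY)
    return mask
```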
Fuzzy Hand Posture Models
  • describing the human hand by fuzzy hand feature sets
  • theoretically 3^14 different hand postures
  • 1st set: four fuzzy features describing the distance between the fingertips of each adjacent finger (How far are finger X and finger Y from each other?)
  • 2nd set: five fuzzy features describing the bentness of each finger (How bent is finger Z?)
  • 3rd set: five fuzzy features describing the relative angle between the bottom finger joint and the plane of the palm of the given hand (How big is the angle between the lowest joint of finger W and the plane of the palm?)

Fuzzy Hand Posture Models (Example: Victory)

Fuzzy Hand Posture and Gesture Identification System
  • ModelBase
  • GestureBase
  • Target Generator
  • Circular Fuzzy Neural Networks (CFNNs)
  • Fuzzy Inference Machine (FIM)
  • Gesture Detector
Fuzzy Hand Posture and Gesture Identification System
  • ModelBase: stores the features of the models as linguistic variables
  • GestureBase: contains the predefined hand gestures as sequences of FHPMs

Fuzzy Hand Posture and Gesture Identification System
  • Target Generator
  • calculates the target parameters for the CFNNs and the FIM
  • input parameters:
  • d – identification value (ID) of the model in the ModelBase
  • SL – linguistic variable for setting the width of the triangular fuzzy sets (see the sketch below)
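A sketch of how SL could set the width of the triangular fuzzy sets; the width values per linguistic label are hypothetical, as the slides say only that SL controls the width:

```python
def triangular(x, center, width):
    """Triangular membership: 1 at `center`, falling to 0 at distance `width`."""
    return max(0.0, 1.0 - abs(x - center) / width)

# hypothetical widths for the SL linguistic variable (not from the slides)
SL_WIDTH = {"small": 0.1, "medium": 0.25, "large": 0.5}

mu = triangular(0.42, center=0.5, width=SL_WIDTH["small"])  # ≈ 0.2
```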
Fuzzy Hand Posture and Gesture Identification System
  • Fuzzy Inference Machine (FIM)
  • decision rule: Max(Min(β_i)), where β_i is the intersection of the fuzzy feature sets
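A sketch of the Max(Min(β_i)) decision: for each stored model, the minimum membership over its features is the fuzzy intersection, and the model with the maximum intersection wins. The dictionary layout is an assumption:

```python
def classify_posture(betas_per_model):
    """betas_per_model: {model_id: [beta_1, ..., beta_n]}, one beta per
    fuzzy feature. Min over the betas = fuzzy intersection of the feature
    sets; max over the models picks the best-matching posture."""
    return max(betas_per_model, key=lambda m: min(betas_per_model[m]))
```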
  • Gesture Detector
  • identifies the detected FHPMs by using the fuzzy min-max algorithm
  • searches for predefined hand gesture patterns in the sequence of detected hand postures
Circular Fuzzy Neural Networks (CFNNs)
  • 3 different NNs for the 3 feature groups
  • 15 hidden layer neurons
  • 4/5 output layer neurons
  • 45 inputs (= 15 coordinate triplets), but only 9 inputs connected to each hidden neuron (see the sketch below)
  • convert the coordinate model to a FHPM
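A sketch of the circular trimming under the slide's dimensions (45 inputs, 15 hidden neurons, 9 inputs per hidden neuron); the exact alignment of the windows is an assumption:

```python
import numpy as np

def circular_mask(n_in=45, n_hid=15, window=9):
    """Binary connectivity mask: hidden neuron h sees only a `window`-wide,
    circularly wrapped arc of the input vector instead of all n_in inputs.
    With these numbers each input feeds exactly 3 neighboring hidden
    neurons, matching the trimming described above."""
    mask = np.zeros((n_hid, n_in))
    stride = n_in // n_hid            # 3 inputs (one coordinate triplet)
    for h in range(n_hid):
        for k in range(window):
            mask[h, (h * stride + k) % n_in] = 1.0
    return mask

# Multiplying a dense weight matrix elementwise by this mask keeps only
# 9/45 of the connections, which is where the training-time saving comes from.
```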
  • The Experiments
  • Six hand models
  • Separate training and testing sets
  • Training parameters:
  • Learning rate: 0.8
  • Coefficient of the momentum method: 0.5
  • Error threshold: 0.1
  • SL: small
  • 3 experiments
  • The first and second experiments compare the speed of the training using the clustered and the original unclustered data, and the accuracy of the trained system
  • for a given clustering distance (0.5)
  • Third experiment compares the necessary training time and the accuracy of the trained system for different clustering distances
  • The first two experiments were conducted on an average PC (Intel Pentium 4 CPU, 3.00 GHz, 1 GB RAM, Windows XP SP3), while the third experiment was conducted on another PC (Intel Core 2 Duo CPU T5670, 1.80 GHz, 2 GB RAM, Windows 7 32-bit).

Experimental Results
  • The Result in Required Training Time (first experiment)
  • Another Training Session with Only One Session (second experiment)
  • Comparative Analysis of the Results of the Trainings of the Two Sessions
  • The Quantity of Clusters Resulting from Multiple Clustering Steps for Different Clustering Distances (third experiment)
  • Comparative Analysis of the Characteristics of the Differently Clustered Data Sets (third experiment)
  • Clustered Data Sets: number of correctly classified samples / number of all samples (third experiment)

References to the examples
  • Tusor, B. and A.R. Várkonyi-Kóczy, “Reduced Complexity Training Algorithm of Circular Fuzzy Neural Networks,” Journal of Advanced Research in Physics, 2012.
  • Tusor, B., A.R. Várkonyi-Kóczy, I.J. Rudas, G. Klie, G. Kocsis, An Input Data Set Compression Method for Improving the Training Ability of Neural Networks, In CD-ROM Proc. of the 2012 IEEE Int. Instrumentation and Measurement Technology Conference, I2MTC’2012, Graz, Austria, May 13-16, 2012, pp. 1775-1783.
  • Tóth, A.A., Várkonyi-Kóczy, A.R., “A New Man- Machine Interface for ISpace Applications,” Journal of Automation, Mobile Robotics & Intelligent Systems, Vol. 3, No. 4, pp. 187-190, 2009.
  • Várkonyi-Kóczy, A.R., B. Tusor, “Human-Computer Interaction for Smart Environment Applications Using Fuzzy Hand Posture and Gesture Models,” IEEE Trans. on Instrumentation and Measurement, Vol. 60, No 5, pp. 1505-1514, May 2011.
  • Conclusions
  • SC and NN based methods can offer solutions for many "unsolvable" cases, however with a burden of convergence and complexity problems
  • New training and clustering procedures which can advantageously be used in the supervised training of neural networks used for classification
  • Idea: reduce the quantity of the training sample set in a way that has little (or no) impact on its training ability
  • Clustering is based on the k-means method, with the main difference in the assignment step, where the samples are assigned to the first cluster that is "near enough"
  • As a result, for classification problems, the complexity of the training algorithm (and thus the training time) of neural networks can be significantly reduced
  • Open questions:
  • how the decrease in classification accuracy and training time depends on the type of ANN
  • optimal clustering distance
  • generalization of the method towards other types of NNs, problems, etc.