
Transfer learning with LTANN-MEM & NSA for solving high dimensional multi-objective symbolic regression problems

Amr K. Deklel, Mohamed A. Saleh, Alaa M. Hamdy, Elsayed M. Saad
Helwan University, Faculty of Engineering
34th National Radio Science Conference (NRSC 2017), March 13-16, 2017, Port Said, Egypt
Arab Academy for Science, Technology & Maritime Transport


Outline
- Introduction
- Problem Description
- Motivation
- Background: Neural Symbolization Algorithm (NSA), Transfer Learning
- Approach: Segmented LTANN-MEM, Sub-tasking NSA
- Results
- Comparison
- Summary
- Future Work
- Q & A

Introduction

Symbolic regression versus linear and non-linear regression analysis

Multi-objective vs. single-objective symbolic regression problems

Symbolic regression techniques (e.g. Genetic Programming, Bee Colony Programming, etc.)

Symbolic regression & neural networks
- Neural networks are sub-symbolic
- How can they be used to extract symbolic representations?
- LTANN-MEM and NSA offer one answer [1]


Problem Description

LTANN-MEM & NSA proposed using the previous learning experience of neurons: these neurons are modelled according to a domain-specific classification scheme and stored in an ANN memory (LTANN-MEM), to be used later to reinforce the learning of neural networks that represent relations between the stored experiences [1]

However, it had a convergence problem in high-dimensional multi-objective symbolic regression, e.g.:
- the Boolean Decoder problem
- the Boolean DEMUX problem

Motivation

Our primary objective is building a unified cognitive architecture based on artificial neural networks that can be used to develop systems exhibiting intelligent behavior in a diversity of complex environments.

Intelligent cognitive behavior includes memory, continuous learning and language, among other things; that is why cognitive functions rely on symbolic representations [2], although ANNs are sub-symbolic by nature [3]

Because of that, we invented an ANN architecture which facilitates continuous learning by preserving learning experience in long-term memory and reusing it by relating it to newly acquired knowledge, and we applied this architecture to the symbolic regression problem


Motivation

However, how flexible is this architecture for storing new learning experiences in LTANN-MEM, not only before the regression process (offline) but also during it (online)?

And how well does it scale to higher-dimensional problems?

Neural Symbolization Algorithm (NSA) [1]

- Working Memory (WM) is an MLP used for training the problem cases
- LTANN-MEM is composed of two Multi-Layer Perceptrons:
  - Alpha ANN: generates a symbolic representation equivalent to neuron weights
  - Beta ANN: synthesizes neuron weights equivalent to a generated symbolic representation
- The animation shows the NSA sequence of work: the working memory gradually evolves to represent the input problem in terms of the stored memory primitives

[Animation: Working Memory (WM) interacting with LTANN-MEM]
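To make the Alpha/Beta roles concrete, here is a minimal Python sketch of the loop described above. It uses NumPy and scikit-learn as stand-ins; the names (alpha, beta, nsa_step), the network sizes, and the simple blending step are illustrative assumptions, not the authors' implementation from [1].

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Illustrative stand-ins for the two LTANN-MEM parts (assumed sizes/solvers):
# Alpha maps working-memory neuron weights to a symbolic code,
# Beta maps a symbolic code back to neuron weights.
alpha = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000)
beta = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000)

def train_ltann_mem(stored_weights, stored_symbols):
    """Fit the long-term memory on previously learned (weights, symbols) pairs."""
    alpha.fit(stored_weights, stored_symbols)
    beta.fit(stored_symbols, stored_weights)

def nsa_step(wm_weights, blend=0.5):
    """One illustrative NSA iteration: read the working memory symbolically,
    synthesize the weights the memory associates with that reading,
    and nudge the working memory toward them."""
    symbols = alpha.predict(wm_weights.reshape(1, -1))
    recalled = beta.predict(symbols.reshape(1, -1)).ravel()
    return (1.0 - blend) * wm_weights + blend * recalled
```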

Transfer Learning

Also called lifelong learning and continuous learning

It is the ability of a system to recognize and apply knowledge and skills learned in previous tasks to novel tasks

NSA uses transfer learning through storing learned tasks in LTANN-MEM.

LTANN-MEM does not just remember previously stored tasks; it also interpolates to predict non-stored tasks from stored ones

However, in high-dimensional problems the spaces between learned cases become so wide that interpolation errors are too high to reach a convergence point.

Segmented LTANN-MEM

To overcome the high interpolation error, it is necessary to be able to update LTANN-MEM with new tasks while trying to solve the target tasks.

However, the back-propagation algorithm suffers from the catastrophic forgetting problem, also called the catastrophic interference problem

In this problem, back-propagation fails to attend selectively to input dimensions as humans do, because it forgets learned associations when new examples are trained [4]

To overcome this problem, the LTANN-MEM architecture is updated to include multiple segments (also called chunks): new cases are buffered until a certain threshold (the chunk size) is reached, and then an LTANN-MEM segment including both the Alpha and Beta parts is created. This new architecture is called Segmented LTANN-MEM (SEG-LTANN-MEM)

NSA is updated to search in all SEG-LTANN-MEM segments rather than one segment
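A minimal sketch of this segmentation scheme, continuing the illustrative Python style above: cases are buffered until the chunk size is reached, then a new segment (its own Alpha/Beta pair) is created and frozen, and recall searches every segment. The class and method names are assumptions, not the paper's code.

```python
class SegLtannMem:
    """Illustrative segmented long-term memory (SEG-LTANN-MEM)."""

    def __init__(self, chunk_size, train_segment):
        self.chunk_size = chunk_size
        self.train_segment = train_segment  # callable: list of cases -> trained segment
        self.buffer = []                    # pending (weights, symbols) cases
        self.segments = []                  # frozen segments created so far

    def add_case(self, weights, symbols):
        """Buffer a newly learned case; fire a new segment when the chunk fills."""
        self.buffer.append((weights, symbols))
        if len(self.buffer) >= self.chunk_size:
            self.segments.append(self.train_segment(self.buffer))
            self.buffer = []                # start the next chunk

    def recall(self, symbols):
        """The updated NSA searches all segments, returning each segment's proposal."""
        return [segment.synthesize(symbols) for segment in self.segments]
```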

Sub-tasking NSA

- Both the weights and the model are different representations of the same task, so we can measure the interpolation error
- When the model error is high, the task is considered a sub-task and solved separately
- The sub-task weights and the equivalent model are added to the chunk buffer
- When the chunk size is reached, a segment is created and added to SEG-LTANN-MEM
- This new algorithm is called Sub-Tasking NSA (ST-NSA)

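The sub-tasking decision can be sketched as below, reusing the illustrative SegLtannMem above; the error measure, the threshold value, and the solve_separately helper are assumptions made for the example, not the paper's exact procedure.

```python
import numpy as np

MODEL_ERROR_THRESHOLD = 0.05    # assumed value; would be tuned per problem

def handle_task(task_weights, task_symbols, memory, solve_separately):
    """Illustrative ST-NSA dispatch: accept tasks the memory interpolates well,
    solve poorly interpolated ones as sub-tasks and buffer them for the next chunk."""
    candidates = memory.recall(task_symbols)                  # one proposal per segment
    errors = [float(np.mean((np.asarray(c) - task_weights) ** 2)) for c in candidates]
    if not errors or min(errors) > MODEL_ERROR_THRESHOLD:
        sub_weights = solve_separately(task_symbols)          # train the sub-task on its own
        memory.add_case(sub_weights, task_symbols)            # goes into the chunk buffer
        return sub_weights
    return candidates[int(np.argmin(errors))]
```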

Results

- Solved the Decoder(6-64), Decoder(7-128) and Decoder(8-256) problems efficiently
- The diagram shows how Decoder(6-64) is solved:
  - When the chunk reaches its firing criterion, a new segment is created from the new source problems
  - When the new segment is created, the bit-fail error reaches zero and the problem is solved

Effect of a Big Chunk Size

- When the chunk size is too large to be filled by the new cases, much time is wasted because no new segments emerge
- It is like someone who analyzes for too long, collecting analysis results and continuing to analyze although no new results appear for a long time, and only finally decides to use the collected results to solve the target problem after wasting a lot of time.

Effect of a Small Chunk Size

- A small chunk size creates many segments with low generalization
- This causes a high SEG-LTANN-MEM error
- It is like someone who analyzes a problem and tries to use every single case separately without relating them together: he finds it difficult to compose the big picture from these details, and tests against every single case rather than against a small number of hypotheses or segments.


Results: Effect of a Good Chunk Size

- The problem is solved efficiently
- Three segments were enough to solve the problem

Effect of Initial SEG-LTANN-MEM Size

- Decoder(7-128) problem
- Solved with a transferred history of 5000, 1000, 100 and 10 instances
- Solved using 10 instances with ~98% accuracy

Decoder(8-256) Problem

- Chunk size = 128
- Three segments are enough to solve the problem

Comparison: Single-Objective vs. Multiple-Objective Problems

All the Boolean symbolic regression references we found use single-objective problems (especially the Multiplexer problem)

However, our work is focused on multiple-objective problems, so our main examples are Boolean Decoder problems.

Decoder vs. Multiplexer Complexity in GP

Although the Multiplexer problem has a single output (objective), its number of data inputs grows exponentially with the number of select lines.

However, in the Decoder the number of outputs (objectives) increases exponentially with the number of inputs.

We compared Genetic Programming scalability on the Decoder problem versus the Multiplexer problem

GP was able to solve up to Multiplexer(3+8) and Decoder(3-8) on a normal quad-core machine
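To make the scaling contrast concrete, the snippet below counts inputs, outputs, and truth-table rows for the two families, under our reading of the naming: Multiplexer(k + 2^k) has k select lines, 2^k data inputs and a single output, while Decoder(n - 2^n) has n inputs and 2^n outputs.

```python
def multiplexer_size(select_lines):
    """Multiplexer(k + 2**k): k select lines, 2**k data inputs, 1 output."""
    inputs = select_lines + 2 ** select_lines
    return {"inputs": inputs, "outputs": 1, "rows": 2 ** inputs}

def decoder_size(address_lines):
    """Decoder(n - 2**n): n inputs, 2**n outputs (one objective per output line)."""
    return {"inputs": address_lines,
            "outputs": 2 ** address_lines,
            "rows": 2 ** address_lines}

print(multiplexer_size(3))  # Multiplexer(3+8): 11 inputs, 1 output, 2048 rows
print(decoder_size(8))      # Decoder(8-256): 8 inputs, 256 outputs, 256 rows
```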

Decoder vs. Multiplexer in GP

According to [5], the Multiplexer(4+16) problem is solved using EC-Star in less than 4 hours running on seven eight-core machines. How much time would be required for the Decoder(4-16) problem?

ST-NSA + SEG-LTANN-MEM was able to solve up to the Decoder(8-256) problem

Learning Classifier Systems (LCS)

- A paradigm of rule-based machine learning methods
- Combines a discovery component (typically genetic algorithms) with a learning component (performing either supervised, reinforcement or unsupervised learning)
- XCS [5] (1995) is the best known and best studied LCS algorithm
- The sUpervised Classifier System (UCS) was introduced in 2003 to specialize the XCS algorithm to the task of supervised learning
- ExSTraCS [6] (2012) extends UCS for the purpose of supervised learning in noisy problem domains
- It employs a sort of long-term memory as a form of transfer learning, called attribute tracking, allowing for more efficient learning and the characterization of heterogeneous data patterns (see the sketch below)
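For comparison with SEG-LTANN-MEM, here is a minimal sketch of the attribute-tracking idea in the same illustrative Python style; it is our simplified reading of the mechanism, not the ExSTraCS [6] implementation, and the array sizes and function name are arbitrary assumptions.

```python
import numpy as np

# tracking[i, a] accumulates credit for attribute a on training instance i
tracking = np.zeros((100, 20))   # 100 instances, 20 attributes (arbitrary sizes)

def credit_attributes(tracking, instance_idx, specified_attrs, reward=1.0):
    """When a rule classifies instance `instance_idx` correctly, credit the
    attributes that the rule actually specifies (simplified attribute tracking)."""
    tracking[instance_idx, specified_attrs] += reward
    return tracking

# Example: a correct rule on instance 7 that specified attributes 2 and 5
tracking = credit_attributes(tracking, instance_idx=7, specified_attrs=[2, 5])
```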

Summary

New knowledge can be stored incrementally in SEG-LTANN-MEM during the problem-solving process

With a proper chunk size, these new knowledge increments are enough for solving high-dimensional multi-objective symbolic regression problems

This shows that the approach can overcome the well-known catastrophic forgetting problem of back-propagation

Future Work

How to consolidate segments into a smaller number of segments with better generalization

How to organize a huge number of segments such that segments are selected for the ST-NSA algorithm according to the problem domain

Applying to connectionist inductive learning

Applying it to solving engineering problems

Q & A

References

[1] A. K. Deklel, A. M. Hamdy and E. M. Saad, "Multi-objective symbolic regression using long-term artificial neural network memory (LTANN-MEM) and neural symbolization algorithm (NSA)," Neural Computing and Applications, vol. 27, no. 8, pp. 1-8, 29 July 2016.
[2] A. Newell, Unified Theories of Cognition. Harvard University Press, 1994.
[3] S. Shanmuganathan, "Artificial Neural Network Modelling: An Introduction," in Artificial Neural Network Modelling, Springer International Publishing, 2016, pp. 1-14.
[4] J. K. Kruschke, "Human category learning: implications for backpropagation models," Connection Science, vol. 5, no. 1, 1993.
[5] R. Riolo, W. P. Worzel and M. Kotanchek (eds.), Genetic Programming Theory and Practice XII. Springer International Publishing Switzerland, 2015.