Learning of Fuzzy Formal Language



98 IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS, JANUARY 1973

SHINICHI TAMURA AND KOKICHI TANAKA

Abstract-A learning model of fuzzy formal language is proposed and discussed. We continue training the learning machine by giving sets of sentences sequentially. As a result of parsing of the given teaching sentences, the learning machine reinforces fuzzy grades of membership of productions in an inherent fuzzy grammar of the machine. The convergence of the proposed model is considered, and it is shown that the grades of membership of desired productions are intensified by choosing an adequate teaching sequence of the sentence set. Furthermore, a concept of "strongly equivalent," in which two grammars are not distinguished by any teaching sequence, is introduced.

Manuscript received April 17, 1972; revised July 28, 1972.
The authors are with the Department of Information and Computer Sciences, Faculty of Engineering Science, Osaka University, Toyonaka.

I. INTRODUCTION

In this correspondence a method of learning a fuzzy language [1] is proposed and considered. Essentially, learning techniques are used when the designing of an optimum machine is impossible because the existing environment is unknown or changeable. On the other hand, the formal language theory was originated by Chomsky [2] as a means of grasping a natural language logically, and it has come to be applied to computer programming languages also. However, at present it cannot be said that the formal language theory is satisfactory for representing a natural language. One of the main reasons it is insufficient is that the system of the formal language is very logical and inflexible, while the system of a natural language is fairly flexible and adaptable, in particular as regards the learning faculty of human beings. Therefore, we will consider as a model of the language system of humans a fuzzy language with a learning faculty which is an extension of an ordinary formal language.

II. FUZZY LANGUAGE

The fuzzy set was proposed by Zadeh [3] in 1965 to represent the fuzziness of thought of human beings. Since then, the fuzzy theory has been applied to automata [4], [5], pattern recognition [6], formal language [1], [7], semantics [8], control [4], etc. Since the language of human beings is fuzzy, it may be appropriate to apply a fuzzy language to language learning. Our notation, terminology, and reasoning parallel closely the presentation of the theory of formal language in Hopcroft and Ullman [9] and fuzzy language in Lee and Zadeh [1].

A fuzzy grammar is a quadruple $G = (V_N, V_T, P, S)$ in which $V_N$ is a set of nonterminals, $V_T$ is a set of terminals ($V_N \cap V_T = \emptyset$), $P$ is a set of fuzzy productions, and $S$ ($\in V_N$) is a sentence symbol. The elements of $P$ are expressions in the form

$$\mu(\alpha \to \beta) = \rho, \qquad \rho \in [0,1] \tag{1}$$

where $\alpha$ and $\beta$ are strings in $(V_N \cup V_T)^*$ and $\rho$ is the grade of membership of $\beta$ given $\alpha$. Where convenient, we shall abbreviate $\mu(\alpha \to \beta) = \rho$ to $\alpha \xrightarrow{\rho} \beta$, or, more simply, $\alpha \to \beta$. When we are interested not in the grade of membership but in producing $\beta$ from $\alpha$, we call $\alpha \to \beta$ the production rule. As in the case of nonfuzzy grammars, the expression $\alpha \to \beta$ represents a rewriting rule. Thus, if $\alpha \xrightarrow{\rho} \beta$ and $\gamma$, $\delta$ are arbitrary strings in $(V_N \cup V_T)^*$, then

$$\gamma\alpha\delta \overset{\rho}{\Rightarrow} \gamma\beta\delta$$

where $\rho$ is the grade of membership of $\gamma\beta\delta$ given $\gamma\alpha\delta$.

A fuzzy grammar $G$ generates a fuzzy language $L(G)$ in the following manner. A string of terminals $x$ is called a sentence and is said to be in $L(G)$, or $x \in L(G)$, if and only if $x$ is derivable from $S$. The grade of membership of $x$ in $L(G)$ is given by

$$\mu_G(x) = \mu(S \overset{*}{\Rightarrow} x) = \sup \min\,[\mu(S \Rightarrow \alpha_1), \mu(\alpha_1 \Rightarrow \alpha_2), \ldots, \mu(\alpha_m \Rightarrow x)] \tag{2}$$

where the supremum is taken over all derivation chains from $S$ to $x$. Thus (2) defines $L(G)$ as a fuzzy set in $V_T^*$.

Paralleling the standard classification of nonfuzzy grammars, we have four principal types of fuzzy grammars. However, we are interested in only recursive grammars [1], whose derived languages can always be parsed, i.e., context-sensitive, context-free, and regular grammars.

Type 1 Grammar-Context Sensitive (CS): The productions are of the form $\alpha_1 A \alpha_2 \xrightarrow{\rho} \alpha_1 \beta \alpha_2$, $\rho \in [0,1]$, with $\alpha_1$, $\alpha_2$, and $\beta$ in


$(V_N \cup V_T)^*$, $A$ in $V_N$, and $\beta \neq \varepsilon$. In addition, the production $S \to \varepsilon$ is allowed.

Type 2 Grammar-Context Free (CF): The allowable productions are of the form $A \xrightarrow{\rho} \beta$, $\rho \in [0,1]$, $A \in V_N$, $\beta \in (V_N \cup V_T)^*$, $\beta \neq \varepsilon$, and $S \to \varepsilon$.

Type 3 Grammar-Regular: The allowable productions are of the form $A \xrightarrow{\rho} aB$ or $A \xrightarrow{\rho} a$, $\rho \in [0,1]$, where $a \in V_T$, $A, B \in V_N$, and $S \to \varepsilon$ is allowed.

III. LEARNING FORM

Let us assume that in the initial stage of learning a fuzzy grammar $G_1 = (V_N, V_T, P_1, S)$ is given to the learning machine, and the learning is advanced by sequentially updating the grades of membership of the productions in $P_1$. Therefore, it is necessary to choose $V_N$, $V_T$, and $P_1$ beforehand to cover a sufficient range, or to give adequate $V_N$, $V_T$, and $P_1$ by initial stage training. Furthermore, let the set of production rules contained in $P_1$ be $R$, i.e., $R$ is $P_1$ from which all the grades of membership of the productions are removed. The set $R$ may contain clearly grammatically incorrect or improper production rules. Such rules will be degraded or removed by the learning.

At time $n$, the machine has a fuzzy grammar $G_n = (V_N, V_T, P_n, S)$, where

$$P_n = \{\mu_n(u \to v) \mid (u \to v) \in R\}.$$

For the purpose of training, a finite set

$$K_n = \{x_{ni} \mid i = 1, 2, \ldots, N_n\}$$

of sentences is given to the machine, where $x_{ni} \in L(G_1)$. The sequence of $K_n$ is called the teaching sequence. Likening the present learning to the learning of human children, this teaching sequence may correspond to conversations of the adults around. Then the machine parses with respect to each sentence $x_{ni}$. This parsing is possible since $G_1$ is recursive. Let the set of all production rules that have a possibility of being used to generate $x_{ni}$ be $Q(x_{ni})$, and the one to generate the set $K_n$ be $Q(K_n)$. Obviously, we have $Q(K_n) \subseteq R$ and $Q(K_n) = \bigcup_{i=1}^{N_n} Q(x_{ni})$. Furthermore, let a characteristic function of the set $Q(K_n)$ be $\theta_n(\cdot)$; i.e., we have $\theta_n(r) = 1$ for $r \in Q(K_n)$, and $\theta_n(r) = 0$ for $r \notin Q(K_n)$.

For simplicity, let us take the following linear learning form [10] as the learning form:

$$\mu_{n+1}(u \to v) = \alpha\,\mu_n(u \to v) + (1 - \alpha)\,\theta_n(u \to v), \qquad 0 < \alpha < 1 \tag{3}$$

Obeying this learning form, the grades of membership of the production rules that are used many times become high and vice versa. The proposed learning model of fuzzy language is illustrated in Fig. 1.

[Fig. 1. Learning model of fuzzy language. Block diagram with the labels $K_n$, parsing, $Q(K_n)$, reinforcement, and $L(G_n)$.]

Let a nonfuzzy grammar be $G = (V_N, V_T, R_n, S)$, where $R_n$ is a set of production rules. Since $V_N$, $V_T$, and $S$ are not changed by the learning, we can have an abbreviated form as follows:

$$L(G) = L(V_N, V_T, R_n, S) \triangleq L(R_n). \tag{4}$$

When the grammar is ambiguous, that is, when some sentence has two or more derivation chains, we may have the following two learning methods:

1) the learning method in which all the production rules that can be used in some derivation are intensified;
2) the learning method in which the correct derivation is directed by an external supervisor [11] and only the correct production rules are intensified.

Since method 2) becomes similar to method 1) in the case where the grammar is unambiguous, only method 1) is considered here. We show in Section IV that even if the grammar is ambiguous, we can intensify only specified rules by choosing an adequate teaching sequence.

In the learning machine just described, since the grammar is controlled by the learning rule, it can be regarded as a sequence-dependent time-varying grammar. Furthermore, since the grammar part of the machine is equivalent to an ordinary automaton, and the learning part deals with the analog values, this learning machine can be regarded [...]

IV. LEARNING PROCESS

In this section the learning process of the proposed learning machine is considered. Let

$$L_\lambda(G_n) = \{x \mid \mu_{G_n}(x) \ge \lambda\} = \{x \mid \mu_n(S \overset{*}{\Rightarrow} x) \ge \lambda\} \tag{5}$$

then $L_\lambda(G_n)$ becomes an ordinary nonfuzzy set. Furthermore, by choosing $\lambda$ and $\{K_i \mid i = 1, 2, \ldots, n-1\}$ adequately, $L_\lambda(G_n)$ can be made close to a desired set.

Assume $R_{n\lambda} = \{r \mid r \in R,\ \mu_n(r) \ge \lambda\}$; then obviously [7]

$$L(R_{n\lambda}) = L_\lambda(G_n). \tag{6}$$

If the machine is given only grammatically correct sentences, $R_{n\lambda}$ will contain only the grammatically correct production rules.

Theorem 1: If $\forall n$, $K_n = K$, then $\forall \lambda \in (0,1)$, there exists $N(\lambda)$ such that $\forall n > N(\lambda)$, $L_\lambda(G_n) = L[Q(K)]$. Furthermore,

a) $\forall n$, $L_0(G_n) = L_0(G_1) = L(R)$;
b) $\forall \lambda, n$, $L_\lambda(G_n) \subseteq L_0(G_1)$;
c) $K \subseteq L[Q(K)] \subseteq L_0(G_1)$.

Proof: a) is obvious from the fact that $\forall n$, $r \in R$, $\mu_n(r) \ge 0$. b) is obvious from $L_\lambda(G_n) \subseteq L_0(G_n)$ and a). c) is obvious from the definition of $Q(K)$. Let us prove the first half of the theorem in the following manner. From (3), we have

$$\mu_{n+1}(r) = \alpha^n \mu_1(r) + (1 - \alpha) \sum_{i=0}^{n-1} \alpha^i\, \theta_{n-i}(r). \tag{7}$$

Assume that $r \in Q(K)$; then, since $\theta_n(r) = 1$, $n = 1, 2, \ldots$, we have

$$\mu_{n+1}(r) = \alpha^n \mu_1(r) + 1 - \alpha^n. \tag{8}$$

Therefore, $\forall \lambda$, $\exists N'(\lambda)$, $\forall n > N'(\lambda)$, we have $\mu_n(r) \ge \lambda$, i.e., $\lim_{n \to \infty} \mu_n(r) = 1$. Similarly, if $r \notin Q(K)$, then $\forall \lambda$, $\exists N''(\lambda)$, $\forall n > N''(\lambda)$, we have $\mu_n(r) < \lambda$, i.e., $\lim_{n \to \infty} \mu_n(r) = 0$. Let $N(\lambda) = \max\,[N'(\lambda), N''(\lambda)]$; then $\forall n > N(\lambda)$, $R_{n\lambda} = Q(K)$. Using (6), the theorem is proved. Q.E.D.

Example 1: Suppose that $G_1 = (V_N, V_T, P_1, S)$, where $V_N = \{S\}$, $V_T = \{a, b, c\}$, $P_1$ is given by $\mu_1(S \to c) = 0.5$, $\mu_1(S \to aSb) = 0.5$, $\mu_1(S \to abS) = 0.5$, $\alpha = 0.8$, and $K = \{a^2cb^2\}$. The learning process is shown in Table I. It is easy to see that

$$Q(K) = \{S \to c,\ S \to aSb\}.$$
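As an illustration only (the correspondence gives no program), the linear learning form (3) applied to Example 1 can be sketched in Python. The function names `update_grades` and `run_teaching` and the string labels for the rules are hypothetical; the set $Q(K)$ is entered by hand from the text rather than obtained by parsing.

```python
def update_grades(grades, used_rules, alpha):
    """One step of learning form (3):
    mu_{n+1}(r) = alpha * mu_n(r) + (1 - alpha) * theta_n(r),
    where theta_n(r) is 1 if r is in Q(K_n) and 0 otherwise."""
    return {
        rule: alpha * mu + (1.0 - alpha) * (1.0 if rule in used_rules else 0.0)
        for rule, mu in grades.items()
    }


def run_teaching(initial_grades, teaching_rule_sets, alpha):
    """Apply update_grades once per element of the teaching sequence.
    Each element is already the rule set Q(K_n) (assumed known here)."""
    history = [dict(initial_grades)]
    for used in teaching_rule_sets:
        history.append(update_grades(history[-1], used, alpha))
    return history


# Example 1: all grades start at 0.5, alpha = 0.8, and K_n = K for every n,
# so Q(K_n) = {S -> c, S -> aSb} at every step.
mu_1 = {"S->c": 0.5, "S->aSb": 0.5, "S->abS": 0.5}
q_k = {"S->c", "S->aSb"}
history = run_teaching(mu_1, [q_k] * 3, alpha=0.8)

for n, grades in enumerate(history, start=1):
    print(n, {rule: round(mu, 4) for rule, mu in grades.items()})
```

Under these assumptions the grades of $S \to c$ and $S \to aSb$ follow 0.5, 0.6, 0.68, 0.744, while the grade of $S \to abS$ follows 0.5, 0.4, 0.32, 0.256, in agreement with (8) and with Table I.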


TABLE I
LEARNING PROCESS OF EXAMPLE 1

R           $\mu_1$   $\mu_2$   $\mu_3$   $\mu_4$
S -> c      0.5       0.6       0.68      0.744
S -> aSb    0.5       0.6       0.68      0.744
S -> abS    0.5       0.4       0.32      0.256

TABLE II
LEARNING PROCESS OF EXAMPLE 2

R       $\mu_1$   $\mu_2$   $\mu_3$   $\mu_4$   $\mu_5$   $n \to \infty$
$R_1$   0.5       0.6       0.48      0.584     0.4672    4/9, 5/9
$R_2$   0.5       0.6       0.68      0.744     0.7952    1
$R_3$   0.5       0.4       0.52      0.416     0.5328    5/9, 4/9

In Example 1, for a threshold $\lambda$ with $0.68 < \lambda \le 0.744$, we have

$$L_\lambda(G_n) = \begin{cases} \emptyset, & n = 1, 2, 3 \\ \{a^k c b^k \mid k = 0, 1, \ldots\} = L[Q(K)], & n = 4, 5, \ldots \end{cases}$$

$$L(R) = \{a^i (ab)^j c b^i,\ (ab)^k a^i (ab)^j c b^i,\ \ldots \mid i, j, k = 0, 1, 2, \ldots\} = L_0(G_n), \qquad n = 1, 2, \ldots.$$

As seen from Example 1, the union of two sets of production rules generates a larger language than the union of the languages generated by the individual sets of production rules.

Definition 1: Let the production rules or their sets be $R_1$ and $R_2$. If any term in the left-hand side of the production rule in $R_2$ is not contained in the right-hand side of $R_1$, and any of the terms in the left-hand side of $R_1$ is not contained in the right-hand side of $R_2$, then $R_1$ and $R_2$ are said to be disjoint.

Lemma 1: $L[Q(\{x, y\})] \supseteq L[Q(x)] \cup L[Q(y)]$. Here, in the fuzzy CF grammar, if $Q(x)$ and $Q(y)$ are disjoint, then the equal sign is valid.

Proof: From $Q(\{x, y\}) \supseteq Q(x)$ and $Q(\{x, y\}) \supseteq Q(y)$, we have $L[Q(\{x, y\})] \supseteq L[Q(x)]$ and $L[Q(\{x, y\})] \supseteq L[Q(y)]$. Thus the first half of the lemma is proved. The latter half of the lemma is obvious. Q.E.D.

This lemma is illustrated in Fig. 2. Note that when $x$ and $y$ are sets of sentences, Lemma 1 is also valid.

[Fig. 2. Illustration of Lemma 1. Regions labeled $L[Q(\{x, y\})]$, $L[Q(x)]$, and $L[Q(y)]$.]

As mentioned in Section III, we will show in Example 2 that even if the grammar is ambiguous in the nonsupervised case, by choosing an adequate teaching sequence, we can intensify only specified rules.

Example 2: Let

$$V_N = \{S, A_1, B_1, S_2, A_2, B_2, A_3, B_3, C_3\}, \qquad V_T = \{a, b, c\}$$

$$R_1 = \{\ldots\} \quad \text{(rules over } A_1, B_1\text{; listing illegible in the source)}$$

$$R_2 = \{\ldots,\ A_2B_2 \to S_2\,abac\} \quad \text{(listing partly illegible in the source)}$$

$$R_3 = \{S \to A_3B_3C_3,\ A_3B_3 \to abaca,\ C_3 \to bac\}$$

where the initial grades $\mu_1$ are all 0.5. Let $\alpha = 0.8$, $K_{2n-1} = \{abac\}$, $K_{2n} = \{(abac)^2\}$, $n = 1, 2, \ldots$ (the order that reproduces Table II); then

$$L(R_1) = \{abac\}$$
$$L(R_2) = \{(abac)^i \mid i = 0, 1, \ldots\}$$
$$L(R_3) = \{(abac)^2\}$$
$$L(R_1 \cup R_2 \cup R_3) = L(R_2).$$

The learning process is shown in Table II. For the same range of the threshold, $0.68 < \lambda \le 0.744$, we see

$$L_\lambda(G_n) = \begin{cases} \emptyset, & n = 1, 2, 3 \\ L(R_2), & n = 4, 5, \ldots. \end{cases}$$

In Example 2, the left-hand sides of the production rules in $R_2$ and $R_3$ have forms of $A_2B_2$ and $A_3B_3$. In Theorem 3, it will be shown that, similar to the nonfuzzy case, a grammar such as that in Example 2 is equivalent to the fuzzy CS grammar.

V. EQUIVALENCY

For two fuzzy grammars $G_a$ and $G_b$, if $L(G_a) = L(G_b)$ in the sense of equality of fuzzy sets, then the grammars $G_a$ and $G_b$ are said to be weakly equivalent or simply equivalent. On the other hand, it is well known that if merely two nonfuzzy CF grammars $G_a'$ and $G_b'$ are given, it is recursively unsolvable whether the languages generated by $G_a'$ and $G_b'$ are identical or not. Therefore, the same is true for the fuzzy CF grammars with respect to weak equivalency. Furthermore, even if two fuzzy grammars $G_{an}$ and $G_{bn}$ are weakly equivalent at time $n$, as the learning advances they generally cease to be equivalent.

Example 3: Let

$$G_{an} = (V_N, V_T, P_{an}, S), \qquad G_{bn} = (V_N, V_T, P_{bn}, S)$$
$$P_{an} = \{\mu_n(S \to a) = 0.5,\ \mu_n(S \to b) = 0.8\}$$
$$P_{bn} = \{\mu_n(S \to A) = 0.8,\ \mu_n(A \to a) = 0.5,\ \mu_n(A \to b) = 1.0\}.$$

Then

$$L(G_{an}) = L(G_{bn}) = \{\mu(a) = 0.5,\ \mu(b) = 0.8\}.$$

If we take $K_n = \{a\}$, $\alpha = 0.8$, we have

$$L(G_{a(n+1)}) = \{\mu(a) = 0.6,\ \mu(b) = 0.64\}$$
$$L(G_{b(n+1)}) = \{\mu(a) = 0.6,\ \mu(b) = 0.80\}.$$

Therefore,

$$L(G_{a(n+1)}) \neq L(G_{b(n+1)}).$$

Therefore, we use the following definitions.
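The numbers in Example 3 can be checked with a small sketch that evaluates the sup-min grade (2) by exhaustive leftmost derivation and then applies one step of (3). This is a sketch under stated assumptions, not the authors' procedure: the names `membership`, `reinforce`, and `language` are hypothetical, nonterminals are assumed to be single uppercase letters, the search depth is bounded by `max_steps`, and the sets $Q(\{a\})$ for the two grammars are entered by hand.

```python
def membership(grades, start, sentence, max_steps=8):
    """Sup-min grade of `sentence` as in (2), by bounded leftmost derivation.
    `grades` maps (lhs, rhs) -> grade of membership of the production."""
    best = 0.0
    stack = [(start, 1.0, 0)]  # (sentential form, min grade so far, steps)
    while stack:
        form, grade, steps = stack.pop()
        if form == sentence:
            best = max(best, grade)
            continue
        # The min over a chain can only decrease, so weaker branches are cut.
        if steps == max_steps or grade <= best:
            continue
        idx = next((i for i, ch in enumerate(form) if ch.isupper()), None)
        if idx is None:
            continue  # all terminals, but not the target sentence
        for (lhs, rhs), mu in grades.items():
            if lhs == form[idx]:
                new_form = form[:idx] + rhs + form[idx + 1:]
                stack.append((new_form, min(grade, mu), steps + 1))
    return best


def reinforce(grades, used, alpha):
    """One step of the linear learning form (3)."""
    return {r: alpha * mu + (1 - alpha) * (1.0 if r in used else 0.0)
            for r, mu in grades.items()}


def language(grades):
    """Grades of the two candidate sentences a and b, rounded for display."""
    return {x: round(membership(grades, "S", x), 4) for x in ("a", "b")}


p_a = {("S", "a"): 0.5, ("S", "b"): 0.8}
p_b = {("S", "A"): 0.8, ("A", "a"): 0.5, ("A", "b"): 1.0}
q_a = {("S", "a")}               # Q({a}) in G_a (entered by hand)
q_b = {("S", "A"), ("A", "a")}   # Q({a}) in G_b (entered by hand)

lang_a, lang_b = language(p_a), language(p_b)
next_a = language(reinforce(p_a, q_a, 0.8))
next_b = language(reinforce(p_b, q_b, 0.8))
print(lang_a, lang_b)
print(next_a, next_b)
```

Before the update both grammars give $\mu(a) = 0.5$, $\mu(b) = 0.8$; after it, $G_a$ gives $\mu(b) = 0.64$ whereas $G_b$ keeps $\mu(b) = 0.8$, because in $G_b$ the shared rule $S \to A$ is reinforced by the sentence $a$.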


Definition 2: Assume that at time 1, $L(G_{a1}) = L(G_{b1})$. If, for any teaching sequence, $L(G_{an}) = L(G_{bn})$, $n = 1, 2, \ldots$, so far as the learning method in Section III is concerned, the fuzzy grammars $G_{a1}$ and $G_{b1}$ are said to be strongly equivalent. Symbolically, this is expressed $G_{a1} \equiv G_{b1}$.

Definition 3: If there exists $K$ such that $R = Q(K)$, the existing fuzzy grammar $G$ is said to be a reduced form.

For the fuzzy CS grammar, it is recursively unsolvable whether $K$ exists in Definition 3 or not. For the fuzzy CF grammar, it is recursively solvable. Furthermore, for the fuzzy CF grammar, the minimum value of the number of elements in $K$ is smaller than or equal to the number of elements in $R$.

Theorem 2: Given any fuzzy CF grammar $G$ whose $L_0(G)$ is nonempty, it is possible to find a reduced fuzzy CF grammar $G'$ such that $G \equiv G'$.

Proof: The proof is obvious. Q.E.D.

Theorem 2 states that even if unnecessary productions are removed, the strong equivalency is preserved.

Example 4: Let

$$G = (V_N, V_T, P, S), \qquad V_N = \{S, A, B\}, \qquad V_T = \{a, b, c\}, \qquad P = \{\pi_1, \pi_2, \ldots, \pi_6\}$$

where

$$\pi_1: S \to aSb \qquad \pi_2: S \to ab \qquad \pi_3: S \to c$$
$$\pi_4: S \to \varepsilon \qquad \pi_5: S \to A \qquad \pi_6: B \to b.$$

This $G$ is not the reduced form. The reduced fuzzy CF grammar that is strongly equivalent to this $G$ is

$$G' = (\{S\}, V_T, \{\pi_1, \pi_2, \pi_3, \pi_4\}, S).$$

Here, let the production rule that corresponds to $\pi_i$ be $r_i$; then we have

$$Q(a^3b^3) = \{r_1, r_2, r_4\}, \qquad Q(acb) = \{r_1, r_3\}.$$

Therefore, an example of $K$ is

$$K = \{a^3b^3,\ acb\}, \qquad Q(K) = R = \{r_1, r_2, r_3, r_4\}.$$

On the other hand, as seen from Example 3, if a production rule that produces a nonterminal from a nonterminal, i.e., a production rule of the form $A \xrightarrow{\rho} B$, $A, B \in V_N$, is removed, the strong equivalency is no longer preserved. Therefore, though the Chomsky normal form and the Greibach normal form [1] for some fuzzy CF grammar $G$ are both weakly equivalent to $G$, generally, they are not strongly equivalent to $G$.

Generally speaking, even if one production is divided into some productions that are perfectly independent of other productions, the strong equivalency is preserved. A special case of this will be stated in Lemma 2.

Example 5: In Example 1, if we change $V_N$ for $V_N' = \{S, A, B\}$ and change the production $S \to aSb$ for

a) [...] (illegible in the source)
b) $S \to ASB$, $A \to a$, $B \to b$,

then the new grammars for a) and b) are both strongly equivalent to $G_1$.

Denoting the length of $x$ by $|x|$, we obtain the following lemma, which is similar to the nonfuzzy grammar case.

Lemma 2: Let a fuzzy CS grammar $G = (V_N, V_T, P, S)$ to which the production $u \xrightarrow{\rho} v$, where $|u| \le |v|$, $u \in V_N^* - \{\varepsilon\}$, $v \in (V_N \cup V_T)^* - \{\varepsilon\}$, $\rho \in [0,1]$, is added be fuzzy grammar $G'$. Then we can find a fuzzy CS grammar $G''$ such that $G' \equiv G''$.

Proof: Let $u = A_1A_2 \cdots A_m$ and $v = B_1B_2 \cdots B_k$, where $A_i \in V_N$, $B_i \in V_N \cup V_T$, $1 \le m \le k$. Here, introduce new nonterminals $C_1, C_2, \ldots, C_k$ that are mutually different and not contained in $V_N$, and replace the production $u \xrightarrow{\rho} v$ by the following productions:

$$
\begin{aligned}
A_1A_2 \cdots A_m &\xrightarrow{\rho_1} C_1C_2 \cdots C_{k-m+1}A_2 \cdots A_m \\
C_1C_2 \cdots C_{k-m+1}A_2 \cdots A_m &\xrightarrow{\rho_2} C_1C_2 \cdots C_{k-m+1}C_{k-m+2}A_3 \cdots A_m \\
&\ \ \vdots \\
C_1C_2 \cdots C_{k-1}A_m &\xrightarrow{\rho_m} C_1C_2 \cdots C_k \\
C_1C_2 \cdots C_k &\xrightarrow{\rho_{m+1}} B_1C_2C_3 \cdots C_k \\
&\ \ \vdots \\
B_1B_2 \cdots B_{k-1}C_k &\xrightarrow{\rho_{m+k}} B_1B_2 \cdots B_k
\end{aligned}
$$

where $\rho = \min(\rho_1, \rho_2, \ldots, \rho_{m+k})$. Let the set of productions for the above replacement be $P'$; then

$$G'' = (V_N \cup \{C_1, C_2, \ldots, C_k\}, V_T, P \cup P', S)$$

is obviously strongly equivalent to the original fuzzy grammar $G'$. Q.E.D.

Lemma 3: Let $G = (V_N, V_T, P, S)$ be a fuzzy CS grammar. Then we can find a fuzzy grammar $G' = (V_N', V_T, P', S)$ such that $G' \equiv G$, in which all the productions are of the form $\alpha_1 A \alpha_2 \xrightarrow{\rho} \alpha_1 \beta \alpha_2$, $A \xrightarrow{\rho} a$, or $S \to \varepsilon$, where $\alpha_1, \alpha_2 \in V_N'^*$, $\beta \in V_N'^* - \{\varepsilon\}$, $A \in V_N'$, $a \in V_T$.

Proof: Let $P = \{\pi_1, \pi_2, \ldots, \pi_k\}$. For each $\pi_i$, let $\bar{\pi}_i$ be a production that is made by replacing each terminal $a$ in $\pi_i$ with an abstract symbol $\bar{a}$ (a new symbol that is not contained in $V_N \cup V_T$). If the production $\pi_i$ contains no terminal, $\bar{\pi}_i = \pi_i$. Let

$$V_N' = V_N \cup \{\bar{a} \mid a \in V_T\}$$
$$P' = \{\bar{\pi}_i \mid \pi_i \in P\} \cup \{\bar{a} \to a \mid a \in V_T\}$$

then we obtain the fuzzy grammar $G' = (V_N', V_T, P', S)$. Let the production rule corresponding to $\bar{\pi}_i$ that contains $\bar{a}$ be $\bar{r}_{ia}$. If $\bar{r}_{ia} \in Q(x)$ with respect to $G'$, then $(\bar{a} \to a) \in Q(x)$; i.e., the production rule $\bar{a} \to a$ is used whenever $\bar{r}_{ia}$ is used. Therefore, for any teaching sequence, $\mu_n(\bar{r}_{ia}) \le \mu_n(\bar{a} \to a)$, $n = 1, 2, \ldots$. Then, denoting the production rules constructing $P$ by $R$, we have, for $x \in L(R)$,

$$\mu_{G_n}(x) = \mu_{G_n'}(x), \qquad n = 1, 2, \ldots.$$

This is $G' \equiv G$. Q.E.D.

Applying Lemmas 2 and 3, we can prove the following theorem, which is similar to the nonfuzzy grammar case.

Theorem 3: For any fuzzy CS grammar $G$, we can find a fuzzy grammar $G'$ such that $G \equiv G'$ and all the production rules of $G'$ are of the form $u \to v$ or $S \to \varepsilon$, where $|u| \le |v|$, $u \in V_N^* - \{\varepsilon\}$, $v \in (V_N \cup V_T)^* - \{\varepsilon\}$. The converse is also valid.

Proof: We can find $G'$ of Lemma 3 for any fuzzy CS grammar $G$. This $G'$ satisfies the conditions of $G'$ in Theorem 3. The converse is obvious from Lemma 2. Q.E.D.

As a matter of course, Theorem 3 is also valid for the weak equivalency.
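The rule sets $Q(x)$ quoted in Example 4, and hence the test of Definition 3 for a given $K$, can be reproduced for a small CF grammar by enumerating leftmost derivations. The following is only a sketch: the function name `rules_used`, the rule labels `r1` to `r6`, the convention that nonterminals are uppercase letters, and the length bound `slack` are all assumptions introduced here, not part of the correspondence.

```python
from collections import deque


def rules_used(rules, start, sentence, slack=2):
    """Return Q(sentence): the names of all rules that occur in at least one
    leftmost derivation start =>* sentence. `rules` maps name -> (lhs, rhs)."""

    def viable(form):
        # Prune forms that can no longer yield `sentence` within the bound.
        terminals = sum(ch.islower() for ch in form)
        if terminals > len(sentence) or len(form) > len(sentence) + slack:
            return False
        cut = next((k for k, ch in enumerate(form) if ch.isupper()), len(form))
        return sentence.startswith(form[:cut])

    edges = []            # (form, rule name, new form)
    seen = {start}
    queue = deque([start])
    while queue:
        form = queue.popleft()
        idx = next((k for k, ch in enumerate(form) if ch.isupper()), None)
        if idx is None:
            continue
        for name, (lhs, rhs) in rules.items():
            if lhs != form[idx]:
                continue
            new_form = form[:idx] + rhs + form[idx + 1:]
            if not viable(new_form):
                continue
            edges.append((form, name, new_form))
            if new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)

    # Backward pass: keep only forms from which `sentence` is reachable.
    good = {sentence} if sentence in seen else set()
    changed = True
    while changed:
        changed = False
        for src, _, dst in edges:
            if dst in good and src not in good:
                good.add(src)
                changed = True
    return {name for _, name, dst in edges if dst in good}


# Example 4 (rule labels r1..r6 correspond to pi_1..pi_6).
rules = {
    "r1": ("S", "aSb"), "r2": ("S", "ab"), "r3": ("S", "c"),
    "r4": ("S", ""), "r5": ("S", "A"), "r6": ("B", "b"),
}
q1 = rules_used(rules, "S", "aaabbb")
q2 = rules_used(rules, "S", "acb")
print(sorted(q1), sorted(q2), sorted(q1 | q2))
```

With these inputs the sketch returns $\{r_1, r_2, r_4\}$ for $a^3b^3$ and $\{r_1, r_3\}$ for $acb$; their union omits $r_5$ and $r_6$, which is consistent with $G$ of Example 4 not being in reduced form. Restricting the search to leftmost derivations is adequate for CF grammars, since every derivation has a leftmost counterpart using the same rules.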


VI. CONCLUSION

[...] considered. [...] The fuzzy language may be suitable for the base language in the learning system. As is true for many other learning systems, the problem has been considered under the assumption that the outer frame of the contents of the learning had been given. Considering the fact that the number of brain cells of a human being is finite and is not increased by learning, we could say that this assumption is unnatural. However, it may be possible and interesting to expand our learning system to the so-called self-organizing system in which the initial grammar $G_1$ is extracted from the initial stage of teaching sentences.

In this correspondence, the convergence of the proposed model has been considered, and it has been shown that by choosing an adequate teaching sequence, only the specified productions are intensified. For the present, however, generally speaking, it is not clear what teaching sequence is adequate. Furthermore, a concept of strong equivalence, in which two grammars cannot be discerned from the outside of the machines by any teaching sequence, has been introduced. This strong equivalency may be viewed as showing the structural character of production rules of formal language apart from the learning.

ACKNOWLEDGMENT

The authors wish to thank Mr. Mutsumi Sato, Dr. Masaharu Mizumoto, and other members of Prof. Tanaka's laboratory for their helpful discussions.

REFERENCES

[1] E. T. Lee and L. A. Zadeh, "Note on fuzzy languages," Inform. Sci.
[2] N. Chomsky, "Three models for the description of language," IRE Trans. Inform. Theory, vol. IT-2, pp. 113-124, 1956.
[3] L. A. Zadeh, "Fuzzy sets," Inform. Contr., vol. 8, pp. 338-353, June 1965.
[4] W. G. Wee and K. S. Fu, "A formulation of fuzzy automata and its application as a model of learning systems," IEEE Trans. Syst. Sci. Cybern., vol. SSC-5, pp. 215-223, July 1969.
[5] M. Mizumoto, J. Toyoda, and K. Tanaka, "Some considerations on fuzzy automata," J. Comput. Syst. Sci., vol. 3, pp. 409-422, Nov. 1969.
[6] S. Tamura, S. Higuchi, and K. Tanaka, "Pattern classification based on fuzzy relations," IEEE Trans. Syst., Man, Cybern., vol. SMC-1, pp. 61-66, Jan. 1971.
[7] M. Mizumoto, J. Toyoda, and K. Tanaka, "Fuzzy languages," Trans. Inst. Elec. Commun. Eng. Japan (C) (in Japanese), vol. 53-C, pp. 333-340, May 1970.
[8] L. A. Zadeh, "Quantitative fuzzy semantics," Inform. Sci., vol. 3, Apr. 1971.
[9] J. E. Hopcroft and J. D. Ullman, Formal Languages and Their Relation to Automata. Reading, Mass.: Addison-Wesley, 1969.
[10] R. R. Bush and W. K. Estes, Eds., Studies in Mathematical Learning Theory. Stanford, Calif.: Stanford Univ. Press, 1959.
[11] J. Spragins, "Learning without a teacher," IEEE Trans. Inform. Theory, vol. IT-12, pp. 223-230, Apr. 1966.