CSCI 6610 Automata and Formal Languages
Spring 2004

Two algorithms for a CFG

These algorithms are not discussed in Sipser, but were covered in CSCI 2670. The "text" referred to is Hopcroft, Motwani and Ullman.

Let G = (V,T,P,S) be an arbitrary CFG.

Def: X is a generating symbol of G iff X is in (V union T) and X =*=> w for some w in T*. (Every terminal is generating, by a derivation of zero steps.)

Def: b is a reachable symbol of G iff S =*=> w for some w containing b (so b may be in V or T).

Def: b is a useful symbol of G iff b occurs in some string of a derivation S =*=> w for some w in T*.

The algorithms given on p. 258 of the text for finding the generating symbols of G and the reachable symbols of G are inductive, but the exact form of induction is not specified. We reformulate those algorithms to make the induction explicit. Problems NT.1 and NT.2(a) of HW set 8 ask for answers in this format.

For the generating symbols, let PI_0 = T, and for each i >= 0 let

    PI_{i+1} = (PI_i union { A in V : there is A --> w in P for some w in PI_i* }).

When PI_{n+1} = PI_n, then PI_n is the set of all generating symbols of G.

For the reachable symbols, let R_0 = {S}, and for each i >= 0 let

    R_{i+1} = (R_i union { symbols of w : there is A --> w in P for some A in R_i }).

When R_{n+1} = R_n, then R_n is the set of all reachable symbols of G.

For example 7.1 of the text ( S --> AB | a, A --> b ) we have

    PI_0 = {a,b}
    PI_1 = {a,b,S,A}
    PI_2 = PI_1

so {a,b,S,A} is the set of generating symbols (B is not generating, since it has no productions), and

    R_0 = {S}
    R_1 = {S,A,B,a}
    R_2 = {S,A,B,a,b}
    R_3 = R_2

so {S,A,B,a,b} is the set of reachable symbols.

Note that here the reachable symbols algorithm was applied to the original CFG. As discussed in the text, to eliminate useless symbols one should first find all generating symbols, eliminate any non-generating symbols (and every production containing one) to give a modified CFG G', and then find all reachable symbols in G'.
Removing any non-reachable symbols from G' produces a CFG G'' which is equivalent to G and contains no useless symbols (provided that L(G) is nonempty, so that at least S remains as a useful symbol -- Thm. 7.2 of the text).
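
The two inductions and the order of elimination can be sketched in Python as follows. This is only an illustration under stated assumptions, not code from the text: every symbol is a single character, a production A --> w is stored as the pair (A, w), and the function names (fixed_point, generating_stages, reachable_stages, remove_useless) are made up for this sketch.

```python
def fixed_point(start, step):
    """Return the stages X_0, X_1, ..., X_n, stopping once step adds nothing new."""
    stages = [set(start)]
    while True:
        nxt = stages[-1] | step(stages[-1])
        if nxt == stages[-1]:
            return stages
        stages.append(nxt)


def generating_stages(terminals, productions):
    """PI_0 = T; PI_{i+1} adds each head A having some A --> w with w in PI_i*."""
    def step(pi):
        return {head for head, body in productions if all(x in pi for x in body)}
    return fixed_point(terminals, step)


def reachable_stages(start_symbol, productions):
    """R_0 = {S}; R_{i+1} adds the symbols of w for each A --> w with A in R_i."""
    def step(r):
        return {x for head, body in productions if head in r for x in body}
    return fixed_point({start_symbol}, step)


def remove_useless(variables, terminals, productions, start_symbol):
    """Generating symbols first (giving G'), then reachable symbols of G' (giving G'')."""
    gen = generating_stages(terminals, productions)[-1]
    # G': drop every production that mentions a non-generating symbol.
    p1 = [(h, b) for h, b in productions if h in gen and all(x in gen for x in b)]
    reach = reachable_stages(start_symbol, p1)[-1]
    # G'': drop every production whose head is not reachable in G'.
    p2 = [(h, b) for h, b in p1 if h in reach]
    return variables & reach, terminals & reach, p2


# Example 7.1 of the text: S --> AB | a, A --> b
V = {"S", "A", "B"}
T = {"a", "b"}
P = [("S", "AB"), ("S", "a"), ("A", "b")]

print(generating_stages(T, P))      # the stages PI_0, PI_1
print(reachable_stages("S", P))     # the stages R_0, R_1, R_2 for the original G
print(remove_useless(V, T, P, "S")) # variables, terminals, productions of G''
```

On example 7.1 the sketch reproduces the stages listed above, and remove_useless leaves only the production S --> a. Running the two steps in the opposite order would leave A and b behind: both are reachable in G, but only through S --> AB, which disappears once the non-generating symbol B is eliminated.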