Monday, March 27, 2017

Combining the Expression and Hierarchy Neural Nets

The next move is to combine the hierarchy and expression parser neural nets. For this I make a hierarchy where "move" and "bought" are types of "infix", "to" is a type of "preposition", and "a" and "the" are types of "article".



The expression parser is then set up with the "infix", "preposition" and "article" types, where the priorities are "infix" < "preposition" < "article". I made a new class called Chain that takes the output of the hierarchy neural net and uses it as the input to the expression neural net. The main change was adding code to specify which nodes are outputs from the hierarchy and which nodes are inputs to the expression neural net. That is pretty much all I did.

The expression parser is trained with this

priorities = [
  ('preposition', 'article'),
  ('infix', 'preposition'),
  ('constant', 'infix'),
  ('0', 'constant')
]

The hierarchy neural net is trained with this

words = [
  ('constant', 'constant'),
  ('to', 'preposition'),
  ('from', 'preposition'),
  ('move', 'infix'),
  ('bought', 'infix'),
  ('a', 'article'),
  ('the', 'article'),

  # top off the loop
  ('article', 'article'),
  ('preposition', 'preposition'),
  ('infix', 'infix')
]

The setup is done like so

one_hot_spec = ['constant', 'preposition', 'infix', 'article', '0']

hierarchy_model = HierarchyModel(words, 'constant', one_hot_spec)
parser_model = ParserModel(priorities, 'constant', one_hot_spec)

chain_model = ChainModel(hierarchy_model, parser_model)
chain_model.train(session)
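The chaining idea itself can be sketched without any neural nets. In the sketch below the hierarchy stage is replaced by a plain dictionary lookup, and Chain simply pipes each stage's output into the next; all names besides the word types are my own stand-ins, not the actual chain1.py API:

```python
# Hypothetical stand-ins; the real hierarchy and parser stages are neural nets.
HIERARCHY = {
    'to': 'preposition', 'from': 'preposition',
    'move': 'infix', 'bought': 'infix',
    'a': 'article', 'the': 'article',
}

def classify(tokens):
    # Stage 1: map each word to its type; unknown words are constants.
    return [HIERARCHY.get(token, 'constant') for token in tokens]

class Chain:
    """Pipe the output of each stage in as the input of the next."""
    def __init__(self, *stages):
        self.stages = stages

    def run(self, tokens):
        for stage in self.stages:
            tokens = stage(tokens)
        return tokens

chain = Chain(classify)
print(chain.run(['move', 't1', 'to', 'b2']))
# ['infix', 'constant', 'preposition', 'constant']
```

The real ChainModel does the analogous wiring at the graph level: it connects the hierarchy net's output nodes to the expression net's input nodes.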

Here is some sample usage:

Enter an sentence if you dare: move t1 to b2 dfhj ahadshfa sdhfgfah I bought a car
Input Expression: ['move', 't1', 'to', 'b2', 'dfhj', 'ahadshfa', 'sdhfgfah', 'I', 'bought', 'a', 'car']
Output Expression: {'action': 'move', 'thing': 't1', 'to': 'b2'}
Output Expression: {'buyer': 'I', 'action': 'buy', 'thing': {'thing': 'car', 'determiner': 'a'}}

The source code is in expression5.py, hierarchy3.py and chain1.py, and is invoked by running expression5.py.

Thursday, March 16, 2017

Parsing Natural Languages

I think using grammars to parse language is a dead end. Instead, language should be parsed using a generalization of the models we use to parse arithmetic expressions. Here is an example for the sentence "The man bought a car", using these operator definitions:

the - prefix priority(400)
man - constant
bought - infix priority(200)
a - prefix priority(400)
car - constant
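These definitions are enough to parse the sentence mechanically. Here is a minimal precedence-climbing sketch (my own simplification, not the neural-net version): prefix operators bind the term to their right, infix operators bind both sides, and a higher priority binds tighter.

```python
# Operator table from the definitions above: (kind, priority).
OPERATORS = {
    'the':    ('prefix', 400),
    'a':      ('prefix', 400),
    'bought': ('infix', 200),
}

def parse(tokens, min_priority=0):
    # Words not in the table are constants.
    kind, prio = OPERATORS.get(tokens[0], ('constant', None))
    if kind == 'prefix':
        op = tokens.pop(0)
        left = [op, parse(tokens, prio)]   # prefix op binds the next term
    else:
        left = tokens.pop(0)
    while tokens:
        kind, prio = OPERATORS.get(tokens[0], ('constant', None))
        if kind != 'infix' or prio <= min_priority:
            break                          # next op binds less tightly; stop
        op = tokens.pop(0)
        left = [op, left, parse(tokens, prio)]
    return left

print(parse("the man bought a car".split()))
# ['bought', ['the', 'man'], ['a', 'car']]
```

The determiners (priority 400) grab their nouns first, then "bought" (priority 200) joins the two noun phrases, which is exactly the bracketing a grammar would produce.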

The sentence can then be parsed by applying the operators in the normal way. An interesting generalization is to allow operator evaluations to evaluate to other operators. For example, prepositions evaluate to postfix operators. Consider the sentence "The person from France bought a car".

from - prefix operator priority(300) -> evaluates to a postfix operator of priority(100).

The sentence can now be parsed with "from France" applied to "The person".

Another generalization is to regard verbs as infix operators having named arguments. Consider the sentence "The man bought a car on Saturday". Update our definition of "bought"

bought - infix priority(200) named args: on
on - see from

The sentence can now be parsed with the "on Saturday" phrase being part of the evaluation of the verb "bought". Interestingly, for the sentence "The man on the boat bought a car" with the same definitions, "on the boat" is evaluated as a postfix operator applied to "the man". No extra work.

What about conjunctions? Consider the sentence "The man and woman bought and sold a car and went on a trip". We need two generalizations to handle this. The first is to make a verb into a prefix operator that evaluates into a postfix operator. The second is to make a conjunction into a list terminator whose priority is not a number; instead, it evaluates as soon as the term after the operator and the term before the operator have the same type. The result of the evaluation has the same type. The parsing would look like this:

Start:
"the man and woman bought and sold a car and went on a trip"

The types before and after the "and" are the same.
"the (man and woman) (bought and sold) a car and went on a trip"

Apply determiners
"(the (man and woman)) (bought and sold) (a car) and went on (a trip)"

Apply prefix prepositions
"(the (man and woman)) (bought and sold) (a car) and went (on (a trip))"

Prefix verbs
"(the (man and woman)) ((bought and sold) (a car)) and (went (on (a trip)))"

Apply conjunctions because types match (both are postfix operators of the same priority)
"(the (man and woman)) (((bought and sold) (a car)) and (went (on (a trip))))"

Apply postfix verbs
"((the (man and woman)) (((bought and sold) (a car)) and (went (on (a trip)))))"

This also has a beautiful noise-handling feature that is difficult to obtain with grammar-based parsers. Imagine this as input: "blah blah xhsh the man bought a car tree pick the woman bought a boat aa ashsh1334h". Parsing this would recognize "the man bought a car" and "the woman bought a boat" as phrases, in addition to the other single-word parses. The noise is ignored naturally by the parser. Sweet!

Let's take my neural net expression parser and apply that to this problem. The code is at https://github.com/cooledge/nn/blob/master/expressions/expressions2.py

Here is sample output that converts sentences to JSON.

Enter an sentence if you dare: h sadhfashdf asd joe bought a car shdhf

Input Expression: ['h', 'sadhfashdf', 'asd', 'joe', 'bought', 'a', 'car', 'shdhf']
Output Expression: {'buyer': 'joe', 'action': 'buy', 'thing': {'determiner': 'a', 'thing': 'car'}}

Enter an sentence if you dare: h sadhfashdf asd joe bought a car shdhf sally bought a jeep

Input Expression: ['h', 'sadhfashdf', 'asd', 'joe', 'bought', 'a', 'car', 'shdhf', 'sally', 'bought', 'a', 'jeep']
Output Expression: {'buyer': 'joe', 'action': 'buy', 'thing': {'determiner': 'a', 'thing': 'car'}}
Output Expression: {'buyer': 'sally', 'action': 'buy', 'thing': {'determiner': 'a', 'thing': 'jeep'}}

Enter an sentence if you dare: h sadhfashdf asd joe bought a car shdhf sally bought a jeep adsafhsdhdhd ddd move the tank to france

Input Expression: ['h', 'sadhfashdf', 'asd', 'joe', 'bought', 'a', 'car', 'shdhf', 'sally', 'bought', 'a', 'jeep', 'adsafhsdhdhd', 'ddd', 'move', 'the', 'tank', 'to', 'france']

Output Expression: {'buyer': 'joe', 'action': 'buy', 'thing': {'determiner': 'a', 'thing': 'car'}}
Output Expression: {'buyer': 'sally', 'action': 'buy', 'thing': {'determiner': 'a', 'thing': 'jeep'}}
Output Expression: {'to': 'france', 'action': 'move', 'thing': {'determiner': 'the', 'thing': 'tank'}}

Sunday, March 12, 2017

Parsing Expressions with Neural Nets

This post describes the start of using a neural net to evaluate a mathematical expression, such as "1 + 2 * 3" evaluating to 7. The idea is to take a list of tokens and have the neural net determine which operator to evaluate next. The training data is similar to the hierarchy model's.

priorities = [
  ('+', '*'),
  ('-', '*'),
  ('+', '/'),
  ('-', '/'),
  ('*', '**'),
  ('/', '**'),
  ('**', '0'),
  ('c', '+'),
  ('c', '-'),
  ('0', 'c')
]

The values are (lower_priority_op, higher_priority_op). I am purposely minimizing the data so that more of the intelligence is contained in the model. 'c' is anything that is not an op. '0' just means nothing. This is a work in progress and I thought this was a good place to start.
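Written out, the training set is just one (input, target) pair of one-hot vectors per tuple, with the higher-priority op as the input and the lower-priority op as the target. A sketch of that encoding (numpy only; the model itself is omitted):

```python
import numpy as np

# One-hot order for the seven op classes.
OPS = ['c', '+', '-', '*', '/', '**', '0']

priorities = [
    ('+', '*'), ('-', '*'), ('+', '/'), ('-', '/'),
    ('*', '**'), ('/', '**'), ('**', '0'),
    ('c', '+'), ('c', '-'), ('0', 'c'),
]

def one_hot(op):
    v = np.zeros(len(OPS), dtype=np.float32)
    v[OPS.index(op)] = 1.0
    return v

# Input: the higher-priority op. Target: the lower-priority op.
inputs = np.array([one_hot(hi) for (lo, hi) in priorities])
targets = np.array([one_hot(lo) for (lo, hi) in priorities])

print(inputs.shape, targets.shape)  # (10, 7) (10, 7)
```

Note that some inputs repeat with different targets (e.g. '*' pairs with both '+' and '-'), so the softmax the model learns for an op spreads probability over every op below it.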

I trained a neural net that, given a "higher_priority_op", returns softmax probabilities for the "lower_priority_op" as a one-hot vector. That is easy to do. The next step is to generate a model that takes in a sequence of one-hot vectors corresponding to the input expression and returns a sequence of probabilities that the corresponding element is the next to process. For example, assuming the ops are c, +, -, *, /, **, 0, for the expression "1 + 2 * 3" the inputs will be

[1,0,0,0,0,0,0] [0,1,0,0,0,0,0] [1,0,0,0,0,0,0] [0,0,0,1,0,0,0] [1,0,0,0,0,0,0]

The output target will be

[0,0,0,1,0]

This indicates that the op to process next is the multiply. For this step I am not training a new model; the model is only trained on pairwise op priorities, and its output is used as input to the parse step. Here is an example of how the parse model works. For each input vector, a prediction is calculated using the standard softmax with the trained model:

[0,0,0,0,0,0,1] [1,0,0,0,0,0,1] [0,0,0,0,0,0,1] [1,1,1,0,0,0,1] [0,0,0,0,0,0,1]

The next step is to sum the columns.

[2,1,1,0,0,0,5]

The next step is to subtract this from each input vector and apply a relu, yielding

[0,0,0,0,0,0,0] [0,0,0,0,0,0,0] [0,0,0,0,0,0,0] [0,0,0,1,0,0,0] [0,0,0,0,0,0,0]

Then sum each position's vector

[0,0,0,1,0]

Then select the argmax, and that is the next op to run. The numbers I showed are perfect; the real numbers are more fractional.
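The whole parse step can be reproduced in a few lines of numpy. The LOWER_THAN table below stands in for the trained model's (idealized) predictions; everything else mirrors the column-sum, subtract-and-relu, and argmax steps above:

```python
import numpy as np

# One-hot order: [c, +, -, *, /, **, 0]
OPS = ['c', '+', '-', '*', '/', '**', '0']

# Idealized stand-in for the trained pairwise-priority model:
# for each op, a vector marking which ops have LOWER priority.
LOWER_THAN = {
    'c':  [0, 0, 0, 0, 0, 0, 1],
    '+':  [1, 0, 0, 0, 0, 0, 1],
    '-':  [1, 0, 0, 0, 0, 0, 1],
    '*':  [1, 1, 1, 0, 0, 0, 1],
    '/':  [1, 1, 1, 0, 0, 0, 1],
    '**': [1, 1, 1, 1, 1, 0, 1],
    '0':  [0, 0, 0, 0, 0, 0, 0],
}

def one_hot(op):
    v = np.zeros(len(OPS))
    v[OPS.index(op)] = 1.0
    return v

def next_op_index(tokens):
    # Map each token to its op class ('c' for anything that is not an op).
    ops = [t if t in OPS else 'c' for t in tokens]
    inputs = np.array([one_hot(op) for op in ops])            # (n, 7)
    preds = np.array([LOWER_THAN[op] for op in ops], float)   # (n, 7)
    col_sum = preds.sum(axis=0)            # how often each op class is outranked
    surviving = np.maximum(inputs - col_sum, 0.0)   # relu(input - sum)
    scores = surviving.sum(axis=1)         # one score per token position
    return int(np.argmax(scores))          # position of the next op to run

print(next_op_index(['1', '+', '2', '*', '3']))  # 3 (the '*')
```

Only the op that no other op in the expression outranks survives the relu, so the argmax lands on its position.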

This is a sample of the app running

Enter an arithmetic expression: 1 + 1 + 3 * 4
Input Expression: ['1', '+', '1', '+', '3', '*', '4']
Output Expression: [14.0]

The app is here https://github.com/cooledge/nn/blob/master/expressions/expressions.py