Using Neural Nets for Natural Language Processing: Hierarchies in Neural Nets

I took some time off to learn more about neural nets using TensorFlow. Now I am back. I want to investigate using neural nets to model hierarchies. I will start with a simple hierarchy of animals. This is my input data.

hierarchy = {
"bear": "mammal",
"tiger": "mammal",
"whale": "mammal",
"ostrich": "bird",
"peacock": "bird",
"eagle": "bird",
"salmon": "fish",
"goldfish": "fish",
"guppy" : "fish",
"turtle": "reptile",
"crocodile": "reptile",
"snake": "reptile",
"frog": "amphibian",
"toad": "amphibian",
"newt": "amphibian",
"ant": "invertibrates",
"cockroach": "invertibrates",
"ladybird": "invertibrates",
"mammal": "vertibrates",
"bird": "vertibrates",
"fish": "vertibrates",
"reptile": "vertibrates",
"amphibian": "vertibrates",
"vertibrates": "animals",
"invertibrates": "animals",
"animals": "animals" # loop for things with no higher class
}

I want a neural net that takes one of the objects in the hierarchy and returns a vector marking the nodes that are its ancestors (strictly a multi-hot vector, since more than one entry is set). For example, when I input "bear", I want a vector that has "bear", "mammal", "vertibrates", and "animals" marked as hot. One could then imagine feeding that into a neural net that knows about vertebrates and can act on the vertebrate-ness of a bear when given the input "bear".
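As a reference point for what the net should output, here is a minimal sketch that computes the target vector directly by walking the dictionary. It assumes a trimmed copy of the hierarchy (labels spelled as in the data) and an alphabetical slot ordering; the helper names `ancestors` and `target_vector` are made up for illustration.

```python
# Trimmed copy of the hierarchy from this post, kept small for illustration.
hierarchy = {
    "bear": "mammal", "salmon": "fish", "ant": "invertibrates",
    "mammal": "vertibrates", "fish": "vertibrates",
    "vertibrates": "animals", "invertibrates": "animals",
    "animals": "animals",  # loop for things with no higher class
}

nodes = sorted(hierarchy)  # assumed slot ordering: alphabetical
index = {name: i for i, name in enumerate(nodes)}

def ancestors(name):
    """Return the node followed by each ancestor, stopping at the self-looping root."""
    chain = [name]
    while hierarchy[name] != name:
        name = hierarchy[name]
        chain.append(name)
    return chain

def target_vector(name):
    """Vector with a 1 in the slot of the node and of each of its ancestors."""
    vec = [0] * len(nodes)
    for node in ancestors(name):
        vec[index[node]] = 1
    return vec

print(ancestors("bear"))  # ['bear', 'mammal', 'vertibrates', 'animals']
print(target_vector("bear"))
```

The point of the rest of the post is to get the same answer out of a net that was never shown a full chain like this.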

The other thing I want is for the neural net to learn the hierarchy from minimal information. For example, instead of training the neural net that the input "bear" maps to a vector of the aforementioned nodes, I want to train it that a "bear" is a "mammal" and that a "mammal" is one of the "vertibrates". The neural net, because of its structure, would then infer that a bear is a vertebrate.
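To make "minimal information" concrete, here is a rough sketch of what the training set could look like: one (child, parent) pair per edge, each side a one-hot vector. It assumes a trimmed copy of the hierarchy and an alphabetical slot ordering; `one_hot` and `training_pairs` are illustrative names, not taken from the repo.

```python
# Trimmed copy of the hierarchy from this post, kept small for illustration.
hierarchy = {
    "bear": "mammal", "salmon": "fish", "ant": "invertibrates",
    "mammal": "vertibrates", "fish": "vertibrates",
    "vertibrates": "animals", "invertibrates": "animals",
    "animals": "animals",  # loop for things with no higher class
}

nodes = sorted(hierarchy)  # assumed slot ordering: alphabetical
index = {name: i for i, name in enumerate(nodes)}

def one_hot(name):
    """Vector of zeros with a single 1.0 in the slot for `name`."""
    vec = [0.0] * len(nodes)
    vec[index[name]] = 1.0
    return vec

# One example per edge of the hierarchy. No example ever mentions a
# grandparent, so "bear" is never paired with "vertibrates" or "animals".
training_pairs = [(one_hot(child), one_hot(parent))
                  for child, parent in hierarchy.items()]

print(len(training_pairs))
```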

The reason for doing that is that I don't want to encode the knowledge of the hierarchy in the training data; I want to encode it in the neural net. That better models the way humans learn. Humans don't get training data about whole hierarchies; they piece them together. I want the knowledge of hierarchy-ness to live in the neural net.

The model I came up with is in my GitHub repo, https://github.com/cooledge/nn/commit/ad79bef1af9b3fed40e89e765c6e28ba377d7544. The approach is to train a simple neural net that takes a one-hot vector for the child and predicts a one-hot vector for the parent. For example, when "vertibrates" is input, the prediction is "animals". That is trained in the usual way. The result is the usual weight matrix W and bias vector b.
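The one-step child-to-parent model can be sketched without TensorFlow. The version below is not the code from the repo: it is plain-Python softmax regression trained by stochastic gradient descent on cross-entropy, using a trimmed copy of the hierarchy, and the learning rate and epoch count are assumed values picked for this toy example.

```python
import math

# Trimmed copy of the hierarchy from this post, kept small for illustration.
hierarchy = {
    "bear": "mammal", "salmon": "fish", "ant": "invertibrates",
    "mammal": "vertibrates", "fish": "vertibrates",
    "vertibrates": "animals", "invertibrates": "animals",
    "animals": "animals",  # loop for things with no higher class
}
nodes = sorted(hierarchy)
index = {name: i for i, name in enumerate(nodes)}
n = len(nodes)

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def predict(child, W, b):
    """softmax(I*W + b); for a one-hot I, I*W is just the child's row of W."""
    row = W[index[child]]
    return softmax([row[j] + b[j] for j in range(n)])

W = [[0.0] * n for _ in range(n)]
b = [0.0] * n
learning_rate = 0.5  # assumed value

for epoch in range(500):  # assumed epoch count
    for child, parent in hierarchy.items():
        probs = predict(child, W, b)
        for j in range(n):
            # Gradient of cross-entropy with respect to logit j: prob - target.
            grad = probs[j] - (1.0 if j == index[parent] else 0.0)
            W[index[child]][j] -= learning_rate * grad
            b[j] -= learning_rate * grad

def predicted_parent(child):
    probs = predict(child, W, b)
    return nodes[probs.index(max(probs))]

print(predicted_parent("vertibrates"))
```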

Then, for the model that predicts the hierarchy, I used this.

l1 = softmax(I*W + b)
l2 = softmax(l1*W + b)
...
lk = softmax(l(k-1)*W + b)

Y = l1 + l2 + ... + lk
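Here is a small sketch of that chaining under some loud assumptions: W is hand-built as a stand-in for trained weights (a large logit at each child's parent), b is zero, k is 3 to match the depth of a trimmed hierarchy, and the sum is thresholded at 0.5 as in the program's "prob > 50%" output. Following the formula exactly as written, the input node itself is not part of the sum.

```python
import math

# Trimmed copy of the hierarchy from this post, kept small for illustration.
hierarchy = {
    "bear": "mammal", "salmon": "fish", "ant": "invertibrates",
    "mammal": "vertibrates", "fish": "vertibrates",
    "vertibrates": "animals", "invertibrates": "animals",
    "animals": "animals",  # loop for things with no higher class
}
nodes = sorted(hierarchy)
index = {name: i for i, name in enumerate(nodes)}
n = len(nodes)

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

# Assumption: stand-in for trained parameters, not learned values.
SCALE = 20.0
W = [[SCALE if hierarchy[child] == parent else 0.0 for parent in nodes]
     for child in nodes]
b = [0.0] * n

def step(vec):
    """One layer: softmax(vec*W + b), with vec treated as a row vector."""
    logits = [sum(vec[i] * W[i][j] for i in range(n)) + b[j] for j in range(n)]
    return softmax(logits)

def predict_ancestors(name, k=3, threshold=0.5):
    """Apply the same W and b k times and sum the layer outputs (Y = l1 + ... + lk)."""
    layer = [1.0 if i == index[name] else 0.0 for i in range(n)]
    total = [0.0] * n
    for _ in range(k):
        layer = step(layer)
        total = [t + v for t, v in zip(total, layer)]
    return {nodes[i] for i in range(n) if total[i] > threshold}

print(sorted(predict_ancestors("bear")))
```

Because the root maps to itself, extra steps past the top of the hierarchy just keep landing on "animals", so k only needs to be at least the depth of the deepest node.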

I left out some of the math since this is the basic idea. Turns out this works. Here is some sample output from the program.

Enter type type: salmon
Output (prob > 50%)
fish has prob 1.000000
vertibrates has prob 1.000000
animals has prob 1.000000
salmon has prob 1.000000

Enter type type: bird
Output (prob > 50%)
vertibrates has prob 0.894435
animals has prob 1.000000
bird has prob 1.000000

Cool. I am doing a free neural net course on Udacity. After that I will come back to this and investigate using embeddings instead of one-hot vectors. It would also be fun to hook up my spelling-mistake code.

Monday, February 20, 2017
