Monday, February 20, 2017

Hierarchies in Neural Nets

I took some time off to learn more about neural nets using TensorFlow. Now I am back. I want to investigate using neural nets to model hierarchies. I will start with a simple hierarchy of animals. This is my input data.

hierarchy = {
  "bear": "mammal",
  "tiger": "mammal",
  "whale": "mammal",
  "ostrich": "bird",
  "peacock": "bird",
  "eagle": "bird",
  "salmon": "fish",
  "goldfish": "fish",
  "guppy": "fish",
  "turtle": "reptile",
  "crocodile": "reptile",
  "snake": "reptile",
  "frog": "amphibian",
  "toad": "amphibian",
  "newt": "amphibian",
  "ant": "invertibrates",
  "cockroach": "invertibrates",
  "ladybird": "invertibrates",
  # class keys must match the parent names used above (singular)
  "mammal": "vertibrates",
  "bird": "vertibrates",
  "fish": "vertibrates",
  "reptile": "vertibrates",
  "amphibian": "vertibrates",
  "vertibrates": "animals",
  "invertibrates": "animals",
  "animals": "animals"  # loop for things with no higher class
}

I want a neural net that takes one of the objects in the hierarchy and returns a multi-hot vector marking that node and all of its ancestors. For example, when I input "bear", I want to get a vector that has "bear", "mammal", "vertibrates", and "animals" marked as hot. One could then imagine feeding that into a neural net that knows about vertebrates and can act on the vertebrate-ness of a bear when it is fed the input "bear".
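To make the target concrete, here is a minimal sketch of how the desired output vector could be computed directly from the hierarchy. It uses a small subset of the data with the labels spelled as they are in the dict, and the names `ancestors` and `multi_hot` are hypothetical helpers for illustration, not code from my repo.

```python
# Small subset of the hierarchy (assumption: trimmed for brevity).
hierarchy = {
    "bear": "mammal",
    "salmon": "fish",
    "ant": "invertibrates",
    "mammal": "vertibrates",
    "fish": "vertibrates",
    "vertibrates": "animals",
    "invertibrates": "animals",
    "animals": "animals",  # the root points at itself
}

# Fix an ordering of the nodes so each one gets a vector index.
nodes = sorted(hierarchy)
index = {name: i for i, name in enumerate(nodes)}


def ancestors(name):
    """Return the node followed by its ancestors, stopping at the self-looping root."""
    chain = [name]
    while hierarchy[chain[-1]] != chain[-1]:
        chain.append(hierarchy[chain[-1]])
    return chain


def multi_hot(name):
    """Vector with 1.0 at the node and at each of its ancestors, 0.0 elsewhere."""
    vec = [0.0] * len(nodes)
    for node in ancestors(name):
        vec[index[node]] = 1.0
    return vec


print(ancestors("bear"))
print(multi_hot("bear"))
```

This is only the answer key. The point of the rest of the post is to get the net to produce this vector without ever being shown it.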

The other thing I want is for the neural net to learn the hierarchy from minimal information. For example, instead of training the neural net that the input "bear" maps to a vector of the aforementioned nodes, I want to train it that a "bear" is a "mammal" and that a "mammal" belongs to "vertibrates". Because of its structure, the neural net would then infer that bears are vertebrates.

The reason for doing this is that I don't want to encode the knowledge of the hierarchy in the training data; I want to encode it in the neural net. That better models the way humans learn: humans don't get training data about whole hierarchies, they piece them together. I want the knowledge of hierarchy-ness to live in the neural net.

The model I came up with is in my GitHub repo, https://github.com/cooledge/nn/commit/ad79bef1af9b3fed40e89e765c6e28ba377d7544. The approach is to train a simple neural net that takes a one-hot vector for the child and predicts a one-hot vector for the parent. For example, when "vertibrates" is the input, the prediction is "animals". That is trained in the usual way. The result is the usual pair of parameters: a weight matrix W and a bias vector b.
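The child-to-parent training step can be sketched without TensorFlow. The following is a plain-Python softmax regression trained with per-example gradient descent on a small subset of the hierarchy. It is not the code from the repo, and the learning rate, epoch count and helper names are assumptions picked for illustration.

```python
import math

# Small subset of the hierarchy (assumption: trimmed for brevity).
hierarchy = {
    "bear": "mammal",
    "salmon": "fish",
    "ant": "invertibrates",
    "mammal": "vertibrates",
    "fish": "vertibrates",
    "vertibrates": "animals",
    "invertibrates": "animals",
    "animals": "animals",
}
nodes = sorted(hierarchy)
index = {name: i for i, name in enumerate(nodes)}
n = len(nodes)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# W is n x n and b has length n; zero initialisation is fine for a single layer.
W = [[0.0] * n for _ in range(n)]
b = [0.0] * n


def predict(child):
    """softmax(I*W + b) for a one-hot I; I*W is simply the row of W for the child."""
    row = W[index[child]]
    return softmax([row[j] + b[j] for j in range(n)])


learning_rate = 0.5  # hypothetical value
for epoch in range(500):  # hypothetical epoch count
    for child, parent in hierarchy.items():
        probs = predict(child)
        i = index[child]
        for j in range(n):
            # Gradient of the cross-entropy loss w.r.t. logit j is (prob - target).
            grad = probs[j] - (1.0 if j == index[parent] else 0.0)
            W[i][j] -= learning_rate * grad
            b[j] -= learning_rate * grad


def predicted_parent(child):
    """Most probable parent according to the trained layer."""
    probs = predict(child)
    return nodes[probs.index(max(probs))]


print(predicted_parent("vertibrates"))
```

Note that the training data contains only direct child-to-parent pairs; nothing in it says that a bear is an animal.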

Then, for the model that predicts the hierarchy, I used this. Here I is the one-hot input vector and W and b are the trained parameters from above.

l1 = softmax(I*W + b)
l2 = softmax(l1*W + b)
...
lk = softmax(l(k-1)*W + b)

Y = l1 + l2 + ... + lk
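The chaining above can be sketched in plain Python as follows. To keep the example self-contained, W is built by hand as a stand-in for trained weights (a large value `SCALE` from each child to its parent), and the sum is clipped at 1.0 so the self-looping root does not exceed 1. Both of those choices, along with `depth` and the helper names, are assumptions for illustration and not necessarily what the repo does. This sketch also does not add the input itself to Y.

```python
import math

# Small subset of the hierarchy (assumption: trimmed for brevity).
hierarchy = {
    "bear": "mammal",
    "salmon": "fish",
    "ant": "invertibrates",
    "mammal": "vertibrates",
    "fish": "vertibrates",
    "vertibrates": "animals",
    "invertibrates": "animals",
    "animals": "animals",
}
nodes = sorted(hierarchy)
index = {name: i for i, name in enumerate(nodes)}
n = len(nodes)

# Stand-in for trained parameters (assumption): one large weight per child -> parent.
SCALE = 20.0
W = [[0.0] * n for _ in range(n)]
for child, parent in hierarchy.items():
    W[index[child]][index[parent]] = SCALE
b = [0.0] * n


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def layer(x):
    """One step of the chain: softmax(x*W + b) for a row vector x."""
    logits = [sum(x[i] * W[i][j] for i in range(n)) + b[j] for j in range(n)]
    return softmax(logits)


def predict_ancestors(name, depth=3):
    """Y = l1 + l2 + ... + lk with k = depth, clipped to at most 1.0."""
    x = [0.0] * n
    x[index[name]] = 1.0  # one-hot input I
    total = [0.0] * n
    for _ in range(depth):
        x = layer(x)  # feed the previous layer's output back through W and b
        total = [t + v for t, v in zip(total, x)]
    return [min(1.0, t) for t in total]


def above_threshold(name, threshold=0.5):
    """Names of the nodes whose score exceeds the threshold."""
    scores = predict_ancestors(name)
    return {nodes[i] for i, s in enumerate(scores) if s > threshold}


print(above_threshold("bear"))
print(above_threshold("ant"))
```

Because the same W and b are reused at every step, each extra layer climbs one more level of the hierarchy, which is how the ancestors get inferred from child-to-parent pairs alone.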

I left out some of the math since this is just the basic idea. It turns out this works. Here is some sample output from the program.

Enter type type: salmon
Output (prob > 50%)
fish has prob 1.000000
vertibrates has prob 1.000000
animals has prob 1.000000
salmon has prob 1.000000

Enter type type: bird
Output (prob > 50%)
vertibrates has prob 0.894435
animals has prob 1.000000
bird has prob 1.000000

Cool. I am taking a free neural net course on Udacity. After that I will come back to this and investigate using embeddings instead of one-hot vectors. It would also be fun to hook up my spelling-mistake code.
