hierarchy = {
"bear": "mammal",
"tiger": "mammal",
"whale": "mammal",
"ostrich": "bird",
"peacock": "bird",
"eagle": "bird",
"salmon": "fish",
"goldfish": "fish",
"guppy" : "fish",
"turtle": "reptile",
"crocodile": "reptile",
"snake": "reptile",
"frog": "amphibian",
"toad": "amphibian",
"newt": "amphibian",
"ant": "invertibrates",
"cockroach": "invertibrates",
"ladybird": "invertibrates",
"mammal": "vertibrates",
"birds": "vertibrates",
"fish": "vertibrates",
"reptiles": "vertibrates",
"amphibians": "vertibrates",
"vertibrates": "animals",
"invertibrates": "animals",
"animals": "animals" # loop for things with no higher class
}
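To make the later steps concrete, here is a minimal sketch (my own, not the code from the repo) of turning this dictionary into indexed one-hot vectors and (child, parent) training pairs; the names index, one_hot and pairs are just placeholders I picked for illustration.

import numpy as np

# Collect every name that appears as a child or a parent and give each an index.
names = sorted(set(hierarchy) | set(hierarchy.values()))
index = {name: i for i, name in enumerate(names)}

def one_hot(name):
    # One-hot vector over all names in the hierarchy.
    v = np.zeros(len(names))
    v[index[name]] = 1.0
    return v

# One (child, parent) training pair per edge in the hierarchy.
pairs = [(one_hot(child), one_hot(parent)) for child, parent in hierarchy.items()]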
The other thing is that I want the neural net to learn the hierarchy from only minimal information. For example, instead of training the neural net that the input "bear" maps to a one-hot vector over all of the aforementioned nodes, I want to train it that "bear" is a "mammal" and "mammals" are "vertebrates". The neural net, because of its structure, would then infer that "bears" are "vertebrates".
The reason for doing that is that I don't want to encode the knowledge of the hierarchy in the training data; I want to encode it in the neural net. That better models the way humans learn: humans don't get training data about whole hierarchies, they piece them together. I want the knowledge of hierarchy-ness in the neural net.
The model I came up with is in my github repo, https://github.com/cooledge/nn/commit/ad79bef1af9b3fed40e89e765c6e28ba377d7544. The approach is to train a simple neural net that takes a one-hot vector for the child and predicts a one-hot vector for the parent. For example, when "vertebrates" is the input, the prediction is "animals". That is trained in the usual way. The result is the usual weight matrix W and bias b.
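"Trained in the usual way" could look roughly like the sketch below, continuing from the encoding above. I'm assuming plain softmax regression trained with cross-entropy and gradient descent here, which may differ from the repo in its details.

# Stack the pairs into an input matrix X and a target matrix T.
X = np.array([child for child, parent in pairs])
T = np.array([parent for child, parent in pairs])

n = len(names)
W = np.zeros((n, n))   # the usual weight matrix
b = np.zeros(n)        # the usual bias

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Gradient descent on softmax cross-entropy: predict the parent from the child.
for step in range(2000):
    P = softmax(X @ W + b)
    grad = P - T                      # gradient of cross-entropy w.r.t. the logits
    W -= 0.5 * (X.T @ grad) / len(X)
    b -= 0.5 * grad.mean(axis=0)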
Then, for the model that predicts the hierarchy, I used this:
l1 = softmax(I*W + b)
l2 = softmax(l1*W + b)
...
lk = softmax(lk-1*W + b)
Y = l1 + l2 + ... + lk
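Continuing the sketch above, the chained prediction can be written as a small loop that reuses the same trained W and b at every layer. One assumption on my part: I also add the input itself into Y, since the input shows up in the sample output below with probability 1.

def ancestors(name, k=5, threshold=0.5):
    # Feed the one-hot input through the same (W, b) k times and sum the layers.
    l = one_hot(name)
    y = l.copy()                      # include the input itself (my assumption)
    for _ in range(k):
        l = softmax(l @ W + b)
        y += l
    return [nm for nm in names if y[index[nm]] > threshold]

print(ancestors("salmon"))
# expect something like ['animals', 'fish', 'salmon', 'vertebrates']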
I left out some of the math since this is just the basic idea. It turns out this works. Here is some sample output from the program.
Enter type type: salmon
Output (prob > 50%)
fish has prob 1.000000
vertebrates has prob 1.000000
animals has prob 1.000000
salmon has prob 1.000000
Enter type type: bird
Output (prob > 50%)
vertebrates has prob 0.894435
animals has prob 1.000000
bird has prob 1.000000
Cool. I am doing a free neural net course on Udacity. After that I will come back to this and investigate using embeddings instead of one-hot vectors. Also, it would be fun to hook up my spelling-mistake code.