German Artikel Predictor - LSTM and One-hot encoding

As a German language learner, I sometimes find it difficult to determine which nouns are masculine, feminine, or neuter. I wanted to find out whether this is a matter of sheer memorisation or just a matter of practice. To address this, I developed a machine learning model that predicts the gender of a noun from its sequence of letters. Since words vary in length, I used a Long Short-Term Memory (LSTM) model, which can process sequences of variable length.
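To illustrate how a word can be turned into a sequence of letters that a model can consume, here is a minimal sketch of character-level encoding. The alphabet, the padding index, and the function names `encode` and `one_hot` are assumptions made for illustration; the notebook may prepare its data differently.

```python
# Hypothetical alphabet: lowercase Latin letters plus German umlauts and ß.
ALPHABET = "abcdefghijklmnopqrstuvwxyzäöüß"
PAD = 0  # index 0 is reserved for padding (an assumption of this sketch)
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def encode(word, max_len):
    """Map a word to a fixed-length list of character indices."""
    # Characters outside the alphabet are silently dropped.
    indices = [CHAR_TO_INDEX[ch] for ch in word.lower() if ch in CHAR_TO_INDEX]
    indices = indices[:max_len]  # truncate words that are too long
    return indices + [PAD] * (max_len - len(indices))  # pad short words


def one_hot(indices):
    """Turn each index into a one-hot vector of size len(ALPHABET) + 1."""
    size = len(ALPHABET) + 1
    return [[1 if i == idx else 0 for i in range(size)] for idx in indices]


encoded = encode("Mädchen", 10)
vectors = one_hot(encoded)
```

Each word becomes a list of `max_len` vectors, so words of different lengths share one shape while the padding index marks the unused positions.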

After hyperparameter optimisation, the model achieved 95.53% accuracy on the test set. The best model was an LSTM with 1 layer, 256 hidden dimensions, 32 embedding dimensions, a batch size of 64, and a learning rate of 0.0032, trained with the Adam optimiser.
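The reported configuration could look roughly like the following sketch. PyTorch, the class name `ArticleLSTM`, the vocabulary size of 31, and the use of the last hidden state for classification are all assumptions made for illustration; only the layer count, hidden size, embedding size, batch size, learning rate, and the choice of Adam come from the results above.

```python
import torch
from torch import nn


class ArticleLSTM(nn.Module):
    """Character-level LSTM classifier with three outputs (der, die, das)."""

    def __init__(self, vocab_size=31, embedding_dim=32, hidden_dim=256,
                 num_layers=1, num_classes=3):
        super().__init__()
        # vocab_size is hypothetical: 30 characters plus one padding index.
        self.embedding = nn.Embedding(vocab_size, embedding_dim, padding_idx=0)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim,
                            num_layers=num_layers, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, char_indices):
        embedded = self.embedding(char_indices)   # (batch, seq_len, 32)
        _, (hidden, _) = self.lstm(embedded)      # hidden: (layers, batch, 256)
        # Classify from the final hidden state of the last layer. This sketch
        # does not pack padded sequences, so padding steps are processed too.
        return self.classifier(hidden[-1])        # (batch, 3)


model = ArticleLSTM()
optimizer = torch.optim.Adam(model.parameters(), lr=0.0032)

# A random batch of 64 words, each 12 characters long, stands in for real data.
batch = torch.randint(1, 31, (64, 12))
logits = model(batch)
```

The three logits per word would then be passed to a cross-entropy loss during training, with the largest logit giving the predicted gender at inference time.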

Overall, this experiment has shown that it is possible to accurately determine the gender of a previously unseen noun. Genders are not totally random: endings (e.g. -er, -chen, -e, -heit) and other features give clues about the correct classification, even if not with certainty.
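The kind of clue an ending provides can be sketched as a simple suffix lookup. This is not the model from the notebook, only a hand-written baseline for illustration; the table `SUFFIX_HINTS` and the function `guess_article` are hypothetical, and the mappings for -er and -e are tendencies with many exceptions.

```python
# Suffix tendencies, checked in order so that longer endings win.
SUFFIX_HINTS = [
    ("chen", "das"),  # diminutives are neuter
    ("heit", "die"),  # abstract nouns in -heit are feminine
    ("er", "der"),    # often masculine, but not always
    ("e", "die"),     # often feminine, but not always
]


def guess_article(noun):
    """Return a guessed article based on the noun's ending, or None."""
    lowered = noun.lower()
    for suffix, article in SUFFIX_HINTS:
        if lowered.endswith(suffix):
            return article
    return None  # no known ending, so no guess


guesses = {noun: guess_article(noun)
           for noun in ["Mädchen", "Freiheit", "Lehrer", "Lampe", "Fenster"]}
```

The lookup guesses "der" for "Fenster", although the correct article is "das", which shows why endings give clues without certainty.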

See my GitHub Repository

See my Jupyter notebook

LSTM