Implements Global Word Vectors (GloVe).
using Pkg
Pkg.add(url="https://github.com/domluna/Glove.jl.git")
See benchmark/perf.jl for a usage example.
Here's the rough idea:
- Take text and make a `LookupTable`. This is a dictionary that maps words to ids and vice versa. Any preprocessing should be done before this step.
- Use `weightedsums` to get the weighted co-occurrence sum totals. This returns a `CooccurenceDict`.
- Convert the `CooccurenceDict` to a `CooccurenceVector`. The reason is faster indexing when training the model.
- Initialize a `Model` and train it with the `CooccurenceVector` using the `adagrad!` method.
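The steps above can be sketched in plain Julia, without the package. This is only an illustration of the idea, not Glove.jl's API: `build_vocab`, `cooccurrence_sums`, `Entry`, `to_vector` and `train_adagrad!` are hypothetical names, and the 1/distance weighting, window size, `xmax`, `alpha` and learning rate are assumed values in the spirit of the original GloVe formulation.

```julia
using LinearAlgebra: dot
using Random: MersenneTwister

# Step 1 (hypothetical helper): map each distinct token to an id, and back.
function build_vocab(tokens::AbstractVector{<:AbstractString})
    word2id = Dict{String,Int}()
    id2word = String[]
    for t in tokens
        if !haskey(word2id, t)
            push!(id2word, String(t))
            word2id[String(t)] = length(id2word)
        end
    end
    return word2id, id2word
end

# Step 2: weighted co-occurrence sums. A pair at distance d adds 1/d
# (assumed weighting), counted in both directions.
function cooccurrence_sums(ids::Vector{Int}; window::Int=2)
    sums = Dict{Tuple{Int,Int},Float64}()
    for i in eachindex(ids)
        for d in 1:window
            j = i + d
            j > length(ids) && break
            a, b = ids[i], ids[j]
            sums[(a, b)] = get(sums, (a, b), 0.0) + 1.0 / d
            sums[(b, a)] = get(sums, (b, a), 0.0) + 1.0 / d
        end
    end
    return sums
end

# Step 3: flatten the dictionary into a vector for fast sequential access.
struct Entry
    i::Int
    j::Int
    x::Float64
end
to_vector(sums) = sort!([Entry(i, j, x) for ((i, j), x) in sums]; by = e -> (e.i, e.j))

# Step 4: AdaGrad on the GloVe objective f(x) * (w_i'w_j + b_i + b_j - log x)^2 / 2.
# Returns the loss accumulated during each epoch.
function train_adagrad!(W, Wc, b, bc, entries; epochs=50, lr=0.05, xmax=100.0, alpha=0.75)
    GW, GWc = ones(size(W)), ones(size(Wc))      # squared-gradient accumulators
    Gb, Gbc = ones(length(b)), ones(length(bc))
    losses = Float64[]
    for _ in 1:epochs
        loss = 0.0
        for e in entries
            wi, wj = W[:, e.i], Wc[:, e.j]       # copies of the two vectors
            diff = dot(wi, wj) + b[e.i] + bc[e.j] - log(e.x)
            f = e.x < xmax ? (e.x / xmax)^alpha : 1.0
            loss += 0.5 * f * diff^2
            g = f * diff
            gwi, gwj = g .* wj, g .* wi
            GW[:, e.i] .+= gwi .^ 2
            GWc[:, e.j] .+= gwj .^ 2
            W[:, e.i] .-= lr .* gwi ./ sqrt.(GW[:, e.i])
            Wc[:, e.j] .-= lr .* gwj ./ sqrt.(GWc[:, e.j])
            Gb[e.i] += g^2
            Gbc[e.j] += g^2
            b[e.i] -= lr * g / sqrt(Gb[e.i])
            bc[e.j] -= lr * g / sqrt(Gbc[e.j])
        end
        push!(losses, loss)
    end
    return losses
end

# Toy run on a tiny, already-preprocessed corpus.
tokens = String.(split("the cat sat on the mat the dog sat on the rug"))
word2id, id2word = build_vocab(tokens)
entries = to_vector(cooccurrence_sums([word2id[t] for t in tokens]; window=2))

dim, n = 8, length(id2word)
rng = MersenneTwister(42)
W  = (rand(rng, dim, n) .- 0.5) ./ dim   # word vectors
Wc = (rand(rng, dim, n) .- 0.5) ./ dim   # context vectors
b, bc = zeros(n), zeros(n)
losses = train_adagrad!(W, Wc, b, bc, entries)
```

Flattening the dictionary into a vector means the training loop walks a contiguous array instead of hashing a key on every access, which is the motivation given for the conversion step.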
It's pretty fast at this point. On a single core it's roughly 3x slower than the optimized C version.
- [ ] More docs.
- [ ] See if `precompile(args...)` does anything.
- [ ] Notebook example (has to have emojis).
- [ ] Multi-threading.