Querying text annotations at scale with Spark

Whether you are analyzing textual data or building features from text, you will likely use text annotations. While many software libraries and resources exist for annotating documents, querying these annotations […]

AI or not AI? Classifying ArXiv articles with BERT

Things are evolving faster and faster in the NLP world. We can’t go 6 months without someone releasing a new language representation model that breaks records on major downstream benchmarks. […]

Node2vec and arXiv data

Since its publication in 2016 by Aditya Grover and Jure Leskovec, Node2vec has become the go-to algorithm for computing embeddings of nodes in a graph or network. Working with embeddings has […]