Solving problems using math and computers is my favourite job. In the early days of my career, I aspired to be a good software engineer, and I pursued that passionately until I slowly transitioned towards a research career. While I write less code as a researcher than I did as a software engineer, I emphasize good software engineering practices and open-sourcing tools under a permissive license. In the beginning (2012-2016) I wrote much of my code in Java/Groovy/Scala, but in recent years (2016-now), Python has become my go-to choice. I have released a bunch of tools to PyPI.


I sometimes participate in Stack Overflow Q&A threads.

Here are some of my selected projects:

RTG: Reader Translator Generator

Neural Machine Translation Toolkit.

MTData: Machine Translation Data

A tool that locates, downloads, and prepares parallel data for machine translation from many data sources.

NLCodec: Natural Language CoDec

A library to do coding-decoding such as Word, Character, and Byte-Pair-Encoding of natural language text.

awkg: Python awk

An awk-like line-processing tool with Python as the scripting language.

VirtChar: Virtual Characters

Dialog systems that imitate characters from the popular TV show F.R.I.E.N.D.S.

JunkDetect: Junk Detector

A tool that classifies text as junk or not junk, with support for 100 languages.

Sparkler: Spark Crawler

A large-scale web crawler on Apache Spark, with an Apache Solr backend for the crawler database.

Auto Extractor

HTML web page clustering tool based on DOM structure and CSS style similarity.

Supervising UI

A simple web UI for labelling images to be used for image recognition.

More Tools