Solving problems with math and computers is my favourite kind of work. In the early days of my career, I aspired to be a good software engineer, and I pursued that goal passionately until I slowly transitioned towards a research career. Although I write less code as a researcher than I did as a software engineer, I emphasize good software engineering practices and release my tools as open source under a permissive license. In the beginning (2012-2016) I wrote much of my code in Java/Groovy/Scala, but in recent years (2016-now), Python has become my go-to choice. I have released several tools on PyPI.
I sometimes participate in Stack Overflow Q&A threads.
Here are some of my selected projects:
Neural Machine Translation Toolkit.
Installer: https://pypi.org/project/rtg/
A tool that locates, downloads, and prepares parallel data for machine translation from many data sources.
Installer+Docs: https://pypi.org/project/mtdata/
A library to do coding-decoding such as Word, Character, and Byte-Pair-Encoding of natural language text.
Installer+Docs: https://pypi.org/project/nlcodec/
An awk-like line-processing tool with Python as the scripting language.
Installer+Docs: https://pypi.org/project/awkg/
Dialog systems that imitate characters from the popular TV show F.R.I.E.N.D.S.
A tool to detect junk or not-junk text with support for 100 languages.
Installer+Docs: https://pypi.org/project/junkdetect/
A large-scale web crawler on Apache Spark, with an Apache Solr backend for the crawler database.
HTML web page clustering tool based on DOM structure and CSS style similarity.
A simple web UI for labelling images to be used for image recognition.
CoreNLP + Apache Tika : https://github.com/thammegowda/tika-ner-corenlp
Contributed to Apache Tika: https://cwiki.apache.org/confluence/display/TIKA/TikaAndNER
Keras model deployment on the JVM using Deeplearning4J : https://github.com/USCDataScience/dl4j-kerasimport-examples
Contributed to Apache Tika: https://github.com/apache/tika/pull/125
TensorFlow model deployment on the JVM using gRPC: https://github.com/thammegowda/tensorflow-grpc-java
Image Recognition at large scale using Apache Spark: https://github.com/thammegowda/tika-dl4j-spark-imgrec
Document Similarity using Apache Spark and Solr: https://github.com/thammegowda/solr-similarity
Keyboard layout map of OSX for Kannada (my native language): https://github.com/thammegowda/kannada-osx-keylayout