What I have done
My Portfolio




NLP Lab

A large research project led by Prof. Damir Cavar, for whom I am a volunteer research assistant. Information about the project can be found here.
The gist is to build an NLP environment using tools such as Stanford CoreNLP and spaCy to extract information from a corpus and augment a knowledge graph, which can then be parsed to extract and validate information easily under one umbrella project.
I have been involved in several sub-projects:
  • JSON-NLP Schema, in which we defined a common schema for collecting the outputs of the NLP tools so that they can be compared to obtain the most accurate information and reduce errors.
  • OpenIE (Open Information Extraction), which uses and extends the YAGO database so that we can validate the information we gather from various corpora and then extend the database. We use Neo4j to build a small graph database of the extracted information for comparison with YAGO, and MongoDB to load data from Microsoft Concept Graph and DBpedia in order to identify the most relevant hypernym for an extracted entity, e.g. Apple is a company while apple is a fruit.
  • HOOSIER (Hoosier Semantic Information ExtractoR), which will extend OpenIE and use a network analysis tool such as SNAP to analyze the semantic relations between entities.


Projects
  • Analysis of Bitcoin Blockchain
    Researched the Bitcoin blockchain with respect to its core mechanics, such as blocks, nonces, and mining, to demonstrate how it actually works, and implemented certain aspects of it. The entire project was presented to a group of peers to showcase the capabilities and drawbacks of using a blockchain and to bust some myths.

  • Pichu
    Implemented a restricted version of chess and tried to maximize the winning chances of the player we represent by running the min-max algorithm over the state space for each move.

  • Parts of Speech Tagging
    Trained models to showcase the differences between Naive Bayes, Variable Elimination, and the Viterbi algorithm, and analyzed their performance by training them to tag the parts of speech of a given corpus. Naive Bayes achieved 78%, Variable Elimination achieved 89%, and Viterbi achieved 93% accuracy.

  • Spooky Author Classification
    Trained models based on Naive Bayes, the Viterbi algorithm, and a neural network (NN) on a corpus of 3 authors, then classified test sentences with each to showcase the differences and accuracy of all 3 implementations. While implementing Naive Bayes I had an inkling that the problem looked like an HMM, so I attempted Viterbi as well. Naive Bayes achieved 89%, Viterbi achieved 74%, and the NN achieved 96% accuracy.

  • Tweet Extraction and Classification
    Extracted tweets associated with cats, dogs, or both, held out a portion for testing, and used the rest to train models implemented with the scikit-learn package: Naive Bayes (80-82%), SVM (44-53%), NN (90-95%), KNN with 1 neighbor (86-92%), KNN with 5 neighbors (89-92%), and weighted KNN with 5 neighbors (90-93%), where the two figures in each range are the accuracies for 1500 and 4000 tweets respectively.
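
For illustration, the nonce search behind mining in the Bitcoin project above can be sketched as follows. This is a simplified sketch, not the project code: the function name `mine`, the string block payload, and the leading-zero difficulty rule are assumptions made for the example. Real Bitcoin hashes an 80-byte block header twice with SHA-256 and compares the result against a numeric target.

```python
import hashlib


def mine(block_data, difficulty=3):
    """Try nonces 0, 1, 2, ... until the SHA-256 hex digest of
    block_data + nonce starts with `difficulty` zero characters.
    (Hypothetical helper; the difficulty rule is simplified.)"""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


# Example payload is made up; a low difficulty keeps the search fast.
nonce, digest = mine("example block", difficulty=3)
print(nonce, digest)
```

Raising `difficulty` by one multiplies the expected number of attempts by 16, which is the basic idea of why mining is expensive while verifying a found nonce takes a single hash.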


Awards & Leadership Achievements
  • Indiana University, Bloomington, U.S.A.

  • Sep 2018 - Selected to be on the graduate discussion panel to discuss the issues with applying data science to social science

  • Mar 2018 - Actively contributed to organizing the "Opioid Data Wrangling Challenge" for the Data Science department in collaboration with the School of Public and Environmental Affairs (SPEA)

  • Jan 2018 - Elected to the position of Director of Professional Development in the Data Science Club at Indiana University (DSCIU)


  • Mastek Ltd, Navi Mumbai, India

  • Aug 2016 - A+ Performer of the Year Award, recognizing me as the A+ performer for 2015-16.

  • Aug 2016 - Spot Appreciation Award, for exemplary performance for the CZR OD project.

  • Apr 2016 - COE Award, Center of Excellence team appreciation for being a trend tracker, writing maintainable code and for being a passionate coder.

  • Sep 2015 - Spot Appreciation Award, for dedication and commitment towards JCR-2858, working above and beyond expectations for the success of the change, even under a challenging deadline.

  • Aug 2015 - Heart of Mastek 4.0 Award, In recognition of achieving excellence in terms of high quality deliverables and exceptional performance resulting in customer wow and team out-performance.

  • Aug 2014 - Sent onsite to the U.K. to work with the client for an entire year.

  • Jul 2014 - Stephen Hawking Award, in recognition of contributions to the IPF Account (UK) team's success, enabling a positive turnaround of the project.

  • Jul 2014 - Stellar Award, Recognition for stellar performance in impact analysis for complex functionalities and providing effective solutions for the change requests.

  • May 2014 - Spot Appreciation Award, UK-IPF, contributions for new market entry and Mexico merge release.

  • Sep 2013 - Spot Appreciation Award, in recognition of work on the NME BGR release, showing dedication and efficiency in resolving defects and assessing their impact.