Academic

Research Assistant, University of Maryland, Baltimore County (January 2018 to Present)

  • Collected, analyzed, and interpreted multimodal data captured by sensors, intelligent systems, and user inputs.

  • Developed explainable machine learning models with wide-scale applications in the education and healthcare domains. Devised an explanation method using Local Interpretable Model-agnostic Explanations (LIME) to interpret both white-box and black-box model predictions.

  • Developed predictive models to solve both classification and regression problems.

  • Authored 10 publications in the areas of artificial intelligence and data science in education, cognitive psychology, and healthcare.

  • Taught a Database Applications course to a class of 60 students and trained students in PL/SQL.

  • Taught a Decision Support Systems course and a Health Care Informatics course, totaling over 100 students. Conducted a RapidMiner data science boot camp.

Industry

Data Scientist Co-op (Ph.D. level), Bayer Crop Science R&D (January to May 2021)

  • Developed a Genotype Accuracy Model framework in Python to evaluate a Bayesian genomic imputation algorithm using observational masking. This framework supports cost-cutting in the breeding pipeline by providing insights to breeders about high-yield seed selection.

  • Deployed an R Shiny Dashboard to identify recombination breakpoints across the genome and explored optimization methods to improve genotype imputation engine performance near and away from recombination breakpoints.

  • Developed a scalable framework in Domino Data Lab to impute missing genotypes across the genetic map with high accuracy for crops such as corn and soy.

  • Developed a framework in R to execute pedigree-based breeding program simulations using AlphaSimR.

  • Collaborated across teams to identify needs and gaps in data science pipelines and provide solutions.

Data Science Intern (Ph.D. level), Edison Software Inc (June to August 2020)

  • Performed statistical modeling on consumer behavioral data captured by the Edison platform (e-mail), including mathematical optimization, probability and sampling, and causal inference.

  • Extracted data from multiple sources and coded models that process large data sets using Impala SQL and Python.

  • Developed interpretable and accurate regression models to predict monthly e-commerce sales in the US based on Edison data by implementing user cohorts, regularization, and heuristic methods.

  • Performed feature analysis to understand the impact of different variables on model predictions.

Predictive Analytics - Data Science Intern, Highmark Health (May to June 2020, reduced due to pandemic)

  • Analyzed the technological changes in the organization during the pandemic.

  • Performed a case study to understand the challenges, benefits, and timing of changes made by Highmark Health during the pandemic.

Software Engineer, Accenture Services Pvt Ltd (August 2014 to December 2015)

  • Gathered SAP OTC business requirements and communicated client business requirements by constructing easy-to-understand data and process models.

  • Developed and implemented 5+ levels of testing, including unit, integration, system, user acceptance, and regression testing, for the entire SAP OTC process.

  • Identified and documented As-Is business processes in the aviation oil domain related to the SAP system.

  • Created a concordance for plant configuration in SAP SD. It was the first of its kind and was used as a template for other configurations.

Technical Intern, Efftronics Systems Pvt Ltd (October 2013 to May 2014)

  • Developed and monitored an air quality control system using multi-sensor fusion techniques.

  • Researched various methods to improve air quality in the workplace.

Technical Skills

Programming: Python, MATLAB, R, PySpark

Data Science & Visualization Tools: RapidMiner, Weka, QlikView, Qlik Sense, Tableau

Databases: MySQL, SQL Server, Cassandra

Development Tools: Anaconda, PyCharm, Git, Domino Data Lab

Operating Systems: Windows, Linux, macOS