Academic
Research Assistant, University of Maryland, Baltimore County (January 2018 to Present)
-
Collected, analyzed, and interpreted multimodal data captured by sensors, intelligent systems, and user inputs.
-
Developed explainable machine learning models with wide-scale applications in the education and healthcare domains. Devised an explainable method using local interpretable model-agnostic explanations (LIME) to interpret both white-box and black-box model predictions.
-
Developed predictive models to solve both classification and regression problems.
-
Authored 10 publications in the areas of artificial intelligence and data science in education, cognitive psychology, and healthcare.
-
Taught a Database Applications course to a class of 60 students. Trained students in PL/SQL.
-
Taught a Decision Support Systems course and a Health Care Informatics course to more than 100 students in total. Conducted a RapidMiner data science boot camp.
Industry
Data Scientist Co-op (Ph.D. level), Bayer Crop Science R&D (January to May 2021)
-
Developed a Genotype Accuracy Model framework in Python to evaluate a Bayesian-based genomic imputation algorithm using observational masking. This framework supports cost-cutting in the breeding pipeline by providing insights to breeders about high-yield seed selection.
-
Deployed an R Shiny Dashboard to identify recombination breakpoints across the genome and explored optimization methods to improve genotype imputation engine performance near and away from recombination breakpoints.
-
Developed a scalable framework in Domino Data Lab to impute missing genotypes across the genetic map with high accuracy for various crops such as corn and soy.
-
Developed a framework in R to execute pedigree-based breeding program simulations using AlphaSimR.
-
Collaborated across teams to identify needs and gaps in data science pipelines and provide solutions.
Data Science Intern (Ph.D. level), Edison Software Inc (June to August 2020)
-
Performed statistical modeling on consumer behavioral data captured by the Edison platform (e-mail). This included mathematical optimization, probability and sampling, and causal inference.
-
Extracted data from multiple sources and coded models that process large data sets using Impala SQL and Python.
-
Developed interpretable and accurate regression models to predict monthly e-commerce sales in the US based on Edison data by implementing user cohorts, regularization, and heuristic methods.
-
Performed feature analysis to understand the impact of different variables on model predictions.
Predictive Analytics - Data Science Intern, Highmark Health (May to June 2020, shortened due to the pandemic)
-
Analyzed the technological changes in the organization during the pandemic.
-
Performed a case study to understand the challenges, benefits, and timing of changes made by Highmark Health during the pandemic.
Software Engineer, Accenture Services Pvt Ltd (August 2014 to December 2015)
-
Gathered SAP OTC business requirements and communicated client business requirements by constructing easy-to-understand data and process models.
-
Developed and implemented 5+ levels of testing, including unit, integration, system, user acceptance, and regression testing, for the entire SAP OTC process.
-
Identified and documented As-Is business processes in the aviation oil domain related to the SAP system.
-
Created a concordance for plant configuration in SAP SD. This was the first of its kind and was used as a template for other configurations.
Technical Intern, Efftronics Systems Pvt Ltd (October 2013 to May 2014)
-
Developed and monitored an air quality control system using multi-sensor fusion techniques.
-
Researched various methods to improve air quality in the workplace.
Technical Skills
Programming: Python, MATLAB, R, PySpark
Data Science & Visualization Tools: RapidMiner, Weka, QlikView, Qlik Sense, Tableau
Database: MySQL, SQL Server, Cassandra
Development Tools: Anaconda, PyCharm, Git, Domino Data Lab
Operating Systems: Windows, Linux, macOS