Candidate's Name
Amherst, MA · LINKEDIN LINK AVAILABLE · PHONE NUMBER AVAILABLE · EMAIL AVAILABLE · github.com/yarrap

Education

University of Massachusetts, Amherst, Amherst, MA
Master's in Computer Science, September 2023 - May 2025 (Expected)
Relevant Coursework: Advanced Machine Learning, Responsible AI, Systems for Data Science.

Indian Institute of Technology, Bhubaneswar, Bhubaneswar, Odisha
Bachelor of Technology in Mechanical Engineering, July 2017 - May 2021
Relevant Coursework: Introduction to Programming and Data Structures, Linear Algebra, Calculus.

Experience

Tiger Analytics, Hyderabad, Telangana

Senior Analyst - Data Science, July 2022 - July 2023
- Collaborated closely with the Vice President of a Fortune 500 client in the USA on a high-priority Omni-Channel Data Harmonization project.
- Used PySpark to standardize columns in an uncleaned 40-billion-record data source, employing fuzzy matching techniques.
- Engineered an entity matching pipeline for a 5-billion-record data source, using Natural Language Processing (NLP) techniques such as fuzzy matching, semantic matching with sentence transformers and BERT embeddings, and the DeepER model.
- Achieved a 95% mapping success rate for the top 90% selling UPC records with 89% accuracy.

Analyst - Data Science, July 2021 - June 2022
- Contributed to a Trade Promotion Optimization (TPO) project for a leading CPG client in Russia and France.
- Undertook full-cycle data preparation for 6 diverse sources, from raw data acquisition through data cleaning, transformation, and integration.
- Streamlined and automated the entire process, reducing manual effort by 40%.
- Employed modeling techniques including Lasso and Bayesian regression, implemented on Azure Databricks, for sales prediction.
- Worked in an agile 7-member team, aligning closely with the business to deliver models covering 90% of the client's market share.
- Generated a revenue increase of 35% from 2020 to 2021.

Projects

Customer Churn Prediction (Python, Sklearn, Seaborn, Jupyter Notebook), April 2021 - July 2021
- Developed a machine learning classifier to forecast customer churn.
- Performed thorough Exploratory Data Analysis (EDA) and Churn Cohort Analysis.
- Applied predictive modeling with Decision Tree, Random Forest, and AdaBoost classifiers.
- Applied hyperparameter tuning and chose the best-fitting model.
- Achieved the highest accuracy, 83%, with the AdaBoost classifier.

CIA Country Analysis and Clustering (Python, Google Colab, Plotly), December 2020 - February 2021
- Developed a KMeans clustering model to identify similarities among more than 200 countries.
- Experimented with different cluster counts to improve the grouping and gain insights.
- Undertook comprehensive data cleaning, followed by feature engineering.
- Used the elbow plot method to determine the optimal number of clusters as 3 and carried out model interpretation.

Penguin Species Prediction (Python, Jupyter Notebook), September 2020 - November 2020
- Implemented a Decision Tree Classifier to predict 3 penguin species from their physical attributes.
- Conducted exhaustive EDA, focusing on visualization methods such as cat, pair, and scatter plots to reveal patterns, and applied feature engineering.
- Attained 92% accuracy with the Decision Tree Classifier.

Technical Skills

Programming Languages: Python, Java, SQL, C.
Python Libraries: Pandas, NumPy, Matplotlib, Seaborn, scipy.stats, Sklearn, Plotly, TensorFlow, Keras, xgboost.
Software Tools: PySpark, Microsoft Azure, Git, Microsoft Excel, Microsoft Word.
Certifications: Statistics for Data Science and Business Analysis, Microsoft Excel from Beginner to Advanced.