MRUDUL GEORGE
[DATA SCIENTIST/ANALYST]
Mobile: PHONE NUMBER AVAILABLE
EMAIL AVAILABLE
Lake Mary, Florida

PROFESSIONAL SUMMARY
- Data scientist/analyst with over 10 years of experience, including 4 years in Machine Learning, Data Analysis, Predictive Modelling, Data Architecture, Data Mining, Text Mining, and Natural Language Processing (NLP).
- Skilled at extracting hidden patterns from large datasets using Machine Learning and Deep Learning algorithms, and at constructing regression, classification, and NLP models. Thorough understanding of big data concepts and Hadoop. Experienced in statistical modeling and in building interactive dashboards using the BI tool Tableau. Excellent communication skills. Experienced in Data Analysis, Data Science, and Data Warehouse projects. Solid understanding of both the business and the technical aspects of Artificial Intelligence and Machine Learning.
- Experience in managing the entire data science project life cycle, with involvement in all phases, including data extraction, data cleaning, statistical modeling, and data visualization, on large datasets of structured and unstructured data.
- Worked on different ETL tools and platforms such as SSIS, Databricks, AWS S3, AWS RDS, AWS Glue, and Redshift.
- Experience in building batch and streaming data pipelines for data ingestion.
- Experience in various SDLC methodologies such as Agile, Scrum, and Waterfall.
- Worked in Alteryx for data preparation and blending.
- Experience in Text Analytics, developing different Statistical Machine Learning and Data Mining solutions to various business problems, and generating data visualizations using Python and Tableau.
- Hands-on experience in Machine Learning techniques and algorithms such as Linear Regression, GLM, CART, SVM, KNN, LDA/QDA, Naive Bayes, Random Forest, Boosting, K-means Clustering, Hierarchical Clustering, PCA, Feature Selection, Neural Networks, and NLP.
- Applied algorithms such as k-NN, Naive Bayes, SVM, Decision Tree, and Random Forests in various projects. Applied the neural network algorithms ANN, RNN, LSTM, and GRU.
- Professional working experience with Python 2.X/3.X libraries including Matplotlib, NumPy, SciPy, Pandas, Beautiful Soup, Seaborn, Scikit-learn, PyTorch, TensorFlow, and NLTK.
- Experience with data visualizations using Python 2.X/3.X and generating dashboards with Tableau 8.0/9.2/10.0.
- Working experience in Statistical Analysis and Testing, including Hypothesis Testing, ANOVA, and Time Series.
- Hands-on experience in importing and exporting data using relational databases including Oracle 11g/12c, MySQL 5.0, and MS SQL Server, and the NoSQL database MongoDB.
- Working experience in the big data Hadoop ecosystem, including HDFS, MapReduce, Hive, HBase, and the Spark framework, including PySpark, MLlib, and SparkSQL.
- Performed web scraping with Beautiful Soup.
- Worked on classification and clustering projects using Python.
- Performed Time Series forecasting and applied ARIMA and SARIMA models.
- Experience in hyperparameter tuning and optimizing models.
- Working experience with version control tools such as Git to coordinate work on files with multiple team members.
- Worked with cloud services such as AWS EC2, EMR, RDS, and S3 to assist with big data tools, solve data storage issues, and work on deployment solutions.
- Experienced in writing complex SQL queries and PL/SQL stored procedures and functions.
- Good team player and quick learner; highly self-motivated person with good communication and interpersonal skills.

EDUCATION
MTech in Data Science and Machine Learning (2023)
Honors Diploma in Software Applications and Management from Aptech Computer Education (1999)
BTech in Agricultural Engineering from the University of Allahabad (1992)

DATA SCIENCE SKILLS
- Python, Spark
- Machine Learning (ML), Deep Learning (DL), NLP, Time Series, Hadoop, Databricks, PySpark
- NumPy, Scikit-learn, Pandas, TensorFlow, Matplotlib, Seaborn, Keras
- Tableau, Excel, Alteryx
- MongoDB, MySQL, Oracle (7.3, 9i, 10g, 11g/12c), SQL Server
- UNIX, AWS S3, RDS, EC2, EMR
- Jira
- Big Data and Hadoop concepts
- Exploratory Data Analysis, Agile Methodology, Dimensional Modeling, Object-Oriented Concepts, Statistical Analysis, Mathematics

ETL SKILLS
- Informatica Power Center Designer (6.x, 7.x, 9.6.1), Power Connect, SSIS
- AWS Glue, AWS Redshift, Databricks
- Dimensional Modeling, Star/Snowflake Schema, Kimball Methodology, Object-Oriented Concepts, Documentation

OTHER SKILLS
- C/C++, Java
- PL/SQL, Toad
- Developer 2000

PROFESSIONAL EXPERIENCE

Data Science Trainee
PES University, Bangalore, India (06/2022-06/2023)
Responsibilities:
- Extracted data from various sources. Cleaned and manipulated raw data.
- Created graphs and charts detailing data analysis results.
- Developed a Smart Dashboard for Electric Vehicles using AI techniques that seamlessly integrates nearby EV charging stations into the user's drive path, considering the source and destination within a 200-meter radius.
- Predicted the basic Li-ion battery parameters, State of Charge and State of Health, using Deep Learning. ANN, LSTM, and GRU networks were used for these predictions.
- Performed range prediction using Linear Regression, followed by K-Fold validation for tuning. Predictions were made with 97.8% accuracy.
- Used a BERT model for sentiment analysis to classify charging station user reviews, providing valuable insights to EV users. The developed model has Geolocation-Based Charging Station Display and sentiment analysis integration for the charging stations.
- Planned and completed group projects, working smoothly with others. Analyzed source data and outlined data relevance for BI solutions for analysis and reporting.
- Created data pipelines using Databricks and performed ETL. Performed incremental data ingestion using Auto Loader.
- Defined end-to-end data pipelines using Delta Live Tables (DLT).
- Tested, validated, and tuned models to foster accurate predictions.
- Developed predictive regression models and projects using different ML and DL algorithms.
- Performed data pre-processing. Scheduled jobs using the inbuilt scheduler.
- Developed an NLP classification model for sarcasm detection using the features provided in a university chat dataset.
- Clustered patients from health care data containing patients' demographic parameters, diet parameters, and lab-tested parameters.
- Built an NLP project based on a sentiment analytics model to predict the rating of products based on user reviews. Built a recommendation system based on sentiment analysis for readers using books data.
- Built interactive dashboards using the BI visualization tool Tableau.
- Created stories from the dashboards and reports.
- Cloud computing experience using Apache Spark and MapReduce on AWS and Databricks.
- Worked on MongoDB database concepts such as locking, transactions, indexes, sharding, replication, and schema design.
- Overcame challenges of data migration from MySQL to MongoDB.

Data Analyst
Cobot Technologies Private Ltd., Thiruvananthapuram, India (10/2017-03/2021)
Responsibilities:
- Extracted large amounts of data from data lakes and big data tables in cloud resources using SQL and PySpark.
- Undertook preprocessing of structured and unstructured data using Python, PySpark, and SQL.
- Performed data manipulation, data preparation, normalization, and predictive modeling.
- Improved efficiency and accuracy by evaluating models in Python.
- Analyzed and solved business problems and found patterns and insights within structured and unstructured data.
- Involved in the analysis of business requirements and in keeping track of data available from various data sources; transformed and loaded the data into target tables using AWS Glue.
- Worked on outlier identification with box plots and K-means clustering using Pandas, NumPy, Matplotlib, and Seaborn.
- Used regression and predictive modelling to optimize marketing strategies, resulting in almost 62% growth in ROI.
- Identified key performance indicators (KPIs) and built interactive dashboards to track the company's advancement in the priority areas.
- Built stories based on the dashboards and presented them to the marketing team.
- Created reports and dashboards to communicate sales performance metrics of 60+ dealers and helped decision making with 96% accuracy. Performed customer segmentation to identify high-yield segments based on the RFM score.
- Developed predictive models on large-scale datasets to address various business problems by leveraging advanced statistical modeling, machine learning, and deep learning.
- Experimented with various predictive models, including Logistic Regression, Support Vector Machine (SVM), Random Forest, XGBoost, and Decision Trees, to check model performance and accuracy.
- Generated reports and visualizations based on the insights, mainly using Tableau, and developed dashboards.
- Built a text classifier on the data glossary using TF-IDF to construct a feature space.
- Analyzed text data using NLP libraries in Python.
- Used Apache Spark to handle huge sets of data and built Machine Learning models using Spark ML libraries.
- Performed data pulls to get data from AWS S3 buckets.
- Built robust Machine Learning models using bagging and boosting methods.
- Involved in Data Analysis, primarily identifying data sets, source data, source metadata, data definitions, and data formats.
- Designed and implemented data integration modules for Extract/Transform/Load (ETL) functions.
- Performed performance tuning of the database, including indexes, optimizing SQL statements, and monitoring the server.

ETL Developer
Global Cynex Inc, Virginia, USA (07/2003-02/2005)
Clients: Forest Laboratories, Commack, NY, and Verizon Inc., Mineola, NY, USA
Responsibilities:
- Worked on various ETL projects using Informatica during this period.
- Developed system design documentation and design objectives. Reviewed the project documentation and made significant adjustments to reflect changes in the project scope.
- Combined and consolidated various external sources (text files, Excel sheets, etc.) and loaded them to SAP BW using Informatica Power Connect. Involved in the creation of the High-Level Design Document, DDD (Detailed Design Document), Mapping Interface Document, and Test Case Document.
- Worked with the team to prepare the detailed design documents with the help of IT-RDD (requirement documents). Worked on dividing the project into separate modules and on the detailed design of individual modules. Extensively worked on developing different mappings.

Informatica Developer
EBS Inc, Virginia, USA (10/2001-07/2003)
Client: Dean Foods, New York, USA
Responsibilities:
- Joined the development team during the design phase of the project and worked with the team to prepare the detailed design documents with the help of IT-RDD (requirement documents).
- Worked on dividing the project into separate modules and on the detailed design of individual modules.
- Worked with the team to identify conformed dimensions and the granularity of fact tables.
- Extensively used Star Schema methodologies in building and designing the logical data model into Dimensional Models.
- Extensively worked on developing different mappings using Informatica tools such as Source Analyzer, Mapping Designer, Workflow Manager, Workflow Monitor, and Repository Manager.
- Involved in the performance tuning of some problem workflows.

Application Developer
Cosmos Systems Ltd, Noida, India (02/1999-07/2001)
Responsibilities:
- Involved in the requirement collection and design of the system.
- Prepared the Entity Relationship diagram using Erwin, depicting all entities involved and their links.
- Generated the tables, constraints, and indexes with appropriate storage parameters on the database.
- Designed various forms for the users to enter, maintain, and process the data, using Developer 2000 in the front end and Oracle in the back end.
- Created 50-plus reports based on user requirements.