Candidate Information
Title: Data Science Machine Learning
Target Location: US-IL-Chicago
Candidate's Name
Email: EMAIL AVAILABLE | Phone: PHONE NUMBER AVAILABLE
Sr. Data Scientist / ML Engineer

PROFESSIONAL SUMMARY
- Qualified Data Scientist / Data Analyst with over 8 years of experience in data science and analytics, including machine learning, data mining, and statistical analysis.
- Involved in the entire data science project life cycle, including data extraction, data cleaning, statistical modeling, and data visualization, with large sets of structured and unstructured data.
- Experienced with machine learning algorithms such as logistic regression, random forest, XGBoost, KNN, SVM, neural networks, linear regression, lasso regression, and k-means.
- Worked with KBC, chatbots, adaptive supervised learning (deterministic classification), unsupervised learning methods for information extraction, ANNs and deep neural networks for NLP and chatbots, probabilistic models for NLG and inference, and decision science. Integrated the OpenAI API into data science pipelines to automate text-related tasks such as summarizing lengthy documents, building chatbots, and generating creative content.
- Knowledge of information extraction and NLP algorithms coupled with deep learning.
- Extensive work with Python 3.5/2.7 (NumPy, Pandas, Matplotlib, NLTK, and scikit-learn).
- Proficient in predictive modeling, data mining methods, factor analysis, ANOVA, hypothesis testing, the normal distribution, and other advanced statistical and econometric techniques.
- Developed predictive models using decision trees, random forests, Naive Bayes, logistic regression, cluster analysis, and neural networks.
- Experienced in machine learning and statistical analysis with Python scikit-learn.
- Experienced in Python for data loading, extraction, and manipulation, using libraries such as Matplotlib, NumPy, SciPy, and Pandas for data analysis.
- Worked with applications such as R, SAS, MATLAB, and SPSS to develop neural networks and cluster analyses.
- Strong SQL programming skills, with experience in functions, packages, and triggers.

SKILLS
Data Analytics Tools / Programming: Python (NumPy, SciPy, Pandas, Gensim, Keras), R (caret, Weka, ggplot), MATLAB, Microsoft SQL Server, Oracle PL/SQL
Analysis & Modeling Tools: Erwin, Sybase PowerDesigner, Oracle Designer, Rational Rose, ER/Studio, TOAD, MS Visio, SAS
Data Visualization: Tableau, visualization packages, Microsoft Excel
ETL Tools: Informatica PowerCenter, DataStage, Ab Initio, Talend
OLAP Tools: MS SQL Analysis Manager, DB2 OLAP, Cognos PowerPlay
Languages: SQL, PL/SQL, T-SQL, XML, HTML, UNIX shell scripting, C, C++, AWK, JavaScript, COBOL, Base SAS and SAS/SQL
Databases: Oracle, Teradata, DB2 UDB, MS SQL Server, Netezza, Sybase ASE, Informix, MongoDB, HBase, Cassandra, AWS RDS
Project Execution Methodologies: Ralph Kimball and Bill Inmon data warehousing methodologies, Rational Unified Process (RUP), Rapid Application Development (RAD), Joint Application Development (JAD)
Tools & Software: TOAD, MS Office Suite, BTEQ, Teradata SQL Assistant, Scala, NLP, MariaDB, SAS, Spark MLlib, Kibana, Elasticsearch packages, VSS
Reporting Tools: BusinessObjects XI R2, Cognos Impromptu, Informatica Analytics Delivery Platform, MicroStrategy, SSRS, Tableau
Operating Systems: Windows NT/XP/Vista, UNIX (Sun Solaris, HP-UX), MS-DOS

PROFESSIONAL EXPERIENCE

CONA Services 12/2022 - Present
Data Scientist / Machine Learning Engineer
- Gathered, retrieved, and organized data from multiple sources and mapped it into meaningful, usable form.
- Developed a pipeline for collecting data from multiple sources and generating summary statistics into reports that give an overall understanding of the data.
- Trained and fine-tuned NLP models beyond conversational AI, such as sentiment analysis, text summarization, and language translation.
- Worked with databases and data warehouses including Teradata, Oracle, AWS Redshift, and Snowflake.
- Applied statistical techniques such as regression, clustering, and classification to derive insights from data.
- Used Matplotlib, Tableau, and Power BI to create clear visual representations of complex data.
- NLP engineer with a strong interest in research and development of cutting-edge machine learning techniques.
- Conducted A/B tests and hypothesis testing to evaluate the effectiveness of data-driven solutions.
- Built predictive models to forecast trends and outcomes, enhancing decision-making processes.
- Developed machine learning models, such as clustering algorithms (e.g., k-means) and collaborative filtering, to personalize marketing strategies and product recommendations.
- Developed predictive models in R using machine learning algorithms such as random forest, gradient boosting, and XGBoost for real-world business problems.
- Created custom Python geoprocessing tools and scripts inside ArcGIS to automate tedious work and improve data-processing effectiveness in fields including market analysis, emergency response, and natural resource management.
- Initially the data was stored in Snowflake.
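The k-means segmentation described in the personalization bullet above would normally use scikit-learn's KMeans; as a self-contained sketch, here is a plain-NumPy version. The two-feature synthetic "customer" data and the deterministic farthest-point initialization are illustrative choices of mine, not details from the resume.

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Plain NumPy k-means; returns (centroids, labels)."""
    # deterministic farthest-point initialization: start at point 0,
    # then repeatedly pick the point farthest from all chosen centroids
    idx = [0]
    for _ in range(k - 1):
        d = np.min(np.linalg.norm(X[:, None] - X[idx][None], axis=2), axis=1)
        idx.append(int(d.argmax()))
    centroids = X[idx].astype(float)
    for _ in range(iters):
        # assign each point to its nearest centroid
        labels = np.linalg.norm(X[:, None] - centroids[None], axis=2).argmin(axis=1)
        # move each centroid to the mean of its assigned points
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

# two synthetic "customer" groups, e.g. low spend/frequency vs high spend/frequency
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(8, 1, (50, 2))])
centroids, labels = kmeans(X, k=2)
```

In practice the cluster labels would then drive downstream personalization (e.g., different offers per segment); the sketch stops at the assignment step.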
The data was later moved to Azure Data Lake.
- Developed a seasonality analysis pipeline that imputes missing values, removes outliers, and extracts seasonal patterns from historical data using multiple methods, including Darts seasonality checks, the Ljung-Box test, spectral analysis, seasonal decomposition, and seasonal indices.
- Developed multiple time series forecasting models on retail store data.
- Developed anomaly detection algorithms in R to identify unusual patterns or outliers in financial transactions, network traffic, and user behavior.
- Developed classical time series models such as ARMA, ARIMA, and auto-ARIMA, as well as deep learning based time series models such as neural forecasters like N-BEATS.
- Created a report that explains the forecast results to non-technical stakeholders.
- Worked on chatbot product management using NLP/NLU and designed the roadmap for launch and future phases.
- Justified forecast results with hypothesis testing, ensuring statistical significance for business application.
- Used generative AI powered by LLMs to automatically create marketing content such as social media posts, blog articles, and product descriptions.
- Extracted text from promotion images using multiple OCR engines, including Tesseract, EasyOCR, Keras-OCR, and a cognitive-AI-based OCR, and created a tool for completing missing extractions based on historical data.
- Developed a system for analyzing the effect of discounts on sales and finding the best discount for optimal profit.
- Developed machine learning algorithms with standalone Spark MLlib and Python.
- Used histograms, bar plots, pie charts, scatter plots, and box plots to assess the condition of the data.
- Performed data pre-processing tasks such as merging, sorting, outlier detection, missing value imputation, and data normalization to prepare the data for statistical analysis.
Environment: Python 3.9, RStudio, Spark MLlib, A/B testing, SQL Server, Hive, Hadoop cluster, ETL, NumPy, Pandas, Matplotlib, Plotly, Azure ecosystem, Databricks, PySpark, Power BI, scikit-learn, ggplot2, Shiny, TensorFlow, Teradata, Flask.

Numerator 8/2021 - 12/2022
Data Scientist / Machine Learning Engineer
- Gathered, retrieved, and organized data and used it to reach meaningful conclusions.
- Developed a system for collecting data and turning the findings into reports that improved the company's operations.
- Leveraged parallel computing techniques to expedite data analysis and reduce processing times.
- Optimized algorithms for specific NLP tasks, enhancing accuracy and efficiency.
- Applied GIS expertise to diverse domains, from environmental conservation, urban development, and public health to infrastructure management and market research, demonstrating versatility in problem-solving and data analysis across industries.
- Experienced in fact/dimensional modeling (star schema, snowflake schema), transactional modeling, and slowly changing dimensions (SCD).
- Integrated LLMs into chatbot systems to enhance customer support, enabling chatbots to respond to inquiries in a more human-like, contextually relevant manner, improving customer satisfaction and reducing response times.
- Built data models and dimensional models with 3NF, star, and snowflake schemas for OLAP and operational data store (ODS) applications.
- Migrated an on-premises enterprise data warehouse to a cloud-based Snowflake data warehousing solution and enhanced the data architecture to use Snowflake as a single platform for all analytics.
- Set up the analytics system to provide insights. Initially the data was stored in MongoDB; it was later moved to Elasticsearch.
- Used Kibana to visualize data collected from Twitter via the Twitter REST APIs.
- Developed a multi-class, multi-label, two-stage classification model to identify depression-related tweets and classify depression-indicative symptoms.
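A tweet classifier like the one described above would typically be built with scikit-learn; as a dependency-free illustration of the underlying idea, here is a minimal bag-of-words Naive Bayes text classifier. The example "tweets" and labels are invented purely for the sketch.

```python
import math
from collections import Counter

class NaiveBayesText:
    """Minimal multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        n = len(labels)
        # log prior: how common each class is in the training data
        self.prior = {c: math.log(labels.count(c) / n) for c in self.classes}
        # per-class word counts (bag of words)
        self.counts = {c: Counter() for c in self.classes}
        for doc, y in zip(docs, labels):
            self.counts[y].update(doc.lower().split())
        self.vocab = {w for c in self.counts.values() for w in c}
        return self

    def predict(self, doc):
        def log_score(c):
            total = sum(self.counts[c].values()) + len(self.vocab)
            return self.prior[c] + sum(
                math.log((self.counts[c][w] + 1) / total)
                for w in doc.lower().split())
        return max(self.classes, key=log_score)

# invented example tweets, purely for illustration
docs = ["feeling hopeless and so tired",
        "cannot sleep feeling worthless lately",
        "great game last night with friends",
        "loved the concert last night"]
labels = ["indicative", "indicative", "neutral", "neutral"]
clf = NaiveBayesText().fit(docs, labels)
print(clf.predict("so hopeless and tired lately"))  # -> "indicative"
```

A production version would use scikit-learn's CountVectorizer with MultinomialNB, or a transformer model, rather than this hand-rolled sketch; the mechanics are the same.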
- Used the created classification model to calculate the severity of depression in a patient using Python, scikit-learn, Weka, and Meka.
- Conceptualized and created a knowledge graph database of news events extracted from tweets using Java, Virtuoso, Stanford CoreNLP, Apache Jena, and RDF.
- Produced and maintained internal and client-facing reports.
- Handled large datasets using tools such as Hadoop, Spark, and SQL databases for efficient data processing.
- Created data stories that non-technical teams could understand.
- Worked on descriptive, diagnostic, predictive, and prescriptive analytics.
- Implemented character recognition using support vector machines, with performance optimization.
- Monitored data quality and maintained data integrity to ensure the effective functioning of the department.
- Implemented normalization techniques and built tables to the requirements given by business users.
- Built machine learning that automatically scores user assignments based on a few manually scored assignments.
- Used histograms, bar plots, pie charts, scatter plots, and box plots to assess the condition of the data.
- Researched and developed predictive analytics solutions for business needs.
- Processed very large datasets, handling missing values, creating dummy variables, and dealing with various kinds of noise in the data.
- Mined large datasets using sophisticated analytical techniques to generate insights and inform business decisions.
- Built and tested hypotheses, ensured statistical significance, and built statistical models for business application.
- Developed machine learning algorithms with standalone Spark MLlib and Python.
- Designed and developed analytics, machine learning models, and visualizations that drive performance and provide insights, from prototyping through production deployment, including product recommendation and allocation planning.
- Performed data pre-processing tasks such as merging, sorting, outlier detection, missing value imputation, and data normalization to prepare the data for statistical analysis.
- Implemented various machine learning models, including regression, classification, tree-based, and ensemble models; clustered data with techniques such as k-means and further processed it using support vector regression.
- Used classification techniques including random forest and logistic regression to quantify the likelihood of each user referring.
- Accomplished tasks ranging from collecting data to organizing and interpreting statistical information.
- Designed and implemented end-to-end systems for data analytics and automation, integrating custom visualization tools using R, Tableau, and Power BI.
Environment: Python 3.6.4, RStudio, Spark MLlib, regression, A/B testing, SQL Server, Hive, Hadoop cluster, ETL, Tableau, NumPy, Pandas, Matplotlib, Power BI, scikit-learn, ggplot2, Shiny, TensorFlow, Teradata.

Alteryx 4/2020 - 7/2021
Data Scientist
- Collaborated with data engineers and the operations team to implement the ETL process; wrote and optimized SQL queries to perform data extraction to fit the analytical requirements.
- Performed data analysis using Hive to retrieve data from the Hadoop cluster and SQL to retrieve data from Redshift.
- Worked on a Shiny and R application showcasing machine learning for improving business forecasts.
- Used data visualization tools such as Tableau, Matplotlib/Seaborn in Python, and ggplot2/RShiny in R to create interactive, dynamic reports and dashboards.
- Explored and analyzed customer-specific features using Spark SQL.
- Performed univariate and multivariate analysis to identify underlying patterns in the data and associations between variables.
- Performed data imputation using the scikit-learn package in Python.
- Worked with the Cherwell Service Management tool for tickets.
- Performed feature engineering, including feature intersections, feature normalization, and label encoding, with scikit-learn preprocessing.
- Used Python 3.x (NumPy, SciPy, Pandas, scikit-learn, Seaborn) and Spark 2.0 (PySpark, MLlib) to develop a variety of models and algorithms for analytic purposes.
- Used F-score, AUC/ROC, confusion matrices, MAE, and RMSE to evaluate model performance.
- Designed and implemented recommender systems that used collaborative filtering techniques to recommend courses to different customers, and deployed them to an AWS EMR cluster.
- Used natural language processing (NLP) techniques to optimize customer satisfaction.
- Designed rich data visualizations to present data in human-readable form with Tableau and Matplotlib.
Environment: AWS Redshift, EC2, EMR, Hadoop framework, S3, HDFS, Spark (PySpark, MLlib, Spark SQL), Python 3.x (scikit-learn/SciPy/NumPy/Pandas/NLTK/Matplotlib/Seaborn), Tableau Desktop (9.x/10.x), Tableau Server (9.x/10.x), machine learning (regressions, KNN, SVM, decision trees, random forest, XGBoost, LightGBM, collaborative filtering, ensembles), NLP, Teradata, Git 2.x, Agile/Scrum.

Resurface Labs 4/2019 - 3/2020
Data Scientist
- Gathered, analyzed, and translated business requirements into analytic approaches.
- Worked with machine learning algorithms such as neural network models, linear and logistic regression, SVMs, and decision trees for classifying groups and analyzing the most significant variables.
- Converted raw data to processed data by merging and by finding outliers, errors, trends, missing values, and distributions in the data.
- Implemented analytics algorithms in Python and R.
- Performed k-means clustering, regression, and decision trees in R.
- Worked on Naive Bayes algorithms for agent fraud detection using R.
- Performed data analysis, visualization, feature extraction, feature selection, and feature engineering using Python.
- Generated detailed reports after validating the graphs using Python and adjusting the variables to fit the model.
- Worked on clustering and factor analysis for classification of data using machine learning algorithms.
- Used Power Map and Power View to present data effectively to technical and non-technical users.
- Created SQL tables with referential integrity and developed advanced queries using stored procedures and functions in SQL Server Management Studio.
- Worked on risk analysis, root cause analysis, cluster analysis, correlation, and optimization, using the k-means algorithm to cluster data into groups.
Environment: Python, Jupyter, MATLAB, SSRS, SSIS, SSAS, MongoDB, HBase, HDFS, Hive, Pig, SAS, Power Query, Power Pivot, Power Map, Power View, SQL Server, MS Access.

Teradata 1/2018 - 3/2019
Data Analyst / Data Scientist
- Gathered, analyzed, documented, and translated application requirements into data models; supported standardization of documentation and the adoption of standards and practices related to data and applications.
- Participated in data acquisition with the data engineering team, extracting historical and real-time data using Sqoop, Pig, Flume, Hive, MapReduce, and HDFS.
- Wrote user-defined functions (UDFs) in Hive to manipulate strings, dates, and other data.
- Performed data cleaning, feature scaling, and feature engineering using the Pandas and NumPy packages in Python.
- Applied clustering algorithms such as hierarchical and k-means clustering using scikit-learn and SciPy.
- Performed complex pattern recognition on automotive time series data and forecast demand using ARMA and ARIMA models and exponential smoothing for multivariate time series data.
- Delivered and communicated research results, recommendations, and opportunities to the managerial and executive teams, and implemented the techniques for priority projects.
- Designed, developed, and maintained a repository of daily and monthly summary, trending, and benchmark reports in Tableau Desktop.
- Generated complex calculated fields and parameters, toggled and global filters, dynamic sets, groups, actions, custom color palettes, and statistical analyses to meet business requirements.
Environment: Machine learning (KNN, clustering, regressions, random forest, SVM, ensembles), Linux, Python 2.x (scikit-learn/SciPy/NumPy/Pandas), R, Tableau (Desktop 8.x / Server 8.x), Hadoop, MapReduce, HDFS, Hive, Pig, HBase, Sqoop, Flume, Oracle 11g, SQL Server 2012.

Algorithmia 1/2016 - 10/2017
Data Analyst
- Successfully completed a Junior Data Analyst internship at Confidential.
- Built an Expense Tracker and Zonal Desk.
- Identified inconsistencies, correcting them or escalating the problems to the next level.
- Assisted in the development of interface testing and implementation plans.
- Analyzed data for data quality and validation issues.
- Analyzed websites regularly to ensure that site traffic and conversion funnels performed well.
- Collaborated with sales and marketing teams to optimize processes that communicate insights effectively.
- Created and maintained automated reports using SQL.
- Understood the Hadoop architecture and drove the related meetings.
- Conducted safety checks to make sure the team felt safe for retrospectives.
- Aided in data profiling by examining the source data.
- Extracted features from the given dataset and used them to train and evaluate the different classifiers available in the WEKA tool.
Using these features, we differentiated spam messages from legitimate messages.
- Performed data mappings to map the source data to the destination data.
- Developed use case diagrams to identify the users involved, and created activity and sequence diagrams to depict the process flows.
Environment: Python, MATLAB, Oracle, HTML5, Tableau, MS Excel, Server Services, Informatica PowerCenter, SQL, Microsoft Test Manager, Adobe Connect, MS Office Suite, LDAP, Hive, Spark, Pig, Oozie.
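The demand forecasting with exponential smoothing mentioned in the Teradata role, evaluated with MAE/RMSE as in the Alteryx role, would normally use statsmodels; as a dependency-free sketch, here is simple exponential smoothing with a one-step-ahead backtest. The smoothing factor and the toy demand series are illustrative values, not data from the resume.

```python
import math

def ses_forecast(series, alpha=0.3):
    """Simple exponential smoothing: one-step-ahead forecast.

    level_t = alpha * x_t + (1 - alpha) * level_{t-1}
    """
    level = float(series[0])
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

def backtest(series, alpha=0.3):
    """Walk-forward one-step-ahead errors; returns (mae, rmse)."""
    errors = [series[t] - ses_forecast(series[:t], alpha)
              for t in range(1, len(series))]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse

# toy monthly demand series (illustrative numbers, not real data)
demand = [120, 132, 101, 134, 140, 128, 150, 145]
mae, rmse = backtest(demand)
```

ARMA/ARIMA models add autoregressive and moving-average terms on top of this kind of level tracking; in practice one would reach for statsmodels' ARIMA or a library like Darts rather than hand-rolling them.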
