AKSHITHA
Sr. Data Scientist | Newark, DE
Email: EMAIL AVAILABLE | Phone: PHONE NUMBER AVAILABLE

PROFESSIONAL SUMMARY:
- 8+ years of IT experience as a Data Scientist, with deep expertise in statistical data analysis: transforming business requirements into analytical models, designing algorithms, and building strategic solutions that scale across massive volumes of data.
- Proficient in statistical methods including regression models, hypothesis testing, confidence intervals, principal component analysis, and dimensionality reduction.
- Expert in R and Python scripting; worked with statistical functions in NumPy, visualization with Matplotlib/Seaborn, and Pandas for organizing data.
- 4 years of experience in Scala and Spark.
- Experience with various packages in R and Python, including ggplot2, caret, dplyr, RWeka, rjson, plyr, SciPy, scikit-learn, Beautiful Soup, and Rpy2.
- Extensive experience in text analytics, generating data visualizations using R and Python, and creating dashboards with tools like Tableau.
- Experience writing R and Python code to manipulate data for data loads, extracts, statistical analysis, modeling, and data munging.
- Utilized analytical applications such as R, SPSS, Rattle, and Python to identify trends and relationships between different pieces of data, draw appropriate conclusions, and translate analytical findings into risk management and marketing strategies that drive value.
- Skilled in data parsing, manipulation, and preparation: describing data contents, computing descriptive statistics, regex, split and combine, remap, merge, subset, reindex, melt, and reshape (a minimal pandas sketch follows this summary).
- Highly skilled in visualization tools such as Tableau, ggplot2, and D3.js for creating dashboards.
- Professional working experience with machine learning algorithms such as linear regression, logistic regression, Naive Bayes, decision trees, clustering, and principal component analysis.
- Hands-on experience with big data tools such as Hadoop, Spark, Hive, Pig, Impala, PySpark, and Spark SQL.
- Good knowledge of database creation and maintenance of physical data models with Oracle, Teradata, Netezza, DB2, MongoDB, HBase, and SQL Server databases.
- Experienced in writing complex SQL queries, including stored procedures, triggers, joins, and subqueries.
- Interpret problems and provide solutions to business problems using data analysis, data mining, optimization tools, machine learning techniques, and statistics.
- Knowledge of working with proofs of concept (PoCs) and gap analysis; gathered data for analysis from various sources and prepared it for exploration using data munging and Teradata.
- Experience with data analytics, data reporting, ad-hoc reporting, graphs, scales, pivot tables, and OLAP reporting.
- Able to work with managers and executives to understand business objectives and deliver to business needs; a firm believer in teamwork.
- Experience and domain knowledge across industries including healthcare, insurance, retail, banking, media, and technology.
- Work closely with customers, cross-functional teams, research scientists, software developers, and business teams in an Agile/Scrum environment to drive data model implementations and algorithms into practice.
- Strong written and oral communication skills for presenting to non-technical stakeholders.
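As a concrete illustration of the data-preparation work described above, here is a minimal pandas sketch; the DataFrame, column names, and values are hypothetical stand-ins, not project data.

```python
import pandas as pd

# Hypothetical raw data: one row per customer per quarter.
raw = pd.DataFrame({
    "customer_id": [101, 101, 102, 103],
    "region": ["NE ", "ne", "SW", None],
    "q1_sales": [1200.0, 150.0, 340.0, 90.0],
    "q2_sales": [980.0, 210.0, 400.0, 75.0],
})

# Describe data contents and compute descriptive statistics.
print(raw.dtypes)
print(raw.describe())

# Clean and remap values, then subset and reindex.
raw["region"] = (raw["region"].str.strip().str.upper()
                 .replace({"NE": "NORTHEAST", "SW": "SOUTHWEST"}))
clean = raw.dropna(subset=["region"]).reset_index(drop=True)

# Melt (wide -> long), then reshape back with a pivot table.
long = clean.melt(id_vars=["customer_id", "region"],
                  value_vars=["q1_sales", "q2_sales"],
                  var_name="quarter", value_name="sales")
by_region = long.pivot_table(index="region", columns="quarter",
                             values="sales", aggfunc="sum")
print(by_region)
```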
TECHNICAL SKILLS:
Databases: Oracle, MySQL, SQLite, NoSQL, RDBMS, SQL Server 2014, HBase 1.2, MongoDB 3.2, Teradata, Netezza, Cassandra
Database Tools: PL/SQL Developer, Toad, SQL Loader, Erwin
Web Programming: HTML, CSS, XML, JavaScript
Programming Languages: R, Python, SQL, Scala, UNIX shell, C, Java, Tableau
DWH/BI Tools: DataStage 9.1/11.3, Tableau Desktop, D3.js
Machine Learning: Regression, clustering, SVM, decision trees, classification, recommendation systems, association rules, survival analysis, etc.
Data Visualization: QlikView, Tableau 9.4/9.2, ggplot2 (R), D3, Zeppelin
Big Data Frameworks: HDFS, MapReduce, Pig, Hive, Sqoop, Oozie, ZooKeeper, Flume, HBase, Amazon EC2, S3, Redshift, Spark, Storm, Impala, Kafka
Technologies/Tools: Azure Machine Learning, SPSS, Rattle, Caffe, TensorFlow, Theano, Torch, Keras, NumPy
Scheduling Tools: Autosys, Control-M
Operating Systems: AIX, Linux, UNIX
Cloud Environments: AWS, Azure, Databricks
PROFESSIONAL EXPERIENCE:

Giant Eagle, Pittsburgh, PA                                                        December 2022 to Present
Sr. Data Scientist
Responsibilities:
- Focused on customer clustering through ML and statistical modeling, including building predictive models and generating data products to support customer classification and segmentation.
- Developed an estimation model for various bundled product and service offerings to optimize and predict gross margin.
- Built sales models for various bundled product and service offerings.
- Developed a predictive causal model using annual failure rate and standard cost basis for the new bundled services.
- Designed and developed analytics, machine learning models, and visualizations that drive performance and provide insights, from prototyping through production deployment, product recommendation, and allocation planning.
- Partnered with the sales and marketing teams and collaborated with cross-functional teams to frame and answer important data questions.
- Prototyped and experimented with ML algorithms and integrated them into production systems for different business needs.
- Applied machine learning algorithms with standalone Spark MLlib and R/Python.
- Worked on multiple datasets containing 2 billion values of structured and unstructured data about web application usage and online customer surveys.
- Designed, built, and deployed a set of Python modeling APIs for customer analytics that integrate multiple machine learning techniques for user behavior prediction and support multiple marketing segmentation programs.
- Segmented customers based on demographics using K-means clustering (see the sketch after this section).
- Used classification techniques including random forest and logistic regression to quantify the likelihood of each user making a referral.
- Designed and implemented end-to-end systems for data analytics and automation, integrating custom visualization tools using R, Tableau, and Power BI.
Environment: MS SQL Server, R/RStudio, Python, Spark framework, Redshift, MS Excel, Tableau, T-SQL, ETL, RNN, LSTM, MS Access, XML, MS Office 2007, Outlook.
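The customer-segmentation bullet above can be illustrated with a minimal scikit-learn sketch; the demographic features and the cluster count here are hypothetical, not the production values.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Hypothetical demographic features for a handful of customers.
customers = pd.DataFrame({
    "age":            [23, 45, 31, 52, 40, 29],
    "annual_income":  [38000, 91000, 52000, 120000, 76000, 43000],
    "household_size": [1, 4, 2, 3, 5, 1],
})

# K-means is distance-based, so scale features to comparable ranges first.
scaled = StandardScaler().fit_transform(customers)

# k=3 is an assumption; in practice k would be chosen with the elbow
# method or silhouette scores.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
customers["segment"] = kmeans.fit_predict(scaled)

print(customers.groupby("segment").mean())  # profile each segment
```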
AgFirst, Columbia, SC                                                              May 2020 to November 2022
Data Scientist
Responsibilities:
- Analyzed large data sets to develop multiple custom models and algorithms that drive innovative business solutions.
- Performed data profiling and preliminary data analysis; handled anomalies such as missing values, duplicates, and outliers, and imputed or removed irrelevant data.
- Removed outliers using proximity-distance and density-based techniques.
- Involved in analysis, design, and implementation/translation of business user requirements.
- Used supervised, unsupervised, and regression techniques in building models.
- Performed market basket analysis to identify groups of assets moving together and advised the client on the associated risks.
- Determined trends and significant data relationships using advanced statistical methods.
- Implemented forward selection, backward elimination, and stepwise approaches to select the most significant independent variables.
- Applied feature selection and feature extraction (dimensionality reduction) methods to identify significant variables.
- Used RMSE score, confusion matrix, ROC, cross-validation, and A/B testing to evaluate model performance in both simulated and real-world environments (see the evaluation sketch after this section).
- Performed exploratory data analysis using R and generated graphs and charts for analyzing the data using Python libraries.
- Involved in the execution of multiple business plans and projects; ensured business needs were met and interpreted data to identify trends that carry across future data sets.
- Developed interactive dashboards and created ad hoc reports for users in Tableau by connecting various data sources.
Environment: Python, SQL Server, Hadoop, HDFS, HBase, MapReduce, Hive, Impala, Pig, Sqoop, Mahout, LSTM, RNN, Spark MLlib, MongoDB, Tableau, Unix/Linux.
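A minimal sketch of the model-evaluation workflow mentioned above (cross-validation, confusion matrix, ROC/AUC), using a synthetic dataset in place of the client data.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.metrics import confusion_matrix, roc_auc_score

# Synthetic stand-in for the real modeling data.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

model = LogisticRegression(max_iter=1000)

# 5-fold cross-validation on the training split.
cv_auc = cross_val_score(model, X_train, y_train, cv=5, scoring="roc_auc")
print("CV AUC: %.3f +/- %.3f" % (cv_auc.mean(), cv_auc.std()))

# Held-out evaluation: confusion matrix and ROC AUC.
model.fit(X_train, y_train)
print(confusion_matrix(y_test, model.predict(X_test)))
print("Test AUC: %.3f" % roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```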
Truist Bank, Charlotte, NC                                                         June 2018 to April 2020
Data Scientist
Responsibilities:
- Tackled a highly imbalanced fraud dataset using undersampling, oversampling with SMOTE, and cost-sensitive algorithms with Python scikit-learn (see the sketch after this section).
- Wrote complex Spark SQL queries for data analysis to meet business requirements.
- Developed MapReduce/Spark Python modules for predictive analytics and machine learning in Hadoop on AWS.
- Cleaned data and ensured data quality, consistency, and integrity using Pandas and NumPy.
- Performed feature engineering such as feature-interaction generation, feature normalization, and label encoding with scikit-learn preprocessing.
- Improved fraud prediction performance by using random forest and gradient boosting for feature selection with Python scikit-learn.
- Applied Naive Bayes, KNN, logistic regression, random forest, SVM, and XGBoost to identify whether a loan would default.
- Implemented an ensemble of ridge regression, lasso regression, and XGBoost to predict the potential loan default loss.
- Used various metrics (RMSE, MAE, F-score, ROC, and AUC) to evaluate the performance of each model.
- Used big data tools in Spark (PySpark, Spark SQL, MLlib) to conduct real-time analysis of loan defaults on AWS.
- Conducted data blending and preparation using Alteryx and SQL for Tableau consumption, and published data sources to Tableau Server.
- Created multiple custom SQL queries in Teradata SQL Workbench to prepare the right data sets for Tableau dashboards; queries retrieved data from multiple tables using various join conditions to produce efficiently optimized data extracts for Tableau workbooks.
Environment: MS SQL Server 2014, Teradata, ETL, SSIS, Alteryx, Tableau (Desktop 9.x/Server 9.x), Python 3.x (scikit-learn/SciPy/NumPy/Pandas), Machine Learning (Naive Bayes, KNN, regressions, random forest, SVM, XGBoost, ensembles), AWS Redshift, Spark (PySpark, MLlib, Spark SQL), Hadoop 2.x, MapReduce, HDFS, SharePoint.
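A minimal sketch of handling class imbalance with SMOTE before fitting a fraud classifier. It assumes the imbalanced-learn package is available, and the data here is synthetic rather than the bank's.

```python
from collections import Counter

from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the fraud data: roughly 2% positive class.
X, y = make_classification(n_samples=5000, n_features=12,
                           weights=[0.98], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

# Oversample only the training split so the test set stays realistic.
X_res, y_res = SMOTE(random_state=0).fit_resample(X_train, y_train)
print("before:", Counter(y_train), "after:", Counter(y_res))

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_res, y_res)
print("Test AUC: %.3f" % roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))
```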
Brio Technologies Private Limited, Hyderabad, India                                September 2016 to March 2018
Data Analyst
Responsibilities:
- Involved in analysis, design, and implementation/translation of business user requirements.
- Worked on collecting large data sets using Python scripting and Spark SQL.
- Worked on large sets of structured and unstructured data.
- Created deep learning algorithms using LSTM and RNN architectures.
- Actively involved in designing and developing data ingestion, aggregation, and integration in a Hadoop environment.
- Developed Sqoop scripts to import and export data from relational sources, handling incremental loads of customer and transaction data by date.
- Created Hive tables with partitioning and bucketing (see the sketch at the end of this resume).
- Performed data analysis and data profiling using complex SQL queries on various source systems, including Oracle 10g/11g and SQL Server 2012.
- Identified inconsistencies in data collected from different sources.
- Worked with business owners/stakeholders to assess risk impact and provided solutions.
- Determined trends and significant data relationships using advanced statistical methods.
- Carried out data processing and statistical techniques such as sampling, estimation, hypothesis testing, time series, correlation, and regression analysis using R.
- Applied various data mining techniques: linear and logistic regression, classification, and clustering.
- Took personal responsibility for meeting deadlines and delivering high-quality work.
- Strived to continually improve existing methodologies, processes, and deliverable templates.
Environment: R, SQL Server, Oracle, HDFS, HBase, MapReduce, Hive, Impala, Pig, Sqoop, NoSQL, Tableau, RNN, LSTM, Unix/Linux, Core Java.

Hudda Infotech Private Limited, Hyderabad, India                                   July 2015 to August 2016
Data Analyst
Responsibilities:
- Provided expertise and recommendations for physical database design, architecture, testing, performance tuning, and implementation.
- Designed logical and physical data models for multiple OLTP and analytic applications.
- Extensively used the Erwin design tool and Erwin Model Manager to create and maintain the data mart.
- Designed the physical model for implementation in an Oracle 9i database.
- Involved with data analysis, primarily identifying data sets, source data, source metadata, data definitions, and data formats.
- Performed database performance tuning, including indexing, optimizing SQL statements, and monitoring the server.
- Wrote simple and advanced SQL queries and scripts to create standard and ad hoc reports for senior managers.
- Collaborated on the source-to-target data mapping document and the data quality assessments for the source data.
- Used expert-level understanding of different databases in combination for data extraction and loading, joining data extracted from different databases and loading it to a specific database.
- Coordinated with business users, stakeholders, and SMEs for functional expertise, design and business test scenario reviews, UAT participation, and validation of financial data.
- Worked closely with data architects and the DBA team to implement data model changes in all environments.
- Created PL/SQL packages and database triggers, developed user procedures, and prepared user manuals for the new programs.
- Improved performance of existing data warehouse applications to increase the efficiency of the existing system.
- Designed and developed use cases, activity diagrams, sequence diagrams, and OOD (object-oriented design) using UML and Visio.
Environment: Erwin r9.0, Informatica 9.0, ODS, OLTP, Oracle 10g, Hive, OLAP, DB2, Metadata, MS Excel, Mainframes, MS Visio, Rational Rose, Requisite Pro, Hadoop, PL/SQL, etc.
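Referenced from the Brio Technologies role above: a minimal PySpark sketch of writing a partitioned, bucketed Hive table and appending a date-based incremental load. The table and column names are hypothetical, and the sketch assumes a Spark build with Hive support.

```python
from pyspark.sql import SparkSession

# Hive support lets saveAsTable create managed Hive tables.
spark = (SparkSession.builder
         .appName("hive-partitioning-sketch")
         .enableHiveSupport()
         .getOrCreate())

# Hypothetical transaction batch; in practice this would come from a
# Sqoop-imported staging area filtered to the latest load date.
txns = spark.createDataFrame(
    [(101, "2017-03-01", 120.50), (102, "2017-03-02", 89.99)],
    ["customer_id", "txn_date", "amount"],
)

# Partition by date for incremental loads; bucket by customer for joins.
# Assumes the `analytics` database already exists.
(txns.write
     .partitionBy("txn_date")
     .bucketBy(8, "customer_id")
     .sortBy("customer_id")
     .mode("append")
     .saveAsTable("analytics.transactions"))
```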
