Senior Data Scientist Resume Indianapoli...

Candidate Information
Title	Senior Data Scientist
Target Location	US-IN-Indianapolis
Melake Fissuh
Lead Data Scientist/ ML Engineer/ AI Specialist
Contact: PHONE NUMBER AVAILABLE Email: EMAIL AVAILABLE

Professional Summary
A highly accomplished Senior Data Scientist with 18+ years of IT experience, including 11 years specializing in AI, data mining, deep learning, predictive analysis, and machine learning.
Demonstrated ability to oversee the entire data science project lifecycle, extract insights from extensive datasets, and innovate solutions.
Currently serving as Lead Data Scientist at Eli Lilly & Company, focusing on Data Extraction, Modeling, Wrangling, Statistical Modeling, Mining, Machine Learning, and Visualization.
Academically distinguished, holding a Bachelor of Science degree in Computer Mathematics from Carleton University, Ottawa, ON.
Proven expertise in Natural Language Processing, specifically BERT, ELMo, word2vec, sentiment analysis, Named Entity Recognition, and Topic Modeling, as well as Time Series Analysis.
Proficient in Python and R, utilizing Pandas, NumPy, SciPy, Matplotlib, Seaborn, TensorFlow, Scikit-Learn, and ggplot2.
Extensive experience with AWS, Google Cloud, and Azure; adept in Naive Bayes, Regression, Classification, Neural Networks, Deep Neural Networks, Decision Trees, and Random Forests.
Skilled in querying large datasets from Hadoop Data Lakes, Data Warehouses, AWS (Redshift, Aurora), Cassandra, and NoSQL.
Proficient in statistical analysis and machine learning techniques with PySpark and batch processing.
Successful track record leading teams to operationalize statistical and machine learning models, creating APIs and data pipelines for business leaders and product managers.
Experienced in creating visualizations, interactive dashboards, reports, and data stories using Tableau and Power BI.
Familiar with ensemble techniques such as Bagging, Boosting, and Stacking; hands-on experience with PaLM and OpenAI Davinci (including GPT-3.5 and GPT-4).
Skilled in Exploratory Data Analysis (EDA), communicating findings using Matplotlib, Seaborn, and Plotly.
Proficient in defect tracking using Jira and version control using Git.
Strong stakeholder interaction skills, adept at gathering requirements, defining business processes, and analyzing risks using Agile Methodology and Scrum Process.
Developed novel deep learning architectures including CNNs, LSTMs, and Transformers.
Excellence in NLP methods, time series analysis, and statistical modeling.
Proficient in statistical methodologies including Hypothesis Testing, ANOVA, Time Series, Principal Component Analysis, Factor & Cluster Analysis, and Discriminant Analysis.
Capable of comprehending new domains and designing and implementing effective solutions, with excellent communication, interpersonal, analytical, and leadership skills.
Proficient in explaining complex Data Science concepts to stakeholders and clients.

Technical Skills
Libraries: NumPy, SciPy, Pandas, Theano, Caffe, Scikit-learn, Matplotlib, Seaborn, Plotly, TensorFlow, Keras, NLTK, PyTorch, Gensim, Urllib, BeautifulSoup4, PySpark, PyMySQL, SQLAlchemy, MongoDB, sqlite3, Flask, Deeplearning4j, EJML, dplyr, ggplot2, reshape2, tidyr, purrr, readr, Apache Spark.
Machine Learning Techniques: Supervised Machine Learning Algorithms (Linear Regression, Logistic Regression, Support Vector Machines, Decision Trees and Random Forests, Naive Bayes Classifiers, K Nearest Neighbors), Unsupervised Machine Learning Algorithms (K Means Clustering, Gaussian Mixtures, Hidden Markov Models, Auto Encoders), Imbalanced Learning (SMOTE, AdaSyn, NearMiss), Deep Learning Artificial Neural Networks, Machine Perception
Analytics: Data Analysis, Data Mining, Data Visualization, Statistical Analysis, Multivariate Analysis, Stochastic Optimization, Linear Regression, ANOVA, Hypothesis Testing, Forecasting, ARIMA, Sentiment Analysis, Predictive Analysis, Pattern Recognition, Classification, Behavioral Modeling
Natural Language Processing: Document Tokenization, Token Embedding, Word Models, Word2Vec, fastText, Bag of Words, TF-IDF, BERT, ELMo, LDA
Programming Languages: Python, R, SQL, Java, MATLAB, and Mathematica
Applications: Machine Language Comprehension, Sentiment Analysis, Predictive Maintenance, Demand Forecasting, Fraud Detection, Client Segmentation, Marketing Analysis, Cloud Analytics in cloud-based platforms (AWS, MS Azure, Google Cloud Platform)
Deployment: Continuous improvement in project processes, workflows, automation, and ongoing learning and achievement
Development: Git, GitHub, GitLab, Bitbucket, SVN, Mercurial, Trello, PyCharm, IntelliJ, Visual Studio, Sublime, JIRA, TFS, Linux
Big Data and Cloud Tools: HDFS, Spark, Google Cloud Platform, MS Azure Cloud, SQL, NoSQL, Data Warehouse, Data Lake, SWL, HiveQL, AWS (Redshift, Kinesis, EMR, EC2, Lambda)

Professional Experience

Gen AI Expert/ Lead Data Scientist at Eli Lilly & Company
Indianapolis, Indiana January 2023 - Present
In my role as a Gen AI Expert/ Lead Data Scientist at Eli Lilly & Company, I led the development of Python scripts for processing patient healthcare records, which include attributes such as patient demographics, medical conditions, admission details, and insurance details, ensuring accuracy and enhancing model performance through comprehensive data cleaning and preprocessing. I implemented various topic modeling models, utilized clustering algorithms for topic identification, and applied rigorous QA techniques. I also automated NLP feature APIs, innovated with large language models for topic identification, and deployed AI-driven tools for decision support and predictive analytics, adhering to strict data security and privacy standards.
Some tasks include:
Implemented the LangChain framework to handle chunking and embedding, stored the embeddings in vector databases such as FAISS and Pinecone, and applied a RAG architecture to return contextual responses in a chatbot.
Within this domain-specific customer support bot, used pre-trained models and fine-tuned them to build contextual understanding.
Implemented a variety of topic modeling techniques including LDA, GSDMM, and BERT, utilizing Google Universal Sentence Encoder and CS-Bert embedding layers for hyperdimensional embedding.
Applied K-Means and DBSCAN algorithms to cluster embeddings, identifying topics by locating centroids in the clustered text.
Conducted rigorous Quality Assurance on Machine Learning/Natural Language Processing models, employing methods like Confusion Matrix, Lemmatization, Stemming, Synonyms Relationships, Information Retrieval, Keyword Boosting, and Multi-language Matching Validation.
Utilized testing frameworks such as Pytest, Unittest, Django, and Python Virtual Environments, incorporating modules like NLTK, Scikit-Learn, Pandas, Matplotlib, and Seaborn, along with Git for testing purposes.
Designed and executed production test plans and test cases, covering various testing types, and delivered presentations with mockups.
Automated NLP Features API using Python, implementing Customer Query Service Department Classifications and Text Request Multi-Class Service Department Classifications.
Innovated with Large Language Models (LLMs) such as GPT-3.5 and Llama 2, exploring in-context learning capabilities for topic identification using the Retrieval-Augmented Generation (RAG) approach.
Employed ServiceNow and Confluence (JIRA) for intelligent workflows, and used Python and Postman for API testing to ensure seamless integration and functionality.
Implemented AI systems to analyze data and support decision-making, using AI algorithms for personalization, resource allocation optimization, and predictive analytics.
Deployed AI-driven tools for efficient resource distribution, proactive social issue identification, and AI-powered chatbots for immediate support.
Prioritized AI solutions that adhered to strict data security and privacy standards, ensuring ethical AI practices and continuous improvement.
Used AI-powered tools to gather insights from community feedback and engagement, promoting a more inclusive and participatory approach.

Lead Data Scientist/ ML Engineer at Alignment Healthcare
Orange, CA Apr 2020 - Dec 2022
In my role as a Lead Data Scientist/ML Engineer at Alignment Healthcare, I spearheaded a pivotal data engineering initiative. The project focused on building a resilient framework to acquire, ingest, and curate data from diverse sources, enhancing the foundation for informed decision-making. Leveraging cutting-edge technologies and cloud services, particularly on AWS, I led the team in optimizing the data pipeline. The outcome ensured improved data quality and accessibility, facilitating seamless analytics and decision-making processes.
Some tasks include:
Utilized diverse data science methods to analyze Slack Support usage patterns, conducting evidence-based evaluations of key satisfaction drivers.
Processed and prepared text data using Python and NLTK for normalization, tokenization, stemming, and lemmatization.
Developed customized solutions in Python, utilizing TensorFlow, Keras, and NumPy libraries, and tested various embedders like BERT, Word2Vec, and GloVe.
Applied statistical classifiers, random forests, and logistic regressions for sentiment analysis, constructing an Artificial Neural Network solution for natural language processing.
Implemented and fine-tuned a BERT model for embedding and classification on specific data.
Created processes and tools for monitoring and analyzing performance, enhancing data collection procedures for analytics system optimization.
Developed an OCR-based system to automatically extract patient information from medical documents, streamlining data entry and enhancing patient care.
Designed and deployed advanced OCR models using deep learning techniques, such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), to extract data from diverse healthcare forms, including prescriptions, lab reports, and patient information forms.
Collaborated with IT to improve business performance, processing, cleansing, and verifying data integrity from various sources.
Leveraged data visualization tools to present analyses to the leadership team.
Applied predictive modeling to optimize customer experiences, revenue generation, ad targeting, and other business outcomes.
Provided the leadership team and stakeholders with data-driven solutions, recommending strategies to address business challenges.
Established key success metrics and reporting for the Customer Experience organization.
Prototyped foundational data pipelines, collaborating with the data engineering team to establish canonical sources of truth for Customer Experience metrics.
Identified and communicated potential opportunities to improve the Slack user experience, advocating for evidence-based decision-making.
Wrote test classes for code coverage, developed reports, and utilized them in dashboards.
Supported the design of data models, user interfaces, business logic, and security for custom applications.
Designed REST-based APIs for efficient data piping.
Implemented generative AI models for automated responses to customer inquiries, enhancing efficiency in handling routine queries.
Used generative AI for simulating crisis management scenarios, enabling strategic assessment and enhancement of response strategies.
Utilized generative models to analyze market trends for dynamic pricing strategies, optimizing ticket pricing in real-time.
Developed a POC for generative AI-powered chatbots to assist customers in booking flights and addressing common queries, improving the overall customer experience.
Applied generative models with NLP capabilities to analyze customer feedback, gaining valuable insights for service improvements.

ML-Ops Engineer at New York Life Insurance Company
New York, NY Jan 2018 - Mar 2020
As an ML-Ops/ Data Scientist at New York Life Insurance Company, I led a cross-functional team comprising data engineers, modelers, and ML-ops experts. Our focus was on deploying forecasting and Natural Language Processing (NLP) models to elevate the Customer Experience Department. I authored and tested Transformer-based and Statistical Models to assess client reactions to upgrades and support sessions. Additionally, I integrated forecasting models to predict peak demand points.
Some tasks include:
Designed specialized algorithms for storing and comparing vectorized features and verifications, demonstrating a customized approach to data analysis.
Implemented Convolutional Neural Networks (CNNs) using PyTorch and Python, showcasing proficiency in advanced deep learning technologies.
Conducted meticulous data cleaning on both images and tabular data to ensure dataset quality and reliability.
Designed and implemented statistical evaluation techniques to rigorously assess model performance.
Deployed the developed model using Flask and pickle, highlighting practical implementation skills.
Utilized the AWS Stack for efficient cloud-based infrastructure management.
Integrated the process model with third-party uncertainty quantification software, extracting valuable insights.
Structured data into a tabular format for systematic analysis.
Conducted Exploratory Data Analysis (EDA) on structured data, removing outliers and duplicates to enhance data quality.
Developed processes and tools for monitoring and analyzing performance and data accuracy, improving data collection procedures for analytics system optimization.
Collaborated with IT to continuously improve business performance, processing, cleansing, and verifying data integrity from various sources.
Leveraged the latest data visualization tools to present analyses to the leadership team.
Applied predictive modeling to optimize customer experiences, revenue generation, ad targeting, and other business outcomes.
Provided the leadership team and stakeholders with data-driven solutions, recommending strategies to address business challenges.
Established key success metrics and reporting for the Customer Experience organization.
Prototyped foundational data pipelines and collaborated with the data engineering team to establish canonical sources of truth for Customer Experience metrics.
Identified and communicated potential opportunities to improve the Slack user experience, advocating evidence-based decision-making.
Developed a Machine Learning (ML) model in Python using TensorFlow, successfully deploying it in the production system, underscoring practical applications of AI in real-world scenarios.
Pioneered the creation of a novel class of neural network, termed "Robust Neural Network," in Python, showcasing innovation in algorithmic design.
Integrated developed methods into open-source software for uncertainty quantification, contributing to the advancement of transparent and accessible tools in the field.

Sr. Data Scientist Consultant at ConocoPhillips
Houston, Texas Jan 2016 - Dec 2017
As a Sr. Data Scientist Consultant at ConocoPhillips, I bring a wealth of experience in applying advanced analytics and machine learning to optimize operations and drive data-driven decision-making in the oil and gas sector.
I have successfully led projects focused on predictive modeling, operational optimization, and the integration of large-scale data pipelines to support exploration, production, and reservoir management.
Devised specialized algorithms for the storage and comparison of vectorized features and verifications, demonstrating a tailored approach to data analysis.
Expert in building models for production forecasting, asset optimization, and maintenance prediction, improving operational efficiency and reducing downtime.
Handled large datasets and implemented scalable data pipelines using AWS and Azure, enabling real-time data processing and advanced analytics.
Specialized in analyzing geospatial data for optimizing exploration and drilling strategies, contributing to improved resource utilization and cost savings.
Proficient in designing ETL workflows to handle diverse data sources, ensuring data quality and seamless integration for analysis and reporting.
Implemented Convolutional Neural Networks (CNNs) using PyTorch and Python, showcasing proficiency in cutting-edge deep learning technologies.
Conducted meticulous data cleaning on both images and tabular data, ensuring the quality and reliability of the dataset.
Designed and implemented statistical evaluation techniques to assess the performance of the model, emphasizing a rigorous validation process.
Deployed the developed model using Flask and pickle, showcasing practical implementation skills.
Quantified uncertainties associated with the ANN predictions, providing a nuanced perspective on predictive reliability.
I am a highly skilled Data Scientist with extensive experience in applying advanced analytics and machine learning techniques to solve complex problems in the oil and gas industry.
With a strong background in data science and engineering, I have successfully developed predictive models, optimized operational workflows, and derived actionable insights to drive business growth.
Proficient in building models for demand forecasting, production optimization, and equipment failure prediction using time-series and regression techniques.
Skilled in managing large-scale datasets and cloud platforms like AWS and Azure to streamline data processing pipelines.
Experienced in creating dashboards and reports using Power BI and Python to enable data-driven decision-making for stakeholders.
Knowledgeable in adhering to industry regulations and using data to monitor risks and ensure operational compliance.

Data Scientist/ Statistician at Adidas
Portland, Oregon May 2014 - Dec 2015
As a Data Scientist and Statistician at Adidas, I played a pivotal role in a project focused on sensitivity analysis for a numerical model simulating a chemical process plant. My responsibilities included designing and implementing a neural network to replicate the physics of the process model, with a specific emphasis on optimizing computational efficiency for the sensitivity analysis.
Some tasks include:
Constructed a 2-hidden-layer feed-forward artificial neural network (ANN) using cleaned data, demonstrating expertise in neural network architecture.
Rigorously tested the predictive capabilities of the model, ensuring robust performance in real-world scenarios.
Conducted sensitivity analysis on the ANN to identify the most significant parameters of the process model, contributing to a comprehensive understanding of model behavior.
Quantified uncertainties associated with the ANN predictions, providing a nuanced perspective on predictive reliability.
Published the research work in the Journal of Neural Networks, highlighting academic and practical contributions to the field.

Data Scientist/ Data Engineer at Goldman Sachs Group Inc.
New York, NY Jan 2013 - April 2014
As a Data Scientist/ Data Engineer with Goldman Sachs, I collaborated with the Identity Management Team in the Information Security Division to create self-service tools for internal employees. I focused on implementing cloud controls for identity governance and evaluating risks related to cloud service providers.
Some tasks include:
Enhanced Goldman Sachs' data analysis capabilities through the implementation of diverse classification models, including logistic regression, SVM, random forest, and Naive Bayes, tailored to specific data analysis requirements.
Conducted rigorous model validation and selection using k-fold cross-validation and confusion matrices, with a focus on optimizing for high recall rates.
Collaborated with data engineers to implement data validation rules, quality checks, and cleansing techniques to ensure the accuracy and reliability of data used in modeling and analysis.
Partnered with data engineers to implement real-time data processing systems using Kafka, Spark Streaming, and similar technologies, enabling the real-time analysis and monitoring of key metrics.
Performed advanced statistical analysis, deriving meaningful interpretations of data and identifying trends, patterns, and outliers in large datasets to extract actionable insights.
Applied predictive analytics models to forecast Key Performance Indicators (KPIs) among all attributes, augmenting the organization's analytics capabilities.
Utilized advanced statistical and database management skills to conduct comprehensive data analysis, developing and implementing sophisticated data models for forecasting, predictive analytics, and trend analysis.
Created ready-to-use templates for machine learning models, offering clear descriptions of their purpose and required input variables based on specified criteria.
Generated comprehensive reports and presentations using tools such as Tableau, MS Office, and ggplot2, effectively communicating data trends and associated analyses.
Collaborated closely with data scientists and senior technical staff to identify project needs, document assumptions, and ensure alignment with project goals.
Defined, designed, and documented conceptual and logical data models, contributing to a robust data architecture.
Worked with data engineers to design and optimize data models that support machine learning algorithms and statistical analysis, ensuring that the data structures are aligned with analytical needs.
Established clear data element definitions and created source-target data mapping documents, enhancing data understanding and integration.
Produced report wireframes and collaborated on SQL schema data element definitions, streamlining reporting processes.
Engaged with Data Warehouse architecture, demonstrating proficiency in writing SQL queries and optimizing data retrieval.
Applied dimensional modeling techniques to identify dimension and fact tables, aligning data elements with the overall data structure to improve analytics.

Data Engineer at Mu Sigma
Chicago, Illinois Sep 2010 - Dec 2012
Some tasks include:
Tackled complex business queries through data identification, analysis, visualization, and ROI presentations.
Collaborated with data engineers to manage version control of data engineering scripts using Git, ensuring that data processing workflows were well-documented and reproducible.
Utilized Python (Pandas, NumPy, SciPy), data visualization, and supervised machine learning for decision-making.
Defined a comprehensive Data Pipeline for ingestion, cleaning, and transformation.
Worked together with data engineers, data analysts, and software developers to design end-to-end data solutions that support both data science initiatives and business objectives.
Compiled and analyzed data from diverse sources to drive actionable insights.
Influenced strategic decisions and product enhancements with data-driven insights.
Improved forecasting accuracy using machine learning models.
Aligned data structures with business goals, delivering valuable insights.
Managed data resources effectively through manipulation and visualization.
Collaborated on data governance and models for robust data management.
Applied statistical models like Generalized Linear Models for actionable marketing insights.

Data Analyst at ScienceSoft
McKinney, TX Jan 2006 - Aug 2010
Some tasks include:
Gathered and analyzed data from various sources to identify trends and insights.
Created detailed reports and visualizations to present data findings to stakeholders.
Ensured the accuracy and integrity of data through regular audits and data cleaning processes.

Education
Bachelor of Science in Computer Mathematics
Carleton University, Ottawa, ON

Certifications
Certified Platform Developer, Salesforce, Inc.
Certified Administrator, Salesforce, Inc.
Certified Omni Studio Developer, Salesforce, Inc.