Senior Data Scientist Resume, Ladue, MO
Leonildo Jose De Melo De Azevedo
Phone: PHONE NUMBER AVAILABLE | Email: EMAIL AVAILABLE

Summary: 10+ years in Data Science/ML, 12+ years in Information Technology. Creative Data Scientist and Software Engineer focused on machine learning, with an extensive background in project management, leadership, and financial reporting. Well versed in machine learning techniques such as linear and logistic regression, decision trees, and neural network architectures, and comfortable with deployment and integration on cloud platforms such as AWS and Azure. Applies mathematics, logical thinking, and a touch of good old physics to everyday obstacles. Thrives under pressure; a quick learner and self-starter always ready for a new challenge.

Professional Profile
- Experience applying Naive Bayes, regression, and classification techniques, as well as neural networks, deep neural networks, decision trees, and random forests.
- Familiarity with statistical modeling on large data sets using cloud computing services such as AWS, Azure, and GCP.
- Applies statistical and predictive modeling methods to build reliable systems for real-time analysis and decision-making.
- Expertise in developing creative solutions to business use cases through data analysis, statistical modeling, and innovative thinking.
- Performs EDA to find patterns in business data and communicates findings using visualization tools such as Matplotlib, Seaborn, and Plotly.
- Leads teams to produce statistical or machine learning models and to create APIs and data pipelines for business leaders and product managers.
- Experience with both supervised and unsupervised techniques.
- Implements predictive analytics for sales forecasting and improved decision-making using techniques such as ARIMA, ETS, and Prophet (a forecasting sketch appears after the Technical Skills section below).
- Excellent communication and presentation skills, with experience explaining complex models and ideas to team members and non-technical stakeholders.
- Leads teams to prepare clean data pipelines and to design, build, validate, and refresh machine learning models.
- Applies statistical analysis and machine learning to live data streams from big data sources using PySpark and batch processing techniques.

Technical Skills
Analytic Development: Python, R, Spark, SQL
Python Packages: NumPy, Pandas, Scikit-learn, TensorFlow, Keras, PyTorch, Fastai, SciPy, Matplotlib, Seaborn, Numba
Programming Tools: Jupyter, RStudio, GitHub, Git
Cloud Computing: Amazon Web Services (AWS), Azure, Google Cloud Platform (GCP)
Machine Learning: Natural language processing and understanding, machine intelligence, machine learning algorithms
Analysis Methods: Forecasting, predictive, statistical, sentiment, exploratory, and Bayesian analysis; regression analysis, linear models, multivariate analysis, sampling methods, clustering
Applied Data Science: Natural language processing, machine learning, social analytics, predictive maintenance, chatbots, interactive dashboards
Artificial Intelligence: Classification and regression trees (CART), support vector machines, random forests, gradient boosting machines (GBM), TensorFlow, PCA, regression, Naive Bayes
Natural Language Processing: Text analysis, classification, chatbots
Deep Learning: Machine perception, data mining, machine learning, neural networks, TensorFlow, Keras, PyTorch, transfer learning
Data Modeling: Bayesian analysis, statistical inference, predictive modeling, stochastic modeling, linear modeling, behavioral modeling, probabilistic modeling, time-series analysis
Soft Skills: Excellent communication and presentation skills; ability to work well with stakeholders to discern needs; leadership; mentoring
Other Programming Languages & Skills: APIs, C++, Eclipse, Java, Linux, C#, Docker, Node.js, React.js, Spring, XML, Bootstrap, Django, Flask, CSS, Express.js, front-end, HTML, Kubernetes, back-end, databases, finance, GitHub
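To make the sales-forecasting experience above concrete, here is a minimal sketch of an ARIMA forecast with statsmodels, one of the techniques named in the profile; the file name, column names, and model order are illustrative assumptions rather than details from any specific engagement.

```python
# Minimal ARIMA forecasting sketch (file and column names are hypothetical).
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Load a monthly sales series indexed by date.
sales = (
    pd.read_csv("monthly_sales.csv", parse_dates=["month"])
      .set_index("month")["revenue"]
      .asfreq("MS")
)

# Fit a simple ARIMA(1, 1, 1); in practice the order would be chosen by
# AIC comparison, and ETS or Prophet would be evaluated as alternatives.
model = ARIMA(sales, order=(1, 1, 1))
result = model.fit()

# Forecast the next 12 months for sales planning.
print(result.forecast(steps=12))
```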
Data Science Expert
Bayer, Mar 2024 – Current, Saint Louis, Missouri
As a Data Science Expert, I designed and implemented an API for accessing LLM models and vector databases to enable fast retrieval of agricultural advice. I leveraged LangChain, Hugging Face Transformers, and OpenAI technologies to deliver critical functionality for agricultural insights. Collaborating with cross-functional teams using JIRA, I ensured the chatbot addressed industry-specific needs. By integrating diverse agricultural data sources and fine-tuning the model, I significantly enhanced the chatbot's accuracy and efficiency, improving service quality at Bayer.
- Deploy machine learning models on Azure Kubernetes Service (AKS) or Azure Machine Learning for scalable real-time inference, with continuous integration and continuous deployment (CI/CD) pipelines for rapid model iteration and updates. Use Azure Monitor and Log Analytics to track model performance and ensure uptime for cloud-based AI applications.
- Design and implement scalable data pipelines using Azure services such as Azure Data Factory, Azure Databricks, and Azure Synapse Analytics to process, store, and analyze large datasets. Use Azure Blob Storage for cost-efficient storage of vast agricultural datasets and Azure SQL for structured data management.
- Implement security best practices, leveraging Azure Key Vault to secure credentials and secrets and ensuring compliance with relevant regulations (such as GDPR) in data storage and processing.
- Develop machine learning models using Python libraries such as NumPy, pandas, Scikit-learn, and TensorFlow. Implement data preprocessing, feature engineering, and model evaluation workflows in Python to ensure models are robust and effective in agricultural advisory systems.
- Automate data ingestion, model training, and evaluation workflows with Python scripts, and use Python's API integration capabilities to interact seamlessly with Azure resources and external agricultural data sources.
- Use Python to interact with LLMs, handle natural language processing (NLP) tasks, and build custom workflows that support agricultural advice, leveraging tools such as spaCy and NLTK alongside Hugging Face and OpenAI libraries.
- Leverage LangChain to streamline the development of LLM-powered applications, combining components such as document loaders, vector stores, and retrievers. Use LangChain to build complex query pipelines that efficiently retrieve relevant agricultural insights from LLMs.
- Integrate LangChain with agricultural databases, external APIs, and cloud services to enhance LLM-based chatbots, enabling context-aware, domain-specific responses to agricultural challenges.
- Develop modular applications using LangChain's chains, agents, and tools to create customized workflows for specific agricultural problems, such as crop disease diagnosis or soil nutrient management, optimizing each chain for real-time agricultural support.
- Utilize Hugging Face's vast repository of pretrained models, including BERT, GPT, and RoBERTa, to solve NLP problems specific to agriculture. Fine-tune these models on domain-specific agricultural data to improve their ability to understand complex queries and provide expert advice.
- Apply Hugging Face's Transformer models and NLP pipelines for tasks such as text classification, sentiment analysis, and entity recognition within agricultural contexts; these can be used to analyze farmer feedback, research papers, or weather reports for more precise decision-making.
- Train custom models using Hugging Face's Trainer API and Datasets library, integrating specialized agricultural data to sharpen crop recommendations, pest control strategies, and market trend analysis (a fine-tuning sketch follows before the remaining bullets).
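As an illustration of the Trainer-based fine-tuning just described, here is a hedged sketch of fine-tuning a pretrained BERT classifier on labeled agricultural questions; the CSV file, its text/label columns, the base model, and the hyperparameters are all hypothetical.

```python
# Hedged fine-tuning sketch with the Hugging Face Trainer API; the CSV
# (with "text" and "label" columns) and all hyperparameters are hypothetical.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# Hypothetical CSV of farmer questions labeled by topic (e.g., pests vs. soil).
data = load_dataset("csv", data_files="agri_questions.csv")["train"]
data = data.train_test_split(test_size=0.2)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length")

data = data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="agri-bert", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=data["train"],
    eval_dataset=data["test"],
)
trainer.train()
```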
- Fine-tune OpenAI's GPT models (such as GPT-4) for agricultural use cases, ensuring that model responses are tailored to the specific challenges farmers face, such as soil management, crop yield optimization, and pest control.
- Use OpenAI models to generate embeddings from agricultural documents and queries, allowing efficient similarity search and retrieval with vector databases; this is especially useful in systems that provide rapid agricultural advice based on historical or research data (see the embedding sketch after this list).
- Develop intelligent chatbots with OpenAI's APIs to provide real-time agricultural advice; integrated with LangChain and Azure services, these chatbots deliver comprehensive, context-aware solutions to farmers, improving efficiency and decision-making at scale.
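The embedding-retrieval bullet above can be made concrete with a minimal sketch using the OpenAI Python client and cosine similarity; the sample documents, query, and model choice are assumptions for illustration, not the production configuration.

```python
# Minimal embedding-retrieval sketch (documents and model are assumptions).
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

docs = [
    "Rotate corn with soybeans to restore soil nitrogen.",
    "Aphid outbreaks often follow mild winters.",
]
query = "How do I improve soil nitrogen levels?"

# Embed the documents and the query with the same model.
resp = client.embeddings.create(model="text-embedding-3-small",
                                input=docs + [query])
vectors = np.array([d.embedding for d in resp.data])
doc_vecs, query_vec = vectors[:-1], vectors[-1]

# Rank documents by cosine similarity to the query.
scores = doc_vecs @ query_vec / (
    np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
)
print(docs[int(np.argmax(scores))])
```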
Lead Data Scientist
Campbell Soup, Jun 2023 – Feb 2024, Camden, New Jersey
As Lead Data Scientist at Campbell Soup, I developed regression models on Azure Cloud and Databricks, incorporating key business factors such as marketing spend, macroeconomic trends, seasonality, and competition. I optimized resource allocation to improve ROI and integrated models with existing systems through Python APIs. I also built machine learning models to predict Customer Lifetime Value (CLV) and churn rates, providing actionable insights for customer retention, and leveraged cloud platforms including AWS, Google Cloud, and Azure to ensure scalability and efficiency in data handling and model deployment.
- Developed regression models on Azure Cloud and Databricks, incorporating factors such as marketing expenditures, macroeconomic indicators, seasonality, and competition.
- Integrated models into existing systems using Python-based APIs to enhance marketing strategies.
- Optimized resource allocation to improve return on investment and maintained the models in a scalable cloud environment.
- Continuously updated models to adapt to changing market trends, using continuous integration and deployment on Azure.
- Used Power BI, Snowflake, Databricks, and Azure Cloud.
- Worked in the market data division on segmentation and forecasting, providing insights that helped member companies retain loyal customers.
- Built machine learning models to predict Customer Lifetime Value (CLV) and churn rates using survival analysis techniques (a churn-modeling sketch follows this list).
- Used historical transactional data to construct predictive models and employed unsupervised learning to segment customers based on behavior and characteristics.
- Utilized Python, R, SQL, and distributed data processing with Apache Hadoop/Spark.
- Leveraged cloud platforms such as AWS, Google Cloud, and Azure for data handling, model training, and deployment.
- Implemented regression models for CLV, classification models for churn prediction, and clustering algorithms for segmentation.
- Enhanced model effectiveness through feature engineering and tuning, assessing performance with metrics such as accuracy, precision, recall, F1 score, RMSE, and silhouette score.
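As a sketch of the survival-analysis approach to churn mentioned above, the following fits a Cox proportional hazards model with the lifelines library; the DataFrame columns and toy values are hypothetical stand-ins for real customer data.

```python
# Hedged survival-analysis churn sketch; columns and values are hypothetical.
import pandas as pd
from lifelines import CoxPHFitter

# Each row is a customer: observed tenure in months, whether they churned,
# and covariates believed to drive churn risk.
df = pd.DataFrame({
    "tenure_months": [3, 24, 12, 36, 6, 18, 9, 30],
    "churned":       [1,  0,  1,  0, 1,  0, 0,  1],
    "monthly_spend": [20, 80, 35, 95, 55, 60, 25, 70],
    "num_products":  [1,  3,  2,  4,  1,  2,  1,  3],
})

# Fit a Cox proportional hazards model relating covariates to churn hazard.
cph = CoxPHFitter()
cph.fit(df, duration_col="tenure_months", event_col="churned")
cph.print_summary()

# Relative churn risk per customer (higher means more likely to churn sooner).
print(cph.predict_partial_hazard(df))
```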
Lead Data Scientist
Amdocs, Sep 2021 – May 2023, Chesterfield, Missouri
Document Data Mining using Computer Vision: Amdocs is a global provider of software and services to communications, media, and financial services providers and digital enterprises. As a computer vision and NLP expert, I led a project to develop an object recognition system that combined CNNs with NLP tools to turn scanned documents into database entries. My responsibilities included:
- Developing, evaluating, and training a custom convolutional neural network (CNN) using frameworks such as TensorFlow and Keras in Python.
- Leveraging model checkpoints and early stopping, as well as optimizers such as Adam, to expedite model training (see the sketch after this list).
- Resizing and interpolating images to a standard size and generating rotational and other invariances using the Skimage library.
- Using the CV2 (OpenCV) library to read and render videos.
- Collecting and preprocessing a dataset of over 10,000 images, including objects in natural and artificial environments with varying lighting conditions and camera angles, as well as scanned documents of varying quality and resolution.
- Training and fine-tuning a CNN architecture using TensorFlow and Keras to recognize objects in the dataset with an initial accuracy of 85%.
- Developing an NLP module using spaCy to analyze textual descriptions of objects and their contexts and to generate additional features for the CNN to incorporate.
- Integrating the NLP module with the CNN architecture using a multimodal fusion approach, allowing the model to learn from visual and textual information simultaneously.
- Applying the object recognition system to scanned documents and using Tesseract OCR and AWS Textract to extract and classify text, tables, and other relevant information.
- Developing a post-processing module that used NLP techniques to further analyze and interpret the extracted text and generate structured data outputs.
- Evaluating the NLP-enhanced CNN model and the scanned-document processing system on a holdout set of images and documents, achieving a final accuracy of 93% for object recognition and 90% for document processing, with the false positive rate reduced by 40%.
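The following is a minimal Keras sketch of the training setup described above, combining the Adam optimizer with early stopping and model checkpointing; the input shape, layer sizes, and class count are illustrative assumptions.

```python
# Hedged Keras CNN sketch; architecture and hyperparameters are assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

num_classes = 10  # hypothetical number of object categories

model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),       # images resized to a standard size
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(num_classes, activation="softmax"),
])

model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Early stopping and checkpointing cut training time and retain the best
# weights seen on the validation set, as described above.
callbacks = [
    EarlyStopping(patience=3, restore_best_weights=True),
    ModelCheckpoint("best_cnn.keras", save_best_only=True),
]
# model.fit(train_ds, validation_data=val_ds, epochs=50, callbacks=callbacks)
```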
Data Scientist
Nestle Purina, Jan 2019 – Aug 2021, St. Louis, Missouri
Demand Planning Scientist: As a demand planning statistician at Nestle Purina Pet Care, I developed and enhanced predictive models using complex data science techniques to predict product demand in North America. I used a variety of statistical techniques, including regression, ARIMAX, ESM, and other time-series methods, to improve forecast accuracy, reduce forecast bias, increase customer fulfillment, and predict changes in customer demand.
- Cleaned and transformed data to prepare datasets for further analysis.
- Automated data acquisition, modeling, and visualization to streamline and simplify processes.
- Provided internal data science consulting services, helping business partners identify opportunities and problems that could be addressed through data science solutions; supported small-scale projects from initial ideation through planning, execution, and delivery.
- Combined data inputs (shipment, order, POS, and promotional data) from external sources (Sales, Marketing, Operations Planning, Customer Facing, and more) as potential predictors of customer demand.
- Developed and enhanced forecast models for manufacturing plants and customer accounts using statistical techniques including regression, ARIMAX, and ESM; improved forecast accuracy by 10% and reduced forecast bias by 5%.
- Performed data preprocessing on sensor-generated and IoT data.
- Preprocessed data using PCA and feature elimination while maintaining a classification accuracy of more than 99% with the trained models (see the pipeline sketch after this list).
- Implemented SVMs for faster, less resource-intensive training.
- Implemented various neural networks, such as convolutional and recurrent networks, for problems with large numbers of features.
- Developed K-means, DBSCAN (Density-Based Spatial Clustering of Applications with Noise), and mixture methods such as multivariate Gaussian mixture models for this process.
- The project led to increases in performance, accuracy, precision, and recall.
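A minimal scikit-learn pipeline makes the PCA-plus-SVM approach above concrete; synthetic data stands in for the real sensor/IoT features, and the component counts are assumptions.

```python
# Hedged PCA + SVM pipeline sketch; synthetic data replaces real sensor features.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for high-dimensional sensor data.
X, y = make_classification(n_samples=2000, n_features=100,
                           n_informative=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Scale, reduce dimensionality with PCA, then classify with an SVM;
# fewer input dimensions makes the SVM cheaper to train.
clf = make_pipeline(StandardScaler(), PCA(n_components=20), SVC(kernel="rbf"))
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.3f}")
```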
NLP Engineer, Data Scientist
Buoy Health, Feb 2017 – Dec 2018, Boston, Massachusetts
Medical Document Search and Chatbot: Buoy Health required a symptom checker chatbot that leverages AI to deliver personalized, more accurate diagnoses and medical document search. The company's algorithm was trained on clinical data from 18,000 medical papers to mirror the literature referenced by physicians. Beginning with the symptoms provided by the user via natural language processing, the chatbot matches the symptoms to all possible conditions and then asks clarifying questions to narrow them down to the best selection. Symptom checker chatbots are not clinical decision support (CDS) tools and do not claim to assist with medical decision making (MDM). The bot then assembles a most-likely diagnosis and advises on seeing a provider based on the provided symptoms. Classification was achieved using a TensorFlow sequential model with softmax as the final activation function, because of the sheer number of labels, and stochastic gradient descent as the optimizer (a sketch of this setup follows the list below). Our initial perplexity measurements showed computer understanding of over 90%, with a solution-matching accuracy of over 85%.
- Worked in an environment using Python, NoSQL, Docker, AWS, and Kubernetes.
- Worked with the Python packages NumPy, Pandas, SciPy, Matplotlib, Plotly, and Featuretools for data analytics, cleaning, and feature engineering.
- Used NLTK and Gensim for NLP processes such as tokenization and for creating custom word embeddings.
- Built neural network models with Python's TensorFlow package.
- Implemented BERT-based embeddings.
- Employed numerous models, including convolutional and recurrent neural networks, LSTMs, and Transformers.
- Deployed operationalized models to a RESTful API using the Python Flask package and Docker containers.
- Used Agile approaches, including Extreme Programming, Test-Driven Development, and Agile Scrum.
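Below is a hedged sketch of the classifier setup described above: a TensorFlow sequential model ending in a softmax layer and trained with stochastic gradient descent. The vocabulary size, label count, architecture details, and random stand-in data are assumptions.

```python
# Hedged sequential softmax classifier trained with SGD; sizes and data
# are hypothetical stand-ins for tokenized symptom descriptions.
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

vocab_size, seq_len, num_conditions = 20000, 50, 500  # hypothetical sizes

model = models.Sequential([
    layers.Input(shape=(seq_len,), dtype="int32"),
    layers.Embedding(vocab_size, 64),          # learned word embeddings
    layers.GlobalAveragePooling1D(),
    layers.Dense(128, activation="relu"),
    layers.Dense(num_conditions, activation="softmax"),  # one unit per label
])

# SGD optimizer and softmax output, matching the setup described above.
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-in data; real training used tokenized symptom text.
X = np.random.randint(0, vocab_size, size=(256, seq_len))
y = np.random.randint(0, num_conditions, size=(256,))
model.fit(X, y, epochs=2, batch_size=32)
```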
Junior Data Scientist
Citizens Bank, June 2014 – Jan 2017, Boston, MA
Forecasting and Analytics: The main project targeted price prediction for houses in the greater Boston area. Following a correlation analysis, a few attributes in the dataset appeared to correlate with the price attribute. To train the model with more comprehensive data, Boston Police reports were added as attributes: incidents were first counted per zip code (the original dataset included a zip code attribute), then a function looped through the original data to append a column containing the number of crimes reported in each row's zip code. This revealed a negative correlation between house prices and areas with high crime rates. The model was then trained on number of bathrooms, number of bedrooms, number of crimes, and square footage. To test the model, the same data was extracted from Zillow, a website used to buy and sell property; the model's predictions were 85% accurate compared with the prices listed on the site. Additional forecasting for sales and overall trends was done in parallel with this work.
- Built and integrated logistic and linear regression models, balancing internal requirements on covariance and variable criteria.
- Discovered patterns in data using algorithms, and used experimental and iterative approaches to validate findings.
- Devised and proposed innovative ways to look at problems using business acumen, mathematical theory, data models, and statistical analysis.
- Applied advanced statistical and predictive modeling techniques to build, maintain, and improve real-time decision systems using ARIMA, ETS, and Prophet.
- Used decision trees and random forests to grade the validity of the variables used in the regression models, and implemented bagging and boosting (AdaBoost, XGBoost) to strengthen these models.
- Communicated results and recommendations to business stakeholders weekly, implementing feedback and features based on the evolving needs of the business in a rapidly changing social landscape.
- Performed model evolution between competing groups, selecting the best model for further refinement.
- Deployed the final model in a Flask app on AWS, called via a REST API (a minimal serving sketch follows this list).
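To illustrate the Flask deployment in the final bullet, here is a minimal model-serving sketch; the model file name and feature names are hypothetical placeholders for the actual artifacts.

```python
# Hedged Flask model-serving sketch; model file and features are hypothetical.
import joblib
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("house_price_model.joblib")  # hypothetical trained regressor

@app.route("/predict", methods=["POST"])
def predict():
    # Expects JSON like {"bathrooms": 2, "bedrooms": 3, "crimes": 12, "sqft": 1500}.
    payload = request.get_json()
    features = [[payload["bathrooms"], payload["bedrooms"],
                 payload["crimes"], payload["sqft"]]]
    price = model.predict(features)[0]
    return jsonify({"predicted_price": float(price)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```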
"The S-PLUS: a star/galaxy classification based on a Machine Learning approach." arXiv preprint arXiv:1909.08626 (2019).de Azevedo, Leonildo JM, et al. "Optimized service level agreement establishment in cloud computing." The Computer Journal 61.10 (2018): 1429-1442.de Azevedo, Leonildo J. de M., et al. "An analysis of metaheuristic to SLA establishment in cloud computing." (2017).EducationPhD in Computer Science and Computational MathematicsUniversity of Sao PauloMaster of Science in Computational MathematicsUniversity of Sao PauloBachelors in computer scienceState University of the Midwest, Parana, Brazil
