Sr. Data Scientist Resume - West Haven, CT
 Candidate's Name
Sr. Data Scientist
Phone: PHONE NUMBER AVAILABLE | Email: EMAIL AVAILABLE | LinkedIn URL: https://LINKEDIN LINK AVAILABLE

SUMMARY:
- Over 10 years of experience in Machine Learning and Deep Learning, specializing in handling large datasets, predictive modeling, and data visualization.
- Extensive experience in the healthcare domain, with a deep understanding of healthcare data and systems, as well as experience in the banking and service domains.
- Proficient in all phases of the data science project lifecycle, including data extraction, cleaning, statistical modeling, and visualization.
- Hands-on experience with deep learning frameworks such as TensorFlow, PyTorch, and Keras.
- Expertise in classical machine learning techniques, including scikit-learn, boosting, and NLP methodologies.
- Skilled in Python programming with extensive use of libraries such as NumPy, Pandas, Matplotlib, SciPy, and scikit-learn for data manipulation and analysis.
- Experienced with data handling and processing tools such as PySpark and Databricks, efficiently managing large datasets (over 1 million rows).
- Proficient in using Generative AI tools, including Transformers, Hugging Face, and LangChain.
- Proficient in Azure services for machine learning and data analytics, including Azure ML, Azure Data Lake Storage, and Azure Databricks.
- Experienced with MLOps tools such as MLflow and Kubeflow for model management and deployment.
- Strong data visualization skills using tools such as Power BI, Plotly Dash, and Azure ML.
- Knowledgeable in machine learning algorithms, including classification, regression, clustering, decision trees, and time series analysis.
- Demonstrated ability to integrate AI/ML models with cloud platforms and APIs for seamless deployment and management.
- Exceptional analytical and communication skills, with a proven ability to work collaboratively with diverse teams and present complex findings clearly.

TECHNICAL SKILLS:
Languages: Python, R, C++/C, SQL, NoSQL
Skills: Machine Learning, Natural Language Processing, Gen AI, Data Handling
Python Libraries: NLTK, SciPy, Matplotlib, NumPy, Pandas, Scikit-Learn, Keras, statsmodels
Deep Learning Frameworks: TensorFlow, PyTorch, Hugging Face Transformers, PyTorch Lightning
Version Control: GitHub, Git, SVN, Bitbucket, SourceTree, Mercurial
IDE: Jupyter Notebook, RStudio
Tools: Confluence
Data Stores: Hadoop HDFS, RDBMS, SQL and NoSQL
Data Query and Manipulation: Hive, Spark-SQL, Scala, MapReduce
Cloud Data Systems: AWS (S3, EC2, EMR, Glue, Lambda, Redshift, SageMaker), Azure (Azure ML, Azure Data Lake Storage, Azure Synapse Analytics), GCP (Vertex AI, AutoML, AI Platform)
Machine Learning Tools: MLflow, Kubeflow
Generative AI & NLP: Transformers, LangChain, OpenAI API
Data Visualization: Power BI, Plotly Dash, Tableau
CI/CD & DevOps: Docker, Jenkins
Machine Learning Methods: Gen AI, Classification, Regression, Dimensionality Reduction, Clustering

PROFESSIONAL EXPERIENCE:

Client: Health First - New York, NY | MAR 2023 - Present
Role: Sr. Data Scientist
Responsibilities:
- Transformed logical data models in Erwin and developed physical data models, ensuring the consistency of data attributes, primary keys, and foreign key relationships.
- Migrated SQL Server databases to Azure cloud databases for data warehousing and predictive analytics using Azure Machine Learning and MS R Server.
- Designed and developed machine learning models to predict patient outcomes using data from Epic EMR systems, focusing on healthcare analytics.
- Utilized deep learning frameworks such as TensorFlow, PyTorch, and Keras to build and optimize models.
- Investigated and implemented Generative Adversarial Networks (GANs) for generating plausible data samples.
- Applied computer vision techniques for inventory optimization, including object detection and image classification.
- Led the design and development of rapid prototypes and proof-of-concept applications on edge devices powered by Large Language Models (LLMs) and TinyML.
- Built responsible and ethical AI/ML solutions using advanced Generative AI tools and techniques, including Transformers, LLMs, RAG, LangChain, GANs, and VAEs.
- Used advanced SQL features and BigQuery-specific functions to enhance query performance and reduce execution times.
- Managed large-scale data processing using Apache Hadoop and Spark, optimizing data pipelines for healthcare analytics.
- Worked with GCP tools including AutoML and Vertex AI, demonstrating the ability to quickly adapt to similar tasks.
- Implemented MLOps practices using MLflow and Kubeflow for managing and deploying machine learning models in production environments.
- Built and maintained end-to-end ML workflows, including tracking metrics, parameters, and model artifacts using MLflow and Azure Machine Learning.
- Deployed machine learning models and applications in production environments, ensuring scalability and reliability.
- Developed interactive dashboards and visualizations using Power BI, Plotly Dash, and D3.js to enhance data interpretation and user experience.
- Led the development of an end-to-end healthcare analytics solution, coordinating cross-functional teams, defining milestones, and managing project schedules to meet deadlines.
- Collaborated with front-end developers to integrate machine learning models into web applications using React.js and ES6+, ensuring seamless user interactions.
- Conducted Exploratory Data Analysis (EDA) using Python libraries such as Pandas, NumPy, SciPy, Matplotlib, and Seaborn; implemented classification, regression, and clustering algorithms.
- Converted raw data into processed data by identifying outliers, errors, trends, missing values, and probability distributions.
- Worked with data architects and stakeholders to understand data movement and storage, and communicated findings clearly to ensure models were incorporated into business processes.
- Delivered complex OLAP databases, scorecards, dashboards, and reports, and utilized data stores such as JSON and XML for machine learning applications.
- Utilized Alteryx for integrating diverse data sources, including SQL databases, Excel files, and cloud services.
- Designed and maintained robust end-to-end testing frameworks for web applications using Cypress, with CI/CD integration and test automation.

Environment: Python, R, SQL, NoSQL, TensorFlow, PyTorch, Keras, Hugging Face Transformers, LangChain, GCP (AutoML, Vertex AI), Azure (ML, Data Lake, Databricks), Apache Hadoop, Spark, MLOps (MLflow, Kubeflow), Power BI, Plotly Dash, D3.js, React.js, ES6+, GitHub, Jenkins, Agile.

Client: Fidelity - Raleigh, NC | FEB 2021 - MAR 2023
Role: Sr. Data Scientist
Responsibilities:
- Modeled complex business problems to uncover insights and identify opportunities using statistical, algorithmic, data mining, and visualization techniques.
- Utilized Azure Data Lake Analytics for on-demand analytics with enterprise-grade security.
- Implemented data processing tasks using Databricks for handling large datasets.
- Extracted and processed data from Azure Data Lake into an HDInsight cluster, applying Spark transformations and actions.
- Designed and implemented MLOps pipelines using Azure and tools such as PyCaret and MLflow for automating data preprocessing, model building, deployment, and monitoring.
- Deployed models using cloud services, demonstrating the ability to adapt to Vertex AI for similar tasks.
- Developed and optimized generative models for applications such as style transfer, text-to-image synthesis, and image production.
- Built Generative AI applications using LLMs and Transformer architectures with tools such as PyTorch, TensorFlow, RAG, LangChain, and vector databases.
- Trained and assessed TinyML models on edge devices for optimized performance and resource efficiency.
- Designed and implemented machine learning algorithms using Spark MLlib and Python.
- Created and applied reinforcement learning algorithms to enhance decision-making.
- Developed and trained convolutional neural networks (CNNs) for image classification tasks.
- Created interactive dashboards and reports using Tableau and Alteryx to visualize data trends and present findings.
- Performed data analysis and visualization with tools such as Python, R, SAS, and Spark.
- Utilized Apache NiFi with GCP Dataflow for data pipeline orchestration and efficient data processing.
- Worked with Kubernetes and Docker for containerizing and deploying applications.
- Implemented ES6+ features for asynchronous data handling in web applications.
- Created a real-time data visualization dashboard for a financial analytics project using Streamlit, enabling users to explore and interact with data insights dynamically.
- Set up a Center of Excellence (CoE) for Gen AI, AI/ML, and MLOps, including team formation, AI standards, policies, and templates.
- Collaborated cross-functionally to integrate machine learning models into business processes and drive actionable insights.
- Applied Test-Driven Development (TDD) principles and utilized Cypress for automated testing of web interfaces.
- Followed Agile methodologies, including Extreme Programming and Scrum, for iterative development and deployment.
- Designed and implemented data schemas in BigQuery for scalable querying of structured and semi-structured data.
- Performed SQL Server migration to Azure cloud databases for data warehousing and predictive analytics.
- Analyzed and documented data processes, scenarios, and information flow to support business objectives.
- Provided insights and recommendations to stakeholders through detailed reports and presentations.

Environment: Python, JavaScript (ES6+), SQL, Oracle 12c, SQL Server, PL/SQL, MLlib, PySpark, Spark, Azure, NLP, JSON, Alteryx, Tableau, XML, MapReduce, GCP, AWS SageMaker.

Client: HSBC - Metropolitan, NY | JUNE 2019 - FEB 2021
Role: Data Scientist
Responsibilities:
- Directed and provided vision for designing a scalable and flexible business intelligence (BI) solution to support strategic and tactical decisions.
- Promoted the use of data-derived insights by creating and socializing key metrics/KPIs.
- Developed strategies to integrate and analyze data from various payment methods (credit card, debit card, check, cash).
- Utilized Azure databases for data management, ensuring secure access with Microsoft credentials.
- Designed BI solutions using Spark and Python, enhancing data analysis and targeting strategies.
- Developed simulation environments for training reinforcement learning models.
- Designed and trained convolutional neural networks (CNNs) for image categorization tasks.
- Collaborated with subject matter experts to create customized NLP pipelines for specific use cases.
- Leveraged Spark and Python for data manipulation and analysis to enhance banking analytics.
- Gained expertise in cloud-based ML platforms and prepared for transitioning to Vertex AI for advanced ML tasks.
- Identified low-profitability and negative-net-margin clients, addressing internal billing discrepancies.
- Assisted students with skills in Python, Pandas, VBA, JavaScript/HTML/CSS, and databases (MySQL, MongoDB).
- Developed modular JavaScript code using ES6+ for improved code reuse and maintainability.
- Modeled the impact of the Visa Fixed Acquirer Network Fee (FANF) and the Durbin Amendment on client revenue.
- Calculated customer lifetime value (CLV) to identify valuable clients and tailor service strategies.
- Analyzed and prepared data using Azure ML, applying historical models for improved decision-making.
- Implemented a continuous delivery pipeline with Docker and GitHub for streamlined deployment.
- Integrated marketing and card transaction data to improve targeting strategies and pricing schemes.
- Developed and maintained ETL pipelines to preprocess and load data into BigQuery using Dataflow, Cloud Composer, and custom scripts.
- Managed data migration to GCP from other cloud platforms, ensuring data integrity and security during the transition.
- Ensured compliance with GDPR and CCPA for computer vision solutions.
- Enhanced the user experience of a machine learning model evaluation tool by designing intuitive, interactive interfaces with Streamlit.
- Created reports and dashboards using SQL Server Reporting Services (SSRS) and MS Visio to visualize data insights.
- Developed strategies for visualizing client and transaction data, including geo-demographic segmentation.
- Created a client loss map to aid in retention programs and sales feedback.
- Analyzed product adoption rates for sales commission calculations.

Environment: Python, JavaScript, SQL, Oracle 12c, SQL Server, PL/SQL, MLlib, Spark, PySpark, Azure, NLP, JSON, Tableau, XML, MapReduce, BigQuery, Dataflow.

Client: Merck - Rahway, New Jersey | MARCH 2018 - MAY 2019
Role: Sr. Data Analyst
Responsibilities:
- Analyzed and processed complex data sets using Python libraries (e.g., Pandas, NumPy) and advanced visualization tools (e.g., Matplotlib, Seaborn).
- Developed a subject segmentation algorithm using R and its packages (e.g., caret, dplyr).
- Designed and implemented an algorithm to identify "bad" assessments, leveraging statistical techniques.
- Trained and optimized TinyML models on edge devices for both accuracy and resource efficiency using TensorFlow Lite and Python.
- Implemented scalable data preprocessing and model training solutions on Databricks, utilizing Spark MLlib.
- Managed data ingestion, transformation, and analysis in HDFS using Apache Hive, Apache Pig, and Apache Sqoop.
- Utilized Oracle for data storage and retrieval, optimizing data management processes.
- Applied statistical methods using R and Python for performance analysis, including predicting days to target metrics.
- Enhanced statistical models such as linear mixed models using R and Python libraries.
- Developed machine learning models including linear regression, KNN, and K-means clustering using Python libraries (e.g., Scikit-Learn).
- Processed and analyzed unstructured data, such as text and multimedia, to derive actionable insights using text mining and natural language processing techniques in Python.
- Built machine learning models on an independent EC2 server to enhance data quality.

Environment: Python, R, Oracle, Hadoop, Scikit-Learn, TensorFlow Lite, Databricks, Apache Hive, Apache Pig, Apache Sqoop, Matplotlib, Seaborn, NLTK.

Client: Upstox - Bengaluru, India | JUNE 2012 - MAY 2016
Role: Data Analyst
Responsibilities:
- Conducted quality checks on data using Python libraries such as Pandas, identifying outliers through techniques like Z-score and IQR, and standardizing data with Min-Max scaling.
- Generated risk stratification reports and visualizations (e.g., bar charts, heat maps) using Tableau to manage and optimize treatment plans.
- Utilized Oracle for efficient querying and data aggregation, employing advanced techniques such as indexing and partitioning.
- Developed and managed data pipeline frameworks using AWS SNS and SQS for asynchronous data flow, handling large volumes of transactional and log data.
- Built and tested predictive risk models using Python, including linear regression and logistic regression, leveraging libraries such as Scikit-Learn and StatsModels.
- Applied cross-validation and performance metrics such as ROC-AUC.
- Tracked and analyzed campaign performance, generating detailed customer profiling reports using Python and Tableau.
- Derived business insights and generated comprehensive reports using Power BI to drive marketing strategies.
- Managed data loading from RDBMS and web logs into HDFS using Hadoop tools such as Hive and Pig Latin.
- Improved data warehouse performance with optimization techniques including query optimization and data partitioning using Python and Hadoop.

Environment: AWS, Hadoop, Hive, Python, Oracle, SQL, Power BI, Tableau, Microsoft Excel.

Education:
- Jagarlamudi Kuppuswamy Choudary College, Bachelor's | MAY 2012
- Sacred Heart University, Master's | DEC 2017
