Candidate's Name
Data Engineer
Mobile: PHONE NUMBER AVAILABLE | Email: EMAIL AVAILABLE

SUMMARY
- Seasoned Data Engineer with over 10 years of experience architecting and optimizing end-to-end data pipelines and ETL processes for large-scale, complex datasets across multiple domains.
- Expert in big data technologies, implementing distributed computing solutions with Hadoop, Spark, and Kafka to process petabyte-scale datasets with high efficiency.
- Advanced cloud data engineering skills, with hands-on experience deploying, scaling, and managing data infrastructure on AWS, Azure, and Google Cloud to ensure seamless data flow and storage.
- Strong SQL and NoSQL proficiency, including extensive experience with relational databases (SQL Server, Oracle, MySQL) and NoSQL databases (MongoDB, Cassandra) for versatile data storage.
- Highly skilled in ETL tooling, designing and maintaining complex workflows with Informatica, Talend, and Apache NiFi for robust data transformation and integration.
- Proficient in Python, Java, and Scala for developing custom algorithms, data processing scripts, and automation tasks.
- Proven ability to design, implement, and optimize enterprise data warehouses on Snowflake, Amazon Redshift, and Google BigQuery to support advanced analytics.
- Extensive data modeling experience, creating normalized and denormalized models that ensure optimal query performance and data integrity in complex data ecosystems.
- Skilled in CI/CD practices, integrating pipelines into data engineering workflows with Jenkins, Docker, and Kubernetes for rapid, reliable deployment of data solutions.
- Strong analytical and problem-solving skills, leveraging data-driven insights to solve complex business challenges and drive strategic decision-making in fast-paced environments.

TECHNICAL SKILLS
Programming Languages: Python, PowerShell, Bash, SQL
Methodologies: SDLC, Agile, Waterfall
Packages:
PyTorch, NumPy, Pandas, Matplotlib, SciPy, Scikit-learn, TensorFlow, Seaborn
Visualization Tools: Microsoft Power BI, Qlik Sense, Grafana, Tableau, Advanced Excel (Pivot Tables, VLOOKUP)
IDEs: Jupyter Notebooks, Visual Studio Code, PyCharm
Databases: MongoDB, Redis, MySQL, PostgreSQL, Snowflake, BigQuery (on GCP), T-SQL, Azure Synapse
Other Technical Skills:
Azure, Apache Kafka, Apache Spark, Azure Databricks, Apache NiFi, Flink, Prefect, Apache Beam, Apache Pulsar, AWS Glue, Azure Data Lake Storage, Azure Data Factory, Azure Active Directory, Apache Ranger, Apache Atlas, Prometheus, Splunk, Ansible, Puppet, Nagios, ELK Stack (Elasticsearch, Logstash, Kibana), Docker, Kubernetes, Terraform, SNMP, Data Quality and Governance, Machine Learning Algorithms, Natural Language Processing, Big Data, Advanced Analytics, Statistical Methods, Data Mining, Data Visualization, Data Warehousing, Data Transformation, Critical Thinking, Communication Skills, Presentation Skills, Problem-Solving
Cloud Technologies: Microsoft Azure, AWS (Amazon Web Services), GCP (Google Cloud Platform)
Version Control Tools: Git, Jenkins, GitHub
Operating Systems: Windows, Linux, macOS

DOMAIN SKILLS
Data Engineering | ETL/ELT Processes | Big Data Technologies | Data Warehousing | Database Management | Cloud Platforms | Data Integration | Data Modeling | Data Visualization

EXPERIENCE
Sr. Data Engineer
CVS Health, Austin, TX, USA | January 2022 – Present
- Architected and managed large-scale data pipelines integrating diverse healthcare data sources, including EMR systems, clinical databases, and IoT devices, to support real-time patient data analysis.
- Developed and implemented data lake solutions on Azure, ensuring secure, scalable, HIPAA-compliant storage of structured and unstructured healthcare data.
- Optimized ETL processes for ingesting, transforming, and loading high-volume clinical and operational data into enterprise data warehouses, reducing processing time by 30%.
- Designed and maintained data models tailored to healthcare analytics, accurately representing complex clinical workflows and patient care pathways and improving model accuracy by 25%.
- Collaborated with data scientists and analysts to build predictive models and machine learning pipelines on patient data, enhancing clinical decision-making and achieving a 20% improvement in prediction accuracy.
- Ensured data quality and integrity by implementing automated validation and cleansing procedures across data pipelines supporting critical patient care analytics.
- Led the integration of wearable device data into existing data platforms, enabling real-time monitoring and analysis of patient vitals and other health metrics.
- Drove adoption of advanced analytics tools such as Power BI and Tableau, giving clinical and administrative teams self-service data exploration and reporting capabilities.
- Implemented robust data security measures, ensuring all data handling complied with healthcare regulations and safeguarded sensitive patient information against breaches.
- Coordinated with IT and DevOps teams to integrate CI/CD practices into data engineering workflows, enabling continuous deployment and scaling of data solutions in a healthcare setting.

Sr. Data Engineer
PayPal, Boston, MA, USA | Dec 2020 – January 2022
- Developed real-time transactional data streaming solutions using Apache Kafka and Spark Streaming, providing instantaneous insight into user payment behavior and fraud detection.
- Optimized data warehousing by implementing Snowflake and BigQuery, reducing query processing time by 40% for financial analytics and regulatory compliance reporting.
- Created advanced data models to support predictive analytics in payment risk management, improving the accuracy of fraud detection and credit scoring algorithms by 25%.
- Led the integration of third-party payment data sources, including credit card processors and banking APIs, into PayPal's data ecosystem, improving transaction-monitoring accuracy by 30%.
- Architected scalable data pipelines for high-volume payment transactions, enabling 50% faster transaction approvals and risk assessments while handling up to 5 million transactions per day.
- Designed and implemented ETL workflows for complex payment data, ensuring seamless extraction, transformation, and loading from legacy payment systems into modern cloud-based environments.
- Collaborated with compliance and risk teams to develop data-driven strategies that reduced payment fraud rates by analyzing transaction patterns and user behavior.
- Enhanced data governance practices, implementing automated data lineage and auditing to ensure compliance with financial regulations such as PCI DSS and AML.
- Drove continuous improvement in data quality, leading initiatives to clean, normalize, and enrich transactional datasets, reducing data discrepancies by 20%.
- Integrated AI and machine learning models into data pipelines, enabling personalized payment experiences and more accurate risk assessments through predictive analytics.

Data Engineer
TD Bank, Boston, MA, USA | June 2019 – Nov 2020
- Designed and deployed data lakes in hybrid cloud environments, centralizing storage and analysis of diverse banking data, from customer transactions to loan portfolios.
- Led the migration of legacy data systems to modern cloud-based architectures with minimal downtime while preserving data integrity.
- Engineered high-performance data pipelines for transactional data, reducing latency and improving the efficiency of real-time financial analytics.
- Implemented data encryption and masking techniques, safeguarding sensitive financial data across all stages of the data lifecycle in compliance with regulations such as GDPR and CCPA.
- Developed advanced fraud detection algorithms, integrating real-time analytics and machine learning models to identify and mitigate fraudulent activity within milliseconds.
- Optimized credit scoring models by integrating alternative data sources and enhancing data processing workflows, producing more accurate and inclusive credit evaluations.
- Collaborated with treasury and risk management teams, providing data-driven insights that informed liquidity management and asset-liability modeling strategies.
- Built and managed data mart solutions tailored to banking analytics, enabling faster access to aggregated data for performance reporting and decision-making.
- Integrated AI-powered chatbots with backend data systems, enabling personalized customer service and faster responses to common banking queries.
- Drove the adoption of DataOps practices, standardizing and automating data engineering workflows and reducing time-to-market for data-driven products and services by 30%.

Data Engineer
Optum, Sunnyvale, CA, USA | August 2016 – May 2019
- Developed and maintained data integration workflows, enabling seamless connectivity between Optum's proprietary healthcare platforms and third-party data providers for consistent data flow.
- Architected scalable data processing frameworks for complex healthcare claims data, improving claims adjudication efficiency through automated data transformations.
- Optimized patient data warehousing by implementing near-real-time data synchronization across multiple regions, reducing data access latency for end users by 25%.
- Led the development of predictive analytics pipelines for population health management, enabling identification of at-risk patient groups for early intervention.
- Automated data validation using custom scripts and machine learning techniques, improving detection and correction of anomalies in clinical trial datasets.
- Engineered cloud-based data solutions to integrate genomic data into Optum's analytics platform, facilitating advanced research in personalized medicine.
- Collaborated with product development teams to design data-driven features for Optum's healthcare applications, enhancing user experience and data accessibility for healthcare providers.
- Implemented data versioning and lineage tracking to ensure traceability of data transformations and support audit requirements for clinical data used in regulatory reporting.

Jr. Data Engineer
Deloitte, NY, USA | June 2015 – August 2016
- Assisted in building data pipelines for integrating and transforming client financial data into Deloitte's analytics platforms, ensuring smooth data flow across multiple systems.
- Supported the development of data quality checks, implementing validation scripts to detect inconsistencies in client data and contributing to a 15% reduction in reporting errors.
- Collaborated with senior engineers to optimize ETL processes, improving data processing efficiency and reducing batch processing times by 20% for client reports.
- Maintained data documentation, creating detailed records of data models, workflows, and transformations to ensure transparency and traceability for audit and review.
- Performed data cleansing and normalization to prepare client datasets for advanced analytics, improving the accuracy of predictive models used in client advisory services.
- Participated in automating data ingestion, using Python and SQL to streamline extraction of client data from various sources and reducing manual data handling by 25%.

EDUCATION
MCA, JNTU - 2006
B.Sc. (Computer Science), SVU, Tirupati - 2003