 Candidate's Name
EMAIL AVAILABLE | (Street Address) 709-6390 | https://LINKEDIN LINK AVAILABLE
Data Engineer | Phoenix, AZ

Summary
Experienced Data Engineer with a Master's degree in Information Systems & Technologies and around 7 years of expertise leading data pipeline development and optimizing ETL processes. Proficient in cloud platforms and big data technologies, including GCP, AWS, Snowflake, and Azure. Notable achievements include orchestrating Snowflake adoption and managing an on-premise to AWS migration that delivered substantial performance improvements and cost savings. Seeking a role that leverages technical proficiency and project management skills.

Technical Skills
- Programming Languages: Python, Java, Scala
- Data Storage: SQL, NoSQL databases, Amazon Redshift, Google BigQuery, Snowflake, Data Lakes
- Big Data Technologies: Apache Hadoop, Apache Spark, Databricks, Apache Hive
- Data Processing Frameworks: Apache Kafka, Apache NiFi
- ETL Tools: Apache Airflow, Talend, Informatica
- Cloud Platforms: Amazon Web Services, GCP, Azure
- Containerization and Orchestration: Docker, Kubernetes
- Version Control: GitHub
- Infrastructure: Terraform, Ansible, Jenkins
- Data Visualization: Tableau, Power BI
- Database Management Systems: MySQL, PostgreSQL, Oracle, MongoDB, Snowflake
- Collaboration Tools: Jira, Confluence
- Monitoring and Logging: ELK Stack, Prometheus
- Scripting and Automation: Shell scripting, PowerShell, Bash, Linux, UNIX, PHP
- Continuous Integration/Continuous Deployment: Jenkins, GitLab CI

Work Experience

Sr. Data Engineer | Marsh, Phoenix, AZ | June 2023 - Present
- Led a project to optimize data warehousing by designing and refining SQL queries in BigQuery, resulting in a 30% reduction in query execution time and improved data accessibility for the analytics team.
- Implemented partitioning and clustering in BigQuery tables to enhance query performance and reduce costs, leading to more efficient use of resources (see the illustrative sketch below).
- Developed data models and complex SQL scripts in BigQuery to support advanced analytics and reporting, enabling data-driven decision-making across departments.
- Architected a scalable data storage solution using Google Cloud Storage to manage and secure over 10TB of structured and unstructured data, enabling seamless data retrieval and backup processes.
- Integrated Cloud Storage with other GCP services like Dataflow and BigQuery for efficient data pipelines and ETL processes, ensuring smooth data transfer and processing.
- Implemented lifecycle management policies in Cloud Storage to automatically transition data to lower-cost storage tiers, optimizing storage costs.
- Developed real-time and batch data processing pipelines using Google Dataflow, automating ETL processes and reducing data latency from hours to minutes, significantly improving data availability for downstream applications.
- Integrated Google Dataflow with Pub/Sub for real-time data ingestion, processing millions of messages per day with low latency.
- Optimized Dataflow jobs to reduce processing costs by fine-tuning resource allocation and leveraging autoscaling features.
- Implemented a real-time event-driven architecture with Google Pub/Sub, enabling reliable and scalable message processing for a high-traffic e-commerce platform, improving system responsiveness and user experience.
- Integrated Pub/Sub with Cloud Functions to trigger automated processes based on real-time events, enhancing the responsiveness and efficiency of data workflows.
- Set up Pub/Sub topic monitoring and logging to ensure reliable message delivery and troubleshoot any processing issues quickly.
- Managed a Hadoop/Spark cluster on Google Dataproc for processing large datasets, optimizing resource allocation and reducing processing costs by 25% through efficient job scheduling and tuning.
- Migrated on-premise Hadoop workloads to Google Dataproc, resulting in a more scalable and cost-effective big data processing environment.
- Implemented custom Spark jobs on Dataproc to process and analyze large volumes of data, improving data processing efficiency and scalability.
- Utilized Google Bigtable to support high-throughput, low-latency data operations for a time-series data project, ensuring consistent performance even with large-scale data ingestion and query loads.
- Orchestrated complex ETL workflows using Apache Airflow on Cloud Composer, automating the scheduling and execution of data pipelines, which enhanced data integration and reduced manual intervention by 40%.
- Created interactive dashboards and data visualizations in Looker and Data Studio, providing real-time insights to business stakeholders, leading to more informed decision-making and a 15% increase in operational efficiency.
- Automated the provisioning and management of GCP infrastructure using Terraform, enabling consistent and repeatable deployments across environments, which reduced setup time by 50%.
- Developed serverless Cloud Functions to handle data transformations and automate event-driven processes, streamlining the data flow and reducing the need for manual interventions.
- Implemented and configured Identity and Access Management (IAM) policies to enforce secure access control across GCP resources, ensuring compliance with industry standards and reducing security risks.
- Designed and deployed a secure Virtual Private Cloud (VPC) network architecture, including subnets, firewalls, and VPNs, to ensure secure and efficient data processing in a cloud environment.
- Leveraged Google Cloud Data Catalog to implement a data governance strategy, enabling metadata management and improving data discovery and compliance across the organization.
- Managed a hybrid database solution using Cloud SQL for transactional workloads and Cloud Spanner for globally distributed, scalable databases, optimizing performance and ensuring high availability.
- Set up comprehensive monitoring, logging, and alerting using Google Cloud Monitoring and Logging, ensuring the reliability and performance of data pipelines and enabling proactive issue resolution.
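The BigQuery partitioning and clustering work above can be illustrated with a minimal sketch using the google-cloud-bigquery Python client. The project, dataset, table, and column names are hypothetical placeholders, not the actual production schema.

```python
# Minimal sketch: create a date-partitioned, clustered BigQuery table and
# run a partition-pruned query. Project/dataset/table names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="example-project")  # hypothetical project ID

# Partition by event date and cluster by customer_id so queries that filter
# on these columns scan less data and cost less.
ddl = """
CREATE TABLE IF NOT EXISTS analytics.events (
  event_ts    TIMESTAMP,
  customer_id STRING,
  amount      NUMERIC
)
PARTITION BY DATE(event_ts)
CLUSTER BY customer_id
"""
client.query(ddl).result()  # wait for the DDL job to finish

# This query touches only the January 2024 partitions.
sql = """
SELECT customer_id, SUM(amount) AS total
FROM analytics.events
WHERE DATE(event_ts) BETWEEN '2024-01-01' AND '2024-01-31'
GROUP BY customer_id
"""
for row in client.query(sql).result():
    print(row.customer_id, row.total)
```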
Sr. Data Engineer | Citi Bank, Irving, TX | Dec 2022 - May 2023
- Engineered a data warehousing solution that significantly optimized data processing by leveraging Azure Synapse Analytics, leading to a 40% reduction in data processing times and enhanced reporting speed for business intelligence.
- Created and deployed complex SQL queries within Azure Synapse Analytics to support advanced analytics, allowing stakeholders to quickly extract actionable insights from large data sets.
- Designed a scalable and secure data storage architecture using Azure Data Lake Storage, efficiently managing petabytes of both structured and unstructured data, and supporting diverse data processing needs across the organization.
- Implemented tiered storage policies in Azure Data Lake Storage to automate the migration of infrequently accessed data to lower-cost storage tiers, resulting in a 30% reduction in storage expenses.
- Built and orchestrated real-time and batch data pipelines using Azure Data Factory, automating complex ETL processes that significantly reduced data latency and improved data availability for analytical applications.
- Integrated Azure Data Factory with Azure Synapse and Data Lake, ensuring seamless data transformation and flow across services, which enhanced the efficiency of data operations by 25%.
- Developed custom activities within Azure Data Factory, leveraging Azure Functions to perform advanced data transformations, thereby increasing the flexibility and capability of ETL processes.
- Deployed and optimized Azure Databricks clusters to handle large-scale data processing tasks, reducing job execution times by 20% and enabling more efficient processing of data for analytics.
- Combined Azure Databricks with Azure Data Lake to create a robust data processing and machine learning environment, allowing for faster and more scalable data analysis and model training (see the illustrative sketch below).
- Designed and implemented a real-time streaming solution with Azure Stream Analytics to process data from IoT devices, enabling near real-time analytics and improving decision-making processes.
- Developed a high-throughput, event-driven architecture using Azure Event Hubs to capture and process millions of events per day, ensuring low latency and high reliability for a real-time data platform.
- Implemented detailed monitoring and logging for Azure Event Hubs to maintain reliable event processing and quickly identify and resolve any issues.
- Automated the provisioning and management of Azure infrastructure using Terraform, achieving a 50% reduction in deployment times and ensuring consistent environments across multiple stages of development.
- Strengthened security protocols by implementing Azure Active Directory for access management across cloud resources, ensuring compliance with security standards and reducing the risk of unauthorized access.
- Designed and deployed secure Azure Virtual Networks (VNets) to connect on-premises systems with cloud resources, enhancing network performance, security, and scalability.
- Secured sensitive information by leveraging Azure Key Vault to manage secrets, keys, and certificates, ensuring that applications adhered to strict security and compliance requirements.
- Deployed Azure Cosmos DB to create a globally distributed, multi-model database that supported high availability and low latency for mission-critical applications across different regions.
- Set up Azure Monitor and Log Analytics to enable comprehensive monitoring, logging, and alerting across cloud infrastructure, which ensured the high availability and performance of critical systems.
- Managed and optimized Azure SQL Database to ensure high performance and availability, which led to a 20% improvement in query execution times and overall database efficiency.
- Developed and deployed serverless functions with Azure Functions to automate routine data processing tasks, reducing manual effort and improving overall system efficiency and scalability.
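As a rough illustration of the Azure Databricks plus Data Lake pattern referenced above, the following PySpark sketch reads raw JSON from ADLS Gen2, aggregates it, and writes a partitioned Delta table. The storage account, containers, paths, and column names are assumptions for illustration only.

```python
# Minimal Databricks-style PySpark sketch: ADLS Gen2 in, Delta table out.
# Storage account, containers, and columns are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("adls-delta-sketch").getOrCreate()

raw_path = "abfss://raw@examplestorage.dfs.core.windows.net/events/"              # hypothetical
curated_path = "abfss://curated@examplestorage.dfs.core.windows.net/events_daily/"  # hypothetical

events = spark.read.json(raw_path)

# Aggregate raw device events into a daily summary.
daily = (
    events
    .withColumn("event_date", F.to_date("event_ts"))
    .groupBy("event_date", "device_id")
    .agg(F.count("*").alias("event_count"))
)

# Delta gives ACID writes and time travel; partitioning by date keeps reads cheap.
daily.write.format("delta").mode("overwrite").partitionBy("event_date").save(curated_path)
```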
Data Engineer | Verizon, Hyderabad, India | March 2020 - October 2021
- Designed and optimized a data warehousing solution using Amazon Redshift, reducing query execution time by 40% and enhancing analytics capabilities for large datasets.
- Created complex data models and ETL processes in Amazon Redshift Spectrum to support advanced analytics and ad-hoc querying, improving data accessibility for business intelligence.
- Architected scalable and secure data storage using Amazon S3, managing petabytes of data and enabling efficient data retrieval and backup for various analytical applications.
- Implemented lifecycle policies in Amazon S3 to automate the archival of infrequently accessed data to lower-cost storage tiers, reducing storage costs by 30%.
- Built and maintained real-time and batch data processing pipelines using AWS Glue, automating ETL processes and reducing data latency, thereby improving data availability for downstream applications.
- Integrated AWS Glue with Amazon Redshift and S3 to ensure seamless data transformation and flow across services, which enhanced the efficiency of data operations by 25%.
- Developed custom ETL workflows in AWS Glue, leveraging Python and Spark to perform complex data transformations, enhancing the flexibility and capability of data pipelines (see the illustrative sketch below).
- Deployed and optimized Amazon EMR clusters to process large-scale datasets, reducing job execution times by 20% and enabling efficient big data analytics.
- Combined Amazon EMR with Amazon S3 for scalable data processing and storage, improving performance and scalability for big data workloads.
- Designed and implemented real-time data streaming solutions using Amazon Kinesis, enabling near real-time analytics for data from IoT devices and improving decision-making processes.
- Developed a high-throughput, event-driven architecture using Amazon Kinesis Data Streams to capture and process millions of events per day, ensuring low latency and high reliability.
- Implemented monitoring and logging for Amazon Kinesis Data Streams to maintain reliable event processing and troubleshoot issues promptly.
- Automated infrastructure provisioning and management using AWS CloudFormation, achieving a 50% reduction in deployment times and ensuring consistent environments across development stages.
- Strengthened security by implementing AWS IAM roles and policies for granular access control across AWS resources, ensuring compliance with security standards and reducing unauthorized access risks.
- Designed and deployed a secure Amazon VPC network architecture, including subnets, security groups, and VPNs, to ensure secure and efficient data processing in the cloud.
- Managed sensitive data by leveraging AWS Secrets Manager to securely store and manage secrets and configuration parameters, ensuring compliance and enhancing security.
- Deployed Amazon DynamoDB for a highly available, scalable NoSQL database solution, enabling low-latency access to data for critical applications.
- Set up comprehensive monitoring and alerting using Amazon CloudWatch and AWS CloudTrail, ensuring the reliability and performance of AWS resources and enabling proactive issue resolution.
- Managed relational databases using Amazon RDS, optimizing performance and availability, which led to a 20% improvement in query execution times and overall database efficiency.
- Developed and deployed serverless functions using AWS Lambda to automate routine data processing tasks, reducing manual effort and improving overall system efficiency and scalability.
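A minimal sketch of a Glue-style PySpark ETL script in the spirit of the custom workflows described above, assuming a hypothetical Glue Data Catalog database, table, and S3 output path. It renames and casts columns from a cataloged source, filters out cancelled records, and writes Parquet back to S3 for downstream analytics.

```python
# Minimal AWS Glue ETL sketch (runs inside a Glue job, where the awsglue
# libraries are available). Database, table, and S3 paths are hypothetical.
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.transforms import ApplyMapping, Filter
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read from a table registered in the Glue Data Catalog (hypothetical names).
orders = glue_context.create_dynamic_frame.from_catalog(
    database="telecom_raw", table_name="orders"
)

# Rename/cast columns, then drop cancelled orders.
mapped = ApplyMapping.apply(
    frame=orders,
    mappings=[
        ("order_id", "string", "order_id", "string"),
        ("amount", "double", "amount", "double"),
        ("status", "string", "status", "string"),
    ],
)
active = Filter.apply(frame=mapped, f=lambda row: row["status"] != "CANCELLED")

# Write curated Parquet back to S3 for Redshift Spectrum / downstream analytics.
glue_context.write_dynamic_frame.from_options(
    frame=active,
    connection_type="s3",
    connection_options={"path": "s3://example-curated-bucket/orders/"},
    format="parquet",
)
job.commit()
```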
Data Engineer | CVS Healthcare, Hyderabad, India | Jan 2017 - February 2020
- Implemented a secure data lake architecture using Amazon S3 and AWS Lake Formation to centralize and manage large volumes of health care data, improving data accessibility and compliance with HIPAA regulations.
- Designed and deployed Amazon Redshift clusters for scalable and high-performance analytics, enabling advanced data analysis and reporting for patient care and operational efficiency.
- Built and maintained ETL pipelines using AWS Glue to automate the transformation and loading of patient data from various sources into a centralized data repository, enhancing data integration and quality.
- Utilized Amazon Kinesis Data Firehose to stream real-time patient monitoring data into AWS, enabling immediate analysis and response to critical health metrics and reducing latency in health care delivery.
- Developed a machine learning model with Amazon SageMaker to predict patient health outcomes and optimize treatment plans, leveraging historical patient data to enhance predictive analytics and personalized care.
- Implemented Amazon QuickSight for interactive dashboards and visualizations, providing health care professionals with real-time insights into patient data and operational metrics to support informed decision-making.
- Configured AWS Identity and Access Management (IAM) for strict access control and security policies, ensuring that sensitive health care data is only accessible to authorized personnel and maintaining compliance with regulatory standards.
- Integrated AWS Lambda with Amazon S3 and DynamoDB to automate data processing workflows, reducing manual intervention and improving the efficiency of data handling and analysis (see the illustrative sketch below).
- Set up Amazon CloudWatch and AWS X-Ray for comprehensive monitoring and debugging of health care applications, ensuring system reliability and performance, and quickly identifying and addressing potential issues.
- Architected a high-availability solution using Amazon RDS with Multi-AZ deployments to ensure continuous availability and disaster recovery for critical health care applications and databases.
- Implemented AWS Elastic Beanstalk for scalable deployment of web applications related to patient management and health care services, simplifying application management and scaling as demand fluctuates.
- Leveraged AWS Secrets Manager to securely store and manage sensitive configuration data, such as API keys and database credentials, ensuring data security and reducing risk.
- Designed and implemented a secure Amazon VPC with subnets, NAT gateways, and security groups to isolate and protect health care data and applications, ensuring a secure network environment.
- Automated infrastructure deployment with AWS CloudFormation, ensuring consistent and repeatable infrastructure setups across development, testing, and production environments, reducing setup time and errors.
- Developed a serverless solution using AWS Step Functions and Lambda to orchestrate complex workflows for patient data processing and integration, enhancing automation and reducing operational overhead.
- Utilized Amazon DynamoDB for a scalable, low-latency database solution to manage patient records and appointment scheduling, ensuring high availability and quick data access.
- Implemented AWS Glue Data Catalog to manage and organize metadata for health care data, facilitating efficient data discovery and governance for regulatory compliance and data integrity.
- Set up Amazon Elastic Kubernetes Service (EKS) to manage containerized applications related to health care analytics and services, improving scalability and deployment efficiency.
- Developed a data synchronization solution using AWS DataSync to securely transfer patient data between on-premises systems and AWS, enhancing data integration and backup strategies.
- Integrated AWS Transcribe and Translate services to enable real-time transcription and translation of medical consultations and patient interactions, improving accessibility and communication.
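The Lambda-with-S3-and-DynamoDB integration mentioned above might look roughly like the handler below: it is triggered by S3 object-created events and writes basic file metadata to a DynamoDB table. The table name, partition key, and attribute names are hypothetical and chosen only for illustration.

```python
# Hypothetical AWS Lambda handler: triggered by S3 "ObjectCreated" events,
# it records basic object metadata in a DynamoDB table so downstream jobs
# can discover new files. Table and attribute names are placeholders.
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("patient_file_index")  # hypothetical table name


def lambda_handler(event, context):
    """Index each newly uploaded S3 object for downstream processing."""
    records = event.get("Records", [])
    for record in records:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        size = record["s3"]["object"].get("size", 0)

        table.put_item(
            Item={
                "object_key": key,            # partition key (assumed)
                "bucket": bucket,
                "size_bytes": size,
                "event_time": record["eventTime"],
            }
        )
    return {"indexed": len(records)}
```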
Education

Master's Degree, Computer Science | Southern Arkansas University, Magnolia, AR | July 2023 | GPA 3.25
Bachelor's Degree, Electronics and Communication Engineering | KL University | Aug 2013 - May 2017 | GPA 8.5/10
