Azure Data Engineer Resume, Philadelphia, PA

Candidate Information
Title: Azure Data Engineer
Target Location: US-PA-Philadelphia

DATA ENGINEER
Candidate's Name
EMAIL AVAILABLE
PHONE NUMBER AVAILABLE

PROFESSIONAL SUMMARY
Azure Data Engineer with over 5 years of experience in IT, specializing in Data Engineering and Data Warehousing. Highly skilled in Python and SQL, with deep expertise in Azure Data Factory, Synapse Analytics, and Databricks for both batch and real-time data processing. Proficient in data streaming technologies such as Kafka and Azure Event Hubs, along with Spark notebooks, Azure Data Lake Storage (ADLS), Blob Storage, and relational databases. Recognized for optimizing ETL processes and leveraging Power BI for data visualization, while leading successful migration projects to Azure and Snowflake.
- Over 5 years of experience in Data Engineering, with a focus on building scalable data pipelines using Azure Data Factory, including advanced capabilities such as copy and lookup activities and integration with ADLS Gen2.
- Led cloud migration projects, transitioning on-premises data to Azure Data Lake Storage Gen2 using Azure Data Factory and improving scalability, reliability, and cost-efficiency.
- Strong background in managing ETL pipelines, with expertise in enhancing data transfer performance and optimizing storage through compression techniques.
- Experienced in fine-tuning queries and applying indexing strategies for faster data retrieval, with hands-on expertise in Vertica and Teradata for high-performance analytics.
- Skilled in crafting SQL queries, both DDL and DML, and connecting diverse data sources using linked services in Azure Data Factory.
- Advanced knowledge of data warehousing methodologies, including data cleansing, slowly changing dimension (SCD) management, surrogate key assignment, and change data capture (CDC) implementation in Snowflake.
- Proficient in creating scalable data ingestion systems using technologies such as Apache Kafka, Flume, and NiFi, with a focus on processing through Event Hubs.
- Expertise in building and optimizing data models and schemas using technologies such as Hive, HBase, and Snowflake, incorporating ADLS Gen2 for effective data management.
- Specialized in using Azure Databricks and PySpark to build data processing systems that deliver fast, reliable insights across both batch and real-time data.
- Experienced in configuring ADF and SnowSQL jobs in Matillion using Python, and in optimizing Azure Functions for data extraction, transformation, and loading.
- Deep expertise in Hadoop, HDFS, MapReduce, Hive, and PySpark, with experience in Hortonworks and Cloudera distributions.
- Skilled in using Spark to improve algorithm performance in Hadoop environments, leveraging Spark SQL, DataFrames, RDDs, and compression techniques; a minimal PySpark sketch of this pattern follows this summary.
- Proficient in implementing CI/CD pipelines using Jenkins, ensuring seamless integration and deployment in cloud environments.
- Strong background in Agile methodologies, emphasizing cross-functional collaboration to improve project delivery efficiency.
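The sketch below is illustrative only, not code from any project named above: the paths and column names (/data/raw/transactions.csv, customer_id, txn_date, amount) are hypothetical placeholders. It shows the repartition-cache-compress tuning pattern the summary refers to.

from pyspark.sql import SparkSession, functions as F

# Hypothetical ETL tuning sketch: repartition on a key used downstream, cache a
# DataFrame that feeds multiple aggregations, and write compressed Parquet.
spark = SparkSession.builder.appName("etl-tuning-sketch").getOrCreate()

raw = spark.read.option("header", True).csv("/data/raw/transactions.csv")

# Repartitioning by the grouping key reduces shuffle in the aggregations below;
# caching avoids re-reading the source for each aggregation.
by_customer = raw.repartition("customer_id").cache()

daily = (by_customer.groupBy("customer_id", "txn_date")
         .agg(F.sum("amount").alias("daily_total")))
totals = by_customer.groupBy("customer_id").agg(F.count("*").alias("txn_count"))

# Snappy-compressed, partitioned Parquet keeps storage small and scans fast.
(daily.write.mode("overwrite")
      .option("compression", "snappy")
      .partitionBy("txn_date")
      .parquet("/data/curated/daily_totals"))
totals.write.mode("overwrite").parquet("/data/curated/txn_counts")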
TECHNICAL SKILLS:
Azure Services: Azure Data Factory, Azure Data Lake, ADLS Gen2, Azure Databricks, Logic Apps, Function App, Key Vault, Azure Active Directory, Azure Synapse Analytics
Big Data Technologies: HDFS, MapReduce, Hive, Sqoop, Oozie, Zookeeper, Kafka, Apache Spark, Spark Streaming
Databases & Data Warehouses: MS SQL Server 2016/2014/2012, Azure SQL DB, Snowflake, Azure Synapse, MS Excel, MS Access, Oracle 11g/12c, Cosmos DB, Cassandra, PostgreSQL, Teradata, MongoDB, DynamoDB
Hadoop Distributions: Cloudera, Hortonworks
IDE & Build Tools: Eclipse, Visual Studio, PyCharm, DBT
Operating Systems: Windows (XP/7/8/10), UNIX, Linux, Ubuntu, CentOS
Programming Languages: Python, PySpark, shell script, .NET/C#, Perl script, SQL, Java
Version Control: Git, GitHub, Azure Repos
Visualization Tools: Power BI, Tableau, SSRS
Web Technologies: XML, JSP, HTML, SOAP, JavaScript

PROFESSIONAL EXPERIENCE:
Molina Healthcare (Dec 2021 - Present)
Azure Data Engineer
Responsibilities:
- Spearheaded data engineering projects within the Azure Kubernetes Service (AKS) environment, emphasizing reliability, scalability, and efficiency in data operations.
- Architected and executed end-to-end data pipelines, seamlessly integrating Azure services such as SQL Database, Data Lake Storage, and Data Factory.
- Implemented streamlined data integration solutions for ingesting data from various sources, employing tools such as Apache Kafka, Apache NiFi, and Azure Data Factory, with Event Hubs for real-time data streaming; a minimal streaming sketch follows this list.
- Managed data ingestion into Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and orchestrated data processing in Azure Databricks, incorporating Event Hubs for real-time analytics.
- Boosted Spark performance through optimization techniques such as partitioning, caching, and compression, improving data processing efficiency.
- Leveraged Microsoft Azure services such as HDInsight clusters, Blob Storage, Data Factory, and Logic Apps, alongside Event Hubs, for streamlined data collection and processing.
- Executed ETL operations using Azure Databricks, migrating on-premises Oracle ETL processes to Azure Synapse Analytics with optimized storage and query speeds.
- Utilized Snowflake's support for SQL, Python, and other languages for advanced analytics and data processing tasks within AKS.
- Oversaw migration of SQL databases to Azure Data Lake, Azure SQL Database, and Azure Synapse, and integrated Teradata databases for seamless data synchronization and reporting.
- Managed database access and migration to Azure Data Lake Store using Azure Data Factory, including Teradata databases, via efficient data transfer methods.
- Implemented data transfer using Azure Synapse and PolyBase, focusing on compression techniques to improve efficiency and reduce storage costs.
- Leveraged Snowflake's auto-scaling features to ensure optimal performance and resource utilization.
- Deployed and optimized Python web applications through an Azure DevOps CI/CD pipeline, integrating data from event queues for effective application event management.
- Developed enterprise-level solutions using batch and streaming frameworks such as Spark Streaming and Apache Kafka, with event queues for efficient event-driven data workflows.
- Designed and implemented robust data models and schemas using technologies such as Apache Hive, Apache Parquet, and Snowflake, applying compression techniques for optimized storage.
- Managed end-to-end data pipelines using Apache Spark, Apache Airflow, and Azure Data Factory, ensuring reliable and timely data processing and delivery, including integration with Teradata for comprehensive data analysis.
- Collaborated with cross-functional teams to gather requirements, design data integration workflows, and implement scalable data solutions, leveraging Event Hubs for real-time event stream processing.
- Provided production support and troubleshooting for data pipelines, identifying and resolving performance bottlenecks, data quality issues, and system failures, with a focus on optimizing data flows from event queues.
- Actively participated in Agile ceremonies such as Sprint Planning, Daily Stand-ups, Sprint Reviews, and Retrospectives to ensure project progress and team alignment.
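A minimal Spark Structured Streaming sketch of the Event Hubs ingestion pattern above, assuming Event Hubs' Kafka-compatible endpoint (port 9093). The namespace, event hub name, connection string, and storage paths are hypothetical placeholders, not details from this role.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("eventhubs-streaming-sketch").getOrCreate()

# Event Hubs speaks the Kafka protocol on port 9093, so Spark's Kafka source
# can consume it; authentication uses SASL PLAIN with the connection string.
stream = (spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "my-namespace.servicebus.windows.net:9093")
    .option("subscribe", "claims-events")  # the event hub, addressed as a topic
    .option("kafka.security.protocol", "SASL_SSL")
    .option("kafka.sasl.mechanism", "PLAIN")
    .option("kafka.sasl.jaas.config",
            'org.apache.kafka.common.security.plain.PlainLoginModule required '
            'username="$ConnectionString" password="<connection-string>";')
    .load())

# Kafka records arrive as bytes; cast the payload to string for parsing later.
events = stream.select(F.col("value").cast("string").alias("body"),
                       F.col("timestamp"))

# Land the raw stream in the data lake as Parquet, with a checkpoint location
# so the query can recover and avoid duplicate writes after restarts.
query = (events.writeStream.format("parquet")
    .option("path", "/mnt/datalake/raw/claims_events")
    .option("checkpointLocation", "/mnt/datalake/checkpoints/claims_events")
    .start())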
Environment:
Hadoop Cloudera, Microsoft Azure (including Azure Databricks, ETL, and Azure Synapse Analytics), SQL databases, Data Lake Analytics, Databricks, PolyBase, Python, Azure DevOps, CI/CD, Kafka, Spark, Hive, Scala, Spark SQL, Hive tables, Hive generic UDFs, data lakes, Hortonworks, PySpark, RDDs, DataFrames, Git, JIRA, Data Flow, Azure SQL Server.

American Equity, West Des Moines, IA (Aug 2019 - Nov 2021)
Data Engineer
Responsibilities:
- Enterprise Data Lake Design: Architected and implemented an enterprise data lake on AWS, utilizing services such as EC2, S3, Redshift, Athena, Glue, EMR, DMS, Kinesis, SNS, and SQS.
- Data Extraction and Cataloging: Extracted data from various sources, including S3, Redshift, and RDS, using Glue Crawlers to create databases and tables in the Glue Data Catalog.
- ETL Development: Developed Glue ETL jobs in Glue Studio for data processing and transformation, loading the results into Redshift, S3, and RDS. Utilized Glue DataBrew to design reusable transformation recipes.
- AWS Glue ETL Operations: Executed ETL processes in AWS Glue to move data from external sources such as S3 and Parquet/text files into Redshift.
- Snowflake Database Management: Contributed to the development, enhancement, and maintenance of Snowflake database applications. Designed a data warehouse model in Snowflake for over 100 datasets using WhereScape and established data sharing between two Snowflake accounts.
- Data Integration with PySpark: Implemented PySpark jobs in AWS Glue to integrate data from various tables and update the Glue Data Catalog with metadata definitions through Crawlers.
- Stored Procedures and Talend Integration: Developed stored procedures and views in Snowflake for use in Talend, facilitating the loading of dimensions and facts.
- Process Automation: Integrated AWS Lambda with AWS Glue for process automation, leveraging AWS EMR for efficient data transformation and movement. Utilized CloudWatch to configure logs, notifications, alarms, and monitoring for Lambda functions and Glue jobs.
- Talend Joblet Conversion: Adapted Talend Joblets to ensure compatibility with Snowflake functionality.
- Architecture Evaluations: Conducted comprehensive evaluations of the architecture and implementation of the Amazon EMR, Redshift, and S3 services.
- Data Transfer and Transformation: Employed AWS EMR to manage and transform large datasets between AWS storage options such as S3 and DynamoDB. Used Athena to query datasets processed by Glue ETL jobs and created business intelligence reports with QuickSight.
- DMS for Database Migration: Used DMS to migrate tables from various databases, both homogeneous and heterogeneous, to the AWS Cloud.
- Streaming Data Solutions: Developed Kinesis Data Streams, Kinesis Data Firehose, and Kinesis Data Analytics solutions for capturing, processing, and storing streaming data in Redshift, S3, and DynamoDB. Created Lambda functions to trigger AWS Glue jobs based on events in S3; a minimal trigger sketch follows this list.
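A minimal sketch of the S3-event-to-Glue trigger pattern described above, assuming a hypothetical Glue job name (curate-landing-data) and a Lambda function subscribed to S3 object-created notifications; it is illustrative, not this project's actual code.

import boto3

glue = boto3.client("glue")

def handler(event, context):
    # An S3 object-created notification can carry multiple records.
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        # Start the (hypothetical) Glue job, passing the new object's location
        # as a runtime argument the Glue script can read.
        response = glue.start_job_run(
            JobName="curate-landing-data",
            Arguments={"--source_path": f"s3://{bucket}/{key}"},
        )
        print(f"Started Glue run {response['JobRunId']} for s3://{bucket}/{key}")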
Environment:
AWS Glue, AWS Lambda, AWS EMR, AWS S3, AWS Redshift, Spark, AWS Kinesis, AWS Athena, AWS CloudWatch, IAM, SNS, SQS, UNIX shell scripting, AWS QuickSight, GitHub, Jenkins, Python.

Wells Fargo, Dallas, TX (Dec 2018 - Jul 2019)
Azure Data Engineer
Responsibilities:
- Orchestrated a large-scale data migration from on-premises systems to the Azure cloud platform, utilizing Snowflake for efficient data warehousing and storage.
- Designed and implemented a self-hosted integration runtime within Azure Data Factory (ADF) to facilitate a seamless data transfer pipeline from on-premises SQL databases, Oracle databases, CSV files, and REST APIs into Azure Blob Storage.
- Configured network security measures, including VPN and firewall settings, to enable direct data access from on-premises SQL Servers to Azure Databricks via JDBC connectors; a minimal JDBC read sketch follows this list.
- Managed data ingestion, movement, and orchestration using ADF pipelines, leveraging Azure Logic Apps for process automation and Azure Monitor for operational insights and analytics.
- Managed structured and unstructured data using a variety of databases, including Azure Cosmos DB for NoSQL data and Azure SQL Database for relational data.
- Developed data processing workflows using Azure Databricks, leveraging Spark for distributed data processing and transformation tasks.
- Implemented data quality checks and data cleansing techniques to ensure the accuracy and integrity of data throughout the pipeline, using Azure Data Factory and Databricks.
- Developed end-to-end ETL data pipelines, ensuring scalability and smooth functioning, with extensive use of copy activity for data movement and lookup activity for data validation, leveraging linked services to connect on-premises and cloud data sources.
- Implemented optimized query techniques and indexing strategies, enhancing data retrieval efficiency and scalability using SQL queries and ADLS Gen2.
- Integrated Snowflake with Azure cloud services to establish secure and efficient data warehousing solutions, enabling insightful reports for strategic analysis.
- Gained hands-on development experience with Snowflake features such as SnowSQL, Snowpipe, Python, Tasks, Streams, Time Travel, zero-copy cloning, the query optimizer, metadata management, data sharing, and stored procedures.
- Designed and implemented real-time data processing solutions using Kafka and Spark Streaming, facilitating the ingestion, transformation, and analysis of high-volume streaming data.
- Integrated PySpark with Azure Databricks and Azure Blob Storage for seamless data ingestion and processing within the Azure ecosystem.
- Optimized PySpark jobs for performance by leveraging techniques such as partitioning and caching, reducing processing times and improving system efficiency.
- Conducted performance tuning and capacity planning exercises to ensure scalability and efficiency of the data infrastructure.
- Made strategic use of ADLS Gen2 for efficient data storage and management in Azure Functions, optimizing code for data extraction, transformation, and loading.
- Developed complex SQL queries and data models in Azure Synapse Analytics to integrate big data processing and analytics capabilities, enabling seamless data exploration and insight generation.
- Built and optimized data models and schemas using technologies such as Apache Hive and Snowflake, with copy activity streamlining data movement.
- Created ETL transformations and validations using Spark SQL and Spark DataFrames with Azure Databricks and Azure Data Factory, ensuring data accuracy and consistency.
- Integrated GitHub repositories with Azure services for enhanced collaboration and automated deployment workflows within the Azure ecosystem.
- Designed and deployed interactive Power BI dashboards providing real-time insights into various business metrics, enhancing decision-making processes.
- Collaborated with the Azure DevOps team to improve code quality and project management efficiency.
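A minimal sketch of the JDBC ingestion pattern above, assuming hypothetical host, database, table, and partition-bound values; in practice the credentials would be resolved from a Databricks secret scope or Key Vault rather than written as literals.

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("jdbc-ingest-sketch").getOrCreate()

# Read a table from an on-premises SQL Server over JDBC. Host, database,
# table, and bounds are hypothetical placeholders.
orders = (spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://onprem-sql-host:1433;databaseName=sales")
    .option("dbtable", "dbo.orders")
    .option("user", "etl_user")
    .option("password", "<resolved-from-key-vault>")
    .option("numPartitions", 8)             # parallel connections for the read
    .option("partitionColumn", "order_id")  # numeric column to split ranges on
    .option("lowerBound", 1)
    .option("upperBound", 10000000)
    .load())

# Land the extract in Blob Storage / ADLS Gen2 as Parquet for downstream steps.
orders.write.mode("overwrite").parquet("/mnt/landing/sales/orders")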
Environment:
Snowflake, Azure Databricks, Azure Data Factory, Azure Logic Apps, Oracle, Function App, Key Vault, MySQL, Azure SQL Database, HDFS, Spark, Hive, SQL, Python, Scala, PySpark, Git, JIRA, Jenkins, Kafka, Azure ML, Power BI, HBase, Azure DevOps.
