Sr Azure Data Engineer Resume Irving, TX

Candidate Information
Title: Sr. Azure Data Engineer
Target Location: US-TX-Irving
 Candidate's Name
Email: EMAIL AVAILABLE
Phone: PHONE NUMBER AVAILABLE
LinkedIn: Candidate's Name
Street Address
Sr. Data Engineer

PROFESSIONAL SUMMARY:
- 9+ years of experience in the software industry, including 5 years of experience with Azure cloud services and 4 years of experience in data warehousing.
- Experience with Azure Cloud, Azure Data Factory, Azure Data Lake Storage, Azure Synapse Analytics, Azure Analysis Services, Azure Cosmos DB (NoSQL), and Databricks.
- Experience in developing, supporting, and maintaining ETL (Extract, Transform and Load) processes using Informatica.
- Experience developing complex mappings, reusable transformations, sessions, and workflows using the Informatica ETL tool to extract data from various sources and load it into targets.
- Proficiency in multiple databases, including MongoDB, Cassandra, MySQL, Oracle, and MS SQL Server.
- Experience developing Spark applications using Spark SQL in Databricks for data extraction, transformation, and aggregation from multiple file formats, analyzing and transforming data to uncover insights into customer usage patterns.
- Used various file formats such as Avro, Parquet, Sequence, JSON, ORC, and text for loading, parsing, gathering, and transforming data.
- Good experience with the Hortonworks and Cloudera distributions of Apache Hadoop.
- Designed and created Hive external tables using a shared metastore with static and dynamic partitioning, bucketing, and indexing.
- Worked with Spark to improve the performance and optimization of existing algorithms in Hadoop using Spark Context, Spark SQL, DataFrames, and pair RDDs.
- Extensive hands-on experience tuning Spark jobs.
- Experienced in working with structured data using HiveQL and optimizing Hive queries.
- Familiarity with Python libraries such as PySpark, NumPy, Pandas, Star base, and Matplotlib.
- Writing complex SQL queries using joins, GROUP BY, and nested queries.
- Experience with HBase, loading data using connectors and writing NoSQL queries.
- Solid capabilities in exploratory data analysis, statistical analysis, and visualization using R, Python, SQL, and Tableau.
- Running and scheduling workflows using Oozie and ZooKeeper, identifying failures, and integrating, coordinating, and scheduling jobs.
- In-depth understanding of Snowflake cloud technology.
- Hands-on experience with Kafka and Flume for loading log data from multiple sources directly into HDFS.
- Widely used Teradata features such as BTEQ, FastLoad, MultiLoad, SQL Assistant, and DDL and DML commands, with a very good understanding of Teradata UPI and NUPI, secondary indexes, and join indexes.
- Working experience building RESTful web services and RESTful APIs.
- Implemented a proof of concept using AWS technologies such as S3 storage, Lambda, EMR, and Redshift.
- Good understanding of AWS Glue.

TECHNICAL SKILLS:

PROFESSIONAL EXPERIENCE:

Ally - Detroit, Michigan    Aug 2022 to Present
Sr. Azure Data Engineer

Responsibilities:
- Architect and implement ETL and data movement solutions using Azure Data Factory and SSIS.
- Understand business requirements, analyze them, and translate them into application and operational requirements.
- Designed a one-time load strategy for moving large databases to Azure SQL Data Warehouse.
- Extract, transform, and load data from source systems to Azure data storage services using Azure Data Factory and HDInsight.
- Created a framework for data profiling, cleansing, automatic restartability of batch pipelines, and rollback handling.
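Purely as an illustration of the data-profiling step described in the bullet above (not the candidate's actual framework), a minimal PySpark sketch with hypothetical paths, column names, and business key might look like this:

```python
# Minimal PySpark profiling sketch; paths, table, and key column are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("staging-profile").getOrCreate()

# Hypothetical staging data landed by an ADF copy activity.
df = spark.read.parquet("/mnt/datalake/staging/customers")

# Row count plus per-column null counts in a single aggregation.
profile = df.agg(
    F.count(F.lit(1)).alias("row_count"),
    *[F.sum(F.col(c).isNull().cast("int")).alias(c + "_nulls") for c in df.columns],
)
profile.show(truncate=False)

# Duplicate check on a hypothetical business key before loading downstream.
dupes = df.groupBy("customer_id").count().filter(F.col("count") > 1)
print("duplicate customer_id values:", dupes.count())
```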
- Design and implement database solutions in Azure SQL Data Warehouse and Azure SQL Database.
- Led a team of six developers to migrate the application.
- Implemented masking and encryption techniques to protect sensitive data.
- Implemented the SSIS integration runtime (IR) to run SSIS packages from ADF.
- Used Azure services such as Azure HDInsight (for Hadoop and Spark), Azure Storage, and Azure Monitor to run and monitor Hadoop and Spark jobs on Azure.
- Leveraged Azure Data Lake Analytics for querying structured data stored in Azure Blob Storage, enabling seamless ingestion into Azure SQL Data Warehouse or Azure Synapse Analytics for further analysis and reporting.
- Utilized Azure SQL Data Warehouse's distributed query processing to create tables with partitioning and distribution keys, optimizing data storage and query performance within the Azure ecosystem.
- Developed a mapping document to map columns from source to target.
- Created Azure Data Factory (ADF) pipelines using Azure Blob storage.
- Performed ETL using Azure Databricks; migrated an on-premises Oracle ETL process to Azure Synapse Analytics.
- Involved in migrating large amounts of data from OLTP to OLAP using ETL packages.
- Worked on Python scripting to automate script generation; data curation was done using Azure Databricks.
- Worked with Azure Databricks, PySpark, HDInsight, Azure SQL Data Warehouse, and Hive to load and transform data.
- Implemented and developed Hive bucketing and partitioning.
- Implemented Kafka and Spark Structured Streaming for real-time data ingestion.
- Used Azure Data Lake as a source and pulled data using Azure Blob storage.
- Good experience working with analysis tools like Tableau and Splunk for regression analysis, pie charts, and bar graphs.
- Developed reports and dashboards using Tableau for quick reviews presented to business and IT users.
- Used stored procedure, lookup, execute pipeline, data flow, copy data, and Azure Function activities in ADF.
- Worked on creating a star schema for drilling data; created PySpark procedures, functions, and packages to load data.
- Extract, transform, and load data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics).
- Ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure SQL Data Warehouse) and processed the data in Azure Databricks.
- Responsible for estimating cluster size and for monitoring and troubleshooting the Spark Databricks cluster.
- Created Databricks notebooks using SQL and Python and automated notebooks using jobs.
- Created Spark clusters and configured high-concurrency clusters using Azure Databricks to speed up the preparation of high-quality data.
- Create and maintain optimal data pipeline architecture in Microsoft Azure using Data Factory and Azure Databricks.

Environment: Hadoop, Hive, MapReduce, Teradata, SQL, Azure Event Hubs, Azure Synapse, Azure Data Factory, Azure Databricks.

Nationwide - Columbus, Ohio    Mar 2020 to Jul 2022
Data Engineer

Responsibilities:
- Employed an Agile methodology of data warehouse development using Kanbanize, overseeing the transition from Azure to AWS infrastructure throughout project iterations.
- Developed a data pipeline using Spark, Hive, and HBase to ingest customer behavioral data and financial histories into a Hadoop cluster for analysis, ensuring seamless migration of these processes from Azure to AWS.
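The preceding bullet describes a Spark and Hive ingestion pipeline for customer behavioral data. A minimal PySpark sketch of that pattern, with hypothetical paths, table names, and columns (not the actual Nationwide pipeline):

```python
# Illustrative PySpark-to-Hive ingestion; paths, table, and columns are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("behavior-ingest")
    .enableHiveSupport()
    .getOrCreate()
)

# Hypothetical raw behavioral events landed as JSON on HDFS.
events = spark.read.json("/data/raw/customer_behavior/")

# Derive a partition column and append into a partitioned Hive table for analysis.
(
    events
    .withColumn("event_date", F.to_date("event_ts"))
    .write
    .mode("append")
    .partitionBy("event_date")
    .saveAsTable("analytics.customer_behavior")
)
```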
- Leveraged the Azure Databricks cloud for data organization and visualization, facilitating the transition of these functionalities to AWS cloud infrastructure.
- Performed ETL on data from different source systems to Azure data storage services, integrating Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics), while preparing for migration to AWS data storage solutions.
- Created database tables and stored procedures as required for reporting and ETL needs, ensuring compatibility and optimization for migration to AWS database services.
- Managed Databricks job configuration and refactoring of ETL Databricks notebooks, preparing for the transfer of these tasks to AWS Databricks.
- Implemented data ingestion from various source systems using Sqoop and PySpark, laying the groundwork for similar processes in AWS infrastructure.
- Performed end-to-end architecture and implementation assessment of various AWS services such as Amazon EMR, Redshift, S3, Athena, Glue, and Kinesis, aligning the project architecture for migration to AWS.
- Provided thought leadership for the architecture and design of Big Data analytics solutions for customers, actively driving Proof of Concept (POC) and Proof of Technology (POT) evaluations for migration to AWS Big Data solutions.
- Hands-on experience implementing performance tuning of Spark and Hive jobs.
- Managed the Spark Databricks clusters through proper troubleshooting, estimation, and monitoring.
- Performed data aggregation and validation on Azure HDInsight using Spark scripts written in Python.
- Performed monitoring and management of the Hadoop cluster using Azure HDInsight.
- Involved in extraction, transformation, and loading of data directly from different source systems (flat files/Excel/Oracle/SQL) using SAS/SQL and SAS macros.
- Generated PL/SQL scripts for data manipulation, validation, and materialized views for remote instances.
- Created partitioned tables in Hive, designed a data warehouse using Hive external tables, and created Hive queries for analysis.
- Good experience working with analysis tools like Tableau and Splunk for regression analysis, pie charts, and bar graphs.
- Created and modified several database objects such as tables, views, indexes, constraints, stored procedures, packages, functions, and triggers using SQL and PL/SQL.
- Created large datasets by combining individual datasets using various inner and outer joins in SAS/SQL and dataset sorting and merging techniques using SAS/Base.
- Extensively worked on shell scripts for running SAS programs in batch mode on UNIX.
- Wrote Python scripts to parse XML documents and load the data into a database (an illustrative sketch appears below).
- Used Hive, Impala, and Sqoop utilities and Oozie workflows for data extraction and data loading.
- Created HBase tables to store various formats of data coming from different sources.
- Responsible for importing log files from various sources into HDFS using Flume.
- Responsible for translating business and data requirements into logical data models in support of enterprise data models, ODS, OLAP, OLTP, and operational data structures.
- Created SSIS packages to migrate data from heterogeneous sources such as MS Excel, flat files, and CSV files.
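One bullet above mentions Python scripts that parse XML documents and load the data into a database. A minimal hedged sketch of that pattern, using hypothetical element names and SQLite standing in for the actual target database:

```python
# Illustrative XML-to-database load; element names and target table are hypothetical.
import sqlite3
import xml.etree.ElementTree as ET

def load_orders(xml_path: str, db_path: str) -> int:
    """Parse <order> elements from an XML file and insert them into an orders table."""
    tree = ET.parse(xml_path)
    rows = [
        (o.findtext("id"), o.findtext("customer"), o.findtext("amount"))
        for o in tree.getroot().iter("order")
    ]
    with sqlite3.connect(db_path) as conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS orders (id TEXT, customer TEXT, amount TEXT)"
        )
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return len(rows)

if __name__ == "__main__":
    print(load_orders("orders.xml", "orders.db"))
```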
- Provided thought leadership for the architecture and design of Big Data analytics solutions for customers, actively driving Proof of Concept (POC) and Proof of Technology (POT) evaluations to implement a Big Data solution.

Environment: ADF, Databricks, ADL (Azure Data Lake), Spark, Hive, HBase, Sqoop, Flume, Blob, Cosmos DB, MapReduce, HDFS, Cloudera, SQL, Apache Kafka, Azure, Python, Power BI, Unix, SQL Server, AWS EMR, Redshift, S3, Athena, Glue.

CVS Pharmacy - Woonsocket, Rhode Island    Dec 2017 to Feb 2020
Data Engineer

Responsibilities:
- Anchored artifacts for multiple milestones (application design, code development, testing, and deployment) in the software lifecycle.
- Developed an Apache Storm program to consume alarms streamed in real time from Kafka, enrich them, and pass them to the EEIM application.
- Created a rules engine in Apache Storm to categorize alarms into Detection, Interrogation, and Association types before alarm processing.
- Responsible for developing the EEIM application as an Apache Maven project and committing the code to Git.
- Analyzed alarms and enhanced the EEIM application using Apache Storm to predict the root cause of an alarm and the exact device where the network failure occurred.
- Worked extensively on migrating existing on-prem data pipelines to the AWS cloud for better scalability and infrastructure maintenance.
- Worked extensively on migrating and rewriting existing Oozie jobs to AWS Simple Workflow.
- Accumulated the EEIM alarm data in a NoSQL database (MongoDB) and retrieved it from MongoDB when necessary.
- Built Fiber to the Neighborhood/Node (FTTN) and Fiber to the Premises (FTTP) topologies using Apache Spark and Apache Hive.
- Processed system logs using Logstash, stored them in Elasticsearch, and created dashboards using Kibana.
- Regularly tuned the performance of Hive queries to improve data processing and retrieval.
- Provided technical support for debugging, code fixes, platform issues, missing data points, unreliable data source connections, and big data transit issues.
- Developed Java and Python applications to call external REST APIs to retrieve weather, traffic, and geocode information (an illustrative sketch appears below).
- Working experience with the Azure Databricks cloud to organize data into notebooks, making it easy to visualize data using dashboards.
- Worked on managing Spark Databricks through proper troubleshooting, estimation, and monitoring of the clusters.
- Performed data aggregation and validation on Azure HDInsight using Spark scripts written in Python.
- Performed monitoring and management of the Hadoop cluster using Azure HDInsight.
- Worked with Jira, Bitbucket, source control systems such as Git and SVN, and development tools such as Jenkins and Artifactory.

Environment: PySpark, MapReduce, HDFS, Sqoop, Flume, Kafka, Hive, Pig, HBase, SQL, Shell Scripting, Eclipse, SQL Developer, Git, SVN, JIRA, Unix.

Hexaware - India    May 2014 to June 2017
Data Warehouse Developer

Responsibilities:
- Creation, manipulation, and support of SQL Server databases.
- Involved in data modeling and the physical and logical design of the database.
- Helped integrate the front end with the SQL Server backend.
- Created stored procedures, triggers, indexes, user-defined functions, constraints, and other database objects to obtain the required results.
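The CVS Pharmacy section above mentions Python applications that call external REST APIs for weather, traffic, and geocode data. A minimal sketch of such a client, with a placeholder endpoint and made-up parameters (the real APIs and keys are not named in the resume):

```python
# Illustrative REST client; the endpoint, parameters, and key are placeholders.
import requests

def fetch_weather(lat: float, lon: float, api_key: str) -> dict:
    """Call a hypothetical weather REST API and return its JSON payload."""
    resp = requests.get(
        "https://api.example.com/v1/weather",            # placeholder endpoint
        params={"lat": lat, "lon": lon, "key": api_key},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    print(fetch_weather(42.36, -71.06, "demo-key"))      # fails without a real endpoint
```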
- Import and export of data from one server to other servers using tools such as Data Transformation Services (DTS).
- Wrote T-SQL statements for data retrieval and was involved in performance tuning of T-SQL queries.
- Transferred data from various data sources and business systems, including MS Excel, MS Access, and flat files, to SQL Server using SSIS/DTS with features such as data conversion; also created derived columns from existing columns per the given requirements.
- Supported the team in resolving SQL Server Reporting Services and T-SQL related issues; proficient in creating different types of reports such as cross-tab, conditional, drill-down, top N, summary, form, OLAP, and sub-reports, and in formatting them.
- Provided application support by phone; developed and tested Windows command files and SQL Server queries for production database monitoring in 24/7 support.
- Created logging for the ETL load at the package and task level to record the number of records processed by each package and each task using SSIS (an illustrative sketch appears below).
- Developed, monitored, and deployed SSIS packages.

Environment: IBM WebSphere DataStage EE/7.0/6.0 (Manager, Designer, Director, Administrator), Ascential ProfileStage 6.0, Ascential QualityStage 6.0, Erwin, TOAD, Autosys, Oracle 9i, PL/SQL, SQL, UNIX Shell Scripts, Sun Solaris, Windows 2000.

EDUCATION DETAILS:
Bachelor's Degree in Computer Science Engineering from GITAM, Hyderabad (2010-2014)
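The Hexaware section above mentions package- and task-level logging of the records processed during ETL loads. The original work used SSIS; purely to sketch the idea in Python, with hypothetical task names and sample records:

```python
# Illustrative per-task row-count logging for an ETL load; the resume describes
# this pattern in SSIS, so the task names and records here are hypothetical.
import logging
from typing import Callable, Iterable, List

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("etl")

def run_task(name: str, rows: Iterable[dict], transform: Callable[[dict], dict]) -> List[dict]:
    """Apply one transform step and log how many records it processed."""
    out = [transform(r) for r in rows]
    log.info("task=%s records_processed=%d", name, len(out))
    return out

# Hypothetical two-task load: cleanse amounts, then add a derived column.
raw = [{"id": 1, "amt": " 10 "}, {"id": 2, "amt": "7"}]
cleansed = run_task("cleanse", raw, lambda r: {**r, "amt": int(r["amt"].strip())})
enriched = run_task("enrich", cleansed, lambda r: {**r, "amt_with_tax": round(r["amt"] * 1.08, 2)})
log.info("package=daily_load total_records=%d", len(enriched))
```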
