Senior Data Engineer Resume Cumming, GA
Candidate's Name
Contact No: PHONE NUMBER AVAILABLE
Email: EMAIL AVAILABLE
LinkedIn: Candidate's Name  | LinkedIn


Professional Summary
      Microsoft Certified Data Engineer with expertise in designing data-intensive applications using
      Hadoop Ecosystem and Big Data Analytics, Cloud Data engineering (AWS, Azure), Data
      Visualization, Data Warehouse, Reporting, Business intelligence and ETL.
      10+ years of experience with Hadoop ecosystem components such as MapReduce, Pig, and Hive
      for analysis, and Sqoop and Flume for data import/export.
      Experience in Data Migration from on-premises to Azure and AWS cloud.
      Set up Databricks workspaces on AWS and Microsoft Azure for business analytics.
      Experience implementing the Databricks Delta Lake architecture (Bronze, Silver, and Gold
      layers) and Delta Live Tables (DLT); a brief sketch follows this summary.
      Developed ETL transformations and validations using Spark SQL/Spark DataFrames with Azure
      Databricks and Azure Data Factory for distributed data processing and transformation tasks.
      Hands-on experience on Azure components such as ADF, ADLS, ADB, Synapse, Azure SQL DB,
      Logic apps, Azure functions, Key vault, Integration Runtime etc.
      Experience in creating pipeline jobs, scheduling triggers, and Mapping data flows using Azure
      Data Factory (V2) and Key Vault to store credentials.
      Worked on PySpark using Spark SQL, the DataFrame API, and Spark Streaming to improve the
      efficiency of existing Hadoop approaches and build end-to-end data pipelines.
      Hands-on experience on AWS components such as EMR, EC2, S3, RDS, IAM, Auto Scaling,
      Cloud Watch, SNS, Athena, Glue, Kinesis, Lambda, Redshift, CloudFront, DynamoDB to ensure
      a secure zone for an organization in AWS public cloud.
      Worked on Snowflake modelling using data warehousing techniques: data cleansing, Slowly
      Changing Dimensions, surrogate key assignment, and change data capture.
      Experienced in Snowpipe, Data Sharing, Database, Schema and Table structures in Snowflake.
      Designed and developed logical and physical data models that utilize concepts such as Star
      Schema, Snowflake Schema and Slowly Changing Dimensions.
      Exposure to multiple Hadoop distributions like Cloudera and Hortonworks platforms.
      Good experience working with real-time streaming pipelines using Kafka and Spark-Streaming.
      Data Quality, Data Governance, and Data migration implementation in various Big Data projects.
      Hands-on experience with application servers such as WebLogic 8.1, Tomcat 6, JBoss 7.0, and WAS 7.0.
      Experienced with IDE tools such as Eclipse, PyCharm, IBM RSA, and Databricks Notebook.
      Implemented MVC framework in projects using Spring, Struts and Hibernate.
      Good knowledge in relational database systems (Oracle, DB2, MS-SQL, and MySQL).
      Experienced in web design using HTML, CSS, JavaScript and jQuery.
      Experienced in coordinating with the business on requirement gathering, technical walk-throughs,
      and functional and technical document preparation.
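A minimal sketch of the Bronze-to-Silver Delta Lake step referenced in the summary above
(Databricks/PySpark). The paths, the "orders" dataset, and the column names are illustrative
assumptions rather than details from any specific project below.

    # Hypothetical medallion (Bronze -> Silver) step; paths and columns are assumptions.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("medallion-sketch").getOrCreate()

    # Bronze: land the raw JSON as-is, adding only an ingestion timestamp.
    raw = (spark.read.json("/mnt/landing/orders/")
           .withColumn("_ingested_at", F.current_timestamp()))
    raw.write.format("delta").mode("append").save("/mnt/datalake/bronze/orders")

    # Silver: cleanse and deduplicate the Bronze data for downstream consumers.
    bronze = spark.read.format("delta").load("/mnt/datalake/bronze/orders")
    silver = (bronze
              .dropDuplicates(["order_id"])
              .filter(F.col("order_id").isNotNull())
              .withColumn("order_date", F.to_date("order_date")))
    silver.write.format("delta").mode("overwrite").save("/mnt/datalake/silver/orders")
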
Technical Skills
    Azure Cloud             Azure Data Factory, Azure Databricks, Azure Synapse, ADLS,
                            Azure Functions, Azure SQL Database, Azure SQL
                            Data Warehouse, Logic apps

    AWS Cloud               EC2, S3, ELB, EBS, VPC, Auto Scaling, CloudFront,
                            CloudWatch, Kinesis, Redshift

    Hadoop Ecosystem        Apache Hadoop, Spark, HDFS, MapReduce, PIG, Hive,
                            Sqoop, Flume, HBase, PySpark

    Cloud Data Warehouse    Snowflake

    Apache Spark            Spark 3.0.1(Spark Core, Spark SQL, Spark Streaming)

    Programming             Java, Python, Scala, Shell Scripting
    Language

    Hadoop Distributions    Cloudera Distribution 6.12 and HortonWorks (HDP 2.5)

    NoSQL Database          CosmosDB, MongoDB, Cassandra

    Visualization Tool      Tableau, PowerBI

    Deployment Tools        Azure DevOps, GIT, Jenkins, Docker, Kubernetes

    J2EE Components         JSP, Servlets, JavaBeans, JDBC 2.0

    RDBMS                   Oracle 11g/10g/9i, MS SQL, DB2, MySQL

    Web Design              HTML, CSS, jQuery, JavaScript

    IDE Tools               NetBeans 6.5, MyEclipse 6.0, Eclipse 3.7, Eclipse 4.2 Juno

    Web Server              Apache Tomcat 6.0/7.0/8.0, WebLogic 8.1, JBoss 7.0, WAS 7.0

    Version Control Tools   CVS, SVN, GIT

    Database Tool           Toad, SQL Developer

    JavaScript Framework    Angular JS 1.0, jQuery

    Testing Framework       Junit

    Message Broker Tool     Apache Kafka

    Web Service             SOAP, Restful Web service (Jersey), Micro Services

    Rest Client             SOAP UI, Postman

    Other Tools             PuTTY, VPN, WinSCP, JIRA, Toad, SQL Developer, VS Code
Projects Experience

E&Y, USA - Senior Data Engineer
July-2023 - July-2024
      Set up an ingestion process to fetch data from the PM1 server via API and push it into the data lake.
      Transformed the data using Azure Databricks PySpark code and inserted it into a MS SQL Server database.
      Worked on building Azure data pipelines and Databricks code, and coordinated with the business
      for requirement gathering and feedback.
     Created Databricks notebooks to streamline and curate the data for various business use cases,
      and mounted Blob Storage on Databricks.
     Created and optimized Snowflake schemas, tables, and views to support efficient data storage
       and retrieval for analytics and reporting purposes.
     Used Azure DevOps to promote code across the Dev, QA, and Prod environments.
     Wrote PySpark code to read JSON files and write the content to an ADLS location.
     Set up Azure data pipelines that run sequentially to push data to different systems.
     Read CSV files from Azure Blob Storage and used PySpark activities to load the data into the
      data warehouse.
     Pushed the latest data into the respective application table for further PowerBI report generation.
     Designed an ETL pipeline that reads data from an Oracle RDBMS and stores it in ADLS.
     Managed code repositories using Git within Azure DevOps for version control.
     Implemented CI/CD pipelines using Azure DevOps for data engineering solutions.
     Developed Databricks notebooks using Pyspark and Spark-SQL for data extraction,
      transformation and aggregation from multiple systems and stored on Azure Data Lake Storage.
     Engineered Spark jobs with incremental capabilities to extract data from source databases into
      Gen2 staging, optimizing extraction for subsequent loading into Snowflake (see the sketch at the
      end of this project).
Technology used: Microsoft SQL Server Studio, Azure Data Factory, Azure Databricks, SQL,
Snowflake, PowerBI
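
An illustrative sketch of the incremental extraction described above. The JDBC source, watermark
column, table names, credentials, and ADLS path are assumptions for illustration only.

    # Hypothetical incremental extract from a source database into ADLS Gen2 staging.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("incremental-extract-sketch").getOrCreate()

    # Last successfully loaded watermark, e.g. supplied by the pipeline or a control table.
    last_watermark = "2024-06-30 00:00:00"

    # Pull only the rows changed since the previous run from the source database.
    incremental = (spark.read.format("jdbc")
                   .option("url", "jdbc:sqlserver://<server>;databaseName=<db>")
                   .option("dbtable",
                           f"(SELECT * FROM dbo.orders WHERE modified_at > '{last_watermark}') src")
                   .option("user", "<user>")
                   .option("password", "<password>")
                   .load())

    # Stage as Parquet in ADLS Gen2 for the downstream load into Snowflake.
    (incremental.withColumn("_extracted_at", F.current_timestamp())
     .write.mode("append")
     .parquet("abfss://staging@<account>.dfs.core.windows.net/orders/"))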


Eversource Energy, USA - Senior Data Engineer
Feb-2022 - June-2023
        Set up an ingestion process to migrate data from MS SQL Server into ADLS Gen2.
        Transformed the data using Azure Databricks and pushed it to target systems for further reporting.
       Implemented ETL transformations utilizing Dataflow within Azure Data Factory (ADF), aligning
       with business requirements to streamline data processing workflows effectively.
       Configured email notifications for ADF pipelines using Logic Apps, enabling proactive monitoring
       and alerting of pipeline execution status and ensuring timely response to any issues or failures.
       Developed ETL logic after analyzing Technical Specifications and layout documents to perform
       data mapping from DataLake to Outbound consumption Layer.
       Followed Agile methodology in Azure DevOps to track the stories in Sprint fashion and update
       the tasks as per progress and comments as per peer review.
       Worked closely with Business Teams to resolve any project related technical concerns.
       Involved in coordination with the business for requirement gathering and feedback.
       Developed Spark SQL/Scala/Python scripts on Azure Data Bricks using Microsoft Azure Cloud
       portal and Azure SQL DB to process the data as per the requirements of the Data Science Team.
        Maintained version control of code using Azure DevOps and Git repositories.
       Used various Spark Transformations and Actions for cleansing the input data.
       Developed and implemented Lakehouse architecture using bronze, silver, and gold layers for
       optimized data migration and processing.
     Configured Linked Services, Datasets, Cloud and self-hosted Integration Runtime, and Schedule
       Triggers in Azure Data Factory.
      Ingested data from different source systems into the data warehouse.
      Implemented data profiling, data cleansing, data transformation, and data modelling.
      Calculated an Asset Health Index from various parameters of different asset classes such as
        poles, network transformers, network protectors, pad-mount switchgear, and regulators.
      Categorized assets (Reject, Non-Reject, Replace, etc.) based on their damage values (see the
        sketch at the end of this project).
      Predicted each asset's health based on its condition rating.
Technology used: Microsoft SQL Server Studio, Azure Data Factory, Azure Databricks, SQL, PySpark,
PowerBI
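
A hypothetical illustration of the asset-categorization step described above. The thresholds, column
names, and paths are assumptions, not actual Eversource values.

    # Hypothetical categorization of assets into Reject / Replace / Non-Reject buckets.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("asset-health-sketch").getOrCreate()

    assets = spark.read.format("delta").load("/mnt/datalake/silver/asset_conditions")

    # Derive a category from an assumed damage score, mirroring the buckets mentioned above.
    scored = assets.withColumn(
        "category",
        F.when(F.col("damage_score") >= 80, "Reject")
         .when(F.col("damage_score") >= 50, "Replace")
         .otherwise("Non-Reject"))

    scored.write.format("delta").mode("overwrite").save("/mnt/datalake/gold/asset_health")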


HSBC Bank - Data Architect / Engineer
Jan-2020 - Jan-2022
        Designed an ETL framework to ingest data and files from different source systems into the data lake.
        Transformed the data by applying business rules and pushed it to target systems.
        Created reports on the refined data using Tableau.
       Engineered a serverless data integration solution using AWS Lambda and Glue, which
       automated data flows from sources, improving data availability.
        Set up AWS S3 buckets and EC2 instances for real-time data acquisition applications.
       Directed the design of a high-performance data analytics platform on AWS, leveraging EMR, S3,
       Spark, and Airflow to manage data.
      Worked on huge datasets stored in AWS S3 buckets, using Spark DataFrames in Glue for
        preprocessing.
        Designed and developed ETL processes in AWS Glue to migrate data from external sources such
        as ORC/Parquet/text files in S3 into AWS Redshift.
     Loaded Transformed data to AWS Redshift using Spark Batch Processing.
        Integrated with AWS services such as Amazon CloudWatch for monitoring and alerting.
      Created, debugged, scheduled, and monitored jobs using Airflow and Oozie (see the sketch at the
        end of this project).
       Orchestration of the ingestion pipeline using CloudWatch and Lambda for time-based triggers.
     Implemented best practices for ETL processes with Apache Spark to transform raw data into
       user-friendly dimensional data for self-service reporting.
     Designed and implemented robust data pipelines on the Databricks platform, leveraging Spark
       SQL and Spark Streaming for real-time data processing.
     Created and managed tables and views to facilitate data migration, querying, and reporting.
      Worked on the Data Ingestion Management System, an in-house system designed to ingest and
        transform data.
Technology used: AWS, S3, EMR, RDS, CloudWatch, Redshift, Lambda, Tableau, Juniper,
Control-M, JIRA, Confluence, PuTTY, WinSCP
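
An illustrative Airflow DAG for the kind of job orchestration described above. The DAG id, schedule,
region, and Glue job name are assumptions; it simply triggers a hypothetical Glue job that applies
the Spark transformations and loads the result into Redshift.

    # Hypothetical Airflow DAG that starts a Glue job on a daily schedule.
    from datetime import datetime

    import boto3
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def start_glue_job():
        # Trigger the assumed Glue job that reads raw files from S3 and loads Redshift.
        glue = boto3.client("glue", region_name="us-east-1")
        glue.start_job_run(JobName="raw_to_redshift_glue_job")

    with DAG(
        dag_id="s3_to_redshift_sketch",
        start_date=datetime(2021, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        PythonOperator(task_id="start_glue_job", python_callable=start_glue_job)
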
Citi Bank, USA - Data Engineer
July-2018 - Dec-2019
       Created Azure Data Factory pipeline, resource group, and activities for data migration.
        The pipeline picks up data from Event Hub, transforms it, and sends it to the target system.
        Prepared datasets and answer sets based on business rules and visualized them.
       Designed and implemented data processing workflows using Azure Databricks, leveraging Spark
       for large-scale data transformations.
     Transformed data using Azure Synapse, designed schemas, facts, and dimensions.
       Developed data extraction pipelines to extract data from various sources, such as on-premises
       databases, APIs, or cloud-based applications.
       Utilized best practices for data security and encryption during transit.
     Implemented mechanisms for identifying data changes or updates in the source systems using
       techniques like change tracking, timestamps, or incremental markers.
      Designed and implemented the delta load process to transfer only changed or new data from
        source to destination, reducing processing time and network bandwidth consumption (see the
        sketch at the end of this project).
     Developed incrementally processed ETL pipelines to handle real-time data updates in migration.
      Optimized the full delta load process for better performance, lower latency, and reduced
        resource consumption, maximizing system efficiency.
     Leveraged Change Data Capture techniques to propagate changes effectively during migration.
      Prepared Tableau reports for the business according to the requirements.
Technology used: Azure Data Factory, Azure Synapse, ADLS, PySpark, Tableau
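
A hypothetical sketch of the delta-load step above: applying only changed rows to a target Delta
table with a MERGE. The staging path, table location, and key column are assumptions.

    # Hypothetical upsert of change records into a target Delta table.
    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("delta-load-sketch").getOrCreate()

    # Changed or new rows identified upstream (e.g. via change tracking or timestamps).
    changes = spark.read.parquet("abfss://staging@<account>.dfs.core.windows.net/customers_changes/")

    target = DeltaTable.forPath(spark, "/mnt/datalake/silver/customers")

    # Update matching keys and insert new ones, so only the deltas are processed.
    (target.alias("t")
     .merge(changes.alias("s"), "t.customer_id = s.customer_id")
     .whenMatchedUpdateAll()
     .whenNotMatchedInsertAll()
     .execute())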

CitiusTech Healthcare - Solution Architect
Feb-2018 - June-2018
      Understanding the data of different Healthcare source systems and designing the data ingestion,
      transformation, and reconciliation process accordingly.
     Prepared functional and technical design documents of different H-Scale components i.e. Data
      Quality, Data Transformation, and Reconciliation.
     Implemented end-to-end data pipelines using Azure Data Factory to extract, transform, and load
      (ETL) data from diverse sources into MS SQL DB.
     Improved database performance through query tuning, indexing strategies, and partitioning
      techniques.
     Collaborated across teams to integrate big data technologies such as Hadoop, Hive, and Kafka
      into existing infrastructures.
     Developed and maintained data pipelines using Sqoop, Flume, and Kafka to ingest, transform,
      and process data for analysis.
     Performed data aggregation and analysis on large-scale datasets using Apache Spark, Scala,
      and Hive, resulting in improved insights for the business (see the sketch at the end of this project).
     Developed data processing workflows leveraging Spark for distributed processing and
      transformations.
     Developed Spark jobs to transform data and apply business transformation rules to load/process
      data across the enterprise and application-specific layers.
     Developed and optimized Spark jobs for data transformations and aggregations.
Technology used: PySpark, Hive, Sqoop, Oozie
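
An illustrative aggregation of the kind described above. The Hive database, table, and columns are
assumptions meant only to show the shape of such a Spark/Hive job.

    # Hypothetical monthly claim aggregation over an assumed Hive table.
    from pyspark.sql import SparkSession, functions as F

    spark = (SparkSession.builder
             .appName("claims-aggregation-sketch")
             .enableHiveSupport()
             .getOrCreate())

    claims = spark.table("healthcare.claims")

    # Aggregate claim amounts per provider and month for downstream reporting.
    summary = (claims
               .withColumn("claim_month", F.date_format("claim_date", "yyyy-MM"))
               .groupBy("provider_id", "claim_month")
               .agg(F.sum("claim_amount").alias("total_amount"),
                    F.count("*").alias("claim_count")))

    summary.write.mode("overwrite").saveAsTable("healthcare.claims_monthly_summary")
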
State Government, India - Hadoop Developer
Nov-2015 - Jan-2018
       Involved in preparing end-to-end solution design of Big Data projects.
       Involved in requirement gathering, data analysis, design, planning, and preparing mapping
       documents.
     Created a web application ("Digital Library") in Java/J2EE and Solr on top of Hadoop to provide
       easy access and content-based search for documents, images, audio, and video files of various
       departments.
     Created a web application ("Face Recognition") in Java/J2EE using Python, HBase, and Phoenix,
       which matches an input face image against millions of images residing in HBase.
     Created a web application ("Citizen-360") to track citizens' activity in each government
       department.
     Solr Implementation in Digital Library application for content-based search in documents.
     Utilized big data ecosystems such as Hadoop, Spark, and Cloudera to load and transform large
       sets of structured, semi-structured, and unstructured data.
     Utilized Hive queries and Spark SQL to analyze and process data, meeting specific business
       requirements and simulating MapReduce functionalities.
     Prepared an ETL framework using Sqoop, Pig, and Hive to bring in data from various sources
       and make it available for consumption.
     Developed MapReduce programs for unstructured data (video, images, and blog data) and analyzed
       structured data using Pig and Hive.
     Performed log analysis of the E-Mitra and Bhamashah web applications using Kafka to identify
       problem areas of the applications.
     Created various use cases based on e-Governance projects data.
     Developed data ingestion and post-ingestion processes for various sources into the data lake for
       CTD, VATMAN, EXCISE, and HCD.
      Loaded data from Oracle and SQL Server into Hive tables using Sqoop.
     Engaged in data cleaning and analysis using Hive.
      Performed fraud detection analysis across multiple database transactions using Spark SQL (see
       the sketch at the end of this project).
      Performed sentiment analysis on grievance feedback to identify positive and negative sentiment
        for each problem statement.
Technology used: HDFS, Hive, Sqoop, Tika, Solr, Spark, HDP2.4, Tableau, Core Java, J2EE, Linux,
Teradata Aster, Teradata App Center
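
A hypothetical Spark SQL sketch in the spirit of the fraud-detection analysis above. The transactions
table, thresholds, and columns are assumptions.

    # Hypothetical flagging of accounts with many large transactions in a single day.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("fraud-detection-sketch")
             .enableHiveSupport()
             .getOrCreate())

    suspicious = spark.sql("""
        SELECT account_id,
               to_date(txn_time) AS txn_day,
               COUNT(*)          AS txn_count,
               SUM(amount)       AS total_amount
        FROM   transactions
        WHERE  amount > 100000
        GROUP  BY account_id, to_date(txn_time)
        HAVING COUNT(*) > 5
    """)

    suspicious.write.mode("overwrite").saveAsTable("analytics.suspicious_transactions")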

Nationwide Building Society, UK - Hadoop Developer
Jan-2012 - Oct-2015
        Created Hive internal and external tables as required, defined with appropriate partitions for
        query efficiency (see the sketch at the end of this project).
       Developing, installing, and configuring Hadoop ecosystem components that moved data from
       individual servers to Hadoop Cluster.
        Wrote ETL scripts using Sqoop to transfer required data from Hadoop to the database.
       Utilized Hive queries and Spark SQL to analyze and process data, meeting specific business
       requirements and simulating MapReduce functionalities.
       Migrated data from Oracle to Hadoop using Sqoop for processing, enhancing data
       management and processing capabilities.
        Installed and configured a multi-node Hadoop cluster for data storage and processing.
       Imported and exported data into HDFS, HBase and Hive using Sqoop.
       Engaged in requirement gathering, analysis, designed architecture for end-to-end data flow.
       Engaged in Hadoop cluster setup using Cloudera distribution.
       Created workflows using Oozie for data ingestion from various systems into Hadoop.
       Created various reports for the client using Tableau, which they used to offer loans and cards to
       their customers.
     Developed complex calculated fields for the business logic, field actions, sets and parameters to
      include various filtering capabilities for the dashboards, and to provide drill down features for the
      detailed reports.
Technology used: HDFS, MapReduce, Hive, Sqoop, PIG, Oozie, Tableau, CDH 5.4, Core Java, UNIX
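
An illustrative sketch of a partitioned external Hive table of the kind described above, expressed
through Spark SQL. The database, table, columns, and location are assumptions.

    # Hypothetical partitioned external table so queries prune to the partitions they need.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("hive-partition-sketch")
             .enableHiveSupport()
             .getOrCreate())

    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS bank.transactions (
            account_id STRING,
            amount     DECIMAL(18,2),
            txn_type   STRING
        )
        PARTITIONED BY (load_date STRING)
        STORED AS PARQUET
        LOCATION '/data/bank/transactions'
    """)

    # Register newly landed partitions before querying.
    spark.sql("MSCK REPAIR TABLE bank.transactions")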

Matson, USA - Java Developer
Aug-2010 - Dec-2012
      Developed several modules for the project.
      Involved in unit and system integration testing, addressing technical challenges, and providing
      the technical solution.
     Implemented Struts 1.2 MVC architecture with Spring dependency injection and AOP.
     Worked on various types of validation and authentication.
     Worked in Spring and Hibernate Integration at DAO layer.
     Involved in the design and coding in DAO classes, and Restful services design.
     Involved in team handling and client interaction.
     Developed one complete module of the project, "Customer Profile".
Technology used: Core Java, J2EE, Spring, Struts, Hibernate, HTML, CSS, jQuery, Oracle

Videocon Telecommunication - Java Developer
Feb-2007 - Jul-2010
     Analyzed the business requirements to understand the application.
     Designed the complete application flow according to business requirement documents.
     Created the high-level design document; worked on database and web page design.
     Involved in coding, unit testing, system testing, and enhancement of the project.
     Developed server-side code using Hibernate DAOs.
     Responsible for unit testing and bug fixing.
     Engaged in database and web-page design.
     Engaged in regular client interaction for feedback and enhancements.
Technology used: Core Java, J2EE, Struts, Hibernate, HTML, CSS, jQuery, IBM RSA and DB2


Educational Qualification
       Master of Computer Applications, University of Rajasthan, Jaipur, Rajasthan
       Bachelor of Science, University of Rajasthan, Jaipur, Rajasthan

Certifications
       Microsoft Certified Azure Data Engineer Associate
       Sun Certified Java Programmer (SCJP 1.5)
