Sr Data Engineer Resume, Des Moines, IA
Candidate's Name
SENIOR DATA ENGINEER
Mobile no: PHONE NUMBER AVAILABLE
Email ID: EMAIL AVAILABLE

PROFESSIONAL SUMMARY

       Around 5 years of IT experience in analysis, design, development, and Big Data in Scala, PySpark, Hadoop, and HDFS environments, with additional experience in Python.
       Implemented Big Data solutions using the Hadoop technology stack, including PySpark, Hive, and Sqoop.
       Developed complex SQL queries against relational databases such as Oracle and SQL Server to support Data Warehousing and Data Integration solutions.
       Firm understanding of Hadoop architecture and its components, including HDFS, Job Tracker, Task Tracker, Name Node, Data Node, and MapReduce programming.
       Set up a Jenkins master and multiple slaves for the entire team as a CI tool, as part of the continuous development and deployment process.
       Installed and configured Apache Airflow for workflow management; created workflows in Python and built Airflow DAGs to run jobs sequentially and in parallel.
       Experienced in optimizing PySpark jobs to run on a Kubernetes cluster for faster data processing.
       Converted Hive queries into Spark actions and transformations by creating RDDs and DataFrames from the required files in HDFS.
       Experience supporting data analysts in running Hive queries and building ETL pipelines.
       Imported and exported data between HDFS and Hive using Sqoop.
       Experienced in designing, architecting, and implementing scalable cloud-based web applications using AWS and Azure.
       Involved in software development, data warehousing, analytics, and data engineering projects using Hadoop, MapReduce, Hive, and other open-source tools and technologies.
       Read and wrote multiple data formats, such as Parquet, on HDFS using PySpark.
       Experienced in requirement analysis, application development, application migration, and maintenance using the Software Development Lifecycle (SDLC) and Python technologies.
       Defined user stories and drove the agile board in JIRA during project execution; participated in sprint demos and retrospectives.
       Strong working experience with SQL and NoSQL databases, data modeling, and data pipelines.
       Involved in end-to-end development and automation of ETL pipelines using SQL and Python.
                                             TECHNICAL SKILLS


 Big Data Eco System           HDFS, MapReduce, Hive, Sqoop, HBase, Kafka Connect, Spark, Zookeeper, Amazon Web Services, Airflow

 Hadoop Distributions          Apache Hadoop 2.x, Cloudera CDP

 Programming Languages         Python, Java, Shell Scripting

 Databases                     MySQL, MS SQL Server, HBase

 Version Control               Git, Bitbucket

 Cloud Technologies            Amazon Web Services (EC2, S3), Azure Databricks, Snowflake




                                         WORK EXPERIENCE

Senior Data Engineer | Lowe's Companies Inc | Charlotte, NC | June 2023 - Present

       Migrated complex data jobs from Teradata to Hive and developed ETL pipelines to push the loaded data into Apache Druid.
       Developed Airflow connectivity for all ETL pipelines and migrated all Oozie jobs to Airflow.
       Worked continuously with cross-functional teams (data analysts and software engineers) to create PySpark jobs using Spark SQL and helped them build reports on top of the data pipelines.
       Led a team of junior data engineers in designing and developing data transformations and data management.

       Environment: Python, HDFS, Spark, ETL, Hive, Yarn, Jenkins, MySQL, RDBMS, Airflow, Collibra, Apache Druid, Oozie


Data Engineer | Conch Technologies Inc | Memphis, Tennessee | October 2022 - May 2023

       Involved in developing Spark applications using PySpark per business requirements.
       Designed robust, reusable, and scalable data-driven solutions and data pipeline frameworks in Python to automate the ingestion, processing, and delivery of structured and unstructured data, in both batch and real-time streaming modes.
       Hands-on experience developing DataFrames and optimized SQL queries in Spark SQL.
       Built data warehouse structures and created fact, dimension, and aggregate tables through dimensional modeling with Star and Snowflake schemas.
       Developed Spark applications in PySpark on a distributed environment to load large numbers of CSV files with different schemas into Hive ORC tables.
       Migrated data from Hive to MySQL, to be displayed in the UI, using a PySpark job that runs in different environments.
       Applied transformations to data loaded into Spark DataFrames and performed in-memory computation to generate the output.
       Forecasted future trends in ATM cash transactions and cheque counts by performing time-series analysis on historical data using a Seasonal ARIMA model.
       Experience working with Spark SQL and creating RDDs using PySpark, with extensive experience in ETL of large datasets using PySpark on HDFS.

      Environment: Python, HDFS, Spark, ETL, Hive, Yarn, HBase, Jenkins, MySQL, RDBMS, Airflow, Collibra, Seasonal ARIMA, Time Series Analysis.



Data Engineer | Development Bank of Singapore (DBS Bank) | Hyderabad, Telangana | July 2018 - February 2022

      Responsible for the design and development of Spark SQL scripts based on functional specifications. Created HBase tables to store the various formats of data coming from Spark.
      Hands-on experience working with Continuous Integration and Deployment (CI/CD) using Jenkins.
      Developed ETL pipelines into and out of the data warehouse using a combination of Python and Spark SQL.
      Imported and exported data from Oracle into HDFS and Hive using Sqoop.
      Scheduled Spark jobs using the Airflow scheduler and monitored their performance.
      Used Teradata to develop and run history migration scripts over millions of records.
      Worked on the Collibra platform to create metadata for thousands of tables.
      Worked with data stewards to create Data Quality rules and Data Quality checks in Collibra.
      Responsible for the design, implementation, and architecture of very large-scale data intelligence solutions around big data platforms.
      Implemented Big Data solutions using Hadoop, Hive, and Informatica to pull and load data into HDFS.
      Developed data processing and data manipulation tasks using PySpark and loaded data into target destinations.
      Wrote high-quality documentation describing ETL routines, data mapping, and other artifacts needed to design data migration routines.


    Environment: Python, HDFS, Spark, ETL, Sqoop, Collibra, Airflow, HBase, CI/CD




Big Data Engineer | Que Technologies | Hyderabad, India | April 2017 - February 2018
        Responsible for the design, implementation, and architecture of very large-scale data intelligence solutions around big data platforms.
        Worked on SQL queries in dimensional and relational data warehouses.
        Performed data analysis and data profiling using complex SQL queries on various systems.
        Troubleshot and resolved data processing issues and proactively engaged in data modeling discussions.
        Worked on RDD architecture, implementing Spark operations on RDDs and optimizing transformations and actions in Spark.
        Created data pipelines and data flows using Azure Data Factory and triggered the pipelines.
        Wrote Spark programs using Python, PySpark, and Pandas for performance tuning, optimization, and data quality validation.
        Developed Kafka producers and consumers for streaming millions of events per second.
        Worked on Tableau to build customized interactive reports, worksheets, and dashboards.
        Developed Spark programs using the Scala and Java APIs and performed transformations and actions on RDDs.
        Designed the ETL process from various sources into Hadoop/HDFS for analysis and further processing of data modules.
        Worked on object detection using Python OpenCV to capture images of moving objects.

   Environment: HDFS, Python, SQL, MapReduce, Spark, Kafka, Hive, Yarn, Zookeeper, Shell Scripting,
RDBMS, ETL, PySpark, Hadoop.

Software Engineer | University of Hyderabad | Hyderabad, India | April 2016 - December 2016

        Used Git to maintain the repository: creating and merging branches, committing changes, and checking out, moving, and removing files.
        Created data models, stored procedures, queries for data analysis and manipulation, views, and functions. Maintained and upgraded databases and created backups in SQL.
        Parsed the data received after testing to verify there were no inconsistencies, and saved the data to the database.
        Involved in importing and exporting data from local and external file systems and RDBMS to HDFS.
        Worked on mobile crowd sensing by running simulations with the CrowdSenSim simulator.
        Created graphs and charts to display the simulated data using Tableau.


    Environment: Git, MySQL, RDBMS, Shell Script, JIRA, Tableau.



EDUCATION

Bachelor of Engineering in Computer Science, 2018                                         GPA: 8.89/10
Jawaharlal Nehru Technological University, Kakinada
