Candidate's Name
EMAIL AVAILABLE
Phone #: PHONE NUMBER AVAILABLE

About Me
Experienced Data Engineer with 3 years of expertise in designing and developing ETL pipelines on on-premises and cloud technologies. Extensive experience in data warehousing, Big Data / Data Lake (Hadoop stack), and cloud technologies such as Google Cloud, Google BigQuery, Snowflake, AWS (Lambda, Glue), AI/ML, Teradata, SQL Server, Informatica, Unix, Scala, Python, and Spark/PySpark. Looking forward to working in a dynamic environment building data warehouses and a centralized enterprise data hub on Hadoop and cloud platforms that supports organizational growth as well as my career path.

Technical Skills
Experience in handling ETL jobs using Hadoop, Pig, Hive, Sqoop, Spark, Scala, Python, HBase and Teradata.
Experience in performing large-scale data processing in Hadoop using PySpark, Pig, Hive and Sqoop.
Experience in writing Hive (HQL) queries, SQL queries, and PL/SQL stored procedures, functions, triggers and packages on Oracle 10g/9i, DB2 and Hadoop.
Experience in analysis and design, performance tuning, query optimization, and stored procedures, functions, packages, triggers, views and indexes that implement database business logic in Teradata and SQL Server, and in loading data warehouse tables (dimension, fact and aggregate tables) using SSIS and Teradata utilities.
Experience in handling different file types (CSV, XML, JSON) and file formats such as ORC and other compressed formats.
Experience in writing custom Pig UDFs.
Experience in NoSQL databases MongoDB and HBase.
Strong working experience in cloud data migration using AWS/GCP and Snowflake.
Good experience in Google Cloud, BigQuery, Bigtable, AWS Redshift and S3.
Good experience in writing Unix shell, Scala and Python scripts.
Experience with Python OpenStack APIs.
Familiar with object-oriented programming concepts.
Able to assess business rules, collaborate with participants, and perform source-to-target mappings, design and review.
Hands-on experience establishing connections to RDBMS databases using Python.
Hands-on experience analyzing large datasets with in-memory data structures using pandas and Spark (a brief PySpark sketch follows this skills list).
Worked in Agile methodologies using Git for version control.
Hands-on experience with IDEs such as notebooks and PyCharm.
Expertise in getting web data through APIs and web scraping techniques.
Experience with shell scripts to administer and automate batch job scheduling, including backup and recovery processes.
Experience handling the complete SDLC process of development.
Worked on datasets related to telecommunications.
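A minimal, illustrative sketch of the pandas/Spark analysis work listed above. The file path, column names and aggregation are hypothetical placeholders, not details from any project on this resume.

```python
# Minimal PySpark + pandas sketch (illustrative only; the path and the
# order_date/amount columns are hypothetical, not from an actual project).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("csv-aggregation-example")
    .getOrCreate()
)

# Read a delimited source file with a header row and an inferred schema.
orders = spark.read.csv("/data/landing/orders.csv", header=True, inferSchema=True)

# Aggregate in Spark, then pull the small summary into pandas for inspection.
daily_totals = (
    orders.groupBy("order_date")
          .agg(F.count("*").alias("order_count"),
               F.sum("amount").alias("total_amount"))
)

summary = daily_totals.toPandas()  # small result held in memory as a pandas DataFrame
print(summary.head())
```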
PROFESSIONAL EXPERIENCE

Client: AT&T, GA
Period: Aug 2023 to July 2024
Project: POS Compensation
Role: Data Engineer

Point of Sale (POS) retail order processing was implemented to automate sales compensation and reporting for COR / Local Dealer / D2D U-verse High Speed Internet orders. Currently this is a manual process that requires intervention by the RST/D2D back-office teams to pass the AT&T dealer and sales agent codes to downstream applications and reporting.

Responsibilities:
Worked on data migration from existing data sources onto the Data Lake.
Created data ingestion plans for loading data from external sources, using Sqoop and Spark Scala programs to push daily source files onto HDFS and Hive tables.
Understood customer business use cases and translated them into analytical data applications and models to implement a solution.
Created a custom database encryption and decryption UDF that can be plugged in while ingesting data into external Hive tables, maintaining security at the table or column level.
Fine-tuned and stabilized the Hadoop platform so that real-time streaming and batch-style Big Data applications run smoothly with optimal cluster utilization.
Developed Spark programs for different patterns of data on the Hadoop cluster.
Implemented dynamic partitioning, bucketing and compression techniques in Hive external tables and optimized the worst-performing Hive queries.
Developed the ETL process for data acquisition and transformation using Spark.
Proficient in MongoDB and HBase APIs and tools to import a variety of data formats such as CSV, Excel, JSON and XML.
In-depth understanding of Spark architecture, including Spark tuning, Spark Core, Spark Streaming, DataFrames, RDD caching, Spark SQL and fine-tuning Spark jobs.
Good experience in GCP, AWS S3 and Redshift.
Experience in BigQuery loads from the operational data sources Teradata, SQL Server and PostgreSQL.
Played a key role in code development and in automating the ad hoc and delta loads to Teradata tables.
Implemented predictive models and ran Spark extensively using RDDs, DAGs, Spark DataFrames, Spark SQL and Spark Streaming, ingesting data onto the data lake.
Constructed data pipelines by developing Python jobs and deploying them on Google Cloud Platform.
Developed data pipelines using PySpark and scheduled jobs/workflows using Apache Airflow (a minimal Airflow DAG sketch follows this project).
Monitored the scheduled Airflow jobs/workflows in production.
Developed Python jobs in Jupyter notebooks for processing event data onto GCP.
Processed different file types (CSV, XML, JSON) and file formats such as Avro and other compressed formats on the Hadoop platform and GCP BigQuery.
Processed data from different REST APIs using Postman onto Google Cloud Platform and BigQuery.
Worked on different Google Cloud components: App Engine, GCS, Compute Engine, Composer, Scheduler, Dataflow, Cloud Functions, Kubernetes, Docker, etc.
Developed ETL jobs to update data in the target databases from various data sources and REST APIs.
Populated and refreshed Teradata tables using FastLoad, MultiLoad and FastExport utilities/scripts for user acceptance testing and for loading history data into Teradata.
Experience in creating and writing Unix shell scripts (Korn Shell, KSH), coding BTEQ scripts in Teradata, and implementing ETL logic.
Developed ETL workflows using AWS Glue and Snowpark.
Developed data integration using AWS services such as AWS Step Functions.
Developed custom data transformation and processing solutions using AWS Lambda and AWS Step Functions.
Developed a Glue job to mask PII data, generate a new feed file and upload it to a network drive for business users.
Performance-tuned long-running queries and worked on complex queries to map the data as per the requirements.
Production implementation and post-production support.

Environment: SQL, Teradata, Hadoop, Big Data, Hive, Pig, Linux, Python, Shell Scripting, GitHub, Git Bash, Jenkins, Scala, Spark, Docker, BigQuery, AWS, Glue, Lambda, Step Functions, Snowflake, etc.
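The Airflow scheduling described in the responsibilities above could look roughly like the sketch below. This is an illustration only: the DAG id, schedule, script path and BigQuery dataset/bucket names are placeholders, not artifacts of the AT&T project.

```python
# Illustrative Airflow DAG sketch (Airflow 2.x style). All names and paths
# below are placeholders; none are taken from the project described above.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "data-engineering",
    "retries": 1,
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="daily_pos_feed_load",            # hypothetical DAG name
    default_args=default_args,
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 5 * * *",           # run once a day at 05:00
    catchup=False,
) as dag:

    # Submit a PySpark job that transforms the daily feed.
    transform_feed = BashOperator(
        task_id="transform_daily_feed",
        bash_command="spark-submit /opt/jobs/transform_daily_feed.py {{ ds }}",
    )

    # Load the transformed Parquet output from GCS into a BigQuery staging table.
    load_to_bigquery = BashOperator(
        task_id="load_to_bigquery",
        bash_command=(
            "bq load --source_format=PARQUET "
            "staging.daily_pos_feed gs://example-bucket/daily_pos_feed/{{ ds }}/*.parquet"
        ),
    )

    transform_feed >> load_to_bigquery
```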
Client: Cisco, India
Period: May 2019 to Nov 2021
Project: 360 Data Foundation
Role: Data Engineer

Responsibilities:
Gathered requirements, built logical models and provided quality documentation of detailed user requirements for the design and development of this project.
Extracted data from Oracle to HDFS using Sqoop (an equivalent PySpark ingestion sketch appears at the end of this resume). Built rules for different product lines in a Hive UDF for processing covered and uncovered product lines.
Created jobs and constructed data pipelines and workflows using Airflow, Composer, App Engine and Compute Engine on Google Cloud Platform and BigQuery.
Developed Spark Scala and Python jobs for ingesting data onto the Big Data Hadoop platform and Google Cloud Platform.
Coded in BTEQ SQL for Teradata, implemented ETL logic using Informatica, and transferred files using an SSH client.
Populated and refreshed Teradata tables using FastLoad, MultiLoad and FastExport utilities/scripts for user acceptance testing and for loading history data into Teradata.
Experience in creating and writing Unix shell scripts (Korn Shell, KSH).
Prepared test cases and performed unit and integration testing.
Performance-tuned long-running queries and worked on complex queries to map the data as per the requirements.
Production implementation and post-production support.

Environment: SQL, Teradata, Hadoop, Big Data, Hive, Pig, Linux, Python, Shell Scripting, GitHub, Git Bash.

Education:
1. M.S. in Information Technology, Indiana Wesleyan University, Ohio, 08/29/2022 to 04/30/2024
2. Bachelor's in Computer Science Engineering, Jawaharlal Nehru Tech University, 04/01/2015 to 05/30/2019

VISA Status: OPT
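The Cisco project above describes pulling Oracle data onto HDFS with Sqoop. As an illustration in PySpark rather than Sqoop, an equivalent JDBC-based pull might look like the sketch below; the connection URL, credentials and table names are placeholders, not details from the project.

```python
# Illustrative Oracle-to-Hadoop ingestion sketch using Spark's JDBC reader.
# The project itself used Sqoop; this is only an equivalent PySpark example.
# Host, credentials, and source/target table names are all placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("oracle-to-hive-example")
    .enableHiveSupport()
    .getOrCreate()
)

# Read the source table over JDBC (requires the Oracle JDBC driver on the classpath).
source_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:oracle:thin:@//db-host:1521/ORCLPDB")  # placeholder host/service
    .option("dbtable", "SALES.PRODUCT_LINES")                   # placeholder source table
    .option("user", "etl_user")
    .option("password", "********")
    .option("fetchsize", "10000")
    .load()
)

# Land the data as Parquet and register it as a Hive table for downstream rules.
(
    source_df.write
    .mode("overwrite")
    .format("parquet")
    .saveAsTable("staging.product_lines")    # placeholder Hive database/table
)
```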