Candidate's Name
Current Location: Overland Park
Email ID: EMAIL AVAILABLE
Phone: PHONE NUMBER AVAILABLE

SUMMARY
With over 6 months of experience as a Big Data Developer, I am now eager to transition into a customer service role. I am enthusiastic about applying my problem-solving skills and commitment to providing excellent support and assistance.
Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems and vice versa.
Excellent understanding and knowledge of NoSQL databases such as HBase and MongoDB.
Experienced in processing Big Data on the Apache Hadoop framework using MapReduce programs.
Worked on migrating existing Hadoop data into an S3 bucket using internal DistCp-based frameworks.
Implemented an Auto Loader mechanism in Databricks using PySpark to stream data from S3.
Leveraged Databricks for ETL processes to clean, transform, and organize data in preparation for analytics.
Created a data pipeline in Databricks and Airflow for four different dataflows (DG Claims, DG Clinical, DG Customer, and NCP).
Experienced in validating dataset APIs using Postman.

TECHNICAL SKILLS
Technologies: Apache Spark, Spark SQL, PySpark, Python, Shell script, HDFS, Hive, Scala, HBase, Linux, Terraform, MapReduce, Airflow, Delta Lake, Data Lake
Frameworks: DPF, FIF
Tools: Postman, Maven, Jenkins, Git, Splunk
Databases: Hive, SQL, Teradata
Cloud: AWS, Databricks
Other: Jira, Confluence

EXPERIENCE
Concentrix, 12/2023 to 6/2024
Developer
Cigna, Cloud Migration

Project Responsibilities
Copied data from Hadoop to S3 using DistCp.
Copied data from S3 into Databricks Delta tables in the rawz layer using the DPF framework.
Built transformations based on the requirements and conformed the tables in the conform layer.
Performed validation on conform-zone tables based on the requirements.
Created views on top of the conform zone in the pubz zone so that consumers can consume the views from the pubz zone.
Migrated existing Hadoop data into an S3 bucket using the internal DistCp-based framework.
Created workflows and ran jobs using job clusters in Databricks during the development phase.
Created Jenkins pipelines to migrate code across environments.
Implemented the Auto Loader mechanism in Databricks using PySpark to efficiently stream and process data from S3 (sketched in the appendix below).
Created an EMR export script to stream data from S3 to Teradata.
Created the script for JDBC export.
Created automation scripts that run minus queries to compare data between Hadoop and Databricks (sketched in the appendix below).
Created ad hoc jobs that copy data from rawz to cnfz.
Utilized Git for version control of PySpark and Python code.
Troubleshot and resolved PySpark performance bottlenecks and errors.
Created and managed ETL workflows using Apache Airflow DAGs, automating data transformation and loading tasks (sketched in the appendix below).
Developed Spark transformation logic with Delta Lake to append, update, and delete data from incremental files into the parent table (sketched in the appendix below).
Implemented ongoing monitoring and support to address issues after the migration.

EDUCATION
Master of Computer Applications

CERTIFICATIONS
AWS Certified Solutions Architect
Certified Databricks Spark Developer Associate
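
APPENDIX: ILLUSTRATIVE CODE SKETCHES
The sketches below illustrate techniques referenced under Project Responsibilities; they are minimal examples under stated assumptions, not project code. First, a minimal sketch of the Databricks Auto Loader pattern for streaming files from S3 into a rawz-layer Delta table. The bucket, schema location, checkpoint path, file format, and table name (rawz.dg_claims) are hypothetical placeholders.

```python
# Minimal Auto Loader sketch (Databricks-only "cloudFiles" source); all paths
# and the target table name are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

raw_stream = (
    spark.readStream.format("cloudFiles")                  # Auto Loader source
    .option("cloudFiles.format", "parquet")                # incoming file format (assumed)
    .option("cloudFiles.schemaLocation",
            "s3://example-bucket/_schemas/dg_claims/")     # where inferred schema is tracked
    .load("s3://example-bucket/landing/dg_claims/")
)

(
    raw_stream.writeStream
    .option("checkpointLocation",
            "s3://example-bucket/_checkpoints/dg_claims/")
    .trigger(availableNow=True)                            # ingest available files, then stop
    .toTable("rawz.dg_claims")                             # hypothetical rawz-layer Delta table
)
```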
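Next, a minimal sketch of the Delta Lake merge logic for applying incremental files (inserts, updates, deletes) to a parent table. The table names, the join key (customer_id), and the change_flag column are hypothetical assumptions.

```python
# Minimal Delta Lake merge sketch; table names, join key, and change_flag
# values are hypothetical placeholders.
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()

parent = DeltaTable.forName(spark, "cnfz.dg_customer")       # hypothetical parent Delta table
incremental = spark.read.format("delta").load(
    "s3://example-bucket/incremental/dg_customer/"
)

(
    parent.alias("t")
    .merge(incremental.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedDelete(condition="s.change_flag = 'D'")      # remove deleted records
    .whenMatchedUpdateAll(condition="s.change_flag = 'U'")   # apply updates
    .whenNotMatchedInsertAll()                               # append new records
    .execute()
)
```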
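A minimal Airflow DAG sketch along the lines of the ETL workflows described above; the dag_id, schedule, and task callables are placeholders.

```python
# Minimal Airflow 2.x DAG sketch chaining a raw load and a conform step;
# dag_id, schedule, and the task bodies are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def load_rawz(**_):
    """Placeholder: copy the day's files from S3 into the rawz layer."""


def build_cnfz(**_):
    """Placeholder: apply transformations and load the conform layer."""


with DAG(
    dag_id="dg_claims_pipeline",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    load_task = PythonOperator(task_id="load_rawz", python_callable=load_rawz)
    conform_task = PythonOperator(task_id="build_cnfz", python_callable=build_cnfz)

    load_task >> conform_task   # conform step runs only after the raw load succeeds
```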
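Finally, a minimal sketch of the minus-query style validation comparing a migrated Databricks table against the original Hadoop extract; the paths and table names are hypothetical.

```python
# Minimal minus-query validation sketch; paths and table names are
# hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hadoop-side extract copied to S3, registered as a temporary view.
spark.read.parquet("s3://example-bucket/hadoop_export/dg_claims/") \
    .createOrReplaceTempView("hadoop_dg_claims")

# Rows present in the Databricks table but missing from the Hadoop extract.
only_in_databricks = spark.sql("""
    SELECT * FROM cnfz.dg_claims
    EXCEPT
    SELECT * FROM hadoop_dg_claims
""")

# Rows present in the Hadoop extract but missing from the Databricks table.
only_in_hadoop = spark.sql("""
    SELECT * FROM hadoop_dg_claims
    EXCEPT
    SELECT * FROM cnfz.dg_claims
""")

print("Only in Databricks:", only_in_databricks.count())
print("Only in Hadoop:", only_in_hadoop.count())
```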