Candidate's Name
Piscataway, NJ, Street Address | PHONE NUMBER AVAILABLE | EMAIL AVAILABLE | https://meritpages.com/Candidate's Name
Professional Summary:
- Experienced Data Engineer intern with hands-on expertise in Apache Hadoop, Spark, Kafka, and cloud platforms including Amazon Web Services (EMR), Microsoft Azure, and Google Cloud Platform.
- Skilled in real-time data processing and visualization using Apache Spark Streaming, Kafka, and Microsoft Power BI.
- Proficient in database administration with Snowflake and MySQL, optimizing queries for better performance.
- Collaborative team player with experience in Agile methodologies, Git version control, and Jira task management.

EDUCATION
Master of Science (M.S.) - Information Science, Trine University, Detroit, MI (May 2022 - Mar 2024)
Bachelor of Science (B.S.) - Forensic Science, Jain University, Bangalore (Jun 2018 - May 2021)

EXPERIENCE
Data Engineer (Internship), Kanap Systems LLC, Alpharetta, GA (May 2022 - Feb 2023)
- Worked extensively with Apache Hadoop, MapReduce, HDFS, Hive, Kafka, Spark, and Zookeeper on internship projects.
- Gained practical exposure to Amazon Web Services (EMR, EC2, S3, Athena, Glue, Elasticsearch, Lambda, DynamoDB, Redshift, ECS) as well as Microsoft Azure and Google Cloud Platform services.
- Developed real-time data processing applications using Apache Spark Streaming and Kafka to handle high-volume data streams efficiently (a minimal sketch follows this role).
- Leveraged Microsoft Power BI and Python libraries to create insightful data visualizations and dashboards for analysis.
- Collaborated with team members using Git version control, Jira, and Agile methodologies to manage project tasks and workflows.
- Performed database administration tasks on platforms including Snowflake and MySQL.
- Implemented performance tuning techniques to optimize queries and improve the efficiency of data processing workflows.
- Monitored system performance with Azure Monitor and Amazon CloudWatch, proactively addressing issues to keep data processing pipelines running smoothly.
- Created comprehensive documentation for project workflows, configurations, and procedures using Confluence and Notepad++.
- Actively participated in training sessions, workshops, and team meetings to enhance technical skills and stay current with industry trends and best practices.
- Basic understanding of Terraform, DevOps practices, and HCL (HashiCorp Configuration Language).
- Basic understanding of integrating data from various sources into GCP using services such as Cloud Storage, BigQuery, Dataflow, and Dataproc.
- Knowledge of loading data from various sources into BigQuery using batch or streaming ingestion methods.
- Knowledge of web development with the Django framework, Django Ninja, and HTML.
- Demonstrated proficiency in Python, Scala, SQL, and UNIX shell scripting to develop and optimize data processing pipelines.
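The Spark Streaming and Kafka bullet above describes consuming high-volume event streams in real time. The sketch below is a minimal, hypothetical illustration of that pattern using PySpark Structured Streaming; the broker address, topic name, and event schema are invented for the example and do not come from the actual internship project (running it also requires the spark-sql-kafka connector package on the Spark classpath).

    # Minimal Spark Structured Streaming sketch: consume a Kafka topic and
    # count events per type over 1-minute windows. All names are placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import from_json, col, window
    from pyspark.sql.types import StructType, StructField, StringType, TimestampType

    spark = SparkSession.builder.appName("kafka-stream-sketch").getOrCreate()

    # Hypothetical schema for the incoming JSON payloads.
    schema = StructType([
        StructField("event_type", StringType()),
        StructField("user_id", StringType()),
        StructField("event_time", TimestampType()),
    ])

    # Read the raw Kafka stream; the value column arrives as bytes.
    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", "localhost:9092")  # placeholder broker
           .option("subscribe", "events")                        # placeholder topic
           .load())

    events = (raw
              .select(from_json(col("value").cast("string"), schema).alias("e"))
              .select("e.*"))

    # Tumbling 1-minute counts per event type, with a watermark to bound state.
    counts = (events
              .withWatermark("event_time", "5 minutes")
              .groupBy(window(col("event_time"), "1 minute"), col("event_type"))
              .count())

    query = counts.writeStream.outputMode("update").format("console").start()
    query.awaitTermination()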
Data Engineer, Quicknify, Bangalore (Dec 2020 - Dec 2021)
- Designed the business requirement collection approach based on the project scope and SDLC methodology.
- Analyzed client requirements and provided estimation, design, coding, testing, implementation, and post-implementation support.
- Experienced in migrating SQL databases to Azure services: Azure Data Lake, Azure SQL Database, Databricks, and Azure SQL Data Warehouse.
- Wrote and optimized PySpark code to perform ETL (Extract, Transform, Load) operations, data cleansing, and complex data transformations on diverse datasets within Azure Databricks environments (see the first sketch at the end of this resume).
- Created automation scripts using SQL, PySpark, and Pandas to reduce manual effort in data processing tasks.
- Optimized SQL queries for efficient data extraction and analysis.
- Developed scripts in Python, Scala, and Spark to process source files and load them into HDFS.
- Gained exposure to MySQL functions for extending database functionality.
- Monitored daily and weekly job flows in Airflow, ensuring prompt fixes for any job failures.
- Monitored and optimized Azure resources for performance, cost, and reliability using Azure Monitor, Azure Advisor, and other monitoring tools.
- Maintained code changes in Git repositories for version control and used Jira for defect tracking.
- Handled data sources in ORC and CSV file formats.
- Proficient in version control, code collaboration, and reproducibility using Git and Azure Repos for PySpark development and project management within Azure Databricks environments.
- Collaborated effectively with cross-functional teams to ensure successful database migration projects.
- Used Databricks to write and execute code for data processing, analysis, and machine learning tasks.
- Used Apache Airflow for complex workflow automation, with process automation handled by wrapper scripts written in shell (see the second sketch at the end of this resume).
- Debugged and troubleshot issues using logs, ensuring rapid identification and resolution within Azure environments.
- Utilized Control-M to monitor jobs, analyze failure logs, and implement corrective actions based on identified errors.
- Used ServiceNow to track and address tickets for data issues raised by users.

SKILLS
Languages & scripting: Python, Scala, SQL, PySpark, Pandas, UNIX shell scripting
Big data: Hadoop, MapReduce, HDFS, Hive, Spark, Kafka, Zookeeper, ETL (Extract, Transform, Load)
AWS: EMR, EC2, S3, Athena, Glue, Elasticsearch, Lambda, DynamoDB, ECS, Redshift, CloudWatch
Microsoft Azure: Data Lake, SQL Database, Databricks, SQL Data Warehouse, Azure Monitor
GCP: BigQuery, Cloud Storage, Dataflow, Dataproc
Data warehousing: Snowflake
Orchestration & tooling: Apache Airflow, Control-M, Terraform, HCL, Git, Jira, ServiceNow, Agile methodologies
Visualization: Microsoft Power BI
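CODE SKETCHES (ILLUSTRATIVE)
The PySpark ETL bullet under the Quicknify role describes reading source files, cleansing them, and transforming them within Databricks. Below is a minimal sketch of that kind of job, reading CSV and writing ORC to match the file formats named above; the paths and column names (orders.csv, order_id, customer, order_date) are hypothetical, not taken from the actual project.

    # Minimal PySpark ETL sketch: read CSV, apply basic cleansing, write ORC.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, trim, to_date

    spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

    # Placeholder input path; the source CSV is assumed to have a header row.
    df = spark.read.option("header", "true").csv("/data/raw/orders.csv")

    clean = (df
             .dropDuplicates()
             .na.drop(subset=["order_id"])                    # drop rows missing the key
             .withColumn("customer", trim(col("customer")))   # normalize whitespace
             .withColumn("order_date", to_date(col("order_date"), "yyyy-MM-dd")))

    # Placeholder output path; ORC matches the formats listed in the resume.
    clean.write.mode("overwrite").orc("/data/curated/orders_orc")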
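The Airflow bullet describes complex workflow automation driven by shell wrapper scripts. The sketch below shows a minimal Airflow 2.x DAG under that assumption; the DAG id, schedule, and script paths are hypothetical placeholders.

    # Minimal Airflow DAG sketch: a daily shell wrapper step followed by a
    # spark-submit load step. All ids and paths are placeholders.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="daily_etl_sketch",
        start_date=datetime(2023, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        # Wrapper shell script that stages source files (the trailing space
        # stops Airflow from trying to Jinja-template the .sh path).
        stage = BashOperator(
            task_id="stage_files",
            bash_command="/opt/scripts/stage_files.sh ",
        )

        # Spark job that loads the staged files into HDFS.
        load = BashOperator(
            task_id="spark_load",
            bash_command="spark-submit /opt/jobs/load_to_hdfs.py",
        )

        stage >> load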