
Data Processing Engineer Resume, Chicago, IL

Candidate's Name
Chicago, IL | PHONE NUMBER AVAILABLE | LinkedIn | GitHub | HackerRank | Email | Website

SUMMARY
- Master's in Computer Science with 3+ years of IT experience in data engineering.
- Working experience with the Hadoop ecosystem (Gen-1 and Gen-2) and its components, including HDFS, Job Tracker, Task Tracker, NameNode, DataNode, and Resource Manager (YARN).
- Experience with the Cloudera distribution, encompassing components such as MapReduce, Spark, SQL, Hive, HBase, Sqoop, and PySpark.
- Good skills with the NoSQL database Cassandra.
- Proficient in developing Hive scripts for various business requirements.
- Knowledge of data warehousing concepts in OLTP/OLAP system analysis and of developing database schemas, such as star and snowflake schemas, for relational and dimensional modeling.
- Hands-on experience creating custom UDFs in the Snowflake data warehouse.
- Load and transform large sets of structured, semi-structured, and unstructured data from relational database systems to HDFS and vice versa using Sqoop.
- Good experience with the architecture and components of Spark; efficient with Spark Core, DataFrames, Datasets, the RDD API, Spark SQL, and Spark Streaming; expertise in building PySpark and Spark-Scala applications for interactive analysis, batch processing, and stream processing.
- Hands-on experience with Spark, Scala, Spark SQL, and HiveContext for data processing.
- Knowledge of GCP tools such as Cloud Functions, Dataproc, and BigQuery.
- Experience with the Azure cloud: ADF, ADLS, Blob Storage, Databricks, Synapse, etc.
- Extensive working experience in an Agile development methodology, and working knowledge of Linux.
- Expertise in working with big data distributions such as Cloudera and Hortonworks.
- Experience in tuning and debugging Spark applications, applying Spark optimization techniques, and tuning compute and memory for performance and cost.
- Expertise in developing batch data processing applications using Spark, Hive, and Sqoop.
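The summary above describes building PySpark batch applications and moving relational data into HDFS (Sqoop is named for that step). Below is a minimal sketch of that batch pattern, assuming hypothetical connection details, table names, and output paths; Spark's JDBC reader stands in for Sqoop at the code level, so treat it as an illustration rather than the candidate's actual pipeline.

    # Minimal PySpark batch sketch: relational source -> transform -> HDFS.
    # All connection details, table names, and paths are hypothetical.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (SparkSession.builder
             .appName("rdbms-to-hdfs-batch")
             .getOrCreate())

    # Read a relational table over JDBC (the resume names Sqoop for this
    # step; spark.read with the jdbc source is an in-Spark equivalent).
    orders = (spark.read.format("jdbc")
              .option("url", "jdbc:mysql://db-host:3306/sales")  # hypothetical
              .option("dbtable", "orders")                       # hypothetical
              .option("user", "etl_user")
              .option("password", "********")
              .load())

    # A simple batch transformation: daily order totals per customer.
    daily_totals = (orders
                    .withColumn("order_date", F.to_date("order_ts"))
                    .groupBy("customer_id", "order_date")
                    .agg(F.sum("amount").alias("total_amount")))

    # Land the result in HDFS as date-partitioned Parquet.
    (daily_totals.write
     .mode("overwrite")
     .partitionBy("order_date")
     .parquet("hdfs:///warehouse/sales/daily_totals"))

Partitioning the Parquet output by date keeps later Hive or Spark scans pruned to only the days a query touches.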
EXPERIENCE

Azure Data Engineer (Jan 2023 - Nov 2023)
ExxonMobil (contract), Houston, Texas
- Engineered SQL scripts to automate query processes, eliminating manual intervention and increasing query volume and accuracy by 40%.
- Designed and implemented an automated system using PowerShell and Azure Cloud Shell, streamlining the deployment of Azure data solutions.
- Crafted data models that improved data processing efficiency in Azure, resulting in a 30% increase in overall productivity.
- Used Azure services such as Data Factory to build robust data pipelines, enabling migration of data from legacy SQL servers to Azure Database through the integration of Data Factory pipelines and Python scripting.
- Collaborated closely with Analytics and BI teams to develop globally used metrics and reports, reducing manual data analysis by more than 15%.

Associate Software Engineer, Data (Dec 2020 - Dec 2021)
Tech Mahindra, Hyderabad, India
- Designed and implemented a highly scalable data model and data warehouse using Snowflake, yielding a 10% improvement in data processing speed and a 15% reduction in storage costs.
- Engineered and deployed data pipelines to improve data quality, resulting in a 30% increase in data accuracy.
- Streamlined and fine-tuned ETL processes for data loading into Snowflake, achieving a 50% reduction in data loading time and a 15% improvement in overall data quality.
- Developed optimized Spark code using PySpark and Spark SQL, resulting in a 20% improvement in data processing speed and higher data accuracy.
- Built and architected multiple data pipelines and end-to-end ETL and ELT processes for data ingestion and transformation.

Data Engineer (Nov 2019 - Dec 2020)
Adiroha, Bengaluru, India
- Optimized an existing data pipeline to improve its performance and scalability.
- Created and maintained documentation of data models, ETL processes, and data security policies, reducing onboarding time for new team members by 30% and ensuring consistent data governance practices.
- Built a data warehouse in Snowflake to capture historical data.
- Conducted data analysis to identify patterns and trends in customer behavior.

Data Engineer Intern (June 2019 - Nov 2019)
Adiroha, Bengaluru, India
- Developed a new data pipeline to collect and load data from a new data source into the company's data warehouse.
- Implemented complex SQL queries to extract data from the data warehouse.
- Optimized SQL queries using indexes, per company requirements.

PROJECTS

Winkart | Django, SQL, JavaScript, Docker (link) | May 2023 - June 2023
- Built a complete end-to-end e-commerce website where a user can buy apparel.
- Used session keys to implement an add-to-cart function that increments, decrements, and deletes items in the cart.
- Integrated the PayPal payment system so a user can purchase products.
- Implemented token-based login for enhanced security.

Credit Card Spends | SQL (link)
- Retrieved transaction details for each card type when it reaches a cumulative total of 100000.
- Identified which card and expense type combination saw the highest month-over-month growth in Jan-201
- Found which city took the fewest days to reach its 500th transaction after the first transaction in that city.

Snowflake ETL Pipeline (link)
- Created 3 layers to store the data and to capture CDC data over time.
- For each layer, created 3 pipes, 3 tables, and 3 streams to build a continuous data flow from Amazon S3.
- Used Snowflake tasks to automate copying data into tables (a sketch of this stream-and-task pattern appears after the Certifications section below).
- Transformed the data in the second layer (curation zone) and sent it to the consumption layer for data analysis.

More Side Projects

TECHNICAL SKILLS

Languages: Java, Python, C/C++, SQL, JavaScript
Data Warehousing: Snowflake, Pentaho, AWS Redshift
Azure Cloud Tools: Azure Data Lake, Azure Blob Storage, Azure VM, Azure Synapse, Data Factory, Azure Cosmos DB
Big Data Tools: Hadoop, Hive, Spark, Metastore, Presto, Flume, Kafka
Developer Tools: Git, Google Cloud Platform, VS Code, Visual Studio, PyCharm, IntelliJ, Eclipse
Libraries: Pandas, NumPy, Matplotlib
ML Frameworks: scikit-learn, TensorFlow
NoSQL: HBase, Cassandra, MongoDB

EDUCATION

Western Illinois University, Macomb, IL
Master of Science, Computer Science [2022-2023], GPA: 3.29

KL University, AP, India
Bachelor's in Computer Science [2016-2020], GPA: 3.25

CERTIFICATIONS

AWS Certified Cloud Practitioner
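As referenced in the Projects section, here is a minimal sketch of the stream-and-task pattern the Snowflake ETL Pipeline project describes: a stream capturing CDC rows in one layer, and a scheduled task moving them to the next. The account settings and object names are hypothetical placeholders, the snowflake-connector-python package is assumed, and the project's actual pipe and layer definitions are not reproduced here.

    # Sketch of one stream + task hop between two pipeline layers.
    # Account, credentials, and object names are hypothetical.
    import snowflake.connector

    conn = snowflake.connector.connect(
        account="my_account",   # hypothetical
        user="etl_user",        # hypothetical
        password="********",
        warehouse="ETL_WH",
        database="ANALYTICS",
        schema="PIPELINE",
    )

    ddl_statements = [
        # Stream records CDC rows as they land in the raw-layer table.
        "CREATE OR REPLACE STREAM raw_orders_stream ON TABLE raw_orders",
        # Task runs on a schedule, but only when the stream has data,
        # and moves the new rows into the curation layer.
        """CREATE OR REPLACE TASK curate_orders_task
             WAREHOUSE = ETL_WH
             SCHEDULE = '5 MINUTE'
           WHEN SYSTEM$STREAM_HAS_DATA('RAW_ORDERS_STREAM')
           AS
             INSERT INTO curated_orders (order_id, amount_usd, loaded_at)
             SELECT order_id, amount / 100, CURRENT_TIMESTAMP()
             FROM raw_orders_stream""",
        # Tasks are created suspended; RESUME starts the schedule.
        "ALTER TASK curate_orders_task RESUME",
    ]

    cur = conn.cursor()
    try:
        for ddl in ddl_statements:
            cur.execute(ddl)
    finally:
        cur.close()
        conn.close()

Because the INSERT consumes the stream, each task run advances the stream offset, so rows are carried forward exactly once per hop.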
