Candidate's Name
Street Address
PHONE NUMBER AVAILABLE | EMAIL AVAILABLE | github.com/terrortad/
Education
University of Maryland
Bachelor of Science in Computer Science, College Park, Maryland
Experience
Hormel Foods, June 2019 – Present
Data Engineer Remote, VA
Technologies: Python, SQL, Google Cloud Platform, Airflow, Tableau, Informatica ETL
Optimized data pipelines, streamlining the processing and analysis of 80GB+ of weekly data from
diverse sources, including customer demographics, web activity, and supply chain/distribution data.
Implemented Apache Airflow scheduling to automate data pipeline tasks, reducing manual intervention by 40 percent
and ensuring consistent data flow for machine learning models.
Partnered with data scientists to develop a customer churn prediction model using machine learning algorithms.
Analyzed customer behavior data to identify key risk factors, leading to a 15 percent reduction in customer churn rate.
Designed a highly scalable data lake using Microsoft SQL Server to store and manage unstructured data from social
media platforms, enabling sentiment analysis and valuable customer insights.
Merged real-time shelf data from merchants into Google BigQuery using the Crisp data platform, reducing food
waste and improving inventory management through alerts on potential out-of-stock situations.
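As an illustration of the merge pattern described above (table and column names here are hypothetical, not taken from the actual project), a scheduled upsert of shelf-feed data into a BigQuery inventory table might look like:

```sql
-- Hypothetical example: upsert real-time shelf data into a BigQuery inventory table
MERGE `retail.inventory` AS target
USING `crisp.shelf_feed` AS source
ON target.store_id = source.store_id AND target.sku = source.sku
WHEN MATCHED THEN
  UPDATE SET on_shelf_qty = source.on_shelf_qty,
             updated_at = source.observed_at
WHEN NOT MATCHED THEN
  INSERT (store_id, sku, on_shelf_qty, updated_at)
  VALUES (source.store_id, source.sku, source.on_shelf_qty, source.observed_at);
```

Rows whose `on_shelf_qty` drops below a threshold after the merge can then drive the out-of-stock alerts.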
Spins LLC, March 2018 – May 2019
Data Engineer Chicago, IL
Technologies: Python, SQL, Google Cloud Platform, Airflow, MySQL, Hadoop Distributed File System (HDFS)
Reduced data release execution time by 18 percent through proactive monitoring and troubleshooting of on-premises and
cloud workflows using Python scripts.
Migrated data processing scripts from on-premises servers to Airflow, streamlining deployment and
scheduling compared to cron jobs and improving collaboration and efficiency within the data operations team.
Utilized Google Cloud Platform (GCP) tools for data operations, including troubleshooting MySQL issues within the
data pipeline and maintaining familiarity with Hadoop Distributed File System (HDFS) and logs for potential future
integration.
Maintained clear communication with product and customer success teams, proactively identifying potential impacts of
upcoming features and customer needs on data operations procedures.
Projects
Spotify ETL Data Pipeline | Python, AWS (S3, Glue, Athena), Kafka, Lambda Functions
Designed and implemented a scalable ETL pipeline on AWS to process terabytes of Spotify data in real-time, enabling
data-driven music trend analysis and artist popularity insights.
Leveraged Apache Kafka to achieve a low-latency data ingestion rate for Spotify data, ensuring continuous updates and
fresh insights.
Developed a user-friendly data analysis framework using AWS Athena, allowing analysts to run complex SQL queries on
structured data for faster and more insightful music trend identification.
Optimized data storage using AWS S3, enabling cost-effective storage of both raw and processed Spotify data for future
analysis.
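A minimal sketch of the raw/processed S3 layout implied above (the bucket layout and key scheme are assumptions, not taken from the actual project): Hive-style date partitions keep Athena scans cheap and separate raw ingests from processed outputs.

```python
from datetime import date

def s3_key(layer: str, day: date, filename: str) -> str:
    """Build a date-partitioned S3 object key for raw vs processed data.

    Hive-style year=/month=/day= partitions let Athena prune scans by date.
    """
    if layer not in {"raw", "processed"}:
        raise ValueError(f"unknown layer: {layer}")
    return (f"{layer}/year={day.year}/month={day.month:02d}/"
            f"day={day.day:02d}/{filename}")

# Example: key for a raw Spotify stream dump
print(s3_key("raw", date(2024, 6, 3), "streams.json"))
# raw/year=2024/month=06/day=03/streams.json
```

Registering the `year`/`month`/`day` columns as partition keys in the Glue catalog lets Athena queries filter on them directly.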
Reddit Post Data Pipeline | Python, SQL, AWS (Glue, Athena, Redshift), Docker
Designed and implemented a high-performance ETL pipeline for Reddit data using a modern technology stack (Apache
Airflow, Celery, AWS Glue, Athena, Redshift).
Extracted data from the Reddit API and efficiently stored it in an S3 bucket using Airflow.
Developed data transformation logic utilizing AWS Glue and Amazon Athena for active querying and cleaning.
Leveraged Docker containers to ensure a consistent and portable project environment.
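The extraction step above can be sketched as follows, assuming the standard Reddit listing JSON shape returned by the public API (the field names follow that API; the row schema chosen here is illustrative, not from the actual project):

```python
import json

def flatten_posts(listing_json: str) -> list[dict]:
    """Flatten a Reddit API listing payload into flat rows suitable
    for staging in S3 and loading into a warehouse table."""
    payload = json.loads(listing_json)
    rows = []
    for child in payload["data"]["children"]:
        post = child["data"]
        rows.append({
            "id": post["id"],
            "subreddit": post["subreddit"],
            "title": post["title"],
            "score": post["score"],
            "num_comments": post["num_comments"],
        })
    return rows

# Example with a minimal listing payload
sample = json.dumps({"data": {"children": [
    {"data": {"id": "abc", "subreddit": "python", "title": "Hello",
              "score": 42, "num_comments": 7}}
]}})
print(flatten_posts(sample))
```

In the pipeline described, an Airflow task would write these rows to S3, after which Glue and Athena handle transformation and querying before the Redshift load.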
Technical Skills
Languages: Python, SQL, T-SQL
Developer Tools: SSIS, SQL Server, GitHub, PowerShell
Databases: MS SQL, Google Cloud Platform, AWS Redshift
Technologies/Frameworks: Linux, Apache, PowerBI, Docker, Tableau, Hadoop