
Data Engineer Quality Resume Bloomington...

Candidate Information
Title: Data Engineer Quality
Target Location: US-IN-Bloomington

Candidate's Name
Carrollton, TX Street Address
PHONE NUMBER AVAILABLE | EMAIL AVAILABLE | LINKEDIN LINK AVAILABLE | github.com/sreyas0304

Summary
Data Engineer with proven expertise in designing and optimizing large-scale ETL pipelines, data modeling, and database management. Proficient in Python and SQL, with experience building scalable data infrastructure on cloud platforms such as AWS and Snowflake. Skilled at managing data quality, collaborating with cross-functional teams, and solving complex business problems at scale in fast-paced environments.

Skills & Certifications
Languages: Python, R, Scala, SQL, NoSQL, Shell Scripting
Developer Tools: VS Code, Anaconda, Tableau, Power BI, Azure, AWS, GitHub, Docker
Databases: PostgreSQL, MySQL, Oracle, MongoDB, SQLite, Neo4j, Vector Databases
Technologies/Frameworks: REST APIs, Pandas, NumPy, scikit-learn, ETL, Hadoop, Hive, Airflow, Spark, Kafka, Databricks, CI/CD, Agile Methodologies, Snowflake, OpenAI, LangChain
AWS Services: EC2, RDS, Athena, Redshift, S3, Kinesis, EMR, Lambda, Glue, QuickSight, Step Functions, SQS
Certifications: AWS Data Engineer Associate (In Progress), AWS Cloud Practitioner Essentials, Big Data 101, Data Warehouse Essentials (Snowflake), Data Engineering Essentials (Snowflake), Data Analytics Essentials, Tableau Essentials Training, Lakehouse Fundamentals

Experience
CrowdDoing | Data Analytics Engineer | San Francisco, US | Jun 2024 - Present
- Architected and deployed an end-to-end pipeline using AWS S3, Docker, and Apache Airflow for scalable storage and processing of book data and metadata, contributing to the architecture and design of large data pipelines.
- Improved the accuracy of extracted context and quotes by 40% by writing automated data-quality checks at multiple pipeline stages.
- Integrated PySpark for large-scale text extraction and Ollama for high-quality vector embeddings, improving the ability to analyze book content.
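The automated data-quality checks described above could look something like the following minimal sketch. This is purely illustrative: the field names and validation rules are hypothetical, not taken from the actual pipeline.

```python
# Hypothetical data-quality gate for a book-metadata pipeline.
# Field names and rules are illustrative only.

def check_record(record: dict) -> list[str]:
    """Return a list of quality issues found in one record."""
    issues = []
    if not record.get("title"):
        issues.append("missing title")
    text = record.get("extracted_text", "")
    if len(text) < 20:  # too short to be a usable quote or context
        issues.append("extracted text too short")
    if record.get("page", 0) <= 0:
        issues.append("invalid page number")
    return issues

def run_quality_gate(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split records into (passed, failed) before the next pipeline stage."""
    passed, failed = [], []
    for r in records:
        (failed if check_record(r) else passed).append(r)
    return passed, failed

records = [
    {"title": "Moby-Dick", "extracted_text": "Call me Ishmael. Some years ago...", "page": 1},
    {"title": "", "extracted_text": "short", "page": 0},
]
good, bad = run_quality_gate(records)
```

Running checks like these between pipeline stages keeps malformed rows from propagating downstream, which is the usual mechanism behind the accuracy gains claimed above.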
- Built a knowledge graph in Neo4j to store vector embeddings and relationships, and developed RAG models with LangChain and the OpenAI APIs for comprehensive contextual search, significantly enhancing the application's search functionality.

Indiana University | Research Data Analyst | Bloomington, US | Jun 2023 - May 2024
- Designed and implemented a relational database to store and manage spectroscopy and DNA experiment data from animal tusk samples.
- Used Python and machine learning algorithms (K-means, DBSCAN, PCA) to cluster and analyze the data, enabling efficient identification of elephant tusks for anti-poaching initiatives.
- Developed dashboards and visualizations in Tableau and ArcGIS to uncover geographical patterns, and built a machine learning model (95% accuracy) in scikit-learn for elephant ivory classification.

Education
Indiana University | Master of Science in Computer Science | Bloomington, US | Aug 2022 - May 2024
Thakur College of Engineering and Technology | Bachelor of Engineering in Computer Engineering | Mumbai, IN | Aug 2018 - May 2022

Projects
Real-time Streaming Data Pipeline | Python, Kafka, Spark, Docker, AWS, Tableau | June 2024
- Architected and optimized a real-time data pipeline using Apache Kafka and Spark, achieving a 50% increase in data processing speed.
- Used AWS S3, Glue, and Redshift for efficient data storage and transformation, enabling advanced analysis in Tableau.

End-to-End Weather Data Ingestion Pipeline | Python, Airflow, AWS (S3, CodeBuild, Glue, Redshift) | May 2024
- Implemented a scalable ETL pipeline for weather data ingestion using Apache Airflow and AWS, enhancing data access and visualization capabilities.
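The K-means clustering workflow mentioned under the Indiana University role can be sketched in miniature. The real analysis presumably used scikit-learn on spectroscopy and DNA features; this is a self-contained pure-Python toy version showing only the assign/update loop, with made-up 2-D points.

```python
# Minimal K-means sketch (illustrative only; the actual analysis used
# scikit-learn on spectroscopy/DNA features, not these toy points).
import random

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(pts):
    """Component-wise mean of a non-empty list of points."""
    n = len(pts)
    return tuple(sum(x) / n for x in zip(*pts))

def kmeans(points, k, iters=20, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            i = min(range(k), key=lambda c: dist2(p, centers[c]))
            clusters[i].append(p)
        # Update step: move each center to its cluster's mean
        # (keep the old center if a cluster ends up empty).
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Two well-separated toy groups of tusk-like feature vectors.
pts = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 5.0), (5.1, 4.9)]
centers, clusters = kmeans(pts, k=2)
```

In practice one would standardize the features and reduce dimensionality with PCA before clustering, as the resume indicates.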
- Integrated CI/CD tooling (Git) for seamless deployment and workflow management.

ETL and Analysis of Amazon's Mobile Sales Orders | SQL, Snowflake, Spark, AWS (S3, IAM) | November 2023
- Executed an end-to-end ETL process using Snowflake and SQL, implementing data flow from three regions, and performed trend analysis showing 10% annual growth.
- Used Snowflake's features to optimize data handling.
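As a generic illustration of the extract-transform-load pattern these projects follow, here is a minimal sketch with sqlite3 standing in for Snowflake/Redshift. The table, columns, and sample rows are hypothetical, not from the actual project.

```python
# Toy ETL: extract raw order rows, transform (clean + aggregate by region),
# load into a warehouse table. sqlite3 stands in for Snowflake; the schema
# and data are illustrative only.
import sqlite3

raw_orders = [  # extract: pretend these arrived from regional sources
    {"region": "NA", "amount": "120.50"},
    {"region": "EU", "amount": "80.00"},
    {"region": "NA", "amount": "19.99"},
    {"region": "APAC", "amount": "None"},  # malformed row, will be dropped
]

def transform(rows):
    """Cast amounts to float, drop malformed rows, aggregate by region."""
    totals = {}
    for r in rows:
        try:
            amt = float(r["amount"])
        except ValueError:
            continue  # skip rows that fail type conversion
        totals[r["region"]] = totals.get(r["region"], 0.0) + amt
    return sorted(totals.items())

def load(conn, rows):
    """Write aggregated rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_by_region (region TEXT, total REAL)"
    )
    conn.executemany("INSERT INTO sales_by_region VALUES (?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(raw_orders))
result = dict(conn.execute("SELECT region, total FROM sales_by_region"))
```

The same extract/transform/load separation scales up directly: swap the in-memory list for S3 reads and sqlite3 for a Snowflake or Redshift connection.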
