Candidate Information
Title: Data Engineer Machine Learning
Target Location: US-OH-Hudson

Candidate's Name
DATA ENGINEER
PHONE NUMBER AVAILABLE | EMAIL AVAILABLE | Voorhees, NJ

SUMMARY
Data Engineer with around 4 years of experience applying data and business principles to design and implement scalable data infrastructure solutions. Experience implementing enterprise-grade data solutions using Databricks for batch processing and streaming frameworks such as Apache Spark Streaming, Apache Kafka, and Apache Flink. Expert in designing and managing data lakes and data warehouses using technologies such as HDFS, Amazon Redshift, Snowflake, and the Big Data / Hadoop ecosystem. Experience building web applications in React, Angular, and Vue; published a Chrome extension (Tracker) in the Chrome Web Store. Knowledge of implementing scalable data solutions and analytics pipelines using big data technologies such as Apache Hadoop, Spark, Hive, Pig, Sqoop, Beam, and Storm, and cloud-based NoSQL databases such as AWS DynamoDB and Cassandra.

EDUCATION
Master's in Data Warehousing and Database Administration, Rowan University - CGPA 3.9
Bachelor of Technology in Computer Science Engineering, JNTUH - CGPA 3.8

WORK EXPERIENCE

Hybopay Inc, NJ - Data Engineer (Jan 2024 - Current)
- Developed Airflow pipelines that orchestrate batch processing of financial data at scheduled intervals for regulatory reporting, ensuring adherence to reporting deadlines (a sketch of such a DAG follows this section).
- Utilized Spark and Airflow's built-in monitoring tools for real-time tracking and debugging of data workflows, increasing process efficiency by 12%.
- Migrated the on-premises data warehouse to Snowflake, enabling secure and scalable storage of financial data for reporting and analytics. Implemented data masking and row-level security to protect sensitive information, improving query performance by approximately 8% and reducing average query times from 19.2 seconds to 13.8 seconds.
- Implemented Snowpipe with automated triggers to streamline ETL processes, ensuring efficient and timely data ingestion.
- Implemented an Apache NiFi streaming pipeline to ingest and process real-time transaction data into relational databases, enabling near-real-time analysis of loan credibility and loan installment delinquency. Leveraged Change Data Capture (CDC) techniques to ensure real-time data consistency, reducing analysis time from 15 minutes to under 1 minute. The application analyzes transactions for anomalies and flags potential fraud attempts in real time, reducing financial losses by 3%.
- Used Spark MLlib for parallelizing machine learning models and moved from Spark 2 to Spark 3 DataFrames, which improved processing performance by 300%. Additionally, leveraged Spark's in-memory processing, caching, and partitioning features to improve data processing speed by 15%.
- Conducted data mapping on the existing platform functions for machine learning models, and reviewed and defined the platform-exit plan for these models.
- Defined the acceptance criteria and end-to-end regression testing plan for the machine learning models, along with a model monitoring plan; reviewed results with key stakeholders and gathered sign-off.
- Involved in product backlog refinement, including writing user stories, epics, and initiatives.
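Illustrative only: the following is a minimal sketch, not the candidate's actual code, of the kind of Airflow DAG described in the Hybopay role above, assuming Airflow 2.x. The DAG name, schedule, and script paths are hypothetical; it submits a Spark batch job for the day's financial data and then loads the output into Snowflake.

# Minimal Airflow 2.x DAG sketch (hypothetical names and paths).
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "data-engineering",
    "retries": 2,                          # retry a failed task before alerting
    "retry_delay": timedelta(minutes=10),
}

with DAG(
    dag_id="regulatory_batch_report",      # hypothetical DAG name
    schedule_interval="0 2 * * *",         # nightly run ahead of the reporting deadline
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args=default_args,
) as dag:
    # Submit the Spark batch job that aggregates the day's financial transactions.
    run_spark_batch = BashOperator(
        task_id="run_spark_batch",
        bash_command="spark-submit /opt/jobs/aggregate_transactions.py --date {{ ds }}",
    )

    # Load the aggregated output into Snowflake for downstream regulatory reporting.
    load_to_snowflake = BashOperator(
        task_id="load_to_snowflake",
        bash_command="python /opt/jobs/load_to_snowflake.py --date {{ ds }}",
    )

    run_spark_batch >> load_to_snowflake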
Visusoft Private Limited, India - Data Engineer (Aug 2019 - Dec 2022)
- Implemented data cleansing and transformation logic using Spark and DBT (Data Build Tool) to improve the quality of customer financial transaction data by 30%, ensuring accurate customer segmentation and credit risk modeling.
- Managed data warehouse tasks using Azure Synapse and Blob Storage, ensuring scalable and secure data storage solutions. Utilized Azure Data Factory to orchestrate data workflows, reducing data processing time by 25%.
- Built serverless data processing pipelines using Lambda, Glue, and Step Functions to automate data ingestion and transformation tasks for financial data feeds, minimizing operational overhead and improving efficiency by 20%.
- Established data pipelines with error handling and retry mechanisms using Airflow and Kafka to ensure the reliability and integrity of financial data processing, minimizing data loss to less than 1% and ensuring timely delivery of insights for investment decisions.
- Used Jenkins for continuous integration, Amazon Kinesis for real-time data streaming, and Elastic MapReduce (EMR) for distributed dataset processing.
- Utilized SQS, SNS, and EventBridge to enhance consistency and efficiency in data processing, resulting in a 30% improvement in data quality and a 25% reduction in data errors and redundancy.

Visusoft Private Limited, India - Machine Learning Engineer Intern (Jan 2019 - May 2019)
Precision Agriculture for Water Management:
- Developed and implemented machine learning models using Scikit-learn and TensorFlow to optimize irrigation schedules based on soil moisture and weather data, leading to water savings of 20% and a 15% increase in crop yield (a sketch of such a model follows the Skills section).
- Collaborated with farmers to deploy the models in the field and track performance over the growing season.
- Integrated real-time sensor data into the models, enabling dynamic adjustments to irrigation schedules based on current environmental conditions, further enhancing water efficiency by 10%. Employed Azure IoT Hub for real-time data ingestion and processing.
- Used Jira for project management and task tracking, ensuring timely delivery and collaboration among team members. Utilized Maven for managing project dependencies and building reproducible environments.

CERTIFICATIONS
AWS Certified Solutions Architect - Associate
Machine Learning Data Lifecycle in Production

SKILLS
Languages: Scala, Python, SQL, Java, JavaScript, HTML, CSS
IDEs: PyCharm, Jupyter Notebook, Databricks
Big Data Tools: Hadoop, HDFS, Spark, Kafka, Databricks, Apache Cassandra, Hive, Pig, Pytest, ADO (ActiveX Data Objects)
ETL and Cloud Technologies: SSIS, SSRS, AWS (EC2, S3, Amazon Redshift, Glue, Lambda, Kinesis, DynamoDB), Airflow, Agile Scrum, DevOps
Machine Learning: Linear Regression, Logistic Regression, Decision Tree, SVM, K-means, Random Forest
Visualizations: Tableau, Power BI, Excel, Qlik, Alteryx, RStudio, SAS Visual Analytics, BigQuery, Looker
Packages & Data Processing: Matplotlib, Scikit-learn, Seaborn, TensorFlow, PyTorch, Data Pipelines, Jenkins
Version Control & Databases: GitHub, Git, SQL Server, MongoDB, MySQL, Snowflake, Jira
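Illustrative only: a minimal scikit-learn sketch, not the candidate's actual code, of the kind of irrigation model described in the Machine Learning Engineer internship above. The feature set, synthetic data, and target are hypothetical; a real model would be trained on recorded soil-moisture and weather data.

# Minimal scikit-learn sketch with synthetic, clearly hypothetical data.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

# Hypothetical features: soil moisture (%), temperature (C), humidity (%), rainfall (mm).
rng = np.random.default_rng(42)
X = rng.uniform(low=[10, 5, 20, 0], high=[45, 40, 95, 30], size=(500, 4))

# Hypothetical target: irrigation volume needed (liters per square meter).
y = np.clip(
    30 - 0.5 * X[:, 0] + 0.3 * X[:, 1] - 0.1 * X[:, 3] + rng.normal(0, 1, 500),
    0, None,
)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Fit a regressor that predicts irrigation need from environmental readings.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

preds = model.predict(X_test)
print("MAE (L/m^2):", mean_absolute_error(y_test, preds))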
