Candidate's Name
PHONE NUMBER AVAILABLE / EMAIL AVAILABLE / LINKEDIN LINK AVAILABLE

SUMMARY
- Data Engineer with extensive experience designing, implementing, and optimizing data warehousing solutions, with a strong focus on Snowflake and AWS Redshift environments, enhancing data accessibility, performance, and business intelligence capabilities.
- Proficient in Snowflake architecture and features, including Virtual Warehouses, Multi-Cluster Warehouses, Role-Based Access Controls, Data Sharing, Snowpipe, and Zero-Copy Cloning, ensuring efficient and secure data management and access across the organization.
- In-depth knowledge of Snowflake database management and performance optimization, including schema design, partitioning strategies, and workload management to maximize query efficiency and resource utilization in high-demand environments.
- Hands-on experience with AWS services and ETL tools such as Glue, Lambda, and EMR, orchestrating complex data integration workflows and building scalable data pipelines to support real-time analytics and data-driven decision-making.
- Expertise in Python and PySpark for data processing and transformation, using libraries such as Pandas and NumPy to handle diverse data formats, including JSON, XML, Parquet, and CSV, while ensuring data quality and consistency.
- Proven ability to lead cloud migration initiatives, transitioning legacy data systems to modern cloud-based platforms on Snowflake and AWS and achieving significant cost savings and operational efficiencies.
- Strong background in data modeling, transformation, and management, with a focus on delivering scalable, end-to-end data solutions that support clinical research, business intelligence, and strategic growth initiatives.
- Experienced in collaborating with cross-functional teams, including data scientists, analysts, and stakeholders, to gather requirements and deliver customized data solutions aligned with business objectives and strategic goals.
- Skilled in using DevOps tools such as GitLab, JIRA, and Confluence to manage deployments and ensure reliable, efficient data pipeline operations in production and lower environments.
- Adept at implementing data quality frameworks and end-to-end data validation processes to maintain data integrity and support high-quality data delivery across applications and platforms.

EDUCATION
Master of Science in Computer Science, University of Central Florida, Orlando, FL, USA
Bachelor of Technology in Electronics and Communication Engineering, Rajasthan Technical University, Kota, India

CERTIFICATIONS
- AWS Certified Solutions Architect - Associate
- Snowflake SnowPro Core
- Microsoft Certified: Azure Fundamentals
- Robotics Software Engineer Nanodegree, Udacity

SKILLS
Languages: Python, SQL, C++
Databases: Snowflake, SQL Server, Amazon Redshift
BI and Reporting: Tableau, Power BI
Data Warehouse: Snowflake Cloud, AWS Redshift
Cloud Technologies: AWS (EC2, IAM, S3, Lambda, DynamoDB, Redshift, Athena, Secrets Manager, Glue, EMR, KMS)
IDE: PyCharm, Jupyter Notebook
ETL Workflow: Apache Airflow, AWS Glue, Talend
Methodologies: Agile, Waterfall
Version Control: Git, SVN
Scripting: Shell, Python
Operating Systems: Windows, Linux

EXPERIENCE

Jun 24 - Current  Senior Software Engineer, Equifax, Fort Mill, SC
Optimization of a financial data warehouse to enhance performance and streamline data integration, supporting real-time analytics and business intelligence across the organization.
- Designed and optimized AWS Redshift and Snowflake data warehousing solutions to enhance query performance and enable efficient data storage and retrieval for real-time business intelligence and analytics.
- Implemented automated data integration processes using AWS Glue and native Snowflake capabilities to ingest and transform data from multiple sources, reducing manual intervention and ensuring seamless data availability for analysis.
- Collaborated with cross-functional teams, including data scientists, analysts, and business stakeholders, to gather requirements and deliver tailored data solutions aligned with organizational goals, utilizing Snowflake's advanced features.
- Performed data cleansing, transformation, and normalization using Python and Snowflake SQL to ensure data integrity and consistency, enhancing the quality of analytics and reporting.
- Configured Snowflake Virtual Warehouses and Multi-Cluster Warehouses to manage workloads efficiently, ensuring optimal resource utilization and cost management.
- Developed ETL pipelines to load and transform data from on-premises and cloud-based sources into Snowflake and AWS Redshift, facilitating seamless data migration and integration.
- Applied role-based access controls, data masking, and data sharing techniques in Snowflake to ensure data security, compliance, and controlled data access across teams.
- Optimized SQL queries and storage structures in Snowflake and AWS Redshift for faster data retrieval and improved performance of critical business applications.
- Conducted data modeling and architecture design in Snowflake, arranging data in staging and target schemas to support robust analytics and reporting requirements.
- Implemented continuous data ingestion from AWS S3 into Snowflake using Snowpipe and other Snowflake loading techniques, ensuring up-to-date data availability for analytics and decision-making (see the sketch after this section).
- Developed and managed Snowflake streams, tasks, and zero-copy cloning to support data versioning and automate data pipeline processes, enhancing data management efficiency.
Environment: AWS Redshift, Snowflake, Python, Lambda, Glue, SQL.
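A minimal sketch of the Snowpipe-plus-stream/task ingestion pattern referenced in this role. It assumes a preconfigured storage integration (s3_int); all object names (RAW_DB, txn_stage, raw_transactions, curated.transactions, LOAD_WH), columns, and the schedule are illustrative placeholders, not the actual Equifax implementation.

```python
# Illustrative only: continuous S3-to-Snowflake ingestion with Snowpipe,
# plus a stream/task pair that moves new rows into a curated table.
import os
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",                       # hypothetical account
    user="etl_user",
    password=os.environ["SNOWFLAKE_PASSWORD"],  # supplied externally
    warehouse="LOAD_WH",
    database="RAW_DB",
    schema="PUBLIC",
)
cur = conn.cursor()

# External stage over the S3 landing bucket (storage integration assumed).
cur.execute("""
    CREATE STAGE IF NOT EXISTS txn_stage
      URL = 's3://example-landing-bucket/transactions/'
      STORAGE_INTEGRATION = s3_int
      FILE_FORMAT = (TYPE = PARQUET)
""")

# Snowpipe with auto-ingest: S3 event notifications trigger each load.
cur.execute("""
    CREATE PIPE IF NOT EXISTS txn_pipe AUTO_INGEST = TRUE AS
      COPY INTO raw_transactions
      FROM @txn_stage
      FILE_FORMAT = (TYPE = PARQUET)
      MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
""")

# A stream captures newly loaded rows; a scheduled task moves them onward.
cur.execute("CREATE STREAM IF NOT EXISTS txn_stream ON TABLE raw_transactions")
cur.execute("""
    CREATE TASK IF NOT EXISTS txn_merge_task
      WAREHOUSE = LOAD_WH
      SCHEDULE = '5 MINUTE'
      WHEN SYSTEM$STREAM_HAS_DATA('TXN_STREAM')
    AS
      INSERT INTO curated.transactions (txn_id, amount, txn_ts)
      SELECT txn_id, amount, txn_ts
      FROM txn_stream
      WHERE METADATA$ACTION = 'INSERT'
""")
cur.execute("ALTER TASK txn_merge_task RESUME")
```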
Nov 22 - Apr 24  Senior Data Engineer, ZS Associates, Princeton, NJ
Cloud-based digital platform to manage and process clinical trial data. The platform supports the ingestion, standardization, and provisioning of digital datasets, aiming to accelerate research timelines and enhance scientific understanding by providing seamless access to vital data assets.
- Designed and implemented a cloud-based data platform for healthcare digitalization, utilizing AWS services (Lambda, S3, EMR, KMS, Secrets Manager, Macie) and integrating Snowflake as the cloud warehouse solution, resulting in a 40% improvement in data retrieval times and expedited clinical research processes.
- Evaluated existing data sources and use cases to streamline the analytics ecosystem, designing a future-state technology architecture that enhanced operational efficiency, improved service quality, and unlocked new revenue streams through advanced data analytics.
- Developed and deployed a scalable AWS-based One Data platform (utilizing EKS, Glue, Redshift, Athena) to integrate diverse data sources into valuable data products, supporting the client's strategic growth initiatives and enabling 25+ new product launches over three years.
- Leveraged Snowflake's cloud-native architecture to optimize data storage and retrieval processes, ensuring efficient data management and fast access to clinical trial data for research and development teams.
- Created and optimized ETL pipelines using Python and Snowflake SQL to process and transform healthcare data from multiple sources, improving data quality and consistency for analytical applications.
- Implemented Snowflake features such as role-based access control, data sharing, and zero-copy cloning to ensure secure data handling and efficient collaboration across teams.
- Utilized Python libraries such as Pandas for data manipulation and normalization, reducing errors and inconsistencies in heterogeneous healthcare data and enhancing the accuracy and reliability of subsequent analyses (see the sketch after this section).
- Developed continuous data ingestion and processing workflows using Snowflake's Snowpipe and AWS Glue, maintaining real-time data availability for ongoing clinical trials and research efforts.
- Collaborated with cross-functional teams to gather requirements and deliver tailored data solutions, utilizing Snowflake's advanced capabilities to meet complex business needs and drive data-driven decision-making.
Environment: AWS, Snowflake, Python, SQL, Big Data, Redshift, Airflow.
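A minimal sketch of the Pandas-based normalization step referenced above. The file names, column names, category codes, and S3 path are hypothetical placeholders, not actual project data.

```python
# Illustrative only: standardize heterogeneous clinical-trial extracts
# before loading them to the warehouse.
import pandas as pd

def normalize_visits(df: pd.DataFrame) -> pd.DataFrame:
    # Normalize column names from varied source conventions.
    df = df.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))
    # Parse dates defensively; unparseable values become NaT for review.
    df["visit_date"] = pd.to_datetime(df["visit_date"], errors="coerce")
    # Harmonize categorical codes across source systems.
    df["sex"] = df["sex"].str.upper().map({"M": "MALE", "F": "FEMALE"})
    # Drop exact duplicates and rows missing the subject key.
    return df.drop_duplicates().dropna(subset=["subject_id"])

# Combine extracts from two hypothetical trial sites, one CSV and one JSON.
raw = pd.concat(
    [pd.read_csv("site_a_visits.csv"), pd.read_json("site_b_visits.json")],
    ignore_index=True,
)
clean = normalize_visits(raw)

# Parquet output lands in S3 (requires s3fs) for Snowpipe/Glue ingestion.
clean.to_parquet("s3://example-bucket/clean/visits.parquet", index=False)
```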
Oct 19 - Oct 22  Software Engineer (Data), Educational Testing Service, Princeton, NJ
DQSAAS (Data Quality Software as a Service) is a platform designed for end-to-end data quality checks. As part of a major cloud migration initiative, over 70 SAS applications were transitioned to Python and AWS, resulting in cost savings of over $1 million annually.
- Developed Python applications for robust data handling, including automated retrieval from SFTP locations, file format verification, data accuracy checks, and validation processes before transferring data to AWS S3 for secure storage and further processing.
- Leveraged PySpark on AWS EMR to process large datasets efficiently, managing containerized deployments with ECS and orchestrating ETL workflows with AWS Glue to enhance analytics and reporting capabilities.
- Built and managed data pipelines using AWS services such as S3, Glue, Lambda, SES, SQS, and DynamoDB, streamlining storage solutions, serverless ETL processes, and event-driven tasks to improve data management and operational efficiency.
- Executed code deployments across production and lower environments using DevOps tools such as JIRA, Confluence, and GitLab, ensuring system reliability and performance through proactive monitoring, regular health checks, and issue resolution.
- Collaborated with infrastructure teams to meet and exceed service level agreements, ensuring high-quality IT service delivery aligned with organizational objectives and fostering strong relationships with business users and technology stakeholders.
- Provided critical on-call support during weekends and extended hours, managing incident responses and system recoveries to uphold stringent service level agreements and maintain operational continuity.
- Conducted comprehensive log analysis using Splunk and other monitoring tools to identify and address potential system issues proactively, enhancing application resiliency and ensuring consistent performance.
Environment: AWS, Python, SQL, Big Data, Redshift, Lambda.

Jan 19 - Oct 19  Software Developer, Siemens Corporate Technology, Princeton, NJ
A robotic simulation using Siemens Plant Simulation software, synchronized with automation lab systems, aimed at enhancing factory automation. The project also involved 3D printing and CAD modeling, as well as the use of SLAM and LIDAR for room mapping and object detection with Intel RealSense, supporting robotic pick-and-place operations.
- Contributed to a DARPA-funded project as a Python developer and developed an ARM-funded robotic simulation using Siemens Plant Simulation software, synchronized with the automation lab.
- Assisted with 3D printing and designed CAD models using Siemens NX.
- Generated room maps using a SLAM algorithm and LIDAR; used an Intel RealSense camera for object detection with OpenCV.
- Worked with a UR3 robot to perform pick-and-place operations.
Environment: Python, Siemens Tecnomatix, CAD, Robotics, ML

Jun 18 - Dec 18  3D Data Post-Processing Workflow Intern, Siemens Corporate Technology, Charlotte, NC
Algorithm development for detecting defects in machine parts using computer vision, aimed at eliminating manual inspection effort and achieving substantial cost savings.
- Created a Siemens NX plug-in for mapping defects from 2D images to 3D CAD models, gaining expertise in photogrammetry along the way; utilized OpenCV for algorithm development.
- Applied advanced concepts in algorithm development, including feature extraction, alignment, segmentation, homography, image registration and rectification, point cloud registration, and image sharpening, smoothing, and thresholding (see the sketch below).
Environment: C++, Siemens NX, OpenCV, Python
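A minimal sketch of the 2D defect-detection and homography-mapping approach described in the internship above, written in Python with OpenCV for consistency with the other sketches. The image file, marker correspondences, part dimensions, and area threshold are illustrative assumptions, not the actual Siemens plug-in.

```python
# Illustrative only: threshold an inspection image to find defect blobs,
# then map their pixel coordinates into the part's planar CAD frame via
# a homography estimated from known reference markers.
import cv2
import numpy as np

img = cv2.imread("part_inspection.png", cv2.IMREAD_GRAYSCALE)
if img is None:
    raise FileNotFoundError("part_inspection.png")

# Smooth, then Otsu-threshold; defects assumed darker than background.
blurred = cv2.GaussianBlur(img, (5, 5), 0)
_, mask = cv2.threshold(blurred, 0, 255,
                        cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)

# Homography from four assumed image<->CAD marker correspondences (mm).
img_pts = np.float32([[50, 60], [900, 55], [910, 700], [45, 705]])
cad_pts = np.float32([[0, 0], [200, 0], [200, 150], [0, 150]])
H, _ = cv2.findHomography(img_pts, cad_pts)

for c in contours:
    if cv2.contourArea(c) < 25:   # ignore speckle noise
        continue
    (x, y), _ = cv2.minEnclosingCircle(c)
    pt = cv2.perspectiveTransform(np.float32([[[x, y]]]), H)[0][0]
    print(f"defect at image ({x:.0f},{y:.0f}) -> "
          f"CAD ({pt[0]:.1f},{pt[1]:.1f}) mm")
```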