
Data Engineer Processing Resume Avon, IN

Candidate Information
Title: Data Engineer Processing
Target Location: US-IN-Avon

Sandhya Kanneboina
Data Engineer
EMAIL AVAILABLE | PHONE NUMBER AVAILABLE | LinkedIn

SUMMARY
Seasoned Data Engineer with over 4 years of experience designing, implementing, and managing end-to-end ETL pipelines for big data processing using on-premises Hadoop frameworks as well as Azure and AWS cloud environments. Proficient in optimizing Hadoop clusters for maximum performance, ensuring data security compliance, and improving data storage performance in Azure and AWS. Skilled at troubleshooting and resolving issues with Azure and AWS monitoring tools. Excels at working with cross-functional teams and is fluent in a variety of programming languages and big data processing frameworks.

EXPERIENCE

Infosys - Hyderabad, India    Jan 2022 - Jan 2023
Data Engineer
- IT professional proficient in data engineering and analytical programming with Java, Python, SQL, R, Scala, and HQL.
- Developed UNIX shell scripts to automate repetitive database processes and validate data files.
- Utilized AWS and Azure services such as EMR, S3, RDS, Lambda, Redshift, Glue, Athena, Databricks, Data Factory, Azure Functions, and Airflow, along with Jupyter notebooks, to address diverse data engineering challenges.
- Developed serverless functions using AWS Lambda in Python and Node.js to handle data processing, file manipulation, and API integrations.
- Contributed actively to Agile sprint planning, retrospective reviews, and daily stand-ups; used Agile tools such as JIRA and Confluence to coordinate and prioritize tasks within cross-functional teams.
- Leveraged core Python and Django middleware to build a web application, and used PySpark and Python to create core engines for data validation and analysis.
- Implemented data transformation solutions using IBM DataStage, including designing, developing, and deploying ETL (Extract, Transform, Load) processes to integrate and transform data from heterogeneous sources.
- Developed and maintained PostgreSQL, MySQL, SQL Server, and Oracle databases to store and manage large volumes of structured data; designed database schemas, optimized queries, and ensured data integrity to meet business requirements.
- Developed stored procedures, functions, triggers, and packages in PL/SQL to implement business logic and enhance database functionality.
- Implemented and optimized real-time data processing workflows on Snowflake, leveraging its architecture for seamless integration with streaming data sources.
- Designed and optimized ETL workflows in SSIS for efficiency and data integrity.
- Implemented event-driven architectures by integrating AWS Lambda with AWS services such as S3, DynamoDB, SNS, and SQS (see the sketch below).
- Developed and maintained Kafka producer and consumer applications using Kafka client libraries (Java, Python, or Scala) to publish and consume messages from Kafka topics, implementing message serialization, error handling, and batching for optimal performance.
- Implemented and optimized AWS infrastructure using services such as Amazon EC2, Amazon S3, Amazon RDS, Amazon VPC, and Amazon Route 53, following best practices for security, reliability, and performance.
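The event-driven AWS Lambda integrations listed above can be illustrated with a minimal sketch. This is a hypothetical example, not code from the candidate's projects: an S3-triggered handler that reads the uploaded object and publishes a short summary to an SNS topic. The bucket layout, topic ARN, and validation logic are assumptions.

    # Hypothetical S3-triggered AWS Lambda handler (Python + boto3).
    import json
    import boto3

    s3 = boto3.client("s3")
    sns = boto3.client("sns")

    TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:data-events"  # placeholder ARN

    def lambda_handler(event, context):
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            # Read the newly uploaded object and count its non-empty lines.
            body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
            rows = [line for line in body.splitlines() if line.strip()]
            # Publish a small processing summary for downstream consumers.
            sns.publish(
                TopicArn=TOPIC_ARN,
                Message=json.dumps({"bucket": bucket, "key": key, "rows": len(rows)}),
            )
        return {"statusCode": 200}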
- Developed and maintained ETL pipelines in Snowflake using Snowpipe, SnowSQL, and third-party integration tools, ensuring efficient data ingestion, transformation, and loading from various source systems.
- Leveraged the Glue Data Catalog for metadata management, enabling seamless integration with other AWS services such as Amazon Redshift, Athena, and S3.
- Developed interactive dashboards and reports in Power BI to visualize key performance metrics and trends, enabling stakeholders to make data-driven decisions.
- Developed and deployed data pipelines in Azure Data Factory to orchestrate and automate data workflows, ensuring seamless integration across diverse relational and non-relational source systems.
- Developed Python and PySpark applications for data analysis, including PySpark code for AWS Glue jobs and EMR.
- Designed and maintained data warehousing solutions leveraging Azure Synapse, Azure Data Factory, Azure SQL Database, and Azure Blob Storage.
- Implemented CI/CD pipelines integrating Apache Kafka, Jenkins, Splunk, Maven, Docker, Kubernetes, Terraform, and Gradle for seamless deployment and monitoring of applications; used DataStage ETL within the pipelines for reliable, scalable deployment workflows.
- Implemented and configured Hadoop ecosystem components including HDFS, YARN, MapReduce, Apache Hive, Apache Spark, and HBase, ensuring proper integration and interoperability within the cluster environment.
- Developed Spark code in Scala and Spark SQL for faster testing and data processing, and wrote Spark programs in Scala and Python for data quality checks (see the sketch below).
- Performed ETL jobs to integrate data into HDFS using Informatica, and wrote Pig scripts to generate MapReduce jobs and perform ETL procedures on data in HDFS.
- Used Apache Airflow to design and manage complex data pipelines, orchestrating ETL tasks and data loading in a scalable, fault-tolerant manner; performed data extraction, ingestion, and processing of large datasets, along with data modeling and schema design, using Apache Spark, Flink, Apache Kafka, and Airflow DAGs.
- Wrote complex SQL scripts and PL/SQL packages to extract data from source tables of the data warehouse.
- Set up and maintained NoSQL databases (MongoDB, Cassandra, Elasticsearch, DynamoDB, HBase) and SQL databases (MySQL, PostgreSQL, SQL Server, Oracle, DB2, Amazon RDS, Google Cloud SQL, Snowflake).
- Worked with highly structured and semi-structured data sets of 45 TB (135 TB with a replication factor of 3).
- Designed and maintained data warehousing solutions leveraging AWS Redshift, AWS Glue, Amazon RDS (Relational Database Service), and Amazon S3 for storage.
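The Spark-based data quality checks mentioned in this role could look something like the minimal PySpark sketch below. It is an assumption about the kind of check involved; the input path, key column, and rules are hypothetical.

    # Hypothetical PySpark data-quality check: flag null keys and duplicate ids
    # before publishing a validated snapshot. Paths and columns are placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("dq-check").getOrCreate()

    df = spark.read.parquet("s3://example-bucket/staging/orders/")  # placeholder path

    null_keys = df.filter(F.col("order_id").isNull()).count()
    duplicates = df.groupBy("order_id").count().filter(F.col("count") > 1).count()

    if null_keys or duplicates:
        raise ValueError(f"DQ check failed: {null_keys} null keys, {duplicates} duplicate ids")

    # Only write the validated snapshot if both checks pass.
    df.write.mode("overwrite").parquet("s3://example-bucket/validated/orders/")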
PRAVEEN Technologies - Hyderabad, India    January 2019 - December 2021
Hadoop Developer/Data Engineer
- Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics).
- Ingested data into Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed the data in Azure Databricks.
- Implemented best practices for Terraform code organization, versioning, and documentation to facilitate collaboration and maintainability.
- Performed data extraction, ingestion, and processing of large datasets, along with data modeling and schema design, using Apache Spark, Apache Kafka, and Apache Airflow (DAGs) for efficient and scalable data engineering workflows.
- Led cloud migration initiatives for bank and financial data, transitioning legacy systems to cloud-based solutions while ensuring compliance and security; used Agile Scrum for efficient collaboration and timely delivery, with experience in both waterfall SDLC and Agile/Scrum methodologies.
- Managed and administered a diverse range of databases including MongoDB, Cassandra, DB2, Oracle, and SQL Server, ensuring optimal performance, security, and availability.
- Designed and implemented end-to-end ETL pipelines both on premises and on AWS, utilizing big data tools for processing and storage alongside Amazon S3 and EFS for scalable, reliable cloud-based storage.
- Designed and maintained data warehousing solutions leveraging AWS Redshift, AWS Glue, Amazon RDS (Relational Database Service), Amazon S3, and Aurora for storage.
- Implemented AWS Organizations for centralized management and governance of multiple AWS accounts; established Service Control Policies (SCPs) to enforce security and compliance requirements and used AWS Config for real-time monitoring and automated remediation of policy violations.
- Employed data masking, tokenization, and anonymization techniques to protect sensitive data elements within databases, ensuring compliance with regulatory requirements and minimizing exposure to unauthorized access.
- Integrated with AWS CloudWatch for monitoring and logging, ensuring reliable performance and operational visibility.
- Employed AWS SAM (Serverless Application Model) for streamlined deployment and management of serverless applications, optimizing resource utilization and minimizing operational overhead.
- Assumed full-stack ownership, consistently delivering production-ready, testable code; led the end-to-end product lifecycle, including design, development, testing, deployment, and maintenance; conducted code reviews to enforce best practices and development standards in the AWS cloud.
- Implemented robust security measures, including IAM roles, fine-grained access control policies, and encryption mechanisms such as AWS KMS, to safeguard sensitive data stored within the AWS environment.
- Hands-on experience with Git, GitLab, AWS serverless technologies (Lambda, API Gateway, Step Functions, S3, SQS, SNS), containerized workloads (Kubernetes), and Jenkins orchestration, contributing to streamlined development workflows and enhanced deployment processes in real-world projects.
- Extensive hands-on experience leveraging Python within AWS environments to architect, develop, and optimize cloud solutions, including automation, infrastructure provisioning, serverless application development, data processing, and AWS service integration (illustrated in the sketch below).
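As an illustration of the kind of Python-based AWS automation described in the bullet above, here is a small hypothetical boto3 sketch that audits S3 buckets for a default server-side encryption configuration. It is a sketch under assumed requirements, not a script from these projects.

    # Hypothetical boto3 automation: list S3 buckets that lack a default
    # server-side encryption configuration.
    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")

    def unencrypted_buckets():
        """Return names of buckets without a default encryption configuration."""
        missing = []
        for bucket in s3.list_buckets()["Buckets"]:
            name = bucket["Name"]
            try:
                s3.get_bucket_encryption(Bucket=name)
            except ClientError as err:
                code = err.response["Error"]["Code"]
                if code == "ServerSideEncryptionConfigurationNotFoundError":
                    missing.append(name)
                else:
                    raise
        return missing

    if __name__ == "__main__":
        print("Buckets without default encryption:", unencrypted_buckets())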
- Drove AWS CloudFormation adoption to automate infrastructure, facilitating rapid cloud program acceleration.
- Built cloud-native services with AWS Lambda, API Gateway, DynamoDB, and Aurora, empowering teams; provided expert support for AWS product utilization, enhancing internal customer experiences.
- Managed end-to-end data processing tasks in Python using key libraries such as pandas, NumPy, and Matplotlib, applying machine learning techniques and conducting PCA analysis (see the illustrative sketch at the end of this resume).
- Set up a Jenkins server and build jobs to provide continuous automated builds, polling the Git source control system during the day with periodic scheduled builds overnight, using DevOps tools such as Jenkins, Git, JUnit, and Maven.
- Worked with an Agile development team to deliver an end-to-end continuous integration/continuous delivery (CI/CD) product in an open-source environment using Jenkins.
- Developed and maintained Glue jobs, crawlers, and workflows to automate data ingestion and processing pipelines, reducing manual effort and improving data reliability.
- Integrated Amazon Aurora with other AWS services such as AWS Lambda, AWS Glue, and Amazon S3 to build end-to-end data pipelines and analytical solutions.
- Utilized AWS and Azure services such as Amazon EMR, ECS, Glue, Athena, Databricks, Data Factory, Azure Functions, and Airflow, along with Jupyter notebooks, to address diverse data engineering challenges.

CERTIFICATIONS
- AWS Certified Solutions Architect - Associate
- Microsoft Certified: Azure Data Engineer Associate
- AWS Cloud Technical Essentials
- Python - Data Structures - Machine Learning
- Databases and SQL with Python

SKILLS
Programming Languages: Python, Java, C++, SQL, HQL, R, Go, PySpark, Scala
Hadoop: HDFS, MapReduce, Hive, Zookeeper, YARN, HBase, Sqoop, Hortonworks, Cloudera
Cloud: Microsoft Azure, Amazon Web Services
Web Technologies: HTML, CSS, JavaScript, Bootstrap, jQuery, JSON, Snowflake, Agile, Scrum
Databases: MySQL, DB2, PostgreSQL, SQL Server, Oracle; NoSQL (HBase, MongoDB, Cassandra)
Tools: Git, Bitbucket, JIRA, Confluence, Postman, Kafka, Tableau, Power BI, Informatica, Flink, Terraform, Docker, Kubernetes, Airflow, Elasticsearch, Jupyter Notebooks, Jenkins, Maven, Gradle, GitLab, Selenium
Other: NumPy, Pandas, Matplotlib, Machine Learning, Unix Shell Scripting, Databricks, DataStage, Linux

EDUCATION
Chaitanya Institute of Technology and Science, Warangal, India
Bachelor of Technology (Electronics and Communication Engineering)
CGPA: 3.5
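As referenced under the PRAVEEN Technologies role above, the pandas/NumPy analysis and PCA work can be illustrated with a small, self-contained sketch. The data below is synthetic and the covariance-based PCA is only an assumed approach, not the candidate's actual analysis.

    # Hypothetical pandas/NumPy sketch: PCA via covariance eigendecomposition
    # on a synthetic DataFrame.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    df = pd.DataFrame(rng.normal(size=(200, 4)), columns=["f1", "f2", "f3", "f4"])

    # Standardize the features, then diagonalize the covariance matrix.
    X = ((df - df.mean()) / df.std(ddof=0)).to_numpy()
    eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    explained = eigvals[order] / eigvals.sum()

    # Project onto the top two principal components.
    components = X @ eigvecs[:, order[:2]]
    print("Explained variance ratio:", np.round(explained, 3))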
