Candidate's Name
PHONE NUMBER AVAILABLE | LinkedIn | EMAIL AVAILABLE | Senior Big Data Engineer

PROFESSIONAL SUMMARY:
Senior Big Data Engineer with 9+ years of experience navigating the complexities of big data solutions and analytics.
Began professional journey as a SQL Developer at Optimum Info System Pvt Ltd, leveraging Oracle Database, PL/SQL, and Informatica PowerCenter for data management.
Utilized Apache Kafka for real-time data streaming and R for statistical analysis, enhancing data processing capabilities early in my career.
Adopted Jenkins and Docker for continuous integration and deployment, significantly improving the efficiency and reliability of development workflows.
Expert in Tableau and Power BI for data visualization, providing actionable insights through interactive dashboards and reports that support data-driven decision-making.
Transitioned to Infrasoft Tech as a Data Analyst, honing skills in Microsoft Excel and PostgreSQL for advanced data analysis.
Mastered Python (pandas, NumPy) for data manipulation and analysis and Apache Hive for data warehousing, expanding my data analytics capabilities.
Employed Tableau and QlikView at Infrasoft Tech for sophisticated data visualization and dashboard creation, driving insights across business units.
Integrated GitLab for version control, ensuring robust code management and collaboration within data analysis projects.
As an ETL Engineer at Fiserv, advanced my skills in Informatica and AWS Data Pipeline, streamlining data integration and transformation processes.
Utilized AWS Glue, Redshift, and Apache Airflow for scalable, efficient data warehousing and orchestration, supporting financial technology services.
Implemented Docker and Terraform for containerization and infrastructure as code, optimizing deployment and management of data pipelines.
At Signify Health, as a Big Data Engineer, leveraged Apache Spark and AWS EMR for high-performance data processing and analysis.
Employed AWS Glue and Redshift for seamless ETL operations and data warehousing, facilitating advanced healthcare analytics and reporting.
Utilized Amazon S3 and DynamoDB for scalable storage solutions, ensuring data availability and durability in healthcare applications.
Integrated Databricks for collaborative data science and machine learning projects, accelerating innovation and insights in the healthcare domain.
Currently, as a Senior Big Data Engineer at Deutsche Bank, specialize in Azure and Databricks for cloud-based data solutions.
Expert in the delta architecture for managing and analyzing real-time data streams, enhancing financial data processing and analysis.
Utilize MongoDB and Neo4j for complex data storage, supporting innovative banking applications with graph databases and NoSQL solutions.
Implemented Prometheus for monitoring system performance, ensuring reliability in banking data operations, and leveraged DataRobot for predictive analytics and AI-driven decision-making.
Advocated for CI/CD with GitLab, improving code quality and deployment efficiency in big data projects.
Specialize in data streaming and ETL operations, leveraging technologies such as Apache Kafka and Azure Data Factory for real-time data integration.
Committed to agile methodologies, leading cross-functional teams for efficient and adaptable project delivery.
Known for deep technical expertise and strategic vision, driving innovation and excellence in big data engineering.
Directed end-to-end project execution, including planning, design, development, and deployment, ensuring successful delivery of engineering solutions.

TECHNICAL SKILLS:
Programming Languages: Python, SQL, R, Java, Scala, T-SQL, Shell Scripting
Databases: Oracle Database, PostgreSQL, MongoDB, Neo4j, Snowflake, SQL Server, Azure SQL, Netezza, MySQL
Data Processing: Apache Spark, Hadoop, Apache Kafka, AWS EMR, Databricks, MapReduce, PySpark
ETL Tools: Informatica PowerCenter, Ab Initio, dbt, SSIS, SSRS, Talend, AWS Glue, Apache Airflow, Informatica Axon
Cloud Platforms: AWS, GCP, Azure, ADLS, ADF
Data Visualization: Tableau, QlikView, Power BI, Looker
DevOps Tools: Jenkins, Docker, Terraform, GitLab, Prometheus, Splunk, CI/CD, Kubernetes
Big Data Technologies: Apache Hive, Redshift, Hadoop YARN, AWS Lake Formation, Flink, HBase
Other Skills: Analytics, Data Modelling, Visualization, Data Streaming, Data Governance, Data Warehousing, Parquet, Agile Methodologies, Metadata Management, Linux/Unix, EDC, Data Quality, Troubleshooting, Data Privacy Management

PROFESSIONAL EXPERIENCE:

Client: Deutsche Bank, New York, NY    Sep 2022 to present
Role: Senior Big Data Engineer
Roles & Responsibilities:
Led real-time data streaming using Azure, Apache Kafka, and ADLS for financial transaction analysis.
Implemented Azure Databricks for scalable big data processing and advanced analytics in banking.
Used Delta Lake for ACID transactions on Spark to ensure data integrity, and deployed scalable data pipelines with Azure Data Factory for efficient ETL processes.
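A minimal sketch of the Delta Lake pattern referenced above: landing a batch of transaction records from ADLS and applying an ACID upsert with MERGE. The storage paths, table layout, and column names are hypothetical placeholders, and it assumes a Spark session with the delta-spark package available (as on Databricks).

```python
# Illustrative only: ACID upsert of transaction records into a Delta table on ADLS.
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.appName("txn-upsert").getOrCreate()

# Hypothetical landing and curated locations on ADLS Gen2.
SOURCE_PATH = "abfss://landing@examplestorage.dfs.core.windows.net/transactions/"
DELTA_PATH = "abfss://curated@examplestorage.dfs.core.windows.net/delta/transactions"

updates = spark.read.format("json").load(SOURCE_PATH)

if DeltaTable.isDeltaTable(spark, DELTA_PATH):
    target = DeltaTable.forPath(spark, DELTA_PATH)
    # MERGE performs an atomic upsert keyed on the transaction id.
    (target.alias("t")
           .merge(updates.alias("s"), "t.txn_id = s.txn_id")
           .whenMatchedUpdateAll()
           .whenNotMatchedInsertAll()
           .execute())
else:
    # First load: create the Delta table from the initial batch.
    updates.write.format("delta").mode("overwrite").save(DELTA_PATH)
```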
Integrated MongoDB for scalable document storage and used Neo4j to model financial networks, enhancing fraud detection and risk management in banking applications.
Integrated the Scala API for efficient data handling and utilized Snowflake tools for cloud data warehousing in banking sector applications.
Advanced analytics and AI integration using DataRobot, applying machine learning models for predictive insights into banking operations.
Integrated Apache Spark for efficient data analysis, used GitLab CI/CD for reliable data solutions, and deployed Prometheus for monitoring the performance of big data applications.
Led SQL implementation for financial reporting, developed data governance frameworks for quality and compliance, and used agile methodologies to deliver banking projects efficiently.
Utilized Azure Synapse Analytics for unified data analytics, combining big data and data warehousing capabilities to drive insights and efficiency in banking sector operations.
Loaded data into data warehouses (e.g., Snowflake, Azure SQL) using Talend, enabling advanced analytics and reporting.
Designed and maintained scalable dbt data models, ensuring performance across business units, and mentored the team on dbt best practices and advanced techniques, fostering continuous learning.
Work with a team of 8 members, following agile methodologies within the Azure cloud framework for project management and execution.
Integrated Azure Data Lake Storage (ADLS) for secure data storage, optimizing for scalability and access in cloud-based big data platforms.
Used the Parquet format to store and query large datasets efficiently, reducing costs and enhancing performance, and designed data architectures around Parquet for scalable, high-performance analytics (see the sketch at the end of this role).
Enhanced data-driven decisions with advanced analytics, customized data visualization tools for executive reporting, and implemented secure, scalable microservices with Docker for banking applications.
Enabled sophisticated data mining and pattern recognition, employing Python for scripting and automation of complex data processing tasks.
Streamlined DevOps for productivity, used Python and Java for data mining and automated processing, and optimized financial models with pandas and NumPy for banking sector analysis.
Promoted continuous learning and innovation, engaged teams with new technologies, and managed big data security and compliance to protect sensitive financial information.
Utilized metadata management tools to enhance data quality and consistency, ensuring that data assets are accurate, reliable, and accessible to stakeholders.
Led the transition to Azure for enhanced data capabilities, ensuring high data quality and fostering IT-business collaboration toward shared goals.
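A short sketch of the Parquet layout mentioned above: curated data is written as date-partitioned Parquet so downstream queries can prune partitions. Column names, dates, and storage paths are hypothetical, and it assumes the curated Delta source from the previous sketch.

```python
# Illustrative only: writing curated banking data as date-partitioned Parquet.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("parquet-layout").getOrCreate()

df = spark.read.format("delta").load(
    "abfss://curated@examplestorage.dfs.core.windows.net/delta/transactions")

(df.withColumn("txn_date", F.to_date("txn_ts"))
   .repartition("txn_date")                 # group records per date before writing
   .write.mode("overwrite")
   .partitionBy("txn_date")                 # enables partition pruning on reads
   .parquet("abfss://analytics@examplestorage.dfs.core.windows.net/parquet/transactions"))

# Queries filtering on txn_date scan only the matching partitions.
daily = (spark.read
              .parquet("abfss://analytics@examplestorage.dfs.core.windows.net/parquet/transactions")
              .where(F.col("txn_date") == "2024-01-31"))
```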

Client: Signify Health, Dallas, TX    Mar 2020 to Aug 2022
Role: Big Data Engineer
Roles & Responsibilities:
Designed and implemented scalable ETL pipelines using Apache Airflow, streamlining the data preparation process for healthcare analytics (see the orchestration sketch after this list).
Utilized AWS Glue for serverless data integration, simplifying the ETL process and enhancing the scalability and manageability of healthcare data.
Leveraged Apache Spark on AWS EMR for efficient processing of large healthcare datasets, enabling advanced analytics and research.
Managed secure, scalable data storage with Amazon S3 for critical healthcare data infrastructure.
Implemented AWS Redshift for data warehousing, enabling fast, scalable analysis of healthcare data across the organization.
Deployed Apache Kafka for real-time data streaming in healthcare monitoring, enabling timely data analysis, and integrated Apache Flink with Hadoop and Kafka for seamless data processing.
Set up monitoring and alerting to ensure reliability, identify bottlenecks, and resolve issues proactively.
Optimized SSIS data flows for performance, reducing processing time for large-scale operations.
Configured SSRS role-based security to manage access and ensure data privacy compliance.
Utilized Terraform and Informatica tools to implement advanced cloud-native ETL solutions, optimizing data integration and management workflows.
Orchestrated AWS Step Functions for automated, efficient data workflows in complex processing tasks.
Used AWS DynamoDB for high-performance NoSQL storage and integrated AWS Lake Formation to build secure data lakes, enhancing data security and governance in healthcare applications.
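A condensed sketch of the kind of Airflow orchestration described above: a daily DAG that triggers an AWS Glue job and then checks that curated output landed in S3. The DAG, job, and bucket names are hypothetical, and a production pipeline would poll the Glue run (or use an operator/sensor from the Amazon provider) rather than fire and forget.

```python
# Illustrative only: daily DAG that starts a Glue job, then validates S3 output.
from datetime import datetime
import boto3
from airflow import DAG
from airflow.operators.python import PythonOperator

def start_glue_job():
    glue = boto3.client("glue")
    run = glue.start_job_run(JobName="claims_etl_job")  # hypothetical Glue job
    return run["JobRunId"]

def check_output_exists():
    s3 = boto3.client("s3")
    resp = s3.list_objects_v2(Bucket="example-healthcare-curated",
                              Prefix="claims/curated/")
    if resp.get("KeyCount", 0) == 0:
        raise ValueError("No curated claims output found")

with DAG(
    dag_id="claims_etl",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_transform = PythonOperator(task_id="start_glue_job",
                                       python_callable=start_glue_job)
    validate = PythonOperator(task_id="check_output",
                              python_callable=check_output_exists)
    extract_transform >> validate
```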
Developed analytics solutions with Databricks, fostering collaborative data science and machine learning projects to drive healthcare innovations. Managed Hadoop YARN for optimal resource allocation in big data processing and used Java for scripting and automation to enhance healthcare data processing and analysis.
Implemented secure data pipelines, ensuring patient data privacy and security.
Enhanced real-time analytics with AWS analytics services and enabled data-driven decisions with SQL for complex healthcare data analysis.
Led agile cross-functional teams for rapid healthcare data solution development and used GitLab CI/CD for enhanced data reliability through automated deployments.
Optimized healthcare data workflows with Docker and Kubernetes for scalability, transitioned to AWS for enhanced data capabilities, and ensured data accuracy with automated quality assessments.

Client: Fiserv, Brookfield, WI    Nov 2017 to Feb 2020
Role: ETL Engineer
Roles & Responsibilities:
Designed automated ETL workflows with Informatica, streamlining financial data integration.
Used AWS Data Pipeline for efficient, reliable data transport and transformation across cloud platforms.
Designed complex ETL processes with Talend, and developed and optimized ETL processes using Ab Initio and T-SQL, ensuring efficient data integration, transformation, and robust data quality across diverse data sources.
Designed end-to-end ETL solutions using GCP services such as Dataflow, BigQuery, and Cloud Storage (see the load sketch below).
Optimized ETL job performance, enhancing efficiency and reducing processing times.
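A brief sketch of the BigQuery side of such a GCP pipeline: loading cleansed extracts staged in Cloud Storage into a warehouse table. The project, dataset, table, and bucket names are hypothetical placeholders.

```python
# Illustrative only: load staged CSV extracts from GCS into a BigQuery table.
from google.cloud import bigquery

client = bigquery.Client(project="example-fintech-project")  # hypothetical project

table_id = "example-fintech-project.finance_dw.daily_settlements"
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,                                   # infer schema from the header
    write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE,
)

load_job = client.load_table_from_uri(
    "gs://example-etl-staging/settlements/2019-12-01/*.csv",
    table_id,
    job_config=job_config,
)
load_job.result()  # block until the load job completes

print(f"Loaded {client.get_table(table_id).num_rows} rows into {table_id}")
```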
Performed data integration for data migration and transformation.
Leveraged Apache Kafka for real-time data ingestion, enabling timely financial and customer data analysis (an illustrative producer sketch follows this list).
Deployed AWS Glue for serverless ETL, simplifying the preparation of large financial datasets.
Orchestrated data workflows with Apache Airflow, enhancing ETL automation and monitoring.
Managed analytical databases using AWS Redshift, providing scalable and cost-efficient data warehousing solutions for financial analytics.
Used Docker for containerizing ETL applications, ensuring consistency across environments.
Automated AWS resource management for data pipelines using Terraform infrastructure as code.
Configured AWS EMR for scalable big data analysis and integrated Hadoop into pipelines for distributed storage and analysis of large volumes of financial data.
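A minimal sketch of real-time ingestion into Kafka using the kafka-python client; the brokers, topic, and event fields are hypothetical.

```python
# Illustrative only: publish transaction events to Kafka as JSON for downstream ETL.
import json
from datetime import datetime, timezone
from kafka import KafkaProducer  # kafka-python client

producer = KafkaProducer(
    bootstrap_servers=["broker1:9092", "broker2:9092"],   # hypothetical brokers
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    acks="all",   # wait for full acknowledgement before a send is considered done
)

event = {
    "txn_id": "T-1001",
    "account": "A-42",
    "amount": 125.50,
    "ts": datetime.now(timezone.utc).isoformat(),
}

# Keying by account keeps one account's events in a single partition (preserves ordering).
producer.send("payments.transactions", key=b"A-42", value=event)
producer.flush()   # ensure buffered messages are delivered before exiting
```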
Applied GCP security best practices in ETL to protect data and ensure compliance.
Used Google Cloud Storage for efficient data staging and management before and after ETL processes.
Configured and maintained ETL workflows on Linux-based systems, ensuring reliable and efficient data processing and integration across diverse environments.
Adopted DevOps practices for continuous integration and deployment, utilizing GitLab to improve the collaboration and efficiency of development teams.
Integrated Google Cloud Pub/Sub for real-time data ingestion and processing, and used Google Cloud Dataproc with Hadoop and Spark to enhance ETL workflows and performance (see the Pub/Sub sketch after this list).
Developed and optimized shell scripts for automating ETL tasks on Linux, improving operational efficiency and reducing manual intervention in data pipelines.
Implemented robust data governance and quality controls, ensuring accuracy and integrity of financial data through comprehensive validation processes.
Led agile teams to deliver ETL solutions efficiently.
Developed custom data models for financial analytics, enabling insights into customer behavior and market trends.
Enhanced data security, applying best practices to protect sensitive financial information.
Optimized ETL processes and databases, improving resource usage and reducing processing times.
Collaborated with business analysts and data scientists, providing data integration support for predictive analytics and machine learning projects.
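A small sketch of publishing ETL events to Google Cloud Pub/Sub for real-time processing; the project, topic, and message contents are hypothetical.

```python
# Illustrative only: publish a staged-file notification to a Pub/Sub topic.
import json
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("example-fintech-project", "etl-events")  # hypothetical

record = {"file": "settlements/2019-12-01/part-000.csv", "rows": 18423, "status": "staged"}

# Pub/Sub payloads are bytes; attributes carry lightweight routing metadata.
future = publisher.publish(
    topic_path,
    data=json.dumps(record).encode("utf-8"),
    source="gcs-staging",
)
print("Published message id:", future.result())  # blocks until the publish is acked
```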
Client: Infrasoft Tech, Pune, India    Feb 2016 to Jun 2017
Role: Data Analyst
Roles & Responsibilities:
Analyzed financial data with Excel using advanced formulas and pivot tables to identify trends. Managed databases with PostgreSQL, optimizing storage and retrieval for analytics. Employed Python (pandas, NumPy) for data manipulation, enhancing accuracy in complex datasets.
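A minimal sketch of the kind of pandas/NumPy cleanup and aggregation used on financial datasets before reporting; the input file and column names are hypothetical.

```python
# Illustrative only: clean a transactions extract and build a monthly branch summary.
import numpy as np
import pandas as pd

df = pd.read_csv("transactions.csv", parse_dates=["txn_date"])  # hypothetical extract

# Standardize and clean: drop exact duplicates, normalize text, coerce amounts.
df = df.drop_duplicates()
df["branch"] = df["branch"].str.strip().str.upper()
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
df = df.dropna(subset=["amount"])

# Flag unusually large transactions for review (simple z-score rule).
df["amount_z"] = (df["amount"] - df["amount"].mean()) / df["amount"].std()
df["outlier"] = np.abs(df["amount_z"]) > 3

# Monthly totals per branch, pivoted into a report-style view.
monthly = (df.assign(month=df["txn_date"].dt.to_period("M"))
             .pivot_table(index="month", columns="branch",
                          values="amount", aggfunc="sum"))
print(monthly.head())
```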
Developed ETL workflows with Apache Hive for financial reporting. Created Tableau dashboards for data visualization. Used GitLab for version control, enhancing team collaboration and code management.
Used QlikView for advanced data visualization, delivering interactive insights into financial performance. Integrated diverse data sources into a centralized warehouse, ensuring consistent and reliable analysis.
Automated data cleansing with Python, improving data quality. Conducted SQL queries for financial data analysis. Developed documentation for reproducibility of data analysis processes and findings.
Utilized Unix commands and shell scripts to process and analyze large datasets, automating data extraction and transformation tasks to enhance efficiency and accuracy.
Used Power BI's advanced visualization tools to create detailed reports and charts, enhancing data clarity. Improved report performance with efficient DAX formulas and optimized queries, boosting load times and user experience. Collaborated with stakeholders to align data analysis with business objectives. Optimized PostgreSQL storage and management, implementing best practices for effective database design and maintenance.
Used R for statistical analysis, deriving insights from financial data to guide business strategy.
Led data analysis projects with agile practices, ensuring timely and successful completion.
Implemented data governance practices, ensuring the accuracy, privacy, and security of financial data.
Conducted training on data analysis tools, boosting team capabilities.
Stayed current with data analytics and financial technology trends, applying innovative approaches to enhance analysis projects.

Client: Optimum Info System Pvt Ltd., Chennai, India    May 2014 to Jan 2016
Role: SQL Developer
Roles & Responsibilities:
Developed efficient SQL queries and procedures using Oracle Database, optimizing data access and manipulation for business applications (an illustrative query sketch follows this list).
Used PL/SQL for advanced database programming, automating complex tasks and enhancing functionality.
Implemented Informatica PowerCenter workflows, streamlining ETL processes for business intelligence.
Used Apache Kafka for real-time data streaming, enhancing dynamic analysis and reporting.
Employed Jenkins for continuous integration, automating the build and deployment of data-driven applications.
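A brief sketch of parameterized Oracle access with bind variables and a PL/SQL procedure call. It uses the python-oracledb driver purely for illustration; the schema, procedure, credentials, and DSN are hypothetical.

```python
# Illustrative only: bind-variable query plus a PL/SQL procedure call against Oracle.
import datetime
import oracledb

conn = oracledb.connect(user="app_user", password="***",
                        dsn="dbhost.example.com/ORCLPDB1")  # hypothetical DSN
cur = conn.cursor()

# Bind variables keep execution plans reusable and prevent SQL injection.
cur.execute(
    """SELECT order_id, amount
         FROM orders
        WHERE status = :status AND order_date >= :since""",
    {"status": "OPEN", "since": datetime.date(2015, 1, 1)},
)
for order_id, amount in cur.fetchall():
    print(order_id, amount)

# Invoke a (hypothetical) PL/SQL procedure that archives closed orders.
cur.callproc("orders_pkg.archive_closed", [90])  # keep the last 90 days
conn.commit()
cur.close()
conn.close()
```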
Used Docker for consistent application containerization across environments. Created interactive Tableau dashboards for data visualization. Conducted analytics with R, delivering insights into business performance and operational efficiency.
Created dynamic Power BI dashboards to visualize key metrics and support data-driven decisions. Built and optimized data models, integrating multiple sources for comprehensive reporting and insights. Automated ETL tasks with Jenkins, boosting data workflow speed and reliability. Worked with cross-functional teams to align data solutions with business requirements.
Optimized Oracle Database performance for high availability and speed. Created and maintained documentation for data management, ensuring best practices and effective knowledge sharing.
Participated in agile project management processes, contributing to the efficient delivery of data projects.
Implemented data security measures to protect sensitive information and comply with privacy regulations.
Committed to continuous learning, staying current with data technologies and methodologies.

Education: Bachelor of Technology (B.Tech) in Information Technology, JNTUK, Kakinada, Andhra Pradesh, India - 2014.