Name: Candidate's Name Pattem
Senior Data Engineer
Email: EMAIL AVAILABLE
Mobile: PHONE NUMBER AVAILABLE
LinkedIn ID: https://LINKEDIN LINK AVAILABLE
PROFESSIONAL SUMMARY:
- Data Engineer with over 5 years of experience specializing in designing and developing advanced data engineering solutions.
- Proficient in developing scalable data architectures with Python, enhancing automation and processing efficiency.
- Expert in SQL programming, optimizing complex data queries and integrations across multiple database platforms.
- Skilled in Oracle Database configuration and management, ensuring robust, secure data storage solutions.
- Advanced user of PL/SQL for writing complex database scripts that streamline data manipulation and reporting.
- Utilized Informatica PowerCenter to facilitate data integration and transformation across enterprise systems.
- Experienced with Apache Kafka for building real-time data streaming applications in high-throughput environments.
- Developed continuous integration and deployment pipelines using Jenkins, improving code integration and delivery.
- Leveraged Docker to create isolated environments, enhancing application portability and scaling.
- Integrated R for statistical computing and graphics, providing deep insights through advanced analytics.
- Employed Tableau for creating interactive and shareable dashboards, enhancing business decision-making processes.
- Mastered Python for scripting and automation, making routine tasks efficient and error-free.
- Utilized SSIS for data extraction, transformation, and loading (ETL), improving data workflow and productivity.
- Managed databases and warehousing solutions using Amazon Redshift, ensuring efficient data storage and retrieval.
- Configured MySQL databases for web applications, ensuring efficient data handling and storage.
- Experienced in version control management with Git, maintaining source code integrity across teams.
- Designed and deployed data workflows using Apache Airflow, orchestrating complex data processing pipelines (see the sketch following this summary).
- Utilized PySpark for processing large datasets, enhancing data manipulation and analysis capabilities.
- Developed data solutions with Apache Hadoop, managing and processing big data with improved efficiency.
- Leveraged Apache Spark for in-memory data processing to speed up analytics applications.
- Managed cloud data services and orchestration using Azure Data Factory (ADF) and Azure SQL Database.
- Implemented data storage and analytics solutions with Azure Cosmos DB, optimizing performance and scalability.
- Engineered ETL processes with Talend, enhancing data integration and transformation capabilities.
- Created dynamic machine learning models using PyTorch, advancing predictive analytics and insights.
- Administered AWS S3 for cloud-based data storage solutions, optimizing durability and access speed.
- Designed and maintained large-scale data warehouses using Snowflake, improving data querying and report generation.
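The Airflow orchestration experience summarized above can be illustrated with a minimal sketch of a daily ETL DAG. The DAG id, task names, and schedule below are hypothetical placeholders, not taken from any project described in this resume.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    """Placeholder: pull raw records from a source system."""
    ...


def load():
    """Placeholder: load transformed records into a warehouse."""
    ...


with DAG(
    dag_id="example_daily_etl",      # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> load_task        # run extract before load
```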
TECHNICAL SKILLS:
Programming Languages: Python, SQL, PL/SQL, R
Databases: Oracle Database, Snowflake, MySQL, Amazon Redshift, Azure SQL Database, Azure Cosmos DB, Teradata
Data Processing/ETL: Apache Hadoop, Apache Spark, Apache Kafka, Informatica PowerCenter, SSIS, AWS Glue, Talend, Apache Airflow, Databricks
Visualization Tools: Tableau, AWS QuickSight
DevOps Tools: Jenkins, Docker, Terraform, Git
Cloud Technologies: ADLS, ADF, AWS (S3, DynamoDB, EMR), Azure, GCP
Machine Learning: PyTorch
File Formats Management: Handling various file formats

PROFESSIONAL EXPERIENCE:

Client: Novant Health, Charlotte, NC    September 2023 to Present
Role: Data Engineer
Roles & Responsibilities:
- Developed collaborative analytics solutions utilizing Python and Pandas, enhancing data-driven decision-making processes.
- Managed real-time data streaming architectures with Apache Kafka, optimizing the responsiveness of patient data processing and clinical decision support systems (see the sketch following this section).
- Orchestrated the deployment and management of data pipelines on AWS EMR, enhancing scalability and performance.
- Leveraged Airflow and Databricks to construct and maintain robust ETL workflows across diverse healthcare datasets.
- Implemented large-scale data storage solutions using AWS S3 and Snowflake, improving data accessibility and efficiency.
- Automated data transformation processes within AWS Glue, streamlining data integration and enhancing reliability.
- Utilized SQL for complex data querying, ensuring accurate and timely data retrieval for analysis.
- Deployed collaborative analytics projects with DBT, improving team productivity and project deliverables.
- Integrated MongoDB for document-oriented database solutions, optimizing data storage and retrieval operations.
- Managed data pipelines using Kubeflow, facilitating scalable machine learning workflows in production environments.
- Configured and maintained robust data processing systems using Teradata, enhancing data warehousing capabilities.
- Developed data streaming solutions with Apache Kafka, facilitating real-time data analysis and reporting.
- Optimized data extraction and loading processes using Airflow, improving efficiency and reducing processing times.
- Implemented and managed CI/CD pipelines to automate the deployment of data processing scripts and workflows in the healthcare data environment, ensuring continuous integration.
- Engineered complex ETL pipelines using Snowflake, streamlining data transformations and consolidations.
- Optimized and maintained data integration processes using DBT, ensuring consistency and quality of data.
- Deployed data visualization projects using collaborative analytics tools, enabling better insights and reporting.
- Managed AWS EMR clusters for processing large-scale data, ensuring efficient resource utilization and performance.
- Configured and utilized MongoDB to manage unstructured data effectively, improving flexibility in data handling.
- Streamlined data loading and transformation tasks using Airflow, enhancing workflow automation and monitoring.
- Utilized Databricks for developing and executing scalable data analytics projects, improving output and efficiency.
- Developed automation scripts in Python, reducing redundancy and speeding up data processing tasks.
- Managed and optimized the use of AWS S3 for data storage, ensuring secure and cost-effective data management.
Environment: Python, Pandas, Apache Kafka, AWS EMR, ETL, DBT, Databricks, SQL, Snowflake, AWS Glue, MongoDB, Kubeflow, Teradata, and AWS S3.
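As a hedged illustration of the Kafka-based streaming work referenced above, the following is a minimal Python consumer sketch using the kafka-python client. The topic name, broker address, and consumer group are assumptions for illustration only, not details from the engagement.

```python
import json

from kafka import KafkaConsumer

# Broker address, topic, and group id below are illustrative placeholders.
consumer = KafkaConsumer(
    "patient-events",
    bootstrap_servers=["localhost:9092"],
    group_id="clinical-analytics",
    auto_offset_reset="earliest",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # Downstream steps (e.g., landing the event in S3 or Snowflake) would go here.
    print(event)
```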
Client: Huntington Bank, Columbus, Ohio    March 2023 to August 2023
Role: Data Engineer
Roles & Responsibilities:
- Designed and deployed ETL workflows integrating Apache NiFi and Informatica to support wealth management services.
- Configured Azure Blob Storage for secure and scalable data storage, accommodating large volumes of banking transaction data and customer information (see the sketch following this section).
- Managed high-throughput NoSQL data with Azure Cosmos DB, enhancing performance and scalability of database operations for financial transactions and customer profiles.
- Developed machine learning pipelines using PyTorch, enabling predictive analytics for financial trend analysis.
- Implemented real-time data visualization tools with Power BI, facilitating dynamic financial reporting and business intelligence.
- Orchestrated automation of data pipeline deployments using Terraform, improving infrastructure-as-code practices.
- Utilized Docker containers to create consistent development environments, streamlining project deployments.
- Leveraged Azure Data Factory for managing and automating ETL processes, enhancing data integration and transformation from various banking data sources.
- Managed data orchestration and workflow automation using Apache NiFi, optimizing data flow and processing.
- Developed financial reporting solutions with Tableau, providing insights and supporting decision-making processes.
- Enhanced project management and tracking using Jira, improving team collaboration and task accountability.
- Implemented agile methodologies to accelerate project delivery times and improve response to business needs.
- Configured and managed version control systems using Git, ensuring code integrity and supporting team collaboration.
- Streamlined data loading processes using Informatica, enhancing the efficiency and reliability of data transfers.
- Deployed data transformation and processing solutions using Azure Cosmos DB and Azure Data Factory to streamline financial data workflows.
- Automated infrastructure deployments using Terraform and Docker, reducing manual efforts and enhancing reproducibility.
- Developed and maintained operational dashboards using Power BI and Azure Synapse Analytics, providing real-time insights into banking operations and performance metrics.
- Utilized Python and SQL for scripting and query optimization, improving data manipulation and retrieval processes.
- Managed file storage and data backups using Azure Blob Storage, ensuring data safety, compliance, and high availability for critical banking data.
- Implemented continuous integration and delivery pipelines using Jenkins, enhancing deployment efficiency and reliability.
- Automated data quality checks and performance tuning using Python scripts, maintaining high standards of data integrity.
- Facilitated team collaboration and project management in an agile environment, using Jira for effective tracking and reporting.
- Engineered and maintained Parquet file formats for efficient data storage and retrieval in analytics applications.
Environment: Apache NiFi, Azure Blob Storage, Azure Cosmos DB, Power BI, Azure Data Factory, Informatica, AWS S3, AWS DynamoDB, PyTorch, AWS QuickSight, Terraform, Docker, AWS Glue, Tableau, Jira, Agile, Git, Informatica PowerCenter, Jenkins, Python, SQL
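A minimal sketch of the Azure Blob Storage loading work described above, using the azure-storage-blob SDK for Python. The connection string, container name, file name, and blob path are placeholders assumed for illustration; real values would come from secure configuration.

```python
from azure.storage.blob import BlobServiceClient

# The connection string and container name are illustrative placeholders;
# in practice these would be read from secure configuration (e.g., Key Vault).
service = BlobServiceClient.from_connection_string("<connection-string>")
container = service.get_container_client("transaction-data")

# Upload a local Parquet file to a date-partitioned blob path.
with open("daily_transactions.parquet", "rb") as data:
    container.upload_blob(
        name="2023/08/01/daily_transactions.parquet",
        data=data,
        overwrite=True,
    )
```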
Client: KPIT Technologies, Hyderabad, India    Jan 2021 to Oct 2022
Role: Data Solutions Architect
Roles & Responsibilities:
- Developed complex SQL queries for transactional data analysis and reporting, significantly enhancing data-driven decision-making.
- Managed and optimized ETL processes using Google Cloud Dataflow and Apache Airflow, improving workflow efficiency and data integration.
- Created dynamic business intelligence dashboards in Power BI, enabling real-time financial monitoring and analysis.
- Maintained and enhanced data warehousing solutions in Amazon Redshift, improving performance and data retrieval speeds.
- Implemented Python scripts for data manipulation and cleaning within Google Cloud Functions, ensuring high-quality data in financial reports.
- Streamlined data integration from diverse sources using Talend, enhancing the breadth and accuracy of financial data.
- Optimized data pipeline performance using Apache Airflow, reducing processing times and enhancing data throughput.
- Conducted regular data quality checks and optimizations using Google Cloud data quality tools, ensuring accuracy and reliability of financial reports.
- Developed automated data validation tests using Python, safeguarding against data integrity issues in transaction processing.
- Enhanced data security and compliance measures, aligning with industry standards and regulatory requirements.
- Facilitated the transition of data operations to cloud environments, leveraging GCP technologies for greater scalability and flexibility.
- Provided technical leadership in the deployment of new data solutions using GCP, enhancing team knowledge and system capabilities.
- Engineered data models and schemas in Amazon Redshift, supporting complex data analysis and business intelligence efforts.
- Conducted training sessions on data handling and visualization techniques using GCP tools, empowering team members with new skills.
- Implemented version control with Git and integrated it with Google Cloud Source Repositories, ensuring efficient collaboration and management of development projects.
Environment: SQL, Google Cloud Dataflow, Google Cloud Functions, Google Cloud data quality tools, GCP tools, Talend, Apache Airflow, Power BI, Amazon Redshift, Python, AWS, Git.

Client: Tvisha Technologies, Hyderabad, India    Mar 2019 to Dec 2020
Role: SQL Developer
Roles & Responsibilities:
- Implemented Oracle Database solutions to enhance data storage, retrieval, and management processes.
- Developed and optimized PL/SQL scripts for advanced data manipulation and report generation.
- Managed data integration projects using Informatica PowerCenter, enhancing data consistency and quality.
- Configured Apache Kafka for real-time data ingestion, improving data flow and processing speeds.
- Utilized Jenkins and Docker to automate continuous integration and deployment processes, enhancing productivity.
- Leveraged R programming for data analysis and visualization, providing insights for strategic decision-making.
- Developed comprehensive data visualization tools with Tableau, enabling better data interpretation and business decisions.
- Optimized SQL queries and database functions, enhancing performance and data access for enterprise applications (see the sketch following this section).
- Engineered ETL processes with Informatica PowerCenter, improving data integration and workflow efficiency.
- Implemented robust data ingestion frameworks with Apache Kafka, streamlining real-time data collection.
- Utilized Docker to manage containerized applications, improving deployment flexibility and environment consistency.
- Automated various development and deployment tasks using Jenkins, increasing operational efficiency.
- Developed analytic models and reports using R and Tableau, enhancing data-driven decision-making.
- Managed and optimized database operations, ensuring high performance and reliability.
- Integrated and maintained Oracle and PL/SQL solutions, supporting critical business operations and analytics.
Environment: Oracle Database, PL/SQL, Informatica PowerCenter, Apache Kafka, Jenkins, Docker, R, and Tableau.
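To illustrate the Oracle SQL development work in the role above, below is a minimal Python sketch that runs a parameterized query through the python-oracledb driver. The credentials, DSN, table, and column names are hypothetical placeholders, not details of any actual system.

```python
import datetime

import oracledb

# Connection details and the sales table/columns are illustrative placeholders.
conn = oracledb.connect(user="report_user", password="***", dsn="dbhost/ORCLPDB1")

with conn.cursor() as cur:
    # Bind variables keep the query plan reusable and guard against SQL injection.
    cur.execute(
        """
        SELECT region, SUM(amount) AS total_amount
          FROM sales
         WHERE sale_date >= :start_date
         GROUP BY region
        """,
        start_date=datetime.date(2020, 1, 1),
    )
    for region, total_amount in cur:
        print(region, total_amount)

conn.close()
```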