Name: Asritha Kukudala
Email: EMAIL AVAILABLE
Phone: PHONE NUMBER AVAILABLE
LinkedIn: https://LINKEDIN LINK AVAILABLE

Professional Summary:
Over 10 years of experience as an ETL Data Engineer and SRE, with more than 9 years of hands-on experience in performance tuning, ETL optimization, and site reliability engineering. Specialized in configuring and optimizing the Pentaho ETL server, developing and automating lightweight processes, and implementing server-side SRE best practices to ensure high reliability and system performance.
Developed scalable data pipelines leveraging AWS services such as Kinesis, Glue, Redshift, and S3 for both batch and streaming architectures.
Proficient in AWS services (EC2, S3, EBS, ELB, RDS, SNS, SQS, VPC, Redshift, CloudFormation, CloudWatch), ensuring secure and scalable data solutions.
Skilled in GCP technologies such as BigQuery, Cloud SQL, Cloud Storage, Matillion, Data Studio, and Pub/Sub for data processing and analysis.
Experienced with Azure services (Data Factory, Databricks, Data Lake, Azure SQL, Cosmos DB) and with Python, PySpark, and SQL for building and optimizing both batch and streaming data pipelines.
Expertise in Snowflake and Redshift for data warehousing and analytics, ensuring high performance and scalability.
Managed scalable data systems using AWS EC2 and SAS Data Maker, automating ETL pipelines for real-time transformations and analytics.
Managed GCP Identity and Access Management (IAM), ensuring secure and compliant access controls across cloud resources.
Used AWS VAST for large-scale data storage, optimizing storage efficiency and accessibility for high-performance cloud-based applications.
Hands-on experience in Pentaho ETL server setup, performance tuning (heap, memory, CPU), upgrades, and optimization.
Scheduling and automation: configured Cronitor Scheduler for effective resource allocation and job monitoring, enhancing system efficiency.
Experienced in implementing and optimizing lightweight processes (LWP) for resource management in high-performance systems.
Site Reliability Engineering (SRE): 8+ years in server-side SRE, focusing on automation, performance monitoring, and compliance.
Strong communication skills with a customer-centric approach, adept at working in Agile teams and coordinating with global, cross-functional teams.
Built test automation frameworks and managed functional test scripts; experienced in smoke, white-box, black-box, integration, end-to-end, regression, and system testing.
Participated in requirement and design reviews from a QA perspective and managed test coverage based on risk analysis.
Wrote test scripts and adhered to QA standards and practices.
Implemented AWS NewT for secure, high-performance cloud application development, ensuring robust data management and operational resilience.
Conducted batch testing for data processing pipelines to validate system performance and reliability across large datasets, ensuring seamless data transformations and migrations.
Performed UI component testing to verify the functionality, performance, and user experience of front-end interfaces, using automated testing tools to optimize workflows and identify issues promptly.
Designed and implemented security measures using AWS services such as IAM, KMS, and VPC to ensure data privacy, encryption, and compliance with financial regulations (e.g., GDPR, PCI-DSS).
Identified and resolved IAM exceptions to adhere to organizational security policies and compliance standards.
Developed AI-powered data insights with AWS Bedrock, integrating NLP and ML models to improve predictive analytics.
Implemented security best practices with VPCs, IAM roles, and encryption, securing network communication via firewalls, VPC Service Controls, and Cloud Armor.
Expertise in microservices architecture using service mesh technologies such as Istio and secure API management with Apigee Gateway.
Designed and managed CI/CD pipelines using GCP DevOps, Golang, Jenkins, Bitbucket, and GitHub.
Utilized AWS VAST for large-scale data storage and AWS NewT for secure, high-performance cloud-based applications.
Developed and maintained Customer Data Platform (CDP) solutions using platforms such as Segment and mParticle to centralize and manage customer data for marketing and analytics.
AWS Developer: leveraged AWS services (S3, Lambda, Redshift) to build scalable data pipelines and optimize CDP operations for real-time customer insights.
Experienced in testing data and ETL workflows using Informatica IDQ and DVO, ensuring data integrity across complex ETL processes.
Implemented Snowflake security features for role-based access control and encryption (a brief sketch follows this summary).
Developed and enforced data governance frameworks and DQ stewardship practices to ensure the quality, security, and compliance of data, contributing significantly to regulatory adherence and data management strategies.
Led efforts in building data quality rules for data validation and anomaly detection, improving data reliability and driving data consistency across systems using Ataccama.
Hands-on with metadata management, analyzing data capture, modification, and deletion processes to enhance transparency and data governance, ensuring audit trail integrity.
Experience with DBT for data transformations and analytics workflows; optimized ClickHouse DBMS for improved query performance.
Utilized Datadog APM (Application Performance Monitoring) to trace and troubleshoot issues within distributed systems, ensuring rapid identification and resolution of performance bottlenecks.
Leveraged Azure data engineering tools such as Azure Data Factory and Databricks to develop scalable data pipelines and automate data workflows.
Integrated Azure Data Lake for efficient data storage and retrieval.
Developed and deployed RESTful APIs on Azure, enabling real-time data processing and analytics and integrating Snowflake and other data sources.
Implemented security best practices in Azure environments using IAM roles, VPC configurations, and Azure Role-Based Access Control (RBAC) to ensure secure and compliant data management.
Configured and automated CI/CD pipelines using Azure DevOps, integrating Datadog monitoring into Jenkins and Bamboo to enable continuous tracking of deployment metrics and system performance.
Optimized data transformation workflows by integrating Azure SQL and Cosmos DB for low-latency, scalable data solutions.
Implemented high-availability solutions for CDP data processing using monitoring and alerting tools such as Datadog and PagerDuty, ensuring 24/7 system availability and rapid recovery.
Developed automated recovery processes to minimize downtime and ensure seamless data processing in the CDP ecosystem.
Performed Google Cloud Security Posture Review (CSPR) remediations, ensuring alignment with industry best practices and organizational security policies.
Conducted regular reviews and updates of cloud configurations, addressing CSPR findings and implementing necessary security enhancements.
Collaborated on Azure cloud architecture, integrating services such as Azure Functions, Blob Storage, and Azure SQL for real-time analytics and reporting.
Integrated AWS DMS, Step Functions, and Lambda for optimized data workflows.
Skilled in Python, Shell Scripting, and PowerShell for workflow automation, and in CloudWatch, Nagios, and Splunk for monitoring and logging.
Developed distributed systems for real-time processing with Kafka, AWS Kinesis, and Spark Streaming.
Managed large-scale NoSQL databases such as MongoDB and Cassandra for real-time ingestion and querying.
Led DevOps initiatives, implementing CI/CD pipelines and automating deployment with GitLab and Jenkins.
Extensive experience in data profiling, data observability, and data analysis with SQL and PL/SQL across various databases (MySQL, MS-SQL, Oracle, DB2, Hadoop).
Designed and deployed RESTful APIs in Azure with SAML authentication for real-time data processing and analytics; adept in data visualization using Tableau, Power BI, and the ELK Stack.
Experienced in data quality management with data cleaning, DataFlux, and Quality Center, ensuring high data integrity.
Developed integrated endpoint protection solutions to safeguard data pipelines and ensure secure data handling across cloud platforms, leveraging encryption, data loss prevention, and antivirus protection.
Worked with Epic Systems software and Ataccama MDM for Master Data Management (MDM), managing electronic health records (EHR), optimizing data retrieval processes, and ensuring secure patient data handling.
Implemented and customized Epic modules to streamline healthcare workflows.
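The Snowflake role-based access control and encryption work noted in this summary can be illustrated with a minimal sketch using the snowflake-connector-python library. All names below (account, user, roles, warehouse, database, schema) are hypothetical placeholders, not values taken from any engagement described here.

```python
# Minimal sketch of Snowflake role-based access control (RBAC) setup.
# All identifiers (account, user, roles, warehouse, database, schema) are hypothetical.
import snowflake.connector

conn = snowflake.connector.connect(
    account="example_account",     # hypothetical account identifier
    user="SECURITY_ADMIN_USER",    # hypothetical administrative user
    password="***",                # supply via a secrets manager in practice
    role="SECURITYADMIN",
)

rbac_statements = [
    # Create a read-only analyst role and grant it the privileges it needs.
    "CREATE ROLE IF NOT EXISTS ANALYST_RO",
    "GRANT USAGE ON WAREHOUSE ANALYTICS_WH TO ROLE ANALYST_RO",
    "GRANT USAGE ON DATABASE FINANCE_DB TO ROLE ANALYST_RO",
    "GRANT USAGE ON SCHEMA FINANCE_DB.REPORTING TO ROLE ANALYST_RO",
    "GRANT SELECT ON ALL TABLES IN SCHEMA FINANCE_DB.REPORTING TO ROLE ANALYST_RO",
    # Attach the role to a user and to the standard role hierarchy.
    "GRANT ROLE ANALYST_RO TO USER REPORTING_USER",
    "GRANT ROLE ANALYST_RO TO ROLE SYSADMIN",
]

cur = conn.cursor()
try:
    for stmt in rbac_statements:
        cur.execute(stmt)
finally:
    cur.close()
    conn.close()
```

In practice the GRANT statements would be generated from a governance-approved role matrix, and credentials would come from a secrets manager rather than being hard-coded.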
Technical Skills:
Cloud Platforms & Infrastructure:
AWS: EC2, S3, EBS, ELB, RDS, SNS, SQS, VPC, Redshift, CloudFormation, CloudWatch, ELK Stack, VAST, NewT, CodePipeline, Lambda
GCP: BigQuery, Cloud SQL, Cloud Storage, Cloud Composer, Dataproc, Pub/Sub
Azure: Data Factory, Databricks, Data Lake, Azure SQL, Cosmos DB
Other Platforms: Palantir Foundry, Palantir Gotham, ECS, Solr, CloudFront, Terraform, IaaS, PaaS, SaaS, Puppet, Docker, Kubernetes, Ansible
Data Warehousing & Big Data:
Data Warehousing: Snowflake, Redshift, GCP BigQuery, Teradata, DB2, ClickHouse, Oracle 9i, Azure SQL
Big Data Technologies: Hadoop, PySpark, Apache Spark, MapReduce, Hive, HBase, Databricks, ADF, AWS Glue
Streaming & Real-Time Processing: Apache Kafka, AWS Kinesis, GCP Pub/Sub, Spark Streaming, Flink, Kafka Streams
Data Integration & ETL/ELT:
ETL Tools: AWS Glue, Matillion, Pentaho DI, Informatica, SSIS, DataStage, QualityStage, Palantir Foundry, Apache NiFi, Airflow, Apache Pig, GCP Dataflow, Pentaho ETL, Cronitor Scheduler, Jenkins, GitLab
Data Formats: Parquet, CSV, JSON
Programming & Scripting Languages:
Languages: Python, PySpark, Bash, Shell Scripting, PowerShell, AJAX, JavaScript, React.js, OAuth, REST APIs, SOAP, SSO-SAML, C/C++, Golang, Scala, Node.js, Java, J2EE, SQL, TypeScript, Ruby, Perl, PHP, .NET (C#), Visual Basic, XML, HTML, HTML5, ASP.NET, CSS
Frameworks: Django, Flask, Angular, React, Vue.js, Node.js, Spring Boot
NoSQL & Relational Databases:
NoSQL Databases: MongoDB, Cassandra, DynamoDB, ElastiCache, Ataccama, Postgres, HBase, Cosmos DB
Relational Databases: MySQL, MS-SQL, Oracle, DB2, PostgreSQL, SQL Server
DevOps & CI/CD Tools:
CI/CD: Jenkins, GitLab, Bitbucket, Bamboo, Travis CI, AWS CodePipeline, Ansible playbooks
Containerization: Docker, SonarQube, Kubernetes, Helm
Version Control: GitHub, Git, SVN
Data Processing & Analytics:
Tools: Apache NiFi, Airflow, Jupyter Notebooks, Tableau, Power BI, Looker, Pandas, SQLAlchemy
Technologies: Palantir Foundry, Palantir Gotham, Databricks, ELK Stack (Elasticsearch, Logstash, Kibana), NumPy, SciPy
Machine Learning: TensorFlow, PyTorch, Keras, AI/ML development, algorithm development
Data Migration & Quality Management:
Tools: Ataccama, DataFlux, QualityStage, Informatica, SAS Data Maker
Techniques: Data Cleaning, Data Modeling, Data Governance
Networking & Security:
Networking: DNS, TCP/IP, DHCP, VPC Configuration, Firewall, IPAM, Routing, Switching, VPN, GPUs, IAM roles
Security: OAuth, OpenID, IAM Security, VPN, Firewalls, Compliance Standards (GDPR, HIPAA)
Project Management & Agile Practices:
Tools: Jira, Asana, Confluence, Trello
Methodologies: Agile (Scrum, Sprint Planning), Kanban, DevOps practices, CI/CD pipelines
Monitoring & Logging:
Tools: Nagios, Splunk, AWS CloudWatch, ELK Stack, Data Studio, CloudFront, Grafana
Soft Skills: Communication, Team Leadership, Interpersonal Skills, Problem Solving, Consulting, Documentation, Process Optimization

Professional Experience

Global Atlantic Financial Group, Indianapolis, IN    October 2021 to Present
Data Engineer - AWS / Product Owner (NICE CXone, Snowflake)
Responsibilities:
Developed scalable data pipelines using Python, Apache Airflow, Hadoop, and AWS Glue, enhancing data processing capabilities and increasing data handling efficiency.
Optimized the Pentaho ETL Server by tuning heap, memory, and CPU resources to enhance server performance and reduce processing times, contributing to smooth and efficient ETL workflows.
Configured and managed Cronitor Scheduler, ensuring robust monitoring and optimized job scheduling to meet diverse performance needs.
Led performance optimization of the Pentaho ETL Server by tuning heap, memory, and CPU allocations, reducing data processing time and enhancing throughput.
Implemented lightweight processes (LWP) for resource optimization, significantly enhancing system operations and meeting performance benchmarks.
Leveraged SRE best practices to automate performance monitoring and improve system reliability across large-scale, cloud-based platforms.
Collaborated with global teams, adapting to flexible work schedules and ensuring consistent communication and high standards of customer service.
Project scope: design and implement a financial data lake architecture that supports real-time analytics using AWS cloud services, Snowflake, ETL tools, and SQL databases; the system processes large-scale financial data for advanced analytics, giving stakeholders insight into financial performance, risk management, and predictive analysis.
Utilized 9 years of SRE expertise to enforce system reliability, including proactive performance monitoring with tools such as Datadog and AWS CloudWatch, automated issue detection, and real-time system alerting.
Administered Snowflake for scalable data analytics, ensuring efficient query performance and optimization of cloud-based data warehouses; migrated data using Snowflake's Service Data Transfer and Federated Queries to handle large-scale data efficiently.
Designed ETL pipelines using tools such as AWS Glue and Matillion, integrating them with Snowflake for data processing and transformations; automated data workflows with AWS Glue, ensuring seamless data ingestion into Snowflake.
Implemented end-to-end data quality checks using tools such as Ataccama to ensure data consistency and integrity across Snowflake environments.
Optimized streaming pipelines using Snowflake, Kafka, and Spark Streaming, facilitating real-time data analytics.
Implemented Snowflake's role-based access control (RBAC), securing the data environment and ensuring compliance with industry regulations.
Used SQL for data modeling, ensuring efficient query performance and data retrieval from Snowflake.
Designed ETL pipelines using AWS Glue and Apache Airflow to extract, transform, and load data from raw sources into Snowflake (a minimal sketch appears at the end of this section).
Experienced in continuous integration tools such as Jenkins, TeamCity, and GitLab, ensuring streamlined automated builds and deployments across multiple projects.
Utilized AWS VAST for large-scale data storage, optimizing storage efficiency and accessibility for high-performance cloud-based applications.
Implemented AWS NewT for secure, high-performance cloud application development, ensuring robust data management and operational resilience.
Experienced in testing data and ETL workflows using Informatica IDQ and DVO, ensuring data integrity across complex ETL processes.
Leveraged over 5 years of experience with NICE inContact CXone to implement and optimize ACD, IVR, and omnichannel systems, integrating with auto-dialer and workforce management solutions for enhanced customer interactions and analytics.
Utilized 3+ years of experience with CXone Studio to build and manage complex interaction flows, automating customer engagement strategies to improve operational efficiency.
Proficient in programming with C#, Java, JavaScript, and Python for developing, maintaining, and troubleshooting web services and backend systems.
Developed, maintained, and troubleshot web service API calls, ensuring seamless integration between platforms and external services (5+ years of experience); managed API interactions between CXone and other customer data platforms to enhance service delivery and reporting.
Adept in implementing QA processes within Agile and Scrum environments, actively participating in sprint planning, daily stand-ups, and retrospectives to ensure continuous quality delivery.
Conducted batch testing for data processing pipelines to validate system performance and reliability across large datasets, ensuring seamless data transformations and migrations.
Developed and upgraded Ab Initio infrastructure scripts (EAC) for blue/green deployment on AWS, ensuring continuity and resilience in the deployment process.
Integrated batch data sources using tools such as AWS Glue and Apache NiFi for ETL processes.
Configured and implemented Data Quality (DQ) tools such as Ataccama, ensuring end-to-end data quality across diverse data environments; played a key role in managing DQ rules and data profiling for consistent and accurate data throughout the organization.
Integrated Palantir Foundry for streamlined data workflows and advanced analytics on large financial datasets, supporting real-time decision-making.
Led cross-functional teams to develop and deliver data solutions, aligning with business goals and ensuring stakeholder collaboration using tools such as Jira and Confluence.
Designed ETL pipelines using PySpark, AWS EMR, and Snowflake, optimizing data warehousing and real-time data analytics from heterogeneous sources such as Oracle, Hadoop, and S3.
Automated data workflows with Apache NiFi and AWS Glue, enabling seamless data ingestion and transformation from AWS S3 to Snowflake.
Executed blue/green deployment pipelines using Jules for Spark jobs, enabling zero-downtime updates and improved system reliability.
Collaborated with marketing teams to integrate CDP data into marketing automation platforms, enhancing personalized marketing campaigns through customer segmentation and behavior analysis.
Managed AWS EMR clusters for distributed data processing, improving data integration across multiple data sources and scaling analytics capabilities.
Utilized Datadog APM (Application Performance Monitoring) to trace and troubleshoot issues within distributed systems, ensuring rapid identification and resolution of performance bottlenecks.
Leveraged AWS services (S3, EC2, Redshift, Glue, Kinesis) and GCP for efficient big data processing, real-time ingestion, and advanced analytics.
Extensive experience managing relational databases such as Postgres, using SQL for data storage and retrieval within CDP ecosystems.
Created a business intelligence (BI) and artificial intelligence layer using tools such as Tableau and Amazon QuickSight to visualize key financial metrics such as revenue, expenses, and financial risk.
Implemented Infrastructure as Code (IaC) with Terraform and Jenkins to automate the deployment and management of AWS infrastructure, optimizing the provisioning of resources for CDP platforms.
Utilized Ansible for configuration management to maintain consistent environments across CDP deployments.
Developed AI-powered models for predictive analytics using AWS Bedrock, improving data processing with minimal data wrangling effort.
Developed and upgraded Spark infrastructure scripts (EAC) for blue/green deployment on AWS development environments, ensuring seamless deployment and minimal downtime.
Built and optimized data pipelines using Apache Airflow and SAS Data Maker, automating ETL processes and improving real-time data transformation and availability.
Designed event-driven architectures using Kafka, AWS Lambda, and Kinesis for real-time data processing and analytics.
Led migration of legacy systems to cloud platforms such as AWS using CloudFormation, ensuring efficient resource management and monitoring with CloudWatch.
Developed RESTful APIs for real-time data retrieval and analytics on Azure, integrating with Snowflake and other data sources.
Implemented CI/CD pipelines using Jenkins, Ansible, and Bamboo, automating deployment processes and reducing development cycle times.
Enhanced system performance with the ELK Stack for real-time monitoring, improving system uptime and operational efficiency.
Managed data security using S3 bucket policies, IAM roles, and VPC configurations, ensuring compliance with GDPR and HIPAA.
Used tools such as Datadog and PagerDuty for real-time infrastructure monitoring, ensuring high availability and performance across AWS-based CDP environments.
Optimized streaming data pipelines with Apache Airflow, AWS Kinesis, Kafka, and Spark Streaming, improving data retrieval speeds and reducing latency by 30%.
Collaborated with data scientists to implement machine learning models for large-scale datasets, enhancing real-time data analytics.
Integrated ClickHouse with data workflows for enhanced data insights and visualization.
Led product development by managing backlogs, facilitating Agile ceremonies, and prioritizing tasks to ensure timely sprint execution.
Environment: Python, Shell Scripting, PowerShell, Snowflake, WebSphere, AWS (EC2, S3, EBS, ELB, RDS, SNS, SQS, VPC, Redshift, CloudFormation, CloudWatch, ELK Stack), Pentaho, Jenkins, Ansible, Unix/Linux, Tomcat, Nagios, Splunk, Hadoop, Hive, Impala, AWS Glue, Git, Microservices, Jira, JBoss, Bamboo, Kubernetes, Docker, WebLogic, Maven, Ataccama, Data Governance, Data Quality Tools (Ataccama, Informatica), Data Profiling, J2EE, Metadata Analysis, Data Stewardship, DQ Rules Configuration, Data Observability, SQL, Data Governance Frameworks.
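To illustrate the Airflow-to-Snowflake ETL pattern referenced in the responsibilities above, here is a minimal sketch of a daily DAG that copies raw S3 files into Snowflake and then builds a small reporting table. It assumes the Airflow Snowflake provider is installed; the connection ID, external stage, and table names are hypothetical.

```python
# Hypothetical sketch of an Airflow DAG loading raw S3 files into Snowflake.
# Connection IDs, stage, and table names are illustrative assumptions only.
from datetime import datetime

from airflow import DAG
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator

with DAG(
    dag_id="s3_to_snowflake_daily",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # COPY INTO reads from an external stage that points at the raw S3 bucket.
    load_transactions = SnowflakeOperator(
        task_id="load_transactions",
        snowflake_conn_id="snowflake_default",   # assumed Airflow connection
        sql="""
            COPY INTO FINANCE_DB.RAW.TRANSACTIONS
            FROM @FINANCE_DB.RAW.S3_RAW_STAGE/transactions/
            FILE_FORMAT = (TYPE = PARQUET)
            MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
        """,
    )

    # Simple post-load transformation into a reporting table.
    build_daily_summary = SnowflakeOperator(
        task_id="build_daily_summary",
        snowflake_conn_id="snowflake_default",
        sql="""
            INSERT INTO FINANCE_DB.REPORTING.DAILY_REVENUE
            SELECT txn_date, SUM(amount)
            FROM FINANCE_DB.RAW.TRANSACTIONS
            GROUP BY txn_date;
        """,
    )

    load_transactions >> build_daily_summary
```

The COPY INTO step relies on an external stage already pointing at the raw S3 bucket, so Airflow only orchestrates SQL while Snowflake performs the heavy lifting.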
Credit One Bank, Las Vegas, Nevada
GCP Data Engineer    March 2019 to September 2021
Responsibilities:
Developed and managed big data and analytics workloads using GCP (BigQuery, Dataflow, Dataproc) for data transformation and aggregation; designed ETL pipelines in Cloud Composer and Matillion, integrating data from Cloud SQL and MySQL and leveraging MS-SQL, Oracle, Tomcat, and DB2 for optimal database performance (a brief BigQuery sketch appears at the end of this section).
Strong SQL expertise in querying and optimizing cloud-based data warehouses such as Snowflake and Redshift; administered Snowflake, supporting scalable analytics, and performed data migrations via Service Data Transfer and Federated Queries.
Implemented cloud platform and infrastructure services, including AWS (EC2, S3, EBS, ELB, RDS, SNS, SQS, VPC, Redshift, CloudFormation, CloudWatch, ELK Stack, VAST, NewT, CodePipeline, Lambda).
Used Ataccama to monitor and enforce data quality, reducing errors; automated data quality checks, profiling, and governance across the data lifecycle.
Led performance optimization of the Pentaho ETL Server by tuning heap, memory, and CPU allocations, reducing data processing time and enhancing throughput.
Utilized 9 years of SRE expertise to enforce system reliability, including proactive performance monitoring with tools such as Datadog and AWS CloudWatch, automated issue detection, and real-time system alerting.
Implemented Infrastructure as Code (IaC) using tools such as Terraform and Google Cloud Deployment Manager to automate and streamline cloud infrastructure provisioning.
Developed and maintained IaC templates for consistent and repeatable deployment of secure cloud resources across multiple environments.
Built AI/ML and predictive analytics models using Python and BQ-ML to forecast financial trends and manage risk; applied statistical analysis for customer segmentation and personalized marketing.
Designed real-time data processing pipelines using Kafka, Golang, AWS Kinesis, and Spark Streaming for event-driven architectures; automated data ingestion pipelines for AWS Redshift, with real-time monitoring and alerting via AWS CloudWatch and X-Ray.
Designed and integrated APIs with Angular and Azure for real-time data retrieval and manipulation; automated ETL processes with SSIS, enhancing reporting efficiency with Data Studio and SSRS.
Configured cloud security and governance with LOB and product management teams, using IAM Security, VPCs, and VPN solutions (Google-Client) for secure cloud environments; enforced data governance using Data Catalog and Cloudera tools.
Worked closely with Cloud Foundations teams to integrate Prisma Cloud monitoring and CSPR remediation processes into cloud security frameworks, improving overall cloud security posture.
Collaborated with cross-functional teams to design and implement secure cloud architectures, focusing on automation and compliance.
Led Agile collaboration with Agile workflows and continuous integration; collaborated on GitHub for version control and optimized data workflows using statistical analysis and troubleshooting.
Developed automation in Python, Golang, and shell scripts to streamline complex data processing tasks; solved data integration challenges with custom algorithms for streaming data and unified data lakes.
Enhanced data visualization and reporting with Data Studio, SSRS, and multidimensional data analysis in SSAS, generating actionable business insights.
Orchestrated data ingestion and transformation with ETL pipelines in Matillion for transforming large datasets in GCP BigQuery; managed automated data cleaning and quality control using QualityStage and GCP Dataprep.
Environment: GCP, BigQuery, Dataprep, Dataflow, Dataproc, Cloud SQL, Cloud Storage, Matillion, BQ-ML, Data Studio, MySQL, MS-SQL, Oracle, DB2, Maestro Financiero, Federated Queries, IAM Security, VPC Configuration, Data Catalog, VPN (Google-Client), Pub/Sub, SSIS, SSAS, SSRS, DataStage, QualityStage, Service Data Transfer, Snowflake, Cloud Composer, Python, shell scripts.
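A minimal sketch of the kind of BigQuery aggregation work described in this section, using the google-cloud-bigquery Python client; the project, dataset, and table names are hypothetical placeholders.

```python
# Hypothetical sketch of a BigQuery aggregation job.
# Project, dataset, and table names are illustrative assumptions.
from google.cloud import bigquery

client = bigquery.Client(project="example-analytics-project")

query = """
    SELECT
      customer_segment,
      DATE_TRUNC(txn_date, MONTH) AS month,
      SUM(amount) AS total_spend
    FROM `example-analytics-project.card_data.transactions`
    WHERE txn_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 12 MONTH)
    GROUP BY customer_segment, month
    ORDER BY month, customer_segment
"""

# Run the query and materialize results into a reporting table.
job_config = bigquery.QueryJobConfig(
    destination="example-analytics-project.reporting.monthly_spend_by_segment",
    write_disposition="WRITE_TRUNCATE",
)
query_job = client.query(query, job_config=job_config)
query_job.result()  # blocks until the job completes
```

In a Cloud Composer deployment, a job like this would typically be wrapped in a DAG task rather than run as a standalone script.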
Chewy, Dania Beach, FL
Azure Data Engineer    April 2017 to February 2019
Responsibilities:
Developed scalable Azure data engineering pipelines with Azure Data Factory and Databricks, and optimized Azure Data Lake for efficient high-volume data ingestion and storage.
Leveraged Spark, Hive, MapReduce, and Golang for complex transformations and batch processing, enhancing analytics efficiency (a minimal PySpark sketch appears at the end of this section).
Built distributed streaming systems with Apache Kafka, Flink, and Golang, integrating NoSQL databases (MongoDB, Cassandra) for real-time, low-latency data solutions; orchestrated data synchronization using Sqoop, Flume, Blob Storage, and Cosmos DB for scalability and availability.
Utilized 9 years of SRE expertise to enforce system reliability, including proactive performance monitoring with tools such as Datadog and AWS CloudWatch, automated issue detection, and real-time system alerting.
Designed responsive Azure dashboards using React for real-time reporting; automated data cleaning with Python scripts and developed dynamic analytics dashboards in Power BI to enhance customer insights and product performance.
Implemented CI/CD pipelines with Azure DevOps and Git, facilitating efficient deployments; configured Matillion for automated ETL workflows, integrating AWS Redshift and Snowflake for optimized data processing.
Led Scrum and Kanban teams to drive projects, using project management tools such as Jira for tracking, and ensured timely project delivery; automated network configuration management to reduce outages and improve infrastructure resilience.
Developed cloud functions such as Azure Function Apps and WebApps for real-time data processing; ensured data security and compliance by configuring SSH and Azure Role-Based Access Control to protect sensitive information.
Managed Cloudera clusters to optimize big data platforms and operations, improving performance and stability for large-scale pet food analytics.
Environment: Azure Data Factory, Databricks, Azure Data Lake, Spark, Hive, Azure DevOps, Git, Maven, Jira, Apache Kafka, Azure, Python, Power BI, Unix, SQL Server, HBase, Sqoop, Flume, ADF, Blob Storage, Cosmos DB, MapReduce, HDFS, Cloudera, SQL, ACR, Azure Function App, Azure WebApp, Azure SQL, Azure SQL MI, SSH, YAML, WebLogic.
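As a rough illustration of the Spark batch transformations mentioned above, the following PySpark sketch aggregates raw order events into a daily rollup; the storage paths and column names are hypothetical assumptions.

```python
# Hypothetical PySpark sketch of a batch transformation in a Databricks-style environment.
# Storage paths and column names are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("order_daily_rollup").getOrCreate()

# Read raw order events from the data lake (e.g., an ADLS/Blob-backed mount).
orders = spark.read.parquet("/mnt/datalake/raw/orders/")

daily_rollup = (
    orders
    .filter(F.col("status") == "COMPLETED")
    .withColumn("order_date", F.to_date("order_timestamp"))
    .groupBy("order_date", "product_category")
    .agg(
        F.count("order_id").alias("order_count"),
        F.sum("order_total").alias("revenue"),
    )
)

# Write the aggregated result back to a curated zone, partitioned by date.
(
    daily_rollup.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("/mnt/datalake/curated/daily_order_rollup/")
)
```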
Genpact, India
Data Analyst    September 2014 to December 2016
Responsibilities:
Analyzed complex data sets using SQL, R, and Python to support business decision-making; applied machine learning techniques for predictive modeling and customer segmentation, and used Pandas for data wrangling and large dataset analysis.
Integrated and transformed data with Informatica 6.1, ensuring high-quality data handling; cleaned and pre-processed data using DataFlux, Excel, and Jupyter Notebook to produce accurate, analysis-ready data.
Worked with big data technologies such as Hadoop and Spark for large-scale data analytics; managed and optimized data storage in Hadoop, Oracle 9i, and Teradata for efficient processing and access.
Performed statistical analysis and predictive modeling with SAS and SPSS to forecast trends and support strategic planning.
Developed interactive dashboards and reports in Tableau and Power BI to present data insights to stakeholders.
Wrote complex PL/SQL queries to handle data from various sources, improving data loading and transformation; ensured data quality and accuracy using Quality Center 7.2 and TOAD.
Collaborated with cross-functional teams to understand data needs and translate them into actionable insights; managed tasks with Jira and created comprehensive technical documentation for data processes.
Environment: Databricks, Informatica 6.1, SQL, Excel, R, Python, Jupyter Notebook, Jira, SAS, SPSS, Tableau, Power BI, Hadoop, Data Cleaning, DataFlux, Oracle 9i, Quality Center 7.2, TOAD, Statistical Analysis, PL/SQL, Flat Files, Teradata.

Education:
Malla Reddy Engineering College, July 2010 to May 2014
Bachelor's in Computer Science