Candidate's Name
Toledo, OH
Contact Info: PHONE NUMBER AVAILABLE
Email: EMAIL AVAILABLE
LinkedIn: LINKEDIN LINK AVAILABLE

Professional Summary:
6+ years of IT industry experience as a Data Analyst with a solid understanding of data modeling, data validation, and evaluating data sources, and a strong understanding of Data Warehouse/Data Mart architectures, ETL, BI, OLAP, and client/server applications on AWS and Azure.
Solid knowledge of AWS services such as EMR, Redshift, S3, and EC2, including configuring servers for auto-scaling and elastic load balancing.
Good knowledge of and experience in the Software Development Life Cycle (SDLC) and its phases: requirement gathering, analysis, design, implementation, deployment, and maintenance.
Experience in data analysis, data validation, data modeling, data mapping, data verification, data loading, and data mining, grounded in understanding requirements, analyses, and designs.
Highly skilled in visualization tools such as Tableau, ggplot2, Dash, and Flask for creating dashboards; adept in statistical programming languages such as R and Python.
Experience with the Alteryx platform, including data preparation, data blending, and the creation of data models and data sets.
Experience in data extraction, data management, data cleansing, data profiling, data consolidation, and data quality for various business data feeds.
Expert in creating complex Power BI ad hoc, frequency, summary, drill-down, dynamic grouping, graphical, and aging reports.
Experienced in creating data flow diagrams, use cases, use case diagrams, activity diagrams, entity relationship diagrams, data mappings, and data integrations.
Understands clients' business problems and analyzes data using appropriate statistical models to generate insights.
Hands-on experience with the Gurobi optimizer, SQL, and big data modeling techniques using Python.
Excellent technical skills; consistently delivered ahead of schedule, with strong interpersonal and communication skills.

Experience:

Data Analyst, 3AMIGOSIT.LLC, June 2023 - Present
Location: Columbus, OH
Created Databricks notebooks using SQL and Python and automated them using Databricks jobs.
Created Spark clusters and configured high-concurrency clusters in Azure Databricks to speed up the preparation of high-quality data.
Worked extensively on the migration of data products from Oracle to Azure.
Spun up HDInsight clusters and used Hadoop ecosystem tools such as Kafka, Spark, and Databricks for real-time streaming analytics, and Apache Sqoop, Pig, Hive, and Cosmos DB for batch jobs.
Extracted, transformed, and loaded data from source systems to Azure data storage services using a combination of Azure Data Factory, T-SQL, Spark SQL, and U-SQL (Azure Data Lake Analytics); ingested data into one or more Azure services (Azure Data Lake, Azure Storage, Azure SQL, Azure DW) and processed it in Azure Databricks.
Imported data in various formats such as JSON, text, CSV, and Parquet into the HDFS cluster with compression for optimization.
Ingested data from RDBMS sources such as Oracle, SQL Server, and Teradata into HDFS using Sqoop.
Loaded all datasets from source CSV files into Hive and Cassandra using Spark.
Performed database creation (DDL, DML, DCL), database tuning, SQL tuning, and performance planning.
Worked on Spark for real-time streaming of data into the cluster.
Developed Power BI reports and dashboards from multiple data sources using data blending.
Developed SQL queries/scripts to validate the data, such as checking for duplicates, null values, and truncated values, and ensuring correct data aggregations (see the validation sketch after this role).
Performed data quality analysis using advanced SQL, including count validation, dimensional analysis, statistical analysis, and data quality validation during data migration.
Performed extensive SQL querying on the staging, data warehouse, and data mart layers.
Explored data in a variety of ways and across multiple visualizations using Power BI; strategic expertise in design of experiments, data collection, analysis, and visualization.
Troubleshot, resolved, and escalated data-related issues and validated data to improve data quality.
Designed Power BI data visualizations using cross tabs, maps, scatter plots, and pie, bar, and density charts.
Made extensive use of DAX (Data Analysis Expressions) functions for reports and tabular models.
Created Power BI reports by joining multiple tables from multiple databases using complex SQL queries.
Designed a Power BI data model with multiple fact tables and dimensions based on the business requirements.
Created custom visuals and groups in Power BI.
Scheduled refreshes of Power BI reports, both hourly and on demand.
Environment: Azure Databricks, Azure Data Factory, Spark SQL, Azure Data Lake Storage, Azure Data Explorer, Azure Synapse, Power BI, Tableau
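The validation work described in this role (duplicate, null, and aggregation checks written in SQL and Python inside Databricks notebooks) follows a common PySpark pattern. The sketch below is illustrative only; the table names (claims_raw, claims_curated) and key column (claim_id) are hypothetical placeholders, not details taken from this resume.

```python
# Minimal PySpark sketch of duplicate/null/row-count validation.
# All table and column names here are hypothetical placeholders.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("data-quality-checks").getOrCreate()

df = spark.table("claims_raw")  # hypothetical staging table

# Duplicate check on the assumed business key
dupes = (df.groupBy("claim_id")
           .count()
           .filter(F.col("count") > 1))

# Null counts per column
null_counts = df.select(
    *[F.sum(F.col(c).isNull().cast("int")).alias(f"{c}_nulls") for c in df.columns]
)

# Simple row-count reconciliation against a hypothetical curated table
src_total = df.count()
tgt_total = spark.table("claims_curated").count()

print("duplicate keys:", dupes.count())
null_counts.show()
print("source vs target row counts:", src_total, tgt_total)
```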

Data Analyst, HYLYT BY SOCIORAC, May 2022 - May 2023
Location: Princeton, NJ
Developed Spark Structured Streaming jobs to read data from Kafka in real-time and batch modes, apply different change data capture (CDC) modes, and load the data into Hive (see the streaming sketch after this role).
Developed and configured Kafka brokers to pipeline server log data into Spark Streaming.
Created StreamSets pipelines for event logs using Kafka, StreamSets Data Collector, and Spark Streaming in cluster mode, customizing them with mask plugins and filters, and distributed existing Kafka topics across applications using StreamSets Control Hub.
Developed Spark code and Spark SQL/Streaming jobs for faster testing and processing of data.
Developed Python scripts to automate the ETL process using Apache Airflow, as well as cron scripts on Unix.
Worked on Apache Spark with Python to develop and execute big data analytics and machine learning applications; executed machine learning use cases with Spark ML and MLlib.
Wrote complex DAX functions in Power BI and Power Pivot.
Automated Power Query refreshes using PowerShell scripts and Windows Task Scheduler.
Pulled data into Power BI from various sources such as Databricks notebooks, Oracle, and Azure SQL.
Configured automatic and scheduled refreshes in the Power BI service.
Wrote calculated columns and measures in Power BI Desktop to demonstrate sound data analysis techniques.
Presented weekly to business users on the reports and their changes as required.
Worked on all kinds of reports: yearly, quarterly, monthly, and daily.
Worked with all types of transformations available in the Power BI Query Editor.
Created reports using charts, gauges, tables, and matrix visuals.
Created parameterized reports, dashboard reports, linked reports, and sub-reports by year, quarter, month, and week.
Created drill-down and drill-through reports by region.
Designed and developed Power BI graphical and visualization solutions from business requirement documents and plans for creating interactive dashboards.
Gathered requirements and wrote ad hoc SQL query batches to update data and metadata for delta and history loads.
Used Power BI and Power Pivot/View to design multiple scorecards and dashboards displaying information required by different departments and upper-level management.
Implemented several DAX functions for various fact calculations to support efficient data visualization in Power BI.
Converted existing Azure Data Factory V1 pipelines to Azure Data Factory V2 and ran SSIS packages in the Azure environment using the Integration Runtime (IR).
Configured diagnostics, monitoring, and analytics on the Azure platform, along with scaling and resilience for Azure websites.
Created calculated measures and dimension members using Multidimensional Expressions (MDX).
Worked closely with the team to meet deadlines for design and development deliverables.
Environment: Azure Data Factory, Spark SQL, Azure Data Lake Storage, Azure Data Explorer, Azure Synapse, Azure Databricks, Power BI
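The Kafka-to-Hive streaming work above can be illustrated with a minimal Spark Structured Streaming sketch. It assumes the spark-sql-kafka connector is on the classpath; the broker address, topic name, schema, and output paths are hypothetical placeholders rather than details from this resume.

```python
# Hedged sketch: read a Kafka topic with Spark Structured Streaming and
# append Hive-compatible Parquet output. Names below are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = (SparkSession.builder
         .appName("kafka-to-hive")
         .enableHiveSupport()
         .getOrCreate())

schema = StructType([
    StructField("event_id", StringType()),
    StructField("payload", StringType()),
    StructField("event_ts", TimestampType()),
])

events = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
          .option("subscribe", "server-logs")                # hypothetical topic
          .option("startingOffsets", "latest")
          .load()
          .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
          .select("e.*"))

query = (events.writeStream
         .format("parquet")                          # Hive-readable files
         .option("path", "/warehouse/server_logs")   # hypothetical location
         .option("checkpointLocation", "/chk/server_logs")
         .outputMode("append")
         .start())
query.awaitTermination()
```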

Business Data Analyst, DKUBE, December 2021 - May 2022
Location: Potomac, MD
Worked through the complete Software Development Life Cycle (SDLC) by analyzing business requirements and understanding the functional workflow of information from source systems to destination systems.
Used analytical, statistical, and programming skills to collect, analyze, and interpret large data sets and develop data-driven technical solutions to difficult business problems using tools such as SQL and Python.
Designed AWS EC2 instance architecture to meet high-availability application architecture and security requirements.
Created AWS S3 buckets, managed their policies, and used S3 and Glacier for storage and backup.
Worked with the Hadoop cluster and data querying tools to store and retrieve data.
Designed and developed SSIS packages to import and export data from MS Excel, SQL Server, and flat files.
Worked on data integration for the extract, transform, and load processes of the designed packages.
Designed and deployed automated ETL workflows using AWS Lambda, organized and cleansed data in S3 buckets using AWS Glue, and processed the data using Amazon Redshift.
Worked on ETL architecture enhancements to increase performance using the query optimizer.
Processed extracted data using Spark and Hive and handled large data sets using HDFS.
Worked on streaming data transfers from different data sources into HDFS and NoSQL databases.
Created ETL mappings with Talend Integration Suite to pull data from sources, apply transformations, and load data into the target database.
Performed performance tuning to optimize SQL queries using the query analyzer.
Scripted with Python in Spark to transform data from various file types such as text, CSV, and JSON.
Loaded data from relational databases such as MySQL and Teradata using Sqoop jobs.
Processed and tested data using Spark SQL and handled real-time processing with Spark Streaming and Kafka using Python.
Scripted in Python and PowerShell to set up baselines, branching, merging, and automation processes using Git.
Implemented the ETL architecture for data enhancement and optimized workflows by building DAGs in Apache Airflow to schedule ETL jobs, using additional Airflow components such as pools, executors, and multi-node functionality (see the DAG sketch after this role).
Used various transformations in SSIS Data Flow and Control Flow, including For Loop containers and fuzzy lookups.
Created SSIS packages for data conversion using the Data Conversion transformation and produced advanced extensible reports using SQL Server Reporting Services.
Deployed applications to GCP using Spinnaker (RPM-based), launched a multi-node Kubernetes cluster in Google Kubernetes Engine (GKE), and migrated the Dockerized application from AWS to GCP.
Environment: Python, SQL, AWS EC2, AWS S3, Hadoop, PySpark, AWS Lambda, AWS Glue, Amazon Redshift, Apache Kafka, SSIS, Informatica, ETL, Hive, HDFS, NoSQL, Talend, MySQL, Teradata, Sqoop, PowerShell, Git, Apache Airflow
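A minimal Apache Airflow DAG of the kind referenced above might look like the following (Airflow 2.x style). The DAG id, schedule, and task callables are hypothetical placeholders; the real extract and load logic is not shown.

```python
# Illustrative Airflow 2.x DAG for scheduling a two-step ETL job.
# dag_id, schedule, and the callables are hypothetical placeholders.
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_to_s3():
    # placeholder: pull from the source RDBMS and land files in S3
    pass

def load_to_redshift():
    # placeholder: COPY the cleansed files from S3 into Redshift
    pass

default_args = {"owner": "data-eng", "retries": 1, "retry_delay": timedelta(minutes=5)}

with DAG(
    dag_id="nightly_etl",                 # hypothetical name
    start_date=datetime(2022, 1, 1),
    schedule_interval="0 2 * * *",        # nightly at 02:00
    catchup=False,
    default_args=default_args,
) as dag:
    extract = PythonOperator(task_id="extract_to_s3", python_callable=extract_to_s3)
    load = PythonOperator(task_id="load_to_redshift", python_callable=load_to_redshift)
    extract >> load
```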

Data Analyst, PROCODE SOFTECH PVT LTD, November 2020 - December 2021
Location: Hyderabad, India
Worked on data mapping specifications to create and execute detailed system test plans; the data mapping specifies what data will be extracted from an internal data warehouse, transformed, and sent to an external entity.
Analyzed business requirements, system requirements, and data mapping requirement specifications, and documented functional and supplementary requirements in Quality Center.
Set up the environments to be used for testing and defined the range of functionality to be tested per technical specifications.
Integrated with the UI layer (HTML, JavaScript).
Tested complex ETL mappings and sessions based on business user requirements and business rules to load data from source flat files and RDBMS tables into target tables.
Responsible for data mapping activities from source systems to Teradata.
Created the test environment for the staging area and loaded it with data from multiple sources.
Analyzed various heterogeneous data sources such as flat files, ASCII data, EBCDIC data, and relational data (Oracle, DB2 UDB, MS SQL Server).
Delivered files in various formats (e.g., Excel, tab-delimited text, comma-separated text, pipe-delimited text).
Executed campaigns based on customer requirements.
Built Informatica mappings to extract, transform, and load claim data into the Salesforce data warehouse and data mart.
Followed company code standardization rules.
Performed ad hoc analyses as needed.
Tested XML files and checked whether data was parsed and loaded into staging tables.
Executed SAS jobs in batch mode through UNIX shell scripts.
Created remote SAS sessions to run jobs in parallel, cutting extraction time by generating the datasets simultaneously.
Made code changes to SAS programs and UNIX shell scripts.
Reviewed and modified SAS programs to create customized ad hoc reports and processed data for publishing business reports.
Created test cases to ensure that data originating from the source lands in the target correctly and in the right format.
Transferred data from on-premises SQL Server to cloud databases (Azure Synapse Analytics and Azure SQL).
Validated data from SQL Server to Snowflake to ensure an apples-to-apples match (see the reconciliation sketch after this role).
Tested the ETL process both before and after data validation; tested the messages published by the ETL tool and the data loaded into various databases.
Created UNIX scripts for file transfer and file manipulation.
Supported the client in assessing how many virtual user licenses would be needed for performance testing.
Wrote PySpark and SQL transformations in Azure Databricks to perform complex transformations implementing business rules.
Enabled internal SSL for Oracle OBIEE.
Designed and developed various machine learning frameworks using Python, R, and MATLAB.
Tested the database for field size validation, check constraints, and stored procedures, and cross-verified the field sizes defined within the application against the metadata.
Environment: Informatica 7.1, DataFlux, Oracle 9i, Quality Center 8.2, SQL, TOAD, PL/SQL, flat files
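Source-to-target reconciliation of the kind mentioned above (SQL Server to Snowflake) is commonly done by comparing row counts, column aggregates, and set differences. A hedged PySpark sketch follows; the JDBC connection details, table names, measure column, and the assumption that the Snowflake data is already registered as a Spark table are all hypothetical.

```python
# Hedged reconciliation sketch: compare a source table (read via JDBC) with a
# target table assumed to be already staged. All names are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("source-vs-target-check").getOrCreate()

source_df = (spark.read.format("jdbc")
             .option("url", "jdbc:sqlserver://host;databaseName=dw")  # hypothetical
             .option("dbtable", "dbo.orders")                         # hypothetical
             .option("user", "USER").option("password", "PASSWORD")   # placeholders
             .load())

target_df = spark.table("snowflake_stage.orders")  # assumed to be staged already

# Row-count comparison
print("rows:", source_df.count(), target_df.count())

# Column-level aggregate comparison on a hypothetical numeric measure
src_sum = source_df.agg(F.sum("order_amount")).first()[0]
tgt_sum = target_df.agg(F.sum("order_amount")).first()[0]
print("order_amount totals:", src_sum, tgt_sum)

# Rows present in the source but missing from the target (schemas must match)
missing = source_df.exceptAll(target_df)
print("missing rows:", missing.count())
```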

Data Analyst, PEERTECH'Z PVT LTD, July 2019 - November 2020
Location: Hyderabad, India
Fostered the use of SQL and Excel in managing all data transformations, mappings, and migrations for multiple clients.
Retrieved additional information from numerous linked servers connected to SQL Server and Oracle tables using queries.
Defined a large list of related variables using mining techniques, including principal components and classification of large datasets, to inform business judgment using SQL (see the modeling sketch after this role).
Made substantial contributions to preparing and presenting reports, analyses, predictive modeling, and other data mining results using SSIS, Excel, and Access.
Prepared and presented SQL Server Reporting Services (SSRS) reports detailing risk characteristics and location.
Identified data discrepancies in key business reports that were leading to incorrect reporting of crucial financial data, and worked with engineers to fix data tables and queries to eliminate them.
Elicited, analyzed, and validated requirements to identify BI redesign initiatives and demonstrated how technology solutions supported the redesigned business process flow.
Created new reporting solutions and introduced dashboard presentation technology merging business process rules with workforce management reporting, improving call center workflow efficiency.
Introduced formal technical requirements and QA documents to define processes for quality assurance and for business task initiation and completion.
Demonstrated understanding of functional and data requirements for data warehouse projects, including a key marketing project to gauge marketing channel performance.
Planned and successfully managed multiple application and reporting upgrade/migration projects, increasing billing and collection department efficiency by 39% over a 14-month period.
Researched, troubleshot, and used the web analytics tool and its tracking and reporting functions to ensure all business reporting data elements were accurate.
Reliably supported the Underwriting, Actuarial, Finance, and Claims departments with business intelligence and statistical analysis.
Served as the primary point of contact for the reporting needs of call center executive management.
Identified and documented opportunities for new technology solutions to support reengineered business processes through interactive JAD sessions, and provided recommendations for technology selection.
Managed and coordinated BI, reporting, and process redesign initiatives to support financial, lending, accounting, and remarketing services.
Performed quality assurance testing and created test cases and test plans for data warehouse projects.
Coordinated with the Information Technology department on documented business designs and unit test plans in compliance with governance requirements.
Environment: SQL, matplotlib, NumPy, Tableau, MS Word, MS Excel, Python, UNIX/Linux, Oracle, SQL Server, MS Access
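The principal-components-plus-classification work referenced above can be sketched with scikit-learn (listed in this resume's skills). The example below uses synthetic data and hypothetical parameters; it is a minimal illustration of the technique, not the client analysis itself.

```python
# Minimal sketch: PCA for dimensionality reduction feeding a classifier.
# The data here is synthetic and the parameters are hypothetical.
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))          # synthetic feature matrix
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # synthetic target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=5)),        # reduce correlated variables
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X_train, y_train)
print("holdout accuracy:", model.score(X_test, y_test))
```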

Skills:
Programming Languages: Python (NumPy, Pandas, scikit-learn, TensorFlow), Scala, HTML.
Databases: SQL, Oracle, MySQL, PostgreSQL, Cassandra, Snowflake, Hive, Presto, MS SQL Server.
Big Data: Hadoop, Apache Spark, Apache Kafka, Airflow, MongoDB.
Cloud Platforms: Azure (Data Factory, Data Lake Storage, Cosmos DB, Databricks, Power BI), AWS (S3, EC2, RDS), and Google Cloud.
GCP Analytics: BigQuery, Cloud Storage, Data Studio, Bigtable.
ETL: Data Modeling, Statistical Data Analysis, Data Warehouse, Data Pipeline, Data Marts.
BI Tools: Tableau, Power BI, QlikView, Looker.
Software Methodologies: Agile, Scrum, and Waterfall.
Soft Skills: Project Management, Communication, Problem-solving, Strategic Planning, and Analysis.

Education:
Master of Science in Health Informatics, The University of Findlay, Findlay, OH.
Major: Health Information Systems and Health Data. Minor: Database Concepts, Statistical Methods for Business Analytics.
Bachelor's in Pharmaceutical Sciences, St. Peters Pharmaceutical Sciences, TS, India.
Major: Pharma Technology. Minor: Biostatistics and Computer Applications.

Certifications:
Agile Job Simulation covering Scrum, Sprint Planning, and Retrospectives from JPMorgan Chase & Co., issued by Forage.
Transformer Models and BERT Model, issued by Google.
Introduction to Statistical Concepts, issued by SAS.
Social & Behavioral Research, issued by CITI Program.
Positioning and Competitive Advantage, issued by IBS Americas.
Scrum Fundamentals for Scrum Master and Agile, issued by Udemy.