Candidate's Name
PHONE NUMBER AVAILABLE | EMAIL AVAILABLE
Data Engineer
PROFESSIONAL SUMMARY:
- Over 5 years of experience as a data engineer, adept in Microsoft Excel, R, and Python for data analysis and insights.
- Skilled in statistical modeling and data processing using R and Python libraries such as pandas and NumPy for complex datasets (a brief sketch follows this summary).
- Experienced in creating dynamic, interactive visualizations with QlikView and Tableau to support business intelligence and decision-making.
- Proficient in managing and optimizing Microsoft SQL Server databases for efficient data storage and rapid retrieval.
- Utilized Apache Hive to handle large datasets, improving the scalability and performance of data-driven applications.
- Employed Git for version control, strengthening team collaboration and maintaining code integrity across project iterations.
- Developed comprehensive business requirements and documentation to align project deliverables with strategic business goals.
- Expert in implementing distributed data processing systems using Apache Hadoop and Apache Spark for real-time analytics.
- Managed cloud-based data solutions using Azure SQL Database, optimizing data operations and management in the cloud.
- Designed and automated ETL processes with Talend and Azure Data Factory, streamlining data integration and workflow efficiency.
- Implemented real-time data streaming and processing solutions using Apache Kafka, improving data availability and analysis.
- Ensured consistent development environments and simplified deployments using Docker containerization.
- Automated and managed cloud infrastructure with Terraform, improving the provisioning and maintenance of cloud resources.
- Managed and scaled large data warehousing solutions using Azure SQL Data Warehouse, supporting extensive data analysis needs.
- Utilized Azure Cosmos DB for distributed database management, ensuring high performance and scalability across global applications.
- Secured sensitive data using Azure Key Vault, providing robust encryption and access control for critical data assets.
- Administered Azure Active Directory to manage identities and access controls in cloud environments.
- Applied Agile methodologies and Azure DevOps for continuous integration and delivery, improving project efficiency and team collaboration.
- Demonstrated ability to handle multiple projects efficiently, leveraging a broad range of technologies to meet diverse data needs.
- Fostered a collaborative team environment, leading by example in adopting new technologies and practices for data engineering.
- Continuously updated technical skills and industry knowledge to stay ahead of trends and advancements in data technology.
- Committed to ethical data management practices, ensuring compliance with data protection regulations and standards.
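A minimal sketch of the pandas/NumPy-style processing referenced above; the dataset, column names, and aggregation are hypothetical illustrations, not actual work product:

import numpy as np
import pandas as pd

# Hypothetical records; the schema is illustrative only.
df = pd.DataFrame({
    "region": ["East", "West", "East", "West"],
    "revenue": [1200.0, 950.0, 1430.0, 1100.0],
})

# Aggregate per region and attach a simple z-score for outlier screening.
summary = df.groupby("region")["revenue"].agg(["mean", "sum"])
df["revenue_z"] = (df["revenue"] - df["revenue"].mean()) / np.std(df["revenue"])
print(summary)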
TECHNICAL SKILLS:
Data Analysis Tools: Microsoft Excel, R, Python (pandas, NumPy), QlikView, Tableau
Database Management: Microsoft SQL Server, Azure SQL Database, Azure Cosmos DB, ADLS
Big Data Technologies: Apache Hadoop, Apache Spark, Databricks, Apache Hive
Data Streaming and ETL: Apache Kafka, StreamSets, Talend, Azure Data Factory
Cloud Technologies: Azure (SQL Database, Cosmos DB, HDInsight, Key Vault, Active Directory, DevOps)
Programming: Python, SQL
DevOps and Infrastructure: Docker, Terraform, Agile methodologies

PROFESSIONAL EXPERIENCE:

Client: United Health Care, Edina, MN    Nov 2022 to Present
Role: Data Engineer
Roles & Responsibilities:
- Developed complex data pipelines using Apache Airflow, optimizing the automation and efficiency of data workflows (see the pipeline sketch at the end of this section).
- Integrated secure data handling solutions with Azure Active Directory and Azure Key Vault to maintain data integrity and security compliance.
- Utilized Apache Kafka and StreamSets to establish robust data streaming processes, enhancing real-time analytics capabilities.
- Leveraged Databricks for advanced data processing and analytics, facilitating complex computations and machine learning projects.
- Managed the Azure DevOps environment to streamline agile project management and ensure continuous integration and delivery of data solutions.
- Optimized the management and storage of large datasets using Azure SQL Database and ADLS, improving performance and scalability.
- Deployed and managed machine learning models using Azure ML, improving the accuracy and efficiency of predictive analytics.
- Secured data transfers and storage through careful configuration of Azure Active Directory, reinforcing data privacy and access control.
- Designed and executed data streaming architectures using StreamSets, enabling the capture and processing of live data feeds.
- Configured and maintained Apache Kafka to manage data ingestion and processing, supporting real-time decision-making.
- Utilized Python to create and maintain ETL scripts, enhancing data manipulation and integration capabilities.
- Employed SQL for data querying and management tasks, optimizing database performance and supporting complex data analysis.
- Ensured data security and compliance by managing secrets and encryption keys in Azure Key Vault.
- Orchestrated data pipeline workflows with Apache Airflow, automating routine tasks and reducing manual intervention.
- Managed projects using agile methodologies, coordinated through Azure DevOps, to maintain high standards of project delivery.
- Utilized Azure SQL Database for efficient data storage and retrieval, ensuring robust data availability for ongoing analytics needs.
- Enhanced data processing capabilities using Azure HDInsight, facilitating the management of big data workloads.
- Deployed Docker containers for application development, ensuring a consistent environment across testing and production.
- Streamlined data integration with Azure Data Factory, automating data flows and improving system interoperability.
- Monitored and secured user access and data transactions using Azure Active Directory, strengthening system security protocols.
- Configured and maintained Unity Catalog in Azure Databricks, ensuring data governance and a unified view across all datasets.
- Secured streaming and ETL processes with Azure Active Directory and Azure Key Vault, ensuring safe and efficient data management.
Environment: Apache Airflow, Azure Active Directory, Azure Key Vault, Apache Kafka, StreamSets, Databricks, Azure DevOps, Azure SQL Database, Azure Data Lake Storage (ADLS), Azure Machine Learning (ML), Python, SQL, Azure HDInsight, Docker, and Azure Data Factory.
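As an illustration of the Airflow-orchestrated pipelines described in this role, here is a minimal sketch; the DAG id, schedule, and helper callables are hypothetical placeholders rather than the production pipeline:

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract_records():
    # Placeholder: pull raw records from a source system.
    return [{"id": 1, "amount": 120.0}]

def load_to_adls(**context):
    # Placeholder: write the extracted records to ADLS.
    records = context["ti"].xcom_pull(task_ids="extract_records")
    print(f"Loading {len(records)} records")

with DAG(
    dag_id="daily_etl",                # hypothetical name
    start_date=datetime(2023, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_records", python_callable=extract_records)
    load = PythonOperator(task_id="load_to_adls", python_callable=load_to_adls)
    extract >> load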
Client: CSAA, Charlotte, NC    Jun 2020 to Jun 2022
Role: Big Data Engineer
Roles & Responsibilities:
- Engineered robust ETL pipelines using Talend and Azure Data Factory, significantly enhancing data integration and automation.
- Managed large-scale data warehousing solutions with Azure SQL Data Warehouse, optimizing data storage and access for complex queries.
- Developed and maintained scalable data ingestion frameworks using Apache Hadoop and Apache Spark, increasing processing speed and capacity (see the sketch at the end of this section).
- Configured real-time data streams using Apache Kafka, facilitating efficient data flow and supporting analytics platforms.
- Orchestrated application deployments using Docker, ensuring consistency across development, testing, and production environments.
- Automated infrastructure setup and maintenance using Terraform, streamlining resource management across cloud platforms.
- Strengthened application and data environments by implementing Azure Cosmos DB, improving data availability and disaster recovery capabilities.
- Improved data handling and storage configurations using Azure Data Lake Storage (ADLS), enhancing data retrieval and scalability.
- Optimized data transformation and loading with Apache Spark, reducing latency and improving throughput for analytics.
- Utilized Python and SQL for complex data manipulation and analytics, driving insights and business intelligence solutions.
- Maintained and enhanced real-time and batch data processing systems, ensuring high availability and performance.
- Ensured compliance with industry security standards by integrating secure cloud services and protocols.
- Streamlined data pipeline management and monitoring using Azure DevOps tools, improving team productivity and project tracking.
- Configured Azure SQL Database for optimized data querying and management, supporting critical business operations.
- Developed APIs for data access using Python, ensuring seamless integration with other applications and services.
- Leveraged Docker to create reproducible development environments, simplifying collaboration and testing.
- Applied agile methodologies throughout the project lifecycle, enhancing team collaboration and timely delivery of solutions.
- Enabled data-driven decision-making by deploying analytics platforms on Databricks.
- Facilitated data migration projects, ensuring data integrity and minimizing downtime during transitions.
- Conducted performance tuning and optimization for data processes, ensuring efficient operations and resource utilization.

Environment: Talend, Azure Data Factory, Azure SQL Data Warehouse, Apache Hadoop, Apache Spark, Apache Kafka, Docker, Terraform, Azure Cosmos DB, Azure Data Lake Storage (ADLS), Python, SQL, Azure DevOps, Azure SQL Database, Databricks.
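A minimal PySpark sketch of the kind of batch transformation described in this role; the app name, paths, and column names are hypothetical:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("policy_etl").getOrCreate()  # hypothetical app name

# Read raw JSON, drop invalid rows, derive a partition column, write Parquet.
raw = spark.read.json("/landing/policies/")        # hypothetical input path
cleaned = (
    raw.filter(F.col("premium") > 0)
       .withColumn("year", F.year(F.to_date("effective_date")))
)
cleaned.write.mode("overwrite").partitionBy("year").parquet("/curated/policies/")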
Client: Tvisha Technologies, Hyderabad, India    May 2019 to May 2020
Role: Data Analyst
Roles & Responsibilities:
- Analyzed business requirements to develop and deploy effective data models using Microsoft Excel and R.
- Utilized Python, with libraries such as pandas and NumPy, to enhance data manipulation and analysis.
- Created dynamic visualizations and business intelligence solutions using QlikView and Tableau, improving data accessibility.
- Managed and optimized Microsoft SQL Server databases to ensure efficient data storage and rapid query execution.
- Leveraged Apache Hive to manage and analyze large datasets, improving scalability and data handling efficiency.
- Ensured version control and collaborative development using Git, maintaining code integrity across project phases.
- Documented analytics processes and solutions, aligning them with business objectives and stakeholder expectations.
- Developed advanced statistical models in R, providing insights into customer behaviors and market trends.
- Enhanced data analysis capabilities by integrating Python scripts into data workflows.
- Streamlined data reporting and insight generation using Tableau, facilitating strategic decision-making.
- Optimized SQL queries and database schemas to improve performance and data retrieval speeds.
- Collaborated with cross-functional teams to identify and resolve data discrepancies and quality issues.
- Conducted thorough data validation and cleansing, ensuring the accuracy and reliability of reports (see the sketch at the end of this section).
- Designed and implemented data dashboards that provided real-time insights into operational metrics.
- Trained team members on data analysis tools and best practices, enhancing team capabilities.
- Applied best practices in data security and compliance to safeguard sensitive information.
- Contributed to team meetings and strategy sessions, providing data-driven recommendations and reports.
- Monitored and evaluated new data management technologies and tools, recommending implementations to improve processes.

Environment: Microsoft Excel, R, Python (pandas, NumPy), QlikView, Tableau, Microsoft SQL Server, Apache Hive, and Git.
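A small sketch of the kind of validation and cleansing step mentioned in this role; the file name, columns, and rules are hypothetical:

import pandas as pd

# Hypothetical raw extract; the file name and schema are illustrative.
orders = pd.read_csv("orders.csv")

# Basic cleansing: drop exact duplicates, coerce types, quarantine bad rows.
orders = orders.drop_duplicates()
orders["order_date"] = pd.to_datetime(orders["order_date"], errors="coerce")
invalid = orders[orders["customer_id"].isna() | orders["order_date"].isna()]
clean = orders.drop(invalid.index)
print(f"{len(invalid)} invalid rows quarantined, {len(clean)} rows kept")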
Education:
Master of Science in Computer Science, Southeast Missouri State University, USA (2022-24)