Candidate's Name
Data Engineer | Transforming Data Into Insights
Secaucus, NJ 07305
PHONE NUMBER AVAILABLE
EMAIL AVAILABLE
With 9 years of experience in the tech industry, I have driven technological
innovation at several startups, playing a pivotal role in conceiving and
building products from the ground up. In programming languages and frameworks,
I work with advanced SQL techniques, Python for scripting and ETL, and
Java/Scala, particularly alongside Apache Kafka and Apache Spark. My data
processing experience covers Apache Spark in depth, real-time processing with
Apache Flink, and the broader Hadoop ecosystem. On the database and storage
side, I have worked with relational systems such as PostgreSQL and MySQL,
NoSQL platforms such as MongoDB, columnar storage formats such as Parquet, and
time-series databases such as InfluxDB. I have built on data warehousing
platforms including Amazon Redshift and Snowflake, and moved data with ETL
tools such as Apache NiFi and Apache Kafka. My data design and architecture
work has helped startups apply effective modeling techniques, from star and
snowflake schemas to sound normalization and denormalization strategies,
backed by cloud experience across AWS, Google Cloud, and Azure and by
infrastructure automation with Terraform, CloudFormation, and Pulumi.
LinkedIn
LINKEDIN LINK AVAILABLE
Skills
Traditional RDBMS like PostgreSQL, MySQL, Oracle, etc.
NoSQL databases like Cassandra, MongoDB, DynamoDB, etc.
Columnar storage systems like Parquet and ORC
Time-series databases like InfluxDB or TimescaleDB
Data warehousing solutions like Amazon Redshift, Snowflake, Google BigQuery,
and Azure Synapse Analytics
SQL: Mastery of advanced SQL concepts, window functions, stored procedures, etc.
Python: Extensive use for scripting, data manipulation, ETL tasks, etc.
Java/Scala: Used extensively alongside tools like Apache Kafka and Apache Spark
Apache Spark: Mastery of Spark Core, Spark SQL, and streaming capabilities
Apache Flink: For real-time data processing
Hadoop Ecosystem: Deep understanding of MapReduce, Hive, HBase, and
other Hadoop technologies
Data Modeling & Architecting
Star Schema and Snowflake Schema for data warehouse modeling
Normalization and denormalization techniques
Data lakes and data lakehouse architectures
Amazon Web Services: S3, EC2, EMR, Glue, Lambda, etc.
Google Cloud Platform: BigQuery, Dataflow, Pub/Sub, etc.
Microsoft Azure: Azure Data Factory, Azure Blob Storage, HDInsight, etc.
Tools like Terraform, CloudFormation, or Pulumi for infrastructure automation
Apache Kafka: For event streaming and real-time analytics
ETL tools like Talend, Informatica, Microsoft SSIS, etc.
Work History
2021-08 - Current Principal Data Integration Engineer
Stealth AI Startup (Stich Vision), Jersey City, New Jersey
Pioneered a real-time inventory tracking system using Apache Kafka for
event streaming, ensuring live updates on stock availability (see the
consumer sketch after this role).
Developed predictive models in Python to forecast demand and optimize
reordering processes, reducing stock-outs by 20% (see the forecasting
sketch after this role).
Utilized NoSQL databases like MongoDB for managing vast product
catalogs, ensuring swift data retrievals and updates.
Leveraged Apache Spark for processing and analyzing supply chain data,
identifying bottlenecks and areas of improvement.
Designed an integrated data warehouse using Snowflake, consolidating
data from various supply chain touchpoints for holistic analytics.
Architected a waste tracking system using Java, categorizing and
monitoring waste in real-time to streamline recycling processes.
Integrated sensors and IoT data with Apache Flink for real-time processing,
enabling dynamic route optimization for waste collection trucks.
Deployed time-series databases like InfluxDB to track waste generation
patterns over time (see the InfluxDB sketch after this role).
Automated infrastructure scaling on AWS using Terraform, ensuring robustness
during peak data inflows.
Introduced a data lakehouse architecture, enhancing the flexibility and
scalability of waste data storage and analysis.
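A minimal sketch of the consumer side of the inventory tracking bullet in this
role, using the kafka-python client; the topic name, broker address, and event
schema are hypothetical, since they are not specified above.

    import json

    from kafka import KafkaConsumer  # kafka-python client

    consumer = KafkaConsumer(
        "inventory-events",                    # hypothetical topic name
        bootstrap_servers=["localhost:9092"],  # hypothetical broker address
        value_deserializer=lambda v: json.loads(v.decode("utf-8")),
        auto_offset_reset="latest",
    )

    stock = {}  # in-memory view of units on hand per SKU

    for event in consumer:        # blocks, yielding one record at a time
        update = event.value      # e.g. {"sku": "A123", "delta": -2}
        stock[update["sku"]] = stock.get(update["sku"], 0) + update["delta"]
        print(update["sku"], stock[update["sku"]], "units on hand")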
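The demand forecasting bullet reflects a standard lag-feature regression
setup. A minimal sketch with scikit-learn; the demand figures and the
three-day window are made up for illustration.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    daily_demand = np.array([12, 15, 14, 18, 20, 19, 23, 25, 24, 28],
                            dtype=float)

    LAGS = 3  # predict each day's demand from the previous 3 days
    X = np.array([daily_demand[i - LAGS:i]
                  for i in range(LAGS, len(daily_demand))])
    y = daily_demand[LAGS:]

    model = LinearRegression().fit(X, y)

    # Forecast the next day from the most recent window; in practice this
    # number would feed a reorder-point calculation.
    next_day = model.predict(daily_demand[-LAGS:].reshape(1, -1))[0]
    print(f"forecast: {next_day:.1f} units")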
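Recording waste-generation patterns in InfluxDB, per the time-series bullet
above, reduces to writing tagged points. A sketch assuming InfluxDB 2.x and
the official influxdb-client package; the URL, token, org, bucket, and field
names are hypothetical.

    from influxdb_client import InfluxDBClient, Point
    from influxdb_client.client.write_api import SYNCHRONOUS

    client = InfluxDBClient(url="http://localhost:8086",
                            token="TOKEN", org="ops")
    write_api = client.write_api(write_options=SYNCHRONOUS)

    point = (
        Point("waste_generated")    # measurement name (hypothetical)
        .tag("route", "route-7")    # which collection route produced it
        .field("kilograms", 142.5)  # observed waste mass
    )
    write_api.write(bucket="waste-metrics", record=point)
    client.close()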
2018-01 - 2021-01 Data Engineering Team Lead
Horizon Technologies (Addo AI), San Francisco, California
Conceptualized and developed a user health data platform, capturing
fitness metrics using Apache Kafka streams.
Implemented data lakes on Google Cloud Platform, storing diverse user data
like heart rates, step counts, and diet logs.
Leveraged Python scripts for ETL processes, cleaning and transforming
wearable device data for analysis (see the ETL sketch after this role).
Employed Apache Spark's machine learning libraries to create personalized
workout and diet plans.
Designed a responsive querying system using advanced SQL techniques,
providing instant insights into user fitness trends.
Optimized CRM databases, primarily using PostgreSQL, ensuring swift data
retrieval and efficient storage.
Integrated real-time customer interaction data using Apache Kafka,
enhancing the responsiveness of sales and support teams.
Employed Star and Snowflake Schemas to structure customer data,
facilitating faster report generation.
Streamlined customer segmentation using clustering techniques in Apache
Spark, enabling targeted marketing campaigns (see the clustering sketch
after this role).
Automated data integration pipelines using Apache NiFi, ensuring timely
synchronization of CRM data across platforms.
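A minimal sketch of the kind of Python ETL step described in the wearable-data
bullet above, using pandas; the file names, column names, and heart-rate
validity range are hypothetical.

    import pandas as pd

    raw = pd.read_csv("wearable_readings.csv")  # hypothetical extract

    clean = (
        raw.dropna(subset=["user_id", "timestamp"])   # drop unusable rows
           .assign(timestamp=lambda df: pd.to_datetime(df["timestamp"]))
           .query("30 <= heart_rate <= 220")          # discard sensor noise
           .drop_duplicates(subset=["user_id", "timestamp"])
    )

    # Aggregate to one row per user per hour for downstream analysis.
    hourly = (
        clean.set_index("timestamp")
             .groupby("user_id")
             .resample("1h")[["heart_rate", "steps"]]
             .mean()
    )
    hourly.to_parquet("wearable_hourly.parquet")  # needs pyarrow installed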
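A minimal sketch of the segmentation work above, using Spark MLlib's KMeans;
the feature columns, input path, and choice of k = 4 are hypothetical.

    from pyspark.sql import SparkSession
    from pyspark.ml.feature import VectorAssembler
    from pyspark.ml.clustering import KMeans

    spark = SparkSession.builder.appName("crm-segmentation").getOrCreate()

    customers = spark.read.parquet("s3://crm/customers/")  # hypothetical path

    # Pack the numeric columns into the single vector column MLlib expects.
    features = VectorAssembler(
        inputCols=["recency_days", "order_count", "lifetime_value"],
        outputCol="features",
    ).transform(customers)

    model = KMeans(k=4, seed=42, featuresCol="features").fit(features)
    segmented = model.transform(features)  # adds a 'prediction' segment column
    segmented.groupBy("prediction").count().show()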
2015-01 - 2018-01 Data Engineer
Mercurial Minds
Developed a comprehensive document management system, with real-time
indexing and retrieval capabilities powered by Apache Kafka and Apache
Flink.
Utilized columnar storage systems like Parquet for efficient storage and
retrieval of large documents (see the Parquet sketch after this role).
Automated document versioning and backup processes on AWS using
CloudFormation templates.
Designed a robust search engine using Python, enabling users to quickly find
documents based on content, metadata, and tags (see the index sketch after
this role).
Integrated OCR capabilities, transforming scanned documents into
searchable and editable formats.
Led the development of a collaborative platform, allowing real-time data
sharing and interaction using WebSockets and Apache Kafka.
Facilitated seamless integration of third-party tools using Python scripting for
ETL tasks.
Leveraged AWS services like Lambda and EC2 for on-demand scalability
during peak collaboration hours.
Engineered a data backup system on Azure Blob Storage, ensuring data
integrity and availability.
Introduced real-time analytics on collaboration patterns using Apache
Spark, providing insights to teams on productivity metrics.
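The Parquet bullet in this role hints at why columnar storage helps: readers
fetch only the columns they need. A minimal sketch with pyarrow; the schema
and file paths are hypothetical.

    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.table({
        "doc_id": [1, 2, 3],
        "title":  ["spec", "invoice", "contract"],
        "body":   ["...", "...", "..."],
        "tags":   [["eng"], ["finance"], ["legal"]],
    })

    pq.write_table(table, "documents.parquet", compression="snappy")

    # Column pruning: read two small columns without touching the large bodies.
    titles = pq.read_table("documents.parquet", columns=["doc_id", "title"])
    print(titles.to_pydict())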
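At its core, a content search engine like the one described in this role rests
on an inverted index mapping tokens to document ids. A minimal pure-Python
sketch with hypothetical documents and naive whitespace tokenization.

    from collections import defaultdict

    def build_index(docs: dict[int, str]) -> dict[str, set[int]]:
        """Map each token to the set of document ids containing it."""
        index: dict[str, set[int]] = defaultdict(set)
        for doc_id, text in docs.items():
            for token in text.lower().split():
                index[token].add(doc_id)
        return index

    def search(index: dict[str, set[int]], query: str) -> set[int]:
        """Return ids of documents containing every query token (AND search)."""
        hits = [index.get(token, set()) for token in query.lower().split()]
        return set.intersection(*hits) if hits else set()

    docs = {1: "quarterly waste report", 2: "waste route contract",
            3: "hr handbook"}
    index = build_index(docs)
    print(search(index, "waste report"))  # -> {1}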
Education
BS: Information Technology, Software Engineering - University of Management &
Technology