Candidate's Name
Email: EMAIL AVAILABLE
Web link: LINKEDIN LINK AVAILABLE
Phone: PHONE NUMBER AVAILABLE

Summary: Primary skills in data modeling and data science; also a good software designer, coder, and project leader.

SKILLS
languages: python (9 years), java, c, c++, plsql, tsql, presto sql
data science: LLM, tensorflow, inferential and bayesian statistics, mcmc, jags, vision ml, time series
data modeling: nosql, relational, star schema, graph data modeling
others: b, c, k, t shells, node.js, awk, perl, c#, pyqt, tableau, d3.js, matplotlib, oracle, mssql, postgresql, hive, spark, databricks, wsl2, docker, github, cuda, linux, jupyter notebooks

EXPERIENCE

Time series forecasting projects
Columbus, Ohio, 04/15/2024 - present
- ML-based forecasting of petroleum stock prices (in progress)
- Time series forecasting software with an algorithm that searches for time-differenced correlations with dynamic parameter adjustments (in progress)
- Some of the python libraries used in time series analysis: numpy, pandas, matplotlib, pickle, h5py, sklearn.cluster (KMeans, DBSCAN, AgglomerativeClustering), tslearn.clustering (TimeSeriesKMeans), sklearn.metrics (silhouette_samples, silhouette_score), dtaidistance (dtw), fastdtw, tslearn.metrics (cdist_dtw), tensorflow

Senior Data Scientist at Fractal Analytics, contract to C3.ai
Redwood City, California, 01/12/2024 - 04/15/2024
- Familiarized myself with the process scheduling optimization (pso) module, which is based on integer programming
- Upgraded the C3 AI platform in different environments (local, jupyter, docker, Linux in wsl2, C3 cloud)
- Used vm, linux in WSL2, docker, mac, github, python, node.js
- Participated in client engagements for AI projects

ML Architect at ZeroSight
San Jose, California, 02/2023 - 12/2023
- A POC project to investigate the best ways to go from video imaging to 3D modeling.
Some of the tech used:
- Used ML packages openpose, mediapipe, Yolov8, and Yolo-nas for object detection and pose estimation.
- Worked with Dockerfiles, pre-configured images, cmake, python virtual environments, Nvidia gpu, cuda, anaconda, vlc, ffmpeg, and streamlit.
- Used pretrained models (Yolov8) and did some training. Used cvat to prepare labels for training.
- Performed affine transformations for 3D modeling. Used quaternions, 3D transformation matrices, etc.
- Used Blender (python) for crafting 3D models and Unity scripting (C#) for controlling object movements.
- Did socket programming: UDP and TCP, with messages from python to python (Blender) and from python to C# (Unity).
- Did threading and async coding (C#) to decouple message passing from object control.
- Investigated data storage for storing and searching large amounts of imaging data, such as the vector database Milvus.

Data Scientist at Meta Platforms
Menlo Park, California, 12/2021 - 01/2023
- Analyzed online ads parameters to enhance ads effectiveness. Originally focused on the impact of bid cap, the project evolved into a high-dimensional optimization problem, analyzing various parameters collectively and individually. Used Presto SQL, Hive, Python, statistics, and notebooks. Identified lift, reach, impression, CTR, and OR across channels, objectives, verticals, and partner conditions. Presented my findings to marketing managers and the DS group.
- In marketing campaign result analysis, used Python, R, z-tests, t-tests, p-values, winsorization, and CUPED to statistically analyze campaign results, and communicated them to marketing managers in reports.
- Did machine learning with PyTorch, Hive, and notebooks to scope the target audience for marketing campaigns. Created labels, prepared data, designed and trained models, and identified responsive customers.

Senior Technical Staff at ComResource
Columbus, Ohio, 06/2019 - 05/2021
- Reported directly to the CTO. Did numerous POCs (proofs of concept) for clients.
- All projects were in python.
- Data mining (wrote code to analyze data): used the apriori algorithm to discover value patterns with 99% accuracy. The code can discover thousands of value patterns in hours, compared with the several months human staff would need.
- Product tier classification (wrote code to analyze data) in a large database for client Cargill.
- NLP (wrote code to analyze text): extracted patent citations with 97% accuracy from 50,000+ pdf documents from the US Patent Office. I proposed this approach to replace the original proposal of manual extraction by outsourced low-cost labor (it would have needed more than 10 people) and won this contract from client Chemical Abstract.
- Independently implemented a data search tool (a gui tool, 5k lines of python and pyqt), used to search a large client database.
- Independently implemented a graph data visualization tool (a gui tool, 3k lines of python and pyqt), used to search a neo4j graph database with interactive graphs and menu selections.
- Built an Azure chatbot as a demo project for the Microsoft local marketing office in Columbus, Ohio.
- Used and compared the gcp, aws, and azure cloud platforms and wrote a report on usability and learning curves for the major features relevant to the client.
- Data modeling: designed and built relational and graph database schemas, starting from a set of large (in number of data items as well as data volume) unstructured and uncleansed datasets.

Founder at Code Track
Columbus, Ohio, 03/2016 - 03/2019
- Created this company to develop automatic data lineage searching software that I designed. Because of the small funding received, I personally developed most of the code (15k lines of python).
- Used Tensorflow/Keras (ML) to build a component of the lineage search engine that captures user habits in the software.
- Designed and implemented an office file sharing program (similar to Slack, 3k lines of python code).
- Met with investors, gave presentations, and wrote business plans, implementation plans, and budget estimates.
- Used python, data mining libraries, Tensorflow, web crawler, Django, regex, web sockets, postgresql, pyqt gui framework, windows and ubuntu linux

Python Developer at Signet Accel
Columbus, Ohio, 04/15/2015 - 03/2016
- This was a small medical software startup aiming to consolidate all medical data into a common schema.
- Wrote a versatile data converter in python. Its input is data from different hospitals. The challenge was to correctly identify and categorize text values that express the same concepts in different medical codings.
- Familiarized myself with the OMOP (Observational Medical Outcomes Partnership) schema and many medical coding sets such as SNOMED, CPT-4, and ICD9.
- Used python, lucene, usagi, csvkit, sql20, postgresql, sql server, windows, linux

Developer at JP Morgan Chase Bank
Columbus, Ohio, 03/2013 - 04/2015
- In the financial compliance department, worked on modifying compliance code in response to regulation changes.
- Wrote a java program to archive data in a third-party storage format to comply with regulatory requirements. This was an 8-month project with 3 managers (source, general, and target) and 4 tech resources to collect and store data. I was the only conversion programmer converting the source data format to the target data format, and conversion was the only part of the project that required substantial coding effort.
- Investigated legacy financial compliance systems; separated and abstracted their main logic components for refactoring of the system.
- Studied and did a (visual programming) POC.
- Traced data flows from cobol/copybook files from ibm db2 (Bear Stearns, bought by JPMC in the 2008 financial crisis) to the target oracle system in an effort to link the two different systems.
- Traced all historical data linked to a high-profile Wall Street financial scandal.
- Did website security analysis and maintenance, such as SSL setup for the Chase web site.
- Used java, c++, junit, log4j, oracle, sql server, ibm db2, xml, ssl setup for https

Data Architect at the Research Institute at Nationwide Children's Hospital
Columbus, Ohio, 11/2011 - 03/2013
- Redesigned the data backend of the etrac system and implemented the new design. The data backend was buggy after many years of continuous modifications. The etrac system tracks every research project from funding proposal all the way to fund use. It tracks research funding for 200 medical professors, along with 800 scientists and assistants, from the medical school of Ohio State University.
- Used sql server, data modeling, etrac, erwin, olap cubes, star schema

Senior System Analyst at G2O
Columbus, Ohio, 5/2006 - 11/2011
- Worked on about 30 projects at various client companies and government agencies around Columbus. Most projects were database and data warehouse design, ETL (SQL or SSIS) implementation, and BI builds.
- Did many PO projects and wrote many technical proposals and assessments for client companies.
- Used sql server, oracle, postgresql, pentaho, Microsoft BI suite, google map, iWay (ETL), java, google map program

Data Architect at GE Aircraft Engines
Cincinnati, Ohio, 01/2005 - 5/2006
- Coordinated the integration of several data areas; each area has several dozen databases, and each database has dozens or hundreds of tables.
- This was done to support the merger of GE Aircraft Engines and GE Locomotives.

Founder at Ruiling Optics
Columbus, Ohio, 2003 - 2004
- Designed and built an ultra-fast (split-second) flatbed image scanning device, intended to replace the flatbed scanner.
- Applied for and obtained 3 US utility (invention) patents.
- Conducted research on imaging software and hardware.
- Did many fundraising activities (business plans, exec summaries, emails, presentations, meetings with investors).

Principal Engineer at Profitlogic
Cambridge, Massachusetts, 10/2001 - 10/2002
- Profitlogic was a software startup doing retail price setting. The JCPenney project ($15m, 6/2001-12/2002) was the biggest pricing project awarded to this company before it was sold to Oracle.
- I was in charge of the design and implementation of the whole data backend of this pricing software and led a team of 7 developers. It was a large data lake (several TBs) with relational and star schemas, holding data on about 3.5 million active SKUs across 1,020 stores.
- Used UNIX Shell Scripting, Toad, Oracle 8i, PL/SQL, Pro*C, and Erwin in the project.

Senior Engineer at Electron Economy
Cupertino, California, 5/2000 - 10/2001
- Electron Economy was a software startup funded with $86m. I was hired with no fixed role assignment.
- Proposed an online, rule-based intelligent business negotiation system 2 months after I joined the company.
- A division of 15 engineers (5 with PhDs and 10 with master's degrees) was formed based on this proposal.
- I helped organize the team and structured the project.
- When Electron Economy was sold to Viewlocity, the rule-based transaction engine was one of the two main software assets Electron Economy had (the other was a web-based online sales transaction engine).
- Designed and implemented java software for the project I proposed, along with other team members.

EDUCATION

The Ohio State University
- BS Computational Mathematics (error control in large numerical computational processes)
- MS Mathematics
- MS Computer and Info (CIS)
- PhD in Computer and Info (data mining and fast data retrieval indices)
- PhD candidacy in Mathematics in Combinatorics (graph theory, counting structures)

(Courses are 3-5 weeks each)

ML course certifications (21)
Course | Year | Contents | Institute
Getting Started with Tensorflow 2 | 2021 | CNN, RNN, LSTM, python | Imperial College London
Customizing your model with Tensorflow 2 | 2021 | CNN, RNN, LSTM, python | Imperial College London
Supervised Machine Learning Regression and Classification | 2023 | python, Tensorflow | DeepLearning.AI
Unsupervised Learning, Recommenders, Reinforcement Learning | 2023 | python, Tensorflow | DeepLearning.AI
Advanced Learning Algorithms | 2023 | Complex models, Tensorflow, python | DeepLearning.AI
Sequence Models | 2023 | RNN, LSTM, GRU, forecast, python | DeepLearning.AI
Deep Learning Neural Network with PyTorch | 2023 | PyTorch, python | IBM
Sequences, Time Series and Predictions | 2023 | RNN, LSTM, ARIMA, LLM, python | DeepLearning.AI
Transformer Models and BERT Model | 2023 | Transformer, BERT model, python | Google Cloud
Google Cloud Big Data and Machine Learning | 2023 | GCP | Google Cloud
How Google Does Machine Learning | 2023 | GCP | Google Cloud
Intro to Vertex AI Studio | 2023 | Vertex AI Studio | Google Cloud
TensorFlow on Google Cloud | 2023 | GCP, Tensorflow, python | Google Cloud
Feature Engineering | 2023 | GCP, DataPrep, DataFlow, BigQuery | Google Cloud
Launching into Machine Learning | 2023 | GCP, Vertex AI, python | Google Cloud
Natural Language Processing on Google Cloud | 2023 | GCP, LLM, python | Google Cloud
MLOps Started | 2023 | GCP, MLOps | Google Cloud
Machine Learning in the Enterprise | 2023 | GCP, MLOps | Google Cloud
Production Machine Learning Systems | 2023 | GCP, MLOps | Google Cloud
Distributed Computing with Spark SQL | 2023 | AWS, databricks, Spark SQL | UC Davis
Generative AI with Large Language Models | 2024 | Generative AI, Transformer, LLM | DeepLearning.AI

Statistics course certifications (8)
Course | Year | Contents | Institute
Practical Time Series Analysis | 2021 | Statistics based, ARIMA, R | State Univ of New York
Bayesian Statistics: Techniques and Models | 2021 | Statistical Model, MCMC, jags, R | UC Santa Cruz
Inferential Statistics | 2023 | Inference, hypothesis tests, R | Duke University
Experimental Design Basics | 2024 | ARIMA, Latin square, block design, R | Arizona State Univ
R Programming | 2024 | R language | Johns Hopkins Univ
Getting Started with SAS Programming | 2024 | SAS language | SAS Institute
Probabilistic Graph Model 1: Representation | 2024 | Bayesian and Markov Networks | Stanford Univ
Probabilistic Graph Model 2: Inference | 2024 | Bayesian and Markov Networks | Stanford Univ

Data Visualization course certifications (4)
Course | Year | Contents | Institute
Creating Dashboards and Storytelling with Tableau | 2023 | Tableau | UC Davis
Visual Analysis with Tableau | 2023 | Tableau | UC Davis
Fundamentals of Visualization Tableau | 2023 | Tableau | UC Davis
Essential Design Principles for Tableau | 2023 | Tableau | UC Davis

CDMP professional certification
- Certified Data Management Professional at the Master Level (with Specialties in Data Management and Data Warehousing), 2013
- From DAMA (Data Management Association International) and ICCP (Institute for Certification of Computing Professionals)

Short projects and short courses certifications (5)
- Image Compression and Generation using Variational Autoencoders, 2021, Coursera
- Decision Tree and Random Forest Classification using Julia, 2023, Coursera
- ML with ChatGPT, Image Classification, 2023, ChatGPT assistance in ML image classification, Coursera
- Cypher Fundamentals, 2024, Cypher language in Neo4j database, Neo4j GraphAcademy
- Neo4j Fundamentals, 2024, Neo4j database, Neo4j GraphAcademy
- Graph Data Modeling Fundamentals, 2024, data modeling in Neo4j database, Neo4j GraphAcademy