I earned my master's degree in data science at Duke University. I currently work as a research assistant with the Computational Culture Lab (based at Stanford University and UC Berkeley), focusing mainly on how to measure organizational culture from empirical data. In summer 2020, I worked as a junior data scientist at Summery, a tech startup in Silicon Valley, helping match their charitable projects with their employees’ personalities. Before coming to Duke, I worked for four years as a people analyst at KEPCO (Korea Electric Power Corporation), the #1 global utility company, with 20,000 employees. My interests lie in building predictive models, deriving business insights from data (specifically organizational or HR data), and data visualization.
A few fun facts: I published my first book when I was 17. I used to hire data scientists as a recruiter, and one day I thought, "Why don't I become a data scientist instead of recruiting them?" So here I am, a data scientist with an HR background.
M.S. in Interdisciplinary Data Science, 2021
Duke University, NC, U.S.
B.A. in Business and Economics, 2014
Handong Global University, Republic of Korea
Python, R, SQL, HTML
Tableau, Hadoop, Power BI, Git, Docker, Scikit-learn, Keras
Developed a personalized course recommendation system to help future students plan their course load, ordering, and selection in support of academic success
Using historical enrollment and graduation data, we propose several tools to facilitate major and course selection for undergraduate students at Duke University.
Proposed a deep learning model, ResNet50, to identify COVID-19 from chest X-rays
Reproduced several state-of-the-art models (ResNet-18, Xception, InceptionResnet V1), assessed their performance, and identified their challenges and limitations
Developed a model to identify solar photovoltaic (PV) arrays in aerial imagery using a convolutional neural network
As an alumnus of Handong Global University (HGU), I gave a lecture on people analytics to undergraduate students.
Used a hierarchical logistic regression model to find the main factors driving men's and women's decisions in speed dating.
Analyzed the effect of job training on disadvantaged workers in the US using a regression model
Conducted a pre-post analysis and a difference-in-differences analysis comparing opioid shipments and overdose deaths (one report for data scientists and the other for policymakers)
Analyzed the effect of social distancing on the number of new COVID-19 cases per day and the mortality rate using three merged data sources: FIPS, Census, and SafeGraph
Analyzed Twitter, Google Trends, and SafeGraph data to assess the impact of social distancing given the dates of stay-at-home orders