With experience in software development and a research degree in software engineering, I enjoy mining software repositories for insights that help developers improve their productivity and achieve their goals.
Engineering areas I am passionate about:
- Leveraging data to produce actionable insights for important decisions.
- Building scalable, distributed, low-latency applications and pipelines that can handle large amounts of data in real time.
Java
Python
R
JavaScript
SQL
HTML/CSS
Shell scripting
CI/CD
Other tech stacks
Frameworks and libraries: Spring Framework, D3.js, Bootstrap, Scrapy.
Data mining techniques: web crawling, feature engineering, machine learning, statistical analysis, natural language processing.
Certificates
AWS Fundamentals: Going Cloud-Native (AWS)
Scalable Machine Learning on Big Data using Apache Spark (IBM)
Distributed Computing with Spark SQL (UC Davis)
Managing Big Data in Clusters and Cloud Storage (Cloudera)
Work Experience
SENIOR SOFTWARE ENGINEER - BACKEND
EBAY
JULY 2021 - PRESENT
TORONTO, CANADA
Led several high-impact features end to end, from design, scoping, and planning through execution, shipping, monitoring, and post-release support.
Designed and developed API contracts with Swagger and OpenAPI for integrations between microservices.
Built new features and optimized large-scale distributed services that serve millions of client requests across the web and native platforms.
Developed experiments to A/B test new features and designs to help product teams arrive at final decisions.
Mentored and led other engineers in software development processes and best practices.
Tech: Java, Spring Framework, Jenkins, Kibana, IntelliJ IDEA, JUnit 4.
SOFTWARE DEVELOPER
FIIX SOFTWARE
NOVEMBER 2020 - JULY 2021
TORONTO, CANADA
Developed scalable reactive Spring microservices, using Spring WebFlux and Reactor Core for non-blocking asynchronous code and integrating with the Axon Framework, following the Command and Query Responsibility Segregation (CQRS) and Event Sourcing patterns.
Built a Redis-backed rate limiter into the API gateways and designed database schemas to control request limits and concurrent API calls.
Built API authentication and authorization using Spring Security.
Tech: Java, PostgreSQL, MySQL, AWS CodeBuild, AWS CodePipeline.
GRADUATE RESEARCH ASSISTANT
ANALYTICS OF SOFTWARE GAMES AND REPOSITORY DATA (ASGAARD) LAB
SEPTEMBER 2019 - SEPTEMBER 2020
EDMONTON, CANADA
Published a paper on an empirical study of software metadata from an online distribution platform. The paper was accepted to the International Conference on the Foundations of Digital Games (FDG) 2020. A preprint is available.
Published a paper on indie game recommendations using game data mined from online distribution platforms. The paper won a best paper award at the International Conference on the Foundations of Digital Games (FDG) 2021. A preprint is available.
Developed several Python web crawlers (using Scrapy) to mine software repositories and platforms such as GitHub (via its API), GitLab, Devpost, Steam, and itch.io.
Performed statistical analysis using several data mining techniques (data cleaning, feature selection, machine learning modelling, visualization).
Tech: R, Python, SQL, pandas, NumPy.
Developed natural language processing scripts for topic modeling (using Latent Dirichlet Allocation) and a content-based recommendation system (using the Rapid Automatic Keyword Extraction (RAKE) algorithm).
Tech: R, Python, SQL, pandas.
Built a website to demonstrate a recommendation system and collect feedback from users.
Tech: JavaScript, D3.js.
SOFTWARE DEVELOPER
CITIGROUP
JULY 2016 - JULY 2019
SINGAPORE
Developed several distributed RESTful Java microservices (with the Spring Framework), following test-driven development (TDD), to recalculate valuations and dividends for several classes of derivatives. Tech: IntelliJ IDEA, Java, Gradle, MySQL.
Automated the CI/CD process, covering testing, code analysis, integration, deployment, and artifact storage (using tools such as TeamCity, UrbanCode Deploy, SonarQube, and Artifactory), which eliminated ~80% of manual tasks, streamlined development, and enabled single-click deployment.
Containerized the microservices into Docker images for deployment on the OpenShift platform.
Automated periodic tasks (e.g., sending data, purging databases) using shell scripts and the AutoSys job scheduler, reducing manual effort by ~90%.
Revamped a monolithic web application into several web-fragment microservices, making components pluggable and easy to deploy or remove.
SOFTWARE DEVELOPER INTERN
HOLMUSK
JULY 2015 - AUGUST 2015 AND DECEMBER 2015 - JANUARY 2016
SINGAPORE
Developed a web crawler to retrieve millions of user records (e.g., workouts, meals, goals, profiles) from Fitbit and Jawbone fitness trackers and store them in a MongoDB database. This dataset was used for the company's machine learning pipeline.
Tech: Python, Scrapy.
Developed the GUI of a single-page web application designed for diabetes patients to track their meals.
Tech: D3.js, JavaScript.
Education
MASTER OF SCIENCE IN SOFTWARE ENGINEERING
UNIVERSITY OF ALBERTA
SEPTEMBER 2019 - SEPTEMBER 2020
GPA: 4.0/4.0
Relevant courses (grade): Data Analytics in Software Engineering (A+), Software Construction & Verification (A), Digital Image and Video Processing (A).
Authored 2 research papers on mining digital distribution platforms.
BACHELOR OF ENGINEERING (HONORS) IN COMPUTER ENGINEERING
NATIONAL UNIVERSITY OF SINGAPORE (NUS)
AUGUST 2012 - JUNE 2016 (4 YEARS)
GPA: 4.27/5.00
Awards:
ASEAN Undergraduate Scholarship (2012 - 2016): awarded to outstanding students from ASEAN countries.
Dean's List (Academic year 2012/13): awarded in recognition of academic excellence and placement in the top 5% of the cohort.
Adler Foundation scholarship (Jan - Jun 2015): awarded by Chalmers University of Technology for exchange study.
Selected Projects
Indie Games Recommendation
Indie games often lack visibility compared to top-selling games because of their limited marketing budgets and the sheer number of indie games available. Players of top-selling games usually like certain types of games or certain game elements, such as theme, gameplay, and storyline. Indie games could therefore leverage the game elements they share with top-selling games to get discovered.
Published as: Quang N. Vu, Cor-Paul Bezemer, “Improving the Discoverability of Indie Games by Leveraging their Similarity to Top-Selling Games: Identifying Important Requirements of a Recommender System,” in International Conference on the Foundations of Digital Games (FDG), 2021
A topic modeling approach using Latent Dirichlet Allocation (LDA) and several natural language processing techniques to identify the challenges, lessons learned, and future goals described by the participants of hackathons.
Tech: Python, Scrapy, pandas, MySQL, gensim for LDA.
An Empirical Study of the Characteristics of Popular Game Jams and Their High-ranking Submissions on itch.io
Game jams are hackathon-like events in which participants develop a playable game prototype within a time limit. A high-ranking game is a great addition to a beginning game developer's résumé and helps their pursuit of a career in the game industry. However, participants must work within the time constraints set by jam hosts while deciding which aspects of their games to emphasize for the best chance of winning. In this project, an empirical analysis of 1,290 past game jams and their 3,752 submissions on itch.io was conducted to better understand what makes jams popular and what makes high-ranking games well received by the audience.
Tech: Python, Scrapy, pandas, MySQL.
Published as: Quang N. Vu, Cor-Paul Bezemer, “An Empirical Study of the Characteristics of Popular Game Jams and Their High-ranking Submissions on itch.io,” in International Conference on the Foundations of Digital Games (FDG), 2020