IT Jobs Analysis (Hong Kong IT Job Advertisement Data Mining Report)

Introduction

This analysis aims to help IT practitioners, students and teachers understand the manpower needs of the IT industry in Hong Kong. Students often want to know which technical knowledge and skills are essential for finding a job, e.g. PHP or ASP.NET? Android or iOS? This project applies data science methods to investigate keywords and extract hidden information from 192,000 IT job advertisements. The results can help you make better decisions about further study and career development. "Understand what the IT industry really requires and where you stand, and avoid taking the wrong path!"

Report contents:

IT Job Skill Index: keyword count summary and simple correlation analysis with Higher Diploma. More detailed analysis for a keyword: Trend Analysis, Prediction Tree, Social Network Terms Analysis, Geochart.

Keyword Density Visualization: IT Term Matrix Social Network, Term Frequency Word Cloud.

IT Term Matrix Social Network: analyses keyword relations with social network analysis techniques; communities (groups) are extracted. Analytic analogy: each keyword is a "person", represented by a node whose size reflects its occurrence frequency. A job advertisement is an "event", so "keywords appear in a job advertisement" corresponds to "people join an event". People who often join events together may be "friends" or otherwise related, which is reflected by an edge.

IT Job Skill Cube: transforms variables into principal components and shows them in an interactive 3D plot to explore their relations.

Job Advertisement Cluster Analysis: clusters the job advertisements by keywords.

Technology:

Apache Lucene and Cloudera Hadoop - ETL process

Amazon AWS - EMR, S3, Route 53.

R - data mining and statistics

This project was jointly developed by Mr. Cyrus Wong, Data Scientist of the Cloud Innovation Centre, IVE (Lee Wai Lee). Thanks to our sponsors: JobsDB.com, AWS, Cloudera (Lively Impact), and the IVE IT discipline.