Data science is a tool for uncovering hidden facts in data. We want to find out which factors, such as ‘AGE’, ‘TAX’, ‘PUPIL-TEACHER RATIO’, and ‘PER-CAPITA INCOME’, contribute the most to housing prices. To answer this question, we studied the “Boston House Prices” dataset. By applying Lasso regression (a data mining technique) to this dataset, we identified the influential factors in the linear model. We found that six inputs contributed the most to house prices: (i) CRIM: per capita crime rate by town, (ii) ZN: proportion of residential land zoned for lots over 25,000 sq. ft., (iii) CHAS: Charles River dummy variable, (iv) RM: average number of rooms per dwelling, (v) Black: proportion of Black residents by town, and (vi) LSTAT: lower status of the population.
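To illustrate how Lasso regression can single out influential inputs, here is a minimal sketch in plain Python. It is not the authors' code and does not use the Boston data: the feature names `f0` to `f5`, the synthetic target, the penalty `lam=0.2`, and the helper functions are all assumptions made for illustration. The sketch fits a linear model by cyclic coordinate descent with soft-thresholding, which drives the coefficients of uninformative features to exactly zero.

```python
import random


def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values in [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def standardize(values):
    """Rescale a list of numbers to zero mean and unit variance."""
    n = len(values)
    mean = sum(values) / n
    std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / std for v in values]


def lasso_coordinate_descent(columns, y, lam, n_sweeps=100):
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1.

    `columns` holds one list per feature (the columns of X).
    """
    n = len(y)
    w = [0.0] * len(columns)
    residual = list(y)  # equals y - Xw, and w starts at zero
    for _ in range(n_sweeps):
        for j, col in enumerate(columns):
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(c * (r + w[j] * c) for c, r in zip(col, residual)) / n
            scale = sum(c * c for c in col) / n
            new_w = soft_threshold(rho, lam) / scale
            delta = new_w - w[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                w[j] = new_w
    return w


# Synthetic data (an assumption for illustration): only f0 and f1 drive the target.
rng = random.Random(0)
n_samples = 300
names = ["f0", "f1", "f2", "f3", "f4", "f5"]
raw = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in names]
target = [
    3.0 * raw[0][i] - 2.0 * raw[1][i] + rng.gauss(0.0, 0.3)
    for i in range(n_samples)
]

# Standardize the features and center the target so no intercept is needed.
columns = [standardize(col) for col in raw]
mean_y = sum(target) / n_samples
y_centered = [t - mean_y for t in target]

weights = lasso_coordinate_descent(columns, y_centered, lam=0.2)
selected = [name for name, w in zip(names, weights) if w != 0.0]
print("Features with non-zero coefficients:", selected)
```

Features whose coefficients stay non-zero under the L1 penalty are the ones the model treats as influential. A larger `lam` keeps fewer features, so in practice the penalty would typically be chosen by cross-validation rather than fixed by hand as it is here.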
Wu, Xiaoqing and Ahmad, Daanial, "Using data mining to identify the most influential factors in training results" (2020). CUNY Academic Works.