Data Warehousing for Business Intelligence – Data Mining
Analyse the Statlog (German Credit Data) data set (available from the UCI Machine Learning Repository – https://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29) to explore the different factors that affect a person's credit rating. Your aim is to identify which attribute or combination of attributes, and which algorithm, gives the highest accuracy in identifying adults who would be given a ‘good’ credit rating. Once you have completed this, write a report describing in detail the analyses you have performed.

Your report should include:
• A data set description in terms of the attributes present in the data, the number of instances, missing values, and other relevant characteristics.
• A detailed description of the pre-processing of the data.
• Evidence that you have investigated the data using multiple analysis methods.
• An explanation of the selected algorithm.
• A discussion of any pre- or post-processing done to improve the accuracy of your analysis.
• A business recommendation based upon your analysis.

The report should be no more than 2500 words long and should include such graphics as are appropriate to illustrate your answers.
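As a possible starting point for the data set description, the sketch below counts instances, attributes, missing values and class labels. It is only an illustration under stated assumptions: each record is taken to be one whitespace-separated line whose last field is the class label (1 = good, 2 = bad), the function name `summarise` and the missing-value markers `?` and `NA` are hypothetical, and the three sample records are made up and shortened, not taken from the real file.

```python
from collections import Counter

# Assumption: the last field of each record is the class label,
# with 1 = good credit and 2 = bad credit.
LABELS = {"1": "good", "2": "bad"}

# Assumption: hypothetical markers for missing values; adjust to the real file.
MISSING_MARKERS = {"?", "NA"}


def summarise(lines):
    """Return basic descriptive counts for whitespace-separated records."""
    rows = [line.split() for line in lines if line.strip()]
    n_attributes = len(rows[0]) - 1 if rows else 0

    # Count missing markers across attribute columns only (label excluded).
    missing = sum(
        value in MISSING_MARKERS for row in rows for value in row[:-1]
    )

    # A column is treated as numeric if every value is a non-negative integer.
    numeric = sum(
        all(row[i].isdigit() for row in rows) for i in range(n_attributes)
    )

    classes = Counter(LABELS.get(row[-1], "unknown") for row in rows)

    return {
        "instances": len(rows),
        "attributes": n_attributes,
        "numeric_attributes": numeric,
        "categorical_attributes": n_attributes - numeric,
        "missing_values": missing,
        "classes": dict(classes),
    }


# Made-up, shortened sample records for illustration only.
SAMPLE = [
    "A11 6 A34 1169 1",
    "A12 48 A32 5951 2",
    "A14 12 ? 2096 1",
]

summary = summarise(SAMPLE)
for key, value in summary.items():
    print(f"{key}: {value}")
```

To apply the same function to the downloaded data, pass an open file object in place of `SAMPLE`; the file name and location depend on how the data set was saved locally.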




