University of Maryland University College
DATA 610 – Decision Management Systems
Assignment 2 – Exploratory Data Analysis (EDA) using Cognos Analytics
Deadline: Last day of week 5, 11:59 pm Eastern Time
Submission via LEO.
This is an individual assignment. Each student will complete the assignment outlined below and post his/her written results to the appropriate assignment. Please note that only one document may be submitted. See the Content section below.
Submitted assignments will be graded for (a) content, (b) document quality (i.e., formatting, following guidelines, pleasant to read, etc.), and (c) timeliness of submission. Late assignments will lose 5 points for each day they are late.
1. For this assignment, you will use the Nashville housing data. Read the data source description at https://www.kaggle.com/tmthyjames/nashville-housing-data.
2. Provide a brief description of the dataset in your paper. Discuss the number of cases, description of the inputs, description of the variables that could be used to develop predictive models, etc. Which variables are missing data?
Using Cognos Analytics:
3. Data preparation
a) Create Data Module
o Login to your Cognos Analytics account
o Copy the dataset from the Team Content -> Data for DATA 610 Assignments -> Assignment 2 folder (Figure 1) into the My Content area.
Figure 1: Nashville Housing Data location
o Navigate to My Content and make sure that you see a copy of the Nashville housing data.
o Select Create data module from the action menu (Figure 2).
Figure 2: Create data module
o The new data module will open. Click the arrow next to the Save icon (Figure 3) and select Save as to save the new data module to My Content.
Figure 3: Save Data module
o Click on Nashville housing data csv in the left panel (Figure 4) to preview the data.
Figure 4: Preview data
b) Handle Missing data
Cognos Analytics represents missing data as Null.
Apply the approaches in the assigned readings to handle missing data, including filtering rows, removing columns, and replacing the missing value with the average. Discuss the approaches you used in your paper.
c) Handle irrelevant variables – Which variables in the data are irrelevant? Explain why they are irrelevant and how you handle them.
Make sure to save the data module before moving on to the next part of the assignment.
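The three missing-data approaches named in step 3 b) (filtering rows, removing columns, and replacing a missing value with the average) can be sketched outside Cognos to make the logic concrete. The Python below is only an illustration under stated assumptions: the rows, the field names (sale_price, acreage, notes), and the 50% threshold are hypothetical and are not taken from the Nashville dataset. The assignment itself must still be completed in Cognos Analytics.

```python
from statistics import mean

# Hypothetical rows; None stands in for a Null value. Field names and
# values are illustrative only, not taken from the Nashville dataset.
rows = [
    {"sale_price": 200000, "acreage": 0.25, "notes": None},
    {"sale_price": None,   "acreage": 0.50, "notes": None},
    {"sale_price": 350000, "acreage": None, "notes": None},
    {"sale_price": 150000, "acreage": 0.75, "notes": "corner lot"},
]

def drop_rows_missing(rows, field):
    """Filter out rows where `field` is missing."""
    return [r for r in rows if r[field] is not None]

def drop_sparse_columns(rows, max_missing_share):
    """Remove columns whose share of missing values exceeds the threshold."""
    n = len(rows)
    keep = [
        col for col in rows[0]
        if sum(r[col] is None for r in rows) / n <= max_missing_share
    ]
    return [{col: r[col] for col in keep} for r in rows]

def fill_with_mean(rows, field):
    """Replace missing values in a numeric field with the mean of the observed values."""
    avg = mean(r[field] for r in rows if r[field] is not None)
    return [{**r, field: avg if r[field] is None else r[field]} for r in rows]

# Apply the three approaches in sequence.
cleaned = drop_sparse_columns(rows, max_missing_share=0.5)  # "notes" is 75% missing
cleaned = drop_rows_missing(cleaned, "sale_price")          # drop rows without a price
cleaned = fill_with_mean(cleaned, "acreage")                # impute remaining acreage gaps
```

The order of operations matters: removing rows before imputing changes which observed values contribute to the average, so it is worth stating the order you used when you discuss your approach in the paper.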
4. Explore the dataset:
Return to the Welcome page and expand My Content.
Select Create exploration from the action menu of the Nashville housing data module (Figure 5).
Figure 5: Create Exploration
a) Are there outliers? If so, which fields have outliers, and what solutions do you recommend?
b) Pose an initial set of questions to use for data exploration. Provide any insights gained from using Cognos Analytics with this dataset.
c) Develop new, more specific questions whose answers provide additional insights into the dataset. Discuss how these insights could be useful. Discuss how you would improve the relevancy.
d) Develop and explain at least five different visualizations. Experiment with the available options and summarize the results. Provide insights into what the visualizations show.
e) Utilize features (e.g., filters, comparisons) with the visualizations to uncover and explain interesting aspects of the data set.
f) Create and explain at least one insightful calculation. Discuss why this would be useful.
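For the outlier question in step 4, one common rule of thumb (not prescribed by this assignment) flags values that lie more than 1.5 interquartile ranges below the first quartile or above the third quartile. The Python sketch below illustrates that rule on made-up sale prices. The function name iqr_outliers, the multiplier k, and the sample values are assumptions for illustration only. In Cognos Analytics you would typically spot the same points visually, for example in a box plot or scatter plot.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points (default 'exclusive' method)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical sale prices; the last one is deliberately extreme.
prices = [120000, 135000, 150000, 160000, 175000, 190000, 210000, 2500000]
flagged = iqr_outliers(prices)
print(flagged)
```

A flagged value is not automatically an error. Whether to keep, cap, or exclude it is a judgment that should be explained in your recommendation.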
Each student will submit a single document conforming to the guidelines and standards outlined below.
§ Limited to 8 pages (excluding title page, references, and appendix).
§ Double-spaced, 12-point Times New Roman font, 1” margins, bottom-right page numbering.
Note: Submitted report must be either in MS Word or PDF format and titled: “Assignment2_LastName”.
Only one document will be allowed to be submitted.
Content (note that the document must have clearly marked sections for the items listed below)
1) Title page (1 page limit): course number and term, assignment number and project title, student name and contact information, instructor’s name. Format it so it looks pleasant and presentable. Follow formatting guidelines above.
2) Introduction. Provide a brief outline of the dataset you are using for this assignment. Briefly describe the content of the data. Include a screenshot of the data (a partial view is fine, as long as all relevant variables are visible).
3) Data preparation – handling missing data, removing irrelevant variables
4) Data exploration process. Explain and discuss what data exploration you performed (e.g., questions generated about the data set content). Include any specific ideas or suggestions as to how this could be used in your organization.
5) Visualizations created. Explain the visualizations created. Include the value-added aspects of the visualizations. Creative aspects can increase the potential for a higher assignment grade.
6) Calculation which adds insights or value to the data set. Include and explain the value of the calculation, i.e., insights provided by the calculation.
7) References (1 page limit): List all references used in preparing this report, in APA format. It is strongly recommended to use outside knowledge in setting up the analysis or discussing the results where possible.
8) Appendix (4 page limit):
a) Appendix A: Include any appropriate workbooks and/or screenshots (figures, tables, diagrams) used in this assignment. Make sure all tables, figures, or diagrams are properly numbered and titled. For example, “Table 1. Model Results”. Make sure all tables or figures or diagrams are easily readable and visually presentable.
§ Assignments that: 1) adequately address all required tasks; 2) are submitted on time; 3) are properly formatted (APA format for references, no typos or misspelled words, no grammar errors, cover page, etc.) will receive a grade of B (80-89, depending on content).
§ To increase (but not guarantee) your chances of receiving a higher grade, you need to show clear evidence of critical thinking. Critical thinking can take many forms, depending on the type of assignment. In some instances, showing greater depth (e.g., creating more models, or examining more than one insightful fact or relationship and comparing them on key criteria) is one way to provide evidence of critical thinking. In other cases, it might mean explaining the pros and cons of the approach used, or the arguments for and against the proposal, along with some criteria for choosing among the alternatives. Still another example would be providing significant insights into how the assignment outcome would benefit (or meet resistance in) your organization and what steps might be employed to facilitate acceptance. This is certainly not a complete list, but it gives some examples of critical thinking.