You are to write a position paper that addresses the question: How can we do a better job of using data to accurately represent the world around us? What do you think scientists, researchers, the media, etc. should be doing to present a more accurate picture of the truth?
In this paper, you can use Spiegelhalter and other sources to assess how data is being used and presented today. Then you should complete a critical analysis of some of the challenges with data that we have covered in this class. Lastly, and this is the most important part of the paper, you should include three recommendations regarding how data should be presented. Those recommendations must be supported by outside sources that are properly cited in your paper.
This paper is your chance to summarize what you have learned about how data is used or misused and what you feel should be done about it.
Module 2 Spreadsheet Review We learned several tips and tricks for MS Excel and downloaded Solver as an Add-In.
Module 3 Correlations and Visualizations in Excel In this module, we looked at two things. The first was correlation, which allows us to assess the strength and direction of the relationship between two variables. If one variable tends to increase when the other increases, the two are positively correlated and have a correlation coefficient between 0 and 1, approaching 1 as the relationship gets stronger. If one variable tends to decrease when the other increases, they are negatively correlated and the correlation coefficient falls between 0 and -1. If changes in one variable show no consistent relationship with changes in the other, the two are said to be uncorrelated and have a correlation coefficient close to 0.
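The three cases described above can be checked with a short calculation. The following is a minimal Python sketch of the Pearson correlation coefficient, which is the same quantity Excel's CORREL function reports. The data values and variable names are made up for illustration and are not from the class exercises.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (numerator) and sums of squared deviations
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical data, chosen only to show the three cases
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # moves with x: r close to 1
falling = [10, 8, 6, 4, 2]   # moves against x: r close to -1
unrelated = [2, 1, 3, 1, 2]  # no consistent pattern: r close to 0

for label, y in [("rising", rising), ("falling", falling), ("unrelated", unrelated)]:
    print(label, round(pearson_r(x, y), 3))
```

Note that a coefficient near 0 only rules out a straight-line relationship. Two variables can still be related in a curved pattern and show almost no correlation.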
Module 4 Optimization In this module, we considered how to get the maximum benefit in a production scenario with limited resources and other constraints. In our example, we produced guns and butter and used Solver to maximize our revenue by determining how much of each product to produce within the limits imposed on us.
Module 5 Optimization II In this module, we again used Solver, this time to maximize sales of pairs of skis for Winterfell Ski Company.
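To make the idea of maximizing revenue under constraints concrete, here is a minimal Python sketch. It is not how Solver works internally: it simply tries every whole-number combination and keeps the best feasible one, which is only practical for tiny problems. All prices, resource requirements, and limits below are hypothetical and are not the figures from the class example.

```python
def best_production_plan(price_guns, price_butter, constraints, max_units=100):
    """Brute-force search for the revenue-maximizing whole-number plan.

    constraints is a list of (use_per_gun, use_per_butter, available) tuples,
    one per limited resource. Returns (revenue, guns, butter).
    """
    best = (0, 0, 0)
    for guns in range(max_units + 1):
        for butter in range(max_units + 1):
            # A plan is feasible only if it stays within every resource limit
            feasible = all(
                per_gun * guns + per_butter * butter <= available
                for per_gun, per_butter, available in constraints
            )
            if not feasible:
                continue
            revenue = price_guns * guns + price_butter * butter
            if revenue > best[0]:
                best = (revenue, guns, butter)
    return best

# Hypothetical scenario: two limited resources
constraints = [
    (4, 1, 100),  # labor hours: 4 per gun, 1 per unit of butter, 100 available
    (2, 3, 90),   # materials:   2 per gun, 3 per unit of butter, 90 available
]
revenue, guns, butter = best_production_plan(150, 50, constraints)
print(f"Produce {guns} guns and {butter} butter for revenue {revenue}")
```

In this made-up scenario the best plan uses both resources fully. Producing only the higher-priced product leaves materials unused and earns less, which is the kind of trade-off Solver is designed to find.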
Module 6 Linear Regression We built our first predictive model that allowed us to predict Christmas sales for each customer given the information we had regarding their sales from January 1 to November 30. We learned how to tell if our predictive model was good or not.
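As an illustration of what a simple predictive model does, the sketch below fits a straight line by least squares and reports R squared, one common measure of how well a line fits the data (it may not be the only measure used in class). The customer figures are hypothetical, and the function names are chosen for this example.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(xs, ys, slope, intercept):
    """Share of the variation in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical customers: sales from January 1 to November 30, and Christmas sales
jan_to_nov = [120, 200, 310, 400, 520]
christmas = [30, 55, 80, 95, 135]

slope, intercept = fit_line(jan_to_nov, christmas)
print("R squared:", round(r_squared(jan_to_nov, christmas, slope, intercept), 3))

# Predict Christmas sales for a new customer with 350 in sales so far
print("Prediction:", round(intercept + slope * 350, 1))
```

An R squared close to 1 means the line explains most of the variation in the data. A value close to 0 means the predictions are little better than always guessing the average.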
Module 7 Time Series We analyzed past records of calls and checked them to see if there were trends or seasonality. We used MS Excel line charts to visualize the data. Using this information, we were able to make educated guesses about what the call numbers were likely to be for the remaining months of the year.
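One simple way to make a trend easier to see is to smooth the series with a moving average. This is a sketch of that general technique, not necessarily the method used in class, and the monthly call counts are made up for illustration.

```python
def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the number of values")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical monthly call counts, January through August
calls = [410, 395, 430, 460, 455, 490, 520, 510]

# A 3-month window smooths out month-to-month noise
smoothed = moving_average(calls, 3)
print([round(v, 1) for v in smoothed])
```

A larger window gives a smoother line but reacts more slowly to real changes, and it can hide seasonality if the window is as long as the seasonal cycle.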
Module 8 Comparing Means Using a t-test, we were able to compare two sets of numbers to see whether the mean of set 1 was significantly different from the mean of set 2. While the numeric means might differ, the difference could be due to variability or outliers in one or both groups rather than a real difference. We used the t-test function within Excel to generate a p-value that indicated whether the means of the two groups were significantly different. We also used the t-test from the Data Analysis Add-in menu to get more detail about the comparison, which let us see whether the difference was significant for both one-tailed and two-tailed tests. We also took a brief look at ANOVA, which is the test to use when comparing the means of more than two groups.
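The calculation behind a two-sample t-test can be sketched in a few lines of Python. This version computes the t statistic and degrees of freedom for Welch's t-test, which does not assume the two groups have equal variances, so it corresponds to only one of the t-test variants Excel offers. It stops short of the p-value, which requires the t distribution and is not in the Python standard library. The sample values are hypothetical.

```python
import math
from statistics import mean, variance

def welch_t(sample_1, sample_2):
    """Welch's t statistic and approximate degrees of freedom."""
    n1, n2 = len(sample_1), len(sample_2)
    # Sample variance of each group divided by its size
    se1 = variance(sample_1) / n1
    se2 = variance(sample_2) / n2
    t = (mean(sample_1) - mean(sample_2)) / math.sqrt(se1 + se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical measurements for two groups
group_1 = [23.1, 25.4, 22.8, 26.0, 24.3, 25.1]
group_2 = [27.9, 26.5, 29.2, 28.4, 27.0, 30.1]

t, df = welch_t(group_1, group_2)
print(f"t = {t:.2f}, degrees of freedom = {df:.1f}")
```

The further the t statistic is from 0, the less plausible it is that the gap between the means is due to chance alone. More spread within the groups, or fewer observations, pulls the statistic back toward 0 even when the means themselves differ.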