Exploratory Data Analysis Academic Essay

For each of the five variables, process, organize, present and summarize the data. Analyze each variable by itself using graphical and numerical techniques of summarization. Use MINITAB as much as possible, explaining what the printout tells you. You may wish to use some of the following graphs: stem-and-leaf diagram, frequency/relative frequency table, histogram, boxplot, dotplot, pie chart, bar graph. Caution: not all of these are appropriate for each of these variables, nor are they all necessary. More is not necessarily better. In addition, be sure to find the appropriate measures of central tendency, the measures of dispersion, and the shapes of the distributions (for the quantitative variables) for the above data. Where appropriate, use the five-number summary (the Min, Q1, Median, Q3, Max). Once again, use MINITAB as appropriate, and explain what the results mean.
Analyze the connections or relationships between the variables. There are ten possible pairings of two variables. Use graphical as well as numerical summary measures. Explain what you see. Be sure to consider all 10 pairings. Some variables show clear relationships, while others do not.
Prepare your report in Microsoft Word, integrating your graphs and tables with text explanations and interpretations. Be sure that you have graphical and numerical backup for your explanations and interpretations. Be selective in what you include in the report. I’m not looking for a 20-page report on every variable and every possible relationship (that’s 15 things to do).
In particular, what I want you to do is highlight what you see for three individual variables (no more than 1 graph for each, one or two measures of central tendency and variability (as appropriate), the shapes of the distributions for quantitative variables, and two or three sentences of interpretation). For the 10 pairings, identify and report on only three of the pairings, again using graphical and numerical summary (as appropriate), with interpretations. Please note that at least one of your pairings must include the qualitative variable and at least one of your pairings must not include the qualitative variable.
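As a cross-check outside MINITAB, the numerical summaries and the count of ten pairings can be sketched in Python. This is only an illustration under stated assumptions: the variable names and data values are hypothetical placeholders, and the quartile convention used by `statistics.quantiles` may not match the MINITAB printout exactly.

```python
import itertools
import math
import statistics


def describe(values):
    """Five-number summary plus mean, standard deviation and IQR."""
    data = sorted(values)
    # Default method is "exclusive"; other packages may interpolate differently.
    q1, median, q3 = statistics.quantiles(data, n=4)
    return {
        "min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1],
        "mean": statistics.mean(data),
        "stdev": statistics.stdev(data),  # sample standard deviation
        "iqr": q3 - q1,
    }


def pearson_r(x, y):
    """Sample correlation coefficient for two quantitative variables."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    ss_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    ss_xx = sum((a - x_bar) ** 2 for a in x)
    ss_yy = sum((b - y_bar) ** 2 for b in y)
    return ss_xy / math.sqrt(ss_xx * ss_yy)


# Hypothetical names: four quantitative variables and one qualitative variable.
variables = ["quant_1", "quant_2", "quant_3", "quant_4", "qual_1"]
pairings = list(itertools.combinations(variables, 2))

# Hypothetical sample values for a single quantitative variable.
sample = [2, 4, 4, 5, 7, 9, 10, 12]

if __name__ == "__main__":
    print(describe(sample))
    print(len(pairings), "pairings:", pairings)
```

Pairings that involve the qualitative variable are usually better summarized by comparing group statistics (for example, `describe` applied to each category) than by a correlation coefficient.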

Using the sample data, perform the hypothesis test for each of the above situations in order to see if there is evidence to support your manager’s belief in each case a.-d. In each case use the Seven Elements of a Test of Hypothesis, in Section 6.2 of your textbook, using the α (significance level) provided by your Instructor in the Doc Sharing materials, and explain your conclusion in simple terms. Also be sure to compute the p-value and interpret it.
Follow this up with computing confidence intervals (the required confidence level will be provided by your Instructor) for each of the variables described in a.-d., and again interpreting these intervals.
Write a report to your manager about the results, distilling down the results in a way that would be understandable to someone who does not know statistics. Clear explanations and interpretations are critical.
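The test-then-interval workflow described above can be sketched for a single mean. The situations a.-d., the α, and the confidence level are not reproduced here, so every number below is a hypothetical placeholder. The sketch also assumes a large sample, so that the normal (z) approximation is reasonable; a small sample would call for the t distribution instead.

```python
import math
from statistics import NormalDist


def z_test_mean(n, x_bar, s, mu_0, alpha, tail="upper"):
    """Large-sample z test for a population mean; returns (z, p_value, reject)."""
    z = (x_bar - mu_0) / (s / math.sqrt(n))
    std_normal = NormalDist()
    if tail == "upper":        # Ha: mu > mu_0
        p_value = 1 - std_normal.cdf(z)
    elif tail == "lower":      # Ha: mu < mu_0
        p_value = std_normal.cdf(z)
    else:                      # Ha: mu != mu_0 (two-tailed)
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return z, p_value, p_value < alpha


def z_interval_mean(n, x_bar, s, confidence):
    """Large-sample confidence interval for a population mean."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


if __name__ == "__main__":
    # Hypothetical summary statistics and settings, not taken from the assignment.
    z, p_value, reject = z_test_mean(n=50, x_bar=46.0, s=14.0, mu_0=45.0, alpha=0.05)
    low, high = z_interval_mean(n=50, x_bar=46.0, s=14.0, confidence=0.95)
    print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject H0: {reject}")
    print(f"95% CI for the mean: ({low:.2f}, {high:.2f})")
```

A p-value below α means the sample gives evidence for the manager's belief at that significance level; otherwise the data are simply inconclusive, which is not the same as showing the belief is false.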
Generate a scatterplot for the specified dependent variable and the specified independent variable, including the graph of the “best fit” line. Interpret.
Determine the equation of the “best fit” line, which describes the relationship between the dependent variable and the selected independent variable.
Determine the coefficient of correlation. Interpret.
Determine the coefficient of determination. Interpret.
Test the utility of this regression model (use a two-tailed test with the α provided by your Instructor). Interpret your results, including the p-value.
Based on your findings in 1-5, what is your opinion about using the designated independent variable to predict the designated dependent variable? Explain.
Compute the confidence interval for beta-1 (the population slope), using the confidence level specified by your Instructor. Interpret this interval.
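The quantities requested in the steps above (best-fit line, coefficient of correlation, coefficient of determination, the t statistic for the slope, and the interval for beta-1) can be computed by hand as a check on the MINITAB output. The sketch below uses a small hypothetical data set, and the critical value `t_crit` is assumed to be looked up in a t table for n - 2 degrees of freedom; an exact p-value would come from the MINITAB printout.

```python
import math


def fit_line(x, y):
    """Least-squares line with the usual simple-regression summary statistics."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((a - x_bar) ** 2 for a in x)
    ss_yy = sum((b - y_bar) ** 2 for b in y)
    ss_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    r = ss_xy / math.sqrt(ss_xx * ss_yy)
    sse = ss_yy - slope * ss_xy           # sum of squared residuals
    s = math.sqrt(sse / (n - 2))          # estimated standard deviation of errors
    se_slope = s / math.sqrt(ss_xx)
    return {
        "slope": slope, "intercept": intercept, "r": r, "r_squared": r ** 2,
        "se_slope": se_slope, "t_stat": slope / se_slope, "df": n - 2,
    }


def slope_interval(fit, t_crit):
    """Confidence interval for the population slope, given a table value t_crit."""
    margin = t_crit * fit["se_slope"]
    return fit["slope"] - margin, fit["slope"] + margin


if __name__ == "__main__":
    # Hypothetical data; 3.182 is the t-table value for 3 df and a 95% level.
    x = [1, 2, 3, 4, 5]
    y = [2, 4, 5, 4, 5]
    fit = fit_line(x, y)
    print(f"y-hat = {fit['intercept']:.3f} + {fit['slope']:.3f} x")
    print(f"r = {fit['r']:.4f}, r^2 = {fit['r_squared']:.4f}")
    print(f"t = {fit['t_stat']:.3f} on {fit['df']} df")
    print("95% CI for slope:", slope_interval(fit, t_crit=3.182))
```

If the interval for the slope contains zero, or equivalently the absolute t statistic is below `t_crit`, the sample does not show that the independent variable is a useful linear predictor at that level.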

Using an interval, estimate the average for the dependent variable for a selected value of the independent variable (to be provided by your Instructor). Interpret this interval.

Using an interval, predict the particular value of the dependent variable for a selected value of the independent variable (to be provided by your Instructor). Interpret this interval.
What can we say about the value of the dependent variable for values of the independent variable that are outside the range of the sample values? Explain your answer.
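The difference between the two intervals above, and the extrapolation question, can be illustrated with a short sketch. The data, the selected value `x0`, and the table value `t_crit` are hypothetical placeholders; the only difference between the two intervals is the extra `1` under the square root for predicting a single observation.

```python
import math


def interval_at(x, y, x0, t_crit, kind="mean"):
    """Interval for the mean response (kind="mean") or a single new
    observation (kind="prediction") at x0, from a simple linear fit."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    ss_xx = sum((a - x_bar) ** 2 for a in x)
    ss_yy = sum((b - y_bar) ** 2 for b in y)
    ss_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    s = math.sqrt((ss_yy - slope * ss_xy) / (n - 2))
    y_hat = intercept + slope * x0
    extra = 1.0 if kind == "prediction" else 0.0
    half_width = t_crit * s * math.sqrt(extra + 1 / n + (x0 - x_bar) ** 2 / ss_xx)
    return y_hat - half_width, y_hat + half_width


def is_extrapolation(x, x0):
    """True when x0 lies outside the range of sampled x values."""
    return not (min(x) <= x0 <= max(x))


if __name__ == "__main__":
    # Hypothetical data; 3.182 is the t-table value for 3 df and a 95% level.
    x = [1, 2, 3, 4, 5]
    y = [2, 4, 5, 4, 5]
    print("mean response at x0=4:", interval_at(x, y, 4, 3.182, "mean"))
    print("single value at x0=4: ", interval_at(x, y, 4, 3.182, "prediction"))
    print("x0=8 outside sample range:", is_extrapolation(x, 8))
```

The prediction interval is always the wider of the two, and both widen as `x0` moves away from the sample mean of x. Outside the sampled range the fitted line has no data to support it, so any interval computed there should be treated with caution.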

In an attempt to improve the model, we will fit a multiple regression model that predicts the dependent variable from all of the independent variables.

Using MINITAB run the multiple regression analysis using the designated dependent and independent variables. State the equation for this multiple regression model.
Perform the Global Test for Utility (F-Test). Explain your conclusion.
Perform the t-test on each independent variable. Explain your conclusions and clearly state how you should proceed. In particular, state which independent variables should be kept and which should be discarded. If any independent variables are to be discarded, re-run the multiple regression, including only the significant independent variables, and include the final MINITAB output, with interpretation.
Is this multiple regression model better than the linear model that we generated in parts 1-10? Explain.
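For readers who want to see what the MINITAB multiple regression output is computing, the sketch below fits the model through the normal equations and reports the global F statistic, the individual t statistics, and adjusted R-squared. It is a plain-Python illustration on hypothetical data with two made-up predictors, not a substitute for the MINITAB run; the statistics would still be compared against table values or the p-values in the printout.

```python
import math


def transpose(m):
    return [list(row) for row in zip(*m)]


def matmul(a, b):
    cols = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in a]


def invert(m):
    """Gauss-Jordan inverse with partial pivoting (small matrices only)."""
    n = len(m)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_multiple(rows, y):
    """Least-squares fit of y on the predictor rows, with an intercept."""
    n, k = len(y), len(rows[0])
    X = [[1.0] + list(row) for row in rows]
    xtx_inv = invert(matmul(transpose(X), X))
    beta = [r[0] for r in matmul(xtx_inv, matmul(transpose(X), [[v] for v in y]))]
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in X]
    sse = sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))
    y_bar = sum(y) / n
    ss_yy = sum((obs - y_bar) ** 2 for obs in y)
    df_error = n - k - 1
    mse = sse / df_error
    return {
        "beta": beta,
        "f_stat": ((ss_yy - sse) / k) / mse,          # global utility test
        "t_stats": [b / math.sqrt(mse * xtx_inv[j][j]) for j, b in enumerate(beta)],
        "r_squared": 1 - sse / ss_yy,
        "adj_r_squared": 1 - mse / (ss_yy / (n - 1)),
        "df_error": df_error,
    }


if __name__ == "__main__":
    # Hypothetical data: each row is (x1, x2) for one observation.
    rows = [(1, 1), (2, 0), (3, 1), (4, 0), (5, 1), (6, 0)]
    y = [4, 6, 5, 7, 12, 14]
    result = fit_multiple(rows, y)
    print("coefficients (intercept first):", result["beta"])
    print("F =", result["f_stat"], " t =", result["t_stats"])
    print("adjusted R^2 =", result["adj_r_squared"])
```

Adjusted R-squared is included because plain R-squared never decreases when predictors are added, so it is the fairer figure when comparing the multiple model with the single-predictor model.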

Posted on May 12, 2016. Author: Tutor. Categories: Question, Questions.