Solve the Statistical question pro

Question 2

Janssen et al. (2007) studied the relationships between a variety of abiotic factors and benthic invertebrate abundance at sites on beaches along the Dutch coast. One of these abiotic factors was the height of the site relative to the average sea level of the area (NAP). Positive values of NAP indicate sites that are higher than the average sea level, whereas negative values indicate sites that are below it.

The data are in the file sle251dutch.csv and the relevant variables are the response variable, richness (richness of invertebrate species), and the predictor variable, NAP (relative height of the site in relationship to the average sea level of the area).

Format of sle251dutch.csv data file

Site    NAP       richness
1        0.045    11
2       -1.036    10
3       -1.336    13
4        0.616    11
5       -0.684    10
..       ..       ..

Site        The number of the site where the samples were collected
NAP         Relative height of the site in relationship to the average sea level of the area (predictor variable)
richness    Richness of invertebrate species (response variable)
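For anyone working at the R console rather than through the menus, loading the file might look like the sketch below. It assumes the file is comma-separated with a header row naming the three columns exactly as listed above; `load_dutch` is a hypothetical helper, not something supplied with the assignment. So that the sketch runs without the real file, it is demonstrated on a temporary file containing only the five rows printed in the handout.

```r
# Hypothetical helper: read the data file and check the expected columns.
# Assumes a comma-separated file with a header row (not confirmed by the handout).
load_dutch <- function(path) {
  dat <- read.csv(path, header = TRUE)
  stopifnot(all(c("Site", "NAP", "richness") %in% names(dat)))
  dat
}

# Demonstration on a temporary file holding ONLY the five rows shown above.
# With the real data, call load_dutch("sle251dutch.csv") instead.
tmp <- tempfile(fileext = ".csv")
writeLines(c("Site,NAP,richness",
             "1,0.045,11",
             "2,-1.036,10",
             "3,-1.336,13",
             "4,0.616,11",
             "5,-0.684,10"), tmp)
dutch <- load_dutch(tmp)
unlink(tmp)

str(dutch)  # confirm that NAP and richness were read as numeric columns
```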

a) Janssen et al. (2007) were interested in modeling the linear relationship between invertebrate richness (response) and the relative height of the site in relationship to the average sea level (predictor). List the following:

The biological inference of interest

The biological null hypothesis derived from above

The statistical null hypothesis (H0) derived from above

b) Draw a scatterplot of richness (y-axis) against NAP (x-axis). Draw boxplots for each variable as well. Is there any evidence of skewness in the distributions or of nonlinearity in the relationship?

To create the scatterplot in R

Graphs

Scatterplot

Select x-variable (NAP) and y-variable (richness)

Check Marginal boxplots and Least-squares line

Deselect Smooth line and Show spread

OK
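The same plots can be sketched at the console with base R graphics. This is an alternative to the menu route, not what the menus themselves run. The data frame below is a stand-in holding only the five rows printed in the handout; with the real file it would be replaced by the full data set.

```r
# Stand-in data: only the five rows printed in the handout, NOT the full data set.
dutch <- data.frame(
  Site     = 1:5,
  NAP      = c(0.045, -1.036, -1.336, 0.616, -0.684),
  richness = c(11, 10, 13, 11, 10)
)

# Three panels side by side: scatterplot plus one boxplot per variable.
op <- par(mfrow = c(1, 3))

# Scatterplot with the predictor on the x-axis and the response on the y-axis.
plot(richness ~ NAP, data = dutch,
     xlab = "NAP", ylab = "Species richness")
abline(lm(richness ~ NAP, data = dutch))  # least-squares line

# Boxplots to check each variable for skewness or outliers.
boxplot(dutch$NAP, main = "NAP")
boxplot(dutch$richness, main = "richness")

par(op)  # restore the previous graphics settings
```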

c) Fit the regression model richness = intercept + slope × NAP.

To fit linear regression and create an ANOVA table in R

Statistics

Fit models

Linear regression…

You can enter a name for the results object (Enter name for model:), but it's simplest to just use the name that R provides.

Select richness from the Response variable list.

Select NAP from the Explanatory variables list.

OK

Models

Hypothesis tests

ANOVA table

Select Partial, ignoring marginality (“Type III”).

OK
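A console version of these menu steps could look like the sketch below, using only base R. It uses `anova()` rather than a Type III table; with a single predictor the test for NAP is the same either way. The data frame is again a stand-in built from the five rows printed in the handout, so the numbers it produces are not the answers for the full data set.

```r
# Stand-in data: only the five rows printed in the handout, NOT the full data set.
dutch <- data.frame(
  Site     = 1:5,
  NAP      = c(0.045, -1.036, -1.336, 0.616, -0.684),
  richness = c(11, 10, 13, 11, 10)
)

# Fit richness = intercept + slope * NAP by least squares.
fit <- lm(richness ~ NAP, data = dutch)

# Coefficient table (estimates, standard errors, t and P values) and R-squared.
print(summary(fit))

# ANOVA table: Sum Sq, Df, Mean Sq and F for NAP and for the residuals.
print(anova(fit))
```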

Examine the regression output and identify and interpret the following:

Sample y-intercept

Value (estimate in the R output):

Interpretation:

Slope of regression line (NAP)

Value (estimate in the R output):

Interpretation:

t statistic for main H0 (regression slope equals zero)

Value:

Interpretation:

P-value for main H0 (regression slope equals zero)

Value:

Interpretation:

r² value (multiple R-squared)

Value:

Interpretation:
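Each quantity asked for above can also be pulled out of the fitted model by name instead of being read off the printed output. The sketch below shows one way to do that, again on the five-row stand-in data, so the values it returns are illustrative only. The variable names on the left-hand side are arbitrary choices.

```r
# Stand-in data: only the five rows printed in the handout, NOT the full data set.
dutch <- data.frame(
  Site     = 1:5,
  NAP      = c(0.045, -1.036, -1.336, 0.616, -0.684),
  richness = c(11, 10, 13, 11, 10)
)
fit <- lm(richness ~ NAP, data = dutch)

s     <- summary(fit)
coefs <- coef(s)  # matrix with columns: Estimate, Std. Error, t value, Pr(>|t|)

intercept <- coefs["(Intercept)", "Estimate"]  # sample y-intercept
slope     <- coefs["NAP", "Estimate"]          # slope of the regression line
t_slope   <- coefs["NAP", "t value"]           # t statistic for H0: slope = 0
p_slope   <- coefs["NAP", "Pr(>|t|)"]          # P-value for H0: slope = 0
r2        <- s$r.squared                       # multiple R-squared

# The t statistic is the estimate divided by its standard error.
t_check <- slope / coefs["NAP", "Std. Error"]
```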

d) Complete the following ANOVA table from the regression analysis

Source of variation    SS    df    MS    F ratio
Regression
Residual
Total                        44

Note: To get the MS values from the output, remember to divide each SS value by its df.
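The arithmetic behind the table can be sketched as follows: each MS is SS divided by its df, the F ratio is the regression MS divided by the residual MS, and the Total row is the sum of the other two rows. The sketch uses the five-row stand-in data, so its total df is 4 rather than the 44 shown in the table for the full data set.

```r
# Stand-in data: only the five rows printed in the handout, NOT the full data set.
dutch <- data.frame(
  Site     = 1:5,
  NAP      = c(0.045, -1.036, -1.336, 0.616, -0.684),
  richness = c(11, 10, 13, 11, 10)
)
fit <- lm(richness ~ NAP, data = dutch)
a   <- anova(fit)

# Sums of squares and degrees of freedom from the ANOVA output.
ss_reg <- a["NAP", "Sum Sq"]
df_reg <- a["NAP", "Df"]
ss_res <- a["Residuals", "Sum Sq"]
df_res <- a["Residuals", "Df"]

# Mean squares: SS divided by df.
ms_reg <- ss_reg / df_reg
ms_res <- ss_res / df_res

# F ratio: regression MS over residual MS.
f_ratio <- ms_reg / ms_res

# Total row: sums of the regression and residual rows.
ss_tot <- ss_reg + ss_res
df_tot <- df_reg + df_res  # equals n - 1
```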

e) What conclusions would you draw from the regression analysis (statistical and biological)?

f) What invertebrate richness would you predict for a new site with an NAP of -2? Simply plug -2 into your regression equation and calculate predicted richness.
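The plug-in calculation can be checked at the console in two equivalent ways: by hand from the coefficients, or with `predict()`. The sketch below does both on the five-row stand-in data, so the predicted value it gives is not the answer for the full data set.

```r
# Stand-in data: only the five rows printed in the handout, NOT the full data set.
dutch <- data.frame(
  Site     = 1:5,
  NAP      = c(0.045, -1.036, -1.336, 0.616, -0.684),
  richness = c(11, 10, 13, 11, 10)
)
fit <- lm(richness ~ NAP, data = dutch)

# By hand: predicted richness = intercept + slope * NAP, with NAP = -2.
b       <- coef(fit)
by_hand <- b[["(Intercept)"]] + b[["NAP"]] * (-2)

# Same calculation using predict(); newdata must use the predictor's name.
by_predict <- predict(fit, newdata = data.frame(NAP = -2))

print(c(by_hand = by_hand, by_predict = unname(by_predict)))
```

It is worth checking whether -2 falls inside the range of NAP values in the data; if it does not, the prediction is an extrapolation and should be treated with more caution.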