The goal of this experiment was to see whether different types of nests for sparrows on Kent Island attracted sparrows of different sizes. The data is in the file sparrow.csv, which has the following columns:
Column 1: Treatment – The type of nest, with values control (not manipulated), enlarged (manipulated to be a larger nest than normal), or reduced (manipulated to be a smaller nest than normal).
Column 2: Weight – The weight of the bird in grams.
In addition to hypothesis testing, we are interested in the following confidence intervals:
• A confidence interval for the nest that tends to have the largest sparrows.
• A confidence interval comparing the control nest to the enlarged nest.
• A confidence interval comparing the control nest to the reduced nest.
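One way to set up these intervals in R is sketched below. It assumes a one-way ANOVA with a pooled variance is the intended approach, and it uses a small made-up placeholder data frame (not the real sparrow.csv values) so that it runs on its own. The helper diff_ci is a hypothetical name introduced only for this illustration, and the intervals shown are unadjusted 95% intervals; a multiple-comparison adjustment may be appropriate when several intervals are reported together.

```r
# Placeholder data with the same column names as sparrow.csv.
# These weights are made up for illustration; for the real analysis use
#   sparrow <- read.csv("sparrow.csv")
sparrow <- data.frame(
  Treatment = rep(c("control", "enlarged", "reduced"), each = 4),
  Weight = c(15.1, 14.8, 15.6, 15.3,
             16.2, 15.9, 16.8, 16.1,
             14.2, 14.9, 14.5, 14.0)
)
sparrow$Treatment <- factor(sparrow$Treatment)

# One-way ANOVA: does mean weight differ across nest treatments?
fit <- aov(Weight ~ Treatment, data = sparrow)
print(summary(fit))

# Ingredients for intervals based on the pooled variance (MSE)
by_trt <- split(sparrow$Weight, sparrow$Treatment)
means  <- sapply(by_trt, mean)             # sample mean per treatment
n      <- sapply(by_trt, length)           # sample size per treatment
mse    <- deviance(fit) / df.residual(fit) # residual SS / residual df
tcrit  <- qt(0.975, df.residual(fit))      # 95% level, unadjusted

# Interval for the mean of the treatment with the largest sample mean
top    <- names(means)[which.max(means)]
ci_top <- means[[top]] + c(-1, 1) * tcrit * sqrt(mse / n[[top]])

# Hypothetical helper: interval for (mean of a) minus (mean of b)
diff_ci <- function(a, b) {
  est <- means[[a]] - means[[b]]
  se  <- sqrt(mse * (1 / n[[a]] + 1 / n[[b]]))
  est + c(-1, 1) * tcrit * se
}
ci_control_enlarged <- diff_ci("control", "enlarged")
```

Calling diff_ci("control", "reduced") gives the analogous interval for any other pair of treatments.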
Question 2
The data set contains information on 76 people who undertook one of three diets (referred to as diet A, B and C). The aim of the study was to see which diet was best for losing weight. The data is in the file loseit.csv, with the following columns:
Column 1: Diet – Which diet they were on, with values A, B, C.
Column 2: Loss – The difference (in pounds) between their weight at the beginning of the program and their weight after 6 months. A positive number therefore means they lost weight, while a negative number means they gained weight.
In addition to hypothesis testing, we are interested in all pairwise confidence intervals for differences in means.
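A minimal sketch of one way to obtain all pairwise intervals is Tukey's honest significant difference procedure, available in base R as TukeyHSD. Whether Tukey's method is the adjustment expected for this project is an assumption; the Loss values below are made-up placeholders used only so the example runs without the real loseit.csv.

```r
# Placeholder data with the same column names as loseit.csv.
# These losses are made up for illustration; for the real analysis use
#   loseit <- read.csv("loseit.csv")
loseit <- data.frame(
  Diet = rep(c("A", "B", "C"), each = 5),
  Loss = c(3.1, 4.0, 2.2, 5.5, 3.8,
           2.0, 1.1, 3.5, 2.9, 0.4,
           6.0, 5.2, 7.1, 4.8, 6.6)
)
loseit$Diet <- factor(loseit$Diet)

# Fit the one-way ANOVA, then request simultaneous 95% intervals
fit   <- aov(Loss ~ Diet, data = loseit)
tukey <- TukeyHSD(fit, conf.level = 0.95)

# One row per pair (B-A, C-A, C-B) with columns diff, lwr, upr, p adj
pairwise <- tukey$Diet
print(pairwise)
```

An interval that excludes zero suggests the two diets differ in mean loss at the chosen family-wise confidence level.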
Question 3
This data was collected as part of the SENIC project, and the overall goal is to assess whether the length of stay for patients differs between geographic regions. The data is found in the file senic.csv, with the following columns:
Column 1: Length – The average length of stay for patients at this hospital (in days).
Column 2: Region – The region the hospital was in, with values NC (North Central), NE (North East), W (West), and S (South).
In addition to hypothesis testing, we are interested in all pairwise confidence intervals for differences in means, and in whether you believe any regions could be combined.
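One possible way to examine whether two regions could be combined is to compare the full four-region model against a reduced model in which those two regions share a single mean, using a nested-model F test. The sketch below assumes this approach is acceptable for the project; the choice of NC and S as the pair to merge is purely hypothetical, the column name Merged is introduced only for illustration, and the Length values are made-up placeholders rather than the real senic.csv data.

```r
# Placeholder data with the same column names as senic.csv.
# These lengths are made up for illustration; for the real analysis use
#   senic <- read.csv("senic.csv")
senic <- data.frame(
  Length = c( 9.6,  9.9, 10.2,  9.4,
             11.1, 10.8, 11.5, 11.0,
              9.3,  9.8,  9.5, 10.0,
              8.1,  8.6,  8.3,  8.8),
  Region = rep(c("NC", "NE", "S", "W"), each = 4)
)

# Hypothetical merge: treat NC and S as one combined region
senic$Merged <- senic$Region
senic$Merged[senic$Merged %in% c("NC", "S")] <- "NC+S"

# Full model: four region means. Reduced model: NC and S share one mean.
full    <- lm(Length ~ Region, data = senic)
reduced <- lm(Length ~ Merged, data = senic)

# F test of the reduced model against the full model (1 extra parameter)
cmp <- anova(reduced, full)
print(cmp)
p_value <- cmp[["Pr(>F)"]][2]
```

A large p-value means the data do not show a difference between the two merged regions, which is weaker than showing they are the same. Choosing which regions to merge after looking at the pairwise intervals is a post hoc decision and should be described as such in the report.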
The Report Format
You or your team will turn in a short report. This means you should write in full sentences, and have the following sections for each question, while being as specific as you can about your results:
I. Introduction. State the question you are trying to answer, why it is a question of interest (why might we be interested
in the answer), and what approach you are going to take (just the name of the approach).
II. Summary of your data. This should include things like plots (histograms, boxplots) including the interpretation of the
plots, and summary values such as sample means and standard deviations. You should have an idea about the trend of
the data from this section.
III. Diagnostics. You should discuss your assumptions here, and whether you believe they are violated. Perform diagnostics for
the model (this will be covered on Monday, Feb 5, and the lecture notes and R handout will be posted Friday evening).
If you believe assumptions are violated, note this and continue with the project.
IV. Analysis. Report back the model fit, confidence intervals, test-statistic/s, and p-value/s, nulls and alternatives, power
calculations, etc. You may use tables here, but be sure that you organize your work. Remember to write your results
in full sentences where possible.
V. Interpretation. State your conclusion, and what inference you may draw from your corresponding tests or confidence
intervals. These should all be in terms of your problem.
VI. Conclusion. Briefly summarize your findings. Here you do not have to reiterate your numeric values, but summarize
all relevant conclusions.
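The computations behind sections II to IV might be organized along the following lines. This is only a sketch: it uses R's built-in PlantGrowth data set (weight by group) as a stand-in for the project files, and the specific diagnostics shown (residual plots, Shapiro-Wilk, Bartlett) are common choices assumed here, not necessarily the ones covered in lecture. The power calculation plugs in the observed group means and MSE, which is itself an assumption about how power is to be computed.

```r
# Stand-in data: built-in PlantGrowth (columns weight and group).
# Replace with the project data, e.g. read.csv("sparrow.csv").
dat <- PlantGrowth

# Section II: summary values and a plot by group
by_group    <- split(dat$weight, dat$group)
group_n     <- sapply(by_group, length)
group_means <- sapply(by_group, mean)
group_sds   <- sapply(by_group, sd)
print(data.frame(n = group_n, mean = group_means, sd = group_sds))
boxplot(weight ~ group, data = dat, main = "Weight by group")

# Section III: diagnostics on the one-way ANOVA residuals
fit <- aov(weight ~ group, data = dat)
res <- residuals(fit)
qqnorm(res); qqline(res)                  # normality of the errors
plot(fitted(fit), res,                    # constant variance across groups
     xlab = "Fitted value", ylab = "Residual")
sw <- shapiro.test(res)                   # formal normality test
bt <- bartlett.test(weight ~ group, data = dat)  # equal-variance test

# Section IV: model fit and a power calculation
print(summary(fit))
mse <- deviance(fit) / df.residual(fit)
pw  <- power.anova.test(groups = length(group_means),
                        n = min(group_n),         # per-group sample size
                        between.var = var(group_means),
                        within.var = mse,
                        sig.level = 0.05)
print(pw)
```

Only the interpretation of these outputs belongs in the body of the report; the code itself goes in the appendix.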
Details
Your report should be in the following format:
i. Typed.
ii. A title page including your name/s, the name of the class, and the name of your instructor (me).
iii. Double-sided pages.
iv. An appendix of your R code used to produce the results.
Do not include R code in the body of your report.