# Week One Assignment: Descriptive Statistics

ALY 6015 21284 Intermediate Analytics

Date: 2019/01/13

Q1: Set the working directory to the location of the file, and import the dataset using R code. Using descriptive statistics, summarize this data numerically and visualize it graphically.

[pic 1]

1. Summary statistics (mean, median, standard deviation, min, max, Q1, and Q3) for variable “Sepal.Width”

[pic 2]
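As a rough sketch of how these seven statistics can be computed in one place, the snippet below uses R's built-in `iris` data frame as a stand-in for the imported file. That substitution is an assumption, since the import step itself is only shown in the screenshot. The helper name `describe_variable` is made up for illustration.

```r
# Stand-in data: R's built-in iris data frame (assumption; the assignment
# imports its own copy of the data from a file).
data(iris)

# Hypothetical helper that returns the seven statistics asked for in Q1.
describe_variable <- function(x) {
  # names = FALSE drops the "25%"/"75%" labels so the Q1/Q3 names stay clean.
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  c(mean = mean(x), median = median(x), sd = sd(x),
    min = min(x), max = max(x), Q1 = q[1], Q3 = q[2])
}

sepal_stats <- describe_variable(iris$Sepal.Width)
print(round(sepal_stats, 3))
```

Calling `summary(iris$Sepal.Width)` reports the same quantities except the standard deviation, which still needs `sd()`.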

2. Create a histogram for “Sepal.Width”

[pic 3]

[pic 4]

Q2: Using height_weight_bygender.csv data file, plot a histogram of men’s and women’s height.

[pic 5]

1. The histogram of men's height

[pic 6]

[pic 7]

2. The histogram of women's height

[pic 8]

[pic 9]
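One possible way to produce both histograms in a single figure is sketched below. The data frame is simulated so that the snippet runs on its own; the values are made up and are not the contents of height_weight_bygender.csv. The column names `X...Gender` and `Height..inches.` follow the R code at the end of this report.

```r
# Simulated stand-in for height_weight_bygender.csv (values are made up).
set.seed(42)
height_weight <- data.frame(
  X...Gender = rep(c("Male", "Female"), each = 100),
  Height..inches. = c(rnorm(100, mean = 69, sd = 3),
                      rnorm(100, mean = 64, sd = 3))
)

# Split the heights into one numeric vector per gender.
heights <- split(height_weight$Height..inches., height_weight$X...Gender)

# Draw the two histograms side by side over the same x-axis range,
# so the two distributions can be compared directly.
shared_xlim <- range(height_weight$Height..inches.)
old_par <- par(mfrow = c(1, 2))
hist(heights$Male, xlim = shared_xlim, xlab = "Height (inches)",
     main = "The Histogram of Men's Height")
hist(heights$Female, xlim = shared_xlim, xlab = "Height (inches)",
     main = "The Histogram of Women's Height")
par(old_par)  # restore the previous plotting layout
```

Sharing `xlim` is a design choice rather than a requirement: with independent axes, the two panels can look similar even when one group is clearly taller.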

Q3: Using the height_weight_byGender.csv file, plot a scatterplot of height by weight for either men or women using the plot function.

[pic 10]

1. The scatterplot of height by weight for men

[pic 11]

2. The scatterplot of height by weight for women

[pic 12]
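A minimal sketch of the scatterplot step with the base `plot` function follows. The data are simulated so the snippet is self-contained, and the weight column name `Weight..pounds.` is hypothetical, because the real column name does not appear in this report.

```r
# Simulated stand-in for the men's rows of the data file (values are made up).
set.seed(42)
n <- 100
male_height <- rnorm(n, mean = 69, sd = 3)
male <- data.frame(
  Height..inches. = male_height,
  # Hypothetical column name; weight is simulated as a noisy linear
  # function of height purely for illustration.
  Weight..pounds. = 5 * male_height - 160 + rnorm(n, sd = 15)
)

# Height "by" weight: weight on the x-axis, height on the y-axis.
plot(male$Weight..pounds., male$Height..inches.,
     xlab = "Weight (pounds)", ylab = "Height (inches)",
     main = "Height by Weight for Men")

# Optional least-squares line to make the trend easier to see.
fit <- lm(Height..inches. ~ Weight..pounds., data = male)
abline(fit, col = "red")
```

The same calls work for women after subsetting on the gender column instead.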

Q4: What is an example of a statistical inference question we may ask next?

Next, I would ask whether there is a statistically significant correlation between height and weight for men or for women, and whether it can be tested with an R function.

[pic 13]

1. R output

[pic 14]

From the results, we can see that there is indeed a correlation between height and weight and the correlation coefficient is higher for men than for women.
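The exact call behind this output is not included in the code listing, so the following is only a sketch of one way to run such a test, using `cor.test` from base R's stats package. The data are simulated and the weight column name `Weight..pounds.` is hypothetical.

```r
# Simulated stand-in for the data file (values are made up).
set.seed(42)
n <- 100
height <- c(rnorm(n, mean = 69, sd = 3), rnorm(n, mean = 64, sd = 3))
height_weight <- data.frame(
  X...Gender = rep(c("Male", "Female"), each = n),
  Height..inches. = height,
  Weight..pounds. = 5 * height - 160 + rnorm(2 * n, sd = 15)  # hypothetical column
)

# Run a Pearson correlation test (the cor.test default) within each gender.
# The null hypothesis is that the true correlation is zero.
results <- lapply(
  split(height_weight, height_weight$X...Gender),
  function(grp) cor.test(grp$Height..inches., grp$Weight..pounds.)
)

for (g in names(results)) {
  cat(sprintf("%s: r = %.3f, p-value = %.3g\n",
              g, results[[g]]$estimate, results[[g]]$p.value))
}
```

A small p-value for a group indicates that its correlation differs from zero. It does not by itself show that the coefficient for one group is significantly larger than for the other.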

R Code:

#1

setwd('/Users/jonas/Desktop/Informatics/Intermediate Analytics/Week 1')

#Assumes the imported data frame has been attached with attach(), so that Sepal.Width is visible

mean(Sepal.Width)

median(Sepal.Width)

sd(Sepal.Width)

min(Sepal.Width)

max(Sepal.Width)

quantile(Sepal.Width, probs = c(0.25,0.75))

hist(Sepal.Width, main = "The Histogram of Sepal.Width")

#2

setwd('/Users/jonas/Desktop/Informatics/Intermediate Analytics/Week 1')

height_weight <- read.csv("height_weight_bygender.csv")

male <- subset(height_weight, height_weight$X...Gender == "Male")

hist(male$Height..inches., main = "The Histogram of Men's Height")

