Overview
Tidyverse allows you to easily manipulate and analyze data in a way that is both readable and maintainable.
Activity Objective
For this activity, you’ll need the following libraries.
library( readr )
library( knitr )
library( lubridate )
library( tidyverse )
library( kableExtra )
and the url for the raw data set.
url <- "https://docs.google.com/spreadsheets/d/1Mk1YGH9LqjF7drJE-td1G_JkdADOU0eMlrP01WFBT8s/pub?gid=0&single=true&output=csv"
For review, here are links to the slides and narrative on this topic.
Activity Questions
- Using tidyverse, load the data and format it properly, including:
- Make a column for Date representing a POSIX date object.
- Create a Weekday column as an appropriately ordered factor.
- Convert all data columns that are in Imperial units (F, mph, in, etc.) to the proper SI units.
- Drop any duplicate or irrelevant columns of data to minimize memory footprint.
- Remove any rows that have missing data.
- Retain data columns for Date, Weekday, Wind Speed, Wind Direction, Rain, Air Temperature, Water Temperature, PH, Salinity, and Water Depth.
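Before running the full pipeline, the Date and Weekday steps can be tried on a couple of sample strings. The sketch below is only illustrative: the two timestamps are made up, and it uses lubridate's `wday()` with `week_start = 1` as one possible alternative to building the factor by hand from `weekdays()`. The level labels it produces depend on the session locale.

```r
library(lubridate)

# Hypothetical sample timestamps in month/day/year, 12-hour clock layout
stamps <- c("1/1/2014 12:00:00 AM", "2/14/2014 3:15:00 PM")

# Parse the strings into POSIXct date-times
when <- parse_date_time(stamps,
                        orders = "%m/%d/%Y %I:%M:%S %p",
                        tz = "EST")

# week_start = 1 makes Monday the first level of the ordered factor
day <- wday(when, label = TRUE, abbr = FALSE, week_start = 1)

class(when)
levels(day)
```

Because the factor is ordered, plots and tables that group by it list the days from Monday through Sunday instead of alphabetically.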
format <- "%m/%d/%Y %I:%M:%S %p"
days <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")

read_csv( url ) %>%
  mutate( Date = parse_date_time( DateTime,
                                  orders=format,
                                  tz="EST") ) %>%
  mutate( Weekday = factor( weekdays( Date ),
                            ordered=TRUE,
                            levels=days) ) %>%
  mutate( AirTemp = (AirTempF - 32) * 5/9,
          Rain = Rain_in * 2.54,
          `Wind Speed` = 1.60934 * WindSpeed_mph ) %>%
  select( Date,
          Weekday,
          Rain,
          `Wind Speed`,
          `Wind Direction` = WindDir,
          `Air Temperature (°C)` = AirTemp,
          `Water Temperature (°C)` = H2O_TempC,
          PH,
          Salinity = Salinity_ppt,
          `Water Depth (m)` = Depth_m ) %>%
  filter( !is.na( PH) ) -> rice
summary( rice )
      Date                          Weekday          Rain
 Min.   :2014-01-01 00:00:00   Monday   :1152   Min.   :0.000000
 1st Qu.:2014-01-22 08:33:45   Tuesday  :1152   1st Qu.:0.000000
 Median :2014-02-12 16:52:30   Wednesday:1248   Median :0.000000
 Mean   :2014-02-12 16:49:35   Thursday :1191   Mean   :0.002137
 3rd Qu.:2014-03-06 01:11:15   Friday   :1151   3rd Qu.:0.000000
 Max.   :2014-03-27 09:30:00   Saturday :1152   Max.   :0.881380
                               Sunday   :1152
   Wind Speed     Wind Direction   Air Temperature (°C)   Water Temperature (°C)
 Min.   : 0.000   Min.   :  0.00   Min.   :-15.6950       Min.   :-0.140
 1st Qu.: 3.970   1st Qu.: 37.31   1st Qu.: -0.2542       1st Qu.: 3.930
 Median : 6.583   Median :137.30   Median :  3.0194       Median : 5.450
 Mean   : 8.764   Mean   :146.20   Mean   :  3.7748       Mean   : 5.529
 3rd Qu.:11.735   3rd Qu.:249.97   3rd Qu.:  8.0056       3rd Qu.: 7.410
 Max.   :49.326   Max.   :360.00   Max.   : 23.8167       Max.   :13.300
       PH           Salinity       Water Depth (m)
 Min.   :6.43   Min.   :0.0000   Min.   :3.705
 1st Qu.:7.50   1st Qu.:0.0700   1st Qu.:4.451
 Median :7.58   Median :0.0800   Median :4.684
 Mean   :7.60   Mean   :0.0759   Mean   :4.677
 3rd Qu.:7.69   3rd Qu.:0.0800   3rd Qu.:4.913
 Max.   :9.00   Max.   :0.1000   Max.   :5.454
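The pipeline above removes rows only where PH is missing. If the goal is to drop rows with a missing value in any retained column, one option is `tidyr::drop_na()`. The following is a minimal sketch on a made-up tibble; the column names and values are hypothetical stand-ins, not the real data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy readings; names and values are illustrative only
toy <- tibble(
  PH       = c(7.5, NA, 7.6, 7.7),
  Salinity = c(0.08, 0.07, NA, 0.08),
  Depth    = c(4.5, 4.6, 4.7, 4.8)
)

# Count the missing values in each column before deciding what to drop
missing <- toy %>%
  summarize(across(everything(), ~ sum(is.na(.x))))

# With no column arguments, drop_na() removes rows with an NA in any column
complete <- toy %>% drop_na()

missing
nrow(complete)
```

Checking the per-column counts first shows whether a single sparse column would discard most of the rows, in which case filtering on selected columns may be the better choice.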
- On April 23, 1971, the Carpenters released the song “Rainy Days and Mondays Always Get Me Down”. Here is a YouTube link for those who may be too young to know this one. From the data set, make a plot of total rain for the month of February by each day. Are there significantly different amounts of rain by weekday?
rice %>%
  # ymd(..., tz = ) returns a POSIXct, so Date is compared with a value of the same class
  filter( Date >= ymd("20140201", tz = "EST") ) %>%
  filter( Date < ymd("20140301", tz = "EST") ) %>%
  filter( Rain > 0 ) %>%
  ggplot( aes(Weekday, Rain ) ) +
  geom_boxplot() +
  theme( axis.text.x = element_text(angle = 45, hjust=1 )) +
  scale_y_log10()
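The boxplot shows the distribution of nonzero readings, while the question also asks about totals and about whether the weekdays differ. One way to approach both is sketched below on simulated data. The toy tibble is a hypothetical stand-in for the February subset of `rice`, and the Kruskal-Wallis test is just one reasonable nonparametric choice, not necessarily the intended answer.

```r
library(dplyr)

# Hypothetical toy data standing in for the February subset of `rice`
set.seed(42)
days <- c("Monday", "Tuesday", "Wednesday", "Thursday",
          "Friday", "Saturday", "Sunday")
toy <- tibble(
  Weekday = factor(rep(days, each = 8), levels = days, ordered = TRUE),
  Rain    = rexp(56, rate = 10)
)

# Total rain per weekday, the quantity the question asks to plot
totals <- toy %>%
  group_by(Weekday) %>%
  summarize(Total = sum(Rain))

# Kruskal-Wallis rank-sum test for differences among the weekday groups
result <- kruskal.test(Rain ~ Weekday, data = toy)

totals
result$p.value
```

A rank-based test is used here because rainfall amounts are strongly right-skewed, which is also why the plot uses a log scale.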
- Is there a prevailing wind direction for this field station?
rice %>%
ggplot( aes(`Wind Direction`) ) +
geom_histogram( bins=24) +
coord_polar( start=0 ) +
theme_bw()
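A numeric summary can complement the polar histogram. The sketch below assigns each direction to one of eight compass sectors and counts them. The sample directions are made up, and it assumes the readings are in degrees measured clockwise from north.

```r
library(dplyr)

# Hypothetical wind directions in degrees (0 = north, increasing clockwise)
toy <- tibble(`Wind Direction` = c(10, 350, 20, 95, 180, 185, 200, 355))

sectors <- c("N", "NE", "E", "SE", "S", "SW", "W", "NW")

prevailing <- toy %>%
  # Shift by half a sector (22.5 degrees) so that north spans 337.5 to 22.5
  mutate(Sector = sectors[floor(((`Wind Direction` + 22.5) %% 360) / 45) + 1]) %>%
  # Most frequent sector first
  count(Sector, sort = TRUE)

prevailing
```

The half-sector shift matters because readings just below 360 and just above 0 both point north and should be counted together.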
- This field station is located in the tidal freshwater region of the Chesapeake Bay in Virginia, USA. As such, there is a tidal surge. Make a table of minimum and maximum tide depth for each day of the week that contained Valentine's Day (14 February 2014).
rice %>%
  # ymd(..., tz = ) returns a POSIXct, matching the class of Date
  filter( Date < ymd("2014-02-17", tz = "EST"),
          Date >= ymd("2014-02-10", tz = "EST") ) %>%
  group_by( Weekday ) %>%
  summarize( High = max(`Water Depth (m)`),
             Low = min( `Water Depth (m)`) ) %>%
  kable( digits=2,
         caption = "Tides measured at the Rice Rivers Center (as depth in meters) for each day of the week containing Valentine's Day in 2014.") %>%
  kable_styling( full_width = FALSE ) %>%
  add_header_above(c("", "Measured Tide Depth (m)"=2))
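Grouping by Weekday works here because the filter keeps exactly one week. For a longer span, grouping by calendar date avoids mixing, for example, several Mondays together. The sketch below shows that variant on made-up readings. The `Depth` column and its values are hypothetical, and the added `Range` column is an illustration rather than part of the question.

```r
library(dplyr)
library(lubridate)

# Hypothetical depth readings every 6 hours over two days
toy <- tibble(
  Date  = ymd_hm("2014-02-13 00:00", tz = "EST") + hours(seq(0, 42, by = 6)),
  Depth = c(4.2, 5.1, 4.3, 5.0, 4.4, 5.3, 4.1, 4.9)
)

tides <- toy %>%
  # as_date() drops the time of day, leaving one group per calendar date
  group_by(Day = as_date(Date)) %>%
  summarize(High  = max(Depth),
            Low   = min(Depth),
            Range = High - Low)

tides
```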