Task: can you apply the functions
sum
andlength
to boolean vectors?
Yes - sum counts the number of TRUE
values.
Task: make a list of booleans where the items are
TRUE
if each corresponding item ofnums
is less or equal to 3
nums <= 3
Task: make a list of booleans where the items are
TRUE
if each corresponding item ofnums
is either 5 or 1.
nums %in% c(5,1)
Task : Using y to index
nums
worked because the length of the two vectors are the same. What happens if you make another variabley2
which only has 3 boolean items, and try to indexnums
using that?
y2 = c(TRUE, FALSE, TRUE)
nums[y2]
Task: Find the favourite numbers of all my friends who have more than 4 characters in their name.
Hint: build this up step by step. First of all, get the number of characters in each name, then test whether this number is greater than 4. This should result in a vector of booleans. Then index nums using this vector.
nums[nchar(friends)>4]
Task: What happens if you type
d
into the console to see what’s inside the objectd
?
You get a lot of data printed to the screen! Very unhelpful.
Task: Make a boolean vector which is
TRUE
if the latitude of a language is greater than 0, andFALSE
otherwise. Assign this to a variable named `northernHemisphere’.
northernHemisphere <- d$latitude > 0
Task: Make a table of counts of basic word order types for languages in the northern hemisphere. Hint: you should index the rows with the variable northernHemisphere and the BasicWordOrder column.
table(d[northernHemisphere,]$BasicWordOrder)
Task: Load the data in
data/Glottolog_Data.csv
into a data frame called glottoData.
glottoData <- read.csv("../data/Glottolog_Data.csv", stringsAsFactors = F)
Task What are the names and formats of the variables in glottoData?
names(glottoData)
Task: Could you match the databases on the name of the language? Use
d2
to find how many languages have different names in Glottolog versus WALS.
sum(d2$Name!=d2$glotto.name)
Task: Which basic word order type has the higest mean latitude?
wordOrderLatitudeMEAN<- tapply(d2$latitude, d2$BasicWordOrder, mean)
wordOrderLatitudeMEAN.sorted <- sort(wordOrderLatitudeMEAN, decreasing = TRUE)
head(wordOrderLatitudeMEAN.sorted)
Task: What happens if you try to calculate the sum of this variable?
You get NA returned!
Task: How many datapoints in ObjectVerbOrder are missing? How many glottocodes are missing? Remember that
is.na
creates a vector of booleans andsum
can count the number ofTRUE
values in a vector.
sum(is.na(d$ObjectVerbOrder))
sum(is.na(d$glottocode))