Task: Can you apply the functions sum and length to boolean vectors?
Yes - sum counts the number of TRUE values, and length gives the total number of items.
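As a minimal sketch of why this works: R treats TRUE as 1 and FALSE as 0 in arithmetic, so summing a boolean vector counts the TRUE values. The vector x below is made up for illustration and is not part of the exercise data.

```r
# Hypothetical boolean vector, invented for this example
x <- c(TRUE, FALSE, TRUE, TRUE)

n_true <- sum(x)      # TRUE counts as 1, FALSE as 0, so this counts the TRUEs
n_items <- length(x)  # total number of items, whether TRUE or FALSE
prop_true <- mean(x)  # proportion of items that are TRUE
```

Here n_true is 3, n_items is 4 and prop_true is 0.75.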
Task: Make a vector of booleans where each item is TRUE if the corresponding item of nums is less than or equal to 3.
nums <= 3
Task: Make a vector of booleans where each item is TRUE if the corresponding item of nums is either 5 or 1.
nums %in% c(5,1)
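The two comparisons above can be tried on a small vector. This is only a sketch: the values in nums here are hypothetical stand-ins, since the real nums comes from the worksheet.

```r
# Hypothetical values standing in for the worksheet's nums
nums <- c(1, 5, 3, 8, 2)

small <- nums <= 3              # element-wise comparison against a single value
fiveOrOne <- nums %in% c(5, 1)  # TRUE where the item matches any value in the set

# %in% gives the same answer as combining two equality tests with |
sameAsOr <- identical(fiveOrOne, nums == 5 | nums == 1)
```

With these values, small is TRUE FALSE TRUE FALSE TRUE and fiveOrOne is TRUE TRUE FALSE FALSE FALSE.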
Task: Using y to index nums worked because the lengths of the two vectors are the same. What happens if you make another variable y2 which only has 3 boolean items, and try to index nums using that?
y2 = c(TRUE, FALSE, TRUE)
nums[y2]
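What happens is that R recycles the shorter boolean vector: it repeats y2 until it is as long as nums, without any warning. The sketch below makes this visible, assuming a hypothetical nums with 6 items (the worksheet's nums may have a different length and different values).

```r
# Hypothetical values standing in for the worksheet's nums
nums <- c(10, 20, 30, 40, 50, 60)
y2 <- c(TRUE, FALSE, TRUE)

picked <- nums[y2]  # y2 is silently repeated to cover all 6 items

# Spell out the recycled index that R effectively uses
recycled <- rep(y2, length.out = length(nums))
pickedExplicit <- nums[recycled]
```

Here recycled is TRUE FALSE TRUE TRUE FALSE TRUE, so both picked and pickedExplicit contain 10, 30, 40 and 60.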
Task: Find the favourite numbers of all my friends who have more than 4 characters in their name.
Hint: build this up step by step. First of all, get the number of characters in each name, then test whether this number is greater than 4. This should result in a vector of booleans. Then index nums using this vector.
nums[nchar(friends)>4]
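The one-line answer can be unpacked into the steps from the hint. The names and numbers below are invented for illustration; the intermediate variable names are also just one possible choice.

```r
# Hypothetical data standing in for the worksheet's friends and nums
friends <- c("Alice", "Bob", "Charlotte", "Dan")
nums <- c(7, 3, 12, 42)

nameLengths <- nchar(friends)    # step 1: number of characters in each name
longNames <- nameLengths > 4     # step 2: boolean vector, TRUE for long names
favourites <- nums[longNames]    # step 3: keep only the matching numbers
```

With this data, nameLengths is 5 3 9 3, longNames is TRUE FALSE TRUE FALSE, and favourites is 7 12.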
Task: What happens if you type d into the console to see what’s inside the object d?
You get a lot of data printed to the screen! Very unhelpful.
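A more manageable way to look inside a large data frame is to print only the first few rows and check its size. The sketch below uses a small made-up data frame in place of d; the column names follow the ones used in this worksheet, but the values are invented.

```r
# Hypothetical stand-in for d, with invented values
d <- data.frame(
  Name = c("LangA", "LangB", "LangC", "LangD", "LangE"),
  latitude = c(52.0, -33.9, 35.7, -1.3, 60.2),
  stringsAsFactors = FALSE
)

firstRows <- head(d, 3)  # only the first 3 rows
size <- dim(d)           # number of rows, then number of columns
print(firstRows)
```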
Task: Make a boolean vector which is TRUE if the latitude of a language is greater than 0, and FALSE otherwise. Assign this to a variable named `northernHemisphere`.
northernHemisphere <- d$latitude > 0
Task: Make a table of counts of basic word order types for languages in the northern hemisphere. Hint: you should index the rows with the variable northernHemisphere and the BasicWordOrder column.
table(d[northernHemisphere,]$BasicWordOrder)
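To see the row indexing and the table step working together, here is a sketch on a small made-up data frame. Only the column names latitude and BasicWordOrder come from the worksheet; the values are invented.

```r
# Hypothetical stand-in for d, with invented values
d <- data.frame(
  latitude = c(52.0, -33.9, 35.7, -1.3, 60.2),
  BasicWordOrder = c("SVO", "SOV", "SOV", "SVO", "SOV"),
  stringsAsFactors = FALSE
)

northernHemisphere <- d$latitude > 0

# Select the northern rows, then count the word order types in that subset
counts <- table(d[northernHemisphere, ]$BasicWordOrder)

# Equivalent: take the column first, then index it with the boolean vector
countsAlt <- table(d$BasicWordOrder[northernHemisphere])
```

With this data, three languages are in the northern hemisphere, giving 2 for SOV and 1 for SVO in both versions.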
Task: Load the data in data/Glottolog_Data.csv into a data frame called glottoData.
glottoData <- read.csv("../data/Glottolog_Data.csv", stringsAsFactors = FALSE)
Task: What are the names and formats of the variables in glottoData?
names(glottoData)
str(glottoData)
Task: Could you match the databases on the name of the language? Use d2 to find how many languages have different names in Glottolog versus WALS.
sum(d2$Name!=d2$glotto.name)
Task: Which basic word order type has the highest mean latitude?
wordOrderLatitudeMEAN <- tapply(d2$latitude, d2$BasicWordOrder, mean)
wordOrderLatitudeMEAN.sorted <- sort(wordOrderLatitudeMEAN, decreasing = TRUE)
head(wordOrderLatitudeMEAN.sorted)
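The sketch below runs the same tapply and sort steps on a small made-up data frame, so the grouping is easy to follow. The values are invented, and the na.rm = TRUE argument is an addition that is not in the answer above: it is passed through tapply to mean so that a missing latitude does not turn a whole group's mean into NA.

```r
# Hypothetical stand-in for d2, with invented values and one missing latitude
d2 <- data.frame(
  latitude = c(10, 30, -20, 50, NA),
  BasicWordOrder = c("SOV", "SOV", "SVO", "VSO", "SVO"),
  stringsAsFactors = FALSE
)

# Mean latitude per word order type; na.rm = TRUE is forwarded to mean()
meanLat <- tapply(d2$latitude, d2$BasicWordOrder, mean, na.rm = TRUE)

# Largest mean first; the first name is the word order with the highest mean
meanLat.sorted <- sort(meanLat, decreasing = TRUE)
topOrder <- names(meanLat.sorted)[1]
```

With this data the means are 20 for SOV, -20 for SVO and 50 for VSO, so topOrder is "VSO". Without na.rm = TRUE, the SVO mean would be NA, and sort would drop it from the result by default.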
Task: What happens if you try to calculate the sum of this variable?
You get NA returned!
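This happens because any arithmetic involving a missing value is itself missing. A minimal sketch, using a made-up vector rather than the worksheet's variable, shows the behaviour and how the na.rm argument of sum ignores the missing items:

```r
# Hypothetical vector with one missing value
x <- c(4, NA, 7, 1)

total <- sum(x)                     # NA, because one item is unknown
totalKnown <- sum(x, na.rm = TRUE)  # adds up only the non-missing items
```

Here total is NA and totalKnown is 12.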
Task: How many datapoints in ObjectVerbOrder are missing? How many glottocodes are missing? Remember that is.na creates a vector of booleans and sum can count the number of TRUE values in a vector.
sum(is.na(d$ObjectVerbOrder))
sum(is.na(d$glottocode))
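If you want the missing counts for every column at once, one option is to apply is.na to the whole data frame and sum each column. The sketch below uses a made-up data frame; the glottocode strings are invented placeholders, not real codes. It also assumes missing values are stored as NA: empty strings would not be counted by is.na.

```r
# Hypothetical stand-in for d; the codes are invented placeholders
d <- data.frame(
  ObjectVerbOrder = c("OV", NA, "VO", NA),
  glottocode = c("abcd1234", "efgh5678", NA, "ijkl9012"),
  stringsAsFactors = FALSE
)

missingOV <- sum(is.na(d$ObjectVerbOrder))  # one column at a time
missingGlotto <- sum(is.na(d$glottocode))

# is.na on a data frame gives a matrix of booleans; colSums counts TRUEs per column
missingPerColumn <- colSums(is.na(d))
```

With this data, missingOV is 2 and missingGlotto is 1, and missingPerColumn reports the same two counts labelled by column name.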