seq
The cumsum function returns the cumulative sum of a numeric vector.
cumsum(c(1, 1, 2, 1))
## [1] 1 2 4 5
cumsum(c(3, 1, 1, 17))
## [1] 3 4 5 22
The first entry is the first element. The second entry is the sum of the first two elements; the third entry is the sum of the first three elements; the fourth entry is the sum of the first four elements; and so forth.
Let’s write our own cumsum function.
my_cumsum <- function(x) {
  result <- numeric(length = length(x))
  for (i in 1:length(x)) { # focus on this line
    result[i] <- sum(x[1:i])
  }
  result
}
my_cumsum(c(1, 1, 2, 1))
## [1] 1 2 4 5
my_cumsum(c(3, 1, 1, 17))
## [1] 3 4 5 22
We get matching output on both inputs. Now let’s pass an empty numeric vector to both cumsum and my_cumsum:
cumsum(numeric(0))
## numeric(0)
my_cumsum(numeric(0))
## [1] NA
Now my_cumsum returns an incorrect result. NA is the result of indexing a vector with a value outside of its bounds:
numeric(0)[1] # Trying to access the first element of an empty vector.
## [1] NA
Why did this occur? The following expression is the culprit:
for (i in 1:length(x)) {
If x is empty, length(x) is zero, so our loop looks like
for (i in 1:0) {
Since 1:0 is the vector c(1, 0), we in fact enter the loop with these two values:
for (i in 1:0) {
  print(i)
}
## [1] 1
## [1] 0
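To see where the NA comes from, the two iterations can be replayed by hand. The following is a sketch that runs the body of the loop from my_cumsum step by step on an empty x; the variable names mirror the function above.

```r
x <- numeric(0)
result <- numeric(length = length(x))  # numeric(0)

# Iteration i = 1: x[1:1] reads past the end of x, which yields NA.
i <- 1
sum(x[1:i])
## [1] NA
result[i] <- sum(x[1:i])  # assigning past the end grows result to length 1
result
## [1] NA

# Iteration i = 0: 1:0 is c(1, 0). Index 0 is dropped, index 1 is NA again.
i <- 0
result[i] <- sum(x[1:i])  # assigning at index 0 changes nothing
result
## [1] NA
```

The first iteration does the damage: it both reads an out-of-bounds element and extends result, so the function returns a vector of length one instead of length zero.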
Instead of 1:length(x) we need an expression that is empty if x is empty. seq_along accomplishes this:
seq_along(numeric(0))
## integer(0)
for (i in seq_along(numeric(0))) {
  print(i)
}
Nothing is printed because the loop is not entered.
So a correct implementation follows:
my_cumsum <- function(x) {
  result <- numeric(length = length(x))
  for (i in seq_along(x)) {
    result[i] <- sum(x[1:i])
  }
  result
}
my_cumsum(c(1, 1, 2, 1))
## [1] 1 2 4 5
my_cumsum(numeric(0))
## numeric(0)
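As a quick sanity check, the corrected function can be compared against the built-in cumsum on several inputs at once, including the empty vector. This is a minimal sketch; the list of test inputs is an arbitrary choice made for illustration.

```r
my_cumsum <- function(x) {
  result <- numeric(length = length(x))
  for (i in seq_along(x)) {
    result[i] <- sum(x[1:i])
  }
  result
}

# Arbitrary test inputs: the two examples from above, a single value,
# and the empty vector that exposed the bug.
inputs <- list(c(1, 1, 2, 1), c(3, 1, 1, 17), 5, numeric(0))

for (x in inputs) {
  # stopifnot() raises an error if the two results differ.
  stopifnot(isTRUE(all.equal(my_cumsum(x), cumsum(x))))
}
```

If the loop finishes without an error, my_cumsum agrees with cumsum on every input in the list.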
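The functions seq_along and seq_len are used to protect against this bad behavior when the set of loop indices is empty. seq_along(x) generates the indices of the vector x, and seq_len(n) generates the integers from 1 to n; both return an empty vector when there is nothing to iterate over.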
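seq_len is the counterpart to use when the number of iterations comes from a count rather than from a vector. The sketch below contrasts 1:n with seq_len(n) for n equal to zero; the helper count_iterations is a hypothetical function written only for this illustration.

```r
# Hypothetical helper: counts how many times a loop over `indices` runs.
count_iterations <- function(indices) {
  count <- 0
  for (i in indices) {
    count <- count + 1
  }
  count
}

n <- 0
count_iterations(1:n)         # 1:0 is c(1, 0), so the loop runs twice
## [1] 2
count_iterations(seq_len(n))  # seq_len(0) is integer(0), so it never runs
## [1] 0
count_iterations(seq_len(3))  # for positive n, seq_len(n) matches 1:n
## [1] 3
```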