Monday, 18 July 2016

2D particle tracking in R, part 2: kernel density

In my previous post, I showed how particle movement can be simulated in the marine environment. Now that we have tracked the particle movement, how can we use this information to translate this to something more meaningful? This can be done by determining the particle density on the map.

The most simple way would be to lay a grid over the particles and simply count the number of particles in each grid cells. However, I would like to go for something a little bit complicated that gives some smoother results. I will use kernel density estimates. This means that I will consider the particles as a density function, were the position of the particle gives the highest density for that particle. You can get a density map of all particles by simply adding up the probabilities of the density functions of each individual particle. Fortunately, the KernSmooth package will take care of all this.

Remember that in the original simulation, I released 5,000 particles within a specific time-frame. With the kernel density estimated, I can estimate the number of particles per square kilometer at any time and location. This will become even more interesting if these particles represent something in the real world, let's say oil. If we know how much cubic meter of oil is represented by each particle, the estimates can be transformed in to the thickness of the oil slick. Keep in mind that this is just an initial test simulation that does not include all processes in order to properly simulate oil spills (e.g., oil evaporation, degradation and dispersion is not simulated).

Here's a video created with ffmpeg of the particle density movement over time.

And here's the fully commented script used to create the frames for the video.

require(raster)
require(KernSmooth)
require(maptools)
## load the simulation results. See the previous blog post
## how this file was created. This data file contains
## an object named 'part.results', which is a list
## of SpatialPointsDataFrame objects.
load("output/out.RData")
## Create a SpatialPoints object representing the release location
release.location <- SpatialPoints(t(c(3.7, 52)), proj4string = CRS("+proj=longlat +datum=WGS84"))
## everything will be plotted in UTM zone 31, so we need to transform all spatial data:
release.location <- spTransform(release.location, CRS("+proj=utm +zone=31 +datum=WGS84"))
## Countries for which the shorelines are downloaded and plotted
countries <- "NLD"
gd_load <- lapply(as.list(countries), function(x) {
gadm <- ""
lev <- 0
if (x == "NLD") lev <- 1
try(gadm <- raster::getData("GADM", country = x, level = lev))
if (class(gadm) == "SpatialPolygonsDataFrame" && lev == 1 && !is.character(gadm)) {
gadm <- unionSpatialPolygons(gadm, gadm@data$ENGTYPE_1 != "Water body")[2]
}
gadm
})
## Transform to UTM:
gd_load <- lapply(gd_load, spTransform, CRSobj = CRS("+proj=utm +zone=31 +datum=WGS84"))
## x range for the grid in UTM coordinates (m)
x.rast <- c(475000, 635000)
## y range for the grid in UTM coordinates (m)
y.rast <- c(5730000, 5890000)
## dimensions for the grid
grid.size <- c(480, 480)
## grid cell surface area in sqaure kilometers
grid.area <- ((diff(x.rast)/grid.size[1])*(diff(y.rast)/grid.size[2]))/1e6
## note that 'part.results' is a list of SpatialPointsDataFrame objects.
## each element represents a time index and is named after the
## corresponding simulation date and time. Here we loop each time index:
for (i in 1:length(part.results)) {
## select data at time index i:
sel.points <- part.results[[i]]
## only select particles that have been released:
sel.points <- sel.points[sel.points$released,]
## only continue if particles have been released:
if (length(sel.points) > 0) {
## project to UTM such that the coordinates are in meters:
sel.points <- spTransform(sel.points, CRS("+proj=utm +zone=31 +datum=WGS84"))
## calculate the kernel density using a 2x2 km bandwidth:
est <- bkde2D(coordinates(sel.points),
bandwidth = c(2000, 2000),
gridsize = grid.size,
range.x = list(x.rast, y.rast))
## multiply density estimated with number of released particles and divide by
## grid cell surface area multiplied with the total density, to get the density
## in number of particles per square kilometer:
est.raster = raster(list(x = est$x1, y = est$x2, z = est$fhat*length(sel.points)/(sum(est$fhat)*grid.area)))
} else {
est.raster = raster(list(x = seq(x.rast[1], x.rast[2], length.out = grid.size[1]),
y = seq(y.rast[1], y.rast[2], length.out = grid.size[2]),
z = matrix(0, grid.size[1], grid.size[2])))
}
projection(est.raster) <- CRS("+proj=utm +zone=31 +datum=WGS84")
## open a graphics device to plot the density estimates
png(sprintf("output\\kernout%03d.png", i), 640, 480, res = 100, type = "cairo")
## use print as otherwise lattice based graphics won't be drawn
## onto the opened device
print(
## I use "spplot" here for nicer plots. using "plot" here will be faster.
spplot(est.raster,
ylab.right = "# / km^2",
par.settings = list(layout.widths = list(axis.key.padding = 0,
ylab.right = 2)),
col.regions = colorRampPalette(c("white", rainbow(15, start = 1/6), "black"), bias = 1)(100),
at = seq(0, 20, length.out = 100),
main = names(part.results)[i],
maxpixels = prod(grid.size),
panel = function(...) {
panel.gridplot(...)
lapply(gd_load, sp.polygons, fill = "lightgrey")
sp.points(release.location, col = "red")
panel.text(x.rast[1] + 10000, y.rast[2] - 10000, "r-coders-anonymous.blogspot.com", adj = c(0, 0.5))
})
)
## close the graphics device
dev.off()
## print ratio between the current time index and the total
## number of indices. This serves as a progress indicator.
print(i/length(part.results))
}

Friday, 15 July 2016

2D particle tracking in R

How hard is it nowadays to simulate particle movement in the sea and track them in R? Well I discovered that it is not all that difficult. Within a couple of hours I had a simple simulation model running in the North Sea. All you need are time dependent water currents and wind data, both of which can be obtained from marine.copernicus.eu.

The main steps that you will have to follow:

  • load the water currents and wind data
  • specify particle release details
  • specify simulation details
  • loop all simulation time steps and for each step do the following with the particles that are released:
    • for each particle determine the local water current and wind vector at the current time index
    • transform the particle coordinates to a projection in meters
    • move the particles with the water current (m/s) multiplied by the time step duration (s)
    • move the particles with the wind speed (m/s) multiplied by the time step duration (s) and a drift factor (between 0 and 1)
    • move the particles in random directions (to simulate diffusion)
    • transform the particle coordinates back to the original projection
    • save particle information every nth simulation step, where n is any positive integer value

Below you will find the script that does all this (each of the steps above are marked in the script). In this simulation I simulate the release of 5,000 particles with a constant release rate during 5 days and followed them for a couple of days. I found the trickiest bit not the simulation itself, but reading the the NetCDF files such that it could be used in a convenient manner. I create a video of this simulation with ffmpeg:

So what's the practical use of all this? Well obviously you can simulate the trajectory of any floating materials, such as oil. However, you have to keep in mind that the simulation is only as good as the underpinning data and parameterisation. For instance, the water current data used here is not that accurate near estuaries. Of course you can always do multiple runs of your simulation with different parameter settings (or make it even a Monte Carlo simulation). Also, here I only simulate in 2D. But there is nothing that would stop you from modifying it to include the third dimension. What also should be added is some detection that particles have beached. But that should not be very difficult when you have coastline data.

Why in R? Well, commercial particle tracking software is generally expensive. Whereas free software are often rigid and don't allow the user to include custom modifications. In R the simulation is free, and you can take the exact implementation into our own hands... Hopefully, this post will help you get started with your own simulations. Good luck!

Here's the script:

## load required packages:
require(sp)
require(raster)
require(ncdf4)
require(ncdf.tools)
require(rgeos)
require(maptools)
###############################################################
## STEP 1: load the water currents and wind data
###############################################################
## function that reads netcdf files and puts them in a workable stack object:
read.mover <- function(ncdfile, xvar = "u", yvar = "v") {
u <- stack(ncdfile, varname = xvar)
v <- stack(ncdfile, varname = yvar)
## Get time indices:
z_time <- attr(u, "z")
if (class(z_time[[1]]) == "character") {
z_time <- as.POSIXct(z_time[[1]], tz = "UTC")
} else {
z_time_info <- tolower(strsplit(tolower(names(z_time)), " since ")[[1]])
z_time <- z_time[[1]]*switch(z_time_info[1], seconds = 1, minutes = 60, hours = 60*60, days = 60*60*24)
z_time <- as.POSIXct(z_time, origin = as.POSIXct(z_time_info[2], tz = "UTC"), tz = "UTC")
rm(z_time_info)
}
return(list(u = u, v = v, z = z_time))
}
## Read the water current data:
temp <- read.mover("input/MetO-NWS-PHYS-hi-Agg-all-levels_1468581872238.nc",
"vozocrtx", "vomecrty")
water_u <- temp$u
water_v <- temp$v
water_time <- temp$z
rm(temp)
## Read the wind data:
temp <- read.mover("input/CERSAT-GLO-BLENDED_WIND_L4-V5-OBS_FULL_TIME_SERIE_1468508078544.nc",
"eastward_wind_rms", "northward_wind_rms")
wind_u <- temp$u
wind_v <- temp$v
wind_time <- temp$z
rm(temp)
###############################################################
## STEP 2: specify particle release details
###############################################################
## time at which the simulated release should start:
release.start <- as.POSIXct("2016-07-01 08:00", tz = "UTC")
## time at which teh simulated release should end:
release.end <- as.POSIXct("2016-07-06 08:00", tz = "UTC")
## number of particles to release in total:
release.n <- 5000
## release lon-coordinate:
release.x <- 3.7
## release lat-coordinate:
release.y <- 52
## release depth (not used yet in simulation)
release.z <- 0
###############################################################
## STEP 3: specify simulation details
###############################################################
## time at which the simulation should start:
simulation.start.time <- as.POSIXct("2016-07-01 00:00", tz = "UTC")
## time step of simulation in seconds:
simulation.time.step <- 60*60/4
## integer value indicating every n-th time step that should be exported:
simulation.out.step <- 4
## time at which the simulation should stop
simulation.end.time <- as.POSIXct("2016-07-10 08:00", tz = "UTC")
## create a time series based on the start and end time and the time step:
simulation.time <- seq(simulation.start.time, simulation.end.time - simulation.time.step, simulation.time.step)
## projection to be used to calculate particle displacement in meters:
part.proj <- CRS("+proj=utm +zone=31 +datum=WGS84")
## drift factor indicates with which fraction particles are displaced by the wind speed:
drift.factor <- 0.03
## the diffusion coefficient, used in random walk simulation
diffusion.coeff <- 1e5
## Countries for which the shorelines are downloaded and plotted
countries <- "NLD"
gd_load <- lapply(as.list(countries), function(x) {
gadm <- ""
lev <- 0
if (x == "NLD") lev <- 1
try(gadm <- raster::getData("GADM", country = x, level = lev))
if (class(gadm) == "SpatialPolygonsDataFrame" && lev == 1 && !is.character(gadm)) {
gadm <- unionSpatialPolygons(gadm, gadm@data$ENGTYPE_1 != "Water body")[2]
}
gadm
})
## for each time step determine how many particles are released:
n.particles <- rep(NA, length(simulation.time))
n.particles[simulation.time <= release.start] <- 0
n.particles[simulation.time >= release.end] <- release.n
n.particles[is.na(n.particles)] <- approx(c(0, release.n), n = 2 + sum(is.na(n.particles)))$y[c(-1, -(2 + sum(is.na(n.particles))))]
n.particles <- round(n.particles)
## create a SpatialPointsDataFrame for the particles:
particles <- data.frame(pos.x = rep(release.x, release.n),
pos.y = rep(release.y, release.n),
pos.z = rep(release.z, release.n))
coordinates(particles) <- ~ pos.x + pos.y
proj4string(particles) <- CRS("+proj=longlat +datum=WGS84")
## create a list in which results should be stored:
part.results <- list()
previous.position <- particles
## function that extrapolates water currents for NA values
extend.data <- function(r, band = 21) {
## replace NA values of raster data at coasts with nearest neighbours:
fill.na <- function(x, i = round(band^2/2)) {
if(is.na(x)[i]) {
return(mean(x, na.rm = T))
} else {
return(x[i])
}
}
r <- setValues(r, getValues(focal(r, w = matrix(1,band,band), fun = fill.na,
pad = TRUE, na.rm = FALSE )))
r
}
###############################################################
## STEP 4: loop all simulation time steps
###############################################################
## start looping the simulation times
for (i in 1:length(simulation.time)) {
print(i)
## Get the water currents that are closest to the current simulation time
tdif <- abs(as.numeric(water_time) - as.numeric(simulation.time[i]))
tsel_c <- which(tdif == min(tdif))[[1]]
twat_u <- extend.data(raster(water_u, tsel_c))
twat_v <- extend.data(raster(water_v, tsel_c))
tdif <- abs(as.numeric(wind_time) - as.numeric(simulation.time[i]))
tsel_w <- which(tdif == min(tdif))[[1]]
twin_u <- extend.data(raster(wind_u, tsel_w))
twin_v <- extend.data(raster(wind_v, tsel_w))
## Get the last known particle locations and transform coordinates into meters:
part.coords <- coordinates(spTransform(previous.position, part.proj))
## displace particles with with water_currents
part.coords[0:n.particles[i],] <- part.coords[0:n.particles[i],] +
simulation.time.step*cbind(raster::extract(twat_u, previous.position, "bilinear"),
raster::extract(twat_v, previous.position, "bilinear"))[0:n.particles[i],]
## displace particles with wind
part.coords[0:n.particles[i],] <- part.coords[0:n.particles[i],] +
simulation.time.step*drift.factor*cbind(raster::extract(twin_u, previous.position, "bilinear"),
raster::extract(twin_v, previous.position, "bilinear"))[0:n.particles[i],]
## displace with random walk (diffusion):
part.coords[0:n.particles[i],] <- part.coords[0:n.particles[i],] +
cbind(runif(n.particles[i], -1, 1),
runif(n.particles[i], -1, 1))*sqrt(diffusion.coeff*simulation.time.step*6/1e4)
## turn particle information into a SpatialPointsDataFrame object:
part.coords <- as.data.frame(part.coords)
coordinates(part.coords) <- ~ pos.x + pos.y
proj4string(part.coords) <- part.proj
## backtransform coordinates to original projection:
part.coords <- spTransform(part.coords, CRS(proj4string(previous.position)))
## store the new locations in the results list:
previous.position <- part.coords
## for each n-th simulation step store the results (where n = simulation.out.step):
if ((i - 1)%%4 == 0) {
## add an indicator of the release state of the particles:
part.coords <- SpatialPointsDataFrame(part.coords, data.frame(released = rep(F, release.n)))
part.coords@data$released[0:n.particles[i]] <- T
part.results[[(i - 1)/4 + 1]] <- part.coords
## name the list element after the time index it represents:
names(part.results)[(i - 1)/4 + 1] <- paste(format(simulation.time[i], "%Y-%m-%d %H:%M:%S"),
attr(as.POSIXlt(simulation.time[i]),"tzone"))
png(sprintf("output\\out%03d.png", (i - 1)/4), 640, 480, res = 75, type = "cairo")
plot.ylim <- c(51.9, 53)
plot(part.results[[(i - 1)/4 + 1]], pch =".", xlim = c(3, 5), ylim = plot.ylim,
main = paste(format(simulation.time[i], "%Y-%m-%d %H:%M:%S"),
attr(as.POSIXlt(simulation.time[i]),"tzone")))
## draw water current arrows
wat_arrow <- as.data.frame(coordinates(raster(water_u, tsel_c)))
wat_arrow$u <- as.vector(t(as.matrix(raster(water_u, tsel_c))))
wat_arrow$v <- as.vector(t(as.matrix(raster(water_v, tsel_c))))
scale = 0.05
aspect.ratio <- 1/cos(mean(plot.ylim)*pi/180)
arrows(wat_arrow$x, wat_arrow$y,
wat_arrow$x + scale*aspect.ratio*wat_arrow$u,
wat_arrow$y + scale*wat_arrow$v, length = 0.01, col = "lightpink")
## draw wind current arrows
win_arrow <- as.data.frame(coordinates(raster(wind_u, tsel_w)))
win_arrow$u <- as.vector(t(as.matrix(raster(wind_u, tsel_w))))
win_arrow$v <- as.vector(t(as.matrix(raster(wind_v, tsel_w))))
scale <- 0.05
aspect.ratio <- 1/cos(mean(plot.ylim)*pi/180)
arrows(win_arrow$x, win_arrow$y,
win_arrow$x + scale*aspect.ratio*win_arrow$u,
win_arrow$y + scale*win_arrow$v, length = 0.01, col = "lightblue")
## draw coastlines:
lapply(gd_load, plot, add = T, col = "lightgrey")
text(2.5, 52.9, "r-coders-anonymous.blogspot.com", adj = c(0, 0.5), cex = 1.5)
axis(1)
axis(2)
dev.off()
}
}
## save the resulting particle information:
save(part.results, file = "output/out.RData", compress = T)

Edit: if your interested, you might want to check out the follow-up post on particle density estimates of this simulation.