
Tuesday, 27 March 2018

Going retro: image dithering

Recently I published the AmigaFFH package on CRAN. This package allows you to read, write, interpret and modify Commodore Amiga file formats in R. This seems like a good opportunity to start a small series about going retro in R. In this post I will show how to apply oldskool dithering techniques to images.

Back in the day, home computers had limited memory and graphical capabilities. As a result, images were usually stored and displayed using an indexed palette with only a handful of colours. When digitising photographs, this limited palette causes colour banding. Dithering was introduced to counter this: it deliberately adds noise to an image to reduce the banding.

In this example I use an image of the Amiga boing ball. The technique requires two stages: first we need a palette, then the image is dithered to that palette. The palette can be based on the original (true colour) image, or we can force our own palette onto the image. In the first case we need a clustering algorithm to determine the most frequently occurring colours, which can then be used as the palette. In the second case you can pick any colours you like.

Below you see the result of both strategies. The top row selects a palette based on the original image, where from left to right the number of colours is 4, 8, 16, 32 and 64. In the top row no dithering is applied: you can see how bands of different colours are formed, particularly for the low colour counts. The middle row shows the same images with the same number of colours, but this time with dither applied. Even though the number of colours is identical, the bands are much less apparent. Last but not least, the bottom row shows the same image with a black-and-white palette forced onto it. From left to right, different dithering methods are used, each with slightly different results.

The source code for generating this image is, as usual, provided below. Note that dithering can be applied to any type of continuous information where the 'depth' is reduced (although this is currently not implemented in the package demonstrated here). For instance, when reducing audio from 16 to 8 bit, dithering can be applied to prevent banding. In the coming months I would like to go retro some more: I will bring back pixels into your life by showing what the AmigaFFH and adfExplorer packages have to offer. And don't forget about the chiptunes, which can be used in R via the ProTrackR package.
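To give a flavour of how error-diffusion dithering works under the hood, here is a minimal sketch of the Floyd-Steinberg algorithm in base R, applied to a greyscale gradient with a pure black-and-white palette. The function below is my own illustration; the AmigaFFH package offers more complete implementations.

    # Floyd-Steinberg dithering of a greyscale matrix (values 0-1)
    # to a black-and-white palette; a bare-bones illustration.
    floyd_steinberg <- function(img) {
      h <- nrow(img); w <- ncol(img)
      for (y in seq_len(h)) {
        for (x in seq_len(w)) {
          old <- img[y, x]
          new <- round(old)            # snap to nearest palette entry (0 or 1)
          err <- old - new             # quantisation error
          img[y, x] <- new
          # diffuse the error over neighbouring pixels (classic 7-3-5-1 weights)
          if (x < w)          img[y,     x + 1] <- img[y,     x + 1] + err * 7/16
          if (y < h && x > 1) img[y + 1, x - 1] <- img[y + 1, x - 1] + err * 3/16
          if (y < h)          img[y + 1, x    ] <- img[y + 1, x    ] + err * 5/16
          if (y < h && x < w) img[y + 1, x + 1] <- img[y + 1, x + 1] + err * 1/16
        }
      }
      img
    }

    # compare plain thresholding with dithering on a smooth gradient
    gradient <- matrix(seq(0, 1, length.out = 100), 100, 100, byrow = TRUE)
    par(mfrow = c(1, 2))
    image(round(gradient), col = c("black", "white"), main = "thresholded")
    image(floyd_steinberg(gradient), col = c("black", "white"), main = "dithered")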

Friday, 30 June 2017

Trick a Shiny Event Listener

Shiny allows you to build a graphical user interface for your R script. These so-called Shiny Apps are R scripts that run on a Shiny server. The script is split into two parts: a part that does all the heavy-duty calculations on the server, and a part that generates a graphical user interface to interact with the user. I don't intend to explain Shiny in detail in this post; you can consult the Shiny page, which offers excellent information and tutorials.

In this post I will focus on a specific issue with Shiny. Information thrown from the user interface can be caught by ‘event observers’. For instance, when a user clicks a button on the interface this can be detected on the server, which then executes a specific part of the script. Shiny offers many standard user interface elements, but custom elements can also be created.

You can, for instance, create a link that can be clicked, which then sends information to the server. The problem is that such 'events' (clicks) will only be detected by the Shiny server when the information that is being sent changes. So, when you create a custom interface element that sends the same information each time it is clicked, repeated clicks will not be detected.

In the example below I show how you can work around this issue. I solved it by negating the information (i.e., multiplying it by minus one) each time the custom interface element is clicked. This way, every click is detected by the server. On the server side you only need to take the absolute value of the data sent by the interface.
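A minimal sketch of this trick is shown here. The input name 'myLink' and the payload 42 are made up for illustration, and I use the JavaScript function Shiny.onInputChange to push the value to the server:

    library(shiny)

    ui <- fluidPage(
      # custom link: flip the sign of the payload on every click,
      # so Shiny always sees a changed value
      tags$a(href = "#", "Click me!", onclick = "
        this.dataset.sign = -(this.dataset.sign || 1);
        Shiny.onInputChange('myLink', this.dataset.sign * 42);
        return false;"),
      verbatimTextOutput("result")
    )

    server <- function(input, output, session) {
      output$result <- renderPrint({
        req(input$myLink)
        abs(input$myLink)   # undo the sign flip; always yields 42
      })
    }

    shinyApp(ui, server)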

A fully commented version of this workaround is provided below, together with the App actually running on the shinyapps.io server.

Note: the app runs under my free account at shinyapps.io. So, when it is not working, I probably ran out of my monthly server-time. In that case please come back later and try again. Or get the source code from my Gist:

Saturday, 7 January 2017

Filling SpatialPolygons with an image

Plotting country outlines is really easy in R, but making those plots a bit more fancy can be frustrating. I thought it would be nice to fill the country outlines with an image rather than with a solid colour. How hard could it be? After some googling it appears that there is very little documentation on this topic. The only document I found was by Paul Murrell (2011). He plots a black filled country shape first and then captures the rasterised shape using the grid.cap function from the grid package. This captured raster is then used as a mask on the image that needs to be plotted in the shape of a country.

I'm having difficulties with this approach for two reasons: one, I can't seem to get grid.cap working as it should; two, I don't like having to plot the shape of a country first in order to create a mask. The first matter is probably my own doing, but all the grid.cap function returns me is a matrix of white pixels, and I grew tired of figuring out what I was doing wrong. As for the second, I think a more direct approach is possible using the 'over' function from the sp package.

So these are the steps that I took to tackle the problem:

  1. Download the country shape
  2. Download a suitable png image
  3. Georeference the image, such that it matches the location of the country
  4. Use the ‘over’ function to determine which pixels are inside the country shape
  5. Plot the pixels of the image that are inside the country and plot the outline

In this post I will show you how to create the image shown below, where each country is filled with an image of its respective flag. Below I will explain in a bit more detail how this image was created, following the steps listed above. I've also provided the script used to create the image, where each of the steps is clearly marked.

The country shapes for step 1 are downloaded from GADM using the getData function from the raster package. Of course you can use any SpatialPolygons object from any other source.

The country flag png images are downloaded from Wikipedia (step 2). So this example also shows how to download a png image and turn it into a usable format. If you like, you can use another image format like jpg or tif, but that will require some modification of the code presented here.

To understand the next step (3), I may need to explain what georeferencing means: basically, it is telling where and how the image should be positioned in the coordinate reference system (CRS) of the SpatialPolygons object (the country outlines). In this step I instruct the system to stretch the image to the bounding box of the respective country. This also means that the aspect ratio of the original image will probably be distorted; if you would like to keep the original aspect ratio, you need to modify this step. The image was downloaded as an array with the red, green and blue components in separate dimensions. With the brick function from the raster package this array is turned into a correctly georeferenced brick object.
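In code, step 3 might look something like this (a sketch, assuming 'img' is the array returned by png::readPNG and 'country' is the SpatialPolygons object of the country):

    library(raster)

    flag <- brick(img * 255)                   # one raster layer per colour channel
    bb <- bbox(country)
    # stretch the flag to the country's bounding box
    extent(flag) <- extent(bb[1, 1], bb[1, 2], bb[2, 1], bb[2, 2])
    projection(flag) <- proj4string(country)   # adopt the country's CRS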

Now that we have properly positioned the image, we need to determine which pixels are located inside the country outline (step 4). For this purpose we use the 'over' function from the sp package. This function will not accept a raster brick object as input, so the brick is first cast into a SpatialGrid format. The function returns, for each grid cell, the country shape element that matches it; from this it can be derived whether a pixel is situated inside or outside the country outline. The pixel values of those situated outside the country outline are set to NA. Note that this can be a time-consuming step, especially when the resolution of your image is high and/or the country outline is complex (i.e., contains a lot of detail, islands and/or holes). So either be patient, or speed up the process by lowering the resolution of your image (hint: focal) or simplifying the country outline (hint: gSimplify).

All that is left to do is plot the image and the outline (step 5). I use the plotRGB function from the raster package to plot the image. Don't forget to set the 'bgalpha' argument to 0 to ensure that NA values are plotted as transparent pixels; otherwise white pixels will be plotted over anything that was previously drawn.
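Steps 4 and 5 combined might look something like this (a minimal sketch, again assuming 'flag' is the georeferenced brick and 'country' the SpatialPolygons object, both in the same CRS):

    library(sp)
    library(raster)

    grd <- as(flag, "SpatialGridDataFrame")      # 'over' needs sp classes
    hit <- over(as(grd, "SpatialGrid"), country) # polygon index per pixel, NA outside
    grd@data[is.na(hit), ] <- NA                 # blank the pixels outside the outline
    flag <- brick(grd)

    plotRGB(flag, bgalpha = 0)                   # NA pixels become transparent
    plot(country, add = TRUE)                    # draw the outline on top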

I've wrapped four of the five steps in a function, such that I could easily repeat these steps for different countries. Hopefully this post will help you to create your own cool graphics in R. Good luck, and please let me know in the comments if any of the steps are not clear...

Monday, 19 December 2016

Segmenting SpatialPolygons

Both SpatialPolygons and SpatialPolygonsDataFrame objects consist of lists of Polygons. Each of these Polygons in turn consists of vertices, i.e., the points where the edges of the polygon meet. When you transform a SpatialPolygons(DataFrame) to a different projection (e.g., from long-lat coordinates to UTM) with the spTransform function, only the vertices are transformed (and not the lines connecting them). This means that in the new projection, straight lines are still drawn between the vertices, whereas in reality they should be curved (due to the quasi-spherical shape of our world). For most shapes this isn't much of an issue, but when your shape covers a large area, you could run into serious problems.

This problem can be solved by adding more vertices to the polygons. In other words, add new vertices by interpolating between the existing vertices. If you do this before re-projecting, the new projection will be a lot more accurate. I tried to find an existing package in R that would do this, but I could not find any (please leave a comment if you did find an alternative). Therefore, I've written a small function to take care of this myself, which I will demonstrate in this post.

The source code of this function and the accompanying example is given below. This code also produces the graphical illustration below. Note how the red line crosses the islands northeast of Scotland in the original projection. Compare this with the new projection of the red lines with and without segmentation. Also note that the red lines in the new projection are straight when we don't apply segmentation, and curved (as they are supposed to be) when we do.

The segmentation is achieved relatively simply by looping over all the lists of Polygons with the lapply function and then looping over each line segment with the apply function (see R script below). The new vertices are derived via linear interpolation using the approx function. You could go for something more fancy like spline interpolation, but linear will do just fine in this example. As I've implemented all this in a function named 'SPSegmentation', you can simply call it on any SpatialPolygons or SpatialPolygonsDataFrame object you would like. With some modifications, you could also use this function on SpatialLines and SpatialLinesDataFrame objects.
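A stripped-down version of this idea is sketched below. It is my own simplified variant: it interpolates with seq rather than approx, which amounts to the same linear interpolation but also copes with vertical edges, and it ignores slots like the label point, which is fine for reprojecting and plotting.

    library(sp)

    segmentize <- function(sp_obj, n = 10) {
      # add n - 1 intermediate vertices to every edge of every Polygon
      sp_obj@polygons <- lapply(sp_obj@polygons, function(polys) {
        polys@Polygons <- lapply(polys@Polygons, function(poly) {
          crd <- poly@coords
          new_crd <- do.call(rbind, lapply(seq_len(nrow(crd) - 1), function(i) {
            # linear interpolation between vertex i and vertex i + 1;
            # drop the last point to avoid duplicating the next edge's start
            cbind(seq(crd[i, 1], crd[i + 1, 1], length.out = n + 1),
                  seq(crd[i, 2], crd[i + 1, 2], length.out = n + 1))[-(n + 1), ]
          }))
          Polygon(rbind(new_crd, crd[nrow(crd), ]))  # close the ring again
        })
        polys
      })
      sp_obj
    }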

In case you want to achieve the opposite (i.e., reduce the number of vertices in a SpatialPolygons(DataFrame)), you can use the gSimplify function from the rgeos package. This uses a Ramer–Douglas–Peucker algorithm to simplify the shape, which could save you some computing time, but your shape will also become less detailed as a result.

Saturday, 26 November 2016

ProTrackR: chiptunes in R (part three)

It has been a while, but I've made some small progress on the ProTrackR package. Many bugs are fixed and some functionality has been added, making tracking in R easier and more fun. As if developing a package for handling ProTracker modules isn't the nerdiest thing ever, I found a way to top it off: what if I combine the ProTrackR package with a Shiny interface? Well, you get a basic online tracker.

For more information you can also read parts one and two on ProTrackR.

Friday, 25 November 2016

Fixing a function

Sometimes R functions don't do exactly what you want. In that case you could look up the source code for the function, paste it into your own code and make your modifications there. But sometimes it only takes a small tweak and you want to keep your code as clean as possible. For me this was the case with the vuong test in the pscl package: this function only prints its output, but does not return it. This can easily be fixed by converting the function into text with the 'deparse' function. Then the 'print' statement can be replaced by a 'return' statement using 'gsub'. Finally, the modified text can be turned into a function again by calling 'eval' and 'parse'. With this hack the new function returns the output I need. The code below shows everything described in this post. Go ahead and mutate functions just like that...
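In code, the hack might look something like this (a sketch: the exact gsub pattern depends on how the print call appears in the deparsed source, so inspect deparse(vuong) first and adapt it to your version of the package):

    library(pscl)

    fn_text <- deparse(vuong)                 # function -> character vector
    # replace the print statement with a return statement
    fn_text <- gsub("print(", "return(", fn_text, fixed = TRUE)
    vuong2 <- eval(parse(text = fn_text))     # text -> function again
    # restore the namespace so the function can still reach pscl internals
    environment(vuong2) <- environment(vuong)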

Friday, 28 October 2016

Openstreetmap relation, please be a SpatialPolygons object

The sp package is the package when it comes to handling spatial data in R. Getting to know how this package works can take some time, but it's well worth it. It can be a pain, though, to import spatial data from, for instance, www.openstreetmap.org.

The other day for instance, I required an outline of Lake IJssel. How hard could it be to get that from openstreetmap? Well, getting the data is not all that hard, but converting it into something workable really got me pulling my hair out. In order to save future frustrations, I decided to share my solution with you.

The first step is to get the relation id number from openstreetmap. This can be easily obtained by searching the website for ‘Lake IJssel’. After clicking the first hit that represents a water body, you will find the relation id.

The second step is to download the data in R using this relation id. This can be done with the get_osm function of the osmar package. This data can be converted into a SpatialLines object using the as_sp function from the osmar package. However, I don't want a SpatialLines object; I want it to be a SpatialPolygons object. And that is the tricky bit, because there is no package available (that I know of) that can do this directly.

So this is the trick that I applied. First you need to join the lines in the SpatialLines object, using the gLineMerge function from the rgeos package (it took me a while to figure this one out!). Then convert it into a PolySet using the SpatialLines2PolySet function from the maptools package. Only then can it be converted into a SpatialPolygons object using the PolySet2SpatialPolygons function from the maptools package. If anyone knows of an easier way, please let me know in the comments.
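Put together, the chain might look something like this (a sketch; the relation id 1234567 is a placeholder for the id you looked up on the website):

    library(osmar)
    library(rgeos)
    library(maptools)

    lake <- get_osm(relation(1234567), source = osmsource_api(), full = TRUE)
    sl   <- as_sp(lake, "lines")           # osmar object -> SpatialLines
    sl   <- gLineMerge(sl)                 # join the touching line pieces
    ps   <- SpatialLines2PolySet(sl)       # SpatialLines -> PolySet
    poly <- PolySet2SpatialPolygons(ps)    # PolySet -> SpatialPolygons
    plot(poly, col = "lightblue")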

So, there you have it. A SpatialPolygons object of Lake IJssel, that is actually useful for my purpose. The code below will do what I've described in this post; feel free to try it yourself! Note that it may not work on all openstreetmap data...

Monday, 18 July 2016

2D particle tracking in R, part 2: kernel density

In my previous post, I showed how particle movement can be simulated in the marine environment. Now that we have tracked the particle movement, how can we translate this information into something more meaningful? This can be done by determining the particle density on the map.

The simplest way would be to lay a grid over the particles and simply count the number of particles in each grid cell. However, I would like to go for something a little more complicated that gives smoother results: kernel density estimates. This means that I will treat each particle as a density function, where the position of the particle gives the highest density for that particle. You get a density map of all particles by simply adding up the density functions of the individual particles. Fortunately, the KernSmooth package will take care of all this.

Remember that in the original simulation I released 5,000 particles within a specific time-frame. With the kernel density estimates, I can estimate the number of particles per square kilometre at any time and location. This becomes even more interesting if these particles represent something in the real world, say oil. If we know how many cubic metres of oil are represented by each particle, the estimates can be transformed into the thickness of the oil slick. Keep in mind that this is just an initial test simulation that does not include all processes required to properly simulate oil spills (e.g., oil evaporation, degradation and dispersion are not simulated).
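A minimal sketch of this step, assuming 'particles' is a data.frame with the particle positions in columns x and y (in km):

    library(KernSmooth)

    dens <- bkde2D(cbind(particles$x, particles$y),
                   bandwidth = c(1, 1),      # 1 km of smoothing in each direction
                   gridsize  = c(200, 200))
    # fhat integrates to 1, so scale by the particle count to get
    # particles per square kilometre
    image(dens$x1, dens$x2, dens$fhat * nrow(particles),
          xlab = "x (km)", ylab = "y (km)")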

Here's a video created with ffmpeg of the particle density movement over time.

And here's the fully commented script used to create the frames for the video.

Friday, 15 July 2016

2D particle tracking in R

How hard is it nowadays to simulate particle movement in the sea and track the particles in R? Well, I discovered that it is not all that difficult. Within a couple of hours I had a simple simulation model running in the North Sea. All you need are time-dependent water currents and wind data, both of which can be obtained from marine.copernicus.eu.

The main steps that you will have to follow:

  • load the water currents and wind data
  • specify particle release details
  • specify simulation details
  • loop all simulation time steps and for each step do the following with the particles that are released:
    • for each particle determine the local water current and wind vector at the current time index
    • transform the particle coordinates to a projection in meters
    • move the particles with the water current (m/s) multiplied by the time step duration (s)
    • move the particles with the wind speed (m/s) multiplied by the time step duration (s) and a drift factor (between 0 and 1)
    • move the particles in random directions (to simulate diffusion)
    • transform the particle coordinates back to the original projection
    • save particle information every nth simulation step, where n is any positive integer value

In this simulation I released 5,000 particles at a constant rate over 5 days and followed them for a couple of days. I found the trickiest bit not to be the simulation itself, but reading the NetCDF files such that they could be used in a convenient manner. I created a video of this simulation with ffmpeg. Below you will find the script that does all this (each of the steps above is marked in the script).
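The heart of the simulation loop boils down to something like this (a simplified sketch, not the actual script: positions x and y are in metres in a projected CRS, cur_u/cur_v and wnd_u/wnd_v are the interpolated current and wind components in m/s, dt is the time step in seconds, and the drift factor and diffusion coefficient D are made-up values):

    step_particles <- function(x, y, cur_u, cur_v, wnd_u, wnd_v,
                               dt = 3600, drift = 0.03, D = 10) {
      # advection: water current plus a fraction of the wind speed
      x <- x + (cur_u + drift * wnd_u) * dt
      y <- y + (cur_v + drift * wnd_v) * dt
      # diffusion as a random walk with sd = sqrt(2 * D * dt)
      x <- x + rnorm(length(x), sd = sqrt(2 * D * dt))
      y <- y + rnorm(length(y), sd = sqrt(2 * D * dt))
      list(x = x, y = y)
    }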

So what's the practical use of all this? Well, obviously you can simulate the trajectory of any floating material, such as oil. However, you have to keep in mind that the simulation is only as good as the underpinning data and parameterisation. For instance, the water current data used here are not that accurate near estuaries. Of course you can always do multiple runs of your simulation with different parameter settings (or even make it a Monte Carlo simulation). Also, here I only simulate in 2D, but there is nothing that would stop you from modifying the script to include the third dimension. What should also be added is some detection of particles that have beached, but that should not be very difficult when you have coastline data.

Why in R? Well, commercial particle tracking software is generally expensive, whereas free software is often rigid and doesn't allow the user to include custom modifications. In R the simulation is free, and you can take the exact implementation into your own hands... Hopefully this post will help you get started with your own simulations. Good luck!

Here's the script:

Edit: if you're interested, you might want to check out the follow-up post on particle density estimates for this simulation.

Saturday, 7 November 2015

ProTrackR: chiptunes in R (part two)

A short follow-up on my previous post: I have just released a new version of the ProTrackR package (available on CRAN). It can now actually play ProTracker modules, yeah! The routine isn't perfect yet, but it does a decent job at the moment. The example below shows how you can download a module from ModArchive and play it:
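Something along these lines (a sketch, assuming the package's modArchive.download and playMod functions; the module id 12345 is a placeholder for one picked from ModArchive):

    library(ProTrackR)

    mod <- modArchive.download(12345)   # placeholder ModArchive id
    playMod(mod)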

Saturday, 26 September 2015

ProTrackR: chiptunes in R (part one)

ProTracker is a popular music tracker for sequencing music on a Commodore Amiga machine. As I still own a Commodore Amiga 500, I thought it would be a nice challenge to develop an R package that could handle ProTracker module files, and so I did. The first version of this package was just released on CRAN.

In this post I will briefly explain what ProTracker modules are and how they work. Then I will give a short demonstration of the current sweetness the ProTrackR package has to offer. On this blog I will try to keep you updated on new versions of this package and its new features.

The module (MOD) is a computer file format used primarily to represent music. A MOD file contains a set of instruments in the form of samples, a number of patterns indicating how (which notes and effects) and when the samples are to be played, and a list of what patterns to play in what order.

The first release of the ProTrackR package contains the bare bones required to import and export such module files, and to modify them any way you'd like (and the file format restrictions allow). This is a good time to start with an example: let's load the module 'Elekfunk' from The Mod Archive.
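For instance (a sketch using the package's read.module function; the download URL is a placeholder, as the module's exact ModArchive link is not reproduced here):

    library(ProTrackR)

    con <- url("https://modarchive.org/<path-to-elekfunk.mod>", "rb")
    mod <- read.module(con)
    close(con)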

This module was used in a demo by Sanity to show off the Amiga's capabilities, which were impressive at the time.

Playing routines in the ProTrackR package are currently limited to playing the samples at specific notes (more fancy routines are in development). The examples below show how to play all samples in a module, how to play a specific sample at different notes, and how to apply different fine tunes. In any case, you can also export them as wave files.

Combined with the power of R, all kinds of fancy analyses become possible. The module, for instance, contains a sample of a guitar slide. In R we can now generate a power spectrum of it (which would have been tricky to do in real time on an original chipset Amiga).
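The analysis itself can be done in base R. Here is a sketch of a short-time power spectrum, assuming 'wav' is the sample's waveform as a numeric vector and 'rate' its sample rate in Hz (how to pull the waveform out of the module is left to the package documentation):

    power_spectrogram <- function(wav, rate, window = 1024) {
      han    <- 0.5 - 0.5 * cos(2 * pi * seq_len(window) / window)  # Hann taper
      starts <- seq(1, length(wav) - window, by = window %/% 2)     # 50% overlap
      pwr <- sapply(starts, function(s) {
        seg <- wav[s:(s + window - 1)] * han
        Mod(fft(seg)[seq_len(window / 2)])^2                        # power per bin
      })
      image(x = starts / rate,
            y = (seq_len(window / 2) - 1) * rate / window,
            z = t(log10(pwr + 1e-12)),
            xlab = "time (s)", ylab = "frequency (Hz)")
    }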

The power spectrum shows the shift in frequency you would expect for a guitar slide. There are plenty of other examples I could show you here, but I'll leave it at this for the moment. Hopefully you enjoyed these examples of the ProTrackR package; please stay tuned for updates on this package.

Saturday, 5 September 2015

Geotags from JPEG images

Most modern cameras and telephones can geotag their pictures. Kind of neat that you can use your camera as a GPS logger. Even better if we could do something useful with it in R. In this post I will show how GPS coordinates can be extracted from jpeg images and utilised in R. Although it sounds pretty simple, it is not very straightforward to achieve.

In this example I use pictures I took on a day out. We made a small round trip by steam tram from Hoorn to Medemblik, where we visited castle Radboud. I will show how the locations where I took the pictures can be plotted onto a map. For privacy reasons I will not share the pictures; however, the example should also work with your own geotagged pictures.

Geotags and other meta-information are stored in jpeg files in the so-called exchangeable image file format (EXIF). Luckily this information can be obtained using the GDALinfo function (from the rgdal package). This is where things get messy: the function returns the EXIF codes as character strings in the attributes of the object that is returned. So we need to fetch the relevant attributes, find the relevant EXIF codes, and extract the information from each code and convert it into something workable.

The EXIF codes for GPS information will look something like ‘EXIF_GPSLatitude=(52) (40) (57.45)’. I use regular expressions to extract the coordinate information from between the brackets. I'm no expert in the field of regular expressions, so there may be more efficient ways to get the right information. If you have improvements, please feel free to leave them in the comments.

I've wrapped the routines that isolate the respective EXIF code and extract the numerical coordinates from the literal EXIF character string in two slick functions (‘get_EXIF_value’ for getting the literal string and ‘get_EXIF_coordinate’ for getting the numerical coordinate from that string).
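A sketch of what these two functions could look like (assuming the EXIF strings end up in the 'mdata' attribute of the GDALinfo result; the helper names mirror those mentioned above):

    library(rgdal)

    get_EXIF_value <- function(file, tag) {
      md <- attr(GDALinfo(file, silent = TRUE), "mdata")  # metadata strings
      md[grepl(paste0("^", tag, "="), md)]
    }

    get_EXIF_coordinate <- function(exif_string) {
      # pull the numbers from between the brackets: degrees, minutes, seconds
      dms <- as.numeric(regmatches(exif_string,
               gregexpr("(?<=\\()[0-9.]+(?=\\))", exif_string, perl = TRUE))[[1]])
      dms[1] + dms[2] / 60 + dms[3] / 3600                # decimal degrees
    }

    get_EXIF_coordinate("EXIF_GPSLatitude=(52) (40) (57.45)")  # 52.68263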

After some rearranging of the data, we have a data.frame holding the filename and the longitude and latitude in decimal WGS84 degrees. I noticed that not all my pictures had a geotag (even though I had geotagging turned on on my device). Apparently taking pictures is not a foolproof method for logging locations (at least not on my device).

Let's plot the coordinates on a map. I download some OpenStreetMap-based map tiles using the ‘openmap’ function. This function conveniently allows you to specify the upper-left and lower-right coordinates of your map; that way, you can be certain that your locations will fit on the map.

As we cannot change the projection of the downloaded map tiles, we have to transform the coordinates of our locations to the same projection as the tiles. We do this by first creating a SpatialPoints object from our coordinates and then transforming those coordinates to the proper projection with spTransform. Plotting is now simply a matter of calling some plot routines.
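In code (a sketch; the bounding box coordinates are made up, and 'coords' is assumed to be the data.frame with lon and lat columns):

    library(OpenStreetMap)
    library(sp)

    map <- openmap(upperLeft = c(52.8, 5.0), lowerRight = c(52.6, 5.3))
    pts <- SpatialPoints(coords[, c("lon", "lat")],
                         proj4string = CRS("+proj=longlat +datum=WGS84"))
    pts <- spTransform(pts, osm())   # osm() gives the tiles' mercator projection
    plot(map)
    plot(pts, add = TRUE, pch = 19, col = rainbow(nrow(coords)))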

To make things a bit more fancy, I used the rainbow palette to color the locations in the order in which the pictures were taken. Note that the use of geotagged images is limited. Proper GPS loggers for instance also log the precision of the position and your speed. But I will save that for a future post.

Saturday, 22 August 2015

R Chemistry in 3D

The three-dimensional model of the acetylcholine molecule shown below is interactive, provided your browser supports WebGL. You can use your mouse to rotate and zoom the model. In this post I will show you the main steps required to create such an interactive 3D model in R: easy peasy (more or less).


There are several R packages available for cheminformatics. However, as far as I could tell, there aren't any packages that can directly create 3D visualisations of molecules. Therefore, we combine the molecule-loading capabilities of the rcdk package with the 3D rendering capabilities of the rgl package.

Although it is possible to derive a 3D structure from a SMILES code (see e.g. the SMILES translator from the Computer-Aided Drug Design Group), I couldn't find such functionality in any R package. Therefore, we use an existing 3D model of the acetylcholine molecule from PubChem in this example. The first part of the script downloads the molecule and then loads it into working memory.
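The download-and-load part might look something like this (a sketch using PubChem's REST interface; CID 187 is acetylcholine's PubChem identifier):

    library(rcdk)

    url <- paste0("https://pubchem.ncbi.nlm.nih.gov/rest/pug/",
                  "compound/cid/187/record/SDF?record_type=3d")
    download.file(url, "acetylcholine.sdf", mode = "wb")
    mol <- load.molecules("acetylcholine.sdf")[[1]]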

Subsequently, we define some atom properties which we will need for plotting later on: the atom symbol, the colour to be used in the model, and the Van der Waals radii of the atoms as listed on Wikipedia. Note that we only define these properties for the four elements present in acetylcholine; if you want to apply the code to other molecules, you have to make sure all relevant atoms are specified. In our example we continue by getting the 3D atom coordinates from the molecule file, sticking them in a data.frame and merging it with the previously defined atom properties.
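A sketch of these properties and the coordinate extraction (the radii, in Angstrom, are the Wikipedia values; get.atoms, get.symbol and get.point3d are rcdk accessors):

    props <- data.frame(symbol = c("H", "C", "N", "O"),
                        colour = c("white", "grey30", "blue", "red"),
                        radius = c(1.20, 1.70, 1.55, 1.52))
    atoms  <- get.atoms(mol)
    coords <- do.call(rbind, lapply(atoms, get.point3d))      # x, y, z per atom
    idx    <- match(sapply(atoms, get.symbol), props$symbol)  # look up properties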

Now all we have to do is plot spheres at the positions of the atoms, with the right sizes and colours. For this we use the spheres3d function. We could also define our own (smoother) mesh of a sphere, but that would result in huge html files when saving the model with writeWebGL, so we stick with the lighter spheres3d function. Unfortunately, in Internet Explorer strange shadows appear (a bug?), but in other browsers it looks fine. Just see for yourself at the top of this page.
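With the coordinates and properties from the previous step, the plotting boils down to something like this (the radii are scaled down a little so the molecule doesn't turn into one solid blob):

    library(rgl)

    open3d()
    spheres3d(coords, radius = props$radius[idx] / 4,
              color = props$colour[idx])
    writeWebGL(dir = "molecule")   # produces molecule/index.html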

Although writeWebGL produces an html file, which is ready to go and can be uploaded to your website directly, it may need some editing when embedding it in a custom template. This requires some knowledge of the html and JavaScript languages. I won't be explaining the modifications I made here.

As a bonus, if you're more of a fan of the ball-and-stick model, below is the code to add sticks to your molecule model.