OpenRefine for Data Mangling

I often use data from various sources and sometimes have to get creative about transforming it into a format that I can use easily. So far I have done this mainly with R (and with Python from time to time). The other day, Dr. Mazzolla pointed me to OpenRefine, an application for data clean-up and transformation. I will most probably never give up R, but what I saw in the project videos impressed me.

The project used to be called Google Refine. When Google cut support for it, the project went open source and became OpenRefine around 2012. OpenRefine is not a web service; it runs on your own computer. Basically, you start an OpenRefine server locally and access it through a web interface in your browser.

The idea behind the project is to filter data in various ways and then run transformations on the filtered observations. Observations can be manipulated programmatically through Python (run via Jython inside OpenRefine) or GREL (Google Refine Expression Language), so your previous Python knowledge can serve you here.
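To make the filter-then-transform idea concrete, here is a minimal sketch in plain Python. It is not OpenRefine code: the sample rows, the column names and the helper functions (`matches_filter`, `transform`, `apply_to_filtered`) are all made up for illustration. Inside OpenRefine the transformation would instead be written as a short expression over the cell's `value`, and the filtering would be done through the interface.

```python
# Hypothetical sample rows; in OpenRefine these would come from an imported project.
rows = [
    {"city": " new york ", "country": "US"},
    {"city": "BERLIN", "country": "DE"},
    {"city": "  boston", "country": "US"},
]


def matches_filter(row):
    """Stand-in for a filter: keep only rows whose country is US."""
    return row["country"] == "US"


def transform(value):
    """Stand-in for a cell expression: trim whitespace and title-case."""
    return value.strip().title()


def apply_to_filtered(rows, column, keep, fn):
    """Return new rows with fn applied to `column` only where keep(row) is true."""
    return [
        {**row, column: fn(row[column])} if keep(row) else dict(row)
        for row in rows
    ]


cleaned = apply_to_filtered(rows, "city", matches_filter, transform)
for row in cleaned:
    print(row)
```

Only the rows that pass the filter are changed; the Berlin row comes back untouched, which is the behaviour I would expect from a transformation run on a filtered selection.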

Find the project here

