OpenRefine is a free and open source (FOSS) tool with which you can (batch) edit and upload files on Wikimedia Commons. OpenRefine focuses on adding and editing structured data.
This page collects information about OpenRefine for the Wikimedia Commons community.
Frequently Asked Questions
- I have problems installing / opening OpenRefine on my computer. What should I do?
You can download the latest stable version of OpenRefine from its website. OpenRefine's manual includes detailed installation instructions; make sure to read these.
- If you use Windows, then make sure you install the OpenRefine kit with embedded Java.
Some users are unable to install OpenRefine because of, for instance, firewall issues, or because their organization or company does not allow users to install external software. In that case, you can use Wikimedia's cloud version of OpenRefine on PAWS, which is described elsewhere on this page.
- Does OpenRefine allow upload of all types of files that Wikimedia Commons supports?
OpenRefine supports all kinds of media files that can be uploaded to Wikimedia Commons. It does not support the upload of data files.
Many upload tools, including OpenRefine, sometimes have trouble uploading TIFF files.
Please note that OpenRefine supports uploading local files of up to 100 MB, not larger. Larger files can be uploaded from a URL, though.
- What is the maximum size of files that can be uploaded to Wikimedia Commons with OpenRefine?
OpenRefine does not (yet) support chunked uploads, and hence only allows uploads of files from your local drive up to 100 MB. See the GitHub issue (help with fixing this issue is very welcome). Uploading larger files to Wikimedia Commons is possible via URLs on the web. If that is not an option for you, please use Pattypan or the Upload Wizard.
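Before starting a batch, it can help to check which local files exceed the limit described above. The following is a minimal sketch, not part of OpenRefine itself. The function name `split_by_upload_limit` is hypothetical, and reading "100 MB" as 100 × 1024 × 1024 bytes is an assumption; adjust the constant if the limit turns out to be decimal.

```python
from pathlib import Path

# Assumption: "100 MB" is interpreted as 100 * 1024 * 1024 bytes.
MAX_LOCAL_UPLOAD_BYTES = 100 * 1024 * 1024


def split_by_upload_limit(folder, limit=MAX_LOCAL_UPLOAD_BYTES):
    """Return (within_limit, too_large) lists of files found directly in `folder`."""
    within_limit, too_large = [], []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # skip subfolders and other non-file entries
        if path.stat().st_size <= limit:
            within_limit.append(path)
        else:
            too_large.append(path)
    return within_limit, too_large
```

Files in the second list would need to be uploaded from a URL, or with Pattypan or the Upload Wizard.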
- How many files can I upload in one session or project? Can I upload 10,000s or even 100,000s of files at once?
OpenRefine can easily handle datasets of up to tens of thousands (potentially hundreds of thousands) of rows of data. The bottleneck is the speed of uploading files to Wikimedia Commons, which is regulated by the Wikimedia Commons API. For an upload of thousands of files at once (or more), you will need some patience and you will need to keep OpenRefine open.
- Can OpenRefine retrieve embedded metadata from files (like EXIF metadata)?
This is not possible inside OpenRefine. We recommend using ExifTool (https://exiftool.org). This YouTube video explains the process quite clearly.
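As a rough illustration of that workflow, the sketch below asks ExifTool for CSV output, which can then be imported into an OpenRefine project. It assumes ExifTool is installed and available on your PATH as `exiftool`. The helper names are hypothetical; `-csv` and `-r` are ExifTool's options for CSV output and for recursing into subfolders.

```python
import csv
import io
import subprocess


def build_exiftool_command(folder):
    """Build an ExifTool call that prints one CSV row per file in `folder`."""
    # -csv: tabular output; -r: also process subfolders.
    return ["exiftool", "-csv", "-r", str(folder)]


def parse_exiftool_csv(text):
    """Turn ExifTool's CSV output into a list of dicts keyed by tag name."""
    return list(csv.DictReader(io.StringIO(text)))


def read_metadata(folder):
    """Run ExifTool (assumed to be on PATH) and return the parsed rows."""
    result = subprocess.run(
        build_exiftool_command(folder),
        capture_output=True,
        text=True,
    )
    return parse_exiftool_csv(result.stdout)
```

Each row contains a `SourceFile` column with the file path, which can serve as the key for matching metadata to files in OpenRefine.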
- What are the (dis)advantages of running OpenRefine locally? What are the (dis)advantages of the cloud (PAWS) version of OpenRefine?
When running OpenRefine locally (on your own computer):
- File uploads to Wikimedia Commons are easier: you can upload files from your own local hard drive. This is not possible on PAWS.
- You can do various tasks (especially data cleanup and joining/splitting data) without an internet connection. You do need an internet connection as soon as you want to do reconciliation or upload data and files to Commons and Wikidata.
When running OpenRefine in the cloud (via Wikimedia PAWS):
- The cloud version is convenient when you can't easily install new software on your own computer.
- You always need a live internet connection for this.
- With this PAWS/cloud version, it is not possible to upload images from your local computer.
- General links
- Talk about OpenRefine with its community and with Wikimedia users
- Bug reports and feature requests
240,592 files have been uploaded with OpenRefine.
Learn to use OpenRefine for Wikimedia Commons: WikiLearn course
This online course is available at any time, for free, for anyone with a Wikimedia account. It can be followed at your own pace, with computer-graded exercises. Following the course takes an average of 6 to 8 hours.
Upload files to Wikimedia Commons with OpenRefine (version 3.7)
For uploading files to Wikimedia Commons, you need OpenRefine 3.7. Wikimedia Commons upload is not supported in OpenRefine 3.6 or earlier versions.
Edit files on Wikimedia Commons with OpenRefine (version 3.6 and newer)
For editing Wikimedia Commons, you need OpenRefine 3.6 or newer. Wikimedia Commons is not supported in OpenRefine 3.5 or earlier versions.
It is highly recommended to also install OpenRefine's Wikimedia Commons extension.
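The version requirements from the two sections above can be summarized in a small helper. This is only an illustrative sketch: the function names are hypothetical, and it assumes a plain numeric version string such as `3.7.2` (pre-release labels such as `-beta` are not handled).

```python
def parse_version(version):
    """Convert a version string such as '3.7.2' into a comparable tuple (3, 7, 2)."""
    return tuple(int(part) for part in version.split("."))


def commons_support(version):
    """Report which Wikimedia Commons features a given OpenRefine version offers."""
    parsed = parse_version(version)
    return {
        "edit": parsed >= (3, 6),    # editing Commons needs 3.6 or newer
        "upload": parsed >= (3, 7),  # uploading files needs 3.7
    }
```

For example, `commons_support("3.6.2")` reports that editing is available but file upload is not.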
Advanced tips and tricks
There is also a page with advanced tips and tricks, which include more instructions on working with manifests and reconciliation, retrieving EXIF, special GREL recipes, and more. Add your own!
Install and run OpenRefine
As a local application on your computer
OpenRefine can be downloaded as an application and works on desktop and laptop computers with Windows, Mac and Linux operating systems. It runs a small server on your computer and you then use a web browser to interact with it. It works best with browsers based on Webkit, such as Google Chrome, Chromium, Opera and Microsoft Edge, and is also supported on Firefox.
You can download OpenRefine here. Installation instructions are available in OpenRefine's user manual.
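Because OpenRefine runs as a small local server, a quick way to see whether it has started is to test whether something is listening on its address. The sketch below assumes the default address 127.0.0.1 and port 3333; if you changed these in your configuration, pass your own values. The function name `is_listening` is hypothetical.

```python
import socket

# Assumption: OpenRefine uses its default address and port.
DEFAULT_HOST = "127.0.0.1"
DEFAULT_PORT = 3333


def is_listening(host=DEFAULT_HOST, port=DEFAULT_PORT, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or the host is unreachable.
        return False
```

Note that this only shows that some program accepts connections on that port; it does not prove that the program is OpenRefine. If the check succeeds, opening http://127.0.0.1:3333 in your browser should show the OpenRefine start screen.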
Wikimedia Commons extension for OpenRefine
You can also install OpenRefine's Wikimedia Commons extension. This is not necessary, but helpful for Wikimedia Commons batch editing. It offers:
- A start screen to load file names directly from Wikimedia Commons categories.
- Thumbnails of Commons files (not all file formats supported yet).
- Several dedicated GREL expressions to retrieve data from wikitext for further processing.
Download and installation instructions are available at https://github.com/OpenRefine/CommonsExtension
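To give an idea of the kind of data the extension's start screen works with, the sketch below builds a request to the public MediaWiki API on Wikimedia Commons that lists the files in a category. This is not the extension's own code, and how the extension retrieves file names internally is not described here; the helper names are hypothetical. The parameters `list=categorymembers`, `cmtitle`, `cmtype` and `cmlimit` belong to the standard MediaWiki Action API.

```python
from urllib.parse import urlencode

API_URL = "https://commons.wikimedia.org/w/api.php"


def category_files_url(category, limit=50):
    """Build an API URL that lists up to `limit` files in a Commons category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": f"Category:{category}",
        "cmtype": "file",  # only files, no subcategories or pages
        "cmlimit": limit,
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


def file_titles(response):
    """Extract file titles from a decoded JSON response of the query above."""
    members = response.get("query", {}).get("categorymembers", [])
    return [member["title"] for member in members]
```

The resulting titles (such as `File:Example.jpg`) correspond to the file names that you would otherwise load into an OpenRefine project through the extension's start screen.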
In the cloud (via Wikimedia PAWS)
If you are unable to install OpenRefine on your computer, or if it runs very slowly, then you can also use it in the cloud (on wmcloud.org through PAWS). Everyone with a Wikimedia account can access OpenRefine here. Visit https://hub-paws.wmcloud.org/, log in, and click on the OpenRefine (blue diamond) logo.
The Wikimedia Commons extension (mentioned above) is installed in OpenRefine on PAWS. Please note: with OpenRefine on PAWS it is NOT possible to upload files to Wikimedia Commons from your local computer.
Demo: start OpenRefine on Wikimedia PAWS