Google Refine extension

Google Refine uses a modular web framework called "Butterfly" to allow custom extensions to be built and make use of Refine's core functionality.

Extensions tend to be a mixture of front-end code (JavaScript and UI modifications) and back-end code (Java commands and servlets). Each extension lives in its own folder inside the "extensions" folder in Refine's root directory.

The LinkedGov extension modifies Refine's "index" page (Extension/index_page), mainly to include a metadata form, and its "project" page (Extension/project_page), mainly to include the "Typing" panel (Extension/typing panel).


Overview

https://github.com/linkedgov/linkedgov-google-refine-extension

The LinkedGov UI skin for Google Refine should exist as an extension in the /extensions folder.

The extension also relies on the RDF extension.


Folder structure

See Extension/folder structure

Code structure

See Extension/code structure

Pages

There are only two pages to work with in Google Refine.

The index page is the landing page once the Refine servlet has started. From here you can create a project, import a project or open an existing project.

The project page is where the data manipulation is carried out - much like a worksheet in spreadsheet software.

LinkedGov adds modifications to both pages.


Index page

Extension/index page

The index page shows a particular panel depending on the "mode" parameter passed to it in the URL from the menu page.

It is home to the "import" screen and the "resume" screen, which are used to begin a "project" (Refine's terminology).
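The panel selection described above can be sketched in JavaScript. This is only an illustration, not the extension's actual code: the function names, the panel names, the parameter values "import" and "resume", and the fallback to the import panel are all assumptions made for the example.

```javascript
// Hypothetical helper: read the "mode" query parameter from a URL string.
function getMode(url, fallback) {
  var query = (url.split("?")[1] || "").split("#")[0];
  var pairs = query.split("&");
  for (var i = 0; i < pairs.length; i++) {
    var parts = pairs[i].split("=");
    if (decodeURIComponent(parts[0]) === "mode" && parts.length > 1) {
      return decodeURIComponent(parts[1]);
    }
  }
  return fallback;
}

// Hypothetical mapping from "mode" values to panel names (assumed, not
// taken from the extension's source).
var PANELS = { "import": "import-panel", "resume": "resume-panel" };

// Pick the panel to show; unknown or missing modes fall back to "import".
function panelForMode(url) {
  var mode = getMode(url, "import");
  return PANELS.hasOwnProperty(mode) ? PANELS[mode] : PANELS["import"];
}
```

In the browser, a call such as panelForMode(window.location.href) would give the name of the panel to reveal.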


Project page

Extension/project page

The "project" page in Refine is home to the data table and is where data manipulation, transformation and so on take place.

The LinkedGov extension adds a number of additional "panels" on the left-hand side that allow the user to clean, link and label data.


Installation

  • Add four lines of Ant code (two "build" lines and two "clean" lines) to the "build.xml" file found in the extensions folder. The "build" lines are:
   <ant dir="rdf-extension/" target="build" />
   <ant dir="linkedgov/" target="build" />

and the "clean" lines are:

   <ant dir="rdf-extension/" target="clean" />
   <ant dir="linkedgov/" target="clean" />
  • Rebuild Refine by running the command "ant" in a terminal from the main Refine directory (you should see a "BUILD SUCCESSFUL" message).
  • Run Google Refine by running the command "./refine" in the terminal (or the equivalent for your operating system).


Modifying Refine

See Extension/modification regarding changes to Refine's default behaviour.

Styling

Styling the pages is fairly straightforward. The index.js and project.js files both add the class "lg" to the <body> element on each page. Each CSS file then styles any Refine elements or LinkedGov elements using "body.lg" as a prefix.
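The class-tagging step can be sketched as follows. This is a minimal, self-contained illustration rather than the code in index.js or project.js: the helper name addClass is hypothetical, and the extension itself may well do the same thing with a library call instead.

```javascript
// Hypothetical helper: add a class name to an element-like object
// (anything with a className string) unless it is already present.
function addClass(element, name) {
  var current = (element.className || "").trim();
  var classes = current ? current.split(/\s+/) : [];
  if (classes.indexOf(name) === -1) {
    classes.push(name);
  }
  element.className = classes.join(" ");
  return element;
}

// In a browser, tag the <body> element so CSS rules can be scoped
// with the "body.lg" prefix. Skipped when no DOM is available.
if (typeof document !== "undefined" && document.body) {
  addClass(document.body, "lg");
}
```

Because every LinkedGov rule is prefixed with "body.lg", the extension's styles only apply once this class is present, which keeps them from leaking into an unmodified Refine page.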

There's a mixture of CSS and LESS files for styles.


Dialogs

See Extension/Dialogs.

Feedback form

A feedback form is available throughout the Importer.

See Extension/feedback_form for more information.

RDF output

See Extension/RDF & Extension/RDF Schema.

RDF (Resource Description Framework) data is produced behind the scenes when interacting with the wizards and cleaning data. It's produced using functionality from the RDF Extension built by DERI.

Examples of the data & structure generated:

Example datasets

A list of good and bad example datasets (locations, contents) has been compiled here: Extension/Example_datasets

Unacceptable data

  • Personal data (telephone numbers, house addresses)
  • Geographic coordinates other than WGS84, negative northings/eastings?
  • ...

Importing issues

There are issues with some types of data when importing.

See Extension/importing issues.


Browser compatibility issues

See Extension/cross browser compatibility.

Bugs

See Extension/bugs.


Feedback

See Extension/feedback.

Our reported bugs

Our feature requests
