
Data from Wikipedia list of largest companies by revenue

Files that you'll need (data & code)





Step 1 Get the data from Wikipedia

On the Wikipedia page, click "edit" (above the table), copy everything, and paste it into a .txt file, or use the ready-made file "company_revenue_data.txt".

(NOTE: Use the ready-made file. The table on Wikipedia keeps changing, which makes the commands in Step 3 unusable on a fresh copy.)

Step 2 Load the data using OntoRefine

In GraphDB, go to Import -> Tabular (OntoRefine) and select your file.

Under "Line-based text files", set:

  • ignore first 18 lines
  • parse every 3 lines into one row

Click Create Project.
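
To make the two import settings concrete, here is a minimal Python sketch of what they amount to: drop the first 18 lines, then group every 3 remaining lines into one row. It only illustrates the idea and is not how OntoRefine is implemented; the function name group_lines is hypothetical.

```python
from itertools import islice


def group_lines(lines, skip=18, per_row=3):
    """Drop the first `skip` lines, then group every `per_row` lines into one row.

    Illustrative only: mirrors the "ignore first 18 lines" and
    "parse every 3 lines into one row" settings from this step.
    """
    # Strip trailing newlines so each cell holds just the line text.
    body = [line.rstrip("\n") for line in islice(lines, skip, None)]
    # Slice the remaining lines into consecutive chunks of `per_row`.
    return [body[i:i + per_row] for i in range(0, len(body), per_row)]
```

Each inner list corresponds to one row with three columns. If the number of remaining lines is not a multiple of 3, the last row is simply shorter.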

Step 3 Clean up the data in OntoRefine

In the Undo / Redo tab, select Apply and paste in the list of commands copied from the file "company_revenue_Ontorefine_Commands.txt".

Step 4 Load the data into your repository

Copy the OntoRefine SPARQL endpoint and put it between the <> after SERVICE in "company_revenue_insert_query.txt".
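
If you would rather script this substitution than edit the file by hand, the following Python sketch shows one way to do it. It assumes the query text contains a SERVICE keyword followed by angle brackets (empty or holding an old endpoint); the function name set_service_endpoint and the endpoint URL in the usage comment are hypothetical placeholders, not values taken from the actual files.

```python
import re


def set_service_endpoint(query: str, endpoint: str) -> str:
    """Put `endpoint` between the angle brackets that follow the first SERVICE keyword."""
    return re.sub(
        r"(SERVICE\s*)<[^>]*>",
        # A function replacement avoids backslash/escape surprises in the URL.
        lambda m: f"{m.group(1)}<{endpoint}>",
        query,
        count=1,
    )


# Usage (hypothetical endpoint; copy the real one from your OntoRefine project):
# with open("company_revenue_insert_query.txt", encoding="utf-8") as f:
#     query = set_service_endpoint(f.read(), "http://localhost:7200/your-ontorefine-endpoint")
```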

create a repository (e.g. companies, base URI

Connect to the repository.

Go to SPARQL, paste your code, and run it (the code is in "company_revenue_insert_query.txt").
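
As an alternative to pasting into the Workbench, the same update can be sent over HTTP. The sketch below only builds the request and does not send it. It assumes GraphDB is reachable at http://localhost:7200 (adjust to your installation), that the repository is named companies as in the example above, and that updates are posted to the repository's statements endpoint as in the RDF4J REST protocol; the function name build_update_request is hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_update_request(base_url: str, repository: str, update_query: str) -> Request:
    """Build (but do not send) a POST request carrying a SPARQL update."""
    # RDF4J-style endpoint for SPARQL updates on a repository.
    url = f"{base_url.rstrip('/')}/repositories/{repository}/statements"
    # The update text is sent as the form parameter "update".
    data = urlencode({"update": update_query}).encode("utf-8")
    return Request(
        url,
        data=data,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Usage (assumed URL and repository name):
# from urllib.request import urlopen
# req = build_update_request("http://localhost:7200", "companies", query)
# urlopen(req)  # sends the update
```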

Step 5 Run & visualize queries against your data, add basic schema

(Optional) Use spif functions to generate resources for the CEOs and country codes from the literals matched by ?company :CEO ?CEO; :CountryCode ?countryCode.
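
The idea behind this optional step, turning a literal such as a CEO name or a country code into a resource IRI, can be sketched outside SPARQL as well. The Python below is not the spif functions themselves, just an illustration of the transformation; the function name literal_to_iri, the base namespaces, and the sample values are all hypothetical.

```python
from urllib.parse import quote


def literal_to_iri(base: str, literal: str) -> str:
    """Build an IRI string from a base namespace and a literal value."""
    # Replace spaces with underscores for readability, then percent-encode
    # anything else that is not safe inside an IRI path segment.
    local_name = quote(literal.strip().replace(" ", "_"), safe="")
    return base + local_name


# Usage (made-up values):
# literal_to_iri("http://example.org/ceo/", "Jane Doe")
# literal_to_iri("http://example.org/country/", "US")
```

Minting the IRI from the literal means the same CEO name or country code always maps to the same resource, so companies sharing a value end up linked to one node.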
