Free knowledge database project
Wikidata
| Attribute | Value |
|---|---|
| Screenshot | Main page of Wikidata in April 2021 |
| Type of site | |
| Available in | Multiple languages |
| Owner | Wikimedia Foundation |
| Editor | Wikimedia community |
| URL | wikidata.org |
| Commercial | No |
| Registration | Optional |
| Launched | 29 October 2012 [1] |
Wikidata is a collaboratively edited multilingual knowledge graph hosted by the Wikimedia Foundation.[2] It is a common source of open data that Wikimedia projects such as Wikipedia,[3][4] and anyone else, can use under the CC0 public domain license. Wikidata is a wiki powered by the MediaWiki software, including Wikibase, its extension for semi-structured data. As of early 2023, Wikidata had 1.54 billion item statements (semantic triples).[5]
Concept
Wikidata is a document-oriented database focused on items, which represent any kind of topic, concept, or object. Each item is allocated a unique, persistent identifier: a positive integer prefixed with the upper-case letter Q, known as a "QID". Q is the first letter of the first name of Qamarniso Vrandečić (née Ismoilova), an Uzbek Wikimedian married to the Wikidata co-developer Denny Vrandečić.[6] Because the identifier is language-neutral, the basic information required to identify the topic that an item covers can be translated without favouring any language.
Examples of items include 1988 Summer Olympics (Q8470), love (Q316), Johnny Cash (Q42775), Elvis Presley (Q303), and Gorilla (Q36611).
Item labels do not need to be unique. For example, there are two items named "Elvis Presley": Elvis Presley (Q303), which represents the American singer and actor, and Elvis Presley (Q610926), which represents his self-titled album. However, the combination of a label and its description must be unique. To avoid ambiguity, an item's unique identifier (QID) is linked to this combination.
Main parts
Fundamentally, an item consists of:
- An identifier (the QID), related to a label and a description.
- Optionally, multiple aliases and some number of statements (and their properties and values).
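The parts listed above can be pictured as a small record type. The following is only a minimal sketch: the `Item` class, its field names, and the helper `find_label_description_clashes` are hypothetical and are not Wikibase's actual data model, and the description strings are illustrative paraphrases of the two Elvis Presley items mentioned earlier.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """Hypothetical, simplified view of a Wikidata item."""
    qid: str                     # identifier, e.g. "Q303"
    label: str                   # need not be unique
    description: str             # label + description must be unique together
    aliases: list[str] = field(default_factory=list)
    # property ID -> list of values (items may carry any number of statements)
    statements: dict[str, list[str]] = field(default_factory=dict)


def find_label_description_clashes(items):
    """Return (first_qid, clashing_qid) pairs sharing label AND description."""
    seen = {}
    clashes = []
    for item in items:
        key = (item.label, item.description)
        if key in seen:
            clashes.append((seen[key], item.qid))
        else:
            seen[key] = item.qid
    return clashes


singer = Item("Q303", "Elvis Presley", "American singer and actor")
album = Item("Q610926", "Elvis Presley", "self-titled album")

# Same label, different descriptions: no clash is reported.
print(find_label_description_clashes([singer, album]))  # prints []
```

The two items share a label but not a description, so the pair (label, description) still identifies each one unambiguously.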
Statements
Statements are how any information known about an item is recorded in Wikidata. Formally, they consist of key–value pairs, which match a property (such as "author", or "publication date") with one or more entity values (such as "Sir Arthur Conan Doyle" or "1902"). For example, the informal English statement "milk is white" would be encoded by a statement pairing the property color (P462) with the value white (Q23444) under the item milk (Q8495).
Statements may map a property to more than one value. For example, the "occupation" property for Marie Curie could be linked with the values "physicist" and "chemist", to reflect the fact that she engaged in both occupations.[7]
Values may take on many types including other Wikidata items, strings, numbers, or media files. Properties prescribe what types of values they may be paired with. For example, the property official website (P856) may only be paired with values of type "URL".[8]
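The idea that a property prescribes the type of its values can be illustrated with a small check. This is a sketch under stated assumptions: the `PROPERTY_DATATYPES` table and the function `value_matches_datatype` are hypothetical, the "item" type for color (P462) is inferred from the milk example above, and accepting only `http`/`https` URLs is a simplification rather than Wikidata's actual rule.

```python
import re
from urllib.parse import urlparse

# Hypothetical lookup table: property ID -> expected value type.
PROPERTY_DATATYPES = {
    "P856": "url",   # official website
    "P462": "item",  # color (its values are other items, e.g. Q23444)
}


def value_matches_datatype(prop, value):
    """Return True if `value` looks like the type prescribed for `prop`."""
    datatype = PROPERTY_DATATYPES[prop]
    if datatype == "url":
        parts = urlparse(value)
        # Simplification: require an http(s) scheme and a host name.
        return parts.scheme in ("http", "https") and bool(parts.netloc)
    if datatype == "item":
        # A QID is the letter Q followed by a positive integer.
        return re.fullmatch(r"Q[1-9][0-9]*", value) is not None
    raise ValueError(f"unknown datatype: {datatype}")


print(value_matches_datatype("P856", "https://www.wikidata.org/"))  # prints True
print(value_matches_datatype("P856", "Q23444"))                     # prints False
```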
Optionally, qualifiers can be used to refine the meaning of a statement by providing additional information. For example, a "population" statement could be modified with a qualifier such as "point in time (P585): 2011" (as its own key–value pair). Values in the statements may also be annotated with references, pointing to a source backing up the statement's content.[9] As with statements, all qualifiers and references are property–value pairs.
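Since the main claim, its qualifiers, and its references are all property–value pairs, one pair type can serve for all three. The sketch below is illustrative only: the class names `PropertyValue` and `Statement` are hypothetical, and in the population example the subject `Q_PLACE`, the figure `12345`, the property labels, and the `example.org` URL are made-up placeholders, not real Wikidata content. Only P462, Q23444, Q8495 and P585 come from the text above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PropertyValue:
    """One property-value pair (hypothetical name)."""
    prop: str
    value: str


@dataclass
class Statement:
    subject: str                       # QID of the item the statement is about
    main: PropertyValue                # the claim itself
    qualifiers: list[PropertyValue] = field(default_factory=list)
    references: list[PropertyValue] = field(default_factory=list)

    def as_triple(self):
        """View the main claim as a (subject, property, value) triple."""
        return (self.subject, self.main.prop, self.main.value)


# "milk is white": item Q8495, property P462 (color), value Q23444 (white).
milk_is_white = Statement("Q8495", PropertyValue("P462", "Q23444"))

# Placeholder population statement, qualified by point in time (P585): 2011.
population_2011 = Statement(
    subject="Q_PLACE",                                  # placeholder subject
    main=PropertyValue("population", "12345"),          # placeholder figure
    qualifiers=[PropertyValue("P585", "2011")],
    references=[PropertyValue("reference URL", "https://example.org/census")],
)

print(milk_is_white.as_triple())  # prints ('Q8495', 'P462', 'Q23444')
```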
Properties
Each property has a numeric identifier prefixed with a capital P and a page on Wikidata with an optional label, description, aliases, and statements. As such, there are properties whose sole purpose is to describe other properties, such as subproperty of (P1647).
Properties may also define more complex rules about their intended usage, termed constraints. For example, the capital (P36) property includes a "single value constraint", reflecting the reality that (typically) territories have only one capital city. Constraints are treated as testing alerts and hints, rather than inviolable rules.[10]
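Because constraints act as hints rather than hard rules, a checker would report violations instead of rejecting the data. The following is a minimal sketch of that idea, assuming a hypothetical function `single_value_warnings` and placeholder values `Q_A` and `Q_B`; it does not reproduce how Wikidata actually evaluates constraints.

```python
def single_value_warnings(statements, single_valued=frozenset({"P36"})):
    """Return warning strings for single-value properties with several values.

    `statements` maps a property ID to the list of values on one item.
    Nothing is rejected or modified: the result is advisory only.
    """
    warnings = []
    for prop, values in statements.items():
        if prop in single_valued and len(values) > 1:
            warnings.append(
                f"{prop}: expected a single value, found {len(values)}"
            )
    return warnings


# Placeholder item with two capitals (P36): flagged, but still stored.
territory = {"P36": ["Q_A", "Q_B"]}
print(single_value_warnings(territory))
# prints ['P36: expected a single value, found 2']
```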
Before a new property is created, it needs to undergo a discussion process.[11][12]
As of November 2023, the most used property is cites work (P2860), which is used on more than 290,000,000 item pages.[13]
Lexemes
In linguistics, a lexeme is a unit of lexical meaning. Similarly, Wikidata's lexemes are items with a structure that makes them more suitable for storing lexicographical data. Besides storing the language to which the lexeme belongs, they have a section for forms and a section for senses.[14]
Entity Schemas
In January 2019, development began on a new MediaWiki extension to enable storing Shape Expressions in a separate namespace.[15][16] This extension has since been installed on Wikidata[17] and enables contributors to use Shape Expressions for validating and describing Resource Description Framework data in items and lexemes. Any item or lexeme on Wikidata can be validated against an Entity Schema,[clarification needed] which makes it an important tool for quality assurance.
Content
Wikidata's content collections include data for biographies,[18] medicine,[19] digital humanities,[20] and scholarly metadata through the WikiCite project.[21] It also includes data collections from other open projects, including Freebase.[22]
Development
The creation of the project was funded by donations from the Allen Institute for Artificial Intelligence, the Gordon and Betty Moore Foundation, and Google, Inc., totaling €1.3 million.[23][24] The development of the project is mainly driven by Wikimedia Deutschland under the management of Lydia Pintscher, and was originally split into three phases:[25]
- Centralising interlanguage links: links between Wikipedia articles about the same topic in different languages.
- Providing a central place for infobox data for all Wikipedias.
- Creating and updating list articles based on data in Wikidata and linking to other Wikimedia sister projects, including Meta-Wiki and Wikidata itself (interwiki links).
Initial rollout
Wikidata was launched on 29 October 2012 and was the first new project of the Wikimedia Foundation since 2006.[3][26][27] At this time, only the centralization of language links was available. This enabled items to be created and filled with basic information: a label (a name or title), aliases (alternative terms for the label), a description, and links to articles about the topic in all the various language editions of Wikipedia (interwikipedia links).
Historically, a Wikipedia article would include a list of interlanguage links (links to articles on the same topic in other editions of Wikipedia, if they existed). Wikidata was originally a self-contained repository of interlanguage links.[28] Wikipedia language editions were still not able to access Wikidata, so they needed to continue to maintain their own lists of interlanguage links.[citation needed]
On 14 January 2013, the Hungarian Wikipedia became the first to enable the provision of interlanguage links via Wikidata.[29] This functionality was extended to the Hebrew and Italian Wikipedias on 30 January, to the English Wikipedia on 13 February and to all other Wikipedias on 6 March.[30][31][32][33] After no consensus was reached over a proposal to restrict the removal of language links from the English Wikipedia,[34] they were automatically removed by bots. On 23 September 2013, interlanguage links went live on Wikimedia Commons.[35]
Statements and data access
On 4 February 2013, statements were introduced to Wikidata entries. The possible values for properties were initially limited to two data types (items and images on Wikimedia Commons), with more data types (such as coordinates and dates) to follow later. The first new type, string, was deployed on 6 March.[36]
The ability for the various language editions of Wikipedia to access data from Wikidata was rolled out progressively between 27 March and 25 April 2013.[37][38]
On 16 September 2015, Wikidata began allowing so-called arbitrary access, or access from a given article of a Wikipedia to the statements on Wikidata items not directly connected to it. For example, it became possible to read data about Germany from the Berlin article, which was not feasible before.[39] On 27 April 2016, arbitrary access was activated on Wikimedia Commons.[40]
According to a 2020 study, a large proportion of the data on Wikidata consists of entries imported en masse from other databases by Internet bots, which helps to "break down the walls" of data silos.[41]
Query service and other improvements
On 7 September 2015, the Wikimedia Foundation announced the release of the Wikidata Query Service,[42] which lets users run queries on the data contained in Wikidata.[43] The service uses SPARQL as the query language and Blazegraph as its triplestore and graph database.[45][46] As of November 2018, there were at least 26 different tools that allow querying the data in different ways.[44]
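To give a feel for what a SPARQL request to the query service looks like, the sketch below builds (but does not send) a request URL asking for the colour values recorded on an item, reusing the milk (Q8495) and color (P462) identifiers from the Statements section. It assumes the public endpoint at `https://query.wikidata.org/sparql`, the `wd:` and `wdt:` prefixes that the service predefines, and the `format=json` parameter; the function names are hypothetical.

```python
import re
from urllib.parse import urlencode

# Assumed public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_color_query(item_qid):
    """Build a SPARQL query for the color (P462) values of one item."""
    # Validate the QID so arbitrary text cannot be spliced into the query.
    if re.fullmatch(r"Q[1-9][0-9]*", item_qid) is None:
        raise ValueError(f"not a QID: {item_qid!r}")
    return f"SELECT ?color WHERE {{ wd:{item_qid} wdt:P462 ?color . }}"


def build_request_url(query):
    """Encode the query as a GET URL requesting JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


url = build_request_url(build_color_query("Q8495"))  # Q8495 = milk
print(url)
```

Actually sending the request is left out on purpose; a client that does so would typically identify itself with a descriptive `User-Agent` header and handle rate limiting.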
In 2021, Wikimedia Deutschland released the Query Builder,[47] "a form-based query builder to allow people who don't know how to use SPARQL to" write a query.
Logo
The bars on the logo contain the word "WIKI" encoded in Morse code.[48] It was created by Arun Ganesh and selected through community decision.[49]
Reception
In November 2014, Wikidata received the Open Data Publisher Award from the Open Data Institute "for sheer scale, and built-in openness".[50]
In December 2014, Google announced that it would shut down Freebase in favor of Wikidata.[51]
As of November 2018, Wikidata information was used in 58.4% of all English Wikipedia articles, mostly for external identifiers or coordinate locations. In aggregate, data from Wikidata was shown in 64% of all Wikipedias' pages, 93% of all Wikivoyage articles, 34% of all Wikiquotes' pages, 32% of all Wikisources' pages, and 27% of Wikimedia Commons.[52]
As of December 2020, Wikidata's data was visualized by at least 20 external tools[53] and over 300 papers had been published about Wikidata.[54]
Applications
A systematic literature review of the uses of Wikidata in research was carried out in 2019.[60]
References
- [1] "The Wikidata revolution is here: enabling structured data on Wikipedia". 25 April 2013. Retrieved 12 June 2022. "Since Wikidata.org went live on 30 October 2012,"
- [2] Chalabi, Mona (26 April 2013). "Welcome to Wikidata! Now what?". Archived from the original on 2 October 2021. Retrieved 2 October 2021.
- [3] Wikidata (Archived 29 October 2012 at the Wayback Machine).
- [4] "Data Revolution for Wikipedia". Wikimedia Deutschland. 30 March 2012. Archived from the original on 23 October 2012. Retrieved 11 September 2012.
- [5] "Grafana". grafana.wikimedia.org. Retrieved 21 March 2024.
- [6] Vrandečić, Denny; Pintscher, Lydia; Krötzsch, Markus (30 April 2023). "Wikidata: The Making of". Companion Proceedings of the ACM Web Conference 2023. pp. 615–624. doi:10.1145/3543873.3585579. ISBN 9781450394192. S2CID 258377705.
- [7] "Help:Statements – Wikidata". www.wikidata.org. Archived from the original on 25 March 2019. Retrieved 20 February 2019.
- [8] "Help:Data type – Wikidata". www.wikidata.org. Archived from the original on 23 March 2019. Retrieved 20 February 2019.
- [9] "Help:Sources – Wikidata". www.wikidata.org. Archived from the original on 17 April 2019. Retrieved 20 February 2019.
- [10] "Help:Property constraints portal". Wikidata. Archived from the original on 1 June 2019. Retrieved 20 February 2019.
- [11] Cochrane, Euan (30 September 2016). "Wikidata as a digital preservation knowledgebase". openpreservation.org. Archived from the original on 5 January 2022. Retrieved 5 January 2022.
- [12] Samuel, John (15 August 2018). Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2018. Lecture Notes in Computer Science. Vol. 11018. p. 129. doi:10.1007/978-3-319-98932-7_12. ISBN 978-3-319-98931-0.
- [13] "Wikidata:Database reports/List of properties/Top100". Archived from the original on 24 February 2023. Retrieved 18 November 2023.
- [14] "Wikidata:Lexicographical data/Documentation – Wikidata". www.wikidata.org. Archived from the original on 13 November 2018. Retrieved 13 November 2018.
- [15] "Extension:EntitySchema – MediaWiki". mediawiki.org. Archived from the original on 25 June 2021. Retrieved 10 September 2021.
- [16] "Initial empty repository". Gerrit. 15 January 2019. Archived from the original on 19 March 2022. Retrieved 12 June 2022.
- [17] "Version – Wikidata". Wikidata.org. Archived from the original on 19 October 2021. Retrieved 10 September 2021.
- [18] Chisholm, Andrew; Radford, Will; Hachey, Ben (2017). "Learning to generate one-sentence biographies from Wikidata". arXiv:1702.06235.
- [19] Turki, Houcemeddine; Shafee, Thomas; Hadj Taieb, Mohamed Ali; Ben Aouicha, Mohamed; Vrandečić, Denny; Das, Diptanshu; Hamdi, Helmi (November 2019). "Wikidata: A large-scale collaborative ontological medical database". Journal of Biomedical Informatics. 99: 103292. doi:10.1016/j.jbi.2019.103292.
- [20] Zhao, Fudie (31 May 2023). "A systematic review of Wikidata in Digital Humanities projects". Digital Scholarship in the Humanities. 38 (2): 852–874. doi:10.1093/llc/fqac083.
- [21] Nielsen, Finn Årup; Mietchen, Daniel; Willighagen, Egon (2017). Scholia, Scientometrics and Wikidata (PDF). Lecture Notes in Computer Science. Vol. 10577. pp. 237–259. doi:10.1007/978-3-319-70407-4_36. ISBN 978-3-319-70406-7.
- [22] Pellissier Tanon, Thomas; Vrandečić, Denny; Schaffert, Sebastian; Steiner, Thomas; Pintscher, Lydia (11 April 2016). "From Freebase to Wikidata: The Great Migration": 1419–1428. doi:10.1145/2872427.2874809.
- [23] Dickinson, Boonsri (30 March 2012). "Paul Allen Invests In A Massive Project To Make Wikipedia Better". Business Insider. Archived from the original on 23 December 2017. Retrieved 11 September 2012.
- [24] Perez, Sarah (30 March 2012). "Wikipedia's Next Big Thing: Wikidata, A Machine-Readable, User-Editable Database Funded By Google, Paul Allen And Others". TechCrunch. Archived from the original on 5 October 2012. Retrieved 11 September 2012.
- [25] "Wikidata – Meta". meta.wikimedia.org. Archived from the original on 7 April 2012. Retrieved 8 November 2015.
- [26] Pintscher, Lydia (30 October 2012). "wikidata.org is live (with some caveats)". wikidata-l (Mailing list). Retrieved 3 November 2012.
- [27] Roth, Matthew (30 March 2012). "The Wikipedia data revolution". Wikimedia Foundation. Archived from the original on 31 July 2020. Retrieved 11 September 2012.
- [28] Leitch, Thomas (1 November 2014). Wikipedia U: Knowledge, Authority, and Liberal Education in the Digital Age. Johns Hopkins University Press. p. 120. ISBN 978-1-4214-1550-5.
- [29] Pintscher, Lydia (14 January 2013). "First steps of Wikidata in the Hungarian Wikipedia". Wikimedia Deutschland. Archived from the original on 14 December 2015. Retrieved 17 December 2015.
- [30] Pintscher, Lydia (30 January 2013). "Wikidata coming to the next two Wikipedias". Wikimedia Deutschland. Archived from the original on 4 October 2018. Retrieved 31 January 2013.
- [31] Pintscher, Lydia (13 February 2013). "Wikidata live on the English Wikipedia". Wikimedia Deutschland. Archived from the original on 19 February 2013. Retrieved 15 February 2013.
- [32] Pintscher, Lydia (6 March 2013). "Wikidata now live on all Wikipedias". Wikimedia Deutschland. Archived from the original on 14 April 2013. Retrieved 8 March 2013.
- [33] "Wikidata ist für alle Wikipedien da" (in German). Golem.de. Archived from the original on 6 November 2018. Retrieved 29 January 2014.
- [34] "Wikipedia talk:Wikidata interwiki RFC". 29 March 2013. Archived from the original on 18 October 2021. Retrieved 30 March 2013.
- [35] Pintscher, Lydia (23 September 2013). "Wikidata is Here!". Commons:Village pump. Archived from the original on 6 December 2021. Retrieved 30 August 2016.
- [36] Pintscher, Lydia. "Wikidata/Status updates/2013 03 01". Wikimedia Meta-Wiki. Wikimedia Foundation. Archived from the original on 12 April 2013. Retrieved 3 March 2013.
- [37] Pintscher, Lydia (27 March 2013). "You can have all the data!". Wikimedia Deutschland. Archived from the original on 29 March 2013. Retrieved 28 March 2013.
- [38] "Wikidata goes live worldwide". The H. 25 April 2013. Archived from the original on 1 January 2014.
- [39] Pintscher, Lydia (16 September 2015). "Wikidata: Access to data from arbitrary items is here". Wikipedia:Village pump (technical). Archived from the original on 27 September 2016. Retrieved 30 August 2016.
- [40] Pintscher, Lydia (27 April 2016). "Wikidata support: arbitrary access is here". Commons:Village pump. Archived from the original on 5 February 2017. Retrieved 30 August 2016.
- [41] Waagmeester, Andra; Stupp, Gregory; Burgstaller-Muehlbacher, Sebastian; et al. (17 March 2020). "Wikidata as a knowledge graph for the life sciences". eLife. 9. doi:10.7554/ELIFE.52614. ISSN 2050-084X. PMC 7077981. PMID 32180547. Wikidata Q87830400.
- [42] "Home". query.wikidata.org. Archived from the original on 7 November 2016. Retrieved 30 January 2019.
- [43] "[Wikidata] Announcing the release of the Wikidata Query Service - Wikidata - lists.wikimedia.org". Archived from the original on 10 November 2015. Retrieved 13 November 2018.
- [44] "Wikidata:Tools/Query data – Wikidata". www.wikidata.org. Archived from the original on 31 May 2020. Retrieved 13 November 2018.
- [45] "[Wikidata-tech] Wikidata Query Backend Update (take two!)". lists.wikimedia.org. Archived from the original on 6 January 2021. Retrieved 29 August 2018. (The message also contains a link to the graph databases comparison performed by Wikimedia.)
- [46] 86 on GitHub.
- [47] "Wikidata Query Builder". query.wikidata.org.
- [48] commons:File talk:Wikidata-logo-en.svg#Hybrid. Retrieved 2016-10-06.
- [49] "Und der Gewinner ist..." 13 July 2012. Archived from the original on 21 January 2021. Retrieved 16 June 2020.
- [50] "First ODI Open Data Awards presented by Sirs Tim Berners-Lee and Nigel Shadbolt". Archived from the original on 24 March 2016.
- [51] "Freebase". Google Plus. 16 December 2014. Archived from the original on 20 March 2019.
- [52] "Percentage of articles making use of data from Wikidata". Archived from the original on 15 November 2018. Retrieved 15 November 2018.
- [53] "Wikidata:Tools/Visualize data – Wikidata". www.wikidata.org. Archived from the original on 15 November 2018. Retrieved 15 November 2018.
- [54] "Scholia". Scholia. Archived from the original on 30 September 2021. Retrieved 2 August 2021.
- [55] Simonite, Tom (18 February 2019). "Inside the Alexa-Friendly World of Wikidata". Wired. ISSN 1059-1028. Retrieved 25 December 2020.
- [56] "Rob Barry / Mwnci – Deep Spreadsheets". GitLab. Archived from the original on 21 September 2019. Retrieved 21 September 2019.
- [57] Krause, Volker (12 January 2020), KDE Itinerary – A privacy by design travel assistant, archived from the original on 26 June 2020, retrieved 10 November 2020.
- [58] sling on GitHub.
- [59] Scharpf, P.; Schubotz, M.; Gipp, B. "Mining Mathematical Documents for Question Answering via Unsupervised Formula Labeling". Archived 10 February 2023 at the Wayback Machine. ACM/IEEE Joint Conference on Digital Libraries, 2022.
- [60] Mora-Cantallops, Marçal; Sánchez-Alonso, Salvador; García-Barriocanal, Elena (2 September 2019). "A systematic literature review on Wikidata". Data Technologies and Applications. 53 (3): 250–268. doi:10.1108/DTA-12-2018-0110. S2CID 202036639.
Further reading
- Mark Graham (6 April 2012), "The Problem With Wikidata", The Atlantic, US.
- Claudia Müller-Birn, Benjamin Karran, Janette Lehmann, Markus Luczak-Rösch: "Peer-production system or collaborative ontology development effort: What is Wikidata?" In: OpenSym 2015 – Conference on Open Collaboration, San Francisco, US, 19–21 Aug 2015 (preprint).