PLOS BLOGS The Official PLOS Blog

Integrating OpenAlex metadata to improve Open Science Indicators

Author: Lauren Cadwallader, Open Research Manager, PLOS

The latest version of the PLOS Open Science Indicators (OSI) dataset, produced in partnership with DataSeer, extends its coverage to articles from January 1st 2018 to September 30th 2024. These data on 147,619 articles offer insights into open science practices over even more time and, with improved metadata integrated from OpenAlex, this and future versions of the data can be explored in new ways.

Download the dataset

This release includes the latest results for data sharing, code sharing and preprint posting. The previously released preliminary rates of study registration and protocol sharing are also included.

Figure 1: Open Science Indicators for PLOS and comparator content by publication quarter Q1 2018-Q3 2024. Rates are given as a proportion of all published articles.

Previous versions of the dataset have included information on the country associated with the first corresponding author and disciplinary information taken from the MeSH terms associated with the article on PubMed. In this version, the country and discipline metadata has been replaced with openly available data from OpenAlex to provide better coverage and standardised names across the articles (OpenAlex is an open and free bibliographic catalogue of scientific papers, authors and institutions). Users of OSI told us they needed different – including less granular – ways of exploring trends by discipline, and better coverage of geographic information across the comparator (non-PLOS) dataset.

This update allows a more thorough examination of trends by country as well as by discipline. The new fields included in OSI version 9 are: 

  • “Primary Topic Field” and “Primary Topic Subfield”  – these give disciplinary information about each article from two levels of the OpenAlex Topic hierarchy. The “Primary Topic Field” has 26 distinct categories, whereas the Subfield category gives more granular information.
  • “Corresponding Author Country” and “First Author Country” give the geographic location of each author's affiliation. Both are provided because “First Author Country” has better coverage across the whole dataset than “Corresponding Author Country”. Where data for both are available, the country matches in >95% of cases.
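Users who want to check coverage and agreement between the two country fields in their own copy of the data could start from something like the sketch below. It assumes the dataset has been loaded as a list of dictionaries keyed by column name and that missing values appear as empty strings; the rows shown are toy values for illustration, not records from OSI, and the country codes are an assumption about how values are encoded.

```python
# Toy rows for illustration only; they are NOT taken from the OSI dataset.
# Column names follow the new fields described above; the empty-string
# convention for missing values and the two-letter codes are assumptions.
rows = [
    {"First Author Country": "US", "Corresponding Author Country": "US"},
    {"First Author Country": "CN", "Corresponding Author Country": ""},
    {"First Author Country": "IN", "Corresponding Author Country": "GB"},
    {"First Author Country": "", "Corresponding Author Country": ""},
]


def coverage(rows, column):
    """Share of rows that have a non-empty value in `column`."""
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if r.get(column, "").strip())
    return filled / len(rows)


def agreement(rows, col_a, col_b):
    """Share of rows with equal values, among rows where both columns are filled."""
    both = [r for r in rows if r.get(col_a, "").strip() and r.get(col_b, "").strip()]
    if not both:
        return None  # nothing to compare
    same = sum(1 for r in both if r[col_a].strip() == r[col_b].strip())
    return same / len(both)


first_cov = coverage(rows, "First Author Country")
corr_cov = coverage(rows, "Corresponding Author Country")
match_rate = agreement(rows, "First Author Country", "Corresponding Author Country")
```

Comparing `first_cov` with `corr_cov` shows which field is the safer default for a given subset, and `match_rate` indicates how much the choice is likely to matter.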

Examples of using the new metadata

Example 1

Using First Author Country across the whole dataset, we can examine preprint posting trends in specific countries (Figure 2), in this case the USA, China and India. We see not only how rates change over time, with differences emerging after the COVID-19 pandemic, but also how they differ between countries.

Figure 2: Preprint posting rate for three countries based on publication date of the article. 2024 is for Q1-3 only.
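A comparison like the one in Figure 2 can be sketched in a few lines of Python. This is a minimal illustration and not the method used to produce the figure: “First Author Country” is a field named in this post, while “Publication Year” and “Preprint Posted” are hypothetical column names, and the rows are toy values rather than OSI data.

```python
from collections import defaultdict

# Toy rows for illustration only; they are NOT taken from the OSI dataset.
# "Publication Year" and "Preprint Posted" are hypothetical column names.
rows = [
    {"First Author Country": "US", "Publication Year": "2019", "Preprint Posted": "TRUE"},
    {"First Author Country": "US", "Publication Year": "2019", "Preprint Posted": "FALSE"},
    {"First Author Country": "CN", "Publication Year": "2019", "Preprint Posted": "FALSE"},
    {"First Author Country": "IN", "Publication Year": "2023", "Preprint Posted": "TRUE"},
    {"First Author Country": "BR", "Publication Year": "2023", "Preprint Posted": "TRUE"},
]


def preprint_rate_by_country_year(rows, countries):
    """Return {(country, year): share of articles with a preprint}."""
    counts = defaultdict(lambda: [0, 0])  # key -> [articles with preprint, all articles]
    for r in rows:
        country = r["First Author Country"]
        if country not in countries:
            continue  # keep only the countries being compared
        key = (country, int(r["Publication Year"]))
        counts[key][1] += 1
        if r["Preprint Posted"].strip().upper() == "TRUE":
            counts[key][0] += 1
    return {key: hits / total for key, (hits, total) in sorted(counts.items())}


rates = preprint_rate_by_country_year(rows, {"US", "CN", "IN"})
```

The denominator is every published article in the group, which mirrors how the rates in this post are described: as a proportion of all published articles.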

Example 2

Integrating disciplinary data from OpenAlex with the OSIs makes it easier to explore the data with specific subject-based communities in mind. For example, the graph below (Figure 3) shows the rate of data sharing in a repository for articles whose Primary Topic Field is “Environmental Science”, comparing articles published in PLOS with articles published elsewhere. This gives us a better sense of the open science practices within different communities, so that we can tailor solutions that work with existing community practices and norms.

Figure 3: Rates of data shared in a repository for PLOS articles versus those in the Comparator dataset where Primary Topic Field is “Environmental Science”, based on publication year. 2024 is for Q1-3 only.
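For readers working from a CSV export, the discipline filter behind a view like Figure 3 might look like the following sketch. Only “Primary Topic Field” is a column named in this post; “Publisher Group”, “Publication Year” and “Data in Repository” are hypothetical names chosen for the example, and the inline CSV holds toy rows rather than OSI data.

```python
import csv
import io
from collections import Counter

# Toy CSV for illustration only; it is NOT taken from the OSI dataset.
# All column names except "Primary Topic Field" are hypothetical.
SAMPLE_CSV = """Primary Topic Field,Publisher Group,Publication Year,Data in Repository
Environmental Science,PLOS,2022,TRUE
Environmental Science,PLOS,2022,FALSE
Environmental Science,Comparator,2022,FALSE
Medicine,PLOS,2022,TRUE
Environmental Science,Comparator,2022,TRUE
Environmental Science,Comparator,2022,FALSE
"""


def repository_rates(csv_text, field):
    """Return {(publisher group, year): share of articles sharing data in a repository}."""
    shared = Counter()
    total = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Primary Topic Field"] != field:
            continue  # restrict to one discipline
        key = (row["Publisher Group"], int(row["Publication Year"]))
        total[key] += 1
        if row["Data in Repository"].strip().upper() == "TRUE":
            shared[key] += 1
    return {key: shared[key] / total[key] for key in sorted(total)}


env_rates = repository_rates(SAMPLE_CSV, "Environmental Science")
```

Passing a different value for `field`, or swapping the column for “Primary Topic Subfield”, gives the same comparison at another level of the topic hierarchy.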

We hope that other users of the OSI data find these improvements useful and we welcome any feedback from and collaboration with users of the data.

Future OSI releases

We’ll be making some changes to how, and how often, we release the OSI dataset in 2025 so that we best support the needs of OSI’s users, and PLOS’ mission. As such there won’t be an OSI public release in March/April 2025. Our commitment to the OSI principles including open data and methods for OSI results will not change. We’ll share more information on our OSI plans later in 2025, and meanwhile feel free to contact us with questions about using the data.
