Can Microsoft Academic help to assess the citation impact of academic books?

Authors: Kousha K & Thelwall M

Comment: This article compares the book coverage and citation counts of Microsoft Academic (MA) with those of the Book Citation Index (BKCI) and Google Books (GB). It shows that, while MA’s coverage of books is still not comprehensive, it finds more citations in some fields than the other two sources. In particular, it has greater coverage than BKCI for some Arts & Humanities fields (though in general it is still biased towards technical fields). MA also seems less sensitive to differences between book editions. The comparison of MA with GB gave mixed results, with each finding more citations in different fields, suggesting that the two have partly complementary coverage.

Abstract: Despite recent evidence that Microsoft Academic is an extensive source of citation counts for journal articles, it is not known if the same is true for academic books. This paper fills this gap by comparing citations to 16,463 books from 2013-2016 in the Book Citation Index (BKCI) against automatically extracted citations from Microsoft Academic and Google Books in 17 fields. About 60% of the BKCI books had records in Microsoft Academic, varying by year and field. Citation counts from Microsoft Academic were 1.5 to 3.6 times higher than from BKCI in nine subject areas across all years for books indexed by both. Microsoft Academic found more citations than BKCI because it indexes more scholarly publications and combines citations to different editions and chapters. In contrast, BKCI only found more citations than Microsoft Academic for books in three fields from 2013-2014. Microsoft Academic also found more citations than Google Books in six fields for all years. Thus, Microsoft Academic may be a useful source for the impact assessment of books when comprehensive coverage is not essential.

Kousha K, Thelwall M (2018) Can Microsoft Academic help to assess the citation impact of academic books? arXiv:1808.01474v1.

Source: Can Microsoft Academic help to assess the citation impact of academic books?

Citation analysis with Microsoft Academic

Authors: Hug SE, Ochsner M & Brändle MP

Comment: This article compares citation analyses in Microsoft Academic (MA) and Scopus, using the publication outputs of three selected researchers. The results were uniform across MA and Scopus. Some limitations of MA are also pointed out.

Abstract: We explore if and how Microsoft Academic (MA) could be used for bibliometric analyses. First, we examine the Academic Knowledge API (AK API), an interface to access MA data, and compare it to Google Scholar (GS). Second, we perform a comparative citation analysis of researchers by normalizing data from MA and Scopus. We find that MA offers structured and rich metadata, which facilitates data retrieval, handling and processing. In addition, the AK API allows retrieving frequency distributions of citations. We consider these features to be a major advantage of MA over GS. However, we identify four main limitations regarding the available metadata. First, MA does not provide the document type of a publication. Second, the “fields of study” are dynamic, too specific and field hierarchies are incoherent. Third, some publications are assigned to incorrect years. Fourth, the metadata of some publications did not include all authors. Nevertheless, we show that an average-based indicator (i.e. the journal normalized citation score; JNCS) as well as a distribution-based indicator (i.e. percentile rank classes; PR classes) can be calculated with relative ease using MA. Hence, normalization of citation counts is feasible with MA. The citation analyses in MA and Scopus yield uniform results. The JNCS and the PR classes are similar in both databases, and, as a consequence, the evaluation of the researchers’ publication impact is congruent in MA and Scopus. Given the fast development in the last year, we postulate that MA has the potential to be used for full-fledged bibliometric analyses.
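The two indicators the abstract names can be sketched concretely. Below is a minimal illustration of the journal normalized citation score (JNCS): a paper’s citation count divided by the mean citations of papers in the same journal and year, then averaged over a researcher’s papers. All numbers and the helper function are invented for illustration; they are not data or code from the study.

```python
# Sketch of the journal normalized citation score (JNCS) described above.
# The citation counts and journal-year means below are illustrative only.

def jncs(citations, journal_year_mean):
    """A paper's citations divided by the mean citations of papers
    published in the same journal and year."""
    return citations / journal_year_mean

# A researcher's papers as (citation count, journal-year mean) pairs.
papers = [(12, 4.0), (3, 6.0), (8, 8.0)]

# Average-based indicator: the mean of the per-paper normalized scores.
scores = [jncs(c, m) for c, m in papers]
mean_jncs = sum(scores) / len(scores)
print(round(mean_jncs, 2))  # 1.5 -> cited 50% above the journal average
```

The distribution-based alternative (percentile rank classes) would instead place each paper in a citation percentile band within its journal-year set, which is why the authors highlight that the AK API can return citation frequency distributions.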

Hug, S.E., Ochsner, M. & Brändle, M.P. (2017) Citation analysis with Microsoft Academic. Scientometrics 111: 371. https://doi.org/10.1007/s11192-017-2247-8

Source: Citation analysis with Microsoft Academic

Completeness and overlap in open access systems: Search engines, aggregate institutional repositories and physics-related open sources

Authors: Tsay M-y, Wu T-l & Tseng L-l

Comment: This article compares several open access search systems (Google Scholar (GS), Microsoft Academic (MSA), OAIster, OpenDOAR, arXiv.org and the Astrophysics Data System (ADS)) using the publications of Nobel Laureates in Physics from 2001 to 2013. A short literature review on comparing search engines is given. Both internal and external overlaps are studied. At the time of this work, GS had the highest coverage of this sample, but also a very high percentage of internal overlap (>92%); it covered all items found in the other sources except MSA. ADS and MSA both had coverage just below GS, with ADS having the lowest internal overlap of the three (only slightly higher than arXiv.org, which had no internal overlap).

Abstract: This study examines the completeness and overlap of coverage in physics of six open access scholarly communication systems, including two search engines (Google Scholar and Microsoft Academic), two aggregate institutional repositories (OAIster and OpenDOAR), and two physics-related open sources (arXiv.org and Astrophysics Data System). The 2001–2013 Nobel Laureates in Physics served as the sample. Bibliographic records of their publications were retrieved and downloaded from each system, and a computer program was developed to perform the analytical tasks of sorting, comparison, elimination, aggregation and statistical calculations. Quantitative analyses and cross-referencing were performed to determine the completeness and overlap of the system coverage of the six open access systems. The results may enable scholars to select an appropriate open access system as an efficient scholarly communication channel, and academic institutions may build institutional repositories or independently create citation index systems in the future. Suggestions on indicators and tools for academic assessment are presented based on the comprehensiveness assessment of each system.
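The sorting, comparison and elimination the abstract mentions amount to reducing each system’s records to a normalized key and comparing the resulting sets. A minimal sketch of that idea follows; the sample records and the crude title-based key are invented assumptions, not the study’s actual program or data.

```python
# Sketch of an overlap analysis: records from each system are reduced to a
# normalized key, then compared as sets. Sample data and the normalization
# rule are invented for illustration.

def key(title):
    """Crude normalization: lowercase, keep alphanumerics only."""
    return "".join(ch for ch in title.lower() if ch.isalnum())

systems = {
    "GS":    ["Observation of Gravitational Waves", "A Second Paper"],
    "MSA":   ["Observation of gravitational waves", "A Third Paper"],
    "arXiv": ["A Second Paper"],
}

sets_ = {name: {key(t) for t in titles} for name, titles in systems.items()}

# External overlap between a pair of systems, and records unique to one.
both = sets_["GS"] & sets_["MSA"]
only_gs = sets_["GS"] - sets_["MSA"]
print(len(both), len(only_gs))  # 1 1
```

Internal overlap (duplicate records within one system) would be measured the same way, by counting how many raw records collapse onto the same key inside a single system.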

Tsay M-y, Wu T-l, Tseng L-l (2017) Completeness and overlap in open access systems: Search engines, aggregate institutional repositories and physics-related open sources. PLoS ONE 12(12): e0189751. https://doi.org/10.1371/journal.pone.0189751

Source: Completeness and overlap in open access systems: Search engines, aggregate institutional repositories and physics-related open sources

Microsoft Academic is one year old: the Phoenix is ready to leave the nest

Authors: Harzing, AW. & Alakangas, S.

Comment: This is the third in a series of articles by the first author investigating the relative citation and publication coverage of Microsoft Academic (MA) within its first year after re-launch. Although the studies were relatively small in scale (the citation records of 1 and 145 academics), they provide strong evidence for the advantages of MA over other databases. In particular, it combines high coverage, as in Google Scholar, with structured metadata, as in Scopus and Web of Science. These features, together with its fast growth, make MA an excellent alternative for bibliometric and scientometric studies.

Abstract: We investigate the coverage of Microsoft Academic (MA) just over a year after its re-launch. First, we provide a detailed comparison for the first author’s record across the four major data sources: Google Scholar (GS), MA, Scopus and Web of Science (WoS) and show that for the most important academic publications, journal articles and books, GS and MA display very similar publication and citation coverage, leaving both Scopus and WoS far behind, especially in terms of citation counts. A second, large scale, comparison for 145 academics across the five main disciplinary areas confirms that citation coverage for GS and MA is quite similar for four of the five disciplines. MA citation coverage in the Humanities is still substantially lower than GS coverage, reflecting MA’s lower coverage of non-journal publications. However, we shouldn’t forget that MA coverage for the Humanities still dwarfs coverage for this discipline in Scopus and WoS. It would be desirable for other researchers to verify our findings with different samples before drawing a definitive conclusion about MA coverage. However, based on our current findings we suggest that, only one year after its re-launch, MA is rapidly becoming the data source of choice; it appears to be combining the comprehensive coverage across disciplines, displayed by GS, with the more structured approach to data presentation, typical of Scopus and WoS. The Phoenix seems to be ready to leave the nest, all set to start its life into an adulthood of research evaluation.

Harzing, AW. & Alakangas, S. (2017) Microsoft Academic is one year old: the Phoenix is ready to leave the nest. Scientometrics 112: 1887. https://doi.org/10.1007/s11192-017-2454-3

Source: Microsoft Academic is one year old: the Phoenix is ready to leave the nest

Growth of hybrid open access, 2009–2016

Author: Bo-Christer Björk

Notes: This 2017 article estimates the growth from 2009 to 2016 in hybrid OA journals, and in the articles published within them, across 20 publishers. Most interesting is the difficulty experienced in obtaining data, because the hybrid status of a journal is not always indicated. The author used previous studies and more recent data from 15 publishers who agreed to share, plus 5 large publishers; however, the data are not itemised for each publisher.

Abstract: Hybrid Open Access is an intermediate form of OA, where authors pay scholarly publishers to make articles freely accessible within journals, in which reading the content otherwise requires a subscription or pay-per-view. Major scholarly publishers have in recent years started providing the hybrid option for the vast majority of their journals. Since the uptake usually has been low per journal and scattered over thousands of journals, it has been very difficult to obtain an overview of how common hybrid articles are. This study, using the results of earlier studies as well as a variety of methods, measures the evolution of hybrid OA over time. The number of journals offering the hybrid option has increased from around 2,000 in 2009 to almost 10,000 in 2016. The number of individual articles has in the same period grown from an estimated 8,000 in 2009 to 45,000 in 2016. The growth in article numbers has clearly increased since 2014, after some major research funders in Europe started to introduce new centralized payment schemes for the article processing charges (APCs).

Björk B-C (2017) Growth of hybrid open access, 2009–2016. PeerJ 5: e3878. https://doi.org/10.7717/peerj.3878

Source: https://peerj.com/articles/3878/

Enhancing Institutional Publication Data Using Emergent Open Science Services

Authors: David Walters & Christopher Daley (Brunel University London)

Notes: An interesting article on integrating data sources to assess OA status and the location of OA copies for a single UK university. It focusses on data derived from CORE and Unpaywall, combined with other information from university systems.

Abstract: The UK open access (OA) policy landscape simultaneously preferences Gold publishing models (Finch Report, RCUK, COAF) and Green OA through repository usage (HEFCE), creating the possibility of confusion and duplication of effort for academics and support staff. Alongside these policy developments, there has been an increase in open science services that aim to provide global data on OA. These services often exist separately to locally managed institutional systems for recording OA engagement and policy compliance. The aim of this study is to enhance Brunel University London’s local publication data using software which retrieves and processes information from the global open science services of Sherpa REF, CORE, and Unpaywall. The study draws on two classification schemes; a ‘best location’ hierarchy, which enables us to measure publishing trends and whether open access dissemination has taken place, and a relational ‘all locations’ dataset to examine whether individual publications appear across multiple OA dissemination models. Sherpa REF data is also used to indicate possible OA locations from serial policies. Our results find that there is an average of 4.767 permissible open access options available to the authors in our sample each time they publish and that Gold OA publications are replicated, on average, in 3 separate locations. A total of 40% of OA works in the sample are available in both Gold and Green locations. The study considers whether this tendency for duplication is a result of localised manual workflows which are necessarily focused on institutional compliance to meet the Research Excellence Framework 2021 requirements, and suggests that greater interoperability between OA systems and services would facilitate a more efficient transformation to open scholarship.
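The ‘best location’ hierarchy described in the abstract can be sketched as a fixed ranking over the set of locations where a given output is available. The ranking labels and helper below are illustrative assumptions, not the paper’s exact classification scheme.

```python
# Sketch of a 'best location' hierarchy: an output may be OA in several
# locations at once, and its best location is chosen by a fixed ranking.
# The labels in RANK are an illustrative assumption, not the paper's scheme.

RANK = ["gold_journal", "green_repository", "other", "closed"]

def best_location(locations):
    """Return the highest-ranked OA location among those found
    for one publication; 'closed' if none were found."""
    for label in RANK:
        if label in locations:
            return label
    return "closed"

print(best_location({"green_repository", "gold_journal"}))  # gold_journal
print(best_location(set()))                                 # closed
```

The paper’s complementary ‘all locations’ dataset would keep the full set per publication rather than collapsing it, which is what lets the authors measure duplication (e.g. the 40% of OA works available in both Gold and Green locations).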

Source: Enhancing Institutional Publication Data Using Emergent Open Science Services, Publications (MDPI)

Over 80% of research outputs meet requirements of REF 2021 open access policy – Research England

Author: Research England (née HEFCE)

Notes: An important national survey of progress towards open access in the context of a strong policy and compliance requirement. Interesting both for the claims it makes about levels of OA and for the language and nature of the process by which it is being achieved. Lots of important detail on how metadata is and is not being collected and processed.

Abstract: Sixty one per cent of research outputs known to be in scope for the REF 2021 are meeting open access deposit, discovery and access requirements, with a further twenty per cent reporting a known exception, a report published today shows. The report details the findings of a survey by the former Higher Education Funding Council for England (HEFCE), the Wellcome Trust, the former Research Councils UK (RCUK) and Jisc. The survey sought to assess how the sector is delivering funders’ open access (OA) policies and to understand some of the challenges the sector faces. The four project partners were also interested in understanding the methods and tools being used across the sector to ensure policy compliance.

Source: Over 80% of research outputs meet requirements of REF 2021 open access policy – Research England

It’s Time to Make Your Data Count!

Author: Daniella Lowenberg

Notes: The Make Data Count project is a Sloan-funded effort to develop standardised metrics for data usage across data repositories. It represents the most general effort to date to track usage of generic research data. Here they report progress within two repositories (Dash at the California Digital Library, and DataONE) and are seeking engagement from other repositories to expand the program.

Summary: One year into our Sloan funded Make Data Count project, we are proud to release Version 1 of standardized data usage and citation metrics!

As a community that values research data it is important for us to have a standard and fair way to compare metrics for data sharing. We know of and are involved in a variety of initiatives around data citation infrastructure and best practices, including Scholix, Crossref and DataCite Event Data. But data usage metrics are tricky, and before now there had not been a group focused on processes for evaluating and standardizing data usage. Last June, members from the MDC team and COUNTER began talking through what a recommended standard could look like for research data.

Since the development of our COUNTER Code of Practice for Research Data we have implemented comparable, standardized data usage and citation metrics at Dash (CDL) and DataONE.

Source: It’s Time to Make Your Data Count!

The Landscape of Research Data Repositories in 2015: A re3data Analysis

Authors: Maxi Kindling et al.

https://doi.org/10.1045/march2017-kindling

Summary: Analysis of the data repositories indexed in re3data shows a wide range of access conditions, software platforms, APIs and persistent identifiers (PIDs) in use, as well as varied content, operating institutions and countries. Limited compliance with standards was noted.

re3data now provides much of this information on its metrics page: https://www.re3data.org/metrics

D-Lib Magazine March/April 2017
Volume 23, Number 3/4

Abstract: This article provides a comprehensive descriptive and statistical analysis of metadata information on 1,381 research data repositories worldwide and across all research disciplines. The analyzed metadata is derived from the re3data database, enabling search and browse functionalities for the global registry of research data repositories. The analysis focuses mainly on institutions that operate research data repositories, types and subjects of research data repositories (RDR), access conditions as well as services provided by the research data repositories. RDR differ in terms of the service levels they offer, languages they support or standards they comply with. These statements are commonly acknowledged by saying the RDR landscape is heterogeneous. As expected, we found a heterogeneous RDR landscape that is mostly influenced by the repositories’ disciplinary background for which they offer services.

Keywords: Research Data Repositories, RDR, Statistical Analysis, Metadata, re3data, Open Science, Open Access, Research Data, Persistent Identifier, Digital Object Identifier, Licenses

Source: The Landscape of Research Data Repositories in 2015: A re3data Analysis

Open Research Data: Report to the Australian National Data Service (ANDS)

Authors: John Houghton & Nicholas Gruen

Summary: An interesting 2014 report assessing the value of data in Australia’s public research. Estimates for Australia are extrapolated and scaled from UK studies. Staffing makes up more than 50%, and up to 90%, of the cost.

Main points:

Research data are an asset we have been building for decades, through billions of dollars of public investment in research annually. The information and communication technology (ICT) revolution presents an unprecedented opportunity to ‘leverage’ that asset. Given this, there is increasing awareness around the world that there are benefits to be gained from curating and openly sharing research data (Kvalheim and Kvamme 2014).
Conservatively, we estimate the value of data in Australia’s public research to be at least $1.9 billion and possibly up to $6 billion a year at current levels of expenditure and activity. Research data curation and sharing might be worth at least $1.8 billion and possibly up to $5.5 billion a year, of which perhaps $1.4 billion to $4.9 billion annually is yet to be realised. Hence, any policy around publicly funded research data should aim to realise as much of this unrealised value as practicable.

Source: open-research-data-report.pdf