Accuracy of affiliation information in Microsoft Academic: Implications for institutional level research evaluation

Authors: Ranjbar-Sahraei B.; Eck, N.J. van; Jong R. de

Comment: This is a summary of results for a poster presented at the STI 2018 Conference in Leiden. The work compares research output recorded by both Microsoft Academic (MA) and Web of Science (WoS) for Leiden University. A first level automated matching is done, revealing differences across MA and WoS. Then, a sample of 100 is drawn from each of the disagreeing parts of the comparison. Manual checking of these found that MA contained affiliation errors.

Abstract: In this work, we study the accuracy of affiliation information in Microsoft Academic (MA). To conduct this study, we have considered the full set of publications assigned to Leiden University (LU) as provided by two different data sources: MA and Web of Science (WoS). The results of this study suggest that a considerable number of publications in MA have missing or wrong affiliation information.

Source: Accuracy of affiliation information in Microsoft Academic: Implications for institutional level research evaluation
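
The comparison described in the comment above (automated matching, then a sample of 100 from each disagreeing part) can be sketched roughly as follows. This is only an illustration, not the authors' procedure: the function name `compare_sources`, the use of plain identifier strings such as DOIs as the matching key, and the fixed random seed are all assumptions made for the example.

```python
import random


def compare_sources(ma_ids, wos_ids, sample_size=100, seed=0):
    """Split two collections of publication IDs into agreeing and disagreeing parts.

    ma_ids / wos_ids are hypothetical identifier lists (e.g. DOIs) for the
    publications each source assigns to the institution.
    """
    ma, wos = set(ma_ids), set(wos_ids)
    parts = {
        "both": ma & wos,       # assigned to the institution by both sources
        "ma_only": ma - wos,    # assigned only by MA
        "wos_only": wos - ma,   # assigned only by WoS
    }
    rng = random.Random(seed)   # fixed seed so the sample is reproducible
    samples = {
        name: rng.sample(sorted(ids), min(sample_size, len(ids)))
        for name, ids in parts.items()
        if name != "both"       # only the disagreeing parts are checked by hand
    }
    return parts, samples
```

The two sampled lists are what would then go to manual checking; the code says nothing about which source is right for any given record.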

The History, Deployment, and Future of Institutional Repositories in Public Universities in South Africa – ScienceDirect

Author: Siviwe Bangani

Comment:
Another interesting paper about IRs in South Africa (SA). Web data was collected and interviews were conducted. A detailed history of IRs in SA is given. While many South African universities have signed various international declarations and initiatives on OA, they often don’t have an institutional policy on OA. Various factors (obstacles and enablers) are listed. The amount of funding is relatively low compared to other countries. Varying IR sizes, types of objects in IRs, multiple language support and issues, and suggestions for development are presented and discussed.

Abstract:
This paper investigates the history, deployment, and content of institutional repositories (IRs) in public universities in South Africa. Some of the local, national and international drivers and enablers that ensure the establishment and survival of the institutional repositories are identified. Lastly, an attempt is made to determine the future of the IRs. Findings include that South African universities were among the first universities in the world to host IRs with the first IR established in 2000. The most prevalent and dominant content in South African public university collections are electronic theses and dissertations (ETDs). There are signs that this is changing as more libraries cover research outputs emanating from the universities. African languages are sparsely represented in IRs in South Africa. The majority of universities in the country signed the Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities, and the Budapest Open Access Initiative. Many of them do not have their own open access policy. The driving factors include the decline in government subsidy, increase in journal subscriptions, depreciation of the South African currency, and addition of the Value Added Tax (VAT) of 14% on electronic resources by the South Africa taxman while the enabling factors include the international open access mandates, the Carnegie Foundation grants, and the National Research Foundation’s statement on open access.

Bangani S (2018) The History, Deployment, and Future of Institutional Repositories in Public Universities in South Africa. The Journal of Academic Librarianship 44(1): 39-51.

Source: The History, Deployment, and Future of Institutional Repositories in Public Universities in South Africa – ScienceDirect

Institutional Repositories in Chinese Open Access Development: Status, Progress, and Challenges – ScienceDirect

Authors: Jing Zhong & Shuyong Jiang

Comment: An interesting paper interrogating institutional repositories (IRs) in China. These IRs were accessed via ROAR, OpenDOAR, SouOA and CHAIR, though many URL links were broken. The article highlights the slow development of OA repositories in China and attributes this to the lack of policy and support at all levels. At the end of the article, it mentions that in May 2014 the Chinese Academy of Sciences and the National Natural Science Foundation of China released an Open Access policy statement requiring that research papers they fund be made open access in IRs within 12 months of publication. It would be interesting to follow up on whether this has made any significant impact.

Abstract:
Open Access (OA) movement in China is developing with its own track and speed. Compared to its western counterparts, it moves slowly. However, it keeps growing. More significantly, it provides open and free resources not only to Chinese scholars, but also to those of China studies around the world. The premise is whether we can find them in an easy and effective fashion. This paper will describe the status of the OA movement in China with a focus on institutional repositories (IR) in Chinese universities and research institutes. We will explore different IR service modules and discuss their coverage, strengths, limitation, and most importantly implications to the East Asian Collection in the US.

Zhong J & Jiang S (2016) Institutional Repositories in Chinese Open Access Development: Status, Progress, and Challenges. The Journal of Academic Librarianship 42(6): 739-744.

Source: Institutional Repositories in Chinese Open Access Development: Status, Progress, and Challenges – ScienceDirect

Elsevier journals — some facts

Author: Timothy Gowers
Blogpost April 24, 2014

Comment: This long blog post discusses the author’s attempts, successful in many cases, to obtain the costs of Elsevier journal subscriptions at the UK’s Russell Group universities. It includes some amusing, detailed correspondence with JISC and the universities. There is also related discussion of APCs and their impact on subscription costs, and of Elsevier costs at some US universities and in Brazil. The post and its comments also point to some useful data sources and related analysis.

Introduction: A little over two years ago, the Cost of Knowledge boycott of Elsevier journals began. Initially, it seemed to be highly successful, with the number of signatories rapidly reaching 10,000 and including some very high-profile researchers, and Elsevier making a number of concessions, such as dropping support for the Research Works Act and making papers over four years old from several mathematics journals freely available online. It has also contributed to an increased awareness of the issues related to high journal prices and the locking up of articles behind paywalls….

I have come to the conclusion that if it is not possible to bring about a rapid change to the current system, then the next best thing to do, which has the advantage of being a lot easier, is to obtain as much information as possible about it. Part of the problem with trying to explain what is wrong with the system is that there are many highly relevant factual questions to which we do not yet have reliable answers.

Source: Elsevier journals — some facts

Can Microsoft Academic help to assess the citation impact of academic books?

Authors: Kousha K & Thelwall M

Comment: This article compares the book coverage and citation counts of Microsoft Academic (MA) with those of the Book Citation Index (BKCI) and Google Books (GB). It shows that, while MA’s coverage of books is still not comprehensive, it finds more citations in some fields than the other two sources. In particular, it has greater coverage than BKCI for some Arts & Humanities fields (though in general it is still biased towards the technical fields). MA also seems less sensitive to book editions. The comparison with GB gave mixed results, with each source doing better in different fields, which suggests that their coverage is partly complementary.

Abstract: Despite recent evidence that Microsoft Academic is an extensive source of citation counts for journal articles, it is not known if the same is true for academic books. This paper fills this gap by comparing citations to 16,463 books from 2013-2016 in the Book Citation Index (BKCI) against automatically extracted citations from Microsoft Academic and Google Books in 17 fields. About 60% of the BKCI books had records in Microsoft Academic, varying by year and field. Citation counts from Microsoft Academic were 1.5 to 3.6 times higher than from BKCI in nine subject areas across all years for books indexed by both. Microsoft Academic found more citations than BKCI because it indexes more scholarly publications and combines citations to different editions and chapters. In contrast, BKCI only found more citations than Microsoft Academic for books in three fields from 2013-2014. Microsoft Academic also found more citations than Google Books in six fields for all years. Thus, Microsoft Academic may be a useful source for the impact assessment of books when comprehensive coverage is not essential.

Kousha K, Thelwall M (2018) Can Microsoft Academic help to assess the citation impact of academic books? arXiv.org: arXiv:1808.01474v1.

Source: Can Microsoft Academic help to assess the citation impact of academic books?

Citation analysis with microsoft academic

Authors: Hug SE, Ochsner M & Brändle MP

Comment: This article compares citation analyses based on Microsoft Academic (MA) and Scopus, using the output of three selected researchers. The results were uniform across MA and Scopus. Some limitations of MA’s metadata are also pointed out.

Abstract: We explore if and how Microsoft Academic (MA) could be used for bibliometric analyses. First, we examine the Academic Knowledge API (AK API), an interface to access MA data, and compare it to Google Scholar (GS). Second, we perform a comparative citation analysis of researchers by normalizing data from MA and Scopus. We find that MA offers structured and rich metadata, which facilitates data retrieval, handling and processing. In addition, the AK API allows retrieving frequency distributions of citations. We consider these features to be a major advantage of MA over GS. However, we identify four main limitations regarding the available metadata. First, MA does not provide the document type of a publication. Second, the “fields of study” are dynamic, too specific and field hierarchies are incoherent. Third, some publications are assigned to incorrect years. Fourth, the metadata of some publications did not include all authors. Nevertheless, we show that an average-based indicator (i.e. the journal normalized citation score; JNCS) as well as a distribution-based indicator (i.e. percentile rank classes; PR classes) can be calculated with relative ease using MA. Hence, normalization of citation counts is feasible with MA. The citation analyses in MA and Scopus yield uniform results. The JNCS and the PR classes are similar in both databases, and, as a consequence, the evaluation of the researchers’ publication impact is congruent in MA and Scopus. Given the fast development in the last year, we postulate that MA has the potential to be used for full-fledged bibliometric analyses.

Hug, S.E., Ochsner, M. & Brändle, M.P. (2017) Citation analysis with microsoft academic. Scientometrics 111: 371. https://doi.org/10.1007/s11192-017-2247-8

Source: Citation analysis with microsoft academic
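
As a rough illustration of the average-based indicator mentioned in the abstract, the sketch below computes a journal normalized citation score under one common definition: each paper's citation count is divided by the mean citation count of papers in the same journal and year, and the ratios are averaged. This definition, the function name `jncs`, and the `(journal, year, citations)` tuple layout are assumptions for the example and are not taken from the article.

```python
from collections import defaultdict
from statistics import mean


def jncs(researcher_papers, reference_papers):
    """Average journal-normalized citation ratio for one researcher.

    Both arguments are lists of (journal, year, citations) tuples; the
    reference list stands in for all papers of the relevant journals.
    """
    # Group reference citation counts by journal and publication year.
    groups = defaultdict(list)
    for journal, year, cites in reference_papers:
        groups[(journal, year)].append(cites)

    ratios = []
    for journal, year, cites in researcher_papers:
        peers = groups.get((journal, year))
        if not peers:
            continue              # no reference set for this journal-year
        baseline = mean(peers)
        if baseline > 0:          # skip journal-years with no citations at all
            ratios.append(cites / baseline)

    return mean(ratios) if ratios else None
```

A value above 1 would mean the researcher's papers are cited more than the average paper in the same journals and years. Document type is ignored here, which fits the abstract's remark that MA does not provide it.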

Completeness and overlap in open access systems: Search engines, aggregate institutional repositories and physics-related open sources

Authors: Tsay M-y, Wu T-l & Tseng L-l

Comment: This article compares several open access search engines (i.e., Google Scholar (GS), Microsoft Academic (MSA), OAIster, OpenDOAR, arXiv.org and Astrophysics Data System (ADS)) using publications of Nobel Laureates in Physics from 2001 to 2013. A short literature review on comparing search engines is given. Both internal and external overlaps are studied. At the time of this work, GS had the highest coverage of this sample, but also a very high percentage of internal overlap (>92%). It covers all items found in the other sources, except for those in MSA. ADS and MSA both had coverage just below GS, with ADS having the lowest internal overlap of the three (just slightly higher than arXiv.org, which had 0 internal overlap).

Abstract: This study examines the completeness and overlap of coverage in physics of six open access scholarly communication systems, including two search engines (Google Scholar and Microsoft Academic), two aggregate institutional repositories (OAIster and OpenDOAR), and two physics-related open sources (arXiv.org and Astrophysics Data System). The 2001–2013 Nobel Laureates in Physics served as the sample. Bibliographic records of their publications were retrieved and downloaded from each system, and a computer program was developed to perform the analytical tasks of sorting, comparison, elimination, aggregation and statistical calculations. Quantitative analyses and cross-referencing were performed to determine the completeness and overlap of the system coverage of the six open access systems. The results may enable scholars to select an appropriate open access system as an efficient scholarly communication channel, and academic institutions may build institutional repositories or independently create citation index systems in the future. Suggestions on indicators and tools for academic assessment are presented based on the comprehensiveness assessment of each system.

Tsay M-y, Wu T-l, Tseng L-l (2017) Completeness and overlap in open access systems: Search engines, aggregate institutional repositories and physics-related open sources. PLoS ONE 12(12): e0189751. https://doi.org/10.1371/journal.pone.0189751

Source: Completeness and overlap in open access systems: Search engines, aggregate institutional repositories and physics-related open sources
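
To make the two overlap notions in the comment above concrete, here is a minimal sketch. It assumes that internal overlap means duplicate records within a single source, and that external overlap means records of one source that also appear in another; the paper's exact definitions may differ. The function names and the use of plain strings as record keys are hypothetical.

```python
from collections import Counter


def internal_overlap(records):
    """Share of records that duplicate another record in the same source."""
    if not records:
        return 0.0
    counts = Counter(records)
    # Every occurrence beyond the first counts as a duplicate.
    duplicates = sum(n - 1 for n in counts.values())
    return duplicates / len(records)


def external_overlap(source, other):
    """Share of distinct records in `source` that also appear in `other`."""
    a, b = set(source), set(other)
    return len(a & b) / len(a) if a else 0.0
```

In practice the hard part is deciding when two bibliographic records refer to the same publication, which this sketch sidesteps by assuming the records have already been reduced to comparable keys.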

Microsoft Academic is one year old: the Phoenix is ready to leave the nest

Authors: Harzing, AW. & Alakangas, S.

Comment: This is the third in a series of articles by the first author investigating the relative citation and publication coverage of Microsoft Academic (MA) within its first year of (re-)launch. Although the studies were relatively small in scale (citation records of 1 and 145 academics), they provided strong evidence for the advantages of MA over other databases. In particular, it combines high coverage, like Google Scholar, with structured metadata, like Scopus and Web of Science. These features, together with its fast growth, make MA an excellent alternative for bibliometric and scientometric studies.

Abstract: We investigate the coverage of Microsoft Academic (MA) just over a year after its re-launch. First, we provide a detailed comparison for the first author’s record across the four major data sources: Google Scholar (GS), MA, Scopus and Web of Science (WoS) and show that for the most important academic publications, journal articles and books, GS and MA display very similar publication and citation coverage, leaving both Scopus and WoS far behind, especially in terms of citation counts. A second, large scale, comparison for 145 academics across the five main disciplinary areas confirms that citation coverage for GS and MA is quite similar for four of the five disciplines. MA citation coverage in the Humanities is still substantially lower than GS coverage, reflecting MA’s lower coverage of non-journal publications. However, we shouldn’t forget that MA coverage for the Humanities still dwarfs coverage for this discipline in Scopus and WoS. It would be desirable for other researchers to verify our findings with different samples before drawing a definitive conclusion about MA coverage. However, based on our current findings we suggest that, only one year after its re-launch, MA is rapidly become the data source of choice; it appears to be combining the comprehensive coverage across disciplines, displayed by GS, with the more structured approach to data presentation, typical of Scopus and WoS. The Phoenix seems to be ready to leave the nest, all set to start its life into an adulthood of research evaluation.

Harzing, AW. & Alakangas, S. (2017) Microsoft Academic is one year old: the Phoenix is ready to leave the nest. Scientometrics 112: 1887. https://doi.org/10.1007/s11192-017-2454-3

Source: Microsoft Academic is one year old: the Phoenix is ready to leave the nest | Springer for Research & Development

Growth of hybrid open access, 2009–2016

Author: Bo-Christer Björk

Notes: This 2017 article estimates the growth in hybrid OA journals, and in the articles published within them, from 2009 to 2016 across 20 publishers. Most interesting is the difficulty experienced in obtaining data, because the hybrid status of a journal is not always indicated. The author used previous studies and more recent data from 15 publishers who agreed to share, plus 5 big publishers. However, the data are not itemised for each publisher.

Abstract:

Hybrid Open Access is an intermediate form of OA, where authors pay scholarly publishers to make articles freely accessible within journals, in which reading the content otherwise requires a subscription or pay-per-view. Major scholarly publishers have in recent years started providing the hybrid option for the vast majority of their journals. Since the uptake usually has been low per journal and scattered over thousands of journals, it has been very difficult to obtain an overview of how common hybrid articles are. This study, using the results of earlier studies as well as a variety of methods, measures the evolution of hybrid OA over time. The number of journals offering the hybrid option has increased from around 2,000 in 2009 to almost 10,000 in 2016. The number of individual articles has in the same period grown from an estimated 8,000 in 2009 to 45,000 in 2016. The growth in article numbers has clearly increased since 2014, after some major research funders in Europe started to introduce new centralized payment schemes for the article processing charges (APCs).

https://peerj.com/articles/3878/
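
For a sense of scale, the figures quoted in the abstract can be turned into an average yearly growth rate. The sketch below assumes smooth compound growth over the seven years from 2009 to 2016, which is a simplification (the abstract itself notes that growth picked up after 2014), and it uses the rounded figures from the abstract. The helper name `annual_growth_rate` is made up for this example.

```python
def annual_growth_rate(start_value, end_value, years):
    """Compound annual growth rate between two values, as a fraction."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Rounded figures from the abstract: 2009 -> 2016 is seven yearly steps.
articles_rate = annual_growth_rate(8_000, 45_000, 7)
journals_rate = annual_growth_rate(2_000, 10_000, 7)

print(f"hybrid articles: {articles_rate:.1%} per year")
print(f"hybrid journals: {journals_rate:.1%} per year")
```

Under these assumptions both rates come out at roughly a quarter or more per year, with article numbers growing somewhat faster than the number of journals offering the option.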

Enhancing Institutional Publication Data Using Emergent Open Science Services

Authors: David Walters and Christopher Daley (Brunel University, London)

Notes: An interesting article looking at integrating data sources to assess OA status and the location of OA copies for a single UK university. It focusses on data derived from CORE and Unpaywall, and on combining these with other information from university systems.

Abstract: The UK open access (OA) policy landscape simultaneously preferences Gold publishing models (Finch Report, RCUK, COAF) and Green OA through repository usage (HEFCE), creating the possibility of confusion and duplication of effort for academics and support staff. Alongside these policy developments, there has been an increase in open science services that aim to provide global data on OA. These services often exist separately to locally managed institutional systems for recording OA engagement and policy compliance. The aim of this study is to enhance Brunel University London’s local publication data using software which retrieves and processes information from the global open science services of Sherpa REF, CORE, and Unpaywall. The study draws on two classification schemes; a ‘best location’ hierarchy, which enables us to measure publishing trends and whether open access dissemination has taken place, and a relational ‘all locations’ dataset to examine whether individual publications appear across multiple OA dissemination models. Sherpa REF data is also used to indicate possible OA locations from serial policies. Our results find that there is an average of 4.767 permissible open access options available to the authors in our sample each time they publish and that Gold OA publications are replicated, on average, in 3 separate locations. A total of 40% of OA works in the sample are available in both Gold and Green locations. The study considers whether this tendency for duplication is a result of localised manual workflows which are necessarily focused on institutional compliance to meet the Research Excellence Framework 2021 requirements, and suggests that greater interoperability between OA systems and services would facilitate a more efficient transformation to open scholarship.

Source: Enhancing Institutional Publication Data Using Emergent Open Science Services