
Exploring the merits of research performance measures that comply with the San Francisco Declaration on Research Assessment and strategies to overcome barriers of adoption: qualitative interviews with administrators and researchers

Abstract

Background

In prior research, we identified and prioritized ten measures for assessing research performance that comply with the San Francisco Declaration on Research Assessment, a set of principles adopted worldwide that discourages assessment based on journal metrics. Given the shift away from assessment based on Journal Impact Factor, we explored potential barriers to implementing and adopting the prioritized measures.

Methods

We identified administrators and researchers across six research institutes, conducted telephone interviews with consenting participants, and used qualitative description and inductive content analysis to derive themes.

Results

We interviewed 18 participants: 6 administrators (research institute business managers and directors) and 12 researchers (7 on appointment committees) who varied by career stage (2 early, 5 mid, 5 late). Participants appreciated that the measures were similar to those currently in use, comprehensive, relevant across disciplines, and generated using a rigorous process. They also said the reporting template was easy to understand and use. In contrast, a few administrators thought the measures were not relevant across disciplines. A few participants said it would be time-consuming and difficult to prepare narratives when reporting the measures, and several thought that it would be difficult to objectively evaluate researchers from a different discipline without considerable effort to read their work. Strategies viewed as necessary to overcome barriers and support implementation of the measures included high-level endorsement of the measures, an official launch accompanied by a multi-pronged communication strategy, training for both researchers and evaluators, administrative support or automated reporting for researchers, guidance for evaluators, and sharing of approaches across research institutes.

Conclusions

While participants identified many strengths of the measures, they also identified a few limitations and offered corresponding strategies to address the barriers, which we will apply at our organization. Ongoing work is needed to develop a framework to help evaluators translate the measures into an overall assessment. Given the scarcity of prior research identifying research assessment measures and strategies to support their adoption, this research may be of interest to other organizations that assess the quality and impact of research.


Background

The San Francisco Declaration on Research Assessment (DORA) was established in 2012 during the Annual Meeting of the American Society for Cell Biology [1]. DORA principles advocate for research assessment based on a broad range of discipline-relevant measures of quality and impact, and for eliminating journal-based metrics such as the Journal Impact Factor. DORA recommends that academic organizations be explicit about the criteria used for hiring, annual review, tenure, and promotion decisions; assess the value and impact of all research outputs in addition to research publications; and consider a broad range of measures, including qualitative indicators of research impact such as influence on policy and practice. As of 1 November 2022, 22,311 individuals and organizations in 159 countries were DORA signatories.

Some groups have generated principles for assessing research that align with the DORA statement. For example, the Leiden Manifesto includes ten principles, such as measuring performance against the research institute's mission and accounting for variation by field in publication and citation practices [2]. A 2017 meeting of international experts in scientific communication generated six principles upon which to judge research: assess contribution to societal needs, employ responsible indicators, reward publishing of all research regardless of the results, recognize the culture of open research, fund research that generates evidence on optimal ways to assess science and faculty, and fund/recognize out-of-the-box ideas [3]. While helpful, these principles offer high-level guidance but not concrete performance measures. Others have suggested measures for research assessment, but they are discipline-specific and not broadly applicable to diverse fields of research. For example, Mazumdar et al. proposed criteria to assess the contributions of biostatisticians to team science [4].

To address these limitations, our group identified and prioritized measures of research quality and impact. To be clear, we included only measures of research activity and outputs, and excluded measures pertaining to teaching, mentoring, and other service. The methods and results are reported elsewhere [5]. In brief, we synthesized peer-reviewed research and grey literature, including documents from Canadian academic organizations and international scholarly organizations that had adopted DORA principles, to generate a list of 50 unique measures for assessing the quality and impact of research. We then conducted a two-round Delphi survey of multidisciplinary researchers, research administrators, and research leaders to achieve consensus on priority measures. This resulted in ten measures organized in eight domains: relevance of research program, challenges to research program or productivity, team/open science, funding, innovations, publications, other dissemination, and impact. The measures can be used by researchers across disciplines in our organization and beyond to describe their research achievements, and by various staff in academic organizations to support hiring, annual review, tenure, promotion, and other decisions based on the quality and impact of research.

While the limitations and harms of evaluating research on the basis of metrics such as Journal Impact Factor have long been recognized [6,7,8,9,10,11], research assessment has largely focused on journal metrics for quite some time. For example, a 2018 survey of criteria used for assessing researchers at 92 international faculties of biomedical sciences revealed that they largely employed traditional measures such as the number of peer-reviewed publications, impact factor, and the number of grants or amount of grant funding [12]. Thus, it may be challenging to promote adoption of measures that exclude journal metrics and instead rely on assessing the quality and impact of the research itself rather than where it was published. The overall aim of this study was to explore how to promote acceptance and use of the ten priority measures. The specific objectives were to identify perceived strengths and limitations of the measures, and suggestions for strategies needed to overcome barriers and support adoption of the measures. Such knowledge is critical for planning implementation of the measures, including approaches, interventions, or tools.

Methods

Approach

We employed a qualitative research design to fully explore participant perspectives on the ten priority measures [13]. More specifically, we used qualitative description, an approach that does not test or generate theory [14]. Instead, this approach is commonly used in health services research to gather explicit information about views, experiences, and suggestions. We complied with the Consolidated Criteria for Reporting Qualitative Research checklist [15]. The University Health Network Research Ethics Board granted ethical approval for this study (REB #22–5082). All participants provided written informed consent prior to interviews. The researchers had no relationship with the participants, other than employment at the same institution. The University Health Network is one of Canada’s largest academic hospitals offering 12 medical programs across 10 hospital sites. The research arm of the University Health Network is organized into distinct research institutes that vary in size, administration, and focus including cardiology, transplantation, neurosciences, oncology, surgical innovation, infectious diseases, genomic medicine, healthcare education, and rehabilitation medicine.

Sampling and recruitment

We used purposive sampling to recruit individuals from our organization whose views about the measures might vary by role (researcher, research administrator, research leader) and research discipline (six research institutes spanning biomedical, clinical, health services, population health, rehabilitation, and medical education research). We identified eligible persons on publicly available research institute web sites, and also used snowball sampling by first interviewing research administrators in different research institutes and asking them to refer us to others in their research institute. We aimed to recruit a single administrator (e.g., business manager), a single leader (e.g., scientific director), and one researcher from each research institute, while also including, in non-mutually exclusive fashion, persons at early, mid-, and later career stages, for a target of 18 interviews. In qualitative research, sampling proceeds concurrently with data collection and analysis until thematic saturation, the point at which no further unique themes emerge from successive interviews; saturation was established through discussion of themes by the research team. This is consistent with evidence that saturation is often achieved within 12–15 interviews [16]. We first contacted potential participants by email on 28 April 2022, followed by email reminders to non-respondents every two weeks until we closed recruitment on 3 August 2022. Following informed consent, we used email to schedule the interview and share the ten priority measures (Additional file 1), asking participants to review the measures in advance and have them available during the interview.

Data collection

We conducted a single telephone interview with each participant between May and July 2022. A.R.G. (a PhD-trained woman Senior Scientist/Professor with extensive qualitative experience) conducted the first four interviews while H.B. (an MPH-trained woman Research Associate with some qualitative experience) attended for training purposes; the next two were conducted by H.B. while A.R.G. attended, followed by discussion to further support training. Thereafter, H.B. conducted all interviews, with periodic review by A.R.G. to provide feedback and answer questions posed by H.B. Interview questions were reviewed and refined by the research team prior to use. The interview guide (Additional file 2) included five questions on perceived strengths of the measures, perceived limitations of or gaps in the measures, potential barriers to reporting the measures, potential barriers to assessing the measures, and strategies needed to facilitate use of the measures for reporting or assessing research. The interview guide included prompts that we invoked only if a participant provided little response to a given question; prompts were informed by what we heard from participants in early interviews. Both questions and prompts were posed in an open, non-leading manner to avoid influencing responses. Interviews ranged from 15 to 60 minutes, and were audio-recorded and transcribed.

Data analysis

We used content analysis to identify themes inductively through constant comparison, and used Microsoft Office (Word, Excel) to manage data [13, 14]. H.B. and A.R.G. independently coded the first three interviews, then compared and discussed coding to develop a preliminary codebook of themes and exemplar quotes (first-level coding). H.B. coded subsequent interviews to expand or merge themes (second-level coding), and met with A.R.G. periodically to discuss and refine coding. We tabulated data (themes, quotes) by participant institute, role, career stage, and research discipline to compare themes. We used summary statistics to describe participants and text to describe key themes. As is customary in qualitative research, we used words such as few or many to convey whether some or most participants articulated a given idea, as a way of exploring and reporting major discrepancies in participants' views or suggestions.
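The tabulation step described above was done in Word and Excel; purely as an illustration, the sketch below shows how coded themes could be cross-tabulated by participant role and career stage to support constant comparison. The CSV file name, column names, and data layout are hypothetical assumptions for this example, not artifacts of the study.

```python
# Illustrative sketch only: count how many distinct participants articulated each theme,
# broken down by role and career stage. Assumes a hypothetical export "coded_interviews.csv"
# with columns: participant_id, role, career_stage, theme, quote.
import csv
from collections import defaultdict

def tabulate_themes(path):
    # counts[theme][(role, career_stage)] -> set of participant IDs who articulated the theme
    counts = defaultdict(lambda: defaultdict(set))
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            key = (row["role"], row["career_stage"])
            counts[row["theme"]][key].add(row["participant_id"])
    # collapse sets to participant counts per cell
    return {theme: {key: len(ids) for key, ids in cells.items()}
            for theme, cells in counts.items()}

if __name__ == "__main__":
    for theme, cells in tabulate_themes("coded_interviews.csv").items():
        print(theme)
        for (role, stage), n in sorted(cells.items()):
            print(f"  {role}, {stage} career: {n} participant(s)")
```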

Results

Participants

We interviewed a total of 18 participants: 6 administrators (research institute business managers and directors) and 12 researchers (7 of whom were on research institute appointment committees). Participants varied by research institute, thus representing the perspectives of various research disciplines (Table 1). The career stages of the 12 researchers included 2 early, 5 mid-, and 5 later career.

Table 1 Participant characteristics

Themes

Data (themes and quotes) are available in Additional file 3. Table 2 lists themes with exemplar quotes. Here we discuss themes organized by interview question with select illustrative quotes, and highlight any views that differed among participants.

Table 2 Themes and exemplar quotes

Perceived strengths of the measures

Overall, participants identified numerous benefits of the measures and the template for reporting the measures. Most participants said that the measures covered the major aspects of being a productive researcher, many using the word “comprehensive,” and that the measures were equally relevant to different research disciplines. In particular, by moving away from impact factor, the measures would allow researchers to personalize reporting according to their research discipline, resulting in fair assessment. Most researchers and one administrator said that the measures were relatively similar to those already in use, suggesting that little change was needed to adopt the measures. Similarly, most researchers and one administrator said the measures were easy to understand, and the template was easy to use because it provided options to choose from and limited open-ended responses to a few key achievements. One researcher appreciated that the measures were generated using a rigorous scientific process, referring to the literature search, environmental scan and Delphi survey process we used to identify and prioritize the measures [5].

Using these ten criteria as an assessor, I think I can get a pretty good feel for what that researcher is doing and what their successes and challenges have been (012 JS mid clinical).

Several participants noted the merits of specific measures. For example, participants appreciated the opportunity to describe challenges to their productivity “because the real world does impact research.” Others highlighted the novelty of describing collaboration, an important aspect of research yet not typically explicitly reported or assessed in the past. Participants also appreciated multiple opportunities to describe their research contributions including a list of key outputs in addition to publications, how their research directly or indirectly improves health or health care, how their research advances knowledge, and any other relevant information about the quality and impact of their research not captured by other measures.

Researchers might not always be published in a traditional peer-review paper. I really like that you have it explicitly stated those kinds of outputs are something that can be highlighted (01 AC late biomedical).

Perceived limitations or gaps in the measures

Compared with strengths, fewer participants raised concerns about the measures. Regarding limitations, the most commonly articulated concern was that the measures may not be relevant across research disciplines. This was mentioned by five administrators, including two who had noted that relevance across disciplines was a strength, and a single researcher. Several participants also said that the 5-year time frame (reflecting the 5-year review for scientists at our institution) was too short given that some years are more productive than others, and in some research disciplines, achievements may only be realized in the context of an entire career.

I think the disciplines are so different that to use this collection of measures equally across disciplines is extremely difficult (06 AC mid biomedical).

In terms of gaps, a few participants said that the measures did not reflect activities such as teaching, mentorship, and other service, which they viewed as a major and important part of what researchers do. A single researcher noted that the options listed as examples of research outputs did not include medical curricula or outcomes associated with community-based research collaboration. A single administrator noted the absence of a measure to capture the effort invested in unsuccessful attempts to secure research funding.

Potential barriers to reporting the measures

Few participants identified barriers that researchers may face when reporting the measures. Some thought that it would be time-consuming to prepare descriptive accounts in response to these measures, noting that, in the past, information about grants and publications was gathered by others from their CVs. These participants and other researchers said that some people are better able to describe their research than others, suggesting that poorly written descriptive accounts would not reflect actual research quality and impact.

People differ in their ability to describe the impact of their work but that does not necessarily reflect the impact of the work, it reflects their ability to describe it (09 ID mid biomedical).

Potential barriers to assessing the measures

Compared with barriers to reporting the measures, more participants identified barriers that evaluators may face when assessing them. Without being able to rely on Journal Impact Factor, which they viewed as objective, participants said it would be difficult to judge research outside one's own discipline, possibly leading to biased or inaccurate assessments. Furthermore, without numerical data such as Journal Impact Factor, evaluators might be unclear how to weight different measures to generate an overall evaluation, or how to distinguish between candidates when hiring. Given these concerns, a single researcher said that the workload would be greater because they would have to read publications to assess research contributions rather than relying on Journal Impact Factor. Several administrators and researchers highlighted that many are comfortable with the status quo and reluctant to accept the changes required to implement or adopt the measures.

The shift to these kinds of measures is the biggest challenge instead of the measures themselves or gaps in the measures (010 BM).

Strategies needed to address barriers and support adoption of the measures

Three administrators suggested that research institutes be allowed to choose which of the ten measures to adopt, in contradiction to the many participants who said that the measures are similar to those in current use. This view was qualified by two participants: one administrator said that it would simply take time to adjust to the measures, and one appointment committee researcher said that the measures could be refined over time if needed on the basis of continuous assessment of issues that arise with implementation and adoption.

When anything new is introduced people are uncomfortable with it, and once they tried it a few times, usually then they develop a certain level of comfort with it and it doesn’t seem so unreasonable. With time, people will learn how to adjust to the new assessment measures and things will be fine (016 ID late biomedical).

DORA has to be a working document, if you will, and one that will be changed with time or with implementation itself (05 AC late biomedical).

Participants said that researchers and evaluators would comply if leadership formally endorsed the measures as the standard, and several recommended an official launch accompanied by a multipronged communication strategy to raise awareness about the measures and, specifically, why they are important, plus training for researchers and evaluators on how to report or assess the measures.

It needs to be made explicit by leaders (011 SS late clinical).

I guess it would be nice to officially launch it somehow and probably that would be with a town-hall or something like that (03 AC mid health services and population).

You need the why are we doing this? Why are we shifting our focus? Why are we asking you not to just count citations and look at journal impact factor (07 AC mid biomedical).

We would require a lot of education for our researchers in guiding them on thinking in this way and using these measures and how best to report it. Education will be critical for our evaluators as well to think in this lens (017 BM).

Specific to helping researchers, a few participants recommended informing researchers about the measures early in their careers so that they can track them, and ensuring that researchers have administrative support to help prepare reports based on the measures or, alternatively, automating the reporting process so that the data required to report the measures can be acquired directly by administrative assistants from researcher CVs. Specific to helping evaluators, a few participants recommended providing guidance on how to assess and interpret the measures. One participant suggested sharing information across research institutes on the approaches they are using to adopt and apply the measures in performance evaluation. A few participants also recommended strategies to ensure fair reviews: evaluators should be from the same discipline as the researcher under review, multiple evaluators should be employed, and evaluators must contextualize measures to both discipline and career stage.

Discussion

Through interviews with 18 participants representing a range of roles, disciplines, and career stages affiliated with 6 research institutes, we generated important insight needed to inform implementation of the 10 DORA-compliant measures of research quality and impact identified in our prior research [5]. Participants identified numerous strengths of the measures. In contrast to the many participants who viewed the measures as relevant across disciplines, the main limitation articulated by a few administrators was a lack of relevance of the measures across disciplines. Few participants identified barriers to reporting the measures. Several participants thought it would be difficult to judge researchers of different disciplines and to translate the measures into an overall evaluation, resulting in greater workload and contributing to reluctance to change from the status quo. Key strategies viewed as necessary to overcome barriers and support implementation of the measures included high-level endorsement of the measures, an official launch accompanied by a multipronged communication strategy, and training for both researchers and evaluators.

In comparison with previous efforts that generated principles of research assessment rather than specific measures [2, 3], or discipline-specific measures [4], in prior research we identified and prioritized measures of research quality and impact that can be applied across disciplines [5], whereas in this study we explored barriers to adopting those measures. Approaches for performance appraisal of physicians and nurses have been developed, but they focus on practice-based measures rather than academic research [17, 18]. Thus, to our knowledge, no other research has empirically examined the implications, for individual researchers, of research assessment that does not rely wholly or in part on Journal Impact Factor. Regardless of the absence of such research, the research assessment landscape is rapidly changing. A recent Nature editorial underscored these changes, noting that, in addition to DORA and the Leiden Manifesto [1, 2], there are several other efforts to advance research assessment [19]. For example, the European Commission established an agreement among a coalition of more than 350 organizations from over 40 countries to reward integrity, teamwork, and a diversity of outputs in addition to other measures of research quality and impact (REF). The Coalition for Advancing Research Assessment Agreement includes principles (e.g., focus research assessment criteria on quality; ensure gender equality, equal opportunities, and inclusiveness) and processes to implement these principles (e.g., base research assessment primarily on qualitative evaluation, commit resources to reforming research assessment, raise awareness of research assessment reform). The UK Future Research Assessment Programme (https://www.jisc.ac.uk/future-research-assessment-programme#) will soon issue a report on modernized measures and strategies for research assessment in England, Scotland, Wales, and Northern Ireland. The International Network of Research Management Societies (INORMS), formed in 2001, includes research management societies and associations from across the globe. INORMS generated the SCOPE Framework for Research Evaluation, a five-stage model for responsible assessment: start with what you value, consider context, options for evaluating, probe deeply, and evaluate your evaluations [20]. The Hong Kong Principles, generated through discussion and consensus among over 100 participants at the 6th World Conference on Research Integrity, comprise five principles focused on research rigor rather than impact: assess responsible research practices, value complete reporting, reward the practice of open science, acknowledge a broad range of research activities, and recognize essential other tasks such as peer review and mentoring [MOHER]. Project TARA (Tools to Advance Research Assessment) was launched in October 2022 by the DORA initiative to identify, understand, and make visible the criteria and standards universities use to make hiring, promotion, and tenure decisions, and features an online repository of tools (https://sfdora.org/project-tara/). While it is beyond the scope of this study to compare the principles underlying these initiatives, their principles and processes do appear to support the measures we generated in prior research [5] (e.g., assess research quality, base research assessment primarily on qualitative evaluation), and the recommendations of the participants we interviewed in this study about how to implement the measures.

In this study, while perceived strengths outnumbered limitations, further efforts to promote adoption of the measures must address the limitations identified by participants, warranting some analysis here of each concern. Some participants noted a lack of measures reflecting what is often referred to as service, including activities such as teaching, mentorship, committees, and leadership. We chose to focus only on the assessment of research quality and impact because the emphasis of DORA is on research assessment [1]. However, research institutes or other agencies that evaluate researchers could choose to also assess service. One appointment committee researcher said that a 5-year time frame was too short. We chose this time frame because our organization reviews senior scientists every five years; however, organizations that evaluate researchers could impose a different time frame more suitable to their evaluation context (i.e., hiring, annual salary review, promotion, tenure). One researcher noted that some relevant research outputs were not listed as examples on the reporting template. The template was not meant to be exhaustive and specifies that researchers can add outputs relevant to their research; nevertheless, we will add the suggested outputs to the list of examples to be inclusive of the range of research disciplines. Similarly, one administrator noted the absence of unsuccessful attempts to secure research funding, a measure that was initially included on our list but did not achieve consensus during the Delphi survey in our previous study [5]. However, we could add unsuccessful funding attempts as an example response option under the measures addressing challenges to productivity or other relevant information.

Participants identified several strategies to support adoption of the measures. For researchers, such assistance could take two forms. A few administrators thought that some researchers may be disadvantaged if they lack the ability to describe the value of their research. While researchers are routinely required to write succinct and convincing research funding applications, journal articles, and meeting abstracts, assessing the ability of researchers to describe the merits of their research was beyond the scope of this research. However, to support researchers, we could develop sample reports that demonstrate how to complete the template. To further help researchers, administrative assistants could adapt information contained within academic CVs, which usually include a narrative section on research achievements. To help evaluators, participants recommended guidance on how to translate responses to the ten measures into an overall evaluation. Given that DORA recommends a broad range of measures including qualitative indicators of research impact [1], the measures are not easily quantifiable, which poses challenges for evaluators accustomed to relying on journal metrics. Further research is likely necessary to develop a framework or other guidance to help evaluators determine, in a relatively objective rather than subjective fashion, whether a given researcher has been sufficiently productive. Further, to eliminate bias, participants offered three recommendations to ensure fair reviews: evaluators should be from the same discipline as the researcher under review, multiple evaluators should be employed, and evaluators must contextualize measures to both discipline and career stage. Evaluators likely already consider these factors; however, in our ongoing work to generate a framework for interpreting the measures, we could investigate how to make consideration of those factors explicit rather than implicit.

Strengths of this research included use of robust qualitative methods that complied with reporting criteria and standard techniques for ensuring rigor [13,14,15,16]. The research was guided at multiple points by input from the interdisciplinary research team. Furthermore, we interviewed participants who varied by role (research institute business managers and directors, researchers, and researchers on appointment review committees) and career stage. Participants were affiliated with six research institutes, thus representing the perspectives of a wide range of research disciplines. We do acknowledge some limitations. All participants were employed at a single hospital corporation featuring multiple research institutes. While this may have resulted in similar perspectives or possible bias in favor of the measures, the data revealed differing views and recommendations among participants about the merits of the measures. While participants represented all 6 research institutes, given that only 18 individuals participated, they may not have represented all possible views of all healthcare-relevant research disciplines in Canada or elsewhere. Participants included only two of six research institute directors and only three researchers not on appointment review committees, so their views may not be fully represented in the data. We attribute this to the timing of recruitment (summer) and burnout among clinician investigators (due to COVID-19). We were not given ethical approval to collect information about sex, gender, or cultural group, so we could not analyze data by these demographic characteristics. While limited sampling may have influenced the results, we achieved informational saturation of themes and identified differing perspectives among the 18 participants. The participants were affiliated with a hospital corporation in Ontario, Canada, so findings may not be transferable or relevant to other locations in Canada or to academic institutions elsewhere with differing research assessment rubrics and processes.

Conclusions

While participants identified many strengths of the ten DORA-compliant measures for research assessment, they also identified some limitations and concerns. At the same time, they offered corresponding strategies to overcome those barriers. Many of the noted barriers can be addressed relatively easily through high-level official endorsement, communication, training for researchers and evaluators, and modification of the reporting template. Ongoing work is needed to develop a framework that evaluators can use to translate the measures into an overall assessment of an individual researcher, which responds to a key concern raised by participants. Given the scarcity of prior research identifying research assessment measures and strategies to support their adoption, this research may be of interest to academic organizations beyond our hospital as well as researchers, funders, and others. Other initiatives (e.g., the Hong Kong Principles, Project TARA, the CoARA Agreement) can also provide guidance for creating a research culture conducive to DORA-compliant measures.

Availability of data and materials

All data generated or analyzed during this study are included in this published article and its supplementary information files.

Abbreviations

DORA: San Francisco Declaration on Research Assessment

References

  1. San Francisco Declaration on Research Assessment. DORA. 2012. https://sfdora.org/. Accessed 2 Jun 2022.

  2. Hicks D, Wouters P, Waltman L, de Rijcke S, Rafols I. Bibliometrics: the Leiden Manifesto for research metrics. Nature. 2015;520:429–31.


  3. Moher D, Naudet F, Cristea IA, Miedema F, Ioannidis JP, Goodman SN. Assessing scientists for hiring, promotion, and tenure. PLOS Biol. 2018;16:3.


  4. Mazumdar M, Messinger S, Finkelstein DM, et al. Evaluating academic scientists collaborating in team-based research: a proposed framework. Acad Med. 2015;90:1302–8.


  5. Gagliardi AR, Chen RHC, Boury H, Albert M, Chow J, DaCosta RS, Hoffman M, Keshavarz B, Kontos P, Liu J, McAndrews MP, Protze S. DORA-compliant measures to assess research quality and impact in biomedical institutions: review of published research, international best practice and Delphi survey. PLoS ONE. 2023;18(5):e0270616. https://doi.org/10.1371/journal.pone.0270616.


  6. Gingras Y. Bibliometrics and research evaluation: uses and abuses. Cambridge: The MIT Press; 2016.


  7. Muller JZ. The tyranny of metrics. Princeton: Princeton University Press; 2019.


  8. Seglen PO. Why the impact factor of journals should not be used for evaluating research. BMJ. 1997;314:498–502.


  9. Nature editors. Not-so-deep impact. Nature. 2005;435:1003–4.


  10. The PLoS Medicine editors. The impact factor game. PLoS Med. 2006;3(6):291.


  11. Rossner M, Van Epps H, Hill E. Show me the data. J Cell Biol. 2007;179(6):1091–2.


  12. Rice D, Raffoul H, Ioannidis J, Moher D. Academic criteria for promotion and tenure in biomedical sciences faculties: cross sectional analysis of international sample of universities. BMJ. 2020;369: m2081.


  13. Auerbach CF, Silverstein LB. Qualitative data: an introduction to coding and analysis. New York: New York University Press; 2003.


  14. Sandelowski M. Focus on research methods—whatever happened to qualitative description? Res Nurs Health. 2000;23:334–40.


  15. Tong A, Sainsbury P, Craig J. Consolidated criteria for reporting qualitative research. Int J Qual Health Care. 2007;19:349–57.


  16. Malterud K, Siersma VD, Guassora AD. Sample size in qualitative studies: guided by information power. Qual Health Res. 2016;26:1753–60.


  17. Bindels E, Boerebach B, Scheepers R, et al. Designing a system for performance appraisal: balancing physicians’ accountability and professional development. BMC Health Serv Res. 2021;21:800.


  18. Madlabana CZ, Mashamba-Thompson TP, Petersen I. Performance management methods and practices among nurses in primary health care settings: a systematic scoping review protocol. Syst Rev. 2020;9:40.


  19. Editorial. Research evaluation needs to change with the times. Nature. 2022;601:166.


  20. International network of research management societies. SCOPE framework for research evaluation. https://inorms.net/scope-framework-for-research-evaluation/. Accessed 4 Oct 2022.


Acknowledgements

We thank the University Health Network Vice President of Research for instigating and supporting this initiative.

Funding

This research was undertaken with no source of research funding.

Author information


Contributions

ARG conceptualized the study and led the planning of data collection and analysis. HB assisted with data collection and analysis, and drafting of the manuscript. MA, RHCC, JCLC, RD, MMH, BK, PK MPM, and SP assisted in planning the study, interpreting results, and drafting the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Anna R. Gagliardi.

Ethics declarations

Ethics approval and consent to participate

This research was conducted with approval from the University Health Network Research Ethics Board (REB #22-5082), and all participants provided signed informed consent.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Prioritized DORA-compliant measures.

Additional file 2.

Interview guide.

Additional file 3.

Interview data.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Boury, H., Albert, M., Chen, R.H.C. et al. Exploring the merits of research performance measures that comply with the San Francisco Declaration on Research Assessment and strategies to overcome barriers of adoption: qualitative interviews with administrators and researchers. Health Res Policy Sys 21, 43 (2023). https://doi.org/10.1186/s12961-023-01001-w

