Focus list and filtering it

From ScienceSource
Revision as of 11:15, 8 November 2018 by Charles Matthews (talk | contribs) (expand)

The ScienceSource focus list is set up on Wikidata, using property P5008. To read about it, go to WD:SSFL. To access the list itself, use the SPARQL query on the talk page of that Wikidata page.
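As a rough illustration of what such a query could look like, the sketch below builds a SPARQL query string that selects items carrying a P5008 statement pointing at a given project item, plus a request URL for the public Wikidata Query Service endpoint. The function names and the `project_qid` parameter are hypothetical, and the query shown is an assumption rather than the one on the talk page; the real project item identifier should be taken from WD:SSFL.

```python
from urllib.parse import urlencode

# Public endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_focus_list_query(project_qid: str, limit: int = 100) -> str:
    """Build a SPARQL query listing items whose P5008 value is project_qid.

    project_qid is a placeholder: supply the Wikidata item that stands for
    the project (see WD:SSFL), not a value hard-coded here.
    """
    if not (project_qid.startswith("Q") and project_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item identifier: {project_qid!r}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT ?article ?articleLabel WHERE {\n"
        f"  ?article wdt:P5008 wd:{project_qid} .\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }\n'
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(query: str) -> str:
    """Return a GET URL asking the query service for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})
```

The query is kept separate from the HTTP request so that it can be pasted into the query service's web interface or sent by any client.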

The list is the setting for a "concentric" overview of the project as a whole.

The role of the focus list is to provide the "outer ring" of biomedical articles the project wishes to consider, on Wikidata. Out of 18 million possible articles, it will choose up to about 50,000. Anyone can add to it.

To look at options under consideration for scaling up the focus list, go to Additions to Wikidata#Additions to focus list.

The second ring is articles chosen for download here. These must be open access articles, but also, and fundamentally, licensed to allow use of the text here. See Help:Licences. The choice will be made by a series of filters applied to the focus list. An important idea is that the downloaded articles should be representative, rather than reflecting the systematic bias of the medical literature as a whole. "Neglected diseases" should be well covered.
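The idea of a series of filters can be sketched as a chain of predicates applied to focus list records, followed by a per-topic cap as one crude way of keeping the selection from being dominated by heavily published topics. Everything here is an assumption made for illustration: the record fields (`open_access`, `licence`, `topic`), the set of acceptable licences, and the cap itself are hypothetical and do not describe the project's actual filters (for the licence rules, see Help:Licences).

```python
from typing import Callable, Iterable

Article = dict                      # hypothetical record, one per focus list item
Filter = Callable[[Article], bool]  # a filter keeps an article when it returns True

# Assumption: an illustrative set of licences, not the project's actual list.
ALLOWED_LICENCES = {"CC0", "CC BY", "CC BY-SA"}


def is_open_access(article: Article) -> bool:
    return bool(article.get("open_access"))


def has_reusable_licence(article: Article) -> bool:
    return article.get("licence") in ALLOWED_LICENCES


def apply_filters(articles: Iterable[Article], filters: list[Filter]) -> list[Article]:
    """Keep only the articles that pass every filter, preserving input order."""
    return [a for a in articles if all(f(a) for f in filters)]


def cap_per_topic(articles: Iterable[Article], max_per_topic: int) -> list[Article]:
    """Keep at most max_per_topic articles for each topic, in input order."""
    counts: dict[str, int] = {}
    kept = []
    for a in articles:
        topic = a.get("topic", "unknown")
        if counts.get(topic, 0) < max_per_topic:
            kept.append(a)
            counts[topic] = counts.get(topic, 0) + 1
    return kept


def select_for_download(articles: Iterable[Article], max_per_topic: int) -> list[Article]:
    passed = apply_filters(articles, [is_open_access, has_reusable_licence])
    return cap_per_topic(passed, max_per_topic)
```

Because each filter is an independent function, filters can be added, removed, or reordered without touching the rest of the pipeline.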

The third ring is of articles satisfying MEDRS: see Help:MEDRS. This inner selection is the ultimate goal: the main point of the project is to define it by using machine-readable data. How well will it match the decisions taken daily on Wikipedia about acceptable referencing of health information? The algorithm here can be adjusted to accommodate both general criteria and special cases. Once real data from papers is combined with a prototype algorithm, we will understand more about the finer points of MEDRS. Where the guideline refers to judgement calls, it will not necessarily be possible to go behind those decisions to underlying data, but documenting edge cases in this area will have its own value.
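One way to picture an algorithm that accommodates both general criteria and special cases is a rule with adjustable thresholds plus a table of per-article overrides that records edge cases. The sketch below is purely illustrative: the `Paper` fields, the two criteria (review article, maximum age), and the default threshold are assumptions chosen for the example, not the project's algorithm and not a statement of what MEDRS requires.

```python
from dataclasses import dataclass, field


@dataclass
class Paper:
    qid: str         # hypothetical Wikidata item identifier of the paper
    is_review: bool  # assumption: whether the paper is a review article
    year: int        # publication year


@dataclass
class Classifier:
    current_year: int
    max_age_years: int = 5  # adjustable general criterion (illustrative value)
    # Special cases: explicit decisions keyed by paper identifier.
    overrides: dict[str, bool] = field(default_factory=dict)

    def accepts(self, paper: Paper) -> bool:
        # A documented special case takes precedence over the general rule.
        if paper.qid in self.overrides:
            return self.overrides[paper.qid]
        age = self.current_year - paper.year
        return paper.is_review and age <= self.max_age_years
```

Keeping the overrides in a separate table means each edge case is recorded explicitly, which fits the aim of documenting them.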

Summary: The concentric overview means that the criteria for inclusion or downloading need not be strict. What is needed is a good set of articles for development and testing of the MEDRS criteria in the context of automated processing.