Description
The following metadata fields can be extracted from a DESCRIPTION file.
These fields are defined in the DESCRIPTION file specification for R packages and are mapped according to the CodeMeta crosswalk for R package DESCRIPTION files.
| Software metadata category | SOMEF metadata JSON path | DESCRIPTION metadata file field |
|---|---|---|
| authors | authors[i].result.value | Authors (1) |
| authors | authors[i].result.email | Authors (2) |
| code_repository | code_repository[i].result.value | URL (3) |
| description | description[i].result.value | Description (4) |
| has_package_file | has_package_file[i].result.value | URL of the DESCRIPTION file |
| homepage | homepage[i].result.value | URL (3) |
| issue_tracker | issue_tracker[i].result.value | BugReports (5) |
| license | license[i].result.value | License (6) |
| package_id | package_id[i].result.value | Package (7) |
| version | version[i].result.value | Version (8) |
(1), (2)
- Regex 1: r'Authors@R:\s*c\(([\s\S]*?)\)\s*$' → group[1]
- Regex 2: find all persons in group[1] and extract the first name (or organization), last name, and email of each
- Example:
Authors@R: c(
person("Hadley", "Wickham", , "hadley@posit.co", role = "aut",
comment = c(ORCID = "0000-0003-4757-117X")),
person("Winston", "Chang", role = "aut",
comment = c(ORCID = "0000-0002-1576-2126"))
)
- Result:
{'result': {'value': 'Hadley Wickham', 'type': 'Agent', 'email': 'hadley@posit.co'}, 'confidence': 1, 'technique': 'code_parser', 'source': 'https://example.org/DESCRIPTION'}, {'result': {'value': 'Winston Chang', 'type': 'Agent'}, 'confidence': 1, 'technique': 'code_parser', 'source': 'https://example.org/DESCRIPTION'}
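
The two-step author extraction above can be sketched as follows. This is a minimal illustration, not the SOMEF implementation: the `re.MULTILINE` flag, the helper name `parse_authors`, and the way each `person(...)` call is split into positional and named arguments are all assumptions made for the example.

```python
import re

# Regex 1 from the text; re.MULTILINE is an assumption so that `$` can
# match at the end of the line that closes the c( ... ) block.
AUTHORS_BLOCK = re.compile(r'Authors@R:\s*c\(([\s\S]*?)\)\s*$', re.MULTILINE)
QUOTED = re.compile(r'"([^"]*)"')      # any double-quoted R string
NAMED_ARG = re.compile(r'\b\w+\s*=')   # first named argument, e.g. role =


def parse_authors(description_text):
    """Hypothetical helper: return one result dict per person() entry."""
    match = AUTHORS_BLOCK.search(description_text)
    if match is None:
        return []
    authors = []
    # Everything before the first 'person(' is whitespace, so skip it.
    for chunk in match.group(1).split('person(')[1:]:
        named = NAMED_ARG.search(chunk)
        positional = chunk[:named.start()] if named else chunk
        # Positional strings without '@' are treated as given/family name.
        names = [s for s in QUOTED.findall(positional) if s and '@' not in s][:2]
        emails = [s for s in QUOTED.findall(chunk) if '@' in s]
        if not names:
            continue
        result = {'value': ' '.join(names), 'type': 'Agent'}
        if emails:
            result['email'] = emails[0]
        authors.append(result)
    return authors


example = '''Authors@R: c(
    person("Hadley", "Wickham", , "hadley@posit.co", role = "aut",
           comment = c(ORCID = "0000-0003-4757-117X")),
    person("Winston", "Chang", role = "aut",
           comment = c(ORCID = "0000-0002-1576-2126"))
  )
'''

for author in parse_authors(example):
    print(author)
```

The sketch only builds the inner `result` object; the `confidence`, `technique`, and `source` keys shown in the result above would be added by the caller.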
(3)
- Regex: 'URL:\s*([^\n]+(?:\n\s+[^\n]+)*)'
- if github.com or gitlab.com --> code_repository
- if not --> homepage
- Example:
URL: https://ggplot2.tidyverse.org,
https://github.com/tidyverse/ggplot2
- Result code_repository:
'result': {'value': 'https://github.com/tidyverse/ggplot2', 'type': 'Url'}
- Result homepage:
'result': {'value': 'https://ggplot2.tidyverse.org', 'type': 'Url'}
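
As a rough sketch of the URL rule above, the snippet below applies the regex and then sorts each URL into `code_repository` or `homepage`. The function name `classify_urls`, the comma/whitespace splitting, and the substring check for `github.com` or `gitlab.com` are assumptions for illustration; the actual parser may split and match differently.

```python
import re

# Regex from the text: the URL field plus any indented continuation lines.
URL_FIELD = re.compile(r'URL:\s*([^\n]+(?:\n\s+[^\n]+)*)')


def classify_urls(description_text):
    """Hypothetical helper: split the URL field and classify each entry."""
    result = {'code_repository': [], 'homepage': []}
    match = URL_FIELD.search(description_text)
    if match is None:
        return result
    # Assumption: URLs are separated by commas and/or whitespace.
    for url in re.split(r'[,\s]+', match.group(1).strip()):
        if not url:
            continue
        is_repo = 'github.com' in url or 'gitlab.com' in url
        key = 'code_repository' if is_repo else 'homepage'
        result[key].append({'value': url, 'type': 'Url'})
    return result


example = (
    "URL: https://ggplot2.tidyverse.org,\n"
    "    https://github.com/tidyverse/ggplot2\n"
    "BugReports: https://github.com/tidyverse/ggplot2/issues\n"
)

print(classify_urls(example))
```

Because the continuation part of the regex requires leading whitespace, the unindented `BugReports:` line is not captured as part of the URL field.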
(4)
- Regex: r'Description:\s*([^\n]+(?:\n\s+[^\n]+)*)'
- Example:
```
Description: A system for 'declaratively' creating graphics, based on "The
    Grammar of Graphics". You provide the data, tell 'ggplot2' how to map
    variables to aesthetics, what graphical primitives to use, and it
    takes care of the details.
```
- Result:
```
A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.
```
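
The multi-line Description field can be handled along these lines. This is a sketch under assumptions: the helper name `extract_description` is hypothetical, and joining the continuation lines by collapsing all whitespace to single spaces is one plausible way to arrive at the single-line result shown above.

```python
import re

# Regex from the text: first line of the field plus indented continuations.
DESCRIPTION_FIELD = re.compile(r'Description:\s*([^\n]+(?:\n\s+[^\n]+)*)')


def extract_description(description_text):
    """Hypothetical helper: return the Description field as one line."""
    match = DESCRIPTION_FIELD.search(description_text)
    if match is None:
        return None
    # Assumption: line breaks and indentation collapse to single spaces.
    return ' '.join(match.group(1).split())


example = """Description: A system for 'declaratively' creating graphics, based on "The
    Grammar of Graphics". You provide the data, tell 'ggplot2' how to map
    variables to aesthetics, what graphical primitives to use, and it
    takes care of the details.
License: MIT + file LICENSE
"""

print(extract_description(example))
```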
(5)
- Regex: 'BugReports:\s*([^\n]+)'
- Example: BugReports: https://github.com/tidyverse/ggplot2/issues
- Result: https://github.com/tidyverse/ggplot2/issues
(6)
- Regex: r'License:\s*([^\n]+)'
- Example: License: MIT + file LICENSE
- Result: MIT + file LICENSE
(7)
- Regex: r'Package:\s*([^\n]+)'
- Example: Package: ggplot2
- Result: ggplot2
(8)
- Regex: r'Version:\s*([^\n]+)'
- Example: Version: 2.0.0.9000
- Result: 2.0.0.9000
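
The single-line fields (BugReports, License, Package, Version) all follow the same pattern, so they can be sketched together. The mapping name `SINGLE_LINE_FIELDS` and the function `extract_single_line_fields` are hypothetical, and the keys reuse the SOMEF category names from the table purely for illustration.

```python
import re

# Regexes from the text, keyed by the metadata category they populate.
SINGLE_LINE_FIELDS = {
    'issue_tracker': r'BugReports:\s*([^\n]+)',
    'license': r'License:\s*([^\n]+)',
    'package_id': r'Package:\s*([^\n]+)',
    'version': r'Version:\s*([^\n]+)',
}


def extract_single_line_fields(description_text):
    """Hypothetical helper: return the value of each field that is present."""
    found = {}
    for category, pattern in SINGLE_LINE_FIELDS.items():
        match = re.search(pattern, description_text)
        if match:
            found[category] = match.group(1).strip()
    return found


example = (
    "Package: ggplot2\n"
    "Version: 2.0.0.9000\n"
    "License: MIT + file LICENSE\n"
    "BugReports: https://github.com/tidyverse/ggplot2/issues\n"
)

print(extract_single_line_fields(example))
```

Note that `\s*` also matches newlines, so a field with an empty value would pick up the following line; a stricter variant could use `[ \t]*` instead.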