| Title: | Easy Interface to Search 'SciELO' Database |
|---|---|
| Description: | Provides a simple interface to search and retrieve scientific articles from the 'SciELO' (Scientific Electronic Library Online) database <https://scielo.org>. It allows querying, filtering, and visualizing results in an interactive table. |
| Authors: | Pablo Ixcamparij [aut, cre], Keneth Masis Leandro [aut] |
| Maintainer: | Pablo Ixcamparij <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.1 |
| Built: | 2026-05-31 08:42:00 UTC |
| Source: | https://github.com/programa-isa/easyscielopack |

This is the core function that performs the web scraping and data extraction. It handles pagination and combines results into a single data frame.

    fetch_scielo_results(query_obj)

| Argument | Description |
|---|---|
| query_obj | A |

Returns a data.frame containing all fetched articles.
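
The pagination-and-combine behaviour described above can be pictured with a minimal base R sketch. It is not the package's implementation: `fetch_page()` is a hypothetical stand-in that fabricates rows locally instead of scraping SciELO, and `fetch_all_pages()`, the page size, and the column names are assumptions made for illustration only.

```r
# Hypothetical stand-in for a single page request (no network access).
# Returns up to `page_size` rows, or zero rows once `total` is exhausted.
fetch_page <- function(page, page_size = 15, total = 40) {
  first <- (page - 1) * page_size + 1
  if (first > total) {
    return(data.frame(id = numeric(0), title = character(0)))
  }
  ids <- seq(first, min(first + page_size - 1, total))
  data.frame(id = ids, title = paste("Article", ids))
}

# Request pages until an empty one comes back, then bind all pages together.
fetch_all_pages <- function(fetch_fun, max_pages = 100) {
  pages <- list()
  for (page in seq_len(max_pages)) {
    chunk <- fetch_fun(page)
    if (nrow(chunk) == 0) break
    pages[[length(pages) + 1]] <- chunk
  }
  if (length(pages) == 0) return(data.frame())
  do.call(rbind, pages)
}

results <- fetch_all_pages(fetch_page)
nrow(results)
```

Collecting the pages in a list and binding them once at the end avoids growing a data frame inside the loop, which is the usual base R idiom for this pattern.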

Accepts a single subject category and validates it.

    normalize_categories(category)

| Argument | Description |
|---|---|
| category | A character vector of length 1. Subject category to filter by (e.g., "environmental sciences"). |

Returns a cleaned category string if valid.

Converts country names or ISO codes into valid SciELO collection codes. Only one value is allowed.

    normalize_collections(collections)

| Argument | Description |
|---|---|
| collections | A character vector of length 1: a country name (e.g., "Costa Rica") or a valid SciELO ISO code (e.g., "cri"). |

Returns a character string representing the normalized SciELO collection code.

    normalize_collections("Costa Rica") # returns "cri"
    normalize_collections("cri") # returns "cri"
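
A name-to-code normalization like this one can be sketched with a named character vector as the lookup table. The sketch below is an assumption about how such a mapping might work, not the package's code: `normalize_collection_sketch()` and `collection_codes` are hypothetical names, the table holds only the two codes mentioned in this manual, and case-insensitive matching is assumed.

```r
# Hypothetical lookup table: only the two codes that appear in this manual.
collection_codes <- c("costa rica" = "cri", "mexico" = "mex")

normalize_collection_sketch <- function(collections) {
  # Enforce the "only one value is allowed" rule.
  if (!is.character(collections) || length(collections) != 1 ||
      is.na(collections)) {
    stop("`collections` must be a single character value.")
  }
  key <- tolower(trimws(collections))
  if (key %in% collection_codes) {
    return(key)  # already a known collection code
  }
  if (key %in% names(collection_codes)) {
    return(unname(collection_codes[key]))  # country name mapped to its code
  }
  stop("Unknown collection: ", collections)
}

normalize_collection_sketch("Costa Rica")
normalize_collection_sketch("cri")
```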

Accepts a single journal name and validates it.

    normalize_journals(journal)

| Argument | Description |
|---|---|
| journal | A character vector of length 1. Journal name to filter by (e.g., "Revista Ambiente & Água"). |

Returns a cleaned journal name if valid.

Accepts a single language code ("es", "pt", "en") and validates it.

    normalize_languages(lang_code)

| Argument | Description |
|---|---|
| lang_code | A character vector of length 1. Language code to filter by. |

Returns a normalized (lowercase) language code if valid.

    normalize_languages("EN") # returns "en"
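
The validation described for language codes (a single value, lowercased, restricted to "es", "pt", or "en") could look roughly like the following. This is a sketch under stated assumptions: `normalize_language_sketch()` is a hypothetical name, and the error messages and whitespace trimming are illustrative rather than taken from the package.

```r
normalize_language_sketch <- function(lang_code) {
  allowed <- c("es", "pt", "en")  # codes listed in this manual
  if (!is.character(lang_code) || length(lang_code) != 1 ||
      is.na(lang_code)) {
    stop("`lang_code` must be a single character value.")
  }
  code <- tolower(trimws(lang_code))
  if (!(code %in% allowed)) {
    stop("Unsupported language code: ", lang_code)
  }
  code
}

normalize_language_sketch("EN")
```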

Executes a search in the SciELO database using multiple optional filters, and returns the results as a data frame.

    search_scielo(
      query,
      lang = "en",
      lang_operator = "AND",
      n_max = NULL,
      journals = NULL,
      collections = NULL,
      languages = NULL,
      categories = NULL,
      year_start = NULL,
      year_end = NULL
    )

| Argument | Description |
|---|---|
| query | Search term (e.g., "climate change"). Required. |
| lang | Interface language for the SciELO website ("en", "es", "pt"). Default is "en". |
| lang_operator | Operator for combining language filters ("AND" or "OR"). Default is "AND". |
| n_max | Maximum number of results to return. Optional. |
| journals | Vector of journal names to filter by. Only one value is supported. Optional. |
| collections | A character string for filtering by SciELO collection (country name or ISO code, e.g., "Mexico" or "mex"). |
| languages | Vector of article languages to filter by (e.g., "en"). |
| categories | Vector of subject categories to filter by (e.g., "ecology"). |
| year_start | Start year for filtering articles. Optional. |
| year_end | End year for filtering articles. Optional. |

Note: Only one value per filter category is currently supported (e.g., only one language).

Returns a data.frame with the search results.

    # Simple search with a keyword
    df1 <- search_scielo("salud ambiental")

    # Limit number of results to 5
    df2 <- search_scielo("salud ambiental", n_max = 5)

    # Filter by SciELO collection (country name or code)
    df3 <- search_scielo("salud ambiental", collections = "Ecuador")
    df4 <- search_scielo("salud ambiental", collections = "cri") # Costa Rica by ISO code

    # Filter by article language
    df5 <- search_scielo("salud ambiental", languages = "es")

    # Filter by a specific journal
    df6 <- search_scielo("salud ambiental", journals = "Revista Ambiente & Agua")

    # Filter by subject category
    df7 <- search_scielo("salud ambiental", categories = "environmental sciences")

    # Filter by year range
    df8 <- search_scielo("salud ambiental", year_start = 2015, year_end = 2020)
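
Because only one value per filter is supported, a caller may want to check filter lengths before sending a request. The helper below is a hypothetical pre-flight check written for illustration; `check_single_filters()` is not part of the package, and how `search_scielo()` itself reacts to longer vectors is not documented here.

```r
# Hypothetical guard: fail early if any filter carries more than one value.
check_single_filters <- function(journals = NULL, collections = NULL,
                                 languages = NULL, categories = NULL) {
  filters <- list(journals = journals, collections = collections,
                  languages = languages, categories = categories)
  # NULL has length 0, so unused filters pass the check.
  too_long <- names(filters)[vapply(filters, length, integer(1)) > 1]
  if (length(too_long) > 0) {
    stop("Only one value is supported for: ",
         paste(too_long, collapse = ", "))
  }
  invisible(TRUE)
}

check_single_filters(languages = "es", collections = "cri")
```

Calling it with `languages = c("es", "en")` stops with an error that names the offending filter, which keeps the problem visible before any request is made.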

Ensures that start and end years are valid numeric values and in correct order.

    years(start_year, end_year)

| Argument | Description |
|---|---|
| start_year | Integer. Start year for filtering (inclusive). |
| end_year | Integer. End year for filtering (inclusive). |

Returns a list with named elements year_start and year_end.

    valid_years <- years(2018, 2022)
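
The two checks named above (valid numeric values, correct order) can be sketched in a few lines of base R. `validate_years_sketch()` is a hypothetical name; requiring whole numbers, rejecting missing values, and returning integers are assumptions, and the sketch does not handle NULL inputs.

```r
validate_years_sketch <- function(start_year, end_year) {
  # A year is assumed to be a single, non-missing whole number.
  is_year <- function(x) {
    is.numeric(x) && length(x) == 1 && !is.na(x) && x == round(x)
  }
  if (!is_year(start_year) || !is_year(end_year)) {
    stop("Both years must be single whole numbers.")
  }
  if (start_year > end_year) {
    stop("`start_year` must not be later than `end_year`.")
  }
  list(year_start = as.integer(start_year), year_end = as.integer(end_year))
}

valid <- validate_years_sketch(2018, 2022)
str(valid)
```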