Search & Research @matanle51 Updated 2/8/2026

Agentic Paper Digest Skill OpenClaw Plugin & Skill | ClawHub

Looking to integrate Agentic Paper Digest Skill into your AI workflows? This free OpenClaw plugin from ClawHub helps you automate search & research tasks instantly, without having to write custom tools from scratch.

What this skill does

Fetches and summarizes recent arXiv and Hugging Face papers with Agentic Paper Digest. Use when the user wants a paper digest, a JSON feed of recent papers, or to run the arXiv/HF pipeline.

Install

npx clawhub@latest install agentic-paper-digest-skill

Full SKILL.md

Metadata

  • name: agentic-paper-digest-skill
  • description: Fetches and summarizes recent arXiv and Hugging Face papers with Agentic Paper Digest. Use when the user wants a paper digest, a JSON feed of recent papers, or to run the arXiv/HF pipeline.
  • homepage: https://github.com/matanle51/agentic_paper_digest

Agentic Paper Digest

When to use

  • Fetch a recent paper digest from arXiv and Hugging Face.
  • Produce JSON output for downstream agents.
  • Run a local API server when a polling workflow is needed.

Prereqs

  • Python 3 and network access.
  • LLM access via OPENAI_API_KEY or an OpenAI-compatible provider via LITELLM_API_BASE + LITELLM_API_KEY.
  • git is optional for bootstrap; otherwise curl/wget (or Python) is used to download the repo.

Get the code and install

  • Preferred: run the bootstrap helper script. It uses git when available or falls back to a zip download.
bash "{baseDir}/scripts/bootstrap.sh"
  • Override the clone location by setting PROJECT_DIR.
PROJECT_DIR="$HOME/agentic_paper_digest" bash "{baseDir}/scripts/bootstrap.sh"

Run (CLI preferred)

bash "{baseDir}/scripts/run_cli.sh"
  • Pass through CLI flags as needed.
bash "{baseDir}/scripts/run_cli.sh" --window-hours 24 --sources arxiv,hf

Run (API optional)

bash "{baseDir}/scripts/run_api.sh"
  • Trigger runs and read results.
curl -X POST http://127.0.0.1:8000/api/run
curl http://127.0.0.1:8000/api/status
curl http://127.0.0.1:8000/api/papers
  • Stop the API server if needed.
bash "{baseDir}/scripts/stop_api.sh"
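
For agents that drive the API from Python rather than curl, the three calls above could be wrapped roughly as follows. This is a minimal sketch using only the standard library. It assumes the server replies with JSON bodies, which the skill does not state explicitly, and the helper names build_request and call are hypothetical, not part of the project.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # address used in the curl examples


def build_request(path, method="GET", base_url=BASE_URL):
    """Build a request for one of the documented endpoints."""
    return urllib.request.Request(base_url + path, method=method)


def call(path, method="GET", base_url=BASE_URL, timeout=30):
    """Send the request and decode the body (assumption: the body is JSON)."""
    request = build_request(path, method, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with scripts/run_api.sh already running:
#   call("/api/run", method="POST")
#   call("/api/status")
#   call("/api/papers")
```

Keeping request construction separate from sending makes it easy to check the method and URL without a live server.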

Outputs

  • CLI --json prints run_id, seen, kept, window_start, and window_end.
  • Data store: data/papers.sqlite3 (under PROJECT_DIR).
  • API: POST /api/run, GET /api/status, GET /api/papers, GET/POST /api/topics, GET/POST /api/settings.
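
A downstream agent can check the CLI summary before acting on it. The sketch below assumes --json prints a single JSON object and that seen and kept are integer counts; the sample values are hypothetical, and the function names are illustrative only.

```python
import json

# Keys documented for the CLI --json output.
EXPECTED_KEYS = ("run_id", "seen", "kept", "window_start", "window_end")


def parse_run_summary(stdout_text):
    """Parse the CLI output and verify that the documented keys are present."""
    summary = json.loads(stdout_text)
    missing = [key for key in EXPECTED_KEYS if key not in summary]
    if missing:
        raise ValueError(f"run summary is missing keys: {missing}")
    return summary


def keep_ratio(summary):
    """Fraction of seen papers that were kept (assumes both fields are counts)."""
    seen = summary["seen"]
    return summary["kept"] / seen if seen else 0.0


# Hypothetical sample; real values and timestamp formats may differ.
sample = (
    '{"run_id": "example", "seen": 40, "kept": 10, '
    '"window_start": "2026-02-07T00:00:00", "window_end": "2026-02-08T00:00:00"}'
)
summary = parse_run_summary(sample)
print(summary["run_id"], keep_ratio(summary))  # prints: example 0.25
```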

Configuration

Config files live in PROJECT_DIR/config. Environment variables can be set in the shell or via a .env file. The wrappers here auto-load .env from PROJECT_DIR (override with ENV_FILE=/path/to/.env).

Environment (.env or exported vars)

  • OPENAI_API_KEY: required for OpenAI models (litellm reads this).
  • LITELLM_API_BASE, LITELLM_API_KEY: use an OpenAI-compatible proxy/provider.
  • LITELLM_MODEL_RELEVANCE, LITELLM_MODEL_SUMMARY: models for relevance and summarization (summary defaults to relevance model if unset).
  • LITELLM_TEMPERATURE_RELEVANCE, LITELLM_TEMPERATURE_SUMMARY: lower for more deterministic output.
  • LITELLM_MAX_RETRIES: retry count for LLM calls.
  • LITELLM_DROP_PARAMS=1: drop unsupported params to avoid provider errors.
  • WINDOW_HOURS, APP_TZ: recency window and timezone.
  • ARXIV_CATEGORIES: comma-separated categories (default includes cs.CL,cs.AI,cs.LG,stat.ML,cs.CR).
  • ARXIV_API_BASE, HF_API_BASE: override source endpoints if needed.
  • ARXIV_MAX_RESULTS, ARXIV_PAGE_SIZE: arXiv paging limits.
  • MAX_CANDIDATES_PER_SOURCE: cap candidates per source before LLM filtering.
  • FETCH_TIMEOUT_S, REQUEST_TIMEOUT_S: source fetch and per-request timeouts.
  • ENABLE_PDF_TEXT=1: include first-page PDF text in summaries; requires PyMuPDF (pip install pymupdf).
  • DATA_DIR: location for papers.sqlite3.
  • CORS_ORIGINS: comma-separated origins allowed by the API server (UI use).
  • Path overrides: TOPICS_PATH, SETTINGS_PATH, AFFILIATION_BOOSTS_PATH.
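
Before running, the credential variables listed above can be sanity-checked. The following is a minimal sketch, not the project's own validation: the function name is hypothetical, and the rule that LITELLM_API_BASE should come with LITELLM_API_KEY is an assumption drawn from the two being documented as a pair.

```python
import os


def check_llm_credentials(env=None):
    """Return a list of problems with the LLM variables (empty list if none)."""
    env = os.environ if env is None else env
    problems = []
    has_openai = bool(env.get("OPENAI_API_KEY"))
    has_litellm = bool(env.get("LITELLM_API_KEY"))
    # The skill requires at least one of the two keys before running.
    if not (has_openai or has_litellm):
        problems.append("set OPENAI_API_KEY or LITELLM_API_KEY")
    # Assumption: a custom base URL is normally paired with its own key.
    if env.get("LITELLM_API_BASE") and not has_litellm:
        problems.append("LITELLM_API_BASE is set but LITELLM_API_KEY is missing")
    return problems


print(check_llm_credentials({"OPENAI_API_KEY": "sk-placeholder"}))  # prints: []
```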

Config files

  • config/topics.json: list of topics with id, label, description, max_per_topic, and keywords. The relevance classifier must output topic IDs exactly as defined here. max_per_topic also caps results in GET /api/papers when apply_topic_caps=1.
  • config/settings.json: overrides fetch limits (arxiv_max_results, arxiv_page_size, fetch_timeout_s, max_candidates_per_source). Updated via POST /api/settings.
  • config/affiliations.json: list of {pattern, weight} boosts applied by substring match over affiliations. Weights add up and are capped at 1.0. Invalid JSON disables boosts, so keep the file strict JSON (no trailing commas).
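
The affiliation rule described above (substring match, additive weights, cap at 1.0, boosts disabled on invalid JSON) can be illustrated as follows. This is a sketch of the described behavior, not the project's code. Case-insensitive matching and counting each pattern at most once per paper are assumptions, and the organization names are placeholders.

```python
import json


def load_boosts(text):
    """Parse the boosts list; invalid JSON yields no boosts, as described above."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return []


def affiliation_boost(affiliations, boosts, cap=1.0):
    """Sum the weights of patterns found in any affiliation, capped at `cap`."""
    total = 0.0
    for boost in boosts:
        pattern = boost["pattern"].lower()  # assumption: case-insensitive match
        if any(pattern in affiliation.lower() for affiliation in affiliations):
            total += boost["weight"]
    return min(total, cap)


boosts = load_boosts(
    '[{"pattern": "Example University", "weight": 0.5},'
    ' {"pattern": "Example Lab", "weight": 0.75}]'
)
print(affiliation_boost(["Example Lab, Example University"], boosts))  # prints: 1.0
print(load_boosts('[{"pattern": "x", "weight": 0.1},]'))  # prints: [] (trailing comma)
```

The second call shows why the file must stay strict JSON: a single trailing comma makes the whole file unparseable, which silently removes every boost.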

Mandatory workflow (follow step-by-step)

  1. First, you MUST open and read the configuration files in the downloaded GitHub repo (https://github.com/matanle51/agentic_paper_digest):
    • Load config/topics.json, config/settings.json, and config/affiliations.json (if present).
    • Note current topic IDs, caps, and fetch limits before asking the user to change them.
  2. ASK THE USER TO PROVIDE THEIR PREFERENCES ABOUT THE FOLLOWING (HELP THE USER):
    • Topics of interest → update config/topics.json (topics[].id/label/description/keywords, max_per_topic).
      Show current defaults and ask whether to keep or change them.
    • Time window (hours) → set WINDOW_HOURS (or pass --window-hours to the CLI) only if the user cares; otherwise keep the default of 24h.
    • ASK THE USER TO FILL IN THE FOLLOWING PARAMETERS (explain to the user what each one is for): ARXIV_CATEGORIES, ARXIV_MAX_RESULTS, ARXIV_PAGE_SIZE, MAX_CANDIDATES_PER_SOURCE.
      Ask whether to keep defaults and show the current values.
    • Model/provider → set OPENAI_API_KEY or LITELLM_API_KEY (+ LITELLM_API_BASE if proxy), and set LITELLM_MODEL_RELEVANCE/LITELLM_MODEL_SUMMARY.
    • Do NOT ask by default: timezone, quality vs cost, timeouts, PDF text, affiliation biasing, sources list. Use defaults unless the user requests changes.
  3. Confirm workspace path: Ask where to clone/run. Default to PROJECT_DIR="$HOME/agentic_paper_digest" if the user doesn’t care. Never hardcode /Users/... paths.
  4. Bootstrap the repo: Run the bootstrap script (unless the repo already exists and the user says to skip).
  5. Create or verify .env:
    • If .env is missing, create it from .env.example (in the repo), then ask the user to fill keys and any requested preferences.
    • Ensure at least one of OPENAI_API_KEY or LITELLM_API_KEY is set before running.
  6. Apply config changes:
    • Edit JSON files directly (or use POST /api/topics and POST /api/settings if running the API).
  7. Run the pipeline:
    • Prefer scripts/run_cli.sh for one-off JSON output.
    • Use scripts/run_api.sh only if the user explicitly asks for UI/API access or polling.
  8. Report results:
    • If results are sparse, suggest increasing WINDOW_HOURS, ARXIV_MAX_RESULTS, or broadening topics.
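
Step 1 of the workflow (reading the current configuration before proposing changes) could look roughly like this. It is a minimal sketch: the function names are hypothetical, and because the skill describes topics.json both as a list of topics and with a topics[] path, the code accepts either shape as an assumption.

```python
import json
from pathlib import Path

CONFIG_FILES = ("topics.json", "settings.json", "affiliations.json")


def load_config(config_dir):
    """Load whichever of the three config files exist; missing files are skipped."""
    loaded = {}
    for name in CONFIG_FILES:
        path = Path(config_dir) / name
        if path.is_file():
            loaded[name] = json.loads(path.read_text(encoding="utf-8"))
    return loaded


def topic_caps(topics_config):
    """Map each topic id to its max_per_topic (None when the cap is absent).

    Assumption: the file is either a bare list or an object with a "topics" list.
    """
    if isinstance(topics_config, dict):
        topics = topics_config.get("topics", [])
    else:
        topics = topics_config
    return {topic["id"]: topic.get("max_per_topic") for topic in topics}


# Usage, with PROJECT_DIR pointing at the cloned repo:
#   config = load_config(Path(PROJECT_DIR) / "config")
#   print(topic_caps(config["topics.json"]))
```

Showing the user the resulting id-to-cap mapping gives them concrete defaults to keep or change.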

Getting good results

  • Help the user define and keep topics focused and mutually exclusive so the classifier can choose the right IDs.
  • Use a stronger model for summaries than for relevance if quality matters.
  • If using an OpenAI model, default to gpt-5-mini for a good quality/cost tradeoff.
  • Increase WINDOW_HOURS or ARXIV_MAX_RESULTS when results are sparse, or lower them if results are too noisy.
  • Tune ARXIV_CATEGORIES to your research domains.
  • Enable PDF text (ENABLE_PDF_TEXT=1) when abstracts are too thin.
  • Use modest affiliation weights to bias ranking without swamping relevance.
  • BE PROACTIVE AND HELP THE USER TUNE THE SKILL FOR GOOD RESULTS!

Troubleshooting

  • Port 8000 busy: run bash "{baseDir}/scripts/stop_api.sh" or pass --port to the API command.
  • Empty results: increase WINDOW_HOURS or verify the API key in .env.
  • Missing API key errors: export OPENAI_API_KEY or LITELLM_API_KEY in the shell before running.
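
To tell whether port 8000 is actually busy before stopping or restarting the server, a quick probe such as the one below can help. This is a generic standard-library sketch, not something the skill ships; port_in_use is a hypothetical helper name.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


# Usage:
#   if port_in_use(8000):
#       print("Port 8000 is busy; stop the API server or choose another port.")
```
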
Original Repository URL: https://github.com/openclaw/skills/blob/main/skills/matanle51/agentic-paper-digest-skill
Latest commit: https://github.com/openclaw/skills/commit/1fc8f4f1f38f851e7816b31bda78aca1dcc787b0
