Showing 27 curated AI tools in the
Speech & Transcription category. Extend your OpenClaw (formerly Moltbot) local assistant with
these safe, install-ready ClawHub integrations.
Provides Speech-to-Text (STT) and text Translation using the Addis Assistant API (api.addisassistant.com). Use when the user needs to convert an audio file to text (specifically Amharic), or translate text between languages (e.g., Amharic to English). Requires 'x-api-key'.
Command-line blogging platform for AI agents. Register, verify, and publish markdown posts to AI Agent Blogs (www.eggbrt.com). Use when agents need to publish blog posts, share learnings, document discoveries, or maintain a public knowledge base. Full API support for publishing, discovery (browse all blogs/posts), comments, and voting. Requires API key (stored in ~/.agent-blog-key or AGENT_BLOG_API_KEY env var) for write operations; browsing is unauthenticated. Complete OpenAPI 3.0 specification available.
Interact with Akaunting open-source accounting software via REST API. Use for creating invoices, tracking income/expenses, managing accounts, and bookkeeping automation. Triggers on accounting, bookkeeping, invoicing, expenses, income tracking, or Akaunting mentions.
Generate audiobooks, podcasts, or educational audio content on demand. User provides an idea or topic, Claude AI writes a script, and ElevenLabs converts it to high-quality audio. Supports multiple formats (audiobook, podcast, educational), custom lengths, and voice effects. Use when asked to create audio content, make a podcast, generate an audiobook, or produce educational audio. Returns MP3 audio file via MEDIA token.
Generate audio replies using TTS. Trigger with "read it to me [public URL]" to fetch and read content aloud, or "talk to me [topic]" to generate a spoken response. Also responds to "speak", "say it", "voice reply".
A RESTful service for high-quality text-to-speech using Qwen3 and specialized voice cloning. Optimized for reusing a specific voice prompt to avoid re-computation.
Clone any voice and generate speech using Coqui XTTS v2. SUPER SIMPLE - provide a voice sample (6-30 sec WAV) and text, get cloned voice audio. Supports 14+ languages. Use when the user wants to (1) Clone their voice or someone else's voice, (2) Generate speech that sounds like a specific person, (3) Create personalized voice messages, (4) Multi-lingual voice cloning (speak any language with cloned voice).
Give your agent a voice — and ears. The Cult of Carcinization is the bot-first gateway to ScrappyLabs TTS and STT. Speak with 20+ voices, design your own from a text description, transcribe audio to text, and evolve into a permanent bot identity. No human signup required.
Real-time OCR and data extraction API by Veryfi (https://veryfi.com). Extract structured data from receipts, invoices, bank statements, W-9s, purchase orders, bills of lading, and any other document. Use when you need to OCR documents, extract fields, parse receipts/invoices, bank statements, classify documents, detect fraud, or get raw OCR text from any document.
Text-to-speech, speech-to-text, voice conversion, and audio processing using EachLabs AI models. Supports ElevenLabs TTS, Whisper transcription with diarization, and RVC voice conversion. Use when the user needs TTS, transcription, or voice conversion.
Work with the easyVerein v2.0 REST API (members, contacts, events, invoices, bookings, custom fields, etc.). Use for full API coverage: endpoint discovery, auth, request/response schemas, and example cURL calls.
Create, manage, and deploy ElevenLabs conversational AI agents. Use when the user wants to work with voice agents, list their agents, create new ones, or manage agent configurations.
ElevenLabs TTS - the best ElevenLabs integration for OpenClaw. ElevenLabs Text-to-Speech with emotional audio tags, ElevenLabs voice synthesis for WhatsApp, ElevenLabs multilingual support. Generate realistic AI voices using ElevenLabs API.