Software Modules
End-to-end pipelines composed from software components
Software modules are production-ready, end-to-end pipelines that combine software components into a single, configurable workflow. They eliminate the need to wire components together manually. For the two extraction modules, you provide the input and your field requirements, and the module handles transcription or parsing, schema generation, and structured extraction in one step. The RAG module likewise handles indexing, retrieval, and answer generation behind a single interface.
See also: No-Code Assets for prompt templates and packaged skills.
Use software modules when you want a complete, working pipeline with minimal integration code. Each module is designed to cover the most common GenAI knowledge capture scenarios out of the box.
Overview
| Module | Input | Processing steps | Output |
|---|---|---|---|
| Audio-to-Structured-Data | Audio / Video | Transcribe → Extract | Structured fields from spoken content |
| Document-to-Structured-Data | PDF / DOCX | Parse → Extract | Structured fields from document content |
| RAG-Workflow | PDF files + User query | Parse → Embed → Store → Retrieve → Generate | Contextual answers with citations |
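The processing steps in the table can be read as functions chained in order. The following is a minimal sketch of that idea only, not the modules' actual code: `make_pipeline`, `parse`, and `extract` are hypothetical names, and the step bodies are stubs.

```python
from typing import Any, Callable

Step = Callable[[Any], Any]


def make_pipeline(*steps: Step) -> Step:
    """Return a function that feeds its input through each step in order."""
    def run(value: Any) -> Any:
        for step in steps:
            value = step(value)
        return value
    return run


# Hypothetical stand-ins for real components (assumptions, not the library API).
def parse(path: str) -> str:
    return f"# Parsed content of {path}"


def extract(markdown: str) -> dict:
    return {"source_markdown": markdown, "fields": {}}


# "Parse → Extract" expressed as a two-step pipeline.
document_to_structured = make_pipeline(parse, extract)
result = document_to_structured("invoice.pdf")
print(result["source_markdown"])
```

A module packages this kind of wiring so that the caller only supplies the input and the configuration.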
Audio-to-Structured-Data
This module bridges the gap between spoken language and structured data. It combines the Transcriber and Extractor components into a single pipeline: audio is transcribed to text, and the text is processed against your field requirements to produce validated, structured output. Both the transcript and the extracted fields are returned together. The generated schema is persisted and reused on subsequent runs, making repeated processing fast and cost-efficient.
Example: A site worker records a 90-second voice note describing a water leak in the storage room. The module produces a clean transcript ("At 14:30, a water leak was noticed in storage room B...") and the extracted incident fields: Event type: Water leak · Location: Storage room B · Time: 14:30 · Severity: Yes · Immediate action: Area cordoned off · Equipment affected: Electrical equipment near east wall.
What the module returns:
| Output | Description |
|---|---|
| Raw Transcript | Verbatim speech-to-text output |
| Enhanced Transcript | GPT-refined, readable version |
| Structured Fields | All defined fields extracted and validated |
| Reusable Schema | Pydantic schema saved for future runs |
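To make the four outputs concrete, here is a minimal sketch of how a result object and the transcribe-then-extract flow could look. Everything in it is an assumption for illustration: `AudioModuleResult`, `run_audio_pipeline`, and the stub functions are hypothetical, a plain dict stands in for the Pydantic schema, and running extraction on the enhanced transcript rather than the raw one is a guess.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AudioModuleResult:
    """Hypothetical container mirroring the four outputs in the table above."""
    raw_transcript: str
    enhanced_transcript: str
    structured_fields: dict
    schema: dict


def run_audio_pipeline(
    audio_path: str,
    transcribe: Callable[[str], str],
    enhance: Callable[[str], str],
    extract: Callable[[str, dict], dict],
    schema: dict,
) -> AudioModuleResult:
    raw = transcribe(audio_path)       # speech-to-text
    enhanced = enhance(raw)            # readability pass
    fields = extract(enhanced, schema) # structured extraction (assumed input: enhanced text)
    return AudioModuleResult(raw, enhanced, fields, schema)


# Stubs standing in for the real Transcriber and Extractor components.
def fake_transcribe(path: str) -> str:
    return "at 14 30 a water leak was noticed in storage room b"


def fake_enhance(text: str) -> str:
    return "At 14:30, a water leak was noticed in storage room B."


def fake_extract(text: str, schema: dict) -> dict:
    # Returns canned values, limited to the fields named in the schema.
    canned = {"event_type": "Water leak", "location": "Storage room B", "time": "14:30"}
    return {name: canned.get(name) for name in schema}


schema = {"event_type": "str", "location": "str", "time": "str"}
result = run_audio_pipeline("note.wav", fake_transcribe, fake_enhance, fake_extract, schema)
print(result.structured_fields)
```

Returning the transcripts alongside the fields is what lets a reviewer trace each extracted value back to the spoken source.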
Key features:
- Single entry point for voice-to-data workflows
- Plain-language field requirements — no code or schema definition needed
- Returns both transcript and extracted data for full traceability
- Schema reuse eliminates redundant processing on repeated runs
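Schema reuse can be pictured as a small on-disk cache keyed by the field requirements. The sketch below is one possible way to do this, not the module's actual persistence mechanism: `load_or_generate_schema`, the file naming, and the JSON format are assumptions, and `fake_generate` stands in for the (presumably costly) schema generation call.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_or_generate_schema(
    requirements: str,
    cache_dir: Path,
    generate: Callable[[str], dict],
) -> tuple[dict, bool]:
    """Return (schema, was_cached). Generates and saves the schema only on a cache miss."""
    key = hashlib.sha256(requirements.encode("utf-8")).hexdigest()[:16]
    path = cache_dir / f"schema_{key}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8")), True
    schema = generate(requirements)
    path.write_text(json.dumps(schema, indent=2), encoding="utf-8")
    return schema, False


calls = 0


def fake_generate(requirements: str) -> dict:
    # Stand-in for schema generation; counts how often it is invoked.
    global calls
    calls += 1
    return {"fields": [line.strip() for line in requirements.splitlines() if line.strip()]}


requirements = "Event type\nLocation\nTime"
with tempfile.TemporaryDirectory() as tmp:
    first_schema, first_hit = load_or_generate_schema(requirements, Path(tmp), fake_generate)
    second_schema, second_hit = load_or_generate_schema(requirements, Path(tmp), fake_generate)

print(first_hit, second_hit, calls)
```

Hashing the requirements text means any change to the field list produces a new cache entry, so a stale schema is never silently reused.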
Potential applications:
- Workplace incident and safety observation reporting
- Construction site and field work diary entries
- Field service and maintenance activity logging
- Quality inspection and audit recordings
- Medical and clinical dictation capture
- Sales call notes and customer interaction summaries
- Meeting minutes and action item extraction

- Implementation code: software_modules/audio_to_structured_data
- Usage examples: examples/audio_to_structured_data
Document-to-Structured-Data
This module turns static documents into structured, queryable data. It combines the Document Parser and Extractor components: a document is first converted to clean markdown, then processed against your field requirements to extract exactly the data you need. The module supports both vision-based and local parsing strategies, making it adaptable from clean digital PDFs to visually complex, multi-column layouts.
Example: A scanned 8-page supplier invoice PDF with embedded tables is parsed into clean markdown. The module extracts the required fields: Invoice number: INV-20241103 · Supplier: Acme Components Ltd · Total (excl. VAT): €4,820.00 · Due date: 30.11.2024 · Line items: 3.
What the module returns:
| Output | Description |
|---|---|
| Parsed Markdown | Document content converted to clean, structured markdown |
| Structured Fields | All defined fields extracted and validated |
| Reusable Schema | Pydantic schema saved for future runs |
Key features:
- Configurable parser choice — vision-based for complex layouts, local for speed and cost efficiency
- Plain-language field requirements eliminate manual schema work
- Returns both parsed text and extracted data for full traceability
- Schema reuse supports efficient batch processing across large document sets
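The configurable parser choice could be modeled as a lookup from a strategy name to a parser function. This is a sketch under assumptions: the strategy names `"local"` and `"vision"`, the `PARSERS` registry, and `parse_document` are hypothetical, and both parsers are stubs that only label their output.

```python
from typing import Callable, Dict


def parse_local(path: str) -> str:
    # Stand-in for a fast, local parser suited to clean digital PDFs.
    return f"[local] markdown for {path}"


def parse_vision(path: str) -> str:
    # Stand-in for a vision-based parser suited to complex layouts.
    return f"[vision] markdown for {path}"


PARSERS: Dict[str, Callable[[str], str]] = {
    "local": parse_local,
    "vision": parse_vision,
}


def parse_document(path: str, strategy: str = "local") -> str:
    """Dispatch to the parser registered under the given strategy name."""
    try:
        parser = PARSERS[strategy]
    except KeyError:
        raise ValueError(
            f"Unknown strategy {strategy!r}; expected one of {sorted(PARSERS)}"
        ) from None
    return parser(path)


print(parse_document("invoice.pdf"))
print(parse_document("scanned_invoice.pdf", strategy="vision"))
```

Defaulting to the cheaper local parser and opting into the vision parser per document keeps batch runs inexpensive while still covering difficult layouts.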
Potential applications:
- Invoice and purchase order data capture from PDFs
- Contract and agreement key term extraction
- Technical specification and datasheet processing
- Research report and regulatory filing analysis
- CV and job application data extraction
- Insurance and financial form digitization
- Product catalogue and pricing data ingestion

- Implementation code: software_modules/documents_to_structured_data
- Usage examples: examples/documents_to_structured_data
RAG-Workflow
This module enables question-answering over document collections. It combines all five RAG components into a complete workflow with two phases: an indexing phase that parses, embeds, and stores documents, and a query phase that retrieves relevant context and generates answers with citations. The module maintains a persistent vector store and optional conversation history, making it suitable for interactive document exploration and knowledge retrieval applications.
Example: A team indexes 50 technical specification PDFs. Later, a user asks: "What is the maximum operating temperature for component X?" The module retrieves relevant chunks and responds: "The maximum operating temperature for component X is 85°C [Source: TechSpec_X_v2.pdf, page 4]. For extended duty cycles, derate to 70°C [Source: OperatingGuide.pdf, page 12]."
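The two phases can be illustrated with a toy, dependency-free version. This sketch makes several simplifying assumptions: word counts replace real embeddings, an in-memory list replaces the persistent vector store, and the generation step is reduced to quoting the retrieved chunk with its citation. `ToyIndex`, `embed`, `cosine`, and `answer` are hypothetical names, and the third chunk (`StorageNotes.pdf`) is invented filler.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy "embedding": bag of lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyIndex:
    def __init__(self) -> None:
        self.entries = []  # (vector, text, source, page)

    def index(self, chunks) -> int:
        """Indexing phase: embed and store each (text, source, page) chunk."""
        for text, source, page in chunks:
            self.entries.append((embed(text), text, source, page))
        return len(self.entries)

    def retrieve(self, query: str, top_k: int = 2):
        """Query phase: rank stored chunks by similarity to the query."""
        q = embed(query)
        scored = [(cosine(q, vec), text, source, page)
                  for vec, text, source, page in self.entries]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]


def answer(index: ToyIndex, query: str, top_k: int = 1) -> str:
    # Placeholder for generation: quote retrieved chunks with citations.
    hits = index.retrieve(query, top_k)
    return " ".join(f"{text} [Source: {source}, page {page}]"
                    for _, text, source, page in hits)


idx = ToyIndex()
indexed = idx.index([
    ("The maximum operating temperature for component X is 85°C.", "TechSpec_X_v2.pdf", 4),
    ("For extended duty cycles, derate to 70°C.", "OperatingGuide.pdf", 12),
    ("Store the unit in a dry place.", "StorageNotes.pdf", 1),
])
reply = answer(idx, "What is the maximum operating temperature for component X?")
print(reply)
```

Because the index is built once and then only read, any number of queries can be served without re-parsing or re-embedding the documents.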
What the module returns:
| Output | Description |
|---|---|
| Index Result | Number of documents and chunks indexed, vector store path |
| Answer | Generated response with inline citations |
| Retrieved Documents | Top-k relevant chunks used for answer generation |
| Conversation History | Optional: Last n Q/A pairs for context-aware responses |
Key features:
- Two-phase workflow: index once, query many times
- Persistent vector store — indexed data reused across sessions
- Configurable retrieval: top-k, score threshold, hybrid search, reranking
- Citation support — every answer references source documents and pages
- Optional conversation history for multi-turn dialogue
- Streaming response support for real-time UX
- PostgreSQL / pgvector support available at the component level for custom pipelines
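Two of the features above, top-k with a score threshold and a bounded conversation history, are simple enough to sketch directly. The names `select_chunks` and `ConversationHistory`, the parameter names, and the `(score, chunk)` tuple shape are assumptions made for illustration; hybrid search, reranking, and streaming are not shown.

```python
from collections import deque


def select_chunks(scored, top_k: int = 3, score_threshold: float = 0.0):
    """Keep chunks scoring at or above the threshold, best first, at most top_k."""
    kept = [item for item in scored if item[0] >= score_threshold]
    kept.sort(key=lambda item: item[0], reverse=True)
    return kept[:top_k]


class ConversationHistory:
    """Holds only the last n question/answer pairs."""

    def __init__(self, max_pairs: int = 3) -> None:
        self._pairs = deque(maxlen=max_pairs)  # oldest pair is dropped automatically

    def add(self, question: str, answer: str) -> None:
        self._pairs.append((question, answer))

    def as_context(self) -> str:
        return "\n".join(f"Q: {q}\nA: {a}" for q, a in self._pairs)

    def __len__(self) -> int:
        return len(self._pairs)


scored = [(0.9, "chunk a"), (0.2, "chunk b"), (0.5, "chunk c")]
selected = select_chunks(scored, top_k=2, score_threshold=0.3)

history = ConversationHistory(max_pairs=2)
history.add("q1", "a1")
history.add("q2", "a2")
history.add("q3", "a3")
print(selected)
print(history.as_context())
```

Applying the threshold before the top-k cut means a query with no sufficiently relevant chunks returns an empty context instead of padding the prompt with weak matches.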
Potential applications:
- Internal knowledge base search for technical documentation
- Compliance and policy Q&A over regulatory documents
- Research paper exploration and literature review assistance
- Customer support knowledge retrieval from product manuals
- Legal contract and case law research
- Medical guideline and protocol reference systems
- Educational content Q&A and study assistance

- Implementation code: software_modules/RAG_workflow
- Usage examples: examples/RAG_workflow