Knowledge Discovery: Solution Description and High Level Architecture
Overview
AI Knowledge is based on Retrieval-Augmented Generation (RAG), an advanced AI approach that enhances large language models (LLMs) by dynamically retrieving information from specified sources in real time. Unlike standard AI models, which rely solely on previously trained data, RAG dynamically searches for relevant documents from designated knowledge bases before generating a response. This ensures that AI-generated answers are not only contextually accurate but also up-to-date, making this technology particularly useful in IT Service Management (ITSM) and corporate environments where knowledge is constantly evolving.
Matrix42 offers flexible deployment options for AI-powered solutions. When using the CAI (Conversational AI) platform with locally hosted Natural Language Processing (NLP) and Large Language Model (LLM) components — fully owned and operated by Matrix42 — no external providers are involved at any stage of data processing. All data remains within the customer’s infrastructure or approved Matrix42-hosted environments (e.g., within the EU), ensuring full control over infrastructure, data flow, and compliance.
Alternatively, customers may choose to integrate external LLM providers, such as Azure OpenAI Service or OpenAI. In such cases, user input and context required for generating a response are transmitted to the infrastructure of the selected provider and processed in accordance with their regional hosting and data protection policies.
Matrix42 GenAI can power AI Knowledge by leveraging Retrieval-Augmented Generation (RAG). This allows end users to receive accurate, context-aware answers based on the organization's internal knowledge, such as:
- IT service documentation.
- Knowledge bases.
- Company policies.
- Internal repositories (wikis, PDFs, Confluence, and structured data sources).
The model ensures that only authorized and relevant information is retrieved and shared, enhancing self-service capabilities while reducing the workload on IT support teams.
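The authorization gate described above can be illustrated with a minimal sketch. The data model, group names, and matching logic are all hypothetical; the point is only that permission filtering happens before any content reaches the model.

```python
# Minimal sketch of authorization-aware retrieval (hypothetical data model).
from dataclasses import dataclass, field

@dataclass
class Document:
    title: str
    content: str
    allowed_groups: set = field(default_factory=set)  # groups permitted to read it

def retrieve(query: str, docs: list, user_groups: set) -> list:
    """Return only documents the user is authorized to see that match the query."""
    words = set(query.lower().split())
    results = []
    for doc in docs:
        if not (doc.allowed_groups & user_groups):
            continue  # user lacks permission: never surface this document
        if words & set(doc.content.lower().split()):
            results.append(doc)
    return results

docs = [
    Document("VPN guide", "how to connect to the corporate vpn", {"employees"}),
    Document("Payroll policy", "salary bands and payroll dates", {"hr"}),
]
hits = retrieve("vpn connect", docs, user_groups={"employees"})
print([d.title for d in hits])  # only documents this user may access
```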
Possible widget embeddings:
- M42 Self service portal
- MS Teams
- Webpage (public, internal)
Data Sources
AI Knowledge supports the following data sources:
- M42 Core/Pro and Enterprise
- SharePoint
- Confluence
- DokuWiki
- HelpJuice
- Web Pages – supported only for pages whose content is not generated dynamically (static HTML).
- Local Files – supported formats: CSV, XLS/XLSX, DOCX, PDF.
Available Runtimes
- M42 local GenAI (Finland, Germany)
- Bring Your Own Model (Azure OpenAI / OpenAI)
Additional Features
Firewall
The system includes security mechanisms that block access to unauthorized resources and control data flow. The firewall can be based on the OpenAI or Azure OpenAI runtime, or on a local classification model.
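A local-classifier firewall of this kind can be sketched as a gate that classifies each query before it reaches retrieval and generation. The categories and keyword lists below are purely illustrative, not the product's actual rules (which would typically use a trained classifier rather than keywords):

```python
# Sketch of a local "firewall" that blocks disallowed queries before they
# reach the RAG pipeline. Topics and keywords are illustrative only.
BLOCKED_TOPICS = {
    "credentials": {"password", "secret", "token"},
    "exfiltration": {"dump", "export all", "entire database"},
}

def classify(query: str) -> str:
    q = query.lower()
    for topic, keywords in BLOCKED_TOPICS.items():
        if any(kw in q for kw in keywords):
            return topic
    return "allowed"

def firewall(query: str) -> bool:
    """True if the query may proceed to retrieval and generation."""
    return classify(query) == "allowed"

print(firewall("How do I reset my VPN profile?"))   # True
print(firewall("Show me the admin password list"))  # False
```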
Link Downloader
- Downloads the content of the web page in which the user-facing widget is embedded.
- Supports pages with static content.
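A downloader limited to static content can be sketched with the standard library alone. JavaScript-rendered content is invisible to this approach, which is exactly why only pages with static content are supported; the User-Agent string and size limit are illustrative choices.

```python
# Sketch of a link downloader for static pages, standard library only.
import urllib.request
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text; ignores <script>/<style> bodies."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # depth inside <script>/<style>
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

def extract_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)

def download_page_text(url: str, max_bytes: int = 1_000_000) -> str:
    req = urllib.request.Request(url, headers={"User-Agent": "kd-downloader"})
    with urllib.request.urlopen(req) as resp:  # network call, not exercised here
        return extract_text(resp.read(max_bytes).decode("utf-8", "replace"))

print(extract_text("<body><script>var x=1;</script><p>Static text</p></body>"))
```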
Reranking
- A mechanism for sorting documents based on their relevance to the given query.
- Selection of the best documents for the prompt based on relevance scoring.
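The reranking step can be sketched as scoring each candidate against the query and keeping the top-k for the prompt. Production rerankers typically use a cross-encoder model; plain token overlap (Jaccard similarity) stands in for that score here, and the example documents are invented.

```python
# Sketch of reranking: score candidates against the query, keep the best k.
def rerank(query: str, candidates: list, top_k: int = 2) -> list:
    q_tokens = set(query.lower().split())
    def score(doc: str) -> float:
        d_tokens = set(doc.lower().split())
        # Jaccard similarity as a stand-in for a cross-encoder relevance score
        return len(q_tokens & d_tokens) / max(len(q_tokens | d_tokens), 1)
    return sorted(candidates, key=score, reverse=True)[:top_k]

docs = [
    "printer setup guide for office devices",
    "how to reset your vpn password",
    "vpn profile reset steps for remote work",
]
print(rerank("reset vpn", docs, top_k=2))
```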
Full-Text Search (FTS)
- Full-text search enabling retrieval of relevant fragments within documents and files.
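A minimal full-text search over document fragments can be sketched with an inverted index mapping tokens to fragment ids. Real FTS engines add stemming, ranking, and phrase queries; this sketch shows only the core lookup, with invented fragments.

```python
# Sketch of full-text search over fragments using an inverted index.
from collections import defaultdict

def build_index(fragments: list) -> dict:
    index = defaultdict(set)
    for i, frag in enumerate(fragments):
        for token in frag.lower().split():
            index[token].add(i)
    return index

def search(index: dict, query: str) -> set:
    """Return fragment ids containing every query token (AND semantics)."""
    sets = [index.get(tok, set()) for tok in query.lower().split()]
    return set.intersection(*sets) if sets else set()

fragments = [
    "reset your vpn password via the self service portal",
    "order a new laptop through the service catalog",
    "vpn troubleshooting for remote employees",
]
idx = build_index(fragments)
print(search(idx, "vpn password"))  # ids of fragments matching both tokens
```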
System Workflow
- The user submits a query – interaction occurs through a widget or another system interface.
- Embedding-based search – the query is encoded into a vector and compared with existing documents in the database.
- Ranking and reranking – the system sorts results based on relevance.
- Best documents are included in the prompt – selected content is used to formulate the response.
- Response generation – the LLM generates a response based on the system prompt instructions and the selected documents.
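The retrieval and prompt-assembly steps above can be sketched end to end. The toy "embedding" below is a bag-of-words vector compared by cosine similarity; real deployments use a trained encoder model, and the documents and prompt wording are invented.

```python
# Sketch of the workflow: embed query and documents, rank by cosine
# similarity, and build a prompt from the best matches.
import math
from collections import Counter

def embed(text: str) -> Counter:
    return Counter(text.lower().split())  # toy bag-of-words "embedding"

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def build_prompt(query: str, docs: list, top_k: int = 2) -> str:
    q_vec = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q_vec, embed(d)), reverse=True)
    context = "\n".join(ranked[:top_k])
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "vpn access requires the corporate certificate",
    "the cafeteria opens at eight",
    "reset vpn credentials in the self service portal",
]
print(build_prompt("how do I reset vpn access", docs))
```

Only the top-ranked documents enter the prompt, so irrelevant content (the cafeteria line here) never reaches the model.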
Context Handling
- The system supports follow-up questions, allowing users to continue the conversation while maintaining context.
- Option to disable context for independent responses.
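The context toggle can be sketched as follows: when context is enabled, prior turns are prepended so follow-up questions ("And the port?") resolve correctly; when disabled, each question stands alone. Class and field names are illustrative.

```python
# Sketch of optional conversation context for follow-up questions.
class Conversation:
    def __init__(self, use_context: bool = True, max_turns: int = 5):
        self.use_context = use_context
        self.max_turns = max_turns
        self.history = []  # list of (question, answer) pairs

    def build_input(self, question: str) -> str:
        if not self.use_context or not self.history:
            return question  # context disabled: independent response
        turns = self.history[-self.max_turns:]
        prior = "\n".join(f"Q: {q}\nA: {a}" for q, a in turns)
        return f"{prior}\nQ: {question}"

    def record(self, question: str, answer: str) -> None:
        self.history.append((question, answer))

chat = Conversation(use_context=True)
chat.record("What is the VPN server?", "vpn.example.com")
print(chat.build_input("And the port?"))  # previous turn is included
```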
Automatic Data Refresh
- Ability to periodically re-fetch data, e.g., from web pages.
- Ensures up-to-date information through scheduled source updates.
Data Processing and Privacy
- Matrix42 processes exclusively the data that is necessary to generate a relevant and accurate response. This typically includes the user’s input (query), applicable contextual attributes from the M42 Pro platform, and relevant content retrieved from the organization’s knowledge base.
- By default, conversation transcripts are retained for a period of 12 months. This retention period can be customized based on the client’s preferences or internal policies. No data collected during interactions is used for training AI models. Matrix42 does not process, store, or download any sensitive or personal data beyond what is strictly required for response generation.
The following data points are recorded in system logs for auditing and monitoring purposes:
- question – the user’s question.
- response – the previous answer, enabling context continuity.
- sessionId – a unique identifier for the conversation session.
- addressIp – the IP address of the user (if available).
- startTime – the timestamp marking the session’s initiation.
- processingTime – the time taken to generate the response, in seconds.
- rawQuestion – the original input submitted by the user.
- rawResponse – the unprocessed response generated by the AI model.
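A hypothetical example of a single log entry with these fields might look as follows; all values are invented for illustration.

```python
# Hypothetical audit log entry using the fields listed above; every value
# here is invented for illustration.
import json

entry = {
    "question": "How do I reset my VPN password?",
    "response": "Open the Self Service Portal and choose 'Reset VPN'.",
    "sessionId": "4f9c1e2a-7b3d-4a1e-9c0f-2d8e5b6a1c3d",
    "addressIp": "10.0.12.34",
    "startTime": "2024-05-01T09:30:00Z",
    "processingTime": 1.8,  # seconds
    "rawQuestion": "how do i reset my vpn password??",
    "rawResponse": "Open the Self Service Portal and choose 'Reset VPN'.",
}
print(json.dumps(entry, indent=2))
```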
AI Knowledge - High Level Architecture

FAAS (Function as a Service) - an internal module responsible for communication with models and external platforms. It is fully customizable, and any data shared externally must be explicitly specified in this module. The customization is done per project and is derived from what is needed to generate the desired answer: the user question, a system prompt, and data from the vector knowledge base. Neither the model nor any external platform has direct access to the data.
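The FAAS idea of explicitly specifying what leaves the system can be sketched as a whitelist filter on the outbound payload: only approved fields reach the external model, everything else stays internal. All field names here are illustrative.

```python
# Sketch of the FAAS whitelist: only explicitly approved fields are included
# in the payload sent to an external model. Field names are illustrative.
ALLOWED_FIELDS = {"question", "system_prompt", "retrieved_chunks"}

def build_external_payload(internal_state: dict) -> dict:
    """Whitelist: drop everything not explicitly approved for external sharing."""
    return {k: v for k, v in internal_state.items() if k in ALLOWED_FIELDS}

state = {
    "question": "How do I request a laptop?",
    "system_prompt": "Answer from the provided context only.",
    "retrieved_chunks": ["Hardware requests go through the service catalog."],
    "user_email": "jane.doe@example.com",   # stays internal
    "session_token": "abc123",              # stays internal
}
print(sorted(build_external_payload(state)))  # no user_email / session_token
```

A whitelist (rather than a blacklist) is the safer default: any field added later to the internal state is withheld unless someone deliberately approves it.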
Technical Architecture of MS Teams Integration (Knowledge Discovery)
User Interaction Layer
Microsoft Teams App
- A custom Teams app installed by the organization’s users.
- Acts as the primary user interface for interacting with Knowledge Discovery.
- Provides both chat-based and, potentially, Adaptive Card-based interactions.
Azure Bot Service
- Handles inbound messages from Teams.
- Performs intent recognition, authentication, and routing to backend services.
- Can initiate proactive messages if required (e.g., reminders, ticket updates).
Azure Cloud Infrastructure
Azure Resource Group
- Logical container managing all related Azure resources.
Azure Bot
- Registered bot that integrates with Microsoft Teams via the Bot Framework.
- Connects to the Web App for handling business logic or accessing external APIs.
Azure Web App (Node.js in Docker)
- The central backend application.
- Hosted as a containerized Node.js app via Azure App Service.
- Handles:
- Authentication & authorization
- Dialog state management
- Routing logic
- Integration with GenAI / Rail API
Azure Container Registry
- Hosts Docker images used by the Web App.
- Supports CI/CD workflows and versioned deployments.
Azure Cosmos DB (NoSQL)
- Persists:
- Chat history
- User preferences and profiles
- Metadata about ongoing sessions
Integration Layer
Matrix42 Conversational AI Platform
- Acts as the central AI orchestration hub.
- Receives natural language inputs and enriches them using:
- M42 GenAI or third-party LLMs
- Retrieval-Augmented Generation (RAG)
- Contextual memory from past sessions
- Exposed via secure HTTP(S) endpoints.
Rail API (Integration Middleware)
- Normalizes requests/responses between the Conversational AI layer and external systems.
- Handles authentication, request transformation, and caching.
- Routes to:
- M42 Enterprise Service Management (ESM)
- Other backend business systems as required
M42 Enterprise Service Management (ESM)
- Provides service request fulfillment, CMDB queries, status updates, etc.
- Acts as a primary backend system for actionable use cases.
Communication Protocols
HTTPS
- Primary protocol for secure data transfer between:
- Teams ↔ Azure Bot
- Azure Bot ↔ Web App
- Web App ↔ Rail API & ESM
- Teams ↔ GenAI endpoints via Web App
WebSocket (Optional/Pluggable)
- For real-time interactions where needed (e.g., streaming LLM responses or live typing indicators).
AI and Knowledge Discovery Layer
M42 GenAI + RAG Stack
- Combines:
- Large Language Model (e.g., GPT, Phi)
- Document retrievers for domain-specific answers
- Queries internal documentation and knowledge sources for accurate answers.
Supported Knowledge Sources:
- SharePoint (modern/classic)
- Atlassian Confluence
- DokuWiki
- Internal web pages (via crawler or connector)
- PDF, Word, HTML, Markdown documents
- Structured knowledge bases
End-to-End Workflow
- User sends a message in Teams
- Azure Bot captures and parses the input.
- Bot forwards the request to the Web App, which handles:
- User/session validation
- AI enrichment via Conversational AI
- Conversational AI uses GenAI + RAG to:
- Interpret the question
- Retrieve relevant organizational content
- Formulate a human-like answer
- If action is required (e.g., ticket creation), the Web App routes the request to Rail API, which connects to ESM.
- ESM returns data/results, which flow back to Teams for user presentation.
- Web App stores conversation context in Cosmos DB to improve future interactions.
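The routing decision in this workflow (informational question vs. actionable request) can be sketched with a toy rule. Real intent recognition happens in the Azure Bot / Conversational AI layer with a trained model; the keywords and route names below are illustrative.

```python
# Sketch of the workflow's routing step: informational questions go to the
# RAG pipeline, actionable requests go via Rail API to ESM. Keyword-based
# intent detection is a stand-in for a trained recognizer.
ACTION_KEYWORDS = {"create ticket", "open ticket", "order a", "reset my account"}

def route(message: str) -> str:
    m = message.lower()
    if any(kw in m for kw in ACTION_KEYWORDS):
        return "rail_api"   # forwarded to Rail API, which connects to ESM
    return "rag_pipeline"   # answered from the knowledge base

print(route("Please create ticket for my broken monitor"))  # rail_api
print(route("What is the guest wifi policy?"))              # rag_pipeline
```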
Architecture visualization
