
Component Overview

This section introduces the core components of the BK-Lite platform and their key roles within the system.

Architecture Overview

BK-Lite adopts a microservices architecture, orchestrated and deployed via Docker Compose. Components are categorized as follows:

| Category | Component | Role |
| --- | --- | --- |
| Gateway | Traefik | Reverse proxy, request routing, TLS termination |
| Application | Server, Web | Business logic, frontend interface |
| Data | PostgreSQL, PGVector, FalkorDB | Relational data, vector data, graph data |
| Cache | Redis | Caching, sessions, Celery Broker |
| Messaging | NATS | Message queue, event distribution |
| Monitoring | VictoriaMetrics, VictoriaLogs | Metrics storage, log storage |
| Storage | MinIO | Object storage |
| Agent | Telegraf, Vector, Beats series | Host metrics collection, log collection |
| Collector | Fusion-Collector, Stargazer, NATS-Executor, Webhookd | Unified collection, cloud resource collection, command execution |
| AI | vLLM (optional) | Model inference service (Embedding, Rerank, OCR) |

Core Component Details

Traefik (Reverse Proxy)

Role: Serves as the unified entry point for the system, handling request routing and TLS termination.

| Property | Value |
| --- | --- |
| Default Port | 443 (configurable) |
| Data Volume | No persistent data |
| Backup Priority | Low (stateless) |

Key Configuration:

  • Automatically discovers Docker containers and generates routing rules
  • Supports dynamic configuration hot-reloading
  • Provides an optional Dashboard management interface

PostgreSQL (Relational Database)

Role: Stores core business data including users, configurations, CMDB assets, and more.

| Property | Value |
| --- | --- |
| Default Port | 5432 |
| Data Volume | postgres:/data/postgres |
| Backup Priority | Highest |

Database List:

  • bklite - Main business database
  • mlflow - MLflow model tracking database

Important: PostgreSQL is the most critical data store in the system and must be prioritized for backup.
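
Because both listed databases need regular backups, it can help to generate the dump commands from one place. The following is a minimal sketch that builds `pg_dump` argument lists for the two databases named above. The host name `postgres`, the user `postgres`, and the `/backup` output directory are assumptions for illustration and are not taken from the BK-Lite configuration.

```python
from datetime import date

# Databases from the list above.
DATABASES = ["bklite", "mlflow"]


def pg_dump_command(database, host="postgres", port=5432,
                    user="postgres", out_dir="/backup", day=None):
    """Build a pg_dump argument list for one database.

    host, user, and out_dir are hypothetical defaults; adjust them to
    match the actual deployment.
    """
    day = day or date.today()
    out_file = f"{out_dir}/{database}-{day.isoformat()}.dump"
    return [
        "pg_dump",
        "--host", host,
        "--port", str(port),
        "--username", user,
        "--format", "custom",  # compressed archive readable by pg_restore
        "--file", out_file,
        database,
    ]


if __name__ == "__main__":
    for db in DATABASES:
        # Print the commands instead of running them, so the sketch is
        # safe to execute anywhere.
        print(" ".join(pg_dump_command(db)))
```

The resulting lists can be passed to `subprocess.run` inside the PostgreSQL container or on any host with client access to port 5432.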


PGVector (Vector Database)

Role: Provides vector storage for OpsPilot AI capabilities, supporting semantic search and knowledge base Q&A.

| Property | Value |
| --- | --- |
| Default Port | Shares PostgreSQL port 5432 |
| Database Name | metis |
| Backup Priority | High |

Use Cases:

  • Knowledge base document vectorization
  • RAG (Retrieval-Augmented Generation) semantic search
  • AI conversation context association

Note: PGVector runs as a PostgreSQL extension, with data stored in a separate metis database.
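
Semantic search over stored vectors amounts to ranking documents by their distance to a query embedding. The sketch below reproduces that ranking in plain Python using cosine distance, which is the quantity pgvector's `<=>` operator computes. The document IDs and three-dimensional vectors are made up for illustration; real embeddings have hundreds of dimensions, and the actual tables in the metis database are not described here.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity, the value pgvector's <=> yields."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, documents, k=2):
    """Return the ids of the k documents closest to the query vector."""
    ranked = sorted(documents.items(),
                    key=lambda item: cosine_distance(query, item[1]))
    return [doc_id for doc_id, _ in ranked[:k]]


# Hypothetical knowledge-base entries with toy embeddings.
DOCS = {
    "restart-guide": [0.9, 0.1, 0.0],
    "billing-faq": [0.0, 0.2, 0.9],
    "deploy-notes": [0.7, 0.3, 0.1],
}

if __name__ == "__main__":
    print(top_k([1.0, 0.0, 0.0], DOCS))
```

In a database query, the same ordering would typically be expressed as `ORDER BY embedding <=> :query LIMIT :k`, where the `embedding` column name is an assumption.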


FalkorDB (Graph Database)

Role: Stores CMDB asset relationship graphs, supporting complex topology queries.

| Property | Value |
| --- | --- |
| Default Port | 6479 (mapped to container port 6379) |
| Data Volume | falkordb:/var/lib/falkordb/data |
| Backup Priority | High |

Use Cases:

  • CMDB asset relationships
  • Service dependency topology
  • Impact analysis and root cause identification

Technical Note: FalkorDB speaks the Redis protocol, so redis-cli can be used for management operations.
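
To illustrate why Redis tooling works against FalkorDB, the sketch below encodes a graph query as a RESP (Redis serialization protocol) array of bulk strings, which is the wire format redis-cli sends. `GRAPH.QUERY` is FalkorDB's query command; the graph name `cmdb` and the Cypher query are hypothetical examples, not names taken from BK-Lite.

```python
def encode_resp(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode("utf-8")
        # Each bulk string is "$<byte length>\r\n<bytes>\r\n".
        parts.append(b"$" + str(len(data)).encode() + b"\r\n" + data + b"\r\n")
    return b"".join(parts)


# Hypothetical graph name and query, for illustration only.
QUERY = "MATCH (a)-[r]->(b) RETURN a, r, b LIMIT 5"
PAYLOAD = encode_resp("GRAPH.QUERY", "cmdb", QUERY)

if __name__ == "__main__":
    # Roughly what this would send over the wire:
    #   redis-cli -p 6479 GRAPH.QUERY cmdb "MATCH (a)-[r]->(b) ..."
    print(PAYLOAD)
```

The payload is only constructed, not sent, so the sketch runs without a FalkorDB instance.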


Redis (Cache Database)

Role: Provides high-performance caching, session storage, and the Celery task queue.

| Property | Value |
| --- | --- |
| Default Port | 6379 |
| Data Volume | redis:/data |
| Backup Priority | Medium |

Usage Allocation:

  • DB 1: Application cache
  • DB 3: Celery Broker / Result Backend
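
The database allocation above maps directly onto Redis connection URLs, where the trailing path segment selects the logical database. The following sketch builds those URLs; the host name `redis` (assumed to be the container name) and the optional password handling are assumptions rather than documented BK-Lite settings.

```python
from urllib.parse import quote

REDIS_HOST = "redis"  # assumed container name on the Docker network
REDIS_PORT = 6379

# Logical database numbers from the allocation listed above.
DB_ALLOCATION = {"cache": 1, "celery": 3}


def redis_url(purpose, password=None):
    """Return a redis:// URL that selects the database for a purpose."""
    db = DB_ALLOCATION[purpose]
    # Percent-encode the password so special characters stay valid.
    auth = f":{quote(password, safe='')}@" if password else ""
    return f"redis://{auth}{REDIS_HOST}:{REDIS_PORT}/{db}"


if __name__ == "__main__":
    print(redis_url("cache"))   # application cache
    print(redis_url("celery"))  # Celery broker / result backend
```

Keeping the cache and Celery in separate logical databases means flushing one does not discard keys belonging to the other.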

NATS (Message Queue)

Role: High-performance messaging middleware responsible for asynchronous communication and event distribution between components.

| Property | Value |
| --- | --- |
| Default Port | 4222 (client), 7422 (cluster) |
| Data Volume | nats:/nats |
| Backup Priority | Low |

Use Cases:

  • Monitoring metrics reporting channel
  • Log data transport
  • Node management command dispatch

VictoriaMetrics (Time Series Database)

Role: Stores monitoring metrics data, providing high-performance time series queries.

| Property | Value |
| --- | --- |
| Default Port | 8428 |
| Data Volume | victoria-metrics:/victoria-metrics-data |
| Backup Priority | Medium |
| Default Retention | 168 hours (7 days) |

Features:

  • Compatible with Prometheus query syntax
  • Supports high-cardinality metrics
  • Low resource footprint
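
Because VictoriaMetrics accepts Prometheus-style queries, an instant query can be issued against the Prometheus-compatible `/api/v1/query` endpoint. The sketch below builds such a URL and checks the retention arithmetic from the table above. The host name `victoria-metrics` is an assumption, and the metric used in the example is hypothetical.

```python
from urllib.parse import urlencode

RETENTION_HOURS = 168
RETENTION_DAYS = RETENTION_HOURS // 24  # 168 hours is exactly 7 days


def instant_query_url(promql, host="victoria-metrics", port=8428):
    """Build a Prometheus-compatible instant query URL.

    The default host is an assumed container name.
    """
    return f"http://{host}:{port}/api/v1/query?" + urlencode({"query": promql})


if __name__ == "__main__":
    # Hypothetical metric and label, shown only to demonstrate URL encoding.
    print(instant_query_url('cpu_usage_idle{host="node-01"}'))
    print(f"Default retention: {RETENTION_DAYS} days")
```

The URL can be fetched with any HTTP client; the response follows the Prometheus query API JSON format.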

VictoriaLogs (Log Database)

Role: Stores and retrieves system logs, supporting full-text search.

| Property | Value |
| --- | --- |
| Default Port | 9428 |
| Data Volume | victoria-logs:/vlogs |
| Backup Priority | Medium |

MinIO (Object Storage)

Role: Provides S3-compatible object storage for files and model artifacts.

| Property | Value |
| --- | --- |
| API Port | 9000 |
| Console Port | 9001 |
| Data Volume | minio:/data |
| Backup Priority | Medium |

Stored Content:

  • MLflow model artifacts
  • Knowledge base uploaded files
  • System attachments

MLflow (Model Management)

Role: Machine learning model version management and experiment tracking.

| Property | Value |
| --- | --- |
| Default Port | 15000 |
| Backend Store | PostgreSQL (mlflow database) |
| Artifact Store | MinIO (mlflow-artifacts bucket) |
| Backup Priority | Medium |

Server (Backend Service)

Role: BK-Lite core business service, providing REST APIs.

| Property | Value |
| --- | --- |
| Internal Port | 8000 |
| External Path | /api/v1/* |
| Backup Priority | Low (stateless) |

Dependencies:

  • PostgreSQL (business data)
  • PGVector (vector data)
  • FalkorDB (graph data)
  • Redis (cache/queue)

Web (Frontend Service)

Role: Next.js frontend application, providing the user interface.

| Property | Value |
| --- | --- |
| Internal Port | 3000 |
| External Path | / (default route) |
| Backup Priority | Low (stateless) |
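
The External Path values for Server and Web imply prefix-based routing at the gateway: requests under /api/v1/ go to the backend, and everything else falls through to the frontend. The sketch below models that decision with longest-prefix matching. It is an illustration of the routing idea only, not Traefik's rule syntax, and the container names `server` and `web` are assumptions.

```python
# (path prefix, (assumed container name, internal port))
ROUTES = [
    ("/api/v1/", ("server", 8000)),
    ("/", ("web", 3000)),  # default route
]


def resolve_backend(path):
    """Return the backend for a request path using longest-prefix match."""
    for prefix, backend in sorted(ROUTES, key=lambda r: len(r[0]), reverse=True):
        if path.startswith(prefix):
            return backend
    raise ValueError(f"no route for {path!r}")


if __name__ == "__main__":
    for path in ("/api/v1/users", "/dashboard", "/"):
        host, port = resolve_backend(path)
        print(f"{path} -> {host}:{port}")
```

Checking the longest prefix first keeps the catch-all `/` route from shadowing the more specific API route.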

Collection and Agent Components

Telegraf (Metrics Collection)

Role: Collects host and container metrics, reporting to VictoriaMetrics via NATS.

| Property | Value |
| --- | --- |
| Config File | conf/telegraf/telegraf.conf |
| Backup Priority | Low (stateless) |

Vector (Log Collection)

Role: High-performance log collection and forwarding engine, sending log data to VictoriaLogs.

| Property | Value |
| --- | --- |
| Config File | conf/vector/vector.yaml |
| Backup Priority | Low (stateless) |

Fusion-Collector (Unified Collector)

Role: BK-Lite's proprietary unified collector. It collects from multiple data sources and can be deployed on managed nodes.

| Property | Value |
| --- | --- |
| SNMP Trap Port | 162/udp |
| Supported Platforms | Linux, Windows |
| Backup Priority | Low (stateless) |

Collection Capabilities:

  • Host performance metrics (CPU, memory, disk, network)
  • SNMP Trap reception
  • Custom script collection

Stargazer (Cloud Resource Collection)

Role: Agent service for cloud resource collection and monitoring. It synchronizes resources from multiple cloud platforms.

| Property | Value |
| --- | --- |
| Internal Port | 8083 |
| Tech Stack | Python + Sanic + ARQ |
| Backup Priority | Low (stateless) |

Supported Cloud Platforms:

  • VMware vSphere
  • Alibaba Cloud
  • AWS
  • Tencent Cloud
  • Huawei Cloud

Architecture:

  • Server: Receives collection requests, distributes tasks
  • Worker: Executes specific collection tasks (based on ARQ task queue)

NATS-Executor (Command Executor)

Role: NATS-based remote command execution agent, deployed on managed nodes to execute scripts and commands.

| Property | Value |
| --- | --- |
| Tech Stack | Go |
| Supported Platforms | Linux, Windows, macOS |
| Backup Priority | Low (stateless) |

Features:

  • Cross-platform command execution (sh, bash, bat, PowerShell)
  • Command execution timeout control
  • File download and extraction
  • Health checks

NATS Subscription Topics:

  • local.execute.{instance_id} - Local command execution
  • health.check.{instance_id} - Health check
  • download.local.{instance_id} - File download
  • unzip.local.{instance_id} - File extraction
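
Each topic above is a per-node NATS subject formed by substituting the node's instance ID into a fixed pattern. The sketch below builds those subjects and rejects IDs that would break the subject structure, since NATS uses `.` as a token separator and `*` and `>` as wildcards. The action keys (`execute`, `health`, and so on) and the sample instance ID are names invented for this example.

```python
# Subject patterns from the list above, keyed by a hypothetical action name.
SUBJECT_TEMPLATES = {
    "execute": "local.execute.{instance_id}",
    "health": "health.check.{instance_id}",
    "download": "download.local.{instance_id}",
    "unzip": "unzip.local.{instance_id}",
}

# Characters that would change how NATS parses or matches the subject.
FORBIDDEN = set(". *>\t\r\n")


def subject_for(action, instance_id):
    """Return the NATS subject for an action on one managed node."""
    if not instance_id or FORBIDDEN & set(instance_id):
        raise ValueError(f"invalid instance id: {instance_id!r}")
    return SUBJECT_TEMPLATES[action].format(instance_id=instance_id)


if __name__ == "__main__":
    for action in SUBJECT_TEMPLATES:
        print(subject_for(action, "node-01"))
```

Validating the ID before formatting prevents a value such as `a.b` from silently producing a subject with an extra token that no executor subscribes to.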

Webhookd (Webhook Service)

Role: Provides HTTP Webhook interfaces for triggering script execution and Docker Compose management.

| Property | Value |
| --- | --- |
| Internal Port | 8080 |
| Backup Priority | Low (stateless) |

API Features:

  • Docker Compose service management (setup, start, stop, status, update)
  • Infrastructure management script execution
  • Kubernetes operation proxy
  • MLOps training task triggering

AI Inference Components (Optional)

Note: The following components are only deployed when OpsPilot AI capabilities are enabled and require GPU support.

vLLM Model Service

Role: Provides a high-performance model inference service, supporting Embedding, Rerank, and OCR models.

| Service | Model Type | Purpose |
| --- | --- | --- |
| bce-embedding | Text Embedding | Document vectorization |
| bge-embedding | Text Embedding | Document vectorization (alternative) |
| bce-rerank | Reranking | Search result optimization |
| olmocr | OCR | Image text recognition |

Hardware Requirements:

  • NVIDIA GPU (CUDA supported)
  • VRAM >= 8GB (per model)
  • Recommended server memory >= 16GB

Data Volume Inventory

The following table lists all Docker data volumes and their backup priorities:

| Data Volume | Component | Backup Priority | Description |
| --- | --- | --- | --- |
| postgres | PostgreSQL | Highest | Core business data |
| falkordb | FalkorDB | High | Graph database |
| victoria-metrics | VictoriaMetrics | Medium | Monitoring metrics |
| victoria-logs | VictoriaLogs | Medium | Log data |
| minio | MinIO | Medium | Object storage |
| redis | Redis | Medium | Cache data |
| nats | NATS | Low | Message queue |
| neo4j | (Reserved) | - | Not enabled |
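
One way to act on these priorities is to process volumes in priority order, so the most critical data is captured first if a backup window is cut short. The sketch below sorts the inventory accordingly. The numeric ranking is an assumption introduced for this example, and the reserved neo4j volume is omitted because it is not enabled.

```python
# Assumed numeric ranking of the priority labels used in the inventory.
PRIORITY_RANK = {"Highest": 0, "High": 1, "Medium": 2, "Low": 3}

# (volume, component, priority) rows from the inventory.
VOLUMES = [
    ("postgres", "PostgreSQL", "Highest"),
    ("falkordb", "FalkorDB", "High"),
    ("victoria-metrics", "VictoriaMetrics", "Medium"),
    ("victoria-logs", "VictoriaLogs", "Medium"),
    ("minio", "MinIO", "Medium"),
    ("redis", "Redis", "Medium"),
    ("nats", "NATS", "Low"),
]


def backup_order(volumes):
    """Return volume names sorted from most to least critical.

    sorted() is stable, so volumes with equal priority keep the order
    in which they appear in the inventory.
    """
    ranked = sorted(volumes, key=lambda row: PRIORITY_RANK[row[2]])
    return [name for name, _, _ in ranked]


if __name__ == "__main__":
    for position, name in enumerate(backup_order(VOLUMES), start=1):
        print(position, name)
```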

Network Architecture

All components run within the bklite-prod Docker network and communicate internally via container names.

┌───────────────────────────────────────────────────────────────┐
│                        External Access                        │
│                       https://<HOST_IP>                       │
└───────────────────────────────┬───────────────────────────────┘
                                │
                          ┌─────▼─────┐
                          │  Traefik  │ :443
                          └─────┬─────┘
                ┌───────────────┼───────────────┐
                │               │               │
          ┌─────▼─────┐   ┌─────▼─────┐   ┌─────▼─────┐
          │    Web    │   │  Server   │   │  Others   │
          │   :3000   │   │   :8000   │   │           │
          └───────────┘   └─────┬─────┘   └───────────┘
                                │
      ┌────────────┬────────────┼────────────┬────────────┐
      │            │            │            │            │
┌─────▼────┐ ┌─────▼────┐ ┌─────▼────┐ ┌─────▼────┐ ┌─────▼────┐
│  Redis   │ │   NATS   │ │PostgreSQL│ │ PGVector │ │ FalkorDB │
│  :6379   │ │  :4222   │ │  :5432   │ │  :5432   │ │  :6479   │
└──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘

Next Steps