Step-by-Step Installation

🗂️ Create database topology config file

HariKube determines data locality using the object key structure and applies routing based on configurable policies, such as matching by resource type, namespace, key prefix, or custom resource definition.

Routing configurations are evaluated in order from top to bottom, and the first matching rule determines the data’s target database. Once a match is found, subsequent rules are ignored for that resource.

Design routing policies carefully: adding or changing a policy for resource types that already have stored data can make the existing records inaccessible. HariKube does not automatically migrate previously stored resources to the new target, so any routing change may lead to apparent data loss unless the migration is handled manually.

At runtime the middleware monitors the configuration for changes and applies them, but the only supported change is appending new entries to the bottom of the list.

Names and endpoints must be unique within the configuration. To change an endpoint, first ensure that all data exists on the new endpoint, then restart the middleware. To change a name, restart the middleware and all services that depend on historical data, including Kubernetes.
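
The two constraints above (unique names and endpoints, append-only changes) can be checked before a new topology file is rolled out. The following is a minimal sketch, not part of HariKube: the function name validate_topology_update is hypothetical, and it assumes the backends list has already been parsed from YAML into plain dictionaries.

```python
def validate_topology_update(old_backends, new_backends):
    """Return a list of problems found when replacing old_backends with new_backends.

    Hypothetical helper: both arguments are lists of dicts that each carry
    at least the "name" and "endpoint" keys from topology.yaml.
    """
    problems = []

    # Names and endpoints must each be unique across the whole configuration.
    for field in ("name", "endpoint"):
        seen = set()
        for backend in new_backends:
            value = backend[field]
            if value in seen:
                problems.append(f"duplicate {field}: {value}")
            seen.add(value)

    # Existing entries must stay unchanged and in place; new ones go at the bottom.
    if new_backends[:len(old_backends)] != old_backends:
        problems.append("existing backends were changed, removed, or reordered")

    return problems
```

An empty result means the new list only appends entries and introduces no duplicates. The check says nothing about whether data has been migrated, which still has to be handled manually.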

topology.yaml
backends:
- name: rbac
  endpoint: http://127.0.0.1:2579
  regexp:
    prefix: (clusterrolebindings|clusterroles|rolebindings|roles|serviceaccounts)
    key: (clusterrolebindings|clusterroles|rolebindings|roles|serviceaccounts)
- name: kube-system
  endpoint: mysql://root:passwd@tcp(127.0.0.1:3306)/kube_system
  namespace:
    namespace: kube-system
- name: pods
  endpoint: postgres://postgres:passwd@127.0.0.1:5432/pods
  prefix:
    prefix: pods
- name: shirts
  endpoint: sqlite://./db/shirts.db?_journal=WAL&cache=shared
  customresource:
    group: stable.example.com
    kind: shirts

Routing Configuration Explained

  • ETCD with regular expression routing: Routes Kubernetes RBAC resources to an ETCD store.

  • MySQL endpoint with namespace matching: All objects in the kube-system namespace are routed to a MySQL backend.

    To route only selected resource types, list them in the kinds field. Custom resources need a separate policy, because listing kinds and matching custom resources in the same policy is not supported.

    topology.yaml
    
    - name: kube-system
      endpoint: mysql://root:passwd@tcp(127.0.0.1:3306)/kube_system
      namespace:
        namespace: kube-system
        kinds:
        - pods
        - deployments
    

  • PostgreSQL endpoint with prefix matching: All pods resources, except pods in the kube-system namespace (already matched by the preceding rule), are routed to a PostgreSQL backend.

  • SQLite endpoint for specific custom resources: Routes all resources of type shirts in the group stable.example.com to a lightweight embedded SQLite database.

  • All remaining objects are stored in the default database.
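
The first-match behaviour of the example topology can be illustrated with a small sketch. This is not HariKube's implementation: the predicates are simplified stand-ins for the regexp, namespace, prefix, and customresource matchers, and the code assumes the usual Kubernetes storage key layout (/registry/<resource>/<namespace>/<name>, with a leading <group> segment for custom resources). The names parse_key, RULES, and route are hypothetical.

```python
RBAC = {"clusterrolebindings", "clusterroles", "rolebindings", "roles", "serviceaccounts"}


def parse_key(key):
    """Split a storage key into (group, resource, namespace).

    Assumed layout: /registry/[<group>/]<resource>/[<namespace>/]<name>.
    A first segment containing a dot is treated as an API group.
    """
    parts = key.strip("/").split("/")[1:]  # drop the "registry" prefix
    group = parts.pop(0) if "." in parts[0] else ""
    resource = parts[0]
    namespace = parts[1] if len(parts) == 3 else ""  # cluster-scoped keys have no namespace
    return group, resource, namespace


# Rules in the same order as the example topology.yaml; the first match wins.
RULES = [
    ("rbac", lambda group, resource, ns: resource in RBAC),
    ("kube-system", lambda group, resource, ns: ns == "kube-system"),
    ("pods", lambda group, resource, ns: group == "" and resource == "pods"),
    ("shirts", lambda group, resource, ns: group == "stable.example.com" and resource == "shirts"),
]


def route(key, default="default"):
    """Return the name of the first backend whose rule matches the key."""
    fields = parse_key(key)
    for name, matches in RULES:
        if matches(*fields):
            return name
    return default  # nothing matched: the object goes to the default database
```

With these rules, route("/registry/pods/kube-system/coredns") selects kube-system because that rule precedes pods, while route("/registry/pods/default/web") falls through to pods. A role binding in kube-system is still routed to rbac, since the rbac rule is listed first.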

Advanced Configuration Options

📘 Metadata Store Configuration

HariKube includes an internal metadata store that maintains mapping information about the underlying databases. It tracks which database is responsible for each data segment, ensuring consistency and fast lookups without querying every backend directly. The metadata store is central to HariKube’s ability to provide dynamic data placement, multi-database support, and high-performance routing across flat or hierarchical topologies. To configure the metadata store, set the following environment variables on the middleware. The default metadata store is SQLite.

  • REVISION_MAPPER_HISTORY: How long metadata revisions are retained. Revisions older than this period are treated as compacted and are no longer available for historical lookups, which limits storage usage. A value of 0 disables compaction. Default: 4h.
  • REVISION_MAPPER_CACHE_CAPACITY: Capacity of the in-memory cache. Default: 10000, or 1000000 for the in-memory store.
  • REVISION_MAPPER_SKIP_VERIFY: Skip TLS verification. Default: false.

HariKube mappers are optimized for high-throughput environments, and some persist metadata asynchronously. In the worst case, where metadata is lost, restarting the services that rely on historical revisions safely reinitializes a fresh revision history, so the system continues operating even though older state is no longer available.

🔌 Start Middleware

The images are not public; request registry credentials via info@inspirnation.eu.

docker run -d \
  -e TOPOLOGY_CONFIG=file:///topology.yaml \
  -v "$(pwd)/topology.yaml:/topology.yaml" \
  -v harikube_db:/db \
  -p 2379:2379 \
  registry.harikube.info/harikube/middleware:beta-v1.0.0-4 \
  --listen-address=0.0.0.0:2379 --endpoint=multi://http://<default.database.server:2379>

🚀 Setup and start Kubernetes

Kubernetes Configuration

HariKube requires specific Kubernetes configuration to enable custom resource routing and external data store integration.

| Mandatory | Category | Option | Description |
|---|---|---|---|
| | Feature Gate | CustomResourceFieldSelectors=true | Enables CR field selectors |
| | Feature Gate | WatchList=true | Enables watch list support |
| | Feature Gate | WatchListClient=true | Enables watch list client feature |
| | API Server Flag | --encryption-provider-config="" | Encryption not supported |
| | API Server Flag | --storage-media-type=application/json | Sets storage format to JSON |
| | API Server Flag | --etcd-servers=http(s)://middleware.service:2379 | Sets the middleware as the ETCD backend |
| | API Server Flag | --watch-cache=false | Disables watch cache (recommended for large data) |
| | API Server Flag | --max-mutating-requests-inflight=400 | Increases concurrency for mutating requests |
| | API Server Flag | --max-requests-inflight=800 | Increases concurrency for all requests |
| | API Server Flag | --enable-garbage-collector=false | In case all databases use automatic GC |
| | Controller Manager Flag | --enable-garbage-collector=false | In case all databases use automatic GC |
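
As an illustration of how these settings combine on the API server command line, here is a minimal sketch that assembles the argument list. It assumes all three feature gates are passed through the standard --feature-gates flag as a comma-separated list. The function name apiserver_args is hypothetical, and the encryption and garbage collector options are left out because they depend on the chosen databases.

```python
# Feature gates listed in the table above.
FEATURE_GATES = {
    "CustomResourceFieldSelectors": True,
    "WatchList": True,
    "WatchListClient": True,
}


def apiserver_args(middleware_url):
    """Build kube-apiserver arguments that point the API server at the middleware.

    Hypothetical helper; middleware_url is the address the middleware listens on,
    for example http://middleware.service:2379.
    """
    # --feature-gates takes a comma-separated list of Name=true|false pairs.
    gates = ",".join(f"{name}={str(on).lower()}" for name, on in FEATURE_GATES.items())
    return [
        f"--feature-gates={gates}",
        "--storage-media-type=application/json",
        f"--etcd-servers={middleware_url}",
        "--watch-cache=false",
        "--max-mutating-requests-inflight=400",
        "--max-requests-inflight=800",
    ]
```

The resulting list can be appended to whatever arguments the cluster already passes to kube-apiserver, for example in a static pod manifest.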

Kubernetes is compatible with HariKube by default. However, Kubernetes was designed around ETCD, its underlying storage system, whose architectural constraints mean it is not optimized for very large datasets. Supporting high-volume data workloads requires modifications to specific Kubernetes components, such as the API server.

Pre-built images are available for the following versions:

| Major version | Patch versions | Architectures |
|---|---|---|
| v1.32 | v1.32.0, v1.32.1, v1.32.2, v1.32.3 | amd64, arm64, ppc64le, s390x |
| v1.33 | v1.33.0 | amd64, arm64, ppc64le, s390x |

docker pull registry.harikube.info/harikube/kube-apiserver-amd64:v1.33.0
docker pull registry.harikube.info/harikube/kube-controller-manager-amd64:v1.33.0

Compiling Kubernetes From Source Code (optional)

For detailed build instructions, follow the official Kubernetes documentation. The basic steps are:

git clone https://github.com/kubernetes/kubernetes.git
cd kubernetes
git checkout v1.33.0

Download kubernetes-v1.33.0.patch and apply it:

git apply kubernetes-v1.33.0.patch

Building Options
