Installation

A step-by-step guide for setting up a self-hosted DataForge instance.

This chapter provides a reference Docker Compose file to use for deployment. All services of a DataForge deployment, including the DataForge server, run in Docker.

Deploying DataForge using Docker Compose

1. Obtaining necessary credentials

Once you have purchased a DataForge license, we will send you various credentials for your DataForge instance:

  • Your license key (a file ending in .key) will authenticate your DataForge server to our license server and enable all features included in your subscription
  • The docker registry credentials are used to pull the Docker images from our Docker registry

You will need these credentials during the installation process. Please do not share the credentials with anybody.

2. Setting up the environment

Download the included DataForge deployment package and extract it to a directory on your server (we recommend the name “dataforge”).

Rename the included template.env file to .env; this file contains the most important configuration parameters.
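The steps above can be sketched as a short shell session. The archive name is a placeholder for the package you actually received:

```shell
# The package name below is an example -- substitute the file you downloaded.
PACKAGE=dataforge-deployment.tar.gz

mkdir -p dataforge                      # recommended directory name
if [ -f "$PACKAGE" ]; then
    tar -xzf "$PACKAGE" -C dataforge    # unpack the deployment package
fi
cd dataforge
if [ -f template.env ]; then
    mv template.env .env                # the template becomes the live configuration
fi
```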

Edit this file with an editor of your choice (for example nano or vim) and replace YOUR_DATAFORGE_VERSION at the top of the file with the version of DataForge that you want to install.

The .env file also contains placeholders ([REPLACE_ME]) that should be replaced by random secrets. For your convenience, we provide a gen_passwords.sh file which will automatically generate new secrets for your DataForge installation. Simply run the command bash ./gen_passwords.sh to generate random secrets.
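If you prefer to generate the secrets yourself, or gen_passwords.sh is not usable on your system, any sufficiently long random string works. For example, with openssl:

```shell
# Generates one 32-byte random secret, hex-encoded (64 characters).
# Paste the output over a [REPLACE_ME] placeholder in the .env file.
SECRET=$(openssl rand -hex 32)
echo "$SECRET"
```

Repeat this once per placeholder so that every secret is unique.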

Take note of the newly generated DF_SERVER_DEFAULTACCOUNTPASSWORD in the .env file, as you will need it for your first login to your DataForge server.

Secrets and some key configuration options are managed using a .env file.

# Set your DataForge version in the line below (For example 7.6.13)
DF_VERSION=YOUR_DATAFORGE_VERSION
# The docker registry the DataForge images are pulled from
DF_DOCKER_REGISTRY_PATH=images.intellitrend.de/dataforge/dataforge-core

# The timezone of your DataForge server
TIMEZONE=Europe/Berlin

# The database (root) credentials
DB_DATABASE=dataforge
DB_PASSWORD=[REPLACE_ME]

# The master key used to encrypt database values and sign sessions
MASTER_KEY=[REPLACE_ME]

# The credentials of the DataForge default account (admin@dataforge.loc)
DF_SERVER_DEFAULTACCOUNTENABLED=true
DF_SERVER_DEFAULTACCOUNTPASSWORD=[REPLACE_ME]

# Secrets for authentication on the NSQ message queue
DF_SERVER_NSQSECRET=[REPLACE_ME]
DF_COLLECTOR_NSQSECRET=[REPLACE_ME]
DF_PREPROCESSOR_NSQSECRET=[REPLACE_ME]
DF_RENDERER_NSQSECRET=[REPLACE_ME]
DF_DELIVERER_NSQSECRET=[REPLACE_ME]

3. Installing your license key file

Copy your DataForge license key (from step 1) to a file with the name dataforge.key in the same directory as your docker-compose.yaml.

Your DataForge server will use this file to communicate with one of our redundant license servers and activate the modules in your subscription. For this reason, the DataForge server will need to be able to reach our license servers.
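Assuming your key file is stored at a path of your choosing (the source path below is a placeholder), the copy might look like this:

```shell
# Placeholder path -- use the location of the .key file you received.
LICENSE_SRC=/path/to/your-license.key

if [ -f "$LICENSE_SRC" ]; then
    cp "$LICENSE_SRC" ./dataforge.key
    chmod 600 dataforge.key   # the key identifies your instance; restrict access
fi
```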

4. Review the docker-compose.yaml file

The included docker-compose.yaml file defines how the services will start.

If you have any problems setting up the DataForge server, you may need to modify this file to match your system configuration.

You can consult the official docker-compose file reference or contact our support for further assistance.

The following is the content of the included Docker Compose file (docker-compose.yaml), which deploys all services, including the DataForge server.

services:
    # ------------  DF Frontend  ------------
    df-frontend:
        container_name: df-frontend
        image: ${DF_DOCKER_REGISTRY_PATH}/df-frontend:${DF_VERSION}
        restart: always
        ports:
            - 80:80
            # Uncomment this line if you want to terminate TLS in this container.
            #- 443:443
        volumes:
            - ./df-frontend/data/cfg/df-base.conf:/etc/nginx/conf.d/df-base.conf

            # Uncomment this line if TLS is terminated on an external proxy
            #- ./df-frontend/data/cfg/df-http.conf:/etc/nginx/conf.d/default.conf

            # Uncomment these lines if you want to terminate TLS in this container.
            # You will need to install your TLS certificates in the directory below.
            #- ./df-frontend/data/cfg/df-https.conf:/etc/nginx/conf.d/default.conf
            #- ./df-frontend/data/ssl/:/etc/nginx/ssl/

    # ------------  DF Server  ------------
    df-server:
        image: ${DF_DOCKER_REGISTRY_PATH}/df-server:${DF_VERSION}
        container_name: df-server
        restart: always
        command: ./df-server
        depends_on:
            - mariadb
        environment:
            # API Server Configuration
            DF_SERVER_LOGLEVEL: "info"
            DF_SERVER_GRPCPORT: "8090"
            DF_SERVER_WEBPORT: "8091"
            DF_SERVER_RESTPORT: "8092"
            DF_SERVER_SLOWQUERYTHRESHOLD: "1500"
            DF_SERVER_ENCRYPTIONKEY: $MASTER_KEY
            DF_SERVER_DEFAULTACCOUNTENABLED: $DF_SERVER_DEFAULTACCOUNTENABLED
            DF_SERVER_DEFAULTACCOUNTPASSWORD: $DF_SERVER_DEFAULTACCOUNTPASSWORD
            DF_SERVER_THREADGUARDENABLED: "true"
            DF_SERVER_SESSIONEXPIRY: 3600
            # User Sync
            DF_SERVER_USERSYNCONSTART: true
            DF_SERVER_USERSYNCSCHEDULE: "0 3 * * *" # we will sync with all zabbix servers, every day at 03:00
            # User Cache
            DF_SERVER_USERCACHEEVICTIONTIME: 3600 # users get evicted after one hour of inactivity
            DF_SERVER_USERCACHEHOUSEKEEPERSCHEDULE: "*/15 * * * *" # run housekeeper every 15min
            # HA
            DF_SERVER_ENABLECLUSTERING: "false"
            DF_SERVER_HALEADERLEASELENGTH: 60
            DF_SERVER_HALEADERINITIALLEASELENGTH: 150
            DF_SERVER_HANODELONGPOLL: 30
            DF_SERVER_HANODESHORTPOLL: 15
            # NSQ
            DF_SERVER_NSQDISABLE: "false"
            DF_SERVER_NSQTRANSACTIONTIMEOUT: 600 # 10 minutes maximum transactions between services
            DF_SERVER_SERVICEIDENTITY: "dataforge_server"
            # NSQ connection
            DF_SERVER_NSQCONSUMEPORT: "4161"
            DF_SERVER_NSQCONSUMEADDRESS: "nsqlookupd"
            DF_SERVER_NSQPRODUCEPORT: "4150"
            DF_SERVER_NSQPRODUCEADDRESS: "nsqd"
            # NSQ TLS
            DF_SERVER_NSQTLSENABLE: "false"
            DF_SERVER_NSQTLSCACERTIFICATE: "/usr/local/intellitrend/df/ssl/ca.cert"
            DF_SERVER_NSQTLSSKIPCERTIFICATEVALIDATION: "true"
            # NSQ Authentication
            DF_SERVER_NSQAUTHENABLE: "true"
            DF_SERVER_NSQAUTHPORT: "4180"
            DF_SERVER_NSQSECRET: $DF_SERVER_NSQSECRET
            DF_SERVER_NSQAUTHSERVERSECRET: $DF_SERVER_NSQSECRET
            DF_SERVER_NSQAUTHCOLLECTORSECRET: $DF_COLLECTOR_NSQSECRET
            DF_SERVER_NSQAUTHPREPROCESSORSECRET: $DF_PREPROCESSOR_NSQSECRET
            DF_SERVER_NSQAUTHRENDERERSECRET: $DF_RENDERER_NSQSECRET
            DF_SERVER_NSQAUTHDELIVERERSECRET: $DF_DELIVERER_NSQSECRET
            DF_SERVER_LICENSE: "/usr/local/intellitrend/df/etc/df.key"
            # Zabbix
            DF_SERVER_ZABBIXAPITRACEENABLE: "false"
            # TLS
            DF_SERVER_TLSENABLE: "false"
            DF_SERVER_TLSCERTIFICATEPATH: "/usr/local/intellitrend/df/ssl/df-server.cert"
            DF_SERVER_TLSPRIVATEKEYPATH: "/usr/local/intellitrend/df/ssl/df-server.key"
            # Database
            DF_SERVER_DBPORT: 3306
            DF_SERVER_DBNAME: $DB_DATABASE
            DF_SERVER_DBADDRESS: mariadb
            DF_SERVER_DBUSER: root
            DF_SERVER_DBPASSWORD: $DB_PASSWORD
            DF_SERVER_DBDRIVER: "mysql"
            DF_SERVER_DBTRACEENABLE: "false"
            DF_SERVER_DBRECONNECTDELAY: 15
            # LDAP
            DF_SERVER_LDAPADDRESS: ""
            DF_SERVER_LDAPPORT: ""
            DF_SERVER_LDAPDOMAIN: ""
            DF_SERVER_LDAPTLSENABLE: "false"
            DF_SERVER_LDAPTLSVERIFYPEER: "false"
            DF_SERVER_LDAPTLSCACERTIFICATE: "false"
            # Metrics
            DF_SERVER_METRICSPORT: "8094"
            DF_SERVER_METRICSHISTORYSIZE: 1000
            DF_SERVER_METRICSSTATEFILE: "/usr/local/intellitrend/df/var/metrics/state.json"
        #ports:
        #  - 4180:4180 # NSQ-Auth
        #  - 8090:8090 # gRPC
        #  - 8091:8091 # gRPC-Web (exposed through df-frontend reverse proxy)
        #  - 8092:8092 # REST
        #  - 8094:8094 # Metrics
        volumes:
            # Place your license key here
            - ./dataforge.key:/usr/local/intellitrend/df/etc/df.key:ro
            # A directory used to persist metrics between restarts
            - ./df-server/data/metrics/:/usr/local/intellitrend/df/var/metrics:rw

    # ------------  DF Collector  ------------
    df-collector:
        container_name: df-collector
        image: ${DF_DOCKER_REGISTRY_PATH}/df-collector:${DF_VERSION}
        restart: always
        environment:
            DF_COLLECTOR_LOGLEVEL: "info"
            DF_COLLECTOR_NSQSERVICEIDENTITY: "dataforge_collector"
            DF_COLLECTOR_NSQSECRET: $DF_COLLECTOR_NSQSECRET
            DF_COLLECTOR_NSQCONSUMEPORT: "4161"
            DF_COLLECTOR_NSQCONSUMEADDRESS: "nsqlookupd"
            DF_COLLECTOR_NSQPRODUCEPORT: "4150"
            DF_COLLECTOR_NSQPRODUCEADDRESS: "nsqd"
        depends_on:
            - nsqd

    # ------------  DF Preprocessor  ------------
    df-preprocessor:
        container_name: df-preprocessor
        image: ${DF_DOCKER_REGISTRY_PATH}/df-preprocessor:${DF_VERSION}
        restart: always
        environment:
            DF_PREPROCESSOR_LOGLEVEL: "info"
            DF_PREPROCESSOR_NSQSERVICEIDENTITY: "dataforge_preprocessor"
            DF_PREPROCESSOR_NSQSECRET: $DF_PREPROCESSOR_NSQSECRET
            DF_PREPROCESSOR_NSQCONSUMEPORT: "4161"
            DF_PREPROCESSOR_NSQCONSUMEADDRESS: "nsqlookupd"
            DF_PREPROCESSOR_NSQPRODUCEPORT: "4150"
            DF_PREPROCESSOR_NSQPRODUCEADDRESS: "nsqd"
        depends_on:
            - nsqd

    # ------------  DF Deliverer  ------------
    df-deliverer:
        container_name: df-deliverer
        image: ${DF_DOCKER_REGISTRY_PATH}/df-deliverer:${DF_VERSION}
        restart: always
        environment:
            DF_DELIVERER_LOGLEVEL: "trace"
            DF_DELIVERER_NSQSERVICEIDENTITY: "dataforge_deliverer"
            DF_DELIVERER_NSQCONSUMEPORT: "4161"
            DF_DELIVERER_NSQCONSUMEADDRESS: "nsqlookupd"
            DF_DELIVERER_NSQPRODUCEPORT: "4150"
            DF_DELIVERER_NSQPRODUCEADDRESS: "nsqd"
            DF_DELIVERER_NSQSECRET: $DF_DELIVERER_NSQSECRET
        depends_on:
            - nsqd

    # ------------  DF Renderer  ------------
    df-renderer:
        container_name: df-renderer
        image: ${DF_DOCKER_REGISTRY_PATH}/df-renderer:${DF_VERSION}
        restart: always
        environment:
            DF_RENDERER_LOGLEVEL: "info"
            DF_RENDERER_NSQSERVICEIDENTITY: "dataforge_renderer"
            DF_RENDERER_NSQCONSUMEPORT: "4161"
            DF_RENDERER_NSQCONSUMEADDRESS: "nsqlookupd"
            DF_RENDERER_NSQPRODUCEPORT: "4150"
            DF_RENDERER_NSQPRODUCEADDRESS: "nsqd"
            DF_RENDERER_NSQSECRET: $DF_RENDERER_NSQSECRET
            DF_RENDERER_CHROMEADDRESS: "pdf-worker"
            DF_RENDERER_CHROMEPORT: "9222"
            DF_RENDERER_WEBSERVERADDRESS: "df-renderer"
        depends_on:
            - nsqd

    # ------------  Chrome headless  ------------
    pdf-worker:
        image: chromedp/headless-shell:114.0.5735.199
        container_name: pdf-worker
        restart: always

    # -----------  MariaDB database  -----------
    mariadb:
        image: mariadb
        restart: always
        container_name: mariadb
        command: mariadbd --character-set-server=utf8mb4 --collation-server=utf8mb4_unicode_ci
        environment:
            MYSQL_DATABASE: $DB_DATABASE
            MYSQL_ROOT_PASSWORD: $DB_PASSWORD
        volumes:
            - ./mariadb/data/var/lib/mysql:/var/lib/mysql:rw
        #ports:
        #    - "3306:3306"

    # ------------  NSQLookupd  ------------
    nsqlookupd:
        image: nsqio/nsq
        container_name: nsqlookupd
        restart: always
        command: /nsqlookupd
        #ports:
        #    - 4161:4161

    # ------------  NSQD  ------------
    nsqd:
        image: nsqio/nsq
        container_name: nsqd
        restart: always
        command: /nsqd --lookupd-tcp-address=nsqlookupd:4160 --auth-http-address=df-server:4180 --mem-queue-size=1000000 --max-msg-size=419430400 --data-path /nsqd-data --broadcast-address=nsqd
        depends_on:
            - nsqlookupd
        volumes:
            - ./nsqd/data/nsqd-data:/nsqd-data
        #ports:
        #    - 4150:4150

    # ------------  SeaweedFS / S3  ------------
    s3:
        image: chrislusf/seaweedfs
        container_name: s3
        restart: always
        #ports:
        #    - 8333:8333 # S3 API HTTP
        #    - 9327:9327 # Metrics port (must be enabled in command)
        command: 'server -s3 -s3.config=/seaweed-config.json -master.volumePreallocate=false -ip.bind=0.0.0.0' # -metricsPort=9327
        volumes:
            - ./seaweed/data:/data:rw
            - ./seaweed-config.json:/seaweed-config.json:ro

5. Configure TLS / reverse proxy

The df-frontend container includes an nginx server which serves the DataForge frontend to browsers and also forwards any requests at /service.intellimon to the gRPC-Web API of the df-server container.

By default, the df-frontend container listens on port 80 (unprotected HTTP). Security features in modern browsers require that the DataForge frontend and the gRPC-Web API of the DataForge server are reachable from your browser using HTTPS.

You can either configure an external reverse proxy or you can set up the nginx server in the frontend container to serve HTTPS.

5.1. Using an external reverse proxy

If you have an existing reverse proxy handling HTTPS termination, edit the docker-compose.yaml file as follows:

  • In the df-frontend service definition, uncomment the bind mount for the ./df-frontend/data/cfg/df-http.conf config file
  • Make sure the df-https.conf config file is not included
  • The df-frontend service definition in your docker-compose.yaml file should now look like this (with additional lines omitted at [...]):
    df-frontend:
        # [...]
        ports:
            - 80:80
        volumes:
            - ./df-frontend/data/cfg/df-base.conf:/etc/nginx/conf.d/df-base.conf
            - ./df-frontend/data/cfg/df-http.conf:/etc/nginx/conf.d/default.conf
    

Now you can configure your external reverse proxy to forward requests for a subdomain of your choice to port 80 of the df-frontend container.

Most common reverse proxies (such as Traefik, nginx, or Apache) can be used for this purpose.

For configuration instructions, please consult the documentation of your specific reverse proxy software. The following endpoints need to be configured:

  • Any URLs at https://yoursubdomain/ need to point to the HTTP or HTTPS endpoint of the df-frontend container.
  • The reverse proxy must allow websocket upgrades for any URLs starting with https://yoursubdomain/service.intellimon/Stream*, where Stream* means any method that starts with Stream.
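As an illustration only (the server name, upstream address, and certificate paths below are placeholders, and your proxy software may differ), a matching nginx server block on the external proxy could look like this:

```nginx
# Hypothetical external nginx proxy -- all names and paths are examples.
server {
    listen 443 ssl;
    server_name dataforge.example.org;

    ssl_certificate     /etc/nginx/ssl/dataforge.cert;
    ssl_certificate_key /etc/nginx/ssl/dataforge.key;

    location / {
        proxy_pass http://df-frontend:80;
        proxy_set_header Host $host;
        # Allow websocket upgrades, required for the Stream* gRPC-Web methods
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```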

5.2. Serve HTTPS using included nginx

If you have a TLS certificate, you can configure the df-frontend container to serve HTTPS.

To configure this nginx server to serve HTTPS, simply follow these steps:

  1. Create a new directory df-frontend/data/ssl.
  2. Rename your HTTPS/TLS certificate and private key to df-frontend.cert and df-frontend.key, respectively, and place both files inside the newly created df-frontend/data/ssl/ directory.
  3. Edit the docker-compose.yaml file:
    • In the df-frontend service definition, uncomment the bind mount for the ./df-frontend/data/cfg/df-https.conf config file
    • Make sure the df-http.conf config file is not included
    • Uncomment the port mapping 443:443, so the HTTPS service is exposed outside the container. (Port 80 will redirect HTTP clients to HTTPS on port 443)
    • The df-frontend service definition in your docker-compose.yaml file should now look like this (with additional lines omitted at [...]):
      df-frontend:
          # [...]
          ports:
              - 80:80
              - 443:443
          volumes:
              - ./df-frontend/data/cfg/df-base.conf:/etc/nginx/conf.d/df-base.conf
              - ./df-frontend/data/cfg/df-https.conf:/etc/nginx/conf.d/default.conf
      
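Steps 1 and 2 above can be sketched as follows; the source paths for your certificate and key are placeholders:

```shell
# Source paths are placeholders -- point them at your actual certificate files.
CERT_SRC=/path/to/your.cert
KEY_SRC=/path/to/your.key

mkdir -p df-frontend/data/ssl
if [ -f "$CERT_SRC" ] && [ -f "$KEY_SRC" ]; then
    cp "$CERT_SRC" df-frontend/data/ssl/df-frontend.cert
    cp "$KEY_SRC"  df-frontend/data/ssl/df-frontend.key
    chmod 600 df-frontend/data/ssl/df-frontend.key   # keep the private key private
fi
```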

6. Run services

First, sign in to the Docker image registry (docker login images.intellitrend.de) using the credentials that you received when purchasing DataForge.

Start all services by running docker compose up -d.

You can now access the DataForge frontend using the configured port and protocol.

You can check the server logs for errors using docker compose logs. Optionally, you can specify a single service, for example docker compose logs df-server.

If you encounter any problems or need help with your configuration, please contact us. We’re happy to help you!

7. Configure S3 bucket

If you would like to use the reporting and DataForge AI functionality, you need to configure an S3 bucket in DataForge.

While you can use any S3-compatible object storage, the default docker-compose file includes a SeaweedFS container. The credentials for the S3 API are configured in the seaweed-config.json file. If you ran the ./gen_passwords.sh script, a secure secret key has already been generated; otherwise, you need to configure a secure password here manually.

When configuring the S3 storage for your Zabbix server in DataForge, use s3:8333 (the Docker hostname and port) as the object storage URL, the credentials from the seaweed-config.json file, and a bucket name of your choice. Keep in mind that different Zabbix servers need different bucket names.
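If you want to verify the S3 endpoint from the host, one option (an illustration, not a required step) is the AWS CLI, assuming it is installed and port 8333 has been exposed in the docker-compose file:

```shell
# Credentials below are placeholders -- use the values from seaweed-config.json.
export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key

if command -v aws >/dev/null 2>&1; then
    # List buckets on the local SeaweedFS S3 endpoint (requires port 8333 exposed).
    aws --endpoint-url http://localhost:8333 s3 ls || echo "S3 endpoint not reachable"
fi
```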

If you need to access the S3 bucket from outside of the DataForge instance, we recommend setting up a reverse proxy to forward https://s3.example.org/ to port 8333 on the S3 / SeaweedFS docker container.

Last modified February 25, 2025: (b558ec5b8)