This section discusses our recommendations for deploying a Stardog High Availability Cluster in a production environment.
<details open markdown="block"> <summary> Page Contents </summary> 1. TOC </details>ZooKeeper is a critical component of a Stardog HA cluster, providing metadata persistence1 and coordination services to the nodes in the cluster. If you deploy your own Stardog HA Cluster, you must fully understand how to configure and maintain a ZooKeeper ensemble.
See the ZooKeeper documentation for more details. In particular, understanding ZooKeeper's system requirements is critical.
As a starting point for provisioning a server to host a ZooKeeper node, the Required Software section of the Administrator's Guide states: "At Yahoo!, ZooKeeper is usually deployed on dedicated RHEL boxes, with dual-core processors, 2GB of RAM, and 80GB IDE hard drives."
Whether you run the ZooKeeper ensemble for your Stardog HA cluster on physical hardware or in a virtualized environment such as Amazon Web Services (AWS), we recommend the following:
In addition to a ZooKeeper ensemble, a Stardog HA cluster consists of one or more Stardog server nodes (a minimum of three is recommended). A client typically accesses the cluster via a load balancer. See the "Stardog Cluster Architecture" section of the Tuning Cluster for Cloud blog post for a good overview of a Stardog HA cluster.
Many of the recommendations for configuring the Stardog server nodes in a cluster are similar to those for ZooKeeper. We provide a Capacity Planning doc page to assist with provisioning CPU, memory, and disk. Note however that these recommendations are for a stand-alone Stardog server. When provisioning a host for a Stardog cluster node, increase the provisioning requirements by 25-50% to account for the overhead of a node's cluster membership participation.
For guidance on provisioning a VM to host a Stardog server node in the cloud, see the "AWS specifics" section of the previously-mentioned blog post.
Whether you run the Stardog server nodes for your cluster on physical hardware or in a virtualized environment, we recommend the following:
STARDOG_HOME volume.ZooKeeper is used to persist the latest Stardog transaction ID, and it is used to store metadata about the Stardog cluster. However, ZooKeeper is not used to store any Stardog database data, such as the contents of a transaction. ↩
The Required Software section of the Administrator's Guide states: "Three ZooKeeper servers is the minimum recommended size for an ensemble, and we also recommend that they run on separate machines." ↩
The Single Machine Requirements section of the ZooKeeper Administrator's Guide states: "ZooKeeper's transaction log must be on a dedicated device. (A dedicated partition is not enough.)" ↩
Some cloud and/or container environments offer varying QoS guarantees for their disks, which can impact ZooKeeper's performance as well as overall cluster reliability. Specifically, if ZooKeeper is not able to persist information to disk, across all of its nodes, within its timeout limits, then sessions may be dropped, and Stardog nodes may be expelled from the cluster.
For example, on AWS, you can use general purpose volumes (gp2 or gp3) for a lightly loaded ZooKeeper cluster, but for production use, we recommend using provisioned IOPS volumes (io1 or io2) so that credit-limiting does not increase disk latency. If you decide to deploy your ZooKeeper cluster with general purpose disk volumes, be sure to measure the cluster's performance under expected workloads. See the "AWS specifics" section of the Tuning Cluster for Cloud blog post for more information. ↩
The number of Stardog nodes is not required to be an odd number, since we use ZooKeeper to manage the decisions about cluster membership. The number of Stardog nodes you deploy depends on your HA requirements (i.e., how many nodes can fail without killing the cluster) and how much client load must be supported. ↩