CRX clustering allows multiple CRX repositories to share the same content. This post covers some of the basics of CRX clustering.
What each directory is for in CRX/CQ
Assumption: You are using CRX 2.2 or higher.
Courtesy: Thomas Mueller from Adobe Systems
After joining a cluster using the GUI, a new repository directory crx.000x is created. The directory with the highest number is usually the current repository directory. To verify this, open the bootstrap.properties file, which is located in the parent directory of crx-quickstart.
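The "highest number wins" rule above can be checked with a few lines of Python. This is only a minimal sketch: the function name latest_repository_dir is made up for illustration, the directory pattern is assumed from the crx.000x naming mentioned above, and you pass in whichever directory holds the crx.000x folders on your installation. It does not read bootstrap.properties, so treat its answer as a hint and still verify against that file.

```python
import re
from pathlib import Path
from typing import Optional

# Directory names such as crx.0001 (pattern assumed from the "crx.000x" naming).
CRX_DIR = re.compile(r"crx\.(\d+)")


def latest_repository_dir(parent: Path) -> Optional[Path]:
    """Return the crx.NNNN directory with the highest number, or None if none exists."""
    candidates = []
    for entry in parent.iterdir():
        match = CRX_DIR.fullmatch(entry.name)
        # Ignore plain files and directories that do not follow the naming scheme.
        if match and entry.is_dir():
            candidates.append((int(match.group(1)), entry))
    if not candidates:
        return None
    # Compare by the numeric suffix, not by the string, so crx.0010 beats crx.0009.
    return max(candidates, key=lambda pair: pair[0])[1]
```

For example, latest_repository_dir(Path("/opt/cq")) would return the newest crx.000x directory under the hypothetical install location /opt/cq.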
Note: If you later want to switch back to a standalone instance, stop CQ, delete the old crx-quickstart directory and the bootstrap.properties file, rename crx.000x to crx-quickstart, and start the instance. (I have not tested this, but in theory it should work.)
1.1) /datastore Large binaries are stored under the /datastore folder. There is a setting for this (minRecordLength) in the /crx-quickstart/repository/repository.xml file.
With the value used here, anything bigger than 4096 bytes (4 KB) is stored in the datastore (of course, you can change this number).
Please note that the datastore, like the tar files, is append-only: deleting content does not remove the data from the datastore.
To actually reclaim that space, you have to run datastore garbage collection regularly.
More information about datastore is found here
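As a rough illustration of the size rule described above, the sketch below decides where a binary of a given size would end up. The function name goes_to_datastore is hypothetical, and the comparison follows this article's wording ("bigger than 4096 bytes"); how a binary of exactly 4096 bytes is handled is an assumption here, not something taken from the CRX documentation.

```python
# Threshold in bytes; 4096 is the value quoted in this article and can be changed.
MIN_RECORD_LENGTH = 4096


def goes_to_datastore(size_in_bytes: int, min_record_length: int = MIN_RECORD_LENGTH) -> bool:
    """Apply the article's rule: binaries bigger than the threshold go to the datastore.

    Assumption: a binary of exactly min_record_length bytes is treated as small.
    """
    return size_in_bytes > min_record_length


if __name__ == "__main__":
    # Made-up sizes, in bytes, just to show the split.
    for size in (512, 4096, 4097, 1_000_000):
        target = "datastore" if goes_to_datastore(size) else "tar files"
        print(f"{size} bytes: {target}")
```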
1.2) /index Stores the Lucene indexes for versions.
1.3) /meta Stores the root ID of CRX, which is cafebabe-cafe-babe-cafe-babecafebabe.
1.4) /namespaces and /nodetypes Store all registered namespaces and node types. They can be accessed through /crx using the node type administration URL.
This folder stores all version history in data*tar files. The corresponding indexes are stored in index*tar files. In crx-explorer you can find these entries under the /jcr:system/jcr:versionStorage node. It is a good idea to enable version purging so that this folder does not grow unnecessarily.
3.1) /index This folder stores the Lucene search index for the corresponding workspace. It is used for full-text searching within CRX/CQ. More information can be found here, and details about indexing configuration can be found here.
3.2) data*tar All data smaller than minRecordLength, except version information.
3.3) index*tar Index for the workspace data tar files.
Information about how the tar persistence manager works can be found here
4.1) data*tar Contains journal information about repository changes. This is used to ensure data consistency across the cluster. It also helps with crash recovery and cluster synchronization.
4.2) index*tar Contains the tar journal index.
5) /crx-quickstart/repository/cluster_node.id
This file contains the cluster node id (unique for each cluster node). This file is automatically created by the system. By default it contains a randomly generated UUID, but it can be any name. When copying a cluster node, this file should not be copied (if two cluster nodes contain the same cluster node id, only the first cluster node can connect).
08d434b1-5eaf-4b1c-b32f-e9abedf05f23
6) /crx-quickstart/repository/cluster.properties
This file contains the cluster configuration. The file is automatically updated by the system if the cluster configuration is changed in the GUI.
cluster_id=86cab8df-3aeb-4985-8eb5-dcc1dffb8e10
addresses=10.0.2.2,10.0.2.3
members=08d434b1-5eaf-4b1c-b32f-e9abedf05f23,fd11448b-a78d-4ad1-b1ae-ec967847ce94
The cluster_id property contains the cluster id, which must be the same for all cluster nodes that participate in this cluster. By default this is a randomly generated UUID, but it can be any name.
The addresses property contains a comma-separated list of the IP addresses of all cluster nodes. This list is used at the startup of each cluster node to connect to the other nodes that are already running. The list is not needed if all cluster nodes are running on the same computer (which may be interesting for some use cases, such as testing).
The members property contains a comma-separated list of the cluster node ids that participate in the cluster. This property is not required for the cluster to work; it is for informational purposes only.
7) /crx-quickstart/repository/clustered.txt