From 92459fe8b63c431fe6570b050a2bc0d3dfcc8db2 Mon Sep 17 00:00:00 2001
From: MikeBirnstiehl <michael.birnstiehl@sonarsource.com>
Date: Wed, 26 Jun 2019 14:59:26 +0200
Subject: [PATCH] SONAR-12212 add ES read-only recovery doc

---
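
Note on the thresholds documented in this patch: the troubleshooting text describes log warnings at 85% and 90% disk usage and read-only indices at 95%. The sketch below is one way to check a volume against those three values. It is an illustration only, not part of SonarQube or Elasticsearch: the function names and the `"."` path are hypothetical, and `shutil.disk_usage` may not compute usage exactly as ES does.

```python
import shutil

# Thresholds as stated in the troubleshooting text of this patch:
# warnings at 85% and 90% disk usage, read-only indices at 95%.
WARN_LOW = 85.0
WARN_HIGH = 90.0
READ_ONLY = 95.0


def classify_usage(used_percent: float) -> str:
    """Map a disk usage percentage to the state described in the doc."""
    if used_percent >= READ_ONLY:
        return "read-only"
    if used_percent >= WARN_HIGH:
        return "warning-90"
    if used_percent >= WARN_LOW:
        return "warning-85"
    return "ok"


def disk_used_percent(path: str) -> float:
    """Return the used share of the volume containing `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    # "." is a placeholder; point it at the volume holding the ES data directory.
    pct = disk_used_percent(".")
    print(f"{pct:.1f}% used: {classify_usage(pct)}")
```

Running such a check before restarting would confirm that enough space has been freed, since the doc states that freeing space alone does not return the indices to read-write.
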
 .../requirements/hardware-recommendations.md  |  6 ++--
 .../src/pages/setup/troubleshooting.md        | 32 +++++++++++++++++++
 2 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/server/sonar-docs/src/pages/requirements/hardware-recommendations.md b/server/sonar-docs/src/pages/requirements/hardware-recommendations.md
index a57e2a03f6e..2ab386910ea 100644
--- a/server/sonar-docs/src/pages/requirements/hardware-recommendations.md
+++ b/server/sonar-docs/src/pages/requirements/hardware-recommendations.md
@@ -13,15 +13,15 @@ In case your SonarQube Server is running on Linux and you are using Oracle, the
  -Djava.security.egd=file:///dev/urandom
  ```
 
- ## Elasticsearch (aka ES)
+ ## Elasticsearch (ES)
 * [Elasticsearch](https://www.elastic.co/) is used by SonarQube in the background in the SearchServer process. To ensure good performance of your SonarQube, you need to follow these recommendations that are linked to ES usage.
 
 ## JVM
 * It is recommended to set the min and max memory to the same value to prevent the heap from resizing at runtime, a very costly process. See -Xms and  -Xmx of property `sonar.search.javaOpts`.
 
 ## Disk
-* Free disk space is an absolute requirement. ES will lock indices in read-only mode if free disk space drops below 15%. 
-* Disk can easily become the bottleneck of ES. If you can afford SSDs, they are by far superior to any spinning media. SSD-backed nodes see boosts in both query and indexing performance. If you use spinning media, try to obtain the fastest disks possible (high performance server disks 15k RPM drives).
+* Free disk space is an absolute requirement. To prevent the disk from being flooded with index data, ES implements a safety mechanism that locks all indices in read-only mode when the 95% disk usage watermark is reached. For information on recovering from read-only ES indices, see the [Troubleshooting](/setup/troubleshooting/) page.
+* Disk can easily become the bottleneck of ES. If you can afford SSDs, they are by far superior to any spinning media. SSD-backed nodes see boosts in both query and indexing performance. If you use spinning media, try to obtain the fastest disks possible (high-performance 15k RPM server drives).
 * Make sure to increase the number of open files descriptors on the machine (or for the user running SonarQube server). Setting it to 32k or even 64k is recommended. See [this ElasticSearch article](https://www.elastic.co/guide/en/elasticsearch/reference/current/file-descriptors.html).
 * Using RAID 0 is an effective way to increase disk speed, for both spinning disks and SSD. There is no need to use mirroring or parity variants of RAID because of Elasticsearch replicas and database primary storage.
 8 Do not use remote-mounted storage, such as NFS, SMB/CIFS or network-attached storages (NAS). They are often slower, display larger latencies with a wider deviation in average latency, and are a single point of failure.
diff --git a/server/sonar-docs/src/pages/setup/troubleshooting.md b/server/sonar-docs/src/pages/setup/troubleshooting.md
index 2c849fbb91b..4b82fd19689 100644
--- a/server/sonar-docs/src/pages/setup/troubleshooting.md
+++ b/server/sonar-docs/src/pages/setup/troubleshooting.md
@@ -40,4 +40,36 @@ Unless you wrote the code that produced this error, you really only care about:
 * the first line, which ought to have a human-readable message after the colon. In this case, it's Unable to blame file `**/**/foo.java`
 * and any line that starts with `Caused by:`. There are often several `Caused by` lines, and indentation makes them easy to find as you scroll through the error. Be sure to read each of these lines. Very often one of them - the last one or next to last one - contains the real problem.
 
+## Recovering from Elasticsearch read-only indices
+
+You may encounter issues with Elasticsearch (ES) indices becoming locked in read-only mode. ES requires free disk space and, to prevent the disk from being flooded with index data, implements a safety mechanism that:
+
+* **For non-DCE** - locks all indices in read-only mode when the 95% disk usage watermark is reached.  
+* **For DCE** - locks all or some indices in read-only mode when one or more nodes reach the 95% disk usage watermark.
+
+ES logs warnings when disk usage reaches 85% and again at 90%. At 95% usage and above, the indices turn read-only, which causes errors in the web and compute engine processes.
+
+Freeing disk space will *not* automatically make the indices return to read-write. To make indices read-write, you also need to:
+
+* **For non-DCE** - restart SonarQube.
+* **For DCE** - restart *ALL* application nodes (the first application node restarted after all have been stopped will make the indices read-write).  
+
+SonarQube's built-in resilience mechanism allows it to eventually recover when the indices lag behind the data in the DB (this process can take a while).
+
+If you still have inconsistencies, you'll need to rebuild the indices (this operation can take a long time depending on the number of issues and components):
+
+**non-DCE:**  
+
+1. Stop SonarQube  
+1. Delete the `data/es6` directory  
+1. Restart SonarQube  
+
+**DCE:**  
+
+1. Stop the whole cluster (ES and application nodes)  
+1. Delete the `data/es6` directory on each ES node  
+1. Restart the whole cluster  
+    
+**Note:** See [Configure & Operate a Cluster](/setup/operate-cluster/) for information on stopping and starting a cluster.
+
 <!-- /sonarqube -->
-- 
2.39.5