Elasticsearch yml example


In the YML Configuration section of the Cluster Configuration page of your Alibaba Cloud Elasticsearch cluster, you can enable the Auto Indexing, Audit Log Indexing, or Watcher feature. You can also specify Index Deletion and Other Configurations. This topic describes how to configure the following items: parameters in the YML file of the cluster, cross-origin resource sharing (CORS), a remote reindex whitelist, the Audit Log Indexing feature, and queue sizes.

Precautions

The network architecture of Alibaba Cloud Elasticsearch was adjusted in October 2020. Due to this adjustment, you cannot use the reindex API to migrate data between clusters in some scenarios. For more information, see the precautions described in Migrate data from a self-managed Elasticsearch cluster to an Alibaba Cloud Elasticsearch cluster deployed in the new network architecture.

Note The network architecture in the China (Zhangjiakou) region and the regions outside China was adjusted before October 2020. If you want to perform operations between a cluster that is created before October 2020 and a cluster that is created in October 2020 or later in the China (Zhangjiakou) region or a region outside China, submit a ticket to contact technical support personnel to check whether the network architecture supports the operations.

Configure the parameters in the YML file

  1. Log on to the Elasticsearch console.
  2. In the left-side navigation pane, click Elasticsearch Clusters.
  3. Navigate to the desired cluster.
    1. In the top navigation bar, select the resource group to which the cluster belongs and the region where the cluster resides.
    2. In the left-side navigation pane, click Elasticsearch Clusters. On the Elasticsearch Clusters page, find the cluster and click its ID.
  4. In the left-side navigation pane of the page that appears, click Cluster Configuration.
  5. On the Cluster Configuration page, click Modify Configuration on the right side of YML Configuration.
  6. In the YML File Configuration panel, configure the following parameters.
    Elasticsearch YML file configuration
    Auto Indexing: Specifies whether to automatically create an index if a new document is uploaded to the Elasticsearch cluster but no index exists. We recommend that you disable Auto Indexing because the indexes that are automatically created may not meet your business requirements.

    This parameter corresponds to the action.auto_create_index configuration item in the YML file. The default value of this configuration item is false.

    Index Deletion: Specifies whether the index name must be provided when you delete an index. If you set this parameter to Allow Wildcards, you can use wildcards to delete multiple indexes at a time. Deleted indexes cannot be recovered; exercise caution when you configure this parameter.

    This parameter corresponds to the action.destructive_requires_name configuration item in the YML file. The default value of this configuration item is true.

    Audit Log Indexing: If you enable Audit Log Indexing, the system generates audit logs for the create, delete, modify, and search operations that are performed in the Elasticsearch cluster. These logs consume disk space and affect cluster performance. Therefore, we recommend that you disable Audit Log Indexing and exercise caution when you configure this parameter. For more information about the parameters related to the Audit Log Indexing feature, see Configure the Audit Log Indexing feature.

    Notice This parameter is unavailable for Elasticsearch clusters of V7.0 or later.

    This parameter corresponds to the xpack.security.audit.enabled configuration item in the YML file. The default value of this configuration item is false.

    Watcher: If you enable Watcher, you can use the X-Pack Watcher feature. You must clear the .watcher-history* index on a regular basis to free up disk space.

    This parameter corresponds to the xpack.watcher.enabled configuration item in the YML file. The default value of this configuration item is false.

    Other Configurations: The following list describes some of the supported configuration items; an example snippet follows the list. Unless otherwise specified, these items are available for Elasticsearch V5.X, V6.X, and V7.X clusters.
    • Configure CORS
      • http.cors.enabled
      • http.cors.allow-origin
      • http.cors.max-age
      • http.cors.allow-methods
      • http.cors.allow-headers
      • http.cors.allow-credentials
    • Configure a remote reindex whitelist

      reindex.remote.whitelist

    • Configure the Audit Log Indexing feature

      Notice Elasticsearch clusters of V7.0 or later do not support the Audit Log Indexing feature.

      • xpack.security.audit.enabled
      • xpack.security.audit.index.bulk_size
      • xpack.security.audit.index.flush_interval
      • xpack.security.audit.index.rollover
      • xpack.security.audit.index.events.include
      • xpack.security.audit.index.events.exclude
      • xpack.security.audit.index.events.emit_request_body
    • Configure queue sizes
      • thread_pool.bulk.queue_size (available for Elasticsearch V5.X clusters)
      • thread_pool.write.queue_size (available for Elasticsearch V6.X and V7.X clusters)
      • thread_pool.search.queue_size
    • Configure a custom SQL plug-in

      xpack.sql.enabled

      By default, Elasticsearch clusters use the built-in SQL plug-in provided by X-Pack. If you want to upload a custom SQL plug-in to your Elasticsearch cluster, set xpack.sql.enabled to false.
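    For orientation only, here is a hedged sketch of how a few of these items might be entered in the YML Configuration box; every value, host, and queue size below is a hypothetical example rather than a recommendation, so confirm the exact syntax against the Alibaba Cloud and Elasticsearch documentation:

      http.cors.enabled: true
      http.cors.allow-origin: "*"
      http.cors.allow-methods: OPTIONS,HEAD,GET,POST,PUT,DELETE
      reindex.remote.whitelist: ["10.0.xx.xx:9200", "10.0.xx.xx:9250"]
      thread_pool.write.queue_size: 500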

    Warning
    • Before you configure the YML file of your Elasticsearch cluster, you must make sure that the cluster is in a normal state. After you configure the YML file, the system restarts the cluster. The time required for the restart depends on the size, data volume, and load of the cluster. We recommend that you configure the YML file during off-peak hours.
    • In most cases, if the indexes of your cluster have replica shards and the load of your cluster is normal, your cluster can still provide services during a cluster modification. The following metrics indicate that the load of a cluster is normal: the CPU utilization of the cluster is no higher than about 60%, the heap memory usage is no higher than about 50%, and the value of NodeLoad_1m is less than the number of vCPUs of the cluster.

Source: https://www.alibabacloud.com/

What is YAML? YAML is a readable data serialization language used frequently in configuration files for software. Oddly enough, the name is a recursive acronym: “YAML Ain’t Markup Language.” This article will show you samples of YAML files (written .yml or .yaml) for the ELK Stack and other programs DevOps teams commonly use. And while some people love YAML and some hate it, it’s not going away.

This article is for anyone looking to quickly configure their entire ELK Stack immediately after installing it. It’ll provide basic YAML configurations, note when to uncomment lines, and cover advanced configurations (including .config files, which might be in YAML or JSON). Please note, some files are .yml and others are .yaml. This is not a misprint; pay attention to these details.

Introduction to YAML

YAML is actually a superset of JSON, but was created to be more legible than other markup languages—specifically XML. It also supports features that JSON lacks, such as comments, anchors and aliases, and richer scalar types. Put another way, it works well for data serialization, something more advanced than markup (and yes, even XML does data serialization). To quote yaml.org, it is “designed around the common native data structures of agile programming languages.”

JSON won out as the lingua franca of APIs and technical docs even though JSON and YAML were both designed with data serialization in mind. YAML is better for configuration because it allows for directions—use a # to write a non-code comment in the file to tell people configuring files exactly what to do. Many YAML templates actually function based on these comments. All you have to do—sometimes—is “uncomment lines” to make configurations work at their most basic level.

As a superset, YAML can be handled by a YAML parser even when the input is JSON. Again, YAML supports more advanced datatypes, such as embedded block literals, and can be self-referential through anchors and aliases. You can also convert JSON to YAML and YAML to JSON with free online tools. YAML is also critical for log pipelines, hence its importance for companies like Logz.io and open source tools like ELK.
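To make the superset relationship concrete, here is the same small configuration written both ways (the values are arbitrary); any YAML parser will accept the JSON form unchanged:

# YAML
server:
  port: 9200
  hosts:
    - localhost
    - example.com    # comments like this are something JSON cannot express

# Equivalent JSON, which is also valid YAML
{"server": {"port": 9200, "hosts": ["localhost", "example.com"]}}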

YAML and Kubernetes

Why is YAML so important to Kubernetes?

Kubernetes is incredibly complex. YAML affords a lot of advantages for such a system, including YAML’s declarative traits. All Kubernetes resources are created by declaring them within YAML files, which saves a lot of time when scaling Kubernetes. Within a Kubernetes YAML configuration file, you declare the number of replica pods you want, and Kubernetes creates them automatically once the file is applied. You can also define deployment strategies for new containers and pods, pod limits, labels, and filters that target specific pods, called “selectors.”
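A minimal sketch of such a declaration, assuming hypothetical names and a hypothetical container image:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-deployment        # hypothetical name
spec:
  replicas: 3                 # Kubernetes keeps three pods running at all times
  selector:
    matchLabels:
      app: web                # the selector targets pods carrying this label
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.21     # hypothetical image

Changing replicas and re-applying the file is all it takes to scale the workload up or down.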

YAML Files in the ELK Stack

Of course, you might also be here because you are trying to keep your YAML configurations straight specifically for the ELK Stack (or another monitoring tool, whether or not related to Docker and/or Kubernetes).

The file—like similar files in the ELK Stack and Beats—will be by default located in different places depending on the way you install ELK. In general, this is where you will find them:

Linux:

/etc/elasticsearch/elasticsearch.yml
/etc/kibana/kibana.yml
/etc/filebeat/filebeat.yml
/etc/metricbeat/metricbeat.yml

Homebrew (Mac):

/usr/local/etc/elasticsearch/elasticsearch.yml
/usr/local/etc/kibana/kibana.yml
/usr/local/etc/filebeat/filebeat.yml
/usr/local/etc/metricbeat/metricbeat.yml

Docker:

/usr/share/elasticsearch/elasticsearch.yml
/usr/share/kibana/kibana.yml
/usr/share/filebeat/filebeat.yml
/usr/share/metricbeat/metricbeat.yml

Configure the elasticsearch.yml File

To configure Elasticsearch, open the file:

sudo vim /etc/elasticsearch/elasticsearch.yml

Hit i in order to edit the file. You will want to uncomment lines for the following fields (a filled-in example follows the list):

  1. cluster.name
  2. node.name
  3. path.data
  4. path.logs
  5. network.host #Depending on your situation, this should usually be 127.0.0.1 or 0.0.0.0
  6. http.port (when you uncomment this, make sure it is already set to 9200; otherwise, set it yourself and type it in)
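Once uncommented, the lines might read as follows; the names are examples and the paths shown are the usual defaults for a Linux package install, so adjust them to your environment:

cluster.name: my-application
node.name: node-1
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
network.host: 127.0.0.1
http.port: 9200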

Press esc and then type :wq to save and exit the file in one step.

Advanced YAML: Elasticsearch Cluster Configuration

You will need more advanced settings for Elasticsearch clusters, including disabling the swapping of unused memory.

bootstrap.memory_lock: true   # Elasticsearch 5.0 and later
bootstrap.mlockall: true      # pre-5.0 releases

And, in /etc/default/elasticsearch (for package-based installs):

MAX_LOCKED_MEMORY=unlimited

Configure the kibana.yml File

Besides the default file locations mentioned above, if you installed Kibana from a tar.gz or .zip distribution, look for the file in KIBANA_HOME/config.

Uncomment the following lines and/or make sure the settings match:

server.port: 5601
elasticsearch.url: "http://localhost:9200"

An alternative example, binding Kibana to a specific address, might look like this (note that the address belongs in server.host; server.port takes only the port number):

server.host: "127.0.0.1"
server.port: 5601
elasticsearch.url: "http://elasticsearch:9200"

In general, Elasticsearch should be located at localhost:9200 in all ELK Stack configuration files for system-hosted ELK, unless of course you have a different location. (In Kibana 7.0 and later, the elasticsearch.url setting was replaced by elasticsearch.hosts.)

Intermediate/Advanced Kibana configurations

You can point to an X.509 server certificate and its private key using several options. These settings exist both for the connection to Elasticsearch and for Kibana itself, and both sets live in the same kibana.yml configuration file.

Kibana uses all of these options to validate certificates and create a chain of trust with SSL/TLS connections from end users coming into Kibana.

For Kibana:

server.ssl.keystore.path:

For Elasticsearch:

elasticsearch.ssl.keystore.path:

Alternatively, the server.ssl.certificate and server.ssl.key settings can be used. The two function as an alternative pair: they cannot be used in conjunction with the server.ssl.keystore.path configuration.

For Kibana:

server.ssl.certificate: server.ssl.key:

For Elasticsearch:

elasticsearch.ssl.certificate: elasticsearch.ssl.key:

Elasticsearch or Kibana will use these chains, respectively, when PKI authentication is active.

Additionally, if the private keys are encrypted, you can supply the passphrases used to decrypt the respective ssl.key files:

For Kibana:

server.ssl.keyPassphrase:

For Elasticsearch:

elasticsearch.ssl.keyPassphrase:
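Putting the pieces together, here is a hedged sketch of the relevant kibana.yml section; every path is a placeholder, and server.ssl.enabled is included on the assumption that you want TLS switched on for incoming connections:

# TLS for browser connections into Kibana
server.ssl.enabled: true
server.ssl.certificate: /path/to/kibana-server.crt
server.ssl.key: /path/to/kibana-server.key
#server.ssl.keyPassphrase: "passphrase"    # only if the key above is encrypted

# TLS for Kibana's connection to Elasticsearch
elasticsearch.ssl.certificate: /path/to/kibana-client.crt
elasticsearch.ssl.key: /path/to/kibana-client.key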

Configure the logstash.conf and logstash.yml Files

You will mainly configure Logstash in its .conf file, which uses Logstash’s own configuration syntax rather than YAML. However, the logstash.yml file is still relevant.

The logstash.conf syntax resembles JSON but is not strictly JSON. While this post obviously focuses on YAML configurations, it would be a disservice not to include the basics for the .conf file.

Here is a basic Logstash configuration example for the file’s three main sections: input, filter, and output:

logstash.conf Configuration

input {
  file {
    path => "/var/log/apache2/access.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  geoip {
    source => "clientip"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}

logstash.yml Configuration

Specify Logstash modules:

modules:
  - name: MODULE_NAME1
    var.PLUGIN_TYPE1.PLUGIN_NAME1.KEY1: VALUE
    var.PLUGIN_TYPE1.PLUGIN_NAME1.KEY2: VALUE
    var.PLUGIN_TYPE2.PLUGIN_NAME2.KEY1: VALUE
    var.PLUGIN_TYPE3.PLUGIN_NAME3.KEY1: VALUE
  - name: MODULE_NAME2
    var.PLUGIN_TYPE1.PLUGIN_NAME1.KEY1: VALUE
    var.PLUGIN_TYPE1.PLUGIN_NAME1.KEY2: VALUE
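As a concrete illustration, with the bundled netflow module those placeholders resolve to real plugin settings; the port and hosts values below are examples only:

modules:
  - name: netflow
    var.input.udp.port: 2055
    var.elasticsearch.hosts: "localhost:9200"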

Configure the filebeat.yml Files

Configure filebeat.inputs for type: log. Identify separate paths for each kind of log (Apache2, nginx, MySQL, etc.)

filebeat.inputs:
- type: log
  # Change value to true to activate the input configuration
  enabled: false
  paths:
    - "/var/log/apache2/*"
    - "/var/log/nginx/*"
    - "/var/log/mysql/*"

Then define processors within filebeat.inputs. This example defines the drop_fields processor:

filebeat.inputs:
- type: log
  paths:
    - "/var/log/apache2/access.log"
  fields:
    apache: true
  processors:
  - drop_fields:
      fields: ["verb", "id"]

Then define the Filebeat output. Uncomment or set the output for Elasticsearch or Logstash; note that only one Filebeat output may be enabled at a time:

output.elasticsearch:
  hosts: ["localhost:9200"]

output.logstash:
  hosts: ["localhost:5044"]

Configuring Filebeat on Docker

The most common method to configure Filebeat when running it as a Docker container is by bind-mounting a configuration file when running said container. To do this, create a new file on your host. This example is for a locally hosted version of Docker:

filebeat.inputs:
- type: log
  paths:
    - '/var/lib/docker/containers/*/*.log'
  json.message_key: log
  json.keys_under_root: true
  processors:
  - add_docker_metadata: ~

output.elasticsearch:
  hosts: ["localhost:9200"]
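A typical way to bind-mount that file when launching the container looks like the following; the image tag is only an example and should match your stack version:

docker run -d \
  --name=filebeat \
  --user=root \
  -v "$(pwd)/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro" \
  -v /var/lib/docker/containers:/var/lib/docker/containers:ro \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  docker.elastic.co/beats/filebeat:6.5.4

The extra mounts give Filebeat read access to the container logs and to the Docker socket that the add_docker_metadata processor queries.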

To see further examples of advanced Filebeat configurations, check out our other Filebeat tutorials:

What is Filebeat Autodiscover?

Using the Filebeat Wizard in Logz.io

Musings in YAML—Tips for Configuring Your Beats

Configure the metricbeat.yml File

The metricbeat.yml file will list a number of modules (Apache, system, nginx, etc.). Make sure to identify the module, the metricsets, the interval (period), the processes, the hosts, and enabled: true.

Here is an example configuration of two modules in Metricbeat, one for your system and another for Apache metrics:

metricbeat.modules:
- module: system
  metricsets: ["cpu", "memory", "network"]
  enabled: true
  period: 15s
  processes: ['.*']
- module: apache
  metricsets: ["status"]
  enabled: true
  period: 5s
  hosts: ["http://172.20.11.7"]

If you are setting up a Metricbeat Docker module, it’s advisable to mark the following metricsets:

metricsets: ["container", "cpu", "diskio", "healthcheck", "info", "memory", "network"]
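In context, a hedged sketch of the whole Docker module block might look like this; the socket path is Docker’s default and the period is an arbitrary example:

metricbeat.modules:
- module: docker
  metricsets: ["container", "cpu", "diskio", "healthcheck", "info", "memory", "network"]
  hosts: ["unix:///var/run/docker.sock"]
  period: 10s
  enabled: true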

Metricbeat Output

Just as with Filebeat, uncomment or set the outputs for Elasticsearch or Logstash:

output.elasticsearch:
  hosts: ["localhost:9200"]

output.logstash:
  hosts: ["localhost:5044"]

Be sure to go over our full Metricbeat tutorial.

Source: https://logz.io/blog/configure-yaml-files-elk-stack/

zsprackett/elasticsearch.yml

##################### ElasticSearch Configuration Example #####################
#
# This file contains an overview of various configuration settings,
# targeted at operations staff. Application developers should
# consult the guide at <http://elasticsearch.org/guide>.
#
# ElasticSearch comes with reasonable defaults for most settings, so you can
# try it out without bothering with configuration. Any element in the
# configuration can be replaced with environment variables by placing them
# in ${...} notation. For example:
#
# node.rack: ${RACK_ENV_VAR}

################################### Cluster ###################################
# Cluster name identifies your cluster for auto-discovery. If you're running
# multiple clusters on the same network, make sure you're using unique names.
cluster.name: od-fts1

#################################### Node #####################################
# Node names are generated dynamically on startup, but you can tie this node
# to a specific name:
node.name: "od-fts1a"
# Every node can be configured to allow or deny being eligible as the master,
# and to allow or deny to store the data:
node.master: true
node.data: true
# Common topologies: a data-only "workhorse" (master: false, data: true), a
# master-only "coordinator" (master: true, data: false), or a "search load
# balancer" that is neither (master: false, data: false).
# A node can also carry generic attributes for shard allocation filtering or
# allocation awareness, e.g. node.rack: rack314
# By default, multiple nodes are allowed to start from the same installation
# location; to disable it: node.max_local_storage_nodes: 1

#################################### Index ####################################
# Set the number of shards (splits) of an index (5 by default):
index.number_of_shards: 2
# Set the number of replicas (additional copies) of an index (1 by default):
index.number_of_replicas: 1
# More *shards* enhance indexing performance and distribute a big index
# across machines; more *replicas* enhance search performance and cluster
# availability. "number_of_shards" is a one-time setting for an index, while
# "number_of_replicas" can be changed anytime via the Index Update Settings API.

#################################### Paths ####################################
# path.conf: /path/to/conf
# path.data: /path/to/data (multiple comma-separated locations stripe data
# across them, a la RAID 0, favouring locations with most free space)
# path.work: /path/to/work
# path.logs: /path/to/logs
# path.plugins: /path/to/plugins

#################################### Plugin ###################################
# If a plugin listed here is not installed for current node, the node will not start.
# plugin.mandatory: mapper-attachments,lang-groovy

################################### Memory ####################################
# ElasticSearch performs poorly when JVM starts swapping: ensure that it
# _never_ swaps. Set this property to true to lock the memory:
bootstrap.mlockall: true
# Make sure that ES_MIN_MEM and ES_MAX_MEM are set to the same value, and that
# the process is allowed to lock the memory, eg. by using `ulimit -l unlimited`.

############################## Network And HTTP ###############################
# ElasticSearch binds to 0.0.0.0 by default and listens on ports [9200-9300]
# for HTTP and [9300-9400] for node-to-node communication (if a port is busy,
# it automatically tries the next one).
# network.bind_host: 192.168.0.1
# network.publish_host: 192.168.0.1
# network.host: 192.168.0.1
# transport.tcp.port: 9300
# transport.tcp.compress: true
# http.port: 9200
# http.max_content_length: 100mb
# http.enabled: false

################################### Gateway ###################################
# The gateway persists the cluster state between full cluster restarts.
# The default gateway type is the "local" gateway (recommended):
# gateway.type: local
# Allow recovery process after N nodes in a cluster are up:
gateway.recover_after_nodes: 1
# Set the timeout to initiate the recovery process, once the N nodes
# from previous setting are up (accepts time value):
gateway.recover_after_time: 10m
# Set how many nodes are expected in this cluster; once these N nodes are up
# (and recover_after_nodes is met), begin recovery immediately:
gateway.expected_nodes: 2
# Require explicit index creation:
# action.auto_create_index: false
# Protect against accidental close/delete operations on all indices
# (individual indices can still be closed or deleted):
action.disable_close_all_indices: true
action.disable_delete_all_indices: true
action.disable_shutdown: true

############################# Recovery Throttling #############################
# These settings control shard allocation during initial recovery, replica
# allocation, rebalancing, and adding/removing of nodes.
# cluster.routing.allocation.node_initial_primaries_recoveries: 4
# cluster.routing.allocation.node_concurrent_recoveries: 2
# Set to throttle throughput when recovering (eg. 100mb, by default 20mb):
indices.recovery.max_bytes_per_sec: 100mb
# indices.recovery.concurrent_streams: 5

################################## Discovery ##################################
# Discovery ensures nodes can be found within a cluster and a master node is
# elected. Multicast discovery is the default.
# Set to ensure a node sees N other master eligible nodes; recommended to be
# higher than 1 when running more than 2 nodes in the cluster:
# discovery.zen.minimum_master_nodes: 1
# discovery.zen.ping.timeout: 3s
# Unicast discovery (when multicast is not present or must be restricted):
# discovery.zen.ping.multicast.enabled: false
# discovery.zen.ping.unicast.hosts: ["host1", "host2:port"]
# EC2 discovery is available via the cloud-aws plugin.

################################## Slow Log ###################################
# Shard level query and fetch threshold logging.
#index.search.slowlog.threshold.query.warn: 10s
#index.search.slowlog.threshold.query.info: 5s
#index.search.slowlog.threshold.query.debug: 2s
#index.search.slowlog.threshold.query.trace: 500ms
#index.search.slowlog.threshold.fetch.warn: 1s
#index.search.slowlog.threshold.fetch.info: 800ms
#index.search.slowlog.threshold.fetch.debug: 500ms
#index.search.slowlog.threshold.fetch.trace: 200ms
#index.indexing.slowlog.threshold.index.warn: 10s
#index.indexing.slowlog.threshold.index.info: 5s
#index.indexing.slowlog.threshold.index.debug: 2s
#index.indexing.slowlog.threshold.index.trace: 500ms

################################## GC Logging ##################################
#monitor.jvm.gc.ParNew.warn: 1000ms
#monitor.jvm.gc.ParNew.info: 700ms
#monitor.jvm.gc.ParNew.debug: 400ms
#monitor.jvm.gc.ConcurrentMarkSweep.warn: 10s
#monitor.jvm.gc.ConcurrentMarkSweep.info: 5s
#monitor.jvm.gc.ConcurrentMarkSweep.debug: 2s
Source: https://gist.github.com/zsprackett/8546403
# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
#cluster.name: my-application
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
#node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
#path.data: /path/to/data
#
# Path to log files:
#
#path.logs: /path/to/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# By default Elasticsearch is only accessible on localhost. Set a different
# address here to expose this node on the network:
#
#network.host: 192.168.0.1
#
# By default Elasticsearch listens for HTTP traffic on the first free port it
# finds starting at 9200. Set a specific HTTP port here:
#
#http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
#discovery.seed_hosts: ["host1", "host2"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
#
#cluster.initial_master_nodes: ["node-1", "node-2"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Various -----------------------------------
#
# Allow wildcard deletion of indices:
#
#action.destructive_requires_name: false
Source: https://github.com/elastic/elasticsearch/blob/master/distribution/src/config/elasticsearch.yml

Example elasticsearch yml

Unless you are using Elasticsearch for development and testing, creating and maintaining an Elasticsearch cluster will be a task that will occupy quite a lot of your time. Elasticsearch is an extremely powerful search and analysis engine, and part of this power lies in the ability to scale it for better performance and stability.

This tutorial will provide some information on how to set up an Elasticsearch cluster, and will add some operational tips and best practices to help you get started. It should be stressed though that each Elasticsearch setup will likely differ from one another depending on multiple factors, including the workload on the servers, the amount of indexed data, hardware specifications, and even the experience of the operators.

What is an Elasticsearch cluster?

As the name implies, an Elasticsearch cluster is a group of one or more Elasticsearch node instances that are connected together. The power of an Elasticsearch cluster lies in the distribution of tasks, searching and indexing, across all the nodes in the cluster.

The nodes in the Elasticsearch cluster can be assigned different jobs or responsibilities:

  • Data nodes — store data and execute data-related operations such as search and aggregation
  • Master nodes — in charge of cluster-wide management and configuration actions such as adding and removing nodes
  • Client nodes — forward cluster requests to the master node and data-related requests to data nodes
  • Ingest nodes — pre-process documents before indexing
  • Note: Tribe nodes, which were similar to cross-cluster or federated nodes, were deprecated with Elasticsearch 5.4

By default, each node is automatically assigned a unique identifier, or name, that is used for management purposes and becomes even more important in a multi-node, or clustered, environment.

When installed, a single Elasticsearch node will form a new single-node cluster entitled “elasticsearch,” but as we shall see later on in this article it can also be configured to join an existing cluster using the cluster name. Needless to say, these nodes need to be able to identify each other to be able to connect.

Installing an Elasticsearch Cluster

As always, there are multiple ways of setting up an Elasticsearch cluster. You can use a configuration management tool such as Puppet or Ansible to automate the process. In this case, though, we will be showing you how to manually set up a cluster consisting of one master node and two data nodes, all on Ubuntu 16.04 instances on AWS EC2 running in the same VPC. The security group was configured to enable access from anywhere using SSH and TCP 5601 (Kibana).

Installing Java

Elasticsearch is built on Java and requires at least Java 8 (1.8.0_131 or later) to run. Our first step, therefore, is to install Java 8 on all the nodes in the cluster. Please note that the same version should be installed on all Elasticsearch nodes in the cluster.

Repeat the following steps on all the servers designated for your cluster.

First, update your system:

sudo apt-get update

Then, install Java with:

sudo apt-get install default-jre

Checking your Java version now should give you the following output or similar:

openjdk version "1.8.0_151" OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-0ubuntu0.16.04.2-b12) OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)

Installing Elasticsearch nodes

Our next step is to install Elasticsearch. As before, repeat the steps in this section on all your servers.

First, you need to add Elastic’s signing key so that the downloaded package can be verified (skip this step if you’ve already installed packages from Elastic):

wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

For Debian, we need to then install the apt-transport-https package:

sudo apt-get install apt-transport-https

The next step is to add the repository definition to your system:

echo "deb https://artifacts.elastic.co/packages/6.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-6.x.list

All that’s left to do is to update your repositories and install Elasticsearch:

sudo apt-get update sudo apt-get install elasticsearch

Configuring the Elasticsearch cluster

Our next step is to set up the cluster so that the nodes can connect and communicate with each other.

For each node, open the Elasticsearch configuration file:

sudo vim /etc/elasticsearch/elasticsearch.yml

This file is quite long, and contains multiple settings for different sections. Browse through the file, and enter the following configurations (replace the IPs with your node IPs):

#give your cluster a name.
cluster.name: my-cluster

#give your nodes a name (change node number from node to node).
node.name: "es-node-1"

#define node 1 as master-eligible:
node.master: true

#define nodes 2 and 3 as data nodes:
node.data: true

#enter the private IP and port of your node:
network.host: 172.11.61.27
http.port: 9200

#detail the private IPs of your nodes:
discovery.zen.ping.unicast.hosts: ["172.11.61.27", "172.31.22.131", "172.31.32.221"]

Save and exit.

Running your Elasticsearch cluster

You are now ready to start your Elasticsearch nodes and verify they are communicating with each other as a cluster.

For each instance, run the following command:

sudo service elasticsearch start

If everything was configured correctly, your Elasticsearch cluster should be up and running. To verify everything is working as expected, query Elasticsearch from any of the cluster nodes:

curl -XGET 'http://localhost:9200/_cluster/state?pretty'

The response should detail the cluster and its nodes:

{ "cluster_name" : "my-cluster", "compressed_size_in_bytes" : 351, "version" : 4, "state_uuid" : "3LSnpinFQbCDHnsFv-Z8nw", "master_node" : "IwEK2o1-Ss6mtx50MripkA", "blocks" : { }, "nodes" : { "IwEK2o1-Ss6mtx50MripkA" : { "name" : "es-node-2", "ephemeral_id" : "x9kUrr0yRh--3G0ckESsEA", "transport_address" : "172.31.50.123:9300", "attributes" : { } }, "txM57a42Q0Ggayo4g7-pSg" : { "name" : "es-node-1", "ephemeral_id" : "Q370o4FLQ4yKPX4_rOIlYQ", "transport_address" : "172.31.62.172:9300", "attributes" : { } }, "6YNZvQW6QYO-DX31uIvaBg" : { "name" : "es-node-3", "ephemeral_id" : "mH034-P0Sku6Vr1DXBOQ5A", "transport_address" : "172.31.52.220:9300", "attributes" : { } } }, …

Elasticsearch cluster configurations for production

We already defined the different roles for the nodes in our cluster, but there are some additional recommended settings for a cluster running in a production environment.

Avoiding “Split Brain”

A “split-brain” situation is when communication between nodes in the cluster fails due to either a network failure or an internal failure with one of the nodes. In this kind of scenario, more than one node might believe it is the master node, leading to a state of data inconsistency.

To avoid this situation, we can change the discovery.zen.minimum_master_nodes directive in the Elasticsearch configuration file, which determines how many master-eligible nodes must be in communication (a quorum) to elect a master.

A best practice is to use the formula N/2 + 1, where N is the number of master-eligible nodes in the cluster, and then round down the result to the nearest integer.

In the case of a cluster with three nodes, then:

discovery.zen.minimum_master_nodes: 2
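As a second worked example, with five master-eligible nodes: 5/2 + 1 = 3.5, which rounds down to:

discovery.zen.minimum_master_nodes: 3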

Adjusting JVM heap size

To ensure Elasticsearch has enough operational leeway, the default JVM heap size (min/max 1 GB) should be adjusted.

As a rule of thumb, the maximum heap size should be set up to 50% of your RAM, but no more than 32GB (due to Java pointer inefficiency in larger heaps). Elastic also recommends that the values for maximum and minimum heap size be identical.

These values can be configured using the Xmx and Xms settings in the jvm.options file.

On DEB:

sudo vim /etc/elasticsearch/jvm.options

-Xms2g
-Xmx2g

Disabling swapping

Swapping out unused memory is a known OS behavior, but in the context of Elasticsearch it can result in disconnects, bad performance, and, in general, an unstable cluster.

To avoid swapping you can either disable all swapping (recommended if Elasticsearch is the only service running on the server), or you can use mlockall to lock the Elasticsearch process to RAM.

To do this, open the Elasticsearch configuration file on all nodes in the cluster:

sudo vim /etc/elasticsearch/elasticsearch.yml

Uncomment the following line. (In Elasticsearch 5.0 and later the setting is named bootstrap.memory_lock; bootstrap.mlockall is the pre-5.0 name.)

bootstrap.memory_lock: true

Next, open the /etc/default/elasticsearch file:

sudo vim /etc/default/elasticsearch

Make the following configurations:

MAX_LOCKED_MEMORY=unlimited

Restart Elasticsearch when you’re done.

Adjusting virtual memory

To avoid running out of virtual memory, increase the amount of limits on mmap counts:

sudo vim /etc/sysctl.conf

Update the relevant setting accordingly:

vm.max_map_count=262144

On DEB/RPM, this setting is configured automatically.

Increasing open file descriptor limit

Another important configuration is the limit of open file descriptors. Since Elasticsearch makes use of a large amount of file descriptors, you must ensure the defined limit is enough otherwise you might end up losing data.

The common recommendation for this setting is 65,536 and higher. On DEB/RPM the default settings are already configured to suit this requirement but you can of course fine tune it.

sudo vim  /etc/security/limits.conf

Set the limit:

elasticsearch - nofile 65536

The first field is the domain; here it is assumed to be the elasticsearch user, so substitute the user that runs Elasticsearch on your system if it differs.

Elasticsearch Cluster APIs

Elasticsearch supports a large number of cluster-specific API operations that allow you to manage and monitor your Elasticsearch cluster. Most of the APIs allow you to define which Elasticsearch node to call using either the internal node ID, its name or its address.   

Below is a list of a few of the more basic API operations you can use. For advanced usage of cluster APIs, read this blog post.

Cluster Health

This API can be used to see general info on the cluster and gauge its health:

curl -XGET 'localhost:9200/_cluster/health?pretty'

Response:

{ "cluster_name" : "my-cluster", "status" : "green", "timed_out" : false, "number_of_nodes" : 3, "number_of_data_nodes" : 3, "active_primary_shards" : 0, "active_shards" : 0, "relocating_shards" : 0, "initializing_shards" : 0, "unassigned_shards" : 0, "delayed_unassigned_shards" : 0, "number_of_pending_tasks" : 0, "number_of_in_flight_fetch" : 0, "task_max_waiting_in_queue_millis" : 0, "active_shards_percent_as_number" : 100.0 }

Cluster State

This API can be used to see a detailed status report on your entire cluster. You can filter results by specifying parameters in the call URL.

curl -XGET 'localhost:9200/_cluster/state?pretty'

Response:

{ "cluster_name" : "my-cluster", "compressed_size_in_bytes" : 347, "version" : 4, "state_uuid" : "uMi5OBtAS8SSRJ9hw1-gUg", "master_node" : "sqT_y5ENQ9SdjHiE0oco_g", "blocks" : { }, "nodes" : { "sqT_y5ENQ9SdjHiE0oco_g" : { "name" : "node-1", "ephemeral_id" : "-HDzovR0S0e-Nn8XJ-GWPA", "transport_address" : "172.31.56.131:9300", "attributes" : { } }, "mO0d0hYiS1uB--NoWuWyHg" : { "name" : "node-3", "ephemeral_id" : "LXjx86Q5TrmefDoq06MY1A", "transport_address" : "172.31.58.61:9300", "attributes" : { } }, "it1V-5bGT9yQh19d8aAO0g" : { "name" : "node-2", "ephemeral_id" : "lCJja_QtTYauP3xEWg5NBQ", "transport_address" : "172.31.62.65:9300", "attributes" : { } } }, "metadata" : { "cluster_uuid" : "8AqSmmKdQgmRVPsVxyxKrw", "templates" : { }, "indices" : { }, "index-graveyard" : { "tombstones" : [ ] } }, "routing_table" : { "indices" : { } }, "routing_nodes" : { "unassigned" : [ ], "nodes" : { "it1V-5bGT9yQh19d8aAO0g" : [ ], "sqT_y5ENQ9SdjHiE0oco_g" : [ ], "mO0d0hYiS1uB--NoWuWyHg" : [ ] } }, "snapshots" : { "snapshots" : [ ] }, "restore" : { "snapshots" : [ ] }, "snapshot_deletions" : { "snapshot_deletions" : [ ] } }

Cluster Stats

Extremely useful for monitoring performance metrics on your entire cluster:

curl -XGET 'localhost:9200/_cluster/stats?human&pretty'

Response:

1.2" ], "os" : { "available_processors" : 3, "allocated_processors" : 3, "names" : [ { "name" : "Linux", "count" : 3 } ], "mem" : { "total" : "10.4gb", "total_in_bytes" : 11247157248, "free" : "4.5gb", "free_in_bytes" : 4915200000, "used" : "5.8gb", "used_in_bytes" : 6331957248, "free_percent" : 44, "used_percent" : 56 } }, "process" : { "cpu" : { "percent" : 10 }, "open_file_descriptors" : { "min" : 177, "max" : 178, "avg" : 177 } }, "jvm" : { "max_uptime" : "6m", "max_uptime_in_millis" : 361766, "versions" : [ { "version" : "1.8.0_151", "vm_name" : "OpenJDK 64-Bit Server VM", "vm_version" : "25.151-b12", "vm_vendor" : "Oracle Corporation", "count" : 3 } ], "mem" : { "heap_used" : "252.1mb", "heap_used_in_bytes" : 264450008, "heap_max" : "2.9gb", "heap_max_in_bytes" : 3195076608 }, "threads" : 63 }, "fs" : { "total" : "23.2gb", "total_in_bytes" : 24962703360, "free" : "19.4gb", "free_in_bytes" : 20908818432, "available" : "18.2gb", "available_in_bytes" : 19570003968 }, "plugins" : [ ], "network_types" : { "transport_types" : { "netty4" : 3 }, "http_types" : { "netty4" : 3 } } } }

You can also target specific groups of nodes with node filters.

Nodes Stats

If you want to inspect metrics for specific nodes in the cluster, use this API. You can see info for all nodes, a specific node, or ask to see only index or OS/process specific stats.

All nodes:

curl -XGET 'localhost:9200/_nodes/stats?pretty'

A specific node:

curl -XGET 'localhost:9200/_nodes/node-1/stats?pretty'

Index-only stats:

curl -XGET 'localhost:9200/_nodes/stats/indices?pretty'

You can get any of the specific metrics for any single node with the following structure:

curl -XGET 'localhost:9200/_nodes/stats/ingest?pretty'

Or multiple nodes with the following structure:

curl -XGET 'localhost:9200/_nodes/stats/ingest,fs?pretty'

Or all metrics with either of these two formats:

curl -XGET 'localhost:9200/_nodes/stats/_all?pretty'
curl -XGET 'localhost:9200/_nodes/stats?metric=_all&pretty'

Nodes Info

If you want to collect information on any or all of your cluster nodes, use this API.

Retrieve for a single node:

curl -XGET 'localhost:9200/_nodes/?pretty'

Or multiple nodes:

curl -XGET 'localhost:9200/_nodes/node1,node2?pretty'

Retrieve data on plugins or ingest:

curl -XGET 'localhost:9200/_nodes/plugins'
curl -XGET 'localhost:9200/_nodes/ingest'

Information about ingest processors should appear like this (with many more than the three types shown in the example):

{ "_nodes": … "cluster_name": "elasticsearch", "nodes": { "toTaLLyran60m5amp13": { "ingest": { "processors": [ { "type": "uppercase" }, { "type": "lowercase" }, { "type": "append" } ] } } } }

Pending Cluster Tasks

This API tracks changes at the cluster level, including but not limited to mapping updates, failed shard allocations, and index creation.

The following GET should return a list of tasks:

curl -XGET 'localhost:9200/_cluster/pending_tasks?pretty'

Task Management

Similar to the Pending Cluster Tasks API, the Task Management API will get data on currently running tasks on respective nodes.

To get info on all currently executing tasks, enter:

curl -XGET "localhost:9200/_tasks

To get current tasks for specific nodes, plus cluster-related tasks, enter the node names and then append &actions to the GET:

curl -XGET 'localhost:9200/_tasks?nodes=node1,node2&actions=cluster:*&pretty'

Retrieve info about a specific task (or its child tasks) by entering _tasks/ and then the task’s individual ID:

curl -XGET 'localhost:9200/_tasks/43r315an3xamp13'

And for child tasks:

curl -XGET 'localhost:9200/_tasks?parent_task_id=43r315an3xamp13'

This API also supports reindexing, search, task grouping and task cancelling.
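For example, cancelling a task is a POST against its ID (reusing the placeholder ID from above):

curl -XPOST 'localhost:9200/_tasks/43r315an3xamp13/_cancel'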

Remote Cluster Info

Get remote cluster info with:

curl -XGET 'localhost:9200/_remote/info?pretty'

Voting Configuration Exclusions

This API excludes master-eligible nodes from the cluster’s voting configuration.
Remove all exclusions with:

curl -X DELETE 'localhost:9200/_cluster/voting_config_exclusions?pretty'

Or add a node to the exclusions list:

curl -X POST 'localhost:9200/_cluster/voting_config_exclusions/node1?pretty'

What next?

“When all else fails, read the fuc%^&* manual” goes the famous saying. Thing is, the manual in question, and the technology it documents, are not straightforward to say the least.


This tutorial made a brave attempt to provide users with the basics of setting up and configuring their first Elasticsearch cluster, knowing full well that it is virtually impossible to provide instructions that suit every environment and use case.

Together with this tutorial, I strongly recommend doing additional research, starting with Elastic’s official documentation.

Good luck!

Source: https://logz.io/blog/elasticsearch-cluster-tutorial/

Installing and Configuring Elasticsearch

This document provides reference information and examples relating to installation and configuration of the Elasticsearch search feature for the Akana API Platform Community Manager developer portal.

Note: For information about secure configuration of Elasticsearch, see Configuring Elasticsearch with security.


Table of Contents

Elasticsearch feature overview:

  1. About the Elasticsearch feature
  2. Elasticsearch version
  3. System requirements
  4. Planning your Elasticsearch feature
  5. Should I choose Transport Client or REST Client mode?
  6. Links to additional information about Elasticsearch

Installing and Configuring Elasticsearch:

  1. Installing Elasticsearch
  2. High-level steps for Elasticsearch configuration
  3. What changes do I need to make to the Elasticsearch YAML file?
  4. How do I configure Elasticsearch?
  5. How do I configure the number of nodes and shards?
  6. Updating the Elasticsearch index

Elasticsearch feature information:

About the Elasticsearch feature

Elasticsearch is a search engine based on Apache Lucene. It is robust, and allows fast indexing and responsive updating. It's an extremely popular tool in very broad use—a scalable search solution that uses JSON messaging over an HTTP interface with a native Java API.

Elasticsearch is run in standalone mode. Your installation will need to have Elasticsearch installed on at least one server. Just as with a relational database, you'll need to provide the software and hardware required. You can get started with a trial license.

All containers running the Akana API Platform can use Elasticsearch. A cluster is recommended for redundancy.

Deployment is via REST Client.

For more information on these choices, see Should I choose Transport Client or REST Client mode? below.

Administration of Elasticsearch is done with the configuration wizard in the Akana Administration Console.

For more information on Elasticsearch terminology, refer to the Elasticsearch glossary: https://www.elastic.co/guide/en/elasticsearch/reference/current/glossary.html.

Back to top

Elasticsearch version

For the latest information about supported Elasticsearch versions, refer to the correct version of the platform system requirements document:

Note: For any System Requirements version, you can also go to the landing page, System Requirements (all versions), and choose the version for your installation.

Back to top

System requirements

For system requirements for the standalone Elasticsearch server, refer to the correct version of the platform system requirements document, as listed in Elasticsearch version above.

Planning your Elasticsearch feature

As part of planning your installation, you'll need to make some decisions about how you want to set up the Elasticsearch feature:

  • Do you want one or more Elasticsearch servers? Additional servers are recommended, for fallback reasons.
  • Do you want dedicated Elasticsearch servers? You could also install Elasticsearch on one or more servers running the Akana API Platform.
  • Which deployment mode do you want to use, Transport Client or REST Client? See Should I choose Transport Client or REST Client mode? below.

Back to top

Should I choose Transport Client or REST Client mode?

There are two options for Deployment Mode:

  • Transport Client: The client uses a TCP connection to communicate to the Elasticsearch server or cluster. Transport Client mode will be deprecated in a later version of Elasticsearch, and will be removed in 8.0. It's best to use REST Client.
  • REST Client: The client communicates to the Elasticsearch server or cluster by accessing a URL. Introduced in a recent version of Elasticsearch. Recommended.

Back to top

Links to additional information about Elasticsearch

For more information about Elasticsearch, refer to the following:

Back to top


Installing and configuring Elasticsearch:

Installing Elasticsearch

You'll need to download Elasticsearch, and install it on the server or servers you'll be using. To determine the version to install, see Elasticsearch version.

Follow the instructions provided by Elasticsearch.

Download file locations:

Back to top

High-level steps for Elasticsearch configuration

To use Elasticsearch on the Akana API Platform you'll need to:

Back to top

What changes do I need to make to the Elasticsearch YAML file?

You'll need to make some changes to one of the Elasticsearch configuration files, elasticsearch.yml, so that Elasticsearch will work for your Akana API Platform installation.

The elasticsearch.yml file is generally stored in the {elasticsearch_home}/config folder. It might ship with some placeholder content, but not with all the placeholder values you’ll need.

As a starting point to model your changes, you can use the example in Sample Elasticsearch YAML file below.

Note: If you want security, you'll need to add some extra values, using the example in Sample Elasticsearch YAML file with security settings. If you don't want security, just set up the values listed below.

  • Cluster name: needed for both Transport Client and REST Client. For example: cluster.name: akana
  • Node name: needed for both Transport Client and REST Client, if you want to name your own node. For example: node.name: node-1
  • Transport TCP port: needed for Transport Client. For example: transport.tcp.port: 9300
  • Path to directory where to store the data: needed for both. For example: path.data: /vars/elasticsearch/data
  • Optional if you don't want to bind to all interfaces: Set the bind address to a specific IP. For example: network.host: 192.164.0.1
  • Pass an initial list of hostnames for all the nodes in the cluster, to provide a seed list of other nodes in the cluster that are likely to be live and contactable, as part of discovery. See Discovery Settings on the Elasticsearch website. For example: discovery.zen.ping.unicast.hosts: ["localhost", "[::1]"]
  • Increase the default configuration value for the maximum number of Boolean clauses allowed within a search string. For example: indices.query.bool.max_clause_count: 10000

    For more information on this setting, see Search Settings (Elasticsearch documentation).

For general information about the elasticsearch.yml file, see https://www.elastic.co/guide/en/elasticsearch/reference/current/settings.html.

Sample Elasticsearch YAML file

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
# Before you set out to tweak and tune the configuration, make sure you
# understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
#
cluster.name: akana
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
#
node.name: node-1
#
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
#
path.data: /vars/elasticsearch/data
#
# Path to log files:
#
path.logs: /vars/elasticsearch/logs
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
#
#network.host: 192.168.0.1
network.host: 0.0.0.0
transport.tcp.port: 9300
#
# Set a custom port for HTTP:
#
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when new node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
#
discovery.zen.ping.unicast.hosts: ["localhost", "[::1]"]
#
# Prevent the "split brain" by configuring the majority of nodes (total number of master-eligible nodes / 2 + 1):
#
discovery.zen.minimum_master_nodes: 1
#
# For more information, consult the zen discovery module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
# ---------------------------------- Search settings ---------------------------
#
# Add the line below to increase the maximum number of Boolean clauses:
#
indices.query.bool.max_clause_count: 10000
#
# For more information, consult the Elasticsearch documentation.
#
# ---------------------------------- Various -----------------------------------
# (This section is needed only if you want to configure security)

Note: If you want to use security with Elasticsearch, you'll need to set up additional values in the YAML file. See Sample Elasticsearch YAML file with security settings.

Default ports for Elasticsearch configuration

The default ports for Elasticsearch configuration are as follows:

  • HTTP: Default is 9200, default range is 9200–9299
  • TCP: Default is 9300, default range is 9300–9399

The range means that if the first port is busy, the platform tries the next port in the range, and so on.
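
If you need to control this behavior explicitly, Elasticsearch accepts a port range directly in elasticsearch.yml and binds to the first free port in the range. A minimal sketch (the range values below are illustrative):

  http.port: 9200-9299
  transport.tcp.port: 9300-9399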

Note to Administrators: When setting up your implementation, make sure that all ports in the range 9300–9305 are open on all containers on which the Elasticsearch feature is installed.


How do I configure Elasticsearch?

In configuring the Elasticsearch feature for your Akana API Platform implementation, you'll need to do the following:

  1. Follow the applicable set of steps for the deployment mode you're using (see Configure Elasticsearch Global Configuration: Transport Client or Configure Elasticsearch Global Configuration: REST Client below)
  2. Create an app to generate the index for the first time

For information on which deployment mode to choose, see Should I choose Transport Client or REST Client mode?

Note: Configuration needs to be set up only once, for all containers, and can be done in any one container. The settings are stored in the database for the entire implementation.

Configure Elasticsearch Global Configuration: Transport Client

  1. In the Akana Administration Console, on the Configuration tab, under Configuration Actions, choose Configure Elasticsearch Global Configuration. The wizard opens.
  2. Make sure the Deployment Mode is set to Transport Client, as shown below.

    [Screenshot: Configure Elasticsearch Global Configuration wizard, with Deployment Mode set to Transport Client]

  3. Provide the name of the cluster for the Elasticsearch feature; for example, akana.
  4. In the ES Server URL field, provide the transport address for the Elasticsearch server (one or more, separated by commas), in the format: {hostname}:{port} (without the protocol). Examples:
    • localhost:9300
    • 10.12.121.116:9300
    • 10.12.121.116:9300,10.12.122.140:9300
  5. Click Finish and then click Close.

Configure Elasticsearch Global Configuration: REST Client

  1. In the Akana Administration Console, on the Configuration tab, under Configuration Actions, choose Configure Elasticsearch Global Configuration. The wizard opens.
  2. For Deployment Mode, choose REST Client.

    [Screenshot: Configure Elasticsearch Global Configuration wizard, with Deployment Mode set to REST Client]

  3. In the ES HTTP URLs field, provide the HTTP URLs for each container where Community Manager is running, as well as any container running scheduled jobs. Provide full URLs; use a comma separator between values. Examples:
    • http://localhost:9200
    • https://localhost:9200
    • http://localhost:9200,http://localhost:9250

    Note: If there are multiple URLs, the protocol must be the same for all. For example, you cannot mix HTTP and HTTPS.

  4. Click Finish and then click Close.
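
After saving the configuration, you can verify that each URL you provided is reachable, for example with a cluster health request (a standard Elasticsearch endpoint; the hostname and port below are illustrative):

  curl "http://localhost:9200/_cluster/health?pretty"

If the node is reachable, the response includes the cluster name and a status of green, yellow, or red.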

Create an app to generate the index for the first time

When you've completed the configuration, the index isn't generated until you create some content to be indexed.

When you first log in to the Community Manager developer portal after configuring, you'll see a General Search Error until the first index is generated.

To resolve this, create an app in the Community Manager developer portal. This causes the first index to be generated. After that, the search error is resolved, and content can be added and indexed as normal.

For instructions on creating an app in the Community Manager developer portal, see How do I add an app? (Community Manager developer portal help).


How do I configure the number of nodes and shards?

Note: This step is optional. The platform defaults are sufficient for most implementations.

The platform includes settings that you can use to manage your Elasticsearch setup. These are controlled in the Akana Administration Console.

In the Akana Administration Console, the configuration category is: com.akana.elasticsearch.

In the default configuration, shown below, there are two shards and one replica. Let's say there are two nodes in the cluster. One shard, approximately half the index, is stored on each node. The one replica includes a replica of each shard:

  • Node 1 has Shard 1 and the replica of Shard 2
  • Node 2 has Shard 2 and the replica of Shard 1

In this scenario, if one node goes down, the other node has the full search index. Additional nodes add additional safety.
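
To see how shards and replicas are actually distributed across nodes, you can ask Elasticsearch directly using the standard _cat/shards endpoint (this queries Elasticsearch itself, not the Akana Administration Console; the hostname and port are illustrative):

  curl "http://localhost:9200/_cat/shards?v"

Each line of the response lists a shard, whether it is a primary (p) or a replica (r), and the node it is allocated to.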

There are two settings, as shown below.

elastic.config.index.number.of.replicas
The number of replicas (additional copies) of the Elasticsearch index. Each replica includes a replica of each shard, so one replica might be distributed across multiple nodes, just as the index itself is split into shards which are distributed across nodes.
Default: 1
elastic.config.index.number.of.shards
The number of shards (splits) for the Elasticsearch index.
Note: This is a one-time setting. An Elasticsearch index cannot be re-sharded; if you want to change the number of shards, after changing the setting you'll need to delete the /index folder that the search index is stored in and then rebuild the index.
Default: 2
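
To confirm the shard and replica counts that an existing index was created with, you can query the index settings (a standard Elasticsearch request; the index name below is illustrative):

  curl "http://localhost:9200/akana/_settings?pretty"

The response includes index.number_of_shards and index.number_of_replicas for the index.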

For additional information about configuration settings in the Akana Administration Console, see Admin Console Settings.


Updating the Elasticsearch index

New in version: 2020.2.0

From time to time, as new features are added to the Community Manager developer portal, additional fields are added to the Elasticsearch index.

In versions prior to 2020.2.0, when new fields were added, it was necessary to run specific commands, and then a database query, to update the Elasticsearch index.

In 2020.2.0 and later, if new fields have been added to the Elasticsearch index, you can run an automation recipe, cm-es-index-upgrade.json, to update the index. This recipe takes no parameters.

In addition, the upgrade recipe for the Community Manager developer portal, cm-upgrade.json, calls the cm-es-index-upgrade.json recipe, so that the Elasticsearch index is updated if needed.

If you're upgrading to a new minor version, and new fields have been added to the Elasticsearch search index, as specified in the Release Notes, run the cm-es-index-upgrade.json recipe to ensure that the additional values appear in the Community Manager developer portal search index.




Source: https://docs.akana.com/sp/elasticsearch/install_es_config.htm


Elasticsearch configuration

Open Distro for Elasticsearch development has moved to OpenSearch. The ODFE plugins will continue to work with legacy versions of Elasticsearch OSS, but we recommend upgrading to OpenSearch to take advantage of the latest features and improvements.

Most Elasticsearch configuration can take place in the cluster settings API. Certain operations require you to modify elasticsearch.yml and restart the cluster.

Whenever possible, use the cluster settings API instead; elasticsearch.yml is local to each node, whereas the API applies the setting to all nodes in the cluster.

Cluster settings API

The first step in changing a setting is to view the current settings:
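
For example, the following standard request returns all settings, including defaults (run it against any node, for example from Kibana Dev Tools or via curl):

  GET _cluster/settings?include_defaults=true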

For a more concise summary of non-default settings:
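
This is the same endpoint without the include_defaults flag:

  GET _cluster/settings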

Three categories of settings exist in the cluster settings API: persistent, transient, and default. Persistent settings, well, persist after a cluster restart. After a restart, Elasticsearch clears transient settings.

If you specify the same setting in multiple places, Elasticsearch uses the following precedence:

  1. Transient settings
  2. Persistent settings
  3. Settings from elasticsearch.yml
  4. Default settings

To change a setting, just specify the new one as either persistent or transient. This example shows the flat settings form:
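
A minimal sketch of such a request, using action.auto_create_index as an illustrative setting:

  PUT _cluster/settings
  {
    "persistent": {
      "action.auto_create_index": false
    }
  }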

You can also use the expanded form, which lets you copy and paste from the GET response and change existing values:
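
The expanded form of the same request nests each segment of the dotted key as its own object:

  PUT _cluster/settings
  {
    "persistent": {
      "action": {
        "auto_create_index": false
      }
    }
  }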


Configuration file

You can find elasticsearch.yml in /usr/share/elasticsearch/config/elasticsearch.yml (Docker) or /etc/elasticsearch/elasticsearch.yml (RPM and DEB) on each node.

The demo configuration includes a number of settings for the security plugin that you should modify before using Open Distro for Elasticsearch for a production workload. To learn more, see Security.

Source: https://opendistro.github.io/for-elasticsearch-docs/docs/elasticsearch/configuration/

