How to Install Elasticsearch 8 on Ubuntu 24.04, 22.04 or 20.04

This guide shows how to install Elasticsearch 8 on Ubuntu 24.04, 22.04 or 20.04 from the official Elastic APT repository, then configure it, open the firewall, and run common queries.

Elasticsearch 8 is a distributed search and analytics engine for indexing and querying large volumes of data. It is commonly used for full-text search, log analytics, and application monitoring.

Elasticsearch 8.x offers a suite of features that enhance its functionality:

  • Enhanced Security: Security features such as authentication and TLS are enabled by default on new installations.
  • Scalability: Easily scales to handle petabytes of structured and unstructured data.
  • Speed: Offers real-time search and analytics capabilities.
  • Flexibility: Supports a wide range of data types and structures.
  • Improved Observability: Offers detailed insights into the health and performance of your clusters.
  • Machine Learning Integration: Provides advanced analytics and anomaly detection.

The following sections walk through each step on Ubuntu 24.04, 22.04 or 20.04, from preparing your system to configuring Elasticsearch.

Import Elasticsearch 8 APT Repository on Ubuntu

Update Ubuntu System Packages

Begin by updating your Ubuntu system packages to ensure all components are current. Execute the command:

sudo apt update && sudo apt upgrade

This command refreshes the package lists and upgrades the packages to their latest versions, maintaining system stability and security.

Install Initial Packages for Elasticsearch 8 Installation

To prepare for Elasticsearch 8 installation, certain packages are necessary. Install these prerequisite packages with the command:

sudo apt install dirmngr ca-certificates software-properties-common apt-transport-https lsb-release curl -y

This step installs dirmngr for key management, ca-certificates for validating HTTPS connections, software-properties-common for handling software repositories, apt-transport-https for HTTPS package downloads, lsb-release for distribution information, and curl for data transfers.

Import Elasticsearch 8 APT Repository

Since Elasticsearch 8 is not available in the default Ubuntu repository, importing it from the Elasticsearch APT repository is necessary.

Add Elasticsearch GPG Key

Start by importing the GPG key to ensure the integrity and authenticity of the packages. Run:

wget -q https://artifacts.elastic.co/GPG-KEY-elasticsearch -O- | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg

This command downloads the GPG key from Elasticsearch’s official website and adds it to your system’s keyring, securing future downloads from the repository.

Add Elasticsearch 8.x APT Repository

Following the GPG key addition, import the Elasticsearch repository with:

echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list

This command creates a new source list file for Elasticsearch, ensuring that your system recognizes and trusts the newly added repository for subsequent installation steps.
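
The repository entry follows a fixed pattern, so it can be generated rather than typed by hand. The sketch below is a hypothetical helper (the function name elastic_repo_line is not part of any Elastic tooling) that prints the source line for a given major version, assuming the keyring path used in the previous step:

```shell
# elastic_repo_line MAJOR: print the APT source entry for a major version.
# Assumes the keyring was saved to the path used earlier in this guide.
elastic_repo_line() {
  major=$1
  keyring=/usr/share/keyrings/elasticsearch-keyring.gpg
  printf 'deb [signed-by=%s] https://artifacts.elastic.co/packages/%s.x/apt stable main\n' \
    "$keyring" "$major"
}

# Print the entry for the 8.x series; pipe it to "sudo tee" to write the file.
elastic_repo_line 8
```

Printing the line first makes it easy to review before writing it to /etc/apt/sources.list.d/.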

Install Elasticsearch 8.x on Ubuntu

Update APT Index Cache After Elasticsearch 8 Import

Refresh the Repository List

After importing Elasticsearch 8, the next step is to refresh your system’s package list. This ensures that the newly added Elasticsearch repository is recognized by your system. Execute the command:

sudo apt update

This command updates the APT index cache, allowing you to install the latest version of Elasticsearch available in the repository.

Install Elasticsearch

With the repository list updated, proceed to install Elasticsearch by running:

sudo apt install elasticsearch

This command downloads and installs the latest Elasticsearch 8.x release from the repository onto your Ubuntu system. Because version 8 enables security by default, the installer output includes a generated password for the built-in elastic superuser. Record it, as authenticated requests need it later.

[Screenshot: terminal output confirming a successful Elasticsearch 8 installation on Ubuntu]

Configure and Start the Elasticsearch Service

Enable and Start the Service

By default, Elasticsearch does not start automatically upon system boot. To configure Elasticsearch to start at boot and immediately start the service, use:

sudo systemctl enable elasticsearch.service --now

The --now flag in the systemctl command is a convenient way to both enable the service at boot and start it in the current session.

Verify Service Status

To confirm that Elasticsearch is running properly, check its status with:

systemctl status elasticsearch

[Screenshot: systemd status output for the Elasticsearch 8 service on Ubuntu]

This command provides real-time status information about the Elasticsearch service, ensuring that it is active and functioning correctly on your Ubuntu system.
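
A service reported as active may still need some time before it answers HTTP requests. The following is a minimal sketch of a generic retry helper (wait_for is a hypothetical name) that repeats any probe command until it succeeds. The usage comment assumes a default package installation with security enabled, the CA certificate at /etc/elasticsearch/certs/http_ca.crt, and the elastic password stored in an ELASTIC_PASSWORD variable that you set yourself:

```shell
# wait_for ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds,
# sleeping DELAY seconds between tries; fail after ATTEMPTS tries.
wait_for() {
  attempts=$1
  delay=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example probe (assumed paths and variable, adjust to your setup):
# wait_for 30 2 sudo curl -fsS \
#   --cacert /etc/elasticsearch/certs/http_ca.crt \
#   -u "elastic:$ELASTIC_PASSWORD" https://localhost:9200
```

The helper returns a non-zero exit code if the probe never succeeds, which makes it usable in provisioning scripts.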

Configure Elasticsearch 8 on Ubuntu

Understanding Elasticsearch Data and Configuration Directories

Default Data Directory

Elasticsearch utilizes /var/lib/elasticsearch for storing data. This directory holds indexed data and manages the cluster’s state.

Configuration File Locations

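Configuration files are located in /etc/elasticsearch. The main file, elasticsearch.yml, controls Elasticsearch’s behavior. JVM options are set in /etc/elasticsearch/jvm.options and in files under /etc/elasticsearch/jvm.options.d/, while /etc/default/elasticsearch holds environment variables for the service.

The default configuration works well for a single server. For a multi-node cluster, edit the main configuration file to allow remote connections: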

sudo nano /etc/elasticsearch/elasticsearch.yml

Set up Remote Access (Optional)

Networking Configuration in Elasticsearch

Adjust network settings in the configuration file to allow connections beyond localhost.

Open the configuration file using:

sudo nano /etc/elasticsearch/elasticsearch.yml

In the Network section, uncomment the relevant line for network binding and set it to your preferred IP address.

Common Configuration Examples

Setting Network Host

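To bind Elasticsearch to an internal private IP, replace the placeholder with the node’s address:

network.host: [Internal Private IP]

Other nodes and clients connect to this address, so the setting is essential for cluster communication. Binding to a non-loopback address also makes Elasticsearch enforce its production bootstrap checks at startup.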

Configuring Cluster Name

Define your cluster name for identification:

cluster.name: my-cluster

This name helps in cluster management and monitoring.

Node Identification

Set a unique name for each node:

node.name: node-1

Unique node names simplify cluster management.

Discovery Settings

Configure node discovery for cluster formation:

discovery.seed_hosts: ["host1", "host2"]

These settings are vital for nodes to discover each other in a cluster.

Memory Allocation

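By default, Elasticsearch sizes the JVM heap automatically based on the node’s roles and available memory. To set it manually, specify equal minimum and maximum values:

-Xms1g
-Xmx1g

Place these lines in a custom file with an .options extension under /etc/elasticsearch/jvm.options.d/ (for example heap.options, a name chosen here for illustration). They control the JVM heap size, which is crucial for performance.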
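
If you size the heap by hand, a common rule of thumb is to give it no more than half of the machine’s RAM and to stay below the compressed object pointer threshold. The sketch below applies that rule; the function name heap_options and the 26 GB cap are assumptions made for illustration, not values taken from this guide:

```shell
# heap_options TOTAL_MB: print -Xms/-Xmx lines for half of TOTAL_MB,
# capped at 26 GB (an assumed conservative ceiling for compressed oops).
heap_options() {
  total_mb=$1
  heap_mb=$((total_mb / 2))
  cap_mb=$((26 * 1024))
  if [ "$heap_mb" -gt "$cap_mb" ]; then
    heap_mb=$cap_mb
  fi
  printf '%s\n' "-Xms${heap_mb}m" "-Xmx${heap_mb}m"
}

# Example: a machine with 8 GB of RAM.
heap_options 8192
```

The output can be pasted wherever your JVM options are defined. Minimum and maximum are kept equal so the heap is not resized at runtime.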

Enabling CORS

For web-based Elasticsearch tools:

http.cors.enabled: true
http.cors.allow-origin: "/.*/"

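These CORS settings in elasticsearch.yml let browser-based applications call the Elasticsearch API. The pattern /.*/ accepts requests from any origin, so narrow it to the origins you trust outside of testing.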

After making changes, save and exit the editor. Then, restart Elasticsearch to apply new configurations:

sudo systemctl restart elasticsearch

Restarting ensures Elasticsearch operates with the updated settings.
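
Because elasticsearch.yml ships with many commented-out examples, it is easy to edit a line and forget to uncomment it. As a rough check, the sketch below lists only the active (uncommented) lines for the settings discussed above. The function name show_active_settings is hypothetical, and the pattern only matches keys written at the start of a line:

```shell
# show_active_settings FILE: print uncommented "key: value" lines for the
# settings covered in this section. Commented lines (starting with #) and
# indented keys are not matched.
show_active_settings() {
  grep -E '^(cluster\.name|node\.name|network\.host|discovery\.seed_hosts|http\.cors\.[a-z-]+):' "$1"
}

# Usage (reading the file may require sudo):
# show_active_settings /etc/elasticsearch/elasticsearch.yml
```

An empty result means none of these settings is active and the defaults apply.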

Configure UFW Firewall for Elasticsearch 8 on Ubuntu

Setting Up Firewall Rules for Elasticsearch

Allowing Specific IP Addresses

To enable remote connections to Elasticsearch, it’s essential to configure the firewall to allow these specific connections. Use this command to permit an individual IP address:

sudo ufw allow from [IP Address] to any port 9200

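Replace [IP Address] with the desired external IP address. This rule allows traffic from that address to reach the Elasticsearch HTTP API on port 9200, which is what remote clients use. Node-to-node cluster communication uses the transport port, 9300 by default, which needs its own rule between cluster members.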

Allowing a Range of IP Addresses

If you need to allow a range of IP addresses, modify the UFW rule accordingly:

sudo ufw allow from [IP Address Range] to any port 9200

Here, [IP Address Range] could be a subnet, allowing multiple IPs within that subnet to access your Elasticsearch instance.

Allowing All Traffic on Port 9200

In some environments, you might need to allow all traffic to the Elasticsearch port. Use caution with this command, as it opens up port 9200 to all incoming traffic:

sudo ufw allow 9200

This command is generally used in controlled environments or for initial setup and testing.

Restricting Access to Local Network

For added security, especially in production environments, restrict access to the local network. This command allows only local network connections to Elasticsearch:

sudo ufw allow from 192.168.1.0/24 to any port 9200

Adjust 192.168.1.0/24 to match your local network’s IP range. This setting ensures that only devices on your local network can access Elasticsearch, adding a layer of security against external threats.

Applying the Firewall Rules

Rules added with ufw allow take effect immediately when UFW is active. To make sure the running firewall matches the saved rule set, reload UFW:

sudo ufw reload

This command re-reads the rules. It’s a useful final step to confirm that your Elasticsearch server is protected while allowing necessary traffic.
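
When several addresses or subnets need access, generating the rules from a list keeps them consistent. The sketch below only prints the commands so they can be reviewed before running. The function name ufw_rules and the sample addresses are hypothetical, and the proto tcp restriction is an addition not used in the commands above:

```shell
# ufw_rules PORT SOURCE...: print one "ufw allow" command per source
# address or subnet. Nothing is executed; the output is for review.
ufw_rules() {
  port=$1
  shift
  for src in "$@"; do
    printf 'sudo ufw allow from %s to any port %s proto tcp\n' "$src" "$port"
  done
}

# Example with placeholder sources: one host and one subnet.
ufw_rules 9200 192.168.1.10 192.168.1.0/24
```

Once the printed commands look right, they can be pasted into the terminal or piped to sh.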

Example Commands with Elasticsearch 8 on Ubuntu

The local examples below use plain http://localhost:9200 for brevity, which matches a node with security disabled. On a default Elasticsearch 8 installation security is enabled, so use https:// and add the credentials described under Using Elasticsearch with Basic Authentication.

Deleting an Index in Elasticsearch

To delete an index, such as ‘samples’, execute:

curl -X DELETE 'http://localhost:9200/samples'

This command removes the specified index and all its data, freeing up resources.

Listing All Indexes

To view all indexes on your Elasticsearch server:

curl -X GET 'http://localhost:9200/_cat/indices?v'

This command provides an overview of all indexes, including their health and document count.

Listing All Documents in an Index

To list documents in an index, like ‘samples’:

curl -X GET 'http://localhost:9200/samples/_search'

Useful for a quick review of the index’s contents, this command returns the first 10 matching documents by default. Add the size parameter (for example, ?size=100) to return more.

Querying with URL Parameters

For targeted searches, use Lucene query syntax. For example, to find ‘Harvard’ in the ‘school’ field:

curl -X GET 'http://localhost:9200/samples/_search?q=school:Harvard'

This method is efficient for simple queries directly via the URL.

Querying with JSON (Elasticsearch Query DSL)

For complex queries, JSON format is more readable and manageable:

curl -XGET --header 'Content-Type: application/json' http://localhost:9200/samples/_search -d '{
  "query": {
    "match": { "school": "Harvard" }
  }
}'

This format allows for sophisticated query structures, making it ideal for advanced searches.
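
When the same query is run with different values, the JSON body can be produced by a small function and piped to curl. This is a minimal sketch: match_query is a hypothetical helper, and it does not escape quotes or backslashes, so it only suits simple field names and values:

```shell
# match_query FIELD VALUE: print a Query DSL body with a single match clause.
# No JSON escaping is performed on FIELD or VALUE.
match_query() {
  printf '{"query":{"match":{"%s":"%s"}}}\n' "$1" "$2"
}

# Print the body for the example above.
match_query school Harvard

# Usage with curl, reading the body from standard input:
# match_query school Harvard | curl -XGET --header 'Content-Type: application/json' \
#   'http://localhost:9200/samples/_search' -d @-
```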

Listing Index Mapping

To understand the structure of an index, such as ‘samples’:

curl -X GET http://localhost:9200/samples

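This command returns the index’s aliases, mappings, and settings. The mappings section lists the fields and their types, aiding in query formulation. Append /_mapping to the URL to return only the mappings.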

Adding Data to an Index

To insert data into an index:

curl -XPUT --header 'Content-Type: application/json' http://localhost:9200/samples/_doc/1 -d '{
  "school": "Harvard"
}'

This command adds a document to the ‘samples’ index, useful for data ingestion.

Updating a Document

To update an existing document:

curl -XPUT --header 'Content-Type: application/json' http://localhost:9200/samples/_doc/2 -d '{
  "school": "Clemson"
}'

curl -XPOST --header 'Content-Type: application/json' http://localhost:9200/samples/_update/2 -d '{
  "doc": {
    "students": 50000
  }
}'

These commands first create a document in the ‘samples’ index and then update it by merging in a new field. Elasticsearch 8 uses the /_update/<id> endpoint; the older /_doc/<id>/_update form is no longer accepted.

Backing Up an Index

To create an index backup:

curl -XPOST --header 'Content-Type: application/json' http://localhost:9200/_reindex -d '{
  "source": {
    "index": "samples"
  },
  "dest": {
    "index": "samples_backup"
  }
}'

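This command copies the documents from the ‘samples’ index into a ‘samples_backup’ index. Reindexing does not copy the source index’s settings or mappings, so create the destination index with the desired mappings first if they matter. For full backups, Elasticsearch’s snapshot and restore feature is the better fit.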

Bulk Loading Data

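For bulk data loading, send a newline-delimited JSON (NDJSON) file to the _bulk endpoint. The example below targets an Elastic Cloud deployment. Replace the URL, the index name, and <file> with your own values, and complete the credentials after elastic: with your password:

export pwd="elastic:"

curl --user "$pwd" -H 'Content-Type: application/x-ndjson' -XPOST 'https://58571402f5464923883e7be42a037917.eu-central-1.aws.cloud.es.io:9243/0/_bulk?pretty' --data-binary @<file>

This method efficiently imports large datasets, leveraging Elasticsearch’s bulk API.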
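
The file passed to --data-binary must be in NDJSON form: an action line followed by a document line, each ending in a newline. As an illustration, the sketch below builds such a file from a plain list of school names, reusing the ‘samples’ index and ‘school’ field from the earlier examples. The function name make_bulk and the sequential IDs are assumptions, and values are not JSON-escaped:

```shell
# make_bulk INDEX: read one school name per line from standard input and
# print bulk "index" actions, each followed by its document.
# Values containing quotes or backslashes would need escaping first.
make_bulk() {
  index=$1
  id=1
  while IFS= read -r school; do
    printf '{"index":{"_index":"%s","_id":"%s"}}\n' "$index" "$id"
    printf '{"school":"%s"}\n' "$school"
    id=$((id + 1))
  done
}

# Example: two documents written as four NDJSON lines.
printf 'Harvard\nClemson\n' | make_bulk samples
```

Redirect the output to a file and pass that file to curl with --data-binary, which preserves the newlines the bulk API requires.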

Showing Cluster Health

To check the health of an Elasticsearch cluster:

curl --user "$pwd" -H 'Content-Type: application/json' -XGET 'https://58571402f5464923883e7be42a037917.eu-central-1.aws.cloud.es.io:9243/_cluster/health?pretty'

This command reports the cluster’s status (green, yellow, or red), the number of nodes, and shard allocation counts.
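
In scripts, it is often enough to extract the status field from the health response. The sketch below does this with sed so it has no extra dependencies. health_status is a hypothetical helper, and text matching like this is fragile compared with a real JSON parser:

```shell
# health_status: read a _cluster/health JSON response on standard input
# and print the value of its "status" field (green, yellow, or red).
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Usage: pipe the output of the cluster health request into the helper.
# curl -s ... '<cluster URL>/_cluster/health?pretty' | health_status
```

The helper works on both the compact and the ?pretty form of the response, since it matches the field wherever it appears on a line.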

Aggregation and Bucket Aggregation

For analytics purposes, like counting web hits by user city:

curl -XGET --user "$pwd" --header 'Content-Type: application/json' 'https://58571402f5464923883e7be42a037917.eu-central-1.aws.cloud.es.io:9243/logstash/_search?pretty' -d '{
  "aggs": {
    "cityName": {
      "terms": {
        "field": "geoip.city_name.keyword",
        "size": 50
      }
    }
  }
}'

And for more detailed insights, such as response codes by city:

curl -XGET --user "$pwd" --header 'Content-Type: application/json' 'https://58571402f5464923883e7be42a037917.eu-central-1.aws.cloud.es.io:9243/logstash/_search?pretty' -d '{
  "aggs": {
    "city": {
      "terms": {
        "field": "geoip.city_name.keyword"
      },
      "aggs": {
        "responses": {
          "terms": {
            "field": "response"
          }
        }
      }
    },
    "responses": {
      "terms": {
        "field": "response"
      }
    }
  }
}'

These examples illustrate how to use Elasticsearch’s aggregation capabilities for meaningful data analysis.

Using Elasticsearch with Basic Authentication

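On secured Elasticsearch setups, including default Elasticsearch 8 installations, authenticate each curl command and connect over HTTPS:

curl -X GET 'https://localhost:9200/_cat/indices?v' --cacert /etc/elasticsearch/certs/http_ca.crt -u 'elastic:<password>'

Replace <password> with the elastic user’s password. The --cacert path is where the package installation stores its generated CA certificate, and reading it may require sudo. Authentication ensures that only authorized users can access the Elasticsearch data.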

Pretty Print

To enhance JSON output readability:

curl -X GET 'http://localhost:9200/(index)/_search?pretty=true'

Replace (index) with an index name. Adding ?pretty=true formats the JSON response for easier analysis.

Querying Specific Fields

To return only certain fields, specify them in the _source array. This example and the next use Kibana Dev Tools console syntax rather than curl:

GET filebeat-7.6.2-2020.05.05-000001/_search
{
  "_source": ["suricata.eve.timestamp", "source.geo.region_name", "event.created"],
  "query": {
    "match": { "source.geo.country_iso_code": "GR" }
  }
}

This approach focuses the search results on specified fields, streamlining data retrieval.

Querying by Date

For date-based queries:

GET filebeat-7.6.2-2020.05.05-000001/_search
{
  "query": {
    "range": {
      "event.created": {
        "gte": "now-7d/d"
      }
    }
  }
}

Here, now-7d/d is date math for seven days ago, rounded down to the start of that day, so the query matches documents created since then. Date math like this is useful for time-sensitive data analysis.

Managing Elasticsearch 8 on Ubuntu

Uninstalling Elasticsearch 8

Removing Elasticsearch Software

In scenarios where Elasticsearch is no longer needed, it can be uninstalled efficiently. To remove Elasticsearch from your system, use:

sudo apt remove elasticsearch

This command removes the Elasticsearch package but leaves its configuration files in place. To delete the configuration files as well, use sudo apt purge elasticsearch instead, and run sudo apt autoremove to clear dependencies that are no longer needed.

Deleting the APT Repository

After uninstalling the software, it’s important to remove the Elasticsearch repository from your system’s sources list. Execute:

sudo rm /etc/apt/sources.list.d/elastic-8.x.list

This command deletes the Elasticsearch repository configuration file, so APT no longer checks the repository during updates or installations. You can also delete the keyring file imported earlier, /usr/share/keyrings/elasticsearch-keyring.gpg, since nothing else uses it.

Conclusion

In this guide, we navigated the essentials of managing Elasticsearch 8 on Ubuntu 24.04, 22.04 or 20.04, covering everything from installation and configuration to advanced querying and secure data handling. Remember, regular updates and proper configuration are key to leveraging Elasticsearch’s full potential. Don’t hesitate to revisit the steps for tasks like data backup or query optimization, as they’re crucial for maintaining a robust and efficient Elasticsearch environment.

For further reading, visit the official documentation page.
