This guide will demonstrate how to install Elasticsearch 8 on Ubuntu 24.04, 22.04 or 20.04, equipping you with the necessary knowledge to leverage its capabilities effectively.
Elasticsearch 8 is a distributed search and analytics engine for managing and analyzing large volumes of data. It is widely used for full-text search, log analysis, and general data analytics.
Elasticsearch 8.x offers a suite of features that enhance its functionality:
- Enhanced Security: Comes with improved security settings, ensuring data protection.
- Scalability: Easily scales to handle petabytes of structured and unstructured data.
- Speed: Offers real-time search and analytics capabilities.
- Flexibility: Supports a wide range of data types and structures.
- Improved Observability: Offers detailed insights into the health and performance of your clusters.
- Machine Learning Integration: Provides advanced analytics and anomaly detection.
Elasticsearch 8’s feature set calls for a systematic approach to ensure a seamless setup on Ubuntu 24.04, 22.04 or 20.04. The following sections guide you through each step, from preparing your system to configuring Elasticsearch for optimal performance.
Import Elasticsearch 8 APT Repository on Ubuntu
Update Ubuntu System Packages
Begin by updating your Ubuntu system packages to ensure all components are current. Execute the command:
sudo apt update && sudo apt upgrade
This command refreshes the package lists and upgrades the packages to their latest versions, maintaining system stability and security.
Install Initial Packages for Elasticsearch 8 Installation
To prepare for Elasticsearch 8 installation, certain packages are necessary. Install these prerequisite packages with the command:
sudo apt install dirmngr ca-certificates software-properties-common apt-transport-https lsb-release curl -y
This step installs dirmngr and ca-certificates for managing keyrings and certificates, software-properties-common for handling software repositories, apt-transport-https for secure package downloads, lsb-release (which provides the lsb_release command) for distribution information, and curl for data transfers.
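Before moving on, it can help to confirm that the command-line tools used in later steps are actually on the PATH. The following is a minimal sketch; the function name missing_tools is hypothetical and not part of any package.

```shell
# Print the name of every command in the argument list that is not on PATH.
# Prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running missing_tools curl wget gpg lsb_release prints one line per tool that still needs to be installed, and no output when everything is present.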
Import Elasticsearch 8 APT Repository
Since Elasticsearch 8 is not available in the default Ubuntu repository, importing it from the Elasticsearch APT repository is necessary.
Add Elasticsearch GPG Key
Start by importing the GPG key to ensure the integrity and authenticity of the packages. Run:
wget -q https://artifacts.elastic.co/GPG-KEY-elasticsearch -O- | sudo gpg --dearmor -o /usr/share/keyrings/elasticsearch-keyring.gpg
This command downloads the GPG key from Elasticsearch’s official website and adds it to your system’s keyring, securing future downloads from the repository.
Add Elasticsearch 8.x APT Repository
Following the GPG key addition, import the Elasticsearch repository with:
echo "deb [signed-by=/usr/share/keyrings/elasticsearch-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee /etc/apt/sources.list.d/elastic-8.x.list
This command creates a new source list file for Elasticsearch, ensuring that your system recognizes and trusts the newly added repository for subsequent installation steps.
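The repository entry only works if the keyring named in its signed-by option exists. As a rough sanity check, the sketch below extracts that path from a sources file and reports whether the file is present. The function names signed_by_path and check_repo_keyring are hypothetical helpers, not part of APT.

```shell
# Print the keyring path from the signed-by option of an APT sources file.
signed_by_path() {
    sed -n 's/.*signed-by=\([^] ]*\).*/\1/p' "$1" | head -n 1
}

# Report whether the keyring referenced by the sources file exists.
check_repo_keyring() {
    keyring=$(signed_by_path "$1")
    if [ -z "$keyring" ]; then
        echo "no signed-by option in $1"
        return 1
    fi
    if [ -f "$keyring" ]; then
        echo "ok: $keyring"
    else
        echo "missing keyring: $keyring"
        return 1
    fi
}
```

For the file created above, the call would be check_repo_keyring /etc/apt/sources.list.d/elastic-8.x.list.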
Install Elasticsearch 8.x on Ubuntu
Update APT Index Cache After Elasticsearch 8 Import
Refresh the Repository List
After importing Elasticsearch 8, the next step is to refresh your system’s package list. This ensures that the newly added Elasticsearch repository is recognized by your system. Execute the command:
sudo apt update
This command updates the APT index cache, allowing you to install the latest version of Elasticsearch available in the repository.
Install Elasticsearch
With the repository list updated, proceed to install Elasticsearch by running:
sudo apt install elasticsearch
This command downloads and installs the latest stable Elasticsearch 8 release from the repository onto your Ubuntu system. On version 8, the installation output also includes a generated password for the built-in elastic user; note it down, as authenticated requests need it later.
Configure and Start the Elasticsearch Service
Enable and Start the Service
By default, Elasticsearch does not start automatically upon system boot. To configure Elasticsearch to start at boot and immediately start the service, use:
sudo systemctl enable elasticsearch.service --now
The --now flag in the systemctl command is a convenient way to both enable the service at boot and start it in the current session.
Verify Service Status
To confirm that Elasticsearch is running properly, check its status with:
systemctl status elasticsearch
This command provides real-time status information about the Elasticsearch service, ensuring that it is active and functioning correctly on your Ubuntu system.
Configure Elasticsearch 8 on Ubuntu
Understanding Elasticsearch Data and Configuration Directories
Default Data Directory
Elasticsearch uses /var/lib/elasticsearch for storing data. This directory holds indexed data and the cluster’s state.
Configuration File Locations
Configuration files are located in /etc/elasticsearch. Here, you control Elasticsearch’s behavior. Environment variables for the service, such as ES_JAVA_OPTS for Java start-up options, are set in /etc/default/elasticsearch.
Default configurations work well for single-server operations. For clusters, edit the main configuration file to allow remote connections:
sudo nano /etc/elasticsearch/elasticsearch.yml
Set up Remote Access (Optional)
Networking Configuration in Elasticsearch
Adjust network settings in the configuration file to allow connections beyond localhost.
Open the configuration file using:
sudo nano /etc/elasticsearch/elasticsearch.yml
In the Network section, uncomment the network.host line and set it to your preferred IP address.
Common Configuration Examples
Setting Network Host
To configure an internal private IP:
network.host: [Internal Private IP]
This setting is essential for cluster communication.
Configuring Cluster Name
Define your cluster name for identification:
cluster.name: my-cluster
This name helps in cluster management and monitoring.
Node Identification
Set a unique name for each node:
node.name: node-1
Unique node names simplify cluster management.
Discovery Settings
Configure node discovery for cluster formation:
discovery.seed_hosts: ["host1", "host2"]
These settings are vital for nodes to discover each other in a cluster.
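To review the four settings together before pasting them into elasticsearch.yml, they can be rendered from the command line. This is only a sketch: render_es_settings is a hypothetical helper, and the address 192.168.1.10 in the usage note is a placeholder, not a recommended value.

```shell
# Print cluster.name, node.name, network.host and discovery.seed_hosts as YAML.
# Arguments: cluster name, node name, bind address, then one or more seed hosts.
render_es_settings() {
    if [ "$#" -lt 4 ]; then
        echo "usage: render_es_settings CLUSTER NODE HOST SEED..." >&2
        return 1
    fi
    printf 'cluster.name: %s\n' "$1"
    printf 'node.name: %s\n' "$2"
    printf 'network.host: %s\n' "$3"
    shift 3
    seeds=""
    for host in "$@"; do
        # Join the quoted host names with ", " between them.
        seeds="${seeds:+$seeds, }\"$host\""
    done
    printf 'discovery.seed_hosts: [%s]\n' "$seeds"
}
```

For example, render_es_settings my-cluster node-1 192.168.1.10 host1 host2 prints the same settings shown in the examples above, ready to compare against the file.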
Memory Allocation
Allocate memory for Elasticsearch:
-Xms1g -Xmx1g
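These flags set the JVM heap size, which is crucial for performance. They can be passed through ES_JAVA_OPTS in /etc/default/elasticsearch; Elasticsearch 8 also reads them from files ending in .options under /etc/elasticsearch/jvm.options.d.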
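One way to apply a heap size on Elasticsearch 8 is a small file ending in .options inside /etc/elasticsearch/jvm.options.d. The sketch below writes such a file with matching minimum and maximum values. The function name write_heap_options and the file name heap.options are assumptions made for illustration.

```shell
# Write matching -Xms and -Xmx settings to <dir>/heap.options.
# Arguments: heap size such as 1g or 512m, then the target directory.
write_heap_options() {
    if ! printf '%s\n' "$1" | grep -Eq '^[0-9]+[kmgKMG]$'; then
        echo "invalid heap size: $1" >&2
        return 1
    fi
    printf '%s\n' "-Xms$1" "-Xmx$1" > "$2/heap.options"
}
```

On a real system the target directory is root-owned, so the function would need to run in a root shell, for example with write_heap_options 1g /etc/elasticsearch/jvm.options.d, followed by a restart of the service.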
Enabling CORS
For web-based Elasticsearch tools:
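http.cors.enabled: true
http.cors.allow-origin: "/.*/"
These CORS settings in elasticsearch.yml let browser-based applications call the HTTP API. Each setting must be on its own line. The pattern /.*/ accepts every origin, so replace it with the specific origin of your tool outside of testing.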
After making changes, save and exit the editor. Then, restart Elasticsearch to apply new configurations:
sudo systemctl restart elasticsearch
Restarting ensures Elasticsearch operates with the updated settings.
Configure UFW Firewall for Elasticsearch 8 on Ubuntu
Setting Up Firewall Rules for Elasticsearch
Allowing Specific IP Addresses
To enable remote connections to Elasticsearch, it’s essential to configure the firewall to allow these specific connections. Use this command to permit an individual IP address:
sudo ufw allow from [IP Address] to any port 9200
Replace [IP Address] with the desired external IP address. This setup allows traffic from this address to reach Elasticsearch on port 9200, the port remote clients use for the HTTP API. Node-to-node cluster traffic uses the transport port instead, which is 9300 by default.
Allowing a Range of IP Addresses
If you need to allow a range of IP addresses, modify the UFW rule accordingly:
sudo ufw allow from [IP Address Range] to any port 9200
Here, [IP Address Range] could be a subnet, allowing multiple IPs within that subnet to access your Elasticsearch instance.
Allowing All Traffic on Port 9200
In some environments, you might need to allow all traffic to the Elasticsearch port. Use caution with this command, as it opens up port 9200 to all incoming traffic:
sudo ufw allow 9200
This command is generally used in controlled environments or for initial setup and testing.
Restricting Access to Local Network
For added security, especially in production environments, restrict access to the local network. This command allows only local network connections to Elasticsearch:
sudo ufw allow from 192.168.1.0/24 to any port 9200
Adjust 192.168.1.0/24 to match your local network’s IP range. This setting ensures that only devices on your local network can access Elasticsearch, adding a layer of security against external threats.
Applying the Firewall Rules
After setting up the rules, reload UFW so that the complete rule set is re-read:
sudo ufw reload
While UFW is active, rules added with ufw allow apply as soon as they are created; the reload re-applies the full rule set. It’s a useful final step to confirm that your Elasticsearch server is protected while allowing necessary traffic.
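A mistyped source address is an easy way to end up with a rule that does not match anything. The sketch below checks that the input looks like an IPv4 address or CIDR range and then prints the corresponding command for review instead of running it. The function name es_ufw_rule is hypothetical, and appending proto tcp is a choice made here to limit the rule to TCP.

```shell
# Print the ufw command for one source address or CIDR range on port 9200.
# Rejects input that does not look like an IPv4 address or range; the check
# is a shape check only and does not verify that each octet is 0-255.
es_ufw_rule() {
    if printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}(/[0-9]{1,2})?$'; then
        echo "sudo ufw allow from $1 to any port 9200 proto tcp"
    else
        echo "not an IPv4 address or CIDR range: $1" >&2
        return 1
    fi
}
```

For example, es_ufw_rule 192.168.1.0/24 prints the rule for the local network used above. After applying rules, sudo ufw status numbered lists what is in effect.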
Example Commands with Elasticsearch 8 on Ubuntu
Deleting an Index in Elasticsearch
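Most examples in this section call http://localhost:9200 without credentials, which assumes security has been disabled. On a default Elasticsearch 8 installation, security is enabled, so use https and add the options described under “Using Elasticsearch with Basic Authentication”.
To delete an index, such as ‘samples’, execute: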
curl -X DELETE 'http://localhost:9200/samples'
This command removes the specified index and all its data, freeing up resources.
Listing All Indexes
To view all indexes on your Elasticsearch server:
curl -X GET 'http://localhost:9200/_cat/indices?v'
This command provides an overview of all indexes, including their health and document count.
Listing All Documents in an Index
To list documents in an index, like ‘samples’:
curl -X GET 'http://localhost:9200/samples/_search'
Useful for a quick review of the index’s contents, this command returns the first 10 matching documents by default; add the size parameter to request more.
Querying with URL Parameters
For targeted searches, use Lucene query syntax. For example, to find ‘Harvard’ in the ‘school’ field:
curl -X GET 'http://localhost:9200/samples/_search?q=school:Harvard'
This method is efficient for simple queries directly via the URL.
Querying with JSON (Elasticsearch Query DSL)
For complex queries, JSON format is more readable and manageable:
curl -XGET --header 'Content-Type: application/json' http://localhost:9200/samples/_search -d '{
"query" : {
"match" : { "school": "Harvard" }
}
}'
This format allows for sophisticated query structures, making it ideal for advanced searches.
Listing Index Mapping
To understand the structure of an index, such as ‘samples’:
curl -X GET http://localhost:9200/samples
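This command returns the index’s aliases, mappings, and settings. The mappings section lists the fields and their types, which helps when formulating queries; request /samples/_mapping to see only that section.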
Adding Data to an Index
To insert data into an index:
curl -XPUT --header 'Content-Type: application/json' http://localhost:9200/samples/_doc/1 -d '{
"school" : "Harvard"
}'
This command adds a document to the ‘samples’ index, useful for data ingestion.
Updating a Document
To update an existing document:
curl -XPUT --header 'Content-Type: application/json' http://localhost:9200/samples/_doc/2 -d '
{
"school": "Clemson"
}'
curl -XPOST --header 'Content-Type: application/json' http://localhost:9200/samples/_update/2 -d '{
"doc" : {
"students": 50000}
}'
These commands first create and then update a document in the ‘samples’ index, demonstrating how to modify data.
Backing Up an Index
To create an index backup:
curl -XPOST --header 'Content-Type: application/json' http://localhost:9200/_reindex -d '{
"source": {
"index": "samples"
},
"dest": {
"index": "samples_backup"
}
}'
This command copies the documents from ‘samples’ into a new ‘samples_backup’ index on the same cluster. It guards against accidental changes to the original index, but not against loss of the cluster itself.
Bulk Loading Data
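For bulk data loading into a hosted cluster, store the credentials in a variable (the password after the colon is omitted here) and replace <file> with the path to a newline-delimited JSON file: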
export pwd="elastic:"
curl --user $pwd -H 'Content-Type: application/x-ndjson' -XPOST 'https://58571402f5464923883e7be42a037917.eu-central-1.aws.cloud.es.io:9243/0/_bulk?pretty' --data-binary @<file>
This method efficiently imports large datasets, leveraging Elasticsearch’s bulk API.
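The bulk API expects newline-delimited JSON in which each document line is preceded by an action line, and the final line must end with a newline. As a minimal sketch, the function below builds such a body from a plain list of school names. The function name make_bulk_body is hypothetical, and it assumes the names contain no double quotes or backslashes, since it performs no JSON escaping.

```shell
# Read one school name per line on stdin and print a bulk request body
# that indexes each name as a document in the index given as argument 1.
# Assumes names contain no double quotes or backslashes (no JSON escaping).
make_bulk_body() {
    while IFS= read -r school; do
        # Skip blank lines so they do not become empty documents.
        [ -n "$school" ] || continue
        printf '{"index":{"_index":"%s"}}\n' "$1"
        printf '{"school":"%s"}\n' "$school"
    done
}
```

With a hypothetical input file schools.txt, make_bulk_body samples < schools.txt > bulk.ndjson produces a file that can be passed to curl with --data-binary @bulk.ndjson.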
Showing Cluster Health
To check the health of an Elasticsearch cluster:
curl --user "$pwd" -H 'Content-Type: application/json' -XGET 'https://58571402f5464923883e7be42a037917.eu-central-1.aws.cloud.es.io:9243/_cluster/health?pretty'
This command provides vital information about the cluster’s status, including node health and data balance.
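In scripts, usually only the status field of the health response matters. The sketch below pulls it out with sed so that no JSON tool needs to be installed. The function name health_status is hypothetical, and the approach assumes the response contains a single "status" key, as the cluster health response does.

```shell
# Read a cluster health response on stdin and print its status value
# (green, yellow, or red). Works on compact and pretty-printed JSON.
health_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}
```

Piping the output of the health request above into health_status prints a single word, which is convenient for a check such as waiting until the cluster is no longer red.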
Aggregation and Bucket Aggregation
For analytics purposes, like counting web hits by user city:
curl -XGET --user "$pwd" --header 'Content-Type: application/json' 'https://58571402f5464923883e7be42a037917.eu-central-1.aws.cloud.es.io:9243/logstash/_search?pretty' -d '{
"aggs": {
"cityName": {
"terms": {
"field": "geoip.city_name.keyword",
"size": 50
}
}
}
}'
And for more detailed insights, such as response codes by city:
curl -XGET --user "$pwd" --header 'Content-Type: application/json' 'https://58571402f5464923883e7be42a037917.eu-central-1.aws.cloud.es.io:9243/logstash/_search?pretty' -d '{
"aggs": {
"city": {
"terms": {
"field": "geoip.city_name.keyword"
},
"aggs": {
"responses": {
"terms": {
"field": "response"
}
}
}
},
"responses": {
"terms": {
"field": "response"
}
}
}
}'
These examples illustrate how to use Elasticsearch’s aggregation capabilities for meaningful data analysis.
Using Elasticsearch with Basic Authentication
For secure Elasticsearch setups, authenticate each curl command:
curl -X GET 'http://localhost:9200/_cat/indices?v' -u elastic:(password)
This ensures that only authorized users can access the Elasticsearch data.
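On a default Elasticsearch 8 package installation, the HTTP API is served over TLS with a generated CA certificate stored at /etc/elasticsearch/certs/http_ca.crt, so requests need https and the --cacert option in addition to credentials. The wrapper below is a sketch under that assumption. The names es_curl, ES_CACERT, ES_USER and ES_DRY_RUN are hypothetical and chosen for this example.

```shell
# Send a request for the given path to the local node over HTTPS.
# With ES_DRY_RUN set, print the curl command instead of running it.
es_curl() {
    # Replace the argument list with the full curl command line.
    set -- curl \
        --cacert "${ES_CACERT:-/etc/elasticsearch/certs/http_ca.crt}" \
        -u "${ES_USER:-elastic}" \
        "https://localhost:9200$1"
    if [ -n "${ES_DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

Because -u is given a user name without a password, curl prompts for it, which keeps the password out of the shell history. Reading the certificate may require root privileges, depending on the permissions of /etc/elasticsearch.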
Pretty Print
To enhance JSON output readability:
curl -X GET 'http://localhost:9200/(index)/_search?pretty=true'
Adding ?pretty=true formats the JSON response for easier analysis. Keep the parameter inside the quotes so that the shell does not interpret the question mark.
Querying Specific Fields
To return only certain fields, specify them in the _source array. This example and the next use the request syntax of the Kibana Dev Tools console rather than curl:
GET filebeat-7.6.2-2020.05.05-000001/_search
{
"_source": ["suricata.eve.timestamp","source.geo.region_name","event.created"],
"query": {
"match" : { "source.geo.country_iso_code": "GR" }
}
}
This approach focuses the search results on specified fields, streamlining data retrieval.
Querying by Date
For date-based queries:
GET filebeat-7.6.2-2020.05.05-000001/_search
{
"query": {
"range" : {
"event.created": {
"gte" : "now-7d/d"
}
}
}
}
Use date math to filter documents within a specific time frame, which is crucial for time-sensitive data analysis.
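The same range query can be sent with curl by generating the request body on the command line. The sketch below builds the body for any date field and number of days; the function name range_query_body is hypothetical.

```shell
# Print a Query DSL body matching documents whose date field is within the
# last N days, rounded down to the start of the day.
# Arguments: field name, then the number of days as a whole number.
range_query_body() {
    case "$2" in
        ''|*[!0-9]*) echo "days must be a whole number: $2" >&2; return 1 ;;
    esac
    printf '{"query":{"range":{"%s":{"gte":"now-%sd/d"}}}}\n' "$1" "$2"
}
```

The output can be piped straight into a search request, since curl reads the body from standard input when given -d @-: range_query_body event.created 7 | curl -X GET --header 'Content-Type: application/json' 'http://localhost:9200/filebeat-7.6.2-2020.05.05-000001/_search?pretty' -d @-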
Managing Elasticsearch 8 on Ubuntu
Uninstalling Elasticsearch 8
Removing Elasticsearch Software
In scenarios where Elasticsearch is no longer needed, it can be uninstalled efficiently. To remove Elasticsearch from your system, use:
sudo apt remove elasticsearch
This command removes the Elasticsearch package but leaves its configuration files in place. To remove those as well, use sudo apt purge elasticsearch instead; sudo apt autoremove then clears any dependencies that are no longer needed.
Deleting the APT Repository
After uninstalling the software, it’s important to remove the Elasticsearch repository from your system’s sources list. Execute:
sudo rm /etc/apt/sources.list.d/elastic-8.x.list
This command deletes the Elasticsearch repository configuration file, preventing your system from accessing outdated or unnecessary Elasticsearch packages in future updates or installations.
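After removal, some files referenced earlier in this guide may still be present, such as the data directory or the keyring. The sketch below lists which of those paths still exist so they can be reviewed before deleting anything. The function name es_leftovers is hypothetical, and the path list covers only locations mentioned in this guide.

```shell
# Print each Elasticsearch-related path from this guide that still exists.
# An optional argument prefixes every path, which allows a dry run against
# a test directory tree instead of the real filesystem.
es_leftovers() {
    for path in \
        /etc/elasticsearch \
        /var/lib/elasticsearch \
        /usr/share/keyrings/elasticsearch-keyring.gpg \
        /etc/apt/sources.list.d/elastic-8.x.list
    do
        [ -e "${1:-}$path" ] && printf '%s\n' "$path"
    done
    return 0
}
```

Running es_leftovers with no argument checks the real system and prints nothing once everything has been cleaned up.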
Conclusion
In this guide, we navigated the essentials of managing Elasticsearch 8 on Ubuntu 24.04, 22.04 or 20.04, covering everything from installation and configuration to advanced querying and secure data handling. Remember, regular updates and proper configuration are key to leveraging Elasticsearch’s full potential. Don’t hesitate to revisit the steps for tasks like data backup or query optimization, as they’re crucial for maintaining a robust and efficient Elasticsearch environment.
For further reading, visit the official documentation page.