== Introduction ==
[https://en.wikipedia.org/wiki/Apache_Cassandra Apache Cassandra] is a free and open-source distributed NoSQL database system designed to handle large amounts of data across multiple servers, providing high availability with no single point of failure. One of its main features is the ability to run in a multi-node setup, which provides the following benefits:
* '''Fault tolerance''': Data is automatically replicated to multiple nodes. Replication across multiple data centers is also supported, and failed nodes can be replaced with no downtime.
* '''Decentralization''': There is no single point of failure and no network bottleneck; every node in the cluster is identical.
* '''Scalability & Elasticity''': Clusters can grow to tens of thousands of nodes holding petabytes of data. Read and write throughput both increase linearly as new machines are added, with no downtime or interruption to applications.
== Installation ==
The database has been available since Fedora 26, and the Fedora repositories contain several packages:
{|class="wikitable"
|-
| '''cassandra''' || Client tools
|-
| '''cassandra-server''' || Server part, mainly the database daemon
|-
| '''cassandra-javadoc''' || Documentation
|-
| colspan="2" | More packages can be listed with the command: '''dnf list cassandra\*'''
|}
The following command installs the database server and the tools for working with it:
<pre>
dnf install cassandra cassandra-server
</pre>
== Basic setup ==
=== Initialization and startup ===
Start the database daemon:
<pre>
systemctl start cassandra
</pre>
Enable the database daemon to start at boot:
<pre>
systemctl enable cassandra
</pre>
To test whether server initialization was successful, try the Cassandra client. See [[#Usage example|Usage example]].
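The daemon may need some time after '''systemctl start''' returns before it accepts client connections, so a client started immediately can fail to connect. A minimal sketch of a wait loop follows. The function name ''wait_for_port'' and its defaults are illustrative and not part of Cassandra or Fedora; the sketch assumes bash (it relies on bash's ''/dev/tcp'' redirection) and the default client port 9042.

```shell
#!/usr/bin/env bash
# Wait until a TCP port accepts connections, trying once per second.
# wait_for_port is an illustrative helper name, not a Cassandra tool.
# Arguments: host, port, number of attempts (default: 60).
wait_for_port() {
    local host=$1 port=$2 tries=${3:-60}
    while [ "$tries" -gt 0 ]; do
        # The subshell opens the connection and closes it again on exit.
        if (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Usage (assumes the default client port on the local machine):
#   wait_for_port 127.0.0.1 9042 120 && cqlsh
```

The function returns a non-zero status when the port never opens, so it can be chained with '''&&''' in scripts.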
=== Users authentication ===
Note that '''authentication is disabled by default'''. To enable it, take the following steps:
# Change the authenticator option in the ''/etc/cassandra/cassandra.yaml'' file to PasswordAuthenticator: <pre>authenticator: PasswordAuthenticator</pre>
# Restart Cassandra: <pre>systemctl restart cassandra</pre>
# Start cqlsh using the default superuser name and password: <pre>cqlsh -u cassandra -p cassandra</pre>
# Create a new superuser: <pre>cqlsh> CREATE ROLE <new_super_user> WITH PASSWORD = '<some_secure_password>' AND SUPERUSER = true AND LOGIN = true;</pre>
# Log in as the newly created superuser: <pre>cqlsh -u <new_super_user> -p <some_secure_password></pre>
# The ''cassandra'' superuser cannot be deleted from Cassandra, so to neutralize the account, change its password to something long and incomprehensible and set its status to NOSUPERUSER: <pre>cqlsh> ALTER ROLE cassandra WITH PASSWORD='SomeNonsenseThatNoOneWillThinkOf' AND SUPERUSER=false;</pre>
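Rather than typing a memorable string in the last step, the throwaway password can be generated. This is a minimal sketch, assuming a Linux system with ''/dev/urandom''; the function name ''random_password'' is hypothetical. The usage line relies on the cqlsh ''-e'' option, which executes a statement and exits.

```shell
#!/usr/bin/env bash
# Print a random alphanumeric password (default length: 32 characters).
# random_password is an illustrative helper, not a Cassandra tool.
random_password() {
    local length=${1:-32}
    # Keep only letters and digits so the value needs no escaping inside CQL.
    LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "$length"
    echo
}

# Usage (run as the new superuser; placeholders as in the steps above):
#   cqlsh -u <new_super_user> -p <some_secure_password> \
#       -e "ALTER ROLE cassandra WITH PASSWORD='$(random_password)' AND SUPERUSER=false;"
```

Because nobody needs to log in as ''cassandra'' again, the generated value does not have to be stored anywhere.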
=== Ports and remote access ===
By default, the Cassandra Java process should be bound to these ports after start:
{| class="wikitable"
|-
! Port number !! Description
|-
| TCP / 7000 || Cassandra inter-node cluster communication
|-
| TCP / 7199 || Cassandra JMX monitoring port
|-
| TCP / 9042 || Cassandra client port
|}
{{admon/tip | Encrypted communication | [http://cassandra.apache.org/doc/latest/operating/security.html#tls-ssl-encryption SSL/TLS in Apache Cassandra] can be configured. By default it uses TCP / 7001 for inter-node communication and TCP / 9142 as the client port.}}
{{admon/warning | Thrift API | The Thrift API is removed in Apache Cassandra 4 and is also stripped from the Fedora build of Cassandra 3. This means there is no TCP / 9160 port.}}
To allow '''remote access''' to the database, edit the ''/etc/cassandra/cassandra.yaml'' file and change the following parameters (a service restart is needed):
<pre>
listen_address: <external_ip>
rpc_address: <external_ip>
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "<external_ip>"
</pre>
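Before restarting the service, it can help to confirm which values are actually active, since the shipped configuration file also contains commented-out copies of some options. The following is a minimal sketch; the function name ''show_network_settings'' is hypothetical, and it assumes the ''seeds'' entry is written as a list item nested under ''seed_provider'', as in the stock configuration file.

```shell
#!/usr/bin/env bash
# Print the active (uncommented) network-related settings from cassandra.yaml.
# show_network_settings is an illustrative helper name.
# Argument: path to the configuration file (default: the Fedora location).
show_network_settings() {
    local conf=${1:-/etc/cassandra/cassandra.yaml}
    # Top-level keys start in column 0; the seeds entry is an indented list item.
    grep -E '^(listen_address|rpc_address):|^[[:space:]]*-[[:space:]]*seeds:' "$conf"
}

# Usage:
#   show_network_settings
#   show_network_settings /path/to/another/cassandra.yaml
```

All three printed lines should show the external address of the node.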
Also open the ports in the '''firewall'''.
firewalld:
<pre>
firewall-cmd --add-port=7000/tcp
firewall-cmd --add-port=9042/tcp
# you probably do not want to expose the JMX port on an external network
# firewall-cmd --add-port=7199/tcp
# save the configuration
firewall-cmd --runtime-to-permanent
</pre>
iptables:
<pre>
iptables -A INPUT -p tcp --dport 7000 -j ACCEPT
iptables -A INPUT -p tcp --dport 9042 -j ACCEPT
# you probably do not want to expose the JMX port on an external network
# iptables -A INPUT -p tcp --dport 7199 -j ACCEPT
</pre>
{{admon/warning | Warning: | Authentication is disabled '''by default''' and '''data is unprotected'''. See [[#Users authentication|Users authentication]].}}
=== More about how to configure Apache Cassandra ===
To configure the server, edit the file ''/etc/cassandra/cassandra.yaml''. For more information about changing the configuration, see the [https://docs.datastax.com/en/archived/cassandra/3.x/cassandra/configuration/configCassandra_yaml.html upstream configuration] documentation.
== Cluster setup ==
For an example, refer to the page [[Apache Cassandra Cluster]].
== Apache Cassandra in the container ==
An Apache Cassandra container image can be found on [https://hub.docker.com/ DockerHub] as '''[https://hub.docker.com/r/centos/cassandra-3-centos7/ centos/cassandra-3-centos7]'''. Starting a container for serving Cassandra is simple.
[[Getting started with docker|Install]] and start docker.
Prepare a directory for the database data:
<pre>
mkdir data
chown 143:143 data
</pre>
{{admon/note|Note:|You have to change the ownership of the data directory to match the ''cassandra'' user in the container so that it can read and write there.}}
Start the container:
<pre>
docker run --name cassandra -d -p 9042:9042 \
    -e CASSANDRA_ADMIN_PASSWORD=secret \
    -v "`pwd`/data":/var/opt/rh/sclo-cassandra3/lib/cassandra:Z \
    centos/cassandra-3-centos7
</pre>
The container stores its data in the prepared directory and creates a user and a database. The second line is '''important''': it defines the password for the ''admin'' user.
{{admon/note|Note:|The Apache Cassandra container has an ''admin'' user instead of ''cassandra''; the latter is deleted during the initialization phase.}}
Now you can try the Cassandra client. See [[#Usage example|Usage example]]. If you don't have the client tools installed, you can use the one provided by the container:
<pre>
docker exec -it cassandra 'bash' -c 'cqlsh '`docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' cassandra`' -u admin -p secret'
</pre>
More options are available; see the '''[https://hub.docker.com/r/centos/cassandra-3-centos7/ container README]'''.
== Usage example ==
<pre>
$ cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.1 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh> CREATE KEYSPACE k1 WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
cqlsh> USE k1;
cqlsh:k1> CREATE TABLE users (user_name varchar, password varchar, gender varchar, PRIMARY KEY (user_name));
cqlsh:k1> INSERT INTO users (user_name, password, gender) VALUES ('John', 'test123', 'male');
cqlsh:k1> SELECT * from users;

 user_name | gender | password
-----------+--------+----------
      John |   male |  test123

(1 rows)
</pre>
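The same statements can also be run without an interactive prompt, which is convenient for scripted smoke tests. The following is a minimal sketch: the function name ''write_demo_cql'' and the file name ''k1_demo.cql'' are hypothetical, and the ''IF NOT EXISTS'' clauses are an addition here so that the script can be run repeatedly. The cqlsh ''-f'' option executes the statements from a file.

```shell
#!/usr/bin/env bash
# Write the statements from the session above into a CQL script.
# write_demo_cql is an illustrative helper name.
# IF NOT EXISTS makes the CREATE statements safe to run more than once.
write_demo_cql() {
    local out=${1:-k1_demo.cql}
    cat > "$out" <<'EOF'
CREATE KEYSPACE IF NOT EXISTS k1
    WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
CREATE TABLE IF NOT EXISTS k1.users (
    user_name varchar, password varchar, gender varchar,
    PRIMARY KEY (user_name));
INSERT INTO k1.users (user_name, password, gender) VALUES ('John', 'test123', 'male');
SELECT * FROM k1.users;
EOF
}

# Usage:
#   write_demo_cql k1_demo.cql
#   cqlsh -f k1_demo.cql
```

Table names are qualified with the keyspace (''k1.users''), so the script does not depend on a preceding ''USE'' statement.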
== Feedback ==
We will be glad to see any feedback from you.
We are also looking for help with maintaining Apache Cassandra in Fedora, so if you feel ready to help us, just contact us.
Latest revision as of 13:42, 16 May 2018