Latest revision as of 13:42, 16 May 2018
Introduction
Apache Cassandra is a free and open-source distributed NoSQL database system designed to handle large amounts of data across multiple servers, providing high availability with no single point of failure. One of the main features of Apache Cassandra is its ability to run in a multi-node setup, hence providing the following benefits:
- Fault tolerance: Data is automatically replicated to multiple nodes. Replication across multiple data centers is also supported. Failed nodes can be replaced with no downtime.
- Decentralization: There are no single points of failure and no network bottlenecks, and every node in the cluster is identical.
- Scalability & Elasticity: Clusters can grow to tens of thousands of nodes holding petabytes of data. Read and write throughput both increase linearly as new machines are added, with no downtime or interruption to applications.
Installation
The database has been available since Fedora 26, and there are multiple packages in the Fedora repositories:
Package | Description |
---|---|
cassandra | Client tools |
cassandra-server | Server part, mainly the database daemon |
cassandra-javadoc | Documentation |
More packages can be listed with the command: dnf list cassandra\*
The following command installs the database server and the tools for working with it:
dnf install cassandra cassandra-server
Basic setup
Initialization and startup
Start the database daemon:
systemctl start cassandra
Enable the database daemon to start at boot:
systemctl enable cassandra
To test whether server initialization was successful, try the Cassandra client. See Usage example.
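The daemon can take a while to start accepting connections, so a client started immediately after systemctl may be refused. The following is a minimal sketch that polls the client port before cqlsh is launched. It assumes bash (for its /dev/tcp redirection) and the default client port 9042; the function name wait_for_port and the timeout are illustrative choices, not part of Cassandra.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll HOST:PORT once per second until it accepts a TCP
# connection or TIMEOUT seconds have passed. Relies on bash's /dev/tcp.
wait_for_port() {
    local host=$1 port=$2 timeout=${3:-60} waited=0
    until (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example (not run here): wait up to 60 seconds for the local client port.
# wait_for_port 127.0.0.1 9042 60 && cqlsh
```

An open port only shows that the process is listening; running a query as in Usage example is still the real test.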
User authentication
Authentication is disabled by default. To enable it, take the following steps:
- Change the authenticator option in the /etc/cassandra/cassandra.yaml file to PasswordAuthenticator:
authenticator: PasswordAuthenticator
- Restart cassandra:
systemctl restart cassandra
- Start cqlsh using the default superuser name and password:
cqlsh -u cassandra -p cassandra
- Create a new superuser:
cqlsh> CREATE ROLE <new_super_user> WITH PASSWORD = '<some_secure_password>' AND SUPERUSER = true AND LOGIN = true;
- Log in as the newly created superuser:
cqlsh -u <new_super_user> -p <some_secure_password>
- The default cassandra superuser cannot be deleted from Cassandra, so to neutralize the account, change its password to something long and hard to guess, and revoke its superuser status:
cqlsh> ALTER ROLE cassandra WITH PASSWORD='SomeNonsenseThatNoOneWillThinkOf' AND SUPERUSER=false;
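For the last step, the throwaway password does not have to be invented by hand. The sketch below generates one from /dev/urandom and prints the corresponding statement. The function names and the 20-byte (40 hex character) length are assumptions made for illustration.

```shell
# Hypothetical helper: print 2*N random hexadecimal characters read from
# /dev/urandom (the default of N=20 bytes is an arbitrary choice).
random_password() {
    od -An -N"${1:-20}" -tx1 /dev/urandom | tr -d ' \n'
}

# Print the CQL statement from the last step with a generated password.
# Hex characters need no escaping inside the CQL string literal.
neutralize_statement() {
    printf "ALTER ROLE cassandra WITH PASSWORD='%s' AND SUPERUSER=false;\n" "$(random_password)"
}

# Example (not run here): apply it while logged in as the new superuser.
# cqlsh -u <new_super_user> -p <some_secure_password> -e "$(neutralize_statement)"
```

Because the generated password is never stored, the cassandra account becomes effectively unusable, which is the intent of this step.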
Ports and remote access
By default, the Cassandra Java process binds to these ports after it starts:
Port number | Description |
---|---|
TCP / 7000 | Cassandra inter-node cluster communication |
TCP / 7199 | Cassandra JMX monitoring port |
TCP / 9042 | Cassandra client port |
To allow remote access to the database, edit the /etc/cassandra/cassandra.yaml file and change the following parameters (a service restart is required):
listen_address: external_ip
rpc_address: external_ip
seed_provider/seeds: "<external_ip>"
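The three parameters can also be changed with sed instead of an editor. This is a sketch only: it assumes that listen_address and rpc_address are uncommented top-level keys and that the seed list appears as an indented `- seeds: "..."` entry under seed_provider, so check the result against your own file. The function name set_external_ip and the address 192.0.2.10 are placeholders.

```shell
# Hypothetical helper: set the three remote-access parameters in FILE to IP.
# Assumes listen_address and rpc_address are uncommented top-level keys and
# that the seed list appears as an indented '- seeds: "..."' entry.
set_external_ip() {
    file=$1
    ip=$2
    tmp=$(mktemp) || return 1
    # Write to a temporary file first, then copy the content back so that
    # the ownership and permissions of FILE are preserved.
    sed -e "s/^listen_address:.*/listen_address: ${ip}/" \
        -e "s/^rpc_address:.*/rpc_address: ${ip}/" \
        -e "s/^\([[:space:]]*- seeds:\).*/\1 \"${ip}\"/" \
        "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example (not run here; 192.0.2.10 is a documentation address):
# cp /etc/cassandra/cassandra.yaml /etc/cassandra/cassandra.yaml.bak
# set_external_ip /etc/cassandra/cassandra.yaml 192.0.2.10
# systemctl restart cassandra
```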
Also open the ports in the firewall.
firewalld:
firewall-cmd --add-port=7000/tcp
firewall-cmd --add-port=9042/tcp
# probably you do not want to expose JMX port on external network
# firewall-cmd --add-port=7199/tcp
# save configuration
firewall-cmd --runtime-to-permanent
iptables:
iptables -A INPUT -p tcp --dport 7000 -j ACCEPT
iptables -A INPUT -p tcp --dport 9042 -j ACCEPT
# probably you do not want to expose JMX port on external network
# iptables -A INPUT -p tcp --dport 7199 -j ACCEPT
More about how to configure Apache Cassandra
To configure the server, edit the file /etc/cassandra/cassandra.yaml. For more information about how to change the configuration, see the upstream configuration.
Cluster setup
For an example, refer to the Apache Cassandra Cluster page.
Apache Cassandra in the container
An Apache Cassandra container image can be found on Docker Hub as centos/cassandra-3-centos7. Starting a container for serving Cassandra is simple.
Install and start Docker.
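Prepare a directory for the database data:
mkdir data
chown 143:143 data
Note: You have to change the ownership of the data directory to match the cassandra user in the container so that the container can read and write the data.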
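If the container later fails to write its data, the ownership of the directory is the first thing to check. A small sketch of such a check follows; the helper name owned_by is hypothetical, and it relies on the -c option of GNU stat, so it assumes a Linux host.

```shell
# Hypothetical helper: succeed only if DIR is owned by the expected UID:GID.
# Uses GNU stat's -c option, so it assumes a Linux host.
owned_by() {
    [ "$(stat -c '%u:%g' "$1")" = "$2" ]
}

# Example (not run here): 143:143 is the owner set by the chown command.
# owned_by data 143:143 || echo "data is not owned by 143:143" >&2
```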
Start the container:
docker run --name cassandra -d -p 9042:9042 \
    -e CASSANDRA_ADMIN_PASSWORD=secret \
    -v "`pwd`/data":/var/opt/rh/sclo-cassandra3/lib/cassandra:Z \
    centos/cassandra-3-centos7
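The container stores its data in the prepared directory and creates a user and a database. The -e CASSANDRA_ADMIN_PASSWORD option is important: it defines the password for the admin user.
Note: The container image uses an admin user instead of cassandra; the cassandra user is deleted during the initialization phase.
Now you can try the Cassandra client. See Usage example. If you don't have the client tools installed, you can use the ones provided by the container: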
docker exec -it cassandra 'bash' -c 'cqlsh '`docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' cassandra`' -u admin -p secret'
More options are available; see the container README.
Usage example
$ cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.1 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cqlsh> CREATE KEYSPACE k1 WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
cqlsh> USE k1;
cqlsh:k1> CREATE TABLE users (user_name varchar, password varchar, gender varchar, PRIMARY KEY (user_name));
cqlsh:k1> INSERT INTO users (user_name, password, gender) VALUES ('John', 'test123', 'male');
cqlsh:k1> SELECT * from users;

 user_name | gender | password
-----------+--------+----------
      John |   male |  test123

(1 rows)
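The same statements can also be kept in a file and run non-interactively with the -f option of cqlsh. The sketch below is one way to do that. The file location is an arbitrary choice, and IF NOT EXISTS has been added to the CREATE statements (it is not in the session above) so that the file can be run more than once.

```shell
# Write the statements from the session above to a file. The path is an
# arbitrary choice; IF NOT EXISTS is added so the script can be re-run.
cql_file="${TMPDIR:-/tmp}/k1_example.cql"
cat > "$cql_file" <<'EOF'
CREATE KEYSPACE IF NOT EXISTS k1
  WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
USE k1;
CREATE TABLE IF NOT EXISTS users (
  user_name varchar, password varchar, gender varchar,
  PRIMARY KEY (user_name));
INSERT INTO users (user_name, password, gender) VALUES ('John', 'test123', 'male');
SELECT * FROM users;
EOF

# Run the file only when the client is installed on this machine.
if command -v cqlsh >/dev/null 2>&1; then
    cqlsh -f "$cql_file"
else
    echo "cqlsh not found; statements saved to $cql_file"
fi
```

When authentication is enabled, the -u and -p options shown earlier have to be added to the cqlsh call.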
Feedback
We will be glad to receive any feedback from you.
We are also looking for help with maintaining Apache Cassandra in Fedora, so if you would like to help, just contact us.