EXAClusterOS 6.0.6
Table of Contents
1. Introduction ............................................................................................................................... 1
1.1. System structure .............................................................................................................. 1
1.1.1. PID namespaces .................................................................................................... 1
2. Core Daemon ............................................................................................................................. 3
2.1. Mission .......................................................................................................................... 3
2.2. Concepts ........................................................................................................................ 3
2.3. Startup ........................................................................................................................... 3
3. EXAClusterOS Core Utils ............................................................................................................ 7
3.1. cosexec .......................................................................................................................... 7
3.2. cosps ............................................................................................................................. 8
3.3. cosadd ............................................................................................................................ 8
3.4. coskill ............................................................................................................................ 9
3.5. coskillall ......................................................................................................................... 9
3.6. cosmod .......................................................................................................................... 9
3.7. cosmv .......................................................................................................................... 10
3.8. cosrm ........................................................................................................................... 10
3.9. cosstop ......................................................................................................................... 11
3.10. cos-timeout-start ........................................................................................................... 11
3.11. coswait ....................................................................................................................... 11
3.12. hddident ...................................................................................................................... 12
4. DWAd .................................................................................................................................... 13
4.1. Mission ........................................................................................................................ 13
4.2. Startup ......................................................................................................................... 13
4.3. User interface ................................................................................................................ 13
4.3.1. dwad_client ........................................................................................................ 13
4.4. Design .......................................................................................................................... 18
4.5. Recovery mechanisms ..................................................................................................... 18
4.5.1. Process failures .................................................................................................... 18
4.5.2. Node crashes ....................................................................................................... 19
4.6. Network splits ................................................................................................................ 19
4.7. Additional information .................................................................................................... 19
5. Loggingd ................................................................................................................................. 21
5.1. Mission ........................................................................................................................ 21
5.2. Startup ......................................................................................................................... 21
5.3. User interface ................................................................................................................ 21
5.3.1. logd_client .......................................................................................................... 21
5.3.2. logd_collect ........................................................................................................ 21
6. Lockd ..................................................................................................................................... 23
6.1. Mission ........................................................................................................................ 23
6.2. Startup ......................................................................................................................... 23
7. StorageD ................................................................................................................................. 25
7.1. User interface ................................................................................................................ 25
7.1.1. csinfo ................................................................................................................ 25
7.1.2. cslabel ............................................................................................................... 25
7.1.3. csvol .................................................................................................................. 26
7.1.4. csctrl ................................................................................................................. 27
7.1.5. csmd .................................................................................................................. 28
7.1.6. csmove ............................................................................................................... 29
7.1.7. csrec .................................................................................................................. 29
7.1.8. csresize .............................................................................................................. 30
7.1.9. cssetio ................................................................................................................ 31
7.1.10. cssnap .............................................................................................................. 31
7.1.11. csconf .............................................................................................................. 32
8. Management ............................................................................................................................ 33
8.1. Logging ........................................................................................................................ 33
9. EXAoperation .......................................................................................................................... 37
9.1. Components .................................................................................................................. 37
9.2. Logging ........................................................................................................................ 37
9.3. Permissions ................................................................................................................... 37
9.4. Processes ...................................................................................................................... 41
9.4.1. Installation .......................................................................................................... 41
9.4.2. Booting .............................................................................................................. 41
9.4.3. Restore .............................................................................................................. 42
9.4.4. Restore (Storage) ................................................................................................. 42
9.5. Renaming of databases .................................................................................................... 43
9.6. Update servers ............................................................................................................... 43
9.7. Interfaces ...................................................................................................................... 43
9.7.1. Storage archive volumes ........................................................................................ 44
9.8. Maintenance user ........................................................................................................... 44
9.9. Failover ........................................................................................................................ 44
9.9.1. License server failure ............................................................................................ 44
9.9.2. Power outage and checksum mismatches .................................................................. 44
9.10. Automatic node reordering for Storage databases ................................................................ 45
9.11. Volume restore delay ..................................................................................................... 45
9.12. Using the EXAoperation browser interface ........................................................................ 45
9.12.1. Form "EXASolution Instances" ............................................................................. 45
9.12.2. Form "EXASolution Instance" .............................................................................. 48
9.12.3. Form "EXAStorage" ........................................................................................... 53
9.12.4. Form "EXAStorage Volume Node Information" ....................................................... 55
9.12.5. Form "EXAStorage Node Information" .................................................................. 56
9.12.6. Form "EXAStorage Node Device Information" ........................................................ 58
9.12.7. Form "EXABucketFS Services" ............................................................................ 58
9.12.8. Form "Cluster Nodes" ......................................................................................... 60
9.12.9. Form "Backups Information" ................................................................................ 66
9.12.10. Form "Access Management" ............................................................................... 68
9.12.11. Form "Versions" ............................................................................................... 69
9.12.12. Form "UDF Libraries" ....................................................................................... 71
9.12.13. Form "JDBC Drivers" ........................................................................................ 73
9.12.14. Form "EXACluster Debug Information" ................................................................ 75
9.12.15. Form "Monitoring Services" ............................................................................... 76
9.12.16. Form "Threshold Values" ................................................................................... 77
9.12.17. Form "Network" ............................................................................................... 78
9.12.18. Form "License" ................................................................................................ 82
9.13. EXAoperation Add/Edit Forms ........................................................................................ 83
9.13.1. Form "Create EXACluster Node" .......................................................................... 83
9.13.2. Form "EXACluster Node Properties" ..................................................................... 84
9.13.3. Form "EXACluster Node Disk Properties" .............................................................. 84
9.13.4. Form "Edit EXASolution Instance" ........................................................................ 85
9.13.5. Form "Create EXASolution Instance" ..................................................................... 85
9.13.6. Form "EXACluster Logging Service" ..................................................................... 86
9.13.7. Form "Create Remote Volume Instance" ................................................................. 86
9.13.8. Form "Create Jdbc Driver" ................................................................................... 87
9.13.9. Form "EXACluster Jdbc Drivers" .......................................................................... 87
9.13.10. Form "Create EXACluster Route" ........................................................................ 87
9.13.11. Form "EXACluster Route Properties" ................................................................... 87
9.13.12. Form "Create EXACluster Vlan" .......................................................................... 87
9.13.13. Form "EXACluster Vlan Properties" ..................................................................... 87
9.13.14. Form "Create EXACluster Public Vlan" ................................................................ 88
9.13.15. Form "EXACluster Public Vlan Properties" ........................................................... 88
9.13.16. Form "Create EXACluster Ipmi Group" ................................................................ 88
9.13.17. Form "EXACluster Ipmi Group Properties" ........................................................... 88
9.13.18. Form "Create Key Store" .................................................................................... 88
9.13.19. Form "Key Store Properties" ............................................................................... 88
EXAClusterOS 6.0.6 Reference
9.20.10. Remote syslog servers ...................................................................................... 118
10. Installation ........................................................................................................................... 119
10.1. Installation on a license server via installation medium ....................................................... 119
10.2. Automated installation of a license server via network ........................................................ 119
10.2.1. Configuration file settings .................................................................................. 121
10.3. Installation of EXAClusterOS on a bare CentOS server system ............................................ 122
10.4. Client Nodes .............................................................................................................. 123
10.4.1. Client boot process ............................................................................................ 123
10.5. Updates ..................................................................................................................... 124
10.5.1. Updates from version 4.x ................................................................................... 124
10.5.2. Updates and defect nodes ................................................................................... 124
10.6. Downgrades ............................................................................................................... 124
10.7. Add another license server into a cluster .......................................................................... 124
10.8. Add new disks to client nodes without re-installation .......................................................... 125
Glossary ................................................................................................................................... 127
List of Figures
1.1. PID namespaces ....................................................................................................................... 1
9.1. Show foreign database backups ................................................................................................. 42
9.2. Example view: Form "EXASolution Instances" ............................................................................ 45
9.3. Example view: Form "EXASolution Instance" ............................................................................. 48
9.4. Example view: Form "EXAStorage" .......................................................................................... 53
9.5. Example view: Form "EXAStorage Volume Node Information" ....................................................... 55
9.6. Example view: Form "EXAStorage Node Information" .................................................................. 56
9.7. Example view: Form "EXAStorage Node Device Information" ....................................................... 58
9.8. Example view: Form "EXABucketFS Services" ........................................................................... 58
9.9. Example view: Form "Cluster Nodes" ......................................................................................... 60
9.10. Example view: Form "Backups Information" .............................................................................. 66
9.11. Example view: Form "Access Management" .............................................................................. 68
9.12. Example view: Form "Versions" .............................................................................................. 69
9.13. Example view: Form "UDF Libraries" ...................................................................................... 71
9.14. Example view: Form "JDBC Drivers" ....................................................................................... 73
9.15. Example view: Form "EXACluster Debug Information" ............................................................... 75
9.16. Example view: Form "Monitoring Services" ............................................................................... 76
9.17. Example view: Form "Threshold Values" ................................................................................... 77
9.18. Example view: Form "Network" .............................................................................................. 78
9.19. Example view: Form "License" ............................................................................................... 82
9.20. DB RAM and hugepages ...................................................................................................... 116
List of Tables
4.1. Valid parameters ..................................................................................................................... 17
9.1. EXAoperation permissions ....................................................................................................... 39
Chapter 1. Introduction
1.1. System structure
Chapter 2. Core Daemon
Furthermore, the Core daemon maintains all processes executed in a cluster. All processes in an EXAClusterOS cluster are direct or indirect child processes of the Core daemon. They are executed in so-called partitions, so that every cluster process can be identified by a distinct partition ID and every instance of a distributed cluster process has an appropriate node ID. Partitions are hierarchical; thus every partition can have an arbitrary number of subpartitions.
Hint: If a cluster process wants to create a new subpartition, it has to send an appropriate request to the Core daemon. Processes created by means of the fork()/execve() system calls are identified by the same partition and node ID as their parent processes.
2.2. Concepts
The Core daemon applies three main concepts for process management in an EXAClusterOS cluster:
• Physical node: A physical node is a machine (physical or virtual) that runs an operating system. It can be embedded in one or more EXAClusterOS clusters, each having a unique cluster ID. A physical node cannot appear multiple times in the same EXAClusterOS cluster. A physical node is a member of a cluster once a Core daemon of that cluster has been started on it.
• Partition: An EXAClusterOS partition is a collection of one or more processes with the same start command on one or more physical nodes. A partition may contain a physical node multiple times. A process of this partition is identified by a logical node ID (a logical node ID may also identify further processes; see below).
• Logical node: A logical node identifies a process in a partition. Node IDs are counted from zero up to the partition size minus one. Furthermore, all threads and forked processes of this process share the same logical node ID in this partition with the "original" process.
2.3. Startup
The COS cluster daemon can be executed without providing any command line parameters. However, it might be
necessary to change some default settings. The COS kernel module must be installed for execution.
• --broadcast-port=port: Broadcast/multicast port for internal communication (only for testing purposes).
• --local-to-broadcast-port=port: Local UDP port to use for connecting with multicast/broadcast address (only
for testing purposes).
• -e, --exit-on-error: Force Cored to exit after the root process exits with a non-zero exit code.
• --inherit-priority: Let child processes inherit the reniced priority (otherwise, setpriority() is used to set the default priority for child processes when using --renice).
• -l dir, --log-directory=dir: Use specified log directory to redirect output of EXAClusterOS processes.
• --logfile-pattern=pattern: Use appropriate standard logfile pattern for logfile redirection (default:
%l/%e.%c.%p.%n.%P.log).
• -t id, --cluster-port=id: ID to use for cluster and UDP port for internal communication.
• --auth-sock-dir=directory: Specify directory in which to locate authentication socket(s) for clients (default:
/tmp).
• -w, --wait: Wait up to 60 seconds for messages from other core daemons at startup before doing anything else.
• --token-delay-time-ms=milliseconds: Specifies the minimum delay time between two regular tokens in operational state.
• --token-loss-timeout=seconds: Specifies the timeout after which Cored tries to gather a new cluster.
• --token-retransmission-timeout=seconds: Specifies the timeout after which Cored resends the token into the cluster.
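As an illustration of the --logfile-pattern option above, the following bash sketch expands the default pattern %l/%e.%c.%p.%n.%P.log. The placeholder meanings used here are assumptions made for this illustration (%l = log directory, %e = executable name, %c = cluster ID, %p = partition ID, %n = logical node ID, %P = process ID); they may differ from the actual Cored implementation.

```shell
#!/usr/bin/env bash
# Hypothetical expansion of the Cored logfile pattern.
# Placeholder meanings are assumptions, not taken from the Cored sources.
expand_logfile_pattern() {
    local pattern=$1 logdir=$2 exe=$3 cluster=$4 part=$5 node=$6 pid=$7
    pattern=${pattern//%l/$logdir}   # %l: log directory (-l option)
    pattern=${pattern//%e/$exe}      # %e: executable name
    pattern=${pattern//%c/$cluster}  # %c: cluster ID
    pattern=${pattern//%p/$part}     # %p: partition ID
    pattern=${pattern//%n/$node}     # %n: logical node ID
    pattern=${pattern//%P/$pid}      # %P: UNIX process ID
    printf '%s\n' "$pattern"
}

expand_logfile_pattern '%l/%e.%c.%p.%n.%P.log' /var/log/cos mydaemon 0 7 2 4242
# prints /var/log/cos/mydaemon.0.7.2.4242.log
```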
Furthermore, the following arguments may/must be specified on the command line: {root process command}
[command arguments]
Chapter 3. EXAClusterOS Core Utils
3.1. cosexec
The newly started partition includes all nodes in the same order they are specified with -n or -N. With -c, the CoreD chooses suitable nodes depending on their current usage. If no node is specified, a new partition with only one node is created.
• -g groupid, --gid=groupid: Start partition process(es) with specified group ID (root user only).
• -l name, --alternative-name=name: Use alternative name for partition (instead of first command line argument).
• -t, --show-root-node-ids: Show root node IDs instead of logical node IDs when redirecting I/O to stdout/stderr.
• -u userid, --uid=userid: Start partition process(es) with specified user ID (root user only).
• --wait-status: Wait for partition to be finished and exit with status 2 in case partition exited with code not equal
to 0.
• -z node-ids, --except-nodes=node-ids: Start the partition on all nodes except the specified ones and do not consider these nodes for auto-add.
Furthermore, the following arguments may/must be specified on the command line: command [command arguments]
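Two hypothetical cosexec invocations may illustrate the options above. The node IDs, paths and the service command are made-up example values, and the interpretation of the -c argument as a node count is an assumption.

```shell
# Start a three-node partition on the given root nodes (assumed IDs):
cosexec -n 11,12,13 /usr/bin/my_service --port 9000

# Let CoreD choose nodes by current usage (-c argument assumed to be a
# node count) and propagate a failure of the partition as exit status 2:
cosexec --wait-status -c 2 /usr/bin/my_batch_job
```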
3.2. cosps
Show general information about the cluster. The first part of the output displays known cluster nodes, their root node ID and state (offline or online), while the second part contains the partition table. Each entry of this table is characterized by a partition ID, a user and group ID, the parent partition ID, a list of partition nodes and the executed command.
• -N, --physical-nodes: Show physical node IDs for each logical node.
3.3. cosadd
Extend an existing partition by one node.
• -r id, --root-node-id=id: Node ID of the new node when adding nodes to the root partition.
Furthermore, the following arguments may/must be specified on the command line: {partition ID}
3.4. coskill
Send a signal to one or all cluster process instances (logical nodes) addressed by its partition (and logical node
ID).
Furthermore, the following arguments may/must be specified on the command line: [-SIGNO] {partition
ID} {node ID}
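A hypothetical invocation following the syntax above; the partition and node IDs are made-up example values.

```shell
# Send SIGTERM (signal number 15) to the cluster process instance with
# logical node ID 0 in partition 7:
coskill -15 7 0
```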
3.5. coskillall
Send a signal to every cluster process instance (logical node) of one or more partitions addressed by their name.
• -a, --wait-nodes: Wait for exit of nodes in partition, but not for exit of whole partition (useful for auto-restart
partitions).
Furthermore, the following arguments may/must be specified on the command line: [-SIGNO] {partition
ID}
3.6. cosmod
Set/unset several partition flags.
• -c seconds, --set-consensus-timeout=seconds: Timeout for reaching consensus with a new cluster configuration.
• -l seconds, --set-token-loss-timeout=seconds: Specifies timeout of token after which cored tries to gather a
new cluster.
Furthermore, the following arguments may/must be specified on the command line: {partition}
3.7. cosmv
Move a cluster process instance to another cluster node. The addressed logical node (that is, its corresponding UNIX process) will be stopped if it is still running and then started on the specified root node. This is not really a "move", because all state of the former process is lost; the new process is merely addressed with the same partition and logical node ID.
Furthermore, the following arguments may/must be specified on the command line: {partition} {node}
3.8. cosrm
Remove the last node or all nodes of a partition. If the cluster process instance is still running, it is stopped. After the last logical node of a partition has been removed, the whole partition is removed from the partition table.
• -n IDs, --node=IDs: Comma-separated list of root node IDs to remove (only when removing nodes from the
root partition).
• -f, --force: Force removal of physical node (only used if removing nodes from root partition).
Furthermore, the following arguments may/must be specified on the command line: {partition ID}
3.9. cosstop
Stop a cluster process instance (logical node) addressed by its partition and logical node ID.
Furthermore, the following arguments may/must be specified on the command line: {partition ID} {node
ID}
3.10. cos-timeout-start
Execute a command and wait for its completion within a user-specified timeout. If the process finishes in time, the command's exit status is returned; otherwise (or on any other error) exit code 100 is returned and the started process is killed with SIGTERM.
Furthermore, the following arguments may/must be specified on the command line: {timeout} {command}
[command arguments]
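The exit-code contract described above can be sketched in plain shell. The function below is a behavioral analogy built on the coreutils timeout command, not the actual cos-timeout-start implementation.

```shell
#!/usr/bin/env bash
# Behavioral sketch of cos-timeout-start, built on coreutils timeout.
# Not the real tool: it only mimics the documented exit-code contract.
cos_timeout_start_sketch() {
    local timeout_s=$1; shift
    timeout --signal=TERM "$timeout_s" "$@"
    local status=$?
    # coreutils timeout reports expiry as 124; cos-timeout-start uses 100.
    if [ "$status" -eq 124 ]; then
        return 100
    fi
    return "$status"
}

cos_timeout_start_sketch 1 sleep 5; echo "exit: $?"   # prints "exit: 100"
cos_timeout_start_sketch 5 true;    echo "exit: $?"   # prints "exit: 0"
```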
3.11. coswait
coswait may be used at cluster startup to wait until a given condition is fulfilled, for example until a certain number of nodes is available.
• -s number, --sleep=number: Number of seconds to sleep after the condition became true and before executing
command.
Furthermore, the following arguments may/must be specified on the command line: command [command arguments]
3.12. hddident
HDD Ident is used to store and restore metadata of HDD drives.
Chapter 4. DWAd
4.1. Mission
The DWAd service can be used to manage one or more EXASolution DW systems and provides a user interface to access its database administration functionality. It can perform automatic failure recovery in case of process and node crashes.
4.2. Startup
This daemon has to be started as a service in its own partition. This partition must not be the root partition and may be resized later by the appropriate EXAClusterOS tools. When specifying systems, make sure the DWAd service is started on every node dedicated to be a database node. It requires one partition with an active LoggingD and one with an active LockD service.
• --overcommit-memory: Ignore memory resource limits on nodes and balance SLB equally across EXASolution systems.
• --store-config=file: Name of file in which to store configuration data at the end of a DWAd process.
4.3. User interface
4.3.1. dwad_client
This program is a client for the DWAd service.
• start-wait {name}: Start system and wait until it becomes reachable by DB client applications.
• start-create-new-db-features {name} {features}: Start system with specified features and the -create-new-db flag.
• pdd-restore-features {name} {features}: Start system with specified features for a PDD restore.
• stop-signal {name} {signal number} [timeout]: Stop system with specified signal and timeout.
• stop-wait {name}: Stop system and wait until all processes stopped.
• stop-force {name}: Stop system and set it immediately into setup state.
• setup-node-groups {node group}: Group database nodes. Each group must be specified as one parameter on the command line.
• uptime {name}: Show uptime of system in seconds. This uptime starts with a system being connectable.
• show-files {name} {iproc}: List datafiles of system for specified iproc number.
• pdd-proc-wait {name}: Wait for information about PDD process of system and print.
• check-restore-ready-state {name}: Check whether PDD server is able to receive restore requests.
• switch-nodes {name} {active node} {reserve node}: Move active node to reserve node list and vice versa.
• start-on-nodes-wait {name} {exec}: Start command on all nodes of system and wait for partition shutdown.
• storage-backup {name} {volume id} {level} {expire time}: Do backup of a system (via Storage daemon).
• shrink-db {name} {size} {on-persistent-volume (1|0)} {dry-run (1|0)}: Shrink database volume size to specified
number of MiB.
• storage-restore {name} {volume id} {backup name}: Restore backup into system (via Storage daemon).
• storage-restore-nonblocking {name} {volume id} {backup name}: Restore backup into system (via Storage
daemon).
• storage-restore-virtual {name} {volume id} {backup name}: Restore backup into system (via Storage daemon).
• wait-state {name} {state} {timeout}: Wait for database system to reach defined state (running, setup).
• dump-data {1|0} {1|0} (unknown): Produce DWAd data dump. Consistent state = yes[1]/no[0]; all nodes = yes[1]/no[0].
• dump-data-xml {1|0} {1|0} (unknown): Produce DWAd data dump in XML format. Consistent state = yes[1]/no[0]; all nodes = yes[1]/no[0].
• protect-node-mem {mem}: Save memory on nodes for other purposes (in GB).
• print-protected-node-mem: Print amount of memory saved on each node from EXASolution (in GB).
• disallow-inserts {name} {rawsize} {memsize}: Disallow INSERT statements (all values in MiB).
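A hypothetical session with dwad_client, using the subcommands listed above; the system name "exa_db1" and the timeout are made-up example values.

```shell
# Start the system and wait until database clients can connect:
dwad_client start-wait exa_db1

# Wait at most 300 seconds for the system to reach state "running":
dwad_client wait-state exa_db1 running 300

# Stop the system and wait until all of its processes have exited:
dwad_client stop-wait exa_db1
```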
4.4. Design
• A process exits while its system is in state "starting": a SIGABRT signal is sent to all remaining system processes by the current DWAd master node, and the system is set to state "setup" immediately.
• A process exits with return code 0 while its system is in state "running": a SIGTERM signal is sent to all remaining system processes by the current DWAd master node, which checks that all system processes shut down within the time frame specified by the user at setup time.
• A process exits with return code 1 while its system is in state "running": this case is handled like the former one, except that the system is restarted after all processes have exited. A controller-requested system restart is logged.
• A process exits with a return code other than 0 or 1 while its system is in state "running": this case is handled like the former one, except that the system log records an unexpected process exit rather than a controller-requested restart.
Chapter 5. Loggingd
5.1. Mission
The Loggingd service is used in situations in which it would be too expensive to search all local service logs for a certain expression such as a warning or an alert. It collects the logs of a given service from all nodes and can show them in a time-sorted fashion with a provided client tool. Important log entries can thus be found quickly and almost independently of cluster size.
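The time-sorted merging of per-node logs that Loggingd performs can be illustrated with a k-way merge (a conceptual sketch only, not Loggingd's code; the entry format is an assumption):

```python
import heapq

# Hypothetical per-node log streams, each already sorted by timestamp.
node1 = [(1, "n1", "service started"), (5, "n1", "warning: disk slow")]
node2 = [(2, "n2", "service started"), (3, "n2", "alert: link down")]

# heapq.merge yields one globally time-sorted stream without
# concatenating and re-sorting all logs in memory.
merged = list(heapq.merge(node1, node2, key=lambda e: e[0]))
for ts, node, msg in merged:
    print(ts, node, msg)
```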
5.2. Startup
This daemon has to be started as a service in its own partition. This partition must not be the root partition and may be resized later with the appropriate COS tools. Make sure to start the Loggingd service on every node dedicated to collecting global logs.
5.3. User interface
5.3.1. logd_client
This program shows information about a LoggingD partition and triggers some administrative tasks.
5.3.2. logd_collect
This program shows global log entries from the specified services.
• -p priorities, --prio=priorities: Search log entries with the specified priorities (comma-separated).
• -o priorities, --not-prio=priorities: Search log entries that do not have the specified priorities (comma-separated).
• -n nids, --nodes=nids: Comma-separated list of Loggingd nodes that should be asked for log entries.
Furthermore, one or more service names must be specified on the command line: {service name 1} [{service name 2} ...]
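The effect of the -p/--prio and -o/--not-prio options can be sketched as set-based filtering (illustrative only; the entry structure and priority names are assumptions, not logd_collect internals):

```python
def filter_entries(entries, prio=None, not_prio=None):
    """Keep entries whose priority is in `prio` (if given) and not in
    `not_prio` (if given), mirroring the -p and -o options above."""
    keep = set(prio.split(",")) if prio else None
    drop = set(not_prio.split(",")) if not_prio else set()
    return [e for e in entries
            if (keep is None or e["prio"] in keep) and e["prio"] not in drop]

logs = [{"prio": "Warning", "msg": "a"},
        {"prio": "Information", "msg": "b"},
        {"prio": "Error", "msg": "c"}]
print(filter_entries(logs, prio="Warning,Error"))   # keep only these
print(filter_entries(logs, not_prio="Information")) # drop these
```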
Chapter 6. Lockd
6.1. Mission
The Lockd service manages global operations for EXAClusterOS processes through a simple interface, without requiring knowledge of the parallel algorithms used internally. Its interface can be used to manage global locks as well as global barriers. Lockd furthermore detects problems such as process crashes inside global critical sections and communicates them to its clients.
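The global-barrier semantics Lockd offers can be illustrated with an in-process analogue using Python's threading.Barrier (Lockd itself works cluster-wide over its own protocol; this is only a conceptual sketch):

```python
import threading

results = []
barrier = threading.Barrier(3)  # all 3 workers must arrive before any proceeds

def worker(i):
    results.append(("before", i))
    barrier.wait()              # blocks until every worker has arrived
    results.append(("after", i))

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Every "before" entry precedes every "after" entry.
print(all(r[0] == "before" for r in results[:3]))
```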
6.2. Startup
This daemon has to be started as a service in its own partition. This partition must not be the root partition and may be resized later with the appropriate EXAClusterOS tools. Make sure to start the Lockd service on every node dedicated to using global locks. It requires one partition with an active Loggingd service.
Chapter 7. StorageD
7.1. User interface
7.1.1. csinfo
csinfo queries (and displays) information about existing volumes and nodes. The level of detail for the displayed information can be specified using the --level option.
• -r, --range: print the range of bytes (or blocks) on each node of the given volume.
• -m, --masters_only: consider only master segments (regardless of their current state) when requesting block ranges (default: use node IDs of the deputy segment if the master is offline).
• -l level, --level=level: level of detail for the information displayed (default: 0).
• -p, --include-partitions: include partitions when requesting volume or node information (may slow down the request).
• -R, --red-dist: show the redundancy distribution of the given node (or all nodes).
7.1.2. cslabel
With cslabel you can perform the following actions:
• add a label to a volume
• remove a label (or all labels) from a volume
• find all volumes with a given label
Every volume can have an arbitrary number of labels. You can add or remove labels at any time (no matter whether the volume is online or offline) if you are the owner of the volume. The labels of one volume must be unique, i.e. no label can be added more than once.
• -F, --front: add label to the front of the list (default: end of the list).
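The uniqueness rule and the -F/--front option described above can be sketched as follows (an illustrative model of the documented rules, not cslabel's code):

```python
def add_label(labels, label, front=False):
    """Add `label` to a volume's label list. Duplicates are rejected;
    front=True mimics -F/--front by inserting at the front of the list."""
    if label in labels:
        raise ValueError(f"label {label!r} already present on this volume")
    if front:
        labels.insert(0, label)
    else:
        labels.append(label)  # default: end of the list
    return labels

labels = add_label([], "prod")
labels = add_label(labels, "archive", front=True)
print(labels)
```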
7.1.3. csvol
With csvol you can perform the following actions:
• create a volume
• delete a volume
• close a volume
• change permissions
• change owner and group
• lock a volume
• unlock a volume
• change the shared flag
• change the priority
• clear data on a volume
• -r number, --redundancy number: the redundancy for the volume (max. 255).
• -n string, --nodes string: list of node IDs to be used for the volume.
• -N node names, --node_names=node names: list of node names to be used for the volume.
• -t string, --partition string: partition for which the volume should be closed.
7.1.4. csctrl
This program implements some control mechanisms for the storage service. It is able to:
• start (or restart) EXAStorage
• shut down EXAStorage
• print the current UUID-NodeID mapping
• suspend (or resume) an EXAStorage node
7.1.5. csmd
With csmd you can perform the following actions:
• print information about existing metadata files
• convert metadata files to another version
• compare different metadata files
• print the history of modifications
• -p, --print: print info (content and version) about the serialized metadata.
• -X, --to-text: convert given metadata file to text format (XML in case of COS serialization).
7.1.6. csmove
With csmove you can:
• move one or more nodes of a given volume to another node
• move a single segment to another node
Moving a node/segment may be denied in any of the following cases:
• one or more segments on the source node have a snapshot map
• one or more segments on the source node have a redundancy segment on the destination node
• the source node is used for recovering another node and no other suitable node is available
However, one can force the movement using the --force flag (see below).
• -s node IDs, --src-nodes=node IDs: list of physical node IDs that should be moved.
• -S node names, --src-names=node names: list of node names that should be moved.
• -d node IDs, --dest-nodes=node IDs: list of physical node IDs that contains the destination node for each node
that should be moved (matched by index).
• -D node names, --dest-names=node names: list of node names that contains the destination node for each node
that should be moved (matched by index).
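The denial conditions listed above can be expressed as a predicate (illustrative only; the segment field names are assumptions, not StorageD data structures):

```python
def move_denied(seg, dest_node, force=False):
    """Return the reason a node/segment move would be denied, per the
    cases above, or None if the move may proceed."""
    if force:
        return None  # --force overrides all checks
    if seg.get("has_snapshot_map"):
        return "segment has a snapshot map"
    if seg.get("redundancy_node") == dest_node:
        return "redundancy segment lives on the destination node"
    if seg.get("recovery_source") and not seg.get("other_source_available"):
        return "source node is recovering another node"
    return None

seg = {"has_snapshot_map": False, "redundancy_node": 12}
print(move_denied(seg, 12))
```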
7.1.7. csrec
With csrec you can perform the following actions:
• list all existing recovery maps (in the cluster or for a volume)
• see the recovery completion status of a given volume
• start recovery on a given node (and volume)
• enable and disable background recovery (on selected nodes)
Only one action can be performed at a time.
• -l, --list: List all existing recovery maps for the given volume (-v) or all volumes (default).
• -s, --show: Show the completion status of all recovery maps of the given volume (in percent).
• -r, --restore-node: Restore the data on a given node from a redundancy node.
• -d, --restore-hdd: Restore the data on the given hdd from a redundancy node.
• -o, --bg-off: Turn background recovery OFF on the given nodes (default: all nodes).
• -O, --bg-on: Turn background recovery ON on the given nodes (default: all nodes).
• -P string, --bg-rec-profile string: Set background recovery profile ('one', 'two', 'three').
• -t number, --timeout number: Time (in seconds) to wait for restoration (default: 300).
7.1.8. csresize
With csresize one can resize an existing volume in various ways:
• append new nodes to the volume
• remove nodes from the volume
• enlarge each node/segment of a volume
• shrink each node/segment of a volume
• increase the redundancy of an existing volume
• decrease the redundancy of an existing volume
Only one action can be performed at a time. See below for explanations of each resizing method.
• -b number of blocks, --blocks=number of blocks: number of blocks by which each segment of a volume should be enlarged/shrunk.
• -n node IDs, --nodes=node IDs: list of physical node IDs that should be appended/removed.
• -N node names, --nodenames=node names: list of node names that should be appended/removed.
7.1.9. cssetio
Some details.
7.1.10. cssnap
With cssnap you can perform the following actions:
• create a new snapshot
• release an existing snapshot relation
• list all existing snapshots (in the cluster or for a volume)
• see the completion status of a snapshot
• enable and disable background snapshot creation (on selected nodes)
Only one action can be performed at a time.
• -l, --list: List all existing snapshots for the given volume (-v) or all volumes (default).
• -s, --show: Show simple completion status of all snapshots of the given volume.
• -S, --show-detailed: Show the detailed completion status of all snapshots of the given volume.
• -o, --bg-off: Turn background snapshot creation OFF on the given nodes (default: all nodes).
• -O, --bg-on: Turn background snapshot creation ON on the given nodes (default: all nodes).
• -p number, --priority number: Set priority for background operations (default: 100).
• -r number, --redundancy number: Redundancy for the snapshot volume (default: 1).
• -h string, --hdd-type string: Type of HDD for the snapshot-volume (default: same as source vol.).
• -C, --copy-labels: Inherit all labels from the source volume (default: false).
• -d, --distinct-nodes: Use only distinct nodes for auto-selection (i.e. nodes that are not used by the source volume).
• -L, --local-nodes: Use the volume's current master nodes for building the snapshot.
• -n string, --nodes string: List of physical node IDs that should be used for the snapshot (default: auto-select) or for enabling/disabling background snapshot creation (default: all nodes).
• -N string, --node-names string: List of node names that should be used for the snapshot (default: auto-select) or for enabling/disabling background snapshot creation (default: all nodes).
7.1.11. csconf
With csconf one can modify the following parameters that are part of the EXAStorage configuration file:
• max_bg_mem
• max_oth_mem
• max_num_bg_ops
• max_bytes_per_bg_op
• use_group_io
• optimize_sort
• use_nw_aio
• use_nw_ooo
• clean_interval
It can also print the default values for the current installation.
• -M number, --max-bg-mem number: max. memory usage (in bytes) for background operations.
• -S number, --max-oth-mem number: max. memory usage (in bytes) for other operations.
• -c number, --clean-interval number: set the interval (in sec.) for the I/O cleaner thread.
• -P number, --min-nw-perf number: the min. assumed network throughput (in bytes per sec).
• -d number, --rec-delay number: time period (in seconds) to wait before starting background data restoration.
• -w number, --space-warn-threshold number: the space usage threshold at which a warning is generated.
Chapter 8. Management
8.1. Logging
The following logging types are available with EXAClusterOS:
• Syslog
This service is used to log internal Linux kernel information and should be used for debugging purposes only.
This information is locally available on the node. All syslog information is written into one file:
/var/log/all.log.
Syslog files are rotated with OS internal rotation mechanisms (e.g. logrotate).
• Loggingd
Loggingd is used for log information needed for cluster monitoring. This information is available on every
node and with the EXAoperation.
This service writes its data to the /var/log/logd directory and is rotated automatically with the cos-logdir-rotate command.
• Directories
Some services use logging in individual process files. For example, Cored uses it to write the output of all commands started with EXAClusterOS. This information should be used for debugging purposes only and is available only locally on each node. The following services use this logging type:
• EXASolution: /<disk>/<database>/log
• Cored: /var/log/cored
All log directories are rotated automatically with the cos-logdir-rotate command.
• EXAoperation
/usr/opt/EXASuite-6/EXAClusterOS-6.0.6/var/exaoperation/log
For rotating log data, the command cos_logdir_rotate is available. Its command line parameters are as follows:
2. <max files> - this number of logfiles will be left in the log directory.
3. <backups> - if the number of backups is larger than this parameter, older backups will be removed.
4. <signal> - the signal which should be sent to the processes holding open files. If this parameter is not given, open files will not be rotated.
5. <pattern> - if this regular expression pattern matches the name of a file and this file is open, the file will be renamed and the process which holds the file will receive the appropriate signal.
2. Read the list of files and the process ID list of the processes which use the files.
3. If a signal is given, all files matching the pattern are renamed. Processes that used such a file will receive the designated signal.
5. If <max files> is not equal to zero, all files will be renamed to <filename>.<number>. If <max files> is equal to zero, the old files will be removed.
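The renaming scheme in the last step can be sketched as follows (a simplified illustration of the described behaviour, not the actual cos-logdir-rotate code; signal delivery and the <backups>/<pattern> handling are omitted):

```python
import os
import tempfile

def rotate(directory, max_files):
    """Rename each plain logfile to <filename>.<number>, shifting older
    rotations upward; max_files == 0 removes old files instead."""
    for name in sorted(os.listdir(directory)):
        if "." in name and name.rsplit(".", 1)[1].isdigit():
            continue  # already a rotated copy
        # Shift existing rotations: file.2 -> file.3, file.1 -> file.2, ...
        for n in range(max_files - 1, 0, -1):
            src = os.path.join(directory, f"{name}.{n}")
            if os.path.exists(src):
                os.rename(src, os.path.join(directory, f"{name}.{n + 1}"))
        path = os.path.join(directory, name)
        if max_files == 0:
            os.remove(path)      # zero means: drop old files entirely
        else:
            os.rename(path, path + ".1")

d = tempfile.mkdtemp()
open(os.path.join(d, "cored.log"), "w").close()
rotate(d, 3)
print(sorted(os.listdir(d)))
```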
The following list gives an overview of the logfiles on the license server.
1. /usr/opt/EXASuite-6/EXAClusterOS-6.0.6/var/exaoperation/log/access.log
2. /usr/opt/EXASuite-6/EXAClusterOS-6.0.6/var/exaoperation/log/output.log - EXAoperation log
3. /usr/opt/EXASuite-6/EXAClusterOS-6.0.6/var/exaoperation/log/zeo.log
4. /usr/opt/EXASuite-6/EXAClusterOS-6.0.6/var/exaoperation/log/zope.log - ZOPE log
The following list gives an overview of the logfiles in the initrd environment on client nodes (accessible with the rssh n{number} command):
1. /var/log/hddmount.log
2. /var/log/hddinit.log
3. /var/log/cos_startup.log
The following list gives an overview of the logfiles that can be found on client nodes via login with SSH on port 22.
1. /d02_data/{database name}/log/process/*
The following list gives an overview of the logfiles that can be found on the license server as well as on client nodes via SSH on port 22.
1. /var/log/cored/*
2. /var/log/logd/*
Global logs of the EXAClusterOS processes (Cored, DWAd, Lockd, EXASolution); these can be viewed globally with the logd_collect command.
Chapter 9. EXAoperation
9.1. Components
The EXAoperation service is composed of the following components:
• Application server
This is the core component which implements the frontend and all backend processes.
• Configuration database
This service executes processes which are initiated from the frontend.
• Unix services
The following standard Unix services are used for booting and scheduling:
• DHCP Server
• XINET Service
• Cron daemon
• TFTPd
• SSH
• Syslog
9.2. Logging
To show the information of the Loggingd service, a periodic job is triggered with crond every minute. This job uses the logd_collect command to collect the data and write it to a file. This file is then accessible via the EXAoperation frontend.
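Such a periodic collection job could look like the following crontab entry (purely illustrative; the actual job definition, priority values, service name, and output path are assumptions not specified in the text):

```shell
# Hypothetical crontab entry: every minute, collect global log entries
# and write them where the EXAoperation frontend can read them.
* * * * * root logd_collect -p Warning,Error exasolution > /var/log/logd_frontend.txt
```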
9.3. Permissions
To manage permissions in EXAoperation you have the following predefined roles:
1. Master
2. Administrator
As administrator you can manage the cluster, but you cannot change the license or the password for disk encryption, or assign the master role to a user.
3. Supervisor
A supervisor has the same rights as an administrator but without the possibility to change anything. It is used to monitor the cluster.
4. User
A user has the same rights as a supervisor, but can only view the basic state of nodes and databases.
In EXAoperation every user can have a different role for any object, so it is possible to have administrator rights
on one database and user or supervisor rights on all other objects.
9.4. Processes
9.4.1. Installation
If a node is in installation mode, its boot process is shown as follows:
5. Install software.
9.4.2. Booting
On activation of a node its boot process is shown as follows:
9.4.3. Restore
For restoring a database, the database must be created in EXAoperation, but not be started. The steps of the restore
process are the following:
1. Check whether enough files are available. This means having a node file for every node number and a metadata
file.
3. Trigger database to read backup files, i.e. start the restore process over EXAoperation.
When using offline backups, the backup files must be moved to the archive nodes, including an empty 'dontexpire' file. They have to be located in a directory whose name matches the backup name (which is usually a timestamp). The files also have to be located on the nodes where they were originally created. If not, the metadata file ("backup.ini") has to be adjusted first.
1. Blocking restore: This restore mechanism loads all data into the database before setting the database into a
mode in which it accepts connections. This is the fastest restore mechanism.
2. Non-blocking restore: This mechanism only loads the most necessary part of the data into the database and immediately sets the database into a mode in which it accepts connections. This mechanism is useful for decreasing the downtime of the database, but will load data slightly slower than the blocking restore mechanism.
3. Virtual-access restore: This mechanism starts the database in a read-only mode, so no write operations are possible. It is useful for restoring only a single object of a database backup into another database via IMPORT/EXPORT.
Hint: Remote archive volumes can only be used for blocking restore processes. All other restore types require
further functionality that is only available in internal cluster volumes. Thus, a remote backup must be moved to a
cluster archive volume first in such a situation.
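The restriction above can be expressed as a small validity check (a sketch of the documented rule; the function and mode names are hypothetical):

```python
def restore_mode_allowed(mode, volume_is_remote):
    """Remote archive volumes support only blocking restores; internal
    cluster volumes support all three modes (per the hint above)."""
    if volume_is_remote:
        return mode == "blocking"
    return mode in ("blocking", "non-blocking", "virtual-access")

print(restore_mode_allowed("virtual-access", volume_is_remote=True))
```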
To restore backups from other databases, use the "Show foreign database backups" button in the "EXASolution Database Backup List" form (see screenshot above). One can restore backups from arbitrary EXASolution databases as long as the number of nodes matches.
1. Create a backup from the database with EXAoperation. You may stop it right afterwards.
2. Create a new database with a different name (and a different communication and connection port if the old
database is still running) but with the same number of nodes.
3. Edit the backup properties and insert the new database into "Systems". Afterwards, restore the backup into
the new database.
1. After deleting the old database, all backups of this system will be deleted.
2. Change the references from the old database to the new one, especially in the scheduler, monitor and backup
view.
3. You may have to change the database name in external tools, e.g. in monitoring tools.
4.x.2/EXAClusterOS-4.x.2_LS-Update-CentOS-6.2_x86_64.pkg
4.x.3/EXAClusterOS-4.x.2_LS-Update-CentOS-6.2_x86_64.pkg ->
../4.x.2/EXAClusterOS-4.x.2_LS-Update-CentOS-6.2_x86_64.pkg
4.x.3/EXASolution-4.x.3_x86_64.pkg
This structure enables an EXASuite cluster in version 4.x.2 to find database version 4.x.3 (see the file link) and
to show this version as applicable database version in the appropriate EXAoperation form. It would further enable
an EXASuite cluster with a version smaller than 4.x.2 to update to EXASuite version 4.x.2.1 A patchlevel for
version 4.x.2 would have to be located in the 4.x.2 directory.
9.7. Interfaces
The following interfaces are available with EXAoperation.
• EXAoperation frontend - HTTP on port 80 and HTTPS on port 443 on all cluster nodes
• Storage archive volumes - Ports 2021 (FTP), 2022 (SFTP), 2080 (HTTP), and 2443 (HTTPS) on all cluster
nodes
1 Even with a wrong file hierarchy, EXAoperation will be able to detect whether an update is applicable. Every package is signed and will be checked once the update process is in progress.
9.8. Maintenance user
• testdb/id_0/level_0/node_0/metadata_{timestamp}
• testdb/id_0/level_0/node_0/backup_{timestamp}
• testdb/id_0/level_0/node_1/backup_{timestamp}
To store this backup on an offline storage system, all three files must be downloaded. Restoring this backup from an offline storage system requires all these files to be uploaded into exactly this file structure. Alternatively, you may choose to download this backup in compressed form. For this, you would have to download the virtual file testdb/id_0.tar.gz (which must be uploaded to exactly the same location when restoring from an offline archive). Consider that these tar.gz files are generated on the fly. This may limit the download speed to around 20 MiB/s, while the limiting factor for uncompressed files will normally be the network speed.
9.9. Failover
• Every cluster node is able to host EXAoperation (not only license servers).
• The interfaces of EXAoperation (HTTP/HTTPS/XML-RPC) are reachable on all cluster nodes, but only one
node has a running instance of EXAoperation (e.g. is able to boot nodes) and thus is the EXAoperation master
node. All other nodes act as proxy servers and will redirect requests.
• In case of a failover, EXAoperation will not be connectable for a short period of time (up to one minute) and
a user will experience "Connection refused" messages during this process.
• The default license server is named 'n10'. This name should be used for XML-RPC functions such as callPlugin(), as the former 'license' name only refers to the current EXAoperation master node. Additional license servers may be named 'n1' up to 'n9'.
1. Fix checksums: All checksums on the selected node(s) are verified against the data and, if there is a mismatch, the affected checksum will be regenerated. This operation may take hours to complete (depending on the amount of data that has already been written).
2. Discard checksums: All checksums on the selected node(s) are reset to 0 (as if no data has ever been written).
This operation takes only a few minutes, but the checksums are lost and only regenerated as new data is
written.
1. Move appropriate volume node to target node and wait for completion.
2. Stop database.
3. Start database.
Add a database
Precondition(s):
9.12. Using the EXAoperation browser interface
Postcondition(s):
The newly configured database is shown in the "EXASolution Instances Information" form.
Steps to do:
2. A new form opens. Configure the database and click "Add" once again.
Delete a database
Precondition(s):
The selected database(s) are not running and there is no logservice or backup that is configured with this database.
Postcondition(s):
The selected database(s) will not show up anymore in the "EXASolution Instances Information" form.
Steps to do:
1. Select database(s).
Start a database
Precondition(s):
Postcondition(s):
Steps to do:
1. Select database(s).
Restart a database
Precondition(s):
Postcondition(s):
Steps to do:
1. Select database(s).
Stop a database
Precondition(s):
Postcondition(s):
Steps to do:
1. Select database(s).
Start database
Precondition(s):
The database must exist and not be running. More than 50% of all cluster nodes and all necessary database nodes
must be online.
Postcondition(s):
Steps to do:
Restart database
Precondition(s):
Postcondition(s):
Steps to do:
Stop database
Precondition(s):
The database must be running. More than 50% of all cluster nodes must be online.
Postcondition(s):
Steps to do:
Create database
Precondition(s):
The database must not be running or in any other operation (e.g. backup). More than 50% of all cluster nodes must
be online. Each node that is member of the database system must be online or must have been online before.
Postcondition(s):
All database directories will have been created. The DWAd service knows about the database and will start it in
create mode next time. Within the next minute, a Samba share for this system will have been exported.
Steps to do:
The database must exist and not be running. More than 50% of all cluster nodes must be online.
Postcondition(s):
Steps to do:
Remove database
Precondition(s):
The database must exist and not be running. More than 50% of all cluster nodes must be online.
Postcondition(s):
All files of the database (log files, data files and backups) will be removed.
Steps to do:
The database may be in any state if only changing reserve nodes. For all other parameters, the database has to be
created.
Postcondition(s):
Steps to do:
Start a backup
Precondition(s):
Postcondition(s):
A backup process for the selected system will have been started.
Steps to do:
The database has been created, is not running and has an appropriate number of reserve nodes.
Postcondition(s):
The database will have been enlarged and an appropriate number of reserve nodes are active nodes now. The
database will have been started.
Steps to do:
3. A new form opens. Enter the number of new active nodes and click "Apply".
After startup, you should explicitly issue the command "REORGANIZE DATABASE" in the database. This ensures that the data is redistributed in a balanced way over all database nodes.
Shrink database
Precondition(s):
Steps to do:
3. A new form opens. Enter the target volume size and click "Apply".
Steps to do:
Postcondition(s):
The browser will download a ZIP file containing the database statistics of the last month, which can be sent to
EXASOL to provide useful usage graphs over its web portal.
Steps to do:
Start EXAStorage
Precondition(s):
Steps to do:
Steps to do:
The number of volume nodes must be a multiple of the "Master nodes" property. In case of an archive volume, the size will be increased as needed to match 4 GB boundaries per node and disk.
The volume has been created and no database backup to this volume is in progress.
Postcondition(s):
The volume will be enlarged by at least the number of GiB entered in the form.
Steps to do:
Only enlarge an archive volume when no write operation (database backup) is currently being made to it. Thus, check the backup state for all databases on the "EXASolution form" and, if no backup is running, enlarge the volume afterwards.
Steps to do:
In contrast to data/archive volumes, remote archive volumes are named r0000/r0001/... (instead of v0000/v0001/...). Restore processes from remote archive volumes must be made in blocking mode; the non-blocking and virtual restore modes are not available. Furthermore, EXAoperation will not delete expired backups from remote volumes (they are under other control). To simplify automatic backup deletion processes on the server side, an expire file is created. It is placed as {database name}/id_{x}/level_{y}/node_0/expire_{expiration timestamp}, where the expiration timestamp has the format "%Y%m%d%H%M".
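The expire-file path can be constructed as follows (a small sketch; the path layout and timestamp format are quoted from the text above, the function name is hypothetical):

```python
from datetime import datetime

def expire_file_path(database, backup_id, level, expires):
    """Build the expire-file path
    {database}/id_{x}/level_{y}/node_0/expire_{%Y%m%d%H%M}."""
    stamp = expires.strftime("%Y%m%d%H%M")
    return f"{database}/id_{backup_id}/level_{level}/node_0/expire_{stamp}"

print(expire_file_path("testdb", 0, 0, datetime(2017, 3, 1, 12, 30)))
```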
Remove a volume
Precondition(s):
Steps to do:
1. Select volume(s).
All formerly unused Storage disks of the selected node(s) will have been configured to be usable by the Storage
service. The selected node(s) must be online or suspended.
Steps to do:
1. Select node(s).
The Storage service will have been restarted on the selected node(s).
Steps to do:
1. Select node(s).
Steps to do:
Steps to do:
Steps to do:
Suspend/Resume node
Postcondition(s):
Steps to do:
The service will be reachable soon afterwards on all database nodes in case of using HTTP and/or HTTPS.
Steps to do:
Postcondition(s):
All nodes of the node list file will be shown in the "EXACluster Nodes Information" form.
Steps to do:
Add a node
Precondition(s):
A node with the provided number and/or same private/public MAC address(es) does not exist.
Postcondition(s):
The newly configured node is shown in the "EXACluster Nodes Information" form and may be installed/booted
after switching power on.
Steps to do:
2. A new form opens. Configure the node and click "Add" once again.
If specifying a private and/or public failsafety network interface, the network interfaces will be bonded in active-backup mode. Thus, these interfaces may be connected to two different switches and will always communicate over one active link. If specifying no RAID type on a Storage device, each disk device will contribute the size of the disk. Thus, a size of 50 GB on disk devices /dev/sda and /dev/sdb will result in a total size of 100 GB of usable Storage space.
Delete a node
Postcondition(s):
The deleted node will not show up again in the "EXACluster Nodes Information" form and is not installed/booted
after power on/reboot.
Steps to do:
1. Select node(s).
Copy node
Precondition(s):
A node with the provided number and/or same private/public MAC address(es) does not exist.
Postcondition(s):
The newly configured node is shown in the "EXACluster Nodes Information" form and may be installed/booted
after switching power on.
Steps to do:
2. A new form opens. Configure the node and click "Add" once again.
If specifying a private and/or public failsafety network interface, the network interfaces will be bonded in active-backup mode. Thus, these interfaces may be connected to two different switches and will always communicate over one active link. If specifying no RAID type on a Storage device, each disk device will contribute the size of the disk. Thus, a size of 50 GB on disk devices /dev/sda and /dev/sdb will result in a total size of 100 GB of usable Storage space.
Postcondition(s):
Steps to do:
1. Select node.
Postcondition(s):
The front panel identify light of the selected node(s) will light up for some time.
Steps to do:
1. Select node(s).
Start a node
Precondition(s):
Postcondition(s):
A "power on" command will be sent to the LOM card of the selected node(s).
Steps to do:
1. Select node(s).
Reboot a node
Precondition(s):
The node has been started before and can be reached via SSH.
Postcondition(s):
Steps to do:
1. Select node(s).
Stop a node
Precondition(s):
The node has been started before and can be reached via SSH.
Postcondition(s):
Steps to do:
1. Select node(s).
Reset a node
Precondition(s):
Postcondition(s):
A "power reset" command will be sent to the LOM card of the selected node(s).
Steps to do:
1. Select node(s).
Postcondition(s):
A "power off" command will be sent to the LOM card of the selected node(s).
Steps to do:
1. Select node(s).
Install a node
Postcondition(s):
The selected node(s) will be installed during the next boot process. This includes deleting all former data.
Steps to do:
1. Select node(s).
Activate a node
Precondition(s):
Postcondition(s):
The selected node(s) will not be installed during the next boot process. Thus, all database data will remain.
Steps to do:
1. Select node(s).
Postcondition(s):
The selected node(s) will do a filesystem check during the next boot process.
Steps to do:
1. Select node(s).
Postcondition(s):
The selected node(s) will be configured with the default disk layout after the next installation.
Steps to do:
1. Select node(s).
The cluster node is online and can be used for starting databases. Logs of this node will be shown in monitoring
services.
Steps to do:
1. Select node(s).
Cluster services are started automatically on each node as part of the startup procedure. This action is only necessary
after having stopped the cluster services of this node.
The cluster node is offline and cannot be used for databases. Logs of this node will not be shown in monitoring
services anymore.
Steps to do:
1. Select node(s).
This operation can be useful in case of a defective node: the node can then be analyzed without its logs appearing in the monitoring services, as they would during reboots.
Delete backup
Postcondition(s):
The backup and appropriate backup files will have been deleted.
Steps to do:
Restore a backup
Precondition(s):
Postcondition(s):
A restore process for the selected system will have been started.
Steps to do:
Postcondition(s):
A restore process for the selected system will have been started. A backup.ini.out file will have been generated in each timestamp directory; it contains error messages in case of a failure.
Steps to do:
Virtual access restore processes are not possible from remote archive volumes.
Postcondition(s):
The newly configured user is shown in the "EXACluster Users Information" form and may log in with {cluster prefix}.{username} and the appropriate password.
Steps to do:
2. A new form opens. Configure the user and click "Add" once again.
Delete a user
Postcondition(s):
The user is not shown anymore in the "EXACluster Users Information" and not able to login.
Steps to do:
1. Select user.
Steps to do:
1. Select user.
Steps to do:
3. Wait for EXAoperation to issue a "Please shutdown databases and nodes and restart license server" message.
4. Shutdown databases, Storage and cluster nodes (including additional license servers).
Steps to do:
3. Restart EXAoperation.
Steps to do:
2. Choose update.
• *Name - name of the exported module. For Java, this defines the name by which the module is referenced in
the script with the "%jar NAME.jar" command. Note: the extension ".jar" will be added automatically by
EXASolution, so you do not have to specify it. For Python and R, the name should be defined according to
the documentation.
• *URL - HTTP or FTP link to the .jar (Java) or tar.gz (Python, R) file. The module will be copied from this
location and installed on each configured EXASolution node.
EXAoperation deletes only the settings for the selected module. If a module was already installed, it remains on
the nodes and can still be used.
EXAoperation installs the selected module on all nodes. When installing Python or R modules, EXAoperation
checks all existing dependencies. The installation logs for all nodes can be downloaded by pressing 'Show Install-
ation Logs'.
EXAoperation uninstalls and deletes all installed modules from all nodes. However, the settings are not deleted
and the modules can be reinstalled.
A new JDBC driver has been uploaded and can be used by databases.
Steps to do:
You may upload more than one file for a JDBC driver. Just repeat steps 4-6 as often as required.
All driver files for the selected JDBC driver will be removed.
Steps to do:
1. Select driver.
Steps to do:
1. Select driver.
The selected driver will have been removed and is not usable anymore by any database.
Steps to do:
1. Select driver.
3. Optionally, estimate the size of uncompressed debug information selected. This will be the maximum file
size that you have to download.
Please bear in mind that if a user is granted access to this form, that user will be able to download database
logs regardless of whether they have permission on the database itself.
Steps to do:
2. Select lowest priority that logservice should show, EXAClusterOS services and EXASolution systems that
should be shown.
Currently, a user may choose between five different EXAClusterOS services:
1. EXAoperation - This service logs general information about the cluster, e.g. boot processes of client nodes.
2. DWAd - The DWAd service logs general information about EXASolution systems, like database startup/shutdown.
3. Lockd - This base service is necessary for the DWAd.
4. Load - Each node in the cluster checks its load every minute. If it is above a defined limit, a warning/error message will be logged. Furthermore, every client node will log its load as an information message each minute.
5. Storage - Yet unused.
The message priorities have the following meaning:
A. Information - State confirmation.
B. Notice - Some state of the system changed.
C. Warning - Something unexpected happened, but this should not affect usability of the system.
D. Error - An error occurred that needs intervention from an administrator.
Delete logservice
Postcondition(s):
The logservice will not show up again in the "EXACluster Logging Information"
Steps to do:
1. Select logservice.
1. Select logservice.
1. Select logservice.
Steps to do:
1. Select logservice.
3. A new form opens. Change logservice properties appropriately and click "Apply".
Steps to do:
1. Click "Edit".
An ntpdate command will have been issued that synchronizes to the specified NTP server(s). This is useful in
cases where a license server's clock has drifted so far that NTPd will not adjust the system time and an explicit
time synchronization is necessary.
Steps to do:
Client nodes will take over the new configuration after a reboot/startup and the specified network/host will have
been made reachable via the provided gateway.
Steps to do:
Client nodes will take over the new configuration after a reboot/startup.
Steps to do:
1. Select route(s).
Change default gateway, network address, NTP server and/or time zone
Postcondition(s):
Client nodes will take over the new configuration after a reboot/startup.
Steps to do:
1. Click "Properties".
The new private network may be selected for database use. Network interfaces for this private network may be
defined for any node and can be used after the next boot.
Steps to do:
The selected private network must not be in use by any database or node.
Steps to do:
When deleting private networks and appropriate node interfaces, these node interfaces are accessible until the next
reboot, if those nodes are online. Thus, any started database using all network interfaces will continue to work as
before.
The new public network may be selected for any node and can be used after the next reboot.
Steps to do:
The new IPMI card group may be used for any existing or new node IPMI card.
Steps to do:
The selected IPMI card group must not be in use by any node.
Steps to do:
Steps to do:
Steps to do:
Steps to do:
EXAoperation will try to move to nodes with higher priority in case it fails on the current main node.
Steps to do:
3. Click "Apply".
Upload license
Postcondition(s):
The new license is shown, and the sum of memory sizes over all running databases may be less than or equal to the
allowed database memory.
Steps to do:
Number: Number of the node in the cluster. Each node number must be equal to or above 10. This parameter can only be
set when adding a new node.
Console Redirection: Enable/disable redirection of kernel messages to a TTY instead of the monitor.
Spool Disk: Disk to use for spool data (data used for loader processes).
SrvMgmt Group: Group that the Server Management Card of this node belongs to (if any).
Wipe Disks: Enable/disable wipe of disks of node on boot. Wiping may be a very time-consuming process.
Force Filesystem Check: Force filesystem check on next boot of this node.
Use 4 KiB Sectors for Disks: Use 4 KiB alignment for hard disks. This is required for hard disks without 512-byte
sector size emulation and improves I/O performance on regular disks.
Enable Virtualization: Enable usage of virtualization features on this node for startup of virtual machines.
Label: Label of node, e.g. an ID string that identifies a node in a data center.
Hugepages (GiB): Amount of hugepages in GiB to use for databases on this node. This is recommended for nodes
with large amounts of RAM (> 512 GiB) to save process memory and must be smaller than the amount of DB
RAM on this node. See "Hugepages" chapter in manual for details.
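The hugepages constraints above can be expressed as a small validation sketch (`validate_hugepages` is a hypothetical helper, not part of EXAoperation):

```python
def validate_hugepages(hugepages_gib, node_ram_gib, db_ram_gib):
    # The hugepages amount must be smaller than the DB RAM on this node;
    # hugepages are only recommended for nodes with more than 512 GiB RAM.
    if hugepages_gib >= db_ram_gib:
        raise ValueError("hugepages must be smaller than DB RAM on this node")
    return node_ram_gib > 512  # True if hugepages are recommended here

print(validate_hugepages(64, 768, 512))  # True
```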
9.13. EXAoperation Add/Edit Forms
Number: Number of the node in the cluster. Each node number must be equal to or above 10. This parameter can only be
set when adding a new node.
Console Redirection: Enable/disable redirection of kernel messages to a TTY instead of the monitor.
Spool Disk: Disk to use for spool data (data used for loader processes).
SrvMgmt Group: Group that the Server Management Card of this node belongs to (if any).
Wipe Disks: Enable/disable wipe of disks of node on boot. Wiping may be a very time-consuming process.
Force Filesystem Check: Force filesystem check on next boot of this node.
Use 4 KiB Sectors for Disks: Use 4 KiB alignment for hard disks. This is required for hard disks without 512-byte
sector size emulation and improves I/O performance on regular disks.
Enable Virtualization: Enable usage of virtualization features on this node for startup of virtual machines.
Label: Label of node, e.g. an ID string that identifies a node in a data center.
Hugepages (GiB): Amount of hugepages in GiB to use for databases on this node. This is recommended for nodes
with large amounts of RAM (> 512 GiB) to save process memory and must be smaller than the amount of DB
RAM on this node. See "Hugepages" chapter in manual for details.
Remove Reserve/Failed Nodes: List of nodes that should be removed from the system. Only inactive database
nodes can be selected here.
Deactivate Database Nodes: List of nodes that should be deactivated for this system. When the database is running,
only reserve or failed nodes can be selected here.
Reactivate Database Nodes: List of deactivated nodes that should be reactivated for this system.
Storage volume restore delay: Move failed volume nodes to reserve nodes automatically after the given amount
of time; leave the value empty to disable.
Max volume size (GiB): Maximum size of the database data volume (in GiB).
Network interfaces: List of network interfaces to use for database. Leave empty to use all possible network interfaces.
LDAP Server URLs: LDAP server URL(s) to use for remote database authentication, e.g. ldap://192.168.16.10.
Multiple servers must be separated by commas.
Extra DB parameters: Further EXASolution parameters. Use with care! This can only be set by a user with role
"Master".
Database RAM (GiB): Database RAM consumption of system over all nodes (memory will be shared evenly
between nodes).
Number of online Nodes: Number of online database nodes. If specifying more nodes than this number, all further
specified nodes will be used as reserve nodes.
Disk Name: Logical disk to store data and log files to.
Storage volume restore delay: Move failed volume nodes to reserve nodes automatically after the given amount
of time; leave the value empty to disable.
Max volume size (GiB): Maximum size of the database data volume (in GiB).
Network interfaces: List of network interfaces to use for database. Leave empty to use all possible network interfaces.
LDAP Server URLs: LDAP server URL(s) to use for remote database authentication, e.g. ldap://192.168.16.10.
Multiple servers must be separated by commas.
Extra DB parameters: Further EXASolution parameters. Use with care! This can only be set by a user with role
"Master".
Database RAM (GiB): Database RAM consumption of system over all nodes (memory will be shared evenly
between nodes).
EXAClusterOS Services: Specifies all EXAClusterOS services that will log into this monitor.
EXASolution Systems: Specifies all EXASolution systems that will log into this monitor.
Remote Syslog Server: Specifies the IP address of the remote syslog service to which messages of this logservice
should be sent via TCP.
Default Time Interval: Default time interval that is used for this logservice.
Options: Options for remote volume: (1) cleanvolume (database backup processes delete expired backups from
all databases), (2) noverifypeer (do not check server certificate), (3) nocompression (write plain data), (4) forcessl
(use STARTTLS in FTP connection), (5) webdav (use WebDAV for http-URL), (6) webhdfs (for WebHDFS
URLs), (7) delegation_token (for WebHDFS with Kerberos) (8) s3/s3s (for servers providing S3 compatible API
- with or without server-side encryption).
Prefix: Prefix of the JDBC name; it must begin with "jdbc:" and end with ":", as in "jdbc:mysql:".
Prefix: Prefix of the JDBC name; it must begin with "jdbc:" and end with ":", as in "jdbc:mysql:".
IPMI Multiline Password: Multiline password for IPMI card(s). May be useful if using SSH keys.
Public IP Addresses: If set, IP addresses are only reachable in the public net and no DHCP is used.
IPMI Multiline Password: Multiline password for IPMI card(s). May be useful if using SSH keys.
Public IP Addresses: If set, IP addresses are only reachable in the public net and no DHCP is used.
Default RAID 10 Redundancy: Software RAID 10 redundancy on each node per default (if used).
Default Data Encryption: Default data encryption to use for data disks.
Default Swap Size (GiB): Default size of swap disk on each node.
Default Data Disk Size (GiB): Size of default disk reserved for EXAStorage service.
Public Network: Network and appropriate network mask for external network interfaces of client nodes. Example:
192.168.16.0/24
NTP Server 1: IP address of first NTP server to use for time synchronization.
NTP Server 2: IP address of second NTP server to use for time synchronization.
NTP Key: Key for NTP server (consisting of Key ID and Key [space separated])
Backup Network Bandwidth per Node (MiB/s): Maximum bandwidth (MiB/s) a backup job is able to transfer
backups from one node to another at once.
OS Memory per Node (GiB): Memory that must not be used by EXASolution on each node. This value should be
2 for databases consuming up to 36 GiB/node, 4 for databases up to 72 GiB/node, 8 for databases up to 144 GiB,
else 16.
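The sizing rule above can be sketched as follows (`os_memory_per_node_gib` is a hypothetical helper for illustration):

```python
def os_memory_per_node_gib(db_ram_per_node_gib):
    # Sizing rule from the form description: 2 GiB for databases consuming
    # up to 36 GiB/node, 4 up to 72 GiB/node, 8 up to 144 GiB/node, else 16.
    if db_ram_per_node_gib <= 36:
        return 2
    if db_ram_per_node_gib <= 72:
        return 4
    if db_ram_per_node_gib <= 144:
        return 8
    return 16

print(os_memory_per_node_gib(100))  # 8
```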
9.14. XML-RPC interface
Certificate of Remote Syslog Server(s): Text containing certificate of remote syslog server(s).
Error Level for Disk Usage (in %): Level upon which errors will be issued about disk usage.
Warning Level for Storage Usage (in %): Level upon which warnings will be issued about storage usage.
Error Level for Storage Usage (in %): Level upon which errors will be issued about storage usage.
Warning Level for Swap Usage (in %): Level upon which warnings will be issued about swap usage.
Error Level for Swap Usage (in %): Level upon which errors will be issued about swap usage.
Warning Level for Load: Level upon which warnings will be issued about load.
Error Level for Load: Level upon which errors will be issued about load.
Coredump Deletion Time (in days): Number of days after which coredumps will be deleted.
logEntries()
Parameter(s):
1. start (optional) of type (year, month, day, hour, minute, second, 0): Start time for log entries.
2. halt (optional) of type (year, month, day, hour, minute, second, 0): Stop time for log entries.
Result type:
Precondition(s):
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/logservice1');
my $result = $server->call('logEntries');
my $result = $server->call('logEntries', ['2009', '10', '2', '0', '0', '0', '0'], ['2009', '10', '2', '17', '0', '0', '0']);
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/logservice1")
s.logEntries()
s.logEntries([2009, 01, 01, 0, 0, 0, 0], [2009, 01, 01, 12, 0, 0, 0])
>>> pprint.pprint(s.logEntries([2010, 9, 29, 10, 59, 47, 0], [2010, 9, 29, 11, 00, 47, 0]))
[[2010, 9, 29, 10, 59, 47, 0],
[2010, 9, 29, 11, 0, 47, 0],
[{'message': 'n0011.c0001.exacluster.local 0.54 0.17 0.11',
'node': '1',
'priority': 'Information',
'strtime': '2010-09-29 11:00:01.780071+02:00',
'system': 'load',
'timestamp': '2010-09-29 11:00:01.780071'},
{'message': 'n0013.c0001.exacluster.local 0.08 0.09 0.12',
'node': '4',
'priority': 'Information',
'strtime': '2010-09-29 11:00:01.700149+02:00',
'system': 'load',
'timestamp': '2010-09-29 11:00:01.700149'},
{'message': 'n0014.c0001.exacluster.local 0.00 0.00 0.00',
'node': '3',
'priority': 'Information',
'strtime': '2010-09-29 11:00:01.678909+02:00',
'system': 'load',
'timestamp': '2010-09-29 11:00:01.678909'},
{'message': 'n0012.c0001.exacluster.local 0.23 0.26 0.24',
'node': '2',
'priority': 'Information',
'strtime': '2010-09-29 11:00:01.571414+02:00',
'system': 'load',
'timestamp': '2010-09-29 11:00:01.571414'}]]
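The examples above use the Python 2 xmlrpclib module; on Python 3 the same interface is available as xmlrpc.client. The sketch below also shows how entries of the shape shown above might be filtered by priority (`entries_at_least` is a hypothetical helper, not part of the interface):

```python
import xmlrpc.client

# Python 3 equivalent of the xmlrpclib proxy; no connection is made
# until a method is actually invoked.
s = xmlrpc.client.ServerProxy(
    "https://user:password@license-server/cluster1/logservice1")

def entries_at_least(entries, priority):
    # Keep only entries at or above the given priority, using the
    # ordering Information < Notice < Warning < Error described above.
    order = ["Information", "Notice", "Warning", "Error"]
    threshold = order.index(priority)
    return [e for e in entries if order.index(e["priority"]) >= threshold]

sample = [{"priority": "Information", "message": "..."},
          {"priority": "Error", "message": "..."}]
print(entries_at_least(sample, "Warning"))  # only the Error entry
```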
logEntriesTagged()
Parameter(s):
Result type:
Precondition(s):
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/logservice1');
my $result = $server->call('logEntriesTagged', 'my source id');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/logservice1")
s.logEntriesTagged(3)
s.logEntriesTagged('my source id')
getDatabaseState()
Result type:
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/db_exa_db1');
my $result = $server->call('getDatabaseState');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/db_exa_db1")
s.getDatabaseState()
>>> s.getDatabaseState()
'running'
>>> s.getDatabaseState()
'shutdown'
getDatabaseConnectionState()
Result type:
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/db_exa_db1');
my $result = $server->call('getDatabaseConnectionState');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/db_exa_db1")
s.getDatabaseConnectionState()
>>> s.getDatabaseConnectionState()
'No'
>>> s.getDatabaseConnectionState()
'Yes'
getDatabaseConnectionString()
Result type:
connection string
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/db_exa_db1');
my $result = $server->call('getDatabaseConnectionString');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/db_exa_db1")
s.getDatabaseConnectionString()
>>> s.getDatabaseConnectionString()
'10.50.1.11..14:8563'
getDatabaseNodes()
Result type:
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/db_exa_db1');
my $result = $server->call('getDatabaseNodes');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/db_exa_db1")
s.getDatabaseNodes()
>>> s.getDatabaseNodes()
{'active': ['n0011', 'n0012', 'n0013'], 'failed': [], 'reserve': ['n0014']}
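The returned dictionary can be summarized per node state; a minimal sketch (`node_summary` is a hypothetical helper):

```python
def node_summary(nodes):
    # Count nodes per state in the dictionary returned by getDatabaseNodes().
    return {state: len(members) for state, members in nodes.items()}

result = {'active': ['n0011', 'n0012', 'n0013'], 'failed': [], 'reserve': ['n0014']}
print(node_summary(result))  # {'active': 3, 'failed': 0, 'reserve': 1}
```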
getDatabaseOperation()
Result type:
string in ('None', 'Create', 'Remove', 'Startup', 'Shutdown', 'Cleanup', 'Backup', 'Restore', 'Failed')
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/db_exa_db1');
my $result = $server->call('getDatabaseOperation');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/db_exa_db1")
s.getDatabaseOperation()
>>> s.getDatabaseOperation()
'None'
>>> s.getDatabaseOperation()
'Backup'
startDatabase()
Result type:
This function does not return a result. In case something goes wrong, an exception will be raised.
Precondition(s):
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/db_exa_db1');
$server->call('startDatabase');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/db_exa_db1")
s.startDatabase()
stopDatabase()
Result type:
This function does not return a result. In case something goes wrong, an exception will be raised.
Precondition(s):
Postcondition(s):
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/db_exa_db1');
$server->call('stopDatabase');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/db_exa_db1")
s.stopDatabase()
startStorageBackup()
Parameter(s):
1. volume of type string: Volume to backup data into (e.g. 'v0000' or 'r0010')
Precondition(s):
The database must be running. The archive volume must be configured and (if level is larger than 0) a base backup
must exist in the archive.
Postcondition(s):
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/db_exa_db1');
$server->call('startStorageBackup', 'v0000', 2, '3d');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/db_exa_db1")
s.startStorageBackup('v0000', 2, '3d')
startBackup()
Parameter(s):
Result type:
This function does not return a result. In case something goes wrong, an exception will be raised.
Precondition(s):
Postcondition(s):
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/backup1');
$server->call('startBackup', 'exa_db1');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/backup1")
s.startBackup('exa_db1')
submitFirewallConfiguration()
Result type:
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1")
s.submitFirewallConfiguration('n11', file('fw.conf').read())
getFirewallConfiguration()
Result type:
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1")
s.getFirewallConfiguration('n11')
startupNode()
Result type:
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1")
s.startupNode("n0011")
shutdownNode()
Result type:
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1")
s.shutdownNode("n0011")
getHardwareInformation()
Result type:
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1")
s.getHardwareInformation("n11", 1)
getNodeList()
Result type:
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1');
my $result = $server->call('getNodeList');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1")
s.getNodeList()
>>> s.getNodeList()
['n0011', 'n0012', 'n0013', 'n0014']
getEXAoperationMaster()
Result type:
This function returns a string with the name of the node currently serving EXAoperation.
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1');
my $result = $server->call('getEXAoperationMaster');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1")
s.getEXAoperationMaster()
>>> s.getEXAoperationMaster()
'n0010'
getEXASuiteVersion()
Result type:
This function returns the version of the EXASuite currently installed.
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1');
my $result = $server->call('getEXASuiteVersion');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1")
s.getEXASuiteVersion()
>>> s.getEXASuiteVersion()
'6.0.0'
getArchiveFilesystems()
Result type:
This function returns a list of archive volumes that can be used by the calling user.
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/storage")
s.getArchiveFilesystems()
>>> s.getArchiveFilesystems()
{'v0002': ['volume', 2, ['read', 'write']]}
getVolumeList()
Result type:
This function returns a list of volumes that can be read by the calling user.
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/storage")
s.getVolumeList()
>>> s.getVolumeList()
{'v0002': 'Archive', 'v0000': 'Data', 'v0001': 'Temporary Data', 'r0000': 'Remote Archive'}
getVolumeInfo()
Result type:
This function returns a dictionary with information about the specified volume.
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/storage")
s.getVolumeInfo(1)
s.getVolumeInfo('v0001')
>>> pprint.pprint(s.getVolumeInfo(0))
{'allowed users': ['admin'],
'disk': 'd03_storage',
'labels': ['exa_db1_persistent'],
'name': 'v0000',
'readonly users': [],
'redundancy': 2,
'segments': [['n0011', 'n0012', 'n0013'], ['n0012', 'n0013', 'n0011']],
'size': 8000,
'status': 'ONLINE',
'type': 'Data'}
>>> pprint.pprint(s.getVolumeInfo(1))
{'disk': 'd03_storage',
'labels': ['exa_db1_temporary'],
'name': 'v0001',
'redundancy': 1,
'segments': [['n0011', 'n0012', 'n0013']],
'size': 8000,
'status': 'ONLINE',
'type': 'Temporary Data'}
>>> s.getVolumeInfo(10000)
{'readonly users': [], 'type': 'Remote Archive', 'allowed users': ['admin'], 'name': 'r0000'}
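Note that a volume with redundancy 2 stores each segment twice, so its physical footprint exceeds its configured size. A sketch of this arithmetic, assuming space is accounted simply as size times redundancy (`raw_footprint` is a hypothetical helper):

```python
def raw_footprint(volume_info):
    # With redundancy 2 every segment is stored twice, so the physical
    # space consumed is size * redundancy.
    return volume_info['size'] * volume_info.get('redundancy', 1)

info = {'name': 'v0000', 'size': 8000, 'redundancy': 2}
print(raw_footprint(info))  # 16000
```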
getDatabaseList()
Result type:
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1');
my $result = $server->call('getDatabaseList');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1")
s.getDatabaseList()
>>> s.getDatabaseList()
['exa_db1', 'testdb']
getDatabaseInfo()
Result type:
This function returns a dictionary of key-value pairs with information about the specified database.
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/db_testdb');
my $result = $server->call('getDatabaseInfo');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/db_testdb")
s.getDatabaseInfo()
>>> pprint.pprint(s.getDatabaseInfo())
{'connectible': 'No',
'connection string': '192.168.16.11..12:8563',
'name': 'exa_db1',
'nodes': {'active': ['n0011', 'n0012'], 'failed': [], 'reserve': []},
'operation': 'None',
'quota': 50,
'state': 'setup',
'usage persistent': 100,
'usage temporary': 1}
getBackupList()
Result type:
This function returns a list of backup IDs for the specified database.
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/db_testdb")
s.getBackupList()
getBackupInfo()
Result type:
This function returns a dictionary with key-value pairs describing the specified backup.
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/db_testdb');
my $result = $server->call('getBackupInfo', '2010-11-02 14-35 00');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/db_testdb")
s.getBackupInfo('2010-11-02 14-35 00')
getDatabaseStatistics()
Result type:
This function returns a base64 encoded zip file containing database statistics. If no start and stop dates are
provided, statistics for the last month are retrieved. This can be used by customer service to provide useful
usage graphs etc.
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/db_testdb');
my $result = $server->call('getDatabaseStatistics', 'user', 'pass');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/db_testdb")
# Get statistics for last month (including today)
s.getDatabaseStatistics('dbuser', 'pass')
# Get statistics for January until May (including the 31st of May)
s.getDatabaseStatistics('dbuser', 'pass', '2013-01-01', '2013-05-31')
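The returned payload can be decoded and unpacked with the Python standard library; a sketch (`save_statistics` is a hypothetical helper, not part of the interface):

```python
import base64
import io
import zipfile

def save_statistics(encoded, path):
    # getDatabaseStatistics() returns a base64 encoded zip file;
    # decode it, write the archive to disk, and return the list of
    # contained file names as a quick sanity check.
    data = base64.b64decode(encoded)
    with open(path, "wb") as f:
        f.write(data)
    return zipfile.ZipFile(io.BytesIO(data)).namelist()
```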
startEXAStorage()
Result type:
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/storage');
my $result = $server->call('startEXAStorage');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/storage")
s.startEXAStorage()
>>> s.startEXAStorage()
'OK'
stopEXAStorage()
Result type:
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/storage');
my $result = $server->call('stopEXAStorage');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/storage")
s.stopEXAStorage()
>>> s.stopEXAStorage()
'OK'
getServiceState()
Result type:
This function returns a list of tuples, each tuple consisting of a service name and the appropriate service state. The
service state is described with 'OK', 'not running' or (for DWAd) 'DWAd has no quorum'.
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1');
my $result = $server->call('getServiceState');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1")
s.getServiceState()
>>> s.getServiceState()
[['Loggingd', 'OK'], ['Lockd', 'OK'], ['Storaged', 'OK'], ['DWAd', 'OK']]
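The returned pairs lend themselves to a simple health check; a sketch (`failed_services` is a hypothetical helper):

```python
def failed_services(states):
    # getServiceState() returns [name, state] pairs; any state other
    # than 'OK' (e.g. 'not running', 'DWAd has no quorum') needs attention.
    return [name for name, state in states if state != 'OK']

sample = [['Loggingd', 'OK'], ['Lockd', 'OK'],
          ['Storaged', 'not running'], ['DWAd', 'DWAd has no quorum']]
print(failed_services(sample))  # ['Storaged', 'DWAd']
```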
getIPMISensorStatus()
Result type:
This function returns a list of IPMI key-value pairs with a value classification depending on the underlying IPMI
card.
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/n0011');
my $result = $server->call('getIPMISensorStatus');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/n0011")
s.getIPMISensorStatus()
>>> pprint.pprint(s.getIPMISensorStatus())
[['CPU Temp 1', '47 degrees C', 'ok'],
['CPU Temp 2', '47 degrees C', 'ok'],
['CPU Temp 3', 'no reading', 'ns'],
['CPU Temp 4', 'no reading', 'ns'],
['Sys Temp', '29 degrees C', 'ok'],
['CPU1 Vcore', '1.14 Volts', 'ok'],
['CPU2 Vcore', '1.14 Volts', 'ok'],
['3.3V', '3.33 Volts', 'ok'],
['5V', '4.99 Volts', 'ok'],
['12V', '12 Volts', 'ok'],
['-12V', '-12.10 Volts', 'ok'],
['1.5V', '1.49 Volts', 'ok'],
['5VSB', '4.94 Volts', 'ok'],
['VBAT', '3.23 Volts', 'ok'],
['Fan1', '0 RPM', 'nr'],
['Fan2', '6300 RPM', 'ok'],
['Fan3', '6400 RPM', 'ok'],
['Fan4', '6100 RPM', 'ok'],
['Fan5', '0 RPM', 'nr'],
['Fan6', '0 RPM', 'nr'],
['Fan7/CPU1', '0 RPM', 'nr'],
['Fan8/CPU2', '0 RPM', 'nr'],
['Intrusion', '0 unspecified', 'nc'],
['Power Supply', '0 unspecified', 'ok'],
['CPU0 Internal E', '0 unspecified', 'ok'],
['CPU1 Internal E', '0 unspecified', 'ok'],
['CPU Overheat', '0 unspecified', 'ok'],
['Thermal Trip0', '0 unspecified', 'ok'],
['Thermal Trip1', '0 unspecified', 'ok']]
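The sensor list can be filtered for readings that need attention; a sketch (`sensor_alerts` is a hypothetical helper; treating 'ns' readings as harmless is an assumption based on the "no reading" entries above):

```python
def sensor_alerts(sensors, ignore=('ok', 'ns')):
    # Each entry is [name, reading, status]; 'ns' (no sensor reading)
    # entries are usually harmless, so report only the remaining statuses.
    return [(name, reading, status)
            for name, reading, status in sensors
            if status not in ignore]

sample = [['CPU Temp 1', '47 degrees C', 'ok'],
          ['Fan1', '0 RPM', 'nr'],
          ['Intrusion', '0 unspecified', 'nc']]
print(sensor_alerts(sample))  # the 'nr' and 'nc' entries
```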
getNodeState()
Result type:
This function returns a dictionary (Python)/hash (Perl) that describes the state of a node. The dictionary has 'status',
'power' and 'operation' fields. The function is defined for both Cluster and Node objects. When called on a cluster, it
takes a node name as an obligatory parameter.
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/n0011');
my $result = $server->call('getNodeState');
Python example:
import xmlrpclib
# Node level
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/n0011")
s.getNodeState()
# Cluster level
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1")
for x in s.getNodeList():
print x, s.getNodeState(x)
>>> s.getNodeState()
{'status': 'Running', 'operation': 'Active', 'power': 'Power On'}
Output details:
The value of 'status' can vary between 'Running', 'Suspended', 'Offline', 'Installing', 'Installed', 'Unknown', 'Shredding',
'Shredded' and 'Booting'. The value of 'power' can be 'Power On', 'Power Off' and 'Unknown' (the latter when the IPMI
service is not present). The value of 'operation' can be 'Active', 'Active/Force fsck', 'Force no fsck', 'To install' and
'TO WIPE'.
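As a minimal sketch of how the output details above might be evaluated (the rule shown, treating anything other than a running, powered-on node as conspicuous, is an assumption and may be too strict during planned maintenance):

```python
# Sketch: classify a getNodeState() result dictionary.
def node_needs_attention(state):
    """True if the node is not both running and powered on."""
    return state['status'] != 'Running' or state['power'] != 'Power On'

ok = {'status': 'Running', 'operation': 'Active', 'power': 'Power On'}
down = {'status': 'Offline', 'operation': 'Active', 'power': 'Unknown'}
print(node_needs_attention(ok), node_needs_attention(down))  # False True
```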
getDiskStates()
Result type:
This function returns a list of dictionaries (Python)/hashes (Perl), whereas each entry describes one disk of the
specified node.
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1/n0011');
my $result = $server->call('getDiskStates');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1/n0011")
s.getDiskStates()
>>> pprint.pprint(s.getDiskStates())
[{'devices': 'Default',
'encr': 'disk-encr-aes256',
'mount_count': '2/28',
'name': 'd00_os',
'next_fsck': 'Mon Mar 28 08:51:27 2011',
'raid': 'disk-raid-none',
'size': '50',
'state': 'None',
'type': 'disk-type-os',
'free': '41.8'},
{'devices': 'Default',
'encr': 'disk-encr-none',
'mount_count': '-',
'name': 'd01_swap',
'next_fsck': '-',
'raid': 'disk-raid-none',
'size': '4',
'state': 'None',
'type': 'disk-type-swap',
'free': '3.8'},
{'devices': 'Default',
'encr': 'disk-encr-none',
'mount_count': '2/27',
'name': 'd02_data',
'next_fsck': 'Mon Mar 28 08:52:04 2011',
'raid': 'disk-raid-none',
'size': '47',
'state': 'None',
'type': 'disk-type-data',
'free': '45.2'}]
Output details:
The state of a disk can vary between 'Online', 'Offline', and 'Degraded' for software RAIDs. Hardware RAID systems
may only expose their state through proprietary interfaces; thus, such disks will show up with state 'None'.
The size of a disk is always shown in GiB (as is the free space) or declared as 'Rest' (the partition grows with the disk
size). The used devices for a disk may be set to 'Default' (all devices of a node are used) or to a list of devices, e.g.
"['/dev/sda', '/dev/sdb']".
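Since size and free space are reported as strings (in GiB), a consumer has to convert them. The following sketch computes used space per disk and skips 'Rest'-sized partitions; the sample entries are shortened from the output above:

```python
# Sketch: compute used GiB per disk from getDiskStates()-style entries.
def disk_usage(disks):
    usage = []
    for d in disks:
        if d['size'] == 'Rest':        # partition grows with the disk
            usage.append((d['name'], None))
        else:
            usage.append((d['name'],
                          round(float(d['size']) - float(d['free']), 1)))
    return usage

sample = [{'name': 'd00_os', 'size': '50', 'free': '41.8'},
          {'name': 'd02_data', 'size': '47', 'free': '45.2'}]
print(disk_usage(sample))  # [('d00_os', 8.2), ('d02_data', 1.8)]
```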
showPluginList()
Result type:
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1');
my $funcs = $server->call('showPluginList');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1")
funcs = s.showPluginList()
>>> s.showPluginList()
['RAID.tw_cli-10.1', 'RAID.arcconf-6.30']
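Plugin identifiers follow a 'namespace.name-version' shape in this example output. Assuming that shape holds, they can be split like this:

```python
# Sketch: split a showPluginList() identifier into its parts.
# Assumes the 'NS.name-version' shape seen above; real IDs may deviate.
def parse_plugin_id(plugin_id):
    namespace, rest = plugin_id.split('.', 1)
    name, _, version = rest.rpartition('-')
    return namespace, name, version

print(parse_plugin_id('RAID.tw_cli-10.1'))  # ('RAID', 'tw_cli', '10.1')
```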
callPlugin()
Parameter(s):
Result type:
Precondition(s):
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1');
my $result = $server->call('callPlugin', 'RAID.tool', 'n11', 'SHOW_LOGS');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1")
ret, output = s.callPlugin('RAID.tool', 'n11', 'SHOW_LOGS')
----------------------------------------------------------------------
Logical device information
----------------------------------------------------------------------
Logical device number 0
Logical device name : ADAP1
RAID level : 0
Status of logical device : Optimal
Size : 141590 MB
Stripe-unit size : 256 KB
Read-cache mode : Enabled
MaxIQ preferred cache setting : Enabled
MaxIQ cache setting : Disabled
Write-cache mode : Enabled (write-back)
Write-cache setting : Enabled (write-back)
Partitioned : No
Protected by Hot-Spare : No
Bootable : Yes
Failed stripes : No
Power settings : Disabled
--------------------------------------------------------
Logical device segment information
--------------------------------------------------------
Segment 0 : Present (0,4) WD-WMAKE2258147
Segment 1 : Present (0,5) WD-WMAKE2257664
----------------------------------------------------------------------
Physical Device information
----------------------------------------------------------------------
Device #0
Device is a Hard drive
State : Online
Supported : Yes
Transfer Speed : SATA 1.5 Gb/s
Reported Channel,Device(T:L) : 0,4(4:0)
Reported Location : Connector 1, Device 0
Vendor : WDC
Model : WD740GD-00FL
Firmware : 33.08F33
Serial number : WD-WMAKE2258147
Size : 70911 MB
Write Cache : Enabled (write-back)
FRU : None
S.M.A.R.T. : No
S.M.A.R.T. warnings : 0
Power State : Full rpm
Supported Power States : Full rpm,Powered off
SSD : No
MaxIQ Cache Capable : No
MaxIQ Cache Assigned : No
NCQ status : Disabled
Device #1
Device is a Hard drive
State : Online
Supported : Yes
Transfer Speed : SATA 1.5 Gb/s
Reported Channel,Device(T:L) : 0,5(5:0)
Reported Location : Connector 1, Device 1
Vendor : WDC
Model : WD740GD-00FL
Firmware : 33.08F33
Serial number : WD-WMAKE2257664
Size : 70911 MB
Write Cache : Enabled (write-back)
FRU : None
S.M.A.R.T. : No
S.M.A.R.T. warnings : 0
Power State : Full rpm
Supported Power States : Full rpm,Powered off
SSD : No
MaxIQ Cache Capable : No
MaxIQ Cache Assigned : No
NCQ status : Disabled
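The callPlugin() output above is plain text, so any evaluation has to parse it. A sketch that extracts physical devices whose state is not 'Online' (the 'Device #N' and 'State : ...' line shapes are taken from the output above):

```python
# Sketch: find non-online physical devices in callPlugin() output.
def offline_devices(output):
    bad, current = [], None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith('Device #'):
            current = line
        elif current and line.startswith('State'):
            value = line.partition(':')[2].strip()
            if value != 'Online':
                bad.append(current)
    return bad

sample = """Device #0
Device is a Hard drive
State : Online
Device #1
State : Failed
"""
print(offline_devices(sample))  # ['Device #1']
```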
9.15. Libvirt interface for managing cluster nodes
showPluginFunctions()
Parameter(s):
Result type:
Precondition(s):
Perl example:
use Frontier::Client;
my $server = Frontier::Client->new('url' => 'https://user:password@license-server/cluster1');
my $funcs = $server->call('showPluginFunctions', 'RAID.tool');
Python example:
import xmlrpclib
s = xmlrpclib.ServerProxy("https://user:password@license-server/cluster1")
funcs = s.showPluginFunctions('RAID.tool')
>>> pprint.pprint(s.showPluginFunctions('RAID.tw_cli-10.1'))
{'SHOW': 'Show information about controllers and units. May be called with a controller/unit ID as argument, e.g. "/c0", "/c0/u0"',
'SHOW_AENS': 'Show automatic event notifications of controllers. May be called with a controller ID as argument, e.g. "/c0".',
'SHOW_ALARMS': 'Show alarms of controllers. May be called with a controller ID as argument, e.g. "/c0".',
'SHOW_DIAG': 'Show diagnostic information of controllers. May be called with a controller ID as argument, e.g. "/c0".',
'SHOW_EVENTS': 'Show events of controllers. May be called with a controller ID as argument, e.g. "/c0".',
'SHOW_REBUILD': 'Show rebuild schedules for controllers. May be called with a controller ID as argument, e.g. "/c0".',
'SHOW_SELFTEST': 'Show information about controller selftests. May be called with a controller ID as argument, e.g. "/c0".',
'SHOW_VER': 'Show API/CLI version of tw_cli.',
'SHOW_VERIFY': 'Show verify schedules. May be called with a controller ID as argument, e.g. "/c0".'}
2. Add a Server Management group with type "Libvirt", public IP addresses and a valid Libvirt user (and password).
3. For each node, use the new Server Management Group and use the IP address of the physical virtualization
host as public Server Management IP.
1. Configure your hardware device and install the necessary PKCS#11 library in the cluster. This may include
using a command line shell in the EXASuite cluster. See your HSM manual for further details.
2. Enable the cluster node(s) to use the specified key and create an encryption key in that slot. To do this easily on
the command line, we provide a tool, which can be called as pkcs11-handler -l {PKCS#11 library}
-k {keylabel} -S {slot number} -c.
3. Create a key store in EXAoperation via "Access Management" -> "Key Stores" -> "Add". Provide an identifier
for that key store, the label of the key and some attributes. An example of how to use attributes is
LIB={PKCS#11 library};SLOT={slot number}. If only one slot is provided, the SLOT parameter
may be skipped. Click "Apply" to save the new key store.
4. Select the newly created key store (via the radio button in the table) and choose "Unlock". Now, provide the
slot PIN and choose the time duration this key should be accessible.
5. Browse to "Access Management" -> "System Passwords" and choose this newly created key store ("Disk
Key Store").
As long as the used key store is unlocked, nodes may be booted. Otherwise, a boot process will fail.
2. For each disk, start the shred command. This standard UNIX tool will be used with the command line
parameters -n 7 -z: it overwrites each disk with random data (seven times) and then performs a final run,
overwriting the disk with zeroes.
1. Enlarge disk devices: This is the preferred approach for VM environments. See SOL-177 in the EXASOL
support portal for more information.
2. Add additional EXAStorage disks: This is reasonable in case option (1) is not possible due to the cluster
environment or its configuration. See SOL-453 in the EXASOL support portal for more information.
3. Reinstall nodes: This is an option in case the data of a node can be stored somewhere else during the process
of reinstallation.
9.19. Hugepages
Since EXAoperation version 5.0.0 it is possible to define the amount of so-called hugepages for cluster nodes. The
hugepage feature was introduced into machine hardware and the Linux kernel several years ago and allows
efficient management of memory, especially for systems with large amounts of physical RAM: it can shrink
the kernel data structures needed for handling process memory dramatically². We recommend defining a
reasonable amount of hugepages for cluster nodes with at least 512 GiB RAM. These hugepages can be used for "hot
database data" as shown in Figure 9.20, “DB RAM and hugepages”. As only such data can be placed there, the
amount of hugepages per node must be smaller than the amount of DB RAM used on it. Typically, 2-16
GiB of database RAM on any node will be used for other data that has to be held in memory; this amount may
be larger for databases with many concurrently open client connections. When using this feature, we recommend
defining at least 70 percent of DB RAM for hugepages, but leaving at least 64 GiB of physical node RAM
untouched.
When starting multiple databases on one node, hugepages will be shared on-demand between database instances.
When changing the hugepage setting for a cluster node, the node has to be restarted.
Here is an example calculation for one database node with 768 GiB RAM: in such a configuration we recommend
reserving at least 32 GiB of RAM for general purpose use (kernel memory, operating system processes, ...) and a
further 32 GiB for database heap and hot data on demand. This results in 768 - 32 - 32 GiB = 704 GiB of hugepages
that could be allocated.
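The sizing rule from the example can be written down as a small helper (the 32 GiB reserves are the values from the example; adapt them to your environment):

```python
# Sketch: hugepage sizing as in the example calculation above.
def hugepages_gib(node_ram_gib, os_reserve_gib=32, db_heap_gib=32):
    """GiB of hugepages that could be allocated on one node."""
    return node_ram_gib - os_reserve_gib - db_heap_gib

print(hugepages_gib(768))  # 704
```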
² Especially page table data structures benefit: these normally reference pages with a size of 4 KB, but with hugepages (whose
size is 2 MB) their size can shrink by up to a factor of 500. In a real-world scenario, this can save dozens of gigabytes.
Other common browsers as well as Internet Explorer installations in higher versions should also work as expected.
Konqueror 3.4.1 is known not to deselect checkboxes in "Unselect All" dialogs.
9.20.4. SW-RAID
In case software RAID is chosen, CentOS runs an automated cron job that checks and synchronizes all appropriate
devices once a week. You may therefore notice corresponding synchronization log messages in EXAoperation.
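The progress of such a check can also be inspected directly on a node via /proc/mdstat. A small sketch that extracts the check/resync progress; the sample content is abbreviated and the device names are illustrative (on a node one would read the file instead):

```python
# Sketch: pull the check/resync progress out of /proc/mdstat content.
import re

def resync_progress(mdstat_text):
    """Return {md_device: percent} for arrays currently checking/resyncing."""
    progress, current = {}, None
    for line in mdstat_text.splitlines():
        m = re.match(r'(md\d+)\s*:', line)
        if m:
            current = m.group(1)
        m = re.search(r'=\s*([\d.]+)%', line)
        if m and current:
            progress[current] = float(m.group(1))
    return progress

sample = """\
md0 : active raid1 sdb1[1] sda1[0]
      52395968 blocks [2/2] [UU]
      [=>...................]  check = 7.3% (3826112/52395968)
"""
print(resync_progress(sample))  # {'md0': 7.3}
```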
import xmlrpc.client
s = xmlrpc.client.ServerProxy("https://user:password@license-server/cluster1")
s.getDatabaseList()
9.20. Compatibility and known issues
2. Select the corresponding data volume on the EXAStorage page and enlarge it by X nodes.
3. On the corresponding database properties page select the action "Enlarge" and enlarge the database by the
same number of nodes. Note: the enlargement specifies the number of nodes that will be added to the database,
not the total number of nodes in it.
5. When the database is running, connect to it with EXAplus and call REORGANIZE DATABASE. This will
reorganize data layout and improve performance.
If needed, also enlarge the archive volume by the same number of nodes.
Enlarging a database may require additional database startup parameters in case it does not work properly. For
example, enlarging a database that requires too much memory results in a startup error message, and later startup
attempts fail due to database file mismatches. In this case, enter "-enlargeCluster=1" into the "Extra database
parameters" field and start the database. Shut down the database after the first successful login, remove the
"-enlargeCluster=1" parameter and start the database again.
$ModLoad imtcp
$InputTCPServerRun 514
For the UDP case, a configuration like this one has to be used:
$ModLoad imudp
$UDPServerRun 514
Chapter 10. Installation
In case of using a virtualized license server, e.g. via KVM or VirtualBox, the minimum requirements are:
• 4 GB RAM
• 2 network adapters
3. Mount the installation medium into this newly created directory (e.g. mount -o loop dvd.iso
/var/ftp/pub/EXASuite)
5. Restart xinetd (service xinetd restart) and start vsftpd (service vsftpd start)
6. Copy the necessary boot files into the tftpboot directory (e.g. cp /usr/share/syslinux/pxelinux.0
/var/lib/tftpboot/ && cp /var/ftp/pub/EXASuite/images/pxeboot/{initrd.img,vmlinuz}
/var/lib/tftpboot/)
7. Create the subdirectory pxelinux.cfg in the tftpboot base directory (mkdir
/var/lib/tftpboot/pxelinux.cfg)
10.2. Automated installation of a license server via network
8. Insert the installation URL into the kickstart file (e.g. sed 's!MAKE_URL!ftp://{IP of FTP
server}/pub/EXASuite!g' /var/ftp/pub/EXASuite/kickstart-net/install.cfg >
/var/ftp/pub/install.cfg).
9. Create a configuration file (e.g. as /var/ftp/pub/auto.cfg). This configuration could look like this (for
details of all configuration options see below):
[Network]
Private = 00:0A:0B:0C:0D:0E
Public = 00:0B:0C:0D:0E:0F
[General]
Number = 10
Installation method = immediate
Device = /dev/sda
Maintenance password = {SHA-512 hashed password}
[Cluster Network]
IP address = 10.17.0.0
Netmask = 255.255.0.0
[Public Network]
IP address = 192.168.6.2
Netmask = 255.255.255.0
Gateway = 192.168.6.1
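The configuration above uses standard INI syntax, so it can be checked before use, e.g. with Python's configparser; the sample string below mirrors a shortened version of the example configuration:

```python
# Sketch: validate an auto.cfg-style configuration before deployment.
import configparser

sample = """\
[Network]
Private = 00:0A:0B:0C:0D:0E
Public = 00:0B:0C:0D:0E:0F

[General]
Number = 10
Installation method = immediate
Device = /dev/sda

[Cluster Network]
IP address = 10.17.0.0
Netmask = 255.255.0.0
"""

cfg = configparser.ConfigParser()
cfg.read_string(sample)

# Check that the required sections are present.
for section in ('Network', 'General', 'Cluster Network'):
    assert cfg.has_section(section), section
print(cfg['General']['Installation method'])  # immediate
```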
10. Create a network boot configuration for the license server and store it into the file
/var/lib/tftpboot/pxelinux.cfg/default. This configuration could look like this:
default linux
label linux
kernel vmlinuz
append initrd=initrd.img \
biosdevname=0 \
ks=ftp://{IP of FTP server}/pub/install.cfg \
ksdevice=eth1 \
exaconf=ftp://{IP of FTP server}/pub/auto.cfg
biosdevname=0 is a required parameter, as are ksdevice= (the device to get the installation from),
ks= (the kickstart file to install from) and exaconf=.
11. Create a DHCPd configuration for the license server. It could look like this one:
deny unknown-clients;
default-lease-time 300;
allow booting;
allow bootp;
ddns-update-style none;
subnet 192.168.6.0 netmask 255.255.255.0 {
filename "pxelinux.0";
next-server 192.168.6.1;
}
host n0010 {
hardware ethernet 00:0B:0C:0D:0E:0F;
fixed-address 192.168.6.2;
}
The IP address should match the one specified in the license server installation process. Change the MAC
address to the real one the license server boots from.
13. Now you can connect the license server to the PXE boot server and start it.
1. Section General:
• Installation method (required) - One of "immediate" (install on first startup), "delayed" (install
on first startup with help of the maintenance user) and "additional" (install as an additional license server;
in this case, the parameter "root password" has to be supplied, which is a base-64 encoded string)
• Encryption (optional) defaults to "False" (no encryption); in case of choosing "True", encrypt disk
device with "Encryption password"
• Encryption password (optional) base-64 encoded string of disk password, only used in case of
choosing disk encryption
2. Section Network:
• Private (required) - MAC address of private network interface or default kernel interface name (e.g.
eth0, eth1)
• Public (required) - MAC address of public network interface or default kernel interface name (e.g. eth0,
eth1)
• Private-bonded (optional) - MAC address of private bonded network interface or default kernel
interface name (e.g. eth0, eth1)
• Public-bonded (optional) - MAC address of public bonded network interface or default kernel interface
name (e.g. eth0, eth1)
• Private MTU (optional) - MTU for private network interface; for 10 GBit networks, a value of 9000
for the whole network is recommended; do not mix different MTU sizes for one network
10.3. Installation of EXAClusterOS on a bare CentOS server system
• Public MTU (optional) - MTU for public network interface; for 10 GBit networks, a value of 9000 for
the whole network is recommended; do not mix different MTU sizes for one network
• VLAN MTU (optional) - MTU for VLAN network interfaces; for 10 GBit networks, a value of 9000 for
the whole network is recommended; do not mix different MTU sizes for one network
• IP address (required in case of using static public network configuration) - public IP address of license
server
• Netmask (required in case of using static public network configuration) - public network netmask of
license server
• Gateway (required in case of using static public network configuration) - public gateway of license
server
• Use DHCP (required in case of using public IP address via DHCP) - "True" or "False"
4. Section Private VLANs: This section defines VLAN IDs in case of using private VLANs. An entry of 1
= 11 constitutes the first private VLAN with a VLAN ID of 11.
2. Unpack /usr/opt/EXASuite-6/EXAClusterOS-6.0.6/var/clients/packages/EXAClusterOS-6.0.6_Linux-META_DEFAULT_KERNEL_x86_64.tar.gz
into the root directory. This will copy the required Linux kernel as well as its sources into the system.
Afterwards, execute /sbin/new-kernel-pkg --mkinitrd --depmod --install --make-default
META_DEFAULT_KERNEL, which will install the kernel.
After a shutdown or a crash of the Cored process, the license server may be reintegrated into the cluster by executing
/etc/init.d/cos start.
In case VLANs have been configured for client nodes, the license server should also be configured with appropriate
network interfaces. If only one private network interface is available, this should be configured to be part of all
VLANs on the switch. Furthermore, the network configuration should be changed appropriately as shown in the
following example (based on two VLANs tagged with 25 for network 27.1.0.0/16 and 53 for network
27.65.0.0/16): Create the files /etc/sysconfig/network-scripts/ifcfg-eth0.25 and
/etc/sysconfig/network-scripts/ifcfg-eth0.53. These files must contain the entry "VLAN=yes"
and the appropriate IP address of the interface configured statically. The file /etc/sysconfig/network-
scripts/ifcfg-eth0 must contain no IP address. In all files, the "ONBOOT" entry must be set to "yes".
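Following that description, the VLAN interface file for the tag 25 could look like this (the IP address is illustrative and must match your static configuration):

```
# /etc/sysconfig/network-scripts/ifcfg-eth0.25
DEVICE=eth0.25
VLAN=yes
ONBOOT=yes
BOOTPROTO=static
IPADDR=27.1.0.10
NETMASK=255.255.0.0
```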
For accessing license servers over an IPMI console, the following commands have to be executed:
3. Add the entry "S1:2345:respawn:/sbin/agetty -h -L ttyS1 115200 vt100" to the /etc/inittab file.
5. Reboot the system. Now you should be able to use ipmitool for accesses to the SOL interface.
1. Network configuration: A valid network configuration for public interfaces of client nodes must be provided.
2. Disks: EXAoperation will use the inscribed disks for setting up client nodes.
3. Node installation: Client nodes will be identified by their MAC addresses of the appropriate private interface.
If not set explicitly, the default disk and network configuration will be used. New nodes should be started
either with EXAoperation (assuming an IPMI card in the node) or manually. While not yet marked as
activated, node disks will be formatted on every startup. After activation, a restarted client node will not be
installed anymore.
Hint: When using console redirection (as specified in the node configuration), a TTY speed of 19200 bps will be
used.
The following two sections give a detailed description of the client boot process, followed by a description
of the main part of client node monitoring implemented in cos-sensors.
2. Check that no installation process for this node is running currently, else exit.
3. Invoke stage 2 script, which will copy and unpack all necessary packages, do the final network configuration
and start up all cluster services.
¹ Replace all occurrences of 115200 in this instruction guide with the configured bit rate of the IPMI card SOL interface.
10.5. Updates
Remember that an explicit database node reconfiguration requires a background restore to take place for non-Storage
databases. This process can decrease your database performance for a certain time after database startup.
10.6. Downgrades
In case an upgrade should be reverted, it is also possible to perform a downgrade. The following steps are necessary:
2. Log into the license server via SSH over port 20. Call /etc/init.d/cos stop
4. Remove /etc/cos.conf.
After having downgraded, EXAoperation will use the configuration that was specified at upgrade time.
1. Install the new server with an EXAClusterOS installation medium. You must choose to install for an additional
license server and provide the SSH root password of the cluster nodes. 2
2. After the installation and one reboot, the synchronization is done automatically and the license server may
be used actively soon.
² It is recommended to move EXAoperation to another (already installed) license server beforehand, because the data synchronization process
may make heavy use of network bandwidth, which may hit database performance temporarily.
2. After the node has started, log into it via rssh, execute killall crond and issue a dmesg command to
see how the new disk(s) were integrated into the system (the formerly known disk partitions are shown at
kernel boot time).
3. If an already installed disk changed its name (e.g. from /dev/sda to /dev/sdb), rename it via
EXAoperation (-> Nodes -> Properties). Furthermore, execute mkdir /etc/cos (in rssh) and create a
/etc/cos/node_uuid file containing the "Unique ID" of that node (-> Node View), e.g. via echo -n
{UUID} >/etc/cos/node_uuid. Check the currently referred disk via
/usr/opt/EXASuite-6/EXAClusterOS-6.0.6/sbin/hddident -m /dev/sdb1 -a (this assumes
that the formerly known disk is now known as /dev/sdb). Set the right disk via
/usr/opt/EXASuite-6/EXAClusterOS-6.0.6/sbin/hddident -m /dev/sdb1 -N /dev/sdb
and re-check the success of this command via /usr/opt/EXASuite-6/EXAClusterOS-6.0.6/sbin/hddident
-m /dev/sdb1 -a. Delete the /etc/cos directory via rm -rf /etc/cos.
4. Set the node into the "Install" state and add the new disk(s) (-> Nodes -> Node -> Disks). Set the node back
into the "Active" state.
6. Execute cos-ping in the rssh shell on the client node and wait until the boot of this node fails (see the
messages in EXAoperation).
7. Take the mkfs.ext4 command referring to the new disk from the /etc/hddinit_gpt.sh script on
the client node and execute it in the rssh shell.
Glossary
Glossary
D
Disk A partition that may span one or more physical devices. Comparable
to a logical volume under Linux.
See Also Disk device.