IBM Cloud Private System Administrator's Guide
Ahmed Azraq
Wlodek Dymaczewski
Fernando Ewald
Luca Floris
Rahul Gupta
Vasfi Gucer
Anil Patil
Sanjay Singh
Sundaragopal Venkatraman
Dominique Vernier
Zhi Min Wen
In partnership with
IBM Academy of Technology
IBM Redbooks
April 2019
SG24-8440-00
Note: Before using this information and the product it supports, read the information in “Notices” on
page ix.
Notices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .x
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Authors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii
Now you can become a published author, too . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv
Comments welcome. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi
Stay connected to IBM Redbooks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvi
6.5.1 Pushing and pulling images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249
6.5.2 Enforcing container image security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
Chapter 10. IBM Cloud Private Cloud Foundry and common systems administration tasks . . . . . . . . . 325
10.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
10.1.1 IaaS flavors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
10.1.2 Technology BOSH versus Kubernetes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
10.2 Installation and extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327
10.2.1 Installation of the installer container in a Cloud Foundry Full Stack environment . . . 327
10.2.2 Installation of the installer container in a CFEE environment . . . . . . . . . . . . . . 328
10.2.3 Config-manager role . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
10.2.4 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
10.3 High availability installation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
10.3.1 Zoning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
10.3.2 External database . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
10.3.3 External objects store . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
10.4 Backup and restore strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
10.4.1 Installation data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
10.4.2 Director . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
10.4.3 Cloud Foundry database. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
10.5 Storage and persistent volumes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
10.5.1 Cloud Foundry Full Stack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
10.5.2 Cloud Foundry Enterprise Environment (CFEE) technology preview . . . . . . . 337
10.6 Sizing and licensing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
10.7 Networking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
10.8 Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
10.8.1 TLS encryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
10.8.2 Inbound routing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
10.8.3 Credentials and certificates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
10.9 Monitoring and logging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
10.9.1 Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
10.9.2 Logging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341
10.10 Integrating external services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
10.10.1 IBM Cloud Private services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
10.10.2 IBM Cloud services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
10.10.3 Legacy services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
10.11 Applications and buildpacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
10.11.1 Installing extra buildpacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
10.11.2 Application for an airgap environment. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
10.12 iFix and releases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
10.12.1 Zero downtime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
This information was developed for products and services offered in the US. This material might be available
from IBM in other languages. However, you may be required to own a copy of the product or product version in
that language in order to access it.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user’s responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not grant you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, MD-NC119, Armonk, NY 10504-1785, US
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any
manner serve as an endorsement of those websites. The materials at those websites are not part of the
materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you provide in any way it believes appropriate without
incurring any obligation to you.
The performance data and client examples cited are presented for illustrative purposes only. Actual
performance results may vary depending on specific configurations and operating conditions.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
Statements regarding IBM’s future direction or intent are subject to change or withdrawal without notice, and
represent goals and objectives only.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to actual people or business enterprises is entirely
coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore,
cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are
provided “AS IS”, without warranty of any kind. IBM shall not be liable for any damages arising out of your use
of the sample programs.
The following terms are trademarks or registered trademarks of International Business Machines Corporation,
and might also be trademarks or registered trademarks in other countries.
Cognos®, DataPower®, DataStage®, Domino®, Global Business Services®, IBM Watson™, IBM®, Lotus®, Passport Advantage®, PowerPC®, Redbooks®, Redbooks (logo)®, Redpapers™, SPSS®, Tivoli®, WebSphere®
ITIL is a registered trademark, and a registered community trademark of The Minister for the Cabinet Office,
and is registered in the U.S. Patent and Trademark Office.
IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency
which is now part of the Office of Government Commerce.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Microsoft, Windows, and the Windows logo are trademarks of Microsoft Corporation in the United States,
other countries, or both.
Java, and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its
affiliates.
Other company, product, or service names may be trademarks or service marks of others.
IBM® Cloud Private is an application platform for developing and managing containerized
applications across hybrid cloud environments, on-premises and public clouds. It is an
integrated environment for managing containers that includes the container orchestrator
Kubernetes, a private image registry, a management console, and monitoring frameworks.
This IBM Redbooks® publication covers tasks performed by IBM Cloud Private system
administrators, such as installation for high availability, configuration, backup and restore,
using persistent volumes, networking, security, logging and monitoring, Istio integration,
troubleshooting, and so on.
The author team has many years of experience implementing IBM Cloud Private and
other cloud solutions in production environments, so throughout this document we take the
approach of providing you with the recommended practices in those areas.
As part of this project, we also developed several code examples. You can download those
from the IBM Redbooks GitHub location (https://github.com/IBMRedbooks).
If you are an IBM Cloud Private system administrator, this book is for you. If you are
developing applications on IBM Cloud Private, see the IBM Redbooks publication
IBM Cloud Private Application Developer's Guide, SG24-8441.
Sundaragopal Venkatraman (Sundar) is a cloud evangelist
and a thought leader on application infrastructure, application
modernization, performance, scalability, and high availability of
enterprise solutions. With experience spanning over two
decades, he has been recognized as a trusted advisor to
various multinational banks and other IBM clients in India and
Asia-Pacific. He has been recognized as a technical focal point
for IBM Cloud Private by IBM Cloud Innovation Labs in India.
He has delivered deep-dive sessions on IBM Cloud Private at
international forums and conducts boot camps worldwide. He
pioneered the IBM Proactive Monitoring toolkit, a lightweight
monitoring solution that was highlighted at IBM showcase
events. He has authored various IBM Redbooks publications
and Redpapers, and is a recognized author. He thanks his wife
and his family for supporting him in all his endeavors.
Robin Hernandez, Atif Siddiqui, Jeff Brent, Budi Darmawan, Eduardo Patrocinio, Eswara
Kosaraju, David A Weilert, Aman Kochhar, Surya V Duggirala, Kevin Xiao, Brian Hernandez,
Ling Lan, Eric Schultz, Kevin G Carr, Nicholas Schambureck, Ivory Knipfer, Kyle C Miller, Sam
Ellis, Rick Osowski, Justin Kulikauskas, Christopher Doan, Russell Kliegel, Kip Harris
IBM USA
Raffaele Stifani
IBM Italy
Santosh Ananda, Rachappa Goni, Shajeer Mohammed, Sukumar Subburaj, Dinesh Tripathi
IBM India
The team would also like to express thanks to the following IBMers for contributing content to
the Cloud Foundry section of the book while continuing to develop the next IBM Cloud Private
Cloud Foundry release:
Kevin Cormier, Roke Jung, Colton Nicotera, Lindsay Martin, Joshua Packer
IBM Canada
Find out more about the residency program, browse the residency index, and apply online at:
ibm.com/redbooks/residencies.html
Comments welcome
Your comments are important to us!
We want our books to be as helpful as possible. Send us your comments about this book or
other IBM Redbooks publications in one of the following ways:
Use the online Contact us review Redbooks form:
ibm.com/redbooks
Send your comments in an email:
redbooks@us.ibm.com
Mail your comments:
IBM Corporation, IBM Redbooks
Dept. HYTD Mail Station P099
2455 South Road
Poughkeepsie, NY 12601-5400
GitHub materials: If you’d like to follow the code examples in this IBM Redbooks
publication, you can download the GitHub repository of this book. See Appendix B,
“Additional material” on page 365 for instructions.
With a lightweight footprint and powerful platform capabilities, IBM Cloud Private enables
enterprises to unleash their developmental creativity, using industry-common technologies
and process guidance, in a minimal time frame.
Platform technologies enabling cloud native development include Docker containers and
Kubernetes with integrated operations management for security, logging, monitoring and
event management. IBM Cloud Private also provides access to necessary application
runtimes and data services.
Use cases for IBM Cloud Private include the following situations:
Create new cloud-native apps.
Modernize your existing apps on cloud.
Open your data center to work with cloud services.
1 Parts of this section are based on the whitepaper “IBM Cloud Private: The cloud-native application platform for the enterprises” written by Raffaele Stifani (Executive Architect - Software Group) from IBM Italy.
Your enterprise can choose the prescriptive development approach of Cloud Foundry, or the
more customizable and portable approach of Kubernetes and Docker Containers.
Along with the application runtime frameworks, IBM delivers a core set of management
services for these frameworks and the applications being developed on top. Some examples
of the management services include logging, monitoring, access control, and event
management.
Enterprises can use these management tools, which come integrated with the platform and
ready to use. These are tools frequently used by enterprise clients today, so existing skills can
be applied. If needed, these tools can be integrated with existing enterprise installations, so
that management needs are operationalized from one location.
One of the most beneficial aspects of the IBM Cloud Private platform is the application
services that move innovation from idea to reality. As shown in Figure 1-2, IBM Cloud Private
includes services for data, messaging, Java, integration, Blockchain, DevOps, analytics, and
others. These services are crucial for enterprise application creation, and with IBM Cloud
Private they can be deployed rapidly and ready to accelerate new ideas.
IBM Cloud Private supports choice in application development with Kubernetes, Cloud
Foundry, and function-based programming models. It provides these benefits:
Containers and orchestration that are based on Kubernetes for creating
microservices-based applications.
A common catalog of enterprise and open services to accelerate developer productivity.
A choice of compute models for rapid innovation, including Kubernetes and Cloud
Foundry.
Common base services to support the scalable management of microservices, including
Istio, monitoring with Prometheus, logging with Elasticsearch, Logstash, and Kibana
(ELK).
Automatic horizontal and non-disruptive vertical scaling of applications.
Note: In the following images, the clusters represent minimal IBM Cloud Private
configurations. Actual production configurations can vary.
Note: You can use a single boot node for multiple clusters. In such a case, the boot node
and a master cannot be on a single node; each cluster must have its own master node. On
the boot node, you must have a separate installation directory for each cluster. If you are
providing your own certificate authority (CA) for authentication, you must have a separate
CA domain for each cluster.
Hosts that can act as the master are called master candidates. See Figure 1-4.
Note: If you do not specify a separate proxy, management, or etcd node, those
components are also handled by the master node.
Supported platforms: For the most recent information on supported platforms for each
IBM Cloud Private version 3.1.2 node type, see the Supported operating systems and
platforms IBM Knowledge Center link.
[Figure: IBM Cloud Private platform architecture.2 The diagram shows a workloads layer (cloud-native microservices, containerized middleware such as application servers, blockchain, caches, and databases, data workloads and analytics, DevOps and open source tools, and developer automation such as the service broker, provisioning, AI and edge services, and the container catalog), a next-generation management layer providing PaaS (Platform as a Service) and CaaS (Container as a Service), and an infrastructure layer with infrastructure automation, alongside enterprise middleware, enterprise data services, and IBM public cloud.]
2 Image taken from the IBM Cloud Architecture Center: https://www.ibm.com/cloud/garage/architectures/private-cloud/reference-architecture
For more information about the IBM Cloud Private architecture, visit the following IBM Cloud
Architecture link:
https://www.ibm.com/cloud/garage/architectures/private-cloud/reference-architecture
See Chapter 2, “High availability installation” on page 31 for details on installing IBM Cloud
Private.
3 Image taken from the IBM Cloud Architecture Center: https://www.ibm.com/cloud/garage/architectures/private-cloud/reference-architecture
See Chapter 5, “Logging and monitoring” on page 153 for more information on the ELK stack.
IBM Cloud Private configures custom Prometheus collectors for custom metrics. Custom
metrics help provide insights and building blocks for customer alerts and custom dashboards.
IBM Cloud Private uses a Prometheus and Grafana stack for system monitoring.
See Chapter 5, “Logging and monitoring” on page 153 for more information on monitoring
and alerts in IBM Cloud Private.
1.4.4 Metering
Every container must be managed for license usage. You can use the metering service to
view and download detailed usage metrics for your applications and cluster. Fine-grained
measurements are visible through the metering UI and the data is kept for up to three months.
Monthly summary reports are also available for you to download and are kept for up to 24
months.
See Chapter 6, “Security” on page 237 for more information on this topic.
1.4.6 Security
IBM Cloud Private ensures data in transit and data at rest security for all platform services. All
services expose network endpoints over TLS and store data that is encrypted at rest. You
configure IPsec (for data in transit between nodes) and dm-crypt (for data at rest) in order to
accomplish that.
All services must provide audit logs for actions performed, when they were performed, and
who performed the action. The security model ensures consistent audit trails for all platform
services and compliance across all middleware.
See Chapter 6, “Security” on page 237 for details on managing security in IBM Cloud Private.
Vulnerability Advisor provides security management for IBM Cloud Container Registry,
generating a security status report that includes suggested fixes and best practices.
Any issues that are found by Vulnerability Advisor result in a verdict that indicates that it is not
advisable to deploy this image. If you choose to deploy the image, any containers that are
deployed from the image include known issues that might be used to attack or otherwise
compromise the container. The verdict is adjusted based on any exemptions that you
specified. This verdict can be used by Container Image Security Enforcement to prevent the
deployment of nonsecure images in IBM Cloud Kubernetes Service.
Fixing the security and configuration issues that are reported by Vulnerability Advisor can
help you to secure your IBM Cloud infrastructure.
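As an illustration, image enforcement is driven by image policy resources. The following is a minimal sketch, assuming the securityenforcement.admission.cloud.ibm.com/v1beta1 API used by Container Image Security Enforcement; the registry pattern and policy shown are examples only:

kubectl apply -f - <<'EOF'
apiVersion: securityenforcement.admission.cloud.ibm.com/v1beta1
kind: ClusterImagePolicy
metadata:
  name: va-enforced-registry        # example name
spec:
  repositories:
    # Permit images from this registry only when Vulnerability Advisor
    # scanning is enabled and reports no blocking issues
    - name: "mycluster.icp:8500/*"
      policy:
        va:
          enabled: true
EOF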
IBM Cloud Private with Cloud Automation Manager provides choice and flexibility for multiple
IT roles across the organization to build and deliver applications and application environments
into production more quickly, with greater consistency and control.
With IBM Cloud Private, developers can get up and running quickly with a lightweight
development environment optimized for delivering Docker containerized applications with an
integrated DevOps toolchain.
With Cloud Automation Manager, IT infrastructure managers can help provision and maintain
cloud infrastructure and traditional VM application environments with a consistent operational
experience across multiple clouds.
With the Cloud Automation Manager Service Composer, IT service managers can graphically
compose complex cloud services that can be consumed as a service from a DevOps
toolchain or delivered in a cloud service catalog.
With a large and growing catalog of pre-built automation content for popular open source and
IBM middleware, built to best practices, developers and IT architects can get productive fast.
See the following link for more information on IBM Cloud Automation Manager:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.1/featured_applications/
cam.html
The scan report highlights any gaps and the effort needed to make the application cloud-ready
for deployment. The result of the application binary scan also provides the deployment YAML,
the Docker file for containerizing the application, and a Liberty server.xml file. If an application is
fully compliant and does not require any changes, then it can be deployed directly to an IBM
Cloud Private through the Transformation Advisor console itself for testing.
Benefits:
Included with and deployed on IBM Cloud Private
Introspects applications running on the most popular runtime environments
Produces effort estimates for application modernization of the workload
Deploys the application to a target IBM Cloud Private environment if the application is fully
compliant
Provides recommendations for application modernization
Figure 1-12 on page 16 shows the IBM Cloud Transformation Advisor. See the following link
for more information on IBM Cloud Transformation Advisor:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.1/featured_applications/
transformation_advisor.html.
See the IBM Redbooks publication IBM Cloud Private Application Developer's Guide,
SG24-8441 for detailed information on IBM Microclimate installation and configuration and how
to use it in a sample scenario.
Figure 1-13 on page 17 shows the IBM Cloud Private management console.
1.4.12 Kubernetes
To run a container in production, Kubernetes brings orchestration primitives to support
different styles of workloads:
Stateless workloads: ReplicaSets
Stateful workloads: StatefulSets
Batch workloads: Jobs
System workloads: DaemonSets
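For example, a stateless workload is declared once and Kubernetes keeps the requested number of replicas running. A minimal sketch (names and image are illustrative only):

kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment          # manages a ReplicaSet for a stateless workload
metadata:
  name: hello-web
spec:
  replicas: 3             # Kubernetes maintains three pods at all times
  selector:
    matchLabels:
      app: hello-web
  template:
    metadata:
      labels:
        app: hello-web
    spec:
      containers:
      - name: web
        image: nginx:1.15 # illustrative image
EOF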
Helm charts describe even the most complex applications, provide repeatable application
installation, and serve as a single point of authority. Helm charts are easy to update with
in-place upgrades and custom hooks. Charts are also easy to version, share, and host on
public or private servers. You can use helm rollback to roll back to an older version of a
release with ease. See 1.5, “Helm” on page 19 for more information on Helm components.
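As a sketch of that lifecycle (release and chart names are illustrative; the --tls flag assumes the TLS-enabled Helm 2 CLI that IBM Cloud Private uses):

helm upgrade my-release stable/mysql --set imageTag=5.7.24 --tls   # in-place upgrade
helm history my-release --tls                                      # list release revisions
helm rollback my-release 1 --tls                                   # return to revision 1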
1.4.15 Catalog
IBM Cloud Private provides an easy to use, extend, and compose catalog of IBM and
third-party content. The following are some key concepts:
Charts: A bundle of Kubernetes resources
Repository: A collection of charts
Releases: A chart instance loaded into Kubernetes. The same chart can be deployed
several times, and each deployment becomes its own release.
The catalog provides a centralized location from which you can browse for and install
packages in your cluster.
Packages for additional IBM products are available from curated repositories that are included
in the default IBM Cloud Private repository list. Your environment must be connected to the
internet for you to access the charts for these packages.
The service broker is a component that implements the service broker API to view the
available services and plans, create an instance from the available services and plans, and
create bindings to connect to the service instance.
1.5 Helm
Helm is a package manager. Package managers automate the process of installing,
configuring, upgrading, and removing applications on a Kubernetes cluster.
For any application deployment, several Kubernetes commands (kubectl) are needed to
create and configure resources. Instead of manually creating each application dependency
resource separately, Helm creates many resources with one command. A Helm chart defines
several Kubernetes resources as a set in a YAML file. A default chart contains a minimum of
a deployment template and a service template.
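As a sketch, scaffolding a chart with the Helm CLI shows those pieces (the chart name is illustrative; helm create also generates a few more starter files):

helm create mychart            # scaffolds a starter chart
# mychart/
#   Chart.yaml                 chart name and version metadata
#   values.yaml                default values substituted into the templates
#   templates/deployment.yaml  deployment template
#   templates/service.yaml     service template
helm install ./mychart --name my-release --tls   # one command creates all resources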
Helm: A command-line interface (CLI) that installs charts into Kubernetes, creating a
release for each installation. To find new charts, search Helm chart repositories.
Chart: An application package that contains templates for a set of resources that are
necessary to run the application. A template uses variables that are substituted with
values when the manifest is created. The chart includes a values file that describes how to
configure the resources.
Repository: Storage for Helm charts. The namespace of the hub for official charts is
stable.
Release: An instance of a chart that is running in a Kubernetes cluster. You can install the
same chart multiple times to create many releases.
Tiller: The Helm server-side templating engine, which runs in a pod in a Kubernetes
cluster. Tiller processes a chart to generate Kubernetes resource manifests, which are
YAML-formatted files that describe a resource. YAML is a human-readable structured data
format. Tiller then installs the release into the cluster. Tiller stores each release as a
Kubernetes ConfigMap.
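Because each release is stored as a ConfigMap, you can inspect the release records directly. A minimal sketch, assuming Tiller runs in the kube-system namespace and uses its conventional OWNER=TILLER label:

kubectl get configmaps --namespace kube-system --selector "OWNER=TILLER"
# Each entry corresponds to one stored release revision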
Working with more than one or two clouds means you can pick the best providers and
services across clouds, geographies and functions for specific needs. This helps potentially
lower costs, increase performance and solidify governance.
Multicloud enterprises rely on private clouds for better data center performance and
availability. They depend on public clouds to be more competitive, get to market faster and
build new capabilities like analytics and artificial intelligence.
But successfully developing and deploying these new capabilities means that the average
multicloud enterprise uses six or more clouds and hundreds of Kubernetes clusters. This
creates a complex, error-prone environment that’s expensive and time-consuming to manage.
IBM Multicloud Manager helps enterprises manage resources across clouds, decrease
compliance risks, and reduce cost.
While still supporting traditional packaging and deployment models (installing software as
usual on a supported operating system), an increasing number of products are available as
container images. As mentioned, deploying containerized software in a manner suitable for
production requires more than an image.
IBM Cloud Paks provide enterprise software container images that are pre-packaged in
production-ready configurations that can be quickly and easily deployed to IBM’s container
platforms, with support for resiliency, scalability, and integration with core platform services,
like monitoring or identity management. For customers who don’t want to operate the
software and its underlying infrastructure, in containers or otherwise, IBM also makes many of
its products available as-a-Service, where they are hosted and maintained by IBM in its public
cloud.
Figure 1-17 shows the three ways IBM software is delivered and consumed as containers.
Not all IBM products are available in all three delivery models.
IBM Cloud Paks use Helm charts to describe how IBM software should be deployed in a
Kubernetes environment. These resource definitions can be easily customized during
deployment, and upgrades can be easily rolled out or rolled back using the management
interfaces provided by IBM Cloud Private or IBM Cloud Kubernetes Service.
An IBM Cloud Pak is more than a simple Helm chart. IBM Cloud Paks accelerate time to
value and improve enterprise readiness at lower cost than containers alone.
IBM Cloud Paks are identified in the Catalog by one of two badges with the entry. An entry
with an IBM Cloud Pak badge meets the criteria for that badge. An entry that displays a
Certified IBM Cloud Pak badge meets the requirements of the Certified IBM Cloud Pak badge,
which are more stringent than those for the IBM Cloud Pak badge.
IBM Cloud Paks can be created by IBM, or they can be third-party software solutions that are
offered by IBM Partners. Figure 1-18 shows the comparison between IBM Cloud Pak and
Certified IBM Cloud Pak.
Figure 1-18 Comparison between IBM Cloud Pak and Certified IBM Cloud Pak
IBM Cloud Private for Data: In addition to these IBM Cloud Private Editions, there is
another related product called IBM Cloud Private for Data. This product is a cloud-native
solution that enables you to put your data to work quickly and efficiently. It does so by
enabling you to connect to your data (no matter where it lives), govern it, find it, and use
it for analysis. IBM Cloud Private for Data also enables all of your data users to collaborate
from a single, unified interface, so your IT department doesn’t need to deploy and connect
multiple applications.
We will not discuss this product in this document. You can see the product page at
https://www.ibm.com/analytics/cloud-private-for-data for more information.
A persistent volume claim is a storage request, or claim, made by the developer. Claims
request specific sizes of storage, as well as other aspects such as access modes.
A StorageClass describes an offering of storage and allows for the dynamic provisioning of
PVs and PVCs based upon these controlled definitions.
The reclaim policy determines what happens to the underlying storage when a claim is released:
Retain allows for the manual reclamation of the storage asset. The PVC is deleted, but the PV
remains.
The Delete reclaim policy removes both objects from within the cluster, as well as the
associated storage asset from the external infrastructure.
Access modes define how volumes can be mounted, in the manner supported by the storage
provider:
ReadWriteOnce (RWO): can be mounted as read-write by a single node and pod
ReadOnlyMany (ROX): can be mounted as read-only by many nodes and pods
ReadWriteMany (RWX): can be mounted as read-write by many nodes and pods
Note: Reclaim policy and Access Modes may be defined differently by each storage
provider implementation.
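To make these definitions concrete, here is a minimal sketch of a StorageClass and a claim against it (the provisioner, names, and sizes are illustrative and depend on your storage provider):

kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast                            # example class name
provisioner: kubernetes.io/aws-ebs      # illustrative; use your provider's provisioner
reclaimPolicy: Retain                   # keep the PV (and data) after the claim is deleted
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data
spec:
  storageClassName: fast
  accessModes:
    - ReadWriteOnce                     # read-write on a single node
  resources:
    requests:
      storage: 10Gi                     # requested size
EOF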
See Chapter 4, “Managing persistence in IBM Cloud Private” on page 115 for more
information on managing storage in IBM Cloud Private.
application log (applog) A log that is produced from applications that are
deployed in the Cloud Foundry environment.
Cloud Foundry deployment tool The user interface that is used to manage the
deployment of Cloud Foundry.
management logging service An ELK stack that is used to collect and store all
Docker-captured logs.
Virtual Machine File System (VMFS) A cluster file system that allows virtualization to
scale beyond a single node for multiple VMware
ESX servers.
virtual storage area network (VSAN) A fabric within the storage area network (SAN).
High availability installations of the IBM Cloud Private platform are supported through the
Cloud Native and Enterprise Editions only. See section 1.3, “IBM Cloud Private architecture”
on page 10 for details on the IBM Cloud Private architecture and section 1.2, “IBM Cloud
Private node types” on page 6 for a discussion of the IBM Cloud Private node types.
System administrators should determine the high availability requirements of the IBM Cloud
Private platform installation before the software is installed.
Kubernetes technology provides built-in functions to support the resiliency of a cluster. When
administrators install IBM Cloud Private, the installation process installs all the components
that they need. However, it is always good to know how Kubernetes works. Administrators can
deploy multiple master nodes and proxy nodes to achieve high availability.
System administrators can configure high availability for only the master nodes, only the
proxy nodes, or for both types of nodes.
Note: Every instance of the master node has its own etcd database and runs the API
Server. The etcd database contains vital cluster data used in orchestration. Data in
etcd is replicated across multiple master nodes. This applies when etcd is included
within the masters; you can also separate the etcd nodes from the master nodes.
Note: The Kubernetes scheduler and controller manager run on all the master nodes, but
they are active on only one master node. They work in leader-election mode so that only
one is active. If a failure occurs, another master node instance takes over.
Number of master (etcd) nodes   Quorum (majority)   Failure tolerance
1                               1                   0
2                               2                   0
3                               2                   1
4                               3                   1
5                               3                   2
6                               4                   2
7                               4                   3
Section 2.4, “Step-by-step installation guide using Terraform” shows the installation of an IBM
Cloud Private cluster with the number of nodes of each type shown in Table 2-2, plus an
additional vulnerability advisor node.
Node type     Number of nodes   CPU (cores)   Memory (GB)   Disk (GB)
Boot          1                 2             8             250
Master        3                 16            32            500
Management    2                 8             16            500
Proxy         2                 4             16            400
Worker        3+                8             32            400
Node type     Number of nodes   CPU (cores)   Memory (GB)   Disk (GB)
Boot          1                 2             8             250
Master        3                 16            32            500
Management    2                 16            32            500
Proxy         3                 4             16            400
Worker        3+                8             32            400
Large size environment (worker nodes < 500) with high resiliency
Table 2-4 below shows sizing for a large IBM Cloud Private cluster environment with 500
or fewer worker nodes.
Node type               Number of nodes   CPU (cores)   Memory (GB)   Disk (GB)
Boot                    1                 4             8             250
Proxy                   2                 4             16            400
Vulnerability advisor   3                 6             48            500
Large size environment (worker nodes < 1000) with high resiliency
Table 2-5 shows sizing for a large IBM Cloud Private cluster environment with 1000 or
fewer worker nodes.
Node type               Number of nodes   CPU (cores)   Memory (GB)   Disk (GB)
Boot                    1                 4             8             250
Proxy                   2                 4             16            400
Vulnerability advisor   3                 6             48            500
This scenario uses Kubernetes functions but can’t protect applications from site failure. If
system administrators need site-based protection, they should combine this model with other
protection solutions, including a disaster recovery solution. Figure 2-1 shows the IBM Cloud
Private intra cluster topology.
The cluster is distributed among multiple zones. For example, you might have three, or five, or
seven master nodes and several worker nodes distributed among three zones. Zones are
sites on the same campus or sites that are close to each other. If a complete zone fails, the
master still survives and can move the pods across the remaining worker nodes.
Note: This scenario is possible, but might present a few challenges. It must be
implemented carefully.
Figure 2-2 shows the intra cluster with multiple availability zone topology.
The federation model is possible; however, beyond the orchestrator, you must consider all the
other components of IBM Cloud Private to recover. For example, you must recover all the
tools to manage the logs and to monitor your platform and workloads.
While the federation for Kubernetes is a built-in feature, you still must take care of all the other
components. As in the “Intra cluster with multiple zones” model, you must also be aware of
possible latency problems.
Support for federation is relatively new in Kubernetes. Before you apply federation to
business-critical workloads, look at its evolution and maturity.
Figure 2-3 Inter cluster with federation on different availability zones topology
In summary:
Consider your workload volumes. Do you have a high number of transactions? Is the
workload variable?
Consider your workload personality. Are your services network intensive? What are the
programming frameworks you are using?
Consider your performance requirements. How do your applications scale: horizontal,
vertical, or sharding?
In the case of IBM Cloud Private, you use Terraform to perform the following automation:
Create security groups to allow intercluster communication, allow a specific port range for
public communication to the load balancer, and allow SSH between the boot node and all
nodes for installation.
Create two local load balancers in front of the proxy nodes and the master nodes.
Create virtual machines for boot node, master nodes, proxy nodes, management nodes,
vulnerability advisor, and worker nodes.
Create file storage for master nodes shared storage.
Deploy IBM Cloud Private 3.1.2 Enterprise Edition on top of the provisioned infrastructure.
Figure 2-6 on page 41 shows an architecture diagram for the IBM Cloud Private highly
available configuration. You learn how to deploy IBM Cloud Private based on the
recommendations described in 2.1, “High availability considerations” on page 32. In this
example, you deploy IBM Cloud Private on top of IBM Cloud infrastructure using the following
configuration:
3 master nodes
2 management nodes
2 proxy nodes
1 vulnerability advisor node
3 worker nodes
Note: These steps have been tested on macOS and Ubuntu 18.04 Minimal LTS. If you use
another operating system locally and run into a problem while preparing your environment
or while applying the Terraform script, provision an Ubuntu 18.04 Virtual Server Instance
instead. The machine needs at least 2 virtual cores and 4 GB of RAM.
Install Terraform
Install Terraform on your machine as you use it to run the scripts needed to install IBM Cloud
Private. Perform the following steps to install Terraform:
1. Download the package corresponding to your operating system and architecture from the
following URL:
https://www.terraform.io/downloads.html
Ubuntu: wget https://releases.hashicorp.com/terraform/0.11.11/terraform_0.11.11_linux_amd64.zip
2. Extract the compressed folder. On the Ubuntu operating system, perform the steps in
Example 2-1.
3. Update your PATH environment variable to point to the directory that has the extracted
binary, or move the binary to a directory already on your PATH:
Ubuntu: sudo mv terraform /usr/local/bin/
4. Verify that Terraform is installed successfully.
Open a terminal and run the command terraform --version; the Terraform version
should appear if it is installed successfully.
Perform the following steps to install and configure IBM Cloud Provider plug-in:
1. Download the package corresponding to your operating system and architecture from the
following URL:
https://github.com/IBM-Cloud/terraform-provider-ibm/releases
If you run Terraform from your local machine, perform the steps in “Upload IBM Cloud
Private binaries to a File Server” to upload the binaries to a file server.
Note: The following steps show the basic steps to create an Apache HTTP Server. Make
sure to apply the appropriate security considerations to secure it. Skip this step if you
have an existing HTTP server or NFS server.
Note: The tag 3.1.2_redbook is included in the git clone command because the
installation has been tested on this code base. Remove this tag if you would like to
work with the latest version of the script.
Example 2-3 Clone the GitHub repo that has the Terraform IBM Cloud Private installation script
git clone --branch 3.1.2_redbook \
https://github.com/ibm-cloud-architecture/terraform-icp-ibmcloud.git
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b <new-branch-name>
Note: The Terraform templates contained in this GitHub repository will be continuously
updated by the IBM team for upcoming IBM Cloud Private releases.
2. In order to allow Terraform to access your Cloud account, you need to provide the API
username and API key of your IBM Cloud infrastructure account.
Note: In case you are not an account owner or Superuser, make sure you have the
following permissions with IBM Cloud infrastructure:
Manage Storage.
Manage Security Groups.
a. Navigate to your user profile on your IBM Cloud infrastructure account by opening
https://control.softlayer.com/account/user/profile.
b. Copy the API Username and Authentication Key from the portal, as shown in
Figure 2-10.
3. Specify the data center that will host all the Virtual Server Instances by adding the following
line to terraform.tfvars. Replace dal10 with the intended data center code.
datacenter = "dal10"
Note: Some customers have data residency requirements. In these situations, make
sure that the data center configured in this step resides in the country of the customer.
For example, a Singaporean customer that has data residency requirements needs a
data center in Singapore, for example SNG01, as shown in Figure 2-11.
4. Set the IBM Cloud Private image to use for the installation. Get the image
corresponding to your version from the IBM Knowledge Center:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/installing/install_containers.html
For IBM Cloud Private Enterprise Edition 3.1.2, the image is
ibmcom/icp-inception-amd64:3.1.2-ee. Add the following line to terraform.tfvars:
icp_inception_image = "ibmcom/icp-inception-amd64:3.1.2-ee"
5. Specify the location of the image by adding the following line to terraform.tfvars.
The image location can be NFS, HTTP, or a private registry. Replace IMAGE_LOCATION with
the location of your image that you defined in 2.4.2, “Upload IBM Cloud Private binaries”
on page 43.
image_location = "IMAGE_LOCATION"
6. Set the IBM Cloud Private admin password by adding the following line to terraform.tfvars:
icppassword = "IBM-Cloud-Private-Admin-Redbooks"
Note: Starting from IBM Cloud Private 3.1.2, by default the admin password should
comply with a specific regular expression enforcement rule '^([a-zA-Z0-9\-]{32,})$'.
This regular expression can be changed during installation.
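If your chosen password does not match the default rule, the enforcement can be relaxed or changed at installation time. A minimal sketch, assuming the password_rules list that IBM Cloud Private 3.1.2 reads from the cluster config.yaml in a standard installation (the pattern is an example only; with the Terraform templates the equivalent is exposed through template variables):

# Append a custom password rule to the cluster configuration (example pattern)
cat >> cluster/config.yaml <<'EOF'
password_rules:
  - '^.{16,}$'   # example: accept any password of 16 or more characters
EOF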
mgmt = {
nodes = "2"
cpu_cores = "8"
memory = "16384"
disk_size = "100" // GB
docker_vol_size = "400" // GB
}
va = {
nodes = "1"
cpu_cores = "8"
memory = "16384"
disk_size = "100" // GB
docker_vol_size = "400" // GB
}
proxy = {
nodes = "2"
cpu_cores = "4"
memory = "16384"
disk_size = "100" // GB
docker_vol_size = "300" // GB
}
worker = {
nodes = "3"
cpu_cores = "8"
memory = "32768"
disk_size = "100" // GB
docker_vol_size = "400" // GB
}
Example 2-7 Variable worker in variables.tf after adding the two additional variables
variable "worker" {
type = "map"
default = {
nodes = "3"
cpu_cores = "4"
memory = "16384"
disk_size = "100" // GB
docker_vol_size = "100" // GB
local_disk = false
network_speed = "1000"
hourly_billing = true
disk3_size = "100" // GB
disk4_size = "100" //GB
}
}
e. Override the values in terraform.tfvars with the intended size of disk 3 and disk 4 as
shown in Example 2-8.
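Example 2-8 itself is not reproduced here; a sketch of such an override in terraform.tfvars (sizes are illustrative) could look like the following:

# Extend the existing worker block in terraform.tfvars with the new keys
# (merge these keys into the worker block you already have; values are illustrative)
cat >> terraform.tfvars <<'EOF'
worker = {
  disk3_size = "250" // GB
  disk4_size = "250" // GB
}
EOF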
Initializing modules...
- module.icpprovision
Getting source
"git::https://github.com/IBM-CAMHub-Open/template_icp_modules.git?ref=2.3//public_cloud"
To prevent automatic upgrades to new major versions that may contain breaking
changes, it is recommended to add version = "..." constraints to the
corresponding provider blocks in configuration, with the constraint strings
suggested below.
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
2. Run terraform plan to review the changes that Terraform will make. The output you
receive should be similar to Example 2-10.
------------------------------------------------------------------------
------------------------------------------------------------------------
Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.
3. Run terraform apply to start the deployment, and then confirm the action by typing yes.
The deployment takes around 3 hours.
Troubleshooting: If the apply fails for any reason, such as a network failure, you can
perform terraform apply again; Terraform preserves the state and does not start from
scratch.
In some cases, you might need to delete a resource manually, make any needed
modifications, and perform terraform apply again. For example, if you would like to
change the image location, you need to delete the resource image_load by using terraform
destroy -target null_resource.image_load. This makes sure that Terraform reloads the
image from the new path.
Outputs:
Note: You can use similar steps to install IBM Cloud Private Community Edition if
you don’t have a license for IBM Cloud Private. The Community Edition is free of charge,
and its Terraform template exists in the same GitHub repository under the folder
icp-ce-minimal:
https://github.com/ibm-cloud-architecture/terraform-icp-ibmcloud/tree/master/te
mplates/icp-ce-minimal.
You can also install IBM Cloud Private Community Edition locally on your PC through
Vagrant:
https://github.com/IBM/deploy-ibm-cloud-private/blob/master/docs/deploy-vagrant
.md.
For more details on how to install these tools, see Appendix A, “Command line tools” on
page 347.
1. Perform the instructions in this Knowledge Center to install IBM Cloud Private CLI
(Cloudctl):
https://www.ibm.com/support/knowledgecenter/SSBS6K_3.1.2/manage_cluster/install
_cli.html.
2. Follow the instructions in this Knowledge Center to download and install Kubernetes CLI
(Kubectl):
https://www.ibm.com/support/knowledgecenter/SSBS6K_3.1.2/manage_cluster/install
_kubectl.html.
3. Verify that Cloudctl and Kubectl are installed by performing the following commands:
cloudctl version
kubectl version
4. Log in to your cluster with the following command, where ibm_cloud_private_console_url
is the external host name or IP address for your master:
cloudctl login -a https://<ibm_cloud_private_console_url>:8443
--skip-ssl-validation
5. Make sure that there are no errors during login.
7. Verify that all nodes are in Ready status by performing the command in
Example 2-13.
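Example 2-13 is not reproduced here; a typical check (assuming kubectl was configured by the cloudctl login in step 4) looks like the following, where every node should report a Ready status:

kubectl get nodes    # every node should show STATUS Ready
# Illustrative output:
# NAME        STATUS   ROLES         AGE   VERSION
# 10.0.0.1    Ready    master,etcd   1h    v1.12.4
# 10.0.0.2    Ready    worker        1h    v1.12.4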
3. After you log in, you should see the “Welcome to IBM Cloud Private” screen, as
shown in Figure 2-13.
https://github.com/ibm-cloud-architecture/terraform-icp-aws.git
https://github.com/ibm-cloud-architecture/terraform-icp-azure.git
https://github.com/ibm-cloud-architecture/terraform-icp-gcp.git
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/supported_environments
/openshift/overview.html
https://github.com/ibm-cloud-architecture/terraform-icp-openstack.git
https://github.com/ibm-cloud-architecture/terraform-icp-vmware.git
https://github.com/ibm-cloud-architecture/terraform-module-icp-deploy.git
Generally, after installing IBM Cloud Private on an internet-enabled host, the IBM
charts are synced automatically, and all the IBM charts with their respective images are
synced to the local IBM Cloud Private cluster.
2.7.1 Prerequisites
The following are the prerequisites for setting up IBM Cloud Private catalog in an airgap
environment:
Ensure you have git configured on the local system.
Ensure Docker is installed on the local system, and has internet access.
Install and configure Helm on the local system.
Install IBM Cloud Private command line interface (CLI).
3. Pull the image locally using the repository and tag information from the values.yaml file:
docker pull <repository>:<tag>. In this example, it would be:
docker pull websphere-liberty:latest
4. Upload the image to the private image repository on IBM Cloud Private:
docker login <cluster_CA_domain>:8500
Tag the image: docker tag <image_name>
<cluster_CA_domain>:8500/<namespace>/<imagename>:<tagname>. In our example:
docker tag websphere-liberty:latest
mycluster.icp:8500/default/websphere-liberty:latest
5. Push the image to private repository:
docker push mycluster.icp:8500/default/websphere-liberty:latest
6. Open the values.yaml file again, and update the location of the Docker image, as shown in
Figure 2-20 on page 67:
repository: <cluster_CA_domain>:8500/<namespace>/<imagename>
tag: <tagname>
Note: You must replace mycluster.icp with your own cluster_CA_domain that you
configured in the <installation_directory>/cluster/config.yaml file.
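Put together, the image relocation in steps 3 through 5 looks like the following sketch (replace mycluster.icp with your own cluster_CA_domain; websphere-liberty is the example image used above):

# Pull the public image, retag it for the private registry, and push it
docker pull websphere-liberty:latest
docker login mycluster.icp:8500      # authenticate with your IBM Cloud Private credentials
docker tag websphere-liberty:latest mycluster.icp:8500/default/websphere-liberty:latest
docker push mycluster.icp:8500/default/websphere-liberty:latest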
4. Back up the original certificate directory or file. The following example (Example 2-16)
backs up the router directory, which contains icp-router.crt and icp-router.key.
mv router router.bak
5. Replace router/icp-router.key and router/icp-router.crt with your own key pair and
certificate that you generated in step 1. See Example 2-17.
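Example 2-17 is not reproduced here; a sketch of the replacement, assuming your generated pair uses the same icp-router names (the source paths are illustrative):

# Recreate the router directory and copy in your own certificate and key
mkdir router
cp /path/to/your-cert.crt router/icp-router.crt   # your own certificate
cp /path/to/your-cert.key router/icp-router.key   # the matching private key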
This chapter describes the various options available to Cluster Administrators when thinking
about backing up an IBM Cloud Private cluster, ranging from each individual core component
to the infrastructure itself. It will discuss the various alternatives and considerations to a
backup strategy, and provide several approaches to backing up and restoring the
configuration to the same or a different instance of IBM Cloud Private 3.1.2.
For IBM Cloud Private, there are three common options for defining a backup strategy:
Back up the infrastructure
Back up the core components
Back up using a preferred third-party tool
Due to the vast range of tools available, backing up using third party software is not covered
in detail in this book. Instead, this chapter will focus on how to effectively backup and restore
the infrastructure and the core components.
In any case, regular backups should be taken so that you’re able to restore to a recent point in
time. This point in time entirely depends on your own policies, but backups are recommended
to be taken immediately after installation and prior to any major changes or upgrades, where
the risk of a cluster failure is higher than in normal day-to-day business operations.
Taking a full infrastructure backup can be quite storage intensive, so this chapter provides
a variety of methods to back up the core IBM Cloud Private components in a flexible manner,
using both manual and automated methods, to ensure that the most efficient process is put in
place on a regular basis without worrying about the manual effort involved.
The main purpose of a backup is to be able to restore from it, so in addition to taking backups,
the relevant processes should be in place to test that the backups are actually good backups.
This helps to avoid nasty surprises in the event something does go wrong, and provides
peace of mind knowing that you can recover from failures.
HA is the ability to withstand failures, by eliminating any single points of failure from the whole
system. This typically entails having two or more instances of any component so that the
system is able to continue providing services by switching to instance 2, when instance 1 fails.
CA is the ability to withstand both unplanned and planned downtime for any given component,
providing greater flexibility for maintenance periods, or general component failures. DR on the
other hand, is the capability to recover from a failure.
These methods can be combined with the backup strategy to ensure you are backing up the
platform effectively to allow you to recover from any given failure. When doing the system
architecture it is important to consider the backup strategy and also if there is a need for high
availability (HA) or disaster recovery (DR). The following specifications must be taken into
account: Recovery Point Objective (RPO) and Recovery Time Objective (RTO).
The backup strategy ensures that the system can be restored to the point in time when the
backup was taken. In the case of a disaster, the data created between the time that the backup
was taken and the moment of the failure will, in most cases, be lost. This should be
considered when creating the backup policy.
The high availability environment ensures that the servers remain responsive even when
one or more nodes fail, because the other servers remain online and perform the tasks. When
thinking about HA, it is frequently referred to as local system redundancy, and each HA
environment should be able to handle the full load. The main difference between HA and
backup is that if data gets corrupted on one of the nodes, the corruption is propagated to the
other nodes; both nodes then have the failure, and a backup is needed to resume
normal operation.
The disaster recovery (DR) environment is a copy of the production (or running) environment.
In the case of a failure, this environment takes over the requests and all operations; when the
main system is ready to be put back online, the data should be replicated from the DR side
back to the main system.
At the time of writing, there are six different types of IBM Cloud Private nodes: boot, master,
proxy, management, Vulnerability Advisor, and worker nodes. Each node plays a specific role
in the cluster, and each has a different impact on the backup and restore strategy.
Boot nodes: A boot (or bootstrap) node is used for running installation, configuration, node
scaling, and cluster updates. Only one boot node is required for any cluster, and a single
boot node can cater for multiple installations of IBM Cloud Private. This node stores the
cluster configuration data, so it is important to at least back up the file system so that the
original configuration and cluster certificates can be reused if necessary. In small
clusters, boot nodes are typically combined with the master nodes.
etcd
etcd is a distributed key-value store that maintains all of the configuration data for an IBM
Cloud Private cluster. Without a fully functioning and healthy etcd, the cluster will be
inoperable. etcd is a core component of IBM Cloud Private that should be backed up at
regular intervals so that a cluster can be restored to the most recent state in the event of a
failure.
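As an illustration, an etcd snapshot can be taken with etcdctl. This is a minimal sketch, assuming the etcd API version 3 CLI and the certificate paths and client port commonly used on an IBM Cloud Private master node (verify the endpoint and paths against your own cluster):

# Save a point-in-time snapshot of the etcd keyspace
export ETCDCTL_API=3
etcdctl snapshot save /backup/etcd-snapshot.db \
  --endpoints=https://<master_ip>:4001 \
  --cacert=/etc/cfc/conf/etcd/ca.pem \
  --cert=/etc/cfc/conf/etcd/client.pem \
  --key=/etc/cfc/conf/etcd/client-key.pem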
MongoDB
MongoDB is a database used in IBM Cloud Private to store data for the metering service,
Helm repository, Helm API server, LDAP configuration and team/user/role information. This is
a core component of IBM Cloud Private that should be backed up at regular intervals so that a
cluster can be restored to the most recent state in the event of a failure.
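For example, a dump can be taken from inside one of the MongoDB pods. This sketch assumes the icp-mongodb-0 pod name and the TLS file locations used by the IBM Cloud Private MongoDB stateful set, with credentials taken from the platform secret; all of these should be verified against your cluster:

# Run mongodump inside the MongoDB pod (pod name, paths, and options are assumptions)
kubectl exec -n kube-system icp-mongodb-0 -- sh -c \
  'mongodump --username admin --password "$ADMIN_PASSWORD" \
     --authenticationDatabase admin \
     --ssl --sslCAFile /ca/tls.crt --sslPEMKeyFile /work-dir/mongo.pem \
     --out /tmp/mongodump'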
MariaDB
MariaDB is a database used to store OpenIDConnect (OIDC) data used for the authentication
to IBM Cloud Private. By default IBM Cloud Private refreshes the user access tokens every 12
hours, so the data in MariaDB is transient and therefore an optional component to back up.
It’s worth noting that whilst the content of the MariaDB database is not essential, it is worth
backing up at least once to store the database structure, in the event that it becomes corrupt
and the authentication modules in IBM Cloud Private cannot authenticate users.
Helm repository
The Helm repository stores all the Helm charts uploaded to the IBM Cloud Private catalog.
This should be backed up to ensure the same Helm charts are available for deployment from
the catalog.
Monitoring data
In IBM Cloud Private, the monitoring stack includes Alert Manager, Prometheus, and
Grafana. These components provide the capabilities to expose metrics about the whole
cluster and the applications running within it, trigger alerts based on data analysis and
provide a graphical view of the collected metrics to the users. The monitoring stack is
deployed, by default, with no persistent storage and therefore does not require backing up.
After installation, if persistent storage is added to each component then backing up the
Persistent Volume data is necessary to ensure any changes made can be restored.
Vulnerability Advisor
Vulnerability Advisor (VA) is an optional management service that actively scans the images
used in an IBM Cloud Private cluster and identifies security risks in these images. The
reporting data for VA is stored in Kafka and Elasticsearch. Therefore, to enable a time-based
analysis of the cluster image security, the VA components should be backed up.
This means that the logging data also needs to be backed up to ensure VA data is available
upon restoring VA functionality. The scans and reports are generated based on the current
environment, so restoring this data to a new cluster means that the restored data is no longer
meaningful, and depending on the requirements for VA reporting, it may not be necessary to
restore the backup data at all.
Mutation Advisor
Mutation Advisor (MA) is an optional management service (installed with Vulnerability
Advisor) that actively scans running containers for any changes in system files, configuration
files, content files, or OS process within the container. Time-based reports are generated and
stored in the same way as Vulnerability Advisor, and therefore the same conditions apply.
The general strategy for successful IBM Cloud Private backup cycles is shown in Figure 3-2.
It’s a typical iterative approach during operations where periodic backups are taken at either
regular intervals (for example daily) or before maintenance.
Figure 3-2 General strategy for IBM Cloud Private backup cycles
The backup process needs to be carefully planned to minimise disruption to end users. The
following sections will explain the use of several options for backing up and restoring both
infrastructure and platform components to allow cluster administrators to apply the most
efficient process for their environment.
More information about taking nodes offline can be found in the IBM Knowledge Center:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/manage_cluster/node_maintenance.html
This recovery effort is normal behavior for a running system, but is not considered an
organized state that provides a stable backup of the cluster. Therefore, stopping the master
nodes first will provide the most stable backup to recover from.
The most stable approach when taking cold backups of an IBM Cloud Private cluster is to shut down all master nodes (and etcd nodes, if they are separate from the masters) at the same time, take a backup, then start the nodes up again. This ensures that all the core components are backed up in exactly the same state. However, the drawback to this approach is that the cluster is temporarily offline while backups are taken, so it is not suitable for a production installation beyond the point of initial installation.
The recommended time to use this approach is immediately after installation with a stable
cluster, so that it’s possible to quickly revert back to a fresh installation if required. Generally,
backups should be taken of all nodes after installation but proxy and worker nodes are easily
replaced and only need backups if the workload data resides on the node itself. Management
and Vulnerability Advisor nodes are also easily replaced, but many of the management
services write data directly to the host filesystem, so at a minimum the management node
filesystems should have a backup.
However, there is a caveat to this process; if too much time has passed in between taking a
cold backup of each node, the risk of the etcd database becoming too inconsistent is
increased. Furthermore, in a very active cluster, there may have been too many write
operations performed on a particular etcd instance, which means etcd will not be able to
properly recover.
The cluster administrator needs to consider the use of an active-active style architecture, as
discussed earlier, if they require as close to zero-downtime as possible for running
applications that rely on the cluster (or management) services on IBM Cloud Private (such as
the api-server, or Istio). In most cases, workloads running on worker nodes (using the IBM
Cloud Private proxy nodes for access to the applications) are not affected by master node
downtime unless there is an event on the application host node that requires attention from
the core IBM Cloud Private components whilst the master nodes are offline.
Based on 1.3, “IBM Cloud Private architecture” on page 10, the following nodes contain core components that persist data and should be backed up regularly:
etcd/master
management (optional but recommended)
Vulnerability Advisor (optional but recommended)
The other nodes not mentioned in the list above (proxy and worker nodes) are generally replaceable without consequence to the health of the cluster. Information on replacing various IBM Cloud Private nodes can be found in the IBM Knowledge Center documentation:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/installing/add_node.html
There are several other crucial components that also require quorum, such as MongoDB and MariaDB, but as they are Kubernetes resources it is always possible to recover them (assuming the host filesystem is intact). If etcd permanently fails, so does the whole cluster, so the main focus in this section is retaining a healthy etcd cluster. The key thing to understand when restoring infrastructure backups is that etcd will rebuild its cluster state using the latest available data on the current “leader”, and the elected leader will have the highest raft index.
When the etcd/master nodes are brought back online, the etcd instances will continue to operate where they left off at the point in time when the backup was taken, requesting new elections and updating the current raft term. If any other etcd member making requests to the cluster (becoming a ‘follower’) has a raft index that the leader does not recognise (for example, a different copy of the data) and an inconsistent raft term, then it will no longer be a member of the cluster.
The issue here is that each backup for the master nodes is taken at a completely different time, with master 3 being the last. This means that master 3 will have different data when restored, and when brought online last it will attempt to communicate with the existing cluster. Therefore, it's essential to ensure that, in the event of a cluster failure where all 3 masters need to be restored from a backup, the order of restoration is the reverse of the order of backup. This process applies to restoring both hot and cold backups.
Using the previous example of an HA cluster, master nodes 1, 2, and 3 were backed up using VMware snapshots while the machines were offline (cold snapshot), in sequence starting from master 1. Nginx workloads were deployed to simulate user deployments so that the cluster state in etcd was changed. To simulate a disaster, masters 1 and 2 were destroyed. At this point etcd is running as a single-node cluster in read-only mode and cannot accept any write requests from Kubernetes.
Restore all 3 masters to a previous snapshot, power on the nodes and allow some time for all the containers to start. To ensure the etcd cluster has started successfully and the cluster is healthy, etcd provides a useful command line utility to query the cluster status, which can be downloaded to your local machine or run from the etcd container itself. To get the cluster health status, complete the following steps from the IBM Cloud Private boot node (or whatever system has cluster access with kubectl):
1. Run kubectl -n kube-system get pods | grep k8s-etcd to retrieve the etcd pod name (the format is k8s-etcd-<node-ip>):
[root@icp-ha-boot ~]# kubectl -n kube-system get pods -o name| grep k8s-etcd
k8s-etcd-172.24.19.201
k8s-etcd-172.24.19.202
k8s-etcd-172.24.19.203
2. Use any one of the pod names returned to execute an etcdctl cluster-health
command, using one of the master node IP addresses in the endpoint parameter:
[root@icp-ha-boot ~]# kubectl -n kube-system exec k8s-etcd-172.24.19.201 -- sh
-c "export ETCDCTL_API=2; etcdctl --cert-file /etc/cfc/conf/etcd/client.pem
--key-file /etc/cfc/conf/etcd/client-key.pem --ca-file
/etc/cfc/conf/etcd/ca.pem --endpoints https://172.24.19.201:4001
cluster-health"
This outputs the cluster state for all etcd endpoints, and will either state cluster is
healthy, or cluster is unhealthy as a final status. If the output is cluster is healthy,
then restoring etcd was successful. If the output is cluster is unhealthy, then further
investigation is needed.
It’s also important to check the other core components, such as MongoDB and MariaDB, to ensure they are running correctly. See the troubleshooting section for more information about verifying a successful IBM Cloud Private cluster state.
Further testing
This section provides some real-world testing of hot and cold backups where a cluster is under heavier stress from continuous workload deployment. The testing in this section was conducted in a VMware lab environment using VMware snapshot capabilities and does not represent a real customer deployment; it is only intended for simulation purposes.
To test the backup process resiliency for master nodes, the following test plan was used, as
shown in Figure 3-3.
Prior to taking backups, the etcd cluster was queried to retrieve information about the current database size, current raft term, and current raft index. The output of an etcdctl endpoint status command is shown below.
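The query itself can be issued in the same way as the cluster-health check shown earlier; a minimal sketch, assuming the same pod name, certificate paths, and endpoints:
kubectl -n kube-system exec k8s-etcd-172.24.19.201 -- sh -c \
  "export ETCDCTL_API=3; etcdctl endpoint status -w table \
  --cert /etc/cfc/conf/etcd/client.pem \
  --key /etc/cfc/conf/etcd/client-key.pem \
  --cacert /etc/cfc/conf/etcd/ca.pem \
  --endpoints https://172.24.19.201:4001,https://172.24.19.202:4001,https://172.24.19.203:4001"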
After 30 seconds, the same query was run to analyze the changes in data.
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://172.24.19.201:4001 | 8a2d3ec6df19666f | 3.2.24 | 30 MB | false | 1617 | 1879291 |
| https://172.24.19.202:4001 | ae708d12aa012fdc | 3.2.24 | 31 MB | true | 1617 | 1879292 |
| https://172.24.19.203:4001 | dd2afb46d331fdd2 | 3.2.24 | 31 MB | false | 1617 | 1879293 |
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
The leader and raft term are still the same, so there are no issues around frequent elections, but the raft index has changed, which represents changes in the data.
Master 1 was taken offline, snapshotted, then left offline for some time to simulate maintenance. After 10 minutes, master 1 was brought back online, and the process was repeated for masters 2 and 3. Once all masters were available and running, the etcd status showed the following.
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://172.24.19.201:4001 | 8a2d3ec6df19666f | 3.2.24 | 43 MB | true | 1619 | 1931389 |
| https://172.24.19.202:4001 | ae708d12aa012fdc | 3.2.24 | 43 MB | false | 1619 | 1931391 |
| https://172.24.19.203:4001 | dd2afb46d331fdd2 | 3.2.24 | 43 MB | false | 1619 | 1931395 |
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
This maintenance simulation process took place over 47 minutes, providing a realistic test for
restoring the snapshots to a point in time before maintenance took place on the masters.
Prior to restoring the nodes, the endpoint status resembles the following.
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://172.24.19.201:4001 | 8a2d3ec6df19666f | 3.2.24 | 46 MB | true | 1619 | 1945597 |
| https://172.24.19.202:4001 | ae708d12aa012fdc | 3.2.24 | 46 MB | false | 1619 | 1945597 |
| https://172.24.19.203:4001 | dd2afb46d331fdd2 | 3.2.24 | 46 MB | false | 1619 | 1945597 |
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
To verify that restoring the masters in the original sequential order yields an inconsistent
cluster, the masters were brought online starting from master 1. The endpoint status shows
the following.
Failed to get the status of endpoint https://172.24.19.203:4001 (context deadline
exceeded)
This test was repeated 5 times, of which every test produced consistent results; the third
master could not join the cluster.
Upon restoring the nodes in reverse order (from master 3 to 1), the endpoint status resembles
the following.
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
| ENDPOINT | ID | VERSION | DB SIZE | IS LEADER | RAFT TERM | RAFT INDEX |
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
| https://172.24.19.201:4001 | 8a2d3ec6df19666f | 3.2.24 | 43 MB | false | 1661 | 1928697 |
| https://172.24.19.202:4001 | ae708d12aa012fdc | 3.2.24 | 43 MB | false | 1661 | 1928697 |
| https://172.24.19.203:4001 | dd2afb46d331fdd2 | 3.2.24 | 43 MB | true | 1661 | 1928697 |
+----------------------------+------------------+---------+---------+-----------+-----------+------------+
This test was repeated 5 times, of which every test produced consistent results; the etcd
cluster recovered successfully.
Other combinations were also tested. The below tables show the results per varying
combinations of restoration order for both hot and cold snapshots.
Figure 3-4 Tables showing simple test results for hot and cold restores
The results above reiterate the need to restore cold backups in reverse order. Restoring backups in any other order yielded unpredictable results, and in some cases an unstable cluster. Hot backups were much more effective at restoring the original cluster state, and did so more quickly (less time to start up services, as memory was preserved). Hot snapshots were taken within a much closer time frame, which means etcd had a very high chance of recovering from minor inconsistencies between instances.
In most cases, hot snapshots are not used as the sole backup method and should be used appropriately, as documented by the platform provider. However, based on the above results, they can be a reliable way to quickly recover from fatal changes to the cluster.
It’s advised that the core components are backed up in the following order:
1. etcd
2. MongoDB
3. MariaDB (Optional)
This is not a strict backup order, just an advised order. MongoDB persists Team and User data, and etcd persists the relevant Role-Based Access Control (RBAC) Kubernetes resources associated with those Teams and Users. If you were to back up etcd first, add a new Team with Users, then back up MongoDB, you would end up with out-of-sync data, as etcd would not contain the data to restore the RBAC Kubernetes resources for the new Team. Alternatively, if you back up MongoDB first, add a new Team and Users, then back up etcd, data would still be out of sync, as MongoDB would not contain the Team and User data upon restore. Based on this, the backup order is entirely the choice of the cluster administrator, but following the approach in the infrastructure backup sections, this chapter takes a backup of etcd first and everything else after. Ideally, both etcd and MongoDB should be backed up within a close time frame, at a period of time where cluster activity is low.
The non-core components can be backed up in any order. When restoring the Vulnerability
Advisor, it’s recommended to restore full functionality to the Elasticsearch stack first, as
Vulnerability Advisor and Mutation Advisor both rely on a healthy Elasticsearch to store and
retrieve data.
For all of the components that require a PersistentVolume (PV) and PersistentVolumeClaim
(PVC), the same PV for an NFS server is used. If the backups should be segregated on
different volumes, create additional PVs and PVCs for each component.
The use of accessModes: ReadWriteMany here allows multiple containers to write to the
volume simultaneously. Some storage types do not provide this capability and in such
scenarios, one PV and PVC should be created per component. The storage capacity of
200Gi is used as an example and the real-world value depends on the size of the backups,
and the duration the backup data is kept (which typically depends on internal customer
requirements/policies). As this is reused among all components that require a PV and PVC,
the directory structure is the following:
backup
├── etcd
├── logging
├── mariadb
└── mongodb
The volumes will be mounted to the appropriate directories on the NFS server.
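A minimal sketch of such a PV and PVC pair follows, assuming an external NFS server exporting a /backup directory; the core-backup name matches the claim referenced later in this chapter, while the server address is a placeholder:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: core-backup
spec:
  capacity:
    storage: 200Gi
  accessModes:
  - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: <nfs-server-ip>
    path: /backup
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: core-backup
  namespace: kube-system
spec:
  # an empty storageClassName forces static binding to a matching PV
  storageClassName: ""
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 200Gi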
The easiest way to determine how much storage etcd and MongoDB require is to manually
run a backup of etcd and MongoDB using kubectl and check the backup size. From this, the
storage capacity required for the PVs can be calculated from the CronJob schedule. “Manual
etcd and MongoDB backups” on page 89 provides more information on manual backups.
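For example, a one-off MongoDB dump can be created and measured from inside the icp-mongodb-0 pod. This is a sketch only; the admin password (and any TLS options your environment enforces) come from your cluster configuration:
kubectl -n kube-system exec icp-mongodb-0 -c icp-mongodb -- sh -c \
  'mongodump --host icp-mongodb-0.icp-mongodb.kube-system.svc.cluster.local:27017 \
  --username admin --password <admin-password> --authenticationDatabase admin \
  --gzip --archive=/work-dir/backup.gz'
# check the resulting size to estimate the PV capacity
kubectl -n kube-system exec icp-mongodb-0 -c icp-mongodb -- ls -lh /work-dir/backup.gz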
6. All generated jobs (at the defined schedule) should be visible in the Workloads → Jobs → BatchJobs tab (Figure 3-6).
7. If a job is completed successfully, it will show a 1 in the Successful column. If the job was
not successful, troubleshoot it using the guidance provided in the troubleshooting chapter.
Verify that the backups exist on the storage provider (external NFS server in this example)
as demonstrated in Example 3-5.
This output also provides a baseline to calculate the necessary storage requirements. For example, if etcd data should be kept for 30 days and each backup is about ~30 MB, the total needed for etcd backups is 900 MB. Similarly for MongoDB, ~30 MB is required for 30 days' worth of backup data. Naturally, the backup size is likely to grow as the cluster is used over time, so ensure the backup sizes are monitored periodically and adjust the PV size as required.
The ETCD-ENDPOINT environment variable for the etcd container should be changed to
reflect an actual etcd IP address. Modify the containers section in Example 3-7 and
replace #ETCD-ENDPOINT with an appropriate etcd IP address.
Backing up MariaDB
MariaDB in IBM Cloud Private is only used for storing the transient OpenIDConnect data. The
default refresh period for user tokens is 12 hours. If this data is lost between restores, the data
can simply be regenerated by logging in to the platform. If there is a use case to retain the
MariaDB data, you can back up the database using kubectl or a Kubernetes job.
Run the job in the same way as the etcd and MongoDB jobs.
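For the kubectl route, a minimal sketch of taking a dump directly from the MariaDB pod follows; the pod name mariadb-0 and the root credentials are assumptions to adapt to your environment:
kubectl -n kube-system exec mariadb-0 -- sh -c \
  'mysqldump --user=root --password=<root-password> --all-databases' > mariadb-backup.sql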
When later restoring a cluster, the system images will already exist after installation, so it
does not always make sense to back up the whole /var/lib/registry directory unless it is
easier to do so. It’s recommended to also store the application images in a location external to
the cluster, so that each individual image can be pushed to the image repository if needed.
To back up an individual image, you can use commands such as docker save -o
<backup-name>.tar <repo>/<namespace>/<image-name>:<tag>
Only one helm-repo pod runs at any one time and is scheduled to a master node, using a LocalVolume PersistentVolume. This means that the most up-to-date chart information from MongoDB is stored on the master node hosting the helm-repo pod, so chart backups should be retrieved from that node.
Backing up the contents of the /var/lib/icp/helmrepo directory on the master node hosting
the helm-repo pod is sufficient. Example 3-9 shows how to retrieve the host node name or IP
address using kubectl.
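A minimal sketch of the idea, assuming SSH access to the master nodes; use the pod listing to identify the hosting node, then archive the chart directory:
kubectl -n kube-system get pods -o wide | grep helm-repo
ssh <hosting-master-node> "tar czf /tmp/helmrepo-backup.tar.gz -C /var/lib/icp/helmrepo ."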
General considerations
By default, IBM Cloud Private configures the ELK stack to retain log data using the logstash
index for one day, and in this configuration, it is worth considering whether or not the platform
log data is actually meaningful enough to keep. In most cases, the platform log data is
transient, especially if the default curation is kept and logs are removed after 24 hours.
Backing up log data is only really valuable if the default 24 hours is extended to a longer
duration and where the platform ELK stack is the primary source of application log data. The
Logging and Monitoring Chapter provides information on the various use cases for platform
and application logging and the alternative approaches to consider, such as deploying a
dedicated ELK for applications.
The recommended way to back up the logging file system is to ensure Elasticsearch is not running on the management node by stopping all containers running on it. Use the method described in “Stopping an IBM Cloud Private node” on page 78 to stop kubelet and docker; then the /var/lib/icp/logging/elk-data/nodes/0/indices directory can be safely copied. During this time, no log data generated by the platform or applications will be persisted.
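With kubelet and docker stopped on the management node, the indices can be archived with standard tools; a minimal sketch:
tar czf /tmp/elk-indices-backup.tar.gz \
  -C /var/lib/icp/logging/elk-data/nodes/0 indices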
In environments with more than one management node, multiple Elasticsearch data pods are
deployed (one on each management node), so it is possible to keep the logging services
running by repeating the above approach on each management node one by one. During this
time, Elasticsearch will persist the data only to the available data pods, which means that
there will be an imbalance of spread of replicas per shard on each management node.
Elasticsearch will attempt to correct this as data pods are brought back online so higher CPU
utilization and disk I/O is normal in this situation. More information about how replica shards
are stored and promoted when data pods are taken offline can be found here:
https://www.elastic.co/guide/en/elasticsearch/guide/current/replica-shards.html
To export an AlertRule to a file, use the command kubectl -n <namespace> get alertrule <name> -o yaml > <name>.yaml, replacing <namespace> with the hosting namespace and <name> with the AlertRule name.
Important: These steps will force the containers to be restarted in order to mount the new
volume, so any existing data stored in the containers will no longer be available. AlertRules
and Dashboards are stored as Kubernetes resources and can be exported, but current
monitoring data in Prometheus or additional configurations from modifying ConfigMaps
directly may be lost. Be sure to export your additional configurations prior to executing
these steps.
1. If you do not have dynamic storage provisioning, create a separate PV and PVC for the Alert Manager (1 Gb), Prometheus (10 Gb) and Grafana (10 Gb). Examples of PVs and PVCs can be found throughout this chapter and in the Kubernetes documentation at https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistent-volumes.
2. Retrieve the current monitoring Helm chart and the release values, replacing
mycluster.icp with a value suitable for your environment:
wget --no-check-certificate
https://mycluster.icp:8443/mgmt-repo/requiredAssets/ibm-icpmonitoring-1.4.0.tgz
alertmanager:
  persistentVolume:
    enabled: true
    useDynamicProvisioning: false
    size: 1Gi
    storageClass: ""
    existingClaimName: "monitoring-alertmanager"
    selector:
      label: ""
      value: ""
grafana:
  persistentVolume:
    enabled: true
    useDynamicProvisioning: false
    size: 1Gi
    storageClass: ""
    existingClaimName: "monitoring-grafana"
    selector:
      label: ""
      value: ""
The data stored by the monitoring stack is portable and can be reused in different instances, so use the available storage backup tools to back up the PV data. If the monitoring data may be restored on a different cluster, you need to ensure that the configuration is the same (for example, the Grafana ConfigMap is the same) for the data to be reused in a new deployment of Alert Manager, Prometheus, or Grafana.
Using your preferred methods, copy the PV data to a location outside of the IBM Cloud
Private cluster.
VA and MA store all their data in two places: on the host's filesystem and in Elasticsearch. The data stored by VA and MA is not tied to a specific host, so data can be transported between installations of IBM Cloud Private. The VA and MA filesystems that require backup are as follows:
/var/lib/icp/va/minio
/var/lib/icp/va/zookeeper
/var/lib/icp/va/kafka
Copy this data to a location outside of the IBM Cloud Private cluster.
The most important data to be backed up is the cfc-certs directory, as it contains the
certificates and keys generated for the existing cluster that are used for the whole platform.
The config.yaml, hosts and ssh_key files can be easily replaced.
Copy this data to a location outside of the IBM Cloud Private cluster.
The platform components mostly store data on the host node, simplifying the backup procedures by allowing the use of common tools to back up the data. For applications that rely on PersistentVolumes, it is the cluster administrator's responsibility to ensure that they are familiar with the storage technologies they offer to application developers.
The use of a storage provider running as containers on IBM Cloud Private, such as GlusterFS
and Ceph, can complicate the backup process as these technologies typically require the
write operations to be suspended while the volume/block is backed up using the native
snapshot tools. If the cluster fails and you can no longer interact with the storage provider,
there is a higher risk of data loss unless the application can handle such scenarios gracefully.
In the example commands provided, the same backup files created in the “Platform backup
process” section will be used. It’s important to take a full infrastructure backup of the new
environment before attempting to restore the cluster. It’s also recommended to use the steps
in “Backup etcd using kubectl” on page 89 to create an etcd backup of the current
environment, in case the restore process fails and you end up in a situation where some
nodes are restored and others are not. There are several steps that involve replacing core
data files and there is a high margin for error.
The local cluster restore steps in this section were performed in a lab environment with LDAP
configured and light workloads, tested on the following cluster configurations:
Single master, management, proxy, Vulnerability Advisor and 3 x worker nodes
3 x masters, 2 x management, 2 x proxies, 3 x Vulnerability Advisor and 3 x worker nodes
3 x etcd, 3 x masters, 2 x management, 2 x proxies, 3 x Vulnerability Advisor and 3 x
worker nodes
The remote cluster restore steps in this section were performed in a lab environment with
LDAP configured and light workloads, tested on the following cluster configurations: Single
master, management, proxy, Vulnerability Advisor, and 3 x worker nodes.
The other components can be restored in any order, although it is recommended that the
logging data is restored before the Vulnerability Advisor. The Persistent Volume Data is
restored last so that when a cluster is restored, the backed up data is not overwritten with new
cluster data, which would later corrupt the restored applications.
Before restoring the cluster, you’ll need to extract all of the platform ConfigMaps and Secrets
immediately post-installation, as these will contain the certificates and keys specific to this
cluster. When etcd is restored, it will restore the same resources that existed in the backed up
installation, which means the new installation will not function correctly if these are not
replaced with the current ones. To export all of the current ConfigMaps and Secrets to your
local machine, run the script in Example 3-11.
echo "Done."
It’s important to note that unlike the local cluster installation, you do not need to use the same
certificates from the backed up cfc-certs directory.
Restoring MongoDB
The recommended way to restore MongoDB from a backup is by using the mongorestore utility. This section will cover two methods: using kubectl, and using a Kubernetes job. The choice of method depends entirely on the method used to back up MongoDB. For example, if you used kubectl to back up the database, it makes sense to use kubectl to restore it. If a CronJob or Job was used, then the core-backup PersistentVolume created in the backup steps can be used to remove the manual effort of copying the file to the local machine.
It’s worth checking the current database statistics as a reference point, to know whether the
mongorestore command has successfully repopulated the database.
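A sketch of one way to capture those statistics, assuming the same connection details used by the restore command below (TLS options may also be required in your environment):
kubectl -n kube-system exec icp-mongodb-0 -c icp-mongodb -- sh -c \
  'mongo --host icp-mongodb-0.icp-mongodb.kube-system.svc.cluster.local:27017 \
  --username admin --password <admin-password> --authenticationDatabase admin \
  --eval "db.adminCommand({ listDatabases: 1 })"'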
1. Use kubectl exec to create a shell session in the icp-mongodb-0 pod.
[root@icp-ha-boot cluster]# kubectl -n kube-system exec -it icp-mongodb-0 -c
icp-mongodb sh
$
2. Copy the backup file to the pod from the previous step.
[root@icp-ha-boot ~]# kubectl cp backup.gz
"kube-system/icp-mongodb-0:/work-dir/backup.gz" -c icp-mongodb
3. Verify the file was copied successfully.
[root@icp-ha-boot ~]# kubectl -n kube-system exec icp-mongodb-0 -c icp-mongodb
ls /work-dir
backup.gz
credentials.txt
log.txt
mongo.pem
openssl.cnf
peer-finder
4. Execute the restore.
[root@icp-ha-boot ~]# kubectl exec icp-mongodb-0 -n kube-system -- sh -c
'mongorestore --host
icp-mongodb-0.icp-mongodb.kube-system.svc.cluster.local:27017 --username admin
The BACKUP_NAME environment variable should be changed to reflect the name of an actual
MongoDB backup file that exists on the PV. If the MongoDB backup job or CronJob in this
chapter was used to create the backup, the file name format is
The ibmcom/icp-mongodb:4.0.5 image is the default MongoDB image for IBM Cloud Private
3.1.2. If the environment does not have internet access, replace ibmcom with the private image
registry local to the cluster, for example, mycluster.icp:8500/ibmcom/icp-mongodb:4.0.5.
Create the job using kubectl create -f mongodb-restore_job.yaml. The job should be visible in the Workloads → Jobs → BatchJobs tab in IBM Cloud Private. If the job did not complete successfully, troubleshoot the failure using the guidance provided in the troubleshooting chapter. If the job ran successfully, use the same method of verification using show dbs as described earlier. At this point, MongoDB should be restored.
After filesystem data is restored, use kubectl to delete the image manager pods running on
all master nodes:
[root@icp-ha-boot ~]# kubectl delete pods -l app=image-manager
pod "image-manager-0" deleted
pod "image-manager-1" deleted
pod "image-manager-2" deleted
When the image-manager pods are running again, the restored images should now be available. If you backed up images using docker save, you can reload them using docker load. For example, use the following command:
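docker load -i <backup-name>.tar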
PersistentVolumes
Due to the large number of storage providers available for IBM Cloud Private, restoring the data to the PersistentVolumes is not covered in this chapter. When restoring etcd, the original PersistentVolume and PersistentVolumeClaim names will be restored, so at this point the original data should be restored as per the storage provider's recommendations.
The restored volumes will use the same configuration as the backed up cluster, so it is a good idea to restore this data before restoring etcd, so that there is minimal disruption to the applications that start up expecting data to be present. For hostPath or LocalVolume type volumes, and any external storage volumes where the data points on the host/remote storage have not moved, the data restoration should be straightforward.
If you’re restoring a cluster that previously used a containerized storage provider such as GlusterFS or Ceph, you will need to ensure that any LVM groups/volumes and disk paths used in the storage configuration are identical to the backed up cluster. Restoring a cluster with containerized storage providers was not tested in this chapter, and may yield unpredictable results.
Most steps are run on the etcd/master nodes but the management nodes also require some
actions.
# Print result
[ $? -eq 0 ] && echo "etcd restore successful" || echo "etcd restore failed"
Alternatively, download and run the etcd-restore.sh script from the following link:
https://github.com/IBMRedbooks/SG248440-IBM-Cloud-Private-System-Administrator-s-Guide/tree/master/Ch3-Backup-and-Restore/Restore/etcd-restore.sh
If successful, the output should be etcd restore successful.
Tip: There are other ways to read the correct variable values from the etcd.json file by
using tools such as jq, but the method used here is designed to be as generic as possible.
If this script fails with an error similar to Unable to find image, the etcd.json file may not
have been parsed correctly, potentially due to double hyphens or similar characters in the
cluster name. Set the etcd_image environment variable manually by getting the docker
image name using docker images | grep etcd and then setting etcd_image using export
etcd_image=<docker-image>.
Note that the ibmcom/etcd:3.2.24 image may need to be replaced with the private image
registry name for the environment. For example,
mycluster.icp:8500/ibmcom/etcd:3.2.24.
10. Clear your web browser cache.
11. The IBM Cloud Private platform may take some time to start up all the pods on the master node and become available for use. Once the dashboard is accessible, verify that the cluster has started recreating the previous state by redeploying workloads from the backed up cluster. You may experience a few pod failures for some time until the system becomes stable again. Logging may be particularly slow as it deals with the sudden influx of log data from newly recreated pods all at once.
If the etcd snapshot restore was successful on some nodes and failed on others, you will
need to either address the issue on the failed nodes and try again, or revert the successful
nodes to the original state by restoring the etcd snapshot taken from the new installation.
On all master nodes (or dedicated etcd nodes, if they are separate):
1. Stop kubelet:
systemctl stop kubelet
2. Restart Docker:
systemctl restart docker
3. Remove existing etcd data:
rm -rf /var/lib/etcd
4. Recreate the etcd directory:
mkdir -p /var/lib/etcd
5. Copy data from backup:
cp -r /var/lib/etcdbackup/* /var/lib/etcd/
cp -r /var/lib/etcdwalbackup/* /var/lib/etcd-wal/
6. Recreate the pods directory:
mkdir -p /etc/cfc/pods
7. Restore pod data:
mv /etc/cfc/podbackup/* /etc/cfc/pods/
8. Remove Kubernetes pod data:
rm -rf /var/lib/kubelet/pods
9. Start kubelet:
systemctl start kubelet
Allow some time for all pods to start up. The cluster should be returned to a normal working
state.
for ns in ${namespaces[@]}
do
  #secret
  for s in $(ls $ns.secret/ | grep -v "\-token-")
  do
    kubectl delete -f $ns.secret/$s && kubectl create -f $ns.secret/$s
  done
  #configmap
  for s in $(ls $ns.configmap/)
  do
    kubectl delete -f $ns.configmap/$s && kubectl create -f $ns.configmap/$s
  done
done
echo "Done."
Alternatively, download and run the script from the following link:
https://github.com/IBMRedbooks/SG248440-IBM-Cloud-Private-System-Administrator-s-Guide/tree/master/Ch3-Backup-and-Restore/Restore/restore-yamls.sh
If Vulnerability Advisor is not installed, ignore the errors.
4. Several files created during the initial cluster installation need to be reapplied:
a. Change the directory to <installation_directory>/cluster/cfc-components/:
cd <installation_directory>/cluster/cfc-components/
b. Apply the bootstrap-secret.yaml file:
kubectl apply -f bootstrap-secret.yaml
c. Apply the tiller.yaml file:
kubectl apply -f tiller.yaml
d. Apply the image-manager.yaml file:
kubectl apply -f image-manager/image-manager.yaml
e. Apply the whole storage directory:
kubectl apply -f storage/
5. Restore the resources that use IP addresses in the YAML configuration. Run the script in
Example 3-17.
# Set files
files=(
servicecatalog.yaml
daemonset.extensions/service-catalog-apiserver.yaml
daemonset.extensions/auth-idp.yaml
daemonset.extensions/calico-node.yaml
Alternatively, download and run the script from the following link:
https://github.com/IBMRedbooks/SG248440-IBM-Cloud-Private-System-Administrator-s-Guide/tree/master/Ch3-Backup-and-Restore/Restore/restore-ip-yamls.sh
6. Restart kubelet and docker on all nodes again. When restarted, pods should start being scheduled across all the cluster nodes.
7. After all nodes are online, all kube-system pods need to be deleted so they inherit the
Secrets and ConfigMaps restored earlier.
kubectl -n kube-system delete pods $(kubectl -n kube-system get pods | awk
'{print $1}')
Check that all the pods start being deleted, using kubectl -n kube-system get pods. Some pods may be stuck in Terminating state, which can prevent the new pods from starting up properly. To delete all stuck pods, use the following command:
kubectl delete pod --force --grace-period=0 -n kube-system $(kubectl get pods -n
kube-system | grep Terminating | awk '{print $1}')
You may also need to delete all pods running in the istio-system and cert-manager
namespaces, so use the same approach as above.
kubectl delete pod -n istio-system $(kubectl get pods -n istio-system | awk
'{print $1}')
kubectl delete pod -n cert-manager $(kubectl get pods -n cert-manager | awk
'{print $1}')
Allow some time for all pods to start up, which may take 10 - 40 minutes, depending on the
size of the cluster. The logging pods might take even longer than this to cater for the influx of
logs generated from the restore. If individual pods are still in error state, review the errors and
attempt to resolve the problems. Issues will vary with each environment, so there is no
common resolution. If the restore was successful, you should be able to access and operate
the cluster again.
Restoring MariaDB
As discussed in “Backing up MariaDB” on page 94, it’s not usually necessary to restore MariaDB data; however, for completeness, the information is provided here. The recommended way to restore MariaDB from a backup is by using the mysql utility. This section will cover two methods: using kubectl, and using a Kubernetes job. The choice of method depends entirely on the method used to back up MariaDB.
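For the kubectl route, a minimal sketch of replaying a dump into the MariaDB pod follows; the pod name and credentials are assumptions to match your environment and backup file:
kubectl -n kube-system exec -i mariadb-0 -- sh -c \
  'mysql --user=root --password=<root-password>' < mariadb-backup.sql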
An IBM Cloud Private cluster needs to be prepared for data persistence, and in this chapter we discuss the options available to the IBM Cloud Private administrator regarding data persistence for running containerized applications.
We assume that the reader is familiar with persistent volume and persistent volume claim terminology; in this chapter we are not going to discuss basic storage concepts, which you can find in the Kubernetes documentation (https://kubernetes.io/docs/concepts/storage/persistent-volumes/).
The following sections discuss the key persistent storage aspects that need to be considered.
For GlusterFS and Ceph you can use disk volume encryption using dm-crypt. For FlexVolume-based drivers that use external storage providers, the encryption level can often be part of the storage class specification (if supported by the back-end hardware). If data isolation is required, you can define multiple dedicated hostgroups and create a separate data cluster that uses one of the supported distributed file systems (in such a case, the storage class used will determine the data placement).
Access mode
Another question is related to concurrency while accessing the data. A Kubernetes PersistentVolumeClaim can specify one of three access modes: ReadWriteOnce (RWO), ReadOnlyMany (ROX), and ReadWriteMany (RWX). If either of the latter two is required, then the underlying storage technology must support concurrent access to the storage volume from multiple worker nodes.
At the time of writing, multi-attach was not supported by storage providers implementing FlexVolumes based on the iSCSI protocol, nor by the VMware vSphere provider. If your application requires the ReadOnlyMany or ReadWriteMany modes, then you should select one of the supported distributed file systems (such as GlusterFS or IBM Spectrum Scale), a container-native storage provider like Portworx, or NFS as your storage technology.
For production clusters it is highly recommended to pick the storage option that supports
dynamic provisioning.
The answers to the above questions also affect another choice: whether to use a storage provider external to the cluster, or an internal one installed directly on IBM Cloud Private. When using a distributed file system installed on IBM Cloud Private nodes, the appropriate configuration of the provider is done automatically during the installation and will be automatically upgraded with future releases of IBM Cloud Private. The drawback of this option is the additional CPU load that the storage containers will introduce to your worker nodes.
For additional backup considerations see “Backing up PersistentVolume data” on page 99.
Figure 4-1 Persistent volumes created by IBM Cloud Private during default installation
If you are installing IBM Cloud Private in a high availability topology with multiple master nodes, certain directories on the master nodes must use shared storage. Those directories are used for storing the platform internal image registry as well as some platform logs.
You can either provide an existing shared storage that uses the NFS protocol, or use any other distributed file system like GlusterFS. The section “Configuring persistent storage for application containers” discusses mounting a volume from an external GlusterFS storage cluster.
Luckily Kubernetes provides an open architecture which allows for multiple storage options to
coexist within the same cluster. Using the storageclass definition users can pick the storage
option that is best for their specific workload.
In this section, we describe how to configure some of the popular storage technologies that
are supported in IBM Cloud Private environments:
In “Configuring vSphere storage provider for IBM Cloud Private” on page 119 we discuss
how to set up and use the VMware storage provider.
In “Configuring NFS Storage for IBM Cloud Private” on page 120 we present a short
cookbook on setting up the NFS storage provider for IBM Cloud Private.
In “Configuring GlusterFS for IBM Cloud Private” on page 125 we present how to
configure a GlusterFS distributed file system.
In “Configuring Ceph and Rook for IBM Cloud Private” on page 131 we show you how to
configure a Ceph RBD cluster.
In “Configuring Portworx in IBM Cloud Private” on page 140 we present how to configure a
Portworx cluster.
Finally, in “Configuring Minio in IBM Cloud Private” on page 147 we present how to configure Minio, a lightweight S3-compatible object storage, in your IBM Cloud Private cluster.
Prerequisites
In order to set up an NFS server, you must meet the following prerequisites:
A Linux machine (in this case, we will use a Red Hat Enterprise Linux system) with sufficient disk space available.
A Yum repository configured to install NFS packages.
The designated NFS server and its clients (in our case, the cluster nodes) should be able to reach each other over a network.
In case your Linux machine has a firewall turned on, open the ports required for NFS.
sudo firewall-cmd --permanent --zone=public --add-service=nfs
sudo firewall-cmd --permanent --zone=public --add-service=mountd
sudo firewall-cmd --permanent --zone=public --add-service=rpc-bind
sudo firewall-cmd --reload
Replace <path_to_share> with a path on the file system where you have sufficient disk space, and <subnet_address> with the subnet of your worker nodes. If you are using a mask other than 255.255.255.0 for your worker nodes, replace the ‘24’ with the correct value. Optionally, instead of a subnet you may specify multiple entries with the IP addresses of worker nodes and a /32 mask, as shown below:
echo '<path_to_share> <node1_IP>/32(no_root_squash,rw,sync)' >> /etc/exports
echo '<path_to_share> <node2_IP>/32(no_root_squash,rw,sync)' >> /etc/exports
...
echo '<path_to_share> <nodeX_IP>/32(no_root_squash,rw,sync)' >> /etc/exports
Tip: Do not put any spaces between the netmask value and the opening bracket ‘(‘; otherwise, you will allow the NFS share to be mounted by any IP address.
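After editing /etc/exports, re-export the file systems so that the new entries take effect; a minimal sketch:
sudo exportfs -ra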
You can verify if the directory was successfully exported by running the exportfs command
without parameters as shown in Example 4-1.
/storage/vol001 10.10.99.0/24
/storage/vol002 10.10.99.0/24
/storage/vol003 10.10.99.0/24
/storage/vol004 10.10.99.0/24
/storage/vol005 10.10.99.0/24
You can verify that the shared directories are visible from the NFS client by running the
showmount command on the worker node as shown in Example 4-2.
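A sketch of the verification and a test mount from a worker node; the server address is a placeholder and the export path matches the earlier examples:
showmount -e <nfs_server_ip>
sudo mount -t nfs <nfs_server_ip>:/storage/vol001 /mnt
sudo umount /mnt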
If the mount works, you are ready to use NFS in your IBM Cloud Private cluster.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-nfs
  namespace: default
  labels:
    app: nginx-nfs
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-nfs
  template:
    metadata:
      labels:
        app: nginx-nfs
    spec:
      volumes:
Once the deployment is ready, get the pod name. Execute a shell session on the pod and
create a test file as shown in Example 4-6.
root@nginx-nfs-5b4d97cb48-2bdt6:/# cd /usr/share/nginx/html/
Note: Dynamic provisioning of NFS persistent volumes is not supported by IBM, so use it
at your own risk.
The nfs-client-provisioner requires an existing NFS share to be precreated, so we will reuse the one defined in the section “Create the directories on the local file system” on page 120.
RESOURCES:
==> v1beta1/PodSecurityPolicy
NAME DATA CAPS SELINUX RUNASUSER FSGROUP SUPGROUP
READONLYROOTFS VOLUMES
nfs-client-provisioner false RunAsAny RunAsAny RunAsAny RunAsAny false
secret,nfs
==> v1/Pod(related)
NAME READY STATUS RESTARTS AGE
nfs-client-provisioner-5f7bf7f77d-7x7cn 0/1 ContainerCreating 0 0s
==> v1/StorageClass
NAME PROVISIONER AGE
nfs-client cluster.local/nfs-client-provisioner 0s
==> v1/ServiceAccount
NAME SECRETS AGE
nfs-client-provisioner 1 0s
==> v1/ClusterRole
NAME AGE
nfs-client-provisioner-runner 0s
==> v1/ClusterRoleBinding
NAME AGE
run-nfs-client-provisioner 0s
As shown in Example 4-8 on page 124, the Helm chart creates a new storage class named
“nfs-client”. You can test the dynamic provisioning of the NFS persistent volumes by creating
a new persistent volume claim, as shown in Example 4-9.
Example 4-9 YAML file for creating new PersistentVolumeClaim using nfs-client storageclass
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-nfs-client
  namespace: default
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: nfs-client
As a result, you should get the new persistent volume claim bound to a dynamically provisioned persistent volume using a subdirectory of your NFS server, as shown in Example 4-10.
When you delete the persistent volume claim, the nfs-client-provisioner will automatically remove the associated persistent volume; however, the subdirectories created on the NFS share will not be deleted, but renamed with the “archived.” prefix. This is the default behavior of the nfs-client-provisioner and can be changed with the storageClass.archiveOnDelete parameter.
What is Heketi?
Heketi is a dynamic provisioner for GlusterFS that exposes a REST API and is capable of creating storage volumes on request. More information on Heketi can be obtained from the project page: https://github.com/heketi/heketi
GlusterFS prerequisites
The following prerequisites have to be met when configuring GlusterFS within an IBM Cloud
Private cluster.
There must be a minimum of 3 worker nodes.
Each worker node should have at least one spare disk volume of at least 25GB.
Each worker node must be connected to a yum repository or have the glusterfs-client
package already installed.
For other operating systems you can find the appropriate commands here:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/manage_cluster/prepare_nodes.html.
The output will be a list of attached disk volumes, along with their details.
In this example, the disk name we are looking for is /dev/sdb. You may see other names such
as /dev/vdb, /dev/sdc as this depends on the hypervisor type and number of disks that are
attached to the virtual machine.
This gives the /dev/disk/by-path symlink, which we will use in this example. Note that there
are other methods available such as /dev/disk/by-id, /dev/disk/by-uuid, or
/dev/disk/by-label, but only by-path has been used in this example.
Note: In some environments, such as IBM Cloud Virtual Servers or SUSE Linux Enterprise Server (SLES), no symlinks are automatically generated for the devices. In such a case, you have to manually create the symlinks. See https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/manage_cluster/prepare_disks.html#manual for the detailed procedure.
Make a note of the symlink and its link path. For the example device sdb, /dev/disk/by-path
is the link path and pci-0000:00:10.0-scsi-0:0:1:0 is the symlink. For each device that you
are using for the GlusterFS configuration, you need to add the line in the config.yaml file. For
the example device sdb, you would add /dev/disk/by-path/pci-0000:00:10.0-scsi-0:0:1:0
in the config.yaml file.
[worker]
...
[proxy]
...
[hostgroup-glusterfs]
<worker_1_ip>
<worker_2_ip>
<worker_3_ip>
Note: In some environments where the node names use hostnames instead of IP addresses, replace the worker node IPs with the appropriate worker node names, as listed in the output of the kubectl get nodes command.
In the default configuration, GlusterFS is disabled in the config.yaml supplied by IBM in the icp-inception image. To enable GlusterFS, in the config.yaml file located in the installation directory on the boot node, find the management_services section and change the value of storage-glusterfs to enabled.
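The relevant fragment of config.yaml then resembles the following sketch (other entries in the management_services section are left unchanged):
management_services:
  storage-glusterfs: enabled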
You will find below the description of the important elements of the configuration that were
marked as bold in Example 4-14 on page 129:
ip is the IP address of the worker node where you want to deploy
GlusterFS (you must add at least three worker nodes).
device is the full path to the symlink of the storage device.
storageClass this section defines if the storageclass should automatically be created
(created: true) and the properties of a storage class for GlusterFS,
such as the name, reclamationPolicy, replicacount, and so forth.
nodeSelector this section refers to the labels that are added to the worker nodes.
Make sure that the key and value are identical to those used in the
step “Labeling worker nodes for GlusterFS” on page 128.
prometheus this section enables automatic collection of the performance metrics related to GlusterFS in the Prometheus database. It is recommended that you turn it on by providing the enabled: true value.
The GlusterFS cluster volumes, as well as the hosts and the config.yaml on the boot node,
are now ready. You can run the installation with the following command:
docker run --rm -t -e LICENSE=accept --net=host -v $(pwd):/installer/cluster \
<icp_inception_image_used_for_installation> addon
The GlusterFS cluster nodes are now ready. You can test the storage provisioning with the
following yaml file:
To test the volume, follow the steps described in “Testing the volume” on page 122 by using
the selector app=nginx-glusterfs and creating a file on one pod and verifying its existence
on the other one.
What is Ceph?
Ceph is open source software designed to provide highly scalable object, block and file-based
storage under a unified system.
Ceph storage clusters are designed to run on commodity hardware, using an algorithm called
CRUSH (Controlled Replication Under Scalable Hashing) to ensure data is evenly distributed
across the cluster and that all cluster nodes can retrieve data quickly without any centralized
bottlenecks. See https://ceph.com/ceph-storage/ for more information.
What is Rook?
Rook is an open source orchestrator for distributed storage systems running in cloud native
environments.
Rook turns distributed storage software into self-managing, self-scaling, and self-healing
storage services. It does this by automating deployment, bootstrapping, configuration,
provisioning, scaling, upgrading, migration, disaster recovery, monitoring, and resource
management. Rook uses the facilities that are provided by the underlying cloud-native
container management, scheduling and orchestration platform to perform its duties.
Rook integrates deeply into cloud native environments leveraging extension points and
providing a seamless experience for scheduling, lifecycle management, resource
management, security, monitoring, and user experience.
Attention: While Ceph provides file based and object storage interfaces, the content
provided by IBM creates a distributed storage cluster for block devices (RADOS Block
Devices or RBD in short). At the time of writing Rook did not support mounting RBD
volumes to multiple nodes at the same time, so this class of storage cannot be used for
RWX and ROX access modes.
This guide assumes you have a cluster with internet access to pull the required Helm
packages and images.
At the time of writing this book, the installation of a Rook Ceph cluster was a three-step
process:
1. Configure role-based access control (RBAC).
2. Install the Rook Ceph Operator Helm chart.
3. Install the Rook Ceph cluster (ibm-rook-rbd-cluster) chart.
Create a PodSecurityPolicy
PodSecurityPolicy, as shown in Example 4-16, is required for the Rook Operator and Rook
Ceph chart to install properly.
Tip: Copying the content from the book PDF may mess up indentation, which results in
parsing errors. Access the examples source code at GitHub:
https://github.com/IBMRedbooks/SG248440-IBM-Cloud-Private-System-Administrator-s-Guide.git.
Create the file rook-priviledged-psp.yaml with the content shown in Example 4-16 and run
the following command:
kubectl create -f rook-priviledged-psp.yaml
Create a ClusterRole
Next, you need to create a ClusterRole that uses the PodSecurityPolicy which was defined
above, as shown in Example 4-17.
Example 4-17 Sample YAML file with ClusterRole definition for Rook
# privilegedPSP grants access to use the privileged PSP.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: privileged-psp-user
rules:
- apiGroups:
  - extensions
  resources:
  - podsecuritypolicies
  resourceNames:
  - rook-privileged
  verbs:
  - use
Create a file rook-priviledged-psp-user.yaml with the content of Example 4-17 and run the
following command:
kubectl create -f rook-priviledged-psp-user.yaml
Create a file rook-clusterrolebinding.yaml with the content of Example 4-18 and run the
following command:
kubectl create -f rook-clusterrolebinding.yaml
Example 4-19 Sample YAML file for creating RBAC for pre-validation checks
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "list"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
You need a ClusterRoleBinding for each namespace in which you install the Rook Ceph
Cluster chart. In our example, we use the same namespace rook we have created in step
“Create a namespace for Rook operator” on page 134. Create a rook-pre-validation.yaml
file with content of the Example 4-19 on page 134 and run the command:
kubectl create -f rook-pre-validation.yaml
Create the rook-values.yaml file with the content shown in Example 4-20.
Example 4-20 The values.yaml file used for Rook Ceph Operator chart installation
image:
  prefix: rook
  repository: rook/ceph
  tag: v0.8.3
  pullPolicy: IfNotPresent
resources:
  limits:
    cpu: 100m
    memory: 128Mi
  requests:
    cpu: 100m
    memory: 128Mi
rbacEnable: true
pspEnable: true
Then, run the following command to install the Rook Ceph Operator chart:
helm install --tls --namespace rook --name rook-ceph rook-beta/rook-ceph \
--version v0.8.3 -f rook-values.yaml
NAME: rook-ceph
LAST DEPLOYED: Fri Feb 22 00:29:10 2019
NAMESPACE: rook
STATUS: DEPLOYED

RESOURCES:
==> v1beta1/PodSecurityPolicy
NAME                   DATA  CAPS  SELINUX   RUNASUSER  FSGROUP   SUPGROUP  READONLYROOTFS  VOLUMES
00-rook-ceph-operator  true  *     RunAsAny  RunAsAny   RunAsAny  RunAsAny  false           *

==> v1beta1/CustomResourceDefinition
NAME                       AGE
clusters.ceph.rook.io      0s
volumes.rook.io            0s
pools.ceph.rook.io         0s
objectstores.ceph.rook.io  0s
filesystems.ceph.rook.io   0s

==> v1beta1/ClusterRole
NAME                       AGE
rook-ceph-system-psp-user  0s
rook-ceph-global           0s
rook-ceph-cluster-mgmt     0s

==> v1beta1/Role
NAME              AGE
rook-ceph-system  0s

==> v1beta1/RoleBinding
NAME              AGE
rook-ceph-system  0s

==> v1/ServiceAccount
NAME              SECRETS  AGE
rook-ceph-system  1        0s

==> v1beta1/ClusterRoleBinding
NAME                        AGE
rook-ceph-global            0s
rook-ceph-system-psp-users  0s

==> v1beta1/Deployment
NAME                DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
rook-ceph-operator  1        1        1           0          0s

==> v1/Pod(related)
NAME                                READY  STATUS             RESTARTS  AGE
rook-ceph-operator-f4cd7f8d5-ks5n5  0/1    ContainerCreating  0         0s
Important: You should use only one of the above examples: either raw devices or directory
paths.
You can verify that this step has been completed successfully as shown in Example 4-25.
NAME        URL
stable      https://kubernetes-charts.storage.googleapis.com
local       http://127.0.0.1:8879/charts
rook-beta   https://charts.rook.io/beta
ibm-charts  https://raw.githubusercontent.com/IBM/charts/master/repo/stable
Install the ibm-rook-rbd-cluster chart as shown in Example 4-26. It is deployed to the
namespace selected in the current kubectl context. To target a different namespace, add
--namespace <namespace> to the command.
NAME: rook-rbd-cluster
LAST DEPLOYED: Fri Feb 22 00:33:16 2019
NAMESPACE: kube-system
STATUS: DEPLOYED

RESOURCES:
==> v1beta1/Cluster
NAME                                                     AGE
rook-rbd-cluster-ibm-rook-rbd-cluster-rook-ceph-cluster  0s

==> v1beta1/Pool
NAME                                                  AGE
rook-rbd-cluster-ibm-rook-rbd-cluster-rook-ceph-pool  0s

==> v1/StorageClass

==> v1/ServiceAccount
NAME               SECRETS  AGE
rook-ceph-cluster  1        0s

==> v1beta1/Role
NAME               AGE
rook-ceph-cluster  0s

==> v1beta1/RoleBinding
NAME                    AGE
rook-ceph-cluster       0s
rook-ceph-cluster-mgmt  0s

==> v1/RoleBinding
NAME               AGE
rook-ceph-osd-psp  0s
rook-default-psp   0s

NOTES:
1. Installation of Rook RBD Cluster
rook-rbd-cluster-ibm-rook-rbd-cluster-rook-ceph-cluster successful.
You can verify that the resources were created as shown in Example 4-27. The target
namespace may differ, depending on the namespace to which you deployed the chart.
NAME AGE
rook-rbd-cluster-ibm-rook-rbd-cluster-rook-ceph-cluster 3m
NAME AGE
rook-rbd-cluster-ibm-rook-rbd-cluster-rook-ceph-pool 4m
Finally, you can verify that the persistent volume was dynamically created and bound as
shown in Example 4-29.
The Portworx Helm chart has multiple options for which drives and filesystems to use, as well
as a set of optional components (for example, a dedicated UI to manage the Portworx
cluster). The described procedure uses the default settings and installs the PX-Enterprise
Trial version with Storage Operator Runtime for Kubernetes (Stork), as described on
github.com:
“Stork can be used to co-locate pods with where their data is located. This is achieved by
using a kubernetes scheduler extender. The scheduler is configured to use stork as an
extender. Therefore, every time a pod is being scheduled, the scheduler will send filter and
prioritize requests to stork.”
Prerequisites
To install the Portworx Helm chart you need at least 1 node with 4 CPU cores, 4 GB RAM and
available free unmounted volumes or filesystems of at least 8 GB size. (See the full list of
prerequisties here:
https://docs.portworx.com/start-here-installation/#installation-prerequisites).
Portowrx requires an existing key-value database to be available at the installation time (for
example etcd or consul). When installing on IBM Cloud Private it is not recommended to
reuse cluster etcd database.
1
https://github.com/libopenstorage/stork#hyper-convergence
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: px-account-privileged-binding
roleRef:
  kind: ClusterRole
  name: ibm-privileged-clusterrole
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: px-account
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: stork-sa-anyuid-binding
roleRef:
  kind: ClusterRole
  name: ibm-anyuid-clusterrole
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: stork-account
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: stork-sched-sa-anyuid-binding
roleRef:
  kind: ClusterRole
  name: ibm-anyuid-clusterrole
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: stork-scheduler-account
  namespace: kube-system
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: portworx-hook-anyuid-binding
roleRef:
  kind: ClusterRole
  name: ibm-anyuid-clusterrole
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: portworx-hook
  namespace: kube-system
Create a portworx-clusterrolebindings.yaml file with the content of Example 4-31 and
install it with the following command:
kubectl apply -f portworx-clusterrolebindings.yaml
Attention: By default, the Portworx chart creates a DaemonSet, which means that the
Portworx pods will spread across all of the worker nodes in the cluster, scanning for
available disk drives. To prevent this behaviour, label the nodes where you do not want
Portworx to be installed with the following command:
kubectl label nodes <nodes list> px/enabled=false --overwrite
NAME: portworx
LAST DEPLOYED: Sun Mar 10 17:56:36 2019
NAMESPACE: kube-system
STATUS: DEPLOYED

RESOURCES:
==> v1/ConfigMap
NAME          DATA  AGE
stork-config  1     0s

==> v1/ClusterRole
NAME                    AGE
node-get-put-list-role  0s
stork-scheduler-role    0s
stork-role              0s

==> v1/Service
NAME              TYPE       CLUSTER-IP    EXTERNAL-IP  PORT(S)   AGE
portworx-service  ClusterIP  10.0.0.246    <none>       9001/TCP  0s
stork-service     ClusterIP  10.0.118.214  <none>       8099/TCP  0s

==> v1beta1/Deployment
NAME             DESIRED  CURRENT  UP-TO-DATE  AVAILABLE  AGE
stork-scheduler  3        3        3           0          0s
stork            3        3        3           0          0s

==> v1beta1/StorageClass
NAME  PROVISIONER  AGE

==> v1/StorageClass
NAME               PROVISIONER     AGE
stork-snapshot-sc  stork-snapshot  0s

==> v1/ServiceAccount
NAME                     SECRETS  AGE
px-account               1        0s
stork-scheduler-account  1        0s
stork-account            1        0s

==> v1/ClusterRoleBinding
NAME                          AGE
node-role-binding             0s
stork-scheduler-role-binding  0s
stork-role-binding            0s

==> v1beta1/DaemonSet
NAME      DESIRED  CURRENT  READY  UP-TO-DATE  AVAILABLE  NODE SELECTOR  AGE
portworx  3        3        0      3           0          <none>         0s

==> v1/Pod(related)
NAME                              READY  STATUS             RESTARTS  AGE
portworx-6qlvp                    0/1    ContainerCreating  0         0s
portworx-c4h4t                    0/1    ContainerCreating  0         0s
portworx-p86qn                    0/1    ContainerCreating  0         0s
stork-scheduler-679f999679-5gbt7  0/1    ContainerCreating  0         0s
stork-scheduler-679f999679-jq6gp  0/1    ContainerCreating  0         0s
stork-scheduler-679f999679-w5mv5  0/1    ContainerCreating  0         0s
stork-86bb9cb55d-467wr            0/1    ContainerCreating  0         0s
stork-86bb9cb55d-969vf            0/1    ContainerCreating  0         0s
stork-86bb9cb55d-khwrv            0/1    ContainerCreating  0         0s

NOTES:
Portworx would create a unified pool of the disks attached to your Kubernetes
nodes. No further action should be required and you are ready to consume
Portworx Volumes as part of your application data requirements.

For further information on usage of the Portworx in creating Volumes please refer
https://docs.portworx.com/scheduler/kubernetes/preprovisioned-volumes.html

For dynamically provisioning volumes for your Stateful applications as they run on
Kubernetes please refer
https://docs.portworx.com/scheduler/kubernetes/dynamic-provisioning.html

Want to use Storage Orchestration for hyperconvergence, Please look at STork here.
(NOTE: This isnt currently deployed as part of the Helm chart)
We tested this procedure on an IBM Cloud Private 3.1.2 cluster running CentOS 7 nodes.
Additional steps might be required when running it on IBM Cloud Virtual Servers and SLES.
If the installation succeeds you can verify the Portworx cluster status as shown in
Example 4-33.
Status: PX is operational
License: Trial (expires in 31 days)
Node ID: dc9ddffb-5fc3-4f5d-b6df-39aee43325df
        IP: 10.10.27.120
        Local Storage Pool: 1 pool
        POOL  IO_PRIORITY  RAID_LEVEL  USABLE  USED     STATUS  ZONE     REGION
        0     HIGH         raid0       50 GiB  6.0 GiB  Online  default  default
        Local Storage Devices: 1 device
        Device  Path      Media Type               Size    Last-Scan
        0:1     /dev/sdb  STORAGE_MEDIUM_MAGNETIC  50 GiB  10 Mar 19 10:14 UTC
        total             -                        50 GiB
Cluster Summary
        Cluster ID: px-cluster-1a2f8f7b-f4e8-4963-8011-3926f00ac9bc
        Cluster UUID: 91f47566-cac4-46f0-8b10-641671a32afd
        Scheduler: kubernetes
        Nodes: 3 node(s) with storage (3 online)
        IP            ID                                    SchedulerNodeName  StorageNode  Used     Capacity  Status  StorageStatus   Version          Kernel                     OS
        10.10.27.120  dc9ddffb-5fc3-4f5d-b6df-39aee43325df  icp-worker3        Yes          6.0 GiB  50 GiB    Online  Up (This node)  2.0.2.3-c186a87  3.10.0-957.5.1.el7.x86_64  CentOS Linux 7 (Core)
        10.10.27.118  85019d27-9b33-4631-84a5-7a7b6a5ed1d5  icp-worker1        Yes          6.0 GiB  50 GiB    Online  Up              2.0.2.3-c186a87  3.10.0-957.5.1.el7.x86_64  CentOS Linux 7 (Core)
        10.10.27.119  35ab6d0c-3e3a-4e50-ab87-ba2e678384a9  icp-worker2        Yes          6.0 GiB  50 GiB    Online  Up              2.0.2.3-c186a87  3.10.0-957.5.1.el7.x86_64  CentOS Linux 7 (Core)
Global Storage Pool
        Total Used      : 18 GiB
        Total Capacity  : 150 GiB
Attention: The procedure described in this section will not work in air-gapped
environments, because the Portworx pods download content and activate the trial license
from the Portworx site. For air-gapped installation, see the Portworx manuals at
https://docs.portworx.com/portworx-install-with-kubernetes/on-premise/airgapped.
Your Portworx storage cluster should now be ready, with an active trial license for 30 days. If
you want to purchase a production license, visit the Portworx website.
Because the IBM Knowledge Center provides detailed step-by-step instructions on deploying
the Minio Helm chart, we do not replicate them in this book. See
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/manage_cluster/configure_minio.html.
2. On the default Grafana dashboard click the Home dropdown in the upper left corner, as
shown in Figure 4-3.
Figure 4-4 List of predefined Grafana dashboards available in IBM Cloud Private
If you do not see any data, make sure that the user account that you have used for
authenticating to the IBM Cloud Private Console has the access rights to the namespace
hosting the Rook Ceph Helm chart.
All of the persistent volumes used for the tests were hosted on a datastore residing on local
drives inside the ESXi server.
We ran a series of performance benchmarks using the dbench and pgbench tools. All the
volumes used were hosted on SSD drives inside the ESXi server used for the test.
Note: All of the storage providers were used with the default settings as of IBM Cloud
Private V 3.1.2. It is very likely that some tuning could improve the results, especially for
the distributed filesystems.
Table 4-1 shows the results averaged over several runs of the test jobs.
Attention: The results are provided on an as-is basis using the output that was produced by
the FIO tool. Note that NFS uses the client caching feature, which allows the data to be
cached on the NFS client and read out of local memory instead of remote disk. This might
affect the results for NFS. Also, Portworx applies some I/O optimization techniques which
might affect the results.
This command creates a sample database of 1 GB in size. After that, we ran the test
command:
pgbench -c 10 -j 2 -t 10000 -h $PGHOST -p $PGPORT -U admin postgres
The command runs 10 client sessions in parallel using 2 worker threads, where each client
executes 10000 transactions.
Neither the database nor the storage was optimized in any way, as the goal of this test was
simply to show the relative performance of different storage options in the same
environment.
Storage provider  Result (transactions per second)
Vmware            1898
Gluster           103
NFS               1283
Portworx          676
The results are generally consistent with the IOPS test, showing a significant performance
advantage of Vmware over the other providers, which is not very surprising in our test setup.
However, it is worth noting that Ceph RBD performed over 2 times better than GlusterFS
(while both storage providers reported similar IOPS performance). Among the distributed
storage providers, Portworx showed almost 6 times better performance than GlusterFS and 3
times better than Ceph RBD.
A brief search on the Internet returns a hint from the GlusterFS documentation: “Gluster does
not support so called ‘structured data’, meaning live, SQL databases.”
This chapter explores each of these components in depth, describing their function in an IBM
Cloud Private cluster and how the logging and monitoring systems can be leveraged to cover
a range of common use cases when used with IBM Cloud Private.
Elasticsearch
Elasticsearch is a NoSQL database that is based on the Lucene search engine. Elasticsearch
in IBM Cloud Private has three main services that process, store, and retrieve data: the client,
master, and data nodes.
The client (also known as a ‘smart load-balancer’) is responsible for handling all requests to
Elasticsearch. It is the result of a separation of duties from the master node, and the use of a
separate client improves stability by reducing the workload on the master.
The master node is responsible for lightweight cluster-wide actions such as creating or
deleting an index, tracking which nodes are part of the cluster and deciding which shards to
allocate to which nodes.
Data nodes hold the shards that contain the documents you have indexed. Data nodes
handle data related operations like CRUD, search and aggregations. These operations are
I/O and memory intensive. It is important to monitor these resources and to add more data
nodes if they are overloaded. The main benefit of having dedicated data nodes is the
separation of the master and data roles to help stabilise the cluster when under load.
Logstash
Logstash is a log pipeline tool that accepts inputs from various sources, executes different
transformations, and exports the data to various targets. In IBM Cloud Private, it acts as a
central input for different log collectors, such as Filebeat, to rapidly buffer and process data
before sending it to Elasticsearch. Logstash can be configured to output data not just to
Elasticsearch, but a whole suite of other products to suit most other external log analysis
software.
Kibana
Kibana is an open source analytics and visualization layer that works on top of Elasticsearch,
allowing end users to perform advanced data analysis and visualize data in a variety of
charts, tables, and maps. The Lucene search syntax allows users to construct complex
search queries for advanced analysis, feeding into a visualization engine to create dynamic
dashboards for a real-time view of log data.
In IBM Cloud Private, the platform logging components are hosted on the management
nodes, with the exception of Filebeat, which runs on all nodes, collecting log data generated
by Docker. Depending on the cluster configuration, there are multiple instances of each
component. For example, in a High Availability (HA) configuration with multiple management
nodes, multiple instances of the Elasticsearch components will be spread out across these
nodes. The overview in Figure 5-2 shows how the Elasticsearch pods are spread across
management nodes.
Each of the components are configured to run only on management nodes and, where
possible, spread evenly across them to ensure that the logging service remains available in
the event a management node goes offline.
All containers running on a host write out data to stdout and stderr, which is captured by
Docker and stored on the host filesystem. In IBM Cloud Private, Docker is configured to use
the json-file logging driver, which means that Docker captures the standard output (and
standard error) of all the containers and writes them to the filesystem in files using the JSON
format. The JSON format annotates each line with its origin (stdout or stderr) and its
timestamp and each log file contains information about only one container. For each
container, Docker stores the JSON file in a unique directory using the container ID. A typical
format is /var/lib/docker/containers/<container-id>/<container-id>-json.log.
The /var/lib/docker/containers/ directory has a symlink for each file to another location at
/var/log/pods/<uid>/<container-name>/<number>.log.
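For illustration, a single line in such a JSON log file looks similar to the following (the log
content here is a made-up example; the format is that of the json-file logging driver):

{"log":"Server started on port 9080\n","stream":"stdout","time":"2019-03-05T10:15:30.123456789Z"}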
Filebeat then continuously monitors the JSON data for every container, but it does not know
anything about the container IDs. It does not query Kubernetes for every container ID, so the
kubelet service creates a series of symlinks pointing to the correct location (useful for
centralized log collection in Kubernetes) and it retrieves the container logs from the host
filesystem at /var/log/containers/<container-name>_<namespace>_<uid>.log. In the
Filebeat configuration, this filepath is used to retrieve log data from all namespaces using a
wildcard filepath /var/log/containers/*.log to retrieve everything, but it’s also possible to
configure more accurate filepaths for specific namespaces.
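As a sketch, the relevant part of a Filebeat configuration along these lines would look similar
to the following (key names follow the Filebeat 5.x log prospector schema; the actual
ConfigMap shipped with IBM Cloud Private may differ in detail):

filebeat.prospectors:
- input_type: log
  # Collect logs from every container on the node
  paths:
  - /var/log/containers/*.log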
Filebeat consists of two components: inputs and harvesters. These components work
together to tail files and send event data to a specific output. A harvester is responsible for
reading the content of a single file. It reads each file, line by line, and sends the content to the
output. An input is responsible for managing the harvesters and finding all sources to read
from. If the input type is log, the input finds all files on the drive that match the defined glob
paths and starts a harvester for each file.
Filebeat keeps the state of each file and, if the output (such as Logstash) is not reachable,
keeps track of the last lines sent so it can continue reading the files as soon as the output
becomes available again, which improves the overall reliability of the system. In IBM Cloud
Private, Logstash is pre-configured as an output in the Filebeat configuration, so Filebeat is
actively collecting logs from all cluster nodes and sending the data to Logstash.
First and foremost, Elasticsearch is built on top of Lucene. Lucene is Java based information
retrieval software primarily designed for searching text based files. Lucene is able to achieve
fast search responses because, instead of searching the text directly, it searches an ‘index’. In
this context, an index is a record of all the instances in which a keyword exists. To explain the
theory, a typical example is to think about searching for a single word in a book where all the
In Elasticsearch, an index is a single item that defines a collection of shards and each shard
is an instance of a Lucene index. A shard is a basic scaling unit for an index, designed to
sub-divide the whole index in to smaller pieces that can be spread across data nodes to
prevent a single index exceeding the limits of a single host. A Lucene index consists of one or
more segments, which are also fully functioning inverted indexes. The data itself is stored in a
document, which is the top level serialized JSON object (with key-value pairs), stored in the
index and the document is indexed and routed to a segment for searching. Lucene searches
each of these segments and merges the results, which are returned to Elasticsearch.
Elasticsearch also provides the capability to replicate shards, so ‘primary’ and ‘replica’ shards
are spread across the data nodes in the cluster. If a data node hosting a primary shard goes
down, the replica is promoted to primary and the cluster is still able to serve search queries.
For more information about the individual concepts, see the Elasticsearch documentation at
https://www.elastic.co/guide/en/elasticsearch/reference/6.2/_basic_concepts.html
with other useful articles available at
https://www.elastic.co/guide/en/elasticsearch/guide/current/inverted-index.html
and https://www.elastic.co/blog/found-elasticsearch-from-the-bottom-up
IBM Cloud Private, by default, will set 5 shards per index and 1 replica per shard. This means
that in a cluster with 300 indices per day the system will, at any one time, host 3000 shards
(1500 primary + 1500 replicas). To verify this, Example 5-2 uses an index to check the
number of shards, replicas, and the index settings.
{
  "logstash-2019.03.03": {
    "settings": {
      "index": {
        "creation_date": "1551791346123",
        "number_of_shards": "5",
        "number_of_replicas": "1",
        "uuid": "xneG-iaWRiuNWHNH6osL8w",
        "version": {
          "created": "5050199"
        },
        "provided_name": "logstash-2019.03.03"
      }
    }
  }
}
Each time Logstash sends a request to Elasticsearch, it will create a new index if it does not
exist, or the existing index will be updated with additional documents. The Elasticsearch data
pod is responsible for indexing and storing data, so during this time the CPU utilization,
memory consumption and disk I/O will increase.
The default logging configuration is designed to be a baseline and it provides the minimum
resources required to effectively run a small IBM Cloud Private cluster. The default resource
limits are not the ‘production ready’ values and therefore the Cluster Administrator should
thoroughly test and adjust these settings to find the optimal resource limits for the workloads
that will be running on the production environment. IBM Cloud Private Version 3.1.2 logging
installs with the following resource limits by default (See Table 5-1).
Tip: Some users experience high CPU utilization by the Java processes on the host. This is
because no CPU limit is specified on the containers, allowing them to consume all the
available host CPU, if necessary. This is intentional, and setting limits may impact the ELK
stack stability.
It is worth noting that high CPU utilization may be an indication of memory pressure due to
garbage collection and should be investigated.
The size of the Elasticsearch cluster deployed in an environment entirely depends on the
number of management nodes in the cluster. The default number of Elasticsearch master and
data pods is calculated based on the available management nodes, following these rules:
One Elasticsearch data pod per IBM Cloud Private management node
Number of Elasticsearch master pods equal to the number of management nodes
One Logstash pod per IBM Cloud Private management node
One Elasticsearch client pod per IBM Cloud Private management node
Elasticsearch client and Logstash replicas can be temporarily scaled as required using the
default Kubernetes scaling methods. If any scaling is permanent, it’s recommended to use the
Helm commands to update the number of replicas.
Data retention
The default retention period for logs stored in the platform ELK stack is 24 hours. A curator is
deployed as a CronJob that will remove the logstash indices from Elasticsearch every day at
23:30 UTC. If Vulnerability Advisor is enabled, another CronJob runs at 23:59 UTC to remove
the indices related to Vulnerability Advisor older than 7 days.
Modifying the default retention period without proper capacity planning may be destructive to
the ELK stack. Increasing the retention period will increase the resources required to search
and store the data in Elasticsearch, so ensure the cluster has the required resources to be
able to do so. For more information about resource allocation, see “Capacity planning” on
page 164.
For testing purposes, the default retention period is easily customized by modifying the
logging-elk-elasticsearch-curator-config ConfigMap. To modify the retention period from
1 day to 14 days edit the logging-elk-elasticsearch-curator-config ConfigMap and
modify the unit_count in the first action named delete_indices. The result should look
similar to Example 5-4.
actions:
1:
action: delete_indices
description: "Delete user log indices that are older than 1 days. Cron
schedule: 30 23 * * *"
options:
timeout_override:
continue_if_exception: True
ignore_empty_list: True
disable_action: False
filters:
- filtertype: pattern
kind: prefix
value: logstash-
- filtertype: age
source: name
direction: older
timestring: '%Y.%m.%d'
unit: days
unit_count: 14
After saving and closing the file, the new curator configuration will be automatically reloaded
and indices will be retained for 14 days.
Also within this ConfigMap, cleanup actions are provided for the Vulnerability Advisor indices.
If you already have a valid X-Pack license, you can use the X-Pack security features instead,
by adding the following to the config.yaml at installation time:
logging:
  security:
    provider: xpack
The Certificate Authority (CA) is created during installation, but the IBM Cloud Private installer
offers the capability to supply your own CA that will be used for all other certificates in the
cluster. For more information, see
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/installing/create_cert
.html
The default configuration should not be relied upon to meet the needs of every deployment. It
is designed to provide a baseline that the Cluster Administrator should use as a starting point
to determine the resources required for their environment. There are a number of factors that
affect the logging performance, mostly centered around CPU and memory consumption.
Elasticsearch is based on Lucene, which uses Java, and therefore has a requirement for
sufficient memory to be allocated to the JVM heap. The entire JVM heap is assigned to
Lucene and the Lucene engine will consume all of the available memory for its operations,
which can lead to out-of-memory errors if the heap size is not sufficient for all the indexing,
searching and storing that the Lucene engine is trying to do. By default, there are no CPU
limits on the Elasticsearch containers, as the requirements vary depending on workload. On
average, the CPU load is generally low, but will significantly rise during active periods where
heavy indexing or searches are taking place and could consume the entire available CPU
from the host. Restricting the CPU usage to only a few cores will create too much of a backlog
of logs to process, increasing the memory usage and ultimately resulting in an unusable
Elasticsearch cluster during this time. Therefore, it is almost impossible to predict the required
resources for every use case and careful analysis should be made in a pre-production
environment to determine the required configuration for the workload that will be running, plus
additional margins for spikes in traffic.
It is not uncommon for the logging pods to be unresponsive immediately after installation of
an IBM Cloud Private cluster, especially when left at the default configuration. In a basic
installation, whilst all pods are starting up, they are producing hundreds of logs per second
that all need to be catered for by the logging pods, which are also trying to start up at the
same time. During cluster installation, an observed average across several installations by the
development team was around 1500 messages per second and it takes around 30 minutes
for the logging platform to stabilise with the normal rate of 50-100 messages per second in a
lightly active cluster. When Vulnerability Advisor is enabled, the rate during installation can
rise to an observed rate of around 2300 messages per second, taking Elasticsearch up to 90
minutes to stabilise.
X-Pack monitoring
IBM Cloud Private logging comes with a trial license for X-Pack enabled by default, but the
trial functionality is not enabled during deployment. The trial is aimed at users who need more
advanced capabilities that may eventually need to purchase the full X-Pack license.
Information about X-Pack can be found at
https://www.elastic.co/guide/en/x-pack/current/xpack-introduction.html as it is not
covered in this chapter. However, for the purpose of estimating the logging requirements, the
X-Pack monitoring can be enabled.
To enable the X-Pack monitoring, if it is not already enabled at installation time, use the helm
upgrade command.
The logging Helm chart is located in a Helm repository called mgmt-charts. Instead of adding
the mgmt-charts repository to the local machine, the URL of the chart can be used directly.
Run the helm upgrade command and set the xpack.monitoring value to true. You’ll need to
pass the default installation values in this command too, found in the cluster installation
directory.
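A command along the following lines performs the upgrade; the chart URL and the path to
the default installation values are placeholders to substitute with the values for your cluster:

helm upgrade logging <mgmt-charts-repo-url>/ibm-icplogging-2.2.0.tgz \
  -f <installation-dir>/default-values.yaml \
  --set xpack.monitoring=true \
  --force --tls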
Attention: Using helm upgrade can result in some down time for the logging service while
it is replaced by new instances, depending on what resources have changed. Any changes
made to the current logging configuration that was not changed through Helm will be lost.
In some cases, the cluster may be unresponsive and requires initialization by sgadmin. This
is a known issue, so if it occurs follow the IBM Knowledge Center steps at
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/getting_started/kno
wn_issues.html.
Restriction: The helm upgrade command may not run without first deleting the
logging-elk-kibana-init job. The use of --force in the above command typically
removes the need for this, but if you receive an error related to the
logging-elk-kibana-init job, delete this prior to running helm upgrade using kubectl -n
kube-system delete job logging-elk-kibana-init.
Navigate to the Monitoring → Elasticsearch (cluster) → Nodes section. All the nodes
that make up the current Elasticsearch cluster are displayed, along with extremely useful data
about JVM memory consumption and CPU usage per node. Assuming that the workload in
the current cluster is realistic, it is easier to see whether the current resource allocation for the
management node CPU cores, or the JVM heap, is enough to handle the current workload
plus additional spikes.
Currently, in the above cluster, JVM memory usage for the data nodes is around 60%, which
is suitable for the current environment, leaving a 40% overhead to handle temporary excess
log data. If this value were consistently around 80% or 90%, it would be beneficial to double
the allocated resources in the logging-elk-data StatefulSet. The current data nodes’ CPU usage
is around 30% and the maximum value the nodes have hit is around 60%, so for the current
workload, the CPU cores assigned to the management nodes are sufficient. In all cases,
these values should be monitored for changes and adjusted as required to keep the logging
services in a stable state. High CPU usage is an indicator that there is memory pressure in
the data nodes due to garbage collection processes.
Note that this may only be the current and previous day’s indices due to the curator cleaning
up indices older than the default 24 hours (unless configured differently). Selecting the
previous day’s index will provide information about how much storage the index is using, the
rate at which data is indexed, growth rate over time and other metrics. Assuming that these
values are realistic representations of the actual log data the production workloads would
produce, it becomes easier to realize whether or not the current storage is sufficient for the
length of time the data should be retained. Figure 5-8 shows that 6 days worth of data
retention has yielded around 30GB of storage, so the default value of 100GB is sufficient, for
now.
The advanced dashboard provides further analytics and insights into some key metrics that
enable better fine tuning of the cluster. Tuning Elasticsearch for better performance during
indexing and querying is a big subject and is not covered in this book. More information about
tuning Elasticsearch can be found in the Elasticsearch documentation at
https://www.elastic.co/guide/en/elasticsearch/reference/5.5/general-recommendations.html.
Prometheus monitoring
Elasticsearch exposes metrics to Prometheus, which can be viewed using a default Grafana
dashboard (Figure 5-9 on page 168), or queried using the Prometheus User Interface (UI). An
Elasticsearch dashboard is provided by the development team out-of-the-box with IBM Cloud
Private Version 3.1.2 to easily view various metrics about the Elasticsearch cluster, similar to
those provided by the X-Pack monitoring.
The Prometheus UI is also available, but is used to submit Prometheus search queries and
return the raw results, with no additional layers on top. Access the Prometheus UI at
https://<icp-cluster-ip>:8443/prometheus and this should display the graph page allowing
you to input queries and view the results as values or a time-series based graph.
There are numerous metrics available to use, from both the standard Prometheus container
metrics scraped by Kubernetes containers and those provided by the Elasticsearch
application. As an example, start typing ‘elasticsearch_’ into the search box and a list of
available Elasticsearch-specific metrics to use will show in the drop-down list. It’s worth noting
that Grafana also uses these metrics to construct the Elasticsearch dashboard mentioned
previously, so all the same data is available here too.
As a starting point, use the query in Example 5-5 to view the current memory usage as a
percentage of all master, data, client and logstash pods. Copy the query to the search box,
select Execute and then Graph, to see the time-series data graphically.
In the current environment, this produces the output in Figure 5-10 on page 169.
Here we can see the overall memory consumption for the whole container, not just the JVM
heap usage. The query in Example 5-6 also displays the current memory usage in GB.
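The queries in Examples 5-5 and 5-6 are not reproduced here; queries of the following shape
yield similar results (the pod name pattern and label names are assumptions that depend on
the Kubernetes and cAdvisor versions in use):

# Memory usage as a percentage of the container limit (in the spirit of Example 5-5)
container_memory_usage_bytes{namespace="kube-system",pod_name=~"logging-elk-.*"}
  / container_spec_memory_limit_bytes{namespace="kube-system",pod_name=~"logging-elk-.*"} * 100

# Memory usage in GB (in the spirit of Example 5-6)
container_memory_usage_bytes{namespace="kube-system",pod_name=~"logging-elk-.*"} / 1024^3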
Example 5-7 shows a query for viewing the average rate of the current CPU usage of the
same Elasticsearch containers, also providing valuable information about how accurate the
CPU resource allocation is in this cluster.
Figure 5-11 on page 170 shows the output of this query for this cluster.
The container_cpu_usage_seconds_total metric used here contains the total amount of CPU
seconds consumed by container, by core. The output represents the number of cores used by
the container, so a value of 2.5 in this example would mean 2500 millicores in terms of
Kubernetes resource requests and limits.
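A query along these lines, similar in spirit to Example 5-7, returns the per-pod CPU core
usage (again, the label names are assumptions):

sum(rate(container_cpu_usage_seconds_total{namespace="kube-system",pod_name=~"logging-elk-.*"}[5m])) by (pod_name)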
When planning the amount of storage required, use the following rule:

Storage per data node (GB) = (Total index GB per day × retention period in days) / number of data nodes
In this equation, there is an unknown; the Total index GB, which is the sum of the storage
consumed for each index. Without this information, it’s difficult to estimate how much storage
to allocate to each data node. However, using the same monitoring methods from the
previous sections, it’s possible to see the current storage usage for each index, which makes
it possible to measure the amount of storage capacity used on a daily basis. In the
default Elasticsearch deployment in IBM Cloud Private 3.1.2, there is only one index
configured to store logging data, but if audit logging and Vulnerability Advisor is enabled there
are additional indices to cater for, each varying in size.
As an example, based on the information gathered by the X-Pack monitoring dashboard, for
the logstash-2019.03.05 index, it takes up ~18GB storage space. This includes the replicas,
so the primary shards consume 9GB and the replica shards also consume 9GB. Depending
on the configuration (e.g. HA) this will be spread across management nodes. Typically, if there
is only one management node, then use the 9GB value as replica shards will remain
unassigned. Therefore, retaining 14 days’ worth of logging data (using this as an average)
requires a minimum of ~126GB. This figure, however, does not represent a real production
value, only that of a minimally used lab environment. In some cases, a single day’s worth of
logging data may be around 100GB for a single index. Therefore, in a typical cluster with 2
management nodes, 50GB per management node per day of retention is required.
Similar methods can be used by using the Kibana UI and executing API commands directly to
Elasticsearch to gather data on the storage used per index. For example, the storage size on
disk can be retrieved using GET _cat/indices/logstash-2019.03.05/, producing the output in
Example 5-8.
There does appear to be some correlation between the storage capacity and the data node
resource consumption during idle indexing. A simple advisory rule is to set the data node
memory to around 15% of the storage capacity required on disk. For example, storing
100GB of log data would mean setting the data node memory limits to around 15GB. In
practice, this would be 8GB JVM Xmx/Xms and 16GB container memory for each data node.
Based on the examples in this section, the daily index requires around 18GB of storage per
node, per day, and the observed memory usage of the data nodes is around 2.5GB, which is
roughly 14%.
In another cluster, similar behavior was observed where the index storage size was at 61GB
for the current day and the observed memory consumption for the data node (configured with
6GB JVM Xmx/Xms and 12GB container memory) was 8.1GB, so the memory usage was
around 13%.
This is just a baseline to help start with a simple estimate and the ‘15%’ value is likely to
increase if users perform a lot of queries on large sets of data.
When setting the heap size, the following rules should be followed:
JVM heap should be no more than 50% of the total container memory
JVM heap should not exceed 32GB (which means the maximum size for an Elasticsearch
node is 64GB memory)
These rules also apply to the master, client, and Logstash containers; however, there is no
Lucene engine running on these nodes, so around 60% of the total memory can be allocated
to the JVM heap.
Elasticsearch reports that garbage collection is configured to start when the JVM heap usage
exceeds 75% full. If the Elasticsearch nodes are constantly above 75%, the nodes are
experiencing memory pressure, which means not enough is allocated to the heap. If nodes
constantly exceed 85-95% the cluster is at high risk of becoming unstable with frequent
response delays and even out-of-memory exceptions. Elasticsearch provides useful
information about the garbage collector usage on each node using the _nodes/stats API. In
particular, the heap_used_percent metric in the jvm section is worth looking at if you’re
experiencing issues to ensure it is generally below 75% during normal operation. See
Example 5-9.
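For example, the following request can be issued from the Kibana Dev Tools page or with
curl against the Elasticsearch client service; the trimmed response below is illustrative only
(the value shown is hypothetical):

GET _nodes/stats/jvm

"jvm" : {
  "mem" : {
    "heap_used_percent" : 63,
    ...
  }
}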
3 hours into load testing, the index size was monitored and logs were, on average, generated
at a rate of about 0.9GB per hour. See Figure 5-12.
Over 24 hours, this gives 21.6GB of data per day. Retaining this for 14 days gives 302GB
and, as this cluster is installed with two management nodes, each management node requires at
least 151GB available storage. Based on this information, the memory for data nodes can
also be estimated using the ‘15%’ rule mentioned in the previous section. 15% of 21.6GB is
3.25GB, so in this example, setting the JVM Xmx/Xms to 2GB and container limit to 4GB is a
good estimate.
elasticsearch_storage_size: "100Gi"
Use the below examples as templates to set the memory limits for a given component,
replacing values where necessary.
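A sketch of such a patch for the data nodes follows; the StatefulSet and container names
(logging-elk-data, es-data) and the ES_JAVA_OPTS variable are assumptions that should be
checked against your deployment:

kubectl -n kube-system patch statefulset logging-elk-data --patch '
spec:
  template:
    spec:
      containers:
      - name: es-data
        resources:
          limits:
            memory: 4Gi       # container memory limit
        env:
        - name: ES_JAVA_OPTS
          value: "-Xms2g -Xmx2g"'   # JVM heap at half the container limit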
These commands are sufficient to scale the containers vertically, but not horizontally.
Because the cluster settings are integrated into the Elasticsearch configuration, you cannot
simply add and remove master or data nodes using standard Kubernetes capabilities such as
the HorizontalPodAutoscaling resource.
kubectl patch should be used for testing, to find a suitable value. Updating the logging
resources in this way will not persist across chart or IBM Cloud Private upgrades. Permanent
configuration changes can only be made using helm upgrade. To update the JVM and
container memory for the data nodes, for example, run the upgrade:
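For example, a command of the following form; the heapSize and memoryLimit value keys
are assumptions to verify against the chart's values file:

helm upgrade logging <mgmt-charts-repo-url>/ibm-icplogging-2.2.0.tgz \
  -f default-values.yaml \
  --set elasticsearch.data.heapSize=2048m \
  --set elasticsearch.data.memoryLimit=4096Mi \
  --force --tls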
With the above in mind, it is important to be able to scale the components as and when
required.
Logstash and client pods: increase the number of replicas by using Helm:
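For example (the replica value keys are assumptions to verify against the chart's values file):

helm upgrade logging <mgmt-charts-repo-url>/ibm-icplogging-2.2.0.tgz \
  -f default-values.yaml \
  --set logstash.replicas=3 \
  --set elasticsearch.client.replicas=3 \
  --force --tls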
Whilst audit logging was being enabled, the workloads from the previous sections were left
running, to determine the effect that enabling audit logging has on a running cluster with an
already active Elasticsearch.
Figure 5-13 on page 177 shows that the cluster became unresponsive for a short period of
time whilst the main pods that generate audit logs (plus kubernetes api-server audit logging)
were restarted and began generating data in the newly added audit index.
Figure 5-14 shows one of the data nodes becoming unresponsive after being overloaded due
to high CPU utilization.
Checking the total memory consumed during this time period in Prometheus also shows
similar results, with the container memory peaking at around 4.5GB. See Figure 5-16.
As the audit log generation uses Fluentd, data is sent directly to Elasticsearch, so there is no
need to consider Logstash.
The client and master nodes generally stayed stable throughout the test. The sudden hike for
one of the client nodes in Figure 5-17 on page 179 is due to a new replica starting in advance
of enabling audit logging.
Based on the above results, Table 5-2 shows the minimum advised resources for a small High
Availability cluster using 3 master nodes, 2 proxy nodes, 2 management nodes, 1
Vulnerability Advisor node and 3 worker nodes.
Table 5-2 Advised ELK resource limits for audit logging with high availability
Name Number of Nodes Memory
In smaller clusters, resources can be reduced. Audit logging was tested on a smaller, 6 node
cluster (1 master, 1 proxy, 1 management and 3 workers) and the minimum resource limits in
Table 5-3 are sufficient for audit logging.
Table 5-3 Minimum ELK resource limits for audit logging in small clusters
Name Number of Nodes Memory
In larger, more active clusters, or during installation of IBM Cloud Private with audit logging
and Vulnerability Advisor enabled these values will almost certainly increase, so plan
accordingly and be patient when finding the correct resource limits. During times of heavy
use, it may appear as though Elasticsearch is failing, but it usually does a good job of
catching up after it has had time to process, provided it has the appropriate resources. An
easy mistake users often make is thinking that it’s taking too long and pods are stuck, so they
are forcefully restarted and the process takes even longer to recover. During busy periods of
heavy indexing or data retrieval, it is common to observe high CPU utilization on the
management nodes; however, this is usually an indicator that there is not enough memory
allocated to the heap.
Careful monitoring of the container memory is advised, and container memory should be
increased if there are several sudden dips or gaps in the Prometheus graphs while the
container is under heavy load, as this is typically a sign that the container is restarting.
Evidence of this is in Figure 5-18 on page 180, where a data pod had reached its 3GB
container limit (monitored using Prometheus) and suddenly dropped, indicating an
out-of-memory failure.
As a reference, Example 5-15 provides the base YAML to add to the config.yaml to set the
logging resources to the above recommendations.
The RBAC module consists of Nginx containers bundled with the client and Kibana pods to
provide authentication and authorization for the ELK stack. These Nginx containers use
the following rules:
1. A user with the role ClusterAdministrator can access any resource, whether audit or
application log.
2. A user with the role Auditor is only granted access to audit logs in the namespaces for
which that user is authorized.
3. A user with any other role can access application logs only in the namespaces for which
that user is authorized.
4. Any attempt by an auditor to access application logs, or a non-auditor to access audit logs,
is rejected.
The RBAC rules provide basic data retrieval control for users that access Kibana. The rules
do not prevent seeing metadata such as log field names or saved Kibana dashboards.
User-saved artifacts in Kibana are all saved in Elasticsearch in the same default index of
/.kibana. This means that all users using the same instance of Kibana can access each
other’s saved artifacts.
If X-Pack functions were enabled during deployment, they will also appear in the UI.
Discover
This is the first page shown automatically when logging in to Kibana. By default, this page will
display all of your ELK stack’s 500 most recently received logs in the last 15 minutes from the
namespaces you are authorized to access. Here, you can filter through and find specific log
messages based on search queries.
The search bar is the most convenient way to search for a string of text in log data. It uses a
fairly simple language structure, called the Lucene Query Syntax. The query string is parsed
into a series of terms and operators. A term can be a single word, or a phrase surrounded by
double quotes, which searches for all the words in the phrase in the same order. Figure 5-19
on page 183 shows searching Kibana for log data containing the phrase “websphere-liberty”.
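For example, a query along these lines combines a phrase with a field filter (the field names
mirror those used later in this section; the container name pattern is a hypothetical example):

log:"websphere-liberty" AND kubernetes.container_name:web*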
Searches can be further refined, for example by searching only for references to the default
namespace. Using filters enables users to restrict the results shown only to contain the
relevant field filters, as shown in Figure 5-20.
And the results are restricted to this filter only, as seen in Figure 5-21.
The use of fields also allows more fine grained control over what is displayed in the search
results, as seen in Figure 5-22 where the fields kubernetes.namespace,
kubernetes.container_name and log are selected.
Visualize
The Visualize page lets you create graphical representations of search queries in a variety of
formats, such as bar charts, heat maps, geographical maps and gauges.
This example will show how a simple pie chart is created, configured to show the top 10
containers that produce the highest number of logs per namespace from a Websphere
Liberty Deployment that is running in different namespaces.
To create a visualization, select the + icon, or Create Visualization, if none exist, then
perform the following steps:
1. Select the indices that this visualization will apply to, such as logstash. You may only have
one if the default configuration has not been modified or audit logging is not enabled.
2. Select Pie Chart from the available chart selection.
The result should look similar to Figure 5-23. Save the visualization using the Save link at the
top of the page, as this will be used later to add to a dashboard.
Explore the other visualizations available, ideally creating a few more as these can be used to
create a more meaningful Kibana Dashboard.
Dashboards
The Kibana Dashboard page is where you can create, modify and view your own custom
dashboards where multiple visualizations can be combined on to a single page and filter them
by providing a search query or by selecting filters. Dashboards are useful for when you want
to get an overview of your logs and make correlations among various visualizations and logs.
To create a new Dashboard, select the + icon, or Create Dashboard, if none exist. Select
Add Visualization, then select the visualizations created earlier that you want to display in this
dashboard. From here, you can further filter the data shown in the individual visualizations by
entering a search query, changing the time filter, or clicking the elements within the
visualization. The search and time filters work just like they do in the Discover page, except
that they are applied to all the visualizations in the dashboard.
Timelion
Timelion is a time series data visualizer that enables you to combine totally independent data
sources within a single visualization. It’s driven by a simple expression language you use to
retrieve time series data, perform calculations to tease out the answers to complex questions
and visualize the results.
The following example uses Timelion to compare time-series data about the number of logs
generated in the past hour, compared to the number of logs generated this hour. From this,
you can compare trends and patterns in data at different periods of time. For IBM Cloud
Private, this can be useful to see trends in the logs generated on a per-pod basis. For
example, the chart in Figure 5-25 compares the total number of logs in the current hour
compared with the previous hour and the chart in Figure 5-26 on page 187 compares the
number of logs generated by the image-manager pods in the last hour.
Figure 5-25 Timelion chart comparing total logs in the current hour compared to the previous hour
These charts can also be saved and added to a Dashboard to provide even more analysis on
the log data stored in Elasticsearch.
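As a sketch, a Timelion expression similar to the following produces the kind of comparison
shown in Figure 5-25 (the index pattern is an assumption):

.es(index=logstash-*).label('current hour'), .es(index=logstash-*, offset=-1h).label('previous hour')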
Dev Tools
The Dev Tools page is primarily used to query the Elasticsearch API directly and can be used
in a multitude of ways to retrieve information about the Elasticsearch cluster. This provides an
easy way to interact with the Elasticsearch API without having to execute the commands from
within the Elasticsearch client container. For example, Figure 5-27 shows executing the
_cat/indices API command to retrieve a list of indices.
Important: One thing to note about this tool is that all users with access to Kibana
currently have access to the Elasticsearch API, which at the time of writing is not RBAC
filtered, so all users can run API commands against Elasticsearch. It is possible to disable
the Dev Tools page, by adding console.enabled: false to the kibana.yml content in the
logging-elk-kibana-config ConfigMap in the kube-system namespace and restarting the
Kibana pod.
5.2.8 Management
The Management page allows for modifying the configuration of several aspects of Kibana.
Most of the settings are not configurable, as they are controlled by the use of a configuration
file within the Kibana container itself, but the Management page lets you modify settings
related to the stored user data, such as searches, visualizations, dashboards and indices.
Adding additional ELK stacks does not come without its price. As described throughout this
chapter, logging takes a toll on the resources available within a cluster, requiring a lot of
memory to function correctly. When designing a cluster, it’s important to take into
consideration whether multiple ELK stacks are required and, if so, the resources required,
using the capacity planning methods discussed in “Capacity planning” on page 164. The
same care should be taken when designing clusters for production that include multiple ELK
stacks for applications, and the configuration should be thoroughly tested in the same way the
platform ELK is tested for resiliency. Failure to do so can result in the loss of logging services,
and potentially loss of data, if Kubernetes has to restart components because the design did
not provide enough memory to fulfill the logging requirements for an application.
Planning for multiple ELK stacks should, in theory, be a little easier than figuring out how
much the platform ELK should be scaled to meet the needs of both the platform and the
applications it is hosting. This is because developers typically know how much log data their
application (or group of applications) produces based on experience or native monitoring
tools. In this situation, you can solely focus on what the application needs, as opposed to
catering for the unknowns that the platform brings.
Architectural considerations
Before deploying additional ELK stacks to the cluster, think about how it will affect any
resources available for the namespace. Resource requirements for ELK can be quite high, so
if you are deploying ELK to an individual namespace where users operate, take this into
consideration when designing Resource Quotas for that namespace. Best practice is to
deploy ELK to its own namespace, to isolate it from user workloads, and from users
themselves if necessary.
The ELK stack requires elevated privileges in order to function correctly, in particular, it
requires the IPC_LOCK privilege which is not included in the default Pod Security Policy. If ELK
is not being deployed to the kube-system namespace, the Service Account that ELK will use
(typically the default Service Account for the hosting namespace) should be configured to use
a Pod Security Policy that permits the use of IPC_LOCK. This can be done by creating a new
Pod Security Policy for ELK and by creating a new ClusterRole and RoleBinding. Here there
are two options to consider:
1. Deploying the ELK stack to the kube-system namespace
2. Deploying the ELK stack to the user namespace
Deploying the ELK stack to the user namespace means that the users that have access to
the namespace also have access to view the resources within it. This means that users will be
able to perform operations on the ELK pods (depending on the user roles) including viewing
the CA certificates used to deploy the ELK stack security configuration (if ELK is deployed
with security). By giving a service account within the user namespace elevated privileges,
you’re also allowing users to acquire those privileges, so ensure that the IPC_LOCK capability
does not conflict with any security policies. IPC_LOCK enables mlock to protect the heap
memory from being swapped. You can read more about mlock at
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_MRG/1.3/html/Real
time_Reference_Guide/sect-Realtime_Reference_Guide-Memory_allocation-Using_mlock_t
o_avoid_memory_faults.html.
If this is an issue, consider deploying the additional ELK stacks to a separate namespace.
Additional namespaces can be used to host the additional ELK stacks and utilise the standard
RBAC mechanisms within IBM Cloud Private to prevent unauthorized users from accessing
the ELK pods. However in this scenario (providing security is enabled), users in other
namespaces would not be able to access the Kibana pods to view logs or the status for
troubleshooting. If, as a Cluster Administrator, you do not require users to monitor the Kibana
pods, then restricting access to it is recommended. If users should be able to view logs and
the status of the Kibana pod (to troubleshoot access issues), an additional namespace can be
used to host the Kibana pod, where users can be given a 'Viewer' role. This still provides
access to the Kibana pods for diagnostics but prevents users from making any changes to its
state or configuration, further protecting the ELK stack from malicious intent.
For development clusters, deploying the ELK stack to user namespaces may be sufficient. For
production clusters where access to the core ELK components should be restricted,
deploying the ELK stack to a dedicated management namespace is recommended.
        Standard mode  Managed mode
RBAC    No             Yes
Each mode has pros and cons, and the mode to choose entirely depends on what the ELK
requirements are. For example, if the requirement is to deploy only one additional ELK stack
dedicated to application logs, but it should be secure, implement RBAC mechanisms, and not
consume worker node resources, then the managed mode is suitable. The drawback is that it
introduces a lot of additional configuration.
The recommended way to deploy additional ELK stacks is by using the standard mode.
Managed mode is possible to achieve, but introduces a lot of additional configuration to
enable all the beneficial features of managed mode and it is not covered in this book.
Storage
Data node replicas in each deployment of ELK require a dedicated PersistentVolume (PV)
and PersistentVolumeClaim (PVC). If a dynamic storage provider is available, ELK can be
configured to use this during deployment. If dynamic provisioning is not available, then
suitable PVs must be created first.
To create the namespace, use kubectl create namespace elk, and then label it using
kubectl label namespace elk name=elk. Alternatively, use the YAML definition in
Example 5-16.
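A minimal sketch of such a definition:

apiVersion: v1
kind: Namespace
metadata:
  name: elk
  labels:
    name: elk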
RBAC
The ELK stack requires privileged containers and the IPC_LOCK and SYS_RESOURCE capabilities,
which means giving the default Service Account (SA) in the elk namespace elevated
privileges, as the default restricted policy is too restrictive for ELK to function. To allow this, a
new Pod Security Policy (PSP) is required, as well as a Cluster Role and Role Binding to the
new PSP. To create the required RBAC resources, perform the following steps:
1. Copy the elk-psp.yaml in Example 5-17 to a local file called elk-psp.yaml and create it in
Kubernetes using kubectl create -f elk-psp.yaml.
After the above steps, RBAC is configured with the correct privileges for ELK.
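The full elk-psp.yaml from Example 5-17 is not reproduced here. As a minimal sketch, a PSP
that adds the required capabilities might look like the following (the name and exact field set
are illustrative assumptions):

apiVersion: extensions/v1beta1
kind: PodSecurityPolicy
metadata:
  name: elk-psp
spec:
  privileged: true
  # Capabilities needed by Elasticsearch to lock heap memory and adjust resources
  allowedCapabilities:
  - IPC_LOCK
  - SYS_RESOURCE
  runAsUser:
    rule: RunAsAny
  seLinux:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  volumes: ['*']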
Pulling images
As ELK is deployed to a dedicated namespace, it’s necessary to create an image pull secret
so that it can pull the ELK images from the ibmcom namespace that contains the platform
images.
Create an image pull secret in the elk namespace using the platform admin credentials.
Replace mycluster.icp and the username/password credentials with the values for the
environment. This Secret will be used later to allow the ELK pods to pull images from the
ibmcom namespace.
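A command of the following shape creates such a secret; the secret name
infra-registry-key and the registry port are assumptions to substitute with your cluster's
values:

kubectl -n elk create secret docker-registry infra-registry-key \
  --docker-server=mycluster.icp:8500 \
  --docker-username=admin \
  --docker-password=<password>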
Security
It’s recommended to enable security on all deployed ELK stacks. You have the option of using
the platform generated Certificate Authority (CA), supplying your own or letting Elasticsearch
To create a dedicated CA for each ELK stack, use the openssl command in Example 5-18,
replacing the subject details if necessary.
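The exact command in Example 5-18 is not shown here; an equivalent openssl invocation
(the subject fields are placeholders) is:

openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -keyout ca.key -out ca.crt \
  -subj "/C=US/ST=NY/O=IBM/CN=elk-ca"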
This will output a ca.crt and ca.key file to your local machine. Run the command in
Example 5-20 to create a new key-pair Kubernetes Secret from these files.
To specify your own CA to use when deploying additional ELK stacks, three requirements
must be met:
The CA must be stored in a Kubernetes secret.
The secret must exist in the namespace to which the ELK stack is deployed.
The contents of the certificate and its secret key must be stored in separately named fields
(or keys) within the Kubernetes secret.
If the keys are stored locally, run the command in Example 5-20 to create a new secret,
replacing <path-to-file> with the file path of the files.
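A sketch of that command (the secret name elk-ca-secret is illustrative):

kubectl -n elk create secret generic elk-ca-secret \
  --from-file=ca.crt=<path-to-file>/ca.crt \
  --from-file=ca.key=<path-to-file>/ca.key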
Alternatively, Example 5-21 shows the YAML for a sample secret using defined CA
certificates. You’ll need to paste the contents of the my_ca.crt and my_ca.key in the YAML
definition in your preferred editor.
If your own CA is not supplied, and the default IBM Cloud Private cluster certificates are
suitable, they are located in the cluster-ca-cert Secret in the kube-system namespace. As
ELK is deployed to another namespace, it’s important to copy this Secret to that namespace,
otherwise the deployment will fail trying to locate it. This is only relevant if you are not using
your own (or generated) CA for ELK. Use the command in Example 5-22 to copy the
cluster-ca-cert to the elk namespace, using sed to replace the namespace name without
manually editing it.
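A sketch of the Example 5-22 approach might look like the following:
kubectl -n kube-system get secret cluster-ca-cert -o yaml \
  | sed 's/namespace: kube-system/namespace: elk/' \
  | kubectl -n elk create -f -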
Deploying ELK
The ibm-icplogging Helm chart contains all the Elasticsearch, Logstash, Kibana and Filebeat
components required to deploy an ELK stack to an IBM Cloud Private cluster. The chart
version used in this deployment is 2.2.0, which is the same as the platform ELK stack
deployed during cluster installation in IBM Cloud Private Version 3.1.2 to ensure the images
used throughout the platform are consistent between ELK deployments. This Helm chart can
be retrieved from the mgmt-charts repository, as it is not (at the time of writing) published in
the IBM public Helm chart repository.
The chart will install the following ELK components as pods in the cluster:
client
data
filebeat (one per node)
logstash
master
kibana (optional)
The goal of this example ELK deployment is to provide logging capabilities for applications
running in the production namespace. ELK will be configured to monitor the dedicated
production worker nodes, retrieving log data from applications in that namespace only.
2. If dynamic storage provisioning is enabled in the cluster, this can be used. If dynamic
storage is not available, PersistentVolumes should be created prior to deploying ELK. In
this deployment, the namespace isolation features in IBM Cloud Private 3.1.2 have been
used to create a dedicated worker node for ELK. This means that a LocalVolume
PersistentVolume can be used, as ELK will be running on only one node. Example 5-23 is
a YAML definition for a PersistentVolume that uses LocalVolume, so the data node uses
the /var/lib/icp/applogging/elk-data file system on the hosting dedicated worker node.
Example 5-23 YAML definition for a Persistent Volume using LocalVolume on a management node
apiVersion: v1
kind: PersistentVolume
metadata:
  name: applogging-datanode-172.24.19.212
spec:
  capacity:
    storage: 20Gi            # illustrative size
  accessModes:
  - ReadWriteOnce
  local:
    path: /var/lib/icp/applogging/elk-data
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - 172.24.19.212    # the dedicated node hosting the data path
In this deployment, two of these volumes are created; one for each management node in
the cluster.
3. Retrieve the current platform ELK chart values and save to a local file. These values can
be used to replace the defaults set in the Helm chart itself so that the correct images are
used.
helm get values logging --tls > default-values.yaml
4. Create a file called override-values.yaml. This will be the customized configuration for
ELK and is required to override some of the default values to tailor the resource names,
curator duration, security values or number of replicas of each component in this
deployment. Use the values in Example 5-24 as a template.
Tip: If the kibana-init pod fails, it’s because it could not initialize the default index in Kibana.
This is not a problem, as the default index can be set through the Kibana UI.
8. Set the default index to whichever value you choose. The default is logstash- but this may
change depending on how you modify Logstash in this instance. Note that it is not possible
to set the default index until data with that index actually exists in Elasticsearch, so before
this can be set, ensure log data is sent to Elasticsearch first.
to the following:
output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    index => "%{kubernetes.namespace}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
    ssl => true
    ssl_certificate_verification => true
    keystore => "/usr/share/elasticsearch/config/tls/logstash-elasticsearch-keystore.jks"
    keystore_password => "${APP_KEYSTORE_PASSWORD}"
    truststore => "/usr/share/elasticsearch/config/tls/truststore.jks"
    truststore_password => "${CA_TRUSTSTORE_PASSWORD}"
  }
}
Save and close to update the ConfigMap. Logstash will automatically reload the new
configuration. If it does not reload after 3 minutes, delete the Logstash pod(s) to restart them.
As Logstash does the buffering and transformation of log data, it contains a variety of useful
functions to translate, mask or remove potentially sensitive fields and data from each log
message, such as passwords or host data. More information about mutating the log data can
be found at
https://www.elastic.co/guide/en/logstash/5.5/plugins-filters-mutate.html.
The resulting filters should look similar to the following, which applies to all indices in this ELK
deployment:
filters:
- filtertype: age
  source: name
  direction: older
  timestring: '%Y.%m.%d'
  unit: days
  unit_count: 28
If only namespace1, namespace2 and namespace3 indices should be deleted, you can use a
regex pattern, similar to the following:
filters:
- filtertype: pattern
  kind: regex
  value: '^(namespace1-|namespace2-|namespace3-).*$'
- filtertype: age
  source: name
  direction: older
  timestring: '%Y.%m.%d'
  unit: days
  unit_count: 28
2. Create an ingress using Example 5-26 as a template, replacing the ingress name and
path if necessary.
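As a rough sketch of what such an ingress might contain (the service name and the
management ingress class annotation here are assumptions, so verify them against
Example 5-26):
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: app-kibana-ingress
  namespace: elk
  annotations:
    kubernetes.io/ingress.class: "ibm-icp-management"
spec:
  rules:
  - http:
      paths:
      - path: /app-kibana
        backend:
          serviceName: app-logging-elk-kibana
          servicePort: 5601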
3. Modify the Kibana ConfigMap to add the server.basePath value. This is required to
prevent targeting the default Kibana instance when the /app-kibana ingress path is used
in the browser. Edit the Kibana ConfigMap using kubectl -n elk edit configmap
app-logging-elk-kibana-config and add ‘server.basePath: "/app-kibana"’ anywhere
in the data section.
4. Test the new URL by logging out of IBM Cloud Private and trying to access
https://<master-ip>:8443/app-kibana. This should redirect you to the login page for
IBM Cloud Private.
To do this with Helm, create a file called fb-ns.yaml with the content in Example 5-27.
Use Helm to push the new configuration, passing the default parameters used during initial
chart installation. These values are required as the default chart values are different, and
Helm will use the default chart values as a base for changes.
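A sketch of such an upgrade, assuming the release is named app-logging and the chart
archive is available locally, might be:
helm upgrade app-logging ibm-icplogging-2.2.0.tgz \
  -f default-values.yaml \
  -f override-values.yaml \
  -f fb-ns.yaml \
  --tls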
You can also do this without Helm by modifying Kubernetes resources directly. Edit the
input paths in the app-logging-elk-filebeat-ds-config ConfigMap in the elk namespace. For
example, to add the pre-production namespace, add a line for its log path:
filebeat.prospectors:
- input_type: log
  paths:
  - "/var/log/containers/*_production_*.log"
  - "/var/log/containers/*_pre-production_*.log"
This method may be preferred if additional modifications have been made to the ELK stack.
In the case of option 1, the current Filebeat and Logstash components already deployed can
be repurposed to ship all log data to the external logging system. Depending on the external
system, only Filebeat may need to be retained. The output configuration for Filebeat and
Logstash can be redirected to the external system instead of the platform Elasticsearch, but
additional security certificates may need to be added as volumes and volume mounts to both
Filebeat and/or Logstash, if the external system uses security.
For option 2, depending on the external system, it’s recommended to deploy an additional
Filebeat and Logstash to IBM Cloud Private. It’s possible to add an additional ‘pipeline’ to
Logstash, so that it can stream logs to the platform Elasticsearch and the external system
simultaneously, but this also introduces additional effort to debug in the event one pipeline
fails. With separate deployments, it’s easy to determine which Logstash is failing and why.
Both Filebeat and Logstash have a number of plugins that enable output to a variety of
endpoints. More information about the available output plugins for Filebeat can be found at
https://www.elastic.co/guide/en/beats/filebeat/current/configuring-output.html and
information about the available output plugins for Logstash can be found at
https://www.elastic.co/guide/en/logstash/5.5/output-plugins.html.
In this example, an external Elasticsearch has been set up using HTTP to demonstrate how
Filebeat and Logstash can be used in IBM Cloud Private to send logs to an external ELK.
Filebeat and Logstash will use the default certificates generated by the platform to secure the
communication between these components. Filebeat and Logstash are deployed to a
dedicated namespace. Perform the following steps:
Tip: This step is not needed if Logstash is being deployed to kube-system namespace, as
it automatically inherits this privilege from the ibm-privileged-psp Pod Security Policy.
filter {
  if [type] == "kube-logs" {
    mutate {
      rename => { "message" => "log" }
      remove_field => ["host"]
    }
    date {
      match => ["time", "ISO8601"]
    }
    dissect {
      mapping => {
        "source" => "/var/log/containers/%{kubernetes.pod}_%{kubernetes.namespace}_%{container_file_ext}"
      }
    }
    dissect {
      mapping => {
        "container_file_ext" => "%{container}.%{?file_ext}"
      }
      remove_field => ["host", "container_file_ext"]
    }
    grok {
      "match" => {
        "container" => "^%{DATA:kubernetes.container_name}-(?<kubernetes.container_id>[0-9A-Za-z]{64,64})"
      }
      remove_field => ["container"]
    }
  }
}
filter {
  # Drop empty lines
  if [log] =~ /^\s*$/ {
    drop { }
  }
  # Attempt to parse JSON, but ignore failures and pass entries on as-is
  json {
    source => "log"
  }
}
output {
  elasticsearch {
    hosts => ["9.30.123.123:9200"]
    index => "%{kubernetes.namespace}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
logstash.yml: |-
config.reload.automatic: true
http.host: "0.0.0.0"
path.config: /usr/share/logstash/pipeline
xpack.monitoring.enabled: false
xpack.monitoring.elasticsearch.url: "http://9.30.123.123:9200"
Important: This configuration does not use security. To enable security using keystores,
add the following section to output.elasticsearch:
ssl => true
ssl_certificate_verification => true
keystore => "/usr/share/elasticsearch/config/tls/keystore.jks"
keystore_password => "${APP_KEYSTORE_PASSWORD}"
truststore => "/usr/share/elasticsearch/config/tls/truststore.jks"
truststore_password => "${CA_TRUSTSTORE_PASSWORD}"
This output uses the Elasticsearch plug-in. A list of all available plug-ins can be found at
https://www.elastic.co/guide/en/logstash/5.5/output-plugins.html. For example, to
use a generic HTTP endpoint, use the following example:
output {
  http {
    # The endpoint URL and options shown here are illustrative
    url => "http://example.com:8080/logs"
    http_method => "post"
    format => "json"
  }
}
Option 1 is the recommended option as all the log data is sent to stdout and stderr, which also
gives the advantage of allowing tools such as kubectl logs to read the log data, as Docker
handles storing stdout and stderr data on the host, which is read by kubelet. This data is also
automatically sent to Elasticsearch using the default logging mechanisms. See Figure 5-30.
Option 2 is also a useful solution, as the data from specific log files can be parsed or
transformed in the side car and pushed directly to a logging solution, whether it’s the platform
ELK, an application dedicated logging system or an external logging system entirely. See
Figure 5-31.
Important: At the time of writing, the current Filebeat image runs as the root user within
the container, so ensure this complies with your security policies before giving the
namespace access to a PodSecurityPolicy that allows containers to be run with the root
user.
Using a Filebeat side car to forward log file messages to stdout and stderr
This example will use a simple WebSphere Liberty deployment and Filebeat side car to send
log data from a file populated by WebSphere within the WebSphere container to stdout, to
simulate typical application logging. This functionality can be achieved in a similar way, by
mounting another image, such as busybox, to tail the file and redirect to stdout in the busybox
container, but this is not scalable and requires multiple busybox side car containers for
multiple log files. Filebeat has a scalability advantage, as well as advanced data processing to
output the data to stdout and stderr flexibly.
To create a WebSphere and Filebeat side car Deployment, perform the following steps:
1. Create a new namespace called sidecar for this example:
kubectl create namespace sidecar
2. Creating a ConfigMap for the Filebeat configuration allows you to reuse the same settings
for multiple deployments without redefining an instance of Filebeat every time.
Alternatively, you can create one ConfigMap per deployment if your deployment requires
very specific variable settings in Filebeat. This ConfigMap will be consumed by the
Filebeat container as its core configuration data.
Create the ConfigMap in Example 5-33 to store the Filebeat configuration using kubectl
create -f filebeat-sidecar-config.yaml.
multiline.pattern: '${MULTILINE_PATTERN:^\s}'
multiline.match: '${MULTILINE_MATCH:after}'
multiline.negate: '${MULTILINE_NEGATE:false}'
filebeat.config.modules:
# Set to true to enable config reloading
reload.enabled: true
output.console:
codec.format:
string: '%{[message]}'
logging.level: '${LOG_LEVEL:info}'
After a few minutes (to cater for the containers starting up) the log data should now be visible
in Kibana after the platform Filebeat instance has successfully collected logs from Docker and
pushed them through the default logging mechanism. See Figure 5-32 on page 216.
Important: These commands will copy the certificates used within the target ELK stack,
which could be used to access Elasticsearch itself. If this does not conform to security
standards, consider deploying a dedicated ELK stack for a specific application, user or
namespace and provide these certificates for that stack.
ignore_older: '${IGNORE_OLDER:0}'
scan_frequency: '${SCAN_FREQUENCY:10s}'
symlinks: '${SYMLINKS:true}'
max_bytes: '${MAX_BYTES:10485760}'
harvester_buffer_size: '${HARVESTER_BUFFER_SIZE:16384}'
multiline.pattern: '${MULTILINE_PATTERN:^\s}'
multiline.match: '${MULTILINE_MATCH:after}'
multiline.negate: '${MULTILINE_NEGATE:false}'
fields_under_root: '${FIELDS_UNDER_ROOT:true}'
fields:
type: '${FIELDS_TYPE:kube-logs}'
node.hostname: '${NODE_HOSTNAME}'
pod.ip: '${POD_IP}'
kubernetes.namespace: '${NAMESPACE}'
kubernetes.pod: '${POD_NAME}'
tags: '${TAGS:sidecar-ls}'
filebeat.config.modules:
# Set to true to enable config reloading
reload.enabled: true
output.logstash:
logging.level: '${LOG_LEVEL:info}'
Alternatively, download the deployment YAML from:
https://github.com/IBMRedbooks/SG248440-IBM-Cloud-Private-System-Administrator-s-Guide/tree/master/Ch7-Logging-and-monitoring/Forwarding-logs-from-application-log-files/logstash/websphere-liberty-fb-sidecar-logstash-deployment.yaml
This is similar to Example 5-34, but with a few key differences. First, the certificates in the
logging-elk-certs Secret are mounted as volumes in the filebeat-sidecar container
definition, allowing it to communicate securely using TLS with the Logstash instance.
Second, the introduction of additional environment variables provides the Filebeat side car
with additional information that is forwarded to Logstash.
- name: NAMESPACE
  valueFrom:
    fieldRef:
      fieldPath: metadata.namespace
- name: POD_NAME
  valueFrom:
    fieldRef:
      fieldPath: metadata.name
Without this, the log entries in Elasticsearch would not contain the namespace and pod
name fields.
After a few minutes (to cater for the containers starting up) the log data should now be visible
in Kibana after the platform Filebeat instance has successfully collected logs from Docker and
pushed them through the default logging mechanism. See Figure 5-33.
Prometheus discovers targets to scrape from service discovery. The scrape discovery
manager is a discovery manager that uses Prometheus’s service discovery functionality to
find and continuously update the list of targets from which Prometheus should scrape metrics.
It runs independently of the scrape manager which performs the actual target scrape and
feeds it with a stream of target group updates over a synchronization channel.
Prometheus stores time series samples in a local time series database (TSDB) and optionally
also forwards a copy of all samples to a set of configurable remote endpoints. Similarly,
Prometheus reads data from the local TSDB and optionally also from remote endpoints. The
scrape manager is responsible for scraping metrics from discovered monitoring targets and
forwarding the resulting samples to the storage subsystem.
Every time series is uniquely identified by its metric name and a set of key-value pairs, also
known as labels. The metric name specifies the general feature of a system that is measured
(for example http_requests_total - the total number of HTTP requests received). It may
contain ASCII letters and digits, as well as underscores and colons. It must match the regex
[a-zA-Z_:][a-zA-Z0-9_:]*.
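For example, the following notation identifies a single time series of the http_requests_total
metric by its labels:
http_requests_total{method="POST", handler="/messages"}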
The Prometheus client libraries offer four core metric types. These are currently only
differentiated in the client libraries (to enable APIs tailored to the usage of the specific types)
and in the wire protocol: These types are:
Counter - A counter is a cumulative metric that represents a single monotonically
increasing counter whose value can only increase or be reset to zero on restart. For
example, you can use a counter to represent the number of requests served, tasks
completed, or errors.
Gauge - A gauge is a metric that represents a single numerical value that can arbitrarily
go up and down. Gauges are typically used for measured values like temperatures or
current memory usage.
Histogram - A histogram samples observations (for example, request durations or
response sizes) and counts them in configurable buckets, also providing a sum of all
observed values.
Summary - Similar to a histogram, a summary samples observations and additionally
calculates configurable quantiles over a sliding time window.
The PromQL engine is responsible for evaluating PromQL expression queries against
Prometheus’s time series database. The engine does not run as its own actor goroutine, but
is used as a library from the web interface and the rule manager. PromQL evaluation happens
in multiple phases: when a query is created, its expression is parsed into an abstract syntax
tree and results in an executable query. The subsequent execution phase first looks up and
creates iterators for the necessary time series from the underlying storage. It then evaluates
the PromQL expression on the iterators. Actual time series bulk data retrieval happens lazily
during evaluation (at least in the case of the local TSDB (time series database)). Expression
evaluation returns a PromQL expression type, which most commonly is an instant vector or
range vector of time series.
Prometheus serves its web UI and API on port 9090 by default. The web UI is available at /
and serves a human-usable interface for running expression queries, inspecting active alerts,
or getting other insight into the status of the Prometheus server.
https://prometheus.io/docs/introduction/overview/
IBM Cloud Private provides the exporters listed in Table 5-6 to expose metrics.
The rule manager in Prometheus is responsible for evaluating recording and alerting rules on
a periodic basis (as configured using the evaluation_interval configuration file setting). It
evaluates all rules on every iteration using PromQL and writes the resulting time series back
into the storage. The notifier takes alerts generated by the rule manager via its Send()
method, enqueues them, and forwards them to all configured Alertmanager instances. The
notifier serves to decouple generation of alerts from dispatching them to Alertmanager (which
may fail or take time).
The Alertmanager handles alerts sent by client applications such as the Prometheus server. It
takes care of deduplicating, grouping, and routing them to the correct receiver integration.
The following describes the core concepts the Alertmanager implements:
Grouping - Grouping categorizes alerts of similar nature into a single notification. This is
especially useful during larger outages when many systems fail at once and hundreds to
thousands of alerts may be firing simultaneously.
Silences - Silences are a straightforward way to simply mute alerts for a given time. A silence
is configured based on matchers, just like the routing tree. Incoming alerts are checked
whether they match all the equality or regular expression matchers of an active silence. If they
do, no notifications will be sent out for that alert.
https://prometheus.io/docs/alerting/overview/
Prebuilt Grafana dashboards for Prometheus are available at
https://grafana.com/dashboards?dataSource=prometheus
A Prometheus data source can be configured in Grafana and its data used in graphs. For
more details, refer to the following document:
https://prometheus.io/docs/visualization/grafana/
IBM Cloud Private provides the Grafana dashboards listed in Table 5-7.
The resulting entry in config.yaml might resemble the YAML in Example 5-37.
When the rule is created, you will be able to see the new alert in the Alerts tab in the
Prometheus dashboard.
This example will configure Alertmanager to send notifications to Slack. The Alertmanager
uses the Incoming Webhooks feature of Slack, so first we need to set that up. Go to the
Incoming Webhooks page in the App Directory and click Install (or Configure and then Add
Configuration if it is already installed). Once the channel is configured for the incoming
webhook, Slack provides a webhook URL. This URL has to be configured in the
Alertmanager configuration.
On the IBM Cloud Private boot node (or wherever kubectl is installed) run the following
command to pull the current ConfigMap data into a local file. Example 5-43 shows how to get
the alertmanager ConfigMap.
Example 5-43 Pull the current ConfigMap data into a local file
kubectl get configmap monitoring-prometheus-alertmanager -n kube-system -o yaml >
monitoring-prometheus-alertmanager.yaml
Example 5-44 on page 231 shows the Alertmanager ConfigMap with updated Slack
configuration including the webhook URL and channel name. Save this file and run the
command in Example 5-45 to update the Alertmanager configuration.
Figure 5-37 on page 233 shows that the Node memory usage alert is sent as a notification on
Slack for the operations teams to look at.
Users can use existing dashboards like Prometheus stats to see various statistics of the
Prometheus monitoring system. Figure 5-38 shows the Prometheus statistics dashboard.
https://grafana.com/dashboards/8588
Configure the data source to Prometheus and set a unique identifier for the dashboard.
Alternatively, you can import the dashboard by exporting the JSON from grafana.com and
importing it inside your Grafana user interface.
Chapter 6. Security
This chapter describes some of the recommended practices for implementing IBM Cloud
Private security. It contains the following sections:
6.1, “How IBM Cloud Private handles authentication” on page 238
6.2, “How authorization is handled in IBM Cloud Private” on page 239
6.3, “Isolation on IBM Cloud Private” on page 241
6.4, “The significance of the admission controller in IBM Cloud Private” on page 246
6.5, “Image security” on page 248
You can assign a service account to a pod by specifying the account’s name in the pod
manifest. If you don’t assign it explicitly, the pod will use the default service account in the
namespace.
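For example, a minimal pod manifest that assigns a (hypothetical) service account looks like
the following sketch:
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  serviceAccountName: demo-sa    # hypothetical account; the namespace default is used if omitted
  containers:
  - name: main
    image: ibmcom/icp-nodejs-sample   # illustrative image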
IBM Cloud Private supports the following two authentication protocols for users:
OIDC (OpenID Connect) based authentication
SAML (Security Assertion Markup Language) based federated authentication
With IBM Cloud Private, you can authenticate across multiple LDAPs. You can add multiple
directory entries to the LDAP config in the server.xml file. Liberty automatically resolves the
domain name from the login and authenticates against the targeted LDAP directory. IBM
Cloud Private users and user groups are associated with an enterprise directory during the
time of the user and user group onboarding via import. When the new LDAP directory entry is
created, the domain name also gets added as a new entry. At the time of login, you can
specify the domain against which this authentication should be validated.
For more information on configuring LDAP connection with IBM Cloud Private see the
following document:
https://www.ibm.com/support/knowledgecenter/SSBS6K_3.1.2/user_management/configure
_ldap.html?view=kc
For more information about SAML-based federated authentication, see the following
documents:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/user_management/saml_config.html
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/user_management/saml.html
It is possible to fetch the resources that are attached to a specific user through their teams by
using the https://icp-ip:8443/idmgmt/identity/api/v1/users/{id}/getTeamResources API.
Let us take an example. We will create a team in IBM Cloud Private, onboard an LDAP user
into the team, and assign a role to the user.
1. Create a team named icp-team.
2. Add user carlos to the team icp-team and assign the Administrator role for this team. See
Figure 6-2.
This concludes the scenario. For details on which role has permission to perform actions
on which resources, see the following URL:
https://www.ibm.com/support/knowledgecenter/SSBS6K_3.1.2/user_management/assign_ro
le.html
Within a team, each user or user group can have only one role. However, a user might have
multiple roles within a team when you add a user both individually and also as a member of a
team’s group. In that case, the user can act based on the highest role that is assigned to the
user. For example, if you add a user as an Administrator and also assign the Viewer role to
the user’s group, the user can act as an Administrator for the team.
The following are some of the key prerequisites to achieve isolation of deployments on cluster
nodes.
IBM Cloud Private provides several levels of multi-tenancy. The cluster administrator must
analyze workload requirements to determine which levels are required. The following isolation
features can be used to satisfy these requirements:
Host groups: As part of preinstall configuration, the cluster administrator can configure
groups of nodes into worker host groups and proxy host groups. This operation also involves
pre-planning the namespaces, as each host group is mapped to a namespace.
6.3.1 Scenarios
Let us consider the following scenario. In an organization, there are two teams, team1 and
team2, that want to enforce user, compute, and network isolation by confining workload
deployments to virtual and physical resources. During the IBM Cloud Private installation we
created isolated worker and proxy nodes. Post installation, we onboarded team members
from both teams in IBM Cloud Private and assigned them the namespaces ns-team1 and
ns-team2 respectively.
Namespace ns-team1 is bound to the worker node workerteam1 and the proxy node
proxyteam1.
Namespace ns-team2 is bound to the worker node workerteam2 and the proxy node
proxyteam2.
Example 6-1 shows that team1 and team2 have dedicated worker and proxy nodes.
For both teams, user and group information is onboarded from LDAP into IBM Cloud
Private. Deployments done by users of team1 go into the namespace ns-team1 and are
deployed on the worker node workerteam1, using the proxy node proxyteam1.
Figure 6-4 Team1 and team2 from LDAP have been onboarded
So, in the above scenario we were able to achieve the following isolation features:
Host groups: We are able to isolate proxy node and worker node for each team.
VLAN subnet: Worker and proxy nodes are in the same subnet for each team. Team1 is
using the subnet 172.16.236.0/24 and team2 is using the subnet 172.16.16.0/24.
Namespaces: Both teams have been assigned to different namespaces for logical
grouping of resources. This means team1 has been assigned to the namespace ns-team1
and team2 has been assigned to the namespace ns-team2.
Network ingress controllers: Both teams have isolated proxy nodes, so they will use an
ingress controller from their respective proxy nodes.
Users, user groups and teams: Both teams can onboard any group and users from
LDAP into a team. The cluster administrator can create new teams.
Note that the pods deployed from team1 can still talk to the pods of team2. For example, both
teams deployed the Node.js sample application from the IBM Cloud Private Helm chart. To
stop the communication between the pods of team1 and team2 we should execute the
following steps in this order.
See Example 6-2, Example 6-3, Example 6-4 on page 244, Example 6-5 on page 244, and
Example 6-6 on page 244. First we start with some prerequisite steps.
Example 6-5 Getting the service details of the pod deployed by team2
root@acerate1:~# kubectl get svc -n ns-team2 | awk {' print $1" " $5'} | column -t
NAME PORT(S)
nodejs-deployment-team2-nodejssample-nodejs 3000:30905/TCP
Example 6-6 Accessing the pod of team2 from the pod of team1
root@acerate1:~# kubectl exec -it nodejs-deployment-team1-nodejssample-nodejs-c856dff96-z84gv -n
ns-team1 -- /bin/bash -c "curl 10.1.219.133:3000"
<!--
Licensed Materials - Property of IBM
(C) Copyright IBM Corp. 2018. All Rights Reserved.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP
Schedule Contract with IBM Corp.
-->
<!DOCTYPE html>
<html lang="en">
<head>
....
....
<div class="footer">
<p>Node.js is a trademark of Joyent, Inc. and is used with its permission. We are not
endorsed by or affiliated with Joyent.</p>
</div>
</div>
</body>
At this point, we need to create network policies to create firewall rules at the namespace
scope. This will stop the communication between pods of team1 and team2. Example 6-7,
Example 6-8, Example 6-9 on page 245, Example 6-10 on page 245, and Example 6-11 on
page 245 show how to do it.
Example 6-7 Patching the namespace for team1 with label name: ns-team1
apiVersion: v1
kind: Namespace
metadata:
  name: ns-team1
  labels:
    name: ns-team1
Example 6-8 Patching the namespace for team2 with label name: ns-team2
apiVersion: v1
kind: Namespace
metadata:
  name: ns-team2
  labels:
    name: ns-team2
Example 6-10 Creating a network policy for team2 to stop communication from any pods except its own pods
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: networkpolicy-team2
  namespace: ns-team2
spec:
  policyTypes:
  - Ingress
  podSelector: {}
  ingress:
  - from:
    - namespaceSelector:
        matchLabels:
          name: ns-team2
Example 6-11 Trying to access the pod of team2 from the pod of team1 or vice versa will fail
root@acerate1:~# kubectl exec -it nodejs-deployment-team1-nodejssample-nodejs-c856dff96-z84gv -n
ns-team1 -- /bin/bash -c "curl 10.1.219.133:3000"
curl: (7) Failed to connect to 10.1.219.133 port 3000: Connection timed out
If team1 wants to use a different level of security for its pods than team2, it can create its
own pod security policy and bind it to its namespace. For detailed steps see the following URL:
https://www.ibm.com/support/knowledgecenter/SSBS6K_3.1.1/user_management/create_na
mespace_pspbind.html.
Admission control is where an administrator can really start to wrangle the users’ workloads.
With admission control, you can limit the resources, enforce policies, and enable advanced
features.
In the following sections you can find some examples of admission controllers.
The pod security policies allow cluster administrators to create pod isolation policies and
assign them to namespaces and worker nodes. IBM Cloud Private provides predefined
policies that you can apply to your pod by associating them with a namespace during the
namespace creation. These predefined pod security policies apply to most of the IBM content
charts.
The following list shows the types and descriptions that range from the most restrictive to the
least restrictive:
ibm-restricted-psp: This policy requires pods to run with a non-root user ID, and prevents
pods from accessing the host.
ibm-anyuid-psp: This policy allows pods to run with any user ID and group ID, but
prevents access to the host.
ibm-anyuid-hostpath-psp: This policy allows pods to run with any user ID and group ID
and any volume, including the host path.
Attention: This policy allows hostPath volumes. Ensure that this is the level of access
that you want to provide.
ibm-anyuid-hostaccess-psp: This policy allows pods to run with any user ID and group
ID, any volume, and full access to the host.
Attention: This policy allows full access to the host and network. Ensure that this is the
level of access that you want to provide.
ibm-privileged-psp: This policy allows pods to run with full privileged access.
Attention: This policy is the least restrictive and must be used only for cluster
administration. Use with caution.
If you install IBM Cloud Private version 3.1.1 or later as a new installation, the default pod
security policy setting is restricted. When it is restricted, the ibm-restricted-psp policy is
applied by default to all of the existing and newly created namespaces. You can also create
your own pod security policy. For more information see the following link:
https://www.ibm.com/support/knowledgecenter/SSBS6K_3.1.1/manage_cluster/enable_pod
_security.html.
In the following you can find some of the recommended practices when using the pod security
policy:
6.4.2 ResourceQuota
This admission controller will observe the incoming requests and ensure that they do not
violate any of the constraints enumerated in the ResourceQuota object in a namespace.
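A minimal sketch of such a quota, with illustrative limits, might be:
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: ns-team1
spec:
  hard:
    pods: "10"
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi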
6.4.4 AlwaysPullImages
This admission controller modifies every new pod to force the image pull policy to Always.
This is useful in a multitenant cluster so that users can be assured that their private images
can be used only by those who have the credentials to pull them. Without this admission
controller, once an image has been pulled to a node, any pod from any user can use it simply
by knowing the image’s name (assuming the pod is scheduled onto the right node), without
any authorization check against the image. When this admission controller is enabled,
images are always pulled prior to starting containers, which means valid credentials are
required.
IBM Cloud Private supports all of the Kubernetes admission controllers. For more details see
the following link:
https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/
<cluster_CA_domain> is the certificate authority (CA) domain that was set in the config.yaml
file during installation. If you did not specify a CA domain name, the default value is
mycluster.icp.
You can push or pull the image only if the namespace resource is assigned to a team for
which you have the correct role. Administrators and operators can push or pull the image.
Editors and viewers can pull images. Unless you specify an imagePullSecret, you can access
the image only from the namespace that hosts it.
The service account defined in a PodSpec can pull the image from the same namespace
under the following conditions:
The PodSpec uses the default service account.
The service account is patched with a valid image pull secret.
The PodSpec includes the name of a valid image pull secret.
The image scope is changed to global after the image is pushed.
To change the scope from global to namespace, run the following command:
kubectl get image <image name> -n=namespace -o yaml \
| sed 's/scope: global/scope: namespace/g' | kubectl replace -f -
where <image name> value is the name of the image for which scope is changed.
If a namespace policy does not exist, then the cluster policy is applied. If the namespace
policy and the cluster policy overlap, the cluster scope is ignored. If neither a cluster nor a
namespace scope policy exists, your deployment fails to start.
Pods that are deployed to namespaces, which are reserved for the IBM Cloud Private
services, bypass the container image security check.
The following namespaces are reserved for the IBM Cloud Private services:
kube-system
cert-manager
istio-system
In this example, repository_name specifies the name of the repository from which the image
is pulled. A wildcard (*) character is allowed in the repository name. This wildcard (*)
character denotes that the images from all of the repositories are allowed or trusted. To set all
your repositories to trusted, set the repository name to (*) and omit the policy subsections.
Repositories by default require a policy check, with the exception of the default
mycluster.icp:8500 repository. An empty or blank repository name value blocks deployment
of all the images.
When va is set to enabled: true (see Example 6-12 on page 250), the vulnerability advisor
policy is enforced. It works only for the default IBM Cloud Private built-in container registry.
With other image registries this option should be false, otherwise the image will not be pulled.
ClusterImagePolicy or ImagePolicy for a namespace can be viewed or edited like any other
Kubernetes object.
Let us check this out with an example: Hello-world image enforcement with kubectl. We run
the sample hello-world docker image, as shown in Example 6-13.
The security enforcement hook blocks the running of the image. This is a great security
feature enhancement that prevents any unwanted images to be run in the IBM Cloud Private
cluster.
Now we will create a whitelist to enable this specific Docker image from Docker Hub. For
each repository from which you want to enable the pulling of images, you have to define the
name of the repository, where the wildcard (*) is allowed. You can also define the policy of the
Vulnerability Advisor (VA). If you set the VA enforcement to true, then only those images
that have passed the vulnerability scanning can be pulled. Otherwise, they will be denied.
Here we create an image policy which applies to the default namespace. We give the exact
match of the image and set the VA policy disabled. Save it as image-policy.yaml, and apply it
with the kubectl apply -f image-policy.yaml command.
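A sketch of such a policy, assuming the API group used by the IBM container image security
enforcement admission hook, might look like the following:
apiVersion: securityenforcement.admission.cloud.ibm.com/v1beta1
kind: ImagePolicy
metadata:
  name: my-cluster-images-whitelist
  namespace: default
spec:
  repositories:
  - name: docker.io/library/hello-world:latest
    policy:
      va:
        enabled: false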
Now run the same command again. You will see Docker’s hello-world message as shown
in Example 6-15.
To try something more ambitious, you can run an Ubuntu container with:
$ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker ID:
https://hub.docker.com/
Perform the following steps for the hello-world image enforcement example with dashboard:
1. Log in to the dashboard.
2. Go to Manage Resource Security →Image Policies.
3. Select the my-cluster-images-whitelist that was created with the kubectl command.
4. Remove the policy as shown in Figure 6-6 on page 253.
Example 6-17 shows the list of the images that are allowed at the cluster level by default in
IBM Cloud Private 3.1.2.
Example 6-17 Default enabled image list in IBM Cloud Private 3.1.2
<your icp cluster name>:8500/*
registry.bluemix.net/ibm/*
cp.icr.io/cp/*
docker.io/apache/couchdb*
docker.io/ppc64le/*
docker.io/amd64/busybox*
docker.io/vault:*
docker.io/consul:*
docker.io/python:*
docker.io/centos:*
docker.io/postgres:*
docker.io/hybridcloudibm/*
docker.io/ibmcom/*
docker.io/db2eventstore/*
docker.io/icpdashdb/*
docker.io/store/ibmcorp/*
docker.io/alpine*
docker.io/busybox*
docker.io/dduportal/bats:*
docker.io/cassandra:*
docker.io/haproxy:*
docker.io/hazelcast/hazelcast:*
docker.io/library/busybox:*
docker.io/minio/mc:*
docker.io/minio/minio:*
docker.io/nginx:*
docker.io/open-liberty:*
docker.io/openwhisk/*
docker.io/rabbitmq:*
docker.io/radial/busyboxplus:*
docker.io/ubuntu*
docker.io/websphere-liberty:*
docker.io/wurstmeister/kafka:*
docker.io/zookeeper:*
For any image that is not in the default list, you have to create either a namespace based
image policy or a cluster level policy to allow the image to run.
Chapter 7. Networking
This chapter provides an overview of the network components in IBM Cloud Private and
discusses how communication flows between the pods and the external network and how a
pod is exposed to the network.
A pod could consist of various containers that are collocated at the same host, share the
same network stack, and share other resources (such as volumes). From an application
developer point of view, all containers in a Kubernetes cluster are considered on a flat subnet.
Figure 7-1 shows the communication coming from the physical interface (eth0) going to the
docker0 bridge and then to the virtual network.
Each Docker instance creates its own network that enables pod communication.
The next sections demonstrate the networking concepts in Kubernetes, such as pod
networking, load balancing, and ingress.
1 Some of the content in this chapter is based on the following GitHub document created by Rachappa Goni, Eswara Kosaraju, Santosh Ananda, and Jeffrey Kwong.
It is important to note that all containers in a single pod share the same port space. If
container A uses port 80, you cannot have container B inside the same pod that uses port 80
as well. Using the same port would only work if these containers are on different pods. An
additional component, called the pod network namespace, is provided by the Kubernetes
pause container. This component creates and owns the network namespace.
Another component that is present on the pod network is the container network interface
(CNI). This consists of a plug-in that connects the pod network namespace to the rest of the
pods in the Kubernetes cluster. The most widely used CNI plug-ins in IBM Cloud Private are
Calico and NSX-T.
7.2.1 Calico
Calico enables networking and network policy in Kubernetes clusters across the cloud. It
combines flexible networking capabilities with run anywhere security enforcement, providing
performance similar to a native kernel and enabling real cloud-native scalability. Calico is
implemented without encapsulation or overlays, providing high-performance networking. It
also provides a Network security policy for Kubernetes pods through its distributed firewall.
When using Calico, the policy engine enforces the same policy model at the host networking
layer (and at the service mesh if using Istio), helping to protect the infrastructure from
compromised workloads.
Because Calico uses the Linux Kernel’s forwarding and access control, it provides a
high-performance solution without the resources used by encapsulation and decapsulation.
Calico creates a flat layer-3 network, and assigns a fully routable IP address to every pod. To
do that, it divides a large network of CIDR (Classless Inter-Domain Routing) into smaller
blocks of IP addresses, and assigns one or more of these smaller blocks to the nodes in the
cluster. This configuration is specified at the IBM Cloud Private installation time using the
network_cidr parameter in config.yaml in CIDR notation.
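For example, a typical entry in config.yaml (the value shown is illustrative) is:
network_cidr: 10.1.0.0/16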
By default, Calico creates a BGP (Border Gateway Protocol) mesh between all nodes of the
cluster, and broadcasts the routes for container networks to all of the worker nodes. Each
node is configured to act as a layer 3 gateway for the subnet assigned to the worker node,
and serves the connectivity to pod subnets hosted on the host.
All nodes participate in the BGP mesh, which advertises the local routes that the worker node
owns to the rest of the nodes. BGP peers external to the cluster can participate in this mesh
as well, but the size of the cluster affects how many BGP advertisements these external peers
will receive. Route reflectors can be required when the cluster scales past a certain size.
When routing the pod traffic, Calico uses the system capabilities, such as the node’s local
route tables and iptables. All pod traffic traverses iptables rules before they are routed to
their destination.
Calico maintains its state using an etcd key/value store. By default, in IBM Cloud Private
Calico uses the same etcd key/value store as Kubernetes to store the policy and network
configuration states.
However, in environments that do not require an overlay, IP-in-IP tunneling should be disabled
to remove the packet encapsulation resource use, and enable any physical routing
infrastructure to do packet inspection for compliance and audit. In such scenarios, the
underlay network should be made aware of the additional pod subnets by adding the underlay
network routers to the BGP mesh. See the discussion about this in “Calico components”.
Calico components
The Calico network is created by the following three components: the node agent, the CNI
plug-in, and the kube-controller.
Calico/node agent
This entity has three components: felix, bird, and confd.
Felix’s primary responsibility is to program the host’s iptables and routes to provide the
wanted connectivity to and from the pods on that host.
Bird is an open source BGP agent for Linux that is used to exchange routing information
between the hosts. The routes that are programmed by felix are picked up by bird and
distributed among the cluster hosts.
Confd monitors the etcd data store for changes to the BGP configuration, such as IPAM
information and AS number, and accordingly changes the bird configuration files and
triggers bird to reload these files on each host. The calico/node creates veth-pairs to
connect the Pod network namespace with the host’s default network namespace.
Calico/cni
The CNI plug-in provides the IP address management (IPAM) functionality by provisioning IP
addresses for the Pods hosted on the nodes.
Calico/kube-controller
The calico/kube-controller watches Kubernetes NetworkPolicy objects and keeps the Calico
data store in sync with the Kubernetes. calico/node runs on each node and uses the
information in the Calico etcd data store and program the local iptables accordingly.
There are two scenarios for Calico pod communication:
Calico can be configured to create IP-in-IP tunnel end points on each node for every
subnet hosted on the node. Any packet originated by the pod and egressing the node is
encapsulated with the IP-in-IP header, and the node IP address is utilized as the source.
This way, the infrastructure router does not see the pod IP addresses.
The IP-in-IP tunneling brings in extra resource use in terms of network throughput and
latency due to the additional packet resource and processing at each endpoint to
encapsulate and decapsulate packets.
7.2.2 NSX-T
NSX-T is a network virtualization and security platform that automates the implementation of
network policies, network objects, network isolation, and micro-segmentation.
L2 and L3 segregation
NSX-T creates a separate L2 switch (virtual distributed switch, or VDS) and L3 router
(distributed logical router, or DLR) for every namespace. The namespace level router is called
a T1 router. All T1 routers are connected to the T0 router, which acts like an edge gateway to
the IBM Cloud Private cluster, as well as an edge firewall and load balancer. Due to separate
L2 switches, all the broadcast traffic is confined to the namespace. In addition, due to the
separate L3 router, each namespace can host its own pod IP subnet.
NAT pools
The edge appliance is an important component of the NSX-T management cluster. It offers
routing, firewall, load balancing, and network address translation, among other features. By
creating pods on the NSX-T pod network (and not relying on the host network), all traffic can
be traversed though the edge appliance using its firewall, load balancing, and network
address translation capabilities.
The edge appliance assigns SNAT (Source Network Address Translation) IPs to the outbound
traffic, and DNAT (Destination Network Address Translation) IPs to the inbound traffic from
the NAT pool (created as part of the NSX-T deployment). By relying on the network address
translation, the cluster node IPs are not exposed on the outbound traffic.
When using an external load balancer, the master load balancer should monitor the
Kubernetes API server port 8001 for health on all master nodes, and the load balancer needs
to be configured to accept connections on the following ports and forward them to the
master nodes:
8001 (Kubernetes API)
8443 (platform UI)
9443 (authentication service)
8500 and 8600 (private registry)
When using an external load balancer, each master node can be in different subnets if the
round-trip network time between the master nodes is less than 33 ms for etcd. Figure 7-3 on
page 263 illustrates the load balancer option.
This setting is done once as part of the installation of IBM Cloud Private, using the
vip_manager setting in config.yaml. For ucarp and keepalived, the advertisements happen
on the management interface, and the virtual IP will be held on the interface provided by
cluster_vip_iface and proxy_vip_iface. In situations where the virtual IP will be accepting
a high load of client traffic, the management network performing the advertisements for
master election should be separate from the data network accepting client traffic.
The etcd virtual IP manager is implemented as an etcd client that uses a key/value pair. The
current master or proxy node holding the virtual IP address acquires a lease to this key/value
pair with a TTL of 8 seconds. The other standby master or proxy nodes observe the lease
key/value pair.
If the lease expires without being renewed, the standby nodes assume that the first master
has failed and attempt to acquire their own lease to the key to be the new master node. The
master node that is successful writing the key brings up the virtual IP address. The algorithm
uses randomized election timeout to reduce the chance of any racing condition where one or
more nodes tries to become the leader of the cluster.
Note: Gratuitous ARP is not used with the etcd virtual IP manager when it fails over.
Therefore, any existing client connections to the virtual IP address after it fails over will fail
until the client’s ARP cache has expired and the MAC address for the new holder of the
virtual IP is acquired. However, the etcd virtual IP manager avoids the use of multicast,
which ucarp and keepalived require.
Each node sends out an advertisement message on its network interface every few seconds,
stating that it can hold the virtual IP address. This interval is called the advertise base. Each
master node sends a skew value with that CARP (Common Address Redundancy Protocol)
message, which represents its priority for holding that IP: the advskew (advertising skew).
When two or more systems are both advertising at one-second intervals (advbase=1), the
one with the lower advskew will win.
Any ties are broken by the node that has the lower IP address. For high availability, moving
one address between several nodes in this manner enables you to survive the outage of a
host, but you must remember that this only enables you to be more available and not more
scalable.
A master node will become master if one of the following conditions occurs:
No one else advertises for 3 times its own advertisement interval (advbase).
The --preempt option is specified by the user, and it “hears” a master with a longer
(advertisement) interval (or the same advbase but a higher advskew).
After failover, ucarp sends a gratuitous ARP message to all of its neighbors so that they can
update their ARP caches with the new master’s MAC address.
VRRP is a fundamental brick for failover. The keepalived virtual IP manager implements a set
of hooks to the VRRP finite state machine, providing low-level and high-speed protocol
interactions.
To ensure stability, the keepalived daemon is split into the following parts:
A parent process, called the watchdog, in charge of monitoring the forked child processes.
A child process for VRRP.
Another child process for health checking.
The keepalived configuration included with IBM Cloud Private uses the multicast address
224.0.0.18 and IP protocol number 112. This must be allowed in the network segment where
the master advertisements are made. Keepalived also generates a password for
authentication between the master candidates which is the MD5 sum of the virtual IP.
Keepalived by default uses the final octet of the virtual IP address as the virtual router ID
(VRID). For example, for a virtual IP address of 192.168.10.50, it uses VRID 50. If there are
any other devices using VRRP on the management layer 2 segment that are using this VRID,
it might be necessary to change the virtual IP address to avoid conflicts.
Note: Ingress controller in an IBM Cloud Private environment is also known as the proxy
node.
More than one ingress controller can also be deployed if isolation between namespaces is
required. The ingress controller itself is a container deployment that can be scaled out and is
exposed on a host port on the proxy nodes. The ingress controller can proxy all of the pod
and service IP mesh running in the cluster.
IBM Cloud Private installation defines some node roles dedicated to running the shared IBM
Cloud Private ingress controller called proxy nodes. These nodes serve as a layer 7 reverse
proxy for the workload running in the cluster. In situations where an external load balancer
can be used, this is the suggested configuration, because it can be difficult to secure and
scale proxy nodes, and using a load balancer avoids additional network hops through proxy
nodes to the pods running the actual application.
For example, the proxy nodes are defined in the [proxy] section of the installation hosts file:
[proxy]
172.21.13.110
172.21.13.111
172.21.13.112
If an ingress controller and ingress resources are required to aggregate several services that
use the built-in ingress resources, a good practice is to install additional isolated ingress
controllers using the included Helm chart for the namespace, and expose these individually
through the external load balancer.
In this case, all traffic on the ingress controller’s address and port (80 or 443) will be
forwarded to this service.
Simple fanout
With the simple fanout approach you can define multiple HTTP services at different paths and
provide a single proxy that routes to the correct endpoints in the back end. When there is a
highly available load balancer managing the traffic, this type of ingress resource will be helpful
in reducing the number of load balancers to a minimum.
In Example 7-3 on page 267, "/" is the rewrite target for two services: employee-api on
port 4191 and manager-api on port 9090. The context root for both of these services is at /;
the ingress will rewrite the path /hr/employee/* and /hr/manager/* to / when proxying the
requests to the back ends.
Example 7-3 demonstrates that it is possible to expose multiple services and rewrite the URI.
In Example 7-4 and Example 7-5 on page 268, two Node.js servers are deployed. The
console for the first service can be accessed using the host name myserver.mydomain.com
and the second using superserver.mydomain.com. In DNS, myserver.mydomain.com and
superserver.mydomain.com can either be an A record for the proxy node virtual IP 10.0.0.1, or
a CNAME for the load balancer forwarding traffic to where the ingress controller is listening.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
  name: mysuperserver
spec:
  rules:
  - host: superserver.mydomain.com
    http:
      paths:
      - backend:
          serviceName: nodejs-superserver
          servicePort: 3000
It is usually a good practice to provide some value for the host, because the default is *, which
forwards all requests to the back end.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    ingress.kubernetes.io/rewrite-target: /
  name: api
spec:
  rules:
  - host: myserver.mydomain.com
    http:
      paths:
      - backend:
          serviceName: main-sales-api
          servicePort: 4191
        path: /api/sales/*
      - backend:
          serviceName: backorder-api
          servicePort: 9090
        path: /api/backorder/*
  tls:
  - hosts:
    - myserver.mydomain.com
    secretName: api-tls-secret
The secret api-tls-secret is created in the same namespace as the ingress resource using
the command:
kubectl create secret tls api-tls-secret --key=/path/to/tls.key
--cert=/path/to/tls.crt
The secret can also declaratively be created in a YAML file if the TLS key and certificate
payloads are base-64 encoded, as shown in Example 7-7.
apiVersion: v1
type: Opaque
kind: Secret
metadata:
  name: api-tls-secret
data:
  tls.crt: <base64-encoded cert>
  tls.key: <base64-encoded key>
Advantages:
A common ingress controller reduces compute resources required to host applications.
Ready for use.
Disadvantages:
All client traffic passes through a shared ingress controller. One service’s client traffic can
affect the other.
Limited ability to isolate northbound ingress resource traffic from southbound ingress
traffic, such as a public-facing API versus an operations dashboard running in the same
cluster would share the same ingress controller.
If an attacker were to gain access to the ingress controller they would be able to observe
unencrypted traffic for all proxied services.
Need to maintain different ingress resource documents for different stages. For example,
the need to maintain multiple copies of the same ingress resource YAML file with different
namespace fields.
The ingress controller needs access to read ingress, service, and pod resources in every
namespace in the Kubernetes API to implement the ingress rules.
Advantages of dedicated ingress controllers per namespace:
- Delineation of ingress resources for the various stages of development and production.
- Performance for each namespace can be scaled individually.
- Traffic is isolated; when combined with isolated worker nodes on separate VLANs, true Layer 2 isolation can be achieved because the upstream traffic does not leave the VLAN.
- Continuous integration and continuous delivery teams can use the same ingress resource document to deploy across the various stages (assuming that the dev namespace is different from the production namespace).
Disadvantages:
- Additional ingress controllers must be deployed, using extra resources.
- Ingress controllers in separate namespaces might require either a dedicated node or a dedicated external load balancer.
Each Kubernetes cluster is logically separated into namespaces, and each namespace acts as a subdomain for name resolution. If you examine a container’s /etc/resolv.conf, you can observe that the name server line points at an IP address internal to the cluster, and that the search suffixes are generated in a particular order, as shown in Example 7-8.
# cat /etc/resolv.conf
nameserver <kube-dns ClusterIP>
search <namespace>.svc.<cluster_domain> svc.<cluster_domain> <cluster_domain> <additional ...>
options ndots:5
The <additional ...> is a list of search suffixes obtained from the worker node’s /etc/resolv.conf file. By default, a short host name like account-service has <namespace>.svc.<cluster_domain> appended to it, so the service running in the same namespace as the calling pod is selected. A pod can look up the ClusterIP of a service in a different namespace by appending that namespace to the host name.
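For example, assuming a service named account-service in a namespace named finance (both names are illustrative) and the default cluster domain cluster.local, the following lookups behave as described:

# Short name: resolves only from pods in the finance namespace
curl http://account-service:8080/

# Qualified names: resolve from any namespace
curl http://account-service.finance:8080/
curl http://account-service.finance.svc.cluster.local:8080/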
Chapter 8. Troubleshooting
This chapter provides information about how to fix some common issues with IBM Cloud
Private. It shows you how to collect log information and open a request with the IBM Support
team.
The first parameters that you need to configure in the config.yaml file are the admin user and password. The admin password policy changed in IBM Cloud Private version 3.1.2 and now requires, by default, at least 32 characters. If the password does not match the requirements, the installation log shows an error, as in Example 8-1.
If you want to change the policy, you can use the regular expression that best fits your
company policy as described at the following link:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/installing/install_con
tainers.html
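For illustration, the relevant config.yaml entries might look like the following (a sketch; the key names follow the IBM Cloud Private documentation, and the password and regular expression are examples only):

default_admin_user: admin
# Must satisfy the password rules; the default rule requires 32 or more characters
default_admin_password: S3cure-P@ssw0rd-0123456789-abcdefg
# Optional: override the default policy with your own regular expressions
password_rules:
- '^.{10,}$'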
If your server has more than one Ethernet adapter and you are installing the IBM Cloud
Private Enterprise Edition, you also need to configure the following parameters:
cluster_lb_address: <external address>
proxy_lb_address: <external address>
The external address is the IP address of the adapter from which the incoming traffic will be
coming to the server.
For the full options of the config.yaml file configuration, see the following URL:
https://www.ibm.com/support/knowledgecenter/SSBS6K_3.1.2/installing/config_yaml.html
If the hosts file is not configured correctly, the error described in Example 8-2 is displayed.
Tip: When planning for the installation of IBM Cloud Private, it is highly advised that you
define all of the functions that you want your server to run, because some of the
customizations on the hosts file require IBM Cloud Private to be uninstalled and installed
again.
If you plan to use storage for persistent data, you might need to add the host group in the
hosts file for that particular type of storage. Read the documentation about the storage
system you are using and make sure that the prerequisites are met before running the IBM
Cloud Private installation. For more information, see Chapter 4, “Managing persistence in
IBM Cloud Private” on page 115.
This procedure is described in item 9 of “Step 2: Set up the installation environment” at the following link:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/installing/install_con
tainers.html
8.1.4 Missing the IBM Cloud Private binary files in the installation folder
To install IBM Cloud Private, you need to copy the binary files to the /<installation_directory>/cluster/images folder.
This procedure is described in item 10 of “Step 2: Set up the installation environment” at the following link:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/installing/install_con
tainers.html
To avoid these kinds of errors, make sure that the system meets at least the minimum system requirements. You can see the system requirements at this link:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/supported_system_confi
g/hardware_reqs.html
Also, it is suggested that you evaluate the sizing of the cluster before the installation. See this link for information about how to size your cluster:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/installing/plan_capaci
ty.html
After running the uninstaller, you need to run the commands described in Example 8-5 to
make sure that the system is clean and ready for a new installation.
Example 8-5 Making sure the system is ready for a new installation
sudo systemctl stop kubelet docker
sudo systemctl start docker
sudo docker rm $(sudo docker ps -qa)
if sudo mount | grep /var/lib/kubelet/pods; then sudo umount $(sudo mount | grep
/var/lib/kubelet/pods | awk '{print $3}'); fi
sudo rm -rf /opt/cni /opt/ibm/cfc /opt/kubernetes
sudo rm -rf /etc/cfc /etc/cni /etc/docker/certs.d
sudo rm -rf /var/lib/etcd/* /var/lib/etcd-wal/*
sudo rm -rf /var/lib/mysql/*
The calico_ipip_enabled parameter must be set to true if all the nodes in the cluster do not
belong to the same subnet. This parameter must be set to true if the nodes are deployed in a
cloud environment such as OpenStack, where source and destination checks prevent the IP
traffic from unknown IP ranges, even if all the nodes belong to the same subnet. This
configuration enables encapsulation of pod to pod traffic over the underlying network
infrastructure.
The calico_ip_autodetection_method parameter must be set so that Calico uses the correct interface on the node. If there are multiple interfaces, aliases, logical interfaces, bridge interfaces, or any other types of interfaces on the nodes, use one of the following settings to ensure that the auto-detection mechanism chooses the correct interface:
calico_ip_autodetection_method: can-reach=<remote IP address or domain> (This is the default setting.)
calico_ip_autodetection_method: interface=<interface name or regular expression>
The calico_tunnel_mtu parameter must be set based on the MTU of the interface that is
configured to be used by Calico.
If the calico_ipip_enabled parameter is set to true, 20 bytes are used for the IP-in-IP tunnel header, so you must set the calico_tunnel_mtu parameter to at least 20 bytes less than the actual MTU of the interface.
If IPsec is enabled, 40 more bytes are needed for the IPsec packet header. Because enabling IPsec also sets calico_ipip_enabled to true, the 20 bytes for the IP-in-IP tunnel header are needed as well. Therefore, you must set the calico_tunnel_mtu parameter to at least 60 bytes less than the actual MTU of the interface.
The network CIDR (Classless Inter-Domain Routing), the existing host network, and the service cluster IP range must not conflict with each other.
In these situations, gather the following information from the cluster for support:
1. Get the node list:
kubectl get nodes -o wide
2. Get the logs:
Collect logs from the calico-node-* pod running on the node that is experiencing the mesh problem. Example 8-6 shows how to get the logs from the calico-node-* pod running on node 10.10.25.71.
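A minimal sketch of collecting those logs (the pod name is illustrative; calico-node pods run in the kube-system namespace):

# Find the calico-node pod scheduled on the affected node
kubectl -n kube-system get pods -o wide | grep calico-node | grep 10.10.25.71

# Collect its logs into a file to attach to the support case
kubectl -n kube-system logs calico-node-x7k2p -c calico-node > calico-node.log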
Configuring calicoctl
Perform the following steps:
1. Log in to the node, find the calicoctl Docker image, and copy the calicoctl binary to the node, as shown in Example 8-7.
2. Configure calicoctl to authenticate to the etcd cluster. Copy the etcd cert, key, and CA files to the node from the boot node’s cluster directory:
– cert file: cluster/cfc-certs/etcd/client.pem
– key file: cluster/cfc-certs/etcd/client-key.pem
– ca file: cluster/cfc-certs/etcd/ca.pem
3. Create a calicoctl.cfg file at /etc/calico/calicoctl.cfg with the contents shown in Example 8-8.
apiVersion: projectcalico.org/v3
kind: CalicoAPIConfig
metadata:
spec:
  datastoreType: "etcdv3"
  etcdEndpoints: "https://<master node IP>:4001"
  etcdKeyFile: <file path of client-key.pem>
  etcdCertFile: <file path of client.pem>
  etcdCACertFile: <file path of ca.pem>
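With the configuration file in place, the mesh status can be verified quickly (a sketch, assuming calicoctl v3 syntax):

# Shows the BGP peering status of the local node
calicoctl node status

# Lists all nodes known to Calico
calicoctl get nodes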
Calico networks must be enabled in IP-in-IP mode. Calico tunnel MTU must be set correctly.
The IPsec package used for encryption must be installed on all the nodes in the cluster. The
IPsec package used for RHEL is libreswan. On Ubuntu and SLES, it is strongswan.
Note: All nodes in the cluster must run the same operating system.
Configuration
When performing the configuration to use IPSec, ensure that the following Calico
configurations are provided in the config.yaml file (Example 8-9).
network_type: calico
calico_ipip_enabled: true
calico_tunnel_mtu: 1390
calico_ip_autodetection_method: interface=eth0
ipsec_mesh:
  enable: true
  interface: eth0
  subnets: [10.24.10.0/24]
  exclude_ips: [10.24.10.1/32, 10.24.10.2, 10.24.10.192/28]
  cipher_suite: aes128gcm16!
Where:
- interface must be the same interface that was set in the calico_ip_autodetection_method parameter.
- subnets are the address ranges. Packets destined for these subnet ranges are encrypted. The IP address of the data plane interface must fall in one of the provided subnet ranges.
- exclude_ips are the IP addresses that are excluded from the IPsec subnet. Traffic to these IP addresses is not encrypted.
- cipher_suite is the list of Encapsulating Security Payload (ESP) encryption/authentication algorithms to be used. The default cipher suite is aes128gcm16!. Ensure that this module is available and loaded in the operating system on all the hosts. It is also possible to change it to another cipher suite.
Post installation
For the RHEL installation, perform the following steps:
1. Check the libreswan configuration:
cat /etc/ipsec.conf
cat /etc/ipsec.d/ipsec-libreswan.conf
2. Check the status of the ipsec process:
ipsec status
3. If the ipsec status does not display the established connections, check
/var/log/messages for errors related to IPsec. Enable the libreswan logging by enabling
plutodebug in the /etc/ipsec.conf file, as shown in Example 8-11.
config setup
...
...
plutodebug = all # <<<<<<<<<<<<
For Ubuntu and SLES, enable strongswan logging by setting charondebug in the strongswan configuration file:
config setup
  ...
  charondebug="ike 2, knl 2, cfg 2" # <<<<<<<<<<<<
If the problem persists, you can open a support ticket as described in 8.5, “Opening a support
case” on page 287.
Limits:
  cpu:     8
  memory:  64Gi
Requests:
  cpu:     4
  memory:  8Gi
Liveness:  http-get http://:service/ delay=120s timeout=5s period=10s #success=1 #failure=3
Readiness: http-get http://:service/ delay=120s timeout=5s period=10s #success=1 #failure=3
Environment:
  DATAPOWER_ACCEPT_LICENSE:  true
  DATAPOWER_INTERACTIVE:     true
  DATAPOWER_LOG_COLOR:       false
  DATAPOWER_WORKER_THREADS:  4
In Example 8-13 on page 282, the pod was not able to run because of insufficient CPU or memory, which causes the error. To determine the amount of CPU being used, run the following command:
kubectl describe node <node>
The output of the command will be like that shown in Example 8-14. To solve this issue, make sure that the worker node has sufficient memory and CPU to run the pod: add more CPUs to the instance (and restart Docker and Kubernetes to get the pod running), or remove some of the unused pods.
In this case, you can check the full list of ports that have conflicts with the following command:
kubectl describe pods <pod>
To fix the problem, remove the deployment and change the port that is being used so that
there are no conflicts.
To fix the problem, you need to grant the pod security to the namespace. Run the command:
kubectl -n appsales create rolebinding ibm-anyuid-clusterrole-rolebinding
--clusterrole=ibm-anyuid-clusterrole --group=system:serviceaccounts:appsales
8.4.1 Getting 504 or 500 errors when trying to access the application
After deploying a pod, or during its execution, you might get a 504 or 500 error when trying to access the application from a browser, as shown in Figure 8-3. There are some common cases where this error is displayed, such as the pod entering CrashLoopBackOff or not starting. These errors are discussed in the next sections.
In Example 8-16, you can see that the pod has been restarted 3774 times in the last 9 hours. Usually this error happens when the pod is starting and failing in a loop.
To try to understand where the error is, you can run the following commands:
kubectl logs <pod>
kubectl describe pod <pod>
With the output of both commands you can determine where the error is and how to solve it. The details about the error are shown in the description of the pod (the kubectl describe pod <pod> command).
Note: If you are using the IBM Cloud Private Community Edition, it is possible to get support through the Slack channel and community forum, or to ask the Watson Chatbot. See the following addresses:
Collect the following general troubleshooting data, along with any other data for your problem
area:
- hosts file (located at /<installation_directory>/cluster): This provides IBM with the server topology details.
- config.yaml file (located at /<installation_directory>/cluster): This provides details about customization, including the load balancer details.
Run the following command and attach the output to the case:
sudo docker run --net=host -t -e LICENSE=accept -v "$(pwd)":/installer/cluster
ibmcom/icp-inception-<architecture>:<version> healthcheck -v
Tip: You might find the cheat sheet in “Cheat sheet for production environment” on page 361 useful when troubleshooting IBM Cloud Private problems.
There are a lot of articles on the internet about different service registry and discovery solutions, and about microservices and cloud native programming frameworks. Many of them relate to a specific product or technology, but it is not obvious how these solutions relate to Kubernetes, or to what extent they are applicable in hybrid, multi-cloud, enterprise-centric environments. Things that work fine for an enthusiastic cloud startup developer might not be easy for an enterprise developer who often operates in a highly regulated environment.
In the following section we briefly summarize the role of the service mesh and discuss which functions are fulfilled by Kubernetes itself and which ones need to be augmented by an additional solution. In the case of IBM Cloud Private, the product of choice is Istio.
IBM Cloud Private uses Kubernetes technology as its foundation. In Kubernetes, services are of primary importance, and Kubernetes provides an implicit service registry using DNS. When a controller alters Kubernetes resources, for example starting or stopping pods, it also updates the related service entries. The binding between pod instances and the services that expose them is dynamic, using label selectors.
Istio leverages Kubernetes service resources, and does not implement a separate service
registry.
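For example, a minimal service definition binds to pods through labels rather than addresses (the names below are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: catalog
spec:
  selector:
    app: catalog       # any running pod carrying this label is selected dynamically
  ports:
  - port: 8000
    targetPort: 8000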
9.3.1 Components
The following section describes the components of the planes.
Figure 9-1 on page 293 shows the different components that make up each plane.
1 Excerpt from https://istio.io/docs/concepts/what-is-istio/
Envoy
Istio uses an extended version of the Envoy proxy. Envoy is a high-performance proxy
developed in C++ to mediate all inbound and outbound traffic for all services in the service
mesh. Istio leverages Envoy’s many built-in features, for example:
- Dynamic service discovery
- Load balancing
- TLS termination
- HTTP/2 and gRPC proxies
- Circuit breakers
- Health checks
- Staged rollouts with percentage-based traffic split
- Fault injection
- Rich metrics
2 Image taken from the Istio documentation at https://istio.io/docs/concepts/what-is-istio/
The sidecar proxy model also allows you to add Istio capabilities to an existing deployment with no need to rearchitect or rewrite code. You can read more about why this approach was chosen in Istio’s design goals documentation.
Mixer
Mixer is a platform-independent component. Mixer enforces access control and usage
policies across the service mesh, and collects telemetry data from the Envoy proxy and other
services. The proxy extracts request level attributes, and sends them to Mixer for evaluation.
You can find more information about this attribute extraction and policy evaluation in Mixer
Configuration documentation.
Mixer includes a flexible plug-in model. This model enables Istio to interface with a variety of
host environments and infrastructure backends. Thus, Istio abstracts the Envoy proxy and
Istio-managed services from these details.
Pilot
Pilot provides service discovery for the Envoy sidecars, traffic management capabilities for
intelligent routing (e.g., A/B tests, canary deployments, etc.), and resiliency (timeouts, retries,
circuit breakers, etc.).
Pilot converts high level routing rules that control traffic behavior into Envoy-specific
configurations, and propagates them to the sidecars at run time. Pilot abstracts
platform-specific service discovery mechanisms and synthesizes them into a standard format
that any sidecar conforming with the Envoy data plane APIs can use. With this loose coupling,
Istio can run on multiple environments, such as Kubernetes, Consul, or Nomad, while
maintaining the same operator interface for traffic management.
Citadel
Citadel provides strong service-to-service and end-user authentication with built-in identity
and credential management. You can use Citadel to upgrade unencrypted traffic in the
service mesh. Using Citadel, operators can enforce policies based on service identity rather
than on network controls. Starting from release 0.5, you can use Istio’s authorization feature
to control who can access your services.
Galley
Galley validates user authored Istio API configuration on behalf of the other Istio control plane
components. Over time, Galley will take over responsibility as the top-level configuration
ingestion, processing and distribution component of Istio. It will be responsible for insulating
the rest of the Istio components from the details of obtaining user configuration from the
underlying platform (for example Kubernetes).
Connect
Istio provides traffic management for services. The traffic management function includes:
Intelligent routing: The ability to perform traffic splitting and traffic steering over multiple
versions of the service
Resiliency: The capability to increase micro services application performance and fault
tolerance by performing resiliency tests, error and fault isolation and failed service
ejection.
Secure
Istio implements Role-Based Access Control (RBAC), which allows a specific determination of which service can connect to which other services. Istio uses the Secure Production Identity Framework for Everyone (SPIFFE) to uniquely identify the ServiceAccount of a microservice, and uses that identity to verify that communication is allowed.
Control
Istio provides a set of Policies that allows control to be enforced based on data collected. This
capability is performed by Mixer.
Observe
While enforcing the policy, Istio also collects data that is generated by the Envoy proxies. The data can be collected as metrics in Prometheus, or as tracing data that can be viewed through Jaeger and Kiali.
This chapter describes mainly the resiliency (9.5, “Service resiliency” on page 301) and security (9.6, “Achieving E2E security for microservices using Istio” on page 311) implementations using Istio, while intelligent routing is discussed in Chapter 4, “Manage your service mesh with Istio”, of the IBM Redbooks publication IBM Cloud Private Application Developer's Guide, SG24-8441.
Here we introduce another approach: installing Istio using the command-line interface, which gives you more control over the options.
Tip: This approach can be used as a base in an air-gapped environment where you must use a local chart.
In addition, we also demonstrate the sample Bookinfo application deployed in IBM Cloud Private, where you must pay extra attention due to the enhanced security management features introduced in IBM Cloud Private version 3.1.1.
Note that the ca-file, the key-file, and the cert-file are created automatically when you
perform the first step.
Refresh the repository with the helm repo update command.
3. Create the secrets for Grafana and Kiali. We enable Grafana and Kiali for this deployment, so secrets for the console logins are required. Create the files as shown in Example 9-2 and in Example 9-3 on page 296.
Notice that the username and passphrase are base64 encoded. You can get an encoded value by running echo -n yourtext | base64 (the -n avoids encoding a trailing newline).
5. Now you need to customize your settings. Create a YAML file as shown in Example 9-5 to override the default settings of the Istio Helm chart, and save it as vars.yaml.
You can see the default values in the values.yaml file in the chart. The default chart tarball can be downloaded with the following command:
curl -k -LO https://<Your Cluster>:8443/mgmt-repo/requiredAssets/ibm-istio-1.0.5.tgz
Example 9-5 Override the default settings of the Istio Helm chart
grafana:
  enabled: true
tracing:
  enabled: true
kiali:
  enabled: true
Here we enable Grafana, tracing, and Kiali, which are disabled by default.
6. Next you will deploy the Istio chart by running the command in Example 9-6 on page 297.
All the pods under the namespace istio-system are running. Notice that other than the
ingressgateway and egressgateway that run on the proxy node, the rest of the services all
run on the management node.
b. Name the namespace istio-exp, then select ibm-privileged-psp as the Pod Security
Policy.
c. Click Create.
Note: You can also create the namespace with the kubectl create ns istio-exp command.
d. Create the file named as rolebinding.yaml with the content shown in Example 9-8.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ibm-privileged-clusterrole-rolebinding
  namespace: istio-exp
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: ibm-privileged-clusterrole
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:serviceaccounts:istio-exp
To allow the Bookinfo images to be pulled from Docker Hub, an image policy is also created in the namespace:
apiVersion: securityenforcement.admission.cloud.ibm.com/v1beta1
kind: ImagePolicy
metadata:
  name: book-info-images-whitelist
  namespace: istio-exp
spec:
  repositories:
  - name: docker.io/istio/*
Notice the 2/2 in the output. The sidecar injection adds an additional istio-proxy container, making it two containers.
You will also need to delete the CustomResourceDefinition objects that are left over in this version. To delete these objects, run the command:
kubectl delete -f istio/ibm-mgmt-istio/templates/crds.yaml
Kubernetes does not fulfill the previous challenges as ready-for-use functionality; we need a service mesh for that. A service mesh provides critical capabilities including service discovery, load balancing, encryption, observability, traceability, authentication and authorization, and support for the circuit breaker pattern. There are different solutions for service mesh, such as Istio, Linkerd, and Conduit. We will use the Istio service mesh to discuss how to achieve service resiliency.
Istio comes with several capabilities for implementing resilience within applications. Actual
enforcement of these capabilities happens in the sidecar. These capabilities will be discussed
in the following sections within the context of a microservices example.
To understand the Istio capabilities we will use the example published on the following IBM
Redbooks GitHub url:
https://github.com/IBMRedbooks/SG248440-IBM-Cloud-Private-System-Administrator-s-G
uide/tree/master/CH8-Istio.
The details about microservices setup are given in the readme file.
With Istio’s retry capability, a failed call can be retried a few times before the error must truly be dealt with. Example 9-12 shows the virtual service definition for the catalog service.
virtualservice.networking.istio.io/catalog created
2. Create a destination rule for all three microservices. See Example 9-14.
5. As we can see from the output in Example 9-16, the user microservice randomly calls different versions of the catalog microservice. Now add a bug in the catalog:v2 microservice and check how it behaves. See Example 9-17.
We will launch the user microservice to determine how it responds. See Example 9-18.
As we can see in the output, catalog:v2 microservice is not responding and all calls
made to it from the user microservice fail.
6. We will add retry logic for virtualservice of the catalog and check how it behaves. See
Example 9-19.
7. The previous virtualservice definition for the catalog ensures that each call to the catalog service is attempted up to 3 times before failing. Next, create the virtual service for the catalog using the definition in Example 9-20; a sketch of the retry rule follows.
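The retry rule has roughly the following shape (a minimal sketch; see the book's Example 9-19 for the exact definition used):

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: catalog
spec:
  hosts:
  - catalog
  http:
  - route:
    - destination:
        host: catalog
    retries:
      attempts: 3           # each call is attempted up to 3 times before failing
      perTryTimeout: 2.000s # illustrative per-attempt timeout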
8. Then, examine the output of starting the user microservice, as shown in Example 9-21.
As we can see in Example 9-21 on page 304, the user microservice no longer fails on catalog:v2. When a call to catalog:v2 fails, it is retried up to 2 more times and is then served by catalog:v1 or catalog:v3, which succeeds.
9.5.2 Timeout
Calls to services over a network can result in unexpected behavior. We can only guess why a service has failed. Is it just slow? Is it unavailable? Waiting without any timeout uses resources unnecessarily, causes other systems to wait, and is usually a contributor to cascading failures. Your network traffic should always have timeouts in place, and you can achieve this goal with the timeout capability of Istio: a caller waits for only a few seconds before giving up and failing.
1. We will delete any virtualservice and destinationrule definitions from the setup, then add a 5-second delay to the response of the catalog:v2 microservice, as shown in Example 9-22.
As the script runs, you will notice that whenever a call goes to the catalog:v2 microservice, it takes more than 5 seconds to respond, which is unacceptable for service resiliency.
Example 9-24 Create a VirtualService for the catalog and add 1 second as timeout
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: catalog
spec:
  hosts:
  - catalog
  http:
  - route:
    - destination:
        host: catalog
    timeout: 1.000s
As we can see in Example 9-26, whenever a call is made to catalog:v2, it times out after 1 second, according to the virtual service definition for the catalog.
We will first show an example where no circuit breaker rule is defined and see how it works. See Example 9-27.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: catalog
spec:
  host: catalog
  subsets:
  - labels:
      version: v1
    name: version-v1
  - labels:
      version: v2
    name: version-v2
  - labels:
      version: v3
    name: version-v3
destinationrule.networking.istio.io/catalog created
2. Next, create a virtual service and destination rule for the catalog microservice. See
Example 9-29.
virtualservice.networking.istio.io/catalog created
3. Now, check how the catalog microservice responds to requests. Use the siege tool to make requests to the catalog microservice. In Example 9-30 you can see five concurrent clients each sending eight repetitions of requests to the catalog.
Example 9-30 Five concurrent clients each sending eight repetitions of requests to the catalog
root@scamp1:~/istio_lab/cb# siege -r 8 -c 5 -v 10.0.0.31:8000
** SIEGE 4.0.4
** Preparing 5 concurrent users for battle.
The server is now under siege...
HTTP/1.1 200 0.02 secs: 74 bytes ==> GET /
HTTP/1.1 200 0.03 secs: 74 bytes ==> GET /
HTTP/1.1 200 5.05 secs: 75 bytes ==> GET /
HTTP/1.1 200 5.06 secs: 75 bytes ==> GET /
HTTP/1.1 200 5.04 secs: 75 bytes ==> GET /
HTTP/1.1 200 0.03 secs: 74 bytes ==> GET /
HTTP/1.1 200 0.03 secs: 74 bytes ==> GET /
HTTP/1.1 200 0.02 secs: 74 bytes ==> GET /
HTTP/1.1 200 0.01 secs: 74 bytes ==> GET /
As we can see in Example 9-30 on page 308, all requests to our system were successful, but it took some time to complete the test because the v2 instance of the catalog was a slow performer. Suppose that in a production system this 5-second delay was caused by too many concurrent requests to the same instance; this might cause cascading failures in your system.
4. Now, add a circuit breaker that will open whenever there is more than one request that is
handled by the catalog microservice. See Example 9-31.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: catalog
spec:
  host: catalog
  trafficPolicy:
    connectionPool:
      http:
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1
      tcp:
        maxConnections: 1
    outlierDetection:
      baseEjectionTime: 120.000s
      consecutiveErrors: 1
      interval: 1.000s
      maxEjectionPercent: 100
Warning: kubectl apply should be used on resource created by either kubectl create
--save-config or kubectl apply
destinationrule.networking.istio.io/catalog configured
5. Now that the circuit breaker for the catalog is in place, make requests to the catalog microservice. Example 9-32 shows five concurrent clients each sending eight repetitions of requests to the catalog microservice.
** SIEGE 4.0.4
** Preparing 5 concurrent users for battle.
The server is now under siege...
HTTP/1.1 200 0.05 secs: 74 bytes ==> GET /
HTTP/1.1 200 0.05 secs: 74 bytes ==> GET /
HTTP/1.1 200 0.05 secs: 74 bytes ==> GET /
HTTP/1.1 200 0.05 secs: 74 bytes ==> GET /
HTTP/1.1 503 0.02 secs: 57 bytes ==> GET /
HTTP/1.1 503 0.04 secs: 57 bytes ==> GET /
HTTP/1.1 200 0.04 secs: 74 bytes ==> GET /
HTTP/1.1 200 0.05 secs: 74 bytes ==> GET /
HTTP/1.1 200 0.04 secs: 74 bytes ==> GET /
HTTP/1.1 200 0.02 secs: 74 bytes ==> GET /
HTTP/1.1 200 0.01 secs: 74 bytes ==> GET /
HTTP/1.1 503 0.02 secs: 57 bytes ==> GET /
HTTP/1.1 503 0.00 secs: 57 bytes ==> GET /
HTTP/1.1 200 0.01 secs: 74 bytes ==> GET /
HTTP/1.1 503 0.01 secs: 57 bytes ==> GET /
HTTP/1.1 200 0.04 secs: 74 bytes ==> GET /
HTTP/1.1 200 0.04 secs: 74 bytes ==> GET /
HTTP/1.1 200 0.03 secs: 74 bytes ==> GET /
HTTP/1.1 503 0.02 secs: 57 bytes ==> GET /
HTTP/1.1 200 0.01 secs: 74 bytes ==> GET /
HTTP/1.1 200 5.05 secs: 75 bytes ==> GET /
HTTP/1.1 503 0.01 secs: 57 bytes ==> GET /
HTTP/1.1 200 0.01 secs: 74 bytes ==> GET /
HTTP/1.1 200 0.02 secs: 74 bytes ==> GET /
In Example 9-32, you can see 503 errors being returned. Whenever Istio detects more than one pending request being handled by the catalog microservice, it opens the circuit breaker.
Example 9-33 Find the ingress port and ingress host IP for the Kubernetes cluster
export INGRESS_PORT=$(kubectl -n istio-system get service istio-ingressgateway -o
jsonpath='{.spec.ports[?(@.name=="http2")].nodePort}')
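The companion ingress host variable can typically be derived in a similar way (a sketch; on IBM Cloud Private this usually resolves to the proxy node IP):

export INGRESS_HOST=$(kubectl -n istio-system get pod -l istio=ingressgateway \
-o jsonpath='{.items[0].status.hostIP}')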
2. Create an ingress gateway and attach it to the user virtual service, as shown in
Example 9-34.
Example 9-34 Create an ingress gateway and attach it to the user virtual service
root@scamp1:~/istio_lab# cat gateway.yaml
apiVersion: networking.istio.io/v1alpha3
We have now created a virtual service configuration for the user service containing one
route rule that allows traffic for path /. The gateways list specifies that only requests
through your app-gateway are allowed. All other external requests will be rejected with a
404 response. Note that in this configuration, internal requests from other services in the
mesh are not subject to these rules, but instead will default to round-robin routing. To apply
these or other rules to internal calls, you can add the special value mesh to the list of the
gateways.
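Based on that description, the gateway and virtual service look roughly like the following (a sketch; only the names app-gateway and user come from the book, the rest is assumed):

apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: app-gateway
spec:
  selector:
    istio: ingressgateway   # bind to the default Istio ingress gateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: user
spec:
  hosts:
  - "*"
  gateways:
  - app-gateway             # only requests through app-gateway are allowed
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        host: user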
3. Try to access the user service using the curl command shown in Example 9-35.
If you want to expose an HTTPS endpoint for inbound traffic, see the following link:
https://istio.io/docs/tasks/traffic-management/secure-ingress/
This section describes how to configure Istio to expose external services to Istio-enabled
clients. You will learn how to enable access to external services by defining ServiceEntry
configurations, or alternatively, to bypass the Istio proxy for a specific range of IPs.
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: google
spec:
  hosts:
  - www.google.com
  ports:
  - number: 443
    name: https
    protocol: HTTPS
  resolution: DNS
  location: MESH_EXTERNAL
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: google
spec:
  hosts:
  - www.google.com
  tls:
  - match:
    - port: 443
      sni_hosts:
      - www.google.com
    route:
    - destination:
        host: www.google.com
        port:
          number: 443
      weight: 100
EOF
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: httpbin-ext
spec:
  hosts:
  - httpbin.org
  ports:
  - number: 80
    name: http
    protocol: HTTP
  resolution: DNS
  location: MESH_EXTERNAL
EOF
2. Make requests to the external services. Make an HTTP outbound request to httpbin.org
from the user pod, as shown in Example 9-38.
Example 9-38 HTTP outbound request to httpbin.org from the user pod
root@scamp1:~/istio_lab# export SOURCE_POD=$(kubectl get pod -l app=user -o
jsonpath={.items..metadata.name})
This policy specifies that all workloads in the service mesh accept only encrypted requests using TLS. As you can see, this authentication policy has kind: MeshPolicy. The name of the policy must be default, and it contains no targets specification (because it is intended to apply to all services in the mesh). At this point, only the receiving side is configured to use mutual TLS. If you run the curl command between the Istio services (for example, those with the sidecars), all requests fail with a 503 error code, because the client side is still using plain text. See Example 9-40.
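For reference, the mesh-wide policy described above has roughly this shape (a sketch; see the book's Example 9-39 for the exact definition):

apiVersion: authentication.istio.io/v1alpha1
kind: MeshPolicy
metadata:
  name: default   # the mesh-wide policy must be named "default"
spec:
  peers:
  - mtls: {}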
Example 9-40 Try to access the catalog microservice from the user microservice
root@scamp1:~# kubectl get svc
To configure the client side, you need to set destination rules to use mutual TLS. It is possible
to use multiple destination rules, one for each applicable service (or namespace). However, it
is more convenient to use a rule with the * wildcard to match all services so that the
configuration is on par with the mesh-wide authentication policy. See Example 9-41.
Example 9-41 Set the destination rule to use the mutual TLS
root@scamp1:~# cat <<EOF | kubectl apply -f -
> apiVersion: "networking.istio.io/v1alpha3"
> kind: "DestinationRule"
> metadata:
> name: "default"
> namespace: "default"
> spec:
> host: "*.local"
> trafficPolicy:
> tls:
> mode: ISTIO_MUTUAL
> EOF
destinationrule.networking.istio.io/default created
Now, try to make a call to the catalog microservice from the user microservice. See
Example 9-42.
Example 9-42 Make a call to the catalog microservice from the user microservice
root@scamp1:~# kubectl exec -it user-v1-6b5c74b477-svtj7 -- curl 10.0.0.31:8000
Whitelisting
A whitelist is a deny-everything rule, except for the approved invocation paths:
1. In the following example, you whitelist calls from the user microservice to the catalog microservice. See Example 9-43.
Example 9-43 Whitelisting calls from the user microservice to the catalog microservice
root@scamp1:~/istio_lab# cat catalog_whitelist.yaml
apiVersion: "config.istio.io/v1alpha2"
kind: listchecker
metadata:
name: catalogwhitelist
spec:
overrides: ["user"]
blacklist: false
---
apiVersion: "config.istio.io/v1alpha2"
kind: listentry
metadata:
name: catalogsource
spec:
value: source.labels["app"]
---
apiVersion: "config.istio.io/v1alpha2"
kind: rule
metadata:
name: checkfromuser
spec:
match: destination.labels["app"] == "catalog"
actions:
- handler: catalogwhitelist.listchecker
instances:
- catalogsource.listentry
2. We will try to make calls from the user microservice to the catalog microservice. See
Example 9-44.
Example 9-44 Make calls from the user microservice to the catalog microservice
root@scamp1:~/istio_lab# kubectl get svc
As we can see in Example 9-44 on page 317, the call from the user microservice to the catalog microservice succeeds.
3. Now we will try to make a call from the product microservice to the catalog microservice.
See Example 9-45.
Example 9-45 Make a call from the product microservice to the catalog microservice
root@scamp1:~/istio_lab# kubectl exec -it product-v1-747cf9f795-z5ps2 -- curl
10.0.0.31:8000
Example 9-45 shows that the call from the product microservice to the catalog microservice has failed, because we whitelisted only the user microservice. This is the desired result.
Blacklisting
A blacklist explicitly denies particular invocation paths:
1. In Example 9-46 we blacklist the user microservice for the catalog microservice, so that it cannot make calls to the catalog microservice.
Example 9-46 Blacklisting the user microservice for the catalog microservice
root@scamp1:~/istio_lab# cat catalog_blacklist.yaml
apiVersion: "config.istio.io/v1alpha2"
kind: denier
metadata:
name: denyuserhandler
spec:
status:
code: 7
message: Not allowed
---
apiVersion: "config.istio.io/v1alpha2"
2. We will make a call from the user microservice to the catalog microservice as shown in
Example 9-47.
Example 9-47 Make a call from the user microservice to the catalog microservice
root@scamp1:~/istio_lab# kubectl exec -it user-v1-6b5c74b477-svtj7 -- curl
10.0.0.31:8000
Because we blacklisted the user microservice for the catalog microservice, the call in Example 9-47 has failed.
3. Make a call from the product microservice to the user microservice. See Example 9-48.
Example 9-48 Make a call from the product microservice to the user microservice
root@scamp1:~/istio_lab# kubectl exec -it product-v1-747cf9f795-z5ps2 -- curl
10.0.0.31:8000
The previous call has succeeded because we did not blacklist the product microservice.
Figure 9-3 shows the basic Istio authorization architecture. Operators specify Istio authorization policies by using .yaml files. When deployed, Istio saves the policies in the Istio Config Store. Pilot watches for changes to Istio authorization policies and fetches the updated policies if it sees any changes. Pilot then distributes the Istio authorization policies to the Envoy proxies that are colocated with the service instances.
Each Envoy proxy runs an authorization engine that authorizes requests at run time. When a
request comes to the proxy, the authorization engine evaluates the request context against
the current authorization policies, and returns the authorization result, ALLOW or DENY.
1. The first thing to do is enable Istio authorization by using the RbacConfig object. See
Example 9-49.
3 Image taken from https://istio.io/docs/concepts/security/
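An RbacConfig of the kind described in step 1 looks roughly like this (a sketch; Example 9-49 has the exact definition):

apiVersion: rbac.istio.io/v1alpha1
kind: RbacConfig
metadata:
  name: default   # only one RbacConfig is allowed, and it must be named "default"
spec:
  mode: ON        # enable Istio authorization mesh-wide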
By default, Istio uses a deny-by-default strategy, meaning that nothing is permitted until you explicitly define an access control policy that grants access to a service.
3. Now, grant any user access to any service of our mesh (user, catalog, or product), but only if the communication uses the GET method. See Example 9-51; a sketch of the general shape follows.
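Such a policy combines a ServiceRole and a ServiceRoleBinding, roughly as follows (a sketch; see Example 9-51 for the exact definition):

apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRole
metadata:
  name: service-viewer
spec:
  rules:
  - services: ["*"]   # covers the user, catalog, and product services
    methods: ["GET"]  # allow read-only access only
---
apiVersion: rbac.istio.io/v1alpha1
kind: ServiceRoleBinding
metadata:
  name: bind-service-viewer
spec:
  subjects:
  - user: "*"         # any user
  roleRef:
    kind: ServiceRole
    name: service-viewer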
4. Now if you make a GET call to the user microservice, it should succeed as seen in
Example 9-52.
https://istio.io/docs/tasks/security/role-based-access-control/
Cloud Foundry allows development in several languages, and has provisions to allow for
containers or binaries to be incorporated as applications. This gives an enterprise endless
flexibility, while maintaining ease of operation at scale.
Operationally, networking, applications, routing, and services are virtualized, so the physical infrastructure requirements come down to configuring an appropriate IaaS. IBM provides the stemcell, which guarantees that all platform VMs are cut from the same binaries, and also allows a single stemcell patch to be easily propagated to all systems in the platform. This helps to lower the operational workload usually spent patching and maintaining systems.
Cloud Foundry takes this concept one step further with its Garden containers. By basing each application container on the cflinuxfs base file system, security and container improvements are propagated to each application. This means that operations does not need to worry about container security on a per-application basis. It frees operations to work on maintenance at a higher scale, and development can focus on coding rather than on application hosting foundations.
Kubernetes abstracts the IaaS from the platform, so there is no need to worry about IaaS
information or dealing with the systems. Kubernetes allows the focus to be shifted to
platform management and development pursuits. It deploys quickly and scales fast. It
consumes Kubernetes resources, no matter what underlying IaaS is being used.
This lowers the entry bar to Cloud Foundry when Kubernetes is already deployed, and provides a base for local services to be provisioned on demand for Cloud Foundry. It does presume that Kubernetes has been configured and is itself sitting on an IaaS, so this management piece is pushed down the stack.
In most cases you have both the IaaS and the Kubernetes. If you’re not running Cloud
Foundry on Kubernetes, you will likely still use IBM’s Cloud Private platform for shared
monitoring, logging, and data services.
The installation is done in two steps: first the creation of the installer container, and second the Cloud Foundry deployment itself, using the Cloud Foundry deployment tool.
The customer must download the installation kit from (link). The installation kit, if needed, must be copied to a Linux-based server that has network connectivity with the targeted environment.
The installer host has different requirements depending on whether the target environment is Cloud Foundry Enterprise Environment (CFEE) or Cloud Foundry Full Stack. The installer container is created either by using the provided scripts or by deploying the provided Cloud Pak in IBM Cloud Private. Once the installer container is created, the installation is similar on all supported platforms. The installer container is used only for deployment and upgrades; once a deployment or upgrade of Cloud Foundry is done, the installer container is no longer needed.
The installation script then loads all images into Docker and executes launch.sh with some parameters to create the installation container. One of the parameters is the directory where the installation data should reside. It is useful to have a backup of this data directory, because it contains all of the configuration parameters for the installation and upgrades, which eases subsequent upgrades of Cloud Foundry Full Stack. Once the installer container is installed, the environment is ready to be deployed.
Figure 10-1 shows the installation flow for a Cloud Foundry Full Stack environment.
Figure 10-1 Installation flow for a Cloud Foundry Full Stack environment
A pod is created with the “installer container”, as is done for the Cloud Foundry Full Stack solution. The deployer uses the Cloud Foundry deployment tool to start the deployment of Cloud Foundry. It is useful to have the installer persistent volume backed up, to ease Cloud Foundry upgrades in case of data loss.
When the installer container is installed, the installation can be executed either by using the launch_deployment.sh script (for Cloud Foundry Full Stack) or through the deployment tool (for both Cloud Foundry Full Stack and CFEE). For Cloud Foundry Full Stack, the deployment tool is accessible at the URL http://<installer_host_ip>:8180. For CFEE, the deployer can launch the installer deployment tool from the IBM Cloud Private management console.
Figure 10-3 and Figure 10-4 show some screen captures of the Cloud Foundry deployment
tool.
10.2.4 Extensions
As mentioned, the base role of the config-manager is to deploy extensions.
IBM provides a number of embedded extensions to integrate external services. For example, for Cloud Foundry Full Stack, IBM provides ready-to-use extensions to integrate LDAP, application logs, and system logs.
An extension developer can create their own extensions by using the extension framework
provided in the CFEE and Cloud Foundry Full Stack solution. These extensions are called
“custom” extensions.
An extension is first developed, then registered, and then integrated in the main deployment
flow.
Once the extension is developed, it is compressed in a .zip file. Through the deployment
console or the command line, the deployer can register the extension and integrate it in the
deployment flow.
An extension can also be set to place itself into the flow automatically, by defining in the extension-manifest.yml file where the extension needs to be placed.
Figure 10-6 shows the extension directory structure.
10.3.1 Zoning
Availability zones represent functionally independent segments of infrastructure where an
issue or outage is likely to affect only a single availability zone. This means that resources in
other availability zones continue to function as normal. This concept is implemented by most
public IaaS providers, usually in the form of geographically distinct data centers.
The same concept can be implemented to some degree with on-premises IaaS solutions by
dividing compute, storage, and networking infrastructure into groups.
Cloud Foundry supports deployment of its components into different availability zones for
better high availability. It is often beneficial to have two instances of a component in separate
availability zones, so that if one zone becomes unavailable, the remaining instance can keep
the platform running. When three or more availability zones are configured, platform
availability can usually be maintained as long as the majority of the zones remain operational.
By default, Cloud Foundry Full Stack uses two availability zones, z1 and z2, but both are
configured with a single set of Cloud Provider Interface properties that you provide when
configuring your Cloud Foundry installation. The effect of this is that if a single instance of a
Cloud Foundry component fails you might still have some protection for components that
deploy multiple instances. But if the entire availability zone experiences an outage, the Cloud
Foundry platform will as well.
You can customize the z2 availability zone to use different infrastructure, thereby enhancing
the high availability of your deployment, because for several of the components, multiple
instances are deployed by default, and these instances are distributed over zones z1 and z2.
While availability zones cannot be configured in the main configuration file that you create for
your Cloud Foundry installation, additional configuration files do provide a placeholder for a
third availability zone, z3. If you want to use this third availability zone, you edit it to provide
the applicable Cloud Provider Interface properties.
Next, you must modify the deployment to specify which Cloud Foundry components will be
deployed over the three availability zones and to adjust the instance counts as desired. (With
the exception of the number of Diego cells, which is specified in the main configuration file,
the instance counts of all other Cloud Foundry components are fixed.) Instances are
distributed in a round-robin fashion over the specified zones.
Further information about how to plan the number of instances of each Cloud Foundry
component can be found in the article High Availability in Cloud Foundry. Full instructions for
customizing the availability zones for your Cloud Foundry Full Stack deployment can be found
in the Knowledge Center. See Configuring Availability Zones for IBM Cloud Private Cloud
Foundry.
It also comes with a default backup and restore configuration, which uses internal NFS or GlusterFS servers. This might be sufficient, but a highly available external database might be a better choice in production environments for improved performance, scalability, and protection against data loss.
The default BOSH Director deployment manifest needs to be modified to use external databases. The instructions are available at the following link:
https://github.com/cloudfoundry/bosh-deployment
In AWS you would create the three buckets that are recommended by the extension, update
the extension .yaml file if you changed the bucket names, then follow the instructions to load
the extension. Loading the extension does not transfer existing data, but will reset the
blobstore, so objects like apps might need to be pushed again.
10.4.2 Director
In a Cloud Foundry Full Stack environment, the BOSH Backup and Restore (BBR) is
responsible for backing up the BOSH director as well as the BOSH deployment of the Cloud
Foundry environment. Backups can be configured to be copied to an external customer NFS
server. BBR backup can be set up by configuring values in the uiconfig.yml files to define
when backups occur, where the files should be copied to, and how many backups to keep
before being overwritten.
The restore procedure allows recovery of the BOSH Director and deployment should a significant error occur. The restore procedure is detailed at:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/cloud_foundry/configur
ing/backups.html.
Similar to Cloud Foundry Full Stack, Cloud Foundry Enterprise Environment supports both
developer and high availability configuration modes. For detailed storage requirements for
each of the modes, see
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/cloud_foundry/tech_pre
view/install_cfee.htm.
This means that if you have 3 Diego cells in VMware or OpenStack, you would license IBM Cloud Private Cloud Foundry for 12 VPCs, because you would be using 12 vCPUs. Now that we understand the mapping of vCPU to VPC (1:1) and of Diego cells to memory, a sizing calculation can be made. Most sizings are done based on memory. You might want 64 GB, 128 GB, or 256 GB of application memory. If you want to buy 256 GB of application memory, the VPC calculation would be as follows: 256 GB / 32 GB per cell = 8 Diego cells, and 8 cells x 4 vCPUs per cell = 32 vCPUs = 32 VPCs.
You would want to purchase 32 VPCs of IBM Cloud Private Cloud Foundry. This would support running the required number of instances until the memory maximum is hit. The default overcommit is 2:1, so the platform’s users would have approximately 512 GB of virtual application memory available if an overcommit of 2:1 is used. 2:1 is fine for development and test; 1:1 is recommended in production, but this can be changed if the application profile warrants it.
From a hardware perspective, the link provides a Totals table that allows you to calculate the full hardware requirements based on management systems and Diego cells. Keep in mind that as the number of cells increases and you approach 512 GB of memory, your application profile might dictate that the management systems (routers, loggregators, or cloud controllers) be horizontally scaled. This does not incur a licensing cost, but it does require additional virtualization resources to host.
10.7 Networking
This section describes the network implementation for IBM Cloud Private Cloud Foundry.
Cloud Foundry administrators can specify a set of network policies that enable
communication between application instances. Each of the policy specifications typically
includes source application, destination application, protocol (both TCP and UDP are
supported) and port. The policies are dynamically applied and take effect immediately without
the need to restart Cloud Foundry applications. Network policies also continue to work when
applications are redeployed, scaled up or down, or placed in different containers.
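For example, such a policy can be created with the standard CF CLI (the application names and port are illustrative):

# Allow the app "frontend" to reach the app "backend" directly on TCP port 8080
cf add-network-policy frontend --destination-app backend --protocol tcp --port 8080

# List the currently configured policies
cf network-policies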
By default, Cloud Foundry includes a Silk CNI plug-in for providing overlay IP address
management and a VXLAN policy agent to enforce network policy for traffic between
applications. These network components are designed to be swappable components in Cloud
Foundry so they can be replaced with an alternate network implementation.
More details on customizing container networking in IBM Cloud Private Cloud Foundry can be
found at
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/cloud_foundry/configur
ing/configure_container_networking.html.
10.8 Security
In this section, we describe the encryption, credentials, and certificates aspects of IBM Cloud
Private Cloud Foundry.
This procedure can be used either to have the installer auto-generate new certificates, secrets, or tokens, or to update values generated by you or your security department. If the values are not part of the uiconfig file (which holds the wildcard certificates and the admin password), you need to locate the installer configuration directory. This is the directory that was specified when running the launch.sh command. Inside this directory, you will find a “CloudFoundry” directory.
This directory contains a certificates.yml and a credentials.yml file. These files might be encrypted and stored elsewhere when the installer is not being used. Make sure these files are restored and editable. Open the appropriate file and find the certificate, secret, or token you want to change. If you want to let the installer generate a new value, remove the entry. Alternatively, you can replace the entry with your own value, making sure to meet the complexity and length requirements described for the value.
When all of the changes have been made, make sure that launch.sh has been run and that the installer is available. Run cm engine reset, and then re-run the launch_deployment.sh command that you last used to install or update the system. If you used the Install Deployment Console, you can launch the installer there. The cm command sets all of the installation and update states to “READY”.
After the installer has completed, the new certificates, secrets, or tokens will have been
applied. This works for both the director and the Cloud Foundry installation. The process can
take a long time if director certificates are involved, because this might require all of the
BOSH virtual machines to be rebuilt.
10.9 Monitoring and logging
This section discusses how to manage monitoring and logging in an IBM Cloud Foundry
environment.
10.9.1 Monitoring
Monitoring is supported through internal and external options. External options include Prometheus, Splunk, and third-party Cloud Foundry compatible tools. Internal monitoring is provided through a shared Prometheus and Grafana tool set. The internal monitoring is provided in IBM Cloud Private Kubernetes and is accessible by its administrator. To enable the capability, you can use Prometheus exporters, which are configured by the installer.
The exporters are provided as a configured Helm chart that can be loaded, and automatically
starts tracking monitoring metrics for the Cloud Foundry deployment. This monitoring is
available through Full Stack and Enterprise Environment (Kubernetes-based Cloud
Foundry).
Figure 10-9 shows the flow from IBM Cloud Foundry to Exporter to Prometheus and pull from
Grafana.
Figure 10-9 The flow from IBM Cloud Foundry to Exporter to Prometheus and pull from Grafana
When the Helm chart is loaded, you have access to BOSH, API, and Firehose metrics. In
Grafana, you still need to find or define dashboards. There are a number of dashboards
available in the community to choose from and you can create your own.
The following are recommended BOSH dashboards found in the files bosh_overview and
bosh_deployments located at:
https://github.com/bosh-prometheus/prometheus-boshrelease/tree/master/jobs/bosh_da
shboards/templates.
10.9.2 Logging
A Cloud Foundry deployment can produce a lot of log output, both from the components that
make up the Cloud Foundry platform itself and the applications you run in the environment.
Logs produced by Cloud Foundry components, which we see as syslogs, can be viewed
directly on the BOSH instances (for Cloud Foundry Full Stack) or the Kubernetes pods (for
Cloud Foundry Enterprise Environment). Most logs are output to a subdirectory
corresponding to the job name under the /var/vcap/sys/log directory.
Syslogs are subject to rotation to avoid using up all the storage available to the Cloud
Foundry components, so their lifespan is limited.
Logs from applications deployed in the environment, or applogs, can be accessed using the
Cloud Foundry CLI or any tooling that integrates with the Cloud Foundry firehose, such as the
Cloud Foundry management console. These logs are a real-time stream, with a limited
amount of history available.
Cloud Foundry Full Stack provides extensions for log forwarding that allow you to forward logs
to a logging system of your choice for retention and analysis. An integration with the Elastic
Stack logging system in IBM Cloud Private is also provided.
Then, when the deployment is run, the extensions insert the configuration values into the required configuration files so that the “Deploy Cloud Foundry” state configures the Cloud Foundry components to forward logs accordingly. For syslog forwarding, all Cloud Foundry components are modified and individually forward their logs according to the configuration. For applog forwarding, the Doppler instances (typically 4), a core component of the Cloud Foundry logging system, are modified to forward all application logs. Both extensions can be configured automatically if you are using the integration with the Elastic Stack provided in IBM Cloud Private.
Integration with Elastic Stack in IBM Cloud Private is achieved by using a Helm chart that
Cloud Foundry Full Stack exports during the “Prepare Helm charts” state. After running this
state of the deployment, an archive file containing the ibm-cflogging Helm chart and required
images can be found in the IBMCloudPrivate subdirectory of the installation configuration
directory.
The ibm-cflogging Helm chart provides a Logstash deployment that can be configured to
accept syslogs and applogs from the cfp-ext-syslog-forwarder and cfp-ext-applog-forwarder
extensions. This deployment of Logstash outputs logs to the Elasticsearch deployment
provided by IBM Cloud Private, which allows you to view, search, and visualize logs in
Kibana. By default, all connections are made securely using mutual TLS.
Chapter 10. IBM Cloud Private Cloud Foundry and common systems administration tasks 341
If you provide the Configuration Manager endpoint address and access token when installing
the ibm-cflogging Helm chart, a job is run during installation that adds the chosen extensions
(cfp-ext-syslog-forwarder, cfp-ext-applog-forwarder, or both) to the deployment and
configures them with the required IP addresses, ports, certificates, keys, and other settings. A
typical setup process looks like this:
1. If not already installed, deploy the Cloud Foundry Full Stack environment, then obtain the
ibm-cflogging Helm chart archive from the installation configuration directory.
2. Use the cloudctl CLI to load the Helm chart archive into IBM Cloud Private (see the sketch after this list).
3. Obtain the Configuration Manager address and token from your Cloud Foundry Full Stack
deployment. Create a Kubernetes secret containing the token.
4. Install the ibm-cflogging chart from the IBM Cloud Private catalog, configuring as desired
and providing the Kubernetes secret for the Configuration Manager token.
5. In the Cloud Foundry deployment tool, launch a deployment to run the newly inserted
logging extensions. The “Deploy Cloud Foundry” state will be rerun with the new configuration
and the affected instances will be updated to forward syslogs or applogs to the Logstash
deployment created by ibm-cflogging.
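As a minimal sketch of steps 2 through 4, assuming an IBM Cloud Private 3.1.x cloudctl and
illustrative file, cluster, and secret names (the archive version, cluster URL, and cm-token
secret name are all hypothetical):
cloudctl login -a https://mycluster.icp:8443 -n kube-system
# Load the exported chart archive into the IBM Cloud Private catalog
cloudctl catalog load-archive --archive ./IBMCloudPrivate/ibm-cflogging-1.0.0.tgz
# Store the Configuration Manager token in a Kubernetes secret for the chart to reference
kubectl create secret generic cm-token --from-literal=token=<CONFIG_MANAGER_TOKEN> -n kube-system
The ibm-cflogging chart can then be installed from the catalog, with the cm-token secret
supplied as the Configuration Manager token.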
Full instructions for installing the ibm-cflogging chart for integration with Elastic Stack in IBM
Cloud Private, and for installing the logging extensions to forward syslogs and applogs to
other logging solutions, are available in the Knowledge Center. See the following articles:
Connecting to Elastic Stack in IBM Cloud Private:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/cloud_foundry/integrating/icplogging.html
Configuring platform system log forwarding:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/cloud_foundry/configuring/syslog_forwarding.html
Configuring application log forwarding:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/cloud_foundry/configuring/applog_forwarding.html
The ibm-osb-database service is installed and run on IBM Cloud Private using these high-level steps:
1. The ibm-osb-database IBM Cloud Private catalog archive can be found in the IBM Cloud
Private Cloud Foundry installation configuration directory. The archive contains a Docker
image and a Helm chart.
The standard Cloud Foundry CLI commands are used to manage buildpacks: cf
create-buildpack and cf delete-buildpack. The position can be changed by using cf
update-buildpack.
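For example, a minimal sketch with illustrative buildpack and file names:
cf create-buildpack my-buildpack ./my-buildpack.zip 1   # add at position 1
cf update-buildpack my-buildpack -i 2                   # move to position 2
cf delete-buildpack my-buildpack                        # remove the buildpack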
10.11.2 Application for an airgap environment
There are several Cloud Foundry buildpacks that assume the environment has internet
access and has the ability to access external websites and download additional prerequisite
software artifacts during cf push operations for an application. If you have an airgap
environment and do not have internet access, cf push operations will fail anytime something
needs to be downloaded from the internet. An example of this would be a Node.js app where
Node modules are typically downloaded when the app is pushed.
To avoid these node module downloads in an airgap environment, there are some additional
steps to perform prior to the cf push of the application. These steps need to be performed
on a machine that has internet connectivity. For example:
1. Download the Node.js app from the Git repository:
git clone https://github.com/IBM-Bluemix/get-started-node
2. Change to the app directory:
cd get-started-node
3. Install all the tools we need to create a Node.js airgap buildpack using the following
commands:
a. apt-get update
b. apt install nodejs
c. apt install npm
d. apt install nodejs-legacy
4. Force all the Node.js modules to be downloaded and packaged with the application. Run
the following:
a. npm install
b. npm dedupe
c. npm shrinkwrap
Now that the Node.js application is built to work in an airgap environment, a simple cf push
can be used to deploy the application. If you need to copy the application files to the airgap
environment, create a tar file and copy it to the airgap machine. After untarring the file, the
cf push can be performed.
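For example, assuming a reachable host named airgap-host (the host and user names are
illustrative):
tar -czf get-started-node.tar.gz get-started-node    # package the app with its node_modules
scp get-started-node.tar.gz user@airgap-host:/tmp/   # copy it to the airgap machine
# On the airgap machine:
tar -xzf /tmp/get-started-node.tar.gz
cd get-started-node && cf push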
There is also the option of doing both. This gives you seamless updates, with no need to
scale either deployment to handle the additional load while traffic is being redirected, and it
provides full HA in each data center while allowing for disaster recovery through
geographically separate deployments.
Appendix A
Client: &version.Version{SemVer:"v2.9.1",
GitCommit:"20adb27c7c5868466912eebdf6664e7390ebe710", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.9.1+icp",
GitCommit:"843201eceab24e7102ebb87cb00d82bc973d84a7", GitTreeState:"clean"}
Using the helm CLI
Follow the steps to review a list of available or installed packages:
1. Add a Helm repository. To add the Kubernetes Incubator repository, run the following
command:
helm repo add incubator https://kubernetes-charts-incubator.storage.googleapis.com/
2. View the available charts by running the following command:
helm search -l
3. Install a chart. Run the following command:
helm install --name=release_name stable/chart_in_repo --tls
In this command, release_name is the name for the release to be created from the chart,
and chart_in_repo is the name of the available chart to install. For example, to install the
WordPress chart, run the following command:
helm install --name=my-wordpress stable/wordpress --tls
4. List the releases by running the following command:
helm list --tls
The output should be similar to Example A-2.
You can follow the prompts to choose the account and namespace, or you can provide those
options (except the username and password) in the command. The same action can be
performed in one command:
cloudctl login [-a CLUSTER_URL] [-c ACCOUNT_ID or ACCOUNT_NAME] [-n namespace]
[--skip-ssl-validation]
Important: Providing your password as a command line option is not recommended. Your
password might be visible to others and might be recorded in your shell history.
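For example, a minimal login sketch with a hypothetical cluster URL (you are prompted for
the username and password):
cloudctl login -a https://mycluster.icp:8443 -n default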
cloudctl logout will log out the session from the IBM Cloud Private cluster.
For more information, see the IBM Knowledge Center, or use cloudctl --help or cloudctl -h
as shown in Figure A-2 on page 352.
To manage the API keys, IDs, and the service policies of an IBM Cloud Private cluster, use
the cloudctl iam commands. Run the following to see the list of available commands:
cloudctl iam --help
For more information about using the cloudctl iam commands, visit
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.0/manage_cluster/cli_iam_commands.html
cloudctl cm commands
To manage an IBM Cloud Private cluster, use the following cloudctl cm commands:
cloudctl cm credentials-set-openstack: Sets the infrastructure account credentials for
the OpenStack cloud provider.
cloudctl cm credentials-set-vmware: Sets the infrastructure account credentials for the
VMware cloud provider.
cloudctl cm credentials-unset: Removes cloud provider credentials. After you remove
the credentials, you cannot access the cloud provider.
cloudctl cm machine-type-add-openstack: Adds an OpenStack machine type. A
machine type determines the number of CPUs, the amount of memory, and disk space
that is available to the node.
cloudctl cm machine-type-add-vmware: Adds a VMware machine type. A machine type
determines the number of CPUs, the amount of memory, and disk space that is available
to the node.
cloudctl cm machine-types: Lists available machine types. A machine type determines
the number of CPUs, the amount of memory, and disk space that is available to the node.
cloudctl cm master-get: Views the details about a master node.
cloudctl cm masters: Lists all master nodes.
cloudctl cm nodes: Lists all nodes.
cloudctl cm proxies: Lists all proxy nodes.
cloudctl cm proxy-add: Adds a proxy node to a cluster.
cloudctl cm proxy-get: Views the details about a proxy node.
cloudctl cm proxy-rm: Removes proxy nodes.
cloudctl cm registry-init: Initializes cluster image registry.
cloudctl cm worker-add: Adds a worker node to a cluster.
cloudctl cm worker-get: Views the details about a worker node.
cloudctl cm worker-rm: Removes worker nodes.
cloudctl cm workers: Lists all worker nodes in an existing cluster.
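For example, a minimal sketch (the worker node name is illustrative, and exact arguments
can vary by release):
cloudctl cm nodes                          # list all nodes
cloudctl cm workers                        # list all worker nodes
cloudctl cm worker-get mycluster-worker-1  # view details for one worker node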
Kubectl
To manage Kubernetes clusters and IBM Cloud Private, you can use the kubectl commands.
In this section we show you some of the kubectl commands that help you manage your
cluster and IBM Cloud Private installation.
For details about how to install kubectl, see the Knowledge Center link:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/manage_cluster/install_kubectl.html
Tip: You can run kubectl --help to see the options for this command. See Example A-3.
As you can see in Example A-3, it is possible to see the details specifically for each
command.
The full list of options is available when you run kubectl, as shown in Example A-4.
Deploy Commands:
rollout Manage the rollout of a resource
scale Set a new size for a Deployment, ReplicaSet, Replication
Controller, or Job
autoscale Auto-scale a Deployment, ReplicaSet, or ReplicationController
Advanced Commands:
apply Apply a configuration to a resource by filename or stdin
patch Update field(s) of a resource using strategic merge patch
replace Replace a resource by filename or stdin
wait Experimental: Wait for a specific condition on one or many
resources.
convert Convert config files between different API versions
Settings Commands:
label Update the labels on a resource
annotate Update the annotations on a resource
completion Output shell completion code for the specified shell (bash or
zsh)
Other Commands:
alpha Commands for features in alpha
api-resources Print the supported API resources on the server
api-versions Print the supported API versions on the server, in the form of
"group/version"
config Modify kubeconfig files
plugin Provides utilities for interacting with plugins.
version Print the client and server version information
Usage:
kubectl [flags] [options]
Use "kubectl <command> --help" for more information about a given command.
Use "kubectl options" for a list of global command-line options (applies to all
commands).
In this section we describe the commonly used commands for server management and
troubleshooting.
kubectl get
The kubectl get command is used to get information about the Kubernetes environment
such as resources, pods and node information.
The kubectl get command runs against the namespace that was selected during the login. If
needed it could be specified with the flag -n <namespace>.
If you need more information, you can see the command kubectl get nodes -o wide.
Example A-6 shows the output of the command kubectl get nodes.
Optionally, if you need to get the status or information about a specific node, you can specify
the name of the node in the command.
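For example (the node and namespace names are illustrative):
kubectl get pods -n kube-system           # status of all pods in a given namespace
kubectl get nodes -o wide                 # node list with additional detail
kubectl get node mynode1.icp.example.com  # status of one specific node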
This command is useful when trying to get the status of the pod. It is frequently used when
troubleshooting a pod issue as described in Chapter 8, “Troubleshooting” on page 273.
kubectl logs
The kubectl logs command shows the logs of a resource or a pod. This command is useful
when troubleshooting an application and when you need more information about it.
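For example (the pod and container names are illustrative):
kubectl logs my-app-pod -n development                  # print the pod's logs
kubectl logs -f my-app-pod -n development               # stream the logs in real time
kubectl logs my-app-pod -c my-container -n development  # logs of a specific container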
Select a namespace:
1. cert-manager
2. default
3. development
4. finance
5. ibmcom
6. istio-system
7. kube-public
8. kube-system
9. marketing
10. platform
11. services
Enter a number> 1
Targeted namespace cert-manager
root@web-terminal-597c796cc-r5czh:/usr/local/web-terminal# ls
app.js index.html node_modules package.json style.css
certs main.js package-lock.json static
root@web-terminal-597c796cc-r5czh:/usr/local/web-terminal#
It is also possible to run a specific command, such as ls -ltr /tmp. See Example A-10.
As you can see in this example, this command lets you work inside the pod as if it were a
local VM.
kubectl describe
The command kubectl describe is used to get information about pods, nodes, and other
Kubernetes resources:
With the previous command it is possible to see the details for the running nodes, including
pods and used resources.
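For example (the resource names are illustrative):
kubectl describe node mynode1.icp.example.com   # one node's capacity, pods, and events
kubectl describe pod my-app-pod -n development  # pod events, useful when troubleshooting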
Suppose we have pod foo with the label test. We update the pod foo by removing the label
named test, if it exists, with the following command (a trailing dash removes a label):
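kubectl label pods foo test-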
After you remove the label, the pod is taken out of service, and the deployment spawns a
new pod with the test label.
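The command
kubectl taint nodes node1 key=value:NoSchedule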
places a taint on node node1. The taint has a key, a key value, and the taint effect
NoSchedule. This means that no pod will be able to schedule onto node1 unless it has a
matching toleration. To remove the taint added by the command above, you can run:
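kubectl taint nodes node1 key:NoSchedule-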
You specify a toleration for a pod in the PodSpec. Both of the following tolerations match the
taint created by the kubectl taint line above, thus a pod with either toleration would be able
to schedule onto node1.
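The two tolerations, as given in the Kubernetes documentation for this example, are:
tolerations:
- key: "key"
  operator: "Equal"
  value: "value"
  effect: "NoSchedule"
and:
tolerations:
- key: "key"
  operator: "Exists"
  effect: "NoSchedule"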
Updating resources
The following list describes some example commands for updating resources:
Rolling update “www” containers of “frontend” deployment, updating the image:
kubectl set image deployment/frontend www=image:v2
Rollback to the previous deployment:
kubectl rollout undo deployment/frontend
Watch rolling update status of “frontend” deployment until completion:
kubectl rollout status -w deployment/frontend
Create a service for a replicated nginx, which serves on port 80 and connects to the
containers on port 8000:
kubectl expose rc nginx --port=80 --target-port=8000
Update a single-container pod’s image version (tag) to v4:
kubectl get pod mypod -o yaml | sed 's/\(image: myimage\):.*$/\1:v4/' | kubectl replace -f -
Scaling resources
The following lists some example commands for scaling resources:
Scale a replicaset named ‘foo’ to 3:
kubectl scale --replicas=3 rs/foo
Scale a resource specified in “foo.yaml” to 3:
kubectl scale --replicas=3 -f foo.yaml
If the deployment named mysql’s current size is 2, scale mysql to 3:
kubectl scale --current-replicas=2 --replicas=3 deployment/mysql
Scale multiple replication controllers:
kubectl scale --replicas=5 rc/foo rc/bar rc/baz
The publications listed in this section are considered particularly suitable for a more detailed
discussion of the topics covered in this book.
IBM Redbooks
The following IBM Redbooks publications provide additional information about the topic in this
document. Note that some publications referenced in this list might be available in softcopy
only.
IBM Cloud Private Application Developer's Guide, SG24-8441
You can search for, view, download, or order these documents and other Redbooks,
Redpapers, Web Docs, draft and additional materials, at the following website:
ibm.com/redbooks
Online resources
These websites are also relevant as further information sources:
IBM Cloud Private v3.1.2 documentation
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/kc_welcome_containers.html
IBM Cloud Paks Knowledge Center link:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/app_center/cloud_paks_over.html
IBM Cloud Private for Data link:
https://www.ibm.com/analytics/cloud-private-for-data
IBM Cloud Private v3.1.2 supported platforms:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.2/supported_system_config/supported_os.html
IBM Cloud Automation Manager Knowledge Center link:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.1/featured_applications/cam.html
IBM Cloud Transformation Advisor Knowledge Center link:
https://www.ibm.com/support/knowledgecenter/en/SSBS6K_3.1.1/featured_applications/transformation_advisor.html
IBM Cloud Architecture Center:
https://www.ibm.com/cloud/garage/architectures/private-cloud/reference-architecture
Downloading Terraform:
https://www.terraform.io/downloads.html
SG24-8440-00
ISBN 0738457639
Printed in U.S.A.