Presentation of the new Ceilometer (OpenStack Telemetry) features in OpenStack Havana and a look at the features coming in Icehouse. Joint presentation given with Julien Danjou at OpenStack in Action 4 (December 5th, 2013).
From Ceilometer to Telemetry: not so alarming!
1. From Ceilometer
to Telemetry
Not so alarming!
A Julien Danjou & Nick Barcet presentation
for
OpenStack in action! 4
on the 5th December 2013
2. Speakers
Nick Barcet
VP Products @ eNovance
Co-founded the Ceilometer project at the Folsom
summit and led the project through incubation
Julien Danjou
Ceilometer Lead Dev @ eNovance
Has been a core Ceilometer contributor from the
outset, taking over the PTL reins for Havana
3. State of the project
● Officially named OpenStack Telemetry
● Havana is the first integrated release
● Community growth
○ Grizzly: 30 contributors, 267 commits
○ Havana: 57 contributors, 434 commits
5. UDP transport
● Faster, stateless
● Lighter (msgpack encoding)
but…
● No delivery guarantee
● Not signed
▶ Use case: gathering metrics for alarms
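A minimal sketch of the trade-off above: the sender fires one datagram and moves on, with no connection, acknowledgement, or signature. The slide names msgpack as the encoding; this sketch substitutes the standard-library json module so that it runs without extra packages. The function names and sample fields are illustrative assumptions, not Ceilometer's actual wire format.

```python
import json
import socket


def send_sample_udp(sample, host, port):
    """Fire-and-forget: serialize one sample and send it as a single datagram."""
    payload = json.dumps(sample).encode("utf-8")  # stand-in for msgpack
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # No handshake and no acknowledgement: a lost datagram is simply lost.
        sock.sendto(payload, (host, port))
    return len(payload)


def receive_sample_udp(sock, bufsize=65535):
    """Read one datagram and decode it; the payload is neither signed nor verified."""
    data, _addr = sock.recvfrom(bufsize)
    return json.loads(data.decode("utf-8"))
```

Losing an occasional datagram is tolerable when the samples only feed alarm statistics, which is the use case the slide gives.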
6. Improved API
● Group samples by fields when requesting
statistics (?groupby[]=user_id)
● Limit the number of items returned (?limit=42)
● Provides links to other resources in the API
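The two query parameters above can be sketched as a small URL builder. The parameter spellings (`groupby[]`, `limit`) are taken from the slide; the base URL, the paths, and the helper name `api_url` are assumptions made for illustration, and which endpoints accept which parameter should be checked against the API reference.

```python
from urllib.parse import urlencode


def api_url(base_url, path, groupby=(), limit=None):
    """Build a request URL with optional group-by fields and a result limit."""
    params = [("groupby[]", field) for field in groupby]
    if limit is not None:
        params.append(("limit", int(limit)))
    url = base_url.rstrip("/") + path
    if params:
        # Keep the brackets readable instead of percent-encoding them.
        url += "?" + urlencode(params, safe="[]")
    return url


# Hypothetical endpoint and paths, for illustration only.
print(api_url("http://localhost:8777", "/v2/meters/cpu_util/statistics",
              groupby=["user_id"]))
print(api_url("http://localhost:8777", "/v2/meters/cpu_util", limit=42))
```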
7. Send your own samples
Users or operators can
send samples
➔ Leverage the
statistics
➔ Usable for alarming
POST /v2/meters/mymeter
[{
    "counter_type": "gauge",
    "counter_unit": "megabyte",
    "counter_volume": 142.0,
    "user_id": "efd87807-12d2-4b38-9c70-5f5c2ac427ff",
    "project_id": "35b17138-b364-4e6a-a131-8f3099c5be68",
    "resource_id": "bd9431c1-8d69-4ad3-803a-8d4a6b89fd36",
    "resource_metadata": {
        "name1": "value1",
        "name2": "value2"
    },
    "source": "mypaasplatform",
    "timestamp": "2013-09-10T20:34:13.711330"
}]
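As a rough sketch, the request above could be prepared from Python with the standard library alone. The base URL, the token value, and the helper name `build_sample_request` are assumptions; the `X-Auth-Token` header is the usual way to pass a Keystone token. The request is only built here, not sent, so the sketch runs offline.

```python
import json
from urllib.request import Request


def build_sample_request(base_url, meter, samples, token):
    """Prepare (but do not send) a POST carrying a list of samples."""
    body = json.dumps(samples).encode("utf-8")
    return Request(
        url=f"{base_url.rstrip('/')}/v2/meters/{meter}",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
    )


sample = {
    "counter_type": "gauge",
    "counter_unit": "megabyte",
    "counter_volume": 142.0,
    "resource_id": "bd9431c1-8d69-4ad3-803a-8d4a6b89fd36",
    "source": "mypaasplatform",
    "timestamp": "2013-09-10T20:34:13.711330",
}
# Hypothetical endpoint and token; urllib.request.urlopen(req) would send it.
req = build_sample_request("http://localhost:8777", "mymeter", [sample], "TOKEN")
```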
9. Database TTL
Previously:
No way to purge data.
Ceilometer produces a lot of data
(gigabytes per day)
Now:
ceilometer-expirer will drop data older
than the configured time-to-live delay
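The expiry rule can be illustrated with an in-memory sketch: anything older than the time-to-live is dropped. This is not the ceilometer-expirer code, which works against the storage backend; the function name and the sample layout are assumptions for illustration.

```python
from datetime import datetime, timedelta


def expire_samples(samples, ttl_seconds, now):
    """Keep only the samples whose timestamp falls within the time-to-live."""
    cutoff = now - timedelta(seconds=ttl_seconds)
    return [s for s in samples if s["timestamp"] >= cutoff]


now = datetime(2013, 12, 5, 12, 0, 0)
samples = [
    {"meter": "cpu_util", "timestamp": now - timedelta(days=40)},
    {"meter": "cpu_util", "timestamp": now - timedelta(days=2)},
]
# With a 30-day TTL only the two-day-old sample survives.
kept = expire_samples(samples, ttl_seconds=30 * 24 * 3600, now=now)
```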
11. New meters
● API endpoints
○ Meter the requests made to API servers (Neutron,
Glance, Nova, Swift, etc.)
● Neutron bandwidth
○ Meter the bandwidth consumed by each project
○ Traffic labeled as configured by operator
(based on source/destination)
15. Alarm types
● Threshold alarms
Triggered once a value crosses a threshold
“Call a Webhook as soon as CPU usage goes above 80%”
● Combination alarms
Triggered once all the alarms it combines are triggered
“Call a Webhook as soon as alarms ‘foo’ and ‘bar’ are both
triggered”
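The two alarm types can be sketched as two small predicates. This is a simplified illustration rather than the evaluator's real logic: the helper names are hypothetical, and the state string "alarm" and the operator names such as "gt" are assumed to match the ones used by the API.

```python
import operator

# Comparison operators keyed by the short names used in alarm definitions.
COMPARATORS = {
    "gt": operator.gt, "ge": operator.ge,
    "lt": operator.lt, "le": operator.le,
    "eq": operator.eq, "ne": operator.ne,
}


def threshold_triggered(value, comparison_operator, threshold):
    """Threshold alarm: true once the value crosses the threshold."""
    return COMPARATORS[comparison_operator](value, threshold)


def combination_triggered(alarm_states, alarm_names):
    """Combination alarm: true only when every named alarm is in state 'alarm'."""
    return all(alarm_states.get(name) == "alarm" for name in alarm_names)


# "Call a Webhook as soon as CPU usage goes above 80%"
cpu_alarm = threshold_triggered(83.5, "gt", 80.0)
# "... as soon as alarms 'foo' and 'bar' are both triggered"
both = combination_triggered({"foo": "alarm", "bar": "ok"}, ["foo", "bar"])
```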
16. Alarms API
POST /v2/alarms
GET /v2/alarms/foobar
PUT /v2/alarms/foobar
{
    "alarm_actions": ["http://site:8000/alarm"],
    "insufficient_data_actions": ["http://site:8000/nodata"],
    "ok_actions": ["http://site:8000/ok"],
    "comparison_operator": "gt",
    "description": "An alarm",
    "evaluation_periods": 2,
    "matching_metadata": {"key_name": "key_value"},
    "meter_name": "storage.objects",
    "name": "SwiftObjectAlarm",
    "period": 240,
    "statistic": "avg",
    "threshold": 200.0
}
DELETE /v2/alarms/foobar
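To show how `period`, `evaluation_periods`, `statistic`, and `threshold` might interact, here is a simplified sketch that handles only the combination used in the body above (`avg` with `gt`). The rule that every period must cross the threshold, and that an empty period yields "insufficient data", is an assumption about the evaluator's behaviour; the function name and the `(timestamp, value)` sample layout are hypothetical.

```python
from statistics import mean


def evaluate_alarm(alarm, samples, now):
    """Return 'alarm', 'ok' or 'insufficient data' for an avg/gt alarm.

    samples is a list of (timestamp_in_seconds, value) pairs.
    """
    period = alarm["period"]
    averages = []
    for i in range(alarm["evaluation_periods"]):
        end = now - i * period
        start = end - period
        values = [v for t, v in samples if start < t <= end]
        if not values:
            return "insufficient data"
        averages.append(mean(values))
    if all(avg > alarm["threshold"] for avg in averages):
        return "alarm"
    return "ok"


alarm = {"period": 240, "evaluation_periods": 2, "threshold": 200.0}
# Two 240-second windows ending at t=480: averages 220.0 and 250.0.
state = evaluate_alarm(alarm, [(100, 250.0), (300, 210.0), (400, 230.0)], now=480)
```

Each resulting state corresponds to one of the action lists in the body (`alarm_actions`, `ok_actions`, `insufficient_data_actions`).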
17. Heat & auto-scaling
[Diagram. Components: Heat Engine; my_stack (one instance); Ceilometer (API service, compute agent, alarm evaluator). Arrow labels: creates alarms, injects user metadata, monitors instances, triggers alarm.]
18. Heat & auto-scaling
[Diagram. Components: Heat Engine; my_stack (three instances); Ceilometer (API, compute, alarming); alarms. Arrow labels: injects user metadata, scales out stack.]
19. Heat & auto-scaling
[Diagram. Components: Heat Engine; my_stack (five instances); Ceilometer (API, compute, alarming); alarms. Arrow labels: injects user metadata, scales out stack.]
20. Events storage
(Almost) all OpenStack components send notifications on
events: let’s store them.
➔ Useful to be able to re-generate samples
➔ Useful to generate new samples we did not think about
➔ Allows double-entry accounting
➔ Auditability
Not yet complete, to be continued in Icehouse
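The point about re-generating samples can be sketched as follows: once events are stored, a meter defined later can still be computed over past events. The event layout, the helper name, and the "memory" meter are illustrative assumptions, not the Ceilometer event schema.

```python
def samples_from_events(events, event_type, meter, volume_of):
    """Derive samples for one meter from already-stored events."""
    return [
        {
            "meter": meter,
            "resource_id": event["resource_id"],
            "timestamp": event["timestamp"],
            "volume": volume_of(event),
        }
        for event in events
        if event["event_type"] == event_type
    ]


# Hypothetical stored events.
stored = [
    {"event_type": "compute.instance.create.end", "resource_id": "vm-1",
     "timestamp": "2013-12-05T10:00:00", "payload": {"memory_mb": 2048}},
    {"event_type": "image.upload", "resource_id": "img-7",
     "timestamp": "2013-12-05T10:05:00", "payload": {"size": 1024}},
]
# A meter introduced after the fact is computed from the stored history.
memory = samples_from_events(stored, "compute.instance.create.end", "memory",
                             lambda e: e["payload"]["memory_mb"])
```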
22. General improvements
● Split the collector in two logical pieces
● Rely on notifications for samples rather than
RPC
● Bring the SQLAlchemy and MongoDB drivers
almost to parity
● Support for hardware polling
● Support Ironic
23. API improvements
● Complex filtering and query DSL
x OR y AND z
● /v2/samples
(a.k.a. /v2/meters without the meter)
● Return rate rather than absolute value
● More statistics functions (rate of change,
moving-window averages…)
● Bulk requests
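One way to picture the complex filtering item is a nested expression evaluated against each sample. The structure below, with "and"/"or" nodes over single-field comparisons, is a hypothetical encoding chosen for illustration and may differ from the DSL that ships; it expresses `x OR y AND z` with AND binding tighter than OR.

```python
import operator

LEAF_OPS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}


def matches(expr, sample):
    """Evaluate a nested filter expression against one sample dict."""
    (op, arg), = expr.items()  # each node has exactly one operator key
    if op == "and":
        return all(matches(sub, sample) for sub in arg)
    if op == "or":
        return any(matches(sub, sample) for sub in arg)
    (field, value), = arg.items()  # leaf: {"<op>": {"<field>": <value>}}
    return LEAF_OPS[op](sample[field], value)


# x OR (y AND z)
query = {"or": [
    {"=": {"project_id": "p1"}},
    {"and": [{"=": {"meter": "cpu_util"}}, {">": {"volume": 80}}]},
]}
hit = matches(query, {"project_id": "p2", "meter": "cpu_util", "volume": 91})
```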
25. Distributed polling
Leveraging Tooz and Taskflow to distribute
tasks among workers (agents).
★ Ability to distribute the polling
★ Replace the alarm evaluator's custom distributor
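The idea of distributing polling among agents can be sketched with plain hash partitioning: every agent applies the same rule to the same worker list, so each resource is polled by exactly one agent. This sketch does not use Tooz or Taskflow; in the design described above the live worker list would come from the coordination layer, whereas here it is a plain list and the function names are hypothetical.

```python
import hashlib


def owner(resource_id, workers):
    """Pick the worker responsible for a resource, deterministically."""
    digest = hashlib.sha256(resource_id.encode("utf-8")).hexdigest()
    ordered = sorted(workers)  # same order on every agent
    return ordered[int(digest, 16) % len(ordered)]


def my_share(resource_ids, workers, me):
    """Return the resources this agent should poll."""
    return [r for r in resource_ids if owner(r, workers) == me]


workers = ["agent-a", "agent-b", "agent-c"]
resources = [f"instance-{i}" for i in range(20)]
shares = {w: my_share(resources, workers, w) for w in workers}
```

A simple modulo scheme reshuffles most assignments when the worker list changes; consistent hashing would reduce that churn at the cost of more code.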