The ATMOSPHERE project aims to develop a federated cloud platform and associated tools to enable trustworthy distributed data processing and management across international borders. Key expected results include a development framework, mechanisms for evaluating and monitoring trustworthiness, and a pilot use case involving medical imaging processing in Brazil. The platform will provide various services while addressing challenges like sensitive data access, privacy, and infrastructure management across multiple cloud providers and regions.
Software Defined Networking in the ATMOSPHERE project
1. Co-funded by the European Commission
Horizon 2020 - Grant #777154
Software Defined Networking in the ATMOSPHERE project
Giacomo Verticale
Politecnico di Milano
2. • ATMOSPHERE is a 24-month H2020 project aiming at the design and development of a framework and a platform to implement trustworthy cloud services on a federated intercontinental cloud.
• Expected Results
  • A federated cloud platform.
  • A development framework.
  • Trustworthy evaluation and monitoring.
  • Trustworthy Distributed Data Management.
  • Trustworthy Distributed Data Processing.
  • A pilot use case on Medical Imaging Processing.
The Project

[Architecture diagram: Applications sit on top of Trustworthy Data Processing Services (TDPS), Trustworthy Data Management Services (TDMS) and Infrastructure Management Services (IMS), all running over the Federated Infrastructure, with cross-cutting Trustworthiness Monitoring & Assessment (TMA).]
3. The problem
"I do not want to care about the infrastructure, resource management, job scheduling, secure access and similar burdens. Moreover, I want to guarantee that no sensitive data is exposed outside of the country where it was produced."
"I need to build an Image Processing Tool that uses sensitive data and has high computing demands. Once developed, I want to offer it as a service, securely and with guaranteed Quality of Service."
5. • PROVAR study – the first large-scale RHD (rheumatic heart disease) screening program in Brazil.
• RHD screening in public schools, private schools and primary health units in the cities of Belo Horizonte, Montes Claros and Bocaiúva, Minas Gerais, Brazil.
The Data
6. • The characterization of echocardiographic images obtained in public schools.
• 5,600 exams, with an average of 14 videos per exam (total of 75,836 videos):
  • 5,330 exams classified as normal (71,686 videos) - 95%.
  • 238 exams classified as borderline RHD (3,649 videos) - 4%.
  • 32 exams classified as definite RHD (501 videos) - 1%.
• Additionally, there is another databank with 3.5 million electrocardiograms from the same population area and age range.
• Mean age: 13 ± 3 y.o.; female sex: 55%.
Image Biobank Requirements
7. • Sensitive data must not be accessible outside the borders of the hosting country.
  • Sensitive data is protected by the Brazilian LGPD and must be processed under strong access-protection mechanisms, robust even on a potentially vulnerable cloud offering.
  • Anonymised data, though, can be released, but should remain accessible only in a secured environment.
• Medical imaging processing and machine-learning model building require intensive computing resources.
  • The required processing capacity may not be available within the borders where the data is located, and therefore such processing algorithms must run elsewhere.
  • Access should be coherent and secure, and image processing should be efficient.
• Experiments should be reproducible and stable.
  • Model building, image processing and classification should run on well-defined environments that can be reproduced for further analysis.
Image Biobank Requirements
8. • Trust is a choice based on past experience. Trust takes time to build, but it can disappear in a second.
• Trusting cloud services is as complicated as trusting people. You need a way to measure it, and pieces of evidence on which to build trust.
  • Trust in a cloud environment is considered the reliance of a customer on a cloud service and, consequently, on its provider.
• Trust is based on a broad spectrum of properties such as Security, Privacy, Coherence, Isolation, Stability, Fairness, Transparency and Dependability.
• Nowadays, few approaches deal with the quantification of trust in cloud computing.
What is trust?
9. • Trustworthiness is considered in its multiple dimensions:
  • Security, as the capability to defend against attacks.
  • Privacy, as the inherent risk that a dataset contains re-identifiable data.
  • Coherence, as the capability of providing coherent behaviour from any point of the federation.
  • Isolation, as the difference between a service running isolated or not.
  • Stability, as the idempotency and stability of the services.
  • Fairness, as the absence of undesirable or hidden biases.
  • Transparency, as the capability to understand the output of a system.
  • Dependability, mainly focusing on availability and reliability.
• Measuring the trustworthiness properties:
  • A priori and a posteriori evaluation of vulnerability, performance, re-identification risks, data loss rate, integrity, robustness, scalability, resource consumption, classification bias and isolation.
Trustworthiness life-cycle
10. • Along with these requirements, we explore others:
  • Measurement of the Fairness of the models, to evaluate the bias of a model with respect to sensitive categories such as gender or race.
  • Evaluation of the Explainability of the model.
  • Evaluation of the privacy-loss risk, to determine the quality of the anonymisation and the potential leakage of personal data inside the models.
Image Biobank Requirements
"... successfully reidentified the demographic data of 4478 adults (94.9%) & 2120 children (87.4%) ..." (P < .001)
11. The Previous Situation
Application Developers
- Develop the tools for processing the data.
- Require the infrastructure to provide certain services and resources, such as computing, secure storage, high availability and data persistence.
- Deliver the applications to others to operate.
Application Manager
- An Application Developer may not be in charge of deploying the application on the production infrastructure.
- Deployment implies the monitoring and management of the resources, services, user accounts and data.
- The Application Manager holds access credentials to the infrastructure and decides the optimal allocation of resources.
End-Users
- Data providers and data scientists exploring and processing data.
- Need secure data transfer and data-access tracing, as well as simplified processing tools.
- No need to worry about acquiring ICT skills.
15. ● Lemonade* is a web-based system for designing and running analytics applications.
● Users, who are not necessarily programmers, describe applications as workflows; Lemonade generates code and controls their execution.
● Workflows consist of operations (boxes) and data flows (arrows) among them, performing:
  ⁃ Data preparation and engineering
  ⁃ Machine learning methods (MLlib)
  ⁃ Visualization metaphors
LEMONADE
16. Supported Trustworthiness properties

• Stability: Developers provide stability strategies (e.g., cross-validation); Data Scientists assure the quality of model outcomes (e.g., calibrating cross-validation and evaluating accuracy variance).
• Privacy: Developers provide privacy-preserving algorithms and techniques (e.g., k-anonymity); Data Scientists assess the impact of preserving privacy on the outcome's utility and effectiveness.
• Transparency: Developers provide transparency methods to be combined with different data analytic flows (e.g., LIME/SHAP methods); Data Scientists execute ML models and, based on the explanations, calibrate the model or enhance the input.
• Fairness: Developers provide fairness-enhancing mechanisms and strategies (e.g., the Aequitas toolkit); Data Scientists generate reports to evaluate fairness and decide which features to include in models.
17. Privacy assessment forms (PAF)
• PAF assists organizations that own and process datasets in understanding how the processing of data can affect their conformance with privacy regulations (GDPR and LGPD).
• These assessments may be used to generate appropriate security/privacy policies and checks used by other services.
19. • Typical best practices:
  • Data in transit and at rest can be encrypted.
  • Some processing can even be done over encrypted data.
  • Keys and certificates are not included in repositories.
• But this is not enough...
  • If an attacker has access to the machine (VM escapes, internal attacker, cold-boot attacks), code can be changed and memory can be dumped.
  • Keys or data can be stolen.
Data access challenges
20. The Vallum Framework

[Architecture diagram: a Data Protection Layer (Vallum) providing proxying, authentication, authorization, privacy enforcement and auditing sits between clients and the data stores: a columnar DBMS (e.g., Cassandra), a relational DBMS (e.g., MySQL), a document store (e.g., MongoDB) and a file system (e.g., IPFS). Each incoming query is rewritten into a modified query against the back end, and only compliant results are returned. The layer runs in a trusted execution environment (TEE); raw data is encrypted at rest and data is encrypted in transit.]
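The query-rewriting idea behind this layer can be sketched in a few lines. This is a hypothetical illustration, not Vallum's actual API: the policy table, profile names and SQL shape are all invented for the example.

```python
# A hypothetical sketch of Vallum-style query rewriting: the proxy
# rewrites a query according to the caller's profile before forwarding
# it, so only compliant results come back. The policy table and the
# SQL shape are illustrative, not the real Vallum interface.
DENIED_COLUMNS = {"external": {"name", "address"}}

def rewrite_query(columns, table, profile):
    allowed = [c for c in columns if c not in DENIED_COLUMNS.get(profile, set())]
    if not allowed:
        raise PermissionError("no accessible columns for this profile")
    return f"SELECT {', '.join(allowed)} FROM {table}"

# an external user never sees the protected 'name' column
print(rewrite_query(["name", "age", "diagnosis"], "exams", "external"))
# → SELECT age, diagnosis FROM exams
```

A real deployment would rewrite parsed query trees rather than column lists, but the compliance decision sits in the same place: between authentication and the back-end store.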
24. An orchestration platform to manage a federated set of hybrid resources, providing measures, adaptive mechanisms and policies to improve trustworthiness:
● orchestration platform ➔ automatic configuration via TOSCA blueprints
● federated ➔ multiple clouds independently owned and managed, multi-tenancy
● hybrid resources ➔ CPUs, SGX, GPUs
● measures ➔ metrics and tools to evaluate the trustworthiness of cloud resources (availability, performance, etc.)
● adaptive mechanisms ➔ to scale or reallocate cloud and network resources
Trustworthy Infrastructure Management
25. Infrastructure Management Services

[Architecture diagram: the ATMOSPHERE Platform runs on a Federated Infrastructure of multiple Resource Providers, glued together by a federation middleware (one Fogbow instance per provider). Federation-wide monitoring services aggregate probes running at each site; an automated deployment service (EC3, TOSCA-IM) deploys the virtual infrastructure; and a performance prediction & assessment service performs model training and profiling.]
26. Network federation

[Diagram: Site A runs Cloud A (OpenStack) and Site B runs Cloud B (OpenNebula); each site has a DMZ and an internal zone hosting a Fogbow Dashboard, an FNS, a RAS, an XMPP service and OVS; ONOS instances with the IMR application control the OVS switches, and the sites are linked through IPSec tunnels.]

• Fogbow middleware can deploy multiple VMs over a single VLAN spanning multiple heterogeneous clouds.
• Each federated site holds:
  • a Federated Network Service (FNS)
  • a Resource Allocation Service (RAS)
  • an XMPP service
  • one or more instances of OpenVSwitch (OVS)
• Selected sites hold:
  • an instance of ONOS
  • an instance of the Intent Monitoring and Rerouting (IMR) application
27. Creation of a Network Federation

[Diagram: the same two-site setup as before (Site A, Cloud A on OpenStack; Site B, Cloud B on OpenNebula; each with FNS, RAS, XMPP, OVS and ONOS), with the numbered steps below overlaid.]

1. The Infrastructure Manager (IM) requests a new federated network and specifies the private IP range and the VLAN ID.
2. The IM requests a new local VM in the federation.
3. The FNS chooses an IP address, prepares the cloud-init script and forwards the request to the RAS.
4. The RAS sets up OVS to accept the incoming tunnel.
5. The RAS interacts with the cloud to create the VM.
6. The VM executes the cloud-init script and establishes a tunnel with OVS.
7. Other VMs are attached to the federated network in a similar way, with requests for VMs in remote sites being forwarded by the RAS accordingly.
8. ONOS sets up routing intents between pairs of VMs.
9. Intents are monitored and re-routed to guarantee availability (and latency).
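Step 3 can be sketched as follows. This is an illustrative reconstruction, not the actual Fogbow FNS code: the function names, the interface name `fednet0` and the bridge name `br-fed` are assumptions.

```python
import ipaddress

# Hypothetical sketch of step 3: the FNS allocates the next free address
# in the federated network's private range and renders a cloud-init
# script that makes the new VM join the site's OVS via a tunnel.
def allocate_ip(cidr, used):
    net = ipaddress.ip_network(cidr)
    for host in net.hosts():
        if str(host) not in used:
            return str(host)
    raise RuntimeError("federated network exhausted")

def render_cloud_init(ip, gateway, vlan_id):
    # The real script would configure the tunnel endpoint details; this
    # only sketches the parameters involved (IP, VLAN tag, gateway).
    return (f"#cloud-config\nruncmd:\n"
            f"  - ip addr add {ip}/24 dev fednet0\n"
            f"  - ovs-vsctl add-port br-fed fednet0 tag={vlan_id} "
            f"-- set interface fednet0 type=gre options:remote_ip={gateway}\n")

used = {"10.10.0.1"}            # the gateway already holds the first host
ip = allocate_ip("10.10.0.0/24", used)
print(ip)  # → 10.10.0.2
```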
28. DEMO: Distributed Implementation of Federated Networks (T4.4, D4.2)
1. Configuration of each datacenter:
  • one gateway VM (OVS)
  • one instance of ONOS
  • one or more VMs belonging to two federations
2. The IMS monitors link availability and assigns each link an «availability» score.
3. Two VMs in the same federation exchange traffic along the shortest path.
4. When an IPSec tunnel fails, traffic is immediately rerouted along a live path.
5. When the faulty IPSec tunnel becomes available again, traffic remains on the backup path until the availability score recovers.
6. When the availability score is high again, traffic is rerouted.
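The rerouting behaviour in the demo can be illustrated with a toy path computation. This is not the ONOS IMR code; the threshold, graph representation and scores are all invented for the sketch.

```python
import heapq

# Illustrative sketch of availability-driven rerouting: links whose
# score falls below a threshold are treated as down, and traffic
# follows the shortest (hop-count) path over the surviving links.
def shortest_live_path(links, src, dst, min_avail=0.9):
    # links: {(a, b): availability score in [0, 1]}, undirected
    adj = {}
    for (a, b), score in links.items():
        if score >= min_avail:
            adj.setdefault(a, []).append(b)
            adj.setdefault(b, []).append(a)
    frontier = [(0, src, [src])]
    seen = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt in adj.get(node, []):
            heapq.heappush(frontier, (cost + 1, nxt, path + [nxt]))
    return None

links = {("A", "B"): 0.99, ("B", "C"): 0.99, ("A", "C"): 0.95}
print(shortest_live_path(links, "A", "C"))  # → ['A', 'C']  (direct tunnel is live)
links[("A", "C")] = 0.2                     # the A-C IPSec tunnel fails
print(shortest_live_path(links, "A", "C"))  # → ['A', 'B', 'C']  (rerouted via B)
```

Keeping traffic on the backup path until the score recovers (demo step 5) corresponds to the failed link's score climbing back above `min_avail` only gradually.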
30. • The underlying infrastructure is a federated cloud
• Using fogbow (www.fogbowcloud.org) on OpenStack and OpenNebula.
• With a Federated Network to provide a coherent network space among nodes.
• Heterogeneous resources: SGX-enabled and GPU nodes.
• Using EC3(1) and Infrastructure Manager(2) to deploy a virtual
infrastructure.
Intercontinental infrastructure

[Deployment diagram: cloud resources in the EU (GPU-enabled resources running containers, deployed via EC3/TOSCA-IM and monitored by a Central TMA) and cloud resources in Brazil (SGX-enabled resources holding the encrypted PROVAR Study data in a container), each managed by its own Cloud Manager and joined by a Federation Layer over a secure overlay network.]

(1) https://marketplace.eosc-portal.eu/services/elastic-cloud-compute-cluster-ec3
(2) https://marketplace.eosc-portal.eu/services/infrastructure-manager-im
31. • The virtual infrastructure is managed by an elastic Kubernetes cluster spawned over the federated network.
• Containers and services are accessible from both sites, but only through the federated network.
• Resources are properly tagged (SGX and GPU capabilities, Brazil / Europe) so K8s applications are placed on the correct resource.
• Infrastructure is described as code(3).
• The K8s front-end is deployed first; nodes are powered on as the applications are deployed, creating requests for specific resources.

Deployment of the virtual infrastructure

(3) https://github.com/grycap/ec3/tree/atmosphere
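The tag-driven placement described above boils down to label matching, which K8s performs via node selectors. The sketch below mimics that rule in plain Python; the label keys (`site`, `sgx`, `gpu`) and node names are illustrative, not the project's actual labels.

```python
# Hypothetical sketch of label-based placement: a pod is eligible for a
# node when its selector is a subset of the node's labels, mirroring
# how Kubernetes nodeSelector matching works.
def eligible_nodes(nodes, selector):
    return [name for name, labels in nodes.items()
            if all(labels.get(k) == v for k, v in selector.items())]

nodes = {
    "br-sgx-1": {"site": "brazil", "sgx": "true"},
    "eu-gpu-1": {"site": "europe", "gpu": "true"},
    "eu-cpu-1": {"site": "europe"},
}

# secure storage must stay on SGX hardware in Brazil;
# model training needs a GPU node in Europe
print(eligible_nodes(nodes, {"site": "brazil", "sgx": "true"}))  # → ['br-sgx-1']
print(eligible_nodes(nodes, {"site": "europe", "gpu": "true"}))  # → ['eu-gpu-1']
```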
32. • A secure storage is deployed on the Brazilian side.
• It uses Vallum(4), a service that provides on-the-fly anonymisation based on policies.
  • It masks (or blurs) the fields marked as sensitive for different profiles of users.
  • It relies on an HDFS filesystem for the files and on SQL databases for the structured data.
• It runs the data anonymisation and sensitive-data access in enclaves on SGX-enabled containers, so they run securely even on untrusted clouds.
• Data remains encrypted on disk.

Secure storage at the Brazilian side

[Diagram: the Brazilian cloud resources, with Vallum running on SGX-enabled resources in front of the encrypted PROVAR Study data, under the site's Cloud Manager.]

(4) https://www.atmosphere-eubrazil.eu/vallum-framework-access-privacy-protection
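The masking/blurring behaviour can be sketched as a small policy lookup. This is a toy illustration of the idea, not Vallum's policy format: the profile names, field names and action vocabulary (`mask`, `blur`, `pass`) are assumptions.

```python
# A minimal sketch of policy-driven, on-the-fly masking in the spirit
# of Vallum: each user profile maps sensitive fields to an action.
POLICIES = {
    "external": {"name": "mask", "birth_date": "blur"},
    "clinician": {},  # no restrictions for authorised clinicians
}

def apply_policy(record, profile):
    out = {}
    for field, value in record.items():
        action = POLICIES.get(profile, {}).get(field, "pass")
        if action == "mask":
            out[field] = "***"
        elif action == "blur":
            out[field] = value[:4] + "-**-**"  # keep only the year
        else:
            out[field] = value
    return out

rec = {"name": "Maria S.", "birth_date": "2006-03-14", "diagnosis": "borderline RHD"}
print(apply_policy(rec, "external"))
# → {'name': '***', 'birth_date': '2006-**-**', 'diagnosis': 'borderline RHD'}
```

In the real system this logic runs inside an SGX enclave, so the plaintext values never leave the trusted execution environment.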
33. • External users request data from Vallum, but they can only access partially anonymised data.
• The anonymised data (~1 TB) is copied to where the computing accelerators are located.

Anonymised Data

[Deployment diagram: Vallum, running on SGX-enabled resources in Brazil, serves plain & anonymised data from the encrypted PROVAR Study; the anonymised copy travels through the Federation Layer and a storage service over the secure overlay network to the GPU-enabled application resources in the EU, monitored by the local TMA and the Central TMA.]
34. • Videos are split into frames and classified by color inspection.
  • A color-based segmentation using k-means clustering extracts the color pixels from the Doppler images.
• Images are classified according to their acquisition view using a CNN.
  • The parasternal long-axis view has proven relevant for obtaining an accurate classification.
• First- and second-order texture analyses characterize the images by the spatial variation of pixel intensities.
• Besides texture features, blood velocity information is also obtained.
• Finally, all the extracted features are classified through machine learning techniques in order to differentiate between RHD-positive and healthy subjects.

Building the models for the estimation pipeline

[Pipeline diagram. Data preparation: frame splitting, color-based segmentation (Doppler), preparation of images for the classifier, image classification. Data analysis: view classification (parasternal long axis), texture analysis & velocity extraction, features classification.]
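The k-means segmentation step can be illustrated with a toy one-dimensional version. Real Doppler frames are RGB images clustered in color space; the scalar "saturation" values below are invented for the sketch, and this is not the project's actual segmentation code.

```python
# A toy sketch of the color-based segmentation idea: 1-D k-means over
# per-pixel saturation values, separating colored Doppler-flow pixels
# from the grayscale tissue background.
def kmeans_1d(values, k=2, iters=20):
    centers = [min(values), max(values)]      # simple deterministic init
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            i = min(range(k), key=lambda c: abs(v - centers[c]))
            clusters[i].append(v)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers

# low values: grayscale tissue; high values: colored Doppler flow
pixels = [0.02, 0.05, 0.04, 0.81, 0.77, 0.9, 0.03, 0.85]
lo, hi = sorted(kmeans_1d(pixels))
colored = [p for p in pixels if abs(p - hi) < abs(p - lo)]
print(len(colored))  # → 4  (pixels assigned to the "color" cluster)
```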
35. • The pipeline is developed using LEMONADE(5).
• LEMONADE provides a GUI and a machine learning library to develop data analytics pipelines.
• Pipelines can be run interactively or transformed into executable code.
• Code can be run interactively or further embedded into services to be exposed for production.
• A model-building pipeline and an estimation pipeline are developed.

Coding the pipeline: LEMONADE

(5) https://www.atmosphere-eubrazil.eu/lemonade-live-exploration-and-mining-non-trivial-amount-data-everywhere
36. Fairness
● Algorithms, in ML and AI, learn by identifying patterns in data collected over many years. Why may algorithms become "unfair"?
  ○ By using unbalanced data sets, biased toward certain populations.
  ○ By using data sets that perpetuate historical biases.
  ○ By inappropriate data handling.
  ○ As a result of inappropriate model selection, or incorrect algorithm design or application.
● Algorithmic Fairness components:
  ○ Aequitas Bias and Fairness Audit Toolkit, proposed by the DSSG group at the University of Chicago (http://aequitas.dssg.io/).
  ○ Properties:
    ■ Equal Parity & Proportional Parity.
    ■ False Positive Rate and False Discovery Rate Parity.
    ■ False Negative Rate and False Omission Rate Parity.

[Fairness tree: Representation Fairness branches into Equal Parity and Proportional Parity; Error Fairness branches into FNR, FPR, FDR and FOR Parity.]
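Two of the parity properties listed above can be computed directly from per-group confusion counts. The sketch below does so by hand rather than through the Aequitas API; the group counts are invented for illustration.

```python
# A hedged sketch of FPR parity and FDR parity, computed from per-group
# confusion-matrix counts. A parity ratio of 1.0 means the two groups
# experience identical error rates.
def rates(tp, fp, tn, fn):
    fpr = fp / (fp + tn)          # false positive rate
    fdr = fp / (fp + tp)          # false discovery rate
    return fpr, fdr

def parity_ratio(x, y):
    return min(x, y) / max(x, y)

# illustrative counts for two demographic groups
g1 = rates(tp=40, fp=10, tn=90, fn=10)   # FPR 0.10, FDR 0.20
g2 = rates(tp=30, fp=20, tn=80, fn=20)   # FPR 0.20, FDR 0.40
print(parity_ratio(g1[0], g2[0]))  # → 0.5 (FPR parity: group 2 is twice as exposed)
print(parity_ratio(g1[1], g2[1]))  # → 0.5 (FDR parity)
```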
37. Interpretability
● An increase in model complexity typically reduces interpretability.
  ○ Complex multilayer Convolutional Neural Networks are far more difficult to explain than Decision Trees or Linear Regression.
● Effort is invested in characterizing explainability and providing information to explain how the algorithm reached its results.
  ○ δ-Interpretability (https://arxiv.org/pdf/1707.03886.pdf).
  ○ LIME (https://github.com/marcotcr/lime)
    ■ The output of LIME is a list of explanations, reflecting the contribution of each feature to the prediction of a data sample.

[Example: severe retinopathy prediction using a 48-layer deep net, https://www.kaggle.com/kmader/inceptionv3-for-retinopathy-gpu-hr]
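The per-feature contributions LIME produces can be illustrated with a much simpler occlusion-style estimate: replace one feature with a baseline value and see how the model's output moves. This is in the spirit of LIME but is not the LIME algorithm; the model, feature names and baseline are all invented for the sketch.

```python
# A simplified, occlusion-style local explanation: each feature's
# contribution is the change in model output when that feature is
# replaced by a baseline value.
def explain(model, sample, baseline):
    ref = model(sample)
    contributions = {}
    for name in sample:
        perturbed = dict(sample, **{name: baseline[name]})
        contributions[name] = ref - model(perturbed)
    return contributions

# hypothetical linear scoring model standing in for a trained classifier
def model(x):
    return 0.6 * x["velocity"] + 0.3 * x["texture"] + 0.1

sample = {"velocity": 1.0, "texture": 0.5}
baseline = {"velocity": 0.0, "texture": 0.0}
print(explain(model, sample, baseline))
# velocity contributes 0.6, texture contributes 0.15 to this prediction
```

LIME itself goes further, fitting a weighted local linear surrogate over many random perturbations, but the output has the same shape: one signed contribution per feature.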
38. Privacy Assessment Forms for GDPR and LGPD
● The international context requires dealing with multiple legal frameworks.
  ○ The Brazilian LGPD and the GDPR in our case.
● Integrated a tool for tagging and following up on sensitive fields:
  ○ To provide a list of Personally Identifiable Information (PII) and Sensitive Information.
    ■ PII: full name, ethnicity, medical record id, gender, ...
    ■ Sensitive Info: medical information, genetics, ...
  ○ Traces the use of sensitive data within a processing workflow to guide the annotation of sensitive derived information.
39. Re-identification Risk
● Anonymisation is defined by policies.
  ○ Policies define actions (Removal, Blurring, Reduction, Substitution) and the fields they apply to.
  ○ The system starts with the least restrictive policy, applies the anonymisation and computes the metric.
● Data Privacy Model:
  ○ Anonymisation Process.
  ○ K-anonymity Model Computation.
  ○ Threshold Checker.
  ○ Linkage Attack for Validation.
  ○ Increase Anonymity.
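The "K-anonymity Model Computation" and "Increase Anonymity" steps above can be sketched as follows. The record fields, quasi-identifiers and the age-banding generalisation step are illustrative, not the project's actual policy set.

```python
from collections import Counter

# A minimal sketch of the k-anonymity check: group records by their
# quasi-identifiers and report the smallest group size. A dataset is
# k-anonymous when every record shares its quasi-identifiers with at
# least k-1 others.
def k_anonymity(records, quasi_ids):
    groups = Counter(tuple(r[q] for q in quasi_ids) for r in records)
    return min(groups.values())

def generalise_age(records):
    # one "Increase Anonymity" step: reduce age to a 10-year band
    return [dict(r, age=r["age"] // 10 * 10) for r in records]

records = [
    {"age": 12, "city": "Belo Horizonte", "rhd": "normal"},
    {"age": 13, "city": "Belo Horizonte", "rhd": "borderline"},
    {"age": 14, "city": "Belo Horizonte", "rhd": "normal"},
]
quasi = ["age", "city"]
print(k_anonymity(records, quasi))                  # → 1 (each record unique)
print(k_anonymity(generalise_age(records), quasi))  # → 3 (all fall in one band)
```

The surrounding loop then compares this value against the threshold and, if it is too low, applies the next, more restrictive policy.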
40. Conclusions

Before:
• Need to manually configure the environment.
• Lack of reproducibility.
• Qualitative appraisal of the trustworthiness.
• Manual analysis of GDPR/LGPD risks.
• Need to trust the storage provider.
• Anonymisation level is qualitative.

After:
• Application templates for complex & distributed applications.
• A repeatable way to deploy the whole application.
• Quantitative measure of trustworthiness.
• Self-assessment of GDPR/LGPD.
• Trustable storage environment even on an untrusted provider.
• Quantitative anonymisation level.