
2020 11 l3 PC Slides


KNIME Server Course:

Productionizing and Collaboration


- Online -
KNIME GmbH

Agenda
§ Session 1
§ KNIME Software Overview
§ Working with KNIME Server
§ Connect to KNIME Server
§ Server-Side Workflow Execution
§ Remote Workflow Editor
§ Permissions & Versioning
§ Session 2
§ Introduction to Components
§ Component Configuration
§ Composite Views
§ WebPortal Applications
§ Session 3
§ KNIME Server REST API
§ Integrated Deployment
§ KNIME Server Administration

© 2020 KNIME AG. All rights reserved. 2


KNIME Software - Overview
KNIME Software - one Ecosystem

§ KNIME Analytics Platform: KNIME Extensions, KNIME Integrations, Community Extensions, Partner Extensions
§ KNIME Server: KNIME WebPortal, Data Science as a Service


KNIME Analytics Platform: loved by individuals

3000+ Nodes for all Steps of “End-To-End” Data Science



Creating Data Science: KNIME Analytics Platform

§ KNIME Analytics Platform: KNIME Integrations, Community Extensions, KNIME Extensions, Partner Extensions
§ Data Blending, Data Analytics, Predictive Analytics, Data Science, Machine Learning, Artificial Intelligence
§ Data Engineers, Data Scientists, App Developers, Data Analysts, ML/AI Engineers


Data Science for the Business: Creation & Production

Individual: Create

Great Model or Report


Data Science Practice: Multiple Stakeholders’ Needs
§ Data Engineers
§ Data Science “coders” (Python, etc.)
§ Data Science Specialists
§ Data Science generalists (Visual workflow)
§ Smart Business Users (more than Excel)
§ Application Users – Interaction required
§ Application Users – Made to spec
§ Report Consumers

Model / ML Operations, Operations, Consumption

Feeding production systems (Applications, Systems, Edge, etc.)

§ IT Operations: Centralized resources / strategies; Standards and preferred platforms used, infrastructure options; Exit strategies
§ IT Security: Data, applications
§ Financial / Risk Oversight: Costs allocation
§ Compliance officer: Data/model Governance, traceability, GDPR


KNIME Software - one Ecosystem

§ Create (Open Source): Gather & Wrangle, Model & Visualize
§ Productionize (Commercial): Deploy & Manage, Consume & Optimize

§ KNIME Analytics Platform: KNIME Extensions, KNIME Integrations, Community Extensions, Partner Extensions
§ KNIME Server: KNIME WebPortal, Data Science as a Service


KNIME Hub: Sharing Resources

https://hub.knime.com/



Log in to KNIME Hub and publish your Workflows

KNIME Forum
Account Credentials



Edit the Workflow

Drag & Drop



Data Science Practice: Teams !


Reuse, Share and Document
Workflows & Components build all types of applications – automated and interactive

Features:
§ Self Documenting
§ No limits: All nodes
§ DB, Spark, DL, Python, etc.
§ Task packaging
§ Mix and Match
§ Sharable / Reusable / Instantiated

Shown: Workflow, Component

https://www.knime.com/blog/knime-analytics-platform-40-components-are-for-sharing


Capture IP, Leverage Specialists, Collaborate
Sharable / Reusable / Instantiated Workflows, Components and Collaborative Development

§ Database Specialists
§ Data Engineers
§ Python Specialists
§ Data Science Specialists
§ Visualization Specialists
§ Data Governance
§ Data Science Generalists

§ As a web-based application
§ Within a workflow (manual/automated)


Recoverable, Backward Compatible, No Migration

Workflows, Components and Collaboration Features:


§ Instantiated / Updated
§ Secure (create, use, etc.)
§ Versioned and snapshots
§ Version comparison
§ Locked / Encrypted
§ Reproducible
§ Guaranteed backward compatible

https://www.knime.com/blog/knime-meets-knime-will-they-blend


Data Science for the Business: Creation & Production



Empower Business Users Appropriately

Delivering reports and output to business users appropriately

Features:
§ Visualizations
  § Plotly, JavaScript, etc.
§ Reports Creation (BIRT)
§ Integration with
  § Excel
    § Functionality exploitation, not just CSVs
  § PowerBI
  § Tableau
  § Qlik
  § Spotfire
  § …

https://www.knime.com/community/continental-nodes-for-knime-xls-formatter
Empower Business Users Appropriately

Guided Analytics for Building Applications


Appropriate levels of Automation & Human Interaction at any stage of the Data Science Life Cycle depending on task and audience

Features:
§ Workflows and WebPortal nodes build interactive applications & dashboards
§ KNIME WebPortal manages access

https://www.knime.com/blog/principles-of-guided-analytics




Flexible Delivery Options: Automate

Automated workflow execution

Features:
§ Scheduled
§ Triggered
§ Called (REST / SaaS)
§ Call Actions based on status
§ Scale and Pin Execution
§ View, edit, execute workflows remotely

https://docs.knime.com


Continuous Integration / Continuous Deployment
Integrated Deployment
Features:
§ Eliminates the gap between Creation and Production
  § Mark what’s necessary for production
  § Not just the model
  § No manual intervention
§ No rewrite, copy paste
§ Capture all nodes & settings
§ Automatically created production workflow
§ Always in sync

Shown: Creation Workflow, Production Workflow

https://www.knime.com/integrated-deployment


Governance & Compliance

Explainability / Interpretability of models

Features:
§ Many Techniques available
§ LIME
§ SHAP
§ Shapley
§ Partial Dependence / ICE
§ Binary Classification Inspector

https://hub.knime.com/knime/extensions/org.knime.features.mli/latest



Governance & Compliance

Data / Model Lineage

Archive

Document

Explore & Analyze




Manage Infrastructure & Users

Central Management and Monitoring capabilities

Features:
§ Client Customizations
§ Custom update sites
§ Manage preferences via profiles
§ Node repository & libraries
§ Monitor server activity
§ Running and scheduled jobs
§ Adjust permissions
§ Manage ongoing services



Single Sign-on, Integrate with multiple Security protocols

OAuth, LDAP, AD Integration

Features:
§ Single sign-on (SSO) to KNIME Server
§ Integrate with multiple identity providers
§ Flexible configuration capabilities to map users and groups
§ Manage all aspects of KNIME usage

Shown: KNIME Server, Client, Identity Provider

https://docs.knime.com


Working with KNIME Server
Connecting to KNIME Server

Set Up a New Mount Point

Server connections are shown as “mount points” in the KNIME Explorer. To add a new mount point:
1. Click the Configure button in the KNIME Explorer.
2. Click New…
3. Configure a mount point with your details.


Server Mount Point as a Shared Repository (1/2)

The Server provides an area in the Explorer for sharing work with your colleagues. Use workflow groups to organize your workflows, components, and data files.

To move resources, simply drag and drop or copy and paste.

Shown in the Explorer: Click to add new mountpoint, Workflow Groups, Workflows, Jobs, Components, Data


Server Mount Point as a Shared Repository (2/2)

Another way to deploy resources on KNIME Server…


Inspecting a Workflow from KNIME Server

§ By double-clicking a workflow on KNIME Server, the client downloads it (to a temporary location) and subsequently opens it automatically
§ The yellow bar at the top of the editor indicates that this is a temporarily downloaded server workflow


Automation - Remote Execution

Executing a Workflow on the Server – Remote Execution

Check to reset the workflow before execution. All nodes are reset (including File and Database Reader nodes, etc.).

If selected, the executed job is deleted immediately after execution and is not saved.

Enter one or multiple email addresses (separated by commas) to which a notification is sent after the workflow execution has finished.

The name of the workflow job as it is displayed in the server view. By default this is the name of the workflow. The execution date is always appended to the name.


Executing a Workflow on the Server – Remote Execution
If the workflow contains a report, you can choose to save it on KNIME Server.

By default the name of the report is the name of the workflow. You can define a custom report file name if you wish.

You can overwrite the report with every execution or append a timestamp.

Here you can define the location where you want to store the report on KNIME Server.

You can select the formats in which the report should be saved.


Executing a Workflow on the Server – Remote Execution

Depending on whether your job executed successfully or not, you can configure subsequent workflows to run.


Executing a Workflow on the Server – Remote Execution
Here you can specify the first and optionally the last execution date of the scheduled job (in case it’s a repeating job).

Repeating jobs can be repeated after a certain number of minutes, hours, or days. The latter takes into account daylight saving, i.e. the start hour will be the same in winter and summer (e.g. 12:00).

By default, repeating jobs are executed every day. Here you can filter whether they should run only on certain days of the week, days of the month, or only in certain months. “Last” means the last day of the month.

If the previous job is still running when the next execution is supposed to start, you can opt to skip this execution.

Scheduled jobs can be disabled temporarily.
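
The difference between repeating every 24 hours and repeating every day can be illustrated with a small Python sketch. It is not KNIME Server's scheduler, only a model of the behaviour described above; the function names and the Europe/Berlin time zone are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # Python 3.9+, needs time zone data installed


def next_run_fixed_hours(start: datetime, hours: int) -> datetime:
    """Repeat after an exact number of elapsed hours (computed in UTC)."""
    elapsed = start.astimezone(timezone.utc) + timedelta(hours=hours)
    return elapsed.astimezone(start.tzinfo)


def next_run_days(start: datetime, days: int) -> datetime:
    """Repeat after calendar days, keeping the local start hour."""
    # Adding a timedelta to an aware datetime keeps the wall-clock time;
    # the UTC offset is re-evaluated for the new date.
    return start + timedelta(days=days)


# Hypothetical schedule: noon on the day before daylight saving time ends.
berlin = ZoneInfo("Europe/Berlin")
start = datetime(2020, 10, 24, 12, 0, tzinfo=berlin)

by_hours = next_run_fixed_hours(start, 24)
by_days = next_run_days(start, 1)
print("every 24 hours:", by_hours.isoformat())
print("every 1 day:   ", by_days.isoformat())
```

With the day-based repeat the job still starts at 12:00 local time after the clock change, while the fixed 24-hour repeat drifts to 11:00.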



Executing a Workflow on the Server – Remote Execution

Configuration nodes in the workflow can be set via the execution dialog to parametrize workflow execution.


Workflow Jobs

Remotely executed workflows are run as Jobs:
§ A workflow job is a copy of the workflow with specific settings and data
§ Jobs are tied to the version from when the job was created
§ Orphaned jobs are colored red
§ Jobs have messages (e.g. success or failure)
§ Jobs can be saved as a workflow for data provenance and debugging (right-click → Save as)


Remote Workflow Editor

Remote Workflow Editor – 1/3

§ Remote control of a job running on KNIME Server
§ Capabilities:
  § Live update of workflow job execution (executing node and progress)
  § Execute and cancel execution supported
  § Add/delete nodes
  § Change node settings
  § Inspect data tables / flow variables
  § JavaScript nodes can show data/views


Remote Workflow Editor – 2/3

§ What’s my workflow doing now?
§ Change node configurations in the Job
§ JavaScript View support


Remote Workflow Editor– 3/3

§ To use the Remote Workflow Editor, please install the KNIME extension Remote Workflow Editor


Permissions
Permissions

§ Permissions can be set for all types of items: workflows, workflow groups, components, and data files
§ Permissions are assigned to either individual people or user groups
§ The user who uploads an item automatically becomes its owner
§ Users with admin rights have no restrictions on permissions
§ The owner, plus everyone with admin rights, can assign and change permissions for an item
§ It is also possible to set permissions on schedules, so that a schedule can be maintained or changed by a team member while the owner is, for example, on vacation


Permission Types

Read
§ Workflow: Download a workflow job, including data
§ Workflow Groups: See the content of a workflow group
§ File: Download data and execute workflows that use the data
§ Component: Use and download

Write
§ Workflow: Overwrite, create snapshots, and delete workflows
§ Workflow Groups: Create and upload new items in a workflow group
§ Files/Components: Overwrite a file or component

Execute
§ Workflow: Execute a workflow by creating a workflow job
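
The permission rules can be pictured with a small Python model. This is a sketch for illustration only, not KNIME Server's implementation: the class names, the item path, and the choice to combine user, group, and "everybody else" rights by union are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Flag


class Permission(Flag):
    NONE = 0
    READ = 1
    WRITE = 2
    EXECUTE = 4


ALL = Permission.READ | Permission.WRITE | Permission.EXECUTE


@dataclass
class User:
    name: str
    groups: frozenset = frozenset()
    is_admin: bool = False


@dataclass
class Item:
    path: str
    owner: str
    user_permissions: dict = field(default_factory=dict)
    group_permissions: dict = field(default_factory=dict)
    others: Permission = Permission.NONE  # "Everybody Else"


def effective_permissions(user: User, item: Item) -> Permission:
    """Admins are unrestricted; otherwise combine user, group and other rights."""
    if user.is_admin:
        return ALL
    granted = item.user_permissions.get(user.name, Permission.NONE)
    for group in user.groups:
        granted |= item.group_permissions.get(group, Permission.NONE)
    return granted | item.others


def can_change_permissions(user: User, item: Item) -> bool:
    """Only the owner and users with admin rights may change permissions."""
    return user.is_admin or user.name == item.owner


# Hypothetical users and item.
alice = User("alice")
bob = User("bob", frozenset({"analysts"}))
root = User("root", is_admin=True)
workflow = Item(
    path="/Team/Customer Segmentation",
    owner="alice",
    user_permissions={"alice": ALL},
    group_permissions={"analysts": Permission.READ | Permission.EXECUTE},
)

print(effective_permissions(bob, workflow))
print(can_change_permissions(bob, workflow))
```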



Setting Permissions

Everybody Else



Versioning
Versioning

§ Possibility of creating a history of items on Server
§ Create snapshots of workflows, data files, and components. These are stored with a timestamp and a comment


KNIME Workflow Difference (1/2)

§ Automates identification and comparison of nodes in a workflow, metanodes, and two different workflows
§ Identifies insertions, deletions, substitutions, and parameter changes
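
To make the four kinds of differences concrete, here is a minimal Python sketch. It assumes a simplified representation in which a workflow is a dictionary from node ID to a (node type, settings) pair and nodes are matched by ID; the actual Workflow Difference feature matches nodes in its own way, and the node names and settings below are hypothetical.

```python
def diff_workflows(old: dict, new: dict) -> list:
    """Compare two workflows given as {node_id: (node_type, settings)}."""
    changes = []
    for node_id in sorted(old.keys() | new.keys()):
        if node_id not in old:
            changes.append((node_id, "inserted"))
        elif node_id not in new:
            changes.append((node_id, "deleted"))
        else:
            old_type, old_settings = old[node_id]
            new_type, new_settings = new[node_id]
            if old_type != new_type:
                changes.append((node_id, "substituted"))
            elif old_settings != new_settings:
                changed = sorted(
                    key
                    for key in old_settings.keys() | new_settings.keys()
                    if old_settings.get(key) != new_settings.get(key)
                )
                changes.append((node_id, "parameters changed: " + ", ".join(changed)))
    return changes


# Hypothetical snapshot and current version of a segmentation workflow.
snapshot = {
    1: ("CSV Reader", {"path": "customers.csv"}),
    2: ("k-Means", {"clusters": 4}),
    3: ("Scatter Plot", {}),
}
latest = {
    1: ("CSV Reader", {"path": "customers.csv"}),
    2: ("k-Means", {"clusters": 5}),
    3: ("Bar Chart", {}),
    4: ("Color Manager", {}),
}

differences = diff_workflows(snapshot, latest)
for node_id, change in differences:
    print(node_id, change)
```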



KNIME Workflow Difference (2/2)

Highlight differences:
§ Nodes included/excluded
§ Node configurations



Review and Exercises

Config Details – Access KNIME Server

§ KNIME Server Address: https://3.249.93.238/
§ KNIME WebPortal Address: https://3.249.93.238/knime/webportal

Login credential: firstname.lastname (*)
Password: knime

* For double names and double surnames the whitespace has been removed


Working with KNIME Server, Activity 1

§ Configure a mount point for KNIME Server with the details provided in the Config Details – Access KNIME Server slide at the end of the slide deck
§ Download Server Training Material to your LOCAL workspace (hint: drag and drop or copy and paste the entire folder)


Working with KNIME Server, Activity 2

Path to the workflow: Exercises → 1. Remote Workflow Execution → 01. Data Mining

§ Create a local copy of the workflow and rename it “01_Basic_Customer_Segmentation_Use_Case_YourInitials”
§ Deploy the workflow on KNIME Server under the directory Solutions → Uploaded Workflows
§ Schedule a workflow execution (5 mins later). Check “Notify upon completion”. Add your email address to get a notification email on success and failure.


Working with KNIME Server, Activity 3 (1/3)

Path to the workflow: Exercises → 1. Remote Workflow Execution → 01. Data Mining, Activity III

§ Make some changes to the workflow “01_Basic_Customer_Segmentation_Use_Case_YourInitials” (e.g. generate 5 clusters instead of 4) in your Local Workspace and deploy it under the directory Server_Course_Solutions/Server Course_My wf


Working with KNIME Server, Activity 3 (2/3)

§ Create a snapshot of the workflow and check the created snapshots with the Server History view
  § Overwrite the existing item
  § Check the option Create Snapshot before Overwriting and provide a name for the snapshot
  § The snapshot is listed in the Server History view (View → Other → Server History)
  § Download the snapshot to LOCAL Workspace


Working with KNIME Server, Activity 3 (3/3)

§ Use the WorkDiff feature to compare the downloaded workflow with the latest workflow deployed on KNIME Server:
  § Hold Ctrl (or Cmd) to select both workflows in KNIME Explorer
  § Right-click one of the two workflows
  § Click Compare
  § Select the two nodes that you wish to compare
  § Click Show configuration differences of highlighted nodes


Components
What are they good for?

§ Components encapsulate functionalities that can be reused as your personal customized KNIME nodes, to perform tasks that you often repeat.
§ They can also be shared with others via KNIME Hub and KNIME Server.


KNIME Verified Components

§ Verified Components reuse bundled functionalities, verified by KNIME experts
§ Released and updated on the KNIME Hub


Metanodes vs. Components

Configuration
§ Metanodes: Not configurable
§ Components: Via Configuration nodes (local workflow) and Widget nodes (KNIME WebPortal)

Variable scope
§ Metanodes: Global
§ Components: Configurable (local or global)

WebPortal usage
§ Metanodes: Executed in the background
§ Components: JavaScript Views and Widgets inside the component are shown on a WebPortal page

Execution mode
§ Metanodes: Normal execution
§ Components: Allows Simple Streaming execution

Recommended uses
§ Metanodes: Workflow cleaning
§ Components: Enabling custom interactions, producing interactive views, sharing functionalities


Component Setup

§ Add input and output ports to components
§ Remove ports to adapt to changes after creation of the component


Flow Variable Scope of Components

§ Flow Variables are, by default, only available locally inside the component
§ Configure the component input/output to pass Flow Variables from/to outside the component


Component Description

§ Double-click a component to configure it
§ Providing a meaningful description is best practice


Components: Creating Custom Nodes from KNIME Nodes
Creating a Configuration Dialog

§ Configuration nodes allow creation of a Component’s configuration dialog
§ Configuration nodes enable different types of user inputs such as string input, integer input, selecting one value from a list, and many more


Creating a Configuration Dialog

§ Use Configuration nodes to create Flow Variables
§ Every Configuration node has a label, description and parameter/variable name
§ Depending on the node, additional options and visual properties can be set
§ Flow variables created in the Configuration nodes can then be used to overwrite the settings of subsequent nodes
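
The idea of a flow variable overwriting a node setting can be sketched in a few lines of Python. This only models the concept; the function, the variable name `number-of-clusters`, and the k-Means settings are hypothetical and do not correspond to a KNIME API.

```python
def apply_flow_variables(node_settings: dict, flow_variables: dict, bindings: dict) -> dict:
    """Return a copy of node_settings in which bound settings are overwritten.

    bindings maps a setting name to the flow variable that controls it.
    """
    result = dict(node_settings)
    for setting_name, variable_name in bindings.items():
        if variable_name in flow_variables:
            result[setting_name] = flow_variables[variable_name]
    return result


# A Configuration node exposes a variable; the user entered 5 in the dialog.
flow_variables = {"number-of-clusters": 5}

# Settings stored in a downstream node, and the binding to the variable.
kmeans_settings = {"clusters": 4, "max_iterations": 99}
bindings = {"clusters": "number-of-clusters"}

effective = apply_flow_variables(kmeans_settings, flow_variables, bindings)
print(effective)
```

The stored settings stay untouched, so removing the binding restores the node's own value.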



Configuration of a Component

§ Double click a component to configure it



Shared Components
What is a shared component?

§ Components can be saved in your KNIME workspace, on KNIME Server or KNIME Hub for later reuse
§ To do this, simply right-click any component and select “Share…”
§ Shared components are read-only instances of a component
§ Public Shared Components are available on the EXAMPLES Server and on the KNIME Hub


How can you edit a shared component?

§ Components can be edited using the Component Editor, similar to workflows
§ To edit a component using the Component Editor, double-click the component in its location in the KNIME Explorer
§ To ensure components are executable when opened in the Component Editor, choose the option to “Include input data with component” when sharing it


How can you use a shared component?

§ To use a Shared Component, drag and drop it to your workflow editor
§ Instances of Shared Components can be updated either manually or when the workflow is opened
§ A Shared Component can also be unlinked from its original location, which makes it editable in the workflow directly
§ Update Shared Components by overwriting them


Component Composite View
Creating a Composite View

§ JavaScript visualization and Widget nodes allow creation of a composite view
§ Widget nodes allow user interaction in a composite view, e.g.
  § Adjustment of parameters via various input nodes
  § Selection of predefined values
  § Consumption of outputs


Component Composite View

§ Multiple JavaScript View nodes can be combined in Components
§ Selections are transmitted to all other views
§ User can give input via Widget nodes


Component Composite View

§ You can open a composite view directly from the node’s context menu.
§ It will open a browser window that displays all contained JavaScript and Widget nodes.
§ The same view will also be shown via KNIME WebPortal


Configuration of Widget Nodes – Input Nodes

§ Widget input nodes have very similar configuration dialogs



Layouting

§ A layout can be defined for any Component that contains at least one widget
node or JavaScript-enabled view

§ The layout editor can be accessed from the top toolbar, when inside the
component



Layouting – Nested Components

§ It is possible to include components inside other components
§ This allows you to build up a library of components that contain useful linked views and then easily assemble them to create complex views for visualizing and interacting with datasets


Defining the Layout

§ Four tabs available:
  § Node Usage
  § Visual Layout
  § Basic Layout
  § Advanced Layout


Layouting – Order Items to Show on the WebPortal

§ The “Append the IDs to node names” button on the top bar shows the ID of each
node

§ This is useful to reorder the items in the layout structure for the WebPortal



Layouting I – Node Usage

§ Enabling the view of Widget nodes on the WebPortal or on the Component view
§ Enabling the view of the input given by the Configuration node on the Component Dialog


Layouting II – Visual Layout

§ Makes it easier to specify how individual JavaScript views are laid out in a WebPortal page or Component composite view


Layouting III – Order Items to Show on the WebPortal



Layouting IV – Order Items to Show on the WebPortal



KNIME WebPortal



Close the Gap: Guided Analytics and Automation

§ Extending data science to the Business Analysts
§ Incorporate domain experts’ knowledge
§ Amplifies the best data science

KNIME Guided Analytics & Automation: R&D, MFG, IT, Marketing, Sales, more …


KNIME WebPortal

§ All workflows on KNIME Server available as web apps
§ Step-by-step execution of workflows from any browser
§ Simple, clean interface for end users: Guided Analytics
§ Customize layout to match corporate design


Guided Analytics: Interactive Data Science

Interaction Points



Inside an Interaction Point



Access to KNIME WebPortal from the Client

§ You can start the execution of a workflow on KNIME WebPortal directly from KNIME Analytics Platform: right-click the workflow in the KNIME Server mount point and click Open in WebPortal


Detail Pane

§ If a workflow is selected in the left section, its details page is shown in the section on the right
§ Server sends a notification email to the address provided as soon as the workflow execution finishes


WebPortal Applications:
Guided Analytics for ML Automation
Guided Analytics for ML/AI Automation

§ Interaction & Automation
§ Data Scientist’s Choice: right mix for target audience
WebPortal Applications:
Customer Segmentation

Classic CRM Analytics

§ CRM System
  § Data about your customer
  § Demographics
  § Behavior
  § Revenues
  § …

Model

§ Churn Prediction
§ Upselling Likelihood
§ Product Propensity / NBO
§ Campaign Management
§ Customer Segmentation


Customer Segmentation

§ Customer Segmentation is a standard technique that consists of dividing a customer base into groups of individuals that are similar in specific ways relevant to marketing, such as age, gender, interests, and spending habits
§ Companies need to treat their customers differently depending on the segment to which they belong
§ By providing personalized offers and communication to each segment, companies can:
  § Increase customer lifetime value
  § Reduce irrelevant customer interactions
  § Differentiate their brand relative to their competitors
  § Generate higher profits
§ We will build a workflow that implements a customer segmentation technique and then construct a web user interface to inject business experts’ knowledge into the final results


Basic Workflow for Customer Segmentation

§ Predefined data sources
§ Static configuration


WebPortal Workflow for Customer Segmentation

§ Predefined interaction points
§ Allows expert to give inputs and consume outputs via a browser


Step by Step Walkthrough



Define Cluster Parameters

KNIME AP KNIME WebPortal



Display Cluster Result

KNIME AP KNIME WebPortal



Take Notes

KNIME AP KNIME WebPortal



Display Labeled Clusters

KNIME AP KNIME WebPortal



Stepping through Pages – Wizard Execution

§ Each Component node represents one WebPortal page
§ To preserve the order of the web pages, the Components need to be connected to each other
§ It is also possible to use Component nodes inside loops to iterate over an item set or recursively refine a model
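
Why the connections matter can be shown with a short Python sketch: if each page only records which page it is connected to, the page sequence follows from a topological sort. The dictionary below is a hypothetical model of the customer segmentation walkthrough, not something read from KNIME.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Each WebPortal page (one Component) maps to the pages it is connected to
# upstream. The page names follow the walkthrough slides.
connections = {
    "Define Cluster Parameters": set(),
    "Display Cluster Result": {"Define Cluster Parameters"},
    "Take Notes": {"Display Cluster Result"},
    "Display Labeled Clusters": {"Take Notes"},
}


def page_order(connections: dict) -> list:
    """Return the pages so that every page comes after its predecessors."""
    return list(TopologicalSorter(connections).static_order())


pages = page_order(connections)
for step, page in enumerate(pages, start=1):
    print(f"Page {step}: {page}")
```

Without connections between two Components, nothing constrains their relative order, which is why the pages need to be connected.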



KNIME WebPortal: URL Parameter (1/2)

§ It is possible to link directly to specific workflows in the WebPortal. URLs are generally set up like this:
§ http://<server-address>/knime/webportal/<ItemPath>?exec&<WorkflowParameters>
§ <ItemPath> = The path to a workflow, workflow group, or workflow job (a workflow job is referenced with its ID like: WorkflowGroup/Workflow?exec=job_id)
§ <WorkflowParameters> can appear in any order, but have to come after <ItemPath>?exec. Parameters are always appended with a leading ‘&’

§ List of available workflow parameters:
§ &pm:<name>=<value> - Set widget parameters: sets the named widget parameter to the specified value.


KNIME WebPortal: URL Parameter (2/2)
§ &emails=sample@mail.com – Enable email notification: enables email notification and sets the
specified comma-separated list of email addresses
§ &formats=<formats> – Set report formats: sets the report formats included as attachments in the
notification email specified by a comma-separated list. Available formats are: pdf (enabled by
default), html, doc, docx xls, ppt, pptx, ps, odt, ods, and odp.

A complete URL might look like:

http://localhost:8080/knime/webportal/demo/file%20to%20csv?exec&pm:title=foo&emails=sample@mail.com
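
The URL pattern described on these two slides can also be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper name `webportal_url` and its argument names are illustrative assumptions and not part of KNIME.

```python
from urllib.parse import quote


def webportal_url(base, item_path, widget_params=None, emails=None, formats=None):
    """Build a WebPortal link of the form <ItemPath>?exec&<WorkflowParameters>."""
    # Percent-encode the item path but keep '/' as the path separator.
    url = f"{base.rstrip('/')}/knime/webportal/{quote(item_path)}?exec"
    # Widget parameters use the pm:<name>=<value> form.
    for name, value in (widget_params or {}).items():
        url += f"&pm:{quote(name, safe='')}={quote(str(value), safe='')}"
    # Email addresses and report formats are comma-separated lists.
    if emails:
        url += "&emails=" + quote(",".join(emails), safe="@,")
    if formats:
        url += "&formats=" + quote(",".join(formats), safe=",")
    return url


print(webportal_url(
    "http://localhost:8080",
    "demo/file to csv",
    widget_params={"title": "foo"},
    emails=["sample@mail.com"],
))
```

Called with these arguments, the helper reproduces the example URL shown on the slide. Because every parameter is appended with its own leading `&`, the order of the optional parts does not matter.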



Review and Exercises
Config Details – Access KNIME Server

§ KNIME Server Address: https://3.249.93.238


§ KNIME WebPortal Address: https://3.249.93.238/knime/

Login credential: firstname.lastname (*)


Password: knime

* For double first names and double surnames, the whitespace has been removed



Component Exercise, Activity

Path to the workflow: Exercises → 2. Working with Components → 00. Import
Components

§ Drag and drop the Reading and Pre-Processing Data_v1 Component and the
Customer Segmentation_v1 Component, both available on KNIME Server, into the
workflow editor

§ Connect the two components accordingly and execute the workflow



Define Cluster Parameters Exercise, Activity I (1/2)

Path to the workflow: Exercises → 2. Working with Components → 01. Define
Cluster Parameters
§ Filter only numeric columns (hint: use the Column Filter node)
§ Use the Integer Widget node to define the number of clusters. Configuration:
§ min number of clusters: 2
§ max number of clusters: 10
§ default value: 4
§ Use the Column Filter Widget node to define the columns to be included for
clustering



Define Cluster Parameters Exercise, Activity I (2/2)

§ Use the Text Output Widget node to write the webpage description
Text for the WebPage (hint: use html as text format):
<h2>Define Cluster Parameters</h2>
<p>Set parameters to be taken into account in the following clustering.</p>
<p>Click 'Next' to start the clustering process.</p>
<p>If you do not know what a clustering process is, check <a
href="https://en.wikipedia.org/wiki/Cluster_analysis">Cluster Analysis</a> and specifically the <a
href="https://en.wikipedia.org/wiki/K-means_clustering">k-Means algorithm</a>.</p>
§ Encapsulate the 4 created nodes in a component and configure 2 outports: one
for the Integer Widget node and one for the Column Filter Widget node



Define Cluster Parameters Exercise, Activity II

Path to the workflow: Exercises → 2. Working with Components → 01. Define
Cluster Parameters

§ Define the layout of the component so that the items are ordered as shown
in the figure

Text Output
Widget

Column
Filter Widget



WebPortal Exercise, Activity

§ Access the WebPortal (details on the “Config Details – Access KNIME Server” slide)

§ Execute the workflow located at
Use Cases → Customer_Segmentation_Use_Case
§ Discard the job of the workflow once it has been executed (from KNIME
Analytics Platform!)



KNIME Server REST API
KNIME Server REST API

§ Integrate KNIME Server functionality with IT infrastructure


§ REST is an architectural style used for building networked applications
§ REST = Representational State Transfer
§ Communication based on HTTP
§ Usually clear text
§ Execute workflows, check server status, set permissions, and more
§ Entry point for the REST Interface is https://server-address/knime/rest/

See Blog Posts for detailed tutorials:


https://www.knime.org/blog/giving-the-knime-server-a-rest
https://www.knime.org/blog/the-knime-server-rest-api
https://tech.knime.org/wiki/using-knime-server-rest-api-for-file-uploads-and-downloads
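
As a rough illustration of "communication based on HTTP", the sketch below prepares an authenticated GET request for the REST entry point named on the slide, without sending it. It assumes the server accepts HTTP Basic authentication; the helper name `rest_request` and the credentials are placeholders, not part of KNIME.

```python
import base64
import urllib.request


def rest_request(server_address, user, password, path=""):
    """Prepare (but do not send) an authenticated GET request for the REST interface."""
    url = f"https://{server_address}/knime/rest/{path.lstrip('/')}"
    # Assumption: the server accepts HTTP Basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


if __name__ == "__main__":
    req = rest_request("server-address", "firstname.lastname", "secret")
    print(req.get_method(), req.full_url)
    # To actually send it against a reachable server:
    # urllib.request.urlopen(req).read()
```

Keeping request construction separate from sending makes it easy to inspect the URL and headers before anything reaches the server.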



KNIME Server REST API

§ All server functionality available via REST API


§ Programmatically control KNIME Server
§ Upload/download/delete resources
§ Staging from Dev to Prod
§ Upload licenses
§ Empty trash/restore items
§ Execute workflows
§ Schedule jobs
§ Set permissions
§ Create users & groups, etc.
§ Documentation available at:
/knime/rest/doc/index.html



KNIME Server REST API

§ Enables external integration


§ Build applications around KNIME Server
§ Deploy KNIME workflows as web services
§ e.g. for Microservices and real time scoring

[Figure: Input data → Output in server response]



SwaggerUI definitions of individual workflows (KNIME Server Medium and Large)

§ One of the key functionalities of KNIME Server is to allow individual
workflows to be exposed as REST endpoints
§ The SwaggerUI interface lets you document and test your web services
§ “Show API definition” is available in the context menu of the workflow
in KNIME Explorer. Selecting this option opens the SwaggerUI page for
that service in your web browser.



SwaggerUI definitions of individual workflows



REST API – Use Case: Workflows Calling Workflows

§ Execute the workflow via the Call Remote Workflow node
§ Analyzes input parameters
§ Prepares input data accordingly
§ Executes the job and gets back results



New Node: Call Workflow (Table Based)

§ Makes it easier to call other workflows using an entire KNIME table
§ A caller workflow can send a table and flow variables to a callee
workflow and receive a table back. The callee reads the incoming table
via the Container Input (Table) node and returns its result via the
Container Output (Table) node



New Node: KNIME Server Connection

§ Connects a workflow to KNIME Server
§ After a connection has been established, any of the remote file
handling nodes can be used with the connected server
§ The server connection can also be used together with the Call Workflow
(Table Based) node in order to run workflows that are shared via a
KNIME Server



REST API – Use Case: Integration with External Applications

§ The workflow can also be executed by external tools such as Postman or curl
for debugging purposes
§ KNIME Server can act as a backend for third-party analytical applications
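
An external application would typically send input data as JSON and read the result from the server response. The sketch below only prepares such a request. The endpoint layout (`v4/repository/<workflow path>:execution`), the workflow path, and the input name `input-data` are assumptions for illustration; the actual URL and input names should be taken from the workflow's SwaggerUI page. Authentication is omitted for brevity.

```python
import json
import urllib.request
from urllib.parse import quote


def execution_request(rest_entry_point, workflow_path, inputs):
    """Prepare (but do not send) a POST request that passes JSON inputs to a workflow."""
    # ASSUMPTION: execution endpoint layout; verify it on the SwaggerUI page.
    encoded_path = quote(workflow_path.strip("/"))
    url = f"{rest_entry_point.rstrip('/')}/v4/repository/{encoded_path}:execution"
    body = json.dumps(inputs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = execution_request(
        "https://server-address/knime/rest/",
        "/Examples/REST/Predict Results Using REST API",
        {"input-data": {"age": 42}},  # hypothetical input parameter
    )
    print(req.get_method(), req.full_url)
```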



Integrated Deployment
Data Science: Development != Deployment



Moving Data Science into Production

Issues:
• Development != Deployment
• Needs Copy/Paste, Rewrite
• Transport of models is non-trivial

⇒ Inefficient & error-prone



Side Note: It’s not just a Visual Workflow Issue

Creating Data Science

# read data
raw_target_data = read_xls_data()
# remove duplicates, handle missing values:
target_data = basic_data_cleanup_with_pandas(raw_target_data)
raw_feature_data = fetch_db_data_using_psycopg2()
# remove duplicates, handle missing values:
feature_data = basic_data_cleanup_with_pandas(raw_feature_data)

# basic feature engineering with sklearn
feature_scaler = sklearn.preprocessing.StandardScaler().fit(feature_data)
standardized_features = feature_scaler.transform(feature_data)
filtered_feature_data = variance_feature_filter_with_sklearn(standardized_features,target_data)

# build model with sklearn
training_feature_data,testing_feature_data,training_target_data,testing_target_data = \
    split_data_with_sklearn(filtered_feature_data,target_data)
RF_params = RF_hyperparameter_search_with_sklearn(training_feature_data,training_target_data)
trained_RF = build_RF_using_params(training_feature_data,training_target_data,RF_params)

# validate model
generate_validation_report(trained_RF,testing_feature_data,testing_target_data)

#--------------------
# save models
save_models_with_joblib((feature_scaler,trained_RF))

productionize

Data Science in Production

# Predictions, running as a flask service

# load saved components
feature_scaler,trained_RF = load_models_with_joblib()

# read and prepare data
raw_prediction_data = get_dataframe_from_request()
prediction_data = basic_data_cleanup_with_pandas(raw_prediction_data)
scaled_data = feature_scaler.transform(prediction_data)

# generate predictions
predictions = trained_RF.predict(scaled_data)
prediction_probs = trained_RF.predict_proba(scaled_data)
predictions = join_tables_with_pandas(predictions,prediction_probs)

# return results from the service:
return_dataframe_to_service(predictions)



Integrated Deployment



Capture and Combine Parts of a Workflow



Write Production Workflow from Captured Parts



KNIME Server Administration
Architecture
Central Components of KNIME Server

[Architecture diagram: Client, Tomcat (web container), Workflow Repository, Message Queue (KNIME Server Large), and Executor(s), connected by requests]



KNIME Server – Scaling Options

§ KNIME Server supports ‘Scale Up’

§ KNIME Executors allow ‘Scale Out’



Scaling Out - KNIME Executors - KNIME Server Large

§ Run multiple executors on independent hardware


§ The distributed Executors rely on the KNIME REST interface and on a message
queueing system called RabbitMQ



KNIME Cloud Offerings – AWS and Azure

Features:
§ KNIME Analytics Platform
§ KNIME Server Small & Medium
§ KNIME Server Large BYOL
  § Supports Server Large with multiple Executors
  § Has an embedded Executor so can be stand-alone
§ KNIME Executors
  § Multiple Executors that can be used by KNIME Server Large
  § Pay as you go (PAYG) offering supports elastic scaling
  § Bring your own license (BYOL) offering uses cores from your Server license

https://www.knime.com/knime-software-on-amazon-web-services
https://www.knime.com/knime-software-on-microsoft-azure



Flexible Cloud deployments to meet computing needs

Mixed Cloud Usage

[Diagram: Virtual Private Cloud containing KNIME Server, Executors BYOL, and Executors PAYG]

Features:
§ Supplement traditionally licensed Executors with Pay-as-you-Go (PAYG) model
§ Meet periodic demand peaks
§ Fulfill need for speciality hardware (e.g. GPUs)
§ Meet budgeting needs



Flexible Hybrid deployments to meet computing needs

KNIME Hybrid Usage Model

[Diagram: KNIME Server with licensed Executors, plus PAYG Executors in an AWS Virtual Private Cloud, connected via VPN]

Features:
§ Mix of Enterprise data center and Cloud deployments
§ Meet periodic demand peaks
§ Fulfill need for speciality hardware (e.g. GPUs)
§ Meet budgeting needs



Properties

Set Properties



Job Pools

§ Allows frequently executed workflows to stay in memory


§ Advantage: Eliminates the overhead of loading the workflow in an Executor for
each execution
§ Useful when job loading time is large compared to execution time
§ A job pool can be enabled by setting a property on the workflow that should be
pooled



Workflow Pinning

Features:
§ Match workflow needs to Executor capabilities
§ Partition compute resources by capability, department, usage, …
§ Workflow needs determined by workflow publisher

[Diagram: KNIME Server with several Executors, each labeled with its capabilities (RAM, CPU, GPU)]



Workflow Pinning

§ Workflow pinning is useful when using multiple KNIME Executors

§ For a workflow with specific system requirements, e.g. specific hardware
(like GPU), extensions, or system environments (like Linux), it is possible to
define Executor requirements
§ Only Executors that fulfill the requirements will accept and execute the workflow
job
§ The system admin of the Executors must specify a property for each Executor
separately
§ The requirements are defined by setting a property on the workflow, which is a
simple comma-separated list of user-defined values.
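
The matching rule can be pictured as a subset check: an Executor qualifies when the values it declares cover every value the workflow requires. The sketch below is a conceptual illustration only, not KNIME Server's implementation; the function names, Executor names, and property values are all hypothetical.

```python
def parse_values(text):
    """Split a comma-separated property value into a set of trimmed entries."""
    return {item.strip() for item in text.split(",") if item.strip()}


def eligible_executors(workflow_requirements, executor_properties):
    """Return the names of Executors whose declared values cover all requirements."""
    required = parse_values(workflow_requirements)
    return [
        name
        for name, declared in executor_properties.items()
        if required <= parse_values(declared)  # subset test
    ]


# Hypothetical Executors and the values their admin declared for each.
executors = {
    "executor-a": "linux, gpu",
    "executor-b": "linux",
    "executor-c": "windows, gpu",
}

print(eligible_executors("gpu, linux", executors))  # ['executor-a']
```

A workflow without requirements yields an empty required set, so every Executor in this sketch would be eligible for it.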



Executor Groups

Supporting Executors in the Enterprise

Features:
§ Logical groupings of Executors
§ Match users/groups to Executor Groups
§ Partition compute resources by groups, department, …
§ Partitioning managed by Server administrators

[Diagram: KNIME Server with three Executor Groups: Group 1 (CPU, RAM) for Marketing, Group 2 (CPU, Database) for Finance, Group 3 (CPU, GPU) for Engineering]



KNIME Server Configuration
KNIME Server Configuration Files

§ knime-server.config
§ Located in \workflow_repository\config
§ Central configuration file
§ knime.ini
§ Located in Executor installation folder
§ Provides runtime parameters and JVM settings
§ executor.epf
§ Centrally managed in \workflow_repository\config\client-profiles\executor
§ Runtime copy in Executor (and client) workspace
§ Specifies preferences, e.g. database drivers, Python environments, …
§ Is distributed to all Executors that connect to the Server



User Authentication: Local User Database

§ KNIME Server installer configures a database (H2) based authentication
method by default
§ Recommended for use by small workgroups
who do not have access to an LDAP system
§ The administration pages of the WebPortal can be used
to add/remove users, or create/remove groups



User Authentication: LDAP/AD

§ Possibility to integrate with existing user-/role-management system


§ It is possible to use Kerberos in combination with LDAP for Single-Sign-On for
authentication with KNIME Server
§ KNIME Server also allows authentication by JWT (JSON Web Tokens)



User Authentication: OAuth

Features:
§ Single sign-on (SSO) to KNIME Server
§ Integrate with multiple identity providers
§ Flexible configuration capabilities to map users and groups

[Diagram: Client, KNIME Server, Identity Provider]



KNIME Server Management Services
About Managing Preferences

§ Management (Client Preferences) makes IT operations easier by centrally
managing KNIME Analytics Platform preferences

§ Solution: Management (Client Preferences) is part of the Management Tools
available with KNIME Server Medium and KNIME Server Large

§ Features:
§ Easier IT Operations.
§ Manage Analytics Platform preferences centrally
§ Include dependencies – e.g. driver files.
§ Deliver updates to configurations automatically

§ Simplifies Executor setup in distributed Executor environments



Management (Client Preferences)

§ Different departments/teams have different requirements

§ Multiple OS deployments
§ Windows 7
§ Windows 10
§ Linux
§ macOS

Marketing: Windows 10; Hive, Spark
Finance: Windows 7; Oracle, MS Access
R&D: Linux and macOS; Python, R



Management (Client Preferences)

§ Client-profiles
§ Python-Linux
§ Python-macOS
§ R-Linux
§ Databases-Win7
§ Big Data-Win10
§ Executor

Profiles can include: Preferences, drivers, and more

[Diagram: KNIME Server client-profiles (Big Data-Win10, Databases-Win7, Python-macOS, etc.) delivered to Marketing, Finance, and R&D]


Management (Client Preferences)
Client-profiles
§ Default
  § default.epf
§ Python-Linux
  § python.epf
  § py-linux.sh
§ Python-macOS
  § python-mac.epf
  § py-mac.sh
§ R-macOS
  § R-mac.epf
§ Databases
  § db.epf
  § oracle.jdbc
  § msaccess.jdbc
§ Big Data
  § bigdata.epf

[Diagram: KNIME Server client-profiles (Big Data-Win10, Databases-Win7, Python-macOS, etc.) delivered to Marketing, Finance, and R&D]


Management (Client Preferences)

1. knime.ini
Add lines to knime.ini (the file is located in the same directory as the KNIME Analytics Platform
executable).
When KNIME Analytics Platform starts, KNIME Server is queried for the specified
preference profiles. Preferences are applied before startup finishes.



Management (Client Preferences)

2. Manually set a single preference

Allows users to browse available profiles and add them on demand.



KNIME Server Maintenance
Admin Workflows

§ KNIME Server comes with a set of admin workflows


§ Admin workflows make use of KNIME Server REST API
§ Allows automated KNIME Server maintenance



Backups

§ The following files and/or directories need to be backed up:


§ The full server repository directory, except for the “temp” folder
§ The full Tomcat directory
§ In case you installed your own molecule sketcher for KNIME WebPortal (see above), also back up
this directory
§ In order to restore a backup, copy the files and directories back to their original
locations and restart KNIME Server
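
The "everything except the temp folder" rule for the server repository can be scripted. The following is a minimal sketch using Python's `shutil`; the function name and the directory names in the usage note are placeholders, the destination must not exist yet, and the Tomcat directory would need to be copied separately.

```python
import shutil
from pathlib import Path


def backup_repository(repo_dir, backup_dir):
    """Copy the server repository to backup_dir, skipping its top-level 'temp' folder."""
    repo_dir = Path(repo_dir).resolve()

    def skip_top_level_temp(current_dir, names):
        # Only the 'temp' folder directly inside the repository is excluded;
        # folders named 'temp' deeper in the tree are still copied.
        if Path(current_dir).resolve() == repo_dir:
            return [name for name in names if name == "temp"]
        return []

    shutil.copytree(repo_dir, backup_dir, ignore=skip_top_level_temp)
    return Path(backup_dir)
```

A call such as `backup_repository("/srv/knime/workflow_repository", "/backup/workflow_repository")` (both paths hypothetical) would produce a copy that can later be moved back to the original location, as described in the restore step.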



Log Files

Tomcat Server Logs


§ Catalina Logs
§ Location: <tomcat-folder>/logs/catalina.yyyy-mm-dd.log
§ This file contains all general Tomcat server messages, such as startup and shutdown.
§ Localhost Logs
§ Location: <tomcat-folder>/logs/localhost.yyyy-mm-dd.log
§ This file contains all messages related to the KNIME Server operation.

KNIME Executor Logs


§ <executor workspace>/.metadata/knime/knime.log
§ Contains messages from the KNIME Executor that is used to execute workflows on the server.
§ <executor workspace>/.metadata/.log
§ Eclipse logs might give insights on failing executors.
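
When troubleshooting, it can help to compute the log locations listed above for a given day. The sketch below simply mirrors the path patterns from this slide; the function names and the folder arguments are illustrative placeholders.

```python
from datetime import date
from pathlib import PurePosixPath


def tomcat_log_paths(tomcat_folder, day):
    """Return the Catalina and Localhost log paths for the given day."""
    logs = PurePosixPath(tomcat_folder) / "logs"
    stamp = day.isoformat()  # yyyy-mm-dd
    return {
        "catalina": logs / f"catalina.{stamp}.log",
        "localhost": logs / f"localhost.{stamp}.log",
    }


def executor_log_paths(executor_workspace):
    """Return the KNIME Executor log and the Eclipse log inside the workspace."""
    metadata = PurePosixPath(executor_workspace) / ".metadata"
    return {
        "knime": metadata / "knime" / "knime.log",
        "eclipse": metadata / ".log",
    }


print(tomcat_log_paths("/opt/tomcat", date(2020, 11, 3))["catalina"])
# /opt/tomcat/logs/catalina.2020-11-03.log
```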



Best Practices

§ Set up permissions for the workflows deployed to KNIME Server – access rights
can be set for workflow groups, shared components, data files, workflows and
schedules
§ When updating a version of an existing workflow already stored on KNIME
Server, make sure to create a snapshot of the existing one

§ Remove any hardcoded paths and use relative paths instead



Review and Exercises
Explore KNIME Server REST API, Activity I (1/2)

§ Access KNIME Server REST API documentation pages:


http://3.249.93.238/knime/rest/doc/index.html
§ Explore different endpoints



Explore KNIME Server REST API, Activity I (2/2)

Path to the workflow: Examples → REST → Predict Results Using REST API
§ Right click menu → Show API Definition
§ Explore Execution Endpoint: GET Request
§ Try out and execute from browser



Config Details – Access KNIME Server

§ KNIME Server Address: http://3.249.93.238/


§ KNIME WebPortal Address: http://3.249.93.238/knime/webportal

Login credential: firstname.lastname


Password: knime



Additional Info

§ We will keep KNIME Server up and running for an additional week to let you
play around a little bit more with it
§ Interested in a trial license? Just send me an email at



Q&A
Thank You!
