Spring Cloud Data Flow Samples
Table of Contents
1. Overview
2. Java DSL
2.1. Deploying a stream programmatically
3. Streaming
3.1. HTTP to Cassandra Demo
3.1.1. Prerequisites
3.1.2. Using the Local Server
Additional Prerequisites
Building and Running the Demo
3.1.3. Using the Cloud Foundry Server
Additional Prerequisites
Running the Demo
3.1.4. Summary
3.2. HTTP to MySQL Demo
3.2.1. Prerequisites
3.2.2. Using the Local Server
Additional Prerequisites
Building and Running the Demo
3.2.3. Using the Cloud Foundry Server
Additional Prerequisites
Building and Running the Demo
3.2.4. Summary
3.3. HTTP to Gemfire Demo
3.3.1. Prerequisites
3.3.2. Using the Local Server
Additional Prerequisites
Building and Running the Demo
Using the Cloud Foundry Server
3.3.3. Summary
3.4. Gemfire CQ to Log Demo
3.4.1. Prerequisites
3.4.2. Using the Local Server
Additional Prerequisites
Building and Running the Demo
Version 1.0.0.BUILD-SNAPSHOT
Copies of this document may be made for your own use and for distribution to others, provided that
you do not charge any fee for such copies and further provided that each copy contains this
Copyright Notice, whether distributed in print or electronically.
1. Overview
This guide contains samples and demonstrations of how to build data pipelines with Spring Cloud
Data Flow (https://cloud.spring.io/spring-cloud-dataflow/).
2. Java DSL
2.1. Deploying a stream programmatically
This sample shows the two usage styles of the Java DSL to create and deploy a stream. You should
look in the source code
(https://github.com/spring-cloud/spring-cloud-dataflow-samples/tree/master/batch/javadsl/src/main) to get a
feel for the different styles.
1) Build the sample application:

BASH
./mvnw clean package
With no command line options, the application will deploy the stream http --server.port=9900 |
splitter --expression=payload.split(' ') | log using the URI localhost:9393 to connect to the
Data Flow server. There is also a command line option --style whose value can be either
definition or fluent . This option picks which Java DSL style will execute. Both are identical
in terms of behavior. The spring-cloud-dataflow-rest-client project provides
auto-configuration for DataFlowOperations and StreamBuilder :
JAVA
@Autowired
private DataFlowOperations dataFlowOperations;
@Autowired
private StreamBuilder builder;
You can use those beans to build streams as well as work directly with the DataFlowOperations
REST client.
The definition style has code of the form:
JAVA
Stream woodchuck = builder
    .name("woodchuck")
    .definition("http --server.port=9900 | splitter --expression=payload.split(' ') | log")
    .create()
    .deploy(deploymentProperties);
The fluent style has code of the form:

JAVA
Stream woodchuck = builder.name("woodchuck")
    .source(source)
    .processor(processor)
    .sink(sink)
    .create()
    .deploy(deploymentProperties);
where the source , processor , and sink variables are defined as @Bean s of type
StreamApplication :
JAVA
@Bean
public StreamApplication source() {
    return new StreamApplication("http").addProperty("server.port", 9900);
}
Another useful class is the DeploymentPropertiesBuilder which aids in the creation of the
Map of properties required to deploy stream applications.
JAVA
private Map<String, String> createDeploymentProperties() {
    DeploymentPropertiesBuilder propertiesBuilder = new DeploymentPropertiesBuilder();
    propertiesBuilder.memory("log", 512);
    propertiesBuilder.count("log", 2);
    propertiesBuilder.put("app.splitter.producer.partitionKeyExpression", "payload");
    return propertiesBuilder.build();
}
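The memory and count calls above are convenience methods; they populate the same map with
standard deployer property keys (for example, deployer.log.count ), so you can mix them freely
with put for any property the builder does not cover.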
2) Run a local Data Flow Server and run the sample application. This sample demonstrates the
use of the local Data Flow Server, but you can pass in the option --uri to point to another Data
Flow server instance that is running elsewhere.
BASH
$ java -jar target/scdfdsl-0.0.1-SNAPSHOT.jar
BASH
Deploying stream.
Waiting for deployment of stream.
Waiting for deployment of stream.
Waiting for deployment of stream.
Waiting for deployment of stream.
Waiting for deployment of stream.
Letting the stream run for 2 minutes.
To verify that the application has been deployed successfully, we will tail the logs of one of the
log sinks and post some data to the http source. You can find the location of the logs for one of
the log sink applications by looking in the Data Flow server’s log file.
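For example, you can post the test sentence with any HTTP client. The following is a minimal
sketch using java.net.http.HttpClient (Java 11+), assuming the http source is listening on
localhost:9900 as configured in the stream definition:

JAVA
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PostWords {
    public static void main(String[] args) throws Exception {
        // The splitter breaks this sentence into words, which are partitioned
        // across the two log sink instances.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9900"))
                .header("Content-Type", "text/plain")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "how much wood would a woodchuck chuck if a woodchuck could chuck wood"))
                .build();
        HttpResponse<Void> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding());
        System.out.println("HTTP status: " + response.statusCode());
    }
}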
5) Verify the output. Tail the log file of the first instance:
BASH
cd /tmp/spring-cloud-dataflow-4323595028663837160/woodchuck-1511390696355/woodchuck.log
tail -f stdout_0.log
BASH
2017-11-22 18:04:08.631 INFO 26652 --- [r.woodchuck-0-1] log-sink : how
2017-11-22 18:04:08.632 INFO 26652 --- [r.woodchuck-0-1] log-sink : chuck
2017-11-22 18:04:08.634 INFO 26652 --- [r.woodchuck-0-1] log-sink : chuck
Tailing the log file of the second instance:

BASH
cd /tmp/spring-cloud-dataflow-4323595028663837160/woodchuck-1511390696355/woodchuck.log
tail -f stdout_1.log
BASH
$ tail -f stdout_1.log
2017-11-22 18:04:08.636 INFO 26655 --- [r.woodchuck-1-1] log-sink : much
2017-11-22 18:04:08.638 INFO 26655 --- [r.woodchuck-1-1] log-sink : wood
2017-11-22 18:04:08.639 INFO 26655 --- [r.woodchuck-1-1] log-sink : would
2017-11-22 18:04:08.640 INFO 26655 --- [r.woodchuck-1-1] log-sink : a
2017-11-22 18:04:08.641 INFO 26655 --- [r.woodchuck-1-1] log-sink : woodchuck
2017-11-22 18:04:08.642 INFO 26655 --- [r.woodchuck-1-1] log-sink : if
2017-11-22 18:04:08.644 INFO 26655 --- [r.woodchuck-1-1] log-sink : a
2017-11-22 18:04:08.645 INFO 26655 --- [r.woodchuck-1-1] log-sink : woodchuck
2017-11-22 18:04:08.646 INFO 26655 --- [r.woodchuck-1-1] log-sink : could
2017-11-22 18:04:08.647 INFO 26655 --- [r.woodchuck-1-1] log-sink : wood
Note that the partitioning is done based on the hash of the java.lang.String object.
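The effect is roughly the following standalone sketch (an illustration only; the exact selector
Spring Cloud Stream applies by default is an assumption here, but equal strings always hash alike,
so repeated words land on the same sink instance):

JAVA
public class PartitionSketch {

    // Hash-based selection: the same key always maps to the same partition.
    static int selectPartition(String key, int partitionCount) {
        return Math.abs(key.hashCode()) % partitionCount;
    }

    public static void main(String[] args) {
        for (String word : "how much wood would a woodchuck chuck".split(" ")) {
            System.out.println(word + " -> partition " + selectPartition(word, 2));
        }
    }
}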
3. Streaming
3.1. HTTP to Cassandra Demo
In this demonstration, you will learn how to build a data pipeline using Spring Cloud Data Flow
(http://cloud.spring.io/spring-cloud-dataflow/) to consume data from an HTTP endpoint and write the
payload to a Cassandra database.
We will take you through the steps to configure and run a Spring Cloud Data Flow server in either a
local (https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-started/) or
Cloud Foundry
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle/#getting-
started)
environment.
3.1.1. Prerequisites
A Running Data Flow Shell
The Spring Cloud Data Flow Shell and Local server implementation are in the
same repository and are both built by running ./mvnw install from the project
root directory. If you have already run the build, use the jar in
spring-cloud-dataflow-shell/target.
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>
The Spring Cloud Data Flow Shell is a Spring Boot application that connects to the
Data Flow Server’s REST API and supports a DSL that simplifies the process of
defining a stream or task and managing its lifecycle. Most of these samples use
the shell. If you prefer, you can use the Data Flow UI at localhost:9393/dashboard
(or wherever the server is hosted) to perform equivalent operations.
Additional Prerequisites
A running local Data Flow Server
The Local Data Flow Server is a Spring Boot application available for download
(http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations) or you can build
(https://github.com/spring-cloud/spring-cloud-dataflow) it yourself. If you build it yourself, the
executable jar will be in spring-cloud-dataflow-server-local/target
To run the Local Data Flow server, open a new terminal session:

$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-LOCAL-JAR>
$ java -jar spring-cloud-dataflow-server-local-<VERSION>.jar
These samples assume that the Data Flow Server can access a remote Maven
repository, repo.spring.io/libs-release by default. If your Data Flow
server is running behind a firewall, or you are using a Maven proxy
preventing access to public repositories, you will need to install the sample
apps in your internal Maven repository and configure
(https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-started-maven-configuration)
the server accordingly. The sample applications are typically registered using
Data Flow’s bulk import facility, for example the Shell command
dataflow:>app import --uri bit.ly/Celsius-SR1-stream-applications-rabbit-maven
(the actual URI is release- and binder-specific, so refer to the sample
instructions for the actual URL). The bulk import URI references a plain text
file containing entries for all of the publicly available Spring Cloud Stream
and Task applications published to repo.spring.io . For example,
source.http=maven://org.springframework.cloud.stream.app:http-source-rabbit:1.3.1.RELEASE
registers the http source app at the corresponding Maven address, relative to
the remote repository(ies) configured for the Data Flow server. The format is
maven://<groupId>:<artifactId>:<version> . You will need to download
(https://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-Bacon.RELEASE.rabbit-apps-maven-repo-url.properties)
the required apps or build (https://github.com/spring-cloud-stream-app-starters)
them and then install them in your Maven repository, using whatever group,
artifact, and version you choose. If you do this, register individual apps using
dataflow:>app register… using the maven:// resource URI format
corresponding to your installed app.
dataflow:>stream list
5. Post sample data pointing to the http endpoint: localhost:8888 ( 8888 is the
server.port we specified for the http source in this case)
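Any HTTP client works here. Below is a minimal Java 11 sketch, assuming the http source listens
on localhost:8888; the book payload is illustrative, so match the field names to the columns of
your clouddata.book table:

JAVA
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PostBook {
    public static void main(String[] args) throws Exception {
        // Hypothetical payload; the cassandra sink maps JSON fields to table columns.
        String json = "{\"isbn\": \"1599869772\", \"title\": \"The Art of War\", \"author\": \"Sun Tzu\"}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8888"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
        HttpResponse<Void> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding());
        System.out.println("HTTP status: " + response.statusCode());
    }
}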
6. Connect to the Cassandra instance and query the table clouddata.book to list the persisted
records
7. You’re done!
Additional Prerequisites
Cloud Foundry instance
The Cloud Foundry Data Flow Server is a Spring Boot application available for download
(http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations/) or you can build
(https://github.com/spring-cloud/spring-cloud-dataflow-server-cloudfoundry) it yourself. If you build it
yourself, the executable jar will be in spring-cloud-dataflow-server-cloudfoundry/target
Although you can run the Data Flow Cloud Foundry Server locally and configure
it to deploy to any Cloud Foundry instance, we will deploy the server to Cloud
Foundry as recommended.
1. Verify that the CF instance is reachable (your endpoint urls will be different from what is shown
here).
$ cf api
API endpoint: https://api.system.io (API version: ...)
$ cf apps
Getting apps in org [your-org] / space [your-space] as user...
OK
No apps found
2. Follow the instructions to deploy the Spring Cloud Data Flow Cloud Foundry server
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle).
Don’t worry about creating a Redis service. We won’t need it. If you are familiar with Cloud
Foundry application manifests, we recommend creating a manifest for the Data Flow server as
shown here
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle/#sample-manifest-template).
3. Once you have successfully executed cf push , verify the dataflow server is running
$ cf apps
Getting apps in org [your-org] / space [your-space] as user...
OK
4. Notice that the dataflow-server application is started and ready for interaction via the url
endpoint
5. Connect the shell with the server running on Cloud Foundry, e.g., dataflow-server.app.io:
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
server-unknown:>
1. Register
(https://github.com/spring-cloud/spring-cloud-dataflow/blob/master/spring-cloud-dataflow-
docs/src/main/asciidoc/streams.adoc#register-a-stream-app)
the out-of-the-box applications for the Rabbit binder
These samples assume that the Data Flow Server can access a remote Maven
repository, repo.spring.io/libs-release by default. If your Data Flow
server is running behind a firewall, or you are using a Maven proxy
preventing access to public repositories, you will need to install the sample
apps in your internal Maven repository and configure
(https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-started-maven-configuration)
the server accordingly. The sample applications are typically registered using
Data Flow’s bulk import facility, for example the Shell command
dataflow:>app import --uri bit.ly/Celsius-SR1-stream-applications-rabbit-maven
(the actual URI is release- and binder-specific, so refer to the sample
instructions for the actual URL). The bulk import URI references a plain text
file containing entries for all of the publicly available Spring Cloud Stream
and Task applications published to repo.spring.io . For example,
source.http=maven://org.springframework.cloud.stream.app:http-source-rabbit:1.3.1.RELEASE
registers the http source app at the corresponding Maven address, relative to
the remote repository(ies) configured for the Data Flow server. The format is
maven://<groupId>:<artifactId>:<version> . You will need to download
(https://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-Bacon.RELEASE.rabbit-apps-maven-repo-url.properties)
the required apps or build (https://github.com/spring-cloud-stream-app-starters)
them and then install them in your Maven repository, using whatever group,
artifact, and version you choose. If you do this, register individual apps using
dataflow:>app register… using the maven:// resource URI format
corresponding to your installed app.
You may want to change the cassandrastream name in PCF. If you have enabled a random
application name prefix, you could run into issues with the route name being too long.
dataflow:>stream list
$ cf apps
Getting apps in org [your-org] / space [your-space] as user...
OK
Look up the url for the cassandrastream-http application from the list above. Post sample data
pointing to the http endpoint: <YOUR-cassandrastream-http-APP-URL>
Connect to the Cassandra instance and query the table book to list the data inserted
Now, let’s try to take advantage of Pivotal Cloud Foundry’s platform capability. Let’s scale the
cassandrastream-http application from 1 to 3 instances:
$ cf scale cassandrastream-http -i 3
Scaling app cassandrastream-http in org user-dataflow / space development as user...
OK
$ cf apps
Getting apps in org user-dataflow / space development as user...
OK
You’re done!
3.1.4. Summary
In this sample, you have learned:
How to use Spring Cloud Data Flow’s Local and Cloud Foundry servers
3.2. HTTP to MySQL Demo

We will take you through the steps to configure and run a Spring Cloud Data Flow server in either a
local (https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-started/) or
Cloud Foundry
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle/#getting-
started)
environment.
3.2.1. Prerequisites
A Running Data Flow Shell
The Spring Cloud Data Flow Shell and Local server implementation are in the
same repository and are both built by running ./mvnw install from the project
root directory. If you have already run the build, use the jar in
spring-cloud-dataflow-shell/target.
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>
The Spring Cloud Data Flow Shell is a Spring Boot application that connects to the
Data Flow Server’s REST API and supports a DSL that simplifies the process of
defining a stream or task and managing its lifecycle. Most of these samples use
the shell. If you prefer, you can use the Data Flow UI at localhost:9393/dashboard
(or wherever the server is hosted) to perform equivalent operations.
Additional Prerequisites
A running local Data Flow Server
The Local Data Flow Server is a Spring Boot application available for download
(http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations) or you can build
(https://github.com/spring-cloud/spring-cloud-dataflow) it yourself. If you build it yourself, the
executable jar will be in spring-cloud-dataflow-server-local/target
To run the Local Data Flow server, open a new terminal session:

$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-LOCAL-JAR>
$ java -jar spring-cloud-dataflow-server-local-<VERSION>.jar
Create the test database with a names table (in MySQL) using:
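A minimal sketch of such a schema (an assumption; all this demo needs is a names table with a
single name column):

SQL
CREATE DATABASE test;
USE test;
CREATE TABLE names (name VARCHAR(255));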
These samples assume that the Data Flow Server can access a remote Maven
repository, repo.spring.io/libs-release by default. If your Data Flow
server is running behind a firewall, or you are using a Maven proxy
preventing access to public repositories, you will need to install the sample
apps in your internal Maven repository and configure
(https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-started-maven-configuration)
the server accordingly. The sample applications are typically registered using
Data Flow’s bulk import facility, for example the Shell command
dataflow:>app import --uri bit.ly/Celsius-SR1-stream-applications-rabbit-maven
(the actual URI is release- and binder-specific, so refer to the sample
instructions for the actual URL). The bulk import URI references a plain text
file containing entries for all of the publicly available Spring Cloud Stream
and Task applications published to repo.spring.io . For example,
source.http=maven://org.springframework.cloud.stream.app:http-source-rabbit:1.3.1.RELEASE
registers the http source app at the corresponding Maven address, relative to
the remote repository(ies) configured for the Data Flow server. The format is
maven://<groupId>:<artifactId>:<version> . You will need to download
(https://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-Bacon.RELEASE.rabbit-apps-maven-repo-url.properties)
the required apps or build (https://github.com/spring-cloud-stream-app-starters)
them and then install them in your Maven repository, using whatever group,
artifact, and version you choose. If you do this, register individual apps using
dataflow:>app register… using the maven:// resource URI format
corresponding to your installed app.
dataflow:>stream list
5. Post sample data pointing to the http endpoint: localhost:8787 [ 8787 is the
server.port we specified for the http source in this case]
6. Connect to the MySQL instance and query the table test.names to list the new rows:
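For a quick check from Java, something like the following works (a sketch that assumes MySQL
Connector/J on the classpath; the URL and credentials are placeholders for your own instance):

JAVA
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class QueryNames {
    public static void main(String[] args) throws Exception {
        // Placeholder connection settings; adjust host, port, and credentials.
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/test", "root", "<password>");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT name FROM names")) {
            while (rs.next()) {
                System.out.println(rs.getString("name"));
            }
        }
    }
}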
7. You’re done!
Additional Prerequisites
Cloud Foundry instance
The Cloud Foundry Data Flow Server is a Spring Boot application available for download
(http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations/) or you can build
(https://github.com/spring-cloud/spring-cloud-dataflow-server-cloudfoundry) it yourself. If you build it
yourself, the executable jar will be in spring-cloud-dataflow-server-cloudfoundry/target
Although you can run the Data Flow Cloud Foundry Server locally and configure
it to deploy to any Cloud Foundry instance, we will deploy the server to Cloud
Foundry as recommended.
1. Verify that the CF instance is reachable (your endpoint urls will be different from what is shown
here).
$ cf api
API endpoint: https://api.system.io (API version: ...)
$ cf apps
Getting apps in org [your-org] / space [your-space] as user...
OK
No apps found
2. Follow the instructions to deploy the Spring Cloud Data Flow Cloud Foundry server
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle).
Don’t worry about creating a Redis service. We won’t need it. If you are familiar with Cloud
Foundry application manifests, we recommend creating a manifest for the Data Flow server as
shown here
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle/#sample-manifest-template).
3. Once you have successfully executed cf push , verify the dataflow server is running
$ cf apps
Getting apps in org [your-org] / space [your-space] as user...
OK
4. Notice that the dataflow-server application is started and ready for interaction via the url
endpoint
5. Connect the shell with the server running on Cloud Foundry, e.g., dataflow-server.app.io:
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
server-unknown:>
These samples assume that the Data Flow Server can access a remote Maven
repository, repo.spring.io/libs-release by default. If your Data Flow
server is running behind a firewall, or you are using a Maven proxy
preventing access to public repositories, you will need to install the sample
apps in your internal Maven repository and configure
(https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-started-maven-configuration)
the server accordingly. The sample applications are typically registered using
Data Flow’s bulk import facility, for example the Shell command
dataflow:>app import --uri bit.ly/Celsius-SR1-stream-applications-rabbit-maven
(the actual URI is release- and binder-specific, so refer to the sample
instructions for the actual URL). The bulk import URI references a plain text
file containing entries for all of the publicly available Spring Cloud Stream
and Task applications published to repo.spring.io . For example,
source.http=maven://org.springframework.cloud.stream.app:http-source-rabbit:1.3.1.RELEASE
registers the http source app at the corresponding Maven address, relative to
the remote repository(ies) configured for the Data Flow server. The format is
maven://<groupId>:<artifactId>:<version> . You will need to download
(https://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-Bacon.RELEASE.rabbit-apps-maven-repo-url.properties)
the required apps or build (https://github.com/spring-cloud-stream-app-starters)
them and then install them in your Maven repository, using whatever group,
artifact, and version you choose. If you do this, register individual apps using
dataflow:>app register… using the maven:// resource URI format
corresponding to your installed app.
service and only this application in the stream gets the service binding. This
also eliminates the requirement to supply datasource credentials in the stream
definition.
dataflow:>stream list
$ cf apps
Getting apps in org user-dataflow / space development as user...
OK
5. Look up the url for the mysqlstream-http application from the list above. Post sample data
pointing to the http endpoint: <YOUR-mysqlstream-http-APP-URL>
6. Connect to the MySQL instance and query the table names to list the new rows:
7. Now, let’s take advantage of Pivotal Cloud Foundry’s platform capability. Let’s scale the
mysqlstream-http application from 1 to 3 instances
$ cf scale mysqlstream-http -i 3
Scaling app mysqlstream-http in org user-dataflow / space development as user...
OK
$ cf apps
Getting apps in org user-dataflow / space development as user...
OK
9. You’re done!
3.2.4. Summary
In this sample, you have learned:
How to use Spring Cloud Data Flow’s Local and Cloud Foundry servers
3.3. HTTP to Gemfire Demo

We will take you through the steps to configure and run a Spring Cloud Data Flow server in either a
local (https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-started/) or
Cloud Foundry
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle/#getting-
started)
environment.
For legacy reasons the gemfire Spring Cloud Stream Apps are named after
Pivotal GemFire. The code base for the commercial product has since been
open sourced as Apache Geode. These samples should work with compatible
versions of Pivotal GemFire or Apache Geode. Herein we will refer to the
installed IMDG simply as Geode.
3.3.1. Prerequisites
A Running Data Flow Shell
The Spring Cloud Data Flow Shell and Local server implementation are in the
same repository and are both built by running ./mvnw install from the project
root directory. If you have already run the build, use the jar in
spring-cloud-dataflow-shell/target.
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>
The Spring Cloud Data Flow Shell is a Spring Boot application that connects to the
Data Flow Server’s REST API and supports a DSL that simplifies the process of
defining a stream or task and managing its lifecycle. Most of these samples use
the shell. If you prefer, you can use the Data Flow UI at localhost:9393/dashboard
(or wherever the server is hosted) to perform equivalent operations.
If you do not have access to an existing Geode installation, install Apache Geode
(http://geode.apache.org) or Pivotal GemFire (http://geode.apache.org/) and start the gfsh CLI in a
separate terminal.
_________________________ __
/ _____/ ______/ ______/ /____/ /
/ / __/ /___ /_____ / _____ /
/ /__/ / ____/ _____/ / / / /
/______/_/ /______/_/ /_/ 1.2.1
Additional Prerequisites
A running local Data Flow Server
The Local Data Flow Server is a Spring Boot application available for download
(http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations) or you can build
(https://github.com/spring-cloud/spring-cloud-dataflow) it yourself. If you build it yourself, the
executable jar will be in spring-cloud-dataflow-server-local/target
To run the Local Data Flow server, open a new terminal session:

$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-LOCAL-JAR>
$ java -jar spring-cloud-dataflow-server-local-<VERSION>.jar
3. Register
(https://github.com/spring-cloud/spring-cloud-dataflow/blob/master/spring-cloud-dataflow-
docs/src/main/asciidoc/streams.adoc#register-a-stream-app)
the out-of-the-box applications for the Rabbit binder
These samples assume that the Data Flow Server can access a remote Maven
repository, repo.spring.io/libs-release by default. If your Data Flow
server is running behind a firewall, or you are using a Maven proxy
preventing access to public repositories, you will need to install the sample
apps in your internal Maven repository and configure
(https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-started-maven-configuration)
the server accordingly. The sample applications are typically registered using
Data Flow’s bulk import facility, for example the Shell command
dataflow:>app import --uri bit.ly/Celsius-SR1-stream-applications-rabbit-maven
(the actual URI is release- and binder-specific, so refer to the sample
instructions for the actual URL). The bulk import URI references a plain text
file containing entries for all of the publicly available Spring Cloud Stream
and Task applications published to repo.spring.io . For example,
source.http=maven://org.springframework.cloud.stream.app:http-source-rabbit:1.3.1.RELEASE
registers the http source app at the corresponding Maven address, relative to
the remote repository(ies) configured for the Data Flow server. The format is
maven://<groupId>:<artifactId>:<version> . You will need to download
(https://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-Bacon.RELEASE.rabbit-apps-maven-repo-url.properties)
the required apps or build (https://github.com/spring-cloud-stream-app-starters)
them and then install them in your Maven repository, using whatever group,
artifact, and version you choose. If you do this, register individual apps using
dataflow:>app register… using the maven:// resource URI format
corresponding to your installed app.
This example creates an http endpoint to which we will post stock prices as a JSON document
containing symbol and price fields. The --json=true property enables Geode’s JSON
support and configures the sink to convert JSON String payloads to PdxInstance
(https://geode.apache.org/releases/latest/javadoc/org/apache/geode/pdx/PdxInstance.html), the
recommended way to store JSON documents in Geode. The keyExpression property is a
SpEL expression used to extract the symbol value from the PdxInstance to use as an entry key.
If the Geode locator isn’t running on the default port on localhost , add the
options --connect-type=locator --host-addresses=<host>:<port> . If
there are multiple locators, you can provide a comma-separated list of locator
addresses. This is not necessary for the sample but is typical for production
environments to enable fail-over.
dataflow:>stream list
6. Post sample data pointing to the http endpoint: localhost:9090 ( 9090 is the port we
specified for the http source)
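As before, any HTTP client will do; here is a minimal Java 11 sketch, assuming the http source
listens on localhost:9090. The payload mirrors the gfsh query result shown below, and posting
again for the same symbol overwrites the same region entry, since the key is derived from the
symbol field:

JAVA
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PostStockPrice {
    public static void main(String[] args) throws Exception {
        String json = "{\"symbol\": \"VMW\", \"price\": 117.06}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:9090"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
        HttpResponse<Void> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding());
        System.out.println("HTTP status: " + response.statusCode());
    }
}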
7. Using gfsh , connect to the locator if not already connected, and verify the cache entry was
created.
symbol | price
------ | ------
VMW | 117.06
8. You’re done!
The Cloud Foundry Data Flow Server is a Spring Boot application available for download
(http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations/) or you can build
(https://github.com/spring-cloud/spring-cloud-dataflow-server-cloudfoundry) it yourself. If you build it
yourself, the executable jar will be in spring-cloud-dataflow-server-cloudfoundry/target
Although you can run the Data Flow Cloud Foundry Server locally and configure
it to deploy to any Cloud Foundry instance, we will deploy the server to Cloud
Foundry as recommended.
1. Verify that the CF instance is reachable (your endpoint urls will be different from what is shown
here).
$ cf api
API endpoint: https://api.system.io (API version: ...)
$ cf apps
Getting apps in org [your-org] / space [your-space] as user...
OK
No apps found
2. Follow the instructions to deploy the Spring Cloud Data Flow Cloud Foundry server
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle).
Don’t worry about creating a Redis service. We won’t need it. If you are familiar with Cloud
Foundry application manifests, we recommend creating a manifest for the Data Flow server as
shown here
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle/#sample-manifest-template).
3. Once you have successfully executed cf push , verify the dataflow server is running
$ cf apps
Getting apps in org [your-org] / space [your-space] as user...
OK
4. Notice that the dataflow-server application is started and ready for interaction via the url
endpoint
5. Connect the shell with the server running on Cloud Foundry, e.g., dataflow-server.app.io:
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
server-unknown:>
These samples assume that the Data Flow Server can access a remote Maven
repository, repo.spring.io/libs-release by default. If your Data Flow
server is running behind a firewall, or you are using a Maven proxy
preventing access to public repositories, you will need to install the sample
apps in your internal Maven repository and configure
(https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-started-maven-configuration)
the server accordingly. The sample applications are typically registered using
Data Flow’s bulk import facility, for example the Shell command
dataflow:>app import --uri bit.ly/Celsius-SR1-stream-applications-rabbit-maven
(the actual URI is release- and binder-specific, so refer to the sample
instructions for the actual URL). The bulk import URI references a plain text
file containing entries for all of the publicly available Spring Cloud Stream
and Task applications published to repo.spring.io . For example,
source.http=maven://org.springframework.cloud.stream.app:http-source-rabbit:1.3.1.RELEASE
registers the http source app at the corresponding Maven address, relative to
the remote repository(ies) configured for the Data Flow server. The format is
maven://<groupId>:<artifactId>:<version> . You will need to download
(https://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-Bacon.RELEASE.rabbit-apps-maven-repo-url.properties)
the required apps or build (https://github.com/spring-cloud-stream-app-starters)
them and then install them in your Maven repository, using whatever group,
artifact, and version you choose. If you do this, register individual apps using
dataflow:>app register… using the maven:// resource URI format
corresponding to your installed app.
{
"locators": [
"10.0.16.9[55221]",
"10.0.16.11[55221]",
"10.0.16.10[55221]"
],
"urls": {
"gfsh": "http://...",
"pulse": "http://.../pulse"
},
"users": [
{
"password": <password>,
"username": "cluster_operator"
},
{
"password": <password>,
"username": "developer"
}
]
}
3. Using gfsh , connect to the PCC instance as cluster_operator using the service key values
and create the Stocks region.
This example creates an http endpoint to which we will post stock prices as a JSON document
containing symbol and price fields. The --json=true property enables Geode’s JSON
support and configures the sink to convert JSON String payloads to PdxInstance
(https://geode.apache.org/releases/latest/javadoc/org/apache/geode/pdx/PdxInstance.html), the
recommended way to store JSON documents in Geode. The keyExpression property is a
SpEL expression used to extract the symbol value from the PdxInstance to use as an entry key.
dataflow:>stream list
7. Using gfsh , connect to the PCC instance as cluster_operator using the service key values.
symbol | price
------ | ------
VMW | 117.06
8. You’re done!
3.3.3. Summary
In this sample, you have learned:
How to use Spring Cloud Data Flow’s Local and Cloud Foundry servers
How to create a streaming data pipeline to connect and write to gemfire
3.4. Gemfire CQ to Log Demo

We will take you through the steps to configure and run a Spring Cloud Data Flow server in either a
local (https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-started/) or
Cloud Foundry
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle/#getting-
started)
environment.
For legacy reasons the gemfire Spring Cloud Stream Apps are named after
Pivotal GemFire. The code base for the commercial product has since been
open sourced as Apache Geode. These samples should work with compatible
versions of Pivotal GemFire or Apache Geode. Herein we will refer to the
installed IMDG simply as Geode.
3.4.1. Prerequisites
A Running Data Flow Shell
The Spring Cloud Data Flow Shell and Local server implementation are in the
same repository and are both built by running ./mvnw install from the project
root directory. If you have already run the build, use the jar in
spring-cloud-dataflow-shell/target.
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>
The Spring Cloud Data Flow Shell is a Spring Boot application that connects to the
Data Flow Server’s REST API and supports a DSL that simplifies the process of
defining a stream or task and managing its lifecycle. Most of these samples use
the shell. If you prefer, you can use the Data Flow UI at localhost:9393/dashboard
(or wherever the server is hosted) to perform equivalent operations.
If you do not have access to an existing Geode installation, install Apache Geode
(http://geode.apache.org) or Pivotal GemFire (http://geode.apache.org/) and start the gfsh CLI in a
separate terminal.
_________________________ __
/ _____/ ______/ ______/ /____/ /
/ / __/ /___ /_____ / _____ /
/ /__/ / ____/ _____/ / / / /
/______/_/ /______/_/ /_/ 1.2.1
Additional Prerequisites
A Running Data Flow Server
The Local Data Flow Server is a Spring Boot application available for download
(http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations) or you can build
(https://github.com/spring-cloud/spring-cloud-dataflow) it yourself. If you build it yourself, the
executable jar will be in spring-cloud-dataflow-server-local/target
To run the Local Data Flow server, open a new terminal session:

$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-LOCAL-JAR>
$ java -jar spring-cloud-dataflow-server-local-<VERSION>.jar
3. Register
(https://github.com/spring-cloud/spring-cloud-dataflow/blob/master/spring-cloud-dataflow-
docs/src/main/asciidoc/streams.adoc#register-a-stream-app)
the out-of-the-box applications for the Rabbit binder
These samples assume that the Data Flow Server can access a remote Maven
repository, repo.spring.io/libs-release by default. If your Data Flow
server is running behind a firewall, or you are using a Maven proxy
preventing access to public repositories, you will need to install the sample
apps in your internal Maven repository and configure
(https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-started-maven-configuration)
the server accordingly. The sample applications are typically registered using
Data Flow’s bulk import facility, for example the Shell command
dataflow:>app import --uri bit.ly/Celsius-SR1-stream-applications-rabbit-maven
(the actual URI is release- and binder-specific, so refer to the sample
instructions for the actual URL). The bulk import URI references a plain text
file containing entries for all of the publicly available Spring Cloud Stream
and Task applications published to repo.spring.io . For example,
source.http=maven://org.springframework.cloud.stream.app:http-source-rabbit:1.3.1.RELEASE
registers the http source app at the corresponding Maven address, relative to
the remote repository(ies) configured for the Data Flow server. The format is
maven://<groupId>:<artifactId>:<version> . You will need to download
(https://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-Bacon.RELEASE.rabbit-apps-maven-repo-url.properties)
the required apps or build (https://github.com/spring-cloud-stream-app-starters)
them and then install them in your Maven repository, using whatever group,
artifact, and version you choose. If you do this, register individual apps using
dataflow:>app register… using the maven:// resource URI format
corresponding to your installed app.
This example creates a gemfire-cq source, which will publish events matching query
criteria on a region. In this case we will monitor the Orders region. For simplicity, we will
avoid creating a data structure for the order. Each cache entry contains an integer value
representing the quantity of the ordered item. This stream will fire a message whenever the
value exceeds 999. By default, the source emits only the value. Here we will override that using the
cq-event-expression property. This accepts a SpEL expression bound to a CQEvent
(https://geode.apache.org/releases/latest/javadoc/org/apache/geode/cache/query/CqEvent.html). To
reference the entire CQEvent instance, we use #this . In order to display the contents in the
log, we will invoke toString() on the instance.
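To get a feel for what this expression does, here is a standalone SpEL sketch that evaluates
#this.toString() against a stub object standing in for the CQEvent (the StubCqEvent type is
invented for illustration; only the spring-expression library is needed):

JAVA
import org.springframework.expression.Expression;
import org.springframework.expression.spel.standard.SpelExpressionParser;

public class CqExpressionSketch {

    // Invented stand-in for Geode's CQEvent, just to exercise the expression.
    static class StubCqEvent {
        private final Object key;
        private final Object newValue;

        StubCqEvent(Object key, Object newValue) {
            this.key = key;
            this.newValue = newValue;
        }

        @Override
        public String toString() {
            return "CqEvent[key=" + key + ", newValue=" + newValue + "]";
        }
    }

    public static void main(String[] args) {
        // '#this' is the root of the evaluation context, i.e. the event itself.
        Expression expression = new SpelExpressionParser().parseExpression("#this.toString()");
        String logged = expression.getValue(new StubCqEvent("order1", 1000), String.class);
        System.out.println(logged); // CqEvent[key=order1, newValue=1000]
    }
}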
If the Geode locator isn’t running on the default port on localhost , add the
options --connect-type=locator --host-addresses=<host>:<port> . If
there are multiple locators, you can provide a comma-separated list of locator
addresses. This is not necessary for the sample but is typical for production
environments to enable fail-over.
dataflow:>stream list
6. Monitor stdout for the log sink. When you deploy the stream, you will see log messages in the
Data Flow server console like this
Copy the location of the log sink logs. This is a directory that ends in orders.log . The log
files will be in stdout_0.log under this directory. You can monitor the output of the log sink
using tail , or something similar:
$ tail -f /var/folders/hd/5yqz2v2d3sxd3n879f4sg4gr0000gn/T/spring-cloud-dataflow-5375107584795488581/orders-1509370775940/orders.log/stdout_0.log
1. You’re done!
Additional Prerequisites
A Cloud Foundry instance
The Cloud Foundry Data Flow Server is a Spring Boot application available for download
(http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations/) or you can build
(https://github.com/spring-cloud/spring-cloud-dataflow-server-cloudfoundry) it yourself. If you build it
yourself, the executable jar will be in spring-cloud-dataflow-server-cloudfoundry/target
Although you can run the Data Flow Cloud Foundry Server locally and configure
it to deploy to any Cloud Foundry instance, we will deploy the server to Cloud
Foundry as recommended.
1. Verify that the CF instance is reachable (your endpoint urls will be different from what is shown
here).
$ cf api
API endpoint: https://api.system.io (API version: ...)
$ cf apps
Getting apps in org [your-org] / space [your-space] as user...
OK
No apps found
2. Follow the instructions to deploy the Spring Cloud Data Flow Cloud Foundry server
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle).
Don’t worry about creating a Redis service. We won’t need it. If you are familiar with Cloud
Foundry application manifests, we recommend creating a manifest for the Data Flow server as
shown here
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle/#sample-manifest-template).
3. Once you have successfully executed cf push , verify the dataflow server is running
$ cf apps
Getting apps in org [your-org] / space [your-space] as user...
OK
4. Notice that the dataflow-server application is started and ready for interaction via the url
endpoint
5. Connect the shell with the server running on Cloud Foundry, e.g., dataflow-server.app.io:
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
server-unknown:>
These samples assume that the Data Flow Server can access a remote Maven
repository, repo.spring.io/libs-release by default. If your Data Flow
server is running behind a firewall, or you are using a Maven proxy
preventing access to public repositories, you will need to install the sample
apps in your internal Maven repository and configure
(https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-started-maven-configuration)
the server accordingly. The sample applications are typically registered using
Data Flow’s bulk import facility, for example the Shell command
dataflow:>app import --uri bit.ly/Celsius-SR1-stream-applications-rabbit-maven
(the actual URI is release- and binder-specific, so refer to the sample
instructions for the actual URL). The bulk import URI references a plain text
file containing entries for all of the publicly available Spring Cloud Stream
and Task applications published to repo.spring.io . For example,
source.http=maven://org.springframework.cloud.stream.app:http-source-rabbit:1.3.1.RELEASE
registers the http source app at the corresponding Maven address, relative to
the remote repository(ies) configured for the Data Flow server. The format is
maven://<groupId>:<artifactId>:<version> . You will need to download
(https://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-Bacon.RELEASE.rabbit-apps-maven-repo-url.properties)
the required apps or build (https://github.com/spring-cloud-stream-app-starters)
them and then install them in your Maven repository, using whatever group,
artifact, and version you choose. If you do this, register individual apps using
dataflow:>app register… using the maven:// resource URI format
corresponding to your installed app.
{
"locators": [
"10.0.16.9[55221]",
"10.0.16.11[55221]",
"10.0.16.10[55221]"
],
"urls": {
"gfsh": "http://...",
"pulse": "http://.../pulse"
},
"users": [
{
"password": <password>,
"username": "cluster_operator"
},
{
"password": <password>,
"username": "developer"
}
]
}
3. Using gfsh , connect to the PCC instance as cluster_operator using the service key values
and create the Test region.
This example creates a gemfire-cq source, which will publish events matching query
criteria on a region. In this case we will monitor the Orders region. For simplicity, we will
avoid creating a data structure for the order. Each cache entry contains an integer value
representing the quantity of the ordered item. This stream will fire a message whenever the
value exceeds 999. By default, the source emits only the value. Here we will override that using the
cq-event-expression property. This accepts a SpEL expression bound to a CQEvent
(https://geode.apache.org/releases/latest/javadoc/org/apache/geode/cache/query/CqEvent.html). To
reference the entire CQEvent instance, we use #this . In order to display the contents in the
log, we will invoke toString() on the instance.
dataflow:>stream list
cf logs <log-sink-app-name>
3.4.4. Summary
In this sample, you have learned:
How to use Spring Cloud Data Flow’s Local and Cloud Foundry servers
How to create a streaming data pipeline to connect and publish CQ events from gemfire
We will take you through the steps to configure and run a Spring Cloud Data Flow server in either a
local (https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-started/) or
Cloud Foundry
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle/#getting-
started)
environment.
For legacy reasons the gemfire Spring Cloud Stream Apps are named after
Pivotal GemFire . The code base for the commercial product has since been
open sourced as Apache Geode . These samples should work with compatible
versions of Pivotal GemFire or Apache Geode. Herein we will refer to the
installed IMDG simply as Geode .
3.5.1. Prerequisites
A Running Data Flow Shell
The Spring Cloud Data Flow Shell and Local server implementation are in the
same repository and are both built by running ./mvnw install from the project
root directory. If you have already run the build, use the jar in spring-cloud-
dataflow-shell/target
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>
The Spring Cloud Data Flow Shell is a Spring Boot application that connects to the
Data Flow Server’s REST API and supports a DSL that simplifies the process of
defining a stream or task and managing its lifecycle. Most of these samples use
the shell. If you prefer, you can use the Data Flow UI at localhost:9393/dashboard (or wherever the server is hosted) to perform equivalent operations.
If you do not have access to an existing Geode installation, install Apache Geode
(http://geode.apache.org) or Pivotal Gemfire (http://geode.apache.org/) and start the gfsh CLI in a
separate terminal.
_________________________ __
/ _____/ ______/ ______/ /____/ /
/ / __/ /___ /_____ / _____ /
/ /__/ / ____/ _____/ / / / /
/______/_/ /______/_/ /_/ 1.2.1
Additional Prerequisites
A Running Data Flow Server
The Local Data Flow Server is a Spring Boot application available for download
(http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations) or you can build
(https://github.com/spring-cloud/spring-cloud-dataflow) it yourself. If you build it yourself, the
executable jar will be in spring-cloud-dataflow-server-local/target
To run the Local Data Flow server, open a new terminal session:
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-LOCAL-JAR>
$ java -jar spring-cloud-dataflow-server-local-<VERSION>.jar
3. Register
(https://github.com/spring-cloud/spring-cloud-dataflow/blob/master/spring-cloud-dataflow-
docs/src/main/asciidoc/streams.adoc#register-a-stream-app)
the out-of-the-box applications for the Rabbit binder
These samples assume that the Data Flow Server can access a remote Maven
repository, repo.spring.io/libs-release by default. If your Data Flow
server is running behind a firewall, or you are using a maven proxy
preventing access to public repositories, you will need to install the sample
apps in your internal Maven repository and configure
(https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-
started-maven-configuration)
the server accordingly. The sample applications are typically registered using
Data Flow’s bulk import facility. For example, the Shell command
dataflow:>app import --uri bit.ly/Celsius-SR1-stream-
applications-rabbit-maven (The actual URI is release and binder specific so
refer to the sample instructions for the actual URL). The bulk import URI
references a plain text file containing entries for all of the publicly available
Spring Cloud Stream and Task applications published to repo.spring.io .
For example,
source.http=maven://org.springframework.cloud.stream.app:http-
source-rabbit:1.3.1.RELEASE registers the http source app at the
corresponding Maven address, relative to the remote repository(ies)
configured for the Data Flow server. The format is maven://<groupId>:<artifactId>:<version>. You will need to download
(https://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-
stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-
Bacon.RELEASE.rabbit-apps-maven-repo-url.properties)
the required apps or build (https://github.com/spring-cloud-stream-app-starters)
them and then install them in your Maven repository, using whatever group,
artifact, and version you choose. If you do this, register individual apps using
dataflow:>app register… using the maven:// resource URI format
corresponding to your installed app.
This example creates a gemfire source, which will publish events on a region.
If the Geode locator isn’t running on the default port on localhost, add the
options --connect-type=locator --host-addresses=<host>:<port> . If
there are multiple locators, you can provide a comma separated list of locator
addresses. This is not necessary for the sample but is typical for production
environments to enable fail-over.
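For illustration, a sketch of a gemfire source configured with explicit locator addresses (the stream name events and the region-name property value are assumptions; the Test region matches the one created in step 3):

dataflow:>stream create --name events --definition "gemfire --region-name=Test --connect-type=locator --host-addresses=locator1:10334,locator2:10334 | log" --deploy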
dataflow:>stream list
6. Monitor stdout for the log sink. When you deploy the stream, you will see log messages in the
Data Flow server console like this
CONSOLE
2017-10-28 17:28:23.275 INFO 15603 --- [nio-9393-exec-2] o.s.c.d.spi.local.LocalAppDeplo
Logs will be in /var/folders/hd/5yqz2v2d3sxd3n879f4sg4gr0000gn/T/spring-cloud-dataflow
2017-10-28 17:28:23.277 INFO 15603 --- [nio-9393-exec-2] o.s.c.d.s.c.StreamDeploymentCon
2017-10-28 17:28:23.311 INFO 15603 --- [nio-9393-exec-2] o.s.c.d.s.c.StreamDeploymentCon
2017-10-28 17:28:23.318 INFO 15603 --- [nio-9393-exec-2] o.s.c.d.spi.local.LocalAppDeplo
Logs will be in /var/folders/hd/5yqz2v2d3sxd3n879f4sg4gr0000gn/T/spring-cloud-dataflow
Copy the location of the log sink logs. This is a directory that ends in events.log . The log
files will be in stdout_0.log under this directory. You can monitor the output of the log sink
using tail , or something similar:
CONSOLE
$tail -f /var/folders/hd/5yqz2v2d3sxd3n879f4sg4gr0000gn/T/spring-cloud-dataflow-409399206
CONSOLE
2017-10-28 17:28:52.893 INFO 18986 --- [emfire.events-1] log sink
2017-10-28 17:28:52.893 INFO 18986 --- [emfire.events-1] log sink
2017-10-28 17:28:52.893 INFO 18986 --- [emfire.events-1] log sink
2017-10-28 17:28:52.893 INFO 18986 --- [emfire.events-1] log sink
By default, the message payload contains the updated value. Depending on your application,
you may need additional information. The data comes from EntryEvent
(https://geode.apache.org/releases/latest/javadoc/org/apache/geode/cache/EntryEvent.html). You can
access any fields using the source’s cache-event-expression property. This takes a SpEL
expression bound to the EntryEvent. Try something like --cache-event-expression='{key:'+key+',new_value:'+newValue+'}' (HINT: you will need to destroy the stream and recreate it to add this property; a sketch follows the log sample below). Now you should see log messages like:
CONSOLE
2017-10-28 17:28:52.893 INFO 18986 --- [emfire.events-1] log-sink
2017-10-28 17:41:24.466 INFO 18986 --- [emfire.events-1] log-sink
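A sketch of the destroy-and-recreate exercise from the hint above (the stream name events and the region-name value are assumptions; keep whatever definition you used originally and append the new property):

dataflow:>stream destroy --name events
dataflow:>stream create --name events --definition "gemfire --region-name=Test --cache-event-expression='{key:'+key+',new_value:'+newValue+'}' | log" --deploy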
9. You’re done!
Additional Prerequisites
A Cloud Foundry instance
The Cloud Foundry Data Flow Server is a Spring Boot application available for download
(http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations/) or you can build
(https://github.com/spring-cloud/spring-cloud-dataflow-server-cloudfoundry) it yourself. If you build it
yourself, the executable jar will be in spring-cloud-dataflow-server-cloudfoundry/target
Although you can run the Data Flow Cloud Foundry Server locally and configure
it to deploy to any Cloud Foundry instance, we will deploy the server to Cloud
Foundry as recommended.
1. Verify that the CF instance is reachable (your endpoint URLs will differ from those shown here).
$ cf api
API endpoint: https://api.system.io (API version: ...)
$ cf apps
Getting apps in org [your-org] / space [your-space] as user...
OK
No apps found
2. Follow the instructions to deploy the Spring Cloud Data Flow Cloud Foundry server
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle).
Don’t worry about creating a Redis service. We won’t need it. If you are familiar with Cloud
Foundry application manifests, we recommend creating a manifest for the Data Flow
server as shown here
(https://docs.spring.io/spring-cloud-dataflow-server-
cloudfoundry/docs/current/reference/htmlsingle/#sample-manifest-template)
.
3. Once you have successfully executed cf push , verify the dataflow server is running
$ cf apps
Getting apps in org [your-org] / space [your-space] as user...
OK
4. Notice that the dataflow-server application is started and ready for interaction via the URL endpoint
5. Connect the shell with server running on Cloud Foundry, e.g., dataflow-server.app.io
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
server-unknown:>
These samples assume that the Data Flow Server can access a remote Maven
repository, repo.spring.io/libs-release by default. If your Data Flow
server is running behind a firewall, or you are using a maven proxy
preventing access to public repositories, you will need to install the sample
apps in your internal Maven repository and configure
(https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-
started-maven-configuration)
the server accordingly. The sample applications are typically registered using
Data Flow’s bulk import facility. For example, the Shell command
dataflow:>app import --uri bit.ly/Celsius-SR1-stream-
applications-rabbit-maven (The actual URI is release and binder specific so
refer to the sample instructions for the actual URL). The bulk import URI
references a plain text file containing entries for all of the publicly available
Spring Cloud Stream and Task applications published to repo.spring.io .
For example,
source.http=maven://org.springframework.cloud.stream.app:http-
source-rabbit:1.3.1.RELEASE registers the http source app at the
corresponding Maven address, relative to the remote repository(ies)
configured for the Data Flow server. The format is maven://<groupId>:<artifactId>:<version>. You will need to download
(https://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-
stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-
Bacon.RELEASE.rabbit-apps-maven-repo-url.properties)
the required apps or build (https://github.com/spring-cloud-stream-app-starters)
them and then install them in your Maven repository, using whatever group,
artifact, and version you choose. If you do this, register individual apps using
dataflow:>app register… using the maven:// resource URI format
corresponding to your installed app.
{
    "locators": [
        "10.0.16.9[55221]",
        "10.0.16.11[55221]",
        "10.0.16.10[55221]"
    ],
    "urls": {
        "gfsh": "http://...",
        "pulse": "http://.../pulse"
    },
    "users": [
        {
            "password": <password>,
            "username": "cluster_operator"
        },
        {
            "password": <password>,
            "username": "developer"
        }
    ]
}
3. Using gfsh , connect to the PCC instance as cluster_operator using the service key values
and create the Test region.
4. Create the stream, connecting to the PCC instance as developer. This example creates a gemfire source, which will publish events on a region.
dataflow:>stream list
cf logs <log-sink-app-name>
By default, the message payload contains the updated value. Depending on your application,
you may need additional information. The data comes from EntryEvent
(https://geode.apache.org/releases/latest/javadoc/org/apache/geode/cache/EntryEvent.html). You can
access any fields using the source’s cache-event-expression property. This takes a SpEL
expression bound to the EntryEvent. Try something like --cache-event-
expression='{key:'+key+',new_value:'+newValue+'}' (HINT: You will need to destroy
the stream and recreate it to add this property, an exercise left to the reader). Now you should
see log messages like:
CONSOLE
2017-10-28 17:28:52.893 INFO 18986 --- [emfire.events-1] log-sink
2017-10-28 17:41:24.466 INFO 18986 --- [emfire.events-1] log-sink
9. You’re done!
3.5.4. Summary
In this sample, you have learned:
How to use Spring Cloud Data Flow’s Local and Cloud Foundry servers
How to create a streaming data pipeline to connect to and publish events from a gemfire region
The Spring Cloud Data Flow Shell and Local server implementation are in the
same repository and are both built by running ./mvnw install from the project
root directory. If you have already run the build, use the jar in spring-cloud-
dataflow-shell/target
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>
The Spring Cloud Data Flow Shell is a Spring Boot application that connects to the
Data Flow Server’s REST API and supports a DSL that simplifies the process of
defining a stream or task and managing its lifecycle. Most of these samples use
the shell. If you prefer, you can use the Data Flow UI at localhost:9393/dashboard (or wherever the server is hosted) to perform equivalent operations.
The Local Data Flow Server is a Spring Boot application available for download
(http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations) or you can build
(https://github.com/spring-cloud/spring-cloud-dataflow) it yourself. If you build it yourself, the
executable jar will be in spring-cloud-dataflow-server-local/target
To run the Local Data Flow server, open a new terminal session:
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-LOCAL-JAR>
$ java -jar spring-cloud-dataflow-server-local-<VERSION>.jar
A Java IDE
Choose a message transport binding as a dependency for the custom app. There are options for choosing RabbitMQ or Kafka as the message transport. For this demo, we will use rabbit . Type rabbit in the search bar under Search for dependencies and select Stream Rabbit .
Hit the Generate Project button and open the new project in an IDE of your choice.
We can now create our custom app. Our Spring Cloud Stream application is a Spring Boot
application that runs as an executable jar. The application will include two Java classes:
We are creating a transformer that takes a Fahrenheit input and converts it to Celsius.
Following the same naming convention as the application file, create a new Java class in
the same package called CelsiusConverterProcessorConfiguration.java .
CelsiusConverterProcessorConfiguration.java
@EnableBinding(Processor.class)
public class CelsiusConverterProcessorConfiguration {

    @Transformer(inputChannel = Processor.INPUT, outputChannel = Processor.OUTPUT)
    public int convertToCelsius(String payload) {
        // Convert the incoming Fahrenheit value to Celsius
        // (the method body is a minimal sketch of the conversion described
        // below; see the sample source for the exact implementation).
        return (Integer.parseInt(payload) - 32) * 5 / 9;
    }
}
Here we introduced two important Spring annotations. First we annotated the class with
@EnableBinding(Processor.class) . Second we created a method and annotated it with
@Transformer(inputChannel = Processor.INPUT, outputChannel =
Processor.OUTPUT) . By adding these two annotations we have configured this stream app
as a Processor (as opposed to a Source or a Sink ). This means that the application
receives input from an upstream application via the Processor.INPUT channel and sends
its output to a downstream application via the Processor.OUTPUT channel.
The convertToCelsius method takes a String as input for Fahrenheit and then returns
the converted Celsius as an integer. This method is very simple, but that is also the beauty
of this programming style. We can add as much logic as we want to this method to enrich
this processor. As long as we annotate it properly and return valid output, it works as a
Spring Cloud Stream Processor. Also note that it is straightforward to unit test this code.
$ cd <PROJECT_DIR>
$ ./mvnw clean package
If all goes well, we should have a running standalone Spring Boot Application. Once we verify
that the app is started and running without any errors, we can stop it.
These samples assume that the Data Flow Server can access a remote Maven
repository, repo.spring.io/libs-release by default. If your Data Flow
server is running behind a firewall, or you are using a maven proxy
preventing access to public repositories, you will need to install the sample
apps in your internal Maven repository and configure
(https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-
started-maven-configuration)
the server accordingly. The sample applications are typically registered using
Data Flow’s bulk import facility. For example, the Shell command
dataflow:>app import --uri bit.ly/Celsius-SR1-stream-
applications-rabbit-maven (The actual URI is release and binder specific so
refer to the sample instructions for the actual URL). The bulk import URI
references a plain text file containing entries for all of the publicly available
Spring Cloud Stream and Task applications published to repo.spring.io .
For example,
source.http=maven://org.springframework.cloud.stream.app:http-
source-rabbit:1.3.1.RELEASE registers the http source app at the
corresponding Maven address, relative to the remote repository(ies)
configured for the Data Flow server. The format is maven://<groupId>:<artifactId>:<version>. You will need to download
(https://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-
stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-
Bacon.RELEASE.rabbit-apps-maven-repo-url.properties)
the required apps or build (https://github.com/spring-cloud-stream-app-starters)
them and then install them in your Maven repository, using whatever group,
artifact, and version you choose. If you do this, register individual apps using
dataflow:>app register… using the maven:// resource URI format
corresponding to your installed app.
app register --type processor --name convertToCelsius --uri <File URL of the jar file
on the local filesystem where you built the project above> --force
We will create a stream that uses the out-of-the-box http source and log sink and our custom transformer, for example:
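dataflow:>stream create --name convertToCelsiusStream --definition "http --server.port=9090 | convertToCelsius | log" --deploy

(The stream name matches the log file referenced in step 7, and port 9090 matches the port used in step 6.)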
dataflow:>stream list
dataflow:>runtime apps
CONSOLE
2016-09-27 10:03:11.988 INFO 95234 --- [nio-9393-exec-9] o.s.c.d.spi.local.LocalAppDeplo
Logs will be in /var/folders/2q/krqwcbhj2d58csmthyq_n1nw0000gp/T/spring-cloud-dataflow
2016-09-27 10:03:12.397 INFO 95234 --- [nio-9393-exec-9] o.s.c.d.spi.local.LocalAppDeplo
Logs will be in /var/folders/2q/krqwcbhj2d58csmthyq_n1nw0000gp/T/spring-cloud-dataflow
2016-09-27 10:03:14.445 INFO 95234 --- [nio-9393-exec-9] o.s.c.d.spi.local.LocalAppDeplo
Logs will be in /var/folders/2q/krqwcbhj2d58csmthyq_n1nw0000gp/T/spring-cloud-dataflow
6. Post sample data to the http endpoint: localhost:9090 ( 9090 is the port we specified
for the http source in this case)
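For example, using the shell's http post command to send a Fahrenheit value (76 is an arbitrary sample value):

dataflow:>http post --target http://localhost:9090 --data 76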
7. Open the log file for the convertToCelsiusStream.log app to see the output of our stream
CONSOLE
tail -f /var/folders/2q/krqwcbhj2d58csmthyq_n1nw0000gp/T/spring-cloud-dataflow-7563139704
3.6.4. Summary
In this sample, you have learned:
4. Task / Batch
4.1. Batch Job on Cloud Foundry
In this demonstration, you will learn how to orchestrate short-lived data processing applications (e.g., Spring Batch jobs) using Spring Cloud Task (http://cloud.spring.io/spring-cloud-task/) and Spring
Cloud Data Flow (http://cloud.spring.io/spring-cloud-dataflow/) on Cloud Foundry.
4.1.1. Prerequisites
Local PCFDev (https://pivotal.io/pcf-dev) instance
The Spring Cloud Data Flow Shell and Local server implementation are in the
same repository and are both built by running ./mvnw install from the project
root directory. If you have already run the build, use the jar in spring-cloud-
dataflow-shell/target
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>
The Spring Cloud Data Flow Shell is a Spring Boot application that connects to the
Data Flow Server’s REST API and supports a DSL that simplifies the process of
defining a stream or task and managing its lifecycle. Most of these samples use
the shell. If you prefer, you can use the Data Flow UI at localhost:9393/dashboard (or wherever the server is hosted) to perform equivalent operations.
The Spring Cloud Data Flow Cloud Foundry Server running in PCFDev
The Cloud Foundry Data Flow Server is a Spring Boot application available for download
(http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations/) or you can build
(https://github.com/spring-cloud/spring-cloud-dataflow-server-cloudfoundry) it yourself. If you build it
yourself, the executable jar will be in spring-cloud-dataflow-server-cloudfoundry/target
Although you can run the Data Flow Cloud Foundry Server locally and configure
it to deploy to any Cloud Foundry instance, we will deploy the server to Cloud
Foundry as recommended.
1. Verify that the CF instance is reachable (your endpoint URLs will differ from those shown here).
$ cf api
API endpoint: https://api.system.io (API version: ...)
$ cf apps
Getting apps in org [your-org] / space [your-space] as user...
OK
No apps found
2. Follow the instructions to deploy the Spring Cloud Data Flow Cloud Foundry server
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle).
Don’t worry about creating a Redis service. We won’t need it. If you are familiar with Cloud
Foundry application manifests, we recommend creating a manifest for the Data Flow
server as shown here
(https://docs.spring.io/spring-cloud-dataflow-server-
cloudfoundry/docs/current/reference/htmlsingle/#sample-manifest-template)
.
3. Once you have successfully executed cf push , verify the dataflow server is running
$ cf apps
Getting apps in org [your-org] / space [your-space] as user...
OK
4. Notice that the dataflow-server application is started and ready for interaction via the URL endpoint
5. Connect the shell with server running on Cloud Foundry, e.g., dataflow-server.app.io
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
server-unknown:>
PCF 1.7.12 or greater is required to run Tasks on Spring Cloud Data Flow. As of this writing, PCFDev and PWS build upon this version.
1. Task support needs to be enabled on pcf-dev. While logged in as admin , issue the following command:
cf enable-feature-flag task_creation
Setting status of task_creation as admin...
OK
For this sample, all you need is the mysql service, and in PCFDev the mysql service comes with a different plan. From the CF CLI, create the service and bind it to dataflow-server as follows:
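$ cf create-service p-mysql 512mb mysql
$ cf bind-service dataflow-server mysql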
All the apps deployed to PCFDev start with low memory by default. It is
recommended to change it to at least 768MB for dataflow-server . Ditto for
every app spawned by Spring Cloud Data Flow. Change the memory by: cf
2. Tasks in Spring Cloud Data Flow require an RDBMS to host the "task repository" (see here
(http://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#spring-cloud-dataflow-task-repository)
for more details), so let's instruct the Spring Cloud Data Flow server to bind the mysql service to each deployed task:
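A minimal sketch of one way to do this, using cf set-env (the resulting setting appears in the environment recap below; restage the server for it to take effect):

$ cf set-env dataflow-server SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_TASK_SERVICES mysql
$ cf restage dataflow-server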
3. As a recap, here is what you should see as configuration for the Spring Cloud Data Flow
server:
cf env dataflow-server
....
User-Provided:
SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_DOMAIN: local.pcfdev.io
SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_MEMORY: 512
SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_ORG: pcfdev-org
SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_PASSWORD: pass
SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_SKIP_SSL_VALIDATION: false
SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_SPACE: pcfdev-space
SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_TASK_SERVICES: mysql
SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_URL: https://api.local.pcfdev.io
SPRING_CLOUD_DEPLOYER_CLOUDFOUNDRY_USERNAME: user
4. Notice that the dataflow-server application is started and ready for interaction via the dataflow-server.local.pcfdev.io endpoint
Unlike Streams, the Task definitions don’t require explicit deployment. They
can be launched on-demand, scheduled, or triggered by streams.
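For reference, a task named foo (the name used in the cf logs step below) might be created and launched like this, assuming the sample batch-job application has already been registered under that name:

dataflow:>task create foo --definition "batch-job"
dataflow:>task launch foo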
7. Verify that there are still no Task applications running on PCFDev - they are listed only after the initial launch/staging attempt on PCF
$ cf apps
Getting apps in org pcfdev-org / space pcfdev-space as user...
OK
CONSOLE
$ cf logs foo
Retrieving logs for app foo in org pcfdev-org / space pcfdev-space as user...
...
...
...
...
...
...
...
...
The task applications are launched independently and return with the status COMPLETED .
Unlike LRPs in Cloud Foundry, tasks are short-lived, so the logs aren’t always available. They are generated only when the Task application runs; at the end of the Task operation, the container that ran the Task application is destroyed to free up resources.
$ cf apps
Getting apps in org pcfdev-org / space pcfdev-space as user...
OK
4.1.3. Summary
In this sample, you have learned:
How to register and orchestrate Spring Batch jobs in Spring Cloud Data Flow
How to use the cf CLI in the context of Task applications orchestrated by Spring Cloud Data
Flow
In this demonstration, you will learn how to create a data processing application using Spring
Batch (http://projects.spring.io/spring-batch/) which will then be run within Spring Cloud Data Flow
(http://cloud.spring.io/spring-cloud-dataflow/).
4.2.1. Prerequisites
A Running Data Flow Server
The Local Data Flow Server is a Spring Boot application available for download
(http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations) or you can build
(https://github.com/spring-cloud/spring-cloud-dataflow) it yourself. If you build it yourself, the
executable jar will be in spring-cloud-dataflow-server-local/target
To run the Local Data Flow server, open a new terminal session:
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-LOCAL-JAR>
$ java -jar spring-cloud-dataflow-server-local-<VERSION>.jar
The Spring Cloud Data Flow Shell and Local server implementation are in the
same repository and are both built by running ./mvnw install from the project
root directory. If you have already run the build, use the jar in spring-cloud-
dataflow-shell/target
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>
The Spring Cloud Data Flow Shell is a Spring Boot application that connects to the
Data Flow Server’s REST API and supports a DSL that simplifies the process of
defining a stream or task and managing its lifecycle. Most of these samples use
the shell. If you prefer, you can use the Data Flow UI at localhost:9393/dashboard (or wherever the server is hosted) to perform equivalent operations.
BatchConfiguration.java - this is where we define our batch job and the step and components that are used to read, process, and write our data. In the sample we use a FlatFileItemReader which reads a delimited file, a custom PersonItemProcessor to transform the data, and a JdbcBatchItemWriter to write our data to a database.
Person.java - the domain object representing the data we are reading and processing in our batch job. The sample data contains records made up of a person's first and last name.
PersonItemProcessor.java - here we simply transform the first and last name of each Person to uppercase characters.
Application.java - the main entry point into the Spring Boot application which is used to
launch the batch job
Resource files are included to set up the database and provide sample data:
schema-all.sql - this is the database schema that will be created when the application starts
up. In this sample, an in-memory database is created on start up and destroyed when the
application exits.
data.csv - sample data file containing person records used in the demo
This example expects to use the Spring Cloud Data Flow Server’s embedded H2
database. If you wish to use another repository, be sure to add the correct
dependencies to the pom.xml and update the schema-all.sql.
5. Inspect logs
The log file path for the launched task can be found in the local server output, for example:
CONSOLE
2017-10-27 14:58:18.112 INFO 19485 --- [nio-9393-exec-6] o.s.c.d.spi.local.LocalTaskLaun
Logs will be in /var/folders/6x/tgtx9xbn0x16xq2sx1j2rld80000gn/T/spring-cloud-dataflow
4.2.4. Summary
In this sample, you have learned:
How to register and orchestrate Spring Batch jobs in Spring Cloud Data Flow
The source for the demo project is located in the batch/file-ingest directory at the top-level
of this repository.
5.1.1. Prerequisites
A Running Data Flow Shell
The Spring Cloud Data Flow Shell and Local server implementation are in the
same repository and are both built by running ./mvnw install from the project
root directory. If you have already run the build, use the jar in spring-cloud-
dataflow-shell/target
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>
The Spring Cloud Data Flow Shell is a Spring Boot application that connects to the
Data Flow Server’s REST API and supports a DSL that simplifies the process of
defining a stream or task and managing its lifecycle. Most of these samples use
the shell. If you prefer, you can use the Data Flow UI at localhost:9393/dashboard (or wherever the server is hosted) to perform equivalent operations.
Additional Prerequisites
A running local Data Flow Server
The Local Data Flow Server is a Spring Boot application available for download
(http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations) or you can build
(https://github.com/spring-cloud/spring-cloud-dataflow) it yourself. If you build it yourself, the
executable jar will be in spring-cloud-dataflow-server-local/target
To run the Local Data Flow server, open a new terminal session:
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-LOCAL-JAR>
$ java -jar spring-cloud-dataflow-server-local-<VERSION>.jar
To simplify the dependencies and configuration in this example, we will use our local machine as the SFTP server.
$ cd batch/file-ingest
$ mvn clean package
For convenience, you can skip this step. The jar is published to the Spring
Maven repository
(https://repo.spring.io/libs-snapshot-
local/io/spring/cloud/dataflow/ingest/ingest/1.0.0.BUILD-SNAPSHOT/)
Now we create a remote directory on the SFTP server and a local directory where the batch job expects to find files, as shown in the example after the note below.
If you are using a remote SFTP server, create the remote directory on the
SFTP server. Since we are using the local machine as the SFTP server, we will
create both the local and remote directories on the local machine.
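For example, using the paths this sample expects:

$ mkdir -p /tmp/remote-files /tmp/local-files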
With our Spring Cloud Data Flow server running, we register the sftp-dataflow source and
task-launcher-dataflow sink. The sftp-dataflow source application will do the work of
polling the remote directory for new files and downloading them to the local directory. As
each file is received, it emits a message for the task-launcher-dataflow sink to launch the
task to process the data from that file.
CONSOLE
dataflow:>app register --name sftp --type source --uri maven://org.springframework.cloud
Successfully registered application 'source:sftp'
dataflow:>app register --name task-launcher --type sink --uri maven://org.springframework
Successfully registered application 'sink:task-launcher'
4. Register and create the file ingest task. If you’re using the published jar, set --uri
maven://io.spring.cloud.dataflow.ingest:ingest:1.0.0.BUILD-SNAPSHOT :
CONSOLE
dataflow:>app register --name fileIngest --type task --uri file:///path/to/target/ingest-
Successfully registered application 'task:fileIngest'
dataflow:>task create fileIngestTask --definition fileIngest
Created new task 'fileIngestTask'
Now let's create and deploy the stream. Once deployed, the stream will start polling the SFTP server and, when new files arrive, launch the batch job.
If you are using a remote SFTP server, specify the host using the --host , and optionally --port , parameters. If not defined, host defaults to 127.0.0.1 and port defaults to 22 .
CONSOLE
dataflow:>stream create --name inboundSftp --definition "sftp --username=<user> --passwor
Created new stream 'inboundSftp'
Deployment request has been sent
We can see the status of the streams to be deployed with stream list , for example:
CONSOLE
dataflow:>stream list
╔═══════════╤════════════════════════════════════════════════════════════════════════════════════
║Stream Name│ Stream Definition
╠═══════════╪════════════════════════════════════════════════════════════════════════════════════
║inboundSftp│sftp --password='******' --remote-dir=/tmp/remote-files/ --local-dir=/tmp/lo
║ │--allow-unknown-keys=true --username=<user> | task-launcher
╚═══════════╧════════════════════════════════════════════════════════════════════════════════════
7. Inspect logs
In the event the stream failed to deploy, or you would like to inspect the logs for any reason, you can get the location of the logs for the applications created for the inboundSftp stream using the runtime apps command:
CONSOLE
dataflow:>runtime apps
╔═══════════════════════════╤═══════════╤════════════════════════════════════════════════════════
║ App Id / Instance Id │Unit Status│
╠═══════════════════════════╪═══════════╪════════════════════════════════════════════════════════
║inboundSftp.sftp │ deployed │
╟┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┼┈┈┈┈┈┈┈┈┈┈┈┼┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈
║ │ │ guid = 23057
║ │ │ pid = 71927
║ │ │ port = 23057
║inboundSftp.sftp-0 │ deployed │ stderr = /var/folders/hd/5yqz2v2d3sxd3n879f
║ │ │ stdout = /var/folders/hd/5yqz2v2d3sxd3n879f
║ │ │ url = http://192.168.64.1:23057
║ │ │working.dir = /var/folders/hd/5yqz2v2d3sxd3n879f
╟───────────────────────────┼───────────┼────────────────────────────────────────────────────────
║inboundSftp.task-launcher │ deployed │
╟┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┼┈┈┈┈┈┈┈┈┈┈┈┼┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈┈
║ │ │ guid = 60081
║ │ │ pid = 71926
║ │ │ port = 60081
║inboundSftp.task-launcher-0│ deployed │ stderr = /var/folders/hd/5yqz2v2d3sxd3n879f
║ │ │ stdout = /var/folders/hd/5yqz2v2d3sxd3n879f
║ │ │ url = http://192.168.64.1:60081
║ │ │working.dir = /var/folders/hd/5yqz2v2d3sxd3n879f
╚═══════════════════════════╧═══════════╧════════════════════════════════════════════════════════
8. Add data
Normally data would be uploaded to an SFTP server. We will simulate this by copying a file
into the directory specified by --remote-dir . Sample data can be found in the data/
directory of the Batch File Ingest project.
Copy data/name-list.csv into the /tmp/remote-files directory which the SFTP source is
monitoring. When this file is detected, the sftp source will download it to the /tmp/local-
files directory specified by --local-dir , and emit a Task Launch Request. The Task
Launch Request includes the name of the task to launch along with the local file path, given as
the command line argument localFilePath . Spring Batch binds each command line
argument to a corresponding JobParameter. The FileIngestTask job processes the file given by
the JobParameter named localFilePath . The task-launcher sink polls for messages using
an exponential back-off. Since there have not been any recent requests, the task will launch
within 30 seconds after the request is published.
$ cp data/name-list.csv /tmp/remote-files
When the batch job launches, you will see something like this in the SCDF console log:
CONSOLE
2018-10-26 16:47:24.879 INFO 86034 --- [nio-9393-exec-7] o.s.c.d.spi.local.LocalTaskLaun
2018-10-26 16:47:25.100 INFO 86034 --- [nio-9393-exec-7] o.s.c.d.spi.local.LocalTaskLaun
Logs will be in /var/folders/hd/5yqz2v2d3sxd3n879f4sg4gr0000gn/T/fileIngestTask3100511
After data is received and the batch job runs, it will be recorded as a Job Execution. We can view job executions by, for example, issuing the following command in the Spring Cloud Data Flow shell:
CONSOLE
dataflow:>job execution list
╔═══╤═══════╤═════════╤════════════════════════════╤═════════════════════╤══════════════════╗
║ID │Task ID│Job Name │ Start Time │Step Execution Count │Definition Stat
╠═══╪═══════╪═════════╪════════════════════════════╪═════════════════════╪══════════════════╣
║1 │1 │ingestJob│Tue May 01 23:34:05 EDT 2018│1 │Created
╚═══╧═══════╧═════════╧════════════════════════════╧═════════════════════╧══════════════════╝
CONSOLE
dataflow:>job execution display --id 1
╔═══════════════════════════════════════╤══════════════════════════════╗
║ Key │ Value ║
╠═══════════════════════════════════════╪══════════════════════════════╣
║Job Execution Id │1 ║
║Task Execution Id │1 ║
║Task Instance Id │1 ║
║Job Name │ingestJob ║
║Create Time │Fri Oct 26 16:57:51 EDT 2018 ║
║Start Time │Fri Oct 26 16:57:51 EDT 2018 ║
║End Time │Fri Oct 26 16:57:53 EDT 2018 ║
║Running │false ║
║Stopping │false ║
║Step Execution Count │1 ║
║Execution Status │COMPLETED ║
║Exit Status │COMPLETED ║
║Exit Message │ ║
║Definition Status │Created ║
║Job Parameters │ ║
║-spring.cloud.task.executionid(STRING) │1 ║
║run.id(LONG) │1 ║
║localFilePath(STRING) │/tmp/local-files/name-list.csv║
╚═══════════════════════════════════════╧══════════════════════════════╝
When the batch job runs, it processes the file in the local directory /tmp/local-files , transforms each item to an uppercase name, and inserts it into the database.
You may use any database tool that supports the H2 database to inspect the data. In this example we use the database tool DBeaver. Let's inspect the table to ensure our data was processed correctly.
Within DBeaver, create a connection to the database using the JDBC URL
jdbc:h2:tcp://localhost:19092/mem:dataflow , and user sa with no password. When
connected, expand the PUBLIC schema, then expand Tables and then double click on the
table PEOPLE . When the table data loads, click the "Data" tab to view the data.
Additional Prerequisites
Running this demo in Cloud Foundry requires a shared file system that is
accessed by apps running in different containers. This feature is provided by NFS
Volume Services
(https://docs.pivotal.io/pivotalcf/2-3/devguide/services/using-vol-services.html). To use
Volume Services with SCDF, it is required that we provide nfs configuration via
cf create-service rather than cf bind-service . Cloud Foundry introduced
the cf create-service configuration option for Volume Services in version 2.3.
For this example, we use an NFS host configured to allow read-write access
(https://www.tldp.org/HOWTO/NFS-HOWTO/server.html) to the Cloud Foundry instance.
Create the nfs service instance using a command like the one below, where share specifies the NFS host and shared directory ( /export ), uid and gid specify an account that has read-write access to the shared directory, and mount is the container’s mount path for each application bound to nfs .
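A sketch of such a command (the plan name Existing and the /var/scdf mount path are assumptions based on this sample's stream definition; adjust share, uid, and gid for your NFS host):

$ cf create-service nfs Existing nfs -c '{"share":"<nfs_host>/export","uid":"<uid>","gid":"<gid>","mount":"/var/scdf"}'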
The Cloud Foundry Data Flow Server is a Spring Boot application available for download
(http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations/) or you can build
(https://github.com/spring-cloud/spring-cloud-dataflow-server-cloudfoundry) it yourself. If you build it
yourself, the executable jar will be in spring-cloud-dataflow-server-cloudfoundry/target
Although you can run the Data Flow Cloud Foundry Server locally and configure
it to deploy to any Cloud Foundry instance, we will deploy the server to Cloud
Foundry as recommended.
1. Verify that the CF instance is reachable (your endpoint URLs will differ from those shown here).
$ cf api
API endpoint: https://api.system.io (API version: ...)
$ cf apps
Getting apps in org [your-org] / space [your-space] as user...
OK
No apps found
2. Follow the instructions to deploy the Spring Cloud Data Flow Cloud Foundry server
(https://docs.spring.io/spring-cloud-dataflow-server-cloudfoundry/docs/current/reference/htmlsingle).
Don’t worry about creating a Redis service. We won’t need it. If you are familiar with Cloud
Foundry application manifests, we recommend creating a manifest for the Data Flow
server as shown here
(https://docs.spring.io/spring-cloud-dataflow-server-
cloudfoundry/docs/current/reference/htmlsingle/#sample-manifest-template)
.
3. Once you have successfully executed cf push , verify the dataflow server is running
$ cf apps
Getting apps in org [your-org] / space [your-space] as user...
OK
4. Notice that the dataflow-server application is started and ready for interaction via the URL endpoint
5. Connect the shell with server running on Cloud Foundry, e.g., dataflow-server.app.io
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
server-unknown:>
Normally, for security and operational efficiency, we may want finer-grained control of which apps bind to the nfs service. One way to do this is to set deployment properties when creating and deploying the stream, as shown below.
Create a directory on the SFTP server where the sftp source will detect files and download
them for processing. This path must exist prior to running the demo and can be any location
that is accessible by the configured SFTP user. On the SFTP server create a directory called
remote-files , for example:
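$ mkdir remote-files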
Create a directory on the NFS server that is accessible to the user, specified by uid and gid ,
used to create the nfs service:
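$ sudo mkdir -p /export/shared-files
$ sudo chown <uid>:<gid> /export/shared-files

Here /export/shared-files is a sketch that lines up with the container path /var/scdf/shared-files used by the stream definition below, given the /var/scdf mount configured for the nfs service.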
With our Spring Cloud Data Flow server running, we register the sftp-dataflow source and
task-launcher-dataflow sink. The sftp-dataflow source application will do the work of
polling the remote directory for new files and downloading them to the local directory. As
each file is received, it emits a message for the task-launcher-dataflow sink to launch the
task to process the data from that file.
CONSOLE
dataflow:>app register --name sftp --type source --uri maven://org.springframework.cloud
Successfully registered application 'source:sftp'
dataflow:>app register --name task-launcher --type sink --uri maven://org.springframework
Successfully registered application 'sink:task-launcher'
Now let's create and deploy the stream. Once deployed, the stream will start polling the SFTP server and, when new files arrive, launch the batch job.
Replace <user> , <pass> , and <host> below. The <host> is the SFTP server host; the <user> and <pass> values are the credentials for the remote user. Additionally, replace --spring.cloud.dataflow.client.server-uri=http://<dataflow-server-route> with the URL of your dataflow server, as shown by cf apps . If you have security enabled for the SCDF server, set the appropriate spring.cloud.dataflow.client options.
CONSOLE
dataflow:> app info --name task-launcher --type sink
╔══════════════════════════════╤══════════════════════════════╤══════════════════════════════╤═══
║ Option Name │ Description │ Default
╠══════════════════════════════╪══════════════════════════════╪══════════════════════════════╪═══
║spring.cloud.dataflow.client.a│The login username. │<none>
║uthentication.basic.username │ │
║spring.cloud.dataflow.client.a│The login password. │<none>
║uthentication.basic.password │ │
║trigger.max-period │The maximum polling period in │30000
║ │milliseconds. Will be set to │
║ │period if period > maxPeriod. │
║trigger.period │The polling period in │1000
║ │milliseconds. │
║trigger.initial-delay │The initial delay in │1000
║ │milliseconds. │
║spring.cloud.dataflow.client.s│Skip Ssl validation. │true
║kip-ssl-validation │ │
║spring.cloud.dataflow.client.e│Enable Data Flow DSL access. │false
║nable-dsl │ │
║spring.cloud.dataflow.client.s│The Data Flow server URI. │http://localhost:9393
║erver-uri │ │
╚══════════════════════════════╧══════════════════════════════╧══════════════════════════════╧═══
Since we configured the SCDF server to bind all stream and task apps to the nfs service, no
deployment parameters are required.
CONSOLE
dataflow:>stream create inboundSftp --definition "sftp --username=<user> --password=<pass
Created new stream 'inboundSftp'
dataflow:>stream deploy inboundSftp
Deployment request has been sent for stream 'inboundSftp'
Alternatively, we can bind the nfs service to the fileIngestTask by passing deployment
properties to the task via the task launch request in the stream definition: --
task.launch.request.deployment-
properties=deployer.*.cloudfoundry.services=nfs
CONSOLE
dataflow:>stream deploy inboundSftp --properties "deployer.sftp.cloudfoundry.services=nfs
The status of the stream to be deployed can be queried with stream list , for example:
CONSOLE
dataflow:>stream list
╔═══════════╤════════════════════════════════════════════════════════════════════════════════════
║Stream Name│ Stream Defi
╠═══════════╪════════════════════════════════════════════════════════════════════════════════════
║inboundSftp│sftp --task.launch.request.deployment-properties='deployer.*.cloudfoundry.se
║ │--remote-dir=remote-files --local-dir=/var/scdf/shared-files/ --task.launch.
║ │--username=<user> | task-launcher --spring.cloud.dataflow.client.server-uri=
╚═══════════╧════════════════════════════════════════════════════════════════════════════════════
7. Inspect logs
In the event the stream failed to deploy, or you would like to inspect the logs for any reason,
the logs can be obtained from individual applications. First list the deployed apps:
CONSOLE
$ cf apps
Getting apps in org cf_org / space cf_space as cf_user...
OK
In this example, the logs for the SFTP application can be viewed by:
CONSOLE
cf logs dataflow-server-N5RYLDj-inboundSftp-sftp --recent
The log files of this application would be useful to debug issues such as SFTP connection
failures.
Additionally, the logs for the task-launcher application can be viewed by:
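CONSOLE
cf logs dataflow-server-N5RYLDj-inboundSftp-task-launcher --recent
(The app name here follows the same naming pattern as the sftp app above; substitute the name shown by cf apps in your space.)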
8. Add data
Sample data can be found in the data/ directory of the Batch File Ingest project. Connect to the SFTP server and upload data/name-list.csv into the remote-files directory which the SFTP source is monitoring. When this file is detected, the sftp source will download it to the /var/scdf/shared-files directory specified by --local-dir , and emit a Task Launch Request. The Task Launch Request includes the name of the task to launch along with the local file path, given as a command line argument. Spring Batch binds each command line argument to a corresponding JobParameter. The FileIngestTask job processes the file given by the JobParameter named localFilePath . The task-launcher sink polls for messages using an exponential back-off. Since there have not been any recent requests, the task will launch within 30 seconds after the request is published.
After data is received and the batch job runs, it will be recorded as a Job Execution. We can view job executions by, for example, issuing the following command in the Spring Cloud Data Flow shell:
CONSOLE
dataflow:>job execution list
╔═══╤═══════╤═════════╤════════════════════════════╤═════════════════════╤══════════════════╗
║ID │Task ID│Job Name │ Start Time │Step Execution Count │Definition Stat
╠═══╪═══════╪═════════╪════════════════════════════╪═════════════════════╪══════════════════╣
║1 │1 │ingestJob│Thu Jun 07 13:46:42 EDT 2018│1 │Created
╚═══╧═══════╧═════════╧════════════════════════════╧═════════════════════╧══════════════════╝
CONSOLE
dataflow:>job execution display --id 1
╔═══════════════════════════════════════════╤════════════════════════════════════╗
║ Key │ Value ║
╠═══════════════════════════════════════════╪════════════════════════════════════╣
║Job Execution Id │1 ║
║Task Execution Id │1 ║
║Task Instance Id │1 ║
║Job Name │ingestJob ║
║Create Time │Wed Oct 31 03:17:34 EDT 2018 ║
║Start Time │Wed Oct 31 03:17:34 EDT 2018 ║
║End Time │Wed Oct 31 03:17:34 EDT 2018 ║
║Running │false ║
║Stopping │false ║
║Step Execution Count │1 ║
║Execution Status │COMPLETED ║
║Exit Status │COMPLETED ║
║Exit Message │ ║
║Definition Status │Created ║
║Job Parameters │ ║
║-spring.cloud.task.executionid(STRING) │1 ║
║run.id(LONG) │1 ║
║localFilePath(STRING) │/var/scdf/shared-files/name_list.csv║
╚═══════════════════════════════════════════╧════════════════════════════════════╝
When the batch job runs, it processes the file in the local directory /var/scdf/shared-files , transforms each name to uppercase, and inserts it into the database.
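A minimal sketch of that transformation step (the Person type and its accessors are stand-ins for the sample's actual domain class):
JAVA
import org.springframework.batch.item.ItemProcessor;

public class PersonItemProcessor implements ItemProcessor<Person, Person> {

    @Override
    public Person process(Person person) {
        // Uppercase both name fields before the writer inserts the row.
        return new Person(person.getFirstName().toUpperCase(),
                          person.getLastName().toUpperCase());
    }
}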
Spring Cloud Data Flow 1.7 introduced new features to manage concurrently running tasks, including a new configuration parameter, spring.cloud.dataflow.task.maximum-concurrent-tasks , to limit the number of concurrently executing tasks.
If there are no requests in the input queue, you will see something like:
CONSOLE
07:42:51.760 INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : No task launch request rec
07:42:53.768 INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : No task launch request rec
07:42:57.780 INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : No task launch request rec
07:43:05.791 INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : No task launch request rec
07:43:21.801 INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : No task launch request rec
07:43:51.811 INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : No task launch request rec
07:44:21.824 INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : No task launch request rec
07:44:51.834 INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : No task launch request rec
The first three messages show the exponential backoff at start up or after processing the final request. The last three messages show the task launcher in a steady state of polling for messages every 30 seconds. Of course, these values are configurable.
The task launcher sink polls the input destination. The polling period adjusts according to the
presence of task launch requests and also to the number of currently running tasks reported
via the Data Flow server’s tasks/executions/current REST endpoint. The sink queries this
endpoint and will pause polling the input for new requests if the number of concurrent tasks
is at its limit. This introduces a 1-30 second lag between the creation of the task launch
request and the execution of the request, sacrificing some performance for resilience. Task
launch requests will never be sent to a dead letter queue because the server is busy or
unavailable. The exponential backoff also prevents the app from querying the server
excessively when there are no task launch requests.
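An illustrative sketch of the back-off behavior described above (not the actual LaunchRequestConsumer source; the names are invented for clarity): double the idle period, cap it at the steady-state maximum, and reset to the minimum whenever a request is processed.
JAVA
import java.util.function.BooleanSupplier;

public class BackoffPoller {

    private static final long MIN_PERIOD_MS = 1_000;   // reset value after activity
    private static final long MAX_PERIOD_MS = 30_000;  // steady-state polling period

    public void run(BooleanSupplier pollAndLaunch) throws InterruptedException {
        long period = MIN_PERIOD_MS;
        while (true) {
            // pollAndLaunch returns true when a task launch request was handled
            if (pollAndLaunch.getAsBoolean()) {
                period = MIN_PERIOD_MS;                       // reset on activity
            } else {
                period = Math.min(period * 2, MAX_PERIOD_MS); // back off when idle
            }
            Thread.sleep(period);
        }
    }
}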
CONSOLE
$ watch curl <dataflow-server-url>/tasks/executions/current
Every 2.0s: curl http://localhost:9393/tasks/executions/current
2. Add Data
sftp>cd remote-files
sftp>lcd batch/file-ingest/data/split
sftp>mput *
>cp * /tmp/remote-files
CONSOLE
INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Polling period reset to 1000 ms.
INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Launching Task fileIngestTask
WARN o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Data Flow server has reached its concurrent
INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Polling paused- increasing polling period to
INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Polling resumed
INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Launching Task fileIngestTask
INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Polling period reset to 1000 ms.
WARN o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Data Flow server has reached its concurrent
INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Polling paused- increasing polling period to
INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Polling resumed
INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Launching Task fileIngestTask
INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Polling period reset to 1000 ms.
INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Launching Task fileIngestTask
INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Launching Task fileIngestTask
WARN o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Data Flow server has reached its concurrent
INFO o.s.c.s.a.t.l.d.s.LaunchRequestConsumer : Polling paused- increasing polling period to
...
Add the following dependencies to the sftp source application to enable the JDBC metadata store:
<dependency>
<groupId>org.springframework.integration</groupId>
<artifactId>spring-integration-jdbc</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-jdbc</artifactId>
</dependency>
<dependency>
<groupId>com.h2database</groupId>
<artifactId>h2</artifactId>
</dependency>
If you are running on a local server with the in memory H2 database, set the JDBC url in
src/main/resources/application.properties to use the Data Flow server’s database:
spring.datasource.url=jdbc:h2:tcp://localhost:19092/mem:dataflow
If you are running in Cloud Foundry, the source will be bound to the mysql service. Add the following property to src/main/resources/application.properties :
spring.integration.jdbc.initialize-schema=always
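Under the covers, these dependencies let the sftp source persist its file-tracking state in the database. A hedged sketch of the kind of bean involved (the sample app may wire this differently):
JAVA
import javax.sql.DataSource;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.integration.jdbc.metadata.JdbcMetadataStore;
import org.springframework.integration.metadata.ConcurrentMetadataStore;

@Configuration
public class MetadataStoreConfig {

    @Bean
    public ConcurrentMetadataStore metadataStore(DataSource dataSource) {
        // Backs file tracking with the INT_METADATA_STORE table instead of
        // an in-memory map, so already-processed files survive restarts.
        return new JdbcMetadataStore(dataSource);
    }
}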
If running in Cloud Foundry, the resulting executable jar file must be available in a location that is accessible to your Cloud Foundry instance, such as an HTTP server or Maven repository. If running on a local server, the jar can be registered directly from the local file system using a file:// URI.
Follow the instructions for building and running the main SFTP File Ingest demo, for your
preferred platform, up to the Add Data Step . If you have already completed the main
exercise, restore the data to its initial state, and redeploy the stream:
Undeploy the stream, and deploy it again to run the updated sftp source
If you are running in Cloud Foundry, set the deployment properties to bind sftp to the mysql service, following the same pattern as the nfs binding shown earlier. For example:
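CONSOLE
dataflow:>stream deploy inboundSftp --properties "deployer.sftp.cloudfoundry.services=mysql"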
4. Add Data
Let’s use one small file for this. The directory batch/file-ingest/data/split contains the
contents of batch/file-ingest/data/name-list.csv split into 20 files. Upload one of
them:
sftp>cd remote-files
sftp>lcd batch/file-ingest/data/split
sftp>put names_aa.csv
5. Inspect data
Using a Database browser, as described in the main demo, view the contents of the
INT_METADATA_STORE table.
Note that there is a single key-value pair, where the key identifies the file name (the prefix sftpSource/ provides a namespace for the sftp source app) and the value is a timestamp indicating when the message was received. The metadata store tracks files that have already been processed. This prevents the same files from being pulled from the remote directory on every polling cycle. Only new files, or files that have been updated, will be processed. Since there are no uniqueness constraints on the data, a file processed multiple times by our batch job will result in duplicate entries.
Now let's update the remote file, using SFTP put or, if using the local machine as an SFTP server:
$touch /tmp/remote-files/names_aa.csv
Now the PEOPLE table will have duplicate data. If you ORDER BY FIRST_NAME , you will see
something like this:
Of course, if we drop another one of the files into the remote directory, it will be processed and we will see another entry in the metadata store.
5.1.6. Summary
In this sample, you have learned:
How to create a stream to poll files on an SFTP server and launch a batch job
How the Data Flow Task Launcher limits concurrent task executions
6. Analytics
6.1. Twitter Analytics
In this demonstration, you will learn how to build a data pipeline using Spring Cloud Data Flow
(http://cloud.spring.io/spring-cloud-dataflow/) to consume data from TwitterStream and compute
simple analytics over data-in-transit using Field-Value-Counter.
We will take you through the steps to configure Spring Cloud Data Flow’s Local server.
6.1.1. Prerequisites
A Running Data Flow Shell
The Spring Cloud Data Flow Shell and Local server implementations are in the same repository and are both built by running ./mvnw install from the project root directory. If you have already run the build, use the jar in spring-cloud-dataflow-shell/target .
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>
The Spring Cloud Data Flow Shell is a Spring Boot application that connects to the Data Flow Server's REST API and supports a DSL that simplifies the process of defining a stream or task and managing its lifecycle. Most of these samples use the shell. If you prefer, you can use the Data Flow UI at localhost:9393/dashboard (or wherever the server is hosted) to perform equivalent operations.
The Local Data Flow Server is a Spring Boot application available for download (http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations) or you can build (https://github.com/spring-cloud/spring-cloud-dataflow) it yourself. If you build it yourself, the executable jar will be in spring-cloud-dataflow-server-local/target .
To run the Local Data Flow server, open a new terminal session:
$cd <PATH/TO/SPRING-CLOUD-DATAFLOW-LOCAL-JAR>
$java -jar spring-cloud-dataflow-server-local-<VERSION>.jar
These samples assume that the Data Flow Server can access a remote Maven
repository, repo.spring.io/libs-release by default. If your Data Flow
server is running behind a firewall, or you are using a maven proxy
preventing access to public repositories, you will need to install the sample
apps in your internal Maven repository and configure
(https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-
started-maven-configuration)
the server accordingly. The sample applications are typically registered using
Data Flow’s bulk import facility. For example, the Shell command
dataflow:>app import --uri bit.ly/Celsius-SR1-stream-
applications-rabbit-maven (The actual URI is release and binder specific so
refer to the sample instructions for the actual URL). The bulk import URI
references a plain text file containing entries for all of the publicly available
Spring Cloud Stream and Task applications published to repo.spring.io .
For example,
source.http=maven://org.springframework.cloud.stream.app:http-
source-rabbit:1.3.1.RELEASE registers the http source app at the
corresponding Maven address, relative to the remote repository(ies)
configured for the Data Flow server. The format is maven://<groupId>:<artifactId>:<version> . You will need to download
(https://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-
stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-
Bacon.RELEASE.rabbit-apps-maven-repo-url.properties)
the required apps or build (https://github.com/spring-cloud-stream-app-starters)
them and then install them in your Maven repository, using whatever group,
artifact, and version you choose. If you do this, register individual apps using
dataflow:>app register… using the maven:// resource URI format
corresponding to your installed app.
3. Verify the streams are successfully deployed, where: (1) is the primary pipeline; (2) and (3) are tapping the primary pipeline with the DSL syntax <stream-name>.<label/app name> (e.g. :tweets.twitterstream ); and (4) is the final deployment of the primary pipeline
dataflow:>stream list
5. Verify that the two field-value-counters named hashtags and language are listed successfully
CONSOLE
dataflow:>field-value-counter list
╔════════════════════════╗
║Field Value Counter name║
╠════════════════════════╣
║hashtags ║
║language ║
╚════════════════════════╝
CONSOLE
dataflow:>field-value-counter display hashtags
Displaying values for field value counter 'hashtags'
╔══════════════════════════════════════╤═════╗
║ Value │Count║
╠══════════════════════════════════════╪═════╣
║KCA │ 40║
║PENNYSTOCKS │ 17║
║TEAMBILLIONAIRE │ 17║
║UCL │ 11║
║... │ ..║
║... │ ..║
║... │ ..║
╚══════════════════════════════════════╧═════╝
(Dashboard charts elided: field-value counts for the language and hashtags counters.)
6.1.3. Summary
In this sample, you have learned:
How to create a streaming data pipeline to compute simple analytics using the Twitter Stream and Field Value Counter applications
7. Data Science
7.1. Species Prediction
In this demonstration, you will learn how to use a PMML (https://en.wikipedia.org/wiki/Predictive_Model_Markup_Language) model in the context of a streaming data pipeline orchestrated by Spring Cloud Data Flow (http://cloud.spring.io/spring-cloud-dataflow/).
We will present the steps to prepare, configure, and run Spring Cloud Data Flow's Local server, a Spring Boot application.
7.1.1. Prerequisites
A Running Data Flow Shell
The Spring Cloud Data Flow Shell and Local server implementations are in the same repository and are both built by running ./mvnw install from the project root directory. If you have already run the build, use the jar in spring-cloud-dataflow-shell/target .
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>
The Spring Cloud Data Flow Shell is a Spring Boot application that connects to the Data Flow Server's REST API and supports a DSL that simplifies the process of defining a stream or task and managing its lifecycle. Most of these samples use the shell. If you prefer, you can use the Data Flow UI at localhost:9393/dashboard (or wherever the server is hosted) to perform equivalent operations.
The Local Data Flow Server is a Spring Boot application available for download (http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations) or you can build (https://github.com/spring-cloud/spring-cloud-dataflow) it yourself. If you build it yourself, the executable jar will be in spring-cloud-dataflow-server-local/target .
To run the Local Data Flow server, open a new terminal session:
$cd <PATH/TO/SPRING-CLOUD-DATAFLOW-LOCAL-JAR>
$java -jar spring-cloud-dataflow-server-local-<VERSION>.jar
These samples assume that the Data Flow Server can access a remote Maven
repository, repo.spring.io/libs-release by default. If your Data Flow
server is running behind a firewall, or you are using a maven proxy
preventing access to public repositories, you will need to install the sample
apps in your internal Maven repository and configure
(https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-
started-maven-configuration)
the server accordingly. The sample applications are typically registered using
Data Flow’s bulk import facility. For example, the Shell command
dataflow:>app import --uri bit.ly/Celsius-SR1-stream-
applications-rabbit-maven (The actual URI is release and binder specific so
refer to the sample instructions for the actual URL). The bulk import URI
references a plain text file containing entries for all of the publicly available
Spring Cloud Stream and Task applications published to repo.spring.io .
For example,
source.http=maven://org.springframework.cloud.stream.app:http-
source-rabbit:1.3.1.RELEASE registers the http source app at the
corresponding Maven address, relative to the remote repository(ies)
configured for the Data Flow server. The format is maven://<groupId>:<artifactId>:<version> . You will need to download
(https://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-
stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-
Bacon.RELEASE.rabbit-apps-maven-repo-url.properties)
the required apps or build (https://github.com/spring-cloud-stream-app-starters)
them and then install them in your Maven repository, using whatever group,
artifact, and version you choose. If you do this, register individual apps using
dataflow:>app register… using the maven:// resource URI format
corresponding to your installed app.
The built-in pmml processor will load the given PMML model definition and create an internal object representation that can be evaluated quickly. When the stream receives the data, it will be used as the input for the evaluation of the loaded model.
dataflow:>stream list
CONSOLE
2016-02-18 06:36:45.396 INFO 31194 --- [nio-9393-exec-1] o.s.c.d.d.l.OutOfProcessModuleD
Logs will be in /var/folders/c3/ctx7_rns6x30tq7rb76wzqwr0000gp/T/spring-cloud-data-flo
2016-02-18 06:36:45.402 INFO 31194 --- [nio-9393-exec-1] o.s.c.d.d.l.OutOfProcessModuleD
Logs will be in /var/folders/c3/ctx7_rns6x30tq7rb76wzqwr0000gp/T/spring-cloud-data-flo
2016-02-18 06:36:45.407 INFO 31194 --- [nio-9393-exec-1] o.s.c.d.d.l.OutOfProcessModuleD
Logs will be in /var/folders/c3/ctx7_rns6x30tq7rb76wzqwr0000gp/T/spring-cloud-data-flo
5. Post sample data to the http endpoint: localhost:9001 ( 9001 is the port we specified
for the http source in this case)
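For example, using the shell's http post command (the payload values here mirror the sample output below; the exact command used in the original sample may differ):
CONSOLE
dataflow:>http post --target http://localhost:9001 --contentType application/json --data "{\"sepalLength\": 6.4, \"sepalWidth\": 3.2, \"petalLength\": 4.5, \"petalWidth\": 1.5}"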
{
"sepalLength": 6.4,
"sepalWidth": 3.2,
"petalLength": 4.5,
"petalWidth": 1.5,
"Species": {
"result": "versicolor",
"type": "PROBABILITY",
"categoryValues": [
"setosa",
"versicolor",
"virginica"
]
},
"predictedSpecies": "versicolor",
"Probability_setosa": 4.728207706362856E-9,
"Probability_versicolor": 0.9133639504608079,
"Probability_virginica": 0.0866360448109845
}
{
"sepalLength": 6.4,
"sepalWidth": 3.2,
"petalLength": 4.5,
"petalWidth": 1.8,
"Species": {
"result": "virginica",
"type": "PROBABILITY",
"categoryValues": [
"setosa",
"versicolor",
"virginica"
]
},
"predictedSpecies": "virginica",
"Probability_setosa": 1.0443898084700813E-8,
"Probability_versicolor": 0.1750120333571921,
"Probability_virginica": 0.8249879561989097
}
7.1.3. Summary
In this sample, you have learned:
How to use a PMML model in the context of a streaming data pipeline orchestrated by Spring Cloud Data Flow
8. Functions
8.1. Functions in Spring Cloud Data Flow
This is an experiment to run a Spring Cloud Function workload in Spring Cloud Data Flow. The function-runner used in this sample is at the 1.0.M1 release and is not recommended for production use.
In this sample, you will learn how to use Spring Cloud Function
(https://github.com/spring-cloud/spring-cloud-function) based streaming applications in Spring Cloud
Data Flow. To learn more about Spring Cloud Function, check out the project page
(http://cloud.spring.io/spring-cloud-function/).
8.1.1. Prerequisites
A Running Data Flow Shell
The Spring Cloud Data Flow Shell and Local server implementations are in the same repository and are both built by running ./mvnw install from the project root directory. If you have already run the build, use the jar in spring-cloud-dataflow-shell/target .
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>
The Spring Cloud Data Flow Shell is a Spring Boot application that connects to the Data Flow Server's REST API and supports a DSL that simplifies the process of defining a stream or task and managing its lifecycle. Most of these samples use the shell. If you prefer, you can use the Data Flow UI at localhost:9393/dashboard (or wherever the server is hosted) to perform equivalent operations.
The Local Data Flow Server is a Spring Boot application available for download (http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations) or you can build (https://github.com/spring-cloud/spring-cloud-dataflow) it yourself. If you build it yourself, the executable jar will be in spring-cloud-dataflow-server-local/target .
To run the Local Data Flow server, open a new terminal session:
$cd <PATH/TO/SPRING-CLOUD-DATAFLOW-LOCAL-JAR>
$java -jar spring-cloud-dataflow-server-local-<VERSION>.jar
This sample requires access to both Spring's snapshot and milestone repos. Please follow the how-to guides (https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#howto) on how to set repo.spring.io/libs-release and repo.spring.io/libs-milestone as remote repositories in SCDF.
These samples assume that the Data Flow Server can access a remote Maven
repository, repo.spring.io/libs-release by default. If your Data Flow
server is running behind a firewall, or you are using a maven proxy
preventing access to public repositories, you will need to install the sample
apps in your internal Maven repository and configure
(https://docs.spring.io/spring-cloud-dataflow/docs/current/reference/htmlsingle/#getting-
started-maven-configuration)
the server accordingly. The sample applications are typically registered using
Data Flow’s bulk import facility. For example, the Shell command
dataflow:>app import --uri bit.ly/Celsius-SR1-stream-
applications-rabbit-maven (The actual URI is release and binder specific so
refer to the sample instructions for the actual URL). The bulk import URI
references a plain text file containing entries for all of the publicly available
Spring Cloud Stream and Task applications published to repo.spring.io .
For example,
source.http=maven://org.springframework.cloud.stream.app:http-
source-rabbit:1.3.1.RELEASE registers the http source app at the
corresponding Maven address, relative to the remote repository(ies)
configured for the Data Flow server. The format is maven://<groupId>:<artifactId>:<version> . You will need to download
(https://repo.spring.io/libs-release/org/springframework/cloud/stream/app/spring-cloud-
stream-app-descriptor/Bacon.RELEASE/spring-cloud-stream-app-descriptor-
Bacon.RELEASE.rabbit-apps-maven-repo-url.properties)
the required apps or build (https://github.com/spring-cloud-stream-app-starters)
them and then install them in your Maven repository, using whatever group,
artifact, and version you choose. If you do this, register individual apps using
dataflow:>app register… using the maven:// resource URI format
corresponding to your installed app.
The sample uses the CharCounter function (https://github.com/spring-cloud/spring-cloud-function/blob/master/spring-cloud-function-samples/function-sample/src/main/java/com/example/functions/CharCounter.java).
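A minimal sketch of what that function does (see the linked source for the actual class):
JAVA
package com.example.functions;

import java.util.function.Function;

public class CharCounter implements Function<String, Integer> {

    @Override
    public Integer apply(String payload) {
        // Map each incoming message payload to its character count.
        return payload.length();
    }
}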
dataflow:>stream list
BASH
....
....
2017-10-17 11:43:03.714 INFO 18409 --- [nio-9393-exec-7] o.s.c.d.s.s.AppDeployerStreamDe
2017-10-17 11:43:04.379 INFO 18409 --- [nio-9393-exec-7] o.s.c.d.spi.local.LocalAppDeplo
Logs will be in /var/folders/c3/ctx7_rns6x30tq7rb76wzqwr0000gs/T/spring-cloud-dataflow
2017-10-17 11:43:04.380 INFO 18409 --- [nio-9393-exec-7] o.s.c.d.s.s.AppDeployerStreamDe
2017-10-17 11:43:04.384 INFO 18409 --- [nio-9393-exec-7] o.s.c.d.spi.local.LocalAppDeplo
Logs will be in /var/folders/c3/ctx7_rns6x30tq7rb76wzqwr0000gs/T/spring-cloud-dataflow
2017-10-17 11:43:04.385 INFO 18409 --- [nio-9393-exec-7] o.s.c.d.s.s.AppDeployerStreamDe
2017-10-17 11:43:04.391 INFO 18409 --- [nio-9393-exec-7] o.s.c.d.spi.local.LocalAppDeplo
Logs will be in /var/folders/c3/ctx7_rns6x30tq7rb76wzqwr0000gs/T/spring-cloud-dataflow
....
....
6. Post sample data to the http endpoint: localhost:9001 ( 9001 is the port we specified
for the http source in this case)
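For example, you can post a payload from the shell; an 11-character payload such as hello world would match the first count in the log output below (the command is illustrative):
CONSOLE
dataflow:>http post --target http://localhost:9001 --data "hello world"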
BASH
$ tail -f /var/folders/c3/ctx7_rns6x30tq7rb76wzqwr0000gs/T/spring-cloud-dataflow-65490254
....
....
....
....
2017-10-17 11:45:39.363 INFO 19193 --- [on-runner.foo-1] log-sink : 11
2017-10-17 11:46:40.997 INFO 19193 --- [on-runner.foo-1] log-sink : 24
....
....
8.1.3. Summary
In this sample, you have learned:
How to use the out-of-the-box function-runner application in Spring Cloud Data Flow
9. Micrometer
9.1. SCDF metrics with InfluxDB and Grafana
In this demonstration, you will learn how Micrometer (http://micrometer.io) can help to monitor
your Spring Cloud Data Flow (http://cloud.spring.io/spring-cloud-dataflow/) (SCDF) streams using
InfluxDB (https://docs.influxdata.com/influxdb/v1.5/) and Grafana (https://grafana.com/grafana).
Unlike the Spring Cloud Data Flow Metrics Collector, metrics here are sent synchronously over HTTP, not through a Binder channel topic.
All Spring Cloud Stream App Starters enrich the standard dimensional tags (http://micrometer.io/docs/concepts#_supported_monitoring_systems) with the following SCDF-specific tags:
(Tag table elided; for example, the instance.index tag carries the application instance index, 0 by default.)
For custom app starters that don't extend from the core (https://github.com/spring-cloud-stream-app-starters/core) parent, you should add the org.springframework.cloud.stream.app:app-starters-common dependency to enable the SCDF tags.
Below we present the steps to prepare and configure the demo of Spring Cloud Data Flow's Local server integration with InfluxDB . For other deployment environments, such as Cloud Foundry or Kubernetes , additional configuration might be required.
9.1.1. Prerequisites
A Running Data Flow Shell
The Spring Cloud Data Flow Shell and Local server implementations are in the same repository and are both built by running ./mvnw install from the project root directory. If you have already run the build, use the jar in spring-cloud-dataflow-shell/target .
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>
The Spring Cloud Data Flow Shell is a Spring Boot application that connects to the Data Flow Server's REST API and supports a DSL that simplifies the process of defining a stream or task and managing its lifecycle. Most of these samples use the shell. If you prefer, you can use the Data Flow UI at localhost:9393/dashboard (or wherever the server is hosted) to perform equivalent operations.
The Local Data Flow Server is a Spring Boot application available for download (http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations) or you can build (https://github.com/spring-cloud/spring-cloud-dataflow) it yourself. If you build it yourself, the executable jar will be in spring-cloud-dataflow-server-local/target .
To run the Local Data Flow server, open a new terminal session:
$cd <PATH/TO/SPRING-CLOUD-DATAFLOW-LOCAL-JAR>
$java -jar spring-cloud-dataflow-server-local-<VERSION>.jar
(https://github.com/spring-cloud-stream-app-starters/log/blob/master/spring-cloud-starter-stream-sink-
log/README.adoc)
application starters, pre-built with the io.micrometer:micrometer-registry-influx dependency.
BASH
app register --name time2 --type source --uri file://<path-to-your-time-app>/time-
source-kafka-2.0.0.BUILD-SNAPSHOT.jar --metadata-uri file://<path-to-your-time-
app>/time-source-kafka-2.0.0.BUILD-SNAPSHOT-metadata.jar
BASH
docker run -d --name grafana -p 3000:3000 grafana/grafana:5.1.0
By default, the InfluxDB server runs on localhost:8086 . You can add the app.*.management.metrics.export.influx.uri={influxdb-server-url} property to alter the default location.
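For example, deployment properties along these lines point every app in a stream at a specific InfluxDB instance and database (the stream name and URL are illustrative; the property names come from Spring Boot's Micrometer Influx registry support):
CONSOLE
dataflow:>stream deploy httptest --properties "app.*.management.metrics.export.influx.uri=http://influx.example.com:8086,app.*.management.metrics.export.influx.db=myinfluxdb"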
BASH
docker exec -it influxdb /bin/bash
root:/# influx
> show databases
> use myinfluxdb
> show measurements
> select * from spring_integration_send limit 10
4. Configure Grafana
Name: influx_auto_DataFlowMetricsCollector
Type: InfluxDB
Host: localhost:8086
Access: Browser
Database: myinfluxdb
Password (DB): admin
For previous Grafana 4.x versions, set the Access property to direct instead.
9.1.3. Summary
In this sample, you have learned:
How to use InfluxDB and Grafana to monitor and visualize Spring Cloud Stream
application metrics.
9.2. SCDF metrics with Prometheus and Grafana
Prometheus is a time-series database used for monitoring highly dynamic service-oriented architectures. In a world of microservices, its support for multi-dimensional data collection and querying is a particular strength.
Unlike the Spring Cloud Data Flow Metrics Collector, metrics here are sent synchronously over HTTP, not through a Binder channel topic.
All Spring Cloud Stream App Starters enrich the standard dimensional tags (http://micrometer.io/docs/concepts#_supported_monitoring_systems) with the following SCDF-specific tags:
(Tag table elided; for example, the instance.index tag carries the application instance index, 0 by default.)
For custom app starters that don't extend from the core (https://github.com/spring-cloud-stream-app-starters/core) parent, you should add the org.springframework.cloud.stream.app:app-starters-common dependency to enable the SCDF tags.
Prometheus employs a pull-based metrics model, known as metrics scraping. Spring Boot provides an actuator endpoint at /actuator/prometheus that presents a Prometheus scrape in the appropriate format.
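Once an app with the Prometheus registry is running, you can verify its scrape endpoint by hand; the host and port depend on where the app was deployed, so this command is illustrative:
BASH
$ curl http://localhost:<app-port>/actuator/prometheus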
Below we present the steps to prepare and configure the demo of Spring Cloud Data Flow's Local server integration with Prometheus . For other deployment environments, such as Cloud Foundry or Kubernetes , additional configuration might be required.
9.2.1. Prerequisites
A Running Data Flow Shell
The Spring Cloud Data Flow Shell and Local server implementations are in the same repository and are both built by running ./mvnw install from the project root directory. If you have already run the build, use the jar in spring-cloud-dataflow-shell/target .
$ cd <PATH/TO/SPRING-CLOUD-DATAFLOW-SHELL-JAR>
$ java -jar spring-cloud-dataflow-shell-<VERSION>.jar
____ ____ _ __
/ ___| _ __ _ __(_)_ __ __ _ / ___| | ___ _ _ __| |
\___ \| '_ \| '__| | '_ \ / _` | | | | |/ _ \| | | |/ _` |
___) | |_) | | | | | | | (_| | | |___| | (_) | |_| | (_| |
|____/| .__/|_| |_|_| |_|\__, | \____|_|\___/ \__,_|\__,_|
____ |_| _ __|___/ __________
| _ \ __ _| |_ __ _ | ___| | _____ __ \ \ \ \ \ \
| | | |/ _` | __/ _` | | |_ | |/ _ \ \ /\ / / \ \ \ \ \ \
| |_| | (_| | || (_| | | _| | | (_) \ V V / / / / / / /
|____/ \__,_|\__\__,_| |_| |_|\___/ \_/\_/ /_/_/_/_/_/
Welcome to the Spring Cloud Data Flow shell. For assistance hit TAB or type "help".
dataflow:>
The Spring Cloud Data Flow Shell is a Spring Boot application that connects to the Data Flow Server's REST API and supports a DSL that simplifies the process of defining a stream or task and managing its lifecycle. Most of these samples use the shell. If you prefer, you can use the Data Flow UI at localhost:9393/dashboard (or wherever the server is hosted) to perform equivalent operations.
The Local Data Flow Server is a Spring Boot application available for download (http://cloud.spring.io/spring-cloud-dataflow/#platform-implementations) or you can build (https://github.com/spring-cloud/spring-cloud-dataflow) it yourself. If you build it yourself, the executable jar will be in spring-cloud-dataflow-server-local/target .
To run the Local Data Flow server, open a new terminal session:
$cd <PATH/TO/SPRING-CLOUD-DATAFLOW-LOCAL-JAR>
$java -jar spring-cloud-dataflow-server-local-<VERSION>.jar
(https://github.com/spring-cloud-stream-app-starters/log/blob/master/spring-cloud-starter-stream-sink-
log/README.adoc)
application starters, pre-built with the io.micrometer:micrometer-registry-prometheus dependency.
The deployment properties ensure that the prometheus actuator endpoint is enabled and Spring Boot security is disabled.
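A sketch of such deployment properties (assuming standard Spring Boot and app-starter property names; not verbatim from the sample):
CONSOLE
dataflow:>stream deploy httptest --properties "app.*.management.endpoints.web.exposure.include=prometheus,app.*.spring.cloud.streamapp.security.enabled=false"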
BASH
cd ./spring-cloud-dataflow-samples/micrometer/spring-cloud-dataflow-prometheus-service-discovery
./mvnw clean install
It connects to the SCDF runtime URL and regenerates the /tmp/targets.json file every 10 seconds.
YAML
global:
  scrape_interval: 15s     # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
5. Start Prometheus
BASH
docker run -d --name prometheus \
-p 9090:9090 \
-v <full-path-to>/prometheus-local-file.yml:/etc/prometheus/prometheus.yml \
-v /tmp/targets.json:/etc/prometheus/targets.json \
prom/prometheus:v2.2.1
Use the management UI at localhost:9090/graph to verify that SCDF app metrics have been collected:
# Throughput
rate(spring_integration_send_seconds_count{type="channel"}[60s])
# Latency
rate(spring_integration_send_seconds_sum{type="channel"}[60s]) / rate(spring_integration_send_seconds_count{type="channel"}[60s])
BASH
docker run -d --name grafana -p 3000:3000 grafana/grafana:5.1.0
7. Configure Grafana
Name: ScdfPrometheus
Type: Prometheus
Host: localhost:9090
Access: Browser
For previous Grafana 4.x versions, set the Access property to direct instead.
9.2.3. Summary
In this sample, you have learned:
How to use Prometheus and Grafana to monitor and visualize Spring Cloud Stream
application metrics.