Stream Processing Lab Manual
Date:
Step 1: Go to the MongoDB download page and click Download, as shown in the screenshot. An
.msi file with a name like mongodb-win32-x86_64-2008plus-ssl-3.4.7-signed will be downloaded to
your system. Double-click the file to run the installer.
Step 2: Click Next when the MongoDB installation window pops up.
Step 3: Accept the MongoDB user Agreement and click Next.
Step 4: When the setup asks you to choose the Setup type, choose Complete.
Step 5: Click Install to begin the installation.
Step 6: That’s it. Click Finish once the MongoDB installation is complete.
Result: Thus MongoDB is successfully installed.
Exercise: 2 Create and Drop a database in MongoDB
Date:
After pressing Enter, we are at the MongoDB shell, as shown in the figure below.
Once you are in the MongoDB shell, create the database by typing this
command:
> use madavi
The database madavi is now selected, but it is not yet present in the list of all the databases
(shown by the show dbs command). This is because a database is not created until you save a
document in it.
Note: If the database name you mentioned is already present then this command will
connect you to the database. However if the database doesn’t exist then this will create
the database with the given name and connect you to it.
• Now we create a collection named student and insert a document into it.
> db.student.insert({name: "sree", age: 30, address: "vijayawada"})
• You can now see that the database "madavi" is created.
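The lazy creation described above can be illustrated without a running server. The following is a toy in-memory model in plain JavaScript, not MongoDB code: the class ToyServer and its methods use, insert and showDbs are hypothetical names invented for this sketch. It only mimics the rule that a database appears in the list after the first document has been saved.

```javascript
// Toy model (NOT the MongoDB API): a database is only materialised
// when the first document is inserted into one of its collections.
class ToyServer {
  constructor() {
    this.databases = new Map(); // db name -> Map(collection name -> docs[])
    this.current = null;        // name selected with use()
  }

  // Like `use name`: only switches the current name, creates nothing yet.
  use(name) {
    this.current = name;
    return "switched to db " + name;
  }

  // Like `db.<collection>.insert(doc)`: creates db and collection on demand.
  insert(collection, doc) {
    if (this.current === null) throw new Error("no database selected");
    if (!this.databases.has(this.current)) {
      this.databases.set(this.current, new Map());
    }
    const db = this.databases.get(this.current);
    if (!db.has(collection)) db.set(collection, []);
    db.get(collection).push(doc);
    return { nInserted: 1 };
  }

  // Like `show dbs`: lists only databases that already hold data.
  showDbs() {
    return [...this.databases.keys()];
  }
}

const server = new ToyServer();
server.use("madavi");
const before = server.showDbs(); // madavi is not listed yet
server.insert("student", { name: "sree", age: 30, address: "vijayawada" });
const after = server.showDbs();  // madavi is listed now
console.log(before, after);
```

In the real shell the same sequence would be use madavi, show dbs, an insert, and show dbs again.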
OUTPUT:
Result: -
Exercise: 3 MongoDB on the fly
Date:
• A convenient feature of MongoDB is that you need not create a collection before you insert a
document into it. With a single command you can insert a document, and MongoDB creates the
collection on the fly.
• SYNTAX:
db.collection_name.insert({key:value, key:value…})
EXAMPLE:
> db.student.insert({rollno: "20X41A0441", name: "durga", age: 18, city: "Vijayawada"})
• To view the documents in the collection, use the find() method.
SYNTAX: db.collection_name.find()
• To check whether the collection was created successfully, use the following command.
> show collections
This command shows the list of all the collections in the currently selected database.
OUTPUT:
Result:-
Exercise: 4 Creating Collection
Date:
Aim:-
Create a collection with options before inserting documents, and drop the collection.
• We can also create collection before we actually insert data in it. This method
provides you the options that you can set while creating a collection.
SYNTAX:
db.createCollection(name, options)
• name is the collection name
• options is an optional field that we can use to specify certain parameters such
as size, max number of documents etc. in the collection.
To drop a collection, use:
db.collection_name.drop()
Note: Once you drop a collection, all its documents and the indexes associated
with them are also dropped. To preserve the indexes, use the remove() function,
which removes only the documents and leaves the collection and its indexes in
place. Indexes and the remove() function are covered in later exercises.
EXAMPLE:
> db.createCollection("students")
{ "ok" : 1 }
> db.students.drop()
true
OPTIONS field in the above syntax:
• size: type: number.
This specifies the maximum size of the collection (capped collection) in bytes.
• max: type: number.
This specifies the maximum number of documents the collection can hold.
EXAMPLE:
> db.createCollection("teachers", { capped : true, size : 9232768 })
{ "ok" : 1 }
• This command creates a capped collection named "teachers" with a maximum size of 9232768
bytes. Once the collection reaches that limit, it starts overwriting the oldest entries.
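The overwriting behaviour of a capped collection can be sketched with a small in-memory model. This is plain JavaScript rather than MongoDB code, and the class name ToyCappedCollection is hypothetical. As a simplifying assumption, the cap is expressed as a document count (like the max option) instead of a size in bytes.

```javascript
// Toy model of a capped collection (NOT the MongoDB API).
// Assumption: the cap is a document count (like the `max` option);
// a real capped collection is also bounded by `size` in bytes.
class ToyCappedCollection {
  constructor(max) {
    this.max = max;
    this.docs = [];
  }

  insert(doc) {
    this.docs.push(doc);
    // When the limit is exceeded, the oldest entries are discarded first.
    while (this.docs.length > this.max) this.docs.shift();
  }

  find() {
    return this.docs.slice(); // copy, in insertion order
  }
}

const teachers = new ToyCappedCollection(3);
["A", "B", "C", "D", "E"].forEach((name) => teachers.insert({ name }));
const remaining = teachers.find().map((d) => d.name);
console.log(remaining); // the two oldest entries, A and B, were overwritten
```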
Result:-
Exercise: 5 Insert document using MongoDB
Date:
• The field "course" in the example below is an array that holds several embedded documents,
each made up of key-value pairs.
>db.students.insert(
{
name: "Chaitanya",
age: 20,
email: "chaitu@gmail.co.in",
course: [ { name: "MongoDB", duration: 7 }, { name: "Java", duration: 30 } ]
}
)
Output:
WriteResult({ "nInserted" : 1 })
To insert multiple documents into a collection, define an array of documents and then call the
insert() method with the array variable, as shown in the example below. Here we insert three
documents into the collection named "students". If the collection is not present, the command
creates it and then inserts the documents.
EXAMPLE:
> var beginners = [
  { "StudentId" : 1001, "StudentName" : "Steve", "age" : 30 },
  { "StudentId" : 1002, "StudentName" : "Negan", "age" : 42 },
  { "StudentId" : 3333, "StudentName" : "Rick", "age" : 35 }
];
> db.students.insert(beginners);
Output:
BulkWriteResult({
  "writeErrors" : [ ],
  "writeConcernErrors" : [ ],
  "nInserted" : 3,
  "nUpserted" : 0,
  "nMatched" : 0,
  "nModified" : 0,
  "nRemoved" : 0,
  "upserted" : [ ]
})
As you can see, the value of nInserted is 3, which means that three documents were inserted
by this command.
To verify that the documents are in the collection, run this command:
> db.students.find()
You can print the output in JSON format so that it is easier to read. To do so, run the
command:
> db.collection_name.find().forEach(printjson)
In the screenshot below, you can see the difference. First we printed the documents using the
normal find() method, and then we printed the documents of the same collection in JSON
format. The documents in JSON format are neat and easy to read.
OUTPUT:
b. Insert multiple documents in collection
Printing all documents based on Query
Result:-
Exercise: 6 MongoDB Update document
Date:
Aim :- To update the document using update method and save method.
a) using update() method.
b) using save() method.
SYNTAX:
> db.collection_name.update(criteria, update_data)
EXAMPLE:
> db.students.update({"name": "sai"}, {$set: {"name": "sree"}})
By default, update() modifies only the first document that matches the criteria. To update
all matching documents, set the multi option to true.
SYNTAX:
> db.collection_name.update(criteria, update_data, {multi: true})
EXAMPLE:
> db.students.update({"name": "sai"}, {$set: {"name": "sree"}}, {multi: true})
SYNTAX:
> db.collection_name.save(document)
• To update a document with the save() method, you need to know the unique _id field of
that document.
• An important point to note is that when you do not provide the _id field, save() calls the
insert() method and the passed document is inserted into the collection as a new document.
• To get the _id of a document, run this command:
> db.students.find().pretty()
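The difference between a single update, a multi update and save() can be sketched with a toy in-memory collection. The code below is plain JavaScript, not the MongoDB API; the functions update and save and the integer _id counter are hypothetical stand-ins (real _id values are ObjectIds), and only simple equality criteria and $set-style changes are modelled.

```javascript
// Toy in-memory collection (NOT the MongoDB API); names are hypothetical.
let nextId = 1;
const docs = [];

// Returns true when every key in criteria equals the document's value.
const matches = (doc, criteria) =>
  Object.keys(criteria).every((k) => doc[k] === criteria[k]);

// Like update(criteria, {$set: fields}, {multi}): first match by default.
function update(criteria, fields, options = {}) {
  let nModified = 0;
  for (const doc of docs) {
    if (!matches(doc, criteria)) continue;
    Object.assign(doc, fields);
    nModified++;
    if (!options.multi) break; // stop after the first match
  }
  return nModified;
}

// Like save(doc): insert when _id is missing or unknown, else replace.
function save(doc) {
  if (doc._id === undefined) {
    docs.push({ _id: nextId++, ...doc });
    return "inserted";
  }
  const i = docs.findIndex((d) => d._id === doc._id);
  if (i === -1) {
    docs.push(doc);
    return "inserted";
  }
  docs[i] = doc; // the whole document is replaced
  return "replaced";
}

save({ name: "sai" }); // no _id: inserted as a new document
save({ name: "sai" });
const one = update({ name: "sai" }, { name: "sree" }); // first match only
save({ name: "sai" });
const many = update({ name: "sai" }, { name: "ravi" }, { multi: true });
const how = save({ _id: 1, name: "durga" }); // _id 1 exists: replaced
console.log(one, many, how, docs);
```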
OUTPUT:
To update multiple documents with the update() method:
Retrieving a document using name field
Result:-
Exercise: 7 MongoDB Delete
Date:
The remove() method is used for removing the documents from a collection in MongoDB.
>db.collection_name.remove(delete_criteria)
EXAMPLE:
> db.students.find().pretty()
{
  "_id" : ObjectId("59bcecc7668dcce02aaa6fed"),
  "StudentId" : 1001,
  "StudentName" : "Steve",
  "age" : 30
}
{
  "_id" : ObjectId("59bcecc7668dcce02aaa6fef"),
  "StudentId" : 3333,
  "StudentName" : "Rick",
  "age" : 35
}
To remove the student whose StudentId is 3333 from this collection, write a command using
the remove() method like this:
db.students.remove({"StudentId": 3333})
Output:
WriteResult({ "nRemoved" : 1 })
When more than one document in the collection matches the criteria, all of those documents
are deleted by the remove command. However, there is a way to limit the deletion to a single
document, so that even if several documents match the deletion criteria, only one is deleted.
SYNTAX:
> db.collection_name.remove(delete_criteria, justOne)
Here justOne is an optional Boolean parameter given as 1 or 0. If you pass 1, the deletion is
limited to one document. Because the parameter is optional, we were able to call remove()
without it above.
> db.walkingdead.find().pretty()
{
  "_id" : ObjectId("59bf280cb8e797a22c654229"),
  "age" : 32,
  "rname" : "Andrew Lincoln"
}
{
  "_id" : ObjectId("59bf2851b8e797a22c65422a"),
  "name" : "Negan",
  "age" : 35
}
{
  "_id" : ObjectId("59bf28a5b8e797a22c65422b"),
  "age" : 32
}
Suppose we want to remove a document whose age is 32. Two documents in this collection
match this criteria, so to limit the deletion to one we set the justOne parameter to true:
> db.walkingdead.remove({"age": 32}, 1)
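The effect of the justOne parameter can be sketched with a small in-memory model. This is plain JavaScript, not the MongoDB API: the function remove below is a hypothetical stand-in that supports only equality criteria, and the sample documents carry just the fields visible in the listing above.

```javascript
// Toy model of remove(criteria, justOne) (NOT the MongoDB API).
function remove(docs, criteria, justOne = false) {
  const matches = (doc) =>
    Object.keys(criteria).every((k) => doc[k] === criteria[k]);
  let nRemoved = 0;
  for (let i = 0; i < docs.length; ) {
    if (matches(docs[i])) {
      docs.splice(i, 1); // delete in place; do not advance the index
      nRemoved++;
      if (justOne) break; // stop after the first deletion
    } else {
      i++;
    }
  }
  return { nRemoved };
}

const walkingdead = [
  { age: 32, rname: "Andrew Lincoln" },
  { name: "Negan", age: 35 },
  { age: 32 },
];

const first = remove(walkingdead, { age: 32 }, true); // only one of two matches
const leftAfterOne = walkingdead.length;
const rest = remove(walkingdead, { age: 32 });        // the remaining match
const all = remove(walkingdead, {});                  // empty criteria matches everything
console.log(first, leftAfterOne, rest, all, walkingdead);
```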
To remove all the documents from a collection without removing the collection itself, use
the remove() method like this:
SYNTAX:
> db.collection_name.remove({})
To drop a collection, first connect to the database that contains it and then type the
following command:
> db.collection_name.drop()
Note: Once you drop a collection, all its documents and the indexes associated with them are
also dropped. To preserve the indexes, use the remove() function, which removes only the
documents and leaves the collection and its indexes in place.
> use madavi
switched to db madavi
> show collections
admin
students
teachers
> db.teachers.drop()
true
> show collections
admin
students
The command db.teachers.drop() returned true, which means that the collection was deleted
successfully. We verified this by running the show collections command again after the
deletion, as shown above.
OUTPUT:
b) Remove only one document matching your criteria
(Or)
c) Remove all documents
Result :-
Exercise: 8 Java & PHP
Aim:
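import com.mongodb.MongoClient;

// The class name is chosen for this example.
public class ConnectToMongoDB {
    public static void main(String[] args) {
        try {
            // Create a client for the MongoDB server on the default host and port
            MongoClient mongoClient = new MongoClient("localhost", 27017);
            System.out.println("Connected to MongoDB successfully");
            mongoClient.close();
        } catch (Exception e) {
            System.out.println("Connection establishment failed");
            System.out.println(e);
        }
    }
}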
2. To use MongoDB with PHP, you need the MongoDB PHP driver. Download the latest release of the
driver, unzip the archive, put php_mongo.dll in your PHP extension directory ("ext" by default), and
add the following line to your php.ini file:
extension = php_mongo.dll
Make a Connection and Select a Database
To make a connection, you need to specify the database name. If the database doesn't exist, MongoDB
creates it automatically.
Following is the code snippet to connect to the database:
<?php
// connect to mongodb
$m = new MongoClient();
echo "Connection to database successfully";
// select a database
$db = $m->mydb;
?>
Result:-
Thus the simple application is successfully created.
Exercise: 9 Procedure for installing Apache Kafka
Date:
Apache Kafka can run on all platforms supported by Java. To set up Kafka on an Ubuntu
system, you need to install Java first. Because Oracle Java is now commercially licensed,
we use its open-source counterpart, OpenJDK.
Download the Apache Kafka binary files from the official download website. You can also
select a nearby mirror to download from.
wget https://downloads.apache.org/kafka/3.4.0/kafka_2.12-3.4.0.tgz
Then extract the archive and move it to /usr/local/kafka:
tar xzf kafka_2.12-3.4.0.tgz
sudo mv kafka_2.12-3.4.0 /usr/local/kafka
Step 3 — Creating System Unit Files
Now you need to create systemd unit files for the ZooKeeper and Kafka services, which will
let you start and stop the services easily.
nano /etc/systemd/system/zookeeper.service
[Unit]
Description=Apache Zookeeper server
Documentation=http://zookeeper.apache.org
Requires=network.target remote-fs.target
After=network.target remote-fs.target
[Service]
Type=simple
ExecStart=/usr/local/kafka/bin/zookeeper-server-start.sh /usr/local/kafka/config/zookeeper.properties
ExecStop=/usr/local/kafka/bin/zookeeper-server-stop.sh
Restart=on-abnormal
[Install]
WantedBy=multi-user.target
nano /etc/systemd/system/kafka.service
[Unit]
Description=Apache Kafka Server
Documentation=http://kafka.apache.org/documentation.html
Requires=zookeeper.service
[Service]
Type=simple
Environment="JAVA_HOME=/usr/lib/jvm/java-1.11.0-openjdk-amd64"
ExecStart=/usr/local/kafka/bin/kafka-server-start.sh /usr/local/kafka/config/server.properties
ExecStop=/usr/local/kafka/bin/kafka-server-stop.sh
[Install]
WantedBy=multi-user.target
systemctl daemon-reload
First, you need to start the ZooKeeper service and then start Kafka. Use the systemctl
command to start a single-node ZooKeeper instance.
sudo systemctl start zookeeper
Now start the Kafka server and view the running status:
sudo systemctl start kafka
sudo systemctl status kafka
All done. The Kafka installation has been completed successfully. The next part of this
exercise will help you work with the Kafka server.
Kafka provides multiple pre-built shell scripts to work on it. First, create a topic named
“myTopic” with a single partition with a single replica:
cd /usr/local/kafka
bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic myTopic
The replication factor describes how many copies of the data will be created. As we are running
a single instance, keep this value at 1. Set the partitions option to the number of brokers
you want your data to be split between. As we are running a single broker, keep this
value at 1. You can create multiple topics by running the same command with different topic names.
After that, you can list the topics created on Kafka by running the command below:
bin/kafka-topics.sh --list --bootstrap-server localhost:9092
Result:-
Exercise: 10 Kafka Cluster & Basic operations
Date:
Aim :-
Demonstrate setting up a single-node, single-broker Kafka cluster and show
basic operations such as creating topics and producing/consuming messages.
To set up a Kafka cluster, you will need to follow these general steps:
1. Install Kafka on all nodes of the cluster. You can download Kafka from the Apache
Kafka website.
2. Configure the server.properties file on each node to specify the broker ID, the
ZooKeeper connection string, and other properties.
3. Start the ZooKeeper service on each node. This is required for Kafka to function.
4. Start the Kafka brokers on each node by running the kafka-server-start command
and specifying the location of the server.properties file.
5. Test the cluster by creating a topic, producing and consuming messages, and verifying
that they are replicated across all nodes.
1. Install Kafka on all nodes of the cluster. You can download Kafka from the Apache
Kafka website.
2. Configure the server.properties file on each node to specify the broker ID, the
ZooKeeper connection string, and other properties. For example, here is a
configuration for a simple Kafka cluster with three brokers:
# Broker 1
broker.id=1
listeners=PLAINTEXT://localhost:9092
num.partitions=3
log.dirs=/tmp/kafka-logs-1
zookeeper.connect=localhost:2181
# Broker 2
broker.id=2
listeners=PLAINTEXT://localhost:9093
num.partitions=3
log.dirs=/tmp/kafka-logs-2
zookeeper.connect=localhost:2181
# Broker 3
broker.id=3
listeners=PLAINTEXT://localhost:9094
num.partitions=3
log.dirs=/tmp/kafka-logs-3
zookeeper.connect=localhost:2181
In this example, each broker has a unique broker.id and listens on a different port for client
connections. The num.partitions property specifies the default number of partitions for new
topics, and log.dirs specifies the directory where Kafka should store its data on disk.
zookeeper.connect specifies the ZooKeeper connection string, which should point to the
ZooKeeper ensemble.
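Writing three nearly identical property files by hand is error-prone, so the pattern above can be sketched as a small generator. The helper names brokerProperties and hasUniqueSettings are hypothetical. The sketch assumes that all brokers run on one host, that ports count up from 9092, and that log directories follow the /tmp/kafka-logs-<id> pattern of the example. It only builds the text and does not write any files.

```javascript
// Sketch: build one server.properties text per broker on a single host.
// Assumption: ports count up from 9092 and log directories follow the
// /tmp/kafka-logs-<id> pattern used in the example configuration.
function brokerProperties(id, basePort = 9092) {
  return [
    `broker.id=${id}`,
    `listeners=PLAINTEXT://localhost:${basePort + id - 1}`,
    "num.partitions=3",
    `log.dirs=/tmp/kafka-logs-${id}`,
    "zookeeper.connect=localhost:2181",
  ].join("\n");
}

// Brokers on one machine need distinct ids, listener ports and log
// directories; this checks that no two configurations share a value.
function hasUniqueSettings(configs) {
  for (const key of ["broker.id", "listeners", "log.dirs"]) {
    const values = configs.map((text) =>
      text.split("\n").find((line) => line.startsWith(key + "=")));
    if (new Set(values).size !== configs.length) return false;
  }
  return true;
}

const configs = [1, 2, 3].map((id) => brokerProperties(id));
console.log(configs.join("\n\n"));
console.log("unique settings:", hasUniqueSettings(configs));
```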
3. Start the ZooKeeper service on each node. This is required for Kafka to function. You
can start ZooKeeper by running the following command:
bin/zookeeper-server-start.sh config/zookeeper.properties
This will start a single-node ZooKeeper instance using the default configuration.
4. Start the Kafka brokers on each node by running the kafka-server-start command
and specifying the location of the server.properties file. For example:
bin/kafka-server-start.sh config/server.properties
This will start the Kafka broker on the default port (9092) using the configuration in
config/server.properties.
5. Test the cluster by creating a topic, producing and consuming messages, and verifying
that they are replicated across all nodes. You can use the kafka-topics,
kafka-console-producer, and kafka-console-consumer command-line tools to perform
these tasks. For example, create a topic with three partitions and three replicas, produce
messages to the topic, and consume them from all three brokers. You can verify that the
messages are replicated across all nodes by stopping one of the brokers and observing that
the other brokers continue to serve messages.
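Why a replicated topic survives the loss of one broker can be sketched with a toy placement model. The functions assignReplicas and availablePartitions are hypothetical, and the simple round-robin placement is an assumption made for illustration; Kafka's real assignment and its in-sync replica tracking are more involved.

```javascript
// Toy replica placement (an assumption for illustration; Kafka's real
// assignment algorithm differs in detail).
function assignReplicas(brokerIds, partitions, replicationFactor) {
  // Kafka rejects a replication factor larger than the number of brokers.
  if (replicationFactor > brokerIds.length) {
    throw new Error("replication factor exceeds number of brokers");
  }
  const assignment = [];
  for (let p = 0; p < partitions; p++) {
    const replicas = [];
    for (let r = 0; r < replicationFactor; r++) {
      replicas.push(brokerIds[(p + r) % brokerIds.length]);
    }
    assignment.push(replicas); // first entry plays the role of the leader
  }
  return assignment;
}

// Simplification: a partition counts as available while at least one of
// its replicas lives on a running broker (in-sync tracking is ignored).
function availablePartitions(assignment, stoppedBrokers) {
  return assignment.filter((replicas) =>
    replicas.some((id) => !stoppedBrokers.includes(id))).length;
}

const replicated = assignReplicas([1, 2, 3], 3, 3);
const single = assignReplicas([1, 2, 3], 3, 1);
console.log(replicated);
console.log(availablePartitions(replicated, [2])); // every partition survives
console.log(availablePartitions(single, [2]));     // the partition on broker 2 is lost
```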
Creating new Broker-1
Edit: server-1.properties (change the id to 1)
broker.id=1
listeners=PLAINTEXT://localhost:9093
log.dirs=c:/kafka/kafka-logs-1
auto.create.topics.enable=false (optional)
Creating new Broker-2
Edit: server-2.properties (change the id to 2)
broker.id=2
listeners=PLAINTEXT://localhost:9094
log.dirs=c:/kafka/kafka-logs-2
auto.create.topics.enable=false
Starting up these 2 Kafka brokers
1. starting the first broker
.\bin\windows\kafka-server-start.bat .\config\server-1.properties
2. starting the second broker
.\bin\windows\kafka-server-start.bat .\config\server-2.properties
Kafka Cluster
→ Assuming the default broker from config\server.properties (port 9092) is also running, we have
now started 3 Kafka brokers, and a Kafka cluster with 3 brokers is up and running on our machine.
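It's time to create a new topic; then we will produce and consume messages with our new
cluster setup.
.\bin\windows\kafka-topics.bat --create --topic test-topic-replicated --zookeeper localhost:2181 --replication-factor 3 --partitions 3
Note: Kafka 3.0 and later no longer accept the --zookeeper option in kafka-topics; on those
versions use --bootstrap-server localhost:9092 instead.
Start a console producer and send a message:
.\bin\windows\kafka-console-producer.bat --broker-list localhost:9092 --topic test-topic-replicated
message sent: Hi
Instantiate a new consumer to receive the messages:
.\bin\windows\kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic test-topic-replicated --from-beginning
message received: Hi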
Every message we send is now received by the console consumer. The interesting part is that
we have three Kafka log folders, so let's check what is in them.
Log directories
• Close the producer console. You will see that the kafka-logs-1 and kafka-logs-2
directories have been created.
• Each broker gets its own folder, which is where it persists all the messages that are
produced to that broker. So we have three different directories, one for each broker.
Result :-
Thus the Kafka cluster was set up successfully and the basic operations were executed.