A New Digital Color Image Watermarking
Algorithm with its FPGA and ASIC Implementation

Shivdeep, Sudip Ghosh, Hafizur Rahaman
School of VLSI Technology, IIEST, Shibpur, Kolkata, India
sivdp1@gmail.com, sudip etc@yahoo.co.in, rahaman h@yahoo.co.in
Abstract: In a world full of visual content, digital image authentication has become an important concern, and digital image watermarking can play a key role in addressing it. Although several techniques and algorithms exist in the literature, color image watermarking techniques with a hardware implementation are few. This paper introduces a new algorithm for watermarking a color cover image with a color watermark. The basic technique alters the pixel values of the cover image based on the similarity between the cover image and the watermark. The amount of alteration is controlled by a parameter called the modulation index, which also determines the quality of both the watermarked cover image and the extracted watermark. A pseudo-noise code is used for embedding and extraction, so only authorized users holding the exact pseudo-noise code can extract the watermark. The technique is invisible, so it does not significantly affect the appearance of the original image. FPGA and ASIC based hardware implementations of the algorithm are also realized, since for real-time applications a hardware realization is more efficient than a software implementation. The proposed algorithm and its VLSI implementation are compared with state-of-the-art work in the literature. The throughput of the proposed design is high, so it can also be used for digital video watermarking.

Index Terms: ASIC, Color watermarking, Data hiding, FPGA, Hardware implementation

I. INTRODUCTION

Digital images fall into broad categories based on pixel depth and presentation, such as binary, grayscale and color images. Many watermarking algorithms can embed binary or grayscale watermarks. Because the amount of information increases rapidly for color images, embedding them becomes difficult and unreliable. In this paper, we present an algorithm to embed a 12 bits per pixel (bpp) color watermark into a 24 bpp color cover image. A basic property of a color image is that it consists of three grayscale or intensity images. Each pixel is defined in three intensity planes simultaneously, and these three values collectively decide the color of the pixel. We split RGB images into intensity planes, divide the planes into smaller segments, and then encode or decode information iteratively, one segment at a time. A software implementation in MATLAB checks the functionality and performance of the algorithm, whereas the ASIC and FPGA based designs check the feasibility of a hardware implementation. We also compare the efficiency of both implementations.

Section II describes the proposed algorithm, including embedding and extraction. Its software implementation, along with performance checks and attacks, is discussed in Section III. Section IV elaborates the proposed hardware architecture hierarchically, while the FPGA and ASIC implementations are given in Section V. Results are analyzed in Section VI, and Section VII concludes the paper.

II. PROPOSED ALGORITHM

A. Embedding Algorithm

Earlier work has modified either one [6] or two [3] LSBs, or an arbitrary bit-plane [4], of the cover image pixels. In the proposed algorithm, multiple bits may change depending upon the modulation index. The color depth of the watermark image is 12 bpp and that of the cover image is 24 bpp. The algorithm is developed for embedding a watermark image of size (64 x 64) pixels in a cover image of size (256 x 256) pixels. The color of every pixel is determined by three intensity values: Red, Green and Blue (RGB). A color image therefore consists of three intensity planes, one for each color. These intensity planes can be viewed as separate intensity images or 2D matrices. Each element of such a matrix is represented by 4 bits in the watermark image and by 8 bits in the cover image.

Fig. 1. Color image format.

Each plane of the watermark image is embedded into the corresponding plane of the cover image. In the same way, the watermark is extracted from each plane, and the extracted planes are combined to obtain the color watermark.

Inputs: Cover and Watermark images, MI, and PN codes;
Outputs: Watermarked image;
Step.1: Divide both the watermark and the cover image into 3 intensity planes and select a segment of size (8 x 8) pixels from the cover image and one of size (2 x 2) from the watermark
978-1-7281-6564-6/20/$31.00 ©2020 IEEE

image.
Step.2: Extract the MSB (Most Significant Bit) of each element of XC (a matrix from the cover image) into a 64-bit string, string1. Each element of wm is 4 bits, i.e. [w3 w2 w1 w0]; extrapolate it to 16 bits and store the result in string2. The extrapolation repeats w3 9 times and w2 5 times, while w1 and w0 occur once each.
Step.3: Divide both strings into 4 segments of equal length and compare the corresponding segments of each string bit-wise. If the segments match in 50% or more of their bits, call them in phase, otherwise out of phase, and generate 1 or 0 respectively. Store the results in another string named PM (positional match).
Step.4: Generate four binary PN (pseudo-noise) codes PN1, PN2, PN3 and PN4, each 64 bits long.
Step.5: Evaluate the watermarked image using the relation:

$$ XE = XC \pm MI \cdot PN_1 \pm MI \cdot PN_2 \pm MI \cdot PN_3 \pm MI \cdot PN_4 \tag{1} $$

where MI is the modulation index, the parameter that sets the amount of modification of the cover image, and XE and XC are the embedded and cover matrices respectively. The PN codes randomly select the pixels (i.e. elements of XC), and on the basis of the similarity between the cover and watermark images (PM), MI is either added to or subtracted from the selected pixel values. If PM(i) (the ith positional match) is 1, subtraction takes place, otherwise addition, for i = 1, 2, 3, 4.
Step.6: Repeat until the whole plane is encoded, do the same for all three planes, and group the three embedded planes to obtain the watermarked color image.

B. Decoding Algorithm

Inputs: Watermarked image and Pseudo-noise code;
Output: Extracted watermark image;
Step.1: Select a segment of size (8 x 8) pixels from an intensity plane of the watermarked color image and call it XE. Calculate its correlation coefficient with the pseudo-noise code using the relation:

$$ U = \frac{\sum_{i=1}^{64} XE(i) \cdot PN_1(i)}{64} \tag{2} $$

Step.2: Calculate four correlation coefficients, one for each pseudo-noise code: U(XE,PN1), U(XE,PN2), U(XE,PN3) and U(XE,PN4).
Step.3: Create a 64-bit string containing the MSBs of the elements of XE and divide it into 4 equal sets of 16 bits each, e(1), e(2), e(3) and e(4). Modify e(i) according to equation (3), where e'(i) denotes the bitwise complement of e(i):

$$ e(i) = \begin{cases} e(i), & U(i) = 1 \\ e'(i), & U(i) = 0 \end{cases} \tag{3} $$

Step.4: Check the bits of e(i). For the upper 9 bits, assign 0 if 0 is the majority, else 1; do the same for the next 5 bits. Keep the last 2 bits as they are. This yields the (2 x 2) watermark segment, with 4 bits for each element.
Step.5: Repeat steps (2-4) for all non-overlapping segments of the intensity plane to obtain a (64 x 64) matrix.
Step.6: Repeat steps (1-5) for the three intensity planes to obtain the three RGB components of the watermark, and group them to get the color watermark.

III. SOFTWARE IMPLEMENTATION OF PROPOSED ALGORITHM

The proposed algorithm was first implemented in MATLAB to check its functionality. The 24 bpp cover image of size (256 x 256) pixels and the 12 bpp watermark image of size (64 x 64) pixels are read and stored as 3D matrices. An outer loop selects one of the three intensity planes at a time, and an inner loop selects non-overlapping segments: (8 x 8) pixels from the cover plane and (2 x 2) pixels from the watermark plane. Each selected cover segment is encoded with its watermark segment until the whole image is watermarked. A 6-bit LFSR is used to generate the 64-bit pseudo-noise codes. Two examples in Fig. 2 depict the watermarking process.

Fig. 2. Illustration of embedding & extraction using test images.

The performance metrics Peak Signal to Noise Ratio (PSNR), 2D correlation (R) and Structural Similarity Index (SSIM) are calculated to assess the images produced by embedding and extraction [2]. PSNR measures the quality of an image reconstructed from lossy processing, that is, the fidelity of the representation. The 2D correlation R measures the element-wise similarity between two arrays. SSIM also measures the similarity between the original and the encoded or decoded image, but it is perceptually aware because it accounts for the neighborhood of each element.

$$ MSE = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} [A(i,j) - B(i,j)]^2 \tag{4} $$

where MSE is the mean square error and A and B are images of order (m x n);

$$ PSNR = 10 \log_{10} \left( \frac{255^2}{MSE} \right) \tag{5} $$

$$ R = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n} (A_{ij} - \bar{A})(B_{ij} - \bar{B})}{\sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} (A_{ij} - \bar{A})^2 \; \sum_{i=1}^{m} \sum_{j=1}^{n} (B_{ij} - \bar{B})^2}} \tag{6} $$

where \bar{A} and \bar{B} are the mean values of A and B, and R is the correlation;

$$ SSIM(A, B) = \frac{(2 \mu_A \mu_B + C_1)(2 \sigma_{AB} + C_2)}{(\mu_A^2 + \mu_B^2 + C_1)(\sigma_A^2 + \sigma_B^2 + C_2)} \tag{7} $$

where \mu_A, \mu_B, \sigma_A, \sigma_B and \sigma_{AB} are the local means, standard deviations, and cross-covariance for images A and B, and C_1 and C_2 are regularization constants.
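The definitions of MSE, PSNR and the 2D correlation R in equations (4) to (6) can be illustrated with a short sketch. It is written in plain Python rather than the MATLAB used for the paper, the function names are hypothetical, and the tiny 2 x 2 arrays are made-up sample data rather than results from the paper. SSIM is omitted because its window and constants are not specified here.

```python
import math

def mse(a, b):
    """Mean square error of two equally sized 2D lists, as in Eq. (4)."""
    m, n = len(a), len(a[0])
    return sum((a[i][j] - b[i][j]) ** 2
               for i in range(m) for j in range(n)) / (m * n)

def psnr(a, b, peak=255):
    """PSNR in dB for 8-bit data, as in Eq. (5); infinite for identical inputs."""
    err = mse(a, b)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)

def corr2(a, b):
    """2D correlation coefficient R, as in Eq. (6).
    Assumes neither input is constant (otherwise the denominator is zero)."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    mean_a, mean_b = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in fa) *
                    sum((y - mean_b) ** 2 for y in fb))
    return num / den

# Made-up sample data: every pixel differs by exactly 1.
cover = [[10, 20], [30, 40]]
marked = [[11, 19], [31, 39]]
print(mse(cover, marked))             # 1.0
print(round(psnr(cover, marked), 2))  # 48.13
print(corr2(cover, marked))           # slightly below 1
```

A uniform error of one grey level gives an MSE of 1 and a PSNR of about 48 dB, which is a useful reference point when reading PSNR values for different modulation indices.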
TABLE I
EFFECT OF MI ON QUALITY OF WATERMARKED IMAGE

| Comparison | MI | PSNR (dB) | SSIM | Correlation (R) |
|---|---|---|---|---|
| Cover image versus watermarked image | 1 | 42.1102 | 0.9992 | 0.9773 |
| Cover image versus watermarked image | 6 | 25.2082 | 0.9673 | 0.5469 |
| Cover image versus watermarked image | 10 | 21.2823 | 0.9247 | 0.3701 |
| Original versus extracted watermark | 1 | 5.8869 | 0.2319 | 0.2473 |
| Original versus extracted watermark | 6 | 7.1596 | 0.3990 | 0.4582 |
| Original versus extracted watermark | 10 | 17.8288 | 0.9209 | 0.9456 |

A. Robustness

The proposed algorithm is robust against noise such as salt & pepper, Gaussian, Poisson and speckle; it can also withstand JPEG compression, crop, brightness adjustment, scaling and overwriting attacks. Watermarks are extracted before and after various attacks on the embedded image, and the correlations between the watermarks extracted with and without attack are given in Table II.

TABLE II
IMPACT OF VARIOUS ATTACKS ON QUALITY OF EXTRACTED WATERMARKS

| Attack | Correlation (R) | SSIM |
|---|---|---|
| Salt & Pepper (30%) | 0.8968 | 0.8249 |
| JPEG compression | 0.8178 | 0.7892 |
| Gaussian (0 mean) | 0.8532 | 0.7871 |
| Poisson | 0.8780 | 0.8277 |
| Speckle (0.01) | 0.8591 | 0.7677 |
| Crop (10%) | 0.8949 | 0.9014 |
| Brightness (1%) | 0.9659 | 0.9521 |
| Overwriting | 0.9155 | 0.8792 |

Fig. 3. (a) without attack; after attack: (b) salt & pepper, (c) JPEG compression, (d) Gaussian, (e) Poisson, (f) Speckle, (g) Crop, (h) Brightness.

IV. PROPOSED HARDWARE ARCHITECTURE

The hierarchical architecture has an 8-bit data bus, encode and decode pins to enable the embedding or extraction unit, and clock and active-low clear inputs. "Data out" is a shared 8-bit output bus that delivers embedded or extracted data, and the "Ready" signal acknowledges completion of a task, as shown in Fig. 4.

Fig. 4. Top Level

Fig. 5 shows the embedding unit. It takes the 8-bit "XC" input and the 4-bit "wm in", "MI" and "PN" inputs, which deliver the watermark image pixel, modulation index and pseudo-noise code to dedicated registers. The MSB register stores the MSB of each pixel and feeds them in parallel to the next block, Positional Match. The 4-bit "wm in" is converted to 16 bits and fed to the same block, which computes the positional match between the two strings and produces the 4-bit "PM". The encoder block uses "MI", PM and PN to encode "XC" and produces the 8-bit "XE" as output.

In Fig. 6, the 16-bit MSB string and the extrapolated "wm in" are XORed and the bits of the result are added together. The resulting sum is compared with 8 to check for a majority of ones, and the 1-bit result is stored in the PM register. The 4-bit "PM" code is built up in the "PM" register by repeating this for each segment.

"XC", "MI", "PN" and the "PM" generated by the positional match block are fed to the encoder block, as shown in Fig. 7. A multiplexer (MUX 2:1) passes "MI" if "PN" is high; otherwise a 4-bit 0 passes on to the adder/subtractor block. The "PM" code controls the signed addition/subtraction, and this combinational block evaluates equation (1). The result can be smaller than 0 or greater than 255, which is not valid for an 8-bit pixel, so a MUX 4:1 caps the result to that range and produces "XE".

Fig. 5. Embedding Unit
Fig. 6. Positional Match
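As a software cross-check of the positional match and encoder blocks, Steps 2 to 5 of the embedding algorithm in Section II-A can be sketched for one segment. This is an illustrative Python sketch, not the authors' MATLAB or Verilog code. The function names, the row-major flattening of the (8 x 8) segment into 64 values, the representation of each PN code as a list of 64 bits, and the toy PN patterns in the usage lines are all assumptions made for the example.

```python
def extrapolate(w):
    """Expand a 4-bit watermark pixel [w3 w2 w1 w0] to 16 bits:
    w3 nine times, w2 five times, then w1 and w0 once each (Step 2)."""
    w3, w2, w1, w0 = (w >> 3) & 1, (w >> 2) & 1, (w >> 1) & 1, w & 1
    return [w3] * 9 + [w2] * 5 + [w1, w0]

def positional_match(xc, wm):
    """xc: 64 cover pixels (0..255), assumed flattened row by row.
    wm: four 4-bit watermark pixels. Returns the 4-bit PM list (Step 3)."""
    msb = [p >> 7 for p in xc]                      # string1: one MSB per pixel
    pm = []
    for i, w in enumerate(wm):
        segment = msb[16 * i:16 * (i + 1)]
        mismatches = sum(a ^ b for a, b in zip(segment, extrapolate(w)))
        pm.append(1 if 16 - mismatches >= 8 else 0)  # 50% or more bits match
    return pm

def embed(xc, wm, mi, pn):
    """Eq. (1) for one segment. pn holds four 64-bit codes as 0/1 lists.
    Where PN_i selects a pixel, MI is subtracted if PM(i) = 1, else added.
    The final value is clamped to the valid 8-bit range 0..255."""
    pm = positional_match(xc, wm)
    xe = []
    for k, pixel in enumerate(xc):
        value = pixel
        for i in range(4):
            if pn[i][k]:
                value += -mi if pm[i] else mi
        xe.append(min(255, max(0, value)))
    return xe, pm

# Toy inputs (assumed, not from the paper): PN_i selects every 4th pixel.
xc = [200] * 16 + [50] * 16 + [254] * 16 + [3] * 16
wm = [0b1111, 0b1111, 0b0000, 0b0011]
pn = [[1 if k % 4 == i else 0 for k in range(64)] for i in range(4)]
xe, pm = embed(xc, wm, 6, pn)
print(pm)         # [1, 0, 0, 1]
print(xe[32:36])  # [248, 255, 255, 248]  (260 is capped at 255)
```

The clamp at the end plays the role of the MUX 4:1 stage: without it, pixels near 0 or 255 would wrap around and produce visible artifacts.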
Fig. 7. Encoder

The extraction unit in Fig. 8 has two major blocks: the correlation factor block and the decoder. The 8-bit "XE" and 4-bit "PN" codes are fed to the correlation factor block, which generates the 4-bit "CF" code. The MSB register stores the MSBs of "XE", and the decoder takes the 64-bit parallel MSB string and the 4-bit "CF" to compute the 4-bit watermark pixel values.

Fig. 8. Extraction Unit

The accumulator sub-block takes the 64 "XE" values serially and adds each one to, or subtracts it from, the value stored in its register. "PN" decides between addition and subtraction in equation (2). The MSB of the accumulator output is inverted and stored in the "CF" register, as shown in Fig. 9.

Fig. 9. Correlation Factor

A MUX 2:1 multiplexes the 64-bit MSB string into 32 bits and the 4-bit "CF" into 2 bits. Further, 16 bits out of the 32-bit MSB string and 1 bit out of the 2-bit "CF" are fed to each of two identical "WM" blocks, so that two watermark pixels are produced at a time, as shown in Fig. 10.

Fig. 10. Decoder

The MUX 2:1 in Fig. 11 selects between the inverted and non-inverted copies of the 16-bit MSB string, using "CF" as the select line. The most significant 9 bits and the next 5 bits of the 16-bit MUX output feed accumulators that add up the bits, and comparators compare the results with 5 and 3 respectively. The "WM" register stores the 1-bit result from each comparator together with the least significant 2 bits of the MUX output.

Fig. 11. WM
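The per-pixel recovery performed by the "WM" block, which corresponds to Steps 3 and 4 of the decoding algorithm in Section II-B, can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation. The function name and the list-of-bits representation are assumptions, and the correlation-factor bit is taken as a given input rather than computed from equation (2).

```python
def decode_pixel(e, cf):
    """Recover one 4-bit watermark pixel from a 16-bit MSB segment.

    e  : list of 16 bits (MSBs of 16 watermarked pixels), most significant first
    cf : correlation-factor bit for this segment; 0 selects the inverted copy
    """
    if cf == 0:
        e = [1 - bit for bit in e]            # Eq. (3): use the complement
    w3 = 1 if sum(e[:9]) >= 5 else 0          # majority of the upper 9 bits
    w2 = 1 if sum(e[9:14]) >= 3 else 0        # majority of the next 5 bits
    w1, w0 = e[14], e[15]                     # last 2 bits are kept as they are
    return (w3 << 3) | (w2 << 2) | (w1 << 1) | w0

# A segment carrying the pixel 0b1010 with three flipped bits among the 9 copies of w3
noisy = [1, 1, 0, 1, 0, 1, 1, 0, 1] + [0, 0, 1, 0, 0] + [1, 0]
print(decode_pixel(noisy, 1))                       # 10
print(decode_pixel([1 - b for b in noisy], 0))      # 10
```

The thresholds 5 and 3 are the comparator constants of Fig. 11. The repetition gives w3 and w2 some tolerance to flipped MSBs, whereas w1 and w0 are carried by a single bit each.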
V. HARDWARE IMPLEMENTATION

A. FPGA Based Hardware Implementation

A Xilinx Spartan 3E FPGA kit with the XC3S500E device is used. The Xilinx ISE 14.7 Project Navigator, the XST synthesis tool and the ISim simulator are used for interfacing, RTL synthesis and simulation respectively. Power analysis was done using XPower Analyzer.

Verilog HDL is used to describe the model and to create test benches. Every block and register is described and simulated separately, and then the full encoder and extraction units are designed hierarchically. Data flow is controlled by a controller block, which generates control signals to enable registers at specific clock cycles. The same clock is used throughout the model. The top-level test bench provides data inputs in the proper sequence to verify the design functionally. Input, output, clock and clear signals were observed in the ISim simulator.

Resource, timing and power reports are given in Table III.

TABLE III
RESOURCE, TIMING, AND POWER REPORTS OF FPGA IMPLEMENTATION

Resource Utilization Summary

| Logic Utilization | Used | Available | Utilization |
|---|---|---|---|
| Total Slice Registers | 195 | 9312 | 2% |
| Flip Flops | 188 | | |
| 4 input LUTs | 327 | 9312 | 3% |
| Occupied Slices | 228 | 4656 | 4% |
| Slices containing only related logic | 228 | 228 | 100% |
| Slices containing unrelated logic | 0 | 228 | 0% |
| 4 input LUTs (total) | 340 | 9312 | 3% |
| Logic | 327 | | |
| Route-thru | 13 | | |
| Bonded IOBs | 20 | 66 | 30% |
| BUFGMUXs | 1 | 24 | 4% |
| Average Fanout of Non-Clock Nets | 3.22 | | |

Timing Information

| Parameter | Value |
|---|---|
| Maximum time delay | 16.817 ns |
| Clock frequency | 59.46 MHz |
| Latency (Encoder) | 76 clock cycles x 16.81 ns = 1.28 µs |
| Latency (Decoder) | 66 clock cycles x 16.81 ns = 1.11 µs |
| Throughput (Encoder) | 216 images/sec |
| Throughput (Decoder) | 746 images/sec |

Power Consumption

| Component | Power |
|---|---|
| Leakage Power | 82 mW |
| Logic Power | 0.56 mW |
| Signal Power | 0.35 mW |
| IO Power | 2.37 mW |
| Total Power | 85.28 mW |

Throughput is 216 images per second for the encoder and 746 images per second for the decoder, so the algorithm can also be used for larger cover images. A larger cover image can be partitioned into non-overlapping segments of size (256 x 256) pixels, with a copy of the watermark embedded in each segment. Intuitively, the multiple copies make successful extraction more probable. Furthermore, commonly used video frame rates lie between 24 and 120 frames per second. Since a throughput of 216 images per second can be achieved, the algorithm is also suitable for watermarking video.

B. ASIC Implementation

The semi-custom ASIC implementation of the algorithm is done using the Encounter and Innovus tools of licensed Cadence EDA software. A total of 1198 cells from the SCL library at the 180 nm technology node were placed on a core area of 0.045 mm² (total chip area 0.054 mm², Table IV). The maximum delay of the critical path is 15.671 ns, hence a clock of 63.8 MHz is used to simulate the design and analyze its power consumption. The total power is 0.507 mW, far lower than the 85.28 mW total of the FPGA implementation, as expected for a dedicated ASIC. Reports are given in Table IV.

Cells from the SCL library are routed using four metal layers, namely M1 (blue), M2 (red), M3 (green) and top m (yellow). The power ring uses M3 and top m, and the cell rows have power stripes in M1. The physical layout is given in Fig. 13. The IO pins are at the die boundary.

TABLE IV
RESOURCE, TIMING, AND POWER REPORTS OF ASIC IMPLEMENTATION

Resource Utilization Summary

| Parameter | Value |
|---|---|
| # Instances | 1198 |
| # Std Cells | 1198 |
| # Net | 1342 |
| # IO Pins | 20 |
| # Routing Layers | 4 |
| SDC max tran | 2700.0 ps |
| Total area of Core | 45020.416 µm² |
| Total area of Chip | 53984.358 µm² |
| % Core Density (Counting Std Cells and MACROs) | 70.047% |

Timing Information

| Parameter | Value |
|---|---|
| Maximum time delay in critical path | 15.671 ns |
| Clock frequency | 63.8 MHz |

Power Consumption

| Component | Power |
|---|---|
| Internal Power | 0.312 mW |
| Switching Power | 0.194 mW |
| Total power | 0.507 mW |
| Leakage power | 0.4 µW |

VI. RESULTS AND ANALYSIS

The proposed algorithm is imperceptible, as shown in Fig. 2. Being a spatial domain technique, it modifies the image directly, so increasing the modulation index improves watermark extraction while perceptibly degrading the watermarked image. MI therefore needs to be kept within a range. This trade-off can be seen in Table I and in the plots given in Fig. 12.

The algorithm is robust: the watermark can be extracted after an attack on the watermarked image, as shown in Fig. 3, and the extracted watermark keeps a correlation of at least 0.8178 with the unattacked extraction for every attack listed in Table II.

The hardware architecture has a common 8-bit data bus, which reduces the number of IO ports, and its hardware utilization is low; hence the proposed design is efficient.

Area, power and speed of various hardware implementations reported in the literature [1][3][4][5][7][8] are given for reference in Table V. Papers proposing similar work are very few, and their implementations use different technologies, so a direct comparison with our work is difficult. Overall, the proposed design performs well in both its FPGA and ASIC forms.
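The timing entries reported in Tables III and IV can be cross-checked with a few lines of arithmetic. The sketch below is illustrative Python with hypothetical helper names. It only recomputes the clock frequencies from the critical-path delays and the latencies from the cycle counts; it does not attempt to derive the reported images-per-second figures, because the number of cycles per full image is not stated.

```python
def clock_mhz(critical_path_ns):
    """Maximum clock frequency in MHz for a given critical-path delay in ns."""
    return 1e3 / critical_path_ns

def latency_us(cycles, period_ns):
    """Latency in microseconds for a number of clock cycles."""
    return cycles * period_ns / 1e3

FPGA_PERIOD_NS = 16.817   # Table III, maximum time delay
ASIC_PERIOD_NS = 15.671   # Table IV, critical path

print(round(clock_mhz(FPGA_PERIOD_NS), 2))       # 59.46
print(round(clock_mhz(ASIC_PERIOD_NS), 2))       # 63.81
print(round(latency_us(76, FPGA_PERIOD_NS), 2))  # 1.28 (encoder)
print(round(latency_us(66, FPGA_PERIOD_NS), 2))  # 1.11 (decoder)

# The reported encoder throughput exceeds the highest common video frame rate.
print(216 >= 120)                                # True
```

The recomputed values agree with the reported 59.46 MHz and 63.8 MHz clocks and with the 1.28 µs and 1.11 µs latencies.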
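TABLE V
HARDWARE IMPLEMENTATIONS REPORTED IN LITERATURE

| Research Work | Processing Domain | FPGA/ASIC Technology | Area (mm²) / LUT / LE | #FF | Power (mW) | Frequency (MHz) |
|---|---|---|---|---|---|---|
| [1,7] | Invisible, Spatial | FPGA (Cyclone IV E) | 4399 | 1969 | 201.72 | 88.69 |
| | | ASIC (90 nm) | 1.52 | - | 4.69 | 181.82 |
| [4] | Invisible, Spatial | ASIC (350 nm) | 213.54 | - | 2.05 | 545 |
| [5] | Invisible, Spatial | FPGA (Virtex-E) | 788 | 279 | - | - |
| | | ASIC (130 nm) | 0.001325 | - | 0.404 | 100 |
| [3] | Invisible, Spatial | ASIC (130 nm) | 0.286 | - | 9.19 | 166.6 |
| [8] | Invisible, Spatial | FPGA (Virtex 6) | 2596 | 161 | - | - |
| Proposed work | Invisible, Spatial | FPGA (Spartan 3E) | 340 | 188 | 3.28 | 59.46 |
| | | ASIC (180 nm) | 0.054 | - | 0.51 | 63.80 |

The FPGA power listed for the proposed work (3.28 mW) equals the sum of the logic, signal and IO power in Table III (0.56 + 0.35 + 2.37 mW) and excludes the 82 mW leakage power.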
Fig. 12. 2D correlation, PSNR and SSIM
VII. CONCLUSION

A new algorithm for invisible and robust color image watermarking has been proposed, together with its software, FPGA and ASIC implementations. The algorithm has been evaluated using various test images and performance metrics, along with several attacks. A specific hardware architecture has been proposed, and its FPGA and ASIC implementations have been evaluated in terms of resources used, area, speed and power. The proposed design runs at a lower clock frequency than some reported designs, but its throughput is high enough for video applications.

Fig. 13. Physical layout

REFERENCES

[1] K. Pexaras, I. G. Karybali, and E. Kalligeros, "Optimization and Hardware Implementation of Image and Video Watermarking for Low-Cost Applications," IEEE Transactions on Circuits and Systems, vol. 66, pp. 2088-2101, 2019.
[2] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, April 2004.
[3] A. Garimella, M. V. V. Satyanarayana, P. S. Murugesh, and U. C. Niranjan, "ASIC for digital color image watermarking," 3rd IEEE Signal Processing Education Workshop, 2004 IEEE 11th Digital Signal Processing Workshop, Taos Ski Valley, NM, USA, 2004, pp. 292-296.
[4] S. P. Mohanty, E. Kougianos, and N. Ranganathan, "VLSI architecture and chip for combined invisible robust and fragile watermarking," IET Computers & Digital Techniques, vol. 1, no. 5, pp. 600-611, Sept. 2007.
[5] Karthi Kumar and Baskaran Kaliaperumal, "FPGA and ASIC implementation of robust invisible binary image watermarking algorithm using connectivity preserving criteria," Microelectronics Journal, 2011.
[6] Priyadharshini J. and R. S. Sabeenian, "Digital watermarking for image using FELICS algorithm in VLSI implementation," 2014 International Conference on Electronics and Communication Systems (ICECS), Coimbatore, 2014, pp. 1-4.
[7] K. Pexaras, C. Tsiourakis, I. G. Karybali, and E. Kalligeros, "Optimization and hardware implementation of image watermarking for low cost applications," 2017 24th IEEE International Conference on Electronics, Circuits and Systems (ICECS), Batumi, 2017, pp. 347-350.
[8] J. Joseph, A. Chalil, and G. G. Dath, "Publicly Verifiable Digital Watermarking Technique for Copyright Property Protection," 2018 3rd International Conference on Communication and Electronics Systems (ICCES), Coimbatore, India, 2018, pp. 104-108.
