Intellectual Property Protection Via Hierarchical Watermarking
Edoardo Charbon and Ilhami Torunoglu
Cadence Design Systems Inc., San Jose, California, 95134
Abstract
Intellectual property copyrights are protected by means of patents and nondisclosure agreements. In many cases, however, copyright laws cannot be
effectively enforced due to the difficulty of proving or even detecting infringement. This problem is addressed in this paper using a scheme known
as watermarking. The method consists of implanting a semi-transparent
unique signature in the circuit’s intrinsic structure, without disrupting its
functionality. A formalization of the watermarking problem is presented
in the context of hierarchical IC design. Algorithms are proposed for
implanting and detecting watermarks. The concepts of robustness against
forgery and theft tracking are analyzed and examples are presented to show
the feasibility of the approach.
1 Introduction
Employing well-known techniques, it is possible today to reverse-engineer virtually any IC design, given a sufficient pool of samples
and enough time. With the explosion of the commerce in electronic
Intellectual Properties (IPs) and the emergence of system-on-chip
design paradigms, the industry will soon need to fight widespread
copyright infringement. Direct IP theft and abuse prevention, if
ever possible, is impractical and very expensive. A valid alternative is based on deterrence. Watermarking is an ancient process,
traditionally used in banknotes and other documents to discourage
counterfeiting. It consists of printing semi-transparent symbols
embedded on paper. Similar concepts have been applied to digital
audio-visual IPs [1, 2].
Watermarking, applied to IC design, has been recently proposed in [3, 4]. In [3] we have proposed to simultaneously create
watermarks in several design abstractions. The method, which is
adopted in this paper as well, is particularly resilient to tampering and redesign due to its inherently distributed nature. In fact,
several possibly independent abstraction levels are involved in
the process, from the algorithm to the implementation’s high-level
description, from RTL to gate-level design, from the net-list to
physical design. If a watermark is deleted at one abstraction level,
all hierarchically higher levels remain intact, thus allowing watermark recovery. Moreover, forgery can be traced to the source,
since watermarks at the lowest abstraction levels are associated
with the last licensees who ultimately caused the breach. Figure 1
illustrates the main abstraction levels at which watermarks can be
created.
Figure 1: Abstraction levels affected by watermarking (Algorithm, Description Language, RTL, Unstructured Connectivity, Physical Design)
In this paper a deterrence scheme based on watermarking is
proposed, which can potentially be used at any abstraction level.
The scheme consists of two phases: watermark synthesis and
watermark detection. A reasonable degree of protection against
copyright infringement can be obtained if these tasks can be performed efficiently and accurately. Watermark synthesis is defined
by (a) a set of algorithms translating design features onto a unique
watermark for a given key, (b) the worst-case time required
to forge and/or delete the watermark, and (c) $P_u$, the odds that a
design carries an unintended watermark. Detection is defined by
(d) $P_m$, the probability that an embedded watermark is not detected,
also known as miss rate, and (e) the probability that
a watermark is found in a design lacking one, also known as false
alarm rate. Typical specifications of a protection scheme could
be:
2, $10^{-30}$, $10^{-6}$. In general,
watermarks can be designed to be inherently redundant, in order to
combat partial forging and/or deletion. Redundancy is designed
to boost the confidence in positive watermark identification, by
reducing the error probabilities defined above.
The paper is organized as follows. The problem is formulated
in Section 2. Sections 3, 4 and 5 outline watermark synthesis and
detection techniques at various abstraction levels. Examples are
shown in Section 6.
2 Problem Formulation
Let $\Sigma^*$ be the set of all strings in a finite alphabet $\Sigma$, e.g.
$\Sigma = \{0, 1\}$. Assume there exists a compact representation or signature
of a given design at a subset $A' \subseteq A$ of abstraction levels. Let
$\sigma_i \in \Sigma^*$ be a signature at abstraction level $i \in A'$. Let us define signature
mapping $M_i$ as the mapping of abstraction-dependent design
features onto a signature:
$M_i : I \mapsto \sigma_i$ (1)
where $I$ is one of all possible implementations of the design.
Note that $M_i$ is a non-unique, possibly lossy mapping
and may or may not be defined for all abstraction levels. Moreover, once a signature is derived at a specific abstraction level, its
contents/inherent structure is abstraction-independent.
Let $\Phi$ be a mapping which transforms implementation $I$ onto
another one, say $I'$:
$\Phi : I \mapsto I'$ (2)
Figure 2: Bubbles and rough routes (panels (a) and (b); labels: bubble, wire, obstacle, m1, m2, A, B)
Figure 3: Two-layer topological routing: (a) relative location of
bubbles; (b) separate triangulations, one for each layer
If $M_i(\Phi(I)) = M_i(I)$, then $\Phi$ is called signature-invariant. If $\Phi$
is signature-invariant over a partition of the design, it is called
partially signature-invariant.
Let watermark set $W \subset \Sigma^*$ be a collection of strings over $\Sigma$
associated with the design signatures at all abstraction levels. Let
us now select a special string $k \in \Sigma^*$. Define synthesis algorithm
$\Omega$ as the mapping of signatures onto a watermark:
$\Omega : (\sigma, k) \mapsto w$ (3)
where $\sigma$ represents a generic signature at any one abstraction level
and $w \in W$ its $k$-coded watermark. Note that, due to the abstraction
independence of $\sigma$, $\Omega$ is a generic algorithm, while $k$ is a design-specific key known solely by the IP provider.
3 Signature Mapping
Consider an implementation $I$ of a given design and define its
granularity for a particular abstraction level. Call atomic blocks
those modules or devices which cannot be further sub-divided and
define $O$ as the set of all such blocks and their instantiations.
Most layout implementations are associated with a topology.
A topology describes the relative position and orientation of any
object pair $(o_1, o_2) \in O^2$. Let us represent the atomic blocks of
a layout in terms of primitives called bubbles using a dedicated
mapping [5]. A bubble is a point associated with a given layer.
Let $B$ be the set of all bubbles in the design. Let the centerline or
path of a wire be a continuous curve of finite length which begins
and ends in a bubble. Figure 2(a) depicts bubbles and wires. One
can easily recognize that all the structures present in a layout may
be represented in terms of the described primitives [6]. Moreover,
note that $|B|$ grows linearly with the number of atomic blocks.
Let us define rough routing as the specification of a continuous
finite-length curve for each wire. Moreover, let topological routing
be an equivalence class of rough routings connecting its pins. Two
rough routings of the wire are equivalent when one can be obtained
from the other by continuous deformation with no violations of
any of the scaled design rules. Define the edge on a given layer
to be the line segment joining the centers of two bubbles on the
layer. A wire and an edge are said to intersect topologically when
the wire intersects the edge in every rough routing of a given
topological routing.
Let us partition any given layer into simply connected regions
which contain no bubbles and whose boundaries are finite sets of
edges. Let us define $T_l$ for layer $l$ as a set of all simply connected
regions satisfying the properties of a triangulation. The reader
is referred to [7] for details on triangulation properties. Let us
assume now that an arbitrary triangulation $T_l$ is in place for each
layer $l$ and let $B_l$ be the set of all bubbles associated with $T_l$.
For convenience, although not needed, let us set four bubbles at
the extremities of the union of all the layers, so as to encompass
every layer. Moreover, assume that every rough routing has a
source and a sink, i.e. two bubbles representing the beginning
and the end of the route. Rough routings can now be represented
as a sequence of edges which are crossed by traveling from the
source to the sink. As an illustration, consider rough route $A$ in
Figure 2(b). Its representation according to the above convention
is: $\{6, e_{46}, e_{67}, e_{67}, e_{67}, e_{67}, e_{47}, e_{14}, e_{13}, e_{13}, 3\}$, where bubbles 6 and
3 are respectively the source and sink, while $e_{ij}$ is the
edge connecting vertices $i$ and $j$. All the bubbles are named
sequentially a priori through an arbitrary scheme. Correct bubble
identification is always guaranteed provided that such a scheme is
consistently used.
Figure 4: Mapping repetitive structures
This representation captures the topology of a rough route
in a non-unique fashion. To eliminate the problem, one needs
to convert the sequence into a canonical form. This is done
simply by removing adjacent identical edges, which form so-called
loops. For example, the canonical form for rough route $A$ is:
$\{6, e_{46}, e_{47}, e_{14}, 3\}$, which is a unique sequence for the topological
routing associated with both rough routes $A$ and $B$. The unique
canonical form of an arbitrary topological routing is called topological signature $\sigma^t$.
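The loop-removal step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `canonical_form` and the string labels for edges are assumptions, the source and sink bubbles are kept outside the list, and the use of a stack (so that removing one loop may expose another) is one plausible reading of "removing adjacent identical edges".

```python
def canonical_form(crossings):
    """Cancel adjacent identical edge crossings (loops) in a rough route.

    `crossings` lists the edges crossed from source to sink, e.g. "67"
    for the edge joining bubbles 6 and 7 (labels are hypothetical).
    """
    stack = []
    for edge in crossings:
        if stack and stack[-1] == edge:
            stack.pop()          # same edge crossed twice in a row: a loop
        else:
            stack.append(edge)
    return stack


# Edge sequence of the rough route quoted in the text (source 6, sink 3).
route = ["46", "67", "67", "67", "67", "47", "14", "13", "13"]
print(canonical_form(route))     # ['46', '47', '14']
```

The result matches the canonical sequence given in the text once the source and sink bubbles are put back at both ends.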
The triangulation of any given design is not unique. In fact
some but not all of the bubbles may be implemented in every
layer, and every layer may or may not have different triangulation
schemes. Consider now the implementation of the interconnect
wiring of Figure 3(a). Let us assign a bubble to the pins and the
Steiner point. A possible triangulation is shown in Figure 3(b)
for both layers. In this case, the topological signatures for both
layers according to the usual convention are: $\{1, e_{23}, 4, e_{67}, 8\}$ and
$\{9, e_{25}, e_{54}, 4\}$, respectively.
Standard cell row-based layouts (see Figure 4) can usually be
converted into a more compact signature. Due to the simple
one-dimensional structure of the layout, topology is captured by
a simple sequence of symbols representing the circuit instances.
For the example of Figure 4, the corresponding signature is
1 2 3 1 1 3 3 2 1 1 3 1 2 2 2 3 3 3 3 2 1 1 3 3 1 3 2 1 2 1 .
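As a rough sketch of how such a row signature could be extracted, the Python fragment below orders the cells of one row by position and emits their type symbols. The `(x_position, cell_type)` tuple format and the helper names are assumptions made for illustration. The sketch also shows the notion of signature invariance from Section 2: a uniform shift of the row leaves the signature unchanged, whereas swapping two cells does not.

```python
def row_signature(cells):
    """Return the cell-type symbols of one row, ordered left to right.

    `cells` is a list of (x_position, cell_type) pairs (hypothetical format).
    """
    return [cell_type for _, cell_type in sorted(cells, key=lambda c: c[0])]


def shift_row(cells, dx):
    """Move every cell by the same offset; relative order is preserved."""
    return [(x + dx, cell_type) for x, cell_type in cells]


row = [(30, 3), (0, 1), (10, 2), (45, 1)]
swapped = [(0, 2), (10, 1), (30, 3), (45, 1)]   # first two cells exchanged

print(row_signature(row))                                    # [1, 2, 3, 1]
print(row_signature(shift_row(row, 7)) == row_signature(row))  # True
print(row_signature(swapped) == row_signature(row))            # False
```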
Signatures associated with unstructured connectivities can be
constructed in a similar fashion. For instance, RTL and gate-level
designs can be represented through a graph $G(V, E)$. The nodes
of the graph correspond to general blocks, or single gates, as well
as nets. The (directed) edges define connectivity. Let us define
$V_b$ as the set of all blocks in $G$. Let $V_n$ be the set of all nets,
with $V = V_b \cup V_n$. Let $E^o_n$ be the set of edges in net $n \in V_n$
which are connected to an output. The set of edges leading to
an input is called $E^i_n$, while the set of edges connected to a high-impedance
pin or pass transistor gate is called $E^z_n$. For simplicity,
we assume that exactly one edge can be connected to an output,
i.e. $|E^o_n| = 1$; this condition is however not necessary. The
pin number and the type and port of the gates connected by $n$
are necessary but not sufficient properties to uniquely identify
the net. A set of constraints on sets $E^o_n$, $E^i_n$ and $E^z_n$ for each
net can be imposed so as to make these properties define the
net uniquely, to all practical purposes. Consider the gate-level
circuit in Figure 5(a) and the corresponding connectivity graph of
Figure 5(b). Let $P = \{p : p \text{ is prime}\}$. Next,
impose a set of constraints (4) on each net $n$. The constraints involve
the gates of given types connected by $n$ and net size-dependent
parameters, generated using, for example, a parametrized pseudorandom
sequence determined by key $k$. The signature associated
with the design unstructured connectivity, known as constraint-based
signature, is represented by a set of equations. In the case
of Figure 5, Equations (4), written compactly, form the constraint-based
signature for a net as a tuple of the form $(\ldots;\ 3;\ 0)$.
Figure 5: (a) Gate-level circuit; (b) Connectivity graph
4 Watermark Synthesis
Several mechanisms or any combination of them can be used to
implement algorithm $\Omega$. All these mechanisms are aimed at selectively
eliminating parts of the signature according to a scheme
controlled by $k$. Let us compactly represent a generalized signature
associated with a given design at various abstraction levels as
a vector
$\sigma = [\sigma^t_1\ \sigma^t_2\ \cdots\ \sigma^c_1\ \sigma^c_2\ \cdots]^T$ (5)
where $\sigma^t_i$ and $\sigma^c_i$ are topological and constraint-based signatures.
The signatures, whose entries are elements of $\Sigma^*$, represent
all abstraction levels and result from the composition of partial signatures
and arbitrary many-to-many mappings. As an illustration, consider again the
two-layer topological routing of Figure 3(a). The resulting signatures,
represented as vectors, are: $\sigma_1 = [1\ e_{23}\ 4\ e_{67}\ 8]$ and
$\sigma_2 = [9\ e_{25}\ e_{54}\ 4]$. Assuming that the $k$-dependent selection
matrices have rows 10000, 01000, 00010, 00001 for $\sigma_1$ and row 01000
for $\sigma_2$ (6), the resulting watermark is $w = [1\ e_{23}\ e_{67}\ 8\ e_{25}]$.
Algorithm $\Omega$ relates vector $\sigma$ to the final watermark $w$ and it can
be represented explicitly in terms of a mapping $M$ as:
$\Omega : \sigma \mapsto w = M \sigma$ (7)
where mapping $M$ is $k$-dependent. Algorithm $\Omega$ can also be defined
implicitly. We propose a compact way based on the concept of
operons. Operons are sequences of symbols which determine the
beginning and the end of active genes inside a DNA chain. Similarly,
one can define sequences of symbols in $\sigma$ which “activate”
and “deactivate” topological signatures. As an illustration, consider
the example in Figure 3. Suppose we define the following
activating and deactivating operons $o_0$ and $o_1$ in form of vectors:
$o_0 = [e_{23}]$, $o_1 = [e_{25}\ e_{54}]$. Then, the resulting signature is:
$[9\ e_{25}\ e_{54}\ 4]$.
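The explicit, matrix-style form of the synthesis step can be illustrated with a small Python sketch. It assumes that each 0/1 row picks out the flagged entries of a signature vector, which reproduces the Figure 3 illustration (rows 10000, 01000, 00010, 00001 and 01000 yielding 1 23 67 8 25). The function name `select`, the `e23`-style labels, and the handling of the fifth digit in the row applied to the four-entry signature (simply ignored) are assumptions.

```python
def select(signature, rows):
    """Apply 0/1 selection rows to a signature vector.

    Each row keeps the entries whose bit is "1". Bits beyond the end of
    the signature are ignored (an assumption; the text does not say).
    """
    kept = []
    for row in rows:
        kept.extend(sym for bit, sym in zip(row, signature) if bit == "1")
    return kept


sigma1 = ["1", "e23", "4", "e67", "8"]     # first-layer signature
sigma2 = ["9", "e25", "e54", "4"]          # second-layer signature
rows1 = ["10000", "01000", "00010", "00001"]
rows2 = ["01000"]

watermark = select(sigma1, rows1) + select(sigma2, rows2)
print(watermark)    # ['1', 'e23', 'e67', '8', 'e25']
```

In this reading the key determines which rows are present, so different keys discard different parts of the same signature.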
5 Watermark Detection
Watermarks are only useful if they can be extracted from a design
which has been deprived of labels, net names, instances, etc. In order to achieve such capability the detection problem is partitioned
into two phases: signature extraction and watermark identification. In the first phase design features are mapped to a signature
which is then converted into the watermark using the given key.
Using standard slicing techniques [8], the layout is partitioned
into rectilinear areas encompassing exactly one atomic block. The
complexity of this operation is a function of the
number of objects in the layout. Then, the contents of each slice
are mapped in $O(1)$ time onto the corresponding tuple of bubbles
via the bubble mapping of Section 3. All the bubbles in the layout are labeled
and catalogued in order from left to right and from top to
bottom. Then, using optimal algorithms, a triangulation is performed in
$O(|B| \log |B|)$ time [7, p. 241]. Finally, a signature
is extracted for all abstraction levels using the various mappings
outlined in Section 3. A line segment intersection algorithm is used
for the computation of the edges being cut by each topological
routing. The complexity of this operation is likewise characterized in
[7, p. 285].
Repetitive structures are extracted by slicing the layout and by
cataloging every atomic block using a signature associated with the
block if one exists. Alternatively, one can use external dimensions
and other physical features to identify the blocks [3]. Unstructured
connectivity is extracted by generating graph $G$ using signatures
associated with atomic blocks for identification purposes.
Constraint-based signatures are built from the constraints derived
from the graph as outlined in Section 3.
In all watermark synthesis schemes, one wants to identify a
given code with a certain level of confidence or conversely to
compute the rate of similarity between the given signature and the
extracted one, even when fragmentary. In the absence of any type
of tampering, algorithm $\Omega$ and key $k$ applied to signature $\sigma$ always
return the original watermark $w$.
When tampering has occurred, some sections of the extracted
signature will be corrupted. Since the watermark is a subset of the
signature, such corrupted sections may in fact have little impact.
Potentially dangerous tampering involves symbol scrambling, due
to the global effects on the watermark. To cope with this problem,
a technique known as genome search [3] is used to identify the
number of fragments in $w$ which also appear in $\sigma$. Let $F$ be
the set of all signature fragments present in the original watermark
$w$. Moreover, let $\Theta$ be the set of all operons defined
for $w$.
genome_search($\sigma$, $F$, $\Theta$)
  foreach ($f \in F$)
    $s$ = best_match($\sigma$, $f$, $\Theta$)
    overlap += overlap($s$, $f$)/length($f$)
A subsequence $s$ of $\sigma$ which best matches $f$ is selected by
best_match($\sigma$, $f$, $\Theta$). The function uses the operons in $\Theta$ to
derive the subsections of $\sigma$ to be matched. The matching criterion
is simply that of maximizing the number of identical symbols.
Function length($f$) computes the number of symbols in $f$,
overlap($s$, $f$) the number of identical symbols in $s$ and $f$. This
algorithm returns an estimate of the probability that the design
contains in fact watermark $w$.
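A minimal runnable sketch of the genome search loop is given below in Python. Several points are assumptions rather than the authors' method: `best_match` here slides a window over the whole extracted signature instead of using operons to delimit candidate subsections, symbols are compared position by position, and the accumulated overlap is divided by the number of fragments so that the result lies between 0 and 1.

```python
def overlap(a, b):
    """Number of positions at which two symbol sequences agree."""
    return sum(1 for x, y in zip(a, b) if x == y)


def best_match(signature, fragment):
    """Window of `signature` sharing the most symbols with `fragment`.

    Simplification: every window is tried; operons are not used here.
    """
    n = len(fragment)
    last_start = max(0, len(signature) - n)
    windows = [signature[i:i + n] for i in range(last_start + 1)]
    return max(windows, key=lambda w: overlap(w, fragment))


def genome_search(signature, fragments):
    """Average fraction of each watermark fragment found in `signature`."""
    if not fragments:
        return 0.0
    total = 0.0
    for f in fragments:
        s = best_match(signature, f)
        total += overlap(s, f) / len(f)
    return total / len(fragments)   # normalization is an added assumption


fragments = [["1", "e23"], ["e67", "8"], ["e25"]]
intact = ["1", "e23", "4", "e67", "8", "9", "e25", "e54", "4"]
tampered = ["1", "e99", "4", "e67", "8", "9", "e25", "e54", "4"]

print(genome_search(intact, fragments))     # 1.0
print(genome_search(tampered, fragments))
```

With one literal of the first fragment altered, the score drops below 1 but the remaining fragments still contribute fully, which is the behavior the text describes for fragmentary signatures.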
Let us now focus on the measures defining the uniqueness and
robustness of the proposed scheme. The set $B$ of all the bubbles
in the design and their location on the various layers is determined
by the slicing algorithm and by the bubble mapping. The topological
signatures associated with any networks in the design are unique
for a given triangulation. However, for each layer $l$ the number
of possible triangulations grows factorially as $(|B_l| - 1)!/3!$.
Next, choose a layer $l$ which maximizes $|B_l|$ over all layers.
By a conservative estimate, $N_T$, the total number of possible
triangulations over all layers, is $(|B_l| - 1)!/3!$. Consider
now all $n$ topological routings in the design. The routings consist of
$n_j$ $j$-terminal nets, $j \geq 2$. Note that all $n$ topological
routings can be represented in terms of $m$ two-terminal sub-routings,
with $m = \sum_{j \geq 2} n_j (j - 1)$. Hence, the number of
possible topological signatures $N$ is given by
$N = N_T \binom{m}{2}$, thus $P_u = 1/N$
where $P_u$ is the probability of uniqueness. In case no tampering
has occurred, the miss rate is zero. As an illustration, consider a
design with $|B_l| = 20$, $n = 10$, $n_2 = 3$, $n_3 = 5$, $n_4 = 2$.
Hence, $m = 19$ and $N = (20 - 1)! \cdot 171/3! \approx 3.5 \cdot 10^{18}$,
$P_u \approx 2.9 \cdot 10^{-19}$.
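The arithmetic of this illustration can be checked with a short Python sketch. It assumes 20 bubbles and ten nets (three 2-terminal, five 3-terminal, two 4-terminal), so that 19 two-terminal sub-routings result. Reading the factor 171 as the number of pairs among those 19 sub-routings is an interpretation that reproduces the quoted figures, not something the text states outright. All variable names are hypothetical.

```python
from math import comb, factorial

bubbles = 20
nets = {2: 3, 3: 5, 4: 2}    # terminals per net -> number of such nets (assumed)

# Each j-terminal net decomposes into (j - 1) two-terminal sub-routings.
m = sum(count * (j - 1) for j, count in nets.items())

# Triangulation count quoted in the text: (|B| - 1)! / 3!
n_triangulations = factorial(bubbles - 1) // factorial(3)

# Assumed reading: 171 = C(19, 2), the pairs of sub-routings.
n_signatures = n_triangulations * comb(m, 2)
p_u = 1 / n_signatures

print(m)                          # 19
print(f"{n_signatures:.2e}")      # 3.47e+18
print(f"{p_u:.2e}")               # 2.88e-19
```

Both values round to the figures quoted in the illustration, about 3.5 · 10^18 signatures and a uniqueness probability of about 2.9 · 10^-19.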
In order to model tampering activities, let us consider the following occurrences: (1) routing modification, (2) atomic block
modification, and (3) atomic block move and/or addition/deletion.
(1) does not modify triangulation, however it may cause changes
in the topological signature. There exist three types of possible
effects on the signature: literal addition, deletion and swap. More
than one literal may be involved in the change at any time, however, when this occurs, the change can be modeled in terms of
a composition of simple literal modifications. (2) and (3) result
in a change in the triangulation, thus potentially having an effect
on one or more topological signatures. However, the effects can
again be modeled in terms of simple literal operations.
Let $P_r$ be the probability that a literal change occurs in a topological
signature. Moreover, let $P_i$ be the probability that a literal
is included in the watermark. Then, the probability that a section of length $L$ in the watermark
Pr Pi
Pt
j
1
Pr 1
B
Pi
j
(9)
j 1
Note that in our case
. As an illustration, consider the
above design with
1, moreover suppose
10 4 ,
2 10 7.
then
block repetitive structures, the number of
In the case of
permutations
1
is !, due to the one-dimensional
character of the design. Moving always implies a change in the
signature. Given the same assumptions of above, probability
is computed as in Equation (9).
Consider now unstructured connectivity. Assuming that every
has exactly one arriving and
leaving edges, the number
of possible graphs
1
is
Ng
j
2
1
1
(10)
j 2
100,
40, then
4 54 10 16. There
Suppose
exist three ways to “atomically” alter connectivity information,
namely: (1) edge augmentation, (2) net partitioning and (3) net
consolidation. Through (1) connectivity is added to dead or unused
nodes, or extra cycles are created in graph $G$. Dead and
unused nodes can be easily detected as extraneous components to
the original block set. Moreover, cycles can be found in $G$
at relatively low cost. Hence, this type of counterfeiting can be
eliminated as a pre-processing step. Manipulations (2) and (3)
may be difficult to accomplish at no cost of circuit performance,
since they imply adding/removing extra components (a buffer, for
example) in possibly critical nets. Both manipulations cause new
nets to be formed from the original ones.
As an illustration, consider net in Figure 5(b). The net can be
, connecting with
partitioned in: , connecting with and
through a buffer. A net generally connects one output with
1
inputs or high-impedance gates. Assuming that a -edge net is
part of the watermark with probability
, then the probability
that a partition in an -edge net is causing a mutation is
L
1
Pr
(8)
mutates is
t
Pi j
1 Pi j
1 r
Pi r
j Pi r
j r
r
4
j 1
(11)
where
1 2 if is odd,
2 otherwise.
is the probability that a section of length cut from a segment
of a given
of length are identical. If a subset of nets in
size is partitioned, consequently
. For instance, if
Pi j
Pi
10 3 j y z, then Pm P4 6 10 6 .
r 4 Pi j z
6 Results
The techniques proposed in this paper were applied to a set of
MCNC 86 benchmarks. For every circuit a topological signature
was produced and then a watermark was created using the method
outlined in Section 4. The experiments were performed on a
133MHz Pentium PC running under the Linux operating system.
We experimented with a wide range of circuits, from small
library components to large circuits consisting of regular as well
as non-regular structures. Figures 6(a) and (b) show the layout and
corresponding triangulation for a small benchmark. Table 1 lists
all relevant experimental data for the circuits used, namely size,
number of I/O pins, size of triangulation, and number of routing
segments. The computed bounds on the probability of uniqueness
$P_u$ and on the miss rate $P_m$ are also reported. For the computation
of the miss rate it was assumed that $P_r = 10^{-4}$, while for the
computation of $P_u$, the use of an arbitrary triangulation scheme
was assumed. The results show the effectiveness of the proposed
method to identify original IPs at little or no design overhead,
while ensuring high recognition reliability.
Figure 6: (a) Layout; (b) Corresponding triangulation
circuit    # devices  # I/O ports  triangulation size  # segments  CPU time  P_u        P_m
afa        86         15           534                 101         0.35s     10^-163    1·10^-6
cword6 b   26         16           200                 25          0.1s      10^-49     4·10^-7
croute2    84         0            478                 120         0.37s     10^-129    8·10^-7
ccell6     21         12           134                 23          0.07s     10^-39     3·10^-7
ddx        374        0            2138                530         1.32s     10^-900    3·10^-6
Table 1: Benchmarks
7 Conclusions
Hierarchical watermarking has been proposed as an effective
method to deter IP copyright infringement. The method consists of stamping a given design with a unique structural code.
The code can be reliably detected and used to claim IP ownership.
The resilience of the technique has been shown through several
industrial examples.
References
[1] M. D. Swanson, B. Zhu and A. H. Tewfik, “Transparent
Robust Image Watermarking”, in Proc. IEEE International
Conference on Image Processing, volume 3, pp. 211–214,
September 1996.
[2] L. Boney, A. H. Tewfik and K. N. Hamdy, “Digital Watermarks for Audio Signals”, in Proc. IEEE International Conference on Multimedia Computing and Systems, pp. 473–480,
June 1996.
[3] E. Charbon, “Hierarchical Watermarking in IC Design”, in
Proc. IEEE Custom Integrated Circuit Conference, pp. 295–
298, May 1998.
[4] A. Kahng, S. Mantik, I. L. Markov, M. Potkonjak, P. Tucker,
H. Wang and G. Wolfe, “Robust IP Watermarking Methodologies for Physical Design”, in Proc. IEEE/ACM Design
Automation Conference, pp. 782–787, June 1998.
[5] T. Whitney, Hierarchical Composition of VLSI Circuits, PhD
thesis, California Institute of Technology, 1985.
[6] J. Valainis, S. Kaptanoglu, E. Liu and R. Suaya, “Two-Dimensional IC Layout Compaction Based on Topological
Design Rule Checking”, IEEE Trans. on Computer Aided
Design, vol. CAD-9, n. 3, pp. 260–275, March 1990.
[7] F. P. Preparata and M. I. Shamos, Computational Geometry.
An Introduction, Springer, second Edition, 1988.
[8] R. H. J. M. Otten, “Automatic Floorplan Design”, in Proc.
IEEE/ACM Design Automation Conference, pp. 261–267,
June 1982.