Dynamic Bezier curves for variable rate-distortion

Ferdous Ahmed  Sohel; Gour Karmakar

How to cite: Sohel, Ferdous A.; Karmakar, Gour C. and Dooley, Laurence (2008). Dynamic Bezier curves for variable rate-distortion. Pattern Recognition, 41(10) pp. 3153–3165.

Link to article on publisher's website: http://dx.doi.org/doi:10.1016/j.patcog.2008.03.006

Ferdous A. Sohel¹, Gour C. Karmakar, and Laurence S. Dooley

ABSTRACT

Bezier curves (BC) are important tools in a wide range of diverse and challenging applications, from computer aided design to generic object shape descriptors. A major constraint of the classical BC is that only global information concerning control points (CP) is considered; consequently, there may be a sizeable gap between the BC and its control polygon (CtrlPoly), leading to a large distortion in shape representation. While BC variants like degree elevation, composite BC, and refinement and subdivision narrow this gap, they increase the number of CP and thereby both the required bit-rate and computational complexity. In addition, while quasi-Bezier curves (QBC) close the gap without increasing the number of CP, they reduce the underlying distortion by only a fixed amount. This paper presents a novel contribution to BC theory, with the introduction of a dynamic Bezier curve (DBC) model, which embeds variable localised CP information into the inherently global Bezier framework, by strategically moving BC points towards the CtrlPoly.
A shifting parameter (SP) is defined that enables curves lying within the region between the BC and CtrlPoly to be generated, with no commensurate increase in CP. DBC provides a flexible rate-distortion (RD) criterion for shape coding applications, with a theoretical model for determining the optimal SP value for any admissible distortion being formulated. Crucially, DBC retains core properties of the classical BC, including the convex hull and affine invariance, and can be seamlessly integrated into both the vertex-based shape coding and shape descriptor frameworks to improve their RD performance. DBC has been empirically tested upon a number of natural and synthetically shaped objects, with qualitative and quantitative results confirming its consistently superior shape approximation performance, compared with the classical BC, QBC and other established BC-based shape descriptor techniques.

Index Terms – Vertex-based shape coding, image processing, video processing, and Bezier curves.

¹ Corresponding author: E-mail: Ferdous.Sohel@ieee.org; Ferdous.Sohel@infotech.monash.edu.au; Tel.: +61-3-990-26133; Fax: +61-3-990-26842. Mailing address: GSIT, Monash University, Churchill, Victoria – 3842, Australia.

I. INTRODUCTION

Bezier curves (BC) were independently introduced by P. de Casteljau and P. E. Bézier, and have been applied to a wide variety of computer-aided design applications. While their genesis lies in the design of car body shapes, their usage is no longer confined to this domain. Indeed, their robustness in curve representation means BC now pervade many areas of multimedia technology, such as shape description of characters [1] and objects [2]-[3], and shape coding and error concealment for video objects [4]. The classical BC is defined by a set of control points (CP), which when conjoined, form the control polygon (CtrlPoly), with the number and orientation of the vectors connecting the CP governing the curve shape.
A major limitation of BC theory is that only global information about the CP is considered [5], since each BC point is produced by blending all CP. As a consequence, a large gap can arise between the curve and its CtrlPoly, leading to high distortion in shape representation and approximation applications. A number of approaches have been proposed to reduce this gap, including degree elevation [6], composite Bezier curves (CBC) [7] and refinement and subdivision [8]-[9]. While these techniques successfully reduce, to some extent, the distance between a Bezier approximation and CtrlPoly, they concomitantly increase the CP number, so incurring higher coding or descriptor lengths. In contrast, quasi-Bezier curves (QBC) [10] reduce this gap by incorporating localised CP information into the Bezier framework, shifting curve points towards the CtrlPoly by a fixed amount, without compromising the CP number. The gap is narrowed, however, by the same preset amount, and there is no mechanism to flexibly control its size in a rate-distortion (RD) context. As generically-shaped objects may contain contour portions that exhibit regular geometric features like edges, while other parts have more complex random patterns, shifting each Bezier point by the same amount fails to fully exploit the potential to reduce the distortion, and motivates investigation of alternative paradigms that support variable localised shifting of curve points. This provided the impetus for the dynamic Bezier curve (DBC) model² presented in this paper. DBC incorporates local information within the classical BC theory, by variably moving Bezier points to new parametrically determined locations between the BC point and CtrlPoly, with the optimal value of the shifting parameter (SP) being analytically determined for a prescribed admissible distortion, using the Lagrangian multiplier method.
It is important to highlight the generality of the new model, since judiciously selecting the SP value allows any curve bounded by the original BC and CtrlPoly inclusively to be synthesised.

B-splines (BS), which are a generalisation of the BC [5] since quadratic BS are piecewise BC, have been efficaciously applied in the standard vertex-based operational-rate-distortion (ORD) optimal shape coding framework [12]-[17]. QBC has also been successfully integrated into this framework [10], though applying a fixed SP value independently of a shape's contour does not necessarily minimise distortion, with the corollary that it fails to maximise the overall improvement in RD performance. To achieve this objective, a strategy for dynamically generating curves within the ORD framework is required. This paper presents a mechanism for seamlessly embedding DBC within the ORD shape coding framework, together with determining the bound upon the widths of the corresponding variable admissible control point band (VCB) [12]. Choosing various SP values enables the generic DBC to not only synthesise the classical BC (no localised information), but also provide exactly the same distortion results as both the BS and QBC models. Concomitantly it retains fundamental properties of the original BC, with its performance as both a generic shape descriptor and in the vertex-based ORD shape coding framework being extensively analysed for a large number of arbitrary shapes. The qualitative and numerical evaluations of the DBC results confirm its consistent performance superiority over the original BC, BC variants and the two QBC models.

² The preliminary idea behind this work was presented at IEEE International Conference on Image Processing (ICIP 2005) [11].
The remainder of the paper is organised as follows: Section II presents a series of short overviews of core BC theory, the recently proposed pair of QBC models and the vertex-based ORD optimal shape coding framework respectively. Section III introduces the mathematical foundations of the new DBC paradigm, together with germane evidence that it both retains the main BC properties and can be seamlessly embedded into the ORD framework. Section IV provides a comprehensive empirical analysis of the improved RD performance of DBC, with some conclusions being drawn in Section V.

II. RELATED WORK

This section presents a brief review of the underlying theory behind the classical BC, popular variants and QBC, before the BS-based shape coding ORD framework is investigated.

II A. Bezier Curve Theory and Variants

The BC is a recursive linear weighted subdivision of the edges of a generated polygon, starting with a set of points forming the initial (control) polygon and ending when the final point is generated for a particular weight $u$. The set of $N+1$ starting points is referred to as the CP, which governs the shape of the $N$-degree BC, while the polygon connecting the CP is known as the CtrlPoly. The matrix form [5] of the BC for an ordered set of CP $P = \{p_0, p_1, \ldots, p_N\}$ is defined as:

$$p(u) = Pow_N(u) \cdot Bez_N \cdot P^T, \quad 0 \le u \le 1 \tag{1}$$

where $p(u)$ is the Bezier curve point for a particular $u$, $Pow_N(u)$ represents the power basis $(1, u, u^2, \ldots, u^N)$ and the $ij$-th term of matrix $Bez_N$ is found from $\sigma_{ij} = (-1)^{j-i} \cdot {}^{N}C_{i} \cdot {}^{i}C_{j}$, where $C$ denotes the combination function. $u$ is a parametric operator which defines the location of the curve point, with the number of curve points depending upon the number of $u$ values ($u'$).

Figure 1: A quadratic BC example to elucidate the existence of the gap.

Figure 1 shows a quadratic BC produced using CP $p_0$, $p_1$ and $p_2$.
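The matrix form (1) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `bezier_matrix` and `bezier_point` are hypothetical, CP are assumed to be 2-D tuples, and combinations with a lower index larger than the upper index are taken as zero.

```python
from math import comb


def bezier_matrix(n):
    """Build Bez_N with entries sigma_ij = (-1)^(j-i) * C(n, i) * C(i, j).

    math.comb(i, j) is 0 whenever j > i, so the matrix is lower triangular.
    """
    return [[(-1) ** abs(i - j) * comb(n, i) * comb(i, j) for j in range(n + 1)]
            for i in range(n + 1)]


def bezier_point(control_points, u):
    """Evaluate p(u) = Pow_N(u) . Bez_N . P^T for 2-D control points."""
    n = len(control_points) - 1
    bez = bezier_matrix(n)
    powers = [u ** i for i in range(n + 1)]  # power basis (1, u, ..., u^N)
    # Blending weight of each CP: column j of Bez_N contracted with the power basis.
    weights = [sum(powers[i] * bez[i][j] for i in range(n + 1))
               for j in range(n + 1)]
    x = sum(w * p[0] for w, p in zip(weights, control_points))
    y = sum(w * p[1] for w, p in zip(weights, control_points))
    return (x, y)
```

For a quadratic CP set such as `[(0, 0), (1, 2), (2, 0)]`, `bezier_point(cps, 0.5)` blends the three CP with weights 0.25, 0.5 and 0.25, which makes the gap to the middle CP easy to inspect numerically.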
The large gap between the BC approximation and its CtrlPoly represents a substantial shape distortion (error) caused by the fundamental BC limitation of considering only global CP information. If for a particular value $u = 0.5$, points $A$ and $B$ are generated by (1), then the inner area of $\Delta A p_1 B$ is never reached and the final BC point $C$ will be generated along $AB$. This inadequacy has spawned many variants of the classical BC, including degree elevation, CBC, and subdivision and refinement. Degree elevation [6] forms a curve with the number of CP increasing by one each pass. With the exception of the two endpoints, the CP must be recalculated every time, so the computational overhead correspondingly increases, while higher degree curves are always more computationally intensive than lower-order curves. CBC [7] models a shape by dividing it into multiple segments, each of which is then defined by a simple BC. Their drawback is that the number of segments increases with shape complexity because the segment division process is not intuitive. This was the catalyst for the development of the subdivision and refinement techniques [7], where the BC is split in two [9] with a new CP set being calculated from the initial CP set for each part, so it is guaranteed to be closer to the curve and thereby lowers the overall distortion (gap). These algorithms increase the number of curve segments, with both subdivision and CBC doubling the number to incur a higher bit-rate encoding overhead, so while gap reduction is achieved, it is at the pyrrhic cost of a commensurately expanding number of CP. In contrast, QBC curves [10] reduce this gap without enlarging the number of CP, as will now be discussed.

II B. The quasi-Bezier curve (QBC) models

Both QBC models (QBC-n and QBC) are characterised by integrating localised information about the CP within the global BC, shifting original BC points towards the centre of gravity (CoG) $G$ of the area $\Delta A p_1 B$.
Since the shift towards $G$ is always by a fixed amount, while the gap is reduced compared with the original BC, there can still be a significant area (distortion) between QBC and the CtrlPoly, as visualised in the example in Figure 2, which reveals that though both QBC and QBC-n curves have narrowed the gap, a large distortion still remains.

Figure 2: Illustration of the gap in QBC with the centre of gravity $G$.

Moreover, since SP is preset, the model has no facility to determine the optimal value of SP for an admissible distortion (gap size), and affords no trade-off mechanism between bit-rate and distortion to enhance the RD performance for specific values of SP. This provided the principal motivation behind the development of the new DBC paradigm presented in Section III.

II C. The Vertex-Based ORD Optimal Polynomial Shape Coding Framework

A rigorous review of shape coding algorithms has been furnished in [13], with the conclusion that the classical vertex-based polynomial shape coding framework is optimal in an ORD sense. With both polygon- and quadratic BS-based shape encoding strategies being deployed, this finding has become the bedrock for several other shape coding algorithms [14]-[18], though by virtue of using higher order curves, the BS-based algorithms require a lower bit-rate than their polygon-based counterparts, for the same experimental setup and test shapes. The general aim of all these algorithms is that for some prescribed distortion, a shape contour is optimally encoded in terms of the number of bits, via selecting a set of CP that incurs the lowest bit rate, and vice versa. To define this mathematically, let boundary $B = \{b_0, b_1, \ldots, b_{N_B - 1}\}$ be an ordered set of points, where $N_B$ is the total number of boundary points. $S = \{s_0, s_1, \ldots, s_{N_S + 1}\}$ is an ordered set of CP used to approximate $B$, where $N_S$ is the total number of quadratic curve segments.
The $k$-th ($k \ge 1$) curve segment is then defined by three consecutive CP, $s_{k-1}, s_k, s_{k+1}$, under the assumption that $S \subseteq F$, where $F$ is the ordered set of vertices in the admissible control point band (ACB) around the shape boundary, which is the source of potential CP. Sohel et al. [12] have extended the ACB concept to a dynamic VCB, which enhances the performance of the ORD framework by exploiting the nexus between admissible distortion and shape curvature. As Figure 3 illustrates, the VCB is formed around the shape contour so CP are always selected from VCB points when encoding, and thus a closer approximation of the CtrlPoly would mean a better shape approximation.

Figure 3: The variable admissible control point band (VCB).

In addition, while the original framework [13] employs quadratic BS, the relationship between BC and BS means the former can replace the latter, with appropriate adjustments in the CP. For instance, from (1) the polynomial form of a quadratic BC $Q_{BC}$ for the ordered CP set $\{p_0, p_1, p_2\}$ is given by:

$$Q_{BC}((p_0, p_1, p_2), u) = (1-u)^2 p_0 + 2u(1-u)\, p_1 + u^2 p_2, \quad 0 \le u \le 1 \tag{2}$$

Again, a quadratic BS segment $Q_{BS}$ for the same CP set is defined as [5]:

$$Q_{BS}((p_0, p_1, p_2), u) = \tfrac{1}{2}(1-u)^2 p_0 + \left(-u^2 + u + 0.5\right) p_1 + \tfrac{1}{2} u^2 p_2, \quad 0 \le u \le 1 \tag{3}$$

From (2) and (3):

$$Q_{BS}((p_0, p_1, p_2), u) \equiv Q_{BC}\!\left(\left(\tfrac{p_0 + p_1}{2},\ p_1,\ \tfrac{p_1 + p_2}{2}\right), u\right), \quad 0 \le u \le 1 \tag{4}$$

This formalises how to represent a BC in BS format, and that a quadratic BS is in fact a piecewise BC, with its two end CP being the midpoints of the respective CtrlPoly edges of the BS, as shown in Figure 4. This implies that with correct CP calculation, a BC can be equivalently used instead of BS, which crucially provides an avenue for embedding the proposed DBC model into the BS-based ORD optimal shape coding framework to improve overall RD performance.

Figure 4: Graphical illustration of the relationship between BC and BS.
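The identity in (4) is easy to check numerically. The following sketch evaluates the quadratic BC of (2) and the quadratic BS segment of (3) and compares the BS segment with a BC drawn on the edge midpoints; the helper names (`quad_bc`, `quad_bs`, `bs_segment_as_bc`, `max_deviation`) are illustrative and not taken from the paper.

```python
def quad_bc(p0, p1, p2, u):
    """Quadratic Bezier curve point, equation (2)."""
    w0, w1, w2 = (1 - u) ** 2, 2 * u * (1 - u), u ** 2
    return tuple(w0 * a + w1 * b + w2 * c for a, b, c in zip(p0, p1, p2))


def quad_bs(p0, p1, p2, u):
    """Quadratic B-spline segment point, equation (3)."""
    w0, w1, w2 = 0.5 * (1 - u) ** 2, -u ** 2 + u + 0.5, 0.5 * u ** 2
    return tuple(w0 * a + w1 * b + w2 * c for a, b, c in zip(p0, p1, p2))


def midpoint(a, b):
    """Midpoint of a CtrlPoly edge."""
    return tuple(0.5 * (s + t) for s, t in zip(a, b))


def bs_segment_as_bc(p0, p1, p2, u):
    """Right-hand side of (4): a BC whose end CP are the edge midpoints."""
    return quad_bc(midpoint(p0, p1), p1, midpoint(p1, p2), u)


def max_deviation(p0, p1, p2, samples=11):
    """Largest coordinate difference between both sides of (4) over sampled u."""
    worst = 0.0
    for i in range(samples):
        u = i / (samples - 1)
        lhs = quad_bs(p0, p1, p2, u)
        rhs = bs_segment_as_bc(p0, p1, p2, u)
        worst = max(worst, max(abs(a - b) for a, b in zip(lhs, rhs)))
    return worst
```

For any CP triple the deviation should vanish up to floating-point rounding, since the $p_1$ coefficient $\tfrac{1}{2}(1-u)^2 + 2u(1-u) + \tfrac{1}{2}u^2$ simplifies to $-u^2 + u + 0.5$.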
The next section formally introduces the DBC model, which reduces the gap (distortion) between the classical BC and its CtrlPoly, in addition to affording a flexible RD trade-off mechanism by selecting an optimal SP value.

III. THE DYNAMIC BEZIER CURVE MODEL

In this section, the theory underpinning DBC is first developed, before a series of formal proofs is presented verifying that the core properties of the classical BC are upheld in the new representation. A short exposé is then provided upon how DBC can be seamlessly integrated into the ORD optimal vertex-based shape coding framework to improve its RD performance.

III A. The Dynamic Bezier Curve Model

While QBC reduce the gap, they only reduce it by a fixed, limited amount. For a CP set $\{p_0, p_1, p_2\}$, the BC produces a gap bounded by $\tfrac{1}{2}|p_0 p_1|$ (assuming $|p_0 p_1| \ge |p_1 p_2|$), where $|\cdot|$ represents the length of the straight line joining the two points, while the QBC and QBC-n can reduce this gap respectively by $\tfrac{1}{12}|p_0 p_1|$ and $\tfrac{1}{6}|p_0 p_1|$ [10]. It becomes crucial to reduce this gap further, and by a variable amount. DBC meets these requirements as follows:

i) DBC permits a larger shift which potentially leads to lower distortions, since as Figure 2 confirms, even QBC-n [10] can generate sizeable errors. The VCB is formed around a shape contour and the CP are then selected from this band. These CP form the CtrlPoly, and hence a lower distance between the CtrlPoly and approximating curve would mean a lower distortion between the original and approximating shape.

ii) As the SP value increases, DBC will tend towards exhibiting the shape of the CtrlPoly and so become a comparatively more localised curve, generating a correspondingly piecewise shape approximation.

While the rationale for the QBC models was to shift a BC point towards a specific point $G$ (the CoG of the triangular region in Figure 1), the DBC model moves the corresponding BC point towards a specific CtrlPoly edge.
When a BC point is generated for a particular $u$, one CtrlPoly edge will be at a minimum distance from it, and this edge analytically exerts the maximum influence on that particular curve point. The DBC point is obtained by making a parametric shift of the BC point towards this particular edge in the direction of shortest distance. This can be mathematically explained as follows:

Figure 5: An illustration of DBC formulation.

In Figure 5, the generated BC point for a particular $u$ is $(BC_x, BC_y)$, whose nearest CtrlPoly edge has endpoints $(x_1, y_1)$ and $(x_2, y_2)$. The point on the edge at the shortest distance from the BC point is the intersection point $(x_{12}, y_{12})$ between this edge and the perpendicular line passing through the BC point, and is given by:

$$x_{12} = \frac{\Delta x\,(\Delta x \times BC_x + \Delta y \times BC_y) + \Delta y\,(\Delta y \times x_1 - \Delta x \times y_1)}{\Delta x^2 + \Delta y^2} \tag{5}$$

and

$$y_{12} = \frac{\Delta y\,(\Delta x \times BC_x + \Delta y \times BC_y) - \Delta x\,(\Delta y \times x_1 - \Delta x \times y_1)}{\Delta x^2 + \Delta y^2} \tag{6}$$

where $\Delta x = x_1 - x_2$ and $\Delta y = y_1 - y_2$. If $m$ is the SP, i.e., $m : (1-m)$ is the shifting ratio at the curve point between $(BC_x, BC_y)$ and $(x_{12}, y_{12})$, the new DBC point $(DBC_x, DBC_y)$ can be calculated from:

$$DBC_x = m \times x_{12} + (1-m) \times BC_x \quad \text{and} \quad DBC_y = m \times y_{12} + (1-m) \times BC_y \tag{7}$$

which is formalised in matrix form as:

$$\begin{bmatrix} x_{12} \\ y_{12} \end{bmatrix}_u = \begin{bmatrix} \dfrac{\Delta x}{\Delta x^2 + \Delta y^2} & \dfrac{\Delta y}{\Delta x^2 + \Delta y^2} \\[2ex] \dfrac{\Delta y}{\Delta x^2 + \Delta y^2} & \dfrac{-\Delta x}{\Delta x^2 + \Delta y^2} \end{bmatrix}_u \cdot \begin{bmatrix} BC_x & BC_y \\ -y_1 & x_1 \end{bmatrix}_u \cdot \begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}_u \tag{8}$$

and

$$DBC(u) = \begin{bmatrix} DBC_x & DBC_y \end{bmatrix}_u = \begin{bmatrix} m & (1-m) \end{bmatrix} \cdot \begin{bmatrix} x_{12} & y_{12} \\ BC_x & BC_y \end{bmatrix}_u, \quad 0 \le m \le 1;\ 0 \le u \le 1 \tag{9}$$

As $m$ lies in the range $0 \le m \le 1$, DBC is bounded by the BC and CtrlPoly. When $m = 0$ there is no shifting so it is a classical BC approximation, while for $m = 1$, the maximum shift means DBC becomes the CtrlPoly.
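The shift described by (5)-(9) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, the nearest edge is chosen by perpendicular distance to the line carrying each edge (the foot is not clamped to the edge, which a production version may need), and a tie between two consecutive edges is resolved by shifting towards their common CP.

```python
from math import hypot, isclose


def foot_of_perpendicular(bc, a, b):
    """Equations (5)-(6): foot of the perpendicular from bc onto line a-b."""
    dx, dy = a[0] - b[0], a[1] - b[1]
    denom = dx * dx + dy * dy
    s = dx * bc[0] + dy * bc[1]
    c = dy * a[0] - dx * a[1]
    return ((dx * s + dy * c) / denom, (dy * s - dx * c) / denom)


def shift(bc, target, m):
    """Equation (7): move bc towards target in the ratio m : (1 - m)."""
    return (m * target[0] + (1 - m) * bc[0], m * target[1] + (1 - m) * bc[1])


def dbc_point(control_points, bc, m):
    """Shift one BC point towards its nearest CtrlPoly edge by the SP m."""
    edges = list(zip(control_points[:-1], control_points[1:]))
    feet = [foot_of_perpendicular(bc, a, b) for a, b in edges]
    dists = [hypot(f[0] - bc[0], f[1] - bc[1]) for f in feet]
    k = min(range(len(edges)), key=dists.__getitem__)
    # Assumed tie rule: if a neighbouring edge is equally close, shift
    # towards the CP shared by the two edges instead of towards a foot.
    for j in (k - 1, k + 1):
        if 0 <= j < len(edges) and isclose(dists[j], dists[k],
                                           rel_tol=1e-9, abs_tol=1e-12):
            return shift(bc, control_points[max(j, k)], m)
    return shift(bc, feet[k], m)
```

With `m = 0` the function returns the BC point unchanged and with `m = 1` it returns the foot of the perpendicular (or the common CP in the tie case), matching the two limiting cases described above.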
The choice of SP thus plays an influential role in shape approximating performance. For large $m$ ($m \approx 1$), DBC approaches the CtrlPoly, as local dominates global CP information, resulting in small distortions, though the corresponding curves will increasingly lose smoothness. Conversely, when SP is small ($m \approx 0$), the curve possesses maximum smoothness though the distortion is nearly a maximum, as global CP information prevails and DBC becomes analogous to the classical BC model. The value of SP consequently provides a flexible design trade-off parameter between distortion minimisation and the level of smoothness, so an effective strategy to optimise $m$ for a given admissible distortion is mandated.

III A 1. Optimising the shifting parameter

To uphold the maximum admissible distance between the curve and CtrlPoly, in addition to preserving smoothness, $m$ must be as small as possible, because increasing $m$ compromises the curve smoothness. The Lagrangian optimisation method [19] is applied to determine the optimal value of $m$ for a maximum admissible distance ($D_{adm}$). If $I(m)$ and $D(m)$ are respectively an identity function and the maximum distance between the curve and CtrlPoly at a particular $m$, then for any $\lambda \ge 0$, an unconstrained problem for the optimal solution $m^*(\lambda)$ using the generalised Lagrangian multiplier [19] can be formulated as:

$$\min_{m \in [0, 1.0]} \left( I(m) + \lambda \times D(m) \right) \tag{10}$$

In accordance with the theory of Lagrangian multipliers, the optimal solution to this unconstrained problem is also the optimal solution to the constrained problem [14]:

$$\min_{m \in [0, 1.0]} I(m) \quad \text{subject to: } D(m) \le D_{adm} \tag{11}$$

Since $D(m^*(\lambda))$ is a non-increasing function of $\lambda$ [14], the bisection method [20] is used to find the optimal value of $\lambda$.
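The constrained problem (11) asks for the smallest $m$ whose maximum gap does not exceed $D_{adm}$. The sketch below is a simplified stand-in for the procedure in the text: instead of bisecting on the multiplier $\lambda$, it bisects directly on $m$, which is only valid under the assumption that $D(m)$ is non-increasing in $m$ and that $D(1) \le D_{adm}$. The name `smallest_admissible_m` and the callable `max_gap` (returning $D(m)$) are hypothetical.

```python
def smallest_admissible_m(max_gap, d_adm, tol=1e-6):
    """Approximate the smallest m in [0, 1] with max_gap(m) <= d_adm.

    Assumes max_gap is non-increasing in m and max_gap(1.0) <= d_adm,
    which holds when the curve coincides with the CtrlPoly at m = 1.
    """
    if max_gap(0.0) <= d_adm:
        return 0.0  # the classical BC already satisfies the constraint
    lo, hi = 0.0, 1.0  # invariant: max_gap(lo) > d_adm >= max_gap(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if max_gap(mid) <= d_adm:
            hi = mid  # admissible, so try a smaller shift
        else:
            lo = mid  # gap still too large, so shift further
    return hi
```

As a hypothetical usage, if the maximum gap were modelled as shrinking linearly with the shift, `max_gap = lambda m: (1 - m) * 2.0`, then an admissible distance of 0.5 would give an SP close to 0.75.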
Note, the admissible distance $D_{adm}$ is bounded by $0 \le D_{adm} \le l_{max}$, where

$$l_{max} = \frac{\lfloor N/2 \rfloor \cdot \lceil N/2 \rceil}{2N}\, \|\Delta_2 p\|_\infty$$

with $\|\Delta_2 p\|_\infty$ being the maximum of the $i$-th centred second difference of the coefficient sequence $p_i$, $i = 0, \ldots, N$, with [21] proving the maximum distance between a BC and its CP is at most $l_{max}$. The complete DBC process is summarised in Algorithm 1.

Algorithm 1: The dynamic Bezier curve (DBC) model.
Inputs: $D_{adm}$ – maximum admissible distortion; the CP set.
Output: DBC – the dynamic BC curve.
1. Calculate the optimal value of $m$ for an admissible $D_{adm}$;
2. For each value of $u$
3.    Determine the Bezier point $(BC_x, BC_y)$;
4.    Determine the minimum distance edge $(x_1, y_1, x_2, y_2)$ from $(BC_x, BC_y)$;
5.    IF two consecutive edges tie for the minimum distance THEN
6.       Calculate $DBC(u)$ by shifting towards the common CP of these edges using $m$;
7.    ELSE Calculate DBC point $DBC(u)$ using (8) and (9);
8. STOP.

For the scenario where the distance of a BC point from two consecutive CtrlPoly edges is equal, the DBC point is obtained by shifting the BC point towards the common CP of those two edges, $(p_x, p_y)$, as follows:

$$DBC(u) = \begin{bmatrix} DBC_x & DBC_y \end{bmatrix}_u = \begin{bmatrix} m & (1-m) \end{bmatrix} \cdot \begin{bmatrix} p_x & p_y \\ BC_x & BC_y \end{bmatrix}_u \tag{12}$$

As the foundations of the DBC framework are underpinned by classical BC theory, the core properties of BC are preserved, as will now be formalised.

III B. Properties of the DBC

The following series of lemmas examines key properties of the DBC model.

Lemma 1 (Endpoint interpolation). DBC always passes through the first and last CP.

Proof: Any BC interpolates its end points [5] for the starting ($u = 0$) and end ($u = 1$) CP, i.e., $(BC_x, BC_y)$ is $(x_1, y_1)$ and $(x_2, y_2)$ for $u = 0$ and $u = 1$ respectively. For DBC, at both $u = 0$ and $u = 1$, from (8) and (9), $DBC_x = x_{12} = BC_x$ and $DBC_y = y_{12} = BC_y$, so DBC always passes through the end CP. ●

Lemma 2 (Convex hull property). DBC always lies within the convex hull of its CP.
Proof: BC always lie within the convex hull of their CP [5], while the DBC points lie in the region between the CP and BC inclusive, so the DBC also lies within the convex hull of its CP. As discussed before, when $m \to 0$, the DBC approaches the BC, while for $m = 0$ it is coincident with the BC, and for $m \to 1$, DBC approaches the CP. For $m = 1$, it coincides with the CtrlPoly when the number of $u$ values tends to $\infty$. ●

Lemma 3 (Affine invariance). DBC is invariant under affine transformations (translation, rotation and scaling).

Proof: BC is affine invariant if the curve drawn with affine transformed CP is the same as the entire affine transformed curve with the same parameters [5]. According to the DBC definition, each DBC point is generated based on the BC point. Since the BC is affine invariant, for each BC point, the shortest distance edge and its distance are the same for any affine transformation, so the amount of shifting and the relative direction of shifting with respect to the CtrlPoly will also be the same. Thus DBC is affine invariant. ●

Lemma 4 (Computational complexity of DBC). For a given $m$, DBC has exactly the same computational complexity as the BC.

Proof: Step 3 of Algorithm 1 calculates the BC points in $O(N)$ time for each value of the control parameter $u$. Step 4 identifies the closest CtrlPoly edge from the BC point, which also incurs $O(N)$ time, while Steps 5 to 7 generate the DBC point in $O(\text{constant})$ time. Hence for any DBC point, the overall complexity is $O(N)$ provided $m$ is known. In reality, the optimal value of $m$ is iteratively determined in Step 1, so based upon Steps 3 to 7, the DBC computational overhead is conditional on the number of iterations. In this context, the number of $u$ values has a major impact on the computational cost in both BC and DBC models, with their overall complexity being $O(u' \cdot N)$, where $u'$ is the number of $u$ values within $0 \le u \le 1$. ●

III C.
Embedding DBC into the ORD Optimal vertex-based shape coding framework

Katsaggelos et al. [13] proposed the original framework for the ORD optimal vertex-based shape coding using BS and polygons, which has subsequently been deployed in [14], [15] and extended in [17], [12]. It has already been shown in Section II C that a quadratic BC can be equivalently used instead of the BS, so to enhance the RD performance of the algorithms, a series of conjoint DBC curves are applied for shape approximation. As DBC possesses a similar endpoint interpolation property (Lemma 1) to the classical BC, to ensure all conjoint curves have some common CP, whenever two DBC curves join, curve points are managed in an analogous manner to (4). DBC can now be embedded into the BS-based framework as follows:

$$Q_{BS}((p_0, p_1, p_2), u) \longleftrightarrow Q_{DBC}\!\left(\left(\tfrac{p_0 + p_1}{2},\ p_1,\ \tfrac{p_1 + p_2}{2}\right), u\right), \quad 0 \le u \le 1 \tag{13}$$

where $\longleftrightarrow$ indicates the right-hand-side curve will replace the left-hand-side curve. For a series of curves using the CP set $S = \{s_0, s_1, \ldots, s_{N_S - 1}\}$, the $k$-th curve segment is defined, within the range $0 \le u \le 1$, as:

$$Q_k(u) = Q_{BS_k}((s_{k-1}, s_k, s_{k+1}), u) \longleftrightarrow Q_{DBC_k}\!\left(\left(\tfrac{s_{k-1} + s_k}{2},\ s_k,\ \tfrac{s_k + s_{k+1}}{2}\right), u\right) \tag{14}$$

Figure 6: Illustration of a series of conjoint DBC curves within a quadratic BS framework.

Figure 7: Plot showing the RD-m dynamics for the test shape (Stefan).

As $m$ has to be transmitted along with the encoded bit-stream, to ensure an efficient bit-rate, the SP overhead must be minimised. This is achieved using a universal SP value in approximating DBC segments for a particular shape contour, rather than encoding a separate $m$ for each segment. The value of $m$ guides the RD characteristics of the encoder, as evidenced by the example in Figure 7 of the popular object shape Stefan. Two key observations may be drawn from this plot:

i) For a given $m$, conventional RD characteristics are maintained, namely bit-rate is a non-increasing function of distortion.
ii) For a given distortion, the rate-m curves trace a convex parabola, so the requisite bit-rate reduces as $m$ increases up to a certain value, whereupon it commences increasing with $m$. This occurs especially at lower distortions because for large values of the shifting ratio $m$ ($m \approx 1$), the DBC approximation eventually tends to a low-order polygonal approximation which inevitably incurs a higher bit-rate.

Figure 8: Summary of DBC characteristics amongst $m$, $u$ and gap (distortion) for a typical CP set.

Figure 8 reveals the effect of the control parameter $u$ on the maximum gap size (distance), with it being largest in the vicinity of $u \approx 0.5$, and then narrowing on both sides. It is emphasised that a particular $u$ value only represents a point on the curve, not the entire curve, so the maximum distance actually needs to be measured for every $u$ value of a curve, so the overall impact of $u$ in reducing distortion is negligible. Moreover, this plot also shows the effect of $m$ in reducing the gap for any particular $u$. For these reasons it is essential to iteratively determine the most appropriate value of $m$ in the range $0 \le m \le 1$ that optimises RD performance. For a given peak distortion $D_{max}$:

$$\left( \min R,\ 0 \le m \le 1 \right) \;\big|\; D = D_{max} \tag{15}$$

where $R$ is the required bit-rate and $D$ is the distortion. For a given bit-rate $R_{max}$:

$$\left( \min D,\ 0 \le m \le 1 \right) \;\big|\; R = R_{max} \tag{16}$$

Aside from the influence of $m$, since original BS points are moved towards the CtrlPoly in the DBC model, there are substantial implications upon the VCB widths of each individual contour point. This is examined in the next section, together with an investigation into the corresponding bound on the width of the VCB.

III C 1.
Maximum bound for the VCB width

It has been shown in [12] that for BS-based encoding, the width $W[j]$ of the ACB for each boundary point $b_j$ is:

$$W[j] \le \min\left\{ \frac{3\delta + 4T_{max} + 2T[j]}{6},\ \frac{\rho\sqrt{2}}{4} \right\} + T[j] \tag{17}$$

where $\delta$ and $\rho$ are respectively the longest chord length of the boundary and the largest run-length possible for the code employed. It has also been proven in [10] that the corresponding bounds for QBC and QBC-n are respectively:

$$W[j] \le \min\left\{ \frac{11}{37} \cdot \frac{37\delta + 48T_{max} + 26T[j]}{26},\ \frac{11\rho\sqrt{2}}{48} \right\} + T[j] \tag{18}$$

and

$$W[j] \le \min\left\{ \frac{5\delta + 6T_{max} + 4T[j]}{20},\ \frac{\rho\sqrt{2}}{6} \right\} + T[j] \tag{19}$$

Lemma 5 (Bound for the VCB width in DBC). For the quadratic DBC-based framework, for a SP value $m$, the maximum bound of the VCB width is:

$$W[j] \le \min\left\{ \frac{1-m}{3+m} \cdot \left( \frac{3+m}{2(1+m)}\,\delta + \frac{2}{1+m}\,T_{max} + T[j] \right),\ \frac{(1-m) \cdot \rho\sqrt{2}}{4} \right\} + T[j] \tag{20}$$

Proof: Figure 9(a) shows a uniform quadratic parametric curve (BC or DBC) for the ordered CP set $\{p'_1, p_2, p'_3\}$, with $h$ being the minimum distance of the mid CP $p_2$ from the curve. It follows from [21] that for BC, $2h \le \max\{|p'_1 p_2|, |p_2 p'_3|\}$, where $|p_2 p'_3|$ is the length of edge $p_2 p'_3$. For DBC however, the curve point is generated by shifting the BC and so this distance is reduced. This minimum distance becomes greatest when the end CP pair $p_1$ and $p_3$ coincide, and this is $\tfrac{1}{2}(1-m) \cdot |p'_1 p_2|$, i.e., $\frac{2h}{1-m} \le \max\{|p'_1 p_2|, |p_2 p'_3|\}$. Thus, $\frac{4h}{1-m} \le \max\{|p_1 p_2|, |p_2 p_3|\}$.

Figure 9: a) Distance between a quadratic BC or DBC curve and its CP, b) Maximal width of the admissible CP band calculation.

In the example in Figure 9(b), three CP $\{s_{k-1}, s_k, s_{k+1}\}$ are used to encode a shape segment that includes boundary point $b_j$, which has an admissible distortion of $T[j]$. Assuming $|s_{k-1} s_k| \ge |s_k s_{k+1}|$, the distance of the DBC curve from $s_k$ is always $\le \frac{1-m}{4} \cdot |s_{k-1} s_k|$.
Let $\alpha[j]$ denote the difference between the corresponding admissible distortion and width of the admissible CP band, i.e., $W[j] = \alpha[j] + T[j]$, so:

$$\alpha[j] \le \frac{1-m}{4} \cdot |s_{k-1} s_k| \tag{21}$$

The maximum length of $s_{k-1} s_k$ is:

$$\delta + T_{max} + T_{max} + \alpha_{max} + \alpha_{max} = \delta + 2T_{max} + 2\alpha_{max}$$

where $\alpha_{max}$ is the maximum value of $\alpha[j]$. So

$$\delta + 2T_{max} + 2\alpha_{max} \ge \frac{4}{1-m}\,\alpha_{max} \tag{22}$$

and

$$\alpha_{max} \le \frac{(1-m)(\delta + 2T_{max})}{2(1+m)} \tag{23}$$

The corresponding $\alpha[j]$ for boundary point $b_j$ is given by $\frac{4}{1-m}\,\alpha[j] \le \delta + T_{max} + \alpha_{max} + T[j] + \alpha[j]$. Hence,

$$\alpha[j] \le \frac{1-m}{3+m} \left( \frac{3+m}{2(1+m)}\,\delta + \frac{2}{1+m}\,T_{max} + T[j] \right) \tag{24}$$

The encoding strategy adopted can limit the length of an edge since, for example, the logarithmic code [14] can support a maximum length of $\rho = 15$, while using a 3-connected chain as the direction encoder, it is able to encode a maximum length of $\rho\sqrt{2}$ (through the diagonal), so:

$$\alpha[j] \le \frac{(1-m) \cdot \rho\sqrt{2}}{4} \tag{25}$$

From (24) and (25),

$$\alpha[j] \le \min\left\{ \frac{1-m}{3+m} \cdot \left( \frac{3+m}{2(1+m)}\,\delta + \frac{2}{1+m}\,T_{max} + T[j] \right),\ \frac{(1-m) \cdot \rho\sqrt{2}}{4} \right\}$$

and

$$W[j] \le \min\left\{ \frac{1-m}{3+m} \cdot \left( \frac{3+m}{2(1+m)}\,\delta + \frac{2}{1+m}\,T_{max} + T[j] \right),\ \frac{(1-m) \cdot \rho\sqrt{2}}{4} \right\} + T[j] \tag{26}$$

●

So the VCB-width bound is dependent upon $m$. When $m$ is small, $W[j]$ is large and vice versa, while for $m = 0$, DBC has exactly the same bound as the BC/BS-based model in (17). Moreover, the bounds for both the QBC and QBC-n models are directly obtained from their SP values, $m = \tfrac{1}{12}$ and $m = \tfrac{1}{3}$ respectively, so corroborating the generality of the new DBC paradigm within the ORD framework. With the theory underpinning the DBC model formalised, the next section presents a rigorous experimental results analysis, to test its efficacy from both curve and shape representation, and boundary encoding standpoints.

IV.
EXPERIMENTAL RESULTS ANALYSIS

The performance of DBC is first compared with the classical BC and QBC from the perspective of curve representation, using some hypothetical CP sets, before results are analysed for a series of popular test shapes from the perspectives of both shape description and DBC-based ORD optimal shape encoding. To quantitatively evaluate the performance of DBC, the widely-used shape distortion measurement metrics [14] were employed. Class one distortion measures the peak distortion $D_{max}$ over the entire curve, while class two distortion provides a measure of the mean-square (MS) distortion $D_{ms}$ of the shape approximation. The accurate distortion measurement technique [18] was employed in all the experiments.

IV A. Curve representation results

Figure 10: BC and DBC for different admissible distortions on a Cartesian plot.

The performance of the DBC model was first compared with the original BC from a curve representation perspective by maintaining different admissible distortions ($D_{adm}$) between the curve and the CtrlPoly, initially for a set of synthetic CP, prior to some real-world test shapes being analysed. The Cartesian coordinate plots in Figure 10 reveal that, as anticipated, for lower admissible distortions (e.g., $D_{adm} = 0.3$) the DBC lies closer to the CtrlPoly, while for higher distortions ($D_{adm} = 0.7$) the approximating curve is closer to the classical BC.

Figure 11: The gap size (distortion) between approximating curves and the CtrlPoly versus SP for a sample CP set.

Figure 11 shows the distance between the curves and the CtrlPoly for a sample CP set. This confirms the theory that the gap between the curve and the CtrlPoly is constant for BC, QBC and QBC-n, since in the latter two models the amount of shifting is fixed. In contrast, for DBC the gap varies with the SP value, which is determined from the prescribed admissible distortion.
The graph also illustrates the generalisation of the DBC paradigm: when there is no shifting ($m = 0$), DBC and BC are the same, while, as theoretically proven in Lemma 2, the gap narrows as $m$ increases and the curve closes onto the CtrlPoly as $m$ approaches unity. Moreover, by respectively choosing $m = 0.0833$ and $m = 0.333$ in this plot, the distortion levels of the QBC and QBC-n models are directly obtainable.

IV B. Shape descriptor results

Figure 12: Shape modelling for Arabic-character [1] by a) BC; b) QBC; c) QBC-n; d) DBC approximations.

A series of experiments were performed to test the robustness of DBC as a shape descriptor compared with BC and QBC using the Arabic character from [1], which has strong localised information comprising very sharp peaks followed by sharp troughs over the entire shape. The respective results for BC, QBC, QBC-n and DBC are displayed in Figure 12(a), (b), (c) and (d). Although [1] produces an optimal set of CP in terms of minimum distortion for a BC representation, the perceptual results for DBC clearly reveal a better approximation compared to BC, QBC and QBC-n, for instance in the regions highlighted by the rectangles and ellipses in the respective figures. This judgement is quantitatively confirmed by Table 1, with the peak distortions produced respectively by BC, QBC, QBC-n and DBC being 1.45, 1.44, 1.2 and 1.0 pixels.

Figure 13: DBC, QBC-n, QBC and BC comparison for a) degree elevation; b) composite curves; c) subdivision (legend Sub-Div C H means sub-division convex hull).

A number of experiments were conducted to compare DBC with the Bezier variants. The first set compared DBC with the degree elevation [6] technique. A hypothetical CP set for a quadratic BC was employed for which BC, QBC, QBC-n and DBC respectively yielded maximum distortions of 3.6, 3.3, 2.4 and 1.8 pixels and MS distortions of 4.5, 3.6, 1.9 and 1.01 pixels².
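Degree elevation, the first variant used in this comparison, can be sketched in Python as follows. This is the standard textbook construction (the elevated CP are convex combinations of neighbouring CP) and is not code from the paper; the function names `elevate_degree` and `de_casteljau` and the sample quadratic CP set are illustrative assumptions. The check at the end confirms the defining property that the elevated control polygon describes exactly the same curve while using one more CP.

```python
from math import isclose


def elevate_degree(ctrl):
    """Return the n+2 control points of the degree-(n+1) form of a degree-n BC."""
    n = len(ctrl) - 1
    elevated = [tuple(ctrl[0])]  # the first end point is unchanged
    for i in range(1, n + 1):
        w = i / (n + 1)
        # Q_i = w * P_{i-1} + (1 - w) * P_i
        elevated.append(
            tuple(w * a + (1 - w) * b for a, b in zip(ctrl[i - 1], ctrl[i]))
        )
    elevated.append(tuple(ctrl[-1]))  # the last end point is unchanged
    return elevated


def de_casteljau(ctrl, t):
    """Evaluate the Bezier curve defined by ctrl at parameter t in [0, 1]."""
    pts = [tuple(p) for p in ctrl]
    while len(pts) > 1:
        # Repeated linear interpolation between neighbouring points.
        pts = [
            tuple((1 - t) * a + t * b for a, b in zip(p, q))
            for p, q in zip(pts, pts[1:])
        ]
    return pts[0]


if __name__ == "__main__":
    quadratic = [(0.0, 0.0), (2.0, 4.0), (4.0, 0.0)]  # hypothetical CP set
    cubic = elevate_degree(quadratic)
    print("elevated CP:", cubic)
    same_curve = all(
        isclose(a, b, abs_tol=1e-12)
        for t in (0.0, 0.25, 0.5, 0.75, 1.0)
        for a, b in zip(de_casteljau(quadratic, t), de_casteljau(cubic, t))
    )
    print("same curve after elevation:", same_curve)
```

The extra CP is the cost noted for this family of variants: the polygon moves towards the curve only because more points must be stored and encoded.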
A new CP set for one degree elevation, shown in Figure 13(a), was generated from the same CP set. It is visually apparent that the new control polygon is closer to DBC than to QBC-n, QBC and the classical BC.

To test the effectiveness of DBC compared with the classical BC using a CBC approach, another experiment was conducted using the curve in Figure 13(b), which is intuitively divided into two segments. The corresponding control polygons, each defined by four CP, are shown in Figure 13(b). The results reveal that the control polygon is further away from BC and both QBCs than from DBC.

The plots in Figure 13(c) illustrate the potential of QBC-n and QBC using the midpoint subdivision algorithm [8]. The curves were drawn using the resultant CP generated by Bezier subdivision and reveal that DBC qualitatively generated better curve approximations than BC and both QBCs, using the same subdivided CP set.

Table 1: Results summary obtained for shape description for the Arabic character.

| Distortion measure | DBC | QBC-n | QBC | BC |
|---|---|---|---|---|
| Class one distortion (pixel) | 1.0 | 1.2 | 1.44 | 1.45 |
| Class two distortion (pixel²) | 0.224 | 0.23 | 0.34 | 0.34 |

IV C. Comparison with BS-based ORD optimal shape coding framework

As mentioned earlier, DBC has principally been developed to provide variable rate-distortion through shifting. Building on the bedrock of BC and QBC theories, the relationship in (4), subject to the appropriate CP adjustments, permits DBC to be embedded into a BS-based framework in an analogous manner to QBC in [10]. Section IIIC delineated how quadratic DBC could be used within the BS-based ORD optimal shape coding algorithms, so some related experimental results are now presented.

Figure 14: Results for the left-hand kid in the 1st frame of the Kids sequence with $T_{max} = 2$ and $T_{min} = 1$ pixel for (a) BS, (b) QBC, (c) QBC-n, (d) DBC (legend: solid line, approximated boundary; dashed line, original boundary; asterisk, CP).
This series of experiments concentrated upon the requisite bit-rate for a prescribed set of admissible distortion values. The results produced by the ORD algorithm with the different curves upon the first kid shape of the 1st frame of the Kids sequence, for peak distortion bounds of $T_{max} = 2$ and $T_{min} = 1$ pixel, are shown in Figure 14(a)-(d). The subjective results in Figure 14 show that the approximated shapes maintained the admissible distortions in all cases and that, for DBC with $m = 0.5$, the approximating curves possessed similar smoothness to both the BS- and QBC-based models.

To assess the robustness of DBC, the average bit and computational time requirements per frame were compared for a number of standard test shape sequences of varying spatial and temporal resolutions, using different admissible distortion settings. These are summarised in Table 2, which shows that DBC consistently provides the lowest bit-rate, with the extra overhead incurred in encoding $m$ included in all DBC bit-rates. The computational time results reveal that DBC sustains comparable throughputs to all other approaches (BC, QBC and QBC-n), with the improvement over both QBC paradigms being due to the latter's use of nested recursive calls [10]. Conversely, as DBC includes an iterative search algorithm to determine the optimal SP value, BS approximations generally afford a lower computational overhead, but crucially require a higher bit-rate.

Table 2: Average bit-rate (bits per frame) and computational time (minutes per frame) requirements for different video test sequences and distortion limits (pixels), using different parametric curves within the dynamic ORD optimal shape coding framework.
Each cell gives bit-rate (bits), computational time (minutes).

| Test sequence | Admissible distortion | DBC | QBC-n | QBC | BS |
|---|---|---|---|---|---|
| MissAmerica.qcif (100 frames) | $T_{max}=1$, $T_{min}=1$ | 325, 3.00 | 331, 3.10 | 358, 3.10 | 360, 2.05 |
| | $T_{max}=2$, $T_{min}=1$ | 289, 5.20 | 295, 5.25 | 308, 5.22 | 310, 4.00 |
| | $T_{max}=2$, $T_{min}=2$ | 250, 6.05 | 262, 6.06 | 273, 6.06 | 275, 3.95 |
| Akiyo.qcif (300 frames) | $T_{max}=1$, $T_{min}=1$ | 302, 3.00 | 310, 3.00 | 332, 3.00 | 335, 2.06 |
| | $T_{max}=2$, $T_{min}=1$ | 268, 5.15 | 278, 5.16 | 290, 5.16 | 295, 4.08 |
| | $T_{max}=2$, $T_{min}=2$ | 240, 5.96 | 251, 5.99 | 275, 5.99 | 277, 4.00 |
| Bream.qcif (300 frames) | $T_{max}=1$, $T_{min}=1$ | 430, 4.50 | 435, 4.50 | 450, 4.50 | 465, 3.05 |
| | $T_{max}=2$, $T_{min}=1$ | 350, 6.25 | 360, 6.26 | 400, 6.25 | 402, 4.50 |
| | $T_{max}=2$, $T_{min}=2$ | 314, 7.20 | 325, 7.25 | 370, 7.23 | 375, 5.25 |
| Kids.sif (100 frames) | $T_{max}=1$, $T_{min}=1$ | 1060, 15.02 | 1084, 15.10 | 1136, 15.10 | 1140, 12.00 |
| | $T_{max}=2$, $T_{min}=1$ | 690, 19.00 | 708, 19.02 | 728, 19.02 | 730, 15.25 |
| | $T_{max}=2$, $T_{min}=2$ | 600, 21.00 | 620, 21.20 | 628, 21.20 | 641, 18.52 |
| Stefan.sif (450 frames) | $T_{max}=1$, $T_{min}=1$ | 445, 5.12 | 455, 5.15 | 475, 5.12 | 481, 3.25 |
| | $T_{max}=2$, $T_{min}=1$ | 392, 7.05 | 405, 7.10 | 445, 7.07 | 446, 5.01 |
| | $T_{max}=2$, $T_{min}=2$ | 352, 7.89 | 371, 8.00 | 415, 7.95 | 417, 5.42 |

Figure 15: Comparative RD performances for different ORD algorithms using the MPEG-4 $D_n$ metric upon the Kids.sif test sequence.

To substantiate the performance of the DBC paradigm within the dynamic vertex-based ORD optimal shape coding framework, a final series of experiments was conducted using the MPEG-4 shape distortion metric $D_n$, which is defined as the percentage ratio of the number of erroneously represented pixels of an approximating shape to the total number of pixels in the original shape [22]. Figure 15 displays the corresponding RD curves for the BS-, QBC-, QBC-n- and DBC-based algorithms for the Kids.sif sequence, which reveal how the DBC performance depends upon the SP, with, for example, DBC(0.6) clearly producing superior results to DBC(0.2) due to the increased level of shifting.
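The $D_n$ metric described above can be illustrated with a minimal Python sketch. It assumes both shapes are given as equally sized binary masks (nested lists of 0/1) and that an "erroneously represented" pixel is one whose value differs between the two masks; the function name `dn_percent` and the toy masks are hypothetical and are not part of the MPEG-4 reference software.

```python
def dn_percent(original, approx):
    """D_n in percent: mismatched pixels divided by pixels in the original shape.

    Assumption: both arguments are binary masks of identical dimensions, and
    an erroneous pixel is one where the two masks disagree.
    """
    if len(original) != len(approx) or any(
        len(row_o) != len(row_a) for row_o, row_a in zip(original, approx)
    ):
        raise ValueError("masks must have identical dimensions")
    shape_pixels = sum(1 for row in original for v in row if v)
    if shape_pixels == 0:
        raise ValueError("the original shape contains no pixels")
    error_pixels = sum(
        1
        for row_o, row_a in zip(original, approx)
        for o, a in zip(row_o, row_a)
        if bool(o) != bool(a)
    )
    return 100.0 * error_pixels / shape_pixels


if __name__ == "__main__":
    # Toy 4x4 example: a 2x2 original shape, with one pixel lost and one
    # spurious pixel gained in the approximation.
    original = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    approx = [
        [0, 0, 0, 0],
        [0, 1, 1, 1],
        [0, 1, 0, 0],
        [0, 0, 0, 0],
    ]
    print(f"D_n = {dn_percent(original, approx):.1f}%")
```

Because the denominator is the area of the original shape rather than of the frame, the same number of boundary errors yields a larger $D_n$ for small objects than for large ones.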
For DBC(0.2), the SP value is low and this is reflected by the RD results being more akin to those of the QBC model ($m = 0.0833$), while DBC(0.6) gives the optimal bit-rate produced by the DBC framework. The RD curves also show that at higher distortions, DBC produced comparatively better results than QBC-n, with, for example, at $D_n = 1.326\%$, the respective bit-rate requirements for QBC-n and DBC being 491 and 475 bits. This improvement is a direct consequence of the flexibility DBC affords in controlling the amount of shifting of BC points towards the CtrlPoly, in comparison to the BC and QBC-n models, where either no shift or a preset shift is applied.

V. CONCLUSION

While the Bezier curve (BC) is a well established tool for a wide range of applications, its principal drawback is that it does not consider localised shape information. This paper has focused upon bridging this gap by developing a flexible model that integrates variable local information into the classical BC framework, without increasing the number of control points. The theoretical foundations of the dynamic Bezier curve (DBC) have been presented together with a strategy to determine the optimal value of the shifting parameter, and it has also been proven that DBC retains the core properties of the BC. The qualitative and quantitative results using different control point sets and test shapes have endorsed the capacity DBC affords in terms of consistently lower shape distortion compared with BC and the two Quasi-BC models, together with other recognized shape descriptor methodologies. DBC can be seamlessly integrated into all these descriptor strategies and the operational rate-distortion optimal vertex-based shape coding framework to improve their shape approximating performance. This paper has also determined the theoretical bounds of the admissible control point band for DBC when it is embedded within the classical vertex-based shape coding framework.

VI.
ACKNOWLEDGEMENTS

The authors acknowledge that this work is partially supported by a Monash University Postgraduate Publications Award.

VII. REFERENCES

[1] M. Sarfraz and M.A. Khan, “Automatic outline capture of Arabic fonts,” Information Sciences, pp.269-281, 2002.
[2] L. Cinque, S. Levialdi and A. Malizia, “Shape description using cubic polynomial Bezier curves,” Pattern Recognition Letters, pp.821-828, 1998.
[3] J.A. Muñoz-Rodriguez, R. Rodriguez-Vera, and M. Servin, “Direct object shape detection based on skeleton extraction of a light line,” Opt. Eng., vol.39, no.9, pp.2463-2471, 2000.
[4] L.D. Soares and F. Pereira, “Spatial shape error concealment for object-based image and video coding,” IEEE Transactions on Image Processing, vol.13, no.4, pp.586-599, 2004.
[5] F.S. Hill Jr., Computer Graphics, Prentice Hall, Englewood Cliffs, 1990.
[6] A.R. Forrest, “Interactive interpolation and approximation by Bézier polynomials,” Computer Journal, vol.15, no.1, pp.71-79, 1972.
[7] R.H. Bartels, J.C. Beatty and B.A. Barsky, An Introduction to Splines for use in Computer Graphics & Geometric Modeling, Morgan Kaufmann Publishers Inc, 1987.
[8] J.M. Lane and R.F. Riesenfeld, “A theoretical development for the computer generation of piecewise polynomial surfaces,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.2, no.1, pp.35-46, 1980.
[9] M. Hosaka and F. Kimura, “A theory and methods for free form shape construction,” Journal of Information Processing, vol.3, no.3, pp.140-151, 1980.
[10] F.A. Sohel, G.C. Karmakar, L.S. Dooley, and J. Arkinstall, “Quasi-Bezier curves integrating localised information,” Pattern Recognition, vol.40, no.2, pp.513-542, 2008.
[11] F.A. Sohel, L.S. Dooley, and G.C. Karmakar, “A dynamic Bezier curve model,” in Proc. International Conference on Image Processing, ICIP-05, vol.II, pp.474-477, 2005.
[12] F.A. Sohel, L.S. Dooley, and G.C.
Karmakar, “New dynamic enhancements to the vertex-based rate-distortion optimal shape coding framework,” IEEE Transactions on Circuits and Systems for Video Technology, vol.17, no.10, pp.1408-1413, 2007.
[13] A.K. Katsaggelos, L.P. Kondi, F.W. Meier, J. Ostermann, and G. Schuster, “MPEG-4 and rate-distortion-based shape-coding techniques,” Proceedings of the IEEE, vol.86, no.6, pp.1126-1154, 1998.
[14] G.M. Schuster and A.K. Katsaggelos, Rate-Distortion Based Video Compression-Optimal Video Frame Compression and Object Boundary Encoding, Kluwer Academic Publishers, 1997.
[15] G.M. Schuster, G. Melnikov, and A.K. Katsaggelos, “Operationally optimal vertex-based shape coding,” IEEE Signal Processing Magazine, vol.15, no.6, pp.91-108, 1998.
[16] L.P. Kondi, G. Melnikov, and A.K. Katsaggelos, “Jointly optimal coding of texture and shape,” Proceedings of International Conference on Image Processing (ICIP), vol.3, pp.94-97, 2001.
[17] L.P. Kondi, G. Melnikov, and A.K. Katsaggelos, “Joint optimal object shape estimation and encoding,” IEEE Transactions on Circuits and Systems for Video Technology, vol.14, no.4, pp.528-533, 2004.
[18] F.A. Sohel, L.S. Dooley, and G.C. Karmakar, “Accurate distortion measurement for generic shape coding,” Pattern Recognition Letters, vol.27, no.2, pp.133-142, 2006.
[19] H. Everett, “Generalized Lagrange multiplier method for solving problems of optimum allocation of resources,” Operational Research, vol.11, pp.399-417, 1963.
[20] J.B. Scarborough, Numerical Mathematical Analysis, Baltimore: Johns Hopkins, 1966.
[21] D. Nairn, J. Peters, and D. Lutterkort, “Sharp, quantitative bounds on the distance between a polynomial piece and its Bezier control polygon,” Computer Aided Geometric Design, vol.16, no.7, pp.613-631, 1999.
[22] N. Brady, “MPEG-4 standardized methods for the compression of arbitrarily shaped video objects,” IEEE Transactions on Circuits and Systems for Video Technology, vol.9, no.8, pp.1170-1189, 1999.
