Inria, Univ. Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, France
Figure 1: Our rough animation system supports the inbetweening of complex special effects with multiple topological events, here a flame
splitting into smaller components. In (a), we show the input key drawings traced over a reference from Gilland’s book [Gil09], while in
(b) we show their decomposition into transient embeddings. Even though we use a unique color per embedding throughout the animation
for visualization, each embedding only exists between a pair of keyframes in this example. This allows the handling of topological changes,
which occur at the third and fifth keyframes. In (c), we display in magenta a subset of the inbetween frames generated by our animation
system in real-time. Please see the supplemental results video for the full sequence.
Abstract
Traditional 2D animation requires time and dedication since tens of thousands of frames need to be drawn by hand for a typical
production. Many computer-assisted methods have been proposed to automatize the generation of inbetween frames from a set
of clean line drawings, but they are all limited by a rigid workflow and a lack of artistic controls, which is for the most part due
to the one-to-one stroke matching and interpolation problems they attempt to solve. In this work, we take a novel view on those
problems by focusing on an earlier phase of the animation process that uses rough drawings (i.e., sketches). Our key idea is to
recast the matching and interpolation problems so that they apply to transient embeddings, which are groups of strokes that only
exist for a few keyframes. A transient embedding carries strokes between keyframes both forward and backward in time through
a sequence of transformed lattices. Forward and backward strokes are then cross-faded using their thickness to yield rough
inbetweens. With our approach, complex topological changes may be introduced while preserving visual motion continuity.
As demonstrated on state-of-the-art 2D animation exercises, our system provides unprecedented artistic control through the
non-linear exploration of movements and dynamics in real-time.
CCS Concepts
• Computing methodologies → Graphics systems and interfaces; Animation;
1. Introduction

Traditional 2D animation requires a lot of planning, not only at the level of storyboards, but also for the animation itself where rough drawings (i.e., sketches) are used to test out the motion of different characters or special effects (e.g., water, smoke). These tests aim at defining the trajectories of the characters or objects, as well as their dynamics (e.g., their speed and acceleration) through the timing & spacing of drawings [JT95, Wil01]. Even though the rough animation itself is not directly visible in the final movie, its impact on motion design is vividly retained, since it serves as a guide for “inbetweeners” – the artists who draw all the intermediate frames.

In this paper, we introduce a system for the design and exploration of rough 2D animations, a problem which, to the best of our knowledge, has never been addressed in previous work. Our main contribution is in the assembly and adaptation of a set of existing techniques for the previsualization of 2D animations. The main goal is to provide real-time feedback at intermediate frames between rough key drawings, to both significantly speed up the animation process and allow artists to experiment with different creative alternatives. This is useful not only for experienced animators, who may try variations in early tests for discussions with art directors and quickly converge to final rough animations to pass down to inbetweeners; but also for animation students, who may benefit from the ability to observe interactively the look-and-feel of different animation choices. As described in supplemental material, we relied on an observational study of a professional animator at work followed by interviews to guide the design of our system. By working at the rough animation stage, we leverage the fact that drawings are sketchy and the global perception of movement is more important than the appearance of the strokes themselves that will eventually be redrawn at the cleaning stage. However, this brings two fundamental challenges. First, artists may create drawings through very different workflows such as “shift-and-trace” (drawings are traced over deformations of previous ones) or “pose-to-pose” (all key drawings are created in advance then interpolated). Second, the drawings themselves are most often composed of different numbers of strokes and routinely differ in their number of parts.

As described by Fekete et al. [FBC∗95], automated inbetweening systems can be divided into two main families: those based on templates or embeddings (e.g., [BW75]), and those relying on explicit correspondences between strokes (e.g., [MIT67]). The former family is mostly well-suited for “cut-out” animations since the movement of the embedded objects or characters is restricted by the motion of their template (e.g., skeleton, control polygon, cage) whose topology is usually fixed throughout the animation. Explicit correspondence systems are more flexible as the stroke-to-stroke mapping is transient, changing between each pair of keyframes. However, they are restricted to “tight inbetweening” of clean line drawings due to the challenge (or chore) of matching complex networks of strokes. In this work, we propose transient embeddings to keep the best of both approaches, hence allowing template-based animation of rough drawings with topological changes. In practice, this requires adapting two common problems to deal with transient embeddings: the matching problem where two drawings must be registered, here with drawings having different numbers of strokes; and the interpolation problem, where the movement from one key drawing to the next must be generated while providing flexible and interactive artistic control over timing, spacing and trajectories. A key feature of our approach is to enable changes of topology at keyframes (i.e., key drawings may have different numbers of embeddings), while ensuring visual continuity through constrained trajectories that persist over multiple keyframes.

Our main contribution is a novel animation system that relies on transient embeddings to provide full non-linear artistic control at the rough animation stage, as described in Section 3. Methods for matching embeddings at keyframes are introduced in Section 4: they work with shift-and-trace and pose-to-pose workflows, or any combination of them. Methods for interpolating between embeddings are presented in Section 5, featuring non-linear control over timing & spacing and direct artistic control over trajectories between and across keyframes. Implementation details, including a novel real-time ARAP registration technique for vector strokes, are given in Section 6. We demonstrate that our system allows users to produce complex 2D animations in Section 7, by reproducing typical animation exercises [Wil01], such as walk cycles, special effects, articulated motion, and some principles of animation [JT95], such as anticipation and follow-through, or squash and stretch.

2. Related work

The design of computer-aided 2D animation systems dates back to the inception of Computer Graphics in the late ’60s and early ’70s [MIT67, Bae69, BW71]. As already observed by Catmull [Cat78], automatic inbetweening is a central problem tackled by most of such systems, and yet – more than forty years later – current commercial solutions [Ado, Too, CAC, Com] often remain too limited or time consuming for most use cases in production. In addition, by focusing on inbetweening of final clean line drawings, we believe that those systems and most previous work in academia have missed the real potential benefit of computer assistance, that is, using the words of Durand [Dur91], to “boost user creativity by allowing them to concentrate on the most interesting part of their work”: the design and exploration of motion.

The key idea of our approach is to shift focus from the animation of individual strokes to the animation of groups of strokes that are only defined between a pair of keyframes, which we call transient embeddings. This nevertheless requires revisiting the two main stages of explicit correspondence techniques: matching and interpolation.

Matching. Most early methods require the user to manually identify correspondences between the strokes of consecutive keyframes and do not handle occlusions or topological changes [MIT67, Bae69, Ree81, Dur91]. More recent techniques support such features using manually populated 2.5D [DFSEVR01, RID10] or space-time [DRvdP15] data structures. Despite their appeal, these approaches require ad hoc, rather constraining drawing representations which are not suitable for rough drawings. To partially automatize the stroke correspondence process, a large body of work represents the drawings as a graph of strokes and tries to match those graphs at subsequent keyframes [Kor02, WNS∗10, LCY∗11, YBS∗12, CMV17, Yan18, YSC∗18, MFXM21]. These methods differ in the graph matching algorithm they employ, and in the way users interact with the system to guide or correct correspondences, especially when strokes appear or disappear. To resolve occlusions in a user-controllable fashion, Jiang et al. [JSL22] introduce “boundary strokes”, i.e., strokes with an occluding side that act as occluding surfaces. However, none of these methods can handle rough drawings with a highly dissimilar number of strokes per keyframe, and despite recent advances in rough sketch cleanup [YVG20], no algorithm is currently able to produce a sequence of clean line drawings that can be automatically inbetweened.

Alternatively, some methods aim at estimating region (rather than stroke) correspondences between consecutive frames based on their appearance (color, shape, distance) [Xie95, MSG96, SBv05, dJB06, BBA09, ZHF12, LMY∗13] and motion features [ZLWH16],
but they are limited to polygonal shapes or cel animations (i.e., mostly flat color regions with clean line boundaries). Taking inspiration from As-Rigid-As-Possible (ARAP) shape deformation techniques [IMH05, WXXC08], Sýkora et al. [SDC09] present an image registration algorithm that decouples the matching resolution from the image complexity by embedding it into a square lattice. Noris et al. [NSC∗11] use this method to estimate a global warp between two drawings of an existing rough animation, abstracting the input strokes by their rasterized distance fields. Then, each stroke of the first drawing, deformed by the ARAP transformation, is matched with the most similar stroke in the second one. We also embed strokes into square lattices, but extend the registration algorithm to directly take as input vectorial strokes. Furthermore, we make the assumption that stroke-to-stroke correspondences are not required to depict motion in rough animations, which we demonstrate in our results.

Closest to our work, Xing et al. [XWSY15] present an interactive system that combines a global shape similarity metric with an embedding-free ARAP deformation model [SSP07] to match an existing drawing with a new set of hand-drawn guidelines. We discuss the benefits of our explicit embeddings in Section 8.1 and provide visual comparison in the supplemental results video.

Following the current trend in computer science, learning-based techniques [Yag17, NHA19, LZLS21, SZY∗21] have also been proposed to estimate per-pixel correspondence between two raster clean line drawings. In the work of Casey et al. [CPL21], line-enclosed segments are first extracted from the two drawings, and then correspondences between segments are estimated using a combination of convolutional and transformer neural networks. Extending such data-driven approaches to rough drawings, whose style may vary considerably from one artist to another, seems extremely difficult.

Interpolation. Once the key drawings have been put into correspondence, inbetween frames can be generated by interpolation. As already noted by Burtnyk et al. [BW75], linear interpolation and thus linear trajectories sampled at uniform rates do not produce natural motion in the great majority of cases.

To offer maximum artistic control, the animation system of Reeves [Ree81] allows the user to specify the trajectory and dynamics of a set of “moving points” spanning multiple keyframes. This effectively decomposes the full 2D+t space of the animation into a network of Coons patches, within which interpolation can be performed independently but with continuity at boundaries. However, heuristics are required to complete the patch network, and user manipulation of moving points in space and time may be laborious.

Kort [Kor02] models trajectories of stroke vertices by quadratic splines. The user can correct these paths when needed and specify their spacing. Since this simple interpolation scheme does not take the shape of the strokes into account, it may lead to local or global distortions. Sederberg et al. [SGWM93] introduce an intrinsic interpolation technique which minimizes shape distortion. Similar approaches [FTA05, SZGP05] attempt to preserve local differential quantities (Laplacian coordinates or edge deformation gradients). But those three methods only apply to a single polyline. Motivated by classical 2D animation books [JT95, Wil01], Whited et al. [WNS∗10] present an interpolation scheme that produces arc trajectories for a full graph of strokes. It first computes motion paths for stroke endpoints along logarithmic spirals, and then deforms the intermediate stroke vertices using intrinsic interpolation [SGWM93] followed by curve fitting and a tangent-aligning warp to ensure continuity between adjacent strokes. The trajectory of any stroke vertex can be edited, albeit without considering its dynamics. This scheme was later used by Noris et al. [NSC∗11] for generating smooth stroke trajectories between pairs of strokes.

An alternative solution to minimize shape distortion is to rely on ARAP interpolation [ACOL00, XZWB05] of 2D embeddings of the drawings. The interpolated trajectories can be controlled through point and vector constraints [BBA08, KHS∗12] or even a full skeleton [YHY19]. However, the boundary polygon of those embeddings must be compatible across keyframes and put into correspondence, and a compatible triangulation of their interior must also be built. Baxter et al. [BBA09] describe the most relevant techniques to solve this challenging problem along with their own solution. Zhu et al. [ZPBK17] extend these approaches to handle extreme shape deformations and topological changes, but their method requires significant manual intervention and involves an expensive numerical optimization that prevents its use in an interactive system.

Yang [Yan18] combines the stroke deformation technique of Whited et al. [WNS∗10] with a simpler embedding, called “context mesh”, that better preserves the global layout of the stroke network. He presents an automatic construction algorithm of these compatible meshes based on the matched strokes, and an edge-based rigid interpolation technique inspired by the method of Igarashi et al. [IMH05] which is robust to degenerate configurations (e.g., collapsing edges) and may be constrained to follow a given trajectory. It is however unclear how such “context meshes” could be built for rough drawings. Instead, we use even simpler lattice embeddings, which are compatible between two keyframes by construction, but do not need to extend further in time.

Dvorožnák et al. [DLKS18] use similar embeddings to build deformable puppets, but since those are connected at fixed junctions driven by a skeleton, their results suffer from the “cut-out” look-and-feel. The animation system of Bai et al. [BKLP16] integrates handle-based shape deformations with example-based simulations to interpolate drawings embedded into triangular meshes. This approach manages to reproduce many of the principles of animation [JT95], supports manual topological changes and local control of the dynamics, but user interaction is restricted to handle manipulation, hence once again following the “cut-out” metaphor rather than hand-drawn animation.

For image-based approaches, interpolation turns into an image morphing (i.e., deformation and blending) problem. Many solutions have been proposed for photographs (e.g., [FZP∗20, PSN20]), cartoon animations [LZLS21, SZY∗21, CZ22] and, closest to our inputs, concept sketches [ADN∗17]. Yet, rough drawings have a very specific style which requires preserving the distribution, spatial continuity and color or gray-level intensity of the strokes. Previous approaches are unlikely to satisfy all three criteria. In this work, we do not attempt to solve this problem, and use simple cross-fading of stroke thicknesses that turns out to be sufficient for motion previsualization in practice.

3.1. Transient embeddings

In its simplest form, as shown in Figure 2(a), an embedding is defined by a pair of lattices with the same topology at two keyframes. The lattice at the start keyframe holds strokes (a subset from the corresponding key drawing) that are propagated forward in time. A second set of strokes is stored in a transformed lattice at the end keyframe, which is lined up in time with the next keyframe. However, the end keyframe itself is never displayed. The transformation between the two lattices must be invertible so that strokes from the end keyframe can be propagated backward in time. The two sets of forward and backward strokes are cross-faded using stroke thickness instead of opacity. Such a representation opens up a number of possibilities. For instance, backward strokes may be obtained by copying a subset of the strokes in the next key drawing, hence

The same embedding may be used over multiple keyframes, as shown in Figure 2(b). To make this possible, we introduce breakdown keyframes, inspired by the traditional animation technique of the same name [Wil01]. In our system, they amount to storing a new transformed lattice in the embedding, along with an additional set of strokes that is propagated both backward and forward in time.

Figure 2(c) abstracts the structure of a transient embedding with a simple sequence of symbols: a square for the start keyframe, circles for (optional) breakdown keyframes, and a crossed circle for the end keyframe. In effect, the end keyframe is a special case of breakdown that is only propagated backward in time, and is not itself displayed. Segments between symbols represent both forward and backward embedding transformations, which may be evaluated at any time step to yield cross-faded strokes.
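
The sequence of symbols described above (a start keyframe, optional breakdowns, and an end keyframe that is never displayed) can be illustrated with a small Python sketch. This is only one possible organization under stated assumptions, not the actual implementation: the names `KeyKind`, `Key`, `TransientEmbedding`, `validate` and `segment_at` are hypothetical, "same topology" is approximated by an equal number of lattice corners, and the local time is a plain linear parameter that ignores any spacing function.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple

Point = Tuple[float, float]
Stroke = List[Point]  # a stroke is stored as a polyline


class KeyKind(Enum):
    START = "start"          # strokes are propagated forward in time only
    BREAKDOWN = "breakdown"  # strokes are propagated both backward and forward
    END = "end"              # strokes are propagated backward only; never displayed


@dataclass
class Key:
    frame: int
    kind: KeyKind
    lattice: List[Point]  # lattice corner positions at this keyframe
    strokes: List[Stroke] = field(default_factory=list)


@dataclass
class TransientEmbedding:
    keys: List[Key]

    def validate(self) -> None:
        """Check the start / breakdown* / end pattern and basic consistency."""
        kinds = [k.kind for k in self.keys]
        if len(kinds) < 2 or kinds[0] is not KeyKind.START or kinds[-1] is not KeyKind.END:
            raise ValueError("expected a start key first and an end key last")
        if any(kind is not KeyKind.BREAKDOWN for kind in kinds[1:-1]):
            raise ValueError("interior keys must be breakdowns")
        frames = [k.frame for k in self.keys]
        if any(a >= b for a, b in zip(frames, frames[1:])):
            raise ValueError("key frames must be strictly increasing")
        # Simplification: equal corner counts stand in for "same topology".
        if len({len(k.lattice) for k in self.keys}) != 1:
            raise ValueError("all lattices must share the same topology")

    def segment_at(self, frame: float) -> Tuple[Key, Key, float]:
        """Return the two keys surrounding `frame` and a linear local time in [0, 1)."""
        # The end keyframe is excluded: it is lined up with the next keyframe.
        if not self.keys[0].frame <= frame < self.keys[-1].frame:
            raise ValueError("frame lies outside the displayed range of the embedding")
        for left, right in zip(self.keys, self.keys[1:]):
            if left.frame <= frame < right.frame:
                t = (frame - left.frame) / (right.frame - left.frame)
                return left, right, t
        raise AssertionError("unreachable for a validated embedding")
```

With keys at frames 0 (start), 4 (breakdown) and 8 (end), querying frame 6 returns the breakdown and end keys with a local time of 0.5, while querying frame 8 is rejected because the end keyframe is never displayed.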
Figure 3: In our animation system, the timeline is segmented into intervals whose boundaries are shown with tall ticks, while short ticks delimit frames. An animation is created by (a) adding transient embeddings that span one or more intervals, with their keyframes lined up at the beginning of each interval. Timing is readily modified by (b) automatically updating embeddings when interval boundaries are moved. Edits are local since (c) the removal of an embedding does not affect other embeddings.

Figure 4: Non-linear editing of the animation structure is made possible by special updates of the transient embeddings. Splitting an interval in two is done by (a) inserting a breakdown keyframe in all involved embeddings, which may optionally be converted into an end keyframe followed by a new, automatically generated embedding. Removing an interval yields (b) three different types of results depending on the type of keyframe: extension of the embedding, removal of a breakdown, or removal of the embedding.
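
The split operation of Figure 4(a) can be sketched as follows, purely as an illustration. The representation is an assumption made for brevity: the timeline is a sorted list of interval boundaries, each embedding is reduced to the sorted list of frames of its keys (first entry is the start key, last entry is the end key, anything in between is a breakdown), and the function name `split_interval` is hypothetical. Lattices, strokes and the optional conversion of the breakdown into an end keyframe are left out.

```python
from bisect import insort
from typing import Dict, List


def split_interval(boundaries: List[int],
                   embeddings: Dict[str, List[int]],
                   frame: int) -> List[str]:
    """Split the interval containing `frame`, in place.

    A breakdown key is inserted in every embedding that spans `frame`.
    Returns the names of the embeddings that were modified.
    """
    if frame in boundaries or not boundaries[0] < frame < boundaries[-1]:
        raise ValueError("frame must fall strictly inside an existing interval")
    insort(boundaries, frame)  # the timeline gains one interval boundary
    touched = []
    for name, keys in embeddings.items():
        # Only embeddings whose start and end keys enclose the frame are involved.
        if keys[0] < frame < keys[-1]:
            insort(keys, frame)  # interior keys are breakdowns by construction
            touched.append(name)
    return touched
```

Because each embedding is updated independently, embeddings that do not span the split frame are left untouched, which mirrors the locality of edits noted in Figure 3.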
Figure 11: We illustrate our spacing function on the green embedding $E_g$. By default, we use (a) the identity spacing function, resulting in a linear time parameter $t$. It may be modified through (b) user-provided ticks in the vertical spacing chart at left, which results in a new spacing function $S$ (in red). After retiming (c), $S$ is stretched to $S'$ and new ticks on the spacing chart (in red) are automatically recomputed.
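
To make the caption of Figure 11 concrete, the sketch below models a spacing function as a list of ticks, one per frame of an interval, going from 0 at the first frame to 1 at the last. It is a simplified illustration under explicit assumptions: the spacing function is taken to be piecewise linear between ticks (the actual analytic form of $S$ is not specified here), and the names `identity_ticks`, `spacing` and `retime` are hypothetical.

```python
from typing import List


def identity_ticks(span: int) -> List[float]:
    """Ticks of the identity spacing for an interval covering `span` frame steps."""
    return [i / span for i in range(span + 1)]


def spacing(ticks: List[float], first_frame: int, frame: float) -> float:
    """Evaluate t = S(frame), assuming linear interpolation between ticks.

    ticks[i] is the value of S at frame first_frame + i.
    """
    span = len(ticks) - 1
    x = frame - first_frame
    if not 0 <= x <= span:
        raise ValueError("frame lies outside the interval")
    i = min(int(x), span - 1)  # index of the tick at or before the frame
    u = x - i                  # fractional position between ticks i and i + 1
    return (1 - u) * ticks[i] + u * ticks[i + 1]


def retime(ticks: List[float], new_span: int) -> List[float]:
    """Stretch S over `new_span` frame steps and recompute one tick per frame."""
    old_span = len(ticks) - 1
    return [spacing(ticks, 0, j * old_span / new_span) for j in range(new_span + 1)]
```

For instance, an interval covering frames 1 to 5 has four frame steps; retiming it to frames 1 to 9 calls `retime(ticks, 8)`, which keeps the overall shape of the spacing while producing nine ticks.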
Figure 13: Hierarchical constraints maintain spatial relationships between adjacent embeddings throughout interpolation. In (a), the arm, forearm and hand embeddings interpenetrate at intermediate frames, as shown by the dashed trajectories. In (b), by placing two hierarchical constraints – one (in red) on the arm driving the forearm, and the other (in blue) on the forearm driving the elbow – the full articulated chain remains properly connected at constrained points, which follow smooth arc trajectories.

notable feature of our approach, since it provides persistent control over trajectories while relying on transient embeddings. We demonstrate that feature in Figure 12(b), where the trajectories of $E_g$ and $E_r$ are chained with $G^1$ continuity to trajectories of $E_b$. This produces a visually continuous motion even in the presence of a topological change, as is best seen in the supplemental demo video where we use more inbetween frames.

Trajectories might not only be used to “knit” embeddings across time but also in space. This is a useful feature when two embeddings that are visually connected at contiguous keyframes get disconnected or interpenetrate during interpolation, as shown in Figure 13(a). For a shared constraint to be added, the lattices of the two embeddings must overlap. One of the embeddings is identified as the leader $E_l$ and the other one as the follower $E_f$. Only the constrained trajectory of $E_l$ is edited, while $E_f$ follows that same trajectory. This raises an issue when the spacing functions $S_l$ and $S_f$ of the respective embeddings are different, as $E_l$ and $E_f$ are then driven by the same trajectory but at different speeds. The issue is trivially solved by using the spacing function of the leader $S_l$ on the follower $E_f$, only at the constrained embedding point; $S_f$ is retained otherwise. Such hierarchical constraints bear a resemblance to skeletons used in cut-out animation systems, as they open up the control of articulated structures. In particular, they may be persistent as with any other constrained trajectory, as shown in Figure 13(b). However, they may also be deactivated at any keyframe, which is useful for imposing temporary constraints, such as an object temporarily attached before being thrown away (see the supplemental results video for an example).

Constrained trajectories are easily updated after each of the timeline operations of Figure 4. When an embedding is split at a frame $f_{\text{split}}$, the lattice positions $V_{\text{split}} = V(S(f_{\text{split}}))$ are used for the breakdown keyframe and a new ARAP interpolation is recomputed on each new interval, i.e., from $V_0$ to $V_{\text{split}}$ and from $V_{\text{split}}$ to $V_1$. Bézier trajectories are split as well, and the tangents at the split points recomputed, for instance using De Casteljau’s algorithm. In all our tests, we have observed that the motions before and after splitting are visually identical, even though we could not find a proof of the

6. Implementation details

Strokes & lattices. In our system, strokes are represented by polylines with stylus pressure recorded at vertices when available. As mentioned in previous sections, embeddings may apply to subsets of strokes, which consist of stroke segments. Moreover, since embeddings are transient, different subsets of strokes may be manipulated at different keyframes, as demonstrated in the last interval of Figure 17. Each new lattice is required to be connected and is automatically initialized in an axis-aligned manner. We make only one exception: when converting a breakdown keyframe into a new embedding (see Figure 4(a)), we keep the previously transformed lattice to retain its target transformation $T^\star$. This is a small limitation of the current implementation that could be solved by transferring $T^\star$ to a new axis-aligned lattice. We provide an additional tool to add contiguous empty cells inside an embedding, which is particularly useful to fill in lattices (enforcing their rigidity through interpolation), or to provide support for constraints away from strokes. The size of lattice cells may also be adjusted to capture fine details.

Vector registration. The registration problem consists in finding a transformation $T^\star$ that aligns the strokes in an embedding (the source) with a given subset of stroke segments in the next keyframe (the destination). Our solution relies on the image registration technique of Sýkora et al. [SDC09], which works in two phases that are iterated until convergence: a “push” phase that moves lattice corners towards locations where the source and destination are similar; and a “regularize” phase that reintroduces local rigidity.

In our approach, we adapt the push phase to work on vector strokes, while the regularize phase is left unchanged, as illustrated in Figure 14. In our push phase, we compute, for each lattice cell, the optimal rigid transformation that minimizes the sum of squared distances between the stroke points in the source and their nearest neighbors in the destination, using the closed-form solution of Schaefer et al. [SMW06]. This process results in a set of disconnected transformed cells (step 1 in Figure 14). The lattice connectivity is then restored by averaging lattice corner positions (step 2). The regularize phase is similar, except we compute optimal rigid transformations that align the source cells with the lattice obtained at the end of the push phase (step 3), before restoring connectivity (step 4). The regularize phase is repeated $N_r$ times, with greater values of $N_r$ increasing rigidity of the lattice (we use $N_r = 10$ by default). Repetitions are crucial to prevent the lattice from collapsing due to the naive initial nearest-neighbor correspondences.

An alternative solution would have been to use the block matching approach of Sýkora et al. [SDC09] applied to rasterized strokes, or their distance transform, similarly to Noris et al. [NSC∗11]. We have found in practice that a vectorial solution is much more efficient since it provides direct initial correspondences. As a result,
there is no need to iterate over the sequence of push and regularize phases as in the raster version: both phases are only applied once, and our vectorial registration achieves real-time performance. Thanks to its efficiency, vectorial registration may be used interactively in a semi-automatic fashion. For instance, our system allows the user to deform the lattice using the tools mentioned in Section 4.1 and to start the registration from this deformed configuration, hence allowing plastic deformations of the lattice. Registration might even be run continuously during the deformation so that the source strokes glide over the destination. Please see the supplemental demo video for a live illustration.

Figure 14: Our vector registration algorithm matches a source drawing (bottom strokes) with a destination drawing (top strokes) in two phases. During the push phase, lattice cells are (1) independently rigidly transformed to match the closest destination stroke points (red arrows), after which (2) the lattice connectivity is restored through averaging. During the regularize phase (repeated $N_r$ times), the shape of the source lattice is partly restored by (3) finding independent rigid cell transformations that match the current target lattice before (4) restoring connectivity.

Spacing & trajectories. The analytic spacing function $S$ may be controlled by adjusting the position of each individual tick. For intervals holding many frames, this may be tedious; hence we provide several interface tools to control multiple ticks at once. As shown in the supplemental demo video, we provide typical “ease-in/ease-out” controls, as well as options to place ticks on “halves”, as routinely done by 2D artists when creating spacing charts [Wil01]. To facilitate constrained trajectory editing, besides tangent manipulation, we have implemented a sketching technique that linearly transforms a curve drawn by the user such that its endpoints match the constrained positions at keyframes, and then fits a cubic Bézier curve with uniform arc-length parameterization to the result. This is also demonstrated in the supplemental demo video.

Cross-fading. Each interpolated frame is the result of cross-fading forward and backward transformed strokes. To limit ghosting artifacts, we apply cross-fading to control the thickness of forward strokes, using a function $c(t) : [0, 1] \to [0, 1]$:

\[
c(t) =
\begin{cases}
\left(1 - \left(2\left(t - \tfrac{1}{2}\right)\right)^{2}\right)^{2} & \text{if } t \in [0, \tfrac{1}{2}], \\
1 & \text{otherwise.}
\end{cases}
\]

For backward strokes we simply use $c(1 - t)$. Hence at any time $t$, there is always one set of strokes (forward or backward) that is displayed at full thickness. Note that cross-fading is also subject to spacing since $t = S(f)$ for a frame $f$. By default, we deactivate cross-fading for embeddings whose end keyframe does not store any stroke, so as to keep forward propagated strokes displayed over the full interval. Cross-fading is reactivated whenever the end keyframe is populated with strokes, or the embedding is tagged as fading in or out. Yet, we use linear cross-fading in the latter case to make strokes appear or disappear at the same rate as interpolation.

Performance. Our system is implemented in C++, using the Qt library for the GUI and an OpenGL Geometry Shader for the final stroke rendering. We use the sparse LU solver of Eigen [GJ∗10] to efficiently factorize and solve the linear system involved in ARAP interpolation (Appendix A). Table 1 reports the performance of our implementation recorded for the second keyframe of Figure 19 that uses the largest lattice from all our examples, and relies on the more demanding pose-to-pose workflow with semi-automatic (and thus interactive) registration. Interpolation times are also reported since we display the interpolated results during matching via an “onion skin” visualization. We obtain real-time performance in such a worst-case scenario, as well as in all our experiments.

7. Results

In this section and the supplemental video we present complex results obtained using our animation system. Note that the video does not merely show final inbetweened results, but first and foremost the ability of our system to enable a fast and flexible creative process to get to those results.

Figure 1 shows the particularly complex example of a special effect animation traced from Gilland’s book [Gil09], featuring multiple topological changes made possible by the use of several transient embeddings. Figure 13 demonstrates a first Articulated arm
animation exercise taken from Williams’ guide [Wil01], showcasing arc motions controlled through hierarchical trajectories. In Figures 15 through 18, we reproduce three other classic animation exercises. Details are provided in figure captions; we summarize the main demonstrated features below:

Ball drop (Figure 15): different impressions of weight are conveyed through variations in breakdowns and spacing;
Walk cycle (Figure 16): breakdown keyframes are used to produce drastically different animations, starting from the same pair of extreme key drawings;
Flour sack jump (Figure 17): plausible motion dynamics are obtained by adding anticipation and follow-through via a combination of breakdowns, deformations and spacing;
Head turn (Figure 18): 3D-like rotation is conveyed through sev-

Figure 15: Starting from the same first extreme key drawing (in gray), three ball drop animations are produced with a different sequence of breakdowns (in black) using a shift-and-trace workflow. In (a), the ball is slightly deformed in the direction of motion and acceleration is conveyed through an ease-in/ease-out spacing, with a high rebound conveying a light object. In (b), the ball is accelerated, with no deformation on contact and a low rebound, all effects conveying a heavier object. In (c), we apply a large “squash-and-stretch” deformation to the ball before and after it hits the ground, giving it an elastic and cartoony look-and-feel.

Figure 16: Starting from the same two extreme key drawings (in gray) traced over an animation exercise by Williams [Wil01] and using the same decomposition into four embeddings throughout (see inset image), we produce (a) a “normal” walk cycle by adding three breakdowns (in black) at, and around, the passing position, and (b) a very different animation with a single breakdown keyframe and a constrained trajectory (red curve) to make the character bow down inbetween steps.

8. Discussion

Our animation system relies on the concept of transient embeddings. In Section 8.1, we justify this choice, comparing it to an alternative stroke-level animation system. We then discuss practical limitations and future work in Sections 8.2 and 8.3.

8.1. Comparison with stroke-level inbetweening

Xing et al. [XWSY15] present an interactive 2D animation system that aims at assisting in both the drawing of new keyframes and the matching of drawings across keyframes. In their approach, the user provides guiding strokes that are used by the system to predict a new drawing based on past spatial and temporal repetitions. The user may then either directly reuse the suggested drawing or instead rely on guide strokes, either case yielding strokes matched between the current and next keyframes. In effect, this amounts to a different kind of workflow that works at the stroke level and couples drawing with matching, alternating between shift and trace steps.
Since strokes are only matched from one keyframe to the next,
eral embeddings, some being tagged as fading in or out.
the representation may be considered transient, like ours. However,
Note that even a simple ball drop animation (Figure 15) would unlike our method, the embedded deformation model [SSP07] is
require many trials-and-errors for novice 2D animators to produce carried by the strokes themselves, at a coarse sampling rate. To
a first result. Exploring alternate results afterwards would require a generate inbetween frames, the parameters of this model – an affine
significant amount of work (tens of minutes, perhaps hours) since transformation per sample – are interpolated in time, and the full-
all inbetween frames must be redrawn. In contrast, our system al- resolution strokes are reconstructed by spatial diffusion.
lows the exploration of different alternatives instantly while retain-
We reproduce one of their animations with our system in Fig-
ing a plausible rough hand-drawn animation style.
ure 19(a), achieving a very similar result using a pose-to-pose
The Head turn example (Figure 18) reveals the main limitation workflow. The comparison is not intended to show the superior-
of our approach (see Section 8.3): even with a special treatment ity of one workflow over another. Indeed, we believe that their
of the occluded embeddings (such as the ears), our system is not auto-completion algorithm – which is the core contribution of their
yet designed to let parts of embeddings appear or disappear be- work – could be adapted to compute lattice transformations in our
hind other embeddings, as it would require to interpolate topologi- system as well, providing an additional matching solution to artists.
cal changes. The method of Zhu et al. [ZPBK17] is able to produce Instead, we want to stress the implications of choosing to work at
such inbetween frames, but the user must specify correspondences the stroke level. First, coupling matching with drawing of strokes
for cuts, openings and boundaries on every key drawing, and com- has the undesired property that when strokes are erased, matching
patible embeddings must then be computed throughout the anima- is lost in the process. This is obviously not the case with our sys-
tion with a prohibitively expensive optimization. tem, since matching is done on embedding lattices. Second, it is
Figure 18: This head turn animation is generated from only three key drawings whose embeddings are color-coded – the last keyframe is
made of a single embedding (in black). To make the ears appear or disappear, we must draw and/or deform them occluded (blue and magenta
dashed lines) and tag them for fading in or out.
[Gil09] Gilland J.: Elemental Magic, Volume I: The Art of Special Effects Animation. Focal Press, 2009. 1, 10
[GJ∗10] Guennebaud G., Jacob B., et al.: Eigen v3. http://eigen.tuxfamily.org, 2010. 10
[IMH05] Igarashi T., Moscovich T., Hughes J. F.: As-rigid-as-possible shape manipulation. ACM Trans. Graph. 24, 3 (2005), 1134–1141. doi:10.1145/1073204.1073323. 3
[JSL22] Jiang J., Seah H. S., Liew H. Z.: Stroke-based drawing and inbetweening with boundary strokes. Computer Graphics Forum 41, 1 (2022), 257–269. doi:10.1111/cgf.14433. 2
[JT95] Johnston O., Thomas F.: The illusion of life: Disney animation. Disney Press, 1995. 1, 2, 3, 8
[KHS∗12] Kaji S., Hirose S., Sakata S., Mizoguchi Y., Anjyo K.: Mathematical Analysis on Affine Maps for 2D Shape Interpolation. In Eurographics/ACM SIGGRAPH Symposium on Computer Animation (2012). doi:10.2312/SCA/SCA12/071-076. 3, 15
[Kor02] Kort A.: Computer aided inbetweening. In Proceedings of the 2nd international symposium on Non-photorealistic animation and rendering (2002), ACM, pp. 125–132. doi:10.1145/508530.508552. 2, 3
[LCY∗11] Liu D., Chen Q., Yu J., Gu H., Tao D., Seah H. S.: Stroke correspondence construction using manifold learning. Computer Graphics Forum 30, 8 (2011), 2194–2207. doi:10.1111/j.1467-8659.2011.01969.x. 2
[LMY∗13] Liu X., Mao X., Yang X., Zhang L., Wong T.-T.: Stereoscopizing cel animations. ACM Trans. Graph. 32, 6 (2013). doi:10.1145/2508363.2508396. 2
[LZLS21] Li X., Zhang B., Liao J., Sander P.: Deep sketch-guided cartoon video inbetweening. IEEE Transactions on Visualization and Computer Graphics (2021). doi:10.1109/TVCG.2021.3049419. 3
[MFXM21] Miyauchi R., Fukusato T., Xie H., Miyata K.: Stroke correspondence by labeling closed areas. In 2021 Nicograph International (2021), IEEE Computer Society, pp. 34–41. doi:10.1109/NICOINT52941.2021.00014. 2
[MIT67] Miura T., Iwata J., Tsuda J.: An application of hybrid curve generation: cartoon animation by electronic computers. In AFIPS ’67 (Spring) (1967). 2
[MSG96] Madeira J. S., Stork A., Gross M. H.: An approach to computer-supported cartooning. The Visual Computer 12, 1 (1996), 1–17. doi:10.1007/BF01782215. 2
[NHA19] Narita R., Hirakawa K., Aizawa K.: Optical flow based line drawing frame interpolation using distance transform to support inbetweenings. In 2019 IEEE International Conference on Image Processing (2019), pp. 4200–4204. 3
[NSC∗11] Noris G., Sýkora D., Coros S., Whited B., Simmons M., Hornung A., Gross M., Sumner R. W.: Temporal noise control for sketchy animation. In Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Non-Photorealistic Animation and Rendering (2011), ACM, p. 93–98. doi:10.1145/2024676.2024691. 3, 9
[PSN20] Park S., Seo K., Noh J.: Neural crossbreed: Neural based image metamorphosis. ACM Trans. Graph. 39, 6 (2020). doi:10.1145/3414685.3417797. 3
[Qui10] Quilez I.: Inverse bilinear interpolation, 2010. URL: https://iquilezles.org/articles/ibilinear/. 15
[Ree81] Reeves W. T.: Inbetweening for computer animation utilizing moving point constraints. In Proceedings of the 8th Annual Conference on Computer Graphics and Interactive Techniques (1981), ACM, p. 263–269. doi:10.1145/800224.806814. 2, 3
[RID10] Rivers A., Igarashi T., Durand F.: 2.5d cartoon models. ACM Trans. Graph. 29, 4 (2010). doi:10.1145/1778765.1778796. 2
[SBv05] Sýkora D., Buriánek J., Žára J.: Colorization of black-and-white cartoons. Image and Vision Computing 23, 9 (2005), 767–782. doi:10.1016/j.imavis.2005.05.010. 2
[Sch90] Schneider P. J.: An Algorithm for Automatically Fitting Digitized Curves. Academic Press Professional, Inc., 1990, p. 612–626. 8
[SDC09] Sýkora D., Dingliana J., Collins S.: As-rigid-as-possible image registration for hand-drawn cartoon animations. In Proceedings of the 7th International Symposium on Non-Photorealistic Animation and Rendering (2009), ACM, p. 25–33. doi:10.1145/1572614.1572619. 3, 7, 9
[SGWM93] Sederberg T. W., Gao P., Wang G., Mu H.: 2-d shape blending: An intrinsic solution to the vertex path problem. In Proceedings of the 20th Annual Conference on Computer Graphics and Interactive Techniques (1993), ACM, p. 15–18. doi:10.1145/166117.166118. 3
[SMW06] Schaefer S., McPhail T., Warren J.: Image deformation using moving least squares. ACM Trans. Graph. 25 (2006), 533–540. doi:10.1145/1179352.1141920. 9
[SSP07] Sumner R. W., Schmid J., Pauly M.: Embedded deformation for shape manipulation. ACM Trans. Graph. 26, 3 (2007). doi:10.1145/1276377.1276478. 3, 11
[SZGP05] Sumner R. W., Zwicker M., Gotsman C., Popović J.: Mesh-based inverse kinematics. ACM Trans. Graph. 24, 3 (2005), 488–495. doi:10.1145/1073204.1073218. 3
[SZY∗21] Siyao L., Zhao S., Yu W., Sun W., Metaxas D., Loy C. C., Liu Z.: Deep animation video interpolation in the wild. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (2021), pp. 6583–6591. doi:10.1109/CVPR46437.2021.00652. 3
[Too] Toon Boom Animation Inc.: Toon boom harmony. URL: https://cacani.sg/. 2, 4, 7
[TVP] TVPaint Developpement: Tvpaint. URL: https://www.tvpaint.com. 4, 6
[Wil01] Williams R.: The animator’s survival kit. Faber and Faber, 2001. 1, 2, 3, 4, 7, 8, 10, 11
[WNS∗10] Whited B., Noris G., Simmons M., Sumner R. W., Gross M., Rossignac J.: Betweenit: An interactive tool for tight inbetweening. Computer Graphics Forum 29, 2 (2010), 605–614. doi:10.1111/j.1467-8659.2009.01630.x. 2, 3, 8, 12
[WXXC08] Wang Y., Xu K., Xiong Y., Cheng Z.-Q.: 2d shape deformation based on rigid square matching. Computer Animation and Virtual Worlds 19, 3-4 (2008), 411–420. doi:10.1002/cav.251. 3
[Xie95] Xie M.: Feature matching and affine transformation for 2d cell animation. The Visual Computer 11, 8 (1995), 419–428. doi:10.1007/BF02464332. 2
[XWSY15] Xing J., Wei L.-Y., Shiratori T., Yatani K.: Autocomplete hand-drawn animations. ACM Trans. Graph. 34, 6 (2015). doi:10.1145/2816795.2818079. 3, 11, 12, 13
[XZWB05] Xu D., Zhang H., Wang Q., Bao H.: Poisson shape interpolation. In Proceedings of the 2005 ACM symposium on Solid and physical modeling (2005), ACM, pp. 267–274. doi:10.1145/1060244.1060274. 3, 15
[Yag17] Yagi Y.: A filter based approach for inbetweening. CoRR abs/1706.03497 (2017). 3
[Yan18] Yang W.: Context-aware computer aided inbetweening. IEEE Transactions on Visualization and Computer Graphics 24, 2 (2018), 1049–1062. doi:10.1109/TVCG.2017.2657511. 2, 3
[YBS∗12] Yu J., Bian W., Song M., Cheng J., Tao D.: Graph based transductive learning for cartoon correspondence construction. Neurocomput. 79 (2012), 105–114. doi:10.1016/j.neucom.2011.10.003. 2
[YHY19] Yang W.-W., Hua J., Yao K.-Y.: Cr-morph: Controllable rigid morphing for 2d animation. Journal of Computer Science and Technology 34, 5 (2019), 1109–1122. doi:10.1007/s11390-019-1963-3. 3
[YSC∗18] Yang W., Seah H.-S., Chen Q., Liew H.-Z., Sýkora D.: Ftp-sc: Fuzzy topology preserving stroke correspondence. Computer Graphics Forum 37, 8 (2018), 125–135. doi:10.1111/cgf.13518. 2
[YVG20] Yan C., Vanderhaeghe D., Gingold Y.: A benchmark for rough sketch cleanup. ACM Trans. Graph. 39, 6 (2020). doi:10.1145/3414685.3417784. 2
[ZHF12] Zhang L., Huang H., Fu H.: Excol: An extract-and-complete layering approach to cartoon animation reusing. IEEE Transactions on Visualization and Computer Graphics 18, 7 (2012), 1156–1169. doi:10.1109/TVCG.2011.111. 2
[ZLWH16] Zhu H., Liu X., Wong T.-T., Heng P.-A.: Globally optimal toon tracking. ACM Trans. Graph. 35, 4 (2016). doi:10.1145/2897824.2925872. 2
[ZPBK17] Zhu Y., Popović J., Bridson R., Kaufman D. M.: Planar interpolation with extreme deformation, topology change and dynamics. ACM Trans. Graph. 36, 6 (2017). doi:10.1145/3130800.3130820. 3, 11

Appendix A: Controllable ARAP interpolation

The original ARAP formulations of Alexa et al. [ACOL00] and
Xu et al. [XZWB05] do not offer any control over motion trajectories.
Baxter et al. [BBA08] introduce such controls through linear
constraints thanks to their reformulation of the problem in terms
of normal equations. Kaji et al. [KHS∗12] present an even more
generic mathematical framework, but since we do not need such
a generalization, we choose the method of Baxter et al. [BBA08],
whose implementation is simpler and very efficient.

More precisely, to compute the interpolated positions of the lattice
corners at time t ∈ [0, 1], we use Equation 5 in their paper:

\[
V(t) = \begin{bmatrix} P^{\top} W P & C^{\top} \\ C & 0 \end{bmatrix}^{-1} \begin{bmatrix} P^{\top} W A(t) \\ D(t) \end{bmatrix},
\]

where the sparse matrix P encodes the triangulated lattice connectivity,
the matrix A(t) stores the target affine triangle deformations,
the diagonal matrix W specifies a weight per triangle (we
use its area), the matrix C expresses hard linear constraints defined
on lattice vertices, and the matrix D(t) stores the constrained driven
positions. When no constraint is provided by the user, we constrain
the mean position of the lattice to follow a linear and uniform
trajectory in order to have a unique solution. This implies setting
C = [1/N . . . 1/N], with N the number of lattice vertices, and setting
D(t) to the linearly interpolated position of the lattice barycenter.
Otherwise, each constrained point p maps to a row C_p in C with
four non-zero values, one for each barycentric coordinate relative
to its cell corners. Denoting {i, j, k, l} the indices of these corners,
enumerated in clockwise order starting from top-left, yields:

\[
C_p = \begin{bmatrix} \cdots & \overset{i}{(1-u)(1-v)} & \cdots & \overset{j}{u(1-v)} & \cdots & \overset{k}{uv} & \cdots & \overset{l}{(1-u)v} & \cdots \end{bmatrix},
\]

with (u, v) the coordinates of p in the quad cell obtained by inverse
bilinear interpolation [Qui10]. The corresponding row in D(t)
stores the position along the trajectory curve evaluated at t.
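
As an illustration of the constraint rows described above, the following minimal sketch builds one row C_p for a point inside a quad cell and checks that applying it to the vertex positions gives back the point. The function names, the toy vertex array and the use of Newton's method for the inverse bilinear step are assumptions made for this example only; [Qui10] derives a closed-form solution instead.

```python
import numpy as np

def bilinear(corners, u, v):
    """Point at (u, v) in a quad whose corners run clockwise from top-left."""
    a, b, c, d = corners
    return (1 - u) * (1 - v) * a + u * (1 - v) * b + u * v * c + (1 - u) * v * d

def inverse_bilinear(corners, p, iterations=20):
    """Recover (u, v) by Newton iterations (assumption: convex quad, p inside)."""
    a, b, c, d = corners
    u, v = 0.5, 0.5
    for _ in range(iterations):
        residual = p - bilinear(corners, u, v)
        d_du = (1 - v) * (b - a) + v * (c - d)  # partial derivative w.r.t. u
        d_dv = (1 - u) * (d - a) + u * (c - b)  # partial derivative w.r.t. v
        step = np.linalg.solve(np.column_stack([d_du, d_dv]), residual)
        u, v = u + step[0], v + step[1]
    return u, v

def constraint_row(num_vertices, indices, u, v):
    """Row C_p: four bilinear weights at columns i, j, k, l, zeros elsewhere."""
    i, j, k, l = indices
    row = np.zeros(num_vertices)
    row[[i, j, k, l]] = [(1 - u) * (1 - v), u * (1 - v), u * v, (1 - u) * v]
    return row

# Hypothetical lattice with five vertices; the cell uses vertices 1..4
# (top-left, top-right, bottom-right, bottom-left, with y pointing down).
V = np.array([[9.0, 9.0], [0.0, 0.0], [2.0, 0.0], [3.0, 2.0], [0.0, 1.0]])
indices = (1, 2, 3, 4)
p = np.array([0.78, 0.78])

u, v = inverse_bilinear(V[list(indices)], p)
row = constraint_row(len(V), indices, u, v)
```

Since the four weights sum to one, `row @ V` is a convex combination of the cell corners and reproduces `p`, which is what a hard constraint C V(t) = D(t) relies on.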
Note that matrix inversion, which is the most computationally
expensive part of the method, needs to be performed in only two
cases: when a lattice is created or its topology changed, since P is
modified; or when a linear constraint is added to C. The matrix A(t)
is updated whenever matching is modified, whereas D(t) is updated
whenever a constrained trajectory is edited.
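
The update policy above can be sketched as follows: the block matrix built from P, W and C is inverted once and reused each time A(t) or D(t) changes. This is only a toy illustration under stated assumptions. The class name `LatticeSolver` is hypothetical, dense NumPy arrays stand in for sparse matrices, P is a simple difference operator on a three-vertex chain rather than a triangulated lattice, and the top-right block is taken as the transpose of C so that the block dimensions agree.

```python
import numpy as np

class LatticeSolver:
    """Caches the inverse of the block matrix built from P, W and C."""

    def __init__(self, P, W, C):
        self.P, self.W = P, W
        self.num_vertices = P.shape[1]
        m = C.shape[0]
        # Top-right block is C transposed so that block shapes are compatible.
        system = np.block([[P.T @ W @ P, C.T],
                           [C, np.zeros((m, m))]])
        # Expensive step: redone only when P or C changes.
        self.system_inverse = np.linalg.inv(system)

    def solve(self, A, D):
        """Cheap step: reused whenever A(t) or D(t) is updated."""
        rhs = np.vstack([self.P.T @ self.W @ A, D])
        # Keep vertex positions, drop the trailing Lagrange multipliers.
        return (self.system_inverse @ rhs)[:self.num_vertices]

# Toy data (assumption): P takes differences between consecutive vertices.
P = np.array([[-1.0, 1.0, 0.0],
              [0.0, -1.0, 1.0]])
W = np.diag([1.0, 2.0])
N = P.shape[1]
C = np.full((1, N), 1.0 / N)        # default constraint on the mean position
A = np.array([[1.0, 0.0],
              [0.0, 1.0]])          # target differences

solver = LatticeSolver(P, W, C)
V_a = solver.solve(A, np.array([[5.0, 5.0]]))  # barycenter at (5, 5)
V_b = solver.solve(A, np.array([[6.0, 5.0]]))  # barycenter moved, no re-inversion
```

Moving the constrained barycenter only changes the right-hand side, so the second solve translates the whole chain without touching the cached inverse. In a production setting, caching a sparse factorization rather than an explicit inverse would be the more usual choice.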
With this formulation, the inverse lattice transformation
T(t)^{-1} = V(1 − t) − V_1 might result in a non-symmetric behavior,
which is obviously problematic for stroke cross-fading. We thus
use the symmetric formulation of Baxter et al. [BBA08], which is
slightly more complex but equally fast to compute. We refer the
interested reader to their paper for details.