Research Blog: Lipsyncing

While working on my specialization project, it became apparent that I, in fact, know little about lipsyncing. To fix this, I did some research.

First of all, dialogue animation breaks down into roughly 4 stages, thankfully laid out simply by DJ Nicke on YouTube:

  • Foundation
  • Structure
  • Details
  • Polish

Let’s break these down a little more.


Foundation is knowing your audio: how fast it is, what’s being said, what emotions are being portrayed, and how the characters are delivering it. Knowing your sh*t, and knowing what your characters are going to do about it. Simple. Know your stuff back to front, forwards, backwards, inside and out.


So lipsyncing is fairly straightforward – make the mouth move in time with the audio so it seems like the character is speaking. Open, close, open, close – simple, right? Well, that may be, but that’s only the beginning. Speaking – for a human at least – is not simply opening and closing one’s mouth – that’s more of a Muppet thing.


BUUUUTTTT it’s the starting point. Take your dialogue and speak it yourself. Determine when your mouth is open and when it’s closed: on vowel sounds your mouth is usually open to some extent, and consonants are when your mouth ‘closes’. If you can’t determine this by just saying it aloud or looking in a mirror, this video has a good technique for working out when/where to place your open/close frames (and incidentally is where I learnt most of this outside of just absorbing it from watching stuff).

But one more thing! Even though you can hear words starting and stopping, you don’t, or rather shouldn’t, need to animate every syllable. Humans are humans, and we don’t exactly accentuate each syllable of what we speak. In fact, depending on the timing of the dialogue, we might even merge many words into one blur, only closing our mouth one or two times during the whole thing. Animate every syllable and the character’s face might look like a dog flap on a windy day.
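
If it helps to see the open/close pass written down, here’s a rough Python sketch of the idea. This is just an illustration, not how any animation package actually does it: the function names, the (frame, sound) timings and the min_hold threshold are all made up for the example, and in practice you’d still judge this by eye and ear.

```python
VOWELS = set("aeiou")


def rough_keys(timed_sounds):
    """First pass: add a key each time the mouth flips between open and closed.

    timed_sounds is a list of (frame, sound) pairs sorted by frame.
    Vowel sounds count as open, everything else as closed.
    """
    keys = []
    for frame, sound in timed_sounds:
        pose = "open" if sound[0].lower() in VOWELS else "closed"
        if not keys or keys[-1][1] != pose:
            keys.append((frame, pose))
    return keys


def blur_short_closes(keys, min_hold=3):
    """Second pass: drop closes held for fewer than min_hold frames.

    The first and last keys are always kept, so the line still starts
    and ends on a clear pose. min_hold is an arbitrary example value.
    """
    kept = []
    for i, (frame, pose) in enumerate(keys):
        is_last = i + 1 == len(keys)
        hold = None if is_last else keys[i + 1][0] - frame
        if pose == "closed" and kept and not is_last and hold < min_hold:
            continue  # too brief to read as a real close
        if kept and kept[-1][1] == pose:
            continue  # same pose as the key before it, so nothing to add
        kept.append((frame, pose))
    return kept


# Hypothetical timings for "gonna go home", as (frame, sound) pairs.
timed = [(0, "g"), (2, "o"), (5, "n"), (6, "a"), (9, "g"),
         (11, "o"), (15, "h"), (16, "o"), (20, "m")]

print(rough_keys(timed))                     # every single flip
print(blur_short_closes(rough_keys(timed)))  # only the closes that read
```

With these made-up timings the first pass flips the mouth nine times, while the second keeps just the opening close, one long open, and the final close on the "m".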

Once you have the open/close frames done, you can start thinking about the syllables more directly. Again, watching yourself say the audio helps you work out how the mouth needs to move to make it look like the character is saying things. Here’s a pair of handy dandy reference charts, for both 2D and 3D formats:

As you can see here, wide vowels like A, O, U, and to an extent E and I are said louder and the mouths are more open. The above video has another good way of determining it bit by bit as well, by the way. Personally, this part was made easier not just because I had a mirror beside me at all times, but because I had studied a couple of other languages that work in syllables, so I kiiinda also went through those syllables just to get a wider range of mouth movements covered (particularly ones involving Wh- sounds). It made it a bit easier (for me at least) to visualise what more complicated sounds look like.

Going through the timeline where you placed the open/closed frames and adding in bits where the mouth becomes narrower (for O, U and W sounds) or wider (for A, E and I sounds) adds another level of depth. For hard sounds like SS, X, Z and C, sometimes the teeth come together but the lips don’t.
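
The narrower/wider/teeth split above can be jotted down as a tiny lookup, which is roughly what I keep in my head while scrubbing the timeline. Again, a sketch only: the shape names and the mouth_shape function are invented for this example, the letter groupings are just the ones mentioned in this post, and anything not listed falls back to a plain close from the earlier open/close pass.

```python
# Letter groupings taken from the paragraph above; the names are made up.
NARROW = set("ouw")   # lips pull in
WIDE = set("aei")     # lips stretch out
TEETH = set("sxzc")   # teeth come together, lips stay apart


def mouth_shape(sound):
    """Return a rough mouth shape for a sound, judged by its first letter."""
    first = sound[0].lower()
    if first in NARROW:
        return "narrow"
    if first in WIDE:
        return "wide"
    if first in TEETH:
        return "teeth"
    return "closed"  # fallback: treat it as a plain close


# Hypothetical (frame, sound) timings for the word "easy".
timed = [(0, "ea"), (4, "s"), (7, "y")]
shapes = [(frame, mouth_shape(sound)) for frame, sound in timed]
print(shapes)
```

Going by the first letter is obviously crude (the "y" in "easy" sounds like an E but lands on the fallback here), which is exactly why the mirror still wins.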


So we have the opening, and the vowel movements (heh.). Time to add some detail. If it hasn’t been done already (it had been done prior in my case, but time to add more), now is the time to add emotion to the dialogue: shifting eyebrows, the direction of the eyes, eyelids if you so desire. Even moving the corners of the mouth to make the character happy/unhappy. So, basic emotive movements.

If anything, a good example of this would maybe be The Lego Movie – a lot of facial animation, but not so much movement in the head area (mostly because of the lack of neck for minifigs, but that’s beside the point).


Now if you’re viewing the character from far off, or working on a low budget production, you’d be done. If not, here we can add even more personality to the character: smaller head tilts, head shakes/bobbing, more exaggerated squash/stretch, the smaller details that elaborate on the existing emotion.

For example, compare the head movements of Lord Farquaad from the first Shrek with those of King Candy from Wreck-It Ralph. (You could also compare the Lego scene above in here, but that might be a stretch.)




