

Year: 2012

Video and sound

In "Listen", the audience is encouraged to constantly shift between vision and audition in an attempt to decipher an all-important but obscured spoken message.

Listen is a 3-minute audiovisual piece filmed in a single shot, presenting a close-up of a speaking mouth, which occupies most of the screen-space.


In Listen, the speaker promises to reveal intimate secrets. This intimacy is accentuated by the closeness of the mouth and by the unprocessed voice recording, and is at the same time contrasted with the anonymity of the speaker’s face, which lies beyond reach.


As the speaker begins her revelations, a constant, disruptive, extraneous sound comes in and obliterates her message, making her voice barely audible and her words unintelligible, while the visuals remain unaltered.

At irregular intervals the obstructive sound relents and fragments of the spoken sentences come to the fore and become audible, while the image simultaneously blurs.

Towards the end of the piece, both visuals and speech are disrupted, by blurring and by the extraneous sound respectively. The mouth and the spoken message become clearly defined again only at the very end, when it becomes apparent that the revelations have ended and the speaker thanks us for listening.


As Michel Chion points out in his book Audio-Vision (1994), human beings (and the filmic medium) are “vococentric” and “verbocentric”, so when listening to a sound film, our ears tend to seek and notice voice first, and specifically messages conveyed by any words being uttered.


In Listen, disruption is introduced to put the message just out of reach. Because the voice can be faintly heard in the barrage of extraneous sound, the audience’s vococentric/verbocentric ears naturally seek it and strain, but fail to pick out the all-important message being conveyed by the obstructed words.

Speech is visible in the movements of the speaking mouth but is barely audible and unintelligible, so our eyes try to compensate, filling in by lip-reading.


When fragments of spoken sentences are occasionally brought to the fore to titillate our ears and maintain our interest in the unreachable conversation, the image blurs, forcing the audience to readjust from focusing on the lips’ movements to purely listening. The repetition of this coupling of blurred visuals with decipherable voice is then itself disrupted when both the image of the mouth and its speaking are simultaneously interfered with, forcing the audience to readjust and strain once more, before the film ends.


This continuous disruption of audibility and visibility encourages constant readjustment in the audience. In switching between leaning on one sense and then the other in an attempt to get to the message, they take neither for granted, and are led to notice and reflect more consciously on the relationship between sound and visuals and on their degree of interdependency.
