Music-related memory

Musical memory refers to the cognitive capacity for recalling information associated with music, including melodies and sequences of tones or pitches. Scholars have identified distinctions between linguistic and musical memory, prompting the hypothesis that musical memory is encoded differently from language and may constitute a distinct element of the phonological loop. The term "phonological loop" is problematic in this context, however, because it implies verbal input, whereas music is fundamentally nonverbal.

Neurological Bases

In alignment with the principle of hemispheric lateralization, research indicates that the brain's left and right hemispheres contribute distinctly to various facets of musical memory. For instance, Wilson & Saling (2008) observed hemispheric disparities in the roles of the left and right medial temporal lobes in melodic memory by analyzing the learning patterns of patients with damage to these regions. Similarly, Ayotte, Peretz, Rousseau, Bard & Bojanowski (2000) reported that patients who underwent surgical clipping of a left middle cerebral artery aneurysm exhibited more significant deficits in musical long-term memory tasks than those who underwent the same procedure on the right middle cerebral artery. Consequently, they posited that the left hemisphere is predominantly crucial for the representation of music in long-term memory, while the right hemisphere primarily facilitates access to this memory.

Samson and Zatorre (1991) investigated patients with severe epilepsy who underwent surgical intervention, alongside control participants. Their findings revealed impaired recognition memory for text, whether sung or spoken, following a left temporal lobectomy but not a right one. Conversely, melody recognition, when a tune was sung with novel lyrics (as opposed to the original encoding), was compromised after either a right or left temporal lobectomy. Furthermore, impairments in melody recognition without lyrics were observed after a right temporal lobectomy but not a left one. These results imply the existence of dual memory codes for musical memory, wherein the verbal code engages left temporal lobe structures, and the melodic code relies on the encoding processes involved.

Semantic Versus Episodic Memory

Platel (2005) delineated musical semantic memory as the recall of musical compositions independent of their temporal or spatial learning contexts, while musical episodic memory was defined as the recall of compositions along with the specific circumstances of their acquisition. Comparative analysis of semantic and episodic components of musical memory revealed two distinct neural activation patterns. After controlling for initial auditory processing, working memory, and mental imagery, Platel observed that the retrieval of semantic musical memory engaged activation in the right inferior and middle frontal gyri, the superior and inferior right temporal gyri, the right anterior cingulate gyrus, and the parietal lobe region. Additionally, some activation was noted in the middle and inferior frontal gyri of the left hemisphere. Conversely, the retrieval of episodic musical memory, encompassing music-evoked autobiographical memory, elicited bilateral activation in the middle and superior frontal gyri and the precuneus. Despite the presence of bilateral activation, a clear dominance in the right hemisphere was evident. These findings indicate the independence of episodic and semantic musical memory. Furthermore, the Levitin effect illustrates precise semantic memory for musical pitch and tempo among listeners, even in the absence of formal musical training or episodic recall of the initial learning environment.

Individual Differences

Sex

Gaab, Keenan & Schlaug (2003) identified sex-based differences in the neural processing and subsequent memory for pitch, utilizing fMRI. Specifically, males exhibited more lateralized activity within the anterior and posterior perisylvian regions, with a pronounced activation in the left hemisphere. Males also demonstrated greater cerebellar activation compared to females. In contrast, females displayed increased activation in the posterior cingulate and retrosplenial cortex relative to males. Despite these neurological distinctions, the study concluded that behavioral performance did not significantly vary between males and females.

Handedness

Deutsch's research indicates that left-handed individuals with mixed hand preference demonstrate superior performance over right-handed individuals in short-term pitch memory tasks. This observed advantage may stem from a more distributed storage of information across both cerebral hemispheres in the mixed-handed left-handed cohort.

Atypical Cases

Expertise

Experts possess extensive experience acquired through practical application and formal education within their specialized domains. Musical experts, similar to professionals in other fields demanding substantial memorization, employ strategies such as chunking, systematic organization, and consistent practice. For instance, musicians may structure notes into scales or devise hierarchical retrieval systems to enhance recall from long-term memory. A case study by Chaffin & Imreh (2002) on an accomplished pianist revealed the development of a retrieval schema designed to ensure effortless musical recall. This expert utilized a combination of auditory, motor, and conceptual memory. Auditory and motor representations collectively contribute to performance automaticity, while conceptual memory primarily serves to guide corrections when deviations occur. Further research by Chaffin and Logan (2006) on concert soloists not only affirmed the presence of hierarchical memory organization but also proposed the use of a mental map of the musical piece, enabling tracking of its progression. Chaffin and Logan (2006) additionally demonstrated the existence of performance cues that monitor and adapt automatic performance elements. They categorized these into basic, interpretive, and expressive performance cues. Basic cues oversee technical attributes, interpretive cues manage modifications across various musical facets, and expressive cues regulate the emotional content of the music. These cues are cultivated through focused attention on specific aspects during practice sessions.

Savantism

A savant is characterized as an individual possessing a low IQ yet exhibiting exceptional proficiency in a specific domain. Sloboda, Hermelin, and O'Connor (1985) documented the case of patient NP, who could commit highly intricate musical compositions to memory after hearing them only three or four times. NP's capabilities surpassed those of experts with significantly higher IQs. Nevertheless, his performance on other memory assessments was typical for someone within his IQ bracket. This case led them to propose that a high IQ is not a prerequisite for musical memorization skill, implying the influence of other contributing factors. Miller (1987) similarly investigated a 7-year-old child identified as a musical savant. This child demonstrated superior short-term musical memory, which was observed to be affected by the degree of attention paid to the music's complexity, key signature, and recurring patterns within a sequence. Miller (1987) posited that a savant's aptitude stems from encoding information into pre-existing, meaningful long-term memory structures.

Child prodigies

Ruthsatz & Detterman (2003) define a prodigy as a child (under 10 years old) who demonstrates exceptional proficiency in "culturally relevant" tasks, often surpassing the typical performance levels of adult professionals in the same field. They presented a case study of a boy who, by age 6, had already released two CDs (featuring his singing in two languages) and mastered multiple musical instruments.

The case study also reported that the child:

performed numerous concerts
appeared twice on national television
was featured in two films
performed highly expressive musical pieces
came from a family with no notable musical aptitude
never received formal instruction, instead learning by listening to others' compositions and improvising
had an IQ of 132, two standard deviations above the mean
had an exceptional memory across all cognitive domains

Amusia

Amusia, commonly referred to as tone deafness, is characterized by primary deficits in pitch processing. Individuals with amusia also experience difficulties with musical memory, vocal performance, and rhythmic timing. Furthermore, amusics are unable to differentiate melodies from their underlying rhythm or beat. Conversely, amusics exhibit normal recognition of other auditory stimuli, such as lyrics, voices, and environmental sounds. This suggests that amusia does not result from a lack of exposure to music, from hearing loss, or from general cognitive impairment.

Effects on non-musical memory

Music has been demonstrated to enhance memory recall in various contexts. For instance, a study investigating music's influence on memory involved pairing visual cues (filmed events) with accompanying background music. Subsequently, participants who initially struggled to recall scene details were presented with the background music as a retrieval cue, leading to the recovery of previously inaccessible scene information.

Additional research indicates that setting text to music improves memory for it: words presented in song were recalled significantly better than words presented as speech. This observation aligns with earlier findings that advertising jingles, which integrate words with music, facilitate superior recall compared to words presented in isolation or spoken words with background music. Memory for brand-slogan pairings was also improved when advertisements incorporated lyrics and music, rather than spoken words accompanied by background music.

Musical training has also been demonstrated to enhance verbal memory in both children and adults. A study compared immediate and 15-minute delayed word recall between musically trained individuals and those without a musical background. Participants were orally presented with word lists three times before attempting to recall as many words as possible. Even after controlling for intelligence, musically trained participants consistently outperformed their non-musically trained counterparts. The researchers propose that musical training improves verbal memory processing through neuroanatomical alterations in the left temporal lobe, a region critical for verbal memory, a hypothesis supported by prior investigations. Magnetic Resonance Imaging (MRI) studies have revealed that this specific brain region is larger in musicians than in non-musicians, potentially indicating changes in cortical organization that contribute to enhanced cognitive function.

Anecdotal evidence from an amnesic patient, identified as CH, who experienced declarative memory deficits, indicated a preserved capacity for recalling song titles. CH's specialized knowledge of accordion music enabled researchers to investigate verbal and musical associations. When provided with song titles, CH consistently played the correct song with a 100% success rate. Furthermore, upon hearing a melody, CH accurately selected the corresponding title from multiple distractors 90% of the time.

Interference

Interference is defined as the phenomenon where information within short-term memory impedes or obstructs the retrieval of other data. Some researchers hypothesize that interference in pitch memory stems from a general, limited capacity of the short-term memory system, irrespective of the information type it holds. Conversely, Deutsch's research has demonstrated that pitch memory is susceptible to interference from the presentation of other pitches, but not from spoken numbers. Subsequent studies have further revealed that short-term memory for a tone's pitch is subject to highly specific effects generated by other tones, with these effects contingent upon the pitch relationship between the interfering tones and the target tone. This suggests that pitch memory operates as a highly organized system specifically dedicated to retaining pitch information.

Any extraneous information present during comprehension can displace target information from short-term memory. Consequently, an individual's capacity for understanding and recall may be impaired if studying occurs concurrently with television or radio exposure.

Although research on music's impact on memory has yielded inconsistent findings, evidence indicates that music can interfere with various memory tasks. Novel situations necessitate new configurations of cognitive processing, which subsequently directs conscious attention toward unfamiliar situational elements. Thus, the intensity of musical presentation, alongside other musical components, can divert an individual from typical responses by prompting focus on the musical information. Both attention and recall are demonstrably impaired by the presence of distractions. Wolfe (1983) advises educators and therapists to recognize the potential for environments featuring simultaneous sounds from multiple sources (both musical and non-musical) to distract and impede student learning.

Introversion and Extroversion

Campbell and Hawley (1982) found evidence that introverts and extroverts regulate their arousal differently. In a library setting, extroverts tended to select study environments characterized by activity and stimulation, whereas introverts preferred tranquil, isolated spaces. Subsequently, Adrian Furnham and Anna Bradley observed that introverts, when exposed to music during two cognitive tasks (prose recall and reading comprehension), exhibited significantly poorer performance on a memory recall assessment compared to extroverts under identical musical conditions. Conversely, in the absence of musical stimuli during these tasks, both introverts and extroverts demonstrated comparable performance levels.

Hemispheric Interference

Contemporary studies indicate that the brain's right hemisphere processes melody holistically, aligning with Gestalt Psychology principles, while the left hemisphere analyzes melodic segments in a more detailed manner, akin to its feature-detection capabilities in the visual field. For example, Regalski (1977) illustrated that upon hearing the melody of the well-known carol "Silent Night," the right hemisphere perceives it as "Ah, yes, Silent Night," whereas the left hemisphere processes it as "two sequences: the first a literal repetition, the second a repetition at different pitch levels—ah, yes, Silent Night by Franz Gruber, typical pastorate folk style." Typically, brain function is optimized when each hemisphere executes its specialized role during task or problem resolution, as the two hemispheres are largely complementary. Nevertheless, instances can occur where these two modes conflict, leading to one hemisphere impeding the function of the other.

Testing

Absolute Pitch

Absolute pitch (AP) refers to the capacity to generate or identify specific musical pitches without relying on an external reference. Individuals possessing AP have internalized pitch benchmarks, enabling them to sustain consistent pitch representations within long-term memory. AP is considered an uncommon and somewhat enigmatic aptitude, observed in approximately 1 in 10,000 individuals. A standard methodology for assessing AP involves instructing participants to close their eyes and mentally envision a particular song. They are then prompted to reproduce the song's tones, starting at any point they choose, through singing, humming, or whistling. The participant's vocalizations are subsequently recorded digitally. Finally, these productions are compared against the original tones performed by the artists. Discrepancies are quantified as semitone deviations from the accurate pitch. However, this particular assessment evaluates implicit absolute pitch rather than definitively establishing the presence of true absolute pitch. Regarding true absolute pitch, Deutsch and colleagues have demonstrated that music conservatory students who speak tonal languages exhibit a significantly higher prevalence of absolute pitch compared to speakers of non-tonal languages, such as English.
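The semitone scoring described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the analysis code used in any study: the function name `semitone_deviation` and the example frequencies are hypothetical, pitches are assumed to have already been measured as fundamental frequencies in hertz, and equal temperament (12 semitones per octave) is assumed.

```python
import math


def semitone_deviation(produced_hz: float, reference_hz: float) -> float:
    """Signed distance, in equal-tempered semitones, from the reference
    pitch (the original recording) to the participant's produced pitch.
    Positive values mean the participant was sharp, negative values flat."""
    if produced_hz <= 0 or reference_hz <= 0:
        raise ValueError("frequencies must be positive")
    # One octave is a frequency ratio of 2 and spans 12 semitones.
    return 12.0 * math.log2(produced_hz / reference_hz)


# Hypothetical trial: the recording starts on 440 Hz and the
# participant's first sung tone is measured at 466.16 Hz.
error = semitone_deviation(466.16, 440.0)
nearest_semitone_error = round(error)  # deviation rounded to whole semitones
```

A deviation near zero across trials would indicate that the participant's long-term memory preserved the absolute pitch of the recording, which is the implicit ability this procedure measures.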

Distorted Tunes Test

The capacity to identify incorrect musical pitch is frequently assessed using the Distorted Tunes Test (DTT). Developed in the 1940s, the DTT was initially employed in extensive studies within the British population. The DTT quantifies musical pitch recognition on an ordinal scale, with scores representing the count of accurately identified tunes. Specifically, the DTT assesses participants' proficiency in determining whether straightforward popular melodies contain inaccurately pitched notes. This methodology has been utilized by researchers to explore the genetic underpinnings of musical pitch recognition in both monozygotic and dizygotic twin cohorts. Drayna, Manichaikul, Lange, Snieder, and Spector (2001) concluded that variability in musical pitch recognition largely stems from highly heritable distinctions in auditory functions not detectable by conventional audiological assessments. Consequently, the DTT methodology offers potential advantages for advancing comparable research investigations.
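The scoring rule described above, a count of correctly judged tunes, can be sketched as follows. This is a hedged illustration only: the `Trial` record, its field names, and the four example trials are hypothetical, and the actual number and selection of tunes in the DTT are not specified here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Trial:
    tune_is_distorted: bool  # ground truth: the tune contains wrong-pitched notes
    judged_distorted: bool   # the participant's judgment of the same tune


def dtt_score(trials: Iterable[Trial]) -> int:
    """Number of tunes the participant classified correctly."""
    return sum(t.tune_is_distorted == t.judged_distorted for t in trials)


# Hypothetical responses to four tunes; the third judgment is wrong.
responses = [
    Trial(tune_is_distorted=True, judged_distorted=True),
    Trial(tune_is_distorted=False, judged_distorted=False),
    Trial(tune_is_distorted=True, judged_distorted=False),
    Trial(tune_is_distorted=False, judged_distorted=False),
]
score = dtt_score(responses)
```

Because the score is a simple count, it orders participants from worse to better pitch recognition without implying that equal score differences reflect equal differences in ability, which is what makes the scale ordinal.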

In Infants

A specific testing procedure has been employed to evaluate infants' recall of familiar, intricate musical compositions and their preferences for timbre and tempo. This methodology has revealed that infants not only exhibit prolonged attention to familiar music compared to unfamiliar pieces but also retain the tempo and timbre of familiarized melodies over extended durations. Evidence for this retention is provided by observations that altering the tempo or timbre during testing eliminates an infant's preference for the previously familiarized melody. Consequently, this suggests that infants' long-term memory representations encompass not merely abstract musical structures but also specific surface or performance characteristics. The testing procedure comprises three distinct phases:

**Familiarization:** The chosen musical composition is provided to parents or caregivers on a compact disc. Instructions mandate that parents and caregivers play the musical piece three times daily, specifically when the infant is in a quiet and alert state, and the home environment is tranquil.
**Retention:** Immediately following the familiarization phase, compact discs are retrieved from parents or caregivers. This collection ensures that no further exposure to the familiarized musical piece occurs throughout the subsequent two-week retention period.
**Test:** The final phase involves laboratory testing of infants using the Headturn-preference procedure, a behavioral data-collection methodology designed to quantify preferences for specific auditory stimuli. This procedure operates on the premise that an infant will orient its head towards a preferred stimulus. The test is conducted within a specialized booth, with the infant positioned on the lap of its mother. Lights are situated on both sides of the infant, and a trial commences when the infant fixates straight ahead. Both the mother and the experimenter are required to wear tight-fitting earphones, which deliver masking music throughout the entire procedure. This measure is implemented to prevent any potential bias in the infant's response from either the mother or the experimenter. During each trial, one sidelight flashes, prompting the infant to direct its gaze towards it. Upon the infant turning its head and looking at the light, the auditory stimulus is initiated. The stimulus continues until its completion or until the infant averts its gaze. If the infant turns away from the sound source for a minimum of two seconds, both the sound and light deactivate, concluding the trial. A subsequent trial begins once the infant re-engages with the central panel.
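The trial-ending rule in the test phase can be sketched in Python. This is an illustrative simplification, not the laboratory software: the function name, the representation of gaze as alternating `(duration_in_seconds, is_looking)` segments beginning at stimulus onset, and the example values are all assumptions.

```python
LOOK_AWAY_LIMIT_S = 2.0  # a look-away of at least this long ends the trial


def listening_time(segments, stimulus_length_s):
    """Total time the infant spent looking while the stimulus played.

    `segments` is a list of (duration_s, is_looking) pairs that alternate
    between looking and looking away, starting when the sound begins.
    The trial ends when the stimulus finishes or when a single look-away
    lasts LOOK_AWAY_LIMIT_S or longer.
    """
    elapsed = 0.0
    looking_total = 0.0
    for duration_s, is_looking in segments:
        remaining = stimulus_length_s - elapsed
        if remaining <= 0:
            break  # the stimulus has played to completion
        if not is_looking and duration_s >= LOOK_AWAY_LIMIT_S:
            break  # sound and light switch off; the trial is over
        span = min(duration_s, remaining)
        if is_looking:
            looking_total += span
        elapsed += span
    return looking_total


# Hypothetical trial with a 30-second stimulus: a brief glance away is
# tolerated, but the 3-second look-away ends the trial.
trial = [(5.0, True), (1.0, False), (4.0, True), (3.0, False), (6.0, True)]
seconds_listened = listening_time(trial, stimulus_length_s=30.0)
```

Under these assumptions, a preference would show up as a longer average listening time across trials that play the familiarized piece than across trials that play an unfamiliar one.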

Lyrical versus Instrumental Memory

A significant number of students engage with music during study sessions, often asserting that this practice helps prevent drowsiness and sustains their arousal for academic tasks. Some even posit that background music enhances work performance. Conversely, Salame and Baddeley (1989) demonstrated that both vocal and instrumental music negatively impacted linguistic memory performance. They attributed this interference to task-irrelevant phonological information consuming working memory resources. This disruption is explicable by the capacity of music's linguistic elements to occupy the phonological loop, analogous to the processing of speech. Further evidence supports this, as vocal music is generally perceived to interfere more significantly with memory than instrumental music or natural soundscapes. Rolla (1993) posited that lyrics, as a form of language, generate imagery that facilitates the interpretation of experience in the communicative process. Contemporary research aligns with this perspective, suggesting that the experiential sharing through language in song can convey feelings and moods more directly than either language itself or instrumental music alone. Additionally, vocal music tends to influence emotion and mood more rapidly than instrumental music. Nevertheless, Fogelson (1973) reported that instrumental music also hindered children's performance on reading comprehension assessments.

Development

Neural structures develop and increase in complexity through experiential learning. For instance, an early developmental preference for consonance (the harmonious agreement of musical components) over dissonance (an unstable tonal combination) has been observed. Investigations indicate that this preference arises from exposure to structured auditory stimuli and the maturation of the basilar membrane and auditory nerve, which are among the earliest-developing structures of the auditory system. Auditory stimuli elicit measurable brain responses known as event-related potentials (ERPs), which directly reflect cognitive processing or perception. Distinct ERP patterns have been identified in normally developing infants between two and six months of age. Infants aged four months and older exhibit faster, more negative ERPs. Conversely, newborns and infants up to four months display slower, unsynchronized, and positive ERPs. Trainor et al. (2003) posited that these findings suggest responses in infants under four months originate from subcortical auditory structures, while those in older infants typically stem from higher cortical regions.

Relative and Absolute Pitch

Two primary methods exist for encoding and recalling musical information. The first, termed relative pitch, denotes an individual's capacity to discern the intervallic relationships between successive tones. Consequently, musical pieces are apprehended as a continuous sequence of intervals. Additionally, some individuals employ absolute pitch, also called perfect pitch, which is the faculty to identify or reproduce a specific tone without relying on an external reference; individuals with this ability can accurately name or sing a given note upon hearing or seeing it. Some scholars consider relative pitch the more sophisticated of these processes, as it facilitates rapid recognition irrespective of absolute pitch, timbre, or quality, and can elicit physiological responses when a melody deviates from learned relative pitch patterns. The development of relative pitch exhibits cultural variability. Trehub and Schellenberg (2008) observed that Japanese children aged five and six years demonstrated significantly superior performance on relative pitch tasks compared to their Canadian counterparts of the same age. They posited that this disparity might stem from greater exposure to pitch accent in Japanese language and culture among Japanese children, contrasting with the predominantly stress-based linguistic environment experienced by Canadian children.

Plasticity in Musical Development

The early acquisition of relative pitch facilitates an accelerated mastery of musical scales and intervals. Musical training enhances the attentional and executive functions crucial for interpreting and efficiently encoding musical information. These processes progressively stabilize in conjunction with brain plasticity. Nevertheless, this phenomenon presents a degree of circularity: increased learning leads to greater process stability, which in turn may reduce overall brain plasticity. This mechanism might account for the observed differences in effort required by children and adults to master novel tasks.

Models

Modal Model

Atkinson and Shiffrin's 1968 model proposes distinct components for short-term and long-term memory storage. This model posits that short-term memory is constrained by both its capacity and temporal duration. Evidence indicates that musical short-term memory is processed distinctly from verbal short-term memory. Berz (1995) reported divergent correlations between modality and recency effects in language versus music, implying the involvement of distinct encoding mechanisms. Furthermore, Berz illustrated varying degrees of task interference attributable to linguistic versus musical stimuli. Ultimately, Berz supported a separate store theory via the "Unattended Music Effect," asserting that "If there was a singular acoustic store, unattended instrumental music would cause the same disruptions on verbal performance as would unattended vocal music or unattended vocal speech; this, however, [is] not the case".

Baddeley and Hitch's Model of Working Memory

The 1974 Baddeley and Hitch model of working memory comprises three distinct components: a primary central executive and two subordinate systems, the phonological loop and the visuospatial sketchpad. The central executive's principal function involves mediating interactions between these two subsidiary systems. The visuospatial sketchpad is responsible for retaining visual information. The phonological loop is further delineated into two sub-components: the Articulatory control system, often termed the "inner voice," which facilitates verbal rehearsal; and the Phonological store, or "inner ear," dedicated to speech-based information storage. Significant critiques of this model highlight its omission of musical processing and encoding, as well as its disregard for other sensory inputs, specifically concerning the encoding and storage of olfactory, gustatory, and tactile data.

A Theoretical Model of Memory

William Berz (1995) introduced a theoretical model building upon the Baddeley and Hitch framework. Berz's modifications incorporated a distinct musical memory loop, conceived as a somewhat independent addition to the phonological loop. This novel musical perceptual loop integrates musical "inner speech," complementing the verbal "inner speech" function of the original phonological loop. Furthermore, Berz proposed an additional loop to accommodate other sensory inputs that the Baddeley and Hitch model had previously overlooked.

Koelsch's Model

Stefan Koelsch and Walter Siebel delineated a model wherein musical stimuli are processed along a sequential timeline, disaggregating auditory input into distinct characteristics and semantic content. They posited that initial sound perception activates the auditory nerve, brainstem, and thalamus. During this phase, approximately 10–100 ms post-stimulus, features such as pitch height, chroma, timbre, intensity, and roughness are extracted. Subsequently, melodic and rhythmic grouping takes place, which is then registered by auditory sensory memory. This is followed by an analysis of intervals and chord progressions, leading to the construction of harmony based on metre, rhythm, and timbre, occurring approximately 180–400 ms after initial perception. Structural reanalysis and repair processes then transpire around 600–900 ms. The final stage involves the activation of the autonomic nervous system and multimodal association cortices. Koelsch and Siebel further suggested that, between approximately 250 and 500 ms, the interpretation of sound meaning and associated emotional responses continuously unfold, evidenced by the N400, a negative deflection observed at around 400 ms in event-related potential measurements.
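The overlapping time windows in this account can be laid out as a small lookup table. The sketch below is only a reading aid: the stage labels, the `stages_at` helper, and the treatment of the approximate windows as hard boundaries are assumptions, and the stages for which no timing is given above (grouping, and autonomic and multimodal activation) are omitted.

```python
# (stage label, approximate start in ms, approximate end in ms)
STAGES = [
    ("feature extraction", 10, 100),
    ("interval and chord analysis, structure building", 180, 400),
    ("meaning and emotion", 250, 500),
    ("structural reanalysis and repair", 600, 900),
]


def stages_at(t_ms: float) -> list:
    """Stages whose approximate window contains t_ms after stimulus onset."""
    return [name for name, start, end in STAGES if start <= t_ms <= end]


# At 300 ms, structure building and the interpretation of meaning overlap.
active_at_300 = stages_at(300)
```

The overlap at around 300 ms reflects the point made above that the interpretation of meaning unfolds continuously alongside structural processing instead of waiting for it to finish.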

Jäncke's Model

Lutz Jäncke also formalized the processes underlying the perception and storage of musical information. He characterized music perception as a serial-to-parallel conversion, conceptualized as the integration of sequential data across multiple hierarchical levels. Music, initially comprising a series of individual tones, is integrated into motives, which are then further integrated into phrases. Subsequently, these melodic fragments coalesce into larger melodic clusters or even the complete musical composition.

Musical information is retained within a specialized memory system, which bears resemblance to the associative memory framework proposed by Raaijmakers and Shiffrin, as well as Kalveram's model of inverse processing. Within this system, auditory input is projected to a memory store, conceptualized as a correlation storage where auditory information becomes linked with diverse other data. Consequently, musical information can be associated with motor programs, semantic memory, episodic memory, autobiographical memory, implicit memory, emotional states, and numerous other cognitive aspects. The robustness of these correlations directly influences the strength of the musical memory's association with other information, with these correlations being established throughout the learning process.
