The emergence of the internet has had a great impact on language use and communication. The biggest changes include the speed at which information spreads, the disregard for grammatical and orthographic rules (which nowadays influences everyday language use) and, most importantly, the very frequent use of non-linguistic elements with specific communicative functions. In accordance with this tendency, a large proportion of digital communication is performed via pictures and pictorial/textual elements, even in cases in which a textual description would carry the information by itself.

The attention-grabbing function of pictures is not in doubt today, but the question remains: how do these pictorial and pictorial-textual elements, which occur so frequently in the digital era, actually work? What kind of meaning-making processes do they involve?
The aim of the present paper is to offer a possible interpretation of these digital elements from a cognitive linguistic perspective. The main hypothesis is that these elements are constructed and perceived as a specific language, in which the pictures (and in some cases texts) function as the constructional elements.

In order to analyze these elements, the first section of the paper presents traditional meme theory, followed by digital meme theory. The instances chosen for analysis are intentionally not called memes yet. The following section presents different types of pictorial and pictorial-textual elements. The theoretical framework is provided by cognitive linguistic notions such as categorization, frames, Conceptual Metaphor Theory, iconicity, mental spaces and Conceptual Integration Theory. The aim is to build a possible framework, based on these notions, that may explain the working mechanism of the digital elements in question.

Since the instances in this case are referred to as “memes” by their users and producers, a brief look is first taken at meme theory.

Meme Theory

According to the Oxford English Dictionary, the word “meme” has the following meaning:
  1. An element of a culture or system of behavior passed from one individual to another by imitation or other non-genetic means.
  2. An image, video, piece of text, etc., typically humorous in nature, that is copied and spread rapidly by Internet users, often with slight variations.
The two definitions differ from each other in several respects. How could an element of a culture become something that is popular on the internet for a few days, but disappears after that? These two definitions suggest differences that are then confirmed by the theoretical treatment of memes, which distinguishes traditional memes from digital memes.

As Susan Blackmore puts it in her book The Meme Machine: “When we imitate somebody, something is passed through. This something can be given further and further, getting its own life in this way. We can call this thing an idea, an instruction, behavior, a piece of information. If we want to study it, we have to give it a name. Fortunately, there is such a name. This is the meme.”[1] On this view, the world we live in, our personality and our feelings are all memes. Blackmore frequently quotes the evolutionary biologist Richard Dawkins, who introduced the word meme in its now widely known meaning: “The new replicators must be given a name that expresses the unity of cultural transmission, the unity of imitation. The word mimema has a spacious Greek sounding, but I would like to find a monosyllabic name that sounds a little bit like ’gene’. Hopefully my friends with classical literacy forgive me for abbreviating the word mimema to meme. A meme can be a melody, a thought, a keyword, a fashion, a method.”[2] Dawkins further argues that even religion is a meme, or more specifically a set of interconnected memes that developed separately. Both authors claim that memes like religion have to make a big psychological impact, since this is the condition of their survival. Faith offers a seemingly simple answer to the deep and tormenting questions of life: it suggests that the injustices committed in this world can be corrected in a next life. However, a meme spreads regardless of whether its content is positive or negative. According to Dawkins, a good meme has to fulfill three conditions: reliability (copying fidelity), productivity (fecundity) and lifetime (longevity). This means that the copies must be precise, as many copies as possible have to be made, and they have to survive as long as possible.[3]

From the position described above we know what a meme is; seen from that perspective, the whole world and its elements are memes, even a melody, a dance step or a thought. Now we can understand the cultural aspect mentioned in the Oxford English Dictionary and the meme’s tendency to spread. Here we have to note that this interpretation of the word is probably the basis of the second meaning; as a result, the meme concept of the second definition has some of the properties mentioned above, but has also gained new ones.

At this point we can probably begin to see the connections, in both content and functioning, between the traditional meme concept on the one hand and the pictures spreading via the internet on the other. The latter also convey knowledge or an opinion; they make statements, they are humorous, and their main purpose is to reach many internet users.[4] In some respects, however, the two levels still seem to diverge, since the meme as such would need other characteristics to remain viable in the digital environment.

The Digital Meme

Limor Shifman claims in her study that memes were invented decades before the digital era, but that the internet made them a phenomenon visible every day. In her view, the contrasting positions in the scientific debates about memes (on the one hand, that everything in the world is a meme; on the other, that the meme does not exist and is merely a useless construct) should be brought closer to each other, and memes should also be analyzed from a communication perspective. While there is an intense academic debate about the definition itself, among internet users it has become fashionable not just to spread but also to create memes. It is important to note that, for internet users, the digital meme is a form of expressing an idea or thought through text or pictures. A significant difference from the traditional definition is that a digital meme does not necessarily have a long lifetime; and while the traditional meme is abstract or ambivalent, the digital meme is expressed by very concrete utterances, such as YouTube videos or meme groups, e.g. One Does Not Simply, Forever Alone, Grumpy Cat, etc.[5]

Shifman analyzes the cultural aspects of the traditional meme concept in relation to digital memes. She investigates at what specific level digital memes spread from one person to another, whether the copies are modified, and which memes are viable. In this comparison Shifman found that websites based on sharing content, such as YouTube, Facebook and Twitter, are a fertile environment for memes, since the shared content reaches masses of people within a couple of hours. Regarding the second aspect, she claims that in verbal communication a person processes the information before forwarding it, thereby modifying it and shaping it to his or her personality, and so conveys a somewhat altered version of the original information. In the digital environment, by contrast, we can forward, attach or share content with a single click, without making any change to it. Additionally, there are specific applications and websites that allow internet users to create their own memes, or to modify and rethink existing ones. With respect to lifetime, Shifman claims that in the digital era a given meme can be tracked by anybody, so its survival has become more verifiable; however, a long lifetime is not expected in this case.

In the present paper, only those instances are considered to be memes that contain either a pictorial element modified by digital techniques in order to achieve a certain meaning, or textual-pictorial elements created with the same purpose. Generally, these memes express a thought or an opinion in a witty, pun-like manner, occasionally with sarcasm. A critical attitude is also characteristic of these memes.

Since the general statements I have made so far are mainly experience-based, the next section presents a cognitive linguistic framework that may serve as an explanation of the behavior of memes. My basic hypothesis is that memes can be described within a cognitive grammar-based framework.


Since categorization is considered in cognitive linguistics to be the basis of conceptualizing the world, it should be helpful in interpreting memes. A grammar-based categorization theory is presented here, since it is assumed that memes could have a specific grammar.

According to Kövecses, the cognitive linguistic view of (linguistic) categories is meaning-based: just like categories in general, linguistic categories are organized around prototypes. It is suggested that categories have central and less central instances. Thus, for example, central cases of nouns include table, ball, water, boy and girl, while less central or less prototypical cases include invitation, fear, running and collapse. Although these latter words are grammatically nouns, conceptually they have a more active meaning.[6]

Based on the above analysis, we can assume that specific categories can be defined among memes. A prototypical meme would look like the picture in Figure 1.

Figure 1

Regardless of the real or assumed connotations that this picture may carry, what I am attempting to present with it is the very composition of memes. Text is added at the top and the bottom of a picture. The typography of the text also appears to be uniform. The interpretation process is not linear: we perceive the picture first, then we decode the text, with every linguistic feature it may contain.

Less prototypical memes include those that contain only bottom text, those that contain no picture but whose typography indicates that we are encountering a meme, those in which the text is placed neither at the top nor at the bottom (but somewhere in the middle of the picture), and those that contain only pictorial elements. For some examples, see Figure 2.[7]

Figure 2

Kövecses elsewhere claims that a prototype is the best example of a conceptual category. The instances of a conceptual category are the members belonging to it. The members that belong together can be concepts for objects and events in the world, senses of words (e.g. love) or linguistic categories (noun, verb, etc.). Prototypical members (the best examples) are represented as conceptual frames; nonprototypical members are given as modifications or “deviations” from frames or prototypical members.[8]

Based on this position it can be claimed that the prototypical meme is the one with picture and text. More peripheral elements of the category of meme are those containing only text or only picture.


To make the notion of frames transparent, Kövecses elaborates on framing in cognitive grammar as follows: in the case of the sentence Sara faxed Jeremy the invoice, we have to take into account not only the meaning of the noun fax (a machine or system for transmitting documents via telephone wires), but also the frame-based meaning of the construction that the sentence evokes, namely the ditransitive construction, in which someone gives someone something. Kövecses also states that schematic constructions have a meaning, and that this meaning is crucial in understanding sentences.[9]

According to this statement, if memes are perceived as schematic constructions, an important meaning is carried by the frame that is not explicitly present in the particular digital content. The highly schematic knowledge in general is the following: within that pictorial-textual digital utterance the conceptualizer finds a thought, an opinion, a comment, a reaction to a situation, communicated in a witty, sarcastic, occasionally juvenile style. The communicative function, the ultimate meaning, is the same in the case of the peripheral elements, too.

However, frames can be found on a more concrete level, too. On websites focusing on generating memes, but also in a Google search, templates are available for encouraging further creation. These templates are the most frequently used pictorial elements among memes. Some templates are presented in Figure 3.

Figure 3

These and similar templates work as frames for memes, depending on the meaning the creator intends to convey. Here I will analyze two of the above pictures in detail and add a brief explanation of the others.

Quite often, pictures of famous movie characters are chosen to serve as a frame. In these cases, additional knowledge about the character, the actor and the particular movie may help the interpretation. However, some of these templates are conventionalized to such a degree that the memes created with them can be interpreted without contextual, pop-culture-related knowledge. In some cases, references are made to the movie from which the scene was captured, and the top text of the meme may be a sentence from the movie; in other cases the picture is only an illustration of the particular character, whose characteristics are passed on to the content of the meme.

The “One Does Not Simply” Memes
Some “completed” versions of this type of meme can be seen in Figure 4.

Figure 4

This meme is one of the most popular ones. The scene is from The Lord of the Rings movie and captures Boromir (played by Sean Bean) saying: “One does not simply walk into Mordor.” According to the story, Mordor is a very dangerous place with life-threatening creatures.

Assuming that the conceptualizer of these memes is familiar with the movie, he or she knows about Mordor, its dangers and the sentence said by Boromir, and so may create a blend between the dangers of Mordor and the everyday situations described by the altered part of the sentence, such as engaging in a political debate and staying calm, or not getting distracted by memes while collecting data. If the blend is created, these minor situations can be conceptualized as life-threatening dangers. Since the meme under discussion is highly conventionalized, conceptualization is also possible without the background knowledge. In that case, the conceptualizer may perceive these memes as generally expressing impossible situations or critiques of dismissing unwritten rules. However, the wittiness of this type of meme will not come through in the latter case, since the blend presented above, which is the source of the playful meaning and hence of the humor, is not present.

The “Grumpy Cat” Memes

Some of the “completed” templates are given in Figure 5.
Figure 5

According to Wikipedia, Tardar Sauce is a cat and internet celebrity known for her “grumpy” facial expression, and thus known by the common name Grumpy Cat. Her owner says that her permanently grumpy-looking face is due to an underbite and feline dwarfism. Grumpy Cat’s popularity originated from a picture posted to the social news website Reddit in 2012. In March 2016 “The Official Grumpy Cat” page on Facebook had over 8.5 million likes.

In the case of Grumpy Cat, very few people are aware of the background information presented above. Presumably the majority of people who are familiar with Grumpy Cat focus only on her facial expression. Usually some mean, passive-aggressive messages are associated with the pictures presenting this cat. Technically, the attitude associated with the cat’s facial expression is the main aspect that defines the frame in this case. I will provide a deeper analysis of this below.

Meanings carried by the frame can also be discerned in the other templates presented in Figure 3. The template in the upper right corner shows Willy Wonka, played by Gene Wilder, from the movie Willy Wonka & the Chocolate Factory. This meme is entitled “Tell Me More” and usually reacts to situations in which someone is talking about irrelevant topics or is detached from reality. The memes that contain the dinosaur are called “Philosoraptor”; in them, usually unpleasant, irritating, pointless or ironic questions are formulated textually, such as “If time is money, are ATMs time machines?” (The questions in these memes are undoubtedly rather intriguing from a cognitive linguistic point of view.) The small yellow creature originates from the animated movie Minions; the texts added generally contain some nice, kind, funny content. The template in which Leonardo DiCaprio can be seen is from the movie The Great Gatsby, in which DiCaprio played Gatsby. The content is usually “cheers!” to someone.

There are differences among the templates with regard to whether the interpretation of the memes created with them requires broader knowledge about the origin of the picture. Despite this, it can be stated that since the templates are frequently used with contents that work with a similar frame, the pictures used as templates accumulate a certain content that may not be present either in the original picture or in the instantiations created with the same image. This aspect of such memes may be perceived as a frame-based meaning of the construction that is crucial in understanding them as a whole, because, for a total outsider, these pictures (in their template format) do not have the meanings described in the present section.

According to this interpretation, I suggest that template-based memes might be the prototypical elements of the category, since their template carries a frame-based meaning that assures these memes (variants) a longer lifetime.

In order to properly describe the less prototypical elements of the category, further cognitive linguistic features will be involved. I will refer to these less prototypical memes as ad-hoc memes.

Ad-hoc Memes

I consider as ad-hoc memes those that are inspired by popular topics like sports or politics. With respect to their form, they may seem prototypical or less prototypical (they can be labeled with top and bottom texts in the very specific typography, but they can also contain black-framed pictures, or lack one or both of the textual elements). The main difference between these memes and the template-based ones is that ad-hoc memes have a much shorter lifetime, since they are generally created in response to some specific event: being closely tied to that event, they work only in the period of time when the topic is active.

Some examples of ad-hoc memes are presented in Figure 6.

Figure 6

The first sequence presents football-related memes, the second some political ones. In the first two pictures Mario Balotelli can be seen; the picture used was taken at the UEFA European Football Championship 2012 and captures a memorable goal celebration in the semifinal. The two pictures of Cristiano Ronaldo below were taken at the UEFA European Football Championship 2016. In each case the pictures have clearly been manipulated with photo editing programs. The original pictures were presumably taken without the intention of generating memes; however, they present highly expressive representations of famous characters. These expressions and gestures induce further interpretations, and thereby the creation of many varieties of a meme that exploit the same basic picture. The memes created through such a reconstruction can be interpreted quite well through Conceptual Metaphor Theory and also, very fruitfully, through Conceptual Integration Theory.

The first two political memes are created with the same technique. The main difference from the memes about football stars may be that the inductive factor the final meme relies on is not the physical appearance of the characters, but rather some real or assumed personality traits shared by the two (or more) persons who are mixed in the final result.

In the case of the memes presenting Barack Obama and George Bush, the pictures used express emotions that fit the message composed in the texts. As can be seen, in these cases the pictures do not contain a frame-based meaning, in contrast to the template-based memes. It is more likely that in those cases where there is no textual element, the original picture and the modified entity work as input spaces of a blend.

Another obvious aspect is that where a text is added, the pictures are not manipulated, so in these cases the pictorial element functions mainly as a context-creating entity. In each case, however, a certain lexical knowledge is required in order to recognize the characters.

Cognitive linguistically speaking, some further aspects can help the interpretation process of the memes.

The togetherness of texts and/or pictures has a reciprocal impact on both constructional elements.


In terms of cognitive grammar, Kövecses defines iconicity in the following way: when a sign (a word or phrase, or a gesture in sign languages) resembles what it is a sign for, we talk about iconicity. He argues that a more complicated type of iconicity is one in which there is an isomorphism between conceptual structure and linguistic structure. We can think of this type of iconicity as a form of metaphorical conceptualization, where the metaphor is STRENGTH OF EFFECT IS CLOSENESS OF FORM. The following two sentences are given as an example:
John killed Bill.
John caused Bill to die.

Kövecses claims that when we use a single word for a complex concept, it suggests a unitary construal, since the word kill implies the action and the result, while the word caused leaves these entailments opaque.[10]

With regard to memes, we can suppose that the closeness of textual/pictorial elements results in different levels of tightness of closure. In template-based memes, the closure seems to be very tight, since one part of the text is always the same. Some memes have been created that reflect this aspect, like the one in Figure 7.

Figure 7

In the case of ad-hoc memes, however, there are two possibilities: if there is a text, the picture is usually needed in order to place the text in context. If we remove the pictures, the text in itself would become nonsensical, or we would have too many unknown elements. In those ad-hoc memes where two pictures are merged and there is no text, we could remove those elements that were placed upon the original picture post factum (e.g. the ballet dancer’s lower body, the mustache and the beard – which elements metonymically stand for those concepts that are blended with the characters from the original pictures), but in this way the meme as a construction would be destroyed.

A specific logic of iconicity seems to be in place in the creation and interpretation of memes. Tight closure is straightforward in the prototypical elements of the category, where it has resulted in conventionalization; in the less prototypical elements, however, due to the weaker connection, the textual and pictorial elements presuppose each other.
Some level of closure is present in all of the instances, since the construction of the prototypical meme requires it. It may be assumed that ad-hoc memes do not have the time to become conventionalized, since once the main topic they refer to vanishes, they are no longer needed.

Humor in Memes

According to Kövecses, “one of the striking features that one notices about humorous expressions from a cognitive linguistic perspective is the very noticeable presence of a number of ’figurative’ cognitive devices in the expressions. These include metonymy, metaphor and blending.”[11] He further emphasizes that two kinds of evidence indicate that figurative devices are neither sufficient nor necessary for a humorous effect. First, there are humorous expressions that do not contain any of the figurative devices mentioned above; second, there are expressions that do involve figurative devices but are not humorous in their effect.

Two basic elements are defined that play a major role in building up humorous content. On the one hand, understanding in some cases requires familiarity with some literal conventional knowledge. On the other hand, the additional element needed is the notion of incongruity, incompatibility or contrast inside or between conceptual frames of knowledge, whether figurative or literal. Kövecses defines the following specific kinds of incongruity:

Real vs. imagined
Possible vs. impossible
Socially neutral/expected/acceptable vs. socially unacceptable/stigmatized/taboo
Elevated vs. mundane
Large amount vs. small amount
Natural vs. constructed
Positive vs. negative evaluation
Action vs. event
Logical incongruity
Linguistic/discourse incongruity

Evidence for linguistic humor is presented for each type. Now my suggestion is that since memes are more complex than pure linguistic utterances, the types listed above can actually appear simultaneously.

Take Figure 8 as an example.[12]
Figure 8

The picture shows Franz Joseph, the Emperor of Austria, King of Hungary and Croatia. The relevant information about him in order to understand the meme is that he concluded the Ausgleich (the “Compromise”) of 1867, which granted greater autonomy to Hungary, hence transforming the Austrian Empire into the Austro-Hungarian Empire under his dual monarchy.

My suggestion is that here we have a blend with two input spaces. In Input space 1 we have the historical age in which Hungary was part of the Austro-Hungarian Empire, and the person presented in the picture was the emperor, hence metonymically representing the former country (LEADER OF THE COUNTRY FOR THE COUNTRY). In Input space 2 we have the European Football Championship 2016, in which the Hungarian football team played against the Austrian team. This input space is indicated by conventional knowledge in a wider sense, and more specifically by the words match and play. In this way, two time spaces are active at the same time.

With regard to the humorous effect, the following incongruities might be discovered:

Real vs. imagined
Possible vs. impossible
Elevated vs. mundane
Natural vs. constructed
Logical incongruity
Linguistic/discourse incongruity

In the picture a real situation is combined with an imagined one. The latter situation was also real, but a hundred years ago, so from the perspective of the present age it is imagined. Due to the blend, the possible situation is the actual football match, but if we stay within the constructed blend, the match takes on the following online meaning: The Austro-Hungarian Empire is going to play a football match against itself.

The elevated/mundane pairing can be observed in the topics of the input spaces. Franz Joseph is considered to be one of the greatest emperors in history; this is elevated in comparison to a football match, which seems mundane. The manner of the imagined dialogue can reinforce this aspect, since an emperor could not be addressed by just anyone on rumor-like topics.

The natural vs. constructed incongruity can be derived from the constructional aspects of memes in general, while the input spaces that serve as a basis might be natural.

The logical incongruity appears as a consequence of the constructed blend. It carries a latent meaning of the resurrection of Franz Joseph, who has the general knowledge of his historical age but encounters a situation in the present age: he is addressed with a present-time question, and he responds with the knowledge that was current in that past historical age.

The linguistic incongruity is the “physical” representation of the logical one.

In conclusion, we might claim that the incongruities can correlate with and reinforce each other, in this way building up the humorous content together with the figurative devices mentioned above.

I assume that these figurative devices can be found in every meme. To give just a brief interpretation: if we return to the “Grumpy Cat” memes, one can assert that the cat in the picture is humanized via the conceptual metaphor ANIMAL IS HUMAN, a fact that may have a constructional basis, since the pictures used for creating memes prototypically present people. Once the conceptual metaphor is in place, a metonymical process starts that relies on the PHYSICAL EFFECT OF THE EMOTION FOR THE EMOTION metonymy; this is how the cat becomes “grumpy”. The texts added to these pictures also rely on the ANIMAL IS HUMAN metaphor, since due to the force dynamics (the tight closeness of the picture and text) we perceive these memes as if the linguistic messages were formulated by the cat. From the aspect of creating humor, we can obviously find the Real vs. imagined / Possible vs. impossible type of incongruity, but the Natural vs. constructed and the Logical incongruity types also seem possible, and maybe some more, depending on what frame the textual element includes.


Memes cannot be categorized simply by prototypicality. Additional aspects must be taken into consideration, like form, lifetime (which depends on the topic they are connected to), and the quantity of the figurative devices they operate with. As can be seen, the ad-hoc memes need to use more figurative devices than the template-based ones, in order to survive the selection process that they are exposed to by definition. Even though they are constructed with the awareness of their shorter lifetime, they strive for the production of as many copies as possible. In the case of the template-based memes it can be suspected that it is not the particular meme with specific elements that has a bigger chance to survive longer, but rather the template, the frame, which has a meaning in itself.  

With regard to the humorous effect, it is highly likely that every meme relies on certain incongruities. This is most likely a basic feature of the creation process, since generally in every meme two entities are put together that are originally conceptualized as being different from each other.

Possible categorizations are presented in Figure 9, Figure 10, and Figure 11. In every case the central element represents the type considered to be the most prototypical one.

Figure 9: Categorization based on formal aspects

Figure 10: Categorization based on lifetime

Figure 11: Categorization based on the quantity of figurative operations used

