A meme, as originally defined by Richard Dawkins in 1976, is a "unit of cultural transmission, or a unit of imitation". In other words, memes are concepts and customs, patterns and behaviors, attitudes and habits: the building blocks of culture, science, religion, and even society itself. In the loosest definition, a meme is anything that is transmitted between individuals by imitation. Dawkins elaborates on this by drawing an analogy to genes: both are, in a sense, self-replicating entities that propagate themselves through people and, thus, through time.
Understandably, this definition has been a source of much controversy. Susan Blackmore, in her work "The Meme Machine", identifies (and argues against) three problems with memes, of which two are immediately relevant:
The unit of a meme cannot be specified.
The mechanism for copying and storing memes is unknown.
These two problems place clear limits on quantitative study and are the basis of much criticism.
Patrick Davison, in his 2012 essay "The Language of Internet Memes", proposes the following definition:
Definition 1 (Internet meme). A piece of culture, typically a joke, which gains influence through online transmission.
It is important to stress that humor is not a requirement: internet memes have also been observed as vehicles for political propaganda, hate speech, and traumatic confessions. The significant part of this definition is online transmission, which requires the meme to be encoded in an internet-viable medium. Whether this medium is an image, text, or video, the result is readable data.
This additional property allows us to discern discrete units and to observe the mechanisms of replication and storage in action, at least within the online environment, which opens an opportunity for quantitative analysis.
A final level of granularity is required to reason about internet memes in a sufficiently precise manner. Image macros are perhaps the most representative subgenre of internet memes and are often regarded as the archetypal internet meme. The single required property is that the meme be encoded as an image file. Typically, the resulting image consists of:
A background image, chosen so that it is immediately recognizable to the intended audience and provides them with context.
Superimposed text as a caption, containing the message and sometimes additional contextual information.
No single format exists; however, most image macros share the property of being multimodal constructions of text and image.
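The two-part structure described above can be sketched as a small record type. This is only an illustrative model, not the schema of the dataset discussed in this work: the class name `ImageMacro`, the fields `background_id` and `caption_lines`, and the helper methods are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ImageMacro:
    """Illustrative model of an image macro: background image plus caption text."""

    # Hypothetical identifier for the recognizable background image.
    background_id: str
    # Superimposed caption, stored as zero or more lines of text (assumed layout).
    caption_lines: Tuple[str, ...] = ()

    def caption(self) -> str:
        """Join the non-empty caption lines into a single message string."""
        return " ".join(line.strip() for line in self.caption_lines if line.strip())

    def is_multimodal(self) -> bool:
        """True when both an image component and a text component are present."""
        return bool(self.background_id) and bool(self.caption())


if __name__ == "__main__":
    macro = ImageMacro(
        background_id="example-template",
        caption_lines=("TOP LINE", "BOTTOM LINE"),
    )
    print(macro.caption())
    print(macro.is_multimodal())
```

Storing the caption separately from the background keeps the two modalities individually addressable, which is one possible way to treat each component as a discrete unit; an image without any caption would simply fail the `is_multimodal` check in this sketch.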
A large volume of data has been amassed. The data were initially scraped from KnowYourMeme.
Subsequently, the data were enriched with Google Vision and DBpedia Spotlight, yielding a semantically rich meme dataset.