Emoticons and emojis
Emojis can be transliterated with the Python module emoji, while emoticons can be processed and converted using regular expressions, as exemplified in [s6.06].
The options and arguments accepted by the emoji module are described in its official documentation.
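As a rough illustration of the regular-expression approach mentioned above for emoticons, the following sketch replaces a handful of emoticons with labels. It is not the script in [s6.06]: the EMOTICONS mapping, the label names and the function name convert_emoticons are all hypothetical and chosen only for this example.

```python
import re

# Hypothetical mapping from emoticons to labels (an assumption for this sketch);
# extend or replace it to suit the data being processed
EMOTICONS = {
    ":)": "{smiling_emoticon}",
    ":-)": "{smiling_emoticon}",
    ":(": "{frowning_emoticon}",
    ":-(": "{frowning_emoticon}",
    ";)": "{winking_emoticon}",
}

# Build a single pattern that matches any listed emoticon; each emoticon is
# escaped because characters such as ( and ) are regex metacharacters, and
# longer emoticons are listed first so they are tried before shorter ones
EMOTICON_PATTERN = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICONS, key=len, reverse=True))
)


def convert_emoticons(text):
    """Replace every emoticon listed in EMOTICONS with its label."""
    return EMOTICON_PATTERN.sub(lambda match: EMOTICONS[match.group(0)], text)


print(convert_emoticons("Nice to see you :-) but I have to go :("))
# Nice to see you {smiling_emoticon} but I have to go {frowning_emoticon}
```

Escaping each emoticon with re.escape keeps the pattern valid as the mapping grows, since most emoticons are built from characters that have a special meaning in regular expressions.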
Install the required module
pip install emoji
CATLISM, 345-346
Function to transliterate emojis (using two different output formats)
The function defined in [s6.04] can be imported into any script and called with the syntax
demojize(INPUT, OUTPUT_FORMAT)
# Import the required module to transliterate emojis
import emoji


# Define the function called 'demojize'
def demojize(text, output):
    """Converts emoji(s) found in a string of text into their transliterated CLDR version; input is:

    text: the string of text with one or more emojis
    output: the format of the output.

    If 'output' is set to 'default', the result for 🙃 is {upside-down_face}
    If 'output' is set to 'custom', the result is {upside^down^face} (surrounded by spaces)

    Usage follows the syntax
    demojize(INPUT, FORMAT)
    """

    # If 'output' is set to 'default', apply the standard transliteration using curly brackets as delimiters
    if output == "default":
        return emoji.demojize(text, delimiters=("{", "}"))
    # Else if set to 'custom' do:
    elif output == "custom":
        # Create a list and store inside of it the characters of the text to be processed
        out_text = list(text)
        # Use the function 'emoji_count' to count the total number of identified emojis
        emoji_count = emoji.emoji_count(text)
        # For each identified emoji do:
        for _ in range(emoji_count):
            # Take the first emoji still present in the text, from the list of emojis created through the function
            # 'emoji_list' (which expects a string, hence the join). The function creates, for each emoji, three
            # data-points: 'emoji' containing the actual emoji; 'match_start' indicates the positional value of the
            # first character of the emoji; 'match_end' the positional value immediately after its last character.
            first_emoji = emoji.emoji_list("".join(out_text))[0]
            # Store the three aforementioned data-points in three separate variables
            found_emoji = first_emoji["emoji"]
            emoji_start = first_emoji["match_start"]
            emoji_end = first_emoji["match_end"]
            # Apply the standard demojize function to the identified emoji, and replace the underscore _ with the character ^
            demojized = (
                " " + emoji.demojize(found_emoji, delimiters=("{", "}")) + " "
            ).replace("_", "^")
            # Replace the hyphen with the character ^
            demojized = demojized.replace("-", "^")
            # Replace the emoji with its transliterated version in the original text
            out_text[emoji_start:emoji_end] = demojized
        # Return the full text with transliterated emojis
        return "".join(out_text)