text_summary module
Created on Mon Aug 5 19:56:19 2019
@author: Daniel
-
class text_summary.TextSummary(data=None)
Bases: object
A representation of the basic statistics of a set of texts.
-
data
Input DataFrame of texts.
- Type
pandas.DataFrame
-
count
Counts of each item type, with keys texts, words, char, space, letter, digit, emotes, punct.
- Type
dict
-
prop
Overall statistics expressed as fractions or ratios (laziness, % of emoji, words per text, verbosity).
- Type
dict
-
occurrence_dicts
Dictionaries of the form {token: count}, where tokens are words or emotes.
- Type
dict
-
per_text_lists
Per-text statistics (sentiment polarity and subjectivity, words per text, characters per text, emotes per text).
- Type
dict
-
compare_freq(other, token)
Find differences in word or emoji use frequency.
- Parameters
other (TextSummary) – TextSummary to compare to.
token (string) – key selecting what to compare (words or emotes)
- Returns
dictionary whose keys are the compared tokens (words or emotes) and whose values are tuples (total, expected ratio)
- Return type
diff_dict (dict)
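The exact definition of the (total, expected ratio) tuple is not given here, so the following is only a minimal sketch of one plausible reading, not the module's implementation. It assumes that total is the combined count of a token across both sources and that the ratio is the observed count in the first source divided by the count expected if both sources used the token at the same pooled rate. The function name compare_freq_sketch and the sample sentences are hypothetical.

```python
from collections import Counter


def compare_freq_sketch(counts_a, counts_b):
    """Compare two {token: count} dictionaries.

    Returns {token: (total, ratio)}. This interpretation of the tuple is an
    assumption made for illustration only.
    """
    size_a = sum(counts_a.values())
    size_b = sum(counts_b.values())
    if size_a == 0:
        return {}
    # Share of all tokens that come from the first source.
    share_a = size_a / (size_a + size_b)

    diff = {}
    for token in set(counts_a) | set(counts_b):
        total = counts_a.get(token, 0) + counts_b.get(token, 0)
        if total == 0:
            continue  # token listed with a zero count in both sources
        expected_a = total * share_a
        diff[token] = (total, counts_a.get(token, 0) / expected_a)
    return diff


a = Counter("the cat sat on the mat".split())
b = Counter("the dog sat".split())
result = compare_freq_sketch(a, b)
```

Under this reading, a ratio above 1 means the first source uses the token more often than the pooled rate predicts, and a ratio of 0 means it never uses it.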
-
get_conversations(names)
Get a list of conversations.
- Parameters
names (list) – names of senders.
- Returns
a list of dictionaries with conversation information
- Return type
convos (list)
-
get_counts(word)
Find number of occurrences of word in each text.
- Parameters
word (string) – the word to find.
- Returns
list of integer counts.
- Return type
counts (list)
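As a rough illustration of per-text counting, the sketch below works on a plain list of strings rather than on the class's DataFrame. Whole-word, case-insensitive matching is an assumption; the module may match differently. The function name count_word_per_text is hypothetical.

```python
import re


def count_word_per_text(texts, word):
    """Return how many times `word` occurs in each text as a whole word."""
    # Assumption: whole-word, case-insensitive matching.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", flags=re.IGNORECASE)
    return [len(pattern.findall(text)) for text in texts]


texts = ["The cat saw the other cat", "Concatenate this", "no match here"]
counts = count_word_per_text(texts, "cat")
```

The word-boundary anchors keep "Concatenate" from counting as a match for "cat".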
-
set_counts(raw_text, emote_free_text)
Set the count statistics.
- Parameters
raw_text (string) – the original concatenated text
emote_free_text (string) – the emote/emoji-free concatenated text
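The character-class counts could be tallied along the following lines. This is a sketch under stated assumptions, not the module's code: it takes a single string, because how the work is split between raw_text and emote_free_text is not described here, and it treats string.punctuation (ASCII only) as the punctuation set. The function name count_characters is hypothetical.

```python
import string


def count_characters(text):
    """Tally basic character classes in one concatenated string."""
    return {
        "char": len(text),
        "space": sum(ch.isspace() for ch in text),
        "letter": sum(ch.isalpha() for ch in text),
        "digit": sum(ch.isdigit() for ch in text),
        # Assumption: ASCII punctuation only.
        "punct": sum(ch in string.punctuation for ch in text),
        "words": len(text.split()),
    }


counts = count_characters("Hi there, it is 9 am!")
```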
-
set_occurrence_dicts(raw_text, emote_free_text)
Fill a dictionary with the occurrences of each word and each emote.
- Parameters
raw_text (string) – the original concatenated text
emote_free_text (string) – the emote/emoji-free concatenated text
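A minimal sketch of building {token: count} dictionaries with collections.Counter follows. The small EMOTES set, the whitespace tokenisation, and the lower-casing of words are all assumptions made for illustration; the module's own emote detection is not described here. The function name build_occurrence_dicts is hypothetical.

```python
import re
from collections import Counter

# Hypothetical emoticon set used only for this illustration.
EMOTES = {":)", ":(", ":D", ";)"}


def build_occurrence_dicts(raw_text):
    """Return {"words": {...}, "emotes": {...}} for one concatenated string."""
    tokens = raw_text.split()
    emotes = Counter(tok for tok in tokens if tok in EMOTES)
    # Strip punctuation from the remaining tokens and normalise case.
    words = Counter(
        re.sub(r"[^\w']", "", tok).lower()
        for tok in tokens
        if tok not in EMOTES
    )
    words.pop("", None)  # drop tokens that were punctuation only
    return {"words": dict(words), "emotes": dict(emotes)}


dicts = build_occurrence_dicts("Nice one :) nice work, really :) :D")
```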
-
set_per_text_lists()
Set the per text list dictionary with words per text, characters per text, and emotes per text.
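For orientation, per-text lists of this kind could be computed as in the sketch below. It operates on a plain list of strings instead of the DataFrame held in data, omits emotes per text, and uses hypothetical key names; the keys actually stored in per_text_lists are not specified here.

```python
def build_per_text_lists(texts):
    """Compute simple per-text measurements for a list of message strings."""
    return {
        # Key names are hypothetical, chosen for this illustration.
        "words_per_text": [len(text.split()) for text in texts],
        "chars_per_text": [len(text) for text in texts],
    }


lists = build_per_text_lists(["hello there", "ok", ""])
```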
-
set_props()
Set the proportion statistics.
-