GeneratedAnalytics
Class
Python3 class to organize different kinds of data from
Keybase
export. Lives ingenerate_analytics.py
.
GeneratedAnalytics
Properties
The properties of the GeneratedAnalytics
class are maybe best to think of as "data reports." GeneratedAnalytics
"reports" are refreshed by corresponding Methods.
Generalized Pattern
{ "type" : STR "title": STR "x_axis": LIST of STR or INT "y_axis": LIST of INT "x_label": STR "y_label": STR }
user_list
GeneratedAnalytics.user_list = []
A list
containing string
elements that are the unique users in the database.
topic_list
GeneratedAnalytics.topic_list = []
A list
containing string
elements that are the unique topics in the database.
topic_list
GeneratedAnalytics.characters_per_user = {"user": [], "characters_per_user": []}
A dict
array with the total number of characters entered via messages to chat by element in user_list
.
characters_per_topic
GeneratedAnalytics.characters_per_user = {"user": [], "characters_per_topic": []}
A dict
array with the total number of characters per element in topic_list
.
messages_per_user
GeneratedAnalytics.messages_per_user = {"user": [], "messages_per_user": []}
A dict
array with the total number of messages to chat by element in user_list
.
messages_per_topic
GeneratedAnalytics.messages_per_topic = {"topic": [], "messages_per_topic": []}
A dict
array with the total number of messages per element in topic_list
.
number_users_per_topic
GeneratedAnalytics.number_users_per_topic = {"users_list": [], "topics_list": []}
A dict
array with the number of users per element in topic_list
.
reaction_per_message
GeneratedAnalytics.reaction_per_message = {"ordered_message_id":[], "num_reaction":[]}
A dict
array with the message ID and corresponding number of reactions to that message.
reaction_sent_per_user
GeneratedAnalytics.reaction_sent_per_user = {"ordered_user":[], "user_to_reaction":[]}
TODO
- [ ] Check that this is actually what we think it is. Looks like User gets sorted, but
user_to_reaction
is not?
reaction_popularity_map
GeneratedAnalytics.reaction_popularity_map = {"reactions":{}}
TODO
- [ ] Clarify what this is (or remove)?
reactions_per_user
GeneratedAnalytics.reactions_per_user = {"users_reactions":{}, "users_ordered":[]}
TODO
- [ ] Clarify what this is: lists of unique reactions per user?
received_most_reactions
GeneratedAnalytics.recieved_most_reactions = {"users_reactions":{}, "users_ordered":[]}
TODO
- [ ] Clarify what this is: list of messages with most reactions by user and what the reactions were?
edits_per_user
GeneratedAnalytics.edits_per_user = {
"users":{},
"ordered_users":[],
"ordered_num_edits":[]}
Which user had the most (raw) number of edits?
edits_per_topic
GeneratedAnalytics.edits_per_topic = {
"topics":{},
"ordered_topics":[],
"ordered_num_edits":[]}
Which topic had the most (raw) number of edits?
deletes_per_user
GeneratedAnalytics.deletes_per_user = {
"users":{},
"ordered_users":[],
"ordered_num_deletes":[]}
Which users had the most (raw) number of deletes?
deletes_per_topic
GeneratedAnalytics.deletes_per_topic = {
"topics":{},
"ordered_topics":[],
"ordered_num_deletes":[]}
Which topics had the most (raw) number of deletes?
who_edits_most_per_capita
GeneratedAnalytics.who_edits_most_per_capita = {
"users":{},
"ordered_users":[],
"ordered_edit_per_capita" : []}
Who edits most per message?
who_deletes_most_per_capita
GeneratedAnalytics.who_deletes_most_per_capita = {
"users":{},
"ordered_users":[],
"ordered_edit_per_capita" : []}
Who deletes the most per message?
topic_edits_per_capita
GeneratedAnalytics.topic_edits_per_capita = {
"topics":{},
"ordered_topics":[],
"ordered_edit_per_capita" : []}
Which topic channels had the most edits per message?
topic_deletes_per_capita
GeneratedAnalytics.topic_deletes_per_capita = {
"topics":{},
"ordered_topics":[],
"ordered_edit_per_capita" : []}
Which topic channels had the most deletes per message?
top_domains
GeneratedAnalytics.top_domains = {
"URLs":{},
"top_domains_sorted":[],
"num_times_repeated":[]}
What were the most-used top-level domains, what were the specific links, and how many times did they appear?
GeneratedAnalytics
Methods
Methods are mainly there to return "refreshed" versions of the data with respect to the database.
get_message
message = GeneratedAnalytics.get_message(MESSAGE_ID_NUM)
Returns a single row from Message object (SQL table) by ID.
get_reaction_per_message
GeneratedAnalytics.get_reaction_per_message()
Update the reactions to each message.
get_reaction_sent_per_user
GeneratedAnalytics.get_reaction_sent_per_user()
Update the reactions sent by each user.
get_num_messages_from_user
msg_changes = {"edit": INT, "text": INT, "delete": INT}
msg_changes.append(GeneratedAnalytics.get_num_messages_from_user("USER"))
Return object with number of times a text was edited or deleted for a given user.
get_num_messages_from_topic
messages = {"edit": INT, "text": INT, "delete": INT}
messages.append(GeneratedAnalytics.get_num_messages_from_topic("TOPIC"))
Return object with number of times a text was edited or deleted for a given topic.
get_list_all_users
GeneratedAnalytics.get_list_all_users()
Should be called from the object constructor; updates and returns list of all users in database.
TODO
- [ ] Set scoping for private/public methods?
get_list_all_topics
GeneratedAnalytics.get_list_all_topics()
Should be called from the object constructor; updates and returns list of all topics in database.
TODO
- [ ] Set scoping for private/public methods?
get_characters_per_user
characters_per_user = {"user": [], "characters_per_user": []}
characters_per_user.append(GeneratedAnalytics.get_characters_per_user())
Update and return total number of characters from messages for each user.
TODO
- [ ] Set scoping for private/public methods?
get_characters_per_topic
characters_per_topic = {"user": [], "characters_per_topic": []}
characters_per_topic.append(GeneratedAnalytics.get_characters_per_topic())
Update and return total number of characters from messages posted in each topic.
TODO
- [ ] Set scoping for private/public methods?
get_messages_per_user
messages_per_user = {"user": [], "messages_per_user": []}
messages_per_user.append(GeneratedAnalytics.get_messages_per_user())
Update and return total number of messages for each user.
TODO
- [ ] Set scoping for private/public methods?
get_messages_per_topic
messages_per_topic = {"user": [], "messages_per_topic": []}
messages_per_topic.append(GeneratedAnalytics.get_messages_per_topic())
Update and return total number of messages posted in each topic.
TODO
- [ ] Set scoping for private/public methods?
get_number_users_per_topic
number_users_per_topic = {"users_list": [], "topics_list": []}
number_users_per_topic.append(
GeneratedAnalytics.get_number_users_per_topic)
Update and return the number of users for each topic.
TODO
- [ ] Set scoping for private/public methods?
get_reaction_popularity_topic
reactions = {"reactions":{}, "list":[]}
reactions.append(GeneratedAnalytics.get_reaction_popularity_topic("TOPIC"))
Get popularity of all reactions in a topic corresponding to a specific topic (string
).
get_all_user_message_id
msgID = {"users_reactions":{}, "users_ordered":[]}
msgID.append(GeneratedAnalytics.get_all_user_message_id(user))
For a specific user (user
, string
), return all message IDs involving that user.
get_user_sent_most_reactions
GeneratedAnalytics.get_user_sent_most_reactions()
Return the sorted user by most number of reactions issued.
get_user_received_most_reactions
GeneratedAnalytics.get_user_received_most_reactions()
Update and return the sorted listing of users by number of reactions received.
get_edits_per_user
GeneratedAnalytics.get_edits_per_user()
Update and return the raw number of edited messages by user.
get_deletes_per_user
GeneratedAnalytics.get_deletes_per_user()
Update and return the raw number of deleted messages by user.
get_edits_per_topic
GeneratedAnalytics.get_edits_per_topic()
Update and return the raw number of edited messages by topic.
get_deletes_per_topic
GeneratedAnalytics.get_deletes_per_topic()
Update and return the raw number of deleted messages by topic.
get_who_edits_most_per_capita
GeneratedAnalytics.get_who_edits_most_per_capita()
Update and return the sorted per-capita message edits by user.
get_who_deletes_most_per_capita
GeneratedAnalytics.get_who_deletes_most_per_capita()
Update and return the sorted per-capita message deletions by user.
get_topic_edits_per_capita
GeneratedAnalytics.get_topic_edits_per_capita()
Update and return the per-capita edits by topic.
get_topic_deletes_per_capita
GeneratedAnalytics.get_topic_deletes_per_capita()
Update and return the per-capita deletes by topic.
get_top_domains
GeneratedAnalytics.get_top_domains()
Update list of most-popular top-level domains linked in text chat.
get_reaction_type_popularity_per_user
reaction_type_popularity_per_user = {
"users_reactions":{},
"reactions_ordered":[]
}
reaction_type_popularity_per_user.append(
GeneratedAnalytics.get_reaction_type_popularity_per_user(
"USERS USERNAME"))
Returns/updates the popularity of a given reaction type by their username.
get_message_data_frames
df = get_message_data_frames(self, offset_time=0)
Returns df
, a Pandas
data frame with user, message ID, time, team name, topic, body text, and word count data for "text"
type messages only.
Messages
Class
Python3 class that uses
sqlalchemy
to interface withSQL
database. Lives indatabase.py
.
Messages
Properties
Each Messages
property can be thought of like a table column. They correspond to parts of the .json
object returned by querying Keybase
that we want to retain about each message, which includes transactions like "reactions" to other text chat messages, or users entering and leaving a channel.
id
id = Column(Integer, primary_key=True)
Primary key identifier for each unique Message.
team
team = Column(String(1024))
Which Keybase
team was this message sent in?
topic
topic = Column(String(128))
What Topic channel was this message sent in?
msg_id
msg_id = Column(Integer)
What message does this instance reference?
msg_type
msg_type = Column(String(32))
What type of message (i.e. "text", "reaction", etc.) was this interaction?
from_user
from_user = Column(String(128))
From which user did this message originate?
sent_time
sent_time = Column(Integer)
What time was this message sent?
- Note: this is the number of seconds, in posixtime convention (UTC). That is, the number of seconds elapsed since 1970.
txt_body
txt_body = Column(String(4096))
What text content was in the body of the message?
word_count
word_count = Column(Integer)
How many words are in the body of the message?
TODO
- [ ] Indicate how this was computed; are stop words included? etc.
num_urls
num_urls = Column(Integer)
How many URLs are referenced in this message?
urls
urls = Column(String(4096))
What URLs were identified in this message?
TODO
- [ ] Indicate how this was computed: are the URLs recognized on the Keybase end or on our side? From the code it looks like it is done on their end.
reaction_body
reaction_body = Column(String(1024))
If this is a reaction message, what emoji reaction was used in response to the message?
msg_reference
msg_reference = Column(Integer)
If this message replies to (for "text"
) or reacts to (for "reaction"
), which message identifier does it reference?
userMentions
userMentions = Column(String(1024))
What users were @<user>
tagged in this message?