`GeneratedAnalytics` Class

Python 3 class that organizes different kinds of data from a Keybase export. Lives in generate_analytics.py.

`GeneratedAnalytics` Properties

The properties of the GeneratedAnalytics class are best thought of as "data reports." Each report is refreshed by its corresponding method.

Generalized Pattern

{
    "type": STR,
    "title": STR,
    "x_axis": LIST of STR or INT,
    "y_axis": LIST of INT,
    "x_label": STR,
    "y_label": STR
}
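As an illustration of the pattern above, here is a minimal sketch that packages two parallel lists into a report dict. The helper `to_report`, the `"bar"` value for `"type"`, and the sample data are assumptions made for this example; they are not taken from generate_analytics.py.

```python
def to_report(title, x_axis, y_axis, x_label, y_label, chart_type="bar"):
    """Package two parallel lists into the generalized report pattern.

    `chart_type` is a hypothetical value for the "type" field.
    """
    if len(x_axis) != len(y_axis):
        raise ValueError("x_axis and y_axis must have the same length")
    return {
        "type": chart_type,
        "title": title,
        "x_axis": list(x_axis),
        "y_axis": list(y_axis),
        "x_label": x_label,
        "y_label": y_label,
    }


# Hypothetical sample data shaped like the `messages_per_user` property.
messages_per_user = {"user": ["alice", "bob"], "messages_per_user": [12, 7]}

report = to_report(
    title="Messages per user",
    x_axis=messages_per_user["user"],
    y_axis=messages_per_user["messages_per_user"],
    x_label="User",
    y_label="Messages",
)
print(report)
```

The length check is there because the x and y lists are only meaningful when they line up index by index.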

`user_list`

GeneratedAnalytics.user_list = []

A list containing string elements that are the unique users in the database.

`topic_list`

GeneratedAnalytics.topic_list = []

A list containing string elements that are the unique topics in the database.

`characters_per_user`

GeneratedAnalytics.characters_per_user = {"user": [], "characters_per_user": []}

A dict array with the total number of characters entered via messages to chat by element in user_list.

`characters_per_topic`

GeneratedAnalytics.characters_per_topic = {"user": [], "characters_per_topic": []}

A dict array with the total number of characters per element in topic_list.

`messages_per_user`

GeneratedAnalytics.messages_per_user = {"user": [], "messages_per_user": []}

A dict array with the total number of messages to chat by element in user_list.

`messages_per_topic`

GeneratedAnalytics.messages_per_topic = {"topic": [], "messages_per_topic": []}

A dict array with the total number of messages per element in topic_list.
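The "dict array" properties hold two lists side by side. A minimal sketch of reading one, assuming the two lists are index-aligned (entry i of one list belongs to entry i of the other); the helper `ranked_pairs` and the sample values are hypothetical:

```python
def ranked_pairs(report, label_key, value_key):
    """Pair up two parallel lists and sort the pairs by value, largest first."""
    labels = report[label_key]
    values = report[value_key]
    return sorted(zip(labels, values), key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data shaped like `messages_per_topic`.
messages_per_topic = {
    "topic": ["general", "random", "dev"],
    "messages_per_topic": [40, 5, 18],
}

ranking = ranked_pairs(messages_per_topic, "topic", "messages_per_topic")
for topic, count in ranking:
    print(f"{topic}: {count}")
```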

`number_users_per_topic`

GeneratedAnalytics.number_users_per_topic = {"users_list": [], "topics_list": []}

A dict array with the number of users per element in topic_list.

`reaction_per_message`

GeneratedAnalytics.reaction_per_message = {"ordered_message_id":[], "num_reaction":[]}

A dict array with the message ID and corresponding number of reactions to that message.
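A minimal sketch of how such a report could be derived, assuming each reaction is stored as a row whose `msg_type` is "reaction" and whose `msg_reference` points at the message reacted to (see the `Messages` properties). The rows are plain dicts with made-up values, and ordering by descending count is an assumption, not confirmed behavior:

```python
from collections import Counter

# Hypothetical rows; only the columns needed for the count are included.
rows = [
    {"msg_id": 1, "msg_type": "text", "msg_reference": None},
    {"msg_id": 2, "msg_type": "reaction", "msg_reference": 1},
    {"msg_id": 3, "msg_type": "text", "msg_reference": None},
    {"msg_id": 4, "msg_type": "reaction", "msg_reference": 1},
    {"msg_id": 5, "msg_type": "reaction", "msg_reference": 3},
]


def count_reactions(rows):
    """Count reaction rows per referenced message, most-reacted first."""
    counts = Counter(
        row["msg_reference"] for row in rows if row["msg_type"] == "reaction"
    )
    ordered = counts.most_common()
    return {
        "ordered_message_id": [msg_id for msg_id, _ in ordered],
        "num_reaction": [n for _, n in ordered],
    }


reaction_per_message = count_reactions(rows)
print(reaction_per_message)
```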

`reaction_sent_per_user`

GeneratedAnalytics.reaction_sent_per_user = {"ordered_user":[], "user_to_reaction":[]}

TODO

[ ] Check that this is actually what we think it is. Looks like User gets sorted, but user_to_reaction is not?

`reaction_popularity_map`

GeneratedAnalytics.reaction_popularity_map = {"reactions":{}}

TODO

[ ] Clarify what this is (or remove)?

`reactions_per_user`

GeneratedAnalytics.reactions_per_user = {"users_reactions":{}, "users_ordered":[]}

TODO

[ ] Clarify what this is: lists of unique reactions per user?

`received_most_reactions`

GeneratedAnalytics.recieved_most_reactions = {"users_reactions":{}, "users_ordered":[]}

TODO

[ ] Clarify what this is: list of messages with most reactions by user and what the reactions were?

`edits_per_user`

GeneratedAnalytics.edits_per_user = {
    "users":{}, 
    "ordered_users":[], 
    "ordered_num_edits":[]}

Which user had the most (raw) number of edits?

`edits_per_topic`

GeneratedAnalytics.edits_per_topic = {
    "topics":{}, 
    "ordered_topics":[], 
    "ordered_num_edits":[]}

Which topic had the most (raw) number of edits?

`deletes_per_user`

GeneratedAnalytics.deletes_per_user = {
    "users":{}, 
    "ordered_users":[], 
    "ordered_num_deletes":[]}

Which users had the most (raw) number of deletes?

`deletes_per_topic`

GeneratedAnalytics.deletes_per_topic = {
    "topics":{}, 
    "ordered_topics":[], 
    "ordered_num_deletes":[]}

Which topics had the most (raw) number of deletes?

`who_edits_most_per_capita`

GeneratedAnalytics.who_edits_most_per_capita = {
    "users":{}, 
    "ordered_users":[], 
    "ordered_edit_per_capita" : []}

Who edits most per message?
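A minimal sketch of the per-message ("per capita") calculation, assuming it is the number of edits divided by the number of messages for each user. The helper `edits_per_capita`, the sample counts, and the choice to skip users with zero messages are assumptions for illustration:

```python
def edits_per_capita(edit_counts, message_counts):
    """Divide each user's edit count by their message count and rank the result."""
    ratios = {}
    for user, n_messages in message_counts.items():
        if n_messages > 0:  # skip users with no messages to avoid dividing by zero
            ratios[user] = edit_counts.get(user, 0) / n_messages
    ordered_users = sorted(ratios, key=ratios.get, reverse=True)
    return {
        "users": ratios,
        "ordered_users": ordered_users,
        "ordered_edit_per_capita": [ratios[user] for user in ordered_users],
    }


# Hypothetical counts: bob edits half of his messages, alice one in ten.
result = edits_per_capita(
    edit_counts={"alice": 3, "bob": 4},
    message_counts={"alice": 30, "bob": 8, "carol": 0},
)
print(result["ordered_users"])
```

Normalizing by message count keeps a very active user from topping the ranking just because they post more.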

`who_deletes_most_per_capita`

GeneratedAnalytics.who_deletes_most_per_capita = {
    "users":{}, 
    "ordered_users":[], 
    "ordered_edit_per_capita" : []}

Who deletes the most per message?

`topic_edits_per_capita`

GeneratedAnalytics.topic_edits_per_capita = {
    "topics":{}, 
    "ordered_topics":[], 
    "ordered_edit_per_capita" : []}

Which topic channels had the most edits per message?

`topic_deletes_per_capita`

GeneratedAnalytics.topic_deletes_per_capita = {
    "topics":{}, 
    "ordered_topics":[], 
    "ordered_edit_per_capita" : []}

Which topic channels had the most deletes per message?

`top_domains`

GeneratedAnalytics.top_domains = {
    "URLs":{}, 
    "top_domains_sorted":[], 
    "num_times_repeated":[]}

What were the most-used top-level domains, what were the specific links, and how many times did they appear?
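A minimal sketch of how links could be grouped and counted with the standard library. It groups by host name (the `netloc` part of the URL), which is an assumption about what "domain" means here; the helper `count_domains` and the sample links are hypothetical:

```python
from collections import Counter
from urllib.parse import urlparse


def count_domains(urls):
    """Group links by host name and rank the hosts by how often they appear."""
    by_domain = {}
    for url in urls:
        domain = urlparse(url).netloc.lower()
        if not domain:  # skip strings that do not parse as absolute URLs
            continue
        by_domain.setdefault(domain, []).append(url)
    counts = Counter({domain: len(links) for domain, links in by_domain.items()})
    ordered = counts.most_common()
    return {
        "URLs": by_domain,
        "top_domains_sorted": [domain for domain, _ in ordered],
        "num_times_repeated": [n for _, n in ordered],
    }


top_domains = count_domains([
    "https://keybase.io/docs",
    "https://github.com/a/b",
    "https://keybase.io/blog",
])
print(top_domains["top_domains_sorted"])
```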

`GeneratedAnalytics` Methods

Most methods re-query the database and return a "refreshed" version of the corresponding report.

`get_message`

message = GeneratedAnalytics.get_message(MESSAGE_ID_NUM)

Returns a single row from the Messages table (SQL) by ID.

`get_reaction_per_message`

GeneratedAnalytics.get_reaction_per_message()

Update the reactions to each message.

`get_reaction_sent_per_user`

GeneratedAnalytics.get_reaction_sent_per_user()

Update the reactions sent by each user.

`get_num_messages_from_user`

# Returns a dict shaped like {"edit": INT, "text": INT, "delete": INT}
msg_changes = GeneratedAnalytics.get_num_messages_from_user("USER")

Return object with number of times a text was edited or deleted for a given user.
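A minimal sketch of the kind of tally this describes, assuming edits and deletes are stored as rows whose `msg_type` is "edit" or "delete" (only "text" and "reaction" are named elsewhere in this document). The rows are plain dicts with made-up values, and `count_message_types` is a hypothetical helper:

```python
from collections import Counter

# Hypothetical rows; only the columns needed for the tally are included.
rows = [
    {"from_user": "alice", "msg_type": "text"},
    {"from_user": "alice", "msg_type": "edit"},
    {"from_user": "alice", "msg_type": "text"},
    {"from_user": "bob", "msg_type": "delete"},
    {"from_user": "alice", "msg_type": "reaction"},
]


def count_message_types(rows, user):
    """Tally one user's rows by message type, keeping only the three reported types."""
    counts = Counter(row["msg_type"] for row in rows if row["from_user"] == user)
    return {kind: counts.get(kind, 0) for kind in ("edit", "text", "delete")}


msg_changes = count_message_types(rows, "alice")
print(msg_changes)
```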

`get_num_messages_from_topic`

# Returns a dict shaped like {"edit": INT, "text": INT, "delete": INT}
messages = GeneratedAnalytics.get_num_messages_from_topic("TOPIC")

Return object with number of times a text was edited or deleted for a given topic.

`get_list_all_users`

GeneratedAnalytics.get_list_all_users()

Should be called from the object constructor; updates and returns list of all users in database.

TODO

[ ] Set scoping for private/public methods?

`get_list_all_topics`

GeneratedAnalytics.get_list_all_topics()

Should be called from the object constructor; updates and returns list of all topics in database.

TODO

[ ] Set scoping for private/public methods?

`get_characters_per_user`

# Returns a dict shaped like {"user": [], "characters_per_user": []}
characters_per_user = GeneratedAnalytics.get_characters_per_user()

Update and return total number of characters from messages for each user.

TODO

[ ] Set scoping for private/public methods?

`get_characters_per_topic`

# Returns a dict shaped like {"user": [], "characters_per_topic": []}
characters_per_topic = GeneratedAnalytics.get_characters_per_topic()

Update and return total number of characters from messages posted in each topic.

TODO

[ ] Set scoping for private/public methods?

`get_messages_per_user`

# Returns a dict shaped like {"user": [], "messages_per_user": []}
messages_per_user = GeneratedAnalytics.get_messages_per_user()

Update and return total number of messages for each user.

TODO

[ ] Set scoping for private/public methods?

`get_messages_per_topic`

# Returns a dict shaped like {"topic": [], "messages_per_topic": []}
messages_per_topic = GeneratedAnalytics.get_messages_per_topic()

Update and return total number of messages posted in each topic.

TODO

[ ] Set scoping for private/public methods?

`get_number_users_per_topic`

# Returns a dict shaped like {"users_list": [], "topics_list": []}
number_users_per_topic = GeneratedAnalytics.get_number_users_per_topic()

Update and return the number of users for each topic.

TODO

[ ] Set scoping for private/public methods?

`get_reaction_popularity_topic`

# Returns a dict shaped like {"reactions": {}, "list": []}
reactions = GeneratedAnalytics.get_reaction_popularity_topic("TOPIC")

Returns the popularity of each reaction used in the given topic (string).

`get_all_user_message_id`

# Returns a dict shaped like {"users_reactions": {}, "users_ordered": []}
msgID = GeneratedAnalytics.get_all_user_message_id(user)

For a specific user (user, string), return all message IDs involving that user.

`get_user_sent_most_reactions`

GeneratedAnalytics.get_user_sent_most_reactions()

Returns the users sorted by the number of reactions they have sent.

`get_user_received_most_reactions`

GeneratedAnalytics.get_user_received_most_reactions()

Update and return the sorted listing of users by number of reactions received.

`get_edits_per_user`

GeneratedAnalytics.get_edits_per_user()

Update and return the raw number of edited messages by user.

`get_deletes_per_user`

GeneratedAnalytics.get_deletes_per_user()

Update and return the raw number of deleted messages by user.

`get_edits_per_topic`

GeneratedAnalytics.get_edits_per_topic()

Update and return the raw number of edited messages by topic.

`get_deletes_per_topic`

GeneratedAnalytics.get_deletes_per_topic()

Update and return the raw number of deleted messages by topic.

`get_who_edits_most_per_capita`

GeneratedAnalytics.get_who_edits_most_per_capita()

Update and return the sorted per-capita message edits by user.

`get_who_deletes_most_per_capita`

GeneratedAnalytics.get_who_deletes_most_per_capita()

Update and return the sorted per-capita message deletions by user.

`get_topic_edits_per_capita`

GeneratedAnalytics.get_topic_edits_per_capita()

Update and return the per-capita edits by topic.

`get_topic_deletes_per_capita`

GeneratedAnalytics.get_topic_deletes_per_capita()

Update and return the per-capita deletes by topic.

`get_top_domains`

GeneratedAnalytics.get_top_domains()

Update list of most-popular top-level domains linked in text chat.

`get_reaction_type_popularity_per_user`

# Returns a dict shaped like {"users_reactions": {}, "reactions_ordered": []}
reaction_type_popularity_per_user = (
    GeneratedAnalytics.get_reaction_type_popularity_per_user(
        "USERS USERNAME"))

Updates and returns the popularity of each reaction type for the user with the given username.

`get_message_data_frames`

df = GeneratedAnalytics.get_message_data_frames(offset_time=0)

Returns df, a Pandas data frame with user, message ID, time, team name, topic, body text, and word count data for "text" type messages only.

`Messages` Class

Python 3 class that uses SQLAlchemy to interface with the SQL database. Lives in database.py.

`Messages` Properties

Each Messages property can be thought of as a table column. The properties correspond to the parts of the JSON object returned by querying Keybase that we want to retain about each message. A "message" here includes transactions such as "reactions" to other text chat messages, or users entering and leaving a channel.

`id`

id = Column(Integer, primary_key=True)

Primary key identifier for each unique Message.

`team`

team = Column(String(1024))

Which Keybase team was this message sent in?

`topic`

topic = Column(String(128))

What Topic channel was this message sent in?

`msg_id`

msg_id = Column(Integer)

What message does this instance reference?

`msg_type`

msg_type = Column(String(32))

What type of message (i.e. "text", "reaction", etc.) was this interaction?

`from_user`

from_user = Column(String(128))

From which user did this message originate?

`sent_time`

sent_time = Column(Integer)

What time was this message sent?

Note: this is POSIX time, that is, the number of seconds elapsed since 1970-01-01 00:00:00 UTC.
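A minimal sketch of converting a `sent_time` value into a timezone-aware datetime with the standard library. The helper name `sent_time_to_utc` and the sample timestamp are made up for this example:

```python
from datetime import datetime, timezone


def sent_time_to_utc(sent_time):
    """Convert POSIX seconds into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(sent_time, tz=timezone.utc)


stamp = sent_time_to_utc(1_600_000_000)
print(stamp.isoformat())  # → 2020-09-13T12:26:40+00:00
```

Passing `tz=timezone.utc` avoids the implicit conversion to the machine's local time zone that `datetime.fromtimestamp` performs by default.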

`txt_body`

txt_body = Column(String(4096))

What text content was in the body of the message?

`word_count`

word_count = Column(Integer)

How many words are in the body of the message?

TODO

[ ] Indicate how this was computed; are stop words included? etc.

`num_urls`

num_urls = Column(Integer)

How many URLs are referenced in this message?

`urls`

urls = Column(String(4096))

What URLs were identified in this message?

TODO

[ ] Indicate how this was computed: are the URLs recognized on the Keybase end or on our side? From the code it looks like it is done on their end.

`reaction_body`

reaction_body = Column(String(1024))

If this is a reaction message, what emoji reaction was used in response to the message?

`msg_reference`

msg_reference = Column(Integer)

If this message replies to another message (for "text") or reacts to one (for "reaction"), what is the identifier of the message it references?

`userMentions`

userMentions = Column(String(1024))

What users were @<user> tagged in this message?
