ETL to QE, Update 23, Designing and Checking my Premises

Date: 2024-01-28

Newly Acquired Milestones

I did some research into Grants and Incubators and found that Project Catalyst has a great Grant Outline, including the wonderful question, "Please provide a detailed plan, including timeline and key milestones for delivering your proposal." This question left me feeling naked, in a good way: I was self-conscious and could see myself more accurately. So below I have listed some milestones and given a simple overview of each.

This current outline lacks form and a timeline, but it is probably the best articulation so far of what I am trying to build. For people reading in the future, check out "What is the QE plan, timeline, and milestones?"

CGFS Milestone

The foundational concept of my QE project is the CGFS (Context Graph File System). CGFS is supposed to be pretty simple and resilient, relying on IPLD's DAG-JSON with an agnostic storage backend and JSONSchema for managing data types. CGFS is supposed to constitute the raw base of our data and the data that is transferred between individuals. The same CGFS data and file types can be loaded into different applications. DAG-JSON helps by providing a way to name JSON blobs.
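
To make the naming idea concrete, here is a minimal Python sketch of content-derived names for JSON blobs behind a swappable storage backend. It is only an illustration under stated assumptions: `blob_name` and `MemoryStore` are hypothetical names, and a hex SHA-256 of sorted-key JSON stands in for a real IPLD CID (which would use the DAG-JSON codec plus multihash encoding). The `{"/": ...}` link shape follows how DAG-JSON writes links.

```python
import hashlib
import json


def blob_name(blob: dict) -> str:
    """Return a content-derived name for a JSON blob (hex SHA-256, not a real CID)."""
    # Sorting keys and fixing separators makes equal data serialize identically.
    canonical = json.dumps(blob, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class MemoryStore:
    """Hypothetical storage backend; any key-value store could take its place."""

    def __init__(self) -> None:
        self._blobs: dict[str, dict] = {}

    def put(self, blob: dict) -> str:
        name = blob_name(blob)
        self._blobs[name] = blob
        return name

    def get(self, name: str) -> dict:
        return self._blobs[name]


store = MemoryStore()
note = store.put({"type": "note", "text": "hello"})
# DAG-JSON writes a link to another blob as {"/": <CID>}; the name stands in for the CID here.
reply = store.put({"type": "note", "text": "hi back", "replyTo": {"/": note}})
```

Because the name depends only on the content, two people who hold the same blob derive the same name for it, whichever backend they store it in.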

Identity and data availability are at the core of basically every electronic information system today; even IPFS relies on its Peer Identity system. Even though we as individuals may perceive ourselves as having a single identity, our identity online is fragmented across platforms, protocols, and devices. Then there is the fact that we want to present ourselves differently depending on who we are communicating with. How about we lean into the fragmentation concept and use "human conversation" as a new primitive to build up from? Therefore the first thing that has to go is the traditional social media concept of client/server and server/server communication, replaced by an identity-to-identity model.

First User Journey

Here is an example user journey. Rather than letting my followers or the public internet view me through my choice of Social Media Platforms or Social Media Protocols (ActivityPub, Twitter, and so on), I can present myself independently to each identity that wants to communicate with me, based solely on a public key. I also don't need to reveal any PII, even a pseudonym, until the person I am communicating with has revealed enough valuable context for me to care about them. Think about it for a second: 4chan has no identity system and is a pretty successful social media platform, and people on it rarely read the usernames on posts or comments and are in fact told to read the username only when it is relevant. This also builds on the Google Plus circles idea.

Milestone Persona + Meme Data Model

A couple of components are required to make the above user story a reality: a CGFS Persona Schema for organizing your data, an encrypted pub/sub standard for sending and receiving messages, and a CGFS Meme Model for the messages themselves and how they relate to one another.

What has been described so far is basically nothing more than your average Social Media Protocol. The distinction I want to make is that we want to manage heterogeneous sets of data collectively. So rather than having a list of messages sorted by time, we have a media graph that adapts to where we want our attention to go. Practically speaking, this would look a bit more like Git, LDAP, or Google Drive, with people putting memes, messages, and links embedded within their context.

Milestone Tokenization by Authority

This vision is still not tangible because it is missing the two components that make it unique: the QE - Token Specification and the QE - Proof of Meme.

The goal of the QE - Token Specification is to get Wordcels using numbers. The use case here is to allow individuals to ICO themselves and then use their token to regulate how others interact with them; for more info, check the second user journey below. The end result for this milestone is a library that allows individuals to manage a token in the form of a state machine that can be managed and modified by a single private key or a set of private keys. This system is still susceptible to the Double Spend Problem; the QE - Proof of Meme described below helps solve this problem.
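
As a rough illustration of "a token in the form of a state machine" controlled by one or more keys, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Token` class, the `threshold` parameter, and the key names are hypothetical, and authority is modeled as a set of key identifiers rather than verified signatures, which a real library would need.

```python
from dataclasses import dataclass, field


@dataclass
class Token:
    """A token whose state only changes when enough authority keys approve."""
    authorities: frozenset[str]          # public key identifiers (hypothetical)
    threshold: int = 1                   # approvals needed per transition
    balances: dict[str, int] = field(default_factory=dict)

    def _authorize(self, approvers: set[str], amount: int) -> None:
        # A real version would verify one signature per approver, not trust names.
        if len(approvers & self.authorities) < self.threshold:
            raise PermissionError("not enough authority approvals")
        if amount <= 0:
            raise ValueError("amount must be positive")

    def mint(self, to: str, amount: int, approvers: set[str]) -> None:
        self._authorize(approvers, amount)
        self.balances[to] = self.balances.get(to, 0) + amount

    def transfer(self, src: str, dst: str, amount: int, approvers: set[str]) -> None:
        self._authorize(approvers, amount)
        if self.balances.get(src, 0) < amount:
            raise ValueError("insufficient balance")
        self.balances[src] -= amount
        self.balances[dst] = self.balances.get(dst, 0) + amount


token = Token(authorities=frozenset({"key-alice"}))
token.mint("key-bob", 10, approvers={"key-alice"})
token.transfer("key-bob", "key-carol", 4, approvers={"key-alice"})
```

Nothing here stops the authority from handing two people different copies of the state, which is the double spend gap the text mentions.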

Second User Journey

The best way to understand the QE - Token Specification is via another user journey. The situation is that you have a new group chat on QE with your friends. This group chat has all the same features as your favorite messaging app. The group chat by its nature provides everyone with a token wallet that recharges X tokens per hour up to a maximum of Y tokens. Every message, attachment, or emoji sent requires 1 token. Individuals can attach any number of tokens from their wallet to a message to help signal what other people should be paying attention to. For example, when you need everyone's attention, dump 100 tokens on a message rather than one. When someone sends a meme you really like, put 5 tokens on your emoji reaction instead of the normal one. When you want someone to signal to you that something is important, give them your tokens so they can direct your attention. New tokens can be minted and distributed to regulate context; for example, when running a poll, everyone gets a single token or a few tokens that they can use as a message and attach where they see fit. Additional tokens can also act as tags, for example one token for photos of people in the group chat, another for when y'all talk about organizing events, and another for meme wars.
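
The recharging wallet in this journey can be sketched in a few lines of Python. This is a sketch under assumptions, not a spec: the `Wallet` class and its field names are hypothetical, time is passed in as plain hours, and recharge is treated as continuous rather than ticking once per hour.

```python
from dataclasses import dataclass


@dataclass
class Wallet:
    rate_per_hour: float           # X in the text
    max_tokens: float              # Y in the text
    balance: float = 0.0
    last_update_hours: float = 0.0

    def refresh(self, now_hours: float) -> None:
        """Add the tokens earned since the last update, capped at max_tokens."""
        elapsed = max(0.0, now_hours - self.last_update_hours)
        self.balance = min(self.max_tokens, self.balance + elapsed * self.rate_per_hour)
        self.last_update_hours = max(self.last_update_hours, now_hours)

    def spend(self, now_hours: float, tokens: int = 1) -> None:
        """Attach `tokens` to a message, attachment, or emoji (minimum 1)."""
        if tokens < 1:
            raise ValueError("every message costs at least 1 token")
        self.refresh(now_hours)
        if tokens > self.balance:
            raise ValueError("not enough tokens")
        self.balance -= tokens


wallet = Wallet(rate_per_hour=2, max_tokens=10, balance=10)
wallet.spend(now_hours=0.0, tokens=5)   # a meme you really like
wallet.spend(now_hours=1.0)             # an ordinary message an hour later
```

The cap is what makes attention scarce: staying silent for a week does not let anyone stockpile more than Y tokens.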

Milestone Proof of Meme

The goal of QE - Proof of Meme is to provide a mechanism for proof of meme. The core use case here is to take what individuals have said and prove it was said in the past by using the blockchain. The secondary use case is to have the state of a QE - Token Specification token validated at a point in the past, helping solve the Double Spend Problem at scale.

Taking the primitives expressed in the user journey above, we can now extend this token concept out to all of social media using the QE - Proof of Meme. Within the context of a group chat, every member has the entire context, or whatever the admins allowed them; it is a self-contained, self-reinforcing system that can be modified and forked depending on the current consensus. When it comes to public social media, the entire context will not fit on any reasonably sized computer, so how are people supposed to communicate and manage the state of all these tokens along with the identities and memes they are attached to? Well, all this complex state can be delegated to authorities that put these things called Merkle roots, which are only a few dozen bytes each (32 bytes with SHA-256), on public blockchains, where a single root can be used to validate millions of separate pieces of information. Simply put, the data structure of an individual context token will be provable against not one but multiple Merkle roots, updated on an hourly or daily basis.
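
To show how one small root can vouch for many memes, here is a minimal Merkle tree sketch in Python. It is an illustration under assumptions: SHA-256, the `0x00`/`0x01` prefixes that keep leaf hashes distinct from inner node hashes, and duplicating the last node on odd levels are choices made for this example, and a real implementation would follow whatever tree layout the chosen standard defines.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(leaves: list[bytes], index: int) -> tuple[bytes, list[tuple[bytes, bool]]]:
    """Return the root and the sibling path for leaves[index].

    Each proof step is (sibling_hash, sibling_is_on_the_right).
    """
    if not 0 <= index < len(leaves):
        raise IndexError("index must point at an existing leaf")
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:                 # odd level: duplicate the last node
            level.append(level[-1])
        sibling = index ^ 1                # neighbour within the same pair
        proof.append((level[sibling], sibling > index))
        level = [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the path from the leaf upward and compare it with the published root."""
    node = _h(b"\x00" + leaf)
    for sibling, on_right in proof:
        node = _h(b"\x01" + node + sibling) if on_right else _h(b"\x01" + sibling + node)
    return node == root


memes = [b"meme-%d" % i for i in range(5)]
root, proof = merkle_root_and_proof(memes, 3)
is_valid = verify(memes[3], proof, root)
```

Only `root` would need to go on a blockchain. The proof travels with the meme and grows with the logarithm of the number of leaves, so a million memes need about twenty sibling hashes each.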

Third User Journey

So the QE - Proof of Meme user journey would go like this. You get a message from a public key asking an intriguing question, for example, "Your blog seems pretty all over the place, it reminds me of myself...", and the sender then proceeds to provide examples. You then say, "Why should I trust you? Give me proof of meme." The individual then provides a series of Merkle roots, signed with that same public key, pointing to various viral memes and to what seems to be a receipt of work for a DAO that has a good rating on Trustpilot. This is proof enough for you to give them your public Calling Public Key so y'all can have an encrypted chat via a voice-changing application.

Milestone CGFS Spec and Documentation

The CGFS Specification is supposed to be a standard definition of the above features, in the same spirit as the IETF RFC, IEEE, ISO, ERC, and NIP standards.

Visualizing a Path Forward

So I was recently given some guidance on my Discord Binding project, which basically boiled down to: stop building for people that only exist in your mind, go out and talk to IRL people, then build what they want. I started going down this path, then went completely neurotic, became overwhelmed, and stopped reaching out to strangers for Informational Interviews. This resulted in the question, "Why didn't I just move forward with the discord-binding ETL pipeline and dashboard?"

For those unaware, the Discord Binding tool I built can index entire Discord guilds into SQL, where I can run queries such as who sent the most messages, which message has the most reactions, activity per month, and many more. I have put a lot of time into this project and don't really have anyone, including myself, who has used it for anything tangible.

I had and still have a nagging itch from the Perfectionist side of my personality that says I should have a standardized format for these Discord messages and the metadata I annotate on top of them. I want the Discord Data available on CGFS before I start interrogating it.

This has resulted in me doing my Research on data structures of different messaging apps which resulted in my Message and Annotation Features document which I can use to come up with the first CGFS message type. It is key that all CGFS message types can transform between one another.

So what's next? I am still hiding in my apartment behind my computer desk rather than interacting with the real world. My problem is that I do not have clarity about what I want and therefore spend my time cognitively masturbating. Well, there is this concept called the Smokey Mirror, where one acts as a medium for others to see themselves. It is my goal to do that for communities, and I can do that with my Discord Binding tool. It is just a matter of setting up specific goalposts.

New Design Heuristics

I have had a series of design problems that, upon reflection, are much simpler than I thought. They are as follows.

What kind of data structure is supposed to represent a user's profile?
- TL;DR You should respond to requests from other cryptographic identities with specific types that can be manipulated on a per request basis
What is the best way to produce a large merkle tree?
- TL;DR Just use an IPLD standard, then write a library for generating the proofs
How are memes supposed to be linked to one another so they don't get lost?
- TL;DR Use the PKMS Linking Standard
What fundamental JSONSchema format can be used for notes, messages, imported social media data, blog posts, annotations, and LLM logs?
- Work in progress, see Research - Format of messages from different messaging apps
Within QE what is the equivalent conception of a folder or dataset?
- TL;DR A recursive key value hierarchy where each key value pair has RBAC
What kind of data structure can be used to manage a series of online identities that are supposed to remain separate?
- TODO, should be a priority
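
The folder or dataset heuristic above, "a recursive key value hierarchy where each key value pair has RBAC", can be sketched roughly as follows. This is only one possible reading: the `Node` class, the `readers` role set, and the rule that access is checked at every level of the path are assumptions made for illustration, not a settled CGFS design.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One key-value pair; the value is either data or more key-value pairs."""
    value: object = None
    readers: set[str] = field(default_factory=set)        # roles allowed to read
    children: dict[str, Node] = field(default_factory=dict)

    def read(self, path: list[str], roles: set[str]) -> object:
        # Access is checked at every level, so a parent can hide its whole subtree.
        if not (roles & self.readers):
            raise PermissionError("role not allowed here")
        if not path:
            return self.value
        return self.children[path[0]].read(path[1:], roles)


root = Node(
    readers={"friend", "public"},
    children={
        "bio": Node(value="hi", readers={"friend", "public"}),
        "photos": Node(
            readers={"friend"},
            children={"beach.jpg": Node(value="<image bytes>", readers={"friend"})},
        ),
    },
)
public_bio = root.read(["bio"], roles={"public"})
```

With this layout the same tree can answer a stranger's key with only the `bio` entry while a friend's key can walk into `photos`.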

Articulated a Product Dream State

The user journey for the end product I want to market to the masses is as follows.

Epic User Journeys