
ETL to QE, Update 24, Roadmap Revisited with Memes

Original Roadmap

  1. Discord Analytics Reports and Dashboard
  2. Graph Based Annotation on Top of Discord Data
  3. Allow for Generalized Questioning and add Additional Data Sources
  4. Proof of Meme Micro Bounty Platform

Updated Roadmap

TL;DR: Memes, Schema, Tokens, Merkle Trees

  1. Memes, CGFS Meme Model
    1. Description: Come up with a message format that existing social media can be transformed into, including encryption and signing of messages in this format (a sketch of such an envelope appears after this roadmap)
    2. Define and update self-referencing systems of memes, known as ontologies, for the purpose of tagging data.
    3. Composable Message Standard
      1. Based on Research - Format of messages from different messaging apps
      2. Must be able to index into and from Obsidian and Tiddly Wiki
      3. Must be able to index into and from Raindrop.io
      4. Must be able to index into and from Hypothes.is
      5. Must be able to index from ActivityWatch
      6. Must be able to index from Git
      7. Must be able to index from social media including Keybase, Discord, Twitter, Facebook, Signal, etc.
      8. Must be able to index emails
    4. Synthesis of messages standard
      1. Memes must be able to integrate and link with one another
    5. Extendable cryptographic identity standard
      1. Support browser injected signers such as MetaMask, and Nostr NIP-07 signers
    6. DID standard for existing social media accounts such as Discord, LinkedIn, Facebook, etc.
  2. Schema, CGFS Persona Schema
    1. Description: Come up with a generated schema for social media that we can reindex existing social media into
    2. Develop user journeys
    3. Come up with a simple-to-use interface for contextualizing all the memes one has
    4. Context
      1. QE is supposed to be modular and composable like Obsidian, allowing the user to develop their own social media schema or adopt whichever one they see fit.
      2. Using QE, everyone you communicate with either tells you their name, is asked for their name, or is assigned a name by you.
  3. Tokens, QE - Token Specification
    1. Description: Come up with a signature chain proof of concept that individuals can issue
    2. Research and review existing token standards
    3. See question-engine/backend/transactions for a reference design that needs to be reviewed and re-implemented using DAG-JSON
  4. Merkle Trees, Proof of Meme, QE - Proof of Meme
    1. Description: Come up with merkle tree and data availability mechanisms to store proof of memes on the blockchain.
    2. Research and review
      1. Compare existing merkle proof libraries
      2. On Chain Merkle Proofs
      3. Validate usability of my Eth Waterloo 2023 Project
    3. User Journey Validation
    4. Write a library (sketched after this roadmap) that can
      1. Create DAG-JSON merkle trees
      2. Store, Backup, and Share raw data from merkle trees
      3. Generate Merkle Proofs
      4. Validate Merkle Proofs
      5. Publish merkle roots to various blockchains
      6. Read merkle proof from various blockchains
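
To make items 1 and 4 less abstract, here is a minimal, hypothetical sketch of a normalized message envelope and the merkle-proof flow over it. Everything in it is an assumption rather than the CGFS spec: the names (MemeEnvelope, canonicalEncode, and so on) are made up for illustration, sorted-key JSON stands in for DAG-JSON, and real signing would go through a browser signer such as MetaMask or a Nostr NIP-07 extension rather than anything shown here.

```typescript
// Hypothetical sketch only: field and function names are illustrative, not the
// CGFS spec. Hashing uses Node's built-in crypto; a real implementation would
// encode with DAG-JSON/IPLD and sign with MetaMask or a Nostr NIP-07 signer.
import { createHash } from "node:crypto";

// A message pulled from any source (Discord, email, Hypothes.is, ...) gets
// normalized into one envelope shape so it can be indexed and linked.
interface MemeEnvelope {
  source: string;          // e.g. "discord", "email", "raindrop"
  authorDid: string;       // DID for the account the message came from
  createdAt: string;       // ISO-8601 timestamp
  body: string;            // normalized message content
  links: string[];         // references to other memes (their hashes)
  signature?: string;      // detached signature over the canonical encoding
}

// Canonical encoding: JSON with sorted keys, a stand-in for DAG-JSON.
function canonicalEncode(meme: MemeEnvelope): string {
  return JSON.stringify(meme, Object.keys(meme).sort());
}

function sha256(data: string): string {
  return createHash("sha256").update(data).digest("hex");
}

// Build a merkle tree bottom-up from leaf hashes; keep every level so
// proofs can be read off later.
function buildTree(leaves: string[]): string[][] {
  const levels: string[][] = [leaves];
  while (levels[levels.length - 1].length > 1) {
    const prev = levels[levels.length - 1];
    const next: string[] = [];
    for (let i = 0; i < prev.length; i += 2) {
      const right = prev[i + 1] ?? prev[i]; // duplicate last node if odd
      next.push(sha256(prev[i] + right));
    }
    levels.push(next);
  }
  return levels;
}

// A proof is the list of sibling hashes from the leaf up to the root.
function merkleProof(levels: string[][], index: number): { sibling: string; left: boolean }[] {
  const proof: { sibling: string; left: boolean }[] = [];
  for (let level = 0; level < levels.length - 1; level++) {
    const nodes = levels[level];
    const siblingIndex = index % 2 === 0 ? index + 1 : index - 1;
    const sibling = nodes[siblingIndex] ?? nodes[index];
    proof.push({ sibling, left: index % 2 === 1 });
    index = Math.floor(index / 2);
  }
  return proof;
}

function verifyProof(leaf: string, proof: { sibling: string; left: boolean }[], root: string): boolean {
  let hash = leaf;
  for (const step of proof) {
    hash = step.left ? sha256(step.sibling + hash) : sha256(hash + step.sibling);
  }
  return hash === root;
}

// Usage: hash two normalized messages, publish only the root on chain,
// and later prove a given meme was included.
const memes: MemeEnvelope[] = [
  { source: "discord", authorDid: "did:example:alice", createdAt: "2024-01-01T00:00:00Z", body: "hello", links: [] },
  { source: "email", authorDid: "did:example:bob", createdAt: "2024-01-02T00:00:00Z", body: "reply", links: [] },
];
const leaves = memes.map((m) => sha256(canonicalEncode(m)));
const levels = buildTree(leaves);
const root = levels[levels.length - 1][0];
const proof = merkleProof(levels, 0);
console.log("root:", root, "proof valid:", verifyProof(leaves[0], proof, root));
```

Publishing only the root on chain while keeping the raw envelopes in ordinary storage is what would keep the "store, backup, and share raw data" and "publish merkle roots to various blockchains" items separable.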

Reflection + Rant

Intelligence is just breaking down problems into doable chunks. I have completed the first step of the Original Roadmap via my Discord Analytics Reports and Dashboard. Now I am on to the second phase, Graph Based Annotation on Top of Discord Data.

I have been twiddling my thumbs for months waiting for the right strike of inspiration to proceed to phase two, Graph Based Annotation on Top of Discord Data. My initial plan was to implement a user model by just writing some SQL or using an ORM like Prisma or SQLAlchemy: have users create an account using OAuth, email, or MetaMask, then add the features for the user journeys I have outlined in Epic User Journeys. The first user journey would be to add tags, then add features for rankings, comments, and links to the web such as Wikipedia and LinkedIn.

I was unable to break down the task. Here are some thoughts that were going through my head. I attended a hackathon six months back and getting email set up was a bitch, whether using Gmail, Microsoft, or Protonmail. Then you sign up for an email SaaS offering and they have some convoluted API. Then there was the OAuth option; having to manage a Google API key just goes against my entire ethos of Self Hosting. Once I see that a self hosted app requires an API key for some service of some kind, I suddenly don't want to deploy it. Having a separate database, sure, no problem, just update the docker-compose; want to be fancy and use object storage, sure, just add MinIO to the docker-compose. But setting up a domain name with TLS and then getting whatever OAuth API key you need... pardon my French, but fuck that. I understand that if this project gets users and possibly funding, email and OAuth may need to be implemented, but what if there was a way to build up from a more fundamental authentication model?

Well, the authentication and user model are inherently linked. I am afraid of committing to some custom user model in an ORM or SQL due to my inability to easily understand the user model used in my favorite open source projects. I don't understand what Django is doing under the hood, Jellyfin does IDK what, I check out Immich and I get this monstrosity, and wikijs has its own monstrosity. Actually, let's create a list of these.

Wouldn't it be cool if all these systems were built in raw SQL and used PRQL? Actually, there might be something to this.

  • People do not use PRQL because it is not SQL-database agnostic the way ORMs are
    • I bet that could be figured out
  • Whatever custom solution is built on top of PRQL will have to be written in some programming language, and not everyone is going to agree on which one
    • PRQL is written in Rust; as long as bindings can be made and it has a good CLI like curl, I doubt this would be a problem (see the sketch after this list).
  • This is yet another case of the Standards XKCD
    • The ORM SQL Library space does not have standards, every programming language has multiple ways of interacting with databases
  • PRQL would just be used for migrations and schema setup; the actual application can connect to the database however it wants.
  • Actually, can PRQL reverse engineer a schema migration by looking at two separate schema dumps?
    • I bet there is an LLM LangChain system that could help with this.
  • If you are supporting multiple database backends you either need to stay within the feature set common to all three or specialize, as wikijs 3.0 will by supporting only Postgres
  • DuckDB has connectors to all three databases, which I hope would make things simpler.
  • Is JSON support different between DuckDB and SQLite?
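
Since the PRQL idea above hinges on whether the bindings are usable from an application language, here is a hedged sketch of compiling one PRQL query to SQL for several backends. It assumes the prql-js npm package exposes a compile() function and a CompileOptions object with a target field (targets written like "sql.postgres"); treat those names as assumptions to verify against the package documentation, not as a confirmed API.

```typescript
// Assumption: prql-js exposes compile() and CompileOptions with a `target`
// field; verify against the package docs, since the exact API may differ.
import { compile, CompileOptions } from "prql-js";

// One PRQL query compiled to SQL for each backend the app might target.
const prql = `
from users
filter created_at > @2024-01-01
select {id, email}
`;

for (const target of ["sql.postgres", "sql.sqlite", "sql.duckdb"]) {
  const opts = new CompileOptions();
  opts.target = target; // assumed option name
  console.log(`-- ${target}\n${compile(prql, opts)}\n`);
}
```

If something like this does work across dialects, the "not database agnostic like ORMs" objection in the first bullet becomes mostly a packaging question rather than a language-design one.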

Well, it now seems like I have a bunch of research topics rather than features to build. The topics are as follows:

Three questions, that's good enough; any more and the task of sorting becomes convoluted. You cannot answer each of these before having to worry about anything else.