ETL to QE, Update 2, S3 and PostGraphile

Date: 2023-10-03

See Discord Binding for project context

Project context TL;DR: The goal is to have all the raw Discord data from DiscordChatExporter stored on S3, rather than having it partially stored in multiple places.

Failure to use RClone with VultrS3

The first tool I wanted to use to back up my fragmented archive of Discord data across multiple devices is called RClone. I got an error trying to use it, which is documented in the following note:

How to solve RClone S3 Storage is not working as expected bug when backing up Discord Data?

Use TrueNAS Scale to back up everything

As part of my homelab I have a TrueNAS Scale server. This means I run a computer in my closet that has a lot of storage that can be accessed over the network. After the failure to upload everything to VultrS3, I used rsync to back up everything to my server at home.

PostGraphile is not my friend

Even when you use GraphQL, you still need to write your SQL. PostGraphile is a beautiful piece of software: you just point it at a Postgres database and you get a GraphQL API. I really liked the idea of using a GraphQL API for querying my Discord data because it is easier to use than SQL, but I ended up learning a very important lesson.

One query where PostGraphile failed was "What percentage of users on each Discord guild posted fewer than 1, 3, 10, or 100 messages?". There was just no way of performing this query without recursive calls to PostGraphile, which defeats the purpose of GraphQL. So I learned that I was going to have to write the SQL queries for my own GraphQL API in the end.
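To make the shape of that query concrete, here is a minimal sketch of the kind of SQL involved, run against an in-memory SQLite database from Python. Everything about the schema is an assumption for illustration: the `messages` table and its `guild_id` and `author_id` columns are hypothetical names, not the actual layout of the exported Discord data. The "fewer than 1" bucket is left out because users with zero messages never appear in a messages-only table; covering them would need a separate membership table.

```python
import sqlite3

# Hypothetical schema: one row per message, tagged with its guild and author.
SCHEMA = """
CREATE TABLE messages (
    id        INTEGER PRIMARY KEY,
    guild_id  TEXT NOT NULL,
    author_id TEXT NOT NULL
);
"""

# Step 1 (CTE): count messages per (guild, author).
# Step 2: per guild, compute the share of authors under each threshold.
QUERY = """
WITH per_user AS (
    SELECT guild_id, author_id, COUNT(*) AS n
    FROM messages
    GROUP BY guild_id, author_id
)
SELECT
    guild_id,
    COUNT(*) AS users,
    100.0 * SUM(CASE WHEN n < 3   THEN 1 ELSE 0 END) / COUNT(*) AS pct_under_3,
    100.0 * SUM(CASE WHEN n < 10  THEN 1 ELSE 0 END) / COUNT(*) AS pct_under_10,
    100.0 * SUM(CASE WHEN n < 100 THEN 1 ELSE 0 END) / COUNT(*) AS pct_under_100
FROM per_user
GROUP BY guild_id
ORDER BY guild_id;
"""


def threshold_percentages(conn):
    """Return one row per guild: (guild_id, users, pct<3, pct<10, pct<100)."""
    return conn.execute(QUERY).fetchall()


def demo():
    """Build a tiny made-up dataset and run the query against it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Made-up data: alice posts 2 messages and bob 12 in g1; carol posts 1 in g2.
    rows = [("g1", "alice")] * 2 + [("g1", "bob")] * 12 + [("g2", "carol")]
    conn.executemany(
        "INSERT INTO messages (guild_id, author_id) VALUES (?, ?)", rows
    )
    return threshold_percentages(conn)


if __name__ == "__main__":
    for row in demo():
        print(row)
```

The two-level aggregation (count per user, then aggregate those counts per guild) is the part that is awkward to express through an auto-generated GraphQL schema. The `CASE WHEN` form is used here because it is plain SQL that should also work on Postgres.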

Also check out Should I use GraphQL?.

Tasks