
Filling self-described SQL models via LLM: Heilmeier Catechism

What are you trying to do? Articulate your objectives using absolutely no jargon.

I want a script that asks ChatGPT to design a SQL Schema for something, then has ChatGPT generate data for that Schema until it starts repeating itself or the user stops requesting more data.
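The loop described above can be sketched roughly as follows. This is a minimal sketch, not a finished tool: `ask_llm` is a hypothetical callable (prompt string in, reply string out) standing in for whatever ChatGPT client you use, `fill_schema` and `want_more` are names made up for illustration, and "repeating itself" is approximated by the crude check that a round returned no INSERT statement that was not already seen.

```python
import sqlite3


def fill_schema(topic, ask_llm, max_rounds=10, want_more=lambda round_no: True):
    """Ask for a schema, then request rows until the model repeats itself.

    ask_llm is a hypothetical callable: prompt string in, SQL text out.
    want_more is a hypothetical hook that lets the user stop the loop early.
    """
    conn = sqlite3.connect(":memory:")

    # Step 1: let the model describe the world as a schema.
    schema_sql = ask_llm(f"Write a SQLite schema for: {topic}. Reply with SQL only.")
    conn.executescript(schema_sql)

    # Step 2: keep asking for data until nothing new comes back.
    seen = set()
    for round_no in range(max_rounds):
        if not want_more(round_no):
            break  # the user stopped requesting additional data
        reply = ask_llm(
            f"Given this schema:\n{schema_sql}\n"
            "Write new INSERT statements, one per line. Reply with SQL only."
        )
        statements = [line.strip() for line in reply.splitlines() if line.strip()]
        fresh = [s for s in statements if s not in seen]
        if not fresh:
            break  # assumption: an all-duplicate round means the model is repeating itself
        for statement in fresh:
            conn.execute(statement)
            seen.add(statement)

    conn.commit()
    return conn
```

Running model-written SQL directly is only reasonable against a throwaway in-memory database like this one. An exact-string comparison will also miss near-duplicates, so a real version would probably compare the inserted rows instead.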

2024-02-08

LLMs can generate SQL Schemas for just about anything

I recently learned that you can ask ChatGPT to produce a SQL Schema for just about anything and it will spit out something somewhat intelligible. This got me thinking that I could extract a model of the world in SQL from ChatGPT.

ChatGPT can generate traditional SQL Schemas for stuff such as,

ChatGPT can also generate SQL Schemas for more complex or abstract systems that require knowledge of the world such as,

ChatGPT can even generate SQL Schemas for just raw concepts.

It should also be possible to do this with Object Oriented Classes.

LLMs can also populate the SQL Schemas they come up with

LLMs can also populate SQL Schemas that you make up yourself
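As a rough illustration of populating a schema you wrote yourself, the sketch below asks for rows as JSON rather than raw SQL, then checks each row against the table's real columns before inserting it. Everything here is an assumption for illustration: `ask_llm` is a hypothetical callable (prompt in, reply text out), `populate_table` is a made-up helper name, and the table name is assumed to be trusted because it is interpolated into the SQL.

```python
import json
import sqlite3


def populate_table(conn, table, ask_llm, n_rows=5):
    """Fill an existing table with LLM-generated rows; returns how many were inserted.

    ask_llm is a hypothetical callable: prompt string in, JSON text out.
    The table name is assumed to be trusted (it is interpolated into SQL).
    """
    # Read the column names from the database so the prompt matches the schema.
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

    prompt = (
        f"Generate {n_rows} realistic rows for a table named {table} "
        f"with columns {columns}. Reply with a JSON list of objects only."
    )
    rows = json.loads(ask_llm(prompt))

    placeholders = ", ".join("?" for _ in columns)
    insert_sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"

    inserted = 0
    for row in rows:
        # Skip anything that does not have exactly the columns the schema defines.
        if not isinstance(row, dict) or set(row) != set(columns):
            continue
        conn.execute(insert_sql, [row[c] for c in columns])
        inserted += 1

    conn.commit()
    return inserted
```

Taking JSON and inserting through parameterized queries keeps the model from running arbitrary SQL against a schema you care about, at the cost of handling only one table at a time.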

How is it done today, and what are the limits of current practice?

What is new in your approach and why do you think it will be successful?

Who cares? If you are successful, what difference will it make?

2024-02-08

So you generate a bunch of these Schemas for all these different things, then what?

So you get a CRM schema. It isn't useful unless you write the rest of the CRM software so that it actually becomes usable; people don't interact with raw SQL databases. And even if you do finish the CRM software, it was not developed with actual users in mind, so no one will use it.

How is someone supposed to use the data an LLM loads into these SQL Schemas? Loading data into a schema with an LLM is easier than using fakerjs, though I don't know anyone who really had the problem of generating synthetic data, and this is likely a solved problem anyway.

The recursive prompting tooling I am developing here, similar to LangChain and LlamaIndex, can be used to generate narrative worlds that actually have to interact with systems, for example by modeling the state of a car.

I am personally not interested in developing complex video game worlds at this point. So then why does this problem intrigue me so much?

The question should be: what concepts do I want to use an LLM to map out? I am interested in creating a data structure to log every activity a human being does, so that we can generate a digital reflection of them in a computer that can live forever, Ray Kurzweil style. I want to know what data structure Bernard from Westworld used to manage the park so we can do Westworld IRL, ARG style.

What are the risks?

How much will it cost? Time is a resource.

How long will it take?

What are the mid-term and final “exams” to check for success?