Wednesday, 4 December 2024

Show HN: The Canada census data in a SQLite file; advice appreciated https://bit.ly/49j45Yx

This is niche, I'll admit. I needed to look through the latest census data, but it was exported as multiple multi-gigabyte, bespoke, latin1-encoded CSV files. Pandas, Polars, and SQLite's CSV import tool weren't much help, so I shelved the project until recently, when I started taking a SQLite course online. I picked it up again and normalized the data, and now there's a database that can be queried through a SQL view that matches the headings in the original CSVs. I'm proud of the script I created to export the data, which also automatically compresses the artifact and generates the diagrams, checksums, etc. This is my first time building a big database. Does my schema seem sane? I've been considering switching the counts from REAL to TEXT so that SQLite's decimal extension can do exact calculations, but since the data has only one or two places after the decimal point, I'm not sure it's worth it space-wise. https://bit.ly/49DSPpT December 5, 2024 at 12:20AM
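For readers who want a feel for the approach described above, here is a minimal sketch of streaming a latin1-encoded CSV into normalized SQLite tables and exposing a view whose columns match the CSV headings. It is not the author's actual script or schema: the headings (`GEO_NAME`, `CHARACTERISTIC_NAME`, `C1_COUNT_TOTAL`), the table names, the sample rows, and the helpers `lookup_id` and `load_census` are all hypothetical, chosen only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for one latin1-encoded census CSV.
# The headings and values are invented for this sketch.
SAMPLE_CSV = (
    "GEO_NAME,CHARACTERISTIC_NAME,C1_COUNT_TOTAL\n"
    "Montréal,Population,1762949\n"
    "Montréal,Average age,40.6\n"
    "Québec,Population,549459\n"
).encode("latin-1")

# Assumed schema: two lookup tables, one fact table, and a view that
# reproduces the original CSV headings. Counts are stored as REAL here.
SCHEMA = """
CREATE TABLE geo (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE characteristic (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE fact (
    geo_id INTEGER NOT NULL REFERENCES geo(id),
    characteristic_id INTEGER NOT NULL REFERENCES characteristic(id),
    total REAL,
    PRIMARY KEY (geo_id, characteristic_id)
);
CREATE VIEW census AS
SELECT g.name AS GEO_NAME,
       c.name AS CHARACTERISTIC_NAME,
       f.total AS C1_COUNT_TOTAL
FROM fact AS f
JOIN geo AS g ON g.id = f.geo_id
JOIN characteristic AS c ON c.id = f.characteristic_id;
"""


def lookup_id(conn, table, name):
    """Return the id for `name`, inserting it first if it is new.

    `table` is one of two fixed names in this sketch, never user input.
    """
    conn.execute(f"INSERT OR IGNORE INTO {table} (name) VALUES (?)", (name,))
    row = conn.execute(f"SELECT id FROM {table} WHERE name = ?", (name,))
    return row.fetchone()[0]


def load_census(conn, binary_stream):
    """Stream rows from a latin1 byte stream into the normalized tables."""
    conn.executescript(SCHEMA)
    # Decode lazily so a multi-gigabyte file is never held in memory at once.
    text = io.TextIOWrapper(binary_stream, encoding="latin-1", newline="")
    with conn:  # one transaction for the whole load
        for row in csv.DictReader(text):
            geo_id = lookup_id(conn, "geo", row["GEO_NAME"])
            char_id = lookup_id(conn, "characteristic", row["CHARACTERISTIC_NAME"])
            conn.execute(
                "INSERT INTO fact (geo_id, characteristic_id, total) VALUES (?, ?, ?)",
                (geo_id, char_id, float(row["C1_COUNT_TOTAL"])),
            )


conn = sqlite3.connect(":memory:")
load_census(conn, io.BytesIO(SAMPLE_CSV))

# The view can be queried with the same headings as the source CSV.
rows = conn.execute(
    "SELECT * FROM census ORDER BY GEO_NAME, CHARACTERISTIC_NAME"
).fetchall()
for r in rows:
    print(r)
```

For a real file, the `io.BytesIO` object would be replaced by a file opened in binary mode. Wrapping the load in a single transaction matters for bulk inserts, since committing per row is far slower in SQLite.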
