Replace CSV with the more efficient Parquet file format, or add Parquet as a second file format.
Posted: Fri Jul 22, 2022 7:04 am
I'm playing with Jupyter Notebooks / Pandas and recently discovered the Parquet file format.
It has many advantages over CSV.
It would be nice to get the data from the site in bulk using that format.
Because the data is stored column by column instead of row by row, it typically compresses better and is faster to load and process.
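To show what I mean by row-by-row versus column-by-column, here is a minimal sketch using only the Python standard library. It is not Parquet itself, just a toy illustration of the two layouts. The sample data, the field names (station, year, temp_c) and the helper functions are all made up for this example.

```python
import csv
import io

# Hypothetical sample data, invented for illustration only.
CSV_TEXT = """station,year,temp_c
A,2020,11.5
B,2020,9.0
A,2021,12.5
B,2021,10.0
"""


def read_rows(text):
    """Row-oriented: one dict per record, as csv.DictReader yields them."""
    return list(csv.DictReader(io.StringIO(text)))


def to_columns(rows):
    """Column-oriented: one list per field, the layout Parquet is built around."""
    columns = {name: [] for name in rows[0]}
    for row in rows:
        for name, value in row.items():
            columns[name].append(value)
    return columns


rows = read_rows(CSV_TEXT)
columns = to_columns(rows)

# Row layout: every record is visited in full, even though one field is used.
mean_from_rows = sum(float(r["temp_c"]) for r in rows) / len(rows)

# Column layout: the values of interest already sit together in one list,
# so the other fields never have to be touched.
temps = [float(v) for v in columns["temp_c"]]
mean_from_columns = sum(temps) / len(temps)

print(mean_from_rows, mean_from_columns)
```

Both approaches give the same answer, but the column layout only touches the one field it needs, and values of the same kind sit next to each other, which is what compression benefits from. In pandas the real thing would be DataFrame.to_parquet to write and pandas.read_parquet with its columns argument to load only selected columns; both need pyarrow or fastparquet installed.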