Transforming a CSV file into Apache Parquet format with AWS Glue

I was given the task of storing a CSV file in Amazon S3, with the prerequisite that the file first be converted to Apache Parquet format. I assume that one of the reasons for this conversion is to reduce costs: Parquet is a compressed, columnar format, so it typically takes up less storage and lets query engines scan less data. AWS Glue is well suited to this kind of conversion. I had planned to describe and show my own implementation here, but I realized that I had followed several existing how-tos that already cover it, so I am sharing them below as sources:

Convert CSV / JSON files to Apache Parquet using AWS Glue

Three AWS Glue ETL job types for converting data to Apache Parquet

Format Options for ETL Inputs and Outputs in AWS Glue

AWS Glue | CSV to Parquet transformation | Getting started

AWS: How to use AWS Glue ETL to convert CSV to Parquet – Tutorial
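
For orientation, here is a minimal sketch of what a Glue ETL script for this conversion can look like. It is not my exact implementation. It assumes a comma-separated CSV with a header row, and the job parameters `source_path` and `target_path` are hypothetical names chosen for this example (they would be passed to the job as `--source_path` and `--target_path`). The `awsglue` library is only available inside the Glue job runtime, so it is imported inside `main()`.

```python
import sys


def csv_source_options(path):
    """Options for reading a comma-separated CSV with a header row from S3."""
    return {
        "connection_type": "s3",
        "connection_options": {"paths": [path]},
        "format": "csv",
        "format_options": {"withHeader": True, "separator": ","},
    }


def parquet_sink_options(path):
    """Options for writing the data back to S3 as Parquet."""
    return {
        "connection_type": "s3",
        "connection_options": {"path": path},
        "format": "parquet",
    }


def main():
    # These modules exist only in the Glue job runtime, so import them lazily.
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    # source_path and target_path are hypothetical job parameters.
    args = getResolvedOptions(sys.argv, ["JOB_NAME", "source_path", "target_path"])

    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    # Read the CSV into a DynamicFrame, then write it out as Parquet.
    frame = glue_context.create_dynamic_frame.from_options(
        **csv_source_options(args["source_path"])
    )
    glue_context.write_dynamic_frame.from_options(
        frame=frame, **parquet_sink_options(args["target_path"])
    )

    job.commit()


# Run only when launched as a Glue job, which passes --JOB_NAME.
if __name__ == "__main__" and "--JOB_NAME" in sys.argv:
    main()
```

Keeping the source and sink options in small functions makes it easy to change the delimiter or output location without touching the job logic. The how-tos listed above cover crawlers, the Data Catalog, and the other job types in more detail.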
