
I was tasked with storing a CSV file in Amazon S3, with the prerequisite that it first be converted to Apache Parquet format. I assume one of the reasons for this conversion is to reduce costs, since Parquet's compressed, columnar layout typically takes up less storage and lets query engines scan less data, and AWS Glue is a very good fit for the job. I had planned to describe and show my implementation here, but I realized that I had simply followed several existing how-tos, so I am listing them below as sources:
Convert CSV / JSON files to Apache Parquet using AWS Glue
Three AWS Glue ETL job types for converting data to Apache Parquet
Format Options for ETL Inputs and Outputs in AWS Glue
AWS Glue | CSV to Parquet transformation | Getting started
AWS: How to use AWS Glue ETL to convert CSV to Parquet – Tutorial
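
For readers who want a quick idea of what these tutorials build, here is a minimal sketch of a Glue ETL script in Python. It is not my exact implementation: the function names and the `example-bucket` paths are made up for illustration, and it assumes a headered, comma-separated CSV. The `awsglue` module is only available inside the Glue job runtime, so it is imported inside the function that needs it.

```python
def csv_read_options(source_path):
    """Arguments for reading headered, comma-separated CSV files from S3."""
    return {
        "connection_type": "s3",
        "connection_options": {"paths": [source_path]},
        "format": "csv",
        # Assumption: the first row is a header and the delimiter is a comma.
        "format_options": {"withHeader": True, "separator": ","},
    }


def parquet_write_options(target_path):
    """Arguments for writing the same data back to S3 as Parquet."""
    return {
        "connection_type": "s3",
        "connection_options": {"path": target_path},
        "format": "parquet",
    }


def convert_csv_to_parquet(source_path, target_path):
    """Read CSV from source_path and write Parquet to target_path.

    Only works inside an AWS Glue job, where awsglue and pyspark exist.
    """
    from awsglue.context import GlueContext
    from pyspark.context import SparkContext

    glue_context = GlueContext(SparkContext.getOrCreate())

    # Load the CSV files into a DynamicFrame.
    frame = glue_context.create_dynamic_frame.from_options(
        **csv_read_options(source_path)
    )

    # Write the DynamicFrame out in Parquet format.
    glue_context.write_dynamic_frame.from_options(
        frame=frame, **parquet_write_options(target_path)
    )


# Hypothetical call, to be placed at the end of the Glue job script:
# convert_csv_to_parquet("s3://example-bucket/raw/", "s3://example-bucket/parquet/")
```

Keeping the read and write options in small helper functions makes them easy to check locally, while the actual conversion only runs once the script is deployed as a Glue job.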


*The views expressed here are my own and do not represent those of my employer.*
Hello, I’m Bruno — a dual citizen of Brazil and Sweden. I bring a global perspective shaped by experiences in both South America and Europe, with a strong focus on collaboration and innovation across cultures. I am a Computer Scientist, PhD Candidate in Information and Communication Technologies, focusing on Data Science and Artificial Intelligence, and hold dual Master’s degrees in Data Science and Cybersecurity. With over fifteen years of international experience spanning Brazil, Hungary, and Sweden, I have collaborated with global organizations such as IBM, Playtech, and Oracle, as well as contributed remotely to projects across multiple regions. My professional interests include Databases, Cybersecurity, Cloud Computing, Data Science, Data Engineering, Big Data, Artificial Intelligence, Programming, and Software Engineering, all driven by a deep passion for transforming data into strategic business value.