Parquet Example

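Install the pandas and pyarrow libraries from a terminal (Windows C:\ prompt or Unix $ prompt). pandas needs a Parquet engine such as pyarrow to write Parquet files.

pip install pandas pyarrow

Then run the following Python code:

import pandas as pd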

# Assuming you have a CSV file named 'sample.csv'
file_path = 'sample.csv'

# Alternatively, read the sample file directly from a URL
# file_path = 'https://raw.githubusercontent.com/gchandra10/filestorage/main/sales_100.csv'

# Read the CSV file
df = pd.read_csv(file_path)

# Display the first few rows of the DataFrame
print(df.head())

# Write DataFrame to a Parquet file
df.to_parquet('sample.parquet')

  • We import the pandas library.

  • We assume there's a CSV file named sample.csv in the same directory as your Python script.

  • pd.read_csv(file_path) reads the CSV file and returns a DataFrame (df) containing its data.

  • df.head() returns the first five rows of the DataFrame; printing it gives a quick preview of the data.

  • df.to_parquet('sample.parquet') writes the DataFrame to a Parquet file named sample.parquet.

How to read the Parquet file?

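Install pandas and pyarrow from the terminal (skip this step if they are already installed):

pip install pandas pyarrow

Then run the following Python code:

import pandas as pd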

# Assuming you have a Parquet file named 'sample.parquet'
file_path = 'sample.parquet'

# Read the Parquet file
df = pd.read_parquet(file_path)

# Display the DataFrame
print(df)
