
Concat files in one directory based on their extension


I have a directory that holds multiple large files with different extensions:

file1.csv
file2.csv
text_file.txt
text_file2.txt
json_file.json
json_file2.json
...

My goal is to join the files into three groups based on their extension. The script doesn't need to join everything in one run; I can change the extension and run the script three times. The main goal is to join all files of one type (let's say .csv) into a single file. I found this script on Stack Overflow, but it throws an error:

import os
import glob
import pandas as pd

os.chdir("/Users/user/Desktop")
extension = 'csv'
all_filenames = [i for i in glob.glob(f'*.{extension}')]
#combine all files in the list
combined_csv = pd.concat([pd.read_csv(f) for f in all_filenames])
#export to csv
combined_csv.to_csv("combined_csv.csv", index=False, encoding='utf-8-sig')

The code raises this error:

  File "pandas/_libs/parsers.pyx", line 543, in pandas._libs.parsers.TextReader.__cinit__
pandas.errors.EmptyDataError: No columns to parse from file
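
pandas raises `EmptyDataError` when a file contains nothing to parse, so one likely cause (an assumption, since the directory contents are not shown) is a zero-byte `.csv` file matching the glob pattern. A minimal sketch of filtering such files out before reading; the helper name `non_empty_files` and its parameters are hypothetical, and it uses only the standard library:

```python
import glob
import os


def non_empty_files(directory, extension, exclude=()):
    """Return sorted paths of non-empty `*.extension` files in `directory`.

    `exclude` holds file names (not paths) to leave out, such as the
    output of a previous run that lives in the same directory.
    """
    pattern = os.path.join(directory, f"*.{extension}")
    return [
        path
        for path in sorted(glob.glob(pattern))
        if os.path.basename(path) not in exclude
        and os.path.getsize(path) > 0  # zero-byte files have no columns to parse
    ]
```

The list comprehension in the script could then become `pd.concat([pd.read_csv(f) for f in non_empty_files(".", "csv", exclude=("combined_csv.csv",))])`. This check only catches zero-byte files; if a file holds only blank lines, wrapping `pd.read_csv` in `try`/`except pd.errors.EmptyDataError` is the more robust alternative.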

The other part of the question concerns memory efficiency: at what point should I start concatenating files line by line rather than loading each whole file?
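
Because `pd.concat` keeps every DataFrame in memory at once, streaming is worth considering once the combined size of the files approaches the available RAM. A minimal sketch of streamed concatenation with the standard library, assuming every file ends with a newline and, for CSV, that all files share the same header row; the function name `concat_files` and its parameters are hypothetical:

```python
import glob
import os
import shutil


def concat_files(directory, extension, output_path, skip_header=False):
    """Stream every `*.extension` file in `directory` into `output_path`.

    With skip_header=True the first line of every file after the first
    is dropped, which suits CSV files that share one header row.
    Returns the number of input files written.
    """
    out_abs = os.path.abspath(output_path)
    paths = [
        p
        for p in sorted(glob.glob(os.path.join(directory, f"*.{extension}")))
        if os.path.abspath(p) != out_abs  # never read the output as input
    ]
    with open(output_path, "wb") as out:
        for index, path in enumerate(paths):
            with open(path, "rb") as src:
                if skip_header and index > 0:
                    src.readline()  # discard the repeated header line
                # Copies in fixed-size chunks, so memory use stays flat.
                shutil.copyfileobj(src, out)
    return len(paths)
```

For example, `concat_files(".", "csv", "combined.csv", skip_header=True)` would handle the CSV group and `concat_files(".", "txt", "combined.txt")` the text group. Plain concatenation does not produce valid JSON unless the files are in JSON Lines format, so the `.json` group would need to be parsed and merged instead.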

