We can read a comma-separated values (CSV) file in Python using the built-in csv library.
The following code shows how.
Sample Code:
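import csv

# newline='' lets the csv module handle line endings itself
with open('SampleCSV.csv', newline='') as csvfile:
    # Note: quotechar='|' means fields are quoted with | rather than "
    csvContent = csv.reader(csvfile, delimiter=',', quotechar='|')
    for row in csvContent:
        print(row)  # each row is a list of strings
        for col in row:
            print('Column value is ' + col)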
You can use the sample CSV file above to try this example.
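To see what the quotechar argument changes, here is a minimal sketch that parses the same text twice. The sample data is made up for illustration, and io.StringIO stands in for an opened file so the snippet runs without SampleCSV.csv.

```python
import csv
import io

# Hypothetical sample data; io.StringIO stands in for an opened file.
data = 'name,city\n"Doe, Jane",Oslo\n'

# Default quotechar ('"'): the quoted comma stays inside one field.
default_rows = list(csv.reader(io.StringIO(data, newline='')))

# quotechar='|': double quotes are ordinary characters, so the comma splits the field.
pipe_rows = list(csv.reader(io.StringIO(data, newline=''), delimiter=',', quotechar='|'))

print(default_rows)
print(pipe_rows)
```

If your file quotes fields with double quotes, leaving quotechar at its default is probably what you want.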
If reading the file raises the following exception, pass either encoding="utf8", errors='ignore' or encoding="ISO-8859-1" to open():
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf6 in position 5873: invalid start byte
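One way to avoid guessing the encoding up front is to try UTF-8 first and fall back to ISO-8859-1 only when decoding fails. The sketch below assumes a hypothetical helper named read_csv_rows and writes its own temporary file for the demo; neither is part of the csv library.

```python
import csv
import os
import tempfile


def read_csv_rows(path, encodings=("utf-8", "ISO-8859-1")):
    """Return (rows, encoding) using the first encoding that decodes the file."""
    for encoding in encodings:
        try:
            with open(path, newline='', encoding=encoding) as csvfile:
                # Consume the reader inside the try block: decoding happens
                # lazily, so the error is raised while rows are being read.
                return list(csv.reader(csvfile)), encoding
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
    raise ValueError("none of the candidate encodings could decode " + path)


# Demo: a file containing byte 0xf6, which is not valid UTF-8 in this position.
with tempfile.NamedTemporaryFile(delete=False, suffix='.csv') as tmp:
    tmp.write('name,city\nBj\xf6rk,Reykjavik\n'.encode('ISO-8859-1'))

rows, used_encoding = read_csv_rows(tmp.name)
os.remove(tmp.name)
print(used_encoding, ascii(rows))  # ascii() keeps the output printable on any console
```

ISO-8859-1 maps every possible byte to a character, so the fallback never raises, but it can produce wrong characters if the file was actually written in some other encoding.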
Sample Code with encoding="ISO-8859-1":
import yake
import csv

concatenatedString = ''
with open('report.csv', newline='', encoding="ISO-8859-1") as csvfile:
    csvContent = csv.reader(csvfile, delimiter=',', quotechar='|')
    # Append every cell of every row to one space-separated string
    for row in csvContent:
        for col in row:
            concatenatedString += ' ' + col

keywordExtractor = yake.KeywordExtractor("en", 6, 0.5, 'lev', 2, 5, features=None)
matchedKeywords = keywordExtractor.extract_keywords(concatenatedString)
# Keep only the first element of each result tuple
topKeywords = [keyword[0] for keyword in matchedKeywords]
print(topKeywords)
Sample Code with utf8 encoding:
import yake
import csv

concatenatedString = ''
# errors='ignore' skips bytes that are not valid UTF-8 instead of raising
with open('report.csv', newline='', encoding="utf8", errors='ignore') as csvfile:
    csvContent = csv.reader(csvfile, delimiter=',', quotechar='|')
    # Append every cell of every row to one space-separated string
    for row in csvContent:
        for col in row:
            concatenatedString += ' ' + col

keywordExtractor = yake.KeywordExtractor("en", 5, 0.9, 'lev', 2, 5, features=None)
matchedKeywords = keywordExtractor.extract_keywords(concatenatedString)
# Keep only the first element of each result tuple
topKeywords = [keyword[0] for keyword in matchedKeywords]
print(topKeywords)
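The concatenation step in the samples above can also be written with a single join, which avoids building the string piece by piece. This is only a sketch of an alternative, not a required change; the sample data is made up, and io.StringIO stands in for the opened report.csv. Unlike the += version, the result has no leading space.

```python
import csv
import io

# Stand-in for an opened CSV file (hypothetical sample data).
csvfile = io.StringIO('title,summary\nfast parser,reads large files\n', newline='')

# Flatten every cell of every row, then join once at the end.
cells = [col for row in csv.reader(csvfile) for col in row]
concatenatedString = ' '.join(cells)
print(concatenatedString)
```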
Note:
errors='ignore' silently drops any bytes that cannot be decoded, so the affected characters will be missing from the concatenated string.
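The effect is easy to see on a single value. The following sketch decodes the same bytes three ways; the sample bytes are chosen for illustration and contain the byte 0xf6 from the exception above.

```python
raw = b'Bj\xf6rk'  # 'Björk' encoded as ISO-8859-1; 0xf6 is not valid UTF-8 here

ignored = raw.decode('utf-8', errors='ignore')    # the undecodable byte is dropped
replaced = raw.decode('utf-8', errors='replace')  # the byte becomes U+FFFD
latin1 = raw.decode('ISO-8859-1')                 # the byte is read as 'ö'

# ascii() keeps the output printable on any console
print(ascii(ignored), ascii(replaced), ascii(latin1))
```

With errors='ignore' the letter disappears without any warning, while errors='replace' at least leaves a visible marker where data was lost.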