Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
437 views
in Technique[技术] by (71.8m points)

Python: writes only one row into SQLite DB despite loading the full file data into pandas framework

I am trying to read the data from "data.txt" and write it to SQLITE database, what I actually get is only first row of the record. I figured out by printing the pandas frame object, it does get all the data from "txt" but failed to iterate throughout, to me the iteration loop looks fine can anybody figure out what I am doing wrong here which making it fail to replicate all the rows of file.

File data looks like this:

862     1
4828    1
6429    1
10013   1
7729    1
380    1
3808 1
7246    1
1663 1

Secondly, you will notice that spaces between the columns are inconsistent, at some places there are spaces and at some places these are tab separated. I have few thousands of inconsistent rows , this is just a subset of those for reference. Is there any way to handle this?

I tweaked delimiter parameter and also split function but it doesn't workout here. Any suggestion?

Actual results:

Only one row into DB

862     1

Expected results:

862     1
4828    1
6429    1
10013   1
7729    1
380     1
3808    1
7246    1
1663    1

Here is the piece of code I tried:

data = pd.read_csv ("data.txt")   
rows = pd.DataFrame(data)

#print (rows)

for row in rows:
    r = row.split()
    object_id = r[0]
    object_type = r[1]
    cur.execute(''' INSERT INTO objects (object_id, object_type) 
     VALUES (?, ?) ''', (object_id, object_type))
    conn.commit()

Any valuable input is appreciated! Thanks in advance.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

This should work as expected:

data = pd.read_csv ("data.txt", delimiter=' ', header=None, names=['object_id', 'object_type'], skipinitialspace=True)
 
for row in data.itertuples(index=False):
    cur.execute(''' INSERT INTO objects (object_id, object_type) 
     VALUES (?, ?) ''', (row.object_id, row.object_type))

conn.commit()

header=None ensures the first row in the file is not considered as a header row.
names lists out the column names
skipinitialspace=True resolves the whitespace issue.

In your code you were iterating through the column names and not the rows.
Since the first row(862, 1) was parsed as header, only that row got inserted into the db.

Also, you don't have to commit at every iteration. A single commit at the end of iteration is suffice.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...