Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to convert appended list of strings into one single list

I have an HTML file from which I want to extract the bounding box (BBox) information together with the text. After extracting each BBox and its text, I append it to a list. However, the output looks as if the first list (holding only the first line) is being appended to a second list (holding the first and second lines), and so on. To better illustrate the problem, I attached a screenshot of the output. [image: current output]

However, I want everything in one single list. The following screenshot illustrates the output that I want. [image: desired output]

Below is the simple code that I wrote:

import bs4

xml_input = open("1.html","r",encoding="utf-8")
soup = bs4.BeautifulSoup(xml_input,'lxml')
ocr_lines = soup.findAll("span", {"class": "ocr_line"})
#We will save coordinates of line and the text contained in the line in lines_structure list
lines_structure = []
for line in ocr_lines:
    line_text = line.text.replace("\n"," ").strip()
    title = line['title']
    #The coordinates of the bounding box
    x1,y1,x2,y2 = map(int, title[5:title.find(";")].split())
    lines_structure.append({"x1":x1,"y1":y1,"x2":x2,"y2":y2,"text": line_text})
    print(lines_structure)

I would really appreciate your help regarding this problem.

question from:https://stackoverflow.com/questions/65836667/how-to-convert-appended-list-of-strings-into-one-single-list

1 Answer

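Actually, after some digging, I found that the print call needs to be outside of the 'for' loop. Inside the loop it printed the growing list once per iteration, which made the output look like several lists appended to each other, while lines_structure itself was always one single list. It was a quick fix. Thanks for your time.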
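For anyone landing here later, a minimal sketch of the fix might look like the following. To keep it runnable without an HTML file, it assumes two made-up (title, text) pairs in place of the ocr_line spans; sample_lines and parse_lines are hypothetical names that are not in the original code. The BBox slicing is the same as in the question.

```python
# Hypothetical stand-ins for the title attribute and text of each ocr_line span.
sample_lines = [
    ("bbox 10 20 110 40; baseline 0 -5", "First line\nof text"),
    ("bbox 10 50 120 70; baseline 0 -5", "Second line"),
]


def parse_lines(lines):
    """Build one list with one bounding-box dict per OCR line."""
    lines_structure = []
    for title, text in lines:
        line_text = text.replace("\n", " ").strip()
        # Same slicing as in the question: skip the "bbox " prefix, stop at ";".
        x1, y1, x2, y2 = map(int, title[5:title.find(";")].split())
        lines_structure.append(
            {"x1": x1, "y1": y1, "x2": x2, "y2": y2, "text": line_text}
        )
    return lines_structure


lines_structure = parse_lines(sample_lines)
# Print once, after the loop has finished, so only the complete list is shown.
print(lines_structure)
```

With the print call after the loop, the output is a single list containing one dict per line, instead of one progressively longer list per iteration.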

