If you are still looking for a solution, how about this answer? Unfortunately, I couldn't find a built-in method for retrieving the number of lines in a Google Document. So how about this workaround?
If the end of each line can be detected, the number of lines can be counted. So I tried adding an end-of-line marker to each line using OCR. There may be several workarounds for your issue, so please think of this as just one of them.
In Google Documents, when a sentence is longer than the page width, it wraps to the next line automatically. But this automatic line break contains no `\r\n` or `\n`. When a user inserts a line break with the Enter key, the line break does contain `\r\n` or `\n`. Because of this, the text data retrieved from the document has only the line breaks that were entered by users. In your case, it seems that your document has line breaks after `incididunt` and `consequat.`. So the number of lines doesn't come out as 6.
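To illustrate the point above, here is a minimal sketch in plain JavaScript. The string is only an assumed stand-in for what the document text would look like when retrieved (standard Lorem ipsum with Enter-key breaks after "incididunt" and "consequat."), and `countHardBreakLines` is a hypothetical helper name, not an Apps Script method.

```javascript
// Assumed stand-in for the retrieved document text: the automatic wraps are
// absent, and only the two Enter-key breaks appear as "\n".
var text =
  "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt\n" +
  "ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.\n" +
  "Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.";

// Hypothetical helper: counts the lines separated by user-entered breaks only.
function countHardBreakLines(text) {
  return text.split(/\r\n|\n/).length;
}

console.log(countHardBreakLines(text)); // 3
```

Splitting on the line breaks gives 3 here, however many lines the text occupies on the page, because the wrapped lines leave no character to count.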
I thought that OCR could be used for this situation. The flow is as follows.
- Convert the Google Document to PDF.
- Convert the PDF to text data using OCR.
  - I selected "ocr.space" for OCR.
  - If you already know another OCR API, you can try it instead.
  - When I used the OCR of Drive API, the line breaks of `\r\n` or `\n` were not added to the converted text data. So I used ocr.space, which can add the line breaks.
- Count `\n` in the converted text data.
  - This number is the number of lines.
The sample script for the flow above is as follows. Before using it, please get your API key from "ocr.space". When you submit your information and email address in the form there, you will receive an email containing the API key. Please set it in this sample script. Also, please check the API quota. I tested this using the Free plan.
Sample script :
// API key obtained from ocr.space.
var apikey = "### Your API key for using ocr.space ###";
var id = DocumentApp.getActiveDocument().getId();
// Export the active document as a PDF.
var url = "https://docs.google.com/feeds/download/documents/export/Export?id=" + id + "&format=pdf&access_token=" + ScriptApp.getOAuthToken();
var blob = UrlFetchApp.fetch(url).getBlob();
// Send the PDF to ocr.space as form data.
var options = {method: "POST", headers: {apikey: apikey}, payload: {file: blob}};
var ocrRes = JSON.parse(UrlFetchApp.fetch("https://api.ocr.space/Parse/Image", options).getContentText());
// Count "\n" in the first parsed result; match() returns null when nothing matches.
var result = ocrRes.ParsedResults.map(function(e){return (e.ParsedText.match(/\n/g) || []).length})[0];
Logger.log(result)
Result :
When your sample text is used, the script returns 6.
Note :
- Even if the last line of the document has no `\r\n` or `\n`, the converted text data has `\r\n` at the end of every line.
- In this case, the precision of OCR is not important. The important point is to retrieve the line breaks.
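Because every line in the converted text ends with a line break, counting `\n` gives the line count directly. The sample script reads only the first entry of `ParsedResults`. As a rough sketch, assuming the response holds one entry per PDF page, the counts could be summed over all entries as below. `countOcrLines` is a hypothetical helper and `sampleRes` is a made-up response used only for illustration, so please check them against the actual response in your environment.

```javascript
// Hypothetical helper: sums the line breaks over every entry in ParsedResults.
function countOcrLines(ocrRes) {
  if (!ocrRes || !Array.isArray(ocrRes.ParsedResults)) {
    throw new Error("The OCR response has no ParsedResults.");
  }
  return ocrRes.ParsedResults.reduce(function (total, page) {
    // match() returns null when the text has no line break.
    var breaks = (page.ParsedText || "").match(/\n/g);
    return total + (breaks ? breaks.length : 0);
  }, 0);
}

// Made-up response with two pages, only for illustration.
var sampleRes = {
  ParsedResults: [
    {ParsedText: "first line\r\nsecond line\r\n"},
    {ParsedText: "third line\r\n"}
  ]
};

console.log(countOcrLines(sampleRes)); // 3
```

In the sample script, this would replace the line that computes `result`, for example `var result = countOcrLines(ocrRes);`.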
I tested this script on several documents. In my environment, the correct number of lines was retrieved. But I'm not sure whether this script works in your environment. If it cannot be used there, I'm sorry.