I have a Spark column whose values look like this:
Agent="iee/500.0 (OS X 10_15_6) ("HTML", like) Version/144.0.1 Safari/000.1.0099", Status="null", Search_Type="null", Identifier="null", Mode="null", Activation="null", First_Name="null", Last_Name="null", Request="null", Code="null", Email="null"
I split each value on ", (a double quote followed by a comma) and create a separate column for each field.
But the Agent field, Agent="iee/500.0 (OS X 10_15_6) ("HTML", like) Version/144.0.1 Safari/000.1.0099", has extra quotes inside its value, so the split also breaks after HTML.
Can you suggest a regex that removes these inner quotes?
Input : Agent="iee/500.0 (OS X 10_15_6) ("HTML", like) Version/144.0.1 Safari/000.1.0099", Status="null", Search_Type="null", Identifier="null", Mode="null", Activation="null", First_Name="null", Last_Name="null", Request="null", Code="null", Email="null"
Expected Output : Agent="iee/500.0 (OS X 10_15_6) (HTML, like) Version/144.0.1 Safari/000.1.0099", Status="null", Search_Type="null", Identifier="null", Mode="null", Activation="null", First_Name="null", Last_Name="null", Request="null", Code="null", Email="null"
The extra quotes can appear in any column, not only around the word HTML.
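To make the requirement concrete, here is a minimal sketch in plain Scala of one way the rule could be written as a regex. It assumes every field has the shape key="value", that fields are separated by a comma and a space, and that keys contain only word characters. Under those assumptions a quote is structural if it directly follows = or is directly followed by , key=" or the end of the string, and every other quote is removed. The names StripInnerQuotes, strayQuote and clean are made up for illustration, and the sample uses only three of the fields above to stay short.

```scala
object StripInnerQuotes {
  // Matches a double quote that is NOT preceded by '=' (an opening quote)
  // and NOT followed by `, key="` or the end of the string (a closing quote).
  // Assumption: keys consist of word characters only (letters, digits, '_').
  val strayQuote: String = """(?<!=)"(?!,\s*\w+="|$)"""

  // Removes every quote the pattern classifies as stray.
  def clean(raw: String): String = raw.replaceAll(strayQuote, "")

  def main(args: Array[String]): Unit = {
    // .trim drops the trailing space that only separates the final data
    // quote from the closing triple quote of the Scala literal.
    val input =
      """Agent="iee/500.0 (OS X 10_15_6) ("HTML", like) Version/144.0.1 Safari/000.1.0099", Status="null", Email="null" """.trim
    println(clean(input))
  }
}
```

Since Spark's regexp_replace takes a Java regular expression, the same pattern string could probably be passed to it directly, for example regexp_replace(col("value"), StripInnerQuotes.strayQuote, ""), where the column name value is only a placeholder. The rule is ambiguous if a value itself contains text such as ", word=" or ends with =, so that case would need separate handling.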
question from:
https://stackoverflow.com/questions/65842595/remove-double-quotes-after-and-before-in-spark-scala-column