I have a dataframe df:
{'city': {0: 'Adak', 1: 'Akiachak', 2: 'Akiak', 3: 'Akutan', 4: 'Alakanuk'},
'latitudedegrees': {0: '51.87957',
1: '60.88981',
2: '60.911865',
3: '54.098693',
4: '62.683391'},
'latituderadians': {0: 0.9054693110188746,
1: 1.0627276654137685,
2: 1.0631125977802958,
3: 0.9442003138756087,
4: 1.094031559264981},
'longitudedegrees': {0: '-176.63675',
1: '-161.42393',
2: '-161.22577',
3: '-165.88176',
4: '-164.65455'},
'longituderadians': {0: -3.082892867522094,
1: -2.8173790700088506,
2: -2.8139205255630984,
3: -2.8951828810030293,
4: -2.8737640258896295},
'ncity': {0: 'Dallas', 1: 'Dallas', 2: 'Dallas', 3: 'Dallas', 4: 'Dallas'},
'nlatituderadians': {0: 0.5722195078367402,
1: 0.5722195078367402,
2: 0.5722195078367402,
3: 0.5722195078367402,
4: 0.5722195078367402},
'nlongituderadians': {0: -1.6891776914122487,
1: -1.6891776914122487,
2: -1.6891776914122487,
3: -1.6891776914122487,
4: -1.6891776914122487},
'nstate': {0: 'TX', 1: 'TX', 2: 'TX', 3: 'TX', 4: 'TX'},
'state': {0: 'AK', 1: 'AK', 2: 'AK', 3: 'AK', 4: 'AK'},
'zip': {0: '99546', 1: '99551', 2: '99552', 3: '99553', 4: '99554'}}
It's a Cartesian product with a list of 'ncity' values and has several million rows. The original file is here:
https://public.opendatasoft.com/explore/dataset/us-zip-code-latitude-and-longitude/export/
df already has the coordinates in radians, but they are being read in as strings and therefore fail in this function:
def distanceBetweenCityInMiles(lat1, long1, lat2, long2):  # assumes latitudes and longitudes are in radians
    d = np.arccos(np.sin(lat1)*np.sin(lat2)+np.cos(lat1)*np.cos(lat2)*np.cos(long1-long2))
    # distance_km ≈ radius_km * distance_radians ≈ 6371 * d, where 6371 km is the average radius of the earth
    distance_km = 6371 * d
    distance_mi = distance_km * 0.621371
    return distance_mi
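As a quick sanity check, the formula can be run on plain floats taken from row 0 of the sample (Dallas to Adak). This is only a sketch: it repeats the function so that it runs on its own, and the variable name `adak_to_dallas` is made up for illustration.

```python
import numpy as np

def distanceBetweenCityInMiles(lat1, long1, lat2, long2):
    # Spherical law of cosines; all four arguments are assumed to be in radians.
    d = np.arccos(np.sin(lat1) * np.sin(lat2)
                  + np.cos(lat1) * np.cos(lat2) * np.cos(long1 - long2))
    distance_km = 6371 * d          # 6371 km is the average radius of the earth
    return distance_km * 0.621371   # km to miles

# Row 0 of the sample, typed in as floats (hypothetical variable name).
adak_to_dallas = distanceBetweenCityInMiles(
    0.5722195078367402, -1.6891776914122487,   # nlatituderadians, nlongituderadians
    0.9054693110188746, -3.082892867522094,    # latituderadians, longituderadians
)
print(adak_to_dallas)  # on the order of 4,000 miles
```

With float inputs the function returns a finite number, so the formula itself does not appear to be the problem.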
I've tried converting to float:
df[['nlatituderadians','nlongituderadians','latituderadians','longituderadians']]=df[['nlatituderadians','nlongituderadians','latituderadians','longituderadians']].astype(float)
But I still get this error:
df['ncitydistance']= distanceBetweenCityInMiles('nlatituderadians', 'nlongituderadians', 'latituderadians', 'longituderadians')
TypeError: ufunc 'sin' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
As you can see, I have all the data for each pair in one row, and I need to calculate the distance between the nlat/nlong and lat/long values.
How can I convert these strings to radians so I can run the data through the distance function? I am assuming that is the reason this will not work. The end result should be a new column that gives the distance between the cities.
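One thing worth noting about the call above: it passes the column names as plain strings ('nlatituderadians', ...) rather than the columns themselves (df['nlatituderadians'], ...), and np.sin cannot operate on a bare string, which would produce this TypeError regardless of the column dtypes. A minimal sketch of what the call might look like, assuming a two-row stand-in for df whose radian columns are deliberately stored as strings; the `cols` list and the `check` variable are illustrative names, not part of the original code:

```python
import numpy as np
import pandas as pd

def distanceBetweenCityInMiles(lat1, long1, lat2, long2):
    # Same formula as in the question; inputs are assumed to be in radians.
    d = np.arccos(np.sin(lat1) * np.sin(lat2)
                  + np.cos(lat1) * np.cos(lat2) * np.cos(long1 - long2))
    return 6371 * d * 0.621371

# Two sample rows, with numeric values stored as strings on purpose.
df = pd.DataFrame({
    'city': ['Adak', 'Akiachak'],
    'latitudedegrees': ['51.87957', '60.88981'],
    'latituderadians': ['0.9054693110188746', '1.0627276654137685'],
    'longituderadians': ['-3.082892867522094', '-2.8173790700088506'],
    'nlatituderadians': ['0.5722195078367402', '0.5722195078367402'],
    'nlongituderadians': ['-1.6891776914122487', '-1.6891776914122487'],
})

# Convert the string columns to floats; unparseable values become NaN.
cols = ['nlatituderadians', 'nlongituderadians',
        'latituderadians', 'longituderadians']
df[cols] = df[cols].apply(pd.to_numeric, errors='coerce')

# Pass the Series themselves, not their names, so numpy works element-wise.
df['ncitydistance'] = distanceBetweenCityInMiles(
    df['nlatituderadians'], df['nlongituderadians'],
    df['latituderadians'], df['longituderadians'],
)

# If only the degree strings were available, they could be converted like this.
check = np.radians(pd.to_numeric(df['latitudedegrees']))
print(df[['city', 'ncitydistance']])
```

Because the function only uses numpy ufuncs and arithmetic, it runs on whole columns at once, which should matter for a frame with several million rows.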