[Tutor] Help with two .shp file and python coding
Tariq Khasiri
tariqkhasiri at gmail.com
Wed Mar 29 09:51:39 EDT 2023
After following the steps
import geopandas as gpd
from shapely.geometry import Point
# Load shapefiles
counties = gpd.read_file('county.shp')
interstates = gpd.read_file('roads.shp')
# To resolve this warning, you can reproject one of the datasets to match
the CRS of the other. In this case, you could reproject the counties
dataset to match the CRS of the interstates dataset.
counties = counties.to_crs(interstates.crs)
# Perform spatial join to find nearest interstate for each county
joined = gpd.sjoin_nearest(interstates, counties, distance_col='distance')
#creating new data frame with necessary columns from the joined dataframe
of total 27 columns
new_joined = joined.loc[:, ['LINEARID', 'FULLNAME', 'RTTYP', 'geometry',
'index_right', 'statefp', 'countyfp', 'distance']]
print(new_joined.sample(10))
my data frames (10 observations) look like this
Now the rest of the code looks like this. But, I'm not sure how I need to
edit my code with the columns of new_joined data frame. On top of it I
need observation of the interstate highways. So from RTTYP column anything
that starts with *I- *, that's my needed observation. The rest of it is not
necessary since they are observations of highways or expressways. Only need
the interstate highway given my goal.
## the rest of the code needs to be edited
# Calculate distance between each county and its nearest interstate
def calculate_distance(row):
county_point = row.geometry.x, row.geometry.y
interstate_point = row.geometry_nearest.x, row.geometry_nearest.y
return Point(county_point).distance(Point(interstate_point))
joined['distance'] = joined.apply(calculate_distance, axis=1)
# Save results to file
joined.to_file('output.shp')
##new_joined.sample 10 observation is in the following section
LINEARID FULLNAME RTTYP \
11187 1104375149380 US Hwy 30 U
658 1105056901139 I- 96 I
4070 110465934760 Gerald R Ford Fwy M
8925 1102218644426 Edens Expy M
1254 1104492838322 US Hwy 23 U
3514 1101476091588 I- 90 I
3979 110431486317 Will Rogers Tpke M
12155 1104762432455 I- 40 I
10332 1105088962603 I- 80 I
12086 1104755971958 I- 94 I
geometry index_right statefp \
11187 LINESTRING (-105.52431 41.28870, -105.52369 41... 2773 56
658 LINESTRING (-83.08258 42.31966, -83.08260 42.3... 271 26
4070 LINESTRING (-85.78228 42.87707, -85.78357 42.8... 1083 26
8925 LINESTRING (-87.74634 41.96272, -87.74642 41.9... 1189 17
1254 LINESTRING (-83.99342 34.08687, -83.99330 34.0... 427 13
3514 LINESTRING (-75.97591 43.09333, -75.97613 43.0... 82 36
3979 LINESTRING (-95.27075 36.51022, -95.27031 36.5... 1474 40
12155 LINESTRING (-94.31317 35.45632, -94.31351 35.4... 1337 05
10332 LINESTRING (-121.54758 38.59867, -121.54713 38... 871 06
12086 LINESTRING (-96.28076 46.48266, -96.27988 46.4... 555 27
countyfp distance
11187 001 0.0
658 163 0.0
4070 139 0.0
8925 031 0.0
1254 135 0.0
3514 067 0.0
3979 035 0.0
12155 033 0.0
10332 113 0.0
On Tue, 28 Mar 2023 at 09:43, ThreeBlindQuarks <threesomequarks at proton.me>
wrote:
> Tariq,
>
> Your entire program is written in a module called geopandas so to even
> understand it, we would need to find out what exactly various functions in
> that module you call are documented to do.
>
> A little more explanation of what you are doing might be helpful.
>
> Near as I can tell, you are reading in two files to make two structures in
> a pandas format. You join them based on some common columns in whatever
> manner gpd.sjoin_nearest() does so that each row now contains enough info
> for the next step.
>
> You create a function that accepts one such row and extracts the
> coordinates of two parts and calculates a distance that I am guessing is a
> Euclidean distance as the crow flies, not as you drive on roads.
>
> You use that function on every row to create a new column.
>
> Then you save the results in a file.
>
> Does that do something? Who knows. The devil is in the details. The
> mysterious black box that does the initial join would have to uniquely
> match up counties and highways in a way that generates exactly the rows
> needed and exclude any that don't. We (meaning ME) have no assurance of
> what happens in there. Could you end up comparing all counties with their
> distance from somewhere on highway 1?
>
> - Q
>
>
> Sent with Proton Mail secure email.
>
> ------- Original Message -------
> On Monday, March 27th, 2023 at 10:58 PM, Tariq Khasiri <
> tariqkhasiri at gmail.com> wrote:
>
>
> > Hello everyone,
> >
> > For a particular project, I need to collect the distance data from each
> US
> > county to the nearest intersection of interstate highway.
> >
> > These are the publicly provided dataset where anyone has access to the
> > county shp files of usa and roads shp files.
> >
> > this one for county polygons
> >
> https://public.opendatasoft.com/explore/dataset/us-county-boundaries/information/
> >
> > this one for primary roads
> >
> >
> https://catalog.data.gov/dataset/tiger-line-shapefile-2016-nation-u-s-primary-roads-national-shapefile/resource/d983eeef-21d9-4367-9d42-27a131ee72b8
> >
> > `import geopandas as gpd from shapely.geometry import Point # Load
> shapefiles counties = gpd.read_file('path/to/counties.shp') interstates =
> gpd.read_file('path/to/interstates.shp') # Perform spatial join to find
> nearest interstate for each county joined = gpd.sjoin_nearest(counties,
> interstates) # Calculate distance between each county and its nearest
> interstate def calculate_distance(row): county_point = row.geometry.x,
> row.geometry.y interstate_point = row.geometry_nearest.x,
> row.geometry_nearest.y return
> Point(county_point).distance(Point(interstate_point)) joined['distance'] =
> joined.apply(calculate_distance, axis=1) # Save results to file
> joined.to_file('path/to/output.shp')`
> >
> > do you think I can execute my goal successfully with the code snippet
> above
> > ??
> > _______________________________________________
> > Tutor maillist - Tutor at python.org
> > To unsubscribe or change subscription options:
> > https://mail.python.org/mailman/listinfo/tutor
>
More information about the Tutor
mailing list