Register
Login
Resources
Docs Blog Datasets Glossary Case Studies Tutorials & Webinars
Product
Data Engine LLMs Platform Enterprise
Pricing Explore
Connect to our Discord channel

gshhg.rst 15 KB

You have to be logged in to leave a comment. Sign In
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
  1. The Global Self-consistent, Hierarchical, High-resolution Geography Database (GSHHG)
  2. ====================================================================================
  3. .. figure:: /_images/gshhg.jpg
  4. :height: 916 px
  5. :width: 1803 px
  6. :align: center
  7. :scale: 35 %
  8. GMT uses the :doc:`/coast` utility to access a version of the GSHHG data
  9. specially formatted for GMT. The GSHHG data have strengths and weaknesses.
  10. It is global and open source, but is based on relatively old datasets and
  11. hence may not be accurate enough for very large-scale mapping projects.
  12. The current GSHHG version used by GMT is 2.3.7. Below is a mostly technical
  13. description of how the GSHHG data set was assembled, processed, and formatted
  14. to meet the requirements of GMT.
  15. Selecting the right data
  16. ------------------------
  17. There are two well-known public-domain data sets that could be used for
  18. this purpose. Once is known as the World Data Bank II or CIA Data Bank
  19. (WDB) and contains coastlines, lakes, political boundaries, and rivers.
  20. The other, the World Vector Shoreline (WVS) only contains shorelines
  21. between saltwater and land (i.e., no lakes). It turns out that the WVS
  22. data is far superior to the WDB data as far as data quality goes, but as
  23. noted it lacks lakes, not to mention rivers and borders. We decided to
  24. use the WVS whenever possible and supplement it with WDB data. We got
  25. these data over the Internet; they are also available on CD-ROM from the
  26. National Centers for Environmental Information in Boulder, Colorado [1]_.
  27. Format required by GMT
  28. ----------------------
  29. In order to paint continents or oceans it is necessary that the
  30. coastline data be organized in polygons that may be filled. Simple line
  31. segments can be used to draw the coastline, but for painting polygons
  32. are required. Both the WVS and WDB data consists of unsorted line
  33. segments: there is no information included that tells you which segments
  34. belong to the same polygon (e.g., Australia should be one large
  35. polygon). In addition, polygons enclosing land must be differentiated
  36. from polygons enclosing lakes since they will need different paint.
  37. Finally, we want :doc:`/coast` to be
  38. flexible enough that it can paint the land *or* the oceans *or* both. If
  39. just land (or oceans) is selected we do not want to paint those areas
  40. that are not land (or oceans) since previous plot programs may have
  41. drawn in those areas. Thus, we will need to combine polygons into new
  42. polygons that lend themselves to fill land (or oceans) only (Note that
  43. older versions of :doc:`/coast` always
  44. painted lakes and wiped out whatever was plotted beneath).
  45. The long and winding road
  46. -------------------------
  47. The WVS and WDB together represent more than 100 Mb of binary data and
  48. something like 20 million data points. Hence, it becomes obvious that
  49. any manipulation of these data must be automated. For instance, the
  50. reasonable requirement that no coastline should cross another coastline
  51. becomes a complicated processing step.
  52. * To begin, we first made sure that all data were "clean", i.e., that
  53. there were no outliers and bad points. We had to write several
  54. programs to ensure data consistency and remove "spikes" and bad
  55. points from the raw data. Also, crossing segments were automatically
  56. "trimmed" provided only a few points had to be deleted. A few hundred
  57. more complicated cases had to be examined semi-manually.
  58. * Programs were written to examine all the loose segments and determine
  59. which segments should be joined to produce polygons. Because not all
  60. segments joined exactly (there were non-zero gaps between some
  61. segments) we had to find all possible combinations and choose the
  62. simplest combinations. The WVS segments joined to produce more than
  63. 200,000 polygons, the largest being the Africa-Eurasia polygon which
  64. has 1.4 million points. The WDB data resulted in a smaller data base
  65. (~25% of WVS).
  66. * We now needed to combine the WVS and WDB data bases. The main problem
  67. here is that we have duplicates of polygons: most of the features in
  68. WVS are also in WDB. However, because the resolution of the data
  69. differ it is nontrivial to figure out which polygons in WDB to
  70. include and which ones to ignore. We used two techniques to address
  71. this problem. First, we looked for crossovers between all possible
  72. pairs of polygons. Because of the crossover processing in step 1
  73. above we know that there are no remaining crossovers within WVS and
  74. WDB; thus any crossovers would be between WVS and WDB polygons.
  75. Crossovers could mean two things: (1) A slightly misplaced WDB
  76. polygon crosses a more accurate WVS polygon, both representing the
  77. same geographic feature, or (2) a misplaced WDB polygon (e.g., a
  78. small coastal lake) crosses the accurate WVS shoreline. We
  79. distinguished between these cases by comparing the area and centroid
  80. of the two polygons. In almost all cases it was obvious when we had
  81. duplicates; a few cases had to be checked manually. Second, on many
  82. occasions the WDB duplicate polygon did not cross its WVS counterpart
  83. but was either entirely inside or outside the WVS polygon. In those
  84. cases we relied on the area-centroid tests.
  85. * While the largest polygons were easy to identify by visual
  86. inspection, the majority remain unidentified. Since it is important
  87. to know whether a polygon is a continent or a small pond inside an
  88. island inside a lake we wrote programs that would determine the
  89. hierarchical level of each polygon. Here, level = 1 represents
  90. ocean/land boundaries, 2 is land/lakes borders, 3 is
  91. lakes/islands-in-lakes, and 4 is
  92. islands-in-lakes/ponds-in-islands-in-lakes. Level 4 was the highest
  93. level encountered in the data. To automatically determine the
  94. hierarchical levels we wrote programs that would compare all possible
  95. pairs of polygons and find how many polygons a given polygon was
  96. inside. Because of the size and number of the polygons such programs
  97. would typically run for 3 days on a Sparc-2 workstation.
  98. * Once we know what type a polygon is we can enforce a common
  99. "orientation" for all polygons. We arranged them so that when you
  100. move along a polygon from beginning to end, your left hand is
  101. pointing toward "land". At this step we also computed the area of all
  102. polygons since we would like the option to plot only features that
  103. are bigger than a minimum area to be specified by the user.
  104. * Obviously, if you need to make a map of Denmark then you do not want
  105. to read the entire 1.4 million points making up the Africa-Eurasia
  106. polygon. Furthermore, most plotting devices will not let you paint
  107. and fill a polygon of that size due to memory restrictions. Hence, we
  108. need to partition the polygons so that smaller subsets can be
  109. accessed rapidly. Likewise, if you want to plot a world map on a
  110. letter-size paper there is no need to plot 10 million data points as
  111. most of them will plot several times on the same pixel and the
  112. operation would take a very long time to complete. We chose to make 5
  113. versions on the database, corresponding to different resolutions. The
  114. decimation was carried out using the Douglas-Peucker (DP)
  115. line-reduction algorithm [2]_. We chose the cutoffs so that each
  116. subset was approximately 20% the size of the next higher resolution.
  117. The five resolutions are called **f**\ ull, **h**\ igh,
  118. **i**\ ntermediate, **l**\ ow, and **c**\ rude; they are accessed in
  119. :doc:`/coast`, :doc:`/gmtselect`, and
  120. :doc:`/grdlandmask` with the **-D** option.
  121. For each of these 5 data sets (**f**, **h**, **i**,
  122. **l**, **c**) we specified an equidistant grid (1, 2, 5, 10, 20) and
  123. split all polygons into line-segments that each fit inside one of the
  124. many boxes defined by these grid lines. Thus, to paint the entire
  125. continent of Australia we instead paint many smaller polygons made up
  126. of these line segments and gridlines. Some book-keeping has to be
  127. done since we need to know which parent polygon these smaller pieces
  128. came from in order to prescribe the correct paint or ignore if the
  129. feature is smaller than the cutoff specified by the user. The
  130. resulting segment coordinates were then scaled to fit in short
  131. integer format to preserve precision and written in netCDF format for
  132. ultimate portability across hardware platforms [3]_.
  133. * While we are now back to a file of line-segments we are in a much
  134. better position to create smaller polygons for painting. Two problems
  135. must be overcome to correctly paint an area:
  136. - We must be able to join line segments and grid cell borders into
  137. meaningful polygons; how we do this will depend on whether we want
  138. to paint the land or the oceans.
  139. - We want to nest the polygons so that no paint falls on areas that
  140. are "wet" (or "dry"); e.g., if a grid cell completely on land
  141. contains a lake with a small island, we do not want to paint the
  142. lake and then draw the island, but paint the annulus or "donut"
  143. that is represented by the land and lake, and then plot the
  144. island.
  145. GMT uses a polygon-assembly routine that carries out these tasks on the fly.
  146. The Five Resolutions
  147. --------------------
  148. We will demonstrate the power of the new database by starting with a
  149. regional hemisphere map centered near Papua New Guinea and zoom in on a
  150. specified point. The map regions will be specified in projected km from
  151. the projection center, e.g., we may want the map to go from km to km in
  152. the longitudinal and the latitudinal direction.
  153. Also, as we zoom in on the projection center we want to draw the outline
  154. of the next map region on the plot. To do that we use the **-D** option
  155. in :doc:`/basemap`.
  156. The crude resolution (**-Dc**)
  157. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  158. We begin with an azimuthal equidistant map of the hemisphere centered on
  159. 130°21'E, 0°12'S, which is slightly west of New Guinea, near the Strait of
  160. Dampier. The edges of the map are all 9000 km true distance from the
  161. projection center. At this scale (and for global maps) the crude
  162. resolution data will usually be adequate to capture the main geographic
  163. features. To avoid cluttering the map with insignificant detail we only
  164. plot features (i.e., polygons) that exceed 500 km\ :sup:`2` in area.
  165. Smaller features would only occupy a few pixels on the plot and make the
  166. map look "dirty". We also add national borders to the plot. The crude
  167. database is heavily decimated and simplified by the DP-routine: The
  168. total file size of the coastlines, rivers, and borders database is only
  169. 283 kbytes. The plot is produced by the script:
  170. .. literalinclude:: /_verbatim/GMT_App_K_1.txt
  171. .. figure:: /_images/GMT_App_K_1.*
  172. :width: 500 px
  173. :align: center
  174. Map using the crude resolution coastline data.
  175. Here, we use the :term:`MAP_ANNOT_OBLIQUE` words to achieve horizontal
  176. annotations and set :term:`MAP_ANNOT_MIN_SPACING` to suppress some
  177. longitudinal annotations near the S pole that otherwise would overprint.
  178. The square box indicates the outline of the next map.
  179. The low resolution (**-Dl**)
  180. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  181. We have now reduced the map area by zooming in on the map center. Now,
  182. the edges of the map are all 2000 km true distance from the projection
  183. center. At this scale we choose the low resolution data that faithfully
  184. reproduce the dominant geographic features in the region. We cut back on
  185. minor features less than 100 km\ :sup:`2` in area. We still add
  186. national borders to the plot. The low database is less decimated and
  187. simplified by the DP-routine: The total file size of the coastlines,
  188. rivers, and borders combined grows to 907 kbytes; it is the default
  189. resolution in GMT. The plot is generated by the script:
  190. .. literalinclude:: /_verbatim/GMT_App_K_2.txt
  191. .. figure:: /_images/GMT_App_K_2.*
  192. :width: 500 px
  193. :align: center
  194. Map using the low resolution coastline data.
  195. The intermediate resolution (**-Di**)
  196. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  197. We continue to zoom in on the map center. In this map, the edges of the
  198. map are all 500 km true distance from the projection center. We abandon
  199. the low resolution data set as it would look too jagged at this scale
  200. and instead employ the intermediate resolution data that faithfully
  201. reproduce the dominant geographic features in the region. This time, we
  202. ignore features less than 20 km\ :sup:`2` in area. Although the script
  203. still asks for national borders none exist within our region. The
  204. intermediate database is moderately decimated and simplified by the
  205. DP-routine: The combined file size of the coastlines, rivers, and
  206. borders now exceeds 3.35 Mbytes. The plot is generated by the script:
  207. .. literalinclude:: /_verbatim/GMT_App_K_3.txt
  208. .. figure:: /_images/GMT_App_K_3.*
  209. :width: 500 px
  210. :align: center
  211. Map using the intermediate resolution coastline data. We have added a compass
  212. rose just because we have the power to do so.
  213. The high resolution (**-Dh**)
  214. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  215. The relentless zooming continues! Now, the edges of the map are all 100
  216. km true distance from the projection center. We step up to the high
  217. resolution data set as it is needed to accurately portray the detailed
  218. geographic features within the region. Because of the small scale we
  219. only ignore features less than 1 km\ :sup:`2` in area. The high
  220. resolution database has undergone minor decimation and simplification by
  221. the DP-routine: The combined file size of the coastlines, rivers, and
  222. borders now swells to 12.3 Mbytes. The map and the final outline box are
  223. generated by these commands:
  224. .. literalinclude:: /_verbatim/GMT_App_K_4.txt
  225. .. figure:: /_images/GMT_App_K_4.*
  226. :width: 500 px
  227. :align: center
  228. Map using the high resolution coastline data.
  229. The full resolution (**-Df**)
  230. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  231. We now arrive at our final plot, which shows a detailed view of the
  232. western side of the small island of Waigeo. The map area is
  233. approximately 40 by 40 km. We call upon the full resolution data set to
  234. portray the richness of geographic detail within this region; no
  235. features are ignored. The full resolution has undergone no decimation
  236. and it shows: The combined file size of the coastlines, rivers, and
  237. borders totals a (once considered hefty) 55.9 Mbytes. Our final map is
  238. reproduced by the single command:
  239. .. literalinclude:: /_verbatim/GMT_App_K_5.txt
  240. .. figure:: /_images/GMT_App_K_5.*
  241. :width: 500 px
  242. :align: center
  243. Map using the full resolution coastline data.
  244. We hope you will study these examples to enable you to make efficient
  245. and wise use of this vast data set.
  246. Footnotes
  247. ~~~~~~~~~
  248. .. [1]
  249. `National Centers for Environmental Information, Boulder, Colorado <http://www.ncei.noaa.gov/>`_
  250. .. [2]
  251. Douglas, D.H., and T. K. Peucker, 1973, Algorithms for the reduction
  252. of the number of points required to represent a digitized line or its
  253. caricature, *Canadian Cartographer*, 10, 112–122.
  254. .. [3]
  255. If you need complete polygons in a simpler format, see the article on
  256. GSHHG (Wessel, P., and W. H. F. Smith, 1996, A Global,
  257. self-consistent, hierarchical, high-resolution shoreline database,
  258. *J. Geophys. Res. 101*, 8741–8743).
Tip!

Press p or to see the previous file or, n or to see the next file

Comments

Loading...