Vector_Library GRASS 6 Vector Architecture

by GRASS Development Team

http://grass.itc.it

Background

Generally, the vector data model is used to describe geographic phenomena which may be represented by geometric entities (primitives) like points, lines, and areas. The GRASS vector data model includes the description of topology, where besides the coordinates describing the location of the points, lines, boundaries and centroids, their spatial relations are also stored. In general, topological GIS require a data structure where the common boundary between two adjacent areas is stored as a single line, simplifying the map maintenance.

Introduction

The GRASS 6 vector format is very similar to the old GRASS 4.x (5.0/5.3) vector format.

This description covers the new GRASS 6 vector library architecture. This new architecture overcomes the vector limitations of GRASS 4.x-5.4.x by extending the vector support with attributes stored in external relational databases, and by new 3D capabilities. Besides the internal file-based storage, the geometry may alternatively be stored in a PostGIS database. This enables users to maintain large data sets with simultaneous write access. External GIS formats such as SHAPE-files may be used directly, without requiring format conversion.

The current implementation includes:

The following vector objects are available:

GRASS vector maps are stored in an arc-node representation consisting of curves called arcs. An arc is stored as a series of x,y,z coordinate tuples. The two endpoints of an arc are called nodes, and two consecutive x,y,z tuples define an arc segment.

The user specifies the type of input to GRASS; GRASS does not decide. The line definition allows multiple types to coexist in the same map. A centroid is assigned to the area it lies within (geometrically): an area is identified by an x,y,z centroid point that is located geometrically inside it and carries a category number (ID). Such centroids are stored in the same binary 'coor' file as the other primitives.

Each element may have zero, one or more categories (cats). Multiple cats are distinguished by field number (field). Single- and multi-category support is implemented at the module level. The z coordinate is optional, and both 2D and 3D files may be written.

Vector Libraries

Besides internal library functions there are two main libraries:

For historical reasons, there are two internal libraries for vector:

The Vlib Vector library was introduced in GRASS 4.0 to hide the formats and structures of the internal vector files. In GRASS 6 everything is accessed via Vect_*() functions, for example:

Old 4.x code:

    xx = Map.Att[Map.Area[area_num].att].x;

New 6.x functions:

    Vect_get_area_centroid()
    Vect_get_centroid_coor()

Introduction to Vlib (Vector library)

Note: For details please read Blazek et al. 2002 (see below) as well as the references in this document.

Directory structure

Directory structure and file names are changed with respect to previous GRASS versions. All vector files for one vector map are stored in one directory:

$MAPSET/vector/vector_name/

This directory contains these files:

coor file format specification

  1. In the coor file the following is stored: 'line' (element) type, number of attributes and layer number for each category.
  2. Coordinates in binary file are stored as double (8 bytes).

Head

Name            Type  Number  Description
Version_Major   C     1
Version_Minor   C     1
Earliest_Major  C     1
Earliest_Minor  C     1
byte_order      C     1       little or big endian flag; files are written in machine native order but files in both little and big endian order may be read
with_z          C     1       2D or 3D flag
size            L     1       coor file size
reserved        C     10      not used

Body

The body consists of line records:

Name           Type  Number  Description
record header  I     1
  • 0. bit : 1 - alive, 0 - dead line
  • 1. bit : 1 - categories, 0 - no categories
  • 2.-3. bit : type - one of: GV_POINT, GV_LINE, GV_BOUNDARY, GV_CENTROID
  • 4.-7. bit : reserved, not used
ncats          C     1       number of categories (written only if categories exist)
field          S     ncats   category identifier, distinguishes between multiple categories attached to one line (written only if categories exist)
cat            I     ncats   category value (written only if categories exist)
ncoor          I     1       written for GV_LINES and GV_BOUNDARIES only
x              D     ncoor
y              D     ncoor
z              D     ncoor   present if with_z in head is set to 1
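
The bit layout of the record header can be illustrated with a small decoder. This is only a sketch: it assumes that "0. bit" means the least significant bit, and the struct and function names are hypothetical, not part of the library. The text does not say which 2-bit code stands for which type, so the code is returned raw.

```c
/* Decoded view of a coor record header; names are illustrative only. */
struct rec_header {
    int alive;      /* bit 0: 1 = alive, 0 = dead line */
    int has_cats;   /* bit 1: 1 = categories, 0 = no categories */
    unsigned type;  /* bits 2-3: raw type code, mapping to GV_* not decoded */
};

/* Assumes bit 0 is the least significant bit of the header value. */
struct rec_header decode_rec_header(unsigned int h)
{
    struct rec_header r;

    r.alive = (int)(h & 0x01u);
    r.has_cats = (int)((h >> 1) & 0x01u);
    r.type = (h >> 2) & 0x03u;   /* bits 4-7 are reserved and ignored */
    return r;
}
```

A reader built along these lines would consult has_cats before trying to read ncats, field and cat, since those entries are written only if categories exist.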

Types used in coor file

Type  Name    Size in Bytes
D     Double  8
L     Long    4
I     Int     4
S     Short   4
C     Char    1

head file format

The file is an unordered list of key/value entries. The key is a string separated from value by a colon and optional whitespace. Key words are:

ORGANIZATION
DIGIT DATE
DIGIT NAME
MAP NAME
MAP DATE
MAP SCALE
OTHER INFO
ZONE
MAP THRESH
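
Since keys such as "MAP SCALE" contain spaces, a reader has to split each entry at the colon rather than at whitespace. The following is a minimal sketch of that step; split_head_line() is a hypothetical helper, not a library function, and it assumes that the first colon on the line is the separator.

```c
#include <ctype.h>
#include <string.h>

/*
 * Split one "KEY: value" line in place. Returns 0 and sets *key and
 * *value on success, -1 if the line contains no colon.
 */
int split_head_line(char *line, char **key, char **value)
{
    char *colon = strchr(line, ':');
    char *end;

    if (colon == NULL)
        return -1;
    *colon = '\0';
    *key = line;

    /* Skip the optional whitespace after the colon. */
    *value = colon + 1;
    while (**value != '\0' && isspace((unsigned char)**value))
        (*value)++;

    /* Trim a trailing newline and any other trailing whitespace. */
    end = *value + strlen(*value);
    while (end > *value && isspace((unsigned char)end[-1]))
        *--end = '\0';
    return 0;
}
```

Because the list is unordered, a caller would compare the returned key against the key words above instead of relying on line position.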

Vector library Topology management

Topology general characteristics:
  1. geometry and attributes are stored separately (do not read both unless it is necessary; usually it is not)
  2. the format is topological (areas are built from boundaries)

Topology is written for native format while pseudo-topology is written for OGR sources, SHAPE-link.

topo file format

[detailed docs please insert!]

/* Vector types used in memory on run time - may change */
#define GV_POINT      0x01
#define GV_LINE       0x02
#define GV_BOUNDARY   0x04
#define GV_CENTROID   0x08
#define GV_FACE       0x10
#define GV_KERNEL     0x20
#define GV_AREA       0x40
#define GV_VOLUME     0x80

#define GV_POINTS (GV_POINT | GV_CENTROID )
#define GV_LINES (GV_LINE | GV_BOUNDARY )

Face and kernel are 3D equivalents of boundary and centroid, but there is no support (yet) for 3D topology (volumes). Faces are used in a couple of modules including NVIZ to visualize 3D buildings and other volumetric figures.
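
Because each type constant above is a single bit, a set of types can be held in one integer and tested with a bitwise AND. A short sketch, with the values copied from the definitions above; type_matches() is a hypothetical helper, not a library function.

```c
/* Values copied from the definitions above. */
#define GV_POINT      0x01
#define GV_LINE       0x02
#define GV_BOUNDARY   0x04
#define GV_CENTROID   0x08

#define GV_POINTS (GV_POINT | GV_CENTROID)
#define GV_LINES  (GV_LINE | GV_BOUNDARY)

/* Nonzero if a feature of `type` is selected by the (possibly combined) mask. */
int type_matches(int type, int mask)
{
    return (type & mask) != 0;
}
```

For instance, type_matches(GV_BOUNDARY, GV_LINES) is true, while type_matches(GV_CENTROID, GV_LINES) is false.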

/* Topology level details */
#define GV_BUILD_NONE  0
#define GV_BUILD_BASE  1
#define GV_BUILD_AREAS  2
#define GV_BUILD_ATTACH_ISLES 3  /* Attach islands to areas */
#define GV_BUILD_CENTROIDS 4 /* Assign centroids to areas */
#define GV_BUILD_ALL GV_BUILD_CENTROIDS
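
The numbering of the build levels suggests that they are cumulative, with each level including the work of the lower ones. Under that assumption (it is not stated explicitly here), a module could test whether a given level has been reached with a plain comparison. build_level_reached() is a hypothetical helper.

```c
/* Values copied from the definitions above. */
#define GV_BUILD_NONE         0
#define GV_BUILD_BASE         1
#define GV_BUILD_AREAS        2
#define GV_BUILD_ATTACH_ISLES 3
#define GV_BUILD_CENTROIDS    4
#define GV_BUILD_ALL          GV_BUILD_CENTROIDS

/*
 * Nonzero if topology built up to level `built` covers level `needed`.
 * Assumes the levels are cumulative.
 */
int build_level_reached(int built, int needed)
{
    return built >= needed;
}
```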

GV_BOUNDARY contains geometry and it is used to build areas. GV_LINE cannot form an area.

struct line_cats
{
      int *field;	/* pointer to array of fields */
      int *cat;		/* pointer to array of categories */
      int n_cats;	/* number of vector categories attached to element */
      int alloc_cats;	/* allocated space */
};
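
The two arrays in struct line_cats are parallel: entry i of field and entry i of cat together form one field/category pair. The sketch below looks up the first category stored for a given field. The struct is repeated so that the example is self-contained, and first_cat_in_field() is a hypothetical stand-in; real code would use the library's own accessors such as Vect_cat_get().

```c
#include <stddef.h>

struct line_cats {
    int *field;      /* pointer to array of fields */
    int *cat;        /* pointer to array of categories */
    int n_cats;      /* number of vector categories attached to element */
    int alloc_cats;  /* allocated space */
};

/*
 * Find the first category stored for `field`. Returns 1 and writes the
 * category to *cat (if cat is not NULL) when found, 0 otherwise.
 */
int first_cat_in_field(const struct line_cats *cats, int field, int *cat)
{
    int i;

    for (i = 0; i < cats->n_cats; i++) {
        if (cats->field[i] == field) {
            if (cat != NULL)
                *cat = cats->cat[i];
            return 1;
        }
    }
    return 0;
}
```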

Topology Example 1:

A polygon may be formed by several connected boundaries (multiple primitives). One boundary is shared by adjacent areas.

+--1--+--5--+
|     |     |
2  A  4  B  6
|     |     |
+--3--+--7--+

1,2,3,4,5,6,7 = 7 boundaries (primitives)
A,B = 2 areas
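
The figure can be written down as two lists of boundary ids, which makes the shared boundary explicit. This is only an illustration of the idea; the arrays and count_shared() are hypothetical and do not reflect how the library stores areas.

```c
/* Boundary ids of the two areas in the figure above. */
static const int area_a[] = {1, 2, 3, 4};
static const int area_b[] = {4, 5, 6, 7};

/* Count the boundary ids that occur in both lists. */
int count_shared(const int *a, int na, const int *b, int nb)
{
    int i, j, n = 0;

    for (i = 0; i < na; i++)
        for (j = 0; j < nb; j++)
            if (a[i] == b[j])
                n++;
    return n;
}
```

Only boundary 4 appears in both lists, so the common edge of A and B is stored once.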

Topology Example 2:

This is handled correctly in GRASS: A can be filled, B filled differently.

+---------+
|    A    |
+-----+   |
|  B  |   |
+-----+   |
|         |
+---------+

In GRASS, whenever an 'inner' ring touches the boundary of an outside area, even at a single point, it is no longer an 'inner' ring; it is simply another area. A and B above can never be exported from GRASS as polygon A with inner ring B, because there are only 2 areas A and B and no island.

Topology Example 3:

v.in.ogr/v.clean can identify dangles and change their type from boundary to line (in TIGER data, for example). The distinction between line and boundary is important not only for dangles. Example:

+-----+-----+
|     .     |
|     .     |
+.....+.....+
|     .     |
|  x  .     |
+-----+-----+

----  road + boundary of one parcel => type boundary
....  road => type line
x     parcel centroid (identifies whole area)

Because lines are not used to build areas, we have only one area/centroid, instead of 4 which would be necessary in TIGER.

Topology memory management

Topology is generated for all kinds of vector types. Memory is not released by default. The programmer can force the library to release the memory by calling Vect_set_release_support(). However, the programmer cannot call Vect_set_release_support() in mid-process, because all vectors are needed in the spatial index, which in turn is needed to build topology.

Topology is also necessary for points in case of a vector network because the graph is built using topology information about lines and points.

The topology structure does not only store the topology but also the 'line' bounding box and line offset in coor file (index). The existing spatial index is using line ID in 'topology' structure to identify lines in 'coor' file. Currently it is not possible to build spatial index without topology.

Vector library spatial index management

Spatial index (based on R-tree) is generated on the fly.

Spatial index occupies a lot of memory but it is necessary for topology building. Also, it takes a long time to release the memory occupied by spatial index (dig_spidx_free()).

The function building topology (Vect_build()) is usually called at the end of a module (before Vect_close()), so it is faster to call exit() and let the operating system release all the memory. By default the memory is not released.

It is possible to call Vect_set_release_support() before Vect_close() to enforce memory release, but it takes a long time on large files.

Currently most of the modules do not release the memory occupied for spatial index and work like this (pseudocode):

int main()
{
     Vect_open_new()
     //writing new vector

     Vect_build()
     Vect_close()  // memory is not released
}

In general it is possible to free the memory with Vect_set_release_support() such as:

int main()
{
     Vect_open_new()
     // writing new vector

     Vect_build()
     Vect_set_release_support()
     Vect_close()  // memory is released
}

but it takes longer.

It makes sense to release the spatial index if it is used only at the beginning of a module, or in permanently running programs like QGIS. For example:

int main()
{
     Vect_open_old()
     // select features using spatial index, e.g.  Vect_select_lines_by_box()
     Vect_set_release_support()
     Vect_close()  // memory is released

     // do some processing which needs memory
}

Vector library categories and layers

Note: "layer" was called "field" in earlier versions.

In GRASS a "category" is a feature ID used to link geometry with attributes stored in one or many (external) database table(s). Each vector feature inside a vector map has zero, one or more <layer,category> tuple(s). A user may (but need not) create attribute tables, which are referenced by the layer, with rows that are essentially referenced by the <layer,category> pair.

Vector TINs

TINs are simply created as 2D/3D vector polygons consisting of 3 vertices.

Vector library and Attributes

Note: "layer" was called "field" in earlier versions.

The old GRASS 4.x 'dig_cats' files are no longer used; vector attributes are stored in an external database. The connection to the database is made through drivers based on the DBMI library (odbc, dbf, PostgreSQL and MySQL drivers are available at this time). Records in a table are linked to vector entities by field and category number: the field identifies the table and the category identifies the record. That is, for any unique combination map+mapset+field+category, there exists one unique combination driver+database+table+row.

For each map + field pair, the table, key column, database and driver must all be defined. This definition must be written to the $MAPSET/DB text file. Each row in the DB file contains names separated by spaces in the following order (items in [] are optional):

map[@mapset] field table [key [database [driver]]]

If key, database or driver are omitted (allowed on the second and later rows only), the last definition is used. Definitions from the DB file in other mapsets may be overridden by a definition in the current mapset if the mapset is specified with the map name.
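
The row format and the fallback rule can be sketched with sscanf(). The struct db_rule type, its buffer sizes and parse_db_row() are hypothetical and only illustrate the rule; the sketch also assumes that no name contains whitespace. The caller passes in the values of the previous row, so trailing items that are omitted keep the last definition.

```c
#include <stdio.h>

/* One rule from the DB file; the struct and its sizes are illustrative. */
struct db_rule {
    char map[128];
    int field;
    char table[128];
    char key[128];
    char database[256];
    char driver[64];
};

/*
 * Parse one row into *rule. On entry *rule holds the previous row, so an
 * omitted key, database or driver keeps the last definition. Returns the
 * number of items read, or -1 if map, field or table is missing.
 */
int parse_db_row(const char *row, struct db_rule *rule)
{
    struct db_rule r = *rule;   /* start from the previous definition */
    int n = sscanf(row, "%127s %d %127s %127s %255s %63s",
                   r.map, &r.field, r.table, r.key, r.database, r.driver);

    if (n < 3)
        return -1;              /* leave *rule untouched on error */
    *rule = r;
    return n;
}
```

Parsing "trans* 1 roads key basedb odbc" followed by "trans* 5 rails" with the same rule variable leaves key, database and driver at "key", "basedb" and "odbc" for the second row.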

Wild cards * and ? may be used in map and mapset names.
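
The exact matching rules are not spelled out here. Assuming the usual shell-style meaning, where * matches any run of characters (including none) and ? matches exactly one character, matching could be sketched as follows. wild_match() is a hypothetical helper, not a library function.

```c
/*
 * Return 1 if `name` matches `pattern`, 0 otherwise. Assumes shell-style
 * wild cards: '*' matches any run of characters, '?' exactly one.
 */
int wild_match(const char *pattern, const char *name)
{
    if (*pattern == '\0')
        return *name == '\0';
    if (*pattern == '*') {
        /* Let '*' match nothing, or consume one character and retry. */
        return wild_match(pattern + 1, name) ||
               (*name != '\0' && wild_match(pattern, name + 1));
    }
    if (*name != '\0' && (*pattern == '?' || *pattern == *name))
        return wild_match(pattern + 1, name + 1);
    return 0;
}
```

With these rules the pattern water* matches water1 and water2, as well as the bare name water.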

Variables $GISDBASE, $LOCATION, $MAPSET, $MAP, $FIELD may be used in table, key, database and driver names. Note that $MAPSET is not the current mapset but mapset of the map the rule is defined for.

Note that features in GRASS vectors may have attributes in different tables or may be without attributes. Boundaries form areas but it may happen that some boundaries are not closed (such boundaries would not appear in polygon layer). Boundaries may have attributes. All types may be mixed in one vector.

The link to the table is permanent and is stored in the 'dbln' file in the vector directory. Tables are considered to be part of the vector; g.remove, for example, deletes the linked tables of the vector. Attributes must be joined with geometry.

Examples: Examples are written mostly for the dbf driver, where database is full path to the directory with dbf files and table name is the name of dbf file without .dbf extension.

* 1 tbl id $GISDBASE/$LOCATION/$MAPSET/vector/$MAP dbf
This definition says that entities with a category in field 1 are linked to dbf tables named tbl.dbf, saved in the vector directory of each map.

* 1 $MAP id $GISDBASE/$LOCATION/$MAPSET/dbf dbf
Similar to the above, but all dbf files are in one directory dbf/ in the mapset and the dbf files are named $MAP.dbf.

water* 1 rivers id /home/grass/dbf dbf
water* 2 lakes lakeid /home/guser/mydb
trans* 1 roads key basedb odbc
trans* 5 rails
These definitions define several fields for one map, i.e. one map may contain features linked to several tables. The definitions on the first 2 rows apply, for example, to maps water1, water2, ..., so that several maps may share one table.

water@PERMANENT 1 myrivers id /home/guser/mydbf dbf
This definition overrides the definition saved in PERMANENT/DB and links the water map from the PERMANENT mapset to the user's table.

Modules should be written so that connections to databases for each vector field are independent. It should be possible to read attributes of an input map from one database and write them to another, even using a different driver (this should not be a problem).

There are open questions, however. For one, how does one distinguish when new tables should be written and when not? For example, definitions:

river 1 river id water odbc
river.backup* 1 NONE
could be used to say that tables should not be copied for backups of the map river, because the table is stored in a reliable RDBMS.

DGLib (Directed Graph Library)

The Directed Graph Library or DGLib (Micarelli 2002, Directed Graph Library, http://grass.itc.it/dglib/) provides functionality for vector network analysis. The library is released under the GPL and is hosted by the GRASS project (in the CVS server within the GRASS source code). As a stand-alone library it may also be used by other software projects.

The Directed Graph Library provides functionality to assign costs to lines and/or nodes, so that costs can be accumulated while traveling along polylines. The user can assign individual costs to all lines and/or nodes of a vector map and later calculate shortest path connections based on the accumulated costs. Applications include transport analysis, connectivity analysis and more. Implemented applications cover shortest path, traveling salesman (round trip), allocation of sources (creation of subnetworks), minimum Steiner trees (star-like connections), and iso-distances (from centers).
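
To make the idea of accumulated costs concrete, the following sketch computes the cheapest path cost over a tiny cost matrix using Dijkstra's algorithm. It is a generic textbook illustration and not DGLib's API or implementation: N_NODES, NO_EDGE and shortest_cost() are hypothetical, node costs are not modelled, and arc costs are assumed to be positive integers.

```c
#include <limits.h>

#define N_NODES 4
#define NO_EDGE 0   /* cost[i][j] == NO_EDGE means there is no arc i -> j */

/*
 * Cost of the cheapest path from src to dst, or -1 if dst cannot be
 * reached. Arc costs must be positive.
 */
int shortest_cost(int cost[N_NODES][N_NODES], int src, int dst)
{
    int dist[N_NODES], done[N_NODES] = {0};
    int i, step;

    for (i = 0; i < N_NODES; i++)
        dist[i] = INT_MAX;
    dist[src] = 0;

    for (step = 0; step < N_NODES; step++) {
        int u = -1;

        /* Pick the unfinished node with the smallest accumulated cost. */
        for (i = 0; i < N_NODES; i++)
            if (!done[i] && dist[i] != INT_MAX && (u < 0 || dist[i] < dist[u]))
                u = i;
        if (u < 0)
            break;              /* the remaining nodes are unreachable */
        done[u] = 1;

        /* Relax every arc leaving u. */
        for (i = 0; i < N_NODES; i++)
            if (cost[u][i] != NO_EDGE && dist[u] + cost[u][i] < dist[i])
                dist[i] = dist[u] + cost[u][i];
    }
    return dist[dst] == INT_MAX ? -1 : dist[dst];
}
```

The matrix is directed, so an arc can be given different costs for the two directions of travel.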

For details, please read Blazek et al. 2002 (see below).

Related vector functions are: Vect_graph_add_edge(), Vect_graph_init(), Vect_graph_set_node_costs(), Vect_graph_shortest_path(), Vect_net_build_graph(), Vect_net_nearest_nodes(), Vect_net_shortest_path(), and Vect_net_shortest_path_coor().

Vector ASCII Format Specifications

The ASCII format is (currently) explained in the manual page of v.in.ascii, which is defined in the file:

vector/v.in.ascii/description.html

Vector modules and their parameters/flags

See also grass5/documents/parameter_proposal.txt

A module is a GRASS command invoked by the user.

Modules operation

Each module which modifies and writes data must read from input= and write to output= so that data cannot be lost. For example, v.spag in GRASS 5.0 works on map=, so if the program (or system) crashed, or the threshold was specified incorrectly, and the vector had not been backed up, the data were lost. In this case the map= option should be replaced by input= and output=.

Topology is always built by default if the coor file was modified.

Dimensionality is generally kept. Input 2D vector is written as 2D, 3D as 3D. There are a few modules which change the dimension on purpose.

Modules parameters/flags

--o overwrite existing files
-b do not build topo file; by default topo file is written
-q quiet
-v run verbosely [either -q or -v!]
-t create new table, default ???
-u don't create new table ???
-z write 3D file (if input was 2D)

map= input vector for modules without output
input= input vector
output= output vector
type= type of elements: point,line,boundary,centroid,area
cat= category or category list (example: 1,5,9-13,35)
layer= layer number
where= condition of SQL statement for selection of records
col= column name (in external table)
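
The cat= syntax shown above mixes single values and ranges. As a rough sketch of how such a list can be evaluated, the hypothetical helper below tests whether a category is covered by a list string. It assumes non-negative categories and no embedded whitespace. The library has its own functions for this purpose, such as Vect_str_to_cat_list() and Vect_cat_in_cat_list().

```c
#include <stdlib.h>

/*
 * Return 1 if `cat` is covered by a list such as "1,5,9-13,35", else 0.
 * The scan stops at the first malformed item.
 */
int cat_in_list(const char *list, int cat)
{
    const char *p = list;

    while (*p != '\0') {
        char *end;
        long lo = strtol(p, &end, 10);
        long hi = lo;

        if (end == p)
            return 0;           /* no number where one was expected */
        if (*end == '-') {      /* a range "lo-hi" */
            p = end + 1;
            hi = strtol(p, &end, 10);
            if (end == p)
                return 0;
        }
        if (cat >= lo && cat <= hi)
            return 1;
        if (*end != ',')
            break;              /* end of list or unexpected character */
        p = end + 1;
    }
    return 0;
}
```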

List of vector library functions

The Vect_*() functions are the programmer's API for GRASS vector programming.

Vector area functions

Vect_get_area_area();

Vect_get_area_boundaries();

Vect_get_area_centroid();

Vect_get_area_isle();

Vect_get_area_num_isles();

Vect_get_area_points();

Vect_get_isle_area();

Vect_get_isle_boundaries();

Vect_get_isle_points();

Vect_point_in_area();

Vector array functions

Vect_new_varray();

Vect_set_varray_from_cat_list();

Vect_set_varray_from_cat_string();

Vect_set_varray_from_db();

Vector box functions

Vect_box_copy();

Vect_box_extend();

Vect_box_overlap();

Vect_get_area_box();

Vect_get_isle_box();

Vect_get_line_box();

Vect_get_map_box();

Vect_point_in_box();

Vect_region_box();

Vector break lines functions

Vect_break_lines();

Vector break_polygons functions

Vect_break_polygons();

Vector bridges functions

Vect_remove_bridges();

Vector buffer functions

Vect_line_buffer();

Vect_line_parallel();

Vector build functions

Vect_build();

Vect_build_partial();

Vect_get_built();

Vect_save_spatial_index();

Vect_save_topo();

Vect_spatial_index_dump();

Vect_topo_dump();

Vector build_nat functions

Vect_attach_centroids();

Vect_attach_isle();

Vect_attach_isles();

Vect_build_line_area();

Vect_build_nat();

Vect_isle_find_area();

Vector build_ogr functions

Vect_build_ogr();

Vector cats functions

Vect_array_to_cat_list();

Vect_cat_del();

Vect_cat_get();

Vect_cat_in_array();

Vect_cat_in_cat_list();

Vect_cat_set();

Vect_destroy_cat_list();

Vect_destroy_cats_struct();

Vect_field_cat_del();

Vect_new_cat_list();

Vect_new_cats_struct();

Vect_reset_cats();

Vect_str_to_cat_list();

Vector cindex functions

Vect_cidx_dump();

Vect_cidx_find_next();

Vect_cidx_get_cat_by_index();

Vect_cidx_get_field_index();

Vect_cidx_get_field_number();

Vect_cidx_get_num_cats_by_index();

Vect_cidx_get_num_fields();

Vect_cidx_get_num_types_by_index();

Vect_cidx_get_num_unique_cats_by_index();

Vect_cidx_get_type_count();

Vect_cidx_get_type_count_by_index();

Vect_cidx_open();

Vect_cidx_save();

Vector clean_nodes functions

Vect_clean_small_angles_at_nodes();

Vector close functions

Vect_close();

Vector constraint functions

Vect_get_constraint_box();

Vect_remove_constraints();

Vect_set_constraint_region();

Vect_set_constraint_type();

Vector dangles functions

Vect_chtype_dangles();

Vect_remove_dangles();

Vector dbcolumns functions

Vect_get_column_names();

Vect_get_column_names_types();

Vect_get_column_types();

Vector error functions

Vect_get_fatal_error();

Vect_set_fatal_error();

Vector field functions

Vect_add_dblink();

Vect_check_dblink();

Vect_default_field_info();

Vect_get_dblink();

Vect_get_field();

Vect_map_add_dblink();

Vect_map_check_dblink();

Vect_map_del_dblink();

Vect_new_dblinks_struct();

Vect_read_dblinks();

Vect_reset_dblinks();

Vect_subst_var();

Vect_write_dblinks();

Vector find functions

Vect_find_area();

Vect_find_island();

Vect_find_line();

Vect_find_node();

Vector graph functions

Vect_graph_add_edge();

Vect_graph_init();

Vect_graph_set_node_costs();

Vect_graph_shortest_path();

Vector header functions

Vect_get_comment();

Vect_get_date();

Vect_get_full_name();

Vect_get_map_date();

Vect_get_map_name();

Vect_get_mapset();

Vect_get_name();

Vect_get_organization();

Vect_get_person();

Vect_get_proj();

Vect_get_proj_name();

Vect_get_scale();

Vect_get_zone();

Vect_is_3d();

Vect_print_header();

Vect_set_comment();

Vect_set_date();

Vect_set_map_date();

Vect_set_map_name();

Vect_set_organization();

Vect_set_person();

Vect_set_scale();

Vect_set_thresh();

Vect_set_zone();

Vector hist functions

Vect_hist_command();

Vect_hist_copy();

Vect_hist_read();

Vect_hist_rewind();

Vect_hist_write();

Vector init_head functions

Vect_copy_head_data();

Vector intersect functions

Vect_line_check_intersection();

Vect_segment_intersection();

Vector legal_vname functions

Vect_check_input_output_name();

Vector level functions

Vect_level();

Vector level_two (topological) functions

Vect_get_centroid_area();

Vect_get_line_areas();

Vect_get_line_nodes();

Vect_get_node_coor();

Vect_get_node_line();

Vect_get_node_line_angle();

Vect_get_node_n_lines();

Vect_get_num_areas();

Vect_get_num_dblinks();

Vect_get_num_islands();

Vect_get_num_lines();

Vect_get_num_nodes();

Vect_get_num_primitives();

Vect_get_num_updated_lines();

Vect_get_num_updated_nodes();

Vect_get_updated_line();

Vect_get_updated_node();

Vector line functions

Vect_append_point();

Vect_append_points();

Vect_copy_pnts_to_xyz();

Vect_copy_xyz_to_pnts();

Vect_destroy_line_struct();

Vect_line_box();

Vect_line_delete_point();

Vect_line_distance();

Vect_line_geodesic_length();

Vect_line_insert_point();

Vect_line_length();

Vect_line_prune();

Vect_line_prune_thresh();

Vect_line_reverse();

Vect_line_segment();

Vect_new_line_struct();

Vect_point_on_line();

Vect_points_distance();

Vect_reset_line();

Vector list functions

Vect_destroy_list();

Vect_list_append();

Vect_list_append_list();

Vect_list_delete();

Vect_list_delete_list();

Vect_reset_list();

Vect_val_in_list();

Vector map functions

Vect_copy();

Vect_copy_map_lines();

Vect_copy_table();

Vect_copy_table_by_cats();

Vect_copy_tables();

Vect_delete();

Vect_rename();

Vector net functions

Vect_net_build_graph();

Vect_net_nearest_nodes();

Vect_net_shortest_path();

Vect_net_shortest_path_coor();

Vector open functions

Vect_coor_info();

Vect_maptype_info();

Vect_open_new();

Vect__open_old();

Vect_open_old();

Vect_open_old_head();

Vect_open_spatial_index();

Vect_open_topo();

Vect_open_update();

Vect_open_update_head();

Vect_set_open_level();

Vector overlay functions

Vect_overlay();

Vect_overlay_and();

Vect_overlay_str_to_operator();

Vector poly functions

Vect_find_poly_centroid();

Vect_get_point_in_area();

Vect_get_point_in_poly();

Vect_get_point_in_poly_isl();

Vector read functions

Vect_area_alive();

Vect_isle_alive();

Vect_line_alive();

Vect_node_alive();

Vect_read_line();

Vect_read_next_line();

Vector remove_areas functions

Vect_remove_small_areas();

Vector remove_duplicates functions

Vect_remove_duplicates();

Vector rewind functions

Vect_rewind();

Vector select functions

Vect_select_areas_by_box();

Vect_select_areas_by_polygon();

Vect_select_isles_by_box();

Vect_select_lines_by_box();

Vect_select_lines_by_polygon();

Vect_select_nodes_by_box();

Vector spatial index functions

Vect_spatial_index_add_item();

Vect_spatial_index_del_item();

Vect_spatial_index_destroy();

Vect_spatial_index_init();

Vect_spatial_index_select();

Vector snap functions

Vect_snap_lines();

Vector tin functions

Vect_tin_get_z();

Vector type functions

Vect_option_to_types();

Vector write functions

Vect_rewrite_line();

Vect_write_line();

Contacts

Radim Blazek (vector architecture) <ramin.blazek@gmail.com>

Roberto Micarelli (DGLib) <mi.ro@iol.it>

References

Text based on: R. Blazek, M. Neteler, and R. Micarelli. The new GRASS 5.1 vector architecture. In Open source GIS - GRASS users conference 2002, Trento, Italy, 11-13 September 2002. University of Trento, Italy, 2002. http://www.ing.unitn.it/~grass/conferences/GRASS2002/proceedings/proceedings/pdfs/Blazek_Radim.pdf

See Also

DBMI - Database Management Interface: DataBase_Management_Interface

Last change: 2007/02/04 08:38:21