Transportation engineers, like many other professionals who have used aerial photography, generally perceive that high resolution remote sensing images from space-based systems, at resolutions from less than 10 meters down to submeter, would enhance their ability to accomplish work-related tasks. Although the resolution of satellite imagery is not comparable to that of aerial photography, the fact that these images can be acquired repeatedly and in a timely manner is a clear advantage. Furthermore, because the images are digital and multispectral, they can be manipulated mathematically and classified in a systematic and repeatable procedure. It is imperative to have a clear idea of what information is to be extracted from the imagery so that these goals can be matched to the appropriate data set. Each image data set has its advantages and disadvantages. Classification of high resolution imagery is not as straightforward as more traditional classification of lower resolution imagery (30 m or coarser).
The challenges of classifying high resolution imagery stem primarily from two attributes. First, the classification system must be vastly more detailed than traditional classification systems because the spectral response of each pixel is associated with very specific earth and man-made materials. There is no inherent generalization as there is in low resolution imagery. Typically, we are not interested in such high levels of classification detail, and some degree of generalization is desired in the end. Second, while the small pixel size dictates a fairly high degree of classification specificity, the spectral resolution of high resolution sensors is limited to only four bands, which limits one's ability to discriminate among materials. Such low spectral resolution makes it difficult to distinguish many earth and man-made materials. Thus, a careful balance must be sought between the number of classes defined and the ability to discriminate among them. Failing to strike this balance can produce classes with such broad spectral characteristics that they cannot reasonably be distinguished from other classes. In the end, we may require several classes of asphalt road, yet still find it rather difficult to distinguish some of these road classes from the rooftops of structures that are composed of very similar materials.
In this activity, imagery over the same target from three different sensing systems is classified to illustrate the different capabilities of data from these systems. These data also illustrate some of the challenges one faces in classifying high resolution imagery of an urban environment. We classify a portion of a Landsat ETM+ scene, an Advanced Thermal Land Applications Sensor (ATLAS) scene, and a QuickBird-2 scene. The acquisition dates and resolutions of these data sets are given in the table below.
Table 1. Specifications for imagery used in this activity.
The Landsat series of satellites provides one of the most extensive and continuous terrestrial imagery archives. Since the beginning of the Landsat program in 1972, data have been acquired from three different generations of sensors: the Multispectral Scanner (MSS), the Thematic Mapper (TM) and the Enhanced Thematic Mapper Plus (ETM+). Landsat imagery of the globe is subset and marketed as a patchwork of individual scenes identified by a path and row designation. The MSS was a 4-channel sensor that is no longer operational; however, archived historical data are available. The TM and ETM+ are 7-channel multispectral sensors. ETM+ also includes a broadband panchromatic channel. The visible and infrared channels of TM and ETM+ are nominally 28.5 m resolution, whereas the thermal channel is 120 m resolution on TM and 60 m resolution on ETM+.
ATLAS was flown on board a Lear 23 jet aircraft operated by NASA Stennis Space Center. ATLAS is a 15-channel multispectral scanner that incorporates the bands of the Landsat TM and ETM+ along with additional bands in the middle reflective infrared and thermal infrared range. ATLAS data were collected at an altitude of approximately 5,032 m above mean terrain, which resulted in an image spatial resolution of 10 m. Images from each north-south trending flight line were rectified with approximately 100 ground control points each.
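The link between flying height and pixel size can be checked with a short calculation. The sketch below uses the small-angle approximation (nadir pixel size is roughly altitude times the instantaneous field of view). The altitude comes from the text; the 2.0 mrad IFOV is an assumption introduced here for illustration and is not stated in this document.

```python
def ground_sample_distance(altitude_m: float, ifov_mrad: float) -> float:
    """Approximate nadir pixel size in meters: altitude times IFOV (small-angle approximation)."""
    return altitude_m * ifov_mrad * 1e-3  # convert milliradians to radians


ATLAS_ALTITUDE_M = 5032.0   # flying height above mean terrain, from the text
ASSUMED_IFOV_MRAD = 2.0     # assumption for illustration, not given in the text

gsd_m = ground_sample_distance(ATLAS_ALTITUDE_M, ASSUMED_IFOV_MRAD)
print(f"Approximate nadir pixel size: {gsd_m:.2f} m")
```

Under that assumed IFOV, the result is consistent with the 10 m resolution quoted above. Pixels away from nadir would be larger, which this sketch ignores.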
In the United States, high resolution satellites reside in the commercial sector. Examples include the IKONOS instrument owned and operated by Space Imaging, Orbview-3, operated by ORBIMAGE, and the QuickBird-2 instrument operated by DigitalGlobe. The IKONOS-2 satellite was launched in September 1999 and has been delivering commercial data since early 2000. IKONOS is the first of the next generation of high spatial resolution satellites. IKONOS records 4 channels of multispectral data at 4 meter resolution and one panchromatic channel at 1 meter resolution. Orbview-3, launched in June 2003, has characteristics similar to those of IKONOS. ORBIMAGE is planning to launch a higher resolution satellite in 2007. QuickBird-2, launched in October 2001, has four channels with spectral properties equivalent to those of Landsat's visible and near infrared channels 1 to 4. These data are acquired at 2.4 m resolution, the highest of all civilian satellite-based sensors to date. In addition, QuickBird has a panchromatic sensor with 0.61 m resolution. QuickBird-2 data are used in this activity.
Segmentation is an important pre-processing step before attempting to classify imagery. It is a subjective process whereby individual pixels are grouped with adjacent ones based on spectral similarity as well as shape criteria. The image analyst defines a scale parameter and three homogeneity criteria: shape factor, compactness and smoothness. All of these parameters are varied iteratively until the segmentation process yields a grouping of pixels that is meaningful to the analyst. Segmentation reduces image processing time because the mean of each segment object is used in subsequent analysis and there are fewer objects than pixels to process. In addition, the homogeneity criteria serve as a first order filter on noise within a class, because subsequent processing is based on the mean of the group of pixels that make up a segment rather than on individual pixel values. At the finest scale, these objects may represent discrete features, such as a house or car, or objects composed of the same materials, such as asphalt, tree canopy or grass lawns.
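The idea that later processing works on segment means rather than on pixels can be sketched in a few lines. This is a minimal illustration, assuming a label image in which every pixel carries an integer segment ID from 0 to n-1 and every ID occurs at least once; the function name and the toy data are hypothetical and do not come from the software used in this activity.

```python
import numpy as np


def segment_means(image: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Return an (n_segments, n_bands) array holding the mean of each band per segment.

    image  : (rows, cols, bands) array of pixel values
    labels : (rows, cols) array of integer segment IDs, 0..n_segments-1, all present
    """
    flat_labels = labels.ravel()
    counts = np.bincount(flat_labels)            # number of pixels in each segment
    n_bands = image.shape[2]
    means = np.empty((counts.size, n_bands))
    for band in range(n_bands):
        sums = np.bincount(flat_labels, weights=image[:, :, band].ravel())
        means[:, band] = sums / counts
    return means


# Toy 2 x 3 image with two bands and two segments (values are made up).
toy_image = np.array([[[10, 1], [12, 1], [50, 5]],
                      [[14, 1], [52, 5], [54, 5]]], dtype=float)
toy_labels = np.array([[0, 0, 1],
                       [0, 1, 1]])

toy_means = segment_means(toy_image, toy_labels)
print(toy_means)
```

Six pixels collapse to two objects here; on a real scene the reduction is far larger, which is where the savings in processing time and the noise filtering come from.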
eCognition Professional (v. 4.2) was used for segmentation and classification in this activity. Figure 1 is an example of a software dialog box that allows the image analyst to specify scale and homogeneity criteria. In addition, each image band can be weighted individually according to its importance in influencing the outcome of the segmentation process. Here, it would be desirable to reduce the weight of relatively noisy bands or bands that have little spectral contrast. On the other hand, a band with noise may help to differentiate classes, and therefore be desirable, if the noise is restricted to particular classes.
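To show how band weights can steer a segmentation, the sketch below computes one plausible merge cost: the band-weighted increase in size-weighted standard deviation when two candidate segments are merged. This formulation is an assumption chosen for illustration; it is not presented as the calculation that eCognition performs, and the function name and sample values are hypothetical.

```python
import numpy as np


def weighted_merge_cost(seg_a, seg_b, band_weights) -> float:
    """Band-weighted increase in size-weighted standard deviation caused by merging.

    seg_a, seg_b : (n_pixels, n_bands) arrays of pixel values for two segments
    band_weights : one weight per band; a weight of 0 makes a band irrelevant
    """
    a = np.asarray(seg_a, dtype=float)
    b = np.asarray(seg_b, dtype=float)
    w = np.asarray(band_weights, dtype=float)
    merged = np.vstack([a, b])
    before = len(a) * a.std(axis=0) + len(b) * b.std(axis=0)
    after = len(merged) * merged.std(axis=0)
    return float(np.sum(w * (after - before)))


# Band 0 separates the two segments; band 1 carries the same noise in both.
seg_a = [[10, 5], [10, 7]]
seg_b = [[20, 5], [20, 7]]

cost_equal = weighted_merge_cost(seg_a, seg_b, [1.0, 1.0])
cost_band0_only = weighted_merge_cost(seg_a, seg_b, [1.0, 0.0])
cost_band1_only = weighted_merge_cost(seg_a, seg_b, [0.0, 1.0])
print(cost_equal, cost_band0_only, cost_band1_only)
```

In this toy case the noisy band contributes nothing to the cost, so down-weighting it changes nothing, whereas down-weighting the contrasting band would let the two segments merge. A low cost relative to the scale parameter would favor merging in a scheme of this kind.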
The notion of image "objects" is by its very nature a function of scale and thus highly dependent on image resolution. After all, objects are collections of pixels that differ from their surroundings. In traditional classification with coarse resolution imagery (30 m to 1 km), we tend to think of objects as equivalent to classes, such as woodland, water or cropland. As resolution increases, objects begin to diverge from classes. For example, objects might be individual buildings which, taken collectively, might be classified as a commercial district. At even higher resolution, objects dissolve into materials with different spectral properties. Although this scale may appear more "realistic" to the human brain, the resolution may be too high to obtain a meaningful, objective interpretation through image processing. Multiscale segmentation offers a way to circumvent this problem by grouping pixels into objects of different sizes that can be interpreted. In addition, the objects can be further clustered into meaningful groups that can be classified.
Below is a subset of imagery for the same spatial domain from Landsat ETM+, ATLAS and QuickBird. The QuickBird subset (532 x 609 pixels) has a spatial resolution approximately an order of magnitude finer than that of the ETM subset (45 x 51 pixels). In this image of an urban residential area, the layout of roads can be seen in the ATLAS image at 10 m resolution. We can interpret from the patterns we see along the roads that there are houses. It is unclear from the ATLAS image alone whether the somewhat "noisy" pattern of houses is due to insufficient resolution or possibly to tree canopy obscuring portions of each house. In the center of the image, we can also detect a large building surrounded by green space and associated with a large oval. It is not difficult to interpret this as a school and track.
In the full size version of the QuickBird image, the level of detail is much greater, and we can readily see a baseball field between the school building and the track, as well as tennis courts. Not only is the outline of individual houses visible, but the shadows cast by the houses are also discernible. In contrast, very little information is interpretable from the ETM image. Because the scale of discernible objects is much greater, a much larger portion of the image is required to recognize features and objects.
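The "order of magnitude" comparison can be verified from the numbers already given. The sketch below multiplies the pixel dimensions of the QuickBird and ETM subsets by their nominal pixel sizes (2.4 m and 28.5 m, from the text) to confirm that both cover nearly the same ground. The helper name is hypothetical, and which dimension is columns and which is rows is not stated in the source, so the two dimensions are simply kept in the order quoted.

```python
def footprint_m(dim_a: int, dim_b: int, pixel_m: float) -> tuple:
    """Ground extent in meters for an image of dim_a x dim_b pixels."""
    return (dim_a * pixel_m, dim_b * pixel_m)


QUICKBIRD_PIXEL_M = 2.4   # multispectral pixel size, from the text
ETM_PIXEL_M = 28.5        # visible and infrared pixel size, from the text

quickbird_extent = footprint_m(532, 609, QUICKBIRD_PIXEL_M)
etm_extent = footprint_m(45, 51, ETM_PIXEL_M)

linear_ratio = ETM_PIXEL_M / QUICKBIRD_PIXEL_M     # how many times finer QuickBird is
pixel_count_ratio = (532 * 609) / (45 * 51)        # how many more pixels cover the same ground

print(f"QuickBird subset: {quickbird_extent[0]:.0f} m x {quickbird_extent[1]:.0f} m")
print(f"ETM subset:       {etm_extent[0]:.0f} m x {etm_extent[1]:.0f} m")
print(f"Linear resolution ratio: {linear_ratio:.1f}")
print(f"Pixel count ratio:       {pixel_count_ratio:.0f}")
```

Both subsets come out at roughly 1.3 km by 1.5 km, with QuickBird about 12 times finer in each direction and therefore needing on the order of 140 times as many pixels for the same area.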
Figure 3 shows a portion of QuickBird imagery containing a school and associated athletic fields surrounded by a residential area. Figure 4 is the resulting segmentation. The objective of this initial segmentation is to restrict the size of the segments so that they correspond closely to the size of the primary objects of interest, in this case individual houses. In many cases, the houses are composed of more than one segment, but increasing the segment scale even slightly caused much of the adjacent yard or the shadows surrounding the house to be included in the segments. Thus, a delicate balance was established. On the other hand, creating such small segments means that large contiguous surfaces of grass or forest are divided into an excessive number of segments. This situation can be remedied with a multiresolution, classification-based segmentation. A secondary segmentation can cluster first order segments with similar attributes into larger groups. Segments can be classified at the appropriate scale and then combined for the final classified image.
Taking into consideration the discussion above about object scale as a function of image resolution, a progressively larger subset of each image was extracted for classification (Figure 5). Each of these images was acquired in the early springtime, while the turf grasses were still senescent and before leaves had emerged on trees and shrubs. Having both wintertime and springtime imagery would have yielded better results, but such imagery was not available.
The ETM image is composed of 181 x 173 pixels covering approximately 5 x 5 km. The six visible and infrared bands at 28.5 m resolution were used in the classification. The coarse resolution thermal band was not used. Figure 6 shows the resulting classification. The built-up areas in this image could not be well discriminated in terms of objects. Thus, they tend to be lumped together in one large Urban/Transportation/Commercial/Industrial class. Differences in residential density also could not be discerned, because areas of higher density could not be separated from areas with simply larger trees. Woodland was the only Undeveloped class, because the image resolution was insufficient and the woodland stands were too small to distinguish deciduous from evergreen trees. Nonetheless, if the primary interest is in distinguishing highly developed commercial land from residential and non-residential land, ETM data would be adequate for the task.
The ATLAS image is composed of 312 x 342 pixels covering 3.1 x 3.4 km. Technically speaking, ATLAS has higher spectral resolution than ETM as well as better spatial resolution. However, Band 9 of ATLAS was not functioning, and there is very little difference among the five thermal channels between 8.2 and 12.5 µm. In order to maximize the information in these channels, one could perform a decorrelation stretch on three of them and use the result in the classification. At the higher resolution, the tree canopy in the residential area has a more significant influence on the spectral properties of the area (Figure 7). It is incorrect to consider these areas as woodland and "undeveloped." However, an attempt was made to carry forward as many of the classes from the ETM classification as possible for the sake of comparison. It would be best to impose an area scalar on this class to identify truly undeveloped land. At this resolution, major roads several lanes wide are detectable, as are houses and buildings. The segmentation scale used, however, was too coarse to define individual houses for classification.
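A decorrelation stretch of the kind suggested for the thermal channels can be sketched with a principal component rotation. The version below rotates the bands to their principal axes, equalizes the variance along each axis, and rotates back to band space. It is a minimal illustration only: the input is synthetic random data standing in for three highly correlated channels, the function name is hypothetical, and the choice to equalize every band to a single target standard deviation is an assumption (other variants restore each band's original spread).

```python
import numpy as np


def decorrelation_stretch(bands: np.ndarray, target_sigma: float = 1.0) -> np.ndarray:
    """Remove inter-band correlation and equalize variance.

    bands : (n_pixels, n_bands) array, one row per pixel
    Returns an array of the same shape whose band covariance is target_sigma**2 * I.
    """
    x = np.asarray(bands, dtype=float)
    mean = x.mean(axis=0)
    centered = x - mean
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)        # symmetric matrix, so eigh is appropriate
    # Scale each principal component to the target spread; guard against zero variance.
    scale = np.diag(target_sigma / np.sqrt(np.maximum(eigvals, 1e-12)))
    transform = eigvecs @ scale @ eigvecs.T       # rotate, stretch, rotate back
    return centered @ transform + mean


# Synthetic stand-in for three nearly identical thermal channels (not real ATLAS data).
rng = np.random.default_rng(0)
shared_signal = rng.normal(size=(500, 1))
synthetic = 300.0 + 10.0 * shared_signal + rng.normal(scale=0.5, size=(500, 3))

stretched = decorrelation_stretch(synthetic, target_sigma=1.0)
corr_before = np.corrcoef(synthetic, rowvar=False)
corr_after = np.corrcoef(stretched, rowvar=False)
print("Correlation before:\n", np.round(corr_before, 3))
print("Correlation after:\n", np.round(corr_after, 3))
```

For a real image, the (rows, cols, bands) array would first be reshaped to (rows * cols, bands) and the result reshaped back before display or classification.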
The QuickBird image is composed of 532 x 609 pixels covering 1.28 x 1.46 km. At a resolution of 2.4 m, very small features can be segmented as image objects. However, because QuickBird imagery is limited to four channels, there is not much spectral information to permit a high degree of differentiation. Thus, there is a delicate balance between object size and spectral fidelity. This QuickBird classification was performed at multiple scales. The first order classification is based on the initial, or level 1, segmentation, in which individual houses and trees were defined and classified. These were then clustered in a second order, or level 2, classification (Figure 8). As a result, individual buildings were differentiated based on the type and color of roofing material, and several types of road materials were differentiated. In addition, although deciduous and evergreen trees could be distinguished at level 1, a scalar was imposed such that if both types occurred within a segment at level 2, a Mixed Woodland class could be identified. In classifications of this type, it is conceivable that any number of quantitative parameters could subsequently be determined, such as the number of houses, or the number of houses of a certain size. One could also perhaps estimate the total length of roadways within the image. Certainly, the advantage of any image-based classification is the way in which the computer is used to yield quantitative information about the classified scene.
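The level 2 rule for Mixed Woodland can be expressed as a small function. The sketch below is an illustration under stated assumptions: the class names are placeholders, the data structure mapping each level 2 segment to the classes of its level 1 segments is hypothetical, and the fallback to the most frequent level 1 class is an assumption added here so the function always returns something. Only the Mixed Woodland condition itself comes from the text.

```python
from collections import Counter


def classify_level2(child_classes: list) -> str:
    """Assign a level 2 class from the classes of the level 1 segments it contains."""
    present = set(child_classes)
    # Rule from the text: both tree types inside one level 2 segment means Mixed Woodland.
    if {"Deciduous", "Evergreen"} <= present:
        return "Mixed Woodland"
    # Assumed fallback (not from the text): take the most frequent level 1 class.
    return Counter(child_classes).most_common(1)[0][0]


# Hypothetical level 2 segments and the level 1 classes found inside each.
level2_children = {
    "segment_A": ["Deciduous", "Evergreen", "Deciduous"],
    "segment_B": ["House", "House", "Grass"],
    "segment_C": ["Evergreen", "Evergreen"],
}

level2_classes = {name: classify_level2(children)
                  for name, children in level2_children.items()}
print(level2_classes)
```

The same per-segment structure could feed simple tallies, such as class counts per level 2 segment, although counting houses directly from level 1 segments would overcount wherever a house is split across several segments.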
Classification of very high resolution imagery presents problems and challenges that may not arise in similar work with coarser resolution imagery. Some of these difficulties are described here simply to raise awareness among other image analysts. How one deals with them depends on one's level of competency, familiarity with software capabilities, and other available resources.
The range of available imagery is large and is expected to grow in the years ahead. As transportation industry leaders turn more and more to satellite-based imagery to improve efficiency, they must continually evaluate their specific requirements and choose the appropriate imagery for the job. Some types of imagery are more readily available, and at lower cost, than others. One should not overlook the potential of lower resolution imagery to satisfy requirements in favor of higher resolution imagery on the false expectation that it is inherently better. In this activity, we attempt to demonstrate the advantages and disadvantages of imagery with several different characteristics in order to develop awareness of these products and of the level of competency required to use them in transportation industry applications.