Carnets de Géologie / Notebooks on Geology, Brest, Memoir 2007/01, Abstract 02 (CG2007_M01/02)
P., G., J., E. & P. (2007).- A classification of spores by support vectors based on an analysis of their ornament spatial distribution - An application to Emsian miospores from Saudi Arabia. In: P. & E. (eds.), Recent Advances in Palynology.-
Phylogeny; miospores; image analysis; sum and difference histograms; support vector classification.
Phylogénie ; miospores ; analyse d'images ; histogrammes des sommes et différences ; classification de vecteurs à support.
Continuous morphological intergradations
exist between two trilete spore taxa from the Jauf Formation (Early Devonian) of the Widyan and Tabuk
basins (northwestern Saudi Arabia). The alternation of siliciclastics and carbonates
in this unit has been used to subdivide it into five members: from bottom to
top, the Sha'iba, Qasr, Subbat, Hammamiyat and Murayr members. The Jauf Formation in northwestern Saudi Arabia was deposited
in a nearshore environment ( et alii,
1999; & ,
The latest
study of miospores ( et alii,
2005a, in press) suggests that the Jauf Formation
is late Pragian to Emsian in age. Additional biostratigraphic evidence is provided by other fossil groups collected
in outcrop (e.g. et alii,
1989; et alii, 1992).
Among them, trilobites and conodonts indicate that the uppermost Sha'iba and Qasr members (lower Jauf Formation) are Pragian-early Emsian in age and brachiopods suggest that the Hammamiyat Member (upper Jauf Formation) is late Emsian in age.
The palynological material of this
report comes from boreholes previously studied (see et alii,
2005a, in press). They are in two
discrete areas about 350 km apart. Two of them (BAQA-1 and BAQA-2) are near Baq'a in the Widyan Basin, while JNDL-4 is
near Domat Al-Jandal in the Tabuk Basin. The stratigraphic levels encountered in
each locality overlap, and the correlations easily established using lithologic and wireline logs
are confirmed by palynological data reported by et alii
(2005a, in press). BAQA-1 and BAQA-2 cover an interval from the
Sha'iba Member, through the Qasr and Subbat members to the lowermost Hammamiyat.
The succession at JNDL-4 represents the upper part of the Subbat and the Hammamiyat Member.
Samples from BAQA-1, BAQA-2, and JNDL-4 were prepared in the Palynological Research Facility of the University of Sheffield.
For this study additional slides from BAQA-1 were processed in the Laboratory 'Paléobotanique, Paléopalynologie et Micropaléontologie' of the University of Liège. All samples were prepared using standard palynological acid maceration techniques.
The vast majority of the samples were productive and contained well-preserved organic matter. All material is housed in the Centre for Palynology, Department of Animal and Plant Sciences, University of Sheffield, and in the collections of the laboratory of 'Paléobotanique, Paléopalynologie et Micropaléontologie', University of Liège.
Palynologists have
discussed only rarely the phylogenetic evolution of miospores in Palaeozoic sediments (e.g. ,
1979; , 1996; et alii,
2002; et alii, 2005b). Some authors have demonstrated
a continuous morphological intergradation among some dispersed spores that previously
were attributed to discrete species. The morphological signal presented by miospores may not reflect
only biological evolution, because it may be influenced locally by other parameters such as state of preservation, sedimentary sorting ( ,
2004) and/or reworking ( et alii,
A possible phylogenetic lineage including several morphotypes (Fig. 1), one of which has been described in et alii
(in press), is proposed here. The two end-members of this lineage are well differentiated; however, all intermediary forms are present in the assemblages. The morphological variation
concerns distal ornamentation. This ornamentation involves changes in the shape of small cones and their spatial distribution
that vary progressively between the two end-members of the lineage. In the simplest morphotype ornaments are evenly distributed on the distal surface. In the intermediary morphotypes they combine progressively
to form a pseudo-reticulum, the walls of which are constituted by the discrete ornaments. In the most complex morphotypes, ornaments
merge to form elongated ridges which describe a completely closed reticulum. Thus a progressive
change in ornamentation occurs: from the simplest spores (ornaments constituted by evenly distributed small cones) to the most complex ones (ornaments
forming a reticulum).
From a stratigraphic point of view, the range
of the lineage is restricted to the Jauf Formation. Specimens first occur in the upper part of the Sha'iba Member and disappear in the upper part of the Hammamiyat Member. The spore assemblages from the upper part of the Sha'iba Member are considered to
be in the PoW Oppel Zone of et alii
(1987), based on the general characteristics of the assemblages and the presence of typical taxa such as Brochotriletes foveolatus , 1953, Clivosispora verrucata , 1973, Dictyotriletes emsiensis ( ) , 1973, D. subgranifer , 1973, and Verrucosisporites polygonalis , 1968
(Fig. 2) ( et alii,
2005a, in press). The presence of D. subgranifer may indicate that they
represent the uppermost interval (Su Interval Zone) of the PoW Oppel Zone. The PoW Oppel Zone is of Pragian-earliest Emsian age, with the Su Interval Zone encompassing the latest Pragian-earliest Emsian. In the Qasr Member, the stratigraphically important spores Emphanisporites schultzii , 1973, and ?
Knoxisporites riondae et , 1975 (Fig. 2) make their
first appearance. Similar spore assemblages are recovered throughout the Qasr Member and lower part of the Subbat Member. These assemblages are typical of those of Emsian age elsewhere and probably belong to either the AB or
to the lower FD Oppel Zones. These assemblages are constrained above by the first appearance of Rhabdosporites minutus et , 1975 (Fig. 2) in the upper part of the Subbat Member. Its
occurrence marks the base of the FD Oppel Zone (Min Interval Zone). The spore assemblages of the Hammamiyat Member are essentially similar throughout and also belong to the Min Interval Zone. This
indicates that these assemblages are mid Emsian in age ( et alii,
2005a, in press).
The spore classification is based on the spatial distribution of the ornaments (from
a simple pattern to a complex organization). Initially, we expect texture image analysis
to provide a tool to gauge and thus to quantify morphological evolution and,
eventually, to make automatic classification feasible. Both structural - extraction
of texture elements and analysis of placement rules - and statistical methods - spatial statistics - are of interest in this case.
Here we discuss only spatial statistics employing Sum and Difference Histograms.
About 400 specimens of the miospore lineage
defined here were found in more than 60 palynological slides but only half of the material
was used for the image analysis. Excluded from the set are laterally compressed specimens
and partial spore fragments.
A data set of regions of interest (ROIs) of 256x256 pixels
was extracted from the original images in order to test the Support Vector Classification on SDH features
(Fig. 3).
Sum and Difference Histograms (SDH) are derived from Co-occurrence Matrices (CoM) and
are intended to make these matrices practical to use. For an image with G grey levels, the CoM P[d][z1][z2] is defined
as the joint probability that a pair of points separated by the dipole
d = (dx,dy) has grey-level values
z1 and z2. CoM display the image information as
a co-occurrence of pixel pairs but, although better organized, still provide too much data. SDH ( ,
1986) are often preferred to CoM because they reduce the amount of data and computing time
required. Writing
p(z,z') for the probability of the pair of values z and z' at positions separated by
d, we define pΣ(i) and pΔ(j) as the probabilities of the sum i = z + z' and of the difference j = z - z' for a given dipole.
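The definition of the two histograms can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `sum_diff_histograms`, the restriction to non-negative dx and dy, and the storage of the difference histogram with an offset of G-1 (so that index k holds the difference k-(G-1)) are all assumptions made for the example.

```python
import numpy as np

def sum_diff_histograms(image, dx, dy, levels=256):
    """Normalised sum and difference histograms for the dipole d = (dx, dy).

    Hypothetical helper: assumes integer grey levels in 0..levels-1
    and non-negative dx, dy.
    """
    img = np.asarray(image, dtype=np.int64)
    rows, cols = img.shape
    z = img[:rows - dy, :cols - dx]    # first pixel of each pair
    z2 = img[dy:, dx:]                 # partner displaced by (dx, dy)
    sums = (z + z2).ravel()            # values in 0 .. 2(G-1)
    diffs = (z - z2).ravel()           # values in -(G-1) .. G-1
    n = sums.size                      # number of pixel pairs
    p_sum = np.bincount(sums, minlength=2 * levels - 1) / n
    # shift differences by G-1 so they can be used as array indices
    p_diff = np.bincount(diffs + levels - 1, minlength=2 * levels - 1) / n
    return p_sum, p_diff

# Usage on a random stand-in image (not spore data)
rng = np.random.default_rng(0)
roi = rng.integers(0, 256, size=(256, 256))
p_sum, p_diff = sum_diff_histograms(roi, dx=1, dy=0)
```

Each histogram has 2G-1 bins, compared with the G x G cells of a co-occurrence matrix, which is where the saving in data comes from.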
The descriptors ( et alii,
1973) - mean, variance, covariance, contrast, energy, entropy and homogeneity -
used to characterize a CoM have counterparts for SDH, and results have shown that SDH are at least as efficient as co-occurrence matrices for
classifying textures.
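\[
\begin{aligned}
\text{Mean} &= \sum_{z}\sum_{z'} z\,p(z,z') = \tfrac{1}{2}\sum_{i} i\,p_\Sigma(i) \\
\text{Variance} &= \sum_{z}\sum_{z'} (z-\mu)^2\,p(z,z') = \tfrac{1}{2}\Bigl[\sum_{i}\cdots\Bigr] \\
\text{Covariance} &= \sum_{z}\sum_{z'} (z-\mu)(z'-\mu')\,p(z,z') = \tfrac{1}{2}\Bigl[\sum_{i}\cdots\Bigr] \\
\text{Contrast} &= \sum_{z}\sum_{z'} (z-z')^2\,p(z,z') = \sum_{j} j^2\,p_\Delta(j) \\
\text{Energy} &= \sum_{z}\sum_{z'} p(z,z')^2 \approx \sum_{i} p_\Sigma(i)^2 \cdot \sum_{j} p_\Delta(j)^2 \\
\text{Entropy} &= -\sum_{z}\sum_{z'} p(z,z')\log p(z,z') \approx -\sum_{i} p_\Sigma(i)\log p_\Sigma(i) - \sum_{j} p_\Delta(j)\log p_\Delta(j) \\
\text{Homogeneity} &= \sum_{z}\sum_{z'} \frac{p(z,z')}{1+(z-z')^2} = \sum_{j} \frac{p_\Delta(j)}{1+j^2}
\end{aligned}
\]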
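As a hedged illustration of how such descriptors are evaluated once the two histograms are known, the sketch below computes the mean, contrast, energy, entropy and homogeneity. The function name `sdh_features` is hypothetical, and the difference histogram is assumed to be stored with an offset of G-1 (index k holds the difference k-(G-1)); neither convention comes from the paper.

```python
import numpy as np

def sdh_features(p_sum, p_diff, levels=256):
    """Five texture descriptors from normalised sum/difference histograms.

    Hypothetical helper: p_sum[i] is the probability of the sum i,
    p_diff[k] the probability of the difference k - (levels - 1).
    """
    p_sum = np.asarray(p_sum, dtype=float)
    p_diff = np.asarray(p_diff, dtype=float)
    i = np.arange(p_sum.size)                    # sum values 0 .. 2(G-1)
    j = np.arange(p_diff.size) - (levels - 1)    # difference values -(G-1) .. G-1

    def entropy(p):
        nz = p[p > 0]                            # 0 * log(0) is taken as 0
        return -np.sum(nz * np.log(nz))

    return {
        "mean": 0.5 * np.sum(i * p_sum),
        "contrast": np.sum(j ** 2 * p_diff),
        "energy": np.sum(p_sum ** 2) * np.sum(p_diff ** 2),
        "entropy": entropy(p_sum) + entropy(p_diff),
        "homogeneity": np.sum(p_diff / (1.0 + j ** 2)),
    }
```

For a perfectly flat texture all pairs share one sum and a zero difference, so the contrast and entropy vanish while the energy and homogeneity both equal 1.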
This gives 7 values for the
65,535 pairs available in a 256x256 pixel image. Keeping only one mean value per distance
d and choosing distances that are powers of
2 reduces the data to 7 by 7 values when 7 scales are considered (d = 1, 2, 4, 8, 16, 32, 64)
(Fig. 3).
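The reduction to one value per dyadic distance can be sketched for a single descriptor. Because the contrast Σj j².pΔ(j) is simply the mean squared grey-level difference over all pixel pairs, it can be computed without building the histogram. The sketch assumes that "one mean value per distance" means averaging over the horizontal and vertical dipoles of length d; this reading, and the function name `contrast_by_scale`, are assumptions rather than details given in the text.

```python
import numpy as np

def contrast_by_scale(image, distances=(1, 2, 4, 8, 16, 32, 64)):
    """Contrast at each dyadic distance, averaged over two dipole directions."""
    img = np.asarray(image, dtype=float)
    values = []
    for d in distances:
        horiz = np.mean((img[:, d:] - img[:, :-d]) ** 2)   # dipole (d, 0)
        vert = np.mean((img[d:, :] - img[:-d, :]) ** 2)    # dipole (0, d)
        values.append(0.5 * (horiz + vert))                # one value per distance
    return np.array(values)

# Usage: a checkerboard has maximal contrast at d = 1 and none at even distances
board = np.indices((128, 128)).sum(axis=0) % 2
print(contrast_by_scale(board))
```

Repeating this for each of the 7 descriptors would give the 7 by 7 table of values described above.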
Classification based on features for the ROI data set gave results impaired by a large mean square error. Because only the ornaments are of interest in the classification, we first pre-processed the images in order to remove the background. Scale-Space theory ( ,
1994) advocates the weighted Laplacian of Gaussian (wLoG) at increasing scales to capture blobs (here the ornaments) independently of their sizes.
The absolute value of the convolution of the ROIs with the wLoGs was used so that the result does not depend on whether ornaments appear black or white, which varies with
the orientation of their relief
(Fig. 4).
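A minimal sketch of this kind of pre-processing is given below. Several choices are assumptions that the text does not specify: the weighting is taken to be the usual σ² scale normalisation, the scales σ = 1, 2, 4, 8 are arbitrary, the responses at different scales are combined by a pixel-wise maximum, and the filter is applied in the Fourier domain, which implies periodic image boundaries. The function name `wlog_response` is hypothetical.

```python
import numpy as np

def wlog_response(roi, sigmas=(1, 2, 4, 8)):
    """Pixel-wise maximum over scales of |sigma^2 * LoG(roi)| (illustrative only)."""
    img = np.asarray(roi, dtype=float)
    # angular frequencies along rows and columns
    wy = 2 * np.pi * np.fft.fftfreq(img.shape[0])[:, None]
    wx = 2 * np.pi * np.fft.fftfreq(img.shape[1])[None, :]
    w2 = wx ** 2 + wy ** 2
    spectrum = np.fft.fft2(img)
    best = np.zeros_like(img)
    for s in sigmas:
        # Fourier transform of sigma^2 * Laplacian of a Gaussian of width sigma
        log_filter = -(s ** 2) * w2 * np.exp(-0.5 * (s ** 2) * w2)
        response = np.real(np.fft.ifft2(spectrum * log_filter))
        # absolute value: dark and bright blobs give the same response
        best = np.maximum(best, np.abs(response))
    return best
```

The filter is zero at zero frequency, so a uniform background is removed, and taking the absolute value makes a bright ornament and a dark ornament of the same shape indistinguishable in the output.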
We used Support Vector Machine ( & ,
2001; & ,
1995; et alii, 2001; et alii,
2001) to test a classification based on descriptors computed on the original ROIs and on their wLoG versions. A classification task involves training and testing data. We used 30, 30, 30, 40 and 15 images for training and 30, 20, 30, 30 and 10 images for testing in classes 1, 2, 3, 4 and 5, respectively. Each instance in the training set contains one 'target value' (the class label) and several 'attributes' (the descriptors). The goal of the SVM is to produce a model that predicts the target values of the instances in the testing set, for which only the attributes are given.
We selected the following attributes: the descriptors for scale 2 (7 attributes), the
Contrast for scales 3 and 4, and the Homogeneity for scales 4 and 6, in order to exploit the differing scale dependency of these last two descriptors.
The model used is the nu Support Vector Classification (nu-SVC) with a linear kernel.
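The training and testing workflow can be sketched with scikit-learn's `NuSVC`, used here as a stand-in for whatever software the authors actually ran. The data are synthetic Gaussian clouds, not spore descriptors; only the per-class image counts and the number of attributes (7 + 2 + 2 = 11) are taken from the text, and the value of nu is an arbitrary assumption.

```python
import numpy as np
from sklearn.svm import NuSVC

rng = np.random.default_rng(0)
n_train = [30, 30, 30, 40, 15]   # per-class training counts quoted in the text
n_test = [30, 20, 30, 30, 10]    # per-class testing counts quoted in the text
n_attributes = 11                # 7 + 2 + 2 selected descriptors

def synthetic_set(counts):
    """Stand-in data: one Gaussian cloud per class, NOT real spore descriptors."""
    X = np.vstack([rng.normal(loc=2.0 * k, scale=1.0, size=(n, n_attributes))
                   for k, n in enumerate(counts)])
    y = np.concatenate([np.full(n, k + 1) for k, n in enumerate(counts)])
    return X, y

X_train, y_train = synthetic_set(n_train)
X_test, y_test = synthetic_set(n_test)

model = NuSVC(nu=0.3, kernel="linear")   # nu = 0.3 is an arbitrary choice
model.fit(X_train, y_train)              # learn from labelled training instances
predicted = model.predict(X_test)        # predict from attributes only
accuracy = np.mean(predicted == y_test)
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

The accuracy printed here says nothing about spores; it only shows the mechanics of fitting on labelled instances and predicting class labels from attributes alone.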
The classification of the ROIs gives
an accuracy of 56% with a mean square error of 1.2, whereas the classification of the pre-processed ROIs gives an accuracy of 61% with a mean square error of 0.46
(Fig. 5).
The accuracy
of classification increases only from 56% to 61% when the ROI is
pre-processed with the wLoG. What matters more is the reduction of the mean square error, which measures how severe the
wrong classifications are. It drops so strongly that when a spore is misplaced it is
almost always assigned to a neighbouring class. In other words, if we accept the error of
misplacing a spore in an adjacent class, the 'accuracy' of the classification increases from 85% to 98%.
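The three figures discussed here (exact accuracy, accuracy when adjacent classes are tolerated, and mean square error) can be computed as in the sketch below. It assumes that the mean square error is taken on the class numbers treated as an ordered scale from 1 to 5, which the text implies but does not state; the function name and the example labels are hypothetical.

```python
def classification_scores(true_classes, predicted_classes):
    """Exact accuracy, adjacent-class accuracy and mean square error on class numbers."""
    pairs = list(zip(true_classes, predicted_classes))
    n = len(pairs)
    exact = sum(t == p for t, p in pairs) / n
    adjacent = sum(abs(t - p) <= 1 for t, p in pairs) / n   # off by at most one class
    mse = sum((t - p) ** 2 for t, p in pairs) / n           # penalises distant errors
    return exact, adjacent, mse

# Made-up labels for illustration only
true_labels = [1, 2, 3, 4, 5, 3]
predicted_labels = [1, 3, 3, 4, 3, 2]
exact, adjacent, mse = classification_scores(true_labels, predicted_labels)
```

In this made-up example half of the predictions are exact, five out of six fall within one class of the truth, and the single two-class error contributes four of the six units of squared error, which shows why the mean square error is sensitive to how far a spore is misplaced.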
In fact, the a priori human classification plays an important role
as regards judgments of the accuracy and usefulness of computer-assisted classification. As classification
by humans inevitably includes such types of error, we could adjudge this one
test as equally subject to error but anticipate that a better-tuned training set
would give better results.
Continuous morphological intergradations between two miospore taxa
have been found in an Early Devonian miospore assemblage from Saudi Arabia. On the distal surface
of these spores the ornaments and their organization show a gradual evolution in
complexity between two end-members. All the intermediary forms co-occur in the assemblages. Thus
there is a progressive evolution in the organization of the ornamentation
ranging from the simplest spores to the most complex ones. This illustrates once again that miospore taxonomy is artificial because the two end-members of this lineage
have been assigned to discrete genera.
Statistical texture image analysis provides a gauging tool to quantify morphological evolution and seems to allow assisted automatic classification.
In this test case classification of spores by image analysis was judged accurate
in only 61% of the cases, but this figure rose to 98% if a misidentification to an
adjacent class was accepted.
We wish to express our gratitude to the management of Saudi Aramco for permission to publish this paper. We acknowledge M. (Liège) for the preparation of palynological slides. Thanks are also expressed to Y.
(Montpellier, France) for the review of the paper. P. is supported by a F.R.I.A. grant.
Figure 1: Microphotographs of the different morphotypes of the lineage.
Figure 2: Microphotographs of characteristic miospores from the studied boreholes.
1. Brochotriletes foveolatus. Borehole BAQA-1, sample & slide 345.5', EFC H54/4.
2. Clivosispora verrucata. Borehole BAQA-1, sample & slide 395.2', F47/1.
3. Clivosispora verrucata. Borehole JNDL-4, sample & slide 87.2', F34/1.
4. Dictyotriletes emsiensis. Borehole BAQA-2, sample & slide 56.0', X46.
5. Dictyotriletes subgranifer. Borehole BAQA-1, sample & slide 366.9', O31.
6. Emphanisporites schultzii. Borehole BAQA-1, sample & slide 395.2', G50.
7. ? Knoxisporites riondae. Borehole BAQA-1, sample & slide 366.9', E27/4.
8. Rhabdosporites minutus. Borehole JNDL-3, sample & slide 368.8', H45/1.
9. Verrucosisporites polygonalis. Borehole BAQA-1, sample & slide 371.1', R25/4.
Figure 3: Mean, contrast, covariance and homogeneity for the 5 classes and for 7 decreasing scales.
Figure 4: Regions of interest illustrating the 5 classes and their wLoG pre-processed version where background lighting is removed and ornaments highlighted.
Figure 5: Classification error for the ROIs and the wLoG convolved ROIs.