Psychophysical Scaling Methods
Psychophysics investigates the relationships between sensations in the psychological domain and stimuli in the physical domain. In the research of colour emotion, psychophysical scaling methods are used to collect reactive-level emotional responses to colour stimuli, and to analyse the relationships between these responses and the physical characteristics of those colour stimuli.
Psychophysics emerged as a research field when E. H. Weber stated that the change in stimulus intensity that produces a just noticeable difference (JND) in a human sensation is a constant fraction of the starting intensity of the stimulus. This statement is known as Weber's Law, as illustrated in (a) of the diagram below. On the basis of this statement, G. T. Fechner proposed that all JNDs are equal in psychological sensation, regardless of the stimulus intensity. This means that doubling the stimulus intensity always causes sensation to increase by the same increment, as illustrated in (b) of the diagram below. This assumption was basic to Fechner's concept of psychophysical scaling, which uses the JND as the unit of measurement. He derived a logarithmic function relating sensation magnitude to stimulus intensity, known as Fechner's Law. This function has been widely accepted in the field of psychology, although today it is not considered an accurate statement of the relationship between stimulus intensity and sensation magnitude. For instance, S. S. Stevens showed that JNDs are psychologically unequal along the dimension of stimulus intensity, and suggested that the relationship between sensation magnitude and stimulus intensity is a power function, not a logarithmic function. This power function is known as Stevens' Power Law, as illustrated in (c) of the diagram below.
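The three laws can be compared numerically. The following is a minimal sketch rather than part of the original material: the function names, the constant `k`, the threshold intensity and the exponent of 0.5 are all assumptions chosen only for illustration.

```python
import math

def weber_jnd(intensity, weber_fraction):
    """Weber's Law: the just noticeable change is a constant fraction of intensity."""
    return weber_fraction * intensity

def fechner_sensation(intensity, k=1.0, threshold=1.0):
    """Fechner's Law: sensation grows with the logarithm of intensity."""
    return k * math.log(intensity / threshold)

def stevens_sensation(intensity, k=1.0, exponent=0.5):
    """Stevens' Power Law: sensation is a power function of intensity."""
    return k * intensity ** exponent

# Under Fechner's Law, doubling the intensity adds the same increment
# (k * ln 2) whatever the starting intensity is.
step_low = fechner_sensation(20.0) - fechner_sensation(10.0)
step_high = fechner_sensation(200.0) - fechner_sensation(100.0)

# Under Stevens' Power Law, doubling the intensity multiplies the sensation
# by 2 ** exponent, so the increment depends on the starting intensity.
ratio_low = stevens_sensation(20.0) / stevens_sensation(10.0)
ratio_high = stevens_sensation(200.0) / stevens_sensation(100.0)
```

Here `step_low` and `step_high` agree, whereas under the power function it is the ratio of sensations, not their difference, that stays constant when the intensity is doubled.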

Early studies of psychophysics, including those by Weber and Fechner, were concerned only with issues of measuring sensory thresholds and did not deal with the problem of measuring human sensations. This was because these sensory thresholds were not stated in sensation units but in units of stimulus intensity.
In modern psychophysics, four basic types of measurement scales for sensations have been defined: nominal, ordinal, interval and ratio. A nominal scale reflects qualitative differences (such as distinguishing between red and green objects) rather than quantitative ones. An ordinal scale includes a set of measurements in which the amount of a property of objects or events can be ranked. An interval scale indicates differences between amounts of the property measured, as represented by intervals between scale values. A ratio scale is an interval scale with a natural origin (zero point) that represents the zero amount of a property.
The following psychophysical scaling methods have been extensively used in the research of colour and imaging science: the method of paired comparison and the method of categorical judgement, both of which measure sensation magnitude on interval scales.
Paired Comparison
Thurstone's Law of Comparative Judgement indicates that the scale value difference between any two stimuli in a paired comparison assessment is a random variable whose probability density function forms a normal distribution. The mean value of this distribution represents the scale value difference between the two stimuli in question. This can be illustrated by the diagram on the right and the equation below.

$$S_i - S_j = z_{ij}\sqrt{\sigma_i^2 + \sigma_j^2 - 2 r_{ij}\sigma_i\sigma_j}$$

where $z_{ij}$ is the normal deviate (z-score) corresponding to the proportion of times stimulus i is judged greater than stimulus j; $S_i$ and $S_j$ are scale values for stimuli i and j, respectively; $\sigma_i$ and $\sigma_j$ are standard deviation values for stimuli i and j among observations, respectively; $r_{ij}$ is the correlation coefficient between stimuli i and j and ranges from -1 to 1.
In the above diagram, the shaded area shows the proportion of times that stimulus i is judged greater than stimulus j, $p_{ij}$. The unshaded area shows the proportion of times that stimulus j is judged greater than stimulus i, $1 - p_{ij}$.
In Case V of Thurstone's Law of Comparative Judgement, all stimuli share the same standard deviation among observations, i.e. $\sigma_i = \sigma_j = \sigma$, and the correlation between any two stimuli is zero, i.e. $r_{ij} = 0$. Accordingly, the scale value difference, $S_i - S_j$, has a standard deviation, $\sigma\sqrt{2}$, which is a constant. This means that $S_i - S_j$ is linearly related to $z_{ij}$, as illustrated in the following.

$$S_i - S_j = \sqrt{2}\,\sigma\,z_{ij}$$

where $\sigma$ is the common standard deviation for each stimulus.
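A Case V analysis can be sketched in a few lines. The version below is only an illustration under stated assumptions: the function name `case_v_scale_values` and the `wins` matrix are hypothetical, the row-average estimate is a commonly used least-squares solution rather than a procedure given in this text, and proportions of exactly 0 or 1 are clipped (an arbitrary choice) so that their z-scores stay finite. The resulting scale values are in units of $\sigma\sqrt{2}$.

```python
from statistics import NormalDist

def case_v_scale_values(wins, clip=0.01):
    """Estimate Case V scale values from paired comparison counts.

    wins[i][j] is the number of times stimulus i was judged greater
    than stimulus j. Returns one scale value per stimulus, in units
    of sigma * sqrt(2), with a mean of zero.
    """
    standard_normal = NormalDist()
    n = len(wins)
    scale_values = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue  # a stimulus is not compared with itself (z = 0)
            total = wins[i][j] + wins[j][i]
            proportion = wins[i][j] / total
            # Assumption: clip unanimous judgements to avoid infinite z-scores.
            proportion = min(max(proportion, clip), 1.0 - clip)
            z_sum += standard_normal.inv_cdf(proportion)
        # Averaging z over all n stimuli (including z_ii = 0) gives S_i.
        scale_values.append(z_sum / n)
    return scale_values

# Made-up counts for three stimuli, each pair judged 10 times.
wins = [
    [0, 8, 9],
    [2, 0, 7],
    [1, 3, 0],
]
scale_values = case_v_scale_values(wins)
```

With these made-up counts the first stimulus receives the highest scale value and the third the lowest, matching the order of the raw preferences.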
The precision of the scale value for each stimulus indicates how close the scale value obtained from the sample is to the scale value for the population. It can be determined by the 95% confidence interval of the sample scale value, as defined in the equation below, for a large number of observations in the assessments, where the number of observations is defined as the number of observers multiplied by the number of times each stimulus is assessed.
95% confidence interval = sample scale value $\pm\, z_{0.025} \cdot SE$

where $z_{0.025}$ is the z-score for a cut-off of 2.5% in each tail of the standard normal distribution and has a value of 1.96; $SE$ is the standard error.
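As for the estimate of the standard error ($SE$), Morovic (1998) suggested that the scale values can be normalised in units of $\sigma\sqrt{2}$, so that $\sigma = 1/\sqrt{2}$, and therefore the standard error is calculated by

$$SE = \frac{\sigma}{\sqrt{N}} = \frac{1/\sqrt{2}}{\sqrt{N}}$$

where N is the number of observations.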
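As a rough illustration of this estimate, the sketch below computes a 95% confidence interval for one scale value. It assumes that normalising the scale values in units of $\sigma\sqrt{2}$ gives $\sigma = 1/\sqrt{2}$ and hence a standard error of $(1/\sqrt{2})/\sqrt{N}$; the function name and the example numbers are hypothetical.

```python
import math

def confidence_interval_95(scale_value, n_observations, z=1.96):
    """Return (lower, upper) bounds of the 95% confidence interval.

    Assumption: scale values are in units of sigma * sqrt(2), so the
    standard deviation of a single observation is 1 / sqrt(2).
    """
    sigma = 1.0 / math.sqrt(2.0)
    standard_error = sigma / math.sqrt(n_observations)
    half_width = z * standard_error
    return scale_value - half_width, scale_value + half_width

# Example: a scale value of 0.5 based on 20 observations
# (e.g. 10 observers each assessing the stimulus twice).
lower, upper = confidence_interval_95(0.5, 20)
```

Because this estimate depends only on N, quadrupling the number of observations halves the width of the interval.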
Note, however, that the standard error estimated in this way depends only on the number of observations. Cui (2001) has pointed out that the number of stimuli in the assessments can also affect the standard error. He suggested the following equation, in which the coefficients were determined according to his experimental results.
![]()
where N is the number of observations; n is the number of stimuli.
Braun et al. (1996) suggested another equation to estimate the standard error, which was also related to both the number of observations (N) and the number of stimuli (n), as shown below:
![]()
Categorical Judgement
Torgerson's Law of Categorical Judgement indicates that the difference value between a category boundary and the scale value of a stimulus is a random variable whose probability density function forms a normal distribution. The mean value of this distribution represents the difference between the category boundary and the stimulus scale value. This can be illustrated by the diagram on the right and the equation below.

$$B_i - S_j = z_{ij}\sqrt{\sigma_i^2 + \sigma_j^2 - 2 r_{ij}\sigma_i\sigma_j}$$

where $z_{ij}$ is the normal deviate (z-score) corresponding to the proportion of times stimulus j is sorted below category boundary i; $B_i$ is category boundary i; $S_j$ is the scale value of stimulus j; $\sigma_i$ and $\sigma_j$ are the standard deviation values of category boundary i and stimulus j, respectively; $r_{ij}$ is the correlation coefficient between category boundary i and stimulus j and ranges from -1 to 1.
In the above diagram, the shaded area shows the proportion of times that stimulus j is sorted below category boundary i, $p_{ij}$. The unshaded area shows the proportion of times that stimulus j is sorted above category boundary i, $1 - p_{ij}$.
In Condition D of Torgerson's Law of Categorical Judgement, all stimuli and category boundaries are assumed to share the same standard deviation among observations, i.e. $\sigma_i = \sigma_j = \sigma$, and stimulus scale values are assumed to be independent of category boundaries, i.e. $r_{ij} = 0$. Accordingly, the value $B_i - S_j$ has a standard deviation of $\sigma\sqrt{2}$, which is a constant. Thus the above equation can be simplified into:

$$B_i - S_j = \sqrt{2}\,\sigma\,z_{ij}$$

where $\sigma$ is the common standard deviation for each stimulus and each category boundary.
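The simplified equation suggests a straightforward estimation procedure, sketched below under stated assumptions. The function name `condition_d_scaling` and the `counts` table are hypothetical; estimating each boundary as the mean z-score over stimuli and each scale value as the mean of (boundary minus z-score) is a common averaging approach for complete data, not a procedure taken from this text; and extreme proportions are clipped (an arbitrary choice) to keep z-scores finite. Results are in units of $\sigma\sqrt{2}$.

```python
from statistics import NormalDist, mean

def condition_d_scaling(counts, clip=0.01):
    """Estimate category boundaries and stimulus scale values.

    counts[j][c] is the number of times stimulus j was placed in
    category c, with categories ordered from lowest to highest.
    Returns (boundaries, scale_values).
    """
    standard_normal = NormalDist()
    z_rows = []
    for row in counts:
        total = sum(row)
        cumulative = 0
        z_row = []
        # The upper edge of the last category is unbounded, so it is skipped.
        for count in row[:-1]:
            cumulative += count
            # Proportion of times the stimulus fell below this boundary.
            proportion = min(max(cumulative / total, clip), 1.0 - clip)
            z_row.append(standard_normal.inv_cdf(proportion))
        z_rows.append(z_row)

    n_boundaries = len(z_rows[0])
    # Boundary i: average z-score over all stimuli.
    boundaries = [mean(z_row[i] for z_row in z_rows) for i in range(n_boundaries)]
    # Scale value j: average of (B_i - z_ij) over all boundaries.
    scale_values = [
        mean(boundaries[i] - z_row[i] for i in range(n_boundaries))
        for z_row in z_rows
    ]
    return boundaries, scale_values

# Made-up data: three stimuli sorted into three categories by 10 observations.
counts = [
    [6, 3, 1],
    [3, 4, 3],
    [1, 3, 6],
]
boundaries, scale_values = condition_d_scaling(counts)
```

In this made-up example the stimulus placed mostly in the lowest category receives the lowest scale value, and the two boundaries come out in ascending order.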
The precision for categorical-judgement data is determined in the same way as that for paired-comparison data.
Principal Component Analysis
Principal component analysis (PCA) is a technique for reducing a large data set of interrelated variables to a smaller set of factors. PCA has been popular in studies of colour semantics scales.
Central to the PCA is the determination of eigenvectors and eigenvalues of a covariance matrix. Eigenvectors and eigenvalues are defined as the solution of the following equation:
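$$\mathbf{C}\,\mathbf{e}_i = \lambda_i\,\mathbf{e}_i \quad (i = 1, 2, \ldots, n)$$

where $\mathbf{C}$ is a real and symmetric matrix of size $n \times n$; $\mathbf{e}_i$ are eigenvectors of the matrix $\mathbf{C}$ and $\lambda_i$ are the associated eigenvalues.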
In the current research, $\mathbf{C}$ is the covariance matrix of experimental data (z-scores) for colour emotion responses. The output from the PCA is the classification results of colour emotion scales in terms of component loadings, i.e. the correlation coefficients between colour emotion scales and the principal components derived.
The following shows a working example of the PCA:
a) The aim of the PCA in the present research is to classify colour semantics scales (word pairs) using experimental data for m colour stimuli measured on n colour semantics scales. These data are expressed by the normalised vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$, where $\mathbf{x}_i = (x_{i1}, x_{i2}, \ldots, x_{im})$ and $x_{ik}$ represent observer responses (in the form of z-scores) for colour semantics scale i. "Normalised" means that, for each scale, the mean value is subtracted from every element of the vector and the result is divided by the standard deviation.
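b) Calculate the covariance matrix between these colour semantics scales:

$$\mathbf{C} = \begin{pmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & c_{22} & \cdots & c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{n1} & c_{n2} & \cdots & c_{nn} \end{pmatrix}$$

where $c_{ij}$ is the covariance between vectors $\mathbf{x}_i$ and $\mathbf{x}_j$ if $i \neq j$; $c_{ij}$ is the variance of data $\mathbf{x}_i$ if $i = j$. The covariance is determined by

$$c_{ij} = \frac{1}{m-1}\sum_{k=1}^{m}(x_{ik} - \bar{x}_i)(x_{jk} - \bar{x}_j)$$

where $x_{ik}$ and $x_{jk}$ are values in the experimental data for colour semantics scales i and j, respectively; $\bar{x}_i$ and $\bar{x}_j$ are mean values of $\mathbf{x}_i$ and $\mathbf{x}_j$, respectively; m is the number of colour stimuli.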
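c) Calculate eigenvectors ($\mathbf{e}_i$) and eigenvalues ($\lambda_i$) of the covariance matrix $\mathbf{C}$:

$$\mathbf{C}\,\mathbf{e}_i = \lambda_i\,\mathbf{e}_i \quad (i = 1, 2, \ldots, n)$$

where $\mathbf{e}_i$ are eigenvectors of matrix $\mathbf{C}$ and $\mathbf{e}_i = (e_{i1}, e_{i2}, \ldots, e_{in})^{\mathsf{T}}$; $\lambda_i$ are the eigenvalues associated with $\mathbf{e}_i$; n is the number of colour semantics scales.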
d) Each eigenvector represents a principal component (PC). The eigenvector with the highest eigenvalue is the primary PC and accounts for the largest amount of the total variance of the original data. As to how many PCs should be retained, the following guidelines are widely used: i) the Kaiser Criterion, which suggests that only the PCs with eigenvalues greater than 1 should be retained; ii) the Cattell Scree Test, which suggests that the number of PCs be decided by inspecting a plot of the eigenvalues in descending order, called a "scree plot", and retaining the PCs that precede the point where the curve levels off.
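e) The matrix of the eigenvectors is called the component loading matrix, as denoted by

$$\mathbf{L} = \begin{pmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \cdots & \mathbf{e}_h \end{pmatrix} = \begin{pmatrix} e_{11} & e_{21} & \cdots & e_{h1} \\ e_{12} & e_{22} & \cdots & e_{h2} \\ \vdots & \vdots & \ddots & \vdots \\ e_{1n} & e_{2n} & \cdots & e_{hn} \end{pmatrix}$$

where h is the number of principal components retained; n is the number of colour semantics scales.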
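f) Each element of the component loading matrix is interpreted as the correlation coefficient between a colour semantics scale and a principal component. Strictly, for normalised data this correlation equals the eigenvector element multiplied by the square root of the corresponding eigenvalue.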
g) Interpret the meaning of each principal component according to these correlation coefficients.
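Steps a) to f) can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the research: the function name `pca_of_scales` and the example responses are made up, the sample standard deviation and covariance (divisor m - 1) are assumed, the Kaiser Criterion is used to choose the number of components, and the correlation-type loadings are obtained by scaling each unit eigenvector by the square root of its eigenvalue.

```python
import numpy as np

def pca_of_scales(data):
    """PCA of colour semantics scales.

    data has shape (n, m): one row per scale, one column per stimulus.
    Returns (eigenvalues, eigenvectors, correlations), where the last two
    have shape (n, h) for the h retained principal components.
    """
    x = np.asarray(data, dtype=float)

    # a) Normalise each scale: subtract its mean, divide by its std. dev.
    x = (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, ddof=1, keepdims=True)

    # b) Covariance matrix between scales (rows are treated as variables).
    covariance = np.cov(x)

    # c) Eigenvalues and eigenvectors of the symmetric covariance matrix,
    #    reordered so that the largest eigenvalue comes first.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    # d) Kaiser Criterion: retain components with eigenvalue > 1
    #    (at least one component is always kept).
    h = max(1, int(np.sum(eigenvalues > 1.0)))

    # e) Retained eigenvectors, one column per principal component.
    retained = eigenvectors[:, :h]

    # f) Correlations between scales and components for normalised data.
    correlations = retained * np.sqrt(eigenvalues[:h])
    return eigenvalues, retained, correlations

# Made-up z-scores: three scales measured on six stimuli.
responses = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [1.1, 1.9, 3.2, 3.8, 5.1, 6.0],
    [2.0, -1.0, 0.0, 1.0, -2.0, 0.5],
]
eigenvalues, retained, correlations = pca_of_scales(responses)
```

In this made-up example the first two scales are almost perfectly correlated, so both load strongly, and with the same sign, on the first principal component. The sign of an eigenvector is arbitrary, so only the relative signs of the loadings are meaningful.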
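The percentage of the total variance explained by a principal component can be calculated by

$$V_k = \frac{\lambda_k}{\sum_{i=1}^{n}\lambda_i} \times 100\%$$

where $k = 1, 2, \ldots, h$; $V_k$ is the percentage of the total variance explained by principal component k; $\lambda_k$ is the eigenvalue for component k.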
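Projected values of the original data $\mathbf{X}$ onto the principal components, called component scores ($\mathbf{Y}$), are determined by

$$\mathbf{Y} = \mathbf{L}^{\mathsf{T}}\mathbf{X}$$

where $\mathbf{L}^{\mathsf{T}}$ is the transpose of the component loading matrix $\mathbf{L}$; $\mathbf{X}$ is the matrix of the original data vectors and is denoted by

$$\mathbf{X} = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1m} \\ x_{21} & x_{22} & \cdots & x_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nm} \end{pmatrix}$$

where n is the number of colour semantics scales and m is the number of stimuli.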
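The variance percentages and component scores can be computed as in the short sketch below. The function names and the toy numbers are hypothetical, and the loading matrix is assumed to have shape n × h (one column per retained component) so that its transpose can be multiplied by the n × m data matrix.

```python
import numpy as np

def explained_variance_percent(eigenvalues):
    """Percentage of total variance explained by each component.

    All n eigenvalues must be supplied, not only the retained ones,
    because the denominator is the sum over every component.
    """
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    return 100.0 * eigenvalues / eigenvalues.sum()

def component_scores(loadings, data):
    """Project the data onto the principal components.

    loadings has shape (n, h); data has shape (n, m).
    Returns the component scores with shape (h, m).
    """
    loadings = np.asarray(loadings, dtype=float)
    data = np.asarray(data, dtype=float)
    return loadings.T @ data

# Toy example: three scales, two stimuli, two retained components.
percent = explained_variance_percent([2.0, 1.0, 1.0])
loadings = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.0, 0.0],
]
data = [
    [0.5, -0.5],
    [1.0, -1.0],
    [2.0, -2.0],
]
scores = component_scores(loadings, data)
```

In this toy case the two components coincide with the first two scales, so the scores simply reproduce the first two rows of the data matrix.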
Further Reading
Braun, K. M., Fairchild, M. D. and Alessi, P. J., Viewing environments for cross media image comparison, Color Research and Application, 21, 6-17 (1996).
Child, D., The Essentials of Factor Analysis, 2nd Edition. London: Cassell Educational Limited, 1990.
Cui, C., On the repeatability of paired comparison based scaling methods, IS&T’s 2001 PICS Conference Proceedings, 113-118 (2001).
Engeldrum, P. G., Psychometric Scaling: A Toolkit for Imaging Systems Development. Winchester: Imcotek Press, 2000.
Gescheider, G. A., Psychophysics: The Fundamentals. New Jersey: Lawrence Erlbaum Associates, 1997.
Jolliffe, I. T., Principal Component Analysis. New York: Springer, 1986.
Morovic, J., To Develop a Universal Gamut Mapping Algorithm, PhD Thesis, Colour & Imaging Institute, University of Derby, UK, 1998.

© 2006-2011 Li-Chen Ou


