Laws for

dynamic development

 

 

Part I: Time and again

General laws for the development

of relations

 

 

 

Contents

 

Preface to part I

1. Framework

2. Number and space

3. Metric and measurement

4. The dynamic development of kinematics

5. Interaction

6. Irreversibility

7. Wave packets

8. Individuality and probability

9. Probability in quantum physics

 

 Preface to part I

 

Time and Again was first published in 1980, and is now entirely revised. It is intended to prove that the natural sciences, as far as their foundations are concerned, are little more than timekeeping, if time is understood as a lawful pattern of relations between things and events. This pattern forms the subject matter of this study.

Time and again, philosophers of science have been in search of a unifying principle in the foundations of physics. Time is not such a unifying principle. It is a diversifying principle. I do not wish to find unity, but to account for the diversity of nature.

Time and again I shall argue that temporal relations are not based on some conventions, and that the laws of physics are not merely convenient patterns of thought. Arguments will be provided for the view that physics is a dynamic endeavour, intended to open up the creation by discovering laws and applying these to physical reality.

Time and again philosophers have tried to present the foundations of science on an a priori basis. This book wants to discover these by a close scrutiny of the physical sciences and their history. Only one thing will be taken for granted: the lawfulness of the creation. Different philosophical schools can be distinguished by the way they account for this lawfulness (T&E 8.6).

Time and again aims to study the foundations of physics in a systematic way. Hence it starts by setting out a hypothetical framework derived from Herman Dooyeweerd’s and Dirk Vollenhoven’s philosophy of the cosmonomic idea, based on Christian principles. This philosophy is well known for its law spheres or modal aspects (which I call relation frames), expressing the idea of modal diversity. The concept of relation frames is supplemented by the concept of structures of individuality. Like the relation frames, structures have a law side and a subject side, and I call the law side characters. Time and again investigates the dynamic development of the mathematical and physical aspects of reality and their mutual projections. These are called retrocipations if they refer to preceding relation frames, and anticipations if they refer to succeeding aspects. Both act as a driving force in the modal dynamic opening process, anticipations as a pull, retrocipations as a push. Vollenhoven, Dooyeweerd and their disciples usually analyse each modal aspect conceptually, by pointing out its meaning nucleus and its analogies (retrocipations and anticipations). This book sets out to discuss each relation frame by extensively investigating the relations which it determines, both subject-subject and subject-object relations.

After 35 years, it appears to be worthwhile to revise Time and again, first published in 1980. This second edition, part I of Laws for dynamic development, is first and foremost an update. Physics has developed considerably, new books and papers have appeared, and a modernization of style and terminology is much needed. Apart from that, the first nine chapters have not changed very much. Therefore, I do not hesitate to retain the book’s title. The original chapter 10 is rewritten as chapter 2 of part II. Part I concerns the first four natural relation frames. The corresponding characters and those constituting the living world will be treated in part II. In part III their normative counterparts will be discussed. The companion volume, Theory and experiment (T&E), is concerned with the history and epistemology of classical physics.

Part I, chapter 1

 

 

Framework

 

 

 

 

 1.1. Foundations research

 

Although there are many handbooks and textbooks of physics as well as numerous monographs and papers on special topics, until recently there have been few books dealing with the structure and development of physics in a manner which goes beyond a mere commentary on its methods. Many of the texts, of course, are extremely important for the understanding of physics, and are fascinating because of new vistas explored or admirable because of clarity in expounding older views. However, there remain surprisingly few investigations into the basic structure and coherence of the physical sciences.

 

There is, perhaps, a historical explanation for this. Influenced by Immanuel Kant, the 19th-century German Naturphilosophie assumed that the foundations of physics could be derived from immediately evident truths that were a priori, transcendent and necessary. It was thought that these truths could be understood without the need of experimental verification. Georg Hegel is notorious for making attempts to build a structure of physics on such speculative foundations. In time, however, it became clear that many of these self-evident truths were in fact false. In reaction, many late 19th- and 20th-century physicists rejected outright any a priori philosophical bias for their work - and willy-nilly became adherents of another philosophy, usually some variant of positivism (neo-positivism, Vienna school, analytical philosophy, instrumentalism, operationalism, conventionalism, social-constructivism). Assuming that the content of science is ‘positive fact’ which must be taken for granted, whereas the structure of science is determined by its methods, most positivist philosophers are interested only in the latter.[1] Hence for positivists, philosophy of science is not a matter of ontology or epistemology, but rather a matter of methodology.

 

The study of the foundations of physics has traditionally been called metaphysics, but, since the beginning of the 19th century, this term has become discredited because of its speculative implications. Currently, this kind of study is usually referred to as foundations research. The critical-realist philosopher of science Mario Bunge defines its goal as being twofold:

 

‘To perform a critical analysis of the existing theoretical foundations (of physics), and to reconstruct them in a more explicit and cogent way’.[2]

 

 

 

The critical analysis has three tasks:

 

‘(a) To examine the philosophical presuppositions of physics;

 

 (b) To discuss the status of key concepts, formulas, and procedures of physics;

 

 (c) To shrink or even to eliminate vagueness, inconsistency, and other blemishes.’[3]

 

 

 

Similarly, the task of reconstruction, according to Bunge, has three aspects:

 

‘(a) To bring order to various fields of physics by axiomatizing their cores;

 

 (b) To examine the various proposed axiomatic foundations;

 

 (c) To discover the relations among the various physical theories.’[4]

 

 

 

For Bunge, the most important tool of foundations research is axiomatization. In this context,

 

‘... ‘axiom’ means initial assumption not self-evident pronouncement. There need be nothing intuitive and there is nothing final in an axiom ...’[5] Axiomatization of physical theories ‘... does nothing but organize and complete what has been a more or less disorderly and incomplete body of knowledge: it exhibits the structure of the theory and makes its meaning more precise.’[6]

 

 

 

However, since axiomatization is more an investigation of theories than of physics, it is unlikely that foundations research can be exhausted by formulating axioms. In the first place, axiomatization can only be applied to partial theories[7] such as classical mechanics, classical electromagnetism, thermodynamics, special and general relativity - to mention fields in which this type of foundations research has been carried out more or less successfully.[8] Moreover, ‘... a conceptual system such as Euclidean geometry may be subjected to innumerable axiomatizations, all hazy in different ways.’[9]

 

In part I of this book I shall not be primarily interested in partial theories, and I shall make use only occasionally of available axiomatizations. My initial focus will be on an ordering scheme of all aspects of the physical sciences - i.e., on the third of Bunge’s ‘constructive tasks’ of foundations research. It is very doubtful whether such an ordering scheme could be axiomatized in any sense, since any axiomatization would itself probably depend on such a scheme, whether explicitly recognized or implicitly assumed. In our discussion, the partial theories will be neither placed alongside one another nor deductively subsumed. They turn out to be interdependent. It is especially the dynamic development of this mutual dependence which will be our subject matter.

 

A second reason for rejecting axiomatization as the main tool of foundations research is this: any modern axiomatization system familiar to me relies heavily on set theory, as well as on a formal logic making use of set-theoretic methods. This appears to betray a strong influence of Aristotelian philosophy of science, according to which science means the designation of classes and their mutual relations. This Aristotelian influence may be spurious; nevertheless, the approach relies heavily on logic, the laws of which are supposed to be true (if only ‘vacuously true’) and a priori valid tools in foundations research. I shall consider set theory to be a mathematical theory (2.1), and insofar as logic makes use of it, logic is projected on mathematics. Thus, sets and classes as mathematical entities should find a place within the general ordering scheme to be sought. This implies that set theory and its dependent, axiomatization, cannot be accepted as the basis of our research into the foundations of physics, though both will play an important role in part I of this book.[10]

 

From the above quotations it should be clear that Bunge’s extreme emphasis on logical methods does not imply a purely deductive approach to physics, for his axioms must be found in existing physical theories. Still, he seems to adhere to the medieval idea that everything special is contained in the general. Enrico Cantore’s ‘inductive-genetic’ approach presents a somewhat different view:

 

‘First, the approach should be inductive ... the philosophical approach to science, to be successful, should concentrate on the detailed study of individual, fully developed theories. Secondly, the approach has to be genetic. Each scientific theory arises out of a slowly growing body of information. Hence the nature of the scientific endeavour and its achievements cannot be properly realized unless one follows the developments of individual theories as they gradually unfold and develop in time.’[11]

 

 

 

This points to my third objection to Bunge’s position: the historical development of a theory must also be accounted for in foundations research.

 

Finally, I wish to direct a few comments to Bunge’s first critical task of foundations research, i.e., to examine the philosophical presuppositions of physics. First, it must be emphasized that there exists no unique set of philosophical presuppositions. Second, no examination of such presuppositions can itself be philosophically neutral. Bunge himself seems to be clearer about the philosophies which he rejects (positivism, operationalism) than he is about his own philosophical position (realistic objectivism, or critical realism[12]). This vagueness about one’s own philosophy is not unusual among workers in foundations research. Since the beginning of the 20th century it has become abundantly clear that mathematics and physics, and more specifically, investigations into their foundations, are not free from philosophical assumptions, which, in turn, depend on one’s world view. Recognition of this has led to a more or less peaceful coexistence of different philosophical traditions in mathematics (logicism, formalism, and intuitionism)[13] and in physics (neo-positivism, operationalism, realism, conventionalism, materialism, phenomenalism, and postmodern constructivism).[14] A complete criticism of any of these philosophical systems would be out of the question, but, at times, I shall have occasion to confront my views with those of others.

 

 

 

Mission statement

 

Understanding the structure of the physical sciences requires a philosophical system which makes possible a systematic analysis of the foundations of physics, including its history. The philosophical position from which Laws for dynamic development is written is the philosophy of the cosmonomic idea, developed by Herman Dooyeweerd and Dirk Vollenhoven at Amsterdam, during the second quarter of the 20th century.[15] In contrast to philosophical fashion, this philosophy does not degenerate into a kind of methodology. Growing out of the reformed biblical ‘ground motive’ of creation, fall into sin, and redemption through Jesus Christ in the communion with the Holy Spirit, it is a rather complicated attempt to account for the full complexity of created reality. Not only is this philosophy a systematic investigation into the structure of created reality and human knowledge thereof, but it also tries to account for the temporal development of created reality. For readers of this book it would be helpful to have prior knowledge of this philosophy. However, since only part of its elaborate system is needed for our analysis of the structure of physics, and since this will be elaborated in the course of this book, such prior knowledge is not strictly necessary. In this introductory chapter an outline will be given of the general framework within which the discussion takes place. I do not wish to present this philosophy as an a priori truth; on the contrary, to a large extent, its applicability must be demonstrated by studies such as the one undertaken in this book. Hence I invite the reader to understand this introductory chapter as a provisional outline of a working hypothesis which is to be tested in the following chapters.

 

 

 

1.2. Three basic distinctions

 

 

 

Three central, recurring themes can be recognized in the history of scientific philosophy: the search for truth, the search for order, and the search for structure. The first is mainly a philosophical concern, and deals with the relation of laws and that which is subjected to them, the status of law (the nominalism-realism controversy), the possibility of human knowledge, and the methodology of science. Its central problem is to account for the lawfulness of creation. The search for order and structure forms the core of science, and here one deals with basic questions such as: Are there general modes of experience which provide an order for everything within the creation, and if so, which are these universal orders of relation? How can stable things exist, and how can they change? The question of structure already surfaced in Greek philosophy and is still prominent in modern physics and biology, whereas the problem of lawful order did not appear until post-Renaissance science (T&E, 8.6). These three themes, though they cannot be treated separately, are irreducible to each other, and they lead to the introduction of three basic distinctions which form the skeleton of my philosophical theory.

 

(a) The distinction of law and subject (1.3) is basic to all sciences, though it is not always explicitly recognized as such. Every science worthy of the name investigates regularities of some kind, which I shall call laws for short. These laws are concerned either with more or less concrete things, events, signs, living beings, artefacts, social communities, etc., or with more or less abstract concepts, ideas, constructs, etc. These things which are subjected to law are commonly referred to as ‘objects’, but, for reasons to be explained later (1.6), I shall refer to them as ‘subjects’ - i.e., beings subjected to laws.

 

(b) The distinction of typicality and modality (1.4). I shall distinguish those subjects which are more or less concrete from those which are more or less abstract. This distinction is mirrored in the one between typical, special laws, which apply to a limited class of subjects, and modal, general laws, which hold for their mutual relations. The first distinction (law and subject) is frequently identified with the distinction of universals and individuals. However, this identification is inadequate and too crude, since the distinction of typical and modal laws also implies a universal-individual duality. For the same reason, laws cannot be identified with classes or sets, although special laws define classes. Modal laws, however, do not, and therefore cannot be found by generalization: they must be inferred by abstraction.

 

(c) The various modal aspects or relation frames (1.5). Various solutions to the problem of the general modes of experience have been presented. However, most of these attempt to solve the problem in terms of a single principle of explanation, or if that leads to difficulties, a dualism. This has led to a proliferation of ‘isms’ in philosophy and science: arithmeticism (Pythagorean tradition), geometricism (Descartes’ more geometrico), mechanism (Galileo, Descartes, Huygens, Leibniz, Kant, Maxwell), evolutionism, vitalism, behaviorism, logicism, intuitionism, historism, etc. In contrast to this trend, I shall attempt a solution in terms of several mutually irreducible modes of experience. Herman Dooyeweerd and Dirk Vollenhoven recognized that the modal laws can be grouped into several law spheres or modal aspects. Each modal aspect is equally general and universal, but is irreducible to any other. Part I of this book will be concerned primarily with only four relation frames, to be designated as the numerical, the spatial, the kinetic, and the physical aspect. The biotic and psychic aspects will be treated in part II, the normative aspects in part III.

 

These three basic distinctions are neither dependent on each other nor reducible to each other. They may be pictured as being mutually orthogonal, like the three axes in a Cartesian coordinate system. The three distinctions, though independent and irreducible, must be studied simultaneously, since they interpenetrate one another. It is not possible to discuss one of them without taking into account the other two. In the following sections I shall discuss these distinctions more extensively. During the discussion I shall point out several distinct aims of science which differ from one another to the extent that different viewpoints are possible within our systematics. I shall argue that each distinction implies a twofold direction of development. This suggests the dynamic development of science (1.7), for the systematics to be discussed will turn out to be dynamic, not static.

 

 

 

1.3. Law and subject

 

 

 

The first basic distinction in this investigation of created reality is that of its law side and its subject side. In every philosophy rightly called scientific, this distinction is explicitly or implicitly made. Without it, science would be impossible. The idea of natural law was developed in classical physics (T&E, 8.6). Yet it is not merely a scientific, epistemological idea; it is rooted in the creation itself. In fact, not only science, but all our life would be impossible without the awareness (mostly subconscious) of laws, distinct from subjects. In our lifetime we encounter things, animals, plants, men, human societies, organizations, and, above all, ourselves. All of these I refer to as subjects, i.e., they are all subjected to some kind of law. Indeed, it is precisely because of these structural laws that we can distinguish the various subjects from one another, and explain and predict their behaviour. Without having some intuitive idea of structural laws which hold for plants, animals, etc., it would not be possible even to speak of them. We would be unable to perform even the simplest acts of life if we had no idea of, and confidence in, lawfulness.

 

There are no subjects without modal or structural, general or specific laws. Every subject is constituted by some law, and is related to other subjects by laws. The reverse is also the case. There are no laws without subjects (either possible or actual). The function of a law is to be valid for some or all subjects. These two sides of reality are correlative. As a result, we must avoid both rationalism, which overrates the law side of creation, and irrationalism, which over-emphasizes the subject side of reality. As sides of reality, both law and subject display the self-insufficiency of the creation. Via the law, subjects receive their meaning by pointing to their origin, the Divine creator of heaven and earth. Isolated from this relation, subjects lose their creational meaning. The other direction indicates that God maintains his creation via the laws. Lawlessness implies not only loss of meaning, but also self-destruction into nothingness.

 

Both the distinction of law and subject and their relation are ontological matters. Since our pre-scientific knowledge of laws is primarily intuitive, the first aim of science is to render these laws explicit, i.e. to explicate them. The laws are implicitly present in reality. We have no a priori knowledge of the laws. Therefore our knowledge of laws, whether implicit or explicit, is both empirical and tentative. It is important to distinguish laws in an ontological sense from our hypothetical law statements in scientific formulations. These statements are also frequently referred to as ‘laws’. Laws and subjects have an ontic nature, whereas theories, models and facts have an epistemic nature.[16] Thus, while electrons have always existed and the laws concerning them have always been valid, an electronic theory and the fact of the existence of electrons did not appear until 1896. Prior to that year the existence of an electron was not a fact. Theories, hypotheses, models and facts, though bound to the creation order, are human inventions. But laws and sub-human subjects exist independently of human knowledge.

 

Laws can be discovered, e.g., by induction, because they are related to subjects, and the validity of law statements can be tested against the facts. This state of affairs, however, does not mean that laws can be reduced to subjectivity. This was most clearly recognized by David Hume, who argued that the ‘inductive assumption’ concerning the possibility of finding laws by induction cannot be justified by experience. Hume insisted that there is no epistemic proof that laws concerning future events can be inferred from regularities observed in the past.[17] This discovery resulted, on the one hand, in a sceptical attitude concerning the very possibility of science and, on the other, in the conviction that laws have a merely epistemic status. I share neither of these views. For me, the possibility of discovering laws is based on faith in the lawfulness of reality and in a God who faithfully maintains his laws. Admittedly, the lawfulness of reality cannot be proved. It is an a priori of all human experience, including scientific experience.[18] According to the philosophy of the cosmonomic idea, different philosophical systems can be characterized according to their respective views on the status of law.[19] Thus the name of this philosophy does not lay claim to the cosmonomic idea. Rather it pleads for the recognition that any philosophical system must account for the lawfulness of reality. Such an account does not have a scientific but a religious starting point, having scientific consequences.

 

The positivist view that the truth of law statements can be established by verification of their factual consequences has been criticized by Karl Popper.[20] However, Popper’s falsifiability criterion, though a correction to the positivist view, is only sufficient to demarcate scientific from non-scientific law statements. Regardless of how much evidence may corroborate a natural law statement, acceptance of the statement as a law is always a matter of faith. A law statement is ultimately believed to be true because of convincing evidence supporting it. This belief does not prove that the law statement is true, for such proof does not exist. This belief is neither individual nor irrational; it is communal, i.e., the community of scientists decides on the reliability of the empirical evidence and the acceptability of physical theories.[21] To perform this task the scientific community organizes societies, journals, etc., in which the evidence is judged and debated, according to unwritten codes. Even then, the truth of any law statement cannot be absolutely proved or disproved. Indeed, in many cases, scientific research is initiated because someone (on quite rational grounds) does not believe the accepted views on some particular subject.

 

Ultimately, acceptance of the truth of law statements and empirical evidence is based on belief, both in the reliability of one’s colleagues, and in the lawfulness of the creation. Hence there is room for the rejection of formerly held law statements, and the critical reconsideration of older evidence in the light of new evidence or insights. This same state of affairs applies to the consideration of subjects. Because of the correlation of law and subject, knowledge of facts is always theory-laden. Thus, at present, it seems quite certain that electrons and stars are real existing entities, whereas one may be less sure about the existence of quarks and quasars. In these cases, too, the degree of certainty depends upon the availability of independent, reliable evidence. In my opinion, both laws and subjects are discovered, implying an active role for the scientific explorer. How theories are found or invented is not well understood. The scientist’s imagination and genius are themselves subject to historical and psychological research, and certainly cannot be reduced to simple logical rules for deduction and induction.[22]

 

Though the number of laws may be infinite, they are not all independent, and it is often possible to deduce one law statement from others. In this case it is said that the former is reduced to the latter. The reduction of laws and, conversely, the deduction of new laws and their consequences for subjects is the second aim of science. Axiomatization can be a very helpful tool in investigating the possibility of such reduction schemes. Attempts to reduce all laws to a single principle have been made in every epoch of philosophy, beginning with Thales’ ‘Everything is made of water’.[23] In classical mechanism Galileo Galilei, René Descartes and Christiaan Huygens attempted to explain all physical phenomena from the motion of unchangeable pieces of matter (T&E, 3.3-3.4).

 

 

 

1.4. Typicality and modality

 

 

 

In addition to the distinction of law and subject, it is very fruitful to introduce a second basic distinction, that of typicality and modality. This distinguishes specific laws which are valid for a limited class of subjects (typical laws) from general laws valid for all kinds of subjects (modal laws). Typical laws, in principle, delineate the class of subjects to which they apply, describing their structures and typical properties. Examples of such laws are Coulomb’s law (applicable only to charged subjects), Pauli’s principle (applicable only to fermions), etc. Often the law describing the structure of a particular subject (e.g., the copper atom) can be reduced to some more general typical laws (e.g., the electromagnetic laws in quantum physics). On the other hand, general, modal laws are those which have a universal validity. For example, the law of gravitation applies to all physical subjects, regardless of their typical structure. We call these modal laws because, rather than circumscribing a certain class of subjects, they describe a mode of being, of experience, or of explanation. In particular, modal laws determine relations.

 

This distinction is also relevant to the way in which different laws are discovered and formulated. Whereas typical laws can usually be found by induction and generalization of empirical facts or lower-level law statements, modal laws are found by abstraction. Euclidean geometry, Galileo’s discovery of the laws of motion and the subsequent development of classical mechanics, and thermodynamic laws are all examples of laws found by abstraction. This state of affairs is reflected in the use of the term rational mechanics, in distinction from experimental physics.

 

At first sight the distinction between typicality and modality appears to apply only to laws. Indeed, all concretely existing things, events, organisms, etc., have some typical structure. However, even as modal laws are found by abstraction, modal subjects, which are abstracted from any typical and individual properties, are also found to exist. The abstract modal subjects (so-called because they are exclusively subjected to modal laws) are indispensable in science for the ordering of our experience. Numbers, coordinate systems, inertial and isolated systems are all examples of modal subjects. They do not exist in any concrete sense, since they lack any individuality and typicality. Nevertheless, in the sense of belonging to created reality, these subjects are perfectly real – they are abstracted from concrete, individually existing things, events, etc. Typical laws must be disentangled in order to discover modal laws. This process could not be carried out without the use of abstract modal subjects. Abstraction may be called the third aim of science, which includes the formulation of modal, universal laws, as well as the modal analysis of concrete reality on both the law side and the subject side.

 

The distinction of typicality and modality is, however, not merely an epistemological one, for though there is a plurality of laws and subjects, there is only one reality. This means that even though subjects may have widely differing typical structures, they must be related in a general way. It is these general (thus modal) subject-subject relations which come to the fore when we study modal laws (1.5). Therefore, the modal aspects may be aptly called relation frames. For instance, two physical subjects, regardless of their typical, individual structures, are always related, since they must have a certain spatial distance and a certain relative velocity. But in order to investigate these general relationships, the subjects must be deprived of their typicality - i.e., modal laws have corresponding modal subjects.

 

I shall define the character of an individual thing or event as a typical set of specific laws (1.5). Therefore, the fourth aim of science is the reconstruction or synthesis of typical laws occurring in characters for classes of things or events. Since modal laws are too universal to form any typical structure, the starting point for the reconstruction cannot be taken solely in the modal laws themselves. As it happens, in physics, in addition to purely modal gravitational interaction one must also consider electromagnetic interaction and two types of nuclear interaction. Despite many efforts toward the development of a unified field theory, these fundamental interactions cannot be reduced to one another (11.1). With the help of modal laws and these typical interactions, an enormous number of characters for typical structures may be recognized: nuclei, atoms, molecules, crystals, particles, quasi-particles, etc. (chapter 11). Investigations of these structures reveal both sides of the modality-typicality distinction: abstraction and reconstruction, analysis and synthesis. Without the existence of the irreducible fundamental typical interactions, typical laws could be subsumed under modal laws. Because of their irreducibility, the distinction of typicality and modality must be recognized as being orthogonal to the distinction between law and subject. The study of typicality rests heavily on modality, which we shall discuss first, but the investigation of modality also requires insight into typicality.

 

This distinction of typicality and modality appears in several other philosophical systems in one form or another. Norman Campbell distinguishes typical laws from other laws. He calls typical laws

 

‘…laws of the kind which assert the properties of a kind of system ... The ‘classificatory’ sciences differ from other sciences in that they confine themselves to laws of this type…’[24]

 

 

 

Henry Margenau[25] speaks of the ‘immediately given’ from which a scientist passes to ‘orderly knowledge’ by the formation of ‘constructs’. Between the former and the latter there are ‘rules of correspondence’ and there is a ‘circuit of empirical confirmation’. Mario Bunge states ‘Every physical idea is expressed in some language and has a logical structure and a context of meaning.’[26] The language has a (modal) syntax or grammar and, via a semantics, is connected with reality. From my point of view, this may be recognized because the logical and sign aspects are universal, but one should avoid the pitfall of absolutizing them.

 

 

 

1.5. Relation frames

 

 

 

The theory of the modal aspects or relation frames, as I prefer to call these, is one of the most important chapters in the philosophy of the cosmonomic idea.[27] Herman Dooyeweerd says:

 

‘ ... our theoretical thought is bound to the temporal horizon of human experience and moves within this horizon. Within the temporal order, this experience displays a great diversity of fundamental modal aspects, or modalities which in the first place are aspects of time itself. These aspects do not, as such, refer to a concrete what, i.e., to concrete things or events, but only to the how, i.e., the particular and fundamental mode, or manner, in which we experience them. Therefore we speak of the modal aspects of this experience to underline that they are only the fundamental modes of the latter. They should not be identified with the concrete phenomena of empirical reality, which function, in principle, in all of these aspects.’[28]

 

 

 

Because of the genetic nature of scientific knowledge the designation of the various modal aspects must always be tentative and hypothetical. Dooyeweerd himself did not distinguish the kinetic from the physical modal aspect until 1953. The distinction of two mutually irreducible modal aspects is based on an analysis of our contemporary knowledge. Part I reports on such an analysis for the first four modal aspects. As we shall see, this analysis sometimes has to rely on insights into specific characters, anticipating the much more extensive investigation in part II.

 

 

 

Principles of explanation

 

In science, the different modes of experience can be different modes of explanation as well. Seventeenth-century physics distinguished four mutually irreducible principles of explanation: quantitative, spatial, kinetic and physical interaction (T&E, chapter 3). This provides us with a possible distinction of the special sciences on an ontological basis, at least insofar as a special science can be characterized by one of the irreducible modes of explanation. In principle, each modal aspect has a corresponding special science: arithmetic or algebra with the numerical aspect, geometry with the spatial aspect, kinematics with the kinetic aspect, physics (including chemistry and astronomy) with the physical relation frame, biology with the biotic aspect, etc.[29] This classification is not exhaustive, however, since some sciences (geology, for example) study certain structures from the viewpoint of several modal aspects, no single one of which takes a leading role.

 

 

 

Temporal relations

 

Temporal reality is a multiply-connected pattern of relations. Although many of these relations have a typical structure, it is only possible to understand the unity, i.e., the mutual relatedness of all subjects in the creation, if at least some of these relations are of a modal, universal nature. All concrete existing things, events, etc., have mutual numerical, spatial, kinetic, and physical relations. These mutual relations make it possible to become aware of and understand these subjects. We are, therefore, entitled to speak of the relation frames as universal modes of temporal relations.

 

Within this modal relatedness a law side may be distinguished from a subject side. On the law side, in each modal aspect, one finds a distinct modal order, which is correlated with a modal subject-subject relation on the subject side. In the numerical aspect the modal order is the serial order of smaller and larger, or earlier and later. This modal order originally correlates with the numerical difference or ratio of two numbers, as modal, numerical subject-subject relations. The modal order in the spatial modal aspect is that of simultaneous coexistence, which is correlated with the relative spatial position of two subjects on the subject side. In the kinetic modal aspect the modal order of uniform time flow is correlated with subjective relative motion, and in the physical aspect the modal order appears as irreversibility, which is correlated with the physical interaction of two or more subjects on the subject side.

 

The modal order in every relation frame refers to our common understanding of time, since earlier or later, simultaneity, the uniform flow of time, and irreversibility are all acknowledged temporal relations. At first sight, the same cannot be said of the modal subject-subject relations such as relative position and interaction. However, we shall see that on the subject side, the opened-up numerical subject-subject relations (anticipating other subject-subject relations) most closely approximate what we usually refer to as ‘time’. This is most clearly shown by an analysis of the historical development of time measurement, at least insofar as such a development can be reconstructed. Initially, time measurement was simply done by counting (days, months, years, etc.). Later, time was measured by the relative position of the sun or the stars in the sky, with or without the help of instruments such as a sundial. In still more advanced cultures, time was measured by utilizing the regular motion of more or less complicated clockworks. Finally, in most recent developments time is measured via irreversible processes, for example, in atomic clocks.

 

In a scientific context, however, it is inadequate to work with either a simple common notion of time, or a merely objective representation of subjective relations. All modal subject-subject relations as well as the modal orders to which they are subjected must be recognized as being temporal. Time relates all subjects to each other under a universal law of order. The question as to whether time is relational or absolute in some sense has long been debated and still has not been settled (T&E, 3.7).[30] Since the 19th century, absolute time implies a unique universal reference system. I shall show that the theory of modal time requires the existence of several reference systems – none of them unique, all of them universal – allowing an objective description of our world.

 

 

 

Projections

 

Although the modal aspects are mutually irreducible, they are neither unconnected nor independent. The modal aspects display a serial order. As a result we can speak of earlier and later modal aspects in the sense that a later modal aspect presupposes the earlier ones. For example, the spatial modal aspect presupposes the numerical relation frame. If this were not so, it would not be possible to speak of three-dimensional space, the four sides of a square, or any other numerical attribute of spatial functioning. In a similar way, the spatial aspect is presupposed by the kinetic modal aspect, which in its turn, is presupposed by the physical aspect. Similarly, the biotic aspect presupposes the physical aspect, and so forth.

 

The later aspects refer back to, or retrocipate on the earlier ones. Thus each modal aspect, except for the numerical (first) aspect, contains retrocipations. It means that the subject-subject relations in one aspect can be projected on those in an earlier one. Indeed, the meaning of any modal aspect cannot be fully grasped without an insight into its retrocipations. Anticipations are the counterparts of retrocipations. Not only does each modal aspect (except the first) retrocipate on the earlier aspects, but each earlier aspect (except the last) anticipates the later ones.

 

Part I will only be concerned with the retrocipations and anticipations between the first four modal aspects. These anticipations and retrocipations project relations in one frame onto relations in another frame. In keeping with our distinction between the law side and the subject side of reality, we shall find these projections both on the law side and on the subject side of the creation. Thus the view that the modal aspects form a sort of layer structure in reality, with each layer built upon the earlier ones, is prohibited. Rather than being well separated departments of reality, the relation frames are intertwined, mutually irreducible, indispensable aspects of reality. The designation and distinction of relation frames and the exploration of their retrocipations and anticipations may be called the fifth aim of science.

 

 

 

The relevance of the relation frames for typicality

 

The distinction of the modal aspects is relevant, not only for modal laws and modal subject-subject relations, but also for typical relationships. A typical structural law may be viewed as a typical conglomerate of relevant modal and typical laws. Such a typical structural set of laws, which I shall call a character, has two limiting modal aspects, to be designated as the founding aspect, and the leading or qualifying frame of reference. For example, atoms, stones, and stars, called ‘physical things’ for short, are qualified by the physical modal aspect, whereas plants and fungi are qualified by the biotic aspect. On the other hand, the structure of an atom is founded in the spatial relation frame, since it consists of a nucleus surrounded by an appropriate number of electrons. In contrast, particles are founded in the numerical frame, since they are characterized only by typical magnitudes. This intricate state of affairs, to be discussed in greater detail in chapters 10 and 11, is further complicated by the fact that, within an atom, the nucleus, though itself spatially founded, functions as a particle. Such a relationship Dooyeweerd referred to as enkapsis: the structure of the nucleus is enkaptically bound within the structure of the atom. I prefer to say that the two structures are interlaced. In the same way, atoms are interlaced with the structure of a molecule, and molecules with the structure of a living cell. This means that, besides the primary qualifying and the secondary founding aspects, each character has a tertiary, anticipatory disposition to be interlaced with other characters. This disposition is largely responsible for the dynamic development of the natural world.

 

 

 

The empirical way to find the various relation frames

 

Hence the modal aspects are presented as mutually irreducible but connected modes of experience, modes of explanation, modes of order, and first of all modes of temporal relations. It should not be surprising to find that modes of experience and explanation are identified with modes of order and relation. In a broad sense, explanation means to order pieces of experience by relating them to other pieces of experience under a law. The relation frames should not be understood as self-evident a priori modes of thought laid bare by a metaphysics independent of empirical science. On the contrary, the arguments for the designation of the modal aspects will be found in science (understood as the empirical investigation of the creation), not in metaphysical speculation, based on a supposed autonomy of human thought.

 

 

 

1.6. Subjects and objects

 

 

 

We have now covered enough ground to justify the use of the word subjects to designate things which, perhaps, are more commonly referred to as objects. In fact, the linguistic use of these words is more original than the modern scientific and philosophical practice.

 

For example, consider the following question: Is it possible to speak of modal, universal, biotic laws which are valid for all kinds of subjects, regardless of their typical structure? Initially, it would seem that a stone is not subject to biotic laws. In order to answer this question adequately it is helpful to distinguish between subjects and objects. In the philosophy of the cosmonomic idea, subjects are actively or directly subjected to a certain law, whereas objects, in contrast, are related to the law only passively or mediately. This implies that objects receive their creational meaning from the subject to which they are related by a subject-object relation. Thus a stone cannot be a biotic subject. Only living organisms can be subjects to biotic laws. But atoms and molecules, rocks and sticks, may function as biotic objects within the sphere of some biotic law. For example, a bird’s nest, as a subject, is subjected to only mathematical and physical laws. As a bird’s nest, however, it can be understood adequately only as a biotic object; the nest has an objective biotic qualifying aspect. The bird’s nest receives its true objective biotic meaning through its relation to a bird, which is a biotic subject.[31]

 

The distinction of subject and object is not limited to typical structures of reality. Subjects and objects also appear on the modal side of reality. The path of a moving subject is a kinetic modal object since the path itself is motionless; and the state of a physical subject is a physical modal object since states do not interact.

 

 

 

Subjects and objects in epistemology

 

It is also possible to speak of subjects and objects in an epistemological context. In this case, however, only humans can be subjects, since things, events, plants, and animals always remain objects of scientific or common thought. The latter can only function as subjects in an ontological context. As observed above, during the first half of the 20th century epistemology took priority over ontology in the dominant western philosophies. Since the Renaissance the ground motive of western thought has been the relation of freedom and nature – i.e., the relation of human thought and activity, and its natural object.[32] In developments of the past four or five centuries, the natural subjects have become increasingly objectified. Whereas they retained an independent existence, determined by their spatial extension or mechanical interaction in the philosophy of René Descartes and Gottfried Leibniz, natural subjects were denatured, in principle, to unknown Dinge an sich in Immanuel Kant’s thought. In modern positivistic and phenomenalistic thought they became mere appearances. Occasionally existentialistic circles have tried to restore nature in a purely individual relation of humans and their environment. Paralleling this development, natural laws were reduced to mere epistemic ordering principles, whether a priori and unavoidable (Kant), merely economic (Ernst Mach), or conventional (Henri Poincaré).

 

These developments are reflected in modern terminology. Today one generally speaks of natural objects, even when their subjectivity to natural law is discussed. The modern view is strongly oriented towards a completely functionalistic view of reality, in which the modal aspects considered as universal modes of thought are the dominant principles of explanation. In this respect, post-Renaissance philosophy differs sharply from Greek and medieval philosophies, which were usually dominated by a typicalistic view, most clearly exemplified in Aristotle’s form-matter scheme.[33]

 

For Christian philosophy there is no need to absolutize any modal aspect, or any typical structure or relationship. At its foundation lies the acknowledgement that the creation is not independent of its Creator. On the one hand, there is no substance which exists independently of law, and, on the other hand, all natural subjects exist as creatures (being and becoming) under the laws. Because they are all subjected to laws, all subjects point to the Lawgiver. Herman Dooyeweerd’s dictum ‘meaning is the mode of being of all that is created,’[34] implies that natural subjects acquire their full meaning only if, in addition to their subject functions, all of their object functions are also opened up in their relation to humankind. In this relation natural subjects receive their full religious meaning since, in its relation to God, humanity is the religious centre of the creation.

 

 

 

Objectivity

 

The distinction of subject and object enables us to achieve a clear insight into the terms objectification and objectivity. In humanistic thought everything which relates to sub-human subjects is referred to as objective. As a result, the demand for an objective science has acquired an entirely confused meaning. It is sometimes understood as being intersubjective or public. In this case one distinguishes between individual (subjective) experience and public (objective) experience.[35] In other contexts objectivity is identified with universal validity or law conformity. In the philosophy of the cosmonomic idea, the meaning of the word objective is different: objectivity means a representation of modal and typical states of affairs referring back to earlier modal aspects. Objectification is made possible by the existence of retrocipations on these earlier aspects, and by the development of the latter’s anticipations. The problem of objectification, which may be termed the sixth aim of science, will occupy much of our attention. Spatial points, which refer back to the numerical modal aspect, enable us to find an objective numerical representation of spatial magnitudes and relative positions (chapter 2). The path of motion, referring back to the spatial modal aspect, provides us with an objective representation of the motion of a kinetic subject (chapter 4). Similarly, the state of a physical system allows us to objectify the system’s interaction with other systems (chapter 5).

 

For physics, objectification means a representation of physical states of affairs in mathematical terms, in particular the projection of physical relations on kinetic, spatial or quantitative ones. It became an important tool in the dynamic development of physics. It is frequently said that mathematics is the language of physics, as if it were a merely linguistic matter (T&E, 1.6). The real state of affairs is more complicated than this metaphor suggests. The modal aspects which precede the physical aspect and form the subject matter of mathematics, are universal aspects of the full creation, including physically qualified things and events. It is impossible to account for physical functioning without including the earlier relation frames in one’s analysis.

 

 

 

1.7. Dynamic development of the relation frames

 

 

 

Dooyeweerd called the development of anticipations the opening process.[36] Including the development of retrocipations, I shall discuss this historical process in the present section. In part II the dynamic development of typical structures will be treated.

 

Several scholars in the history of science have pointed toward this process. Specifically, they reject the view that

 

‘ ... scientists are men who, successfully or not, have striven to contribute one or another element to that particular constellation (of facts, theories, and methods collected in current texts) ...’, such that ‘... scientific development becomes the piecemeal process by which these items have been added, singly and in combination, to the ever growing stockpile that constitutes scientific technique and knowledge’.[37]

 

 

 

In his The structure of scientific revolutions (1962), Thomas Kuhn introduced the distinction between normal science, which is guided by some time-honoured paradigm, and scientific revolutions, during which one paradigm is replaced by a new one (T&E, 5.2).[38] Prior to the introduction and acceptance of any paradigm,

 

‘ ... the early developmental stages of most sciences have been characterized by continual competition between a number of distinct views of nature ... What differentiated these various schools was ... their incommensurable ways of seeing the world and of practicing science in it …’[39]

 

 

 

After a communis opinio is established ‘ ... on the assumption that the scientific community knows what the world is like ...’[40] normal science proceeds as ‘ ... a strenuous and devoted attempt to force nature into the conceptual boxes supplied by professional education’.[41] Eventually, in the course of normal science, anomalies, which cannot be understood within the existing framework, appear and

 

‘ ... then begin the extraordinary investigations that lead the profession at last to a new set of commitments, a new basis for the practice of science.’[42]

 

 

 

Gerald Holton’s Thematic origins of scientific thought also points to the difficulty with which new ideas are accepted. Referring to Albert Einstein’s principle of relativity, he observes:

 

‘... it is precisely such non-verifiable and non-falsifiable (and not even quite arbitrary) thematic hypotheses which are most difficult to advance or to accept. It is they which are at the heart of major changes or disputes, and whose growth, reign and decay are much neglected indicators of the most significant developments in the history of science.’[43]

 

 

 

I wonder whether Kuhn would call paradigmatic the following themes mentioned by Holton: conservation (of mass, energy, etc.), mechanism,

 

‘ ... macrocosmos-microcosmos correspondence, inherent principles, teleological drives, action at a distance, space filling media, organismic interpretations, hidden mechanisms, or absolutes of time, space, and simultaneity’, ‘…the efficacy of geometry, the conscious and unconscious preoccupation with symmetries.’[44]

 

 

 

For Kuhn,

 

‘A paradigm ... is in the first place, a fundamental scientific achievement and one which includes both a theory and some exemplary applications to the results of experiment and observation. More important, it is an open-ended achievement, one which leaves all sorts of research still to be done. And, finally, it is an accepted achievement in the sense that it is received by a group whose members no longer try to rival it or to create new alternatives to it. Instead, they attempt to exploit and extend it in a variety of ways…’[45]

 

 

 

Holton’s themes are more or less orthogonal to the

 

‘contingent plane’ of ‘propositions concerning empirical matters of fact (which ultimately boil down to meter readings) and propositions concerning logic and mathematics (which ultimately boil down to tautologies).’[46] ‘A thematic position or methodological theme is a guiding theme in the pursuit of scientific work, such as the preference for seeking to express the laws of physics whenever possible in terms of constancies, or extrema (maxima or minima), or impotency (‘It is impossible that …’)’[47]

 

 

 

Holton also distinguishes thematic components of concepts such as force or inertia, and thematic propositions or thematic hypotheses, containing one or more thematic concepts, and which may be a product of a methodological theme.[48] As a result, Holton’s themes appear to be more persistent than Kuhn’s paradigms:

 

‘Only occasionally (as in the case of Niels Bohr) does it seem necessary to introduce a qualitatively new theme into science’.[49]

 

 

 

Paul Feyerabend goes even further. Whereas both Kuhn and Holton accept the historical fact of the existence of paradigms, themes, and normal science, Feyerabend insists that the latter is dogmatic, since it clings to a single paradigm.[50] He pleads for open-mindedness, for competing views. It appears, at least from a Kuhnian perspective, that he wishes to return to the pre-paradigm period of science.[51]

 

Feyerabend strongly attacks the ‘restrictive conditions’ of consistency and meaning invariance, present in positivist empiricism:

 

‘Only such theories are then admissible in a given domain which either contain the theories already used in this domain, or which are at least consistent with them inside the domain; and meanings will have to be invariant with respect to scientific progress; that is, all future theories will have to be framed in such a manner that their use in explanations does not affect what is said by the theories, or factual reports to be explained.’[52]

 

 

 

Insofar as it is assumed that sense data are independent of theories, and that the accumulation of new data cannot give rise to a change in meaning of older theories, meaning invariance is a leading motive in positivism. Criticism of this view by Kuhn, Holton, and Feyerabend is based on historical grounds. These writers give many examples which show that any change of paradigm implies a change in meaning, also with respect to observational facts.

 

 

 

Deepening and relativizing

 

According to the philosophy of the cosmonomic idea meaning is determined by the relation of law and subject. Everything created has dependent meaning, as a result of being subjected to laws by its Creator. However, this does not imply another kind of meaning invariance. Indeed, it is precisely in the dynamic development process that meaning is both deepened and relativized. From this perspective, one could paraphrase Kuhn’s theory as follows: In the pre-paradigm phase, scientists are not yet aware of the meaning of their concepts. With the formation of the first paradigms, it is mainly the retrocipatory analogies of the modal aspects or typical structures that are discovered (this includes the search for objectivity, 1.6). Paradigm change is brought about by the discovery of either a new retrocipatory projection acting as a pushing force, or, even more spectacularly, by the discovery of an anticipation acting as a pull, an attractive force. Such developments account for the appearance of Kuhn’s scientific revolutions as well as Holton’s more persistent themes. With the development of a modal aspect, the latter remains in existence, as a fundamental and irreducible mode of explanation, though it may be viewed in a different light. Thus, whether Euclidean or non-Euclidean geometries are used, the aim of geometry remains to account for spatial relations. I shall discuss several examples of this model in chapters to come.

 

Does meaning change if it is developed, and, if so, to what extent does it change and to what extent does it remain invariant? The opening process adds anticipatory projections to a modal aspect, but it simultaneously influences the nuclear meaning of the aspect, together with its retrocipatory projections. This dynamic process I refer to as deepening and relativizing the original meaning of a modal aspect, since in this way the aspect becomes related to later modal aspects. This position is more complicated than either meaning invariance or meaning relativism. It involves both the law side and the subject side of reality.

 

 

 

The concept of mass

 

As an example, let us consider the concept of mass (T&E, 3.6).[53] This concept was introduced first by Johannes Kepler and Galileo Galilei, but became paradigmatic only with Isaac Newton. One of the properties of mass is its conservation in chemical reactions, only justified empirically after Newton’s time. This characteristic of mass is challenged in Albert Einstein’s theory of relativity (3.8). Now, one may ask whether or not the meaning of mass has undergone change. Positivists will reply that, since the factual content of the sense data related to mass has not changed, its meaning must remain invariant. Some operationalists will say that, as there are different experimental methods to determine mass, there are different meanings of mass which are independent of theory, and the meaning of mass will remain invariant with respect to change of theory. Feyerabend, among others, replies that any experimental method is theory laden and, hence, operational meanings are variable with theories. He states that, since mass is subject to different laws in Newtonian physics than in relativity physics, its meaning has also radically changed.[54] Still others point out that relativistic mass shares at least some of the properties of classical mass, such that some sort of family resemblance exists between the two.[55]

 

A view, commonly held in physics, is that Newtonian mechanics is a limiting case of relativity physics, since, at low velocities, the relativistic and Newtonian formulas become approximately equal. The relevance of this statement becomes clear only if we remember that experimental measurements always have a finite accuracy. Within given limits of accuracy, it is rather easy to determine the velocity below which it is impossible to distinguish Newtonian from relativistic results. A positivistic interpretation will say that, since in this case there is no difference between the two theories, the meaning of terms such as mass must also be the same. A realistic interpretation will insist that the meaning of mass is different in the two theories. I tend to reject both these views.
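The remark about finite accuracy can be illustrated numerically. The following is only a minimal sketch, under the assumption (mine, not taken from the text) that the difference between the two theories is measured by the amount by which the Lorentz factor exceeds 1, which is the relative difference between relativistic and Newtonian momentum. The function names lorentz_factor and threshold_velocity are hypothetical, introduced for this illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)


def lorentz_factor(v):
    """Return gamma = 1 / sqrt(1 - v^2/c^2) for a speed v in m/s."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def threshold_velocity(accuracy):
    """Speed at which gamma - 1 equals the given relative accuracy.

    Below this speed the relativistic momentum gamma*m*v differs from
    the Newtonian m*v by less than `accuracy` (assumed criterion), so
    a measurement of that accuracy cannot tell the two apart.
    """
    # Solve 1 / sqrt(1 - v^2/c^2) = 1 + accuracy for v.
    return C * math.sqrt(1.0 - 1.0 / (1.0 + accuracy) ** 2)


if __name__ == "__main__":
    for accuracy in (1e-3, 1e-6, 1e-9):
        v = threshold_velocity(accuracy)
        print(f"accuracy {accuracy:.0e}: threshold about {v / 1000:.1f} km/s")
```

For small accuracies the threshold is close to c times the square root of twice the accuracy; for a measurement good to one part in a million this comes to roughly 424 km/s, far above the velocities of everyday mechanics.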

 

As will be argued in chapters 4 and 5, Newtonian physics is mainly retrocipatory, whereas both relativity physics and wave motion concern the kinematic development of the numerical and spatial modal aspects. This development also has a bearing on the numerical and spatial projections of the physical modal aspect, e.g. on mass. The meaning of mass in Newtonian physics can be understood as a numerical retrocipation in the physical modal aspect. In relativity physics, this retrocipation is also opened up, inasmuch as all numerical and spatial relations become frame-dependent. But this state of affairs implies neither a meaning invariance (since the meaning does change), nor a loss of meaning (since it remains a retrocipation in the physical aspect). Rather, the development of relativity physics results in a deepening and relativizing of the original closed meaning of mass in Newtonian physics. Relativizing does not result in a loss of meaning, especially since the retrocipatory viewpoint remains valid and useful. Indeed, there are so many instances where Newtonian mass is still relevant that it is illegitimate to characterize the Newtonian interpretation as approximately true, but formally false.

 

It should be clear by now that this theory of development does not lead to meaning relativism. Both in closed and in opened up form, meaning is bound to law. Scientists, who study laws and their relations to the subject side of reality, are similarly bound to law. We find, however, that in the opening process not only the subject side but also the law side is involved: that is why meaning is opened up, and why the meaning of a developed modal aspect or typical structure cannot be the same as the meaning of one that is still closed.

 

 

 

1.8. Science and religion

 

 

 

Explicitly, I have presented the following aims of science: (1) the explication of laws, and (2) the reduction and deduction of laws (1.3); (3) abstraction or analysis, and (4) reconstruction or synthesis of typical laws (1.4); (5) the designation of modal aspects and the exploration of retrocipations and anticipations (1.5); (6) objectification (1.6). One could add: (7) the explanation of individual facts and phenomena.[56] These goals of science can be generalized by stating that the aim of science is the theoretical development of the full creation (1.7).

 

In addition to the theoretical opening process, many similar processes are operating within the creation. There is a natural opening process (the temporal evolutionary development of the cosmos, part II); individual ones (the growth, flowering and decay of a plant, or the opening up of the experiential horizon of an animal or human); technical development (the opening up of possibilities laid down in the creation); an artistic one, a social one, a linguistic one, etc. In each of these cases, it may be expected that the directions of retrocipation and anticipation as forces driving dynamic development will be retraceable.

 

The distinction of law and subject is itself directed. Subjects do not exist without laws, and via the laws they acquire meaning as creatures. The direction of subject-to-law points to the origin of creation, the sovereign Creator and Lawgiver, who Himself is subject to no law. As viewed from the subject side, the law is the boundary of created reality, across which no subject can step. For God, the law is not a boundary,[57] but, by maintaining His laws, according to His covenant, He remains faithful to His creation.[58] Thus the direction of law to subject expresses the dependence of the creation upon its Creator. The unfolding process becomes meaningful only because of this law-subject relation.

 

The latter statements are clearly not of a scientific nature, but point to an interesting and illuminating state of affairs, displaying both the similarity and the distinction of science and religion.[59] In both cases humans, who are themselves subjects, search for truth, truth about reality and about themselves. In both cases the attitude of humanity is directed toward the origin of creation, and, therefore, in both cases, their attention is directed toward the law side of reality. The distinction of the two cases lies in the fact that in their scientific attitude, people see the subject side reflected in the law side. As soon as scientists formulate a law (find a law conformity), they must verify it (or falsify it) on the subject side. One may even go as far as Popper who says that no law statement should be called scientific unless it is potentially falsifiable.[60]

 

Humans, however, experience that this scientific attitude is not sufficient for finding the full truth about reality. Through science the origin of creation cannot be found: the law side as the boundary of reality cannot be penetrated. It is in their religious attitude that people seek to look beyond the laws. In this effort no principle of verification can help because any subject points to the law side and beyond for its full religious meaning. At this point human self-insufficiency becomes abundantly clear. Faithful knowledge about the origin of full reality requires revelation, the truth of which humans can only find in religion. However, as we have pointed out earlier, the scientific attitude also rests on faith. The fundamental hypothesis of all sciences - the hypothesis that reality is lawful - cannot be proved; it must be believed. If you don’t believe it, you cannot be a scientist.

 

If the sovereignty of God as Creator and Lawgiver is not recognized, the unity and origin of reality must be found somewhere within temporal reality itself. In western culture, it is always humans themselves who are assigned the task of locating this origin, and, not recognizing the true origin, they must seek their point of reference in either one or another of the modal aspects, or in one of the typical structures. Such selection of reference points has resulted in the formation of the various mutually irreconcilable schools of philosophy, each pretending to be able to explain everything according to a single principle. Alternatively, people may place their trust in power (economic or political), in the church, or in one of the arts.[61] Regardless of where the reference point is chosen, such a choice always leads to a dogmatic (and nonprovable) over-rating of the aspect or structure concerned. A balanced and dynamic view of reality can only be achieved if the dependent and self-insufficient being of creation, of which no aspect or typical structure is overestimated or neglected, is accepted.

 



[1] On positivism, see Frank 1941; Kolakowski 1966; Von Mises 1939; Popper 1974.

[2] Bunge 1967a, 1.

[3] Bunge 1967a, 1, 2.

[4] Bunge 1967a, 2.

[5] Bunge 1967a, 64; Bunge 1967c.

[6] Bunge 1967a, 68, 69.

[7] See Seagal in: Henkin et al. (eds.), 341: ‘... no axiom system is secure if it does not treat a closed system.’

[8] Noll 1974.

[9] Whiteman 1967, 104, 105; cf. Bunge 1967a, 66; 1967b, Chapter 9, especially p. 120.

[10] Gödel’s theorem concerning the consistency and the completeness of axiomatized theories also shows some limitations of this method; cf. Gödel 1962; Bunge 1967a, 64.

[11] Cantore 1969, 5.

[12] Bunge 1967a, 44, 49, 58, 287; Bunge 1967c.

[13] Fraenkel, Bar-Hillel 1958.

[14] These are contemporary philosophies. For an enumeration of eight mostly historical views on the relation of natural philosophy and science, see Beth 1948, Chapter 3. See also Losee 1972.

[15] Dooyeweerd WdW, NC; Vollenhoven 1950, 2010; Tol, Bril (eds.) 1992. For an introduction to this philosophy, see Kalsbeek 1970; Hart 1984; Clouser 1991a; van Woudenberg 1992; Strauss 2009.

[16] Bunge 1959a, 245ff; Bunge 1967a, 44. Bunge 1959a, 249 defines laws as ‘... the immanent patterns of being and becoming ...’ and law statements as ‘... the conceptual reconstructions ...’ of laws. The relation of law and subjects, and the status of theories, models, facts, induction, deduction, and reduction are objects for epistemological research. See e.g. Hempel 1952, 1965, 1966; Nagel 1961; Popper 1959, 1963; Stegmüller 1969-1970.

[17] Hume 1739, 1748; Braithwaite 1953, chapter 9; Harris 1970, 39, 40; Kolakowski 1966, 42-59; Losee 1972, 101-106; Popper 1972, 1-31, 55-105; Russell 1946, 634-647. Dooyeweerd NC, I, 275ff observes that Hume’s scepticism has a methodological significance, intended to reinforce his psychological ideal of science.

[18] Even the existence of subjects outside ourselves cannot be proved, as was shown by solipsism; cf. Russell 1927, 27ff.

[19] See Dooyeweerd NC, I, 93ff. For a discussion of the status of Newton’s second law of motion (which could serve as an illustration of this assertion), see Hanson 1958.

[20] Popper 1959, chapter 1.

[21] The influence of communal belief on accepted theories has been emphasized by Kuhn 1962, 1970; see also Bunge 1967a, 70; Harris 1970; Ziman 1968, 1976.

[22] Kuhn 1962, chapter 2; Feyerabend 1975; Lakatos 1970; Holton 1973, chapter 3; Finocchiaro 1973.

[23] Russell 1946, 44, 61.

[24] Campbell 1921, 56, 57.

[25] Margenau 1950, chapters 3-6.

[26] Bunge 1967a, 9; Jammer 1974, 10ff.

[27] The modal aspects were originally called ‘law spheres’ (‘wetskringen’ in Dutch). Since Stafleu 2002, I call these ‘relation frames’.

[28] Dooyeweerd 1960b, 6, 7; see also Dooyeweerd NC, I, 3; on the criterion of a modal aspect, see Dooyeweerd NC, II, chapter 1.

[29] The prevailing positivist view reverses the creational order by stating that the sciences must be classified according to their methods (cf. Margenau 1950, 46).

[30] Gale (ed.) 1967.

[31] Dooyeweerd NC, I, 42-43. Because of the distinction of subjects and objects, the term ‘subject side of reality’ should be understood as ‘subject-and-object side’, but for short I shall stick to the usual ‘subject side’.

[32] Dooyeweerd NC, I; 1960b; for the subject-object relation in humanist philosophies, see Dooyeweerd NC, II, 367 ff.

[33] Dooyeweerd NC, II, 12; Jaki 1966, chapter 1.

[34] Dooyeweerd NC, I, 4.

[35] See e.g. Popper 1959, 44ff, but also Kant 1781, A 820, B 848; for Popper, objectivity of scientific statements lies in the fact that they can be intersubjectively tested, which implies that the described phenomena should be reproducible. See also Margenau, Park 1967, who enumerate the following ‘meanings of objectivity’: ontological existence (‘the objective reality behind perceptible things’); intersubjectivity; invariance of aspect (‘objectivity must be assigned to those properties which are, or can be made, invariant’); scientific verifiability (‘Constructs which satisfy the metaphysical requirements as well as the stringent rules of empirical confirmation are called verifacts, and verifacts are the carriers of objectivity in the domain of theory’). The ‘metaphysical requirements’, e.g. Ockham’s razor, economy of thought, logical fertility, simplicity, are discussed in Margenau 1950, chapter 5; 1960.

[36] In part III, chapter 15, I interpret this idea in a different sense than Dooyeweerd does, see Dooyeweerd NC, I, 29, II, 181ff. In part IV my views of the development process are applied to the 16th-19th-century history of physics.

[37] Kuhn 1962, 1, 2; see Agassi 1963; for an extensive discussion of Kuhn’s views, see Lakatos, Musgrave (eds.) 1970; Finocchiaro 1973.

[38] Kuhn 1962, 10, 23.

[39] Kuhn 1962, 4.

[40] Kuhn 1962, 5.

[41] Kuhn 1962, 5.

[42] Kuhn 1962, 6.

[43] Holton 1973, 190; see also Holton 1978, chapter 1.

[44] Holton 1973, 24, 25, 27.

[45] Kuhn 1963, 363; it is by no means easy to comprehend the meaning of Kuhn’s paradigms. Masterman 1970 says that Kuhn uses ‘paradigm’ in not less than twenty-one senses.

[46] Holton 1973, 21.

[47] Holton 1973, 28.

[48] Holton 1973, 28.

[49] Holton 1973, 29; also Holton 1973, 61ff.

[50] Feyerabend 1965, 172: ‘Normal science, extended over a considerable time, now assumes the character of stagnation, a lack of new ideas; it seems to become a starting point for dogmatism and metaphysics. Crises, on the other hand, are now not accidental disturbances of a desirable peace; they are periods where science is at its best, exhibiting as they do the methods of progressing through the consideration of alternatives’. See also Popper 1970 and Watkins 1970. Contrary to this, Kuhn 1963, 364 states: ‘Advance from paradigm to paradigm rather than through the continuing competition between recognized classics may be a functional as well as a factual characteristic of mature scientific development’.

[51] Feyerabend 1965, 320, 321: ‘You can be a good empiricist only if you are prepared to work with many alternative theories rather than with a single point of view and ‘experience’. This plurality of theories must not be regarded as a preliminary state of knowledge which will at some time in the future be replaced by the One True Theory’. See also Feyerabend 1975.

[52] Feyerabend 1965, 164; 1970, 323; the latter text reads ‘phrased’ instead of ‘framed’. See also Bohr 1949, 209, 210.

[53] Feyerabend 1970, 325ff; Kuhn 1962, 98ff; Hesse 1974, 64ff.

[54] Feyerabend 1965, 169: ‘That the relativistic concept and the classical concept of mass are very different indeed becomes clear if we also consider that the former is a relation, involving relative velocities between an object and a coordinate system, whereas the latter is a property of the object itself and independent of its behaviour in coordinate systems... . The attempt to identify the classical mass with the relativistic rest mass is of no avail either, for although both may have the same numerical value, they cannot be represented by the same concept’. For a similar viewpoint, see Kuhn 1962, 101, 102.

[55] Kuhn 1962, 45; Hesse 1974, 46-48, 64-65; Hesse observes that the classical and relativistic theories could not even be compared if key concepts like mass had completely different meanings in the two theories.

[56] Popper 1972.

[57] Dooyeweerd NC, I, 99.

[58] Dooyeweerd NC, I, 93.

[59] Dooyeweerd NC, I, 57: By religion is understood ‘... the innate impulse of human selfhood to direct itself toward the true or toward a pretended absolute Origin of all temporal diversity of meaning, which it finds focused concentrically in itself. This description is indubitably a theoretical and philosophical one, because in philosophical reflection an account is required of the meanings of the word ‘religion’ in our argument’.

[60] Popper 1959, 41; see also Lakatos 1970. A similar view was already expressed by Claude Bernard in the 19th century, see Kolakowski 1966, 93. The distinction of falsification and verification reflects the law-subject relation. Scientific law-statements (or ‘all-statements’) should be falsifiable, whereas subjective existential statements (of the form ‘there is a ...’) should be verifiable in order to qualify as empirically meaningful. See Popper 1959, 70.

[61] Or in astrology, superstition, myths, etc. That these convictions cannot be ruled out by their supposed lack of empirical support has been shown by Feyerabend 1965; cf. Kuhn 1962, 2.

 

 

Part I, chapter 2

 

 

Number and space

 

 

 

2.1. Set theory and the first two relation frames

 

 

 

Time and again is mostly concerned with an analysis of the foundations of physics. Such an analysis would be quite impossible, however, without taking into account the quantitative and spatial relation frames. In chapter 2 we shall discuss these, though not as extensively as our discussion of the kinetic and the physical aspects in subsequent chapters. This chapter should not be taken out of the context of this book. My only intention is to investigate the quantitative and the spatial modal aspects insofar as they are relevant to physics. The mutual irreducibility of these aspects will be discussed later on. In the present section I shall give a provisional outline of their meaning, and discuss their relation to set theory. The reader should keep in mind the mutual orthogonality of the distinction of law and subject, and that of the various relation frames.

 

 

 

The concept of a set

 

Plato and Aristotle introduced the traditional view that mathematics is concerned with numbers and with space. Since the end of the 19th century, many people have thought that the theory of sets would provide mathematics with its foundations.[1] Since the middle of the 20th century, the emphasis has been more on structures and relations.[2]

 

Numbers constitute the relation frame for all sets and their relations. A set consists of a number of elements, varying from zero to infinity, whether denumerable or not, but there are sets of numbers as well. Which came first, the natural number or the set? Just as in the case of the chicken and the egg, an empiricist may wonder whether this is a meaningful question. We have only one reality available, to be studied from within. In the cosmos, we find chickens as well as eggs, sets as well as numbers. Of course, we have to start our investigations somewhere, but the choice of the starting point is relatively arbitrary. Rejecting the view that mathematics is part of logic, I shall treat sets and numbers in an empirical way, as phenomena occurring in the cosmos.

 

At first sight, the concept of a set is rather trivial, in particular if the number of elements is finite. Then the set is denumerable and countable; we can number and count the elements. It becomes more intricate if the number of elements is not finite yet denumerable (e.g., the set of integers), or infinite and non-denumerable (e.g., the set of real numbers). The numerical modal aspect of discrete quantity, as a universal mode of being, presupposes that every created thing is a unity, and that there exists a multitude of such unities. The numerical modal aspect is universal since there is nothing in the creation which is not subjected to numerical order. This order can be described as the order of before and after, both in its original meaning of more and less, and in its analogical meaning of smaller and larger in magnitude.

 

The spatial modal aspect of continuous extension explains why a unique ordering of everything created is impossible solely with the numerical order of before and after. Thus different sets may have the same number of elements, and different things may have the same size. The spatial order of simultaneous coexistence (on the law side of the spatial modal aspect) makes possible the original spatial relation of relative position (on the subject side of the spatial modal aspect). This spatial modal order also involves the analogical concept of equivalence with respect to some property, thereby allowing things to share this property in different degrees. The spatial modal order is only universal if it is considered together with the numerical order. Although the order of simultaneity does not apply to everything created, one can account for all static relations if this order and the order of before and after are taken together.

 

Set theory is nowadays generally considered to be the basis of the theory of number. Later, in chapter 8, I shall discuss the concept of probability and argue that it refers to the law-subject relation for individuality structures. Since the theories of probability and sets are closely related, I view the idea of a set as giving expression to the law-subject relation. Sets are always determined by some law. This is even the case with examples like ‘the set of all books in my room’, for this refers to the law defining ‘books’. In this context the set of all things on my desk is ill defined without further specification of a ‘thing’. In general, classes are not identifiable, or even imaginable, unless they are defined by a set of laws, and these laws are usually not of a mathematical kind. It is not strictly correct to say that a set is determined by a law. I prefer to say that a set has a law side and a subject side. The idea of a set cannot be reduced, either to the law side, or to the subject side.

 

 

 

Numbers and sets

 

The concept of number cannot be studied without the idea of sets. Both Baruch de Spinoza and Gottlob Frege observed that one cannot ascribe a number to things, unless these are grasped under a genus.[3] If in the realm of concrete things and events the post-numerical modal aspects are ignored, there is still the possibility of taking some of them together in a collection. After this process of abstraction, all that remains to be said is that concrete things belong to classes of things. The common property of all finite collections is that they can be counted, regardless of the spatial, kinetic, physical, etc., properties of their elements. Thus all finite collections are related either directly by a one-to-one correspondence, or by a one-to-one correspondence between one collection and a proper sub-set of another one. In the latter case the first collection is called smaller than the second one. Because this property is universal, one can now abstract from concrete sets, discovering an abstract and unique collection of natural numbers, serving as a universal reference system for all finite collections.[4]

 

On the other hand one cannot talk about a set without having a previous idea of a plurality of concrete things and events,[5] nor can one dispense with the individual unity of its members. Aristotle considered the individuality of things as their only property relevant to arithmetic. For Aristotle individuality meant the identity of a thing with itself and its being distinct from other things. Arithmetic had to abstract from all other properties of real things.[6] For example, the universal law of addition demands that if a collection of m members is added to a collection of n members, one always arrives at a collection of (m+n) members, whatever the character of the two collections, provided they have no member in common. This implies that each member has its own subjective identity.

 

 

 

Space and sets

 

The concept of space cannot be studied without the idea of sets either. A spatial figure is characterized by being connected and having parts. At the same time we have to consider it as an uncountable set of points, though we cannot define it as such. The fact that we can consider each spatial figure as a collection of connected and nevertheless disjoint parts is the necessary basis for the introduction of spatial magnitude.

 

On the other hand, the idea of a set always has a spatial aspect. In a set we have a number of coexisting members. Members can simultaneously belong to different sets. The notion of sub-sets of a set refers to the simultaneous existence of a whole and its parts. Also the concepts of ‘union’ and ‘intersection’ of sets refer clearly to the spatial modal aspect. In order to make the transition of all finite collections to the set of natural numbers, one often makes use of the concept of ‘equivalence class’. The numerical order of more and less is not directly applicable to sets, but only to equivalence classes of sets, each equivalence class uniting all sets with the same number of elements. This also shows that the spatial as well as the numerical orders are presupposed in this attempt to base a theory of numbers on set theory.[7] In fact, even if we talk about the set of natural numbers, we already refer to simultaneity.

 

Without the introduction of numerical and spatial orders, the sub-sets of a set can only be partially ordered. In order to arrive at a universal order of sets, we have to introduce the more abstract orders of seriality and spatial simultaneity. For Aristotle, the number of a set was a concrete property. Frege was one of the first to recognize the abstract character of the cardinal numbers: there is only one number six, regardless of how many sextuples of concrete things exist.[8] Even Russell’s definition of the number of a class as ‘the class of all classes which are equivalent to that class’[9] presupposes the abstraction of all properties of sextuples, except those of being classes and having six members. It especially presupposes the abstraction from the spatial order of simultaneity, for in this case, one abstracts from the fact that so many sextuples exist simultaneously.

 

It is not my intention to investigate the foundations of set theory. The above arguments only serve to make clear the mutual orthogonality of the law-subject distinction, which finds its mathematical expression in the theory of sets, and the distinction of the various modal aspects, which we intend to study in this and the subsequent chapters.

 

 

 

2.2. Numerical relations and the theory of groups

 

 

 

The numbers form an abstract reference system for any serial order. Having no concrete existence, their meaning is purely modal. They are numerical modal subjects, being subject to numerical modal laws only. The different number systems which are relevant to physics will be investigated briefly: the natural, integral, rational, real, and complex numbers, as well as vectors. This will be done in a quasi-formal way, using a group-theoretic approach, because of the relevance of group theory to present-day physics, and to our analysis of it. As will be seen later (10.5), groups are typical structures with a numerical character, to be used as instruments in the analysis of the numerical, spatial and kinetic relation frames.

 

 

 

Natural numbers

 

On the law side of the numerical relation frame time expresses itself as the serial order of before and after.[10] The number 2 is earlier than the number 3, because the latter can be generated from the former by addition of the number 1.

 

On the subject side, the numerical difference is correlated to this temporal order. Obviously, the statement that some number is later than another one gives rise to the question: ‘How much later?’ Indeed, the numerical difference between two numbers is related to their temporal order of earlier and later: the difference is positive or negative depending on this order (if a>b, then a-b>0, etc.).

 

This serial order forms the basis of Giuseppe Peano’s axioms formulating the laws for the sequence N of the natural numbers.[11] The axioms apply the concepts of sequence, successor and first number, but do not apply the concept of equivalence. According to Peano, the concept of a successor is characteristic for the natural numbers:

 

 

1. N contains a natural number, indicated by 0.[12]

2. Each natural number a is uniquely joined by a natural number a+, the

    successor of a.[13]

3. There is no natural number a such that a+=0.

4. From a+=b+ follows a=b.

5. If a subset M of N contains the element 0, and besides each element

    a its successor a+ as well, then M=N.[14]

 

The transitive relation ‘larger than’ is now applicable to the natural numbers.[15]

 

The character of the natural numbers expressed by Peano’s axioms is primarily quantitatively characterized. It has no secondary foundation for lack of a relation frame preceding the quantitative one.[16] As a tertiary characteristic, the set of natural numbers has the disposition to expand itself into other sets of numbers.

 

The laws of addition, multiplication, and raising to powers are derivable from Peano’s axioms.[17] The class of natural numbers is complete with respect to these operations.[18] If a and b are natural numbers, then a+b, a.b and a^b are natural numbers as well. This does not always apply to subtraction, division or taking roots, and the laws for these inverse operations do not belong to the character of natural numbers.
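How addition, multiplication and powers can be built from the successor alone may be illustrated by a minimal Python sketch. It follows the usual recursive definitions and does not reproduce the derivation cited in the notes. The function names (succ, pred, add, mul, power) are assumptions introduced here, and Python integers merely stand in for the natural numbers.

```python
# Natural numbers are represented by Python ints >= 0, but the three
# operations below use only zero, the successor and its reversal.

def succ(a):
    """The successor a+ of a (axiom 2)."""
    return a + 1

def pred(a):
    """The unique b with succ(b) == a; 0 has none (axioms 3 and 4)."""
    if a == 0:
        raise ValueError("0 is not the successor of any natural number")
    return a - 1

def add(a, b):
    # a + 0 = a ;  a + (b+) = (a + b)+
    return a if b == 0 else succ(add(a, pred(b)))

def mul(a, b):
    # a . 0 = 0 ;  a . (b+) = (a . b) + a
    return 0 if b == 0 else add(mul(a, pred(b)), a)

def power(a, b):
    # a^0 = 1 ;  a^(b+) = (a^b) . a
    return succ(0) if b == 0 else mul(power(a, pred(b)), a)

print(add(2, 3), mul(3, 4), power(2, 5))
```

Each operation returns a natural number whenever its arguments are natural numbers, which is the completeness referred to above. No comparable recursion yields subtraction, since pred is undefined for 0.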

 

The set of natural numbers is the oldest and best-known set of numbers. Yet it is still subject to active mathematical research, resulting in newly discovered regularities, making arithmetic an empirical science.[19] Some theorems relate to prime numbers. Euclid proved that the number of primes is unlimited. An arithmetical law says that each natural number is the product of a unique set of primes. Several other theorems concerning primes are proved or conjectured.[20]

 

In many ways, the set of primes is notoriously irregular. There is no law to generate them. If one wants to find all prime numbers less than an arbitrarily chosen number n, this is only possible with the help of an empirical elimination procedure, known as Eratosthenes’ sieve.[21]
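The elimination procedure can be sketched in a few lines of Python. This is one common formulation of Eratosthenes’ sieve, given for illustration only; the function name primes_below is an assumption.

```python
def primes_below(n):
    """Return all primes less than n by Eratosthenes' sieve."""
    if n <= 2:
        return []
    # Every number below n starts as a candidate, except 0 and 1.
    is_candidate = [True] * n
    is_candidate[0] = is_candidate[1] = False
    p = 2
    while p * p < n:
        if is_candidate[p]:
            # Strike out the multiples of p, starting at p*p;
            # smaller multiples were struck out by smaller primes.
            for multiple in range(p * p, n, p):
                is_candidate[multiple] = False
        p += 1
    return [k for k in range(n) if is_candidate[k]]

print(primes_below(30))
```

The procedure does not generate the primes from a formula. It finds them by eliminating every number that turns out to be a multiple of a smaller one.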

 

 

 

The whole-part relation

 

It is very important to distinguish a set from its members. The relation of a set to its elements is a numerical law-subject relation, for a set is a number of elements. By contrast, the relation of a set to its subsets is a whole-part relation that can be projected on a spatial figure having parts. A subset is not an element of the set, not even a subset having only one element.[22] A set may be a member of another set. For instance, the numerical equivalence class [n] is a set of sets.[23] However, the set of all subsets of a given set A (the ‘power set of A’) should not be confused with the set A itself.[24]
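The difference between membership and the whole-part relation can be illustrated with a small Python sketch. The set A, the helper power_set and the name single are chosen here for illustration only and do not occur in the text.

```python
from itertools import combinations

def power_set(a):
    """Return the set of all subsets of a, each as a frozenset."""
    items = list(a)
    return {
        frozenset(chosen)
        for size in range(len(items) + 1)
        for chosen in combinations(items, size)
    }

A = frozenset({1, 2, 3})
single = frozenset({1})       # the subset of A having only the element 1

print(1 in A)                 # membership: is 1 an element of A?
print(single <= A)            # whole-part: is {1} a subset of A?
print(single in A)            # is {1} an element of A?
print(A in power_set(A))      # is A an element of its own power set?
print(len(power_set(A)))      # number of subsets of A
```

Python keeps the two relations apart in the same way as the text does: `in` tests membership, `<=` tests the whole-part relation, and the power set of A is a different set from A.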

 

Overlapping sets have one or more elements in common. The intersection A∩B of two sets is the set of all elements that A and B have in common. The empty set or zero set ∅ is the intersection of two sets having no elements in common. Hence, there is only one zero set. It is a subset of all sets.[25] If a set is considered a subset of itself, each set has trivially two subsets. (An exception is the zero set, having only itself as a subset).

 

The union A∪B of two sets looks more like a spatial than a numerical operation. Only if two sets have no elements in common is the total number of elements equal to the sum of the numbers of elements of the two separate sets. Otherwise, the total is less than that sum.[26]
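A brief numerical illustration in Python, with three arbitrarily chosen example sets (A, B and C are assumptions, not taken from the text):

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}      # overlaps with A in the elements 3 and 4
C = {6, 7}         # has no element in common with A

def size_of_union(x, y):
    """Count the union from the sizes of the sets and of their intersection."""
    return len(x) + len(y) - len(x & y)

print(len(A | B), len(A) + len(B))   # overlapping sets
print(len(A | C), len(A) + len(C))   # disjoint sets
```

The number of elements of the union equals the sum of the two numbers minus the number of elements of the intersection. It equals the plain sum only when the intersection is empty.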

 

Hence, even for denumerable sets the numerical relation frame is not sufficient. At least a projection on the spatial relation frame is needed. This is even more true for non-denumerable sets.

 

Some sets are really spatial, like the set of points in a plane contained within a closed curve. As its magnitude, one does not consider the number of points in the set, but the area enclosed by the curve. The set has an infinite number of elements, but a finite spatial measure. A measure is a magnitude referring to but not reducible to the numerical relation frame. It is a number with a unit, a proportion.

 

This measure does not deliver a numerical relation between a set and its elements. It is not a measure of the number of elements in the set. A measure is a quantitative relation between sets, e.g., between a set and its subsets. If two plane spatial figures do not overlap but have a boundary in common, the intersection of the two point sets is not zero, but its measure is zero. The area of the common boundary is zero. For a spatial set, only subsets having the same dimension as the set itself have a non-zero measure. Integral calculus is a means to determine the measure of a spatial figure, its length, area or volume.

 

For each determination of a measure, each measurement, real numbers are needed. That is remarkable, for an actual measurement can only yield a rational number.

 

The number 2 is natural, but it is an integer, a fraction, a real number and a complex number as well. Precisely formulated: the number 2 is an element of the sets of natural numbers, integers, fractions, real, and complex numbers. This leads to the conjecture that the character of natural numbers does not determine a class of things, but a class of relations. The meaning of a number depends on its relation to all other numbers and the disposition of numbers to generate other numbers.[27]

 

The natural numbers constitute a universal relation frame for all denumerable sets. Peano’s formulation characterizes the natural numbers by a sequence, that is a relation as well. The integers, the rational, real, and complex numbers are definable as relations as well. Therefore, it is not strange that the number 2 answers different types of relations. A quantitative character determines a set of numbers, and a number may belong to several sets.

 

 

 

Group theory

 

Since the addition of two numbers yields a number, and the difference between any two numbers is a number, some sets of numbers may form a group. In 1831 Évariste Galois introduced the concept of a group in mathematics as a set of elements satisfying the following four axioms.[28] A group is a collection of distinct elements A, B, C, … on which a combination procedure is defined, such that for any pair of elements A, B an element AB can be generated, according to the following rules:

 

(a) If A and B  are elements of the group, then the combination AB is also

     an element.[29]

(b) (AB)C=A(BC)=ABC – the group operation is associative

(c) the group contains one element I, called the identity element, such that

     for each element A of the group, AI=IA=A.

(d) to each element A corresponds an inverse element A’, such that AA’=A’A=I.

 

 

Here, the equality sign (=) must be understood as ‘is the same as’, ‘is equal to’, ‘cannot be distinguished from’, or ‘can always be substituted for’. There is no intrinsic way to distinguish the element AA’ from the element I, for instance. The extrinsic lingual distinction only accounts for the different possibilities of generating the same element.

 

These four rules form the generic character of a group (10.5). They do not fully determine a group, however. As to the law side, one has to specify the group operation, and as to the subject side, one has to indicate the members of the group, by stipulating some members as a set of generators. The other members are dynamically generated by application of the group operation. Several different groups (i.e., having different members, and possibly a different group operation) may have the same group structure. In that case the groups are called different isomorphic models or representations of the same group structure. An isomorphism consists on the subject side of a one-to-one correspondence between the members of the two groups, and on the law side of a parallelism between the respective group operations. If the members A, B, C in one group correspond with the members K, L, M in the other group, and if AB=C, then KL=M. Thus the law does not define its subjects: the subject side cannot be reduced to the law side. Isomorphism plays an important part in finding objective relations, e.g. by projecting physical relations on mathematical ones.

 

As a character, a group is qualified by the numerical relation frame. It has no foundation in a preceding frame (because there is none), and it has the disposition of being applied in the numerical and later frames, in particular in the study of characters qualified by the physical, kinetic, and spatial modal aspects.

 

Groups may be finite or infinite. The smallest groups contain just one element – evidently the identity element. For example, the number 1 forms a multiplication group, and the number 0 an addition group. These two groups are even isomorphic. The numbers 1 and -1 also form a multiplication group, consisting of just two members. Finite groups are very important in the physics of typical structures, but infinite groups are more interesting for the extension of the set of natural numbers.
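For a finite set, the four rules (a) to (d) can be checked mechanically. The following Python sketch tests them by brute force on the small examples just mentioned. The function name is_group is an assumption, and the method only works for finite sets.

```python
from itertools import product
from operator import add, mul

def is_group(elements, op):
    """Check rules (a)-(d) for a finite set under the operation op."""
    elements = list(elements)
    # (a) the combination of any two elements is again an element
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # (b) the operation is associative
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # (c) there is an identity element I with AI = IA = A
    identities = [i for i in elements
                  if all(op(a, i) == a and op(i, a) == a for a in elements)]
    if not identities:
        return False
    identity = identities[0]
    # (d) every element A has an inverse A' with AA' = A'A = I
    return all(any(op(a, b) == identity and op(b, a) == identity
                   for b in elements)
               for a in elements)

print(is_group({1}, mul))        # the number 1 under multiplication
print(is_group({0}, add))        # the number 0 under addition
print(is_group({1, -1}, mul))    # the numbers 1 and -1 under multiplication
print(is_group({0, 1, 2}, add))  # fails rule (a): 1 + 2 is not in the set
```

The last example is added here as a contrast. It shows that a set of numbers fails to be a group as soon as the operation leads outside the set.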

 

 

 

Negative and rational numbers as relations

 

The set of natural numbers does not form a group, though if addition is taken as the group operation, the natural numbers satisfy rules (a) and (b). But there are no inverse elements, which means that within the set of natural numbers, subtraction is not always defined. However, by including the number zero and the negative integers, one arrives at a group. The integers are generated as members of the smallest addition group, which includes among its members the natural numbers. The group operation is addition, the inverse of a positive integer is a negative integer, and vice versa. To show that one has to specify some members of the group, it should be observed that the addition group of integers is isomorphic to the addition group of even integers, of multiples of three, etc. In this approach the positive integers are identified with the natural numbers.
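
The isomorphism between the addition group of integers and that of the even integers can be made concrete. The sketch below is a hedged illustration on a finite sample only; the correspondence n ↔ 2n and the name to_even are assumptions chosen for the example.

```python
def to_even(n):
    """Hypothetical correspondence: integer n <-> even integer 2n."""
    return 2 * n

sample = range(-5, 6)

# Law side: the sum of the images equals the image of the sum.
preserves_addition = all(to_even(a + b) == to_even(a) + to_even(b)
                         for a in sample for b in sample)
# Subject side: distinct integers have distinct images (checked on the sample).
one_to_one = len({to_even(n) for n in sample}) == len(sample)
# Identity and inverses are carried along by the correspondence.
keeps_identity = to_even(0) == 0
keeps_inverse = all(to_even(-n) == -to_even(n) for n in sample)

print(preserves_addition, one_to_one, keeps_identity, keeps_inverse)
```

A finite sample cannot prove the isomorphism, of course; it merely shows what the parallelism between the two group operations amounts to.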

 

Within the group structure, the element AB’ can be considered as expressing the intrinsic relation between two elements A and B (for short, I shall say that AB’ is the relation between A and B). The relation between two integers is their numerical difference. The reverse relation is BA’. The relation of an element to itself is AA’=I, the identity. Because AI=IA=A, the relation of an element to the identity element is identical with the element itself. Therefore, the numerical difference between two numbers, as the basic numerical subject-subject relation, is a numerical modal subject itself.[30]

 

Difference is not the only conceivable numerical relation. From addition we can derive the operation of multiplication of two natural numbers (as an abstraction of the repeated addition of equally numbered collections).[31] If we introduce multiplication as a group operation, we generate the positive rational numbers as the members of the smallest multiplication group, whose members include the natural numbers.[32] For the group of positive rational numbers the identity element is the number 1, the inverse of a rational number is a fraction, and the group relation is the ratio between two rational numbers. The set of all rational numbers (positive, negative, and zero) is then defined as the addition group, whose elements include the positive rational numbers.[33] It cannot be defined as a multiplication group, because the number 0 has no inverse for multiplication.[34]

 

For the introduction of the rational numbers two group operations are required. This leads to the idea of a field, another ‘algebra’. A field is a collection of subjects in which two operations are defined (e.g., addition and multiplication), each satisfying the same rules as for groups, except that the identity element for one operation has no inverse with respect to the second operation. The two operations are connected via the distributive law: (A+B)×C=(A×C)+(B×C). Examples are the fields of rational numbers, of real numbers, and of complex numbers. (There are finite fields as well.) They have the usual addition and multiplication as operations, whereas dividing by zero is not defined.
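
The field of rational numbers is available in Python's standard library as fractions.Fraction, which makes the two group relations and the distributive law easy to try out. The particular values of a, b and c below are arbitrary choices for the sake of illustration.

```python
from fractions import Fraction

a, b, c = Fraction(1, 2), Fraction(-2, 3), Fraction(5, 7)

# Group relation for addition: the difference; for multiplication: the ratio.
difference = a - b
ratio = a / c

# Distributive law connecting the two operations.
distributive = (a + b) * c == a * c + b * c

# The additive identity 0 has no multiplicative inverse.
try:
    Fraction(1) / Fraction(0)
    zero_has_inverse = True
except ZeroDivisionError:
    zero_has_inverse = False

print(difference, ratio, distributive, zero_has_inverse)
```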

 

 

 

Discrete and dense sets

 

The group structure does not specify an order between the elements. The groups discussed so far can be ordered according to the law mentioned at the beginning of this section: A>B if A–B>0, where ‘larger than zero’ means ‘being positive’. A set is called discrete in a certain order, if in that order each element has just one successor. Every finite collection is discrete, and so are the sets of natural and integral numbers. In a series the natural numbers (acting as ‘ordinal numbers’) serve as indices. A set is called denumerable if its members can be put in such a series, i.e., if there is a one-to-one correspondence between the members of this set and the natural numbers. The order of this series is extrinsic, being given by the indices. An intrinsic numerical order is determined by the numerical values of the set’s members themselves.

 

Now consider the set of the rational numbers, which can be arranged in a series, as is shown in any textbook on number theory.[35] In this series, in which a member is not necessarily larger than all preceding members, the members are arranged in an extrinsic numerical order (of the indices). The rational numbers in their intrinsic numerical order of smaller and larger do not form a discrete series, but a dense set. This means that any interval, however small, contains at least one rational number, and therefore an infinitude of rational numbers: there is no empty interval.

 

With the concept of a dense set, the limit is reached of the closed numerical modal aspect. It is the starting point for the opening up of this aspect, anticipating later modal aspects, as will be seen presently.

 

 

 

2.3. The development of the numerical relation frame

 

 

 

The road from the natural numbers to the real ones proceeds via the rational numbers. A set is denumerable if its elements can be put in a sequence. Georg Cantor demonstrated that all denumerable infinite sets are numerically equivalent, such that they can be projected on the set of natural numbers. Therefore, he accorded them the same cardinal number, called ℵ₀, aleph-zero, after the first letter of the Hebrew alphabet. Cantor assumed this ‘transfinite’ number to be the first in a sequence, ℵ₀, ℵ₁, ℵ₂, … , where each is defined as the ‘power set’ of its predecessor, i.e., the set of all its subsets.

 

The rational numbers are denumerable, at least if put in a somewhat artificial order. The infinite sequence 1/1; 1/2,2/1; 1/3,2/3,3/1,3/2; 1/4,2/4,3/4,4/1,4/2,4/3; 1/5, … including all positive fractions is denumerable. In this order it has the cardinal number of ℵ₀. However, this sequence is not ordered according to increasing magnitude.
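
The artificial order quoted above can be generated step by step. The following Python sketch assumes that the n-th block consists of the fractions k/n followed by n/k for k smaller than n, which reproduces the blocks listed in the text; the generator name fractions_in_blocks is hypothetical.

```python
from fractions import Fraction
from itertools import islice

def fractions_in_blocks():
    """Yield (numerator, denominator) pairs in the block order quoted in the text."""
    n = 1
    while True:
        if n == 1:
            yield (1, 1)
        else:
            for k in range(1, n):      # 1/n, 2/n, ..., (n-1)/n
                yield (k, n)
            for k in range(1, n):      # n/1, n/2, ..., n/(n-1)
                yield (n, k)
        n += 1

# The first four blocks contain 1 + 2 + 4 + 6 = 13 fractions.
first_thirteen = list(islice(fractions_in_blocks(), 13))
values = [Fraction(p, q) for p, q in first_thirteen]

# The extrinsic order of the indices is not the order of increasing magnitude.
is_increasing = all(x < y for x, y in zip(values, values[1:]))
print(first_thirteen)
print(is_increasing)
```

Each positive fraction receives an index in this series, although equal values such as 1/2 and 2/4 occur more than once.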

 

In their natural (quantitative) order of increasing magnitude, the fractions lie close to each other, forming a dense set. This means that no rational number has a unique successor. Between each pair of rational numbers a and b there are infinitely many others.[36] In their natural order, rational numbers are not denumerable, although they are denumerable in a different order. Contrary to a finite set, whether an infinite set is countable may depend on the order of its elements.[37]
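
That no rational number has a unique successor can be illustrated by repeatedly inserting a new fraction between two given ones. The sketch below uses the arithmetic mean as one possible choice of an intermediate number; the function name between and the starting values 1/3 and 1/2 are assumptions made for the example.

```python
from fractions import Fraction

def between(a, b):
    """Return a rational number strictly between two distinct rationals."""
    return (a + b) / 2

a, b = Fraction(1, 3), Fraction(1, 2)
inserted = []
for _ in range(5):
    b = between(a, b)   # shrink the interval; it never becomes empty
    inserted.append(b)

print(inserted)
```

However often the step is repeated, the new fraction remains larger than a, so that no fraction can be called the successor of a.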

 

Though the set of fractions in their natural order is dense, it is still possible to put other numbers between them. These are the irrational numbers, like √2 and π. According to the tradition, Pythagoras or one of his disciples discovered that he could not express the ratio of the diagonal and the side of a square by a fraction of natural numbers. Observe the ambiguity of the word ‘rational’ in this context, meaning ‘proportional’ as well as ‘reasonable’. The Pythagoreans considered something reasonably understandable, if they could express it as a proportion. They were deeply shocked by their discovery that the ratio of a diagonal to the side of a square is not rational. The set of all rational and irrational numbers, called the set of real numbers, turns out to be non-denumerable. I shall argue presently that the set of real numbers is continuous, meaning that no holes are left to be filled.

 

Only in the 19th century did the distinction between a dense and a continuous set become clear.[38] Before, continuity was often defined as infinite divisibility, not only of space. For ages, people have discussed the question whether matter is continuous or atomic. Could one go on dividing matter, or does it consist of indivisible atoms? They overlooked a third possibility, namely that matter would be dense.

 

Even the division of space can be interpreted in two ways. The first was applied by Zeno when he divided a line segment by halving it, then halving each part, etc. This is a quantitative way of division, not leading to continuity but to density. Each part has a rational proportion to the original line segment. Another way of dividing a line is by intersecting it by one or more other lines. Now it is not difficult to imagine situations in which the proportion of two line segments is irrational. (For instance, think of the diagonal of a square.) This spatial division shows the existence of points on the line that quantitative division cannot reach.

 

By his famous diagonal method, Cantor proved in 1892 that the set of real numbers is not denumerable. Cantor indicated the infinite amount of real numbers by the cardinal number C. He posed the problem of whether C equals ℵ₁, the transfinite number succeeding ℵ₀. At the end of the 20th century, this problem was still unsolved.
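
The core step of the diagonal method can be shown on a finite stand-in for a supposed list of infinite decimal fractions. The sketch below is only an illustration under that simplification: the digit strings in listed are arbitrary, and the function name diagonal_escape is hypothetical.

```python
def diagonal_escape(rows):
    """Build a digit string that differs from the n-th row in its n-th digit."""
    # Replace each diagonal digit by a different one (5 -> 4, anything else -> 5),
    # avoiding 0 and 9 so that no ambiguity like 0.4999... = 0.5000... arises.
    return "".join("4" if row[i] == "5" else "5" for i, row in enumerate(rows))

# A purely illustrative, finite stand-in for a supposed list of decimal expansions.
listed = ["1415926", "7182818", "4142135", "5005000",
          "3333333", "6180339", "5772156"]

new_digits = diagonal_escape(listed)
differs_everywhere = all(new_digits[i] != row[i] for i, row in enumerate(listed))

print(new_digits)
print(differs_everywhere, new_digits in listed)
```

Because the constructed string differs from every listed row in at least one position, it cannot occur in the list. Applied to any supposed enumeration of all infinite decimal fractions, the same step yields a number that the enumeration has missed.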

 

A theorem states that each irrational number is the limit of an infinite sequence or series[39] of rational numbers, e.g., an infinite decimal fraction. This seems to prove that the set of real numbers can be reduced to the set of rational numbers, just as the rational numbers are reducible to the natural ones, but that may be questioned. A procedure to find these limits cannot be carried out in a countable, consecutive way. This would only lead to a denumerable (even if infinite) amount of real numbers.[40] To arrive at the set of all real numbers requires a non-denumerable procedure. But then we would use a property of the real numbers (not shared by the rational numbers) to make this reduction possible. And this appears to result in circular reasoning.

 

 

 

Continuous sets

 

Suppose one wants to number the points on a straight or curved line. Would the set of rational numbers be sufficient? Clearly not, because of the existence of spatial proportions like that between the diagonal and the side of a square, or between the circumference and the diameter of a circle. Conversely, is it possible to project the set of rational numbers on a straight line? The answer is positive, but then many holes are left. By plugging the holes, we get the real numbers, in the following empirical way.[41]

 

Consider a continuous line segment AB. We want to mark the position of each point by a number giving the distance to one of the ends.[42] These numbers include the set of infinite decimal fractions that Cantor proved to be non-denumerable. Hence, the set of points on AB is not denumerable. If we mark the point A by 0 and B by 1, each point of AB gets a number between 0 and 1. This is possible in many ways, but one of them is highly significant, because it uses the rational numbers to introduce a metric, assigning the number 0.5 to the point halfway between A and B, and analogously for each rational number between 0 and 1. (This is possible in a denumerable procedure). Now the real numbers between 0 and 1 are defined as numbers corresponding one-to-one to the points on AB. These include the rational numbers between 0 and 1, as well as numbers like π/4 and other limits of infinite sequences or series. The irrational numbers are surrounded by rational numbers (forming a dense set) providing the metric for the set of real numbers between 0 and 1.

 

The set of line segments on a straight line having a common end point is also a group. The group operation is the spatial addition of two line segments, the inverse is a line segment in the opposite direction, the identity element is a line segment of length zero, and the group relation is a line segment equal in length to that between the non-common terminal points of two line segments. In the present context, the notions of line segment, congruence, and spatial addition are irreducible concepts: they belong to the spatial modal aspect.

 

Now the real numbers are introduced as elements of the group (a) whose elements include the rational numbers; (b) which has arithmetical addition as its group operation; and (c) which is isomorphic to the former group of line segments. In order to make the one-to-one correspondence between the elements of the two groups definite, an arbitrary unit segment must be chosen. This shows that the set of real numbers is not identical with the set of all segments with one common end point. In contrast, the set of all points on a line does not form a group. The reference (a) to the rational numbers is necessary to give the reals the character of numbers. Condition (c) is not sufficient for this purpose. A set is called continuous if its elements correspond one-to-one to the points on a line segment.[43] There is no one-to-one correspondence possible between the elements of a denumerable group and those of a continuous group. A continuous set cannot be reduced to a denumerable one. The number of elements in a continuous group is always infinite. On the one hand, the continuity of the set of real numbers anticipates the continuity of the set of points on a line. On the other hand, it makes it possible to project spatial relations on the quantitative relation frame.

 

The introduction of the set of real numbers as an isomorphic copy of a spatial group already indicates that the meaning of the real numbers is not originally numerical. Their meaning anticipates the spatial modal aspect. This means that the concept of isomorphy is a mathematical expression of the philosophical idea of projection. In contrast, the negative integers and the rational numbers may be considered expressing modal numerical relations between natural numbers and among themselves, and thus as modal abstractions between discrete collections. So the modal meaning of negative and rational numbers remains completely within the closed numerical modal aspect of discrete quantity.

 

Because the set of rational numbers is dense, it contains Cauchy sequences: infinite sequences of elements An, given according to some law, such that for any positive number ε (however small) there is a number N, such that if n>N and m>N, then |Am–An|<ε.[44] It may be observed that the existence of this limit does not depend on an actual completed infinitude of the series as a totality: an infinite discrete set does not have a last member.

 

It may occur that the limit A of a Cauchy sequence is not a member of the set. There are Cauchy sequences of rational numbers whose limits are not rational numbers themselves. The set consisting of the limits of all Cauchy sequences of rational numbers is the set of all real numbers. The inclusion of these limits completes the dense set of rational numbers, making it a continuous set of real numbers.[45]

 

However, the real numbers cannot be defined in this way. For instance, it is already presupposed that the limit A of a Cauchy sequence of rational numbers is a number, because otherwise the numerical difference |A–An| would have no meaning.[46] However, for the same reason, it is objectionable to say that this limit is not a number. It is an assumption to state that the limits of Cauchy sequences of rational numbers are (real) numbers, and one has to show that this assumption is warranted.
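
A Cauchy sequence of rational numbers without a rational limit can be generated explicitly. The sketch below uses Newton's iteration for the square root of 2, which is one possible choice among many; the function name sqrt2_approximations is hypothetical, and only the first few terms are inspected.

```python
from fractions import Fraction

def sqrt2_approximations(count):
    """Rational sequence x -> (x + 2/x)/2 starting from 1 (Newton's iteration)."""
    x = Fraction(1)
    terms = []
    for _ in range(count):
        terms.append(x)
        x = (x + 2 / x) / 2
    return terms

terms = sqrt2_approximations(6)

# Differences between successive terms shrink: the Cauchy property in miniature.
gaps = [abs(terms[i + 1] - terms[i]) for i in range(len(terms) - 1)]

# Every term is rational, yet none of them squares to exactly 2.
hits_two = any(t * t == 2 for t in terms)

print(terms[:4], hits_two)
```

All terms are members of the set of rational numbers and the mutual differences become arbitrarily small, but the sequence never arrives at a rational number whose square is 2. The computation does not show that the limit is a number; that remains the assumption discussed above.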

 

 

 

The quantitative meaning of numbers

 

According to Dooyeweerd, rational and real numbers must be considered mere functions of numbers, the only original numbers being the natural numbers.[47] For a similar reason some mathematicians[48] introduced the integers and rational numbers as equivalence classes of differences or ratios between natural numbers. Thus the integer 2 is the equivalence class of all differences (2+b)–b, where b ranges over all natural numbers. In this view the positive integers should not be identified with the natural numbers, as I did, and, depending on the context, the symbol ‘2’ may stand for a natural number, an integer, a rational number, and possibly for a real or complex number. This view is understandable if one considers the numbers as logically definable. In my view, numbers are discovered and are modal subjects under a law. Therefore I have no difficulty in identifying the number 2 as being the same member in different sets.

 

I agree that the natural numbers are primitives, whereas the existence of rational and real numbers depends on the existence of natural numbers. Nevertheless it is meaningful to speak of numbers, also in the case of negative, rational and real numbers, as modal subjects to numerical laws. In order to see this, one has to recall that the mutual relationship of law and subject implies that there are no laws without subjects, or subjects without laws. It may be imagined that mankind first discovered certain subjects (e.g., the natural numbers) and some laws (the laws of addition and multiplication) to which these are subjected. Afterwards, other laws were found (subtraction, division) pertaining to the same subjects. But then one also discovered other subjects (negative and rational numbers) to the same laws. In my view there is no reason to call these newly discovered subjects mere functions of the already known primitive subjects.[49]  The real numbers are also subjected to the same laws of addition, multiplication, subtraction, and division as the rational numbers are.[50] Thus these numerical predicates of infinite sets of rational numbers behave as subjects to numerical laws.

 

As observed, the meaning of the negative and rational numbers remains completely within the closed numerical modal aspect, because they denote numerical relations between discrete collections. The set of all real numbers turns out to be non-denumerable, i.e., it is impossible to find a one-to-one correspondence of this set with the set of natural numbers. The meaning of a non-denumerable set cannot be found in the closed numerical modal aspect. But this meaning is found with the discovery of the one-to-one correspondence between the set of all real numbers and the set of line segments introduced above. Hence, the meaning of the set of real numbers anticipates the spatial modal aspect. It requires the dynamic development of the numerical relation frame.

 

This is also the case with the meaning of individual real numbers. Real numbers objectify magnitudes, first of all spatial magnitudes: lengths, areas, volumes. It was the great discovery of the Pythagorean school, that the rational numbers are insufficient for the numerical objectification of spatial magnitudes (T&E, 3.1). The diagonal in a unit square has a length of √2, and it can easily be shown that this is not a rational number. In order to represent such magnitudes, one needs the real numbers.[51]

 

Therefore the meaning of the real numbers anticipates the later modal aspects. The limit of an infinite series is never actualized, but in the retrocipatory direction, real numbers become actual magnitudes. The length of a line segment is an actual, real magnitude. When the numerical relation frame is developed into the quantitative one, its original meaning is deepened and relativized, from numerical to quantitative. The deepening means that not only discrete sets, but also magnitudes can be numerically ordered. With real numbers, non-numerical subjects can be ordered according to their magnitude without gaps or holes. This relativization of modal meaning entails the loss of the discrete or denumerable character of numbers which they have in the numerical relation frame.

 

 

 

2.4. Vectors

 

 

 

The temporal order in the numerical relation frame is that of earlier and later, and two numbers are called equal if they have the same position in this order. Therefore only one number 2 should be allowed, whether understood as a natural number, an integer, a rational, or a real number. However, if the order of smaller and larger is applied to concrete subjects or collections, several subjects may be equivalent with respect to some property.

 

In that case there will be at least one other property with respect to which they will be different. In many cases it will be possible to order a set of subjects according to two or more independent properties. Thus there are series with two, three, or more indices. Discrete series can always be ordered in a single numerical order, but this is not always desirable. It might also be that two independent properties have a continuous spectrum, in which case an unequivocal single numerical order is impossible. This notion of independence anticipates the spatial order of simultaneity, and therefore discloses the numerical relation frame on the law side.

 

Magnitudes are non-numerical relations which can be objectified by real numbers. There are non-numerical relations which can only be ordered in a serial order of smaller and larger, if they are decomposed into components, which simultaneously determine these relations. This applies in the first place to spatial position, but also to force, velocity, or the physical state of a system. Such relations are not objectified by a single real number, but by a multiplet of real numbers, called a vector. The minimum number of real numbers needed for an objectification of a property or relation is called the latter’s dimension. The corresponding vector has an equal number of independent components, which is, therefore, sometimes called the vector’s dimension.

 

By way of example, and because of their relevance to physics, the present section reviews the theories of vectors, of complex numbers, and of Hilbert space.

 

 

 

Number vectors

 

A number vector is defined as an n-tuple of n real numbers, written bold-faced as a=(a1,a2,a3,…,an) and being subjected to some well-known rules.[52] The vectors with the same number n of components form a group with vector addition as group operation, and the zero vector (0,0,0,…,0) as its identity element. The inverse of a vector a is –a=(-1)a. It is easily verified that the set of real numbers is isomorphic to the set of one-component vectors. The independence of the components is not changed by addition.

Next the scalar product is defined, a functional of two vectors having the same number of components, as the real number a.b=a1b1+a2b2+…+anbn. Because the result is a number, not a vector, this product does not define a group. The norm |a| of a vector a is defined by |a|²=a.a=a1²+a2²+…+an².
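
The rules for number vectors translate directly into a few lines of Python. The sketch below represents a vector as a tuple of real numbers; the function names add, scale, dot and norm and the sample vectors are assumptions chosen for the illustration.

```python
import math

def add(a, b):
    """Component-wise vector addition (the group operation)."""
    return tuple(x + y for x, y in zip(a, b))

def scale(k, a):
    """Multiplication of a vector by a real number k."""
    return tuple(k * x for x in a)

def dot(a, b):
    """Scalar product a.b = a1b1 + a2b2 + ... + anbn."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Norm |a|, defined by |a|^2 = a.a."""
    return math.sqrt(dot(a, a))

a = (3.0, 4.0, 12.0)
b = (1.0, -2.0, 0.5)
zero = (0.0, 0.0, 0.0)

print(add(a, scale(-1, a)) == zero)   # a + (-a) gives the zero vector
print(dot(a, b), norm(a))
```

Addition returns a vector again, whereas the scalar product returns a single number, which is why only the former can serve as a group operation.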

 

Complex numbers

 

One may wonder whether there exists an operation analogous to multiplication that gives rise to a field of vectors. This is indeed the case with the two-component vectors called complex numbers, often written as a1+a2i=a1(1,0)+a2(0,1)=(a1,a2).

 

Here the vector (1,0) is identified with the real number 1, and the vector (0,1)=i is the so-called imaginary unit. The addition of complex numbers is defined above. We call a*=a1–a2i the complex conjugate of a=a1+a2i. The complex conjugate of a ‘real number’ (a1,0) is identical with itself. The product of two complex numbers is defined as the complex number (a1,a2)(b1,b2)=(a1b1–a2b2,a2b1+a1b2).

 

Together with the addition, this defines a field. The unit vector is (1,0), and the multiplicative inverse of a is a*/(aa*), where the product aa* is the real number a1²+a2². We see that i²=-1, according to the popular definition of i.[53]
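
The definition of complex numbers as two-component vectors can be tried out without Python's built-in complex type. In the sketch below the function names c_mul, c_conj and c_inv are hypothetical, and exact fractions are used so that the results can be compared without rounding.

```python
from fractions import Fraction

def c_mul(a, b):
    """Product (a1,a2)(b1,b2) = (a1b1 - a2b2, a2b1 + a1b2)."""
    return (a[0] * b[0] - a[1] * b[1], a[1] * b[0] + a[0] * b[1])

def c_conj(a):
    """Complex conjugate (a1, -a2)."""
    return (a[0], -a[1])

def c_inv(a):
    """Inverse a*/(a a*), where a a* = (a1^2 + a2^2, 0) is treated as a real number."""
    n = a[0] * a[0] + a[1] * a[1]
    return (a[0] / n, -a[1] / n)

i = (Fraction(0), Fraction(1))
one = (Fraction(1), Fraction(0))
a = (Fraction(3), Fraction(4))

print(c_mul(i, i) == (-1, 0))            # i squared is the real number -1
print(c_mul(a, c_conj(a)) == (25, 0))    # a a* is real
print(c_mul(a, c_inv(a)) == one)         # a times its inverse is the unit
```

For checking, the product agrees with the built-in type: c_mul((3, 4), (1, -2)) gives (11, -2), just as complex(3, 4) * complex(1, -2) gives 11-2j.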

 

The solutions of many problems concerning functions of real numbers are only possible, or more easily obtained, if the latter are considered as vectors (a,0) – i.e., if we consider those functions as functions of complex numbers.[54] This shows that the full meaning of disclosed modal subjects (real numbers) becomes clear only if the law side is also opened up (by the introduction of vectors). Besides vectors, there are other structures, like tensors and matrices, in which each component has two or more indices. They anticipate more complicated spatial or non-spatial relations than vectors are capable of doing. With the introduction of real and complex numbers it is also possible to anticipate the kinetic and later modal aspects, as in integral and differential calculus.[55]

 

 

 

Hilbert space

 

The concept of a vector can be further developed into vectors with complex components and functions of real or complex variables. Quantum physics makes use of a so-called Hilbert space (chapter 9), which is not a space (there are no spatial subjects in it), but a set of complex functions, anticipating the spatial and later modal aspects.[56] Here it is not immediately necessary to define the scalar product (which can be different for different cases), if only the functions belonging to the set and the scalar product conform to some quite general rules.[57]

 

The possibility of mapping a Hilbert space on a set of vectors means that all Hilbert spaces with the same value for m are isomorphic to each other.[58] This number m, the dimension of the set, may be finite (as assumed above), infinite, and even non-denumerable.

 

 

 

2.5. The spatial relation frame

 

 

 

In 1899, David Hilbert formulated the foundations of projective geometry as relations between points, straight lines and planes, without defining these.[59] Gottlob Frege thought that Hilbert referred to known subjects, but Hilbert denied this. He was only concerned with the relations between things, leaving aside their nature. According to Paul Bernays, geometry is not concerned with the nature of things, but with ‘a system of conditions for what might be called a relational structure’.[60] Inevitably, structuralism influenced the later emphasis on structures.[61]

 

Topological, projective, and affine geometries are no more metric than the theory of graphs.[62] They deal with spatial relations without considering the quantitative relation frame. I shall not discuss these non-metric geometries. The 19th- and 20th-century views about metric spaces and mathematical structures turn out to be much more important to modern physics.

 

Mathematics studies inter alia spatially qualified characters (10.5). Because these are interlaced with kinetic, physical, or biotic characters, spatial characters are equally important to science. This also applies to spatial relations concerning the position and posture of one figure with respect to another one. A characteristic point, like the centre of a circle or a triangle, represents the position of a figure objectively. The distance between these characteristic points objectifies the relative position of the circle and the triangle. It remains to stipulate the posture of the circle and the triangle, for instance with respect to the line connecting the two characteristic points. A coordinate system is an expedient to establish spatial positions by means of numbers.

 

 

 

The metric of objective magnitudes

 

Spatial relations are rendered quantitatively by means of magnitudes like distance, length, area, volume, and angle. These objective properties of spatial subjects and their relations refer directly (as a subject) to numerical laws and indirectly (as an object) to spatial laws.

 

Science and technology prefer to define magnitudes that satisfy quantitative laws.[63] To make calculations with a spatial magnitude requires its projection on a suitable set of numbers (integral, rational, or real), such that spatial operations are isomorphic to arithmetical operations like addition or multiplication. This is only possible if a metric is available, a law to find magnitudes and their combinations.

 

For many magnitudes, the isomorphic projection on a group turns out to be possible. For magnitudes having only positive values (e.g., length, area, or volume), a multiplication group is suitable. For magnitudes having both positive and negative values (e.g., position), a combined addition and multiplication group is feasible. For a continuously variable magnitude, this concerns a group of real numbers. For a digital magnitude like electric charge, the addition group of integers may be preferred. It would express the fact that charge is an integral multiple of the electron’s charge, functioning as a unit.

 

Every metric needs an arbitrarily chosen unit. Each magnitude has its own metric, but various metrics are interconnected. The metrics for area and volume are reducible to the metric for length. The metric for speed is composed from the metrics of length and time. Connected metrics form a metric system.

 

The dynamic development of various metrics is not only indispensable for the natural sciences. If a metric system is available, cooperating governments or the scientific community may decide to prescribe a metric to become a norm, for the benefit of technology, traffic, and commerce.[64] Processing and communicating of experimental and theoretical results requires the use of a metric system.

 

 

 

Spatial points

 

A point has no dimensions and could have been considered a spatial object if extension were essential for spatial subjects. However, a relation frame is not characterized by any essence like continuous extension, but by laws for relations. Two points are spatially related by having a relative distance. The argument ‘a point has no extension, hence it is not a subject’ is reminiscent of Aristotle and his adherents. They abhorred nothingness, including the vacuum and the number zero as a natural number. Roman numerals do not include a zero, and Europeans did not recognize it until the end of the Middle Ages. Galileo Galilei taught his Aristotelian contemporaries that there is no fundamental difference between a state of rest (the speed equals zero) and a state of motion (the speed is not zero).[65]

 

It is correct that the property length does not apply to a point, any more than area can be ascribed to a line, or volume to a triangle. The difference between two line segments is a segment having a certain length. The difference between two equal segments is a segment with zero length, but a zero segment is not a point. A line is a set having points as its elements, and each segment of the line is a subset. A subset with zero elements or only one element is still a subset, not an element. A segment has length, being zero if the segment contains only one point. A point has no length, not even zero length: the concept of length is not applicable to points. Dimensionality implies that a part of a spatial figure has the same dimension as the figure itself. A three-dimensional figure has only three-dimensional parts. We can neither divide a line into points, nor a circle into its diameters. A spatial relation of a whole and its parts is not a subject-object relation, but a subject-subject relation.[66]

 

Whether a point is a subject or an object depends on the nomic context, on the relevant laws. The relative position of the ends of a line segment determines in one context a subject-subject relation (to wit, the distance between two points), in another context a subject-object relation (the objective length of the segment). Likewise, the sides of a triangle, having length but not area, determine subjectively the triangle’s circumference, and objectively its area.

 

 

 

Dimensionality

 

The sequence of numbers can be projected on a line, ordering its points numerically. To order all points on a line or line segment the natural, integral or even rational numbers are not sufficient. It requires the complete set of real numbers. The spatial order of equivalence or co-existence presents itself to full advantage only in a multidimensional space. In a three-dimensional space, all points in a plane perpendicular to the x-axis correspond simultaneously to a single point on that axis. With respect to the numerical order on the x-axis, these points are equivalent. To lay down the position of a point completely requires several numbers (x,y,z,…) simultaneously, as many as the number of dimensions. Such an ordered set of numbers constitutes a number vector (2.4).

 

For the character of a spatial figure too, the number of dimensions is a dominant characteristic. The number of dimensions belongs to the laws constituting the character. A plane figure has length and width. A three-dimensional figure has length, width and height as mutually independent measures. The character of a two-dimensional figure like a triangle may be interlaced with the character of a three-dimensional figure like a tetrahedron. Hence, dimensionality leads to a hierarchy of spatial figures. The base of the hierarchy is formed by one-dimensional spatial vectors.

 

 

 

Numerical and spatial vectors

 

Contrary to a number vector, a spatial vector is localized and oriented in a metrical space. Localization and orientation are spatial concepts, irreducible to numerical ones. A spatial vector marks the relative position of two points. By means of vectors, each point is connected to all other points in space. Vectors having one point in common form an addition group. After the choice of a unit of length, this group is isomorphic to the group of number vectors having the same dimension. Besides spatial addition, a scalar product is defined.[67] The group’s identity element is the vector with zero length. Its base is a set of orthonormal vectors, i.e., the mutually perpendicular unit vectors having a common origin. Each vector starting from that origin is a linear combination of the unit vectors. So far, there is not much difference with the number vectors.

 

However, whereas the base of a group of number vectors is rather unique, in a group of spatial vectors the base can be chosen arbitrarily. For instance, one can rotate a spatial base about the origin. A spatial base is both localized and oriented. The set of all bases with a common origin is a rotation group. The set of all bases having the same orientation but different origins is a translation group. It is isomorphic both to the addition group of spatial vectors having the same origin and to the addition group of number vectors.

 

Besides a relative position, a spatial vector represents a displacement, the result of a motion. This is a disposition, a tertiary characteristic of spatial vectors.

 

 

 

Euclidean and non-Euclidean metrics

 

Euclidean space is homogeneous (similar at all positions) and isotropic (similar in all directions). Combining spatial translations, rotations, reflections with respect to a line or a plane and inversions with respect to a point leads to the Euclidean group. It reflects the symmetry of Euclidean space. Symmetry points to a transformation keeping certain relations invariant.[68] At each operation of the Euclidean group, several quantities and relations remain invariant, for instance, the distance between two points, the angle between two lines, the shape and the area of a triangle, and the scalar product of two vectors.

 

In Euclidean geometry, the relative position of points is found with the help of a Cartesian coordinate system, which allows each spatial point to be represented by a vector $(x,y,z,\dots)$. Given two points characterized by the vectors $(x_1,y_1,z_1,\dots)$ and $(x_2,y_2,z_2,\dots)$, the difference vector $(x_1-x_2,\,y_1-y_2,\,z_1-z_2,\dots)$ characterizes the relative position of the two points. The distance between the two points is the norm $d$ of this vector, determined by $d^2=(x_1-x_2)^2+(y_1-y_2)^2+(z_1-z_2)^2+\dots$

 

This expression is called the metric of Euclidean space. A metric is a law according to which a numerical value can be assigned to a non-numerical property or relation. The above formula is an objective representation of this law for the determination of lengths and distances in Euclidean space.
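
As a minimal sketch of how this law assigns a number to a spatial relation, the following Python fragment computes the norm of the difference vector for points of any dimension. The function name euclidean_distance and the sample coordinates are illustrative assumptions, not taken from the text.

```python
import math

def euclidean_distance(p, q):
    """Norm of the difference vector of two points given as coordinate tuples."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Two hypothetical points in a three-dimensional space.
d = euclidean_distance((1.0, 2.0, 3.0), (4.0, 6.0, 3.0))
print(d)
```

The same function applies unchanged to two, three or more dimensions, which mirrors the open-ended form of the expression above.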

 

The metric depends on the symmetry of space. In a Euclidean space, Pythagoras’ law determines the metric.[69] Since the beginning of the 19th century, mathematics has acknowledged non-Euclidean spaces as well.[70] (Long before, it was known that on a sphere the Euclidean metric is only applicable to distances small compared with the radius.) Preceded by Carl Friedrich Gauss, in 1854 Bernhard Riemann formulated the general metric for an infinitesimally small distance in a multidimensional space.[71]

 

For a non-Euclidean space, the coefficients in the metric depend on the position.[72] To calculate a finite displacement requires the application of integral calculus. The result depends on the choice of the path of integration. The distance between two points is the smallest value found among all paths connecting them. On the surface of a sphere, the distance between two points corresponds to the path along a great circle, i.e., a circle whose centre coincides with the centre of the sphere.
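
The spherical case can be illustrated with a short Python sketch. It assumes a sphere centred at the origin and points given by Cartesian coordinates; the names great_circle_distance, north_pole and equator_point are hypothetical. The arc length along the great circle is compared with the straight chord through the interior, which is the Euclidean distance in the surrounding three-dimensional space.

```python
import math

def great_circle_distance(p, q, radius):
    """Arc length between two points on a sphere centred at the origin.

    p and q are (x, y, z) tuples assumed to lie on the sphere.
    """
    dot = sum(a * b for a, b in zip(p, q))
    # Clamp to [-1, 1] to guard against rounding errors before acos.
    cos_angle = max(-1.0, min(1.0, dot / radius ** 2))
    return radius * math.acos(cos_angle)

R = 1.0
north_pole = (0.0, 0.0, R)
equator_point = (R, 0.0, 0.0)

arc = great_circle_distance(north_pole, equator_point, R)  # quarter of a great circle
chord = math.dist(north_pole, equator_point)               # straight line through the interior
print(arc, chord)
```

For points that are close together compared with the radius, the two values nearly coincide, in line with the earlier remark that on a sphere the Euclidean metric holds only for small distances.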

 

The metric is determined by the structure and eventually the symmetry of the space. This space has the disposition to be interlaced with the character of kinetic space or with the physical character of a field. A well-known example is the general theory of relativity, being the relativistic theory of the gravitational field (4.8).[73]

 

A non-Euclidean space is less symmetrical than a Euclidean one having the same number of dimensions. Motion as well as physical interaction may cause a break of symmetry in spatial relations.

 

 

 

2.6. Spatial subject-object relations

 

 

 

The distinction of subjects and objects as made in the philosophy of the cosmonomic idea (1.6) can best be illustrated with respect to spatial objects and objective magnitudes. The proper parts of a spatial subject cannot have more or fewer dimensions than the subject itself. A two-dimensional subject can only have two-dimensional parts. Just as collections can only be added if they have no members in common, magnitudes of spatial subjects can only be added if they have no common parts. But they may have common boundaries, because the boundaries are not parts of the subject. A boundary of a spatial subject always has a lower dimension than the subject itself, and, therefore, its subjective extension (with respect to the magnitude of the subject) is zero (it has ‘measure’ zero). Spatial boundaries have an objective meaning within the spatial modal aspect. They delimit the objective magnitude of the subjects, and they allow the introduction of numerical ordering within the spatial aspect.

 

The simplest spatial objects are points, having zero spatial extension. Points have an important spatial meaning as boundaries of a line segment. Spatial points serve to determine its length, the objective magnitude of the line segment. Similarly, in a two-dimensional space, a line segment can only function objectively, as a boundary of a triangle, e.g., by determining its area, which is again an objective spatial magnitude referring back to the numerical modal aspect. In this way the spatial relation frame is the first aspect to have objects as well as subjects.[74]

 

It is of no use to define a line, a plane, or a space as a collection of points, lines, or planes, respectively.[75] Although a line contains a continuous, non-denumerable collection of points, this cannot serve as a constitutive definition of a line. Rather the line constitutes the collection of points. Collections of this kind have a dependent meaning. This becomes apparent if one tries to assign a number to a collection of points on a line segment. It can easily be proved that there exists a one-to-one correspondence between the points of this line segment and the points of any other line segment, regardless of their relative length. Therefore, length, as an objective magnitude of the line segment, has no relation whatsoever to the number of points on the line segment.
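
One simple way to exhibit such a correspondence is a linear stretch, sketched below in Python. The function names and the segment lengths 1 and 3 are illustrative assumptions; exact fractions are used so that the round trip can be checked without rounding error. Only a few sample positions are shown, although the map applies to every point of the segment.

```python
from fractions import Fraction

def to_longer(x, short_length, long_length):
    """Map a point at position x on the short segment to the long segment."""
    return x * Fraction(long_length, short_length)

def to_shorter(y, short_length, long_length):
    """Inverse map: from the long segment back to the short one."""
    return y * Fraction(short_length, long_length)

# Sample positions on a segment of length 1, mapped to a segment of length 3.
samples = [Fraction(0), Fraction(1, 7), Fraction(1, 2), Fraction(1)]
images = [to_longer(x, 1, 3) for x in samples]
round_trip = [to_shorter(y, 1, 3) for y in images]
print(images, round_trip == samples)
```

Because the stretch has an inverse, every point of either segment has exactly one partner on the other, whatever the two lengths are.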

 

 

 

2.7. Subject-subject relations

 

 

 

There is a spatial relation between two subjects if they are bound together in a common spatial manifold. Thus the spatial order is coexistence, static simultaneity, or equivalence,[76] and the corresponding subject-subject relation is relative spatial position. In the kinematic modal aspect simultaneity has only a limited, analogical meaning, as is shown in the theory of relativity, whereas in the numerical order of before and after simultaneity is absent. Consider an (n–1)-dimensional boundary in an n-dimensional space, described by a continuous function f(r)=0, where r denotes the vector, ranging over all points in the n-dimensional space. All points on one side of the boundary are characterized by f(r)>0, and all points on the other side by f(r)<0. This shows once more that the concept of a boundary (a spatial object) refers to the numerical order of smaller and larger. With respect to this quasi-serial order, all points with vector r, such that f(r)=a, are equivalent. They simultaneously lie in the same (n–1)-dimensional manifold objectified by this equation.
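
A small Python sketch may make the role of the function f concrete. It assumes, purely for illustration, a circular boundary of radius 1 in a plane, described by f(r) = x² + y² − 1; the names f and side are hypothetical.

```python
def f(r, radius=1.0):
    """Boundary function: zero on the circle, positive outside, negative inside."""
    return sum(x * x for x in r) - radius ** 2

def side(r):
    """Classify a point by the sign of f, i.e. by the numerical order of smaller and larger."""
    value = f(r)
    if value == 0:
        return "boundary"
    return "outside" if value > 0 else "inside"

print(side((1.0, 0.0)), side((0.5, 0.0)), side((2.0, 0.0)))

# Points with the same value of f lie in the same level manifold f(r) = a.
same_level = f((2.0, 0.0)) == f((0.0, 2.0))
print(same_level)
```

The two points (2, 0) and (0, 2) differ in position, yet they are equivalent with respect to the order induced by f, since both satisfy f(r) = 3.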

 

Just as numerical relations are subjected to a serial order (2.2), spatial relations are subjected to an order of equivalence. A relation R(A,B) over a set is an equivalence relation if for any two elements A and B of the set R(A,B) either holds or does not hold, and if R is reflexive, symmetric, and transitive.[77]

 

All elements which are equivalent with a certain element A constitute the equivalence class of A. It is a subset of the whole set over which the equivalence relation R is defined. It can be shown that if this is the case there must be some property by which different equivalence classes in the same set can be distinguished. For instance, the equivalence classes of parallel lines in a Euclidean space can be distinguished by their relative direction.
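
For a finite sample, the three defining properties and the resulting classes can be checked mechanically. The following Python sketch assumes lines in a plane written as y = slope·x + intercept and takes 'having the same slope' as the relation of being parallel; all names and sample values are illustrative.

```python
from itertools import product

def is_equivalence(elements, related):
    """Check reflexivity, symmetry and transitivity on a finite collection."""
    reflexive = all(related(a, a) for a in elements)
    symmetric = all(
        related(b, a)
        for a, b in product(elements, repeat=2)
        if related(a, b)
    )
    transitive = all(
        related(a, c)
        for a, b, c in product(elements, repeat=3)
        if related(a, b) and related(b, c)
    )
    return reflexive and symmetric and transitive

def equivalence_classes(elements, related):
    """Group elements into classes; valid only if `related` is an equivalence relation."""
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes

# Hypothetical lines, each given as (slope, intercept).
lines = [(1, 0), (1, 2), (2, 0), (2, -1), (1, 5)]

def parallel(l, m):
    return l[0] == m[0]

print(is_equivalence(lines, parallel))
print(equivalence_classes(lines, parallel))
```

The classes are told apart by the slope, which plays the part of the distinguishing property (the relative direction) mentioned above.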

 

 

 

Spatial figures

 

Consider a simple spatial problem: in which ways can spatial figures differ or be equivalent? Generally speaking, by their shape, their magnitude, and their relative position. If two subjects have the same shape, they are called similar. If they also have the same magnitude (area or volume) they are called congruent. The concept of magnitude refers back to the numerical modal aspect and, more specifically, to the operation of addition: if we take two disjoint subjects together, we have to add their magnitudes. The concept of similarity is an equivalence relation, but it clearly does not lead to a universal ordering of spatial subjects. The concept of magnitude allows us to find such an order, but this has a numerical, not a spatial character. Only spatial position can be qualified as an irreducible, universal, spatial subject-subject relation.

 

If two subjects are congruent, they can only differ in their position because otherwise they must be identical. Two subjects may have parts in common, they may have nothing more than a boundary in common, or they may be completely disjoint. Otherwise, it is difficult to use the concept of relative position (although it is probably intuitively clear) without an objective description – namely, the distance and relative orientation of the two subjects. The shape of a subject is also determined by the relative position of its boundaries, just as its magnitude. Relative position is subjected to the order of equivalence: the subjects considered should have the same dimension, and must be in the same manifold – these are equivalence relations.

 

Spatial figures can be objectified by their boundaries, in the simplest case by spatial points – for instance, a triangle by its vertices. If the shape of a subject is given, n points are needed to objectify the position of an n-dimensional subject in an n-dimensional manifold. As a consequence, the relative position of two subjects is objectified by the distances of the corresponding pairs of such points. This determines the relative distance as well as the relative orientation of the subjects. Thus the distance of two spatial points (besides the angle between two lines) is an objective, spatial relation.

 

 

 

2.8. Objectivity in the choice of coordinate systems

 

 

 

The Euclidean metric defined above is independent of the choice of the Cartesian coordinate system. It is not affected by any translation (or displacement), rotation, or inversion of the latter. I shall discuss this statement because the natural sciences claim to be objective, and because its relevance is called into question by modern and postmodern conventionalist authors.

 

The possibility of assigning real numbers to points on a straight line depends on the one-to-one correspondence between the numerical addition group of real numbers and the spatial addition group of line segments on a straight line. This correspondence is not unique in two senses: one is free to choose a unit, as well as to choose the common end point of the set of line segments. Objectivity requires that the distance between two points (the objective relation between two spatial subjects) be independent of this arbitrary choice. This is expressed by saying that the distance is invariant under the translations of the coordinate system: the space is homogeneous. All possible displacements form a group, isomorphic to the group of all spatial difference vectors.

 

When a zero point has been chosen, one is still free to choose a point to which to assign the number 1. This arbitrariness is limited by the requirement that the distance between two spatial points be independent of rotations of the coordinate system around any axis and through any angle. This is called the isotropy of space. This implies that the unit be the same along all coordinate axes. The set of all possible rotations in a plane forms a commutative group. Rotations around different axes in more-than-two-dimensional space form a non-commutative group.
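
The difference between the two rotation groups can be checked numerically. The sketch below represents rotations by matrices and uses only the Python standard library; the helper names (matmul, rot2, rot_x, rot_z, close) and the chosen angles are assumptions made for the illustration.

```python
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rot2(theta):
    """Rotation in a plane through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def rot_x(theta):
    """Rotation in three dimensions around the x-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_z(theta):
    """Rotation in three dimensions around the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry within the tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# In a plane the order of two rotations does not matter.
plane_commutes = close(matmul(rot2(0.3), rot2(1.1)), matmul(rot2(1.1), rot2(0.3)))

# Around different axes in three dimensions it does.
quarter = math.pi / 2
space_commutes = close(matmul(rot_x(quarter), rot_z(quarter)),
                       matmul(rot_z(quarter), rot_x(quarter)))
print(plane_commutes, space_commutes)
```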

 

Having chosen a set of coordinate axes and a unit, one is still free to assign the plus and minus directions on each axis. This results in inversion symmetry, the operation under which the distance must be invariant. The rotations together with the reflections form the full orthogonal group. Each finite translation or rotation can be obtained as the result of a continuous motion. However, this is not the case with inversion, which refers back to the numerical order of before and after. This implies that it will not always be possible to bring congruent spatial figures to coincide merely by a combination of translations and rotations. For example, the right- and left-hand gloves of a pair cannot replace each other.
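
The following Python sketch, restricted to a plane, illustrates both points: the distance is unchanged by a translation, a rotation, and the reversal of one axis, whereas the orientation of a figure (measured here by the signed area of a triangle) is preserved by rotation but reversed by the axis reversal. The function names, the sample points, and the use of the signed area as a marker of handedness are assumptions made for this illustration.

```python
import math

def translate(p, shift):
    return tuple(a + s for a, s in zip(p, shift))

def rotate(p, theta):
    x, y = p
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)

def flip_x(p):
    """Reverse the plus and minus directions on the x-axis."""
    x, y = p
    return (-x, y)

def signed_area(a, b, c):
    """Positive for counter-clockwise vertices, negative for clockwise ones."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))

p, q = (1.0, 2.0), (4.0, 6.0)
d0 = math.dist(p, q)
d_translated = math.dist(translate(p, (3.0, -7.0)), translate(q, (3.0, -7.0)))
d_rotated = math.dist(rotate(p, 0.8), rotate(q, 0.8))
d_flipped = math.dist(flip_x(p), flip_x(q))
print(d0, d_translated, d_rotated, d_flipped)

triangle = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]
area = signed_area(*triangle)
area_rotated = signed_area(*[rotate(v, 0.8) for v in triangle])
area_flipped = signed_area(*[flip_x(v) for v in triangle])
print(area, area_rotated, area_flipped)
```

The flipped triangle is congruent to the original, but its signed area has the opposite sign, so no combination of translations and rotations within the plane brings the two to coincide.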

 

By changing the unit, all distances are changed in the same ratio. All possible transformations of the unit form a multiplication group which is isomorphic to the multiplication group of positive real numbers. Therefore, by changing the unit, all distance ratios must remain the same. Distances should be geometrically independent of the choice of the unit of length, but this cannot be accounted for by a numerical analysis alone. In the theory of number vectors there is nothing of this kind: units do not occur in number theory. The meaning of the spatial subject-subject relation is determined by the irreducible meaning of the spatial relation frame, and cannot be reduced completely to the numerical relations which objectify spatial relations. From an arithmetical point of view, the replacement of the metre by the centimetre as a unit of length causes all distances to become a hundred times larger. Transformations of this kind are sometimes called trivial, but they are not, since they express the mutual irreducibility of the numerical and the spatial modal aspects.[78]
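
The arithmetic of a change of unit can be sketched in a few lines of Python. The sample points and the helper name rescale are illustrative assumptions; the factor 100 corresponds to the replacement of the metre by the centimetre mentioned above.

```python
import math

def rescale(p, factor):
    """Coordinates of the same point expressed in a unit `factor` times smaller."""
    return tuple(factor * a for a in p)

# Three hypothetical points with coordinates in metres.
points_m = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
points_cm = [rescale(p, 100.0) for p in points_m]

d1_m = math.dist(points_m[0], points_m[1])
d2_m = math.dist(points_m[0], points_m[2])
d1_cm = math.dist(points_cm[0], points_cm[1])
d2_cm = math.dist(points_cm[0], points_cm[2])

# The numerical values change by the same factor, the ratio does not.
print(d1_m, d1_cm, d1_m / d2_m, d1_cm / d2_cm)
```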

 

These invariance properties are not only relevant to distances, but also clarify the concepts of congruence and similarity. Two spatial figures (irrespective of their relative position) are congruent if the one can be transformed into the other by an operation belonging to the full group of translations, rotations, and inversion. Two figures are similar (having the same shape) if besides such an operation all linear dimensions of one figure must be multiplied by a real number in order to arrive at the same result. This implies that if two figures are congruent or similar, they remain so under any transformation of the coordinate system of the types discussed here.

 

 

 

Conventionalism

 

The standard Euclidean metric is invariant under translations, rotations, and the inversion of the coordinate system. In contrast, one can show that any other metric singles out a particular point, line, plane, or direction. Thus we can say that the standard metric represents the isotropy and homogeneity of space, which are assumed here because only spatial relations between subjects are relevant, and not the ‘absolute position’ of any subject.

 

The metric is only dependent on the choice of the unit. This arbitrariness reflects the amorphousness of space, by which we mean that we cannot assign a certain number of points to a certain line segment. In fact, a one-to-one correspondence is possible between the points of any pair of intervals, irrespective of their relative lengths. Therefore, the length of an interval, as expressed by a certain number, is not an intrinsic spatial property. This is properly stressed by Adolf Grünbaum in his extensive studies on the alleged conventionality of the metric.[79] Grünbaum is the main 20th-century (though moderate) proponent of conventionalism. He repeatedly refers to Henri Poincaré and Bernhard Riemann, but, in fact, conventionalism is merely a modern form of nominalism, which has its roots in the late Middle Ages and was defended by George Berkeley in the 18th and Ernst Mach in the 19th century.[80] Grünbaum uses the amorphousness of space as an argument for the equivalence of all conceivable coordinate systems, but does not admit that some coordinate systems should be preferred if they express the symmetry properties of space.

 

In the non-standard metric of a semiplane discussed by Grünbaum, the distance is not invariant under a translation of the coordinate system along the y-axis.[81] The non-standard metric which he discusses elsewhere[82] is not invariant under rotations of the coordinate system. As Grünbaum rightly observes, the assignment of real numbers to spatial points only effects a coordinatization, not a metrization of the manifold.[83] However, his non-standard metrizations do not define proper spatial subject-subject relations. When a third spatial subject (the coordinate system) is used to objectify the spatial relations between two subjects, a metrization is required which keeps this spatial relation independent of the position of that third subject. This is a requirement of objectivity which presupposes the homogeneity and isotropy of space, that is, rejection of any absoluteness of space with respect to position or direction.[84]
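
The text does not reproduce the non-standard metric of the semiplane. As an illustrative assumption only, the Python sketch below takes the well-known half-plane metric ds² = (dx² + dy²)/y² for y > 0, whose distance function has a closed form, and compares it with the standard metric. The function names and sample points are hypothetical, and shifting both points is used as a stand-in for the opposite shift of the coordinate system.

```python
import math

def half_plane_distance(p, q):
    """Distance under the assumed metric ds^2 = (dx^2 + dy^2) / y^2, for y > 0."""
    (x1, y1), (x2, y2) = p, q
    if y1 <= 0 or y2 <= 0:
        raise ValueError("both points must lie in the upper half-plane")
    return math.acosh(1.0 + ((x2 - x1) ** 2 + (y2 - y1) ** 2) / (2.0 * y1 * y2))

def shift(p, dx, dy):
    return (p[0] + dx, p[1] + dy)

p, q = (0.0, 1.0), (1.0, 1.0)

d_original = half_plane_distance(p, q)
d_shift_x = half_plane_distance(shift(p, 5.0, 0.0), shift(q, 5.0, 0.0))
d_shift_y = half_plane_distance(shift(p, 0.0, 1.0), shift(q, 0.0, 1.0))

euclid_original = math.dist(p, q)
euclid_shift_y = math.dist(shift(p, 0.0, 1.0), shift(q, 0.0, 1.0))

print(d_original, d_shift_x, d_shift_y)
print(euclid_original, euclid_shift_y)
```

Under this assumed metric the value assigned to the same pair of points survives a shift along the x-axis but changes with a shift along the y-axis, whereas the standard metric is unaffected by either.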

 

This does not mean that other metrizations should be rejected in all circumstances. Often they are very useful (e.g., polar coordinates for spherical-symmetric problems). This actually reverses the argument. Rather than agreeing with Grünbaum that Cartesian coordinate systems are only used because they are often more convenient than others, I maintain that non-standard metrics are only applied if this is convenient in certain circumstances. A unique property of the standard metric is its invariance under translation, rotation, and inversion. This is not the case because of some convention, but follows from the homogeneity and isotropy of space. Grünbaum has paid too much attention to the amorphousness of space, which implies the arbitrariness of the unit, and has neglected the symmetry properties inherent to Euclidean geometry reflecting those of space.

 

Grünbaum’s remarks could be accepted if they were related to topology, in which, e.g., one does not distinguish between a sphere and an ellipsoid, or a rectangle and a parallelogram. Topology differs from metrical geometry because it lacks a metric. The theorems of topology hold for a figure regardless of how it is deformed in homogeneous strain. Grünbaum, however, directs his conventionalist views to metrical space.

 

 

 

2.9. The dynamic development of the spatial relation frame

 

 

 

The metric depends on the symmetry of space. In a Euclidean space, Pythagoras’ law determines the metric. Since the beginning of the 19th century, mathematics has acknowledged non-Euclidean spaces as well (2.5). Preceded by Carl Friedrich Gauss, in 1854 Bernhard Riemann formulated the general metric for an infinitesimally small distance in a multidimensional space.

 

For a non-Euclidean space, the coefficients in the metric depend on the position. To calculate a finite displacement requires the application of integral calculus. The result depends on the choice of the path of integration. The distance between two points is the smallest value found among all paths connecting them. On the surface of a sphere, the distance between two points corresponds to the path along a great circle, i.e., a circle whose centre coincides with the centre of the sphere.

 

The metric is determined by the symmetry of the space, even if it is developed into kinetic space as in the theory of relativity, or into the physical space called a field. A well-known example is the general theory of relativity, being the relativistic theory of the gravitational field.[85]

 

The above criticism of Grünbaum’s conventionalist views also pertains to non-Euclidean manifolds. Non-Euclidean manifolds are in general less symmetric than Euclidean ones. Grünbaum seems to overlook this. Only by tacitly assuming that the said requirement of objectivity (i.e., that the relative position of two subjects be independent of the choice of the reference system) is satisfied is it possible to describe the nature of a manifold by its metric. This requirement is satisfied in Euclidean space by the rotation, translation, and inversion invariance of its metric. In non-Euclidean space one must either have similar intrinsic symmetries (as in the case of a spherical surface), or refer to some extrinsic instance – for example, to a Euclidean space of higher dimension, or to a rigid body,[86] or to kinematic motion, or to gravity, as is done in relativity theory.

 

In Gauss’ theory of curved manifolds, showing that the metric can be derived without reference to an outside system, he tacitly assumed that the unit in the orthogonal directions and at different positions is the same. The metric, and thus the Gaussian curvature, depend on the method of measuring lengths adopted on the manifold.[87] Thus one can either start with the symmetries of the manifold, and require that the metric be invariant under the allowed symmetry operations, as is the case for Euclidean or spherical geometry, or start with a rigid definition of length in order to investigate the structure of that manifold. One cannot have it both ways.

 

Non-Euclidean manifolds can be understood in two ways: as an (n–1)-dimensional boundary of an n-dimensional spatial subject (e.g., a spherical surface), or as a manifold whose metric is determined by kinematical or physical laws (as e.g. in relativity theory). In the latter case the homogeneity and isotropy of space are relativized by those non-spatial laws. Motion as well as physical interaction causes a break of symmetry in spatial relations. In the former case they are relativized by the n-dimensional subject whose (n–1)-dimensional boundary functions as a manifold. In both cases the spatial relations between subjects bound to such a manifold become non-Euclidean because of some restriction, like a boundary condition. This relativization is characteristic for the dynamic development of a relation frame. In kinematics or in physics, one speaks of a field as soon as the spatial isotropy and/or homogeneity is lost. A field may either be homogeneous, if it is not isotropic, or it may be neither homogeneous nor isotropic.

 

Hence Euclidean geometry may be considered as having an original spatial meaning, whereas the meaning of non-Euclidean geometry is found by reference either to the numerical modal aspect (in the concept of a boundary), or to the kinematic and the physical aspects.

 

 

 

Multiply connected manifolds

 

The spatial modal aspect can also be developed on the law side by the introduction of multiply connected manifolds. In the simplest case, a linear manifold is open if, for any three points, there is one and only one point which lies between the other two. This is the case, for example, with a straight line or a parabola. A linear manifold may also be closed (a circle) or self-intersecting (a lemniscate). Two-dimensional manifolds may be simply connected (e.g., a plane) or multiply connected (e.g., a plane with a hole, a sphere, or a torus). In this case a criterion for being simply connected is given by the concept of contraction. A two-dimensional manifold is called simply connected if any point and any closed curve meet the following two-part criterion: one can uniquely determine whether the point lies inside the curve, and if that is the case, whether the curve can be continuously contracted without leaving the manifold. The surface of a sphere is not simply connected because it fails the first part of the criterion. The surface of a torus does not meet either part of the criterion. In a similar way simply-connectedness can be established for higher-dimensional manifolds, i.e., with the help of the concept of a boundary. Therefore these criteria of connectedness have an objective character.

 

Multiply connected manifolds are not irrelevant to physics. Gravitational and electric fields are simply connected, but the magnetic field around a current-carrying conductor is multiply connected. As a consequence, a static electric field can be described by a potential, but a magnetic field cannot.

 



[1] For instance Zermelo in 1908, quoted by Quine 1963, 4: ‘Set theory is that branch of mathematics whose task is to investigate mathematically the fundamental notions of ‘number’, ‘order’, and ‘function’ taking them in their pristine, simple form, and to develop thereby the logical foundations of all of arithmetic and analysis.’ See Putnam 1975, chapter 2.

[2] Shapiro 1997, 98: ‘Mathematics is the deductive study of structures’.

[3] Beth 1944a, 115.

[4] This reference system cannot be finite because of its abstract and universal character.

[5] Dooyeweerd NC, II 79ff; Cassirer 1910, 47-54.

[6] Cf. Beth 1944a, 61, 67, 68.

[7] See, for instance, Beth 1944a and Russell 1919.

[8] Beth 1944a, 72.

[9] Russell 1919, 29.

[10] Dooyeweerd 1940, 167, 168; NC II, 79.

[11] Russell 1919, 15; Carnap 1939, 38ff.

[12] Peano took 1 to be the first natural number. Nowadays one usually starts with 0, to indicate the number of elements in the zero set. Starting from its element 0, the set of integral numbers can also be defined by stating that each element a has a unique successor a+ as well as a unique predecessor a-, if (a+)- = a, see Quine 1963, 101.

[13] In the decimal system 0+=1, 1+=2, 2+=3, etc., in the binary system 0+=1, 1+=10, 10+=11, 11+= 100, etc. From axiom 2 it follows that N has no last number.

[14] The fifth axiom states that the set of natural numbers is unique. The sequence of even numbers satisfies the first four axioms but not the fifth one. On the axioms rests the method of proof by complete induction: if P(n) is a proposition defined for each natural number n≥a, and P(a) is true, and P(n+) is true if P(n) is true, then P(n) is true for any n≥a.

[15] For each a, a+>a. If a>b and b>c, then a>c, for each trio a, b, c.

[16] Because the first relation frame does not have objects, it makes no sense to introduce an ensemble of possibilities besides any numerical character class.

[17] Quine 1963, 107-116.

[18] In 1931, Gödel (see Gödel 1962) proved that any system of axioms for the natural numbers allows of unprovable statements. This means that Peano’s axiom system is not logically complete.

[19] Putnam 1975, xi: ‘… the differences between mathematics and empirical science have been vastly exaggerated.’ Barrow 1992, 137: ‘Even arithmetic contains randomness. Some of its truths can only be ascertained by experimental investigation. Seen in this light it begins to resemble an experimental science.’ See Shapiro 1997, 109-112; Brown 1999, 182-191.

[20] Goldbach’s conjecture, saying that each even number can be written as the sum of two primes in at least one way, dates from 1742, but is at the end of the 20th century neither proved nor disproved.

[21] From the set of natural numbers 1 to n, starting from 3 the sieve eliminates all even numbers, all triples, all quintets except 5, (the quartets and sixtuplets have already been eliminated), all numbers divisible by 7 except 7 itself, etc., until one reaches the first number larger than √n. Then all primes smaller than n remain on the sieve. For very large prime numbers, this method consumes so much time that the resolution of a very large number into its factors is used as a key in cryptography. There are many more sequences of natural numbers subject to a characteristic law or prescription. An example is the sequence of Fibonacci (Leonardo of Pisa, circa 1200). Starting from the numbers 1 and 2, each member is the sum of the two preceding ones: 1, 2, 3, 5, 8, 13, … This sequence plays a part in the description of several natural processes and structures, see Amundson 1994, 102-106.

[22] Quine 1963, 30-32 assumes there is no objection to consider an individual to be a class with only one element, but I think that such an equivocation is liable to lead to misunderstandings.

[23] A well-known paradox arises if a set itself satisfies its prescription, being an instance of self-reference. The standard example is the set of all sets that do not contain themselves as an element. According to Brown 1999, 19, 22-23 restricting the prescription to the elements of the set may preclude such a paradox. This means that a set cannot be a member of itself, not even if the elements are sets themselves.

[24] The number of subsets is always larger than the number of elements, a set of n elements having 2ⁿ subsets. A set contains an infinite number of elements if it is numerically equivalent to one of its subsets. For instance, the set of natural numbers is numerically equivalent to the set of even numbers and is therefore infinite.

[25] This is a consequence of the axiom stating that two sets are identical if they have the same elements.

[26] If n(A) is the number of elements of A, then n(A∪B)=n(A)+n(B)–n(A∩B).

[27] Cassirer 1910, 49.

[28] In mathematics, the theory of groups became an important part of Felix Klein’s Erlanger Programm (1872) on the foundations of geometry. In physics, groups were first applied in relativity theory, and since 1925 in quantum physics and solid state physics. Not to everyone’s delight, however, see e.g. Slater 1975, 60-62: about the ‘Gruppenpest’: ‘… it was obvious that a great many other physicists were as disgusted as I had been with the group-theoretical approach to the problem.’

[29] In general, AB≠BA. If AB=BA, the group is called commutative or Abelian (after N.H. Abel).

[30] Cassirer 1910, 55, 56.

[31] Poincaré 1906, 8.

[32] For the introduction of the set of natural numbers or the group of integers, we only need to specify one member, the number 1. All other integers are generated according to the group operation of addition. For the introduction of the multiplication group of positive rational numbers, we have to rely on the set of prime numbers, and hence on the full set of natural numbers (which can only be defined with the addition as a group operation), because of the theorem that the number of prime numbers is infinite.

[33] If a, b, c, and d are integers, the group-theoretical approach demands that a/1 = a, etc. Hence, the addition of the rational numbers must be defined as a/b+c/d=(ad+bc)/bd, in order to arrive at the result that a/1+b/1=a+b.

[34] It can be proved that the sum, the difference, the product and the quotient of two rational numbers (excluding division by 0) always gives a rational number. Hence, the set of rational numbers is complete or closed with respect to these operations.

[35] Courant 1934, 59, 60. Although there exists a one-to-one correspondence between the integers and the rational numbers, their groups are not isomorphic.

[36] If a<b then a<a+c(b-a)<b, for each rational value of c with 0<c<1.

[37] Philosophers do not generally recognize the importance of dense sets for the transition of rational numbers to real numbers.

[38] Grünbaum 1968, 13.

[39] A sequence is an ordered set of numbers (a, b, c, …). Sometimes an infinite sequence has a limit, for instance, the sequence 1/2, 1/4, 1/8, … converges to 0. A series is the sum of a set of numbers (a+b+c+…). An infinite series too may have a limit. For instance, the series 1/2+1/4+1/8+… converges to 1.

[40] By multiplying a single irrational number like π with all rational numbers, one already finds an infinite, even dense, yet denumerable subset of the set of real numbers. Also the introduction of real numbers by means of ‘Cauchy-sequences’ only results in a denumerable subset of real numbers.

[41] This procedure differs from the standard treatment of real numbers, see e.g. Quine 1963, chapter VI.

[42] According to the axiom of Cantor-Dedekind, there is a one-to-one relation between the points on a line and the real numbers.

[43] It is not difficult to prove that the points on two different line segments correspond one-to-one to each other.

[44] Courant 1934, 39, 40, 60.

[45] Up till the end of the 19th century, the distinction between denseness and continuity was not clearly recognized, see Grünbaum 1968, 13. In the past, continuity was sometimes defined as ‘infinite divisibility’, but this leads only to denseness.

[46] Boyer 1939, 284-290. To avoid this pitfall the modern approaches of Weierstrass, Cantor, Dedekind, and Russell have been institutionalized.

[47] Dooyeweerd NC, II 79, 88, 170ff, 383; see also Strauss 1970-1971. In fact, this is not quite a new view: for some time, the negative numbers were called ‘numeri absurdi’, ‘aestimationes falsae’ or ‘fictae’, the irrational numbers ‘numeri surdi’, and the complex numbers are still called ‘imaginary’; cf. Beth 1944b, 72, 73.

[48] Beth 1948, 34ff; Russell 1919, chapter 7.

[49] Beth 1944a, 155.

[50] Cf. Beth 1944b, 50ff.

[51] Beth 1950, 77ff; 1944, 23ff.

[52] (a) The sum of two vectors is a vector defined as a+b = (a1+b1,a2+b2,…,an+bn).

(b) The product of a vector with a real number c is a vector defined as ca = (ca1,ca2,…can).

(c) Introducing the n unit vectors (1,0,0,…0), (0,1,0,…0), … (0,0,0,…1), any vector can be written as a = a1(1,0,0, … 0) + a2(0,1,0,…0) + … + an(0,0,0,…1).

[53] Because of its relevance to physics it may be recalled that the complex numbers can also be represented in other ways by a pair of real numbers. The most important is the representation in terms of sine and cosine functions, or, equivalently, as an exponential function. If a=p.cos x, and b=p.sin x, then a+bi=(a,b)=p.cos x+pi.sin x=p.exp ix.

The norm of this complex number is |p|, and x is called the phase of the complex number. For any integer n, p.exp i(x+n.2π)=p.exp ix. This representation is especially convenient with respect to multiplication: (p.exp ix)(q.exp iy)=pq.exp i(x+y).

[54] Beth 1944b, 42.

[55] Beth 1944b, 67.

[56] The quantum mechanical state space is called after David Hilbert, but invented by John von Neumann, in 1927.

[57] (1) If a and b are arbitrary complex numbers, and f1, f2, and f3 are arbitrary members of the set, then g=af1+bf2 is also a member of the set, which is therefore a group under addition.

(2) There exists a functional (f1,f2) called the scalar product, which is a finite complex number, such that:

(a)  (f1,f2)=(f2,f1)*

(b)  (af1,bf2)=a*b(f1,f2)

(c)  (f1+f2,f3)=(f1,f3)+(f2,f3)

(d)  (f1,f2+f3)=(f1,f2)+(f1,f3): the scalar product is a linear functional.

The norm ||f|| of the function f is a real non-negative number defined by

||f||²=(f,f).

If (f1,f2)=0 we call f1 and f2 orthogonal, which implies that they are mutually independent. There exists a maximum number m of mutually independent and normalized functions n1,n2,…,nm, such that (ni,ni)=1 for i=1,2,3,…,m, and that (ni,nj)=0 if i≠j for i,j=1,2,3,…,m. This implies that any function f in the set can be written as f=a1n1+a2n2+…+amnm, where a1,a2,…am are complex numbers, ai=(ni,f).

With respect to the basis (the set n1,n2,…,nm) f can be written as the vector f=(a1,a2,a3,…am). The basis is not unique. In fact, the number of possible bases for a Hilbert space is infinite.

[58] Jauch 1968, 24.

[59] Since the beginning of the 19th century, projective geometry has been developed as a generalization of Euclidean geometry.

[60] Shapiro 1997, 158; Torretti 1999, 408-410.

[61] e.g. Bourbaki, pseudonym for a group of French mathematicians. See Barrow 1992, 129-134; Shapiro 1997, chapter 5; Torretti 1999, 412.

[62] A graph is a two- or more-dimensional discrete set of points connected by line stretches.

[63] This is not the case with all applications of numbers.  Numbers of houses project a spatial order on a numerical one, but hardly allow of calculations. Lacking a metric, neither Mohs’ scale of hardness nor Richter’s scale for earthquakes leads to calculations.

[64] Allen 1995.

[65] Galileo 1632, 20-22.

[66] In a quantitative sense a triangle as well as a line segment is a set of points, and the side of a triangle is a subset of the triangle. But in a spatial sense, the side is not a part of the triangle.

[67] In a Euclidean space, the scalar product of two vectors a and b equals a.b=ab cos α. Herein a=√(a.a) is the length of a and α is the angle between a and b. If two vectors are perpendicular to each other, their scalar product is zero.

[68] Van Fraassen 1989, 262.

[69] If the coordinates of two points are given by (x1,y1,z1) and (x2,y2,z2), and if we call Δx=x2–x1 etc., then the distance Δr is the square root of Δr²=Δx²+Δy²+Δz². This is the Euclidean metric.

[70] Non-Euclidean geometries were discovered independently by Lobachevski (first publication, 1829-30), Bolyai and Gauss, later supplemented by Klein. The significant step is to omit Euclid’s fifth postulate, corresponding to the axiom that one and only one line parallel to a given line can be drawn through a point outside that line.

[71] Riemann’s metric is dr²=gxx dx²+gyy dy²+gxy dx dy+gyx dy dx+… Note the occurrence of mixed terms besides quadratic terms. In the Euclidean metric gxx=gyy=1, gxy=gyx=0, and Δx and Δy are not necessarily infinitesimal. See Jammer 1954, 150-166; Sklar 1974, 13-54. According to Riemann, a multiply extended magnitude allows of various metric relations, meaning that the theorems of geometry cannot be reduced to quantitative ones, see Torretti 1999, 157.

[72] If i and j indicate x or y, the gij’s are components of a tensor. In the two-dimensional case gij is a second derivative (like d²r/dxdy). For a more-dimensional space it is a partial derivative, meaning that other variables remain constant.

[73] In the general theory of relativity, the co-efficients for the four-dimensional space-time manifold form a symmetrical tensor, i.e., gij=gji for each combination of i and j. Hence, among the sixteen components of the tensor ten are independent. An electromagnetic field is also described by a tensor having sixteen components. Its symmetry demands that gij=-gji for each combination of i and j, hence the components of the quadratic terms are zero. This leaves six independent components, three for the electric vector and three for the magnetic pseudovector. Gravity having a different symmetry than electromagnetism is related to the fact that mass is definitely positive and that gravity is an attractive force. In contrast, electric charge can be positive or negative and the electric Coulomb force may be attractive or repulsive. A positive charge attracts a negative one, two positive charges (as well as two negative charges) repel each other.

[74] Dooyeweerd NC, II 383ff; Dooyeweerd’s statement that an object in some modal aspect cannot be a subject in the same modal aspect is obviously wrong.

[75] Cp. Suppes 1972, 310.

[76] Dooyeweerd 1940, 166; NC II, 85; Leibniz already considered space and time as orders of coexisting and successive things or phenomena. Cf. Jammer 1954, 4, 115; Whiteman 1967, 383; Čapek 1961, 15ff.

[77] R(A,A) for all A; if R(A,B), then R(B,A); if R(A,B) and R(B,C), then R(A,C).

[78] The arbitrariness of the choice of the unit, sometimes called ‘gauge invariance’, must not be confused with the so-called ‘magnitude invariance’, according to which many properties of, e.g., spatial figures only depend on their shape and not on their magnitude. The former invariance is universally valid while the latter has a far more limited validity. In particular, it is false for typical relations, such as the size of atoms. See Čapek 1961, 21-26.

[79] Grünbaum 1968, 12, 13.

[80] See Kolakowski 1966, chapter 2 and 6. For a critique of conventionalism, see Popper 1959, 78ff, 144ff; Friedman 1972.

[81] Grünbaum 1963, 18ff; 1968 16ff.

[82] Grünbaum 1963, 98ff; Grünbaum, in Henkin et al. (eds.), 204-222.

[83] Grünbaum 1963, 16; 1968, 34.

[84] It should be noted that my critique is not quite appropriate to Grünbaum’s alternative metrization mentioned above. His semi-plane is only symmetric with respect to translations along the x-axis, and reflections with respect to the y-axis. His non-standard metric reflects these two symmetries just as well as the standard metric does. But then a semi-plane is not a very interesting example, in particular not for Grünbaum’s purposes.

[85] In the general theory of relativity, the co-efficients for the four-dimensional space-time manifold form a symmetrical tensor, i.e., gij=gji for each combination of i and j. Hence, among the sixteen components of the tensor ten are independent. An electromagnetic field is also described by a tensor having sixteen components. Its symmetry demands that gij=-gji for each combination of i and j, hence the components of the quadratic terms are zero. This leaves six independent components, three for the electric vector and three for the magnetic pseudovector. Gravity having a different symmetry than electromagnetism is related to the fact that mass is definitely positive and that gravity is an attractive force. In contrast, electric charge can be positive or negative and the electric Coulomb force may be attractive or repulsive. A positive charge attracts a negative one, two positive charges (as well as two negative charges) repel each other.

[86] Grünbaum 1963, 8ff; Beth 1950, 71.

[87] Nagel 1961, 244, 246.

 

 

Part I, chapter 3

 

 

 

 

 

Metric and measurement

 

3.1. Measurement

 

In the above discussion of the numerical and the spatial modal aspects, both the theory of groups and the concept of isomorphy played an important part. The theory of groups is a mathematical theory of relations, and the concept of isomorphy is a mathematical expression of the philosophical concept of the projection of relations of one kind on those of another. The present chapter aims to show that both are very important for the understanding of the mathematical development of the physical sciences. Especially since the 17th and 18th centuries, physicists have tried to find numerical and spatial objective descriptions of kinematic and physical relations. The concept of isomorphy enabled them to introduce numerical and spatial representations of kinematical and physical states of affairs. The theory of groups provides physics with operational definitions of the metrics of such representations, and allows of finding mathematical theories for kinematics and physics. Together they form the basis of measurement, and hence of modern empirical science. Measurement is at the heart of the physical sciences, and therefore it seems justified to devote an entire chapter to its problems. Moreover, it provides an opportunity of showing the power of the basic distinctions discussed in chapter 1.

The aim of measurement in physics is to obtain an objectification of physical  states of affairs: to project physical subject-subject relations on numerical, spatial, or kinetic relations. The possibility of performing experiments and doing measurements is largely responsible for the growth of the physical sciences in modern times. One may wonder why this growth is not present to a greater extent in the social sciences. One reason, of course, is the difficulty of designing relevant experiments because of ethical considerations. There is a second reason, however, which is perhaps more important. It is the lack of a modal metric in the post-physical modal aspects. The availability of a metric is the indispensable law side of measurement. If this view is correct, the problems encountered in the social sciences with respect to measurements and their interpretation are largely due to an absence of a metric.[1]

Important aspects of measurement, such as the psychological (observational) and cognitive (rational) aspects, will at best be treated superficially. This chapter is restricted to classical measurements. Later I intend to show that the so-called measurement problem in quantum physics is not really a measurement problem, because measurements in quantum physics are performed in the same way as described in this chapter.

According to Rudolf Carnap and others,[2] the classical distinction of qualitative and quantitative properties is insufficient. It was already abolished at the start of classical physics (T&E, chapter 3). There is a third type, called comparative or topological properties. For instance, it is quite meaningless to seek a dichotomy behind linguistic pairs, such as long-short, heavy-light, hot-cold, small-large, fast-slow, old-young, etc. These should be replaced by relations like larger than, heavier than, hotter than, etc., speaking of a comparative attribute if it enables one to put the objects[3] to be compared into a linear order of more or less.

These definitions are still not complete. Not only do comparative attributes have an order of before and after, but they also have an order of equivalence. Thus, to compare any pair of subjects with respect to a comparative attribute, the two subjects must either be ordered in the form of a more-less statement or by an equivalence relation. For instance, two physically qualified subjects either have the same weight or one is heavier than the other. All physically qualified subjects having the same weight constitute an equivalence class with respect to the property ‘heavy’. Strictly speaking, not the subjects are numerically ordered, but the equivalence classes of their objective properties. Their order is not serial, but linear, referring as much to the spatial as to the numerical relation frame.[4]

If there are practical means available of establishing equivalence and the order of the equivalence classes, the property concerned is called measurable. Now a scale can be devised as a numerical objectification of that property. For a comparative property, the only restriction applied to such a scale is that the order of the assigned numbers reflects the serial order of the equivalence classes. Such a scale is by no means unique. A scale (x) can be replaced by any other scale (x’) if x’ is a monotonic function of x. An increasing scale can be replaced by a decreasing one.

A special scale transformation is a linear one: x’=ax+b (a and b are real numbers, a≠0). If a>0, one speaks of a positive linear transformation. If b=0 and a>0, it is a dilatation; if a=1 and b≠0, it is a shift. A scale is an interval, ratio, or difference scale if it is unique up to positive linear transformations, dilatations, or shifts, respectively.[5]
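
The following sketch, given only as an illustration, shows what each kind of transformation leaves intact. The scale values and the function names are hypothetical and not taken from the text. It assumes an increasing monotonic function (here x’=x³), a shift (x’=x+10) and a dilatation (x’=2.5x).

```python
from itertools import combinations

# Hypothetical scale values for four equivalence classes (not from the text).
readings = [1.0, 2.0, 4.0, 8.0]

def same_order(xs, ys):
    """True if the scale ys ranks every pair of items as the scale xs does."""
    return all((a < b) == (c < d)
               for (a, c), (b, d) in combinations(zip(xs, ys), 2))

def differences(xs):
    """Differences between neighbouring scale values."""
    return [b - a for a, b in zip(xs, xs[1:])]

def ratios(xs):
    """Ratios between neighbouring scale values."""
    return [b / a for a, b in zip(xs, xs[1:])]

cubed = [x ** 3 for x in readings]       # monotonic, but not linear
shifted = [x + 10.0 for x in readings]   # shift: a = 1, b = 10
dilated = [2.5 * x for x in readings]    # dilatation: a = 2.5, b = 0

print(same_order(readings, cubed))                    # the order survives
print(differences(cubed) == differences(readings))    # the differences do not
print(differences(shifted) == differences(readings))  # a shift keeps differences
print(ratios(dilated) == ratios(readings))            # a dilatation keeps ratios
```

On a merely ordinal scale only the first check is meaningful, which is why any monotonic replacement is allowed there.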

Consider, for example, Mohs’ scale of hardness, which ranges from 1 (talc) to 10 (diamond) and is defined by reference to the scratch test. A mineral A is called harder than a mineral B if a sharp point of A scratches a smooth surface of B. A and B are called equally hard if neither scratches the other.[6] Such a scale is not isomorphic but homomorphic. It is merely ordinal because the assignment of numbers to the equivalence classes is completely arbitrary, except for their order. In this respect it does not differ from, for example, an alphabetical ordering. In particular, the difference in hardness between two minerals designated as 9 and 10 is not related to the difference in hardness numbered 4 and 5. Neither does it make sense to say that diamond is twice as hard as apatite with hardness 5.

In contrast to this comparative attribute, consider the metrical attribute of volume. If we compare two vessels of 990 and 1000 litres with two vessels of 100 and 110 litres, we can meaningfully say that the volume differences are the same in the two cases. The differences are equivalent to the same amount. It is also meaningful to state that a container of 1000 litres is twice as large as a 500 litre container. Indeed, the scale for volumes is not merely ordinal, but is metrical.[7] The main distinction is that in ordinal scales numbers are assigned to the equivalence classes of the ordered subjects themselves, whereas in a metrical scale relations are quantified.

 

3.2. The metric as law for measurement

 

Metrical scales satisfy the rules of a group. Such a scale extends a comparative ordering relation into a quantitatively objectifiable subject-subject relation. Now the equivalence classes of the relations R(A,B) for all possible pairs of subjects A and B with respect to some attribute are elements of a group, isomorphic to a specified group of real numbers. This isomorphism is called the metric, and a magnitude satisfying it is called metrical.[8] The specification includes both the interval of allowed numbers, and the group operation connecting them.

In many cases the interval is just the set of all real numbers, and the group operation is addition. Then we have a difference scale, such as that for volume and mass differences. In other cases the interval consists of the positive real numbers, and the group operation is multiplication. Now we have a ratio scale, such as that for volume ratios or mass ratios. According to special relativity theory, the group of all possible relative velocities in one dimension for real moving subjects has an upper bound c, the speed of light. The composition of two relative velocities v and w is given by the group product (v+w)/(1+vw/c²), which follows from the properties of the so-called Lorentz group (chapter 4).
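
As a numerical illustration, the group properties of this velocity composition can be checked in a few lines. The sketch below is not part of the original argument. It assumes velocities expressed in units of c, and the function names and sample values are hypothetical. The mapping to the ‘rapidity’ artanh(v/c) is a standard result: it turns the composition into ordinary addition, and thereby exhibits the isomorphism with the addition group of real numbers.

```python
import math

def compose(v, w, c=1.0):
    """Relativistic composition of two collinear relative velocities."""
    return (v + w) / (1.0 + v * w / c**2)

def rapidity(v, c=1.0):
    """Map a velocity in the open interval (-c, c) to a real number."""
    return math.atanh(v / c)

v, w, u = 0.6, 0.7, -0.3   # hypothetical velocities in units of c

print(abs(compose(v, w)) < 1.0)    # closure: the result stays below c
print(compose(v, 0.0) == v)        # identity element: velocity 0
print(compose(v, -v) == 0.0)       # inverse element: -v
print(math.isclose(compose(compose(v, w), u),
                   compose(v, compose(w, u))))   # associativity
# Under the rapidity map the group product becomes addition of real numbers.
print(math.isclose(rapidity(compose(v, w)), rapidity(v) + rapidity(w)))
```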

The equivalence classes of the property concerned do not form a group with respect to a certain attribute, but the equivalence classes of their relations do. For instance, if the group operation is isomorphic to addition, negative values must be included, which is not admissible for volumes, whereas it is for volume differences. If the group operation is isomorphic to multiplication one has to take volume ratios as the elements of the group, because the product of two volumes is not a volume, whereas the product of two volume ratios is again a volume ratio.

However, the equivalence classes of the subjects themselves can be considered as relations to the identity element of the group, and can thus be interpreted as a subset of the group. Thus the volume of a subject can be considered as the volume difference with a (fictitious) subject with zero volume, and the set of all volumes is isomorphic to the set of all volume ratios with a subject with unit volume.

Among magnitudes one discerns measurable properties of subjects, and measurable relations between subjects, but the two are closely related. Properties also have a relational character, whereas relations have a property-character – compare distance (a relation) with length (a property).

 

Units

For any metrical attribute, there are three coordinative principles:[9] the existence of practical means of establishing equivalence and the serial order of the equivalence classes; the metric based on a group structure; and the arbitrary choice of a unit. The isomorphism between the groups of equivalence classes of non-numerical objective relations and the corresponding group of real numbers does not completely define the numerical values to be assigned to the relations. Section 2.8 explained this for spatial magnitudes, arguing from the amorphousness of space. In fact, any metrical attribute lacks an intrinsic metric. The addition group of real numbers is itself isomorphic to the addition group which is generated by multiplying all the members of the first group (x) with an arbitrary real number c≠0: x’=cx (hence, if x1+x2=x3, then cx1+cx2=cx3). For this reason the number 1 is assigned to some arbitrarily chosen relation, e.g. all volumes being equal to the unit volume. The number 0 is given to the relation for any pair of equivalent subjects. With these stipulations, i.e., with the choice of a unit, the metrical scale for addition groups is completely defined.

Likewise, for groups isomorphic to the multiplication group of real positive numbers, the number 1 is assigned to the relation between equivalent subjects and some arbitrary number is assigned to a conveniently chosen subject. For example, in the thermodynamic temperature scale, the temperature of the triple point of water is given the number 273.16 K, in order to have a simple relation to the customary centigrade or Celsius scale.

Often, metrical scales with a unit refer not only to a group of real numbers, but also to a number field, characterized by addition and multiplication as group operations. For the addition of length the distributive law for number fields applies: 3 cm+5 cm=3(1 cm)+5(1 cm)=(3+5)(1 cm)=8 cm. This is the background of the statement that one can only compare (by adding or subtracting) magnitudes having the same ‘dimension’ (not e.g. 1 cm+3 cm³) and the same units (e.g., 1 m+3 cm=100 cm+3 cm=103 cm).
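
The rule about dimensions and units can be made concrete in a small sketch. It is an illustration under stated assumptions only: the class Quantity, the table UNIT_IN_CM, and the restriction to powers of length are hypothetical choices, not a description of any existing library.

```python
from dataclasses import dataclass

# Hypothetical table: length units expressed in centimetres.
UNIT_IN_CM = {"cm": 1.0, "m": 100.0}

@dataclass(frozen=True)
class Quantity:
    value: float
    unit: str        # "cm" or "m"
    dimension: int   # power of length: 1 for a length, 3 for a volume

    def in_cm(self):
        """Numerical value after conversion to (powers of) centimetres."""
        return self.value * UNIT_IN_CM[self.unit] ** self.dimension

    def __add__(self, other):
        # Magnitudes of different dimension cannot be added at all.
        if self.dimension != other.dimension:
            raise TypeError("cannot add magnitudes of different dimension")
        # Same dimension: convert both to the same unit, then add the numbers.
        return Quantity(self.in_cm() + other.in_cm(), "cm", self.dimension)

total = Quantity(1.0, "m", 1) + Quantity(3.0, "cm", 1)
print(total.value, total.unit)   # 1 m + 3 cm expressed in cm
```

Adding Quantity(1.0, "cm", 1) to Quantity(3.0, "cm", 3) raises the TypeError, in line with the remark that 1 cm+3 cm³ has no meaning.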

Clearly, there is still some arbitrariness in metrical scales, but compared to ordinal scales, the arbitrariness is greatly reduced. The use of a scale with a unit is only meaningful for metrical scales. A merely ordinal scale has no unit because, in this case, the assignment of numbers to equivalence classes is completely arbitrary except for their serial order.

 

3.3. The theoretical character of the metric

 

There are several reasons for stating that the metric as introduced above has a theoretical character. First, the group structure appears as a law. This means that it is always an (empirically based) theoretical hypothesis to state that a certain attribute has a group structure. In many cases it is a modal law – i.e., the group structure is independent of the typical structure of the subjects which are objectified in the metric. Only the unit, which is arbitrarily chosen, depends on the typical structure of some subject.

The abstract hypothetical nature of the metric also comes to the fore because the group always has an infinite number of elements, whereas the number of physically qualified concrete subjects having a certain property may be finite. Thus the metric does not refer to actual but to possible relations.

Furthermore, in actual measurements it is impossible to establish equivalence exactly. Then two physically qualified subjects, A and B, are considered equivalent with respect to a certain attribute within the accuracy obtained by our measuring instruments (including our sense organs). For instance, suppose a balance can only discriminate between masses differing by more than 1 gram. Suppose three bodies A, B, and C weigh (according to a more accurate balance) 9.25, 10.0, and 10.75 grams, respectively. Then according to our crude balance A has the same weight as B, and B has the same weight as C, whereas according to the same balance, C is heavier than A. This violates the rule that equivalence with respect to a metrical property is transitive.[10] The metric describes exact relations among subjects because of its mathematical structure.
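
The example of the crude balance can be replayed in a short sketch. The masses and the resolution of 1 gram are those of the example. The function name and the return convention (0 for ‘no difference detected’, +1 or -1 otherwise) are assumptions made for the illustration.

```python
RESOLUTION = 1.0   # the balance detects only differences of more than 1 gram

# Masses in grams according to the more accurate balance.
masses = {"A": 9.25, "B": 10.0, "C": 10.75}

def crude_balance(x, y, resolution=RESOLUTION):
    """Return 0 if x and y cannot be told apart, +1 if x is heavier, -1 if lighter."""
    diff = masses[x] - masses[y]
    if abs(diff) <= resolution:
        return 0
    return 1 if diff > 0 else -1

print(crude_balance("A", "B"))   # A and B appear equally heavy
print(crude_balance("B", "C"))   # B and C appear equally heavy
print(crude_balance("C", "A"))   # yet C is found to be heavier than A
```

The observed ‘equivalence’ is therefore not transitive, whereas the equivalence relation assumed in the metric is.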

It is difficult to give a definition of the notion of accuracy. Starting with Gauss, statistical mathematicians and physicists have developed rules to assign a number to the accuracy with which equivalence can be established. For example, if the length of a room is said to be (4.21±0.01) metre, it is assumed that the (in)accuracy of the measurement is 1 cm. The precise meaning of this statement, and how the accuracy can be estimated, are mostly technical matters, and will not be discussed here.[11] It is sufficient to note that actual measurements always yield results with a finite accuracy.

In another respect the above example also shows that the metric is abstract: the metric expressly refers to a group of real numbers. But the example indicates that measurements can only yield rational numbers, i.e., decimal numbers with a finite number of decimals.[12] Again theoretical considerations allow us to assume that, e.g., length must be assigned real numbers. Theoretical geometry, not experimental geometry, proves the length of the diagonal of a unit square is equal to √2. Magnitudes as retrocipatory projections can only refer back to the disclosed numerical relation frame, i.e., to real numbers or vectors with real number components (2.3).

It is sometimes suggested that real numbers for magnitudes are only used because of convenience. Only with real numbers, for example, is it possible to differentiate and integrate functions. However, different metrics are related, constituting a metrical system.[13] Given metrical magnitudes of some kind for a subject, one can calculate metrical magnitudes of another kind for the same subject. Thus given the mass m and the velocity v of a subject, its momentum mv and its kinetic energy ½mv² can be calculated. This statement would lose its meaning if the quantities of mass, velocity, energy, and momentum did not refer to metrical scales. It also shows that the units for energy and momentum are related to those for mass and velocity. But a superficial inspection of the formulas relating these measurable properties shows that they would also lose their meaning if it were required that they should be represented by rational numbers. Only a real number spectrum can accomplish this.

Finally, it is not always possible to use the same experimental method to determine equivalence. Extreme operationalists maintain that if different methods of measurement are used, different magnitudes are in fact measured.[14] Indeed, it is a matter of theory to connect the results of such measurements.

The notion of equivalence does not mean that the equivalence classes with respect to every attribute can be ordered in a single linear order. A typical counter-example is the essentially two-dimensional ordering of the equivalence classes of different colours perceived by the human eye.

Also the relative spatial positions of subjects and forces can only be measured if they are first decomposed into their spatial components. These cases require multi-dimensional groups, and a multi-dimensional metric. In a few cases the metric is complex-numbered, such as the impedance in alternating current theory.

In an analogous way the thermodynamic state of a physical system is determined by a set of extensive parameters (5.2). Thus it may occur that two systems are partly equivalent (e.g., having the same volume but different energy) or completely equivalent in a physical sense (still having different positions or velocities). All this is possible only because the concept of equivalence itself refers to the spatial order of simultaneity. The numerical order of more and less does not contain equivalence.

 

The establishment of equivalence

All measurements are based on the establishment of equivalence. This means that among measurements two types come to the forefront. First those based on a direct comparison of spatial position (coincidence). Every measuring instrument with a visible scale ultimately depends on this type.

The other type depends on force as a retrocipatory projection of physical interaction on spatial relations (5.5). This type of measurement has two sub-types: measurement based on a balancing of forces (3.4), and measurement based on a thermodynamic equilibrium between a physical subject and a measuring instrument, such as a thermometer (3.5). In both sub-types the establishment of equivalence is based on a physical equilibrium state.

 

3.4. Extensive properties: mass

 

This section considers those relational attributes whose interval of allowed numerical values is the set of all real numbers, with addition as the group operation. The number zero corresponds with the equivalence relation for any pair of equivalent subjects. It is now possible to assign real numbers to the subjects themselves. If r(A,B) is the number corresponding to the relation R(A,B), and n(A) the number corresponding to the subject A, both with respect to some additive attribute R, then r(A,B)=n(A)–n(B). The set of all possible values n(A) is not necessarily a group, but it is (or is isomorphic to) a sub-set of the group of all possible values r(A,B).

It will be clear that even if (by the choice of a unit) the value r(A,B) is uniquely established, there is some arbitrariness in the value n(A), because an arbitrary real number (which must be the same for all subjects A) can be added to it. This means that one is free to choose a zero point for the n-scale without any consequence for the r-scale. In some cases (e.g., length) the zero of the n-scale is obvious. In other cases (e.g., spatial position) the zero of n is completely arbitrary.

I shall now discuss the construction of the metric of an additive or extensive property. A property is called extensive if it is metrical, and if n(AoB)=n(A)+n(B). Here the symbol ‘AoB’ means: ‘the physical sum of the subjects A and B’ – i.e., A and B combined in a physical sense, relevant for the attribute concerned. (Instead of ‘physical’ one can also read ‘spatial’ or ‘kinetic’.) This combination procedure, which is isomorphic to the addition of real numbers, leads to a group of relations between the subjects A,B,…, isomorphic to the group of real numbers. The combination procedure must be specified in every case. For example, consider the combination of two electrical resistors. If they are connected in series, their resistances are added, but if they are connected in parallel, one has to add their conductances.[15] (Conductance is the inverse of resistance.) In all cases the addition rule only applies if A and B are disjoint.
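
The dependence of the addition rule on the combination procedure can be illustrated with the resistor example. This is a sketch only: the function names and the resistance values are hypothetical, and the formulas are the usual ones for ideal resistors.

```python
import math

def series(r1, r2):
    """Resistance of two resistors connected in series."""
    return r1 + r2

def parallel(r1, r2):
    """Resistance of two resistors connected in parallel."""
    return 1.0 / (1.0 / r1 + 1.0 / r2)

def conductance(r):
    """Conductance is the inverse of resistance."""
    return 1.0 / r

r_a, r_b = 100.0, 200.0   # hypothetical resistances in ohm

# In series, resistance is the additive property.
print(series(r_a, r_b) == r_a + r_b)

# In parallel, conductance is additive, whereas resistance is not.
print(math.isclose(conductance(parallel(r_a, r_b)),
                   conductance(r_a) + conductance(r_b)))
print(parallel(r_a, r_b) == r_a + r_b)
```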

Let us suppose that there is a means of determining (within a certain accuracy) whether two subjects belong to the same equivalence class with respect to some extensive property. Then it is possible to determine uniquely the number r(A,B) for any two subjects A and B, as is seen in the following example. Suppose one wants to compare the masses m(A) and m(B) of two physical subjects. The measuring instrument is a balance, allowing one to see whether two subjects have the same mass, and if not, which one is heavier.[16] Now one takes p bodies with the same mass m(A) and q bodies with the mass m(B), such that the first collection of p bodies balances the second set of q bodies:

|p.m(A)–q.m(B)|<ε

where ε indicates the accuracy of the balance. Accordingly,

|m(A)–(q/p).m(B)|<ε/p

So we find that the mass of A is q/p times the mass of B, within the accuracy ε/p. If m(B) happens to be equal to the unit of mass (1 kg) then the mass of A is q/p kg.[17] This measurement yields a rational number because actual measurements always have a limited accuracy.
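
The procedure can be imitated numerically. The sketch below is an illustration under assumptions: the ‘true’ masses passed to the function stand in for the bodies themselves and are used only to simulate what the balance would show, and the function name, the search limit, and the sample values are hypothetical.

```python
from fractions import Fraction

def measure_ratio(m_a, m_b, eps, max_bodies=1000):
    """Find the smallest p for which p copies of A balance q copies of B.

    Returns the rational ratio q/p and the accuracy eps/p, or None if no
    balance is reached with at most max_bodies copies of A.
    """
    for p in range(1, max_bodies + 1):
        q = round(p * m_a / m_b)   # best candidate number of B-bodies
        # The balance is level if |p.m(A) - q.m(B)| < eps.
        if q >= 1 and abs(p * m_a - q * m_b) < eps:
            return Fraction(q, p), eps / p
    return None

# Hypothetical case: m(B) is the unit of mass, the balance resolves 0.01 unit.
ratio, accuracy = measure_ratio(m_a=2.5, m_b=1.0, eps=0.01)
print(ratio, accuracy)
```

Whatever the masses are, the outcome is a fraction q/p, in agreement with the remark that measurements of finite accuracy only yield rational numbers.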

Whether a magnitude is extensive or not is not a convention. It can be falsified by experiment.[18] It is an empirical fact that mass is an additive property, at least under certain circumstances (T&E, 1.4).[19] In relativity physics it is shown that mass is only additive if the added subjects have no relative kinetic energy and no relative potential energy. Therefore the mass of a deuteron is less than the sum of the masses of its constituent particles – a proton and a neutron. On the other hand, it is not always the measurement procedure that establishes whether a certain property is extensive or not. There are many extensive properties whose numerical values can only be determined indirectly. Therefore, their metrics depend on other so-called fundamental metrics.[20] In thermodynamics two key attributes, internal energy and entropy, cannot be measured directly. In fact, a large part of a general course in thermodynamics is required to give proper account of the metrics of energy, entropy, and also temperature.

 

3.5. Intensive properties: temperature

 

Sometimes, all properties which are not extensive in the sense defined above are called intensive,[21] but I shall apply a more restricted definition (T&E, 4.5, 7.4). An attribute is called intensive if it is metric and if either n(AoB)=n(A)=n(B) or n(AoB) is not defined. Thus n(AoB) has a meaningful interpretation only if n(B) equals n(A). Now A and B are in equilibrium with respect to the property designated by n.

A typical example is temperature. If two physical subjects come into thermal contact, sooner or later they will arrive at the same temperature. As long as A and B have different temperatures, it makes no sense to speak of the temperature of their sum. The transitivity statement: ‘If a subject A is in thermal equilibrium with a subject B, and if A is in equilibrium with a third subject C, then B and C are in equilibrium with each other’ is not only relevant to temperature, but to any equilibrium parameter.[22]

For both intensive and extensive properties the establishment of equivalence is implied in the definition of n(AoB): we measure the temperature of a body with a calibrated thermometer as soon as we are confident that the two have the same temperature.[23] However, this method is not sufficient to determine unique relations between bodies which are not equivalent with respect to intensive parameters, as can be done with extensive parameters. Consequently, the scale for an intensive property always depends on the scales for one or more extensive parameters.

Sometimes, this dependence is easily found, as, e.g., the internal pressure of a gas. This intensive property is equal to the force per unit area exerted by the gas on the walls of its container, and force and area are both extensive parameters. Thus the calibration of a manometer is in principle a simple matter. For temperature, another well-known and important magnitude, the construction of a scale is far more complicated.[24]

 

The metric of temperature

The tendency of fluids to expand on heating provides the possibility of measuring temperature by the length of, e.g., a mercury column (T&E, 7.4). The mercury temperature scale is defined such that the temperature of melting ice is given the value 0°C and boiling water is assigned the value 100°C. The numerical values for other temperatures are found by linear inter- and extrapolation. In this way the temperature is reduced to the extensive scale for length measurement. Thus the temperature is assumed to be 50°C if the height of the mercury column is just halfway between the points for 0°C and 100°C.

This merely ordinal scale, though very useful, is rightly called conventional, because it depends on the typical properties of mercury. In fact, any property which depends on temperature could be used instead.[25] If we took another liquid (like alcohol) we would define the 0 and 100 points in the same way. But now a body having a temperature of 50° according to the mercury scale would show a temperature of, let us say, 49° on the alcohol thermometer, provided its scale is equally divided between 0 and 100 as is the mercury thermometer. A practical way out of this difficulty is to calibrate the alcohol thermometer against the mercury thermometer, which makes the alcohol scale non-linear, but this does not make the mercury scale less conventional, since it arbitrarily assumes that mercury expands linearly on heating, and that alcohol does not.
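The linear interpolation behind such a liquid-in-glass scale can be sketched as follows. The function name linear_scale and all column heights are hypothetical values chosen for illustration only; they are not measured data.

```python
def linear_scale(height, height_ice, height_boil):
    """Temperature on a conventional scale, by linear interpolation of column height.

    The fixed points are melting ice (0 degrees) and boiling water (100 degrees).
    """
    return 100.0 * (height - height_ice) / (height_boil - height_ice)


# Hypothetical mercury column: 20 mm at melting ice, 220 mm at boiling water.
mercury_reading = linear_scale(120.0, 20.0, 220.0)

# Hypothetical alcohol column for the same body: 10 mm at ice, 110 mm at boiling.
alcohol_reading = linear_scale(59.0, 10.0, 110.0)

print(mercury_reading, alcohol_reading)
```

With these assumed heights the same body reads 50 on the mercury scale and 49 on the alcohol scale, although both thermometers agree at the two fixed points.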

In modern axiomatic thermodynamics, temperature is usually introduced as the derivative of energy with respect to entropy, which are both extensive properties (5.2). Given the methods of statistical physics it is then possible to design a temperature scale which is not conventional, except for the choice of the unit. The same scale can also be found by thermodynamic means. I shall describe this older and rather elaborate method in order to stress its modal universality.[26]

This method starts with a very general principle which can be formulated in two equivalent ways. According to William Thomson (Lord Kelvin), it is impossible that the net result of a cyclical process is such that heat is completely transformed into work. According to Rudolf Clausius, it is impossible that the only result of a cyclical process is such that heat is transferred from a colder to a warmer body. Both statements are expressions of the physical order of irreversibility.

The efficiency of a cyclical process is defined as the net work (output minus input) divided by the heat input (discarding the heat lost). If we consider several different cyclical processes, all working between the same temperatures T1 and T2 (T1>T2), then it follows from the principles of Kelvin and Clausius that no cyclical process can have a higher efficiency than a so-called Carnot cycle. This consists of two isothermal processes (at constant temperature T1, respectively T2), interspersed with adiabatic processes (during which no heat is exchanged and the temperature changes from T1 to T2, and vice versa). A Carnot cycle is reversible: it either converts heat into work, or it transports heat from a low to a high temperature reservoir. If the heat input at temperature T1 is called Q1 and the heat output at temperature T2 is called Q2, then with the help of the conservation law of energy, we find that the efficiency of a Carnot cycle is (1–Q2/Q1). It may be observed that until now a temperature scale is not required. It suffices to have a means of establishing whether two subjects have the same temperature, and if not, which one is hotter.

It can be shown that a reversible Carnot cycle is more efficient than an irreversible cycle working between the same two temperatures, and that two Carnot cycles working between the same temperatures have the same efficiency, irrespective of the typical structure of the processes involved. Therefore, the efficiency can only be a function of these temperatures, and it is possible to define the temperature scale such that T1/T2=Q1/Q2. This scale is arbitrarily provided with a unit by stipulating that the temperature of the triple point of water is 273.16 K (for Kelvin).[27]
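The step from the exchanged heats to the temperature scale can be written out as a short derivation. This is a sketch in standard notation. The symbols W (net work per cycle) and η (efficiency) are introduced here for illustration; Q1 is the heat input at T1 and Q2 the heat output at T2.

```latex
\begin{align}
W &= Q_1 - Q_2 && \text{(conservation of energy over one cycle)} \\
\eta &= \frac{W}{Q_1} = 1 - \frac{Q_2}{Q_1} && \text{(net work divided by heat input)} \\
\frac{T_1}{T_2} &= \frac{Q_1}{Q_2} && \text{(definition of the thermodynamic scale)} \\
\eta &= 1 - \frac{T_2}{T_1} && \text{(efficiency as a function of the temperatures only)}
\end{align}
```

Since T1>T2>0 on this scale, the efficiency of a Carnot cycle lies between 0 and 1.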

This theoretical thermodynamic temperature scale is – except for the unit – independent of any typical property whatsoever and is therefore called ‘absolute’. It is completely of a modal character.[28] It is based only on the physical time order as expressed in the Second Law of thermodynamics, and the assumption that heat (i.e., energy flux), an extensive property, can be measured directly or indirectly, which is indeed the case. This implies that this modal theoretical magnitude can be used in theoretical formulae. Thus it is only meaningful to state that the mean kinetic energy of molecules in a gas is (3/2)kT, if T does not refer to the mercury scale, but to the thermodynamic scale.

Certainly a Carnot cycle is not a practical thermometer. Thermometry devises practical thermometers which come as near as possible to the theoretical temperature scale. For instance, by theoretical analysis it can be shown that this scale is identical to one based on the expansion of an ideal gas, which is approximated by dilute gases like helium, argon, and hydrogen, except at very low and very high temperatures. But now the order is reversed. A scale is not defined by using a thermometer with its typical properties, but a certain thermometer is used, according to its own convenience, in a certain situation. Its scale should, as nearly as possible, approximate the modal theoretical thermodynamic scale in order to give results which can be used to corroborate or falsify physical theories. Thus we conclude that the thermodynamic scale is not based on some convention, but on a theoretic analysis of physical relations.

 

3.6. The spatial and temporal metrics

 

Measurements of extensive properties like mass and intensive properties like temperature depend on a state of equilibrium between two subjects. Such a state is characterized by an equilibrium between two or more (generalized) forces, and I shall argue that force is a spatial analogy of physical interaction (5.4).

At first sight this does not apply to measurements of length and time. If one wants to compare the lengths of two bodies which are spatially remote, one takes a metre stick, first measuring the length of one subject and then the length of the second, finally calculating the difference or ratio of the two values. But what guarantees that the length of our metre stick did not change between the two measurements? Why does one take a solid body as a metre stick and not a rubber string? Is the outlined procedure still valid if the temperature in the environments of the two bodies is not the same? Why do today’s physicists take the wave length of a certain spectral line as the fundamental unit of length, and not the length of the standard metre at Sèvres?

Similar questions arise with respect to the measurement of time (T&E, 3.7). By an accurate measurement of time is understood the comparison of a certain time interval with a periodic system, a clock. But how does one know that a certain clock is really accurate, such that it ticks off equal periods? Why are certain clocks assumed to be more accurate than others? Do exactly periodic systems really exist?

Usually one reasons that it is impossible to base the measurement of length on the concept of a rigid body, because this would lead us into a vicious circle: to show that a body is rigid, other rigid bodies are applied. Similarly, to show that a clock is periodic requires periodic systems. I shall try to make clear that the real difficulty is getting into this circle, not getting out of it.

 

Conventionalism

The conventionalist’s answer to these problems is more or less as follows. In a large class of any kind of bodies their lengths are compared. Now under certain circumstances (e.g., equal temperature) a subclass of these bodies has invariant length ratios, whereas other subclasses do not. It is just a matter of convenience to take this subclass as the class of rigid bodies which is used as a basis for the measurement of length. Sometimes criteria of simplicity and fruitfulness are added to this convention. In this framework the question cannot be posed (let alone be answered) why the physically qualified bodies of this subclass are more or less equally rigid. As Adolf Grünbaum says:

‘Only the choice of a particular extrinsic congruence standard can determine a unique congruence class, the rigidity or self-congruence of that standard being decreed by convention, and similarly for the periodic devices which are held to be isochronous (uniform) clocks.’[29]

 

However, this convention is too good to be true. One could conventionally assume that an atomic clock designates true time, but that does not explain why all other clocks submit themselves willingly to this arbitrary choice. The answer of physicists is quite different (T&E, 3.7). They carry out an analysis of all available physically qualified structures in order to find the most stable ones, which are used as standards for measurement. For the criteria of stability, the basic spatial and kinematic laws are presupposed. In particular the spatial isotropy and homogeneity, and the uniformity of kinematical time are presupposed in the physicist’s choice of the standard of length and time.[30] This is what I mean by saying that one has to get into the circle. Based on the assumption – supported by empirical evidence – that space and time are isotropic, homogeneous, and uniform, a metric is chosen that reflects these symmetry properties. After typical structures of individuality are investigated to find the most stable ones, a reliable standard is chosen, and used to check whether space and time are indeed isotropic, homogeneous and uniform. This is a circle, but not a logical one. It is by no means logically certain whether such a procedure should inevitably lead to consistent results.

It has been discovered (empirically) that the standard metre at Sèvres, and the diurnal or annual motion of the earth do not give sufficient accuracy if subjected to the criteria of temporal uniformity and spatial homogeneity and isotropy. Therefore, today, the physical units of length and time are based on atomic structures: the typical wave length of a certain spectral line, and the period of another one. These spectral lines are due to electronic transitions within atoms. Therefore, an absolutely stable system (if it existed) could not be used, because no transition would occur in it. The stability of a physically qualified system like an atom or a solid is determined by a typical balance of kinetic, potential, and exchange energy, the typicality of which is determined by the potential energy – i.e., by the acting forces. Thus spatial and temporal measurements also rely on a balance of forces, which leads to a typical stable equilibrium state.

 

Non-Euclidean metrics

All this is not essentially changed in general relativity theory if due account is given to the fact that Euclidean straight lines cannot be determined experimentally, and, therefore, must be replaced by geodesics.[31] If metre sticks do not conform to Euclidean geometry, this can be accounted for either in a spatial way (assuming non-Euclidean geometry) or in a physical way (assuming a universal modal field of force, like gravitation, acting in the same way on rigid bodies and on periodic systems[32]). This again shows the spatial foundation and the physical qualification of measurement. Thus the self-congruence of the standards of measurement is decreed by consistency of modal and typical laws, not by convention. If such a presumed consistency between hypothesized modal and typical laws cannot stand up to experimental tests, the hypotheses have to be modified. This is the basis of general relativity theory.[33]

Temporal intervals cannot be measured independently of presupposed spatial laws, and spatial relative positions cannot be determined without dependence on temporal laws.[34] It is impossible to define time and space independently by means of their measurement procedures because in actual physically qualified structures (such as measuring instruments or standards) all physical and prephysical modal aspects are involved.

 

Universal modal laws for the metrics of space and time

The metrics for spatial and temporal relations are determined by two modal laws: (1) The uniformity of kinetic time, according to which all subjects move uniformly with respect to each other, insofar as it is possible to abstract from their mutual physical interactions. (2) The transformation laws of spatial and temporal scales, which reflect the fact that there is no preferred reference system (spatial and temporal relations, not spatial positions and temporal moments, are relevant). Both traits are found in the classical Newtonian metric (T&E, 3.7) and in the metrics of special and general relativity, the main difference being that in the latter the spatial and temporal metrics are interrelated, whereas in the former the two are supposed to be mutually independent.

Conventionalists claim that the Newtonian metric is just as conventional as spatial or temporal scales which are rigidly connected to the typical properties of some individual system. Thus, Grünbaum[35] compares the Newtonian metric for time measurement with the scale based on the diurnal rotation of the earth. Compared with the Newtonian metric this rotation is slightly irregular, and slowing down, because of tidal friction. After an extensive discussion, Grünbaum concludes that

‘… apart from pragmatic considerations, the diurnal description enjoys explanatory parity with the Newtonian one’.[36]

 

These ‘pragmatic considerations’ include the fact that in the latter metric the physical and kinematic laws can be more conveniently expressed in mathematical terms. Citing Feigl and Maxwell, he says:  

‘… one of the important criteria of descriptive simplicity which greatly restrict the range of ‘reasonable’ conventions is seen to be the scope which a convention will allow for mathematically tractable laws.’[37]

 

In this discussion, Grünbaum seems to overlook the group structure of the Newtonian metric, which implies, e.g., that 10 seconds now is as long as 10 seconds tomorrow, in the following sense. Suppose one wishes to repeat an experiment in which it is crucial that its duration is 10 seconds. Then (other things being equal) the same result will be found today and tomorrow, or at any other time. This would not be the case if time were measured on a diurnal scale (at least if our accuracy is high enough to detect the difference between this scale and the Newtonian metric). The result of nearly every physical experiment would depend on the moment it is done.[38] A conventionalist also rejects the use of such a ‘particular’ scale because it is more convenient to refer to the larger system of the ‘rest of the universe’.[39] But then the definition of the scale is (apart from its epistemological aspects) still a purely subjective matter. However, the choice of the metric depends on the modal law-subject relation. There happens to be a metric which is universal, not because it is applicable ‘everywhere in the universe’, but because it appears as a natural law.[40]

It is not at all interesting to find scales which depend on the typical individuality of some physically qualified subject. Far more interesting is the possibility of finding a modal metric – i.e., a scale that does not depend on the typical structure of some individual system, and which has a group structure. Only then can an objective representation of physical states of affairs be warranted. To declare that all possible non-metrical scales are on a par with modal metrical scales, and that the use of the latter is just a matter of convenience, is a gross depreciation of some of the greatest discoveries in the history of science: the isotropy and homogeneity of space, and the uniformity of time.
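The point about repeating a 10-second experiment can be illustrated with a toy model. This is only a sketch under loud assumptions: the function name uniform_seconds, the linear drift model and the drift rate are hypothetical, chosen to make the effect visible, and do not represent the actual irregularity of the earth's rotation.

```python
def uniform_seconds(scale_seconds, days_from_now, drift_per_day=1e-8):
    """Length, in uniform seconds, of an interval read off a drifting scale.

    Hypothetical model: one unit of the drifting scale lasts
    (1 + drift_per_day * days_from_now) uniform seconds.
    """
    return scale_seconds * (1.0 + drift_per_day * days_from_now)


# The same nominal duration of 10 scale-seconds, now and a century later.
today = uniform_seconds(10.0, days_from_now=0)
in_a_century = uniform_seconds(10.0, days_from_now=36500)
print(today, in_a_century)
```

On such a scale the same nominal duration corresponds to different uniform durations at different dates, so an experiment timed by it would not be repeatable in the sense described above.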

The conventionalist’s claim is based on the true but irrelevant statement that there are no logical grounds for accepting one scale above another one. Reichenbach says:

‘It is a matter of fact that our world admits of a simple definition of congruence, because of the factual relations holding for the behaviour of rigid bodies; but this fact does not deprive the simple definition of its definitional character.’[41]

 

However, these ‘factual relations’ are subjected to typical laws, which can be analysed with the help of modal laws, and the ‘conventional definitions’ are based on these laws. It is relevant that there are physical grounds for preferring metrical scales to merely ordinal ones.

 



[1] Pfanzagl 1968, 11: ‘… measurement in classical physics poses no problems comparable to those in the behavioral sciences …’

[2] Carnap 1950, 8-15; Hempel 1952, 54-58; Stegmüller 1969-1970, 17, 27ff.

[3] Because I shall not be concerned with measurements as a human act, I shall, from now on, speak of subjects: the objects of measurement are subjects of physical and pre-physical laws.

[4] Campbell 1921, Chapter 6; 1928, Chapter 1; Hempel 1952, 59; Suppes 1957, 96, 97; Ellis 1966, 27; Stegmüller 1969-1970, 29ff; Nagel 1960; Bunge 1967a, 36; 1967b, II, 197.

[5] Pfanzagl 1968, 29; Stevens 1959, 24-26; 1960, 141ff.

[6] The scratch test is not strictly transitive; cf. Campbell 1921, 128; 1928, 7; Hempel 1952, 61; there are more reliable tests.

[7] Nagel 1960, 126, 127; Bunge 1967c, 198.

[8] Bunge 1967c, 198.

[9] Reichenbach 1927, 135.

[10] Campbell 1928, 30ff; Poincaré 1906, 22; Menger 1949.

[11] See e.g. Campbell 1928, Chapter 9-11; Margenau 1950, Chapter 6; 1959, 163-176; 1960; Bunge 1967c, 209ff.

[12] Campbell 1928, 24; Hempel 1952, 29-39, 67, 68; Grünbaum 1963, 175, 176; Carnap 1966, 88; Stegmüller 1969-1970, 58, 90ff; Whiteman 1967, 256ff; Bunge 1967b, I, 149; II, 207ff; Cassirer 1910, 57; Bridgman, in Henkin et al. (eds.) 1959, 227.

[13] The international physical community, organized in the Conférence Générale des Poids et Mesures, designed the metric system of units and scales. The basic magnitudes and units of the Système International (SI) are: length (metre), mass (kilogram), kinetic time (second), electric current (ampere), temperature (kelvin), amount of substance (mole) and luminous intensity (candela). All other units are derived from these. Theoretically, a different base could have been chosen, e.g. electric charge or potential difference instead of current. The choice is made especially with regard to the possibility of establishing the unit and metric concerned with great precision. Physicists and astronomers do not always stick to these agreements, using the speed of light, the light year or the charge of the electron as alternatives to the standard units.

[14] Cp. Campbell 1928, 29; Bridgman 1927, 10, 23; for a criticism of this view, see Hempel 1965, 123ff; 1960; 1966, Chapter 7; Byerly, Lazara 1973.

[15] For a more elaborate discussion, see Helmholtz 1879; Hempel 1952, 62-69; Menger 1959; see also Bunge 1967c, 200ff.

[16] Strictly speaking we compare forces (weights) in a balance. Mass is a numerical projection of physical interaction, see chapter 5, and cannot be measured directly. Cp. Jammer 1961, 105ff.

[17] It is more complicated, but not essentially different, if we take into account the accuracy with which we can make replicas of A and B. A different but equivalent procedure is described by Campbell 1928, Chapter 2, 3; see also Lenzen 1938, 22ff; Suppes 1957, 96ff.

[18] Bunge 1967c, 199.

[19] Mach 1883, 268-269.

[20] Campbell 1921, 134, 142-144; Hempel 1952, 69; for a critical review of derived measurements and ‘operational definitions’ based on them, see Margenau 1960, 1950, chapter 12.

[21] Hempel 1952, 77, 78; Bunge 1967a, 34; 1967b, II, 200; Nagel 1960, 128; Stegmüller 1969-1970, 47. For instance, Hempel calls ‘hardness’ an intensive property whereas according to our definitions, it is neither extensive nor intensive. Intensive parameters are also called ‘potentials’.

[22] Redlich 1968.

[23] Redlich 1968.

[24] For the following discussion, see e.g. Morse 1964, chapter 1-6. Another example of an intensive magnitude is the electrical potential difference. The establishment of its metric between c.1780 and c.1850 caused difficulties similar to those encountered in the development of the temperature scale (22.5).

[25] Born 1949, 36.

[26] Still another method is Carathéodory’s; cf. Born 1949, 39ff.

[27] This ensures that the temperature difference between freezing and boiling water at standard pressure is still 100 degrees.

[28] Nagel 1961, 11.

[29] Grünbaum 1968, 14; see Poincaré 1905, Chapter 2; Stegmüller 1969-1970, 18, 35, 86, 98ff. For a critical review of this standpoint, see Nagel 1961, 179ff. Popper 1959, 144, 145 observes that the conventionalist’s concept of simplicity is itself conventional, and therefore arbitrary.

[30] Margenau 1950, 139; 1960; Nagel 1961, 255ff; Lenzen 1938, 19.

[31] Mittelstaedt 1963, 74. In nearly all physical cosmologies designed so far it is assumed that in the neighbourhood of the earth space-time is approximately flat, satisfying the pseudo-Euclidean metric of special relativity theory.

[32] Nagel 1961, 264; Reichenbach 1927, 26; Beth 1950, 122.

[33] Mittelstaedt 1963, 87.

[34] Whiteman 1967, Chapter 5.

[35] Grünbaum 1963, chapter 2(A).

[36] Grünbaum 1963, 74.

[37] Grünbaum 1963, 77; on page 75, Grünbaum admits that ‘… it is a highly fortunate fact and not an a priori truth, that there exists a time metrization at all in which all accelerations with respect to inertial systems are of dynamic origin, as claimed by the Newtonian theory …’ See also Grünbaum 1968, 59 ff.

[38] This is even more striking in the examples given by Hempel 1952, 73, 74, and Stegmüller 1969-1970, 73, who discuss a time scale based on the pulse beat of the Dalai Lama or the governing president of the United States, respectively. The outcome of any experiment as described above would depend on the momentary health of these dignitaries. See also Reichenbach 1927, 20, 21, 24.

[39] Reichenbach 1927, 20, 21.

[40] Reichenbach’s (1927, 27) distinction of ‘universal’ and ‘differential’ is erroneously reduced to that between geometry and physics.

[41] Reichenbach 1927, 17.

 

Part I, chapter 4

 

 

 

 

The dynamic development

 

of kinematics

 

 

 

 

 

 

 

4.1. The irreducibility of the kinetic relation frame

 

 

 

Chapter 4 investigates the dynamic development of the kinetic relation frame. Although the emphasis will be on relativity, it starts with the recognition of the irreducibility of this frame to the numerical, spatial, and physical ones.

 

One of Galileo’s greatest contributions to physics, although his own account of it is not quite correct, is the discovery that change of motion – not motion itself – needs a cause (T&E, 3.3). This principle is now known as Newton’s first law of motion: if no net force acts on a body it will not necessarily be in a state of rest, as was the prevailing view since Aristotle, but it will remain in a state of uniform rectilinear motion.

 

This statement has been criticized (T&E, 3.6). First, one may observe that forces are defined as causes of change of motion, and therefore it is circular reasoning to say that if there are no forces acting on a body there is no change of motion. Secondly, a state of uniform motion depends on the reference system with respect to which the motion is measured. Once again one may speak of circular reasoning. Now, when introducing fundamental, irreducible concepts, circular reasoning is not always avoidable. The problem is not how to get out of the circle, but how to get into it (3.7). Irreducible concepts cannot be derived from already known concepts, but have to be distilled from them, to be disentangled from historically grown views which are partly right, partly wrong. It needs giants like Galileo to perform this task.

 

Moreover, it is not quite correct to say that forces are defined by their static effects or by their effects on motion. This may be called their modal determination: in a purely modal sense forces are defined in this way. But this must be supplemented by the typical manifestations of forces, such as electric, magnetic, gravitational, frictional, and elastic forces. These can be distinguished, although they do not lead to typical motions. It makes no sense to say that a subject under the influence of an electric force ‘moves electrically’, ‘has an electric acceleration’, etc. Nevertheless these forces can be recognized in ways other than their action on moving bodies in a purely modal sense. They can balance each other, such that they are comparable. I shall defer the discussion of this matter until later. In this chapter I shall concentrate on the second problem mentioned above, the relevance of the reference system. This is possible just because of Galileo’s principle. It expresses the mutual irreducibility of the kinetic and the physical relation frames. Forces and other manifestations of physical interaction belong to the latter.

 

The relativity of motion implies that it is meaningless to attach motion to a single subject without reference to some other system. This does not mean that it is merely conventional to choose a reference system, even if dynamical effects are not explicitly mentioned. A notable example is Copernicus’ heliocentric system of planetary motion (T&E, chapter 2), which is generally undervalued by conventionalist authors. It is (erroneously) stated that the replacement of the 83 epicycles of Ptolemy’s earth-centred system by 17 epicycles in Copernicus’ theory greatly simplified matters, but nothing else, since, in principle, both should be considered on a par from a relativistic standpoint.[1]

 

The simplification was not even that great[2] since the predictive power of Copernicus’ model was not better than that of Ptolemy, and therefore Tycho Brahe’s objections against the new theory were sound enough.[3] Copernicus’ theory did not win the battle because of its simplicity or quantitative features, but because it proved to be superior in some qualitative respects. The assumption that the planets move around the sun, not around the earth, is not merely a change of reference system. It enabled Copernicus to solve several problems,[4] such as the problem of why Venus and Mercury are always seen near the sun, and therefore, why these planets’ period of revolution in their deferents is just one year; the problem of why Mars and the other superior planets always show retrograde motion when they are in opposition with the sun, and, therefore, why these planets’ period of revolution in their epicycles is just one year; the problem of why Venus’ appearance is ‘full’ when this planet is far away (small apparent diameter) and ‘crescent’ when it is nearby (large apparent diameter) and showing retrograde motion. The latter argument indicates that Venus moves around the sun, and not in a sphere well below the sun’s sphere, as in Ptolemy’s model.[5] In particular, Copernicus was able to determine the relative distances of the planets, which is impossible in the Ptolemaic system.

 

Hence, the Copernican system was accepted because it had greater explanatory power than Ptolemy’s. This was the case only after Galileo removed the largest objections against the dual motion of the earth, by introducing the ideas of inertia, relativity of motion, and superposition of motion. These objections were concerned with the fact that the motion of the earth had no consequences for the motion of terrestrial objects.

 

Why Kepler accepted Copernicanism is quite a different matter (T&E, 2.6, T&E, 3.5). Kepler and Galileo moved along parallel tracks. While Galileo removed the said objections, but remained faithful to uniform circular motion, Kepler came to reject the latter. Ptolemy’s system can be understood as a marvellous attempt to explain celestial motion in terms of simple uniform circular motion. Uniform circular motion was a kinetic principle of explanation introduced by Plato and maintained by Aristotle and all medieval authors, including Copernicus. Eventually Copernicus’ system was replaced by Kepler’s system, which is nearly the final solution of planetary motion as a kinetic problem.[6] Kepler himself was an ardent adherent of the Pythagorean-Platonic tradition, but since he rated Tycho Brahe’s observations higher than any theory, he came to reject circular uniform motion as an irreducible principle of explanation. Because the planets turned out to move in elliptical orbits with a varying velocity in his system, Kepler immediately recognized that the theory required a further explanation: not a kinetic, but a physical one, which was later provided by Newton’s theory of gravitation.[7] Newton replaced circular uniform motion by linear uniform motion as an irreducible principle of explanation.

 

Newton’s first law is sometimes considered to be a special case of his second law (T&E, 3.6).[8] However, the second law is only valid if taken with respect to inertial systems. A body on which no unbalanced force is acting moves uniformly with respect to an inertial system. Hence, the first law can be understood as an existential statement, stating the existence of inertial systems.

 

This implies the discovery that the physical interaction between two subjects is independent of their common uniform motion with respect to some spatial coordinate system, or the temporal moment at which the interaction occurs. This discovery was already made in classical physics, but it plays a far more consequential role in relativity theory.
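For the classical case this independence can be made explicit with a short worked equation. It is a sketch in one spatial dimension: the primed coordinates, the constant relative velocity v and the subject labels A and B are notation assumed here for illustration.

```latex
\begin{align}
x_A' &= x_A - vt, \qquad x_B' = x_B - vt, \qquad t' = t \\
x_A' - x_B' &= x_A - x_B \\
\frac{d^2 x_A'}{dt'^2} &= \frac{d^2 x_A}{dt^2}
\end{align}
```

Relative positions and accelerations are the same in both systems. A force that depends only on the relative position (and relative velocity) of A and B therefore has the same value in each, and the second law keeps its form.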

 

 

 

4.2. The uniformity of kinetic time

 

 

 

Classical physics was chiefly interested in so-called particle motion – the relative motion of rigid bodies. Although kinetic relations were described, kinetic subjects were not recognized. Partial recognition came in the various theories of wave motion, but only with the rise of quantum physics were genuine kinetic subjects (wave packets, see chapter 7) considered.

 

Uniform rectilinear motion is relative. One cannot say that a subject moves, if one does not specify with respect to which other subject it moves. Thus relative, rectilinear, uniform motion is a subject-subject relation. On the law side this relative motion presupposes the uniform flow of time as the kinetic time order. To the common view it seems rather obvious that time flows uniformly, e.g., an hour today is just as long as an hour tomorrow. As late as the 14th century, however, the day (the time between sunrise and sunset) was rigidly divided into twelve hours, with the effect that an hour in winter was shorter than an hour in summer in northern countries.[9] Clearly this chronology would not allow kinetic motion to be described as uniform, and it was abandoned when mechanical clocks came into use.

 

The idea of time flow is rejected by some philosophers,[10] because there is no motion besides the motion of actual subjects. In our view this argument does not hold, because every law is only meaningful if related to subjects. In the same vein, one may also hold that there is no space, because there are only spatial relations between actual subjects. Indeed, nowadays most philosophers and physicists agree that there is no space or time in an absolute sense (T&E, 3.7). Accordingly, in this book the view is defended that the uniform flow of time is a general, irreducible, modal order of time, as such unbreakably connected to subjective, relative motion.[11] On the one hand, the uniformity of motion means equal distances in equal times. On the other hand, the equality of temporal intervals is determined by a clock subject to the norm that it represents uniform motion correctly. This circularity is unavoidable, meaning that the uniformity of kinetic time is an unprovable axiom. However, this axiom is not a convention, but an expression of a fundamental and irreducible law.

 

Uniformity is a law for kinetic time, not an intrinsic property of time. There is nothing like a stream of time, flowing independently of the rest of reality. Time only exists in relations between events. The uniformity of kinetic time expressed by the law of inertia asserts the existence of motions being uniform with respect to each other.

 

Both classical and relativistic mechanics use this law to introduce inertial systems. An inertial system is a spatio-temporal reference system in which the law of inertia is valid. It can be used to measure accelerated motions as well. Starting with one inertial system, all others can be constructed by using either the Galileo group or the Lorentz group, reflecting the relativity of motion (3.3). Both start from the axiom that kinetic time is uniform.

 

Relative motion is objectified and measured by the velocity, the ratio of displacement (considered as a vector) and the duration of the motion. The displacement and the duration are connected via their common end points, usually called ‘point events’. Generally, an event is something endowed with typical individuality, but in a kinematic modal sense, it is a coincidence. The fact that events can be preceded and followed by other events refers back to the serial order of earlier and later. There are events which are simultaneous and this fact refers back to the spatial aspect. At first sight it appears possible to order events according to serial and spatial principles in an essentially static pattern of moments, which does not differ in any sense from a quasi-serial order (chapter 3).[12]

 

It is an empirical fact that a single identifiable subject can be at different places successively, and that different parts of the self-same subject can also occupy the same place at different moments. This is called motion. It leads to a new ordering, one irreducible to the spatial and the numerical, but which presupposes them. Attempts to reduce motion to succession in a continuous or dense point set lead to paradoxes like Zeno’s (T&E, 3.2). Linear motion by a representative point (like a centre of mass) supposes that all positions on the path of motion are traversed successively. Because on a continuous line no spatial point has a unique successor, this motion cannot be reduced to spatial continuity.

 

The path of the moving subject (which refers back to the spatial modal aspect), the displacement of the subject, and the duration of its movement are objects in the kinetic aspect. The latter two concepts, displacement and duration, should not be confused with relative position and time difference, respectively. Before these static relations can be used in a kinematic context, they must be developed into displacement and temporal duration, respectively. Whereas relative position and time difference relate different subjects, displacement and temporal duration apply to one subject. Displacement and temporal duration are related by the velocity of the movement. The velocity is therefore a numerical objective representation of relative motion. Velocity has a group character in both classical and relativity physics, although the group relations are different in the two theories. This means that point events as common boundaries of the displacement and the duration are second order objects. In particle physics, the path of the motion is usually reduced to the path of the centre of mass of the moving subject, which means that after objectifying kinematic subjects to rigid bodies, a rigid body is objectified to a single point. In field theories this is impossible because the motion of waves is essentially extended.

 

Hence what is usually called time or duration is but an objective relation in the kinetic modal aspect – a relation giving an objective representation of relative motion. Time receives its serial character because it refers back to the numerical modal aspect, but it is still subjected to the kinetic order of uniformity.

 

 

 

4.3. Combining velocities

 

 

 

Above I have argued that the difference between two rational numbers is a subjective relation in the numerical modal aspect, whereas the distance and velocity are objective relations in the spatial and kinetic modal aspects, respectively. If in one of these cases the relation between two subjects A and B is known, as well as the relation between B and a third subject C, would it be possible to find the relation between A and C? In the numerical aspect the answer is yes: if A, B, and C are numbers, then (A–C)=(A–B)+(B–C). In multidimensional space this simple addition rule is only valid if applied to the coordinates of the points A, B, and C. But generally speaking, the distance AC between the points A and C is less than the sum of the distances AB and BC (the two are equal only if B lies on the line segment between A and C). Thus the addition rule in the spatial modal aspect differs from the one in the numerical relation frame. What about the addition of velocities?

 

In classical mechanics the spatial substratum of kinetic motion is an absolute space in which distances retain their original geometric meaning. The time flow, as kinematical order of time, is also considered absolute, and only the numerical time difference or duration on the subject side remains as an objective measure of motion. Accordingly, one may add velocities in the same way as one adds distances in original geometrical space.

 

At a first approximation this is not too bad. Of course, in many cases original geometric space will approximate kinetic space very well. Specifically, this approximation appears to be valid as long as the relative velocities concerned are not too large (i.e., small compared to the speed of light).[13]

 

However, experiments such as those of Albert Michelson and Edward Morley in 1887 (T&E, 4.6), led to the conclusion that the classical addition of velocities is not valid if the speed of light is involved. Any velocity added to the velocity of light results in the velocity of light itself. Hendrik Lorentz[14] concluded that distances depend on motion. He tried to explain the phenomenon from the typical structure of matter by reducing the so-called Lorentz contraction of the measuring sticks used in the experiment to an electromagnetic cause. However, in 1905 Albert Einstein showed that this contraction has no dynamical, physical cause, but is entirely of a kinetic nature. But before he could do so, he had to reconsider the 19th-century concepts of absolute space and time (which were derived from Leibniz and Kant rather than from Newton, T&E, 3.7).

 

 

 

4.4. Einstein’s critique of absolute space

 

 

 

In classical physics the velocity of some particular moving subject is chosen as a unit, and the time needed to cover the unit of length is the unit of time: time is conceptually measured as a distance. Hence the comparison of two movements is reduced to the comparison of two distances covered in the same time. However, the possibility of measuring distances depends on the end points of the distance to be measured. Therefore, Albert Einstein’s critique of 19th-century kinematics was directed first of all to the use of the concept of spatial simultaneity in kinematics.[15]

 

In order to fix the velocity of a moving subject as the ratio of traversed distance and time difference, two clocks are required to establish the duration of the motion. These clocks, placed at the end points of the covered path, have to be synchronous. How is this established? There is no other possibility but to send a signal from one clock to the other. But then one has to know the velocity of the signal if one wishes to determine the time difference between its emission and arrival. To measure this velocity two synchronous clocks are needed, leading to a vicious circle.

 

Einstein proved there is only one way out of this deadlock. Suppose the signal emitted by clock I at time t1 is received at clock II at time t’, and immediately reflected, returning to clock I at time t2. Now the instant t=½(t1+t2) on clock I is defined to be simultaneous with time t’ on clock II.

 

This at first sight plausible definition is mistakenly called conventional because the signal is supposed to have the same velocity in both directions – and this presupposition cannot be verified in the above-mentioned procedure of synchronization.[16] Actually it is not really very plausible, and, in fact, even contrary to classical mechanics. If both clocks move (with the same velocity) with respect to a third subject, then the velocity of the signal according to classical mechanics is not the same in both directions if measured with respect to this third system. And if we apply this synchronization procedure to two clocks moving with respect to each other, then according to classical mechanics, it is impossible for the signal to have the same velocity in both directions with respect to both clocks. In fact, the absolute, resting electromagnetic ether of the 19th century can be said to be invented to overcome these difficulties. It follows that absolute simultaneity is not valid in kinetic space. Still it is a mistake to call the above-mentioned definition of simultaneity conventional. It is based on the isotropy of kinematical space, which does not permit different velocities of light in different directions.[17]
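The synchronization rule defined above can be illustrated with a minimal sketch. The function names and the sample clock readings are hypothetical, chosen only for illustration; the only content taken from the text is the definition t=½(t1+t2).

```python
def simultaneous_instant(t1, t2):
    """Instant on clock I defined as simultaneous with the reflection at clock II.

    t1: emission time of the signal, read on clock I
    t2: return time of the reflected signal, read on clock I
    """
    return 0.5 * (t1 + t2)


def clock_ii_offset(t1, t_prime, t2):
    """Amount by which clock II runs ahead of clock I under this definition.

    t_prime is the reading of clock II when the signal arrives there.
    An offset of zero means the two clocks count as synchronous.
    """
    return t_prime - simultaneous_instant(t1, t2)


# Hypothetical readings: the signal leaves at 0.0 and returns at 2.0 (clock I),
# while clock II shows 1.25 at the moment of reflection.
offset = clock_ii_offset(0.0, 1.25, 2.0)  # clock II is ahead by this amount
```

The sketch needs only readings taken on the spot at each clock; the velocity of the signal never enters, which is how the definition avoids the vicious circle described in the text.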

 

If one wishes to treat kinetic subjects, distance should not be conceived in a static-spatial sense, but must be opened up. Because no signal propagates with an infinite velocity, an immediately resulting distance has no kinematic meaning. 19th-century physics supposed the actual existence of a substantial ether as a physical-spatial substratum of optical and electromagnetic phenomena.[18] In the special theory of relativity Einstein proved that this hypothesis cannot be verified experimentally (T&E, 4.6).

 

 

 

4.5. The interval as an objective kinetic relation

 

 

 

Einstein based his theory of relativity on the hypothesis that one singular signal has the same constant velocity (c) with respect to all possible moving systems. It is not necessary that such a signal actually exists. The empirically established fact that the velocity of light in vacuum satisfies the hypothesis is comparatively irrelevant.[19]

 

In order to achieve this, Einstein had to amend the classical addition formula for velocities. In the one-dimensional case, two subjects moving with velocities v and w with respect to a third subject, have a relative velocity (v–w)/(1–vw/c²), instead of the classical value (v–w). It can easily be proved that (a) this relative velocity is independent of the choice of the reference system (i.e., a coordinate system with a clock), as it should be; (b) a subject moving with velocity c with respect to one reference system does so with respect to all reference systems; (c) no subject can move with a velocity exceeding the value c with respect to any reference system; (d) this expression approximates the classical one if the velocities are low.
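The four properties (a) to (d) can be checked numerically. The following is a minimal sketch, not a proof; the function name is hypothetical, and velocities are expressed as fractions of c (so c=1 by default), which is an assumption made for convenience.

```python
def relative_velocity(v, w, c=1.0):
    """Relativistic relative velocity of two subjects moving with
    velocities v and w (one-dimensional case) with respect to a third."""
    return (v - w) / (1.0 - v * w / c**2)


# (a) Frame independence: describe both subjects from a reference system
#     moving with velocity u; their relative velocity is unchanged.
v, w, u = 0.6, 0.2, 0.3
in_old_frame = relative_velocity(v, w)
in_new_frame = relative_velocity(relative_velocity(v, u),
                                 relative_velocity(w, u))

# (b) A subject moving with velocity c does so in every reference system.
light_seen_from_moving_frame = relative_velocity(1.0, 0.5)

# (c) Two subjects approaching each other at 0.9c each stay below c.
head_on = relative_velocity(0.9, -0.9)

# (d) At low velocities the classical value v - w is recovered.
slow = relative_velocity(1e-6, -2e-6)
```

With the classical formula (v–w) the value under (c) would be 1.8c; the denominator 1–vw/c² is what keeps the result below c.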

 

In original space the distance d is independent of the chosen coordinate system (2.7). Einstein defined the interval s between two point events at positions (x1,y1,z1) and (x2,y2,z2), and at times t1 and t2, by

 

 s² = c²(t1–t2)² – (x1–x2)² – (y1–y2)² – (z1–z2)² = c²(t1–t2)² – d²

 

Because the velocity of light c must be the same in any reference system, a spherical light wave front emerging from a point source must be spherical in any reference system. This leads immediately to the above formula. Einstein demonstrated that this pseudo-Euclidean metric of four-dimensional Minkowski-space (as it was later called) is independent of the choice of the moving reference system, i.e., the metric is invariant under all transformations of the Lorentz group. However, the distance d and the time difference (t1–t2) now depend on the motion of the reference system, and can no longer serve as independent objective time relations. They are replaced by the interval which now serves to objectify kinetic subject-subject relations. The interval itself does not describe motions. It is a relation in the opened-up numerical-spatial substratum of the kinetic aspect.
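The invariance of the interval, together with the frame dependence of distance and time difference, can be illustrated numerically. This is a minimal sketch: the standard Lorentz transformation for a reference system moving with velocity u along the x-axis is not written out in the text and is supplied here as an assumption, the function names are hypothetical, and units are chosen such that c=1.

```python
import math


def boost(event, u, c=1.0):
    """Transform the event (t, x, y, z) to a reference system moving
    with velocity u along the x-axis (standard Lorentz transformation)."""
    t, x, y, z = event
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return (gamma * (t - u * x / c**2), gamma * (x - u * t), y, z)


def interval_squared(e1, e2, c=1.0):
    """s^2 = c^2 (t1-t2)^2 - (x1-x2)^2 - (y1-y2)^2 - (z1-z2)^2."""
    (t1, x1, y1, z1), (t2, x2, y2, z2) = e1, e2
    return (c**2 * (t1 - t2) ** 2
            - (x1 - x2) ** 2 - (y1 - y2) ** 2 - (z1 - z2) ** 2)


# Two hypothetical point events, described in one reference system ...
e1 = (0.0, 0.0, 0.0, 0.0)
e2 = (2.0, 1.0, 0.5, 0.0)
# ... and in a system moving with u = 0.6 c relative to the first.
f1, f2 = boost(e1, 0.6), boost(e2, 0.6)

s2_before = interval_squared(e1, e2)
s2_after = interval_squared(f1, f2)
time_difference_before = e2[0] - e1[0]
time_difference_after = f2[0] - f1[0]
```

In this example the time difference changes from 2.0 to about 1.75 and the x-separation from 1.0 to about –0.25, whereas the interval has the same value in both systems.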

 

Three cases can be distinguished: s² may be negative, positive, or zero. In the first case, s²<0, it is always possible to choose a reference system such that (for the two point events under consideration) t1=t2. This means that with respect to that reference system (and all reference systems having the same velocity), the two events occur simultaneously, but at different places. This interval is now called space-like, because it looks like a distance. In other systems of reference, t1 may be before as well as after t2. It can be shown that in that case no causal relation between the two events can exist, so that the irreversibility as the physical time order is not violated.[20]

 

In the second case, s²>0, a reference system exists such that the two events occur at the same place (d=0), but at different times. If t1 occurs before t2 in this reference system, t1 occurs before t2 in every other reference system. In this case of a time-like interval, a causal relation between the two events is possible, and their time sequence is independent of the choice of the reference system. This is not the case if the reference system is transformed into one in which the time flow is reversed. This transformation (called time reversal) is kinematically admitted, but should be excluded with respect to physical interactions.[21]

 

In the third, borderline case, s²=0, the two events may be connected by a light signal. No reference system exists in which either t1=t2 or the two events occur at the same place. But if t1>t2, then this is the case in any other reference system (time reversal excluded).
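The three cases can be sketched for one spatial dimension as follows. The function names are hypothetical; the label "light-like" for the borderline case is the customary term and does not occur in the text; the Lorentz transformation is the standard one, assumed here with c=1.

```python
import math


def classify_interval(dt, dx, c=1.0, tol=1e-12):
    """Classify the interval between two point events separated by
    the time difference dt and the distance dx."""
    s2 = (c * dt) ** 2 - dx ** 2
    if s2 > tol:
        return "time-like"
    if s2 < -tol:
        return "space-like"
    return "light-like"


def special_frame_velocity(dt, dx, c=1.0):
    """Velocity of the reference system in which the events are simultaneous
    (space-like case) or occur at the same place (time-like case).
    Returns None in the borderline case, where no such system exists."""
    kind = classify_interval(dt, dx, c)
    if kind == "space-like":
        return c**2 * dt / dx
    if kind == "time-like":
        return dx / dt
    return None


def boost(dt, dx, u, c=1.0):
    """Time difference and distance as seen from a system moving with velocity u."""
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    return gamma * (dt - u * dx / c**2), gamma * (dx - u * dt)


# Space-like example: a system exists in which the events are simultaneous.
u_space = special_frame_velocity(1.0, 2.0)
dt_space, dx_space = boost(1.0, 2.0, u_space)

# Time-like example: a system exists in which the events share one place.
u_time = special_frame_velocity(2.0, 1.0)
dt_time, dx_time = boost(2.0, 1.0, u_time)
```

In both examples the required velocity is smaller than c, which is why such a reference system exists; in the borderline case the same recipe would demand a system moving with the velocity c itself.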

 

Hence, according to relativity theory, the numerical and spatial aspects of time do not lose their original meaning, but they lose their absoluteness when they are opened up by the kinematical modal aspect.[22] This applies to the subject side, where time difference and distance are bound together into the interval, as well as to the law side, where the order of before and after and that of simultaneity become relative to motion. In this respect relativity is profoundly different from the classical conception in which the numerical and spatial modal aspects function in closed form with respect to motion.

 

 

 

4.6. The special theory of relativity deals with the kinetic relation frame

 

 

 

Because the velocity of light c occurs in the metric of kinematical space, one may wonder if this metric refers rather to physical space, or perhaps to electromagnetic wave motion. Both questions should be answered negatively. I shall offer three arguments for this view, before presenting a more positive argument for the thesis that the special theory of relativity is purely kinematical.

 

The occurrence of a ‘typical number’ (c=3×10⁸ m/sec) in the metric is as such of minor significance. If length and width were measured in centimetres, and height in inches, distance would be defined by

 

d² = (x1–x2)² + (y1–y2)² + (2.5)²(z1–z2)²

 

in order to arrive at a consistent geometry. The occurrence of the remarkable number 6.25 in this formula, or that of c in relativity theory, could be avoided by the choice of a coherent system of units. The number c occurs in the metric simply as long as the use of metres and seconds is retained. The second is a kinematic objective unit, which in principle could be replaced by a unit related to the metre via the metric of special relativity, assuming c=1. This method has practical drawbacks (the speed of light is difficult to measure), but in the formulas of relativity theory, velocities are often given in proportion to the velocity of light, which may thus be considered the natural unit of speed.

 

In a purely mathematical analysis of the kinematic relation frame c is the limiting velocity of real moving subjects. A subject moving at higher speed would have imaginary time duration and spatial extension. Such quasi-subjects, baptized ‘tachyons’[23], may have an abstract, modal meaning, but they will hardly be recognizable as abstractions of real, actual subjects.[24] Even an actual light signal never propagates with the velocity c. This is a limiting velocity which would occur in a vacuum. But a vacuum, although nearly approximated in interstellar space, is itself a limiting abstract concept. No spatial realm is really empty, and in any material medium the velocity of light is less than c. The constant c is the limit rather than the velocity of light. In a medium one may find particles moving faster than light in that medium (this is the phenomenon of Čerenkov radiation), but their velocity is still smaller than c. The so-called phase-velocity of a wave packet may be larger than c, but the phase cannot transmit signals, and the so-called group-velocity of the packet, which is identified with the particle’s velocity, is always smaller than c.

 

The laws of relativity theory have other consequences for a number of physical phenomena which are not necessarily of electromagnetic origin. For instance, all particles having zero rest mass move with the velocity c (in vacuum). This is not only the case with light quanta, but also with the as yet hypothetical quanta of gravitation. Another consequence of relativity theory is that the measured (objective) mean decay time of moving radioactive particles increases as they move faster with respect to the measuring instrument. This time dilation is not only observed in radioactive decay caused by electromagnetic interaction, but also occurs if caused by weak or strong nuclear interaction. The latter cannot be reduced to electromagnetic forces, whereas their velocities of transmission are less than the speed of light. In other words, it might have been possible to discover the laws of relativity theory if one had known only the time dilation of non-electromagnetic phenomena. One could have found these laws if all actually existing signals moved with velocities less than the constant c in the metric.[25] Thus c is not, in the first place, the velocity of light, but rather the velocity of light’s propagation in a vacuum is equal to c due to the typical structure of electromagnetic interaction.[26]
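The time dilation mentioned above can be made concrete with a short sketch. The formula τ=τ0/√(1–v²/c²) is the standard one; it is not written out in the text and is assumed here. The function name is hypothetical, and velocities are given as fractions of c.

```python
import math


def dilated_decay_time(tau_rest, v, c=1.0):
    """Mean decay time measured for particles moving with velocity v,
    given their mean decay time tau_rest in their own rest system."""
    if abs(v) >= abs(c):
        raise ValueError("the velocity must be smaller than c")
    return tau_rest / math.sqrt(1.0 - (v / c) ** 2)


# The measured decay time grows with the velocity, regardless of which
# interaction causes the decay: only v/c enters the expression.
at_rest = dilated_decay_time(1.0, 0.0)
moving = dilated_decay_time(1.0, 0.6)
moving_faster = dilated_decay_time(1.0, 0.9)
```

Nothing in the expression refers to electromagnetism; the constant c enters only as the kinematic unit of velocity, which is the point of the argument above.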

 

 

 

The speed of light as the unit of velocity

 

Let me now present the positive argument for the thesis that the special theory of relativity is purely kinematical. Section 2.8 showed that the choice of the unit is arbitrary for spatial coordinate systems in the Euclidean metric. However, in order to be able to give transformation rules between the several possible coordinate systems, it must be assumed (as is usually tacitly done) that the same unit of length applies in all coordinate systems. In Galilean relativity the same assumption is made. It is taken for granted that the units of length and of time are the same in all reference systems. This assumption is sufficient to derive the so-called Galilean group of transformations between inertial systems. But the choice of these units, as basic units, should now be scrutinized.

 

In this context time means kinetic time, determined by the distance covered by a uniformly moving subject. However, in different frames of reference, one has no right to assume that the unit of length will be the same, and, accordingly, that the unit of time will be the same. Moreover, it may be questioned whether length and time should be taken as basic parameters. The parameter which distinguishes the kinematic reference frames is velocity, just as is distance in the spatial case. Apparently, velocity is a derived quantity, for it is defined as the ratio of the covered distance and the corresponding time interval. Because instantaneous velocity can only be approximated in this way, this definition could only be anticipatory. However, kinetic time can only be introduced with the help of a subject moving with constant velocity. Thus it seems appropriate to take velocity as the basic unit in kinematic reference systems, and to demand that the kinematic transformation rules leave the unit of velocity invariant. If this unit is taken to be c, one arrives at the basic hypothesis of Einstein’s theory of special relativity. This hypothesis is again sufficient to find the so-called Lorentz-group of transformations between inertial systems. It is an empirical matter whether the Galilean or the Lorentz transformations are valid – there is no logical ground for this decision.

 

Einstein’s paper of 1905, in which he published his relativity theory for the first time,[27] was divided into two parts. The kinematical part gives all the relevant formulae of relativity theory, which are applied to the electromagnetic problems in the second part. Before Einstein, Henri Poincaré came very close to the discovery of the special theory of relativity. He denied absolute space and absolute time, referred to a principle of relative motion and a principle of relativity, and sought invariant forms of physical laws under transformation. ‘But the existence of the ether is rarely doubted, for, like Lorentz, Poincaré explained by compensation of effects the apparent validity of absolute laws in moving inertial systems and maintained the privileged position of the ether’.[28]

 

Only Einstein took the decisive step, recognizing that the relativistic effects have a kinematic origin, rather than a physical one. This also applies to Walter Kaufmann’s experimental discovery that the mass of fast moving electrons depends on their speed (T&E, 4.6). By proving the equivalence of mass and energy, Einstein put an end to many speculations about the origin of mass.[29]

 

 

 

4.7. The principle of relativity

 

 

 

Besides uniform linear motion a rigid body is able to perform a uniform rotation without any physical cause. The difference is that the latter is possible only with a rigid body (or at least a system whose parts are kept together by an attractive force, such as the planetary system). Uniform linear motion is possible for any subject. In fact, every part of the rotating body experiences centrifugal and Coriolis forces, but due to its internal coherence the force exerted on one part is compensated by that on another part. Therefore, the uniform rotation is a bounded uniform motion, not entirely of a general modal nature, since it depends on the typical structure of the body, giving rise to some internal force.

 

Both in classical mechanics and in special relativity, any reference system rotated through a finite angle with respect to an inertial system is itself an inertial system. This has nothing to do with uniform rotational movement, and applies as well to the coordinate systems in geometrical space. Any coordinate system can be rotated through any angle or displaced over any distance without disturbing the relative spatial positions of the subjects (2.8). But two different reference systems can not only be translated over any distance or rotated through any angle; they can also move uniformly with respect to each other without generating a fictitious gravitational field which would exist in one system, but not in the other one. On the other hand, in a uniformly rotating reference system fictitious gravitational fields must be introduced. This is the case in special and in general relativity theory, just as in classical mechanics.

 

Isaac Newton knew very well that it is impossible to derive the existence of an absolute system of reference by considering uniform linear motion alone. As a minimum he thought that all rotations could be established experimentally with respect to this absolute space (the famous pail experiment[30]). His main contemporary opponents, Christiaan Huygens and Gottfried Leibniz, could not refute his arguments, because their arguments were mainly logical rather than physical.[31] It was not until 1883 that Ernst Mach pointed out a flaw in the reasoning: the experimentally established motion does not occur with respect to an absolute reference system, but with respect to the whole of all matter found in space (T&E, 3.7).[32] Later Einstein corrected this view by showing that the rotation occurs with respect to a local inertial system.[33]

 

Indeed, the distinction between linear motion, as a purely kinetic motion, and rotation, as a movement which anticipates the physical modal aspect, is only comprehensible if the irreducibility of the kinetic modal aspect is accepted as an empirical fact. Newton as well as Leibniz and Huygens were led astray in their judgment of spatial concepts by the supposed reduction of kinetic relations to spatial relations, or conversely, by the inclusion of geometry in kinematics or mechanics. On the one hand, Newton considered the spatial aspect to be subordinated to the mechanical one.[34] This convinced him of the existence of an absolute space – an idea which, he thought, was confirmed by experiments on rotating systems. On the other hand, Huygens and Leibniz tried in vain to reduce the relativity of motion to the relativity of spatial position. Since spatial relative positions are invariant with respect to both translations and rotations of the coordinate system in Euclidean space, they had to assume that not only linear motions but also circular motions ought to be purely relative in a kinematical sense.[35]

 

The ideas of Huygens and Leibniz were revived into Mach’s principle (T&E, 3.7), which in its original form stated that any kind of inertia is caused by the mutual interaction of matter. It is now generally (though not unanimously) rejected, because it turns out to be very difficult to develop this principle into a satisfactory mathematical theory.[36] The principle of relativity, as formulated by Einstein, is more restricted than Mach’s principle.

 

If we make a transition from one inertial system to another, the interval remains the same. Therefore, the interval is called an invariant. The components (x1–x2) etc. of the interval are changed, however. We call a certain variable (depending on x, y, z, and t) a covariant if it transforms at the transition in the same manner as the four components of the interval.[37] Every mathematical expression or quantity referring to the physical aspect should be either an invariant or a covariant, because of the mutual irreducibility of the kinetic and physical aspects. This is Einstein’s principle of relativity, expressed in terms of the philosophy of the cosmonomic idea. It is the same requirement of objectivity as discussed in section 2.8. The electric charge, the internal energy or rest mass, and the entropy of an isolated system are invariants. The total energy plus the momentum, and the electric plus the magnetic field strengths in a point, are covariant variables.

 

The principle of relativity is based on the mutual irreducibility of the kinetic and the physical relation frames.[38] Thus far the principle of relativity was mainly treated considering the subject side, but it has also bearing on the law side. The formulation of physical laws must be frame-independent, whereas the subjective initial and boundary conditions depend on the choice of frame.[39] In both cases the same thing is meant.

 

Because the pre-physical modal aspects are irreducible to the physical one, they can be used to objectify the latter. But in order to make full use of this possibility due account should be given of that irreducibility. The laws of physics must be independent of time, position, and motion. This implies, e.g., that if one considers different sets of subjects with similar subject-subject relations, one must have similar experimental results. This is the basis of objective experimental research, which must arrive at results, reproducible at any place, at any time, and at any velocity.

 

In its turn, the frame invariance of physical interaction gives rise to the conservation laws of energy, linear and angular momentum, and of the motion of the centre of mass, for isolated systems. Hence, these laws are related to the irreducibility of the physical modal aspect and the aspects of number, space, and motion (chapter 5).

 

 

 

4.8. General theory of relativity

 

 

 

After the dynamic development of the kinetic relation frame and the preceding ones in the special theory of relativity, in the general theory the kinetic frame is in turn opened up by the physical one.  

 

Both in classical mechanics and in special relativity theory purely kinetic uniform motion is rectilinear. This is related to the assumption that the spatial substratum is (pseudo-)Euclidean. However, according to Einstein’s general theory of relativity, the physically relevant spatial substratum is not Euclidean, but is determined by the temporal and spatial distribution of energy. With respect to this non-Euclidean space, kinetic motion occurs along a so-called geodesic, which is the equivalent of a straight line in Euclidean space.[40] Light travels along a geodesic just like a freely falling massive subject. According to Einstein, the local inertial system serves as a substratum for physical interactions.[41] In particular, Newton’s third law of action and reaction is supposed to be valid (in a static situation) only if referred to a local inertial system. All available local inertial systems are very much like the pseudo-Euclidean reference systems of special relativity theory. Usually the latter are taken as the pre-physical substratum of physical interactions, except when large-scale phenomena are studied.[42]

 

The non-Euclidean character of the metric as determined by the energy distribution affects not only the spatial part of the metric, but also the quantitative one. The metric does not only refer to space, but to the whole numerical and spatial substratum of the kinetic relation frame, now in its opened-up form. If the time flow is supposed to be uniform with respect to this reference system, it is no longer uniform with respect to a Euclidean reference system.

 

If kinetic motion is described with respect to a Euclidean reference system, one explains that this motion is not uniform by introducing a field of gravitation. A (non-Euclidean) reference system in which no gravitation occurs is called an inertial system. A non-inertial reference system has accelerated motion with respect to an inertial system. For example, the reference system connected with an artificial earth satellite is an inertial system in which the gravitational field of the earth has been transformed away. This can only be achieved locally, because it is impossible to find a universal inertial system – i.e., a system with respect to which any gravitational field wherever is transformed away.

 

Using a reference system moving non-uniformly with respect to a local inertial system, a gravitational field is experienced, such as feeling an extra weight in an elevator accelerating upwards. A physically more important example is uniform rotation. If the earth is considered as a reference system then part of its experienced gravitational field is caused by the earth’s rotation and gives rise to centrifugal and Coriolis forces. These forces are sometimes called fictitious because they have no physical origin, but a kinetic one. According to Einstein such a fictitious field cannot be distinguished from the gravitational field determined by the spatial energy distribution (principle of equivalence).
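
The kinetic origin of these forces can be made explicit in a worked equation. The following is a sketch in standard textbook notation, which is an assumption and not the author's own: a reference system rotates uniformly with angular velocity Ω with respect to a local inertial system, F is the sum of the real forces on a subject of mass m, and primes mark quantities measured in the rotating system.

```latex
\begin{equation}
m\,\mathbf{a}' \;=\; \mathbf{F}
  \;\underbrace{-\,2m\,\boldsymbol{\Omega}\times\mathbf{v}'}_{\text{Coriolis}}
  \;\underbrace{-\,m\,\boldsymbol{\Omega}\times(\boldsymbol{\Omega}\times\mathbf{r}')}_{\text{centrifugal}}
\end{equation}
```

Both additional terms are proportional to m and depend only on the motion of the reference system, not on any interacting partner. This is why they produce the same acceleration for every free subject, as a gravitational field does.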

 

This is not completely correct, however, because in contrast to the latter gravitational field, a fictitious field can be transformed away everywhere. This means that the two fields are indistinguishable only locally.[43] Moreover, the fictitious field has no physical source, but has a purely modal kinetic origin and meaning, whereas the energy distribution, the source of a real gravitational field, is always connected to some typical structures, by which it can be identified. In extended freely falling systems (like the earth in the gravitational field of the sun and the moon) one has detectable differential gravitational forces, like those giving rise to the tidal motions of the seas. The existence of a uniform homogeneous gravitational field throughout the universe can be ruled out by symmetry arguments (isotropy of space). Thus we find that real forces (including the gravitational force) are expressions of physical subject-subject relations, which cannot be said of fictitious forces. In particular, fictitious forces are not subject to Newton’s third law of motion, the law of action and reaction. Because a fictitious force has no physical origin, there is no reaction force.

 

Einstein had two starting points for the general theory of relativity. (a) Newton’s theory of gravitation implied immediate action at a distance, and is therefore incompatible with relativity theory.[44] (b) Gravitation is a universal interaction,[45] and must also be applicable to systems not endowed with mass, e.g., light signals. The second point means that the general theory of relativity should be considered a modal theory. All free physical subjects, irrespective of their typical structure, move uniformly in a local inertial system, or they are influenced by the corresponding gravitational field in a non-inertial reference system in exactly the same way. This is sometimes expressed by the equivalence of gravitational and inertial mass (T&E, 9.2). However, the concepts of gravitational and inertial mass do not apply to light signals which also move along geodesics.[46] On the other hand, it is impossible to transform away an electromagnetic field, which has a typical structure. For example, the influence of this field on the motion of a subject depends on the ratio between its charge and rest mass, that is, on its internal typical structure.

 



[1] Margenau 1950, 96, 97; 1960; see also Reichenbach 1927, 211, 217-219. Reichenbach is more cautious than Margenau 1960, but he is mistaken when he prefers the Copernican system to the Ptolemaic one, because the former has a dynamic explanation. Such an explanation is only possible with Kepler’s system. Mach 1883, 279, 283-284 does not discuss the epicycle theory. He merely states that the Ptolemaic and the Copernican modes of view are equally correct. Only the latter is more simple and more practical. ‘The universe is not twice given, with an earth at rest and an earth in motion, but only once, with its relative motions alone determinable.’

[2] Dijksterhuis 1950, 325 observes that the introduction of the earth’s motion about the sun could not make more than five epicycles superfluous, and Kuhn 1957, 171 states: ‘Judged on purely practical grounds, Copernicus’ new planetary system was a failure; it was neither more accurate nor significantly simpler than its Ptolemaic predecessors …’. See also Kuhn 1957, 168; Koestler 1959, 194, 195, 579, 580; Koyré 1961, 43; Feyerabend 1964; Gillispie 1960, 24-26; Toulmin, Goodfield 1961, 175, 179.

[3] Dijksterhuis 1950, 332ff; Kuhn 1957, 200; Feyerabend 1962, 260, 261; 1978, 40ff; Toulmin, Goodfield 1961, 184ff; Hanson 1973, 171-249.

[4] Copernicus 1543; Dijksterhuis 1950, 321-324; Kuhn 1957, 171-180; Koyré 1961, 45ff, 129; Toulmin, Goodfield 1961, 172-173; Hesse 1974, 232; Lakatos 1978, 168-189.

[5] Actually, the fact that Venus’ brightness does not vary appreciably during its motion around the sun was used as an argument against the Copernican theory by Osiander in his Preface to Copernicus’ work (see Copernicus 1543, 22). By pointing to the phases of Venus, Galileo turned the argument in favour of Copernicus’ system. See Galileo 1632, 334-339; Feyerabend 1975, 109-111; Kuhn 1957, 222-224.

[6] Kuhn 1957, 209ff; Dijksterhuis 1950, 335-357; Koestler 1959, 227-427; Koyré 1961, 117-464; Toulmin, Goodfield 1961, 198ff.

[7] Kuhn 1957, 153, 245, 252; Toulmin, Goodfield 1961, 201.

[8] See e.g., Mach 1883, 171-172. This gives rise to the misunderstanding that inertia is due to the mass of the subject. But Newton’s first law does not contain any reference to the mass of the subjects concerned. See also Dijksterhuis 1950, 519f.

[9] Whitrow 1961, 175.

[10] Part II in Gale (ed.) 1967.

[11] The distinction of uniform time flow as a law, and uniform motion as a subjective time relation should not be confused with Newton’s distinction of absolute and relative time (see Newton 1687, 6).

[12] Gale (ed.) 1967, 66.

[13] Analogously, one may add geometrical distances in the same way as numerical differences if the three points concerned are situated on or near a straight line.

[14] Lorentz 1895, 1-7.

[15] Einstein 1905a; 1921.

[16] Reichenbach 1927, 123-129; Grünbaum 1963, 342-268, 666-708; 1968, 295-336.

[17] Bunge 1967a, 187-188; Whiteman 1967, 36; Sklar 1974, 287-294.

[18] See Doran 1975; Goldberg 1970; Hesse 1961; Hirosige 1976; Schaffner 1972; Swenson 1972; Whittaker 1910, 1953.

[19] It seems that Einstein developed his special theory of relativity without having knowledge of the experiments of Michelson and Morley. See Shankland 1963, 1973; Bunge 1967a, 193; Holton 1973, 261-352. For an opposite view, see Grünbaum 1963, 377-386, 834-837. See also Gutting 1972; Swenson 1972; Hesse 1974, 246; Williamson 1977.

[20] Bunge 1959a, 65ff; 1967a, 206: ‘… the space of events, in which the future-directed [electromagnetic] signals exist, is not given for all eternity but is born together with happenings, and it has the arrow of time built into it.’ Sometimes two events with a space-like interval are called ‘topologically simultaneous’, where ‘simultaneity’ is the relation of not being connectable by a physical causal chain or signal. See Grünbaum 1960, 410ff; 1963, 28-32, 351; 1968, 22; Reichenbach 1927, 127, 145-147; 1956, 40-41. The relation ‘topologically simultaneous with’ is not transitive.

[21] Reichenbach 1956, 42.

[22] Hence I reject the view, expressed by Minkowski in 1908: ‘Henceforth space by itself, and time by itself are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality’, Minkowski 1908, 75; for a criticism of the Minkowski formalism, see O’Rahilly 1938, 404-419, 732-740.

[23] Feinberg 1967; Reichenbach 1927, 147.

[24] It may be called an ‘axiom of identity’ that an identifiable moving subject may be at the same place at different moments, but not at different places at the same moment, see p. 84, 85. Tachyons do not satisfy this axiom.

[25] A similar argument is given by Goldstein 1959, 200: ‘… the transformation properties must be the same for all forces no matter what their origin. The statement ‘a particle is in equilibrium under the influence of two forces’ must hold true in all Lorentz systems, which can only be the case if all forces transform in the same manner.’

[26] For a contrary view, see Bunge 1967a, 182ff, who, e.g., defines an inertial frame of reference as one in which Maxwell’s equations are satisfied. I think Bunge remains too close to the ‘… historical origins of the theory as far as its leading axioms are concerned’ (182), which, of course, are not denied in our treatment. Bunge explicitly states that there would be no basis for the special relativity theory without an electromagnetic field (205).

[27] Einstein 1905a.

[28] Holton 1973, 187; Poincaré 1905, 1906.

[29] Einstein 1905b.

[30] Newton 1687, 10, 11.

[31] Kuhn 1962, 72; Jammer 1954, 114ff; Reichenbach 1927, 213.

[32] Mach 1883, 279-286. A critique of Mach’s views does not need to agree with Russell 1927, 17, who says that ‘… the influence attributed to the fixed stars savours of astrology, and is scientifically incredible’.

[33] Eddington 1920, 157-165.

[34] Jammer 1954, 93ff.

[35] Jammer 1954, 114-124. Even Maxwell mistakenly stated: ‘Acceleration, like position and velocity, is a relative term, and cannot be interpreted absolutely.’ Cp. Maxwell 1877, 25, and Larmor’s footnotes on this page.

[36] Mach 1883, 286-290; Reichenbach 1927, 210-218; Graves 1971, 298-305; Jammer 1954, 139-141, 190-196; Grünbaum 1963, 418-424; Bunge 1967a, 134; Mittelstaedt 1963, 81ff; Sklar 1974, 157-234; Whittaker 1953, 168, 183; Nagel 1961, 203-214.

[37] This is somewhat loosely expressed. We shall not bother with the further distinction of covariant and contravariant magnitudes (cf. Landau and Lifshitz 1970, 26).

[38] For a contrary view, see Bunge 1967a, 183: ‘The principle of relativity is, in short, (a) a heuristic principle and (b) a metalaw statement – and a normative one not a declarative metanomological statement, for it does not say what is but what ought to be the case.’

[39] Houtappel et al. 1965, 596; Bunge 1967a, 86, 87. By the distinction of physical laws from the initial and boundary conditions we meet ‘Curie’s observation’ that if the world, in all its details, were invariant with respect to displacement there would be no way to distinguish between the two parts. See Houtappel et al. 1965, 596.

[40] On a sphere, being a two-dimensional non-Euclidean manifold, a great circle is a geodesic, which may be the shortest as well as the longest connection between two points.

[41] Mittelstaedt 1963, 78. Einstein once used the name ‘ether’ for this substratum. This is unfortunate, because the latter has nothing in common with the 19th-century ether.

[42] Cp. Whiteman 1967, 179.

[43] Neumann 1932, 205ff; Bunge 1967a, 210ff.

[44] Jammer 1954, 171ff; 1957, 257; 1961, 205; Akhieser, Berestetsky 1953, 372ff.

[45] Newton was the first to realize the universality of gravitation by discovering (a) that the force between the sun and the planets, the earth and the moon, and the force causing falling motion, are the same, and (b) that the gravitational force on a subject is proportional to its mass, regardless of its typical structure or composition. See Mach 1883, 229-234, 241.

[46] Bunge 1967a, 207ff; 1967b, I, 400.

 

Part I, chapter 5

 

 

 

Interaction

 

 

 

5.1. Isolated systems

 

 

 

Chapters 2 and 4 discussed general subjective relations, qualified by the modal aspect in question: numerical difference, spatial relative position, kinematic relative motion. To abstract from the typical individuality of things, events, etc., in order to study their modal relations is more difficult in the kinetic and physical modal aspects than it is in the numerical and spatial relation frames.

 

It turns out that interaction is the general modal physical subject-subject relation. This means that the possibility of isolating systems is limited. In fact, no pair of physically qualified systems is completely isolated. Belonging to the creation implies interacting with every other created thing. Hence the introduction of isolated systems does not seem to be germane to the problem of time. By definition two isolated systems do not interact physically, and so do not maintain a physically qualified subject-subject relation, although they still have pre-physical relations: relative magnitudes, relative positions, and relative motion. On the other hand, if two systems interact it may be difficult to distinguish them from each other. The interaction between two systems may be so strong that they should be considered as one system. Depending on its context, one may speak of a modal physical subject if it can be isolated such that its external interactions are negligible.[1] This does not mean, however, that it loses its individuality as soon as it interacts with another subject, although this may happen. The strength of the interaction will determine whether two subjects can be distinguished from each other. In any case, it appears to be extremely fruitful to speak of the interaction of separate systems, especially if they are isolated except for this particular interaction. In fact, it appears to be a necessary methodological prerequisite for their analysis, both theoretical and experimental.[2]

 

The study of the abstract general characteristics of a physical subject requires leaving aside its typical structure. This means that the present chapter will not discuss the branches of physics investigating the typical structures of physical subjects: electromagnetism, nuclear and atomic physics, solid state physics, chemistry, etc., and also statistical physics, which studies the behaviour of a large number of interacting systems of a certain kind (presupposing their structural similarity). The present chapter is restricted to interactions in which either the internal state of the system (thermal physics) or the external states (mechanics) are involved in a purely modal way.[3]

 

The objectification of a physical subject invariably requires use of the concept of a state. In this concept the identity of a system is presupposed, otherwise it would be meaningless to say that a system can be in different states, or can change its state. Strictly speaking a state can only be ascribed to an isolated system, yet it is often possible to speak of the state of a composite system.

 

In the concept of a state three aspects can be distinguished. First, the state has a specific numerical value for a certain number of physical variables. It is said to be completely determined by a number of variables if all other physical properties of the system can be derived from them. These variables simultaneously determine the state. The number of independent variables necessary to determine the state is the latter’s dimension. Secondly, the state of a system in its spatial relation to other systems can be considered, if they interact statically, i.e., via a field or a force. Finally, the state of motion with respect to some other system may be relevant. In each case the state is changeable.

 

Among the modal characteristics of physical subjects the concepts of energy, force, and current are the most important. It will be demonstrated that these always refer back to the numerical, spatial, and kinetic relations, respectively. Therefore, their numerical values serve as mathematical objectifications of physical relations. Except for very artificial constructions, physics cannot do without these or equivalent concepts, because energy, force, and current refer back to mutually irreducible modal aspects. On the other hand, these are strongly related because each is a projection of the same physical aspect. In monistic philosophies, such as were popular in the 19th century, this view is unacceptable. However, various attempts to reduce one to the other have always been in vain. In the philosophy of the cosmonomic idea it becomes clear why this is impossible.

 

 

 

5.2. Thermal physics

 

 

 

An isolated system is an abstract concept because no concrete physically qualified subject can be completely isolated from other subjects, and because it does not take into account the individual character inherent in any concrete physical system. Nevertheless it is a meaningful concept, since even in experimental physics walls can be devised which are nearly impermeable to energy and matter transport. However, this concept is especially meaningful as a theoretical concept because it allows one to study modal physical laws.

 

Thermodynamics deals with the modal physical properties of macroscopic bodies. It was developed in the first half of the 19th century by Sadi Carnot, Julius Mayer, James Joule, Hermann Helmholtz, Rudolf Clausius, William Thomson (Lord Kelvin), and others (T&E, 7.4). In the beginning of the 20th century Constantin Carathéodory investigated its foundations. However, the axiomatic representation of this branch of physics is still a matter of dispute,[4] and I shall discuss its hypotheses without pretending rigour or completeness.

 

Whereas in mechanics a physical subject can often be objectified by a spatial point (the centre of mass), in thermal physics (which includes statistical physics and thermodynamics) a physical subject has connected and interacting parts. As a first hypothesis it is stated that any isolated system has a macroscopically unique equilibrium state, designated by a limited set of extensive parameters (3.8), such as its volume (V) and its internal energy (U). Some of these parameters may be determined by the boundary conditions (the volume, or a static field), while others are determined by the internal structure of the system.

 

To understand the meaning of the extensive parameters, one has to consider some possible interaction, because the state of an isolated system can only have meaning while anticipating some interaction. Suppose two systems A and B are completely isolated. The state of the system AoB consisting of the physical sum of A and B is designated by extensive parameters like volume V and energy U, whose values are the numerical sum of the values for the separate systems:

 

V(AoB)=V(A)+V(B),  U(AoB)=U(A)+U(B)

 

Provided no chemical reaction takes place, the number of moles (Ni) of each chemical species is also an extensive quantity. If a chemical reaction takes place, the number of moles of each atomic component must be incorporated as well.

 

Now let the two systems interact with each other. According to a second hypothesis, the decrease of any extensive parameter for A equals the increase of the corresponding parameter for system B. During the interaction, the total volume, energy, and number of moles (or atoms) of any kind, are unchanged. It seems obvious that if the volume of a system increases, the volume of its surroundings must decrease by the same amount. It is not trivial that this also applies to energy. The conservation of energy is a quantitative expression of the physical subject-subject relation.

 

 

 

The first two laws of thermodynamics

 

The first and second hypotheses imply that the extensive parameters also serve to describe the system when it is not in equilibrium. In this case the description is not unique. Different non-equilibrium states correspond to the set of extensive parameters. A non-equilibrium state of an isolated system can only be described by the way it was prepared (which may include ‘waiting a little’[5]). However, according to a third hypothesis, there exists a mathematical function of the extensive parameters, called the entropy S, which has a definite value for the equilibrium state of the system, and which can be used to describe the development of a system which is not in equilibrium. For the physical sum AoB of two mutually isolated systems A and B, the entropy is the numerical sum of the entropies of each system:

 

S(AoB)=S(A)+S(B)

 

If the two systems interact, the total entropy will stay constant or increase. For two parts of a system, which as a whole is in internal equilibrium, S is an extensive parameter. But with respect to a system which is not in equilibrium, the increase of S is related to the current between parts of the system, and with respect to such parts (or to different non-interacting systems), S determines the ‘generalized force’ or ‘potential difference’ between them. This hypothesis is called the Second Law of thermodynamics.
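
The statement that the total entropy stays constant or increases can be illustrated numerically. The following is a minimal sketch, assuming two bodies with constant heat capacities that exchange heat only with each other. This idealization, the function name `equilibrate` and the numerical values are illustrative assumptions, not taken from the text.

```python
import math


def equilibrate(c_a, t_a, c_b, t_b):
    """Return (final temperature, total entropy change) for two bodies.

    Assumption for illustration: both bodies have constant heat
    capacities c_a and c_b (J/K), start at absolute temperatures
    t_a and t_b (K), and are isolated except for their mutual
    heat exchange.
    """
    # Energy conservation: the energy lost by one body is gained by the other.
    t_final = (c_a * t_a + c_b * t_b) / (c_a + c_b)
    # Entropy change of each body: the integral of C dT / T.
    ds_a = c_a * math.log(t_final / t_a)
    ds_b = c_b * math.log(t_final / t_b)
    return t_final, ds_a + ds_b


# A hot body A (400 K) in contact with a cold body B (200 K).
t_final, ds_total = equilibrate(c_a=100.0, t_a=400.0, c_b=100.0, t_b=200.0)
print(t_final, ds_total)
```

The entropy of A decreases and that of B increases, but the sum is positive. It vanishes only if the two initial temperatures are equal, in which case nothing happens.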

 

Energy is always involved in any possible interaction. This means that energy is a relevant state parameter for any thermodynamic system. An interaction in which energy is not involved, would not lead to equilibrium.[6] This fourth hypothesis is usually formulated in the so-called First Law. It states that the energy increase of any thermodynamic system equals the sum of the work performed on the system, and the heat transferred to it. Heat is the product of the temperature of the body and its entropy increase during the heat transfer, and work is related to a change in any extensive state parameter except energy. Work is invariably determined by a change in the boundary conditions. Whether a certain extensive magnitude is relevant to a certain system depends on whether it is possible to perform work on that system by changing that parameter. This means, for example, that the magnetization of a non-magnetic gas is not a relevant state parameter. Thus heat and work are not forms of energy, as is often inaccurately stated, but forms of energy transfer. They are related to currents, and cannot serve as state parameters.

 

 

 

Thermodynamic potentials

 

The so-called intensive parameters or potentials (3.9), like temperature and pressure, can now be introduced as partial derivatives of either energy or entropy. In both cases the derivatives are taken with respect to the extensive parameters. In these definitions, the temperature T has an exceptional role, because it is defined in two equivalent ways:[7]

 

either T=∂U/∂S or 1/T=∂S/∂U

 

Other potentials are defined according to the following alternatives. Let a potential Y or F correspond to the extensive parameter X (X is neither energy nor entropy), then:

 

 either Y=∂U/∂X or F=–Y/T=∂S/∂X

 

The first alternative is called the energy representation, and is older than the second alternative, the entropy representation, which has advantages for the study of currents (5.6). With this definition of the intensive parameters, the First Law of thermodynamics reads:

 

either dU=TdS+ΣYdX or dS=(1/T)dU+ΣFdX
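
As a worked instance, consider a simple fluid whose only extensive parameter besides energy and entropy is the volume, X = V. The choice of this example and the sign convention Y = –p, with p the pressure, are assumptions made here for illustration. The two representations then read:

```latex
\begin{align}
dU &= T\,dS - p\,dV, &
T &= \frac{\partial U}{\partial S}, &
-p &= \frac{\partial U}{\partial V}, \\
dS &= \frac{1}{T}\,dU + \frac{p}{T}\,dV, &
\frac{1}{T} &= \frac{\partial S}{\partial U}, &
\frac{p}{T} &= \frac{\partial S}{\partial V}.
\end{align}
```

The second line follows from the first by solving for dS, which shows that the entropy-representation potential conjugate to the volume is F = –Y/T = p/T.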

 

The values of the intensive parameters determine the direction of the interaction. For example, if the temperature of a body A is higher than that of a body B, heat will flow from A  to B. If all corresponding intensive parameters are equal for the two interacting systems, A and B are in thermodynamic equilibrium with each other.

 

In thermostatics (a branch of thermodynamics) the intensive parameters are defined with respect to equilibrium states. In this context it makes no sense to attribute a temperature value to a body which is not in thermal equilibrium. However, if we consider this system as having parts – i.e., as consisting of a large number of small interacting sub-systems, each being near an equilibrium state – the temperature can be said to have different values at different positions within the body. In this case one speaks of a temperature field. This is why the intensive parameters are also called potentials. The spatial gradient of a potential has the character of a force, driving a current. Thus, every extensive parameter determining the state of a system is related to a generalized force and a generalized current.[8] But this is only the case as far as the extensive parameters are related to energy.

 

 

 

State space

 

If the state of a system is determined by n independent extensive parameters, it can be represented by a point in an n-dimensional state-space, in which the extensive parameters form the coordinates. The coordinate system in this space is not unique. One or more extensive parameters can be replaced by intensive parameters (which is useful because intensive parameters are often easier to measure), or by other extensive parameters such as the free energy or the (free) enthalpy, if the so-called Legendre transformations are employed. The latter parameters are often useful for the discussion of situations with specific boundary conditions, such as the equilibrium state of a system that is not isolated but kept at constant temperature or pressure.[9] The number of dimensions of state-space remains the same in these transformations.
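
A Legendre transformation can be sketched for the simple fluid with state variables S and V. The symbols A for the free energy and H for the enthalpy are the usual textbook ones and are an assumption here; A is used instead of F to avoid confusion with the entropy-representation potentials.

```latex
\begin{align}
A &\equiv U - TS, & dA &= dU - T\,dS - S\,dT = -S\,dT - p\,dV, \\
H &\equiv U + pV, & dH &= dU + p\,dV + V\,dp = T\,dS + V\,dp.
\end{align}
```

In each case one extensive coordinate (S or V) is exchanged for its conjugate intensive parameter (T or p), so the state is still fixed by two independent variables. At constant temperature and volume dA vanishes in equilibrium, which is why such functions suit systems that are not isolated.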

 

By introducing an additional typical law this number can be reduced. One parameter may be eliminated by introducing an (n–1)-dimensional manifold in state space with the help of a so-called equation of state which relates an extensive parameter to its corresponding intensive parameter. Such an equation depends on the temperature and the typical interaction of the molecules composing the system under study. Thus the ideal gas law relates pressure and volume assuming that the molecules in the gas have no extension and do not interact with each other. The derivation of the specific heat of the gas depends on whether the molecules consist of one, two, or more atoms. Curie’s law for a magnetic gas (relating the magnetization and the magnetic field strength) is derived assuming that its molecules have non-interacting magnetic moments.

 

These are clearly limiting (‘ideal’) cases, but they are still dependent on assumptions concerning the typical structures and (lack of) typical interactions. Accounting for the extension and the interaction of the molecules in a simplified way, one arrives at the Van der Waals equation of state, which even accounts qualitatively for the condensation of a gas to a fluid. Yet the development of the equation of state is not a purely modal matter, in contrast to the framework of thermodynamics as outlined above.
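
The difference between the ideal gas law and the Van der Waals equation can be made concrete with a short numerical sketch. The function names are hypothetical, and the constants a and b are approximate literature values for carbon dioxide, used here only as an illustration.

```python
R = 8.314462618  # molar gas constant in J/(mol K)


def ideal_gas_pressure(v_molar, temperature):
    """Pressure (Pa) of an ideal gas: point molecules, no interaction."""
    return R * temperature / v_molar


def van_der_waals_pressure(v_molar, temperature, a, b):
    """Pressure (Pa) from the Van der Waals equation of state.

    b (m^3/mol) accounts for the extension of the molecules,
    a (Pa m^6/mol^2) for their mutual attraction.
    """
    if v_molar <= b:
        raise ValueError("molar volume must exceed the excluded volume b")
    return R * temperature / (v_molar - b) - a / v_molar**2


# Approximate constants for carbon dioxide (illustrative assumption).
A_CO2 = 0.364     # Pa m^6 / mol^2
B_CO2 = 4.267e-5  # m^3 / mol

p_ideal = ideal_gas_pressure(1.0e-3, 300.0)
p_vdw = van_der_waals_pressure(1.0e-3, 300.0, A_CO2, B_CO2)
print(p_ideal, p_vdw)
```

For a = b = 0 the second function reduces to the first, and at large molar volume the two pressures converge. In this sense the ideal gas is the limiting case, while the values of a and b depend on the typical structure of the molecules.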

 

 

 

5.3. Conservation laws

 

 

 

The extensive parameters as applied in thermal physics mainly describe the internal state of a system. This section will discuss the relevance of energy for the external state concerning the system’s spatial position and motion. Originally, energy as kinetic and potential energy was only recognized with respect to this external state. The internal state was described by a single variable (mass) for a subject whose extension could be neglected, or by a tensor (moment of inertia) for extended, rigid bodies. The relation between mass and energy was not recognized until 1905.

 

Mechanics is mainly concerned with the relative motion of material subjects and the simplest interaction, therefore, is one of collision. Mechanists like René Descartes even believed that collisions were the only admissible kind of interaction (T&E, 3.4). One speaks of a collision between two subjects if the interaction can be assumed to be of short duration, and if one’s attention is directed to the consequences of the interaction for the relative motion of the system. Except for this interaction the two colliding systems may be considered to be isolated, and therefore their total motion, as objectified by the motion of their centre of mass, is uniform before and after the collision, and is not influenced by the interaction.

 

A collision is called elastic if the internal state of both systems is not essentially changed by the interaction. It is called inelastic if the state of motion as well as the internal state of at least one system is changed in a physical sense. This internal change itself cannot be described by mechanics, unless it is assumed (as was done in classical physics) that it can always be explained by collisions between particles composing the system. Macroscopically, the concept of an elastic collision is an abstraction. Even the collision of two billiard balls is partly inelastic. A collision between two molecules is usually called elastic if the collision energy is less than the energy of the lowest excited states of the molecules, but even then the wave packets of the two molecules are reduced (6.3) such that their internal state changes. Nevertheless, the concept of an elastic collision is very useful in studying the changing kinetic state of interacting systems.

 

The external motion of the two interacting systems, considered as a whole and objectified by the motion of the centre of mass, can be described entirely in kinetic terms. This is impossible with respect to the relative motion of the two colliding systems. Their motion must be described in terms of kinetic energy and linear momentum.[10] In the 17th and 18th centuries people quarrelled about the priority of one over the other (T&E, 3.6).[11] In elastic collisions, neither the total kinetic energy nor the total momentum of the two colliding systems is influenced by the interaction. The gain in energy of one system equals the loss of energy of the other, and the same applies to the momentum. This is no longer the case in inelastic collisions. Whereas the motion of the centre of mass and the total momentum are still uninfluenced, the interaction changes the total kinetic energy.
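
The contrast between elastic and inelastic collisions can be checked with a minimal one-dimensional sketch. The coefficient of restitution used to interpolate between the two cases, the function names and the numbers are illustrative assumptions that do not appear in the text.

```python
def collide_1d(m1, v1, m2, v2, restitution):
    """Return the velocities after a one-dimensional collision.

    restitution = 1 gives an elastic collision, restitution = 0 a
    perfectly inelastic one (the bodies move on together).
    """
    # The velocity of the centre of mass is not changed by the interaction.
    v_cm = (m1 * v1 + m2 * v2) / (m1 + m2)
    # Relative to the centre of mass, each velocity is reversed and
    # scaled by the coefficient of restitution.
    u1 = v_cm - restitution * (v1 - v_cm)
    u2 = v_cm - restitution * (v2 - v_cm)
    return u1, u2


def momentum(m1, v1, m2, v2):
    return m1 * v1 + m2 * v2


def kinetic_energy(m1, v1, m2, v2):
    return 0.5 * m1 * v1**2 + 0.5 * m2 * v2**2


m1, v1, m2, v2 = 2.0, 3.0, 1.0, 0.0
elastic = collide_1d(m1, v1, m2, v2, restitution=1.0)
inelastic = collide_1d(m1, v1, m2, v2, restitution=0.0)
print(elastic, kinetic_energy(m1, elastic[0], m2, elastic[1]))
print(inelastic, kinetic_energy(m1, inelastic[0], m2, inelastic[1]))
```

In both cases the total momentum keeps its initial value. Only in the elastic case does the total kinetic energy do so as well. In the inelastic case the difference has left the external motion, and mechanics alone cannot say where it went.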

 

In the 19th century it was discovered that, in this case, kinetic energy is converted into another form of energy (e.g., by heating of the colliding systems). It became clear that the total energy (including the internal energy) rather than the external (mechanical) energy must be a constant for a system as a whole.[12] Essentially, this is the content of the First Law of thermal physics. It states that the internal energy can only be changed by an external supply of energy – heat or work. Hence, this law does not say anything about the total energy, but concerns itself with energy differences. It says something about the possible increase or decrease of energy and not about its total value. In the special theory of relativity Einstein showed that the mass of a body is proportional to its internal energy (the famous, but often misunderstood relation E = mc²)[13] if both are determined with respect to a reference system in which the body rests.

 

 

 

Constants of the motion

 

After the unification of these three concepts – mass, internal (thermal) energy, and external (mechanical) energy – it became possible to achieve a clear understanding of the meaning of the conservation laws. The constancy of energy and linear and angular momentum of an isolated physical system depends on the isotropy and homogeneity of numerical time and space, which are tacitly assumed.  By isotropy is meant that there is no preferred direction in space, and by homogeneity is meant that there are no preferred instants of time or spatial positions. Only time and spatial differences count – not some absolute time or position parameter.

 

It can be shown that the symmetry properties of Euclidean space allow ten constants of the motion: energy; three components of linear momentum; three components of angular momentum; and three components describing the uniform motion of the centre of mass. Each is related to some group of transformations under which Euclidean space is invariant. In classical physics these are subgroups of the Galilean group, while in special relativity they are subgroups of the inhomogeneous Lorentz (Poincaré) group. Energy is related to the homogeneity of numerical time; linear momentum is related to the homogeneity of space; angular momentum is related to the isotropy of space[14]; and the centre of mass is related to uniform motion.

 

This implies that if we find other variables which are constants of the motion, they must be derivable from, identical with, or proportional to the ten constants of the motion (Emmy Noether’s theorem, 1918).[15] For example, a wave packet’s frequency and wave vector are proportional to the energy and momentum, respectively. Planck’s constant functions as the universal proportionality constant and therefore has a purely modal character (chapter 7). In fact, the dependence of the constancy of energy and momentum on the homogeneity of time and space is easier to prove in quantum physics than in classical physics (9.5). The relation of the conservation law of energy with the homogeneity of time also implies that this law is restricted if the isolation of the system has a limited duration. This restriction is expressed in Heisenberg’s relation, ΔEΔt>h (7.6).

 

In general, the ten mentioned conservation laws are mutually independent. It is only possible to relate them in special cases. For example, a particle with mass m has the energy E and momentum p related either by E = p²/2m or by E = √(m²c⁴ + p²c²), in classical physics or relativity theory, respectively. For an extended system consisting of point masses between which only central forces are acting, the conservation laws for angular momentum and for the motion of the centre of mass can be derived from the conservation law of linear momentum.
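
As a rough numerical illustration of the two relations just quoted, the following Python sketch evaluates both for a free particle. The function names and the sample values are chosen here for illustration only and do not come from the text. For momenta small compared with mc, the relativistic energy minus the rest energy mc² approaches the classical value p²/2m; for a massless particle the relativistic relation reduces to E = pc.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)


def classical_energy(p, m):
    """Kinetic energy E = p^2 / 2m of a free particle in classical physics."""
    return p * p / (2.0 * m)


def relativistic_energy(p, m):
    """Total energy E = sqrt(m^2 c^4 + p^2 c^2) in relativity theory."""
    return math.sqrt((m * C**2) ** 2 + (p * C) ** 2)


def relativistic_kinetic_energy(p, m):
    """Total energy minus rest energy m c^2.

    Written as p^2 c^2 / (E + m c^2), which is algebraically the same as
    E - m c^2 but avoids subtracting two nearly equal floating-point numbers.
    """
    rest = m * C**2
    pc = p * C
    return pc * pc / (math.sqrt(rest * rest + pc * pc) + rest)


# Illustrative values (assumed): a 1 kg body with momentum 1 kg m/s.
slow_classical = classical_energy(1.0, 1.0)
slow_relativistic = relativistic_kinetic_energy(1.0, 1.0)
```

For the slow body the two kinetic energies agree to within floating-point accuracy, which is one way of seeing why the classical relation remains adequate at everyday speeds.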

 

The conservation law of energy has three aspects: conservation of a numerical amount of interaction; the possibility of transfer of energy from one system to another; and the conversion of one kind of energy into another one (T&E, 7.3). There are many different kinds of energy, modal (internal, gravitational, kinetic), and typical (electric, magnetic, nuclear). However, they do not stand in isolation. They can be transformed into each other if the two subjects interact such that heat or work is exchanged, or potential energy is transformed into kinetic energy. For all these different interactions, the universal modal concept of energy allows of comparing the different kinds of energy with each other, and therefore gives a general objective description of them. This gives energy, as the fundamental numerical projection of physical interaction, its status of key-concept in physics.

 

 

 

5.4. Force

 

 

 

Force turns out to be a physical concept referring back to the spatial relation frame and being entirely of a modal character. It will hardly be necessary to show that force is an expression of a physical subject-subject relation. Newton’s third law of action and reaction is usually understood in this sense: the force exerted by a physical subject A on a physical subject B equals the force exerted by B on A, but the two forces act in opposite directions. This also applies to thermodynamic generalized forces. Besides forces between spatially remote bodies, we also find forces between the parts of extended bodies (e.g., elasticity).

 

The static character of force implies that different forces, applied to the same physical subject, can balance each other. This is also possible if the mutually compensating forces are of a different typical nature. For example, an electric force exerted on a charged body can be balanced by the latter’s weight. An electric voltage across a metallic wire can be compensated by a temperature difference, preventing an electric current from flowing, which would otherwise be caused by the potential difference (this is the so-called thermo-electric effect). This property of balancing forces of different typical character allows of measuring them. At the same time it demonstrates the modal, general character of force.[16] Forces must be added in the same way as spatial vectors. This property depends on the independence of forces acting simultaneously on the same subject and is related to the independence of the spatial dimensions.[17]

 

The three projections of physical interaction, energy, force, and current, are related to each other. The relation of force and energy can be seen in two ways, namely, via the concepts of work (this section) and of potential energy (5.5). If a system on which a force is exerted is displaced in the direction of the force, the latter is said to perform work on that system, which therefore gains energy – e.g., the velocity of the system may increase, because its kinetic energy increases. In mechanics this is expressed in Newton’s second law of motion. This is only a particular example of the relation between force and energy. It has no application in thermal physics. Therefore it is unwarranted to define forces with the help of this law[18] (this use explains why one speaks of ‘generalized’ forces in thermal physics), although as an operational definition it helps to define the metric and the unit for force.

 

The concept of force as related to accelerated motion became the cornerstone of Isaac Newton’s dynamic theory (T&E, 3.6), and ‘… rose almost to the status of an almighty potentate of totalitarian rule over the phenomena …’[19] in its interpretation along the lines of Roger Boscovich,[20] and Immanuel Kant. In the 19th century people like Ernst Mach[21], Gustav Kirchhoff and Heinrich Hertz[22], who realized the relational character of force, tried to reduce this concept to accelerated motion, whether or not mass was a primitive irreducible concept. They rightly reacted against the many attempts to explain the concept of force (especially gravitational and electromagnetic forces) by appealing to some concealed mechanical action of the ether on moving bodies. In these efforts forces were often treated as substances.[23] These positivist authors may be granted that force is not an irreducible relation, such as space, motion, or interaction. But this spatial projection of interaction cannot be reduced to a kinematic relation.

 

The identification of force with the product of mass and acceleration is objectionable not only because of static effects, but also because force can be specified (electric force, magnetic force, etc.). Neither acceleration nor mass can be specified in this way. It would also be quite meaningless to introduce the concept of fictitious forces in accelerated frames of reference (4.8), if it were not possible to identify real forces as physical subject-subject relations. Therefore, Newton’s second law is an equation and not an identification. It cannot serve as a definition of mass or as a definition of force. In the heyday of classical mechanics, when only the functional, modal character of force was considered, this could be overlooked. But nowadays we are more aware of the mutual irreducibility of special forces and therefore of the asymmetry in the equation F=ma.[24]

 

 

 

5.5. Fields

 

 

 

Another way of relating force and energy conceives of a force as the spatial gradient of potential energy. A force describes the static interaction between two or more spatially remote physical subjects, or within a spatially extended subject. Therefore in many (but not all) cases the concept of a force can be substituted by that of a field. Instead of the force exerted by a subject A on another subject B, we may consider A as the source of a field in which B is situated (and conversely). The concept of a field was introduced by Michael Faraday, William Thomson, and James Clerk Maxwell for electric and magnetic interactions, in order to replace action at a distance by contiguous interaction (T&E, 4.6).[25] A static field enables us to determine the force that the source of the field would exert on another body (a ‘test body’, small enough not to change the field) if it were present at some spatial position relative to the source.

 

A test body has a potential energy in the field. It feels a force equal to the negative spatial gradient of the potential energy. If the test body moves from one spatial position to another, its gain in potential energy is equal to its loss in kinetic energy (at least in the simple case of a conservative field in which there is no irreversible energy dissipation). Hence for static situations, a field describes a possible (potential) interaction, and a force describes an actual one. A field is a spatial concept, anticipating the kinetic and physical modal aspects, whereas force is a physical concept, referring back to the spatial aspect.
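
The relation between force, potential energy and kinetic energy in a conservative field can be checked numerically. The sketch below is only an illustration under stated assumptions: a one-dimensional quadratic potential U(x) = ½kx² (a hypothetical choice), the usual sign convention F = −dU/dx, and a velocity-Verlet integration step. All names and parameter values are invented for this example.

```python
def potential(x, k=2.0):
    """Assumed potential energy U(x) = k x^2 / 2 of the test body."""
    return 0.5 * k * x * x


def force(x, h=1e-6):
    """Force as minus the spatial gradient of U, by a central difference."""
    return -(potential(x + h) - potential(x - h)) / (2.0 * h)


def simulate(x0, v0, m=1.0, dt=1e-3, steps=5000):
    """Move the test body with velocity Verlet.

    Returns a list of (potential energy, kinetic energy) pairs, one per step.
    """
    x, v = x0, v0
    f = force(x)
    history = []
    for _ in range(steps):
        v_half = v + 0.5 * dt * f / m   # half step in velocity
        x = x + dt * v_half             # full step in position
        f = force(x)                    # force at the new position
        v = v_half + 0.5 * dt * f / m   # second half step in velocity
        history.append((potential(x), 0.5 * m * v * v))
    return history


history = simulate(x0=1.0, v0=0.0)
```

Along the whole trajectory the sum of the two entries in each pair stays, to within the integration error, equal to the initial potential energy: whatever the body gains in one form it loses in the other.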

 

A field becomes actual only when related to currents, e.g., electromagnetic waves. The description of the electromagnetic field by Maxwell’s equations allowed him to give an electromagnetic interpretation of light waves (T&E, 6.4). This showed the possibility of interaction (the exchange of electromagnetic energy) which could not be described in terms of Newton’s third law. The meaning of this law is also restricted in the theory of relativity, which is concerned with the relative motion of subjects, whereas forces denote static interaction. Therefore, the concept of force will certainly be relativized when the relative motion of interacting subjects is taken into account.[26]

 

However, this does not imply a loss of meaning, but rather indicates a deepening of meaning. For example, the electric interaction, which is expressed by Coulomb’s law in static cases and involves only static electric forces, becomes electromagnetic interaction as described in Maxwell’s equations, which include magnetic fields. This is the first example of an opened up force. Although magnetic forces may have many characteristics of forces (e.g., they can be balanced by other forces), they lack others (they cannot be considered unequivocally as the spatial gradient of a potential energy). In relativity theory the concepts of electric and magnetic force are united into the concept of an electromagnetic tensor. By a change of kinematic reference frame (a Lorentz transformation) the electric field is transformed into a magnetic one, and vice versa.

 

Friction is a velocity-dependent force which cannot be considered as the spatial gradient of a potential energy, and cannot be reduced to a field. Friction arises when two systems in contact move with respect to each other, or would do so in the absence of friction. It is subjected to the physical order of irreversibility because it invariably leads to a loss of kinetic energy, which is transformed into thermal energy. But friction has the character of a force, in so far as it can be balanced by other forces. It must be taken into account in the application of Newton’s second law of motion. Friction allows interacting systems to move uniformly in situations where they would accelerate in its absence. Also mechanical equilibrium is only possible because of friction. This applies not only to the motion of falling bodies in the earth’s atmosphere, but also to all moderate currents in thermal physics. In fact, uniform motion influenced by friction is far more common than uniform motion in the absence of any interaction. The latter is an abstraction showing the mutual irreducibility of the kinetic and physical modal aspects, as was first realized by Galileo (5.1). His contemporary opponents, who defended the Aristotelian view that every motion needs a cause, could certainly point to a firm empirical basis. Galileo arrived at his view, implicitly recognizing friction as a force, because he wanted a consistent description of other phenomena.
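
How friction turns accelerated into uniform motion can be made concrete with a falling body. The following sketch assumes a friction force proportional to the velocity (linear drag, with a hypothetical coefficient b), which is a simplification and not a claim about real air resistance. The equation of motion is then m dv/dt = mg − bv, and the velocity should approach the terminal value mg/b at which weight and friction balance. Names and numbers are illustrative.

```python
def fall_with_drag(m=1.0, g=9.81, b=2.0, dt=0.01, steps=2000):
    """Integrate m dv/dt = m g - b v with explicit Euler steps.

    Returns the list of velocities after each step, starting from rest.
    """
    v = 0.0
    velocities = []
    for _ in range(steps):
        a = g - (b / m) * v   # net acceleration: weight minus friction
        v += a * dt
        velocities.append(v)
    return velocities


def terminal_velocity(m=1.0, g=9.81, b=2.0):
    """Velocity at which the friction force b v balances the weight m g."""
    return m * g / b


velocities = fall_with_drag()
```

After an initial phase of acceleration the computed velocity settles at the terminal value, so that the body moves uniformly although two forces act on it.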

 

However fruitful the concept of a field is, it has its modal limitations, and should not be absolutized. Consider two electrically charged subjects. In classical physics each is placed in the field of the other – i.e., each particle experiences the centrally symmetric field of the other. For a third (test) body, however, the field is that of a dipole which is entirely different from a centrally symmetric one. This problem can be evaded by stipulating that no particle feels its own field, but this can only be maintained for a static field. As soon as one allows the particles to move, one is confronted with the unsolvable problem (a problem both in classical and modern physics) that the particle will feel its own field because the velocity of electromagnetic interaction is not infinite. In quantum field theory the attempt to deduce the structure of the electron from the properties of the electromagnetic field leads to an infinite self-energy for the particle, which can only be eliminated by an unsatisfying trick.[27]

 

 

 

5.6. Current and entropy

 

 

 

The concept of current or flow is a modal physical concept which refers back to the kinetic relation frame. It is not just a thermodynamic concept because it also occurs in electromagnetic theory, in high energy physics, and in continuum mechanics. Generally speaking, current is a transfer of energy, caused by a generalized force. A heat current is caused by a temperature difference, an electric current by an electric potential difference, a molecular current by a gradient in the chemical potential, and a water current in a river by a gradient in the gravitational potential. Very often, the current has a uniform speed, which means that the driving force is balanced by some kind of friction or resistance, whose strength depends on the velocity of the current.

 

A current is not merely a displacement of energy. In that case one should also speak of a current if a free subject has a uniform motion. However, the latter needs no cause since there is no force or interaction involved. In a current the retrocipatory kinetic projection of interaction is involved. Work is also included in this general concept of current. Just as with energy and force, currents may be purely modal (work, heat flow, and currents caused by gravitation) or typical (some typical currents are mentioned above). The common, and therefore general, feature of currents is the reference from the modal physical aspect to the kinematic one. Current must be distinguished from accelerated motion, which anticipates the physical modal aspect.

 

In classical mechanics currents other than work are found only in continuum mechanics. The concept of current depends on the disclosed concept of force, i.e., on a field. The basic equation for the motion of a fluid depends on the conservation of matter which is assumed. This consideration gives the so-called continuum equation which is the starting point for all investigations in this theory. This law is also used in thermodynamics with respect to extensive parameters, and is a direct consequence of the property which gives them the name extensive.

 

Thus the equation of continuity is not applicable to entropy. As soon as there is some kind of friction or resistance, a current is accompanied by a creation of entropy. Entropy will not increase if there is no current. In the limiting case of the performance of pure mechanical work the increase of entropy is zero, but this is only realized if friction and resistance can be neglected.

 

One of the reasons why energy and force should be considered the most general retrocipations of physical interaction to the numerical and the spatial modal aspects, respectively, is that different forms of energy can be transformed into each other, and that different forces can balance each other. With currents this is more complicated. In the thermoelectric effect a heat current leads to an electric potential difference, and in the Peltier effect a temperature difference is caused by an electric current. Hence a certain current Ji can not only be caused by the corresponding generalized potential difference dFi, but also by other potential differences, dFj. If these gradients are not too large, the current is proportional to them. Calling the proportionality constant Lij, one finds that

 

J_i = Σ_j L_ij dF_j

 

                                                   

 

and similar expressions for other currents, Jj. Expressing the potentials in the so-called entropy-representation (5.2) yields Onsager’s relation:

 

L_ij = L_ji

 

for any pair of currents accompanying each other. Although the thermoelectric effect and similar effects were known for a long time, this general relationship was not discovered until shortly before the Second World War (T&E, 7.2), probably because physicists were used to working with the energy representation in which this relation does not show itself in a simple way.
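
The linear relation between currents and generalized potential differences, together with Onsager's symmetry, can be sketched for two coupled currents. The coefficient values below are invented for illustration and carry no physical units; the only point is that a symmetric coefficient matrix lets a potential difference of one kind drive a current of the other kind, and with the same coefficient in both directions.

```python
def currents(L, dF):
    """Each current J_i as the sum over j of L[i][j] * dF[j]."""
    return [sum(L[i][j] * dF[j] for j in range(len(dF))) for i in range(len(L))]


def is_onsager_symmetric(L, tol=1e-12):
    """Check the reciprocity L[i][j] == L[j][i] for every pair of indices."""
    n = len(L)
    return all(abs(L[i][j] - L[j][i]) <= tol for i in range(n) for j in range(n))


# Hypothetical coefficients: index 0 = electric current, index 1 = heat current.
L = [[4.0, 0.5],
     [0.5, 2.0]]

only_thermal = currents(L, [0.0, 1.0])    # thermal potential difference only
only_electric = currents(L, [1.0, 0.0])   # electric potential difference only
```

With only the thermal potential difference present, the first (electric) current is nevertheless non-zero, and its coefficient equals the one by which an electric potential difference alone drives the heat current.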

 

Currents also play an important role in equilibrium situations – which means that a current is not always caused by a force. For instance, in a container with a liquid and its vapour in equilibrium, the currents of vaporization and condensation are not zero, but equal each other, such that their total effect is zero. Because vaporization is determined mainly by the temperature in the container, and condensation is determined by the number of molecules per unit volume in the vapour, there is a strong relation between the temperature and the vapour pressure.

 

This dynamic description of equilibrium is also extremely fruitful in other parts of physics. For example, application of equilibrium considerations of this kind to electromagnetic radiation eventually allowed Planck to formulate the first quantum hypothesis (T&E, 6.5).[28]

 

The creation of entropy is invariably connected with currents. This relation, expressed in the Second Law of thermodynamics, has a purely modal character. But the concept of entropy cannot be grasped completely in a purely modal way. It has a strong relation to the concept of probability, and to the idea of a ground state and excited states of a physical system. Both have a physical character.

 

The mass of a system can be considered (since the acceptance of special relativity theory) as the modal expression of its internal energy. This internal energy is determined by the typical structure of the system, and, as such, by its internal interactions. The ground state of a system which is its lowest possible internal equilibrium state must be distinguished from its excited states, which have higher energy values, and therefore have higher masses. This was not appreciated before the 20th century because the energy differences in chemical excitations are relatively small and do not give rise to a measurable increase in the mass of chemical substances. Only in nuclear and subnuclear reactions does the mass of interacting systems change appreciably. This accounts for the approximate validity of the law of conservation of mass in most chemical reactions.
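
The difference in scale between chemical and nuclear excitations can be estimated from the equivalence of mass and energy, Δm = ΔE/c². The sketch below uses the SI values of the speed of light and the electronvolt and the CODATA 2018 value of the atomic mass unit; the two sample cases (an excitation of about 5 eV in a system of 18 atomic mass units, and about 8 MeV per atomic mass unit as an order of magnitude for nuclear binding) are illustrative assumptions, not data from the text.

```python
C = 299_792_458.0        # speed of light in m/s (exact SI value)
EV = 1.602176634e-19     # one electronvolt in joule (exact SI value)
U = 1.66053906660e-27    # atomic mass unit in kg (CODATA 2018)


def relative_mass_change(delta_e_ev, mass_u):
    """Fractional change of mass for an energy change given in eV.

    delta_e_ev: energy difference in electronvolt
    mass_u:     mass of the system in atomic mass units
    """
    return (delta_e_ev * EV) / (mass_u * U * C**2)


chemical = relative_mass_change(5.0, 18.0)    # assumed chemical excitation
nuclear = relative_mass_change(8.0e6, 1.0)    # assumed nuclear energy scale
```

On these assumptions the chemical case changes the mass by a few parts in ten thousand million, whereas the nuclear case changes it by nearly one per cent.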

 

The First and Second Laws of thermodynamics deal only with energy and entropy differences and not with total energy or entropy. There is also a Third Law, first formulated by Nernst in 1906, which states that at absolute zero of temperature any conceivable process would leave the entropy constant (at T=0, ∂S/∂X=0 for any extensive or intensive parameter X except energy and temperature). This means, e.g., that at T=0, the heat capacity of any system is zero, which is borne out by low-temperature experiments.

 

This can be interpreted in the following way. Given an isolated system at rest, the equilibrium state at T=0 is the ground state of the system, for which the entropy is arbitrarily set at zero. At higher temperatures the state of a system is an excited state, and corresponds to a positive entropy, which will increase with increasing temperature. Thus entropy is an extensive measure and temperature is an intensive measure of the amount of excitation of a system. Two systems in thermal contact exchange energy until their rates of excitation, as expressed by their temperatures, are equal. In this way, the concepts of entropy and temperature also have significance for microsystems like atoms and molecules.

 

At first sight, there is no relation between the concept of a current and the idea of a ground state and excited states. But the latter idea has meaning only if applied to a large number of interacting systems (like the molecules in a gas), such that there is a free exchange of energy between the systems concerned. The distribution of energy over the various excited states comes about in a dynamic state of equilibrium, quite similar to that between a liquid and its vapour.

 



[1] Redlich 1968 defines an ‘object’ (i.e., a physical subject) as anything that can be isolated, and an ‘isolated object’ as one whose properties remain unchanged whatever changes happen in its environment.

[2] Bunge 1959a, 125-134.

[3] This distinction of external and internal differs from another one common in mechanics. The forces between parts of a system are considered ‘internal’, whereas forces from outside the system are called ‘external’. In this case the reaction of the system on its environment is not taken into consideration. Cf. Suppes 1957, 294-298, Maxwell 1877, 2.

[4] Redlich 1968; Bunge 1967c; Noll 1974.

[5] Giles 1964, 17.

[6] Callen 1960, 44.

[7] The partial derivative ∂U/∂S means that other variables are kept constant.

[8] These are called ‘generalized’ because the concepts of force and current are originally defined in mechanics.

[9] Morse 1964, 96; Goldstein 1959, 215-216.

[10] Only in special cases energy and linear momentum are sufficient. In general, angular momentum, an independent conserved property, should also be taken into account, as was first proved by Euler; see Truesdell 1968, 239-243, 260.

[11] Jammer 1957, 165; Mach 1883, 310-314, 360-365; Scott 1970. Cartesians considered momentum or quantity of motion as most important, both in elastic and non-elastic collisions. Leibniz and his followers assumed that atoms were elastic, and hence that vis viva (mv²) was conserved in atomic collisions. Newtonians assumed that atoms were perfectly hard, such that in atomic collisions vis viva was lost. They derived the conservation laws from Newton’s third law.

[12] Helmholtz 1847; Elkana 1970, 1974.

[13] The formula means that mass and energy are equivalent, that each amount of energy corresponds with an amount of mass and conversely. It does not mean that mass is a form of energy, or can be converted into energy.

[14] The fact that the conservation law of angular momentum refers back to the spatial isotropy implies that it is not necessarily related to rotational motion, as was assumed in classical mechanics. The latter view created difficulties in understanding the spin of electrons and other elementary particles (9.6).

[15] Jammer 1954, 198; Bunge 1967a, 49. In classical physics mass (the eleventh variable) is also a conserved quantity.

[16] The necessity of introducing a modal concept of force was better understood by Mayer and Helmholtz than by Hertz and Mach, see Whiteman 1967, 398.

[17] Mach 1883, 44ff, 242-243.

[18] Nagel 1961, 185ff; Poincaré 1906, Chapter 6.

[19] Jammer 1957, 241.

[20] Boscovich was the first to realize that the spatial extension of a physical subject is determined by repelling forces; cf. Jammer 1957, 171ff; Agassi 1971, 80ff; Berkson 1974, 25-28; Hesse 1961, 163-166.

[21] Mach 1883, 302-304.

[22] Hertz 1894.

[23] Jammer 1957, 224; Suppes 1957, 172, 297, 298.

[24] In Newton’s formulation of his second law, force is related to a change in linear momentum. One may also relate torque to a change in angular momentum, but this requires an independent law: angular momentum refers to the isotropy of space while linear momentum refers to the homogeneity of space. Torque, as well as force, is therefore a fundamental spatial retrocipation in the physical modal aspect.

[25] Agassi 1971; Berkson 1974. In fluid mechanics, d’Alembert introduced the concept of a velocity field, see Truesdell 1968, 122. Fields were also used to express the forces between the parts of a continuous extended body, e.g., elasticity. The concept of action at a distance, reluctantly introduced by Newton, and criticized by Huygens and Leibniz, is in fact alien to the driving motive of Cartesian physics, which only allowed contiguous interaction between unchanging material particles (T&E, 3.4). Kelvin, Maxwell, and many of their contemporaries tried to save this idea with the help of a mechanical ether. In Sec. 4.4 we saw that the idea of the ether is now abandoned. A substantial substratum for fields is no longer considered necessary.

[26] Jammer 1957, 254ff.

[27] Weisskopf 1972, 96-128.

[28] Jammer 1966, chapter 1.

 

 

Part I, chapter 6

 

 

 

Irreversibility

 

 

 

6.1. The direction of time

 

 

 

In every post-physical modal aspect relations of cause and effect are found which always refer back to the physical modal aspect. The physical cause-effect relation will be discussed in section 6.7. For the present it is sufficient to observe that this relation is subjected to the law that no effect can precede its cause. This universal law is the physical time order of irreversibility. I shall argue that (a) irreversibility is irreducible to the already discussed temporal orders of before and after, simultaneity, and kinetic flow, (b) irreversibility is a universal, modal law, not reducible to laws concerning typical interactions, such as probability laws, and (c) as a law irreversibility is correlated to the physical subject-subject relation of interaction.

 

The asymmetry of time does not occur in the first three modal aspects, as long as they are not disclosed by the physical modal aspect. The numerical order of before and after, the spatial order of simultaneity, and the kinetic flow of time are symmetrical. For instance, a purely kinetic movement is reversible in time. Reversal of the sign of the time parameter in the mathematical description of the state of motion of a subject yields again a possible motion subjected to the same law. But if a concrete moving subject is considered, and the physical aspect is not neglected, friction must be taken into account. Friction is always present and a motion with friction is not reversible. Due to friction, every changing system will eventually reach a state of relative equilibrium.

 

I shall distinguish internal (thermal) and external (mechanical) states of equilibrium. The latter depend on friction. An example would be a ladder resting against a house. The friction between the house and the top of the ladder and that between the ground and the bottom of the ladder supply the necessary forces and torques which maintain the ladder in a state of equilibrium, as long as it does not slip. Internal states of equilibrium, according to thermodynamics, can be characterized by a parameter called entropy. Irreversibility, as the physical time order, is expressed by the Second Law of thermodynamics. If two systems, which are both initially in internal equilibrium, interact with each other, then the increase in entropy of one system added to the increase of the entropy of the other system is larger than or equal to zero. This formulation is more correct than the often heard expression ‘the entropy of a closed system cannot decrease’, because the entropy of a system which is not in equilibrium is not well defined in thermodynamics. Besides, our formulation makes explicit mention of the correlation of irreversibility and interaction.
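
The formulation of the Second Law given here, in terms of two systems that start in internal equilibrium and then interact, can be checked on the simplest case. The sketch assumes two bodies with constant heat capacities c1 and c2 (an idealization introduced for this example) that exchange heat only with each other until they reach a common temperature. For each body the entropy change between its initial and final equilibrium states is then c ln(T_final / T_initial). The function names and values are illustrative.

```python
import math


def final_temperature(c1, t1, c2, t2):
    """Common temperature reached when no energy leaves the pair."""
    return (c1 * t1 + c2 * t2) / (c1 + c2)


def entropy_changes(c1, t1, c2, t2):
    """Entropy change of each body between initial and final equilibrium.

    Assumes constant heat capacities; temperatures are absolute (kelvin).
    """
    tf = final_temperature(c1, t1, c2, t2)
    ds1 = c1 * math.log(tf / t1)
    ds2 = c2 * math.log(tf / t2)
    return ds1, ds2


# Illustrative values: equal heat capacities, 300 K and 400 K.
ds_cold, ds_hot = entropy_changes(1.0, 300.0, 1.0, 400.0)
total = ds_cold + ds_hot
```

The entropy of the initially warmer body decreases, that of the colder body increases by a larger amount, and the sum is positive. It vanishes only if the two temperatures were equal from the start, in which case nothing happens.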

 

The irreversibility of physical time is not merely an addition to the numerical time order of before and after. Whenever there are several interacting systems, one does not have causal chains with a serial order, but causal networks with at best a quasi-serial order (3.1).[1] According to Hans Reichenbach this means that if a direction is assigned to one causal chain, which connects two systems in a causal network, a direction is determined for each causal chain in the network. This idea of a network presupposes both the numerical order of before and after and the spatial order of simultaneity. Furthermore, the equilibrium state, the final state in any interaction, has a spatial character because it is characterized by a spatial uniformity of some intensive parameter such as temperature or pressure. Finally the increase of entropy, the irreversible process in which the physical system approaches equilibrium, reflects a relation between the physical and the kinetic relation frames.

 

 

 

6.2. The asymmetry of physical time cannot be reduced to probability

 

 

 

The irreducibility of the physical order of irreversibility to kinetic motion has not gone unchallenged. A basic motive of classical physics was the reduction of all physical phenomena to reversible motions of particles in a field of force. Physicists attempted, for example, to reduce thermodynamics to statistical physics by explaining the macroscopic laws of the former as the net result of the motions and interactions of molecules, which were assumed to be reversible.[2] Especially Ludwig Boltzmann is considered to have succeeded in deducing the irreversibility stated in the Second Law from a probability calculation on the motion of the molecules composing a gas (T&E, 7.4, 7.5). In the kinetic theory of gases, Boltzmann showed that the thermodynamic concept of entropy is related to the amount of disorder in the system, and he demonstrated an ordered system to be much less probable than a disordered one. He stated that any closed system will develop from a less to a more probable state, and thought that this explained time asymmetry as a macroscopic statistical phenomenon.

 

Chapter 8 will offer arguments to support the view that probability concerns the relation between typical laws and individual subjects, and therefore cannot serve as the basis for a modal universal law. At present it may be observed that (a) the mathematical concept of probability does not involve time asymmetry; (b) the realization of a mathematical possibility requires some irreversible physical interaction; (c) therefore, statistical physics has to introduce irreversibility as an independent category in probability theory; (d) this is only possible if irreversibility is correlated to physical interaction; consequently, (e) the alleged reduction of irreversibility to statistical laws is a prejudice.

 

Boltzmann’s derivation of irreversibility, in fact, presupposes temporal asymmetry between the initial and final states.[3] Or rather, it shifts the problem to the question: Why should probability increase in time? Calling this self-evident would be begging the question, because then the asymmetry of time itself would be self-evident. It is not: the asymmetry of time is an irreducible mode of experience, empirically discovered.

 

Consider a closed system, in internal equilibrium. The entropy is not completely constant, but exhibits spontaneous fluctuations (for instance, Brownian motion). Because of the individual behaviour of the composing particles of the system the latter can only be said to be near equilibrium. Considering a system during such a fluctuation, one can deduce that after a while the entropy will most probably be larger. But one can also show that the entropy (with the same probability) will have been larger some time before. Consequently, it is impossible to deduce time asymmetry from the increasing entropy of a closed system alone.[4]
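
The symmetry of this argument can be made explicit with a standard toy model that the text itself does not mention, the Ehrenfest urn model: n balls are distributed over two urns, and at each step one ball chosen at random changes urn. The state k is the number of balls in the first urn; states near k = n/2 correspond to many microstates and thus to a large entropy. The sketch computes, with exact fractions, the probability of the next state and, for the equilibrium ensemble, the probability of the previous state. All names are chosen for this example.

```python
from fractions import Fraction
from math import comb


def forward(k, j, n):
    """P(next state = j | present state = k)."""
    if j == k - 1:
        return Fraction(k, n)        # a ball leaves the first urn
    if j == k + 1:
        return Fraction(n - k, n)    # a ball enters the first urn
    return Fraction(0)


def stationary(k, n):
    """Equilibrium probability of state k: binomial(n, k) / 2^n."""
    return Fraction(comb(n, k), 2 ** n)


def backward(k, j, n):
    """P(previous state = j | present state = k) in equilibrium (Bayes' rule)."""
    if not 0 <= j <= n:
        return Fraction(0)
    return stationary(j, n) * forward(j, k, n) / stationary(k, n)


# A fluctuation away from equilibrium: 8 of 10 balls in the first urn.
towards_equilibrium_next = forward(8, 7, 10)
towards_equilibrium_before = backward(8, 7, 10)
```

Both probabilities are equal: seen from the fluctuation, the state of larger entropy is just as likely to lie in the past as in the future, so the statistics alone single out no direction of time.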

 

To meet this objection, Reichenbach developed a theory of branch systems.[5] If a system is branched off from the universe (that is, if it is isolated physically) its entropy most probably will increase or remain constant. According to Reichenbach this determines the direction of time as that direction in which most thermodynamic processes occur. This is an improvement in so far as it makes time asymmetry less dependent on the typical properties of the particular systems we study, but still time asymmetry is introduced beforehand in two ways.

 

First, Reichenbach only considers those systems which branch off and their subsequent development. Secondly, Reichenbach and Boltzmann can prove only that some macrostates of the system are more probable than others. They do not prove that the states of a system are ordered in time according to a monotonically increasing or decreasing probability, such that the direction of physical time can be defined as that of increasing probability. This is a separate statement, not implied in the mathematical concept of probability.[6] As a separate law it is yet another expression of the asymmetry of time.

 

In statistical calculations one also has to correlate this asymmetry with physical interactions, as can be verified in modern treatments of thermal physics. For instance, the concepts of entropy and temperature may be introduced with the help of a simple system, like a linear chain of spins.[7] For the sake of calculating the entropy etc., one assumes that the spins do not interact with each other. Then it is possible to calculate which macrostates are more probable than others. But, in order to show that a particular state will change such that its probability increases, one has to assume that the spins do interact.  Thus the temporal irreversibility cannot be obtained without explicit reference to physical interactions. Therefore, the conclusion is warranted that, even in statistical physics, irreversibility is an independent category, correlated to physical interaction. Irreversibility is an irreducible law, and therefore need not be introduced into physics only via probability laws.
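
The counting step of such a treatment can be sketched as follows, under the assumptions that the chain consists of n non-interacting spins, each either up or down, and that all configurations are equally probable (no external field). A macrostate is labelled by the number of up spins; its multiplicity is a binomial coefficient and its entropy, in units of Boltzmann's constant, the logarithm of that number. The function names are invented for this example.

```python
import math


def multiplicity(n_spins, n_up):
    """Number of spin configurations with exactly n_up spins pointing up."""
    return math.comb(n_spins, n_up)


def entropy_over_k(n_spins, n_up):
    """Entropy of the macrostate divided by Boltzmann's constant: ln W."""
    return math.log(multiplicity(n_spins, n_up))


def macrostate_probability(n_spins, n_up):
    """Probability of the macrostate if all configurations are equally likely."""
    return multiplicity(n_spins, n_up) / 2 ** n_spins


# Illustrative chain of 10 spins: find the most probable macrostate.
most_probable = max(range(11), key=lambda n_up: multiplicity(10, n_up))
```

The calculation shows which macrostates are more probable than others, with the evenly divided chain at the top, but it contains no time variable at all. Nothing in it says that a chain found in an improbable macrostate will move to a more probable one; for that, as the paragraph above argues, the spins must be allowed to interact.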

 

It is a widespread misunderstanding that irreversibility is necessarily connected to probability laws. Sometimes dynamical laws are distinguished from statistical laws by two criteria: (a) dynamical laws are deterministic and operate with absolute certainty, whereas statistical laws are only capable of establishing probabilities (they hold for a great number of individuals and lose their meaning if applied to a small number of them); (b) dynamical laws describe reversible processes and statistical laws deal with irreversible phenomena.[8] However, the first criterion applies to the distinction of modal and typical laws while the second assumes the mutual irreducibility of the kinetic and physical relation frames. This reasoning overlooks the fact that any concrete process can be described statistically, and has both reversible and irreversible aspects. Thus the law describing the motion of a falling body is called dynamical. According to criterion (a) this implies determinism, which can only be maintained if the falling body is not too small, i.e., if Brownian motion and quantum effects can be neglected. And according to criterion (b) this implies reversibility, which could only be true if friction did not exist. On the other hand, heat conduction is supposed to be governed by a statistical law, although if one makes the same approximations as one did with falling bodies the law may be called deterministic. Heat conduction also involves some reversible aspects. For example, in homogeneous media the conductivity is independent of the direction of the current.

 

The background of this distinction is that every time a philosopher finds a reversible phenomenon he looks for a deterministic interpretation, whereas as soon as he finds irreversibility, he tries to explain it statistically. But this is just narrow-mindedness. The proper way to compare these laws is, either to abstract from concrete reality in order to study merely modal laws, or to study the typical individuality structure of the physically qualified subjects constituting a macroscopic body.

 

 

 

6.3. Irreversibility also applies to microprocesses

 

 

 

The assumption that the reduction of thermal physics to statistical physics implied the explanation of irreversibility on a macroscopic scale was based on the hypothesis that the interactions between molecules are completely reversible. Since the rise of quantum physics it is clear that irreversible processes also occur on a microscopic scale. This is very obvious with respect to the spontaneous processes which occur, for example, in radioactive nuclei or activated atoms and molecules, and which always involve the transition from a high energy level to a lower one. Albert Einstein argued that these spontaneous processes must be distinguished from stimulated ones, which are in a sense reversible.[9]

 

But even the motion of, e.g., an electron can no longer be considered completely reversible. Its relative place and motion are represented by a wave-packet, which is the sum of a number of infinitely extended waves with different wave lengths and amplitudes and mutual phase relations such that the total amplitude of the waves is appreciable only within the wave packet (chapter 7). Outside the packet the composing waves have a resultant zero amplitude. In an interaction such as the collision between the electron and an atom, the electron’s wave packet is reduced to a relatively small size, and after the collision this reduced wave packet will gradually extend.

 

This expansion is irreversible. From a kinematic point of view, it is quite conceivable that one could obtain a contracting wave packet by reversing the motions of all composing waves (preserving their phase relations), but physically no interaction can be designed for this construction. The production of a wave packet and its subsequent development (which already presupposes time asymmetry) need an explanation which cannot be given in kinematical terms only. They need a physical explanation, irreducible to a kinetic one. The reduction of the wave packet occurs in any interaction of microsystems, not only in interactions of a microsystem with a macrosystem, e.g., in a measuring process. The latter is often suggested by adherents of the Copenhagen interpretation of quantum mechanics.[10] In my opinion, the micro-macrosystem interaction characteristic of the measuring process is merely a special case of physical interaction.
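The kinematic side of this argument can be illustrated with the standard textbook expression for the width of a free Gaussian wave packet. The Gaussian shape, the initial width, and the function name are assumptions made for the sake of the sketch; they are not taken from the text. The expression is even in t, so the kinematics alone allows a contracting packet as readily as an expanding one.

```python
from math import sqrt

HBAR = 1.054571817e-34            # reduced Planck constant, J s
ELECTRON_MASS = 9.1093837015e-31  # kg

def packet_width(t: float, sigma0: float, mass: float = ELECTRON_MASS) -> float:
    """Position spread of a free Gaussian wave packet at time t (seconds).

    sigma0 is the minimal spread (metres), reached at t = 0.
    """
    return sigma0 * sqrt(1.0 + (HBAR * t / (2.0 * mass * sigma0 ** 2)) ** 2)

SIGMA0 = 1.0e-10  # illustrative initial width of atomic size (assumption)

for t in (0.0, 1.0e-16, 1.0e-15):
    print(t, packet_width(t, SIGMA0))

# The width is the same for t and -t: the formula itself has no preferred direction.
print(packet_width(-1.0e-15, SIGMA0) == packet_width(1.0e-15, SIGMA0))
```

The asymmetry discussed above lies in the fact that an interaction prepares the narrow packet at the start of the expansion, never at the end of a contraction.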

 

This is not contrary to the existence of the so-called principle of detailed balance in equilibrium,[11] which is closely related to the principle of reciprocity[12] or micro-reversibility.[13] No one doubts the validity of some principle of overall balance as a necessary (though not sufficient) condition for equilibrium, and its value as a guiding principle for research. But often it is interpreted uncritically as implying the time reversibility of any microprocess. This interpretation overlooks the fact that spontaneous processes (for example, Brownian motion) occur in a state of equilibrium. Also, interactions resulting in the expansion of wave packets can be compensated by new interactions without this process becoming reversible. Mutually compensating processes certainly do not need to be each other’s time reverse. They may be completely different processes, provided they compensate each other’s effects.[14] And even if the processes are reversed with respect to their typical structure, they are not necessarily each other’s temporal reverse. It suffices that they are equally probable.

 

In most cases the principle of detailed balance cannot be proved from first principles. In some cases (namely, those involving magnetic interactions) one can prove that time reversal is not assumed by the principle of detailed balance.[15] In other cases physicists claim that it is. For instance, in classical physics an elastic collision is said to be reversible in time. (In fact, this is only the case if the interacting force is spherically symmetric.) This means that, if one were to reverse the time parameter (meaning the reversal of all velocities) in a certain state created after the collision, the collision process would proceed in the reversed time direction, and one would return to the initial state (with reversed velocities).

 

In quantum mechanics time reversal with respect to a collision means that if an initial state a leads to a final state b with a certain probability, then an initial state b’ leads to a final state a’ with the same probability (the primed state is equal to the unprimed state, but with reversed velocities or wave vectors). Although this is a necessary condition for equilibrium in a gas (for example, if the equilibrium arises by means of elastic collisions among molecules), a statement of this kind is not valid with respect to spontaneous processes.

 

For instance, in Einstein’s derivation of Planck’s distribution law for black body radiation, he established that the probability for stimulated emission is the same as for stimulated absorption of radiation. But he could only arrive at Planck’s formula when he added a third mechanism, the spontaneous emission of radiation, which has no reverse.[16]
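The role of the third mechanism can be made explicit in a short worked form of the argument. The notation is an assumption for illustration: two non-degenerate levels 1 (lower) and 2 (upper) with populations N_1 and N_2, energy difference hν, radiation density ρ(ν), temperature T, and the coefficients A and B usually named after Einstein.

```latex
% Rates of the three mechanisms
\begin{align}
  \text{stimulated absorption:}  \quad & B\,\rho(\nu)\,N_1 \\
  \text{stimulated emission:}    \quad & B\,\rho(\nu)\,N_2 \\
  \text{spontaneous emission:}   \quad & A\,N_2
\end{align}

% Equilibrium together with the Boltzmann ratio of the populations
\begin{align}
  B\,\rho\,N_1 &= B\,\rho\,N_2 + A\,N_2,
  \qquad \frac{N_2}{N_1} = e^{-h\nu/kT} \\
  \rho(\nu) &= \frac{A/B}{e^{h\nu/kT} - 1}
\end{align}

% Without the spontaneous term (A = 0) the balance reads
%   B \rho N_1 = B \rho N_2  \implies  N_1 = N_2,
% which contradicts the Boltzmann ratio at any finite temperature.
```

With A/B = 8πhν³/c³ the last expression is Planck's distribution law. The two stimulated processes mirror each other, whereas the spontaneous term has no counterpart on the absorption side.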

 

Both in classical and in quantum physics it is time in its kinematical aspect which is reversible in those cases where the principle of detailed balance is applicable. In fact, the principle only states that there are processes which are symmetrical with respect to kinetic time, just as there are atomic and molecular structures which display spatial symmetries. But, as soon as we study the process of interaction itself, irreversibility is unmistakably present.

 

 

 

6.4. Time asymmetry concerns subjects, but is a law

 

 

 

Whoever does not recognize the modal distinction between the kinetic and physical aspects must feel forced to deduce the factual irreversible behaviour of physical systems (including friction) from the statistical result of a large number of reversible processes, that is, finding reversibility on the subject side.

 

This is also the case in Hans Reichenbach’s theory of branch systems. Even if one assumes that the macrostates of a system are ordered according to a monotonically changing probability, one has to prove that the direction of this change is the same for all physical systems. By relating the branch systems to the universe Reichenbach thinks he is able to determine a universal direction of time, which is not a law, but a factual subjective property of the universe. Thus he has to assume that the universe as a whole has an increasing entropy, and discusses the possibility that the entropy will decrease at some time.[17]

 

Meanwhile, the neo-positivist Reichenbach cannot avoid realizing that such considerations are rather speculative, for it is hardly possible to say anything meaningful about the properties of the universe (total energy, total entropy),[18] any more than about its relative motion, or spatial position, or even about its number. (Is the universe a member of a class of universes, possibly the only member?) It is very doubtful whether the universe can be treated as a subject with properties like other subjects (Reichenbach states, e.g., that the universe is a closed system, without explaining what this should mean).[19]

 

Reichenbach’s theory is invalid if it is not assumed that the present state of the universe is a state of low entropy. There are at least three objections to this assumption. First, the entropy of a system can only be defined if the system is isolated and finite, and even if pretending to know what this means with respect to the universe, one would not know whether it is factually the case. Second, the entropy of a system is only defined if the system is in a state of equilibrium, and Reichenbach’s theory assumes that this does not apply to the universe. Third, Reichenbach relates entropy to probability, but his notion of probability ‘… is always assumed to mean the limit of a relative frequency’.[20] Therefore, it does not apply to a single system, such as the universe. What remains of the initial assumption is that the universe is not in a state of equilibrium. I fail to see how any conclusion can be drawn from this negative statement.

 

The universality of physical time asymmetry does not rest on its relation to the universe, but on its being a modal law, having universal validity. This law does not depend on the empirical fact that all physical processes, which have been observed till now, turn out to develop in one direction of time. One would rather say that the possibility of physical processes depends on the irreducible physical time order which is a basic modal law.

 

According to Grünbaum,

 

‘… the complete time symmetry of the basic laws like those of dynamics or electromagnetism is entirely compatible with the existence of contingent irreversibility’.[21]

 

But he admits that it is rather meaningless to call the laws of motion ‘basic’ and the law of irreversibility a ‘universally valid’ statement about ‘contingent facts’.[22] This artificial construction is invented to reconcile reversibility with irreversibility, by relating the former to laws and the latter to facts. It is very strange indeed, that, e.g., the reversible laws of electromagnetic wave propagation are called ‘laws’, although they are only valid in a very limited case (namely in the absence of any absorbing physical subject), whereas the irreversible law of entropy increase is denied the status of a law, because it is only valid in a very limited case (namely in the absence of spontaneous fluctuations). This strange attitude can only be understood if we keep in mind the basic motive of 19th-century  mechanical philosophy (T&E, 7.5).

 

In contrast to this now outmoded view, I relate reversibility and irreversibility to mutually irreducible modal aspects of temporal reality. Just as static simultaneity is retained in the kinetic relation frame as the borderline case of a state of rest, so reversibility can often be found as a boundary case in the physical modal aspect – for example, electromagnetic wave motion in a vacuum or the motion of bodies when friction can be neglected.

 

 

 

6.5. Interactions

 

 

 

Although I do not like to speak about the interaction of a system with the universe, I agree with one point in Reichenbach’s theory of branch systems, namely, the significance of interaction for the physical order of irreversibility.[23] After an interaction with another system, the state of a physical subject will change gradually, until after some time it reaches a state of equilibrium. Thus the first effect of a physical interaction is to disturb the pre-existing equilibrium states of the interacting systems. After the interaction the system approaches a new equilibrium state, i.e., a state of uniform temperature, pressure, chemical composition, electrical potential, etc. This development is irreversible, that is, except for statistical fluctuations, no system in equilibrium will spontaneously move out of equilibrium, anticipating a future interaction with another system. Reichenbach seems to overlook this difference, assuming that the entropy of each branch system increases steadily until it is reunited with the universe.[24] In fact, the entropy of any branch system becomes constant after a while, remaining so until it contacts the universe again. Then the entropy will change, but only after this event, not before.

 

The error in Reichenbach’s theory is not simply that he begins with the analysis of a single subject (he admits that to be impossible)[25], but that he does not break radically with this method. He tries to deduce the time direction of the universe as a collection of single systems, from the fact that each system alone tends to a state of internal equilibrium. In my view, the analysis of modal physical time has to start with the relations between subjects, especially between pairs of subjects, just as is the case with numerical, spatial, and kinetic time.

 

Of course, if one begins by stating that in any two interacting systems the entropy never decreases, this fact can be extrapolated to the universe, provided the latter is understood to be as many interacting systems as one likes. Put this way, the statement that the entropy of the universe will always increase is valid though quite useless, whether or not this universe has a finite limit. This is not the case in Reichenbach’s theory which is only applicable to a finite universe. Moreover, his notion of the universe cannot be used in his theory, because it presupposes what he wants to prove. My theory is purely relational, whereas Reichenbach’s requires an absolute universe.

 

From the fact that a closed system is not in a state of equilibrium it cannot be deduced with certainty that it has interacted with another system some time before: it may be a spontaneous fluctuation. On the other hand, interaction always disturbs equilibrium. Therefore, interaction, rather than the approach to equilibrium, is an irreversible expression of modal subjective physical time.

 

 

 

6.6. Initial and boundary conditions

 

 

 

A physical system cannot be described without taking into account its interaction with other systems. Often this interaction can be contained in the so-called boundary conditions. This is the only acceptable interpretation of ‘the universe’: the environment of the system under consideration, i.e., the spatially continuous representation of the physical relation of the subject with all other subjects. The simplest boundary condition is a rigid wall, which must be understood in a physical rather than a spatial sense. It is an infinitely high and steep potential energy. Furthermore one can distinguish thermally conducting walls, movable walls, porous walls, etc. If a system is in an equilibrium state, the latter is largely determined by these boundary conditions.

 

It is possible to describe the interaction between two systems by assuming that at first they are kept apart by some kind of boundary, which is removed at some later time. In that case the entropy of the combined systems will increase. This has induced Brian Pippard to assume that the entropy is in fact a measure of the constraints on the system. If a system is restricted by some kind of boundary condition from reaching a state that it would have if this boundary condition were absent, then its entropy is relatively low.[26] He admits, however, that this view has a severe disadvantage. It overlooks the transient condition between the removal of the constraint and the subsequent arrival at the new equilibrium state. Suppose the constraint is removed, but reinstated before the system has time to reach equilibrium. Then the system will have an entropy value somewhere between the initial value and the value for the equilibrium state without constraint. Thus the reinstatement of the constraint does not itself decrease the entropy. It just stops the increase of entropy. This implies the impossibility of reducing irreversibility to spatial constraints. As observed in chapter 5, the increase of entropy is invariably related to currents. A constraint (like thermal isolation) can now be interpreted as prohibiting a current (like a heat flow) to occur.
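The argument about the reinstated constraint can be checked with a small numerical sketch. It assumes two bodies with equal, constant heat capacity that exchange heat only with each other; the temperatures, the heat capacity, and the function name are hypothetical values chosen for illustration.

```python
from math import log

def entropy_change(t_hot: float, t_cold: float, heat: float, capacity: float = 1.0) -> float:
    """Total entropy change after `heat` has flowed from the hot to the cold body.

    Both bodies are assumed to have the same constant heat capacity.
    """
    t_hot_new = t_hot - heat / capacity
    t_cold_new = t_cold + heat / capacity
    return capacity * (log(t_hot_new / t_hot) + log(t_cold_new / t_cold))

T_HOT, T_COLD, C = 400.0, 300.0, 1.0  # illustrative values (assumption)

# Heat that must flow before both bodies reach the same temperature.
q_equilibrium = C * (T_HOT - T_COLD) / 2.0

# Constraint removed and left removed: full approach to equilibrium.
ds_full = entropy_change(T_HOT, T_COLD, q_equilibrium, C)

# Constraint reinstated after half of that heat has flowed.
ds_half = entropy_change(T_HOT, T_COLD, q_equilibrium / 2.0, C)

print(ds_half, ds_full)
```

The intermediate value lies between zero and the full increase: reinstating the thermal isolation stops the heat current and thereby the growth of entropy, but it does not undo the growth that has already occurred.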

 

Nevertheless, Pippard is certainly right in pointing to the relevance of the boundary conditions or constraints for irreversible processes.[27]  The initial state is also a boundary condition, though not a spatial one. The irreversibility of the physical temporal order makes the initial state relevant, but not the final state. That is, whereas the initial state and the spatial boundary conditions determine physical processes, the final state is merely their effect. Similarly, it is the removal of constraints, not their reinstatement, which leads to a change of entropy.

 

 

 

Development in phase space

 

In the context of statistical physics, the microstate of a system consisting of many molecules is represented by a point in a phase space. A non-equilibrium macrostate is represented by a small domain in this space and an equilibrium macrostate is represented by a large domain. Specifically, a physical interaction creates a non-equilibrium macrostate which is represented by a relatively simply shaped (e.g., spherical) domain in this space.

 

This is nicely illustrated in a picture in Reichenbach’s book.[28] During the spontaneous approach to equilibrium the domain is gradually spread out into a very whimsically shaped ‘starfish’ extending through all phase space, although it has the same volume as the original figure.[29] Any microstate of a system is represented by the set of all positions and momenta of all molecules, which is objectified by a point in phase space. That the initial macrostate must be described by a domain in this space, rather than by a point, is due to the fact that no macroscopic interaction is sufficiently accurate to determine the microstate exactly – no boundary can determine a single point.

 

The production of a macrostate cannot be understood in kinematical terms only. It requires a new mode of explanation, which is the physical one.[30] To delimit a certain region in phase space requires the introduction of constraints on the positions and momenta of all molecules of the system. These constraints cannot determine exactly a point in phase space.

 

However, a slight deviation in our macroscopic specification of the state will not make much difference to the initial state, nor to the final macrostate. The point is that the starfish corresponds to the set of all microstates compatible with the initial conditions. But this starfish is itself only a very small subset of the set of all microstates representing the final macrostate, because the latter could also have been reached from other initial macrostates incompatible with the actual initial conditions.

 

Now suppose one wishes to return from the final macrostate to the initial macrostate by reversing all molecular velocities, which is theoretically possible in a kinetic sense. This means that with the help of some interaction one has to prepare a microstate falling within one of the arms of the starfish. These arms are, however, very thin, because the starfish extends over the whole large domain representing the original final macrostate, but has the same small volume as the domain representing the original initial macrostate. Thus, even a slight inaccuracy in the specification of the reversed final state already means that the process will not end up in the state compatible with the original initial conditions.
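The sensitivity of the reversal can be illustrated with a toy model. The sketch below is only an analogy: it uses Arnold's cat map on an integer lattice as a stand-in for a volume-preserving flow in phase space, and running the inverse map stands in for reversing all velocities. The lattice size, the number of steps, and all names are hypothetical choices, not part of the argument in the text.

```python
# Toy analogy only: an area-preserving, exactly invertible map on an M x M lattice.
M = 1_000_000   # lattice size of the toy phase space (assumption)
STEPS = 12      # number of forward steps (assumption)
SIDE = 1000     # side of the small square representing the initial macrostate

def forward(p):
    """One step of the cat map: stretches in one direction, squeezes in another."""
    x, y = p
    return ((2 * x + y) % M, (x + y) % M)

def backward(p):
    """Exact inverse of `forward`."""
    x, y = p
    return ((x - y) % M, (-x + 2 * y) % M)

def iterate(step, points, n):
    """Apply `step` n times to every point."""
    for _ in range(n):
        points = [step(p) for p in points]
    return points

def in_initial_domain(p):
    return p[0] < SIDE and p[1] < SIDE

# Sample microstates inside the small initial domain.
initial = [(x, y) for x in range(0, SIDE, 100) for y in range(0, SIDE, 100)]

spread = iterate(forward, initial, STEPS)

# Exact reversal: every point returns to where it started.
exact_return = iterate(backward, spread, STEPS)

# Reversal after an error of one lattice unit (one millionth of the space).
perturbed = [((x + 1) % M, y) for x, y in spread]
perturbed_return = iterate(backward, perturbed, STEPS)
hits = sum(in_initial_domain(p) for p in perturbed_return)

print(exact_return == initial, hits)
```

In this toy model the exact reversal restores the initial domain, whereas an error of a single lattice unit in the reversed state already carries every sample point outside it.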

 

This analysis requires relating irreversibility to the accuracy with which a microstate can be prepared by some interaction.[31] Quantum physics has shown that this accuracy has a finite limit determined by Heisenberg’s indeterminacy relations. But it is not necessary to appeal to quantum physics. Even in classical physics it is sufficient to state that any physical interaction can only determine a domain in phase space. Increasing the accuracy does not make it possible to reduce a domain to a single point. This is a rejection of the classical mechanist doctrine which holds that a physical state can be represented by a point in phase space.[32]

 

 

 

6.7. Causality

 

The concept of causality is often identified with that of lawfulness, and even with that of determinacy (T&E, 8.6). Therefore, it is discussed with respect to the problem of individuality or the occurrence of stochastic processes.[33] Sometimes, causality is reduced to irreversibility, and conversely, there exist causal theories of time.[34] The law-subject relation and its bearing on determinacy will be discussed in chapter 8. Here I want to comment only on the relation between cause and effect.

 

It is often stated that this relation is ill-defined in physics. It is often impossible to state unequivocally what is cause or what is effect in processes occurring in closed systems. This is not difficult to understand because a study of closed systems requires an initial interest in the interaction as a subject-subject relation. Furthermore, a distinct cause-effect relation is hardly tenable due to the law of action and reaction.

 

But if external influences on an (otherwise closed) system are considered, the causality concept can still be maintained. Especially in this case, the cause-effect relation is irreversible and asymmetric, as it is always understood to be. One speaks of causality if the state of a system is changed by some interaction. As such the concept of causality refers back to the kinematic relation frame. Hence it is an analogical concept, and, as such, it returns in every modal aspect following the physical one.[35]

 

In order to make this clear, consider the following example of a closed system consisting of two subsystems in thermal contact. Given the respective temperatures, T1>T2, a heat current J flows from the first to the second system. This system as a whole cannot be analysed in terms of cause and effect, and physicists will always take recourse to object-object relations: the relative energy, the temperature difference, the current. But if one considers the first subsystem, the heat current causes the temperature T1 to decrease, and considering the second subsystem, the heat current causes T2 to increase. Alternatively, one can also consider the thermal contact, and state that the temperature difference causes the current J to flow. Thus at the same time the current can be considered both as cause and as effect. This is possible, because in the cause-effect relation the reaction of the system to the cause of its change is neglected.

 

Therefore, the causality concept has a limited applicability – namely to cases where internal states can be distinguished from external influences. But this does not mean that it is useless, even in physics.[36] Especially in experimental physics, in which external disturbances are deliberately introduced (or at least must be accounted for) in order to study the way a system reacts to them, one frequently makes use of the causality concept.[37]

 

The main reason why the cause-effect relation is not very useful in physics is that it is not a very simple relation. It is not a subject-subject relation (the effect is not a subject), nor an object-object relation (it is not a succession of states). Whereas the cause is some interaction (a subject-subject relation) in which the reaction is not taken into account, its effect is objective (the changing state of one of these subjects). Thus a cause-effect relation is a complicated subject-object relation, reducible to the basic subject-subject relation called interaction.

 



[1] Reichenbach 1956, 36 speaks of ‘lineal order’.

[2] Nagel 1960, 288-312; 1961, 336-345; Reichenbach 1956, 54ff.

[3] See, e.g., the discussions on this subject in Gold (ed.) 1967; see also Landau, Lifshitz 1959, 30; Grünbaum 1963, 240ff; Whitrow 1961, 5ff; Penrose 1970, 41, 42; Schrödinger 1962, 14; Weizsäcker 1971, 233, 240.

[4] Reichenbach 1956, 108ff; Grünbaum 1963, 242; Tolman 1938, 146ff.

[5] Reichenbach 1956, 117ff; see also Grünbaum 1963, 254ff; 1974, 789ff.

[6] Reichenbach 1956, 143.

[7] Kittel 1969, Chapter 4.

[8] Lindsay, Margenau 1936, 201.

[9] Einstein 1917.

[10] Grünbaum 1963, 249; Ludwig 1954, 181.

[11] Tolman 1938, 161ff, 521; Kaempffer 1965, chapter 28.

[12] Kaempffer 1965, 255.

[13] Messiah 1958, 673; Tolman 1938, 163.

[14] Tolman 1938, 114ff, 162: ‘… in general … processes which are the inverse of each other do not exist …’

[15] Kaempffer 1965, 258-261; Messiah 1958, 675.

[16] Einstein 1917; Jammer 1966, 112-114.

[17] Reichenbach 1956, 117ff. Reichenbach’s reference to the ‘universe’ is criticized by Grünbaum 1963, 261ff. See also Dooyeweerd NC, III 629ff; Popper 1959, 196ff calls explanations which depend on a particular improbable state of the universe ‘speculative metaphysics’, see Popper 1974. The attribution of subjective properties like temporal duration and spatial dimension to the universe leads to antinomies, as was first discovered by Kant 1781, A 420-433, B 448-461.

[18] Reichenbach 1956, 132, 133.

[19] Reichenbach 1956, 135.

[20] Reichenbach 1956, 123.

[21] Grünbaum 1963, 277.

[22] Grünbaum 1963, 273. Both Popper and Grünbaum have pointed out that there are physical processes whose irreversibility cannot be reduced to ‘entropic’ irreversibility; see Grünbaum 1964, 1974; Popper 1974.

[23] Reichenbach 1956, 117; Grünbaum 1964.

[24] Especially figure 21, on page 127 of his book, is in my opinion a very inadequate and probably misleading representation of Reichenbach’s own views, and certainly of what actually happens. See also Grünbaum 1974, 789, 794, 795.

[25] Reichenbach 1956, 117.

[26] Pippard 1960, 94ff; the Second Law is formulated as: ‘It is impossible to vary the constraints of an isolated system in such a way as to decrease its entropy’ (ibid., 96).

[27] Brillouin 1964, chapter 6.

[28] Reichenbach 1956, 94.

[29] This is a consequence of Liouville’s theorem. See Tolman 1938, 51.

[30] Reichenbach 1956, 149ff.

[31] Bondi (in Gold (ed.) 1967, 3) interprets this to mean that irreversibility is merely due to the inability of the experimenter to produce microstates.

[32] The probabilistic interpretation of irreversibility also makes use of this fact. The probability of a ‘point-state’ is zero. Only the probability over a domain can have a finite value. Boltzmann’s derivation of his famous ‘H-theorem’, which describes the irreversibility of physical processes, also leans heavily on the characterization of the state by a domain. Brillouin 1964, chapter 1 relates entropy to information, i.e., the experimenter’s knowledge of the system’s initial microstate. It is true, of course, that in information theory entropy and its increase play an important part, but this has a physical basis. It is not the experimenter’s knowledge that counts, but his ability to delimit the domain in a physical sense.

[33] Reichenbach 1956, 55, 149ff; Bunge 1959a, part I; Campbell 1921, 49-57; Braithwaite 1953, chapter 9.

[34] Cf. Whitrow 1961, 175, 217f; Frank 1941, 53ff; Reichenbach 1956, 24ff.

[35] Dooyeweerd NC, I, 558; II, 110.

[36] Margenau 1950, 389ff; 1960, 437; Toulmin 1953, 107ff; Bunge 1959a, 29, 91ff; Nagel 1939, 25f; 1961, 316ff.

[37] Campbell 1921, 53.

 

 

Part I, chapter 7

 

 

 

 

Wave packets

 

 

 

7.1. Relaxation and oscillation

 

 

 

Chapter 5 discussed the modal retrocipations of interaction: energy, force, and current. These physical concepts, referring to the numerical, spatial, and kinematical modes of explanation, are not unrelated. Currents presuppose forces, and forces presuppose energy. A further complication is that a full account of forces can only be given if one considers energy (and other extensive parameters) in disclosed form, i.e., as potentials. And currents can only be accounted for if one considers forces as fields. Especially in relativity physics, developing kinetic anticipations in the numerical and spatial modal aspects, the purely retrocipatory concepts of internal energy (or mass) and force can no longer be used, and must be replaced by the energy-momentum four-vector and the field. This chapter intends to study the anticipations of the first three modal aspects on the physical aspect more closely. As observed in section 2.2, in order to understand the anticipations, one sometimes requires knowledge of some specific characters, to be discussed more extensively in part II. In the present case, one has to rely on the specific properties of waves and oscillations, in particular those of wave packets.

 

To start with, numerical time, being originally a purely numerical difference between natural or rational numbers, becomes continuous when anticipating the spatial modal aspect, and uniform when anticipating the kinematical aspect. With respect to physical interaction, it should be subjected to the order of irreversibility.

 

If, for instance, two bodies initially at different temperatures are brought into thermal contact, a heat current will decrease their temperature difference. But the heat current in turn is proportional to this difference, so that the current will also decrease. The equalization of the temperature will slow down gradually. This process, when compared to kinematic relative motion, occurs exponentially. It is described numerically by an exponential function, whose exponent is the kinetic time parameter divided by a constant relaxation time.[1] The relaxation time is a measure of the retardation between cause and effect. The value of the relaxation time is determined by the conductance of the thermal contact and the heat capacity of the two systems. Therefore it always has a typical and individual character. But the exponential behaviour itself is independent of the typical individuality of the two systems, and is thus of a modal nature.
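The exponential law and the dependence of the relaxation time on conductance and heat capacity can be written out in a few lines. The symbols are assumptions introduced for this sketch: C_1 and C_2 are the (constant) heat capacities, K the conductance of the thermal contact, and the pair is taken to be isolated from everything else.

```latex
% Heat current proportional to the temperature difference
\begin{align}
  J &= K\,(T_1 - T_2), \qquad
  C_1 \frac{dT_1}{dt} = -J, \qquad
  C_2 \frac{dT_2}{dt} = +J
\end{align}

% Equation for the difference \Delta T = T_1 - T_2
\begin{align}
  \frac{d\,\Delta T}{dt} &= -K\left(\frac{1}{C_1} + \frac{1}{C_2}\right)\Delta T
                          = -\frac{\Delta T}{\tau},
  \qquad \tau = \frac{C_1 C_2}{K\,(C_1 + C_2)} \\
  \Delta T(t) &= \Delta T(0)\, e^{-t/\tau}
\end{align}
```

The value of τ depends on the particular bodies and their contact, whereas the exponential form of the solution does not.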

 

The relaxation time is not only found in thermal physics. Relaxation, damping, or absorption occur in mechanics, wherever there is some kind of friction, resistance, or energy dissipation. In unstable atomic and nuclear systems it is expressed in the relaxation or decay of an excited state to the ground state. Relaxation is always related to the transport of energy from one place to another; the transport of energy from one state to another; or the transformation of one kind of energy into another. Relaxation always means the irreversible approach towards an equilibrium state.

 

Oscillation occurs in a system when the equilibrium state is approached with a velocity proportional to the deviation from equilibrium at an earlier instant instead of at the same moment as in relaxation. In an oscillation the system overshoots the equilibrium state. An example is a pendulum passing its central (equilibrium) position with a velocity nearly proportional to its amplitude. The amplitude of the oscillation will decrease exponentially due to friction. In fact, oscillation will occur only if the friction is not large enough for a simple relaxation process. The oscillatory motion can be described by a harmonic function, i.e., a sine or cosine function, or an exponential function, which now has an imaginary exponent (2.4). Besides the relaxation time describing the gradual decrease of the amplitude, the oscillation time (the period of the oscillation, the inverse of its frequency) occurs as a typical number. It depends on the internal structure of the system.
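The condition that oscillation occurs only when friction is small enough can be made explicit with the standard damped harmonic oscillator. This is a textbook model offered as a sketch, not the author's own formulation; the symbols x (deviation from equilibrium), γ (damping rate) and ω_0 (undamped angular frequency) are assumptions.

```latex
\begin{align}
  \ddot{x} + 2\gamma\,\dot{x} + \omega_0^2\,x &= 0
\end{align}

% Weak friction, \gamma < \omega_0: the system overshoots and oscillates
\begin{align}
  x(t) &= A\,e^{-\gamma t}\cos(\omega t + \varphi)
        = \operatorname{Re}\!\left[\tilde{A}\,e^{(-\gamma + i\omega)t}\right],
  \qquad \omega = \sqrt{\omega_0^2 - \gamma^2}
\end{align}

% Strong friction, \gamma > \omega_0: pure relaxation, no oscillation
\begin{align}
  x(t) &= a\,e^{-(\gamma - \kappa)t} + b\,e^{-(\gamma + \kappa)t},
  \qquad \kappa = \sqrt{\gamma^2 - \omega_0^2}
\end{align}
```

In the oscillating case two typical numbers appear side by side: the relaxation time 1/γ of the amplitude and the period 2π/ω.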

 

Both oscillation and relaxation can be used as clocks. In the former case one has to compensate for any kind of relaxation, e.g., for friction in a pendulum clock or a watch. Relaxation time itself is used for time measurement in the C14 method of determining the age of archaeological objects.
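
How a relaxation process can serve as a clock may be sketched as follows. The example assumes the commonly quoted half-life of carbon-14 of about 5730 years and a hypothetical measured fraction of the original C14; the function name is illustrative only.

```python
import math

C14_HALF_LIFE_YEARS = 5730.0  # commonly quoted value; treated here as an assumption

def age_from_fraction(remaining_fraction, half_life=C14_HALF_LIFE_YEARS):
    """Age of a sample from the fraction of the original C14 still present."""
    relaxation_time = half_life / math.log(2.0)  # time for decay by a factor e
    return -relaxation_time * math.log(remaining_fraction)

# Equal steps in the remaining fraction do not correspond to equal steps in age.
for fraction in (0.75, 0.50, 0.25):
    print(fraction, round(age_from_fraction(fraction)))
```

The logarithm in the conversion shows that a clock based on relaxation is related to a periodic clock in a non-linear way.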

 

Hence a physical time scale could be defined, related to the kinetic one by way of an exponential function. The non-linearity of this relation implies that two intervals which are congruent in one of these time scales will be incongruent in the other one. It is, in part, a convention that the kinetic time scale is preferred even in physics, in particular because periodic clocks are much more accurate. But this does not mean that either scale is conventional. In both cases it is required that the kinetic, as well as the physical, temporal relation be properly represented by the scale: the temporal relation must be independent of the typical individuality of the clock by which it is eventually measured (3.11).

 

In a clock based on the physical process of oscillation, the physical aspect of irreversibility is taken care of by the compensation of retardation effects. For a carefully constructed clock the time rate is in accord with the kinetic uniformity of time, the Newtonian metric (3.10, T&E, 3.7). The clock must be synchronous to other clocks, which refers to the spatial order of simultaneity. But essentially, the time is measured in a discontinuous way, because the number of periods is counted. This shows again that time, as this word is usually understood, is numerical time, opened up anticipating the spatial, kinetic and physical aspects.

 

 

 

7.2. Waves

 

 

 

The development of the spatial modal aspect is realized especially in the introduction of the concept of a field (5.5). Fields are intimately related to waves. After Maxwell developed the mathematical theory of electromagnetism, he realized that his equations suggested the possibility of wave motion, which he identified as light (T&E, 4.6). Chapter 2 pointed out that one needs to use spatial objects like points and boundaries in order to analyze spatial relations. It took some time before physicists realized that kinetic objects are needed in a modal description of motion.

 

Real numbers turned out to be numerical anticipations to the spatial modal aspect (2.3). Real numbers objectify spatial points, real functions of numbers objectify extended boundaries in space, such as lines in a plane or planes in a three-dimensional space. Functions of real or complex vectors also play a role in the anticipations to the physical and kinematical aspects. Especially a kinetic subject can be objectified by a set of functions, more or less in the same way as a spatial figure can be objectified by a set of points.

 

The points on a spatial boundary are connected by an equation. A function of the form f(x,y): y=ax+b represents the law for a straight line in a plane, as long as the numbers a and b are not specified, whereas for certain values of a and b (e.g., y=2x+3), the equation describes a particular line. From the law we can find a and b if two points on the line (two solutions of the equation) are given. Similarly, the functions in a wave packet are determined by a wave equation on the law side, and by specific amplitudes and phase relations on the subject side.
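
The step from two given points to the particular line can be written out in a few lines. This is only an illustrative sketch; the function name is hypothetical.

```python
def line_through(p, q):
    """Return (a, b) of the line y = a*x + b through two points with different x."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        raise ValueError("a vertical line cannot be written as y = a*x + b")
    a = (y2 - y1) / (x2 - x1)
    b = y1 - a * x1
    return a, b

# Two solutions of y = 2x + 3 recover a = 2 and b = 3.
print(line_through((0.0, 3.0), (1.0, 5.0)))
```

The general form y=ax+b plays the part of the law, while the two points supply the subjective data that single out one line.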

 

Wave packets are kinetic, not physical subjects. They can move, but they cannot as such interact with each other. They are subjects in the first three modal aspects because they are countable and have other numerical characteristics, they have extension, and they move. But if electrons collide with each other, they do not do so because they are wave packets, but because they are electrically charged and therefore exert a Coulomb or Lorentz force upon each other. This property of being charged is not included in the wave character of the electron’s motion – it is an additional specific property. Two subjects having the same energy but different (possibly no) charge may have similar wave packets. Thus we find that the wave packet is an objective representation of a physical subject with respect to its motion. As such it is of a general, universal, and thus modal character.

 

Although wave packets are kinetic subjects because they move, the composing waves are not. These are kinetic objects, necessary for the objectification of kinematical subjects. The composing waves do not move and are therefore not subjects. This situation is parallel to the relation between a spatial figure and the points contained in it. The waves composing a wave packet differ from static functions (which anticipate spatial boundaries) by having a time-dependent phase. This phase is responsible for the interference phenomena, which have their static counterpart in the phenomenon of superposition of spatial functions or fields. The superposition and interference properties of waves anticipating the physical relation frame are very important in the description of the interaction of physical subjects.

 

 

 

7.3. Differential equations

 

 

 

The mathematical possibility of describing the motion of a particle with the help of a wave packet was already seen by William Hamilton nearly a century before Louis de Broglie stated his famous hypothesis about the wave character of electron motion (9.1).[2] It is a direct consequence of the application of differential equations to the problem of motion. The law for a certain motion, whether uniform or accelerated, is mathematically objectified in a differential equation[3] whose subjective counterpart is a set of undetermined functions. The equation can only yield a definite solution if some initial or boundary conditions are specified.

 

This was first recognized by Isaac Newton and Gottfried Leibniz, who independently invented differential and integral calculus in order to be able to study mathematically the motion of material bodies.[4] Thus the mathematical expression of the law of purely kinematical motion, Newton’s first law, is (in Leibniz’ notation) dr/dt=v, wherein r and v are vectors. The solution of this equation, r(t)=r(0)+vt, contains the undetermined parameters, the initial position r(0) and the velocity v. If these are known, the position of the moving body is given for any time. The moving subject itself is assumed not to change and its position is therefore represented by a characteristic point, e.g., its centre of mass. This law is valid for non-interacting subjects provided r and v refer to an inertial system. Differentiating the equation yields d²r/dt²=0 as an equivalent expression of the same law. If now interaction is introduced in the form of a force or field F(r), the second law of motion is d²r/dt²=F(r).

 

The solutions of these equations do not describe motion itself, but the spatial path of the motion, i.e., a retrocipatory spatial analogy of kinetic motion. But an anticipatory description is required. A function f(r) is a numerical anticipation of spatial figures. Therefore functions of this kind are subjected to a differential equation, rather than point vectors r=(x,y,z). These must be differentiated with respect to all temporal and spatial coordinates (t and r). This can be done in various ways.

 

For electromagnetic wave propagation in vacuum, Maxwell found the following law as a consequence of his laws concerning electric and magnetic fields (T&E, 4.6, 6.4):

 

Δf(r,t) = ∂²f(r,t)/∂x² + ∂²f(r,t)/∂y² + ∂²f(r,t)/∂z² = (1/c²) ∂²f(r,t)/∂t²

 

where f(r,t) represents the electromagnetic field.

 

The solution of this equation depends on the boundary conditions in a rather complicated way. If there are no boundary conditions specified, any function of the type f(r+ct+φ) is a solution of this equation.[5] This shows that the number c is the velocity of the kinetic subject whose motion is described by the equation. The velocity c belongs to the law of the motion and does not depend on initial or boundary conditions. This wave equation is relativistically invariant, and c has the same value with respect to any inertial reference system. Consequently, the wave equation can only describe the motion of subjects whose internal energy (or rest mass) is zero, i.e., light quanta. A difference with Newton’s law is that f(r,t) does not describe the path of the motion.
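
That a function of the combined argument satisfies the wave equation, whatever its shape, can be checked with the chain rule. The sketch below restricts itself to a single spatial coordinate x, which is an assumption made here for simplicity, and writes u for the argument.

```latex
\begin{aligned}
u &= x + ct + \varphi, \\
\frac{\partial^2 f(u)}{\partial x^2} &= f''(u), \qquad
\frac{\partial^2 f(u)}{\partial t^2} = c^2 f''(u), \\
\frac{\partial^2 f}{\partial x^2} &= \frac{1}{c^2}\,\frac{\partial^2 f}{\partial t^2}.
\end{aligned}
```

The shape of f drops out of the result and only c remains, in agreement with the statement that c belongs to the law and not to the initial or boundary conditions.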

 

Another example is Schrödinger’s equation:

 

Δf(r,t) = 2im ∂f(r,t)/∂t

 

where i is the imaginary unit, and m is a constant, to be identified with the mass of the subject (if it is physically qualified). The solution is of the type f(iωt+ik.r+iφ). The main difference with Maxwell’s equation is that it applies to a physical subject having mass and moving with a low velocity compared to c. It is not relativistically invariant, and its solutions are complex functions. ω and k are determined by the boundary conditions. As with Newton’s equation, other terms can be added describing motion in an external field.

 

These are not all the possibilities for differential equations describing motion. For instance, the Schrödinger equation can be written in a relativistically invariant form, which gives us the Dirac equation or the Klein-Gordon equation (after Oskar Klein and Walter Gordon). Currents are also subjected to differential equations. In classical physics one distinguished particle motion from a continuous current. The wave theory of motion shows that this distinction is unwarranted. Particle motion also achieves the character of a current when anticipating the physical modal aspect.

 

On the other hand, any physical current must be quantified when taken in the retrocipatory direction, i.e., when its energy is involved. In so far as the Maxwell and Schrödinger equations do not show damping, they are idealized limiting cases of real physical motion.

 

 

 

7.4. Superposition

 

 

 

Maxwell’s and Schrödinger’s equations also differ from Newton’s equation because they are homogeneous and linear. This means that if we have two different solutions, f1 and f2, any linear combination of them (af1+bf2) is also a solution. Here a and b are arbitrary real numbers for Maxwell’s equation, and complex numbers for Schrödinger’s equation. This implies that the solutions can be ordered in a function space, if an operation called the scalar product can be designed (2.4). The basis of this function space depends on the boundary conditions. In the case of unbounded uniform linear motion, the basis functions form a continuous set. This presents a difficulty because these functions cannot directly be normalized, but this problem can be solved, as will be shown presently.

 

For Maxwell’s equation this basis consists of the functions cos(ωt+k.r+φ) where ω (the angular frequency) and k (the wave vector) range over all positive and negative real numbers, with the condition ω/|k|=c. For Schrödinger’s equation the basis is formed by the set of complex exponential functions exp i(ωt+k.r+φ) with the same range for ω and k, but without the restriction of Maxwell’s equation. In both cases, φ depends on ω and k, but not in a regular way. Because cos(ωt+k.r+φ)=Re exp i(ωt+k.r+φ) where ‘Re’ denotes ‘the real part of’, from now on the solutions of the two equations may be considered simultaneously, if, in the case of Maxwell’s equation, we add the prefix ‘Re’ and consider the amplitude A(ω,k) as a real number. In the Schrödinger equation A(ω,k) is complex, its complex conjugate being A*(ω,k).

 

Because the basis functions form a continuous set, they must not be summed but integrated. Hence any solution of the wave equation can be decomposed into the plane wave functions f(r,t) = ∫A(ω,k) exp i(ωt+k.r+φ) dω dk, each of which is characterized by a frequency ω and a wave vector k. All functions have these plane waves in common, but different values for A(ω,k) and φ correspond to different functions.

 

As far as kinematics is concerned, nothing more can be said about the values of the amplitude and the phase. For freely moving, physically qualified systems these values are determined by their latest interaction, i.e., by the way they were prepared before they started their free motion. This preparation can be understood as the result of a collision with another system; by the birth of the system while emerging from another one (as in the case of a light quantum emitted by an atom); or in an instrumental sense, like light or electrons passing a shutter. In all these cases, the particle’s spatial extension will be determined by that interaction, as well as its temporal extension which is the duration of the interaction. This may be the time the shutter was open or the relaxation time of the emitting atom. Immediately after the particle has started its free motion, the temporal extension can be understood as the time needed to pass a certain point. It is related to the spatial extension by means of the velocity of the particle with respect to that reference point. For particles having subluminal velocities, the spatial extension of the wave packet will always increase.

 

This means that a single plane wave, although it is a solution of the wave equation, cannot serve as a representation of a kinetic subject. The plane wave cannot even be said to move. Because it is infinitely extended, its appearance varies periodically, but there is no displacement. It also cannot be normalized. This is not serious, just because it will not be used to represent a moving subject, and because a wave packet consisting of all plane waves can be normalized by a proper choice of the amplitudes and phases. This is important for the interpretation in which wave packets describe probabilities with respect to future interactions (chapter 8). We should, therefore, consider plane waves as objects in the wave packet.
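
The contrast between an infinitely extended plane wave and a localized wave packet can be made visible with a small numerical sketch. It is restricted to one spatial dimension and a fixed instant, uses Gaussian amplitudes as an arbitrary illustrative choice, and replaces the integral over the wave number by a finite sum. All names and parameter values are assumptions, and no normalization is attempted.

```python
import cmath
import math

def plane_wave(k, x):
    """A single plane wave exp(i*k*x) at a fixed instant."""
    return cmath.exp(1j * k * x)

def wave_packet(x, k0=5.0, width=1.0, n=401, span=4.0):
    """Sum of plane waves with Gaussian amplitudes centred on wave number k0.
    A discrete sum stands in for the integral over k (an approximation)."""
    dk = 2.0 * span * width / (n - 1)
    total = 0j
    for j in range(n):
        k = k0 - span * width + j * dk
        amplitude = math.exp(-((k - k0) ** 2) / (2.0 * width ** 2))
        total += amplitude * plane_wave(k, x) * dk
    return total

# The modulus of the plane wave is the same everywhere; the packet is localized.
for x in (0.0, 2.0, 5.0):
    print(x, abs(plane_wave(5.0, x)), abs(wave_packet(x)))
```

The modulus of the single plane wave equals one at every position, so that it marks no place at all. The superposition is appreciable only near x=0, where the composing waves are in phase.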

 

The wave packet formalism is not only relevant for the motion of a free particle, or of a particle in a field (in which case the Schrödinger equation must be adapted), but also to currents. In fact, Joseph Fourier developed the above theory (named after him as Fourier analysis) while studying the problem of heat conduction, early in the 19th century. Also light quanta are individualized currents in the electromagnetic field (11.4).

 

The velocity of the wave packet is its group velocity, dω/d|k|. Usually, ω and k are not independently variable. For wave packets subjected to Maxwell’s equation, ω=c|k|, hence the group velocity is c, independent of the reference system. If ω is not proportional to |k|, the waves show dispersion. For uniformly moving material particles, satisfying Schrödinger’s equation, ω is proportional to |k|². Hence, the group velocity (the particle velocity) is proportional to |k|, and depends on the choice of the reference system.
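
The two cases can be compared by computing the group velocity numerically. This is a sketch in arbitrary units; the values of c and m and the choice of units in which ħ=1 are illustrative assumptions.

```python
def group_velocity(omega, k, dk=1e-6):
    """Numerical derivative d(omega)/dk by a central difference."""
    return (omega(k + dk) - omega(k - dk)) / (2.0 * dk)

C = 3.0      # wave speed, arbitrary units (illustrative)
HBAR = 1.0   # units chosen so that hbar = 1 (assumption)
M = 2.0      # mass, arbitrary units (illustrative)

def light(k):
    return C * k                      # omega proportional to |k|: no dispersion

def matter(k):
    return HBAR * k ** 2 / (2.0 * M)  # omega proportional to |k|**2: dispersion

for k in (1.0, 2.0, 4.0):
    print(k, group_velocity(light, k), group_velocity(matter, k))
```

For the linear relation the group velocity is the same for every k. For the quadratic relation it grows in proportion to k.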

 

 

 

7.5. Energy and momentum

 

 

 

Now the set of plane waves serving as a basis for the solutions of the wave equation is by no means unique. There is an infinitude of alternative bases, one of which will be discussed in section 7.7. On the one hand it is a bit unfortunate that nearly always the set of plane waves is taken as a basis, because it suggests that this is somehow intrinsic to kinetic and physical subjects, which is not the case. On the other hand, there are, of course, good reasons to single out this basis. One reason is that exponential functions are rather convenient in a mathematical sense.

 

However, there is a more fundamental argument. The problem of how to describe uniform motion presupposes the isotropy and homogeneity of space and time. Hence the problem becomes: find a complete set of basis functions reflecting the temporal and spatial isotropy and homogeneity of space and time. Or, in group-theoretical terms: find a suitable representation of the Galileo group (or, possibly, of the Lorentz group). Each of these approaches leads to the set of plane waves, which is therefore the most natural basis for the description of uniform motion. This immediately implies that if space is not homogeneous (e.g., in the presence of a central field of force) the plane waves, though still possible, do not necessarily form the most suitable representation of a physically qualified subject. For an atom, for instance, one would prefer spherically symmetrical waves.

 

Now the energy-momentum four-vector of a freely moving physically qualified subject must be a frame-dependent constant because of its assumed temporal and spatial homogeneity (5.3). In a similar way the wave packet’s characteristic frequency and wave vector form frame-dependent constants for exactly the same reason. However, a theorem developed by Emmy Noether states that each type of symmetry has one and only one such constant. Therefore, the energy E must be proportional to the frequency f (or ω=2πf), and the momentum p must be proportional to the wave vector k, if these objective properties all refer to the same physical system. The proportionality constant is determined only by the choice of the units for these variables, and is therefore a general, modal, universal constant of nature. It is known as Planck’s constant, h (or ħ=h/2π). The fact that it has the same value for all subjects anticipates the possibility of all physical subjects interacting with each other.[6] Hence,

 

E=ħω=hf and p=ħk=hσ.

 

These relations (due to Max Planck and Louis de Broglie, respectively) imply that the frequency and the wave vector are connected by hf=ħω=ħ²k²/2m (because E=p²/2m), where m is the, up till now, unspecified constant in the Schrödinger equation (which was designed such as to give this result).[7] The constant m can now be identified with the mass of the subject, whereas its velocity is equal to ħk/m.
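
To give an impression of the magnitudes involved, the two relations can be evaluated for a slow electron. This is a sketch; the constants are standard reference values, and the kinetic energy of 100 eV is an arbitrary illustrative choice for which E=p²/2m is an adequate approximation.

```python
import math

H = 6.62607015e-34              # Planck's constant in J s
ELECTRON_MASS = 9.1093837e-31   # kg
EV = 1.602176634e-19            # joule per electronvolt

def de_broglie_wavelength(kinetic_energy_ev, mass=ELECTRON_MASS):
    """Wavelength h/p of a slow particle, using E = p**2 / (2m)."""
    momentum = math.sqrt(2.0 * mass * kinetic_energy_ev * EV)
    return H / momentum

def frequency(kinetic_energy_ev):
    """Frequency f = E/h belonging to the same energy."""
    return kinetic_energy_ev * EV / H

print(de_broglie_wavelength(100.0))  # roughly 1.2e-10 m for a 100 eV electron
print(frequency(100.0))
```

The wavelength comes out at about the size of an atom, which is why crystals can serve as diffraction gratings for such electrons.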

 

The nature of Planck’s and De Broglie’s relations does not imply that energy/momentum and frequency/wave vector are conceptually identical. The former is retrocipatory, the latter anticipatory. But these two directions in the intermodal relationships are always strongly related. The proportionality of energy and frequency means that energy is not related to the amplitude of the waves, as was assumed earlier, so that we have to find another interpretation for the amplitude (9.1).

 

The proportionality of energy and frequency is sometimes misunderstood as energy-quantization. A light beam of frequency f only has particles of energy hf, which seems to imply discreteness. But the discreteness is due to the starting point (a light beam of frequency f). Energy is a variable which has a continuous spectrum, as does frequency, for freely moving subjects. The energy value in classical physics also has a continuous spectrum. Only in bounded systems, such as atoms or molecules, the internal energy spectrum may become discrete.

 

There have been some speculations in the literature about a possible quantization of physical space and time.[8] But this supposed quantization must not be misunderstood. It simply means that there is perhaps a smallest distance (called hodon) and a smallest time interval (called chronon) by which one can distinguish subjects and events by physical means. It does not mean that any distance or time interval is just an integral number of this hodon or chronon, respectively. This would certainly lead to antinomies. For instance, the diagonal of a square would be equal to its sides.

 

 

 

7.6. Heisenberg’s relations

 

 

 

The shape of the wave packet as determined by its physical preparation is mathematically described by a set of amplitudes A(ω,k), such that the net amplitude is only appreciable within the packet, whereas the composing waves add up to zero outside it. If the relevant extensions are denoted by Δω or Δf, and by Δkx, Δky, Δkz, we find by a very general reasoning that ΔfΔt≥1 or ΔωΔt≥2π, and ΔkxΔx≥2π, etc.[9]

 

This means that although a wave packet can be characterized by a certain frequency f and wave vector k, this is not a precise characterization as in the case of a single plane wave. The spread in the values of r and t determines the spread of k and f. If r and t are quite precisely determined such that the packet is small, so many waves are needed that k and f are ill determined, and conversely. These relations for wave-based signals (as occur in electric communication systems) were already developed by Oliver Heaviside, long before Werner Heisenberg introduced them in quantum physics. They are of a general, modal character, not characteristic of the typical structure of any physical system. In fact, they have a kinetic meaning, anticipating a physical meaning. The shape of the wave packet is determined by some previous interaction, and (because of its probability interpretation) it anticipates a future interaction.

 

The relations of Planck (E=hf) and De Broglie (p=ħk) yield Δpx.Δx>h, Δpy.Δy>h, Δpz.Δz>h and ΔE.Δt>h. These Heisenberg relations say that the energy and the momentum of a particle are not exactly determined, because of the wave character of its motion.
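
The step from the kinetic relations for the extensions of the wave packet to the Heisenberg relations is a single multiplication by ħ. As a sketch, using p=ħk, E=ħω and 2πħ=h:

```latex
\begin{aligned}
\Delta p_x\,\Delta x &= \hbar\,\Delta k_x\,\Delta x \;\geq\; 2\pi\hbar = h, \\
\Delta E\,\Delta t &= \hbar\,\Delta\omega\,\Delta t \;\geq\; 2\pi\hbar = h.
\end{aligned}
```

Nothing in this step refers to operators or to the typical structure of the moving subject.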

 

The fourth Heisenberg relation is sometimes criticized because (in contrast to the other three) it cannot be derived from a relation between non-commuting operators (9.2).[10] However, operators do not play an essential role in the kinematic theory of wave motion and operator calculus is not required to derive the above result.[11]

 

It will not be immediately clear that the wave description with the inherent Heisenberg relations is also valid with respect to systems with high energies – e.g., fast particles in a bubble chamber, or macroscopic bodies. One must first realize that the spread in the Heisenberg relations is not measured relative to the energy or momentum of the subject itself. Hence the spread of energy of a high energy system can be very large compared to the spread of a mono-energetic electron, and still be extremely small with respect to the total or kinetic energy of the system itself. The former means that the system can be sharply localizable (both temporally and spatially), while the latter means that the subject apparently has a very precise value for its energy, because the spread is so small relative to the total energy. The wave phenomena become determinable only if the spread is comparable to the energy, relative position, etc., of the subject itself. Thus, in principle, a planet’s motion must also be described as that of a wave packet, but, as yet, there are no experiments to show this. Nevertheless, the wave theory is in principle not limited to small subjects, and is therefore of a general, modal character.

 

On the other hand, it has special consequences as soon as the momentum p, for example, is of the order of Δp. If an electron is restricted to a limited spatial region (e.g., a hydrogen atom) the mean value of p has a smallest value determined by Heisenberg’s relations, and thus a smallest energy. If this spatial region can be extended (i.e., if the electron no longer belongs to one hydrogen atom, but to a molecule of two atoms), the electron can decrease its momentum, and thus its energy. This exchange bonding or covalent bonding explains why hydrogen is a diatomic molecule. By a similar argument it can be explained why an electron cannot exist as an independent particle in an atomic nucleus. Its total energy as determined by Heisenberg’s relations would be more than its rest mass. The much heavier mesons can exist independently in a nucleus for a short period of time.
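
The argument about the electron in a nucleus can be checked by an order-of-magnitude estimate. This is a rough sketch: it takes the momentum to be of the order h/Δx, as in the relations above, and assumes round values for the sizes of a nucleus and an atom. The names and sizes are illustrative.

```python
import math

H = 6.62607015e-34         # Planck's constant in J s
C = 2.99792458e8           # speed of light in m/s
MEV = 1.602176634e-13      # joule per MeV
ELECTRON_REST_MEV = 0.511  # electron rest energy in MeV (rounded)

def confinement_energy_mev(region_size_m, rest_energy_mev):
    """Total energy of a particle whose momentum is of the order h / size."""
    pc_mev = H * C / region_size_m / MEV
    return math.sqrt(pc_mev ** 2 + rest_energy_mev ** 2)

NUCLEUS = 1e-14  # assumed order of magnitude of a nuclear diameter, in metres
ATOM = 1e-10     # assumed order of magnitude of an atomic diameter, in metres

# Total energy of an electron confined to a nucleus, in MeV.
print(confinement_energy_mev(NUCLEUS, ELECTRON_REST_MEV))
# Kinetic energy of an electron confined to an atom, in MeV.
print(confinement_energy_mev(ATOM, ELECTRON_REST_MEV) - ELECTRON_REST_MEV)
```

On these assumptions an electron confined to a nucleus would have an energy of more than a hundred times its rest energy, whereas confinement to an atom costs only a small fraction of it.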

 

Another consequence of the Heisenberg relations has already been mentioned (5.3). If a system is isolated only during a short time Δt, the conservation law of energy has a restricted validity. Energy is now constant within the limits of ±ΔE=±h/Δt. For macroscopic systems, this amount is immeasurably small compared to the total energy E. But this inaccuracy has detectable consequences for some subnuclear processes. If the life time of some excited state is Δt, the energy of this state is only determined within ΔE=h/Δt.

 

 

 

7.7. Interference and Huygens’ principle

 

 

 

A wave packet is a superposition of waves with different frequencies and wave lengths (the wave length is inversely proportional to the absolute value of the wave vector). Interference of waves occurs if waves of the same frequency are added, meaning that the amplitudes of the waves are added in the manner of complex numbers, i.e., by taking into account the phase relations. The phenomenon of interference is the basis of Christiaan Huygens’ principle, according to which a propagating wave signal can be decomposed into spherical waves.[12] Every point of space is assumed to be the centre of an expanding wave. The actual motion of the signal is the superposition of all these spherical waves with their different amplitudes and phases, which are thus determined by the initial and boundary conditions, as is the case with the above mentioned plane waves. This illustrates the arbitrariness of the choice of the basis for the decomposition of an actual wave packet.
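
Adding amplitudes "in the manner of complex numbers" can be shown in a few lines. In this sketch each wave of a common frequency is given by an amplitude and a phase; the function names are illustrative.

```python
import cmath
import math

def superpose(waves):
    """Add waves of the same frequency, each given as (amplitude, phase)."""
    return sum(cmath.rect(amplitude, phase) for amplitude, phase in waves)

def intensity(waves):
    """Squared modulus of the summed complex amplitude."""
    return abs(superpose(waves)) ** 2

in_phase = intensity([(1.0, 0.0), (1.0, 0.0)])         # constructive
opposite = intensity([(1.0, 0.0), (1.0, math.pi)])     # destructive
quarter = intensity([(1.0, 0.0), (1.0, math.pi / 2)])  # in between
print(in_phase, opposite, quarter)
```

Two equal waves yield four times the single intensity when they are in phase, practically nothing when they are in opposite phase, and twice the single intensity at a phase difference of a quarter period.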

 

The spherical waves used in this case are less easy to handle mathematically. For instance, it is difficult to prove that light moves approximately rectilinearly and in one direction. This result was only achieved in the 19th century by Augustin Fresnel and Gustav Kirchhoff. This makes it understandable why Isaac Newton’s corpuscle theory of light propagation was favoured above Christiaan Huygens’ theory for over a hundred years (T&E, 6.4). Finally, the experiments of Thomas Young and Fresnel proved the possibility of interference which cannot satisfactorily be explained in Newton’s theory. In addition, Armand Fizeau showed that Newton’s theory gave a wrong value for the speed of light in a medium.

 

The plane wave representation is favoured in the description of pure kinetic motion because it reflects the temporal and spatial homogeneity and isotropy which is assumed. In Huygens’ theory only temporal homogeneity is assumed. Therefore, the frequency is still related to energy and remains invariant. But, because every spherical wave has a singular point, it lacks spatial homogeneity, and therefore there is no relation to linear momentum. As a consequence, Huygens’ representation is especially fruitful for the description of the wave’s interaction with rigid bodies in a spatial sense: reflection against a wall; refraction through a boundary between two media in which the velocity is different; diffraction by a slit in a wall (or a hole, or several slits, or a grid). In all these cases the physical details of the interaction are neglected. Only the change of motion due to the spatial environment is considered, by the study of the effect of the interference of the spherical waves in the neighbourhood of these spatial structures.

 

Huygens’ principle is very successful in the solution of problems of this kind, most of which cannot be solved on the basis of Newtonian mechanics. In fact, it is mainly because of these phenomena (especially diffraction) that the wave theory of motion is accepted. The physical community has become especially convinced of the correctness of the wave theory through interference phenomena. Interference causes photons or electrons to be in positions unexpected by Newtonian mechanics (and, conversely, these are not present at positions expected by Newtonian mechanics).

 

Plane waves must first be decomposed into spherical waves before such experiments can be explained. This is possible (and also the reverse: the decomposition of spherical waves into plane waves) because the spherical waves, as well as the plane waves, form complete sets, such that they can serve as a basis for the decomposition of the solutions of the wave equation. These two possibilities are not exclusive and are only two instances of an infinitude of possibilities. Thus in atomic and solid state physics one often uses a more limited set of plane waves, spherical waves, or even combinations of them.

 

More than anything else, Huygens’ principle shows the anticipatory character of the wave theory of motion. It can only manifest itself in the interaction of the particle with a rigid body, but in the mathematical description (which is only concerned with the motion of the particle) one completely abstracts from all the physical details of this interaction. The wave theory is a kinematic theory, developed anticipating the physical relation frame.

 

 

 

7.8. The wave-particle duality

 

 

 

The distinction of waves as objects and wave packets as subjects in the kinetic relation frame shows that there is not really a wave-particle duality in a kinetic sense. Any physical and kinetic subject can only be represented as a wave packet, which has some of the characteristics of a particle – namely, a more or less precise position, momentum and energy. However, since the 19th-century mechanist doctrine maintains that all physical phenomena must be reduced to the motion of unchangeable pieces of matter, an elementary particle is defined (or rather deified) as a mass point having definite values at a particular time for its position, energy, and momentum. This philosophically coloured idea of a particle clashes with the concept of a wave packet. This is the reason why wave theory and especially the Heisenberg relations have been the subject of so many discussions.[13] It should be stressed that the wave packet is only a modal and anticipatory description of moving, physically qualified subjects. Therefore it is limited in two respects.

 

First, the wave packet itself does not describe interactions in a subjective sense. This is in sharp contrast to the classical concept of moving particles, whose extension was supposed to be impenetrable (wave packets are far from that). The assumed impenetrability gives rise to the possibility of collisions between the particles – in fact the only kind of interaction admitted in Cartesian mechanics (T&E, 3.4). Huygens’ principle only enabled him to give an objective description of the kinematical consequences of a very simple kind of interaction – namely, an interaction in which the wave packet collides with a rigid spatial system, and all physical details are disregarded. Hence, the collision between two atoms can be described by the wave theory only after the typical structure of the interaction is translated into spatial terms (the collision cross section). However, real interactions, especially those in which the internal state of one system is changed (e.g., if a system is absorbed), cannot be understood within the framework of wave theory alone.

 

Secondly, the wave theory gives a modal description and therefore discards all typical properties of the described subjects. It is quite irrelevant whether one is dealing with electrons or light quanta if Huygens’ principle is applied. The diffraction patterns made by a beam of light or by electrons of comparable wave lengths passing through a hole or a crystal are similar. This was predicted by Louis de Broglie in 1923 and confirmed soon afterwards by Clinton Davisson and Lester Germer.[14]

 

The kinetic character of diffraction and interference also manifests itself in the two-slit experiment in which a wave packet is split up. Interference of the two parts occurs as soon as they meet each other again.[15] It should be emphasized that in a kinematic sense the splitting of a wave packet into two parts (after passing a screen with slits, for instance) is no problem. It appears that after this transition one has two wave packets, two subjects spatially divided. Indeed, in a spatial sense, one cannot speak of one subject, if its parts are not connected. But a wave packet is a kinetic subject, and its parts must be connected not spatially, but kinetically. Indeed, the two parts of the wave packet, after passing the double slit, are kinetically coherent. The well-known interference phenomena are explained by assuming that the two parts of the wave packet have well-determined phase relations.
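
The effect of a fixed phase relation between the two parts can be sketched for an idealized two-slit arrangement. The sketch assumes a single wavelength instead of a full wave packet, point-like slits, and hypothetical dimensions (a wavelength of 500 nm, a slit separation of 0.1 mm, a screen at 1 m). The function name is illustrative.

```python
import cmath
import math

def two_slit_intensity(x, wavelength, slit_separation, screen_distance):
    """Intensity at position x on the screen from two coherent point-like slits."""
    k = 2.0 * math.pi / wavelength
    r1 = math.hypot(screen_distance, x - slit_separation / 2.0)
    r2 = math.hypot(screen_distance, x + slit_separation / 2.0)
    # The two parts keep a fixed phase relation, so their amplitudes are added.
    return abs(cmath.exp(1j * k * r1) + cmath.exp(1j * k * r2)) ** 2

WAVELENGTH = 500e-9  # metres (illustrative)
SEPARATION = 1e-4    # metres (illustrative)
DISTANCE = 1.0       # metres (illustrative)

# Positions on the screen: centre, half a fringe spacing, one fringe spacing.
for x in (0.0, 2.5e-3, 5.0e-3):
    print(x, two_slit_intensity(x, WAVELENGTH, SEPARATION, DISTANCE))
```

With these values the intensity alternates between about four times the single-slit value and almost zero over a distance of 2.5 mm on the screen, although the two parts are spatially separated at the slits.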

 

The diffraction experiments especially emphasize the fact that the wave theory cannot account for the individuality of the particles and their individual interactions. The waves, rather than the particles, interfere in diffraction, reflection or refraction. This was not always clearly recognized. At first, some people tried to explain these phenomena by assuming that different particles interfere. But experiments soon showed that, if one has a very dilute beam with only one particle at a time in the apparatus, one still has the same diffraction pattern.

 

Thus one assumed that the waves in a single particle interfere with each other. But even this is objectionable. For example, interference between the beams emerging from two lasers is possible. In this case one can also dilute the beams such that no particles are present roughly 90% of the time and one particle is present roughly 10% of the time. The interference phenomena were decisively different if both lasers were open or if one was closed. Thus a particle emerging out of one laser interferes with the field of the other one, even when there are no particles coming out of the latter.[16]

 

The wave theory itself cannot give a full account of the individual behaviour of the described subjects. It has to be supplied with an interpretation which is no longer of a purely modal character. On the one hand, one has to give a probability interpretation of the waves describing the motion of the particle. The theory of probability is also an anticipatory one, which explains its strong connection with wave theory. On the other hand, it must be shown that the physical concept of a particle refers to a typical structure (chapter 11).

 



[1] This is the time needed to have the temperature difference decrease by a constant factor, the exponential unit (e = 2.718 …)

[2] Jammer 1966, 237ff; Hanson 1959, 450ff; Tolman 1938, 42.

[3] Margenau 1950, 182.

[4] Beth 1944a, 132ff.

[5] φ is an arbitrary unspecified number. It is sometimes called the phase, but just as often this name is used for the whole argument r+ct+φ. Note that r=|r|.

[6] Messiah 1958, 149.

[7] The Schrödinger equation as given above (7.3) must be slightly adapted to account for the occurrence of h.

[8] Margenau 1950, 150ff; Jammer 1954, 184; Russell 1927, 42.

[9] Heisenberg 1930; Jammer 1966, 323ff.

[10] Bunge 1967a, 267ff.

[11] Messiah 1958, Chapters 4 and 8.

[12] Huygens 1690.

[13] See, e.g., Bohr 1949; Jammer 1974, Chapter 3; Klein 1970; Margenau 1950, Chapter 16; Reichenbach 1951, Chapter 11; Price, Chissick 1977.

[14] Jammer 1966, 246, 251; Klein 1964.

[15] See on interference experiments Feyerabend 1962, 199ff; Jauch 1968, 112ff; Bohr 1949; Fine 1972; Reichenbach 1944, 24-32.

[16] Pfleegor, Mandel 1967.

 

Part I, chapter 8

 

 

Individuality and probability

 

 

 

  

 

8.1. Determinism

 

 

 

Section 1.2 stated as the first basic problem of science: Are there general modes of experience which provide an order for everything within the creation, and if so, which are these universal orders of relation? Chapters 2-7 supplied an answer to this question by studying the first four modal aspects and their retrocipations and anticipations both on the law side and the subject side. However, both mathematics and physics are not only concerned with modal laws and subjects, but also with special laws like those of electromagnetism, and typical structures like that of the copper atom.

 

In other words, physics is also confronted with the second basic problem: How can stable things exist, and how can they change? Before the discussion of this question in part II, chapters 8 and 9 will pay attention to the law-subject relation for individual systems, leading to the theory of probability. Statistics applies the theory of probability to the properties of a collection or ensemble of systems with the same typical structure.[1] For the time being it will not be necessary to know which structure that would be.

 

The dynamic development of nature strongly depends on the existence of random processes. In a classical context this will be discussed in chapter 8, whereas chapter 9 is concerned with quantum physics. In the present section determinism will be critically reviewed, both in classical and in quantum physics.

 

 

 

Determinism in classical physics

 

The necessity of using probability in physics was not always recognized. Until the beginning of the 20th century, classical mechanics served as a deterministic prototype of the physical sciences. Mechanics is almost exclusively concerned with motion as a mode of being of physically qualified subjects. Abstraction took place on the subject side from all concrete properties which do not relate to the kinetic aspect. Each concrete thing is thereby reduced to a modal moving subject. Because it remains physically qualified, nevertheless, the retrocipatory aspects of interaction (mass, energy, force) have to be included. The simplest objects of mechanics are mass points, with forces acting between them.[2]

 

In a deterministic interpretation this kinematic aspect is absolutized. All other aspects, which together with the kinematic one determine concrete reality, are ignored or dismissed as secondary qualities. When mass, position, velocity, and external circumstances (seen as forces or force fields) are given at a specific point at a certain time, motion is fixed with relation to past and future. Even contemporary authors characterize particles as being localizable.[3]

 

On the law side a correction of this rigorous functionalistic determinism was offered by classical chemistry, whose basis was laid by Joseph Priestley, Antoine Lavoisier, John Dalton, Jöns Berzelius and others since the turn of the 18th century. It differed from mechanics in that it ascribed typical properties to its objects, the elements consisting of similar atoms, and the chemical compounds consisting of similar molecules.

 

In physics a merely modal, deterministic approach first began to fail on the subject side. The individuality of atoms and molecules made its entry, first in statistical mechanics, then in radioactivity, and finally in Brownian motion. In chemistry essentially probabilistic reasoning underlies the law of mass action in chemical equilibrium established by Cato Guldberg and Peter Waage (1864).

 

However, both chemists and physicists still believed in determinism. Statistical methods were only used for practical reasons because a fully deterministic calculation of the motion of the many particles constituting a gas was (and is) beyond human capabilities.[4] Although radioactivity was considered to be a mystery, at the turn of the 20th century physical scientists were still confident that it could be solved along deterministic lines, i.e., by a modal theory.

 

 

 

Indeterminism in quantum physics

 

All this changed as a result of the development of quantum physics, in which a better distinction is made between an individual system and its state. This state has, in a certain sense, a latent character for an isolated system, manifesting itself only if the system interacts with another one – for instance, but not exclusively, a measuring apparatus. According to quantum physics, the individual state of the system does not exactly determine the result of the interaction. The initial and final states of the system are not related in a purely modal, determined way, but by means of a probability law.

 

This so-called stochastic relation is therefore not lawless. The probabilities of the joint initial and final states as numerical predicates of possible interactions are determined by the typical structure (the law) for the interacting systems. There are many different interpretations of this state of affairs, three of which we shall briefly discuss.

 

A small number of physicists (among others, Albert Einstein,[5] Erwin Schrödinger, David Bohm[6], and Louis de Broglie) remained loyal to determinism and therefore hypothesized the existence of (as yet) unknown determining factors (called hidden variables). In his mathematical analysis of quantum physics, John von Neumann[7] has shown that hidden variables cannot weaken the indeterministic structure of quantum physics (if the latter is correct). Physicists who still consider determinism, or rather a purely modal theory, as exclusively acceptable, are forced to assume that although the quantum physical formalism accurately describes the phenomena, it is nevertheless incorrect or incomplete. In principle this view cannot be contradicted, but it is not very convincing as long as its proponents have not succeeded in designing a theory along these lines.[8]

 

A majority of physicists emphasized the measuring process.[9] According to this view, one does not really know anything about a closed system. Only the results of measurement are verifiable, and during measurement the examined system cannot be isolated. But the result of measurement is not only determined by the character and the state of the system, but also by the action of the measuring instrument. This is called the measurement disturbance. Taken by itself, this phenomenon is not invented in quantum physics, of course. Also classical physics knew about errors in measurement, but physicists believed that in principle the measurement disturbance could be made arbitrarily small. In quantum physics, this is no longer tenable. The discovery that all moving subjects must be described with the help of wave packets implies that measurement disturbance cannot be arbitrarily small.

 

According to the so-called Copenhagen interpretation[10] – of which there are several variants – it is quite possible that an isolated system is completely determined. However, this is considered to be a meaningless proposition because it is not experimentally verifiable. Within this concept, the problem of individuality of physical systems is disposed of as an epistemological problem about the relation between observing subject and observed object.

 

Niels Bohr once observed that ‘… a not-further analyzable individuality … has to be attributed to every atomic process …’[11] The individuality of atomic processes belongs to the heart of Bohr’s interpretation of quantum physics.[12] However, in Bohr’s view this individuality is not intrinsic to physical systems and processes, but arises from the relation between a human subject and a sub-human object. I admit that observations and measurements are human acts, which besides the logical and psychic aspects also have a physical one. But it is only this physical aspect which one needs to take into account in the discussion of the limitations of measurement. It arises from the interaction between the object of measurement and the measuring instrument (eventually the human senses). This implies that the object cannot be considered isolated.[13] Theoretically, the study of isolated systems is preferred, but in measurements one observes a system while it is interacting with a measuring instrument. In this interaction one does not have to consider a subject-object relation of observer and observed system, but a subject-subject relation of two interacting physically qualified systems.[14]

 

According to a third interpretation the state function is not related to a single system, but represents the way in which an ensemble of similar systems is prepared. It is possible to determine the state function by means of a large number of measurements on the ensemble, but this procedure is meaningless for a single system, whose individuality must be ignored. The state function is an expression of our knowledge of the ensemble.

 

 

 

Critique

 

All three interpretations emphasize undeniable states of affairs. Two aspects which they have in common can be criticized.

 

First, they all refer, implicitly or explicitly, to the deterministic interpretation of classical physics without being sufficiently aware of its philosophical bias. In the first interpretation, the determining factors are taken to be unknown as yet. In the second it is posited that they cannot be measured if they exist. In the third one takes recourse to the ensemble because it is assumed to be fully determined. Hence there is, in effect, no break in principle with 19th-century determinism. For instance, Heisenberg posits that only its premise is invalid, i.e., the premise: ‘If at a certain moment position and velocity of all particles are known.’ [15] Heisenberg therefore does not consider determinism as incorrect, but rather inapplicable in quantum physics.[16]

 

Secondly, the mathematical formalism of physics, in fact, does not receive its due. It is generally accepted that the theory has a statistical character. The first interpretation mentioned above does recognize that the formalism describes the phenomena accurately, but it refuses to accept the conclusion that physical phenomena themselves have a stochastic character displaying individuality. The second interpretation misses the point that measurement disturbance has no significance for the calculation of the probable measurement results. The third interpretation ignores the fact that the mathematical formalism ascribes a state function to each separate system. Moreover, it must be observed that the application of statistical laws, for example, in quantum physics with respect to radioactivity assumes that the decay of different atoms constitutes statistically independent events.[17]

 

The underestimation of the mathematical formalism is not so strange because the formalism is generally considered to be merely a handy framework within which empirically discovered physical law structures can be summarized. After all, is not mathematics a free creation of the human mind? This is true as far as mathematics is a theoretical opening up of some modal aspects of temporal reality. But these are modal aspects of concrete reality which make its understanding possible. The mathematical formalism of quantum physics is more than a convenient representation of human knowledge of inorganic structures. It is the theory regarding their mathematical aspects, and an objectification of their physical aspect.

 

 

 

Individuality in physics and philosophy

 

Quantum physics does not prove that individuality may be attributed to physically qualified subjects, but it leaves room for such a conclusion. No special science can solve this philosophical problem. A scientific theory, seeking as a matter of course to stay close to empirical concrete reality, is able to display a deterministic structure excluding the possibility of individuality, but is also able to leave room for individuality. The former is the case with classical physics while the latter occurs in modern physics.

 

In itself it is correct that science takes distance from individuality. Science involves abstraction, and the first abstraction to be made is one from individuality. A solid state physicist will do many experiments with a single crystal, yet his interest is not directed to this one crystal, but extends either to the modal physical laws to which the crystal is subjected or to its typical structure. In the analysis of the results of his measurements he constantly abstracts from the subjective individuality of the object of measurement. In this respect quantum physics disregards individuality as much as classical physics did.

 

However, it has become necessary to account for the fact that natural phenomena cannot be completely described in a deterministic way. This is a philosophical matter, and before one can start its analysis, one has to make a choice concerning the individuality of natural subjects, whether it will be accepted as a matter of fact or not. According to determinists, the assumption of determinism in matter is less a result of science than a condition for it.[18] After posing the dilemma of natural necessity (fully determined by law) or chance (in the sense of absolute arbitrariness), they reject the latter.[19] In particular they reject the subjective individuality of, e.g., radioactive particles, each having separate existence.[20]

 

It is also possible to reject the dilemma,[21] replacing it by the correlation of law and subject, which cannot be reduced one to the other. Determinism reduces the subject to the law while pure chance eliminates the law. In my view, individuality is not an afterthought, a result of a conclusive analysis, but a premise for understanding physics.

 

 

 

8.2. Statistical measurements

 

 

 

In experimental physics measurements are usually repeated many times. Often every single measurement already yields a meaningful result, and one only repeats the measurement in order to improve on the accuracy by elimination of possible errors. In statistical measurements on the other hand, a single result has no immediate meaning. For instance, if one wants to determine whether a certain die is a fair one, a large number of trials have to be performed to find out whether the distribution of throws over the six possibilities confirms the typical law for a die.
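A minimal sketch of such a test follows, assuming a simulated die. The function names and the loaded weights are illustrative and not taken from the text. The throws are counted per face and compared with the uniform law by Pearson's chi-square statistic; 11.07 is the usual 5% critical value for five degrees of freedom.

```python
import random
from collections import Counter

def throw_counts(n_throws, weights=None, seed=0):
    """Simulate n_throws of a die; weights=None means a fair die."""
    rng = random.Random(seed)
    faces = [1, 2, 3, 4, 5, 6]
    throws = rng.choices(faces, weights=weights, k=n_throws)
    counts = Counter(throws)
    return [counts.get(face, 0) for face in faces]

def chi_square_uniform(counts):
    """Pearson chi-square statistic against equal probabilities for all faces."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

CRITICAL_5_PERCENT = 11.07  # chi-square distribution, 5 degrees of freedom

fair = throw_counts(6000)
# Hypothetical loaded die: the six is twice as likely as any other face.
loaded = throw_counts(6000, weights=[1, 1, 1, 1, 1, 2])

print("fair die:  ", fair, chi_square_uniform(fair))
print("loaded die:", loaded, chi_square_uniform(loaded))
```

A single throw tells nothing about either die. Only the distribution over many throws can be compared with the typical law, and even then the verdict is itself a statistical one.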

 

Until the end of the 19th century this type of statistical measurement was not very important in physics. What is usually called statistical physics does not owe its name to its measurement procedure, but to a theoretical explanation (with statistical means) of macroscopic properties assumed to be generated as the average result of the relative motion and mutual interaction of the composing molecules. For this reason the theory of measurement as discussed in chapter 3 may be called classical.

 

Statistical measurements, especially those in (sub-) nuclear, atomic and molecular physics, first became important in the discovery of radioactivity. Their importance was enhanced in the interpretation by Albert Einstein (1905) of the molecular motion discovered by Robert Brown (1827) and measured by Jean Perrin (1908), and the scattering experiments by Ernest Rutherford (1911).

 

Such an experiment may proceed in the following way. A number of atoms is prepared in the same initial state with the help of a so-called state selector. For instance, the atoms may all have the same initial momentum and energy (within the margins set by Heisenberg’s relations). This state is disturbed by some interaction with a scattering system. Finally one measures how the state of the system is changed – for instance, the angle of deflection is measured. In this way Rutherford determined the size of gold nuclei.

 

Generally, the atoms will not react in the same way to the disturbance. Therefore, this experiment must be repeated many times in order to find the spectrum of the measurement results, and the statistical distribution of this spectrum. The former shows us the possible final states for the interaction, whereas the latter is determined by the relative probability of a final state for a given initial state. The experiment is repeated for other initial states in order to determine the transition probability, connecting a certain initial state with a certain final state.

 

Thus in statistical measurements we have both a counting procedure (the determination of the statistical distribution) and a measuring procedure (the determination of the spectrum of some measurable property). The latter does not differ basically from what was discussed in chapter 3. Only if the spectrum is continuous does it have to be broken up into a discrete number of intervals, in order to make it possible to count the number of occurrences in each interval. Counting is not directly possible with respect to a continuous spectrum.
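The counting procedure for a continuous spectrum can be sketched as follows. The deflection angles are simulated from an arbitrary distribution chosen purely for illustration (it is not a physical scattering law), and the function names are hypothetical.

```python
import math
import random

def bin_counts(values, lower, upper, n_bins):
    """Count how many values fall in each of n_bins equal intervals of [lower, upper)."""
    width = (upper - lower) / n_bins
    counts = [0] * n_bins
    for x in values:
        if lower <= x < upper:
            # Guard against rounding at the upper edge of the last interval.
            index = min(int((x - lower) / width), n_bins - 1)
            counts[index] += 1
    return counts

def relative_frequencies(counts):
    """Turn counts per interval into relative frequencies of occurrence."""
    total = sum(counts)
    return [c / total for c in counts]

rng = random.Random(1)
# Hypothetical deflection angles in radians, restricted to [0, pi).
angles = [min(rng.expovariate(4.0), math.pi - 1e-9) for _ in range(10_000)]

counts = bin_counts(angles, 0.0, math.pi, 6)
freqs = relative_frequencies(counts)
print(counts)
print(freqs)
```

The choice of six intervals is arbitrary. Finer intervals resolve the spectrum better but leave fewer occurrences to count in each of them.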

 

 

 

8.3. Static theory of probability

 

 

 

The theory of probability first of all has to account for two things: the spectrum of possible properties (which are simultaneously possible, so that the spectrum displays a spatial ordering), and the statistical distribution of relative frequency of occurrence of these possibilities, which has a numerical character. This section briefly recalls the formal topological properties of the spectrum and its measure.[22]

 

Probability is a numerical measure over a set U of possibilities,[23] formally defined as a non-negative numerical measure P(A) for any sub-set A of U,[24] such that:

 

- P(A)≥0

 

- P(U)=1  (normalization)

 

- if A˄B=∅, then P(A˅B)=P(A)+P(B) – probability is an additive measure on the disjoint sub-sets of U.

 

This definition is sufficient to prove a number of theorems, such as:

 

- P(∅)=0

 

- 0≤P(A)≤1, for any sub-set A

 

- P(A˅B)=P(A)+P(B)-P(A˄B) for any two sub-sets A and B.

 

Two other important concepts are defined as:

 

- Conditional probability: if P(B)≠0, P(A/B)=P(A˄B)/P(B). Clearly, P(A) is just short for P(A/U).

 

- A is statistically independent of B if P(A/B)=P(A) and P(B/A)=P(B), i.e., if they have ‘no common cause’.[25]

 

This leads to the following theorems:

 

- if A and B are disjoint (A˄B=∅): P(B/A) = 0

 

- if A⊂B: P(A˅B)=P(B), P(A˄B)=P(A), P(A/B)=P(A)/P(B).

 

- if A and B are statistically independent: P(A˄B)=P(A).P(B).

 

 

 

Note that the property of statistical independence is a property of the spectrum and not of the statistical distribution.

 

This means that probability as a measure on a set U is not the only measure satisfying the above definitions and theorems. If U has a finite number u of elements, P(A) can be interpreted as the number of elements in the sub-set A, divided by u, i.e., the relative number of elements in A. Since this interpretation has the same formal properties as probability, the two are isomorphic. This isomorphy is the theoretical basis of the measurement of probability. It can also be used in the statistical definition of entropy.

 

If U is a spatial figure, and A is a spatial part of U, P(A) can be interpreted as the spatial magnitude of A (i.e., its length, area, or volume) relative to that of U. This formal relationship with probability is used in the statistical conception of a phase space (6.6, 9.1). If a set consists of n mutually statistically independent subsets, it can be projected onto an n-dimensional space. For instance, the possible outcomes of casting two dice simultaneously are represented on a 6×6 diagram.[26]
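The formal theory of this section can be checked on the two-dice example. The sketch below assumes the relative-number interpretation, with all 36 outcomes having equal weight; the names of the sub-sets are illustrative. Exact fractions are used so that the equalities hold without rounding.

```python
from fractions import Fraction
from itertools import product

# U: the 36 possible outcomes of casting two dice (the 6x6 diagram).
U = frozenset(product(range(1, 7), repeat=2))

def P(A):
    """Relative number of elements of the sub-set A in U."""
    return Fraction(len(A & U), len(U))

def P_cond(A, B):
    """Conditional probability P(A/B), defined only if P(B) is not zero."""
    return P(A & B) / P(B)

first_is_six = frozenset(o for o in U if o[0] == 6)
second_is_even = frozenset(o for o in U if o[1] % 2 == 0)
total_is_seven = frozenset(o for o in U if sum(o) == 7)
total_is_two = frozenset(o for o in U if sum(o) == 2)

# Normalization, and additivity for disjoint sub-sets.
assert P(U) == 1
assert P(total_is_seven | total_is_two) == P(total_is_seven) + P(total_is_two)

# General addition theorem for two arbitrary sub-sets.
A, B = first_is_six, second_is_even
assert P(A | B) == P(A) + P(B) - P(A & B)

# Statistical independence of the two dice.
assert P_cond(A, B) == P(A)
assert P(A & B) == P(A) * P(B)
```

Here `&` and `|` are Python's set intersection and union, corresponding to ˄ and ˅ in the text.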

 

 

 

8.4. Interpretations of probability

 

 

 

The formal system described above does not determine the probability function beyond its limits. For all sub-sets A not equal to ∅ or U, P(A) is only known to lie between zero and one. A further specification is needed which can only be found by studying the typical properties represented by the set U. P(A) is a measure or weight function of the sub-sets A relative to U. Three cases can be considered.

 

(a) It is often possible to assume on rational grounds that different sub-sets have equal weight, because of some symmetry relation. In this way simple problems can be solved, such as occur in dice or card playing, assuming that the dice are not loaded and the card players are honest. In several more complicated problems which occur in quantum physics, for example, the symmetry of the systems concerned can facilitate their solution.

 

(b) Sometimes it is possible to design a theory to calculate weights which are not equal because of symmetry. In classical statistical physics one finds the beginning of this approach (8.5). It is fully developed in quantum physics (chapter 9).

 

(c) If there is no theory available, the only way to determine the probability function is by experiment. Even in this case the law is not reduced to the subject side. Also frequency hypotheses based on statistical extrapolation, such as mortality tables, can only be used if they are assumed to represent some kind of regularity, since there is no logical justification for the conjecture that frequencies will remain constant, and thereby permit extrapolation.[27] Probably (without many exceptions), all statistics in the non-physical sciences is of this type. In the first two cases, (a) and (b), experiments also remain important, of course. Theories are never a priori, but hypothetical, and must therefore be checked experimentally.

 

The fact that the probability function depends on the typical structure represented by the set U, and that there are three possibilities of determining this function, has not always been recognized clearly enough. This may explain why there is so much disagreement about the interpretation of probability.[28]

 

 

 

Ontic and epistemic probability

 

Probability in an ontological context is often confused with the epistemological probability of a statement. Ontologically, probability does not refer to knowledge (or lack of it), but to the variation allowed by a character. Determinists assume that probability is an epistemological matter. Ontologically, any system would be completely determined by physical laws. Probability is only applied because of the investigator’s lack of sufficient knowledge of a system. Only in quantum physics are intrinsically stochastic processes acknowledged.

 

However, this view does not withstand scrutiny. Consider the simplest example, that of throwing a die. It is assumed that the outcome could be predicted if one knew the system in sufficient detail. If one pursues this path to the atomic level, one inevitably reaches a point where quantum fluctuations start to play a part. Therefore, if one accepts ontological indeterminacy at the quantum level, one has to accept it at a macroscopic level as well. One could not even say that for practical purposes, one could accept that the result of throwing a die is determined by physical laws, for the application of this principle to any practical case is virtually impossible. In fact, in any game of chance one had better start from a distribution of chances based on the symmetry of the game, and on the assumption that the actual process is stochastic.

 

Logicist philosophers who do not recognize the typical law determining the probability function and the set U, conceive of probability as a logical relation between propositions.[29] The theory is especially designed to give account of the inference of laws from empirical facts.[30] In my view, the outcome of experiments as described in section 8.2 reveals the individuality of the interacting subjects, and cannot be accounted for in a purely logical way. In science, probability does not describe our knowledge of physical systems, but their lawfully determined individual behaviour. Margenau rightly rejects this logicist interpretation as being irrelevant in science.[31]

 

 

 

Classical interpretation

 

The classical interpretation, drafted by Blaise Pascal, Abraham de Moivre, Daniel Bernoulli, Pierre-Simon Laplace and others, is directed to the first possibility described above. It is applicable if the symmetry of the problem allows us to find disjoint sub-sets A of U, such that these sub-sets have equal weight and together add up to U.[32] Therefore, the classical interpretation assigns equal probabilities to equally favourable cases. The founders of this theory were mainly inspired by games of chance, such as dice playing. This theory is also applied in classical mechanics (8.5). It clearly breaks down if no equally favourable possibilities can be found.

 

The classical view is sometimes criticized because of its alleged circularity, the equally favourable cases being definable because they have equal probabilities. But, as our examples show, in those cases covered by the theory the equally favourable cases are inferred from the symmetry of the systems. Thus the classical theory can only be criticized because of its limited scope, and is, in fact, still of great importance – e.g., in quantum physics (chapter 9).

 

 

 

The frequency interpretation

 

At the other extreme (clearly referring to the third possibility) one finds the definition of probability as a relative frequency of occurrence.[33] As a definition, it reduces the law to the subject side, or metric to measurement. Indeed, the measurement of probability can only be performed by determining the relative frequency of the occurrences of every possible case.[34] But it seems a somewhat defeatist reaction to the failure of the classical definition if it assumes that in no case a lawful metric for this probability can be found. Anyhow, such laws can be found in quantum physics.

 

 

 

Propensity

 

In his early publications Karl Popper[35] defended a variant of the latter view. Later he developed the classical theory into the ‘propensity interpretation’ of probability. It corresponds to the second possibility (b) mentioned above, but introduces ‘weighted’ instead of ‘equal’ probabilities. According to Popper, we have to

 

‘… interpret these weights of the possibilities (or of the possible cases) as measures of the propensity, or tendency, of a possibility to realize itself upon repetition’.[36]

 

 

 

My view comes quite close to Popper’s interpretation as far as classical probability is concerned. I distinguish the formal theory described in section 8.3 from the typical law which varies for different systems. Moreover I distinguish the law side, defining the set U of possible cases (the spectrum), and the probability function describing their weights, from the subject side (actual occurrences of the possible cases). These principles were applied in classical statistical mechanics. 

 

 

 

8.5. Classical statistical mechanics

 

 

 

The main application of probability theory in classical physics is statistical mechanics. It is based on the assumption that a gas, e.g., consists of a large number of similar molecules, which can only differ by their position, velocity, mass, and moment of inertia. The kinds of motion considered are linear motion and rotation (sometimes also vibration). Rotation and vibration are only considered in polyatomic molecules since atoms are supposed to be point-like. It is of interest to mention this because, if the finite extension of the atoms is taken into account, the method of classical statistical mechanics breaks down.

 

Statistical physics is often thought of as replacing thermodynamics, or at least providing its foundations (T&E, 7.5). But statistics is not a purely modal theory (because of the assumption of the existence of similar molecules) and therefore cannot be the basis for the entirely modal thermodynamical theory. On the other hand the latter is inferior to statistical mechanics which can, by its nature, be applied to typical problems. Statistical mechanics is also easily incorporated into quantum physics – which is required if one wishes to understand why classical statistical mechanics is applicable at all. Finally, thermodynamics is mainly retrocipatory (chapter 5), whereas statistical physics is anticipatory. In classical statistical mechanics there are two approaches, put forward mainly by James Clerk Maxwell, Ludwig Boltzmann, and Josiah Willard Gibbs. I shall briefly discuss these in order to show the application of the formal theory as discussed above.

 

Maxwell and Boltzmann

 

The starting point of the approach by Maxwell and Boltzmann is the so-called Maxwell distribution for the molecules in an ideal gas (T&E, 7.5).[37] The following assumptions are made. (a) The molecules are fully described by their position $\mathbf{r}$, velocity $\mathbf{v}$, and mass $m$. (b) The particles do not interact with each other. This implies that the probability of finding a particle with a certain value for $(\mathbf{r}, \mathbf{v})$ is independent of the positions and velocities of other particles. Thus it is sufficient to derive the one-particle probability function which must be multiplied by the number ($N$) of molecules in order to find the distribution function for the gas. (c) The distributions for $\mathbf{r}$ and for $\mathbf{v}$, respectively $f_1(\mathbf{r})$ and $f_2(\mathbf{v})$, are mutually independent: $f(\mathbf{r}, \mathbf{v}) = f_1(\mathbf{r})\, f_2(\mathbf{v})$. (d) There is equilibrium, which means that the distribution function is spatially homogeneous (if there is no external field) and isotropic. Homogeneity means that $f_1(\mathbf{r})\, d\mathbf{r} = \text{const} \cdot d\mathbf{r}$. The constant is found by normalization and is equal to the inverse of the volume $V$ of the gas: $f_1(\mathbf{r})\, d\mathbf{r} = (1/V)\, d\mathbf{r}$. Isotropy implies that the probability function is independent of the direction of the molecular velocities: $f_2(\mathbf{v}) = f_2(|\mathbf{v}|)$, or $f_2(\mathbf{v}) = f_2(|\mathbf{v}|^2) = f_2(v_x^2 + v_y^2 + v_z^2)$. (e) The three coordinates $v_x$, $v_y$, $v_z$ are mutually independent, meaning that $f_2(\mathbf{v}) = f_x(v_x)\, f_y(v_y)\, f_z(v_z)$.

 

These five assumptions are sufficient to show that

 

$$f_2(\mathbf{v}) = a\,\exp\!\left[-\tfrac{1}{2}\,m\beta\,(v_x^2 + v_y^2 + v_z^2)\right]$$
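
The step from the five assumptions to the exponential form can be sketched as follows. This is only one of several possible routes (others are mentioned in the literature cited in the notes); the auxiliary symbols $F$, $h$, $u_i$ and $c$ are introduced here for illustration and do not occur in the text. Continuity of the distribution is assumed.

```latex
% Isotropy (d): f_2 depends on v only through u = v_x^2 + v_y^2 + v_z^2.
f_2(\mathbf{v}) = F(v_x^2 + v_y^2 + v_z^2)

% Independence (e), evaluated with two components set to zero, gives e.g.
F(v_x^2) = f_x(v_x)\, f_y(0)\, f_z(0)
% and likewise for the other two components.

% Multiplying the three relations and using (e) once more,
% with u_1 = v_x^2,\; u_2 = v_y^2,\; u_3 = v_z^2:
F(u_1)\,F(u_2)\,F(u_3) = F(u_1 + u_2 + u_3)\,F(0)^2

% For h(u) = F(u)/F(0) this is the exponential functional equation
h(u_1 + u_2 + u_3) = h(u_1)\,h(u_2)\,h(u_3),
\qquad\text{whose continuous solutions are } h(u) = e^{-c u}.

% Normalizability requires c > 0; writing c = \tfrac{1}{2} m \beta and a = F(0):
f_2(\mathbf{v}) = a\,\exp\!\left[-\tfrac{1}{2}\, m \beta\,(v_x^2 + v_y^2 + v_z^2)\right]
```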

 

The factor a can be found by normalization, the factor ½mβ in the exponent by calculating the pressure P that such a gas would exert on the wall. One finds 1/β=PV/N, which means that β=1/kT because of the ideal gas law (the law of Boyle and Gay-Lussac): PV=NkT (T is the temperature, k is Boltzmann’s constant, whose value is determined only by the choice of units).
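
As a numerical illustration of the identification β=1/kT, the following minimal sketch draws velocities from the Maxwell distribution and compares the mean kinetic energy with (3/2)kT. It is not part of the original argument. The function names, the sample size, and the approximate mass of an argon atom are assumptions chosen for the example.

```python
import math
import random

K_B = 1.380649e-23  # Boltzmann's constant in J/K (SI value)

def sample_velocity(mass, temperature, rng):
    """Draw (vx, vy, vz); each component is Gaussian with variance kT/m,
    which is what exp(-m*beta*v**2/2) with beta = 1/kT amounts to."""
    sigma = math.sqrt(K_B * temperature / mass)
    return tuple(rng.gauss(0.0, sigma) for _ in range(3))

def mean_kinetic_energy(mass, temperature, n_samples=100_000, seed=1):
    """Average of m*v**2/2 over n_samples molecules."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        vx, vy, vz = sample_velocity(mass, temperature, rng)
        total += 0.5 * mass * (vx * vx + vy * vy + vz * vz)
    return total / n_samples

MASS_ARGON = 6.63e-26  # kg, approximate mass of an argon atom (illustrative)
T = 300.0              # K
ratio = mean_kinetic_energy(MASS_ARGON, T) / (1.5 * K_B * T)
print(f"mean kinetic energy / (3/2 kT) = {ratio:.4f}")
```

The ratio should come out close to one, with a statistical scatter that shrinks as the number of samples grows.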

 

It will be clear that the Maxwell distribution is found from symmetry arguments.[38]

 

Boltzmann recognized that the term in the exponent is just the kinetic energy of the molecule, divided by –kT. If we now introduce the concept of a state of the molecule, characterized by its velocity and position, we find that the relative probability of finding a particle in either one of two states with energies E1 and E2 is

 

$$\frac{P_1}{P_2} = \frac{\exp(-E_1/kT)}{\exp(-E_2/kT)} = \exp\!\left[-\frac{E_1 - E_2}{kT}\right]$$

 

This was generalized by Boltzmann (and it is still the foundation of all statistical physics) to any system in equilibrium consisting of molecules or other particles which can freely exchange energy. It is nothing but an a priori assumption concerning ‘equally favourable cases’. If two states have the same energy, their probability is the same. If they have different energies, their relative probability is given by the Boltzmann factor (as it is called). If the set of possible states has a continuous spectrum (which is the case for a classical gas, whose molecular states are labelled by continuous positions and velocities), it is the probability density which is determined by the Boltzmann factor.
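
A minimal sketch of the Boltzmann factor for two states may make the rule concrete. The function name and the example energies are hypothetical; energies are given in electronvolts.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann's constant in eV/K

def boltzmann_ratio(e1, e2, temperature):
    """Relative probability P1/P2 = exp(-(E1 - E2)/kT) of two states."""
    return math.exp(-(e1 - e2) / (K_B_EV * temperature))

# Two hypothetical states 0.05 eV apart, at room temperature.
r = boltzmann_ratio(0.05, 0.0, 300.0)
print(f"P1/P2 = {r:.3f}")
```

Equal energies give a ratio of exactly one, and the state with the higher energy is always the less probable one.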

 

 

 

Microstates and macrostates

 

Whereas Maxwell and Boltzmann considered one system consisting of many molecules, Josiah Willard Gibbs[39] studied an ensemble, an infinite number of systems similar in their structure and boundary values, but with different microstates (10.2). Above we defined the state of a single molecule as being characterized by its position and velocity. The microstate of a system of molecules is the juxtaposition of all molecular states, while the macrostate of the system enumerates its macroscopically determinable properties, such as volume, pressure, and temperature.

 

The microstate of a system can be represented by a point in a 6N-dimensional phase space, N being the number of molecules in the system. There is a many-to-one relationship between microstates and macrostates. Many microstates may correspond to a certain macrostate, but a microstate fully determines the corresponding macrostate. Therefore, if all microstates are equally probable, the relative probabilities of macrostates are proportional to the numbers of their corresponding microstates. In the case of a continuous spectrum of possibilities a macrostate can be represented by a region in the 6N-dimensional space of microstates, and its probability is proportional to the volume of this region (6.6).
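
The many-to-one relation can be illustrated with a deliberately simplified, discrete toy model, which is an assumption of this sketch and not the 6N-dimensional phase space itself: each of n molecules is either in the left or in the right half of a box, and the macrostate is the number of molecules on the left.

```python
from math import comb

def macrostate_probabilities(n_molecules):
    """Probability of finding k of n molecules in the left half of a box.

    Each molecule is 'left' or 'right', so there are 2**n equally probable
    microstates; the macrostate k collects comb(n, k) of them.
    """
    total = 2 ** n_molecules
    return [comb(n_molecules, k) / total for k in range(n_molecules + 1)]

probs = macrostate_probabilities(20)
most_probable = max(range(len(probs)), key=probs.__getitem__)
print(most_probable, probs[most_probable], probs[0])
```

Even for twenty molecules the evenly divided macrostate is far more probable than the one with all molecules on one side, purely because it corresponds to many more microstates.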

 

According to Gibbs all microstates are equally probable, as far as they are accessible by the system, i.e., as far as they are compatible with one or more restrictions or constraints. In the case of a completely isolated system, for which the energy is constant, Gibbs introduced the ‘microcanonical ensemble’. Here, all microstates with the same energy are equally probable (other states, not being accessible, have probability zero). For systems at constant temperature, for which the energy may fluctuate, Gibbs defined the ‘canonical ensemble’, in which the relative weight function for different microstates is the Boltzmann factor, $\exp[-(E_1 - E_2)/kT]$. Finally, if the number of molecules, as well as the energy, is not fixed, the ‘grand canonical ensemble’ applies, with the weight function $\exp\{-[(E_1 - E_2) - (N_1 - N_2)\mu]/kT\}$, called the Gibbs factor, wherein μ is the ‘chemical potential’. In this theory, the entropy, the free energy, and other important thermodynamic variables can easily be defined.
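
For a system with a small number of discrete microstates, the canonical weights can be normalized explicitly. The sketch below assumes a hypothetical list of microstate energies, measured in the same unit as kT; the function names are chosen for the example only.

```python
import math

def canonical_probabilities(energies, kT):
    """Normalized Boltzmann weights exp(-E/kT)/Z for a list of microstate energies."""
    e_min = min(energies)  # shifting all energies leaves the ratios unchanged
    weights = [math.exp(-(e - e_min) / kT) for e in energies]
    z = sum(weights)       # normalization constant (partition function)
    return [w / z for w in weights]

def mean_energy(energies, kT):
    """Ensemble average of the energy in the canonical ensemble."""
    probs = canonical_probabilities(energies, kT)
    return sum(p * e for p, e in zip(probs, energies))

levels = [0.0, 1.0, 1.0, 2.0]  # hypothetical microstate energies
p = canonical_probabilities(levels, kT=1.0)
print(p, mean_energy(levels, kT=1.0))
```

Microstates with equal energy receive equal probability, and at very low temperature only the lowest state contributes to the average.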

 

The approaches of Maxwell, Boltzmann and Gibbs are mentioned here, in the first place, to show the a priori character of their basic assumptions. These can only be justified ‘… by the correspondence between the conclusions which it permits and the regularities in the behaviour of actual systems which are empirically found.’[40]

 

This discussion also shows the typical and individual character of these theories. In both approaches, the similarity of the systems to be studied is a basic assumption. In the 19th century each molecule was assumed to be identified by its position and velocity at any time. But quantum physics has shown that the position of every individual molecule is not relevant, but only the distribution of all molecules over the accessible states. What is relevant is that a point in the six-dimensional phase space is occupied, not which molecule happens to be there.

 

 

 

The classical approach as seen from the viewpoint of quantum statistics

 

In quantum physics the symmetry of the state function for similar particles allows of two possibilities. A single molecular state can be occupied by at most one particle if it is a fermion, or by an unlimited number of particles if they are bosons (11.6). The Maxwell-Boltzmann distribution is now only a limiting case (for very small occupation probabilities) of the more fundamental distribution functions: Fermi-Dirac, for fermions, and Bose-Einstein, for bosons. Whether a particle is a fermion or a boson is determined by its typical structure.
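
The limiting behaviour can be checked numerically from the standard mean occupation numbers, written as functions of x=(E–μ)/kT. The function names are assumptions of this sketch.

```python
import math

def fermi_dirac(x):
    """Mean occupation of a state by fermions; never exceeds one."""
    return 1.0 / (math.exp(x) + 1.0)

def bose_einstein(x):
    """Mean occupation of a state by bosons; defined for x > 0 only."""
    if x <= 0.0:
        raise ValueError("Bose-Einstein occupation requires E > mu")
    return 1.0 / (math.exp(x) - 1.0)

def maxwell_boltzmann(x):
    """Classical occupation, the common limit for small occupation."""
    return math.exp(-x)

for x in (0.5, 2.0, 10.0):
    print(x, fermi_dirac(x), maxwell_boltzmann(x), bose_einstein(x))
```

For large x, that is for very small occupation probabilities, the three values become practically indistinguishable, while for small x they differ markedly.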

 

Once again this shows that statistical physics is not a fully modal theory, although it has very general features. In fact, the correct derivation of the classical Maxwell-Boltzmann distribution can only be given from quantum statistics because the classical assumption of the complete identifiability of the molecules in kinetic terms leads to an overestimation of the number of possible microstates.[41]

 

There are more considerations indicating that the classical approach can only be justified by quantum physics. One is the assumption that monatomic molecules have only three degrees of freedom (i.e., the number of coordinates necessary to specify the molecule’s relative position), whereas diatomic molecules, for instance, have two additional degrees of freedom.[42] Especially when the internal structure of atoms consisting of a nucleus and several electrons was discovered, it became clear that this assumption is incomprehensible from a classical point of view.

 

However, quantum physics accounts for the existence of discrete energy levels which are dependent on the internal structure of atoms and molecules. The electronic levels are widely spaced, such that electronic transitions from the ground state to the first or higher excited states do not occur at normal temperatures. But rotational states for diatomic molecules are less widely spaced, and therefore they can be excited easily at room temperature. This has consequences for the heat capacity of a gas at constant volume, which is nearly equal to (3/2)Nk for a monatomic gas, as well as for a diatomic gas at 50 K, whereas this value increases to (5/2)Nk for diatomic gases at higher temperatures. Both the temperature dependence of the specific heat, and more fundamentally, the applicability of classical statistics to normal gases, can therefore be understood only from the quantization of energy levels according to quantum theory (chapter 9).[43]
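
The freezing out of the rotational degrees of freedom can be made plausible with a simplified model, assumed here for illustration: a rigid rotor with levels J(J+1)kθ and degeneracy 2J+1, where θ is a characteristic rotational temperature. Nuclear-spin effects, which matter for hydrogen, are ignored. The rotational contribution to the heat capacity per molecule, in units of k, equals the variance of E/kT.

```python
import math

def rotational_heat_capacity(t_over_theta, j_max=200):
    """Rotational heat capacity per molecule, in units of k, for a rigid rotor.

    t_over_theta is the temperature divided by the rotational temperature.
    """
    z = e1 = e2 = 0.0
    for j in range(j_max + 1):
        x = j * (j + 1) / t_over_theta   # level energy divided by kT
        w = (2 * j + 1) * math.exp(-x)   # degeneracy times Boltzmann factor
        z += w
        e1 += w * x
        e2 += w * x * x
    mean = e1 / z
    return e2 / z - mean * mean          # variance of E/kT

for ratio_t in (0.1, 0.5, 1.0, 10.0):
    total = 1.5 + rotational_heat_capacity(ratio_t)
    print(f"T/theta = {ratio_t:5.1f}   heat capacity per molecule = {total:.3f} k")
```

Well below θ the total stays near (3/2)k, and well above θ it approaches (5/2)k, in line with the values quoted above.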

 

 

 

8.6. The physical qualification of probability

 

 

 

Although probability is presented as a numerical measure over a set of possibilities, it is also physically qualified in classical as well as in quantum theories. In either case one of the more or less probable possibilities must be actualized. This actualization only occurs in some interaction – for example, shuffling cards, throwing dice, interactions in classical and quantum physics. The temporal order of possibility and its actualization is clearly asymmetrical, anticipating irreversibility.

 

 

 

The ergodic problem

 

The theories of Boltzmann and Gibbs lead to a description of the equilibrium state of a system as the most probable state. Ludwig Boltzmann explained the irreversible approach to equilibrium by the assumption that any actual system will proceed through all accessible states, such that the spatial average in phase space is equal to the temporal average for a single system. This so-called ergodic theorem (or a weaker quasi-ergodic theorem, according to which every accessible microstate will be approached arbitrarily close after some time) has been the subject of intensive mathematical research, but cannot be proved except for very simple systems under severe restrictions.[44]

 

Apparently, this problem cannot be solved in a purely modal theory, because it only has meaning if the systems in the spatial ensemble all have the same structure, whereas in calculating the temporal average it is assumed that the system retains its typical individuality during its passage through all possible states. The fact that the two averages must be the same is therefore not something which must be proved, but lies at the basis of all statistical methods. It assumes that the same system has a constant typical structure, or that similar systems are subjected to the same structural law. It says that the typical law is valid during any time, and for all systems under consideration.

 

 

 

Internal interactions

 

The calculation of the entropy and related properties of a system is usually possible only for simplified systems of non-interacting molecules, such as the ideal gas, or the linear chain of magnetic molecules.[45] It is remarkable that such a system will not do the job. Because the molecules do not interact with each other, the microstate of the linear chain will never change, and in a perfect gas mixture, there is no diffusion. If the microstate happens to correspond to a non-equilibrium macrostate (e.g., due to its preparation), it will never go to equilibrium as every actual system does. Thus we assume that there is some interaction between the molecules, small enough not to destroy the results of the calculation, but large enough to change the microstates so rapidly, that the temporal average may be equated with the calculated spatial phase average for the system.

 

Boltzmann systematically introduced the interaction in the six-dimensional phase space.[46] For an arbitrary (because unknown) interaction he substituted a collision probability function for pairs of molecules, which gives, for given positions and velocities before the collision, the probability of a certain change in velocity caused by the interaction. The theory leads to the so-called Boltzmann equation, which can account for many phenomena like viscosity, diffusion, and thermal and electric conduction, if certain assumptions are made concerning the interaction. This approach is also useful in quantum physics. In this formalism a function H (capital eta) can be defined, which decreases in time for any system consisting of interacting molecules until the system has reached its equilibrium state, in which H is constant. (This equilibrium state is again the Maxwell-Boltzmann distribution.)[47]

 

The function H can be connected to the entropy of the system, and therefore Boltzmann’s theory was hailed as deriving the irreversible approach to equilibrium from reversible kinematics (chapter 6, T&E, 7.5). Here it suffices to observe that this derivation depends essentially on the interaction between the molecules, and therefore is not of purely kinematic character. Moreover, the derivation makes use of probability theory and must then distinguish between actual states in the past, and possible states in the future, meaning that irreversibility is presupposed from the start.

 

Because of the difficulties inherent in Boltzmann’s and Gibbs’ approaches as to the explanation of irreversibility, modern treatises no longer try to derive irreversibility from essentially mechanical systems. Rather, irreversibility is introduced from the outset. This is especially done in the form of Markov processes, in which the probability of each state of a system is determined by the immediately preceding state. This approach also has its difficulties, especially because of the continuity of time. On the other hand, however, it has possibilities not shown by the classical methods.[48]
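
How irreversibility is built into a Markov description can be seen in a toy example with discrete time steps, which is a simplification and sidesteps the continuity problem just mentioned. The three-state transition matrix below is hypothetical and chosen to be symmetric, so that the stationary distribution is uniform; the quantity H is here taken as the sum of p ln p over the states.

```python
import math

def step(p, m):
    """One step of a discrete-time Markov chain: p_new[j] = sum_i p[i] * m[i][j]."""
    n = len(p)
    return [sum(p[i] * m[i][j] for i in range(n)) for j in range(n)]

def h_function(p):
    """H = sum of p ln p over the states (terms with p = 0 contribute zero)."""
    return sum(x * math.log(x) for x in p if x > 0.0)

# Hypothetical symmetric transition matrix; each row sums to one.
M = [[0.8, 0.1, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.8]]

p = [1.0, 0.0, 0.0]          # fully specified initial state
history = [h_function(p)]
for _ in range(50):
    p = step(p, M)
    history.append(h_function(p))

print(p, history[0], history[-1])
```

In this example H never increases from one step to the next and the distribution settles at equal probabilities, but the direction of time is put in by the rule itself, not derived from reversible mechanics.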

 

 

 

Randomness

 

Finally, in all applications of probability theory the initial state forms a separate problem, at least in physics.[49] Although the initial state may be partly determined by some previous interaction or preparation, it necessarily has an amount of disorder, ‘molecular chaos’, or ‘randomness’.[50] Attempts to define randomness have never succeeded. It appears that one has to accept it as a primitive concept. For instance, when checking probabilities in dice playing, it is assumed that the way the dice are thrown does not influence the result on average. An honest card player is assumed to shuffle his cards at random. And in an opinion poll one has to strive for a representative sample. There are criteria to avoid biased samples, but there is no universal criterion to establish a completely random sample.

 

Randomness may be considered another expression of the individuality of the systems concerned, which cannot be fully delimited by specifying some of their properties. On the one hand, complete randomness does not exist. Statistical predictions can only be made with respect to systems of which at least something is known of their typical structure. On the other hand, probability without randomness is useless. In quantum physics the initial state determining the statistical distribution also contains an element of randomness. According to a theorem related to the Heisenberg relations, if any property is completely determined by its preparation, the ‘canonically conjugate’ property is completely random. In general, the initial state in quantum physics can be better specified than in classical physics. But even then it always contains an undetermined phase.

 



[1] Tolman 1938, 2, 43.

[2] Einstein 1949, 19ff.

[3] Akhieser, Berestetsky 1953, 17; Messiah 1958, 4, 138; Čapek 1961, chapter 14; Bunge 1967a 108, 24.

[4] Reichenbach 1956, 56.

[5] Einstein 1949, 82ff; Klein 1964, 1970; Hooker 1972.

[6] Bohm 1957.

[7] Von Neumann 1932; see also Jauch 1968, chapter 7; Jammer 1966, 366ff; 1974, 265ff.

[8] On the completeness of quantum theory, see Jammer 1966, 366ff.

[9] See Bohr 1934, Introduction: ‘The aim of science is to extend as well as to order our observations …’.

[10] Heisenberg 1958, chapters 3 and 8; Losee 1964; Hanson 1959.

[11] Jammer 1966, 347; see Bohr 1949, 209, 223, 230.

[12] Meyer-Abich 1965, 102.

[13] Bridgman, in Henkin et al. (eds.), 229.

[14] Čapek 1961, 303f.

[15] Heisenberg 1955, 29: ‘… dass die unvollständige Kentnis eines Systems ein wesentlicher Bestandteil jeder Formulierung der Quantentheorie sein muss.’ For a criticism of this view, see Popper 1967.

[16] Jammer 1966, 330; 1974 75ff; see also Heitler 1949, 192.

[17] Cp. Hempel 1965, 392.

[18] Van Melsen 1946, 138ff; 1955, 148ff, 271ff. The view that determinism is instrumental for any science is also expressed by Claude Bernard, cf. Kolakowski 1966, 90ff.

[19] Van Melsen 1946, 157ff; 1955, 285ff.

[20] Van Melsen 1955, 300.

[21] Čapek 1961, 338ff.

[22] In the set U of all possibilities (the ‘universe of discourse’, or ‘sample space’) having sub-sets A, B, … one distinguishes the union of two sub-sets $A \vee B$ and the intersection of two sub-sets $A \wedge B$. An element of U is an element of $A \vee B$ if it is an element of A, or of B, or both. It is an element of $A \wedge B$ if it is an element of both A and B. We call A and B disjoint if $A \wedge B = \emptyset$, the empty set containing no element. A is a subset of B, or B includes A ($A \subset B$), if $A \vee B = B$, or $A \wedge B = A$. We call –A the complement of A if $(-A) \vee A = U$ and $(-A) \wedge A = \emptyset$. $A \vee B$, $A \wedge B$, and –A are sub-sets of U if A and B are. The following set-properties can easily be derived: $A \wedge B = B \wedge A$; $A \wedge U = A$; $A \vee U = U$; $A \wedge A = A$; $A \vee B = B \vee A$; $A \vee A = A$; $A \vee \emptyset = A$; $A \wedge \emptyset = \emptyset$; $-U = \emptyset$; $-(-A) = A$. These definitions and properties do not define a group, but a so-called Boolean algebra. Boole 1854; Suppes 1957, 202ff; another approach is that of a Borel set.

[23] Nagel 1939, 92 ff; Bunge 1967a, 89-93; Popper 1959, 326ff; 1967; Jauch 1968; Hempel 1965, 386ff; Hesse 1974, Ch. 5; Suppes 1957, 274-291.

[24] Observe that the theory ascribes a probability to the subsets, not to the elements of a set.

[25] Reichenbach 1956, 157ff.

[26] Genetics calls this a Punnett-square, after R.G. Punnett (1905). If E is a spatial figure with unit magnitude, p(A) is the magnitude of a proper part of the figure. Hence, so far the theory is not intrinsically a probability theory.

[27] Popper 1959, 168f.

[28] Braithwaite 1953; Carnap 1950; Jammer 1974, 7; Nagel 1939; Margenau 1950, chapter 13; Poincaré 1906, chapter 11; Popper 1967.

[29] Keynes 1921; Jeffreys 1939; Hesse 1974.

[30] See Hempel 1965, 57ff, 381ff, 385: A mathematical theory of ‘inductive probability’ (as developed by Carnap) is only available for a relatively simple kind of formalized language; ‘… the extension of this approach to languages whose logical apparatus would be adequate for the formulation of advanced scientific theories is as yet an open problem’.

[31] Margenau 1950, 250ff; Popper 1967, 29.

[32] Popper 1959, 168.

[33] Von Mises 1939, 163-176; Reichenbach 1956, 96ff.

[34] Hempel 1965, 387 essentially supports the view of Von Mises and Reichenbach, although he criticizes their formulations. They define probability as the limit of the relative frequency in an infinite series of performances, and Hempel rightly observes that such series are not realizable. But this criticism does not touch the heart of the problem – namely, that the probability has both a law side and a subject side, and that the former cannot be reduced to the latter.

[35] Popper 1959.

[36] Popper 1967, 32; 1974.

[37] Maxwell 1860; Born 1949, 50f.

[38] Several details can be criticized, and there are other derivations, see Born 1949, 51ff; Tolman 1938, chapter 4.

[39] Kittel 1969; Tolman 1938, 43ff.

[40] Tolman 1938, 59; Kittel 1969, 34, 35; Popper 1959, 208.

[41] See e.g., Kittel 1969, 304-307, 390-392.

[42] This refers to possible rotations about two independent axes. The relative vibration of the two atoms leads to another degree of freedom.

[43] Mott 1964.

[44] Truesdell 1968, 360-363; Khinchin 1947, chapter 3; Tolman 1938, 65ff; Penrose 1970, 39ff; Reichenbach 1956, 78-81; Prigogine 1980, 33-42, 64-65; Sklar 1993, 164-194.

[45] Kittel, Chapter 2ff.

[46] See, e.g., Tolman 1938, chapters 5 and 6.

[47] As observed in 6.6, it is essential in this derivation that the state of a system be described by a domain (not a point) in state space.

[48] See, e.g., Penrose 1970.

[49] In biology, or sociology, the related problem is that of the ‘population’, the ‘Kollektiv’, or a ‘representative sample’.

[50] Hempel 1965, 386; Nagel 1939, 32ff; Popper 1959, 151ff, 359ff.

 

Part I, chapter 9

 

Probability in quantum physics

 

9.1. Emergence of the wave theory of probability

 

This section critically reviews the history of quantum physics. The development of its basic concepts involved a good number of years of concerted efforts on the part of many theoretical and experimental physicists in many countries. The basic ideas of the theory were essentially established during a thirty-year period (1900-1930), yet at the end of the 20th century there was still no agreement about the interpretation of its foundations. The subsequent sections discuss the mathematical framework, which was basically established in the years 1925-1930 by physicists such as Louis de Broglie, Erwin Schrödinger, Werner Heisenberg, Max Born, Wolfgang Pauli, Pascual Jordan, and Paul Dirac, and by mathematicians such as John von Neumann.[1] The so-called Hilbert-space representation is not the only one,[2] but it suffices to show the probabilistic character of quantum physics, and to point out how it differs from classical probability theory (chapter 8).

The general theory of quantum physics addresses five related problems:

(a) To find the spectrum of possible properties of the system under study (9.3).

(b) To give an objective description of the (initial) state of the system, to the extent that it is specified, and to the extent that it is random (9.4).

(c) To determine the relative statistical weights associated with the possible properties of the system relative to its state. This implies the discussion of the external (modal) symmetries (9.5, 9.6) as well as the internal structure, partly expressed by internal symmetries (9.7).

(d) To determine the temporal development of the state during the time from one interaction to the next, and to treat the problem of interference (9.8).

(e) To explain the actualization of one of the possible properties via an interaction, which implies the distinction between possessed properties and latent propensities (9.8).

 

Hilbert space

It is most remarkable that at least the first four problems can be treated within the context of a single concept, that of a complex Hilbert space, with its associated hermitean operators. This concept is an abstract one, and has many realizations. All Hilbert spaces with the same number of dimensions are isomorphic to each other. The basic hypothesis of quantum physics says that the set of possible states of a system is isomorphic to all Hilbert spaces of a certain dimensionality, which depends on the typical structure of the system.[3]

Any property of the system is related to a coordinate system (a set of basis functions) in Hilbert space, such that the property’s spectrum is related to the dimension of that space. Properties with a number of possible values less than the dimension of the Hilbert space are called degenerate for that system. Degeneracy is always connected to some kind of symmetry.

The probability associated with a certain value of some property is determined jointly by the spectrum of that property and the state at the moment the interaction revealing (probing) that property takes place. Thus, while the concept of a Hilbert space provides the description of probabilities in an isolated system, at the same time it anticipates interaction. Its use breaks down as soon as we want to investigate the interaction itself. So the fifth problem mentioned above is at best partly solved.

Operators in Hilbert space

A Hilbert space is the dense and complete set of all linear combinations of a number of basis functions with complex coefficients (2.7). In this space, for any pair of functions f1 and f2 a linear functional (f1,f2) exists, called the scalar product. Now the concept of a linear operator is introduced as a mapping of the Hilbert space into itself, more or less similar to a rotation in a Euclidean space of two or three dimensions.[4] If f and g are arbitrary functions in the Hilbert space H and a and b are complex numbers, then A is a linear operator if it transforms each function f into a function Af in H such that A(af+bg)=aAf+bAg. The identity operator I transforms each function into itself, and the zero operator reduces each function to zero. All linear operators in a Hilbert space form a group with respect to addition, with the zero operator as identity element, and with –A=(-1)A as the inverse of A. In general, multiplication of operators is not commutative. A and B are said to commute if AB=BA.

Quantum physics is especially interested in hermitean operators (for which $A = A^{+}$, where $A^{+}$ is the adjoint operator[5]) and in unitary operators (defined by $UU^{+} = I$, the identity operator).

 

Hermitean operators, eigenvectors and eigenvalues

An operator is called hermitean or self-adjoint if for any pair of functions f and g in H, (f,Ag)=(Af,g). Each hermitean operator generates a basis in the Hilbert space. This means that for any hermitean operator A there exist vectors $n_i$ such that $A n_i = a_i n_i$, where $a_i$ is a real number. If normalized, the so-called eigenvectors or eigenfunctions $n_i$ of A have the properties of basis functions in H: $(n_i, n_i) = 1$ and $(n_i, n_j) = 0$ for any i and j with i≠j. Moreover, the set of eigenvectors $n_i$ is complete, which means that any function in the Hilbert space can be written as a linear combination of those eigenvectors.

The real eigenvalues $a_i$ can serve to distinguish the eigenvectors. However, if two mutually orthogonal eigenvectors have the same eigenvalue, all vectors in the two-dimensional space consisting of the linear combinations of these two eigenvectors are also eigenvectors. Hence an eigenvalue determines a subspace in Hilbert space, whether one-dimensional (non-degenerate eigenvalue) or multi-dimensional (degenerate eigenvalue).

If two hermitean operators commute, they have a common set of eigenvectors, generally with different eigenvalues and different degeneracy. To every unit vector $n_i$ of a basis in Hilbert space is connected a hermitean operator $P_i$, which transforms any function into its projection onto that unit vector. Thus, because $f = \sum_i (f, n_i)\, n_i$, we have $P_i f = (f, n_i)\, n_i$. For the basis vectors themselves, $P_i n_i = n_i$ and $P_i n_j = 0$ if i≠j. Hence the eigenvalues of $P_i$ are either one or zero (the latter is highly degenerate), and the projection operators can be used to describe yes-no experiments.[6]
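
The properties of eigenvectors and projection operators can be checked in the smallest non-trivial case, a two-dimensional Hilbert space. The sketch below is an illustration only. The operator A, the state f and all function names are assumptions of the example, and the scalar product is taken to be conjugate-linear in its first argument, so that the expansion coefficient appears as (n, f).

```python
import math

def inner(u, v):
    """Scalar product (u, v), conjugate-linear in u (a convention assumed here)."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def apply(op, v):
    """Apply a matrix, given as nested lists, to a vector."""
    return [sum(op[i][j] * v[j] for j in range(len(v))) for i in range(len(op))]

def projector(n):
    """Matrix of the projection operator P with P f = (n, f) n, for normalized n."""
    return [[n[i] * n[j].conjugate() for j in range(len(n))] for i in range(len(n))]

# A hermitean operator with eigenvalues +1 and -1.
A = [[0, 1], [1, 0]]
s = 1 / math.sqrt(2)
n_plus = [s, s]     # A n_plus  = +n_plus
n_minus = [s, -s]   # A n_minus = -n_minus

# A normalized state and the weights of the two possible values of A.
f = [0.6, 0.8j]
probs = [abs(inner(n, f)) ** 2 for n in (n_plus, n_minus)]

P_plus = projector(n_plus)
print(probs, apply(P_plus, n_plus), apply(P_plus, n_minus))
```

The squared moduli of the expansion coefficients add up to one, and the projector reproduces its own basis vector while annihilating the orthogonal one, which is what makes it suitable for yes-no questions.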

 

Unitary operators and symmetry

A unitary operator U is defined by the property $UU^{+} = U^{+}U = I$, the identity operator. Thus unitary operators form a multiplication group with I as the identity element, and $U^{+}$ as the inverse of U. The application of a unitary operator to a basis leads to a new basis having the same orthogonality and normalization properties. With this change of basis, a hermitean operator A is transformed into $U^{+}AU$. If A and U commute, A is not changed ($U^{+}AU = U^{+}UA = A$). Therefore, unitary operators are very useful in describing symmetry operations, in which transformations of the state of the system are made without changing its properties.
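
A small numerical sketch, again in two dimensions, shows both cases: a unitary operator that does not commute with A changes its matrix but not its eigenvalues, whereas one that commutes with A leaves it unchanged. The matrices and function names are hypothetical choices for the example; the adjoint is computed as the conjugate transpose.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Adjoint of a square matrix: transpose and complex-conjugate."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def transform(a, u):
    """Change of basis: A becomes (adjoint of U) times A times U."""
    return matmul(dagger(u), matmul(a, u))

def close(a, b, tol=1e-12):
    """Element-wise comparison of two matrices."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

A = [[1.0, 0.0], [0.0, -1.0]]                      # hermitean, eigenvalues +1 and -1
theta = 0.3
U_rot = [[math.cos(theta), -math.sin(theta)],
         [math.sin(theta), math.cos(theta)]]       # does not commute with A
U_phase = [[cmath.exp(0.4j), 0.0],
           [0.0, cmath.exp(-1.1j)]]                # diagonal, commutes with A

A_rot = transform(A, U_rot)
A_phase = transform(A, U_phase)
print(A_rot, A_phase)
```

The trace and determinant of the rotated matrix, and hence its two eigenvalues, are the same as those of A, while the phase transformation leaves A itself unchanged.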

Unitary operators turn out to be particularly useful for the description of the spatial and temporal homogeneity for isolated systems (9.5). For spatial isotropy one finds that the degeneracy of eigenvalues is not complete, so that the corresponding unitary operator is a two- or more-dimensional matrix (9.6).

 

9.2. Problems concerning wave packets

 

Probability is a measure over an ensemble, a set of possibilities (8.3). If the set of possibilities is continuous this is a field over a space or a region in a space. Stating this once again shows the static character of classical probability theory, and points, at the same time, to the way to open it up in a kinematic sense. For the kinematics of a field leads to the theory of waves, in particular the concept of a wave packet as an aggregate of waves (chapter 7). This theory is not only applicable to physical fields in physical space, but to any field in any space, including probability. The actualisation of any possibility requiring a physical interaction finishes the development of probability.

Many sounds are signals. A signal being a pattern of oscillations moves as an aggregate of waves from the source to the detector. This motion has a physical aspect as well, for the transfer of a signal requires energy. But the message is written in the oscillation pattern, being a signal if a person or an animal receives and recognizes it.

A signal composed from a set of periodic waves is called a wave packet. Although a wave packet is a kinetic subject, it achieves its foremost meaning if its physical interaction is taken into account. The wave-particle duality has turned out to be equally fundamental and controversial. Neither experiments nor theories leave room for doubt about the existence of the wave-particle duality. However, it seems to contradict common sense, and its interpretation has been the object of hot debates.

Common sense dictated that waves and particles exclude each other, meaning that light is either one or the other. When the wave theory turned out to explain more phenomena than the particle model, the battle appeared to be over (T&E, 6.3, 6.4).[7] Light is wave motion, as was confirmed by Maxwell’s theory of electromagnetism. Nobody realized that this conclusion was a non sequitur. At most, it could be said that light has wave properties, as follows from the interference experiments of Young and Fresnel, and that Newton’s particle theory of light was refuted.[8]

 

A dualistic world view

19th-century physics discovered and investigated many other rays. Some looked like light, such as infrared and ultraviolet radiation (about 1800), radio waves (1887), X-rays (1895) and gamma rays (1900). These turned out to be electromagnetic waves. Other rays consist of particles. Electrons were discovered in cathode rays (1897), in the photoelectric effect and in beta-radioactivity. Canal rays consist of ions and alpha rays of helium nuclei (T&E, 6.5).[9]

At the end of the 19th century, this gave rise to a rather neat and rationally satisfactory world view. Nature consists partly of particles and partly of waves, or of fields in which waves are moving. This dualistic world view assumes that something is either a particle or a wave, but never both, tertium non datur.

It makes sense to distinguish a dualism, a partition of the world into two compartments, from a duality, a two-sidedness. The dualism of waves and particles rested on common sense: one could not imagine an alternative. However, 20th-century physics was forced to abandon this dualism and to replace it by the wave-particle duality. All elementary things have both a wave and a particle character (7.8).

Almost in passing, another phenomenon, called quantization, made its appearance. It turned out that some magnitudes are not continuously variable. The mass of an atom can only have a certain well-defined value. Atoms emit light at sharply defined frequencies. Electric charge is an integral multiple of the elementary charge. In 1905 Albert Einstein suggested that light consists of quanta with energy E = hf. In Niels Bohr’s atomic theory (1913), the angular momentum of an electron in its atomic orbit is an integer times Max Planck’s reduced constant.[10] Until Werner Heisenberg and Erwin Schrödinger introduced modern quantum mechanics in 1925-26, atomic scientists repeatedly found new quantum numbers with corresponding rules.

 

Louis de Broglie

In 1923, Louis de Broglie published a mathematical paper about the wave-particle character of light.[11] Applying the theory of relativity, he predicted that electrons too would have a wave character. The motion of a particle or energy quantum does not correspond to a single monochromatic wave but to a group of waves, a wave packet. The speed of a particle cannot be related to the wave velocity ($\lambda/T = f/\sigma$, the quotient of frequency f and wave number σ), which is larger than the speed of light for a material particle. Instead, the particle speed corresponds to the speed of the wave packet, the group velocity. This is the derivative of frequency with respect to wave number ($df/d\sigma$) rather than their quotient. Because of the relations of Planck and Einstein, this is the derivative of energy with respect to momentum as well ($dE/dp$). At most, the group velocity equals the speed of light.[12]
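
The difference between the two velocities can be checked numerically for a material particle. The sketch assumes the relativistic energy-momentum relation E² = (pc)² + (mc²)² and units in which c = 1 and m = 1. Because frequency is proportional to energy and wave number to momentum, the wave velocity is E/p and the group velocity is dE/dp, estimated here by a central difference. The function names and the chosen momentum are illustrative.

```python
import math

C = 1.0  # speed of light (units chosen so that c = 1)
M = 1.0  # rest mass of the particle (arbitrary unit)

def energy(p):
    """Relativistic energy for momentum p."""
    return math.sqrt((p * C) ** 2 + (M * C ** 2) ** 2)

def phase_velocity(p):
    """Wave velocity: quotient of energy and momentum."""
    return energy(p) / p

def group_velocity(p, dp=1e-6):
    """Group velocity: derivative dE/dp, estimated by a central difference."""
    return (energy(p + dp) - energy(p - dp)) / (2 * dp)

p = 0.75
print(phase_velocity(p), group_velocity(p))
```

For every momentum the wave velocity exceeds c and the group velocity stays below it; in this model their product equals c².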

In order to test these suggestions, physicists had to find out whether electrons show interference phenomena. Experiments by Clinton Davisson and Lester Germer in America and by George P. Thomson in England (1927) proved convincingly the wave character of electrons, thirty years after Thomson’s father Joseph J. Thomson established the particle character of electrons. As predicted by De Broglie, the linear momentum turned out to be proportional to the wave number. Afterwards the wave character of atoms and nucleons was demonstrated experimentally.

It took quite a long time before physicists accepted the particle character of light. Likewise, the wave character of electrons was not accepted immediately, but about 1930 no doubt was left among pre-eminent physicists.

This meant the end of the wave-particle (or matter-field) dualism, implying all phenomena to have either a wave character or a particle character, and the beginning of wave-particle duality being a universal property of matter (7.8). In 1927, Niels Bohr called the wave and particle properties complementary.[13] Bohr also asserted that measurements can only be analyzed in classical mechanical terms, using arguments derived from Immanuel Kant.[14]

 

The dual character of physical particles

An interesting aspect of a wave is that it concerns a movement in motion, a propagating oscillation. Classical mechanics restricted itself to the motion of unchangeable pieces of matter. For macroscopic bodies like billiard balls, bullets, cars and planets, this is a fair approximation, but for microscopic particles it is not.[15] The experimentally established fact of photons, electrons, and other microsystems having both wave and particle properties does not fit the still popular mechanistic world view. However, the theory of characters (10.2) accounts for this fact as follows.

The character of an electron consists of an interlacement of two characters, a generic kinetic wave character and an accompanying specific particle character that is physically qualified (7.8). The specific character (different for different physical kinds of particles), determines primarily how the particles concerned interact with other physical subjects, and secondarily which magnitudes play a role in interaction. These characteristics distinguish electrons from other particles like muons, from spatially founded systems like protons and atoms, and from photons and similar particles having a kinetic foundation.

Interlaced with the specific character is a pattern of motion having the kinetic character of a wave packet. Electrons share this generic character with all other particles. In experiments demonstrating the wave character, there is little difference between electrons, protons, neutrons, or photons. The generic wave character has primarily a kinetic qualification and secondarily a spatial foundation. The specific physical character determines the boundary conditions and the actual shape of the wave packet. Its wave number is proportional to its linear momentum, its frequency to its energy. A free electron’s wave packet looks different from that of an electron bound in a hydrogen atom. The wave character representing the electron’s motion has a tertiary characteristic as well, anticipating physical interaction. The wave function describing the composition of the wave packet determines the probability of the electron’s performance as a particle in any kind of interaction.

 

Properties of wave packets

A purely periodic wave is infinitely extended in both space and time. It is unfit to give an adequate description of a moving particle, being localized in space and time. A packet of waves having various amplitudes, frequencies, wavelengths, and phases delivers a pattern that is more or less localized. The waves are superposed such that the net amplitude is zero almost everywhere in space and time. Only in a relatively small interval (to be indicated by Δ) the net amplitude differs from zero.

Rectilinear motion of a wave packet at constant speed is described by four magnitudes. These are the position (x) of the packet at a certain instant of time (t), the wave number (s) and the frequency (f).

The packet is an aggregate of waves with frequencies varying within an interval Δf and wave numbers varying within an interval Δs. Generally, the wave packet in the direction of motion has a minimum dimension Δx such that Δx.Δs>1. In order to pass a certain point, the packet needs a time Δt, for which Δt.Δf>1. If the packet is compressed (Δx and Δt small), the packet consists of a wide spectrum of waves (Δs and Δf large). Conversely, a packet with a well defined frequency (Δs and Δf small) is extended in time and space (Δx and Δt large). It is impossible to produce a wave packet whose frequency (or wave number) has a precise value, and whose dimension is simultaneously point-like. If the variation Δs is small, the length of the wave packet Δx is large, whereas if the packet is localized, the wave number needs to show a large variation.
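The reciprocal relation between Δx and Δs can be illustrated by superposing waves numerically. The sketch below assumes a Gaussian spectrum of wave numbers (s = 1/wavelength) and measures both widths as standard deviations of the intensity; the function name and the parameter values are hypothetical. With this particular definition of width the product equals 1/(4π), so the inequality in the text should be read as an order-of-magnitude statement whose constant depends on how Δ is defined.

```python
import cmath
import math

def packet_widths(sigma_s, s0=5.0, n_s=201, n_x=401):
    """Return (delta_s, delta_x) for a Gaussian spectrum of wave numbers.

    delta_s and delta_x are standard deviations of the intensity
    distributions in wave number and in position, respectively.
    """
    # Wave numbers s = 1/wavelength, sampled within 5 sigma of the mean s0.
    ss = [s0 + sigma_s * (-5.0 + 10.0 * k / (n_s - 1)) for k in range(n_s)]
    amps = [math.exp(-((s - s0) ** 2) / (2.0 * sigma_s ** 2)) for s in ss]

    # Positions on a grid that scales inversely with the spectral width.
    x_scale = 1.0 / (2.0 * math.pi * sigma_s)
    xs = [x_scale * (-6.0 + 12.0 * k / (n_x - 1)) for k in range(n_x)]

    # Superpose the waves at a fixed instant and record the net intensity.
    intensity = []
    for x in xs:
        psi = sum(a * cmath.exp(2j * math.pi * s * x) for a, s in zip(amps, ss))
        intensity.append(abs(psi) ** 2)

    def std(values, weights):
        total = sum(weights)
        mean = sum(v * w for v, w in zip(values, weights)) / total
        var = sum((v - mean) ** 2 * w for v, w in zip(values, weights)) / total
        return math.sqrt(var)

    delta_s = std(ss, [a * a for a in amps])
    delta_x = std(xs, intensity)
    return delta_s, delta_x

for sigma_s in (0.5, 1.0, 2.0):
    d_s, d_x = packet_widths(sigma_s)
    print(sigma_s, d_s * d_x)   # the product does not depend on sigma_s
```

Compressing the packet in space (small Δx) requires a proportionally wider spectrum (large Δs), and conversely, while the product stays fixed.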

Sometimes a wave packet is longer than one might believe. A photon emitted by an atom has a dimension of Δx=cΔt, Δt being equal to the mean duration of the atom’s metastable state before the emission. Because Δt is of the order of 10⁻⁸ sec and c=3×10⁸ m/sec, the photon’s ‘coherence length’ in the direction of motion is several metres. This is confirmed by interference experiments, in which the photon is split into two parts, to be reunited after the parts have traversed different paths. If the path difference is less than a few metres, interference will occur, but this is not the case if the path difference is much longer. The coherence length of photons in a laser ray is many kilometres long, because in a laser, Δt has been made artificially long.
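The arithmetic behind these estimates is short enough to write out. This is a minimal sketch using the rounded values from the text; the 10 km laser coherence length is an assumed, purely illustrative figure.

```python
C = 3.0e8   # speed of light (m/s), rounded as in the text

def coherence_length(delta_t, c=C):
    """Length of the wave packet, delta_x = c * delta_t."""
    return c * delta_t

def emission_time(length, c=C):
    """Inverse relation: the duration delta_t matching a coherence length."""
    return length / c

print(coherence_length(1e-8))   # about 3 metres for an atomic transition
print(emission_time(1.0e4))     # duration for an assumed 10 km laser coherence length
```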

An oscillating system emits or absorbs a wave packet as a whole. During its motion, the coherence of the composing waves is not always spatial. A wave packet can split itself without losing its kinetic coherence. This coherence is expressed by phase relations, as can be demonstrated in interference experiments as described above. In general, two different wave packets do not interfere in this way, because their phases are not correlated. This means that a wave packet maintains its kinetic identity during its motion. The physical unity of the particle comes to the fore when it is involved in some kind of interaction, for instance if it is absorbed by an atom causing a black spot on a photographic plate or a pulse in a Geiger-Müller counter. Emission and absorption are physically qualified events, in which an electron or a photon acts as an indivisible whole.

The identification of a particle with a wave packet seems to be problematic for various reasons. The first problem, the possible splitting and absorption of a wave packet, is mentioned above.

Second, the wave packet of a freely moving particle always expands, because the composing waves have different speeds.[16] Even if the wave packet is initially well localized, gradually it is smeared out over an increasing part of space and time. However, the assumption that the wave function satisfies a linear wave equation is a simplification of reality. Wave motion can be non-linearly represented by a ‘soliton’ that does not expand. Unfortunately, a non-linear wave equation is mathematically more difficult to treat than a linear one.
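The rate of this expansion can be estimated for the linear case. The sketch below assumes a free, non-relativistic particle with a Gaussian wave packet obeying the linear Schrödinger equation, for which the standard deviation of position grows as σ(t) = σ0·sqrt(1 + (ħt / 2mσ0²)²); the electron and the initial width of one nanometre are assumed example values, not taken from the text.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
M_E = 9.1093837e-31      # electron rest mass (kg), assumed example particle

def width(t, sigma0, m=M_E):
    """Position spread of a free Gaussian packet after time t."""
    return sigma0 * math.sqrt(1.0 + (HBAR * t / (2.0 * m * sigma0 ** 2)) ** 2)

def doubling_time(sigma0, m=M_E):
    """Time after which the spread has grown to twice its initial value."""
    return math.sqrt(3.0) * 2.0 * m * sigma0 ** 2 / HBAR

sigma0 = 1.0e-9                      # assumed initial spread (m)
t2 = doubling_time(sigma0)
print(t2)                            # of the order of 1e-14 seconds
print(width(t2, sigma0) / sigma0)    # ratio close to 2
```

The more sharply the packet is localized at the start, the faster it spreads, because a small σ0 implies a wide range of momenta.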

Third, in 1927 Werner Heisenberg observed that the wave packet is subject to a law known as indeterminacy relation, uncertainty relation or Heisenberg relation (7.6). As a matter of fact, there is as little agreement about its definition as about its name.

Combining the relations Δx.Δs>1 and Δt.Δf>1 with those of Planck (E=hf) and Einstein (p=hs) leads to Heisenberg’s relations for a wave packet: Δx.Δp>h and Δt.ΔE>h.[17] The meaning of Δx etc. is given above. In particular, Δt is the time the wave packet needs to pass a certain point.[18] This interpretation is the oldest one, for the indeterminacy relations, without Planck’s constant, were applied in communication theory long before the birth of quantum mechanics.[19] It is interesting to observe that the indeterminacy relations are not characteristic of quantum mechanics, but of wave motion. The relations are an unavoidable consequence of the wave character of particles and of signals. I prefer this interpretation because it remains closest to experimental facts, but I shall discuss some alternative, more theoretical, interpretations, in particular paying attention to the Heisenberg relation between energy and time.[20]
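Written out, the step from the wave relations to the Heisenberg relations is a single multiplication by Planck’s constant. The following lines merely restate the relations quoted in the text in LaTeX notation, assuming h is treated as a fixed constant so that the spreads scale with it:

```latex
\begin{align*}
\Delta x\,\Delta s &> 1, & p &= h s \;\Rightarrow\; \Delta p = h\,\Delta s
  &&\Rightarrow\quad \Delta x\,\Delta p > h,\\
\Delta t\,\Delta f &> 1, & E &= h f \;\Rightarrow\; \Delta E = h\,\Delta f
  &&\Rightarrow\quad \Delta t\,\Delta E > h.
\end{align*}
```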

 

Operators

Quantum mechanics connects any variable magnitude with a Hermitean operator having eigenfunctions and eigenvalues (2.4, 9.1). The eigenvalues are the possible values for the magnitude in the system concerned. In a measurement, the squared absolute value of the scalar product of the system’s state function with an eigenfunction of the operator is the probability that the corresponding eigenvalue will be realized.

If two operators act successively on a function, the result may depend on their order. The Heisenberg relation Δx.Δp>h can be derived as a property of the non-commuting operators for position and linear momentum. In fact, each pair of non-commuting operators gives rise to a similar relation. This applies, e.g., to each pair out of the three components of angular momentum.[21] Consequently, only one component of an electron’s magnetic moment (usually along a magnetic field) can be measured. The other two components are undetermined, as if the electron performs a precessional motion about the direction of the magnetic field.
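The dependence on the order of operators can be made concrete with the smallest non-trivial case. The sketch below assumes the standard spin-1/2 matrices for the three components of angular momentum, in units of the reduced Planck constant; the helper names are illustrative. It checks that two components do not commute, their commutator being i times the third component.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """AB - BA; the zero matrix if and only if A and B commute."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

# Spin-1/2 components in units of the reduced Planck constant (assumed example).
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]

print(commutator(SX, SY))   # equals i times SZ, so SX and SY do not commute
print(commutator(SZ, SZ))   # every operator commutes with itself
```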

Remarkably, there is no operator for kinetic time. Therefore, some people deny the existence of a Heisenberg relation for time and energy.[22] On the other hand, the operator for energy, called Hamilton-operator or Hamiltonian, is very important. Its eigenvalues are the energy levels characteristic for e.g. an atom or a molecule. Each operator commuting with the Hamiltonian represents a constant of the motion subject to a conservation law (5.3).[23]

 

Mean standard deviation

From the wave function, the probability to find a particle in a certain state can be calculated. Now the indeterminacy is a measure of the mean standard deviation, the statistical inaccuracy of a probability calculation. The indeterminacy of time can be interpreted as the mean lifetime of a metastable state. If the lifetime is large (and the state is relatively stable), the energy of the state is well defined. The rest energy of a short living particle is only determined within the margin given by the Heisenberg relation for time and energy.

This interpretation is needed to understand why an atom is able to absorb a light quantum emitted by another atom in similar circumstances. Because the photon carries linear momentum, both atoms get momentum and kinetic energy. The photon’s energy would therefore fall short of what is needed to excite the second atom. Usually this shortage is smaller than the uncertainty in the energy levels concerned. However, this is not always the case for atomic nuclei. Unless the two nuclei are moving towards each other, the process of emission followed by absorption would be impossible. Rudolf Mössbauer discovered this consequence of the Heisenberg relations in 1958. Since then, the Mössbauer effect has become an effective instrument for investigating nuclear energy levels.
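A rough calculation shows why the shortage matters for nuclei but not for atoms. The sketch below compares the recoil energy E²/(2Mc²) of a free emitter with the energy margin h/Δt of the excited state, following the text’s convention of using h. The photon energies, masses and lifetimes are approximate, illustrative values for an optical sodium line and for the 14.4 keV line of iron-57; they are assumptions and are not taken from the text.

```python
H_EV_S = 4.135667696e-15   # Planck constant (eV s)
U_EV = 931.494e6           # atomic mass unit times c^2 (eV)

def recoil_energy(photon_ev, mass_u):
    """Kinetic energy taken by a free emitter of mass M: E^2 / (2 M c^2)."""
    return photon_ev ** 2 / (2.0 * mass_u * U_EV)

def level_width(lifetime_s):
    """Energy margin h / delta_t of a state with mean lifetime delta_t."""
    return H_EV_S / lifetime_s

# (photon energy in eV, emitter mass in u, mean lifetime in s); approximate values.
cases = {
    "sodium atom, optical line": (2.1, 23.0, 1.6e-8),
    "iron-57 nucleus, gamma line": (14.4e3, 57.0, 1.4e-7),
}
for name, (e_ev, m_u, tau) in cases.items():
    print(name, recoil_energy(e_ev, m_u), level_width(tau))
```

For the optical line the recoil energy is far below the level width, so absorption by a second atom remains possible. For the gamma line the recoil energy exceeds the width by several orders of magnitude, and the loss occurs at both emission and absorption.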

 

Measurement disturbance

The position of a wave packet is measurable within a margin of Δx and its linear momentum within a margin of Δp. Both are as small as experimental circumstances permit, but their product has a minimum value determined by Heisenberg’s relation. The accuracy of the measurement of position restricts that of momentum.

Initially the indeterminacy was interpreted as an effect of the measurement disturbing the system. The measurement of one magnitude disturbs the system such that another magnitude cannot be measured with an unlimited accuracy. Heisenberg explained this by imagining a microscope exploiting light to determine the position and the momentum of an electron.[24] Later, this has appeared to be an unfortunate view. It seems better to consider the Heisenberg relations to be the cause of the limited accuracy of measurement, rather than to be its effect.

The Heisenberg relation for energy and time has a comparable consequence for the measurement of energy. If a measurement has duration Δt, its accuracy cannot be better than ΔE>h/Δt.

The so-called measurement problem constitutes the nucleus of what is usually called the interpretation of quantum mechanics.[25] It is foremost a philosophical problem, not a physical one, which is remarkable, because measurement is part of experimental physics, and the starting point of theoretical physics. After the development of quantum physics, both experimental and theoretical physicists have investigated the relevance of symmetry, and the structure of atoms and molecules, solids and stars, and subatomic structures like nuclei and elementary particles (chapter 11). Apparently, this has escaped the attention of many philosophers, who are still discussing the consequences of Heisenberg’s indeterminacy relations.  

 

The law of conservation of energy

In quantum mechanics, the law of conservation of energy acquires a slightly different form. According to the classical formulation, the energy of a closed system is constant. In this statement, time does not occur explicitly. The system is assumed to be isolated for an indefinite time, and that is questionable. Heisenberg’s relation suggests a new formulation. For a system isolated during a time interval Δt, the energy is constant within a margin of ΔE≈h/Δt. Within this margin, the system shows spontaneous energy fluctuations, only relevant if Δt is very small.[26]

According to quantum field theory, a physical vacuum is not an empty space. Spontaneous fluctuations may occur. A fluctuation leads to the creation and annihilation of a virtual photon or a virtual pair consisting of a particle and an antiparticle, having an energy of ΔE, within the interval Δt<h/ΔE. Meanwhile the virtual particle or pair is able to exert an interaction, e.g. a collision between two real particles.[27] Virtual particles are not directly observable but play a part in several real processes.

 

The probability interpretation of the wave packet

The amplitude of waves in water, sound, and light corresponds to a measurable physical real magnitude. In water this is the height of its surface, in sound the pressure of air, in light the electromagnetic field strength. The energy of the wave is proportional to the square of the amplitude. This interpretation is not applicable to the waves for material particles like electrons. In this case the wave has a less concrete character, it has no direct physical meaning. The wave is not even expressed in real numbers, for the wave function has a complex value.

In 1926, Max Born offered a new interpretation, since then commonly accepted.[28] He stated that a wave function (real or complex) is a probability function. In a footnote added in proof, Born observed that the probability is proportional to the square of the wave function.[29]

A wave function is prepared at an earlier interaction, for instance, the emission of the particle. It changes during its motion, and one of its possibilities is realized at the next interaction, like the particle’s absorption. The wave function expresses the transition probability between the initial and the final state.[30]

This probability may concern any measurable property that is variable. Hence, it does not concern natural constants like the speed of light or the charge of the electron. According to Born, the probability interpretation bridges the apparently incompatible wave and particle aspects.[31] Wave properties determine the probability of position, momentum, etc., traditionally considered properties of particles.

Classical mechanics used statistics as a mathematical means, assuming that the particles behave deterministically in principle (8.1). In 1926, Born’s probability interpretation put a definitive end to mechanist determinism, which had already lost much of its credibility because of radioactivity. Waves and wave motion are still determined, e.g. by Schrödinger’s equation, even if no experimental method exists to determine the phase of a wave. However, the wave function determines only the probability of future interactions.[32] In quantum mechanics, the particles themselves behave stochastically.

 

Interference

An even stranger feature of quantum statistics is that chance is subject to interference. In the traditional probability calculus (8.3) probabilities can be added or multiplied. Nobody ever imagined that probabilities could interfere. Interference of waves may result in an increase of probability, but in a decrease as well, even in the extinction of probability. Hence, besides a probability interpretation of waves, there is a wave interpretation of probability.[33]

Outside quantum mechanics, this is still unheard of, not only in daily life and the humanities, but in sciences like biology and ethology as well. The reason is that interference of probabilities only occurs as long as there is no physical interaction by which a chance realizes itself.[34] The absence of physical interaction is an exceptional situation. It only occurs if the system concerned has no internal interactions (or if these are frozen), as long as it moves freely. In macroscopic bodies, interactions occur continuously and interference of probabilities does not occur. Therefore, the phenomenon of interference of chances is unknown outside quantum physics.[35]

 

Reduction of the wave packet

The mathematical concept of probability or chance anticipates the physical relation frame, because only by means of a physical interaction a chance can be realized. An open-minded spectator observes an asymmetry in time. Probability always concerns future events. It draws a boundary line between a possibility in the present and a realization in the future. For this realization, a physical interaction is needed. The wave equation and the wave function describe probabilities, not their realization. The wave packet anticipates a physical interaction leading to the realization of a chance, but is itself a kinetic subject, not a physical subject. If the particle realizes one of its possibilities, it simultaneously destroys all alternative possibilities. In that respect, there is no difference between quantum mechanics and classical theories of probability.

As long as the position of an electron is not determined, its wave packet is extended in space and time. As soon as an atom absorbs the electron at a certain position, the probability to be elsewhere collapses to zero.[36] This so-called reduction of the wave packet requires a velocity far exceeding the speed of light. However, this reduction concerns the wave character, not the physical character of the particle. It does not counter the physical law that no material particle can move faster than light.

Likewise, Schrödinger’s equation describes the states of an atom or molecule and the transition probabilities between states. It does not account for the actual transition from a state to an eigenstate, when the system experiences a measurement or another kind of interaction.[37]

 

Macroscopic bodies

Is the problem of the reduction of the wave packet relevant for macroscopic bodies as well? Historically, this question is concentrated in the problem of Schrödinger’s cat, locked up alive in a non-transparent case. A mechanism releases a lethal poison at an unpredictable instant, for instance controlled by a radioactive process. As long as the case is not opened, one may wonder whether the cat is still alive. If quantum mechanics is applied consistently, the state of the cat is a mixture, a superposition of two eigenstates, dead and alive, respectively.

The principle of decoherence, discovered at the end of the 20th century, provides a satisfactory answer. For a macroscopic body, a state being a combination of eigenstates will spontaneously change very fast into an eigenstate, because of the many interactions taking place in the macroscopic system itself. This solves the problem of Schrödinger’s cat, for each superposition of dead and alive transforms itself almost immediately into a state of dead or alive.[38] The principle of decoherence is part of a realistic interpretation of quantum physics. It does not idealize the ‘collapse’ or ‘reduction’ of the wave packet to a projection in an abstract state space. It takes into account the character of the macroscopic system in which a possible state is realized by means of a physical interaction.

 

Anticipating future interactions

As soon as it was established, both experimentally and theoretically, that all particles move like wave packets, Born recognized these waves as expressing probability. It should be emphasized that as long as only kinematics is concerned, this interpretation is not needed. Therefore the wave theory was discussed separately (chapter 7), because it is of a purely modal character – in contrast to probability theory. The relation of wave theory and probability concerns the kinematic aspect of the latter, and manifests itself if one wishes to account for the individuality of the physically qualified particles.[39]

Both theories have an anticipatory character. For the wave motion of physically qualified individual particles this implies that the probability interpretation becomes necessary as soon as one wishes to consider the individual interaction of the particles with some other system (for example, but not necessarily, a measuring apparatus). A purely kinematic theory breaks down for this interaction.

The theory of waves as applicable to the kinematics of probability is not restricted to sinusoidal or spherical waves. It is the phase which shows the modal kinematic character of wave theory. The phase is not directly relevant for the probability, which is only determined by the amplitude. The most peculiar consequence of the phase formalism is that it leads to interference. Because probability is no longer static, but can be propagated, the addition of probabilities has to be considered.

In classical theory, addition of probabilities can only lead to an increase, because probability is essentially a positive measure. But the phase relations between interfering waves make it possible that waves may annihilate each other. In order to save the positive-definiteness of probability as a measure, it is defined as the square of the absolute value of the amplitude of the wave. (It is now also clear why wave packets must be normalized to unity). Especially this interference of probability, completely unknown in classical static probability theory, shows the relevance of the kinetic aspect of probability. It is one important new feature of quantum statistics.[40] A second innovation concerns the description of the transition from an initial state to a final state (9.8), which is also intimately related to wave theory.
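The contrast between adding probabilities and adding amplitudes can be shown with two contributions of equal size. This is a minimal sketch with assumed, unnormalized amplitudes of magnitude 0.5; the function names are illustrative.

```python
import cmath
import math

def combined_probability(a1, a2):
    """Square of the absolute value of the summed amplitudes."""
    return abs(a1 + a2) ** 2

def classical_sum(a1, a2):
    """Sum of the separate probabilities, as in static probability theory."""
    return abs(a1) ** 2 + abs(a2) ** 2

amp = 0.5   # assumed magnitude of each contributing amplitude
for phase in (0.0, math.pi / 2, math.pi):
    a2 = amp * cmath.exp(1j * phase)
    print(phase, classical_sum(amp, a2), combined_probability(amp, a2))
```

The classical sum is the same for every phase difference, whereas the combined probability ranges from twice the classical sum (equal phases) down to zero (opposite phases).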

 

9.3. Static quantum probability theory

 

In order to discuss the physical interpretation of this formal, mathematical scheme, one has to state in which way this formalism is an objective representation of physical states of affairs. Such a statement has the character of a hypothesis, which can never be proved analytically, but must be corroborated by its consistency with other theories and experimental results.

Section 3.2 defined a magnitude as a property of comparable subjects, which allows a quasi-serial ordering of those subjects. Now one asserts that any physical magnitude for a system is represented by a hermitean operator A in a Hilbert space. The eigenvalues of A (whether discrete, denumerable, or continuous) compose the spectrum of possible real values for some property of that particular system – i.e., the spectrum is in part determined by the typical structure of the system. This interpretation rests on the fact that the basis functions of any orthogonal basis in a linear function space can always be quasi-serially ordered.

The state of an isolated system corresponds to a normalized function or vector in the Hilbert space.[41] Non-isolated systems can be included, if the action from the outside is static (e.g., an electric or magnetic field), and if there is no reaction of the system to the outside world.[42]

If f represents the state of the system, the statistical distribution over the spectrum of a physical magnitude is determined by the scalar product of f and the eigenvectors of the corresponding operator A. Therefore, for an eigenvector n_i of A, whose eigenvalue is a_i, (n_i,f)(f,n_i) is the probability that the subject will adopt the value a_i for A by virtue of an interaction in which that magnitude is displayed. This applies in particular, but not exclusively, to a measurement of this magnitude.

The state functions f are restricted to normalized vectors in Hilbert space because the probability must be normalized. This can be seen because Σ(n_j,f)(f,n_j)=(f,f) includes all possibilities, and must therefore be one. This means that if f itself happens to be an eigenvector of A with eigenvalue a_f, then the probability of finding a_f for the magnitude is unity, and the probability of finding any other value is zero. (This ‘certainty’ in finding a particular value for a property is only possible for the idealized pure states.) The mean value for the measured property with respect to a given initial state f is simply (f,Af).
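These rules can be tried out in a two-dimensional Hilbert space. The sketch below assumes a hypothetical magnitude A with eigenvalues +0.5 and -0.5 and an orthonormal pair of eigenvectors; the state f and all names are illustrative. It evaluates the probabilities as products of scalar products, their sum, and the mean value as the probability-weighted sum of eigenvalues, which equals (f,Af).

```python
import math

def inner(u, v):
    """Scalar product (u, v), conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def probabilities(f, eigenvectors):
    """(n_i, f)(f, n_i) for each eigenvector n_i; real and non-negative."""
    return [(inner(n, f) * inner(f, n)).real for n in eigenvectors]

def mean_value(f, eigenvalues, eigenvectors):
    """Probability-weighted sum of eigenvalues, equal to (f, A f)."""
    probs = probabilities(f, eigenvectors)
    return sum(a * p for a, p in zip(eigenvalues, probs))

# Hypothetical two-level magnitude A, written in its own eigenbasis.
eigenvectors = [[1, 0], [0, 1]]
eigenvalues = [0.5, -0.5]

# A normalized state that is not an eigenvector of A.
f = [math.cos(0.3), 1j * math.sin(0.3)]

probs = probabilities(f, eigenvectors)
print(probs, sum(probs))                          # the probabilities add up to one
print(mean_value(f, eigenvalues, eigenvectors))
print(probabilities([1, 0], eigenvectors))        # an eigenvector gives certainty
```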

 

From classical properties to quantum propensities

Thus far the static theory already shows some remarkable differences from the classical theory discussed in section 8.3. In the first place, the spectrum of possible values for a property of a system reflects more directly its spatial character, being represented by a coordinate system, a basis in Hilbert space.[43] It still refers back to the numerical modal aspect, as is fitting for a magnitude. The spectrum of possibilities, being determined, as it is, by the internal typical structure of the system, is shifted therefore to the law side. Hence, the spectrum, rather than being primarily a subject-subject relation, determines the latter. Consequently, a magnitude is not a property, but a propensity or disposition. No longer does it belong to the subject, but rather to the typical law to which the subject responds. This is an important point to keep in mind in our forthcoming discussion.

 

Relations between states

Also the concept of probability itself, as used here, differs from the classical one, in which probability is a function over the possibilities. In quantum physics, probability is a functional, a scalar product between the state function f and the eigenvector corresponding to a particular possible case. Thus in quantum physics probability is a relation between the state of a system and the state of some reference system. It only has meaning if interpreted as some propensity, which can manifest itself as a property only in an interaction. Actually, in classical theory as well, the probability is determined by both the spectrum of possibilities and the initial state, but the latter is assumed to be completely at random. In this respect the new theory is much richer than the old one, because it allows specification of the initial state. Probability is a joint quantitative measure of the state of the system and the reference system.

Following the discovery that spatial position and kinematic motion are subject-subject relations, relative to spatial or kinematic reference systems, one finds now that the physical state of a system can only be understood relative to a physical reference system. Such a system is objectively represented by a basis in Hilbert space, but is in any actual interaction a second physical system, with which the former system may interact.

This second system may be a measuring instrument, in the same way spatial coordinate systems and kinematic reference systems may be constructed from concrete metre sticks and clocks. But in an abstract analysis of physical interaction it is not necessary to restrict oneself to measuring instruments, neither in geometry and kinematics, nor in physics. On the contrary, in each of these three cases the relation between a subject and a reference system or measuring instrument is merely one of many concretizations of an abstract relation, the modal subject-subject relation. Indeed, for the spatial, kinetic, and physical modal aspects the modal subject-subject relation turns out to be, respectively, relative position, relative motion, and relative interaction. Therefore it is justified to conclude that the theory has a relational character, and that the modal aspects are truly relation frames.

 

9.4. State preparation, randomness, and complementarity

 

The meaning of a state as represented by the state function f is given in its relation to a reference system representing a possible second system, with which it may interact. The state itself is the result of a previous interaction, and is intermediary between the two interactions. In experimental physics one speaks of a state selector as a device which e.g. singles out a certain state from an incoming beam of particles, excluding all other states. Hence it is a yes-no experiment, which can be described with a projection operator. Alternatively, a state selector is characterized by an eigenvector of an operator representing the property according to which the incoming particles are discriminated. The state of the particles emerging from a pure-state selector is independent of the state of the incoming particles, whereas the effect of a mixed-state selector depends partly on this preceding state.[44]

Restricting oneself to a pure non-degenerate state selector (which gives the maximum obtainable determination of a state) one finds that this state represents randomness in two ways. The first is connected with the phase of the state as follows. The scalar product (f,n_i) determines the probability that a system in the state f will adopt the eigenvalue a_i as a result of an interaction characterized by the corresponding operator A. In particular, these probabilities are invariant under multiplication of the state vector by a complex number exp(iq), where q is a real number called the phase of the exponential function. Therefore, in state selection, the phase associated with the state f is completely undetermined, and must be assumed to be a random parameter.[45]

Next recall that the state f produced by a state selector is an eigenvector of the operator A corresponding to some property according to which the systems are selected. This means that for a later interaction characterized by an operator B, f is not an eigenstate of B unless A and B commute. Thus, e.g., if A is the momentum operator, and B the position operator (which operators do not commute), then the particles emerging from the state selector may be said to have a precisely determined momentum, but not a precise position at any time before the next interaction takes place.
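Both kinds of randomness can be illustrated in a two-dimensional state space. The sketch below is a stand-in for the position and momentum example: it assumes two non-commuting spin components as the operators A and B, with their usual eigenvectors, and an arbitrary phase value q; none of these choices comes from the text.

```python
import cmath
import math

def inner(u, v):
    """Scalar product (u, v), conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

def probability(f, n):
    """Probability that a system in state f realizes the eigenvector n."""
    return abs(inner(n, f)) ** 2

R = 1.0 / math.sqrt(2.0)
A_UP, A_DOWN = [1, 0], [0, 1]      # eigenvectors of A (assumed: spin along z)
B_UP, B_DOWN = [R, R], [R, -R]     # eigenvectors of B (assumed: spin along x)

f = A_UP                           # state prepared by a selector for A
q = 1.234                          # arbitrary, unobservable phase
shifted = [cmath.exp(1j * q) * c for c in f]

print(probability(f, A_UP), probability(f, A_DOWN))      # A is determined
print(probability(f, B_UP), probability(f, B_DOWN))      # B is not
print(probability(shifted, B_UP))                        # unchanged by the phase
```

The selected state gives certainty for A, equal chances for the two values of B, and the same probabilities whatever the value of the phase q.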

However, this statement is liable to be misunderstood, as the history of quantum physics has shown. The propensities of physically qualified subjects as represented by hermitean operators have the character of a law, and therefore never belong to the subject. They display themselves only if the system interacts with some other subject. As long as we talk about an isolated system in the state f, we have to assume that any propensity has a potential character, even if f happens to be a pure eigenstate of some operator. This latter case must be understood as a limiting case, and as such it is comparable with the state of rest in kinematics and a zero-dimensional space (a point) in geometry. As long as the system does not interact, the state is more or less autonomous with respect to any property. Indeed, the state has a potential[46] or latent[47] character.

In this respect the quantum physical concept of a state is profoundly different from the classical molecular state, or the micro- or macrostates of macroscopic systems (8.5).[48] The classical state is a mere enumeration of actual properties of the system, whereas the state in quantum physics has a potential character for a single system. For an ensemble of similar systems in the same state, it determines the properties in the mean.[49] It must be emphasized that this potential character of the state strictly pertains to isolated systems only and is therefore somewhat abstract. Any concrete system always interacts with other systems in various ways, and therefore its state is always actualized in some sense or another.

 

Complementarity

The incompatibility of properties represented by non-commuting operators has given rise to the notion of complementarity. It stems from the Copenhagen school, mainly represented by Niels Bohr, and strongly bears the influence of some widely differing philosophies, in particular Kantian mechanism.[50] As a result, every proponent of complementarity has his own interpretation of it. One difficulty in understanding this concept is that it was introduced at the very beginning of the development of quantum physics. Therefore, its meaning has evolved along with the theory itself.

For some people complementarity is the same as the wave-particle duality. For others it refers to the relation between the microsystems governed by quantum physics and the macroscopic measuring instruments which are supposed to be describable in classical terms.[51] Sometimes it even denotes the psychic subject-object relation between the observer and the observed system. Shortly before the final establishment of Heisenberg’s matrix mechanics and Schrödinger’s wave theory in 1925 and 1926, Bohr, Kramers and others paid much attention to the complementary relation between a causal and a spatiotemporal description of physical processes. In its simplest form the principle of complementarity merely expresses the fact, mentioned above, that an eigenvector f of an operator A cannot in general be an eigenvector of a different operator B, unless A and B commute.[52]

The fact that the position and momentum operators do not commute has particularly been the cause of much discussion. In the first place, it refutes the classical mechanist maxim that the state of any physical system must be describable in modal kinematic terms, i.e., by its actual position and momentum at a given time. In the Copenhagen school, in an effort to focus on the classical maxim, attempts have sometimes been made to reduce all types of complementarity to this one of position and momentum (considered as ‘primitive’). But other kinds of incompatible magnitudes cannot be reduced in this way (9.6), and this idea had to be abandoned.

The need for a kinematic description disappears as soon as one recognizes the mutual irreducibility of the physical and kinetic modal aspects, which implies that the state of a physically qualified system must be referred to a physical coordinate system and not to a kinetic one. The situation is quite analogous to Einstein’s criticism of the classical theory of motion, where in discussing kinetic temporal relations, kinetic frames of reference have to be used, in which, e.g., static simultaneity is relativized. Similarly, for the study of the relations between physically interacting systems, one has to refer to physical frames of reference. This principle regarding physical frames of reference has consequences even for temporal relations which are not original to the physical relation frame, such as position and momentum.[53]

Thus the potential character of the state must be discussed taking into account all three basic distinctions of the philosophical framework introduced in chapter 1. Distinguishing law side and subject side leads to the discovery that the properties of a system have the character of a law, and cannot be possessed by the system apart from its interaction with other systems. The distinction of modality and typical individuality allows for the contingency of the state concurrently with the fact that the main properties in which one is interested (position, momentum, energy) possess a modal, universal nature. Differently structured systems can interact just because they have such universal properties in common. Finally, the distinction of the relation frames helps to understand the relativization of the pre-physical modal relations in physically qualified relations.

 

9.5. Modal symmetry: energy and momentum

 

Both in classical and quantum probability, symmetry considerations may help to solve the problem at hand. Symmetry can be divided into two types, modal or external, and typical or internal. In both cases we can translate the problem of symmetry into spatial terms. Then one inquires as to which transformations in Hilbert space leave both the statistical distribution over some hermitean operator and its eigenvalue spectrum unchanged. All transformations of this kind are represented by unitary operators (which commute with the hermitean operator being considered). Further, with any kind of symmetry, there corresponds a group of unitary operators. Thus the theory of groups is of great use for the solution of symmetry problems.

The case of modal symmetries involves transformations which depend on the mutual irreducibility of the first four modal aspects, whereas the case of typical symmetry depends on the typical structure of the system concerned, and is therefore of a less general character (9.7). The modal symmetries depend on the isotropy and homogeneity of time and space, and the Galilean or Lorentz invariance of uniformly moving physical systems. Being concerned with isolated systems, I shall postpone the discussion of the internal time evolution until section 9.8, and start with stationary states. This leads to the conservation laws of energy and momentum (the present section) and of angular momentum (9.6). These laws were discovered prior to the quantum era, yet are far easier to derive in the quantum theory than in classical physics.

A unitary operator can be described by a matrix which can be broken down into a number of matrices of smaller dimensionality. The simplest case is that of a matrix of dimension 1 whose element is a single complex number of absolute value 1. This one-dimensional matrix is nothing but the phase-factor, the exponential function exp.iq, the phase q being some real number.

If we multiply all functions in Hilbert space by the same phase-factor, then all scalar products (f,nᵢ), as well as all mean values (f,Af) for any linear hermitean operator A, are invariant under such a transformation. This phase-factor invariance allows us to study the conservation laws of energy and momentum for stationary states.
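This invariance can be checked numerically. The following is a minimal sketch, assuming a two-dimensional space; the vectors f and n, the hermitean matrix A and the phase q are hypothetical values chosen for illustration.

```python
import cmath

def inner(f, g):
    """Scalar product (f, g) = sum of conj(f_i) * g_i."""
    return sum(a.conjugate() * b for a, b in zip(f, g))

def apply(op, v):
    """Apply a square matrix to a vector."""
    return [sum(op[i][k] * v[k] for k in range(len(v))) for i in range(len(v))]

def with_phase(v, q):
    """Multiply every component of v by the phase factor exp(iq)."""
    factor = cmath.exp(1j * q)
    return [factor * x for x in v]

f = [0.6 + 0j, 0.8j]          # hypothetical normalised state
n = [1 + 0j, 0j]              # hypothetical basis vector
A = [[2, 1j], [-1j, 3]]       # hypothetical hermitean operator
q = 0.7                       # arbitrary real phase

f2, n2 = with_phase(f, q), with_phase(n, q)

overlap_before, overlap_after = inner(f, n), inner(f2, n2)
mean_before, mean_after = inner(f, apply(A, f)), inner(f2, apply(A, f2))

print(overlap_before, overlap_after)
print(mean_before, mean_after)
```

The phase factor cancels against its complex conjugate in every scalar product, so both pairs of numbers agree up to rounding.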

 

The temporal development of an isolated system

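Consider an isolated system whose state at a certain time t is denoted by the function f(t). We suppose that the state f(t₁) develops continuously from the state f(t₀). Taylor’s theorem yields

$$f(t_1) = f(t_0) + (t_1 - t_0)\,\frac{\partial f(t_0)}{\partial t} + \tfrac{1}{2}(t_1 - t_0)^2\,\frac{\partial^2 f(t_0)}{\partial t^2} + \dots$$

which formally can be written as

$$f(t_1) = \left[1 + (t_1 - t_0)\,\frac{\partial}{\partial t} + \dots\right] f(t_0) = \exp\!\left[i(t_1 - t_0)\,\frac{1}{i}\frac{\partial}{\partial t}\right] f(t_0) = \exp[i(t_1 - t_0)H]\, f(t_0)$$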

where we have introduced the hermitean Hamiltonian operator H=(1/i)∂/∂t as the generator of the unitary operators describing the temporal homogeneity.[54]
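The formal step from the Taylor series to an exponential of the derivative can be made concrete for a polynomial, where the series terminates. The sketch below is only an illustration: the polynomial and the two instants are hypothetical, and a polynomial is of course not a physical state function.

```python
from math import factorial

def derivative(coeffs):
    """Derivative of c0 + c1*t + c2*t**2 + ... given by its coefficient list."""
    return [n * c for n, c in enumerate(coeffs)][1:]

def evaluate(coeffs, t):
    """Value of the polynomial at time t."""
    return sum(c * t**n for n, c in enumerate(coeffs))

def shift_by_exponential(coeffs, t0, dt):
    """Apply exp(dt * d/dt) to the polynomial and evaluate the result at t0.

    For a polynomial the exponential series has finitely many terms,
    so the sum is exact up to rounding.
    """
    total, n, current = 0.0, 0, list(coeffs)
    while current:
        total += dt**n / factorial(n) * evaluate(current, t0)
        current = derivative(current)
        n += 1
    return total

f = [1.0, -2.0, 0.5, 3.0]     # hypothetical f(t) = 1 - 2t + 0.5 t^2 + 3 t^3
t0, t1 = 0.4, 1.3

shifted = shift_by_exponential(f, t0, t1 - t0)
direct = evaluate(f, t1)
print(shifted, direct)
```

Both numbers coincide: the exponential of the time derivative carries the function from one instant to the other.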

Now we demand that the states f(t₁) and f(t₀) differ only by a phase factor. Because of the previous result, this phase factor can only have the form exp iω(t₁ − t₀), where ω is some real number:

$$f(t_1) = \exp[i\omega(t_1 - t_0)]\, f(t_0)$$

or:

$$\exp(-i\omega t_1)\, f(t_1) = \exp(-i\omega t_0)\, f(t_0) = g(\omega)$$

where g(ω) is a Hilbert space vector independent of time, and characterized by the real number ω. In terms of g(ω), the general state vector is

f(t,ω)=(exp.iωt)g(ω)

If we apply the Hamiltonian H to f, we have

Hf(t,ω)=(1/i)∂/∂t(exp.iωt)g(ω)=ωf(t,ω)

Thus the real number ω is an eigenvalue of H. Since ω can be any real number, the spectrum of H is continuous, and consists of all positive and negative real numbers. The value of ω depends on the reference system, and may therefore be determined by a state selector. If the latter is incompatible with H or if it only produces mixed states, the state of the system corresponds with a statistical distribution over the spectrum of H. In the pure-state case, ω is the same for all vectors in the Hilbert space representing the internal structure of the system. This space is therefore an invariant sub-space of a much larger Hilbert space (of higher dimensionality) which includes all possible values for ω.
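The eigenvalue relation can be checked numerically. This is a minimal sketch, assuming a one-component state with g(ω) = 1 and replacing the time derivative by a central finite difference; the values of ω and t are arbitrary.

```python
import cmath

def f(t, omega, g=1.0):
    """Stationary state f(t, omega) = exp(i*omega*t) * g."""
    return cmath.exp(1j * omega * t) * g

def hamiltonian(func, t, h=1e-6):
    """Apply H = (1/i) d/dt, approximated by a central finite difference."""
    return (func(t + h) - func(t - h)) / (2 * h) / 1j

t = 0.8
for omega in (2.5, -1.5):     # the spectrum contains both signs
    lhs = hamiltonian(lambda s: f(s, omega), t)
    rhs = omega * f(t, omega)
    print(omega, abs(lhs - rhs))
```

The difference between Hf and ωf stays at the level of the discretization error, for a positive as well as a negative ω.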

In a completely analogous way, the system at different positions r (or relative to different spatial coordinate systems) can be considered:

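$$f(\mathbf{r}_1) = \exp[i(\mathbf{r}_1 - \mathbf{r}_0)\cdot\mathbf{P}]\, f(\mathbf{r}_0) = \exp[i\mathbf{k}\cdot(\mathbf{r}_1 - \mathbf{r}_0)]\, f(\mathbf{r}_0)$$

where P=(1/i)∇ = (1/i)(∂/∂x,∂/∂y,∂/∂z).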

If g(k) is a vector independent of position, characterized by the three-dimensional vector k, then the general state vector can be represented by f(r,k)=(exp.ik.r)g(k)

P is again a hermitean operator with eigenvalues k, whose spectrum covers triplets of all possible positive and negative real numbers. It is the generator of the unitary operators exp.ir.P, which form a multiplication group isomorphic to the three-dimensional addition group of vectors k.
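The group property can be made tangible on a single eigenstate, where the operator exp i r.P reduces to the number exp i k.r. The sketch below uses hypothetical values for k and for two displacements, and shows that multiplying the phase factors corresponds to adding the displacement vectors.

```python
import cmath

def dot(a, b):
    """Scalar product of two three-component real vectors."""
    return sum(x * y for x, y in zip(a, b))

def translation_factor(k, r):
    """Value of exp(i r.P) on an eigenstate with wave vector k: exp(i k.r)."""
    return cmath.exp(1j * dot(k, r))

k = (0.3, -1.2, 2.0)          # hypothetical wave vector
r1 = (1.0, 0.5, -0.25)        # first displacement
r2 = (-2.0, 1.5, 0.75)        # second displacement
r_sum = tuple(a + b for a, b in zip(r1, r2))

product = translation_factor(k, r1) * translation_factor(k, r2)
combined = translation_factor(k, r_sum)
identity = translation_factor(k, (0.0, 0.0, 0.0))

print(product, combined, identity)
```

Two successive translations give the same factor as the single translation over the summed displacement, and the zero displacement gives the unit element.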

 

Uniform motion

So far we have only studied the temporal and spatial homogeneity of the systems, referred to different temporal and spatial frames. In order to include uniform motion, we first connect the two cases to obtain

$$f(t,\omega;\mathbf{r},\mathbf{k}) = \exp[i(\omega t + \mathbf{k}\cdot\mathbf{r})]\, g(\omega,\mathbf{k})$$

This result is just the plane wave discussed in chapter 7. Thus in a very natural way one arrives at the plane wave representation of temporal and spatial homogeneity, which in classical physics can only be found in a very laborious way. In chapter 7 we saw that the number ω has the character of a frequency and is proportional to energy, and that the vector k is the wave vector proportional to the linear momentum. That is to say that ω and k are constants of the motion. By considering Galilean transformations between inertial reference systems moving relative to each other, one finds as in classical physics[55] ω=k²/2m, where k is the absolute value of the wave vector k, and m is a frame-independent magnitude, called the mass of the system. Therefore, the system satisfies Schrödinger’s equation:

$$H f(t,\omega;\mathbf{r},\mathbf{k}) = \frac{\mathbf{P}^2}{2m}\, f(t,\omega;\mathbf{r},\mathbf{k})$$
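The step from the relation between ω and k to this equation can be spelled out as follows. It is a short worked check, assuming the plane wave f = exp i(ωt + k.r) g(ω,k) and units in which energy equals frequency and momentum equals wave vector.

```latex
% Assumed state: f = \exp[i(\omega t + \mathbf{k}\cdot\mathbf{r})]\, g(\omega,\mathbf{k})
\begin{align*}
H f &= \frac{1}{i}\frac{\partial f}{\partial t} = \omega f, \\
\mathbf{P} f &= \frac{1}{i}\nabla f = \mathbf{k} f
  \quad\Longrightarrow\quad
  \frac{\mathbf{P}^2}{2m} f = \frac{k^2}{2m} f, \\
H f = \frac{\mathbf{P}^2}{2m} f
  &\iff \omega = \frac{k^2}{2m}.
\end{align*}
```

The plane wave is thus a solution exactly when its frequency and wave vector satisfy the Galilean relation.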

If the Lorentz instead of the Galilean transformations are considered, the relation between energy E and momentum p becomes E²=m²c⁴+p²c². This means that for a certain value of the momentum (or wave vector) there are two possible values of the energy (or frequency), one positive and one negative. This was first pointed out by Paul Dirac, and because this relation is universally valid, each particle has a corresponding antiparticle, whose energy, or alternatively, charge is of opposite sign to that of its counterpart (11.6). Thus the positively charged positron corresponds to the negatively charged electron, and the negatively charged antiproton corresponds to the positively charged proton. Additionally, as regards some properties of nuclear reactions, a particle is the antipode of its antiparticle, and vice versa. But essentially they have the same rest mass.
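The two energy values belonging to one momentum follow directly from the quadratic relation. A minimal sketch, assuming units with c = 1 and hypothetical values for the momentum and the mass:

```python
from math import sqrt

def energy_roots(p, m, c=1.0):
    """Both roots of E^2 = m^2 c^4 + p^2 c^2 for momentum p and rest mass m."""
    magnitude = sqrt(m**2 * c**4 + p**2 * c**2)
    return magnitude, -magnitude

# Hypothetical values: p = 3, m = 4 in units with c = 1
e_positive, e_negative = energy_roots(3.0, 4.0)
print(e_positive, e_negative)
```

Both roots share the same rest mass m, and they differ only in sign.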

In general, a moving system will not be represented by a single vector, but by a mixture or wave packet

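$$h(\mathbf{r},t) = \int_{-\infty}^{+\infty} A(\mathbf{k})\, \exp[i(\mathbf{k}\cdot\mathbf{r} + \omega t + \varphi)]\, g(\omega,\mathbf{k})\, d\mathbf{k}$$

where the amplitudes A(k) and the phases φ are determined by the state selector.[56] Hence the squared absolute value |h(r,t)|² dr dt is the probability that the system will be in the spatial interval between r and r+dr during the temporal interval between t and t+dt. Thus the amplitudes A(k) are subjected to the normalization condition

$$\int_{-\infty}^{+\infty} h(\mathbf{r},t)\, h^*(\mathbf{r},t)\, d\mathbf{r}\, dt = 1$$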

In a similar way one can express the probability that the energy and the linear momentum have values in a certain interval. The general theory of the Hilbert space formalism now leads to the same Heisenberg relations between the minimal spreads in the statistical distributions of energy/momentum and time/position, as can be found from simple wave theory (chapter 7).
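The minimal spread can be illustrated numerically for a one-dimensional wave packet at a fixed instant. The sketch below assumes Gaussian amplitudes A(k), equal phases, and hypothetical values for the central wave number and the width; the integral over k is replaced by a sum over a grid.

```python
import cmath
from math import exp, sqrt

def spread(values, weights):
    """Standard deviation of `values` under (unnormalised) `weights`."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    var = sum((v - mean) ** 2 * w for v, w in zip(values, weights)) / total
    return sqrt(var)

sigma_k, k0 = 0.5, 2.0                                   # hypothetical width and centre
ks = [k0 + (i - 200) * 0.02 for i in range(401)]         # grid of wave numbers
amps = [exp(-(k - k0) ** 2 / (4 * sigma_k ** 2)) for k in ks]

xs = [(j - 150) * 0.05 for j in range(301)]              # grid of positions
packet = [sum(a * cmath.exp(1j * k * x) for a, k in zip(amps, ks)) for x in xs]

delta_k = spread(ks, [a * a for a in amps])              # spread of |A(k)|^2
delta_x = spread(xs, [abs(h) ** 2 for h in packet])      # spread of |h(x)|^2
product = delta_k * delta_x
print(delta_k, delta_x, product)
```

For Gaussian amplitudes the product of the two spreads comes out close to ½, the minimal value allowed by the Heisenberg relation in these units; other amplitude distributions give a larger product.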

 

9.6. Spin

 

A somewhat more complicated problem is concerned with the rotational invariance of isotropic space. In classical physics it had already been shown that this symmetry leads to the conservation of angular momentum. A group-theoretical analysis rules out a description of this symmetry in terms of one-dimensional matrices, which precludes the phase factor formalism.

At some stage of the development of quantum physics it was assumed that all relevant operators could be derived from those of energy, position and momentum, just as in classical physics.[57] One problem in the application of this so-called correspondence principle[58] is that the momentum and position operators do not commute with each other, which has no analogy in classical physics.[59] But even if this difficulty is circumvented, one arrives at incorrect or incomplete results, as in the case of angular momentum.

In the ordinary representation of classical physics for simple systems (consisting of mass points), the angular momentum is L=r×p, where r is the position vector, p the linear momentum, and r×p is a vector product. Since the position operators involve simple multiplication by x, y and z, the correspondence principle yields the operator Lz=(1/i)(x∂/∂y−y∂/∂x) for the z-component of the vector L. The corresponding eigenvalue equation can be solved. The eigenvalues are just the integers, m=0, ±1, ±2, etc. This solution is wrong, in so far as it gives only integral values for m, whereas experiments show that m can also have half-integral values, m=±(1/2), ±(3/2), etc. It is impossible to find these half-integral values by merely looking for an analogy in classical mechanics.

One arrives at the correct solution in two ways. First, the angular momentum components lx, ly, lz as defined above satisfy the commutation relations

$$l_x l_y - l_y l_x = i\,l_z \qquad \text{(cyclic in x, y and z)}$$

If this relation is not taken as a result, but as a starting point, one finds both integral and half-integral eigenvalues.
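That the commutation relations admit half-integral eigenvalues can be verified on the simplest case. The sketch below assumes the standard two-dimensional matrices for a spin of one half (half the Pauli matrices), which are not written out in the text; lz is diagonal with eigenvalues +½ and −½.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """ab - ba for 2x2 matrices."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def times_i(a):
    """Multiply every matrix element by the imaginary unit."""
    return [[1j * x for x in row] for row in a]

def close(a, b, tol=1e-12):
    """Element-wise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# Assumed two-dimensional representation (half the Pauli matrices)
lx = [[0, 0.5], [0.5, 0]]
ly = [[0, -0.5j], [0.5j, 0]]
lz = [[0.5, 0], [0, -0.5]]

print(close(commutator(lx, ly), times_i(lz)))
print(close(commutator(ly, lz), times_i(lx)))
print(close(commutator(lz, lx), times_i(ly)))
```

All three cyclic relations hold, although no operator of the form (1/i)(x∂/∂y−y∂/∂x) has been used.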

Alternatively, and without reference to classical physics, group theory shows that the unitary operators describing rotational invariance are reducible to matrices with one or more dimensions. Matrices of even dimensionality correspond to half-integral values of the spin, and those of odd dimensionality to integral values. If the dimensionality is n, the eigenvalues are: −½(n–1), …, +½(n–1), at unit intervals, such that the total number of different eigenvalues is just equal to n. The number n depends on the structure of the system, and is an invariant. The number ½(n–1), the highest eigenvalue for each spin component, is also an eigenvalue of the total angular momentum operator J, which must be sharply distinguished from the three components lx, ly, lz.
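The rule for the eigenvalues can be written out as a small function. This is a minimal sketch; the function name is hypothetical and exact fractions are used to keep the half-integral values readable.

```python
from fractions import Fraction

def spin_component_eigenvalues(n):
    """Eigenvalues -(n-1)/2, ..., +(n-1)/2 in unit steps for dimensionality n."""
    top = Fraction(n - 1, 2)
    return [top - step for step in range(n)][::-1]

for n in (1, 2, 3, 4):
    values = spin_component_eigenvalues(n)
    print(n, [str(v) for v in values])
```

Even n gives half-integral values only, odd n integral values only, and in each case there are exactly n different eigenvalues.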

Only those eigenstates of the system can be superposed which have the same eigenvalue of J. This is a so-called superselection rule, to be discussed later. On the other hand, for example, the superposition of eigenstates with different eigenvalues for the operator lz is always possible. Thus for a spin-zero particle (e.g., a pion), n=1, J=0, and m=0. For a spin one-half particle (e.g., an electron or a proton), n=2, J=½, and m can have two values: ±½; and for a spin-one particle (e.g., a photon), n=3, J=1, and m=±1 or 0. More complicated systems like nuclei and atoms may have states with different eigenvalues for J.

Although spin has a discrete spectrum, it is an external parameter just like energy and momentum. That is, it depends partly on the external reference system with respect to which the system is orientated. Now in a physical sense orientation can only be given by a field, which in every spatial point has a certain direction (the z-direction, say). With respect to this direction, the lz component may be assumed to have a certain value, whereas the fact that lz does not commute with lx and ly implies that in this case the state of the system (now an eigenstate of lz) cannot be an eigenstate of lx or ly. The system cannot have spin components perpendicular to the field.

For some systems (like electrons) the spin is associated with a magnetic moment, such that a magnetic field can orient the spin of the system. For other systems (such as light quanta), the direction of propagation is the determining factor. Light quanta can be transversely polarized in two mutually orthogonal directions, corresponding with m=±1 (the third eigenvalue, m=0, can only be realized in longitudinal, virtual photons, as a consequence of the typical structure of electromagnetic interaction).

 

Indeterminacy of the spin components

As observed, with every possible direction in space there corresponds a spin-component operator lz, which does not commute with the other operators, lx and ly. Thus if a system is oriented with respect to a certain direction, it cannot at the same time be oriented with respect to another direction. Consider a beam of electrons moving along the x-direction in the presence of a magnetic field pointing in the z-direction. This beam will split up into two parts, because the spin operator lz has two eigenvalues, ±½. If one beam passes through another magnet parallel to the former, no further splitting will occur, which shows that after passing the first magnet all electrons in one beam are in an eigenstate of lz, and remain so before and during their passage through the second magnet. But if the second magnet has its field in the y-direction, the beam will split again in two parts. If one of them passes a third magnet, again in the z-direction, a beam splitting will occur, showing that the orientation with respect to the y-axis (in the second magnet) destroyed the earlier orientation with respect to the z-axis.
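The sequence of beam splittings can be followed in a small calculation. The sketch below assumes the usual two-component eigenvectors of lz and ly for a spin of one half, which are not given in the text, and computes each probability as the squared absolute value of a scalar product.

```python
from math import sqrt

def prob(outcome, state):
    """Probability |(outcome, state)|^2 for normalised two-component vectors."""
    amplitude = sum(a.conjugate() * b for a, b in zip(outcome, state))
    return abs(amplitude) ** 2

s = 1 / sqrt(2)
z_up, z_down = [1 + 0j, 0j], [0j, 1 + 0j]        # assumed eigenvectors of lz
y_up, y_down = [s + 0j, s * 1j], [s + 0j, -s * 1j]  # assumed eigenvectors of ly

# Second magnet parallel to the first: no further splitting
p_same = (prob(z_up, z_up), prob(z_down, z_up))

# Second magnet along y: the z-oriented beam splits in two equal parts
p_y = (prob(y_up, z_up), prob(y_down, z_up))

# Third magnet along z behind the y-magnet: the beam splits again
p_z_after_y = (prob(z_up, y_up), prob(z_down, y_up))

print(p_same, p_y, p_z_after_y)
```

After the passage through the y-magnet both lz values are again equally probable, so the earlier orientation along z has left no trace.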

Maybe Ballentine overlooked this in his thought experiment in which he tries to show the possibility of determining the spin of a particle along two different directions, y and z.[60] His experiment proceeds as follows. Suppose two particles with J=½ emerge from a decaying particle of spin zero. This means (because of angular momentum conservation) that the two particles must have opposite spins. If the decaying particle was at rest, the two emerging particles will also have opposite momenta. Now suppose one measures the spin of one particle with a magnet in the y-direction, and that of the other with a magnet in the z-direction. Then, according to Ballentine, we have determined both the y- and the z-components of the spin of each particle.

However, it appears that Ballentine is wrong in assuming that there is no magnetic field at the place where the original particle decays. Each emerging particle experiences the field of the other. Therefore, each particle is already oriented before it arrives at the measuring magnet. And with this magnet, not only the original orientation is destroyed, as we saw above, but also the connection with the other particle.

Ballentine considers the two particles as a single system, which means that their combined state must be described by a single vector in a Hilbert space. But the fact that their spin is oriented relative to each other as described above implies that this Hilbert space for the two-particle system can be separated into two one-particle Hilbert spaces, which are mutually orthogonal, such that a measurement on one system has no bearing on the other. Of course, if the original orientation immediately after the decay can be fixed, there is a statistical correlation between the measurements of lz for both systems, if the measuring magnets are both oriented in the z-direction.

Ballentine’s experiment was proposed as an alternative to the famous Einstein-Podolsky-Rosen paradox,[61] in which the momentum of one system is determined by the momentum of another, with which the former has interacted shortly before the measurement. In this case one makes use of the conservation of linear momentum. But then again, account is not taken of the destruction of the previous state of the measured system due to the measurement of its momentum. This means that nothing can be inferred about the state of the unmeasured system, whose state is orthogonal to that of the measured system.[62]

The objection may arise here that in Compton’s experiment it is shown that the measurement of the momentum of one colliding particle gives the value of the other one because of the law of conservation of momentum, which therefore is also valid for microsystems. But in this case the value of the momentum is much larger than its statistical spread, so that the latter can be neglected. For low-energy experiments as discussed by Einstein, Podolsky and Rosen, the spread in momentum becomes proportionally more significant (7.6).

For a clear understanding of these paradoxes, one has to keep in mind that the Hilbert space formalism is only applicable to isolated systems. The paradox arises from the jump from one isolated system to another. First one considers the two colliding or emerging particles as one system. If we consider the two particles separately, we must then consider them either as interacting with each other (which means they are not isolated), or as having a state prepared by the preceding interaction. But, in the latter case, this state need not be an eigenstate with respect to the measuring instrument. The ‘projection postulate’ (according to which the state of the system changes in a measurement into an eigenstate determined by the measuring instrument) applies to individual systems. Therefore it must not be applied to the state of the two particles together, if the measurement is only done on one of them. It is only the state of the latter particle which is subjected to the projection postulate.

 

9.7. Typical symmetry

 

The unitary operators describing the symmetry of a system always form a group. The group of external temporal or spatial translations and the Lorentz or Galilean groups of kinetic motion are continuous. On the other hand, those describing rotational symmetry are discrete, and this is also often the case for internal symmetries.

With full knowledge of the internal structure of a system the symmetry properties would not be needed, because they are inherent in the structure. But more often than not, the symmetry relations are better known than the detailed structure, and therefore form a great help in designing and interpreting experiments. This is also the case where the system is in principle known, but mathematically difficult to treat, as in molecules and solids.

Classically one speaks of an internal symmetry whenever it can be determined a priori that for two or more cases a certain probability must be the same, regardless of its precise value. For example, Laplace’s equally favourable cases are determined by symmetry. But only if all possible cases are equally favourable (as in dice throwing) is it sufficient to find the probability value itself. This is not usually the case in physically qualified systems.

The idea that symmetry leads to equal probability does not rely on a complete lack of knowledge or lack of sufficient reason as was sometimes assumed in order to reconcile statistical methods with a deterministic philosophy. This idea can even lead to absurdities if taken literally. Take, for example, the case of throwing two dice simultaneously. Beginning with the argument of a complete lack of knowledge one should argue that all possible results, 2, 3, …, 12, are equally probable, which is patently false.
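The example of the two dice is easily worked out by enumeration. A minimal sketch, assuming two fair six-sided dice so that the 36 ordered pairs are the equally favourable cases:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally favourable pairs give each total
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
probabilities = {total: Fraction(n, 36) for total, n in sorted(counts.items())}

for total, p in probabilities.items():
    print(total, p)
```

The totals 2 and 12 each occur in one case out of 36, whereas the total 7 occurs in six. The symmetry belongs to the pairs of faces, not to the eleven possible totals.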

Symmetry is relational with respect to the environment of the subject concerned. That is, it anticipates possible interaction. It is not an epistemological lack of knowledge, which we confront, but an ontological indifference on the part of one subject (the environment) which interacts with another subject, whose symmetry is being considered.

Now in quantum physics the probability of finding a certain state depends jointly on the initial state, and the operator describing the interaction, namely the reference system. But when speaking of symmetry, one abstracts from the initial state. One wishes to discuss the symmetry of the law for the system, with respect to some reference system. Therefore, contrary to what was stated in the preceding paragraph, the probability cannot be immediately discussed. Instead the eigenstates and eigenvalues of the operator must be considered in relation to the reference system.

A hermitean operator determines the spectrum of possible values which can be assumed by a certain physical magnitude. Thus an eigenstate of the operator can be characterized by its corresponding eigenvalue, which is the value of the physical magnitude for the system in that state. For example, a measurement does not directly determine the eigenstates, but the corresponding eigenvalues, and this is also the case in a state selector. Contrary to the eigenstates, the eigenvalues can be manipulated. This implies that if two eigenstates have the same eigenvalue, the two eigenstates cannot be distinguished, neither in a selector, nor in a measurement.

The occurrence of two different eigenstates having the same eigenvalue is sometimes accidental. However, this accidental degeneracy does not interest us. More important is the degeneracy due to some kind of symmetry. For external symmetries, the temporal homogeneity for example implies that an infinity of eigenstates (differing by a phase factor) have the same energy.

The internal symmetry of a system can be described by a set of unitary operators. If one has just two mutually orthogonal eigenvectors with the same eigenvalue for some hermitean operator A, then there is a unitary operator U which transforms one eigenstate into the other, and its inverse U⁺ which performs the reverse transformation. Together with the identity operator, I=UU⁺, they form a group. If A has more than two orthogonal eigenvectors which are degenerate, then the corresponding group of unitary operators has more than two members. These unitary operators commute with the hermitean operator A, which is therefore left invariant under the symmetry operations. This shows again the extreme importance of group theory for the analysis of typical structures. Even if group theory cannot give the eigenvalues of A, it can tell which of them are degenerate.
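A small matrix example may make this concrete. The operator A below is a hypothetical diagonal matrix with one doubly degenerate eigenvalue, and U is the operator exchanging the two degenerate eigenvectors; in this simple case U coincides with its own inverse.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def apply(a, v):
    """Apply a square matrix to a vector."""
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(v))]

A = [[5, 0, 0], [0, 5, 0], [0, 0, 2]]   # hypothetical operator, eigenvalue 5 twice
U = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]   # exchanges the two degenerate eigenvectors

e1, e2 = [1, 0, 0], [0, 1, 0]

print(apply(U, e1) == e2, apply(dagger(U), e2) == e1)
print(matmul(U, dagger(U)))             # identity operator
print(matmul(U, A) == matmul(A, U))     # U commutes with A
```

The exchange leaves A unchanged, which is just another way of saying that the two eigenvectors cannot be told apart by their eigenvalue.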

 

A complete set of commuting operators

Some operators commuting with A (and therefore having the same set of eigenvectors), may break the symmetry leading to degenerate eigenvalues for A. Now it is an important hypothesis that a complete set of compatible magnitudes exists for any typical structure. That is, there exists a complete set of mutually commuting hermitean operators, such that each eigenstate can be uniquely characterized by a set of eigenvalues or quantum numbers.[63] Group theory can show which eigenvalues of some operator of the complete set are degenerate. This set of operators ultimately determines both the full set of possible states and their relative weights (jointly with the initial state of the system).

Such a complete set is not unique, however. There are always several mutually incompatible sets of compatible magnitudes. Thus, any component of the spin can belong to such a set, but different spin components are mutually incompatible, and cannot simultaneously belong to the same set.

There is no criterion known for the completeness of such a set. The possibility always exists of finding a new kind of symmetry applicable to the system. For example, for a long time it was thought that a complete set of magnitudes for a free electron consists of momentum (or energy as an alternative), mass, spin, and one of the spin components. Later it was realized that the electron could exist in two charge-eigenstates, one negative for the common electron, and one positive for the positron. Thus charge must be added to the set. Finally yet another attribute, the lepton number L, emerged, compatible with all the members of the set (11.2).[64]

 

Superposition and superselection

If two eigenstates of some operator (such as the Hamiltonian in connection with the energy) can be superposed, then the system is said to be in the state objectified by the linear superposition of those two eigenstates. In particular, if two eigenstates, for some operator, are degenerate, then any linear superposition of the two states is also an eigenstate of that operator. One consequence of the complete set of compatible operators is that no two eigenstates can be degenerate with respect to all operators of the set.

However, not all eigenstates can be superposed. For instance, of the properties mentioned above for a free electron, only the eigenstates of the momentum and of the components of the spin can be superposed. The mass, charge, total spin, and the lepton number are subject to so-called superselection rules, stipulating that their eigenstates cannot be superposed. Even when considering the electron and the positron being different states of the same system, these states are not superposable. There are no states with partly positive and partly negative charge. This distinction between superposable and non-superposable properties allows us to distinguish between two basically different kinds of typical structures (10.5).[65]

In conclusion, we find the somewhat surprising result that in quantum physics the symmetry of the system does not immediately lead to a prediction concerning the probability of a state, but rather to one concerning a possible value. If, according to symmetry, different eigenstates have the same eigenvalue, these states cannot be distinguished by a measurement that discriminates on the basis of the eigenvalues. If the initial state is fixed (e.g., by a state selector), then the probabilities associated with the different eigenstates possessing the same eigenvalues may be quite different.

However, if, for the sake of argument, one assumes the initial state to be arbitrary, then all eigenstates have the same probability, and thus the relative weights of the eigenvalues are proportional to their degree of degeneracy. And this is once again the same as in classical physics, which usually assumes a completely random initial state.
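This proportionality amounts to simple counting. The sketch below assumes a hypothetical spectrum in which each eigenstate is listed with its eigenvalue, and gives every eigenstate the same probability.

```python
from collections import Counter
from fractions import Fraction

def relative_weights(eigenvalues):
    """Probability of each eigenvalue when every eigenstate is equally likely."""
    n = len(eigenvalues)
    return {value: Fraction(count, n)
            for value, count in Counter(eigenvalues).items()}

# Hypothetical spectrum: one entry per eigenstate, degenerate values repeated
spectrum = [5, 5, 5, 2, 2, 1]
weights = relative_weights(spectrum)
print(weights)
```

The threefold degenerate eigenvalue is found three times as often as the non-degenerate one.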

 

9.8. The dynamic development of a physical system

 

Section 9.5 discussed the conservation laws of energy and momentum with respect to an isolated system, finding that its external motion is just the plane wave motion dealt with in chapter 7. This is the case e.g. for electrons. For complicated systems, like atoms, the energy operator, the Hamiltonian H, not only has relevance to external motion (kinetic energy), but also to internal motion. This consists at least of potential energy as well as kinetic energy due to the relative motions of the electrons and the nucleus in the atom.

The ‘central dynamical postulate’ says that for any isolated system the temporal evolution of its initial state is determined by the Hamiltonian. In particular, if the initial state happens to be an eigenstate of the Hamiltonian (i.e., if it is prepared by a pure-state energy selector), the state will remain the same: it is a stationary state, with a definite eigenvalue, the energy of the system. Strictly speaking, only the ground state of an atom is stationary, and any other state will sooner or later decay. Implicit in such a decay process, however, is the occurrence of a reaction to the environment, which means the system is not strictly isolated.

The Hamiltonian can tell us something about the probability of decay, i.e., the mean decay time. An unstable state can always be conceived as the superposition of several stationary states, and the Hamiltonian determines the probability that the system will be found in a state consisting of the ground state of the original system plus a free photon. The only requirement is that the energy of the system in its ground state plus the energy of the emitted photon be equal to the energy of the initial unstable state. This is only an approximation, however, and a more satisfactory theory is found in quantum electrodynamics, which starts from the interaction between the system and an electromagnetic field. If the initial state is not an eigenstate of the Hamiltonian, then its change will be determined by its relation to the electromagnetic field.

If the initial state is a mixture, it can be shown that it becomes more and more mixed. In fact, for mixed states a magnitude can be defined having the same properties as the classical entropy. For a pure state, this magnitude is zero whereas for a mixed state it is positive. For an isolated system it increases during its temporal evolution, which is therefore subjected to the physical time order of irreversibility.
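The difference between a pure and a mixed state with respect to this magnitude can be computed directly. The sketch below assumes the usual expression, minus the sum of p ln p over the statistical weights p of mutually orthogonal pure states in the mixture; the weights themselves are hypothetical.

```python
from math import log

def entropy(weights):
    """-sum(p * ln p) over the statistical weights of a mixture (0 ln 0 taken as 0)."""
    return -sum(p * log(p) for p in weights if p > 0)

pure = [1.0, 0.0, 0.0]          # a single pure state
mixed = [0.5, 0.3, 0.2]         # hypothetical mixture of three states
uniform = [1 / 3, 1 / 3, 1 / 3] # completely random mixture

print(entropy(pure), entropy(mixed), entropy(uniform))
```

The value is zero for the pure state, positive for the mixture, and largest for the completely random mixture, where it equals ln 3.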

Besides the above-mentioned distinctions between pure and mixed states, another one is found in interference, the most characteristic kinetic property of waves. Interference is only possible if (and as long as) the interfering waves are coherent, i.e., if their phase relations are not random, but determined by some previous interaction, namely, the preparation of the wave packet. Thus one speaks of the ‘coherence length’ of a wave packet. Now a pure state has an infinite coherence length, and a mixed state has a finite one. The possibility of interference is therefore limited for a mixed state (and, of course, every actual state of any concrete system is more or less mixed). Thus any distinction between pure and mixed states has to do with irreversibility as the physical time order, and is related to some interaction. It turns out that the pure state is an idealized boundary case, just as a state of rest is a boundary case of motion.

Most authors on the foundations of quantum physics take for granted that an isolated system should be described with a complex Hilbert space. Tolman calls the phase factor randomness (9.4) the basic postulate of quantum mechanics,[66] but Jauch states that as yet no one has shown that a real Hilbert space fails to do the job.[67] I hold that complex numbers are most suited to a description of interference phenomena, which is one of the characteristic differences between classical and modern disclosed kinematics.

The Hilbert space concept is mostly confined to isolated systems. It implies the possibility of describing the internal structure of systems like nuclei, atoms, molecules and solids, and also simple cases of collisions (which are describable as isolated events). The theory allows one to incorporate internal interactions. The problem of the actualization of possibilities is then circumvented, because one restricts oneself to the calculation of stationary states and the mean values of properties. As to its external relations, the theory can do little more than describe the particle’s motion. With respect to external interactions, it can only give probabilities, in an anticipatory way.

 

Actualization

Thus we find that the Hilbert space formalism has a very wide scope, since it can solve four of the problems mentioned in section 9.1. But it has its limitations too. It cannot solve the fifth problem, concerning the actualization of one of the possible states.[68] This is often called the measurement problem, although it pertains to any external interaction. This is not always recognized, because in many cases of external interaction it is possible to solve the problem by including the two interacting subjects in one single system, in which case the Hilbert space formalism is applicable. This method is very successful in describing the interaction of, e.g., electrons and nuclei, and somewhat less successful in describing the interaction between an atom and an electromagnetic field. The latter does not lend itself to treatment as an isolated system.

In order to see the relevance of interaction to the understanding of our problem, consider the following example. It is a well-known statement that an atom has discrete energy levels, each corresponding to an eigenstate of the Hamiltonian operator. For the sake of argument, let us consider only two such states, the ground state and the first excited state. The Hilbert space formalism says that the atom may also be in a mixed state, described by a Hilbert space vector which is a linear combination of the two eigenstates. To this mixed state there corresponds no fixed energy value. How should this be interpreted?
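The absence of a fixed energy value can be made explicit in a short worked equation. The notation is an assumption of this illustration: ψ₀ and ψ₁ stand for the two normalized eigenstates with energies E₀ and E₁, and c₀ and c₁ for the complex coefficients of the linear combination.

```latex
\psi = c_0\,\psi_0 + c_1\,\psi_1, \qquad |c_0|^2 + |c_1|^2 = 1,
\qquad
\langle H \rangle = |c_0|^2 E_0 + |c_1|^2 E_1,
\qquad
(\Delta E)^2 = \langle H^2 \rangle - \langle H \rangle^2
             = |c_0|^2\,|c_1|^2\,(E_1 - E_0)^2 .
```

The spread ΔE vanishes only if one of the two coefficients is zero, i.e., if the state coincides with one of the eigenstates. In every other case the formalism yields a mean value and a spread, but no definite energy.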

As long as we consider the atom as isolated, we are at a loss, because every isolated system has a fixed energy. However, no system is really isolated. This problem can be studied with the help of quantum electrodynamics, observing that each atom interacts with an electromagnetic field, such that the mixed state can be considered as the ground state together with a virtual photon – in which case the energy of the atom is not fixed, but neither is the atom isolated.

There is also a probabilistic interpretation. In fact, one will never have a single atom, but rather a dilute gas. The interactions between the atoms may be so small that for the calculations the atoms may be assumed to be isolated. At the same time, the interaction may be large enough (e.g., via collisions: the temperature of the gas must be well above zero) to have atoms both in the ground state and in the first excited state, in a ratio determined by the Boltzmann factor (8.5), which therefore gives the probability that a certain atom is in one of these states. This ratio is immediately related to the mixed-state vector, for which an interpretation was sought. Thus by introducing a weak interaction, small enough not to disturb the calculation results, a very natural interpretation of these mixed states can be given.
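The ratio determined by the Boltzmann factor can be sketched numerically. The example below is a simplified illustration: it assumes a gas of two-level atoms in thermal equilibrium, ignores the degeneracy of the levels unless statistical weights are supplied, and uses the hydrogen energy difference of 10.2 eV (derived from note 10) and a temperature of 6000 K as example values. All function and parameter names are hypothetical.

```python
import math

K_BOLTZMANN_EV = 8.617333262e-5  # Boltzmann's constant in eV per kelvin


def excited_to_ground_ratio(delta_e_ev, temperature_k, g_excited=1, g_ground=1):
    """Ratio of atoms in the excited state to atoms in the ground state.

    delta_e_ev is the energy difference between the two levels in eV;
    g_excited and g_ground are optional statistical weights (assumed 1).
    """
    if temperature_k <= 0:
        raise ValueError("temperature must be above zero")
    boltzmann_factor = math.exp(-delta_e_ev / (K_BOLTZMANN_EV * temperature_k))
    return (g_excited / g_ground) * boltzmann_factor


def occupation_probabilities(delta_e_ev, temperature_k):
    """Probabilities that a given atom is in the ground or the excited state."""
    ratio = excited_to_ground_ratio(delta_e_ev, temperature_k)
    return 1.0 / (1.0 + ratio), ratio / (1.0 + ratio)


ratio_6000 = excited_to_ground_ratio(10.2, 6000.0)
p_ground, p_excited = occupation_probabilities(10.2, 6000.0)
print(f"ratio: {ratio_6000:.3e}, ground: {p_ground:.9f}, excited: {p_excited:.3e}")
```

At low temperatures the ratio is extremely small and practically all atoms are in the ground state; it grows as the temperature rises.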

In these cases too, however, no actualization is described, and the success of the theory lies in the predictability of observational probabilities, stationary states, etc.[69] It has been suggested that the main reason why the Hilbert space formalism fails here lies in the superposition principle, by which the linear combination of two possible states is again a possible state. Hence, if a state is a linear combination of two eigenstates of some operator, the system can be said to have simultaneously two different values for the magnitude represented by that operator. This principle is only applicable to potential states, not to actual ones. Whereas the possible energies of a system can be superposed, as can be verified in interference experiments, actual energies cannot.

Above, superposable and non-superposable properties of isolated systems were distinguished. Only of the latter can it be said that the system possesses the property. Thus an electron has a negative charge, a positron has a positive charge, but neither possesses a certain energy value. Only if the electron has actualized one of its possibilities in some external interaction may it be said to have a certain energy value. This is not only the case in a measurement or observation, but also, e.g., in a collision between two electrons, if after the collision there is a correlation between their states. Evidently a theory of external interaction must do away with linearity and superposability. This was recognized some time ago by several physicists, among them Werner Heisenberg, Louis de Broglie, Jean-Pierre Vigier, and David Bohm.[70] However, the mathematical difficulties involved in a non-linear theory are enormous, and the theory is progressing only very slowly, if at all.

So the conceptual difficulties involved in quantum theory can be understood if one keeps in mind that the theory only applies to isolated systems and potential interactions. This poses the question as to what extent the concept of an isolated system is fruitful. Strictly speaking, isolated systems do not exist, and the concept itself is even objectionable, since physical systems are characterized by their interaction as the basic physical subject-subject relation. It is a subject-subject relation, an interaction between two physically qualified systems, which is at stake. It is not a psychically qualified subject-object relation between the observer and the observed system, as was believed by, e.g., Niels Bohr and Werner Heisenberg.[71] I do not say that the latter relation is of no interest. However, if one wishes to study the latter, one must first of all be aware of the former subject-subject relation and its implications, precisely because any observer (a psychic subject) is a physical subject as well.

The answer to our question must be found in the study of the individuality structures which are physically qualified. Both the driving motive and the success of quantum theory must be sought in this field of research.

First, it is just the (limited) individuality of physical systems which makes it both necessary and fruitful to consider them in isolation, in an intermediary state between two interactions. Thus the weakness of quantum theory is its strength all the same. Secondly, the distinction of potentiality and actualization will be related to the distinction between the thing-like and the event-like character of physically qualified individual subjects. Both will be considered in part II of Laws for dynamic development.



[1] Jammer 1966, 307ff.

[2] Jammer 1966, 315.

[3] For the sake of argument it will be assumed that the number of dimensions of Hilbert space is denumerable.

[4] This means that an operator in the Hilbert space manifests itself as a ‘matrix’ by which the components of a function in the space are transformed.

[5] To each linear operator A corresponds an adjoint operator A⁺, such that (f,Ag) = (A⁺f,g); this implies that I⁺ = I.

[6] Cp. Jauch 1968, 73: ‘… every measurement on a physical system can be reduced, at least in principle, to the measurements with a certain number of yes-no experiments.’ It should be realized that a yes-no experiment is a theoretical idealization. In principle, no actual physical measuring instrument can be reduced to a perfect yes-no experiment because of noise. Just because every physical instrument is a quantum mechanical system itself, this noise cannot be made as small as one likes. This refutes the possibility of accepting yes-no experiments as an empirical basis for quantum physics, although they may be used as a theoretical starting point.

[7] Achinstein 1991, 24. Decisive was Foucault’s experimental confirmation in 1854 of the wave-theoretical prediction that light has a lower speed in water than in air. Newton’s particle theory predicted the converse.

[8] See Hanson 1963, 13; Jammer 1966, 31.

[9] Cathode rays, canal rays and X-rays are generated in a cathode tube, a forerunner of the television tube, fluorescent lamp and computer screen.

[10] Pais 1991, 150. Planck’s reduced constant is h/2π. In Bohr’s theory the angular momentum is L = nh/2π, n being the orbit’s number. For the hydrogen atom, the corresponding energy is Eₙ = E₁/n², with E₁ = −13.6 eV, the energy of the first orbit.

[11] Darrigol 1986.

[12] The group velocity df/ds = dE/dp equals approximately Δf/Δs. E/p > c and dE/dp < c follow from the relativistic relation between energy and momentum, E = √(E₀² + c²p²), where E₀ is the particle’s rest energy. Only if E₀ = 0, E/p = dE/dp = c. Observe that the word ‘group’ for a wave packet has a different meaning than in the mathematical theory of groups.

[13] Bohr 1934, chapter 2; Bohr 1949; Meyer-Abich 1965; Jammer 1966, chapter 7; 1974, chapter 4; Pais 1991, 309-316, 425-436. Bohr’s principle of complementarity presupposes that quantum phenomena only occur at an atomic level, which is refuted in solid state physics. According to Bohr, a measuring system is an indivisible whole, subject to the laws of classical physics, showing either particle or wave phenomena. In different measurement systems, these phenomena would give incompatible results. This view is out of date. [Sometimes, non-commuting operators and the corresponding variables (like position and momentum) are called ‘complementary’ as well, at least if their commutator is a number.]

[14] Kastner 2013, 31-33.

[15] Even in classical physics, the idea of a point-like particle is not well-defined. Both its mass density and charge density are infinite, and its intrinsic angular momentum cannot be defined.

[16] Light in vacuum is an exception.

[17] The values ‘1’ and ‘h’, respectively, in the relations mentioned indicate an order of magnitude. Sometimes other values are given, e.g. h/4π instead of h; see Messiah 1961, 133.

[18] If Δx·Δs = Δt·Δf = 1, the wave packet’s speed v = Δx/Δt = Δf/Δs is approximately the group velocity df/ds, according to De Broglie.

[19] In communication technology, Δf is the bandwidth, see Bunge 1967a, 265. Bunge denies that wave-particle duality exists in quantum mechanics, see ibid. 266, 291. In his formulation, the single concept of a quanton replaces the concepts of wave and particle. However, this masks the fact that in the quanton a physical and a kinetic character are interlaced.

[20] See e.g. Margenau 1950, chapter 18; Messiah 1961, 129-149; Jammer 1966, chapter 7; 1974, chapter 3; Omnès 1994, chapter 2; Torretti 1999, chapter 6.

[21] From the commutation properties of the operators referring to the components of angular momentum for an electron (having rotational symmetry), one derives the integral eigenvalues for the orbital angular momentum as well as the half-integral eigenvalues for the intrinsic angular momentum or spin, see Messiah 1961, 523-536.

[22] Bunge 1967a, 248, 267. 

[23] I leave here aside the important distinction between a time dependent and a time independent Hamiltonian, the former describing transition processes, the latter stationary states.

[24] Heisenberg 1930, 21-23.

[25] Kastner 2013, 202: ‘The interpretive challenge of quantum theory is often presented in terms of the measurement problem: i.e., that the formalism itself does not specify that only one outcome happens, nor does it explain why or how that particular outcome happens. This is the context in which it is often asserted that the theory is incomplete and is therefore in need of alteration in some way.’

[26] In fact, the value of ΔE is less significant than the relative indeterminacy ΔE/E. For a macroscopic system the energy E is so much larger than ΔE that the energy fluctuations can be neglected, and the law of conservation of energy remains valid.

[27] Such virtual processes are depicted in the so-called Feynman diagrams.

[28] Jammer 1974, 38-44.

[29] The probability of finding a particle in the volume element between r and r+dr is ψ(r)ψ*(r)dr, hence the scalar product ψ(r)ψ*(r) is a probability density.

[30] Cartwright 1983, 179. Of course, the probability is not given by a single wave function, but by a wave packet. If this consists of a set of orthogonal eigenvectors, a matrix represents the transition probability.

[31] ‘The true philosophical import of the statistical interpretation consists in the recognition that the wave-picture and the corpuscle-picture are not mutually exclusive, but are two complementary ways of considering the same process’, M. Born, Atomic physics, 1944, quoted by Bastin (ed.) 1971, 5.

[32] The fact that quantum physics is a stochastic theory has evoked widely differing reactions. Einstein considered the theory incomplete. Born stressed that at least waves behave deterministically, only its interpretation having a statistical character. Bohr accepted a fundamental stochastic element in his world-view.

[33] Heisenberg 1958, 25.

[34] Observe that an interference-experiment aims at demonstrating interference. This is only possible if the interference of waves is followed by an interaction of the particles concerned with, e.g., a screen.

[35] For the relevance of interactions for the interpretation of quantum physics, see Healey 1989.

[36] Theoretically, this means the projection of the state vector corresponding to the state before the measurement onto one of the eigenvectors of Hilbert space, representing the state of the system after the measurement. Omnès 1994, 509: ‘No other permanent or transient principle of physics has ever given rise to so many comments, criticisms, pleadings, deep remarks, and plain nonsense as the wave function collapse.’ In particular, the assumptions that probability is an expression of our limited knowledge of a system and that the observer causes the reduction of the wave packet have led to a number of subjectivist and solipsist interpretations of quantum physics and related problems, of which I shall only briefly discuss that of Schrödinger’s cat.

[37] Omnès 1994, 84: ‘This transition therefore does not belong to elementary quantum dynamics. But it is meant to express a physical interaction between the measured object and the measuring apparatus, which one would expect to be a direct consequence of dynamics.’ Cartwright 1983, 195: ‘Von Neumann claimed that the reduction of the wave packet occurs when a measurement is made. But it also occurs when a quantum system is prepared in an eigenstate, when one particle scatters from another, when a radioactive nucleus disintegrates, and in a large number of other transition processes as well … There is nothing peculiar about measurement, and there is no special role for consciousness in quantum mechanics.’ But contrary to Cartwright (ibid., 198), who states: ‘… there are not two different kinds of evolution in quantum mechanics. There are evolutions that are correctly described by the Schrödinger equation, and there are evolutions that are correctly described by something like von Neumann’s projection postulate. But these are not different kinds in any physically relevant sense’, I believe that there is a difference. The first concerns a reversible motion, the second an irreversible physical process, cp. Cartwright 1983, 179: ‘Indeterministically and irreversibly, without the intervention of any external observer, a system can change its state … When such a situation occurs, the probabilities for these transitions can be computed; it is these probabilities that serve to interpret quantum mechanics.’

[38] The principle of decoherence is in some cases provable, but is not proved generally, see Omnès 1994, chapter 7, 484-488; Torretti 1999, 364-367. Decoherence even occurs in quite small molecules, see Omnès 1994, 299-302. There are exceptions too, in systems without much internal energy dissipation, e.g. electromagnetic radiation in a transparent medium and superconductors, see Omnès 1994, 269. The decoherence approach has been criticized, see e.g. Kastner 2013, 14-16.

[39] Heisenberg’s relations can now be interpreted as determining the minimum statistical spread or standard deviation in measurements.

[40] Popper 1967 underestimates this difference between classical and quantum statistics in his critique of the ‘great quantum muddle’.

[41] This is an idealized statement, because usually the state of a system must be represented by a statistical mixture, to be described with the help of a so-called density operator (9.8).

[42] Jauch 1968, 157ff.

[43] It is no longer a Boolean algebra; cf. Weizsäcker 1971, 237.

[44] Houtappel et al. 1965, 611.

[45] Tolman 1938, 349ff.

[46] Heisenberg 1958, 38.

[47] Margenau 1950, 175, 335.

[48] Jauch 1968, 90ff.

[49] Bunge 1967a, 249.

[50] On the historical development of the concept of complementarity, see Jammer 1966, 345ff, 1974; on the views of Bohr, see Bohr 1934; 1949; 1958; 1963; Meyer-Abich 1965, chapter 3; Bunge 1959b, 173-209; Feyerabend 1962; Hanson 1959; Holton 1973, 115-161; Hooker 1972; Pais 1991.

[51] Bohr 1949, 209, 210.

[52] From a historical point of view, this interpretation rests on a misunderstanding (first by Pauli) of Bohr’s ideas; see Meyer-Abich 1965, 152.

[53] Jauch 1968, 69, 112ff.

[54] This procedure was invented long before the rise of quantum physics, by Lagrange; cf. Jammer 1966, 224.

[55] Jauch 1968, 198; Kaempffer 1965, Chapter 9.

[56] A pure state cannot be normalized, and is therefore unsuited for the representation of a physical system.

[57] This view is not justified, not even for classical physics: the formulation as given in the text is a simplification.

[58] See for the correspondence principle, Jammer 1966, 109-118; Meyer-Abich 1965; Hanson 1958, 1959, 1963, 60-67; Messiah 1958, 29-31; Heisenberg 1958, Chapter 3; Petersen 1968.

[59] Margenau, Cohen 1967, 85, 86. For this section, see e.g. Messiah 1958, Chapter 3 and 13, p. 195ff, 508ff; Jauch 1968, Chapter 14; Kaempffer 1965, Chapter 1, 12.

[60] Ballentine 1970; see also Frisch (in: Bastin (ed.) 1971, 20) on a similar thought experiment; see also Jammer 1974, 235, 302ff, 309.

[61] Einstein, Podolsky, Rosen 1935; see also Bohr 1949, 231ff; Margenau 1950, 261ff; Feyerabend 1962; Jammer 1974; Hooker 1972. In 1982 Alain Aspect performed experiments which seem to confirm the view expressed in this section.

[62] According to the Heisenberg relations, the states of the two particles after the collision cannot be pure momentum (or energy) states, because of the finite interval (or duration, respectively), in which the interaction took place.

[63] Messiah 1958, 294.

[64] Kaempffer 1965, 2.

[65] In more complicated systems like atoms there are also ‘selection rules’ which sometimes refer to the same kind of operators (total angular momentum, e.g.). However, the transitions which are ‘forbidden’ by these rules are often simply less probable than other transitions because the symmetry underlying these rules is ‘broken’ by some perturbing interaction; see any textbook on quantum physics.

[66] Tolman 1938, 349ff.

[67] Jauch 1968, 121; see also Weizsäcker 1971, 252.

[68] Feyerabend 1962.

[69] Popper 1967, 25, 26.

[70] Bohm, in: Bastin (ed.) 1971, 102ff.

[71] See, e.g., Heisenberg 1930, 2; Heitler 1949, 191.