PREFACE: "The statistics" vs "Statistics"

Statisticians should fit the needs of the users, not the reverse! - J.W. Tukey

From the statistics to Statistics

There are statistics and statistics. "The statistics" (familiarly, the "stats") are the statistical data (averages, percentages, numbers of all sorts) that are ubiquitous in the media and found in every imaginable area: official statistics, surveys, etc. By "Statistics", often written with a capital S (the "science of Statistics"), is meant the scientific discipline dealing with the methods for analyzing statistical data. My work has been concerned with "Statistics".

Before we proceed, two massive facts that overshadow every development must be mentioned. First, there is the overwhelming dominance (from 1945 to the present day) of Anglo-Saxon statistics; see "A quasi-monopoly". Second, there is the phenomenon of hyperspecialization, which fragments a single subject (such as statistics) into isolated subspecialties.

Academic Statistics and Statistics for researchers

The statistical discipline is a "metadiscipline", whose raw material lies outside
the discipline. By its very nature, it lies at the junction of two lines
of thinking, namely mathematics and empirical sciences. Among the
founding fathers of statistics, the two lines were always present; whereas nowadays they
are well separated, with academic statistics on the one hand, and statistics for researchers on the
other hand.

One finds academic statistics in the mathematics departments of universities and in the "theoretical" teaching of institutions such as (in France) INSEE (Economic Research) or INSERM (Medical Research). This discipline calls itself "mathematical statistics" and purports to be a deductive theory, like mathematical physics.

One finds statistics for researchers in laboratories and empirical studies, from the natural sciences to the social sciences. It is an essentially normative discipline, which aims at providing legitimate "scientific proof", controlled by the referees of scientific journals. Let us be clear: we say "statistics for researchers", not "applied statistics", because, even though the canons of academic statistics are recognized in principle by statistics for researchers, they are hardly applied in practice.

My conviction is that, while the two lines are to be distinguished, it is vital to maintain the unity of Statistics (on the unfortunate consequences of the current division, see "Medical statistics on the carpet"). What justifies Statistics is its role as an auxiliary science ("Hilfswissenschaft") of the empirical disciplines. Statistics for researchers should guide theoretical statistics. The ideal situation is one where the statistician takes part in a large-scale empirical research project, with scientists specifying the questions, and works out the statistical procedures that help answer these questions. The key ideas that give sense to my work have all emerged in interaction with research problems, and my contributions have tended toward constructing an autonomous statistics for researchers.

The foundations of statistics; the history of statistics

The statistical discipline is a recent one, highly dependent on computational tools. Not surprisingly, it has faced persistent identity problems. Originally a branch of probability theory, it was then, in the heyday of Operations Research, nearly absorbed into the "science of decisions". Nowadays, it would rather tend to become a part of algorithmics (a field surely more creative).

Our key ideas do indeed refer to the fundamentals of statistics. But to speak of the "foundations" of a discipline suggests a specialized area, on the margins of a discipline whose content is "well established." The key ideas, in contrast, call for a restructuring of the traditional chapters of statistics.

The same goes for the history of statistics, to which I was introduced by G.Th. Guilbaud and B. Bru: cf. Rouanet & Bru (1994b). In the age of the Internet, browsing the "Electronic Journal of the history of probability and statistics" is a real pleasure for me. However, I must confess that epistemology is not my strong point. If history fascinates me, it is (to paraphrase Marc Ferro on history in general) "provided that its study provides an understanding of the problems of our time." Rather than scrutinize the forerunners of the presently dominant trends, I try to (re)discover neglected paths that the tools of our time can make practicable.

It is clear that many theoretical constructions of the past were built in order to bypass the obstacle of computation: for example, the normal model. Other theories remained in sketch form: for example, classification procedures, or permutation modeling. Now that the computational obstacle has virtually been removed and the era of statistical tables is (or should be) over, one can and should prefer, I believe, a direct approach that tackles the real issues that justify using statistical methods. In fact, what were the problems that Binet, or Durkheim, were attempting to solve? What if they had had computers at their disposal, with their colossal databases and their fabulous means of calculation?

Statistics in Human Sciences

My work has focused on statistics in the human sciences, mainly psychology and social science, in other words the behavioral sciences, bordered by biomedical statistics on one side and econometrics on the other. As far as statistics is concerned, this constitutes a quite homogeneous field: there is "statistics for human sciences", not really "statistics for psychologists", "statistics for sociologists", and so on.

In my view, the role of statistics in a research paper should always conform to the following pattern:

Research problem --> Relevant data --> Statistical analysis --> Statistical results --> Research conclusions.

Relevant data must constitute a representative inventory of the area under study. This is the "completeness requirement" of Benzécri, close to Bourdieu's notion of "field". The statistical analysis should either bring an answer to the research questions, or else show that the available data are insufficient to answer them. Enforcing the foregoing scheme should facilitate the critical examination of a research report and enable one to pinpoint at which stage(s) errors may have been committed: 1) relevant data have been omitted; 2) the statistical analysis carried out is inadequate; 3) the conclusions drawn exceed those authorized by the statistical results (over-interpretation).

In academic statistics, "real-life data" are often invoked merely to illustrate techniques, while research problems are ignored. Blatant violations of the completeness requirement abound. Suffice it to mention an article by Goodman (1991), which purports to discuss seriously the comparative merits of methods on a simple 4x5 table of social mobility, disconnected from any context. In his reply, D.R. Cox notes shrewdly: "A key question concerns how the models are to be adapted to address detailed substantive questions (etc.); for example, there may be further dimensions or concurrent comment on the individuals concerned ..."

Two crucial distinctions

Beyond the diversity of disciplines, two distinctions are essential:

1) Between experimental data (factors of interest are controlled)
and observational data (factors of interest are only observed).

2) Between descriptive procedures (the findings relate to the data) and inductive ones, i.e. statistical inference (the conclusions go beyond the data), with, in the background, the perennial problem of the role of probabilities in statistics.
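
As a minimal illustration of the second distinction (a sketch, not a procedure taken from the texts referenced here), the Python fragment below separates a descriptive step, which only summarizes the data at hand, from an inductive step, here a Monte Carlo permutation test chosen for simplicity. The two groups of scores, the function names, and the number of resamples are all hypothetical and serve only the example.

```python
import random
from statistics import mean


def mean_difference(a, b):
    """Descriptive step: a summary that concerns only the data at hand."""
    return mean(a) - mean(b)


def permutation_p_value(a, b, n_resamples=10_000, seed=0):
    """Inductive step: Monte Carlo permutation test on the mean difference.

    Estimates the proportion of random relabelings of the pooled data whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean_difference(a, b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction: the observed arrangement counts as one relabeling.
    return (hits + 1) / (n_resamples + 1)


# Hypothetical scores for two groups (invented for illustration only).
group_a = [12.1, 14.3, 13.8, 15.0, 12.9]
group_b = [10.2, 11.5, 12.0, 10.9, 11.8]

print("descriptive: mean difference =", round(mean_difference(group_a, group_b), 2))
print("inductive: permutation p-value =", permutation_p_value(group_a, group_b))
```

The first printed value is a statement about these ten scores only; the second is a probabilistic statement that goes beyond them, and its interpretation depends on the assumption that the group labels are exchangeable.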

Texts and publications

The references to my texts and publications are given on the one hand in chronological order, and on the other hand by themes (domains). Some texts are mathematically oriented and may appeal to mathematicians interested in applications. Other texts are case studies, in which the statistical approach is presented "in context", and are directly readable by researchers (not necessarily versed in mathematics).

Organization of the heading "Statistical Work" (travaux statistiques)

. Key ideas: Formalization, geometric, descriptive-inductive, specific, probability.

. Domains: Stochastic models, analysis of variance and structured data, Combinatorial inference, Bayesian inference, Geometric Data Analysis, Regression.

. Software, teaching, etc.

. Reading Notes.

Heading "Personalia"

. CV and Scientific trajectory.

Please note. The headings "Loisirs" and "Feuilles et Bons Mots" lie outside my work.

Hyperspecialisation.

A quasi-monopoly.

Medical statistics on the carpet