Research Interests
Statistical Theory & Methods: nonparametric curve estimation, integral transforms, goodness-of-fit tests, time series, spatial processes, non-stationary & locally stationary processes, irregularly spaced data, ratio of means estimation, statistical applications of large deviations;
Applications: trend and surface estimation, volatility, change points, networks, exceedance locations, zero-inflated data, species count problems and other large scale and long term aspects
T3-PLOTS
The focus of this research is on goodness-of-fit tests, such as testing whether the underlying probability distribution is normal or whether two probability distributions are identical. For this, we use test statistics that are based on integral transforms.
Integral Transforms. Of special interest are Fourier and Laplace transforms and their sample counterparts. These are explained below.
The empirical moment generating function (EMGF): Let $X, X_1, X_2, \ldots, X_n$ be a random sample from a probability distribution $F$. Given $t \in \mathbb{R}$, where $\mathbb{R}$ is the real line, the EMGF is defined as the sample mean $m_n(t) = (e^{tX_1} + e^{tX_2} + \cdots + e^{tX_n})/n$. The EMGF is an unbiased estimator of its population counterpart, namely the moment generating function (MGF) $m(t) = E(e^{tX})$, provided that $m(t)$ exists in an open interval around zero. Due to their uniqueness properties, the MGF or the EMGF can be used for goodness-of-fit tests. The T3-plots are based on the EMGF.
The empirical characteristic function (ECF): This is defined as $c_n(t) = (e^{itX_1} + e^{itX_2} + \cdots + e^{itX_n})/n$, where $t \in \mathbb{R}$ and $i = \sqrt{-1}$. The ECF is an unbiased estimator of the characteristic function $c(t) = E(e^{itX})$, which always exists.
Methods based on the EMGF and the ECF are typically of asymptotic nature. For goodness-of-fit tests based on the ECF, see for instance Ghosh, S., Ruymgaart, F. (1992) Canadian Journal of Statistics, 20: 429-440, and references therein. Additional references to these topics can be found in Ghosh (1996, 2013) and Ghosh & Beran (2000).
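The two sample transforms defined above can be evaluated directly from their definitions. The following is a minimal Python sketch, not the S-plus code mentioned on this page; the function names emgf and ecf and the three-point sample are illustrative assumptions.

```python
import cmath
import math


def emgf(sample, t):
    """Empirical moment generating function: the sample mean of exp(t * x)."""
    return sum(math.exp(t * x) for x in sample) / len(sample)


def ecf(sample, t):
    """Empirical characteristic function: the sample mean of exp(i * t * x)."""
    return sum(cmath.exp(1j * t * x) for x in sample) / len(sample)


# Illustrative data (hypothetical, not taken from any of the cited papers).
sample = [-1.0, 0.0, 1.0]

m_half = emgf(sample, 0.5)  # real number
c_half = ecf(sample, 0.5)   # complex number with modulus at most 1
```

Both functions equal 1 at t = 0 for any sample. The ECF is bounded in modulus by 1 for every t, whereas the EMGF can grow quickly in |t| for samples with large observations.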
T3-plots: There are one-sample and two-sample T3-plots. To obtain the S-plus codes: write to rita.ghosh(at)wsl.ch.
(1) One-sample T3 plot: This is a statistical procedure for a graphical test of univariate normality, $H_0: X \sim N(\mu, \sigma^2)$, where $\mu$ and $\sigma^2$ are unknown. With this method one can test the null hypothesis that a set of univariate independent and identically distributed (iid) observations are normally distributed with an unknown mean and an unknown variance. While the approach is based on asymptotic arguments, the method incorporates finite sample corrections and it is location and scale invariant. Missing values are allowed in the S-plus code and it is not necessary to standardize the data prior to analysis.
References:
Ghosh, S. (1996) Journal of the Royal Statistical Society, Series B.
Ghosh, S. (1999) Encyclopedia for Statistical Sciences, John Wiley.
Details: The relevant S-plus command is T3plot(X), where X is the vector of iid observations. This creates the T3-function, which is plotted against its argument. In addition to the T3-function, the 99% and 95% rejection limits are also plotted. The null hypothesis of normality is rejected if the T3-function of the given sample deviates significantly from the horizontal zero-line by crossing the rejection limits.
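As a rough illustration of the curve that is plotted, the sketch below assumes that the T3-function is the third derivative of the logarithm of the EMGF of the standardized sample (our reading of Ghosh, 1996). For a normal distribution the logarithm of the MGF is quadratic in t, so this derivative is identically zero. The Python function name t3_function, the grid and the data are hypothetical. The sketch does not reproduce the finite sample corrections, the rejection limits or the missing value handling of the S-plus code.

```python
import math


def t3_function(sample, t):
    """Third derivative of log(EMGF) of the standardized sample at t.

    Assumption: standardization uses the sample mean and the standard
    deviation with divisor n; the S-plus code may differ in detail.
    """
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / n)
    z = [(x - mean) / sd for x in sample]

    # Exponential weights and weighted moments a_k = sum(z^k * w) / sum(w).
    w = [math.exp(t * v) for v in z]
    m0 = sum(w)
    a1 = sum(v * wi for v, wi in zip(z, w)) / m0
    a2 = sum(v ** 2 * wi for v, wi in zip(z, w)) / m0
    a3 = sum(v ** 3 * wi for v, wi in zip(z, w)) / m0

    # (log m)''' = m'''/m - 3 (m'/m)(m''/m) + 2 (m'/m)^3
    return a3 - 3.0 * a1 * a2 + 2.0 * a1 ** 3


# Illustrative data and grid (hypothetical values).
data = [2.1, 3.4, 1.9, 5.6, 2.8, 3.3, 4.0, 2.5, 3.1, 7.2]
grid = [k / 10 for k in range(-10, 11)]
curve = [t3_function(data, t) for t in grid]
```

At t = 0 the value reduces to the sample skewness. Because the sample is standardized first, the curve is unchanged under shifts and positive rescalings of the data.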
(2) Two-sample T3 plot: This is a statistical procedure for graphical comparison of two unknown probability distributions. Based on two independent random samples, this method tests the null hypothesis $H_0: F_1 = F_2$, where, apart from some regularity conditions, the distributions $F_1$ and $F_2$ are not specified. The method is location and scale invariant. Small sample corrections are incorporated in the S-plus code. Missing values are allowed and it is not necessary to standardize the data prior to analysis. Reference: Ghosh, S. & Beran, J. (2000) Journal of Computational and Graphical Statistics.
Details: The two-sample T3 plot works much like the one-sample method, except that in the two-sample case the twot3 function is used, which creates a plot of the difference between the two one-sample T3-functions and the corresponding 99% and 95% rejection limits. The null hypothesis is rejected if the two-sample T3-function crosses the rejection limits. In the two-sample case, the bootstrap is used to construct the rejection bands.
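The idea of a difference curve with bootstrap limits can be sketched as follows. This is an illustration under stated assumptions, not the procedure of Ghosh & Beran (2000) or the S-plus code. It assumes the T3-function is the third derivative of the log EMGF of the standardized sample. It also swaps in a simple pooled bootstrap with pointwise percentile limits. All function names and data are hypothetical.

```python
import math
import random


def standardize(sample):
    """Center and scale a sample (standard deviation with divisor n)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / n)
    return [(x - mean) / sd for x in sample]


def t3(sample, t):
    """Third derivative of log(EMGF) of the standardized sample at t."""
    z = standardize(sample)
    w = [math.exp(t * v) for v in z]
    m0 = sum(w)
    a1 = sum(v * wi for v, wi in zip(z, w)) / m0
    a2 = sum(v ** 2 * wi for v, wi in zip(z, w)) / m0
    a3 = sum(v ** 3 * wi for v, wi in zip(z, w)) / m0
    return a3 - 3.0 * a1 * a2 + 2.0 * a1 ** 3


def t3_difference(x, y, t):
    """Difference between the two one-sample curves at t."""
    return t3(x, t) - t3(y, t)


def bootstrap_band(x, y, grid, level=0.95, reps=200, seed=0):
    """Pointwise percentile limits for the difference under the null.

    Assumed scheme: both samples are standardized and pooled, and
    resamples of sizes len(x) and len(y) are drawn with replacement
    from the pool.
    """
    rng = random.Random(seed)
    pooled = standardize(x) + standardize(y)
    draws = {t: [] for t in grid}
    for _ in range(reps):
        xb = rng.choices(pooled, k=len(x))
        yb = rng.choices(pooled, k=len(y))
        for t in grid:
            draws[t].append(t3_difference(xb, yb, t))
    lo_idx = int((1.0 - level) / 2.0 * reps)
    hi_idx = reps - 1 - lo_idx
    band = {}
    for t in grid:
        ordered = sorted(draws[t])
        band[t] = (ordered[lo_idx], ordered[hi_idx])
    return band


# Illustrative use with simulated data (hypothetical).
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(40)]
y = [rng.expovariate(1.0) for _ in range(40)]
grid = [-0.5, -0.25, 0.0, 0.25, 0.5]
band = bootstrap_band(x, y, grid, level=0.95, reps=200, seed=1)
outside = [t for t in grid
           if not band[t][0] <= t3_difference(x, y, t) <= band[t][1]]
```

In this sketch, any grid point listed in outside is a place where the difference curve leaves the 95% limits. These limits are pointwise rather than simultaneous, so they are only a rough guide.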
S-plus codes for T3-plots: write to rita.ghosh(at)wsl.ch
References
Ghosh, S. (1996) A new graphical tool to detect non-normality. Journal of the Royal Statistical Society B, 58, 691-702.
Ghosh, S. (1999) T3-plot. Encyclopedia for Statistical Sciences, Update volume 3, John Wiley, 739-744.
Ghosh, S., Beran, J. (2000) Comparing two distributions: The two sample T3 plot. Journal of Computational and Graphical Statistics, 9: 167-179.
Ghosh, S. (2003) Estimating the moment generating function of a linear process. Student, 4: 211-218.
Ghosh, S., Beran, J. (2006) On estimating the cumulant generating function for linear processes. Annals of the Institute of Statistical Mathematics, 58: 53-71.
Beran, J. and Ghosh, S. (2011) The moment generating function. In: International Encyclopedia of Statistical Science, M. Lovric (Ed.), Springer, Berlin/New York.
Ghosh, S. (2013) Normality testing for a long-memory sequence using the empirical moment generating function. Journal of Statistical Planning and Inference, 143: 944–954.
CURVE ESTIMATION
Textbook: Ghosh, S. (2018) Kernel Smoothing: Principles, Methods & Applications, Wiley.
ETH Autumn semester: Smoothing & nonparametric regression (401-0627-00L). ⇒ ETH Courses.
- Trend derivatives & change points: In palaeo-environmental research, stable isotope ratios of oxygen are used as temperature proxies and have been proven to be useful for quantifying temperature changes of the past. Similarly, charcoal records serve as proxies for fire events whereas pollen assemblages are used to assess presence of tree species in the region. One issue is that strong fluctuations in the environmental conditions may lead to plant species becoming extinct or abundant. Thus the range of variability in the vegetation response over time, as well as how this has been influenced by fluctuations in the environmental conditions in the past are of considerable interest. Collaborators: Patricia Menendez (PhD student); Willy Tinner & Brigitta Ammann (IPS, University of Bern), John Birks (University of Bergen); Hans Rudolf Künsch (ETH, Zürich). Funding: Swiss National Science Foundation (PI: Sucharita Ghosh).
References
Menendez, P., Ghosh, S., Künsch, H., Tinner, W. (2013) On trend estimation under monotone Gaussian subordination with long memory: application to fossil pollen series. J. Nonparametric Statistics, 25: 765–785.
Menendez, P., Ghosh, S., Beran, J. (2010) On rapid change points under long memory. Journal of Statistical Planning and Inference, 140: 3343-3354. [Includes analysis of GRIP oxygen isotope data]
Menendez Galvan, P. (2009) Statistical Tools for Palaeo Data. Diss. ETH No 18060: 134 S. [Includes analysis of GRIP oxygen isotope data & pollen records from Switzerland]
Menendez, P. & Ghosh, S. (2006) On some nonparametric smoothing methods for assessing climate change. Proceedings of the Joint Statistical Meetings (JSM), August 6 - 10, 2006.
Ghosh, S. (2006) Regression based age estimates of a stratigraphic isotope sequence in Switzerland. Journal of Vegetation History and Archaeobotany, 15: 273-278.
- Quantile function estimation: Due to the adverse impacts of extreme weather phenomena in practically all spheres of life, there is an increasing interest in modeling long term stochastic variations in climate events. The aim of this project is to develop nonparametric estimation & prediction methods for assessing changing precipitation patterns in Switzerland through estimation and prediction of the probability distributions of precipitation events. Collaborators: Dana Draghicescu (PhD student); Christoph Frei (ETH, meteoSwiss); Stefan Morgenthaler (EPFL, MATH-STAT). Funding: Swiss National Science Foundation (PI: Sucharita Ghosh).
References
Draghicescu, D. (2002) Nonparametric quantile estimation for dependent data. Ph.D. thesis, EPFL. [Includes analysis of long-term Swiss precipitation records; source MeteoSwiss]
Draghicescu, D., Ghosh, S. (2003) Smooth nonparametric quantiles. In, Proceedings of the 2nd International colloquium of Mathematics in Engineering and Numerical Physics (MENP-2), Geometry Balkan Press, Bucharest, Romania 2003, pp. 45-52. [Includes analysis of long-term Swiss precipitation records; source MeteoSwiss]
Ghosh, S., Draghicescu, D. (2001) Quantile estimation to assess extreme climate events. Geophysical Research Abstracts, Volume 3, 2001, European Geophysical Society, Nice, France, 25-30 March, 2001.
Ghosh, S., Draghicescu, D. (2002) An algorithm for optimal bandwidth selection for smooth nonparametric quantiles and distribution functions. In, Statistics in Industry and Technology: Statistical Data Analysis based on the L1-norm and related methods, Birkhäuser Verlag, Basel, Switzerland, pp. 161-168. [Includes analysis of long-term Swiss precipitation records; source MeteoSwiss]
Ghosh, S., Draghicescu, D. (2002) Predicting the distribution function for long-memory processes. International Journal of Forecasting 18: 283-290. [Includes analysis of long-term Swiss precipitation records; source MeteoSwiss]
TIME SERIES
Ongoing research among statisticians concerns statistical methods for time series and spatio-temporal data. The topics include developing models and tests that distinguish between deterministic and spurious patterns, data-driven methods, and merging physical models with empirical evidence so as to bridge gaps between theory and data.
Some publications
(a) Textbook: Beran, J., Feng, Y., Ghosh, S., Kulik, R. (2013) Long Memory Processes - Probabilistic Properties and Statistical Models, Springer.
(b) Selection of papers
- Ghosh, S. (2017) On estimating the marginal distribution of a detrended series with long-memory. Communications in Statistics - Theory and Methods; accepted version online. [Includes illustrations with analysis of global temperature data series; source: University of East Anglia and some Swiss precipitation series, source: MeteoSwiss]
- Ghosh, S. (2014) On local slope estimation in partial linear models under Gaussian subordination. Journal of Statistical Planning and Inference 155, 42-53. [Includes illustrations with analysis of global temperature data series; source: University of East Anglia]
- Ghosh, S. (2013) Normality testing for a long-memory sequence using the empirical moment generating function. Journal of Statistical Planning and Inference, 143: 944–954. [Includes illustrations with analysis of global temperature data series; source: University of East Anglia]
- Menendez, P., Ghosh, S., Künsch, H., Tinner, W. (2013) On trend estimation under monotone Gaussian subordination with long memory: application to fossil pollen series. Journal of Nonparametric Statistics, 25, No. 4, 765–785. [Includes illustrations with analysis of GRIP oxygen isotope data, source: NOAA]
- Menendez, P., Ghosh, S., Beran, J. (2010) On rapid change points under long memory. Journal of Statistical Planning and Inference, 140: 3343-3354. [Includes illustrations with analysis of GRIP oxygen isotope data, source: NOAA]
- Ghosh, S., Beran, J., Heiler, S., Percival, D., Tinner, W. (2007) Memory, non-stationarity and trend: analysis of environmental time series. In: Kienast, F., Wildi, O., Ghosh, S. (Eds.) A Changing World: Challenges for Landscape Research. Springer Verlag, Netherlands. [Includes several data examples]
- Ghosh, S., Beran, J. (2006) On estimating the cumulant generating function for linear processes. Annals of the Institute of Statistical Mathematics, 58: 53-71.
- Beran, J., Ghosh, S, Sibbertsen, P. (2003) Nonparametric M-estimation with long-memory errors. Journal of Statistical Planning and Inference 117: 199-205. [Includes illustrations with analysis of hourly wind speed maxima in Zurich in 1999; source MeteoSwiss]
- Draghicescu, D., Ghosh, S. (2003) Smooth nonparametric quantiles. In, Proceedings of the 2nd International colloquium of Mathematics in Engineering and Numerical Physics (MENP-2), Geometry Balkan Press, Bucharest, Romania 2003, pp. 45-52. [Includes illustrations with analysis of long-term Swiss precipitation records; source MeteoSwiss]
- Ghosh, S. (2003) Estimating the moment generating function of a linear process. Student, 4: 211-218.
- Ghosh, S., Draghicescu, D. (2002) Predicting the distribution function for long-memory processes. International Journal of Forecasting 18: 283-290. [Includes illustrations with analysis of long-term Swiss precipitation records; source MeteoSwiss]
- Ghosh, S., Draghicescu, D. (2002) An algorithm for optimal bandwidth selection for smooth nonparametric quantiles and distribution functions. In, Statistics in Industry and Technology: Statistical Data Analysis based on the L1-Norm and related methods, Birkhäuser Verlag, Basel, Switzerland, pp. 161-168.
- Ghosh, S. (2001) Nonparametric trend estimation in replicated time series. Journal of Statistical Planning and Inference, Vol. 97, 263-274. [Includes theory and simulated examples]
- Ghosh, S., Draghicescu, D. (2001) Quantile estimation to assess extreme climate events. Geophysical Research Abstracts, Volume 3, 2001, European Geophysical Society, Nice, France, 25-30 March, 2001.
- Beran, J., Ghosh, S. (1998) Root-n-consistent estimation in partial linear models with long-memory errors. Scandinavian Journal of Statistics, 25: 345-357. [Includes illustrations with analysis of global temperature series. Source: University of East Anglia].
SPATIAL PROCESSES
Special focus is on spatially correlated observations with non-Gaussian and location dependent probability distributions. Some topics are:
- Local stationarity. (see Journal of Nonparametric Statistics (2015), 27(2): 229–240.)
- Spatial Gini coefficients. (see Communications in Statistics - Theory and Methods (2015), 44(22): 4709–4720. Includes analysis of an excerpt from a global total column ozone data set; source: NASA's Ozone Processing Team)
- Species-area curves. (see Sankhya (2009), Vol. 71-B, 2: 137-150. Includes analysis of vascular plant species data; source: BDM)
- Lattice processes with long-memory. (see Journal of Multivariate Analysis (2009), vol. 100: 2178-2194. Jointly with Beran, J. & Schell, D. Includes analysis of global total column ozone amounts; source: NASA's Ozone Processing Team)
FORESTRY, ECOLOGY, CLIMATE, ECONOMETRICS
Some publications
- 1996 Innes, J., Ghosh, S., Schwyzer, A. A method for the identification of trees with unusually coloured foliage. Canadian Journal of Forest Research, 26: 1548-1555.
- 1997 Ghosh, S., Landmann, G., Pierrat, J.C., Müller-Edzards, C. Spatio-temporal variation in defoliation. In: 10 years forest condition monitoring in Europe. Studies on temporal development, spatial distribution, and impacts of natural and anthropogenic stress factors. Geneva and Brussels, United Nations Economic Commission for Europe / European Commission. Eds. C. Müller-Edzards, W. De Vries and J. Willem Erisman: pp. 35-50.
- 2004 Feldmeyer-Christe, E., Ghosh, S., Wildi, O., Zimmermann, N.E., Podani, J. (Eds.) Modern Approaches in Vegetation Monitoring [Reprinted from: Community Ecology 5 (1), 2004]. Budapest, Akademiai Kiado, 143 S.
- 2006 Ghosh, S. Regression based age estimates of a stratigraphic isotope sequence. Journal of Vegetation History and Archaeobotany, 15: 273-278.
- 2007 Kienast, F., Wildi, O., Ghosh, S. (Eds.) A Changing World: Challenges for Landscape Research. Springer Verlag, Netherlands.
- 2009 Ghosh, S. The unseen species number revisited. Sankhya, The Indian Journal of Statistics 71-B, 2: 137-150.
- 2012 Wermelinger, B., Epper, C., Kenis, M., Ghosh, S., Holdenrieder, O. Emergence patterns of univoltine and bivoltine I. typographus (L.) populations and associated natural enemies. Journal of Applied Entomology 136: 212–224.
- 2014 Ghosh, S., Graf, U., Ecker, K., Wildi, O., Küchler, H., Feldmeyer-Christe, E., Küchler, M. Dimension reduction and data sharpening in Swiss mires. Ecological Indicators, Vol. 36.
- 2015 Beran, J., Feng, Y., Ghosh, S. Modelling long-range dependence and trends in duration series: an approach based on EFARIMA and ESEMIFAR models. Statistical Papers 56, Issue 2, 431-451. [Includes analysis of sunshine duration [MeteoSwiss] & some financial time series data].