# Statistics

Engineering students interested in advancing their studies in Statistics should consult with a departmental adviser to determine the most appropriate courses for their interests.

**STAT UN2103x Applied linear regression analysis**

*3 pts. Professor Young.*

Prerequisite: An introductory course in statistics
(STAT UN1101 is recommended). Students without
programming experience in R might find STAT
UN2102 very helpful. Develops critical thinking
and data analysis skills for regression analysis in
science and policy settings. Simple and multiple
linear regression, nonlinear and logistic models,
random-effects models, penalized regression
methods. Implementation in a statistical package.
Optional computer-lab sessions. Emphasis on
real-world examples and on planning, proposing,
implementing, and reporting.

**STAT UN2105x Statistical applications and case studies**

*3 pts. Instructor to be announced.*

Prerequisite: STAT UN2104. A sample of topics
and application areas in applied statistics.
Topic areas may include Markov processes and
queuing theory; meta-analysis of clinical trial
research; receiver-operator curves in medical
diagnosis; spatial statistics with applications in
geology, astronomy, and epidemiology; multiple
comparisons in bio-informatics; causal modeling
with missing data; statistical methods in genetic
epidemiology; stochastic analysis of neural spike
train data; graphical models for computer and
social network data.

**STAT UN3103x Mathematical methods for statistics**

*6 pts. Professor Hannah.*

Prerequisite: MATH UN1101 or instructor’s
permission. A fast-paced coverage of those aspects
of the differential and integral calculus of one
and several variables and of the linear algebra
required for the core courses in the Statistics major.
The mathematical topics are integrated with an
introduction to computing. Students seeking more
comprehensive background should replace this
course with MATH UN1102 and UN2010, and any
COMS course numbered from W1003 to W1009.

**STAT UN3105y Applied statistical methods**

*3 pts. Professors Landwehr and Whalen.*

Prerequisite: At least one, and preferably both,
of STAT UN2103 and UN2104 is strongly
recommended. Students without programming
experience in R might find STAT UN2102 very
helpful. Intended to give students practical
experience with statistical methods beyond
linear regression and categorical data analysis.
The focus will be on understanding the uses
and limitations of models, not the mathematical
foundations for the methods. Topics that may be
covered include random and mixed-effects models,
classical non-parametric techniques, the statistical
theory of causality, sample survey design, multi-level
models, generalized linear regression, generalized
estimating equations and over-dispersion, survival
analysis including the Kaplan-Meier estimator, logrank
statistics, and the Cox proportional hazards
regression model. Power calculations and proposal
and report writing will be discussed.

**STAT UN3106y Applied data mining**

*3 pts. Professor Young.*

Prerequisite: STAT UN2103. Students without
programming experience in R might find STAT
UN2102 very helpful. Data mining is a dynamic
and fast-growing field at the interface of Statistics
and Computer Science. The emergence of massive
datasets containing millions or even billions of
observations provides the primary impetus for the
field. Such datasets arise, for instance, in large-scale
retailing, telecommunications, and astronomy, and they
pose computational and statistical challenges. This
course will provide an overview of current practice
in data mining. Specific topics covered include
databases and data warehousing, exploratory data
analysis and visualization, descriptive modeling,
predictive modeling, pattern and rule discovery,
text mining, Bayesian data mining, and causal
inference. The use of statistical software will be
emphasized.

**STAT GU4001x Introduction to probability and statistics**

*3 pts. Members of the faculty.*

Prerequisites: MATH UN1101 and UN1102
or equivalent. A calculus-based tour of the
fundamentals of probability theory and statistical
inference. Probabilistic models, random variables,
useful distributions, conditioning, expectations,
laws of large numbers, central limit theorem, point
and confidence interval estimation, hypothesis
tests, linear regression. This course replaces SIEO
W4150.

**STAT GU4203x and y Probability theory**

*3 pts. Professors Lo and Wang.*

Prerequisites: MATH UN1101 and UN1102 or
equivalent. An introductory course (STAT UN1201)
is strongly recommended. A calculus-based
introduction to probability theory. A quick review of
multivariate calculus is provided. Topics covered
include random variables, conditional probability,
expectation, independence, Bayes’ rule, important
distributions, joint distributions, moment generating
functions, central limit theorem, laws of large
numbers and Markov’s inequality.

**STAT GU4204x Statistical inference**

*3 pts. Professors Sobel and Young.*

Prerequisite: STAT GU4203. At least one semester of calculus is required; two or three semesters are strongly recommended. Calculus-based introduction to the theory of statistics. Useful distributions, law of large numbers and central limit theorem, point estimation, hypothesis testing, confidence intervals, maximum likelihood, likelihood
ratio tests, nonparametric procedures, theory of
least squares, and analysis of variance.

**STAT GU4205x Linear regression models**

*3 pts. Professors Neath, Wu, Liu, and Polak.*

Prerequisites: STAT GU4204 or equivalent, and
a course in linear algebra. Theory and practice
of regression analysis. Simple and multiple
regression, testing, estimation, prediction, and
confidence procedures, modeling, regression
diagnostics and plots, polynomial regression,
collinearity and confounding, model selection,
geometry of least squares. Extensive use of the
computer to analyze data.

**STAT GU4207x and y Elementary stochastic processes**

*3 pts. Professors Brown and Wang.*

Prerequisite: STAT GU4203 and two, preferably
three, semesters of calculus. Review of elements
of probability theory. Poisson processes. Renewal
theory. Wald’s equation. Introduction to discrete
and continuous time Markov chains. Applications
to queueing theory, inventory models, branching
processes.

**STAT GU4221x and y Time series analysis**

*3 pts. Professors Safikhani, Wu, and Wang.*

Prerequisite: STAT GU4205 or equivalent. Least
squares smoothing and prediction, linear systems,
Fourier analysis, and spectral estimation. Impulse
response and transfer function. Fourier series, the
fast Fourier transform, autocorrelation function, and
spectral density. Univariate Box-Jenkins modeling
and forecasting. Emphasis on applications.
Examples from the physical sciences, social
sciences, and business. Computing is an integral
part of the course.

**STAT GU4222y Nonparametric statistics**

*3 pts. Professor Polak.*

Prerequisite: STAT UN3204 or the equivalent.
Statistical inference without parametric model
assumptions. Hypothesis testing using ranks,
permutations, and order statistics. Nonparametric
analogs of analysis of variance. Nonparametric
regression, smoothing and model selection.

**STAT GU4231y Survival analysis**

*3 pts. Professor Shnaidman.*

Prerequisite: STAT GU4205 or the equivalent.
Survival distributions, types of censored
data, estimation for various survival models,
nonparametric estimation of survival distributions,
the proportional hazard and accelerated lifetime
models for regression analysis with failure-time
data. Extensive use of the computer.

**STAT GU4232y Generalized linear models**

*3 pts. Professor Sobel.*

Prerequisite: STAT GU4205 or the equivalent.
Statistical methods for rates and proportions,
ordered and nominal categorical responses,
contingency tables, odds-ratios, exact inference,
logistic regression, Poisson regression, generalized
linear models.

**STAT GU4233x Multilevel models**

*3 pts. Not offered in 2017-2018.*

Prerequisite: STAT GU4205. Theory and practice,
including model-checking, for random and mixed-effects
models (also called hierarchical or multilevel
models). Extensive use of the computer to
analyze data.

**STAT GU4234x Sample surveys**

*3 pts. Professor Neath.*

Prerequisite: STAT GU4204 or the equivalent.
Introductory course on the design and analysis
of sample surveys. How sample surveys are
conducted, why the designs are used, how to
analyze survey results, and how to derive from
first principles the standard results and their
generalizations. Examples from public health,
social work, opinion polling, and other topics of
interest.

**STAT GU4261y Statistical methods in finance**

*3 pts. Professors ElBarmi, Wang, and Ying.*

Prerequisite: STAT GU4204 or the equivalent. A
fast-paced introduction to statistical methods used
in quantitative finance. Financial applications and
statistical methodologies are intertwined in all
lectures. Topics include regression analysis and
applications to the Capital Asset Pricing Model and
multifactor pricing models, principal components
and multivariate analysis, smoothing techniques
and estimation of yield curves, statistical methods
for financial time series, value at risk, term
structure models and fixed income research, and
estimation and modeling of volatilities. Hands-on
experience with financial data.

**STAT GU4262y Stochastic processes for finance**

*3 pts. Professor Rios.*

Prerequisite: STAT GU4203. STAT GU4207 is
recommended. A careful review of the concept
of stochastic process as a model of random
phenomena evolving through time and of
conditional expectation, basic Markov process
theory, and the exponential distribution. Marked
point processes and their compensators, beginning
with Poisson processes, and proceeding through
general marked point processes. The use of
compensators will be justified by the Doob-Meyer
decomposition theorem, and as such it
will connect the theory to martingales. Markov
processes will enter to provide a description of
sufficient conditions for the compensators to have
absolutely continuous paths (and as such, have
"hazard rates"). Applications to survival analysis
and, especially, to mathematical finance, including
default and bankruptcy models. Cox process
construction.

**STAT GU4281x Theory of interest**

*3 pts. Professor Zhang.*

Prerequisite: MATH UN1101 or equivalent.
Introduction to the mathematical theory of interest
as well as the elements of economic and financial
theory of interest. Topics include rates of interest
and discount; simple, compound, real, nominal,
effective, dollar (time)-weighted; present, current,
future value; discount function; annuities; stocks
and other instruments; definitions of key terms of modern financial analysis; yield curves; spot (forward) rates; duration; immunization; and short sales. The course will cover determining equivalent
measures of interest, discounting, accumulating,
determining yield rates, and amortization.

**STAT GU4291x and y Advanced data analysis**

*3 pts. Professors Alemayehu and Liu.*

Prerequisites: STAT GU4205 and at least one Statistics
course numbered between GU4221 and GU4261. This
is a course on getting the most out of data.
The emphasis will be on hands-on experience,
involving case studies with real data and using
common statistical packages. The course covers,
at a very high level, exploratory data analysis,
model formulation, goodness of fit testing, and
other standard and nonstandard statistical
procedures, including linear regression, analysis of
variance, nonlinear regression, generalized linear
models, survival analysis, time series analysis,
and modern regression methods. Students will be
expected to propose a data set of their choice for
use as case study material.

**STAT GR5242x and y Data mining**

*3 pts. Professors Mazumder, Motta, and Rabinowitz.*

Prerequisite: COMS W1003, W1004, W1005,
W1007, or the equivalent. Corequisites: Either
STAT UN3203 or GR5203, and either STAT
UN3204 or GR5204. Data Mining is a dynamic and
fast-growing field at the interface of Statistics and
Computer Science. The emergence of massive
datasets containing millions or even billions of
observations provides the primary impetus for the
field. Such datasets arise, for instance, in large-scale
retailing, telecommunications, and astronomy, and they
pose computational and statistical challenges. This
course will provide an overview of current
research in data mining and will be suitable for
graduate students from many disciplines. Specific
topics covered will include databases and data
warehousing, exploratory data analysis and
visualization, descriptive modeling, predictive
modeling, pattern and rule discovery, text mining,
Bayesian data mining, and causal inference.

**STAT GR5703x Statistical inference and modeling**

*3 pts. Professor Hannah.*

Prerequisites: Working knowledge of calculus and
linear algebra (vectors and matrices), and STAT
GR5203 or equivalent. Fundamentals of statistical
inference and testing, and an introduction to statistical
modeling. Focuses on inference and testing,
covering topics such as maximum likelihood
estimates, hypothesis testing, likelihood ratio test,
Bayesian inference, etc. Introduction to statistical
modeling via introductory lectures on linear
regression models, generalized linear regression
models, nonparametric regression, and statistical
computing. Real-data examples used in lecture
discussion and homework problems. Provides
foundation for other courses in machine learning,
data mining, and visualization.

\**2016-2017 Academic Year: the system of course numbering and designated level is in transition; please consult an adviser.*