
Overview
Discrete choice models are used to explain or predict a choice from a set of two or more discrete (i.e. distinct and separable; mutually exclusive) alternatives. For example, a discrete choice model may be used to analyze why people choose to drive, take the subway, or walk to work, or to analyze the factors causing people to pick one job over another.
Description
Discrete choice models operate within a framework of rational choice; that is, it is assumed that when confronted with a discrete set of options, people choose the option of maximal benefit or utility. It follows from this assumption that utility of a choice is a function of the characteristics of the possible choices and the characteristics of the person making the choice. Discrete choice models characterize that function for a population, thereby allowing for statistical inference about the functional parameters. For example, by fitting a discrete choice model to a dataset of transportation mode choices (e.g. drive/walk/subway) of 100 individuals over 2 weeks, investigators might find that the mode choice is related both to characteristics of the choice (e.g. the choice to walk may be preferred by few individuals in general) and to interactions between the choices and characteristics of individuals (e.g. those who live close to work or are particularly physically active in general may choose to walk more often).
Discrete choice models can be distinguished from standard regression models by the explicit incorporation of a defined set of choices, some of which were not selected. Two types of choice data exist: (a) stated preferences and (b) revealed preferences. Stated preference data are obtained from hypothetical scenarios or a set of choices presented by the investigator to the subjects (e.g. subjects are asked “do you prefer to walk or take the subway to work?”), while revealed preference data record subjects’ actual past choices (e.g. subjects record how they get to work each day for 2 weeks and investigators analyze the results). The preferred type of choice data depends on the research question. For example, research aimed at increasing the health workforce in rural areas would use stated preferences instead of revealed preferences, as revealed preferences cannot be obtained for jobs that do not exist. In some cases, researchers may use a combination of both types, potentially to compare stated preferences to revealed preferences.
Procedural Overview
Discrete choice models are fitted with a two-stage procedure. First, the analyst generates a choice set, that is, the set of all choices subjects might have chosen. Next, the analyst fits a model with (latent) utility on the left hand side of the equation and characteristics of subjects, characteristics of choices, and an error term on the right hand side.
Choice Sets
Depending on the research question, generating a choice set may range from trivial to a considerable implementation challenge. For example, choice set generation would be simple for an analysis of factors contributing to a voter’s choice in a US Presidential election; the set of names on a ballot is well-defined. By contrast, an analysis investigating revealed residential preferences among New Yorkers might logically need to consider every apartment available for rent in New York (or, more broadly, every home in the New York metropolitan area) as a potential choice.
In general, generating a choice set is a matter of enumerating all possibilities for choices a subject may have made in lieu of the one the subject actually did make (if investigating revealed preferences) or might want to make (if investigating stated preferences). If the set of possible choices is too large (e.g. there are 100 choices not selected for every one choice that is), however, data become sparse and statistical modeling becomes more time-intensive or even impossible. In such situations, several techniques are frequently used to select choice subsets (Guo and Loo 2013):
1) Random sampling involves picking a simple random sample from the set of possible alternatives.
2) Labeling involves picking alternatives for a choice set in order to maximize certain properties of interest.
For example, when considering residential preference in New York, an analyst might implement (1) by picking 5 options at random from a real estate web site to represent alternatives. To implement (2) the analyst might find the choice that minimizes commute time, the choice with lowest rent, the choice with highest square footage, etc.
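The two subsetting techniques can be sketched in a few lines of Python. This is only an illustration: the listings, their attribute values, and the function names `random_choice_set` and `labeled_choice_set` are hypothetical and are not taken from Guo and Loo (2013). In both cases the alternative the subject actually selected is kept in the choice set.

```python
import random

# Hypothetical listings: (id, monthly rent in USD, commute in minutes, square feet).
listings = [
    ("A", 2400, 35, 650),
    ("B", 1900, 50, 500),
    ("C", 3100, 20, 800),
    ("D", 2200, 45, 700),
    ("E", 2800, 25, 600),
    ("F", 2000, 60, 550),
]
chosen = "A"  # the alternative the subject actually selected


def random_choice_set(listings, chosen, n_alternatives, seed=0):
    """Chosen alternative plus a simple random sample of non-chosen ones."""
    rng = random.Random(seed)
    picked = [l for l in listings if l[0] == chosen]
    pool = [l for l in listings if l[0] != chosen]
    return picked + rng.sample(pool, n_alternatives)


def labeled_choice_set(listings, chosen):
    """Chosen alternative plus the best non-chosen one on each property."""
    picked = [l for l in listings if l[0] == chosen]
    pool = [l for l in listings if l[0] != chosen]
    labels = [
        min(pool, key=lambda l: l[1]),  # lowest rent
        min(pool, key=lambda l: l[2]),  # shortest commute
        max(pool, key=lambda l: l[3]),  # largest square footage
    ]
    for l in labels:
        if l not in picked:  # one listing may win on several properties
            picked.append(l)
    return picked


print([l[0] for l in random_choice_set(listings, chosen, 3)])
print([l[0] for l in labeled_choice_set(listings, chosen)])
```

Note that labeling can return fewer alternatives than there are properties, because a single listing may be the best on more than one of them.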
Model Fitting
The goal of a discrete choice model is to characterize and make inference from the random utility function describing utility as a function of features of potential choices and subjects. However, in the discrete choice context, utility cannot be observed directly, so to understand the model, we first consider how to estimate the latent utility variable. In the general form, utility (U) for an individual i making a choice j is a function of one or more observed features of the individual (Xi), one or more observed features of the choice (Zj), and an error term representing unobserved attributes of choices and individuals (eij). Thus:
Uij = F(Xi, Zj, eij)
In the simplest model, F is assumed to be a linear function (note: this is sometimes called a linear random utility model), so that for a given individual i selecting choice j:
Uij = βZj + γZjXi + eij
In order to estimate β and γ (i.e. the utility associated with the choice features and with the interaction of choice features and individual features, respectively), we assume the individual considered all choices and selected j, and therefore that Uij > Uik for every other choice k in the choice set (k ≠ j). Thus, we can fit a conditional logit model, which models the probability that the selected choice has the highest utility in the choice set as a function of choice and individual characteristics.
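As a minimal sketch of what the conditional logit model computes: under the standard assumption (not stated above) that the error terms eij are independent and follow a type I extreme value distribution, the probability that individual i selects choice j is exp(Vij) divided by the sum of exp(Vik) over all choices k in the choice set, where Vij = βZj + γZjXi is the observed part of utility. The data and parameter values below are hypothetical, and the code only evaluates the log-likelihood that a fitting routine would maximize; it does not fit the model.

```python
import math


def utilities(beta, gamma, x_i, z):
    """Observed utility V_ij = beta*Z_j + gamma*Z_j*X_i for each choice j."""
    return [beta * z_j + gamma * z_j * x_i for z_j in z]


def choice_probabilities(v):
    """Conditional logit probabilities: exp(V_ij) / sum_k exp(V_ik)."""
    m = max(v)  # subtract the maximum for numerical stability
    expv = [math.exp(v_j - m) for v_j in v]
    total = sum(expv)
    return [e / total for e in expv]


def log_likelihood(beta, gamma, data):
    """Sum over individuals of log P(selected choice).

    data is a list of (x_i, z, chosen_index) tuples.
    """
    ll = 0.0
    for x_i, z, chosen in data:
        p = choice_probabilities(utilities(beta, gamma, x_i, z))
        ll += math.log(p[chosen])
    return ll


# Hypothetical data: Z_j is route length in km, X_i is 1 for an older subject.
data = [
    (0, [1.0, 2.0, 3.0], 0),
    (1, [1.0, 2.0, 3.0], 0),
    (0, [1.5, 1.0, 2.5], 1),
    (1, [2.0, 2.5, 1.0], 2),
]

print(log_likelihood(0.0, 0.0, data))    # all choices equally likely
print(log_likelihood(-1.0, -0.5, data))  # shorter routes preferred
```

With β = γ = 0 every choice has probability 1/3, so the log-likelihood is 4 × log(1/3), about -4.39. Because every subject in this toy dataset picked the shortest route, negative values of β and γ give a higher log-likelihood; estimation amounts to searching for the values that maximize it.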
Note: the conditional logit model is valid only when the independence from irrelevant alternatives (IIA) assumption holds. IIA, a common assumption in econometrics, states that the relative preference between any two choices is unaffected by which other choices are offered. For example, if a subject indicates that she prefers The Beatles to The Rolling Stones, then when asked which of The Beatles, The Rolling Stones and The Who she prefers, she must not select The Rolling Stones. However, behavioral psychology (and marketing research) suggests the IIA assumption is often violated in practice: the Wikipedia page on ‘the decoy effect’ (http://en.wikipedia.org/wiki/Decoy_effect) has some interesting examples. In situations in which IIA does not hold or subsets of choices have unobserved heterogeneity, nested logit or mixed logit models may be preferable to conditional logit models.
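The following sketch illustrates why the conditional logit form implies IIA: the ratio of the probabilities of any two choices depends only on their own utilities, so adding a third choice leaves that ratio unchanged. The utility values are hypothetical and chosen only for illustration.

```python
import math


def choice_probabilities(v):
    """Conditional logit probabilities: exp(V_j) / sum_k exp(V_k)."""
    m = max(v)  # subtract the maximum for numerical stability
    expv = [math.exp(v_j - m) for v_j in v]
    total = sum(expv)
    return [e / total for e in expv]


# Hypothetical observed utilities for one subject.
v = {"Beatles": 1.2, "Rolling Stones": 0.7, "The Who": 0.9}

two = choice_probabilities([v["Beatles"], v["Rolling Stones"]])
three = choice_probabilities([v["Beatles"], v["Rolling Stones"], v["The Who"]])

# Odds of Beatles over Rolling Stones, without and with The Who on offer.
odds_two = two[0] / two[1]
odds_three = three[0] / three[1]
print(odds_two, odds_three)  # both approximately exp(1.2 - 0.7)
```

Each individual probability shrinks when The Who is added, but the odds of The Beatles over The Rolling Stones stay the same. Data showing the decoy effect cannot be reproduced by a model with this property, which is one reason to consider nested or mixed logit models.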
More details on model fitting, either to handle situations where the IIA assumption is too strong or where data are not suited to conditional logit (e.g. ranked stated preferences), are available in Bruch and Mare (2012).
A Note on Individual Characteristics
It may seem odd that individual characteristics (i.e. Xi) are in the model only interacting with choice characteristics (i.e. in the γZjXi term) and not on their own. This is because utility for the individual/choice pair, the left hand side of the equation, has no intrinsic meaning; it is found by comparing the choices selected to the choices not selected by each individual. For example, suppose a mode choice study finds two people with identical commute options: person A chooses to walk to work while person B chooses to drive. We know that UwalkA > UdriveA and UdriveB > UwalkB, but relating UA to UB (i.e. the difference in utility between subjects regardless of choice, that is, how happy each subject is with all choices independent of the characteristics of the choices) is impossible.
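A short numerical sketch makes the same point. Suppose a term δXi that depends only on the individual is added to the utility of every choice (δ and all values below are hypothetical). Because the term is identical across choices, it cancels out of the conditional logit probabilities, so the data carry no information about δ.

```python
import math


def choice_probabilities(v):
    """Conditional logit probabilities: exp(V_j) / sum_k exp(V_k)."""
    m = max(v)  # subtract the maximum for numerical stability
    expv = [math.exp(v_j - m) for v_j in v]
    total = sum(expv)
    return [e / total for e in expv]


z = [1.0, 2.0, 3.0]  # hypothetical choice characteristic Z_j
x_i = 4.0            # hypothetical individual characteristic X_i
beta, gamma = -0.8, 0.1
delta = 5.0          # hypothetical coefficient on X_i alone

# Utilities with only the choice and interaction terms.
without_main = [beta * z_j + gamma * z_j * x_i for z_j in z]
# Utilities with an extra individual-only term added to every choice.
with_main = [u + delta * x_i for u in without_main]

p_without = choice_probabilities(without_main)
p_with = choice_probabilities(with_main)
print(p_without)
print(p_with)  # same probabilities, up to floating point error
```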
Put another way, β (the parameter for the choice characteristic) is analogous to the intercept in a logistic regression model (i.e. a fixed estimate for all subjects), while γ is analogous to a slope coefficient (i.e. its contribution differs by characteristics of subjects). Note also that depending on the study, the interaction between individual characteristics and choice characteristics may be of more interest than the effects of choice characteristics alone. For example, in a study of pedestrian route choice with a focus on accessibility for the elderly, a researcher might be more interested in the interaction between presence of benches along the route and subject age or frailty status than in the effect of presence of benches alone.
Tips for Implementing and Interpreting Discrete Choice Models

Interpretation: Note that units for U are not generally interpretable in discrete choice models. β and γ are usually presented in relation to each other or as Z-scores.

Interpretation: The tradeoff between attributes included in the model can also be valuable (e.g. how much extra distance is the average pedestrian willing to accept in exchange for, say, an overpass over a busy intersection?). It is often easiest to interpret these tradeoffs in terms of a continuous variable. For example, in the job choice literature, salary is commonly used: a researcher might describe how much loss in income one is willing to trade for an improvement in working environment. Similarly, in the walkability literature, route length is used to measure the utility of route attributes.
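In a linear utility model, such a tradeoff can be computed as a ratio of coefficients: the amount of the continuous variable that exactly offsets a one-unit change in another attribute, leaving utility unchanged. The sketch below uses hypothetical coefficient values, and the function name `tradeoff` is illustrative.

```python
def tradeoff(beta_attribute, beta_continuous):
    """Change in the continuous variable that offsets one unit of the attribute.

    Solves beta_attribute * 1 + beta_continuous * d = 0 for d.
    """
    return -beta_attribute / beta_continuous


# Hypothetical estimates from a pedestrian route choice model.
beta_length = -0.6    # utility per additional 100 m of route length
beta_overpass = 0.3   # utility of an overpass at a busy intersection

extra_length = tradeoff(beta_overpass, beta_length)  # in units of 100 m
print(extra_length)
```

Under these assumed values, the average pedestrian would accept a route about 50 m longer in exchange for the overpass. The same calculation with salary as the continuous variable gives the income a subject would give up for a better working environment.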

Validating assumptions: It is possible to test the IIA assumption by removing a subset of the choice set and checking whether the model estimates from the full choice set match those from the reduced choice set. This test is analogous to tests of instrumental variable assumptions in that it can detect assumption violations but does not guarantee the assumption holds (i.e. failure to detect a violation does not imply the absence of violations).
Readings
Textbooks & Chapters
M. Ben-Akiva, S. Lerman. Discrete Choice Analysis: Theory and Application to Travel Demand. MIT Press, 1985.
Train, K. Discrete Choice Methods with Simulation. Cambridge University Press, 2009
An often-cited textbook in the literature; the entire book is available for download on Dr. Train’s website.
Hensher DA, Rose J, Greene WH. Applied Choice Analysis: A Primer. Cambridge University Press, 2005.
http://support.sas.com/techsup/technote/mr2010f.pdf
A painfully long but ultimately useful hands-on guide to discrete choice modeling in SAS; part of a longer monograph (Kuhfeld, WF. Marketing Research Methods in SAS. 2010). Makes use of macros from http://support.sas.com/resources/papers/tnote/tnote_marketresearch.html. The site with macros also has a link to the full monograph.
Viton, PA. Discrete Choice Logit Models with R.
http://facweb.knowlton.ohiostate.edu/pviton/courses2/crp5700/5700mlogit.pdf
A tutorial on doing a discrete choice analysis in R.
Aizaki, H. Basic Functions for Supporting an Implementation of Discrete Choice Experiments in R. Journal of Statistical Software. 2012:50:2
http://www.jstatsoft.org/v50/c02/paper
A useful package that helps with the choice design, questionnaire creation and data formatting pre-analysis.
Methodological Articles
Bruch, E. E., and R. D. Mare. “Methodological Issues in the Analysis of Residential Preferences, Residential Mobility, and Neighborhood Change.” Sociological Methodology 42.1 (2012): 103-54. Print.
Provides a good overview of discrete choice analysis.
E. Lancsar, J. Louviere. Conducting discrete choice experiments to inform healthcare decision making. A user’s guide. PharmacoEconomics, 26 (2008), pp. 661–677.
Application Articles
Z. Guo, B.P.Y. Loo. Pedestrian environment and route choice: Evidence from New York City and Hong Kong. Journal of Transport Geography, 28 (2013), pp. 124–136.
http://www.sciencedirect.com/science/article/pii/S0966692312002906
Kruk ME, Johnson JC, Gyakobo M, Agyei-Baffour P, Asabir K, Kotha SR, Kwansah J, Nakua E, Snow RC, Dzodzomenyo M. Rural practice preferences among medical students in Ghana: a discrete choice experiment. Bull World Health Organ. 2010;88:333–341. doi: 10.2471/BLT.09.072892.
http://www.who.int/bulletin/volumes/88/5/09072892/en/
Websites
http://en.wikipedia.org/wiki/Discrete_choice
As usual, Wikipedia is informative.
http://www.quirks.com/articles/a2000/20000508.aspx?searchID=176983229
An economics-oriented discussion of the benefits of discrete choice experiments vis-à-vis conjoint analysis (i.e. picking the best of several options vs. rating the options on a scale of 1 to 10).
Presentations
http://www.tcd.ie/civileng/Staff/Brian.Caulfield/T2%20%20Transport%20Modelling/Discrete%20Choice%20Modelling.pdf
This is a nice presentation on the use of Discrete Choice Models, with an emphasis on stated preferences.
Works Cited
Bruch, E. E., and R. D. Mare. “Methodological Issues in the Analysis of Residential Preferences, Residential Mobility, and Neighborhood Change.” Sociological Methodology 42.1 (2012): 103-54. Print.
Courses
Online courses
http://elsa.berkeley.edu/users/train/ec244two.html
The course website for Dr. Kenneth Train’s course in Discrete Choice Methods at UC Berkeley. Includes lecture recordings and problem sets.
http://pages.stern.nyu.edu/~wgreene/DiscreteChoice.htm
Dr. William Greene’s NYU course website, which includes class notes, lab notes and datasets for assignments.