
This article was originally published in the Fall 2012 edition of OnAnalytics, published by the Institute for Business Analytics at Indiana University’s Kelley School of Business.
This article focuses on Matthijs R. Wildenbeest’s research, “Testing Models of Consumer Search Using Data on Web Browsing and Purchasing Behavior,” published in American Economic Review.
Theoretical models of consumer search behavior offer two primary frameworks: a fixed sample size search model, in which customers cost-compare across a fixed, predetermined number of retailers, and a sequential search model, in which customers investigate one retailer after another until they find an acceptable price and make a purchase. Both models assume that searching has a cost to the consumer, which factors into the decision to buy or continue searching.
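The difference between the two frameworks is easiest to see as two stopping rules. The sketch below is a stylized illustration only, not the researchers’ specification: the function names, the `reservation_price` threshold, and the price list are all hypothetical.

```python
def sequential_search(prices, reservation_price):
    """Visit stores one at a time; stop at the first acceptable price.

    If no price is acceptable, the consumer has sampled every store and
    buys at the lowest price seen.
    """
    seen = []
    for price in prices:
        seen.append(price)
        if price <= reservation_price:
            return price, len(seen)
    return min(seen), len(seen)


def fixed_sample_search(prices, k):
    """Commit to sampling k stores up front, then buy at the lowest sampled price."""
    sample = prices[:k]
    return min(sample), len(sample)


# Hypothetical prices for one book, in the order the stores are visited.
prices = [14.99, 12.50, 13.25, 11.75]

seq_result = sequential_search(prices, reservation_price=13.00)
fixed_result = fixed_sample_search(prices, k=3)
print(seq_result)    # (price paid, stores visited)
print(fixed_result)
```

In the sequential rule the number of stores visited depends on the prices encountered along the way; in the fixed sample size rule it is decided before any price is seen. That distinction is what the tests described later in this article exploit.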
These models, however, were developed in the 1960s and ’70s, before the advent of the Internet. How do today’s consumers behave online, when searching requires only a few keystrokes?
Surprisingly, the researchers find that in the context of online book shopping consumers appear to assign a high cost to searching. Most customers did not search at all, but rather purchased from Amazon without conducting any between-store price comparisons. Among consumers who did browse, the researchers found that online consumer search behavior more closely resembles the fixed sample size model.
Statement of the Problem
To what extent does online consumer search behavior follow classical search models? Does the sequential search model favored in the literature fit with online search behavior? If not, can browsing and transaction data be used to construct a new model for consumer search in the context of e-commerce?
Data Sources Used
Data from the ComScore Web-Behavior Panel provided detailed information on browsing and transactions for 100,000 Internet users in 2002 and 52,028 users in 2004, drawn at random by ComScore from a universe of more than 1.5 million global users. The total set of online bookstore transactions featured 7,558 observations in 2002 and 8,020 observations in 2004 across 15 online bookstores. Linking browsing history to the purchase by including search activity up to 7 days prior to the transaction resulted in a sample of 18,350 search observations in 2002 and 25,556 in 2004.
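The linking step can be pictured as a simple window filter. This is a minimal sketch under assumed field names (`user`, `store`, `time`); the article does not describe the actual record layout or any further matching rules the researchers applied.

```python
from datetime import datetime, timedelta


def link_searches(transaction, visits, window_days=7):
    """Return the user's store visits in the window_days before the purchase."""
    start = transaction["time"] - timedelta(days=window_days)
    return [
        v for v in visits
        if v["user"] == transaction["user"]
        and start <= v["time"] <= transaction["time"]
    ]


# Hypothetical browsing records and one purchase.
visits = [
    {"user": 1, "store": "A", "time": datetime(2004, 3, 1)},   # outside the window
    {"user": 1, "store": "B", "time": datetime(2004, 3, 9)},   # inside the window
    {"user": 2, "store": "A", "time": datetime(2004, 3, 9)},   # different user
]
purchase = {"user": 1, "store": "B", "time": datetime(2004, 3, 10)}

linked = link_searches(purchase, visits)
print([v["store"] for v in linked])
```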
Analytic Techniques
The researchers analyzed each consumer’s browsing behavior and identified the bookstores visited and prices observed prior to each transaction. Then, the researchers tested whether a sequential search model fits the observed behavior. To test the model’s recall hypothesis, that a customer should not return to a previously visited store unless she has sampled all known stores, they computed the percentages of purchases that involved consumer recall or an exhaustive search.
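One way to picture the recall test is as a classification of each transaction. The sketch below is an assumption-laden simplification: it treats a purchase as “recall” when the consumer buys from a store other than the most recently discovered one, and it takes the list of stores known to each consumer as given. Neither definition is taken from the paper.

```python
def classify(visit_sequence, purchase_store, known_stores):
    """Return (recall, exhaustive) flags for one transaction."""
    # Distinct stores in order of first visit.
    first_visits = list(dict.fromkeys(visit_sequence))
    recall = len(first_visits) > 1 and purchase_store != first_visits[-1]
    exhaustive = set(known_stores) <= set(first_visits)
    return recall, exhaustive


def summarize(transactions):
    """Share of purchases with recall, and share with recall but no exhaustive search."""
    flags = [classify(*t) for t in transactions]
    n = len(flags)
    recall_share = sum(r for r, _ in flags) / n
    violation_share = sum(r and not e for r, e in flags) / n
    return recall_share, violation_share


# Hypothetical transactions: (visit sequence, purchase store, known stores).
transactions = [
    (["A"], "A", ["A", "B"]),                  # no search beyond one store
    (["A", "B", "A"], "A", ["A", "B"]),        # recall after exhaustive search
    (["A", "B", "A"], "A", ["A", "B", "C"]),   # recall without exhaustive search
    (["A", "B"], "B", ["A", "B"]),             # bought at the last store sampled
]

shares = summarize(transactions)
print(shares)
```

Under a strict sequential model the second share should be zero, since a consumer returns to an earlier store only after running out of new stores to sample.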
To test the price dependence hypothesis, that customers continue searching when encountering a relatively high price and cease searching when encountering a relatively low price, the researchers used a logit model. The dependent variable for the regression reflects the decision to visit only one retailer or continue the search to other retailers. The researchers ran several specifications to account for high and low search costs, consumer loyalty, and consumer fixed effects.
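A bare-bones version of such a logit can be written in a few lines. This sketch uses a single regressor and plain gradient ascent rather than a statistics package, and the data are invented for illustration; the researchers’ specifications include additional controls not reproduced here.

```python
import math


def fit_logit(x, y, steps=5000, lr=0.1):
    """Fit P(y=1) = 1 / (1 + exp(-(a + b*x))) by gradient ascent on the log-likelihood."""
    a = b = 0.0
    n = len(x)
    for _ in range(steps):
        p = [1.0 / (1.0 + math.exp(-(a + b * xi))) for xi in x]
        a += lr * sum(yi - pi for yi, pi in zip(y, p)) / n
        b += lr * sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p)) / n
    return a, b


# Hypothetical data.
# x: first observed price relative to the book's average price
# y: 1 if the consumer went on to visit another store, 0 if she stopped
x = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, -1.0, 1.0]
y = [0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1]

a, b = fit_logit(x, y)
print(a, b)
```

Price dependence predicts a positive slope `b`: a higher first price should raise the probability of continuing to search. A slope indistinguishable from zero would be evidence against the sequential model.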
A third sequential search model test incorporates product differentiation to investigate the hypothesis that consumers are more likely to continue searching if the price of a book is relatively high within the store’s price distribution over time for that book. The researchers used a regression of the number of stores visited by consumers on the within-store relative price.
Next, the researchers constructed a fixed sample size search model reflecting heterogeneity in consumer preferences. Starting with a utility specification reflecting both the customer’s store preference and that store’s price for the book, they construct an equation that also incorporates search costs and a stochastic noise term reflecting errors in the individual’s assessment of expected gains. To estimate the model, the researchers used a log-likelihood function, estimating parameters through a maximum simulated likelihood procedure.
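The core trade-off in a fixed sample size model can be sketched by simulation: the consumer picks the number of stores to sample that maximizes the expected best utility minus total search cost. Everything below is a simplified assumption, including the normally distributed utility shocks, the fixed visiting order, and the function name; it is not the estimated model.

```python
import random


def best_sample_size(mean_utils, search_cost, n_sims=2000, seed=0):
    """Pick k maximizing simulated E[max utility among first k stores] - search_cost * k."""
    rng = random.Random(seed)
    n = len(mean_utils)
    totals = [0.0] * n
    for _ in range(n_sims):
        best = float("-inf")
        for k, mu in enumerate(mean_utils):
            # Utility = mean utility (store preference net of price) + random shock.
            best = max(best, mu + rng.gauss(0.0, 1.0))
            totals[k] += best
    net = [totals[k] / n_sims - search_cost * (k + 1) for k in range(n)]
    return max(range(n), key=net.__getitem__) + 1


# Hypothetical mean utilities for three stores, most preferred first.
mean_utils = [1.0, 0.5, 0.0]
print(best_sample_size(mean_utils, search_cost=100.0))  # prohibitive cost
print(best_sample_size(mean_utils, search_cost=0.0))    # free search
```

With a prohibitive search cost the consumer samples a single store; with free search she samples them all. Intermediate costs produce intermediate sample sizes, which is the variation the estimation exploits.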
Finally, the researchers constructed a multinomial logit demand model assuming consumers had sampled all stores to measure the observed difference in estimated price coefficients between the search model and a model that assumes consumers already know all prices.
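In a multinomial logit model with full information, each store’s predicted share follows directly from its utility. The following is a generic textbook sketch with made-up store effects, prices, and price coefficient, not the researchers’ estimates.

```python
import math


def mnl_shares(store_effects, prices, price_coef):
    """Multinomial logit choice probabilities with utility = store effect + price_coef * price."""
    utilities = [d + price_coef * p for d, p in zip(store_effects, prices)]
    m = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - m) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical inputs: two stores charging the same price, one preferred.
shares = mnl_shares(store_effects=[1.0, 0.0], prices=[12.0, 12.0], price_coef=-0.5)
print(shares)
```

Because this model assumes every consumer sees every price, any stickiness in the data has to be absorbed by the store effects and the price coefficient, which is why its estimates differ from those of the search model.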
Results
The online book market was found to be highly concentrated, with two dominant bookstores – Amazon and Barnes and Noble – capturing 83% of the market. Amazon alone accounted for 66% of book sales, while Barnes and Noble was a distant second with 17%. Moreover, Amazon was visited in 74% of transactions, and in only 17% of these transactions did buyers visit any other bookstore. Altogether, only 25% of transactions followed visits to more than one store. These low levels of search make it difficult to model search behavior, as either model could fit: prices may be low enough for sequentially searching consumers to cease their search immediately, or search costs may be high enough that consumers set a fixed sample size of 1.
Interestingly, consumers did not always buy from the lowest-price store they visited: in 37% of transactions following visits to multiple sites, consumers purchased a higher-priced copy of the book. The average difference between the transaction price and the lowest price encountered was $1.99; between the purchase price and the lowest price available online, the average difference was $2.60.
In testing the sequential search model, the researchers discovered that 38% of consumers recalled a previously visited store. Further, 42% of these recalling customers did not visit all bookstores of which they were known to be aware. The researchers also found that the first observed price did not affect the decision to continue searching. They found similar results when allowing for consumers to have idiosyncratic preferences for a particular retailer. These results suggest that a fixed sample size search model better characterizes online consumer search behavior in this market.
Estimating the fixed sample size model with books for which the sample contained at least 20 transactions, the researchers calculated separate price coefficients for three income groups (<$35,000, $35,000 – $75,000, and >$75,000), finding that the magnitude of the price coefficient was largest for the lowest-income group. Normalized by these price coefficients, the estimated search cost was found to average $1.35. Having a broadband connection decreased search costs, as did having additional household members.
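Normalizing by the price coefficient is what turns a utility-scale estimate into a dollar figure. The conversion below is the standard one for this kind of model; the coefficient values are invented purely to show the arithmetic and are not the paper’s estimates.

```python
def dollar_value(coefficient, price_coefficient):
    """Convert a utility-scale coefficient into dollars using the price coefficient."""
    if price_coefficient == 0:
        raise ValueError("price coefficient must be nonzero")
    return coefficient / abs(price_coefficient)


# Hypothetical estimates on the utility scale.
search_cost_util = 0.5
price_coef = -0.25

search_cost_dollars = dollar_value(search_cost_util, price_coef)
print(search_cost_dollars)
```

The same division applies to differences in store fixed effects, which is how a preference for one retailer over another can be stated in dollars.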
With respect to store fixed effects, Amazon had the highest fixed effect, with preference for shopping at Amazon appearing to account for $3.89 more value than Barnes and Noble and $7.53 more value than the top 5 “book club” sites. Price changes at Amazon, however, had a substantial impact on competitors’ market shares. As a comparison, the multinomial logit demand model assuming full price information indicated higher search costs and price elasticity.
Business Implications
This study offers a number of intriguing insights for online retailers. Even in the absence of a physical store environment, customers clearly value their store preferences and will pay a premium in order to shop at their favored store. Additionally, despite the apparent ease of online price comparisons, customers are disinclined to conduct extensive searches. Most customers do not search at all, but rather visit their preferred retailer and make a purchase.
The fixed sample size model presented by the researchers could be used to model online consumer behavior and price elasticities in a number of markets, not only within retail but also for services such as health insurance in the U.S. or electricity in Europe.
As an example of the applications of business analytics, this study demonstrates the value of using empirical data to test theoretical models. Sometimes, as seen here, the data will pose a significant challenge to accepted models of consumer behavior.