MCDB-R: Risk analysis in the database


Enterprises often need to assess and manage the risk arising from uncertainty in their data. Such uncertainty is typically modeled as a probability distribution over the uncertain data values, specified by means of a complex (often predictive) stochastic model. The probability distribution over data values leads to a probability distribution over database query results, and risk assessment amounts to exploration of the upper or lower tail of a query-result distribution. In this paper, we extend the Monte Carlo Database System (MCDB) to efficiently obtain a set of samples from the tail of a query-result distribution by adapting recent “Gibbs cloning” ideas from the simulation literature to a database setting.
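
To make "exploring the tail of a query-result distribution" concrete, the following is a minimal Python sketch of the naive Monte Carlo baseline. It is not the Gibbs cloning method of the paper and not MCDB code. The function names, the normal noise model, and the SUM query are all assumptions chosen for illustration.

```python
import random

def sample_query_result(rng, means, stddev=10.0):
    """One Monte Carlo repetition: draw every uncertain value, then run the query.

    Assumption: each uncertain value is normal around its mean, and the
    query is a plain SUM over all values.
    """
    return sum(rng.gauss(mu, stddev) for mu in means)

def tail_samples(means, n_reps=10_000, p=0.99, seed=0):
    """Naive upper-tail exploration.

    Runs n_reps repetitions, sorts the query results, and keeps the
    top (1 - p) fraction. Returns (estimated p-quantile, tail samples).
    """
    rng = random.Random(seed)
    results = sorted(sample_query_result(rng, means) for _ in range(n_reps))
    n_tail = max(1, round((1 - p) * n_reps))
    tail = results[-n_tail:]
    return tail[0], tail

if __name__ == "__main__":
    quantile, tail = tail_samples([100.0, 250.0, 75.0])
    print(f"estimated 0.99-quantile: {quantile:.1f}, tail samples kept: {len(tail)}")
```

With this baseline, only about a fraction 1 - p of the repetitions land in the tail, so most of the sampling work is discarded. That inefficiency is what a tail-targeted sampling scheme is meant to avoid.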

Subi Arumugam, Fei Xu, Ravi Jampani, Christopher Jermaine, Luis L. Perez, Peter J. Haas
Publication Date: Wednesday, September 1, 2010
Publication Information: The 36th International Conference on Very Large Data Bases, September 13-17, 2010, Singapore