Statistics and Actuarial Science
http://hdl.handle.net/10012/9934
Sat, 24 Aug 2019 01:19:44 GMT
http://hdl.handle.net/10012/14927
Survival Analysis of Complex Featured Data with Measurement Error
Chen, Li-Pang
Survival analysis plays an important role in many fields, such as cancer research, clinical trials, epidemiological studies, and actuarial science. A large body of methods for analyzing survival data has been developed. However, many important problems have still not been fully explored. In this thesis, we focus on the analysis of survival data with complex features.
In Chapter 1, we review relevant topics including survival analysis, the measurement error model, the graphical model, and variable selection.
Graphical models are useful for characterizing the dependence structure of variables. They are commonly used in the analysis of high-dimensional data, including genetic data and data with network structures. Many estimation procedures have been developed under various graphical models, under the stringent assumption that the associated variables are measured precisely. In applications, however, this assumption is often unrealistic, and mismeasured variables are common in data. In Chapter 2, we investigate high-dimensional graphical models with error-prone variables. We propose valid estimation procedures to account for measurement error effects. Theoretical results are established for the proposed methods, and numerical studies are reported to assess their performance.
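The flavor of such a correction can be sketched as follows. Under a classical additive error model W = X + e with known error variance (an illustrative simplification; the function and parameter names are ours, not the thesis's estimator), the naive sample covariance of W is biased on the diagonal, and a debiased surrogate can then feed any downstream graphical-model fit:

```python
import numpy as np

def corrected_covariance(W, error_var):
    """Debias the sample covariance of error-prone data W = X + e, where
    the additive errors are independent with known variance `error_var`
    per coordinate (a simplifying, illustrative assumption)."""
    S = np.cov(W, rowvar=False)                  # naive covariance of W
    # E[S] = Cov(X) + error_var * I, so remove the error contribution
    sigma = S - error_var * np.eye(S.shape[1])
    # the subtraction can push eigenvalues below zero; project back to
    # the positive semidefinite cone before any graphical-model fit
    vals, vecs = np.linalg.eigh(sigma)
    return (vecs * np.clip(vals, 1e-8, None)) @ vecs.T

rng = np.random.default_rng(0)
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=5000)
W = X + rng.normal(scale=0.5, size=X.shape)      # error-contaminated surrogates
sigma_hat = corrected_covariance(W, 0.25)        # close to the true covariance
```

The eigenvalue clipping matters in high dimensions, where the subtraction routinely produces an indefinite matrix.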
In Chapter 3, we consider survival analysis with network structures and measurement error in covariates. In survival data analysis, the Cox proportional hazards (PH) model is perhaps the most widely used model for characterizing the dependence of survival times on covariates. While many inference methods have been developed under this model or its variants, those models are inadequate for handling data with complex structured covariates. High-dimensional survival data often exhibit several features: (1) many covariates are inactive in explaining the survival information, (2) active covariates are associated through a network structure, and (3) some covariates are error-contaminated. To handle such survival data, we propose graphical proportional hazards measurement error models and develop inferential procedures for the parameters of interest. The proposed models significantly enlarge the scope of the usual Cox PH model and offer great flexibility in characterizing survival data. Theoretical results are established to justify the proposed methods. Numerical studies are conducted to assess their performance.
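For reference, the Cox PH model specifies the hazard as lambda0(t) * exp(x'beta), and inference rests on the partial likelihood. A minimal numpy sketch of the log partial likelihood (no ties, and none of the chapter's network-structure or measurement-error corrections) is:

```python
import numpy as np

def cox_log_partial_likelihood(beta, times, events, X):
    """Log partial likelihood of the Cox PH model with hazard
    lambda0(t) * exp(x @ beta). `events` is 1 for an observed failure,
    0 for right censoring; event times are assumed untied."""
    eta = X @ np.asarray(beta)
    order = np.argsort(times)                    # process subjects in time order
    eta = eta[order]
    events = np.asarray(events)[order]
    # the risk set at the i-th ordered time is the suffix {i, i+1, ...},
    # so the denominators are reverse cumulative sums of exp(eta)
    risk = np.cumsum(np.exp(eta)[::-1])[::-1]
    return float(np.sum(events * (eta - np.log(risk))))

# tiny check: with beta = 0 and three failures, each factor is 1/|risk set|
ll = cox_log_partial_likelihood([0.0], [1.0, 2.0, 3.0], [1, 1, 1],
                                np.zeros((3, 1)))
# ll = -log(3) - log(2) - log(1) = -log(6)
```

Note that only uncensored subjects contribute terms, while censored subjects still appear in the risk-set denominators.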
In Chapter 4, we focus on sufficient dimension reduction for high-dimensional survival data with covariate measurement error. Sufficient dimension reduction (SDR) is an important tool in regression analysis which reduces the dimension of the covariates without losing predictive information. Several methods have been proposed to handle data with either censoring in the response or measurement error in covariates. However, little research is available for data having these two features simultaneously. Moreover, the analysis becomes more challenging when data contain ultrahigh-dimensional covariates. In Chapter 4, we examine this problem. We start by considering the cumulative distribution function in regular settings and propose a valid SDR method that incorporates the effects of both censoring and covariate measurement error. We then extend the proposed method to handle ultrahigh-dimensional data. Theoretical results of the proposed methods are established. Numerical studies are reported to assess their performance.
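One classical SDR estimator that the "regular setting" builds on is sliced inverse regression (SIR). The sketch below handles complete data only — no censoring or measurement-error adjustment, which are the chapter's contributions — and recovers a single direction:

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Sliced inverse regression: estimate directions beta such that y
    depends on X only through beta' X. Complete-data illustration only."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    # whiten the covariates: Z = Xc @ L satisfies Cov(Z) = I
    L = np.linalg.cholesky(np.linalg.inv(np.cov(Xc, rowvar=False)))
    Z = Xc @ L
    # slice the data by the response and average Z within each slice
    slices = np.array_split(np.argsort(y), n_slices)
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # leading eigenvectors of M, mapped back to the original scale
    _, vecs = np.linalg.eigh(M)
    B = L @ vecs[:, ::-1][:, :n_dirs]
    return B / np.linalg.norm(B, axis=0)

rng = np.random.default_rng(1)
X = rng.standard_normal((2000, 5))
y = X[:, 0] + 0.2 * rng.standard_normal(2000)    # true direction is e_1
B = sir_directions(X, y)                         # first coordinate dominates
```

Censoring would bias the slice means here, which is precisely why a corrected estimator is needed for survival responses.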
In Chapter 5, we shift our attention to sampling issues concerning survival data. Specifically, we discuss survival analysis for left-truncated and right-censored data with covariate measurement error. Many methods have been developed for analyzing survival data, which commonly involve right censoring. These methods, however, are challenged by complex features pertinent to the data collection process as well as the nature of the data themselves. Typically, biased samples caused by left truncation or length-biased sampling, together with measurement error, often accompany survival data. While such data frequently arise in practice, little work is available in the literature. In Chapter 5, we study this important problem and explore valid inference methods for handling left-truncated and right-censored survival data with measurement error under the widely used Cox model. We exploit a flexible estimator of the survival model parameters which does not require specification of the baseline hazard function. To improve efficiency, we further develop an augmented nonparametric maximum likelihood estimator. We establish asymptotic results for the proposed estimators and examine their efficiency and robustness. The proposed methods enjoy the appealing feature that the distributions of the covariates and of the truncation times are left unspecified. Numerical studies are reported to assess the performance of the proposed methods.
In Chapter 6, we study outstanding issues in model selection and model averaging for survival data with measurement error. Model selection plays a critical role in statistical inference, and a vast literature has been devoted to this topic. Despite extensive research attention, gaps still remain. An important but unexplored problem concerns model selection for truncated and censored data with measurement error. Although the analysis of left-truncated and right-censored (LTRC) data has received extensive interest in survival analysis, there has been no research on model selection for LTRC data, let alone LTRC data involving measurement error. In Chapter 6, we take up this important problem and develop inferential procedures to handle model selection for LTRC data with measurement error in covariates. Our development employs the local model misspecification framework and emphasizes the use of the focused information criterion (FIC). We develop valid estimators using a model averaging scheme and establish theoretical results to justify the validity of our methods. Numerical studies are conducted to assess the performance of the proposed methods.
Finally, Chapter 7 summarizes the thesis with discussions.
Thu, 22 Aug 2019 00:00:00 GMT
http://hdl.handle.net/10012/14831
Numerical Solutions to Stochastic Control Problems: When Monte Carlo Simulation Meets Nonparametric Regression
Shen, Zhiyi
The theme of this thesis is to develop theoretically sound as well as numerically efficient Least Squares Monte Carlo (LSMC) methods for solving discrete-time stochastic control problems motivated by insurance and finance problems.
Despite its popularity in solving optimal stopping problems, the application of the LSMC method to stochastic control problems is hampered by several challenges. First, simulating the state process is intricate because the optimal control policy is unknown a priori. Second, numerical methods only guarantee the approximation accuracy of the value function over a bounded domain, which is incompatible with the unbounded set in which the state variable dwells. Third, regression over a large number of simulated paths is computationally demanding. This thesis responds to these problems.
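For context, the classical LSMC backward recursion for an optimal stopping problem — here a Bermudan put under geometric Brownian motion with a polynomial regression basis — can be sketched as below. Every numerical choice in this sketch is an illustrative assumption of ours, not the algorithm developed in the thesis:

```python
import numpy as np

def lsmc_bermudan_put(S0=100.0, K=100.0, r=0.05, sigma=0.2, T=1.0,
                      n_steps=50, n_paths=20000, deg=3, seed=1):
    """Longstaff-Schwartz pricing of a Bermudan put: simulate paths
    forward, then regress discounted continuation values backward."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # forward simulation of risk-neutral geometric Brownian motion
    z = rng.standard_normal((n_paths, n_steps))
    S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt
                              + sigma * np.sqrt(dt) * z, axis=1))
    value = np.maximum(K - S[:, -1], 0.0)        # exercise value at maturity
    for t in range(n_steps - 2, -1, -1):
        value *= np.exp(-r * dt)                 # discount one step back
        itm = K - S[:, t] > 0                    # regress in-the-money paths only
        if itm.sum() > deg + 1:
            coef = np.polyfit(S[itm, t], value[itm], deg)
            cont = np.polyval(coef, S[itm, t])   # estimated continuation value
            exercise = K - S[itm, t] > cont
            value[itm] = np.where(exercise, K - S[itm, t], value[itm])
    return float(np.exp(-r * dt) * value.mean())

price = lsmc_bermudan_put()                      # near the American put value
```

The three challenges above are visible even here: the forward simulation presumes no control enters the dynamics, the polynomial fit is only trustworthy where paths actually visit, and the regression cost grows with the number of paths.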
Chapter 2 develops a novel LSMC algorithm to solve discrete-time stochastic optimal control problems, referred to as the Backward Simulation and Backward Updating (BSBU) algorithm. The BSBU algorithm rests on three pillars: the construction of an auxiliary stochastic control model, an artificial simulation of the post-action value of the state process, and a shape-preserving sieve estimation method. Together, these equip the algorithm with a number of merits: it obviates forward simulation and control randomization, evades extrapolation of the value function, and alleviates the computational burden of tuning-parameter selection.
Chapter 3 proposes an alternative LSMC algorithm which directly approximates the optimal value function at each time step instead of the continuation function. Compared with the BSBU algorithm, this brings a faster convergence rate and closed-form expressions for the value function. We also develop a general argument for constructing an auxiliary stochastic control problem which inherits the continuity, monotonicity, and concavity of the original problem. This argument allows the LSMC algorithm to circumvent extrapolation of the value function in the backward recursion and adapts well to other numerical methods.
Chapter 4 studies a complicated stochastic control problem: the no-arbitrage pricing of the “Polaris Choice IV” variable annuities issued by the American International Group. The Polaris allows the income base to lock in the high-water mark of the investment account over a certain monitoring period, which is related to the timing of the policyholder’s first withdrawal. By prudently introducing auxiliary state and control variables, we formulate the pricing problem in a Markovian stochastic optimal control framework. With a slight modification of the fee structure, we prove the existence of a bang-bang solution to the stochastic control problem: the policyholder's optimal withdrawal strategy is limited to a few choices. Accordingly, the price of the modified contract can be computed by the BSBU algorithm. Finally, we prove that the price of the modified contract is an upper bound for that of the Polaris under the real fee structure. Numerical experiments show that this bound is fairly tight.
Tue, 30 Jul 2019 00:00:00 GMT
http://hdl.handle.net/10012/14805
On some topics in Lévy insurance risk models
Wong, Jeff
Risk management has long been a central focus of actuarial science. There are various risks a typical insurance company must examine, solvency risk being one of them; this falls under the scope of surplus analysis. The subject matter is the study of an insurer's ability to maintain an adequate surplus level in order to fulfill its future obligations, which requires modeling the underlying surplus process together with defining appropriate metrics to quantify the risk. Ultimately, it aims at accurately reflecting the solvency status of a line of business, which requires developing realistic models to predict the evolution of the underlying surplus and constructing various ruin quantities depending on the regulations or the risk appetite set internally by the company.
While a vast literature has been devoted to answering these questions over the past decades, considerable effort has been made in recent years to construct more accurate models and to develop a spectrum of risk quantities serving different purposes. In the meantime, more advanced tools have been developed to assist with the analysis involved. In the same spirit, this thesis aims at making contributions in these areas.
In Chapter 3, a Parisian ruin time is analyzed under a spectrally negative Lévy model. A hybrid observation scheme is investigated, which allows more frequent monitoring when the solvency status of a business is observed to be critical. From a practical perspective, such an observation scheme provides an extra degree of realism. From a theoretical perspective, it unifies the analysis for paths of either bounded or unbounded variation, a core obstacle in the spectrally negative Lévy setting. The Laplace transform of the ruin time of concern is obtained. Existing results in the literature are also retrieved, by taking appropriate limits, to demonstrate consistency.
In Chapter 4, the toolbox of discrete Poissonian observation is further complemented in a spectrally negative Lévy context. By extending the classical definition of potential measures, which summarize the law of the ruin time and the deficit at ruin under continuous observation, to their discrete counterpart, expressions for the Poissonian potential measures are derived. An interesting dual relation is also discovered between a Poissonian potential measure and the corresponding exit measure. This further strengthens the motivation for studying Poissonian potential measures. To demonstrate their usefulness, several problems are formulated and analyzed at the end of the chapter.
In Chapter 5, motivated by regulatory practices, a more conservative risk metric is constructed by altering the traditional definition of the Parisian ruin time. As a starting point, the analysis is performed under a Cramér–Lundberg model, a special case of the spectrally negative Lévy model. The law of the ruin time and the deficit at ruin is obtained. An interesting ordering property is also established to justify why this is a more conservative risk measure to work with.
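A crude Monte Carlo rendering of Parisian ruin in the Cramér–Lundberg setting helps fix ideas: ruin is declared only when an excursion of the surplus below zero outlasts a delay. Exponential claims, all parameter values, and the finite horizon below are illustrative assumptions of ours, not the chapter's analytical treatment:

```python
import numpy as np

def parisian_ruin_prob(u=5.0, c=1.5, lam=1.0, claim_mean=1.0, delay=1.0,
                       horizon=100.0, n_paths=1000, seed=7):
    """Finite-horizon Monte Carlo estimate of the Parisian ruin probability
    for a Cramér-Lundberg surplus U(t) = u + c*t - S(t), where S is compound
    Poisson with rate `lam` and exponential claims of mean `claim_mean`.
    Parisian ruin: an excursion below zero lasting longer than `delay`."""
    rng = np.random.default_rng(seed)
    ruined = 0
    for _ in range(n_paths):
        t, U, neg_since = 0.0, u, None
        while t < horizon:
            w = rng.exponential(1.0 / lam)        # time to the next claim
            if neg_since is not None:             # currently below zero
                recover = -U / c                  # drift time back up to zero
                remaining = delay - (t - neg_since)
                if min(w, recover) >= remaining:  # still negative at deadline
                    ruined += 1
                    break
                if recover < w:                   # recovers before next claim
                    neg_since = None
            t += w
            U += c * w - rng.exponential(claim_mean)
            if U < 0 and neg_since is None:
                neg_since = t                     # excursion starts at this claim
    return ruined / n_paths

p = parisian_ruin_prob()
```

By construction this estimate sits below the classical ruin probability for the same surplus, since short dips below zero are forgiven.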
To ensure that the thesis flows smoothly, Chapters 1 and 2 are devoted to background reading. Literature reviews and the existing tools necessary for subsequent derivations are provided at the beginning of each chapter to ensure self-containment. A summary and concluding remarks can be found in Chapter 6.
Tue, 16 Jul 2019 00:00:00 GMT
http://hdl.handle.net/10012/14792
A Statistical Response to Challenges in Vast Portfolio Selection
Guo, Danqiao
The thesis is written in response to emerging issues brought about by an increasing number of assets allocated in a portfolio and seeks answers to puzzling empirical findings in the portfolio management area. Over the years, researchers and practitioners working in the portfolio optimization area have been concerned with estimation errors in the first two moments of asset returns. The thesis comprises several related chapters on our statistical inquiry into this subject. Chapter 1 of the thesis contains an introduction to what will be reported in the remaining chapters.
A few well-known covariance matrix estimation methods in the literature involve adjustment of sample eigenvalues. Chapter 2 of the thesis examines the effects of sample eigenvalue adjustment on the out-of-sample performance of a portfolio constructed from the sample covariance matrix. We identify a few sample eigenvalue adjustment patterns that lead to a definite improvement in the out-of-sample portfolio Sharpe ratio when the true covariance matrix admits a high-dimensional factor model.
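One simple instance of the eigenvalue-adjustment idea — linear shrinkage of the sample eigenvalues toward their grand mean while keeping the sample eigenvectors, a generic stabilizer rather than the specific patterns identified in the chapter — looks like:

```python
import numpy as np

def shrink_eigenvalues(S, alpha=0.5):
    """Pull each sample eigenvalue toward the average eigenvalue while
    keeping the sample eigenvectors; alpha = 0 returns S unchanged."""
    vals, vecs = np.linalg.eigh(S)
    shrunk = (1.0 - alpha) * vals + alpha * vals.mean()
    return (vecs * shrunk) @ vecs.T

def min_variance_weights(Sigma):
    """Global minimum-variance portfolio: w proportional to Sigma^{-1} 1."""
    w = np.linalg.solve(Sigma, np.ones(Sigma.shape[0]))
    return w / w.sum()

rng = np.random.default_rng(2)
R = rng.standard_normal((60, 30))        # short history, many assets
S = np.cov(R, rowvar=False)              # noisy, ill-conditioned sample cov
S_shrunk = shrink_eigenvalues(S, alpha=0.5)
w = min_variance_weights(S_shrunk)       # better-conditioned optimizer input
```

Compressing the spectrum lowers the condition number, which is what tames the extreme weights a raw sample covariance tends to produce.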
Chapter 3 shows that even when the covariance matrix is poorly estimated, it is still possible to obtain a robust maximum Sharpe ratio (MSR) portfolio by exploiting the uneven distribution of estimation errors across principal components. This is accomplished by approximating the vector of expected future asset returns using a few relatively accurate sample principal components. We discuss two approximation methods. The first leads to a subtle connection to existing approaches in the literature, while the second, named the “spectral selection method”, is novel and addresses the main shortcomings of existing methods.
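The idea of retaining only the more reliable principal components can be sketched generically. The naive top-k truncation rule below is our illustrative stand-in, not the chapter's spectral selection method:

```python
import numpy as np

def msr_weights_pca(mu_hat, S, k):
    """Approximate the MSR direction Sigma^{-1} mu by restricting the
    inverse to the top-k sample eigenspace, where estimation error in
    the covariance is relatively less damaging."""
    vals, vecs = np.linalg.eigh(S)
    idx = np.argsort(vals)[::-1][:k]             # leading k components
    # sum over retained components of (v' mu / lambda) * v
    w = (vecs[:, idx] / vals[idx]) @ (vecs[:, idx].T @ mu_hat)
    return w / w.sum()

S = np.diag([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])      # stylized covariance
mu = np.arange(1.0, 7.0)                         # stylized estimated means
w = msr_weights_pca(mu, S, k=2)                  # keep two leading components
```

With k equal to the full dimension the formula recovers the ordinary plug-in MSR weights, so k acts purely as a noise filter.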
A few academic studies report an unsatisfactory performance of optimized portfolios relative to that of the 1/N portfolio. Chapter 4 of the thesis reports an in-depth investigation into the reasons behind the reported superior performance of the 1/N portfolio. Both theoretical and empirical evidence supports the view that the success of the 1/N portfolio is by no means due to a failure of portfolio optimization theory. Instead, a major reason behind the superiority of the 1/N portfolio is its proximity to the mean-variance optimal portfolio.
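The proximity argument can be made concrete in a stylized simulation: when true expected returns are identical across assets and the covariance is equicorrelated, 1/N is exactly the mean-variance optimal portfolio, while the plug-in optimizer loses ground to estimation error. All numbers below are illustrative assumptions, not the chapter's data:

```python
import numpy as np

def sharpe(w, mu, Sigma):
    """Population Sharpe ratio of weights w (risk-free rate zero)."""
    return float(w @ mu / np.sqrt(w @ Sigma @ w))

rng = np.random.default_rng(3)
N, n_obs = 50, 120                               # many assets, short history
mu_true = np.full(N, 0.08 / 12)                  # identical true means
Sigma_true = (0.04 / 12) * (0.3 * np.ones((N, N)) + 0.7 * np.eye(N))
R = rng.multivariate_normal(mu_true, Sigma_true, size=n_obs)
mu_hat, Sigma_hat = R.mean(axis=0), np.cov(R, rowvar=False)
# plug-in mean-variance weights vs the naive equally weighted portfolio
w_mv = np.linalg.solve(Sigma_hat, mu_hat)
w_mv = w_mv / w_mv.sum()
w_eq = np.full(N, 1.0 / N)
# under this design, 1/N attains the highest true Sharpe ratio
sr_eq = sharpe(w_eq, mu_true, Sigma_true)
sr_mv = sharpe(w_mv, mu_true, Sigma_true)
```

The gap between the two Sharpe ratios is driven entirely by estimation noise in `mu_hat` and `Sigma_hat`, which is the sense in which 1/N "wins" without contradicting optimization theory.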
Chapter 5 examines the performance of randomized 1/N stock portfolios over time. During the last four decades these portfolios outperformed the market. The construction of these portfolios implies that their constituent stocks are in general older than those in the market as a whole. We show that the differential performance can be explained by the relation between stock returns and firm age. We document a significant relation between age and returns in the US stock market. Since 1977 stock returns have been an increasing function of age apart from the oldest ages. For this period the age effect completely dominates the size effect.
Thu, 04 Jul 2019 00:00:00 GMT