Arun’s work on research methods has examined 1) the specification and validation of measures for structural equation models and composite models (particularly Partial Least Squares) and 2) parameter recovery, statistical power, and predictive utility of Partial Least Squares with composite-based populations.

## Selected Publications

While researchers go to great lengths to justify and prove theoretical links between constructs, the relationship between measurement items and constructs is often ignored. By default, the relationship between construct and item is assumed to be reflective, meaning that the measurement items are a reflection of the construct. Many times, though, the nature of the construct is not reflective, but rather formative. Formative constructs occur when the items describe and define the construct rather than vice versa. In this research, we examine whether formative constructs are indeed being mistaken for reflective constructs by information systems researchers. By examining complete volumes of MIS Quarterly and Information Systems Research over the last 3 years, we discovered that a significant number of articles have indeed misspecified formative constructs. For scientific results to be valid, we argue that researchers must properly specify formative constructs. This paper discusses the implications of different patterns of common misspecifications of formative constructs on both Type I and Type II errors. To avoid these errors, the paper provides a roadmap to researchers to properly specify formative constructs. We also discuss how to address formative constructs within a research model after they are specified.

Aguirre-Urreta and Marakas (A&M) suggest in their simulation "Revisiting Bias Due to Construct Misspecification: Different Results from Considering Coefficients in Standardized Form," that, like Jarvis et al. (2003), MacKenzie et al. (2005), and Petter et al. (2007) before them, bias does occur when formative constructs are misspecified as reflective. But A&M argue that the level of bias in prior simulation studies has been exaggerated. They parameterize their simulation models using standardized coefficients in contrast to Jarvis et al., MacKenzie et al., and Petter et al., who parameterize their simulation models using unstandardized coefficients. Thus, across these four simulation studies, biases in parameter estimates are likely to result in misspecified measurement models (i.e., using either unstandardized or standardized coefficients); yet, the biases are greater in magnitude when unstandardized coefficients are used to parameterize the misspecified model. We believe that regardless of the extent of the bias, it is critically important for researchers to achieve correspondence between the measurement specification and the conceptual meaning of the construct so as to not alter the theoretical meaning of the construct at the operational layer of the model. Such alignment between theory and measurement will safeguard against threats to construct and statistical conclusion validity.

While researchers routinely address observed heterogeneity by introducing moderators, a priori groupings, and contextual factors in their research models, they have not examined how unobserved heterogeneity may affect their findings. We describe why unobserved heterogeneity threatens different types of validity and use simulations to demonstrate that unobserved heterogeneity biases parameter estimates, thereby leading to Type I and Type II errors. We also review different methods that can be used to uncover unobserved heterogeneity in structural equation models. While methods to uncover unobserved heterogeneity in covariance-based structural equation models (CB-SEM) are relatively advanced, the methods for partial least squares (PLS) path models are limited and have relied on an extension of mixture regression (finite mixture partial least squares, FIMIX-PLS) and on distance measure-based methods, both of which have mismatches with some characteristics of PLS path modeling. We propose a new method, prediction-oriented segmentation (PLS-POS), to overcome the limitations of FIMIX-PLS and other distance measure-based methods and conduct extensive simulations to evaluate the ability of PLS-POS and FIMIX-PLS to discover unobserved heterogeneity in both structural and measurement models. Our results show that both PLS-POS and FIMIX-PLS perform well in discovering unobserved heterogeneity in structural paths when the measures are reflective and that PLS-POS also performs well in discovering unobserved heterogeneity in formative measures. We propose an unobserved heterogeneity discovery (UHD) process that researchers can apply to (1) avert validity threats by uncovering unobserved heterogeneity and (2) elaborate on theory by turning unobserved heterogeneity into observed heterogeneity, thereby expanding theory through the integration of new moderator or contextual variables.

Hospitals are now faced with delivering value-based care (high quality patient care at a reduced cost) rather than volume-based care. To investigate the impact of IT on value-creation in health care, we identify and theorize how the extent of use and rate of growth in use for three HIT capabilities (Clinical Process Management, Patient Engagement, and Patient Transition) may independently and jointly affect cost and patient quality outcomes in the context of the U.S. health care industry. Our empirical data is based on multiple archival sources from 2008-2013, including data on implementation and use of HIT functionalities, hospital characteristics, quality of patient care outcomes, and cost of care outcomes. We identify measures for our constructs and propose analysis methods to test our model and hypotheses. We seek to contribute to our understanding of how portfolios of HIT capabilities and associated complementarities may contribute to the delivery of value-based care.

Composite-based methods like partial least squares (PLS) path modeling have an advantage over factor-based methods (like CB-SEM) because they yield determinate predictions, while factor-based methods’ prediction is constrained in this regard by factor indeterminacy. To maximize practical relevance, research findings should extend beyond the study’s own data. We explain how PLS practices, deriving, at least in part, from attempts to mimic factor-based methods, have hamstrung the potential of PLS. In particular, PLS research has focused on parameter recovery and overlooked predictive validity. We demonstrate some implications of considering predictive abilities as a complement to parameter recovery of PLS by reconsidering the institutionalized practice of mapping formative measurement to Mode B estimation of outer relations. Extensive simulations confirm that Mode A estimation performs better when sample size is moderate and indicators are collinear while Mode B estimation performs better when sample size is very large or true predictability (R²) is high.
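
For readers unfamiliar with the two estimation modes, the contrast can be sketched in a few lines of Python. This is an illustration only, not the simulation design used in the paper: it reduces the model to a single block of indicators and one observed outcome that stands in for the inner proxy, so Mode A weights become indicator-outcome covariances (correlation weights) and Mode B weights become OLS coefficients (regression weights). The function names, the equicorrelated indicator structure, and the values `k=4`, `rho=0.7`, `r2=0.3`, and `n_train=100` are all assumptions chosen for the example.

```python
import numpy as np


def simulate(n, rng, k=4, rho=0.7, r2=0.3):
    """Draw n cases of k collinear indicators and an outcome y (assumed design)."""
    # Equicorrelated indicators: unit variances, pairwise correlation rho.
    cov = np.full((k, k), rho) + (1.0 - rho) * np.eye(k)
    X = rng.multivariate_normal(np.zeros(k), cov, size=n)
    # Population composite with equal weights, scaled to unit variance.
    w_true = np.ones(k)
    w_true = w_true / np.sqrt(w_true @ cov @ w_true)
    composite = X @ w_true
    # Outcome whose population R^2 on the composite equals r2.
    y = np.sqrt(r2) * composite + np.sqrt(1.0 - r2) * rng.standard_normal(n)
    return X, y


def mode_a_weights(X, y):
    """Correlation weights: covariance of each indicator with the outcome."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    return Xc.T @ yc / (len(y) - 1)


def mode_b_weights(X, y):
    """Regression weights: OLS coefficients of the outcome on the indicators."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
    return w


def r_squared(X, y, w):
    """Squared correlation between the composite X @ w and y."""
    return float(np.corrcoef(X @ w, y)[0, 1] ** 2)


def compare_modes(n_train=100, n_test=10_000, seed=0):
    """Estimate weights on a training sample and score them on a holdout sample."""
    rng = np.random.default_rng(seed)
    X_tr, y_tr = simulate(n_train, rng)
    X_te, y_te = simulate(n_test, rng)
    results = {}
    for name, estimator in (("mode_a", mode_a_weights), ("mode_b", mode_b_weights)):
        w = estimator(X_tr, y_tr)
        results[name] = {
            "in_sample": r_squared(X_tr, y_tr, w),
            "out_of_sample": r_squared(X_te, y_te, w),
        }
    return results


if __name__ == "__main__":
    for mode, scores in compare_modes().items():
        print(mode, scores)
```

Regression weights always fit the estimation sample at least as well as correlation weights, so the comparison of interest is the `out_of_sample` score, which is where the abstract locates Mode A's advantage under moderate sample sizes and collinear indicators. Varying `n_train`, `rho`, and `r2` across many seeds would be needed before drawing any conclusion from this sketch.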