SyntheticLongitudinalData
SYNTHETIC LONGITUDINAL DATA

The Promise and Limitations of Synthetic Data as a Strategy to Expand Access to State-level Multi-agency Longitudinal Data

Daniel Bonneryᵃ, Yi Fengᵇ, Angela K. Hennebergerᶜ, Tessa L. Johnsonᵇ, Mark Lachowiczᵇ, Bess A. Roseᶜ, Terry Shawᵈ, Laura M. Stapletonᵇ, Michael E. Woolleyᵈ, Yating Zhengᵇ. Authors listed in alphabetical order by last name.

ᵃ Joint Program of Survey Methodology, University of Maryland, College Park
ᵇ Department of Human Development and Quantitative Methodology, University of Maryland, College Park
* School of Social Work, University of Maryland, Baltimore

Citation: Bonnery, D., Feng, Y., Henneberger, A.K., Johnson, T., Lachowicz, M., Rose, B.A., Shaw, T., Stapleton, L.M., Woolley, M.E. & Zheng, Y. (2019). The promise and limitations of synthetic data as a strategy to expand access to state-level multi-agency longitudinal data. Journal of Research on Educational Effectiveness, 12(4), 616-647. https://doi.org/10.1080/19345747.2019.1631421

Author's Note: The contents of this manuscript were developed under a grant from the Department of Education. However, those contents do not necessarily represent the policy of the Department of Education, and you should not assume endorsement by the Federal Government. Additionally, this research was supported by the Maryland Longitudinal Data System
(MLDS) Center. We are grateful for the assistance provided by the MLDS Center. Prior versions of this manuscript were published by the MLDS Center. We […] the MLDS Center or its partner agencies.

Abstract

There is demand among policy-makers for the use of state education longitudinal data systems, yet laws and policies regulating data disclosure limit access to such data, and security concerns and risks remain high. Well-developed synthetic datasets that statistically mimic the relations among the variables in the data from which they were derived, but which contain no records that represent actual persons, present a viable solution to these laws, policies, concerns, and risks. We present a case study in the development […] has been utilized thus far, and the potential benefits and concerns in its application to education data systems. We then describe our federally-funded project, proposing the steps required to synthesize a statewide longitudinal data system covering high school, postsecondary, and workforce data. Last, for use as a template for other agencies considering synthetic data, we review the challenges we have confronted in the development of our synthetic
data system for research and policy evaluation purposes.

Administrative data collected by governments about individuals hold […] workforce outcomes. However, confidentiality laws and procedures to protect such data typically restrict access to that data to a very limited universe of government-employed (or in some cases government-appointed) researchers and policy makers. There are a number of strategies for expanding access to government data, each having strengths and weaknesses. A common example is provision of aggregated data, which is safe but has limited research potential. Examples of sources using such a data access strategy include the State of Texas, which has a website (http://www.txhighereddata.org) where exten[…] a publicly-accessible website (http://www.dpi.state.nc.us/research/data/) where datasets and variable dictionaries can be accessed; however, those datasets are also aggregated.

Disseminating granular individual-level data to a wider, more diverse group of analysts, scholars, evaluators, and policy researchers may leverage the potential of knowledge advancement toward a broader understanding of how these systems and structures impact our population over time; nevertheless, the fundamental responsibility of data agencies remains with the protection of individual privacy. One emerging solution to […] variables within and across individuals, meaning that statistical analyses with such synthetic data should yield findings substantially similar to the “real” data from which it was modeled while simultaneously reducing the risk of privacy breach.

In this manuscript, we detail the promise and limitations we have encountered in our ongoing efforts to create a synthetic version of one statewide longitudinal data system for
the very purpose of increasing access to these valuable data. The core aim of this Synthetic Data Project (SDP), funded by the United States Department […] high school to the workforce, 2) high school to postsecondary education, and 3) postsecondary education to the workforce. We begin with an overview of our ongoing project, including the current problems with access to administrative data and the potential for synthetic data to address those problems, with a brief review of the synthetic data literature. We then detail the challenges we have confronted in implementation, from constructing the simplified datasets that are the blueprints for synthesization, to selecting the synthesis models to be used, to testing the research utility and safety […] synthetic data to answer substantive research and policy questions. To that end, we address several issues that must be resolved during the creation of synthetic data to ensure end-user utility, data security, and research validity, and we devote the final section to a discussion of how synthetic data might be used strategically to answer questions of relevance to policy and program evaluations.

Background

State education and longitudinal data systems
are advancing and growing in number, and the use of these data systems for education and workforce research holds great promise (Figlio, Karbownik, & […]oa in their development of statewide education data systems (SLDS Grant Program, 2018b), representing an overall investment of $721 million in federal funding as of May 2018 (SLDS Grant Program, 2018a). This substantial investment provides the data necessary for assessments of program and service efficacy to inform practice and policy decisions. Statewide longitudinal data systems, and administrative data in general, provide a number of advantages to researchers as compared to traditional survey measures, including larger data sets, fewer problems with attrition, low[…] cost-effective approach to answering policy questions because they obviate the need for costly and time-consuming primary data collection.

The Maryland Longitudinal Data System (MLDS) is one example of a state longitudinal data system and is the impetus for the present study. The MLDS, and the Center that houses these data, began operations in 2013 after legislation was passed in 2010 to create the data system (Md. Code, Education Article, §24.701-24.707). The State law that established this new agency also called for state agencies to share data to build the longitudinal system, matching unit rec[…] to the workforce. The purpose of the MLDS Center is to generate timely and accurate information about student performance and employment outcomes that can be used to improve the State's education system and guide decision makers. To accomplish this task, the MLDS Center links individual-level student and workforce data from three State agencies: 1) the Maryland State Department of Education (MSDE); 2) the Maryland Higher Education Commission (MHEC); and 3) the Maryland Department of Labor, Licensing and Regulation (DLLR). The MLDS Center has an obligation to make data accessible to researchers,