Retention and graduation are the dominant metrics in studying student success in engineering education and in higher education in general, yet available national datasets do not facilitate establishing national retention/graduation benchmarks. A national, longitudinal, engineering student unit-record database would make it possible to calculate retention and other metrics consistently. This would permit benchmarking, support peer comparisons, and enable the use of new metrics backed by the community.
Sharing longitudinal student record data is critical to addressing important questions being asked of higher education. The Multiple-Institution Database for Investigating Engineering Longitudinal Development (MIDFIELD) is a multi-institution, longitudinal, student-record-level dataset that is used to answer many research questions about how students maneuver through required engineering curricula and what obstacles stand in their way toward graduation. MIDFIELD comprises whole-population unit-record data for undergraduate, degree-seeking students, including students who matriculate in engineering, those who migrate into engineering from other majors, students who come to engineering as transfer students, part-time engineering students, and students who have never enrolled in engineering. The dataset currently spans 20 years and includes 1,014,887 unique undergraduate, degree-seeking students, of whom 210,725 were ever enrolled in engineering. While the original database contains only 11 institutions, the plan for MIDFIELD has always been to expand to include all public institutions in the United States that offer undergraduate programs in engineering. MIDFIELD is growing and has been funded by the National Science Foundation (NSF Award # 1545667, $4,010,978.00, 03/01/2016 to 02/28/2021) to initially increase the number of partner institutions to 103. Students in the expanded MIDFIELD will account for over half of the undergraduate engineering degrees awarded at U.S. public institutions and will make up approximately two-thirds of the U.S. undergraduate engineering student population in any given year during the past 25 years. The expanded MIDFIELD will contain unit-record data for almost 10 million individual students. It will also include minority-serving institutions and institutions from a broad range of research classifications.
The process of designing, compiling, maintaining, protecting, and sharing a large dataset like MIDFIELD provides valuable insight for others. This paper will discuss:
• the strategic selection of new institutions
• data collection and archiving processes
• data security
• student and institutional confidentiality
• benefits to institutional partners
• the MIDFIELD Institute
The expanded MIDFIELD will be an essential tool for institutional researchers to study students at the local, regional, or national level. Broader access to MIDFIELD data through a data archive will leverage the investment in its infrastructure and increase the diversity and pace of research using the database. Expanding access to MIDFIELD should also foster a research community that shares best practices for using these data, leading to methodological advances. Adding new institutional partners will enhance the generalizability of this research, and allowing a larger community of researchers to access this resource will result in a dramatic increase in high-quality research.