New study on software and data use in the social sciences
We are pleased to announce the publication of a new study titled “Understanding the software and data used in the social sciences” (DOI: 10.5281/zenodo.7785707). Digital data, tools, and software are constantly evolving across the economic and social sciences community. This has led to changes in the methods used for data collection and analysis, and in the ways that data and software are managed, shared, and sustained for future generations.
The full report is now available on Zenodo.
Commissioned by the Economic and Social Research Council (ESRC), this study aimed to map existing data and software mechanisms, to identify which software should be considered infrastructure, and to establish how that software is being supported and maintained. It provides insights into methods, challenges, and opportunities in the field.
The study used a three-stage, mixed-methods approach: desk research to map the current social sciences research landscape; a survey of research practices in the social sciences research community, completed by 164 UK-based researchers; and 20 interviews with representatives from various domains, career stages, and funding sources.
This study was delivered by a team consisting of Ms Selina Aragon (SSI, University of Edinburgh), Dr Mario Antonioletti (SSI, University of Edinburgh), Dr Johanna Walker (University of Southampton), and Professor Neil Chue Hong (SSI, University of Edinburgh).
Noteworthy findings include the dominance of survey and interview data, with the ESRC-funded UK Data Service being a primary source for reused data. However, the study highlights a compliance gap, revealing that only 34% of ESRC-funded researchers reported sharing their data, despite policy requirements.
The survey further shows that statistical analysis packages and spreadsheets are the most commonly used types of research software, in line with the prevalent use of quantitative data in the social sciences. Qualitative data analysis tools follow closely, reflecting the popularity of interview data.
Surprisingly, a majority of respondents use open-source tools, which they value for their cost, sustainability, and interoperability. However, the study indicates a lack of widely available training, with respondents relying on online courses, self-led learning, and on-the-job experience to acquire essential digital, data, and software skills. Interviews conducted during the study revealed barriers to skills acquisition, including gaps in institutional and funder support, initial confidence and competence issues, time constraints, and a shortage of appropriate courses.
The study's recommendations call for a reevaluation of the ESRC Data Policy to address the reluctance to adhere to data-sharing guidelines. The study also advocates strategic efforts to support the adoption of widely used open-source software, such as the R ecosystem, and encourages community engagement and recognition of contributions.
To enhance skills acquisition, the study proposes support for external training initiatives and the development of targeted internal training. These recommendations align with the overarching goal of fostering open research, overcoming skill acquisition barriers, and shaping future policies to ensure the sustainability and accessibility of tools crucial to the social sciences community.
Overall, the study sheds light on the changing landscape of data and software use in the social sciences, underscoring the reliance on survey and interview data and revealing challenges in data-sharing compliance. In the coming months, ESRC will address the recommendations arising from the report through the Future Data Services strategic review and a project to update the ESRC Data Policy.
“The results of this study provide insight into how researchers in the social sciences use digital and computational methods, highlighting the ubiquity of data analysis tools and use of local computing hardware. It also helps us to better support and recognise researchers who are developing their own code, and give evidence to support new policies aimed at sustaining the broader long tail of tools that the social sciences relies on.”
Neil Chue Hong
SSI Director and project PI