Data Carpentry (since 2014) is a programme inspired by Software Carpentry. It is a sister organisation to Software Carpentry and shares much of its community and infrastructure. The Data Carpentry programme teaches learners to use a specific set of recommended open source tools for reproducible and scalable data analysis: how to retrieve, view, manipulate, analyse and store their own or other people's data in an open and reproducible way, and how to work with data more effectively.
Data Carpentry workshops focus on the data lifecycle, covering data organisation, cleaning and management through to data analysis and visualisation. Unlike Software Carpentry, whose lessons are generic and domain-agnostic and focus on general best practices in programming, Data Carpentry designs its workshops to fit the needs of particular domains. Its lessons are domain-specific, with coverage in biology, genomics, and social science, and lessons for new domains and disciplines are being developed by the community (for medical doctors, geography, the humanities, etc.).
As with the Software Carpentry programme, teaching is delivered through intensive two-day workshops. The core curriculum taught at Data Carpentry workshops includes:
Caveats of working with spreadsheets
Cleaning data with OpenRefine
Data manipulation and visualisation with R or Python
Introduction to SQL and relational databases
Automating repetitive tasks with the UNIX shell.