CW14 Discussion session 3

What is the best way to train producers of good software documentation, and why is this important?

What are the five most important things learnt during this discussion:

  • Documentation depends on how we learn – we need training in pedagogy for documentation courses (pictures/text?) – snapshots of what the users see, YouTube videos (channel)
  • Make no assumptions about the users, could be complete beginners (common problem list)
  • Documentation needs to be peer reviewed
  • Some people don’t read the documentation – have clear error codes. As a documenter – this is really important! (fast track manual, top tips)
  • Documentation writing course – a specific course, perhaps alongside a Software Carpentry course

What are the problems, and are there solutions?

  • Often the software developer is not the best person to produce the documentation. Lack of agreement, different priorities – how does it work? What does it do? How do I use it?  
  • Training – specific pedagogy courses, how do people learn? Visual! Videos and pictures. Keep it online and current!
  • Training – tied in with Software Carpentry workshops; need a recommended format for documentation (multiple levels – beginner, medium and advanced)
  • Often people don’t read documentation – perhaps structured, clear error messages might be better! Common problem list and small handbook. 
  • Specialist 3rd party training providers could implement the training in documentation. Can also out-source documentation writing
  • Convince developers that documentation is time well spent – not just good practice.
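
The point about structured, clear error messages could look something like the sketch below. It is only an illustration: the `ToolError` class, the `E001` code and the `load_config` function are hypothetical names made up for this example, not taken from any tool discussed in the session. The idea is that each error carries a stable code (which a common problem list can index), a plain-language message and a suggested next step.

```python
class ToolError(Exception):
    """Error with a stable code, a plain-language message and a next step.

    Hypothetical class for illustration only.
    """

    def __init__(self, code, message, hint):
        super().__init__(f"[{code}] {message} Try: {hint}")
        self.code = code
        self.message = message
        self.hint = hint


def load_config(path):
    """Hypothetical loader that fails with an actionable message."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except FileNotFoundError:
        # Replace the raw traceback with something a beginner can act on.
        raise ToolError(
            "E001",
            f"Configuration file '{path}' was not found.",
            "create the file or pass a different path.",
        ) from None
```

A user who never opens the manual still sees the code `E001`, and can look it up in a common problem list or small handbook.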

We need:

  • Training
  • A basic structural recommendation is needed for documentation
  • Position needed for documenters – not software engineers
  • Documentation afternoon for research groups – with snacks and treats – for papers or code documentation. We need rewards! People will come!

What further work could be done, and who should do it (make a pledge)?

  • Include documentation in Software Carpentry workshops (perhaps a paired training system – participants can test each other's code and give feedback).
  • Documentation afternoon

Are there any useful resources that people should know about?
GMT handbook for Generic Mapping Tools software is a good manual

How do you make a case for funding a full-time software developer in RCUK and EU funding calls?

What are the five most important things learnt during this discussion:

  • Both full and part time are valid options
  • Both DI and DA can work
  • Suggest reviewers
  • Look at examples of successful grants
  • Use pathways to impact section to talk about justification

What are the problems, and are there solutions?

  • Mixed community: peer reviewers may not be in software-driven research
  • The "technician" question.
  • What level to cost at?
  • Hard to get people to do part time
  • But front/back loading can be useful


  • Post-award
  • Archer eCSE -- and general non-Archer call coming.
  • Regional infrastructures
  • RA repurposed.
  • Cost as an external contractor?
  • Suggested reviewers?
  • Use juicy job titles.

What further work could be done, and who should do it (make a pledge)?
Make sure you're registered with the research areas field in JeS. They use this to find reviewers (Pledge.) Also H2020 call for reviewers. In free text keywords under classification of oneself in JeS, include "Research Software Engineer" and "Software Developer".

Community-curated repository of evidence. (SSI/UKRSE)

What's the best way to access domain specific resources and is this reproducible?

Domain specific resource: that could be...

  • publication/data in a field of research
  • device, e.g. telescope

Perhaps we need a different question.

Can we conveniently find the things that we are trying to sustain in science?

If you can't find it, is it worth sustaining it?

  • Searching often occurs by word-of-mouth, peers tend not to have up-to-date information.
  • Redundant jargon.
  • Researchers don't realize what they need.
  • Algorithms may be written down (partially) in papers, but implementations are hidden or unavailable.
  • Closed access (especially for obscure journals and conferences)
  • Links to the question about linking users to developers.
  • Minor and decreasing: publications / resources written in a foreign language. An issue especially for old papers.
  • Search engines help a lot in this endeavour.
  • Data may be stored in a not trivially shareable environment (e.g., tape, DVD)

How do we publish research in a reproducible manner when GUI tools are used?

What are the five most important things learnt during this discussion:

  • Hard to encapsulate reproducibility in a consistent way
  • Build your software in such a way that it is scriptable
  • There are tools that are partial solutions (screen recorders, ....)
  • GUIs are often used by unsophisticated users, so educate users on which GUI tools are more reproducible than others
  • Develop tools that can analyse and visualize log files from GUI programs
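
One way to read "build your software in such a way that it is scriptable" is to keep the analysis step in a plain function with no GUI code in it, so that a button handler and a command-line entry point can both call the same function. The sketch below assumes a made-up `rescale` operation and a hypothetical `main` entry point; neither comes from the discussion.

```python
import argparse


def rescale(values, factor):
    """Core analysis step, free of any GUI code, so it can be scripted."""
    return [v * factor for v in values]


def main(argv=None):
    """Command-line front end; a GUI button handler could call rescale() too."""
    parser = argparse.ArgumentParser(description="Rescale a list of numbers.")
    parser.add_argument("values", type=float, nargs="+")
    parser.add_argument("--factor", type=float, default=1.0)
    args = parser.parse_args(argv)
    result = rescale(args.values, args.factor)
    print(" ".join(str(v) for v in result))
    return result


# The same run can be repeated exactly from a script or a shell history.
main(["1", "2", "--factor", "2"])  # prints "2.0 4.0"
```

The command line that produced a result can then be quoted in a paper, which is much harder to do for a sequence of mouse clicks.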

What are the problems, and are there solutions?

  • GUI workflows are hard to reproduce and are heavily used in specific scientific domains (satellite imaging)
  • GUIs are often used when dealing with unsophisticated users
  • Automatic testing with GUI tools is almost impossible
  • Most screen recording and playback tools record only the absolute coordinates of button clicks, so the workflow is not reproducible: the location of GUI elements (buttons, etc.) generally depends on screen resolution and operating system language.


  • WebApps offer both great GUIs and easy recording and replaying of GUI events (Javascript)
  • Use GUI applications that create detailed log files
  • Use more “intelligent” screen recording and playback tools, for example tools that do image processing analysis to “figure out” locations of buttons etc.
  • If this is not possible then use screen recording with annotations so reviewers can at least see what you did

As a programmer

  • Use GUI toolkits that have easy event tracking and event injection
  • Contribute scripting/macro interfaces to open source GUI applications
  • Develop and publish tools that can read and visualize log files created by popular GUI programs
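
A minimal sketch of event tracking, event injection and log visualisation, assuming events are logged by widget name rather than by screen coordinates (so the log does not depend on resolution or language). The functions `record_event`, `replay` and `describe`, the JSON line format and the widget names are all hypothetical; a real GUI toolkit would supply its own event hooks.

```python
import json


def record_event(log, widget_id, action, value=None):
    """Append one GUI event, keyed by widget name rather than pixel position."""
    log.append(json.dumps({"widget": widget_id, "action": action, "value": value}))


def replay(log, handlers):
    """Inject logged events again by looking up each widget's handler by name."""
    for line in log:
        event = json.loads(line)
        handlers[event["widget"]](event["action"], event["value"])


def describe(log):
    """Turn the log into human-readable steps, e.g. for a methods section."""
    steps = []
    for number, line in enumerate(log, start=1):
        event = json.loads(line)
        detail = ""
        if event["value"] is not None:
            detail = f" with value {event['value']!r}"
        steps.append(f"{number}. {event['action']} '{event['widget']}'{detail}")
    return steps


log = []
record_event(log, "threshold_slider", "set", 0.5)
record_event(log, "run_button", "click")
print("\n".join(describe(log)))
```

The same log serves two purposes: `replay` re-runs the workflow, and `describe` gives reviewers a readable account of what was done.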


What further work could be done, and who should do it (make a pledge)?

  • The SSI should publish top tips on which screen recording tools can be used for reproducibility
  • Write a blog post about which geographical science GUI tools offer better reproducibility features than others

Hackathon ideas

  • Live Experiment: Do a screen recording with annotations of a complex task in a GUI application and test if a novice user can reproduce the task by watching only the recording
  • Create tools that do image analysis on screen recordings and automatically log button clicks, GUI field inputs into a human readable text form

Are there any useful resources that people should know about?

TimeSnap creates a PNG every 20 sec, which enables you to look back at what you did.

How do good software development practices aid reproducible research?

What are the five most important things learnt during this discussion:

  • General: lots of s/w devs, a few researchers, and some difference about what the question was actually asking.
  • Agile practices can be applied to research activities (TDD) "progress in small steps; use test cases to confirm requirement"
  • Plan and set up your working environment, as you would a lab.
  • Most of the useful tools already exist, just not always known about or accessible or used -> not in the culture
  • Have a clear plan and approach for conducting the research process.
  • Infrastructure provision and support can be a severe obstruction.

What are the problems, and are there solutions?

What are best practices?

  • automate as much as possible
  • provenance tracking (version management and tracking)
  • testing
  • modular design
  • iterative

Problems of version tracking for large files (e.g. 44G output files).

  • open source...
  • tutorials
  • easy to deploy code
  • clear dependencies (enforced)
  • packaging (research objects)
    •   keeping multiple output files together
    •   provenance
    •   use file structures (shared on servers, dropbox, etc.)
  • issue tracking, planning support, etc (TRAC, etc.)
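
For output files too large to put under version control, one possible approach to provenance is to track a small manifest instead: a checksum per output file plus the version of the code that produced it. This is a sketch under that assumption, not something proposed in the session; `sha256_of`, `write_manifest` and the manifest layout are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so very large outputs never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(output_files, code_version, manifest_path):
    """Record which code version produced which outputs, keeping them together."""
    manifest = {
        "code_version": code_version,
        "outputs": {str(p): sha256_of(p) for p in output_files},
    }
    Path(manifest_path).write_text(
        json.dumps(manifest, indent=2, sort_keys=True), encoding="utf-8"
    )
    return manifest
```

The manifest is a few lines of text, so it can live in the same repository as the code while the large files stay on a shared server.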

What practices from TDD could support research? TDD ensures tests get written. Reduce fear, increase confidence that progress is being made. Automatic verification of expected results is a very important tool to ensure unexpected errors are not creeping in.
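
As a small illustration of "progress in small steps; use test cases to confirm requirement", the sketch below uses the standard `unittest` module. The `mean` function and its tests are hypothetical stand-ins for a real analysis step; in TDD the two tests would be written first and the function filled in until they pass.

```python
import math
import unittest


def mean(values):
    """Analysis step under test (a stand-in for real research code)."""
    if not values:
        raise ValueError("mean() needs at least one value")
    return math.fsum(values) / len(values)


class TestMean(unittest.TestCase):
    def test_known_result(self):
        # Expected value worked out by hand before the code was written.
        self.assertAlmostEqual(mean([1.0, 2.0, 3.0]), 2.0)

    def test_empty_input_is_rejected(self):
        # An unexpected input should fail loudly, not return a silent NaN.
        with self.assertRaises(ValueError):
            mean([])
```

Running the tests automatically on every change (for example with `python -m unittest`) is what catches unexpected errors creeping in.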

What further work could be done, and who should do it (make a pledge)?

  • have researchers involved in common software management practices
  • software carpentry
  • resist the pressure "just hack it" - the job's not done until it's done.  -> reward structures are wrong.
  • pursue research object agenda

Pledge (GK) creating software to support creation of research objects
Pledge (RH) write TDD module for software carpentry

Are there any useful resources that people should know about?

  • Software Carpentry
  • GitHub, etc. (versioning)
  • CI services
  • TRAC, GitHub (for planning and issues, and a document and version tracking interface). Need to be aware of jurisdiction issues.