Dan Mazur, PhD

Personal website of Dan Mazur, PhD. Dan is a machine learning engineer in Vancouver, BC.

Agile For Research Work

Agile refers to a philosophy for managing software projects. Why would we consider using it for research/R&D projects where the immediate goal is not necessarily to produce working software for customers?

  1. The various research communities have not produced project management frameworks of their own that are as mature or useful as the most popular agile frameworks. If and when they do, researchers should adopt those frameworks. For now, the software community has arguably refined the best project management frameworks available for research.

  2. Software developers already do a lot of research-like work, because solving difficult problems requires creating new knowledge. The software community therefore has experience managing research-style projects, and its frameworks address many of the problems that regularly come up in other types of research.

  3. Researchers in other fields create a lot of software because research best practices emphasize using reproducible processes that can be unambiguously communicated between research groups. The best tool for this is software. Software development skills are among the many skills required of the modern quantitative researcher (yes, spreadsheets count as software development).

Mapping Software Concepts to Research Concepts

Research and software development are different disciplines with different goals, so research work does not fit perfectly into an agile software development framework. The fit becomes clearer once we map some concepts from one discipline to the other:

User Stories

Ideally, task descriptions provide enough information for:

In software development, teams are encouraged to use the following pattern because it provides an easy way to keep all of these goals in mind while staying very succinct:

Most of the time, a researcher starts by wanting to create new knowledge for herself or her research group. Most software user stories start with “as a user…”, but in research, most “user stories” will start with “as a researcher…” (or more specifically “as a data scientist…”, “as a bioinformatician…”, etc.).

Most of the time, a researcher is working towards knowing something they didn’t know before. Most software user stories discuss a user wanting to use a feature in software, but in research most user stories will be about wanting to know something. So, many user stories in research work will begin “as a researcher, I want to know how/why/when/what/etc.”

The last piece of the pattern is about stating the objectives. Why is this research valuable? What’s the point in investing expensive resources in this task?

As in software development, the task descriptions communicate the who, what, and why, but they do not specify the how. The team members who work on the task should be trusted to figure out the how themselves, possibly by asking other team members for advice.
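The who/what/why pattern described above can be sketched as a small data structure. This is only an illustration, not a prescribed tool: the class name `ResearchStory`, its field names, the sample story, and the "so that" connective (borrowed from the common agile user-story phrasing) are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResearchStory:
    """Hypothetical container for a research 'user story' (who, what, why)."""

    who: str   # the role that wants the knowledge, e.g. "data scientist"
    what: str  # the thing they want to know
    why: str   # the objective that justifies spending resources on it

    def render(self) -> str:
        # There is deliberately no "how" field: the people doing the task
        # are trusted to work that out themselves.
        article = "an" if self.who[:1].lower() in "aeiou" else "a"
        return (
            f"As {article} {self.who}, I want to know {self.what}, "
            f"so that {self.why}."
        )


# Made-up example story, purely for illustration.
story = ResearchStory(
    who="data scientist",
    what="why validation accuracy drops after the third epoch",
    why="we can decide whether to collect more training data",
)
print(story.render())
```

Keeping the three fields separate makes it easy to spot a task description that is missing its "why" before anyone starts working on it.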

Deliverables, Quality Control, and the Definition of Done

If it’s worthwhile for the team to take on work, then it’s worthwhile for that work to have a clear deliverable, some form of quality control process (such as review by a peer on the team), and a clear distinction between done and not done that’s understood by all team members. Researchers want to move fast, and this incentivizes taking shortcuts (software developers have the same problem). But since research builds on everything that came before, errors can propagate through the process unnoticed for a long time before they are eventually discovered (again, the same is true in software development). It pays dividends to check each task’s deliverables for quality and completeness before building new research on previous results, which is why peer review exists.

The team should decide how much effort it’s worth putting into different types of deliverables and quality control checks, depending on the risks of mistakes and the rewards of moving quickly. But if a task feels too unimportant to justify preparing any deliverable or going through any quality control process, that might be an indication that the task was never worth doing in the first place.

In the agile philosophy, the deliverables emphasize putting the team’s outputs into a useful state for the customers before starting new tasks. In research, this might mean that every task ends with a short report with a few relevant figures and a conclusion about what was learned by performing the task. These small reports, useful bits of code, etc. should be collected in a place where teammates or other “customers” will be able to easily look for them and find them. If any software was produced during the work (including small ‘one-off’ scripts for plotting or analysis), this software should be delivered into the team’s repository so that it’s easy for others to reuse, expand, or distribute in the future. Knowledge that’s verbally communicated to the team at a meeting or sitting in a notebook where no one will look probably should not be considered “useful” yet, so that task would not yet be considered finished.
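One way to make the distinction between done and not done explicit is a small checklist attached to each task. The sketch below is a minimal example under stated assumptions: the class `ResearchTask`, its fields, and the three checks are hypothetical, and a real team would choose its own checks based on how much quality control each type of deliverable deserves.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ResearchTask:
    """Hypothetical record of a task's delivery state."""

    title: str
    report_location: Optional[str] = None  # shared place where the short report lives
    reviewer: Optional[str] = None         # teammate who checked the deliverable
    produced_code: bool = False            # did the task create any scripts or software?
    code_in_repository: bool = False       # has that code been put in the team repository?

    def missing_for_done(self) -> List[str]:
        """Return the checklist items that still block this task from being done."""
        missing = []
        if not self.report_location:
            missing.append("report filed in a shared location")
        if not self.reviewer:
            missing.append("peer review")
        if self.produced_code and not self.code_in_repository:
            missing.append("code delivered to the team repository")
        return missing

    def is_done(self) -> bool:
        # Done means nothing on the checklist is outstanding.
        return not self.missing_for_done()


# Made-up task: results exist only in someone's notebook so far.
task = ResearchTask(title="Compare two normalization methods", produced_code=True)
print(task.is_done())
print(task.missing_for_done())
```

A task whose results were only mentioned at a meeting would have no `report_location`, so it would still count as unfinished under this checklist.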

Resources

  1. SCORE: Adapting Scrum to Managing a Research Group - Hicks, Foster @ UMaryland
  2. Agile Academic Research
  3. Agile Scientific Research
  4. Agile Kanban for Research Work