Personal website of Dan Mazur, PhD. Dan is a machine learning engineer in Vancouver, BC.
Agile refers to a philosophy for managing software projects. Why would we consider using it for research/R&D projects where the immediate goal is not necessarily to produce working software for customers?
The various research communities have not produced specific frameworks that are as mature or useful as the most popular agile frameworks for project management. If and when they do, these frameworks should be adopted by researchers. For now, the software community has arguably refined the best project management frameworks for research.
Software developers perform many research-like tasks to get their jobs done, because they must create new knowledge to solve difficult problems. The software development community therefore already knows a good deal about managing research projects, and its frameworks solve many of the problems regularly encountered in other types of research.
Researchers in other fields create a lot of software because research best practices emphasize using reproducible processes that can be unambiguously communicated between research groups. The best tool for this is software. Software development skills are among the many skills required of the modern quantitative researcher (yes, spreadsheets count as software development).
Research and software development are different disciplines with different goals, so research work does not fit perfectly into an agile software development framework. The fit becomes clearer if we make some mappings between concepts:
Ideally, task descriptions provide enough information for:
Software development teams are encouraged to use the following pattern because it provides an easy way to keep all of these goals in mind while being very succinct:
Most of the time, a researcher starts by wanting to create new knowledge for herself or her research group. Most software user stories start with “as a user…”, but in research, most “user stories” will start with “as a researcher…” (or more specifically “as a data scientist…”, “as a bioinformatician…”, etc.).
Most of the time, a researcher is working towards knowing something they didn’t know before. Most software user stories discuss a user wanting to use a feature in software, but in research most user stories will be about wanting to know something. So, many user stories in research work will begin “as a researcher, I want to know how/why/when/what/etc.”.
The last piece of the pattern is about stating the objectives. Why is this research valuable? What’s the point in investing expensive resources in this task?
Like in software development, the task descriptions communicate the who, what, and why, but they do not specify the how. The team members who work on the task should be trusted to figure out the how themselves, possibly by asking for advice from other team members.
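The who/what/why pattern described above can be captured in a few lines of code. The following is only a minimal sketch: it assumes the common user-story wording "As a <who>, I want <what>, so that <why>", and the class name `ResearchStory`, its fields, and the example story are all hypothetical illustrations rather than part of any agile tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResearchStory:
    """A research task described by who, what, and why (never how)."""

    who: str   # role of the person who wants the knowledge
    what: str  # the question to answer, phrased as how/why/when/what
    why: str   # the objective that justifies spending resources

    def render(self) -> str:
        # Assumed wording: "As a <who>, I want to know <what>, so that <why>."
        return f"As a {self.who}, I want to know {self.what}, so that {self.why}."


# Hypothetical example story, invented for illustration only.
story = ResearchStory(
    who="data scientist",
    what="why validation loss plateaus after epoch 10",
    why="we can decide whether to collect more training data",
)
print(story.render())
```

Note that the record deliberately has no field for the how, which stays with the team members who pick up the task.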
If it’s worthwhile for the team to take on work, then it’s worthwhile for that work to have a clear deliverable, some form of quality control process (such as review by a peer on the team), and a clear distinction between done and not done that’s understood by all team members. Researchers want to move fast, and this incentivizes taking shortcuts (software developers face the same problem). But, since research builds on everything that came before, errors can propagate through the process unnoticed for a long time before they are eventually discovered (again, software developers face the same problem). It pays dividends to check each task’s deliverables for quality and completeness before building new research on previous results (this is why peer review exists). The team should decide how much effort it’s worth putting into different types of deliverables and quality control checks, depending on the risks of mistakes and the rewards of moving quickly. But if a task feels too unimportant to justify preparing any deliverable or going through any quality control process, that might be an indication that the task was never worth doing in the first place.
In the agile philosophy, the deliverables emphasize putting the team’s outputs into a useful state for the customers before starting new tasks. In research, this might mean that every task ends with a short report with a few relevant figures and a conclusion about what was learned by performing the task. These small reports, useful bits of code, etc. should be collected in a place where teammates or other “customers” will be able to easily look for them and find them. If any software was produced during the work (including small “one-off” scripts for plotting or analysis), this software should be delivered into the team’s repository so that it’s easy for others to reuse, expand, or distribute in the future. Knowledge that’s verbally communicated to the team at a meeting or sitting in a notebook where no one will look probably should not be considered “useful” yet, so that task would not yet be considered finished.
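One way to make the distinction between done and not done explicit is a small checklist. The sketch below assumes a team has settled on three criteria taken from the discussion above (a deliverable exists, it passed some quality control, and it is stored where others can find it). The class `TaskStatus` and its field names are hypothetical, and a real team would choose its own criteria and level of rigor.

```python
from dataclasses import dataclass


@dataclass
class TaskStatus:
    """Hypothetical definition-of-done checklist for a research task."""

    has_deliverable: bool = False         # e.g. a short report with figures and a conclusion
    passed_quality_control: bool = False  # e.g. reviewed by a peer on the team
    stored_in_shared_place: bool = False  # report and code are where teammates can find them

    def is_done(self) -> bool:
        # Assumption: all three criteria must hold before the task counts as done.
        return (
            self.has_deliverable
            and self.passed_quality_control
            and self.stored_in_shared_place
        )


# Results that were only mentioned verbally at a meeting: nothing is checked off.
verbal_only = TaskStatus()

# A reviewed report, committed to the team's repository together with its scripts.
archived = TaskStatus(
    has_deliverable=True,
    passed_quality_control=True,
    stored_in_shared_place=True,
)

print(verbal_only.is_done())
print(archived.is_done())
```

Under these assumed criteria, the verbally reported result is not yet done, while the reviewed and archived one is.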