TL;DR As the complexity and scale of data science projects grow, so does the importance of teamwork. In this article we discuss when teamwork is a high priority, the challenges specific to data science teamwork, and the critical factors that foster highly productive data science teams.
The truth about teamwork
Oftentimes we are led to believe in the lone genius myth – that a great mind somewhere has a moment of inspiration or eureka that leads to dramatic and meaningful developments. While this might be a tempting way to view history and technology - for the most part it simply isn’t correct. Michelangelo, the famous artistic genius, didn’t paint the entire Sistine Chapel by himself - he had a whole team of assistants working with him. Even the discoveries of famous scientists such as Charles Darwin and Thomas Edison are based on the creativity of collaborative groups.
According to a study published in the HBR, “over the past two decades, the time spent by managers and employees in collaborative activities has ballooned by 50% or more.” Teamwork matters and is becoming more and more important. Of course, if you are part of a team at the moment - you don’t need the Harvard Business Review to tell you that. Odds are you have experienced both the power and potential of collaboration and also the frustration of being part of a team that just doesn’t click and work effectively together. Let’s talk about how this translates specifically into the field of Data Science.
When is collaboration a must in data science?
Real-world projects require people with different expertise and abilities. In the field of data science, teams need a varying mix of (among other titles):
- Data Scientists
- Machine Learning Engineers
- Data Engineers
- Software Developers
- Data Analysts
- Product Managers
Here are some cases in which collaboration isn’t just optional but necessary:
- Scale: As your projects grow, your team will need to grow in tandem. This usually also means the need for specialization. “Throwing” people at a problem without building a team isn’t going to be effective. A real team produces more than the sum of its members’ individual work, which is the idea of synergy.
- Faster Turnaround: Moving fast is a top priority for most organizations. Good teamwork allows the workload to be shared, which reduces pressure on individuals and helps ensure that tasks are completed within the given time frame. Hero Ball, a basketball play style where you give the ball to your best player and hope they score while forgoing any notion of team play, doesn’t work. Not in the NBA, and not in data science teams.
- Flexibility: Teamwork is key for dealing with events that can impact your workflow. If an entire project is contingent on a certain person being available, you might be in for some trouble (see: bus factor). A well-functioning team can absorb someone going on vacation or moving on to their next role. A methodical and efficient handover is important for the organization and also for the individual. Handing over a project can be simple, but a badly done handover can derail the entire project.
- Future proof and disaster recovery: Even if you are currently working alone, you are still performing collaborative work (though it might not seem that way). The most common type of teamwork is working with your future self. You will, in all likelihood, need to revert to an older version of a project to understand the work done in it. It might be because of a bug in production, or just a vague memory that there was something good going on in that experiment. Having methodologies and tools in place for this workflow can save a lot of time and grief, but many organizations implement them only after being burned.
After establishing the need for teamwork in the field of data science - let’s discuss some of the key challenges that teams face – challenges specific to data science teams and generic challenges all teams face.
Challenges specific to data science teamwork
In general, tracking experiment outcomes can be challenging even for single users. As teams grow, tracking experiments, data, and context in an effective way that can enable sharing results is downright difficult. As models move into production, tracking context (read: where did this model come from?!) grows in importance, especially across users and disparate platforms. If this is an issue you feel you’re facing - we recommend looking into different experiment tracking tools that can help you and your team track experiments easily and effectively. You can read more about options to track experiments here.
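To make the idea concrete, here is a minimal sketch of what shared experiment tracking boils down to, assuming the team simply appends runs to one JSON Lines file. The function names (`log_experiment`, `load_experiments`, `best_run`) and the record fields are hypothetical and only for illustration. A dedicated tracking tool does far more, but the core is the same: every run records who ran it, with which parameters, and what came out.

```python
import json
import time
from pathlib import Path


def log_experiment(log_path, user, params, metrics, note=""):
    """Append one experiment record to a shared JSON Lines file."""
    record = {
        "timestamp": time.time(),
        "user": user,
        "params": params,
        "metrics": metrics,
        "note": note,
    }
    with Path(log_path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def load_experiments(log_path):
    """Read every record back, oldest first (empty list if no log yet)."""
    path = Path(log_path)
    if not path.exists():
        return []
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def best_run(log_path, metric, higher_is_better=True):
    """Return the logged run with the best value for the given metric."""
    runs = [r for r in load_experiments(log_path) if metric in r["metrics"]]
    if not runs:
        return None
    pick = max if higher_is_better else min
    return pick(runs, key=lambda r: r["metrics"][metric])
```

With something like this in place, "where did this model come from?!" becomes a lookup instead of a guessing game: `best_run("runs.jsonl", "accuracy")` returns the full record, including the teammate who ran it and the parameters they used.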
A key challenge of data science is the ability to reproduce earlier work easily so as not to waste energy and resources exploring the same hypothesis twice. Reproducing the results of experiments conducted in teams, where different team members work on different components, is hard to do. Teams need to be able to retrieve the versions of the code, data, and model used in an experiment, even when they came from different users.
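One way to picture this is a small manifest saved next to every result. The sketch below is an illustration under assumptions, not a prescribed method: `experiment_manifest` and `manifest_id` are hypothetical names, and the code version is assumed to be something like a Git commit hash that the caller passes in. The point is that a teammate can later check whether they hold exactly the same code, data, and model that produced a result.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file's bytes in chunks so large files are never fully loaded."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def experiment_manifest(code_version, data_path, model_path, params):
    """Describe which code, data, model, and settings produced a result."""
    return {
        "code_version": code_version,  # assumed: e.g. a Git commit hash from the caller
        "data_sha256": file_sha256(data_path),
        "model_sha256": file_sha256(model_path),
        "params": params,
    }


def manifest_id(manifest):
    """Short, stable identifier: identical inputs always give the same id."""
    canonical = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()[:12]
```

If two team members compute the same `manifest_id`, they are looking at the same experiment. If the ids differ, comparing the manifests field by field shows whether the code, the data, the model, or the parameters changed.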
The iterative approach to data science emphasizes building a first model quickly and then continuously improving and tweaking it. By using automatic diagnostics and manual analysis, team members can find weak spots and areas that require improvement. Iteration requires data scientists to work closely with domain experts to improve the model. It also requires communication across teams so that everyone stays abreast of the changes being made and the new problems that need to be solved.
Hiring Challenges / Scarcity of data science roles
Ask any team lead: hiring data professionals is really, REALLY hard. That scarcity makes the productivity of the people you already have all the more important. If existing team members could be more productive, a five-person team could have twice the impact.
General teamwork challenges
There are numerous challenges that all teams face - this list is by no means exhaustive but can help frame some key ones to be aware of.
Lack of trust
Trust is key to any relationship. A lack of trust can create a toxic culture, hamper communication, and destroy productivity. Teams are composed of humans, and trust is a major factor in any human interaction. The trick of trust is that it can’t be manufactured or conjured out of thin air. It’s earned and built over time. Now is the time to start.
Un-self-aware team members can diminish teams’ chances of success by half (!). They can lead to more stress within teams and diminish motivation. Part of dealing with this challenge requires creating open and supportive frameworks and teams where constructive feedback can be shared and accepted.
Lack of engagement
If team members are engaged and care about their work they will go the extra mile to get things done. This is a hard thing to build but crucial both for the well-being of your people and the success of your organization. You can read more about this here.
A key challenge to teams can be poor planning. Unclear schedules combined with a failure to discuss areas of responsibility lead to a mess. Team members don’t know how to manage their time or which tasks to prioritize, and they experience a general sense of dissatisfaction and frustration. Poor planning can also lead to last-minute stress and tension, which can harm the team dynamic and lead to low-quality results.
So if these are some of the key challenges to teamwork - what are the key success factors that lead to great teamwork?
Critical success factors for teamwork
1. Creating a shared perspective of the problem
You are trying to solve a problem – the gap between your goal and the current state. Often, the same problem can be seen from widely varying perspectives. Once team members understand the situation and the target, finding and implementing the solution is relatively easy. A practical way to achieve this is to define the problem in writing and make it available to the team. This should have a short “catchy” version that team members can easily call up to memory, as well as a more detailed version if further research is needed. Some companies take this to the next level and add descriptions of the root cause, how the customer feels about the problem, as well as KPIs and the impact of the problem.
2. Familiarity with everyone’s strengths/weaknesses
A crucial success factor for teams is familiarity within the team, including specific knowledge of each other’s strengths and weaknesses. Gallup has shown that strengths-based development leads to huge increases in productivity and engagement. Familiarity with weaknesses is just as important, if not more so. Being aware of others’ weaknesses can help unclog bottlenecks or solve potential issues before they arise. Additionally, it can foster a deep level of trust among team members when they know that their strengths will be celebrated and that others will help with their weaknesses.
3. Asynchronous workflows
Physical distancing in the past year has made teams adapt to working asynchronously and remotely. Some teams were already set up for remote work across the globe. Post-pandemic, teams will need to retain and further develop the ability to work on different schedules without letting that hurt their ability to collaborate and share ideas. Adding a virtual workspace for your team is a good idea. Most companies today use Slack or something similar, but from our experience, Discord is better for team communication due to its superior voice channels. You can define a “Virtual Office” voice channel and virtual meeting rooms, which will let team members communicate across continents (and to some degree time zones). This also ties into trust, since micromanaging is harder in asynchronous workflows (which is a positive if you ask me).
4. Division of responsibility
For teams to operate effectively, different areas of responsibility need to be clearly delineated and explained. This is a great way of enabling team members to play to their strengths. Additionally, division of responsibility can eliminate wasteful instances where the same thing is done more than once or where people work on unnecessary tasks.
5. Learning and sharing
Learning and sharing within teams is the best way to tap into the expertise and experience that already exists within your organization. A key to enabling peer-to-peer learning is building an environment in which people can give constructive feedback to each other, and where team members are open to feedback. If experience is shared with peers, productivity will rise as one person’s learned lessons will benefit everyone. One practical idea we’ve seen in companies we’ve worked with is starting an internal guild, where data scientists from the entire organization meet every once in a while to discuss professional challenges and other areas of interest. You can invite external speakers to enrich the conversation, perhaps as part of guild meetings, or review papers that are related to topics that people are working on. Another idea is to form a mentorship program, where more junior team members are paired with senior ones to share knowledge. Trust me, this will be valuable to both mentor and mentee.
6. Sharing recognition
Most organizations work in teams, and collaboration is rightfully prioritized. And yet, many organizations still focus on recognition programs aimed at motivating individuals and not teams. This ignores a fundamental tenet of psychology and management: people work harder for things that will lead to recognition and reward. By rewarding people based on their individual performance while trying to encourage teamwork, organizations are incentivizing the wrong thing! Recognition needs to be shared as a team and not handed out individually. Additionally, this is a great way to encourage a sense of pride and belonging within teams.
7. Better communication
Hundreds of articles and books have been written about the importance of good communication for teamwork. It seems obvious to say this, but it still needs to be said – without good communication teamwork can’t happen. As simple as that. In essence, it is the key to everything else on this list. Almost 80% (!) of employees’ time is spent on communication or interacting with co-workers on work-related activities. Make time for communication, keep things clear, and be respectful to each other.
Teamwork is crucial across different domains, including data science. Successful teamwork faces many challenges, both domain-specific and general. By identifying key success factors and implementing solutions and frameworks within your organization, you can create a high-performing team. These factors are not purely theoretical, and you can take concrete actions to build or improve them.
Thank you to Ori Cohen, Noa Weiss, and Shir Meir Lador for their insights on teamwork and project management for data science teams.