
iDigBio and Data Carpentry go to Africa

Location: BIS (TDWG) 2015 Biodiversity Information Standards: An amazing 2 weeks in Nairobi, Kenya. by Deb Paul, input from Libby Ellwood and Matt Collins.

For the first time ever, the Biodiversity Information Standards (TDWG) Conference took place on the continent of Africa, in Nairobi, Kenya. Every 3 years, TDWG holds its annual meeting in a developing nation and uses the opportunity to provide biodiversity informatics training in the week before the conference. iDigBio staff members Matt Collins, Libby Ellwood, Kevin Love, and Deb Paul (that’s me) first headed to the BIS (TDWG) Training Week held at Multi-Media University (MMU), bordering Nairobi National Park. On day one, we jumped in with Training Week Organizer Henry (Hank) Bart as assistants in the GEOLocate Training.

On Wednesday and Thursday, it was time for a Data Carpentry (DC) Workshop. Twenty-four participants from 12 African countries (Zimbabwe, Benin, Ghana, Nigeria, Rwanda, Ethiopia, South Africa, Madagascar, Kenya, Guinea, Cameroon, and the Democratic Republic of the Congo) joined us for the DC experience. The group represented a rich array of scientific disciplines and roles, including ecology, data management, environmental information systems, forestry (silviculture, forest ecology and management), waterbird ecology and biodiversity informatics, biology (genetics), plant systematics and taxonomy, herpetology, earth science, conservation ecology, and microbiology.

Our two-day DC workshop, designed for novices, covered data organization in spreadsheets, an introduction to Open Refine, an introduction to R, and data analysis and visualization in R. Originally, we planned to present an introduction to SQL, but for several reasons beyond our control, we were not able to get to this topic. We hope to figure out how to provide this training in the future, perhaps online. All materials for the workshop can be found on our GitHub pages at: A Data Carpentry Workshop at Multimedia University of Kenya.

To keep track of our Data Carpentry efforts and set goals for the future, we conduct pre- and post-workshop assessment surveys. Of our participants, 88% said they gained a “great deal” from participating in this DC event. When asked about their data management skills following the workshop, 77% said they had higher or much higher skills. All 17 survey respondents (100%) agreed or strongly agreed that they could immediately apply what they learned at this workshop. Looking to the future, many of the participants indicated they wish to bring DC to their stakeholders across Africa.

While we were sad to see the biodiversity informatics training week end, it was time for BIS (TDWG) 2015. Thanks to JRS Foundation funding, all the JRS-funded training week participants also joined us and participated in BIS (TDWG) 2015. This support greatly enhanced BIS (TDWG), adding opportunities to find collaborative research ideas, develop collegial relationships, and increase involvement in BIS (TDWG) in the future. You can read more about our Data Carpentry experience in the blog post at Data Carpentry by iDigBio colleague Matt Collins.

Libby and I also organized a session at TDWG focused on Biodiversity Data Mobilization Models. In this session, Libby highlighted the ways citizen scientists can contribute to data mobilization through online transcription tools, collaborative georeferencing projects, and annotation possibilities. Mary Barkworth followed with a presentation of the research and educational resources of Open Herbarium. Matt Collins rounded out the first half of the session with a presentation about the DC Model, including a short demonstration of a lesson in R, Data Carpentry style. Nicky Nicolson described a practical application within the Open Refine framework. Jean Ganglo closed the session with an example of mobilizing plant specimen data in Benin. Together, these talks, from presenters in several countries, described the use of numerous tools and platforms.

If you would like to view recordings of these talks, or for that matter, any from TDWG 2015, you can do so here:

Matt also presented a poster co-authored with Alex Thompson and Jorrit Poelen on using Spark, a big data computing framework, with iDigBio data: Whole-dataset analyses using Apache Spark. A blog post providing background on the poster is currently on the iDigBio web site. Libby Ellwood presented Mapping Life: Quality Assessment of Novice and Computer Automated vs. Expert Georeferences with co-authors Henry Bart, Michael Doosey, Dean Jue, Gil Nelson, Nelson Rios, and Austin Mast. Listen to Libby’s presentation to find out what the research tells us about how the georeferences of experts differ from those of beginners with just a bit of training.

We’re already looking forward to BIS (TDWG) 2016 in Costa Rica. We hope to see many of the African participants there. How about a biodiversity informatics training week before every BIS meeting? What about having SPNHC and BIS meet together every few years? Our missions are mutually beneficial, and meeting together would foster and simplify collaboration and application development. From a digitization and data use perspective, we are looking forward to #BIS2016 and your part in it.

Thanks for reading,

Deb Paul, Matt Collins, Libby Ellwood, Kevin Love, et al. at iDigBio

This post was originally posted at and has been edited here to reduce length.
