Write related work

From Organic Data Science Framework


THE TEXT OF THE RELATED WORK SECTION FOLLOWS. IT IS BROKEN UP INTO SUBSECTIONS. -- -- Yolanda, 29 Sep 2014


Scientific Collaboration

[Bos et al 2007] conducted a comprehensive multi-year study of scientific collaborations and proposed seven types of collaboratories (MAYBE PUT THIS IN A TABLE??): 1) Shared Instruments, where instruments or sensors are used by a community (e.g., the National Ecological Observatory Network [cite NEON]); 2) Community Data Systems, where a data resource is maintained and used by a community (e.g., the Protein Data Bank [cite PDB]); 3) Open Community Contribution Systems, where tasks are carried out by a community that includes citizen scientists (e.g., the GalaxyZoo citizen science project for labeling galaxy images [cite Zooniverse]); 4) Virtual Communities of Practice, where a community shares an interest in specific research topics (e.g., the Global Lake Ecological Observatory Network [cite GLEON]); 5) Virtual Learning Communities, where the purpose is to learn through the collaboration (e.g., the VIVO research network [Krafft et al 2010]); 6) Distributed Research Centers, where several institutions collaborate in a funded project (e.g., the ENCODE genomics project [cite ENCODE]); and 7) Community Infrastructure Projects, where a community gets together to develop shared computing and software infrastructure (e.g., the Community Surface Dynamics Modeling System [Peckham et al 2013]). Our work has some of the properties of a distributed research center, since the project is jumpstarted by a multi-institutional collaboration, and it is an open community contribution system, but without the prescribed tasks typically found in those systems. Organic Data Science can be considered a new type of collaboratory, where tasks are defined on the fly as the project progresses and the collaboration includes unanticipated contributors.

[Ribes and Finholt 2009] analyze the challenges of organizing work in four scientific collaborations: GEON (Geosciences Network), LEAD (Linked Environments for Atmospheric Discovery), WATERS (Water and Environmental Research Systems), and LTER (Long-Term Ecological Research). They identify three major challenges for organizing work: 1) the tension between planned work, with its work breakdown structures and deadlines, and emergent organization as new requirements and unknowns are uncovered; 2) the tradeoff that participants face between doing basic research and contributing to the technical development that supports the research; and 3) the desire to incorporate innovations while needing a stable framework in which to do research. Organic Data Science is poised to offer the flexibility to incorporate emergent tasks and people easily, and to entice participants by acknowledging their contributions, so that uneven support from particular contributors is properly exposed.

On-Line Collaboration Systems

Some on-line collaboration tools have been developed to support science. [Introne et al 2013] describe the Climate CoLab, a collaborative environment for climate research. It offers argumentation structures in which evidence and hypotheses from different scientists can be compared and integrated to create a common view of climate research. This work, however, focuses on organizing the results of scientific work rather than on supporting science research tasks while they are being carried out. In addition, climate research can be considered a single discipline, whereas we are investigating the integration of multi-disciplinary research.


NEED TO ADD MORE HERE. -- -- Yolanda, 29 Sep 2014


Task-Oriented Collaboration Tools

Some task-oriented collaboration systems have been developed for information seeking tasks (e.g., Web search). An example is Kolline [Filho et al 2010], which supports collaboration between inexperienced users who need help and more advanced users who can provide it. In contrast, our goal is to support tasks that have interrelated subtasks and that involve collaboration among peers.

We find inspiration in the Polymath project, set up to collaboratively develop proofs for mathematical theorems [Nielsen 2011; Gowers 2009a], where professional mathematicians collaborate with volunteers who range from high-school teachers to engineers to solve mathematics conjectures. The collaboration is centered around tasks that contributors create, decompose, reformulate, and resolve. This project uses common Web infrastructure for collaboration, interlinking public blogs for publishing problems and associated discussion threads [Nielsen 2013] with wiki pages that are used for write-ups of basic definitions, proof steps, and the overall final publication [Gowers 2013]. Interactions among contributors to share tasks and discuss ideas are regulated by a simple set of guidelines that serve as social norms for the collaboration [Gowers 2009b]. Social norms are found in other collaborations [Kraut and Resnick 2011; Birney 2013], and incorporate mechanisms for adjudication and credit.

Organizational and Knowledge Management Literature

[Polanyi] coined the terms tacit and explicit knowledge and discussed the differences between them for individuals in organizations. According to Polanyi, an individual can have tacit knowledge without being able to explicitly express this knowledge in its essence. In contrast, explicit knowledge can be communicated in formal languages that can be processed by other people. In their theory of organizational knowledge creation, Nonaka and Takeuchi described four modes of transformation between tacit and explicit knowledge: socialization, externalization, internalization, and combination [H. Takeuchi and I. Nonaka]. In the Organic Data Science project we aim to externalize the tacit knowledge of researchers in order to formulate and resolve tasks in the science process through ad-hoc collaboration in an open framework. While we focus on science processes in this paper, Davenport also described the importance of processes for the productivity of knowledge workers in an organizational context [Davenport, Thomas H].

This section should also include:

  • cognitive science papers about how people organize work as task decomposition
  • knowledge representation papers about how tasks are represented
  • AI planning work on hierarchical task networks (HTNs)

References (INCOMPLETE)

[Ribes and Finholt 2009] Ribes, D. and T. A. Finholt (2009). "The long now of infrastructure: Articulating tensions in development." Journal of the Association for Information Systems (JAIS): Special issue on eInfrastructures 10(5): 375-398.

[Introne et al 2013] Joshua Introne, Robert Laubacher, Gary M. Olson, Thomas W. Malone: Solving Wicked Social Problems with Socio-computational Systems. KI 27(1): 45-52 (2013).

[Bos et al 2007] Nathan Bos, Ann Zimmerman, Judith S. Olson, Jude Yew, Jason Yerkie, Erik Dahl, Gary M. Olson: From Shared Databases to Communities of Practice: A Taxonomy of Collaboratories. J. Computer-Mediated Communication 12(2): 652-672 (2007).

[Filho et al 2010] Fernando Marques Figueira Filho, Gary M. Olson, Paulo Lício de Geus: Kolline: a task-oriented system for collaborative information seeking. SIGDOC 2010: 89-94.

[Birney 2013] “Lessons for big data projects.” Ewan Birney. Nature, Special Issue on the ENCODE project, 6 September 2012.

[Krafft et al 2010] “VIVO: enabling national networking of scientists.” D Krafft, N Cappadona, B Caruso, J Corson-Rikert, M Devare, B Lowe, and VIVO Collaboration. Conference on Web Science (WebSci), Raleigh, NC, April 2010.

[Kraut and Resnick 2011] “Building Successful Online Communities: Evidence-Based Social Design.” Robert E. Kraut and Paul Resnick. MIT Press, 2011.

I. Nonaka and H. Takeuchi, The Knowledge-Creating Company: How Japanese Companies Create the Dynamics of Innovation. New York: Oxford University Press, 1995.

H. Takeuchi and I. Nonaka, Hitotsubashi on Knowledge Management. Singapore: John Wiley & Sons (Asia), 2004.

M. Polanyi, The Tacit Dimension. Gloucester, Mass.: Peter Smith Publisher Inc, 1983.

Davenport, Thomas H. Thinking for a living: how to get better performances and results from knowledge workers. Harvard Business Press, 2013.

