User:Martind/Document Log Discovery Platform

From London Hackspace Wiki

Problem Statement

We're seeing an increase in the publication of vast corpora of data logs, often message archives in a structured message format. They are overwhelming: how do we make sense of such a vast amount of text, and how do we identify the sections that are relevant?

  • Can we allow a large number of interested parties (anyone, really) to annotate these documents?
    • What kinds of annotations do we want to make? (Information structure)
    • How can we make that easy? (Tools)
  • ...

Exemplary Publications

Observations

  • What constitutes an "interesting" section of a document is a matter of perspective.
    • Thus, such annotations become more useful if they're linked to a context (e.g. "this cable relates to news story X")
  • Many parties are already building browsers for such publications, each with its own way of navigating the content.
    • We don't need to duplicate those efforts, but we should integrate with them.
  • ...
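
As a rough illustration of the first observation, an annotation could carry the context it was made for, so that "interesting" messages can be looked up per perspective. This is only a sketch: the `Annotation` fields, the `group_by_context` helper, and the message identifiers are hypothetical and not part of any existing tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One annotation on one message (all field names are assumptions)."""
    message: str    # identifier of the annotated message, format left open
    context: str    # what the message relates to, e.g. a news story
    note: str = ""  # optional free-text comment


def group_by_context(annotations):
    """Return a dict mapping each context to the messages annotated for it."""
    grouped = defaultdict(list)
    for annotation in annotations:
        grouped[annotation.context].append(annotation.message)
    return dict(grouped)


# Illustrative data only
annotations = [
    Annotation("cable-1", "news story X"),
    Annotation("cable-2", "news story Y", note="background"),
    Annotation("cable-3", "news story X"),
]
by_context = group_by_context(annotations)
```

With this shape, a reader following "news story X" sees only the messages annotated for that story, while the same message can still appear under other contexts.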

Requirements

  • Need a shared addressing scheme that works across document browsers
    • Based on permalinks
    • We can translate between the different browsers' approaches
  • Need ways to link/group individual messages
  • Need ways to identify level of "interestingness" (e.g. voting, cf. Q&A sites)
  • Contribution should be possible anonymously
  • ...
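
The shared addressing requirement could be sketched as follows: each message gets a browser-independent address, and permalinks of individual document browsers are parsed into, and generated from, that address. Everything here is an assumption made for illustration: the two browsers, their `example.org` URL layouts, and the `MessageAddress` fields are hypothetical, and URL escaping is ignored.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class MessageAddress:
    """Browser-independent address of one message (hypothetical scheme)."""
    corpus: str
    message_id: str


# Hypothetical permalink layouts of two document browsers.
BROWSER_TEMPLATES = {
    "browser-a": "https://browser-a.example.org/{corpus}/{message_id}",
    "browser-b": "https://browser-b.example.org/view?set={corpus}&msg={message_id}",
}

# Matching patterns used to parse those permalinks back into an address.
BROWSER_PATTERNS = {
    "browser-a": re.compile(
        r"^https://browser-a\.example\.org/(?P<corpus>[^/]+)/(?P<message_id>[^/?#]+)$"
    ),
    "browser-b": re.compile(
        r"^https://browser-b\.example\.org/view\?set=(?P<corpus>[^&]+)&msg=(?P<message_id>[^&#]+)$"
    ),
}


def to_permalink(address, browser):
    """Build the permalink for an address in the given browser."""
    return BROWSER_TEMPLATES[browser].format(
        corpus=address.corpus, message_id=address.message_id
    )


def from_permalink(url):
    """Parse a permalink of any known browser into a shared address."""
    for pattern in BROWSER_PATTERNS.values():
        match = pattern.match(url)
        if match:
            return MessageAddress(match["corpus"], match["message_id"])
    raise ValueError(f"unrecognised permalink: {url}")


def translate(url, target_browser):
    """Translate a permalink from one browser into another browser's form."""
    return to_permalink(from_permalink(url), target_browser)
```

For example, `translate("https://browser-a.example.org/cables/msg-001", "browser-b")` would yield the corresponding `browser-b` permalink. Annotations, groups and votes could then refer to the shared address rather than to any one browser's URL.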