User:Martind/Document Log Discovery Platform
From London Hackspace Wiki
Revision as of 23:39, 1 December 2010
Problem Statement
We're seeing an increase in the publication of vast corpora of data logs, often as message archives and usually in a structured message format. Their sheer size is overwhelming: how can we make sense of such a vast amount of text, and how can we identify the sections that are relevant?
- Can we allow a large number of interested parties (anyone, really) to annotate these documents?
- What kinds of annotations do we want to make? (Information structure)
- How can we make that easy? (Tools)
- ...
Exemplary Publications
- WikiLeaks datalog dumps
  - Iraq War Logs
  - Embassy Cables
- http://spacelog.org/
- http://news.ycombinator.com/item?id=1958292 "I would love to get an in-depth technical explanation of the requests and procedures -- how all this stuff works and insights into the troubleshooting process."
- etc
Observations
- What constitutes an "interesting" section of a document is a matter of perspective.
- Thus, such annotations become more useful if they're linked to a context (e.g. "this cable relates to news story X")
- Many parties already build browsers for such publications, each with its own way of navigating the content.
- We don't need to duplicate those efforts, but we should integrate with them.
- ...
Requirements
- Need a shared addressing scheme that works across document browsers
  - Based on permalinks
  - We can translate between different approaches
- Need ways to link/group individual messages
- Need ways to identify the level of "interestingness" (e.g. voting, cf. Q&A sites)
- Need ways to annotate content (with text, links)
- Contribution should be possible anonymously
- ...
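To make the requirements above more concrete, here is a minimal Python sketch of one possible data model. It is not a settled design. Everything named in it is an assumption for illustration: the two browsers ("browser-a", "browser-b"), their permalink formats, the corpus and message identifiers, and the class and function names are all hypothetical. It shows a browser-independent message address, translation between two permalink styles, annotations with text and links, a simple vote count, message groups, and an optional author so that contributions can be anonymous.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(frozen=True)
class MessageAddress:
    """Browser-independent address of a single message (hypothetical scheme)."""
    corpus: str      # which publication the message belongs to
    message_id: str  # identifier of the message within that corpus


# Hypothetical document browsers: a permalink template for building URLs
# and a matching regular expression for parsing them.
BROWSERS = {
    "browser-a": (
        "https://browser-a.example/{corpus}/{message_id}",
        re.compile(r"https://browser-a\.example/(?P<corpus>[^/]+)/(?P<message_id>[^/]+)"),
    ),
    "browser-b": (
        "https://browser-b.example/view?c={corpus}&m={message_id}",
        re.compile(r"https://browser-b\.example/view\?c=(?P<corpus>[^&]+)&m=(?P<message_id>[^&]+)"),
    ),
}


def parse_permalink(url: str) -> Optional[MessageAddress]:
    """Return the shared address for a known permalink, or None if unrecognised."""
    for _template, pattern in BROWSERS.values():
        match = pattern.fullmatch(url)
        if match:
            return MessageAddress(**match.groupdict())
    return None


def to_permalink(address: MessageAddress, browser: str) -> str:
    """Build the permalink for an address in the given browser."""
    template, _pattern = BROWSERS[browser]
    return template.format(corpus=address.corpus, message_id=address.message_id)


def translate(url: str, target_browser: str) -> str:
    """Translate a permalink from any known browser to the target browser."""
    address = parse_permalink(url)
    if address is None:
        raise ValueError("unrecognised permalink: " + url)
    return to_permalink(address, target_browser)


@dataclass
class Annotation:
    """A note attached to one message, optionally linked to a context."""
    target: MessageAddress
    text: str
    links: List[str] = field(default_factory=list)  # e.g. a related news story
    author: Optional[str] = None                    # None means anonymous
    votes: int = 0                                  # crude "interestingness" score


@dataclass
class Group:
    """A named set of messages that belong together."""
    name: str
    members: Set[MessageAddress] = field(default_factory=set)


# Usage with made-up identifiers.
url_a = "https://browser-a.example/embassy-cables/msg-42"
url_b = translate(url_a, "browser-b")

note = Annotation(
    target=parse_permalink(url_a),
    text="This cable relates to news story X",
    links=["https://news.example/story-x"],
)
note.votes += 1

group = Group(name="story-x")
group.members.add(note.target)
```

Keeping the address separate from any single browser's URL format is the main design choice here: annotations, votes and groups all refer to the shared address, so they stay valid whichever browser a reader uses.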