== Problem Statement ==
We're seeing an increase in the publication of vast corpuses of document logs, often in the form of message archives, usually in a structured message format. Their sheer size is overwhelming: how can a reader make sense of such a vast amount of text, and how can the relevant sections be identified?
* Can we allow a large number of interested parties (anyone, really) to annotate these documents?
** Can we identify good conventions and techniques for the above that are more generally applicable? (Patterns of use)
* Finally, can we think of these functions as a layer on top of mere archives, and construct them as a physically separate service?
Note:
* These notes are limited to text document corpuses, and won't attempt to incorporate numerical/statistical/other data repositories.
== Exemplary Publications ==