Line 15: | Line 15: | ||
Look out for: | Look out for: | ||
* Canonical URLs (implicit, or explicit via rel="canonical") | * Canonical document URLs (implicit, or explicit via rel="canonical") | ||
* Corpus IDs | |||
* Document IDs | * Document IDs | ||
* Document ranges for timeline browsers | * Document ranges for timeline browsers | ||
* Additional presentation information, e.g. reading offset | * Additional presentation information, e.g. reading offset | ||
* How to construct canonical URLs from document IDs | |||
'''WikiLeaks Iraq War Logs''' | '''WikiLeaks Iraq War Logs''' | ||
* Document URL: http://warlogs.wikileaks.org/id/BCD499A0-F0A3-2B1D-B27A2F1D750FE720/ ([http://web.archive.org/web/20101030053024/http://warlogs.wikileaks.org/id/BCD499A0-F0A3-2B1D-B27A2F1D750FE720/ archive.org]) | * Document URL: http://warlogs.wikileaks.org/id/BCD499A0-F0A3-2B1D-B27A2F1D750FE720/ ([http://web.archive.org/web/20101030053024/http://warlogs.wikileaks.org/id/BCD499A0-F0A3-2B1D-B27A2F1D750FE720/ archive.org]) | ||
* Document ID: BCD499A0-F0A3-2B1D-B27A2F1D750FE720 | * Document ID: BCD499A0-F0A3-2B1D-B27A2F1D750FE720 | ||
* To construct a canonical document URL: Base URL + document ID | |||
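The "Base URL + document ID" rule for the Iraq War Logs can be sketched as plain string concatenation. This is a minimal illustration, assuming the base URL is the part of the example document URL before the ID and that the trailing slash is part of the canonical form; the function and constant names are made up for this sketch.

```python
from urllib.parse import quote

# Assumption: taken from the example document URL above, minus the ID.
WARLOGS_BASE_URL = "http://warlogs.wikileaks.org/id/"


def warlogs_document_url(document_id: str, base_url: str = WARLOGS_BASE_URL) -> str:
    """Build a canonical document URL as base URL + document ID."""
    # Percent-encode the ID so unexpected characters cannot break the path.
    return f"{base_url}{quote(document_id, safe='')}/"


print(warlogs_document_url("BCD499A0-F0A3-2B1D-B27A2F1D750FE720"))
```

Because the document ID alone is enough here, no corpus ID or reading offset is needed to form the permalink.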
'''WikiLeaks Embassy Cables''' | '''WikiLeaks Embassy Cables''' | ||
Line 28: | Line 31: | ||
* Document ID: 10COPENHAGEN69 | * Document ID: 10COPENHAGEN69 | ||
* Browsers: [http://cablesearch.org/ cablesearch.org], | * Browsers: [http://cablesearch.org/ cablesearch.org], | ||
* To construct a canonical document URL: Base URL + document ID | |||
'''SpaceLog''' | '''SpaceLog''' | ||
Line 35: | Line 39: | ||
* Document ID: 01:06:43:11 | * Document ID: 01:06:43:11 | ||
** not enough to form a URL. Need to know corpus ID to construct permalink | ** not enough to form a URL. Need to know corpus ID to construct permalink | ||
* Reading offset ID: #log-line-110591 | * Reading offset ID: #log-line-110591 | ||
* has rel="canonical" | * has rel="canonical" | ||
* To construct canonical URLs: | |||
** Document URL: URL template, corpus ID, document ID, reading offset | |||
** Document range URL: URL template, corpus ID, document IDs, reading offset | |||
** There is no way to query the corpus ID or reading offset for a given document ID | |||
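The SpaceLog case needs more inputs than a bare document ID. A rough sketch of filling a URL template from a corpus ID, document ID(s) and reading offset follows. The templates, the `example.org` host and the corpus ID `example-mission` are hypothetical placeholders, not SpaceLog's actual URL scheme; only the document ID and reading offset values come from the notes above.

```python
# Hypothetical templates: the real SpaceLog URL layout is not given here.
DOCUMENT_TEMPLATE = "http://{corpus_id}.example.org/{document_id}/{reading_offset}"
RANGE_TEMPLATE = "http://{corpus_id}.example.org/{start_id}/{end_id}/{reading_offset}"


def document_url(template: str, corpus_id: str, document_id: str,
                 reading_offset: str = "") -> str:
    """Fill a document URL template; the reading offset is optional."""
    return template.format(corpus_id=corpus_id, document_id=document_id,
                           reading_offset=reading_offset)


def document_range_url(template: str, corpus_id: str, start_id: str,
                       end_id: str, reading_offset: str = "") -> str:
    """Fill a range URL template from the first and last document IDs."""
    return template.format(corpus_id=corpus_id, start_id=start_id,
                           end_id=end_id, reading_offset=reading_offset)


# "example-mission" is a made-up corpus ID for illustration only.
print(document_url(DOCUMENT_TEMPLATE, "example-mission",
                   "01:06:43:11", "#log-line-110591"))
```

Since neither the corpus ID nor the reading offset can be looked up from a document ID, the caller has to supply both alongside the ID.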
'''Twitter''' | '''Twitter''' | ||
Line 44: | Line 52: | ||
* Corpus ID: wikileaks | * Corpus ID: wikileaks | ||
* Document ID: 15975805188317184 | * Document ID: 15975805188317184 | ||
** | * To construct canonical document URL: base URL + corpus ID + document ID | ||
** Can query corpus ID (username) via e.g. http://dev.twitter.com/doc/get/statuses/show/:id | |||
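For Twitter, the corpus ID (username) can be recovered from the document ID, so a permalink can be built in two steps. The sketch below assumes the familiar `/status/` path segment and the `http://twitter.com/` base URL, neither of which is spelled out in the notes above. The lookup is passed in as a caller-supplied function rather than a real API call, and all names are illustrative.

```python
from typing import Callable

# Assumption: base URL and "/status/" segment follow Twitter's usual permalink form.
TWITTER_BASE_URL = "http://twitter.com/"


def twitter_status_url(corpus_id: str, document_id: str,
                       base_url: str = TWITTER_BASE_URL) -> str:
    """Build a canonical URL as base URL + corpus ID + document ID."""
    return f"{base_url}{corpus_id}/status/{document_id}"


def twitter_status_url_from_id(document_id: str,
                               lookup_username: Callable[[str], str]) -> str:
    """Resolve the corpus ID (username) first, then build the URL."""
    # lookup_username stands in for a query such as statuses/show/:id.
    return twitter_status_url(lookup_username(document_id), document_id)


# Stub lookup for illustration; a real client would query the service.
known = {"15975805188317184": "wikileaks"}
print(twitter_status_url_from_id("15975805188317184", known.__getitem__))
```

Unlike the SpaceLog case, the document ID is sufficient on its own here, at the cost of one extra lookup.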
== Observations == | == Observations == |