User:Martind/Document Log Discovery Platform: Difference between revisions

From London Hackspace Wiki
Line 38: Line 38:
* Corpus ID: apollo13
* Document ID: 01:06:43:11
* Reading offset ID: #log-line-110591
* Presentation parameters:
* has rel="canonical"
** Reading offset ID: #log-line-110591
* To construct canonical URLs:  
** Document URL: URL template, corpus ID, document ID, reading offset
Line 45: Line 45:
** No means to query the corpus ID or reading offset for a given document ID
* Other observations:
** Has rel="canonical"
** Document IDs are actually timestamps.  
** Document IDs can collide within a corpus; this does not greatly affect presentation, but may affect integration with other services
Line 57: Line 58:
* To construct canonical document URL: base URL + corpus ID + document ID
** Can query corpus ID (username) via e.g. http://dev.twitter.com/doc/get/statuses/show/:id
'''Eur-Lex'''
* Document URL: http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2010:333:0001:0005:EN:PDF
* Corpus ID: OJ:L (Official Journal, legislation series)
* Document ID: 2010:333:0001:0005
* Presentation parameters:
** Language: EN
** Format: PDF
* To construct canonical document URL: URL template + corpus ID + document ID + presentation parameters
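
The Eur-Lex construction rule above can be sketched in Python. This is only an illustration: the template string is inferred from the single example URL in this entry, and the names <code>URL_TEMPLATE</code>, <code>build_document_url</code> and <code>split_uri</code> are hypothetical, not part of any Eur-Lex API. It also assumes the corpus ID is always the first two colon-separated fields and the presentation parameters are always the last two.

```python
# Sketch only: template and field layout are assumptions inferred from the
# one example URL in this entry, not a documented Eur-Lex interface.
URL_TEMPLATE = (
    "http://eur-lex.europa.eu/LexUriServ/LexUriServ.do"
    "?uri={corpus_id}:{document_id}:{language}:{fmt}"
)


def build_document_url(corpus_id, document_id, language="EN", fmt="PDF"):
    """Fill the template with corpus ID, document ID and presentation parameters."""
    return URL_TEMPLATE.format(
        corpus_id=corpus_id,
        document_id=document_id,
        language=language,
        fmt=fmt,
    )


def split_uri(uri):
    """Split a 'uri' query value back into its parts.

    Assumes: first two fields are the corpus ID, last two are the
    presentation parameters, and everything in between is the document ID.
    """
    fields = uri.split(":")
    return {
        "corpus_id": ":".join(fields[:2]),
        "document_id": ":".join(fields[2:-2]),
        "language": fields[-2],
        "fmt": fields[-1],
    }


if __name__ == "__main__":
    print(build_document_url("OJ:L", "2010:333:0001:0005"))
    print(split_uri("OJ:L:2010:333:0001:0005:EN:PDF"))
```

Keeping the presentation parameters (language, format) as separate arguments mirrors the split made in the list above: the corpus ID and document ID identify the document, while language and format only select a rendering of it.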
'''TODO'''
* Patent databases
* Databases of law
* ...


== Observations ==