Revision as of 12:30, October 30, 2013

PundIt MarineLives Forum

Editorial history

30/10/13: CSG, created page



Purpose of this page

This page provides a discussion forum and set of resources for MarineLives project members exploring the functionality of the PundIt tool.


Background

PundIt is an experimental semantic annotation tool for web pages which is currently under further development by Net7, and which is being used by the DM2E project.

The DM2E project is a project of Europeana, which has emerged out of the European Digital Library Network. Dr Christian Morbidioni and Dr Kai Eckert are two of the DM2E project workstream leaders, and have approached the MarineLives project leadership team to explore the potential for them to collaborate with us and with our partners at Bath Spa University and the National Archives.

As a first and important step, MarineLives is working with Dr Christian Morbidioni of the University of Ancona and Simone Fonda of Net7 to explore a working demo of PundIt. The demo can be found here.

In parallel, MarineLives is working with Dr Kai Eckert and Dominique Ritze of the University of Mannheim, to explore the potential for automatic and semi-automatic entity recognition for MarineLives transcriptions. The topics of semantic annotation and entity recognition are clearly closely related.



Approach to evaluation

We would like to focus our initial experimentation with PundIt on the High Court of Admiralty deposition book, HCA 13/72.

Roughly 700 pages of HCA 13/72 have been transcribed and edited, and are available in edited form on the Annotate HCA 13/72 wiki. Digital images of many (but not all) of the same transcribed pages can be viewed, together with the transcriptions, in our tailored transcription software, MarineLives - Transcript.

We suggest that evaluators try annotating web pages from both the wiki version of the transcribed text and the MarineLives - Transcript version of the transcribed text (and indeed the images themselves and/or image fragments).

The current PundIt demo has been set up with sample custom vocabularies extracted from the Annotate HCA 13/71 wiki. Colin Greenstreet is exploring with Simone Fonda how we can create new custom vocabularies for ships, people, places and materials specifically for HCA 13/72, and how we can then add and edit new individual records in these custom vocabularies. First cut wiki versions of terms for these controlled vocabularies for HCA 13/72 are available as follows:

HCA 13/72 People
HCA 13/72 Materials
HCA 13/72 Places
HCA 13/72 Ships

A discussion is required with the PundIt team on how to handle spelling variants. The MarineLives approach (to date) has been to capture ALL spelling variants in our vocabulary lists.
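As a starting point for that discussion, one way to keep ALL spelling variants while still resolving them to a single vocabulary entry is a variant-to-preferred-label lookup. The sketch below is illustrative only: the function names, the preferred label and the variant spellings are hypothetical examples rather than entries from the HCA 13/72 lists, and it makes no claim about how PundIt stores its vocabularies.

```python
def build_variant_index(vocabulary):
    """Map every spelling (lower-cased) to its preferred label.

    `vocabulary` is a dict of preferred label -> list of variant spellings.
    """
    index = {}
    for preferred, variants in vocabulary.items():
        for spelling in [preferred, *variants]:
            index[spelling.strip().lower()] = preferred
    return index


def resolve(index, spelling):
    """Return the preferred label for a spelling, or None if it is unknown."""
    return index.get(spelling.strip().lower())


# Hypothetical sample data, for illustration only.
people = {"Christopher Myngs": ["Christopher Mings", "Christofer Minns"]}
people_index = build_variant_index(people)
```

With this arrangement every variant stays in the list, and an annotation made against any spelling can be pointed at the same person, place, ship or material record.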






Suggested links


DM2E
Europeana

Annotate HCA 13/72 wiki
MarineLives - Transcript: HCA 13/72 pages



Rolling list of questions about PundIt functionality in the context of MarineLives project


Please post your questions here, together with answers and comments as they emerge.



ANNOTATING WIKI AND OTHER PAGES & PAGE FRAGMENTS



  • We have a number of wikis hosted by wikispot.org, with standard naming conventions (e.g. http://annotatehca1371.wikispot.org/; http://annotatehca1372.wikispot.org/)


  • Each of these wikis has a standard front page structure and a standard structure for transcribed pages.


  • The standard structure for the transcribed pages includes standard page fragments, with HTML links to those fragments, so that both the pages and the page fragments can be addressed and accessed from other pages.


e.g. http://annotatehca1372.wikispot.org/HCA_13/72_f.4r_Annotate displays our transcription of folio 4 recto of the deposition volume HCA 13/72 (covering the years 1657 and 1658)

e.g. with a standard fragment structure of:

       Suggested links
       Transcription
       Topics
           People
           Places
           Ships
           Materials
           Miscellaneous
       Sources
           Primary sources
           Secondary sources


e.g. with an addressable HTML fragment address for the transcription of http://annotatehca1372.wikispot.org/HCA_13/72_f.4r_Annotate#head-7792b396c165940a2ef3372031f6dbb64b71233e
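The addressing scheme in the examples above can be sketched in a few lines of Python. This is a minimal sketch, assuming that every transcribed page follows the same naming pattern as the f.4r example; the function names are hypothetical, and whether PundIt treats the fragment address as a "Page" or a "Text-Fragment" remains the open question below.

```python
from urllib.parse import urldefrag


def annotate_page_url(volume, folio, side, fragment=None):
    """Build the wiki URL for one transcribed page of HCA 13/<volume>.

    Assumes the pattern seen in the HCA 13/72 f.4r example holds for all pages.
    """
    if side not in ("r", "v"):
        raise ValueError("side must be 'r' (recto) or 'v' (verso)")
    host = f"annotatehca13{volume}.wikispot.org"
    url = f"http://{host}/HCA_13/{volume}_f.{folio}{side}_Annotate"
    return f"{url}#{fragment}" if fragment else url


def split_fragment(url):
    """Return (page_url, fragment_id); fragment_id is '' if there is none."""
    page_url, fragment_id = urldefrag(url)
    return page_url, fragment_id
```

Splitting the address this way makes it easy to record both the page and the fragment, whichever of the two PundIt turns out to need.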

-QUESTIONS:

-- Can we specify the transcription fragment address as the "Page" in PundIt terminology (see screen grab below) or do we have to specify it as a "Text-Fragment"?




ANNOTATING DIGITAL IMAGES


[Image: HCA 13 72 f4r MchtMarks LH Margin.PNG]

  • Our digital images are held in a MediaWiki picture library, and can be accessed as such directly, but are more conveniently accessed for zooming and viewing through our tailored transcription software MarineLives - Transcript, which uses the open source software SCRIPTO.


-- So, when specifying an "Image" using PundIt terminology, should we go to the Image in the MediaWiki library, or through MarineLives - Transcript?

(for your reference, the digital image of HCA 13/72 f.4r in TRANSCRIPT/SCRIPTO can be accessed at http://marinelives-transcript.org/scripto/scripto/?scripto_action=transcribe&scripto_doc_id=2045&scripto_doc_page_id=2280)

  • It would be useful to be able to comment on, annotate and keyword marginalia in fragments of the digital images, for example symbolic merchants' marks and the signatures and marks of deponents.


[Image: SIGNATURE Hojah Peter Armenian Merchant HCA 1365f53v.PNG]


CUSTOM VOCABULARIES


  • How were Marinelive: Persons & Marinelive: Boats custom vocabularies created?


  • Can the two MarineLive custom vocabularies be renamed by us, and if so, how?


  • How do we create and label new custom vocabularies?


  • How do we modify or add to existing custom vocabularies?




DATES


  • How should we and PundIt handle C17th calendars (English old style, English new style, etc.)?
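To make the calendar question concrete, here is a minimal sketch of converting an English Old Style date to the Gregorian calendar. It assumes that "old style" means the Julian calendar with the year beginning on 25 March; the function names are hypothetical, and nothing here reflects how PundIt itself stores dates.

```python
from datetime import date


def julian_to_gregorian(year, month, day):
    """Convert a Julian-calendar date (year starting 1 January) to Gregorian."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    # Julian Day Number of the Julian-calendar date.
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083
    # Ordinal 1 is Gregorian 0001-01-01, whose Julian Day Number is 1721426.
    return date.fromordinal(jdn - 1721425)


def old_style_to_gregorian(written_year, month, day):
    """Convert an Old Style date whose written year begins on 25 March."""
    # Dates from 1 January to 24 March fall in the next historical year.
    year = written_year + 1 if (month, day) < (3, 25) else written_year
    return julian_to_gregorian(year, month, day)
```

For example, the execution of Charles I, dated 30 January 1648 in Old Style, corresponds to 9 February 1649 in the Gregorian calendar. A deposition dated in January to March therefore needs both the year adjustment and the ten-day shift that applies in the seventeenth century.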




SEARCHABLE DATABASES


  • Relevance of Freebase (http://www.freebase.com/) to MarineLives?


  • Relevance of DBPedia (http://dbpedia.org/About) to MarineLives?


- A limited number of people, place, and material entries in Wikipedia (via DBPedia) are likely to be of direct relevance to annotating MarineLives, e.g. http://en.wikipedia.org/wiki/Christopher_Myngs. SEE: http://marinelives-theshippingnews.org/blog/2013/10/05/christopher-myngs-naval-officer/

  • Is it possible to strike a not-for-profit academic and educational use licence with the current Oxford Dictionary of National Biography to use as a searchable/linkable database? Failing that, can we offer as a PundIt option access to a searchable out of copyright
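On the DBPedia point above: DBPedia resource identifiers for the English Wikipedia reuse the article title, so a Wikipedia address such as the Christopher Myngs one can be mapped to a candidate DBPedia URI mechanically. The sketch below is only a sketch; the function name is hypothetical, and it does not check that the resource exists or how PundIt looks entries up.

```python
from urllib.parse import urlparse


def wikipedia_to_dbpedia(url):
    """Map an English Wikipedia article URL to its likely DBPedia resource URI."""
    parts = urlparse(url)
    prefix = "/wiki/"
    if parts.netloc != "en.wikipedia.org" or not parts.path.startswith(prefix):
        raise ValueError("expected an en.wikipedia.org/wiki/... address")
    title = parts.path[len(prefix):]
    if not title:
        raise ValueError("no article title in address")
    return f"http://dbpedia.org/resource/{title}"
```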


TRIPLES


[Image: PundIt MyItems 301013.PNG]

  • How do we create new subjects in the drop-down triple menu?


  • How do we create new predicates in the drop-down triple menu?


  • How do we create new objects in the drop-down triple menu?
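For readers new to the terminology in these questions, a triple is simply a subject, a predicate and an object. The sketch below shows one as a plain Python structure, serialised as a single N-Triples line. All three URIs are hypothetical placeholders under example.org, and the sketch says nothing about how the PundIt drop-down menus are populated, which is what the questions above ask.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # the thing being annotated, e.g. a page or fragment
    predicate: str  # the relationship being asserted
    obj: str        # the thing the subject is related to


def to_ntriples(triple):
    """Serialise a triple of URIs as one N-Triples line."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


# Hypothetical placeholder URIs, for illustration only.
example = Triple(
    subject="http://example.org/page/1",
    predicate="http://example.org/vocab/mentions",
    obj="http://example.org/person/1",
)
line = to_ntriples(example)
```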