Tools: Team Two Data

From MarineLives

Latest revision as of 06:47, July 6, 2016

This page is a repository of data generated by Team Two of the MarineLives Digital Pop Up Lab, which is exploring how historians approach historical search when they are looking for people, places and dates.

We monitor usage of the MarineLives wiki using Google Analytics. Data from the Google Analytics package enable us to analyse the specific pages viewed by users within the wiki, but not the identity of the users. Google Analytics also captures data on the searches which users conduct with the wiki's own search box.

In the period June 3rd 2016 to July 3rd 2016, MarineLives wiki users performed 275 searches.

We have analysed these data for (1) the number of search terms used in single searches and (2) the types of content searched for in single searches.

Our conclusion is that "simple searches" using one or two search terms in a single search dominate, and that persons are the dominant type of content searched for. We are interested in comparing these results with those from other history-oriented, content-based websites.
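The first of these analyses, counting the number of terms in each search, can be sketched in a few lines of Python. This is a minimal illustration rather than the method actually used by Team Two: the function names are hypothetical, terms are assumed to be separated by whitespace, inverted commas are ignored when counting, and the sample contains only five of the 275 logged searches.

```python
from collections import Counter


def count_terms(query):
    """Count whitespace-separated terms in one search query.

    Assumption: inverted commas are dropped before counting, so that
    '"royal george"' counts as two terms rather than one phrase.
    """
    return len(query.replace('"', " ").split())


def summarise(queries):
    """Return the term-count distribution, the mean, and the share of simple searches.

    Assumes a non-empty list of queries. A "simple search" is taken here
    to mean a search using one or two terms.
    """
    counts = [count_terms(q) for q in queries]
    simple = sum(1 for c in counts if c <= 2)
    return {
        "distribution": dict(sorted(Counter(counts).items())),
        "mean_terms": sum(counts) / len(counts),
        "simple_share": simple / len(counts),
    }


# A small sample taken from the search terms listed on this page.
sample = [
    "bateman",
    "john hawkins",
    'captain "robert morris"',
    "oliver tanner anglo dutch war",
    "hca 13/70 f.427",
]

summary = summarise(sample)
print(summary)
```

Running the same function over the full list of search terms would give the distribution for the whole period; the sample above is too small to support any conclusion on its own.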



Search frequency data in academic literature


Anne Aula, Rehan M. Khan and Zhiwei Guan (2010) survey the literature for search term frequency.[1] They report, based on search logs, that the average number of query terms is between 2.35 and 2.6 terms per query. There is some evidence to suggest that smartphone users enter a slightly higher number of terms in phone-based queries. They report that most queries are simple keyword queries, with only about 10% of queries containing advanced query operators. They suggest that there are significant regional differences, with US-based searchers making greater use of advanced query operators. Citing Eastman and Jansen (2003), they suggest that most advanced query operators do not increase the precision of the query, so they may not be worth the trouble.[2] Other researchers suggest that search engine users typically evaluate the results of a specific search quickly, before either clicking on a result or refining their query, with an average of 7.78 seconds reported by Granka, Joachims and Gay (2004).[3]



Historians' search behaviour on the web in academic literature


Cunningham, Sean, Archive skills and tools for historians, web article, undated, Making History

Duff, Wendy M. and Catherine A. Johnson, Accidentally Found on Purpose: Information-Seeking Behaviour of Historians in Archives, The Library Quarterly: Information, Community, Policy, vol. 72, no. 4 (Oct. 2002), pp.472-496

Case, Donald O. and Lisa M. Given, Looking for Information: A Survey of Research on Information Seeking, Needs, and Behavior (Emerald Group Publishing, 2016)

Johnson, Catherine A. and Wendy M. Duff, Chatting Up the Archivist: Social Capital and the Archival Researcher, The American Archivist, vol. 68 (Spring/Summer 2005), pp.113-129

Foster, Allen and Nigel Ford, Serendipity and information seeking: an empirical study (2003), Journal of Documentation, Vol. 59 Iss: 3, pp.321 - 340

Rhee, Hea Lim, Modelling historians' information-seeking behaviour with an interdisciplinary and comparative approach, Information Research, vol. 17, no. 4, December 2012

Tibbo, Helen R., Primarily History in America: How U.S. Historians Search for Primary Materials at the Dawn of the Digital Age, The American Archivist, vol. 66 (Spring/Summer 2003), pp. 9-50


Number of search terms used in MarineLives wiki search box


Search term frequency for search box in MarineLives wiki


Type of search terms used in MarineLives wiki search box


Type of search term frequency for search box in MarineLives wiki



List of search terms used in MarineLives wiki search box


The following list contains the search terms used in 275 searches performed using the MarineLives wiki search box, June 3rd 2016 - July 3rd 2016:

Capitalisation and spacing are reproduced as used in the search box.
Search terms in inverted commas reflect the use of inverted commas in the original search.

Abigaill
Abigall
abraham stockman
adam bowen
alban wakelyn
aldworth
alston
amsterdam
anthony abdy
anthony deane
arthur juxon

badlow
badlowe
barbados
bateman
bateman
Bateman
bedlo
bedlo
bedlow
bedlowe
benjamin bond
benjamin gayer
bice
bidlow
bidlowe
blackwall
blessing
blessing ribert morris
blundell
boice
Boulton
bowrey
bowry
boyce
boyne
broome
broome
Budd
Bushell

canham
canham
captain ruther
captain "robert morris"
captain robert morris
carnaby
catchpole
catchpool
chiborne
christopher clitheroe
Clements
clopton
cockaine
corellis
coytemore
coytmore
crane
crockhay

daunois
Daunois
dormido
drake
drake crowther

east indies
edith perrin
edward osborne
Edward Mansell
Edward Mansell

francis ball
francis raworth
francis tryon
francis tyron

gabriel ludlow
george clarke
george oxenden
george swift
george tyte
george willoughby
goodlad
goodlad

Harlow
Hartmman Geerts
hca 13/65 f.87r
hca 13/68 f.145v
hca 13/70
hca 13/70 f.427
hca 12/70 f.427v
hca 13/70 f.711r
hca 13/72 f.134v
Hca 13/73 f.711r
Hca 13/73 f.711r
Hca 13/73 f.711v
HCA 13 73 f.711r
henry garraway
herriott
hester white
hester white
hides
hides
Higginson
hinde
Hinde
hodder

Ipswich

Jackson
Jacoson canham
Jackson stead
jadwyn
jamaica
james meggs
jenney
jenny
"john anthony"
john aston
john banks
john brampston
john burgoyne
john crowther
John Darby
john dethick
john eaton
john hawkins
john hawkins
john jauxon
john juxon
john lawrence
john mercer
john pixly
john stockman
john swift
jonas deacon
joyle
juxon

kentij
kingston

lawrence broome
Livorno

Mangles
mansel
marie broome
mary broome
martin bond
mathew broome
mathew gray
matthew shepard
matthew sheppard
melmoth
Mings
morgan
morgan
mrp wills
mrp: will
mrp: will
mrp: wills
mrp: wills
mrp: wills
mrp:wills
Mynes
Mynnes
Myngs
Myngs

Nathaniel Bateman
Nathaniel Bateman
nathaniel brent
nathaniel durant
nevll booth
neville booth
nicholas stiles
nicholas style

oliver tanner
oliver tanner anglo dutch war

payne
pennoyer
perreyra
perry
peter and john
pory
peter andrewes
peter andrews
peter priaulx
peter priaulx
poole
Poole
PROB 11/361 King 125-176
PROB 32/6/8
PROB 32/6/8 Deceased: Bushell. Benjemen. Stepney. Middx Inventory
pyn

rainborough
rainborow
rainborowe
rainborowe
rainton
richard anthony
richard onslow
richard parsons
robert morris
robert salmon
robert wakeman
robert willoughby
robert yardley
robles
robles
roger vivian
ross keel
Ross keel
Ross keel amsterdam
rowland cotemore
rowland coytemore
rowland coytmore
royal george
Royal George
"Royal George"

salmonboyne
smith
sparke
squire bence
stepney
suasso
svjs

thierry
thierry daunois
thomas broome
thomas broome
thomas haughton
thomas hudson
thomas hyde
thomas lewes
thomas lewys
thomas lewyx
thomas offley
thomas perrin
thomas sheppard
thomas styles
tilley
tobias lisle
"tobias lisle"
tyte

User

valentine
vallack

walter maynard
Wareing
Waring
westhorp
wes
Wilde
Wilde
will
will
william
william booth
william broome
william crowther
william finch
william garraway
william haines
william haynes
william hynes
william pearle
william ryder
william stockman
william tucker
william tucker
William Welch
winthrop
wills
worth
Wylde
Wylde



Comparing Google search constrained to MarineLives and MarineLives in-wiki search engine


Placing inverted commas around a multi-term search query in the MarineLives wiki forces a search for the exact phrase enclosed.

For example, "Price of Pepper" will only return results containing the phrase Price of Pepper, and will omit results which contain only the separate words Price, Pepper, and of.

The same is true of Google search.
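The difference between an exact-phrase search and a plain keyword search can be illustrated with a short Python sketch. It is a toy model only, and makes no claim about how the MarineLives wiki search engine or Google work internally: the function names, the simple lower-case tokenisation, and the two sample page texts are all hypothetical.

```python
import re


def tokenize(text):
    """Split text into lower-case alphanumeric tokens (an assumed, simplified rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches_phrase(phrase, text):
    """True if the words of `phrase` occur consecutively, in order, in `text`."""
    wanted, tokens = tokenize(phrase), tokenize(text)
    if not wanted:
        return False
    return any(
        tokens[i:i + len(wanted)] == wanted
        for i in range(len(tokens) - len(wanted) + 1)
    )


def matches_any_term(query, text):
    """True if `text` contains at least one of the words in `query`."""
    return bool(set(tokenize(query)) & set(tokenize(text)))


# Hypothetical page texts, invented for this illustration.
pages = {
    "page_a": "They disputed the price of pepper at the warehouse.",
    "page_b": "The pepper was sold at a low price.",
}

for name, text in pages.items():
    print(name, matches_phrase("Price of Pepper", text), matches_any_term("Price of Pepper", text))
```

Both sample pages contain the individual words, so both match the keyword search, but only the first contains the words consecutively and so matches the phrase search.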

A focussed search, looking largely at MarineLives wiki pages, can be performed in Google by including the term "MarineLives", in inverted commas, in the query. The results will come almost entirely from Google-indexed pages of the MarineLives wiki, but will also include relevant results from the MarineLives WordPress blog, The Shipping News, and from any third-party pages which contain the word MarineLives (but not Marine Lives).
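A constrained Google query of this kind can be assembled programmatically. The sketch below is an assumption-laden illustration: the function names are hypothetical, the search phrase is assumed to be wrapped in inverted commas as well, and the https://www.google.com/search?q= address is the familiar public search URL pattern rather than anything specified on this page.

```python
from urllib.parse import urlencode


def constrained_query(phrase, site_term="MarineLives"):
    """Build a query string with the phrase and the constraining term in inverted commas."""
    return f'"{phrase}" "{site_term}"'


def google_search_url(phrase, site_term="MarineLives"):
    """Return a Google search URL for the constrained query (assumed URL pattern)."""
    return "https://www.google.com/search?" + urlencode({"q": constrained_query(phrase, site_term)})


print(constrained_query("Price of Pepper"))
print(google_search_url("Edward Bushell"))
```

The string returned by the first function is what would be typed into the Google search box by hand; the second function simply encodes it for use in a browser address bar.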

The examples below compare the results generated by querying the MarineLives wiki and querying Google, using identical search terms, but with the addition of "MarineLives" to the Google query.

Comparing Google search engine with MarineLives in-wiki search engine: search term "Price of Pepper"


Comparing Google search engine with MarineLives in-wiki search engine: search term "Edward Bushell"


  1. Aula, Anne, Rehan M. Khan and Zhiwei Guan, 'How does Search Behaviour Change as Search Becomes More Difficult', CHI, Atlanta, Georgia, April 10-15, 2010, viewed 05/07/2016
  2. Eastman, C.M. and Jansen, B.J. (2003), Coverage, relevance, and ranking: the impact of query operators on web search engine results, ACM Transactions on Information Systems, 21 (4), 383-411
  3. Granka, L.A., Joachims, T. and Gay, G. (2004), Eye-tracking analysis of user behaviour in WWW search, Proc. SIGIR '04, 478-479